DA-bench

Visual Benchmark for Data Analytics AI Agents

What is DA-bench?

DA-bench is the “Data Analyst Benchmark”: a series of questions that you would expect data analysts to be able to solve.

This benchmark was created to help us test Unsupervised and other data tools against real-world problems, so we can understand the strengths and weaknesses of various approaches to automating analytics.

DA-bench is a visual benchmark: you can see both the score and videos of how every tool performs on every test.

Leaderboard

| Tool | Type | Data Source | Date Tested | Hallucination Rate | Correct | Hallucinations | Scalability Score | Test Score | Overall Score |
|---|---|---|---|---|---|---|---|---|---|
| Unsupervised | Full Agent | Use your Data Warehouse | Feb 14, 2025 | 11.8% | 34 | 4 | 100.0% | 53.6% | 72.1% |
| Databricks Genie | Full Agent | Use your Data Warehouse | Mar 19, 2025 | 23.1% | 39 | 9 | 100.0% | 52.6% | 71.6% |
| MicroStrategy Auto Answers | Full Agent | Use your Data Warehouse | Mar 19, 2025 | 35.7% | 28 | 10 | 100.0% | 31.6% | 58.9% |
| Snowflake Copilot | Co-pilot | Use your Data Warehouse | Mar 10, 2025 | 28.9% | 38 | 11 | 71.4% | 46.3% | 56.4% |
| Databricks Assistant | Co-pilot | Use your Data Warehouse | Mar 10, 2025 | 48.3% | 29 | 14 | 100.0% | 24.2% | 54.5% |
| Snowflake Cortex Analyst | Full Agent | Use your Data Warehouse | Mar 18, 2025 | 34.3% | 35 | 12 | 71.4% | 40.4% | 52.8% |
| Julius | Full Agent | Use Data In Memory | Mar 18, 2025 | 35.1% | 37 | 13 | 57.1% | 42.1% | 48.1% |
| ChatGPT 4o Data Analyst | Full Agent | Use Data In Memory | Mar 11, 2025 | 43.2% | 37 | 16 | 42.9% | 36.8% | 39.2% |
| Tableau Agent | Co-pilot | Use your Data Warehouse | Mar 12, 2025 | 100.0% | 14 | 14 | 100.0% | -1.4% | 39.2% |
| Qlik Sense Insight Advisor | Natural Language Search | Use your Data Warehouse | Feb 18, 2025 | 300.0% | 1 | 3 | 100.0% | -3.9% | 37.6% |
| Thoughtspot Spotter | Full Agent | Use your Data Warehouse | Mar 17, 2025 | 92.3% | 13 | 12 | 85.7% | 0.0% | 34.3% |
| IBM Cognos Assistant AI | Natural Language Search | Use Data In Memory | Mar 7, 2025 | 150.0% | 2 | 3 | 85.7% | -1.8% | 33.2% |
| Google Gemini in BigQuery | Co-pilot | Use your Data Warehouse | Mar 17, 2025 | 53.6% | 28 | 15 | 42.9% | 22.8% | 30.8% |
| Amazon Q in QuickSight | Co-pilot | Use your Data Warehouse | Mar 11, 2025 | 228.6% | 7 | 16 | 100.0% | -15.8% | 30.5% |
| SAP Just Ask | Natural Language Search | Use your Data Warehouse | Feb 19, 2025 | 0.0% | 3 | (not listed) | 42.9% | 5.3% | 20.3% |
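
This page does not define the leaderboard columns, so the following is only a sketch of how the published figures appear to fit together. The Hallucination Rate values match hallucinations divided by correct answers (which is why rates above 100% occur), and the Overall Score values match a 40/60 weighting of Scalability Score and Test Score to within rounding. Both formulas, the weights, and the function names are inferences from the numbers above, not documented definitions.

```python
def hallucination_rate(correct: int, hallucinations: int) -> float:
    """Hallucinations as a percentage of correct answers (inferred, not documented)."""
    return round(100.0 * hallucinations / correct, 1)


def overall_score(scalability: float, test: float) -> float:
    """Weighted blend of the two scores; the 0.4/0.6 weights are an assumption."""
    return 0.4 * scalability + 0.6 * test


# (tool, correct, hallucinations, published rate) taken from the leaderboard
rate_rows = [
    ("Unsupervised", 34, 4, 11.8),
    ("Qlik Sense Insight Advisor", 1, 3, 300.0),
    ("Amazon Q in QuickSight", 7, 16, 228.6),
]

# (tool, scalability, test, published overall) taken from the leaderboard
overall_rows = [
    ("Unsupervised", 100.0, 53.6, 72.1),
    ("Snowflake Copilot", 71.4, 46.3, 56.4),
    ("Amazon Q in QuickSight", 100.0, -15.8, 30.5),
]

for tool, correct, hallucinations, published in rate_rows:
    computed = hallucination_rate(correct, hallucinations)
    print(f"{tool}: computed rate {computed}%, published {published}%")

for tool, scalability, test, published in overall_rows:
    computed = overall_score(scalability, test)
    # Inputs are already rounded to one decimal, so allow a small tolerance.
    print(f"{tool}: computed overall {computed:.2f}%, published {published}%")
```

Because the published inputs are themselves rounded, the recomputed overall scores can differ from the leaderboard by a few hundredths of a point.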

About

The Data Analyst Benchmark is a collection of datasets and prompts that can be used to test how automated analytics tools handle common data analyst tasks.

We use this information to help us prioritize work to improve our AI. We are making it available publicly to help other companies improve their tools and to help users evaluate which tools are relevant to their problems.

DA-bench currently tests dozens of prompts across 9 categories. Evaluation is performed through manual testing by a third party; scores and videos of test results are displayed on dabench.com.

Test Scoring

Each test question is worth a maximum of 5 points. Deductions are made as follows:

User must hunt for the answer in the UI: -1 point
Tool requests the table name from the user: -1 point
Tool requests the column name from the user: -1 point
Tool requests the column value from the user: -1 point
User must fix errors: -1 point per error fixed
Tool hallucinates an incorrect answer: -5 points
Tool says it cannot answer the question because the necessary column or data is not available: -5 points (this also counts as a hallucination)
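
The rules above can be sketched as a small scoring function. This is a minimal illustration, not the benchmark's actual implementation: the `QuestionResult` type and its field names are hypothetical, and two points are assumptions. First, a hallucination is read as a net score of -5 for the question (rather than 5 - 5 = 0), which is consistent with the negative Test Scores on the leaderboard. Second, no floor is applied to the other deductions, since the rules do not state one.

```python
from dataclasses import dataclass

MAX_POINTS = 5  # maximum score per test question, per the rules above


@dataclass
class QuestionResult:
    """Observations for one test question (hypothetical field names)."""
    hunted_for_answer: bool = False   # user had to hunt for the answer in the UI
    asked_table_name: bool = False    # tool requested the table name
    asked_column_name: bool = False   # tool requested the column name
    asked_column_value: bool = False  # tool requested the column value
    errors_fixed: int = 0             # number of errors the user had to fix
    hallucinated: bool = False        # wrong answer, or false "data not available"


def score_question(result: QuestionResult) -> int:
    """Return the points for one question under the stated deductions."""
    if result.hallucinated:
        # Assumption: a hallucination yields a net -5 for the question.
        return -MAX_POINTS
    deductions = (
        int(result.hunted_for_answer)
        + int(result.asked_table_name)
        + int(result.asked_column_name)
        + int(result.asked_column_value)
        + result.errors_fixed
    )
    # Assumption: no lower bound is applied here.
    return MAX_POINTS - deductions


if __name__ == "__main__":
    clean = QuestionResult()
    assisted = QuestionResult(asked_table_name=True, errors_fixed=2)
    wrong = QuestionResult(hallucinated=True)
    for label, result in [("clean", clean), ("assisted", assisted), ("wrong", wrong)]:
        print(label, score_question(result))
```

Treating the hallucination case as an early return keeps the two -5 rules (an incorrect answer and a false "data not available" claim) in a single branch, since the rubric classifies both as hallucinations.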

More Info

DA-bench is maintained by Unsupervised.

Suggestions and contributions are welcome on the DA-bench GitHub repository.