DA-bench

What is DA-bench?

DA-bench is the “Data Analyst Benchmark”: a series of questions that you would expect data analysts to be able to solve.

This benchmark was created to help us test Unsupervised and other data tools against real-world problems, so we can understand the strengths and weaknesses of various approaches to automating analytics.

DA-bench is a visual benchmark: you can see both the score and videos of how every tool performs on every test. See some examples below:

Databricks Assistant answering a data querying question that tests performing an aggregation under a different name and then running a second query on the result (DQ19).

Unsupervised answering a data querying question that tests ambiguous column names (DQ15).

Snowflake Cortex Analyst answering a data querying question that tests multiple joins for the same tables across filters and stats (DQ27).

Tableau Agent answering a data querying question designed to find and compare information of ambiguous types across tables without joins (DQ4).

Databricks Genie answering a visualization question that requires charting two series (V2).

Snowflake Copilot returning a hallucination to a feature engineering question designed to test min-max normalization (FE3). Snowflake Copilot renormalizes a subset of the column values rather than just returning the requested normalized values.

Google Gemini in BigQuery answering an insight identification question that compares an aggregation for two distinct subsets of data (II2).

Leaderboard

Tool	Date Tested	Hallucination Rate	Scalability Score	Test Score	Overall Score
Unsupervised (Full Agent; Use your Data Warehouse)	Sep 9, 2025	15.2% (Correct: 46, Hallucinations: 7)	100.0%	60.9%	76.6%
Databricks Genie (Full Agent; Use your Data Warehouse)	Sep 2, 2025	22.5% (Correct: 40, Hallucinations: 9)	100.0%	48.4%	69.1%
Snowflake Copilot (Co-pilot; Use your Data Warehouse)	Sep 12, 2025	19.6% (Correct: 46, Hallucinations: 9)	71.4%	56.9%	62.7%
Google Gemini in BigQuery (Co-pilot; Use your Data Warehouse)	Sep 2, 2025	26.8% (Correct: 41, Hallucinations: 11)	71.4%	45.9%	56.1%
Snowflake Cortex Analyst (Full Agent; Use your Data Warehouse)	Sep 2, 2025	20.0% (Correct: 35, Hallucinations: 7)	71.4%	43.8%	54.8%
MicroStrategy Auto Answers (Full Agent; Use your Data Warehouse)	Apr 8, 2025	40.9% (Correct: 22, Hallucinations: 9)	100.0%	23.2%	53.9%
Thoughtspot Spotter (Full Agent; Use your Data Warehouse)	Aug 18, 2025	42.9% (Correct: 21, Hallucinations: 9)	100.0%	18.4%	51.0%
Databricks Assistant (Co-pilot; Use your Data Warehouse)	Sep 10, 2025	57.6% (Correct: 33, Hallucinations: 19)	100.0%	7.5%	44.5%
Tableau Agent (Co-pilot; Use your Data Warehouse)	Sep 9, 2025	78.9% (Correct: 19, Hallucinations: 15)	100.0%	5.6%	43.4%
Qlik Sense Insight Advisor (Natural Language Search; Use your Data Warehouse)	Feb 18, 2025	300.0% (Correct: 1, Hallucinations: 3)	100.0%	-4.0%	37.6%
Amazon Q in QuickSight (Co-pilot; Use your Data Warehouse)	Jul 14, 2025	118.2% (Correct: 11, Hallucinations: 13)	100.0%	-4.9%	37.0%
IBM Cognos Assistant AI (Natural Language Search; Use Data In Memory)	Aug 25, 2025	200.0% (Correct: 2, Hallucinations: 4)	85.7%	-3.1%	32.4%
SAP Just Ask (Natural Language Search; Use your Data Warehouse)	Aug 26, 2025	400.0% (Correct: 1, Hallucinations: 4)	42.9%	-4.7%	14.3%
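
The leaderboard columns are not defined on this page, but the published figures are consistent with two simple relationships. The hallucination rate matches hallucinations divided by correct answers, which is why it can exceed 100%. The overall score is close to 0.6 × test score + 0.4 × scalability score. The Python sketch below checks both against three rows. The function names and the 0.6/0.4 weights are inferred from the table and are assumptions, not documented DA-bench definitions.

```python
def hallucination_rate(correct: int, hallucinations: int) -> float:
    """Hallucinations per correct answer, as a percentage (inferred definition)."""
    return 100.0 * hallucinations / correct


def overall_score(test_score: float, scalability: float) -> float:
    """Weighted blend; the 0.6/0.4 weights are an assumption fitted to the table."""
    return 0.6 * test_score + 0.4 * scalability


# (tool, correct, hallucinations, scalability %, test %, published overall %)
rows = [
    ("Snowflake Copilot", 46, 9, 71.4, 56.9, 62.7),
    ("Databricks Assistant", 33, 19, 100.0, 7.5, 44.5),
    ("Qlik Sense Insight Advisor", 1, 3, 100.0, -4.0, 37.6),
]

for tool, correct, halluc, scal, test, published in rows:
    rate = hallucination_rate(correct, halluc)
    overall = overall_score(test, scal)
    print(f"{tool}: rate={rate:.1f}% overall={overall:.1f}% (published {published}%)")
```

Some rows agree only to within about 0.1 percentage points, presumably because the published inputs are already rounded.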

About

The Data Analyst Benchmark is a collection of datasets and prompts that can be used to test how automated analytics tools handle common data analyst tasks.

We use this information to help us prioritize work to improve our AI. We are making it available publicly to help other companies improve their tools and to help users evaluate which tools are relevant to their problems.

DA-bench currently tests dozens of prompts across 9 categories. A third party evaluates each tool through manual testing, and the scores and videos of the test results are displayed on dabench.com.

Test Scoring

Each test question is worth a maximum of 5 points. Deductions are made as follows:

User must hunt for the answer in the UI	-1 point
Tool requests the table name from the user	-1 point
Tool requests the column name from the user	-1 point
Tool requests the column value from the user	-1 point
User must fix errors	-1 point per error fixed
Tool hallucinates an incorrect answer	-5 points
Tool says it cannot answer the question because the necessary column or data is not available	-5 points (this is also a hallucination)
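
The deduction rules above can be sketched as a small scoring function. This is a minimal illustration in Python under stated assumptions, not the benchmark's actual scoring code. The `Observation` class and its field names are hypothetical. The sketch also assumes that the 5-point hallucination penalty applies once per question and that scores are not floored at zero, neither of which is stated on this page.

```python
from dataclasses import dataclass

MAX_POINTS = 5  # maximum score per test question


@dataclass
class Observation:
    """What the tester observed for one question (field names are hypothetical)."""
    hunted_for_answer: bool = False     # user had to hunt for the answer in the UI
    asked_for_table: bool = False       # tool requested the table name
    asked_for_column: bool = False      # tool requested the column name
    asked_for_value: bool = False       # tool requested the column value
    errors_fixed: int = 0               # errors the user had to fix
    hallucinated: bool = False          # tool gave an incorrect answer
    claimed_data_missing: bool = False  # tool wrongly said the data is unavailable


def question_score(obs: Observation) -> int:
    """Start from the maximum and subtract each deduction listed in the table."""
    score = MAX_POINTS
    # Each of these four conditions costs 1 point when it occurs.
    score -= sum([obs.hunted_for_answer, obs.asked_for_table,
                  obs.asked_for_column, obs.asked_for_value])
    # 1 point per error the user had to fix.
    score -= obs.errors_fixed
    # Both hallucination forms cost 5 points (assumed to apply once per question).
    if obs.hallucinated or obs.claimed_data_missing:
        score -= 5
    return score


print(question_score(Observation(asked_for_table=True, errors_fixed=2)))
print(question_score(Observation(errors_fixed=1, claimed_data_missing=True)))
```

Under this literal reading, a question can score below zero when a hallucination follows other deductions. How per-question points are aggregated into the Test Score percentage is not described here, so the sketch stops at the per-question level.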

More Info

DA-bench is maintained by Unsupervised.

Suggestions and contributions are welcome on the DA-bench GitHub Repository.

GitHub Repository | Source Data