chatgpt — Run #184 (2025-01-13)
using 4o model
Overall Score:
41.1%
(40% Setup Score + 60% Test Score)
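As a quick check on the weighting, the sketch below recomputes the Overall Score from the Setup Score (3 / 7) and Test Score (94 / 235) reported for this run. The function name `overall_score` and its parameters are illustrative, not part of DA-bench, and rounding to one decimal place is an assumption that happens to match the published figure.

```python
def overall_score(setup_passed, setup_total, test_points, test_max,
                  setup_weight=0.4, test_weight=0.6):
    """Weighted overall score in percent (hypothetical helper, not DA-bench code)."""
    setup_pct = 100 * setup_passed / setup_total   # Setup Score, e.g. 3 / 7
    test_pct = 100 * test_points / test_max        # Test Score, e.g. 94 / 235
    return setup_weight * setup_pct + test_weight * test_pct


# Values taken from this report: 3 of 7 setup criteria, 94 of 235 test points.
score = overall_score(3, 7, 94, 235)
print(f"{score:.1f}%")  # 41.1%
```

Using the unrounded setup fraction (3 / 7) rather than the displayed 42.9% gives 41.14%, which is consistent with the 41.1% shown.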
DA-bench Setup for Run #184 — Setup Score: 42.9% (3 / 7)
See how this tool was set up for the test.
- Setup Less Than 20 Minutes
- Connects to Data Warehouse
- Handles 1TB Table
- No Individual Upload of Files
- No Python for Setup
- No SQL for Setup
DA-bench Results for Run #184 — Test Score: 40.0% (94 / 235)
Data Querying (48 / 125)
17 Correct Answers,
7 Hallucinations
Domain Knowledge (5 / 5)
1 Correct Answer,
0 Hallucinations
Question | Date Tested | Overall Score | Video Recording |
---|---|---|---|
dk01 | 2025-01-15 | 5 | |
Feature Engineering (20 / 40)
6 Correct Answers,
2 Hallucinations
Question | Date Tested | Overall Score | Video Recording |
---|---|---|---|
fe1 | 2025-01-15 | 5 | |
fe2 | 2025-01-15 | 5 | |
fe3 | 2025-01-15 | 5 | |
fe4 | 2025-01-15 | 5 | |
fe5 | 2025-01-15 | -5 | |
fe6 | 2025-01-15 | -5 | |
fe7 | 2025-01-15 | 5 | |
fe8 | 2025-01-17 | 5 | |
Insight Identification (16 / 40)
6 Correct Answers,
2 Hallucinations
Question | Date Tested | Overall Score | Video Recording |
---|---|---|---|
ii2 | 2025-01-17 | 5 | |
ii5 | 2025-01-15 | 5 | |
ii6 | 2025-01-15 | 3 | |
ii7 | 2025-01-15 | 5 | |
ii8 | 2025-01-15 | -5 | |
ii10 | 2025-01-15 | 4 | |
ii12 | 2025-01-15 | 4 | |
ii15 | 2025-01-15 | -5 | |
Learning (-10 / 10)
0 Correct Answers,
2 Hallucinations
Question | Date Tested | Overall Score | Video Recording |
---|---|---|---|
l1 | 2025-01-15 | -5 | |
l2 | 2025-01-15 | -5 | |
Visualization (15 / 15)
3 Correct Answers,
0 Hallucinations
Question | Date Tested | Overall Score | Video Recording |
---|---|---|---|
v1 | 2025-01-15 | 5 | |
v2 | 2025-01-15 | 5 | |
v3 | 2025-01-16 | 5 | |
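
The Test Score can be cross-checked against the per-category totals listed in this report. The sketch below sums the points earned and the points available in each category. The dictionary layout and variable names are illustrative only, and the figures are copied from the category headings above.

```python
# (points earned, points available) per category, as listed in this report.
categories = {
    "Data Querying": (48, 125),
    "Domain Knowledge": (5, 5),
    "Feature Engineering": (20, 40),
    "Insight Identification": (16, 40),
    "Learning": (-10, 10),   # hallucinations score -5 each, so totals can go negative
    "Visualization": (15, 15),
}

earned = sum(points for points, _ in categories.values())
possible = sum(maximum for _, maximum in categories.values())
test_score = 100 * earned / possible

print(earned, possible, f"{test_score:.1f}%")  # 94 235 40.0%
```

The totals agree with the reported 94 / 235 (40.0%).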