julius — Run #115 (2024-11-07)
Overall Score: 40.0%
(40% Scalability Score + 60% Test Score)
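The weighting can be checked with a short sketch. It uses the figures reported for this run (4 of 7 setup criteria met, 60 of 210 test points); the function and variable names are illustrative and not part of DA-bench.

```python
def overall_score(scalability_pct: float, test_pct: float) -> float:
    """Weighted overall score: 40% scalability plus 60% test, both in percent."""
    return 0.4 * scalability_pct + 0.6 * test_pct


# Figures reported for Run #115.
scalability_pct = 100 * 4 / 7    # 4 of 7 setup criteria met, about 57.1%
test_pct = 100 * 60 / 210        # 60 of 210 test points, about 28.6%

overall = overall_score(scalability_pct, test_pct)
print(f"Scalability: {scalability_pct:.1f}%")
print(f"Test:        {test_pct:.1f}%")
print(f"Overall:     {overall:.1f}%")
```

With these inputs the weighted sum reproduces the reported 40.0%.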
DA-bench Setup for Run #115 — Scalability Score: 57.1% (4 / 7)
See how this tool was set up for the test.
- Setup Less Than 20 Minutes
- Connects to Data Warehouse
- Handles 1TB Table
- Handles 10+ Tables
- No Table Structure Changes
- No SQL Expertise for Setup
DA-bench Results for Run #115 — Test Score: 28.6% (60 / 210)
Data Querying (85 / 120)
20 Correct Answers, 3 Hallucinations
Feature Engineering (15 / 40)
5 Correct Answers, 2 Hallucinations
Question | Date Tested | Overall Score | Video Recording |
---|---|---|---|
fe1 | 2024-11-06 | 5 | |
fe2 | 2024-11-06 | 5 | |
fe3 | 2024-11-06 | 5 | |
fe4 | 2024-11-06 | 5 | |
fe5 | 2024-11-06 | -5 | |
fe6 | 2024-11-06 | 5 | |
fe7 | 2024-11-06 | -5 | |
fe8 | 2024-11-06 | 0 | |
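The category totals appear to follow a simple rule: +5 for a correct answer, -5 for a hallucination, and 0 otherwise, with a maximum of 5 points per question. That rule is inferred from the figures on this page, not stated by it. A minimal sketch under that assumption, using the Feature Engineering rows above (the `summarize` helper is hypothetical):

```python
from collections import Counter

# Assumption inferred from the page: each question is worth at most 5 points.
POINTS_PER_QUESTION = 5

# Per-question scores as listed for Feature Engineering in Run #115.
fe_scores = {
    "fe1": 5, "fe2": 5, "fe3": 5, "fe4": 5,
    "fe5": -5, "fe6": 5, "fe7": -5, "fe8": 0,
}


def summarize(scores: dict[str, int]) -> tuple[int, int, int, int]:
    """Return (total, max_total, correct_count, hallucination_count)."""
    # Assumed labels: positive = correct, negative = hallucination, zero = neither.
    labels = Counter(
        "correct" if s > 0 else "hallucination" if s < 0 else "other"
        for s in scores.values()
    )
    total = sum(scores.values())
    max_total = POINTS_PER_QUESTION * len(scores)
    return total, max_total, labels["correct"], labels["hallucination"]


total, max_total, correct, hallucinations = summarize(fe_scores)
print(f"Feature Engineering ({total} / {max_total}): "
      f"{correct} correct, {hallucinations} hallucinations")
```

Under this reading the rows give 15 / 40 with 5 correct answers and 2 hallucinations, which agrees with the category summary.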
Insight Identification (-15 / 25)
1 Correct Answer, 4 Hallucinations
Question | Date Tested | Overall Score | Video Recording |
---|---|---|---|
ii2 | 2024-11-06 | 5 | |
ii5 | 2024-11-12 | -5 | |
ii6 | 2024-11-06 | -5 | |
ii7 | 2024-11-11 | -5 | |
ii8 | 2024-11-06 | -5 | |
Learning (-10 / 10)
0 Correct Answers, 2 Hallucinations
Question | Date Tested | Overall Score | Video Recording |
---|---|---|---|
l1 | 2024-11-06 | -5 | |
l2 | 2024-11-06 | -5 | |
Visualization (-15 / 15)
0 Correct Answers, 3 Hallucinations
Question | Date Tested | Overall Score | Video Recording |
---|---|---|---|
v1 | 2024-11-11 | -5 | |
v2 | 2024-11-12 | -5 | |
v3 | 2024-11-11 | -5 | |