julius — Run #186 (2025-01-14)
    
      
Overall Score: 64.6%
(40% Scalability Score + 60% Test Score)

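As a quick check of the weighting, the sketch below recombines the two component scores reported for this run (Scalability 4 / 7 and Test 160 / 230) using the stated 40% / 60% split. The function and variable names are illustrative, not part of DA-bench, and rounding to one decimal place is an assumption about how the page displays the result.

```python
# Weights taken from the formula stated on the page.
SCALABILITY_WEIGHT = 0.4
TEST_WEIGHT = 0.6


def overall_score(scalability_pct: float, test_pct: float) -> float:
    """Weighted combination of the two component scores, in percent."""
    return SCALABILITY_WEIGHT * scalability_pct + TEST_WEIGHT * test_pct


# Component scores for Run #186 as reported on the page.
scalability_pct = 100 * 4 / 7    # 4 of 7 setup criteria met
test_pct = 100 * 160 / 230       # 160 of 230 test points earned

# Rounding to one decimal is an assumption about the display format.
overall = round(overall_score(scalability_pct, test_pct), 1)
print(f"Overall Score: {overall}%")
```

Under these assumptions the result agrees with the 64.6% shown for the run.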
DA-bench Setup for Run #186 — Scalability Score: 57.1% (4 / 7)
See how this tool was set up for the test.
- Setup Less Than 20 Minutes
- Connects to Data Warehouse
- Handles 1TB Table
- Handles 10+ Tables
- No Table Structure Changes
- No SQL Expertise for Setup

DA-bench Results for Run #186 — Test Score: 69.6% (160 / 230)
Data Querying (110 / 125)
23 Correct Answers, 1 Hallucination

Domain Knowledge (5 / 5)
1 Correct Answer, 0 Hallucinations

| Question | Date Tested | Overall Score |
|---|---|---|
| dk01 | 2025-01-16 | 5 |

Feature Engineering (30 / 40)
7 Correct Answers, 1 Hallucination

| Question | Date Tested | Overall Score |
|---|---|---|
| fe1 | 2025-01-16 | 5 |
| fe2 | 2025-01-16 | 5 |
| fe3 | 2025-01-16 | 5 |
| fe4 | 2025-01-16 | 5 |
| fe5 | 2025-01-16 | -5 |
| fe6 | 2025-01-16 | 5 |
| fe7 | 2025-01-16 | 5 |
| fe8 | 2025-01-18 | 5 |

Insight Identification (-5 / 40)
3 Correct Answers, 4 Hallucinations

| Question | Date Tested | Overall Score |
|---|---|---|
| ii2 | 2025-01-16 | 5 |
| ii5 | 2025-01-16 | 0 |
| ii6 | 2025-01-16 | 5 |
| ii7 | 2025-01-16 | -5 |
| ii8 | 2025-01-16 | -5 |
| ii10 | 2025-01-16 | 5 |
| ii12 | 2025-01-18 | -5 |
| ii15 | 2025-01-16 | -5 |

Learning (5 / 5)
1 Correct Answer, 0 Hallucinations

| Question | Date Tested | Overall Score |
|---|---|---|
| l1 | 2025-01-16 | 5 |

Visualization (15 / 15)
3 Correct Answers, 0 Hallucinations

| Question | Date Tested | Overall Score |
|---|---|---|
| v1 | 2025-01-16 | 5 |
| v2 | 2025-01-16 | 5 |
| v3 | 2025-01-16 | 5 |
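
The category totals look consistent with a scheme of +5 points per correct answer, -5 per hallucination, and 0 for any other outcome. That rule is inferred from the reported numbers and is not stated on the page. The sketch below tallies the reported counts under that assumption; the counts and maximums come from the results above, while the names and point values in the code are hypothetical.

```python
# ASSUMPTION: point values inferred from the reported totals, not documented.
POINTS_CORRECT = 5
POINTS_HALLUCINATION = -5

# (correct answers, hallucinations, maximum points) as reported for Run #186.
categories = {
    "Data Querying": (23, 1, 125),
    "Domain Knowledge": (1, 0, 5),
    "Feature Engineering": (7, 1, 40),
    "Insight Identification": (3, 4, 40),
    "Learning": (1, 0, 5),
    "Visualization": (3, 0, 15),
}


def category_points(correct: int, hallucinations: int) -> int:
    """Points for one category under the assumed scoring rule."""
    return correct * POINTS_CORRECT + hallucinations * POINTS_HALLUCINATION


totals = {name: category_points(c, h) for name, (c, h, _) in categories.items()}
earned = sum(totals.values())
possible = sum(max_points for _, _, max_points in categories.values())
test_score_pct = round(100 * earned / possible, 1)

for name, points in totals.items():
    print(f"{name}: {points} / {categories[name][2]}")
print(f"Test Score: {test_score_pct}% ({earned} / {possible})")
```

With these inputs the tally reproduces each category total and the 160 / 230 Test Score, including the negative Insight Identification result, where four hallucinations outweigh three correct answers.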