Document Arena

View overall rankings across AI models in document analysis and long-content reasoning.

Mar 2, 2026
31,666 votes
11 models
Rank  Rank Spread  Organization  License      Score    Votes
1     1-1          Anthropic     Proprietary  1525±12  3,584
2     2-3          Anthropic     Proprietary  1474±12  5,368
3     2-5          Google        Proprietary  1462±16  1,266
4     3-6          Anthropic     Proprietary  1450±11  5,530
5     3-7          Google        Proprietary  1444±10  7,569
6     4-9          Google        Proprietary  1433±10  5,407
7     5-11         Anthropic     Proprietary  1426±12  4,779
8     6-11         Google        Proprietary  1422±9   7,127
9     6-11         OpenAI        Proprietary  1413±10  5,670
10    7-11         OpenAI        Proprietary  1412±10  4,492
11    7-11         OpenAI        Proprietary  1410±10  6,916
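
The ± values can be read as intervals around each score. The sketch below is an illustration under assumptions, not the site's documented method: it treats each ± value as a symmetric interval and reports, for a given rank, the best and worst rank among models whose intervals overlap it. The helper names (`interval`, `overlaps`, `spread`) are hypothetical. The scores and margins are copied from the leaderboard above.

```python
# (score, margin) pairs in rank order, copied from the leaderboard above.
SCORES = [
    (1525, 12), (1474, 12), (1462, 16), (1450, 11), (1444, 10), (1433, 10),
    (1426, 12), (1422, 9), (1413, 10), (1412, 10), (1410, 10),
]


def interval(score, margin):
    """Assumption: the ± value is a symmetric interval around the score."""
    return (score - margin, score + margin)


def overlaps(a, b):
    """True if two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def spread(rank):
    """Best and worst 1-based rank whose interval overlaps the given rank's."""
    target = interval(*SCORES[rank - 1])
    hits = [
        r for r, (score, margin) in enumerate(SCORES, start=1)
        if overlaps(target, interval(score, margin))
    ]
    return (min(hits), max(hits))


if __name__ == "__main__":
    for rank in range(1, len(SCORES) + 1):
        low, high = spread(rank)
        print(f"rank {rank}: overlaps ranks {low} to {high}")
```

For this snapshot, the top model's interval (1513 to 1537) overlaps no other, while the interval at rank 7 (1414 to 1438) overlaps every model from rank 5 through rank 11. Reading each Rank Spread entry as a best and a worst rank, these overlap ranges coincide with the listed values. That is an observation about this data, not a statement of how the site computes the column.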

Leaderboard Plots

Confidence Intervals on Model Strength (via Bootstrapping)

Battle Count for Each Combination of Models (without Ties)

Fraction of Model A Wins for All Non-tied A vs. B Battles

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)