Document Arena

View overall rankings across AI models in document analysis and long-content reasoning.

Apr 14, 2026
90,250 votes
17 models
| Rank | Rank Spread | Organization · License | Score | Votes | Price (input / output) | Context |
|---|---|---|---|---|---|---|
| 1 | 1–2 | Anthropic · Proprietary | 1526±11 | 5,756 | $5 / $25 | 1M |
| 2 | 1–3 | Anthropic · Proprietary | 1515±9 | 11,777 | $5 / $25 | 1M |
| 3 | 2–4 | Anthropic · Proprietary | 1500±8 | 15,751 | $3 / $15 | 1M |
| 4 | 3–5 | OpenAI · Proprietary | 1484±9 | 7,315 | $2.50 / $15 | 1.1M |
| 5 | 4–5 | Anthropic · Proprietary | 1468±10 | 8,026 | $5 / $25 | 200K |
| 6 | 6–10 | Google · Proprietary | 1450±7 | 14,071 | $2 / $12 | 1M |
| 7 | 6–11 | Anthropic · Proprietary | 1450±9 | 11,337 | $3 / $15 | 200K |
| 8 | 6–11 | Moonshot · Modified MIT | 1444±9 | 5,864 | $0.60 / $3 | N/A |
| 9 | 6–13 | Google · Proprietary | 1440±8 | 10,798 | $2 / $12 | 1M |
| 10 | 6–15 | Google · Apache 2.0 | 1434±14 | 1,577 | N/A | N/A |
| 11 | 7–16 | | 1427±15 | 1,346 | $2 / $6 | 2M |
| 12 | 9–15 | Google · Proprietary | 1426±7 | 13,393 | $1.25 / $10 | 1M |
| 13 | 9–16 | Anthropic · Proprietary | 1425±8 | 11,847 | $1 / $5 | 200K |
| 14 | 10–16 | Google · Proprietary | 1419±9 | 7,209 | $0.50 / $3 | 1M |
| 15 | 10–17 | OpenAI · Proprietary | 1412±9 | 7,114 | $1.75 / $14 | 400K |
| 16 | 12–17 | OpenAI · Proprietary | 1408±9 | 8,297 | $1.25 / $10 | 400K |
| 17 | 15–17 | OpenAI · Proprietary | 1399±7 | 15,415 | $1.75 / $14 | 400K |

Leaderboard Plots

Confidence Intervals on Model Strength (via Bootstrapping)

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)

Battle Count for Each Combination of Models (without Ties)

Fraction of Model A Wins for All Non-tied A vs. B Battles
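
The plot titles above name three pairwise statistics: battle counts without ties, the fraction of non-tied battles a model wins against each opponent, and the average win rate against all other models. As a rough illustration of how such numbers could be derived from raw battle records, here is a minimal Python sketch. The record format, the model names, and the helper functions (`pairwise_stats`, `win_fraction`, `average_win_rate`) are all hypothetical and are not taken from the leaderboard's own code or data; the averaging step assumes each opponent is weighted equally, which is one reading of "uniform sampling".

```python
from collections import Counter

# Hypothetical battle records: (model_a, model_b, winner), where winner is
# "a", "b", or "tie". Names and outcomes are made up for illustration only.
battles = [
    ("model-x", "model-y", "a"),
    ("model-x", "model-y", "b"),
    ("model-y", "model-x", "b"),
    ("model-x", "model-z", "tie"),
    ("model-y", "model-z", "a"),
]


def pairwise_stats(records):
    """Return (battle_counts, win_counts) over non-tied battles.

    battle_counts[(p, q)] is the number of non-tied battles between p and q,
    stored under both orderings; win_counts[(p, q)] is how many of them p won.
    """
    battle_counts = Counter()
    win_counts = Counter()
    for model_a, model_b, winner in records:
        if winner == "tie":
            continue  # ties are excluded from every statistic here
        battle_counts[(model_a, model_b)] += 1
        battle_counts[(model_b, model_a)] += 1
        if winner == "a":
            win_counts[(model_a, model_b)] += 1
        else:
            win_counts[(model_b, model_a)] += 1
    return battle_counts, win_counts


def win_fraction(p, q, battle_counts, win_counts):
    """Fraction of non-tied p-vs-q battles won by p, or None if there are none."""
    n = battle_counts[(p, q)]
    return win_counts[(p, q)] / n if n else None


def average_win_rate(p, models, battle_counts, win_counts):
    """Mean of p's win fractions, weighting each opponent equally (assumption)."""
    fractions = [win_fraction(p, q, battle_counts, win_counts)
                 for q in models if q != p]
    fractions = [f for f in fractions if f is not None]
    return sum(fractions) / len(fractions) if fractions else None


counts, wins = pairwise_stats(battles)
models = ["model-x", "model-y", "model-z"]
print(counts[("model-x", "model-y")])
print(win_fraction("model-x", "model-y", counts, wins))
print(average_win_rate("model-y", models, counts, wins))
```

Opponents with no non-tied battles are skipped when averaging rather than counted as 0 or 0.5; a real pipeline might instead fill those cells from a fitted rating model.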