Search Arena

View overall rankings across LLMs with integrated web search.

Jun 15, 2026
744,152 votes
31 models
| Rank | Rank Spread | Organization | License | Score | Votes | Price | Context |
|---|---|---|---|---|---|---|---|
| 1 | 1-2 | Anthropic | Proprietary | 1252±5 | 82,294 | $5 / $25 | 1M |
| 2 | 2-5 | OpenAI | Proprietary | 1240±6 | 37,025 | $5 / $30 | 1.1M |
| 3 | 1-5 | Anthropic | Proprietary | 1237±11 | 5,007 | $10 / $50 | 1M |
| 4 | 2-5 | Anthropic | Proprietary | 1232±6 | 38,472 | $5 / $25 | 1M |
| 5 | 2-7 | Baidu | Proprietary | 1226±10 | 3,824 | N/A | N/A |
| 6 | 5-7 | Anthropic | Proprietary | 1218±5 | 81,575 | $3 / $15 | 1M |
| 7 | 5-11 | Google | Proprietary | 1214±6 | 58,094 | N/A | N/A |
| 8 | 7-13 | Google | Proprietary | 1207±5 | 37,327 | $2 / $12 | N/A |
| 9 | 7-13 | | | 1206±5 | 56,367 | $2 / $6 | 2M |
| 10 | 7-14 | OpenAI | Proprietary | 1206±6 | 52,868 | $1.75 / $14 | 400K |
| 11 | 7-14 | Anthropic | Proprietary | 1203±6 | 18,408 | $5 / $25 | 1M |
| 12 | 8-14 | Google | Proprietary | 1201±5 | 93,980 | N/A | N/A |
| 13 | 8-15 | OpenAI | Proprietary | 1199±5 | 60,190 | $1.25 / $10 | 400K |
| 14 | 10-15 | OpenAI | Proprietary | 1195±6 | 56,204 | $2.50 / $15 | 1.1M |
| 15 | 13-16 | xAI | Proprietary | 1190±6 | 54,040 | N/A | N/A |
| 16 | 15-19 | Anthropic | Proprietary | 1179±6 | 61,808 | $5 / $25 | 200K |
| 17 | 16-20 | OpenAI | Proprietary | 1173±5 | 75,946 | $1.75 / $14 | 400K |
| 18 | 16-20 | xAI | Proprietary | 1171±5 | 82,113 | $0.20 / $0.50 | 2M |
| 19 | 16-20 | xAI | Proprietary | 1170±4 | 43,041 | $0.20 / $0.50 | 2M |
| 20 | 17-21 | xAI | Proprietary | 1165±5 | 36,089 | $1.25 / $2.50 | 1M |
| 21 | 20-22 | Anthropic | Proprietary | 1157±5 | 75,660 | $3 / $15 | 1M |
| 22 | 21-26 | Anthropic | Proprietary | 1148±5 | 77,389 | $15 / $75 | 200K |
| 23 | 22-26 | OpenAI | Proprietary | 1143±5 | 20,788 | $2 / $8 | 200K |
| 24 | 22-27 | Google | Proprietary | 1142±4 | 83,776 | $1.25 / $10 | 1M |
| 25 | 22-27 | xAI | Proprietary | 1141±6 | 19,389 | $3 / $15 | N/A |
| 26 | 22-28 | Perplexity AI | Proprietary | 1137±6 | 29,222 | $1 / $1 | 127.1K |
| 27 | 24-29 | OpenAI | Proprietary | 1132±6 | 20,928 | $1.25 / $10 | 400K |
| 28 | 26-29 | Perplexity AI | Proprietary | 1129±6 | 28,720 | $1 / $1 | 127.1K |
| 29 | 27-29 | Anthropic | Proprietary | 1126±5 | 31,226 | $15 / $75 | 200K |
| 30 | 30-31 | Diffbot | Apache 2.0 | 1023±8 | 6,437 | N/A | N/A |
| 31 | 30-31 | OpenAI | Proprietary | 1006±11 | 3,440 | $30 / $60 | 8.2K |

Default Leaderboard Plots

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)

Fraction of Model A Wins for All Non-tied A vs. B Battles

Confidence Intervals on Model Strength (via Bootstrapping)

Battle Count for Each Combination of Models (without Ties)
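
The plot titles above refer to pairwise win fractions over non-tied battles, an average win rate with opponents weighted uniformly, and bootstrapped confidence intervals. The page does not say how these quantities are computed, so the following is only a minimal sketch of one plausible reading, not the leaderboard's actual pipeline. The battle record format `(model_a, model_b, winner)`, the function names, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical toy battle log: (model_a, model_b, winner), winner may be "tie".
battles = [
    ("m1", "m2", "m1"), ("m1", "m2", "m1"), ("m1", "m2", "m2"),
    ("m1", "m2", "tie"),
    ("m1", "m3", "m1"),
    ("m2", "m3", "m3"), ("m2", "m3", "m2"),
]

def pairwise_win_fraction(battles):
    """Fraction of wins for each ordered pair, counting non-tied battles only."""
    wins = defaultdict(int)  # wins[(x, y)] = number of times x beat y
    for a, b, winner in battles:
        if winner not in (a, b):  # skip ties (assumed encoding)
            continue
        loser = b if winner == a else a
        wins[(winner, loser)] += 1
    frac = {}
    for (x, y) in list(wins):
        total = wins[(x, y)] + wins.get((y, x), 0)
        frac[(x, y)] = wins[(x, y)] / total
        frac[(y, x)] = wins.get((y, x), 0) / total
    return frac

def average_win_rate(battles):
    """Mean pairwise win fraction per model, each opponent weighted equally."""
    per_model = defaultdict(list)
    for (x, _y), f in pairwise_win_fraction(battles).items():
        per_model[x].append(f)
    return {m: sum(fs) / len(fs) for m, fs in per_model.items()}

def bootstrap_interval(battles, model, rounds=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for one model's average win rate."""
    rng = random.Random(seed)
    samples = []
    for _ in range(rounds):
        resampled = rng.choices(battles, k=len(battles))  # sample with replacement
        rate = average_win_rate(resampled).get(model)
        if rate is not None:  # model may be absent from a resample
            samples.append(rate)
    if not samples:
        raise ValueError(f"{model!r} has no non-tied battles in any resample")
    samples.sort()
    lo = samples[int((alpha / 2) * len(samples))]
    hi = samples[min(len(samples) - 1, int((1 - alpha / 2) * len(samples)))]
    return lo, hi

rates = average_win_rate(battles)
interval = bootstrap_interval(battles, "m1")
```

In this sketch, "uniform sampling" is read as averaging a model's pairwise win fractions over the opponents it actually faced, so a heavily sampled matchup counts no more than a rare one. The bootstrap here resamples whole battles and reports a percentile interval on the win rate; the intervals in the table are on a rating score, which would require fitting a rating model on each resample instead.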