Text Arena: Overall

View overall rankings of AI models on text-to-text tasks spanning math, coding, creative writing, and other open-ended domains.

May 28, 2026
6,532,755 votes
54 labs
| Lab Rank | Lab | Model | Score | Model Rank | Rank Spread |
|---|---|---|---|---|---|
| 1 | Anthropic | claude-opus-4-6-thinking · Proprietary | 1502±4 | 1 | 1–4 |
| 2 | Meta | muse-spark · Proprietary | 1489±6 | 5 | 2–11 |
| 3 | Google | gemini-3.1-pro-preview · Proprietary | 1487±4 | 6 | 4–11 |
| 4 | OpenAI | gpt-5.5-high · Proprietary | 1482±6 | 8 | 5–19 |
| 5 | xAI | grok-4.20-beta1 · Proprietary | 1476±5 | 13 | 8–24 |
| 6 | Alibaba | qwen3.7-max-preview · Proprietary | 1475±10 | 15 | 5–30 |
| 7 | Z.ai | glm-5.1 | 1474±6 | 16 | 8–27 |
| 8 | Baidu | ernie-5.1 · Proprietary | 1470±6 | 21 | 9–30 |
| 9 | Xiaomi | mimo-v2.5-pro | 1465±6 | 27 | 15–36 |
| 10 | Moonshot | kimi-k2.6 | 1462±6 | 28 | 19–39 |
| 11 | DeepSeek | deepseek-v4-pro-thinking | 1458±6 | 32 | 25–46 |
| 12 | Bytedance | dola-seed-2.0-pro · Proprietary | 1456±4 | 34 | 27–47 |
| 13 | Meituan | longcat-flash-chat-2602-exp · Proprietary | 1436±5 | 59 | 50–72 |
| 14 | Amazon | amazon-nova-experimental-chat-26-02-10 · Proprietary | 1427±10 | 70 | 54–95 |
| 15 | Tencent | hunyuan-hy3-preview | 1416±8 | 91 | 70–107 |
| 16 | Mistral | mistral-large-3 | 1415±4 | 93 | 75–104 |
| 17 | MiniMax | minimax-m2.7 | 1413±5 | 96 | 77–106 |
| 18 | StepFun | step-3.5-flash | 1394±4 | 121 | 108–133 |
| 19 | Arcee AI | trinity-large-preview | 1378±4 | 138 | 130–148 |
| 20 | Nvidia | nvidia-nemotron-3-super-120b-a12b | 1361±7 | 154 | 144–173 |
| 21 | Prime Intellect | intellect-3 | 1356±8 | 159 | 147–181 |
| 22 | Cohere | command-a-03-2025 | 1354±3 | 160 | 153–179 |
| 23 | Inception AI | mercury-2 · Proprietary | 1347±11 | 172 | 153–194 |
| 24 | Ant Group | ling-flash-2.0 | 1346±7 | 173 | 156–189 |
| 25 | Ai2 | olmo-3.1-32b-instruct | 1330±6 | 192 | 176–206 |
| 26 | 01 AI | yi-lightning · Proprietary | 1328±5 | 193 | 180–210 |
| 27 | NexusFlow | athene-v2-chat | 1314±5 | 216 | 199–229 |
| 28 | IBM | granite-4.1-8b | 1312±10 | 219 | 194–234 |
| 29 | AI21 Labs | jamba-1.5-large | 1289±7 | 237 | 231–249 |
| 30 | Reka AI | reka-core-20240904 · Proprietary | 1288±7 | 239 | 231–249 |
| 31 | Princeton | gemma-2-9b-it-simpo | 1279±7 | 247 | 235–255 |
| 32 | Microsoft | phi-4 | 1256±5 | 264 | 258–266 |
| 33 | HuggingFace | zephyr-orpo-141b-A35b-v0.1 | 1212±11 | 286 | 276–292 |
| 34 | Databricks | dbrx-instruct-preview | 1194±6 | 294 | 288–303 |
| 35 | InternLM | internlm2_5-20b-chat | 1191±7 | 295 | 288–307 |
| 36 | OpenChat | openchat-3.5 | 1181±10 | 301 | 294–314 |
| 37 | Snowflake | snowflake-arctic-instruct | 1179±6 | 304 | 295–314 |
| 38 | AllenAI/UW | tulu-2-dpo-70b | 1177±10 | 306 | 295–314 |
| 39 | NousResearch | openhermes-2.5-mistral-7b | 1174±10 | 307 | 295–318 |
| 40 | LMSYS | vicuna-33b | 1172±6 | 308 | 297–317 |
| 41 | Nexusflow | starling-lm-7b-beta | 1171±7 | 309 | 297–320 |
| 42 | UC Berkeley | starling-lm-7b-alpha | 1167±8 | 312 | 298–321 |
| 43 | Upstage AI | solar-10.7b-instruct-v1.0 | 1151±13 | 318 | 308–334 |
| 44 | Cognitive Computations | dolphin-2.2.1-mistral-7b | 1151±15 | 319 | 307–336 |
| 45 | MosaicML | mpt-30b-chat | 1149±12 | 320 | 312–334 |
| 46 | TII | falcon-180b-chat | 1146±17 | 323 | 311–338 |
| 47 | UW | guanaco-33b | 1126±12 | 335 | 320–344 |
| 48 | Together AI | stripedhyena-nous-7b | 1120±11 | 337 | 327–344 |
| 49 | Stanford | alpaca-13b | 1067±11 | 349 | 347–352 |
| 50 | Nomic AI | gpt4all-13b-snoozy | 1065±15 | 350 | 345–353 |
| 51 | Tsinghua | chatglm3-6b | 1055±12 | 352 | 347–353 |
| 52 | RWKV | RWKV-4-Raven-14B | 1041±11 | 353 | 350–355 |
| 53 | OpenAssistant | oasst-pythia-12b | 1021±11 | 355 | 353–355 |
| 54 | Stability AI | stablelm-tuned-alpha-7b | 952±13 | 360 | 359–360 |
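
When comparing two labs, it can help to check whether their score ranges overlap before reading much into a gap of a few points. The sketch below is only an illustration: it assumes each "score±margin" entry describes a symmetric range around the score, which the page itself does not state, and the helper names `parse_score` and `intervals_overlap` are hypothetical.

```python
def parse_score(text):
    """Split a 'score±margin' string such as '1502±4' into two integers."""
    score, margin = text.split("±")
    return int(score), int(margin)


def intervals_overlap(a, b):
    """Return True if the two score±margin ranges share at least one point.

    Assumption: the margin is symmetric around the score.
    """
    a_score, a_margin = parse_score(a)
    b_score, b_margin = parse_score(b)
    return (a_score - a_margin <= b_score + b_margin
            and b_score - b_margin <= a_score + a_margin)


# Values copied from the listing above.
print(intervals_overlap("1502±4", "1489±6"))  # Anthropic vs. Meta
print(intervals_overlap("1489±6", "1487±4"))  # Meta vs. Google
```

Under that assumption, the first pair is separated (1498 is above 1495), while the second pair overlaps, which is consistent with the wide rank spreads reported for the labs just below first place.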

Default Leaderboard Plots

[Plot: Fraction of Model A Wins for All Non-tied A vs. B Battles]
[Plot: Confidence Intervals on Model Strength (via Bootstrapping)]
[Plot: Battle Count for Each Combination of Models (without Ties)]
[Plot: Average Win Rate Against All Other Models (Uniform Sampling and No Ties)]
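
The plot titles mention non-tied win fractions and bootstrapped confidence intervals. As a rough sketch of what those terms mean, the snippet below computes a win fraction over non-tied battles and a generic percentile bootstrap interval for it. This is not the site's own method, which the page does not describe; the function names, the `'A'`/`'B'`/`'tie'` encoding, and the battle outcomes are all invented for illustration.

```python
import random


def win_fraction(outcomes):
    """Fraction of non-tied battles won by model A.

    Each outcome is 'A', 'B', or 'tie' (a hypothetical encoding).
    """
    decided = [o for o in outcomes if o != "tie"]
    return sum(o == "A" for o in decided) / len(decided)


def bootstrap_interval(outcomes, rounds=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the non-tied win fraction."""
    rng = random.Random(seed)
    decided = [o for o in outcomes if o != "tie"]
    # Resample the decided battles with replacement and recompute the statistic.
    stats = sorted(
        win_fraction(rng.choices(decided, k=len(decided)))
        for _ in range(rounds)
    )
    lo = stats[int(rounds * alpha / 2)]
    hi = stats[int(rounds * (1 - alpha / 2)) - 1]
    return lo, hi


# Synthetic outcomes, invented for illustration only.
battles = ["A"] * 60 + ["B"] * 40 + ["tie"] * 20
print(win_fraction(battles))
print(bootstrap_interval(battles))
```

Ties are dropped before resampling so that every bootstrap sample has at least one decided battle; a real pipeline would also need to decide how to handle model pairs with very few battles.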