Text Arena 🏆 Overall

Overall rankings of AI models on text-to-text tasks spanning math, coding, creative writing, and other open-ended domains.

Jun 16, 2026
6,917,183 votes
47 labs
| Lab Rank | Lab | Model | Score | Rank | Rank Spread |
|---|---|---|---|---|---|
| 1 | Z.ai | glm-5.1 | 1475±6 | 15 | 9-30 |
| 2 | Xiaomi | mimo-v2.5-pro | 1466±5 | 29 | 14-37 |
| 3 | Moonshot | kimi-k2.6 | 1460±5 | 34 | 25-44 |
| 4 | DeepSeek | deepseek-v4-pro-thinking | 1458±5 | 36 | 28-48 |
| 5 | Google | gemma-4-31b | 1451±8 | 43 | 32-60 |
| 6 | Alibaba | qwen3.5-397b-a17b | 1444±4 | 57 | 43-64 |
| 7 | Mistral | mistral-medium-3.5 | 1426±7 | 78 | 62-101 |
| 8 | MiniMax | minimax-m2.7 | 1417±4 | 96 | 79-111 |
| 9 | Nvidia | nvidia-nemotron-3-ultra-550b-a55b-nvfp4 | 1416±8 | 97 | 76-114 |
| 10 | Tencent | hunyuan-hy3-preview | 1413±8 | 103 | 80-120 |
| 11 | Meituan | longcat-flash-chat | 1401±6 | 118 | 104-131 |
| 12 | StepFun | step-3.5-flash | 1395±4 | 127 | 113-138 |
| 13 | Arcee AI | trinity-large-preview | 1379±4 | 145 | 135-153 |
| 14 | Prime Intellect | intellect-3 | 1357±8 | 166 | 153-188 |
| 15 | Cohere | command-a-03-2025 | 1354±3 | 167 | 160-186 |
| 16 | OpenAI | gpt-oss-120b | 1353±4 | 170 | 160-188 |
| 17 | Ant Group | ling-flash-2.0 | 1346±7 | 181 | 164-198 |
| 18 | Meta | llama-3.1-405b-instruct-bf16 | 1335±4 | 196 | 178-206 |
| 19 | Ai2 | olmo-3.1-32b-instruct | 1330±6 | 199 | 183-215 |
| 20 | NexusFlow | athene-v2-chat | 1314±5 | 223 | 206-236 |
| 21 | IBM | granite-4.1-8b | 1307±10 | 231 | 208-242 |
| 22 | AI21 Labs | jamba-1.5-large | 1289±7 | 244 | 238-256 |
| 23 | Princeton | gemma-2-9b-it-simpo | 1280±7 | 254 | 242-262 |
| 24 | Microsoft | phi-4 | 1256±5 | 271 | 265-273 |
| 25 | 01.AI | yi-1.5-34b-chat | 1212±5 | 292 | 288-297 |
| 26 | HuggingFace | zephyr-orpo-141b-A35b-v0.1 | 1212±11 | 293 | 283-299 |
| 27 | Databricks | dbrx-instruct-preview | 1194±6 | 301 | 295-310 |
| 28 | InternLM | internlm2_5-20b-chat | 1191±7 | 302 | 295-314 |
| 29 | OpenChat | openchat-3.5 | 1182±10 | 308 | 301-321 |
| 30 | Snowflake | snowflake-arctic-instruct | 1179±6 | 311 | 302-321 |
| 31 | AllenAI/UW | tulu-2-dpo-70b | 1177±10 | 313 | 302-321 |
| 32 | NousResearch | openhermes-2.5-mistral-7b | 1175±10 | 314 | 302-325 |
| 33 | LMSYS | vicuna-33b | 1172±6 | 315 | 304-324 |
| 34 | Nexusflow | starling-lm-7b-beta | 1171±7 | 316 | 304-327 |
| 35 | UC Berkeley | starling-lm-7b-alpha | 1167±8 | 319 | 306-328 |
| 36 | Upstage AI | solar-10.7b-instruct-v1.0 | 1151±13 | 325 | 315-341 |
| 37 | Cognitive Computations | dolphin-2.2.1-mistral-7b | 1151±15 | 326 | 314-343 |
| 38 | MosaicML | mpt-30b-chat | 1150±12 | 327 | 319-341 |
| 39 | TII | falcon-180b-chat | 1147±17 | 330 | 318-345 |
| 40 | UW | guanaco-33b | 1126±12 | 342 | 327-350 |
| 41 | Together AI | stripedhyena-nous-7b | 1120±11 | 344 | 334-351 |
| 42 | Stanford | alpaca-13b | 1068±11 | 356 | 354-359 |
| 43 | Nomic AI | gpt4all-13b-snoozy | 1066±15 | 357 | 352-360 |
| 44 | Tsinghua | chatglm3-6b | 1055±12 | 359 | 354-360 |
| 45 | RWKV | RWKV-4-Raven-14B | 1041±11 | 360 | 357-362 |
| 46 | OpenAssistant | oasst-pythia-12b | 1022±11 | 362 | 360-362 |
| 47 | Stability AI | stablelm-tuned-alpha-7b | 952±13 | 367 | 366-367 |
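
Each score above is written as a point estimate with a `±` margin (for example `1475±6`). As a rough reading aid, the sketch below parses such values and checks whether two models' intervals overlap. It assumes the margin is a symmetric half-width around the score; the page does not state the confidence level or how the margin is computed, and the helper names (`parse_score`, `interval`, `intervals_overlap`) are hypothetical.

```python
import re


def parse_score(cell: str) -> tuple[float, float]:
    """Split a score string such as '1475±6' into (score, half_width)."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*±\s*(\d+(?:\.\d+)?)\s*", cell)
    if match is None:
        raise ValueError(f"not a score cell: {cell!r}")
    return float(match.group(1)), float(match.group(2))


def interval(cell: str) -> tuple[float, float]:
    """Return (low, high), ASSUMING the margin is a symmetric half-width."""
    score, half_width = parse_score(cell)
    return score - half_width, score + half_width


def intervals_overlap(cell_a: str, cell_b: str) -> bool:
    """True if the two score intervals share at least one point."""
    low_a, high_a = interval(cell_a)
    low_b, high_b = interval(cell_b)
    return low_a <= high_b and low_b <= high_a


# glm-5.1 (1475±6) vs. mimo-v2.5-pro (1466±5): 1469..1481 and 1461..1471
print(intervals_overlap("1475±6", "1466±5"))  # True
# glm-5.1 (1475±6) vs. qwen3.5-397b-a17b (1444±4): 1469..1481 and 1440..1448
print(intervals_overlap("1475±6", "1444±4"))  # False
```

Overlapping intervals suggest the ordering between two adjacent entries may not be settled, which is consistent with the wide rank spreads reported alongside each rank.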

Default Leaderboard Plots

Figure: Fraction of Model A Wins for All Non-tied A vs. B Battles
Figure: Confidence Intervals on Model Strength (via Bootstrapping)
Figure: Battle Count for Each Combination of Models (without Ties)
Figure: Average Win Rate Against All Other Models (Uniform Sampling and No Ties)
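
The plot titles above mention win fractions over non-tied battles and confidence intervals obtained by bootstrapping. The following is a minimal sketch of a percentile bootstrap for one pair's win fraction, using only the standard library. It is a simplified stand-in: the leaderboard bootstraps a model-strength score whose fitting procedure is not described on this page, and the function name, parameters, and the 60/40 battle record are all hypothetical.

```python
import random


def bootstrap_win_fraction(outcomes, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for the fraction of battles model A wins.

    outcomes: list of 1 (model A won) or 0 (model B won), with ties removed.
    Returns (point_estimate, lower_bound, upper_bound).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(outcomes)
    estimates = []
    for _ in range(n_resamples):
        # Resample the battles with replacement and recompute the win fraction.
        sample = rng.choices(outcomes, k=n)
        estimates.append(sum(sample) / n)
    estimates.sort()
    # Pick the alpha/2 and 1 - alpha/2 percentiles, clamped to valid indices.
    lo_idx = max(0, min(n_resamples - 1, round((alpha / 2) * n_resamples)))
    hi_idx = max(0, min(n_resamples - 1, round((1 - alpha / 2) * n_resamples) - 1))
    point = sum(outcomes) / n
    return point, estimates[lo_idx], estimates[hi_idx]


# Hypothetical record: model A wins 60 of 100 non-tied battles against model B.
battles = [1] * 60 + [0] * 40
point, low, high = bootstrap_win_fraction(battles)
print(f"win fraction {point:.2f}, 95% interval [{low:.2f}, {high:.2f}]")
```

Resampling whole battles keeps the procedure independent of any distributional assumption; the same loop could wrap a rating fit instead of a raw win fraction if one were available.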