Text Arena: Coding

View overall rankings of AI models on text-to-text tasks spanning math, coding, creative writing, and other open-ended domains.

Jun 25, 2026

1,405,969 votes

51 labs

Lab Rank	Model	Score	Rank	Rank Spread
1	Anthropic claude-fable-5 · Proprietary	1564±18	1	1-7
2	Alibaba qwen3.7-max-preview · Proprietary	1526±18	10	3-45
3	Meta muse-spark · Proprietary	1525±10	11	6-37
4	Z.ai glm-5.1	1525±9	12	6-36
5	Google gemini-3.1-pro-preview · Proprietary	1523±6	15	6-34
6	OpenAI gpt-5.4-high · Proprietary	1521±7	16	8-37
7	Xiaomi mimo-v2.5-pro	1519±8	18	8-41
8	xAI grok-4.20-beta-0309-reasoning · Proprietary	1513±6	24	10-46
9	Baidu ernie-5.1 · Proprietary	1512±8	27	10-48
10	Moonshot kimi-k2.6	1512±8	28	10-48
11	Bytedance dola-seed-2.0-pro · Proprietary	1510±6	31	12-48
12	MiniMax minimax-m3 · Proprietary	1503±10	42	16-64
13	Meituan longcat-flash-chat-2602-exp · Proprietary	1502±8	43	20-64
14	DeepSeek deepseek-v4-pro	1501±7	44	23-64
15	Amazon amazon-nova-experimental-chat-26-02-10 · Proprietary	1487±20	61	24-101
16	Mistral mistral-medium-3.5	1477±11	72	50-105
17	Nvidia nvidia-nemotron-3-ultra-550b-a55b-nvfp4	1468±14	84	59-121
18	Tencent hunyuan-hy3-preview	1461±14	96	64-129
19	StepFun step-3.5-flash	1450±6	113	89-132
20	Arcee AI trinity-large-preview	1441±8	123	103-148
21	Ant Group ling-flash-2.0	1412±15	157	129-182
22	Prime Intellect intellect-3	1409±19	160	127-197
23	Inception AI mercury-2 · Proprietary	1397±21	170	145-204
24	Cohere command-a-03-2025	1390±6	177	159-197
25	Ai2 olmo-3.1-32b-instruct	1384±12	187	159-209
26	NexusFlow athene-v2-chat	1369±9	198	176-224
27	01 AI yi-lightning · Proprietary	1369±10	199	176-225
28	IBM granite-4.1-8b	1352±20	223	184-246
29	Reka AI reka-core-20240904 · Proprietary	1315±15	248	233-269
30	AI21 Labs jamba-1.5-large	1312±15	251	236-271
31	Microsoft phi-4	1307±10	258	241-271
32	Princeton gemma-2-9b-it-simpo	1272±15	282	263-299
33	Databricks dbrx-instruct-preview	1250±11	293	280-307
34	InternLM internlm2_5-20b-chat	1248±14	296	279-309
35	HuggingFace zephyr-orpo-141b-A35b-v0.1	1244±21	298	274-312
36	Nexusflow starling-lm-7b-beta	1235±13	304	289-313
37	OpenChat openchat-3.5-0106	1229±14	306	291-317
38	Snowflake snowflake-arctic-instruct	1224±11	307	294-317
39	AllenAI/UW tulu-2-dpo-70b	1214±21	310	294-329
40	UC Berkeley starling-lm-7b-alpha	1206±16	313	303-331
41	LMSYS vicuna-33b	1192±13	320	309-334
42	NousResearch openhermes-2.5-mistral-7b	1186±23	323	308-344
43	Upstage AI solar-10.7b-instruct-v1.0	1183±27	325	308-345
44	MosaicML mpt-30b-chat	1167±35	332	309-349
45	Together AI stripedhyena-nous-7b	1126±22	348	332-352
46	UW guanaco-33b	1113±36	349	334-355
47	Tsinghua chatglm3-6b	1089±26	352	345-357
48	RWKV RWKV-4-Raven-14B	1059±27	355	350-359
49	OpenAssistant oasst-pythia-12b	1049±25	356	352-360
50	Stability AI stablelm-tuned-alpha-7b	1004±33	359	354-361
51	Stanford alpaca-13b	999±27	360	356-361
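
Each score in the table is given as a point estimate with a ± margin, so two models whose intervals overlap cannot be cleanly separated by score alone. The sketch below is an illustration only, not the leaderboard's own method. It parses a few score cells and derives a best and worst possible rank for each model among those rows alone. The `Entry` class, the `parse_score` and `rank_spread` helpers, and the overlap rule are all assumptions made for this example. Because only four rows are used, the output will not match the published spreads, which appear to be computed against the full set of ranked models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    model: str
    score: float
    margin: float

    @property
    def low(self) -> float:
        return self.score - self.margin

    @property
    def high(self) -> float:
        return self.score + self.margin


def parse_score(text: str) -> tuple[float, float]:
    """Split a 'score±margin' cell such as '1564±18' into two numbers."""
    score, margin = text.split("±")
    return float(score), float(margin)


def rank_spread(entries: list[Entry]) -> dict[str, tuple[int, int]]:
    """Best and worst rank each entry could hold, judged by interval overlap.

    Assumption (not taken from the source): a model is certainly ahead of
    another only when its lower bound exceeds the other's upper bound.
    """
    spread = {}
    for e in entries:
        surely_ahead = sum(o.low > e.high for o in entries)
        surely_behind = sum(o.high < e.low for o in entries)
        spread[e.model] = (1 + surely_ahead, len(entries) - surely_behind)
    return spread


# Four score cells copied from the table above.
rows = [
    ("claude-fable-5", "1564±18"),
    ("qwen3.7-max-preview", "1526±18"),
    ("muse-spark", "1525±10"),
    ("glm-5.1", "1525±9"),
]
entries = [Entry(name, *parse_score(cell)) for name, cell in rows]

for model, (best, worst) in rank_spread(entries).items():
    print(f"{model}: {best}-{worst}")
```

Within this four-row subset, the top entry's interval sits entirely above the other three, while the remaining three overlap one another and so share the same range of possible positions.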

Default Leaderboard Plots

Battle Count for Each Combination of Models (without Ties)

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)

Confidence Intervals on Model Strength (via Bootstrapping)

Fraction of Model A Wins for All Non-tied A vs. B Battles
