Code Arena | HTML

View overall rankings across AI models on agentic coding tasks involving multi-step reasoning, tool use, and production-style workflows.

Feb 24, 2026
123,082 votes
46 models
Rank  Rank Spread  Organization · License    Score         Votes
1     1-4          Anthropic · Proprietary   1557 +33/-33  419
2     1-4          Anthropic · Proprietary   1554 +34/-34  367
3     1-6          Google · Proprietary      1522 +43/-43  222
4     1-8          Anthropic · Proprietary   1510 +35/-35  320
5     3-6          Anthropic                 1502 +10/-10  8,081
6     3-10         OpenAI · Proprietary      1477 +16/-16  1,696
7     5-10         Anthropic · Proprietary   1469 +9/-9    8,265
8     6-11         Google · Proprietary      1461 +9/-9    14,273
9     8-14         Z.ai · MIT                1447 +10/-10  4,991
10    5-19         Z.ai · MIT                1443 +33/-33  348
11    9-14         Google · Proprietary      1441 +9/-9    9,551
12    6-18         MiniMax · Modified MIT    1440 +28/-28  508
13    9-19         Moonshot · Modified MIT   1427 +22/-22  749
14    9-23         Moonshot · Modified MIT   1418 +27/-27  497
15    11-19        MiniMax · MIT             1413 +10/-10  7,292
16    11-22        (not listed)              1406 +10/-10  5,710
17    11-23        OpenAI · Proprietary      1401 +16/-16  1,634
18    12-23        OpenAI · Proprietary      1399 +13/-13  3,929
19    15-23        Anthropic · Proprietary   1394 +9/-9    8,965
20    15-23        Anthropic                 1393 +8/-8    11,507
21    15-24        OpenAI · Proprietary      1393 +9/-9    6,413
22    16-24        Anthropic · Proprietary   1386 +8/-8    13,104
23    11-28        Alibaba · Apache 2.0      1386 +33/-33  321
24    21-28        DeepSeek · MIT            1373 +11/-11  3,858
25    23-28        Z.ai · MIT                1362 +9/-9    8,715
26    23-29        (not listed)              1361 +11/-11  4,328
27    23-29        OpenAI · Proprietary      1359 +8/-8    10,436
28    23-31        (not listed)              1353 +18/-18  1,218
29    26-31        OpenAI · Proprietary      1340 +12/-12  2,990
30    28-31        Moonshot · Modified MIT   1339 +8/-8    10,056
31    28-32        OpenAI · Proprietary      1334 +10/-10  6,489
32    31-34        MiniMax · Apache 2.0      1318 +9/-9    8,802
33    32-36        Anthropic · Proprietary   1304 +8/-8    11,237
34    32-36        DeepSeek · MIT            1302 +10/-10  5,018
35    33-36        DeepSeek · MIT            1292 +11/-11  5,131
36    33-36        Alibaba · Apache 2.0      1290 +8/-8    10,981
37    37-39        KwaiKAT · Proprietary     1265 +15/-15  1,955
38    37-40        OpenAI · Proprietary      1249 +17/-17  1,537
39    37-40        xAI · Proprietary         1241 +11/-11  5,662
40    38-43        Mistral · Apache 2.0      1227 +20/-20  1,039
41    40-43        Google · Proprietary      1211 +13/-13  3,455
42    40-43        Mistral · Modified MIT    1211 +17/-17  1,453
43    40-43        xAI · Proprietary         1210 +19/-19  1,267
44    44-45        xAI · Proprietary         1158 +23/-23  968
45    44-46        xAI · Proprietary         1146 +22/-22  1,017
46    45-46        Mistral · Proprietary     1104 +22/-22  1,020
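
The page does not say how the Rank Spread values are derived. One reading that appears consistent with the listed numbers is an interval-overlap rule: a model's best possible rank is one plus the number of models whose lower bound lies above its upper bound, and its worst possible rank is one plus the number of models whose upper bound lies above its lower bound. The Python sketch below applies that assumed rule to the scores and +/- margins of the top 12 rows of the leaderboard above. The function name `rank_spreads` and the `ROWS` layout are illustrative, not part of the source.

```python
from typing import List, Tuple

# (score, plus, minus) for the top 12 leaderboard rows, copied from the listing.
ROWS: List[Tuple[int, int, int]] = [
    (1557, 33, 33), (1554, 34, 34), (1522, 43, 43), (1510, 35, 35),
    (1502, 10, 10), (1477, 16, 16), (1469, 9, 9), (1461, 9, 9),
    (1447, 10, 10), (1443, 33, 33), (1441, 9, 9), (1440, 28, 28),
]


def rank_spreads(rows: List[Tuple[int, int, int]]) -> List[Tuple[int, int]]:
    """Return (best_rank, worst_rank) per row under the ASSUMED overlap rule."""
    # Turn each score and margin into a (lower, upper) interval.
    bounds = [(score - minus, score + plus) for score, plus, minus in rows]
    spreads = []
    for i, (lo, hi) in enumerate(bounds):
        # Models that are certainly ahead: their lower bound beats our upper bound.
        certainly_ahead = sum(
            1 for j, (lo_j, _) in enumerate(bounds) if j != i and lo_j > hi
        )
        # Models that could be ahead: their upper bound beats our lower bound.
        possibly_ahead = sum(
            1 for j, (_, hi_j) in enumerate(bounds) if j != i and hi_j > lo
        )
        spreads.append((1 + certainly_ahead, 1 + possibly_ahead))
    return spreads


if __name__ == "__main__":
    for rank, (best, worst) in enumerate(rank_spreads(ROWS), start=1):
        print(f"rank {rank}: spread {best}-{worst}")
```

Because only 12 rows are included, the worst-rank values for the last few of them are incomplete: lower-ranked models with wide intervals are missing from the input. For the first eight rows the sketch yields spreads of 1 to 4, 1 to 4, 1 to 6, 1 to 8, 3 to 6, 3 to 10, 5 to 10, and 6 to 11.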

Leaderboard Plots

Confidence Intervals on Model Strength (via Bootstrapping)

Battle Count for Each Combination of Models (without Ties)

Fraction of Model A Wins for All Non-tied A vs. B Battles

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)
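
The plot titles above mention win fractions over non-tied battles and bootstrapped confidence intervals. As a rough illustration only, the Python sketch below computes the fraction of model A wins after dropping ties, then estimates a percentile bootstrap interval for that fraction. It is not the leaderboard's actual procedure: the page does not describe how model strength is fitted, and the outcome labels (`"a"`, `"b"`, `"tie"`), function names, and sample data are all hypothetical.

```python
import random
from typing import List, Tuple


def win_fraction(outcomes: List[str]) -> float:
    """Fraction of battles won by model A, ignoring ties."""
    decided = [o for o in outcomes if o != "tie"]
    if not decided:
        raise ValueError("no non-tied battles")
    return sum(o == "a" for o in decided) / len(decided)


def bootstrap_interval(
    outcomes: List[str],
    n_resamples: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for the non-tied win fraction."""
    rng = random.Random(seed)  # seeded so repeated runs give the same interval
    decided = [o for o in outcomes if o != "tie"]
    if not decided:
        raise ValueError("no non-tied battles")
    # Resample the decided battles with replacement and recompute the statistic.
    stats = sorted(
        win_fraction(rng.choices(decided, k=len(decided)))
        for _ in range(n_resamples)
    )
    k = int(n_resamples * alpha / 2)  # number of resamples cut from each tail
    return stats[k], stats[n_resamples - 1 - k]


if __name__ == "__main__":
    # Hypothetical battle log: 60 wins for A, 40 for B, 20 ties.
    battles = ["a"] * 60 + ["b"] * 40 + ["tie"] * 20
    print("win fraction:", win_fraction(battles))
    print("bootstrap interval:", bootstrap_interval(battles))
```

A percentile interval is used here because it needs no distributional assumptions. Resampling only the decided battles keeps every resample non-empty.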