Code Arena | Overall

View overall rankings across AI models on agentic coding tasks involving multi-step reasoning and tool use.

Mar 18, 2026
210,395 votes
55 models
Rank | Rank Spread | Organization · License | Score | Votes | Price | Context
1 | 1-2 | Anthropic · Proprietary | 1547 +11/-11 | 4,090 | $5 / $25 | 1M
2 | 1-2 | Anthropic · Proprietary | 1547 +12/-12 | 3,324 | $5 / $25 | 1M
3 | 3-3 | Anthropic · Proprietary | 1521 +9/-9 | 5,823 | $3 / $15 | 1M
4 | 4-4 | Anthropic | 1490 +7/-7 | 13,310 | $5 / $25 | 200K
5 | 5-8 | Anthropic · Proprietary | 1467 +7/-7 | 13,403 | $5 / $25 | 200K
6 | 5-13 | OpenAI · Proprietary | 1458 +17/-17 | 1,518 | N/A | N/A
7 | 5-10 | Google · Proprietary | 1456 +10/-10 | 4,285 | $2 / $12 | 1M
8 | 5-14 | MiniMax · Proprietary | 1452 +15/-15 | 1,708 | $0.30 / $1.20 | 204.8K
9 | 6-14 | Z.ai · MIT | 1446 +10/-10 | 4,389 | $0.72 / $2.30 | 80K
10 | 6-14 | Z.ai · MIT | 1440 +10/-10 | 5,069 | $0.39 / $1.75 | 202.8K
11 | 7-14 | Google · Proprietary | 1437 +7/-7 | 13,655 | $0.50 / $3 | 1M
12 | 7-14 | Google · Proprietary | 1437 +7/-7 | 17,775 | $2 / $12 | 1M
13 | 8-15 | Moonshot · Modified MIT | 1432 +9/-9 | 5,992 | $0.60 / $3 | N/A
14 | 7-17 | OpenAI · Proprietary | 1428 +16/-16 | 1,606 | N/A | N/A
15 | 13-18 | MiniMax · Modified MIT | 1417 +9/-9 | 5,913 | $0.20 / $1.20 | 196.6K
16 | 14-23 | OpenAI · Proprietary | 1409 +12/-12 | 3,023 | $1.75 / $14 | 400K
17 | 14-21 | Moonshot · Modified MIT | 1409 +10/-10 | 3,725 | $0.45 / $2.20 | 262.1K
18 | 16-26 | MiniMax · MIT | 1399 +8/-8 | 9,823 | $0.27 / $0.95 | 196.6K
19 | 15-28 | OpenAI · Proprietary | 1397 +16/-16 | 1,584 | $1.75 / $14 | 400K
20 | 16-26 |  | 1396 +7/-7 | 11,133 | $0.50 / $3 | 1M
21 | 16-29 | OpenAI · Proprietary | 1391 +12/-12 | 3,847 | $1.25 / $10 | 400K
22 | 18-28 | Anthropic | 1389 +6/-6 | 16,085 | $3 / $15 | 200K
23 | 17-29 | Alibaba · Apache 2.0 | 1388 +10/-10 | 4,471 | $0.39 / $2.34 | 262.1K
24 | 17-28 | OpenAI · Proprietary | 1388 +9/-9 | 6,361 | $1.25 / $10 | 400K
25 | 18-28 | Anthropic · Proprietary | 1386 +6/-6 | 17,939 | $3 / $15 | 200K
26 | 18-29 | Anthropic · Proprietary | 1385 +9/-9 | 8,820 | $15 / $75 | 200K
27 | 20-31 |  | 1372 +15/-15 | 1,812 | $2 / $6 | 2M
28 | 20-31 | Alibaba · Apache 2.0 | 1372 +11/-11 | 3,130 | $0.26 / $2.08 | 262.1K
29 | 24-31 | DeepSeek · MIT | 1371 +8/-8 | 7,502 | $0.26 / $0.38 | 163.8K
30 | 27-34 | Alibaba · Apache 2.0 | 1354 +12/-12 | 2,845 | $0.20 / $1.56 | 262.1K
31 | 27-33 | Z.ai · MIT | 1354 +9/-9 | 8,608 | $0.39 / $1.90 | 204.8K
32 | 30-35 | OpenAI · Proprietary | 1341 +7/-7 | 13,303 | $1.25 / $10 | 400K
33 | 30-37 |  | 1338 +8/-8 | 7,020 | $0.09 / $0.29 | 262.1K
34 | 31-37 | OpenAI · Proprietary | 1337 +8/-8 | 7,900 | $1.75 / $14 | 400K
35 | 32-38 | OpenAI · Proprietary | 1327 +9/-9 | 6,430 | $1.25 / $10 | 400K
36 | 33-37 | Moonshot · Modified MIT | 1327 +6/-6 | 14,478 | $1.15 / $8 | 262.1K
37 | 33-39 | DeepSeek · MIT | 1323 +8/-8 | 8,970 | $0.26 / $0.38 | 163.8K
38 | 36-40 | MiniMax · Apache 2.0 | 1311 +9/-9 | 8,702 | $0.26 / $1 | 196.6K
39 | 38-40 | Anthropic · Proprietary | 1308 +6/-6 | 15,820 | $1 / $5 | 200K
40 | 37-42 |  | 1303 +14/-14 | 2,137 | $0.09 / $0.29 | 262.1K
41 | 40-42 | DeepSeek · MIT | 1285 +10/-10 | 5,052 | $0.27 / $0.41 | 163.8K
42 | 40-42 | Alibaba · Apache 2.0 | 1283 +6/-6 | 15,530 | $0.40 / $1.60 | 262.1K
43 | 43-48 | KwaiKAT · Proprietary | 1258 +15/-15 | 1,940 | $0.21 / $0.83 | 256K
44 | 43-49 | Google · Proprietary | 1252 +16/-16 | 1,493 | $0.25 / $1.50 | 1M
45 | 43-49 | Alibaba · Apache 2.0 | 1250 +15/-15 | 1,848 | $0.16 / $1.30 | 262.1K
46 | 43-49 | OpenAI · Proprietary | 1242 +17/-17 | 1,510 | $0.25 / $2 | 400K
47 | 43-50 | Alibaba · Proprietary | 1240 +17/-17 | 1,582 | N/A | N/A
48 | 43-49 | xAI · Proprietary | 1234 +9/-9 | 7,078 | $0.20 / $0.50 | 2M
49 | 44-52 | Mistral · Apache 2.0 | 1220 +20/-20 | 1,038 | $0.50 / $1.50 | N/A
50 | 49-52 | Google · Proprietary | 1205 +13/-13 | 3,373 | $1.25 / $10 | 1M
51 | 48-52 | xAI · Proprietary | 1204 +19/-19 | 1,256 | $0.20 / $0.50 | N/A
52 | 49-52 | Mistral · Modified MIT | 1198 +16/-16 | 1,633 | N/A | N/A
53 | 53-54 | xAI · Proprietary | 1149 +23/-23 | 940 | $0.20 / $0.50 | 2M
54 | 53-55 | xAI · Proprietary | 1139 +22/-22 | 992 | $0.20 / $1.50 | 256K
55 | 54-55 | Mistral · Proprietary | 1095 +22/-22 | 1,003 | $0.40 / $2 | 128K

Leaderboard Plots

Fraction of Model A Wins for All Non-tied A vs. B Battles

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)

Confidence Intervals on Model Strength (via Bootstrapping)

Battle Count for Each Combination of Models (without Ties)
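
The first two plot titles name statistics that can be computed directly from pairwise battle records. The following is a minimal Python sketch of one way to do that, not the leaderboard's actual pipeline. The record layout, the model names, and the helper functions `win_fractions` and `average_win_rate` are all hypothetical, and the toy data is invented for illustration only.

```python
from collections import defaultdict

# Hypothetical battle records: (model_a, model_b, winner),
# where winner is "a", "b", or "tie". Invented data, not leaderboard data.
battles = [
    ("model-x", "model-y", "a"),
    ("model-x", "model-y", "b"),
    ("model-x", "model-y", "a"),
    ("model-y", "model-z", "a"),
    ("model-x", "model-z", "tie"),
    ("model-z", "model-x", "b"),
]

def win_fractions(records):
    """Return {(a, b): fraction of non-tied a-vs-b battles won by a}."""
    wins = defaultdict(int)
    totals = defaultdict(int)
    for a, b, winner in records:
        if winner == "tie":
            continue  # ties are excluded, as in the plot title
        totals[(a, b)] += 1
        totals[(b, a)] += 1
        wins[(a, b) if winner == "a" else (b, a)] += 1
    return {pair: wins[pair] / n for pair, n in totals.items()}

def average_win_rate(fractions):
    """Average each model's win fraction over its opponents, weighting every opponent equally."""
    per_model = defaultdict(list)
    for (model, _opponent), frac in fractions.items():
        per_model[model].append(frac)
    return {m: sum(v) / len(v) for m, v in per_model.items()}

fractions = win_fractions(battles)
averages = average_win_rate(fractions)

for model, rate in sorted(averages.items(), key=lambda kv: -kv[1]):
    print(f"{model}: {rate:.3f}")
```

Weighting every opponent equally is this sketch's reading of "uniform sampling": a model's average is not dominated by whichever pairing happened to collect the most votes.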