Code Arena | Overall

View overall rankings across AI models on agentic coding tasks involving multi-step reasoning, tool use, and production-style workflows.

Feb 24, 2026
171,212 votes
46 models
| Rank | Rank Spread | Organization | License | Score | Votes |
|---|---|---|---|---|---|
| 1 | 1-2 | Anthropic | Proprietary | 1560 +13/-13 | 2,845 |
| 2 | 1-3 | Anthropic | Proprietary | 1553 +15/-15 | 2,182 |
| 3 | 2-3 | Anthropic | Proprietary | 1531 +16/-16 | 1,839 |
| 4 | 4-4 | Anthropic | | 1499 +8/-8 | 11,149 |
| 5 | 5-8 | OpenAI | Proprietary | 1471 +16/-16 | 1,696 |
| 6 | 5-8 | Anthropic | Proprietary | 1471 +8/-8 | 11,239 |
| 7 | 5-13 | Google | Proprietary | 1461 +15/-15 | 1,826 |
| 8 | 5-13 | Z.ai | MIT | 1451 +13/-13 | 2,621 |
| 9 | 7-13 | Google | Proprietary | 1443 +7/-7 | 17,027 |
| 10 | 7-13 | Google | Proprietary | 1441 +7/-7 | 12,934 |
| 11 | 7-14 | Z.ai | MIT | 1439 +10/-10 | 5,128 |
| 12 | 7-14 | Moonshot | Modified MIT | 1436 +10/-10 | 4,022 |
| 13 | 7-14 | MiniMax | Modified MIT | 1436 +11/-11 | 3,632 |
| 14 | 11-17 | Moonshot | Modified MIT | 1419 +12/-12 | 2,954 |
| 15 | 14-22 | MiniMax | MIT | 1401 +8/-8 | 9,796 |
| 16 | 14-23 | | | 1399 +8/-8 | 8,840 |
| 17 | 14-23 | Alibaba | Apache 2.0 | 1396 +12/-12 | 2,473 |
| 18 | 14-24 | OpenAI | Proprietary | 1395 +15/-15 | 1,634 |
| 19 | 15-23 | OpenAI | Proprietary | 1392 +12/-12 | 3,929 |
| 20 | 15-23 | Anthropic | | 1388 +7/-7 | 14,186 |
| 21 | 15-23 | Anthropic | Proprietary | 1388 +8/-8 | 8,983 |
| 22 | 15-24 | OpenAI | Proprietary | 1387 +9/-9 | 6,438 |
| 23 | 16-24 | Anthropic | Proprietary | 1386 +7/-7 | 15,941 |
| 24 | 21-25 | DeepSeek | MIT | 1370 +9/-9 | 6,049 |
| 25 | 24-27 | Z.ai | MIT | 1355 +8/-8 | 8,747 |
| 26 | 25-30 | OpenAI | Proprietary | 1342 +7/-7 | 13,179 |
| 27 | 25-30 | | | 1341 +8/-8 | 6,932 |
| 28 | 26-31 | OpenAI | Proprietary | 1334 +9/-9 | 5,814 |
| 29 | 26-31 | Moonshot | Modified MIT | 1331 +7/-7 | 12,690 |
| 30 | 26-33 | OpenAI | Proprietary | 1328 +9/-9 | 6,507 |
| 31 | 28-34 | DeepSeek | MIT | 1319 +8/-8 | 7,376 |
| 32 | 30-34 | MiniMax | Apache 2.0 | 1312 +9/-9 | 8,833 |
| 33 | 30-35 | | | 1306 +13/-13 | 2,146 |
| 34 | 31-34 | Anthropic | Proprietary | 1305 +7/-7 | 13,948 |
| 35 | 34-36 | DeepSeek | MIT | 1286 +10/-10 | 5,131 |
| 36 | 35-37 | Alibaba | Apache 2.0 | 1280 +7/-7 | 13,656 |
| 37 | 36-39 | KwaiKAT | Proprietary | 1258 +15/-15 | 1,955 |
| 38 | 37-40 | OpenAI | Proprietary | 1242 +17/-17 | 1,537 |
| 39 | 37-40 | xAI | Proprietary | 1235 +9/-9 | 7,130 |
| 40 | 38-43 | Mistral | Apache 2.0 | 1222 +20/-20 | 1,039 |
| 41 | 40-43 | Google | Proprietary | 1205 +13/-13 | 3,455 |
| 42 | 40-43 | xAI | Proprietary | 1203 +19/-19 | 1,267 |
| 43 | 40-43 | Mistral | Modified MIT | 1197 +16/-16 | 1,686 |
| 44 | 44-45 | xAI | Proprietary | 1152 +22/-22 | 968 |
| 45 | 44-46 | xAI | Proprietary | 1140 +21/-21 | 1,017 |
| 46 | 45-46 | Mistral | Proprietary | 1099 +22/-22 | 1,020 |
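
The page does not state how the Rank Spread column is derived. One reading that is consistent with the top rows is that a model's best possible rank counts the models whose score interval lies entirely above its own, and its worst possible rank counts the models whose interval lies entirely below. The sketch below illustrates that reading only; the function name `rank_spreads` and the choice to treat touching interval edges as separated are assumptions, not the leaderboard's documented method.

```python
from typing import List, Tuple


def rank_spreads(rows: List[Tuple[float, float, float]]) -> List[Tuple[int, int]]:
    """Return (best_rank, worst_rank) for each (score, plus, minus) row.

    Assumption: a model is "definitely above" another when its lower bound
    is at or above the other's upper bound (touching edges count as separated).
    """
    intervals = [(score - minus, score + plus) for score, plus, minus in rows]
    n = len(intervals)
    spreads = []
    for i, (lo_i, hi_i) in enumerate(intervals):
        # Models whose whole interval sits above this model's interval.
        above = sum(1 for j, (lo_j, _) in enumerate(intervals)
                    if j != i and lo_j >= hi_i)
        # Models whose whole interval sits below this model's interval.
        below = sum(1 for j, (_, hi_j) in enumerate(intervals)
                    if j != i and hi_j <= lo_i)
        spreads.append((1 + above, n - below))
    return spreads


# Scores and bounds of the first nine rows of the leaderboard.
top_rows = [
    (1560, 13, 13), (1553, 15, 15), (1531, 16, 16),
    (1499, 8, 8), (1471, 16, 16), (1471, 8, 8),
    (1461, 15, 15), (1451, 13, 13), (1443, 7, 7),
]

for rank, (best, worst) in enumerate(rank_spreads(top_rows), start=1):
    print(rank, f"{best}-{worst}")
```

With only nine rows supplied, the worst-rank values for the last few rows are cut off at 9, so only the first six lines of output are comparable with the published column. Because the displayed scores are rounded, intervals that touch at an edge can also shift a bound by one rank relative to the published value.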

Leaderboard Plots

Confidence Intervals on Model Strength (via Bootstrapping)

Battle Count for Each Combination of Models (without Ties)

Fraction of Model A Wins for All Non-tied A vs. B Battles

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)
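
The last two plot titles describe quantities that can be computed directly from a battle log. The sketch below is a minimal illustration, assuming a hypothetical record format `(model_a, model_b, winner)` with `winner` set to `None` for a tie; the model names and battles are made up and are not taken from the leaderboard data. "Uniform sampling" is read here as giving every opponent equal weight regardless of how many battles were played against it.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

Battle = Tuple[str, str, Optional[str]]  # (model_a, model_b, winner or None for a tie)


def win_fractions(battles: Iterable[Battle]) -> Dict[Tuple[str, str], float]:
    """Fraction of non-tied battles that the first model of each pair won."""
    wins: Dict[Tuple[str, str], int] = defaultdict(int)
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    for a, b, winner in battles:
        if winner not in (a, b):
            continue  # ties are excluded
        loser = b if winner == a else a
        wins[(winner, loser)] += 1
        totals[(a, b)] += 1
        totals[(b, a)] += 1
    return {pair: wins[pair] / count for pair, count in totals.items()}


def average_win_rates(fractions: Dict[Tuple[str, str], float]) -> Dict[str, float]:
    """Mean win fraction per model, weighting every opponent faced equally."""
    per_model = defaultdict(list)
    for (model, _opponent), fraction in fractions.items():
        per_model[model].append(fraction)
    return {model: sum(vals) / len(vals) for model, vals in per_model.items()}


# Hypothetical battle log for three placeholder models.
battles = [
    ("A", "B", "A"), ("A", "B", "A"), ("A", "B", "B"),
    ("B", "A", None),  # tie, ignored
    ("A", "C", "A"),
    ("B", "C", "C"), ("B", "C", "B"),
]

fractions = win_fractions(battles)
averages = average_win_rates(fractions)
print(fractions[("A", "B")])
print(averages)
```

Averaging per-opponent fractions, rather than pooling all battles, keeps a model's rate from being dominated by whichever opponent it happened to face most often.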