Code Arena: Overall

View overall rankings across AI models on agentic coding tasks involving multi-step reasoning and tool use.

Apr 1, 2026
224,709 votes
59 models
| Rank | Rank Spread | Organization | License | Score | Votes | Price | Context |
|---|---|---|---|---|---|---|---|
| 1 | 1-2 | Anthropic | Proprietary | 1546 +12/-12 | 3,698 | $5 / $25 | 1M |
| 2 | 1-2 | Anthropic | Proprietary | 1543 +11/-11 | 4,479 | $5 / $25 | 1M |
| 3 | 3-3 | Anthropic | Proprietary | 1521 +9/-9 | 7,086 | $3 / $15 | 1M |
| 4 | 4-4 | Anthropic | | 1491 +7/-7 | 13,254 | $5 / $25 | 200K |
| 5 | 5-8 | Anthropic | Proprietary | 1465 +7/-7 | 14,248 | $5 / $25 | 200K |
| 6 | 5-15 | OpenAI | Proprietary | 1457 +17/-17 | 1,495 | N/A | N/A |
| 7 | 5-10 | Google | Proprietary | 1456 +9/-9 | 5,467 | $2 / $12 | 1M |
| 8 | 5-16 | Alibaba | Proprietary | 1454 +19/-19 | 1,125 | N/A | N/A |
| 9 | 6-16 | Z.ai | MIT | 1441 +10/-10 | 4,536 | $1 / $3.20 | 202.8K |
| 10 | 6-16 | Z.ai | MIT | 1439 +10/-10 | 4,876 | $0.39 / $1.75 | 202.8K |
| 11 | 7-16 | Google | Proprietary | 1438 +7/-7 | 17,165 | $2 / $12 | 1M |
| 12 | 7-16 | Google | Proprietary | 1436 +7/-7 | 13,282 | $0.50 / $3 | 1M |
| 13 | 7-16 | Xiaomi | Proprietary | 1433 +12/-12 | 2,903 | $1 / $3 | 1M |
| 14 | 8-16 | Moonshot | Modified MIT | 1429 +8/-8 | 6,694 | $0.60 / $3 | N/A |
| 15 | 7-19 | MiniMax | Proprietary | 1428 +12/-12 | 2,716 | $0.30 / $1.20 | 204.8K |
| 16 | 7-19 | OpenAI | Proprietary | 1427 +16/-16 | 1,579 | N/A | N/A |
| 17 | 15-26 | Moonshot | Modified MIT | 1408 +11/-11 | 3,610 | $0.38 / $1.72 | 262.1K |
| 18 | 15-28 | OpenAI | Proprietary | 1407 +12/-12 | 2,974 | $1.75 / $14 | 400K |
| 19 | 15-30 | OpenAI | Proprietary | 1403 +17/-17 | 1,460 | $1.75 / $14 | 400K |
| 20 | 17-30 | MiniMax | Modified MIT | 1396 +8/-8 | 6,716 | $0.12 / $0.99 | 196.6K |
| 21 | 17-30 | OpenAI | Proprietary | 1392 +13/-13 | 3,753 | $1.25 / $10 | 400K |
| 22 | 17-30 | MiniMax | MIT | 1391 +8/-8 | 9,275 | $0.27 / $0.95 | 196.6K |
| 23 | 17-30 | | | 1391 +7/-7 | 12,208 | $0.50 / $3 | 1M |
| 24 | 17-30 | OpenAI | Proprietary | 1390 +9/-9 | 6,124 | $1.25 / $10 | 400K |
| 25 | 18-30 | Anthropic | | 1388 +6/-6 | 15,916 | $3 / $15 | 200K |
| 26 | 18-30 | Alibaba | Apache 2.0 | 1386 +9/-9 | 5,559 | $0.39 / $2.34 | 262.1K |
| 27 | 19-30 | Anthropic | Proprietary | 1386 +6/-6 | 18,512 | $3 / $15 | 200K |
| 28 | 17-31 | | | 1386 +12/-12 | 3,030 | $2 / $6 | 2M |
| 29 | 17-32 | OpenAI | Proprietary | 1385 +18/-18 | 1,198 | $0.75 / $4.50 | 400K |
| 30 | 19-31 | Anthropic | Proprietary | 1384 +9/-9 | 8,570 | $15 / $75 | 200K |
| 31 | 28-33 | DeepSeek | MIT | 1368 +8/-8 | 8,118 | $0.26 / $0.38 | 163.8K |
| 32 | 30-34 | Alibaba | Apache 2.0 | 1362 +10/-10 | 4,272 | $0.26 / $2.08 | 262.1K |
| 33 | 31-35 | Z.ai | MIT | 1354 +9/-9 | 8,345 | $0.39 / $1.90 | 204.8K |
| 34 | 32-40 | Alibaba | Apache 2.0 | 1344 +10/-10 | 3,958 | $0.20 / $1.56 | 262.1K |
| 35 | 33-40 | OpenAI | Proprietary | 1339 +7/-7 | 12,868 | $1.25 / $10 | 400K |
| 36 | 34-40 | | | 1337 +8/-8 | 6,737 | $0.09 / $0.29 | 262.1K |
| 37 | 34-40 | OpenAI | Proprietary | 1335 +8/-8 | 7,956 | $1.75 / $14 | 400K |
| 38 | 34-40 | Moonshot | Modified MIT | 1329 +6/-6 | 15,230 | $1.15 / $8 | 262.1K |
| 39 | 34-40 | OpenAI | Proprietary | 1328 +9/-9 | 6,225 | $1.25 / $10 | 400K |
| 40 | 34-40 | DeepSeek | MIT | 1327 +7/-7 | 9,603 | $0.26 / $0.38 | 163.8K |
| 41 | 41-43 | Anthropic | Proprietary | 1312 +6/-6 | 16,594 | $1 / $5 | 200K |
| 42 | 41-44 | MiniMax | Apache 2.0 | 1303 +9/-9 | 8,400 | $0.26 / $1 | 196.6K |
| 43 | 41-45 | | | 1300 +14/-14 | 2,096 | $0.09 / $0.29 | 262.1K |
| 44 | 42-45 | DeepSeek | MIT | 1285 +11/-11 | 4,869 | $0.27 / $0.41 | 163.8K |
| 45 | 43-45 | Alibaba | Apache 2.0 | 1280 +6/-6 | 15,380 | $0.40 / $1.60 | 262.1K |
| 46 | 46-51 | KwaiKAT | Proprietary | 1257 +15/-15 | 1,883 | $0.21 / $0.83 | 256K |
| 47 | 46-52 | Alibaba | Apache 2.0 | 1247 +16/-16 | 1,817 | $0.16 / $1.30 | 262.1K |
| 48 | 46-52 | Google | Proprietary | 1238 +10/-10 | 5,276 | $0.25 / $1.50 | 1M |
| 49 | 46-53 | OpenAI | Proprietary | 1238 +17/-17 | 1,443 | $0.25 / $2 | 400K |
| 50 | 46-53 | Alibaba | Proprietary | 1235 +17/-17 | 1,562 | N/A | N/A |
| 51 | 46-53 | xAI | Proprietary | 1233 +9/-9 | 6,917 | $0.20 / $0.50 | 2M |
| 52 | 47-56 | Mistral | Apache 2.0 | 1221 +20/-20 | 1,031 | $0.50 / $1.50 | N/A |
| 53 | 49-56 | xAI | Proprietary | 1207 +20/-20 | 1,209 | N/A | N/A |
| 54 | 52-56 | Google | Proprietary | 1202 +13/-13 | 3,295 | $1.25 / $10 | 1M |
| 55 | 52-56 | Mistral | Modified MIT | 1198 +17/-17 | 1,585 | N/A | N/A |
| 56 | 52-57 | Inception AI | Proprietary | 1182 +21/-21 | 1,107 | $0.25 / $0.75 | 128K |
| 57 | 56-58 | xAI | Proprietary | 1148 +23/-23 | 934 | $0.20 / $0.50 | 2M |
| 58 | 57-58 | xAI | Proprietary | 1138 +22/-22 | 983 | $0.20 / $1.50 | 256K |
| 59 | 59-59 | Mistral | Proprietary | 1090 +23/-23 | 993 | $0.40 / $2 | 128K |

Leaderboard Plots

Confidence Intervals on Model Strength (via Bootstrapping)

Battle Count for Each Combination of Models (without Ties)

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)

Fraction of Model A Wins for All Non-tied A vs. B Battles
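
The plot titles above name three statistics: the fraction of non-tied battles each model wins against each opponent, the average win rate against all other models, and bootstrapped confidence intervals. The following is a minimal Python sketch of how such quantities could be computed from a battle log. The battle data, model names (`m1`, `m2`, `m3`), and function names are hypothetical, and the bootstrap here resamples the average win rate as a simpler stand-in for the leaderboard's rating score, whose exact fitting procedure is not described on this page.

```python
import random
from collections import defaultdict

# Hypothetical battle log: (model_a, model_b, winner), winner is "a", "b", or "tie".
BATTLES = [
    ("m1", "m2", "a"),
    ("m1", "m2", "a"),
    ("m1", "m2", "b"),
    ("m2", "m3", "a"),
    ("m2", "m3", "tie"),
    ("m1", "m3", "a"),
    ("m3", "m1", "b"),
    ("m3", "m2", "a"),
]


def win_fractions(battles):
    """Return {(x, y): fraction of non-tied x-vs-y battles won by x}."""
    wins = defaultdict(int)
    totals = defaultdict(int)
    for a, b, winner in battles:
        if winner == "tie":
            continue  # ties are excluded, as in the plot titles
        victor, loser = (a, b) if winner == "a" else (b, a)
        wins[(victor, loser)] += 1
        totals[(a, b)] += 1
        totals[(b, a)] += 1
    return {pair: wins[pair] / n for pair, n in totals.items()}


def average_win_rate(fractions):
    """Average each model's win fraction over its opponents, weighting opponents equally."""
    per_model = defaultdict(list)
    for (model, _opponent), frac in fractions.items():
        per_model[model].append(frac)
    return {m: sum(v) / len(v) for m, v in per_model.items()}


def bootstrap_interval(battles, model, rounds=200, seed=0, alpha=0.05):
    """Percentile bootstrap interval for one model's average win rate."""
    rng = random.Random(seed)
    samples = []
    for _ in range(rounds):
        # Resample battles with replacement and recompute the statistic.
        resampled = rng.choices(battles, k=len(battles))
        rate = average_win_rate(win_fractions(resampled)).get(model)
        if rate is not None:  # the model may be absent from a resample
            samples.append(rate)
    samples.sort()
    lo = samples[int((alpha / 2) * len(samples))]
    hi = samples[int((1 - alpha / 2) * len(samples)) - 1]
    return lo, hi


fractions = win_fractions(BATTLES)
print(fractions[("m1", "m3")])            # 1.0: m1 won both non-tied battles against m3
print(average_win_rate(fractions)["m3"])  # 0.25: mean of 0.5 (vs m2) and 0.0 (vs m1)
print(bootstrap_interval(BATTLES, "m1"))
```

With only a handful of battles the bootstrap interval is very wide, which is in line with the table, where models with fewer votes tend to show larger +/- values.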