Code Arena🏆WebDev

View overall rankings across AI models on agentic coding tasks involving multi-step reasoning and tool use.

Apr 22, 2026
249,561 votes
64 models
| Rank | Rank Spread | Organization · License | Score | Votes | Price | Context |
|---|---|---|---|---|---|---|
| 1 | 1-3 | Anthropic · Proprietary | 1576 +19/-19 | 1,286 | $5 / $25 | 1M |
| 2 | 1-4 | Anthropic · Proprietary | 1566 +16/-16 | 1,684 | $5 / $25 | 1M |
| 3 | 1-6 | Anthropic · Proprietary | 1552 +10/-10 | 4,815 | $5 / $25 | 1M |
| 4 | 2-6 | Anthropic · Proprietary | 1545 +9/-9 | 5,723 | $5 / $25 | 1M |
| 5 | 3-8 | Z.ai · MIT | 1534 +14/-14 | 2,398 | $1.05 / $3.50 | 202.8K |
| 6 | 3-8 | Moonshot · Modified MIT | 1529 +18/-18 | 1,232 | $0.75 / $3.50 | 262.1K |
| 7 | 5-8 | Anthropic · Proprietary | 1526 +9/-9 | 7,660 | $3 / $15 | 1M |
| 8 | 5-9 | Meta · Proprietary | 1513 +16/-16 | 1,608 | N/A | N/A |
| 9 | 8-9 | Anthropic | 1491 +7/-7 | 13,066 | $5 / $25 | 200K |
| 10 | 10-13 | Alibaba · Proprietary | 1471 +12/-12 | 2,753 | $0.33 / $1.95 | 1M |
| 11 | 10-13 | Anthropic · Proprietary | 1468 +6/-6 | 15,287 | $5 / $25 | 200K |
| 12 | 10-18 | OpenAI · Proprietary | 1457 +17/-17 | 1,482 | $2.50 / $15 | 1.1M |
| 13 | 10-15 | Google · Proprietary | 1456 +8/-8 | 6,727 | $2 / $12 | 1M |
| 14 | 12-20 | Z.ai · MIT | 1440 +10/-10 | 4,879 | $0.38 / $1.74 | 202.8K |
| 15 | 13-20 | Google · Proprietary | 1438 +7/-7 | 17,169 | $2 / $12 | 1M |
| 16 | 12-22 | OpenAI · Proprietary | 1437 +16/-16 | 1,448 | $2.50 / $15 | 1.1M |
| 17 | 13-20 | Google · Proprietary | 1437 +7/-7 | 13,276 | $0.50 / $3 | 1M |
| 18 | 13-21 | Z.ai · MIT | 1436 +9/-9 | 5,576 | $1 / $3.20 | 202.8K |
| 19 | 14-21 | Moonshot · Modified MIT | 1430 +8/-8 | 7,854 | $0.60 / $3 | N/A |
| 20 | 14-24 | Xiaomi · Proprietary | 1426 +10/-10 | 3,946 | $1 / $3 | 1M |
| 21 | 17-26 | MiniMax · Modified MIT | 1417 +10/-10 | 3,714 | $0.30 / $1.20 | 196.6K |
| 22 | 20-29 | Moonshot · Modified MIT | 1408 +11/-11 | 3,611 | $0.44 / $2 | 262.1K |
| 23 | 20-31 | OpenAI · Proprietary | 1407 +12/-12 | 2,970 | $1.75 / $14 | 400K |
| 24 | 19-35 | OpenAI · Proprietary | 1404 +17/-17 | 1,459 | $1.75 / $14 | 400K |
| 25 | 21-33 | | 1403 +10/-10 | 4,115 | $2 / $6 | 2M |
| 26 | 21-35 | OpenAI · Proprietary | 1399 +13/-13 | 2,460 | $2.50 / $15 | 1.1M |
| 27 | 22-35 | OpenAI · Proprietary | 1393 +13/-13 | 3,755 | $1.25 / $10 | 400K |
| 28 | 22-35 | MiniMax · MIT | 1392 +8/-8 | 9,274 | $0.29 / $0.95 | 196.6K |
| 29 | 22-35 | OpenAI · Proprietary | 1391 +9/-9 | 6,122 | $1.25 / $10 | 400K |
| 30 | 23-35 | Alibaba · Apache 2.0 | 1389 +8/-8 | 6,624 | $0.39 / $2.34 | 262.1K |
| 31 | 23-35 | | 1389 +6/-6 | 13,308 | $0.50 / $3 | 1M |
| 32 | 24-35 | Anthropic | 1388 +6/-6 | 15,742 | $3 / $15 | 200K |
| 33 | 25-35 | Anthropic · Proprietary | 1386 +6/-6 | 18,405 | $3 / $15 | 200K |
| 34 | 24-35 | Anthropic · Proprietary | 1385 +9/-9 | 8,568 | $15 / $75 | 200K |
| 35 | 25-36 | MiniMax · Modified MIT | 1384 +8/-8 | 7,847 | $0.15 / $1.20 | 196.6K |
| 36 | 35-38 | DeepSeek · MIT | 1368 +8/-8 | 7,911 | $0.25 / $0.38 | 131.1K |
| 37 | 36-39 | Alibaba · Apache 2.0 | 1363 +9/-9 | 5,428 | $0.26 / $2.08 | 262.1K |
| 38 | 36-40 | Z.ai · MIT | 1354 +9/-9 | 8,348 | $0.39 / $1.90 | 204.8K |
| 39 | 37-44 | Alibaba · Apache 2.0 | 1346 +9/-9 | 4,991 | $0.20 / $1.56 | 262.1K |
| 40 | 38-45 | OpenAI · Proprietary | 1339 +7/-7 | 12,872 | $1.25 / $10 | 400K |
| 41 | 39-45 | | 1337 +8/-8 | 6,736 | $0.09 / $0.29 | 262.1K |
| 42 | 39-45 | OpenAI · Proprietary | 1335 +8/-8 | 7,765 | $1.75 / $14 | 400K |
| 43 | 39-45 | DeepSeek · MIT | 1331 +7/-7 | 10,360 | $0.25 / $0.38 | 131.1K |
| 44 | 40-45 | Moonshot · Modified MIT | 1330 +6/-6 | 15,363 | $1.15 / $8 | 262.1K |
| 45 | 39-46 | OpenAI · Proprietary | 1329 +9/-9 | 6,228 | $1.25 / $10 | 400K |
| 46 | 45-48 | Anthropic · Proprietary | 1315 +6/-6 | 17,784 | $1 / $5 | 200K |
| 47 | 46-49 | MiniMax · Apache 2.0 | 1304 +9/-9 | 8,401 | $0.26 / $1 | 196.6K |
| 48 | 46-50 | | 1300 +14/-14 | 2,096 | $0.09 / $0.29 | 262.1K |
| 49 | 47-50 | DeepSeek · MIT | 1286 +11/-11 | 4,870 | $0.27 / $0.41 | 163.8K |
| 50 | 48-50 | Alibaba · Apache 2.0 | 1281 +7/-7 | 15,214 | $0.40 / $1.60 | 262.1K |
| 51 | 51-57 | KwaiKAT · Proprietary | 1258 +15/-15 | 1,882 | $0.21 / $0.83 | 256K |
| 52 | 51-57 | Alibaba · Apache 2.0 | 1248 +16/-16 | 1,814 | $0.16 / $1.30 | 262.1K |
| 53 | 51-58 | OpenAI · Proprietary | 1239 +17/-17 | 1,444 | $0.25 / $2 | 400K |
| 54 | 51-58 | Alibaba · Proprietary | 1236 +17/-17 | 1,560 | N/A | N/A |
| 55 | 51-58 | Google · Proprietary | 1235 +9/-9 | 6,155 | $0.25 / $1.50 | 1M |
| 56 | 51-58 | xAI · Proprietary | 1234 +9/-9 | 6,916 | $0.20 / $0.50 | 2M |
| 57 | 51-60 | Mistral · Apache 2.0 | 1223 +20/-20 | 1,031 | $0.50 / $1.50 | N/A |
| 58 | 53-60 | xAI · Proprietary | 1208 +20/-20 | 1,209 | N/A | N/A |
| 59 | 57-60 | Google · Proprietary | 1203 +13/-13 | 3,300 | $1.25 / $10 | 1M |
| 60 | 57-61 | Mistral · Modified MIT | 1199 +17/-17 | 1,582 | N/A | N/A |
| 61 | 60-63 | Inception AI · Proprietary | 1165 +23/-23 | 947 | $0.25 / $0.75 | 128K |
| 62 | 61-63 | xAI · Proprietary | 1149 +23/-23 | 936 | $0.20 / $0.50 | 2M |
| 63 | 61-63 | xAI · Proprietary | 1139 +22/-22 | 984 | $0.20 / $1.50 | 256K |
| 64 | 64-64 | Mistral · Proprietary | 1091 +23/-23 | 993 | $0.40 / $2 | 128K |

Remove Style Control Leaderboard Plots

Fraction of Model A Wins for All Non-tied A vs. B Battles

Confidence Intervals on Model Strength (via Bootstrapping)

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)

Battle Count for Each Combination of Models (without Ties)
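
The plot titles above refer to win fractions over non-tied battles and to bootstrapped confidence intervals. As a rough illustration only, the sketch below computes a non-tied win fraction for one pair of models and a percentile bootstrap interval around it. The battle records, the model names `model_x` and `model_y`, and both function names are hypothetical, and this bootstraps a simple win fraction rather than the rating model behind the leaderboard scores, which the page does not describe.

```python
import random

def win_fraction(battles, a, b):
    """Fraction of non-tied a-vs-b battles won by a, or None if there are none."""
    wins_a = wins_b = 0
    for left, right, winner in battles:
        if {left, right} != {a, b} or winner not in (a, b):
            continue  # skip other pairings and ties
        if winner == a:
            wins_a += 1
        else:
            wins_b += 1
    total = wins_a + wins_b
    return wins_a / total if total else None

def bootstrap_interval(battles, a, b, rounds=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the non-tied win fraction of a over b."""
    rng = random.Random(seed)
    stats = []
    for _ in range(rounds):
        # Resample battles with replacement, keeping the original sample size.
        sample = rng.choices(battles, k=len(battles))
        frac = win_fraction(sample, a, b)
        if frac is not None:
            stats.append(frac)
    if not stats:
        return None
    stats.sort()
    lo = stats[int((alpha / 2) * len(stats))]
    hi = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lo, hi

# Hypothetical records: (left model, right model, winner or "tie").
battles = (
    [("model_x", "model_y", "model_x")] * 6
    + [("model_x", "model_y", "model_y")] * 3
    + [("model_x", "model_y", "tie")]
)
fraction = win_fraction(battles, "model_x", "model_y")
interval = bootstrap_interval(battles, "model_x", "model_y")
```

With only nine non-tied battles the interval is wide, which matches the pattern in the table where entries with fewer votes carry larger +/- margins.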