Code Arena: Overall

View overall rankings across AI models on agentic coding tasks involving multi-step reasoning and tool use.

Apr 1, 2026
224,709 votes
21 open source models
Rank  Overall  Organization · License   Score         Rank Spread  Votes   Price           Context
1     9        Z.ai · MIT               1441 +10/-10  6-16         4,536   $1 / $3.20      202.8K
2     10       Z.ai · MIT               1439 +10/-10  6-16         4,876   $0.39 / $1.75   202.8K
3     14       Moonshot · Modified MIT  1429 +8/-8    8-16         6,694   $0.60 / $3      N/A
4     17       Moonshot · Modified MIT  1408 +11/-11  15-26        3,610   $0.38 / $1.72   262.1K
5     20       MiniMax · Modified MIT   1396 +8/-8    17-30        6,716   $0.12 / $0.99   196.6K
6     22       MiniMax · MIT            1391 +8/-8    17-30        9,275   $0.27 / $0.95   196.6K
7     26       Alibaba · Apache 2.0     1386 +9/-9    18-30        5,559   $0.39 / $2.34   262.1K
8     31       DeepSeek · MIT           1368 +8/-8    28-33        8,118   $0.26 / $0.38   163.8K
9     32       Alibaba · Apache 2.0     1362 +10/-10  30-34        4,272   $0.26 / $2.08   262.1K
10    33       Z.ai · MIT               1354 +9/-9    31-35        8,345   $0.39 / $1.90   204.8K
11    34       Alibaba · Apache 2.0     1344 +10/-10  32-40        3,958   $0.20 / $1.56   262.1K
12    36       -                        1337 +8/-8    34-40        6,737   $0.09 / $0.29   262.1K
13    38       Moonshot · Modified MIT  1329 +6/-6    34-40        15,230  $1.15 / $8      262.1K
14    40       DeepSeek · MIT           1327 +7/-7    34-40        9,603   $0.26 / $0.38   163.8K
15    42       MiniMax · Apache 2.0     1303 +9/-9    41-44        8,400   $0.26 / $1      196.6K
16    43       -                        1300 +14/-14  41-45        2,096   $0.09 / $0.29   262.1K
17    44       DeepSeek · MIT           1285 +11/-11  42-45        4,869   $0.27 / $0.41   163.8K
18    45       Alibaba · Apache 2.0     1280 +6/-6    43-45        15,380  $0.40 / $1.60   262.1K
19    47       Alibaba · Apache 2.0     1247 +16/-16  46-52        1,817   $0.16 / $1.30   262.1K
20    52       Mistral · Apache 2.0     1221 +20/-20  47-56        1,031   $0.50 / $1.50   N/A
21    55       Mistral · Modified MIT   1198 +17/-17  52-56        1,585   N/A             N/A

Remove Style Control Leaderboard Plots

Confidence Intervals on Model Strength (via Bootstrapping)

Battle Count for Each Combination of Models (without Ties)

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)

Fraction of Model A Wins for All Non-tied A vs. B Battles
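
The quantities named in the plot titles above can be illustrated with a small Python sketch. It is not the leaderboard's actual pipeline. The record format `(model_a, model_b, winner)`, the model names "A", "B", "C", and all function names are hypothetical. "Uniform sampling" is read here as an unweighted mean over opponents. The leaderboard's intervals are on model strength scores; this sketch bootstraps the simpler average win rate instead.

```python
import random
from collections import defaultdict

def pairwise_win_fraction(battles):
    """Fraction of wins for X over Y across all non-tied X vs. Y battles.

    battles: list of (model_a, model_b, winner), winner in {"a", "b", "tie"}.
    This record layout is an assumption made for illustration.
    """
    wins = defaultdict(int)
    games = defaultdict(int)
    for a, b, winner in battles:
        if winner == "tie":
            continue  # ties are excluded, as in the plot titles
        games[(a, b)] += 1
        games[(b, a)] += 1
        if winner == "a":
            wins[(a, b)] += 1
        else:
            wins[(b, a)] += 1
    return {pair: wins[pair] / n for pair, n in games.items()}

def average_win_rate(battles):
    """Unweighted mean of a model's win fraction over every opponent it faced."""
    per_model = defaultdict(list)
    for (model, _opponent), frac in pairwise_win_fraction(battles).items():
        per_model[model].append(frac)
    return {m: sum(v) / len(v) for m, v in per_model.items()}

def bootstrap_interval(battles, model, rounds=200, seed=0):
    """Rough 95% percentile interval on `model`'s average win rate.

    Resamples whole battles with replacement; resamples in which the
    model has no non-tied battle are skipped.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(rounds):
        resampled = rng.choices(battles, k=len(battles))
        rate = average_win_rate(resampled).get(model)
        if rate is not None:
            samples.append(rate)
    samples.sort()
    low = samples[int(0.025 * (len(samples) - 1))]
    high = samples[int(0.975 * (len(samples) - 1))]
    return low, high

# Toy battle log with hypothetical models, not leaderboard data.
battles = [
    ("A", "B", "a"), ("A", "B", "a"), ("A", "B", "b"),
    ("B", "C", "a"), ("A", "C", "tie"), ("C", "A", "b"),
]
fractions = pairwise_win_fraction(battles)
rates = average_win_rate(battles)
low, high = bootstrap_interval(battles, "A")
print(fractions[("A", "B")], rates["A"], (low, high))
```

With so few battles the bootstrap interval is very wide; the vote counts on the leaderboard are what make its reported intervals narrow.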