Vision Arena: Entity Recognition

View overall rankings across multimodal AI models capable of reasoning over visual inputs.

Jun 5, 2026

4,666 votes

35 models

Rank	Rank Spread	Model	Organization · License	Score	Votes	Price (input / output)	Context
1	1-13	gemini-3-pro	Google · Proprietary	1302±35	243	$2 / $12	1M
2	1-13	gemini-3-flash	Google · Proprietary	1299±37	224	$0.50 / $3	1M
3	1-16	gemini-3.1-pro-preview	Google · Proprietary	1288±39	207	$2 / $12	1M
4	1-22	gemini-3-flash (thinking-minimal)	Google · Proprietary	1269±36	209	$0.50 / $3	1M
5	1-20	gemini-2.5-pro	Google · Proprietary	1257±21	873	$1.25 / $10	1M
6	1-23	gpt-5-high	OpenAI · Proprietary	1256±32	434	$1.25 / $10	400K
7	1-31	kimi-k2.5-thinking	Moonshot · Modified MIT	1243±43	162	$0.60 / $3	N/A
8	1-31	gemini-3.1-flash-lite-preview	Google · Proprietary	1242±48	132	$0.25 / $1.50	1M
9	1-31	gpt-5.1-high	OpenAI · Proprietary	1240±49	108	$1.25 / $10	400K
10	1-31	qwen3.5-397b-a17b	Alibaba · Apache 2.0	1240±51	112	$0.39 / $2.34	262.1K
11	1-31	grok-4-0709	xAI · Proprietary	1237±34	341	$3 / $15	256K
12	3-31	o3-2025-04-16	OpenAI · Proprietary	1231±29	505	$2 / $8	200K
13	1-35	grok-4.20-beta-0309-reasoning	xAI · Proprietary	1227±67	72	$2 / $6	2M
14	3-31	chatgpt-4o-latest-20250326	OpenAI · Proprietary	1226±30	294	$5 / $15	128K
15	4-31	gemini-2.5-flash	Google · Proprietary	1217±24	587	$0.30 / $2.50	1M
16	4-31	o4-mini-2025-04-16	OpenAI · Proprietary	1213±32	398	$1.10 / $4.40	200K
17	1-35	gpt-5.2-chat-latest-20260210	OpenAI · Proprietary	1211±59	99	$1.75 / $14	128K
18	4-32	gpt-5-mini-high	OpenAI · Proprietary	1204±40	283	$0.25 / $2	400K
19	6-32	gpt-4.1-mini-2025-04-14	OpenAI · Proprietary	1200±32	362	$0.40 / $1.60	1M
20	3-35	grok-4-1-fast-reasoning	xAI · Proprietary	1197±61	79	$0.20 / $0.50	2M
21	5-35	gpt-5.1	OpenAI · Proprietary	1192±43	141	$1.25 / $10	400K
22	5-35	qwen3-vl-235b-a22b-instruct	Alibaba · Apache 2.0	1191±43	148	$0.20 / $0.88	262.1K
23	4-35	gpt-5.2-high	OpenAI · Proprietary	1189±49	123	$1.75 / $14	400K
24	7-35	gpt-5-chat	OpenAI · Proprietary	1188±31	383	$1.25 / $10	128K
25	7-35	gemini-2.5-flash-lite-preview-06-17-thinking	Google · Proprietary	1188±31	422	$0.10 / $0.40	1M
26	7-35	gpt-4.1-2025-04-14	OpenAI · Proprietary	1183±30	404	$2 / $8	1M
27	7-35	gpt-5.2	OpenAI · Proprietary	1176±44	138	$1.75 / $14	400K
28	7-35	gemma-4-31b	Google · Apache 2.0	1175±46	151	$0.14 / $0.40	262.1K
29	7-35	gemma-3-27b-it	Google · Gemma	1175±32	326	$0.08 / $0.16	131.1K
30	7-35	gemma-4-26b-a4b	Google · Apache 2.0	1152±59	92	N/A	N/A
31	7-35	qwen3.5-27b	Alibaba · Apache 2.0	1146±64	80	$0.20 / $1.56	262.1K
32	16-35	mistral-small-2506	Mistral · Apache 2.0	1135±40	212	$0.10 / $0.30	32K
33	18-35	mistral-medium-2508	Mistral · Proprietary	1132±30	443	$2.70 / $8.10	32K
34	18-35	mistral-small-3.1-24b-instruct-2503	Mistral · Apache 2.0	1126±38	248	$0.10 / $0.30	32K
35	18-35	mistral-medium-2505	Mistral · Proprietary	1123±39	201	$0.40 / $2	131.1K
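
The page does not say how the Rank Spread column is computed. One plausible reading is an interval-overlap rule: treat each score as the symmetric interval score ± half-width, take the best possible rank as one plus the number of models whose lower bound lies strictly above this model's upper bound, and the worst possible rank as one plus the number of models whose upper bound lies strictly above this model's lower bound. The sketch below applies that assumed rule to the score and ± values from the table above. The names `ROWS`, `rank_spread`, and `SPREADS` are illustrative, not part of the leaderboard.

```python
# (score, half-width) pairs copied from the leaderboard table, in rank order.
ROWS = [
    (1302, 35), (1299, 37), (1288, 39), (1269, 36), (1257, 21),
    (1256, 32), (1243, 43), (1242, 48), (1240, 49), (1240, 51),
    (1237, 34), (1231, 29), (1227, 67), (1226, 30), (1217, 24),
    (1213, 32), (1211, 59), (1204, 40), (1200, 32), (1197, 61),
    (1192, 43), (1191, 43), (1189, 49), (1188, 31), (1188, 31),
    (1183, 30), (1176, 44), (1175, 46), (1175, 32), (1152, 59),
    (1146, 64), (1135, 40), (1132, 30), (1126, 38), (1123, 39),
]


def rank_spread(rows):
    """Return a (best, worst) rank per row.

    Assumption: intervals are symmetric (score - h, score + h) and ranks are
    bounded by strict interval comparisons. This is a guess at the rule, not
    a documented formula.
    """
    intervals = [(score - h, score + h) for score, h in rows]
    spreads = []
    for i, (lo, hi) in enumerate(intervals):
        # Best case: only models whose whole interval sits above ours outrank us.
        best = 1 + sum(
            1 for j, (lo_j, _) in enumerate(intervals) if j != i and lo_j > hi
        )
        # Worst case: any model whose interval reaches above our lower bound outranks us.
        worst = 1 + sum(
            1 for j, (_, hi_j) in enumerate(intervals) if j != i and hi_j > lo
        )
        spreads.append((best, worst))
    return spreads


SPREADS = rank_spread(ROWS)
print(SPREADS[0], SPREADS[4], SPREADS[31])
```

Under this rule, rank 1 gets a spread of 1 to 13, rank 5 gets 1 to 20, and rank 32 gets 16 to 35, matching the table for those rows. A wide spread such as 1 to 35 therefore signals a wide interval (few votes), not a contested score.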

Default Leaderboard Plots

Battle Count for Each Combination of Models (without Ties)

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)

Confidence Intervals on Model Strength (via Bootstrapping)

Fraction of Model A Wins for All Non-tied A vs. B Battles
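
The quantities named in the plot titles above can be illustrated on a toy battle log. The sketch below is a minimal example under stated assumptions: the battles and model names (`m1`, `m2`, `m3`) are made up, ties are already removed, and the bootstrap resamples the average win rate as a simple stand-in for the rating-based model strength that the leaderboard plots. All function names are hypothetical.

```python
import random
from collections import Counter

# Hypothetical battle log: (model_a, model_b, winner), ties already removed.
BATTLES = [
    ("m1", "m2", "m1"), ("m1", "m2", "m1"), ("m1", "m2", "m2"),
    ("m1", "m3", "m1"), ("m1", "m3", "m1"),
    ("m2", "m3", "m2"), ("m2", "m3", "m3"),
]


def pair_stats(battles):
    """Count battles per ordered pair, and wins of the first model over the second."""
    counts, wins = Counter(), Counter()
    for a, b, winner in battles:
        counts[(a, b)] += 1
        counts[(b, a)] += 1
        loser = b if winner == a else a
        wins[(winner, loser)] += 1
    return counts, wins


def win_fraction(counts, wins, a, b):
    """Fraction of non-tied a-vs-b battles won by a."""
    return wins[(a, b)] / counts[(a, b)]


def average_win_rate(battles, model):
    """Mean win fraction over opponents, each opponent weighted equally."""
    counts, wins = pair_stats(battles)
    opponents = sorted({b for (a, b) in counts if a == model})
    if not opponents:
        return None  # the model does not appear in this sample
    return sum(win_fraction(counts, wins, model, o) for o in opponents) / len(opponents)


def bootstrap_interval(battles, model, rounds=1000, seed=0):
    """Percentile interval (2.5% to 97.5%) from resampling battles with replacement."""
    rng = random.Random(seed)
    samples = []
    for _ in range(rounds):
        resample = rng.choices(battles, k=len(battles))
        rate = average_win_rate(resample, model)
        if rate is not None:
            samples.append(rate)
    samples.sort()
    lo = samples[int(0.025 * len(samples))]
    hi = samples[min(len(samples) - 1, int(0.975 * len(samples)))]
    return lo, hi


COUNTS, WINS = pair_stats(BATTLES)
print(COUNTS[("m1", "m2")])                      # battles between m1 and m2
print(win_fraction(COUNTS, WINS, "m1", "m2"))    # fraction won by m1
print(average_win_rate(BATTLES, "m1"))           # uniform average over opponents
print(bootstrap_interval(BATTLES, "m1"))         # resampled interval
```

Averaging per opponent, not per battle, is what "uniform sampling" suggests: a model that happened to face one weak opponent many times does not get an inflated rate.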