Arena Team

Created by researchers from UC Berkeley’s SkyLab, Arena (formerly LMArena) is an open platform where everyone can easily access, explore and interact with the world’s leading AI models.

Agent Arena: Causal Evaluation of Agents in the Real World

Agent Arena: Causal Evaluation of Agents in the Real World

Agents are increasingly doing real work. The resulting task distribution has greatly expanded. We desire an agent evaluation that scales along with usage and capability.

Empowering Users to Get More Done With Agent Mode

Empowering Users to Get More Done With Agent Mode

The future of AI is not single-modality chat; it is in powerful agentic capabilities. Today, we are excited to introduce Agent Mode, designed to help everyone from everyday users looking to get more done to entrepreneurs looking to maximize agentic efficacy across complex use cases. While traditional chat requires

New Categories for Web Development in Code Arena

New Categories for Web Development in Code Arena

AI coding models are increasingly used to build web apps, but aggregated leaderboards obscure key performance differences. After analyzing 250k+ Code Arena prompts, we identified major front-end task categories and built new leaderboard views to compare model strengths and weaknesses.

Multimodal Max

Multimodal Max

Text Arena Score vs. Time to First Token Note: displayed models were the top public models when the router was created. Search Arena Score vs. Time to First Token Note: displayed models were the top public models when the router was created. Vision Arena Score vs. Time to First Token

Arena Leaderboard Dataset

Arena Leaderboard Dataset

For almost three years, Arena has been publishing leaderboards covering frontier AI capabilities across 10 arenas, dozens of categories, and hundreds of models, and today we're releasing the entire history of those leaderboards as a public-access dataset. We've watched how our leaderboards are cited, referenced,

March 2026: Arena Updates across Product, Leaderboard Rankings & Research

March 2026: Arena Updates across Product, Leaderboard Rankings & Research

March 2026 brought major updates to the Arena leaderboard, including new rankings across document, video, text, and code models. In this monthly roundup, we break down the best AI models, latest LLM benchmarks, and key trends shaping AI evaluation. Whether you're a daily voter or just checking in

Supporting Independent Research in AI Evaluation

Supporting Independent Research in AI Evaluation

Arena’s Academic Partnerships Program provides funding and support for independent research advancing the scientific foundations of AI evaluation.