The PhD students who became the judges of the AI industry

Artificial intelligence models are multiplying fast, and competition is stiff. With so many players crowding the space, which one will be the best, and who decides that? Arena, formerly LM Arena, has emerged as the de facto public leaderboard for frontier LLMs, influencing funding, launches, and PR cycles. In just seven months, the startup went from a UC Berkeley PhD research project to being valued at $1.7 billion.

On this episode of TechCrunch’s Equity podcast, Rebecca Bellan catches up with Arena co-founders Anastasios Angelopoulos and Wei-Lin Chiang to determine how a team like theirs can build a neutral benchmark when the companies they’re ranking are also their backers.

Listen to the full episode to hear:

How Arena actually works, and why its founders say you can’t game it the way you might a static benchmark.

What “structural neutrality” actually means, and whether taking money from OpenAI, Google, and Anthropic is a conflict of interest.

How Arena is moving beyond chat to benchmark agents, coding, and real-world tasks with a new enterprise product.

Why Claude is currently winning the expert leaderboard for legal and medical use cases.

Arena’s bet on what comes after LLMs, and why agents are next on the leaderboard.

Subscribe to Equity on YouTube, Apple Podcasts, Overcast, Spotify and all the casts. You also can follow Equity on X and Threads, at @EquityPod.
