Poster

Limits to scalable evaluation at the frontier: LLM as judge won’t beat twice the data

Florian Eddie Dorner · Vivian Nastl · Moritz Hardt
2025 Poster

Abstract
