The Emperor's New Clothes in Benchmarking? A Rigorous Examination of Mitigation Strategies for LLM Benchmark Data Contamination

Yifan Sun · Han Wang · Dongbai Li · Gang Wang · Huan Zhang

