Poster

RRM: Robust Reward Model Training Mitigates Reward Hacking

Tianqi Liu ⋅ Wei Xiong ⋅ Jie Ren ⋅ Lichang Chen ⋅ Junru Wu ⋅ Rishabh Joshi ⋅ Yang Gao ⋅ Jiaming Shen ⋅ Zhen Qin ⋅ Tianhe Yu ⋅ Daniel Sohn ⋅ Anastasia Makarova ⋅ Jeremiah Zhe Liu ⋅ Yuan Liu ⋅ Bilal Piot ⋅ Abe Ittycheriah ⋅ Aviral Kumar ⋅ Mohammad Saleh
2025 Poster

Abstract

