

Poster

Reward Model Ensembles Help Mitigate Overoptimization

Thomas Coste · Usman Anwar · Robert Kirk · David Krueger
2024 Poster

