

Poster

Why is Your Language Model a Poor Implicit Reward Model?

Noam Razin · Yong Lin · Jiarui Yao · Sanjeev Arora

Abstract
