Oral in Affinity Event: Tiny Papers Oral Session 3

Policy Optimization in RLHF: The Impact of Out-of-preference Data

Ziniu Li ⋅ Tian Xu ⋅ Yang Yu

Abstract
