Poster

Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF

Tengyang Xie ⋅ Dylan Foster ⋅ Akshay Krishnamurthy ⋅ Corby Rosset ⋅ Ahmed H Awadallah ⋅ Alexander Rakhlin
2025 Poster

Abstract

