Hybrid Preference Optimization for Alignment: Provably Faster Convergence Rates by Combining Offline Preferences with Online Exploration

Avinandan Bose · Zhihan Xiong · Aadirupa Saha · Simon Du · Maryam Fazel

Abstract
