Poster

Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models

Ashutosh Baheti ⋅ Ximing Lu ⋅ Faeze Brahman ⋅ Ronan Le Bras ⋅ Maarten Sap ⋅ Mark Riedl
2024 Poster

Abstract
