Poster

On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes

Rishabh Agarwal ⋅ Nino Vieillard ⋅ Yongchao Zhou ⋅ Piotr Stanczyk ⋅ Sabela Ramos Garea ⋅ Matthieu Geist ⋅ Olivier Bachem
2024 Poster
