Poster

RLBFF: Binary Flexible Feedback to bridge between Human Feedback & Verifiable Rewards

Zhilin Wang · Jiaqi Zeng · Olivier Delalleau · Ellie Evans · Daniel Egert · Hoo-Chang Shin · Felipe Soares · Yi Dong · Oleksii Kuchaiev

Abstract
