Poster

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Yongliang Wu · Yizhou Zhou · Ziheng Zhou · Yingzhe Peng · Xinyu Ye · Xinting Hu · Wenbo Zhu · Lu Qi · Ming-Hsuan Yang · Xu Yang

Abstract
