Poster

RM-R1: Reward Modeling as Reasoning

Xiusi Chen · Gaotang Li · Ziqi Wang · Bowen Jin · Cheng Qian · Yu Wang · Hongru WANG · Yu Zhang · Denghui Zhang · Tong Zhang · Hanghang Tong · Heng Ji

Abstract
