Poster

ICPO: Provable and Practical In-Context Policy Optimization for Test-Time Scaling

Tianrun Yu · Yuxiao Yang · Zhaoyang Wang · Kaixiang Zhao · Porter Jenkins · Xuchao Zhang · Chetan Bansal · Huaxiu Yao · Weitong Zhang

Abstract
