

Poster

Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error Reduction via Surrogate Policy

Yuan Xie ⋅ Boyi Liu ⋅ Qiang Liu ⋅ Zhaoran Wang ⋅ Yuan Zhou ⋅ Jian Peng
2019 Poster

Abstract
