Poster

Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs

Naman Agarwal · Syomantak Chaudhuri · Prateek Jain · Dheeraj Nagaraj · Praneeth Netrapalli
2022 Poster

Abstract

