Provably Efficient Q-learning with Function Approximation via Distribution Shift Error Checking Oracle
Simon S. Du, Institute for Advanced Study, ssdu@ias.edu
Yuping Luo, Princeton University, yupingl@cs.princeton.edu
Ruosong Wang, Carnegie Mellon University, ruosongw@andrew.cmu.edu
Hanrui Zhang, Duke University, hrzhang@cs.duke.edu
Abstract
...