Computer Vision: Worst-Case Regret Bounds for Exploration via Randomized Value Functions

2023Hua

Bookmarked 2025-08-11

Worst-Case Regret Bounds for Exploration via Randomized Value Functions

Daniel Russo
Columbia University
djr2174@gsb.columbia.edu

Abstract

This paper studies a recent proposal to use randomized value functions to drive exploration in reinforcement learning. These randomized value functions are generated by injecting random noise into the training data, making the approach compatible wit ...

