ICML-Is Pessimism Provably Efficient for Offline RL

2025-08-10

Is Pessimism Provably Efficient for Offline RL?

Ying Jin *1, Zhuoran Yang *2, Zhaoran Wang *3

Abstract

We study offline reinforcement learning (RL), which aims to learn an optimal policy based on a dataset collected a priori. Due to the lack of ...

... Vinyals et al., 2017) relies on two ingredients: (i) expressive function approximators, e.g., deep neural networks (LeCun et al., 2015), which approximate policies and values, and ...

Attachments

ICML-Is Pessimism Provably Efficient for Offline RL .pdf

Size: 887.78 KB
