ICML-Provably Correct Optimization and Exploration with Non-linear Policies

2025-07-27

Provably Correct Optimization and Exploration with Non-linear Policies

Fei Feng¹, Wotao Yin¹, Alekh Agarwal², Lin Yang³

Abstract

Policy optimization methods remain a powerful workhorse in empirical Reinforcement Learning (RL), with a focus on neur ...

... rer & Geist, 2014; Geist et al., 2019; Abbasi-Yadkori et al., 2019; Agarwal et al., 2020c; Bhandari & Russo, 2019) when the agent has access to a distribution over states which is ...

Attachments

ICML-Provably Correct Optimization and Exploration with Non-linear Policies.pdf

Size: 929.05 KB

