ICML-Phasic Policy Gradient

2025-07-27

Phasic Policy Gradient

Karl Cobbe¹, Jacob Hilton¹, Oleg Klimov¹, John Schulman¹

Abstract

We introduce Phasic Policy Gradient (PPG), a reinforcement learning framework which modifies traditional on-policy actor-critic methods by sepa- ...

... can be used to better optimize the other. However, there are also disadvantages to sharing network parameters. First, it is not clear how to appropriately balance the competing objectives of t ...

Attachments

ICML-Phasic Policy Gradient.pdf

Size: 1.21 MB
