ICML-PID Accelerated Value Iteration Algorithm

2025-07-27

PID Accelerated Value Iteration Algorithm

Amir-massoud Farahmand (1, 2), Mohammad Ghavamzadeh (3)

Abstract

The convergence rate of Value Iteration (VI), a fundamental procedure in dynamic programming and reinforcement learning, for solving MDPs can ...

... approximation of the value or action-value functions, i.e., V_{k+1} ← T^π V_k or Q_{k+1} ← T^* Q_k. For discounted MDPs, the Bellman operator is a contraction, and standard fixed-point ite ...

