Title:
Speeding Up the Convergence of Value Iteration in Partially Observable Markov Decision Processes
---
Authors:
N. L. Zhang, W. Zhang
---
Latest submission year:
2011
---
Classification:
Primary category: Computer Science
Secondary category: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
---
Abstract:
Partially observable Markov decision processes (POMDPs) have recently become popular among many AI researchers because they serve as a natural model for planning under uncertainty. Value iteration is a well-known algorithm for finding optimal policies for POMDPs. It typically takes a large number of iterations to converge. This paper proposes a method for accelerating the convergence of value iteration. The method has been evaluated on an array of benchmark problems and was found to be very effective: It enabled value iteration to converge after only a few iterations on all the test problems.
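
For context, value iteration repeatedly applies a Bellman backup until the value function stops changing. The sketch below shows that loop for a small fully observable MDP; POMDP value iteration performs the analogous backup over belief states and sets of value vectors. This is a generic illustration only, not the paper's POMDP algorithm or its acceleration method, and all sizes and numbers are illustrative.

```python
# Minimal value iteration sketch for a small fully observable MDP.
# POMDP value iteration generalizes this Bellman backup to belief
# space; this does NOT implement the paper's acceleration method.
import numpy as np

n_states, n_actions = 3, 2   # illustrative problem sizes
gamma = 0.95                 # discount factor
epsilon = 1e-6               # convergence threshold

rng = np.random.default_rng(0)
# P[a, s, t]: probability of moving from state s to state t under action a.
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))
# R[s, a]: immediate reward for taking action a in state s.
R = rng.standard_normal((n_states, n_actions))

V = np.zeros(n_states)
for iteration in range(100_000):
    # Bellman backup: Q(s, a) = R(s, a) + gamma * sum_t P(t | s, a) V(t)
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < epsilon:
        V = V_new
        break
    V = V_new

policy = Q.argmax(axis=1)
print(f"converged after {iteration + 1} iterations; policy = {policy}")
```

Because the backup is a gamma-contraction, the error shrinks by a factor of gamma per iteration, so roughly log(1/epsilon) / log(1/gamma) backups are needed; with gamma close to 1 this is large, which is the slow convergence the paper sets out to reduce.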
---
PDF link:
https://arxiv.org/pdf/1106.0251