Region-Based Incremental Pruning for POMDPs


2022-04-06

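Abstract (translated):
We improve the incremental pruning algorithm for solving partially observable Markov decision processes (POMDPs). Our technique targets the cross-sum step of the dynamic programming (DP) update, a key source of complexity in POMDP algorithms. When pruning the cross-sums, our algorithm does not reason about the whole belief space; instead, it divides the belief space into smaller regions and prunes independently within each region. We evaluate the benefits of the new technique both analytically and experimentally, and show that it yields very significant performance gains. These results help POMDP algorithms scale to domains that the best existing techniques cannot handle.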
---
English title:
"Region-Based Incremental Pruning for POMDPs"
---
Authors:
Zhengzhu Feng, Shlomo Zilberstein
---
Year of latest submission:
2012
---
Classification:

Primary category: Computer Science
Secondary category: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
---
English abstract:
We present a major improvement to the incremental pruning algorithm for solving partially observable Markov decision processes. Our technique targets the cross-sum step of the dynamic programming (DP) update, a key source of complexity in POMDP algorithms. Instead of reasoning about the whole belief space when pruning the cross-sums, our algorithm divides the belief space into smaller regions and performs independent pruning in each region. We evaluate the benefits of the new technique both analytically and experimentally, and show that it produces very significant performance gains. The results contribute to the scalability of POMDP algorithms to domains that cannot be handled by the best existing techniques.
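
The idea of pruning a cross-sum region by region can be illustrated with a small sketch. This is not the authors' implementation: it assumes a two-state POMDP, represents value-function vectors as plain tuples, and replaces exact (linear-programming) pruning with a check on a sampled grid of beliefs. All function names here are hypothetical. It relies only on the fact that a sum `a + b` is maximal at a belief exactly when `a` is maximal in its own set and `b` is maximal in its own set at that belief.

```python
from itertools import product


def cross_sum(A, B):
    """Return all pairwise sums a + b of vectors from A and B."""
    return [tuple(x + y for x, y in zip(a, b)) for a, b in product(A, B)]


def value(vec, belief):
    """Expected value of a vector at a belief (dot product)."""
    return sum(v * p for v, p in zip(vec, belief))


def best(vectors, belief):
    """Vector with the highest value at this belief (first one wins ties)."""
    return max(vectors, key=lambda v: value(v, belief))


def prune_on_beliefs(vectors, beliefs):
    """Approximate pruning: keep vectors that are best at some sampled belief."""
    kept = []
    for b in beliefs:
        winner = best(vectors, b)
        if winner not in kept:
            kept.append(winner)
    return kept


def region_based_prune(A, B, beliefs):
    """Prune the cross-sum of A and B one region at a time.

    The sampled beliefs where `a` is best in A stand in for a's region;
    the candidates {a} + B are pruned only against those beliefs.
    """
    kept = []
    for a in A:
        region = [b for b in beliefs if best(A, b) == a]
        for v in prune_on_beliefs(cross_sum([a], B), region):
            if v not in kept:
                kept.append(v)
    return kept


# Toy input (assumed for illustration): two states, beliefs on a grid.
A = [(1.0, 0.0), (0.0, 1.0)]
B = [(1.0, 0.0), (0.0, 1.0), (0.4, 0.4)]
beliefs = [(i / 10, 1 - i / 10) for i in range(11)]

whole_space = prune_on_beliefs(cross_sum(A, B), beliefs)
by_region = region_based_prune(A, B, beliefs)
print(sorted(whole_space), sorted(by_region))
```

On this toy input both routes keep the same two vectors out of the six in the full cross-sum, but the region-based route only ever compares three candidates at a time, each against a subset of the beliefs.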
---
PDF link:
https://arxiv.org/pdf/1207.4116
