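Abstract (translated):
Although generic optimal partially observable Markov decision process (POMDP) planning is intractable, there are important problems whose models are highly structured. Previous researchers have used this insight to construct more efficient algorithms for factored domains, and for domains with topological structure in the flat state dynamics model. In our work, motivated by findings from the education community relevant to automated tutoring, we consider problems that exhibit a form of topological structure in the factored dynamics model. Our Reachable Anytime Planner for Imprecisely-sensed Domains (RAPID) exploits this structure to efficiently compute a good initial envelope of reachable states under the optimal MDP policy, in time linear in the number of state variables. RAPID performs partially observable planning over the limited envelope of states and slowly expands the state space considered as time allows. RAPID performs well on a large tutoring-inspired problem simulation with 122 state variables, corresponding to a flat state space of over 10^30 states.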
---
English title:
RAPID: A Reachable Anytime Planner for Imprecisely-sensed Domains
---
Authors:
Emma Brunskill, Stuart Russell
---
Latest submission year:
2012
---
Classification:
Primary category: Computer Science
Secondary category: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
---
English abstract:
Despite the intractability of generic optimal partially observable Markov decision process planning, there exist important problems that have highly structured models. Previous researchers have used this insight to construct more efficient algorithms for factored domains, and for domains with topological structure in the flat state dynamics model. In our work, motivated by findings from the education community relevant to automated tutoring, we consider problems that exhibit a form of topological structure in the factored dynamics model. Our Reachable Anytime Planner for Imprecisely-sensed Domains (RAPID) leverages this structure to efficiently compute a good initial envelope of reachable states under the optimal MDP policy in time linear in the number of state variables. RAPID performs partially-observable planning over the limited envelope of states, and slowly expands the state space considered as time allows. RAPID performs well on a large tutoring-inspired problem simulation with 122 state variables, corresponding to a flat state space of over 10^30 states.
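To make the phrase "envelope of reachable states under a policy" concrete, the following is a minimal sketch on a tiny flat-state toy problem. It is not RAPID's algorithm: the abstract says RAPID exploits factored topological structure to build the envelope in time linear in the number of state variables, whereas this sketch uses a plain breadth-first search over explicit states. The two-skill tutoring model, the action names, the probabilities, and the function `reachable_envelope` are all hypothetical and chosen only for illustration.

```python
from collections import deque


def reachable_envelope(start, policy, transitions):
    """Return every state reachable from `start` when following `policy`.

    `policy` maps a state to an action. `transitions` maps a
    (state, action) pair to a dict of {successor_state: probability}.
    Both structures are illustrative assumptions, not the paper's model.
    """
    envelope = {start}
    frontier = deque([start])
    while frontier:
        state = frontier.popleft()
        action = policy[state]
        successors = transitions.get((state, action), {})
        for successor, prob in successors.items():
            # Only successors with nonzero probability count as reachable.
            if prob > 0 and successor not in envelope:
                envelope.add(successor)
                frontier.append(successor)
    return envelope


# Hypothetical toy model: a state is (skill_1_mastered, skill_2_mastered),
# and skill 2 is only taught once skill 1 has been mastered.
policy = {
    (0, 0): "teach_1",
    (1, 0): "teach_2",
    (1, 1): "stop",
}
transitions = {
    ((0, 0), "teach_1"): {(1, 0): 0.7, (0, 0): 0.3},
    ((1, 0), "teach_2"): {(1, 1): 0.7, (1, 0): 0.3},
}

envelope = reachable_envelope((0, 0), policy, transitions)
```

In this toy model the state `(0, 1)` never enters the envelope, because the policy never teaches skill 2 before skill 1 is mastered. A planner restricted to the envelope can therefore ignore it, which is the kind of saving the abstract describes at a much larger scale.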
---
PDF link:
https://arxiv.org/pdf/1203.3538