Title:
Exploiting First-Order Regression in Inductive Policy Selection
---
Authors:
Charles Gretton, Sylvie Thiebaux
---
Latest submission year:
2012
---
Classification:
Primary category: Computer Science
Secondary category: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
---
Abstract:
We consider the problem of computing optimal generalised policies for relational Markov decision processes. We describe an approach combining some of the benefits of purely inductive techniques with those of symbolic dynamic programming methods. The latter reason about the optimal value function using first-order decision theoretic regression and formula rewriting, while the former, when provided with a suitable hypotheses language, are capable of generalising value functions or policies for small instances. Our idea is to use reasoning and in particular classical first-order regression to automatically generate a hypotheses language dedicated to the domain at hand, which is then used as input by an inductive solver. This approach avoids the more complex reasoning of symbolic dynamic programming while focusing the inductive solver's attention on concepts that are specifically relevant to the optimal value function for the domain considered.
---
PDF link:
https://arxiv.org/pdf/1207.4107