MDPs的分块线性规划逼近

330

收藏 2022-03-30

摘要翻译：
近似线性规划(ALP)是求解大型因素马尔可夫决策过程的一种有效方法。该方法的主要思想是用一组基函数逼近最优值函数，并用线性规划(LP)优化它们的权值。本文提出了一种新的ALP近似。与标准的ALP公式相比，我们将约束空间分解为一组低维空间。这种结构可以有效地求解新的线性规划问题。特别地，LP的约束可以以紧凑的形式满足，而不依赖于ALP约束的树宽。我们研究了所提出的方法的实践和理论方面。此外，我们还证明了它在超过2100个态的MDP上的放大潜力。
---
英文标题：
《Partitioned Linear Programming Approximations for MDPs》
---
作者：
Branislav Kveton, Milos Hauskrecht
---
最新提交年份：
2012
---
分类信息：

一级分类：Computer Science 计算机科学
二级分类：Artificial Intelligence 人工智能
分类描述：Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
涵盖了人工智能的所有领域，除了视觉、机器人、机器学习、多智能体系统以及计算和语言（自然语言处理），这些领域有独立的学科领域。特别地，包括专家系统，定理证明（尽管这可能与计算机科学中的逻辑重叠），知识表示，规划，和人工智能中的不确定性。大致包括ACM学科类I.2.0、I.2.1、I.2.3、I.2.4、I.2.8和I.2.11中的材料。
--

---
英文摘要：
Approximate linear programming (ALP) is an efficient approach to solving large factored Markov decision processes (MDPs). The main idea of the method is to approximate the optimal value function by a set of basis functions and optimize their weights by linear programming (LP). This paper proposes a new ALP approximation. Comparing to the standard ALP formulation, we decompose the constraint space into a set of low-dimensional spaces. This structure allows for solving the new LP efficiently. In particular, the constraints of the LP can be satisfied in a compact form without an exponential dependence on the treewidth of ALP constraints. We study both practical and theoretical aspects of the proposed approach. Moreover, we demonstrate its scale-up potential on an MDP with more than 2^100 states.
---
PDF链接：
https://arxiv.org/pdf/1206.3266

扫码加我拉你入群

请注明：姓名-公司-职位

以便审核进群资格，未注明则拒绝

扫码加我 拉你入群

分享

扫码加好友，拉您进群

扫码加我拉你入群