2022-03-26
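Abstract (translation):
Markov decision processes (MDPs) are widely used to model decision-making problems in stochastic environments. However, specifying the reward function of an MDP precisely is often very difficult. Recent work has focused on computing optimal policies under the minimax regret criterion, so as to obtain a policy that is robust when the reward function is uncertain. One of the core tasks in computing a minimax regret policy is to obtain the set of all policies that are optimal for some candidate reward function. In this paper, we propose an efficient algorithm that exploits the geometric properties of the reward function associated with the policies. We also present an approximate version of the method for further speedup. Our experiments show that the algorithm improves performance by orders of magnitude.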
---
English title:
A Geometric Traversal Algorithm for Reward-Uncertain MDPs
---
Authors:
Eunsoo Oh, Kee-Eung Kim
---
Year of latest submission:
2012
---
Classification:

Primary category: Computer Science
Secondary category: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
---
English abstract:
  Markov decision processes (MDPs) are widely used in modeling decision making problems in stochastic environments. However, precise specification of the reward functions in MDPs is often very difficult. Recent approaches have focused on computing an optimal policy based on the minimax regret criterion for obtaining a robust policy under uncertainty in the reward function. One of the core tasks in computing the minimax regret policy is to obtain the set of all policies that can be optimal for some candidate reward function. In this paper, we propose an efficient algorithm that exploits the geometric properties of the reward function associated with the policies. We also present an approximate version of the method for further speed up. We experimentally demonstrate that our algorithm improves the performance by orders of magnitude.
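
To make the minimax regret criterion concrete, here is a minimal brute-force sketch in Python. It is not the geometric traversal algorithm from the paper. It only illustrates the two quantities the abstract mentions: the set of policies that are optimal for some candidate reward function, and the policy whose worst-case regret is smallest. Several simplifications are assumed: the reward uncertainty is a finite list of candidate reward tables rather than a continuous set, only deterministic stationary policies are enumerated, and the names `policy_value`, `minimax_regret`, and the two-state MDP are made up for illustration.

```python
from itertools import product


def policy_value(P, R, gamma, start, policy, sweeps=200):
    """Discounted return of a deterministic policy via iterative evaluation.

    P[s][a][t] is the probability of moving from state s to t under action a,
    R[s][a] is the immediate reward, start[s] is the initial state distribution.
    """
    n = len(start)
    V = [0.0] * n
    for _ in range(sweeps):
        V = [
            R[s][policy[s]]
            + gamma * sum(P[s][policy[s]][t] * V[t] for t in range(n))
            for s in range(n)
        ]
    return sum(start[s] * V[s] for s in range(n))


def minimax_regret(P, rewards, gamma, start):
    """Brute force over deterministic policies and a finite reward list."""
    n, m = len(P), len(P[0])
    policies = list(product(range(m), repeat=n))
    ks = range(len(rewards))
    # value[i][k]: return of policy i under candidate reward k
    value = [
        [policy_value(P, R, gamma, start, pi) for R in rewards]
        for pi in policies
    ]
    best = [max(row[k] for row in value) for k in ks]
    # Policies that are optimal for at least one candidate reward
    optimal_somewhere = [
        pi
        for pi, row in zip(policies, value)
        if any(row[k] >= best[k] - 1e-9 for k in ks)
    ]
    # Worst-case regret of each policy, then the policy minimising it
    max_regret = [max(best[k] - row[k] for k in ks) for row in value]
    i_star = min(range(len(policies)), key=max_regret.__getitem__)
    return policies[i_star], max_regret[i_star], optimal_somewhere


# Hypothetical two-state MDP: action 0 stays, action 1 moves to the other state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # from state 0
    [[0.0, 1.0], [1.0, 0.0]],  # from state 1
]
R_A = [[1.0, 1.0], [0.0, 0.0]]  # candidate A: reward for being in state 0
R_B = [[0.0, 0.0], [1.0, 1.0]]  # candidate B: reward for being in state 1

pi, regret, candidates = minimax_regret(P, [R_A, R_B], 0.5, [1.0, 0.0])
print(pi, round(regret, 3))  # (1, 1) 0.667
print(candidates)            # [(0, 0), (0, 1), (1, 0)]
```

In this toy instance the policy that alternates between the two states has the smallest worst-case regret (2/3) even though it is not optimal for either candidate reward on its own. The enumeration grows exponentially with the number of states, so it is only useful for illustration at this scale.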
---
PDF link:
https://arxiv.org/pdf/1202.3754