Abstract:
Reinforcement Learning (RL) is a method for learning decision-making tasks that could enable robots to learn and adapt to their situation on-line. For an RL algorithm to be practical for robotic control tasks, it must learn in very few actions, while continually taking those actions in real-time. Existing model-based RL methods learn in relatively few actions, but typically take too much time between each action for practical on-line learning. In this paper, we present a novel parallel architecture for model-based RL that runs in real-time by 1) taking advantage of sample-based approximate planning methods and 2) parallelizing the acting, model learning, and planning processes such that the acting process is sufficiently fast for typical robot control cycles. We demonstrate that algorithms using this architecture perform nearly as well as methods using the typical sequential architecture when both are given unlimited time, and greatly out-perform these methods on tasks that require real-time actions such as controlling an autonomous vehicle.
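The paper itself ships no code, but the architecture the abstract describes is concrete enough to sketch: an acting process that always responds within the robot's control cycle, a model-learning process that absorbs incoming experience, and an anytime planner that refines the value estimates from sampled model transitions, all running in parallel. The Python sketch below is a toy illustration under those assumptions; every name in it (ChainEnv, SampleModel, acting_loop, and the Dyna-style planning loop standing in for the sample-based approximate planner the abstract names) is hypothetical, not the authors' implementation.

```python
import queue
import random
import threading
import time

ACTIONS = [0, 1]      # toy action set; a real robot has a richer command space
CONTROL_CYCLE = 0.01  # the acting loop must return within this period (seconds)

class ChainEnv:
    """Tiny chain MDP standing in for a robot: action 1 moves right, 0 resets."""
    def __init__(self, length=10):
        self.length, self.state = length, 0

    def step(self, a):
        if a == 1 and self.state < self.length - 1:
            self.state += 1
        elif a == 0:
            self.state = 0
        return self.state, (1.0 if self.state == self.length - 1 else 0.0)

class SampleModel:
    """Learned model: stores observed (s', r) outcomes per (s, a) and can
    sample them, which is all a sample-based planner needs."""
    def __init__(self):
        self._outcomes, self._lock = {}, threading.Lock()

    def update(self, s, a, s2, r):
        with self._lock:
            self._outcomes.setdefault((s, a), []).append((s2, r))

    def sample(self, s, a):  # caller guarantees (s, a) has been observed
        with self._lock:
            return random.choice(self._outcomes[(s, a)])

    def known_pairs(self):
        with self._lock:
            return list(self._outcomes)

# Shared value table: planner writes, actor reads. CPython's GIL keeps single
# dict operations atomic; a real implementation would guard shared state explicitly.
Q = {}

def greedy(s, eps=0.1):
    """Epsilon-greedy action from the current value estimates."""
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q.get((s, a), 0.0))

def acting_loop(env, experience, stop):
    """Fast process: pick an action from the current Q, act, log the sample.
    It never blocks on model learning or planning, so it meets the cycle."""
    s = env.state
    while not stop.is_set():
        a = greedy(s)
        s2, r = env.step(a)
        experience.put((s, a, s2, r))
        s = s2
        time.sleep(CONTROL_CYCLE)  # stand-in for the robot's control period

def model_learning_loop(model, experience, stop):
    """Fold experience into the learned model as it arrives."""
    while not stop.is_set():
        try:
            model.update(*experience.get(timeout=0.1))
        except queue.Empty:
            pass

def planning_loop(model, stop, gamma=0.95, alpha=0.5):
    """Anytime planning: repeatedly sample imagined transitions from the
    model and refine Q (a Dyna-style stand-in for sample-based planning)."""
    while not stop.is_set():
        pairs = model.known_pairs()
        if not pairs:
            time.sleep(CONTROL_CYCLE)
            continue
        s, a = random.choice(pairs)
        s2, r = model.sample(s, a)
        target = r + gamma * max(Q.get((s2, b), 0.0) for b in ACTIONS)
        Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (target - Q.get((s, a), 0.0))

if __name__ == "__main__":
    env, model, experience = ChainEnv(), SampleModel(), queue.Queue()
    stop = threading.Event()
    loops = [(acting_loop, (env, experience, stop)),
             (model_learning_loop, (model, experience, stop)),
             (planning_loop, (model, stop))]
    threads = [threading.Thread(target=f, args=a, daemon=True) for f, a in loops]
    for t in threads:
        t.start()
    time.sleep(3.0)  # let the three processes run "in real time" for a while
    stop.set()
    for t in threads:
        t.join(timeout=1.0)
    print({k: round(v, 2) for k, v in sorted(Q.items())})
```

Even in this toy, the point of the decomposition survives: the acting thread's per-step cost is a dictionary lookup plus one environment step, independent of how long model learning or planning takes, which is what makes real-time control feasible.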
---
Title:
A Real-Time Model-Based Reinforcement Learning Architecture for Robot Control
---
Authors:
Todd Hester, Michael Quinlan, Peter Stone
---
Latest submission year:
2011
---
Classification:
Primary category: Computer Science
Secondary category: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
--
Primary category: Computer Science
Secondary category: Robotics
Category description: Roughly includes material in ACM Subject Class I.2.9.
--
Primary category: Computer Science
Secondary category: Software Engineering
Category description: Covers design tools, software metrics, testing and debugging, programming environments, etc. Roughly includes material in all of ACM Subject Classes D.2, except that D.2.4 (program verification) should probably have Logics in Computer Science as the primary subject area.
--
---
PDF link:
https://arxiv.org/pdf/1105.1749