[Classic Reinforcement Learning Textbook] Reinforcement learning state of the art.pdf


2018-06-22

Part I Introductory Part
1 Reinforcement Learning and Markov Decision Processes

Part II Efficient Solution Frameworks
2 Batch Reinforcement Learning
3 Least-Squares Methods for Policy Iteration
4 Learning and Using Models
5 Transfer in Reinforcement Learning: A Framework and a Survey
6 Sample Complexity Bounds of Exploration
Part III Constructive-Representational Directions
7 Reinforcement Learning in Continuous State and Action Spaces
8 Solving Relational and First-Order Logical Markov Decision Processes: A Survey
9 Hierarchical Approaches
10 Evolutionary Computation for Reinforcement Learning
Part IV Probabilistic Models of Self and Others
11 Bayesian Reinforcement Learning
12 Partially Observable Markov Decision Processes
13 Predictively Defined Representations of State
14 Game Theory and Multi-agent Reinforcement Learning
15 Decentralized POMDPs
Part V Domains and Background
16 Psychological and Neuroscientific Connections with Reinforcement Learning
17 Reinforcement Learning in Games
18 Reinforcement Learning in Robotics: A Survey
Part VI Closing
19 Conclusions, Future Directions and Outlook

Index