A Policy Gradient Algorithm for Learning to Learn
in Multiagent Reinforcement Learning
Dong-Ki Kim 1 2 Miao Liu 2 3 Matthew Riemer 2 3 Chuangchuang Sun 1 2 Marwa Abdulhai 1 2
Golnaz Habibi 1 2 Sebastian Lopez-Cot 1 2 Gerald Tesauro 2 3 Jonathan P. How 1 2
Abstract learning agents because their changing behaviors jointly af-
A fundamental challenge in multiagent reinforce- fect the environment’s transition and reward function. This
...
附件列表