Learning to Weight Imperfect Demonstrations
Yunke Wang¹, Chang Xu², Bo Du¹, Honglak Lee³ ⁴
Abstract
This paper investigates how to weight imperfect expert demonstrations for generative adversarial ...

... any access to reward signal, has achieved great success in many sequential decision making problems (Stadie et al., 2017; Ermon et al., 2015; Finn et al., 2016). Compared to complex reward engin ...