Tight Regret Bounds for Model-Based Reinforcement
Learning with Greedy Policies
Yonathan Efroni (Technion, Israel), Nadav Merlis (Technion, Israel),
Mohammad Ghavamzadeh (Facebook AI Research), Shie Mannor (Technion, Israel)
Abstract
State-of-the-art efficient model-based Reinforcement Learning (RL) algorithms
typically act by iteratively solving empirical models, i.e., by performing full-planning
on Markov Decision P ...