Kernel-Based Reinforcement Learning: A Finite-Time Analysis
Omar D. Domingues 1 2 Pierre Menard 3 Matteo Pirotta 4 Emilie Kaufmann 1 5 Michal Valko 1 5 6
Abstract
We consider the exploration-exploitation dilemma in f ...

in the face of uncertainty (OFU, Jaksch et al., 2010) and Thompson Sampling (Strens, 2000; Osband et al., 2013) principles have been used to design algorithms with sublin-