Off-Policy Confidence Sequences
Nikos Karampatziakis 1 Paul Mineiro 2 Aaditya Ramdas 3
Abstract
We develop confidence bounds that hold uniformly over time for off-policy evaluation in the ...

... that the probability that they ever exclude the true value is bounded by a prespecified quantity. In other words, they retain validity under optional (early) stopping and optional continuation (collecting more dat ...
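
Stated formally, the time-uniform guarantee described above can be sketched as follows. The notation is assumed here for illustration and is not taken from the text: $\theta^\star$ is the true value, $(C_t)_{t \ge 1}$ is the sequence of confidence sets, and $\alpha \in (0,1)$ is the prespecified error level.

```latex
\[
  \Pr\bigl(\exists\, t \ge 1 : \theta^\star \notin C_t\bigr) \le \alpha
  \quad\Longleftrightarrow\quad
  \Pr\bigl(\forall\, t \ge 1 : \theta^\star \in C_t\bigr) \ge 1 - \alpha .
\]
```

By contrast, a fixed-sample confidence interval only guarantees $\Pr(\theta^\star \notin C_n) \le \alpha$ for a sample size $n$ chosen in advance, so it need not remain valid when the stopping time depends on the data.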