ICML-Incentivized Bandit Learning with Self-Reinforcing User Preferences

2025-08-10

Incentivized Bandit Learning with Self-Reinforcing User Preferences

Tianchen Zhou¹, Jia Liu¹, Chaosheng Dong², Jingyuan Deng²

Abstract

In this paper, we investigate a new multi-armed bandit (MAB) online learning model that considers real-world phenomena in man ...

... accumulates more positive feedback. For example, on a movie rental website, current customers tend to have more interest in Movie A that has 500 positive reviews, compared ...

Attachments

ICML-Incentivized Bandit Learning with Self-Reinforcing User Preferences.pdf

Size: 641.59 KB

