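Abstract (translation):
Personalized web services strive to tailor their offerings (advertisements, news articles, and so on) to individual users by exploiting both content and user information. Despite some recent progress, the problem remains challenging for at least two reasons. First, web services have dynamically changing content pools, which makes traditional collaborative filtering methods inapplicable. Second, the scale of most web services of practical interest demands solutions that are fast in both learning and computation. In this paper, we model personalized recommendation of news articles as a contextual bandit problem, a principled approach in which a learning algorithm sequentially selects articles to serve users based on contextual information about the users and articles, while adapting its article-selection strategy based on user-click feedback to maximize total user clicks. The contributions of this work are threefold. First, we propose a new, general contextual bandit algorithm that is computationally efficient and well motivated by learning theory. Second, we argue that any bandit algorithm can be reliably evaluated offline using previously recorded random traffic. Finally, using this offline evaluation method, we successfully applied our new algorithm to a Yahoo! Front Page Today Module dataset containing over 33 million events. The results show a 12.5% click lift over a standard context-free bandit algorithm, and the advantage grows as data becomes scarcer.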
---
English title:
A Contextual-Bandit Approach to Personalized News Article Recommendation
---
Authors:
Lihong Li, Wei Chu, John Langford, Robert E. Schapire
---
Latest submission year:
2012
---
Classification:
Primary category: Computer Science
Secondary category: Machine Learning
Category description: Papers on all aspects of machine learning research (supervised, unsupervised, reinforcement learning, bandit problems, and so on) including also robustness, explanation, fairness, and methodology. cs.LG is also an appropriate primary category for applications of machine learning methods.
--
Primary category: Computer Science
Secondary category: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
--
Primary category: Computer Science
Secondary category: Information Retrieval
Category description: Covers indexing, dictionaries, retrieval, content and analysis. Roughly includes material in ACM Subject Classes H.3.0, H.3.1, H.3.2, H.3.3, and H.3.4.
--
---
English abstract:
Personalized web services strive to adapt their services (advertisements, news articles, etc) to individual users by making use of both content and user information. Despite a few recent advances, this problem remains challenging for at least two reasons. First, web service is featured with dynamically changing pools of content, rendering traditional collaborative filtering methods inapplicable. Second, the scale of most web services of practical interest calls for solutions that are both fast in learning and computation. In this work, we model personalized recommendation of news articles as a contextual bandit problem, a principled approach in which a learning algorithm sequentially selects articles to serve users based on contextual information about the users and articles, while simultaneously adapting its article-selection strategy based on user-click feedback to maximize total user clicks. The contributions of this work are three-fold. First, we propose a new, general contextual bandit algorithm that is computationally efficient and well motivated from learning theory. Second, we argue that any bandit algorithm can be reliably evaluated offline using previously recorded random traffic. Finally, using this offline evaluation method, we successfully applied our new algorithm to a Yahoo! Front Page Today Module dataset containing over 33 million events. Results showed a 12.5% click lift compared to a standard context-free bandit algorithm, and the advantage becomes even greater when data gets more scarce.
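
The abstract names two technical ideas without spelling them out: a contextual bandit that picks articles from context and click feedback, and offline evaluation on previously recorded random traffic. The following is a minimal sketch of both, not the authors' implementation. It assumes a linear upper-confidence-bound policy with one ridge-regression model per article (the paper's algorithm is commonly known as LinUCB) and a replay-style evaluator that keeps only logged events where the policy's choice matches the logged article. The names `LinUCBArm`, `replay_evaluate`, and `synthetic_log`, the `alpha` value, and the synthetic click probabilities are all hypothetical and chosen for illustration only.

```python
import math
import random


class LinUCBArm:
    """Per-article ridge-regression state for a linear UCB policy (illustrative)."""

    def __init__(self, dim):
        # Inverse of A = I (identity), the ridge prior; kept symmetric by the updates.
        self.A_inv = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim

    def _a_inv_times(self, vec):
        return [sum(r * v for r, v in zip(row, vec)) for row in self.A_inv]

    def ucb(self, x, alpha):
        theta = self._a_inv_times(self.b)            # estimated coefficients
        mean = sum(t * v for t, v in zip(theta, x))  # predicted click rate
        ax = self._a_inv_times(x)
        width = math.sqrt(max(sum(a * v for a, v in zip(ax, x)), 0.0))
        return mean + alpha * width                  # optimism under uncertainty

    def update(self, x, reward):
        # Sherman-Morrison rank-one update of A_inv after adding x x^T to A.
        ax = self._a_inv_times(x)
        denom = 1.0 + sum(a * v for a, v in zip(ax, x))
        for i in range(len(x)):
            for j in range(len(x)):
                self.A_inv[i][j] -= ax[i] * ax[j] / denom
            self.b[i] += reward * x[i]


def replay_evaluate(events, n_arms, dim, alpha=1.0):
    """Estimate click-through rate from logs whose arms were chosen uniformly at random.

    events: iterable of (context_vector, logged_arm, reward).
    Returns (estimated_ctr, number_of_matched_events).
    """
    arms = [LinUCBArm(dim) for _ in range(n_arms)]
    matched, clicks = 0, 0.0
    for x, logged_arm, reward in events:
        chosen = max(range(n_arms), key=lambda a: arms[a].ucb(x, alpha))
        if chosen != logged_arm:
            continue  # the log cannot tell us the outcome of an arm it did not show
        arms[chosen].update(x, reward)
        matched += 1
        clicks += reward
    return (clicks / matched if matched else 0.0), matched


def synthetic_log(n_events, seed=0):
    """Made-up traffic: two user groups, two articles, uniformly random logging."""
    rng = random.Random(seed)
    events = []
    for _ in range(n_events):
        group = rng.randrange(2)
        x = [1.0, 0.0] if group == 0 else [0.0, 1.0]
        arm = rng.randrange(2)               # uniformly random logging policy
        p_click = 0.8 if arm == group else 0.2  # hypothetical click probabilities
        events.append((x, arm, 1.0 if rng.random() < p_click else 0.0))
    return events


if __name__ == "__main__":
    ctr, matched = replay_evaluate(synthetic_log(4000), n_arms=2, dim=2)
    print(f"estimated CTR {ctr:.3f} over {matched} matched events")
```

On this synthetic log a uniformly random policy would earn a click rate of about 0.5, so an estimate well above that suggests the policy has learned which article suits which user group. The estimate is only trustworthy under the stated assumption that the logged articles were chosen uniformly at random and independently of context.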
---
PDF link:
https://arxiv.org/pdf/1003.0146