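Abstract (translated):
We study the problem of decision-theoretic online learning (DTOL). Motivated by practical applications, we focus on DTOL when the number of actions is very large. Previous learning algorithms have a tunable learning-rate parameter, and one barrier to using online learning in practice is that it is not known how to set this parameter optimally, especially when the number of actions is large. We offer a clean solution by proposing a novel, completely parameter-free algorithm for DTOL. We introduce a new notion of regret that is more natural for applications with a large number of actions. We show that our algorithm performs well with respect to this new notion of regret; in addition, under previous notions of regret, its performance is close to the best bounds achieved by previous algorithms with optimally tuned parameters.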
---
English title:
"A parameter-free hedging algorithm"
---
Authors:
Kamalika Chaudhuri, Yoav Freund, Daniel Hsu
---
Year of latest submission:
2010
---
Categories:
Primary category: Computer Science
Subcategory: Machine Learning
Description: Papers on all aspects of machine learning research (supervised, unsupervised, reinforcement learning, bandit problems, and so on) including also robustness, explanation, fairness, and methodology. cs.LG is also an appropriate primary category for applications of machine learning methods.
--
Primary category: Computer Science
Subcategory: Artificial Intelligence
Description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
--
---
English abstract:
We study the problem of decision-theoretic online learning (DTOL). Motivated by practical applications, we focus on DTOL when the number of actions is very large. Previous algorithms for learning in this framework have a tunable learning rate parameter, and a barrier to using online learning in practical applications is that it is not understood how to set this parameter optimally, particularly when the number of actions is large. In this paper, we offer a clean solution by proposing a novel and completely parameter-free algorithm for DTOL. We introduce a new notion of regret, which is more natural for applications with a large number of actions. We show that our algorithm achieves good performance with respect to this new notion of regret; in addition, it also achieves performance close to that of the best bounds achieved by previous algorithms with optimally tuned parameters, according to previous notions of regret.
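
To make the "tunable learning rate parameter" concrete, here is a minimal sketch of the classical exponential-weights Hedge baseline for DTOL, i.e. the kind of previous algorithm the abstract refers to. It is not the paper's parameter-free algorithm. The function name `hedge`, the argument names, and the assumption that losses lie in [0, 1] are illustrative choices, not taken from the abstract.

```python
import math


def hedge(loss_rounds, eta):
    """Classical Hedge (exponential weights) with learning rate eta.

    loss_rounds: list of per-round loss vectors, one entry per action,
                 each loss assumed to lie in [0, 1] (assumption).
    Returns (learner's cumulative expected loss, regret to the best action).
    """
    n = len(loss_rounds[0])
    cum_loss = [0.0] * n      # cumulative loss of each action so far
    learner_loss = 0.0

    for losses in loss_rounds:
        # Weights proportional to exp(-eta * cumulative loss).
        # Subtracting the minimum keeps the exponentials numerically stable.
        m = min(cum_loss)
        weights = [math.exp(-eta * (c - m)) for c in cum_loss]
        total = sum(weights)
        probs = [w / total for w in weights]

        # The learner suffers the expected loss under its distribution.
        learner_loss += sum(p * l for p, l in zip(probs, losses))
        cum_loss = [c + l for c, l in zip(cum_loss, losses)]

    regret = learner_loss - min(cum_loss)
    return learner_loss, regret


if __name__ == "__main__":
    # Two actions: the first never incurs loss, the second always does.
    rounds = [[0.0, 1.0]] * 10
    print(hedge(rounds, eta=1.0))
```

For losses in [0, 1], the standard guarantee for this baseline is regret at most ln(N)/eta + eta*T/8, which is minimized at eta = sqrt(8 ln(N) / T). That tuned value depends on both the number of actions N and the horizon T, which is the tuning difficulty the abstract describes.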
---
PDF link:
https://arxiv.org/pdf/0903.2851