2022-03-06
---
Title:
A conversion between utility and information
---
Authors:
Pedro A. Ortega, Daniel A. Braun
---
Year of latest submission:
2009
---
Classification:

Top-level category: Computer Science
Subcategory: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
--
Top-level category: Computer Science
Subcategory: Information Theory
Category description: Covers theoretical and experimental aspects of information theory and coding. Includes material in ACM Subject Class E.4 and intersects with H.1.1.
--
Top-level category: Mathematics
Subcategory: Information Theory
Category description: math.IT is an alias for cs.IT. Covers theoretical and experimental aspects of information theory and coding.
--

---
Abstract:
  Rewards typically express desirabilities or preferences over a set of alternatives. Here we propose that rewards can be defined for any probability distribution based on three desiderata, namely that rewards should be real-valued, additive and order-preserving, where the latter implies that more probable events should also be more desirable. Our main result states that rewards are then uniquely determined by the negative information content. To analyze stochastic processes, we define the utility of a realization as its reward rate. Under this interpretation, we show that the expected utility of a stochastic process is its negative entropy rate. Furthermore, we apply our results to analyze agent-environment interactions. We show that the expected utility that will actually be achieved by the agent is given by the negative cross-entropy from the input-output (I/O) distribution of the coupled interaction system and the agent's I/O distribution. Thus, our results allow for an information-theoretic interpretation of the notion of utility and the characterization of agent-environment interactions in terms of entropy dynamics.
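The quantities named in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes natural logarithms and a scale constant of 1 (the abstract fixes neither), treats distributions as plain dictionaries, and assumes an i.i.d. process for the reward rate. The function names `reward`, `expected_reward`, `reward_rate`, and `negative_cross_entropy` are hypothetical, chosen only for this illustration.

```python
import math


def reward(p: float) -> float:
    """Reward of an event with probability p, taken as log p (in nats).

    Assumption: scale constant 1 and natural log; this is the negative
    information content of the event.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return math.log(p)


def expected_reward(dist: dict) -> float:
    """Expected reward under dist itself, i.e. the negative Shannon entropy."""
    return sum(p * math.log(p) for p in dist.values() if p > 0)


def reward_rate(sequence: list, dist: dict) -> float:
    """Average reward per symbol of a realization.

    Assumption: symbols are i.i.d. draws, so rewards add up symbol by symbol.
    """
    return sum(reward(dist[x]) for x in sequence) / len(sequence)


def negative_cross_entropy(true_dist: dict, agent_dist: dict) -> float:
    """Expected reward when outcomes follow true_dist but rewards are
    computed from agent_dist. Returns -inf if the agent assigns zero
    probability to an outcome that can actually occur."""
    total = 0.0
    for x, p in true_dist.items():
        if p == 0:
            continue
        q = agent_dist.get(x, 0.0)
        if q == 0:
            return float("-inf")
        total += p * math.log(q)
    return total


if __name__ == "__main__":
    actual = {"a": 0.5, "b": 0.5}   # hypothetical coupled-system distribution
    agent = {"a": 0.9, "b": 0.1}    # hypothetical agent distribution
    print("expected reward (actual):", expected_reward(actual))
    print("negative cross-entropy:", negative_cross_entropy(actual, agent))
```

Under these assumptions the rewards are additive over independent events (`reward(p * q)` equals `reward(p) + reward(q)`), more probable events receive higher reward, and the negative cross-entropy never exceeds the negative entropy of the actual distribution, with equality when the agent's distribution matches it.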
---
PDF link:
https://arxiv.org/pdf/0911.5106