Undiscounted Bandit Games

2022-03-27

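Abstract (translated from Chinese):
We analyze undiscounted continuous-time games of strategic experimentation with two-armed bandits. The risky arm generates payoffs according to a Lévy process whose average payoff per unit of time is unknown and is drawn by nature from an arbitrary finite set. Observing all actions and realized payoffs, plus a free background signal, players use Markov strategies with the common posterior belief about the unknown parameter as the state variable. We show that, given the current belief, the unique symmetric Markov perfect equilibrium can be computed in a simple closed form involving only the payoff of the safe arm, the expected current payoff of the risky arm, and the expected full-information payoff. In particular, the equilibrium does not depend on the precise specification of the payoff-generating processes.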
---
English title:
Undiscounted Bandit Games
---
Authors:
Godfrey Keller and Sven Rady
---
Year of latest submission:
2020
---
Classification:

Primary category: Economics
Secondary category: Theoretical Economics
Category description: Includes theoretical contributions to Contract Theory, Decision Theory, Game Theory, General Equilibrium, Growth, Learning and Evolution, Macroeconomics, Market and Mechanism Design, and Social Choice.
---
Original English abstract:
We analyze undiscounted continuous-time games of strategic experimentation with two-armed bandits. The risky arm generates payoffs according to a Lévy process with an unknown average payoff per unit of time which nature draws from an arbitrary finite set. Observing all actions and realized payoffs, plus a free background signal, players use Markov strategies with the common posterior belief about the unknown parameter as the state variable. We show that the unique symmetric Markov perfect equilibrium can be computed in a simple closed form involving only the payoff of the safe arm, the expected current payoff of the risky arm, and the expected full-information payoff, given the current belief. In particular, the equilibrium does not depend on the precise specification of the payoff-generating processes.
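
The abstract names three quantities that the closed form depends on. As a rough illustration only, the sketch below computes two of them from a belief over a finite set of possible mean payoffs. It does not reproduce the paper's equilibrium formula, which the abstract does not state. The function name `belief_summaries`, its parameters, and the reading of "full-information payoff" as the payoff from picking the better arm once the true mean is known are all assumptions made for this example.

```python
from typing import Sequence, Tuple


def belief_summaries(
    means: Sequence[float],
    belief: Sequence[float],
    safe_payoff: float,
) -> Tuple[float, float]:
    """Return (expected current payoff, expected full-information payoff).

    Hypothetical helper, not taken from the paper.
    means:       the finite set of possible average payoffs of the risky arm
    belief:      posterior probability assigned to each possible mean
    safe_payoff: known payoff per unit of time of the safe arm
    """
    if len(means) != len(belief):
        raise ValueError("means and belief must have the same length")
    if any(p < 0 for p in belief) or abs(sum(belief) - 1.0) > 1e-9:
        raise ValueError("belief must be a probability vector")

    # Expected current payoff of the risky arm under the belief.
    expected_current = sum(p * mu for p, mu in zip(belief, means))

    # Assumed reading of "full-information payoff": once the true mean is
    # known, a player takes the better of the safe and the risky arm.
    expected_full_info = sum(
        p * max(safe_payoff, mu) for p, mu in zip(belief, means)
    )
    return expected_current, expected_full_info


if __name__ == "__main__":
    # Two possible means, equally likely, with a safe payoff between them.
    m, f = belief_summaries(means=[0.0, 2.0], belief=[0.5, 0.5], safe_payoff=1.0)
    print("expected current payoff:", m)
    print("expected full-information payoff:", f)
```

Under this reading, the gap between the two returned values is the expected gain from learning the true mean, which is the kind of trade-off against the safe payoff that the abstract alludes to.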
---
PDF link:
https://arxiv.org/pdf/1909.13323
