具有移动目标的竞争搜索博弈

nandehutu2022

359

收藏 2022-04-07

摘要翻译：
我们引入了一个离散时间搜索博弈，其中两个玩家竞争先找到一个对象。物体在有限多个状态下按照时变马尔可夫链运动。玩家知道马尔可夫链和物体的初始概率分布，但不观察物体的当前状态。队员们轮流活跃。主动玩家选择一个状态，这个选择由另一个玩家观察。如果对象处于选定状态，则此玩家获胜，游戏结束。否则，对象按照马尔可夫链移动，博弈在下一个周期继续。我们证明了这个博弈允许一个值，并且对于任何错误项$\veps>0$，每个玩家都有一个纯（子博弈完美）$\veps$-最优策略。有趣的是，0最优策略并不总是存在的。在所有有限但足够长的范围内，$\veps$-最优策略都是$2\veps$-最优的，并且在折扣因子接近1的情况下，在折扣版本的博弈中也是$2\veps$-最优的，因此$\veps$-最优策略是稳健的。我们得到了关于值的解析性质和结构性质以及$\veps$-最优策略的结果。此外，我们还考察了易于计算和实现的有限截断策略的性能。我们特别关注重要的时间齐次情形，在此情形下，附加结果成立。
---
英文标题：
《A competitive search game with a moving target》
---
作者：
Benoit Duvocelle, J\'anos Flesch, Mathias Staudigl, Dries Vermeulen
---
最新提交年份：
2020
---
分类信息：

一级分类：Computer Science 计算机科学
二级分类：Computer Science and Game Theory 计算机科学与博弈论
分类描述：Covers all theoretical and applied aspects at the intersection of computer science and game theory, including work in mechanism design, learning in games (which may overlap with Learning), foundations of agent modeling in games (which may overlap with Multiagent systems), coordination, specification and formal methods for non-cooperative computational environments. The area also deals with applications of game theory to areas such as electronic commerce.
涵盖计算机科学和博弈论交叉的所有理论和应用方面，包括机制设计的工作，游戏中的学习（可能与学习重叠），游戏中的agent建模的基础（可能与多agent系统重叠），非合作计算环境的协调、规范和形式化方法。该领域还涉及博弈论在电子商务等领域的应用。
--
一级分类：Economics 经济学
二级分类：Theoretical Economics 理论经济学
分类描述：Includes theoretical contributions to Contract Theory, Decision Theory, Game Theory, General Equilibrium, Growth, Learning and Evolution, Macroeconomics, Market and Mechanism Design, and Social Choice.
包括对契约理论、决策理论、博弈论、一般均衡、增长、学习与进化、宏观经济学、市场与机制设计、社会选择的理论贡献。
--
一级分类：Mathematics 数学
二级分类：Probability 概率
分类描述：Theory and applications of probability and stochastic processes: e.g. central limit theorems, large deviations, stochastic differential equations, models from statistical mechanics, queuing theory
概率论与随机过程的理论与应用：例如中心极限定理，大偏差，随机微分方程，统计力学模型，排队论
--

---
英文摘要：
We introduce a discrete-time search game, in which two players compete to find an object first. The object moves according to a time-varying Markov chain on finitely many states. The players know the Markov chain and the initial probability distribution of the object, but do not observe the current state of the object. The players are active in turns. The active player chooses a state, and this choice is observed by the other player. If the object is in the chosen state, this player wins and the game ends. Otherwise, the object moves according to the Markov chain and the game continues at the next period. We show that this game admits a value, and for any error-term $\veps>0$, each player has a pure (subgame-perfect) $\veps$-optimal strategy. Interestingly, a 0-optimal strategy does not always exist. The $\veps$-optimal strategies are robust in the sense that they are $2\veps$-optimal on all finite but sufficiently long horizons, and also $2\veps$-optimal in the discounted version of the game provided that the discount factor is close to 1. We derive results on the analytic and structural properties of the value and the $\veps$-optimal strategies. Moreover, we examine the performance of the finite truncation strategies, which are easy to calculate and to implement. We devote special attention to the important time-homogeneous case, where additional results hold.
---
PDF链接：
https://arxiv.org/pdf/2008.12032

扫码加我拉你入群

请注明：姓名-公司-职位

以便审核进群资格，未注明则拒绝

扫码加我 拉你入群

分享

扫码加好友，拉您进群

扫码加我拉你入群