Continuous Strategy Replicator Dynamics for Multi-Agent Learning

2022-03-07

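Abstract (translated):
Multi-agent learning and adaptation have attracted considerable attention in recent years. It has been suggested that the dynamics of multi-agent learning can be studied with replicator equations from population biology. Most existing studies, however, are limited to discrete strategy spaces with a small number of available actions, whereas in many settings the choices available to agents are better described by a continuous spectrum. This paper proposes a generalized replicator framework for studying the adaptive dynamics of Q-learning agents with continuous strategy spaces. An agent's strategy is characterized by a probability measure over a continuous variable rather than by a probability vector. As a result, the ordinary differential equations of the discrete case are replaced by a system of coupled integral-differential replicator equations that describe the mutual evolution of the individual agents' strategies. The paper derives a set of functional equations describing the steady state of the replicator dynamics, examines their solutions in several two-player games, and confirms the analytical results with simulations.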
---
English title:
Continuous Strategy Replicator Dynamics for Multi-Agent Learning
---
Author:
Aram Galstyan
---
Year of latest submission:
2011
---
Categories:

Primary category: Computer Science
Subcategory: Machine Learning
Description: Papers on all aspects of machine learning research (supervised, unsupervised, reinforcement learning, bandit problems, and so on) including also robustness, explanation, fairness, and methodology. cs.LG is also an appropriate primary category for applications of machine learning methods.
--
Primary category: Computer Science
Subcategory: Artificial Intelligence
Description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
--
Primary category: Computer Science
Subcategory: Computer Science and Game Theory
Description: Covers all theoretical and applied aspects at the intersection of computer science and game theory, including work in mechanism design, learning in games (which may overlap with Learning), foundations of agent modeling in games (which may overlap with Multiagent systems), coordination, specification and formal methods for non-cooperative computational environments. The area also deals with applications of game theory to areas such as electronic commerce.
--
Primary category: Physics
Subcategory: Adaptation and Self-Organizing Systems
Description: Adaptation, self-organizing systems, statistical physics, fluctuating systems, stochastic processes, interacting particle systems, machine learning
--

---
Abstract (original English):
The problem of multi-agent learning and adaptation has attracted a great deal of attention in recent years. It has been suggested that the dynamics of multi-agent learning can be studied using replicator equations from population biology. Most existing studies so far have been limited to discrete strategy spaces with a small number of available actions. In many cases, however, the choices available to agents are better characterized by continuous spectra. This paper suggests a generalization of the replicator framework that allows the study of the adaptive dynamics of Q-learning agents with continuous strategy spaces. Instead of probability vectors, agents' strategies are now characterized by probability measures over continuous variables. As a result, the ordinary differential equations for the discrete case are replaced by a system of coupled integral-differential replicator equations that describe the mutual evolution of individual agent strategies. We derive a set of functional equations describing the steady state of the replicator dynamics, examine their solutions for several two-player games, and confirm our analytical results using simulations.
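
The abstract gives no equations, so the following is only a rough numerical sketch of the kind of system it describes, not the paper's own formulation. It assumes a Boltzmann-type (entropy-regularized) replicator equation for each player's strategy density, dp/dt = p(x) [ f(x) - <f> ] with f(x) = r(x) - T ln p(x) and r(x) the expected payoff against the opponent's density. Under that assumption the steady state satisfies the Gibbs-like condition p(x) proportional to exp(r(x) / T). The temperature `T`, the quadratic payoff, the grid on [0, 1], and all function names are hypothetical choices made for illustration.

```python
import numpy as np

def payoff_field(payoff, opponent_density, dx):
    # r(x) = integral of payoff(x, y) * q(y) dy, as a Riemann sum on the grid
    return payoff @ opponent_density * dx

def gibbs(r, temperature, dx):
    # Normalized density proportional to exp(r / T) (assumed steady-state form)
    w = np.exp((r - r.max()) / temperature)
    return w / (np.sum(w) * dx)

def replicator_step(p, r, temperature, dx, dt):
    # One explicit Euler step of the assumed entropy-regularized replicator equation
    fitness = r - temperature * np.log(p)
    mean_fitness = np.sum(p * fitness) * dx
    p_new = p + dt * p * (fitness - mean_fitness)
    return p_new / (np.sum(p_new) * dx)  # guard against round-off drift

def simulate(n=100, temperature=1.0, dt=0.01, steps=3000):
    x = (np.arange(n) + 0.5) / n          # midpoint grid on [0, 1]
    dx = 1.0 / n
    A = -(x[:, None] - x[None, :]) ** 2   # hypothetical payoff to player 1 at (x, y)
    B = A.T                               # player 2's payoff, indexed [y, x]
    p = gibbs(-3.0 * x, 1.0, dx)          # initial density skewed toward 0
    q = gibbs(+3.0 * x, 1.0, dx)          # initial density skewed toward 1
    for _ in range(steps):
        r_p = payoff_field(A, q, dx)
        r_q = payoff_field(B, p, dx)
        # Both players update simultaneously from the same pre-step densities
        p, q = (replicator_step(p, r_p, temperature, dx, dt),
                replicator_step(q, r_q, temperature, dx, dt))
    return x, dx, A, B, p, q

if __name__ == "__main__":
    x, dx, A, B, p, q = simulate()
    target = gibbs(payoff_field(A, q, dx), 1.0, dx)
    print("max deviation from Gibbs form:", np.max(np.abs(p - target)))
```

In this toy setup the two densities relax toward a common profile centered in the interval, and the printed deviation measures how closely the simulated density matches the assumed steady-state condition. A finer grid or a different payoff function can be substituted without changing the structure of the sketch.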
---
PDF link:
https://arxiv.org/pdf/0904.4717
