摘要翻译:
集体图形模型利用实例间的关联依赖来输出更精确的标记。然而,现有模型支持的关联性非常有限,这限制了精度的提高。本文有两大贡献。首先,我们提出了一个通用的集体推理框架,它使数据实例对其标号的一组{\em性质}达成一致。通过对称团势鼓励协议。我们证明了丰富的性质会带来更大的收益,并给出了一大类这类性质的系统推理过程。该过程在群集图上执行消息传递,其中属性感知消息是用特定于群集的算法计算的。这为领域自适应提供了一个仅限于推理的解决方案。我们在书目信息抽取上的实验表明,在未见域上显著地减少了测试错误。我们的第二个主要贡献是计算来自具有对称团势的团簇的传出消息的算法。我们的算法对二元标号上的任意对称势以及多标号上的类max-like和类multimal-like势都是精确的。对于大多数势,我们还提供了一个有效的基于拉格朗日松弛的算法,与精确算法相比较。我们提出了一个NP硬Potts势的13/15近似算法,在团的大小上,运行时间为次二次。相比之下,对于具有Potts势的图,最著名的先前保证仅为1/2。我们的经验表明,我们的Potts势的方法比最好的替代方案快一个数量级,而我们的基于拉格朗日松弛的多数势算法优于最佳适用的启发式--ICM。
---
英文标题:
《Generalized Collective Inference with Symmetric Clique Potentials》
---
作者:
Rahul Gupta, Sunita Sarawagi, Ajit A. Diwan
---
最新提交年份:
2009
---
分类信息:
一级分类:Computer Science 计算机科学
二级分类:Artificial Intelligence
人工智能
分类描述:Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
涵盖了人工智能的所有领域,除了视觉、机器人、机器学习、多智能体系统以及计算和语言(自然语言处理),这些领域有独立的学科领域。特别地,包括专家系统,定理证明(尽管这可能与计算机科学中的逻辑重叠),知识表示,规划,和人工智能中的不确定性。大致包括ACM学科类I.2.0、I.2.1、I.2.3、I.2.4、I.2.8和I.2.11中的材料。
--
---
英文摘要:
Collective graphical models exploit inter-instance associative dependence to output more accurate labelings. However existing models support very limited kind of associativity which restricts accuracy gains. This paper makes two major contributions. First, we propose a general collective inference framework that biases data instances to agree on a set of {\em properties} of their labelings. Agreement is encouraged through symmetric clique potentials. We show that rich properties leads to bigger gains, and present a systematic inference procedure for a large class of such properties. The procedure performs message passing on the cluster graph, where property-aware messages are computed with cluster specific algorithms. This provides an inference-only solution for domain adaptation. Our experiments on bibliographic information extraction illustrate significant test error reduction over unseen domains. Our second major contribution consists of algorithms for computing outgoing messages from clique clusters with symmetric clique potentials. Our algorithms are exact for arbitrary symmetric potentials on binary labels and for max-like and majority-like potentials on multiple labels. For majority potentials, we also provide an efficient Lagrangian Relaxation based algorithm that compares favorably with the exact algorithm. We present a 13/15-approximation algorithm for the NP-hard Potts potential, with runtime sub-quadratic in the clique size. In contrast, the best known previous guarantee for graphs with Potts potentials is only 1/2. We empirically show that our method for Potts potentials is an order of magnitude faster than the best alternatives, and our Lagrangian Relaxation based algorithm for majority potentials beats the best applicable heuristic -- ICM.
---
PDF链接:
https://arxiv.org/pdf/0907.0589