Title:
Scaling Inference for Markov Logic with a Task-Decomposition Approach
---
Authors:
Feng Niu, Ce Zhang, Christopher Ré, Jude Shavlik
---
Year of latest submission:
2012
---
Classification:
Primary category: Computer Science
Secondary category: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
--
Primary category: Computer Science
Secondary category: Databases
Category description: Covers database management, data mining, and data processing. Roughly includes material in ACM Subject Classes E.2, E.5, H.0, H.2, and J.1.
--
---
Abstract:
Motivated by applications in large-scale knowledge base construction, we study the problem of scaling up a sophisticated statistical inference framework called Markov Logic Networks (MLNs). Our approach, Felix, uses the idea of Lagrangian relaxation from mathematical programming to decompose a program into smaller tasks while preserving the joint-inference property of the original MLN. The advantage is that we can use highly scalable specialized algorithms for common tasks such as classification and coreference. We propose an architecture to support Lagrangian relaxation in an RDBMS which we show enables scalable joint inference for MLNs. We empirically validate that Felix is significantly more scalable and efficient than prior approaches to MLN inference by constructing a knowledge base from 1.8M documents as part of the TAC challenge. We show that Felix scales and achieves state-of-the-art quality numbers. In contrast, prior approaches do not scale even to a subset of the corpus that is three orders of magnitude smaller.
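
To make the decomposition idea in the abstract concrete, here is a minimal, hypothetical sketch of Lagrangian relaxation (dual decomposition): a shared variable is copied into per-task subproblems that are solved independently, and a Lagrange multiplier, updated by subgradient steps, drives the copies toward agreement. The score tables `F1`/`F2`, the helper `solve_copy`, and the step schedule are illustrative inventions under stated assumptions, not Felix's API or the paper's actual algorithm.

```python
# Toy sketch of Lagrangian relaxation (dual decomposition); NOT Felix's
# implementation.  We split
#     maximize  f1(x) + f2(x)   over a binary variable x
# into two subproblems, each holding its own copy of x, and penalize
# disagreement between the copies with a Lagrange multiplier that is
# updated by subgradient steps.

# Hypothetical per-task score tables: F[v] is the task's payoff for
# setting its copy of x to v (think of x as one MLN query atom).
F1 = {0: 0.0, 1: 2.0}   # e.g. a "classification" task prefers x = 1
F2 = {0: 1.5, 1: 0.0}   # e.g. a "coreference" task prefers x = 0

def solve_copy(scores, bonus):
    """One subproblem: choose the local copy's value maximizing
    scores[v] + bonus * v."""
    return max((0, 1), key=lambda v: scores[v] + bonus * v)

lam = 0.0                      # multiplier on the constraint x1 == x2
for t in range(100):
    # Each subproblem is solved independently, as a specialized
    # task-level solver would be used inside the framework.
    x1 = solve_copy(F1, +lam)  # maximizes f1(x1) + lam * x1
    x2 = solve_copy(F2, -lam)  # maximizes f2(x2) - lam * x2
    if x1 == x2:
        # Agreement: the assignment is feasible and its value matches
        # the dual bound, so it is optimal for the original problem.
        print(f"converged at step {t}: x = {x1}")
        break
    # (x1 - x2) is a subgradient of the dual; take a diminishing step.
    lam -= (x1 - x2) / (t + 1)
else:
    print("copies never agreed; a rounding heuristic would be needed")
```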
---
PDF link:
https://arxiv.org/pdf/1108.0294