摘要翻译:
提出了一种基于语义Web本体和规则的频繁模式挖掘方法。特别地,我们考虑使用一种将描述逻辑与DL-safe规则相结合的语言的设置。该设置对于数据挖掘在语义Web中的实际应用非常重要。重点研究了表示形式的语义与频繁模式发现任务之间的关系,并针对该方法的核心,提出了一种利用组合知识库语义的算法。我们已经开发了一个概念验证的数据挖掘实现。我们的实证结果表明,使用组合知识库进行语义测试可以通过在评估前剪枝无用的候选模式来加快
数据挖掘的速度。我们还表明,所产生的模式集的质量可能会提高:模式更加紧凑,模式更少。我们得出结论,利用所选择的表示形式的语义是(到)关系频繁模式发现方法的设计和应用的关键。注:出现在逻辑程序设计理论与实践(TPLP)中
---
英文标题:
《The role of semantics in mining frequent patterns from knowledge bases
in description logics with rules》
---
作者:
Joanna Jozefowska, Agnieszka Lawrynowicz, Tomasz Lukaszewski
---
最新提交年份:
2010
---
分类信息:
一级分类:Computer Science 计算机科学
二级分类:Logic in Computer Science 计算机科学中的逻辑
分类描述:Covers all aspects of logic in computer science, including finite model theory, logics of programs, modal logic, and program verification. Programming language semantics should have Programming Languages as the primary subject area. Roughly includes material in ACM Subject Classes D.2.4, F.3.1, F.4.0, F.4.1, and F.4.2; some material in F.4.3 (formal languages) may also be appropriate here, although Computational Complexity is typically the more appropriate subject area.
涵盖计算机科学中逻辑的所有方面,包括有限模型理论,程序逻辑,模态逻辑和程序验证。程序设计语言语义学应该把程序设计语言作为主要的学科领域。大致包括ACM学科类D.2.4、F.3.1、F.4.0、F.4.1和F.4.2中的材料;F.4.3(形式语言)中的一些材料在这里也可能是合适的,尽管计算复杂性通常是更合适的主题领域。
--
一级分类:Computer Science 计算机科学
二级分类:Artificial Intelligence
人工智能
分类描述:Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
涵盖了人工智能的所有领域,除了视觉、机器人、机器学习、多智能体系统以及计算和语言(自然语言处理),这些领域有独立的学科领域。特别地,包括专家系统,定理证明(尽管这可能与计算机科学中的逻辑重叠),知识表示,规划,和人工智能中的不确定性。大致包括ACM学科类I.2.0、I.2.1、I.2.3、I.2.4、I.2.8和I.2.11中的材料。
--
---
英文摘要:
We propose a new method for mining frequent patterns in a language that combines both Semantic Web ontologies and rules. In particular we consider the setting of using a language that combines description logics with DL-safe rules. This setting is important for the practical application of data mining to the Semantic Web. We focus on the relation of the semantics of the representation formalism to the task of frequent pattern discovery, and for the core of our method, we propose an algorithm that exploits the semantics of the combined knowledge base. We have developed a proof-of-concept data mining implementation of this. Using this we have empirically shown that using the combined knowledge base to perform semantic tests can make data mining faster by pruning useless candidate patterns before their evaluation. We have also shown that the quality of the set of patterns produced may be improved: the patterns are more compact, and there are fewer patterns. We conclude that exploiting the semantics of a chosen representation formalism is key to the design and application of (onto-)relational frequent pattern discovery methods. Note: To appear in Theory and Practice of Logic Programming (TPLP)
---
PDF链接:
https://arxiv.org/pdf/1003.2700