Title:
Conditioning Probabilistic Databases
---
Authors:
Christoph Koch and Dan Olteanu
---
Year of latest submission:
2008
---
Categories:
Top-level category: Computer Science
Subcategory: Databases
Category description: Covers database management, data mining, and data processing. Roughly includes material in ACM Subject Classes E.2, E.5, H.0, H.2, and J.1.
--
Top-level category: Computer Science
Subcategory: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
---
Abstract:
Past research on probabilistic databases has studied the problem of answering queries on a static database. Application scenarios of probabilistic databases however often involve the conditioning of a database using additional information in the form of new evidence. The conditioning problem is thus to transform a probabilistic database of priors into a posterior probabilistic database which is materialized for subsequent query processing or further refinement. It turns out that the conditioning problem is closely related to the problem of computing exact tuple confidence values. It is known that exact confidence computation is an NP-hard problem. This has led researchers to consider approximation techniques for confidence computation. However, neither conditioning nor exact confidence computation can be solved using such techniques. In this paper we present efficient techniques for both problems. We study several problem decomposition methods and heuristics that are based on the most successful search techniques from constraint satisfaction, such as the Davis-Putnam algorithm. We complement this with a thorough experimental evaluation of the algorithms proposed. Our experiments show that our exact algorithms scale well to realistic database sizes and can in some scenarios compete with the most efficient previous approximation algorithms.
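
To make the conditioning problem described in the abstract concrete, here is a minimal brute-force sketch in Python. It is not the paper's algorithm: it enumerates every possible world, which is exponential in the number of tuples, whereas the paper studies decomposition methods and heuristics to avoid exactly that. The tuple names `t1` and `t2`, their probabilities, the tuple-independence assumption, and the helper names `possible_worlds`, `condition`, and `confidence` are all hypothetical and chosen only for illustration.

```python
from itertools import product


def possible_worlds(tuples):
    """Yield (world, probability) pairs, assuming tuples are independent.

    `tuples` maps a tuple name to the prior probability that it is present.
    """
    names = list(tuples)
    for bits in product([False, True], repeat=len(names)):
        world = frozenset(n for n, b in zip(names, bits) if b)
        prob = 1.0
        for n, b in zip(names, bits):
            prob *= tuples[n] if b else 1.0 - tuples[n]
        yield world, prob


def condition(worlds, evidence):
    """Keep only worlds that satisfy the evidence and renormalize."""
    kept = [(w, p) for w, p in worlds if evidence(w)]
    total = sum(p for _, p in kept)
    if total == 0:
        raise ValueError("evidence has probability zero")
    return [(w, p / total) for w, p in kept]


def confidence(worlds, name):
    """Exact confidence of a tuple: total probability of worlds containing it."""
    return sum(p for w, p in worlds if name in w)


# Hypothetical prior database: two independent tuples.
prior = list(possible_worlds({"t1": 0.5, "t2": 0.5}))

# Hypothetical evidence: at least one of the two tuples is present.
posterior = condition(prior, lambda w: "t1" in w or "t2" in w)

print(confidence(prior, "t1"))      # prior confidence of t1
print(confidence(posterior, "t1"))  # posterior confidence of t1
```

In this toy case the evidence removes the empty world, so the confidence of `t1` rises from 1/2 to 2/3. The renormalization step divides by the probability of the evidence, which is itself a confidence value; this is one way to see why the abstract calls conditioning closely related to exact confidence computation.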
---
PDF link:
https://arxiv.org/pdf/0803.2212