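Translated abstract:
The problem of how to organize information for multidocument summarization so that the generated summary is coherent has not received enough attention. While the sentence order for single-document summarization can be determined from the order of sentences in the input article, this is not the case for multidocument summarization, where summary sentences may be drawn from different input articles. In this paper, we propose a methodology for studying the properties of ordering information in the news genre and describe experiments on a corpus of multiple acceptable orderings that we developed for this task. Based on these experiments, we implemented an information-ordering strategy that combines constraints from the chronological order of events and topical relatedness. Evaluation of our augmented algorithm shows a significant improvement in ordering over two baseline strategies.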
---
English title:
Inferring Strategies for Sentence Ordering in Multidocument News Summarization
---
Authors:
R. Barzilay, N. Elhadad
---
Latest submission year:
2011
---
Classification:
Primary category: Computer Science
Secondary category: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
---
English abstract:
The problem of organizing information for multidocument summarization so that the generated summary is coherent has received relatively little attention. While sentence ordering for single document summarization can be determined from the ordering of sentences in the input article, this is not the case for multidocument summarization where summary sentences may be drawn from different input articles. In this paper, we propose a methodology for studying the properties of ordering information in the news genre and describe experiments done on a corpus of multiple acceptable orderings we developed for the task. Based on these experiments, we implemented a strategy for ordering information that combines constraints from chronological order of events and topical relatedness. Evaluation of our augmented algorithm shows a significant improvement of the ordering over two baseline strategies.
---
PDF link:
https://arxiv.org/pdf/1106.1820