2022-04-02
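Abstract translation:
Collaborative tagging systems, such as Delicious, CiteULike, and others, allow users to annotate resources, such as Web pages or scientific papers, with descriptive labels called tags. The social annotations contributed by thousands of users can be used to infer categorical knowledge, classify documents, or recommend new relevant information. Traditional text inference methods do not make full use of social annotations, because they do not account for variation in individual users' perspectives and vocabulary. In previous work, we introduced a simple probabilistic model that takes the interests of individual annotators into account in order to discover the hidden topics of annotated resources. Unfortunately, that approach has one major shortcoming: the number of topics and interests must be specified in advance. To address this problem, we extend the model to a fully Bayesian framework, which offers a way to estimate these numbers automatically. In particular, the model allows the number of interests and topics to change as suggested by the structure of the data. We evaluate the model in detail on synthetic and real-world data by comparing its performance with that of Latent Dirichlet Allocation on the topic extraction task. For the latter evaluation, we apply the model to infer the topics of Web resources from social annotations obtained from Delicious, in order to discover new resources similar to a specified one. Our empirical results show that the proposed model is a promising method for exploiting the social knowledge contained in user-generated annotations.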
---
English title:
"Modeling Social Annotation: a Bayesian Approach"
---
Authors:
Anon Plangprasopchok, Kristina Lerman
---
Latest submission year:
2010
---
Classification:

Primary category: Computer Science
Secondary category: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.

---
English abstract:
Collaborative tagging systems, such as Delicious, CiteULike, and others, allow users to annotate resources, e.g., Web pages or scientific papers, with descriptive labels called tags. The social annotations contributed by thousands of users can potentially be used to infer categorical knowledge, classify documents, or recommend new relevant information. Traditional text inference methods do not make the best use of social annotation, since they do not take into account variations in individual users' perspectives and vocabulary. In previous work, we introduced a simple probabilistic model that takes the interests of individual annotators into account in order to find hidden topics of annotated resources. Unfortunately, that approach had one major shortcoming: the number of topics and interests had to be specified a priori. To address this drawback, we extend the model to a fully Bayesian framework, which offers a way to automatically estimate these numbers. In particular, the model allows the number of interests and topics to change as suggested by the structure of the data. We evaluate the proposed model in detail on synthetic and real-world data by comparing its performance to Latent Dirichlet Allocation on the topic extraction task. For the latter evaluation, we apply the model to infer topics of Web resources from social annotations obtained from Delicious in order to discover new resources similar to a specified one. Our empirical results demonstrate that the proposed model is a promising method for exploiting social knowledge contained in user-generated annotations.
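The abstract does not say how the number of topics and interests is estimated from the data. Purely as an illustration, and not as the authors' model, the sketch below uses the Chinese restaurant process, a standard Bayesian nonparametric prior under which the number of groups is not fixed in advance but grows with the amount of data. The function name `crp_assignments` and the parameters `alpha` and `seed` are hypothetical names introduced for this example.

```python
import random


def crp_assignments(n_items, alpha, seed=0):
    """Sample one cluster label per item from a Chinese restaurant process.

    Illustrative only: `alpha` (the concentration parameter) controls how
    readily a new cluster is opened. This is NOT the model from the paper.
    """
    rng = random.Random(seed)
    counts = []   # counts[k] = number of items already assigned to cluster k
    labels = []
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)      # open a new cluster
        else:
            counts[k] += 1        # join an existing cluster
        labels.append(k)
    return labels


if __name__ == "__main__":
    # The number of clusters is an outcome of sampling, not an input.
    for n in (10, 100, 1000):
        labels = crp_assignments(n, alpha=1.0, seed=42)
        print(n, "items ->", len(set(labels)), "clusters")
```

The point of the sketch is only that the number of clusters is never passed in: it emerges from the sampled assignments, which is the general idea behind letting the number of interests and topics change as the data suggest.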
---
PDF link:
https://arxiv.org/pdf/0811.1319