Abstract: With user-generated content, anyone can be a content creator. This phenomenon has vastly increased the amount of information circulating online, and it is becoming harder to obtain the required information efficiently. In this paper, we describe how natural language processing and text mining can be parallelized using Hadoop and Message Passing Interface (MPI). We propose a parallel web text mining platform that processes massive amounts of data quickly and efficiently. Our web knowledge service platform is designed to collect information about the IT and telecommunications industries from the web and to process this information using natural language processing and data-mining techniques.
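To give a rough idea of why text mining parallelizes well, here is a minimal sketch of the map/reduce pattern that Hadoop popularized, applied to term counting. It is not the paper's platform: it runs in a single local process using only the Python standard library, and the names `map_document`, `reduce_counts`, and `term_frequencies`, as well as the sample documents, are assumptions made up for illustration.

```python
import re
from collections import Counter
from functools import reduce

# Hypothetical tokenizer: lowercase alphanumeric runs only.
TOKEN_RE = re.compile(r"[a-z0-9]+")


def map_document(text):
    """Map step: count the terms of one document, independently of all others."""
    return Counter(TOKEN_RE.findall(text.lower()))


def reduce_counts(left, right):
    """Reduce step: merge two partial term counts into one."""
    return left + right


def term_frequencies(documents):
    """Apply the map step to every document, then fold the partial counts."""
    partial_counts = map(map_document, documents)
    return reduce(reduce_counts, partial_counts, Counter())


# Made-up sample input, for illustration only.
docs = [
    "Hadoop processes web text in parallel",
    "MPI also processes text in parallel",
]
totals = term_frequencies(docs)
```

Because each `map_document` call depends only on its own document, a framework such as Hadoop or an MPI program could in principle run those calls on different machines and merge the partial counts afterwards. Here they simply run one after another.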
Original link: http://www.cqvip.com//QK/70429X/201303/47532688.html
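Give a rose and the fragrance lingers on your hand. If you have already downloaded this resource, you can upload it in a reply to share it with everyone. You are welcome to join the CDA community to discuss and learn. (For academic exchange only.)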