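Research on and Performance Evaluation of Topic-Based Public Opinion Tracking Methods
Cite this article: Yao Changqing, Du Yongping. Research on and Performance Evaluation of Topic-Based Public Opinion Tracking Methods [J]. Library and Information Service, 2012, 56(18): 50-53, 109.
Authors: Yao Changqing  Du Yongping
Affiliations: 1. Institute of Scientific and Technical Information of China, Beijing 100038; 2. College of Computer Science, Beijing University of Technology, Beijing 100124
Funding: One of the research outputs of the National Natural Science Foundation of China Young Scientists Fund project "Research on Information Extraction Techniques in Question-Answering Information Retrieval" (grant no. 60803086) and the Beijing Natural Science Foundation project "Research on Semantic Entailment Inference and Its Application in Question-Answering Information Retrieval" (grant no. 4123091)
Abstract: Public opinion tracking follows hot topics in the media information stream in real time and has been a research focus in natural language processing in recent years. The core technique for the task is text classification: information gain and mutual information are used to compute feature-term weights and to extract effective features for document representation in the vector space model. The Rocchio, K-Nearest Neighbor (KNN), and Bayes methods are each applied to track public opinion on events of a given topic. The best F-Measure on the test set reaches 86.2%. Public opinion tracking has broad application prospects in fields such as information security, and it gives users an effective basis for judging, in a timely manner, how trending online events are developing.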

Keywords: public opinion tracking  text classification  natural language processing
Received: 2012-03-02
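
The abstract names information gain, mutual information, and the F-Measure without giving formulas. For orientation, the standard textbook definitions are shown below; whether the authors used exactly these forms (for example, the balanced F1 variant of the F-Measure) is an assumption. Here $t$ is a term, $\bar{t}$ its absence, $c_i$ a class (such as on-topic or off-topic), $P$ precision, and $R$ recall.

```latex
\begin{align}
IG(t) &= -\sum_{i} P(c_i)\log P(c_i)
         + P(t)\sum_{i} P(c_i \mid t)\log P(c_i \mid t)
         + P(\bar{t})\sum_{i} P(c_i \mid \bar{t})\log P(c_i \mid \bar{t}) \\
MI(t, c_i) &= \log \frac{P(t, c_i)}{P(t)\,P(c_i)} \\
F &= \frac{2PR}{P + R}
\end{align}
```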

Research and Performance Evaluation on the Theme Based Method for the Public Opinion Tracking
Yao Changqing,Du Yongping.Research and Performance Evaluation on the Theme Based Method for the Public Opinion Tracking[J].Library and Information Service,2012,56(18):50-53,109.
Authors:Yao Changqing  Du Yongping
Institution:1. Institute of Scientific and Technical Information of China, Beijing 100038;2. Institute of Computer Science, Beijing University of Technology, Beijing 100124
Abstract:Public opinion tracking follows the progress of a designated hot topic in the media information stream, and it has become a research hotspot in natural language processing in recent years. The key technique for the task is text classification. The authors use information gain and mutual information for feature selection within the vector space model: both measures are used to calculate feature weights, and the features with the highest weights are extracted. The Rocchio, KNN, and Bayes approaches are then applied to track public opinion on events of a given topic. Finally, the authors present a statistical analysis of the results and report a best F-Measure of 86.2% on the test set. Public opinion tracking has broad application prospects in areas such as information security, and it provides effective guidance for judging how trending online events will develop.
Keywords:public opinion tracking  text classification  natural language processing  
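
The pipeline outlined in the abstract (weight terms, keep the highest-weighted features, classify documents as on-topic or off-topic, score with the F-Measure) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes binary term presence for information gain, raw term counts as vector-space weights, a nearest-centroid form of Rocchio with cosine similarity, and the balanced F1 variant of the F-Measure. All function names and the toy documents are hypothetical.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in counts if c)

def information_gain(term, docs, labels):
    """Information gain of the binary feature 'term occurs in the document'."""
    classes = sorted(set(labels))
    with_t = [y for d, y in zip(docs, labels) if term in d]
    without_t = [y for d, y in zip(docs, labels) if term not in d]

    def h(ys):
        return entropy([ys.count(c) for c in classes])

    n = len(labels)
    return h(labels) - len(with_t) / n * h(with_t) - len(without_t) / n * h(without_t)

def select_features(docs, labels, k):
    """Keep the k terms with the highest information gain."""
    vocab = sorted({t for d in docs for t in d})
    ranked = sorted(vocab, key=lambda t: information_gain(t, docs, labels), reverse=True)
    return ranked[:k]

def vectorize(doc, features):
    """Represent a document as term counts over the selected features."""
    counts = Counter(doc)
    return [counts[t] for t in features]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def train_rocchio(docs, labels, features):
    """Compute one centroid (mean vector) per class."""
    centroids = {}
    for c in set(labels):
        rows = [vectorize(d, features) for d, y in zip(docs, labels) if y == c]
        centroids[c] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(doc, centroids, features):
    """Assign the class whose centroid is most similar to the document."""
    v = vectorize(doc, features)
    return max(sorted(centroids), key=lambda c: cosine(v, centroids[c]))

def f_measure(gold, pred, positive=1):
    """Balanced F-Measure (F1) for the positive, on-topic class."""
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy, made-up data: label 1 = on-topic (an earthquake event), 0 = off-topic.
train_docs = [
    ["earthquake", "rescue", "victims"],
    ["earthquake", "aftershock", "rescue"],
    ["football", "match", "goal"],
    ["stock", "market", "match"],
]
train_labels = [1, 1, 0, 0]
features = select_features(train_docs, train_labels, k=3)
centroids = train_rocchio(train_docs, train_labels, features)

test_docs = [["rescue", "teams", "earthquake"], ["match", "goal"]]
test_labels = [1, 0]
predicted = [predict(d, centroids, features) for d in test_docs]
print(features, predicted, f_measure(test_labels, predicted))
```

Swapping `train_rocchio` and `predict` for a KNN or naive Bayes classifier would leave the feature selection and scoring steps unchanged, which is what makes a side-by-side comparison of the three methods straightforward.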
This article is indexed in CNKI and other databases.