Research on Hot Topic Discovery in Microblogs Based on the SSDKmeans Algorithm
Cite this article: LI Hai-ming (李海明). Research on Hot Topic Discovery in Microblogs Based on the SSDKmeans Algorithm[J]. 教育技术导刊 (Introduction of Educational Technology), 2019, 18(9): 173-175.
Author: LI Hai-ming (李海明)
Institution: College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
Abstract: To extract hot topics and events from massive volumes of microblog posts quickly and effectively, this paper proposes SSDKmeans, a clustering algorithm based on frequent sets. The algorithm counts the approximate frequencies of segmented words in bounded space, builds a text vector space model on top of those counts, and extracts topic keywords from each topic cluster produced by the clustering step. Validation on 20,000 microblog posts shows that topic discovery based on SSDKmeans achieves high recall and precision, 91.3% and 92.1% respectively. The algorithm can effectively improve the discovery rate of hot microblog topics, which in turn supports timely tracking of trending social topics and public opinion.
Keywords: topic discovery; text clustering; microblog short text; frequent sets
Received: 2019-07-14
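
The abstract names the stages of the method (approximate frequency counting in bounded space, a vector space model, clustering, keyword extraction per cluster) but does not give their details. The following is only a minimal sketch of such a pipeline under stated assumptions, not the paper's SSDKmeans: the Space-Saving counter is used here as one well-known way to approximate frequencies with a fixed number of counters, and plain k-means with a deterministic farthest-first initialisation stands in for the clustering step. All function names, the `capacity` and `min_count` parameters, and the sample posts are hypothetical, and the posts are assumed to be word-segmented already.

```python
import math
from collections import Counter


def space_saving(tokens, capacity):
    """Approximate term counts with at most `capacity` counters (Space-Saving)."""
    counts = {}
    for tok in tokens:
        if tok in counts:
            counts[tok] += 1
        elif len(counts) < capacity:
            counts[tok] = 1
        else:
            # Evict the smallest counter; the newcomer inherits its count plus one,
            # so stored counts never underestimate the true frequency.
            victim = min(counts, key=counts.get)
            counts[tok] = counts.pop(victim) + 1
    return counts


def vectorize(docs, vocab):
    """Map each segmented post to an L2-normalised term-frequency vector."""
    index = {term: i for i, term in enumerate(vocab)}
    vectors = []
    for doc in docs:
        vec = [0.0] * len(vocab)
        for tok in doc:
            if tok in index:
                vec[index[tok]] += 1.0
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vectors


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain k-means; centroids are seeded farthest-first so runs are repeatable."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_keywords(docs, labels, vocab, top_n=2):
    """Return the most frequent vocabulary terms inside each cluster."""
    vocab_set = set(vocab)
    per_cluster = {}
    for doc, lab in zip(docs, labels):
        per_cluster.setdefault(lab, Counter()).update(t for t in doc if t in vocab_set)
    return {lab: [t for t, _ in c.most_common(top_n)] for lab, c in per_cluster.items()}


# Hypothetical, already segmented posts covering two topics.
posts = [
    ["earthquake", "rescue", "sichuan"],
    ["earthquake", "rescue", "team"],
    ["earthquake", "sichuan", "relief"],
    ["football", "final", "goal"],
    ["football", "goal", "win"],
    ["football", "final", "win"],
]
min_count = 2  # illustrative threshold for treating a term as frequent
counts = space_saving((t for p in posts for t in p), capacity=20)
vocab = sorted(t for t, n in counts.items() if n >= min_count)
labels = kmeans(vectorize(posts, vocab), k=2)
print(labels)
print(cluster_keywords(posts, labels, vocab))
```

Restricting the vocabulary to terms that pass the frequency threshold keeps the vectors short, which matters for short microblog texts where most words occur only once. When `capacity` is smaller than the number of distinct terms, the counts become overestimates, so the threshold would need to be chosen with that error in mind.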