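An Analysis of the Application of the Bayes Classification Algorithm in Social Networking Site Information Filtering
Cite this article:Li Zhiyi, Shen Zhirui, Yi Meilian. An Analysis of the Application of the Bayes Classification Algorithm in Social Networking Site Information Filtering[J]. Library and Information Service, 2014, 58(13): 100-106.
Authors:Li Zhiyi  Shen Zhirui  Yi Meilian
Institution:School of Economics and Management, South China Normal University
Funding:This paper is one of the research outcomes of the Guangdong Province Philosophy and Social Sciences Fund project "Research on User Behavior Analysis and Website Information Organization Optimization Based on Web Logs" (Project No. GD11CTS02).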
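Abstract:Classifying documents and identifying spam is a research area of great practical value, and more and more websites are paying attention to this technology. The paper uses an intelligent algorithm to analyze spam effectively and to find the spammers: based on web logs and published content, it determines which users are advertisers and spam publishers, and deletes them. It treats spam screening as a process of dividing information into useful and useless information, and tries the Bayes classification algorithm as a way of sorting information into different classes. To address the shortcomings of rule-based classification and of methods that remove spam by analyzing advertising link URLs, it presents a Bayes classification algorithm and a machine training method. Judging from the experimental results, this method outperforms rule-based classification.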

Keywords:Bayes classification  social networking sites  information filtering
Received:2014-04-14
Revised:2014-05-28

Analysis and Application of Bayes Classification Algorithm in the Social Networking Site Information Filtering
Li Zhiyi,Shen Zhirui,Yi Meilian.Analysis and Application of Bayes Classification Algorithm in the Social Networking Site Information Filtering[J].Library and Information Service,2014,58(13):100-106.
Authors:Li Zhiyi  Shen Zhirui  Yi Meilian
Institution:Economic & Management College, South China Normal University, Guangzhou 510006
Abstract:Classifying documents and identifying spam is a research area of considerable practical value, and more and more websites are paying attention to this technology. This paper uses an intelligent algorithm to analyze spam effectively and to find the spammers: using web logs and published content, it determines which users are advertisers and spam publishers, and deletes them. Because screening for spam is in essence a process of dividing information into useful and useless information, the paper attempts to use the Bayes classification algorithm to sort information into different classes. To address the shortcomings of rule-based classification and of methods that weed out spam by analyzing advertising link URLs, the paper presents a Bayes classification algorithm and a machine training method. The experimental results show that this method outperforms rule-based classification.
Keywords:Bayes classification  social networking sites  information filtering  
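
The abstract describes splitting messages into useful and useless classes with a Bayes classifier trained on labeled examples. As a rough illustration of that general idea only, the following is a minimal multinomial naive Bayes sketch with Laplace smoothing. It is not the authors' implementation: the class name `NaiveBayesFilter`, the whitespace tokenizer, the smoothing constant `alpha`, and the four toy training messages are all assumptions made for this example. Real social-network posts in Chinese would also need a word segmenter in place of `tokenize`.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace-delimited, case-insensitive tokens.
    return text.lower().split()


class NaiveBayesFilter:
    """Hypothetical multinomial naive Bayes classifier with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha          # smoothing constant (assumed value)
        self.word_counts = {}       # label -> Counter of token frequencies
        self.doc_counts = Counter() # label -> number of training messages
        self.vocab = set()          # all tokens seen during training

    def train(self, text, label):
        tokens = tokenize(text)
        self.word_counts.setdefault(label, Counter()).update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def log_scores(self, text):
        # Returns log P(label) + sum of log P(token | label) for each label.
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        tokens = tokenize(text)
        scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for tok in tokens:
                # Laplace smoothing keeps unseen tokens from zeroing the product.
                score += math.log(
                    (counts[tok] + self.alpha)
                    / (total_words + self.alpha * vocab_size)
                )
            scores[label] = score
        return scores

    def classify(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy training data, invented for illustration only.
nb = NaiveBayesFilter()
nb.train("cheap pills buy now", "spam")
nb.train("buy cheap watches click link", "spam")
nb.train("meeting notes for tomorrow", "ham")
nb.train("lunch with friends tomorrow", "ham")

print(nb.classify("buy cheap pills"))
print(nb.classify("lunch tomorrow"))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the smoothing term is one common way to handle words that never appeared in a class during training.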