
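Predicting Fake Reviewers Based on Their Characteristic Features
Cite this article: Nie Hui, Wu Yijun. Predicting Fake Reviewers Based on Their Characteristic Features [J]. 图书情报工作 (Library and Information Service), 2015, 59(10): 102-109.
Authors: Nie Hui (聂卉), Wu Yijun (吴毅骏)
Affiliation: School of Information Management, Sun Yat-Sen University, Guangzhou 510275
Funding: This paper is one of the research outcomes of the 2013 project of the Guangdong Province Philosophy and Social Sciences "Twelfth Five-Year" Plan, "Research on Knowledge Recommendation Mechanisms Based on Context and User Perception" (project No. CD13CTS01).
Abstract: [Purpose/Significance] This paper focuses on predicting fake reviewers from their characteristic features. It aims to reveal the traits and behavioral patterns of the "Internet water army" in a real online environment and to build a concise, clear, and interpretable model for predicting reviewer identity, laying a foundation for deeper review-mining research. [Method/Process] Combining empirical analysis with machine learning techniques, the study explores the internal evaluation mechanism of the target website, 大众点评网 (Dianping), uses factor analysis to extract reviewer attribute and behavioral features, and on that basis builds a prediction model based on logistic regression. [Result/Conclusion] On the target website, the model achieves a classification accuracy of 73.8% in predicting fake reviewers, with an AUC of 80.9%. Reviewers' contribution, activity, and writing literacy are verified to have a statistically significant relationship with their identity, whereas reviewers' level, emotion, and rating deviation have no significant effect on identity prediction. The experimental conclusions are largely consistent with the empirical analysis, and the model can be reasonably interpreted.

Keywords: fake reviewers; fake reviews; reviewer characteristics
Received: 2015-03-13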

Study on Spammer Detection Based on Reviewer-Specific Characteristics
Nie Hui, Wu Yijun. Study on Spammer Detection Based on Reviewer-Specific Characteristics [J]. Library and Information Service, 2015, 59(10): 102-109.
Authors:Nie Hui  Wu Yijun
Institution:School of Information Management, Sun Yat-Sen University, Guangzhou 510275
Abstract: [Purpose/significance] This paper studies review spammer detection based on reviewer-specific characteristics. It aims to reveal the characteristics and behavioral regularities of the "Water Army" on the web and to build a simple, interpretable prediction model for identifying review spammers, laying a foundation for deeper review mining research. [Method/process] By integrating empirical analysis and machine learning (ML) techniques, it explores the internal evaluation strategies of the target website. Factor analysis is employed to extract attribute and behavior-specific factors of reviewers, on the basis of which a logistic regression model is built to identify review spammers. [Result/conclusion] On the dataset built from the target website, the classification accuracy for spammer identification reaches 73.8%, and the AUC is 80.9%. Additionally, three reviewer-related factors (contribution, activity, and word-specific literacy) prove to be significantly associated with reviewer identity, while reviewers' level, emotion, and rating deviation do not. The ML-based results are largely consistent with the empirical analysis, and the prediction model can be reasonably explained.
Keywords:review spammer  review spam  behavior characteristics  
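
Illustrative sketch (not from the paper): the abstract describes fitting a logistic regression on reviewer factor scores and reporting accuracy and AUC. The Python below shows what that evaluation step might look like in miniature. Everything in it is an assumption made for illustration: the data are synthetic, the factor-analysis step is skipped (the six columns stand in for already-extracted factor scores), the coefficients in `make_synthetic` are invented, and plain gradient descent is used in place of whatever fitting procedure the authors applied.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def accuracy(y, p, threshold=0.5):
    # Fraction of samples whose thresholded score matches the label.
    return sum((pi >= threshold) == bool(yi) for yi, pi in zip(y, p)) / len(y)


def auc(y, p):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if a > c else 0.5 if a == c else 0.0 for a in pos for c in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic(n, rng):
    # HYPOTHETICAL data generator. Column order is assumed to be:
    # contribution, activity, literacy, level, emotion, rating deviation.
    # The coefficients are invented; only the first three carry signal.
    true_w = [1.5, -1.0, 0.8, 0.0, 0.0, 0.0]
    X, y = [], []
    for _ in range(n):
        xi = [rng.gauss(0, 1) for _ in true_w]
        p = sigmoid(sum(wj * xj for wj, xj in zip(true_w, xi)))
        X.append(xi)
        y.append(1 if rng.random() < p else 0)
    return X, y


rng = random.Random(0)
X_train, y_train = make_synthetic(400, rng)
X_test, y_test = make_synthetic(200, rng)

w, b = fit_logistic(X_train, y_train)
p_test = predict_proba(X_test, w, b)

print("weights:", [round(wj, 2) for wj in w])
print("accuracy:", round(accuracy(y_test, p_test), 3))
print("AUC:", round(auc(y_test, p_test), 3))
```

Because the generator is synthetic, the printed figures say nothing about the paper's reported 73.8% accuracy or 80.9% AUC; the sketch only shows how such metrics are computed on a held-out split and how fitted weights can be read for interpretability.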