
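Paper Discrimination Based on Word Co-occurrence Network and Support Vector Machine
Cite this article: Sun Wenjun, Du Juan. Paper Discrimination Based on Word Co-occurrence Network and Support Vector Machine [J]. 现代情报, 2010, 30(7): 87-92.
Authors: Sun Wenjun  Du Juan
Affiliation: School of Economics and Management, Harbin Institute of Technology, Harbin, Heilongjiang 150001
Abstract: Words do not interact in sentences at random but according to certain rules, and these rules can be studied through language networks. The word co-occurrence network is one manifestation of the human language network; it uses the adjacency of words in a sentence to define a connection. This paper applies language network analysis to discriminate papers: each paper is represented as a word co-occurrence network, the characteristic parameters of the network are computed and output as a vector that characterizes the paper, and a support vector machine then classifies the papers. The results show that the method discriminates well between high-quality papers and papers produced by text generators, and that it also discriminates fairly markedly between papers from very different fields.

Keywords: word co-occurrence network  paper discrimination  language network analysis  small-world network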
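
The first two steps of the method (representing a paper as a word co-occurrence network, then reducing that network to a vector of parameters) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not list which network parameters were used, so the three chosen here (average degree, average clustering coefficient, average shortest path length) are an assumption suggested by the "small-world network" keyword. The function names and the simple regex tokenizer, which only handles lowercase English words, are likewise hypothetical.

```python
import re
from collections import deque


def build_cooccurrence_network(text):
    """Link two words whenever they are adjacent within the same sentence."""
    graph = {}
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = re.findall(r"[a-z']+", sentence)
        for word in words:
            graph.setdefault(word, set())  # every word becomes a node
        for left, right in zip(words, words[1:]):
            if left != right:  # ignore self-loops such as "very very"
                graph[left].add(right)
                graph[right].add(left)
    return graph


def average_degree(graph):
    return sum(len(nbrs) for nbrs in graph.values()) / len(graph)


def average_clustering(graph):
    """Mean fraction of each node's neighbour pairs that are themselves linked."""
    total = 0.0
    for nbrs in graph.values():
        k = len(nbrs)
        if k < 2:
            continue  # clustering is taken as 0 for nodes with fewer than 2 neighbours
        nbr_list = list(nbrs)
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nbr_list[j] in graph[nbr_list[i]]
        )
        total += 2.0 * links / (k * (k - 1))
    return total / len(graph)


def average_path_length(graph):
    """Mean shortest-path length over all connected ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def feature_vector(text):
    """Assumed feature set: [average degree, clustering, path length]."""
    graph = build_cooccurrence_network(text)
    if not graph:
        return [0.0, 0.0, 0.0]
    return [
        average_degree(graph),
        average_clustering(graph),
        average_path_length(graph),
    ]
```

For example, `feature_vector("a b c a.")` builds a triangle over the words `a`, `b`, and `c` and returns `[2.0, 1.0, 1.0]`. A real corpus would need a proper tokenizer and probably further parameters.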

Paper Discrimination Based on Word Co-occurrence Network and Support Vector Machine
Authors:Sun Wenjun  Du Juan
Institution: School of Management, Harbin Institute of Technology, Harbin 150001, China
Abstract: Words in human language do not interact in sentences at random; they follow subtle rules that can be described as a network of word interactions. A word co-occurrence network is one form of the complex network of human language: it uses the co-occurrence of words in a sentence to define connections. This paper discriminates papers using language network analysis: each paper is represented as a word co-occurrence network, the parameters of the network are calculated and output as a vector, and support vector machines are then applied to discriminate the papers. The experimental results show that a classifier built this way performs well at separating high-quality papers from inauthentic papers produced by text generators, and that it also discriminates markedly between papers from different fields.
Keywords:language network analysis  word co-occurrence network  paper discrimination  small-world network
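
The final step, classifying the network feature vectors with a support vector machine, might look roughly like the sketch below. The abstract does not state which SVM implementation, kernel, or settings the authors used, so this swaps in a plain linear SVM trained by subgradient descent on the regularized hinge loss, written in pure Python to stay self-contained. The names `train_linear_svm` and `predict`, the hyperparameter defaults, and the toy vectors are all invented for illustration; the toy values are assumed to be already standardized and are not data from the paper.

```python
def train_linear_svm(samples, labels, learning_rate=0.1, reg=0.01, epochs=100):
    """Fit a linear SVM by subgradient descent on the regularized hinge loss.

    Each label must be +1 or -1.
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(w * v for w, v in zip(weights, x)) + bias
            # The L2 penalty shrinks the weights on every step.
            weights = [w * (1.0 - learning_rate * reg) for w in weights]
            if y * score < 1.0:
                # Margin violated: push the boundary away from this sample.
                weights = [w + learning_rate * y * v for w, v in zip(weights, x)]
                bias += learning_rate * y
    return weights, bias


def predict(weights, bias, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if score >= 0.0 else -1


# Toy, made-up feature vectors (assumed standardized); +1 and -1 stand for
# the two classes of papers to be told apart.
toy_samples = [
    [1.0, 1.0], [2.0, 1.5], [1.5, 2.0],
    [-1.0, -1.0], [-2.0, -1.5], [-1.5, -2.0],
]
toy_labels = [1, 1, 1, -1, -1, -1]

weights, bias = train_linear_svm(toy_samples, toy_labels)
print(predict(weights, bias, [1.2, 1.1]))
print(predict(weights, bias, [-1.2, -1.1]))
```

In practice a library SVM with a tuned kernel would be the more usual choice; the hand-written trainer is used here only so that the update rule stays visible.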
This article is indexed in databases including 维普 and 万方数据.