
Experimental comparison of K-Means text clustering by varied distance calculation methods
Original title: K-Means聚类的多种距离计算方法的文本实验比较
Cite this article: Lin Bin. Experimental comparison of K-Means text clustering by varied distance calculation methods[J]. Journal of Fujian University of Technology (福建工程学院学报), 2016, 0(1): 80-85.
Author: Lin Bin (林滨)
Institution: Department of Computer Science, Fuzhou Software Technology Vocational College (福州软件职业技术学院计算机系)
Abstract: This paper studies the classification of text data. Text files were sampled and weighted using the vector space model (VSM) and TF-IDF weighting, yielding a text similarity matrix. Clustering experiments were then run on the data using the K-Means algorithm with several different sample distance calculation methods, and the clustering results were analysed and summarised. Based on the experimental findings, the differences among the distance calculation methods and the data types each is suited to were examined.

Keywords: text clustering; TF-IDF; K-Means; distance calculation
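
The pipeline summarised in the abstract (TF-IDF weighting in a vector space model, then K-Means under different distance functions) can be sketched roughly as follows. This is a minimal illustration, not the paper's code: the abstract does not say which distance measures were compared, so Euclidean, Manhattan and cosine distance are assumed here, and the function names, whitespace tokenisation, toy documents and first-k centroid seeding are all hypothetical choices. Note also that the mean-based centroid update is only the exact minimiser for squared Euclidean distance; for the other two measures it is a common heuristic.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocab, vectors): one TF-IDF vector per non-empty document."""
    tokenized = [doc.lower().split() for doc in docs]  # assumed: whitespace tokens
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(docs)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Relative term frequency times log inverse document frequency.
        vectors.append([(counts[t] / len(tokens)) * math.log(n_docs / df[t])
                        for t in vocab])
    return vocab, vectors

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0  # zero vectors count as maximally distant

def kmeans(vectors, k, distance, max_iter=100):
    """Lloyd-style K-Means with a pluggable distance function.

    The first k vectors seed the centroids so the run is deterministic
    (an assumption made for this sketch; random seeding is more usual).
    """
    centroids = [list(v) for v in vectors[:k]]
    labels = [-1] * len(vectors)
    for _ in range(max_iter):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: distance(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:  # converged: no assignment changed
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy corpus (hypothetical): two pet-related and two finance-related documents.
docs = ["cat dog pet animal", "stock market price trade",
        "dog cat pet", "market stock trade"]
vocab, vectors = tfidf_vectors(docs)
for name, dist in [("euclidean", euclidean), ("manhattan", manhattan),
                   ("cosine", cosine_distance)]:
    print(name, kmeans(vectors, 2, dist))
```

On a corpus this small and cleanly separated, all three distance functions yield the same partition; differences of the kind the paper studies would only show up on larger, noisier text collections.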
This article is indexed in CNKI, Wanfang Data, and other databases.