DKTC: A Method of Chinese Text Clustering

Cite this article:	Zhang Yijun, Liu Quanfeng. DKTC: A Method of Chinese Text Clustering[J]. Library and Information Service, 2009, 53(1): 109-109.

Authors:	Zhang Yijun, Liu Quanfeng

Affiliation:	Library, Zhejiang Water Conservancy and Hydropower College

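Abstract:	Building on an analysis of two classic clustering algorithms, DBSCAN and K-means, and taking into account the characteristics of Chinese text data, this article combines and improves the two methods to propose a Chinese text clustering method, DKTC. The algorithm determines the number of clusters automatically, is insensitive to "noise" or outlier data, and is insensitive to the order in which the data are input. In addition, it is more efficient than DBSCAN. Experiments show that DKTC can cluster Chinese text and that its clustering quality improves to some degree on both traditional DBSCAN and K-means.
Keywords:	text clustering; clustering algorithm; Chinese information processing
Received:	2008-05-15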
DKTC: A Method of Chinese Text Clustering

Zhang Yijun, Liu Quanfeng. DKTC: A Method of Chinese Text Clustering[J]. Library and Information Service, 2009, 53(1): 109-109.

Authors:	Zhang Yijun, Liu Quanfeng

Abstract:	Based on an analysis of two classic clustering algorithms, DBSCAN and K-means, and on the characteristics of Chinese text data, this paper combines and improves the two methods to propose a Chinese text clustering algorithm, DKTC. DKTC determines the number of clusters automatically and is insensitive to noise, to abnormal data, and to the order of the input data. In addition, DKTC is more efficient than DBSCAN. Experiments show that DKTC can cluster Chinese text and that its clustering results improve to some degree on those of traditional DBSCAN and K-means.

Keywords:	text clustering; clustering algorithm
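
The abstract does not give the steps of DKTC, so the following is only a rough sketch of one generic way a density-based pass and a K-means-style pass could be combined; it is not the authors' algorithm. It assumes character bigrams as features (so that no Chinese word segmenter is needed), cosine distance, a plain DBSCAN pass that fixes the number of clusters and marks noise, and a centroid refinement seeded by those clusters. All function names, the `eps` and `min_pts` values, and the sample texts are hypothetical.

```python
import math
from collections import Counter

def bigrams(text):
    """Character-bigram counts (assumption: avoids needing a word segmenter)."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))

def cosine_distance(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return 1.0 - dot / norm if norm else 1.0

def dbscan(vectors, eps, min_pts):
    """Plain DBSCAN; returns one label per vector, with -1 meaning noise."""
    n = len(vectors)
    labels = [None] * n

    def neighbors(i):
        return [j for j in range(n)
                if cosine_distance(vectors[i], vectors[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1          # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:
                queue.extend(j_nbrs)  # j is a core point, keep expanding
    return labels

def centroid(vectors):
    """Sum of the vectors; under cosine distance the scale does not matter."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total

def refine(vectors, labels, iterations=5):
    """K-means-style reassignment seeded by the DBSCAN clusters; noise stays noise."""
    k = max(labels, default=-1) + 1
    labels = list(labels)
    for _ in range(iterations):
        cents = [centroid([v for v, l in zip(vectors, labels) if l == c])
                 for c in range(k)]
        new = [l if l == -1
               else min(range(k), key=lambda c: cosine_distance(v, cents[c]))
               for v, l in zip(vectors, labels)]
        if new == labels:
            break
        labels = new
    return labels

def cluster_texts(texts, eps, min_pts):
    vectors = [bigrams(t) for t in texts]
    return refine(vectors, dbscan(vectors, eps, min_pts))

docs = [
    "文本聚类算法研究", "中文文本聚类算法", "文本聚类算法改进",
    "图书馆借阅服务", "图书馆借阅管理", "高校图书馆借阅",
    "今天天气晴朗",
]
print(cluster_texts(docs, eps=0.5, min_pts=2))  # [0, 0, 0, 1, 1, 1, -1]
```

In this toy run the number of clusters (two) is not supplied by the caller, and the unrelated last text is left as noise, which mirrors the properties the abstract claims without saying how DKTC itself achieves them.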
This article is indexed in Wanfang Data and other databases.