An Advanced Algorithm for K-Means Document Clustering

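Cite as:	WU Jing-lan, LIU Yan, ZHU Wen-xing. An Advanced Algorithm for K-Means Document Clustering[J]. Journal of Minjiang University, 2004, 25(2): 48-52.

Authors:	WU Jing-lan, LIU Yan, ZHU Wen-xing

Affiliations:	1. Department of Computer Science, Minjiang University, Fuzhou, Fujian 350108; 2. Department of Computer Science and Technology, Fuzhou University, Fuzhou, Fujian 350002

Funding:	Natural Science Foundation of Fujian Province [A0310013]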

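Abstract:	The k-means algorithm is a widely used local search algorithm. Its main drawback is that it easily gets trapped in a local minimum, and that local minimum often deviates considerably from the global optimum. This paper proposes an iterated local search document clustering algorithm based on K-means. The algorithm takes the solution obtained by k-means as its initial solution and performs a local search starting from it, accepting some inferior solutions during the search. When the solution can no longer be improved, the algorithm applies a perturbation of appropriate strength to the local minimum it has reached and then proceeds to the next iteration, so as to escape the local minimum and broaden the range of the search. Experimental results show that the algorithm achieves a clustering accuracy above 99% on document data sets.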
Keywords:	K-means; iterated local search; document clustering algorithm; local minimum; global optimum; database
Article ID:	1009-7821(2004)02-048-05
Revised:	February 1, 2004
An Advanced Algorithm for K-Means Document Clustering

WU Jing-lan, LIU Yan, ZHU Wen-xing. An Advanced Algorithm for K-Means Document Clustering[J]. Journal of Minjiang University, 2004, 25(2): 48-52.

Authors:	WU Jing-lan, LIU Yan, ZHU Wen-xing

Abstract:	The K-means clustering algorithm is one of the most common local search approaches to clustering problems. Its main drawback is that it often gets trapped in local optima that are significantly worse than the global optimum. This paper presents an Iterated Local Search document clustering algorithm based on K-means. It takes the solution found by K-means as its initial solution and starts a local search from it; during the search, some worse solutions are accepted. When a solution can no longer be improved, the algorithm applies a perturbation of appropriate strength to the local minimum before the next iteration, in order to escape the local minimum and thereby widen the part of the search space it explores. Experimental results indicate that the proposed algorithm achieves over 99% correctness on document clustering data sets.

Keywords:	K-means Algorithm; Document Clustering; Iterated Local Search
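
The abstract describes the method only in outline, so the following is a minimal sketch of an iterated local search wrapped around K-means, not the authors' implementation. Several choices are assumptions made for illustration: K-means itself serves as the local search, the perturbation adds Gaussian noise to the cluster centers, a worse solution is accepted with a fixed probability, and documents are plain dense vectors (in practice they would typically be term-weight vectors). All function and parameter names (`perturb`, `strength`, `accept_worse`, and so on) are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centers, max_iter=100):
    """Lloyd's K-means from the given centers; returns (centers, labels, sse)."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: local minimum reached
            break
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    sse = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    return centers, labels, sse


def perturb(centers, strength, rng):
    """Assumed perturbation: shift every center coordinate by Gaussian noise."""
    return [[x + rng.gauss(0.0, strength) for x in c] for c in centers]


def iterated_local_search(points, k, n_iter=30, strength=1.0,
                          accept_worse=0.1, seed=0):
    """Perturb the current local minimum, re-run K-means, and keep the best."""
    rng = random.Random(seed)
    current = kmeans(points, rng.sample(points, k))  # initial K-means solution
    best = current
    for _ in range(n_iter):
        candidate = kmeans(points, perturb(current[0], strength, rng))
        # Assumed acceptance rule: take improvements always, and worse
        # solutions occasionally, so the search can leave a poor basin.
        if candidate[2] < current[2] or rng.random() < accept_worse:
            current = candidate
        if current[2] < best[2]:
            best = current
    return best


if __name__ == "__main__":
    rng = random.Random(1)
    # Toy data: three well-separated 2-D groups standing in for documents.
    docs = [[cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3)]
            for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(20)]
    centers, labels, sse = iterated_local_search(docs, k=3)
    print("final SSE:", sse)
```

Because the best solution seen so far is tracked separately from the current one, accepting an occasional worse solution can never make the returned result worse than the initial K-means run. The `strength` parameter plays the role of the "appropriate strength" of perturbation mentioned in the abstract: too small and the search falls back into the same local minimum, too large and it degenerates into random restarts.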
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.