Efficient implementation of associative classifiers for document classification
Authors: Yongwook Yoon, Gary Geunbae Lee
Institution:Department of Computer Science and Engineering, Pohang University of Science and Technology, San 31, Hyoja-Dong, Pohang 790-784, Republic of Korea
Abstract: In practical text classification tasks, the ability to interpret the classification result is as important as the ability to classify accurately. Associative classifiers have many favorable characteristics, such as rapid training, good classification accuracy, and excellent interpretability. However, they also face obstacles when applied to text classification. The target text collection generally has a very high dimension, so the training process may take a very long time. We propose a feature selection method based on the mutual information between the word and class variables to reduce the dimension of the feature space used by the associative classifier. In addition, the training process of the associative classifier produces a huge number of classification rules, which makes prediction for a new document inefficient. We resolve this by introducing a new, efficient method for storing and pruning classification rules. This method can also be used when predicting the class of a test document. Experimental results on the 20-newsgroups dataset show many benefits of associative classification in both training and prediction when applied to a real-world problem.
Keywords: Text classification; Associative classifier; Feature selection; Rule pruning; Subset expansion
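
The abstract mentions feature selection based on the mutual information between the word and class variables but does not give the formula. The following is a minimal sketch of one common formulation, assuming each word is treated as a binary present/absent variable per document and the k highest-scoring words are kept. The function names, the toy corpus, and the top-k cutoff are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def mutual_information(docs, labels):
    """Return {word: I(W; C)} in bits.

    Assumption: W is the binary presence/absence of a word in a document
    and C is the document's class label. The paper may use a different
    variant of mutual information.
    """
    n = len(docs)
    class_count = Counter(labels)   # class -> number of documents
    word_count = Counter()          # word -> number of documents containing it
    joint = Counter()               # (word, class) -> documents of that class containing the word
    for tokens, label in zip(docs, labels):
        for w in set(tokens):
            word_count[w] += 1
            joint[(w, label)] += 1

    scores = {}
    for w, n_w in word_count.items():
        mi = 0.0
        for c, n_c in class_count.items():
            n_wc = joint[(w, c)]
            # Two cells per class: word present, word absent.
            for n_cell, n_row in ((n_wc, n_w), (n_c - n_wc, n - n_w)):
                if n_cell > 0:  # empty cells contribute 0 to the sum
                    mi += (n_cell / n) * math.log2(n_cell * n / (n_row * n_c))
        scores[w] = mi
    return scores


def select_features(docs, labels, k):
    """Keep the k words with the highest score (ties broken alphabetically)."""
    scores = mutual_information(docs, labels)
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


# Hypothetical toy corpus: each document is a list of tokens.
docs = [["ball", "game"], ["ball", "team"], ["cpu", "disk"], ["cpu", "game"]]
labels = ["sport", "sport", "comp", "comp"]
scores = mutual_information(docs, labels)
top_words = select_features(docs, labels, 2)
```

In this toy corpus, a word that occurs equally often in both classes ("game") scores zero, while words that occur in only one class score highest. Restricting rule mining to the selected words is what shrinks the search space.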
This article is indexed in ScienceDirect and other databases.