
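Imbalanced fuzzy multiclass support vector machine based on class-overlap degree undersampling
Authors: WU Yuanyuan, SHEN Liyong
Affiliation: School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049
Funding: Supported by the Open Project of the Hubei Collaborative Innovation Center (JD20150402)
Abstract: Traditional undersampling methods tend to lose important sample information, and their experimental results are unstable. To address these problems, an imbalanced-data fuzzy multiclass support vector machine algorithm based on class-overlap degree undersampling is proposed. The algorithm first cleans noise samples from the training set using the local outlier factor (LOF) together with a box plot. It then uses the class-overlap degree to extract the support vectors that play a key role in classification, and takes the class-overlap degree, which represents the importance of each sample, as the membership value for constructing the fuzzy multiclass support vector machine. Experimental results show that the algorithm overcomes the loss of important sample information and the unstable results of the support vector machine with random undersampling, improves the classification accuracy of the support vector machine on imbalanced, noisy datasets, and maintains high computational efficiency.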

Keywords: support vector machine; fuzzy multiclass support vector machine; noise; imbalanced data; class-overlap degree
Received: 2017-05-02
Revised: 2017-06-02

Imbalanced fuzzy multiclass support vector machine algorithm based on class-overlap degree undersampling
Authors: WU Yuanyuan, SHEN Liyong
Institution: School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
Abstract: Undersampling is a commonly used data-reconstruction method for classifying imbalanced data. However, traditional undersampling often discards important sample information and yields unstable experimental results. To address these two problems, this paper proposes an imbalanced fuzzy multiclass support vector machine algorithm based on class-overlap degree undersampling. The algorithm first combines the local outlier factor (LOF) with a box-and-whisker plot to remove noise samples from the training set, then extracts support vectors according to their class-overlap degree. Finally, the class-overlap degree of each sample is used as its membership value to construct the fuzzy multiclass support vector machine. Experimental results show that the algorithm overcomes the two drawbacks of the support vector machine with random undersampling, namely the loss of important sample information and unstable results. It also improves the classification accuracy of the support vector machine on imbalanced, noisy datasets.
Keywords: support vector machine; fuzzy multiclass support vector machine; noise; imbalanced datasets; class-overlap degree
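
The abstract names three steps (noise cleaning with LOF and a box plot, undersampling by class-overlap degree, and using that degree as the fuzzy membership) without giving formulas. The following is a minimal pure-Python sketch of how such a pipeline might look, not the authors' implementation. Several parts are assumptions: the LOF is simplified to exactly k neighbours with ties broken by index, the noise threshold is taken to be the usual box-plot upper fence Q3 + 1.5 IQR, and the class-overlap degree is stood in for by the share of a sample's k nearest neighbours that belong to another class. All function names (`knn`, `lof_scores`, `flag_noise`, `overlap_degree`, `undersample`) are hypothetical.

```python
import math
from statistics import quantiles


def knn(points, i, k):
    """k nearest neighbours of points[i] as (distance, index) pairs, ties broken by index."""
    dists = sorted((math.dist(points[i], p), j) for j, p in enumerate(points) if j != i)
    return dists[:k]


def lof_scores(points, k=3):
    """Simplified local outlier factor per point; scores well above 1 suggest an outlier."""
    neigh = [knn(points, i, k) for i in range(len(points))]
    k_dist = [n[-1][0] for n in neigh]  # distance to the k-th neighbour
    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(p, o) = max(k_dist(o), d(p, o)). The epsilon guards duplicates.
    lrd = [1.0 / (sum(max(k_dist[j], d) for d, j in n) / k + 1e-12) for n in neigh]
    return [sum(lrd[j] for _, j in n) / (k * lrd[i]) for i, n in enumerate(neigh)]


def flag_noise(points, k=3, whisker=1.5):
    """Indices whose LOF score lies above the box-plot upper fence Q3 + whisker * IQR."""
    scores = lof_scores(points, k)
    q1, _, q3 = quantiles(scores, n=4)
    fence = q3 + whisker * (q3 - q1)
    return {i for i, s in enumerate(scores) if s > fence}


def overlap_degree(points, labels, k=3):
    """ASSUMED definition: share of the k nearest neighbours carrying a different label."""
    return [sum(labels[j] != labels[i] for _, j in knn(points, i, k)) / k
            for i in range(len(points))]


def undersample(labels, degree, n_keep):
    """Keep, per class, the n_keep samples with the highest overlap degree."""
    kept = []
    for c in set(labels):
        idx = [i for i, y in enumerate(labels) if y == c]
        kept += sorted(idx, key=lambda i: -degree[i])[:n_keep]
    return sorted(kept)


# Toy data 1: a regular grid plus one far-away sample that should be flagged as noise.
grid = [(x, y) for x in range(6) for y in range(2)]
noise = flag_noise(grid + [(30, 30)])

# Toy data 2: four majority samples and two minority samples on a line.
line = [(0,), (1,), (2,), (3,), (4,), (5,)]
labels = [0, 0, 0, 0, 1, 1]
degree = overlap_degree(line, labels, k=2)
kept = undersample(labels, degree, n_keep=2)
memberships = [degree[i] for i in kept]  # would serve as per-sample fuzzy weights
```

In this sketch the memberships would then be handed to a weighted SVM; one option is the `sample_weight` argument of scikit-learn's `SVC.fit`. Under the assumed overlap definition, samples deep inside their own class get a degree of zero, so a small positive floor on the weights may be needed in practice.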