Quality flaw prediction in Spanish Wikipedia: A case of study with verifiability flaws

Authors:	Edgardo Ferretti, Leticia Cagnina, Viviana Paiz, Sebastián Delle Donne, Rodrigo Zacagnini, Marcelo Errecalde

Institution:	1. Departamento de Informática, Universidad Nacional de San Luis (UNSL), Ejército de los Andes 950, San Luis, Argentina; 2. Laboratorio de Investigación y Desarrollo en Inteligencia Computacional (UNSL), Argentina; 3. Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Argentina

Abstract:	In this work, we present the first quality flaw prediction study for articles containing the two most frequent verifiability flaws in Spanish Wikipedia: articles that do not cite any references or sources at all (tagged Unreferenced) and articles that need additional citations for verification (tagged Refimprove). Based on the underlying characteristics of each flaw, different state-of-the-art approaches were evaluated. For articles that cite no references, a well-established rule-based approach was evaluated; the findings show that some of these articles actually suffer from the Refimprove flaw instead. For articles that need additional citations for verification, the well-known PU learning and one-class classification approaches were evaluated. In addition, new methods were compared, and a new feature was proposed to model this latter flaw. The results show that new methods such as under-bagged decision trees with sum or majority voting rules, biased-SVM, and centroid-based balanced SVM perform best in comparison with those previously published.

Keywords:	Information quality; Quality flaw prediction; Semi-supervised learning; Supervised learning; Wikipedia
This article is indexed in ScienceDirect and other databases.
