

A semantic approach to post-retrieval query performance prediction
Institution: 1. Department of Information Science and Technology, South China Business College, Guangdong University of Foreign Studies, Guangzhou 510545, China; 2. Department of Computer and Information Science, University of Macau, Macau 999078, China; 3. Department of Information and Communication Engineering, Guangzhou Maritime University, Guangzhou 510725, China; 1. School of Economics and Management, Harbin Engineering University, Harbin 150001, China; 2. Management School, Harbin University of Commerce, Harbin 150028, China; 3. Department of Computer Science and Information Engineering, Asia University, Taichung, 41354, Taiwan; 4. Department of Computer Science and Engineering, Kyung Hee University, Republic of Korea; 1. Business School, Hohai University, Nanjing 211100, China; 2. Foreign Language School, Hohai University, Nanjing 211100, China
Abstract: The importance of query performance prediction has been widely acknowledged in the literature, especially for query expansion, query refinement, and the interpolation of different retrieval approaches. This paper proposes a novel semantics-based query performance prediction approach that estimates semantic similarities between queries and documents. We introduce three post-retrieval predictors: (1) semantic distinction, based on the semantic similarity of a query to the top-ranked documents compared to its similarity to the whole collection; (2) semantic query drift, based on estimating the non-query-related aspects of the retrieved documents using semantic measures; and (3) semantic cohesion, based on how semantically cohesive the retrieved documents are. We assume that queries and documents are modeled as sets of entities from a knowledge graph, e.g., DBpedia concepts, instead of bags of words. Under this assumption, the semantic similarity between two texts is measured by the relatedness between entities, which are learned from the contextual information represented in the knowledge graph. We empirically illustrate the effectiveness of these predictors, especially in cases where term-based measures fail to quantify query performance prediction hypotheses correctly. We report the performance of the proposed predictors and of their interpolation on three standard collections, namely ClueWeb09-B, ClueWeb12-B, and Robust04. We show that the proposed predictors are effective across these datasets in terms of the Pearson and Kendall correlation coefficients between the predicted performance and the average precision measured from relevance judgments.
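The three predictors described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's formulation: the abstract gives no formulas, so every function name and scoring rule below is a hypothetical simplification. In particular, entity relatedness is approximated here by the Jaccard overlap of each entity's neighbours in a toy graph, whereas the paper learns relatedness from the knowledge graph's contextual information. Queries and documents are plain sets of entity identifiers.

```python
from itertools import combinations


def _mean(values):
    """Arithmetic mean that returns 0.0 for an empty sequence."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def relatedness(e1, e2, neighbors):
    """ASSUMPTION: Jaccard overlap of graph neighbourhoods stands in for
    the learned entity relatedness mentioned in the abstract."""
    if e1 == e2:
        return 1.0
    n1, n2 = neighbors.get(e1, set()), neighbors.get(e2, set())
    union = n1 | n2
    return len(n1 & n2) / len(union) if union else 0.0


def text_similarity(a, b, neighbors):
    """Average, over the entities of a, of their best match in b."""
    if not a or not b:
        return 0.0
    return _mean(max(relatedness(x, y, neighbors) for y in b) for x in a)


def semantic_distinction(query, top_docs, collection, neighbors):
    """Query similarity to the top-ranked documents minus its similarity
    to the whole collection (hypothetical scoring rule)."""
    top = _mean(text_similarity(query, d, neighbors) for d in top_docs)
    whole = _mean(text_similarity(query, d, neighbors) for d in collection)
    return top - whole


def semantic_query_drift(query, top_docs, neighbors):
    """Average share of each retrieved document that is unrelated to the
    query; higher values suggest more drift (hypothetical scoring rule)."""
    return _mean(1.0 - text_similarity(d, query, neighbors) for d in top_docs)


def semantic_cohesion(top_docs, neighbors):
    """Average symmetric similarity over all pairs of retrieved documents
    (hypothetical scoring rule); 0.0 if there are fewer than two."""
    return _mean(
        (text_similarity(d1, d2, neighbors) + text_similarity(d2, d1, neighbors)) / 2
        for d1, d2 in combinations(top_docs, 2)
    )


# Toy data: entity names and neighbourhoods are invented for illustration.
toy_neighbors = {
    "Jaguar_Cars": {"Car", "UK"},
    "Land_Rover": {"Car", "UK"},
    "Jaguar_(animal)": {"Cat", "Americas"},
    "Leopard": {"Cat", "Africa"},
}
toy_query = {"Jaguar_Cars"}
on_topic = [{"Jaguar_Cars", "Land_Rover"}, {"Land_Rover"}]
off_topic = [{"Leopard"}, {"Jaguar_(animal)"}]
toy_collection = on_topic + off_topic

for name, ranking in (("on-topic", on_topic), ("off-topic", off_topic)):
    print(
        name,
        semantic_distinction(toy_query, ranking, toy_collection, toy_neighbors),
        semantic_query_drift(toy_query, ranking, toy_neighbors),
        semantic_cohesion(ranking, toy_neighbors),
    )
```

In this toy setup a term-based measure would treat "Jaguar" documents about the animal as matching the query, while the entity-based scores separate the two rankings. In an evaluation such as the one the abstract describes, scores like these would be correlated with per-query average precision using the Pearson and Kendall coefficients.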
Keywords: Query performance prediction; Semantic-enabled prediction; Post-retrieval prediction; Semantic information retrieval
This article has been indexed by ScienceDirect and other databases.