Mining subtopics from text fragments for a web query

Authors:	Qinglei Wang, Yanan Qian, Ruihua Song, Zhicheng Dou, Fan Zhang, Tetsuya Sakai, Qinghua Zheng

Institution:	1. SPKLSTN Lab, Department of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, 710049, People’s Republic of China; 2. Microsoft Research Asia, No. 5 Danling Street, Haidian District, Beijing, 100080, People’s Republic of China; 3. Nankai-Baidu Joint Lab, Nankai University, Tianjin, 300071, People’s Republic of China

Abstract:	Web search queries are often ambiguous or faceted, and identifying the major underlying senses and facets of a query has received much attention in recent years. We refer to this task as query subtopic mining. In this paper, we propose mining and ranking subtopics using the text that surrounds query terms in the top retrieved documents. We first extract text fragments containing query terms from different parts of the documents. We then group similar text fragments into clusters and generate a readable subtopic for each cluster. Based on each cluster and a language model trained on a query log, we compute three features and combine them into a relevance score for each subtopic. Finally, subtopics are ranked by balancing relevance and novelty. Our evaluation experiments on the NTCIR-9 INTENT Chinese Subtopic Mining test collection show that our method significantly outperforms a query-log-based method proposed by Radlinski et al. (2010) and a search-result-clustering-based method proposed by Zeng et al. (2004) in terms of precision, I-rec, D-nDCG and D#-nDCG, the official evaluation metrics of the NTCIR-9 INTENT task. Moreover, our generated subtopics are significantly more readable than those generated by the search result clustering method.
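
The pipeline outlined in the abstract (extract fragments, cluster them, score each cluster, rank by relevance and novelty) might be sketched roughly as follows. This is only an illustrative toy and not the authors' implementation: the function names, the token window, the Jaccard similarity, the greedy clustering, and the `trade_off` weight are all assumptions. Relevance is approximated here by relative cluster size instead of the three features and query-log language model the abstract mentions, and tokenization assumes whitespace-delimited text with a single-word query.

```python
import re

def extract_fragments(docs, query, window=3):
    """Collect token windows around each occurrence of a single-word query."""
    fragments = []
    for doc in docs:
        tokens = re.findall(r"\w+", doc.lower())
        for i, tok in enumerate(tokens):
            if tok == query:
                lo, hi = max(0, i - window), i + window + 1
                fragments.append(tokens[lo:hi])
    return fragments

def jaccard(a, b):
    """Set-overlap similarity between two token lists (an assumed measure)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_fragments(fragments, threshold=0.5):
    """Greedy single pass: join the first cluster whose seed is similar enough."""
    clusters = []
    for frag in fragments:
        for cluster in clusters:
            if jaccard(frag, cluster[0]) >= threshold:
                cluster.append(frag)
                break
        else:
            clusters.append([frag])
    return clusters

def rank_subtopics(clusters, trade_off=0.7):
    """Greedy ranking that trades relevance against similarity to earlier picks.

    Relevance is the relative cluster size (a stand-in for the paper's score);
    the cluster's seed fragment doubles as its label.
    """
    total = sum(len(c) for c in clusters)
    candidates = [(" ".join(c[0]), len(c) / total, c[0]) for c in clusters]
    ranked = []
    while candidates:
        def score(cand):
            redundancy = max((jaccard(cand[2], r[2]) for r in ranked), default=0.0)
            return trade_off * cand[1] - (1 - trade_off) * redundancy
        best = max(candidates, key=score)
        candidates.remove(best)
        ranked.append(best)
    return [(label, relevance) for label, relevance, _ in ranked]

if __name__ == "__main__":
    docs = [
        "jaguar car price list",
        "jaguar car price guide",
        "jaguar animal habitat facts",
    ]
    fragments = extract_fragments(docs, "jaguar", window=2)
    for label, relevance in rank_subtopics(cluster_fragments(fragments)):
        print(label, round(relevance, 2))
```

For Chinese queries such as those in the NTCIR-9 collection, a word segmenter would have to replace the regular-expression tokenizer, and the seed-fragment label would need a proper subtopic-generation step.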

Keywords:
This article is indexed in SpringerLink and other databases.
