Enhancing web search with queries of equivalent intents |
| |
Authors: | Ruihua Song, Dingquan Wang, Jian-Yun Nie, Ji-Rong Wen, Yong Yu |
| |
Institution: | 1. Microsoft Research Asia, Beijing, China; 2. Johns Hopkins University, Baltimore, USA; 3. University of Montreal, Montreal, Canada; 4. Renmin University, Beijing, China; 5. Shanghai Jiao Tong University, Shanghai, China |
| |
Abstract: | Users often issue many different queries to look for the same target, owing to the intrinsic ambiguity and flexibility of natural language. Some previous work clusters queries based on co-clicks; however, the intents of queries in one cluster are often only roughly related rather than equivalent. It is therefore desirable to automatically mine queries with equivalent intents from large-scale search logs. In this paper, we take into account similarities between query strings. There are two issues associated with such similarities: it is too costly to compare every pair of queries in large-scale search logs, and two queries with similar formulations, such as “SVN” (Apache Subversion) and “SVM” (support vector machine), are not necessarily similar in their intents. To address these issues, we propose applying query string similarities on top of the co-click based clustering results. Our method improves precision over the co-click based clustering method (lifting precision from 0.37 to 0.62) and outperforms a commercial search engine’s query alteration (lifting the \(F_1\) measure from 0.42 to 0.56). As an application, we consider web document retrieval. We aggregate the click-throughs of similar queries with the query’s own click-throughs and evaluate the result on a large-scale dataset. Experimental results indicate that our proposed method significantly outperforms the baseline method of using only a query’s own click-throughs in all metrics. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases.
|
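
The two-stage idea in the abstract (cluster by co-clicks first, then compare query strings only inside each cluster) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: clusters are taken to be connected components of queries that share a clicked URL, string similarity is Python's `difflib.SequenceMatcher` ratio, and the function names, the toy click log, and the 0.6 threshold are hypothetical.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def co_click_clusters(clicks):
    """Group queries that share a clicked URL (assumed clustering: union-find)."""
    parent = {}

    def find(q):
        parent.setdefault(q, q)
        while parent[q] != q:
            parent[q] = parent[parent[q]]  # path halving
            q = parent[q]
        return q

    first_query_for_url = {}
    for query, url in clicks:
        root = find(query)
        if url in first_query_for_url:
            # Merge this query's cluster with the cluster already tied to the URL.
            parent[root] = find(first_query_for_url[url])
        else:
            first_query_for_url[url] = query

    clusters = defaultdict(set)
    for query in parent:
        clusters[find(query)].add(query)
    return list(clusters.values())


def equivalent_pairs(clicks, threshold=0.6):
    """Pair queries within the same co-click cluster whose strings are similar."""
    pairs = set()
    for cluster in co_click_clusters(clicks):
        # String comparison is restricted to one cluster, never the full log.
        for a, b in combinations(sorted(cluster), 2):
            if SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold:
                pairs.add((a, b))
    return pairs


# Hypothetical toy click log of (query, clicked URL) pairs.
clicks = [
    ("svn", "subversion.apache.org"),
    ("svm", "en.wikipedia.org/wiki/Support_vector_machine"),
    ("face book", "facebook.com"),
    ("facebook", "facebook.com"),
    ("social network", "facebook.com"),
]

print(equivalent_pairs(clicks))  # → {('face book', 'facebook')}
```

In this toy log, "svn" and "svm" are similar as strings but never share a click, so they are never compared; "social network" shares a click with "facebook" but is dissimilar as a string, so it is filtered out in the second stage.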