Exploring criteria for successful query expansion in the genomic domain |
| |
Authors: | Nicola Stokes, Yi Li, Lawrence Cavedon, Justin Zobel |
| |
Institution: | (1) NICTA Victoria Research Lab, Department of Computer Science and Software Engineering, The University of Melbourne, Melbourne, Australia |
| |
Abstract: | Query Expansion is commonly used in Information Retrieval to overcome vocabulary mismatch issues, such as synonymy between
the original query terms and a relevant document. In general, query expansion experiments exhibit mixed results. Overall TREC
Genomics Track results are also mixed; however, results from the top performing systems provide strong evidence supporting
the need for expansion. In this paper, we examine the conditions necessary for optimal query expansion performance with respect
to two system design issues: IR framework and knowledge source used for expansion. We present a query expansion framework
that improves Okapi baseline passage MAP performance by 185%. Using this framework, we compare and contrast the effectiveness
of a variety of biomedical knowledge sources used by TREC 2006 Genomics Track participants for expansion. Based on the outcome
of these experiments, we discuss the success factors required for effective query expansion with respect to various sources
of term expansion, such as corpus-based co-occurrence statistics, pseudo-relevance feedback methods, and domain-specific and
domain-independent ontologies and databases. Our results show that choice of document ranking algorithm is the most important
factor affecting retrieval performance on this dataset. In addition, when an appropriate ranking algorithm is used, we find
that query expansion with domain-specific knowledge sources provides an equally substantive gain in performance over a baseline
system.
|
| |
Keywords: | Passage retrieval for genomic queries; Knowledge based query expansion; Corpus based query expansion; Pseudo relevance feedback; Concept-based normalisation passage ranking; TREC 2006 Genomics Track |
This article is indexed in SpringerLink and other databases. |
|