On document relevance and lexical cohesion between query terms |
| |
Authors: | Olga Vechtomova, Murat Karamuftuoglu, Stephen E. Robertson |
| |
Institution: | 1. Department of Management Sciences, University of Waterloo, 200 University Avenue West, Waterloo, Ont., Canada N2L 3G1; 2. Department of Computer Engineering, Bilkent University, Bilkent, 06800 Ankara, Turkey; 3. Microsoft Research Cambridge, 7 J J Thomson Avenue, Cambridge, CB3 0FB, UK |
| |
Abstract: | Lexical cohesion is a property of text, achieved through lexical-semantic relations between words in text. Most information retrieval systems make use of lexical relations in text only to a limited extent. In this paper we empirically investigate whether the degree of lexical cohesion between the contexts of query terms’ occurrences in a document is related to its relevance to the query. Lexical cohesion between distinct query terms in a document is estimated on the basis of the lexical-semantic relations (repetition, synonymy, hyponymy and sibling) that exist between their collocates, the words that co-occur with them in the same windows of text. Experiments suggest that there are significant differences between the lexical cohesion in relevant and non-relevant document sets. A document ranking method based on lexical cohesion shows some performance improvements. |
| |
Keywords: | Information retrieval; Lexical cohesion; Word collocation; Document relevance |
This article is indexed in ScienceDirect and other databases. |
|