An analysis of ill-formed input in natural language queries to document retrieval systems
Authors: Charlene W Young, Caroline M Eastman, Robert L Oakman
Abstract: We analyzed natural language document retrieval queries from the Thomas Cooper Library at the University of South Carolina in order to investigate the frequency of various types of ill-formed input, such as spelling errors, co-occurrence violations, conjunctions, ellipsis and missing or incorrect punctuation. The primary reason for analyzing ill-formed inputs was to determine whether there is a significant need to study ill-formed inputs in detail. After analyzing the queries, we found that most of the queries were sentence fragments and that many of them contained some type of ill-formed input. Conjunctions caused the most problems. The next most serious problem was caused by punctuation errors. Spelling errors occurred in a small number of the queries. The remaining types of ill-formed input considered, ellipsis and co-occurrence violations, were not found in the queries.
This article is indexed in ScienceDirect and other databases.