From Plain Character Strings to Meaningful Words: Producing Better Full Text Databases for Inflectional and Compounding Languages with Morphological Analysis Software
Authors: Riitta Alkula
Institution: (1) Tieto Enator Corporation, Finland
Abstract: The paper deals with linguistic processing and retrieval techniques in full-text databases. Special attention is given to the characteristics of highly inflectional languages and to how the morphological structure of a language should be taken into account when designing and developing information retrieval systems. Finnish is used as an example of a language with a more complicated inflectional structure than English. In the FULLTEXT project, natural language analysis modules for Finnish were incorporated into the commercial BASIS information retrieval system, which is based on inverted files and Boolean searching. Several test databases were produced, each using one or two Finnish morphological analysis programs.
Keywords: natural language processing; full text retrieval; stemming; morphology
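
The approach the abstract describes, morphological normalization feeding an inverted file that is queried with Boolean operators, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `TOY_LEMMAS` table is a hypothetical stand-in for a real Finnish morphological analyzer, and the function names do not come from the BASIS system or the FULLTEXT project.

```python
from collections import defaultdict

# Hypothetical toy lemma table (assumption): a real system would call a
# morphological analysis program instead of a hand-written lookup.
TOY_LEMMAS = {
    "talossa": "talo",   # "in the house"  -> "house"
    "talot": "talo",     # "houses"        -> "house"
    "taloissa": "talo",  # "in the houses" -> "house"
    "autossa": "auto",   # "in the car"    -> "car"
    "autot": "auto",     # "cars"          -> "car"
}


def lemmatize(token):
    """Map an inflected word form to its base form; unknown forms pass through."""
    token = token.lower()
    return TOY_LEMMAS.get(token, token)


def build_index(docs):
    """Build an inverted file: base form -> set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[lemmatize(token)].add(doc_id)
    return index


def search_and(index, *terms):
    """Boolean AND: ids of documents that contain every query term."""
    postings = [index.get(lemmatize(term), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


docs = {
    1: "talossa autot",
    2: "talot",
    3: "autossa",
}
index = build_index(docs)
```

Because both documents and queries are reduced to base forms, a search for `talo` matches documents that contain only inflected forms such as `talossa` or `talot`, which a plain character-string index would miss.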
This article is indexed in SpringerLink and other databases.