
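Application of Bootstrapping in Word Sense Disambiguation and Its Key Problems
Cite this article: Li Gang, Kou Guangzeng. Application of Bootstrapping in Word Sense Disambiguation and Its Key Problems[J]. 情报学报 (Journal of the China Society for Scientific and Technical Information), 2010, 29(1).
Authors: Li Gang, Kou Guangzeng
Institution: School of Information Management, Wuhan University, Wuhan 430072
Funding: National Natural Science Foundation of China project
Abstract: Supervised learning on sense-annotated corpora achieves the best performance in word sense disambiguation (WSD), outperforming both unsupervised and dictionary-based methods. Its accuracy depends heavily on the size of the annotated corpus, yet manually annotated corpora remain very scarce. The lack of annotated data has therefore become the bottleneck limiting progress in WSD, and how to generate large-scale annotated corpora has become a focus of WSD research. Bootstrapping is an important approach to this problem: it takes a small annotated corpus as seeds and applies machine learning algorithms to generate a large annotated corpus. This paper introduces the application of bootstrapping in WSD and its key problems. It first gives an algorithmic description of bootstrapping, then surveys its application in Chinese and English WSD, and finally summarizes several key issues in applying it: the initial seeds, the bootstrapping parameters, the selection of the unlabeled corpus, and the use of the Internet.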

Keywords: bootstrapping; word sense disambiguation
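
The bootstrapping idea summarized in the abstract (start from a few sense-labeled seeds, then let a learner label more data for itself) can be illustrated with a minimal self-training sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the learner is a tiny Naive Bayes model over context words, and `train`, `predict`, `bootstrap`, the `threshold` and `max_rounds` parameters, and the toy contexts for the word "bank" are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(labeled):
    """Fit a tiny Naive Bayes model from (context_words, sense) pairs."""
    word_counts = defaultdict(Counter)  # sense -> word frequency
    sense_counts = Counter()            # sense -> number of examples
    vocab = set()
    for words, sense in labeled:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocab.update(words)
    return word_counts, sense_counts, vocab


def predict(model, words):
    """Return (best_sense, posterior_probability) for one context."""
    word_counts, sense_counts, vocab = model
    total = sum(sense_counts.values())
    log_scores = {}
    for sense, n in sense_counts.items():
        # Add-one smoothing; the extra +1 reserves a slot for unseen words.
        denom = sum(word_counts[sense].values()) + len(vocab) + 1
        score = math.log(n / total)
        for w in words:
            score += math.log((word_counts[sense][w] + 1) / denom)
        log_scores[sense] = score
    # Normalize the log scores into posterior probabilities.
    top = max(log_scores.values())
    exp = {s: math.exp(v - top) for s, v in log_scores.items()}
    z = sum(exp.values())
    best = max(exp, key=exp.get)
    return best, exp[best] / z


def bootstrap(seeds, unlabeled, threshold=0.6, max_rounds=10):
    """Self-training loop: grow the labeled set from a few seeds.

    `threshold` and `max_rounds` stand in for the "bootstrapping
    parameters" the abstract mentions; the values are arbitrary.
    """
    labeled = list(seeds)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train(labeled)
        confident, rest = [], []
        for words in pool:
            sense, p = predict(model, words)
            if p >= threshold:
                confident.append((words, sense))
            else:
                rest.append(words)
        if not confident:  # nothing new was labeled, so stop
            break
        labeled.extend(confident)
        pool = rest
    return labeled, pool


# Toy data for the ambiguous word "bank" (hypothetical contexts).
seeds = [(["river", "water"], "bank/river"),
         (["money", "loan"], "bank/finance")]
unlabeled = [["river", "fishing"], ["loan", "interest"],
             ["fishing", "boat"], ["interest", "account"]]

labeled, remaining = bootstrap(seeds, unlabeled)
for words, sense in labeled:
    print(sense, words)
```

In this toy run the contexts that share a word with a seed are labeled in the first round, and the words they bring in ("fishing", "interest") let the remaining contexts be labeled in the second round. The same mechanism is why seed quality and the confidence threshold matter: an early wrong label would be propagated in exactly the same way.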

Application of Bootstrapping in Word Sense Disambiguation and Its Key Problem
Li Gang, Kou Guangzeng. Application of Bootstrapping in Word Sense Disambiguation and Its Key Problem[J]. Journal of the China Society for Scientific and Technical Information, 2010, 29(1).
Authors:Li Gang  Kou Guangzeng
Institution: School of Information Management, Wuhan University, Wuhan 430072
Abstract: Corpus-based word sense disambiguation (WSD) usually achieves the best performance compared with unsupervised or knowledge-based methods. But this performance largely depends on the availability of sense-tagged corpora, which are very scarce because semantic annotation is usually done by humans. The lack of widely available sense-tagged corpora has become the bottleneck in word sense disambiguation. Recently, several WSD studies have sought to sense-tag a training corpus automatically via bootstrapping...
Keywords:bootstrapping  word sense disambiguation  
This article is indexed in CNKI, 万方数据 (Wanfang Data), and other databases.