
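A Hybrid Semantic Information Extraction Method for Scientific Literature
Cite this article:Leng Fuhai, Bai Rujiang, Zhu Qingsong. A Hybrid Semantic Information Extraction Method for Scientific Literature[J]. Library and Information Service (图书情报工作), 2013, 57(11): 112-119.
Authors:Leng Fuhai  Bai Rujiang  Zhu Qingsong
Institution:1. National Science Library, Chinese Academy of Sciences; 2. Shandong University of Technology Library
Abstract:Current knowledge extraction techniques cannot precisely extract the specific theories, methods, and performance indicator parameters mentioned in academic literature. To address this, the paper combines semantic annotation, rule-based extraction, and regular expressions into a hybrid semantic information extraction method for scientific literature. The method first semantically annotates the literature to obtain the relevant academic terms. It then constructs extraction rules to extract the sentences that relate to specific performance indicators. Finally, it uses regular expressions to precisely extract the key performance indicators from those sentences. Semantic information extraction on scientific literature from the field of carbon nanotube research shows that the method can quickly, effectively, and accurately extract the main innovative research content and performance indicators of scientific literature.

Keywords:scientific literature  information extraction  semantic annotation  regular expression
Received:2013-04-15
Revised:2013-05-10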

A Hybrid Semantic Information Extraction Method for Scientific Research Papers
Leng Fuhai,Bai Rujiang,Zhu Qingsong.A Hybrid Semantic Information Extraction Method for Scientific Research Papers[J].Library and Information Service,2013,57(11):112-119.
Authors:Leng Fuhai  Bai Rujiang  Zhu Qingsong
Institution:1. Chinese Academy of Sciences, National Science Library, Beijing 100190; 2. Shandong University of Technology Library, Zibo 255049
Abstract:Existing knowledge extraction techniques cannot accurately extract the specific theoretical approaches and performance indicator parameters mentioned in academic literature. To address this problem, this paper proposes a hybrid semantic information extraction method that combines semantic annotation, rule-based extraction, and regular expressions. First, semantic annotation is used to obtain the relevant academic terms. Then, extraction rules are constructed to extract the sentences associated with performance indicators. Finally, regular expressions are used to extract the parameters of the key performance indicators from those sentences. An experiment on literature from the field of carbon nanotube research shows that the method can rapidly, effectively, and accurately extract the innovative research content and performance indicators of scientific literature.
Keywords:research papers  information extraction  semantic annotation  regular expression
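
The three stages named in the abstract (semantic annotation, rule-based sentence selection, regular-expression extraction) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the term lexicon, the cue words, the rule, the unit list, and the sample text are all hypothetical stand-ins chosen for illustration, and a simple dictionary lookup takes the place of real semantic annotation.

```python
import re

# Hypothetical domain lexicon standing in for the semantic-annotation step.
DOMAIN_TERMS = {"carbon nanotube", "chemical vapor deposition"}
# Hypothetical cue words signalling a performance indicator.
INDICATOR_CUES = {"tensile strength", "conductivity", "diameter"}
# A number followed by a unit; the unit list is illustrative only.
VALUE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(GPa|MPa|nm|S/cm)")


def annotate(sentence):
    """Step 1: return the domain terms found in a sentence (dictionary lookup)."""
    lowered = sentence.lower()
    return sorted(t for t in DOMAIN_TERMS if t in lowered)


def matches_rule(sentence):
    """Step 2: keep sentences that contain a domain term and an indicator cue."""
    lowered = sentence.lower()
    return bool(annotate(sentence)) and any(c in lowered for c in INDICATOR_CUES)


def extract_indicators(text):
    """Step 3: pull (cue, value, unit) triples out of rule-matched sentences."""
    results = []
    # Split on whitespace that follows sentence-ending punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if not matches_rule(sentence):
            continue
        lowered = sentence.lower()
        cues = sorted(c for c in INDICATOR_CUES if c in lowered)
        for value, unit in VALUE_RE.findall(sentence):
            # Simplification: attribute every value to the first cue found.
            results.append((cues[0], float(value), unit))
    return results


sample = (
    "The carbon nanotube fibers reached a tensile strength of 3.5 GPa. "
    "Samples were stored overnight. "
    "Each carbon nanotube had a diameter of 12 nm."
)
print(extract_indicators(sample))
```

Filtering sentences before applying the regular expression keeps the pattern from matching numbers in unrelated sentences, which is the motivation the abstract gives for combining the three techniques.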
This article is indexed in Wanfang Data (万方数据) and other databases.