
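Constructing a Semantic Feature Dictionary for Abstracts of English Scientific Papers (英文科技论文摘要的语义特征词典构建)
Cite this article: Song Donghuan, Li Chenying, Liu Ziyu, Han Mingjie. 英文科技论文摘要的语义特征词典构建 [J]. 图书情报工作 (Library and Information Service), 2020, 64(6): 108-119.
Authors: Song Donghuan (宋东桓), Li Chenying (李晨英), Liu Ziyu (刘子瑜), Han Mingjie (韩明杰)
Affiliations: 1. China Agricultural University Library, Beijing 100193; 2. National Science Library, Chinese Academy of Sciences, Beijing 100190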
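Abstract: [Purpose/Significance] Paper abstracts are an important indexing object in information organization, and indexing them according to a defined structure benefits scientific communication, knowledge discovery, and intelligence analysis. How to index existing unstructured abstracts automatically, accurately, and quickly is a practical problem that urgently needs solving. [Method/Process] We assume that different categories of abstracts are inherently consistent, so that research on structured abstracts can provide methodological and technical reference for the automatic indexing of unstructured abstracts. On that basis, drawing on the US National Library of Medicine's structural-element label terminology set and its label-to-category mappings, we propose the BOMRC system of structural elements and a method for identifying structured abstracts and indexing them in normalized form. We then select a research sample and use text mining to quantitatively analyze multiple indicators, such as word frequency and TF-IDF value, for the words, verbs, three-word chunks, and four-word chunks in the sample corpus, and build a semantic feature dictionary capable of identifying structural elements. Finally, we test the validity of the dictionary on a test set of unstructured abstracts. [Result/Conclusion] The results show that the semantic feature dictionary method can effectively identify each type of element in unstructured abstracts and can be used to optimize automatic identification models built around machine learning methods.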

Keywords: scientific paper; paper abstract; structural element; semantic feature; feature dictionary
Received: 2019-07-09
Revised: 2019-09-20

Semantic Feature Dictionary Construction of Abstract in English Scientific Journals
Song Donghuan, Li Chenying, Liu Ziyu, Han Mingjie. Semantic Feature Dictionary Construction of Abstract in English Scientific Journals[J]. Library and Information Service, 2020, 64(6): 108-119.
Authors: Song Donghuan, Li Chenying, Liu Ziyu, Han Mingjie
Institution: 1. China Agricultural University Library, Beijing 100193; 2. National Science Library, Chinese Academy of Sciences, Beijing 100190
Abstract: [Purpose/significance] The abstract of a scientific paper is a vital indexing object in information organization. Indexing abstracts according to a defined structure benefits scientific communication, knowledge discovery, and intelligence analysis. How to auto-index the millions of existing unstructured abstracts accurately and quickly is therefore a pressing problem. [Method/process] This study assumes that different categories of abstracts are inherently consistent, that is, that the study of structured abstracts can provide methods and technical reference for auto-indexing unstructured abstracts. Under this assumption, and based on the US National Library of Medicine's structural element label terminology and its label-to-category mappings, the study proposes the BOMRC system of structural elements together with a method for identifying structured abstracts and indexing them in normalized form. We then collected a research sample and used text mining to analyze multiple indicators quantitatively, such as word frequency and TF-IDF value, for the words, verbs, three-word lexical chunks, and four-word lexical chunks in the sample corpus, which enabled us to construct a semantic feature dictionary for identifying structural elements. Finally, we used a test set of unstructured abstracts to check the validity of the semantic feature dictionary. [Result/conclusion] The results show that the semantic feature dictionary method can effectively identify the structural elements of unstructured abstracts, and that it can be used to optimize automatic recognition models built around machine learning methods.
Keywords: scientific paper; paper abstract; structural element; semantic feature; feature dictionary
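
The abstract describes the dictionary only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It scores words, three-word chunks, and four-word chunks by a simple TF-IDF over element classes, keeps the top-scoring terms per class as a feature dictionary, and labels an unseen sentence by summing matched weights. The five label names (an assumed reading of BOMRC), the toy sentences, the scoring formula, and the function names `build_dictionary` and `classify` are all illustrative assumptions, not taken from the paper.

```python
import math
import re
from collections import Counter, defaultdict


def ngrams(text, n):
    """Return lowercase word n-grams of the text as space-joined strings."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def features(text):
    """Single words plus three-word and four-word chunks."""
    return ngrams(text, 1) + ngrams(text, 3) + ngrams(text, 4)


def build_dictionary(labeled_sentences, top_k=20):
    """Build {label: {term: weight}} from (label, sentence) pairs.

    Each element class is treated as one "document" for TF-IDF purposes;
    this is an assumed simplification, not the paper's indicator set.
    """
    tf = defaultdict(Counter)
    for label, sentence in labeled_sentences:
        tf[label].update(features(sentence))

    # Document frequency: in how many classes does each term occur?
    df = Counter()
    for counts in tf.values():
        df.update(counts.keys())

    n_classes = len(tf)
    dictionary = {}
    for label, counts in tf.items():
        total = sum(counts.values())
        scored = {
            term: (count / total) * math.log(n_classes / df[term])
            for term, count in counts.items()
        }
        # Keep the top_k terms with a positive score (terms found in
        # every class score zero and are dropped).
        ranked = sorted(scored.items(), key=lambda kv: -kv[1])[:top_k]
        dictionary[label] = {t: w for t, w in ranked if w > 0}
    return dictionary


def classify(sentence, dictionary):
    """Return the label whose dictionary terms best match, or None."""
    feats = features(sentence)
    scores = {
        label: sum(weights.get(f, 0.0) for f in feats)
        for label, weights in dictionary.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Toy labeled sentences standing in for structured-abstract sections.
# The label names are an assumed expansion of BOMRC.
samples = [
    ("background", "Little is known about the role of soil microbes in crop yield."),
    ("objective", "The aim of this study was to evaluate the effect of irrigation."),
    ("methods", "Samples were collected from forty fields and analyzed using regression."),
    ("results", "The results showed that yield increased significantly."),
    ("conclusions", "These findings suggest that irrigation improves yield."),
]

dictionary = build_dictionary(samples)
print(classify("The aim of this study was to compare two fertilizers.", dictionary))
```

A real corpus would need far more text per class, and the abstract also mentions verb-level statistics, which would require part-of-speech tagging that this sketch leaves out.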
This article is indexed in databases including VIP (维普).