
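Research on Semantic Recognition of Technological Innovation Themes Based on LDA
Cite this article: Zhu Na, Wang Xiaoyue, Yang Jing, Bai Rujiang. Research on Semantic Recognition of Technological Innovation Themes Based on LDA [J]. 图书情报工作 (Library and Information Service), 2015, 59(14): 126-134.
Authors: Zhu Na  Wang Xiaoyue  Yang Jing  Bai Rujiang
Affiliation: Institute of Scientific & Technical Information, Shandong University of Technology, Zibo 255049
Funding: This paper is one of the research outcomes of the Ministry of Education Humanities and Social Sciences Research Youth Fund project "Research on Information Query Expansion in Long-Sentence Retrieval" (No. 12YJC870001) and the Ministry of Culture Science and Technology Innovation project "Research on Parallel Processing and Automatic Classification of Large-Scale Academic Literature".
Abstract: [Purpose/Significance] Traditional probabilistic methods for identifying technological innovation themes ignore the semantic understanding of text content, so semantic recognition of these themes is needed to identify them more accurately. [Method/Process] This paper proposes an LDA-based method for semantic recognition of technological innovation themes. It uses semantic role labeling to semantically index the technological innovation content of scientific literature, builds an LDA topic semantic recognition model, and identifies technological innovation themes according to the probabilities of the hypernyms corresponding to the semantic roles of the keywords that represent the innovation content. [Result/Conclusion] Experiments on data from the 3D printing field show that the method identifies technological innovation themes more accurately, forms mixed-distribution clusters of technological innovation theme, topic word, and scientific document, reduces interference from irrelevant data such as research background, and avoids double counting of topic words that have the same meaning.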

Keywords: semantic role labeling  technological innovation theme  LDA model  3D printing
Received: 2015-04-27

Semantic Recognition of Technological Innovation Theme Based on LDA
Zhu Na, Wang Xiaoyue, Yang Jing, Bai Rujiang. Semantic Recognition of Technological Innovation Theme Based on LDA [J]. Library and Information Service, 2015, 59(14): 126-134.
Authors:Zhu Na  Wang Xiaoyue  Yang Jing  Bai Rujiang
Institution:Institute of Scientific & Technical Information, Shandong University of Technology, Zibo 255049
Abstract: [Purpose/significance] Traditional probabilistic methods for identifying technological innovation themes ignore the semantic understanding of the text. To identify themes more accurately, semantic recognition of technological innovation themes is imperative. [Method/process] This article proposes an LDA-based method for the semantic recognition of technological innovation themes. It uses semantic role labeling to semantically index the technological innovation content of scientific literature, builds an LDA topic semantic recognition model, and identifies technological innovation themes according to the probabilities of the hypernyms that correspond to the semantic roles of keywords drawn from that content. [Result/conclusion] Experiments on 3D printing data show that the method identifies innovation themes more accurately and forms mixed-distribution clusters linking technological innovation themes, topic words, and scientific literature. It reduces interference from research background and other irrelevant data, and avoids double counting topic words that share the same meaning.
Keywords:semantic role labeling  technological innovation theme  LDA model  3D printing  
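
The pipeline summarized in the abstract (filter to innovation content by semantic role, map keywords to hypernyms, then fit LDA) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the role labels, the `HYPERNYMS` table, and the function names are hypothetical, the role tags are assumed to come from an upstream labeler, and collapsed Gibbs sampling is used here only as one common way to fit LDA, since the abstract does not say which inference procedure was used.

```python
import random
from collections import Counter

# Hypothetical hypernym table: keywords with the same meaning are
# collapsed onto one hypernym so they are counted only once.
HYPERNYMS = {
    "fdm": "extrusion_process",
    "fused deposition modeling": "extrusion_process",
    "sla": "photopolymerization",
    "stereolithography": "photopolymerization",
    "pla": "polymer_material",
    "abs": "polymer_material",
    "resin": "polymer_material",
}

# Hypothetical role labels; only these count as innovation content.
INNOVATION_ROLES = {"method", "material", "result"}


def to_hypernym_tokens(labeled_doc):
    """Drop terms outside the innovation roles, map the rest to hypernyms."""
    return [
        HYPERNYMS.get(term, term)
        for term, role in labeled_doc
        if role in INNOVATION_ROLES
    ]


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return (theta, phi)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    n_vocab = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    # Resample each token's topic from its conditional distribution.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + n_vocab * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    # Smoothed document-topic and topic-word probability estimates.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    phi = [
        {w: (topic_word[k][w] + beta) / (topic_total[k] + n_vocab * beta)
         for w in vocab}
        for k in range(n_topics)
    ]
    return theta, phi


# Toy role-labeled documents (invented for illustration only).
labeled_docs = [
    [("market growth", "background"), ("fdm", "method"), ("pla", "material")],
    [("fused deposition modeling", "method"), ("abs", "material")],
    [("sla", "method"), ("resin", "material"), ("history", "background")],
]
docs = [to_hypernym_tokens(d) for d in labeled_docs]
theta, phi = lda_gibbs(docs, n_topics=2)
```

In this toy input, the first two documents reduce to the same hypernym tokens even though their surface keywords differ, and the background terms never reach the topic model. That mirrors the two benefits the abstract claims, in miniature; it says nothing about performance on real data.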