
Association Analysis of Fine-Grained Knowledge Entities in Academic Texts
Cite this article: ZHANG Chengzhi, XIE Yuxin, SONG Yuntian. Association Analysis of Fine-Grained Knowledge Entities in Academic Texts[J]. Library Tribune, 2021(3): 12-20.
Authors: ZHANG Chengzhi, XIE Yuxin, SONG Yuntian
Affiliation: School of Economics and Management, Nanjing University of Science and Technology
Funding: This work is an outcome of the National Natural Science Foundation of China project "Extraction and Evaluation of Fine-Grained Algorithm Entities Based on the Full-Text Content of Academic Literature" (Grant No. 72074113); the Open Fund of the Key Laboratory of Rich-Media Digital Publishing Content Organization and Knowledge Service, project "Extraction, Association, and Evolution Analysis of Fine-Grained Knowledge Entities in Rich-Media Digital Publishing Content" (Grant No. ZD2020/09-04); and the Postgraduate Research and Practice Innovation Program of Jiangsu Province, project "Extraction, Association, and Evolution Analysis of Fine-Grained Knowledge Entities in Academic Texts" (Grant No. KYCX20_0406).
Abstract: Examining how the fine-grained knowledge entities contained in domain-specific texts are actually used is important for evaluating and selecting those entities. Fine-grained knowledge entities in academic texts usually fall into several types and relate to one another in multiple ways. Mining the homogeneous associations (between entities of the same type) and heterogeneous associations (between entities of different types) gives a deeper picture of how knowledge entities are used in a given field. However, most current research focuses on extracting and evaluating a single kind of knowledge entity in academic texts and pays little attention to the relationships between entities, which to some extent limits the capacity for knowledge discovery based on entity extraction. Taking natural language processing (NLP) as an example, this paper mines association data among fine-grained knowledge entities in the full text of academic papers and uses visualization to reveal the information contained in those data. Specifically, the Chinese papers accepted by the National Conference on Computational Linguistics (CCL) from 2009 to 2018 are selected as the original corpus, and the knowledge entities used in the papers are manually annotated. In line with the characteristics of NLP, the entities are divided into four types: "indicator entity", "tool entity", "resource entity", and "method entity". A knowledge entity association network is then built by combining the association rule mining algorithm Apriori with complex network analysis software, revealing the knowledge entities commonly used in the field and how their use is correlated.

Keywords: full-text content analysis; fine-grained knowledge entity; association analysis
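The abstract names Apriori as the association rule mining step but gives no implementation details. The following is a minimal sketch of how such a step might look, assuming each paper is reduced to the set of knowledge entities annotated in it. The toy `papers` data, the thresholds, and the function names `apriori` and `association_rules` are illustrative assumptions, not the authors' data, settings, or code.

```python
from itertools import combinations

# Hypothetical toy corpus: one set of annotated entities per paper.
# These are invented for illustration and are not results from the article.
papers = [
    {"CRF", "F1", "CRF++"},
    {"CRF", "F1", "CRF++", "PKU corpus"},
    {"SVM", "F1", "LIBSVM"},
    {"SVM", "accuracy", "LIBSVM"},
    {"CRF", "F1", "PKU corpus"},
]


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for every frequent itemset."""
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single entities.
    current = {}
    for item in {i for t in transactions for i in t}:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, k - 1))}
        nxt = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                nxt[c] = s
        current = nxt
    return frequent


def association_rules(frequent, min_confidence):
    """Return (lhs, rhs, support, confidence) tuples derived from frequent itemsets."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                # lhs is guaranteed to be frequent (downward closure).
                conf = sup / frequent[lhs]
                if conf >= min_confidence:
                    rules.append((set(lhs), set(itemset - lhs), sup, conf))
    return rules


frequent = apriori(papers, min_support=0.4)
rules = association_rules(frequent, min_confidence=0.8)

# Frequent pairs double as weighted edges of an entity association network.
edges = [(*sorted(pair), sup) for pair, sup in frequent.items() if len(pair) == 2]

for lhs, rhs, sup, conf in rules:
    print(f"{sorted(lhs)} -> {sorted(rhs)}  support={sup:.2f}  confidence={conf:.2f}")
```

The `edges` list (entity, entity, support) is one plausible hand-off format for network analysis software. Whether the authors weighted edges by support, confidence, or another measure is not stated in the abstract.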
This article is indexed in VIP (维普) and other databases.