
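Research on Institution Name Normalization in Chinese Bibliographic Data
Cite this article: Yang Zhao, Ren Juan. Research on Institution Name Normalization in Chinese Bibliographic Data[J]. Library and Information Service (图书情报工作), 2020, 64(4): 95-102.
Authors: Yang Zhao (杨昭), Ren Juan (任娟)
Affiliations: 1. Shanghai Jiao Tong University Library, Shanghai 200240; 2. Shanghai Publishing and Printing College, Shanghai 200093; 3. Shanghai Research Institute of Publishing and Media, Shanghai 200093
Abstract: [Purpose/Significance] In the big data era, institution name data exhibits new characteristics such as massive volume, dynamism, and diversity. Normalizing institution names can improve data reliability in research management, discipline evaluation, and subject services in a big data environment, and can raise the quality and effectiveness of data retrieval based on institution names. [Method/Process] The study examines institution name normalization from a linguistic perspective and at the level of model construction. It builds a framework model for institution name normalization based on co-occurrence relations and similarity, proposes a method for recognizing the entity boundaries of institution names, compiles a multi-level institution vocabulary, proposes an institution name normalization method, and finally runs experiments on Chinese bibliographic records from 2008 to 2018. [Result/Conclusion] The experimental results validate the effectiveness of the model and offer some insight for normalizing other types of institution names.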

Keywords: institution name; normalization; model construction; big data; entity boundary recognition
Received: 2019-06-21
Revised: 2019-10-15

Research on Institution Name Normalization Based on Chinese Bibliographic Data
Yang Zhao, Ren Juan. Research on Institution Name Normalization Based on Chinese Bibliographic Data[J]. Library and Information Service, 2020, 64(4): 95-102.
Authors: Yang Zhao, Ren Juan
Institution: 1. Shanghai Jiao Tong University Library, Shanghai 200240; 2. Shanghai Publishing and Printing College, Shanghai 200093; 3. Shanghai Research Institute of Publishing and Media, Shanghai 200093
Abstract: [Purpose/significance] In the era of big data, institution name data shows new characteristics such as massive volume, dynamism, and diversity. Normalizing institution names can improve the reliability of data used in research management, discipline evaluation, and subject services in a big data environment, and can improve the quality and effectiveness of data retrieval based on institution names. [Method/process] This paper studies institution name normalization from the perspectives of linguistics and model construction. It constructs a framework model for institution name normalization based on co-occurrence relations and similarity. First, it proposes a method for identifying the entity boundaries of institution names. Second, it compiles a multi-level institution vocabulary and proposes a normalization method for institution names. Finally, Chinese bibliographic data from 2008 to 2018 are selected for the experiment. [Result/conclusion] The experiments verify the validity of the model, which offers some guidance for normalizing the names of other types of institutions.
Keywords: institution name; normalization; model construction; big data; entity boundary recognition
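
To make the idea of similarity-based normalization against a multi-level vocabulary more concrete, the following is a minimal illustrative sketch. It is not the authors' model: the vocabulary contents, the `strip_location` and `normalize` helpers, the 0.6 threshold, and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration, and the co-occurrence component described in the abstract is not modeled here.

```python
import re
from difflib import SequenceMatcher

# Hypothetical two-level vocabulary: canonical institution -> known sub-units.
# The entries are illustrative only, not the vocabulary compiled in the paper.
VOCABULARY = {
    "上海交通大学": ["图书馆", "医学院"],
    "上海出版印刷高等专科学校": [],
}


def strip_location(affiliation):
    """Crude boundary step: keep only the institution token of an affiliation.

    Assumes the layout '<index>. <institution> <city> <postcode>'.
    """
    text = re.sub(r"^\s*\d+\.\s*", "", affiliation)  # drop a leading "1. "
    text = re.sub(r"\s*\d{6}\s*$", "", text)         # drop a trailing 6-digit postcode
    parts = text.split()
    return parts[0] if parts else ""


def normalize(affiliation, threshold=0.6):
    """Map a raw affiliation to (canonical institution, sub-unit or None).

    Returns None when no vocabulary entry is similar enough.
    """
    name = strip_location(affiliation)
    best, best_score = None, 0.0
    for canonical in VOCABULARY:
        if name.startswith(canonical):
            score = 1.0
        else:
            # Compare the canonical name with a prefix of the same length,
            # so that a trailing sub-unit does not dilute the score.
            prefix = name[: len(canonical)]
            score = SequenceMatcher(None, prefix, canonical).ratio()
        if score > best_score:
            best, best_score = canonical, score
    if best is None or best_score < threshold:
        return None
    # Second level: look for a known sub-unit at the end of the name.
    sub_unit = next((u for u in VOCABULARY[best] if name.endswith(u)), None)
    return best, sub_unit


if __name__ == "__main__":
    print(normalize("1. 上海交通大学图书馆 上海 200240"))
    print(normalize("上海交大图书馆"))
    print(normalize("北京大学"))
```

In this sketch an abbreviated variant such as 上海交大图书馆 resolves to the same canonical entry as the full name, while a name with no sufficiently similar vocabulary entry is left unresolved rather than forced onto the nearest match. A character-level ratio is a deliberately simple stand-in; any real system would need a similarity measure and threshold tuned on labeled data.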
This article is indexed in VIP (维普) and other databases.