Similar Literature
 20 similar documents found (search time: 906 ms)
1.
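This study applies co-word cluster analysis to the high-frequency major subject headings of organism-related literature in MEDLINE, derives association rules among the headings, and expresses them in structured form using UMLS semantic relations. Organism-related articles from 《中华医学杂志》 indexed in MEDLINE were selected as a test set; experts manually extracted relations from them, and these were compared with the association rules obtained by co-word clustering. Mining and evaluating the relations among organism-related subject headings through co-word cluster analysis offers a new approach to knowledge discovery from text.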
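
The counting step that usually precedes co-word clustering can be sketched as follows. This is a minimal illustration, not the study's actual procedure: the function names, the toy records, and the use of rule confidence as the association measure are all assumptions, and the clustering step itself is not shown.

```python
from collections import Counter
from itertools import combinations


def coword_counts(records):
    """Count how often each pair of subject headings is assigned to the same record."""
    pair_counts = Counter()
    for headings in records:
        # Deduplicate and sort so that (a, b) and (b, a) share one key.
        for a, b in combinations(sorted(set(headings)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def rule_confidence(records, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent, i.e. P(consequent | antecedent)."""
    with_antecedent = [set(r) for r in records if antecedent in r]
    if not with_antecedent:
        return 0.0
    hits = sum(consequent in r for r in with_antecedent)
    return hits / len(with_antecedent)


# Hypothetical toy records: each list holds the headings assigned to one article.
records = [
    ["Bacteria", "Humans", "Infection"],
    ["Bacteria", "Infection"],
    ["Humans", "Viruses"],
]
counts = coword_counts(records)
```

The pair counts form the co-occurrence matrix that a clustering algorithm would take as input; rule confidence is one common way to turn the same counts into directed association rules.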

2.
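In recent years, knowledge extraction techniques have played an important role in processing unstructured text. Based on an analysis of the current literature, systems, and projects on knowledge extraction, this article proposes a classification of the main content objects targeted by current knowledge extraction research and reviews the techniques used to extract them. It summarizes extraction methods for seven types of content object: Web object recognition and integration; term recognition and extraction; topic discovery and recognition; extraction of concept-hierarchy relations; extraction of non-hierarchical relations; fact extraction; and opinion extraction and orientation recognition. On this basis, it analyzes future trends in knowledge extraction. This article is part of the special topic "Knowledge Extraction" in issue 9, 2008.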

3.
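Key Technologies for Knowledge Discovery from Non-Related Literature   Total citations: 3, self-citations: 0, citations by others: 3
After delimiting the key technologies of knowledge discovery from non-related literature, this paper compares those used in 11 major studies conducted outside China, namely construction of the initial text set, information extraction, and the identification and ranking of intermediate linking terms. It concludes that the low quality of the B set is currently the main problem in this field. To address it, the authors propose improvement strategies whose primary goal is to raise the quality of the B set, along two lines: the process that precedes formation of the B set, i.e., the quality of the initial text set; and the quality of the B set itself, i.e., the ranking of B terms. The former covers a sound structure for the initial text set and a comprehensive filtering mechanism; the latter covers a bidirectional term-frequency method, MeSH-based weighting, and weighting based on literature cohesion. Some of the strategies were tested experimentally.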
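
As a rough illustration of what "B terms" and their ranking mean, the sketch below finds intermediate terms that co-occur with a start term A in one literature and with a target term C in another, then ranks them. The scoring rule (the smaller of the two co-occurrence counts) is an assumption chosen for simplicity; it is not the bidirectional term-frequency method, the MeSH-based weighting, or the cohesion weighting that the paper proposes, none of which the abstract defines. All names and data are hypothetical.

```python
from collections import Counter


def cooccurring_terms(docs, anchor):
    """Count the documents in which each term appears together with `anchor`."""
    counts = Counter()
    for doc in docs:
        terms = set(doc)
        if anchor in terms:
            counts.update(terms - {anchor})
    return counts


def rank_b_terms(a_docs, c_docs, a_term, c_term):
    """Rank candidate B terms that link a_term (A literature) to c_term (C literature)."""
    a_side = cooccurring_terms(a_docs, a_term)
    c_side = cooccurring_terms(c_docs, c_term)
    shared = (a_side.keys() & c_side.keys()) - {a_term, c_term}
    # Assumed score: a B term is only as strong as its weaker link.
    scored = {b: min(a_side[b], c_side[b]) for b in shared}
    return sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy literatures; each document is a list of index terms.
a_docs = [
    ["fish oil", "blood viscosity"],
    ["fish oil", "platelet aggregation", "blood viscosity"],
    ["fish oil", "diet"],
]
c_docs = [
    ["Raynaud disease", "blood viscosity"],
    ["Raynaud disease", "platelet aggregation"],
    ["Raynaud disease", "blood viscosity", "cold"],
]
ranked = rank_b_terms(a_docs, c_docs, "fish oil", "Raynaud disease")
```

Terms that appear on only one side ("diet", "cold") never enter the B set, which is why the quality of the initial text sets directly limits the quality of the ranking.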

4.
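A Patent Text Mining Framework Based on Deep Indexing   Total citations: 1, self-citations: 1, citations by others: 0
The textual parts of patent documents, such as abstracts, claims, and full text, contain important technical details and information on the scope of protection. Mining these texts for latent information of technical and commercial value is a current research focus in patent information applications. This article studies the application of analysis-oriented deep indexing of patent text to patent text mining. The analysis goal serves as the basis for knowledge extraction from the data preprocessing stage onward, so patent analysts can, according to their needs, extract and process only the relevant part of the indexing results during text mining. This improves both the quality of data preprocessing and the efficiency of subsequent text analysis. This article is part of the special topic "Patent Applications in Scientific and Technological Innovation" in 《数字图书馆论坛》, issue 11, 2008.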

5.
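From the perspective of knowledge-oriented management of digital libraries, this article analyzes the Perseus Digital Library project: the construction, transmission, and resource formats of its ancient texts; cross-disciplinary knowledge management tools; the development of digital library software; and the deepening and broadening of document services. It also examines how the Perseus Digital Library collects documents, digitizes traditional paper texts, adds a range of tools, provides readers with real-time references for further reading, and achieves digital knowledge management of ancient, multi-text, multilingual documents.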

6.
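An Analysis of the Technical Methods of Typical Relation Extraction Systems   Total citations: 3, self-citations: 0, citations by others: 3
Entity relation extraction is an important task in information extraction. Based on an analysis of the current literature, systems, and projects on relation extraction, this article groups the methods for extracting entity relations from unstructured text into five categories: extraction built around pattern construction and matching; dictionary-driven extraction; extraction using machine learning algorithms; Ontology-assisted extraction; and extraction that combines several methods. It then analyzes typical relation extraction systems in depth in terms of their technical characteristics, the implementation details of their core modules, and their evaluation results. The systems covered are the EEES relation extraction system, the SVM relation extraction system, the T-Rex relation extraction system, and the hybrid relation extraction system of the KMI semantic web portal. The aim is to provide a useful reference for building further entity relation extraction systems. This article is part of the special topic "Knowledge Extraction" in issue 9, 2008.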

7.
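Topic Extraction and Topic Clustering in Practice on a Medical Literature Collection   Total citations: 1, self-citations: 0, citations by others: 1
The important keywords in a document reflect its core topics, so discovering and extracting a document's topics can be recast as extracting its set of important keywords. After surveying the techniques used outside China for topic extraction and clustering, this article proposes a technical scheme for extracting topics from textual information resources in the medical domain and determining their topic areas, and describes the topic clustering stage in detail. To verify the scheme, the article applies it to the field of osteoarthritis. The results indicate that the proposed scheme is effective in practice. This article is part of the special topic "Knowledge Extraction" in issue 9, 2008.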

8.
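A Review of Knowledge Discovery Research in China's Library and Information Field during the "Eleventh Five-Year Plan" Period   Total citations: 1, self-citations: 0, citations by others: 1
This review examines a large body of recent papers on knowledge discovery from several angles: clarification of concepts and their relationships, the system of knowledge discovery methods, text mining and text trend mining, knowledge discovery from non-related literature, and extensions of data mining research. It summarizes the results of knowledge discovery research in China's library and information field during the "Eleventh Five-Year Plan" period, with emphasis on progress in content analysis, association theory, domain-driven approaches, visualization, and text mining models. It concludes with an outlook on future research hotspots and directions in the field.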

9.
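Because important keywords represent a text strongly, keyword extraction and filtering play an important role in information retrieval, information extraction, and knowledge mining. After surveying current keyword extraction methods, this study combines existing thesauri and tools in the medical domain with the BM25F weighted term-frequency formula to propose a method for extracting and filtering important keywords from medical text. The method addresses two key problems: identifying and extracting keywords, and measuring their importance in order to filter them. Using the osteoarthritis literature from 2001-2007 as the data source, the study applies the method in practice and verifies its effectiveness, providing a workable route to important-keyword extraction in knowledge mining.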
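
The BM25F weighted term frequency mentioned above can be sketched as follows, in its commonly cited form: each field's raw term frequency is length-normalized, multiplied by a field weight, summed across fields, and then saturated. The abstract gives no parameters, so the field names, weights, and the `b` and `k1` values below are illustrative assumptions, and the IDF factor of the full score is omitted.

```python
def bm25f_weighted_tf(field_tf, field_len, avg_field_len, weights, b):
    """Combine per-field term frequencies into one length-normalized, weighted value."""
    total = 0.0
    for field, tf in field_tf.items():
        # Length normalization: fields longer than average are penalized when b > 0.
        norm = 1.0 - b[field] + b[field] * field_len[field] / avg_field_len[field]
        total += weights[field] * tf / norm
    return total


def saturate(weighted_tf, k1=1.2):
    """Apply BM25-style saturation so repeated occurrences give diminishing returns."""
    return weighted_tf / (k1 + weighted_tf)


# Hypothetical example: a term appears once in the title and twice in the abstract.
field_tf = {"title": 1, "abstract": 2}
field_len = {"title": 10, "abstract": 100}
avg_field_len = {"title": 10, "abstract": 100}
weights = {"title": 3.0, "abstract": 1.0}  # assumed: title counts three times as much
b = {"title": 0.75, "abstract": 0.75}      # assumed normalization strength

wtf = bm25f_weighted_tf(field_tf, field_len, avg_field_len, weights, b)
importance = saturate(wtf, k1=1.0)
```

Weighting by field lets an occurrence in a title or in assigned subject terms count for more than one in running text, which suits keyword importance ranking.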

10.
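[Purpose/Significance] Relation extraction from academic full text is a key technology for building knowledge graphs of academic full text. The resulting academic knowledge graph makes the literature structured and knowledge-oriented, improves the efficiency with which researchers retrieve and analyze literature and track research trends, and, through cognitive reasoning over the graph, aids the discovery of implicit knowledge. [Method/Process] Enhancing relation extraction with external knowledge has produced results in many studies, but relation extraction for specific domains often lacks usable external...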

11.
Knowledge flow between scientific disciplines has commonly been measured based on citation data. Previous studies using citing relationships have mostly considered direct citations and have paid little attention to indirect citations (IDC), which indicate how knowledge diffuses from one discipline to another via one or more intermediaries. In this study, we measured knowledge flow between disciplines from two perspectives: direct citations (DC) and discipline potential energy (DPE), a measure proposed to combine both direct and indirect citations. Data were collected from the Web of Science (WoS) database. Findings include: (1) DPE goes beyond previous measures by considering not only direct citations but also indirect citations between disciplines, which previous measures usually ignored, and reveals that the knowledge contribution of some disciplines, such as Physics and Engineering, had been underestimated by previous measures. (2) The proportion of IDC contribution is close to that of direct knowledge contribution when the discipline scale is removed, which suggests that it is essential to consider IDC to distinguish the knowledge relationship (net outflow/inflow) between disciplines. (3) Both measurements show that Biology & Biochemistry has always been the discipline with the highest net outflow of knowledge, which is inconsistent with the expectation from the history of science that Mathematics, Physics and Chemistry would be the disciplines with the highest net outflow. The results show that even considering IDC does not fully reveal the knowledge contribution and academic influence of disciplines. This paper also analyzes the potential reasons for citation bias in revealing the contribution of disciplinary knowledge from a citation perspective. Therefore, caution should be taken in the use of citations as a primary measure of knowledge flow.

12.
In science-technology research, papers and patents are used to represent science and technology, respectively. Detecting sleeping beauty (SB) papers and their princes in technology (patent field) could uncover dynamic knowledge contributions from science (paper field) to technology (patent field). However, previous studies have mainly focused on sleeping beauties in science. Some studies have examined SB patents in technology, but SB papers in patents are rarely studied and need further discussion. In addition, knowledge can flow along citations. Thus, if a paper is cited by one of a patent's references (indirect citation), it also contributes to the patent, even though the patent does not cite it directly. Indirect citations, however, are rarely discussed in sleeping beauty studies, which could lead to a loss of significant information. Therefore, to reveal the dynamic knowledge contribution from science to technology while considering indirect citations, this study proposes a new method of mining sleeping beauty papers in technology and their princes. The lithium-ion battery domain is selected as a case study. The findings are as follows: (1) Most papers do not contribute knowledge to technology continuously, even when considering indirect citations, and the time-varying knowledge contribution strength changes significantly over time. (2) The knowledge contribution strength with a time delay of more than 11 years accounts for 80% of the total knowledge contribution strength. It is suggested that the window period of paper publication evaluation be extended. (3) 22 sleeping beauty papers in technology are detected. Nine papers are among the top 10 regarding the total knowledge contribution strength. (4) The princes of 9 typical sleeping beauty papers in technology are all papers. This implies that the awakening of these papers in technology was in every case provoked by scientific development.

13.
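This article gives a comprehensive introduction to the new functions of the Institutional Knowledge Edition of 《中国生物医学期刊引文数据库》 (the Chinese biomedical journal citation database), newly developed by 解放军医学图书馆 (the medical library of the People's Liberation Army). The Institutional Knowledge Edition uses RDF semantics to describe institutions, authors, funds, journals, and documents in a unified way. Built around institution, author, fund, journal, and document databases, it mines and analyzes document content in depth based on document annotation information, thereby linking information resources across the biomedical field. It supports searching published literature, searching citations, and rapid generation of citation reports; it strengthens visual bibliometric analysis, publishes various TOP ranking statistics, and pushes related information. It also adds an institutional literature management module that can provide each institution with customized literature management and evaluation services.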

14.
In 2008 Meier and Conkling first tested Google Scholar's coverage of the engineering literature against citations gathered from the Compendex database. Since that time, other studies have used the same methodology and found improvement in Google Scholar's coverage. This study uses engineering dissertations from ProQuest Dissertations & Theses to create a data set of citations for comparing the fee-based databases Compendex and Scopus against Google Scholar. From 1950 to 2017 Google Scholar outperformed both Compendex and Scopus in discoverability of citations in nine engineering subjects. These results have implications for collection management and information literacy program planning for librarians.

15.
The number of received citations has been used as an indicator of the impact of academic publications. Developing tools to find papers that have the potential to become highly cited has recently attracted increasing scientific attention. Topics of concern to scholars may change over time in accordance with research trends, resulting in changes in received citations. Author-defined keywords, titles, and abstracts provide valuable information about a research article. This study applies latent Dirichlet allocation to extract topics and keywords from articles; five keyword popularity (KP) features are defined as indicators of the emerging trends of articles. Binary classification models built with several supervised learning techniques are used to predict whether papers are highly cited or less highly cited. We empirically compare the KP features of articles with other commonly used journal-related and author-related features proposed in previous studies. The results show that prediction models with KP features are more effective than those with journal and/or author features, especially in the management information system discipline.

16.
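Starting from the basic content of knowledge management research, this article compares information science with knowledge management. It points out that information science mainly studies the collection, processing, organization, and dissemination of explicit information, with an emphasis on the societal and individual levels, whereas knowledge management mainly studies the collection, processing, organization, and dissemination of tacit knowledge, with an emphasis on the organizational level. It also explores what knowledge management research implies for information science research, and presents a basic model for information science research together with a model of communication and transfer.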

17.
On Knowledge Management and Knowledge Innovation   Total citations: 109, self-citations: 0, citations by others: 109
The authors argue that, for knowledge management, the knowledge economy is a catalyst, human beings are the core, information is a tool, and knowledge innovation is both the objective and the approach. They then propose ideal patterns and development trends for knowledge management. However, present studies on knowledge management differ, ranging from those focusing on human beings or organisations to those focusing on information management. 9 refs.

18.
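Which factors affect the citation counts of academic papers is a classic research question in bibliometrics. Current research mainly addresses the relationship between citation counts and the content and formal features of papers; few studies approach the question from the perspective of text readability. Text readability affects how well readers understand the content and absorb the knowledge in a text, making it an important factor in the efficiency of knowledge dissemination and the recognition of research results. Controlling for the knowledge quality and authority of papers, this study uses five variables, including the text readability R value, to examine the effect of readability on citation counts. Using papers published from 2016 to 2020 in leading Chinese library and information science journals as the sample, the study finds that a paper's text readability R value, whether it uses a compound title, and whether it uses formulas and tables all have a significant effect on citation counts, whereas whether it uses figures does not. The study confirms that text readability has a substantive effect on paper impact in the Chinese-language context, and the results offer a useful reference for researchers seeking to improve their Chinese academic writing and increase the impact of their work.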

19.
This paper explores the impact of geographic proximity on the diffusion of knowledge in the form of publication citations, and argues that codified knowledge is transmitted faster in proximity and is subject to geographic constraints similar to those on tacit knowledge. The geographic proximity advantage would be particularly relevant in the early stage of dissemination. We collected three sets of research articles published in 1990, 2000 and 2010 and compared the longitudinal citations they received domestically and from abroad. The study found that domestic citations accumulate faster and reach their peak much earlier than foreign citations, and the difference is most evident in the first few years after publication. The result shows that geographic proximity does play a role in the speed of knowledge diffusion and points to a network effect for citations. Those located closer to the knowledge origin would be exposed to and react to publications faster because of the additional opportunities for research exchange and networking.

20.
The purpose of this study was to determine whether a computerized commercial selective dissemination of information (SDI) service could contribute to the services offered to the patrons of a specific medical library who were already participating in a manual SDI service. The citations generated by the two services were contrasted on the basis of literature coverage, timeliness of retrieval, and relevancy of output. Eighty-four percent of the discrete citations retrieved were from 664 periodicals subscribed to by both services. Only 16 percent of the total of 1,387 discrete citations were produced by both services. The manual service was more timely; and, although it produced fewer citations, a higher percentage of these were relevant. Numerically, a total of 346 useful citations were recovered by the manual service and 379 from the commercial service. It appears, therefore, that a computerized commercial SDI service could contribute to the services offered to the medical scientists participating in a manual SDI service.
