Similar literature
1.
Main path analysis (MPA) is an effective method, widely accepted in science and technology, for extracting knowledge diffusion paths. Traditional citation analysis assumes that all citations are equal. In contrast, this paper proposes a new MPA framework from the perspective of citation structure and content. Three indicators are used to adjust edge weights: (1) structural similarity, (2) topic similarity, and (3) sentiment analysis. The study takes the bullwhip effect and the Internet of Things domains as examples to verify the reliability and feasibility of the improved MPA. The results show that the improved main path uncovers knowledge trajectories appropriately and is able to distinguish among citations and detect important papers. This research enriches MPA theory and suggests future research directions from the perspective of citation structure and content.

2.
Main path analysis (MPA) is the most widely accepted approach to tracing knowledge transfer in a research field. In this study, we extracted multiple longest paths from the citation network of a multidisciplinary academic field and integrated topic modeling into the extracted paths. We consider three main aspects of trajectory analysis when analyzing the documents represented by the extracted paths: emergence, authority, and topic dynamics. For path extraction, we adopt a longest path algorithm that consists of three steps: 1) topological sort, 2) edge relaxation, and 3) multiple path extraction. To integrate topics into the multiple paths, we employ latent Dirichlet allocation (LDA), using the topic-document matrix that LDA derives to select a topic for each article in the citation network: each article is labeled with the topic that has the highest topical probability for that article. We conduct a series of experiments to examine the results on a dataset from the field of healthcare informatics provided by PubMed.
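
The three steps named in this abstract (topological sort, edge relaxation, path extraction) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an unweighted, acyclic citation graph given as (source, target) pairs, the function name `longest_path` is hypothetical, and it returns a single longest path rather than the multiple paths the study extracts.

```python
from collections import defaultdict, deque


def longest_path(edges):
    """Return one longest path in a DAG given as (source, target) pairs.

    Hypothetical helper for illustration; every edge has weight 1.
    """
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1
        nodes.update((u, v))
    if not nodes:
        return []

    # Step 1: topological sort (Kahn's algorithm).
    queue = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if len(order) != len(nodes):
        raise ValueError("citation graph contains a cycle")

    # Step 2: edge relaxation in topological order, keeping the
    # longest distance found so far and the predecessor that gave it.
    dist = {n: 0 for n in nodes}
    pred = {n: None for n in nodes}
    for u in order:
        for v in successors[u]:
            if dist[u] + 1 > dist[v]:
                dist[v] = dist[u] + 1
                pred[v] = u

    # Step 3: path extraction, walking predecessors back from the
    # node with the greatest distance.
    node = max(order, key=lambda n: dist[n])
    path = []
    while node is not None:
        path.append(node)
        node = pred[node]
    return path[::-1]


if __name__ == "__main__":
    toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("B", "E")]
    print(longest_path(toy_edges))
```

Extending the sketch to several paths would mean repeating step 3 from other end nodes or tracking all tied predecessors; the abstract does not say which variant the study uses.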

3.
Main Path Analysis (MPA) is widely used to trace the developmental trajectory of a technological field through a citation network. The citation-based traversal weight is usually used to select the most significant path. However, the themes of the documents along a main path may not be coherent, and the main paths of significant sub-fields in a domain can easily be missed. Furthermore, the global path search algorithm in conventional MPA suffers from high space complexity because of its exhaustive strategy. To address these limitations, a new method, named semantic MPA (sMPA), is proposed; it leverages semantic information in two steps, candidate path generation and main path selection. The source code is freely available. To demonstrate the advantages of the method, extensive experiments are conducted on a patent dataset on lithium-ion batteries in electric vehicles. Experimental results show that sMPA is capable of discovering more knowledge flows from important sub-fields and of improving the topical coherence of candidate paths.

4.
Patent prior art search is a type of search in the patent domain in which documents are sought that describe previous work related to a patent application. The goal of this search is to check whether the idea in the patent application is novel. Vocabulary mismatch is one of the main problems of patent retrieval, and it results in low retrievability of similar documents for a given patent application. In this paper we show how the term distribution of the cited documents in an initially retrieved ranked list can be used to address vocabulary mismatch. We propose a method for query model estimation which utilizes the citation links in a pseudo relevance feedback set. We first build a topic-dependent citation graph, starting from the initially retrieved set of feedback documents and using the citation links of the feedback documents to expand the set. We identify the important documents in the topic-dependent citation graph using a citation analysis measure. We then use the term distribution of the documents in the citation graph to estimate a query model by identifying the distinguishing terms and their respective weights, and we use these terms to expand the original query. We use the CLEF-IP 2011 collection to evaluate the effectiveness of our query modeling approach for prior art search. We also study the influence of different parameters on the performance of the proposed method. The experimental results demonstrate that the proposed approach significantly improves recall over a state-of-the-art baseline which uses the link-based structure of the citation graph but not the term distribution of the cited documents.

5.
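[Purpose/Significance] Based on the current state of technology evolution analysis methods, this paper proposes a multiple main path method that can highlight, at the micro level, the main lines of technological development in a given field. [Method/Process] Patent text mining and dynamic programming are applied to a patent citation network. Paths are searched using, as the heuristic strategy, the optimal sum of the semantic similarities of all patent pairs along a path. This yields several main paths, each focused on a specific topic, that give researchers an overview of how the main technology topics in the field have developed and how they relate to one another. [Result/Conclusion] Empirical results show that, applied to the field of hard disk drive heads, the method effectively extracts the evolution trajectories of the main technology topics.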

6.
Zhang Xian (张娴), Fang Shu (方曙). Library and Information Service (《图书情报工作》), 2016, 60(20): 140-148
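[Purpose/Significance] This paper summarizes existing research on main path methods for patent citation networks, providing theoretical support for future use of the method to identify key patented technologies and extract mainstream threads in the course of technological evolution. [Method/Process] Relevant research is systematically reviewed. Its content and characteristics are summarized from three perspectives (algorithm research, application research, and research on method optimization and extension), the limitations of current research are analyzed, and future research directions are discussed. [Result/Conclusion] The main limitations of current research are: insufficient disclosure of the plurality and systematic nature of the forces driving path development; neglect of how different citation relationships differ in their influence on path evolution; insufficient attention to the dynamics of evolution; and the fact that multiple main path methods remain, in essence, single-objective searches. Future research will focus on the following directions: substantive and innovative extension of the algorithmic ideas; greater emphasis on dynamics and prediction; improved computational efficiency to strengthen applicability and practicality; and exploiting the distinctive advantages of patent citation main paths in research on industrial diffusion.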

7.
This study presents a unique approach to investigating the knowledge diffusion structure of the field of data quality through an analysis of its main paths. We study a dataset of 1880 papers to explore the knowledge diffusion path, using citation data to build the citation network. The main paths are then investigated and visualized via social network analysis. The paper applies three different main path analyses (local, global, and key-route) to depict the knowledge diffusion path, and additionally uses the g-index and h-index to evaluate the most important journals and researchers in the data quality domain.

8.
This study investigated how Health Information National Trends Survey (HINTS) data from the U.S. National Cancer Institute (NCI) was cited in the scholarly literature. It addressed the following research questions: (1) What patterns of citations exist among authors of research articles using HINTS data? and (2) How is the citation format of HINTS data characterized? We collected scholarly articles that used HINTS data as primary or secondary data for analysis from Web of Science databases and HINTS publications on the NCI website. Among the resulting 250 articles, we identified citations to HINTS data themselves (data citations) and those to HINTS‐related documents (data‐related citations). Among the 250 articles, 156 articles (62.4%) cited HINTS data or HINTS‐related documents; only 29 articles (11.6%) cited HINTS data, while 127 (50.8%) cited HINTS‐related documents. Both data and data‐related citations increased over time. Data citation format varied, and 13 different compositions of citation elements were identified. Author, Title, and Location (URL) were common elements. The frequent use of URLs is undesirable due to URL instability. Furthermore, the data citations showed not only various compositions of citation elements but also ill‐defined element formats. Standardized citation formats are therefore needed.

9.
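[Purpose/Significance] A citation context is a sentence in a scientific paper that contains a citation; it is the descriptive or evaluative text about the cited work. Extracting and analyzing the cue words in citation contexts reveals their general characteristics. [Method/Process] Taking the Journal of Informetrics (JOI) as an example, three commonly used types of cue words are selected (personal pronouns, action verbs, and conjunctions), and their frequency, share, and rank in citation contexts are calculated. By comparing the presence of each type of cue word in citation and non-citation contexts, common sentence patterns and argumentation modes in citation contexts are identified. [Result/Conclusion] In JOI, citation contexts show the following characteristics: they favor first-person and third-person statements, presenting both others' work and the authors' own research; they lean toward citations of research methods, with "use", "base", and "study" as the common action verbs; and they emphasize argumentation through logical devices such as contrast and enumeration, with "also" and "but" as the most common conjunctions. Analyzing cue words in citation contexts is valuable for better understanding the functions and motivations of citation in scientific papers.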

10.
This article uses Google Scholar (GS) as a source of data to analyse Open Access (OA) levels across all countries and fields of research. All articles and reviews with a DOI and published in 2009 or 2014 and covered by the three main citation indexes in the Web of Science (2,269,022 documents) were selected for study. The links to freely available versions of these documents displayed in GS were collected. To differentiate between more reliable (sustainable and legal) forms of access and less reliable ones, the data extracted from GS was combined with information available in DOAJ, CrossRef, OpenDOAR, and ROAR. This allowed us to distinguish the percentage of documents in our sample that are made OA by the publisher (23.1%, including Gold, Hybrid, Delayed, and Bronze OA) from those available as Green OA (17.6%), and those available from other sources (40.6%, mainly due to ResearchGate). The data shows an overall free availability of 54.6%, with important differences at the country and subject category levels. The data extracted from GS yielded very similar results to those found by other studies that analysed similar samples of documents, but employed different methods to find evidence of OA, thus suggesting a relative consistency among methods.

11.
This paper presents an empirical analysis of two different methodologies for calculating national citation indicators: whole counts and fractionalised counts. The aim of our study is to investigate the effect on relative citation indicators when citations to documents are fractionalised among the authoring countries. We have performed two analyses: a time series analysis of one country and a cross-sectional analysis of 23 countries. The results show that all countries’ relative citation indicators are lower when fractionalised counting is used. Further, the difference between whole and fractionalised counts is generally greatest for the countries with the highest proportion of internationally co-authored articles. In our view there are strong arguments in favour of using fractionalised counts to calculate relative citation indexes at the national level, rather than using whole counts, which is the most common practice today.
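
The difference between the two counting schemes can be shown with a small sketch. It rests on simplifying assumptions that are not taken from the paper: each paper's citations are split equally among its distinct authoring countries, the function name `citation_counts` and the toy data are hypothetical, and the later step of normalising counts into relative citation indicators is left out.

```python
from collections import defaultdict


def citation_counts(papers):
    """Return (whole, fractional) citation totals per country.

    `papers` is an iterable of (countries, citations) pairs.
    Assumption for illustration: citations are split equally among
    the distinct countries on a paper.
    """
    whole = defaultdict(float)
    fractional = defaultdict(float)
    for countries, citations in papers:
        unique = set(countries)
        for country in unique:
            # Whole counting: every authoring country gets full credit.
            whole[country] += citations
            # Fractional counting: credit is divided among the countries.
            fractional[country] += citations / len(unique)
    return dict(whole), dict(fractional)


if __name__ == "__main__":
    toy_papers = [(["NO", "SE"], 10), (["NO"], 4)]
    whole, fractional = citation_counts(toy_papers)
    print(whole)
    print(fractional)
```

In the toy data the co-authored paper counts in full for both countries under whole counting but only by half under fractional counting, which is why countries with many internationally co-authored papers see the largest gap between the two schemes.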

12.
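This paper surveys 82 publications from 1967 to 2013 related to research on users' relevance judgments and selects the 55 that are representative in research method as the sample. The analysis finds that the sample literature shares many common and regular views and practices in methodological thinking and in concrete research methods, and that these can be organized into a reference framework. The framework is centered on three methodological principles: the situational dependence of relevance judgments, cognition as the main factor, and conducting research in real-world settings. Vertically it covers multiple levels, including methodological thinking, research strategy, and the design of concrete research plans; horizontally it involves key steps in research design, such as sample selection, data collection, and the formulation of analysis strategies. The paper argues that the cognitive viewpoint in the field of information seeking and retrieval is the key driver of the framework's formation, development, and further evolution, and analyzes the framework's future development on that basis.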

13.
The most common approach to measuring the effectiveness of Information Retrieval systems is by using test collections. The Contextual Suggestion (CS) TREC track provides an evaluation framework for systems that recommend items to users given their geographical context. The specific nature of this track allows the participating teams to identify candidate documents either from the Open Web or from the ClueWeb12 collection, a static version of the web. In the judging pool, the documents from the Open Web and the ClueWeb12 collection are distinguished. Hence, each system submission should be based on only one resource, either the Open Web (identified by URLs) or ClueWeb12 (identified by ids). To achieve reproducibility, ranking web pages from ClueWeb12 should be the preferred method for scientific evaluation of CS systems, but it has been found that systems that build their suggestion algorithms on top of input taken from the Open Web consistently achieve higher effectiveness. Because most of the systems take a rather similar approach to making contextual suggestions, this raises the question of whether systems built by researchers on top of ClueWeb12 are still representative of those that would work directly on industry-strength web search engines. Do we need to sacrifice reproducibility for the sake of representativeness? We study the difference in effectiveness between Open Web systems and ClueWeb12 systems by analyzing the relevance assessments of documents identified from both the Open Web and ClueWeb12. Then, we identify documents that overlap between the relevance assessments of the Open Web and ClueWeb12, observing a dependency between the relevance assessments and whether the document was taken from the Open Web or from ClueWeb12. After that, we identify documents from the relevance assessments of the Open Web which exist in the ClueWeb12 collection but do not exist in the ClueWeb12 relevance assessments. We use these documents to expand the ClueWeb12 relevance assessments.
Our main findings are twofold. First, our empirical analysis of the relevance assessments of 2 years of the CS track shows that Open Web documents receive better ratings than ClueWeb12 documents, especially if we look at the documents in the overlap. Second, our approach for selecting candidate documents from the ClueWeb12 collection based on information obtained from the Open Web makes an improvement step towards partially bridging the gap in effectiveness between Open Web and ClueWeb12 systems, while at the same time achieving reproducible results on a well-known representative sample of the web.

14.
One aspect of faculty effectiveness can be measured through research productivity, and publication and citation rates can serve as an indicator of that productivity. This study, the fourth in a series to examine LIS faculty and program productivity as measured by publication and citation, uses the same methodology as the previous investigations. A consistent data instrument (the Social Sciences Citation Index) provided publication and citation data for LIS faculty, covering the years 1999 to 2004. Tables show the faculty and programs with the highest publication and citation rates, both overall and per capita, as well as a cumulative ranking of LIS programs based on faculty research productivity. This study, in conjunction with the three previous studies, documents an increase in LIS research productivity, suggesting an increase in faculty effectiveness.

15.
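[Purpose/Significance] Interdisciplinary research has become an important paradigm and an inevitable trend of innovative research in modern science. Exploring the development patterns and evolution paths of the disciplines within an interdisciplinary field is important for revealing the dynamic process by which such a field forms and develops. [Method/Process] Taking the eye tracking (ET) field as an example, citation relationships among publications are extracted and labeled by discipline to build citation networks at the publication and discipline levels. For each discipline, the ratio of citations to other disciplines, the ratio of citations received from other disciplines, and the Price index are calculated to analyze, at the macro level, the interdisciplinary development patterns of the main disciplines in the ET field. Citation relationships among disciplines within and between stages are examined to explore the relational structure and changing roles of the disciplines at different stages of interdisciplinary development. Important publications connecting different disciplines are identified by the betweenness centrality of citations, and the citation relationships among important publications, highly cited publications, and references are examined to reveal, at the micro level, the specific evolution paths of the ET field. [Result/Conclusion] The ET field has passed through three stages (latent, developing, and mature) and shows three disciplinary development patterns (independent, cross-cutting, and learning). Citation relationships among disciplines become closer and more evenly distributed from stage to stage, with neurology, psychology, and clinical medicine at the core of interdisciplinary development and knowledge output. Vertically, the field develops through basic theoretical innovation in independent-type disciplines; horizontally, it develops through deep integration of the three types of disciplines, following an "independent, linear, networked" development path.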

16.
One of the main applications of citation is to find articles that are relevant to a particular article. However, not all citations are equally relevant to the target article. This paper presents an approach to identifying the most relevant citation(s). To this end, the Normalized Similarity Index (NSI) is proposed to quantify the similarity between the source and target of a citation based on the co-citations and references they share. To validate the method, NSI was calculated for five citation networks and compared with peer review grades for the relevance between the source and target articles. The results showed a significant correlation between the NSI ranks and those of peer review. Combined linkage (CL) and weighted direct citation (WDC) were also calculated from the same data. Compared with these other similarity measures, NSI in most cases did better at reproducing the peer rankings. Our principal conclusion is that the NSI can be used to prioritize the citations of a given highly cited article and to represent knowledge flow from the target article.

17.
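[Purpose/Significance] Probabilistic topic model algorithms are continually being improved and extended. This paper studies existing topic models, from China and abroad, that are built using citations, analyzes and compares the generative processes and algorithms of the different models, and discusses their application in the analysis of scientific and technical text and possible directions for extension. [Method/Process] Literature on citation-based topic models was retrieved from the Web of Science and CNKI databases, and representative publications were selected after manual reading. The citation-based topic models in these publications were compared and analyzed in terms of modeling idea, generative process, and parameter estimation and inference algorithms. [Result/Conclusion] Existing citation-based topic models mainly comprise models of research topics and citation distributions, models of the relationship between cited and citing topics, and citation topic models based on citation content. Introducing citation information into a topic model yields more complete topic content and the important publications under specific topics, and allows the relationships and influence between the topics of citing and cited publications to be identified. Most existing models extend the Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA) topic models. Future extensions may address topic models that incorporate citation content, performance optimization and evaluation methods for the models, and applied research on the models.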

18.
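[Purpose/Significance] To determine the strengths, weaknesses, and suitable use cases of the various methods for extracting key publications based on citation relationships, so that users can quickly capture the important literature of a field and grasp its overall picture. [Method/Process] Based on citation relationships among publications, methods for identifying a field's key publications are examined from three angles (citation frequency, citation network, and co-citation network) using software such as HistCite and CiteSpace, and the methods are compared through practical verification on data from the same source. [Result/Conclusion] The citation-frequency method is better suited to extracting key publications in terms of which publications in a specific field have had a major influence on the scientific progress of the literature as a whole; the resulting set of key publications is highly dispersed. The citation-network method is better suited to extracting the key publications in a field's development from the angle of its research dynamics; the resulting set is markedly concentrated. The co-citation-network method is better suited to extracting key publications from the angle of a field's research foundations; the resulting set is fairly concentrated, and the method can uncover many key publications missed in the original data collection.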

19.
This study presents a ranking of 182 academic journals in the field of artificial intelligence. For this, the revealed preference approach, also referred to as a citation impact method, was utilized to collect data from Google Scholar. This list was developed based on three relatively novel indices: h-index, g-index, and hc-index. These indices correlated almost perfectly with one another (ranging from 0.97 to 0.99), and they correlated strongly with Thomson's Journal Impact Factors (ranging from 0.64 to 0.69). It was concluded that journal longevity (years in print) is an important but not the only factor affecting an outlet's ranking position. Inclusion in Thomson's Journal Citation Reports is a must for a journal to be identified as a leading A+ or A level outlet. However, coverage by Thomson does not guarantee a high citation impact of an outlet. The presented list may be utilized by scholars who want to demonstrate their research output, various academic committees, librarians and administrators who are not familiar with the AI research domain.
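
Two of the indices named in this abstract can be computed from a plain list of per-article citation counts, as the sketch below shows. It uses the standard definitions (the h-index is the largest h such that h articles each have at least h citations; the g-index is the largest g such that the top g articles together have at least g² citations) and caps g at the number of articles. The function names and toy data are illustrative assumptions, and the hc-index, which also needs article ages, is not covered.

```python
def h_index(citations):
    """Largest h such that h articles each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g articles total at least g*g citations.

    Assumption: g is capped at the number of articles in the list.
    """
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g


if __name__ == "__main__":
    toy_citations = [10, 8, 5, 4, 3]
    print(h_index(toy_citations), g_index(toy_citations))
```

Because the g-index accumulates citations across the top articles, it is never smaller than the h-index for the same list, which is consistent with the two indices correlating closely.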

20.
Ding Jingda (丁敬达), Zheng Qiao (郑巧), Liu Chao (刘超). Library and Information Service (《图书情报工作》), 2021, 65(11): 143-152
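[Purpose/Significance] To review the theoretical and practical state of software citation and its standards, analyze the existing difficulties and problems, and promote the establishment of software citation norms and standards. [Method/Process] A literature survey identifies four challenges facing software citation: a culture of acceptance, reward systems, awareness of citation, and metadata. The theoretical exploration and practical progress made in China and abroad in response to these challenges are analyzed. [Result/Conclusion] Theoretical work (analysis of software citation stakeholders, software citation principles, and metadata standards) and practical work (citation guidelines from international research organizations and communities, support from related projects and repositories, staff training, the implementation of contribution allocation and reward schemes, and the provision of software citation files) have laid a good foundation for establishing software citation standards, but software citation stakeholders still need to work together to overcome the difficulties and challenges they face.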
