Similar literature
Found 20 similar documents (search time: 31 ms)
1.
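This paper constructs a feature space of the factors that influence the citation of academic papers and, taking our university's SCI and SSCI papers as an example, verifies the effectiveness and accuracy of machine learning models in predicting citation counts. The conclusions can serve as a reference for university libraries that provide decision-support services. The paper reviews prior research on the factors influencing citation counts and on prediction methods, builds the feature space by combining traditional bibliometric indicators with Altmetrics indicators, and experimentally compares three machine learning models (linear regression, neural networks, and support vector machines) for predicting citation counts. The results show that the feature space built from the Altmetrics perspective substantially improves prediction accuracy, and that the support vector machine model performs very well in the empirical study of predicting paper impact.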

2.
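[Purpose/Significance] Accurately predicting a paper's future citation diffusion pattern from its early post-publication features has potential value for evaluating scientific output and for the early detection of scientific breakthroughs. [Method/Process] Nine different citation diffusion patterns are summarized, and models are built on early temporal, structural, and document features to predict the pattern within a given future citation window. The American Physical Society citation dataset is used for an empirical study that examines prediction performance under different feature combinations. [Result/Conclusion] Temporal features contribute most to the prediction models, while structural and document features also play important roles. When all three feature types are combined, the accuracy of every prediction model exceeds 80%, which demonstrates the effectiveness of the selected features.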

3.
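[Purpose/Significance] To improve content-based citation analysis, this paper surveys and summarizes research in China and abroad on citation objects (引用对象), providing a reference for research on citation content analysis. [Method/Process] By surveying the related research, it reviews progress on the definition, classification schemes, application areas, and automatic identification of citation objects, summarizes the shortcomings of current research, and proposes future directions. [Result/Conclusion] Citation objects evaluate, at the semantic level, the contribution and utility of scholarly research in the literature, adding an important dimension to citation analysis. Research on citation objects needs to be deepened in three directions: in theory, by strengthening the study and analysis of multidimensional citation-object features; in technology, by exploring automatic identification methods based on large-scale corpora; and in application, by trying research evaluation services based on citation objects.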

4.
This paper explores a new indicator of journal citation impact, denoted as source normalized impact per paper (SNIP). It measures a journal's contextual citation impact, taking into account characteristics of its properly defined subject field, especially the frequency at which authors cite other papers in their reference lists, the rapidity of maturing of citation impact, and the extent to which a database used for the assessment covers the field's literature. It further develops Eugene Garfield's notions of a field's ‘citation potential’, defined as the average length of reference lists in a field and determining the probability of being cited, and of the need in fair performance assessments to correct for differences between subject fields. A journal's subject field is defined as the set of papers citing that journal. SNIP is defined as the ratio of the journal's citation count per paper to the citation potential in its subject field. It aims to allow direct comparison of sources in different subject fields. Citation potential is shown to vary not only between journal subject categories (groupings of journals sharing a research field) or disciplines (e.g., journals in mathematics, engineering and social sciences tend to have lower values than titles in life sciences), but also between journals within the same subject category. For instance, basic journals tend to show higher citation potentials than applied or clinical journals, and journals covering emerging topics show higher values than periodicals in classical subjects or more general journals. SNIP corrects for such differences. Its strengths and limitations are critically discussed, and suggestions are made for further research. All empirical results are derived from Elsevier's Scopus.
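
The ratio described in this abstract can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the published SNIP calculation: citation potential is taken as the plain mean reference-list length of the citing papers, the database-coverage correction mentioned in the abstract is ignored, and the function names and all numbers are hypothetical.

```python
from statistics import mean


def citation_potential(reference_list_lengths):
    """Average reference-list length of the papers citing a journal.

    Assumption: a plain mean, with no correction for database coverage.
    """
    if not reference_list_lengths:
        raise ValueError("need at least one citing paper")
    return mean(reference_list_lengths)


def snip_like_ratio(citations_per_paper, potential):
    """Citations per paper divided by the field's citation potential."""
    if potential <= 0:
        raise ValueError("citation potential must be positive")
    return citations_per_paper / potential


# Hypothetical journals: the low-citation field ends up with the higher
# normalized score because its citation potential is also low.
maths_score = snip_like_ratio(1.5, 0.5)
life_science_score = snip_like_ratio(4.0, 2.0)
```

With these illustrative inputs the mathematics journal scores higher than the life-science journal even though its raw citations per paper are lower, which is the kind of cross-field correction the abstract describes.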

5.
Biomedical research encompasses diverse types of activities, from basic science (“bench”) to clinical medicine (“bedside”) to bench-to-bedside translational research. It remains unclear, however, whether different types of research receive citations at different rates. Here we aim to answer this question by using a newly proposed paper-level indicator that quantifies the extent to which a paper is basic science or clinical medicine. Applying this measure to 5 million biomedical papers, we find a systematic citation disadvantage for clinically oriented papers; they tend to garner far fewer citations and are less likely to be hit works than papers oriented towards basic science. At the same time, clinical research has a higher variance in its citations. We also find that the citation difference between basic and clinical research decreases, yet still persists, if a longer citation window is used. Given the increasing adoption of short-term, citation-based bibliometric indicators in funding decisions, the under-citation of clinical research may discourage bio-researchers from venturing into the translation of basic scientific discoveries into clinical applications. This may help explain the gap between basic and clinical research that has been described as the “valley of death”, as well as the reported “extinction” risk of translational researchers. Our work may provide insights to policy-makers on how to evaluate different types of biomedical research.

6.
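[Purpose/Significance] Citation count has long been an important indicator for quantifying the academic impact of a paper. However, papers published in different disciplines and different years differ considerably because of factors such as the number of research papers in the field and citation lag. When two papers are compared, it is therefore difficult to judge their relative impact from the absolute citation counts alone. This paper designs a new computable mathematical model that gives each paper a standardized indicator, so that the academic impact of papers published in different disciplines and years can be compared directly. [Method/Process] By analyzing the citation-count distributions of papers in each discipline of Chinese science and technology journals for the years 2006 and 2017, and taking as a premise that the citation-count distribution of papers within a discipline is closest to a log-normal distribution, the paper proposes a citation standardization index, the Paper Citation Standardized Index (PCSI; Chinese name "论文引证标准化指数"). Papers selected in the China Association for Science and Technology's outstanding science and technology journal paper awards are then compared empirically with all papers in their disciplines. [Result/Conclusion] The results show that PCSI standardizes the citation counts of papers from different years and disciplines and reflects differences in citation counts on a linear scale, making it a fairly good tool for comparing the academic impact of individual papers.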
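
The abstract does not give the PCSI formula, so the following Python sketch is not PCSI itself. It only illustrates the general idea of standardizing citation counts under a log-normal assumption: counts from one discipline and year are log-transformed and converted to z-scores. The `log(c + 1)` transform, the function name, and the sample counts are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def log_z_scores(citations):
    """Z-scores of log(c + 1) within one discipline-year group.

    Assumption: log(c + 1) is used so that uncited papers (c = 0) stay
    defined; the PCSI paper may treat zeros differently.
    """
    logs = [math.log(c + 1) for c in citations]
    mu = mean(logs)
    sigma = pstdev(logs)
    if sigma == 0:
        # Every paper has the same count, so none stands out.
        return [0.0] * len(logs)
    return [(x - mu) / sigma for x in logs]


# Hypothetical citation counts for papers in one discipline and year.
scores = log_z_scores([0, 1, 3, 7, 15])
```

Because each score is relative to the paper's own discipline-year group, scores from different groups sit on a common scale, which is the property the abstract attributes to a standardized index.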

8.
Demonstrating the practical value of public research has been an important subject in science policy. Here we present a detailed study on the evolution of the citation linkage between life science related patents and biomedical research over a 37-year period. Our analysis relies on a newly created dataset that systematically links millions of non-patent references to biomedical papers. We find a large disparity in the volume of citations to science among technology sectors, with biotechnology and drug patents dominating it. The linkage has been growing exponentially over a long period of time, doubling every 2.9 years. The U.S. has been the largest producer of cited science for years, receiving nearly half of the citations. More than half of the citations go to universities. We use a new paper-level indicator to quantify to what extent a paper is basic research or clinical medicine. We find that the cited papers are likely to be basic research, yet a significant portion of papers cited in patents that are related to FDA-approved drugs are clinical research. The U.S. National Institutes of Health continues to be an important funder of cited science. For the majority of companies, more than half of the citations in their patents are to papers from public research. Taken together, these results indicate a continuous linkage of public science to private sector inventions.
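
The doubling time reported in this abstract can be converted to an annual growth rate with a short Python sketch. The only figure taken from the abstract is the 2.9-year doubling time; the function names are hypothetical, and steady exponential growth is assumed.

```python
def annual_growth_rate(doubling_time_years):
    """Yearly growth rate implied by a constant doubling time."""
    return 2.0 ** (1.0 / doubling_time_years) - 1.0


def growth_factor(doubling_time_years, elapsed_years):
    """Multiplicative growth after elapsed_years of steady doubling."""
    return 2.0 ** (elapsed_years / doubling_time_years)


# Doubling every 2.9 years, as stated in the abstract.
rate = annual_growth_rate(2.9)
```

A doubling time of 2.9 years corresponds to growth of roughly 27% per year.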

9.
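Citation curves of highly cited papers and "sleeping beauty" papers and their influencing factors   Total citations: 2, self-citations: 0, citations by others: 2
[Purpose/Significance] By analyzing the citation distributions of potential "sleeping beauty" papers and extracting their characteristics, this study aims to provide ideas for the early identification of such papers. [Method/Process] Using citation curves, a more intuitive way of showing a paper's citation distribution, and taking astronomy and astrophysics as the example field, citation data covering 10-20 years are compiled for the field's highly cited papers (given as "10的" in the original) and "sleeping beauty" papers, and the two citation distributions are compared. [Result/Conclusion] The study identifies the citation curve patterns and characteristics of the two types of paper: sustained-growth, single-peak, double-peak, and oscillating patterns for highly cited papers, and sustained-growth, single-peak, double-peak, oscillating, and stable patterns for "sleeping beauty" papers. The factors behind each pattern are discussed with respect to the citing literature and the direction in which research topics evolve, and the two types of paper are found to differ in the time they take to reach their citation peak.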

10.
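[Purpose/Significance] Exploring why papers go uncited is an indispensable part of research on citation distributions. It enriches and extends the scope of informetrics, helps identify the mechanisms that lead to uncitedness, and helps minimize the waste of research resources and improve the efficiency of scientific communication. [Method/Process] Using CSSCI as the source database and library, information and documentation science as the sample discipline, 200 scholars were randomly selected, and their first-author papers and the related citation data were collected. With a six-year measurement window, uncited rates were calculated for groups defined by eight extracted external feature factors, and non-parametric tests were used to check whether the factors show significant differences. [Result/Conclusion] All eight external factors have a significant effect on whether a paper goes uncited. The author's institution has a relatively small effect; the author's age at publication and the paper's length have relatively large effects; and the author's professional title at publication, the number of authors, the number of references, the number of keywords, and the funding category have roughly similar effects. The uncited rate for each factor changes sharply in the first three years and more gently in the last three. The time-series trends of the uncited rates differ from factor to factor, as does the stability of their effects.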

11.
Predicting the citation counts of academic papers is of considerable significance to scientific evaluation. This study used a four-layer Back Propagation (BP) neural network model to predict the five-year citations of 49,834 papers in the library, information and documentation field indexed by the CSSCI database and published from 2000 to 2013. We extracted six paper features, two journal features, nine author features, eight reference features, and five early citation features to make the prediction. The empirical experiments showed that the BP neural network performs significantly better than the six baseline models. The model predicted infrequently cited papers more accurately than frequently cited ones. We determined that five essential features have significant effects on the prediction performance of the model, i.e., ‘citations in the first two years’, ‘first-cited age’, ‘paper length’, ‘month of publication’, and ‘self-citations of journals’, and that the other features contribute only slightly to the prediction.

12.
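Qi Yan. Library and Information Service (图书情报工作), 2017, 61(24): 114-122
[Purpose/Significance] The speed of first citation is an important dimension of the impact of a document, its authors, or the journal that carries it, and to some extent it determines subsequent citations. This study attempts to overcome problems with some existing evaluation indicators, such as results with too little discrimination or even errors, limited applicability, and a poor fit with the trend toward shorter publication cycles. [Method/Process] Using detailed citation information, the study examines the feasibility of improving indicators by refining the timing unit, improves two existing types of indicator, and proposes new indicators of first-citation speed: the S-type indices (the SF and Sz indices) and the FM index. Because a scholar who has accumulated enough data usually has a relatively long publication time span and therefore richer data characteristics, 10 researchers in library, information and documentation science in China, taken from another author's study on the same topic, were selected for the empirical study. The first citing documents of 324 papers were retrieved from the CNKI citation database, and the indicators were calculated from the yearly or monthly time difference within each pair of cited paper and first citing paper. [Result/Conclusion] Judging from the old and new index values for the 10 scholars, the new FM index shows a marked improvement in discrimination and granularity over existing indicators. The new S-type indices have evaluative power similar to that of the h-index, and because their timing data are objective and stable they have wider applicability than the traditional S index. In addition, obtaining the raw data places few demands on the database and requires only some programming for data processing and computation, so the approach is quite feasible.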

13.
The top 1000 biomedical papers by number of citations are classified by method, type of method and non-methods by examination of citation contexts. Supervised machine learning is applied to the context data for a training sample of papers, which is then used to classify the full list, revealing that words indicating utility are most important for the classification of methods. Further word analysis is carried out using corpus linguistics to uncover context words that characterize non-methods. Hedging words are found to play an important role for non-methods, and several are selected for further analysis with logistic regression. Other variables in the regression are a consensus variable based on the similarity of contexts for a paper and another variable based on whether citations come from “methods” sections of citing papers. The accuracy of predictions from logistic regression is comparable to that of machine learning. The results are interpreted in terms of the perceived certainty or uncertainty of the underlying knowledge, that is, methods and their outputs have higher certainty, and non-methods higher uncertainty. Evidence is found that hedging is inversely related to citation frequency. Implications of this work for the study of the development of science and the role of methods and tools in biomedical research are discussed.

14.
With the advancement of science and technology, the number of academic papers published each year has increased almost exponentially. While the large number of research papers reflects the prosperity of science and technology, it also gives rise to some problems. Academic papers are the most direct embodiment of scholars' research results and can reflect the level of researchers. They are also a standard for evaluation and decision-making, such as promotion and the allocation of funds. How to measure the quality of an academic paper is therefore critical. The most common standard for measuring the quality of academic papers is their citation count, as this indicator is widely used in the evaluation of scientific publications. It also serves as the basis for many other indicators (such as the h-index). It is therefore very important to be able to predict the citation counts of academic papers accurately. To improve the effectiveness of citation count prediction, we approach the problem from the perspective of information cascade prediction and take advantage of deep learning techniques. We propose an end-to-end deep learning framework (DeepCCP), consisting of graph structure representation and recurrent neural network modules. DeepCCP directly uses the citation network formed in the early stage of the paper as the input, and outputs the citation count of the corresponding paper after a period of time. It exploits only the structural and temporal information of the citation network, and does not require other additional information. According to experiments on two real academic citation datasets, DeepCCP is shown to be superior to the state-of-the-art methods in terms of the accuracy of citation count prediction.

15.
In an age of intensifying scientific collaboration, the counting of papers by multiple authors has become an important methodological issue in scientometrics-based research evaluation. In particular, how counting methods influence institutional-level research evaluation has not been studied in the existing literature. In this study, we selected the top 300 universities in physics in the 2011 HEEACT Ranking as our study subjects. We compared the university rankings generated from four different counting methods (i.e. whole counting, straight counting using first author, straight counting using corresponding author, and fractional counting) to show how paper counts, citation counts and the subsequent university ranks were affected by the choice of counting method. The counting was based on the records of 1988–2008 physics papers indexed in ISI WoS. We also observed how paper and citation counts were inflated by whole counting. The results show that counting methods affected the universities in the middle range more than those in the upper or lower ranges. Citation counts were also more affected than paper counts. The correlations between the rankings generated from whole counting and those from the other methods were low or negative in the middle ranges. Based on the findings, this study concluded that straight counting and fractional counting were better choices for paper counts and citation counts in institutional-level research evaluation.
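
The four counting methods named in this abstract can be sketched in Python as follows. This is an illustrative sketch under assumptions, not the study's procedure: each paper is represented by a hypothetical record holding one institution per author (in author order) plus the corresponding author's institution, and fractional counting gives each author an equal share of one credit. The record layout, function name, and sample data are invented for the example.

```python
from collections import defaultdict


def count_papers(papers, method):
    """Credit institutions for papers under one of four counting methods."""
    totals = defaultdict(float)
    for paper in papers:
        affiliations = paper["affiliations"]  # one institution per author
        if method == "whole":
            # Every participating institution gets a full credit.
            for inst in set(affiliations):
                totals[inst] += 1.0
        elif method == "first":
            # Straight counting: only the first author's institution.
            totals[affiliations[0]] += 1.0
        elif method == "corresponding":
            # Straight counting: only the corresponding author's institution.
            totals[paper["corresponding"]] += 1.0
        elif method == "fractional":
            # Each author carries an equal share of one credit.
            share = 1.0 / len(affiliations)
            for inst in affiliations:
                totals[inst] += share
        else:
            raise ValueError(f"unknown counting method: {method}")
    return dict(totals)


# Hypothetical data: one three-author paper and one single-author paper.
SAMPLE_PAPERS = [
    {"affiliations": ["A", "B", "B"], "corresponding": "B"},
    {"affiliations": ["A"], "corresponding": "A"},
]
```

On the sample data, whole counting hands out three credits for two papers, while the straight and fractional methods hand out exactly two, which illustrates the inflation under whole counting that the abstract mentions.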

16.
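Citation analysis based on citation counts cannot directly reveal what a paper is about, and keywords or topic terms extracted from titles, abstracts, and full texts rarely give an objective picture of why a paper is cited. Taking highly cited papers in carbon nanotube fiber research as the study object, this paper extracts citation content and identifies topics. Manual verification shows that the core topics identified through citation content analysis reveal well why highly cited papers are cited (the citation motivation) and are consistent with the papers' research content. Compared with topic identification based on the full text or on titles and abstracts, topics identified from citation content are more representative, reveal the research content of the cited documents effectively, and are an important supplement to the information in the original text. The experiments show that topic identification for highly cited papers based on citation content analysis is feasible and effective. 4 figures. 4 tables. 31 references.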

17.
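The traditional model of quality evaluation and information exchange for medical papers is changing. Peer review methods have obvious flaws, and reform has been weak. The use of citation analysis and impact factors is controversial and has strayed from its proper course. Combining the open access publishing model with traditional peer review may become one way of evaluating medical papers that is genuinely "open, fair, and just".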

18.
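A preliminary analysis of the types of citation of scientific papers   Total citations: 1, self-citations: 0, citations by others: 1
Zhang Wei. Library and Information Service (图书情报工作), 2010, 54(16): 59-62
A large rise in the citation rate of scientific papers does not mean that China's scientific and technological competitiveness will rise by the same measure. Citations can be negative or positive. Positive citations can be further divided into perfunctory citations, effective citations, in-depth citations, and developmental citations. Only a high rate of developmental citation shows that research results have strong vitality. The paper therefore proposes a combined method that evaluates the quality of scientific papers by indexing, citation, and citation depth, with the aim of providing a reference for improving library and information work in China.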

19.
One important reason for the use of field categorization in bibliometrics is the necessity to make citation impact of papers published in different scientific fields comparable with each other. Raw citations are normalized by using field-categorization schemes to achieve comparable citation scores. There are different approaches to field categorization available. They can be broadly classified as intellectual and algorithmic approaches. A paper-based algorithmically constructed classification system (ACCS) was proposed which is based on citation relations. Using a few ACCS field-specific clusters, we investigate the discriminatory power of the ACCS. The micro study focusses on the topic ‘overall water splitting’ and related topics. The first part of the study investigates intellectually whether the ACCS is able to identify papers on overall water splitting reliably and validly. Next, we compare the ACCS with (1) a paper-based intellectual (INSPEC) classification and (2) a journal-based intellectual classification (Web of Science, WoS, subject categories). In the last part of our case study, we compare the average number of citations in selected ACCS clusters (on overall water splitting and related topics) with the average citation count of publications in WoS subject categories related to these clusters. The results of this micro study question the discriminatory power of the ACCS. We recommend larger follow-up studies on broad datasets.

20.
Journal of Informetrics, 2019, 13(2): 485-499
With the growing number of scientific papers published worldwide, the need for evaluation and quality assessment methods for research papers is increasing. Scientific fields such as scientometrics, informetrics, and bibliometrics establish quantified analysis methods and measurements for evaluating scientific papers. In this area, an important problem is to predict the future influence of a published paper. In particular, early discrimination between influential papers and insignificant papers may find important applications. In this regard, one of the most important metrics is the number of citations to the paper, since this metric is widely used in the evaluation of scientific publications and, moreover, serves as the basis for many other metrics such as the h-index. In this paper, we propose a novel method for predicting the long-term citations of a paper based on the number of its citations in the first few years after publication. To train a citation count prediction model, we employed an artificial neural network, a powerful machine learning tool with growing applications in many domains including image and text processing. The empirical experiments show that our proposed method outperforms state-of-the-art methods with respect to prediction accuracy in both yearly and total prediction of the number of citations.
