Similar literature
 20 similar documents found (search time: 160 ms)
1.
With the advancement of science and technology, the number of academic papers published each year has increased almost exponentially. While the large number of research papers reflects the prosperity of science and technology, it also gives rise to some problems. Academic papers are the most direct embodiment of scholars' research results and reflect the level of the researchers. They are also the basis for evaluation and decision-making about researchers, such as promotion and the allocation of funds. Therefore, how to measure the quality of an academic paper is critical. The most common standard for measuring the quality of an academic paper is its citation count, as this indicator is widely used in the evaluation of scientific publications. It also serves as the basis for many other indicators (such as the h-index). Therefore, it is very important to be able to accurately predict the citation counts of academic papers. To improve the effectiveness of citation count prediction, we approach the problem from the perspective of information cascade prediction and take advantage of deep learning techniques. We propose an end-to-end deep learning framework (DeepCCP) consisting of graph structure representation and recurrent neural network modules. DeepCCP directly takes the citation network formed in the early stage of a paper as input and outputs the citation count of that paper after a period of time. It exploits only the structural and temporal information of the citation network and requires no additional information. In experiments on two real academic citation datasets, DeepCCP is shown to be superior to state-of-the-art methods in terms of the accuracy of citation count prediction.

2.
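Feature representation of academic literature is a key step in academic big-data services such as literature search, classification and organization, and personalized recommendation. Studies have shown that graph neural networks can learn feature representations of documents effectively. However, current research concentrates on supervised learning methods, which place high demands on dataset size and quality and produce document representations that are tightly coupled to a specific task. This paper therefore introduces four unsupervised graph neural network methods into representation learning for academic literature. Document representation vectors are learned from the citation, co-citation, and bibliographic coupling networks of the Cora, CiteSeer, and DBLP (database systems and logic programming) datasets and applied to two downstream tasks: document classification and paper recommendation. The results show that (1) the deep mutual-information graph neural network is well suited to document classification, whereas the adversarially regularized variational graph autoencoder performs better on paper recommendation; and (2) on the Cora dataset, the citation network is better suited than the co-citation and bibliographic coupling networks to learning general-purpose document representation vectors.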

3.
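[Purpose/significance] To review the influencing factors and prediction methods for citations of academic papers, analyze existing problems, and propose directions for development. [Method/process] Using a literature survey, the paper reviews research progress in China and abroad and summarizes the content and characteristics of the influencing factors and prediction methods. [Result/conclusion] Existing influencing-factor indicators are numerous and lack a unified standard; the theoretical basis of prediction methods is weak; research on the dynamics of citation prediction is insufficient; and the generality of prediction models is limited. Future work should strengthen theoretical research on citation prediction, combine traditional bibliometrics with altmetrics more closely, apply natural language processing in greater depth, establish unified baseline standards, and build more accurate prediction models.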

4.
[Purpose/significance] This paper summarizes the influencing factors and prediction methods of academic paper citation, analyzes the existing problems, and proposes future development directions. [Method/process] This paper uses the literature research method to review research progress at home and abroad, and summarizes the relevant content and characteristics of influencing factors and prediction methods. [Result/conclusion] There are many indicators of influencing factors, but there are no unified selection criteria. The theoretical basis of prediction methods is weak. Research on the dynamics of citation prediction is insufficient. The generality of prediction models is limited. In the future, we should strengthen theoretical research on citation prediction methods, the combination of traditional bibliometrics and alternative metrics, and the deep application of natural language processing, and establish a unified baseline standard and a more accurate prediction model.

5.
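This paper constructs a novel feature space of factors influencing the citation of academic papers and, taking our university's SCI and SSCI papers as an example, verifies the effectiveness and accuracy of machine learning models in predicting citation frequency; the conclusions can serve as a reference for university libraries providing decision-support services. The paper reviews related research on the factors influencing citation frequency and on prediction methods, combines traditional bibliometric and Altmetrics indicators to build the feature space, and experimentally compares the effectiveness and accuracy of three machine learning models (linear regression, neural network, and support vector machine) in predicting citation frequency. The analysis shows that the feature space built from an Altmetrics perspective substantially improves prediction accuracy, and that the support vector machine model performs excellently in the empirical study of predicting the impact of academic papers.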

6.
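[Purpose/significance] To address the question of which citation network to use for detecting emerging trends, this study compares the effectiveness of the group dynamics method on direct citation networks and bibliographic coupling networks. [Method/process] The study first constructs direct citation, bibliographic coupling, and co-citation networks and analyzes their characteristics, and then carries out an empirical study on the bibliographic coupling network using the group dynamics method. [Result/conclusion] Comparison with previous results shows that the group dynamics method predicts emerging trends better than the baseline method on the direct citation network, but does not outperform the baseline on the bibliographic coupling network.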

7.
Traditionally, citation count has served as the main evaluation measure for a paper's importance and influence. In turn, many evaluations of authors, institutions and journals are based on aggregations over papers (e.g. the h-index). In this work, we explore measures defined on the citation graph that offer a more intuitive insight into the impact of a paper than the superficial count of citations. Our main argument focuses on identifying influence as an expression of the citation density in the subgraph of citations built for each paper. We propose two measures that capitalize on the notion of density, providing researchers with alternative evaluations of their work. While the general idea of a paper's impact can be viewed as how many researchers have shown interest in a piece of work, the proposed measures are based on the hypothesis that a piece of work may have influenced some papers even if they do not contain references to it. The proposed measures are also extended to researchers and journals.

8.
Predicting the citation counts of academic papers is of considerable significance to scientific evaluation. This study used a four-layer Back Propagation (BP) neural network model to predict the five-year citations of 49,834 papers in the library, information and documentation field indexed by the CSSCI database and published from 2000 to 2013. We extracted six paper features, two journal features, nine author features, eight reference features, and five early citation features to make the prediction. The empirical experiments showed that the BP neural network performs significantly better than the six baseline models. In terms of prediction effect, the model was more accurate for infrequently cited papers than for frequently cited ones. We determined that five essential features have significant effects on the prediction performance of the model, i.e., ‘citations in the first two years’, ‘first-cited age’, ‘paper length’, ‘month of publication’, and ‘self-citations of journals’, while the other features contribute only slightly to the prediction.

9.
We evaluate author impact indicators and ranking algorithms on two publication databases using large test data sets of well-established researchers. The test data consists of (1) ACM fellowships and (2) various lifetime achievement awards. We also evaluate different approaches to dividing the credit for papers among co-authors and analyse the impact of self-citations. Furthermore, we evaluate different graph normalisation approaches for computing PageRank on author citation graphs. We find that PageRank outperforms citation counts in identifying well-established researchers. This holds true when PageRank is computed on author citation graphs and also when PageRank is computed on paper graphs and paper scores are divided among co-authors. In general, the best results are obtained when co-authors receive an equal share of a paper's score, independent of which impact indicator is used to compute paper scores. The results also show that removing author self-citations improves the results of most ranking metrics. Lastly, we find that it is more important to personalise the PageRank algorithm appropriately on the paper level than to decide whether to include or exclude self-citations. However, on the author level, we find that author graph normalisation is more important than personalisation.
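
The approach described in this abstract, PageRank on a paper citation graph followed by an equal split of each paper's score among its co-authors, can be sketched roughly as follows. This is only an illustrative sketch and not the authors' implementation: the function names `paper_pagerank` and `author_scores`, the damping factor of 0.85, the fixed iteration count, and the uniform treatment of papers that cite nothing are all assumptions.

```python
def paper_pagerank(citations, damping=0.85, iterations=100):
    """Plain PageRank by power iteration on a paper citation graph.

    citations maps each paper to the list of papers it cites.
    damping and iterations are assumed values, not taken from the abstract.
    """
    papers = set(citations)
    for cited_list in citations.values():
        papers.update(cited_list)
    n = len(papers)
    scores = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        # Papers with no outgoing citations spread their score uniformly (assumption).
        dangling = sum(scores[p] for p in papers if not citations.get(p))
        new_scores = {p: (1 - damping) / n + damping * dangling / n for p in papers}
        for citing, cited_list in citations.items():
            if cited_list:
                share = damping * scores[citing] / len(cited_list)
                for cited in cited_list:
                    new_scores[cited] += share
        scores = new_scores
    return scores


def author_scores(paper_scores, authors_of):
    """Give every co-author an equal share of each paper's score."""
    totals = {}
    for paper, score in paper_scores.items():
        authors = authors_of.get(paper, [])
        if not authors:
            continue  # papers with unknown authors contribute nothing
        for author in authors:
            totals[author] = totals.get(author, 0.0) + score / len(authors)
    return totals


# Hypothetical toy data: papers A and B both cite paper C.
toy_citations = {"A": ["C"], "B": ["C"], "C": []}
toy_authors = {"A": ["x"], "B": ["x", "y"], "C": ["y", "z"]}
ranking = author_scores(paper_pagerank(toy_citations), toy_authors)
```

Because every toy paper has at least one author, the author scores in `ranking` sum to the same total as the paper scores.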

10.
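[Purpose/significance] Citation frequency reflects only the macro-level impact of a paper and cannot reveal the specific role and influence the paper has in others' research. This paper therefore proposes analyzing a paper's impact from two aspects of the citation content: topic and function. [Method/process] Taking the highly cited papers of J. O'Keefe, winner of the 2014 Nobel Prize in Physiology or Medicine, as an example, the study first analyzes the topics of the citation content with bibliometric methods and visualizes the scope and fields of influence; second, from the perspective of citation nature and function, it classifies the citation content into positive, negative, and neutral citations; finally, it further divides neutral citations into three categories: research background, theoretical basis, and experimental basis. [Result/conclusion] The results show that co-word analysis expresses the topic areas of a paper's influence well, and that classifying citation content reveals the multiple reasons a paper is cited. In this experiment there were no negative citations, more than 10% of citations were positive, and about 50% of neutral citations were cases where authors introduced research related to the citing paper in the research background section.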

11.
In the past, recursive algorithms, such as PageRank originally conceived for the Web, have been successfully used to rank nodes in the citation networks of papers, authors, or journals. Unlike citation counts, they have been shown to capture prestige rather than popularity. However, bibliographic networks, in contrast to the Web, have some specific features that enable the assigning of different weights to citations, thus adding more information to the process of finding prominence. For example, a citation between two authors may be weighted according to whether and when those two authors collaborated with each other, which is information that can be found in the co-authorship network. In this study, we define a couple of PageRank modifications that weight citations between authors differently based on the information from the co-authorship graph. In addition, we put emphasis on the time of publications and citations. We test our algorithms on the Web of Science data of computer science journal articles and determine the most prominent computer scientists in the 10-year period of 1996–2005. Besides a correlation analysis, we also compare our rankings to the lists of ACM A. M. Turing Award and ACM SIGMOD E. F. Codd Innovations Award winners and find that the new time-aware methods outperform standard PageRank and its time-unaware weighted variants.

12.
A citation is a well-established mechanism for connecting scientific artifacts. Citation networks are used by citation analysis for a variety of reasons, prominently to give credit to scientists’ work. However, because of current citation practices, scientists tend to cite only publications, leaving out other types of artifacts such as datasets. Datasets then do not get appropriate credit even though they are increasingly reused and experimented with. We develop a network flow measure, called DataRank, aimed at closing this gap. DataRank assigns a relative value to each node in the network based on how citations flow through the graph, differentiating publication and dataset flow rates. We evaluate the quality of DataRank by estimating its accuracy at predicting the usage of real datasets: web visits to GenBank and downloads of Figshare datasets. We show that DataRank is better at predicting this usage than the alternatives while offering additional interpretable outcomes. We discuss improvements to citation behavior and algorithms to properly track and assign credit to datasets.

13.
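[Purpose/significance] To explore the role of citation-frequency position indicators in evaluating scientific journals and to determine the optimal position indicator for a suitable time window. [Method/process] Fourteen eligible ophthalmology journals were selected from the Web of Science database. For each journal, different position indicators for 2014 were calculated: the h-index (h2, h5, h8, and h10) and cumulative h-index (a-h2, a-h5, a-h8, and a-h10) for 2-, 5-, 8-, and 10-year citation time windows (CTW), together with the corresponding percentage rank position (PRP) indicators of the journal's 2014 citation frequency (Top1%, Top5%, Top10%, Top25%, and Top50%) and cumulative PRP indicators (a-Top1%, a-Top5%, a-Top10%, a-Top25%, and a-Top50%). The correlations of the impact factor and of the position indicators for different CTWs with journal questionnaire survey scores were compared to determine how well each position indicator works for journal evaluation. [Result/conclusion] Well-chosen position indicators outperform the impact factor and the 5-year impact factor in evaluating journal influence; cumulative citation-frequency position indicators are generally better than annual ones; the h-index with a 2-year CTW is better than the h-index with other CTWs; and a-h2 and h2 with a 5-year CTW, and a-Top50% and Top50% with 5- and 8-year CTWs, evaluate journals better than the impact factor and the 5-year impact factor.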
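
For readers unfamiliar with the h-index variants named in this abstract, the following is a minimal sketch of an h-index restricted to a citation time window. It is an illustration under stated assumptions, not the study's procedure: the function names are hypothetical, and the window is assumed to span `window_years` calendar years starting with the publication year, which the abstract does not specify.

```python
def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def windowed_h_index(citation_years, publication_year, window_years):
    """h-index counting only citations received inside the citation time window.

    citation_years holds one list per paper with the year of each citation.
    Assumption: the window covers publication_year through
    publication_year + window_years - 1, inclusive.
    """
    last_year = publication_year + window_years - 1
    counts = [
        sum(1 for year in years if publication_year <= year <= last_year)
        for years in citation_years
    ]
    return h_index(counts)


# Hypothetical journal with three papers published in 2014.
toy_papers = [[2014, 2015, 2016], [2015, 2019], [2020]]
h2 = windowed_h_index(toy_papers, 2014, 2)    # 2-year window
h10 = windowed_h_index(toy_papers, 2014, 10)  # 10-year window
```

A longer window can only add citations to each paper, so the windowed value never decreases as the window grows.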

14.
The number of clinical citations received from clinical guidelines or clinical trials has been considered one of the most appropriate indicators for quantifying the clinical impact of biomedical papers. Therefore, the early prediction of the clinical citation count of biomedical papers is critical to scientific activities in biomedicine, such as research evaluation, resource allocation, and clinical translation. In this study, we designed a four-layer multilayer perceptron neural network (MPNN) model to predict the future clinical citation count of biomedical papers, using 9,822,620 biomedical papers published from 1985 to 2005. We extracted ninety-one paper features from three dimensions as the input of the model, including twenty-one features in the paper dimension, thirty-five in the reference dimension, and thirty-five in the citing paper dimension. In each dimension, the features can be classified into three categories, i.e., citation-related features, clinical translation-related features, and topic-related features. In the paper dimension, we also considered features that have previously been demonstrated to be related to the citation counts of research papers. The results showed that the proposed MPNN model outperformed the other five baseline models, and that the features in the reference dimension were the most important. In all three dimensions, the citation-related and topic-related features were more important than the clinical translation-related features for the prediction. It also turned out that the features helpful in predicting the citation count of papers are not important for predicting the clinical citation count of biomedical papers. Furthermore, we explored the MPNN model on different categories of biomedical papers. The results showed that the clinical translation-related features were more important for predicting the clinical citation count of basic papers than for papers closer to clinical science. This study provides a novel dimension (i.e., the reference dimension) for the research community and could be applied to other related research tasks, such as research assessment for translational programs. In addition, the findings of this study could help biomedical authors (especially those in basic science) attract more attention from clinical research.

15.
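[Purpose/significance] This article studies a self-organization method for the implicit diffusion paths of science and technology policy, mines the deep semantic information contained in policy texts, and makes tacit knowledge explicit, providing a reference for researchers seeking to extend and enrich research on policy diffusion paths. [Method/process] Combining the formal semantics and content semantics of policy documents, the study structures and deeply mines the policy texts, fully parses the text resources, and extracts the features they contain. The techniques include automatic acquisition and indexing of concepts and relations and network representation learning, which mine the implicit structural information in the policy texts; a deep learning approach based on the BiLSTM-CRF model performs the automatic acquisition of concepts and the automatic indexing of relations. The concepts and relations obtained from multiple policy texts are assembled into concept-relation pairs, and representation learning is used to find a dense vector representation for each node. [Result/conclusion] Experiments verify the effectiveness of the proposed self-organization method for implicit policy diffusion paths based on implicit path features. The work extends the methods of policy research to some extent and provides a reference for researchers studying policy diffusion.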

16.
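A preliminary analysis of the types of citations to scientific papers   Total citations: 1, self-citations: 0, citations by others: 1
Zhang Wei (张微), Library and Information Service (《图书情报工作》), 2010, 54(16): 59-62
A large increase in the citation rate of scientific papers does not mean that China's scientific and technological competitiveness will naturally rise by a similar margin. Citations can be negative or positive, and positive citations can be divided into routine citations, effective citations, in-depth citations, and developmental citations. Only a high rate of developmental citation indicates that research results have strong vitality. The paper therefore proposes a comprehensive method for evaluating the quality of scientific papers based on indexing, citation, and citation depth, with the aim of providing a reference for improving library and information work in China.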

17.
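An empirical analysis of web citations in Chinese library and information science journals leads to the following conclusions: the proportion of web citations in HTML format is declining year by year, while the proportions of PDF-format and dynamic web citations are gradually rising; new types of online scholarly information such as wikis, blogs, and forums are increasingly recognized and accepted by Chinese library and information science scholars; the traceability of dynamic web citations is slightly higher than that of static web citations, but the traceability rates of both lie between 50% and 51%; and web citations on .edu domains have relatively poor traceability.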

18.
Citation behaviour is the underlying driver of scientific dynamics, and it is essential to understand its effect on knowledge diffusion and intellectual structure. This study explores the effect of citation behaviour on disciplinary knowledge diffusion and intellectual structure by comparing three types of citation behaviour trends, namely the high citation trend, medium citation trend, and low citation trend. The diffusion power, diffusion speed, and diffusion breadth were calculated to quantify knowledge diffusion. The properties of the global and local citation network structure were used to reflect the particular influences of citation behaviour on the scientific intellectual structure. The primary empirical results show that (a) the high citation behaviour trend could improve the knowledge diffusion speed for papers with a short citation history span. Additionally, the medium citation trend has the broadest diffusion breadth, whereas the low citation behaviour trend might make the citation counts take off for papers with a long citation history span; (b) the high citation trend has a stronger influence and greater control over the intellectual structure, but this relationship holds only for papers with a short or normal citation history span. These findings could play important roles in scientific research evaluation and impact prediction.

19.
Wide differences in publication and citation practices make it impossible to compare raw citation counts directly across scientific disciplines. Recent research has studied new and traditional normalization procedures aimed at suppressing these disproportions in citation numbers among scientific domains as far as possible. Using the recently introduced IDCP (Inequality due to Differences in Citation Practices) method, this paper rigorously tests the performance of six cited-side normalization procedures based on the Thomson Reuters classification system consisting of 172 sub-fields. We use six yearly datasets from 1980 to 2004, with widely varying citation windows from the publication year to May 2011. There are three main findings. Firstly, as observed in previous research, within each year the shapes of sub-field citation distributions are strikingly similar. This paves the way for several normalization procedures to perform reasonably well in reducing the effect on citation inequality of differences in citation practices. Secondly, independently of the year of publication and the length of the citation window, the effect of such differences represents about 13% of total citation inequality. Thirdly, a recently introduced two-parameter normalization scheme outperforms the other normalization procedures over the entire period, reducing citation disproportions to a level very close to the minimum achievable given the data and the classification system. However, the traditional procedure of using sub-field mean citations as normalization factors also yields good results.
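
The traditional procedure mentioned at the end of this abstract, dividing each paper's citation count by the mean citation count of its sub-field, can be illustrated with a short sketch. The function name, the input layout, and the choice to return 0.0 for a sub-field whose mean is zero are assumptions made for the example; the abstract does not describe these details.

```python
from collections import defaultdict


def normalize_by_subfield_mean(papers):
    """Divide each paper's citations by the mean citations of its sub-field.

    papers is a list of (paper_id, subfield, citations) tuples.
    Returns a dict mapping paper_id to its normalized citation score.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for _, subfield, citations in papers:
        totals[subfield] += citations
        counts[subfield] += 1

    normalized = {}
    for paper_id, subfield, citations in papers:
        mean = totals[subfield] / counts[subfield]
        # Assumption: a sub-field with no citations at all yields a score of 0.0.
        normalized[paper_id] = citations / mean if mean > 0 else 0.0
    return normalized


# Hypothetical data: the second sub-field is cited far more heavily than the first.
toy_papers = [
    ("p1", "math", 2),
    ("p2", "math", 4),
    ("p3", "bio", 20),
    ("p4", "bio", 60),
]
scores = normalize_by_subfield_mean(toy_papers)
```

In the toy data, `p2` has far fewer raw citations than `p3` but a higher normalized score, because it sits above its own sub-field's mean while `p3` sits below.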

20.
Inspired by “sleeping beauties in science”, we proposed that the awakening effect in knowledge diffusion is ubiquitous, whereas the “prince” paper has the strongest effect. To test this hypothesis, a three-layer super-network model depicting the knowledge diffusion trajectory is designed and the diffusion path of the awakening effect (defined on the basis of influential strength) is simulated. In detail, the model is built on the citation network and collaboration network of 63,785 publications in the library and information science domain. Through meta-paths in this super-network, the influential strength of a paper and the awakening effect from neighboring papers can be quantified into 36 numerical features. We test our hypothesis by evaluating the effectiveness of these features in citation count prediction, training an effective machine learning predictor on them. Using this predictor, we showed that most neighboring papers in the super-network had effects on future citation counts. The effectiveness of these features is demonstrated again through experiments on papers with different publication years. We also did a case study on papers that were significantly affected by the awakening effect, and found that the model proposed in this paper can also be used to explain some common phenomena in knowledge diffusion. All results show that the awakening effect could be not only ubiquitous but also quantifiable.
