Similar documents
20 similar documents found
1.
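Constructing a knowledge map of recommender system research based on author co-citation analysis   Total citations: 2, self-citations: 0, citations by others: 2
Author co-citation analysis is an important and effective method in literature research. Focusing on research in the recommender systems field, this paper constructs a knowledge map using author co-citation analysis. Using the Web of Science database as the data source, we extract recommender system research articles from 1997-2014, generate an author co-citation matrix, convert it into a Pearson correlation coefficient matrix, and then apply factor analysis, cluster analysis, and multidimensional scaling to construct a knowledge map of the field. The analysis shows that recommender system research is currently in a period of rapid development, with the number of scholars and the scope of research continually expanding. Collaborative-filtering-based recommendation algorithms are the core research topic, while personalized recommendation, content-based recommendation algorithms, and data-mining-based recommendation algorithms are current research hotspots.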
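The matrix step described in this abstract (author co-citation counts converted into Pearson correlations) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the input format (one list of cited authors per citing paper), and the choice to leave the diagonal at zero are all assumptions made for the example.

```python
from itertools import combinations
from math import sqrt


def cocitation_matrix(reference_lists, authors):
    """Count, for each pair of authors, the citing papers that cite both.

    reference_lists: one list of cited author names per citing paper
    (hypothetical input format). The diagonal is left at zero, which is
    only one of several conventions used in author co-citation analysis.
    """
    index = {author: i for i, author in enumerate(authors)}
    n = len(authors)
    matrix = [[0] * n for _ in range(n)]
    for refs in reference_lists:
        # Each author counts once per citing paper, however often cited.
        cited = sorted({a for a in refs if a in index}, key=index.get)
        for a, b in combinations(cited, 2):
            i, j = index[a], index[b]
            matrix[i][j] += 1
            matrix[j][i] += 1
    return matrix


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlation_matrix(matrix):
    """Correlate every pair of rows (co-citation profiles) of the count matrix."""
    n = len(matrix)
    return [[pearson(matrix[i], matrix[j]) for j in range(n)] for i in range(n)]
```

The resulting correlation matrix is the kind of input that factor analysis, cluster analysis, or multidimensional scaling would then take; those later steps are not shown here.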

2.
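HUANG Li (黄利), ZHOU Mi (周密). 《资源科学》 (Resources Science), 2020, 42(4): 607-620
Ecosystem services research has attracted wide attention from scholars in China and abroad; revealing current international research hotspots and trends will provide a reference for domestic ecosystem services research and practice. Taking the SCI-E and SSCI databases of the "Web of Science Core Collection" as the sample data source and using the CiteSpace visual scientometric tool, this paper systematically analyzes 4208 publications in the international field of ecosystem services research, evaluates the research efficiency and academic influence of different countries or regions in this field using the activity index (AI) and the attractive index (AAI), and discusses the progress and dynamics of international ecosystem services research. The results show that: ① the number of publications and citations in international ecosystem services research has grown markedly over the years, and especially since 2012 the number of scholars working on the issue has kept increasing; ② publications are highly concentrated in a few journals, with the top 10 journals accounting for 40% of all articles; ③ China's research strength in the field has grown in recent years but remains below the global average; ④ ecosystem services assessment frameworks and methodological frameworks are current research hotspots. In particular, social demand, human well-being, and ecosystem regulating services should be incorporated into the analytical framework for ecosystem services, with a focus on the value and role of cultural ecosystem services and full use of innovative methods such as machine learning and big data mining to solve complex social-ecological problems.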

3.
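A comparative study of four journal evaluation tools   Total citations: 2, self-citations: 0, citations by others: 2
This paper briefly introduces four core-journal evaluation tools currently in wide use in China: 《中文核心期刊要目总览》 (A Guide to the Core Journals of China), 《中国科技期刊引证报告》 (Chinese S&T Journal Citation Reports), 《中国学术期刊综合引证报告》 (Chinese Academic Journals Comprehensive Citation Report), and the Chinese Science Citation Database (中国科学引文数据库). It compares the four tools in five respects: evaluation method, source journals, subject classification, bibliometric indicators, and online services. On the basis of this comparison, it discusses the future development of journal evaluation tools.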

4.
In this paper, we propose a re-ranking algorithm using post-retrieval clustering for content-based image retrieval (CBIR). In conventional CBIR systems, it is often observed that images visually dissimilar to a query image are ranked high in retrieval results. To remedy this problem, we utilize the similarity relationship of the retrieved results via post-retrieval clustering. In the first step of our method, images are retrieved using visual features such as color histograms. Next, the retrieved images are analyzed using hierarchical agglomerative clustering methods (HACM), and the rank of the results is adjusted according to the distance of a cluster from a query. In addition, we analyze the effects of clustering methods, query-cluster similarity functions, and weighting factors in the proposed method. We conducted a number of experiments using several clustering methods and cluster parameters. Experimental results show that the proposed method improves retrieval effectiveness by over 10% on average, as measured by the average normalized modified retrieval rank (ANMRR).
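As a rough illustration of the two-step idea in this abstract (retrieve by feature distance, then adjust ranks using the distance of each result's cluster from the query), the sketch below uses average-linkage agglomerative clustering on plain feature vectors. The function names, the use of Euclidean distance, the centroid-based query-cluster distance, and the linear `weight` blend are all assumptions made for the example; the abstract does not specify the paper's actual similarity functions or weighting factors.

```python
from math import dist


def agglomerative_clusters(points, num_clusters):
    """Average-linkage agglomerative clustering; returns lists of point indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > num_clusters:
        # Find the closest pair of clusters and merge them.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def centroid(points, members):
    """Component-wise mean of the feature vectors of the given members."""
    dims = len(points[0])
    return [sum(points[m][d] for m in members) / len(members) for d in range(dims)]


def rerank(query, points, num_clusters, weight):
    """Order result indices by a blend of item distance and cluster distance.

    weight = 0 reproduces the plain distance ranking; weight = 1 ranks
    purely by the distance of each item's cluster centroid from the query.
    """
    scores = {}
    for members in agglomerative_clusters(points, num_clusters):
        cluster_distance = dist(query, centroid(points, members))
        for m in members:
            item_distance = dist(query, points[m])
            scores[m] = (1 - weight) * item_distance + weight * cluster_distance
    return sorted(scores, key=scores.get)
```

With a nonzero `weight`, an item that happens to lie close to the query but belongs to a cluster far from it is pushed down the list, which is the effect the abstract describes for visually dissimilar images that rank high.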

5.
Citation analysis does not tell the whole story about the innovativeness of scientific papers. Works by prominent authors tend to receive disproportionately many citations, while publications by less well-known researchers covering the same topics may not attract as much attention. In this paper we address the shortcomings of traditional scientometric approaches by proposing a novel method that utilizes a classifier for predicting publication years based on latent topic distributions. We then calculate real-number innovation scores used to identify potential breakthrough papers and turnaround years. The proposed approach can complement existing citation-based measures of article importance and author contribution analysis; it also opens a novel research direction for time-based, innovation-centered evaluation of scientific output. In our experiments, we focus on two corpora of research papers published over several decades at two well-established conferences: The World Wide Web Conference (WWW) and the International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), containing around 3500 documents in total. We indicate significant years and demonstrate examples of highly-ranked papers, thus providing novel insight into the evolution of the two conferences. Finally, we compare our results to citation analysis and discuss how our approach may complement traditional scientometrics.

6.
Popular and/or prestigious? Measures of scholarly esteem   Total citations: 1, self-citations: 0, citations by others: 1
Citation analysis does not generally take the quality of citations into account: all citations are weighted equally irrespective of source. However, a scholar may be highly cited but not highly regarded: popularity and prestige are not identical measures of esteem. In this study we define popularity as the number of times an author is cited and prestige as the number of times an author is cited by highly cited papers. Information retrieval (IR) is the test field. We compare the 40 leading researchers in terms of their popularity and prestige over time. Some authors are ranked high on prestige but not on popularity, while others are ranked high on popularity but not on prestige. We also relate measures of popularity and prestige to date of Ph.D. award, number of key publications, organizational affiliation, receipt of prizes/honors, and gender.
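The two definitions in this abstract translate fairly directly into counting code. The sketch below is only an illustration under stated assumptions: the `papers` input format, the function name, and the fixed `highly_cited_threshold` are hypothetical, since the abstract does not say how "highly cited" is operationalized.

```python
from collections import Counter


def popularity_and_prestige(papers, highly_cited_threshold):
    """Return (popularity, prestige) counters keyed by author name.

    papers: {paper_id: {"authors": [...], "cites": [paper_id, ...]}}
    (hypothetical format). A paper is treated as highly cited when it
    receives at least highly_cited_threshold citations within the set.
    """
    # First pass: how often each paper is cited inside the collection.
    times_cited = Counter()
    for paper in papers.values():
        for target in set(paper["cites"]):
            if target in papers:
                times_cited[target] += 1
    highly_cited = {
        pid for pid, count in times_cited.items() if count >= highly_cited_threshold
    }

    # Second pass: credit the authors of every cited paper.
    popularity = Counter()  # citations from any paper
    prestige = Counter()    # citations from highly cited papers only
    for pid, paper in papers.items():
        for target in set(paper["cites"]):
            if target not in papers:
                continue
            for author in papers[target]["authors"]:
                popularity[author] += 1
                if pid in highly_cited:
                    prestige[author] += 1
    return popularity, prestige
```

Ranking authors by each counter separately makes it possible to spot the mismatches the abstract mentions, such as an author with many citations overall but few from highly cited papers.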

7.
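This paper compares the 100 most highly cited papers published in 《情报学报》 (Journal of the China Society for Scientific and Technical Information) from 1998 to 2011, together with related information, across four data sources: the Chinese Social Sciences Citation Index, the Chinese Citation Database, the Wanfang digital journal collection, and Google Scholar. The study finds that, for both individual papers and authors, citation counts differ across the four data sources, in some cases markedly. When applying bibliometric methods, researchers should therefore understand the characteristics of each data source and be aware of its shortcomings and defects.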

8.
Hierarchic clustering methods may be used to condense information for a user, as they are in multivariate data analysis, or to achieve computational advantages, as they are in information retrieval. The structure of the hierarchic classification produced has a direct bearing on the effectiveness and utility of using cluster analysis, yet this important feature of the classification has only been implicitly referred to in the literature to date. In this study, three different coefficients are defined, each of which quantifies the symmetry-asymmetry (balancedness-unbalancedness) of hierarchic clusterings on a scale from 0 to 1. Using examples of data from the areas of information retrieval and of multivariate data analysis, a number of hierarchic clustering methods are discussed in terms of the hierarchies they produce.

9.
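LI Jing (李晶), LUO Taiye (罗泰晔). 《科技管理研究》 (Science and Technology Management Research), 2020, 40(19): 153-158
Identifying research hotspots is a long-standing issue in scientometrics and related fields, and identifying technological research hotspots within a field is important for the strategic planning of R&D organizations. Focusing on 5G technology, this paper proposes a new text-mining-based method for identifying research hotspots. We retrieved 11429 research papers on 5G technology published between 2013 and 2018 from the Web of Science database, built a keyword network based on textual association rule mining, clustered the papers' high-frequency keywords using information entropy and combination strength (组合力) as indicators, and on this basis identified three categories of hot technologies in the 5G field.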

10.
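[Purpose/Significance] Academic social networks provide a platform for the interactive sharing of research outputs. Analyzing the characteristics of high-impact outputs on such platforms helps broaden the dimensions of research on high-impact outputs and provides a reference for platform optimization and appropriate use by users. [Method/Process] This paper takes 8449 high-impact outputs by iSchool members in an academic social network as the research sample, examines their distribution from three perspectives (year, publication venue, and author), and applies time series clustering to summarize the patterns and regularities of changes in impact. [Result/Conclusion] Recent publication, high-quality venues, and a strong willingness to collaborate are common features of most high-impact outputs in academic social networks. Although there are some high-quality, highly productive core authors, authorship is on the whole dispersed, and classic works can also maintain and sustain high attention on the platform. Changes in the impact of high-impact outputs follow four patterns: linear growth, tending toward saturation, tending toward decline, and hotspot surge. These patterns mainly reflect the overall trend of research outputs using academic social networks to raise and sustain their impact. [Innovation/Limitation] The novelty of this paper lies in revealing the characteristics of research outputs along multiple dimensions and using time series clustering to summarize patterns of indicator change, enriching resource-level research on user behavior in academic social networks.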

11.
A hybrid text/citation-based method is used to cluster journals covered by the Web of Science database in the period 2002–2006. The objective is to use this clustering to validate and, if possible, to improve existing journal-based subject-classification schemes. Cross-citation links are determined on an item-by-paper procedure for individual papers assigned to the corresponding journal. Text mining for the textual component is based on the same principle; textual characteristics of individual papers are attributed to the journals in which they have been published. In a first step, the 22-field subject-classification scheme of the Essential Science Indicators (ESI) is evaluated and visualised. In a second step, the hybrid clustering method is applied to classify the roughly 8300 journals meeting the selection criteria concerning continuity, size and impact. The hybrid method proves superior to its two components when applied separately. The choice of 22 clusters also allows a direct field-to-cluster comparison, and we substantiate that the science areas resulting from cluster analysis form a more coherent structure than the “intellectual” reference scheme, the ESI subject scheme. Moreover, the textual component of the hybrid method allows labelling the clusters using cognitive characteristics, while the citation component allows visualising the cross-citation graph and determining representative journals suggested by the PageRank algorithm. Finally, the analysis of journal ‘migration’ allows the improvement of existing classification schemes on the basis of the concordance between fields and clusters.

12.
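Research on personalized user services based on Web usage mining   Total citations: 5, self-citations: 0, citations by others: 5
The World Wide Web is a huge global information service center. As news, advertising, consumer information, financial management, distance education, government websites, e-commerce, and the like become increasingly widespread, competition in providing online information services grows ever fiercer. Who can provide the network resources users need more conveniently, who can offer personalized services closer to users, and who can grasp users' new needs more quickly: these are the keys to successfully providing online services. Competition in modern society demands real-time, in-depth analysis of the large volume of information appearing and being generated on the Internet; even with powerful search engines and search technology, users still face many difficulties in analyzing and using this information. Meanwhile, WWW-based Web site design, Web service design, Web…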

13.
As the number of text documents on the Internet grows explosively, hierarchical document clustering has proven useful for grouping similar documents for versatile applications. However, most document clustering methods still struggle with high dimensionality, scalability, accuracy, and meaningful cluster labels. In this paper, we present an effective Fuzzy Frequent Itemset-Based Hierarchical Clustering (F2IHC) approach, which uses a fuzzy association rule mining algorithm to improve the clustering accuracy of the Frequent Itemset-Based Hierarchical Clustering (FIHC) method. In our approach, the key terms are extracted from the document set, and each document is pre-processed into the designated representation for the subsequent mining process. Then, a fuzzy association rule mining algorithm for text is employed to discover a set of highly related fuzzy frequent itemsets, which contain key terms to be regarded as the labels of the candidate clusters. Finally, the documents are clustered into a hierarchical cluster tree by referring to these candidate clusters. We have conducted experiments to evaluate the performance on the Classic4, Hitech, Re0, Reuters, and Wap datasets. The experimental results show that our approach not only retains the merits of FIHC but also improves its clustering accuracy.

14.
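NIU Qing (牛青), WANG Shangming (王上铭). 《情报科学》 (Information Science), 2012, (8): 1183-1188
Using bibliometric methods, with the Web of Science database developed by the Institute for Scientific Information (ISI) in the United States as the data source, this paper retrieves the literature on co-citation analysis and analyzes it statistically in terms of publication volume, authors, institutions, core journals, and citations. It concludes, among other things, that research on co-citation analysis is still at an early stage and that the author co-citation analysis method currently has shortcomings.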

15.
The dynamic nature and size of the Internet can make it difficult to find relevant information. Most users typically express their information need via short queries to search engines, and they often have to manually sift through the search results based on the relevance ranking set by the search engines, making the process of relevance judgement time-consuming. In this paper, we describe a novel representation technique which makes use of the Web structure together with summarisation techniques to better represent knowledge in actual Web Documents. We name the proposed technique Semantic Virtual Document (SVD). We discuss how the proposed SVD can be used together with a suitable clustering algorithm to achieve an automatic content-based categorization of similar Web Documents. The auto-categorization facility, together with a “Tree-like” Graphical User Interface (GUI) for post-retrieval document browsing, enhances the relevance judgement process for Internet users. Furthermore, we introduce how our cluster-biased automatic query expansion technique can be used to overcome the ambiguity of short queries typically given by users. We outline our experimental design to evaluate the effectiveness of the proposed SVD for representation and present a prototype called iSEARCH (Intelligent SEarch And Review of Cluster Hierarchy) for Web content mining. Our results confirm, quantify and extend previous research using Web structure and summarisation techniques, introducing novel techniques for knowledge representation to enhance Web content mining.

16.
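Web educational information retrieval based on page link mining   Total citations: 2, self-citations: 0, citations by others: 2
WANG Chengyun (王成云), WANG Lele (王乐乐). 《情报科学》 (Information Science), 2004, 22(4): 475-477, 487
Educational information retrieval is a key step in applying educational information to educational research and teaching, while Web page link mining mines the link structure among Web pages. This paper studies the application of Web link structure mining to educational information retrieval, introduces the concept and classification of Web mining as well as algorithms such as HITS and PageRank, and proposes an information retrieval method based on sample pattern feature extraction.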

17.
Towards mapping library and information science   Total citations: 3, self-citations: 1, citations by others: 3
In an earlier study by the authors, full-text analysis and traditional bibliometric methods were combined to map research papers published in the journal Scientometrics. The main objective was to develop appropriate techniques of full-text analysis and to improve the efficiency of the individual methods in the mapping of science. The number of papers was, however, rather limited. In the present study, we extend the quantitative linguistic part of the previous studies to a set of five journals representing the field of Library and Information Science (LIS). Almost 1000 articles and notes published in the period 2002–2004 have been selected for this exercise. The optimum solution for clustering LIS is found for six clusters. The combination of different mapping techniques, applied to the full text of scientific publications, results in a characteristic tripod pattern. Besides two clusters in bibliometrics, one cluster in information retrieval and one containing general issues, webometrics and patent studies are identified as small but emerging clusters within LIS. The study is concluded with the analysis of cluster representations by the selected journals.

18.
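Research on dynamic online competitive intelligence analysis based on Web log mining   Total citations: 1, self-citations: 0, citations by others: 1
By mining the hidden patterns and knowledge contained in Web logs, Web log mining offers enterprises an effective way to carry out dynamic analysis of online competitive intelligence. The article analyzes the principles and process of Web log mining and discusses its application in dynamic competitive intelligence analysis.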

19.
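Against the backdrop of the developing global knowledge economy, the depth and breadth of interdisciplinary research have become an important factor affecting the innovation process, with significant influence on a country's socioeconomic development and academic growth. Building on a review of current domestic and international research on measuring interdisciplinarity, and based on the Web of Science categories in the Thomson Reuters Science Citation Index (SCI) and Social Sciences Citation Index (SSCI), this paper measures the interdisciplinarity of research in three respects: a disciplinary specialization index, a disciplinary integration index, and a disciplinary diffusion index. It also uses science map visualization to show the disciplinary distribution of interdisciplinary research. Finally, the paper takes publications by winners of the Nobel Prize in Physics from 2005 to 2014 as an empirical case and uses the constructed indicators to study the interdisciplinary characteristics of these top scholars' papers. The results show that innovative achievements in physics have distinct disciplinary characteristics: the laureates' research is on the whole concentrated within physics, and the degrees of disciplinary integration and diffusion remain low.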

20.
Author co-citation analysis (ACA) is a commonly used method to map knowledge domains and depict scientific intellectual structures. Although all authors’ information has been considered in previous studies, ACA does not distinguish credits of different collaborators within a team. Authors’ sequence in a publication illustrates their contributions and specialty of research, which offers more information as inputs of ACA. This paper considers author sequence in ACA and proposes a sequence-based ACA method. By assigning various weight values to authors with different sequences, this proposed method considers distinct contributions of co-authors influencing the effect of ACA. Extra weight is given to corresponding authors, beyond their sequence, to acknowledge their additional contributions. Results of the empirical study based on the data from the field of Library and Information Science show many details on the visualization maps of the proposed methods, such as the number of sub-fields, the position of sub-fields, the position of authors, and clarity and interpretability of visualization maps. Meanwhile, the current paper proposes a novel framework of evaluating knowledge domain maps with both quantitative and qualitative facets.
