Similar literature
 20 similar documents found
1.
Publication patterns of 79 forest scientists awarded major international forestry prizes during 1990-2010 were compared with the journal classification and ranking promoted as part of the ‘Excellence in Research for Australia’ (ERA) by the Australian Research Council. The data revealed that these scientists exhibited an elite publication performance during the decade before and two decades following their first major award. An analysis of their 1703 articles in 431 journals revealed substantial differences between the journal choices of these elite scientists and the ERA classification and ranking of journals. These findings imply that additional cross-classifications should be added for many journals, and that the ranking of several journals relevant to the ERA Field of Research classified as 0705 Forestry Sciences should be adjusted.

2.
3.
To take into account the impact of the different bibliometric features of scientific fields and the different sizes of both the publication set evaluated and the set used as reference standard, two new impact indicators are introduced. The Percentage Rank Position (PRP) indicator relates the ordinal rank position of the article assessed to the total number of papers in the publishing journal. The publications in the publishing journal are ranked by decreasing citation frequency. The Relative Elite Rate (RER) indicator relates the number of citations obtained by the article assessed to the mean citation rate of the papers in the elite set of the publishing journal. The indices are preferably calculated from the data of the publications in the elite set of journal papers of individuals, teams, institutes or countries. The number of papers in the elite set is calculated by the equation $P(\pi v) = (10 \log P) - 10$, where $P$ is the total number of papers. The mean of the PRP and RER indicators of the journal papers assessed may be applied for comparing the eminence of publication sets across fields.
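
The elite-set size and the RER indicator described above can be illustrated with a minimal sketch. It assumes that the logarithm is base 10, that the operator in the elite-set equation is a minus sign, and that the result is rounded to the nearest integer; none of these details is stated in the abstract. The function names `elite_set_size` and `relative_elite_rate` are hypothetical.

```python
import math


def elite_set_size(total_papers: int) -> int:
    """Elite-set size 10 * log10(P) - 10 (assumed base 10, assumed rounding)."""
    if total_papers < 1:
        raise ValueError("total_papers must be at least 1")
    return max(0, round(10 * math.log10(total_papers) - 10))


def relative_elite_rate(citations: float, journal_citations: list) -> float:
    """Citations of one article divided by the mean citation count of the
    journal's elite set (the most cited papers of the journal)."""
    ranked = sorted(journal_citations, reverse=True)  # decreasing citation frequency
    k = elite_set_size(len(ranked))
    if k == 0:
        raise ValueError("journal too small for a non-empty elite set")
    elite_mean = sum(ranked[:k]) / k
    if elite_mean == 0:
        raise ValueError("elite set has no citations")
    return citations / elite_mean
```

For a hypothetical journal of 100 papers, the elite set under these assumptions holds the 10 most cited papers, and an article cited twice as often as their mean gets an RER of 2.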

4.
The launch of Google Scholar Metrics as a tool for assessing scientific journals may be serious competition for Thomson Reuters' Journal Citation Reports, and for the Scopus‐powered Scimago Journal Rank. A review of these bibliometric journal evaluation products is performed. We compare their main characteristics from different approaches: coverage, indexing policies, search and visualization, bibliometric indicators, results analysis options, economic cost, and differences in their ranking of journals. Despite its shortcomings, Google Scholar Metrics is a helpful tool for authors and editors in identifying core journals. As an increasingly useful tool for ranking scientific journals, it may also challenge established journal products.

5.
In this paper, we evaluate a number of machine learning techniques for the task of ranking answers to why-questions. We use TF-IDF together with a set of 36 linguistically motivated features that characterize questions and answers. We experiment with a number of machine learning techniques (including several classifiers and regression techniques, Ranking SVM and SVM-map) in various settings. The purpose of the experiments is to assess how the different machine learning approaches can cope with our highly imbalanced binary relevance data, with and without hyperparameter tuning. We find that with all machine learning techniques, we can obtain an MRR score that is significantly above the TF-IDF baseline of 0.25 and not significantly lower than the best score of 0.35. We provide an in-depth analysis of the effect of data imbalance and hyperparameter tuning, and we relate our findings to previous research on learning to rank for Information Retrieval.
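
The MRR (mean reciprocal rank) score used above can be made concrete with a short sketch. It assumes binary relevance flags listed in ranked order for each question and counts a question with no relevant answer as contributing 0; the abstract does not state how such questions are handled. The function name is hypothetical.

```python
def mean_reciprocal_rank(ranked_relevance: list) -> float:
    """Mean over questions of 1 / rank of the first relevant answer.

    ranked_relevance holds one list per question; each inner list contains
    0/1 relevance flags in the order the system ranked the answers.
    """
    if not ranked_relevance:
        raise ValueError("need at least one question")
    total = 0.0
    for flags in ranked_relevance:
        for position, is_relevant in enumerate(flags, start=1):
            if is_relevant:
                total += 1.0 / position
                break  # only the first relevant answer counts
    return total / len(ranked_relevance)
```

With three hypothetical questions whose first relevant answers sit at rank 2, at rank 1, and nowhere, the score is (0.5 + 1 + 0) / 3 = 0.5.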

6.
The evaluation of diversified web search results is a relatively new research topic and is not as well-understood as the time-honoured evaluation methodology of traditional IR based on precision and recall. In diversity evaluation, one topic may have more than one intent, and systems are expected to balance relevance and diversity. The recent NTCIR-9 evaluation workshop launched a new task called INTENT which included a diversified web search subtask that differs from the TREC web diversity task in several aspects: the choice of evaluation metrics, the use of intent popularity and per-intent graded relevance, and the use of topic sets that are twice as large as those of TREC. The objective of this study is to examine whether these differences are useful, using the actual data recently obtained from the NTCIR-9 INTENT task. Our main experimental findings are: (1) The $\hbox{D}\,\sharp$ evaluation framework used at NTCIR provides more “intuitive” and statistically reliable results than Intent-Aware Expected Reciprocal Rank; (2) Utilising both intent popularity and per-intent graded relevance as is done at NTCIR tends to improve discriminative power, particularly for $\hbox{D}\,\sharp$-nDCG; and (3) Reducing the topic set size, even by just 10 topics, can affect not only significance testing but also the entire system ranking; when 50 topics are used (as in TREC) instead of 100 (as in NTCIR), the system ranking can be substantially different from the original ranking and the discriminative power can be halved. These results suggest that the directions being explored at NTCIR are valuable.

7.
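Evaluating the predictive performance of classifiers using AUC   Total citations: 1 (self-citations: 0; citations by others: 1)
Yang Bo, Cheng Zekai, Qin Feng. 情报学报 (Journal of the China Society for Scientific and Technical Information), 2007, (2): 275-279
Accuracy has long been the main criterion for evaluating the predictive performance of classifiers, but it has many shortcomings. This paper compares accuracy with AUC (the area under the Receiver Operating Characteristic curve) on theoretical grounds, and evaluates three classification learning algorithms on 15 two-class datasets using both AUC and accuracy. Taken together, the theoretical and experimental results show that AUC is not only better than accuracy but should replace it as the evaluation measure for classifier performance. Re-evaluating the three algorithms with AUC also further confirms that the NaiveBayes and TAN-CMI classification algorithms, which are based on Bayes' theorem, outperform the decision tree algorithm C4.5.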
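
The contrast between accuracy and AUC can be sketched in a few lines. The example computes AUC from its pairwise-ranking interpretation (the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counted as one half), which is a standard equivalent of the area under the ROC curve. The data, the 0.5 decision threshold, and the function names are illustrative assumptions and are not taken from the paper.

```python
def auc(labels: list, scores: list) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels: list, scores: list, threshold: float = 0.5) -> float:
    """Fraction of examples classified correctly at a fixed score threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Two hypothetical classifiers scored on the same four examples.
labels = [1, 1, 0, 0]
scores_a = [0.9, 0.4, 0.3, 0.1]  # ranks every positive above every negative
scores_b = [0.9, 0.2, 0.3, 0.1]  # ranks one positive below a negative
```

Both hypothetical classifiers reach the same accuracy at the 0.5 threshold, yet their AUC values differ, which is one way to see that AUC captures ranking quality that accuracy ignores.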

8.
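Measuring the ranking quality of paid search based on DIT-theory modeling   Total citations: 1 (self-citations: 0; citations by others: 1)
To address the influence of paid search on search engine result rankings, a user search model characterized by "sequential browsing" is established and validated by comparing the model's predictions with the distribution of actual data. A quantitative measure of ranking quality that uses information cost as its criterion is then defined, and a numerical experiment is designed to evaluate the effect of paid search on ranking quality under two different ranking mechanisms. The results indicate that the ratio of additional search cost that paid ranking imposes on users, relative to organic ranking, is positively correlated with the quality differences among SERP results, but that this ratio decreases under effective bid-based ranking.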

9.
Collaborative filtering is concerned with making recommendations about items to users. Most formulations of the problem are specifically designed for predicting user ratings, assuming past data of explicit user ratings is available. However, in practice we may only have implicit evidence of user preference; and furthermore, a better view of the task is that of generating a top-N list of items that the user is most likely to like. In this regard, we argue that collaborative filtering can be directly cast as a relevance ranking problem. We begin with the classic Probability Ranking Principle of information retrieval, proposing a probabilistic item ranking framework. In the framework, we derive two different ranking models, showing that despite their common origin, different factorizations reflect two distinctive ways to approach item ranking. For the model estimations, we limit our discussions to implicit user preference data, and adopt an approximation method introduced in the classic text retrieval model (i.e. the Okapi BM25 formula) to effectively decouple frequency counts and presence/absence counts in the preference data. Furthermore, we extend the basic formula by proposing Bayesian inference to estimate the probability of relevance (and non-relevance), which largely alleviates the data sparsity problem. Apart from a theoretical contribution, our experiments on real data sets demonstrate that the proposed methods perform significantly better than other strong baselines.
Marcel J. T. Reinders

10.
This paper investigates how text analysis and classification techniques can be used to enhance e-government, typically law enforcement agencies' efficiency and effectiveness, by analyzing text reports automatically and providing timely supporting information to decision makers. With an increasing number of anonymous crime reports being filed and digitized, it is generally difficult for crime analysts to process and analyze crime reports efficiently. Complicating the problem is that the information has not been filtered or guided in a detective-led interview, resulting in much irrelevant information. We are developing a decision support system (DSS), combining natural language processing (NLP) techniques, similarity measures, and machine learning, i.e., a Naïve Bayes classifier, to support crime analysis and classify which crime reports discuss the same or different crimes. We report on an algorithm essential to the DSS and its evaluations. Two studies with small and big datasets were conducted to compare the system with a human expert's performance. The first study includes 10 sets of crime reports discussing 2 to 5 crimes. The highest algorithm accuracy was found by using binary logistic regression (89%) while the Naïve Bayes classifier was only slightly lower (87%). The expert achieved still better performance (96%) when given sufficient time. The second study includes two datasets with 40 and 60 crime reports discussing 16 different types of crimes for each dataset. The results show that our system achieved the highest classification accuracy (94.82%), while the crime analyst's classification accuracy (93.74%) is slightly lower.

11.
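This paper discusses methods for the classification and subject indexing and the book-number indexing of conference-literature books under computerized cataloging, together with their standardized description in machine-readable bibliographic data. It analyzes the problems commonly encountered at each stage of the work and proposes corresponding solutions.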

12.
We present a new ranking algorithm that combines the strengths of two previous methods: boosted tree classification, and LambdaRank, which has been shown to be empirically optimal for a widely used information retrieval measure. Our algorithm is based on boosted regression trees, although the ideas apply to any weak learners, and it is significantly faster in both train and test phases than the state of the art, for comparable accuracy. We also show how to find the optimal linear combination for any two rankers, and we use this method to solve the line search problem exactly during boosting. In addition, we show that starting with a previously trained model, and boosting using its residuals, furnishes an effective technique for model adaptation, and we give significantly improved results for a particularly pressing problem in web search: training rankers for markets for which only small amounts of labeled data are available, given a ranker trained on much more data from a larger market.

13.
A number of online marketplaces enable customers to buy or sell used products, which raises the need for ranking tools to help them find desirable items among a huge pool of choices. To the best of our knowledge, no prior work in the literature has investigated the task of used product ranking which has its unique characteristics compared with regular product ranking. While there exist a few ranking metrics (e.g., price, conversion probability) that measure the “goodness” of a product, they do not consider the time factor, which is crucial in used product trading due to the fact that each used product is often unique while new products are usually abundant in supply or quantity. In this paper, we introduce a novel time-aware metric, “sellability”, which is defined as the time duration for a used item to be traded, to quantify the value of it. In order to estimate the “sellability” values for newly generated used products and to present users with a ranked list of the most relevant results, we propose a combined Poisson regression and listwise ranking model. The model has a good property in fitting the distribution of “sellability”. In addition, the model is designed to optimize loss functions for regression and ranking simultaneously, which is different from previous approaches that are conventionally learned with a single cost function, i.e., regression or ranking. We evaluate our approach in the domain of used vehicles. Experimental results show that the proposed model can improve both regression and ranking performance compared with non-machine learning and machine learning baselines.

14.
Subject classification arises as an important topic for bibliometrics and scientometrics, searching to develop reliable and consistent tools and outputs. Such objectives also call for a well delimited underlying subject classification scheme that adequately reflects scientific fields. Within the broad ensemble of classification techniques, clustering analysis is one of the most successful. Two clustering algorithms based on modularity – the VOS and Louvain methods – are presented here for the purpose of updating and optimizing the journal classification of the SCImago Journal & Country Rank (SJR) platform. We used network analysis and Pajek visualization software to run both algorithms on a network of more than 18,000 SJR journals combining three citation-based measures of direct citation, co-citation and bibliographic coupling. The set of clusters obtained was termed through category labels assigned to SJR journals and significant words from journal titles. Despite the fact that both algorithms exhibited slight differences in performance, the results show a similar behaviour in grouping journals. Consequently, they are deemed to be appropriate solutions for classification purposes. The two newly generated algorithm-based classifications were compared to other bibliometric classification systems, including the original SJR and WoS Subject Categories, in order to validate their consistency, adequacy and accuracy. In addition to some noteworthy differences, we found a certain coherence and homogeneity among the four classification systems analysed.
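
As a rough illustration of what modularity-based clustering optimizes, the sketch below computes the standard Newman-Girvan modularity of a given partition of an unweighted, undirected graph without self-loops. This is only the quality function: it is not the VOS or Louvain algorithm itself, the VOS method may optimize a related but different function, and the journal network in the study is weighted. The toy graph and the function name are hypothetical.

```python
from collections import defaultdict


def modularity(edges: list, community: dict) -> float:
    """Q = sum over communities of (L_c / m) - (d_c / (2 m)) ** 2, where m is
    the number of edges, L_c the number of edges inside community c, and d_c
    the total degree of the nodes in c."""
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    intra = defaultdict(int)       # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # total degree per community
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Toy graph: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_cluster = {node: "a" for node in range(6)}
```

On the toy graph, splitting the two triangles into separate clusters gives Q = 5/14, whereas putting every node in one cluster gives Q = 0, so a modularity-maximizing method would prefer the split.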

15.
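How well the influence of online scientific papers can be evaluated depends on the choice of evaluation indicator variables. This study links the evaluation of online scientific papers' influence to paper ranking. Taking mathematics papers in the Web of Science database as the sample, several dozen papers were drawn from each of six ranking tiers (top 0.01%, 0.01%-0.1%, 0.1%-1%, 1%-10%, 10%-20%, and 20%-50%). Using bibliographic information methods, 28 candidate feature variables were selected for each paper at three levels: content, publication venue, and author. The ranking tiers of 324 papers and the sample of 28 scholarly link indicators were then formulated as an ordinal regression problem. Variable selection and parameter estimation with the Lasso method yielded 9 scholarly link indicators as the basic indicators for evaluating the academic influence of online scientific papers. Variable selection on 23 web impact metrics and their derived variables, against the ranking tiers of 418 OA papers, yielded 5 indicators for evaluating the influence of papers' online dissemination and use. In total, 14 indicators for evaluating the academic influence of online scientific papers were obtained.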

16.
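The Digital Library Network for Engineering and Technology (DLNET) is a project funded by the National Science Foundation under the US National Science Digital Library initiative. As a platform that supports education and learning in engineering and technology, the library offers functions such as resource retrieval, resource evaluation, and basic research. The article gives a brief review of DLNET in terms of resource organization, technical features, interface design, service characteristics, and evaluation and recommendations.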

17.
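This paper studies the variable precision rough set model and proposes an algorithm that uses it to classify Web documents. By introducing a threshold β, the algorithm lets users adjust the value of β to classify Web documents at different levels. Experimental results show that the algorithm greatly reduces the dimensionality of the keyword vectors and, while maintaining classification accuracy, effectively increases the flexibility of classification.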

18.
19.
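Starting from classification schemes and related concepts in library and information science, this paper studies how to build a classification-based navigation platform for networked scientific data resources. Dynamic faceted classification is introduced to organize the catalog of scientific data resources. On this basis, a feasible indexing method that links multidimensional keywords with multidimensional classes is proposed, and a ranking scheme based on the association weights between classes and keywords is designed. An experimental system developed with this scheme can effectively classify dispersed networked scientific data resources and provide navigation services.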

20.
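Considering the actual state of Internet services and usage in China, this paper proposes strategies for building a successful website from five perspectives: "cornerstone", "portal", "know-how", "glue", and "lifeline".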
