Similar literature
Found 20 similar documents; search took 46 ms
1.
Despite the increasing use of citation-based metrics for research evaluation purposes, we do not yet know which metrics best deliver on their promise to gauge the significance of a scientific paper or a patent. We assess 17 network-based metrics by their ability to identify milestone papers and patents in three large citation datasets. We find that traditional information-retrieval evaluation metrics are strongly affected by the interplay between the age distribution of the milestone items and age biases of the evaluated metrics. Outcomes of these metrics are therefore not representative of the metrics’ ranking ability. We argue in favor of a modified evaluation procedure that explicitly penalizes biased metrics and allows us to reveal metrics’ performance patterns that are consistent across the datasets. PageRank and LeaderRank turn out to be the best-performing ranking metrics when their age bias is suppressed by a simple transformation of the scores that they produce, whereas other popular metrics, including citation count, HITS and Collective Influence, produce significantly worse ranking results.
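The abstract does not say which "simple transformation" suppresses age bias. As a rough illustration only, the sketch below assumes one common approach: compare each item's score with the scores of items of similar age and express it as a z-score within that peer group. The function name `rescale_by_age` and the `window` parameter are hypothetical and not taken from the paper.

```python
from statistics import mean, pstdev


def rescale_by_age(scores, window=3):
    """Return age-rescaled scores (assumed z-score transformation).

    `scores` must be ordered by item age (oldest first). Each score is
    compared with the `window` items closest to it in age.
    """
    n = len(scores)
    half = window // 2
    rescaled = []
    for i in range(n):
        # Clamp the window so it always stays inside the list.
        lo = max(0, min(i - half, n - window))
        hi = min(n, lo + window)
        peers = scores[lo:hi]
        mu = mean(peers)
        sigma = pstdev(peers)
        # A peer group with no spread carries no ranking information.
        rescaled.append((scores[i] - mu) / sigma if sigma > 0 else 0.0)
    return rescaled
```

With this kind of rescaling an old item no longer ranks highly merely because it has had more time to accumulate score; it has to stand out among items of comparable age.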

2.
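Sheng Yu. 《图书情报工作》 (Library and Information Service), 2012, 56(14): 62-66
Microblogging offers advantages that traditional communication channels cannot match, such as rapid updates and convenient two-way exchange. Taking Sina Weibo as an example, this paper proposes a process model of microblog-based scholarly communication and describes each part of the model in detail from the perspectives of both information publishers and information users. The parts covered are the participants in microblog scholarly communication, the content of scholarly information on microblogs, the acquisition of that information, and its evaluation and use. Finally, the paper summarizes microblog-based scholarly communication, offers several suggestions, and gives an outlook for the future.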

3.
Several studies have reported on metrics for measuring the influence of scientific topics from different perspectives; however, current ranking methods ignore the reinforcing effect of other academic entities on topic influence. In this paper, we developed an effective topic ranking model, 4ERRank, by modeling the influence transfer mechanism among all academic entities in a complex academic network using a four-layer network design that incorporates the strengthening effect of multiple entities on topic influence. The PageRank algorithm is used to calculate the initial influence of topics, papers, authors, and journals in a homogeneous network, whereas the HITS algorithm is used to express the mutual reinforcement between topics, papers, authors, and journals in a heterogeneous network, iteratively calculating the final topic influence value. Using data from a specific interdisciplinary domain, social media, we applied the 4ERRank model to the 19,527 topics that met the inclusion criteria. The experimental results demonstrate that the 4ERRank model can successfully synthesize the performance of classic co-word metrics and effectively reflect highly cited topics. This study enriches the methodology for assessing topic impact and contributes to the development of future topic-based retrieval and prediction tasks.

4.
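[Purpose/Significance] A sentiment classification model for microblog comments can offer guidance to public-opinion regulators in properly managing the development of topical events and public opinion. [Method/Process] Based on a multi-scale convolutional neural network over character and word vectors, multi-scale convolution kernels are used to ease the constraint of limited context in microblog comments, and a sentiment classification model for microblog comments is built on this network. Data on the "Weibo hot search rectification" (微博热搜整改) topic were crawled to verify the feasibility and superiority of the model. [Result/Conclusion] The verification results show that the multi-scale convolutional neural network based on character and word vectors performs well on short-text classification tasks with limited context, such as microblog public opinion. In theory, the paper provides a more accurate sentiment classification model and method for microblog public opinion; in practice, it can help regulators better guide and supervise the sentiment orientation of public opinion.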

5.
Query suggestions have become pervasive in modern web search, as a mechanism to guide users towards a better representation of their information need. In this article, we propose a ranking approach for producing effective query suggestions. In particular, we devise a structured representation of candidate suggestions mined from a query log that leverages evidence from other queries with a common session or a common click. This enriched representation not only helps overcome data sparsity for long-tail queries, but also leads to multiple ranking criteria, which we integrate as features for learning to rank query suggestions. To validate our approach, we build upon existing efforts for web search evaluation and propose a novel framework for the quantitative assessment of query suggestion effectiveness. Thorough experiments using publicly available data from the TREC Web track show that our approach provides effective suggestions for ad hoc and diversity search.

6.
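Measuring the Ranking Quality of Paid Search Based on DIT Theory Modeling   Total citations: 1, self-citations: 0, citations by others: 1
To address the influence of paid search on search engine result rankings, a user search model characterized by "sequential browsing" is built and tested by comparing its predictions with the distribution of actual data. A quantitative measure of ranking quality is then defined with information cost as the criterion, and a numerical experiment is designed to evaluate the effect of paid search on ranking quality under two different ranking mechanisms. The results indicate that the ratio of additional search cost that paid ranking imposes on users, relative to organic ranking, is positively correlated with the quality differences among SERP results, but that this ratio decreases under effective bid-based ranking.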

7.
A number of online marketplaces enable customers to buy or sell used products, which raises the need for ranking tools to help them find desirable items among a huge pool of choices. To the best of our knowledge, no prior work in the literature has investigated the task of used product ranking, which has unique characteristics compared with regular product ranking. While there exist a few ranking metrics (e.g., price, conversion probability) that measure the “goodness” of a product, they do not consider the time factor, which is crucial in used product trading because each used product is often unique while new products are usually abundant in supply. In this paper, we introduce a novel time-aware metric, “sellability”, defined as the time it takes for a used item to be traded, to quantify the item's value. In order to estimate the “sellability” values for newly listed used products and to present users with a ranked list of the most relevant results, we propose a combined Poisson regression and listwise ranking model. The model fits the distribution of “sellability” well. In addition, the model is designed to optimize loss functions for regression and ranking simultaneously, which is different from previous approaches that are conventionally learned with a single cost function, i.e., regression or ranking. We evaluate our approach in the domain of used vehicles. Experimental results show that the proposed model can improve both regression and ranking performance compared with non-machine learning and machine learning baselines.

8.
User generated content forms an important domain for mining knowledge. In this paper, we address the task of blog feed search: to find blogs that are principally devoted to a given topic, as opposed to blogs that merely happen to mention the topic in passing. The large number of blogs makes the blogosphere a challenging domain, both in terms of effectiveness and of storage and retrieval efficiency. We examine the effectiveness of an approach to blog feed search that is based on individual posts as indexing units (instead of full blogs). Working in the setting of a probabilistic language modeling approach to information retrieval, we model the blog feed search task by aggregating over a blogger’s posts to collect evidence of relevance to the topic and persistence of interest in the topic. This approach achieves state-of-the-art performance in terms of effectiveness. We then introduce a two-stage model where a pre-selection of candidate blogs is followed by a ranking step. The model integrates aggressive pruning techniques as well as very lean representations of the contents of blog posts, resulting in substantial gains in efficiency while maintaining effectiveness at a very competitive level.

9.
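[Purpose/Significance] The study aims to understand how the patterns and characteristics of information dissemination differ between traditional-media microblogs and new-media microblogs, and to identify highly influential media microblog nodes of each type, so as to promote the overall development of media microblogs. [Method/Process] Drawing on social network theory, the study selects 50 traditional-media microblogs and 50 new-media microblogs on Sina Weibo as samples, measures the structural characteristics of their social networks, and compares their information dissemination patterns. [Result/Conclusion] Both networks are fairly dense overall, with the traditional-media network denser than the new-media network. Information spreads more easily in the traditional-media microblog network, with higher transfer efficiency and stronger overall cohesion among nodes. Among traditional-media microblogs, print-media microblogs hold an absolutely dominant position, whereas the various types of new-media microblogs are relatively evenly developed. Power is more concentrated and polarized among new-media microblogs, and more balanced among traditional-media microblogs.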

10.
Search effectiveness metrics are used to evaluate the quality of the answer lists returned by search services, usually based on a set of relevance judgments. One plausible way of calculating an effectiveness score for a system run is to compute the inner product of the run’s relevance vector and a “utility” vector, where the ith element in the utility vector represents the relative benefit obtained by the user of the system if they encounter a relevant document at depth i in the ranking. This paper uses such a framework to examine the user behavior patterns (and hence utility weightings) that can be inferred from a web query log. We describe a process for extrapolating user observations from query log clickthroughs, and employ this user model to measure the quality of effectiveness weighting distributions. Our results show that for measures with static distributions (that is, utility weighting schemes for which the weight vector is independent of the relevance vector), the geometric weighting model employed in the rank-biased precision effectiveness metric offers the closest fit to the user observation model. In addition, using past TREC data to indicate likelihood of relevance, we also show that the distributions employed in the BPref and MRR metrics are the best fit out of the measures for which static distributions do not exist.
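The inner-product framework described in this abstract can be sketched in a few lines. The example below uses the standard geometric weights of rank-biased precision, where depth i (counted from 1) receives weight (1 - p) * p^(i-1); the function names and the choice p = 0.5 are illustrative assumptions, not values from the paper.

```python
def rbp_weights(p, depth):
    """Static geometric utility vector of rank-biased precision.

    Element i (0-based) is the weight of the document at depth i + 1.
    """
    return [(1 - p) * p ** i for i in range(depth)]


def effectiveness(relevance, weights):
    """Inner product of a run's relevance vector and a utility vector."""
    return sum(r * w for r, w in zip(relevance, weights))


# A run with relevant documents at depths 1 and 3 (binary relevance).
run = [1, 0, 1]
score = effectiveness(run, rbp_weights(p=0.5, depth=len(run)))
```

The weights here are "static" in the abstract's sense: they depend only on depth, not on which documents turn out to be relevant. Reciprocal rank is a simple counterexample, since all of its weight sits on the first relevant document.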

11.
The increasing trend of cross-border globalization and acculturation requires text summarization techniques to work equally well for multiple languages. However, only some of the automated summarization methods can be defined as “language-independent,” i.e., not based on any language-specific knowledge. Such methods can be used for multilingual summarization, defined in Mani (Automatic summarization. Natural language processing. John Benjamins Publishing Company, Amsterdam, 2001) as “processing several languages, with a summary in the same language as input”, but their performance is usually unsatisfactory due to the exclusion of language-specific knowledge. Moreover, supervised machine learning approaches need training corpora in multiple languages that are usually unavailable for rare languages, and their creation is a very expensive and labor-intensive process. In this article, we describe cross-lingual methods for training an extractive single-document text summarizer called MUSE (MUltilingual Sentence Extractor), a supervised approach based on the linear optimization of a rich set of sentence ranking measures using a Genetic Algorithm. We evaluated MUSE’s performance on documents in three different languages (English, Hebrew, and Arabic) using several training scenarios. The summarization quality was measured using ROUGE-1 and ROUGE-2 Recall metrics. The results of the extensive comparative analysis showed that the performance of MUSE was better than that of the best known multilingual approach (TextRank) in all three languages. Moreover, our experimental results suggest that using the same sentence ranking model across languages results in a reasonable summarization quality, while saving considerable annotation efforts for the end-user. On the other hand, using parallel corpora generated by machine translation tools may improve the performance of a MUSE model trained on a foreign language. Comparative evaluation of an alternative optimization technique, Multiple Linear Regression, justifies the use of a Genetic Algorithm.

12.
We analyse the difference between the averaged (average of ratios) and globalised (ratio of averages) author-level aggregation approaches based on various paper-level metrics. We evaluate the aggregation variants in terms of (1) their field bias on the author-level and (2) their ranking performance based on test data that comprises researchers that have received fellowship status or won prestigious awards for their long-lasting and high-impact research contributions to their fields. We consider various direct and indirect paper-level metrics with different normalisation approaches (mean-based, percentile-based, co-citation-based) and focus on the bias and performance differences between the two aggregation variants of each metric. We execute all experiments on two publication databases which use different field categorisation schemes. The first uses author-chosen concept categories and covers the computer science literature. The second covers all disciplines and categorises papers by keywords based on their contents. In terms of bias, we find relatively little difference between the averaged and globalised variants. For mean-normalised citation counts we find no significant difference between the two approaches. However, the percentile-based metric shows less bias with the globalised approach, except for citation windows smaller than four years. On the multi-disciplinary database, PageRank has the overall least bias but shows no significant difference between the two aggregation variants. The averaged variants of most metrics have less bias for small citation windows. For larger citation windows the differences are smaller and are mostly insignificant. In terms of ranking the well-established researchers who have received accolades for their high-impact contributions, we find that the globalised variant of the percentile-based metric performs better. Again we find no significant differences between the globalised and averaged variants based on citation counts and PageRank scores.
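The distinction between the two aggregation variants is easy to see with numbers. The sketch below assumes a mean-normalised citation metric, where each paper's citation count is compared with an expected count for its field; the function names and the sample figures are hypothetical.

```python
def averaged(citations, expected):
    """Average of ratios: normalise each paper, then average."""
    ratios = [c / e for c, e in zip(citations, expected)]
    return sum(ratios) / len(ratios)


def globalised(citations, expected):
    """Ratio of averages: sum over the papers first, then divide.

    Dividing the sums equals dividing the means, because both lists
    have the same length.
    """
    return sum(citations) / sum(expected)


# Hypothetical author with two papers in fields with different baselines.
cites = [6, 1]
field_baseline = [2, 4]
avg_score = averaged(cites, field_baseline)      # (3.0 + 0.25) / 2
glob_score = globalised(cites, field_baseline)   # 7 / 6
```

The averaged variant gives every paper equal weight, so one strongly over-performing paper in a low-citation field lifts the score more than it does under the globalised variant, where papers in high-baseline fields dominate the denominator.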

13.
Evaluating scholars’ achievements is an important problem in the science of science with applications in the evaluation of grant proposals and promotion applications. Since the number of scholars and the number of scholarly outputs grow exponentially with time, well-designed ranking metrics that have the potential to assist in these tasks are of prime importance. To rank scholars, it is important to put their achievements in perspective by comparing them with the achievements of other scholars active in the same period. We propose here a particular way of doing so: by computing the evaluated scholar's share of each year's citations, which quantifies how the scholar fares in competition with the others. We assess the resulting ranking method using the American Physical Society citation data and four prestigious physics awards. Our results show that the new method significantly outperforms other ranking methods in identifying the prize laureates.
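A yearly citation share of this kind might be computed as sketched below. The abstract does not state how the yearly shares are combined into a single score, so summing them is an assumption made here for illustration; the function names and the sample data are likewise hypothetical.

```python
def yearly_citation_shares(scholar_cites, total_cites):
    """Map each year to the scholar's fraction of that year's citations.

    `scholar_cites[year]` is the number of citations the scholar received
    in that year; `total_cites[year]` is the number of citations made in
    the whole dataset in that year.
    """
    return {
        year: received / total_cites[year]
        for year, received in scholar_cites.items()
        if total_cites.get(year, 0) > 0  # skip years without any citations
    }


def citation_share_score(scholar_cites, total_cites):
    """Assumed aggregation: sum of the yearly shares."""
    return sum(yearly_citation_shares(scholar_cites, total_cites).values())


# Hypothetical data: citation volume quadruples between the two years.
scholar = {2000: 5, 2001: 10}
everyone = {2000: 100, 2001: 400}
score = citation_share_score(scholar, everyone)  # 0.05 + 0.025
```

In this example the scholar's raw count doubles from 2000 to 2001, yet the share halves, because overall citation volume grew faster. Normalising by each year's total is what lets scholars from different periods be compared.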

14.
User queries to the Web tend to have more than one interpretation due to their ambiguity and other characteristics. How to diversify the ranking results to meet users’ various potential information needs has attracted considerable attention recently. This paper aims to mine the subtopics of a query, either indirectly from the results returned by retrieval systems or directly from the query itself, in order to diversify the search results. For the indirect subtopic mining approach, clustering the retrieval results and summarizing the content of clusters is investigated. In addition, labeling topic categories and concept tags on each returned document is explored. For the direct subtopic mining approach, several external resources, such as Wikipedia, Open Directory Project, search query logs, and the related search services of search engines, are consulted. Furthermore, we propose a diversified retrieval model to rank documents with respect to the mined subtopics for balancing relevance and diversity. Experiments are conducted on the ClueWeb09 dataset with the topics of the TREC09 and TREC10 Web Track diversity tasks. Experimental results show that the proposed subtopic-based diversification algorithm significantly outperforms the state-of-the-art models in the TREC09 and TREC10 Web Track diversity tasks. The best performance our proposed algorithm achieves is α-nDCG@5 0.307, IA-P@5 0.121, and α#-nDCG@5 0.214 on TREC09, as well as α-nDCG@10 0.421, IA-P@10 0.201, and α#-nDCG@10 0.311 on TREC10. The results indicate that subtopic mining with up-to-date user search query logs is the most effective way to generate the subtopics of a query, and that the proposed subtopic-based diversification algorithm can select documents covering various subtopics.

15.
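A Reputation Evaluation Model for Enterprise Market Opportunity Information   Total citations: 1, self-citations: 0, citations by others: 1
To address the problem of judging which individuals to prioritize when searching for market opportunity information, a reputation evaluation model for market opportunity information is proposed. The model comprises two measures: distance (the distance between the searcher and the person sought) and authority (an individual's ability to provide market opportunity information). Social network analysis and a purpose-built AuthorityRank algorithm are combined to compute the reputation of market opportunity information, and survey data collected from a commercial bank are used to apply the algorithm and analyze the results.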

16.
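This article surveys how libraries currently use microblogs to deliver reader services, analyzes the feasibility of libraries using microblog open platforms to provide social network services, and, taking the Tencent Weibo open platform as an example, proposes three integration models for social network services: library-centered, geographic-region-centered, and document-resource-centered.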

17.
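In the Web 2.0 era, microblogs are without doubt the most representative social networking product. Their advantages in immediacy and interactivity mean that they are bound to open a new era of online marketing. For traditional publishers in particular, as the sales network of brick-and-mortar bookstores declines, online marketing is an urgently needed new marketing field. This paper discusses the significance of microblog marketing for publishers, strategies for carrying it out, and issues that deserve attention.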

18.
The evaluation of diversified web search results is a relatively new research topic and is not as well-understood as the time-honoured evaluation methodology of traditional IR based on precision and recall. In diversity evaluation, one topic may have more than one intent, and systems are expected to balance relevance and diversity. The recent NTCIR-9 evaluation workshop launched a new task called INTENT which included a diversified web search subtask that differs from the TREC web diversity task in several aspects: the choice of evaluation metrics, the use of intent popularity and per-intent graded relevance, and the use of topic sets that are twice as large as those of TREC. The objective of this study is to examine whether these differences are useful, using the actual data recently obtained from the NTCIR-9 INTENT task. Our main experimental findings are: (1) The D♯ evaluation framework used at NTCIR provides more “intuitive” and statistically reliable results than Intent-Aware Expected Reciprocal Rank; (2) Utilising both intent popularity and per-intent graded relevance as is done at NTCIR tends to improve discriminative power, particularly for D♯-nDCG; and (3) Reducing the topic set size, even by just 10 topics, can affect not only significance testing but also the entire system ranking; when 50 topics are used (as in TREC) instead of 100 (as in NTCIR), the system ranking can be substantially different from the original ranking and the discriminative power can be halved. These results suggest that the directions being explored at NTCIR are valuable.

19.
Decades of research on the social norms approach (SNA) have shown that informing people of how their behavior compares with that of their peers is an effective way to reduce risky behavior. The SNA has been particularly successful at reducing drinking on college campuses. However, one recent study may have found a way to improve upon the SNA: rank-framing messages. This study found that reframing social norms messages to show how students’ alcohol consumption ranks relative to their peers is more effective at increasing information seeking. The current study is a replication of this study. Rank-framed messages did decrease drinking behaviors but did not increase information seeking. Possible explanations and the potential merit of rank-framed social norms interventions are discussed.

20.
The aim of this study was to develop a model to evaluate the retrieval quality of search queries performed by Dutch general practitioners using the printed Index Medicus, MEDLINE on CD-ROM, and MEDLINE through GRATEFUL MED. Four search queries related to general practice were formulated for a continuing medical education course in literature searching. The potentially relevant citations selected by the course instructor and the 103 course participants together served as the basic set for the three judges to evaluate for (a) relevance and (b) quality, with the latter based on journal ranking, research design and publication type. Relevant individual citations received a citation quality score from 1 (low) to 4 (high). The overall search quality was expressed in a formula, which included the individual citation quality score of the selected and missed relevant citations, and the number of selected non-relevant citations. The outcome measures were the number and quality of relevant citations and agreement between the judges. Out of 864 citations, 139 were assessed as relevant, of which 44 citations received an individual citation quality score of 1, 76 of 2, 19 of 3 and none of 4. The level of agreement between the judges was 68% for the relevant citations, and 88% for the non-relevant citations. We describe a model for the evaluation of search queries based not only on the relevance, but also on the quality of the citations retrieved. With adaptation, this model could be generalized to other professional users, and to other bibliographic sources.

