Similar literature
20 similar documents found
1.
华斌  吴诺  贺欣 《图书情报工作》2021,65(23):58-69
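[Purpose/Significance] This paper proposes a method for recommending expert panels for e-government project review, based on a multi-dimensional characterization of individual experts, to improve the consistency of project reviews across panels. [Method/Process] Taking each expert's long-term review comments as the data source, opinion mining is used to identify knowledge elements and obtain sentiment polarity; each expert's domain knowledge structure is constructed and iteratively updated; statistical analysis is used to characterize the expert's knowledge level, depth of review, sentiment style, and domain specialty, yielding a scientometrics-based expert profile on which the recommendation of expert combinations is based. [Result/Conclusion] The method emphasizes a balance of multi-dimensional characteristics within an expert panel, is well targeted to the problems of e-government project review, and has achieved good results in practice.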

2.
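Using WHU-ES, the expert retrieval system of Wuhan University, as a platform, and drawing on the merged ranking method based on relevant document sets and on dictionary-based query expansion, this study carries out experiments and evaluation on expert retrieval in the library and information science field, covering both expert ranking and expertise identification. The merged ranking method based on relevant document sets is improved with dictionary-based query expansion; the experimental results show that using an expertise vocabulary effectively improves the precision of expert retrieval and the quality of expertise identification. Future research needs to address the standardized generation of the vocabulary.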

3.
苏颖 《情报工程》2015,1(5):008-017
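Patent retrieval is a very complex process, and users need support to complete retrieval tasks quickly and efficiently. Many steps of the patent retrieval process can be carried out with the help of tools, including query formulation tools. Query formulation depends heavily on human effort; tools can only precompute potentially useful data and visualize it for the user. Information retrieval systems offer many ways of visualizing the query process and query results. This study proposes two typical prototype system designs for comparing different query expressions during patent retrieval. The prototypes cover query expression construction factors and result set size factors, both of which are essential for patent domain experts exploring how adjustments to a query expression affect retrieval efficiency. The system developed here helps to optimize complex query expressions step by step during patent retrieval; its design is based on domain-expert knowledge engineering.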

4.
Searching for information pervades a wide spectrum of human activity, including learning and problem solving. With recent changes in the amount of information available and the variety of means of retrieval, there is even more need to understand why some searchers are more successful than others. This study was undertaken to advance the understanding of expertise in seeking information on the Web by identifying strategies and attributes that will increase the chance of a successful search on the Web. The strategies were as follows: evaluation, navigation, affect, metacognition, cognition, and prior knowledge, and attributes included age, sex, years of experience, computer knowledge, and info-seeking knowledge. Success was defined as finding a target topic within 30 minutes. Participants were from three groups. Novices were 10 undergraduate pre-service teachers, intermediates were 9 final-year master of library and information studies students, and experts were 10 highly experienced professional librarians working in a variety of settings. Participants' verbal protocols were transcribed verbatim into a text file and coded. These codes, along with Internet temporary files, a background questionnaire, and a post-task interview were the sources of the data. Since the variable of interest was the time to finding the topic, in addition to ANOVA and Pearson correlation, survival analysis was used to explore the data. The most significant differences in patterns of search between novices and experts were found in the cognitive, metacognitive, and prior knowledge strategies. Survival analysis revealed specific actions associated with success in Web searching: (1) using clear criteria to evaluate sites, (2) not excessively navigating, (3) reflecting on strategies and monitoring progress, (4) having background knowledge about information seeking, and (5) approaching the search with a positive attitude.

5.
毛进  李纲 《图书情报工作》2014,58(14):34-40
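Research expertise features are extracted from the text of experts' published papers, and an overlapping K-Means clustering algorithm is used to partition the experts in a research field into overlapping clusters, identifying multiple research specialties for each expert and grouping experts by shared specialties. Then, on the basis of graph theory, the expert clusters are converted into a graph representation of the experts in the field, and network visualization software is used to draw an expert map of the research field.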
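
The step from overlapping clusters to a graph can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the clustering has already produced, for each expert, a set of cluster ids, and it links two experts with an edge weighted by the number of clusters they share. The function name `clusters_to_edges` and the input layout are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def clusters_to_edges(memberships):
    """Turn overlapping cluster memberships into weighted expert-expert edges.

    memberships: dict mapping expert name -> set of cluster ids (assumed input).
    Returns a dict mapping (expert_a, expert_b), with expert_a < expert_b,
    to the number of clusters the two experts share.
    """
    # Invert the mapping: cluster id -> set of experts in that cluster.
    by_cluster = defaultdict(set)
    for expert, clusters in memberships.items():
        for cluster_id in clusters:
            by_cluster[cluster_id].add(expert)

    # Every pair of experts inside the same cluster gets one unit of weight.
    edges = defaultdict(int)
    for members in by_cluster.values():
        for a, b in combinations(sorted(members), 2):
            edges[(a, b)] += 1
    return dict(edges)


if __name__ == "__main__":
    demo = {"A": {0, 1}, "B": {0}, "C": {1}, "D": {0, 1}}
    for pair, weight in sorted(clusters_to_edges(demo).items()):
        print(pair, weight)
```

The resulting edge list could be exported to any network visualization tool; the abstract does not say which one was used.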

6.
Research on cross-language information retrieval (CLIR) has typically been restricted to settings using binary relevance assessments. In this paper, we present evaluation results for dictionary-based CLIR using graded relevance assessments in a best match retrieval environment. A text database containing newspaper articles and a related set of 35 search topics were used in the tests. First, monolingual baseline queries were automatically formed from the topics. Secondly, source language topics (in English, German, and Swedish) were automatically translated into the target language (Finnish), using structured target queries. The effectiveness of the translated queries was compared to that of the monolingual queries. Thirdly, pseudo-relevance feedback was used to expand the original target queries. CLIR performance was evaluated using three relevance thresholds: stringent, regular, and liberal. When the regular or liberal threshold was used, a reasonable performance was achieved. With the stringent threshold, equally high performance could not be achieved. On all the relevance thresholds the performance of the translated queries was successfully raised by pseudo-relevance feedback based query expansion. However, the performance of the stringent threshold in relation to the other thresholds could not be raised by this method.

7.
Mobile Agents for Distributed and Heterogeneous Information Retrieval
The heterogeneous, distributed and voluminous nature of many government and corporate data sources imposes severe constraints on meeting the diverse requirements of users who analyze the data. Additionally, communication bandwidth limitations, time constraints, and multiple data formats impose further restrictions on users of these distributed data sources. In this paper, we present an Agent-based Complex QUerying and Information Retrieval Engine (ACQUIRE) for large, heterogeneous, and distributed data sources. ACQUIRE acts as a softbot or interface agent by presenting users with a view of a single, unified, homogeneous data source, against which users can pose high-level declarative queries. ACQUIRE translates each such user query into a set of sub-queries by employing a combination of planning and traditional database query optimization techniques. ACQUIRE then spawns a set of mobile agents corresponding to these sub-queries, which in turn retrieve the data from various distributed data sources by dynamically optimizing the retrieval strategy as it is carried out. These mobile agents carry with them data-processing code that can be executed at the remote site, thus reducing the size of data returned by the agent. When all mobile agents have returned, ACQUIRE filters and merges the retrieved data and presents the results to the user. While the system is still very much a work in progress, current validation experiments on simulated NASA Distributed Active Archive Centers (DAACs) have demonstrated that complex queries can be effectively decomposed and retrieved by this approach.

8.
The majority of Internet users search for medical information online; however, many do not have an adequate medical vocabulary. Users might have difficulties finding the most authoritative and useful information because they are unfamiliar with the appropriate medical expressions describing their condition; consequently, they are unable to adequately satisfy their information need. We investigate the utility of bridging the gap between layperson and expert vocabularies; our approach adds the most appropriate expert expression to queries submitted by users, a task we call query clarification. We evaluated the impact of query clarification. Using three different synonym mappings and conducting two task-based retrieval studies, users were asked to answer medically-related questions using interleaved results from a major search engine. Our results show that the proposed system was preferred by users and helped them answer medical concerns correctly more often, with up to a 7% increase in correct answers over an unmodified query. Finally, we introduce a supervised classifier to select the most appropriate synonym mapping for each query, which further increased the fraction of correct answers (12%).

9.
The critical task of predicting clicks on search advertisements is typically addressed by learning from historical click data. When enough history is observed for a given query-ad pair, future clicks can be accurately modeled. However, based on the empirical distribution of queries, sufficient historical information is unavailable for many query-ad pairs. The sparsity of data for new and rare queries makes it difficult to accurately estimate clicks for a significant portion of typical search engine traffic. In this paper we provide analysis to motivate modeling approaches that can reduce the sparsity of the large space of user search queries. We then propose methods to improve click and relevance models for sponsored search by mining click behavior for partial user queries. We aggregate click history for individual query words, as well as for phrases extracted with a CRF model. The new models show significant improvement in clicks and revenue compared to state-of-the-art baselines trained on several months of query logs. Results are reported on live traffic of a commercial search engine, in addition to results from offline evaluation.
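
The idea of aggregating click history over individual query words can be illustrated with a small back-off estimator. This is a sketch under stated assumptions, not the model from the paper: the log format `(query, ad_id, impressions, clicks)`, the function names, and the smoothing constants `prior_ctr` and `prior_weight` are all hypothetical, and the CRF phrase extraction is left out.

```python
from collections import defaultdict


def aggregate_by_word(log):
    """Sum impressions and clicks per (query word, ad) pair.

    log: iterable of (query, ad_id, impressions, clicks) tuples (assumed format).
    Returns a dict mapping (word, ad_id) -> [impressions, clicks].
    """
    stats = defaultdict(lambda: [0, 0])
    for query, ad_id, impressions, clicks in log:
        # Count each distinct word of the query once per log record.
        for word in set(query.lower().split()):
            stats[(word, ad_id)][0] += impressions
            stats[(word, ad_id)][1] += clicks
    return stats


def backoff_ctr(query, ad_id, stats, prior_ctr=0.01, prior_weight=10):
    """Estimate the click-through rate of a possibly unseen query-ad pair.

    Pools the word-level history of every word in the query and smooths it
    towards prior_ctr, so a query with no history falls back to the prior.
    """
    impressions = 0
    clicks = 0
    for word in set(query.lower().split()):
        word_impressions, word_clicks = stats.get((word, ad_id), (0, 0))
        impressions += word_impressions
        clicks += word_clicks
    return (clicks + prior_ctr * prior_weight) / (impressions + prior_weight)


if __name__ == "__main__":
    history = [("cheap flights", "ad1", 100, 5), ("cheap hotels", "ad1", 100, 3)]
    word_stats = aggregate_by_word(history)
    print(backoff_ctr("cheap cruises", "ad1", word_stats))
```

The unseen query "cheap cruises" borrows the history of the word "cheap", which is the sparsity reduction the abstract describes; how the word-level statistics enter the actual click model is not stated there.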

10.
11.
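Sample queries were selected from Sogou query logs and manually annotated. Based on an analysis of the annotated news queries, new features for identifying news intent are proposed: query expression features, features of the query's distribution over time, and click result features. Using these three features, a decision tree classifier automatically identifies news intent in queries. The findings are: (1) the search goals of news queries concentrate on information about specific topics and on entertainment information, and their topics are mostly entertainment, politics, sports, and the economy; (2) compared with non-news queries, news queries are more likely to contain entities, fluctuate more over time, and have more similar click results; (3) the method identifies news intent in queries well, with macro-averaged precision, recall, and F-score of 0.76, 0.73, and 0.74, respectively.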
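
For readers unfamiliar with the macro-averaged figures reported above, the following sketch shows one common way to compute them: precision, recall, and F-score are calculated per class and then averaged with equal weight. Whether the authors averaged per-class F-scores or derived F from the averaged precision and recall is not stated in the abstract, so the choice here is an assumption, as are the function name and the toy labels.

```python
from collections import Counter


def macro_scores(gold, predicted):
    """Return macro-averaged (precision, recall, F1) over all observed classes."""
    labels = sorted(set(gold) | set(predicted))
    true_pos, false_pos, false_neg = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            true_pos[g] += 1
        else:
            false_pos[p] += 1  # predicted class p, but it was wrong
            false_neg[g] += 1  # gold class g was missed

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp, fp, fn = true_pos[label], false_pos[label], false_neg[label]
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


if __name__ == "__main__":
    gold = ["news", "news", "other", "other"]
    predicted = ["news", "other", "other", "other"]
    print(macro_scores(gold, predicted))
```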

12.
The authors discuss the problem of distributed knowledge acquisition for the construction of complete and consistent databases in integrated expert systems via the sharing of knowledge sources of different topologies (experts, problem-oriented texts, and electronic media in the form of databases). The emphasis is on the models, methods, and algorithms of distributed knowledge acquisition from databases as additional knowledge sources. The authors describe the architecture and basic facilities of distributed knowledge acquisition, which function as a part of the AT-TECHNOLOGY tool complex.

13.
The relative contributions of expertise in search skills and domain knowledge were examined when using the Internet to find information. Four conditions were compared: expert searchers/high domain knowledge; expert searchers/low domain knowledge; novice searchers/high domain knowledge; and novice searchers/low domain knowledge. Search outcomes and verbal protocols were analyzed. The combination of search expertise and high domain knowledge yielded the most efficient searches. Higher search expertise yielded access to sites rated more accurate and credible. High domain knowledge yielded sites rated more thorough. Verbal protocols depicted searching as a complex decision process. Findings have implications for instructional support.

14.
This article presents preliminary findings from a research grant on the everyday life information-seeking (ELIS) behaviors of urban young adults. Twenty-seven teens aged 14 through 17 participated in the study. Qualitative data were gathered using written activity logs and semi-structured group interviews. A typology of urban teens' preferred ELIS sources, media types, and query topics is presented. The typology shows friends and family as preferred ELIS sources, cell phones as the preferred method of mediated communication, and schoolwork, time-related queries, and social life as the most common and most significant areas of ELIS. The results indicate a heavy preference for people as information sources and that urban teens hold generally unfavorable views of libraries and librarians. The conclusion lists questions that information practitioners should consider when designing programs and services for urban teens and calls for researchers to consider this often-ignored segment of the population as potential study participants.

15.
This study investigates the information seeking behavior of general Korean Web users. The data from transaction logs of selected dates from August 2006 to August 2007 were used to examine characteristics of Web queries and to analyze click logs that consist of a collection of documents that users clicked and viewed for each query. Changes in search topics are explored for NAVER users from 2003/2004 to 2006/2007. Patterns involving spelling errors and queries in foreign languages are also investigated. Search behaviors of Korean Web users are compared to those of the United States and other countries. The results show that entertainment is the top-ranked category, followed by shopping, education, games, and computer/Internet. Search topics changed from computer/Internet to entertainment and shopping from 2003/2004 to 2006/2007 in Korea. The ratios of both spelling errors and queries in foreign languages are low. This study reveals differences for search topics among different regions of the world. The results suggest that the analysis of click logs allows for the reduction of unknown or unidentifiable queries by providing actual data on user behaviors and their probable underlying information needs. The implications for system designers and Web content providers are discussed.

16.
This article explores the way librarians define, leverage, and amplify expertise in a twenty-first century academic library. An expert team comprised of a nursing librarian, online learning librarian, information-literacy librarian, and assessment librarian sorted the learning outcomes from the Information-Literacy Competency Standards for Nursing created by the Health Sciences Interest Group taskforce of the Association of College and Research Libraries (ACRL) by grade-levels. The results indicate that distinguishing experts within a library supports the customization of scaffolded instruction. Additionally, using expert teams in academic libraries supports the larger mission of universities to integrate libraries into teaching and research.

17.
Search engine results are often biased towards a certain aspect of a query or towards a certain meaning for ambiguous query terms. Diversification of search results offers a way to supply the user with a better balanced result set, increasing the probability that a user finds at least one document suiting her information need. In this paper, we present a reranking approach based on minimizing variance of Web search results to improve topic coverage in the top-k results. We investigate two different document representations as the basis for reranking: smoothed language models and topic models derived by latent Dirichlet allocation. To evaluate our approach we selected 240 queries from Wikipedia disambiguation pages. This provides us with ambiguous queries together with a community generated balanced representation of their (sub)topics. For these queries we crawled two major commercial search engines. In addition, we present a new evaluation strategy based on Kullback-Leibler divergence and Wikipedia. We evaluate this method using the TREC sub-topic evaluation on the one hand, and manually annotated query results on the other hand. Our results show that minimizing variance in search results by reranking relevant pages significantly improves topic coverage in the top-k results with respect to Wikipedia, and gives a good overview of the overall search result. Moreover, latent topic models achieve competitive diversification with significantly less reranking. Finally, our evaluation reveals that our automatic evaluation strategy using Kullback-Leibler divergence correlates well with α-nDCG scores used in manual evaluation efforts.
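
A Kullback-Leibler based check of topic coverage might look like the sketch below. It is only an illustration of the general idea: it assumes each top-k result has already been labelled with one subtopic, compares the resulting distribution with a reference distribution of subtopics, and uses the direction KL(results || reference). The abstract does not give the authors' exact formulation, so the direction, the `epsilon` floor, and all names are assumptions.

```python
import math
from collections import Counter


def topic_distribution(labels):
    """Relative frequency of each subtopic label in a result list."""
    counts = Counter(labels)
    total = len(labels)
    return {topic: count / total for topic, count in counts.items()}


def kl_divergence(p, q, epsilon=1e-9):
    """KL(p || q) in nats for two distributions given as dicts topic -> probability.

    Topics missing from q are floored at epsilon to keep the value finite.
    """
    divergence = 0.0
    for topic, p_t in p.items():
        if p_t == 0.0:
            continue  # terms with zero probability in p contribute nothing
        q_t = max(q.get(topic, 0.0), epsilon)
        divergence += p_t * math.log(p_t / q_t)
    return divergence


if __name__ == "__main__":
    reference = {"animal": 0.5, "car": 0.5}  # hypothetical balanced subtopics
    skewed = topic_distribution(["car", "car", "car", "car"])
    balanced = topic_distribution(["car", "animal", "car", "animal"])
    print(kl_divergence(skewed, reference), kl_divergence(balanced, reference))
```

A lower value means the result list is closer to the reference balance of subtopics.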

18.
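Taking support for managerial decision-making as its starting point, this paper builds a statistical model for OLAP queries. For the first time, kernel density estimation from mathematical statistics is combined with Copula theory and introduced into OLAP query modeling, effectively extracting summary knowledge from the data cube; while reducing data storage space, approximate querying achieves a trade-off between query accuracy and query time. The advantage of the method lies in query processing over continuous attributes: the model reduces the dependence of queries on continuous attributes on materialized cuboids, greatly improving the flexibility of OLAP queries. Experimental analysis shows that the method can greatly reduce the storage space of the data cube and speed up OLAP queries while maintaining high query accuracy, thereby providing fast and efficient guidance for managerial decision-making.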
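
The approximate-query idea can be illustrated for a single continuous attribute. The sketch below is not the paper's model: it uses a Gaussian kernel with a fixed, hand-picked bandwidth, keeps only a small sample of the attribute instead of the full data, and leaves out the Copula part that would tie several attributes together. All names and parameters are hypothetical.

```python
import math


def _normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def approx_range_count(sample, total_rows, low, high, bandwidth):
    """Estimate how many of total_rows records have a value in [low, high].

    sample: a small sample of the continuous attribute kept instead of the data.
    The Gaussian kernel density estimate is integrated in closed form over the
    range, so no materialized aggregate is needed for the query.
    """
    if not sample or bandwidth <= 0:
        raise ValueError("need a non-empty sample and a positive bandwidth")
    mass = sum(
        _normal_cdf((high - x) / bandwidth) - _normal_cdf((low - x) / bandwidth)
        for x in sample
    ) / len(sample)
    return total_rows * mass


if __name__ == "__main__":
    prices = [9.5, 10.0, 10.2, 11.0, 14.8, 15.1]  # hypothetical sample
    print(approx_range_count(prices, total_rows=60000, low=9.0, high=12.0, bandwidth=0.5))
```

The trade-off mentioned in the abstract shows up here as the sample size and bandwidth: a larger sample gives a more accurate estimate at the cost of more storage and query time.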

19.
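[Purpose/Significance] To reveal the characteristics of mobile library users' query formulation behavior and to propose improvements to mobile library search functions. [Method/Process] Using system log mining on one month of user logs from a university mobile library, and statistical analysis with indicators such as mutual information, query diversity, query richness, subject distribution, and duration, the study examines the relatedness of users' queries, query reformulation patterns, and query topics. [Result/Conclusion] Mutual information values of mobile library users' queries are generally low, i.e. queries are weakly related in content; the repetition pattern and the linear pattern are the most common reformulation patterns, i.e. users repeatedly search for the same query; users' search interests concentrate in the humanities and social sciences, and their search behavior for queries on the same topic is persistent. The authors recommend adding query suggestion, automatic error correction, and advanced search functions to improve the recall and precision of the mobile library's retrieval service.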

20.
The paper studies concept-based cross-language information retrieval (CLIR). The document collection was a subset of the TREC collection. The test requests were formed from TREC's health related topics. As translation dictionaries the study used a general dictionary and a domain-specific (=medical) dictionary. The effects of translation method, conjunction, and facet order on the effectiveness of concept-based cross-language queries were studied, and concept-based structuring of cross-language queries was compared to mechanical structuring based on the output of dictionaries. The performance of translated Finnish queries against English documents was compared to the performance of original English queries against the English documents, and the performance of different CLIR query types was compared with one another. No major difference was found between concept-based and mechanical structuring. The best translation method was a simultaneous look-up in the medical dictionary and the general dictionary, in which case cross-language queries performed as well as the original English queries. The results showed that especially at high exhaustivity (the number of mutually restrictive concepts in a request) levels cross-language queries perform well in relation to monolingual queries. This suggests that conjunction disambiguates cross-language queries. An extensive study was made of the relative importance of the concepts of requests. On the basis of the classification data of request concepts it was shown how the order of facets in a query affects cross-language as well as monolingual queries.
