Similar Literature
Found 20 similar documents; search took 31 ms.
1.
Efficient information searching and retrieval methods are needed to navigate the ever-increasing volumes of digital information. Traditional lexical information retrieval methods can be inefficient and often return inaccurate results. To overcome problems such as polysemy and synonymy, concept-based retrieval methods have been developed. One such method is Latent Semantic Indexing (LSI), a vector-space model, which uses the singular value decomposition (SVD) of a term-by-document matrix to represent terms and documents in k-dimensional space. As with other vector-space models, LSI is an attempt to exploit the underlying semantic structure of word usage in documents. During the query matching phase of LSI, a user's query is first projected into the term-document space, and then compared to all terms and documents represented in the vector space. Using some similarity measure, the nearest (most relevant) terms and documents are identified and returned to the user. The current LSI query matching method requires that the similarity measure be computed between the query and every term and document in the vector space. In this paper, the kd-tree searching algorithm is used within a recent LSI implementation to reduce the time and computational complexity of query matching. The kd-tree data structure stores the term and document vectors in such a way that only those terms and documents that are most likely to qualify as nearest neighbors to the query will be examined and retrieved.
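
The kd-tree pruning idea described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the 2-dimensional vectors, the labels, and the function names `build_kdtree` and `nearest` are hypothetical, and Euclidean distance is assumed as the similarity measure (for unit-length vectors it ranks neighbors the same way cosine similarity does).

```python
import math

def build_kdtree(points, depth=0):
    """Recursively build a kd-tree from (vector, label) pairs."""
    if not points:
        return None
    axis = depth % len(points[0][0])          # cycle through the dimensions
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2                    # median point becomes the node
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, query, best=None):
    """Return (distance, label) of the stored vector closest to query."""
    if node is None:
        return best
    vector, label = node["point"]
    dist = math.dist(vector, query)
    if best is None or dist < best[0]:
        best = (dist, label)
    diff = query[node["axis"]] - vector[node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if abs(diff) < best[0]:
        best = nearest(far, query, best)
    return best

# Hypothetical document vectors in a reduced 2-dimensional space.
docs = [((0.9, 0.1), "doc_a"), ((0.2, 0.8), "doc_b"),
        ((0.5, 0.5), "doc_c"), ((0.1, 0.2), "doc_d")]
tree = build_kdtree(docs)
print(nearest(tree, (0.25, 0.75))[1])
```

The pruning test on the splitting plane is what lets the search skip whole subtrees instead of comparing the query with every stored vector.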

2.
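[Purpose/Significance] To reveal the characteristics of mobile library users' query formulation behavior and to offer suggestions for improving mobile library search functions. [Method/Process] Using system log mining on one month of user logs from a university mobile library, and applying statistical analysis, the study examines query relatedness, query reformulation patterns, and query topics through indicators such as mutual information, query diversity, query richness, subject distribution, and duration. [Results/Conclusions] The mutual information values of mobile library users' queries are generally low, that is, the queries are only weakly related in content. The repetition pattern and the linear pattern are the most common reformulation patterns, that is, mobile library users repeatedly search the same query. Users' search interests concentrate in the humanities and social sciences, and search behavior for queries on the same topic is persistent. The authors recommend adding query recommendation, automatic error correction, and advanced search functions to improve the recall and precision of mobile library retrieval services.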

3.
李伟 《高校图书馆工作》2005,25(4):29-31,87
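This article describes the design and implementation of a Web Services-based integrated query system for distributed library and information databases. Through an intermediate proxy layer, the system presents users with a unified database view, and it adopts a keyword-based inverted index to improve query performance. 4 references.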
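
A keyword-based inverted index of the kind this abstract mentions can be sketched as follows. This is an illustration only, not the system described in the article: the sample records, whitespace tokenization, and the function names `build_inverted_index` and `search` are assumptions.

```python
from collections import defaultdict

def build_inverted_index(records):
    """Map each keyword to the set of record ids whose text contains it."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for keyword in text.lower().split():
            index[keyword].add(record_id)
    return index

def search(index, query):
    """Return ids of records containing every keyword in the query."""
    keywords = query.lower().split()
    if not keywords:
        return set()
    # index.get avoids creating empty entries for unknown keywords.
    postings = [index.get(k, set()) for k in keywords]
    return set.intersection(*postings)

# Hypothetical catalogue records.
records = {
    1: "library database integration",
    2: "web services query system",
    3: "distributed database query",
}
index = build_inverted_index(records)
print(search(index, "database query"))
```

A lookup touches only the posting sets of the query keywords, which is why an inverted index speeds up keyword queries compared with scanning every record.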

4.
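Traditional query expansion methods cannot fundamentally eliminate the semantic gap between a user's query intent and the retrieved results, nor the ambiguity of user queries. Interactive query expansion, by contrast, can effectively help users find the information they need in massive web resources more quickly and precisely, giving them more satisfying search results. Combining a literature survey with a questionnaire, the study empirically analyzes interactive query expansion along several dimensions: user usage and needs, reasons for use, and evaluations and suggestions. It proposes that simpler operation, personalized query expansion, user-friendly interactive display, more precise retrieval results, and mobile retrieval environments are the research priorities and main development directions for interactive query expansion.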

5.
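A Review of Research on User Query Reformulation in Information Seeking   Total citations: 1, self-citations: 0, citations by others: 1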
李纲  胡蓉 《图书情报工作》2014,58(11):123-129
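Based on human-computer interaction behavior in information seeking, this paper reviews domestic and international research on user query reformulation from four aspects: reformulation types and patterns, reformulation performance, factors influencing reformulation, and query expansion techniques. It concludes that research on reformulation patterns can reveal the reformulation sequences users prefer; that combining influencing factors with reformulation sequences makes it possible to recommend high-probability query terms to different user groups, although such research has certain limitations; and that, although query reformulation has received wide attention from scholars from the retrieval-system perspective, research on how users reformulate queries has not yet appeared in the Chinese-language literature.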

6.
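To address the poor practicality, scalability, and adaptability of traditional web data integration systems, a new architecture for web data integration systems is proposed that achieves web-scale data integration. The system supports submitting keyword queries, extracting the user's query schema, mapping to relevant domains, selecting web databases, executing queries, and ranking query results. The paper introduces the key components of the system and the key techniques for handling large-scale heterogeneous web data, including Deep Web index construction, domain mapping, and user schema matching.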

7.
As the volume and variety of information sources continue to grow, it becomes increasingly difficult to obtain information that accurately matches user information needs. A number of factors affect information retrieval effectiveness (the accuracy of matching user information needs against the retrieved information). First, users often do not present search queries in the form that optimally represents their information need. Second, the measure of a document’s relevance is often highly subjective between different users. Third, information sources might contain heterogeneous documents in multiple formats, and the representation of documents is not unified. This paper discusses an approach for improvement of information retrieval effectiveness from document databases. It is proposed that retrieval effectiveness can be improved by applying computational intelligence techniques for modelling information needs, through interactive reinforcement learning. The method combines qualitative (subjective) user relevance feedback with quantitative (algorithmic) measures of the relevance of retrieved documents. An information retrieval system is developed whose retrieval effectiveness is evaluated using traditional precision and recall.
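
For readers unfamiliar with the two evaluation measures named at the end of this abstract, a minimal sketch follows. The document ids and the function name `precision_recall` are hypothetical; the definitions are the standard set-based ones, and nothing here reproduces the paper's experiments.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)           # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical result list and relevance judgments.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
print(p, r)
```

Precision is the fraction of retrieved documents that are relevant; recall is the fraction of relevant documents that were retrieved.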

8.
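XML Query Term Expansion with Structural Semantics Based on User Relevance Feedback   Total citations: 1, self-citations: 0, citations by others: 1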
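In XML document retrieval, a major cause of poor retrieval quality is that users find it hard to formulate a query expression that accurately describes their intent, and query expansion is regarded as a technique that can help users build query expressions matching that intent. This paper proposes a query expansion technique for XML information retrieval based on user relevance feedback. Besides term frequency, the expansion fully accounts for how the structural characteristics of XML documents affect the selection of expansion terms, including the semantic weight of elements in the document, the level at which an element sits, and the distance between a term and the initial query terms. Experiments show that the method is feasible and improves the precision of retrieval results reasonably well.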

9.
Users are often faced with complex information needs that are not easily represented as a single query. With current technology, the burden of issuing these individual queries, analysing retrieved documents for relevance, as well as aggregating results falls upon the time-poor and informationally overloaded user. Aggregated search techniques represent the new generation of search applications that endeavour to help users perform these complex tasks. However, the way in which different data types are combined in current aggregated search applications is often performed using static hard-coded structures. We suggest that a useful alternative is to marry techniques from natural language generation, such as text planning and summarisation, in order to dynamically determine the best organisation of retrieved information. These organisations can be motivated by linguistic theories that consider issues such as the role that the information plays to facilitate a task, and the relationships between different pieces of information. With reference to a discourse strategy, it is possible to draw on several data sources automatically to generate a useful, focused, and coherent answer. We focus on exploring the parallels between aggregated search and natural language generation in the hope that the fields can be mutually informed, leading to further advances in the way search technologies can better serve the user. These issues are discussed and presented with examples of existing systems across different domains.

10.
In Information Retrieval, since it is hard to identify users’ information needs, many approaches have been tried to solve this problem by expanding initial queries and reweighting the terms in the expanded queries using users’ relevance judgments. Although relevance feedback is most effective when relevance information about retrieved documents is provided by users, it is not always available. Another solution is to use correlated terms for query expansion. The main problem with this approach is how to construct the term-term correlations that can be used effectively to improve retrieval performance. In this study, we try to construct query concepts that denote users’ information needs from a document space, rather than to reformulate initial queries using the term correlations and/or users’ relevance feedback. To form query concepts, we extract features from each document, and then cluster the features into primitive concepts that are then used to form query concepts. Experiments are performed on the Associated Press (AP) dataset taken from the TREC collection. The experimental evaluation shows that our proposed framework, called QCM (Query Concept Method), outperforms a baseline probabilistic retrieval model on TREC retrieval.

11.
Vocabulary incompatibilities arise when the terms used to index a document collection are largely unknown, or at least not well-known to the users who eventually search the collection. No matter how comprehensive or well-structured the indexing vocabulary, it is of little use if it is not used effectively in query formulation. This paper demonstrates that techniques for mapping user queries into the controlled indexing vocabulary have the potential to radically improve document retrieval performance. We also show how the use of controlled indexing vocabulary can be employed to achieve performance gains for collection selection. Finally, we demonstrate the potential benefit of combining these two techniques in an interactive retrieval environment. Given a user query, our evaluation approach simulates the human user's choice of terms for query augmentation given a list of controlled vocabulary terms suggested by a system. This strategy lets us evaluate interactive strategies without the need for human subjects.

12.
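[Purpose/Significance] Understanding, analyzing, and identifying the information needs users express in academic search is the first step toward optimizing query results and improving the user experience of academic search engines. The expressed and latent information needs that users convey through query expressions during academic search can be called academic query intent. This paper summarizes a taxonomy of academic query intent, which helps with academic query intent identification and with the presentation of search result pages. [Method/Process] Building on A. Broder's query intent taxonomy and drawing on query expression examples from the query logs of Baidu Scholar (百度学术) search, a taxonomy of academic query intent is constructed. On this basis, the different categories of academic query intent are summarized and the characteristics of query expressions under each category are analyzed. [Results/Conclusions] Academic query intent falls into five main categories: academic literature, academic entity, academic exploration, knowledge question answering, and non-academic literature. The study gives the approximate proportion of each category in academic search, along with the query expression features, query scenarios, and result pages for each category.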

13.
Optimization of the Robot Search Algorithm in Search Engines   Total citations: 15, self-citations: 0, citations by others: 15
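Current search engines increasingly reveal their shortcomings. When a user enters specific keywords, the returned results often number in the thousands or even millions and contain a great deal of duplicate and junk information, so picking out the pages of interest still takes a long time. In other cases, important pages clearly exist on the Web but are not discovered by the search engine's robot. In response, this paper focuses on the search strategy used in search engines and improves the search algorithm so that the Robot can fully process, during the search phase, the URL list with which it interacts frequently. The PageRank of a page is computed from its content, its HTML structure, and the hyperlink information it contains, so that the URL list can be reordered by importance. Preliminary experimental results show that the optimized algorithm can considerably improve the overall performance of a search engine.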
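
Reordering a crawler's URL list by PageRank, as this abstract describes, can be sketched as below. The sketch is an assumption-laden illustration: it uses only the standard link-based PageRank power iteration and does not model the content and HTML-structure signals the paper mentions; the link graph, the damping factor of 0.85, and the function name `pagerank` are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each URL to the list of URLs it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical link graph seen by the robot so far.
links = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}
rank = pagerank(links)
# Visit the most important URLs first.
frontier = sorted(links, key=rank.get, reverse=True)
print(frontier)
```

Sorting the pending URL list by the computed score is one simple way to make the robot reach important pages earlier in the crawl.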

14.
Multilingual information retrieval is generally understood to mean the retrieval of relevant information in multiple target languages in response to a user query in a single source language. In a multilingual federated search environment, different information sources contain documents in different languages. A general search strategy in multilingual federated search environments is to translate the user query to each language of the information sources and run a monolingual search in each information source. It is then necessary to obtain a single ranked document list by merging the individual ranked lists from the information sources that are in different languages. This is known as the results merging problem for multilingual information retrieval. Previous research has shown that the simple approach of normalizing source-specific document scores is not effective. On the other hand, a more effective merging method was proposed to download and translate all retrieved documents into the source language and generate the final ranked list by running a monolingual search in the search client. The latter method is more effective but is associated with a large amount of online communication and computation costs. This paper proposes an effective and efficient approach for the results merging task of multilingual ranked lists. In particular, it downloads only a small number of documents from the individual ranked lists of each user query to calculate comparable document scores by utilizing both the query-based translation method and the document-based translation method. Then, query-specific and source-specific transformation models can be trained for individual ranked lists by using the information of these downloaded documents. These transformation models are used to estimate comparable document scores for all retrieved documents and thus the documents can be sorted into a final ranked list.
This merging approach is efficient as only a subset of the retrieved documents are downloaded and translated online. Furthermore, an extensive set of experiments on the Cross-Language Evaluation Forum (CLEF) data has demonstrated the effectiveness of the query-specific and source-specific results merging algorithm against other alternatives. The new research in this paper proposes different variants of the query-specific and source-specific results merging algorithm with different transformation models. This paper also provides thorough experimental results as well as detailed analysis. All of the work substantially extends the preliminary research in (Si and Callan, in: Peters (ed.) Results of the cross-language evaluation forum-CLEF 2005, 2005).
Hao Yuan
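
The merging step described in this abstract, where a per-source transformation is trained on a few downloaded documents and then applied to every retrieved document, can be sketched as follows. The abstract does not specify the form of the transformation models; a simple least-squares linear mapping is assumed here, and all scores, document ids, source names, and function names are hypothetical.

```python
def fit_linear(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x

def merge(ranked_lists, samples):
    """Map source-specific scores to comparable scores and sort all documents."""
    merged = []
    for source, scores in ranked_lists.items():
        # Train one transformation per source from its downloaded sample.
        xs, ys = zip(*samples[source])
        slope, intercept = fit_linear(xs, ys)
        for doc_id, score in scores.items():
            merged.append((slope * score + intercept, doc_id))
    return [doc_id for _, doc_id in sorted(merged, reverse=True)]

# Hypothetical source-specific scores for one query.
ranked_lists = {
    "source_en": {"e1": 0.9, "e2": 0.6, "e3": 0.1},
    "source_de": {"g1": 12.0, "g2": 8.0, "g3": 4.0},
}
# (source score, comparable score) pairs for the few downloaded documents.
samples = {
    "source_en": [(0.9, 0.8), (0.1, 0.0)],
    "source_de": [(12.0, 0.6), (4.0, 0.2)],
}
merged_ids = merge(ranked_lists, samples)
print(merged_ids)
```

Only the sampled documents need comparable scores computed directly; every other document gets an estimated score from its source's fitted mapping.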

15.
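An ontology-based intelligent query algorithm is proposed. The algorithm makes full use of ontological reasoning to analyze together the concepts, relations, attributes, and other information appearing in a user's query, uncovering the user's real needs and thereby achieving intelligent web information querying.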

16.
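To address the word mismatch problem in existing information retrieval systems, a query expansion algorithm based on inter-term association rules is proposed. The algorithm uses an existing mining algorithm to automatically mine inter-term associations from the top-ranked documents of the initial retrieval, extracts the association rules that contain the original query terms, and draws expansion terms from those rules to expand the query. Experimental results show that the algorithm improves the recall and precision of information retrieval systems and has high application value: average precision increased by 13.34% compared with no query expansion, and by 4.87% compared with the traditional local context analysis query expansion algorithm.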
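
The expansion step in this abstract can be illustrated with a simplified sketch. It is not the paper's algorithm: only pairwise rules of the form "query term implies candidate term" are considered, and the sample documents, the support and confidence thresholds, and the function name `expansion_terms` are assumptions.

```python
from collections import Counter

def expansion_terms(query_terms, top_docs, min_support=0.5, min_confidence=0.6):
    """Pick expansion terms from rules 'query term -> term' mined in top_docs."""
    doc_sets = [set(doc.lower().split()) for doc in top_docs]
    n = len(doc_sets)
    candidates = {}
    for q in query_terms:
        docs_with_q = [d for d in doc_sets if q in d]
        if not docs_with_q:
            continue
        # Count how often each other term co-occurs with the query term.
        co_counts = Counter(t for d in docs_with_q for t in d if t not in query_terms)
        for term, count in co_counts.items():
            support = count / n                      # share of all top documents
            confidence = count / len(docs_with_q)    # share of documents containing q
            if support >= min_support and confidence >= min_confidence:
                candidates[term] = max(candidates.get(term, 0.0), confidence)
    # Highest-confidence terms first; ties broken alphabetically.
    return sorted(candidates, key=lambda t: (-candidates[t], t))

# Hypothetical top-ranked documents from an initial retrieval.
top_docs = [
    "query expansion improves retrieval recall",
    "association rules support query expansion",
    "retrieval systems use query expansion",
    "image compression methods",
]
print(expansion_terms(["query"], top_docs))
```

Terms that pass both thresholds would be appended to the original query before the second retrieval pass.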

17.
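[Purpose/Significance] For learning resources on mobile online learning platforms whose user ratings are Boolean-valued, a collaborative recommendation method suited to such resources is proposed. [Method/Process] First, the FRUTAI algorithm, which is based on users' own attributes and the distribution characteristics of their existing friends, is used to determine the nearest-neighbor set of the target user. Then, after addressing data sparsity, a collaborative recommendation method suited to Boolean-type mobile online learning resources is proposed. Finally, an empirical subject is selected and the recommendation results are assessed with relevant evaluation methods. [Results/Conclusions] Good recommendation results were obtained in an empirical study using data from the Douban book review site (豆瓣书评网) as the dataset. The results show that the improved collaborative recommendation algorithm proposed in the paper can be applied effectively to Boolean-type learning resources on mobile online learning platforms.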

18.
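Through a detailed analysis of how information is represented in cyberspace and of the characteristics of that representation, the paper develops a further understanding of users' cognitive processes in cyberspace and of the common cognitive problems in their browsing and querying activities. It examines in depth the interaction mechanism between users and cyberspace and the contextual effects of semantics-based information interaction, and on that basis identifies the factors through which intelligent navigation supports interaction.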

19.
Distributed Information Retrieval Based on Mobile Agents   Total citations: 6, self-citations: 0, citations by others: 6
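The paper introduces the content and characteristics of mobile agent technology and analyzes its performance advantages. On this basis, it focuses on the system model, system examples, performance metrics, and mobile agent planning for distributed information retrieval based on mobile agent technology. It argues that applying mobile agent technology to distributed information retrieval is an effective way to retrieve large volumes of distributed information resources in a networked environment.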

20.
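The ubiquity and mobility of the information environment require libraries to redesign their services around the user. This paper uses a hybrid mobile application to build a knowledge service platform for the mobile Internet that integrates social networking, sharing, and cloud storage. On this platform, data mining supports personalized services, social features strengthen user stickiness, and cloud storage gives users knowledge spaces that are independent yet interconnected. Libraries can also provide knowledge services by reorganizing and pushing knowledge, while sharing and publishing functions let users serve one another. Users can use the platform to carry out research activities in fragmented time.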
