Similar articles
 20 similar articles found
1.
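This paper first analyzes current DHT-based exact-match search methods in P2P networks and, on that basis, proposes a keyword-based information search method that supports keyword-based semantic queries. Simulation experiments show that the method achieves a higher hit rate and higher recall than existing algorithms.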

2.
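Research on Semantic Retrieval in Peer-to-Peer Network Environments   Total citations: 2, self-citations: 0, citations by others: 2
Most current P2P information retrieval relies on keyword matching against file names, which supports only coarse-grained sharing and yields low retrieval efficiency. Drawing on ontology technology, this paper proposes a semantic retrieval framework for unstructured peer-to-peer networks, describes the function of each module, and analyzes how the framework processes query requests.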

3.
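To address keyword queries over XML data, this paper examines the strengths and weaknesses of existing query techniques and proposes a semantics-based XML keyword retrieval algorithm. Keywords entered by the user are classified as condition keywords or result keywords. Condition keywords only restrict the query scope and do not appear in the result set. The paper defines semantically related node pairs, gives a method for identifying them, and proposes an XML query algorithm based on keyword classification and semantically related node pairs.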

4.
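Two main problems directly affect query effectiveness and search cost in content-based unstructured P2P search systems: the costly text-similarity computation caused by the high-dimensional semantic space, and the large number of redundant messages produced by broadcast algorithms. This paper proposes a P2P search model that uses set difference degree to implement content-based clustering, improving query efficiency and reducing redundant messages. The model defines text similarity in terms of set difference degree, which keeps the similarity computation within linear time and so reduces query time; it also uses the set difference degree between nodes to cluster them by content, which lowers query time and reduces redundant messages. Simulation experiments show that the content-based search model built on set difference degree not only achieves high recall but also reduces search cost and query time to about 40% and 30%, respectively, of those of the Gnutella system.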
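
The abstract does not give the paper's definition of set difference degree, so the following is only a rough sketch under an assumed definition: each text is reduced to a set of terms, and the difference degree is taken to be the size of the symmetric difference divided by the size of the union. The function names `set_difference_degree` and `similarity` are hypothetical. With hash-based sets, the computation runs in time roughly linear in the number of terms, which is the property the abstract emphasizes.

```python
def set_difference_degree(terms_a, terms_b):
    """Assumed definition: |A symmetric-difference B| / |A union B|.

    Returns 0.0 for identical term sets and 1.0 for disjoint ones.
    Two empty sets are treated as identical (degree 0.0).
    """
    a, b = set(terms_a), set(terms_b)
    union = a | b
    if not union:
        return 0.0
    return len(a ^ b) / len(union)


def similarity(terms_a, terms_b):
    """Similarity as the complement of the assumed difference degree."""
    return 1.0 - set_difference_degree(terms_a, terms_b)


if __name__ == "__main__":
    doc1 = ["p2p", "search", "semantic"]
    doc2 = ["p2p", "search", "cluster"]
    print(similarity(doc1, doc2))
```

Under this assumed definition the similarity coincides with the Jaccard coefficient; the paper's actual measure may differ.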

5.
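To address the shortcomings of the traditional client/server (C/S) digital library model, this paper proposes a semantics-based P2P digital library model. The model organizes the resources shared by nodes according to their semantics: nodes holding resources with the same semantics logically form a P2P network, and a query related to that semantics is most likely to obtain fast, accurate results within this semantic network.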

6.
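Design and Implementation of an Educational Resource Repository Sharing System Based on P2P Computing   Total citations: 1, self-citations: 0, citations by others: 1
Based on an in-depth analysis of the shortcomings of current educational resource repository management systems and the advantages of P2P computing, this paper proposes a P2P-based educational resource repository sharing system. It details the system architecture and the ontology-based resource publishing and query mechanisms, and finally presents an implementation on the JXTA development platform that effectively addresses the sharing of heterogeneous educational resources.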

7.
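Ontology-Based Network Information Retrieval   Total citations: 3, self-citations: 1, citations by others: 3
Zhang Nai (张鼐). 《情报杂志》 (Journal of Intelligence), 2006, 25(4): 95-96, 99
The rapid growth and diversification of network information create many difficulties for effective information retrieval. Current retrieval tools offer only keyword-based retrieval and ignore the semantic content of the keywords themselves. To address these problems, this paper proposes an ontology-based network information retrieval method that compensates for the shortcomings of mechanical keyword matching, improves retrieval performance, and makes network information retrieval more semantic.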

8.
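Yu Hong (余宏), Wan Changxuan (万常选). 《情报杂志》 (Journal of Intelligence), 2007, 26(10): 51-54
Considering the characteristics of XML document retrieval, this paper proposes a semantic approximate retrieval model based on the XSEarch engine. It designs a method that uses WordNet to semantically expand query terms, improves the answer-ranking model of the XSEarch engine, and proposes a system architecture that supports the approximate retrieval model.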

9.
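Wu Yan (吴燕). 《科技广场》, 2008, (1): 69-70
Resource location in structured P2P networks uses distributed hash table (DHT) algorithms, which locate and discover resources by exact keyword. This paper introduces several DHT-based resource location algorithms (CAN, Chord, and Pastry), analyzes their construction and routing algorithms, and finally points out the problems facing structured P2P networks.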
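
To illustrate why DHT-based location is limited to exact keywords, here is a minimal sketch of the key-to-node mapping used in Chord-style consistent hashing. It is a simplification, not the routing algorithm from the paper: it assumes every node knows the full sorted list of node identifiers and omits finger tables and multi-hop routing. The identifier width `M` and the helper names `ring_id` and `successor` are illustrative assumptions.

```python
import hashlib
from bisect import bisect_left

M = 16  # assumed identifier width in bits; real deployments use far more


def ring_id(name, bits=M):
    """Hash a node address or resource keyword onto the identifier ring."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


def successor(sorted_node_ids, key_id):
    """Return the first node id at or after key_id, wrapping around the ring."""
    i = bisect_left(sorted_node_ids, key_id)
    return sorted_node_ids[i % len(sorted_node_ids)]


if __name__ == "__main__":
    nodes = sorted(ring_id(f"node-{n}") for n in range(8))
    key = ring_id("semantic retrieval")
    print(f"key {key} is stored on node {successor(nodes, key)}")
```

Because the keyword is hashed before lookup, a query has to supply exactly the same string that was used at publication time; a near-synonym or partial keyword generally hashes to an unrelated position on the ring.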

10.
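Traditional retrieval models are limited to keyword matching at the syntactic level. Using domain ontology as the means of knowledge organization, this paper proposes a semantic retrieval model based on domain ontology, together with the model's query semantic expansion algorithm and similarity computation algorithm.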

11.
Recent developments have shown that entity-based models that rely on information from the knowledge graph can improve document retrieval performance. However, given the non-transitive nature of relatedness between entities on the knowledge graph, the use of semantic relatedness measures can lead to topic drift. To address this issue, we propose a relevance-based model for entity selection based on pseudo-relevance feedback, which is then used to systematically expand the input query, leading to improved retrieval performance. We perform our experiments on the widely used TREC Web corpora and empirically show that our proposed approach to entity selection significantly improves ad hoc document retrieval compared to strong baselines. More concretely, the contributions of this work are as follows: (1) We introduce a graphical probability model that captures dependencies between entities within the query and documents. (2) We propose an unsupervised entity selection method based on the graphical model for query entity expansion and then for ad hoc retrieval. (3) We thoroughly evaluate our method and compare it with the state-of-the-art keyword and entity based retrieval methods. We demonstrate that the proposed retrieval model shows improved performance over all the other baselines on ClueWeb09B and ClueWeb12B, two widely used Web corpora, on the [email protected], and [email protected] metrics. We also show that the proposed method is most effective on the difficult queries. In addition, we compare our proposed entity selection with a state-of-the-art entity selection technique within the context of ad hoc retrieval using a basic query expansion method and illustrate that it provides more effective retrieval for all expansion weights and different numbers of expansion entities.

12.
This paper presents a novel IR-style keyword search model for semantic web data retrieval, distinguished from current retrieval methods. In this model, an answer to a keyword query is a connected subgraph that contains all the query keywords. In addition, the answer is minimal because no proper subgraph of it can be an answer to the query. We provide an approximation algorithm to retrieve these answers efficiently. A special ranking strategy is also proposed so that answers can be appropriately ordered. The experimental results over real datasets show that our model outperforms existing solutions with respect to effectiveness and efficiency.

13.
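Li Jianghua (李江华), Shi Peng (时鹏). 《情报杂志》 (Journal of Intelligence), 2012, 31(4): 112-116
The Internet has become the world's richest data source, with diverse and dynamically changing data types; how to retrieve the information users need from it quickly and accurately is a pressing problem. Traditional search engines search on a syntactic basis and lack semantic information, so they struggle to express accurately both the user's query needs and the semantics of the documents being retrieved, which results in low precision and recall and a limited search scope. This paper studies existing semantic retrieval methods and analyzes their problems. On that basis, it proposes a domain-based semantic search engine model. Drawing on Semantic Web technology, the model uses a domain ontology metadata model to normalize user queries semantically, and extracts knowledge from documents according to the domain ontology schema and converts it to RDF. This accurately expresses both the user's query semantics and the semantics of the documents being queried, and can greatly improve retrieval accuracy and efficiency. The paper describes the model's architecture, basic functions, and working principles in detail.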

14.
The importance of query performance prediction has been widely acknowledged in the literature, especially for query expansion, refinement, and interpolating different retrieval approaches. This paper proposes a novel semantics-based query performance prediction approach based on estimating semantic similarities between queries and documents. We introduce three post-retrieval predictors, namely (1) semantic distinction, (2) semantic query drift, and (3) semantic cohesion based on (1) the semantic similarity of a query to the top-ranked documents compared to the whole collection, (2) the estimation of non-query related aspects of the retrieved documents using semantic measures, and (3) the semantic cohesion of the retrieved documents. We assume that queries and documents are modeled as sets of entities from a knowledge graph, e.g., DBPedia concepts, instead of bags of words. With this assumption, semantic similarities between two texts are measured based on the relatedness between entities, which are learned from the contextual information represented in the knowledge graph. We empirically illustrate these predictors’ effectiveness, especially when term-based measures fail to quantify query performance prediction hypotheses correctly. We report our findings on the proposed predictors’ performance and their interpolation on three standard collections, namely ClueWeb09-B, ClueWeb12-B, and Robust04. We show that the proposed predictors are effective across different datasets in terms of Pearson and Kendall correlation coefficients between the predicted performance and the average precision measured by relevance judgments.

15.
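How to manage and use dispersed knowledge resources effectively at the semantic level is a difficult problem for knowledge-based enterprises. Building on the state of research in distributed knowledge management, this article combines the Semantic Web with peer-to-peer networks for knowledge management and proposes a distributed knowledge management model based on a semantic peer-to-peer network. It describes the knowledge retrieval workflow and functions in detail and builds a technical platform for implementing the model, providing strong support for efficient, high-quality distributed knowledge management.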

16.
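Personalized Search Fitted to User Preferences   Total citations: 2, self-citations: 0, citations by others: 2
This article studies the optimization of personalized search from the perspective of user preferences. It proposes a query expansion algorithm based on a semantic association tree, and a personalized search system architecture that builds on this algorithm to fit user preferences. The semantic association tree provides flexible and effective control over the query expansion model, and the personalized search system built on it can learn user preferences by itself. Experiments show that the method effectively improves the accuracy of text retrieval.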

17.
Pseudo-relevance feedback (PRF) is a well-known method for addressing the mismatch between query intention and query representation. Most current PRF methods consider relevance matching only from the perspective of terms used to sort feedback documents, thus possibly leading to a semantic gap between query representation and document representation. In this work, a PRF framework that combines relevance matching and semantic matching is proposed to improve the quality of feedback documents. Specifically, in the first round of retrieval, we propose a reranking mechanism in which the information of the exact terms and the semantic similarity between the query and document representations are calculated by bidirectional encoder representations from transformers (BERT); this mechanism reduces the text semantic gap by using the semantic information and improves the quality of feedback documents. Then, our proposed PRF framework is constructed to process the results of the first round of retrieval by using probability-based PRF methods and language-model-based PRF methods. Finally, we conduct extensive experiments on four Text Retrieval Conference (TREC) datasets. The results show that the proposed models outperform the robust baseline models in terms of the mean average precision (MAP) and precision P at position 10 (P@10), and the results also highlight that using the combined relevance matching and semantic matching method is more effective than using relevance matching or semantic matching alone in terms of improving the quality of feedback documents.

18.
To achieve high performance, previous work on FAQ retrieval used high-level knowledge bases or handcrafted rules. However, constructing these knowledge bases and rules is time-consuming and laborious, and it must be repeated whenever the application domain changes. To overcome this problem, we propose a high-performance FAQ retrieval system that uses only users’ query logs as knowledge sources. At indexing time, the proposed system efficiently clusters users’ query logs using classification techniques based on latent semantic analysis. At retrieval time, the proposed system smooths FAQs using the query log clusters. In the experiment, the proposed system outperformed conventional information retrieval systems in FAQ retrieval. Based on various experiments, we found that the proposed system could alleviate critical lexical disagreement problems in short document retrieval. In addition, we believe that the proposed system is more practical and reliable than previous FAQ retrieval systems because it uses only data-driven methods without high-level knowledge sources.

19.
Existing pseudo-relevance feedback (PRF) methods often divide an original query into individual terms for processing and select expansion terms based on the term frequency, proximity, position, etc. This process may lose some contextual semantic information from the original query. In this work, based on the classic Rocchio model, we propose a probabilistic framework that incorporates sentence-level semantics via Bidirectional Encoder Representations from Transformers (BERT) into PRF. First, we obtain the importance of terms at the term level. Then, we use BERT to interactively encode the query and sentences in the feedback document to acquire the semantic similarity score of a sentence and the query. Next, the semantic scores of different sentences are summed as the term score at the sentence level. Finally, we balance the term-level and sentence-level weights by adjusting factors and combine the terms with the top-k scores to form a new query for the next-round processing. We apply this method to three Rocchio-based models (Rocchio, PRoc2, and KRoc). A series of experiments are conducted based on six official TREC data sets. Various evaluation indicators suggest that the improved models achieve a significant improvement over the corresponding baseline models. Our proposed models provide a promising avenue for incorporating sentence-level semantics into PRF, which is feasible and robust. Through comparison and analysis of a case study, expansion terms obtained from the proposed models are shown to be more semantically consistent with the query.
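
For readers unfamiliar with the baseline this abstract builds on, the following is a minimal sketch of classic Rocchio query expansion with positive feedback only. It does not implement the paper's BERT-based sentence-level scoring or the PRoc2 and KRoc variants. The function name `rocchio_expand`, the default values of `alpha` and `beta`, and the dictionary-based term-weight representation are illustrative assumptions.

```python
from collections import defaultdict


def rocchio_expand(query_vec, feedback_docs, alpha=1.0, beta=0.75, k=5):
    """Classic Rocchio update, positive feedback only (a sketch).

    query_vec: dict mapping term -> weight for the original query.
    feedback_docs: list of dicts (term -> weight) for the top-ranked
        documents, which pseudo-relevance feedback assumes are relevant.
    Returns the k highest-scoring terms of alpha*q + beta*centroid(docs).
    """
    scores = defaultdict(float)
    for term, weight in query_vec.items():
        scores[term] += alpha * weight
    for doc in feedback_docs:
        for term, weight in doc.items():
            # Dividing by the number of documents yields the centroid.
            scores[term] += beta * weight / len(feedback_docs)
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:k])


if __name__ == "__main__":
    query = {"p2p": 1.0}
    docs = [{"p2p": 1.0, "dht": 2.0}, {"dht": 2.0, "chord": 1.0}]
    print(rocchio_expand(query, docs, alpha=1.0, beta=0.5, k=2))
```

The approach described in the abstract would, roughly speaking, add a second, sentence-level score per term and balance it against this term-level score before taking the top k.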

20.
Term mismatch is a critical problem in information retrieval, and several techniques have been developed to resolve it, such as query expansion, cluster-based retrieval, and dimensionality reduction. Of these techniques, this paper performs an empirical study on query expansion and cluster-based retrieval. We examine the effect of using parsimony in query expansion and the effect of clustering algorithms in cluster-based retrieval. In addition, query expansion and cluster-based retrieval are compared, and their combinations are evaluated in terms of retrieval performance through experiments on seven test collections of NTCIR and TREC.
