Similar literature
 20 similar documents found (search time: 156 ms)
1.
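杨韦洁, 高珑, 苏静. 《现代情报》, 2014, 34(7): 78-82, 87
To address the shortcoming that keyword-based P2P query expansion in traditional digital libraries inadequately interprets the semantics of users' search terms, this paper proposes a semantics-based node query expansion method for P2P environments. By combining a keyword association table with an ontology, it implements a personalized query expansion method. The same expansion method is then used to realize interest-network-based search in P2P, which can considerably improve retrieval efficiency.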

2.
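Data scheduling algorithms in current P2P (Peer-to-Peer) on-demand streaming media systems fail to make full use of the characteristics of each user node. Building on an analysis of typical data scheduling algorithms, this paper proposes a data scheduling algorithm based on node selectability (the B-SSP algorithm). On the one hand, the algorithm jointly considers neighbor nodes' bandwidth capacity and the data they hold when scheduling data blocks for download; on the other hand, it optimizes how serving nodes process requests. The B-SSP algorithm helps improve the continuity of video playback at user nodes and reduces the load on the streaming media server, thereby improving the overall quality of service of P2P on-demand streaming media systems. Simulation results and practical application show that the algorithm performs well and suits P2P on-demand streaming environments in which user node capabilities vary widely.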

3.
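To make data warehouse front-end tools more intelligent and to strengthen their query and analysis capabilities, this paper proposes a knowledge-based, subject-oriented intelligent analysis system framework. Combining data warehousing, artificial intelligence, data mining, cloud models, and related technologies, the framework gives users an intelligent query and intelligent answering mechanism that is grounded in domain knowledge and operates at the level of business concepts and terms. This makes data warehouse querying more intelligent, lets users run dynamic ad hoc queries based on domain knowledge, and supplies additional related information and knowledge for their queries.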

4.
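File pollution is a widespread problem in current P2P file-sharing systems and greatly reduces their usability. Like biological immune systems, P2P file-sharing systems are highly distributed, adaptive, and self-organizing. Using vector-space similarity to assign voting weights and an adaptive reputation threshold to judge file trustworthiness, the paper builds an artificial-immunity-based anti-pollution object reputation mechanism for selecting the neighbor node set, with the aim of improving system usability. Simulation experiments show that the system identifies polluted files with high precision and effectively suppresses their spread through the network at low communication cost.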
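
The similarity-weighted voting idea in this abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: the function names, the use of cosine similarity between two peers' past vote vectors as the weight, the +1/-1 vote encoding, and the fixed `threshold` that stands in for the paper's adaptive one.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def file_reputation(my_history, votes):
    """Weighted average of votes on one file.

    my_history: this peer's past votes on some shared set of files (assumed encoding: +1 / -1 / 0).
    votes: list of (voter_history, vote) pairs, where vote is +1 (authentic) or -1 (polluted).
    Voters whose history disagrees with ours get weight 0 (an assumption, not from the paper).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for voter_history, vote in votes:
        weight = max(0.0, cosine(my_history, voter_history))
        weighted_sum += weight * vote
        total_weight += weight
    return weighted_sum / total_weight if total_weight else 0.0

def is_trusted(reputation, threshold=0.0):
    """Fixed threshold used here as a placeholder for the adaptive threshold in the abstract."""
    return reputation >= threshold
```

A peer would call `file_reputation` with the votes it collected for a candidate file and download only if `is_trusted` returns true; how the threshold adapts over time is left open because the abstract does not say.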

5.
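Peer-to-peer (P2P) computing environments are open, and their peers are anonymous and autonomous. Because a node lacks knowledge of how trustworthy the nodes it interacts with are, mutual trust has become an important topic for further P2P research. To address this problem, the paper proposes a trust mechanism for P2P environments based on small-world properties: closely connected nodes form communities, and feedback lets the nodes in a community learn in real time the trust status of the nodes they care about. Experimental results show that the mechanism is feasible and effective, with good trust convergence and good protection against malicious behavior, thereby ensuring the reliability and security of the services that requesting nodes obtain in P2P environments.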

6.
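Handling users' ambiguous queries has long been a focus of research in information retrieval. To account for differences between users' queries, a technique called personalized search was proposed, which identifies query intent by capturing user preferences; however, studies have found that few users are willing to provide personal information, directly or indirectly. This paper proposes a Web search system architecture that automatically derives user interests from click-history information and then presents search results in a personalized way. Based on topic-sensitive PageRank, it designs a user interest learning algorithm and a personalized page ranking algorithm. Experiments show that the algorithm learns user interest information effectively and improves the quality of personalized Web search.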
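
For readers unfamiliar with the underlying technique, here is a minimal sketch of topic-sensitive (personalized) PageRank by power iteration. It is not the paper's interest-learning or ranking algorithm, whose details the abstract does not give. The function name, the toy graph format, and the idea of taking `topic_pages` from pages a user clicked are assumptions for illustration.

```python
def topic_pagerank(links, topic_pages, damping=0.85, iterations=50):
    """PageRank whose random jumps land only on topic_pages.

    links: dict mapping a page to the list of pages it links to.
    topic_pages: non-empty set of pages in the graph that represent the topic
                 (for example, pages a user clicked; an assumption here).
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    # Teleport distribution: uniform over topic pages, zero elsewhere.
    teleport = {n: (1.0 / len(topic_pages) if n in topic_pages else 0.0) for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            out = links.get(n, [])
            if out:
                share = damping * rank[n] / len(out)
                for target in out:
                    nxt[target] += share
            else:
                # Dangling page: redistribute its rank along the teleport vector.
                for m in nodes:
                    nxt[m] += damping * rank[n] * teleport[m]
        rank = nxt
    return rank
```

Pages reachable from the topic set accumulate rank, while pages the topic set never leads to stay near zero, which is what biases the final ordering toward a user's interests.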

7.
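Research on P2P network reputation models is fundamental to keeping P2P networks running stably. P2P networks share computing resources and services through direct exchange between nodes, so optimized node selection is needed to safeguard the network's reputation and stability. Traditional methods use a particle filter algorithm for automatic node control; when many nodes are combined repeatedly, their reputation evaluation of the P2P network is inaccurate. The paper proposes a design method for a P2P network reputation model based on node information coverage with optimized grid allocation. It builds an optimal node selection mechanism for the network, constructs a data aggregation tree layer by layer in a top-down manner, represents node reputation information, and builds the mathematical model, yielding an improved algorithm. Simulation experiments show that the model effectively improves the accuracy of reputation evaluation for P2P networks, optimizes node distribution, achieves optimal node selection, resists external interference and attacks effectively, and is stable.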

8.
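[Purpose/Significance] Most methods for computing the influence of social network nodes do not consider user evaluations, yet user evaluations matter greatly for identifying professionally influential nodes in a specific domain. [Method/Process] This paper uses a domain dictionary and a topic identification model to restrict the topical scope of target users, combines this with a composite indicator built from social network users' personal information, builds a link network from follow relationships, and fully incorporates sentiment scores of user comments, proposing the Domain Rank algorithm for mining professionally influential nodes. [Result/Conclusion] The study shows that the algorithm can effectively discover and identify potential professionally influential nodes within a multi-topic user population.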

9.
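Driven by shared interests and needs, node users in peer-to-peer (P2P) networks readily form virtual communities. Exploiting this community property, the article mines the local resources and search history of users at each level to form a cross-community routing module and a cross-community query history management module, so that a query is first sent to these nodes rather than to every node in the relevant communities. On this basis it proposes a cross-community P2P semantic retrieval framework, analyzes the implementation methods and techniques of each functional module, and closes with an outlook on future research.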

10.
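Building a P2P network reputation model can increase users' trust in P2P download technology. Because reputation mechanisms suffer from divergence and from weaknesses in evaluation, traditional models cannot provide unified, quantitative evaluation. The paper therefore proposes a P2P network reputation model based on an improved genetic algorithm. It analyzes P2P network reputation to provide an accurate basis for building the model. Taking node service quality as the objective function, it uses the genetic operations of a genetic algorithm to optimize that function and improves the genetic algorithm itself, obtaining an optimal P2P network reputation model. Simulation results show that downloading with the improved P2P network reputation model increases download speed, with satisfactory results.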

11.
This paper presents trends in the search queries issued by users of peer-to-peer (P2P) networks over an 18-month period from July 2002 to January 2004. Four data sets of search queries collected from Gnutella were studied to describe the searching trends. Major findings include (1) duplicate queries accounting for 34% to 68% of total queries; (2) an increase in non-English queries; (3) approximately half of the search queries specifying video or audio file types; (4) the stop word “the” accounting for one-third of total stop words; (5) a shift of queries from audio to video; and (6) P2P users demanding timely entertainment and porn materials. Based on the findings, it is worthwhile for P2P developers to consider (1) system designs that allow effective searching in multiple languages; and (2) techniques that eliminate stop words for faster searching.

12.
Increasing knowledge of paedophile activity in P2P systems is a crucial societal concern, with important consequences for child protection, policy making, and internet regulation. Because of a lack of traces of P2P exchanges and rigorous analysis methodology, however, current knowledge of this activity remains very limited. We consider here a widely used P2P system, eDonkey, and focus on two key statistics: the fraction of paedophile queries entered in the system and the fraction of users who entered such queries. We collect hundreds of millions of keyword-based queries; we design a paedophile query detection tool for which we establish false positive and false negative rates using assessment by experts; with this tool and these rates, we then estimate the fraction of paedophile queries in our data; finally, we design and apply methods for quantifying users who entered such queries. We conclude that approximately 0.25% of queries are paedophile, and that more than 0.2% of users enter such queries. These statistics are by far the most precise and reliable ever obtained in this domain.
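
The step of turning a detector's raw output into an estimated true fraction, given its false positive and false negative rates, can be sketched with a standard misclassification correction. The abstract does not state which estimator the authors used, so this is an assumption: the function name is hypothetical, the false positive rate is taken as the share of true negatives that get flagged, and the false negative rate as the share of true positives that are missed.

```python
def corrected_fraction(observed, false_positive_rate, false_negative_rate):
    """Estimate the true positive fraction p from the flagged fraction.

    Under the assumed rate definitions:
        observed = p * (1 - false_negative_rate) + (1 - p) * false_positive_rate
    Solving for p gives the expression below. The result is clamped to [0, 1].
    """
    denominator = 1.0 - false_positive_rate - false_negative_rate
    if denominator <= 0:
        raise ValueError("detector is no better than chance; fraction is not identifiable")
    estimate = (observed - false_positive_rate) / denominator
    return min(1.0, max(0.0, estimate))
```

When the true fraction is tiny, as in this study, even a very small false positive rate dominates the flagged fraction, which is why establishing both error rates by expert assessment matters before reporting an estimate.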

13.
Web queries in question format are becoming a common element of a user's interaction with Web search engines. Web search services such as Ask Jeeves, a publicly accessible question and answer (Q&A) search engine, request users to enter question format queries. This paper provides results from a study examining queries in question format submitted to two different Web search engines: Ask Jeeves, which explicitly encourages queries in question format, and the Excite search service, which does not. We identify the characteristics of queries in question format, including their nature, length, and structure, in two different data sets: 30,000 Ask Jeeves queries and 15,575 Excite queries. Findings include: (1) 50% of Ask Jeeves queries and less than 1% of Excite queries were in question format, (2) most users entered only one query in question format with little query reformulation, (3) the range of formats for queries in question format was limited, mainly “where”, “what”, or “how” questions, (4) the most common question query format was “Where can I find…” for general information on a topic, and (5) non-question queries may be in request format. Overall, four types of user Web queries were identified: keyword, Boolean, question, and request. These findings provide an initial mapping of the structure and content of queries in question and request format. Implications for Web search services are discussed.

14.
A growing body of research is beginning to explore the information-seeking behavior of Web users. The vast majority of these studies have concentrated on the area of textual information retrieval (IR). Little research has examined how people search for non-textual information on the Internet, and few large-scale studies have investigated visual information-seeking behavior with general-purpose Web search engines. This study examined visual information needs as expressed in users’ Web image queries. The data set examined consisted of 1,025,908 sequential queries from 211,058 users of Excite, a major Internet search service. Twenty-eight terms were used to identify queries for both still and moving images, resulting in a subset of 33,149 image queries by 9,855 users. We provide data on: (1) image queries: the number of queries and the number of search terms per user, (2) image search sessions: the number of queries per user and the modifications made to subsequent queries in a session, and (3) image terms: their rank/frequency distribution and the most highly used search terms. On average, there were 3.36 image queries per user containing an average of 3.74 terms per query. Image queries contained a large number of unique terms. The most frequently occurring image-related terms appeared less than 10% of the time, with most terms occurring only once. We contrast this with earlier work by P.G.B. Enser, Journal of Documentation 51 (2) (1995) 126–170, who examined written queries for pictorial information in a non-digital environment. Implications for the development of models for visual information retrieval and for the design of Web search engines are discussed.

15.
Although most of the queries submitted to search engines are composed of a few keywords and have a length that ranges from three to six words, more than 15% of the total volume of queries are verbose, introduce ambiguity, and cause topic drift. We consider verbosity a property of queries distinct from length: a verbose query is not necessarily long and may be succinct, and a short query may be verbose. This paper proposes a methodology to automatically detect verbose queries and conditionally modify queries. The methodology exploits state-of-the-art classification algorithms, combines concepts from a large linguistic database, and uses a topic gisting algorithm we designed for verbose query modification purposes. Our experimental results have been obtained using the TREC Robust track collection, thirty topics classified by difficulty degree, four queries per topic classified by verbosity and length, and human assessment of query verbosity. Our results suggest that the methodology for query modification conditioned on query verbosity detection and topic gisting is significantly effective, and that query modification should be refined when topic difficulty and query verbosity are considered, since these two properties interact and query verbosity is not straightforwardly related to query length.

16.
Conventional information retrieval technology (e.g., VSM) faces many difficulties when implemented in complex P2P systems because of the lack of global statistical information (e.g., IDF) and central services. In this paper, we suggest a novel query optimization scheme (Semantic Dual Query Expansion, SDQE) that makes full use of the context information supplied by the local document collection. Latent Semantic Indexing (LSI) is used to explore the local context information. By comparing the different local context information hidden in different document collections, it is possible to solve the synonymy–polysemy problem in VSM. The experiments show that our scheme effectively improves retrieval performance in P2P systems without knowledge of the global statistical information.

17.
Caching search results is employed in information retrieval systems to expedite query processing and reduce back-end server workload. Motivated by the observation that queries belonging to different topics have different temporal-locality patterns, we investigate a novel caching model called STD (Static-Topic-Dynamic cache), a refinement of the traditional SDC (Static-Dynamic Cache) that stores in a static cache the results of popular queries and manages the dynamic cache with a replacement policy for intercepting the temporal variations in the query stream. Our proposed caching scheme includes another layer for topic-based caching, where the entries are allocated to different topics (e.g., weather, education). The results of queries characterized by a topic are kept in the fraction of the cache dedicated to it. This makes it possible to adapt the cache-space utilization to the temporal locality of the various topics and reduces cache misses due to those queries that are neither sufficiently popular to be in the static portion nor requested within short enough time intervals to be in the dynamic portion. We simulate different configurations for STD using two real-world query streams. Experiments demonstrate that our approach outperforms SDC with an increase of up to 3% in terms of hit rates, and up to 36% of gap reduction w.r.t. SDC from the theoretical optimal caching algorithm.
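
The three-layer layout described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the class names, the choice of LRU as the replacement policy for both the topic and dynamic portions, and the `topic_of` classifier (a hypothetical function mapping a query to a topic name or `None`) are all assumptions.

```python
from collections import OrderedDict

class LRU:
    """Fixed-capacity map that evicts the least recently used entry."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if self.capacity <= 0:
            return
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the oldest entry

class STDCache:
    """Static portion + one LRU per topic + a dynamic LRU for everything else."""
    def __init__(self, static_results, topic_capacities, dynamic_capacity, topic_of):
        self.static = dict(static_results)  # precomputed results of popular queries
        self.topics = {t: LRU(c) for t, c in topic_capacities.items()}
        self.dynamic = LRU(dynamic_capacity)
        self.topic_of = topic_of  # hypothetical classifier: query -> topic name or None

    def _section(self, query):
        # Queries with no topic, or a topic without its own section, use the dynamic portion.
        return self.topics.get(self.topic_of(query), self.dynamic)

    def lookup(self, query):
        if query in self.static:
            return self.static[query]
        return self._section(query).get(query)

    def store(self, query, results):
        if query not in self.static:
            self._section(query).put(query, results)
```

Because each topic has its own section, a burst of queries on one topic evicts only entries of that topic, which is the isolation property the abstract attributes to the topic layer.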

18.
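The JXTA protocols are a set of open protocols that allow any device connected to the Internet, including mobile phones, wireless PDAs, personal computers, and servers, to communicate and collaborate in a P2P fashion. JXTA peers create a virtual network in which any peer can communicate with any other, even when some peers sit behind firewalls or NAT or use different network transport protocols. The JXSE project provides a complete reference implementation of the JXTA protocols in Java SE. This paper briefly introduces the JXTA protocols and the basic concepts of JXTA, and shows how to use JXSE to let computers on two different internal networks communicate.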

19.
20.
It is widely believed that many queries submitted to search engines are inherently ambiguous (e.g., java and apple). However, few studies have tried to classify queries based on ambiguity and to answer the question of what proportion of queries is ambiguous. This paper deals with these issues. First, we clarify the definition of ambiguous queries by constructing a taxonomy of queries ranging from ambiguous to specific. Second, we ask human annotators to manually classify queries. From the manually labeled results, we observe that query ambiguity is to some extent predictable. Third, we propose a supervised learning approach to automatically identify ambiguous queries. Experimental results show that we can correctly identify 87% of labeled queries with the approach. Finally, by using our approach, we estimate that about 16% of queries in a real search log are ambiguous.
