Found 20 similar documents; search took 609 ms.
1.
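User Clustering Based on the User-Document Matrix in Collaborative Recommendation  Cited by: 1 (self-citations: 0; citations by others: 1)
To meet the needs of personalized recommendation services and to cope with the high dimensionality and sparsity of user-document access data in user clustering, this paper adopts the idea of "dimensionality reduction by comparison" together with a K-hierarchical clustering algorithm, and analyzes the workflow of user clustering based on users' resource-rating data. On this basis, an experimental user clustering system is designed and implemented with open-source Java technology.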
2.
3.
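As the volume of information on the Internet keeps growing, traditional information retrieval techniques can no longer meet users' exacting demands on query quality. To help users locate the information they want quickly and accurately in the result list, search engines with integrated document clustering have emerged. This paper discusses the application of document clustering in search engines, introduces several algorithms, analyzes in detail Vivisimo, a representative clustering search engine, and predicts development trends in search-engine clustering technology.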
4.
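Query expansion adds similar or related terms to the initial query to form a more precise expanded query expression, which reduces the vocabulary mismatch between the query and relevant documents and improves retrieval performance. Unlike traditional query expansion, XML query expansion must expand not only the content but also the structure. This paper proposes an XML query expansion method based on pseudo-feedback: the initial retrieval results are clustered to obtain the document cluster most relevant to the query; phrases are then extracted from that cluster to find expansion phrases that match the user's query intent; structure expansion is carried out on the basis of these phrases; and the result is a complete "content + structure" expanded query expression. Experimental results show that the proposed method achieves better precision than the unexpanded query.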
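The content-expansion half of the pipeline described in this abstract (cluster the initial results, pick the cluster closest to the query, draw expansion terms from it) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not name a clustering algorithm or a term-selection rule, so the sketch substitutes greedy single-pass clustering on Jaccard overlap and simple document-frequency ranking, works on single tokens rather than phrases, and leaves out structure expansion entirely. The names `jaccard`, `single_pass_clusters`, and `expand_query` are hypothetical.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard overlap between two token collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def single_pass_clusters(docs, threshold=0.2):
    """Greedy single-pass clustering (an assumed stand-in for the paper's clustering step)."""
    clusters = []
    for doc in docs:
        for cluster in clusters:
            # Compare against the cluster's first document, used as its seed.
            if jaccard(doc, cluster[0]) >= threshold:
                cluster.append(doc)
                break
        else:
            clusters.append([doc])
    return clusters


def expand_query(query, top_docs, k=2, threshold=0.2):
    """Return the query plus the k most frequent new terms from the best-matching cluster."""
    if not top_docs:
        return list(query)
    clusters = single_pass_clusters(top_docs, threshold)
    # Pick the cluster whose documents overlap most with the query on average.
    best = max(clusters, key=lambda c: sum(jaccard(query, d) for d in c) / len(c))
    # Document frequency of each non-query term inside the chosen cluster.
    counts = Counter(t for d in best for t in set(d) if t not in query)
    ranked = sorted(counts, key=lambda t: (-counts[t], t))  # ties broken alphabetically
    return list(query) + ranked[:k]


query = ["xml", "retrieval"]
top_docs = [
    ["xml", "retrieval", "structure", "element"],
    ["xml", "structure", "element", "path"],
    ["football", "score", "league"],
]
expanded = expand_query(query, top_docs)
```

In this toy input the off-topic third document forms its own cluster and contributes no expansion terms, which is the effect the clustering step is meant to achieve.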
5.
6.
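The growth in electronic documents and users has driven innovation in how retrieval results are personalized so that they better serve user preferences. Personalized content retrieval aims to improve retrieval by taking the specific interests of individual users into account. This paper proposes a new method for personalizing information retrieval results based on extended fuzzy concept networks. In this method, both web pages and user preferences are represented as extended fuzzy concept networks. An extended fuzzy concept network can be viewed as a model consisting of a relation matrix and a relevance matrix: the elements of the relation matrix represent the relationships between fuzzy concepts, and the elements of the relevance matrix indicate the degree of relevance between concepts. The advantage of this method is that it finds the vast majority of documents matching the user's query and presents them to the user more flexibly and effectively.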
7.
8.
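The user interest model is the core of personalized services; mining user interests can uncover latent interest knowledge and support better-optimized services. This paper combines topic map technology with the user interest model, studies the topic map representation of user interest knowledge, and on this basis applies a scale-free graph K-medoids clustering algorithm to perform deep clustering-based mining on the constructed topic maps, establishing a topic-map-based user interest mining model. While explaining the function of each module of the model, the paper identifies the key problems in the process model and analyzes them in depth, namely the scale-free graph K-medoids clustering algorithm, the topic map representation of documents, topic generalization, and topic map merging. Finally, an intelligent topic map is constructed, achieving both process modeling and object modeling.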
9.
10.
11.
A Comparison of Image Search in Several Search Engines  Cited by: 4 (self-citations: 0; citations by others: 4)
李爱国 《现代图书情报技术》2002,18(5):35-36
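As users' demand for image search on the web keeps growing, a variety of web-based image search engines have emerged. However, these engines differ considerably in response time, the number of images retrieved, accuracy, and the ranking of results. This paper first gives a brief account of image search modes and then compares the Image Search services of Google, Excite, Yahoo, and Ixquick.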
12.
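A Suffix-Tree-Based Method for Clustering Web Search Results  Cited by: 3 (self-citations: 2; citations by others: 1)
To simultaneously meet the requirements of relevance and speed in clustering Web search results and of browsable cluster descriptions, this paper proposes a suffix tree clustering algorithm suited to Chinese Web search results. The suffix tree is built with Chinese characters as the basic unit; an effective strategy solves the cluster-labeling problem that arises after phrase clusters are merged using the binary method; and synonymous phrase clusters are merged according to their semantic similarity, which effectively improves the quality of the clustering results. Test results show that, compared with traditional document clustering algorithms, the suffix-tree-based algorithm offers clear advantages in both precision and efficiency for Web document clustering.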
13.
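Principles and Practice of WWW Retrieval Tools  Cited by: 12 (self-citations: 1; citations by others: 11)
Building on a summary of the methods and principles of finding information on the WWW, this paper identifies two approaches: hypertext-based information seeking and retrieval-tool-based information seeking. It focuses on the technical specifications and characteristics of WWW retrieval tools, and finally introduces the functions and usage of WebCrawler, a well-known WWW retrieval tool.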
14.
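A Comparative Analysis of the Retrieval Performance of Common Chinese Search Engines  Cited by: 4 (self-citations: 0; citations by others: 4)
Starting from the changes in market share of several major Chinese search engines over recent years, this paper takes retrieval performance as the point of comparison. Using the database, search results, user interface, and search techniques as evaluation criteria representing retrieval performance, it performs a statistical analysis of common search engines and identifies the main reasons for the success of Google and Baidu, which focus on search as their core business.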
15.
Oren Kurland 《Information Retrieval》2009,12(4):437-460
To obtain high precision at top ranks by a search performed in response to a query, researchers have proposed a cluster-based re-ranking paradigm: clustering an initial list of documents that are the most highly ranked by some initial search, and using information induced from these (often called) query-specific clusters for re-ranking the list. However, results concerning the effectiveness of various automatic cluster-based re-ranking methods have been inconclusive. We show that using query-specific clusters for automatic re-ranking of top-retrieved documents is effective with several methods in which clusters play different roles, among which is the smoothing of document language models. We do so by adapting previously-proposed cluster-based retrieval approaches, which are based on (static) query-independent clusters for ranking all documents in a corpus, to the re-ranking setting wherein clusters are query-specific. The best performing method that we develop outperforms both the initial document-based ranking and some previously proposed cluster-based re-ranking approaches; furthermore, this algorithm consistently outperforms a state-of-the-art pseudo-feedback-based approach. In further exploration we study the performance of cluster-based smoothing methods for re-ranking with various (soft and hard) clustering algorithms, and demonstrate the importance of clusters in providing context from the initial list through a comparison to using single documents to this end.
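The general cluster-based re-ranking idea in this abstract, in which a document's score is adjusted using context from the query-specific clusters it belongs to, can be sketched in a few lines. This is a simplified illustration and not the paper's method: the paper works with language-model smoothing, whereas the sketch below assumes plain numeric retrieval scores, scores each cluster by the mean of its members, and linearly interpolates with a hypothetical weight `lam`. The function name `rerank_with_clusters` is likewise hypothetical.

```python
def rerank_with_clusters(scores, clusters, lam=0.5):
    """Re-rank documents by mixing each initial score with its best cluster's score.

    scores:   dict mapping doc_id -> initial retrieval score
    clusters: list of lists of doc_ids (overlapping, i.e. soft, clusters are allowed)
    lam:      assumed interpolation weight; 0 keeps the initial ranking
    """
    # Score each cluster by the mean initial score of its members (an assumption).
    cluster_scores = [sum(scores[d] for d in c) / len(c) for c in clusters]
    final = {}
    for doc, s in scores.items():
        containing = [cs for c, cs in zip(clusters, cluster_scores) if doc in c]
        # A document outside every cluster falls back to its own score as context.
        context = max(containing) if containing else s
        final[doc] = (1 - lam) * s + lam * context
    # Highest final score first; ties broken by doc_id for determinism.
    return sorted(final, key=lambda d: (-final[d], d))


scores = {"a": 0.9, "b": 0.5, "c": 0.6, "d": 0.55}
clusters = [["a", "b"], ["c", "d"]]
initial_order = rerank_with_clusters(scores, clusters, lam=0.0)
reranked = rerank_with_clusters(scores, clusters, lam=0.5)
```

In this toy input, document `b` shares a cluster with the top-scoring document `a` and therefore moves up from last place to second once cluster context is mixed in.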
16.
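Social Problems Brought by Search Engines in the Internet Age  Cited by: 1 (self-citations: 0; citations by others: 1)
Addressing the social problems arising from the various specialized services that search engines offer, this paper discusses the negative effects of these problems from the perspective of search engine operators' interests and, looking ahead to the market prospects of search engine operators, offers suggestions for improving search engine retrieval mechanisms.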
17.
《Library & information science research》2023,45(1):101222
Searches with learning intent typically require the users to interact with the searching environment and perform knowledge acquisition features such as scan, read, and process the online content to fulfill their information needs. To capture indicators from searching behaviors that could account for the knowledge gained during a Web search, a qualitative study was performed using the Concurrent Think-Aloud protocol to observe the mechanisms of transfer and map knowledge flows during 78 search sessions. Findings indicate evidence of transfer of learning in the form of sixteen online information searching strategy indicators. This research aids the understanding of how knowledge is gained during search sessions and how to identify behaviors that could indicate that learning has occurred, which could be used to represent knowledge gain on Web search engines. In this way, it can aid search engines to become not only better tools of searching, but also tools of learning.
18.
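Reducing User Query Ambiguity Based on Search Engine Category Information  Cited by: 1 (self-citations: 1; citations by others: 0)
When users retrieve information with a search engine, their queries are often ambiguous, which leads to diverse and redundant search results. Traditional approaches rely mainly on semantic analysis or knowledge-base construction and are not very feasible in practice. Based on the category information of search engines, this paper implements a simple and effective categorized search system. The system first classifies the returned results according to the user's query and presents them to the user as a tree-structured directory; it then uses the user's click data to determine the search intent step by step, thereby reducing query ambiguity. The paper describes the system's design rationale, architecture, and workflow in detail. Test cases show that the system can determine the user's query intent to some extent and return more accurate search results.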
19.
Carol A. Leibiger 《Behavioral & Social Sciences Librarian》2013,32(4):187-222
Googlitis, the overreliance on search engines for research and the resulting development of poor searching skills, is a recognized problem among today's students. Google is not an effective research tool because, in addition to encouraging keyword searching at the expense of more powerful subject searching, it only accesses the Surface Web and is driven by advertising. American higher education unwittingly fosters the use of search engines in research by emphasizing results rather than process. Academic librarians emulate teaching faculty in their reliance on lectures, and their course-related instruction is limited in its effectiveness because it is constrained to one-shot, lecture-driven sessions. A more effective way to teach research is to collaborate with faculty via problem-based and project-oriented learning tasks that incorporate authentic discipline-specific information finding and critical thinking into assignments.
20.
T. Couto N. Ziviani P. Calado M. Cristo M. Gonçalves E. S. de Moura W. Brandão 《Information Retrieval》2010,13(4):315-345
Automatic document classification can be used to organize documents in a digital library, construct on-line directories, improve the precision of web searching, or help the interactions between user and search engines. In this paper we explore how linkage information inherent to different document collections can be used to enhance the effectiveness of classification algorithms. We have experimented with three link-based bibliometric measures, co-citation, bibliographic coupling and Amsler, on three different document collections: a digital library of computer science papers, a web directory and an on-line encyclopedia. Results show that both hyperlink and citation information can be used to learn reliable and effective classifiers based on a kNN classifier. In one of the test collections used, we obtained improvements of up to 69.8% of macro-averaged F1 over the traditional text-based kNN classifier, considered as the baseline measure in our experiments. We also present alternative ways of combining bibliometric based classifiers with text based classifiers. Finally, we conducted studies to analyze the situation in which the bibliometric-based classifiers failed and show that in such cases it is hard to reach consensus regarding the correct classes, even for human judges.
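The three link-based measures named in this abstract can be illustrated with a short sketch. Co-citation compares the sets of documents that cite two documents, bibliographic coupling compares the sets of documents they cite, and Amsler combines both. The Jaccard-style normalization used below is one common formulation and is an assumption here, since the abstract does not give formulas; the function name `link_measures` and the toy citation graph are hypothetical.

```python
def _overlap(x, y):
    """Size of the intersection divided by size of the union (0.0 if both sets are empty)."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def link_measures(cites, d1, d2):
    """Compute co-citation, bibliographic coupling, and Amsler similarity for d1 and d2.

    cites: dict mapping each document to the set of documents it cites.
    """
    def children(d):
        # Documents that d cites.
        return cites.get(d, set())

    def parents(d):
        # Documents that cite d.
        return {p for p, refs in cites.items() if d in refs}

    p1, p2 = parents(d1), parents(d2)
    c1, c2 = children(d1), children(d2)
    return {
        "co_citation": _overlap(p1, p2),             # shared citing documents
        "bibliographic_coupling": _overlap(c1, c2),  # shared cited documents
        "amsler": _overlap(p1 | c1, p2 | c2),        # both directions combined
    }


# Toy citation graph: p1 and p2 cite both a and b, p3 cites only a.
cites = {
    "p1": {"a", "b"},
    "p2": {"a", "b"},
    "p3": {"a"},
    "a": {"x", "y"},
    "b": {"y", "z"},
}
measures = link_measures(cites, "a", "b")
```

Similarities of this kind could then stand in for the distance function of a kNN classifier, which is the role the abstract assigns to them.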