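Wang Ruojia (王若佳), Li Pei (李培). 图书情报工作 (Library and Information Service), 2015, 59(11): 111-118
[Purpose/Significance] This study examines the health information retrieval behavior of Chinese web users, exploring the patterns of health information retrieval on Chinese search engines in order to inform the improvement of health search engines and website construction. [Method/Process] Based on a large-scale query log from the Sogou search engine, log mining is used to study users' health information retrieval behavior from two perspectives: query behavior and click behavior. The query behavior indicators cover the session level (session length, repeated queries by users), the query string level (query string length, repeated queries), and the term level (high-frequency terms, topic classification); the click behavior indicators are click position and click content. [Results/Conclusions] Health-related queries have a relatively high repetition rate, which suggests that relevant websites could cache the results returned for highly repeated query strings. The topics of greatest public interest are diseases, health care, maternal and infant care, medical institutions, and cosmetic surgery, which suggests that site navigation should be oriented accordingly. Users prefer Q&A platforms, which suggests that site designers should pay more attention to question-and-answer interaction with users.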
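
The session-level and query-string-level indicators named in the abstract above can be illustrated with a minimal sketch. It assumes a hypothetical log layout of `(session_id, query)` pairs, which is not the actual Sogou log format, and it defines the repetition rate as the share of submissions whose query string already occurred earlier in the log; the study may use a different definition.

```python
from collections import Counter, defaultdict

def query_log_metrics(records):
    """Compute simple indicators from (session_id, query) pairs.

    The record layout is an assumption made for illustration only.
    """
    sessions = defaultdict(list)
    query_counts = Counter()
    for session_id, query in records:
        sessions[session_id].append(query)
        query_counts[query] += 1

    total = sum(query_counts.values())
    if total == 0:
        return {"mean_session_length": 0.0,
                "mean_query_length": 0.0,
                "repeat_rate": 0.0}

    # Submissions beyond the first occurrence of each distinct query string
    repeated = total - len(query_counts)
    # Query length is measured in characters, weighted by submission count
    total_chars = sum(len(q) * c for q, c in query_counts.items())
    return {
        "mean_session_length": total / len(sessions),
        "mean_query_length": total_chars / total,
        "repeat_rate": repeated / total,
    }
```

A high `repeat_rate` is the kind of signal that would motivate caching results for frequently repeated query strings.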

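Optimization of the Robot Search Algorithm in Search Engines   Total citations: 15, self-citations: 0, citations by others: 15
Current search engines increasingly reveal their shortcomings. When a user enters specific keywords, the results returned often number in the thousands or even millions and contain large amounts of duplicate and junk information, so picking out the pages of interest still takes a long time. In other cases, important pages clearly exist on the Web but are never discovered by the search engine's robot. To address this, the paper focuses on the search strategy in search engines and improves the search algorithm so that, already in the crawling stage, the robot can adequately process the URL list with which it interacts frequently. The PageRank of each page is computed from its content, its HTML structure, and the hyperlink information it contains, so that the URL list can be reordered by importance. Preliminary experimental results show that the optimized algorithm can considerably improve the overall performance of the search engine.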
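
The idea of reordering the robot's URL list by importance can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard link-only PageRank power iteration, not the paper's variant that also weighs page content and HTML structure, and the function names `pagerank` and `order_frontier` are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Link-only PageRank by power iteration.

    links maps each URL to the list of URLs it links to.
    """
    nodes = set(links) | {v for targets in links.values() for v in targets}
    n = len(nodes)
    if n == 0:
        return {}
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread evenly over all pages
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n
                    for u in nodes}
        for u, targets in links.items():
            for v in targets:
                new_rank[v] += damping * rank[u] / len(targets)
        rank = new_rank
    return rank

def order_frontier(frontier, rank):
    """Sort pending URLs so that higher-ranked pages are fetched first."""
    return sorted(frontier, key=lambda u: rank.get(u, 0.0), reverse=True)
```

In a crawler, `order_frontier` would be re-run as new links are discovered, so that unknown URLs (rank 0.0 here) sink to the end of the list until their rank has been estimated.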

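How to Find Invisible Web Resources   Total citations: 2, self-citations: 0, citations by others: 2
As is well known, the Internet is a repository of all kinds of information, an all-encompassing encyclopedia. To help every user obtain the information they need more effectively, a large number of search engines have emerged on the web, including Google, Yahoo, and Infoseek. Generally speaking, these search engines use URLs and keywords to index and store the web pages in their databases; when a user submits a query, the search engine first looks up pages by the URLs stored in its database and returns the relevant results. However, these search engines cannot search all the information on the Internet. Recently, people have noticed a kind of web page called the "invisible web", also known as "deep" or hidden web pages. Simply put, these are pages that, for various reasons, cannot be found by ordinary search engines such as Google and Yahoo. According to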

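[Purpose/Significance] Most existing clickstream-based user models simply substitute page-type sequences for behavior sequences. This paper proposes a scheme for mapping the sequence of pages visited in a clickstream to user behaviors, addressing the problem of user behavior modeling. [Method/Process] Based on an analysis of features such as web page URL parameters and page content, and using 81,759 e-commerce user sessions as test samples, the paper proposes and implements a method for mapping pages to user behaviors, and gives a scheme for describing sessions by building user behavior sequences from raw logs. [Results/Conclusions] The analysis reveals behavioral features at the session level that are hard to obtain in existing studies, and yields six types of sessions with different behavior patterns: function exploration sessions, seller management sessions, marketing-driven sessions, profile management sessions, product browsing sessions, and search-dependent sessions. Modeling user sessions from clickstreams makes it possible to derive the behavior sequence features within sessions, which is of significant value for accurate marketing and recommendation.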
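
A minimal sketch of mapping visited pages to behaviors via URL features might look like the following. The rule table, URL paths, parameter names, and behavior labels are all hypothetical; the paper's actual mapping also uses page content and is not reproduced here.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical rules: (path prefix, required query parameter or None, label)
RULES = [
    ("/search", "q", "search"),
    ("/item", "id", "view_item"),
    ("/seller/manage", None, "manage_shop"),
    ("/account", None, "manage_profile"),
]

def url_to_behaviour(url):
    """Return the label of the first rule that matches the URL."""
    parts = urlparse(url)
    params = parse_qs(parts.query)
    for prefix, required_param, label in RULES:
        path_matches = parts.path.startswith(prefix)
        param_matches = required_param is None or required_param in params
        if path_matches and param_matches:
            return label
    return "other"

def session_to_sequence(urls):
    """Turn one session's ordered page visits into a behaviour sequence."""
    return [url_to_behaviour(u) for u in urls]
```

The resulting label sequences, rather than raw page types, would then be the input to whatever clustering step groups sessions into behavior patterns.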

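Analysis of search engine user logs is important both for academic research in information retrieval and for search engine optimization. This article systematically analyzes about 20 GB of logs from the Sina iAsk search engine (http://iask.com/), identifies many characteristics of Chinese-language search, and raises a number of questions based on these observations. The findings are important for understanding user search behavior, improving search engine systems, and advancing Chinese information retrieval research. This article is part of the featured topic "Understanding User Queries" in issue 7 of 2008.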

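User Query Disambiguation Based on Search Engine Classification Information   Total citations: 1, self-citations: 1, citations by others: 0
When users search with a search engine, their queries are often ambiguous, which makes the search results diverse and redundant. Traditional approaches rely mainly on semantic analysis or on building knowledge bases, and are not very feasible in practice. Based on the classification information of search engines, this paper implements a simple and effective categorized search system. The system first classifies the returned results according to the user's query and presents them to the user as a tree-structured directory; it then uses the user's click data to progressively determine the search intent, thereby reducing query ambiguity. The paper describes the system's design rationale, architecture, and workflow in detail. Test cases show that the system can determine the user's query intent to a certain extent and return more accurate search results.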

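Research on Information Filtering Mechanisms of Intelligent Search Engines   Total citations: 3, self-citations: 0, citations by others: 3
Intelligent search engines are the product of combining artificial intelligence with traditional search engine technology. In a network environment where information is constantly being replaced, intelligent search engines offer information processing mechanisms such as intelligent natural-language filtering, intelligent multi-document processing, and intelligent user services. To promote their development, more attention should be paid to research on user modeling, the development and practice of multi-agent intelligent search engine systems should be strengthened, and research on key technologies should be intensified.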

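With the development of the Internet, online resources keep growing and open databases keep appearing. To help users make better use of network resources, hundreds of search engines serve users on the Internet. However, search engines can only search pages, not the contents of databases, and the latter is a matter of current concern. In addition, Internet users are spread across the world and use different languages, so retrieval of multilingual data is also a problem studied by IT professionals. Based on Chinese herbal medicine data, this paper implements multilingual retrieval by building a multilingual thesaurus, and addresses the problem of searching and connecting multiple databases of differing styles by establishing a set of URL commands.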

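Search engine logs record the entire process of interaction between users and the system. Mining log files can reveal the behavioral characteristics and patterns of users' Web searching and effectively improve the performance of search engine systems. Building on a systematic review and summary of related research in China and abroad, the article proposes a research framework for Web search engine log mining, covering the research content of log mining, methods for selecting data sets, methods for data preprocessing, the characteristics and comparison of user behavior across regions, and how the findings can be applied to improve system performance.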

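This paper uses large-scale search logs to conduct a statistical study of users' long-sentence queries in Chinese. By analyzing the Chinese long-sentence queries in the logs, it identifies the characteristics of frequently occurring query types and studies the relationship between user search behavior and query length, query type, and query frequency. It further examines how users modify query terms within a session and summarizes the characteristics and patterns of query reformulation methods and length changes. Finally, queries of different lengths are submitted to three commercial search engines and the overlap rate of their results is calculated. The analysis shows that although most queries today are short, short queries cannot satisfy all of users' retrieval needs. Especially as search engines move toward semantic retrieval, analyzing and exploiting long-sentence queries provides deeper insight into the characteristics of users' query wording and their search and click behavior, which benefits both the improvement of query technology and the construction of semantic spaces.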
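
The overlap calculation across search engines can be sketched as below. The abstract does not define its overlap rate, so this sketch assumes one common choice, the Jaccard ratio of shared URLs to all distinct URLs returned by a pair of engines; the engine names used in any call are placeholders.

```python
from itertools import combinations

def overlap_rate(results_a, results_b):
    """Jaccard overlap between two result lists (an assumed definition)."""
    a, b = set(results_a), set(results_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def pairwise_overlap(results_by_engine):
    """Overlap rate for every pair of engines for one query.

    results_by_engine maps an engine name to its list of result URLs.
    """
    return {
        (e1, e2): overlap_rate(results_by_engine[e1], results_by_engine[e2])
        for e1, e2 in combinations(sorted(results_by_engine), 2)
    }
```

Averaging these pairwise values over queries grouped by length would show whether long queries produce more or less agreement between engines than short ones.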


Today's learners operate in digital environments which can be largely navigated with no human intervention. At the same time, libraries spend millions of dollars to provide access to content which our users may never know is available to them. Through the Open SESMO (Search Engine & Social Media Optimization) database project, Montana State University (MSU) Library applied search engine optimization and structured data with the Schema.org vocabulary, linked data models and practices, and social media optimization techniques to all the library's subscribed databases. Our research shows that Open SESMO creates significant return-on-investment with substantially increased traffic to our paid resources by our users as evidenced through analytics and metrics. In the core research of the article, we take a quantitative look at the pre/post results to assess the Open SESMO method and its impact on organic search referrals and use of the collection, analyzing data from three distinct fall semesters. Returns include demonstrated library value through database recommendations, connecting researchers to subject librarians, and increased visitation to our library's paid databases with growth in organic search referrals, impressions, and click-through rates. This project offers a standard and innovative practice for other libraries to employ in surfacing their paid databases to users through the open web by applying structured and linked data methods.

Query suggestion, which enables the user to revise a query with a single click, has become one of the most fundamental features of Web search engines. However, it has not been clear what circumstances cause the user to turn to query suggestion. In order to investigate when and how the user uses query suggestion, we analyzed three kinds of data sets obtained from a major commercial Web search engine, comprising approximately 126 million unique queries, 876 million query suggestions and 306 million action patterns of users. Our analysis shows that query suggestions are often used (1) when the original query is a rare query, (2) when the original query is a single-term query, (3) when query suggestions are unambiguous, (4) when query suggestions are generalizations or error corrections of the original query, and (5) after the user has clicked on several URLs in the first search result page. Our results suggest that search engines should provide better assistance especially when rare or single-term queries are input, and that they should dynamically provide query suggestions according to the searcher’s current state.

Bing and Google customize their results to target people with different geographic locations and languages but, despite the importance of search engines for web users and webometric research, the extent and nature of these differences are unknown. This study compares the results of seventeen random queries submitted automatically to Bing for thirteen different English geographic search markets at monthly intervals. Search market choice alters a small majority of the top 10 results but less than a third of the complete sets of results. Variation in the top 10 results over a month was about the same as variation between search markets but variation over time was greater for the complete results sets. Most worryingly for users, there were almost no ubiquitous authoritative results: only one URL was always returned in the top 10 for all search markets and points in time, and Wikipedia was almost completely absent from the most common top 10 results. Most importantly for webometrics, results from at least three different search markets should be combined to give more reliable and comprehensive results, even for queries that return fewer than the maximum number of URLs.

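Research on a Performance Evaluation System for Meta Search Engines   Total citations: 1, self-citations: 0, citations by others: 1
Comparative and evaluative research on meta search engines is still lacking in China. Drawing on evaluations of single search engines and on research by scholars abroad, this paper puts forward a preliminary set of main performance indicators for evaluating meta search engines and briefly describes the performance and characteristics of some major meta search engines, which is useful both for the development of meta search engines and for their users. 8 references.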

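Search Engine Optimization Strategies Based on User Behavior Analysis   Total citations: 8, self-citations: 0, citations by others: 8
While search engines bring convenience to web users, they also reveal shortcomings. From the perspective of how web users use search engines, the paper analyzes the problems of search engines and the obstacles users encounter when using them, and on this basis proposes an optimization model for search engines. It also suggests that users, knowledge producers, and knowledge organizers together with the search engine should form a single information system in order to optimize the search engine.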

This study examined ten selected word pairs, each containing a word's full spelling and its abbreviation, to determine which form search engine users preferred in searching. Using seven search logs gathered from several Internet search engines with approximately 608 MB of data, the study measured the occurrences of the twenty terms. The selected words are important in library cataloging, for some are prescribed abbreviations in metadata content standards. The study found that in eight of the ten word pairs users preferred to search words’ full spellings over the abbreviations, often by a high margin.
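
Counting how often each form of a word pair occurs in a search log can be sketched as follows. The tokenization (lowercased runs of ASCII letters) and the example pair are assumptions made for illustration; the abstract does not list the study's ten word pairs or describe its counting procedure.

```python
import re
from collections import Counter

def count_forms(queries, word_pairs):
    """Count occurrences of each full spelling and its abbreviation.

    word_pairs is a list of (full_spelling, abbreviation) tuples in
    lowercase; the result maps each full spelling to a tuple of
    (full_count, abbreviation_count).
    """
    tokens = Counter()
    for query in queries:
        # Split on anything that is not a letter, so "ed." yields "ed"
        tokens.update(re.findall(r"[a-z]+", query.lower()))
    return {full: (tokens[full], tokens[abbr]) for full, abbr in word_pairs}
```

Whole-token matching avoids counting an abbreviation such as "ed" inside longer words like "edited", which a plain substring search would do.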

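Mao Zhenpeng (毛振鹏), Hu Bin (胡滨), Dai Haiyan (代海岩). 晋图学刊 (Jintu Xuekan), 2005, (5): 23-25, 39
Establishing a quality evaluation system for search engines can guide users in web information retrieval and website search engine optimization, and promote the continuous upgrading of search engine functions. Within such a system, the overall qualitative evaluation assesses a search engine's user comfort, degree of specialization, and degree of intelligence, while the quantitative evaluation applies traditional retrieval indicators and web retrieval indicators to assess individual aspects of the search engine.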

Search engine optimization, or the practice of designing a web site so that it rises to the top of the results page when users search for particular keywords or phrases, has become so prevalent on the modern web that it has a significant influence on Google search results. This article examines the techniques used by search engine optimization practitioners, the difference between “white hat” and “black hat” optimization tactics, and why it is important for library staff to understand these techniques and their impact on search engine results pages. It also looks at ways that library staff can help their users develop awareness of the factors that influence search results and how to better assess the quality and relevance of results listings.

Using EndNote version 7.0, the authors tested the search capabilities of the EndNote search engine for retrieving citations from MEDLINE for importation into EndNote, a citation management software package. Ovid MEDLINE and PubMed were selected for the comparison. Several searches were performed on Ovid MEDLINE and PubMed using EndNote as the search engine, and the same searches were run on both Ovid and PubMed directly. Findings indicate that it is preferable to search MEDLINE directly rather than using EndNote. The publishers of EndNote do warn its users about the limitations of their product as a search engine when searching external databases. In this article, the limitations of EndNote as a search engine for searching MEDLINE were explored as related to MeSH, non-MeSH, citation verification, and author searching.

