Similar literature
 20 similar documents found
1.
Over the past decade, worldwide Internet usage has grown tremendously, with the most rapid growth in emerging economies such as those in Latin America and the Middle East, where people speaking different languages actively seek information on the web. Global search engines may not adequately address local users’ needs, while regional web portals may lack rich web content. Unlike search engines, web directories organize sites and pages into intuitive hierarchical structures to facilitate browsing. However, high-quality web directories in users’ native languages often do not exist, and their development requires much domain knowledge that is not readily available. In this research, we proposed a novel semi-automatic approach to facilitate web repository management. We applied the approach to developing web directories in the business and health-care domains for the Spanish-speaking and Arabic-speaking communities, respectively. The two directories contain 4735 and 5107 unique sites and pages, respectively, with a maximum depth of 5 levels. Results of experiments involving 37 native speakers show that these directories outperformed existing benchmark directories in terms of browsing effectiveness and efficiency, providing strong implications for information professionals and multinational enterprise managers.

2.
Users of search engines on the World Wide Web frequently search for information about people using personal names. Current search engines only return sets of documents containing the name queried, but, as several people usually share a personal name, the resulting sets often contain documents relevant to several people. It is necessary to disambiguate people in these result sets in order to help users find the person of interest more readily. In the task of name disambiguation, effective measurement of similarities between the documents is a crucial step towards the final disambiguation. We propose a new method that uses web directories as a knowledge base to find common contexts in documents and uses the common contexts measure to determine document similarities. Experiments, conducted on documents mentioning real people on the web, together with several famous web directory structures, suggest that there are significant advantages in using web directories to disambiguate people compared with other conventional methods.

3.
Search engines are essential for finding information on the World Wide Web. We conducted a study to measure the effectiveness of eight search engines. Expert searchers sought information on the Web for users who had legitimate needs for information, and these users assessed the relevance of the information retrieved. We calculated traditional information retrieval measures of recall and precision at varying numbers of retrieved documents and used these as the bases for statistical comparisons of retrieval effectiveness among the eight search engines. We also calculated the likelihood that a document retrieved by one search engine was retrieved by other search engines as well.
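
The measures named in this abstract, precision and recall at a cutoff of k retrieved documents, can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's own procedure: the function name `precision_recall_at_k` and the document identifiers are hypothetical, and relevance is assumed to be a binary judgment per document.

```python
def precision_recall_at_k(ranked, relevant, k):
    """Return (precision, recall) over the top-k retrieved documents.

    ranked   -- list of document ids in the order the engine returned them
    relevant -- set of document ids judged relevant (binary judgments assumed)
    k        -- cutoff: how many retrieved documents to consider
    """
    top_k = ranked[:k]
    hits = sum(1 for doc in top_k if doc in relevant)
    # Precision: share of the retrieved documents that are relevant.
    precision = hits / len(top_k) if top_k else 0.0
    # Recall: share of all relevant documents that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four retrieved documents, three relevant ones.
ranked = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3", "d5"}
for k in (2, 4):
    print(k, precision_recall_at_k(ranked, relevant, k))
```

Computing the pair at several cutoffs gives one curve per engine, which is the kind of data a statistical comparison across engines would start from.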

4.
Commercial search engines are now playing an increasingly important role in Web information dissemination and access. Of particular interest to businesses and national governments is whether the big engines have coverage biased towards the US or other countries. In our study we tested for national biases in three major search engines and found significant differences in their coverage of commercial Web sites. The US sites were much better covered than the others in the study: sites from China, Taiwan and Singapore. We then examined the possible technical causes of the differences and found that the language of a site does not affect its coverage by search engines. However, the visibility of a site, measured by the number of links to it, affects its chance of being covered by search engines. We conclude that the coverage bias does exist, but that it is due not to deliberate choices by the search engines but to the natural cumulative advantage of US sites on the Web. Nevertheless, the bias remains a cause for international concern.

5.
The Web, and especially major Web search engines, are essential tools for many people in the quest to locate online information. This paper reports results from research that examines characteristics and changes in Web searching from nine studies of five Web search engines based in the US and Europe. We compare interactions occurring between users and Web search engines from the perspectives of session length, query length, query complexity, and content viewed among the Web search engines. The results of our research show that (1) users are viewing fewer result pages, (2) searchers on US-based Web search engines use more query operators than searchers on European-based search engines, (3) there are statistically significant differences in the use of Boolean operators and result pages viewed, and (4) one cannot necessarily apply results from studies of one particular Web search engine to another Web search engine. The widespread use of Web search engines, employment of simple queries, and decreased viewing of result pages may have resulted from algorithmic enhancements by Web search engine companies. We discuss the implications of the findings for the development of Web search engines and design of online content.

6.
There was a proliferation of electronic information sources and search engines in the 1990s. Many of these information sources became available through the ubiquitous interface of the Web browser. Diverse information sources became accessible to information professionals and casual end users alike. Much of the information was also hyperlinked, so that information could be explored by browsing as well as searching. While vast amounts of information were now just a few keystrokes and mouse clicks away, as the choices multiplied, so did the complexity of choosing where and how to look for the electronic information. Much of the complexity in information exploration at the turn of the twenty-first century arose because there was no common cataloguing and control system across the various electronic information sources. In addition, the many search engines available differed widely in terms of their domain coverage, query methods and efficiency. Meta-search engines were developed to improve search performance by querying multiple search engines at once. In principle, meta-search engines could greatly simplify the search for electronic information by selecting the subset of first-level search engines and digital libraries to which a query is submitted, based on the characteristics of the user, the query/topic, and the search strategy. This selection would be guided by diagnostic knowledge about which of the first-level search engines works best under what circumstances. Programmatic research is required to develop this diagnostic knowledge about first-level search engine performance. This paper introduces an evaluative framework for this type of research and illustrates its use in two experiments. The experimental results obtained are used to characterize some properties of leading search engines (as of 1998). Significant interactions were observed between search engine and two other factors (time of day and Web domain). 
These findings supplement those of earlier studies, providing preliminary information about the complex relationship between search engine functionality and performance in different contexts. While the specific results obtained represent a time-dependent snapshot of search engine performance in 1998, the evaluative framework proposed should be generally applicable in the future.

7.
The performance and capabilities of Web search engines are an important and significant area of research. Millions of people worldwide use Web search engines every day. This paper reports the results of a major study examining the overlap among results retrieved by multiple Web search engines for a large set of more than 10,000 queries. Previous smaller studies have discussed a lack of overlap in results returned by Web search engines for the same queries. The goal of the current study was to conduct a large-scale study to measure the overlap of search results on the first result page (both non-sponsored and sponsored) across the four most popular Web search engines, at specific points in time using a large number of queries. The Web search engines included in the study were MSN Search, Google, Yahoo! and Ask Jeeves. Our study then compares these results with the first page results retrieved for the same queries by the metasearch engine Dogpile.com. Two sets of randomly selected user-entered queries, one of 10,316 queries and the other of 12,570 queries, from Infospace’s Dogpile.com search engine (the first set was from Dogpile, the second from across the Infospace Network of search properties) were submitted to the four single Web search engines. Findings show that the percent of total results unique to only one of the four Web search engines was 84.9%, shared by two of the Web search engines was 11.4%, shared by three of the Web search engines was 2.6%, and shared by all four Web search engines was 1.1%. This small degree of overlap shows the significant difference in the way major Web search engines retrieve and rank results in response to given queries. Results point to the value of metasearch engines in Web retrieval to overcome the biases of individual search engines.
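
The overlap percentages reported in this abstract (results unique to one engine, shared by two, by three, or by all four) can be computed along the following lines. This is a hedged sketch, not the study's actual pipeline: the function name `overlap_distribution`, the engine labels, and the URLs are hypothetical, and results are assumed to be comparable as exact URL strings.

```python
from collections import Counter


def overlap_distribution(results_by_engine):
    """Percent of distinct result URLs returned by exactly 1, 2, ... engines.

    results_by_engine -- dict mapping an engine name to the list of URLs
                         on its first result page for one query
    """
    engines_per_url = Counter()
    for urls in results_by_engine.values():
        # set() so a URL repeated by one engine is counted once for it.
        for url in set(urls):
            engines_per_url[url] += 1

    total = len(engines_per_url)
    by_count = Counter(engines_per_url.values())
    return {
        k: (100.0 * by_count.get(k, 0) / total if total else 0.0)
        for k in range(1, len(results_by_engine) + 1)
    }


# Hypothetical first-page results for a single query.
pages = {
    "A": ["u1", "u2", "u3"],
    "B": ["u2", "u3", "u4"],
    "C": ["u3", "u5"],
    "D": ["u3"],
}
print(overlap_distribution(pages))
```

In a study over thousands of queries, the per-query counts would be pooled before taking percentages; the sketch handles one query to keep the logic visible.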

8.
Ecommerce is developing into a fast-growing channel for new business, so a strong presence in this domain could prove essential to the success of numerous commercial organizations. However, there is little research examining ecommerce at the individual customer level, particularly on the success of everyday ecommerce searches. This is critical for the continued success of online commerce. The purpose of this research is to evaluate the effectiveness of search engines in the retrieval of relevant ecommerce links. The study examines the effectiveness of five different types of search engines in response to ecommerce queries by comparing the engines’ quality of ecommerce links using topical relevancy ratings. This research employs 100 ecommerce queries, five major search engines, and more than 3540 Web links. The findings indicate that links retrieved using an ecommerce search engine are significantly better than those obtained from most other engine types but do not significantly differ from links obtained from a Web directory service. We discuss the implications for Web system design and ecommerce marketing campaigns.

9.
A user’s single session with a Web search engine or information retrieval (IR) system may consist of seeking information on single or multiple topics and switching between tasks, that is, multitasking information behavior. Most Web search sessions consist of two queries of approximately two words. However, some Web search sessions consist of three or more queries. We present findings from two studies: first, a study of two-query search sessions on the AltaVista Web search engine, and second, a study of three or more query search sessions on the AltaVista Web search engine. We examine the degree of multitasking search and information task switching during these two sets of AltaVista Web search sessions. A sample of two-query and three or more query sessions was filtered from AltaVista transaction logs from 2002 and qualitatively analyzed. Sessions ranged in duration from less than a minute to a few hours. Findings include: (1) 81% of two-query sessions included multiple topics, (2) 91.3% of three or more query sessions included multiple topics, (3) there is a broad variety of topics in multitasking search sessions, and (4) three or more query sessions sometimes contained frequent topic changes. Multitasking is found to be a growing element in Web searching. This paper proposes an approach to interactive information retrieval (IR) contextually within a multitasking framework. The implications of our findings for Web design and further research are discussed.

10.
Web 2.0 and folksonomies in a library context   Total citations: 1, self-citations: 0, citations by others: 1
Libraries have a societal purpose and this role has become increasingly important as new technologies enable organizations to support, enable and enhance the participation of users in assuming an active role in the creation and communication of information. Folksonomies, a Web 2.0 technology, represent such an example. Folksonomies result from individuals freely tagging resources available to them on a computer network. In a library environment folksonomies have the potential of overcoming certain limitations of traditional classification systems such as the Library of Congress Subject Headings (LCSH). Typical limitations of this type of classification system include, for example, the rigidity of the underlying taxonomical structures and the difficulty of introducing change in the categories. Folksonomies represent a supporting technology to existing classification systems, helping to describe library resources more flexibly, dynamically and openly. As a review of the current literature shows, the adoption of folksonomies in libraries is novel and limited research has been carried out in the area. This paper presents research into the adoption of folksonomies for a University library. A Web 2.0 system was developed, based on the requirements collected from library stakeholders, and integrated with the existing library computer system. An evaluation of the work was carried out in the form of a survey in order to understand the possible reactions of users to folksonomies as well as the effects on their behavior. The broad conclusion of this work is that folksonomies seem to have a beneficial effect on users’ involvement as active library participants and to encourage users to browse the catalogue in more depth.

11.
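Zhao Jinhai (赵金海). 《现代情报》, 2007, 27(3): 62-64
Starting from desktop search tools, search engine guides, directories, and published literature, this paper reviews the types, performance, and distinguishing features of the main existing foreign resources that discuss search engines. On this basis, it recommends the best resources on search engines, offering reference materials and learning paths for people who want to master search engine resources, search techniques and methods, and optimized retrieval strategies.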

12.
Comparing rankings of search results on the Web   Total citations: 1, self-citations: 0, citations by others: 1
The Web has become an information source for professional data gathering. Because of the vast amounts of information on almost all topics, one cannot systematically go over the whole set of results, and therefore must rely on the ordering of the results by the search engine. It is well known that search engines on the Web have low overlap in terms of coverage. In this study we measure how similar the rankings of search engines are on the overlapping results. We compare rankings of results for identical queries retrieved from several search engines. The method is based only on the set of URLs that appear in the answer sets of the engines being compared. For comparing the similarity of rankings of two search engines, the Spearman correlation coefficient is computed. When comparing more than two sets, Kendall’s W is used. These are well-known measures and the statistical significance of the results can be computed. The methods are demonstrated on a set of 15 queries that were submitted to four large Web search engines. The findings indicate that the large public search engines on the Web employ considerably different ranking algorithms.
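
The two measures named in this abstract can be sketched as below, restricted to the URLs that every compared engine returned. This is an illustrative sketch under assumptions, not the paper's implementation: the function names and URLs are hypothetical, each ranking is assumed to contain no duplicate URLs, and re-ranking the overlapping URLs from 1 to n (which avoids ties) is one possible way to handle the overlap.

```python
def relative_ranks(ranking, common):
    """Re-rank only the URLs in `common`, preserving their original order."""
    kept = [url for url in ranking if url in common]
    return {url: position + 1 for position, url in enumerate(kept)}


def spearman_on_overlap(ranking_a, ranking_b):
    """Spearman's rho over the URLs that both rankings contain."""
    common = set(ranking_a) & set(ranking_b)
    n = len(common)
    if n < 2:
        return None  # correlation is undefined for fewer than two items
    ranks_a = relative_ranks(ranking_a, common)
    ranks_b = relative_ranks(ranking_b, common)
    d_squared = sum((ranks_a[u] - ranks_b[u]) ** 2 for u in common)
    # Tie-free formula: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def kendalls_w(rankings):
    """Kendall's W (concordance) over URLs present in all rankings."""
    common = set(rankings[0]).intersection(*rankings[1:])
    n, m = len(common), len(rankings)
    if n < 2:
        return None
    ranks = [relative_ranks(r, common) for r in rankings]
    totals = [sum(r[u] for r in ranks) for u in common]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    # Tie-free formula: W = 12 * S / (m^2 * (n^3 - n))
    return 12 * s / (m * m * (n ** 3 - n))


# Hypothetical rankings: "x" and "y" appear in only one list and are ignored.
engine_a = ["u1", "x", "u2", "u3"]
engine_b = ["u3", "u2", "y", "u1"]
print(spearman_on_overlap(engine_a, engine_b))
```

A rho near 1 means the two engines order their shared results alike; W plays the same role for three or more engines and ranges from 0 (no agreement) to 1 (identical orderings).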

13.
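The visual search engine Kartoo   Total citations: 1, self-citations: 0, citations by others: 1
Cao Hongbing (曹红兵). 《现代情报》, 2005, 25(9): 78-79, 82
Because visual search engines present information visually and have advantages that traditional search engines cannot match, they have been called the next generation of search engines. Taking Kartoo as an example, this paper introduces the working principle, search interface, search functions, features, and usage of this kind of visual search engine.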

14.
This paper examines a real-time measure of bias in Web search engines. The measure captures the degree to which the distribution of URLs, retrieved in response to a query, deviates from an ideal or fair distribution for that query. This ideal is approximated by the distribution produced by a collection of search engines. Differences between bias and classical retrieval measures are highlighted by examining the possibilities for bias in four extreme cases of recall and precision. The results of experiments examining the influence on bias measurement of subject domains, search engines, and search terms are presented. Three general conclusions are drawn: (1) the performance of search engines can be distinguished with the aid of the bias measure; (2) bias values depend on the subject matter under consideration; (3) choice of search terms does not account for much of the variance in bias values. These conclusions underscore the need to develop “bias profiles” for search engines.

15.
Web search engines are beginning to offer access to multimedia searching, including audio, video and image searching. In this paper we report findings from a study examining the state of multimedia search functionality on major general and specialized Web search engines. We investigated 102 Web search engines to examine: (1) how many Web search engines offer multimedia searching, (2) the type of multimedia search functionality and methods offered, such as “query by example”, and (3) the support for personalization or customization accessible as advanced search. Findings include: (1) few major Web search engines offer multimedia searching and (2) multimedia Web search functionality is generally limited. Our findings show that, despite the increasing level of interest in multimedia Web search, the few Web search engines offering multimedia Web search provide limited multimedia search functionality. Keywords are still the only means of multimedia retrieval, while other methods such as “query by example” are offered by fewer than 1% of the Web search engines examined.

16.
Multitasking is the human ability to handle the demands of multiple tasks. Multitasking behavior involves the ordering of multiple tasks and switching between tasks. People often multitask when using information retrieval (IR) technologies as they seek information on more than one information problem over single or multiple search episodes. However, few studies have examined how people order their information problems, especially during their Web search engine interaction. The aim of our exploratory study was to investigate assigned information problem ordering by forty (40) study participants engaged in Web search. Findings suggest that assigned information problem ordering was influenced by several factors: personal interest, problem knowledge, perceived level of information available on the Web, ease of finding information, level of importance, and seeking information on information problems in order from general to specific. Personal interest and problem knowledge were the major factors during assigned information problem ordering. Implications of the findings and further research are discussed. The relationship between information problem ordering and gratification theory is an important area for further exploration.

17.
Across the world, millions of users interact with search engines every day to satisfy their information needs. As the Web grows bigger over time, such information needs, manifested through user search queries, also become more complex. However, there has been no systematic study that quantifies the structural complexity of Web search queries. In this research, we make an attempt towards understanding and characterizing the syntactic complexity of search queries using a multi-pronged approach. We use traditional statistical language modeling techniques to quantify and compare the perplexity of queries with natural language (NL). We then use complex network analysis for a comparative analysis of the topological properties of queries issued by real Web users and those generated by statistical models. Finally, we conduct experiments to study whether search engine users are able to identify real queries, when presented along with model-generated ones. The three complementary studies show that the syntactic structure of Web queries is more complex than what n-grams can capture, but simpler than NL. Queries, thus, seem to represent an intermediate stage between syntactic and non-syntactic communication.

18.
A growing body of research is beginning to explore the information-seeking behavior of Web users. The vast majority of these studies have concentrated on the area of textual information retrieval (IR). Little research has examined how people search for non-textual information on the Internet, and few large-scale studies have investigated visual information-seeking behavior with general-purpose Web search engines. This study examined visual information needs as expressed in users’ Web image queries. The data set examined consisted of 1,025,908 sequential queries from 211,058 users of Excite, a major Internet search service. Twenty-eight terms were used to identify queries for both still and moving images, resulting in a subset of 33,149 image queries by 9855 users. We provide data on: (1) image queries – the number of queries and the number of search terms per user, (2) image search sessions – the number of queries per user and the modifications made to subsequent queries in a session, and (3) image terms – their rank/frequency distribution and the most highly used search terms. On average, there were 3.36 image queries per user containing an average of 3.74 terms per query. Image queries contained a large number of unique terms. The most frequently occurring image-related terms appeared less than 10% of the time, with most terms occurring only once. We contrast this to earlier work by P.G.B. Enser, Journal of Documentation 51 (2) (1995) 126–170, who examined written queries for pictorial information in a non-digital environment. Implications for the development of models for visual information retrieval, and for the design of Web search engines are discussed.

19.
Searching a large document collection for relevant material that satisfies a user’s information need is a critical activity for web search engines. Query expansion (QE) techniques are widely used by search engines to disambiguate the user’s information need and to improve information retrieval (IR) performance. Knowledge-based, corpus-based and relevance feedback methods are the main QE techniques. They employ different approaches for expanding the user query with synonyms of the search terms (word synonymy) in order to bring in more relevant documents, and for filtering out documents that contain the search terms but with a different meaning than the user intended (also known as the word polysemy problem). This work surveys existing query expansion techniques, highlights their strengths and limitations, and introduces a new method that combines the power of knowledge-based or corpus-based techniques with that of relevance feedback. Experimental evaluation on three information retrieval benchmark datasets shows that the application of knowledge-based or corpus-based query expansion techniques on the results of the relevance feedback step improves the information retrieval performance, with knowledge-based techniques providing significantly better results than their simple relevance feedback alternatives in all sets.

20.
This paper reports results from a study exploring the multimedia search functionality of Chinese language search engines. Web searching in Chinese (Mandarin) is a growing research area and a technical challenge for popular commercial Web search engines. Few studies have been conducted on Chinese language search engines. We investigate two research questions: which Chinese language search engines provide multimedia searching, and what multimedia search functionalities are available in Chinese language Web search engines. Specifically, we examine each Web search engine’s (1) features permitting Chinese language multimedia searches, (2) extent of search personalization and user control of multimedia search variables, and (3) the relationships between Web search engines and their features in the Chinese context. Key findings show that Chinese language Web search engines offer limited multimedia search functionality, and general search engines provide a wider range of features than specialized multimedia search engines. Study results have implications for Chinese Web users, Website designers and Web search engine developers.
