1.

Characteristics of question format web queries: an exploratory study

《Information processing & management》2002,38(4):453-471

Web queries in question format are becoming a common element of a user's interaction with Web search engines. Web search services such as Ask Jeeves – a publicly accessible question and answer (Q&A) search engine – request users to enter question format queries. This paper provides results from a study examining queries in question format submitted to two different Web search engines – Ask Jeeves, which explicitly encourages queries in question format, and the Excite search service, which does not. We identify the characteristics of queries in question format in two different data sets, 30,000 Ask Jeeves queries and 15,575 Excite queries, including the nature, length, and structure of queries in question format. Findings include: (1) 50% of Ask Jeeves queries and less than 1% of Excite queries were in question format, (2) most users entered only one query in question format with little query reformulation, (3) there was a limited range of formats for queries in question format – mainly “where”, “what”, or “how” questions, (4) the most common question query format was “Where can I find ...” for general information on a topic, and (5) non-question queries may be in request format. Overall, four types of user Web queries were identified: keyword, Boolean, question, and request. These findings provide an initial mapping of the structure and content of queries in question and request format. Implications for Web search services are discussed.

2.

Children's query types and reformulations in Google search

Dania Bilal Jacek Gwizdka 《Information processing & management》2018,54(6):1022-1041

We investigated the searching behaviors of twenty-four children in grades 6, 7, and 8 (ages 11–13) in finding information on three types of search tasks in Google. Children conducted 72 search sessions and issued 150 queries. Children's phrase- and question-like queries combined were much more prevalent than keyword queries (70% vs. 30%, respectively). Fifty-two percent of the queries were reformulations (33 sessions). We classified children's query reformulation types into five classes based on the taxonomy by Liu et al. (2010). We found that most query reformulations were by Substitution and Specialization, and that children hardly repeated queries. We categorized children's queries by task facets and examined the way they expressed these facets in their query formulations and reformulations. The oldest children tended to target the general topic of search tasks in their queries most frequently, whereas younger children expressed one of the two facets more often. We assessed children's achieved task outcomes using the search task outcomes measure we developed. Children were mostly more successful on the fact-finding and fully self-generated task and partially successful on the research-oriented task. Query type, reformulation type, achieved task outcomes, and expressing task facets varied by task type and grade level. There was no significant effect of query length in words or of the number of queries issued on search task outcomes. The study findings have implications for human intervention, digital literacy, search task literacy, as well as for system intervention to support children's query formulation and reformulation during interaction with Google.

3.

Multitasking during Web search sessions

Amanda Spink Minsoo Park Bernard J. Jansen Jan Pedersen 《Information processing & management》2006

A user’s single session with a Web search engine or information retrieval (IR) system may consist of seeking information on single or multiple topics and switching between tasks, that is, multitasking information behavior. Most Web search sessions consist of two queries of approximately two words. However, some Web search sessions consist of three or more queries. We present findings from two studies. First, a study of two-query search sessions on the AltaVista Web search engine, and second, a study of three or more query search sessions on the AltaVista Web search engine. We examine the degree of multitasking search and information task switching during these two sets of AltaVista Web search sessions. A sample of two-query and three or more query sessions was filtered from AltaVista transaction logs from 2002 and qualitatively analyzed. Sessions ranged in duration from less than a minute to a few hours. Findings include: (1) 81% of two-query sessions included multiple topics, (2) 91.3% of three or more query sessions included multiple topics, (3) there is a broad variety of topics in multitasking search sessions, and (4) three or more query sessions sometimes contained frequent topic changes. Multitasking is found to be a growing element in Web searching. This paper proposes an approach to interactive information retrieval (IR) contextually within a multitasking framework. The implications of our findings for Web design and further research are discussed.

4.

Automatic new topic identification using multiple linear regression

Seda Ozmutlu 《Information processing & management》2006

The purpose of this study is to provide automatic new topic identification of search engine query logs, and to estimate the effect of statistical characteristics of search engine queries on new topic identification. By applying multiple linear regression and multi-factor ANOVA on a sample data log from the Excite search engine, we demonstrated that the statistical characteristics of Web search queries, such as time interval, search pattern and position of a query in a user session, affect shifts to a new topic. Multiple linear regression is also a successful tool for estimating topic shifts and continuations. The findings of this study provide statistical proof for the relationship between the non-semantic characteristics of Web search queries and the occurrence of topic shifts and continuations.

5.

How are we searching the World Wide Web? A comparison of nine search engine transaction logs

Bernard J. Jansen Amanda Spink 《Information processing & management》2006

The Web and especially major Web search engines are essential tools in the quest to locate online information for many people. This paper reports results from research that examines characteristics and changes in Web searching from nine studies of five Web search engines based in the US and Europe. We compare interactions occurring between users and Web search engines from the perspectives of session length, query length, query complexity, and content viewed among the Web search engines. The results of our research show that (1) users are viewing fewer result pages, (2) searchers on US-based Web search engines use more query operators than searchers on European-based search engines, (3) there are statistically significant differences in the use of Boolean operators and result pages viewed, and (4) one cannot necessarily apply results from studies of one particular Web search engine to another Web search engine. The widespread use of Web search engines, employment of simple queries, and decreased viewing of result pages may have resulted from algorithmic enhancements by Web search engine companies. We discuss the implications of the findings for the development of Web search engines and design of online content.

6.

Investigating the role of in-situ user expectations in Web search

《Information processing & management》2023,60(3):103300

Pre-adoption expectations often serve as an implicit reference point in users’ evaluation of information systems and are closely associated with their goals of interactions, behaviors, and overall satisfaction. Despite the empirically confirmed impacts, users’ search expectations and their connections to tasks, users, search experiences, and behaviors have been scarcely studied in the context of online information search. To address the gap, we collected 116 sessions from 60 participants in a controlled-lab Web search study and gathered direct feedback on their in-situ expected information gains (e.g., number of useful pages) and expected search efforts (e.g., clicks and dwell time) under each query during search sessions. Our study aims to examine (1) how users’ pre-search experience, task characteristics, and in-session experience affect their current expectations and (2) how user expectations are correlated with search behaviors and satisfaction. Our results with both quantitative and qualitative evidence demonstrate that: (1) user expectation is significantly affected by task characteristics, previous and in-situ search experience; (2) user expectation is closely associated with users’ browsing behaviors and search satisfaction. The knowledge learned about user expectation advances our understanding of users’ search behavioral patterns and their evaluations of interaction experience and will also facilitate the design, implementation, and evaluation of expectation-aware user models, metrics, and information retrieval (IR) systems.

7.

Impact of response latency on sponsored search

Xiao Bai B. Barla Cambazoglu 《Information processing & management》2019,56(1):110-129

Recent research in the human-computer interaction and information retrieval areas has revealed that search response latency exhibits a clear impact on user behavior in web search. Such impact is reflected both in users’ subjective perception of the usability of a search engine and in their interaction with the search engine in terms of the number of search results they engage with. However, a similar impact analysis has been missing so far in the context of sponsored search. Since the predominant business model for commercial search engines is advertising via sponsored search results (i.e., search advertisements), understanding how response latency influences the user interaction with the advertisements displayed on the search engine result pages is crucial to increase the revenue of a commercial search engine. To this end, we conduct a large-scale analysis using query logs obtained from a commercial web search. We analyze the short-term and long-term impact of search response latency on the querying and clicking behaviors of users using desktop and mobile devices to access the search engine, as well as the corresponding impact on the revenue of the search engine. This analysis demonstrates the importance of serving sponsored search results with low latency and provides insight into the ad serving policy of commercial search engines to ensure long-term user engagement and search revenue.

8.

A query reformulation scheme for Web search

杨建林 严明 《情报理论与实践》2008,31(1):146-149

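This paper analyzes query reformulation methods and users' information behavior during query reformulation. Drawing on the ideas of Web page crawling and of giving equal weight to retrieval and browsing, and taking into account both users' behavioral characteristics during Web search and the available sources of vocabulary for query reformulation, it proposes a new query reformulation solution for Web search.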

9.

Application of automatic topic identification on Excite Web search engine data logs

《Information processing & management》2005,41(5):1243-1262

The analysis of contextual information in search engine query logs enhances the understanding of Web users’ search patterns. Obtaining contextual information on Web search engine logs is a difficult task, since users submit few queries and search multiple topics. Identification of topic changes within a search session is an important branch of search engine user behavior analysis. The purpose of this study is to investigate the properties of a specific topic identification methodology in detail, and to test its validity. The topic identification algorithm’s performance becomes doubtful in various cases. These cases are explored and the reasons underlying the inconsistent performance of automatic topic identification are investigated with statistical analysis and experimental design techniques.

10.

Research on a topic-specific search engine based on Heritrix and Lucene

贾超 卫文学 《中国科技信息》2012,(10):95-96

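A topic-specific search engine, also called a vertical search engine, is mainly used to meet user needs in a specific domain. Heritrix is an open-source Web crawler, but its WebUI launch mode is not easy for general users to work with. This paper departs from the usual way of using Heritrix: it abandons the WebUI launch mode, modifies the Heritrix source code, and integrates Lucene into Heritrix to build a complete search engine, with listeners monitoring the search engine's state so that it can crawl and update data automatically. In addition, the paper adds a Web page filtering module and improves the ranking algorithm for query results, which increases the usability of the search engine and the accuracy of queries.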

11.

Document replication strategies for geographically distributed web search engines

Enver Kayaaslan B. Barla Cambazoglu Cevdet Aykanat 《Information processing & management》2013

Large-scale web search engines are composed of multiple data centers that are geographically distant from each other. Typically, a user query is processed in a data center that is geographically close to the origin of the query, over a replica of the entire web index. Compared to a centralized, single-center search engine, this architecture offers lower query response times as the network latencies between the users and data centers are reduced. However, it does not scale well with increasing index sizes and query traffic volumes because queries are evaluated on the entire web index, which has to be replicated and maintained in all data centers. As a remedy to this scalability problem, we propose a document replication framework in which documents are selectively replicated on data centers based on regional user interests. Within this framework, we propose three different document replication strategies, each optimizing a different objective: reducing the potential search quality loss, the average query response time, or the total query workload of the search system. For all three strategies, we consider two alternative types of capacity constraints on index sizes of data centers. Moreover, we investigate the performance impact of query forwarding and result caching. We evaluate our strategies via detailed simulations, using a large query log and a document collection obtained from the Yahoo! web search engine.

12.

Information navigation on the web by clustering and summarizing query results

Dmitri G. Roussinov Hsinchun Chen 《Information processing & management》2001,37(6):203

We report our experience with a novel approach to interactive information seeking that is grounded in the idea of summarizing query results through automated document clustering. We went through a complete system development and evaluation cycle: designing the algorithms and interface for our prototype, implementing them and testing with human users. Our prototype acted as an intermediate layer between the user and a commercial Internet search engine (AltaVista), thus allowing searches of a significant portion of the World Wide Web. In our final evaluation, we processed data from 36 users and concluded that our prototype improved search performance over using the same search engine (AltaVista) directly. We also analyzed the effects of various demographic and task-related parameters.

13.

Exploring the immediate and short-term effects of peer advice and cognitive authority on Web search behavior

Jiqun Liu Yiwei Wang Soumik Mandal Chirag Shah 《Information processing & management》2019,56(3):1010-1025

An individual's Web search behavior can be influenced by a number of factors, including features and functions of a search engine as well as search education. In contrast to the long-lasting attention to the algorithm and interface dimensions of search, there is a lack of research concerned with the potential effects of user education on search behavior. To address this gap, we ran a three-session field-lab-combined study to examine the effects of user education from two distinct sources – peer advice and cognitive authority (operationalized as video-based student's advice and expert's advice respectively) – on Web search behavior in two different search task scenarios (i.e., factual specific and factual amorphous tasks). We also tested if these behavioral effects persist for a short period of time when the explicit search tips are removed. Using 185 task session data generated by 31 participants in two field and one lab sessions, this study demonstrates that: (1) both peer advice and cognitive authority are effective in stimulating immediate behavioral changes in Web search; (2) the immediate behavioral impact of search advice is broader in factual amorphous task than in factual specific task; (3) framing search tips as the advice from cognitive authority is more likely to generate continuing, short-term effects on Web search behaviors. This research has implications for the design of task-aware user education as well as the study of users’ interactions with IR systems in general.

14.

The influence of task and gender on search and evaluation behavior using Google

Lori Lorigo Bing Pan Helene Hembrooke Thorsten Joachims Laura Granka Geri Gay 《Information processing & management》2006

To improve search engine effectiveness, we have observed an increased interest in gathering additional feedback about users’ information needs that goes beyond the queries they type in. Adaptive search engines use explicit and implicit feedback indicators to model users or search tasks. In order to create appropriate models, it is essential to understand how users interact with search engines, including the determining factors of their actions. Using eye tracking, we extend this understanding by analyzing the sequences and patterns with which users evaluate the query results returned to them when using Google. We find that the query result abstracts are viewed in the order of their ranking in only about one fifth of the cases, and only an average of about three abstracts per result page are viewed at all. We also compare search behavior variability with respect to different classes of users and different classes of search tasks to reveal whether user models or task models may be greater predictors of behavior. We discover that gender and task significantly influence different kinds of search behaviors discussed here. The results are suggestive of improvements to query-based search interface designs with respect to both their use of space and workflow.

15.

Disambiguated query suggestions and personalized content-similarity and novelty ranking of clustered results to optimize web searches

Gloria Bordogna Alessandro Campi Giuseppe Psaila Stefania Ronchi 《Information processing & management》2012

In this paper, we face the so-called “ranked list problem” of Web searches, which occurs when users submit short requests to search engines. Generally, as a consequence of terms’ ambiguity and polysemy, users engage in long cycles of query reformulation in an attempt to capture relevant information in the top ranked results.

16.

Factors affecting assigned information problem ordering during Web search: An exploratory study

Amanda Spink Minsoo Park Sherry Koshman 《Information processing & management》2006

Multitasking is the human ability to handle the demands of multiple tasks. Multitasking behavior involves the ordering of multiple tasks and switching between tasks. People often multitask when using information retrieval (IR) technologies as they seek information on more than one information problem over single or multiple search episodes. However, limited studies have examined how people order their information problems, especially during their Web search engine interaction. The aim of our exploratory study was to investigate assigned information problem ordering by forty (40) study participants engaged in Web search. Findings suggest that assigned information problem ordering was influenced by factors including personal interest, problem knowledge, perceived level of information available on the Web, ease of finding information, level of importance and seeking information on information problems in order from general to specific. Personal interest and problem knowledge were the major factors during assigned information problem ordering. Implications of the findings and further research are discussed. The relationship between information problem ordering and gratification theory is an important area for further exploration.

17.

Validation and interpretation of Web users’ sessions clusters

George Pallis Lefteris Angelis Athena Vakali 《Information processing & management》2007

Understanding users’ navigation on the Web is important for improving the quality of information and the speed of accessing large-scale Web data sources. Clustering of users’ navigation into sessions has been proposed in order to identify patterns and similarities which are then managed in the context of Web user-oriented applications (searching, e-commerce, etc.). This paper deals with the problem of assessing the quality of user session clusters in order to make inferences regarding the users’ navigation behavior. A common model-based clustering algorithm is used to produce clusters of Web users’ sessions. These clusters are validated by using a statistical test, which measures the distances of the clusters’ distributions to infer their dissimilarity and distinguishing level. Furthermore, a visualization method is proposed in order to interpret the relation between clusters. Using real data sets, we illustrate how the proposed analysis can be applied in popular application scenarios to reveal valuable associations among Web users’ navigation sessions.

18.

New query suggestion framework and algorithms: A case study for an educational search engine

《Information processing & management》2016,52(5):733-752

Query suggestion is generally an integrated part of web search engines. In this study, we first redefine and reduce the query suggestion problem as “comparison of queries”. We then propose a general modular framework for query suggestion algorithm development. We also develop new query suggestion algorithms which are used in our proposed framework, exploiting query, session and user features. As a case study, we use query logs of a real educational search engine that targets K-12 students in Turkey. We also exploit educational features (course, grade) in our query suggestion algorithms. We test our framework and algorithms over a set of queries by an experiment and demonstrate a 66–90% statistically significant increase in relevance of query suggestions compared to a baseline method.

19.

Click data as implicit relevance feedback in web search

Seikyung Jung Jonathan L. Herlocker Janet Webster 《Information processing & management》2007

Search sessions consist of a person presenting a query to a search engine, followed by that person examining the search results, selecting some of those search results for further review, possibly following some series of hyperlinks, and perhaps backtracking to previously viewed pages in the session. The series of pages selected for viewing in a search session, sometimes called the click data, is intuitively a source of relevance feedback information to the search engine. We are interested in how that relevance feedback can be used to improve the search results quality for all users, not just the current user. For example, the search engine could learn which documents are frequently visited when certain search queries are given.

20.

Automated assistance in the formulation of search statements for bibliographic databases

Michael P. Oakes Malcolm J. Taylor 《Information processing & management》1998,34(6):645-668

We report on the design and construction of features of an automated query system which will assist pharmacologists who are not information specialists to access the Derwent Drug File (DDF) pharmacological database. Our approach was to first elucidate those search skills of the search intermediary which might prove tractable to automation. Modules were then produced which assist in the three important subtasks of search statement generation, namely vocabulary selection, the choice of context indicators and query reformulation. Vocabulary selection is facilitated by approximate string matching, morphological analysis, browsing and menu searching. The context of the study, such as treatment or metabolism, is determined using a system of advisory menus. The task of query reformulation is performed using user feedback on retrieved documents, thesaurus relations between document index terms and term postings data. Use is made of diverse information sources, including electronic forms of printed search aids, a thesaurus and a medical dictionary. The system will be of use both to semicasual users and experienced intermediaries. Many of the ideas developed should prove transportable to domains other than pharmacology: the techniques for thesaurus manipulation are designed for use with any hierarchical thesaurus.