
1.

An algorithm to cluster documents based on relevance

《Information processing & management》2005,41(5):1035-1049

2.

Investigating the role of in-situ user expectations in Web search

《Information processing & management》2023,60(3):103300

Pre-adoption expectations often serve as an implicit reference point in users’ evaluation of information systems and are closely associated with their goals of interactions, behaviors, and overall satisfaction. Despite the empirically confirmed impacts, users’ search expectations and their connections to tasks, users, search experiences, and behaviors have been scarcely studied in the context of online information search. To address the gap, we collected 116 sessions from 60 participants in a controlled-lab Web search study and gathered direct feedback on their in-situ expected information gains (e.g., number of useful pages) and expected search efforts (e.g., clicks and dwell time) under each query during search sessions. Our study aims to examine (1) how users’ pre-search experience, task characteristics, and in-session experience affect their current expectations and (2) how user expectations are correlated with search behaviors and satisfaction. Our results with both quantitative and qualitative evidence demonstrate that: (1) user expectation is significantly affected by task characteristics, previous and in-situ search experience; (2) user expectation is closely associated with users’ browsing behaviors and search satisfaction. The knowledge learned about user expectation advances our understanding of users’ search behavioral patterns and their evaluations of interaction experience and will also facilitate the design, implementation, and evaluation of expectation-aware user models, metrics, and information retrieval (IR) systems.

3.

Intelligent scientific authoring tools: Interactive data mining for constructive uses of citation networks

B. Berendt B. Krause S. Kolbe-Nusser 《Information processing & management》2010

Many powerful methods and tools exist for extracting meaning from scientific publications, their texts, and their citation links. However, existing proposals often neglect a fundamental aspect of learning: that understanding and learning require an active and constructive exploration of a domain. In this paper, we describe a new method and a tool that use data mining and interactivity to turn the typical search and retrieve dialogue, in which the user asks questions and a system gives answers, into a dialogue that also involves sense-making, in which the user has to become active by constructing a bibliography and a domain model of the search term(s). This model starts from an automatically generated and annotated clustering solution that is iteratively modified by users. The tool is part of an integrated authoring system covering all phases from search through reading and sense-making to writing. Two evaluation studies demonstrate the usability of this interactive and constructive approach, and they show that clusters and groups represent identifiable sub-topics.

4.

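Construction and visualization of topic maps for online user reviews: A case study of hotel reviews

Xing Yunfei (邢云菲), Cao Gaohui (曹高辉), Tao Ran (陶然) 《情报科学》 (Information Science) 2021,39(9):101-109

[Purpose/Significance] Online user reviews are feedback on users' perceived experience of a product or service provider, and text mining of these reviews is an important part of information analysis. [Method/Process] To mine the information that interests users more effectively from massive volumes of online review text, this study crawls online hotel reviews for four major cities from TripAdvisor, builds a clustering model of online user reviews based on topic map theory and a text clustering algorithm, uses map visualization to reveal differences in the opinions of hotel users across regions, and analyzes the social network characteristics of the different maps. [Results/Conclusions] Hotel users care most about service, followed by the hotel's environment and location. The approach can quickly identify what hotel users pay attention to, which is valuable for helping hotel managers understand guests' accommodation needs and thereby raise satisfaction. [Innovation/Limitations] The paper combines topic maps with text mining to build topic maps of online hotel reviews and shows advantages in topic clustering of large-scale text. However, it analyzes reviews for only some of the hotels in four cities on TripAdvisor, so its data coverage is limited.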

5.

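Improving the category directories of e-commerce websites based on tag clustering

Zhang Hong (张红), Gan Liren (甘利人), Xue Chunxiang (薛春香) 《现代情报》 2012,32(1):3-7

This study addresses the mismatch between how users of e-commerce websites conceive of products and the sites' actual category directories, a mismatch that lowers retrieval efficiency. It proposes a scheme for improving category directories based on user tags: the tags are clustered at multiple levels, the clustering results are presented as a hierarchy, and the tag clusters are mapped to the site's category directory, thereby improving the efficiency of category-based retrieval and the performance of category navigation.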

6.

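Research on a personalized search system based on user interests

Han Na (韩娜), Shen Xiting (沈西挺), Liu Yan (刘岩) 《人天科学研究》 2010,(1)

Commonly used search engines return large, disorganized result sets. To provide personalized information services, this work designs a search system based on user interests that runs on the Web client. The system uses the user's interests to process the submitted search conditions, sends the query to commonly used search engines, and re-ranks the returned results. It also uses feedback to update the user's interests continually so as to meet the user's changing needs. Experiments show that this approach improves precision while maintaining recall, and thus improves search efficiency.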
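The re-ranking and feedback loop described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' system: the bag-of-words profile, the cosine scoring, the additive update rule, and all names (`rerank`, `update_profile`, `rate`) are hypothetical choices made for the example.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()

def cosine(a, b):
    """Cosine similarity between two term-weight mappings."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rerank(results, profile):
    """Sort result snippets by similarity to the user's interest profile."""
    return sorted(
        results,
        key=lambda r: cosine(Counter(tokenize(r)), profile),
        reverse=True,
    )

def update_profile(profile, clicked_text, rate=0.5):
    """Feedback step: add weight for terms of a result the user clicked."""
    for term, count in Counter(tokenize(clicked_text)).items():
        profile[term] += rate * count
    return profile

# Hypothetical interest profile and engine results.
profile = Counter({"python": 2.0, "programming": 1.0})
results = [
    "python snake habitat and diet",
    "python programming tutorial for beginners",
    "monty python film reviews",
]
ranked = rerank(results, profile)
```

Here `ranked` puts the programming tutorial first because it shares the most weighted terms with the profile. Calling `update_profile` on a clicked result shifts later rankings toward that result's vocabulary.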

7.

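A clustering and grouping method for library users based on the Markov model (total citations: 1; self-citations: 0; citations by others: 1)

Wu Yanling (吴艳玲), Sun Siyang (孙思阳) 《情报科学》 (Information Science) 2021,39(11):167-172

[Purpose/Significance] To address the instability and high error rate of clustering library user groups, this paper proposes a Markov-model-based method for clustering and grouping library users that improves grouping accuracy. [Method/Process] A first-order Markov mixture model is used to build a model of user action sequences. The model generates user behavior clusters and reflects the dynamics of user actions. An adaptive natural gradient algorithm adjusts its step size according to the separation state of user behaviors, which addresses automatic model selection during parameter learning and yields the best clustering of library users. [Results/Conclusions] Experiments show that when the actual number of clusters is smaller than the value L, the proposed method achieves automatic model selection during parameter learning. The proposed method produces the largest number of groups and can partition the widest range of values. Its lowest clustering error rate is 0.22%, its clustering performance is fairly stable, and its grouping results are more accurate, meeting the design expectations. [Innovation/Limitations] A first-order Markov mixture model is used to cluster and group library users. Future work will study higher-order Markov component models that account for correlations between user sequences, in order to improve the accuracy and stability of the grouping algorithm.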

8.

Evaluating combinations of ranked lists and visualizations of inter-document similarity

《Information processing & management》2001,37(3):435-458

We are interested in how ideas from document clustering can be used to improve the retrieval accuracy of ranked lists in interactive systems. In particular, we are interested in ways to evaluate the effectiveness of such systems to decide how they might best be constructed. In this study, we construct and evaluate systems that present the user with ranked lists and a visualization of inter-document similarities. We first carry out a user study to evaluate the clustering/ranked list combination on instance-oriented retrieval, the task of the TREC-6 Interactive Track. We find that although users generally prefer the combination, they are not able to use it to improve effectiveness. In the second half of this study, we develop and evaluate an approach that more directly combines the ranked list with information from inter-document similarities. Using the TREC collections and relevance judgments, we show that it is possible to realize substantial improvements in effectiveness by doing so, and that although users can use the combined information effectively, the system can provide hints that substantially improve on the user's solo effort. The resulting approach shares much in common with an interactive application of incremental relevance feedback. Throughout this study, we illustrate our work using two prototype systems constructed for these evaluations. The first, AspInQuery, is a classic information retrieval system augmented with a specialized tool for recording information about instances of relevance. The other system, Lighthouse, is a Web-based application that combines a ranked list with a portrayal of inter-document similarity. Lighthouse can work with collections such as TREC, as well as the results of Web search engines.

9.

Click data as implicit relevance feedback in web search

Seikyung Jung Jonathan L. Herlocker Janet Webster 《Information processing & management》2007

Search sessions consist of a person presenting a query to a search engine, followed by that person examining the search results, selecting some of those search results for further review, possibly following some series of hyperlinks, and perhaps backtracking to previously viewed pages in the session. The series of pages selected for viewing in a search session, sometimes called the click data, is intuitively a source of relevance feedback information to the search engine. We are interested in how that relevance feedback can be used to improve the search results quality for all users, not just the current user. For example, the search engine could learn which documents are frequently visited when certain search queries are given.
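The closing example in this abstract, learning which documents are frequently visited for a given query, can be sketched as a simple click counter. This is an illustrative sketch only, not the authors' system: the `(query, doc_id)` log format, the query normalization, and the names `build_click_index` and `top_clicked` are assumptions.

```python
from collections import Counter, defaultdict

def normalize(query):
    """Treat queries that differ only in case or padding as the same query."""
    return query.strip().lower()

def build_click_index(click_log):
    """Count how often each document was clicked for each query.

    click_log: iterable of (query, doc_id) pairs (hypothetical log format).
    """
    index = defaultdict(Counter)
    for query, doc_id in click_log:
        index[normalize(query)][doc_id] += 1
    return index

def top_clicked(index, query, k=3):
    """Return up to k document ids most often clicked for the query."""
    counts = index.get(normalize(query), Counter())
    return [doc_id for doc_id, _ in counts.most_common(k)]

# Hypothetical click log pooled across several users' sessions.
click_log = [
    ("tide tables", "doc_a"),
    ("tide tables", "doc_b"),
    ("Tide Tables", "doc_a"),
    ("library hours", "doc_c"),
]
index = build_click_index(click_log)
```

Because the counts are pooled over all sessions, `top_clicked(index, "tide tables")` reflects what every user visited for that query, which is the cross-user use of feedback the abstract points to.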

10.

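Clustering Web users with an improved SOM neural network

Xiao Qiang (肖强), Qian Xiaodong (钱晓东), Wu Zhenfeng (武振锋) 《情报科学》 (Information Science) 2012,(6):820-824

Valid visiting users are extracted from the browsing logs of a Web site, and the dissimilarity principle is applied to the extracted users to determine the cluster centers and the number of clusters. These values are then used as the adjustment values for the SOM neural network's weights and as the number of its output nodes, which improves the network's learning ability and its clustering performance.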

11.

How motivational feedback increases user’s benefits and continued use: A study on gamification, quantified-self and social networking

《International Journal of Information Management》2019

With the increasing prevalence of hedonic and social information systems, systems are observed to employ forms of feedback and design other than the purely informational in order to increase user engagement and motivation. Three principal classes of motivational design pursuing user engagement have become increasingly established: gamification, quantified-self and social networking. This study investigates how the perceived prominence of these three design classes in users’ use of an information system facilitates experiences of affective, informational and social feedback, as well as users’ perceived benefits from a system and their continued use intentions. We employ survey data (N = 167) gathered from users of HeiaHeia, an exercise encouragement system that employs features belonging to the three design classes. The results indicate that gamification is positively associated with experiences of affective feedback, quantified-self with experiences of both affective and informational feedback, and social networking with experiences of social feedback. Experiences of affective feedback are further strongly associated with user-perceived benefits and continued use intentions, whereas experiences of informational feedback are only associated with continued use intentions. Experiences of social feedback had no significant relationship with either. The findings provide practical insights into how systems can be designed to facilitate different types of feedback that increase users’ engagement, benefits and intentions to continue using a system.

12.

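Topic mining of user groups in virtual academic communities based on social network analysis and LDA (total citations: 1; self-citations: 0; citations by others: 1)

Li Yuyuan (李玉媛), Xiong Huixiang (熊回香), Yang Mengting (杨梦婷), Ye Jiaxin (叶佳鑫) 《情报科学》 (Information Science) 2021,39(11):110-116

[Purpose/Significance] Taking the perspective of user groups, this study divides community users into groups according to user characteristics in order to understand topic differences between groups, gain a more complete and clearer picture of community topics, and recommend resources to community users more effectively. [Method/Process] Social network analysis and the TOPSIS algorithm are used to divide users into groups. The LDA model is then applied to each user group for topic mining, and spectral clustering is finally used to refine the topics. [Results/Conclusions] In the information science community of ScienceNet (科学网), the topics of core users and ordinary users overlap in part but also differ: the topics of core users are more specific, while those of ordinary users are broader. The user-group topic mining model for virtual academic communities presents the topics that community users follow more completely and supports better resource recommendation. [Innovation/Limitations] The study proposes a topic mining model for user groups in virtual academic communities from the user-group perspective. The data volume, the topic model, and the choice of social network analysis indicators still need to be extended.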

13.

The influence of task and gender on search and evaluation behavior using Google

Lori Lorigo Bing Pan Helene Hembrooke Thorsten Joachims Laura Granka Geri Gay 《Information processing & management》2006

To improve search engine effectiveness, we have observed an increased interest in gathering additional feedback about users’ information needs that goes beyond the queries they type in. Adaptive search engines use explicit and implicit feedback indicators to model users or search tasks. In order to create appropriate models, it is essential to understand how users interact with search engines, including the determining factors of their actions. Using eye tracking, we extend this understanding by analyzing the sequences and patterns with which users evaluate the query results returned to them when using Google. We find that the query result abstracts are viewed in the order of their ranking in only about one fifth of the cases, and only an average of about three abstracts per result page are viewed at all. We also compare search behavior variability with respect to different classes of users and different classes of search tasks to reveal whether user models or task models may be greater predictors of behavior. We discover that gender and task significantly influence different kinds of search behaviors discussed here. The results are suggestive of improvements to query-based search interface designs with respect to both their use of space and workflow.

14.

Asking Clarifying Questions: To benefit or to disturb users in Web search?

《Information processing & management》2023,60(2):103176

Modern information-seeking systems are becoming more interactive, mainly through asking Clarifying Questions (CQs) to refine users’ information needs. System-generated CQs may be of different qualities. However, the impact of asking multiple CQs of different qualities in a search session remains underexplored. Given the multi-turn nature of conversational information-seeking sessions, it is critical to understand and measure the impact of CQs of different qualities when they are posed in various orders. In this paper, we conduct a user study on CQ quality trajectories, i.e., asking CQs of different qualities in chronological order. We aim to investigate to what extent the trajectory of CQs of different qualities affects user search behavior and satisfaction, at both the query level and the session level. Our user study is conducted with 89 participants as search engine users. Participants are asked to complete a set of Web search tasks. We find that the trajectory of CQs does affect the way users interact with Search Engine Result Pages (SERPs), e.g., a preceding high-quality CQ prompts users to interact more deeply with SERPs, while a preceding low-quality CQ prevents such interaction. Our study also demonstrates that asking follow-up high-quality CQs improves the low search performance and user satisfaction caused by earlier low-quality CQs. In addition, only showing high-quality CQs while hiding other CQs receives better gains with less effort. That is, always showing all CQs may be risky and low-quality CQs do disturb users. Based on observations from our user study, we further propose a transformer-based model to predict which CQs to ask, to avoid disturbing users. In short, our study provides insights into the effects of the trajectory of asking CQs, and our results will be helpful in designing more effective and enjoyable search clarification systems.

15.

Investigating the lack of diversity in user behavior: The case of musical content on online platforms

《Information processing & management》2020,57(2):102169

Whether to deal with issues related to information ranking (e.g. search engines) or content recommendation (on social networks, for instance), algorithms are at the core of processes that select which information is made visible. Such algorithmic choices have a strong impact on users’ activity de facto, and therefore on their access to information. This raises the question of how to measure the quality of the choices algorithms make and their impact on users. As a first step in that direction, this paper presents a framework with which to analyze the diversity of information accessed by users in the context of musical content. The approach adopted centers on the representation of user activity through a tripartite graph that maps users to products and products to categories. In turn, conducting random walks in this structure makes it possible to analyze how categories catch users’ attention and how this attention is distributed. Building upon this distribution, we propose a new index referred to as the (calibrated) Herfindahl diversity, which is aimed at quantifying the extent to which this distribution is diverse and representative of existing categories. To the best of our knowledge, this paper is the first to connect the output of random walks on graphs with diversity indexes. We demonstrate the benefit of such an approach by applying our index to two datasets that record user activity on online platforms involving musical content. The results are threefold. First, we show that our index can discriminate between different user behaviors. Second, we shed some light on a saturation phenomenon in the diversity of users’ attention. Finally, we show that the lack of diversity observed in the datasets derives from exogenous factors related to the heterogeneous popularity of music styles, as opposed to internal factors such as recurrent user behaviors.
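For orientation, the standard Herfindahl index that this abstract builds on can be sketched as below. This shows only the textbook, uncalibrated form, in which the index is the sum of squared shares and its inverse is read as an effective number of categories. The paper's calibration step and its random-walk attention distribution are not reproduced, and the function names and sample weights are assumptions.

```python
def herfindahl_index(weights):
    """Herfindahl-Hirschman index: sum of squared shares, in (0, 1]."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum((w / total) ** 2 for w in weights)

def herfindahl_diversity(weights):
    """Inverse of the index: the 'effective number' of categories."""
    return 1.0 / herfindahl_index(weights)

# Hypothetical attention one user spends on four music categories.
uniform = [1, 1, 1, 1]   # attention spread evenly
skewed = [97, 1, 1, 1]   # attention concentrated on one category
```

With evenly spread attention the diversity equals the number of categories (4 here). As attention concentrates on a single category, the value falls toward 1.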

16.

Detecting shilling groups in online recommender systems based on graph convolutional network

《Information processing & management》2022,59(5):103031

Online recommender systems have been shown to be vulnerable to group shilling attacks in which attackers of a shilling group collaboratively inject fake profiles with the aim of increasing or decreasing the frequency that particular items are recommended. Existing detection methods mainly use the frequent itemset (dense subgraph) mining or clustering method to generate candidate groups and then utilize the hand-crafted features to identify shilling groups. However, such two-stage detection methods have two limitations. On the one hand, due to the sensitivity of support threshold or clustering parameters setting, it is difficult to guarantee the quality of candidate groups generated. On the other hand, they all rely on manual feature engineering to extract detection features, which is costly and time-consuming. To address these two limitations, we present a shilling group detection method based on graph convolutional network. First, we model the given dataset as a graph by treating users as nodes and co-rating relations between users as edges. By assigning edge weights and filtering normal user relations, we obtain the suspicious user relation graph. Second, we use principal component analysis to refine the rating features of users and obtain the user feature matrix. Third, we design a three-layer graph convolutional network model with a neighbor filtering mechanism and perform user classification by combining both structure and rating features of users. Finally, we detect shilling groups through identifying target items rated by the attackers according to the user classification results. Extensive experiments show that the classification accuracy and detection performance (F1-measure) of the proposed method can reach 98.92% and 99.92% on the Netflix dataset and 93.18% and 92.41% on the Amazon dataset.

17.

Credibility assessment of good abandonment results in mobile search

《Information processing & management》2020,57(6):102350

Search engine optimization allows for users’ needs to be directly met by result snippets or a “knowledge map” without clicking any results. This behavior is called “good abandonment” and is found to frequently occur during mobile searching. Users exhibit such a behavior when they trust the result that addresses their information need without bothering to click it. Therefore, this study examines how users judge a result's credibility without clicking. This study proposes a model for assessing the credibility of good abandonment results, making a hypothesis about the measures that may affect credibility assessments in mobile searches. A credibility assessment experiment was conducted to collect users’ eye movements, perceived credibility and feedback on different credibility measures. Users’ search behaviors were recorded by a screen recorder, in order to see whether a search was good abandonment. Then the initially proposed model was validated in terms of users’ perceived credibility, search behaviors and feedback, and further improved. The revised model found that the credibility assessment of good abandonment results in mobile searching is determined by six credibility measures distributed across three aspects of content, operator and design. Content-related measures show that users tend to believe the results if there is detailed and updated context information and the content is neutral. Operator-related measures indicate the impact of trust in the search engine on the credibility assessment. Design-related measures indicate that users tend to trust results with interactive functions and optimal layouts. How each of the six measures influences users’ assessment of credibility is discussed in this paper.

18.

Implicit information need as explicit problems, help, and behavioral signals

《Information processing & management》2020,57(2):102069

Information need is one of the most fundamental aspects of information seeking, and is traditionally conceptualized as the initiation phase of an individual’s information seeking behavior. However, the very elusive and inexpressible nature of information need makes it hard to elicit from the information seeker or to extract through an automated process. One approach to understanding how a person realizes and expresses information need is to observe their seeking behaviors, to engage processes with information retrieval systems, and to focus on situated performative actions. Using Dervin’s Sense-Making theory and conceptualization of information need based on existing studies, the work reported here tries to understand and explore the concept of information need from a fresh methodological perspective by examining users’ perceived barriers and desired help in different stages of information search episodes through the analyses of various implicit and explicit user search behaviors. In a controlled lab study, each participant performed three simulated online information search tasks. Participants’ implicit behaviors were collected through search logs, and explicit feedback was elicited through pre-task and post-task questionnaires. A total of 208 query segments were logged, along with users’ annotations on perceived problems and help. Data collected from the study was analyzed by applying both quantitative and qualitative methods. The findings identified several behaviors (such as the number of bookmarks, query length, number of unique queries, and time spent on search results observed in the previous segment, the current segment, and throughout the session) that were strongly associated with participants’ perceived barriers and help needed. The findings also showed that it is possible to build accurate predictive models to infer perceived problems of articulation of queries, useless and irrelevant information, and unavailability of information from users’ previous segment, current segment, and whole session behaviors. The findings also demonstrated that by combining perceived problem(s) and search behavioral features, it was possible to infer users’ needed help(s) in search with a certain level of accuracy (78%).

19.

A knowledge-based semantic framework for query expansion

Jamal Abdul Nasir Iraklis Varlamis Samreen Ishfaq 《Information processing & management》2019,56(5):1605-1617

Searching a large document collection for relevant material that satisfies a user’s information need is a critical activity for web search engines. Query expansion (QE) techniques are widely used by search engines to disambiguate the user’s information need and to improve information retrieval (IR) performance. Knowledge-based, corpus-based and relevance feedback techniques are the main QE approaches. They expand the user query with synonyms of the search terms (word synonymy) in order to bring in more relevant documents, and they filter out documents that contain the search terms but with a different meaning than the user intended (the word polysemy problem). This work surveys existing query expansion techniques, highlights their strengths and limitations, and introduces a new method that combines the power of knowledge-based or corpus-based techniques with that of relevance feedback. Experimental evaluation on three information retrieval benchmark datasets shows that applying knowledge-based or corpus-based query expansion techniques to the results of the relevance feedback step improves information retrieval performance, with knowledge-based techniques providing significantly better results than their simple relevance feedback alternatives in all sets.

20.

A heuristic hierarchical scheme for academic search and retrieval

Emmanouil Amolochitis Ioannis T. Christou Zheng-Hua Tan Ramjee Prasad 《Information processing & management》2013

We present PubSearch, a hybrid heuristic scheme for re-ranking academic papers retrieved from standard digital libraries such as the ACM Portal. The scheme is based on the hierarchical combination of a custom implementation of the term frequency heuristic, a time-depreciated citation score and a graph-theoretic computed score that relates the paper’s index terms with each other. We designed and developed a meta-search engine that submits user queries to standard digital repositories of academic publications and re-ranks the repository results using the hierarchical heuristic scheme. We evaluate our proposed re-ranking scheme via user feedback against the results of ACM Portal on a total of 58 different user queries specified by 15 different users. The results show that our proposed scheme significantly outperforms ACM Portal in terms of retrieval precision as measured by the most common metrics in Information Retrieval, including Normalized Discounted Cumulative Gain (NDCG) and Expected Reciprocal Rank (ERR), as well as a newly introduced lexicographic rule (LEX) for ranking search results. In particular, PubSearch outperforms ACM Portal by more than 77% in terms of ERR, by more than 11% in terms of NDCG, and by more than 907.5% in terms of LEX. We also re-rank the top-10 results of a subset of the original 58 user queries produced by Google Scholar, Microsoft Academic Search, and ArnetMiner; the results show that PubSearch compares very well against these search engines as well. The proposed scheme can be easily plugged into any existing search engine for retrieval of academic publications.
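The two standard metrics named in this abstract, NDCG and ERR, can be computed as in the sketch below. It uses the common textbook definitions, with exponential gain for DCG and the cascade model for ERR. The exact variants and cutoffs used in the paper are not stated in the abstract, so the gain function, the normalization over the retrieved list only, and the sample grades are assumptions. The LEX rule is not sketched because the abstract does not define it.

```python
import math

def dcg(gains):
    """Discounted cumulative gain with exponential gain (2^g - 1)."""
    return sum(
        (2 ** g - 1) / math.log2(rank + 1)
        for rank, g in enumerate(gains, start=1)
    )

def ndcg(gains):
    """DCG divided by the DCG of the same grades in ideal (sorted) order."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0

def err(gains, max_grade):
    """Expected reciprocal rank under the cascade user model."""
    score = 0.0
    p_continue = 1.0  # probability the user reaches the current rank
    for rank, g in enumerate(gains, start=1):
        p_satisfied = (2 ** g - 1) / (2 ** max_grade)
        score += p_continue * p_satisfied / rank
        p_continue *= 1.0 - p_satisfied
    return score

# Hypothetical graded judgments (0 = irrelevant, 3 = highly relevant),
# listed in the order the results were shown.
good_order = [3, 2, 1, 0]
bad_order = [0, 1, 2, 3]
```

Both metrics reward placing highly relevant results early: `ndcg(good_order)` is 1.0 while `ndcg(bad_order)` is lower, and the same ordering effect holds for `err`.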