Similar literature
 20 similar articles found
1.
The research examines the notion that the principles underlying the procedure used by doctors to diagnose a patient's disease are useful in the design of “intelligent” IR systems because the task of the doctor is conceptually similar to the computer (or human) intermediary's task in “intelligent information retrieval”: to draw out, through interaction with the IR system, the user's query/information need. The research is reported in two parts. In Part II, an information retrieval tool is described which is based on “intelligent information retrieval” assumptions about the information user. In Part I, presented here, the theoretical framework for the tool is set out. This framework is borrowed from the diagnostic procedure currently used in medicine, called “differential diagnosis”. Because of the severe consequences that attend misdiagnosis, the operating principle in differential diagnosis is (1) to expand the uncertainty in the diagnosis situation so that all possible hypotheses and evidence are considered, then (2) to contract the uncertainty in a step-by-step fashion (from an examination of the patient's symptoms, through the patient's history and a physical (signs), to laboratory tests). The IR theories of Taylor, Kuhlthau and Belkin are used to demonstrate that these medical diagnosis procedures are already present in IR and that differential diagnosis is a viable model with which to design “intelligent” IR tools and systems.

2.
Noetica is a tool for structuring knowledge about concepts and the relationships between them. It differs from typical information systems in that the knowledge it represents is abstract, highly connected and includes meta-knowledge (knowledge about knowledge). Noetica represents knowledge using a strongly-typed semantic network. By providing a rich type system it is possible to represent conceptual information using formalised structures. A class hierarchy provides a basic classification for all objects. This allows for a consistency of representation that is not often found in “free” semantic networks and gives the ability to easily extend a knowledge model while retaining its semantics. We also provide visualisation and query tools for this data model. Visualisation can be used to explore complete sets of link-classes, show paths while navigating through the database, or visualise the results of queries. Noetica supports goal-directed queries (a series of user-supplied goals that the system attempts to satisfy in sequence) and path-finding queries (where the system finds relationships between objects in the database by following links).

3.
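Research on a Semantic Web-Based Intelligent Web Navigation System   Total citations: 1 (self-citations: 0, citations by others: 1)
Gao Xuexia (高雪霞), Tian Wenqiang (田文强). 《科技通报》 (Bulletin of Science and Technology), 2012, 28(2): 126-127, 133
To address the problem that intelligent web navigation cannot guide users to their destination quickly and accurately according to their real needs, this paper proposes an intelligent web navigation system based on the Semantic Web. By building a semantic model of web information and a semantic model of user needs, the system constructs a navigation semantic network between web information and users. The specific needs that users describe in text are accurately interpreted and fed into the navigation semantic network, where the navigation need is fully understood, so that users' information searches are guided accurately.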

4.
Traditional information retrieval techniques that primarily rely on keyword-based linking of the query and document spaces face challenges such as the vocabulary mismatch problem, where documents relevant to a given query might not be retrieved simply because different terminology is used to describe the same concepts. Semantic search techniques aim to address such limitations of keyword-based retrieval models by incorporating semantic information from standard knowledge bases such as Freebase and DBpedia. The literature has already shown that while the sole consideration of semantic information might not lead to improved retrieval performance over keyword-based search, its consideration enables the retrieval of a set of relevant documents that cannot be retrieved by keyword-based methods. As such, building indices that store and provide access to semantic information during the retrieval process is important. While the process for building and querying keyword-based indices is quite well understood, the incorporation of semantic information within search indices is still an open challenge. Existing work has proposed to build one unified index encompassing both textual and semantic information, or to build separate yet integrated indices for each information type, but these approaches face limitations such as increased query processing time. In this paper, we propose to use neural embeddings-based representations of terms, semantic entities, semantic types and documents within the same embedding space to facilitate the development of a unified search index that would consist of these four information types. We perform experiments on standard and widely used document collections including Clueweb09-B and Robust04 to evaluate our proposed indexing strategy from both effectiveness and efficiency perspectives.
Based on our experiments, we find that when neural embeddings are used to build inverted indices, thereby relaxing the requirement to explicitly observe the posting list key in the indexed document: (a) retrieval efficiency increases compared to a standard inverted index, reducing both the index size and query processing time, and (b) while retrieval efficiency, which is the main objective of an efficient indexing mechanism, improves with our proposed method, retrieval effectiveness remains competitive with the baseline in terms of retrieving a reasonable number of relevant documents from the indexed corpus.

5.
The system presented in this article aims to improve information access through the use of semantic annotation utilizing a non-traditional approach. Instead of applying semantic annotations to enhance the internal information access mechanisms, we use them to empower the user of an information access system through an innovative named entity-based user interface, NameSieve. NameSieve was built to support an intelligence analyst during the process of exploratory search, an advanced type of search requiring multiple iterations of retrieval interleaved with browsing and analyzing the retrieved information. The proposed approach was implemented in the NameSieve system so that the system can transparently present a summary of search results in the form of entity “clouds.” These clouds allow the analyst to further explore the results in a novel manner, acting together as a faceted browsing interface. We ran a user study (with ten subjects) to examine the effect of NameSieve, and the study results reported in the paper demonstrate that this new way of applying semantic annotation information was actively used and was evaluated positively by the subjects. It enabled the subjects to work more productively and bring back the most relevant documents.

6.
One difficult problem in information retrieval (IR) is the proper interpretation of user queries. It is extremely hard for users to express their information needs in a specific yet exhaustive way. In an effort to alleviate this problem, two theoretical models have been proposed to utilize user characteristics maintained in the form of a user profile. Although the idea of integrating user profiles into an IR system is intuitively appealing, and the models seem viable, no research to date has established a foundation for the roles of user profiles in such a system. Aiming at the investigation of the roles of user profiles, therefore, this study first identifies and extends various query/profile interaction models to provide a ground upon which the investigation can be undertaken. From a continuum of models characterized on the basis of interaction types, metrics, and parameters, nearly 400 models are chosen to investigate the “model space.” New measures are developed based on the notion of user satisfaction/frustration. In addition, three different criteria are used to guide users in making judgments on the quality of retrieved items. Analysis of the data obtained from the experiments shows that, for a wide variety of criteria and metrics, there are always some query/profile interaction models that outperform the query alone model. In addition, preferable characteristics for different criteria are identified in terms of interaction types, parameters, and metrics.

7.
Using genetic algorithms to evolve a population of topical queries   Total citations: 1 (self-citations: 1, citations by others: 0)
Systems for searching the Web based on thematic contexts can be built on top of a conventional search engine and benefit from the huge amount of content as well as from the functionality available through the search engine interface. The quality of the material collected by such systems is highly dependent on the vocabulary used to generate the search queries. In this scenario, selecting good query terms can be seen as an optimization problem where the objective function to be optimized is based on the effectiveness of a query to retrieve relevant material. Some characteristics of this optimization problem are: (1) the high-dimensionality of the search space, where candidate solutions are queries and each term corresponds to a different dimension, (2) the existence of acceptable suboptimal solutions, (3) the possibility of finding multiple solutions, and in many cases (4) the quest for novelty. This article describes optimization techniques based on Genetic Algorithms to evolve “good query terms” in the context of a given topic. The proposed techniques place emphasis on searching for novel material that is related to the search context. We discuss the use of a mutation pool to allow the generation of queries with new terms, study the effect of different mutation rates on the exploration of query-space, and discuss the use of a specially developed fitness function that favors the construction of queries containing novel but related terms.
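
The evolutionary loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the function name `evolve`, the uniform crossover, the elitism step, and the toy fitness function are all assumptions made here. In the article, fitness is based on retrieval effectiveness and novelty, which this sketch replaces with a simple overlap count against a hypothetical set of related terms.

```python
import random
from typing import Callable, FrozenSet, List

Query = FrozenSet[str]  # a query is modeled as a set of terms (assumption)


def evolve(population: List[Query],
           mutation_pool: List[str],
           fitness: Callable[[Query], float],
           generations: int = 20,
           mutation_rate: float = 0.2,
           seed: int = 0) -> List[Query]:
    """Evolve a population of queries; returns it sorted best-first."""
    if len(population) < 2:
        raise ValueError("need at least two queries to recombine")
    rng = random.Random(seed)
    size = len(population)
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: max(2, size // 2)]  # keep the fitter half as parents
        next_gen = [ranked[0]]                 # elitism: best query survives
        while len(next_gen) < size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each parental term is inherited with p = 0.5.
            child = {t for t in sorted(a | b) if rng.random() < 0.5}
            # Mutation: occasionally inject a new term from the mutation pool.
            if rng.random() < mutation_rate:
                child.add(rng.choice(mutation_pool))
            if not child:                      # never emit an empty query
                child = set(a)
            next_gen.append(frozenset(child))
        population = next_gen
    return sorted(population, key=fitness, reverse=True)


# Toy stand-in for the fitness function: count of "related" terms in the query.
related = {"retrieval", "ranking", "index", "query"}
toy_fitness = lambda q: len(q & related)

start = [frozenset({"search", "web"}), frozenset({"query", "page"}),
         frozenset({"index", "site"}), frozenset({"engine", "ranking"})]
pool = ["retrieval", "crawler", "ranking"]
result = evolve(start, pool, toy_fitness)
```

The mutation pool is the only source of terms absent from the initial population, which is why the mutation rate controls how much of the query space is explored.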

8.
An information retrieval system is modeled from the point of view of a user linearly scanning the output list for relevant records of citations. Expected search length, a measure of retrieval system performance, is shown to be affected by the stopping rule employed by the user to determine when to terminate the search. Three stopping rules are considered: the satiation rule, the disgust rule, and the combination rule. The effects of these various stopping rules on expected search length are examined and discussed in detail.
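
A small sketch may help make the three stopping rules concrete. The abstract does not define them, so the definitions here are assumptions: satiation stops after a wanted number of relevant records, disgust stops after a tolerated number of non-relevant records, and combination stops at whichever comes first. The function `search_length` and its parameters are hypothetical, and it measures one concrete output list rather than an expectation over many.

```python
from typing import Optional, Sequence


def search_length(ranked: Sequence[bool],
                  want: Optional[int] = None,
                  tolerate: Optional[int] = None) -> int:
    """Number of records a user scans before stopping.

    ranked[i] is True when the i-th record of the output list is relevant.
    want:     satiation threshold (stop after this many relevant records).
    tolerate: disgust threshold (stop after this many non-relevant records).
    Passing both thresholds gives the combination rule.
    """
    relevant = nonrelevant = 0
    for scanned, is_relevant in enumerate(ranked, start=1):
        if is_relevant:
            relevant += 1
        else:
            nonrelevant += 1
        if want is not None and relevant >= want:
            return scanned
        if tolerate is not None and nonrelevant >= tolerate:
            return scanned
    return len(ranked)  # the user reached the end of the list


output_list = [False, True, False, False, True, True]
satiation = search_length(output_list, want=2)                 # 5
disgust = search_length(output_list, tolerate=3)               # 4
combination = search_length(output_list, want=2, tolerate=3)   # 4
```

Averaging `search_length` over many output lists would approximate an expected search length under each rule.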

9.
Web searchers commonly have difficulties crafting queries to fulfill their information needs; even after they are able to craft a query, they often find it challenging to evaluate the results of their Web searches. Sources of these problems include the lack of support for constructing and refining queries, and the static nature of the list-based representations of Web search results. WordBars has been developed to assist users in their Web search and exploration tasks. This system provides a visual representation of the frequencies of the terms found in the first 100 document surrogates returned from an initial query, in the form of a histogram. Exploration of the search results is supported through term selection in the histogram, resulting in a re-sorting of the search results based on the use of the selected terms in the document surrogates. Terms from the histogram can be easily added to or removed from the query, generating a new set of search results. Examples illustrate how WordBars can provide valuable support for query refinement and search results exploration, both when vague and specific initial queries are provided. User evaluations with both expert and intermediate Web searchers illustrate the benefits of the interactive exploration features of WordBars in terms of effectiveness as well as subjective measures. Although differences were found in the demographics of these two user groups, both were able to benefit from the features of WordBars.
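
The two mechanisms named in this abstract, counting term frequencies over the first 100 document surrogates and re-sorting results by the selected terms, can be sketched as below. This is an illustrative simplification rather than the WordBars implementation: the function names, the tokenizer, and the scoring by raw occurrence count are assumptions, and no stop-word removal or stemming is applied.

```python
import re
from collections import Counter
from typing import Iterable, List, Sequence


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_histogram(surrogates: Sequence[str], top_n: int = 100) -> Counter:
    """Term frequencies over the first top_n document surrogates."""
    counts: Counter = Counter()
    for surrogate in surrogates[:top_n]:
        counts.update(tokenize(surrogate))
    return counts


def resort(surrogates: Sequence[str], selected_terms: Iterable[str]) -> List[str]:
    """Re-sort results by how often the selected terms occur in each surrogate.

    sorted() is stable, so results with equal scores keep their original rank.
    """
    selected = {term.lower() for term in selected_terms}

    def score(surrogate: str) -> int:
        return sum(1 for token in tokenize(surrogate) if token in selected)

    return sorted(surrogates, key=score, reverse=True)


results = [
    "Jaguar cars for sale",
    "Jaguar habitat and diet",
    "Big cats: the jaguar in the wild habitat",
]
histogram = term_histogram(results)
reordered = resort(results, ["habitat", "wild"])
```

Here the ambiguous query term "jaguar" appears in every surrogate, and selecting "habitat" and "wild" moves the wildlife results ahead of the car listing.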

10.
FACTS is an APL-based interactive on-line system used for retrieval of budget and accounting data. The system provides selective retrieval and manipulation of financial data for management in a development laboratory. The terms “teilnehmer” and “teilhaber” are defined and it is argued that use of a teilnehmer system, such as APL, can considerably reduce the programming and monetary investment for information science systems applications. A brief discussion of APL's text editing facilities is also included to introduce this relatively unknown language to information scientists.

11.
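Commonly used search engines return result sets that are very large and poorly organized. To address this, a search system based on user interests was designed on the Web client side to provide personalized information services. The system uses the user's interests to process the search conditions the user submits, runs the query through a commonly used search engine, and re-ranks the returned results. It also continually updates the user's interests from feedback in order to meet the user's changing needs. Experiments show that this approach improves precision while maintaining recall, and thereby improves search efficiency.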

12.
Current Web-based search engines presume a category search for a specific group of users. This approach is appropriate for generalized information searches since it is based on statistically generated user profiles. However, in some applications, such as medicine and law, an individualized search for a specific user at a given point in time is desired. In addition, the use of specialized terminology in some fields necessitates guidance for the non-expert to be successful in locating the desired information. This paper presents a new decision support system enabled by the analytic hierarchy process and intelligent software agents that can be used by researchers and practitioners in technical fields to aid information retrieval and improve search results from a controlled vocabulary. An application from telemedicine is given to illustrate the potential improvements.

13.
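Li Jianghua (李江华), Shi Peng (时鹏). 《情报杂志》 (Journal of Intelligence), 2012, 31(4): 112-116
The Internet has become the richest data source in the world, with data types that are varied and change dynamically; how to retrieve the information users need from it quickly and accurately is a pressing problem. Traditional search engines search on a syntactic basis and lack semantic information, so they struggle to express accurately either the user's query need or the semantics of the documents being retrieved, which leads to low precision and recall and a limited search scope. This paper reviews existing semantic retrieval methods and analyzes their problems. On that basis it proposes a domain-based semantic search engine model that draws on Semantic Web technology: it uses a domain ontology metadata model to normalize user queries semantically, and it extracts knowledge from documents according to the domain ontology schema and converts it to RDF. The model thereby expresses accurately both the semantics of the user's query and the semantics of the documents being queried, which can greatly improve retrieval accuracy and efficiency. The architecture, basic functions, and working principles of the model are described in detail.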

14.
We report on the design and construction of features of an automated query system which will assist pharmacologists who are not information specialists to access the Derwent Drug File (DDF) pharmacological database. Our approach was to first elucidate those search skills of the search intermediary which might prove tractable to automation. Modules were then produced which assist in the three important subtasks of search statement generation, namely vocabulary selection, the choice of context indicators and query reformulation. Vocabulary selection is facilitated by approximate string matching, morphological analysis, browsing and menu searching. The context of the study, such as treatment or metabolism, is determined using a system of advisory menus. The task of query reformulation is performed using user feedback on retrieved documents, thesaurus relations between document index terms and term postings data. Use is made of diverse information sources, including electronic forms of printed search aids, a thesaurus and a medical dictionary. The system will be of use both to semicasual users and experienced intermediaries. Many of the ideas developed should prove transportable to domains other than pharmacology: the techniques for thesaurus manipulation are designed for use with any hierarchical thesaurus.
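
Of the vocabulary selection aids listed in this abstract, approximate string matching is the easiest to illustrate. The sketch below uses Python's standard `difflib.get_close_matches` as a stand-in; the abstract does not say which matching technique the system uses, and the function `suggest_terms` and the sample vocabulary are hypothetical.

```python
import difflib
from typing import List, Sequence


def suggest_terms(user_term: str,
                  vocabulary: Sequence[str],
                  limit: int = 3,
                  cutoff: float = 0.6) -> List[str]:
    """Suggest controlled-vocabulary terms that resemble what the user typed."""
    # Compare in lowercase, but return terms in their original spelling.
    lowered = {term.lower(): term for term in vocabulary}
    matches = difflib.get_close_matches(
        user_term.lower(), list(lowered), n=limit, cutoff=cutoff
    )
    return [lowered[match] for match in matches]


# Hypothetical index terms, not taken from the Derwent Drug File.
vocabulary = ["Metabolism", "Metastasis", "Methotrexate", "Treatment"]
best = suggest_terms("metabolsim", vocabulary, limit=1)
```

A misspelled entry such as "metabolsim" is mapped to the nearest index term, which is the kind of help a non-specialist searcher needs when the controlled vocabulary is unfamiliar.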

15.
A large body of research work has proposed verification techniques for rumors spreading in social media that mainly relied on subjective evidence, e.g., propagation networks or user interactions. Alternatively, in this work, we introduce the task of authority finding in social media, in which we aim to find authorities, for given rumors spreading specifically in Twitter, who can help verify them by providing exclusive/convincing evidence that supports or denies those rumors. We release the first test collection for Authority FINding in Arabic Twitter (AuFIN). The collection comprises 150 rumors (expressed in tweets) associated with a total of 1,044 authority accounts and a user collection of 395,231 Twitter accounts (members of 1,192,284 unique Twitter lists). Moreover, we propose a hybrid model that employs pre-trained language models and combines lexical, semantic, and network signals to find authorities. Our experiments show that the textual representation of users is insufficient, and that incorporating the Twitter network features improves the recall of authorities by 34%. Moreover, semantic ranking is inferior to the lexical and network-based ranking in terms of precision, but superior in terms of recall. Therefore, combining both the semantic and network-based ranking achieved the best overall performance, with a precision of 0.413 at depth 1 and 0.213 at depth 5. We show that rumor expansion by exploiting Knowledge Bases improves the recall of authorities by up to 15%. Furthermore, we find that SOTA models for topic expert finding perform poorly on finding authorities. Finally, drawing upon our experiments, we discuss failure factors and make recommendations for future research directions in addressing this task.

16.
This paper is concerned with some aspects of database interfaces for casual, naive users. A “casual user” is defined as an individual who wishes to execute queries once or twice a month, and a “naive user” is someone who has little or no expertise in operating computers. The study focuses on a specific group of casual, naive users, analyzes their needs and proposes a solution. The proposed interface consists of a graphical display of a model of a database and a natural language query language. One of the unique properties of the database interface is that it allows the user to see local item names within the context of a global structure. The interface was then tested to determine whether it was acceptable to the user population and to discover the level of graphical model that the users would find most comfortable.

17.
The use of geometrical factors to locate information centers for a spatially distributed user population will be shown. The total amount of information for the community of users is considered to be predetermined. A proportion of that information is to be allocated to each information center created. An optimal user versus distance and contents of the center compromise will be obtained using standard mathematical programming techniques. An interesting theoretical situation results for those cases where the “satisfaction benefit” due to quantity of information increases more slowly than the quantity of information. For such cases, the optimal decentralization (or pluralization) is no decentralization at all: a single location results. A case study locating the Mathematics information of a university concludes the work.

18.
Citing statements can be used to aid retrieval, to increase the efficiency of citation indexes and for the study of information flow and use. These uses are only feasible on a large scale if computers can identify citing statements within the texts of documents with reasonable accuracy. Computer recognition of multi-sentence citing statements is not easy. Procedures developed for chemistry papers in an earlier experiment were tested on biomedical papers (dealing with various aspects of cancer) and were almost as successful. Specifically, (1) 78% of the words in computer-recognized citing statements were correctly attributable to the corresponding cited papers; and (2) the computer procedures missed 4% of the words in the actual citing statements. When the procedures were modified on the basis of those results and tested on a new sample of cancer papers the results were comparable: 72 and 3% respectively. In an earlier experiment in use of full-text searching to retrieve answer-passages from cancer papers, recall in the “test phase” averaged about 70% and the false retrieval rate was thirteen falsely retrieved sentences per answer-paper retrieved. Unretrieved answer-papers in that experiment's “development phase”, and citing statements referring to them, were studied to develop computer procedures for using citing statements to increase recall. The procedures developed only produced slight recall increases for development phase answer-papers, and similarly for the test phase papers on which they were then tested. Specifically, the test phase results were the following: recall was increased from 70 to 74%, and there was no increase in false retrieval. This contrasts with an earlier experiment in which 50% recall of chemistry papers by search of index terms and abstract words was increased to 70% by the addition of words from citing statements.
The difference may be because the average number of citing papers per unretrieved cancer paper was only six while that for chemistry papers was thirteen.

19.
Co-authorship among scientists represents a prototype of a social network. By mapping the graph containing all relevant publications of members in an international collaboration network: COLLNET, we infer the structural mechanisms that govern the topology of this social system. The structure of the network affects the information available to individuals, and their opportunities to collaborate. The structure of the network also affects the overall flow of information, and the nature of the scientific community. We present a number of measures of both the macro- (whole-network) and micro- (actor-centered) structure of collaboration, and apply these to COLLNET. We find that this scientific community displays many aspects of a “small-world,” and is somewhat vulnerable to disruption should major figures become inactive. We also find inequality in the roles played by individuals in the network. The inequalities, however, do not create a closed and isolated “core” or elite.

20.
An ordering system for a global information network is necessary in order to enable the user to retrieve the particular information he is looking for. Classification has been one of the methods of ordering. The principle of traditional classification has been based on the idea of partitioning the universe of knowledge into mutually exclusive classes, i.e. subjects. A particular topic is defined by narrower classification within a class following the principle of the ‘genus-species’ relationship. Ranganathan's system of faceted classification has only replaced the classification of terms into subjects and sub-subjects by classification of terms into five ambiguous categories. Taube's system of coordinate indexing gives full freedom to the user to combine any number of terms of his choice. To be effective for social sciences such a system has to overcome some difficult problems of semantics. The system MANIS described here maintains the traditional classification and yet allows the user to combine terms of his choice, where the choice is restricted to the terms belonging to the system of traditional classification.
