Similar Literature
 Found 20 similar documents; search took 192 ms
1.
Given a user question, the goal of a Question Answering (QA) system is to retrieve answers rather than full documents or even best-matching passages, as most Information Retrieval systems currently do. In this paper, we present BRUJA, a QA system for the management of multilingual collections. BRUJA works with three languages (English, Spanish and French). The BRUJA architecture is not built from three monolingual QA systems; instead it uses English as an interlingua to perform the usual QA tasks such as question classification and answer extraction. In addition, BRUJA uses Cross Language Information Retrieval (CLIR) techniques to retrieve relevant documents from a multilingual collection. On the one hand, we have more documents to find answers in; on the other hand, we introduce noise into the system through translation into the interlingua (English) and through the CLIR module. The question is whether the difficulty of managing three languages is worth it or whether a monolingual QA system delivers better results. We report on in-depth experimentation and demonstrate that our multilingual QA system gets better results than its monolingual counterpart whenever it uses good translation resources and, especially, state-of-the-art CLIR techniques.

2.
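A 2-Tuple Linguistic Information Retrieval Model   Total citations: 1 (self-citations: 0, citations by others: 1)
This paper proposes an information retrieval model based on 2-tuple linguistic representation. The model has three parts: document representation, query representation, and document-query matching. Compared with traditional information retrieval models based on exact matching of query keywords, it better accommodates the flexibility that users' query requirements call for.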

3.
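Current Status and Development Trends of Retrieval Systems for Abstracting and Indexing Databases   Total citations: 10 (self-citations: 0, citations by others: 10)
林佳, 杨毅. 《图书情报工作》, 2003, 47(10): 68-73
Taking the ISI Web of Knowledge information source as an example, this paper describes the current status and development trends of retrieval systems for abstracting and indexing databases from ten aspects.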

4.
The application of visualization techniques to information retrieval (IR) has resulted in the development of innovative systems and interfaces that are now available for public use. Visualization tools have emerged in research environments and more recently on the Web to retrieve information. Questions arise in regard to the utility of Web-based IR visualization tools for assisting users not only in manipulating search output, but also in managing the information retrieval process. To understand how Web-based visualization tools enable visual information retrieval, this article reviews some of the human perceptual theory behind the graphical interface of information visualization systems, analyzes iconic representations and information density on visualization displays, and examines information retrieval tasks that have been used in visualization system user research. This article is timely since it addresses new technologies for Web information retrieval and discusses future information visualization user research directions.

5.
BioSYNTHESIS is a prototype intelligent retrieval system under development as part of the IAIMS project at Georgetown University. The aim is to create an integrated system that can retrieve information located on disparate computer systems. The project work has been divided into two phases: BioSYNTHESIS I, development of a single menu to access various databases which reside on different computers; and BioSYNTHESIS II, development of a search component that facilitates complex searching for the user. BioSYNTHESIS II will accept a user's query and conduct a search for appropriate information in the IAIMS databases at Georgetown. For information not available at Georgetown, such as full text, it will access selected remote systems and translate the search query as appropriate for the target system. The search through various computer systems and different databases with unique storage and retrieval structures will be transparent to the user. BioSYNTHESIS I is complete and available to users. The design work for BioSYNTHESIS II is under way and will continue as a multiyear technical research effort of the proposed Georgetown IAIMS implementation project.

6.
The ability to find tables and extract information from them is a necessary component of many information retrieval tasks. Documents often contain tables in order to communicate densely packed, multi-dimensional information. Tables do this by employing layout patterns to efficiently indicate fields and records in two-dimensional form. Their rich combination of formatting and content presents difficulties for traditional retrieval techniques. This paper describes techniques for extracting tables from text and retrieving answers from the extracted information. We compare machine learning (especially, Conditional Random Fields) and heuristic methods for table extraction. To retrieve answers, our approach creates a cell document, which contains the cell and its metadata (headers, titles) for each table cell, and the retrieval model ranks the cells of the extracted tables using a language-modeling approach. Performance is tested using government statistical Web sites and news articles, and errors are analyzed in order to improve the system.
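The cell-document idea in this abstract can be illustrated with a minimal sketch. The abstract does not state which language model or smoothing the authors use, so the Dirichlet-smoothed query likelihood below is an assumption, as are the function names `build_cell_document` and `rank_cells`, the parameter `mu`, and the toy table contents.

```python
import math
from collections import Counter


def build_cell_document(cell, row_header, col_header, title):
    """Join a table cell with its metadata into one bag of lowercase tokens."""
    return " ".join([cell, row_header, col_header, title]).lower().split()


def rank_cells(query, cell_docs, mu=10.0):
    """Return cell indices ordered by Dirichlet-smoothed query log-likelihood.

    Assumption: this smoothing scheme is only one possible language-modeling
    approach; the abstract does not say which one the paper uses.
    """
    collection = Counter(t for doc in cell_docs for t in doc)
    total = sum(collection.values())
    vocab = len(collection)
    scored = []
    for idx, doc in enumerate(cell_docs):
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            # Add-one collection estimate keeps unseen query terms finite.
            p_coll = (collection[term] + 1) / (total + vocab + 1)
            score += math.log((tf[term] + mu * p_coll) / (len(doc) + mu))
        scored.append((score, idx))
    # Highest score first; ties broken by the original cell order.
    return [idx for score, idx in sorted(scored, key=lambda s: (-s[0], s[1]))]


cells = [
    build_cell_document("5.2", "unemployment rate", "2003", "labor statistics"),
    build_cell_document("6.1", "unemployment rate", "2004", "labor statistics"),
    build_cell_document("120", "population", "2003", "census"),
]
ranking = rank_cells("unemployment rate 2004", cells)
```

Because each cell carries its own headers and title, a query about a row and column label can match the right cell even though the cell itself holds only a number.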

7.
8.
We present a system for multilingual information retrieval that allows users to formulate queries in their preferred language and retrieve relevant information from a collection containing documents in multiple languages. The system is based on a process of document level alignments, where documents of different languages are paired according to their similarity. The resulting mapping allows us to produce a multilingual comparable corpus. Such a corpus has multiple interesting applications. It allows us to build a data structure for query translation in cross-language information retrieval (CLIR). Moreover, we also perform pseudo relevance feedback on the alignments to improve our retrieval results. And finally, multiple retrieval runs can be merged into one unified result list. The resulting system is inexpensive, adaptable to domain-specific collections and new languages and has performed very well at the TREC-7 conference CLIR system comparison.

9.
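In view of the current state of Web information retrieval and the problems of existing intelligent retrieval systems, this paper proposes a "pre-controlled" (先控) intelligent retrieval system aimed at basic users. It makes full use of high-quality classified directories of network resources, supported by a visual "knowledge map" display, to locate the scope of a user's information need quickly and accurately and thereby improve retrieval efficiency and precision. The implementation techniques and the problems still to be solved are also analyzed.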

10.
OBJECTIVES: HealthCyberMap (HCM; http://healthcybermap.semanticweb.org) is a web-based service for healthcare professionals and librarians, patients and the public in general that aims at mapping parts of the health information resources in cyberspace in novel ways to improve their retrieval and navigation. METHODS AND SERVICE DESCRIPTION: HCM adopts a clinical metadata framework built upon a clinical coding ontology for the semantic indexing, classification and browsing of Internet health information resources. A resource metadata base holds information about selected resources. HCM then uses GIS (Geographic Information Systems) spatialization methods to generate interactive navigational cybermaps from the metadata base. These visual cybermaps are based on familiar medical metaphors. CONCLUSIONS: HCM cybermaps can be considered as semantically spatialized, ontology-based browsing views of the underlying resource metadata base. Using a clinical coding scheme as a metric for spatialization ('semantic distance') is unique to HCM and is very much suited for the semantic categorization and navigation of Internet health information resources. Clinical codes ensure reliable and unambiguous topical indexing of these resources. HCM also introduces a useful form of cyberspatial analysis for the detection of topical coverage gaps in the resource metadata base using choropleth (shaded) maps of human body systems.

11.
苏颖 《情报工程》2015,1(5):008-017
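Patent search is a very complex process, and users need support to complete search tasks quickly and efficiently. Many steps of the patent search process can be carried out with the help of tools, including query formulation tools. Query formulation is a highly manual task; tools can only precompute potentially useful data and visualize it for the user. In information retrieval systems there are many ways to visualize the query process and the query results. This study proposes two typical prototype system designs for comparing different query expressions during patent search. The prototypes cover two factors, query expression construction and result set size, both of which are essential for patent domain experts exploring how adjustments to a query expression affect retrieval efficiency. The system developed here helps to refine complex query expressions step by step during patent search, and its design is based on domain-expert knowledge engineering.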

12.
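In view of the current state of Web information retrieval and the problems of existing intelligent retrieval systems, this paper proposes a "pre-controlled" (先控) intelligent retrieval system aimed at basic users. It makes full use of high-quality classified directories of network resources, supported by a visual "knowledge map" display, to locate the scope of a user's information need quickly and accurately and thereby improve retrieval efficiency and precision. The implementation techniques and the problems still to be solved are also analyzed.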

13.
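[Purpose/Significance] This study aims to build an evaluation index system for the quality of user-generated answers in social Q&A communities, to enable automated, user-need-oriented evaluation and filtering of answer quality, and to improve the quality of knowledge services in such communities. [Method/Process] Social-emotional features and user features are introduced, and factor analysis and structural equation modeling are used to construct the index system empirically. An automated answer quality evaluation method is designed on the basis of a GA-BP neural network model. Finally, data from the Zhihu website are used in an applied study of the index system and the automated evaluation method. [Results/Conclusions] The resulting index system has five dimensions: answer text features, answerer features, timeliness features, user features, and social-emotional features. Experimental analysis shows that the GA-BP-based automated evaluation method achieves higher accuracy and lower mean error than other methods; it is feasible and effective, and can be applied and promoted further in practice.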

14.
杨秀丹  李皓 《图书情报工作》2012,(19):95-100,127
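A field study of user context was carried out on a physics information retrieval system and, drawing on the cognitive viewpoint in information science, the cognitive elements of information retrieval systems were analyzed. On this basis, a cognitive information retrieval system model is designed, with cognitive elements added mainly at the information indexing stage and at the retrieval and matching stage. Finally, the construction process and the components of the model are described.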

15.
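To improve retrieval effectiveness, information retrieval systems must support users' information-seeking activities more effectively, which requires a better understanding of the nature of the interaction between users and information systems. Interactive information retrieval models help to analyze and understand these interactions and how to provide the corresponding support.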

16.
The majority of Internet users search for medical information online; however, many do not have an adequate medical vocabulary. Users might have difficulties finding the most authoritative and useful information because they are unfamiliar with the appropriate medical expressions describing their condition; consequently, they are unable to adequately satisfy their information need. We investigate the utility of bridging the gap between layperson and expert vocabularies; our approach adds the most appropriate expert expression to queries submitted by users, a task we call query clarification. We evaluated the impact of query clarification. Using three different synonym mappings and conducting two task-based retrieval studies, users were asked to answer medically-related questions using interleaved results from a major search engine. Our results show that the proposed system was preferred by users and helped them answer medical concerns correctly more often, with up to a 7% increase in correct answers over an unmodified query. Finally, we introduce a supervised classifier to select the most appropriate synonym mapping for each query, which further increased the fraction of correct answers (12%).
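Query clarification as described here, adding an expert expression to a lay query, can be sketched in a few lines. The abstract does not describe how the authors match lay phrases or choose among candidates, so the longest-match rule, the function name `clarify_query`, and the two-entry synonym mapping below are all illustrative assumptions.

```python
import re


def clarify_query(query, synonym_map):
    """Append the expert expression for the longest lay phrase found in the query.

    Assumption: longest-phrase matching is a stand-in; the abstract does not
    say how the most appropriate expert expression is selected.
    """
    lowered = query.lower()
    matches = [
        lay for lay in synonym_map
        if re.search(r"\b" + re.escape(lay.lower()) + r"\b", lowered)
    ]
    if not matches:
        return query  # nothing to clarify
    expert = synonym_map[max(matches, key=len)]
    if expert.lower() in lowered:
        return query  # the expert term is already present
    return f"{query} {expert}"


# Hypothetical lay-to-expert mapping, for illustration only.
lay_to_expert = {
    "heart attack": "myocardial infarction",
    "high blood pressure": "hypertension",
}
clarified = clarify_query("signs of a heart attack", lay_to_expert)
```

Keeping the user's original words and appending the expert term, rather than replacing them, lets the search engine match both lay and professional documents.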

17.
When people are connected together over ad hoc social networks, it is possible to ask questions and retrieve answers using the wisdom of the crowd. However, locating a suitable candidate for answering a specific unique question within larger ad hoc groups is non-trivial, especially if we wish to respect the privacy of users by providing deniability. All members of the network wish to source the best possible answers from the network, while at the same time controlling the levels of attention required to generate them by the collective group of individuals and/or the time taken to read all the answers. Conventional expert retrieval approaches rank users for a given query in a centralised indexing process, associating users with material they have previously published. Such an approach is antithetical to privacy, so we have looked to distribute the routing of questions and answers, converting the indexing process into one of building a forwarding table. Starting from the simple operation of flooding the question to everyone, we compare a number of different routing options, where decisions must be made based on past performance and exploitation of the knowledge of our immediate neighbours. We focus on fully decentralised protocols using ant-inspired tactics to route questions towards members of the network who may be able to answer them well. Simultaneously, privacy concerns are acknowledged by allowing both question asking and answering to be plausibly deniable. We have found that via our routing method, it is possible to improve answer quality and also reduce the total amount of user attention required to generate those answers.
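A forwarding table driven by past performance might look roughly like the sketch below. This is not the authors' protocol: the class `ForwardingTable`, the pheromone update rule, the evaporation rate, and the greedy next-hop choice are all assumptions made for illustration (ant-inspired algorithms usually pick the next hop probabilistically), and the sketch does not attempt to model plausible deniability.

```python
from collections import defaultdict


class ForwardingTable:
    """Per-topic pheromone weights over a node's immediate neighbours (hypothetical)."""

    def __init__(self, neighbours, evaporation=0.1):
        self.evaporation = evaporation
        # Every topic starts with equal weight on every neighbour.
        self.pheromone = defaultdict(lambda: {n: 1.0 for n in neighbours})

    def next_hop(self, topic):
        """Greedy choice: the neighbour with the most pheromone, ties by name."""
        weights = self.pheromone[topic]
        return max(sorted(weights), key=lambda n: weights[n])

    def reinforce(self, topic, neighbour, answer_quality):
        """Evaporate all trails for the topic, then reward the useful route."""
        weights = self.pheromone[topic]
        for n in weights:
            weights[n] *= 1.0 - self.evaporation
        weights[neighbour] += answer_quality


table = ForwardingTable(["alice", "bob", "carol"])
first_choice = table.next_hop("cooking")
table.reinforce("cooking", "carol", answer_quality=0.8)
later_choice = table.next_hop("cooking")
```

Each node only records which neighbour led to good answers, never who ultimately wrote them, which is the sense in which the indexing step becomes the building of a forwarding table.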

18.
郭海红  李姣  代涛 《情报工程》2016,2(6):039-049
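This paper aims to develop a classification method for Chinese health questions and, by manually annotating hypertension-related health questions, to analyze the public's hypertension-related health information needs and to provide a corpus for developing intelligent Chinese question answering systems on hypertension. Building on clinical question classification and a hierarchical model of consumer health information query scenarios, we constructed a four-level topic classification method for Chinese health questions. Five annotators independently labeled 2,000 sample questions drawn at random from nearly 100,000 hypertension-related questions collected from a Chinese health website, in order to refine the method and test its reliability, build an annotated corpus, and analyze the public's hypertension-related health information needs. Inter-rater reliability across the five annotators at the fourth-level categories was kappa = 0.63, which indicates reliable classification; the first-level categories reached a high level of agreement (kappa = 0.82), slightly better than comparable international studies. Questions in the six first-level categories of treatment, diagnosis, healthy lifestyle, clinical findings/condition management, epidemiology, and choice of healthcare provider accounted for 48.1%, 23.8%, 11.9%, 5.2%, 9.0%, and 1.9% of the sample, respectively. The classification method can be used to organize large collections of health questions and so improve retrieval efficiency; the annotated sample questions can serve as a corpus for research on automatic classification of hypertension-related health questions; and the resulting topic distribution can guide the development of knowledge resources for health websites. In addition, the way the classification method was constructed, the corpus annotation workflow, and the inter-rater reliability measurement can serve as a reference for question classification and corpus construction in open domains and other restricted domains.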
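The abstract reports kappa values for five annotators but does not say which kappa statistic was computed. Fleiss' kappa is one common choice when there are more than two raters, so the sketch below assumes it; the function name `fleiss_kappa` and the small count matrix are illustrative and are not the study's data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a matrix where counts[i][j] is the number of raters
    who assigned item i to category j (every row sums to the same rater count).

    Assumption: the paper may have used a different kappa variant.
    Raises ZeroDivisionError if every rating falls in a single category.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    # Observed agreement: share of agreeing rater pairs, averaged over items.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items
    # Chance agreement from the overall category proportions.
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    p_expected = sum(p * p for p in proportions)
    return (p_observed - p_expected) / (1 - p_expected)


# Toy example: 4 questions, 5 annotators, 2 categories (not the study's data).
toy_counts = [[5, 0], [0, 5], [3, 2], [2, 3]]
toy_kappa = fleiss_kappa(toy_counts)
```

In the toy matrix, two questions are labeled unanimously and two are split 3 to 2, which gives an observed agreement of 0.7 against a chance agreement of 0.5.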

19.
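An Exploration of XML Information Retrieval   Total citations: 4 (self-citations: 0, citations by others: 4)
廖述梅, 万常选, 徐升华. 《情报学报》, 2007, 381(2): 229-234
XML documents are semi-structured data with a hierarchical structure and textual content. Existing Web information retrieval is keyword-based full-text retrieval over HTML documents and cannot handle retrieval at the granularity of XML elements; XML database retrieval, meanwhile, performs exact matching and offers no ranking of results. Combining information retrieval and database techniques to study XML retrieval is therefore a natural step. Starting from the problem domain of XML retrieval, this paper describes the state and characteristics of XML information retrieval (XML IR) research in China and abroad, and analyzes the current hot topics and open difficulties of XML IR.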

20.
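Interactive cross-language information retrieval is an important branch of information retrieval. Building on an analysis of theoretical research into the interactive cross-language retrieval process, evaluation measures, and progress on user behavior, this study designs a user retrieval experiment in which users take part in the whole cross-language retrieval process. The results show that: users' search terms come mainly from the titles of the search topics; users judge document relevance with fairly high accuracy; the full text of target-language documents, translated abstracts, and translated full text are all accepted by users as a basis for judgment; translation refinement, and its combination with query expansion, are very effective in an interactive setting; users remain willing to make further selections among translations after feedback; and users both need and approve of interacting with a cross-language retrieval system. Analysis of user behavior helps guide the design and practice of interactive cross-language information retrieval systems.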
