Found 20 similar documents; search took 171 ms.
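Research abroad on multilingual information organization and retrieval has made marked progress in recent years. Drawing on literature indexed in WoS, ACM, Emerald, Elsevier, ProQuest, Springer and other databases, this paper reviews work in the field over the past 10 years. Research abroad concentrates on the following issues: multilingual ontology construction and reconciliation; building a multilingual Semantic Web based on linked data; interoperability of cross-lingual language resources and knowledge organization systems; multilingual text classification and clustering; user information behavior in multilingual environments; multilingual information retrieval models; multilingual information retrieval methods and techniques; development and evaluation of multilingual information retrieval systems; domain-specific multilingual information retrieval; and interactive multilingual information retrieval. The main implications for China are to make greater use of empirical research methods, develop practical multilingual information retrieval systems, emphasize semantics-based information organization and retrieval, and extend applied research into specific subject domains.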

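How to improve the quality of multilingual information services has become an important research question for digital libraries and other scientific and technical information services. The article first surveys domestic and international research on multilingual information services. It then describes, from the two angles of cross-language information retrieval and machine translation, how the multilingual information service research of the National Science and Technology Literature Center (国家科技文献中心) has been applied in the National Science and Technology Literature Online Service System. Introducing cross-language retrieval and abstract translation into a digital library's online query system is still an exploratory attempt in China's digital library information services, and it offers experience for further improving the quality of multilingual information services in digital libraries.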

Research on Personalized Cross-Language Academic Search Technology   Total citations: 1, self-citations: 0, citations by others: 1
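An academic search engine is a vertical (domain-specific) search engine, but because it lacks personalized services, users retrieve academic literature inefficiently and massive digital academic resources go underused. Using Google Translate, this paper studies machine-translation-based cross-language academic retrieval across five languages: Chinese, English, Russian, French and Spanish. Building on cross-language academic search, it studies personalized retrieval and proposes a clustering-based personalized information retrieval method: by observing a user's clicks on clusters of search results, it generates and updates a real-time user interest model, computes the similarity between that model and each returned result with the cosine formula, and returns the results re-ranked by similarity. Experimental results confirm the effectiveness of the proposed method.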
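
The cosine-based re-ranking step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the bag-of-words vectors, and the decay-based profile update in `update_interest` are all assumptions introduced here, since the abstract does not specify them.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine of the angle between two sparse term-weight vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def update_interest(profile, clicked_cluster_terms, decay=0.8):
    """Hypothetical update rule: decay old weights, then add the clicked cluster's terms."""
    updated = {t: w * decay for t, w in profile.items()}
    for term, count in Counter(clicked_cluster_terms).items():
        updated[term] = updated.get(term, 0.0) + count
    return updated

def rerank(profile, results):
    """Order (doc_id, terms) results by similarity to the interest profile, highest first."""
    scored = [(cosine(profile, Counter(terms)), doc_id) for doc_id, terms in results]
    return [doc_id for _, doc_id in sorted(scored, key=lambda s: -s[0])]

# Toy data: the user clicked a cluster about retrieval and translation.
profile = update_interest({}, ["retrieval", "translation", "retrieval"])
results = [
    ("d1", ["protein", "folding"]),
    ("d2", ["machine", "translation", "retrieval"]),
    ("d3", ["retrieval", "evaluation"]),
]
ranking = rerank(profile, results)
print(ranking)
```

With this toy profile, the document sharing both clicked terms is ranked first and the unrelated one last. A real system would likely use weighted (for example tf-idf) vectors rather than raw counts.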

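The Influence of Users' Natural and Social Attributes on Language Use Behavior in Web Search   Total citations: 2, self-citations: 0, citations by others: 2
Taking as its object the influence of users' natural and social attributes on their use of retrieval language in searching, this exploratory study examines the factors that affect user behavior. Combining the analysis of an online questionnaire with a comparative user experiment, it draws conclusions about how gender, age, educational level and disciplinary training affect users' language use. Studying this question helps identify the factors at work in the search process, which can then be modeled to improve search engine service quality.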

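[Purpose/Significance] To investigate the level and characteristics of online health information search behavior among people with health anxiety, and to explore the influence of health anxiety on that behavior. [Method/Process] From January 30 to February 12, 2021, 788 anonymous questionnaires were distributed to residents of Changsha by convenience sampling. User data were collected with the Somatic Symptom Severity Scale (PHQ-15), the Online Health Information Search Behavior scale (OHIB), the Intolerance of Uncertainty Scale (IUS) and the Health Anxiety Scale (SHAI); 699 valid questionnaires were obtained, a valid response rate of 88.7%. The data were analyzed with descriptive statistics, analysis of variance, correlation analysis and factor analysis, and a structural equation model was built to test the hypotheses. [Results/Conclusions] In this study: (1) Users with health anxiety had an SHAI score of 22.94±6.99 and a search behavior score of 3.25±0.80. (2) The OHIB score of health-anxious users was positively correlated with the total HA score and with the scores on the illness likelihood and negative consequences dimensions (r=0.117-0.184, P<0.05). (3) Health anxiety differed significantly across search behaviors (search context, motivation, frequency, type of information searched, type of disease searched, search platform) (p<0.05). (4) IU mediated the relationship between health anxiety and online health information search behavior. Conclusions: (1) Search behavior preferences differ across levels of health anxiety; users with higher levels of health anxiety, ...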

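Building on existing digital library information retrieval systems, and addressing problems such as the low precision and recall of retrieval results, the paper combines intelligent interactive retrieval with CLIR, designs a cross-language interactive retrieval model, and applies it in a digital library system.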

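A Study of Web Users' Language Use Behavior in Information Acquisition   Total citations: 7, self-citations: 1, citations by others: 6
Taking users' information acquisition habits in web search as its object, this study aims at an overall understanding of basic vocabulary phenomena in web search and of how they are perceived and used. It relies mainly on an online questionnaire survey, compared against conclusions from a comparative user experiment, and reports findings on Web 2.0 tagging, classification language and its presentation in browsing, the sources and types of search vocabulary, and system-suggested related terms. Tags already have great influence in Web 2.0; users rely on classification language when browsing and are also well aware of system-suggested related terms when querying; and more specific search terms tend to yield better results in web search. Understanding these characteristics helps improve web information access services of all kinds.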

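Motivations, Information Source Selection, and Behavioral Paths of Cross-Source Health Information Seeking   Total citations: 1, self-citations: 0, citations by others: 1
This study explores users' cross-source health information seeking behavior when they face multiple health information sources, and reveals the motivations, information source selection, and behavioral paths of such seeking. Combining the diary method with semi-structured in-depth interviews, it collected health information seeking diaries and follow-up interview data from 26 participants and analyzed them with NVivo11. The study finds that doctor-user communication barriers, lack of trust, users' self-regulation, and users' need for psychological security motivate cross-source health information seeking. Self-reports by users with similar conditions, users' path dependence (habit), herd mentality, and the authority and credibility of information sources are the main factors influencing source selection during cross-source seeking. In addition, when users seek health information across sources, their psychological needs clearly outweigh their practical needs, and they tend to judge the quality of online health information by "feel". The study constructs a behavioral path model of cross-source health information seeking. Its results help improve users' awareness of their health information seeking behavior, their ability to obtain health information across sources, and their health information literacy. They also offer a reference for relevant departments formulating strategies to raise citizens' health information literacy, and for the development and design of information systems that support cross-source health information seeking.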

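Cross-language subject indexing is one important way to meet the information needs of multilingual users. Taking subject indexing at the National Library of China (国家图书馆) as an example, this paper analyzes current problems in cross-language subject indexing, explores paths to implementing it, and proposes strategies for carrying it out that fit the library's actual circumstances.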

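Research on the Theory and Application of Cross-Language Information Retrieval   Total citations: 5, self-citations: 0, citations by others: 5
Guo Yufeng (郭宇锋), Huang Min (黄敏). 《图书与情报》 (Library and Information), 2006, 35(2): 79-81, 84
With the globalization of the Internet, cross-language information retrieval has become an increasingly important topic in information retrieval; it lets a query posed in one language retrieve information written in another. The article discusses the theory and application of cross-language information retrieval and proposes an approach to applying it in specialized-domain databases.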

The problem of language in Web searching has been discussed primarily in the area of cross-language information retrieval (CLIR). However, much CLIR research centers on investigation of the effectiveness of automatic translation techniques. The case study reported here explored bilingual user behaviors, perceptions, and preferences with respect to the capability of the Web as a multilingual information resource. Twenty-eight bilingual academic users from Myongji University in Korea were recruited for the study. Findings show that the subjects did not use Web search engines as multilingual tools. For search queries, they selected a language that represents their information need most accurately depending on the types of information task rather than choosing their first language. Subjects expressed concerns about the accuracy of machine translation of scholarly terminologies and preferred to have user control over multilingual Web searches.

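Interactive cross-language information retrieval is an important branch of information retrieval. Building on an analysis of theoretical work on the interactive cross-language retrieval process, evaluation measures, and progress in user behavior research, the paper designs a user retrieval experiment in which users take part in the whole cross-language retrieval process. The results show that: users' search terms come mainly from the title of the search topic; users judge document relevance with fairly high accuracy; target-language full text, translated abstracts, and translated full text are all accepted by users as bases for judgment; translation refinement, and its combination with query expansion, are very effective in an interactive setting; users remain willing to make further choices among translations after feedback; and users both need and approve of interaction with cross-language information retrieval systems. User behavior analysis helps guide the design and practice of interactive cross-language information retrieval systems.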

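Design and Implementation of an English-Chinese Interactive Cross-Language Retrieval System   Total citations: 1, self-citations: 0, citations by others: 1
To address the ambiguity of query translation in cross-language information retrieval, the paper adopts an interactive system design approach, studies and analyzes relevance-feedback-based cross-language retrieval techniques, and proposes an English-Chinese interactive cross-language information retrieval system. The system implements relevance feedback functions such as user-assisted query translation, multi-level user relevance judgments, and translation refinement with query expansion, and markedly improves retrieval effectiveness.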

Given a user question, the goal of a Question Answering (QA) system is to retrieve answers rather than full documents or even best-matching passages, as most Information Retrieval systems currently do. In this paper, we present BRUJA, a QA system for the management of multilingual collections. BRUJA works with three languages (English, Spanish and French). The BRUJA architecture is not formed with three monolingual QA systems but instead uses English as an interlingua to perform the usual QA tasks such as question classification and answer extraction. In addition, BRUJA uses Cross Language Information Retrieval (CLIR) techniques to retrieve relevant documents from a multilingual collection. On the one hand, we have more documents to find answers from, but on the other hand, we are introducing noise into the system because of translations to the interlingua (English) and the CLIR module. The question is whether the difficulty of managing three languages is worth it or whether a monolingual QA system delivers better results. We report on in-depth experimentation and demonstrate that our multilingual QA system gets better results than its monolingual counterpart whenever it uses good translation resources and, especially, state-of-the-art CLIR techniques.

This paper reviews literature on dictionary-based cross-language information retrieval (CLIR) and presents CLIR research done at the University of Tampere (UTA). The main problems associated with dictionary-based CLIR, as well as appropriate methods to deal with the problems are discussed. We will present the structured query model by Pirkola and report findings for four different language pairs concerning the effectiveness of query structuring. The architecture of our automatic query translation and construction system is presented.
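
The abstract does not spell out how a structured query is built. As it is commonly described, query structuring groups all dictionary translations of one source word under a synonym operator, so that alternative translations count as one key rather than as separate keys. The sketch below is only an illustration under that reading: the InQuery-style `#sum`/`#syn` syntax, the function name, and the toy Finnish-English dictionary are assumptions made here, not details taken from the paper.

```python
def build_structured_query(source_terms, dictionary):
    """Group the translations of each source term into one synonym set."""
    groups = []
    for term in source_terms:
        # Words missing from the dictionary are kept untranslated.
        translations = dictionary.get(term, [term])
        groups.append("#syn(" + " ".join(translations) + ")")
    return "#sum(" + " ".join(groups) + ")"

# Hypothetical two-entry dictionary for illustration only.
toy_dictionary = {
    "kuva": ["picture", "image"],
    "haku": ["search", "retrieval"],
}
structured = build_structured_query(["kuva", "haku"], toy_dictionary)
print(structured)
```

An unstructured query would instead list all four English words as independent keys, which lets a source word with many translations dominate the query.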

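The paper reviews the state of research on named entity recognition and translation, proposes an information-extraction-based method for named entity recognition and translation together with a series of integrated optimizations of that method, and carries out cross-language information retrieval experiments based on named entity recognition and translation. The results show the importance of named entity recognition and translation in cross-language information retrieval, and demonstrate that the proposed translation weighting and web mining of out-of-vocabulary named entities significantly improve cross-language retrieval performance.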

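Query Translation Methods in Cross-Language Information Retrieval and Their Research Progress   Total citations: 10, self-citations: 0, citations by others: 10
The paper introduces the three basic approaches to cross-language text retrieval: query translation, document translation, and no translation. It then discusses the basic issues in, and research progress on, query translation, currently the most widely used approach, and finally summarizes the current state and trends of cross-language information retrieval.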

A usual strategy to implement CLIR (Cross-Language Information Retrieval) systems is the so-called query translation approach. The user query is translated for each language present in the multilingual collection in order to compute an independent monolingual information retrieval process per language. Thus, this approach divides documents according to language. In this way, we obtain as many different collections as languages. After searching in these corpora and obtaining a result list per language, we must merge them in order to provide a single list of retrieved articles. In this paper, we propose an approach to obtain a single list of relevant documents for CLIR systems driven by query translation. This approach, which we call 2-step RSV (RSV: Retrieval Status Value), is based on the re-indexing of the retrieved documents according to the query vocabulary, and it performs noticeably better than traditional methods. The proposed method requires query vocabulary alignment: given a word for a given query, we must know the translation or translations to the other languages. Because this is not always possible, we have investigated a mixed model. This mixed model is applied in order to deal with queries with partial word-level alignment. The results prove that even in this scenario, 2-step RSV performs better than traditional merging methods.
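
One possible reading of the two-step idea, re-indexing only the retrieved documents over aligned query vocabulary so that every document is scored on one scale, is sketched below. The abstract does not give the scoring formula, so the smoothed tf-idf weighting, the function name, and the toy English-Spanish data are all assumptions made for illustration.

```python
import math
from collections import Counter

def two_step_merge(alignment, retrieved):
    """alignment: concept -> set of aligned query words across languages.
    retrieved: doc_id -> token list, the union of the per-language result lists (step 1)."""
    # Map every aligned surface word to its language-independent concept.
    word_to_concept = {w: c for c, words in alignment.items() for w in words}

    # Step 2a: re-index the retrieved documents over the query concepts only.
    concept_tf = {
        doc_id: Counter(word_to_concept[t] for t in tokens if t in word_to_concept)
        for doc_id, tokens in retrieved.items()
    }

    # Step 2b: count document frequency per concept over the merged set.
    n_docs = len(retrieved)
    df = Counter(c for tf in concept_tf.values() for c in tf)

    # Step 2c: score all documents on one scale (smoothed tf-idf, an assumption).
    def score(tf):
        return sum(f * math.log(1 + n_docs / df[c]) for c, f in tf.items())

    return sorted(retrieved, key=lambda d: -score(concept_tf[d]))

alignment = {"patent": {"patent", "patente"}, "query": {"query", "consulta"}}
retrieved = {
    "en1": ["patent", "query", "patent"],
    "es1": ["patente", "texto"],
    "es2": ["consulta", "patente", "consulta", "patente"],
}
merged = two_step_merge(alignment, retrieved)
print(merged)
```

Because term statistics are computed per concept across all languages, scores from the English and Spanish lists become directly comparable, which per-language retrieval scores generally are not.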

Prior-art search in patent retrieval is concerned with finding all existing patents relevant to a patent application. Since patents often appear in different languages, cross-language information retrieval (CLIR) is an essential component of effective patent search. In recent years machine translation (MT) has become the dominant approach to translation in CLIR. Standard MT systems focus on generating proper translations that are morphologically and syntactically correct. Development of effective MT systems of this type requires large training resources and high computational power for training and translation. This is an important issue for patent CLIR, where queries are typically very long, sometimes taking the form of a full patent application, meaning that query translation using MT systems can be very slow. However, in contrast to MT, the focus for information retrieval (IR) is on the conceptual meaning of the search words regardless of their surface form or the linguistic structure of the output. Thus much of the complexity of MT is not required for effective CLIR. We present an adapted MT technique specifically designed for CLIR. In this method, IR text pre-processing in the form of stop word removal and stemming is applied to the MT training corpus prior to the training phase. Applying this step leads to a significant decrease in the computational and training resource requirements of MT. Experimental application of the new approach to the cross-language patent retrieval task from CLEF-IP 2010 shows the new technique to be up to 23 times faster than standard MT for query translations, while maintaining IR effectiveness statistically indistinguishable from standard MT when large training resources are used. Furthermore, the new method is significantly better than standard MT when only limited translation training resources are available, which can be a significant issue for translation in specialized domains. The new MT technique also enables patent document translation in a practical amount of time, with a resulting significant improvement in retrieval effectiveness.
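
The pre-processing step named in this abstract, stop word removal followed by stemming applied to each side of the MT training corpus, might look like the sketch below. The stop list and the crude suffix stripper are stand-ins chosen for illustration; a real setup would use a language-specific stop list and a proper stemmer for each side of the parallel corpus.

```python
STOP_WORDS = {"a", "an", "the", "of", "for", "is", "and", "to", "in"}  # illustrative list
SUFFIXES = ("ing", "ed", "es", "s")  # crude stand-in for a real stemmer

def stem(token):
    """Strip the first matching suffix, keeping a stem of at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(sentence):
    """Lowercase, drop stop words, and stem the remaining tokens."""
    tokens = sentence.lower().split()
    return [stem(t) for t in tokens if t not in STOP_WORDS]

def preprocess_corpus(sentences):
    """Apply the same IR pre-processing to every sentence of one corpus side."""
    return [preprocess(s) for s in sentences]

processed = preprocess("The method for coating the surfaces of printed boards")
print(processed)
```

Training on text reduced this way shrinks the vocabulary and sentence length, which is consistent with the lower training and translation cost the abstract reports.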

This study develops regression models for predicting the performance of cross-language information retrieval (CLIR). The model assumes that CLIR performance can be explained by two factors: (1) the ease of search inherent in each query and (2) the translation quality in the process of CLIR systems. As operational variables, monolingual information retrieval (IR) performance is used for measuring the ease of search, and the well-known evaluation metric BLEU is used to measure the translation quality. This study also proposes an alternative metric, weighted average for matched unigrams (WAMU), which is tailored to gauging translation quality for special IR purposes. The data for regression analysis are obtained from a retrieval experiment of English-to-Italian bilingual searches using the CLEF 2003 test collection. The CLIR and monolingual IR performances are measured by average precision score. The results show that the proposed regression model can explain about 60% of the variation in CLIR performance, and WAMU has more predictive power than BLEU. A back translation method for applying the regression model to operational CLIR systems in real situations is discussed.
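
Both the dependent variable and the ease-of-search variable in this study are average precision scores. For reference, the standard (non-interpolated) definition of average precision can be sketched as below; the function name and the toy ranking are introduced here for illustration and do not come from the study's data.

```python
def average_precision(ranked_ids, relevant_ids):
    """Mean of the precision values at each rank where a relevant document appears,
    divided by the total number of relevant documents."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)

# Relevant documents appear at ranks 1 and 4; one relevant document is never retrieved.
ap = average_precision(["d3", "d1", "d7", "d2"], {"d3", "d2", "d9"})
print(ap)  # (1/1 + 2/4) / 3 = 0.5
```

Dividing by the total number of relevant documents, not just the retrieved ones, is what penalizes the missing document in the example.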
