Similar literature
 17 similar documents found (search time: 78 ms)
1.
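[Purpose/Significance] Building a cross-language information retrieval model based on a multilingual ontology helps users obtain information resources in different languages while working in a language they know well. [Method/Process] Through ontology design and the design of the retrieval model's functional modules, a Chinese-English cross-language information retrieval model based on a digital publishing domain ontology is established, and the model is implemented in Java on the Lucene search engine architecture. [Results/Conclusions] A multilingual domain ontology is explicit, formalized, shared, conceptualized, and clearly structured, so it can serve as the semantic layer of a cross-language information retrieval system and provide a semantic representation of information resources. Testing shows that the model handles word segmentation, query expansion, and semantic association reasonably well, moving cross-language information retrieval toward the semantic level.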

2.
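Research on Ontology-Driven Cross-Language Information Retrieval   Total citations: 5, self-citations: 0, citations by others: 5
Analyzes the translation ambiguity problem in cross-language information retrieval, points out that introducing a multilingual ontology can improve the accuracy of semantic disambiguation, examines two foreign cross-language information retrieval systems in detail, and on that basis proposes a Chinese-English cross-language information retrieval model based on a bilingual ontology, together with an implementation scheme.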

3.
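A Preliminary Study on Visualization in Multilingual Information Retrieval Systems   Total citations: 1, self-citations: 0, citations by others: 1
Research on multilingual retrieval is very important now that the variety of information keeps growing. Besides retrieval techniques and translation functions, the use of information visualization and interface design is another key research topic. According to earlier studies and review articles, information visualization has been shown to be an effective way to help users carry out multilingual information retrieval. The study proposes a visualization model for multilingual information retrieval systems together with its design scheme, and points out future directions for the field.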

4.
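Research on the Theory and Application of Cross-Language Information Retrieval   Total citations: 5, self-citations: 0, citations by others: 5
郭宇锋  黄敏 《图书与情报》2006,35(2):79-81,84
As the Internet becomes increasingly global, cross-language information retrieval is becoming an important topic in the field of information retrieval. Cross-language retrieval lets a query posed in one language retrieve information written in another language. The article discusses the theory and application of cross-language information retrieval and proposes an approach for applying it to specialized domain databases.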

5.
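How to improve the quality of multilingual information services has become an important research question in scientific and technical information services such as digital libraries. The article first reviews domestic and international research on multilingual information services, and then describes, from the two perspectives of cross-language information retrieval and machine translation, how the multilingual information service research of the National Science and Technology Library has been applied in its online service system. Introducing cross-language information retrieval and abstract translation into a digital library's online query system is still an exploratory attempt within digital library information services in China, and it can provide experience for further improving the quality of multilingual information services in digital libraries.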

6.
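Because information on the Internet is multilingual while the number of languages a person can use proficiently is limited, language has become one of the main obstacles to obtaining and understanding information. Cross-language information retrieval technology, which emerged in response, has attracted growing attention from researchers and practitioners. This paper summarizes the current techniques and methods in the field from the two perspectives of bilingual retrieval and multilingual retrieval, and discusses optimization techniques for cross-language information retrieval and the related evaluation efforts.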

7.
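Research on the Application Mechanism of Ontologies in Cross-Language Information Retrieval   Total citations: 3, self-citations: 1, citations by others: 2
Explains the meaning of a multilingual ontology and identifies the domain knowledge it corresponds to in different languages. Analyzes how multilingual ontologies improve cross-language information retrieval in three respects: query expansion, semantic annotation, and concept-based indexing. By describing how the EuroWordNet and Cindor systems align multilingual ontology concepts, the paper discusses mapping methods for the multilingual ontology base, which is the most critical element in applying ontologies to cross-language information retrieval. It concludes that representing concepts in an intermediate language and linking them to the vocabulary of each language through dictionary translation is a sound method of multilingual ontology mapping.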
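The intermediate-language mapping described in this abstract can be pictured with a small sketch. It is only an illustration under stated assumptions: the concept identifiers, the sample vocabulary, and the function names `build_index` and `translate_via_concepts` are hypothetical and do not come from EuroWordNet, Cindor, or the paper.

```python
# Hypothetical concept table: each language-neutral concept ID links to the
# terms that express it in each language (sample data, not from the paper).
CONCEPTS = {
    "C001": {"en": ["ontology"], "zh": ["本体"]},
    "C002": {"en": ["retrieval", "search"], "zh": ["检索"]},
}


def build_index(concepts):
    """Map (language, lowercased term) to the set of concept IDs it expresses."""
    index = {}
    for concept_id, labels in concepts.items():
        for lang, terms in labels.items():
            for term in terms:
                index.setdefault((lang, term.lower()), set()).add(concept_id)
    return index


def translate_via_concepts(term, source_lang, target_lang, concepts, index):
    """Translate a term by going source term -> concept ID -> target terms."""
    results = []
    for concept_id in sorted(index.get((source_lang, term.lower()), ())):
        for target_term in concepts[concept_id].get(target_lang, []):
            if target_term not in results:
                results.append(target_term)
    return results


INDEX = build_index(CONCEPTS)
```

Because every language links only to the shared concept IDs, adding a language means adding one set of links rather than a dictionary for every language pair.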

8.
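Describing multilingual information precisely at the semantic level, so that users receive accurate cross-language information resources, is a practical problem that multilingual information services must face and solve. Multilingual thesauri are one of the effective tools for addressing it. The article first introduces three widely used foreign multilingual thesauri, then analyzes how multilingual thesauri are applied in two areas, automatic indexing of multilingual information and multilingual information retrieval, and shows their potential value for multilingual information services.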

9.
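Design and Implementation of an English-Chinese Interactive Cross-Language Retrieval System   Total citations: 1, self-citations: 0, citations by others: 1
To address query translation ambiguity in cross-language information retrieval, the paper adopts an interactive system design approach and studies cross-language retrieval techniques based on relevance feedback. It proposes an English-Chinese interactive cross-language information retrieval system that implements relevance feedback functions such as user-assisted query translation, multi-level user relevance judgments, translation optimization, and query expansion. The results show a marked improvement in retrieval effectiveness.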

10.
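China's sharing, development, and use of information in heterogeneous online archival databases remains at a low level of practice, and its quality and efficiency need further improvement. Integrating and retrieving heterogeneous archival information involves many methodological and technical difficulties. The paper first studies semantics-based archival information integration, the organization of heterogeneous archival information based on XML EAD, and the application of ontology methods. It then studies platform heterogeneity and semantic heterogeneity in archival information retrieval, where semantic heterogeneity covers field mapping, data deduplication, and the unification of abbreviations, and gives a solution for each. It concludes that the approach improves the quality of sharing and use of archival information resources and can promote the formulation and refinement of the relevant standards in China.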

11.
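Design of a Cross-Language Comprehensive Search Engine   Total citations: 14, self-citations: 1, citations by others: 13
Analyzes how information is currently distributed on the Web and evaluates the relevant existing technologies. On this basis, the paper designs a cross-language comprehensive search engine, a system intended to overcome language barriers in Web search.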

12.
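Analyzes the basic models of cross-language information retrieval and the key techniques of translation disambiguation. Query translations are disambiguated and optimized with a method that weights word-pair co-occurrence rates by the distance between words. On this basis, a cross-language product information retrieval system is built and applied to book search. Experimental results show that both translation quality and retrieval effectiveness improve.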
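A minimal sketch of co-occurrence-based translation disambiguation follows. The abstract does not give the paper's formula, so the scoring rule here (co-occurrence rate divided by the positional distance between the two query terms) is an assumption, as are the function name `disambiguate` and the sample co-occurrence table.

```python
from itertools import product


def disambiguate(candidates, cooccurrence):
    """Pick one translation per query term.

    candidates: list (in query order) of lists of candidate translations.
    cooccurrence: dict mapping frozenset({a, b}) to a co-occurrence rate.
    Assumed scoring: each pair contributes rate / distance, where distance
    is the gap between the two terms' positions in the query.
    """
    best, best_score = None, float("-inf")
    for combo in product(*candidates):
        score = 0.0
        for i in range(len(combo)):
            for j in range(i + 1, len(combo)):
                rate = cooccurrence.get(frozenset((combo[i], combo[j])), 0.0)
                score += rate / (j - i)
        if score > best_score:
            best, best_score = combo, score
    # best stays None when some term has no candidate translation at all.
    return list(best) if best is not None else []


# Hypothetical co-occurrence rates, for illustration only.
SAMPLE_COOCCURRENCE = {
    frozenset(("bank", "account")): 0.8,
    frozenset(("shore", "narrative")): 0.1,
}
```

The exhaustive search over combinations is fine for short queries; a longer query would need a greedy or beam search instead.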

13.
14.
Focused web crawling in the acquisition of comparable corpora   总被引:2,自引:0,他引:2  
Cross-Language Information Retrieval (CLIR) resources, such as dictionaries and parallel corpora, are scarce for special domains. Obtaining comparable corpora automatically for such domains could be an answer to this problem. The Web, with its vast volumes of data, offers a natural source for this. We experimented with focused crawling as a means to acquire comparable corpora in the genomics domain. The acquired corpora were used to statistically translate domain-specific words. The same words were also translated using a high-quality, but non-genomics-related parallel corpus, which fared considerably worse. We also evaluated our system with standard information retrieval (IR) experiments, combining statistical translation using the Web corpora with dictionary-based translation. The results showed improvement over pure dictionary-based translation. Therefore, mining the Web for comparable corpora seems promising.

15.
Multilingual information retrieval is generally understood to mean the retrieval of relevant information in multiple target languages in response to a user query in a single source language. In a multilingual federated search environment, different information sources contain documents in different languages. A general search strategy in multilingual federated search environments is to translate the user query to each language of the information sources and run a monolingual search in each information source. It is then necessary to obtain a single ranked document list by merging the individual ranked lists from the information sources that are in different languages. This is known as the results merging problem for multilingual information retrieval. Previous research has shown that the simple approach of normalizing source-specific document scores is not effective. On the other hand, a more effective merging method was proposed to download and translate all retrieved documents into the source language and generate the final ranked list by running a monolingual search in the search client. The latter method is more effective but is associated with a large amount of online communication and computation costs. This paper proposes an effective and efficient approach for the results merging task of multilingual ranked lists. In particular, it downloads only a small number of documents from the individual ranked lists of each user query to calculate comparable document scores by utilizing both the query-based translation method and the document-based translation method. Then, query-specific and source-specific transformation models can be trained for individual ranked lists by using the information of these downloaded documents. These transformation models are used to estimate comparable document scores for all retrieved documents and thus the documents can be sorted into a final ranked list.
This merging approach is efficient as only a subset of the retrieved documents are downloaded and translated online. Furthermore, an extensive set of experiments on the Cross-Language Evaluation Forum (CLEF) data has demonstrated the effectiveness of the query-specific and source-specific results merging algorithm against other alternatives. The new research in this paper proposes different variants of the query-specific and source-specific results merging algorithm with different transformation models. This paper also provides thorough experimental results as well as detailed analysis. All of the work substantially extends the preliminary research in (Si and Callan, in: Peters (ed.) Results of the cross-language evaluation forum-CLEF 2005, 2005).
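The merging idea in this abstract can be sketched as follows. The abstract mentions several transformation models without specifying them, so the one used here (a least-squares line fitted per source for a single query) is an assumption, and the names `fit_linear` and `merge_ranked_lists` are hypothetical.

```python
def fit_linear(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    variance = sum((x - mean_x) ** 2 for x in xs)
    if variance == 0:
        # All training scores identical: fall back to a constant model.
        return 0.0, mean_y
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def merge_ranked_lists(ranked_lists, training_pairs):
    """Merge per-source ranked lists for one query.

    ranked_lists: {source: [(doc_id, source_score), ...]}
    training_pairs: {source: [(source_score, comparable_score), ...]},
        taken from the few documents that were downloaded and re-scored.
    Returns [(doc_id, estimated_comparable_score), ...], best first.
    """
    merged = []
    for source, docs in ranked_lists.items():
        xs, ys = zip(*training_pairs[source])
        slope, intercept = fit_linear(xs, ys)
        merged.extend((doc_id, slope * score + intercept) for doc_id, score in docs)
    merged.sort(key=lambda pair: pair[1], reverse=True)
    return merged
```

Only the training documents need to be downloaded and translated; every other document receives an estimated comparable score from its source's fitted model.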

16.
Research on cross-language information retrieval (CLIR) has typically been restricted to settings using binary relevance assessments. In this paper, we present evaluation results for dictionary-based CLIR using graded relevance assessments in a best match retrieval environment. A text database containing newspaper articles and a related set of 35 search topics were used in the tests. First, monolingual baseline queries were automatically formed from the topics. Secondly, source language topics (in English, German, and Swedish) were automatically translated into the target language (Finnish), using structured target queries. The effectiveness of the translated queries was compared to that of the monolingual queries. Thirdly, pseudo-relevance feedback was used to expand the original target queries. CLIR performance was evaluated using three relevance thresholds: stringent, regular, and liberal. When the regular or liberal threshold was used, a reasonable performance was achieved. Using the stringent threshold, equally high performance could not be achieved. On all the relevance thresholds the performance of the translated queries was successfully raised by pseudo-relevance feedback based query expansion. However, the performance of the stringent threshold in relation to the other thresholds could not be raised by this method.
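Pseudo-relevance feedback, as used in this abstract for query expansion, can be sketched roughly as below. This is a generic frequency-based variant, not the term-selection method of the paper; the function name `expand_query` and its parameters are hypothetical.

```python
from collections import Counter


def expand_query(query_terms, ranked_docs, top_docs=2, extra_terms=2):
    """Expand a query with frequent terms from the top-ranked documents.

    ranked_docs: list of tokenized documents (lists of terms), best first.
    The top `top_docs` documents are assumed relevant without user input,
    which is what makes the feedback "pseudo".
    """
    counts = Counter()
    for doc in ranked_docs[:top_docs]:
        counts.update(term for term in doc if term not in query_terms)
    expansion = [term for term, _ in counts.most_common(extra_terms)]
    return list(query_terms) + expansion
```

A production system would typically weight candidate terms (for example by a tf-idf style score) rather than by raw frequency, which favors common words.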

17.
Abstract

This study aims to identify, collect and critically review the research literature on Multilingual Digital Libraries published in English from 1997 to 2012.

Design/methodology/approach: The present literature review has followed the rules of systematic review. In particular, the identified relevant papers were categorized based on their expressed aim on two core themes, that of system-centered and user-centered studies. The assigned papers were further analyzed and six sub-themes emerged for the system-centered studies and four for the user-centered studies. Additional categorization was also provided according to type of publication.

Findings: The literature concerning Multilingual Digital Libraries is vast and mainly focuses on two aspects: the “System” and the “Users”. The majority of papers tried to meet the challenges raised for enabling multilingual information retrieval in Digital Libraries. Unfortunately, these efforts were undertaken by a small number of researchers or research groups apparently working in isolation, resulting in the development of numerous different tools and techniques. Relatively few studies have focused on the user and aimed to explore users' behavior and expectations when interacting with Multilingual Digital Libraries. As a result, further research is needed to reach some tangible and usable findings.

Originality/value: This literature review captures the diversity of the research conducted regarding multilingual information access and retrieval in Digital Libraries. It organizes the vast literature in comprehensive themes and sub-themes enabling easy access to specific information.

Limitations: Due to language restrictions, this study reviews only papers in English published from 1997 to 2012.
