17 similar documents found (search time: 78 ms)
1.
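[Purpose/Significance] Building a cross-language information retrieval model based on a multilingual ontology helps users retrieve information resources in different languages using a language they are familiar with. [Method/Process] Through ontology design and the design of the retrieval model's functional modules, a Chinese-English cross-language information retrieval model based on a digital publishing domain ontology is established, and the model is implemented in Java on the Lucene search engine architecture. [Result/Conclusion] A multilingual domain ontology is explicit, formal, shared, conceptualized, and clearly structured, and can serve as the semantic layer of a cross-language information retrieval system to express information resources semantically. Tests show that the model performs word segmentation, query expansion, and semantic association well, advancing cross-language information retrieval toward the semantic level.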
2.
Research on ontology-driven cross-language information retrieval (total citations: 5; self-citations: 0; citations by others: 5)
Wu Dan (吴丹) 《现代图书情报技术》 2006, 1(5): 22-26
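Analyzes the translation ambiguity problem in cross-language information retrieval, points out that introducing a multilingual ontology can improve the accuracy of semantic disambiguation, examines two foreign cross-language information retrieval systems in detail, and on that basis proposes a Chinese-English cross-language information retrieval model based on a bilingual ontology, together with an implementation plan.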
3.
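A preliminary study of visualization in multilingual information retrieval systems (total citations: 1; self-citations: 0; citations by others: 1)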
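Research on multilingual retrieval is very important now that the variety of information keeps growing. Besides research on retrieval techniques and translation functions, the use of information visualization and interface design is another key research topic. According to previous studies and literature reviews, information visualization has been shown to be an effective way to help users carry out multilingual information retrieval. The study proposes a visualization model for multilingual information retrieval systems and its design scheme, and points out future directions for the field.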
4.
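Research on the theory and application of cross-language information retrieval (total citations: 5; self-citations: 0; citations by others: 5)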
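With the globalization of the Internet, cross-language information retrieval is increasingly becoming an important topic in the field of information retrieval; it allows information written in one language to be retrieved with a query posed in another. The article discusses the theory and application of cross-language information retrieval and proposes an approach to applying it in specialized domain databases.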
5.
6.
7.
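Research on the application mechanism of ontologies in cross-language information retrieval (total citations: 3; self-citations: 1; citations by others: 2)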
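Explains the meaning of multilingual ontology, points out the domain knowledge it corresponds to in different languages, and analyzes how multilingual ontologies improve cross-language information retrieval in three respects: query expansion, semantic annotation, and concept-based indexing. By introducing the methods used to align multilingual ontology concepts in EuroWordNet and the Cindor system, it explores the mapping of multilingual ontology bases, the most critical issue in applying ontologies to cross-language information retrieval. It concludes that using an interlingua for concept representation and linking it to the vocabularies of different languages through dictionary translation is a good method for multilingual ontology mapping.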
8.
9.
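Design and implementation of an English-Chinese interactive cross-language retrieval system (total citations: 1; self-citations: 0; citations by others: 1)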
Wu Dan (吴丹) 《现代图书情报技术》 2009, 3(2): 89-95
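To address query translation ambiguity in cross-language information retrieval, an interactive system design approach is adopted to study and analyze cross-language information retrieval techniques based on relevance feedback. An English-Chinese interactive cross-language information retrieval system is proposed that implements relevance feedback functions such as user-assisted query translation, multi-level user relevance judgments, and translation optimization with query expansion; the results show a marked improvement in retrieval effectiveness.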
10.
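China's sharing, development, and use of information in heterogeneous online archival databases is still at a low level of practice, and its quality and efficiency need further improvement. Integrating and retrieving heterogeneous archival information involves many methodological and technical difficulties. The study first examines semantics-based archival information integration, the organization of heterogeneous archival information based on XML EAD, and the application of ontology methods. It then examines platform heterogeneity and semantic heterogeneity in archival information retrieval, the latter including field mapping, data deduplication, and abbreviation unification, and gives a solution for each. It concludes that this improves the quality of sharing and using archival information resources and can promote the formulation and refinement of relevant standards in China.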
11.
Design of a cross-language integrated search engine (total citations: 14; self-citations: 1; citations by others: 13)
Huang Guocai (黄国才) 《现代图书情报技术》 2001, 17(4): 31-33
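Analyzes the characteristics of information distribution on today's Internet and evaluates the relevant existing technologies. On this basis, a cross-language integrated search engine is designed, a system that overcomes language barriers in web search.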
12.
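Analyzes the basic modes of cross-language information retrieval and the key techniques of translation disambiguation, and uses a weighted calculation based on word-pair co-occurrence rate and inter-word distance to disambiguate and optimize query translations. On this basis, a cross-language product information retrieval system is built and applied to book product search; experimental results show that translation quality and retrieval effectiveness are improved.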
13.
14.
Focused web crawling in the acquisition of comparable corpora (total citations: 2; self-citations: 0; citations by others: 2)
Tuomas Talvensaari Ari Pirkola Kalervo Järvelin Martti Juhola Jorma Laurikkala 《Information Retrieval》2008,11(5):427-445
Cross-Language Information Retrieval (CLIR) resources, such as dictionaries and parallel corpora, are scarce for special domains.
Obtaining comparable corpora automatically for such domains could be an answer to this problem. The Web, with its vast volumes
of data, offers a natural source for this. We experimented with focused crawling as a means to acquire comparable corpora
in the genomics domain. The acquired corpora were used to statistically translate domain-specific words. The same words were
also translated using a high-quality, but non-genomics-related parallel corpus, which fared considerably worse. We also evaluated
our system with standard information retrieval (IR) experiments, combining statistical translation using the Web corpora with
dictionary-based translation. The results showed improvement over pure dictionary-based translation. Therefore, mining the
Web for comparable corpora seems promising.
15.
Multilingual information retrieval is generally understood to mean the retrieval of relevant information in multiple target
languages in response to a user query in a single source language. In a multilingual federated search environment, different
information sources contain documents in different languages. A general search strategy in multilingual federated search environments
is to translate the user query to each language of the information sources and run a monolingual search in each information
source. It is then necessary to obtain a single ranked document list by merging the individual ranked lists from the information
sources that are in different languages. This is known as the results merging problem for multilingual information retrieval.
Previous research has shown that the simple approach of normalizing source-specific document scores is not effective. On the
other hand, a more effective merging method was proposed to download and translate all retrieved documents into the source
language and generate the final ranked list by running a monolingual search in the search client. The latter method is more
effective but is associated with a large amount of online communication and computation costs. This paper proposes an effective
and efficient approach for the results merging task of multilingual ranked lists. Particularly, it downloads only a small
number of documents from the individual ranked lists of each user query to calculate comparable document scores by utilizing
both the query-based translation method and the document-based translation method. Then, query-specific and source-specific
transformation models can be trained for individual ranked lists by using the information of these downloaded documents. These
transformation models are used to estimate comparable document scores for all retrieved documents and thus the documents can
be sorted into a final ranked list. This merging approach is efficient as only a subset of the retrieved documents are downloaded
and translated online. Furthermore, an extensive set of experiments on the Cross-Language Evaluation Forum (CLEF) data has demonstrated the effectiveness of the query-specific and source-specific results merging algorithm against other
alternatives. The new research in this paper proposes different variants of the query-specific and source-specific results
merging algorithm with different transformation models. This paper also provides thorough experimental results as well as
detailed analysis. All of the work substantially extends the preliminary research in (Si and Callan, in: Peters (ed.) Results
of the cross-language evaluation forum-CLEF 2005, 2005).
Hao Yuan
16.
Research on cross-language information retrieval (CLIR) has typically been restricted to settings using binary relevance assessments.
In this paper, we present evaluation results for dictionary-based CLIR using graded relevance assessments in a best match
retrieval environment. A text database containing newspaper articles and a related set of 35 search topics were used in the
tests. First, monolingual baseline queries were automatically formed from the topics. Secondly, source language topics (in
English, German, and Swedish) were automatically translated into the target language (Finnish), using structured target queries.
The effectiveness of the translated queries was compared to that of the monolingual queries. Thirdly, pseudo-relevance feedback
was used to expand the original target queries. CLIR performance was evaluated using three relevance thresholds: stringent,
regular, and liberal. When the regular or liberal threshold was used, a reasonable performance was achieved. Using the stringent threshold,
equally high performance could not be achieved. On all the relevance thresholds the performance of the translated queries
was successfully raised by pseudo-relevance feedback based query expansion. However, the performance of the stringent threshold
in relation to the other thresholds could not be raised by this method.
17.
Evgenia Vassilakaki 《International Information and Library Review》2013,45(1-2):3-19
Abstract
This study aims to identify, collect and critically review the research literature on Multilingual Digital Libraries in the English language from 1997 to 2012.
Design/methodology/approach: The present literature review has followed the rules of systematic review. In particular, the identified relevant papers were categorized based on their expressed aim into two core themes, system-centered and user-centered studies. The assigned papers were further analyzed, and six sub-themes emerged for the system-centered studies and four for the user-centered studies. Additional categorization was also provided according to type of publication.
Findings: The literature concerning Multilingual Digital Libraries is vast and mainly focuses on two aspects: the "System" and the "Users". The majority of papers tried to meet the challenges raised in enabling multilingual information retrieval in Digital Libraries. Unfortunately, these efforts were undertaken by a small number of researchers or research groups apparently working in isolation, resulting in the development of numerous different tools and techniques. Relatively few studies have focused on the user and aimed to explore users' behavior and expectations when interacting with Multilingual Digital Libraries. As a result, further research is needed to reach tangible and usable findings.
Originality/value: This literature review captures the diversity of the research conducted regarding multilingual information access and retrieval in Digital Libraries. It organizes the vast literature into comprehensive themes and sub-themes, enabling easy access to specific information.
Limitations: Due to language restrictions, this study reviews only papers in English, published from 1997 to 2012.