20 similar documents found (search time: 218 ms)
1.
2.
3.
4.
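Computers are involved in language translation in two main areas: machine-aided translation and machine translation. Machine-aided translation mostly operates at the lexical or cultural level, whereas machine translation operates at the discourse level and matters more for translation as a whole. In machine translation, a computer supplies a translation through software and networks, helping people overcome language barriers. Language conversion, however, is not only conversion between the vocabularies of different languages; it is also the discourse-level integration of syntax and semantics. Building a successful discourse corpus is one way to address this problem. Advances in artificial intelligence will also improve machine translation.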
5.
6.
7.
8.
9.
10.
11.
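Machine-readable electronic dictionaries are the foundation of all natural language processing systems, and of machine translation systems in particular. Research and practice in machine translation show that without a high-quality dictionary there can be no high-quality translation. Every machine translation system requires substantial labour and investment in its dictionary. This paper therefore puts forward, for discussion, several proposals for building a general-purpose electronic dictionary that supports machine translation.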
12.
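Machine translation involves many technologies, and word processing is one of its core components. This paper introduces several word-processing techniques in three parts: the first proposes an improved maximum-matching word segmentation algorithm; the second discusses an algorithm for handling singular and plural nouns in Chinese-English machine translation; the third presents some methods for handling words that belong to more than one part of speech.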
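The abstract above does not describe its improved algorithm, so the following is only a minimal sketch of the plain forward maximum-matching baseline that such improvements usually start from. The function name `forward_max_match`, the toy `lexicon`, and the single-character fallback for unknown words are all assumptions made for illustration, not details taken from the paper.

```python
def forward_max_match(text, lexicon, max_len=None):
    """Segment text greedily, taking the longest lexicon word at each position."""
    if max_len is None:
        # Longest word in the lexicon bounds the search window (at least 1).
        max_len = max([len(word) for word in lexicon] + [1])
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Assumption: an unmatched single character is emitted as its own token.
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy lexicon, for illustration only.
lexicon = {"机器", "翻译", "机器翻译", "系统", "的", "基础"}
print(forward_max_match("机器翻译系统的基础", lexicon))
```

Because the longest match always wins, "机器翻译" is kept as one token rather than split into "机器" and "翻译"; this greedy behaviour is also the source of the segmentation errors that improved variants try to reduce.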
13.
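Most current machine translation and cross-language retrieval systems are built on general-purpose corpora and perform poorly on foreign-language scientific and technical material. Drawing on methods for processing scientific and technical literature, this paper studies a model of a cross-language information retrieval system oriented toward such literature. It first gives a brief overview of the concept and characteristics of cross-language information retrieval and introduces its research methods from three perspectives. It then discusses why such a system is needed and, on that basis, designs a model of a cross-language information retrieval system for scientific and technical literature, together with its main functional structure.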
14.
Recently, sentiment classification has received considerable attention within the natural language processing research community. However, since most recent work on sentiment classification has been done in English, there are not enough sentiment resources in other languages. Manual construction of reliable sentiment resources is a very difficult and time-consuming task. Cross-lingual sentiment classification aims to utilize annotated sentiment resources in one language (typically English) for sentiment classification of text documents in another language. Most existing research relies on automatic machine translation services to directly project information from one language to another. However, using machine translation alone faces two main problems: the term distributions of original and translated documents differ, and the translations contain errors. To overcome these problems, we propose a novel learning model based on active learning and semi-supervised co-training to incorporate unlabelled data from the target language into the learning process in a bi-view framework. This model attempts to enrich training data by adding the most confident automatically-labelled examples, as well as a few of the most informative manually-labelled examples from unlabelled data in an iterative process. Further, in this model, we consider the density of unlabelled data so as to select more representative unlabelled examples in order to avoid outlier selection in active learning. The proposed model was applied to book review datasets in three different languages. Experiments showed that our model can effectively improve the cross-lingual sentiment classification performance and reduce labelling efforts in comparison with some baseline methods.
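The density idea in the abstract above can be illustrated with a small sketch. The abstract does not give its scoring formula, so this uses one common formulation, in which prediction entropy is multiplied by the mean cosine similarity to the rest of the unlabelled pool. The function names, the feature vectors, and the class probabilities below are hypothetical and serve only as an example.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def density_weighted_pick(features, probs):
    """Return the index of the example maximising entropy * mean similarity to the pool."""
    best_index, best_score = -1, float("-inf")
    for i, (x, p) in enumerate(zip(features, probs)):
        sims = [cosine(x, y) for j, y in enumerate(features) if j != i]
        density = sum(sims) / len(sims) if sims else 0.0
        score = entropy(p) * density
        if score > best_score:
            best_index, best_score = i, score
    return best_index


# Hypothetical pool: the third example is the most uncertain but is an outlier.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
probs = [[0.6, 0.4], [0.55, 0.45], [0.5, 0.5]]
print(density_weighted_pick(features, probs))
```

Plain uncertainty sampling would pick the third example because its prediction is the least certain; weighting by density instead favours the second example, which is nearly as uncertain but lies close to another point in the pool.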
15.
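An ideal machine translation system would be a fully automatic, high-quality, batch-processing system. At the current level of computational linguistics, however, computers cannot yet fully resolve the intricacies of natural language and so cannot achieve fully automatic, high-quality translation. This paper argues that human-computer interaction is the most natural approach: it is easy for users to learn and has broad prospects for development. The paper analyses and summarises in detail the ambiguity problems in Chinese-English machine translation, reviews existing means of resolving ambiguity, proposes interactive Chinese-English machine translation in which ambiguity is resolved through human-computer dialogue, and supports the proposal with arguments from linguistics. A model of the CEMT-Ⅱ Chinese-English machine translation system, designed on this idea, is essentially complete.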
16.
The worldwide use of digital storage and communications devices is increasing the need to make texts available in multiple languages. In this article we explore the possibility of storing a compressed form of a translated version of a text, taking advantage of the availability of the original text. The original text provides some of the semantic content of the text that is to be compressed, and therefore makes it possible for compression to be more efficient than if that information were not available. We begin with an experiment to evaluate the information content of a text when a parallel translation is available. This is achieved by having human subjects guess texts letter by letter, with and without a parallel translation. The perceived information content of a text can be determined from the way subjects make their guesses. The design and results of this experiment are described. The main conclusion is that while the text is considerably more predictable with the aid of a parallel translation, there is a surprising amount of information introduced by the translation. Insights obtained from this experiment are then applied in the design of a mechanical system for compressing parallel texts. The system stores one translation of a text intact, and then compresses further translations of the text with the aid of the original. The method described is able to compress texts significantly better than is possible without the aid of a parallel text. Aspects of the design are also applicable to future compressors that might take advantage of the semantic content of a text to obtain better compression.
17.
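[Purpose/Significance] Research on cross-language information retrieval aims to remove the difficulties that language differences cause in information seeking and to make it more efficient to find specific information in large volumes of complex material. It also gives users a more convenient way to retrieve documents in another language using a language they know well. [Method/Process] Based on an analysis of the state of cross-language information retrieval research in China and abroad, this paper introduces several current query translation methods, namely direct query translation, document translation, interlingua translation, and query-document translation, and compares their effectiveness. It then describes the key technologies of cross-language retrieval and proposes and evaluates solutions to the ambiguity that arises when bilingual dictionaries, corpora, and machine translation are used. [Results/Conclusions] Natural language processing, co-occurrence techniques, relevance feedback, expansion techniques, bidirectional translation, and ontology-based information retrieval are used to ensure the coverage of the knowledge dictionary and the handling of ambiguity. Analysis of cross-language retrieval experiments shows that combining a knowledge dictionary, a corpus, and a search engine can improve query efficiency. [Innovation/Limitations] To address the shortage of words in the dictionaries and corpora used for cross-language information retrieval, this paper proposes using a search engine to obtain information resources from web pages in order to supplement the insufficient sentence pairs in the corpus. The paper focuses mainly on Chinese-English information retrieval, and the solutions need further study: difficulties in Chinese word segmentation and low dictionary coverage, for example, seriously affect retrieval efficiency.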
18.
19.
In this paper, we propose a new learning method for extracting bilingual word pairs from parallel corpora in various languages. In cross-language information retrieval, the system must deal with various languages. Therefore, automatic extraction of bilingual word pairs from parallel corpora in various languages is important. However, previous work based on statistical methods is insufficient because of the sparse data problem. Our learning method automatically acquires rules that are effective in solving the sparse data problem, using only parallel corpora and without any prior preparation of a bilingual resource (e.g., a bilingual dictionary or a machine translation system). We call this learning method Inductive Chain Learning (ICL). Moreover, a system using ICL can extract bilingual word pairs even from bilingual sentence pairs in which the grammatical structures of the source language differ from those of the target language, because the acquired rules carry the information needed to cope with the different word orders of the source and target languages in local parts of bilingual sentence pairs. Evaluation experiments demonstrated that the recall of systems based on several statistical approaches was improved through the use of ICL.
20.
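A Discussion of the Basic Theory of Intelligent Information Processing (total citations: 1; self-citations: 0; citations by others: 1)
This paper proposes combining natural language understanding and computational intelligence as the basic theory of intelligent information processing. Using intelligent information processing practices such as intelligent classification, intelligent indexing, intelligent retrieval, intelligent abstracting, and machine translation, it shows that natural language understanding can provide the theoretical framework while computational intelligence can provide the technical implementation.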