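Map-Reduce Based Adaptive Bilingual Phrase Mining System
Cite this article:LI Bin (李彬), YANG Shi-quan (杨世泉), CHEN Wen-jie (陈文杰). Map-Reduce Based Adaptive Bilingual Phrase Mining System [J]. Journal of Kunming Teachers College (昆明师范高等专科学校学报), 2013(3): 83-87.
Authors:LI Bin (李彬)  YANG Shi-quan (杨世泉)  CHEN Wen-jie (陈文杰)
Affiliations:[1] Automation Workstation, Unit 78300 of the Chinese People's Liberation Army, Kunming, Yunnan 650032 [2] Aliyun Computing Co., Ltd. (阿里云计算有限公司), Hangzhou, Zhejiang 310012
Abstract:Bilingual phrases are an extremely important resource for applications such as cross-language information retrieval and statistical translation. This paper proposes a bilingual phrase mining algorithm based on adaptive patterns: the algorithm automatically learns the translation patterns of the current Web page and then uses the learned patterns to extract the bilingual phrases on that page. The adaptive bilingual phrase mining algorithm is also combined with the Map-Reduce parallel programming model, which greatly improves the system's running efficiency, and experiments verify the effectiveness of the method.

Keywords:adaptive pattern  bilingual phrase  Map-Reduce parallel computing framework  distributed computing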

Map-Reduce Based Adaptive Bilingual Term Mining System
LI Bin,YANG Shi-quan,CHEN Wen-jie.Map-Reduce Based Adaptive Bilingual Term Mining System[J].Journal of Kunming Teachers College,2013(3):83-87.
Authors:LI Bin  YANG Shi-quan  CHEN Wen-jie
Institution:1. Command Automation Station, Unit 78300 of the Chinese People's Liberation Army, Kunming, Yunnan 650032, China; 2. Aliyun Cloud Computing Company, Hangzhou, Zhejiang 310012, China
Abstract:Bilingual terms are a critical resource for cross-language information retrieval, statistical machine translation, and similar applications. In this paper, we present a new bilingual term mining algorithm that adaptively learns translation patterns from a web page and extracts translation pairs with the learned patterns. We also propose a distributed solution that combines the adaptive bilingual term mining algorithm with the Map-Reduce parallel programming model to improve system performance. Several tests were run to show that the mining system is effective and efficient.
Keywords:adaptive pattern  bilingual term  Map-Reduce parallel programming model  distributed computing
This article is indexed in databases including 维普 (VIP).
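
The abstract describes two steps: learning the translation pattern used on each web page, then extracting pairs with that pattern inside a Map-Reduce job. The record gives no implementation details, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a hypothetical seed dictionary (`SEED_PAIRS`), treats a "pattern" as the separator and closing punctuation found around a seed pair, and simulates the map, shuffle, and reduce phases in a single process. All function names are illustrative.

```python
import re
from collections import Counter, defaultdict

# Hypothetical seed dictionary of known translation pairs (an assumption,
# not taken from the paper).
SEED_PAIRS = {"机器翻译": "machine translation"}

ZH = r"[\u4e00-\u9fff]+"          # a run of CJK characters
EN = r"[A-Za-z]+(?: [A-Za-z]+)*"  # English words separated by single spaces


def learn_pattern(page, seeds=SEED_PAIRS):
    """Learn the page's (separator, closer) pattern from seed pairs on it."""
    votes = Counter()
    for zh, en in seeds.items():
        # Up to three non-word characters between the two sides, plus an
        # optional closing punctuation mark right after the English side.
        probe = re.escape(zh) + r"(\W{1,3})" + re.escape(en) + r"([^\w\s]?)"
        for m in re.finditer(probe, page):
            votes[(m.group(1), m.group(2))] += 1
    return votes.most_common(1)[0][0] if votes else None


def map_page(page):
    """Map phase: emit ((zh, en), 1) for each pair matching the learned pattern."""
    pattern = learn_pattern(page)
    if pattern is None:
        return  # no seed pair on this page, so nothing can be learned
    sep, closer = pattern
    # Simplification: phrase boundaries are whatever the two regexes match.
    extractor = re.compile(f"({ZH}){re.escape(sep)}({EN}){re.escape(closer)}")
    for m in extractor.finditer(page):
        yield (m.group(1), m.group(2).lower()), 1


def reduce_pair(pair, counts):
    """Reduce phase: total occurrences of one pair across all pages."""
    return pair, sum(counts)


def run_job(pages):
    """Single-process stand-in for the map, shuffle, and reduce phases."""
    groups = defaultdict(list)
    for page in pages:
        for key, value in map_page(page):
            groups[key].append(value)
    return dict(reduce_pair(k, v) for k, v in groups.items())


if __name__ == "__main__":
    pages = [
        "机器翻译(machine translation)、信息检索(information retrieval)",
        "机器翻译 - machine translation\n语料库 - corpus",
    ]
    for pair, count in run_job(pages).items():
        print(pair, count)
```

Because each page is processed independently in `map_page`, the per-page pattern learning parallelizes naturally across mappers, while the reducer only aggregates pair counts. In a real cluster the `run_job` loop would be replaced by the framework's own shuffle.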