
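An Improved Multi-Pattern Matching Algorithm for Chinese Strings (改进的中文字串多模式匹配算法)
Cite this article: Shen Zhou, Wang Yongcheng, Liu Gongshen. An Improved Multi-Pattern Matching Algorithm for Chinese Strings [J]. 情报学报 (Journal of the China Society for Scientific and Technical Information), 2002, 21(1): 27-32.
Authors: Shen Zhou, Wang Yongcheng, Liu Gongshen
Affiliation: School of Electronics and Information, Shanghai Jiao Tong University, Shanghai 200030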
Funding: Supported by the 863 Program (contract no. 863-306-ZD03-04-1)
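Abstract: To address the problem of Chinese string matching, this paper proposes an improved multi-pattern matching algorithm. The algorithm uses a new combined-state automaton, which avoids the storage blow-up that can occur when a complete per-character hash table is built for a language with a large character set. The algorithm also exploits the large character set of Chinese by bringing the idea of the QS algorithm into multi-pattern matching, with good results. Experiments show that the algorithm clearly outperforms the DFSA algorithm: on average it takes only 70.33% of the time taken by the DFSA algorithm.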

Keywords: matching; string; finite state automaton; multi-pattern matching
Revision date: January 2, 2001

Improved Multiple Pattern Algorithm for Chinese String Matching
Sheng Zhou, Wang Yongcheng and Liu Gongshen. Improved Multiple Pattern Algorithm for Chinese String Matching [J]. Journal of the China Society for Scientific and Technical Information, 2002, 21(1): 27-32.
Authors: Sheng Zhou, Wang Yongcheng and Liu Gongshen
Abstract: For the problem of Chinese string matching, an improved multiple pattern matching algorithm is presented. The prohibitive memory cost of constructing a complete hash table for a large character set is avoided by using a new combinatorial state automaton. In addition, to take full advantage of the fact that Chinese has a large character set, the idea of the QS algorithm is incorporated into multiple pattern matching. Experimental data show that the new algorithm is considerably better than the DFSA algorithm: in the average case, the time spent by the new algorithm is only 70.33% of that spent by the DFSA algorithm.
Keywords: match; string; finite state automata; multiple pattern match
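
The abstract gives no implementation details, so the following is only an illustrative sketch of the general idea it names: applying a QS-style (Quick Search) shift to a set of patterns while keeping the per-character tables sparse. It is not the authors' combined-state automaton. A plain dictionary-based trie stands in for the automaton, and the names `build_trie`, `build_shift` and `search` are hypothetical. The shift table stores only characters that occur in the first m characters of some pattern (m being the shortest pattern length); every other character, which is the common case for a large character set such as Chinese, gets the maximum shift of m + 1.

```python
def build_trie(patterns):
    """Sparse trie: each node is a dict keyed by character (no full table)."""
    root = {}
    for p in patterns:
        node = root
        for ch in p:
            node = node.setdefault(ch, {})
        node[None] = p  # None marks the end of a pattern
    return root


def build_shift(patterns):
    """QS-style shift table over a pattern set (illustrative assumption).

    m is the shortest pattern length. For a character c found at index j
    (j < m) of some pattern, a window may move by at most m - j; the table
    keeps the smallest such value. Characters absent from the table shift
    the window by m + 1.
    """
    m = min(len(p) for p in patterns)
    shift = {}
    for p in patterns:
        for j, ch in enumerate(p[:m]):
            s = m - j
            if s < shift.get(ch, m + 1):
                shift[ch] = s
    return m, shift


def search(text, patterns):
    """Return (position, pattern) for every occurrence of every pattern."""
    if not patterns or any(len(p) == 0 for p in patterns):
        raise ValueError("patterns must be a non-empty list of non-empty strings")
    trie = build_trie(patterns)
    m, shift = build_shift(patterns)
    n = len(text)
    hits = []
    i = 0
    while i <= n - m:
        # Report every pattern that starts at position i.
        node, k = trie, i
        while k < n and text[k] in node:
            node = node[text[k]]
            k += 1
            if None in node:
                hits.append((i, node[None]))
        if i + m >= n:
            break  # no character after the window to look up
        # Shift according to the character just past the window.
        i += shift.get(text[i + m], m + 1)
    return hits


if __name__ == "__main__":
    for pos, pat in search("上海交通大学在上海市", ["上海", "交通大学", "大学"]):
        print(pos, pat)
```

Because both the trie and the shift table are dictionaries, memory grows with the number of distinct characters in the patterns rather than with the size of the character set, which is the kind of storage concern the abstract raises.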
This article is indexed in CNKI, Wanfang Data, and other databases.