Extracting Keywords in Chinese Question Based on Average Information Entropy Model



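Cite this article:	DING Feifei, YANG Sichun, LIU Renjin. Extracting Keywords in Chinese Question Based on Average Information Entropy Model [J]. 六安师专学报, 2014(5): 46-49.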

Authors:	DING Feifei; YANG Sichun; LIU Renjin

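Affiliations:	[1] School of Computer Science and Technology, Anhui University of Technology, Ma'anshan, Anhui 243002; [2] School of Information Engineering, West Anhui University, Lu'an, Anhui 237012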

Funding:	Key Project of Provincial Natural Science Research of Anhui Higher Education Institutions (KJ2011A048).

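Abstract:	Keyword extraction is an important step of question analysis in a question answering system; it helps the system return answers quickly and accurately. To address the shortcomings in accuracy and efficiency of the keyword extraction methods in the existing literature, such as those based on TFIDF, a method for extracting keywords from Chinese questions based on average information entropy is proposed. Domain-specific vocabulary is added, and after stop-word filtering the average information entropy of each word in the question is computed, so that a word's entropy value directly reflects its importance in the question. During extraction, different extraction ratios are set and the evaluation measures are observed at each ratio, so that the best ratio yields more suitable keywords. Experimental results show that, compared with traditional TFIDF and other methods, the precision, recall, and F1 measure of this method are all significantly improved.
Keywords:	automatic question answering; keyword extraction; TFIDF; average information entropy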
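The pipeline summarized in the abstract (filter stop words, score each remaining word, keep a fixed proportion of the top-scoring words) can be sketched roughly as follows. This record does not give the paper's formula for average information entropy, so the sketch substitutes a simpler stand-in score: the self-information -log2 p(w) of a word, with p(w) estimated from a background corpus of tokenized questions using add-one smoothing. The function names `build_scorer` and `extract_keywords`, the `ratio` parameter, and the toy corpus are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def build_scorer(corpus, stopwords):
    """Return a function giving each word a stand-in importance score.

    ASSUMPTION: the score is self-information -log2 p(w), not the paper's
    average information entropy, whose formula is not given in this record.
    """
    counts = Counter(w for question in corpus for w in question
                     if w not in stopwords)
    total = sum(counts.values())
    vocab = len(counts)

    def score(word):
        # Add-one smoothing so unseen words get a finite (high) score.
        p = (counts[word] + 1) / (total + vocab + 1)
        return -math.log2(p)

    return score


def extract_keywords(question, score, stopwords, ratio=0.5):
    """Keep the top `ratio` share of non-stop-words, ranked by score."""
    # dict.fromkeys removes duplicates while preserving word order.
    candidates = [w for w in dict.fromkeys(question) if w not in stopwords]
    if not candidates:
        return []
    k = max(1, math.ceil(len(candidates) * ratio))
    # sorted() is stable, so tied words keep their original order.
    return sorted(candidates, key=score, reverse=True)[:k]


# Toy, already-tokenized corpus (hypothetical data, for illustration only).
corpus = [["what", "is", "entropy"],
          ["what", "is", "tfidf"],
          ["what", "is", "python"]]
stopwords = {"is"}
score = build_scorer(corpus, stopwords)
keywords = extract_keywords(["what", "is", "entropy"], score, stopwords, ratio=0.5)
```

For Chinese questions the input would first need word segmentation, which is outside this sketch. Varying `ratio` and re-running the extraction mirrors the abstract's idea of comparing results at different extraction proportions.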
Extracting Keywords in Chinese Question Based on Average Information Entropy Model

Authors and institutions:	DING Feifei, YANG Sichun, LIU Renjin (1. School of Computer Science and Technology, Anhui University of Technology, Ma'anshan 243002, China; 2. School of Information Engineering, West Anhui University, Lu'an 237012, China)

Abstract:	Keyword extraction is an important foundation of question analysis in question answering systems. To address the shortcomings of existing keyword extraction methods, a method for extracting keywords from Chinese questions based on average information entropy is proposed. By calculating the average information entropy of each word in a question, the importance of that word in the question can be better reflected. The experimental results show that, compared with the traditional TFIDF method, the precision, recall, and F1 measure of this method are significantly improved.

Keywords:	question answering; keyword extraction; TFIDF; average information entropy
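The three evaluation measures named in the abstract have standard set-based definitions, sketched below for a single question. The helper name `precision_recall_f1` and the example word lists are hypothetical. How the paper aggregates scores across questions is not stated in this record, so no averaging is shown.

```python
def precision_recall_f1(extracted, reference):
    """Compare extracted keywords with a reference (gold) keyword set."""
    extracted, reference = set(extracted), set(reference)
    hits = len(extracted & reference)  # keywords that are both extracted and correct
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(reference) if reference else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: one of two extracted words is in the reference set.
p, r, f1 = precision_recall_f1(["a", "b"], ["b", "c"])
```

Here precision is the share of extracted keywords that are correct, and recall is the share of reference keywords that were found.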
This article is indexed in databases including VIP (维普).
