
Entity-Attribute Extraction with the GRU+CRF Method
Cite this article: Wang Renwu, Meng Xianru, Kong Qi. Entity-Attribute Extraction with GRU+CRF Method [J]. 现代情报 (Journal of Modern Information), 2018, 38(10): 57-64.
Authors: Wang Renwu, Meng Xianru, Kong Qi
Institution: Department of Information Management, Faculty of Economics and Management, East China Normal University, Shanghai 200241, China
Funding: National Social Science Fund of China project "Research on a Data-Driven Library Resource Discovery System Platform" (Grant No. 16BTQ026).
Abstract: [Purpose/Significance] This study uses a GRU recurrent neural network combined with a conditional random field (CRF) to predict labels over annotated Chinese text sequences, and thereby extract entities and their attributes from online review text. [Method/Process] First, following a predefined sequence annotation specification, the review corpus is segmented into words and its entities and attributes are annotated as named entities, yielding a word sequence, a part-of-speech sequence, and a label sequence. The word and part-of-speech sequences are then converted into distributed vector representations and used as input to the GRU network. Finally, the output layer is a CRF, whose output labels mark each token as an entity or an attribute. [Result/Conclusion] The experimental results indicate that the method reduces entity-attribute extraction to named entity annotation, uses the GRU to capture the contextual semantics of the input and the CRF to capture the dependencies between neighboring output labels, and offers a considerable practical advantage over traditional rule-based or general machine learning methods.

Keywords: entity-attribute extraction; GRU; recurrent neural network (RNN); conditional random field (CRF); named entity recognition (NER)
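
To make the role of the CRF output layer concrete, the following minimal Python sketch decodes a label sequence with the Viterbi algorithm and then groups the labels into entity and attribute spans. Everything in it is an assumption for illustration: the BIO tag set `TAGS`, the hand-written `emissions` (standing in for GRU outputs), the rule-based `build_transitions` (standing in for learned CRF parameters), and the English tokens (standing in for segmented Chinese words). It is not the authors' implementation.

```python
# Illustrative sketch only. The tag set, scores, and helper names below are
# assumptions made for this example; none of them come from the paper.
TAGS = ["O", "B-ENT", "I-ENT", "B-ATT", "I-ATT"]  # hypothetical BIO scheme


def build_transitions(tags, penalty=-1e4):
    """Zero score everywhere, except a penalty on invalid moves into I-X.

    A trained CRF would learn these scores; here they are set by rule.
    """
    matrix = [[0.0] * len(tags) for _ in tags]
    for i, prev in enumerate(tags):
        for j, nxt in enumerate(tags):
            # I-X may only follow B-X or I-X of the same kind.
            if nxt.startswith("I-") and prev[2:] != nxt[2:]:
                matrix[i][j] = penalty
    return matrix


def viterbi_decode(emissions, transitions):
    """Return the highest-scoring path of tag indices.

    emissions[t][j]: score of tag j at token t (the recurrent layer's output).
    transitions[i][j]: score of moving from tag i to tag j.
    """
    n_tags = len(transitions)
    best = list(emissions[0])  # best path score ending in each tag so far
    backptrs = []
    for scores in emissions[1:]:
        new_best, ptrs = [], []
        for j in range(n_tags):
            prev = max(range(n_tags), key=lambda i: best[i] + transitions[i][j])
            ptrs.append(prev)
            new_best.append(best[prev] + transitions[prev][j] + scores[j])
        best = new_best
        backptrs.append(ptrs)
    path = [max(range(n_tags), key=lambda j: best[j])]
    for ptrs in reversed(backptrs):
        path.append(ptrs[path[-1]])
    path.reverse()
    return path


def extract_spans(tokens, tags):
    """Group BIO tags into (kind, text) spans."""
    spans, kind_open, buf = [], None, []
    for token, tag in zip(tokens, tags):
        prefix, _, kind = tag.partition("-")
        if prefix == "I" and kind == kind_open:
            buf.append(token)
            continue
        if kind_open is not None:
            # Joined with spaces for English tokens; Chinese would use "".
            spans.append((kind_open, " ".join(buf)))
        kind_open, buf = (kind, [token]) if prefix == "B" else (None, [])
    if kind_open is not None:
        spans.append((kind_open, " ".join(buf)))
    return spans


tokens = ["phone", "screen", "resolution", "is"]
emissions = [  # columns follow TAGS: O, B-ENT, I-ENT, B-ATT, I-ATT
    [0.1, 2.0, 0.0, 0.3, 0.0],
    [0.2, 0.1, 0.3, 0.8, 1.0],  # I-ATT scores highest but is invalid here
    [0.1, 0.0, 0.2, 0.4, 1.5],
    [2.0, 0.0, 0.0, 0.1, 0.0],
]
path = viterbi_decode(emissions, build_transitions(TAGS))
labels = [TAGS[i] for i in path]
print(labels)  # ['B-ENT', 'B-ATT', 'I-ATT', 'O']
print(extract_spans(tokens, labels))  # [('ENT', 'phone'), ('ATT', 'screen resolution')]
```

In this toy input, picking the best tag per token independently would label "screen" as `I-ATT` directly after `B-ENT`, which is not a valid sequence. Decoding over the whole sequence with transition scores selects `B-ATT` instead, which illustrates the dependency between neighboring output labels that the abstract attributes to the CRF layer.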