
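Intrusion detection method based on entity embedding and long short-term memory networks
Authors: LAI Xunfei, LIANG Xuwen, XIE Zhuochen, LI Zongwang
Institution: 1. Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China; 2. School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China; 3. Shanghai Engineering Center for Microsatellites, Chinese Academy of Sciences, Shanghai 201203, China; 4. University of Chinese Academy of Sciences, Beijing 100049, China
Funding: Supported by the National Natural Science Foundation of China (91738201) and the Shanghai Sailing Program (17YF1418200)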
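Abstract: Network intrusion detection often fails to represent the categorical variables in intrusion data effectively, which leads to low detection accuracy and a high false negative rate. To address this, a network intrusion detection method that combines entity embedding with a long short-term memory (LSTM) network is proposed. First, during data preprocessing, the numerical variables and the categorical variables among the network features are separated. The entity embedding method maps each categorical variable into a Euclidean space to obtain a vector representation, and this vector is appended after the numerical data to form the input. The input is then fed into the LSTM network for training, and the parameters are updated by backpropagation through time, which yields the optimal embedding vectors as input features together with a relatively optimal LSTM detection model. Experiments on the NSL-KDD data set show that entity embedding is an effective way to handle categorical variables in network intrusion data, and that the model combining it with an LSTM network effectively improves the intrusion detection rate. For the handling of categorical variables during preprocessing, entity embedding raises the detection accuracy by 1.44 percentage points and lowers the false negative rate by 2.99 percentage points compared with the traditional One-Hot encoding method.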
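
The preprocessing step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the field vocabularies are truncated, the embedding size `EMBED_DIM` and the helper names `init_embeddings`, `one_hot` and `build_input` are assumptions made for the example, and the lookup tables are fixed random values here, whereas in the described method they would be trainable parameters updated together with the LSTM weights.

```python
import random

# Hypothetical, truncated vocabularies. NSL-KDD has three categorical
# fields (protocol_type, service, flag); only a few values are listed here.
VOCABS = {
    "protocol_type": ["tcp", "udp", "icmp"],
    "service": ["http", "ftp", "smtp", "other"],
    "flag": ["SF", "S0", "REJ"],
}
EMBED_DIM = 2  # assumed embedding size, not taken from the paper


def init_embeddings(vocabs, dim, seed=0):
    """Create one small random lookup table per categorical field."""
    rng = random.Random(seed)
    return {
        field: {v: [rng.uniform(-0.05, 0.05) for _ in range(dim)] for v in values}
        for field, values in vocabs.items()
    }


def one_hot(field, value, vocabs=VOCABS):
    """Baseline encoding: an indicator vector as long as the vocabulary."""
    return [1.0 if v == value else 0.0 for v in vocabs[field]]


def build_input(numeric, categorical, tables):
    """Append each field's embedding vector after the numeric features."""
    row = list(numeric)
    for field in VOCABS:  # fixed field order keeps the layout stable
        row.extend(tables[field][categorical[field]])
    return row


tables = init_embeddings(VOCABS, EMBED_DIM)
record_numeric = [0.0, 181.0, 5450.0]  # made-up numeric features
record_categorical = {"protocol_type": "tcp", "service": "http", "flag": "SF"}

x = build_input(record_numeric, record_categorical, tables)
print(len(x))  # 3 numeric values + 3 fields * 2 dimensions = 9
```

With One-Hot encoding the same record would need one column per vocabulary entry (10 extra columns for these truncated vocabularies), while the embedded form uses a fixed, dense `EMBED_DIM` columns per field regardless of vocabulary size.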

Keywords: entity embedding; long short-term memory network; intrusion detection; categorical variables
Received: 2019-01-25
Revised: 2019-04-03

Intrusion detection method based on entity embedding and long short-term memory networks
Authors: LAI Xunfei, LIANG Xuwen, XIE Zhuochen, LI Zongwang
Institution: 1. Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China; 2. School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China; 3. Shanghai Engineering Center for Microsatellites, Chinese Academy of Sciences, Shanghai 201203, China; 4. University of Chinese Academy of Sciences, Beijing 100049, China
Abstract: Because categorical variables in intrusion data are not represented effectively, network intrusion detection suffers from low accuracy and a high false negative rate. A method combining entity embedding with a long short-term memory (LSTM) network is proposed. First, when the data are preprocessed, the numerical variables and the categorical variables are separated. The categorical variables are mapped into a Euclidean space by the entity embedding method to obtain a vector representation, and this vector is appended to the numerical data to form the input data. The input data are then fed into the LSTM network, and the parameters are updated by backpropagation through time. Training thus yields the optimal embedding vectors as input features together with a relatively optimal LSTM detection model. Experiments carried out on the NSL-KDD data set show that entity embedding is an effective method for handling categorical variables in network intrusion data, and that the model combining it with an LSTM network effectively improves the detection rate. In the processing of categorical variables, detection accuracy with the entity embedding method increases by 1.44 percentage points and the false negative rate decreases by 2.99 percentage points compared with the traditional One-Hot encoding method.
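
The two reported metrics can be illustrated with a short sketch. It assumes the common definitions (accuracy as the share of correctly classified records, false negative rate as missed attacks divided by all actual attacks); the paper's exact definitions may differ. The labels below are made-up toy data, not results from the paper, and the function names are hypothetical.

```python
def detection_metrics(y_true, y_pred):
    """Return (accuracy, false_negative_rate); label 1 = attack, 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Share of real attacks the detector missed.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return accuracy, fnr


def percentage_points(new, old):
    """Difference between two rates, expressed in percentage points."""
    return (new - old) * 100


# Toy labels only, chosen so the arithmetic is easy to follow.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
pred_baseline = [1, 1, 0, 0, 0, 0, 0, 1]  # stands in for one encoding
pred_improved = [1, 1, 1, 0, 0, 0, 0, 0]  # stands in for another encoding

acc_a, fnr_a = detection_metrics(y_true, pred_baseline)
acc_b, fnr_b = detection_metrics(y_true, pred_improved)
print(percentage_points(acc_b, acc_a), percentage_points(fnr_b, fnr_a))
```

A change "by 1.44 percentage points" is an absolute difference between two rates, which is what `percentage_points` computes, rather than a relative change of 1.44%.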
Keywords: entity embedding; LSTM; intrusion detection; categorical variables