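Identification method for imbalanced network traffic
Cite this article: YAN Binghao, HAN Guodong, HUANG Yajing, WANG Xiaolong. Identification method for imbalanced network traffic [J]. Journal of Computer Applications (计算机应用), 2018, 38(1): 20-25.
Authors: YAN Binghao  HAN Guodong  HUANG Yajing  WANG Xiaolong
Affiliation: National Digital Switching System Engineering & Technological Research Center, Zhengzhou 450002
Funding: National Science and Technology Major Project (2016ZX01012101); General Program of the National Natural Science Foundation of China (61572520); Innovative Research Groups Program of the National Natural Science Foundation of China (61521003).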
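Abstract: To address the traffic imbalance caused by the flood of Peer-to-Peer (P2P) traffic in networks, the idea of imbalanced data classification is applied to the traffic identification process. By introducing and improving the Synthetic Minority Over-sampling Technique (SMOTE) algorithm, a Mean SMOTE (M-SMOTE) algorithm is proposed to balance the traffic data. On this basis, three machine learning classifiers, Random Forest (RF), Support Vector Machine (SVM), and Back Propagation Neural Network (BPNN), are each used to identify the various classes of traffic after processing. Theoretical analysis and simulation results show that, without affecting the identification accuracy of P2P traffic, and compared with the imbalanced state, introducing the SMOTE algorithm raises the identification accuracy of non-P2P traffic by 16.5 percentage points on average and the overall identification rate of network traffic by 9.5 percentage points. Compared with the SMOTE algorithm, the M-SMOTE algorithm further raises the identification accuracy of non-P2P traffic and the overall identification rate of network traffic by 3.2 and 2.6 percentage points respectively. The experimental results show that imbalanced data classification effectively solves the low identification rate of non-P2P traffic caused by excessive P2P traffic, and that the proposed M-SMOTE algorithm achieves higher identification accuracy.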

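Keywords: imbalanced data  P2P traffic  traffic identification  machine learning  Synthetic Minority Over-sampling Technique (SMOTE) algorithm
Received: 2017-07-24
Revised: 2017-08-01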

New traffic classification method for imbalanced network data
YAN Binghao, HAN Guodong, HUANG Yajing, WANG Xiaolong. New traffic classification method for imbalanced network data [J]. Journal of Computer Applications, 2018, 38(1): 20-25.
Authors:YAN Binghao  HAN Guodong  HUANG Yajing  WANG Xiaolong
Affiliation: National Digital Switching System Engineering & Technological Research Center, Zhengzhou, Henan 450002, China
Abstract: In traffic classification, Peer-to-Peer (P2P) traffic far outweighs non-P2P traffic. To address this imbalance, a new traffic classification method for imbalanced network data is presented. By introducing and improving the Synthetic Minority Over-sampling Technique (SMOTE) algorithm, a Mean SMOTE (M-SMOTE) algorithm is proposed to balance the traffic data. On this basis, three machine learning classifiers, Random Forest (RF), Support Vector Machine (SVM), and Back Propagation Neural Network (BPNN), are used to identify the various types of traffic. Theoretical analysis and simulation results show that, without affecting the recognition accuracy of P2P traffic and compared with the imbalanced state, the SMOTE algorithm improves the recognition accuracy of non-P2P traffic by an average of 16.5 percentage points and raises the overall recognition rate of network traffic by 9.5 percentage points. Compared with the SMOTE algorithm, the M-SMOTE algorithm further improves the recognition accuracy of non-P2P traffic and the overall recognition rate of network traffic by 3.2 and 2.6 percentage points respectively. The experimental results show that imbalanced data classification can effectively solve the problem of the low non-P2P traffic recognition rate caused by excessive P2P traffic, and that the M-SMOTE algorithm achieves higher recognition accuracy than SMOTE.
Keywords: imbalanced data  Peer-to-Peer (P2P) traffic  traffic classification  machine learning  Synthetic Minority Over-sampling Technique (SMOTE) algorithm
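
The balancing step named in the abstract can be illustrated with a minimal sketch of classic SMOTE, which creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority-class neighbours. This is not the paper's M-SMOTE variant, which the abstract does not define; the function name `smote`, its parameters, and the two toy flow features are assumptions made purely for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Classic SMOTE sketch: return n_new synthetic points, each placed on the
    segment between a random minority sample and one of its k nearest
    minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: each row is [feature_1, feature_2] of a non-P2P flow.
non_p2p = [[0.10, 0.20], [0.15, 0.25], [0.20, 0.10], [0.12, 0.18]]
p2p_count = 12  # assumed size of the majority (P2P) class

# Oversample the minority class until both classes have the same size.
new_points = smote(non_p2p, n_new=p2p_count - len(non_p2p), k=3)
balanced_minority = non_p2p + new_points
```

Because every synthetic point lies between two real minority samples, the oversampled class stays inside the region the original non-P2P flows occupy. The balanced set would then be passed to a classifier such as RF, SVM, or BPNN.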