Abstract:

Many deep learning detection models now achieve good results on a range of metrics. However, security managers do not understand the basis for these models' decisions. As a result, they cannot trust a model's verdicts, and they cannot effectively diagnose and trace its errors, which greatly limits the practical application of deep learning models in this field. To address this problem, this paper proposes an interpretable anomaly traffic detection model based on a sparse autoencoder (Sparse Autoencoder Based Anomaly Traffic Detection, SAE-ATD). The model uses a sparse autoencoder to learn the features of normal traffic and, on this basis, introduces threshold iteration to select the best threshold and improve the detection rate. After prediction, the samples predicted as anomalous are fed into an explainer. The explainer iteratively updates a reference value and returns, for each feature, the difference between the reference value and the anomalous value; this difference is then analyzed together with the original data to interpret the result. Experiments on the CICIDS2017 and CIRA-CIC-DoHBrw-2020 datasets show that SAE-ATD reaches 99% precision and recall on most attack types in both datasets, while also providing interpretability for the model.
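The abstract mentions iterating over thresholds to pick the best one but does not state the selection criterion. The following is a minimal sketch of one plausible reading, assuming each sample already has a reconstruction error from the autoencoder and that a labeled validation set is available. The function names, the F1 criterion, and the evenly spaced candidate grid are assumptions made for illustration, not the paper's stated method.

```python
from typing import Sequence, Tuple


def f1_at_threshold(errors: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """F1 score when every sample with error > threshold is flagged as anomalous.

    labels: 1 = attack, 0 = normal (assumed encoding).
    """
    tp = sum(1 for e, y in zip(errors, labels) if e > threshold and y == 1)
    fp = sum(1 for e, y in zip(errors, labels) if e > threshold and y == 0)
    fn = sum(1 for e, y in zip(errors, labels) if e <= threshold and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_threshold(errors: Sequence[float], labels: Sequence[int], steps: int = 100) -> Tuple[float, float]:
    """Sweep evenly spaced candidate thresholds and keep the one with the best F1.

    Hypothetical helper: the selection criterion is an assumption.
    """
    lo, hi = min(errors), max(errors)
    best_t, best_f1 = lo, -1.0
    for i in range(steps + 1):
        t = lo + (hi - lo) * i / steps
        score = f1_at_threshold(errors, labels, t)
        if score > best_f1:  # strict '>' keeps the lowest threshold among ties
            best_t, best_f1 = t, score
    return best_t, best_f1


if __name__ == "__main__":
    # Toy reconstruction errors: normal traffic reconstructs well, attacks do not.
    errors = [0.10, 0.20, 0.15, 0.90, 1.10, 0.95]
    labels = [0, 0, 0, 1, 1, 1]
    print(select_threshold(errors, labels))
```

Another criterion, such as recall at a fixed false-positive rate, could replace F1 without changing the loop.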

Key words: anomaly traffic detection, autoencoder, deep learning, explainability

Feature ranking by information gain ratio

No. Feature name Information gain ratio
1 min_seg_size_forward 41.9%
2 Init_Win_bytes_backward 41.2%
3 Init_Win_bytes_forward 41.1%
4 Bwd Packet Length Min 40.4%
5 Total Length of Bwd Packets 40.4%
6 Subflow Bwd Bytes 39.9%
7 Bwd Header Length 39.5%
8 Fwd Header Length 39.2%
9 Fwd Header Length.1 38.2%
10 Fwd PSH Flags 35.6%
11 SYN Flag Count 35.6%
12 Max Packet Length 34.8%
13 Bwd Packet Length Mean 34.6%
14 Avg Bwd Segment Size 34.4%
15 Bwd Packet Length Max 33.8%
16 FIN Flag Count 33.5%
17 Total Backward Packets 32.0%
18 Subflow Bwd Packets 32.0%
19 ACK Flag Count 31.7%
20 Destination Port 29.9%
21 Total Fwd Packets 29.7%
22 Subflow Fwd Packets 29.1%
23 act_data_pkt_fwd 25.4%
24 Min Packet Length 25.4%
25 Fwd Packet Length Min 25.2%
26 Fwd Packet Length Max 25.1%
27 Total Length of Fwd Packets 24.7%
28 Subflow Fwd Bytes 23.4%
29 PSH Flag Count 23.3%
30 Down/Up Ratio 23.1%
31 Bwd Packet Length Std 21.9%
32 Average Packet Size 21.8%
33 Packet Length Mean 21.5%
34 Packet Length Std 21.4%
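For reference, the information gain ratio used for a ranking like this is the information gain of a feature divided by its split information (the entropy of the feature's own values). The sketch below computes it for a discrete feature. How the continuous flow features were discretized is not described in this section, so the binning step is left out, and the function names are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy (base 2) of the empirical distribution of values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())


def gain_ratio(feature: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Information gain ratio of a discrete feature with respect to class labels.

    gain ratio = (H(labels) - H(labels | feature)) / H(feature)
    Continuous features are assumed to have been binned beforehand.
    """
    total = len(labels)
    groups = {}
    for v, y in zip(feature, labels):
        groups.setdefault(v, []).append(y)
    # Conditional entropy: label entropy within each feature value, weighted by group size.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    # A constant feature has zero split information and carries no signal.
    return info_gain / split_info if split_info > 0 else 0.0


if __name__ == "__main__":
    labels = ["benign", "benign", "attack", "attack"]
    print(gain_ratio(["low", "low", "high", "high"], labels))   # feature separates the classes
    print(gain_ratio(["low", "high", "low", "high"], labels))   # feature is uninformative
```

Ranking features then amounts to computing `gain_ratio` for each column and sorting in descending order.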