The Homology Determination System for APT Samples Based on Gene Maps
DOI:
https://doi.org/10.13052/jcsm2245-1439.1348Keywords:
APT sample, Gene map, Homology determinationAbstract
At present, there are fewer types of homology determination methods for advanced persistent threat (APT) samples detection, and most existing determination schemes have problems such as high cost, low accuracy, and difficulty in identifying unknown APT samples. Therefore, we proposed a homology determination system for APT samples based on gene maps by integrating deep learning and gene maps. Firstly, we extract the software gene features from the samples uploaded by the user and apply the TF-IDF algorithm to clean the extracted software genes. The Word2Vec algorithm is used to vectorize all the genes to construct the gene sample vectors. And we use a LSTM-based classifier to detect APT attack samples. Finally, the K-nearest neighbor algorithm is used to determine the homology of gene-sharing APT samples. The detailed construction process of the scheme is given in this paper, including APT sample gene extraction, cleaning, clustering, sample detection, and homology determination. Experimental validation showcases our model outperforming existing methodologies with an accuracy of 95%, precision of 94%, and recall of 95%. When compared to previous models, the superiority of our approach is evident. These results underscore our model’s high efficiency and accuracy, confirming its potential for significant application in the field of cybersecurity.
Downloads
References
Chen Ruidong, Zhang Xiaosong, Niu Weina, Lan Haoyue. “A Research on Architecture of APT Attack Detection and Countering Technology.” Journal of University of Electronic Science and Technology of China, 2019, 48(6): 870–879.
Xiong C L, Zhu T T, Dong W H, et al. “Conan: A Practical Real-Time APT Detection System With High Accuracy and Efficiency.” IEEE Transactions on Dependable and Secure Computing, 2022, 19(1): 551–565.
Panahnejad M, Mirabi M. “APT-Dt-KC: advanced persistent threat detection based on kill-chain model,” Journal of Supercomputing, 2022, 78(6): 8644–8677.
QiAnXin Threat Intelligence Center. “Global Advanced Persistent Threat Report.” https://ti.qianxin.com/uploads/2022/03/31/4217b248a07f1f1c42b4bba4168efb4e.pdf.
Cen L, Gates C S, Si L, et al. “A Probabilistic Discriminative Model for Malware Detection with Decompiled Source Code.” IEEE Transactions on Dependable & Secure Computing, 2015, 12(4): 400–412.
Kwon B J, Mondal J, Jang J, et al. “The Dropper Effect: Insights into Malware Distribution with Downloader Graph Analytics,” the 22nd ACM SIGSAC Conference on Computer and Communications Security, ACM, 2015:1118–1129.
Garbervetsky D, Zoppi E, Livshits B. “Toward full elasticity in distributed static analysis: the case of callgraph analysis,” 11th Joint Meeting of European Software Engineering Conference (ESEC)/ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE), ACM, 2015: 442–453.
Rosenberg I, Sicard G, David E O. “DeepAPT: Nation-State APT Attribution Using End-to-End Deep Neural Networks,” 26th International Conference on Artificial Neural Networks, Springer, 2017: 91–99.
Pascanu R, Stokes J W, Sanossian H, et al. “Malware Classification with Recurrent Networks,” 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, 2015: 1916–1920.
Miller B, Kantchelian A, Tschantz M C, et al. “Reviewer Integration and Performance Measurement for Malware Detection,” 13th International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment, 2016: 122–141.
Huang Y Z, Qiao M, Liu F D, et al. “Binary code traceability of multigranularity information fusion from the perspective of software genes,” Computers & Security, 2022, 114.
Zhao B L, Zhang S, Liu F D, et al. “Malware homology identification based on a gene perspective,” Frontiers of Information Technology & Electronic Engineering, 2019, 20(6): 801–815.
Zhao X L, Zhang Y M, L X H, et al. “Research on malicious code homology analysis method based on texture fingerprint clustering,” 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/12th IEEE International Conference On Big Data Science And Engineering, IEEE, 2018: 1914–1921.
Sun G, Qian Q. Deep learning and visualization for identifying malware families. IEEE Transactions on Dependable and Secure Computing, 2018.
Zhang Z, Qi P, Wang W. Dynamic malware analysis with feature engineering and feature learning. In: Proceedings of the AAAI conference on artificial intelligence, vol. 34, (01), 2020, p. 1210–7.
Li C, Zheng J. API call-based malware classification using recurrent neural networks. Journal of Cyber Security and Mobility, 2021;617–40.
Chaganti R, Ravi V, Pham TD. Deep learning based cross architecture internet of things malware detection and classification. Computers & Security, 2022;102779.
Chaganti R, Ravi V, Pham T D. A multi-view feature fusion approach for effective malware classification using Deep Learning. Journal of information security and applications, 2023.
Do Xuan, C., Dao, M.H. A novel approach for APT attack detection based on combined deep learning model. Neural Comput & Applic 33, 13251–13264 (2021). https://doi.org/10.1007/s00521-021-05952-5.
Published
How to Cite
Issue
Section
License
Copyright (c) 2024 Journal of Cyber Security and Mobility
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.