On the Use of Machine Learning for Identifying Botnet Network Traffic
DOI:
https://doi.org/10.13052/2245-1439.421
Keywords:
Botnet detection, State of the art, Comparative analysis, Traffic analysis, Machine learning
Abstract
During the last decade, significant scientific effort has been invested in developing methods that could provide efficient and effective botnet detection. As a result, an array of detection methods based on diverse technical principles and targeting various aspects of the botnet phenomenon has been defined. Because botnets rely on the Internet both for communicating with the attacker and for implementing different attack campaigns, network traffic analysis is one of the main means of identifying their existence. In addition to relying on traffic analysis, many contemporary approaches use machine learning techniques to identify malicious traffic. This paper presents a survey of contemporary botnet detection methods that rely on machine learning for identifying botnet network traffic. The paper provides a comprehensive overview of the existing scientific work, thus contributing to a better understanding of the capabilities, limitations and opportunities of using machine learning for identifying botnet traffic. Furthermore, the paper outlines possibilities for the future development of machine learning-based botnet detection systems.