A Survey on User Profiling Model for Anomaly Detection in Cyberspace

Arash Habibi Lashkari; Min Chen; Ali A. Ghorbani

doi:10.13052/2245-1439.814

Authors

Arash Habibi Lashkari, Canadian Institute for Cybersecurity (CIC), University of New Brunswick (UNB), Fredericton, Canada
Min Chen, Canadian Institute for Cybersecurity (CIC), University of New Brunswick (UNB), Fredericton, Canada
Ali A. Ghorbani, Canadian Institute for Cybersecurity (CIC), University of New Brunswick (UNB), Fredericton, Canada

DOI:

https://doi.org/10.13052/2245-1439.814

Keywords:

User Profiling, Cybersecurity Profiling, Big Security Data, Security Data Source, Security Profiling Features, Anomaly Detection, Cybersecurity Forewarning System

Abstract

In the face of escalating global cybersecurity threats, an automated forewarning system that can find suspicious user profiles is paramount. Such a system can work as a prevention technique against planned attacks or ultimate security breaches. Significant research has been conducted on attack prevention and detection, but existing work has drawn on only one or a few data sources with a short list of features. This paper has three main goals: first, to review previous user profiling models and analyze their advantages and disadvantages; second, to provide a comprehensive overview of previous research in order to gather the available features and data sources for user profiling; and third, based on the deficiencies of the previous models, to propose a new user profiling model that covers all available sources and related features from a cybersecurity perspective. The proposed model includes seven profiling criteria for gathering a user’s information and more than 270 features to parse and generate the security profile of a user.

Author Biographies

Arash Habibi Lashkari, Canadian Institute for Cybersecurity (CIC), University of New Brunswick (UNB) Fredericton, Canada

Arash Habibi Lashkari is an assistant professor at the Faculty of Computer Science, University of New Brunswick (UNB), and research manager of the Canadian Institute for Cybersecurity (CIC). He has more than 22 years of academic and industry experience developing technology that detects and protects against cyberattacks, malware and the dark web. Dr. Lashkari has been awarded 3 gold medals as well as 12 silver and bronze medals in international computer security competitions around the world. In 2017, he was selected as one of the top 150 researchers who will shape the future of Canada. He also won the runner-up Cybersecurity Academic Award of the Year at the ICSIC conference in Canada. He is the author of 10 books in English and Persian on topics including cryptography, network security, and mobile communication, as well as over 80 journal and conference papers concerning various aspects of computer security. His research focuses on cybersecurity, big data security analysis, Internet traffic analysis, and the detection of malware and cyber-attacks, as well as generating cybersecurity datasets.

Min Chen, Canadian Institute for Cybersecurity (CIC), University of New Brunswick (UNB) Fredericton, Canada

Min Chen is a postdoctoral fellow at the Canadian Institute for Cybersecurity (CIC) in the Faculty of Computer Science, University of New Brunswick. She has extensive academic experience in the areas of machine learning, service computing and cybersecurity, and has several conference and journal publications in machine learning and service computing. Currently, she is interested in studying user profiling from the perspective of cybersecurity using machine learning technology. Her research focuses on modeling user behavior as a prevention technique for planned attacks.

Ali A. Ghorbani, Canadian Institute for Cybersecurity (CIC), University of New Brunswick (UNB) Fredericton, Canada

Ali A. Ghorbani has held a variety of positions in academia for the past 35 years and is currently the Canada Research Chair (Tier 1) in Cybersecurity, the Dean of the Faculty of Computer Science (since 2008), and the Director of the Canadian Institute for Cybersecurity. He is a co-inventor on 3 awarded patents in the area of Network Security and Web Intelligence and has published over 200 peer-reviewed articles during his career. He has supervised over 160 research associates, postdoctoral fellows, and graduate and undergraduate students. His book, Intrusion Detection and Prevention Systems: Concepts and Techniques, was published by Springer in October 2010. In 2007, Dr. Ghorbani received the University of New Brunswick’s Research Scholar Award. Since 2010, he has obtained more than $10M to fund 6 large multi-project research initiatives. Dr. Ghorbani has developed a number of technologies that have been adopted by high-tech companies. He co-founded two startups, Sentrant and EyesOver, in 2013 and 2015, respectively. Dr. Ghorbani is the co-Editor-in-Chief of Computational Intelligence Journal. He was one of the three finalists for the Special Recognition Award at the New Brunswick KIRA Awards for the knowledge industry in both 2013 and 2016.
