A Survey on User Profiling Model for Anomaly Detection in Cyberspace
DOI:
https://doi.org/10.13052/2245-1439.814
Keywords:
User Profiling, Cybersecurity Profiling, Big Security Data, Security Data Source, Security Profiling Features, Anomaly Detection, Cybersecurity Forewarning System
Abstract
In the face of escalating global cybersecurity threats, an automated forewarning system that can identify suspicious user profiles is paramount. Such a system can help prevent planned attacks and, ultimately, security breaches. A significant body of research addresses attack prevention and detection, but existing work draws on only one or a few data sources and a short list of features. This paper has three main goals: first, to review previous user profiling models and analyze their advantages and disadvantages; second, to provide a comprehensive overview of previous research in order to gather the available features and data sources for user profiling; and third, based on the deficiencies of the previous models, to propose a new user profiling model that covers all available sources and related features from a cybersecurity perspective. The proposed model includes seven profiling criteria for gathering a user’s information and more than 270 features for parsing and generating the user’s security profile.