Generation of Realistic Navigation Paths for Web Site Testing Using RNN and GAN

Authors

  • Silvio Pavanetto, Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Piazza Leonardo da Vinci 32, Milan, 20133, Italy. ORCID: https://orcid.org/0000-0001-7301-2801
  • Marco Brambilla, Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Piazza Leonardo da Vinci 32, Milan, 20133, Italy. ORCID: https://orcid.org/0000-0002-8753-2434

DOI:

https://doi.org/10.13052/jwe1540-9589.20816

Keywords:

Web Engineering, Data Mining, Deep Learning, Recurrent Neural Networks, Generative Adversarial Networks, Testing

Abstract

For applications that have not yet been launched, a reliable way of generating online navigation logs may be crucial, because it lets developers test their products as though real users were using them. This might lead to faster and lower-cost testing and improvement, especially in terms of usability and interaction. In this work we propose a method that uses deep learning approaches such as recurrent neural networks (RNN) and generative adversarial networks (GAN) to produce high-quality weblogs. The generated data can then be used for automated testing and improvement of Web sites prior to their release, with the aid of model-driven development tools such as IFML Editor.

Author Biographies

Silvio Pavanetto, Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Piazza Leonardo da Vinci 32, Milan, 20133, Italy

Silvio Pavanetto is a research fellow at Politecnico di Milano. His research interests include data science, social media monitoring, data-driven innovation, and big data analysis, with particular attention to machine learning and deep learning techniques applied to different scenarios and types of data, such as time series, images, and text. In his two years of research he has authored several papers published at international conferences and has worked on several research projects, also in collaboration with other European universities.

Marco Brambilla, Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Piazza Leonardo da Vinci 32, Milan, 20133, Italy

Marco Brambilla is a full professor at Politecnico di Milano. He manages several research projects and industrial innovation activities. His research interests include data science, software modeling languages, crowdsourcing, social media monitoring, data-driven innovation, and big data analysis. He has been a visiting researcher at CISCO and UCSD, USA, and a visiting professor at Dauphine University, Paris. He is the main author of the OMG standard IFML. He has founded 3 startups and authored over 250 papers, 2 patents, and 5 books. He is editor and associate editor of various journals and has been PC Chair of two editions of the ICWE Web Engineering Conference.

Published

2021-11-21

How to Cite

Pavanetto, S., & Brambilla, M. (2021). Generation of Realistic Navigation Paths for Web Site Testing Using RNN and GAN. Journal of Web Engineering, 20(4), 2571–2604. https://doi.org/10.13052/jwe1540-9589.20816

Section

ICWE 2020