Causal Cross-embedded Spatio-temporal LSTM for Web Traffic Prediction

Zhao Na; Mao Yanying

doi:10.13052/jwe1540-9589.2524

Authors

Zhao Na, Chongqing Polytechnic University of Electronic Technology, Chongqing, 400054, P. R. China
Mao Yanying, Chongqing Polytechnic University of Electronic Technology, Chongqing, 400054, P. R. China

DOI:

https://doi.org/10.13052/jwe1540-9589.2524

Keywords:

Web, LSTM, Causal Cross-Embedding, Deep Learning, Interpretability

Abstract

Web service traffic forecasting is vital for dynamic resource scaling, load balancing, and anomaly detection, but remains challenging due to frequent large-scale fluctuations caused by heterogeneous user behaviors. Traditional time-series models and recent deep neural networks have made progress by capturing temporal patterns, yet they largely overlook latent causal relationships between services that can significantly influence traffic dynamics. In this paper, we propose a novel causal cross-embedded spatio-temporal LSTM (CEST-LSTM) architecture that integrates spatio-temporal modelling with a causal inference mechanism to improve web traffic prediction. The model consists of a spatio-temporal LSTM branch for capturing temporal dependencies across services and a causal branch that leverages convergent cross mapping-based cross-embedding to uncover and incorporate latent inter-service causal influences. A cross-embedding fusion mechanism seamlessly combines these causal features with spatio-temporal representations. On real-world datasets (e.g., Microsoft Azure and Alibaba Cloud), CEST-LSTM achieves a variance-explained prediction accuracy of approximately 93%, surpassing state-of-the-art baselines such as temporal graph convolutional networks (T-GCN) and spatio-temporal attention GCNs (STA-GCN). Comparative experiments and ablation studies confirm that the causal branch consistently improves forecasting accuracy – for example, removing the causal module reduces accuracy by several percentage points. These results demonstrate that integrating latent causal relationship modelling into spatio-temporal neural networks yields substantial improvements in web traffic prediction, offering a promising direction for robust and interpretable forecasting in complex web systems.
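The abstract does not say how the causal branch computes its cross-embedding, so the following is only a minimal sketch of generic convergent cross mapping (CCM) between two service traffic series, not the authors' implementation. The function names `delay_embed` and `ccm_skill`, the embedding dimension `dim`, the lag `tau`, and the synthetic coupled series in the usage section are all assumptions made for illustration.

```python
import numpy as np


def delay_embed(x, dim, tau):
    """Time-delay embedding: row i is [x[t], x[t - tau], ..., x[t - (dim - 1) * tau]]."""
    x = np.asarray(x, dtype=float)
    start = (dim - 1) * tau
    return np.column_stack(
        [x[start - k * tau : len(x) - k * tau] for k in range(dim)]
    )


def ccm_skill(source, target, dim=3, tau=1):
    """Cross-map skill: correlation between `source` and its estimate
    reconstructed from the delay-embedded (shadow) manifold of `target`.

    In CCM, a high skill is read as evidence that `source` influences `target`.
    """
    source = np.asarray(source, dtype=float)
    manifold = delay_embed(target, dim, tau)
    aligned = source[(dim - 1) * tau :]  # source values aligned with manifold rows

    # Pairwise distances between manifold points; a point is never its own neighbour.
    dists = np.linalg.norm(manifold[:, None, :] - manifold[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)

    # The dim + 1 nearest neighbours form a simplex around each point.
    nn = np.argsort(dists, axis=1)[:, : dim + 1]
    nn_d = np.take_along_axis(dists, nn, axis=1)

    # Exponential weights, scaled by the distance to the closest neighbour.
    weights = np.exp(-nn_d / np.maximum(nn_d[:, :1], 1e-12))
    weights /= weights.sum(axis=1, keepdims=True)

    estimate = (weights * aligned[nn]).sum(axis=1)
    return float(np.corrcoef(estimate, aligned)[0, 1])


if __name__ == "__main__":
    # Hypothetical toy data: service x drives service y, but not the reverse.
    n = 500
    x = np.empty(n)
    y = np.empty(n)
    x[0], y[0] = 0.4, 0.2
    for t in range(n - 1):
        x[t + 1] = x[t] * (3.8 - 3.8 * x[t])
        y[t + 1] = y[t] * (3.5 - 3.5 * y[t] - 0.1 * x[t])
    print("x -> y skill:", ccm_skill(x, y))
    print("y -> x skill:", ccm_skill(y, x))
```

In a model of the kind the abstract describes, pairwise skills like these could serve as causal features that are fused with the spatio-temporal LSTM representations; how CEST-LSTM actually performs that fusion is not stated here.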

Author Biographies

Zhao Na, Chongqing Polytechnic University of Electronic Technology, Chongqing, 400054, P. R. China

Zhao Na was born in Anshan, Liaoning Province, P. R. China, in 1979. She obtained a doctoral degree in Communication Engineering from Harbin Engineering University, China, and currently works at Chongqing Polytechnic University of Electronic Technology. Her main research interests are artificial intelligence and information processing.

Mao Yanying, Chongqing Polytechnic University of Electronic Technology, Chongqing, 400054, P. R. China

Mao Yanying was born in Chongqing, P. R. China, in 1993. She obtained a master’s degree from Beijing Institute of Technology, China, and is currently working at Chongqing Polytechnic University of Electronic Technology, focusing on artificial intelligence and computer simulation.

References

G. O. Ferreira, C. Ravazzi, F. Dabbene, G. Calafiore and M. Fiore, ‘Forecasting network traffic: a survey and tutorial with open-source comparative evaluation’, IEEE Access, 2023.

C. Park, ‘A study on traffic prediction for the backbone of Korea’s research & education network’, Journal of Web Engineering, vol. 21, no. 5, pp. 1419–1434, 2022.

L. Zhao, et al., ‘T-GCN: A temporal graph convolutional network for traffic prediction’, IEEE Transactions on Intelligent Transportation Systems, vol. 21, no. 9, pp. 3848–3858, 2020.

M. Gao, Y. Wei, Y. Xie, Y. Zhang, ‘Traffic prediction with self-supervised learning: a heterogeneity-aware model for urban traffic flow prediction based on self-supervised learning’, Mathematics, vol. 12, no. 9, art. 1290, 2024.

U. Thakur, S. K. Singh, S. Kumar, H. Singh, V. Arya, B. B. Gupta, R. W. Attar, A. Alhomoud, and K. T. Chui, ‘Advanced web traffic modelling and forecasting with a hybrid predictive approach’, Journal of Web Engineering, vol. 24, no. 3, pp. 409–456, 2025. https://doi.org/10.13052/jwe1540-9589.2434.

A. Vaswani, et al., ‘Attention is all you need’, Advances in Neural Information Processing Systems, vol. 30, 2017.

X. Hu, L. Zheng, and R. Zhang, ‘Enhancing user understanding with big data: A comparative study of deep learning and statistical methods for forecasting online page views’, 2024 IEEE International Conference on Big Data (BigData), pp. 4095–4103, Washington, DC, USA, 2024, doi: 10.1109/BigData62323.2024.10825932.

R. Casado-Vara, A. Martin del Rey, D. Pérez-Palau, L. de-la-Fuente-Valentín and J. M. Corchado, ‘Web traffic time series forecasting using LSTM neural networks with distributed asynchronous training’, Mathematics, vol. 9, no. 4, art. 421, 2021.

V. Tambe, A. Golait, S. Pardeshi, R. Javeri and G. Arsalwad, ‘Forecast web traffic time series using ARIMA model’, International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE), 2022.

B. Yu, H. Yin, Z. Zhu, ‘Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting’, Proc. IJCAI 2018, pp. 3634–3640, Stockholm, Jul. 2018.

S. Bai, J. Z. Kolter, V. Koltun, ‘An empirical evaluation of generic convolutional and recurrent networks for sequence modeling’, arXiv:1803.01271, 2018.

J. Zhang, Y. Zheng, D. Qi, ‘Deep spatio-temporal residual networks for citywide crowd flows prediction’, Proc. AAAI 2017, pp. 1655–1661, San Francisco, Feb. 2017.

T. N. Kipf, M. Welling, ‘Semi-supervised classification with graph convolutional networks’, Proc. ICLR 2017, arXiv:1609.02907, 2017.

S. Hochreiter, J. Schmidhuber, ‘Long short-term memory’, Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.

K. Cho, et al., ‘Learning phrase representations using RNN encoder–decoder’, Proc. EMNLP 2014, pp. 1724–1734, Doha, Oct. 2014.

K. Xu, et al., ‘Show, attend and tell: Neural image caption generation with visual attention’, Proc. ICML 2015, pp. 2048–2057, Lille, Jul. 2015.

S. Liu, X. Wang, ‘An improved transformer based traffic flow prediction model’, Scientific Reports, vol. 15, art. 8284, 2025.

Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang, P. S. Yu, ‘A comprehensive survey on graph neural networks’, IEEE Transactions on Neural Networks and Learning Systems, vol. 32, no. 1, pp. 4–24, 2021.

Z. Wu, S. Pan, G. Long, J. Jiang, X. Chang, C. Zhang, ‘Connecting the dots: Multivariate time series forecasting with graph neural networks’, arXiv:2005.11650, 2020.

H. Wu, J. Xu, J. Wang, M. Long, ‘Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting’, Proc. NeurIPS 2021, pp. 22419–22430, 2021.

H. Zhou, S. Zhang, J. Peng, S. Zhang, J. Li, H. Xiong, W. Zhang, ‘Informer: Beyond efficient transformer for long sequence time-series forecasting’, Proc. AAAI 2021, pp. 11106–11115, 2021.

B. Lim, S. Ö. Arik, N. Loeff, T. Pfister, ‘Temporal fusion transformers for interpretable multi-horizon time series forecasting’, International Journal of Forecasting, vol. 37, no. 4, pp. 1748–1764, 2021.

Z. Dai, Z. Yang, Y. Yang, J. Carbonell, Q. V. Le, R. Salakhutdinov, ‘Transformer-XL: Attentive language models beyond a fixed-length context’, Proc. ACL 2019, pp. 2978–2988, Florence, Jul. 2019.

M. Zhang, P. Li, Y. Xia, K. Wang, L. Jin, ‘Revisiting graph neural networks for link prediction’, arXiv:2010.16103, 2020.

R. Moraffah, et al., ‘Causal inference for time series analysis: Problems, methods and evaluation’, arXiv:2102.05829, 2021.

J. Runge, P. Nowack, M. Kretschmer, S. Flaxman, D. Sejdinovic, ‘Detecting and quantifying causal associations in large nonlinear time series datasets’, Science Advances, vol. 5, no. 11, art. eaau4996, 2019.

J. Pearl, ‘Causality: Models, Reasoning and Inference’, Cambridge University Press, 2nd ed., 2009.

C. W. J. Granger, ‘Investigating causal relations by econometric models and cross-spectral methods’, Econometrica, vol. 37, no. 3, pp. 424–438, 1969.

C. Tian, M. Xing, Z. Shi, M. B. Blaschko, Y. Yue, M.-F. Moens, ‘Using causality for enhanced prediction of web traffic time series’, arXiv:2502.00612, 2025.

J. Chang, J. Yin, Y. Hao, C. Gao, ‘STFDSGCN: Spatio-Temporal Fusion Graph Neural Network based on Dynamic Sparse Graph Convolution GRU for Traffic Flow Forecast’, Sensors, vol. 25, no. 11, art. 3446, 2025.

M. Xu, W. Dai, C. Liu, X. Gao, W. Lin, G.-J. Qi, H. Xiong, ‘Spatial-temporal transformer networks for traffic flow forecasting’, arXiv:2001.02908, 2020.

S. Cai, H. Peng, R. Liu and P. Chen, ‘Causal-oriented representation learning for time-series forecasting based on the spatiotemporal information transformation’, Communications Physics, vol. 8, art. 242, 2025.

R. Wang et al., ‘Graph neural network-based network traffic analysis: A comprehensive survey’, Network Traffic Analysis Based on Graph Neural Networks, 2025.

P. Fafoutellis and E. I. Vlahogianni, ‘A theory-informed multivariate causal framework for trustworthy short-term urban traffic forecasting’, Transportation Research Part C: Emerging Technologies, vol. 170, 104945, 2025.

L. Zhang, Y. Wang and J. Li, ‘Interpretable predictive modeling of non-stationary long time series’, Computers & Industrial Engineering, vol. 194, 110412, 2024.

Z. Chang, C. Liu, J. Jia, ‘STA-GCN: Spatial-Temporal self-attention graph convolutional networks for traffic-flow prediction’, Applied Sciences, vol. 13, no. 11, art. 6796, 2023.
