A Comprehensive Study on Integration of Big Data and AI in Financial Decision-Making

Karim Elalkaoui; Safaa Moqqaddem; Mounir Ait Kerroum

doi:10.13052/jicts2245-800X.1423

Authors

Karim Elalkaoui, Laboratory of Research in Informatics (LaRI), Faculty of Sciences, Ibn Tofail University, Kenitra, Morocco. https://orcid.org/0009-0002-9589-3413
Safaa Moqqaddem, Laboratory of Research in Informatics (LaRI), Faculty of Sciences, Ibn Tofail University, Kenitra, Morocco. https://orcid.org/0009-0007-2185-2690
Mounir Ait Kerroum, Laboratory of Research in Informatics (LaRI), Faculty of Sciences, Ibn Tofail University, Kenitra, Morocco

Keywords:

Artificial intelligence (AI), big data, financial decision-making, deep learning, machine learning (ML), hybrid models, ensemble learning, real-time analytics, financial forecasting, risk assessment, fraud detection, concept drift, explainability, privacy, robustness, multimodal models, streaming frameworks, lakehouse architecture, BTC-USD prediction

Abstract

The rapid proliferation of large-scale financial data, coupled with advances in Artificial Intelligence (AI), has significantly transformed modern financial decision-making. This paper presents a comprehensive state-of-the-art review of AI-driven approaches supported by Big Data infrastructure in the financial domain. We analyse recent academic contributions (2023–2025) across machine learning (ML), deep learning and hybrid ensemble techniques applied to forecasting, portfolio optimisation, risk assessment and fraud detection. Emerging data architectures such as streaming frameworks and lakehouse platforms are assessed in terms of their ability to support real-time analytics and large-scale model deployment. We highlight the transition towards multimodal and attention-based models that integrate structured and unstructured data sources, and identify key challenges including concept drift, explainability, privacy and robustness. A detailed case study involving a GPU-accelerated hybrid deep learning and ensemble model for BTC–USD price prediction demonstrates practical benefits and current limitations: the hybrid model achieved an RMSE of 2656.69, a MAPE of 2.14%, and an R² of 0.9626 on the test set. Although the absolute RMSE reflects the inherent volatility of the asset class, the low MAPE and high R² confirm the model’s predictive efficacy during regime shifts, highlighting the necessity for future integration of macro-scenarios.
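The three metrics quoted in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions of RMSE, MAPE and R²; the paper's own evaluation code is not shown here, so the function names and the toy price series below are assumptions chosen purely for illustration and are not data from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error, expressed in the same units as the target (e.g. USD)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent; assumes no true value is zero."""
    n = len(y_true)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical prices (illustrative only, not taken from the paper).
actual = [60000.0, 62000.0, 64000.0, 66000.0]
predicted = [61000.0, 61000.0, 65000.0, 65000.0]

print(f"RMSE: {rmse(actual, predicted):.2f}")
print(f"MAPE: {mape(actual, predicted):.2f}%")
print(f"R2:   {r2(actual, predicted):.4f}")
```

Because RMSE is measured in price units, it grows with the price level of the asset, whereas MAPE is relative to each actual value. This is why an RMSE in the thousands can coexist with a MAPE of a few percent for a high-priced asset.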

Author Biographies

Karim Elalkaoui, Laboratory of Research in Informatics (LaRI), Faculty of Sciences, Ibn Tofail University, Kenitra, Morocco

Karim Elalkaoui is a PhD candidate at Ibn Tofail University in Kenitra, Morocco, at the Laboratory of Research in Informatics (LaRI), Faculty of Sciences. He obtained a master’s degree in Big Data and Cloud Computing from Ibn Tofail University. His research interests include artificial intelligence, big data analytics, and financial decision-making systems.

Safaa Moqqaddem, Laboratory of Research in Informatics (LaRI), Faculty of Sciences, Ibn Tofail University, Kenitra, Morocco

Safaa Moqqaddem is an Assistant Professor at the National School of Commerce and Management (ENCG), Ibn Tofail University, Kenitra, Morocco. She received a master’s degree in computer science and telecommunications and a PhD in computer science and telecommunications from the Faculty of Sciences, Ibn Tofail University. Her research interests include artificial intelligence, big data, computer vision, machine perception (particularly 3D object detection and tracking), image processing, and intelligent systems. She is also a member of the LaRI Laboratory at the Faculty of Sciences, Kenitra.

Mounir Ait Kerroum, Laboratory of Research in Informatics (LaRI), Faculty of Sciences, Ibn Tofail University, Kenitra, Morocco

Mounir Ait Kerroum received his Master’s Degree (DESA) in Computer Science and Telecommunications from the Faculty of Sciences at Mohammed V University, Rabat, Morocco, in 2003, followed by a PhD from the same institution in 2010. In March 2010, he joined the National School of Business and Management (ENCG) of Kénitra, Ibn Tofail University, as an Assistant Professor, before being promoted to Associate Professor in 2014. He is a founding and permanent member of the Laboratory for Research in Computer Science and Telecommunications (LaRIT), established in 2010. From 2017 to 2020, he headed the ‘Networks, Telecommunications, and Artificial Intelligence’ research team within the lab. He is also an Associate Researcher at the Laboratory for Research in Computing and Telecommunications (LRIT) at the Faculty of Sciences, Rabat. Since December 2020, he has held the rank of Full Professor. His current research interests include Artificial Intelligence, Machine Learning, and Deep Learning applied to pattern recognition in remote sensing and medical imaging. His expertise also extends to the application of Explainable AI (XAI), NLP, and Deep Learning for financial market and cryptocurrency forecasting, as well as the use of sentiment analysis to enhance model transparency and explainability.
