Project Evolution-aware Prompting of LLMs for Just-in-time Defect Prediction in Edge-cloud Systems

Authors

  • Inseok Yeo, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
  • Sungu Lee, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
  • Duksan Ryu, Jeonbuk National University, Jeonju, Republic of Korea
  • Jongmoon Baik, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea

DOI:

https://doi.org/10.13052/jwe1540-9589.2535

Keywords:

Just-in-time defect prediction, large language model, edge-cloud system

Abstract

Edge-cloud systems, which bring computing, storage, and networking resources closer to end-users, offer significant advantages in reducing latency and enabling real-time data processing. These systems are increasingly deployed across diverse domains, such as smart manufacturing, autonomous vehicles, and large-scale IoT networks, to support big data-driven services that require continuous analytics and rapid response. Ensuring software reliability in these environments is critical, which has led to growing attention to just-in-time (JIT) defect prediction as an effective technique for prioritizing testing efforts by identifying code changes likely to introduce defects. However, existing techniques struggle to perform accurately on new or low-data projects because they lack sufficient training data.

In this paper, we propose PROPER-SDP, a prompt-based approach that leverages large language models (LLMs). By incorporating project evolution data directly into prompts, our approach enables LLMs to capture the contextual information essential for accurate JIT defect prediction. This addresses the cold-start problem, allowing accurate prediction even in the absence of project-specific training data. Evaluation results demonstrate that our method significantly improves prediction performance, surpassing baseline methods by an average of 19.7% in F1-score. Our approach thus enables reliable JIT defect prediction even in rapidly evolving, resource-constrained edge-cloud systems.
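
The abstract does not give the prompt template used by PROPER-SDP, so the following is only a minimal sketch of the general idea of placing project evolution data in a prompt. The `CommitRecord` fields, the `build_prompt` function, the `max_history` parameter, and the prompt wording are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CommitRecord:
    """One code change in a project's history (fields are illustrative assumptions)."""
    message: str
    files_changed: int
    lines_added: int
    lines_deleted: int
    was_defective: bool = False  # known only for past changes


def build_prompt(history: List[CommitRecord],
                 new_change: CommitRecord,
                 max_history: int = 5) -> str:
    """Assemble a prompt that shows recent project history before the new change.

    This is a hypothetical template, not the prompt used by PROPER-SDP.
    """
    # Keep only the most recent changes; guard against max_history <= 0,
    # because history[-0:] would return the whole list.
    recent = history[-max_history:] if max_history > 0 else []

    lines = [
        "You are assessing whether a code change is likely to introduce a defect.",
        "",
        "Recent project history (oldest first):",
    ]
    if not recent:
        lines.append("(no history available)")
    for i, c in enumerate(recent, start=1):
        label = "defect-inducing" if c.was_defective else "clean"
        lines.append(
            f"{i}. [{label}] {c.message} "
            f"(files={c.files_changed}, +{c.lines_added}/-{c.lines_deleted})"
        )

    lines += [
        "",
        "New change to assess:",
        f"{new_change.message} "
        f"(files={new_change.files_changed}, "
        f"+{new_change.lines_added}/-{new_change.lines_deleted})",
        "",
        "Is this change defect-inducing? Answer 'yes' or 'no'.",
        "Answer:",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    past = [
        CommitRecord("Refactor device registry", 4, 120, 80, was_defective=True),
        CommitRecord("Update README", 1, 5, 2),
    ]
    candidate = CommitRecord("Add retry logic to message bus client", 3, 60, 10)
    print(build_prompt(past, candidate))
```

The resulting string would be sent to an LLM of the reader's choice. Because the context comes from the project's own history rather than from a trained model, a sketch like this needs no project-specific training step, which is the property the abstract associates with the cold-start setting.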


Author Biographies

Inseok Yeo, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea

Inseok Yeo received his bachelor’s degree in computer science from Hanyang University in 2024. He is a master’s student in computer science at KAIST. His research areas include software analytics and software engineering based on AI and LLMs.

Sungu Lee, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea

Sungu Lee received his bachelor’s degree in mathematics from KAIST in 2021 and his master’s degree in software engineering from KAIST in 2022. He is a doctoral student in software engineering at KAIST. His research areas include software analytics based on AI, software defect prediction, mining software repositories, and software reliability engineering.

Duksan Ryu, Jeonbuk National University, Jeonju, Republic of Korea

Duksan Ryu earned his bachelor’s degree in computer science from Hanyang University in 1999 and his dual master’s degree in software engineering from KAIST and Carnegie Mellon University in 2012. He received his Ph.D. degree from the School of Computing at KAIST in 2016. His research areas include software analytics based on AI, software defect prediction, mining software repositories, and software reliability engineering. He is currently an associate professor in the software engineering department at Jeonbuk National University.

Jongmoon Baik, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea

Jongmoon Baik received his B.Sc. degree in computer science and statistics from Chosun University in 1993. He received his M.Sc. and Ph.D. degrees in computer science from the University of Southern California in 1996 and 2000, respectively. He worked as a principal research scientist at the Software and Systems Engineering Research Laboratory, Motorola Labs, where he was responsible for leading many software quality improvement initiatives. His research activities and interests focus on software Six Sigma, software reliability and safety, and software process improvement. He is currently a full professor in the School of Computing at Korea Advanced Institute of Science and Technology (KAIST). He is a member of the IEEE.


Published

2026-04-19

How to Cite

Yeo, I., Lee, S., Ryu, D., & Baik, J. (2026). Project Evolution-aware Prompting of LLMs for Just-in-time Defect Prediction in Edge-cloud Systems. Journal of Web Engineering, 25(03), 395–416. https://doi.org/10.13052/jwe1540-9589.2535

Section

Articles