Code Smell-guided Prompting for LLM-based Defect Prediction in Ansible Scripts

Authors

  • Hyunsun Hong Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
  • Sungu Lee Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
  • Duksan Ryu Jeonbuk National University, Jeonju, Republic of Korea
  • Jongmoon Baik Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea

DOI:

https://doi.org/10.13052/jwe1540-9589.2383

Keywords:

Edge-cloud, Ansible, large language models, software defect prediction

Abstract

Ensuring the reliability of infrastructure as code (IaC) scripts, like those written in Ansible, is vital for maintaining the performance and security of edge-cloud systems. However, the scale and complexity of these scripts make exhaustive testing impractical. To address this, we propose a large language model (LLM)-based software defect prediction (SDP) approach that uses code-smell-guided prompting (CSP). CSP embeds specific code smell indicators directly into the prompts and, in some cases, enhances LLM performance in defect prediction. We explore various prompting strategies, including zero-shot, one-shot, and chain-of-thought CSP (CoT-CSP), to evaluate how code smell information can improve defect detection. Unlike traditional prompting, CSP uniquely leverages code context to guide LLMs in identifying defect-prone code segments. Experimental results reveal that while zero-shot prompting achieves high baseline performance, CSP variants provide nuanced insights into the role of code smells in improving SDP. This study represents an exploration of LLMs for defect prediction in Ansible scripts, offering a new perspective on enhancing software quality in edge-cloud deployments.
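To make the prompting strategies concrete, the minimal Python sketch below contrasts a plain zero-shot prompt with a code-smell-guided (CSP) prompt for a single Ansible task. This is an illustrative assumption, not the prompt templates published in the paper: the task snippet, smell labels, and prompt wording are hypothetical.

# Illustrative sketch only: not the authors' published prompt templates.
# It shows how a code-smell-guided (CSP) prompt can embed smell indicators
# into the prompt, compared with a plain zero-shot prompt.
# The Ansible snippet, smell labels, and wording below are hypothetical.

ANSIBLE_TASK = """\
- name: Create application database user
  mysql_user:
    name: app
    password: "p@ssw0rd123"
    priv: "*.*:ALL"
"""

def zero_shot_prompt(snippet: str) -> str:
    # Baseline: ask the model directly, with no smell information.
    return (
        "You are a software defect prediction assistant for Ansible scripts.\n"
        "Classify the following task as 'defective' or 'clean'.\n\n" + snippet
    )

def csp_prompt(snippet: str, smells: list[str]) -> str:
    # CSP: list the code smells detected in the snippet so the model
    # attends to the suspicious segments when making its prediction.
    smell_lines = "\n".join(f"- {s}" for s in smells)
    return (
        "You are a software defect prediction assistant for Ansible scripts.\n"
        "The task below contains the following code smells:\n"
        f"{smell_lines}\n"
        "Taking these smells into account, classify the task as "
        "'defective' or 'clean'.\n\n" + snippet
    )

if __name__ == "__main__":
    print(zero_shot_prompt(ANSIBLE_TASK))
    print(csp_prompt(ANSIBLE_TASK, ["hard-coded secret in 'password'",
                                    "overly broad privileges in 'priv'"]))

The same construction extends to one-shot CSP (prepend a labeled example task) and CoT-CSP (ask the model to reason about each listed smell before giving its final label).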


Author Biographies

Hyunsun Hong, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea

Hyunsun Hong received his bachelor’s degree in computer science and electrical engineering from Handong Global University in 2023. He is a master’s student in software engineering at KAIST. His research areas include software analytics based on AI and software defect prediction.

Sungu Lee, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea

Sungu Lee received his bachelor’s degree in mathematics from KAIST in 2021 and his master’s degree in software engineering from KAIST in 2022. He is a doctoral student in software engineering at KAIST. His research areas include software analytics based on AI, software defect prediction, mining software repositories, and software reliability engineering.

Duksan Ryu, Jeonbuk National University, Jeonju, Republic of Korea

Duksan Ryu earned a bachelor’s degree in computer science from Hanyang University in 1999 and a dual master’s degree in software engineering from KAIST and Carnegie Mellon University in 2012. He received his Ph.D. degree from the School of Computing at KAIST in 2016. His research areas include software analytics based on AI, software defect prediction, mining software repositories, and software reliability engineering. He is currently an associate professor in the Department of Software Engineering at Jeonbuk National University.

Jongmoon Baik, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea

Jongmoon Baik received his B.Sc. degree in computer science and statistics from Chosun University in 1993. He received his M.Sc. and Ph.D. degrees in computer science from the University of Southern California in 1996 and 2000, respectively. He worked as a principal research scientist at the Software and Systems Engineering Research Laboratory, Motorola Labs, where he was responsible for leading many software quality improvement initiatives. His research activities and interests are focused on software Six Sigma, software reliability and safety, and software process improvement. He is currently a full professor in the School of Computing at Korea Advanced Institute of Science and Technology (KAIST). He is a member of the IEEE.


Published

2025-02-07

How to Cite

Hong, H., Lee, S., Ryu, D., & Baik, J. (2025). Code Smell-guided Prompting for LLM-based Defect Prediction in Ansible Scripts. Journal of Web Engineering, 23(08), 1107–1126. https://doi.org/10.13052/jwe1540-9589.2383

Issue

Vol. 23 No. 08

Section

Articles