Pre-trained Model-based Software Defect Prediction for Edge-cloud Systems
Keywords: Just-in-time defect prediction, pre-trained model, edge-cloud system
Edge-cloud computing is a distributed computing infrastructure that brings computation and data storage closer to clients to provide low latency. As interest in edge-cloud systems grows, research on testing these systems has also become active. However, as with traditional systems, testing resources are always limited. We therefore propose a function-level just-in-time (JIT) software defect prediction (SDP) model based on a pre-trained model, which addresses this limitation by directing the limited testing resources to defect-prone functions. The pre-trained model is a transformer-based deep learning model trained on a large corpus of code snippets; once fine-tuned, it estimates the defect proneness of the functions changed in each commit. We evaluate three popular pre-trained models (CodeBERT, GraphCodeBERT, and UniXcoder) on edge-cloud systems in within-project defect prediction (WPDP) and cross-project defect prediction (CPDP) environments. To the best of our knowledge, this is the first attempt to analyse the performance of these three pre-trained model-based SDP models on edge-cloud systems. The results show that UniXcoder performed best among the three in the WPDP environment, and that further research is needed before the SDP models can be applied in the CPDP environment.
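The prioritization idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `ChangedFunction` record, the `test_cost` field, the greedy selection rule, and the example scores are all hypothetical, and the defect scores are assumed to come from some fine-tuned model that outputs a value between 0 and 1 for each changed function.

```python
from dataclasses import dataclass


@dataclass
class ChangedFunction:
    name: str            # hypothetical identifier of a function changed in a commit
    defect_score: float  # assumed model output in [0, 1]; higher means more defect-prone
    test_cost: float     # hypothetical cost of testing this function (e.g., hours)


def prioritize(functions, budget):
    """Greedily select the most defect-prone functions that fit in the budget."""
    # Rank changed functions from most to least defect-prone.
    ranked = sorted(functions, key=lambda f: f.defect_score, reverse=True)
    selected, spent = [], 0.0
    for f in ranked:
        # Skip any function whose cost would exceed the remaining budget.
        if spent + f.test_cost <= budget:
            selected.append(f)
            spent += f.test_cost
    return selected


if __name__ == "__main__":
    # Illustrative values only; these are not results from the paper.
    commit = [
        ChangedFunction("parse_config", 0.2, 1.0),
        ChangedFunction("sync_edge_node", 0.9, 2.0),
        ChangedFunction("upload_batch", 0.6, 2.0),
    ]
    for f in prioritize(commit, budget=3.0):
        print(f.name, f.defect_score)
```

A greedy ranking is only one possible policy. It is used here because it makes the link between predicted defect proneness and the allocation of a limited testing budget easy to see.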