Abstract
Just-in-Time (JIT) defect prediction is a software engineering approach that seeks to detect potential defects in software code at the earliest stages of the development process. This proactive method allows developers to tackle issues before they escalate, thereby enhancing software security and reliability. However, the data used to build such models commonly suffers from class imbalance, which adversely affects model performance. To address this, the study reduced the class imbalance by applying data sampling techniques. The performance of the proposed cone-shaped embedded normalization (CSEN) model was evaluated against baseline models in two scenarios: first without sampling, and second after data sampling. In state-of-the-art predictions of buggy changes, the F1 score typically ranges from 0.3 to 0.53; the proposed model improved this score to 0.72. The highest accuracy achieved by the proposed CSEN model was 74.42%.
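As a rough illustration of the two ideas the abstract relies on, the sketch below balances a toy data set by random oversampling and computes an F1 score from predictions. The abstract does not say which sampling technique the study used, so random oversampling is an assumption made here for illustration; the function names, the toy labels, and the feature values are all hypothetical and are not taken from the paper or its data set.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest one.

    Assumption: random oversampling stands in for the unspecified sampling technique.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def f1_score(y_true, y_pred, positive=1):
    """F1 is the harmonic mean of precision and recall for the positive (buggy) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (made up): 8 clean changes labelled 0 and 2 buggy changes labelled 1.
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2
bal_rows, bal_labels = random_oversample(rows, labels)
balanced_counts = Counter(bal_labels)

# Toy predictions (made up) to exercise the metric.
toy_f1 = f1_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

In practice, sampling of this kind is normally applied to the training split only, so that the test set keeps the original class distribution and the reported F1 score is not inflated.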
Data Availability
The experiments use the public data sets shared by Kamei et al. [4], who published the download address in their paper: https://research.cs.queensu.ca/~kamei/jittse/jit.zip
References
Mockus A, Roy CK, Gourley A. The Bay Area open-source systems project. IEEE Softw. 2010;27(4):20–3.
Fukushima T, Kamei Y, McIntosh S, Yamashita K, Ubayashi N. Studying just-in-time defect prediction using cross-project models. Empir Softw Eng. 2014. https://doi.org/10.1145/2597073.2597075.
Zimmermann T, Premraj R, Zeller A. Predicting defects for Eclipse. In: Proceedings of the 2012 ACM-IEEE international symposium on empirical software engineering and measurement (ESEM '12); 2012.
Kamei Y, Shihab E, Adams B, Hassan AE. A large-scale empirical study of just-in-time quality assurance. IEEE Trans Software Eng. 2013;39(6):757–73.
Yang Y, Wang Q, Leung H. Just-in-time quality assurance: towards practicable and sustainable quality assurance. IEEE Trans Softw Eng. 2015;41(2):111–27.
Herzig K, Just S, Zeller A. It's not a bug, it's a feature: how misclassification impacts bug prediction. In: Proceedings of the 39th international conference on software engineering; 2017.
Menzies T, Krishna R, Fu W. Local versus global models for effort-aware just-in-time defect prediction: A replication study. Empir Softw Eng. 2018;23(3):1712–35.
Huang Q, Xia X, Lo D. Revisiting supervised and unsupervised models for effort-aware just-in-time defect prediction. Empir Softw Eng. Research collection school of information systems. 2018;1–40.
Shin Y, Kim D, Zimmermann T. Just-in-time defect prediction leveraging social interactions. Empir Softw Eng. 2019;24(2):639–75.
Hoang T, Dam HK, Kamei Y, Lo D, Ubayashi N. DeepJIT: an end-to-end deep learning framework for just-in-time defect prediction. In: Proceedings of the international conference on mining software repositories (MSR); 2019. p. 34–45.
Qiao L, Wang Y. Effort-aware and just-in-time defect prediction with neural network. PLoS ONE. 2019;14(2):e0211359. https://doi.org/10.1371/journal.pone.0211359.
Yang S, Wang Q, Leung H. A multi-objective optimization approach for just-in-time defect prediction. Inf Sci. 2020;507:1264–83.
Yuli T, Ning L, Jeff T, Wei Z. How well just-in-time defect prediction techniques enhance software reliability? In: Proceedings of the IEEE international conference on software quality, reliability and security (QRS); 2020.
Gao R, Xie Q, Leung H. Ensemble-based just-in-time defect prediction. IEEE Trans Softw Eng. 2021.
Yanli S, Jingru Z, Xingqi W, Weiwei W, Jinglong F. Research on cross-company defect prediction method to improve software security. Security and Communication Networks. 2021; 19 pages. https://doi.org/10.1155/2021/5558561.
MSR 2014: Proceedings of the 11th working conference on mining software repositories; May 2014. Pages 172–181. https://doi.org/10.1145/2597073.2597075.
Xingguang Y, Huiqun Y, Guisheng F, Kai S, Liqiong C. Local versus global models for just-in-time software defect prediction. 2019.
Albahli S. A deep ensemble learning method for effort-aware just-in-time defect prediction. 2019.
Xu Z, Liu J, Luo X, Yang Z, Zhang Y, Yuan P, Tang Y, Zhang T. Software defect prediction based on kernel PCA and weighted extreme learning machine. Inf Softw Technol. 2019;106:182–200.
Schmidt-Hieber J. Nonparametric regression using deep neural networks with ReLU activation function. arXiv:1708.06633. 2017. https://arxiv.org/abs/1708.06633 (accessed 20 November 2019).
Funding
Not Applicable.
Author information
Contributions
Lipika Goel: Conceptualization, Methodology, Software, Supervision; Sonam Gupta: Visualization, Supervision, Investigation; Dharmendra Kumar: Data curation, Writing – original draft; Vinay Pathak: Software validation, Editing.
Ethics declarations
Conflict of interest
Not Applicable.
Research Involving Human and Animals
Not Applicable.
Informed Consent
Informed consent was obtained from all individual participants included in the study. The participants have consented to the submission of the research article to the journal.
Additional information
This article is part of the topical collection “Security for Communication and Computing Application” guest edited by Karan Singh, Ali Ahmadian, Ahmed Mohamed Aziz Ismail, R S Yadav, Md. Akbar Hossain, D. K. Lobiyal, Mohamed Abdel-Basset, Soheil Salahshour, Anura P. Jayasumana, Satya P. Singh, Walid Osamy, Mehdi Salimi and Norazak Senu.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Goel, L., Gupta, S., Kumar, D. et al. Effect of Data Sampling on Cone Shaped Embedded Normalization in Just in Time Software Defect Prediction. SN COMPUT. SCI. 5, 345 (2024). https://doi.org/10.1007/s42979-024-02703-w
DOI: https://doi.org/10.1007/s42979-024-02703-w