Enriched Hybrid Recursive Feature Elimination Algorithm with Learning Vector Quantization Classifier for Heterogenous Cross Project Defect Prediction
Abstract
Objectives: This study aims to improve software defect prediction when training instances of defective modules are scarce, as is the case in most real-world scenarios. It addresses class imbalance by constructing an enriched model for Heterogeneous Cross Project Defect Prediction (HCPDP), in which the source and target projects differ in feature metrics and size.
Methods: A hybrid recursive feature elimination approach condenses the feature set of the source project to match the feature size of the target project. This is accomplished by integrating the fuzzy linear support classifier, which represents instances in terms of membership. The features that are most informative for discriminating between clean and buggy modules are preserved, and the lowest-scored features are eliminated from the feature list. The weighted Jaccard index is used to measure the dissimilarity between the source and target projects. The computed values are then used to predict software defects with learning vector quantization.
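As an illustration of the dissimilarity step, the following is a minimal sketch of the standard weighted Jaccard (Ruzicka) index, the sum of element-wise minima divided by the sum of element-wise maxima. It assumes instances are non-negative feature vectors of equal length, for example after feature elimination and scaling; the function names and the example values are hypothetical, and the exact weighting used in the study may differ.

```python
from typing import Sequence


def weighted_jaccard(x: Sequence[float], y: Sequence[float]) -> float:
    """Weighted Jaccard similarity: sum of minima over sum of maxima."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if any(v < 0 for v in x) or any(v < 0 for v in y):
        raise ValueError("weighted Jaccard is defined for non-negative values")
    numerator = sum(min(a, b) for a, b in zip(x, y))
    denominator = sum(max(a, b) for a, b in zip(x, y))
    # Two all-zero vectors are treated as identical (a convention, assumed here).
    return 1.0 if denominator == 0 else numerator / denominator


def weighted_jaccard_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Dissimilarity derived from the similarity above."""
    return 1.0 - weighted_jaccard(x, y)


# Hypothetical scaled metric vectors for one source and one target module.
source_module = [0.2, 0.8, 0.5]
target_module = [0.4, 0.6, 0.5]
print(weighted_jaccard_distance(source_module, target_module))
```

A distance of 0 means the two modules have identical feature values, and values closer to 1 indicate greater dissimilarity.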
Findings: As software usage grows rapidly, Heterogeneous Cross Project Defect Prediction has emerged as an essential area of study in software engineering. Although an extensive literature exists, class imbalance and overfitting remain the most significant issues affecting the accuracy of prediction models. The newly developed Hybrid Recursive Feature Elimination with Learning Vector Quantization uses two software projects with different feature sizes. By adopting hybrid recursive feature elimination, the larger feature set of the source project is reduced to the size of the target project, and similar instances between the datasets are used to improve the accuracy of predicting defect-prone modules with learning vector quantization.
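To make the classification step concrete, the sketch below shows the basic LVQ1 update rule: the prototype nearest to a training instance moves toward it when their labels agree and away from it otherwise. This is a generic textbook variant, not the study's implementation. It assumes one prototype per class initialised at the class mean, a linearly decaying learning rate, and toy two-feature data in which label 0 means clean and label 1 means buggy; all names and values are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Prototype = Tuple[List[float], int]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def train_lvq1(X: List[List[float]], y: List[int],
               epochs: int = 30, lr: float = 0.3, seed: int = 0) -> List[Prototype]:
    """Train LVQ1 with one prototype per class (an assumption of this sketch)."""
    rng = random.Random(seed)
    protos: List[Prototype] = []
    for label in sorted(set(y)):
        members = [x for x, t in zip(X, y) if t == label]
        mean = [sum(col) / len(members) for col in zip(*members)]
        protos.append((mean, label))
    order = list(range(len(X)))
    for epoch in range(epochs):
        rate = lr * (1.0 - epoch / epochs)  # linearly decaying learning rate
        rng.shuffle(order)
        for i in order:
            # Index of the prototype closest to the current instance.
            k = min(range(len(protos)), key=lambda j: sq_dist(protos[j][0], X[i]))
            w, label = protos[k]
            # Attract on a label match, repel on a mismatch.
            sign = 1.0 if label == y[i] else -1.0
            moved = [wj + sign * rate * (xj - wj) for wj, xj in zip(w, X[i])]
            protos[k] = (moved, label)
    return protos


def predict_lvq(protos: List[Prototype], x: Sequence[float]) -> int:
    """Return the label of the nearest prototype."""
    return min(protos, key=lambda p: sq_dist(p[0], x))[1]


# Toy data: 0 = clean module, 1 = buggy module (hypothetical values).
X = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.25], [0.8, 0.9], [0.9, 0.8], [0.85, 0.95]]
y = [0, 0, 0, 1, 1, 1]
prototypes = train_lvq1(X, y)
print(predict_lvq(prototypes, [0.12, 0.18]), predict_lvq(prototypes, [0.88, 0.9]))
```

In a cross-project setting, the training instances would come from the reduced source project and the predictions would be made on target project modules; how the study selects similar instances for training is not specified here.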
Novelty: Evaluated on six heterogeneous software defect prediction projects, the proposed Hybrid Recursive Feature Elimination with Learning Vector Quantization (HRFE-LVQ) for HCPDP outperforms standard classification models.