Finding faults in complex systems at an early stage improves reliability and is an emerging research area; the assessment of faults is performed by Early Fault Prediction Systems (EFPS). Identifying fault-prone submodules is given the highest priority before testing of those modules begins. EFPS help to improve system quality within specified time and cost limits. Early fault prediction for different system components has shown significant results with respect to cost and time. Among state-of-the-art EFPS, ensemble-based classifiers have performed best and been the most cost-effective compared with other classification methods. Recently, an ensemble random forest with adaptive synthetic sampling (E-RF-ADASYN) was developed and tested on a sample of PROMISE and KAGGLE datasets, where it showed cost-effective classification results. Logistic regression-based classification algorithms for system quality modeling can account for prior probabilities and costs of misclassification. The random forest algorithm is an ensemble learning approach for prediction: it builds several decision trees, and the final decision is the most frequently predicted output class. The proposed work focuses on developing a sampling method, called Ensemble-Random Forest with Multi-Distinguished-Features Sampling (E-RF-MDFS), to obtain the sample that best represents the entire dataset. Bat-induced Butterfly Optimization (BBO) is used for the feature extraction process. The experiments are conducted on 8 datasets from the PROMISE and KAGGLE databases. The proposed E-RF-MDFS outperforms E-RF-ADASYN in fault detection accuracy (FDA), true positive rate, and Pearson's correlation coefficient.
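The voting rule described above (several decision trees, with the most popular output class taken as the decision) can be illustrated with a minimal Python sketch. The threshold rules, metric names (`loc`, `complexity`, `coupling`), and class labels below are hypothetical stand-ins chosen for illustration; they are not the trained trees, features, or datasets used in this work.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class label predicted by the largest number of trees."""
    if not tree_predictions:
        raise ValueError("need at least one tree prediction")
    return Counter(tree_predictions).most_common(1)[0][0]


def forest_predict(trees, sample):
    """Ask every tree for a label, then combine the labels by majority vote."""
    return majority_vote([tree(sample) for tree in trees])


# Hypothetical stand-ins for trained trees: one threshold rule each.
# The metric names and cut-off values are assumptions for illustration only.
trees = [
    lambda m: "faulty" if m["loc"] > 300 else "clean",
    lambda m: "faulty" if m["complexity"] > 10 else "clean",
    lambda m: "faulty" if m["coupling"] > 8 else "clean",
]

# A made-up module description; two of the three rules flag it.
module = {"loc": 450, "complexity": 14, "coupling": 3}
print(forest_predict(trees, module))  # prints "faulty"
```

Using an odd number of trees with two classes avoids tied votes; with ties, `Counter.most_common` returns the label that was counted first.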
Among the E-RF-based classifiers compared, the proposed MDFS performs best, reaching a fault detection accuracy (FDA) of 99.3% on Xalan v2.6 and outperforming the ADASYN classifier.
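The three evaluation measures named above can be computed from predicted and actual labels roughly as follows. This is a sketch under stated assumptions: FDA is treated here as plain classification accuracy, faulty modules are encoded as 1 and clean modules as 0, and the label vectors are made-up toy data rather than results from the PROMISE or KAGGLE datasets.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of modules whose predicted label matches the actual label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def true_positive_rate(y_true, y_pred, positive=1):
    """Fraction of actually faulty modules that were predicted as faulty."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if (tp + fn) else 0.0


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y) if norm_x and norm_y else 0.0


# Toy labels (assumed encoding: 1 = faulty, 0 = clean); not real results.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 1]

print(accuracy(y_true, y_pred))            # 0.75
print(true_positive_rate(y_true, y_pred))  # 0.75
print(pearson(y_true, y_pred))
```

On this toy data, six of eight labels match and three of the four faulty modules are detected, so both accuracy and true positive rate are 0.75, and the correlation between predicted and actual labels is approximately 0.5.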