Table 4.1: List of 17 metrics evaluated to guide the inference process of FTs.

Fault Tree Size
    Range: [2, ∞)
    Comment: Number of Fault Tree nodes |FT| := |V|.

Precision, Specificity, Sensitivity, Negative predictive value, Accuracy, Threat score, Balanced accuracy, Negative likelihood ratio, Positive likelihood ratio, Diagnostic odds ratio, F1 Score, Fowlkes-Mallows Index
    Range: [0, 1]
    Comment: Metrics that range between 0 and 1 (or 0 to infinity) are normalised to [0, 1], with 0 being the optimum value.

Matthews correlation coefficient, Informedness, Markedness, Kappa statistic
    Range: [0, 2]
    Comment: Metrics that range between -1 and 1 are normalised to [0, 2], with 0 being the optimum value.

FT inference process. To achieve this, we conduct a Principal Component Analysis (PCA), a technique utilised for dimensionality reduction and feature selection. In a second phase, we perform an extensive evaluation of the new approach on six FTs from diverse application areas and compare it with FT-MOEA. In particular, we investigate how the inclusion of additional information in FT-MOEA-CM influences the robustness, scalability, and convergence speed of the FT inference process.

Contributions. The primary contributions of this work are as follows: (i) the introduction of the FT-MOEA-CM algorithm, which employs confusion matrix-based metrics for the automatic inference of FTs and enhances robustness, scalability, and convergence speed over its predecessor FT-MOEA; (ii) improved performance through the integration of features such as caching and parallelisation, which is particularly beneficial for larger FT structures; (iii) the availability of FT-MOEA-CM at https://gitlab.utwente.nl/fmt/fault-trees/ft-moea.

Outline. Section 4.2 introduces FTs and formally defines their inference. Section 4.3 details the FT-MOEA-CM methodology. Section 4.4 describes our experimental setup, and Section 4.5 presents the results of evaluating FT-MOEA-CM on six case studies. We conclude and present future work in Section 4.7.

4.2 Confusion Matrix-based metrics

Our inference approach is guided by metrics based on the Confusion Matrix. The Confusion Matrix (CM) is a performance evaluation tool commonly used in Machine Learning classification tasks (Sokolova and Lapalme, 2009). In binary classification, a 2×2 CM categorises predictions into four outcomes: True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN). In our setting