82 Chapter 4. Fault Tree inference using Multi-Objective Evolutionary Algorithms and Confusion Matrix-based metrics

that the most informative metrics are: Matthews correlation coefficient, Specificity, Negative predictive value, Precision, Diagnostic odds ratio, FT size, and Accuracy.

Table 4.3: Loading analysis per metric.

 #   PC   Feature                      Loading   Type
 1   PC1  Matthews correlation coef.    0.296    best
 2   PC2  Specificity                   0.538    best
 3   PC3  Negative predictive value     0.656    best
 4   PC4  Precision                     0.525    best
 5   PC5  Diagnostic odds ratio         0.702    best
 6   PC6  FT Size                       0.791    best
 7   PC7  Accuracy                      0.873    best
 8   PC2  Sensitivity                  -0.370    weak
 9   PC1  Threat Score                  0.283    weak
10   PC4  Balanced accuracy             0.348    weak
11   PC1  F1 Score                      0.283    weak
12   PC1  Fowlkes-Mallows Index         0.284    weak
13   PC4  Informedness                 -0.393    weak
14   PC4  Markedness                    0.348    weak
15   PC1  Kappa statistic               0.293    weak
16   PC2  Negative likelihood ratio    -0.315    weak
17   PC2  Positive likelihood ratio     0.487    weak

The Matthews correlation coefficient (0.296) has the highest loading on PC1. However, other features have similar loadings on PC1, such as the Threat Score (0.283). This similarity may indicate a correlation between these metrics, so only the one with the highest loading is considered. For other PCs, such as PC6 and PC7, FT Size and Accuracy are respectively the highest contributors, indicating that these metrics are uncorrelated with the others. Thus, the seven best metrics are consistently the strongest and most distinct contributors to their respective PCs and are minimally correlated across different case studies, whereas the weak features correlate more strongly with one or more of the best features and are therefore left out of the analysis.

4.5.2 Comparing FT-MOEA and FT-MOEA-CM

The comparison between FT-MOEA and FT-MOEA-CM focuses on three key aspects: robustness, scalability, and convergence speed. Robustness is assessed by examining the variability in the output FT. Convergence speed is evaluated by analysing the rate of convergence.
Finally, scalability is analysed using case studies of various sizes.

Results interpretation. Part of the comparative analysis includes box plots constructed from the outcomes of each experimental setup. In each setup, the algorithm is executed five times, yielding five separate results for the same configuration. This repeated execution is needed to assess the outcomes accurately because the optimisation process is stochastic: genetic operators are applied at random. Running the algorithm multiple times lets us evaluate the impact of this randomness on the results.

Robustness. Robustness is assessed by analysing the variability in the output FT upon convergence. An algorithm is considered robust if it consistently yields the same FT structure, though this criterion may not be universally applicable due