266 Chapter 9

Results

Clusterability of the EU-NN database

Our clustering model explained 40% of the variance in the EU-NN database at seven clusters (Supplementary Figure 1A). This is substantially higher than the coefficients of determination obtained for the randomly generated datasets (mean of 16%, with the upper limit of the 95% confidence interval at 18%), suggesting that the EU-NN database is intrinsically clusterable.

Supplementary Figure 1. Advanced analyses. (A) Coefficient of determination (R²) for different numbers of clusters, calculated for the original clustering (green) and for 20 randomly generated datasets (red). The greater coefficient of determination of the EU-NN database indicates intrinsic clusterability of the database. (B) The silhouette coefficient was calculated for each individual and grouped per cluster. Positive values (green) represent individuals distinctly grouped within their cluster, whereas negative values (red) indicate individuals that are closer to the nearest neighbouring cluster than to the other members of their own cluster. The larger silhouette coefficients of clusters 5 and 6 reveal greater cluster distinctness. (C) Mixing between clusters in the jack-knife resampling, highlighting the pairs of clusters that mix more frequently. A higher mixing probability between two clusters does not necessarily mean that these clusters are not robust; it can also mean that they are frequently merged into one larger cluster in the resampling iterations for the chosen number of clusters. μ: mean silhouette coefficient.
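The two cluster-quality measures shown in panels A and B can be illustrated with a minimal sketch. It rests on assumptions that the text does not confirm: that the coefficient of determination is computed as one minus the ratio of within-cluster to total sum of squares, that distances are Euclidean, and that the silhouette coefficient follows its standard definition. The function names and the toy data are hypothetical and are not taken from the EU-NN analysis.

```python
import math


def centroid(points):
    """Mean position of a list of equal-length coordinate tuples."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def variance_explained(points, labels):
    """Assumed definition: R^2 = 1 - within-cluster SS / total SS."""
    grand = centroid(points)
    total_ss = sum(sq_dist(p, grand) for p in points)
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    within_ss = 0.0
    for members in clusters.values():
        c = centroid(members)
        within_ss += sum(sq_dist(p, c) for p in members)
    return 1.0 - within_ss / total_ss


def silhouette_values(points, labels):
    """Standard silhouette s = (b - a) / max(a, b) per individual.

    a: mean distance to the other members of the individual's own cluster.
    b: mean distance to the members of the nearest other cluster.
    Requires at least two clusters; singleton clusters are assigned 0.
    """
    values = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        dists = {}
        for j, (q, other) in enumerate(zip(points, labels)):
            if i != j:
                dists.setdefault(other, []).append(math.sqrt(sq_dist(p, q)))
        if lab not in dists:
            values.append(0.0)
            continue
        a = sum(dists[lab]) / len(dists[lab])
        b = min(sum(d) / len(d) for other, d in dists.items() if other != lab)
        values.append((b - a) / max(a, b))
    return values


# Toy data (hypothetical): two well-separated groups of three individuals.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]
bad_labels = [0, 0, 1, 1, 1, 1]  # third individual assigned to the wrong group

print(variance_explained(points, good_labels))
print(silhouette_values(points, good_labels))
print(silhouette_values(points, bad_labels)[2])
```

With the well-separated labels, the variance explained is close to 1 and every silhouette value is positive. The misassigned individual in `bad_labels` receives a negative silhouette value, which corresponds to the red bars in panel B. Applying `variance_explained` to clusterings of randomly generated datasets would give the kind of baseline that the 40% figure is compared against.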