288 Chapter 9

Distance between clusters: complete linkage

We define the distance between two clusters by means of an additional rule, called complete linkage, applied to the set of all distances between two individuals, one taken from each cluster. Under complete linkage, the inter-cluster distance $d_C$ is chosen to be the largest distance of this set:

$$d_C(C_i, C_j) = \max_{k,l} \, d_P(C_{ik}, C_{jl})$$

Mean inter-cluster distance

For each cluster pair, we calculate the inter-cluster distance $d_C(C_i, C_j)$, and define the mean inter-cluster distance $\langle d_C \rangle$ as the arithmetic mean of these distances. Note that the choice of linkage type governs the meaning of $d_C(C_i, C_j)$ and, in turn, of $\langle d_C \rangle$.

Mean intra-cluster distance

For each cluster, we consider all pairs $\{p_i, p_j\}$ of parameter vectors within it. For each such pair, we calculate the distance $d_P(p_i, p_j)$. We define the mean intra-cluster distance $d_{\mathrm{intra}}$ as the arithmetic mean of these distances. Note that the choice of linkage type does not affect this outcome measure, and that it is not defined when $N = M$ (i.e., when all individuals are in separate clusters, so that no intra-cluster distances exist).

Silhouette

The silhouette is defined on a per-individual basis. Intuitively, this measure is high if an individual is similar to the other individuals of its own cluster and distinct from the individuals of other clusters. More specifically, let the $i$-th individual, represented by $p_i$, be the $j$-th cluster's $k$-th individual: $p_i = C_{jk}$. Let $a(p_i)$ be the mean Gower's distance of $p_i$ to other