Table 1 List of cluster validation indices used in this work

From: A practical approach to cluster validation in the energy sector

| Name | Abbreviation | Usage |
| --- | --- | --- |
| Average within-cluster distance | \(I_{avg\_wc}\) | Measure of the similarity of objects/points in a cluster. The higher the index, the smaller the average within-cluster distance. |
| p-separation-index | \(I_{psep}\) | Measure of the separation between clusters. Instead of the minimum or maximum distance between clusters (both prone to outliers), it is calculated as the mean distance over a portion (p) of the points lying closest to another cluster. The higher the index, the better the between-cluster separation. |
| Representation by centroids | \(I_{centroid}\) | Measure of how well a cluster is represented by its centroid. The higher the index, the better the representation. |
| Representation of dissimilarity structure by clustering | \(I_{pearson}\) | Pearson correlation between the pairwise dissimilarities (e.g., Euclidean distances) and the “clustering-induced dissimilarity” (0 if two objects share a cluster, 1 otherwise). The more dissimilar two objects/points are, the less they should be assigned to the same cluster. The higher the index, the more strongly the pairwise dissimilarities correlate with the clustering-induced dissimilarity. |
| Within-cluster gaps | \(I_{widestgap}\) | Measure of the connectivity of a cluster. The higher the index, the smaller the within-cluster gaps. |
| Entropy | \(I_{entropy}\) | Measure of how uniform the cluster sizes are. |
| Parsimony | \(I_{parsimony}\) | Measure expressing the preference for a lower number of clusters. |
| Density modes and valleys | \(I_{densdec}\) | Measure quantifying the density drop from the cluster mode to the edges of a cluster and the density valleys between clusters. |
| Uniform within-cluster density | \(I_{cvdens}\) | Measure quantifying the within-cluster density levels. The higher the index, the more uniform the density within the cluster. |
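
As an illustration of three of the quantities listed above, the following is a minimal Python sketch, not the implementation used in the article. It computes the raw average within-cluster distance (before any rescaling into a "higher is better" index), a size entropy normalized by the logarithm of the number of clusters, and the Pearson correlation between pairwise distances and the clustering-induced dissimilarity. The function names, the use of Euclidean distance, the averaging over unordered within-cluster pairs, and the entropy normalization are assumptions made for this example.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix for the rows of X (assumed metric)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def avg_within_cluster_distance(X, labels):
    """Mean distance over all unordered pairs that share a cluster.

    This is the raw distance; an index where "higher is better" would
    need an additional rescaling that is not shown here.
    """
    D = pairwise_distances(X)
    labels = np.asarray(labels)
    i, j = np.triu_indices(len(labels), k=1)
    same = labels[i] == labels[j]
    return D[i, j][same].mean()


def size_entropy(labels):
    """Entropy of the cluster-size proportions, divided by log(K).

    Equals 1 when all K clusters have the same size. Returns 0.0 for a
    single cluster (a convention chosen for this sketch).
    """
    _, counts = np.unique(labels, return_counts=True)
    k = len(counts)
    if k < 2:
        return 0.0
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum() / np.log(k))


def pearson_gamma(X, labels):
    """Correlation between pairwise distances and the 0/1 indicator
    "the two points are in different clusters"."""
    D = pairwise_distances(X)
    labels = np.asarray(labels)
    i, j = np.triu_indices(len(labels), k=1)
    different = (labels[i] != labels[j]).astype(float)
    return float(np.corrcoef(D[i, j], different)[0, 1])


# Toy data: two tight, well-separated groups of two points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
good = [0, 0, 1, 1]  # follows the visible grouping
bad = [0, 1, 0, 1]   # splits each group across clusters

print(avg_within_cluster_distance(X, good), avg_within_cluster_distance(X, bad))
print(size_entropy(good))
print(pearson_gamma(X, good), pearson_gamma(X, bad))
```

On the toy data, the labeling that follows the two visible groups gives a smaller within-cluster distance and a higher correlation than the labeling that splits them, which matches the direction of interpretation stated in the table.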