From: A practical approach to cluster validation in the energy sector
Goal | Explanation | Mathematical formulation | Simos Rank | Weight in % |
---|---|---|---|---|
The number of clusters should be as low as possible. | Since the resulting clusters are the basis for a subsequent optimization with high computation time, a lower number is favored. | $\max(I_{\text{parsimony}})$ | 6 | 33.0 |
Clusters should be relatively even in size. | The resulting representative driving & load profiles will be distributed according to their cluster size. If single clusters are overrepresented due to their size, the same driving & load profiles will be used repeatedly, which reduces the desired variance. | $\max(I_{\text{entropy}})$ | 6 | 33.0 |
Members of a cluster should be well represented by a specific datapoint within the dataset. | This is necessary in order to a) simulate driving & load profiles and b) ensure that the representative datapoint is as similar as possible to the other points in the cluster. Input features are a lower-dimensional representation of driving & load profiles. | $\max(I_{\text{cp2cent}})$ | 5 | 27.7 |
Clusters should be describable by a low number of features. | Besides having unique and distinguishable characteristics, clusters should be characterized by as few features as possible so that understandable “personas” can be created. | $\max(I_{\text{pps}})$ | 1 | 6.3 |
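
To illustrate how the weights in the last column could be used, the following minimal sketch combines the four indices into a single score per candidate clustering. The table itself does not state how the indices are aggregated, so the weighted average used here is an assumption, as is the premise that every index is scaled to [0, 1] with higher values being better. The function name `composite_score`, the dictionary keys, and the index values of the two candidates are hypothetical and serve only as an illustration.

```python
# Weights in % as listed in the table (derived from the Simos ranks).
WEIGHTS = {
    "parsimony": 33.0,
    "entropy": 33.0,
    "cp2cent": 27.7,
    "pps": 6.3,
}


def composite_score(indices, weights=WEIGHTS):
    """Weighted average of index values.

    Assumption: each index lies in [0, 1] and is to be maximized.
    """
    missing = set(weights) - set(indices)
    if missing:
        raise KeyError(f"missing index values: {sorted(missing)}")
    total = sum(weights.values())
    return sum(weights[name] * indices[name] for name in weights) / total


# Hypothetical index values for two candidate clusterings (not from the source).
candidates = {
    "k=4": {"parsimony": 0.9, "entropy": 0.8, "cp2cent": 0.6, "pps": 0.5},
    "k=8": {"parsimony": 0.5, "entropy": 0.9, "cp2cent": 0.8, "pps": 0.7},
}

scores = {name: composite_score(values) for name, values in candidates.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.4f}")
print("preferred candidate:", best)
```

With these made-up values, the candidate with fewer clusters comes out ahead because parsimony and entropy together carry about two thirds of the total weight, while the feature-count criterion contributes little.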