Skip to main content

Completeness Score

The completeness score is a mesure of how all memebers of a given class are assigned to the same cluster.

It measures how many similar samples are put together by the clustering algorithm, it if formally defined by this formula:

h=1โˆ’H(KโˆฃC)H(K)h = 1 - \frac{H(K|C)}{H(K)}
Where,H(KโˆฃC)=โˆ’โˆ‘k,cnkcNlognkcncWhere, H(K|C) = -\sum_{k,c} \frac {n_kc}{N} log \frac {n_kc}{n_c}

When all data points are assigned to one cluster, the completeness score is 1.(highest)

2.png

And when all data point are assigned their own cluster, the completeness score is 0.(lowest)

1.png

Here we can notice how homogeneity score is inversely proportional to the completeness score.

Higher the completeness score, lower the homogeneity score and vice versa.