
Hierarchical Clustering

What is Hierarchical Clustering?

Hierarchical Clustering often gives results very similar to K-means clustering, but the process of building the clusters is quite different.
There are two types of hierarchical clustering:

  • Agglomerative
    Agglomerative is the bottom-up approach: it starts with each point as its own cluster and repeatedly merges the closest clusters (a minimal sketch appears after this list).
    The agglomerative approach involves the following steps.

    • Step 1.
      Make each data point a single-point cluster. This forms N clusters.


    • Step 2.
      Take the two closest data points and make them one cluster. This forms N-1 clusters.


    • Step 3.
      Take the two closest clusters and make them one cluster. This forms N-2 clusters.


    • Step 4.
      Repeat Step 3 until there is only one cluster.



    • Fin
      All data points now belong to one single cluster.





  • Divisive
    Divisive is the top-down approach: it starts with all points in one cluster and repeatedly divides it into smaller clusters (see the sketch after this list).

    • Step 1.
      All points in the dataset belong to one single cluster.


    • Step 2.
      Partition the cluster into the two least similar clusters based on SSE (Sum of Squared Errors).


    • Step 3.
      Repeat Step 2 until the desired number of clusters is obtained.

    • Fin
      Since the desired number of clusters in this example is 3, the divisive process is complete.
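
To make the agglomerative steps above concrete, here is a minimal sketch in plain Python and NumPy. It assumes single linkage (the distance between the closest points of two clusters) and a NumPy array X of data points; the function name and the linkage choice are illustrative assumptions, not a fixed implementation.

import numpy as np

def agglomerative_clustering(X, n_clusters=1):
    # Step 1: every data point starts as its own single-point cluster (N clusters).
    clusters = [[i] for i in range(len(X))]
    # Steps 2-4: repeatedly merge the two closest clusters until n_clusters remain.
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single-linkage distance: the closest pair of points across the two clusters.
                d = min(np.linalg.norm(X[i] - X[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

For real datasets, library implementations such as SciPy or scikit-learn (used later in this post) are far more efficient than this naive loop.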


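The divisive steps can be sketched in a similar way. The snippet below assumes each split is performed by running 2-means on the cluster with the largest SSE; the function name and the splitting rule are assumptions for illustration only.

import numpy as np
from sklearn.cluster import KMeans

def divisive_clustering(X, n_clusters=3):
    # Step 1: all points start in one single cluster.
    clusters = [np.arange(len(X))]
    while len(clusters) < n_clusters:
        # Step 2: pick the cluster with the largest SSE around its centroid...
        sse = [((X[idx] - X[idx].mean(axis=0)) ** 2).sum() for idx in clusters]
        idx = clusters.pop(int(np.argmax(sse)))
        # ...and partition it into two sub-clusters (here with 2-means).
        labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X[idx])
        # Step 3: keep splitting until the desired number of clusters is reached.
        clusters.append(idx[labels == 0])
        clusters.append(idx[labels == 1])
    return clusters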


How to calculate the distance between clusters?



There are four common ways to calculate the distance between two clusters.

  • Option 1 (single linkage):
    The distance between the closest points in the two clusters.
  • Option 2 (complete linkage):
    The distance between the furthest points in the two clusters.
  • Option 3 (average linkage):
    The average of all pairwise distances between points in the two clusters.
  • Option 4 (centroid linkage):
    The distance between the centroids of the two clusters.

The choice of linkage can significantly affect the output.
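
For reference, SciPy's linkage function covers these four options under its standard method names: 'single' (Option 1), 'complete' (Option 2), 'average' (Option 3), and 'centroid' (Option 4). A quick sketch with randomly generated example data:

import numpy as np
import scipy.cluster.hierarchy as sch

data = np.random.rand(10, 2)              # example data: 10 points in 2 dimensions
for method in ['single', 'complete', 'average', 'centroid']:
    Z = sch.linkage(data, method=method)  # Options 1-4, in the same order
    print(method, Z[-1, 2])               # distance at which the final merge happens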

What is the purpose of agglomerative or divisive clustering?



The way hierarchical clustering works, it maintains a memory of how the clusters were built at every merge (or split). This memory is stored in a dendrogram.



What is a Dendrogram?

We created clusters above, and the dendrogram will help us extract the desired number of clusters from them.

Let’s assume that we have six data points and apply agglomerative hierarchical clustering.



As we see above, we find the two closest points and put them into one cluster.



Then, the dendrogram records the distance between those two points.
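
This recorded history is what SciPy stores in the linkage matrix. As a rough sketch, assuming six random 2-D points and the 'ward' method, each row records which two clusters were merged, the distance between them, and the size of the new cluster:

import numpy as np
import scipy.cluster.hierarchy as sch

points = np.random.rand(6, 2)             # six example data points
Z = sch.linkage(points, method='ward')    # build the merge history
print(Z)                                  # each row: cluster i, cluster j, merge distance, new size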



After that, the agglomerative steps are repeated; the images below show the progress.










How to choose the number of clusters?

When a straight line parallel to the X-axis is drawn across the dendrogram obtained through the above process, the number of vertical branches the line intersects is the number of clusters.



The optimal number of clusters is found where that horizontal line can be moved up or down the furthest, i.e. across the largest vertical gap between merge heights, without changing the number of branches it crosses.
In other words, in our results, the appropriate number of clusters is 2.
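
Cutting the dendrogram with such a horizontal line can be sketched with SciPy's fcluster. Assuming six random example points again, a cut placed between the two tallest merge heights crosses exactly two branches and therefore yields two clusters:

import numpy as np
import scipy.cluster.hierarchy as sch

points = np.random.rand(6, 2)
Z = sch.linkage(points, method='ward')
cut_height = (Z[-2, 2] + Z[-1, 2]) / 2    # midway between the two tallest merges
labels = sch.fcluster(Z, t=cut_height, criterion='distance')
print(labels)                             # two cluster labels for this cut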



Example



Code



import matplotlib.pyplot as plt
import scipy.cluster.hierarchy as sch

# Build the linkage matrix with Ward's method and plot the dendrogram.
dendrogram = sch.dendrogram(sch.linkage(X, method='ward'))
plt.show()


As you can see in the result below, the dendrogram suggests how many clusters to use; here the model is fit with five clusters.

from sklearn.cluster import AgglomerativeClustering

# Fit five clusters using Ward linkage on Euclidean distances.
# (In newer scikit-learn versions the 'affinity' argument is named 'metric'.)
hc = AgglomerativeClustering(n_clusters=5, affinity='euclidean', linkage='ward')
y_hc = hc.fit_predict(X)
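
If X is a two-dimensional NumPy array of features, one possible way to visualize the five clusters returned by fit_predict is a scatter plot per cluster (this snippet assumes exactly two feature columns):

import matplotlib.pyplot as plt

# Plot the points of each cluster with its own color and label.
for cluster in range(5):
    plt.scatter(X[y_hc == cluster, 0], X[y_hc == cluster, 1], label=f'Cluster {cluster + 1}')
plt.legend()
plt.show()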



Result



Dendrogram


Clustering





Implementation

Hierarchical Clustering

This post is licensed under CC BY 4.0 by the author.