What is Hierarchical Clustering?
Hierarchical clustering is a method of cluster analysis that builds a tree-like structure (called a dendrogram) that groups observations by similarity. This method can be agglomerative (starting with individual items and merging clusters) or divisive (starting with all items and splitting clusters). One of the key benefits of hierarchical clustering is that the tree-like structure provides a full picture of how clusters form at different similarity levels, allowing researchers to explore relationships at multiple scales rather than just ending with a single final set of clusters.
Hierarchical clustering helps analysts manage the tradeoff between simplifying the data into a manageable number of clusters and preserving the underlying structure and meaningful variation within the dataset.
Unlike K-means clustering, which requires analysts to specify the number of clusters in advance, hierarchical clustering builds a nested structure of groupings first, allowing analysts to decide how many clusters to keep after examining the results.
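The "build the tree first, choose the number of clusters later" workflow can be sketched in Python. This is a minimal illustration, assuming SciPy's `scipy.cluster.hierarchy` module is available; the toy data, the choice of average linkage, and the variable names are assumptions made for the example, not part of any particular analysis.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy data (an assumption for illustration): three tight groups of 2-D points.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(10, 2)) for c in centers])

# Build the full agglomerative merge tree once. No cluster count is needed here.
Z = linkage(X, method="average")

# Choose the number of clusters afterwards by cutting the same tree.
labels_k2 = fcluster(Z, t=2, criterion="maxclust")
labels_k3 = fcluster(Z, t=3, criterion="maxclust")

print("clusters when asking for 2:", len(set(labels_k2)))
print("clusters when asking for 3:", len(set(labels_k3)))
```

Because both labelings come from the same tree, the three-cluster solution is a refinement of the two-cluster one: no point has to be re-clustered from scratch when the analyst changes their mind about the number of groups.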
Using Hierarchical Clustering
Agglomerative hierarchical clustering can use several linkage criteria to decide which clusters to merge next: single linkage, which uses the shortest distance between any two points in the two clusters; complete linkage, which uses the largest such distance; and average linkage, which uses the average distance over all pairs of points drawn from the two clusters. These methods are used in various fields. For example, single linkage is often used in genetics to detect long, chain-like clusters, while complete linkage can create more compact, evenly sized groups. Average linkage offers a compromise between the two. Hierarchical clustering is widely applied in biology (e.g., classifying species or genes), marketing (e.g., customer segmentation), and text analysis.
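To make the three linkage definitions concrete, the sketch below computes each one for a pair of small clusters using only the Python standard library. The function names and the sample points are hypothetical, chosen for illustration, and Euclidean distance is assumed.

```python
from itertools import product
from math import dist

def pairwise_distances(cluster_a, cluster_b):
    """Euclidean distances between every point in cluster_a and every point in cluster_b."""
    return [dist(p, q) for p, q in product(cluster_a, cluster_b)]

def single_linkage(cluster_a, cluster_b):
    # Shortest distance between any two points, one from each cluster.
    return min(pairwise_distances(cluster_a, cluster_b))

def complete_linkage(cluster_a, cluster_b):
    # Largest distance between any two points, one from each cluster.
    return max(pairwise_distances(cluster_a, cluster_b))

def average_linkage(cluster_a, cluster_b):
    # Mean distance over all cross-cluster pairs.
    distances = pairwise_distances(cluster_a, cluster_b)
    return sum(distances) / len(distances)

a = [(0.0, 0.0), (1.0, 0.0)]
b = [(3.0, 0.0), (5.0, 0.0)]

print(single_linkage(a, b))    # 2.0
print(complete_linkage(a, b))  # 5.0
print(average_linkage(a, b))   # 3.5
```

The same two clusters are 2.0, 5.0, or 3.5 units apart depending on the criterion, which is why the choice of linkage can change the order of merges and the shape of the resulting dendrogram.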
Beyond just building a cluster tree, analysts often apply statistical criteria to decide how many clusters to keep. Tools like the inconsistency coefficient, gap statistic, and elbow method help evaluate where meaningful separations exist in the hierarchy. These metrics aim to identify a point in the clustering process where combining clusters would start to group dissimilar items, signaling a natural stopping point for defining distinct groups.
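Two of these criteria can be sketched with SciPy. The example below is an illustration under assumptions (synthetic data with three well-separated groups, average linkage, and a simple "largest jump in merge height" reading of the elbow idea); it is not a definitive procedure, and the gap statistic is omitted because it needs a separate resampling step.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, inconsistent

# Synthetic data (an assumption for illustration): three separated groups.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [8.0, 0.0], [0.0, 8.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(15, 2)) for c in centers])

Z = linkage(X, method="average")

# Elbow-style heuristic: look for the largest jump between consecutive
# merge heights. A big jump means the next merge joins dissimilar clusters.
heights = Z[:, 2]
gaps = np.diff(heights)
i = int(np.argmax(gaps))
# Stopping after merge i (i + 1 merges in total) leaves n - (i + 1) clusters.
k_elbow = len(X) - (i + 1)
print("suggested number of clusters:", k_elbow)

# Inconsistency coefficient: compares each merge height with the heights of
# the merges up to `d` levels below it. Columns are mean, standard deviation,
# number of links considered, and the coefficient itself.
R = inconsistent(Z, d=2)
print("inconsistency of the last three merges:", R[-3:, 3])
```

Merges with a high inconsistency coefficient are candidates for the "natural stopping point" described above, though the threshold that counts as high is a judgment call that depends on the data.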
For example, a researcher might use hierarchical clustering to group customers based on their purchase histories, revealing distinct customer segments for targeted marketing. By examining the full dendrogram, the researcher can explore how customer groups combine or split at different similarity levels — for instance, identifying broad categories like high-, medium-, and low-value customers or drilling down into more detailed subgroups within each category.
Adjusting the height at which the dendrogram is cut lets the researcher move between these broad categories and finer subgroups, and choose the clustering level that best fits the marketing goals.
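The effect of moving the cutoff height can be sketched with a small, made-up customer table. Everything in this example is hypothetical: the two features (annual spend and orders per year), the ten customers, the use of complete linkage, and the cutoff values 400 and 100, which were picked by hand to suit this toy data.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical customers: [annual spend, orders per year].
# Features are left unscaled to keep the example short, so spend dominates
# the distances; in practice features are usually standardized first.
customers = np.array([
    [100, 2], [120, 3], [90, 2],    # low-value
    [500, 10], [520, 11], [480, 9], # medium-value
    [1500, 30], [1550, 32],         # high-value, subgroup A
    [1800, 36], [1850, 38],         # high-value, subgroup B
], dtype=float)

Z = linkage(customers, method="complete")

# A higher cutoff merges more clusters, giving broad categories.
broad = fcluster(Z, t=400, criterion="distance")

# A lower cutoff keeps more clusters, exposing subgroups.
detailed = fcluster(Z, t=100, criterion="distance")

print("broad segments:   ", len(set(broad)))
print("detailed segments:", len(set(detailed)))
```

On this toy data the higher cutoff yields the three low-, medium-, and high-value segments, while the lower cutoff additionally splits the high-value customers into two subgroups. Plotting the tree with `scipy.cluster.hierarchy.dendrogram` is a common way to choose such cutoffs visually.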