
7 - Hierarchical Clustering

ThrillingTuba

17 Questions

What is the purpose of combining rows and columns using Lance-Williams update equations?

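To update the distance matrix after a merge: the rows and columns of the two merged clusters are replaced by a single row and column holding the merged cluster's distances to all other clusters, derived from distances already in the matrix. Finding the position of the minimum distance or maximum similarity is the preceding step, which selects the pair to merge.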

What is the general form of Lance-Williams update equations?

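d(A∪B, C) = α_A·d(A, C) + α_B·d(B, C) + β·d(A, B) + γ·|d(A, C) − d(B, C)|, where the coefficients α_A, α_B, β and γ depend on the linkage. Several (but not all) linkages can be expressed in this form.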
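As a minimal sketch of such an update, the snippet below assumes the usual four-coefficient form d(A∪B, C) = α_A·d(A, C) + α_B·d(B, C) + β·d(A, B) + γ·|d(A, C) − d(B, C)| and fills in the standard coefficients for single, complete and average linkage only. The function names `coefficients` and `lance_williams` are illustrative and do not come from the quiz.

```python
def coefficients(linkage, size_a, size_b):
    """Return (alpha_a, alpha_b, beta, gamma) for a few well-known linkages."""
    if linkage == "single":
        return 0.5, 0.5, 0.0, -0.5
    if linkage == "complete":
        return 0.5, 0.5, 0.0, 0.5
    if linkage == "average":
        total = size_a + size_b
        return size_a / total, size_b / total, 0.0, 0.0
    raise ValueError(f"unsupported linkage: {linkage}")


def lance_williams(linkage, d_ac, d_bc, d_ab, size_a, size_b):
    """Distance from the merged cluster (A and B) to another cluster C."""
    alpha_a, alpha_b, beta, gamma = coefficients(linkage, size_a, size_b)
    return alpha_a * d_ac + alpha_b * d_bc + beta * d_ab + gamma * abs(d_ac - d_bc)


# Made-up example values: d(A,C) = 4, d(B,C) = 10, d(A,B) = 3, |A| = 1, |B| = 3.
single = lance_williams("single", 4.0, 10.0, 3.0, 1, 3)      # min(4, 10)
complete = lance_williams("complete", 4.0, 10.0, 3.0, 1, 3)  # max(4, 10)
average = lance_williams("average", 4.0, 10.0, 3.0, 1, 3)    # size-weighted mean
print(single, complete, average)
```

With γ = −0.5 the formula reduces to the minimum of the two old distances, and with γ = +0.5 to the maximum, which is why single and complete linkage fit this scheme.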

Which linkages cannot be computed with Lance-Williams updates?

MiniMax, Medoid, Hausdorff linkages.

What is the purpose of extracting clusters from a dendrogram?

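To obtain a flat set of clusters from the hierarchical structure, for example by cutting the dendrogram at a chosen height or by selecting individual branches.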
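One common way to extract clusters is to cut the dendrogram at a height threshold. The sketch below assumes a hypothetical representation in which each merge is a tuple (object in first cluster, object in second cluster, merge height); `cut_dendrogram` is an illustrative name, not a library function.

```python
def cut_dendrogram(n_objects, merges, max_height):
    """Apply all merges up to max_height and return one cluster label per object."""
    parent = list(range(n_objects))

    def find(x):
        # Follow parent links to the representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, height in merges:
        if height <= max_height:
            parent[find(a)] = find(b)

    # Relabel representatives as 0, 1, 2, ... in order of first appearance.
    labels, seen = [], {}
    for obj in range(n_objects):
        root = find(obj)
        labels.append(seen.setdefault(root, len(seen)))
    return labels


# Made-up dendrogram over 5 objects.
merges = [(0, 1, 1.0), (3, 4, 1.5), (1, 2, 2.0), (2, 3, 6.0)]
print(cut_dendrogram(5, merges, max_height=2.5))  # [0, 0, 0, 1, 1]
```

Raising the threshold above the last merge height yields a single cluster; lowering it below the first merge height leaves every object on its own.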

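Why is it recommended to avoid computing cluster distances directly in hierarchical clustering?

Direct computation is expensive: recomputing a linkage from the underlying objects after every merge repeats work that an update of the existing distance matrix avoids.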

What is the basic idea behind Hierarchical Agglomerative Clustering?

To initially consider every object as a cluster and then iteratively merge the two most similar clusters until only one cluster remains.
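The merge loop described above can be sketched naively as follows. This version assumes one-dimensional points and recomputes the cluster distance from the objects in every round, which is simple but slow; `naive_hac` and `single_link` are illustrative names.

```python
from itertools import combinations


def single_link(cluster_a, cluster_b):
    """Smallest distance between any two 1-D points from the two clusters."""
    return min(abs(x - y) for x in cluster_a for y in cluster_b)


def naive_hac(points, cluster_distance=single_link):
    """Merge the two closest clusters until one remains; return the merge history."""
    clusters = [[p] for p in points]  # every object starts as its own cluster
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        dist = cluster_distance(clusters[i], clusters[j])
        merges.append((clusters[i], clusters[j], dist))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


for left, right, dist in naive_hac([1.0, 2.0, 6.0, 7.0, 15.0]):
    print(left, right, dist)
```

For n objects the loop always performs n − 1 merges; the recorded merge distances are what a dendrogram plots on its height axis.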

What are some variations in Hierarchical Agglomerative Clustering?

Variations include different distance or similarity measures for objects, different distance measures for clusters after merging (Linkage), and various optimizations.

What is Single-linkage in terms of distances of clusters?

Single-linkage defines the distance between two clusters as the minimum distance between any two objects, one from each cluster (equivalently, their maximum similarity).

Describe the Complete-linkage approach in Hierarchical Agglomerative Clustering.

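Complete-linkage defines the distance between two clusters as the maximum distance between any two objects, one from each cluster (equivalently, their minimum similarity). The pair of clusters with the smallest such distance is merged.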
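To make the difference between single and complete linkage concrete, here is a small sketch that computes both for two clusters of 2-D points, assuming Euclidean distance. The function names and the example points are made up for illustration.

```python
from math import dist  # Euclidean distance between two points


def single_linkage(cluster_a, cluster_b):
    """Minimum distance over all cross-cluster pairs."""
    return min(dist(a, b) for a in cluster_a for b in cluster_b)


def complete_linkage(cluster_a, cluster_b):
    """Maximum distance over all cross-cluster pairs."""
    return max(dist(a, b) for a in cluster_a for b in cluster_b)


left = [(0.0, 0.0), (0.0, 3.0)]
right = [(4.0, 0.0), (4.0, 3.0)]
print(single_linkage(left, right))    # closest pair, 4 apart
print(complete_linkage(left, right))  # farthest pair, 5 apart (3-4-5 triangle)
```

Single linkage looks only at the closest pair, so it tends to chain clusters together through nearby points; complete linkage looks at the farthest pair and favors compact clusters.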

What is the concept of Ward-linkage in Hierarchical Agglomerative Clustering?

Ward-linkage focuses on the minimum increase of squared error when merging clusters.
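The increase in squared error has a closed form that avoids recomputing the error of the merged cluster: |A|·|B| / (|A| + |B|) times the squared distance between the two cluster means. The sketch below checks this against the direct computation, assuming one-dimensional data for brevity; all names are illustrative.

```python
def mean(values):
    return sum(values) / len(values)


def sse(cluster):
    """Sum of squared deviations from the cluster mean."""
    center = mean(cluster)
    return sum((x - center) ** 2 for x in cluster)


def ward_increase(cluster_a, cluster_b):
    """Increase in SSE caused by merging the two clusters (closed form)."""
    size_a, size_b = len(cluster_a), len(cluster_b)
    gap = mean(cluster_a) - mean(cluster_b)
    return size_a * size_b / (size_a + size_b) * gap ** 2


a, b = [1.0, 3.0], [6.0, 8.0, 10.0]
direct = sse(a + b) - sse(a) - sse(b)  # recompute everything from the objects
print(direct, ward_increase(a, b))     # both are about 43.2
```

Some texts scale this quantity differently (for example by a constant factor or a square root); the ordering of candidate merges is the same.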

Explain the process of AGNES (Agglomerative Nesting) in clustering.

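AGNES computes the pairwise distance matrix of the objects, then repeatedly finds the position of the minimum distance, merges those two clusters, and combines their rows and columns using the Lance-Williams equations, until only one cluster remains.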
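A compact sketch of this matrix-based procedure is shown below. It assumes a symmetric distance matrix given as a list of lists and supports only single and complete linkage, using the Lance-Williams coefficients α = 0.5, β = 0 and γ = ∓0.5; the function name `agnes` and the return format are illustrative choices.

```python
def agnes(dist_matrix, linkage="single"):
    """Return the merge history as (kept index, absorbed index, distance) tuples."""
    gamma = {"single": -0.5, "complete": 0.5}[linkage]
    d = [row[:] for row in dist_matrix]  # work on a copy
    active = list(range(len(d)))         # indices of clusters still alive
    merges = []
    while len(active) > 1:
        # Find the position of the minimum distance among active clusters.
        i, j = min(
            ((a, b) for a in active for b in active if a < b),
            key=lambda ab: d[ab[0]][ab[1]],
        )
        merges.append((i, j, d[i][j]))
        # Combine row/column j into row/column i with the Lance-Williams update.
        for c in active:
            if c not in (i, j):
                new = 0.5 * d[i][c] + 0.5 * d[j][c] + gamma * abs(d[i][c] - d[j][c])
                d[i][c] = d[c][i] = new
        active.remove(j)
    return merges


points = [1.0, 2.0, 6.0, 7.0, 15.0]
matrix = [[abs(x - y) for y in points] for x in points]
print(agnes(matrix, "single"))
print(agnes(matrix, "complete"))
```

After the initial matrix is built, the raw objects are never touched again; every later distance comes from the matrix itself.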

What are some strategies to determine when to stop the hierarchical agglomerative clustering process?

visually inspect the dendrogram, choose interesting branches; stop when k clusters remain; stop at a certain distance; significant change in distance; significance via bootstrap resampling; change in cluster sizes or density; constraints satisfied (semi-supervised)
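As one possible reading of the "significant change in distance" criterion, the heuristic sketch below looks for the largest jump between consecutive merge distances and stops just before it. This is an assumption about how the criterion might be applied, not a method stated in the quiz, and `clusters_at_largest_gap` is a made-up name.

```python
def clusters_at_largest_gap(merge_heights):
    """merge_heights: the n-1 merge distances in merge order (at least two values).

    Returns (number of clusters, cut height) just before the largest jump.
    """
    n_objects = len(merge_heights) + 1
    gaps = [later - earlier for earlier, later in zip(merge_heights, merge_heights[1:])]
    step = max(range(len(gaps)), key=gaps.__getitem__)  # merges 0..step are kept
    clusters_left = n_objects - (step + 1)
    cut_height = (merge_heights[step] + merge_heights[step + 1]) / 2
    return clusters_left, cut_height


# Made-up merge distances for 5 objects: the last merge is much more expensive.
print(clusters_at_largest_gap([1.0, 1.5, 2.0, 6.0]))  # (2, 4.0)
```

The simpler criteria from the list, stopping at k clusters or at a fixed distance, amount to choosing `clusters_left` or `cut_height` directly.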

What are some benefits of Hierarchical Agglomerative Clustering (HAC)?

very general: any distance / similarity; easy to understand and interpret; hierarchical result; dendrogram visualization often useful; number of clusters does not need to be fixed beforehand; many variants

What is the main limitation of Hierarchical Agglomerative Clustering (HAC) in terms of scalability?

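scalability is the main problem: the standard algorithm stores the full pairwise distance matrix, so memory grows quadratically with the number of objects, and the basic merge loop takes cubic time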

What is a common issue that users face with Hierarchical Agglomerative Clustering (HAC) in terms of the desired clustering outcome?

in many cases, users want a flat partitioning

What are some challenges or limitations of Hierarchical Agglomerative Clustering (HAC) related to cluster sizes and outliers?

unbalanced cluster sizes; outliers/noise for some linkage strategies

What is one approach to incorporating supervision in Hierarchical Agglomerative Clustering (HAC)?

constraints satisfied (semi-supervised): certain objects are labeled as 'must' or 'should not' be in the same clusters

Learn about the Hierarchical Agglomerative Clustering method, one of the earliest clustering techniques. Understand the process of merging similar clusters until only one cluster remains, and explore variations in distance measures and optimizations. Discover how to plot dendrograms and select interesting subtrees.
