3. Basics of Clustering and k-means Clustering PDF
Document Details
Uploaded by FinerLimeTree
University of Southern Denmark - SDU
Tags
Summary
This document discusses the basics of clustering and k-means clustering. It explains that identifying good clusterings requires heuristic solutions due to the computational complexity. It also briefly introduces challenges in clustering, such as computationally intensive search for good solutions and the lack of a universal quality measuring function.
Full Transcript
Challenges Assume we are given a hypothetical quality function q to decide for a given partition of a set of n points whether or not this partition constitutes a (good) clustering. ► na"ive method: test q on all possible clusterings with k partitions (clusters) (2 k ?) ► problems:...
Challenges Assume we are given a hypothetical quality function q to decide for a given partition of a set of n points whether or not this partition constitutes a (good) clustering. ► na"ive method: test q on all possible clusterings with k partitions (clusters) (2 k ?) ► problems: ► there are O(kn ) many partitions ink clusters ► and we don't have this function q, actually Therefore, we need heuristic solutions for both problems: ► efficient search for solutions ► efficient and effective modelling of q There are many such heuristic solutions around, that is, we have a plethora of clustering algorithms in the literature.