ML Visualization

K-Means Clustering

Unsupervised & Dim. Reduction · Beginner · ~7 min

K-means is an unsupervised clustering algorithm that partitions data into k groups by alternating between assigning each point to its nearest centroid and moving each centroid to the mean of its members, minimizing within-cluster variance.

K-means finds groups in your data by repeating two simple steps: assign each point to the nearest center, then move each center to the middle of its group. Repeat until nothing changes.

[Interactive demo: points colored by cluster (Cluster 1, Cluster 2, Cluster 3), centroid markers, an iteration counter, and an inertia (within-cluster sum of squares) readout.]

Tip: drag any centroid to choose its starting position, then press Play.

The idea in plain words

K-means finds groups by repeating two steps until nothing changes: assign each point to its nearest center, then move each center to the average of its points. Step through the iterations and watch the centers slide into place.
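The two steps can be sketched in a few lines of plain Python. This is a minimal illustration, not the code behind the demo: the function names (`assign`, `update`, `kmeans`), the `max_iter` cap, and the choice to leave an empty cluster's centroid where it was are all assumptions made for the example.

```python
import math

def assign(points, centroids):
    """Assignment step: index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]

def update(points, labels, centroids):
    """Update step: move each centroid to the mean of its assigned points."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if not members:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(old)
            continue
        new_centroids.append(
            tuple(sum(p[d] for p in members) / len(members) for d in range(len(old)))
        )
    return new_centroids

def kmeans(points, centroids, max_iter=100):
    """Alternate the two steps until the assignments stop changing."""
    labels = None
    for _ in range(max_iter):
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break  # nothing changed, so the centroids would not move either
        labels = new_labels
        centroids = update(points, labels, centroids)
    return centroids, labels

points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centroids, labels = kmeans(points, [(1.0, 1.0), (2.0, 1.0)])
print(centroids)  # [(0.0, 1.0), (10.0, 1.0)]
print(labels)     # [0, 0, 1, 1]
```

Both starting centers sit near the left pair of points, yet after one assign and update pass each center has moved to the middle of its own pair, and the second pass confirms that nothing changes.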

Because the objective isn’t convex, the algorithm can converge to an obviously wrong grouping when it starts from a bad placement; crowd the centers in one corner to see it. Unlike k-nearest neighbors, k-means uses no labels; it discovers structure on its own.

Now, the math

It minimizes the within-cluster sum of squares (inertia):

$$J = \sum_{i} \lVert x_i - \mu_{c(i)} \rVert^2$$

$x_i$: a data point.
$\mu_{c(i)}$: the centroid of the cluster that point $i$ is assigned to.
$J$: total inertia; smaller means tighter clusters.
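As a small worked example of the formula, the sketch below computes inertia for three hand-picked points. The helper name `inertia` and the sample data are made up for illustration; `labels` plays the role of c(i).

```python
import math

def inertia(points, centroids, labels):
    """Within-cluster sum of squares: sum of squared point-to-centroid distances."""
    return sum(math.dist(p, centroids[c]) ** 2 for p, c in zip(points, labels))

points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
centroids = [(1.0, 0.0), (10.0, 0.0)]
labels = [0, 0, 1]  # c(i): index of the centroid each point is assigned to

print(inertia(points, centroids, labels))  # 2.0
```

The first two points each lie one unit from their centroid and the third sits exactly on its own, so the total is 1 + 1 + 0 = 2.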

Neither the assign step nor the update step can increase J, and there are only finitely many ways to assign points to clusters, so the algorithm always converges, but only to a local minimum. In practice you run it several times from different starts (or seed it with k-means++) and keep the lowest-inertia result.
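The claim that neither step increases J can be written as one inequality per step. The sketch below uses superscripts $(t)$ for the iteration and $C_j$ for the set of points assigned to cluster $j$; this notation is introduced here for illustration, and every cluster is assumed to be non-empty.

```latex
% Assignment step: centroids fixed, each point switches to its nearest centroid.
\lVert x_i - \mu^{(t)}_{c^{(t+1)}(i)} \rVert^2
  = \min_{j} \lVert x_i - \mu^{(t)}_{j} \rVert^2
  \le \lVert x_i - \mu^{(t)}_{c^{(t)}(i)} \rVert^2
  \quad \text{for every } i
  \;\Longrightarrow\;
  J\bigl(c^{(t+1)}, \mu^{(t)}\bigr) \le J\bigl(c^{(t)}, \mu^{(t)}\bigr)

% Update step: assignments fixed, the mean minimizes the squared distances
% within each cluster C_j = \{\, i : c^{(t+1)}(i) = j \,\}.
\frac{\partial}{\partial \mu_j} \sum_{i \in C_j} \lVert x_i - \mu_j \rVert^2
  = -2 \sum_{i \in C_j} (x_i - \mu_j) = 0
  \;\Longrightarrow\;
  \mu^{(t+1)}_j = \frac{1}{\lvert C_j \rvert} \sum_{i \in C_j} x_i
  \;\Longrightarrow\;
  J\bigl(c^{(t+1)}, \mu^{(t+1)}\bigr) \le J\bigl(c^{(t+1)}, \mu^{(t)}\bigr)
```

Chaining the two inequalities shows that J after a full iteration is no larger than J before it.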

Now Break It

Try this: Give the centroids a bad initial placement and watch the algorithm get stuck in a poor local minimum, with clusters that are obviously wrong.

Control: Drag centroids to adversarial starting positions (e.g., all in one corner)
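The same experiment can be reproduced outside the demo. The sketch below is a compact 1-D version written for this example only: the helper `kmeans_1d`, the six sample values, the seed, and the count of 20 restarts are all assumptions. It crowds three centers at the left edge, then compares that run with the best of several random starts.

```python
import random

def kmeans_1d(xs, centers, max_iter=100):
    """Compact 1-D k-means; returns (centers, inertia)."""
    for _ in range(max_iter):
        # Assign each value to its nearest center.
        groups = [[] for _ in centers]
        for x in xs:
            nearest = min(range(len(centers)), key=lambda j: abs(x - centers[j]))
            groups[nearest].append(x)
        # Move each center to the mean of its group (empty groups stay put).
        new_centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    inertia = sum(min((x - c) ** 2 for c in centers) for x in xs)
    return centers, inertia

xs = [0.0, 1.0, 10.0, 11.0, 20.0, 21.0]  # three obvious pairs

# Adversarial start: all three centers crowded at the left edge.
bad_centers, bad_inertia = kmeans_1d(xs, [0.0, 1.0, 2.0])
print(bad_centers, bad_inertia)  # [0.0, 1.0, 15.5] 101.0

# Remedy: several random starts (centers drawn from the data), keep the lowest inertia.
rng = random.Random(0)
runs = [kmeans_1d(xs, rng.sample(xs, 3)) for _ in range(20)]
best_centers, best_inertia = min(runs, key=lambda run: run[1])
print(sorted(best_centers), best_inertia)
```

In the crowded run, two centers split the left pair between them while the third is left to cover the other four values, and no single assign or update step can fix that. For comparison, the grouping that gives each pair its own center (0.5, 10.5, 20.5) has an inertia of 1.5.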
