Skip to content
ML Visualization

DBSCAN

Unsupervised & Dim. ReductionIntermediate~7 min

DBSCANCluster by density; label sparse points as noise.

DBSCAN finds clusters as dense regions separated by sparse gaps. Unlike k-means, it discovers the number of clusters itself, handles weird shapes, and calls isolated points noise.

  • Cluster A
  • Cluster B
  • Noise
Dataset
0.80
4

On two moons DBSCAN separates the crescents that k-means cannot, leaving stray points gray as noise. Drag any point or click empty space to add one — the clustering re-runs live. Shrink ε to make everything noise; grow it to merge into one blob.

The idea in plain words

DBSCAN finds clusters as dense regions separated by sparse gaps. A point is a “core” point if enough neighbors sit within a radius ε; clusters grow by connecting core points and their neighbors, and anything left isolated is labeled noise.

Unlike k-means, it discovers the number of clusters itself and handles arbitrary shapes — on two moons it cleanly separates the crescents that k-means fundamentally cannot.

Now, the math

A point p is a core point when its ε-neighborhood is dense enough:

Nε(p)minPts|N_\varepsilon(p)| \geq \text{minPts}
ε\varepsilon
the neighborhood radius — the density scale.
minPts\text{minPts}
how many neighbors within ε make a point a core point.
Nε(p)N_\varepsilon(p)
the set of points within ε of p.
Show the derivation

Clusters are maximal sets of density-connected points: start at any core point and absorb every point reachable through a chain of core-point neighborhoods. ε sets the density threshold — too small and every point is isolated noise; too large and separate clusters merge into one blob.

Now Break It

Try this: Wrong epsilon either merges everything into one blob or marks almost every point as noise.

Control: Epsilon slider (set very small or very large)

Last updated .