DBSCAN
DBSCAN — Cluster by density; label sparse points as noise.
DBSCAN finds clusters as dense regions separated by sparse gaps. Unlike k-means, it discovers the number of clusters itself, handles weird shapes, and calls isolated points noise.
- Cluster A
- Cluster B
- Noise
On two moons DBSCAN separates the crescents that k-means cannot, leaving stray points gray as noise. Drag any point or click empty space to add one — the clustering re-runs live. Shrink ε to make everything noise; grow it to merge into one blob.
On two moons DBSCAN separates the crescents that k-means cannot, leaving stray points gray as noise. Drag any point or click empty space to add one — the clustering re-runs live. Shrink ε to make everything noise; grow it to merge into one blob.
The idea in plain words
DBSCAN finds clusters as dense regions separated by sparse gaps. A point is a “core” point if enough neighbors sit within a radius ε; clusters grow by connecting core points and their neighbors, and anything left isolated is labeled noise.
Unlike k-means, it discovers the number of clusters itself and handles arbitrary shapes — on two moons it cleanly separates the crescents that k-means fundamentally cannot.
Now, the math
A point p is a core point when its ε-neighborhood is dense enough:
- the neighborhood radius — the density scale.
- how many neighbors within ε make a point a core point.
- the set of points within ε of p.
▸ Show the derivation
Clusters are maximal sets of density-connected points: start at any core point and absorb every point reachable through a chain of core-point neighborhoods. ε sets the density threshold — too small and every point is isolated noise; too large and separate clusters merge into one blob.
Now Break It
Try this: Wrong epsilon either merges everything into one blob or marks almost every point as noise.
Control: Epsilon slider (set very small or very large)
Last updated .