ML Visualization

UMAP

Unsupervised & Dim. Reduction · Advanced · ~8 min

UMAP: Fast manifold embedding preserving local and some global structure.

UMAP is a faster cousin of t-SNE that tends to preserve more of the data's global structure. It builds a nearest-neighbor graph and lays it out in 2D, revealing clusters while keeping their relative arrangement more faithful to the data.

[Interactive figure: UMAP and t-SNE embeddings of the same data, shown side by side. Legend: Cluster 0, Cluster 1, Cluster 2.]

Same data, two embeddings running side by side. UMAP converges quickly and, more than t-SNE, tends to keep clusters that are close in the data close in the layout (more global structure).


Very small n_neighbors fragments the manifold into disconnected islands. (This is a simplified, in-spirit UMAP for teaching — not the full fuzzy-simplicial optimizer.)

The idea in plain words

UMAP is a faster cousin of t-SNE. It builds a graph connecting each point to its nearest neighbors, then lays that graph out in 2-D with attractive forces along edges and repulsion elsewhere. Run both on the same data and the contrast is the lesson.
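The graph-then-layout idea can be sketched in a few lines of NumPy. This is an illustrative toy in the same spirit as the demo, not the real UMAP optimizer: the function names `knn_indices` and `force_layout`, the learning rate, and the single-random-point repulsion are all assumptions made for this example.

```python
import numpy as np

def knn_indices(X, n_neighbors):
    """For each point, return the indices of its n_neighbors nearest points."""
    # Pairwise squared Euclidean distances, shape (n, n).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbor
    return np.argsort(d2, axis=1)[:, :n_neighbors]

def force_layout(X, n_neighbors=12, n_iter=200, lr=0.05, seed=0):
    """Toy 2-D layout: attraction along kNN edges, repulsion from random points."""
    rng = np.random.default_rng(seed)
    n = len(X)
    nbrs = knn_indices(X, n_neighbors)
    Y = rng.normal(size=(n, 2))  # random initial layout
    for _ in range(n_iter):
        # Attraction: pull each point toward the centroid of its graph neighbors.
        pull = Y[nbrs].mean(axis=1) - Y
        # Repulsion: push each point away from one randomly sampled point
        # (a crude stand-in for "repulsion elsewhere"; the sample may
        # occasionally be a neighbor, which this sketch ignores).
        other = rng.integers(0, n, size=n)
        away = Y - Y[other]
        push = away / ((away ** 2).sum(axis=1, keepdims=True) + 1e-3)
        Y = Y + lr * (pull + push)
    return Y

# Three synthetic clusters in 5-D, laid out in 2-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(30, 5)) for c in (0.0, 3.0, 6.0)])
Y = force_layout(X)
print(Y.shape)  # (90, 2)
```

Averaging over neighbors keeps the attractive step a convex combination of positions, which keeps this toy stable without tuning.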

UMAP tends to preserve more global structure and converges faster, so clusters that are related in the data often stay closer in the layout. Very small n_neighbors fragments the manifold into disconnected islands.
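One way to see the fragmentation effect is to count the connected components of the neighbor graph as n_neighbors changes. The sketch below is a hypothetical helper (`count_components` is not part of any UMAP library) and uses synthetic, well-separated blobs as assumed data.

```python
import numpy as np

def count_components(X, n_neighbors):
    """Count connected components of the (undirected) k-nearest-neighbor graph."""
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # exclude self-neighbors
    nbrs = np.argsort(d2, axis=1)[:, :n_neighbors]

    # Union-find with path halving.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in nbrs[i]:
            parent[find(i)] = find(int(j))  # merge the two components
    return len({find(i) for i in range(n)})

# Three well-separated blobs of 20 points each (synthetic data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(20, 2)) for c in (0.0, 10.0, 20.0)])

for k in (2, 5, 25):
    print(k, count_components(X, k))
```

With a small k, every neighbor of a point lies inside its own blob, so the graph has at least three islands. Once k exceeds the blob size, each point must link to another blob and the graph becomes a single component.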

Now, the math

UMAP optimizes a low-D layout to match a fuzzy neighbor graph, balancing two forces:

$$\underbrace{\text{attract}}_{\text{graph edges}} \;\;\leftrightarrow\;\; \underbrace{\text{repel}}_{\text{non-neighbors}}$$

n_neighbors: graph connectivity; small values emphasize local, fine structure.
min_dist: how tightly points may pack in the layout.
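To make min_dist more concrete: in the full algorithm, as far as I understand it, the low-dimensional similarity curve is fitted to a target that stays at 1 for distances up to min_dist and decays exponentially beyond it. The sketch below shows only that target shape. The function name `target_similarity` and the `spread` parameter are assumptions for this example and do not appear in the demo above.

```python
import numpy as np

def target_similarity(d, min_dist=0.4, spread=1.0):
    """Target layout similarity: 1 within min_dist, exponential decay beyond it."""
    d = np.asarray(d, dtype=float)
    return np.where(d <= min_dist, 1.0, np.exp(-(d - min_dist) / spread))

distances = np.array([0.0, 0.2, 0.4, 1.4])
print(target_similarity(distances, min_dist=0.4))  # flat at 1 up to 0.4, then decays
print(target_similarity(distances, min_dist=0.0))  # decays immediately
```

A larger min_dist widens the flat region, so points closer than min_dist gain nothing from packing tighter and the layout spreads out. A value near zero rewards tight packing and yields denser clumps.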

UMAP models neighborhoods as a fuzzy simplicial set and minimizes a cross-entropy between the high-D and low-D graphs. Fewer neighbors means a sparser graph, so weakly-connected regions drift apart into islands. (This build is a simplified force-directed stand-in for teaching, not the full optimizer.)
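For reference, the cross-entropy can be written out as follows. The symbols are assumptions introduced here, not notation from the page: $v_{ij} \in [0, 1]$ is the membership strength of edge $(i, j)$ in the high-D fuzzy graph, and $w_{ij} \in (0, 1)$ is the corresponding similarity computed from the low-D layout.

```latex
\[
C \;=\; \sum_{i \neq j} \Big[\,
  \underbrace{v_{ij}\,\log\frac{v_{ij}}{w_{ij}}}_{\text{attract along graph edges}}
  \;+\;
  \underbrace{(1 - v_{ij})\,\log\frac{1 - v_{ij}}{1 - w_{ij}}}_{\text{repel non-neighbors}}
\,\Big]
\]
```

The first term is large when a strong high-D edge ($v_{ij}$ near 1) has a low layout similarity, so minimizing it pulls neighbors together. The second term is large when a non-edge ($v_{ij}$ near 0) has a high layout similarity, so minimizing it pushes non-neighbors apart. These are the two forces in the attract/repel balance above.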

Now Break It

Try this: Very small n_neighbors fragments the manifold into disconnected islands.

Control: n_neighbors slider (set very low)
