Principal Component Analysis
Project data onto the directions of greatest variance.
PCA finds the directions in which your data varies most and projects onto them, compressing many correlated features into a few uncorrelated axes while keeping as much of the variance as possible.
[Interactive figure. Legend: data, projection axis, first PC, residuals.]
Drag to rotate the projection axis. The variance meter peaks — and the axis turns amber — exactly at the first principal component.
The idea in plain words
PCA searches for the axis along which the projected data is most spread out. Rotate the projection axis by hand and the “variance captured” meter peaks exactly at the first principal component, so you discover PCA instead of being told it.
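The by-hand rotation can be imitated in a few lines. This is a minimal sketch, assuming NumPy and a made-up 2-D dataset; the covariance values, the seed, and the 0.1-degree step are illustrative choices, not taken from the demo. It sweeps the axis through every angle, records the variance of the projected points, and compares the best angle with the first principal component.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, an arbitrary choice for repeatability

# Hypothetical toy data: 500 correlated 2-D points.
cov_true = np.array([[3.0, 1.5],
                     [1.5, 1.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov_true, size=500)
Xc = X - X.mean(axis=0)  # PCA works on centered data

# "Drag" the axis: try every angle from 0 to 180 degrees in 0.1-degree steps.
angles = np.deg2rad(np.arange(0.0, 180.0, 0.1))
directions = np.column_stack([np.cos(angles), np.sin(angles)])  # unit vectors
captured = (Xc @ directions.T).var(axis=0, ddof=1)  # variance along each axis

best_angle = angles[np.argmax(captured)]

# Compare with the first principal component from the covariance matrix.
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
pc1 = eigvecs[:, -1]  # eigh returns eigenvalues in ascending order
pc1_angle = np.arctan2(pc1[1], pc1[0]) % np.pi  # an axis has no sign: fold to [0, pi)

print(f"best angle by sweep: {np.rad2deg(best_angle):.1f} deg")
print(f"first PC angle:      {np.rad2deg(pc1_angle):.1f} deg")
```

The two angles should agree to within the sweep resolution, and the peak of `captured` should never exceed the largest eigenvalue.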
It’s a linear method, though. Hand it a curved manifold like a swiss roll and it can only flatten by projection — it can’t unroll the sheet. That limitation is what motivates t-SNE and UMAP.
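A small experiment makes the limitation concrete. The sketch below assumes NumPy and uses a 2-D spiral as a stand-in for the cross-section of a swiss roll; the spiral's parameter range and sample count are arbitrary. If PCA could unroll the curve, the 1-D projected coordinate would move steadily in one direction as you walk along the spiral. Instead it doubles back.

```python
import numpy as np

# Hypothetical curved data: a 2-D spiral, i.e. the cross-section of a swiss roll.
t = np.linspace(1.5 * np.pi, 4.5 * np.pi, 400)  # position along the rolled-up sheet
X = np.column_stack([t * np.cos(t), t * np.sin(t)])
Xc = X - X.mean(axis=0)

# First principal component from the covariance matrix.
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
pc1 = eigvecs[:, -1]  # eigenvector with the largest eigenvalue
proj = Xc @ pc1       # 1-D coordinate after projection

# If PCA had unrolled the spiral, proj would rise (or fall) steadily with t.
steps = np.diff(proj)
reversals = int(np.sum(np.sign(steps[1:]) != np.sign(steps[:-1])))
print("direction reversals along the curve:", reversals)
```

Any reversal means that points far apart along the sheet land on top of each other after projection, which is the flattening described above.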
Now, the math
The principal components are the eigenvectors of the covariance matrix:
$$\Sigma v_k = \lambda_k v_k$$
- $v_k$: the k-th principal component (a direction).
- $\lambda_k$: its eigenvalue, the variance captured along that direction.
- $\Sigma$: the data covariance matrix.
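The eigenvector relationship can be checked numerically. The following is a sketch rather than a reference implementation: it assumes NumPy, a randomly generated dataset, and the variable names `Sigma` (covariance matrix), `V` (components as columns), and `lam` (eigenvalues), all chosen here for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 300 samples, 4 features, mixed so the features correlate.
X = rng.normal(size=(300, 4)) @ rng.normal(size=(4, 4))
Xc = X - X.mean(axis=0)

Sigma = np.cov(Xc, rowvar=False)  # 4 x 4 covariance matrix

# eigh is meant for symmetric matrices; it returns ascending eigenvalues.
eigvals, eigvecs = np.linalg.eigh(Sigma)
order = np.argsort(eigvals)[::-1]  # re-sort so the first component is the largest
lam = eigvals[order]
V = eigvecs[:, order]  # column k-1 holds the k-th principal component

# Check the defining equation: Sigma times a component equals eigenvalue times component.
residual = Sigma @ V - V * lam
print("max |Sigma v - lambda v|:", np.abs(residual).max())

# The variance of the data projected onto each component equals its eigenvalue.
scores = Xc @ V
print("variance along each PC:", scores.var(axis=0, ddof=1))
print("eigenvalues:           ", lam)
```

Using `np.linalg.eigh` instead of `np.linalg.eig` is a deliberate choice: the covariance matrix is symmetric, so `eigh` returns real eigenvalues and orthonormal eigenvectors.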
The direction of maximum variance is the top eigenvector of the covariance matrix, the one with the largest eigenvalue; its explained variance ratio is that eigenvalue divided by the sum of all eigenvalues. Projecting onto the first few components keeps the most variance for the fewest dimensions, but only along straight axes, so curved structure is lost.
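One standard route to this result is a constrained maximization. The sketch below assumes centered data and a unit-length candidate direction; the symbols $u$ (candidate direction), $\Sigma$ (covariance matrix), and $\lambda$ (Lagrange multiplier) are labels chosen here.

```latex
\begin{aligned}
\text{maximize}\quad & u^\top \Sigma u \quad \text{subject to}\quad u^\top u = 1,\\
\mathcal{L}(u,\lambda) &= u^\top \Sigma u - \lambda\,(u^\top u - 1),\\
\nabla_u \mathcal{L} &= 2\Sigma u - 2\lambda u = 0 \;\Longrightarrow\; \Sigma u = \lambda u,\\
u^\top \Sigma u &= \lambda\, u^\top u = \lambda .
\end{aligned}
```

Every stationary direction is therefore an eigenvector, and the variance it captures is its eigenvalue, so the maximum is reached at the eigenvector with the largest eigenvalue. With eigenvalues $\lambda_1 \ge \dots \ge \lambda_d$, the explained variance ratio of component $k$ is $\lambda_k / \sum_{j=1}^{d} \lambda_j$.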
Now Break It
Try this: Dropping to too few components loses the structure — reconstruction becomes a blur.
Control: Components-to-keep slider (set to 1)
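The effect of the slider can be approximated offline. This is a minimal sketch, assuming NumPy and a synthetic dataset with three underlying factors; the helper `reconstruct` and all sizes are hypothetical. It keeps the top k components, maps the data back to the original feature space, and reports how much variance was kept and how large the reconstruction error is.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: 400 samples, 5 features driven by 3 latent factors plus noise.
latent = rng.normal(size=(400, 3))
X = latent @ rng.normal(size=(3, 5)) + 0.1 * rng.normal(size=(400, 5))
mean = X.mean(axis=0)
Xc = X - mean

eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]  # largest eigenvalue first
lam, V = eigvals[order], eigvecs[:, order]

def reconstruct(k):
    """Project onto the top-k components, then map back to feature space."""
    Vk = V[:, :k]
    return (Xc @ Vk) @ Vk.T + mean

n = X.shape[0]
for k in range(1, 6):
    err = np.sum((X - reconstruct(k)) ** 2) / (n - 1)  # total squared error, scaled like a variance
    kept = lam[:k].sum() / lam.sum()                   # cumulative explained variance ratio
    print(f"k={k}: variance kept = {kept:.3f}, reconstruction error = {err:.4f}")
```

The error at each k equals the sum of the discarded eigenvalues, which is why setting the slider to 1 on data with several strong directions produces the blur described above.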