ML Visualization

Random Forest

Ensembles · Intermediate · ~7 min

Bag decision trees with random feature subsets.

A random forest takes bagging one step further: each tree not only trains on a different bootstrap sample of the data but also considers only a random subset of features at each split. This decorrelates the trees and makes the forest's averaged prediction more stable than plain bagging.

Drag the number of trees from 1 to 60 and watch the decision boundary smooth out from noisy to confident. When every feature is considered at each split, the trees become near-identical and adding more of them stops helping. Drag any point, or click empty space to drop a new one, and the forest retrains live.

The idea in plain words

A random forest takes bagging one step further: each tree not only sees a different resample of the data but also considers only a random subset of features at each split. This decorrelates the trees, so averaging them helps far more.
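The two sources of randomness described above can be sketched with the standard library alone. This is a minimal illustration, not a full tree learner: the helper names `bootstrap_rows` and `split_candidates`, the row count, and the feature counts are all assumptions made for the example.

```python
import random

def bootstrap_rows(n_rows, rng):
    """Bagging step: draw n_rows row indices with replacement."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]

def split_candidates(n_features, k, rng):
    """Random-forest step: pick k distinct features to consider at one split."""
    return sorted(rng.sample(range(n_features), k))

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Each tree gets its own resample; duplicates mean some rows are left out.
rows = bootstrap_rows(1000, rng)
unique_fraction = len(set(rows)) / 1000
print(f"unique rows in one resample: {unique_fraction:.2f}")  # roughly 0.63 in expectation

# A fresh subset is drawn at every split, so two splits rarely see the same features.
print(split_candidates(n_features=20, k=4, rng=rng))
print(split_candidates(n_features=20, k=4, rng=rng))
```

Setting `k` equal to `n_features` turns the second step into a no-op, which leaves plain bagging.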

Out-of-bag error — scoring each point using only the trees that didn’t train on it — gives a free validation estimate. If you let every split see all features, the trees become near-identical and the ensemble stops improving.
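To make the out-of-bag idea concrete, here is a small sketch that uses depth-1 trees (stumps) in place of full decision trees, which is a simplification chosen to keep the code short. The helpers `fit_stump`, `predict_stump`, and `oob_error`, and the synthetic dataset in which only feature 0 carries signal, are assumptions for illustration.

```python
import random

def fit_stump(X, y, features):
    """Best single-threshold split over the given candidate features."""
    best = None  # (errors, feature, threshold, left_label)
    for f in features:
        for t in sorted({row[f] for row in X}):
            for left_label in (0, 1):
                errors = sum(
                    (left_label if row[f] <= t else 1 - left_label) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, left_label)
    return best[1:]

def predict_stump(stump, row):
    f, t, left_label = stump
    return left_label if row[f] <= t else 1 - left_label

def oob_error(X, y, n_trees=50, k=1, seed=0):
    """Score each point using only the stumps whose resample did not contain it."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    votes = [[0, 0] for _ in range(n)]  # per-point vote counts for class 0 and class 1
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap resample
        in_bag = set(idx)
        stump = fit_stump([X[i] for i in idx], [y[i] for i in idx],
                          rng.sample(range(d), k))
        for i in range(n):
            if i not in in_bag:  # point i is out-of-bag for this stump
                votes[i][predict_stump(stump, X[i])] += 1
    scored = [(v, label) for v, label in zip(votes, y) if sum(v) > 0]
    wrong = sum((v[1] > v[0]) != (label == 1) for v, label in scored)
    return wrong / len(scored)

data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(60)]
y = [int(row[0] > 0.5) for row in X]  # only feature 0 carries signal

print("OOB error, 1 feature per split:", oob_error(X, y, k=1))
print("OOB error, all features per split:", oob_error(X, y, k=2))
```

No separate validation set is held out here: each point is scored only by stumps that never saw it during fitting.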

Now, the math

Averaging B correlated trees reduces variance only up to a point:

$$\text{Var} = \rho\,\sigma^2 + \frac{1-\rho}{B}\,\sigma^2$$

$\rho$ is the correlation between trees; feature subsampling lowers it.
$\sigma^2$ is the variance of a single tree.
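One way to arrive at this expression, assuming the forest averages $B$ tree predictions $T_1, \dots, T_B$ that each have variance $\sigma^2$ and share the same pairwise correlation $\rho$ (the notation $T_b$ is introduced here for illustration):

```latex
\begin{aligned}
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B} T_b\right)
  &= \frac{1}{B^2}\left[\sum_{b=1}^{B}\operatorname{Var}(T_b)
     + \sum_{b \neq b'}\operatorname{Cov}(T_b, T_{b'})\right] \\
  &= \frac{1}{B^2}\left[B\,\sigma^2 + B(B-1)\,\rho\,\sigma^2\right] \\
  &= \frac{\sigma^2}{B} + \frac{B-1}{B}\,\rho\,\sigma^2 \\
  &= \rho\,\sigma^2 + \frac{1-\rho}{B}\,\sigma^2
\end{aligned}
```

The second line uses the fact that there are $B(B-1)$ ordered pairs of distinct trees, each with covariance $\rho\,\sigma^2$.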

As B → ∞, the second term vanishes but the first, ρσ², remains, so the only way to keep reducing variance is to lower ρ. Restricting each split to a random feature subset does exactly that, at the cost of a little more bias per tree. Considering all features at every split pushes ρ toward 1 and erases most of the benefit.
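Plugging numbers into the formula shows the floor directly. The function name `ensemble_variance` and the values chosen for ρ, σ², and B are illustrative assumptions, not values taken from the demo.

```python
def ensemble_variance(rho, sigma2, n_trees):
    """Variance of the average of n_trees equally correlated trees."""
    return rho * sigma2 + (1.0 - rho) / n_trees * sigma2

for rho in (0.9, 0.3):
    row = [round(ensemble_variance(rho, 1.0, b), 3) for b in (1, 10, 100, 1000)]
    print(rho, row)

# 0.9 [1.0, 0.91, 0.901, 0.9]
# 0.3 [1.0, 0.37, 0.307, 0.301]
```

With ρ = 0.9, a thousand trees barely improve on a single tree, while with ρ = 0.3 the same thousand trees cut the variance to roughly a third.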

Now Break It

Try this: Using all features per split makes every tree nearly identical, defeating the ensemble.

Control: Features-per-split slider (set to all)
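The same failure can be reproduced outside the demo with a rough sketch. It looks only at which feature each tree would choose for its first split, a deliberate simplification of a full tree. The helpers `best_feature` and `root_features`, and the synthetic dataset in which only feature 0 carries signal, are assumptions for illustration.

```python
import random
from collections import Counter

def best_feature(X, y, candidates):
    """Return the candidate feature whose best threshold split makes the fewest errors."""
    def stump_errors(f):
        best = len(y)
        for t in {row[f] for row in X}:
            wrong = sum((row[f] > t) != (label == 1) for row, label in zip(X, y))
            best = min(best, wrong, len(y) - wrong)  # allow either label orientation
        return best
    return min(candidates, key=stump_errors)

def root_features(X, y, k, n_trees=40, seed=0):
    """Count which feature each tree would split on first."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    counts = Counter()
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap resample
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        counts[best_feature(Xb, yb, rng.sample(range(d), k))] += 1
    return counts

data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(4)] for _ in range(80)]
y = [int(row[0] > 0.5) for row in X]  # only feature 0 carries signal

print(root_features(X, y, k=4))  # all features available: every tree picks feature 0
print(root_features(X, y, k=1))  # one random feature per split: roots are spread out
```

When every tree opens with the same split, the resampling alone leaves little diversity for averaging to exploit.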
