Decision Tree

Classification · Beginner · ~7 min

Decision Tree — Split the data with a sequence of yes/no questions.

A decision tree classifies by asking a series of simple yes/no questions, splitting the data at each step. Follow the branches down to a leaf and you get your prediction.

[Interactive figure: scatter plot of Class 0 and Class 1 points with the selected region highlighted. Controls: Max depth (3), Dataset, Add points as.]

Click a tree node to highlight the region it governs. Increase depth and each split appears as a branch and a cut at once. Drag any point, or click empty space to drop a new one, and the tree re-splits live.

The idea in plain words

A decision tree classifies by asking a sequence of yes/no questions, each one a threshold on a single feature. Every split appears twice at once: as a branch in the tree and as a straight cut in the feature space. Follow the branches to a leaf and you have your prediction.
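To make "follow the branches to a leaf" concrete, here is a minimal sketch of prediction on a tiny hand-built tree. The `Node` class, the `predict` function, and the two thresholds are hypothetical names and values chosen for illustration; they are not taken from the visualization.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """One node of a hypothetical hand-built tree."""
    feature: Optional[int] = None      # index of the feature tested (internal nodes)
    threshold: Optional[float] = None  # question asked: x[feature] <= threshold?
    left: Optional["Node"] = None      # branch taken when the answer is yes
    right: Optional["Node"] = None     # branch taken when the answer is no
    label: Optional[int] = None        # predicted class (leaves only)


def predict(node: Node, x: Sequence[float]) -> int:
    """Answer one yes/no question per level until a leaf is reached."""
    while node.label is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


# Two questions: a vertical cut at x0 = 2.0, then a horizontal cut at x1 = 1.0.
tree = Node(
    feature=0, threshold=2.0,
    left=Node(label=0),
    right=Node(feature=1, threshold=1.0, left=Node(label=0), right=Node(label=1)),
)

print(predict(tree, [1.0, 5.0]))  # x0 <= 2.0, so the left leaf: 0
print(predict(tree, [3.0, 0.5]))  # x0 > 2.0 and x1 <= 1.0: 0
print(predict(tree, [3.0, 4.0]))  # x0 > 2.0 and x1 > 1.0: 1
```

Each internal node is one branch in the tree and one straight cut in the feature space, so this two-question tree divides the plane into three rectangular regions.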

Each split is chosen to make the resulting groups as pure as possible. Let the tree grow without limit and it keeps cutting until every leaf is pure, carving a separate box around any stray point if it has to: perfect on training data, unreliable on anything new. That fragility is exactly what random forests fix.

Now, the math

Splits are chosen to reduce Gini impurity, a measure of class mixing:

$$\text{Gini} = 1 - \sum_k p_k^2$$

$p_k$: the fraction of a node’s points belonging to class k.
$\text{Gini}=0$: a pure node — every point is the same class.
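As a quick check of the formula, the sketch below computes Gini impurity from a list of class labels. The function name `gini` and the convention that an empty node scores 0 are assumptions made for this example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class fractions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumed convention: an empty node counts as pure
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


print(gini([0, 0, 0, 0]))  # one class only, a pure node: 0.0
print(gini([0, 0, 1, 1]))  # an even two-class mix: 0.5
print(gini([0, 0, 0, 1]))  # mostly one class: 0.375
```

For two classes the value peaks at 0.5 when the node is split evenly and falls toward 0 as one class takes over.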

At each node the tree tries every threshold on every feature and picks the split that most reduces the weighted Gini of the children. Because splits are axis-aligned, boundaries are always staircases of horizontal and vertical cuts — never diagonal.
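The search described above can be sketched as an exhaustive loop over features and thresholds. This is an illustration under stated assumptions, not the code behind the visualization: candidate thresholds are taken as midpoints between consecutive distinct values, and `best_split` is a hypothetical name. Because the parent's impurity is fixed, the split that most reduces impurity is the one with the lowest weighted Gini of the children.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (empty list counts as 0)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y):
    """Return (feature, threshold, weighted_gini) for the best axis-aligned
    split, or None when no feature has two distinct values."""
    n = len(y)
    best = None
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        # Assumed candidates: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            # Children are weighted by how many points each one receives.
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best


X = [[1.0, 1.0], [2.0, 3.0], [3.0, 2.0], [4.0, 4.0]]
y = [0, 0, 1, 1]
print(best_split(X, y))  # (0, 2.5, 0.0): a vertical cut at x0 = 2.5 separates the classes
```

Every candidate tests a single feature against a single number, which is why the resulting boundary can only be made of axis-aligned cuts.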

Now Break It

Try this: With unlimited depth the tree keeps splitting until every leaf is pure, down to a leaf for a single stray point if needed: perfect on training data, jagged and overfit everywhere else.

Control: Max depth slider (set to maximum)
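The same experiment can be sketched away from the slider. The code below is a minimal recursive tree builder, assuming midpoint thresholds and a made-up eight-point dataset with one stray Class 0 point among the Class 1 points; `grow`, `predict`, `count_leaves`, and `accuracy` are hypothetical helper names, not part of the visualization.

```python
from collections import Counter


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0


def best_split(X, y):
    """Lowest weighted-Gini axis-aligned split, or None if none exists."""
    best = None
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [label for row, label in zip(X, y) if row[f] <= t]
            right = [label for row, label in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[2]:
                best = (f, t, score)
    return best


def grow(X, y, max_depth=None, depth=0):
    """Build a nested-dict tree; max_depth=None means no depth limit."""
    majority = Counter(y).most_common(1)[0][0]
    if gini(y) == 0.0 or (max_depth is not None and depth >= max_depth):
        return {"label": majority}
    split = best_split(X, y)
    if split is None:  # identical points with mixed labels cannot be separated
        return {"label": majority}
    f, t, _ = split
    left = [i for i, row in enumerate(X) if row[f] <= t]
    right = [i for i, row in enumerate(X) if row[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": grow([X[i] for i in left], [y[i] for i in left], max_depth, depth + 1),
        "right": grow([X[i] for i in right], [y[i] for i in right], max_depth, depth + 1),
    }


def predict(tree, x):
    while "label" not in tree:
        tree = tree["left"] if x[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["label"]


def count_leaves(tree):
    if "label" in tree:
        return 1
    return count_leaves(tree["left"]) + count_leaves(tree["right"])


def accuracy(tree, X, y):
    return sum(predict(tree, row) == label for row, label in zip(X, y)) / len(y)


# Made-up data: Class 0 on the left, Class 1 on the right, one stray 0 at x0 = 5.
X = [[1.0, 1.0], [2.0, 2.0], [3.0, 1.5], [4.0, 3.0],
     [5.0, 1.0], [6.0, 2.5], [7.0, 2.0], [8.0, 3.0]]
y = [0, 0, 0, 1, 0, 1, 1, 1]

shallow = grow(X, y, max_depth=1)
deep = grow(X, y)  # no depth limit

print(count_leaves(shallow), accuracy(shallow, X, y))  # 2 leaves, training accuracy 0.875
print(count_leaves(deep), accuracy(deep, X, y))        # more leaves, training accuracy 1.0
```

The unlimited tree reaches perfect training accuracy only by spending extra leaves on the stray point. Those extra cuts are the jagged regions the visualization shows, and training accuracy alone says nothing about how they behave on new points.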

Last updated July 3, 2026.