ML Visualization

Overfitting & Underfitting

Foundations · Beginner · ~6 min

Underfitting is when a model is too simple to capture the pattern (high error everywhere); overfitting is when it is so flexible it memorizes noise (low train error, high test error). The sweet spot minimizes test error.

Slide a model from a flat line to a deranged wiggle and watch two curves: training error keeps falling, but test error dips then rises — the single most important picture in machine learning, drawn by your own hand.

[Interactive figure. Legend: train points, test points, model, truth. Second panel: "Train vs test error: the U". Model shown at degree 3.]

The idea in plain words

Slide the model from a flat line to a wild wiggle and watch two numbers. Training error never rises as the model gets more flexible, because a more flexible model can always hug the points it has seen at least as closely. Test error, measured on points it hasn’t seen, dips and then rises. Too simple underfits; too complex overfits.

The bottom of that test-error U is the sweet spot. Overshoot it and the model is memorizing noise, not the pattern — the same trap you can trigger with polynomial degree, and the reason for the bias–variance tradeoff.
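The sweep described above can be reproduced in a few lines. This is a minimal sketch, not the page's own implementation: it assumes the complexity knob is polynomial degree, the hidden truth is sin(πx), the noise is Gaussian, and error is mean squared error. The names `sample`, `mse`, and `sweet_spot` are made up for illustration.

```python
import numpy as np

def sample(n, rng, noise=0.3):
    """Draw n noisy points from an assumed truth curve sin(pi * x)."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = np.sin(np.pi * x) + rng.normal(0.0, noise, size=n)
    return x, y

def mse(coeffs, x, y):
    """Mean squared error of a polynomial (highest power first) on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

rng = np.random.default_rng(0)
x_train, y_train = sample(30, rng)    # points the model is fit to
x_test, y_test = sample(500, rng)     # held-out points

degrees = list(range(0, 11))          # degree 0 is the flat line
train_err, test_err = [], []
for d in degrees:
    coeffs = np.polyfit(x_train, y_train, d)   # least-squares fit
    train_err.append(mse(coeffs, x_train, y_train))
    test_err.append(mse(coeffs, x_test, y_test))

# The sweet spot is the degree with the lowest held-out error.
sweet_spot = degrees[int(np.argmin(test_err))]
for d, tr, te in zip(degrees, train_err, test_err):
    marker = "  <- lowest test error" if d == sweet_spot else ""
    print(f"degree {d:2d}  train {tr:.3f}  test {te:.3f}{marker}")
```

Because each degree's model contains every lower-degree model as a special case, the least-squares training error cannot go up as the degree grows. The test column has no such guarantee, which is where the U comes from.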

Now, the math

Generalization is measured by the gap between two errors:

$$\text{gap} = E_{\text{test}} - E_{\text{train}}$$

$E_{\text{train}}$: error on the data the model was fit to; it falls with complexity.
$E_{\text{test}}$: error on held-out data; it dips then rises, forming the U.
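The page does not say which error measure the curves use. One common choice, assumed here purely for illustration, is mean squared error. With $\hat f$ the fitted model, $(x_i, y_i)$ the $n$ training points, and $(x'_j, y'_j)$ the $m$ held-out points (all symbols introduced for this sketch), the two errors would be:

$$E_{\text{train}} = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - \hat f(x_i)\bigr)^2, \qquad E_{\text{test}} = \frac{1}{m}\sum_{j=1}^{m}\bigl(y'_j - \hat f(x'_j)\bigr)^2$$

Under this choice the gap is simply the difference between the two averages, and a large positive gap signals overfitting.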

A model with enough parameters can drive training error to zero by interpolating every point, but those extra degrees of freedom fit the random noise in the sample. On fresh data that noise is different, so the wiggles that helped on the training set now hurt — test error climbs even as training error keeps falling.
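The interpolation argument can be checked numerically. The sketch below assumes a polynomial model, a truth curve of sin(πx), Gaussian noise, and mean squared error; none of these come from the page. With 10 training points, a degree-9 polynomial has one coefficient per point and can pass through all of them.

```python
import numpy as np

# Assumed setup (not from the page): truth sin(pi * x), Gaussian noise, MSE.
rng = np.random.default_rng(1)
noise = 0.3
n_train = 10

def truth(x):
    return np.sin(np.pi * x)

# Evenly spaced training inputs keep the fit numerically well behaved.
x_train = np.linspace(-1.0, 1.0, n_train)
y_train = truth(x_train) + rng.normal(0.0, noise, size=n_train)

# Fresh data: same truth, different noise draws.
x_test = rng.uniform(-1.0, 1.0, size=500)
y_test = truth(x_test) + rng.normal(0.0, noise, size=500)

# Degree n_train - 1 gives one coefficient per training point,
# so the polynomial can interpolate every point.
coeffs = np.polyfit(x_train, y_train, n_train - 1)

train_err = float(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
test_err = float(np.mean((np.polyval(coeffs, x_test) - y_test) ** 2))

print(f"train error: {train_err:.2e}")
print(f"test error:  {test_err:.2e}")
```

Training error lands at zero up to floating-point rounding, while test error cannot drop below the noise in the fresh labels and also pays for every wiggle fitted to the old noise.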

Now Break It

Try this: Max out complexity — train error hits zero while test error explodes; resample and the overfit curve flails.

Control: Complexity slider (set to maximum), then resample
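The "flailing" can be quantified by refitting on many resampled training sets and measuring how much the fitted curve moves. This is a hedged sketch: it assumes that resampling redraws Gaussian noise on fixed, evenly spaced inputs around a truth of sin(πx), which may differ from what the page's resample control does. `prediction_spread` is a hypothetical helper.

```python
import numpy as np

def prediction_spread(degree, n_resamples=200, n_train=15, noise=0.3, seed=0):
    """Average standard deviation of the fitted curve across resamples.

    Assumption: a resample redraws the noise on fixed x positions.
    """
    rng = np.random.default_rng(seed)
    x_train = np.linspace(-1.0, 1.0, n_train)
    x_grid = np.linspace(-1.0, 1.0, 101)      # where curves are compared
    curves = np.empty((n_resamples, x_grid.size))
    for i in range(n_resamples):
        y_train = np.sin(np.pi * x_train) + rng.normal(0.0, noise, size=n_train)
        coeffs = np.polyfit(x_train, y_train, degree)
        curves[i] = np.polyval(coeffs, x_grid)
    # Spread at each grid point across resamples, averaged over the grid.
    return float(np.mean(np.std(curves, axis=0)))

simple_spread = prediction_spread(degree=1)    # near the flat-line end
wiggly_spread = prediction_spread(degree=12)   # near maximum complexity

print(f"degree  1 spread: {simple_spread:.3f}")
print(f"degree 12 spread: {wiggly_spread:.3f}")
```

A low-degree fit barely moves between resamples, while the high-degree fit swings widely, especially near the ends of the input range. That swing is the variance side of the bias–variance tradeoff.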
