Train/Test Split
Hold out data to measure real generalization.
If you grade a model on the same data it studied, of course it aces the test. A train/test split holds out unseen data so you measure how well the model actually generalizes.
[Interactive figure: drag the divider to set the train/test split (shown at 17 train, 11 test). Legend: Train, Test, Model.]
The model fits on train only; test points reveal the honest error. Turn on leakage — include the test points in training — and the test score becomes a lie.
The idea in plain words
If you grade a model on the same data it studied, of course it aces the test. A train/test split holds out unseen data so you can measure how well the model actually generalizes — the honest number that overfitting would otherwise hide.
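A minimal sketch of the split itself, using only the Python standard library. The helper name `split_indices`, the seed, and the 28-point dataset size (chosen to mirror the 17/11 split in the figure) are assumptions for illustration, not part of the original page.

```python
import random


def split_indices(n, test_fraction=0.4, seed=0):
    """Shuffle the indices 0..n-1 and hold out the last chunk as the test set.

    Hypothetical helper for illustration; requires n >= 2.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_test = max(1, round(n * test_fraction))
    n_test = min(n_test, n - 1)  # always leave at least one training point
    return indices[:-n_test], indices[-n_test:]


train_idx, test_idx = split_indices(28, test_fraction=11 / 28)
print(len(train_idx), len(test_idx))  # 17 11
```

The two index lists never overlap, which is the whole point: the model is fitted on `train_idx` and scored only on `test_idx`.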
The catch is leakage: accidentally letting test data into training makes the test score a lie. And if the training set is tiny, the fit becomes unstable and changes with every reshuffle.
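The instability of a tiny training set can be sketched by refitting a straight line on many random reshuffles and watching how much the slope moves. The synthetic data (y = 2x + 1 plus noise), the training sizes of 3 and 20, and the helper names are all assumptions made for this illustration.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def slope_spread(xs, ys, n_train, n_shuffles=200, seed=0):
    """Standard deviation of the fitted slope across random training subsets."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_shuffles):
        idx = rng.sample(range(len(xs)), n_train)
        slope, _ = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        slopes.append(slope)
    return statistics.pstdev(slopes)


# Synthetic data, assumed for illustration: y = 2x + 1 plus Gaussian noise.
data_rng = random.Random(42)
xs = [i / 27 for i in range(28)]
ys = [2 * x + 1 + data_rng.gauss(0, 0.5) for x in xs]

spread_tiny = slope_spread(xs, ys, n_train=3)
spread_large = slope_spread(xs, ys, n_train=20)
print(f"slope spread with  3 training points: {spread_tiny:.3f}")
print(f"slope spread with 20 training points: {spread_large:.3f}")
```

With only three training points the slope swings widely from one reshuffle to the next, while twenty points pin it down much more tightly.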
Now, the math
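Generalization is judged on the held-out test set, not the training set. With squared error as the loss, for example:
$$\mathrm{MSE}_{\text{test}} = \frac{1}{|T|} \sum_{i \in T} \left(y_i - \hat{y}_i\right)^2$$
- $T$: the test set, never touched during fitting.
- $\hat{y}_i$: the model’s prediction, trained only on the train split.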
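The held-out evaluation described here can be sketched as follows: fit on the train split only, then average the error over the test points only. Squared error as the loss, the straight-line model, and the synthetic data are assumptions for this example; the page does not prescribe a particular model or loss.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return statistics.fmean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


# Synthetic data, assumed for illustration: y = 2x + 1 plus Gaussian noise.
rng = random.Random(0)
xs = [i / 27 for i in range(28)]
ys = [2 * x + 1 + rng.gauss(0, 0.3) for x in xs]

# 17 train / 11 test, mirroring the split shown in the figure.
indices = list(range(28))
rng.shuffle(indices)
train_idx, test_idx = indices[:17], indices[17:]

# Fit on the train split only.
slope, intercept = fit_line([xs[i] for i in train_idx],
                            [ys[i] for i in train_idx])


def predict(x):
    return slope * x + intercept


train_mse = mse([ys[i] for i in train_idx], [predict(xs[i]) for i in train_idx])
test_mse = mse([ys[i] for i in test_idx], [predict(xs[i]) for i in test_idx])
print(f"train MSE: {train_mse:.4f}")
print(f"test MSE:  {test_mse:.4f}")  # the number that estimates generalization
```

Only `test_mse` estimates performance on unseen data; `train_mse` measures how well the model reproduces points it was fitted to.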
A fair test error estimates performance on future data only if the test set is truly independent of the fitting process. Leakage (fitting on test points, or letting information from them cross into training) destroys that independence and yields an over-optimistic score that won’t hold up in production.
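To make the effect of leakage concrete, here is a small sketch that uses a model which memorizes its training data, a 1-nearest-neighbor predictor. The choice of 1-NN, the synthetic data, and the helper names are assumptions for illustration; the same bias appears with other models, only less dramatically.

```python
import random
import statistics


def nearest_neighbor_predict(train_x, train_y, x):
    """Predict with the label of the closest training point (1-NN)."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


def heldout_mse(fit_idx, test_idx, xs, ys):
    """Mean squared error on test_idx for a 1-NN model fitted on fit_idx."""
    fit_x = [xs[i] for i in fit_idx]
    fit_y = [ys[i] for i in fit_idx]
    errors = [(ys[i] - nearest_neighbor_predict(fit_x, fit_y, xs[i])) ** 2
              for i in test_idx]
    return statistics.fmean(errors)


# Synthetic data, assumed for illustration: y = 2x + 1 plus Gaussian noise.
rng = random.Random(1)
xs = [i / 27 for i in range(28)]
ys = [2 * x + 1 + rng.gauss(0, 0.5) for x in xs]

indices = list(range(28))
rng.shuffle(indices)
train_idx, test_idx = indices[:17], indices[17:]

honest = heldout_mse(train_idx, test_idx, xs, ys)             # fit on train only
leaked = heldout_mse(train_idx + test_idx, test_idx, xs, ys)  # test points leaked in

print(f"honest test MSE: {honest:.4f}")
print(f"leaked test MSE: {leaked:.4f}")  # exactly 0: each test point finds itself
```

With leakage, every test point is its own nearest neighbor, so the reported error drops to zero even though the model has learned nothing new about unseen data.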
Now Break It
Try this: Leak test data into training, or shrink the test set to almost nothing. Leakage makes the score over-optimistic; a tiny test set makes it too noisy to trust.
Control: Test-set size slider (set to near zero)
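The effect of shrinking the test set can be sketched by scoring one fixed model on many randomly drawn test sets and measuring how much the score jumps around. The fixed model `model`, the synthetic data, and the test sizes of 1 and 11 are assumptions for this illustration.

```python
import random
import statistics


def model(x):
    """Hypothetical already-fitted model, held fixed so only the test set varies."""
    return 2 * x + 1


def score_spread(xs, ys, n_test, n_shuffles=200, seed=0):
    """Standard deviation of the test MSE across randomly drawn test sets."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_shuffles):
        test_idx = rng.sample(range(len(xs)), n_test)
        scores.append(
            statistics.fmean([(ys[i] - model(xs[i])) ** 2 for i in test_idx])
        )
    return statistics.pstdev(scores)


# Synthetic data, assumed for illustration: y = 2x + 1 plus Gaussian noise.
data_rng = random.Random(7)
xs = [i / 27 for i in range(28)]
ys = [2 * x + 1 + data_rng.gauss(0, 0.5) for x in xs]

spread_one = score_spread(xs, ys, n_test=1)
spread_eleven = score_spread(xs, ys, n_test=11)
print(f"score spread with  1 test point:  {spread_one:.4f}")
print(f"score spread with 11 test points: {spread_eleven:.4f}")
```

The model never changes here, yet a one-point test set reports a very different error on each draw, which is why a near-empty test set tells you little.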