ML Visualization

Cross-Validation

Data Prep & Model Evaluation · Intermediate · ~7 min

Rotate the holdout set across k folds for a stable estimate.

A single train/test split can be lucky or unlucky. k-fold cross-validation rotates the holdout across k slices of the data and averages, giving a far more stable estimate of performance.

[Interactive figure: data split into 5 folds, one marked as the validation fold (val); below it, a chart of validation error per fold.]

The validation fold sweeps across the data; each fold’s score accumulates into a mean ± spread — far more trustworthy than one split. With k = 2 the estimate swings wildly run to run.

The idea in plain words

A single train/test split can be lucky or unlucky. k-fold cross-validation rotates the validation set across k slices of the data, scores each, and averages — a far more stable estimate of performance.

In the visualization, the validation fold sweeps across the data and the per-fold scores accumulate into a mean ± spread. Too few folds (k = 2) give a noisy estimate that swings from run to run; more folds tighten it, at the cost of more compute (one model fit per fold).
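The rotation described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the code behind the visualization: the helper names (`k_fold_indices`, `fit_line`, `cross_validate`), the synthetic data, the straight-line model, and the choice of mean squared error as the per-fold score are all made up for this example.

```python
import random
import statistics


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on 1-D data."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def cross_validate(xs, ys, k, seed=0):
    """Return (mean, std, per-fold errors) of validation MSE over k folds."""
    folds = k_fold_indices(len(xs), k, seed)
    errors = []
    for f, val in enumerate(folds):
        # Train on every fold except f, then score on fold f only.
        train = [i for g, fold in enumerate(folds) if g != f for i in fold]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        mse = statistics.fmean([(ys[i] - (a * xs[i] + b)) ** 2 for i in val])
        errors.append(mse)
    return statistics.fmean(errors), statistics.pstdev(errors), errors


# Synthetic data (an assumption for illustration): y = 2x + 1 plus Gaussian noise.
rng = random.Random(42)
xs = [i / 10 for i in range(50)]
ys = [2 * x + 1 + rng.gauss(0, 0.5) for x in xs]

cv_mean, cv_std, fold_errors = cross_validate(xs, ys, k=5)
print(f"CV MSE = {cv_mean:.3f} ± {cv_std:.3f} over {len(fold_errors)} folds")
```

Changing `k` or `seed` in the last call is a quick way to see how much the reported mean and spread move with the split.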

Now, the math

The cross-validation score is the mean of the per-fold errors:

$$\text{CV} = \frac{1}{k}\sum_{f=1}^{k} E_f \;\pm\; \text{std}(E_f)$$

where $k$ is the number of folds the data is split into, and $E_f$ is the error when fold f is the validation set.
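As a worked example of the formula, here is the arithmetic for five per-fold errors. The numbers are invented for illustration, and the source does not say whether std means the population or the sample standard deviation; this sketch assumes the population form (`statistics.pstdev`).

```python
import statistics

# Hypothetical per-fold validation errors for k = 5 (made-up numbers).
fold_errors = [0.20, 0.25, 0.22, 0.18, 0.25]

k = len(fold_errors)
cv_mean = sum(fold_errors) / k           # (1/k) * sum of E_f
cv_std = statistics.pstdev(fold_errors)  # population std of the E_f (assumption)

print(f"CV = {cv_mean:.3f} ± {cv_std:.3f}")  # CV = 0.220 ± 0.028
```

Using the sample standard deviation (`statistics.stdev`) instead would give a slightly larger spread, about 0.031 for these numbers.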

Every point is used for validation exactly once and for training k−1 times, so the estimate uses all the data, yet no model is ever scored on points it was trained on. The spread across folds is itself informative: a large std warns that the score is sensitive to which data you held out.
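The "validated once, trained on k−1 times" property is easy to check by counting. The sketch below uses a hypothetical helper, `k_fold_splits`, that cuts the data into contiguous folds; n = 23 and k = 5 are arbitrary choices, picked so that k does not divide n evenly.

```python
from collections import Counter


def k_fold_splits(n, k):
    """Yield (train, val) index lists for k contiguous folds over n points."""
    # Fold sizes differ by at most one when k does not divide n.
    sizes = [n // k + (1 if f < n % k else 0) for f in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size


n, k = 23, 5
val_counts, train_counts = Counter(), Counter()
for train, val in k_fold_splits(n, k):
    assert not set(train) & set(val)  # a fold never trains on its own validation points
    val_counts.update(val)
    train_counts.update(train)

print(all(val_counts[i] == 1 for i in range(n)))        # True
print(all(train_counts[i] == k - 1 for i in range(n)))  # True
```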

Now Break It

Try this: Too few folds give a noisy estimate; too many folds are slow (one model fit per fold) and leave each validation fold so small that individual fold scores become high-variance.

Control: Number of folds slider
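The same experiment can be run offline. The sketch below is an illustration under assumptions, not the page's own code: the data are 40 synthetic Gaussian values, the "model" simply predicts the mean of its training targets, and `cv_mean_predictor` is a hypothetical helper. It repeats cross-validation over 30 different shuffles and reports two things for each k: how much the CV mean moves from run to run, and how spread out the per-fold errors are within a run.

```python
import random
import statistics


def cv_mean_predictor(ys, k, seed):
    """k-fold CV of a 'predict the training mean' model; returns per-fold MSEs."""
    idx = list(range(len(ys)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    errors = []
    for f, val in enumerate(folds):
        train = [i for g, fold in enumerate(folds) if g != f for i in fold]
        pred = statistics.fmean([ys[i] for i in train])
        errors.append(statistics.fmean([(ys[i] - pred) ** 2 for i in val]))
    return errors


# Synthetic targets (an assumption for illustration).
data_rng = random.Random(0)
ys = [data_rng.gauss(0, 1) for _ in range(40)]

summary = {}
for k in (2, 5, len(ys)):  # len(ys) folds is leave-one-out, the "too many" extreme
    runs = [cv_mean_predictor(ys, k, seed) for seed in range(30)]
    run_to_run = statistics.pstdev([statistics.fmean(e) for e in runs])
    per_fold = statistics.fmean([statistics.pstdev(e) for e in runs])
    summary[k] = (run_to_run, per_fold)
    print(f"k={k:2d}  run-to-run std of CV mean={run_to_run:.4f}  "
          f"mean per-fold std={per_fold:.4f}")
```

With k = 2 the CV mean should shift noticeably between shuffles. With leave-one-out the split involves no randomness, so the run-to-run spread should be essentially zero, but each fold is scored on a single point, so the per-fold errors are widely scattered and every run costs 40 model fits instead of 2.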
