Confusion Matrix

Data Prep & Model Evaluation · Beginner · ~6 min

Confusion Matrix — Break predictions into true/false positives and negatives.

Accuracy hides too much. A confusion matrix lays out exactly how a classifier succeeds and fails — true positives, false positives, true negatives, false negatives — the raw material for every other metric.

Confusion matrix (decision threshold 0.50, class balance 50% positive)

            Pred +    Pred −
Actual +    60 TP     20 FN
Actual −    12 FP     68 TN
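To make the four cells concrete, here is a minimal sketch of how they could be counted from a list of true labels and a list of predictions. The function name `confusion_counts` and the toy labels are made up for illustration; they are not taken from the visualization.

```python
def confusion_counts(y_true, y_pred):
    """Count the four cells for binary labels (1 = positive, 0 = negative)."""
    tp = fn = fp = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1  # positive, correctly caught
        elif actual == 1 and predicted == 0:
            fn += 1  # positive, missed
        elif actual == 0 and predicted == 1:
            fp += 1  # negative, false alarm
        else:
            tn += 1  # negative, correctly rejected
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


# Hypothetical toy data: 4 actual positives and 4 actual negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
print(counts)  # {'TP': 3, 'FN': 1, 'FP': 1, 'TN': 3}
```

The four counts always sum to the number of examples, which is a quick sanity check on any confusion matrix.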

The idea in plain words

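A confusion matrix sorts every prediction into one of four cells: true positives, false positives, true negatives, and false negatives. The decision threshold controls how predictions fall into those cells. Lowering it makes more predictions positive, which turns some false negatives into true positives but also turns some true negatives into false positives; raising it does the reverse. In the interactive version, dragging the threshold shows the four cells trading off, linked to the colored points.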
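The threshold trade-off can be sketched in a few lines. The scores and labels below are invented, and the rule "predict positive when score >= threshold" is an assumption about the convention; the visualization may treat the boundary case differently.

```python
def cells_at_threshold(scores, labels, threshold):
    """Return (TP, FN, FP, TN), predicting positive when score >= threshold."""
    tp = fn = fp = tn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold  # assumed convention
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, fp, tn


# Hypothetical classifier scores with their true labels.
scores = [0.95, 0.80, 0.65, 0.55, 0.45, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

for threshold in (0.25, 0.50, 0.75):
    print(threshold, cells_at_threshold(scores, labels, threshold))
# 0.25 (4, 0, 3, 1)
# 0.5 (3, 1, 1, 3)
# 0.75 (2, 2, 0, 4)
```

Raising the threshold from 0.25 to 0.75 removes all three false positives but introduces two false negatives: errors move between cells rather than disappearing.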

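It is the raw material for most other classification metrics, including precision, recall, and the ROC curve. On imbalanced data it also exposes the accuracy paradox: if only 1% of examples are positive, a model that predicts negative for everything scores “99% accurate” and is still useless, because it never catches a single positive.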
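As an illustration of how other metrics fall out of the four cells, the sketch below computes precision, recall, and the false positive rate from the example counts in the matrix (TP = 60, FN = 20, FP = 12, TN = 68). It uses the standard definitions; the variable names are just for this example.

```python
# Counts from the example matrix at threshold 0.50.
tp, fn, fp, tn = 60, 20, 12, 68

# Precision: of everything predicted positive, how much really was positive?
precision = tp / (tp + fp)  # 60 / 72

# Recall (true positive rate): of all actual positives, how many were caught?
recall = tp / (tp + fn)  # 60 / 80

# False positive rate: of all actual negatives, how many raised a false alarm?
false_positive_rate = fp / (fp + tn)  # 12 / 80

print(f"precision = {precision:.3f}")                      # precision = 0.833
print(f"recall = {recall:.3f}")                            # recall = 0.750
print(f"false positive rate = {false_positive_rate:.3f}")  # false positive rate = 0.150
```

An ROC curve is built from the last two numbers: each threshold contributes one (false positive rate, recall) point.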

Now, the math

Accuracy is just the diagonal of the matrix over the total:

$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$TP,\ TN$: correct predictions (positive and negative).
$FP,\ FN$: the two error types — false alarms and misses.
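
As a worked example, plugging in the counts from the example matrix (TP = 60, TN = 68, FP = 12, FN = 20) gives:

$$\text{accuracy} = \frac{60 + 68}{60 + 68 + 12 + 20} = \frac{128}{160} = 0.80$$

So 80% of predictions are correct, but that single number does not say that the 32 errors split into 12 false alarms and 20 misses.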

When the positive class is rare, TN dominates both the numerator and the denominator, so accuracy stays high even if the model never catches a single positive (TP = 0). That is why the individual cells, and the metrics derived from them, matter more than accuracy alone on imbalanced problems.
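A small numeric sketch of this effect, assuming a made-up dataset of 990 negatives and 10 positives and a "model" that predicts negative for every example:

```python
# Hypothetical imbalanced dataset: 1% positive.
n_negative, n_positive = 990, 10
y_true = [0] * n_negative + [1] * n_positive

# A degenerate classifier that always predicts the negative class.
y_pred = [0] * len(y_true)

pairs = list(zip(y_true, y_pred))
tp = sum(1 for actual, predicted in pairs if actual == 1 and predicted == 1)
fn = sum(1 for actual, predicted in pairs if actual == 1 and predicted == 0)
fp = sum(1 for actual, predicted in pairs if actual == 0 and predicted == 1)
tn = sum(1 for actual, predicted in pairs if actual == 0 and predicted == 0)

accuracy = (tp + tn) / len(y_true)
recall = tp / (tp + fn)

print(tp, fn, fp, tn)  # 0 10 0 990
print(accuracy)        # 0.99
print(recall)          # 0.0
```

Accuracy is 0.99 purely because TN = 990 fills the diagonal, while recall of 0.0 shows that none of the 10 positives was caught.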

Now Break It

Try this: On imbalanced data, high accuracy hides that the model never catches the rare class.

Control: Class balance slider (set to highly imbalanced)

Last updated July 3, 2026.