ML Visualization

Precision, Recall & F1

Data Prep & Model Evaluation · Intermediate · ~7 min

Trade off catching positives against being right about them.

Precision asks “when I say positive, am I right?” Recall asks “did I catch all the positives?” They pull against each other, and F1 balances the two.

[Interactive demo: precision gauge at 82%, recall gauge at 74%, threshold slider at 0.50 (50%)]

Lower the threshold to gain recall and watch precision fall: the two gauges fight each other. F1 is highest near where they balance.

The idea in plain words

Precision and recall pull against each other because both depend on the decision threshold. Lower the threshold to catch more positives (higher recall) and you let in more false alarms (lower precision); raise it and you typically get the reverse.
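To make the trade-off concrete, here is a minimal sketch that sweeps a threshold over a handful of scored examples. The scores, labels, and the helper name `precision_recall_at` are made up for illustration; they are not taken from the demo on this page.

```python
# Hypothetical model scores and true labels (1 = positive), invented for this sketch.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]


def precision_recall_at(threshold, scores, labels):
    """Predict positive when score >= threshold, then count the outcomes."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    # Assumed convention: a ratio with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Sweep from a strict threshold to a lenient one.
for t in (0.85, 0.50, 0.25):
    p, r = precision_recall_at(t, scores, labels)
    print(f"threshold={t:.2f} precision={p:.3f} recall={r:.3f}")
```

On this toy data, moving the threshold from 0.85 down to 0.25 raises recall from 2/5 to 5/5 while precision drops from 2/2 to 5/8.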

F1 combines the two into one number. On imbalanced data, precision can collapse even while accuracy looks fine, because accuracy is dominated by the many correctly rejected negatives. That is why the confusion matrix underneath matters.
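A small worked example shows how accuracy can hide a weak precision. The confusion-matrix counts below are hypothetical, chosen so that only 10 of 1,000 examples are truly positive.

```python
# Hypothetical confusion-matrix counts: 1,000 examples, only 10 true positives.
tp, fn = 5, 5      # positives the model caught / missed
fp, tn = 20, 970   # negatives wrongly flagged / correctly rejected

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
# prints: accuracy=0.975 precision=0.200 recall=0.500
```

Accuracy is 97.5% because the 970 true negatives dominate the total, yet only one in five positive predictions is correct.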

Now, the math

Precision, recall, and their harmonic mean F1:

$$P = \frac{TP}{TP+FP},\quad R = \frac{TP}{TP+FN},\quad F_1 = \frac{2PR}{P+R}$$

$P$: precision, correctness of positive predictions.
$R$: recall, coverage of the actual positives.
$F_1$: harmonic mean, high only when both are high.
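The formulas translate directly into code. The sketch below takes the counts of true positives (TP), false positives (FP), and false negatives (FN); the function name `precision_recall_f1` and the choice to return 0.0 for a zero denominator are assumptions for illustration, not something this page specifies.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from confusion-matrix counts."""
    # Assumed convention: a ratio with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
# prints: precision=0.800 recall=0.667 f1=0.727
```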

The harmonic mean punishes imbalance between P and R far more than an arithmetic mean would, so F1 peaks near where the two gauges cross. Which metric to optimize depends on the cost of a false alarm versus a miss — the theme of the decision threshold.
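To see how much harder the harmonic mean penalizes a lopsided pair, the sketch below compares it with the arithmetic mean on a few hypothetical precision and recall values. The helper names `arithmetic_mean` and `harmonic_f1` are illustrative only.

```python
def arithmetic_mean(p: float, r: float) -> float:
    return (p + r) / 2


def harmonic_f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; taken as 0.0 when both are 0."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


# Hypothetical (precision, recall) pairs, from balanced to very lopsided.
for p, r in [(0.80, 0.80), (0.90, 0.50), (0.99, 0.10)]:
    print(
        f"P={p:.2f} R={r:.2f} "
        f"arithmetic={arithmetic_mean(p, r):.3f} F1={harmonic_f1(p, r):.3f}"
    )
```

When the two values are equal, both means agree. For the most lopsided pair the arithmetic mean still sits above 0.5, while F1 falls below 0.2, staying close to the weaker of the two metrics.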

Now Break It

Try this: Optimizing precision alone lets recall collapse, because the model flags only the cases it is most confident about and misses the rest.

Control: Threshold slider (push to extreme)
