Precision, Recall & F1
Trade off catching positives against being right about them.
Precision asks “when I say positive, am I right?” Recall asks “did I catch all the positives?” They pull against each other, and F1 balances the two.
Push the threshold for more recall and watch precision collapse — the two gauges fight each other. F1 is highest where they balance.
The idea in plain words
Precision asks “when I say positive, am I right?” Recall asks “did I catch all the positives?” They pull against each other: lower the threshold to catch more (higher recall) and you let in more false alarms (lower precision).
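A minimal sketch of that trade-off, using only the standard library. The scores and labels are invented for illustration, and the helper name `precision_recall` is hypothetical; the point is only to show the same predictions scored at a strict and a loose threshold.

```python
# Hypothetical model scores and true labels, made up for illustration (1 = positive).
scores = [0.95, 0.85, 0.70, 0.60, 0.45, 0.40, 0.30, 0.20]
labels = [1,    1,    0,    1,    0,    1,    0,    0]

def precision_recall(threshold):
    """Predict positive when score >= threshold, then count the outcomes."""
    preds = [s >= threshold for s in scores]
    tp = sum(p and y == 1 for p, y in zip(preds, labels))        # correct alarms
    fp = sum(p and y == 0 for p, y in zip(preds, labels))        # false alarms
    fn = sum((not p) and y == 1 for p, y in zip(preds, labels))  # misses
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

strict = precision_recall(0.8)   # high threshold: few, confident positives
loose = precision_recall(0.35)   # low threshold: many positives, more false alarms

print("strict threshold 0.80 -> precision %.2f, recall %.2f" % strict)
print("loose  threshold 0.35 -> precision %.2f, recall %.2f" % loose)
```

On this toy data the strict threshold is always right but misses half the positives, while the loose one catches every positive at the cost of two false alarms.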
F1 balances the two into one number. On imbalanced data, precision can collapse even as accuracy looks fine, because the many true negatives dominate accuracy. That is why the confusion matrix underneath matters.
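To make the imbalance point concrete, here is a small sketch with assumed confusion-matrix counts. The numbers are hypothetical (1000 examples, 10 of them positive) and are chosen only to show how accuracy and precision can diverge.

```python
# Hypothetical confusion-matrix counts for an imbalanced dataset:
# 1000 examples, only 10 of which are actually positive.
tp, fn = 8, 2      # 8 of the 10 positives are caught, 2 are missed
fp, tn = 32, 958   # 32 false alarms among the 990 negatives

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print("confusion matrix   predicted +   predicted -")
print(f"actual +           {tp:>11}   {fn:>11}")
print(f"actual -           {fp:>11}   {tn:>11}")
print(f"accuracy={accuracy:.3f}  precision={precision:.3f}  recall={recall:.3f}")
```

With these counts accuracy is 0.966 while precision is only 0.200: four out of five positive predictions are false alarms, and only the confusion matrix makes that visible.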
Now, the math
Precision, recall, and their harmonic mean F1:
- precision — correctness of positive predictions.
- recall — coverage of the actual positives.
- harmonic mean — high only when both are high.
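For reference, the standard definitions can be written in terms of confusion-matrix counts. The symbols TP, FP, and FN (true positives, false positives, false negatives) are notation assumed here; P and R stand for precision and recall.

```latex
\[
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2PR}{P + R} = \frac{2\,TP}{2\,TP + FP + FN}
\]
```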
The harmonic mean punishes imbalance between P and R far more than an arithmetic mean would, so F1 tends to peak near where the two gauges cross. Which metric to optimize depends on the cost of a false alarm versus a miss, which is the theme of the decision threshold.
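A quick numerical sketch of how the harmonic mean reacts to imbalance, compared with the arithmetic mean. The function names and the (P, R) pairs are hypothetical and picked only for illustration.

```python
def f1(p, r):
    """Harmonic mean of precision p and recall r (defined as 0 when both are 0)."""
    return 2 * p * r / (p + r) if p + r else 0.0

def arithmetic_mean(p, r):
    """Plain average, shown only for comparison."""
    return (p + r) / 2

# Hypothetical (precision, recall) pairs, from balanced to badly lopsided.
pairs = [(0.6, 0.6), (0.9, 0.3), (1.0, 0.1)]

for p, r in pairs:
    print(f"P={p:.1f} R={r:.1f}  arithmetic={arithmetic_mean(p, r):.2f}  F1={f1(p, r):.2f}")
```

The first two pairs share the same arithmetic mean, yet F1 drops from about 0.60 to 0.45 once P and R drift apart, and the lopsided third pair scores below 0.2.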
Now Break It
Try this: Optimize precision alone and recall collapses, because the model labels only the easiest cases as positive.
Control: Threshold slider (push to extreme)
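The same experiment can be sketched offline. This example sweeps candidate thresholds over made-up scores and compares the threshold that maximizes precision with the one that maximizes F1. The data, the helper `metrics`, and the tie-breaking rule are all assumptions for illustration, not the page's own implementation.

```python
# Hypothetical model scores and true labels, made up for illustration (1 = positive).
scores = [0.98, 0.90, 0.75, 0.65, 0.55, 0.50, 0.35, 0.25, 0.15, 0.05]
labels = [1,    1,    0,    1,    1,    0,    0,    0,    1,    0]

def metrics(threshold):
    """Return (precision, recall, f1) when predicting positive at score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Use each observed score as a candidate threshold.
candidates = sorted(set(scores))

# Ties on precision go to the higher threshold, like a slider pushed to the extreme.
best_for_precision = max(candidates, key=lambda t: (metrics(t)[0], t))
best_for_f1 = max(candidates, key=lambda t: metrics(t)[2])

for name, t in [("max precision", best_for_precision), ("max F1", best_for_f1)]:
    p, r, f = metrics(t)
    print(f"{name}: threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}  F1={f:.2f}")
```

On this toy data, chasing precision alone settles on the single most confident example and catches only one positive in five, while the F1-optimal threshold lands where precision and recall are equal.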