ROC Curve & AUC
Don't pick one threshold — judge them all
Precision and recall depend on where you set the threshold. The ROC curve sidesteps that by evaluating every threshold at once.
It plots two rates against each other as the threshold slides from strict to lenient:
True Positive Rate (TPR) = recall = sensitivity. Up the y-axis. The fraction of real positives you catch.
False Positive Rate (FPR) = 1 − specificity. Along the x-axis. The fraction of negatives you wrongly flag.
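A minimal sketch of the two rates at a single threshold. It assumes binary labels (1 = positive, 0 = negative), that an example is flagged when its score is at or above the threshold, and that both classes are present. The function name `rates_at_threshold` and the sample data are made up for illustration.

```python
def rates_at_threshold(labels, scores, threshold):
    """Return (TPR, FPR) when every score >= threshold is flagged positive."""
    tp = fp = fn = tn = 0
    for y, s in zip(labels, scores):
        flagged = s >= threshold
        if y == 1 and flagged:
            tp += 1
        elif y == 1:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    tpr = tp / (tp + fn)  # share of real positives caught
    fpr = fp / (fp + tn)  # share of negatives wrongly flagged
    return tpr, fpr


# Illustrative data only: 4 positives, 4 negatives
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

strict = rates_at_threshold(labels, scores, 0.75)   # (0.5, 0.0)
lenient = rates_at_threshold(labels, scores, 0.25)  # (1.0, 0.5)
print(strict, lenient)
```

Lowering the threshold catches more positives but also flags more negatives, which is the trade-off the ROC curve draws.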
Watch the curve trace itself
As the threshold drops from "flag nothing" toward "flag everything", the operating point walks from the bottom-left to the top-right corner, sketching the ROC curve. The area underneath is the AUC.
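The walk from bottom-left to top-right can be sketched by lowering the threshold past each distinct score and recording (FPR, TPR) each time. This is an illustrative sketch, not a reference implementation: the helper names `roc_points` and `auc_trapezoid` and the sample data are assumptions, and the area is estimated with the trapezoid rule.

```python
def roc_points(labels, scores):
    """ROC points (FPR, TPR) as the threshold drops past each distinct score."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: flag nothing
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points  # final point is (1.0, 1.0): flag everything


def auc_trapezoid(points):
    """Area under the piecewise-linear curve through the ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Illustrative data only: 4 positives, 4 negatives
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

points = roc_points(labels, scores)
auc = auc_trapezoid(points)
print(points[0], points[-1], auc)  # (0.0, 0.0) (1.0, 1.0) 0.8125
```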
Reading AUC
AUC = 1.0: The model ranks every positive above every negative. The curve hugs the top-left corner.
AUC = 0.5: No better than random; the curve is the diagonal line. The model can't tell the classes apart.
AUC < 0.5: Worse than random, but flip the predictions and it's better than random.
AUC is the probability that the model gives a randomly chosen positive a higher score than a randomly chosen negative. It measures ranking quality, independent of any threshold.
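The ranking interpretation can be checked directly by counting pairs. The sketch below is illustrative: the name `auc_pairwise` and the data are made up, and counting a tie as half a win is a common convention assumed here rather than something stated above.

```python
def auc_pairwise(labels, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Ties count as half a win (an assumed convention).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative data only: 4 positives, 4 negatives
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

auc = auc_pairwise(labels, scores)                    # 13 of 16 pairs: 0.8125
flipped = auc_pairwise(labels, [-s for s in scores])  # 3 of 16 pairs: 0.1875
perfect = auc_pairwise(labels, labels)                # every positive on top: 1.0
print(auc, flipped, perfect)
```

Negating the scores reverses the ranking, so the flipped AUC is 1 minus the original. That is why a model below 0.5 becomes better than random once its predictions are inverted.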
ROC vs Precision-Recall curve
ROC curve:
- Threshold-independent ranking quality
- Works well when classes are roughly balanced
- Easy to compare models at a glance
Prefer the Precision-Recall curve when:
- Classes are heavily imbalanced (rare positives)
- You care mostly about the positive class
- FPR looks deceptively good because negatives dominate
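A quick arithmetic sketch of why FPR can mislead under imbalance. The counts are hypothetical, chosen only to give a 100:1 ratio of negatives to positives.

```python
# Hypothetical confusion-matrix counts at one threshold
tp, fn = 80, 20      # 100 real positives
fp, tn = 500, 9_500  # 10,000 real negatives

fpr = fp / (fp + tn)        # 0.05: only 5% of negatives are wrongly flagged
recall = tp / (tp + fn)     # 0.8: most positives are caught
precision = tp / (tp + fp)  # 80 / 580, roughly 0.14

print(f"FPR={fpr:.2f} recall={recall:.2f} precision={precision:.2f}")
```

On the ROC plot this operating point looks strong (low FPR, high TPR), yet roughly six out of every seven flagged examples are false alarms. The Precision-Recall curve makes that visible.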
This builds directly on two earlier topics: the Confusion Matrix, and Precision, Recall & F1.