ROC Curve & AUC


Don't pick one threshold — judge them all

Precision and recall depend on where you set the threshold. The ROC curve sidesteps that by evaluating every threshold at once.

It plots two rates against each other as the threshold slides from strict to lenient:

True Positive Rate (TPR) = TP / (TP + FN)

Also called recall or sensitivity. Plotted on the y-axis. The fraction of real positives you catch.

False Positive Rate (FPR) = FP / (FP + TN)

Equal to 1 − specificity. Plotted on the x-axis. The fraction of real negatives you wrongly flag.
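To make the two rates concrete, here is a minimal sketch that computes them at a single threshold. The function name `rates_at_threshold`, the toy labels and scores, and the "score >= threshold means flagged" rule are all assumptions made for illustration.

```python
def rates_at_threshold(y_true, scores, threshold):
    """Return (TPR, FPR) when every score >= threshold is flagged positive."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, scores):
        flagged = score >= threshold
        if label == 1 and flagged:
            tp += 1          # real positive, caught
        elif label == 1:
            fn += 1          # real positive, missed
        elif flagged:
            fp += 1          # real negative, wrongly flagged
        else:
            tn += 1          # real negative, correctly ignored
    tpr = tp / (tp + fn)     # recall / sensitivity
    fpr = fp / (fp + tn)     # 1 - specificity
    return tpr, fpr


# Hypothetical toy data: 1 = positive, 0 = negative
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.1]

tpr, fpr = rates_at_threshold(y_true, scores, threshold=0.5)
print(tpr, fpr)  # TPR = 2/3, FPR = 1/4
```

At a threshold of 0.5, two of the three positives are caught and one of the four negatives is wrongly flagged. That is one point on the ROC curve.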

Watch the curve trace itself

As the threshold drops from "flag nothing" toward "flag everything", the operating point walks from the bottom-left to the top-right corner, sketching the ROC curve. The area underneath is the AUC.
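The walk from bottom-left to top-right can be sketched in a few lines. This is an illustrative version, not a library implementation: `roc_points` and `auc_trapezoid` are hypothetical helper names, the data is made up, and the area is approximated with the trapezoid rule.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points, from the strictest threshold to the most lenient."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    # Start above every score ("flag nothing"), then step down through each distinct score.
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for t in thresholds:
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= t)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under the piecewise-linear curve through the given points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical toy data: 1 = positive, 0 = negative
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.1]

points = roc_points(y_true, scores)
auc = auc_trapezoid(points)
print(points[0], points[-1])  # (0.0, 0.0) (1.0, 1.0)
print(round(auc, 4))          # 0.8333
```

The first point is the "flag nothing" corner and the last is the "flag everything" corner. Every point in between is one threshold's trade-off.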

Reading AUC

AUC = 1.0: perfect

The model ranks every positive above every negative. Curve hugs the top-left corner.

AUC = 0.5: coin flip

No better than random — the diagonal line. The model can't tell the classes apart.

AUC < 0.5: inverted

Worse than random, but only because the ranking is systematically reversed. Flip the scores and the AUC becomes 1 − AUC, which is better than random.

The intuitive meaning

AUC is the probability that the model gives a randomly chosen positive a higher score than a randomly chosen negative. It measures ranking quality, independent of any threshold.
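That probability can be computed directly by comparing every positive against every negative. The sketch below does exactly that. `pairwise_auc` is a hypothetical name, the data is made up, and counting a tie as half a win is a common convention assumed here rather than something this page specifies.

```python
def pairwise_auc(y_true, scores):
    """Fraction of (positive, negative) pairs in which the positive scores higher."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # assumption: a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = positive, 0 = negative
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.1]

auc = pairwise_auc(y_true, scores)
flipped_auc = pairwise_auc(y_true, [-s for s in scores])
print(round(auc, 4), round(flipped_auc, 4))  # 0.8333 0.1667
```

Ten of the twelve positive-negative pairs are ranked correctly, so the AUC is 10/12. Negating the scores reverses every comparison, which is why an inverted model's AUC is one minus the original.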

ROC vs Precision-Recall curve

ROC / AUC
  • Threshold-independent ranking quality
  • Great for balanced classes
  • Easy to compare models at a glance

Prefer PR curve when
  • Classes are heavily imbalanced (rare positives)
  • You care mostly about the positive class
  • FPR looks deceptively good because the many true negatives swamp its denominator
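A quick worked example shows why FPR can mislead when positives are rare. The counts below are hypothetical, chosen only to illustrate the effect.

```python
# Hypothetical confusion-matrix counts for a rare-positive problem
tp, fn = 80, 20        # 100 real positives
fp, tn = 500, 9_500    # 10,000 real negatives

tpr = tp / (tp + fn)          # recall
fpr = fp / (fp + tn)          # looks small because TN is huge
precision = tp / (tp + fp)    # what a PR curve would show instead

print(f"TPR={tpr:.2f} FPR={fpr:.2f} precision={precision:.2f}")
# TPR=0.80 FPR=0.05 precision=0.14
```

On a ROC plot this operating point sits at (0.05, 0.80), which looks strong. Yet only about 14% of the flagged items are real positives, and that is the weakness a precision-recall curve exposes.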

Related

This builds directly on the Confusion Matrix and Precision, Recall & F1.