Confusion Matrix


One table, four numbers

A confusion matrix sorts every prediction into four buckets by crossing what actually happened with what the model said.

Accuracy alone hides too much: a model that flags nothing as spam is still 95% accurate if only 5% of mail is spam, even though it catches no spam at all. The confusion matrix exposes which kinds of mistakes the model makes, and that distinction underlies the other classification metrics built from it.

The four cells

TP — caught it, correctly. FP — false alarm. FN — missed it. TN — correctly left alone.

Watch predictions fall into place

Take ten emails, each truly spam or ham, each predicted spam or ham. Every email drops into exactly one cell, and the four cell counts make up the matrix.
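The same sorting can be sketched in a few lines of Python. The ten (actual, predicted) pairs below are made up for illustration, the helper name `cell` is hypothetical, and "spam" is assumed to be the positive class.

```python
from collections import Counter

# Ten hypothetical emails as (actual, predicted) pairs; illustrative data only.
emails = [
    ("spam", "spam"), ("spam", "spam"), ("spam", "spam"), ("spam", "ham"),
    ("ham", "ham"), ("ham", "ham"), ("ham", "ham"), ("ham", "ham"),
    ("ham", "spam"), ("ham", "ham"),
]

def cell(actual, predicted, positive="spam"):
    """Name the confusion-matrix cell that one prediction belongs to."""
    if predicted == positive:
        return "TP" if actual == positive else "FP"
    return "FN" if actual == positive else "TN"

# Drop every email into its cell and total the cells.
counts = Counter(cell(a, p) for a, p in emails)

# Print the totals with actual classes as rows and predictions as columns.
print(f"{'':14}{'pred spam':>10}{'pred ham':>10}")
print(f"{'actual spam':14}{counts['TP']:>10}{counts['FN']:>10}")
print(f"{'actual ham':14}{counts['FP']:>10}{counts['TN']:>10}")
```

With this made-up data the cells come out as 3 TP, 1 FN, 1 FP and 5 TN, which sum to the ten emails.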

What each cell means (spam example)

True Positive (spam → spam)

A spam email correctly sent to the junk folder. Good.

False Positive (ham → spam)

A real email wrongly junked. A "Type I error", and in spam filtering often the costlier one.

False Negative (spam → ham)

Spam that slipped into the inbox. A "Type II error".

True Negative (ham → ham)

A real email correctly left in the inbox. Good.

Metrics built from the cells

Accuracy = (TP + TN) / (TP + FP + FN + TN)

Share of all predictions that were correct. Misleading when classes are imbalanced.
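To see how imbalance inflates accuracy, here is a rough sketch of a classifier that never flags anything. It assumes a hypothetical mailbox of 1,000 emails with 50 spam; the numbers and variable names are illustrative only.

```python
# Hypothetical mailbox: 1,000 emails, 5% of them spam (illustrative numbers).
actual = ["spam"] * 50 + ["ham"] * 950

# A do-nothing "model" that labels every email ham.
predicted = ["ham"] * len(actual)

# Accuracy: share of predictions that match the true label.
correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)

# Spam actually caught: zero, because nothing is ever flagged.
spam_caught = sum(a == "spam" and p == "spam" for a, p in zip(actual, predicted))

print(f"accuracy:    {accuracy:.0%}")
print(f"spam caught: {spam_caught} of {actual.count('spam')}")
```

The model scores 95% accuracy while catching none of the 50 spam emails. The score reflects only the size of the ham class.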

Precision = TP / (TP + FP)

Of everything flagged positive, how much really was. See Precision & Recall.

Recall = TP / (TP + FN)

Of all the real positives, how many were caught.
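As a minimal sketch, the three formulas translate directly into code. The function names and the example counts are hypothetical, and returning 0.0 when a denominator is zero is a simplifying choice made here, not a universal convention.

```python
def accuracy(tp, fp, fn, tn):
    """Share of all predictions that were correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0

def precision(tp, fp):
    """Of everything flagged positive, the share that really was positive."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0

def recall(tp, fn):
    """Of all real positives, the share that was caught."""
    positives = tp + fn
    return tp / positives if positives else 0.0

# Hypothetical cell counts for 1,000 emails (illustrative only).
tp, fp, fn, tn = 40, 10, 20, 930

print(f"accuracy:  {accuracy(tp, fp, fn, tn):.3f}")
print(f"precision: {precision(tp, fp):.3f}")
print(f"recall:    {recall(tp, fn):.3f}")
```

With these counts accuracy is 0.97, yet precision is 0.80 and recall about 0.67. The three numbers answer different questions about the same four cells.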

Beyond two classes

For N classes the matrix is N×N: rows are the true class, columns the predicted one (some sources transpose this). Diagonal cells count correct predictions; each off-diagonal cell shows how often one specific class was mistaken for another.
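A small sketch of the multi-class case, assuming three made-up classes and hand-written labels. Rows index the true class and columns the predicted one, matching the layout described above.

```python
# Hypothetical class names and labels; illustrative data only.
classes = ["cat", "dog", "bird"]
index = {name: i for i, name in enumerate(classes)}

actual    = ["cat", "cat", "dog", "dog", "bird", "bird", "cat", "dog"]
predicted = ["cat", "dog", "dog", "dog", "bird", "cat", "cat", "bird"]

# Build an N x N grid of zeros, then count each (true, predicted) pair.
n = len(classes)
matrix = [[0] * n for _ in range(n)]
for a, p in zip(actual, predicted):
    matrix[index[a]][index[p]] += 1

# Correct predictions sit on the diagonal.
correct = sum(matrix[i][i] for i in range(n))

for name, row in zip(classes, matrix):
    print(f"{name:>5}", row)
print("correct:", correct, "of", len(actual))
```

Reading along a row shows where one true class ends up. Here the "dog" row is [0, 2, 1], meaning two dogs were labelled correctly and one was mistaken for a bird.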