Anomaly Detection
Find the needle, not the haystack
Most supervised ML predicts a label from many examples of each class. Anomaly detection flips that: you have tons of normal data and almost no examples of the rare, important event, such as fraud, a failing machine, or an intrusion. So instead of learning the anomaly, you learn what normal looks like and flag anything that doesn't fit.
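Point anomaly: a single value far from the rest, such as a $10,000 charge on a coffee-sized account.
Contextual anomaly: a value that is normal in general but strange in its context, such as high heating use in July.
Collective anomaly: each point looks fine on its own, but the sequence as a whole is abnormal.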
Learn normal, score the rest
Model the dense region of normal points, then give every new point an anomaly score by how far outside that region it lands. Cross the threshold and it gets flagged.
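One way to make "how far outside the dense region" concrete is to score a point by its distance to its k-th nearest normal neighbor. The sketch below is only an illustration under stated assumptions: the data, the helper names (`knn_score`, `calibrate_threshold`, `is_anomaly`), the choice of k = 3, and the threshold rule are all hypothetical, not something the text prescribes.

```python
import math

def knn_score(point, normal_points, k=3):
    """Anomaly score: distance from `point` to its k-th nearest normal point."""
    distances = sorted(math.dist(point, p) for p in normal_points)
    return distances[k - 1]

# Hypothetical "normal" data: a tight cluster around (1, 1).
normal = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.2, 1.0),
          (1.0, 1.2), (0.8, 0.9), (1.1, 1.1), (0.9, 0.8)]

def calibrate_threshold(points, k=3):
    """Assumed rule: the largest score any normal point gets from the others."""
    return max(
        knn_score(p, points[:i] + points[i + 1:], k)
        for i, p in enumerate(points)
    )

threshold = calibrate_threshold(normal)

def is_anomaly(point):
    # Flag the point when its score crosses the threshold.
    return knn_score(point, normal) > threshold

print(is_anomaly((1.05, 1.0)))  # inside the cluster -> False
print(is_anomaly((5.0, 5.0)))   # far from every normal point -> True
```

Points inside the cluster have close neighbors and score low; a point far from all normal data scores high. In practice the threshold would more likely be a high percentile of the normal scores than the maximum.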
Common approaches
Z-score: flag points more than a few standard deviations from the mean. Simple and fast for a single feature.
Isolation forest: random splits isolate outliers in very few cuts, while normal points take many. Scales to high dimensions.
Autoencoder: train it on normal data only; anything it reconstructs badly is anomalous.
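The simplest of these approaches, the standard-deviation rule, fits in a few lines. This is a minimal sketch using only the standard library; the charge amounts, the cutoff of 3 standard deviations, and the function names are assumptions chosen for illustration.

```python
import statistics

def fit_normal(values):
    """Estimate mean and sample standard deviation from normal data only."""
    return statistics.mean(values), statistics.stdev(values)

def z_score(x, mean, std):
    """How many standard deviations x lies from the mean."""
    return abs(x - mean) / std

# Hypothetical charges (in dollars) on a coffee-sized account.
charges = [4.50, 5.25, 3.75, 6.00, 4.25, 5.50, 4.75, 5.00]
mean, std = fit_normal(charges)

THRESHOLD = 3.0  # assumed cutoff: "a few" standard deviations

def is_anomaly(x):
    return z_score(x, mean, std) > THRESHOLD

print(is_anomaly(5.10))      # ordinary charge -> False
print(is_anomaly(10_000.0))  # the $10,000 charge -> True
```

The mean and standard deviation are fit on normal data only. If extreme values are mixed into the fitting data, they inflate both estimates and can hide themselves.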
Loosen the threshold and you catch more fraud but annoy honest users with false alarms; tighten it and you miss real cases. This is the familiar precision–recall trade-off: choose the operating point that fits the cost of each mistake.
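To see the trade-off in numbers, the sketch below computes precision and recall at a loose and a tight threshold. The scores, labels, and threshold values are made up for illustration, and `precision_recall` is a hypothetical helper, not a library function.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when every score above `threshold` is flagged."""
    flagged = [s > threshold for s in scores]
    tp = sum(f and y for f, y in zip(flagged, labels))        # fraud caught
    fp = sum(f and not y for f, y in zip(flagged, labels))    # false alarms
    fn = sum((not f) and y for f, y in zip(flagged, labels))  # fraud missed
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall

# Hypothetical anomaly scores and ground truth (True = real fraud).
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
labels = [False, False, False, True, False, False, True, False, True, True]

loose = precision_recall(scores, labels, threshold=0.35)
tight = precision_recall(scores, labels, threshold=0.85)

print(loose)  # all 4 fraud cases caught, but 3 of the 7 flags are false alarms
print(tight)  # no false alarms, but only 2 of the 4 fraud cases caught
```

With this toy data the loose threshold gives recall 1.0 at precision 4/7, while the tight one gives precision 1.0 at recall 0.5. Which point is better depends on what a missed case costs relative to a false alarm.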