Overfitting vs Underfitting
The goal: generalize, don't memorize
A good model doesn't just fit the data it was trained on — it predicts well on data it has never seen.
Two failure modes stand in the way. A model can be too simple to capture the real pattern (underfitting), or too complex, bending to fit every quirk and noise point in the training set (overfitting). The art of ML is steering between them.
Underfitting = a student who barely studied and gets everything wrong. Overfitting = a student who memorized the practice exam word-for-word but fails the real test because the questions changed.
Watch all three fits
The same noisy data, fitted three ways: a line that's too stiff, a smooth curve that's just right, and a wild polynomial that threads every single point.
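The three fits can be reproduced with a minimal sketch. The data here is an assumption made for illustration (one period of a sine wave plus Gaussian noise), as are the chosen degrees 1, 3 and 11 and the helper names `noisy_sine` and `mse`.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_sine(n):
    """Hypothetical data source: one period of a sine wave plus Gaussian noise."""
    x = rng.uniform(0.0, 1.0, n)
    return x, np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, n)

x_train, y_train = noisy_sine(12)   # small training set
x_new, y_new = noisy_sine(200)      # data the model has never seen

def mse(y_true, y_pred):
    """Mean squared error."""
    return float(np.mean((y_true - y_pred) ** 2))

# degree 1: too stiff; degree 3: enough for one sine period;
# degree 11: one coefficient per training point, so it can thread all 12.
errors = {}
for degree in (1, 3, 11):
    fit = np.polynomial.Polynomial.fit(x_train, y_train, degree)
    errors[degree] = (mse(y_train, fit(x_train)), mse(y_new, fit(x_new)))
    print(f"degree {degree:2d}: train MSE {errors[degree][0]:.4f}, "
          f"new-data MSE {errors[degree][1]:.4f}")
```

Training error can only fall as the degree rises, because every lower-degree polynomial is a special case of the higher-degree one. The error on new data is the number that reveals which fit generalizes.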
The tell-tale signs
Underfitting: bad on the training data and bad on new data. The model is too simple (high bias).
Good fit: good on the training data and almost as good on new data. The gap is small.
Overfitting: near-perfect on the training data but poor on new data. The gap is large (high variance).
Watch the gap between training and validation error. A large gap means overfitting; high error on both means underfitting.
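The rule of thumb above can be written as a small check. This is only a sketch: the function name `diagnose` and the two thresholds (`acceptable_error`, `max_gap`) are hypothetical, and sensible values depend entirely on the problem and the error metric.

```python
def diagnose(train_error, val_error, acceptable_error, max_gap):
    """Classify a fit from its training and validation error (lower is better).

    acceptable_error and max_gap are hypothetical, problem-specific thresholds.
    """
    if train_error > acceptable_error:
        # The model cannot even fit the data it was trained on.
        return "underfitting"
    if val_error - train_error > max_gap:
        # Fits the training data well but does not carry over to new data.
        return "overfitting"
    return "good fit"

print(diagnose(0.30, 0.32, acceptable_error=0.10, max_gap=0.05))
print(diagnose(0.01, 0.25, acceptable_error=0.10, max_gap=0.05))
print(diagnose(0.05, 0.07, acceptable_error=0.10, max_gap=0.05))
```

The training error is checked first because a model that is poor on its own training data is underfitting regardless of how small the gap is.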
Fixing each one
To fix underfitting:
- Use a more complex model (more features, depth, layers)
- Add better features (see Feature Engineering)
- Reduce regularization
- Train longer
To fix overfitting:
- Get more training data
- Add regularization (see Ridge vs Lasso)
- Use a simpler model, prune it, or apply dropout
- Use early stopping, and cross-validation to choose model complexity
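As one illustration of the regularization fix, here is a minimal sketch of ridge regression in closed form on polynomial features. The data, the degree, the two penalty strengths and the helper names are assumptions for illustration. For brevity it also penalizes the intercept, which library implementations usually avoid.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 20 noisy samples of a sine wave on [-1, 1].
x = rng.uniform(-1.0, 1.0, 20)
y = np.sin(np.pi * x) + rng.normal(0.0, 0.2, x.size)

# Degree-9 polynomial features: flexible enough to overfit 20 points.
X = np.vander(x, 10, increasing=True)

def ridge_fit(X, y, lam):
    """Closed-form ridge: solve (X^T X + lam * I) w = X^T y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def train_mse(w):
    """Mean squared error of weights w on the training data."""
    return float(np.mean((X @ w - y) ** 2))

w_weak = ridge_fit(X, y, lam=1e-3)    # almost unregularized
w_strong = ridge_fit(X, y, lam=1.0)   # noticeably regularized

for name, w in (("weak", w_weak), ("strong", w_strong)):
    print(f"{name:6s} penalty: train MSE {train_mse(w):.4f}, "
          f"weight norm {np.linalg.norm(w):.3f}")
```

A stronger penalty shrinks the weights and gives up some training accuracy. Whether that trade helps on new data is what a validation set or cross-validation is for. Pushing the penalty too far moves the model back toward underfitting, which is why reducing regularization appears among the underfitting fixes.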
Why this connects to bias and variance
Underfitting and overfitting are the two ends of the bias–variance tradeoff. Underfitting is high bias (wrong assumptions); overfitting is high variance (too sensitive to the particular training sample). The next article makes that tradeoff precise.
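For orientation, the standard decomposition of expected squared error is sketched here. The notation is an assumption, not taken from this article: data are generated as $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\operatorname{Var}(\varepsilon) = \sigma^2$, $\hat{f}$ is the model fitted to a random training sample, and the expectation is over training samples and noise.

```latex
\mathbb{E}\!\left[\bigl(y - \hat{f}(x)\bigr)^2\right]
  = \underbrace{\bigl(\mathbb{E}[\hat{f}(x)] - f(x)\bigr)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\!\left[\bigl(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\bigr)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Underfitting corresponds to a large first term, overfitting to a large second term, and no choice of model removes the third.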
Continue to Bias–Variance Tradeoff, then to Cross-Validation, the tool that detects overfitting reliably.