Overfitting vs Underfitting · Suman Bhadra Notes

The goal: generalize, don't memorize

A good model doesn't just fit the data it was trained on — it predicts well on data it has never seen.

Two failure modes stand in the way. A model can be too simple to capture the real pattern (underfitting), or too complex, bending to fit every quirk and noise point in the training set (overfitting). The art of machine learning (ML) is steering between them.

The student analogy

Underfitting = a student who barely studied and gets everything wrong. Overfitting = a student who memorized the practice exam word-for-word but fails the real test because the questions changed.

Watch all three fits

The same noisy data, fitted three ways: a line that's too stiff, a smooth curve that's just right, and a wild polynomial that threads every single point.
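To try the three fits yourself, here is a minimal, dependency-free sketch. Everything specific in it is an assumption for illustration: the sine-wave ground truth, the noise level, the 8 training points, and the degrees 1, 3 and 7 standing in for "too stiff", "just right" and "threads every point". It solves the least-squares normal equations by hand to avoid dependencies; in practice you would more likely reach for something like NumPy's polyfit.

```python
import math
import random


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients, lowest power first."""
    n = degree + 1
    # Augmented normal equations [X^T X | X^T y] for the basis 1, x, x^2, ...
    rows = [
        [sum(x ** (i + j) for x in xs) for j in range(n)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(n)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n + 1):
                rows[r][c] -= factor * rows[col][c]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(rows[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (rows[i][n] - tail) / rows[i][i]
    return coeffs


def predict(coeffs, x):
    # Horner's rule: evaluate the polynomial from the highest power down.
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def mse(coeffs, xs, ys):
    return sum((predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def noisy_targets(xs, rng, noise=0.2):
    # Assumed ground truth for this demo: a sine wave plus Gaussian noise.
    return [math.sin(3.0 * x) + rng.gauss(0.0, noise) for x in xs]


rng = random.Random(0)
train_x = [-1.0 + 2.0 * i / 7 for i in range(8)]  # 8 evenly spaced points
train_y = noisy_targets(train_x, rng)
test_x = [rng.uniform(-1.0, 1.0) for _ in range(50)]  # unseen points
test_y = noisy_targets(test_x, rng)

for degree in (1, 3, 7):
    coeffs = fit_polynomial(train_x, train_y, degree)
    print(f"degree {degree}: "
          f"train MSE = {mse(coeffs, train_x, train_y):.4f}, "
          f"test MSE = {mse(coeffs, test_x, test_y):.4f}")
```

With 8 training points, the degree-7 polynomial passes through every one of them, so its training error is essentially zero. Its test error is typically much worse than that, which is the overfitting pattern described here.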

The tell-tale signs

Underfitting: high train error, high test error

Bad on the training data and bad on new data. The model is too simple — high bias.

Just right: low train error ≈ low test error

Good on training data, almost as good on new data. The gap is small.

Overfitting: low train error, high test error

Near-perfect on training data but poor on new data. A big gap — high variance.

The diagnostic

Watch the gap between training and validation error. A large gap (low training error, much higher validation error) means overfitting; both errors being high means underfitting.
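The rule can be written down as a tiny helper. This is only a sketch: the function name diagnose and both thresholds (acceptable_error, max_gap) are made up for illustration. What counts as a "high" error or a "large" gap depends on the problem and the metric.

```python
def diagnose(train_error, val_error, acceptable_error=0.10, max_gap=0.05):
    """Label a model from its training and validation error.

    The default thresholds are illustrative assumptions, not standard values.
    """
    # High training error: the model cannot even fit the data it has seen.
    if train_error > acceptable_error:
        return "underfitting"
    # Low training error but much worse on held-out data: it memorized.
    if val_error - train_error > max_gap:
        return "overfitting"
    return "good fit"


print(diagnose(0.30, 0.32))  # underfitting
print(diagnose(0.02, 0.25))  # overfitting
print(diagnose(0.05, 0.07))  # good fit
```

The training error is checked first on purpose: if a model is already bad on its own training data, the size of the gap tells you little.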

Fixing each one

If underfitting

Use a more complex model (more features, depth, layers)
Add better features — see Feature Engineering
Reduce regularization
Train longer
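A small illustration of the "better features" fix, under toy assumptions: the data is a noise-free parabola, and the helper names fit_line and mse are ours. A straight line on the raw input x underfits, while the same linear model given the engineered feature x² fits exactly.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def mse(slope, intercept, xs, ys):
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x * x for x in xs]  # toy target: a parabola, no noise

# Raw feature x: the best line is flat, so the error stays high (underfitting).
raw_error = mse(*fit_line(xs, ys), xs, ys)

# Engineered feature x**2: the same linear model now matches the target.
squared = [x * x for x in xs]
engineered_error = mse(*fit_line(squared, ys), squared, ys)

print(raw_error, engineered_error)  # prints: 2.8 0.0
```

The model class did not change here, only what it was given to work with. That is why a better feature can cure underfitting without adding depth or layers.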

If overfitting

Get more training data
Add regularization — see Ridge vs Lasso
Use a simpler model, prune it, or add dropout
Use early stopping, and cross-validation to pick model complexity
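Early stopping is the easiest of these to sketch. The function below takes a list of per-epoch validation losses and reports where training should have stopped. The name early_stopping_epoch and the patience and min_delta parameters are illustrative assumptions; real training frameworks ship their own versions of this idea with their own options.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve by more than
    min_delta for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best: remember it (a real loop would also save the weights).
            best_loss = loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop here.
            return best_epoch, epoch
    return best_epoch, len(val_losses) - 1


# Validation loss falls, bottoms out at epoch 3, then climbs again.
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61, 0.70]
print(early_stopping_epoch(losses))  # prints: (3, 6)
```

The climb after epoch 3 is the point where the model starts fitting noise. Stopping there, and keeping the epoch-3 weights, avoids the last stretch of memorization.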

Why this connects to bias and variance

Underfitting and overfitting are the two ends of the bias–variance tradeoff. Underfitting is high bias (wrong assumptions); overfitting is high variance (too sensitive to the particular training sample). The next article makes that tradeoff precise.
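As a preview, the standard decomposition for squared-error loss is given here. The notation is ours, not the article's: we assume y = f(x) + ε with zero-mean noise of variance σ², and f̂ is the model fitted on a randomly drawn training sample, with expectations taken over that sample and the noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Underfitting corresponds to a large first term, overfitting to a large second term; the third is a floor that no model can get below.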