Early Stopping
Know when to quit
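More training isn't always better. Past a point, a network stops learning the general pattern and starts memorizing noise in the training data. The training loss won't show this, since it keeps improving; the clearest sign is the loss on held-out validation data starting to rise.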
Early stopping watches that validation loss and halts training when it stops improving, then rewinds to the best weights. Simple, cheap, and one of the most effective regularizers there is.
Watch the curves diverge
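Training loss keeps falling for as long as you train; validation loss bottoms out, then turns back up. Early stopping treats that turning point as the best epoch: training halts shortly after it, and the weights from that epoch are the ones you keep.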
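To make the turning point concrete, here is a minimal sketch that locates it in a pair of loss curves. The numbers in train_loss and val_loss are made up for illustration, not measurements from any real model.

```python
# Hypothetical per-epoch losses, invented for illustration only.
train_loss = [0.90, 0.60, 0.45, 0.35, 0.28, 0.22, 0.18, 0.15]
val_loss = [0.95, 0.70, 0.58, 0.52, 0.50, 0.53, 0.57, 0.62]

# The turning point is the epoch with the lowest validation loss.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)

# The gap between the two curves keeps widening after that epoch,
# which is the divergence the plot is meant to show.
gap = [v - t for t, v in zip(train_loss, val_loss)]

print(f"best epoch (0-indexed): {best_epoch}")
print(f"gap at best epoch: {gap[best_epoch]:.2f}, gap at last epoch: {gap[-1]:.2f}")
```

In a real run you only see the curve one epoch at a time, so you cannot take the minimum after the fact; that is what the patience rule is for.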
How it works in practice
Monitor
validation loss
After each epoch, evaluate on a held-out validation set.
Patience
wait N epochs
Don't stop at the first uptick. Wait N epochs (the patience) with no improvement, so that noise in the validation loss doesn't end training prematurely.
Restore best
rewind weights
Save the weights at the best validation score and roll back to them when you stop.
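The three pieces above (monitor, patience, restore best) can be sketched as a small framework-agnostic helper. Everything here is an assumption for illustration: the class name EarlyStopping, the min_delta threshold, the dict standing in for model weights, and the validation losses, which are invented. Libraries such as Keras ship their own early-stopping callbacks, whose details differ.

```python
import copy


class EarlyStopping:
    """Illustrative helper, not any particular library's API."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # smallest drop that counts as improvement
        self.best_loss = float("inf")
        self.best_epoch = None
        self.best_weights = None
        self.bad_epochs = 0

    def step(self, epoch, val_loss, weights):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            # New best: remember the loss and snapshot the weights.
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.best_weights = copy.deepcopy(weights)
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Toy run: the dict stands in for real model weights, and the
# validation losses are made-up numbers.
val_losses = [0.95, 0.70, 0.58, 0.52, 0.50, 0.53, 0.51, 0.57, 0.62]
weights = {"w": 0.0}
stopper = EarlyStopping(patience=3)
stopped_at = None

for epoch, loss in enumerate(val_losses):
    weights["w"] = float(epoch)  # stand-in for one epoch of training
    if stopper.step(epoch, loss, weights):
        stopped_at = epoch
        break

# Restore best: roll back to the snapshot from the best epoch.
weights = stopper.best_weights
print(f"stopped at epoch {stopped_at}, restored weights from epoch {stopper.best_epoch}")
```

The snapshot is a deep copy because the training loop keeps mutating the live weights; without the copy, "best" would silently track the latest epoch.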
Why it's so useful
Pros
- Limits overfitting at almost no extra cost
- Saves compute by ending training early
- Adds no hyperparameters to the model itself (patience belongs to the training loop)
Keep in mind
- Needs a held-out validation set
- Noisy validation curves need a larger patience; too small a value stops training too soon
- Combine with dropout / weight decay for best results
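The point about patience can be illustrated with a small sketch that replays one noisy validation curve under two patience settings. The function name run_early_stopping and the loss values are hypothetical, chosen only to show the effect.

```python
def run_early_stopping(val_losses, patience):
    """Return (best_epoch, last_epoch_run) for a recorded validation curve."""
    best_loss = float("inf")
    best_epoch = None
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return best_epoch, epoch
    # Patience never ran out: training used every recorded epoch.
    return best_epoch, len(val_losses) - 1


# Made-up noisy curve: small upticks at epochs 2 and 5, true minimum at epoch 6.
noisy = [0.90, 0.70, 0.72, 0.60, 0.55, 0.57, 0.50, 0.52, 0.54, 0.56, 0.58]

impatient = run_early_stopping(noisy, patience=1)
patient = run_early_stopping(noisy, patience=3)

print(f"patience=1: best epoch {impatient[0]}, stopped at epoch {impatient[1]}")
print(f"patience=3: best epoch {patient[0]}, stopped at epoch {patient[1]}")
```

On this curve, a patience of 1 quits at the first uptick and keeps weights from epoch 1, while a patience of 3 rides out the noise and reaches the minimum at epoch 6, at the price of three extra epochs past it.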