AdaBoost · Suman Bhadra Notes

A team that learns from its mistakes

AdaBoost ("Adaptive Boosting") builds a strong model from a sequence of weak learners — usually tiny decision stumps that are barely better than a coin flip — by making each new one concentrate on the examples the previous ones got wrong.

It's a form of ensemble learning called boosting: models are added one at a time, in sequence, each correcting its predecessor — unlike bagging, where models train independently in parallel.

Watch the weights shift

Each round, a stump splits the data, the misclassified points gain weight (drawn bigger), and the next stump is forced to pay attention to them. The final classifier is a weighted vote of all the stumps.
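To make the weight shift concrete, here is a minimal sketch of a single round in Python. It assumes the standard discrete AdaBoost update, which these notes do not spell out, and the function name `reweight` is made up for illustration. It also assumes the stump's weighted error is strictly between 0 and 1.

```python
import math

def reweight(weights, mistakes):
    """One AdaBoost round: return (alpha, new_weights).

    weights  -- current example weights, summing to 1
    mistakes -- booleans, True where the stump misclassified the example
    """
    # Weighted error: total weight sitting on the misclassified points.
    # Assumption: 0 < err < 1, otherwise the logarithm below is undefined.
    err = sum(w for w, m in zip(weights, mistakes) if m)
    # The stump's say in the final vote; larger when the error is smaller.
    alpha = 0.5 * math.log((1 - err) / err)
    # Grow the weight of mistakes, shrink the rest, then renormalise.
    raw = [w * math.exp(alpha if m else -alpha)
           for w, m in zip(weights, mistakes)]
    total = sum(raw)
    return alpha, [r / total for r in raw]

# Five equally weighted points; the stump gets only the last one wrong.
alpha, new_w = reweight([0.2] * 5, [False, False, False, False, True])
# new_w is approximately [0.125, 0.125, 0.125, 0.125, 0.5]
```

With this update the misclassified points always end up holding half of the total weight, which is why the next stump cannot ignore them.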

The algorithm in four moves

1. Equal weights (start fair)

Every training example begins with the same importance.

2. Train a stump (weak learner)

Fit a simple model that minimises the weighted error.

3. Reweight (boost the errors)

Increase the weight of misclassified points so the next learner focuses there.

4. Weighted vote (α by accuracy)

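Each stump gets a say α that grows with its accuracy: in the classic binary version, α = ½ ln((1 − ε) / ε), where ε is the stump's weighted error. The final prediction is the sign of the α-weighted sum of the stumps' votes.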
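The four moves can be sketched end to end in plain Python. This is an illustrative toy, not a reference implementation: it assumes one-dimensional inputs, labels coded as +1 and -1, and the classic discrete AdaBoost formulas. The names `fit_stump`, `adaboost`, and `predict`, and the small clamp on the error, are choices made for this sketch.

```python
import math

def stump_predict(x, threshold, polarity):
    # Predict `polarity` to the right of the threshold, `-polarity` otherwise.
    return polarity if x > threshold else -polarity

def fit_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best 1-D stump."""
    pts = sorted(set(xs))
    # Candidates: one threshold below all points, then midpoints of neighbours.
    thresholds = [pts[0] - 1.0] + [(a + b) / 2 for a, b in zip(pts, pts[1:])]
    best = None
    for t in thresholds:
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, t, polarity) != y)
            if best is None or err < best[0]:
                best = (err, t, polarity)
    return best

def adaboost(xs, ys, rounds):
    n = len(xs)
    weights = [1.0 / n] * n                        # move 1: equal weights
    ensemble = []
    for _ in range(rounds):
        err, t, pol = fit_stump(xs, ys, weights)   # move 2: train a stump
        err = min(max(err, 1e-10), 1 - 1e-10)      # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)    # the stump's say
        ensemble.append((alpha, t, pol))
        # move 3: y * prediction is -1 on mistakes, so their weight grows.
        weights = [w * math.exp(-alpha * y * stump_predict(x, t, pol))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    # move 4: sign of the alpha-weighted vote.
    score = sum(a * stump_predict(x, t, pol) for a, t, pol in ensemble)
    return 1 if score >= 0 else -1

# A pattern no single threshold can separate.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
model = adaboost(xs, ys, rounds=3)
train_errors = sum(predict(model, x) != y for x, y in zip(xs, ys))
```

On this toy data a single stump misclassifies three of the ten points, while the three-stump vote fits all ten, which is the "weak learners into a strong one" effect in miniature.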

Strengths and cautions

Strengths

Turns weak learners into a strong one
Little tuning; with tree stumps, no feature scaling is needed
Often very accurate on clean data

Cautions

Sensitive to noise & outliers — it keeps boosting them
Sequential: each round depends on the one before, so it is slower to train than bagging, which can fit its models in parallel
Can overfit with too many rounds on noisy data

Boosting cousins

AdaBoost reweights points; Gradient Boosting instead fits each new learner to the residual errors of the ensemble so far (more generally, to the negative gradient of the loss). XGBoost is a heavily optimised, regularised implementation of that idea.
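For contrast with reweighting, here is a small sketch of the residual-fitting idea in plain Python. It assumes regression with squared-error loss (where the residual is exactly what each new learner should fit), one-dimensional inputs with at least two distinct values, and a shrinkage factor `learning_rate` that is not mentioned in these notes. The function names are hypothetical, and none of XGBoost's optimisations or regularisation are shown.

```python
def fit_regression_stump(xs, residuals):
    """Best single split by squared error: (threshold, left_mean, right_mean)."""
    pts = sorted(set(xs))
    best = None
    for a, b in zip(pts, pts[1:]):
        t = (a + b) / 2
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]

def gradient_boost(xs, ys, rounds, learning_rate=0.5):
    """Return the training predictions after `rounds` boosting steps."""
    base = sum(ys) / len(ys)          # start from the mean of the targets
    preds = [base] * len(ys)
    for _ in range(rounds):
        # What the ensemble still gets wrong on each point.
        residuals = [y - p for y, p in zip(ys, preds)]
        # The new learner is fitted to the residuals, not to the labels.
        t, lm, rm = fit_regression_stump(xs, residuals)
        preds = [p + learning_rate * (lm if x <= t else rm)
                 for x, p in zip(xs, preds)]
    return preds

xs = [0, 1, 2, 3, 4, 5]
ys = [1.0, 1.2, 3.1, 2.9, 5.0, 5.2]
fitted = gradient_boost(xs, ys, rounds=10)
```

No example weights appear anywhere in this loop: the "focus on mistakes" comes entirely from the shrinking residuals, which is the key difference from AdaBoost.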