Transfer Learning · Suman Bhadra Notes

Stand on the shoulders of a trained model

A ResNet trained on millions of ImageNet photos has already learned to detect edges, textures, shapes, and object parts. Why throw that away? Transfer learning reuses those learned features for your own task.

You keep the pretrained feature extractor, replace just the final classifier head with one for your classes, and train on your (often tiny) dataset. The result: strong accuracy from a few hundred images instead of a few million.

Reuse, replace, retrain

Picture a model pretrained on 1000 ImageNet classes getting its head swapped for a 2-class cats-vs-dogs task, with the feature layers frozen.
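To make the head swap concrete, here is a minimal sketch that only counts parameters. The `Layer` class, `swap_head` helper, and the 512-dimensional feature size (as in ResNet-18) are assumptions for illustration, and the backbone count is a round placeholder, not a measured number.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    n_params: int
    trainable: bool = True


def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


FEATURE_DIM = 512  # assumption: feature vector size, as in ResNet-18

# Hypothetical pretrained model: feature extractor plus a 1000-class head.
pretrained = [
    Layer("backbone", n_params=11_000_000),  # illustrative round number
    Layer("head", n_params=linear_params(FEATURE_DIM, 1000)),
]


def swap_head(layers, num_classes):
    """Freeze every layer except the last, then attach a fresh trainable head."""
    frozen = [Layer(l.name, l.n_params, trainable=False) for l in layers[:-1]]
    new_head = Layer("head", linear_params(FEATURE_DIM, num_classes))
    return frozen + [new_head]


cats_vs_dogs = swap_head(pretrained, num_classes=2)
old_head_params = pretrained[-1].n_params
trainable_params = sum(l.n_params for l in cats_vs_dogs if l.trainable)
print(old_head_params, trainable_params)  # 513000 1026
```

Under these assumptions, only about a thousand parameters are left to train, which is why a few hundred images can be enough.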

Two ways to do it

Feature extraction: freeze the base

Freeze all pretrained layers and train only the new head. This is fast and needs little data, so it is the usual choice when your dataset is small.
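Feature extraction can be sketched without any deep learning library. In the toy below, `frozen_features` stands in for a pretrained backbone (a fixed function that is never updated), and only a small logistic-regression head is trained. The function names, the toy dataset, and the hyperparameters are all made up for illustration.

```python
import math


def frozen_features(x):
    """Stand-in for a pretrained backbone: fixed, never updated."""
    return [x[0] + x[1], x[0] - x[1]]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(data, epochs=200, lr=0.5):
    """Train only the head (weights w, bias b) with per-sample gradient descent."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            f = frozen_features(x)  # backbone output; nothing here is updated
            p = sigmoid(w[0] * f[0] + w[1] * f[1] + b)
            err = p - y  # gradient of the log loss with respect to the logit
            w = [w[i] - lr * err * f[i] for i in range(2)]
            b -= lr * err
    return w, b


def predict(w, b, x):
    f = frozen_features(x)
    return 1 if sigmoid(w[0] * f[0] + w[1] * f[1] + b) >= 0.5 else 0


# Toy "small dataset": label is 1 when the first coordinate is larger.
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
data = [((a, b), 1 if a > b else 0) for a in grid for b in grid if a != b]

w, b = train_head(data)
accuracy = sum(predict(w, b, x) == y for x, y in data) / len(data)
print(f"training accuracy: {accuracy:.2f}")
```

The head learns quickly here because the frozen features already separate the two classes, which is the situation transfer learning hopes for.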

Fine-tuning: unfreeze some

Unfreeze the top few pretrained layers as well and train them at a low learning rate alongside the new head. This usually gives more accuracy when you have more data.
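The idea of giving the unfrozen layers a lower learning rate than the head can be sketched with a two-parameter toy model. Here a single scalar `base_w` stands in for the unfrozen pretrained layers and `head_w` for the new head; the `fine_tune` function, the data, and the learning rates are assumptions chosen for illustration, not recommended values.

```python
def fine_tune(data, base_w, head_w, base_lr, head_lr, epochs=50):
    """Gradient descent on squared error for y_hat = head_w * (base_w * x),
    with a separate learning rate for each parameter group."""
    for _ in range(epochs):
        for x, y in data:
            feat = base_w * x
            err = head_w * feat - y
            grad_head = err * feat        # d(0.5 * err**2) / d head_w
            grad_base = err * head_w * x  # d(0.5 * err**2) / d base_w
            head_w -= head_lr * grad_head
            base_w -= base_lr * grad_base
    return base_w, head_w


# Toy target task: y = 3 * x. The "pretrained" base weight starts at 2.0.
data = [(x, 3.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]

# Feature extraction: base frozen (learning rate 0), only the head moves.
frozen_base, frozen_head = fine_tune(data, 2.0, 0.0, base_lr=0.0, head_lr=0.05)

# Fine-tuning: base unfrozen with a much smaller learning rate than the head.
tuned_base, tuned_head = fine_tune(data, 2.0, 0.0, base_lr=0.001, head_lr=0.05)

print(frozen_base, round(frozen_head, 4))
print(round(tuned_base, 4), round(tuned_head, 4))
```

With the small base learning rate, the pretrained weight drifts only slightly from its starting value while the head does most of the adapting.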

Why it works so well

Benefits

Great accuracy from small datasets
Much faster training, less compute
Early layers (edges, textures) transfer across most vision tasks

Keep in mind

Works best when the source domain is similar to your target domain
Match the pretrained model's input preprocessing
Fine-tune gently: a large learning rate can wreck good features
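As an example of matching input preprocessing, many ImageNet-pretrained models (including those in torchvision) expect pixels scaled to [0, 1] and then standardized per channel with the mean and standard deviation below. The `preprocess_pixel` helper is a hypothetical sketch; check the documentation of the specific model you use, since its expected preprocessing may differ.

```python
# Per-channel statistics commonly used with ImageNet-pretrained models.
# Assumption: your model was trained with this normalization; verify it.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def preprocess_pixel(rgb_uint8, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Scale a 0-255 RGB pixel to [0, 1], then standardize each channel."""
    return tuple((c / 255.0 - m) / s for c, m, s in zip(rgb_uint8, mean, std))


black = preprocess_pixel((0, 0, 0))
white = preprocess_pixel((255, 255, 255))
print(black)
print(white)
```

Feeding raw 0-255 values into a model trained on standardized inputs puts the features far outside the range the pretrained layers saw, which tends to hurt accuracy.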

Beyond vision

Modern NLP works much the same way: pretrained language models are fine-tuned on your task. See LLM fine-tuning and Hugging Face.