Transfer Learning


Stand on the shoulders of a trained model

A ResNet trained on more than a million ImageNet photos has already learned to detect edges, textures, shapes, and object parts. Why throw that away? Transfer learning reuses those learned features for your own task.

You keep the pretrained feature extractor, replace only the final classifier head with one sized for your classes, and train on your own (often tiny) dataset. The result is often strong accuracy from a few hundred images rather than the million-plus the original model was trained on.

Reuse, replace, retrain

Picture a model pretrained on the 1000 ImageNet classes having its head swapped for a 2-class cats-vs-dogs task, with the feature layers frozen.

Two ways to do it

Feature extraction: freeze the base

Freeze all pretrained layers and train only the new head. This is fast and well suited to small datasets, because only the head's weights are learned.

Fine-tuning: unfreeze some

Also unfreeze the last few pretrained layers (those closest to the output) and train them at a low learning rate. This usually gives higher accuracy when you have more data.

Why it works so well

Benefits
  • Great accuracy from small datasets
  • Much faster training, less compute
  • Early layers (edges, textures) transfer across most vision tasks

Keep in mind
  • Works best when the source domain is similar to your target domain
  • Match the pretrained model's input preprocessing (resizing, normalization)
  • Fine-tune gently: a large learning rate can wreck good features
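
To make the preprocessing point concrete, here is a small sketch of the normalization step. It assumes the per-channel mean and standard deviation that torchvision's ImageNet-pretrained models use. Other pretrained models may expect different values, so check the documentation of the weights you load. The `preprocess` helper is hypothetical.

```python
import torch

# Channel statistics used by torchvision's ImageNet-pretrained models.
IMAGENET_MEAN = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
IMAGENET_STD = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)


def preprocess(image_uint8: torch.Tensor) -> torch.Tensor:
    """Convert a (3, H, W) uint8 RGB image to a normalized float tensor."""
    scaled = image_uint8.float() / 255.0  # [0, 255] -> [0, 1]
    return (scaled - IMAGENET_MEAN) / IMAGENET_STD


# Stand-in for a decoded photo that has already been resized to 224x224.
image = torch.randint(0, 256, (3, 224, 224), dtype=torch.uint8)
batch = preprocess(image).unsqueeze(0)  # add the batch dimension

print(tuple(batch.shape), batch.dtype)
```

Feeding a pretrained model inputs on a different scale (for example raw 0 to 255 values) shifts every activation away from what the frozen layers saw in training, which can noticeably hurt accuracy.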
Beyond vision

The same idea drives modern NLP: pretrained language models are fine-tuned on your task. See LLM fine-tuning and Hugging Face.