Types of Machine Learning
One question splits them all
What separates the families of machine learning is simple: what feedback does the model get while it learns?
Does it see the right answer for every example? Does it see no answers at all and have to find structure itself? Or does it learn by trial and error, nudged by rewards? Those three answers give the three classic families — supervised, unsupervised, and reinforcement learning.
Supervised = learning with an answer key. Unsupervised = learning with no answer key. Reinforcement = learning from rewards and penalties.
See all three
The animation steps through each family with a tiny illustration: labelled points getting a boundary, unlabelled points falling into clusters, and an agent finding its way to a reward.
Supervised learning — learn from labelled examples
Every training example comes with the correct answer attached. The model's job is to learn the mapping from input to answer so it can predict the answer for new inputs. Two flavours:
- Regression: house price, temperature, exam score. The output is a continuous number. See Linear Regression.
- Classification: spam / not spam, cat / dog, disease / healthy. The output is a label from a fixed set. See Logistic Regression.
What it needs: a labelled dataset, which is often the expensive part. Someone has to mark thousands of examples with the right answer.
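To make the two flavours concrete, here is a minimal sketch in plain Python. The data values, the feature choices, and the helper names (`fit_line`, `fit_centroids`, `classify`) are all invented for illustration. The classifier is a simple nearest-centroid rule, used here as a stand-in rather than the logistic regression mentioned above.

```python
# Supervised learning sketch: every training input comes with its answer.

def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x

# Regression: hypothetical (size in m^2, price in thousands) pairs.
sizes = [50.0, 70.0, 90.0, 110.0]
prices = [150.0, 210.0, 270.0, 330.0]
slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 80.0 + intercept  # predict for an unseen size

def fit_centroids(points, labels):
    """Nearest-centroid classifier: remember the mean point of each class."""
    sums, counts = {}, {}
    for (x, y), label in zip(points, labels):
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {lab: (sx / counts[lab], sy / counts[lab])
            for lab, (sx, sy) in sums.items()}

def classify(centroids, point):
    """Predict the label whose centroid is closest to the point."""
    def dist2(lab):
        cx, cy = centroids[lab]
        return (cx - point[0]) ** 2 + (cy - point[1]) ** 2
    return min(centroids, key=dist2)

# Classification: hypothetical (links, exclamation marks) counts per email.
emails = [(8, 6), (9, 7), (7, 8), (1, 0), (0, 1), (2, 1)]
labels = ["spam", "spam", "spam", "not spam", "not spam", "not spam"]
centroids = fit_centroids(emails, labels)
prediction = classify(centroids, (6, 5))
```

In both cases the answers (`prices`, `labels`) are part of the training data, which is what makes the setting supervised.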
Unsupervised learning — find structure with no labels
Here the data has no answers attached. The model looks for patterns, groupings, or a more compact representation on its own.
- Clustering: segment customers, group news articles by topic. See K-Means Clustering.
- Dimensionality reduction: squeeze many features into a few while keeping the signal. See PCA.
- Anomaly detection: flag fraud or faulty sensors as points that don't fit any pattern.
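As an illustration of finding groups without labels, the following is a small k-means sketch in plain Python. The points and the starting centroids are made up, and the naive initialisation (just the first two points) is an assumption for the sake of a deterministic example; practical implementations usually pick starting centroids more carefully.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    assignments = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        assignments = [
            min(range(len(centroids)),
                key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: recompute each centroid as the mean of its members.
        updated = []
        for c, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                updated.append(tuple(sum(axis) / len(members)
                                     for axis in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid as-is
        centroids = updated
    return centroids, assignments

# Hypothetical unlabelled data: two loose groups, no answers attached.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, assignments = kmeans(points, centroids=[points[0], points[1]])
```

Note that the algorithm is never told which group a point belongs to; the cluster indices in `assignments` are arbitrary names for the groups it found.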
Reinforcement learning — learn by reward
An agent takes actions in an environment, receives rewards or penalties, and gradually learns a policy — a strategy that maximises long-term reward. There is no fixed dataset; experience is generated by acting.
- Game playing: chess, Go, and video games, where the reward is the score or the win.
- Robotics: teach a robot to walk or grasp by rewarding progress.
- Recommendations: learn which suggestions keep users coming back.
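The agent, environment, reward, and policy described above can be sketched with tabular Q-learning, one common reinforcement learning method (chosen here for illustration, not named in the text above). The five-cell corridor environment, the hyperparameter values, and all function names are assumptions made up for this example.

```python
import random

N_STATES = 5          # corridor cells 0..4; the reward sits in cell 4
ACTIONS = (-1, +1)    # index 0 = step left, index 1 = step right

def step(state, action_index):
    """Environment: move one cell, stay inside the corridor,
    and pay a reward of 1 only when the goal cell is reached."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values from experience generated by acting."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            # Epsilon-greedy: explore sometimes (and on ties), else exploit.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Q-learning update: move the estimate toward
            # reward + discounted value of the best next action.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

q_table = train()
# Greedy policy for the non-goal cells, read off the learned table.
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
```

There is no dataset anywhere in this code: every update comes from a transition the agent produced by acting, and the discount factor `gamma` is what makes it value the delayed reward at the end of the corridor.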
A middle ground worth knowing
- Semi-supervised learning: a small labelled set guides learning over a much larger unlabelled pile. This is common when labelling is costly.
- Self-supervised learning: the model builds its own task from unlabelled data, such as predicting a hidden word, which is how modern LLMs are pre-trained.
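The idea of a model inventing its own task can be shown with a toy next-word predictor. This is a deliberately tiny sketch: the sentence is made up, a word-pair count table stands in for a real model, and predicting the next word is used as a simple variant of the hidden-word task.

```python
from collections import Counter, defaultdict

# Raw, unlabelled text (hypothetical example sentence).
text = "the cat sat on the mat . the cat ate ."
words = text.split()

# Self-made labels: each word is an input, the word after it is the "answer".
pairs = list(zip(words, words[1:]))

# "Training": count which word follows which.
following = defaultdict(Counter)
for current, nxt in pairs:
    following[current][nxt] += 1

def predict_next(word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in following:
        return None
    return following[word].most_common(1)[0][0]

guess = predict_next("the")
```

Nobody labelled this data: the training pairs were cut directly out of the text, which is why the approach scales to very large unlabelled corpora.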
Quick comparison
When to use each:
- Have inputs and answers → supervised
- Have inputs but no answers → unsupervised
- Have a goal, can act, and get rewards → reinforcement
Main challenge of each:
- Supervised: labels are costly to collect
- Unsupervised: results are harder to evaluate, since there is no answer key to score against
- Reinforcement: training is often slow and unstable