Recommender Systems · Suman Bhadra Notes

A table full of holes

Picture a giant table: users down the side, items across the top, a rating in each cell. The catch — almost every cell is empty, because no one rates everything. A recommender system's job is to predict the missing entries, then suggest the items it thinks you'd score highest.
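As a minimal sketch of that picture, the table can be stored as only its filled cells, with a predictor plugged in to rank what a user has not rated yet. The titles, ratings, and the `item_mean` placeholder predictor below are made up for illustration; a real system would swap in one of the methods described later.

```python
# Toy ratings table: only the filled cells are stored (user -> item -> rating).
# All names and numbers here are invented for illustration.
ratings = {
    "ana": {"Alien": 5, "Heat": 3},
    "ben": {"Alien": 4, "Amelie": 2},
    "cy": {"Heat": 4, "Amelie": 5, "Up": 4},
}
items = sorted({item for row in ratings.values() for item in row})


def sparsity(ratings, items):
    """Fraction of cells in the user x item table that are empty."""
    filled = sum(len(row) for row in ratings.values())
    return 1 - filled / (len(ratings) * len(items))


def item_mean(item):
    """Placeholder predictor: the item's average over users who rated it."""
    scores = [row[item] for row in ratings.values() if item in row]
    return sum(scores) / len(scores)


def recommend(user, predict, k=2):
    """Rank the items the user has not rated by predicted score, highest first."""
    unseen = [i for i in items if i not in ratings[user]]
    return sorted(unseen, key=predict, reverse=True)[:k]


print(f"empty cells: {sparsity(ratings, items):.0%}")
print(recommend("ana", item_mean))
```

Even in this tiny table about 42% of the cells are empty; in real rating data the share is usually far higher.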

Content-based: match features

Recommend items similar to ones you liked, using item attributes (genre, brand, keywords).
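One common way to measure "similar" is cosine similarity between item attribute vectors. The sketch below assumes each item is described by three hypothetical binary genre flags; the titles and flags are illustrative, not taken from any dataset.

```python
import math

# Hypothetical item attributes: one flag per genre (sci-fi, crime, romance).
features = {
    "Alien": [1, 0, 0],
    "Blade Runner": [1, 1, 0],
    "Heat": [0, 1, 0],
    "Amelie": [0, 0, 1],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similar_items(liked, k=2):
    """Return the k items whose attribute vectors are closest to the liked item."""
    scores = {
        name: cosine(features[liked], vec)
        for name, vec in features.items()
        if name != liked
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


print(similar_items("Alien", k=1))
```

No other user's ratings appear anywhere in this code, which is what separates content-based matching from the collaborative approach.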

Collaborative: match people

"People like you also liked…" — use the crowd's behaviour, no item features needed.

Hybrid: both

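Real systems blend the two. Blending also helps with the cold-start problem: brand-new users and items have no ratings yet, so collaborative methods have nothing to go on until item or user attributes fill the gap.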

Two ways to fill a blank

First, neighbourhood-based collaborative filtering: find users who agree with you and borrow their opinion. Then, matrix factorization: explain every rating as a handful of hidden "taste factors".
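The neighbourhood idea can be sketched as a similarity-weighted average over the users who rated the item. This is one simple variant, assuming cosine similarity on co-rated items and invented toy ratings; practical systems often mean-centre each user's ratings first, which this sketch skips.

```python
import math

# Invented toy ratings (user -> item -> rating on a 1-5 scale).
ratings = {
    "ana": {"Alien": 5, "Heat": 3, "Up": 4},
    "ben": {"Alien": 5, "Heat": 3, "Up": 4, "Amelie": 2},
    "cy": {"Alien": 1, "Heat": 5, "Amelie": 5},
}


def similarity(u, v):
    """Cosine similarity over the items both users rated (0 if none are shared)."""
    shared = ratings[u].keys() & ratings[v].keys()
    if not shared:
        return 0.0
    dot = sum(ratings[u][i] * ratings[v][i] for i in shared)
    norm_u = math.sqrt(sum(ratings[u][i] ** 2 for i in shared))
    norm_v = math.sqrt(sum(ratings[v][i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)


def predict(user, item, k=2):
    """Similarity-weighted average rating from the k nearest users who rated the item."""
    neighbours = [
        (similarity(user, other), ratings[other][item])
        for other in ratings
        if other != user and item in ratings[other]
    ]
    top = sorted(neighbours, reverse=True)[:k]
    total = sum(sim for sim, _ in top)
    if total == 0:
        return None  # no usable neighbour: the sparsity problem in miniature
    return sum(sim * rating for sim, rating in top) / total


print(predict("ana", "Amelie"))
```

Here "ben" agrees with "ana" on every shared item, so his low rating for "Amelie" pulls the prediction toward 2 more strongly than "cy" pulls it toward 5.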

Matrix factorization, in one idea

R ≈ U × Vᵀ

Approximate the huge, sparse ratings matrix R as the product of two skinny matrices: one row of hidden factors per user (U) and one per item (V). A user's predicted rating for an item is the dot product of the user's factor vector with the item's. Learn U and V by minimizing squared error on the ratings you do have, using the same gradient-descent idea behind linear regression.
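A minimal sketch of that learning loop, assuming squared error, per-rating gradient updates, and a small L2 penalty (`reg`) that the notes do not mention but which is commonly added to keep factors from growing too large. The function names, hyperparameters, and toy ratings are all illustrative choices.

```python
import random


def factorize(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02, epochs=1000, seed=0):
    """Learn U (users x k) and V (items x k) from (user, item, rating) triples."""
    rng = random.Random(seed)
    # Small random starting factors; exact zeros would never move.
    U = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(a * b for a, b in zip(U[u], V[i]))
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                # Gradient step on squared error plus the assumed L2 penalty.
                U[u][f] += lr * (err * vf - reg * uf)
                V[i][f] += lr * (err * uf - reg * vf)
    return U, V


def predict(U, V, u, i):
    """Predicted rating: dot product of the user's and item's factor vectors."""
    return sum(a * b for a, b in zip(U[u], V[i]))


# Six observed cells of a 3 x 3 table; the other three cells are blank.
observed = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 2), (2, 1, 4), (2, 2, 5)]
U, V = factorize(observed, n_users=3, n_items=3)

for u, i, r in observed:
    print(f"user {u}, item {i}: actual {r}, predicted {predict(U, V, u, i):.2f}")
print(f"blank cell (user 0, item 2): predicted {predict(U, V, 0, 2):.2f}")
```

Only the observed cells contribute to the updates; the blank cells are filled afterwards by the same dot product, which is where the predictions come from.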

Neighbourhood CF

Intuitive, easy to explain
Struggles when the matrix is very sparse
Slow to find neighbours at scale

Matrix factorization

Compresses millions of ratings into small factor vectors
Generalizes through learned latent structure
The workhorse of the winning Netflix Prize entries