Filters & Kernels — What They Learn
A kernel is a pattern detector
The numbers inside a convolution kernel decide what it responds to. One set of numbers detects vertical edges; another blurs; another sharpens. The kernel "lights up" wherever the image matches its pattern.
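To make "lights up" concrete, here is a minimal sketch in plain Python. It assumes a Sobel-style vertical-edge kernel and two hand-made 3x3 patches; the kernel values, patch values, and the helper name `response` are illustrative choices, not something the text above specifies. The kernel's response to a patch is just the sum of elementwise products.

```python
# Vertical-edge kernel (Sobel-style, an illustrative choice):
# negative weights on the left, positive weights on the right.
EDGE_KERNEL = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

def response(patch, kernel):
    """Sum of elementwise products between a 3x3 patch and a 3x3 kernel."""
    return sum(
        patch[r][c] * kernel[r][c]
        for r in range(3)
        for c in range(3)
    )

# Dark on the left, bright on the right: a vertical edge.
edge_patch = [
    [0, 0, 1],
    [0, 0, 1],
    [0, 0, 1],
]

# Uniformly bright: no edge anywhere.
flat_patch = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1],
]

edge_score = response(edge_patch, EDGE_KERNEL)
flat_score = response(flat_patch, EDGE_KERNEL)
print(edge_score, flat_score)  # prints: 4 0
```

The patch that matches the kernel's dark-to-bright pattern scores 4, while the flat patch scores 0 because the negative and positive weights cancel.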
In classic image processing, people hand-designed these kernels. In a CNN, the network learns them from data — discovering whatever detectors best solve the task.
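As a rough illustration of "learning a kernel from data", the sketch below recovers a 3x3 kernel by gradient descent on a mean squared error. It is a toy under stated assumptions: a hidden target kernel supplies the training signal, the patches are random zero-centred values, and the names `TARGET`, `respond`, and `learning_rate` are hypothetical. A real CNN learns its kernels by backpropagating a task loss through many layers rather than by imitating a known kernel.

```python
import random

random.seed(0)

# Hidden "ground truth" detector; the learner only sees its outputs.
TARGET = [-1, 0, 1, -2, 0, 2, -1, 0, 1]  # 3x3 vertical-edge kernel, flattened

def respond(patch, kernel):
    """Kernel response: sum of elementwise products over the 9 pixels."""
    return sum(p * k for p, k in zip(patch, kernel))

# Training data: random zero-centred 3x3 patches and the target's response.
patches = [[random.uniform(-1, 1) for _ in range(9)] for _ in range(100)]
targets = [respond(p, TARGET) for p in patches]

kernel = [0.0] * 9   # start with a kernel that detects nothing
learning_rate = 0.5

for step in range(200):
    # Gradient of the mean squared error with respect to each weight.
    grad = [0.0] * 9
    for patch, target in zip(patches, targets):
        error = respond(patch, kernel) - target
        for i in range(9):
            grad[i] += 2 * error * patch[i] / len(patches)
    # Move every weight one step against its gradient.
    kernel = [k - learning_rate * g for k, g in zip(kernel, grad)]

learned = [round(k, 3) for k in kernel]
print(learned)
```

After training, the learned weights should sit very close to the hidden edge kernel, even though nobody typed those numbers into `kernel` by hand.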
Same image, different kernels
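Run one small image through an edge kernel, a blur kernel, and a sharpen kernel and you get three different outputs: the edge kernel responds only where brightness changes, the blur kernel averages neighboring pixels and softens the change, and the sharpen kernel exaggerates it. A deep CNN stacks many such detectors into a feature hierarchy.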
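A minimal sketch of that comparison in plain Python follows. The image (dark left half, bright right half), the three kernel definitions, and the helper name `correlate2d` are assumptions made for illustration. The helper slides the kernel without flipping it, which is cross-correlation, the operation deep learning libraries usually call convolution.

```python
def correlate2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and
    return the grid of responses."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# 5x6 image: dark left half (0), bright right half (9).
image = [[0, 0, 0, 9, 9, 9] for _ in range(5)]

edge = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]        # vertical-edge detector
blur = [[1 / 9] * 3 for _ in range(3)]              # 3x3 box average
sharpen = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]     # boost centre, subtract neighbours

edge_out = correlate2d(image, edge)
blur_out = correlate2d(image, blur)
sharp_out = correlate2d(image, sharpen)

# Every output row is identical here, so print only the first one.
for name, out in [("edge", edge_out), ("blur", blur_out), ("sharpen", sharp_out)]:
    print(name, [round(v, 2) for v in out[0]])
```

On this image the edge kernel is zero in the flat regions and large (36) only around the boundary, the blur kernel turns the hard step into a ramp of roughly 0, 3, 6, 9, and the sharpen kernel undershoots to -9 and overshoots to 18 on either side of the boundary, exaggerating the contrast.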
The feature hierarchy
The first conv layers learn simple detectors: edges at various angles, color blobs, gradients.
In the middle layers, combinations of edges become corners, curves, textures, and repeating motifs.
In the deepest layers, combinations of textures become eyes, wheels, and faces: whole object parts.
A conv layer learns many filters (e.g. 64), each producing a feature map (a channel). The next layer's filters span all those channels, combining lower features into higher ones.
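The channel bookkeeping can be sketched in plain Python as follows. This is a simplified model under stated assumptions: random weights, no bias, no nonlinearity, and small filter counts (4 and 2 instead of 64) so it runs quickly. The helper names `correlate2d`, `conv_layer`, `random_filters`, and `shape` are hypothetical. The point to notice is that each filter holds one 2-D kernel per input channel and sums the per-channel responses into a single feature map.

```python
import random

random.seed(0)

def correlate2d(image, kernel):
    """Single-channel sliding-window response (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(len(image[0]) - kw + 1)
        ]
        for r in range(len(image) - kh + 1)
    ]

def conv_layer(feature_maps, filters):
    """feature_maps: list of C 2-D maps. filters: list of F filters,
    each a list of C 2-D kernels (one per input channel).
    Returns F output maps; each filter sums its responses over all channels."""
    outputs = []
    for filt in filters:
        assert len(filt) == len(feature_maps), "a filter must span every input channel"
        per_channel = [correlate2d(m, k) for m, k in zip(feature_maps, filt)]
        h, w = len(per_channel[0]), len(per_channel[0][0])
        outputs.append([
            [sum(pc[r][c] for pc in per_channel) for c in range(w)]
            for r in range(h)
        ])
    return outputs

def random_filters(num_filters, in_channels, size=3):
    """Random stand-ins for learned weights."""
    return [
        [[[random.uniform(-1, 1) for _ in range(size)] for _ in range(size)]
         for _ in range(in_channels)]
        for _ in range(num_filters)
    ]

def shape(maps):
    return (len(maps), len(maps[0]), len(maps[0][0]))

# RGB input: 3 channels of 7x7 pixels.
rgb = [[[random.random() for _ in range(7)] for _ in range(7)] for _ in range(3)]

layer1 = random_filters(num_filters=4, in_channels=3)  # 4 filters, each 3x3 over 3 channels
layer2 = random_filters(num_filters=2, in_channels=4)  # 2 filters, each 3x3 over 4 channels

maps1 = conv_layer(rgb, layer1)     # one feature map per layer-1 filter
maps2 = conv_layer(maps1, layer2)   # each layer-2 filter reads all 4 maps

print(shape(maps1), shape(maps2))   # prints: (4, 5, 5) (2, 3, 3)

# Weight count, ignoring biases, for 64 filters of 3x3 over 64 input channels.
weights_64_to_64 = 64 * 64 * 3 * 3
print(weights_64_to_64)             # prints: 36864
```

The number of output channels equals the number of filters, while the depth of each filter equals the number of input channels. That is why a 3x3 layer with 64 filters reading 64 channels carries 36,864 weights rather than 576.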
Why learned beats hand-designed
Hand-crafted kernels capture what humans think matters. Learned kernels capture what actually minimizes the loss — often subtle detectors no one would design. That shift is a big part of why deep learning overtook classical computer vision.