Padding & Stride
Controlling the output size
Without padding, a convolution shrinks its output and under-samples the edges: border pixels fall under fewer kernel positions than interior pixels. Two settings control this: padding and stride.
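Padding: Surround the input with zeros so the kernel can sit centered on edge pixels. With the right amount, this keeps the output the same size as the input and gives edge pixels a fairer share of kernel positions.
Stride: How far the kernel jumps on each move. Stride 1 visits every position; stride 2 skips every other position, roughly halving the output size along each dimension.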
See the size change
The same input convolved three ways: no padding (shrinks), with padding (same size), and stride 2 (downsamples).
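The three cases can be reproduced with a minimal sketch in one dimension. The helper `conv1d`, the five-value input, and the all-ones kernel are illustrative assumptions, not part of the original page; the strided case here also uses padding 1.

```python
def conv1d(signal, kernel, padding=0, stride=1):
    """Naive 1D sliding-window convolution (cross-correlation, no kernel flip)."""
    # Zero-pad both ends of the input.
    padded = [0.0] * padding + list(signal) + [0.0] * padding
    k = len(kernel)
    out = []
    # Visit every start position where the kernel fits fully, stepping by `stride`.
    for start in range(0, len(padded) - k + 1, stride):
        window = padded[start:start + k]
        out.append(sum(w * x for w, x in zip(kernel, window)))
    return out


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [1.0, 1.0, 1.0]  # sums each window of three values

no_padding = conv1d(signal, kernel)                    # shrinks: 5 -> 3
with_padding = conv1d(signal, kernel, padding=1)       # same size: 5 -> 5
strided = conv1d(signal, kernel, padding=1, stride=2)  # downsamples: 5 -> 3

for name, result in [("no padding", no_padding),
                     ("padding 1", with_padding),
                     ("padding 1, stride 2", strided)]:
    print(f"{name}: length {len(result)} -> {result}")
```

Note that the first and last values of the padded result are smaller than their neighbours, because those windows include a padded zero.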
The output-size formula
out = ⌊(in + 2·padding − kernel) / stride⌋ + 1
Example: input 5, kernel 3, padding 0, stride 1 → ⌊(5 − 3)/1⌋ + 1 = 3. With padding 1 → ⌊(5 + 2 − 3)/1⌋ + 1 = 5, the same size as the input.
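The formula translates directly into a small helper, sketched here for one spatial dimension. The function name `conv_output_size` and its argument checks are assumptions made for illustration; padding is taken to be the amount added on each side.

```python
def conv_output_size(in_size, kernel, padding=0, stride=1):
    """Output length along one dimension: floor((in + 2*padding - kernel) / stride) + 1."""
    if stride < 1:
        raise ValueError("stride must be at least 1")
    if in_size + 2 * padding < kernel:
        raise ValueError("kernel is larger than the padded input")
    # Integer floor division implements the floor in the formula.
    return (in_size + 2 * padding - kernel) // stride + 1


print(conv_output_size(5, 3, padding=0, stride=1))  # no padding
print(conv_output_size(5, 3, padding=1, stride=1))  # padding 1
print(conv_output_size(6, 3, padding=0, stride=2))  # floor matters: (6 - 3) / 2 = 1.5
```

For a 2D input, apply the same calculation to height and width separately.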
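Valid (padding 0): The kernel is placed only at positions where it fits entirely inside the input, so the output shrinks by kernel − 1 at each layer.
Same: Pad just enough that output size = input size (with stride 1). For an odd kernel, that is (kernel − 1)/2 zeros on each side.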
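To compare the two modes over several layers, here is a short sketch. The helpers `output_size` and `same_padding`, the 32-wide input, and the five-layer depth are illustrative assumptions; `same_padding` only handles odd kernel sizes, where symmetric padding is possible.

```python
def output_size(in_size, kernel, padding, stride=1):
    """Output length from the formula: floor((in + 2*padding - kernel) / stride) + 1."""
    return (in_size + 2 * padding - kernel) // stride + 1


def same_padding(kernel):
    """Per-side padding that keeps output size equal to input size at stride 1."""
    if kernel % 2 == 0:
        raise ValueError("symmetric 'same' padding needs an odd kernel size")
    return (kernel - 1) // 2


size_valid = size_same = 32
for layer in range(1, 6):
    size_valid = output_size(size_valid, kernel=3, padding=0)
    size_same = output_size(size_same, kernel=3, padding=same_padding(3))
    print(f"layer {layer}: valid -> {size_valid}, same -> {size_same}")
```

With a 3-wide kernel, each valid layer removes two positions, while the same-padded size does not change.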
When to use what
- Use "same" padding to keep spatial size through many layers
  - Stops the image from shrinking away in deep stacks
  - Treats edge pixels more fairly
- Use stride 2 to downsample and cut compute
  - Reduces resolution as you go deeper
  - Trades spatial detail for lower cost, typically paired with an increase in channel count
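The downsampling effect of stride 2 can be traced with a few lines. This is a sketch under assumed settings (a 32×32 input, kernel 3, padding 1, three stride-2 layers); the helper `output_size` is a hypothetical name, not something from the original page.

```python
def output_size(in_size, kernel, padding, stride):
    """Output length from the formula: floor((in + 2*padding - kernel) / stride) + 1."""
    return (in_size + 2 * padding - kernel) // stride + 1


size = 32
sizes = [size]
for _ in range(3):
    # Each stride-2 layer roughly halves the width and height.
    size = output_size(size, kernel=3, padding=1, stride=2)
    sizes.append(size)

# Number of kernel positions per channel for a square 2D feature map.
positions = [s * s for s in sizes]

for s, p in zip(sizes, positions):
    print(f"{s}x{s} feature map: {p} positions")
```

Halving each side leaves a quarter of the positions, which is why a stride-2 layer leaves room in the compute budget for more channels.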