Padding & Stride


Controlling the output size

A plain convolution shrinks the image and under-samples the edges. Two settings fix and control this: padding and stride.

Padding: add a border

Surround the input with a border of zeros so the kernel can sit centered on edge pixels. With enough padding the output keeps the input's size, and edge pixels are covered by more kernel positions than they would be without it.
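To make the border concrete, here is a minimal sketch of zero padding, assuming NumPy is available. The 3×3 input and the padding width of 1 are illustrative choices, not values from the text above.

```python
import numpy as np

# Illustrative 3x3 input; the values are arbitrary.
x = np.arange(1, 10).reshape(3, 3)

# Add a one-pixel border of zeros on every side.
padded = np.pad(x, pad_width=1, mode="constant", constant_values=0)

print(padded.shape)  # (5, 5)
print(padded)
```

The original values sit untouched in the middle of the padded array, so a 3×3 kernel can now be centered on what used to be a corner pixel.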

Stride: step size

How far the kernel jumps on each move. Stride 1 visits every position; stride 2 skips every other position, roughly halving each output dimension.
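One way to picture the jump is to list where the kernel's leading edge lands along a single axis. The sketch below assumes no padding; `kernel_positions` is a hypothetical helper name and the sizes are illustrative.

```python
def kernel_positions(in_size, kernel, stride):
    """Start index of every window the kernel visits along one axis (no padding)."""
    return list(range(0, in_size - kernel + 1, stride))

print(kernel_positions(7, 3, 1))  # [0, 1, 2, 3, 4]
print(kernel_positions(7, 3, 2))  # [0, 2, 4]
```

The number of positions is the output size along that axis: 5 at stride 1 and 3 at stride 2 in this example.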

See the size change

The same input convolved three ways: no padding (shrinks), with padding (same size), and stride 2 (downsamples).
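The three cases can be reproduced with a naive convolution, sketched here under a few assumptions: NumPy is available, the input is 6×6, the kernel is 3×3, and the stride-2 case also uses padding 1. `conv2d` is an illustrative helper written for clarity, not a library function.

```python
import numpy as np

def conv2d(x, k, padding=0, stride=1):
    """Naive 2D cross-correlation, the operation deep learning libraries call convolution."""
    x = np.pad(x, padding)  # zero border of width `padding` on every side
    kh, kw = k.shape
    out_h = (x.shape[0] - kh) // stride + 1
    out_w = (x.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride  # top-left corner of the current window
            out[i, j] = np.sum(x[r:r + kh, c:c + kw] * k)
    return out

x = np.ones((6, 6))
k = np.ones((3, 3))

print(conv2d(x, k).shape)                       # (4, 4)  no padding: shrinks
print(conv2d(x, k, padding=1).shape)            # (6, 6)  padding 1: same size
print(conv2d(x, k, padding=1, stride=2).shape)  # (3, 3)  stride 2: downsamples
```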

The output-size formula

For each dimension

out = ⌊(in + 2·padding − kernel) / stride⌋ + 1

Example: 5 input, 3 kernel, padding 0, stride 1 → (5−3)/1 + 1 = 3. With padding 1 → (5+2−3)/1 + 1 = 5 (same).
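The formula translates directly into a small function, sketched below. `conv_out_size` is a hypothetical name, and integer division stands in for the floor. The first two calls repeat the worked example; the last two are extra illustrative cases.

```python
def conv_out_size(in_size, kernel, padding=0, stride=1):
    """Output size along one dimension: floor((in + 2*padding - kernel) / stride) + 1."""
    return (in_size + 2 * padding - kernel) // stride + 1

print(conv_out_size(5, 3))                       # 3
print(conv_out_size(5, 3, padding=1))            # 5
print(conv_out_size(5, 3, padding=1, stride=2))  # 3
print(conv_out_size(6, 3, stride=2))             # 2
```

In the last call (6 − 3)/2 is 1.5, and the floor discards the half step: the kernel does not fit a third time, so the output size is 2.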

"valid" padding: no padding

The kernel is placed only where it fits entirely inside the input, so at stride 1 the output shrinks by kernel − 1 at each layer.

"same" padding: size preserved

Pad just enough that output size = input size (with stride 1). For an odd kernel size, that means (kernel − 1)/2 zeros on each side.
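Setting out = in at stride 1 in the output-size formula gives in + 2·padding − kernel + 1 = in, so padding = (kernel − 1)/2. The sketch below checks this for a few odd kernel sizes; `same_padding` is a hypothetical helper, the input size 32 is illustrative, and even kernels are rejected because they would need uneven padding on the two sides.

```python
def same_padding(kernel):
    """Per-side padding that preserves size at stride 1, for odd kernel sizes."""
    if kernel % 2 == 0:
        raise ValueError("symmetric 'same' padding needs an odd kernel size")
    return (kernel - 1) // 2

in_size = 32  # illustrative input size
for kernel in (1, 3, 5, 7):
    padding = same_padding(kernel)
    out = (in_size + 2 * padding - kernel) // 1 + 1
    print(kernel, padding, out)  # out stays 32 for every kernel
```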

Strided conv: downsample

A stride greater than 1 reduces the spatial size, making it an alternative to pooling.

When to use what

Padding
  • Use "same" to keep spatial size through many layers
  • Stops the feature map from shrinking to nothing in deep stacks
  • Gives edge pixels more coverage by the kernel
Stride
  • Use stride 2 to downsample & cut compute
  • Reduces resolution as you go deeper
  • Usually paired with more channels, trading spatial detail for feature depth
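To see how these choices play out over several layers, the sketch below traces the size of one spatial dimension through three hypothetical stacks of 3-wide kernels, starting from an illustrative input size of 32. The layer lists and helper names are assumptions made for this example.

```python
def conv_out_size(in_size, kernel, padding=0, stride=1):
    """Output size along one dimension: floor((in + 2*padding - kernel) / stride) + 1."""
    return (in_size + 2 * padding - kernel) // stride + 1

def trace(in_size, layers):
    """Return the size after each (kernel, padding, stride) layer, starting with the input."""
    sizes = [in_size]
    for kernel, padding, stride in layers:
        sizes.append(conv_out_size(sizes[-1], kernel, padding, stride))
    return sizes

valid_stack = [(3, 0, 1)] * 4                                # no padding
same_stack = [(3, 1, 1)] * 4                                 # "same" padding
mixed_stack = [(3, 1, 1), (3, 1, 2), (3, 1, 1), (3, 1, 2)]   # "same" plus stride-2 layers

print(trace(32, valid_stack))  # [32, 30, 28, 26, 24]
print(trace(32, same_stack))   # [32, 32, 32, 32, 32]
print(trace(32, mixed_stack))  # [32, 32, 16, 16, 8]
```

With no padding the size drifts down by 2 per layer. With "same" padding it changes only at the stride-2 layers, so downsampling happens where it is chosen rather than as a side effect.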