Forward Propagation
How a prediction is made
Forward propagation is the network making a guess. An input enters, and at every layer the same two-step recipe runs until a prediction pops out the other end.
1. Weighted sum + bias: z = W·a + b 2. Activation: a' = f(z)
The output a' becomes the input to the next layer. Repeat to the end.
Follow the numbers
A tiny network — 2 inputs, a 2-neuron hidden layer with ReLU, and a sigmoid output — computes a real prediction, value by value.
It's all matrix multiplication
A whole layer is one matrix multiply: z = W·a + b, where W holds every weight. GPUs run these in parallel — which is why deep learning needs them. Process a whole batch at once and a becomes a matrix too.
Forward prop just computes the prediction. It doesn't learn anything yet.
With fixed weights, the same input always yields the same output (dropout aside).
The values computed here are reused by backpropagation to compute gradients.
What comes next
Forward prop gives a prediction. To learn, we measure how wrong it is with a loss function, then push the error backward with backpropagation and update the weights via gradient descent. That loop is training.