Implement backpropagation and gradient descent step

Extend the scratch forward pass to compute gradients for a single hidden-layer network using binary cross-entropy loss. Implement the backward pass (chain rule) and update weights using a basic learning rate. Show the loss decreasing over 10 iterations on a small data batch.