Backpropagation on a tiny MLP

What you are seeing: a small 2-input neural network (by default 2 -> H -> 1 with tanh hidden units) being trained on a 2D binary classification task. Background color shows the current decision surface: blue regions are classified as class 0, red as class 1. Points are the training data. The network is trained by mini-batch stochastic gradient descent on the binary cross-entropy loss; the loss curve is plotted below.
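The decision surface and the loss described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the demo's own code: the sigmoid output unit, the weight initialisation, the grid extent, and the names `init_params`, `forward`, `bce`, and `decision_surface` are all assumptions made for the example.

```python
import numpy as np

def init_params(hidden, seed=0):
    """Random weights for a 2 -> hidden -> 1 network (initial scale is an assumption)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 1.0, size=(2, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 1.0, size=(hidden, 1)),
        "b2": np.zeros(1),
    }

def forward(params, X):
    """Return P(class 1) for each row of X, shape (n,)."""
    h = np.tanh(X @ params["W1"] + params["b1"])   # tanh hidden units
    z = h @ params["W2"] + params["b2"]            # single output logit
    return 1.0 / (1.0 + np.exp(-z[:, 0]))          # sigmoid output (assumed)

def bce(p, y, eps=1e-12):
    """Mean binary cross-entropy between probabilities p and 0/1 labels y."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

def decision_surface(params, lo=-2.0, hi=2.0, n=50):
    """Evaluate the network on an n x n grid; values below 0.5 would be
    drawn blue (class 0) and values above 0.5 red (class 1)."""
    xs = np.linspace(lo, hi, n)
    gx, gy = np.meshgrid(xs, xs)
    grid = np.column_stack([gx.ravel(), gy.ravel()])
    return forward(params, grid).reshape(n, n)
```

Recomputing `decision_surface` after every few parameter updates is one plausible way to get the slowly deforming background the page describes.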

The right panel draws the network itself: one column of nodes per layer, edges whose thickness scales with weight magnitude and whose color encodes sign (warm = positive, blue = negative). As training updates the weights, the edges thicken, thin, and change color when a weight flips sign. The small white circle on the decision surface (labelled "probe input") is a single test point swept slowly around the dashed circle; the network is evaluated there every frame, and the hidden-node glow on the right shows how that one input propagates through the layers. The architecture is configurable: 1 to 3 hidden layers, up to 8 units each.
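As a rough sketch of the right panel's logic, the probe sweep, the per-layer activations behind the node glow, and the edge styling could look like the following. Everything here is hypothetical: the circle radius, the sweep period, the sigmoid output, the maximum line width, and the function names `probe_point`, `layer_activations`, and `edge_style` are illustrative choices, not taken from the demo.

```python
import math
import numpy as np

def probe_point(t, radius=1.0, period=10.0):
    """Probe position at time t (seconds) on a circle about the origin.
    Radius and period are assumed values."""
    angle = 2.0 * math.pi * (t / period)
    return np.array([radius * math.cos(angle), radius * math.sin(angle)])

def layer_activations(weights, biases, x):
    """Forward one input through tanh hidden layers and a sigmoid output.
    Returns the activation vector of every layer, input included, so each
    node can be shaded by its own activation."""
    acts = [np.asarray(x, dtype=float)]
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = acts[-1] @ W + b
        is_output = i == len(weights) - 1
        acts.append(1.0 / (1.0 + np.exp(-z)) if is_output else np.tanh(z))
    return acts

def edge_style(w, max_abs, max_width=6.0):
    """Map a weight to (line width, color name): width grows with |w|,
    color follows the sign (warm = positive, blue = negative)."""
    width = max_width * abs(w) / max_abs if max_abs > 0 else 0.0
    return width, ("warm" if w >= 0 else "blue")
```

Because `layer_activations` loops over a list of weight matrices, the same code covers one, two, or three hidden layers.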

Datasets: moons (two interlocking arcs), XOR (not linearly separable, so no linear classifier can solve it), spiral (two intertwined logarithmic spirals), circles (a core inside an annulus; needs a closed boundary), and gaussians (two blobs, nearly linearly separable). Misclassified training points are ringed so you can watch the error set shrink as the surface deforms.
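Two of these datasets, and the test that decides which points get a ring, can be sketched as below. The sample counts, noise levels, radii, and the names `make_xor`, `make_circles`, and `misclassified` are assumptions for illustration; the demo's actual generators may differ.

```python
import numpy as np

def make_xor(n=200, noise=0.1, seed=0):
    """XOR-style data: class 1 where the two coordinates have opposite signs."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n, 2))
    y = (X[:, 0] * X[:, 1] < 0).astype(float)
    return X + rng.normal(0.0, noise, size=X.shape), y

def make_circles(n=200, noise=0.05, seed=0):
    """A core (class 0) inside an annulus (class 1); radii are assumed values."""
    rng = np.random.default_rng(seed)
    y = (np.arange(n) % 2).astype(float)
    radius = np.where(y == 1,
                      rng.uniform(0.7, 1.0, size=n),   # annulus
                      rng.uniform(0.0, 0.3, size=n))   # core
    angle = rng.uniform(0.0, 2.0 * np.pi, size=n)
    X = np.column_stack([radius * np.cos(angle), radius * np.sin(angle)])
    return X + rng.normal(0.0, noise, size=X.shape), y

def misclassified(p, y):
    """Indices of points whose thresholded prediction disagrees with the label."""
    return np.flatnonzero((p >= 0.5) != (y == 1))
```

The size of the array returned by `misclassified` is the error set that shrinks as training proceeds.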

Figure 1. A 2 to H to 1 tanh MLP trained with mini-batch SGD on binary cross-entropy. Method: explicit forward and backward pass (autograd not used).
Controls: dataset; layers = 1; neurons = 8; lr = 0.50; speed = 4
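The explicit forward and backward pass named in the Figure 1 caption can be written out by hand for the single-hidden-layer case. The sketch below is one way to do it under stated assumptions: a sigmoid output unit, gradients averaged over the mini-batch, and hypothetical choices for batch size, epoch count, and initialisation. The default `hidden=8` and `lr=0.5` mirror the control readouts above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grads(params, X, y):
    """Mean BCE loss and its gradients for a 2 -> H -> 1 tanh network.
    params is the list [W1, b1, W2, b2]."""
    W1, b1, W2, b2 = params
    # Forward pass.
    h = np.tanh(X @ W1 + b1)                 # (n, H) hidden activations
    p = sigmoid(h @ W2 + b2)[:, 0]           # (n,) predicted P(class 1)
    eps = 1e-12                              # guards log(0)
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    # Backward pass: chain rule applied layer by layer.
    n = X.shape[0]
    dz = ((p - y) / n)[:, None]              # dL/d(logit) for sigmoid + BCE
    dW2 = h.T @ dz
    db2 = dz.sum(axis=0)
    dh = dz @ W2.T                           # gradient reaching the hidden layer
    da = dh * (1.0 - h ** 2)                 # tanh'(a) = 1 - tanh(a)^2
    dW1 = X.T @ da
    db1 = da.sum(axis=0)
    return float(loss), [dW1, db1, dW2, db2]

def train(X, y, hidden=8, lr=0.5, epochs=300, batch=32, seed=0):
    """Mini-batch SGD; returns the parameters and the full-data loss
    recorded before training and after every epoch."""
    rng = np.random.default_rng(seed)
    params = [rng.normal(0.0, 1.0, (2, hidden)), np.zeros(hidden),
              rng.normal(0.0, 1.0, (hidden, 1)), np.zeros(1)]
    history = [loss_and_grads(params, X, y)[0]]
    for _ in range(epochs):
        order = rng.permutation(len(X))      # reshuffle every epoch
        for start in range(0, len(X), batch):
            idx = order[start:start + batch]
            _, grads = loss_and_grads(params, X[idx], y[idx])
            for prm, g in zip(params, grads):
                prm -= lr * g                # in-place SGD step
        history.append(loss_and_grads(params, X, y)[0])
    return params, history
```

A hand-written backward pass is easy to get subtly wrong, so it is worth comparing one entry of each gradient against a central finite difference of the loss before trusting the training curve.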

WHAT TO TRY

  • Vary each control (dataset, layers, neurons, lr, speed) and watch the readouts respond.
  • Compare the loss curve against the live decision surface.