Backprop on a Tiny MLP

What you are seeing: a small neural network with two inputs (2 -> H -> 1, with tanh hidden units) being trained on a 2D binary classification task. The background color shows the current decision surface: blue regions are classified as class 0, red as class 1. The points are the training data. The network is trained by mini-batch stochastic gradient descent on the binary cross-entropy loss; the loss curve is plotted below.
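To make the 2 -> H -> 1 computation concrete, here is a minimal pure-Python sketch of the forward pass and the loss. The names `init_params`, `forward`, and `bce` are illustrative and do not come from the demo. The sigmoid output unit and the uniform weight initialization are assumptions, since the page states only the tanh hidden units and the binary cross-entropy loss.

```python
import math
import random

def init_params(hidden, seed=0):
    """Random weights for a 2 -> hidden -> 1 network (init scheme is an assumption)."""
    rng = random.Random(seed)
    return {
        "W1": [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(hidden)],
        "b1": [0.0] * hidden,
        "w2": [rng.uniform(-1.0, 1.0) for _ in range(hidden)],
        "b2": 0.0,
    }

def forward(params, x):
    """Return (hidden activations, predicted probability of class 1)."""
    h = [math.tanh(w[0] * x[0] + w[1] * x[1] + b)
         for w, b in zip(params["W1"], params["b1"])]
    z = sum(w * hj for w, hj in zip(params["w2"], h)) + params["b2"]
    p = 1.0 / (1.0 + math.exp(-z))  # sigmoid output unit: an assumption
    return h, p

def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one example, clamped to avoid log(0)."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

params = init_params(hidden=8)
h, p = forward(params, (0.3, -0.7))
label = 1 if p >= 0.5 else 0  # class 1 is drawn red, class 0 blue
```

Evaluating `forward` on a grid of inputs and coloring each cell by `p` would give a background like the decision surface described above.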

The right panel draws the network itself: one column of nodes per layer, with edges whose thickness scales with weight magnitude and whose color encodes sign (warm = positive, blue = negative). As training updates the weights, the edges thicken, thin, and change color when a weight flips sign. The small white circle on the decision surface (labelled "probe input") is a single test point swept slowly along the dashed circle; the network is evaluated there every frame, and the hidden-node glow on the right shows how that one input propagates through the layers. The architecture is configurable: 1 to 3 hidden layers, with up to 8 units each.
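The two visual mappings described above can be sketched as small helper functions. Everything here is hypothetical: the function names, the circle's center and radius, the revolution period, and the maximum line width are placeholders, not values taken from the demo.

```python
import math

def probe_position(frame, center=(0.0, 0.0), radius=0.6, period_frames=600):
    """Point on a circle; one full revolution every `period_frames` frames."""
    angle = 2.0 * math.pi * (frame % period_frames) / period_frames
    return (center[0] + radius * math.cos(angle),
            center[1] + radius * math.sin(angle))

def edge_style(weight, max_abs_weight, max_width=6.0):
    """Line width proportional to |weight|; color chosen by the weight's sign."""
    if max_abs_weight > 0:
        width = max_width * abs(weight) / max_abs_weight
    else:
        width = 0.0
    color = "warm" if weight >= 0 else "blue"
    return width, color
```

For example, `edge_style(-2.0, 2.0)` yields the thickest blue edge, and `probe_position(frame)` would be the input fed to the network on each frame.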

Datasets: moons (two interlocking arcs), XOR (not linearly separable, so a linear classifier cannot solve it), spiral (two intertwined logarithmic spirals), circles (a core inside an annulus; needs a closed boundary), and gaussians (two blobs, nearly linearly separable). Misclassified training points are ringed so you can watch the error set shrink as the surface deforms.
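As a rough illustration of two of these datasets and of the ringed error set, the sketch below generates XOR-style and circles-style data and lists the misclassified indices for a given predictor. The generators, radii, and sample counts are assumptions made for illustration; the demo's own sampling is not specified on this page.

```python
import math
import random

def make_xor(n, rng):
    """Points in [-1, 1]^2; class 1 when the coordinates have opposite signs."""
    pts = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    return pts, [1 if x * y < 0 else 0 for x, y in pts]

def make_circles(n, rng, core_radius=0.4, ring_inner=0.7, ring_outer=1.0):
    """Class 0: a core disc; class 1: a surrounding annulus."""
    pts, labels = [], []
    for i in range(n):
        label = i % 2
        if label == 0:
            r = rng.uniform(0.0, core_radius)
        else:
            r = rng.uniform(ring_inner, ring_outer)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        pts.append((r * math.cos(theta), r * math.sin(theta)))
        labels.append(label)
    return pts, labels

def misclassified(points, labels, predict_proba, threshold=0.5):
    """Indices of points whose thresholded prediction disagrees with the label."""
    return [i for i, (pt, y) in enumerate(zip(points, labels))
            if (1 if predict_proba(pt) >= threshold else 0) != y]

rng = random.Random(0)
pts, labels = make_circles(200, rng)

def closed_boundary(pt):
    """A circle of radius 0.55 separates core from annulus exactly."""
    return 1.0 if math.hypot(pt[0], pt[1]) > 0.55 else 0.0

def straight_line(pt):
    """A vertical line through the origin cannot separate them."""
    return 1.0 if pt[0] > 0 else 0.0

ringed_closed = misclassified(pts, labels, closed_boundary)
ringed_line = misclassified(pts, labels, straight_line)
```

Here `ringed_closed` is empty while `ringed_line` is not, which is the sense in which the circles dataset needs a closed boundary.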

Figure 1. A 2 -> H -> 1 tanh MLP trained with mini-batch SGD on binary cross-entropy. Method: explicit forward and backward pass (no autograd).
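An explicit backward pass of the kind the caption mentions can be sketched in pure Python as follows. This is not the demo's code. It assumes a sigmoid output unit (so the gradient with respect to the pre-activation is p - y), a uniform random initialization, a mini-batch size of 2, and a four-point XOR dataset; the function names are illustrative.

```python
import math
import random

def forward(params, x):
    W1, b1, w2, b2 = params
    h = [math.tanh(W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j])
         for j in range(len(b1))]
    z = sum(w2[j] * h[j] for j in range(len(h))) + b2
    return h, 1.0 / (1.0 + math.exp(-z))

def loss(params, x, y):
    """Binary cross-entropy for one example."""
    _, p = forward(params, x)
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def backward(params, x, y):
    """Gradients of the loss for one example, derived by the chain rule."""
    W1, b1, w2, b2 = params
    h, p = forward(params, x)
    dz = p - y                                  # dL/dz for sigmoid + BCE
    g_w2 = [dz * hj for hj in h]
    g_b2 = dz
    # Back through tanh: d tanh(a)/da = 1 - tanh(a)^2
    da = [dz * w2[j] * (1.0 - h[j] ** 2) for j in range(len(h))]
    g_W1 = [[da[j] * x[0], da[j] * x[1]] for j in range(len(h))]
    return g_W1, da, g_w2, g_b2

def sgd_step(params, batch, lr):
    """Return params after one step along the mean mini-batch gradient."""
    W1, b1, w2, b2 = params
    grads = [backward(params, x, y) for x, y in batch]  # all at the old params
    scale = lr / len(batch)
    for g_W1, g_b1, g_w2, g_b2 in grads:
        for j in range(len(b1)):
            W1[j][0] -= scale * g_W1[j][0]
            W1[j][1] -= scale * g_W1[j][1]
            b1[j] -= scale * g_b1[j]
            w2[j] -= scale * g_w2[j]
        b2 -= scale * g_b2
    return (W1, b1, w2, b2)

rng = random.Random(0)
H = 8
params = ([[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(H)],
          [0.0] * H,
          [rng.uniform(-1, 1) for _ in range(H)],
          0.0)
data = [((0.0, 0.0), 0), ((0.0, 1.0), 1), ((1.0, 0.0), 1), ((1.0, 1.0), 0)]

def mean_loss(params):
    return sum(loss(params, x, y) for x, y in data) / len(data)

loss_before = mean_loss(params)
for _ in range(2000):
    params = sgd_step(params, rng.sample(data, 2), lr=0.5)
loss_after = mean_loss(params)
```

A common sanity check for a hand-written backward pass is to compare each analytic gradient with a central finite difference of `loss`; the two should agree to several decimal places.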

Controls (values shown): dataset; layers: 1; neurons: 8; lr: 0.50; speed: 4.

WHAT TO TRY

Switch the dataset to XOR or Spiral: a single hidden layer with few neurons struggles, and the loss plateaus while the decision surface fails to separate the classes. Add neurons or a layer and the surface bends to fit.
Crank the learning rate up: training speeds up until it overshoots; then the loss spikes and the boundary thrashes. Set it too small and training crawls. The sweet spot shows in the loss curve.
Watch the background decision surface deform live: every frame is one backprop step reshaping the boundary toward the data.
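The learning-rate behaviour described above can be reproduced on a toy problem. The sketch below runs gradient descent on f(w) = w^2, which is a stand-in chosen for illustration and not the network's loss. Each update multiplies w by (1 - 2 * lr), so the iterate shrinks when 0 < lr < 1 and grows when lr > 1.

```python
def descend(lr, steps=20, w=1.0):
    """Gradient descent on f(w) = w**2, whose gradient is 2*w."""
    for _ in range(steps):
        w -= lr * 2.0 * w
    return w

too_small = descend(0.01)  # crawls: still far from the minimum at 0
moderate = descend(0.4)    # converges quickly
too_large = descend(1.1)   # overshoots on every step and diverges
```

After 20 steps, `moderate` is essentially zero, `too_small` has covered only part of the distance, and `too_large` has grown in magnitude. This mirrors the crawl, sweet spot, and spike pattern in the loss curve.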