Multilayer Perceptron
Stack neurons into layers to learn nonlinear boundaries.
Stack neurons into layers and a network can carve almost any boundary, given enough hidden units. The multilayer perceptron is the workhorse feedforward network, the architecture that “deep learning” scaled up.
Click a first-layer neuron to highlight the line it learned
Each hidden neuron learns its own line; together they combine into a curved boundary. Too few neurons can’t bend enough to solve the spiral or circle. Drag any point, or click empty space to drop a new one, and the network retrains from scratch.
The idea in plain words
A single perceptron can only draw one straight line. In a multilayer perceptron, each hidden neuron learns its own line; together those lines combine into a complex curved boundary that a single perceptron could never produce, solving circles, XOR, even spirals.
Watch the boundary reshape as the network trains, and click a first-layer neuron to see the line it learned. Too few neurons can’t bend enough to separate a hard dataset; the loss curve stalls high.
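To make the "each neuron learns a line" idea concrete, here is a minimal hand-built sketch for XOR. The weights are chosen by hand for illustration rather than learned, and the hard `step` activation and the names `step` and `xor_net` are assumptions of this example, not something taken from the demo above.

```python
def step(z):
    """Hard threshold: 1 if the point is on the positive side of the line."""
    return 1 if z > 0 else 0


def xor_net(x1, x2):
    # Hidden neuron 1 draws the line x1 + x2 = 0.5 (fires if at least one input is on).
    h1 = step(x1 + x2 - 0.5)
    # Hidden neuron 2 draws the line x1 + x2 = 1.5 (fires only if both inputs are on).
    h2 = step(x1 + x2 - 1.5)
    # The output neuron keeps the band between the two lines: h1 AND NOT h2.
    return step(h1 - h2 - 0.5)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))
```

Neither hidden line separates XOR on its own; the output neuron only succeeds because it combines both.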
Now, the math
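An MLP composes layers of nonlinear transformations. With one hidden layer:
$$\hat{y} = W_2\,\sigma(W_1 x + b_1) + b_2$$
- $\sigma$: the nonlinear activation, which is what makes stacking meaningful.
- $W_1, W_2$: the weight matrices of the hidden and output layers ($b_1$ and $b_2$ are the matching bias vectors).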
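The forward pass can be sketched in a few lines of plain Python. This assumes a one-hidden-layer form, output = W2 · activation(W1 · x + b1) + b2, with tanh standing in for the activation; the helper names `matvec` and `mlp_forward` and all weight values are hypothetical, chosen only for illustration.

```python
import math


def matvec(W, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def mlp_forward(x, W1, b1, W2, b2, activation=math.tanh):
    """One hidden layer: W2 * activation(W1 x + b1) + b2."""
    hidden = [activation(z + b) for z, b in zip(matvec(W1, x), b1)]
    return [z + b for z, b in zip(matvec(W2, hidden), b2)]


# Hypothetical weights: 2 inputs -> 3 hidden units -> 1 output.
W1 = [[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]]
b1 = [0.0, 0.0, 0.0]
W2 = [[1.0, -2.0, 0.5]]
b2 = [0.25]

# At the origin every hidden unit is tanh(0) = 0, so the output is just b2.
print(mlp_forward([0.0, 0.0], W1, b1, W2, b2))
```

Passing `activation=lambda z: z` turns the whole network into a single linear map, however many layers are stacked, which is why the nonlinearity matters.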
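The universal approximation theorem says that a single hidden layer with enough neurons can approximate any continuous function on a bounded input region to arbitrary accuracy, but “enough” can be huge. Extra depth lets the network build features hierarchically, so it can solve hard shapes like the spiral with far fewer neurons per layer. The network is trained by backpropagation: the chain rule carries the loss gradient backward through the layers, and gradient descent uses it to update every weight.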
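As a rough illustration of backpropagation, the sketch below trains a tiny network on XOR with full-batch gradient descent. Everything specific here is an assumption made for the example and not taken from the demo: the tanh hidden layer, the sigmoid output with cross-entropy loss, the learning rate, the step count, and the name `train_xor`.

```python
import math
import random

# XOR: the classic dataset a single perceptron cannot separate.
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]


def train_xor(hidden=4, steps=2000, lr=0.5, seed=0):
    """Full-batch gradient descent; returns the mean loss recorded at each step."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0
    n = len(DATA)
    losses = []
    for _ in range(steps):
        gW1 = [[0.0, 0.0] for _ in range(hidden)]
        gb1 = [0.0] * hidden
        gW2 = [0.0] * hidden
        gb2 = 0.0
        total = 0.0
        for (x1, x2), y in DATA:
            # Forward pass: hidden tanh layer, then a single output logit z.
            h = [math.tanh(W1[j][0] * x1 + W1[j][1] * x2 + b1[j]) for j in range(hidden)]
            z = sum(W2[j] * h[j] for j in range(hidden)) + b2
            p = 1.0 / (1.0 + math.exp(-z))
            # Numerically stable binary cross-entropy computed from the logit.
            total += max(z, 0.0) - y * z + math.log1p(math.exp(-abs(z)))
            # Backward pass (chain rule): dLoss/dz = p - y for sigmoid + cross-entropy.
            dz = p - y
            gb2 += dz
            for j in range(hidden):
                gW2[j] += dz * h[j]
                dpre = dz * W2[j] * (1.0 - h[j] ** 2)  # tanh'(a) = 1 - tanh(a)^2
                gW1[j][0] += dpre * x1
                gW1[j][1] += dpre * x2
                gb1[j] += dpre
        losses.append(total / n)
        # Gradient descent step on the averaged gradients.
        b2 -= lr * gb2 / n
        for j in range(hidden):
            W2[j] -= lr * gW2[j] / n
            b1[j] -= lr * gb1[j] / n
            W1[j][0] -= lr * gW1[j][0] / n
            W1[j][1] -= lr * gW1[j][1] / n
    return losses


losses = train_xor()
print(f"mean loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The gradients are written out by hand to keep each chain-rule step visible; in practice an autodiff library would compute them.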
Now Break It
Try this: With too few hidden units the boundary can’t bend enough to separate a spiral; with too many, the network can overfit the noise.
Control: Hidden units slider (set to 1)
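The one-hidden-unit setting can also be checked in code. With a single tanh unit the output depends on the input only through one weighted sum, so the decision boundary is still a single straight line. The sketch below samples random weights for such a network and records the best accuracy it reaches on XOR; the weight ranges, the trial count, and the names `one_unit_predict` and `best_accuracy` are assumptions of this example.

```python
import math
import random

XOR = [((0.0, 0.0), 0), ((0.0, 1.0), 1), ((1.0, 0.0), 1), ((1.0, 1.0), 0)]


def one_unit_predict(x, w, b, v, c):
    """Network with ONE hidden tanh unit, thresholded at zero."""
    h = math.tanh(w[0] * x[0] + w[1] * x[1] + b)
    return 1 if v * h + c > 0 else 0


def best_accuracy(trials=20000, seed=0):
    """Random search over weights; returns the best XOR accuracy found."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(trials):
        w = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        b, v, c = (rng.uniform(-5, 5) for _ in range(3))
        correct = sum(one_unit_predict(x, w, b, v, c) == y for x, y in XOR)
        best = max(best, correct / len(XOR))
    return best


print(best_accuracy())
```

However many weight settings are tried, the accuracy never exceeds 0.75, because one line can get at most three of the four XOR points right. Adding a second hidden unit removes that ceiling.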