ML Visualization

The Perceptron

Neural Networks · Intermediate · ~7 min

The original learning neuron: a linear threshold unit.

[Interactive figure: perceptron decision boundary. Legend: Class 0, Class 1, Boundary, Just updated. Controls: iteration counter, dataset selector, class for newly added points.]

Drag any point, or click empty space to drop a new one, and the perceptron re-solves from scratch. The boundary rotates and snaps into place, updating on each misclassified point. Add points to make the classes overlap and watch it oscillate — on XOR it can never separate them. That limitation killed early neural-net hype and motivated the multilayer perceptron.

The idea in plain words

The perceptron is the ancestor of every neural network: it weights its inputs, sums them, and fires if the total crosses a threshold. It learns by nudging its weights whenever it gets an example wrong, rotating the boundary a little each time.
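The "weight, sum, fire" step can be sketched in a few lines of Python. This is a minimal illustration rather than the page's own implementation: the function name `predict`, the example weights, and the convention of folding the threshold into a bias `b` (so the unit fires when the sum exceeds 0) are all assumptions made for the example.

```python
import numpy as np

def predict(w, b, x):
    """Linear threshold unit: output 1 if the weighted sum plus bias exceeds 0, else 0."""
    return int(np.dot(w, x) + b > 0)

# Hypothetical weights chosen by hand: this unit computes logical AND.
w = np.array([1.0, 1.0])
b = -1.5  # the bias acts as a negative threshold (assumed convention)

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [predict(w, b, np.array(x)) for x in inputs]
print(outputs)  # [0, 0, 0, 1]
```

Only the input (1, 1) pushes the sum above the threshold, so only that input makes the unit fire.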

On separable data the line sweeps and snaps into place. But feed it XOR — not linearly separable — and it oscillates forever, never converging. That famous limitation is exactly what the multilayer perceptron overcomes.

Now, the math

The perceptron update rule nudges the weights toward each misclassified Class 1 point and away from each misclassified Class 0 point:

$$w \leftarrow w + \eta\,(y - \hat{y})\,x$$

$\eta$: the learning rate, which sets how far the boundary moves per mistake.
$y - \hat{y}$: the error (±1), zero when the point is already correct.
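The update rule can be sketched as a short training loop. This is an illustrative sketch under stated assumptions, not the code behind the visualization: the name `train_perceptron`, the zero initialization, the default `eta=1.0`, and the separate bias update (the same rule applied to a constant input of 1) are all choices made for the example. It is run here on the AND dataset, which is linearly separable.

```python
import numpy as np

def train_perceptron(X, y, eta=1.0, max_epochs=100):
    """Apply w <- w + eta * (y - y_hat) * x until one full pass makes no mistakes."""
    w = np.zeros(X.shape[1])  # assumed starting point: all-zero weights
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            y_hat = int(np.dot(w, x_i) + b > 0)
            error = y_i - y_hat  # +1, -1, or 0 when already correct
            if error != 0:
                w += eta * error * x_i
                b += eta * error  # bias treated as a weight on a constant input of 1
                mistakes += 1
        if mistakes == 0:
            return w, b, True  # converged: every point classified correctly
    return w, b, False

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_and = np.array([0, 0, 0, 1])  # AND is linearly separable

w, b, converged = train_perceptron(X, y_and)
predictions = [int(np.dot(w, x_i) + b > 0) for x_i in X]
print(converged, predictions)
```

Because the weights change only when `error` is nonzero, correctly classified points leave the boundary untouched, which matches the "zero when the point is already correct" note above.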

The perceptron convergence theorem guarantees it finds a separating line in finite steps — but only if one exists. XOR has none, so the weights cycle endlessly. This gap between what a single linear unit can and cannot represent motivated stacking neurons into layers.
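A short argument shows why no separating line exists for XOR. It assumes the unit outputs 1 exactly when $w_1 x_1 + w_2 x_2 + b > 0$; the symbols $w_1$, $w_2$, and $b$ are notation introduced here for illustration. Writing out the four XOR constraints:

```latex
\begin{aligned}
(0,0) \mapsto 0 &: \quad b \le 0 \\
(1,1) \mapsto 0 &: \quad w_1 + w_2 + b \le 0 \\
(1,0) \mapsto 1 &: \quad w_1 + b > 0 \\
(0,1) \mapsto 1 &: \quad w_2 + b > 0
\end{aligned}
```

Adding the first two lines gives $w_1 + w_2 + 2b \le 0$, while adding the last two gives $w_1 + w_2 + 2b > 0$. Both cannot hold, so no choice of weights and bias classifies all four points correctly.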

Now Break It

Try this: On non-linearly-separable data (XOR) the perceptron never converges — it oscillates forever.

Control: Switch dataset to XOR
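The same experiment can be reproduced outside the visualization. The sketch below assumes the standard update rule from the math section, zero-initialized weights, a learning rate of 1, and an arbitrary cap of 50 passes over the data; none of these settings are taken from the page's implementation.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_xor = np.array([0, 1, 1, 0])  # XOR labels: not linearly separable

w, b, eta = np.zeros(2), 0.0, 1.0
mistakes_per_epoch = []

for _ in range(50):  # arbitrary cap; the loop would otherwise never end
    mistakes = 0
    for x_i, y_i in zip(X, y_xor):
        y_hat = int(np.dot(w, x_i) + b > 0)
        if y_hat != y_i:
            w += eta * (y_i - y_hat) * x_i
            b += eta * (y_i - y_hat)
            mistakes += 1
    mistakes_per_epoch.append(mistakes)

# Every pass contains at least one mistake, so training never settles.
print(min(mistakes_per_epoch))
```

A pass with zero mistakes would mean the current weights classify all four XOR points correctly, which is impossible for a single linear unit, so the minimum stays at 1 or higher no matter how large the cap is.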
