Gradient Descent

Foundations · Intermediate · ~8 min

Gradient descent is an iterative optimization algorithm that minimizes a loss function by repeatedly stepping in the direction of its negative gradient. The learning rate controls the step size and, relative to the curvature of the loss, determines whether the iterates converge or diverge.

Imagine you’re blindfolded on a hilly landscape and you want to find the lowest valley. Gradient descent is the strategy: feel which way is downhill, take a step in that direction, and repeat.

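[Interactive figure: a loss surface with its contour map, showing the start point, the descent path, and the minimum, alongside a plot of loss vs iteration. Controls: learning rate α slider (default 0.10) and an iteration counter (0 / 67). Drag the surface to orbit; drag on the contour map to set the start point.]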

The idea in plain words

Gradient descent finds the bottom of a valley by feeling which way is downhill and taking a step in that direction, over and over. The learning rate is the step size. Nudge it up and the path descends faster; push it to the top of the slider's range and each step overshoots the valley floor, bouncing to ever-larger loss until it flies off to infinity.

The valley is defined by a loss function, and this is how models like linear regression are fit when a closed-form solution is unavailable or too costly to compute.
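To make the linear regression case concrete, here is a minimal sketch in plain Python. It assumes a single input feature, a mean squared error loss, and a small hypothetical dataset; the function name `fit_line`, the learning rate, and the step count are illustrative choices, not something this page specifies.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y ~ w*x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current line on every data point.
        errs = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradient of (1/n) * sum(err**2) with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errs, xs))
        grad_b = (2.0 / n) * sum(errs)
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical toy data generated from y = 2x + 1 with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))  # should land very close to w = 2, b = 1
```

The learning rate here was picked to sit well under the stability limit for this particular dataset; other data may need a smaller value.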

Now, the math

Each parameter θ updates by stepping against the gradient:

$$\theta \leftarrow \theta - \eta\,\nabla L(\theta)$$

$\theta$: the model parameters being tuned.
$\eta$: the learning rate, i.e. the step size (labeled α on the slider).
$\nabla L(\theta)$: the gradient of the loss, which points in the direction of steepest increase (uphill).
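The update rule can be sketched as a short loop. The loss below, an elongated bowl $L(x, y) = \tfrac{1}{2}(x^2 + 4y^2)$, is an assumption chosen for illustration, as are the start point, the step count, and the names `grad`, `loss`, and `gradient_descent`; the page does not state which surface its figure uses.

```python
def loss(theta):
    """Assumed loss: an elongated bowl, L(x, y) = 0.5 * (x**2 + 4 * y**2)."""
    x, y = theta
    return 0.5 * (x * x + 4.0 * y * y)


def grad(theta):
    """Gradient of the assumed loss: (dL/dx, dL/dy) = (x, 4y)."""
    x, y = theta
    return (x, 4.0 * y)


def gradient_descent(theta, eta=0.1, steps=100):
    """Apply theta <- theta - eta * grad(theta) and record every iterate."""
    path = [theta]
    for _ in range(steps):
        g = grad(theta)
        theta = tuple(t - eta * gi for t, gi in zip(theta, g))
        path.append(theta)
    return path


path = gradient_descent((4.0, 2.0))
print("start loss:", loss(path[0]))
print("final loss:", loss(path[-1]))
```

Recording the whole path rather than only the final point makes it easy to plot loss against iteration, much like the figure's loss curve.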

The derivation

On an elongated bowl, the steepest direction is the one with the largest curvature, $\lambda_{\max}$. Convergence requires the learning rate to stay below twice the inverse of that curvature, $\eta < 2/\lambda_{\max}$ (exact for a quadratic bowl, approximate otherwise); above it, each step along that direction more than undoes the last and the loss diverges, which is exactly what the slider lets you trigger.
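A minimal sketch of where that threshold comes from, assuming the loss along the steepest direction is the one-dimensional quadratic $L(\theta) = \tfrac{1}{2}\lambda\theta^2$ with curvature $\lambda > 0$. The symbol $\lambda$ and the quadratic form are assumptions made for illustration; the page does not specify its loss surface.

$$\nabla L(\theta) = \lambda\theta \quad\Longrightarrow\quad \theta_{k+1} = \theta_k - \eta\lambda\theta_k = (1 - \eta\lambda)\,\theta_k$$

$$\theta_k = (1 - \eta\lambda)^k\,\theta_0, \qquad \theta_k \to 0 \iff |1 - \eta\lambda| < 1 \iff 0 < \eta < \frac{2}{\lambda}$$

For $\eta > 2/\lambda$ the factor $1 - \eta\lambda$ is below $-1$, so the iterate flips sign and grows in magnitude at every step. With several directions, each has its own factor, and the largest curvature gives the tightest bound on $\eta$.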

Now Break It

Try this: Crank the learning rate high — the path oscillates wildly and diverges to infinity.

Control: Learning rate slider (set to maximum)
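The same experiment can be approximated without the slider. This sketch assumes a one-dimensional quadratic loss $L(\theta) = \tfrac{1}{2} c\,\theta^2$ with a hypothetical curvature $c = 4$, so the stability threshold is $2 / 4 = 0.5$; the function name `final_loss` and the learning rates tried are illustrative.

```python
def final_loss(eta, curvature=4.0, theta0=1.0, steps=50):
    """Run gradient descent on L(theta) = 0.5 * curvature * theta**2."""
    theta = theta0
    for _ in range(steps):
        # The gradient of the assumed loss is curvature * theta.
        theta -= eta * curvature * theta
    return 0.5 * curvature * theta * theta


# With curvature 4, learning rates below 0.5 shrink the loss
# and learning rates above 0.5 make it grow.
for eta in (0.1, 0.4, 0.49, 0.51, 0.6):
    print(f"eta={eta:<5} final loss={final_loss(eta):.3e}")
```

Rates just under the threshold still converge, only slowly and with the iterate flipping sign each step; rates just over it grow slowly at first and then explode.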


Last updated July 3, 2026.