ML Visualization

K-Nearest Neighbors

Classification · Beginner · ~6 min

K-nearest neighbors is a supervised learning algorithm that classifies a point by a majority vote of its k closest labeled examples under a distance metric. It does no training: it simply stores the data and measures distance at prediction time.

Want to classify something? Just look at the closest examples you’ve already seen and go with the majority. That’s KNN — no training needed, just memory and a sense of distance.

[Interactive plot: Class A and Class B points with a draggable query point. k slider set to 5; query prediction: Class A.]

Drag the ringed query point around the space. Push k to 1 (memorizes noise) or to the maximum (always the majority class).

The idea in plain words

KNN doesn’t train — it memorizes the data. To classify a new point, it looks at the k closest labeled examples and takes a majority vote. Drag the query point around and watch its predicted class flip as its neighborhood changes.
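The memorize-then-vote procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `knn_predict` and the toy data are assumptions made for the example, and a tied vote goes to whichever tied label appears first among the sorted neighbors.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k):
    """Classify `query` by majority vote of its k nearest training points."""
    # "Training" is just keeping the data; all work happens at prediction time.
    distances = [math.dist(p, query) for p in train_points]
    # Indices of the k stored points closest to the query.
    nearest = sorted(range(len(train_points)), key=lambda i: distances[i])[:k]
    # Majority vote among the neighbors' labels.
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters in 2-D.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (6.5, 7.0), (7.0, 6.0)]
labels = ["A", "A", "A", "B", "B", "B"]

near_a = knn_predict(points, labels, (1.8, 1.5), k=3)
near_b = knn_predict(points, labels, (6.2, 6.4), k=3)
print(near_a, near_b)
```

Moving the query from one cluster to the other changes which stored points are nearest, which is what flips the prediction.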

With k = 1 the boundary bends around every noisy point (overfitting); with k as large as the dataset it always returns the global majority (underfitting). Because it stores examples instead of fitting parameters, KNN is a useful contrast to a fitted model like linear regression.

Now, the math

Neighbors are ranked by Euclidean distance:

$$d(p, q) = \sqrt{\sum_j (p_j - q_j)^2}$$

where:

  • $p,\ q$: two points being compared.
  • $p_j$: the j-th feature (coordinate) of point p.
  • $k$: how many nearest neighbors vote.
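As a sanity check on the formula, the distance can be computed directly from its definition. The helper name `euclidean_distance` is an assumption for this sketch; Python's standard library offers the same computation as `math.dist`.

```python
import math

def euclidean_distance(p, q):
    """Square root of the summed squared differences over each feature j."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of features")
    return math.sqrt(sum((pj - qj) ** 2 for pj, qj in zip(p, q)))

# A 3-4-5 right triangle: differences of 3 and 4 give a distance of 5.
d = euclidean_distance((0.0, 0.0), (3.0, 4.0))
print(d)
```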

k controls the bias–variance balance: small k gives a flexible, high-variance boundary that chases noise; large k averages over a wide neighborhood, raising bias until the model ignores local structure entirely.
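One way to see this trade-off is to hold out each point in turn and check whether the remaining points recover its label, for several values of k. The sketch below does this on a small hypothetical 1-D dataset (five Class A points, one noisy Class B point among them, and a separate Class B cluster); the data and the helper names `knn_predict` and `leave_one_out_correct` are assumptions for illustration.

```python
from collections import Counter
import math

def knn_predict(points, labels, query, k):
    """Majority vote among the k stored points closest to `query`."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    return Counter(labels[i] for i in order[:k]).most_common(1)[0][0]

def leave_one_out_correct(points, labels, k):
    """Count points whose label is recovered when they are held out."""
    correct = 0
    for i in range(len(points)):
        rest_points = points[:i] + points[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        if knn_predict(rest_points, rest_labels, points[i], k) == labels[i]:
            correct += 1
    return correct

# Hypothetical data: the "B" at 3.4 is noise inside the A region.
points = [(1.0,), (2.0,), (3.0,), (4.0,), (5.0,), (3.4,), (10.0,), (11.0,), (12.0,)]
labels = ["A", "A", "A", "A", "A", "B", "B", "B", "B"]

# Odd k only, so a two-class vote cannot tie.
results = {k: leave_one_out_correct(points, labels, k) for k in (1, 3, 5, 7)}
for k, correct in results.items():
    print(f"k={k}: {correct}/{len(points)} held-out points classified correctly")
```

On this toy data the count rises from 6 of 9 at k = 1 (the noisy point misleads its neighbors) to 8 of 9 at k = 3 and k = 5, then falls to 5 of 9 at k = 7, where the vote is wide enough to swamp the smaller Class B cluster.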

Now Break It

Try this: k = 1 memorizes every noisy point; k = N (the total number of labeled points) always predicts the majority class, regardless of where the query sits.
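The same experiment can be reproduced without the slider. The sketch below uses hypothetical 2-D data in which Class A is the overall majority (5 points against 4) and a single noisy Class B point sits inside the A cluster; `knn_predict` is an illustrative helper, not part of any library.

```python
from collections import Counter
import math

def knn_predict(points, labels, query, k):
    """Majority vote among the k stored points closest to `query`."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    return Counter(labels[i] for i in order[:k]).most_common(1)[0][0]

# Hypothetical data: five A points, one noisy B among them, three B points far away.
points = [(1.0, 1.0), (1.2, 1.8), (2.0, 1.1), (1.9, 2.0), (1.0, 2.0),
          (1.5, 1.5),
          (6.0, 6.0), (6.5, 6.8), (7.0, 6.1)]
labels = ["A", "A", "A", "A", "A", "B", "B", "B", "B"]
n = len(points)

beside_noise = (1.55, 1.5)   # right next to the noisy B point
inside_b = (6.4, 6.3)        # in the middle of the B cluster

noise_k1 = knn_predict(points, labels, beside_noise, k=1)   # follows the noisy point
noise_k5 = knn_predict(points, labels, beside_noise, k=5)   # noise is outvoted
inside_b_k1 = knn_predict(points, labels, inside_b, k=1)    # local structure respected
inside_b_kn = knn_predict(points, labels, inside_b, k=n)    # global majority wins
print(noise_k1, noise_k5, inside_b_k1, inside_b_kn)
```

With k = 1 the query next to the noisy point inherits its label; with k = N even a query deep inside the B cluster is labeled A, because every stored point votes.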

Control: k slider (set to 1, then to maximum)
