The surface in input space that separates regions assigned to different classes — for linear classifiers, a hyperplane defined by $\mathbf{w}^\top \mathbf{x} = 0$.
Definition
In a binary classifier, the decision boundary is the set of points where the classifier transitions from predicting one class to the other. For logistic-regression and other linear classifiers, this boundary is the hyperplane:
$$\mathbf{w}^\top \mathbf{x} = 0,$$
where $x_0 = 1$ (dummy variable for the bias).
Geometry
The dimensionality of the boundary is always one less than that of the input space:
| Input dimensions | Boundary |
|---|---|
| 2 | Line |
| 3 | Plane |
| $d$ | $(d-1)$-dimensional hyperplane |
The weight vector $\mathbf{w}$ (excluding the bias component $w_0$) is the normal to the hyperplane — it points toward the class-1 side. The bias $w_0$ shifts the hyperplane away from the origin.
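A quick numeric check of the normal-vector claim, using the same hypothetical weights as above: the displacement between any two points on the boundary is orthogonal to $(w_1, w_2)$.

```python
import numpy as np

# Hypothetical 2-D weights; boundary is w_0 + w_1*x_1 + w_2*x_2 = 0.
w0, w1, w2 = -1.0, 2.0, 1.0

def on_boundary(x1):
    """Return a point (x_1, x_2) lying on the boundary."""
    return np.array([x1, -(w0 + w1 * x1) / w2])

a, b = on_boundary(0.0), on_boundary(3.0)
direction = b - a                 # a vector lying along the boundary
normal = np.array([w1, w2])       # the weight vector without the bias
print(np.dot(normal, direction))  # 0.0 -> the weights are perpendicular to the boundary
```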
Distance and Confidence
In logistic-regression, the signed quantity $z = \mathbf{w}^\top \mathbf{x}$ measures how far $\mathbf{x}$ is from the boundary (up to scaling by $\|\mathbf{w}\|$):
- $z \gg 0$: far into the class-1 region, $\sigma(z) \approx 1$ — high confidence.
- $z \ll 0$: far into the class-0 region, $\sigma(z) \approx 0$ — high confidence.
- $z \approx 0$: near the boundary, $\sigma(z) \approx 0.5$ — maximum uncertainty.
The sigmoid function $\sigma(z) = \frac{1}{1 + e^{-z}}$ translates this signed distance into a smooth probability.
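A short sketch of this mapping; the printed values follow directly from the sigmoid formula:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# How the signed score z = w^T x maps to P(y = 1 | x).
for z in [-5.0, -1.0, 0.0, 1.0, 5.0]:
    print(f"z = {z:+.1f}  ->  P(y=1) = {sigmoid(z):.3f}")
# z far below 0 -> probability near 0 (confident class 0)
# z = 0         -> probability 0.5 (on the boundary, maximum uncertainty)
# z far above 0 -> probability near 1 (confident class 1)
```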
Linearity and Its Limits
Logistic regression produces a linear boundary regardless of the data. If the true classes are not linearly separable (e.g., one class surrounds the other), a linear boundary cannot correctly classify all points. Two strategies for handling this appear later in the module:
- Non-linear feature transformations — map inputs to a higher-dimensional space where the classes become linearly separable (see the sketch after this list).
- Kernel methods / SVMs — implicitly work in a transformed space without computing the transformation directly.
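As a toy sketch of the first strategy (the data and the threshold 2.25 are synthetic and illustrative): two concentric classes are not linearly separable in $(x_1, x_2)$, but adding the squared radius as a third feature makes a linear rule sufficient.

```python
import numpy as np

rng = np.random.default_rng(0)

inner = rng.normal(scale=0.5, size=(100, 2))                     # class 1: central cluster
angles = rng.uniform(0, 2 * np.pi, size=100)
outer = 3.0 * np.column_stack([np.cos(angles), np.sin(angles)])  # class 0: surrounding ring

def transform(X):
    r2 = (X ** 2).sum(axis=1, keepdims=True)  # new feature: squared radius
    return np.hstack([X, r2])

# In the transformed space, the hyperplane r2 = 2.25 separates the classes.
print((transform(inner)[:, 2] < 2.25).mean())  # ~1.0: inner points fall below the threshold
print((transform(outer)[:, 2] < 2.25).mean())  # 0.0: ring points fall above it
```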
Margin
Logistic regression finds a separating hyperplane but does not optimize the margin — the minimum distance from the boundary to the nearest training point. A larger margin means the classifier is more robust to small perturbations in the data. Support Vector Machines (covered in weeks 3–5) explicitly maximize the margin.
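A sketch of how the margin of a fixed hyperplane could be measured (weights and data points are hypothetical): the distance from $\mathbf{x}_i$ to the boundary is $|\mathbf{w}^\top \mathbf{x}_i| / \|\mathbf{w}\|$, and the margin is the minimum over the training set.

```python
import numpy as np

w = np.array([-1.0, 2.0, 1.0])    # hypothetical weights [w_0, w_1, w_2]
X = np.array([[1.0, 2.0, 1.0],    # class 1 points (dummy feature x_0 = 1 prepended)
              [1.0, 1.5, 0.5],
              [1.0, -1.0, 0.0],   # class 0 points
              [1.0, 0.0, -2.0]])

# |w^T x| scaled by the norm of the non-bias weights gives geometric distance.
distances = np.abs(X @ w) / np.linalg.norm(w[1:])
print(distances.min())  # the margin: distance from the boundary to the closest point
```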
Related
- logistic-regression — produces a linear decision boundary
- sigmoid function — converts distance from boundary to probability
- generalization — margin and boundary placement affect generalization
Active Recall
For a logistic regression model with weights $\mathbf{w} = (w_0, w_1, w_2)^\top$, what is the equation of the decision boundary in terms of $x_1$ and $x_2$? Sketch (verbally) what it looks like in $(x_1, x_2)$ space.
The boundary is $\mathbf{w}^\top \mathbf{x} = 0$. Since $x_0 = 1$, this simplifies to $w_0 + w_1 x_1 + w_2 x_2 = 0$, or $x_2 = -\frac{w_1}{w_2} x_1 - \frac{w_0}{w_2}$ (assuming $w_2 \neq 0$). In the $(x_1, x_2)$ plane, this is a straight line with slope $-w_1/w_2$ and $x_2$-intercept $-w_0/w_2$.
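A quick numeric check of this answer with hypothetical weight values:

```python
# Hypothetical weights, chosen only to check the algebra.
w0, w1, w2 = -1.0, 2.0, 1.0

slope = -w1 / w2       # -2.0
intercept = -w0 / w2   #  1.0

# Any point on the line x_2 = slope * x_1 + intercept should give z = 0.
x1 = 4.0
x2 = slope * x1 + intercept
print(w0 + w1 * x1 + w2 * x2)  # 0.0 -> the point lies on the boundary
```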
Why does $z = 0$ correspond to maximum classification uncertainty in logistic regression?
When $z = 0$, the sigmoid outputs $\sigma(0) = 0.5$, meaning both classes are equally likely. The point sits exactly on the decision boundary. As $|z|$ grows, the sigmoid saturates toward 0 or 1, and uncertainty decreases.
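One concrete way to see this (an added illustration, not from the source): measure uncertainty as the entropy of the predicted class distribution, which peaks at $z = 0$ and shrinks as $|z|$ grows.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def entropy(p):
    """Entropy in bits of a Bernoulli(p) prediction."""
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

for z in [0.0, 1.0, 3.0]:
    p = sigmoid(z)
    print(f"z = {z}: p = {p:.3f}, entropy = {entropy(p):.3f} bits")
# entropy is exactly 1 bit at z = 0 and decreases as the sigmoid saturates
```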
Logistic regression can only produce linear decision boundaries. Describe a data layout where this fails, and name two strategies for overcoming the limitation.
If class 1 forms a cluster in the centre of the input space and class 0 surrounds it (e.g., concentric circles), no single line or hyperplane can separate them. Two strategies: (1) apply a non-linear feature transformation to map the data into a higher-dimensional space where a linear boundary suffices; (2) use kernel methods (e.g., SVMs with RBF kernel) to implicitly work in the transformed space.