The surface in input space that separates regions assigned to different classes — for linear classifiers, a hyperplane defined by $\mathbf{w}^\top \mathbf{x} = 0$.
Definition
In a binary classifier, the decision boundary is the set of points where the classifier transitions from predicting one class to the other. For logistic-regression and other linear classifiers, this boundary is the hyperplane:
$$\mathbf{w}^\top \mathbf{x} = 0,$$
where $x_0 = 1$ (dummy variable for the bias).
Geometry
The dimensionality of the boundary is always one less than that of the input space:
| Input dimensions | Boundary |
|---|---|
| 2 | Line |
| 3 | Plane |
| $d$ | $(d-1)$-dimensional hyperplane |
The weight vector $\mathbf{w}$ (excluding the bias component $w_0$) is the normal to the hyperplane — it points toward the class-1 side. The bias $w_0$ shifts the hyperplane away from the origin.
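A quick numeric check of the normal-vector claim, using the same hypothetical weights as above: the displacement between any two points on the boundary is orthogonal to $(w_1, w_2)$.

```python
import numpy as np

# Hypothetical 2-D weights; boundary is w_0 + w_1*x_1 + w_2*x_2 = 0.
w0, w1, w2 = -1.0, 2.0, 1.0

def on_boundary(x1):
    """Return a point (x_1, x_2) lying on the boundary."""
    return np.array([x1, -(w0 + w1 * x1) / w2])

a, b = on_boundary(0.0), on_boundary(3.0)
direction = b - a                 # a vector lying along the boundary
normal = np.array([w1, w2])       # the weight vector without the bias
print(np.dot(normal, direction))  # 0.0 -> the weights are perpendicular to the boundary
```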
Distance and Confidence
In logistic-regression, the signed quantity $z = \mathbf{w}^\top \mathbf{x}$ measures how far $\mathbf{x}$ is from the boundary (up to scaling by $\|\mathbf{w}\|$):
- $z \gg 0$: far into the class-1 region, $\sigma(z) \approx 1$ — high confidence.
- $z \ll 0$: far into the class-0 region, $\sigma(z) \approx 0$ — high confidence.
- $z \approx 0$: near the boundary, $\sigma(z) \approx 0.5$ — maximum uncertainty.
The sigmoid function $\sigma(z) = \frac{1}{1 + e^{-z}}$ translates this signed distance into a smooth probability.
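A short sketch of this mapping; the printed values follow directly from the sigmoid formula:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# How the signed score z = w^T x maps to P(y = 1 | x).
for z in [-5.0, -1.0, 0.0, 1.0, 5.0]:
    print(f"z = {z:+.1f}  ->  P(y=1) = {sigmoid(z):.3f}")
# z far below 0 -> probability near 0 (confident class 0)
# z = 0         -> probability 0.5 (on the boundary, maximum uncertainty)
# z far above 0 -> probability near 1 (confident class 1)
```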
Linearity and Its Limits
Logistic regression produces a linear boundary regardless of the data. If the true classes are not linearly separable (e.g., one class surrounds the other), a linear boundary cannot correctly classify all points. Two strategies for handling this appear later in the module:
- Non-linear feature transformations — map inputs to a higher-dimensional space where the classes become linearly separable (see the sketch after this list).
- Kernel methods / SVMs — implicitly work in a transformed space without computing the transformation directly.
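As a toy sketch of the first strategy (the data and the threshold 2.25 are synthetic and illustrative): two concentric classes are not linearly separable in $(x_1, x_2)$, but adding the squared radius as a third feature makes a linear rule sufficient.

```python
import numpy as np

rng = np.random.default_rng(0)

inner = rng.normal(scale=0.5, size=(100, 2))                     # class 1: central cluster
angles = rng.uniform(0, 2 * np.pi, size=100)
outer = 3.0 * np.column_stack([np.cos(angles), np.sin(angles)])  # class 0: surrounding ring

def transform(X):
    r2 = (X ** 2).sum(axis=1, keepdims=True)  # new feature: squared radius
    return np.hstack([X, r2])

# In the transformed space, the hyperplane r2 = 2.25 separates the classes.
print((transform(inner)[:, 2] < 2.25).mean())  # ~1.0: inner points fall below the threshold
print((transform(outer)[:, 2] < 2.25).mean())  # 0.0: ring points fall above it
```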
Margin
Logistic regression finds a separating hyperplane but does not optimize the margin — the minimum distance from the boundary to the nearest training point. A larger margin means the classifier is more robust to small perturbations in the data. Support Vector Machines (covered in weeks 3–5) explicitly maximize the margin.
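A sketch of how the margin of a fixed hyperplane could be measured (weights and data points are hypothetical): the distance from $\mathbf{x}_i$ to the boundary is $|\mathbf{w}^\top \mathbf{x}_i| / \|\mathbf{w}\|$, and the margin is the minimum over the training set.

```python
import numpy as np

w = np.array([-1.0, 2.0, 1.0])    # hypothetical weights [w_0, w_1, w_2]
X = np.array([[1.0, 2.0, 1.0],    # class 1 points (dummy feature x_0 = 1 prepended)
              [1.0, 1.5, 0.5],
              [1.0, -1.0, 0.0],   # class 0 points
              [1.0, 0.0, -2.0]])

# |w^T x| scaled by the norm of the non-bias weights gives geometric distance.
distances = np.abs(X @ w) / np.linalg.norm(w[1:])
print(distances.min())  # the margin: distance from the boundary to the closest point
```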
Related
- logistic-regression — produces a linear decision boundary
- sigmoid function — converts distance from boundary to probability
- generalization — margin and boundary placement affect generalization
Active Recall
For a logistic regression model with weights $\mathbf{w} = (w_0, w_1, w_2)^\top$, what is the equation of the decision boundary in terms of $x_1$ and $x_2$? Sketch (verbally) what it looks like in $(x_1, x_2)$ space.
The boundary is $\mathbf{w}^\top \mathbf{x} = 0$. Since $x_0 = 1$, this simplifies to $w_0 + w_1 x_1 + w_2 x_2 = 0$, or $x_2 = -\frac{w_1}{w_2} x_1 - \frac{w_0}{w_2}$ (assuming $w_2 \neq 0$). In the $(x_1, x_2)$ plane, this is a straight line with slope $-w_1/w_2$ and $x_2$-intercept $-w_0/w_2$.
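A quick numeric check of this answer with hypothetical weight values:

```python
# Hypothetical weights, chosen only to check the algebra.
w0, w1, w2 = -1.0, 2.0, 1.0

slope = -w1 / w2       # -2.0
intercept = -w0 / w2   #  1.0

# Any point on the line x_2 = slope * x_1 + intercept should give z = 0.
x1 = 4.0
x2 = slope * x1 + intercept
print(w0 + w1 * x1 + w2 * x2)  # 0.0 -> the point lies on the boundary
```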
Why does $z = 0$ correspond to maximum classification uncertainty in logistic regression?
When $z = 0$, the sigmoid outputs $\sigma(0) = 0.5$, meaning both classes are equally likely. The point sits exactly on the decision boundary. As $|z|$ grows, the sigmoid saturates toward 0 or 1, and uncertainty decreases.
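One concrete way to see this (an added illustration, not from the source): measure uncertainty as the entropy of the predicted class distribution, which peaks at $z = 0$ and shrinks as $|z|$ grows.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def entropy(p):
    """Entropy in bits of a Bernoulli(p) prediction."""
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

for z in [0.0, 1.0, 3.0]:
    p = sigmoid(z)
    print(f"z = {z}: p = {p:.3f}, entropy = {entropy(p):.3f} bits")
# entropy is exactly 1 bit at z = 0 and decreases as the sigmoid saturates
```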
Logistic regression can only produce linear decision boundaries. Describe a data layout where this fails, and name two strategies for overcoming the limitation.
If class 1 forms a cluster in the centre of the input space and class 0 surrounds it (e.g., concentric circles), no single line or hyperplane can separate them. Two strategies: (1) apply a non-linear feature transformation to map the data into a higher-dimensional space where a linear boundary suffices; (2) use kernel methods (e.g., SVMs with RBF kernel) to implicitly work in the transformed space.