An $N \times M$ matrix $\Phi$ whose $n$-th row is the basis-function vector $\phi(x_n)^\top$ evaluated at training input $x_n$. Compactly encodes "all training inputs through all basis functions" so that $\hat{\mathbf{y}} = \Phi \mathbf{w}$ is a single matrix-vector product, and the OLS solution becomes $\mathbf{w}^* = (\Phi^\top \Phi)^{-1} \Phi^\top \mathbf{y}$.
Construction
Given training inputs $x_1, \dots, x_N$ and basis functions $\phi_0, \dots, \phi_{M-1}$:

$$
\Phi =
\begin{pmatrix}
\phi_0(x_1) & \phi_1(x_1) & \cdots & \phi_{M-1}(x_1) \\
\phi_0(x_2) & \phi_1(x_2) & \cdots & \phi_{M-1}(x_2) \\
\vdots & \vdots & \ddots & \vdots \\
\phi_0(x_N) & \phi_1(x_N) & \cdots & \phi_{M-1}(x_N)
\end{pmatrix}
$$

By convention $\phi_0(x) = 1$, so the first column is all ones — pairing with the intercept weight $w_0$.
Dimensions. $\Phi$ is $N \times M$:
- $N$ = number of training examples (rows)
- $M$ = number of basis functions including the intercept (columns)
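A minimal numpy sketch of this construction (the helper name `design_matrix` is illustrative, not from any library): each basis function fills one column, each training input fills one row.

```python
import numpy as np

def design_matrix(x, basis_fns):
    """Stack phi_j(x_n) into an N x M matrix: one row per input, one column per basis function."""
    return np.column_stack([phi(x) for phi in basis_fns])

# Quadratic basis: phi_0 = 1 (intercept), phi_1 = x, phi_2 = x^2
x = np.array([1.0, 2.0, 3.0])
Phi = design_matrix(x, [np.ones_like, lambda x: x, lambda x: x**2])
print(Phi.shape)  # (3, 3): N = 3 examples, M = 3 basis functions
```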
Why It’s Useful
The model $\hat{y}_n = \mathbf{w}^\top \phi(x_n)$ for all $n = 1, \dots, N$ becomes a single matrix product:

$$\hat{\mathbf{y}} = \Phi \mathbf{w}$$
The OLS objective becomes $E(\mathbf{w}) = \tfrac{1}{2} \lVert \mathbf{y} - \Phi \mathbf{w} \rVert^2$, which differentiates to the normal equation $\Phi^\top \Phi \mathbf{w} = \Phi^\top \mathbf{y}$ in one line.
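The normal-equation solve can be sketched in a few lines of numpy; the toy data here (a noiseless line with intercept 2 and slope 3) is illustrative only:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 20)
y = 2.0 + 3.0 * x                              # noiseless line, so OLS recovers it exactly
Phi = np.column_stack([np.ones_like(x), x])    # N x 2 design matrix: [1, x]

# Normal equation: Phi^T Phi w = Phi^T y
w = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)
print(w)  # ~ [2., 3.]
```

In practice `np.linalg.lstsq(Phi, y)` is preferred over forming $\Phi^\top \Phi$ explicitly, since it avoids squaring the condition number.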
Whatever the basis functions are — polynomial, Gaussian, sigmoidal, custom — the math is the same. The design matrix decouples what features you use from how you fit.
Examples
Polynomial basis (degree $d$, single input $x$):

$$
\Phi =
\begin{pmatrix}
1 & x_1 & x_1^2 & \cdots & x_1^d \\
1 & x_2 & x_2^2 & \cdots & x_2^d \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_N & x_N^2 & \cdots & x_N^d
\end{pmatrix}
$$

This is a Vandermonde matrix — a recurring object in interpolation theory.
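numpy builds this matrix directly via `np.vander`; with `increasing=True` the columns are ordered $1, x, x^2, \dots, x^d$ to match the intercept-first convention:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
d = 2
Phi = np.vander(x, d + 1, increasing=True)  # columns: 1, x, x^2
print(Phi)
# [[1. 1. 1.]
#  [1. 2. 4.]
#  [1. 3. 9.]]
```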
Gaussian RBF basis ($M-1$ centres $\mu_1, \dots, \mu_{M-1}$, width $s$):

$$\Phi_{nj} = \exp\!\left(-\frac{(x_n - \mu_j)^2}{2s^2}\right), \qquad \Phi_{n0} = 1$$
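A sketch of the RBF design matrix using numpy broadcasting (the centres and width here are arbitrary example values): subtracting a `(M-1,)` centre vector from an `(N, 1)` input column yields the full $N \times (M-1)$ grid of differences at once.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 5)       # N = 5 training inputs
mu = np.array([0.25, 0.5, 0.75])   # M - 1 = 3 centres (example values)
s = 0.2                            # shared width (example value)

# Broadcasting: (N, 1) - (M-1,) -> N x (M-1) grid of x_n - mu_j
rbf = np.exp(-(x[:, None] - mu[None, :]) ** 2 / (2 * s**2))
Phi = np.column_stack([np.ones_like(x), rbf])  # prepend the intercept column
print(Phi.shape)  # (5, 4)
```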
Multi-input ($x_n \in \mathbb{R}^D$, plain linear basis): $\Phi = [\mathbf{1} \;\; X]$, an $N \times (D+1)$ matrix whose $n$-th row is $(1, x_{n1}, \dots, x_{nD})$.
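For the multi-input linear case the design matrix is just the raw data matrix with a ones column prepended; a one-line sketch (the data values are arbitrary):

```python
import numpy as np

X = np.arange(12.0).reshape(4, 3)            # N = 4 examples, D = 3 features
Phi = np.column_stack([np.ones(len(X)), X])  # N x (D + 1): [1 | X]
print(Phi.shape)  # (4, 4)
```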
Practical Notes
- The first column is always ones. This is the dummy basis function $\phi_0(x) = 1$ that pairs with the intercept $w_0$. Forgetting it is a common bug — fits are forced through the origin.
- Standardise inputs first. Polynomial and Gaussian bases on raw inputs can have wildly different column magnitudes, making $\Phi^\top \Phi$ ill-conditioned.
- Tall vs wide. $\Phi$ is "tall" when $N > M$ (more examples than parameters) — the standard setting where OLS works. "Wide" ($M > N$) makes $\Phi^\top \Phi$ rank-deficient and OLS degenerates.
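The conditioning point can be checked numerically. A sketch comparing the Gram matrix of a quadratic basis on raw large-magnitude inputs against the same basis on standardised inputs (the input range is an arbitrary example):

```python
import numpy as np

x = np.linspace(0.0, 1000.0, 50)             # raw inputs with large magnitudes
Phi_raw = np.vander(x, 3, increasing=True)   # columns 1, x, x^2 span ~1 to ~1e6

# Standardise first: zero mean, unit variance, then expand
z = (x - x.mean()) / x.std()
Phi_std = np.vander(z, 3, increasing=True)

cond_raw = np.linalg.cond(Phi_raw.T @ Phi_raw)
cond_std = np.linalg.cond(Phi_std.T @ Phi_std)
print(cond_raw > cond_std)  # True: the raw Gram matrix is far worse conditioned
```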
Connections
- ordinary-least-squares — uses $\Phi$ in the normal equation $\mathbf{w}^* = (\Phi^\top \Phi)^{-1} \Phi^\top \mathbf{y}$.
- linear-regression — the model whose predictions are $\hat{\mathbf{y}} = \Phi \mathbf{w}$.
- non-linear-transformation — the basis-expansion idea; the design matrix is the "$\phi$-space training set".