An $N \times M$ matrix $\Phi$ whose $n$-th row is the basis-function vector $\phi(x_n)^\top$ evaluated at training input $x_n$. Compactly encodes "all training inputs through all basis functions" so that the model prediction $\hat{y} = \Phi w$ is a single matrix-vector product, and the OLS solution becomes $\hat{w} = (\Phi^\top \Phi)^{-1} \Phi^\top y$.

Construction

Given training inputs $x_1, \dots, x_N$ and basis functions $\phi_0, \dots, \phi_{M-1}$:

$$\Phi = \begin{pmatrix} \phi_0(x_1) & \phi_1(x_1) & \cdots & \phi_{M-1}(x_1) \\ \phi_0(x_2) & \phi_1(x_2) & \cdots & \phi_{M-1}(x_2) \\ \vdots & \vdots & & \vdots \\ \phi_0(x_N) & \phi_1(x_N) & \cdots & \phi_{M-1}(x_N) \end{pmatrix}$$

By convention $\phi_0(x) = 1$, so the first column is all ones, pairing with the intercept weight $w_0$.
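The construction above can be sketched in NumPy (an illustrative choice; the helper name `design_matrix` is hypothetical):

```python
import numpy as np

def design_matrix(x, basis_fns):
    """Stack basis functions column-wise: Phi[n, j] = basis_fns[j](x[n])."""
    return np.column_stack([phi(x) for phi in basis_fns])

# Quadratic basis with the conventional phi_0(x) = 1 intercept column.
basis = [lambda x: np.ones_like(x), lambda x: x, lambda x: x**2]
x = np.array([1.0, 2.0, 3.0])
Phi = design_matrix(x, basis)
# Phi has shape (3, 3); its first column is all ones.
```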

Dimensions. $\Phi$ is $N \times M$:

  • $N$ = number of training examples (rows)
  • $M$ = number of basis functions including the intercept (columns)

Why It’s Useful

The model $\hat{y}_n = w^\top \phi(x_n)$ for all $n$ becomes a single matrix product:

$$\hat{y} = \Phi w$$

The OLS objective becomes $\|y - \Phi w\|^2$, which differentiates to the normal equation $\Phi^\top \Phi w = \Phi^\top y$ in one line.
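A minimal NumPy sketch of the fit. `np.linalg.lstsq` solves the same least-squares problem as the normal equation, but more stably than forming the inverse of the Gram matrix explicitly (the data here is synthetic, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=50)
y = 2.0 + 3.0 * x + rng.normal(scale=0.01, size=50)  # true intercept 2, slope 3

Phi = np.column_stack([np.ones_like(x), x])  # intercept column + linear term

# Prefer lstsq over inverting Phi.T @ Phi by hand: it is numerically
# safer and also handles rank-deficient design matrices.
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
# w recovers approximately [2.0, 3.0]
```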

Whatever the basis functions are — polynomial, Gaussian, sigmoidal, custom — the math is the same. The design matrix decouples what features you use from how you fit.

Examples

Polynomial basis (degree $d$, single input $x$):

$$\Phi = \begin{pmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^d \\ 1 & x_2 & x_2^2 & \cdots & x_2^d \\ \vdots & & & & \vdots \\ 1 & x_N & x_N^2 & \cdots & x_N^d \end{pmatrix}$$

This is a Vandermonde matrix, a recurring object in interpolation theory.
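NumPy builds this matrix directly; with `increasing=True`, `np.vander` puts the ones column first, matching the $\phi_0(x) = 1$ convention:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])

# Columns are x^0, x^1, x^2 (ones column first).
Phi = np.vander(x, N=3, increasing=True)
# [[1., 1., 1.],
#  [1., 2., 4.],
#  [1., 3., 9.]]
```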

Gaussian RBF basis ($M - 1$ centres at $\mu_1, \dots, \mu_{M-1}$, width $s$):

$$\phi_j(x) = \exp\!\left(-\frac{(x - \mu_j)^2}{2s^2}\right), \qquad \Phi_{nj} = \phi_j(x_n)$$

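A sketch of the Gaussian RBF design matrix in NumPy, assuming a shared width `s` for all centres (the helper name `rbf_design_matrix` is hypothetical):

```python
import numpy as np

def rbf_design_matrix(x, centres, s):
    """Phi[n, j] = exp(-(x_n - mu_j)^2 / (2 s^2)), plus an intercept column."""
    d = x[:, None] - centres[None, :]      # (N, M-1) pairwise differences
    rbf = np.exp(-d**2 / (2 * s**2))
    return np.column_stack([np.ones_like(x), rbf])

x = np.linspace(0, 1, 5)
centres = np.array([0.25, 0.75])
Phi = rbf_design_matrix(x, centres, s=0.2)
# Phi has shape (5, 3): ones column + one column per centre.
# Each RBF column peaks (value 1) where x equals its centre.
```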
Multi-input ($x_n \in \mathbb{R}^D$, plain linear basis): each row of $\Phi$ is $(1, x_{n1}, \dots, x_{nD})$, the raw inputs with a leading one, so $M = D + 1$.

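For the multi-input linear basis, building $\Phi$ reduces to prepending a ones column to the raw input matrix; a minimal NumPy sketch:

```python
import numpy as np

X = np.array([[0.5, 1.2],
              [1.0, 0.3],
              [2.0, 2.0]])   # N = 3 examples, D = 2 inputs

# Plain linear basis: intercept column + raw inputs, so M = D + 1.
Phi = np.column_stack([np.ones(len(X)), X])
# Phi has shape (3, 3)
```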
Practical Notes

  • The first column is always ones. This is the dummy basis function $\phi_0(x) = 1$ that pairs with the intercept weight $w_0$. Forgetting it is a common bug: fits are forced through the origin.
  • Standardise inputs first. Polynomial and Gaussian bases on raw inputs can have wildly different column magnitudes, making $\Phi^\top \Phi$ ill-conditioned.
  • Tall vs wide. $\Phi$ is "tall" when $N > M$ (more examples than parameters), the standard setting where OLS works. "Wide" ($N < M$) makes $\Phi^\top \Phi$ rank-deficient and OLS degenerates.
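The conditioning point can be checked numerically. This sketch compares the condition number of the Gram matrix for a cubic polynomial basis on raw versus standardised inputs (the input scale here is illustrative):

```python
import numpy as np

x = np.linspace(0, 100, 50)                  # raw inputs on a large scale
Phi_raw = np.vander(x, N=4, increasing=True) # columns 1, x, x^2, x^3

z = (x - x.mean()) / x.std()                 # standardise first
Phi_std = np.vander(z, N=4, increasing=True)

# The Gram matrix of the raw basis mixes column magnitudes from 1 to 1e6,
# so its condition number is astronomically worse than the standardised one.
cond_raw = np.linalg.cond(Phi_raw.T @ Phi_raw)
cond_std = np.linalg.cond(Phi_std.T @ Phi_std)
```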

Connections