Two paradigms for building classifiers: discriminative models learn the boundary between classes directly, while generative models learn how each class produces data and infer the boundary from that.
Definition
- Discriminative models learn $P(y \mid x)$ directly — given the features, what is the probability of each class? They focus on what distinguishes the classes from each other.
- Generative models learn $P(x \mid y)$ and $P(y)$ — how likely is this data given each class, and how common is each class? Classification then follows from Bayes’ rule: $P(y \mid x) = \frac{P(x \mid y)\,P(y)}{P(x)}$.
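Bayes’ rule in the definition above can be checked with a toy computation. This is a sketch with made-up numbers (the priors and likelihoods below are hypothetical, not from any dataset): two classes, one observed input $x$.

```python
# Hypothetical numbers for two classes y in {0, 1}.
prior = {0: 0.7, 1: 0.3}        # P(y): class 0 is more common
likelihood = {0: 0.1, 1: 0.5}   # P(x | y): the observed x is more typical of class 1

# P(x) = sum over classes of P(x | y) * P(y)
evidence = sum(likelihood[y] * prior[y] for y in prior)

# Bayes' rule: P(y | x) = P(x | y) * P(y) / P(x)
posterior = {y: likelihood[y] * prior[y] / evidence for y in prior}
# The posteriors sum to 1; here the likelihood outweighs the prior,
# so class 1 wins despite being rarer.
```

Note how the rarer class can still have the higher posterior when the likelihood term dominates the prior.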
Comparison
| | Discriminative | Generative |
|---|---|---|
| Models | $P(y \mid x)$ | $P(x \mid y)$ and $P(y)$ |
| Goal | Find the boundary | Model the data-generating process |
| Example | logistic-regression | Naive Bayes, Bayesian classifiers |
| Interpretability | Weights show which features discriminate | Can generate synthetic examples |
When Each Is Used
Discriminative models tend to achieve higher classification accuracy when training data is plentiful, because they focus all modelling capacity on the decision boundary. Generative models are more useful when data is scarce (priors help), when you need to detect outliers, or when understanding the data distribution itself is the goal.
This distinction becomes concrete in weeks 7–8 when Bayesian methods are introduced as generative counterparts to the discriminative classifiers covered in weeks 1–5.
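The outlier-detection advantage mentioned above follows from the generative model having an explicit $P(x)$. A minimal sketch, assuming 1-D inputs and Gaussian class-conditionals (the class name and data below are illustrative, not any library’s API):

```python
import math

class GaussianClassifier:
    """Toy 1-D generative classifier: per-class Gaussian P(x|y) plus prior P(y)."""

    def fit(self, xs, ys):
        self.params = {}
        n = len(xs)
        for y in set(ys):
            pts = [x for x, lab in zip(xs, ys) if lab == y]
            mu = sum(pts) / len(pts)
            var = sum((p - mu) ** 2 for p in pts) / len(pts)
            sd = math.sqrt(var) or 1e-9  # guard against zero variance
            self.params[y] = (mu, sd, len(pts) / n)  # (mean, std, prior)
        return self

    def _pdf(self, x, mu, sd):
        # Gaussian density: the model of how each class generates data
        return math.exp(-((x - mu) ** 2) / (2 * sd * sd)) / (sd * math.sqrt(2 * math.pi))

    def predict(self, x):
        # argmax over classes of P(x | y) * P(y)
        return max(self.params,
                   key=lambda y: self._pdf(x, *self.params[y][:2]) * self.params[y][2])

    def evidence(self, x):
        # P(x) = sum_y P(x | y) P(y); a discriminative model has no analogue of this
        return sum(self._pdf(x, mu, sd) * pr for mu, sd, pr in self.params.values())

clf = GaussianClassifier().fit([1.0, 1.2, 0.8, 5.0, 5.3, 4.7], [0, 0, 0, 1, 1, 1])
clf.predict(1.1)     # → 0 (near the class-0 cluster)
clf.evidence(50.0)   # vanishingly small: 50.0 is an outlier under the learned P(x)
```

Because the model carries $P(x)$, a low `evidence` value flags inputs unlike anything seen in training — exactly the outlier-detection use case a purely boundary-focused classifier cannot support.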
Related
- logistic-regression — the primary discriminative classifier in weeks 1–5
- supervised-learning — the framework both paradigms operate within
Active Recall
Logistic regression is a discriminative classifier. What exactly does it model, and what does it not model?
It models $P(y \mid x)$ — the probability of each class given the input features. It does not model $P(x \mid y)$ — how the features are distributed within each class. It has no concept of what a “typical” class-1 input looks like; it only knows which side of the boundary an input falls on.
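The answer above can be made concrete in a few lines. A sketch with made-up weights (these are illustrative values, not fitted parameters): logistic regression outputs a conditional probability and nothing else.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical learned weights and bias for a 2-feature model
w, b = [2.0, -1.0], 0.5
x = [1.0, 3.0]

# The ONLY quantity the model defines: P(y=1 | x) = sigmoid(w.x + b)
p_class1 = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)  # ≈ 0.378
# There is no P(x | y) anywhere: the model cannot say how probable
# this input is, or sample a "typical" class-1 example.
```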
A generative classifier models $P(x \mid y)$ and $P(y)$. How does it use these to classify a new input $x$?
It applies Bayes’ rule: $P(y \mid x) = \frac{P(x \mid y)\,P(y)}{P(x)}$. For each class $y$, it computes the likelihood of observing $x$ under that class times the prior probability of the class, then predicts the class with the highest posterior probability.
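One detail worth noting about the answer above: for classification the denominator $P(x)$ can be skipped, because it is the same for every class and so cancels in the argmax. A sketch with hypothetical numbers for three classes:

```python
# Hypothetical P(x | y) and P(y) for three classes, given one observed x
likelihood = {"a": 0.2, "b": 0.05, "c": 0.4}   # P(x | y)
prior      = {"a": 0.5, "b": 0.3,  "c": 0.2}   # P(y)

# Unnormalised scores proportional to P(y | x); P(x) cancels in the argmax
scores = {y: likelihood[y] * prior[y] for y in prior}
prediction = max(scores, key=scores.get)  # → "a": its prior outweighs c's likelihood
```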
Why might a discriminative model outperform a generative model when training data is plentiful?
A discriminative model devotes all its modelling capacity to learning the decision boundary — the quantity that directly determines classification accuracy. A generative model must also accurately model the full feature distribution $P(x \mid y)$, which is a harder problem and may waste capacity on aspects of the distribution that don’t affect the boundary. With enough data, the discriminative model’s focused objective tends to yield better classification performance.