Definition and delimitation of regression model

An embarrassingly simple question — but it seems it has not been asked on Cross Validated before:

  1. What is the definition of a regression model?

Also, a supporting question:

  1. What is not a regression model?

With regard to the latter, I am interested in tricky examples where the answer is not immediately obvious, e.g. ARIMA or GARCH.


I would say that “regression model” is a kind of meta-concept, in the sense that you will not find a definition of “regression model” as such, only of more concrete concepts such as “linear regression”, “non-linear regression”, “robust regression” and so on. In the same way, in mathematics we usually do not define “number”, but rather “natural number”, “integer”, “real number”, “p-adic number” and so on, and if somebody wants to include the quaternions among the numbers, so be it! It doesn’t really matter; what matters is which definition is used by the book or paper you are reading at the moment.

Definitions are tools, and essentialism, that is, discussing what the essence of … is, or what a word really means, is seldom worthwhile.

So, what distinguishes a “regression model” from other kinds of statistical models? Mostly, that there is a response variable, which you want to model as influenced by (or determined by) some set of predictor variables. We are not interested in influence in the other direction, and we are not interested in relationships among the predictor variables. Mostly, we take the predictor variables as given, and treat them as constants in the model, not as random variables.

The relationship mentioned above can be linear or nonlinear, specified in a parametric or nonparametric way, and so on.
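
As a rough illustration only (the notation is an assumption of this sketch, not a definition taken from any particular text), one common way to write such a model uses a response $Y_i$, fixed predictors $x_i$ and an error term $\varepsilon_i$:

$$
Y_i = f(x_i) + \varepsilon_i, \qquad \operatorname{E}[\varepsilon_i] = 0, \qquad i = 1, \dots, n .
$$

Linear regression is then the special case $f(x) = x^\top \beta$, a nonlinear parametric model takes $f(x) = g(x; \theta)$ for some known function $g$, and a nonparametric model leaves $f$ unspecified apart from, say, smoothness assumptions. Not every model that people call a regression fits this additive form, which is part of the point made above.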

To delineate regression models from other models, we had better look at some other terms that are often taken to denote something different from “regression models”, such as “errors in variables”, which applies when we accept the possibility of measurement errors in the predictor variables. That could well be included in my description of “regression model” above, but it is often treated as an alternative model.
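
One reason errors in variables is often treated separately can be seen in a small simulation. The following is a minimal sketch under made-up assumptions (true slope 2, unit variances, a fixed seed, and the variable names are all chosen here purely for illustration): regressing the response on an error-contaminated predictor pulls the ordinary least squares slope towards zero.

```python
import numpy as np

# Illustrative assumptions: sample size, seed, true slope and variances
# are arbitrary choices for this sketch, not values from the discussion.
rng = np.random.default_rng(0)
n = 100_000
true_slope = 2.0

x = rng.normal(size=n)                    # true predictor (variance 1)
y = true_slope * x + rng.normal(size=n)   # response with additive noise
w = x + rng.normal(size=n)                # predictor observed with measurement error (variance 1)

# np.polyfit with degree 1 returns [slope, intercept] of the least squares line.
slope_clean = np.polyfit(x, y, 1)[0]      # regression on the true predictor
slope_noisy = np.polyfit(w, y, 1)[0]      # regression on the mismeasured predictor

print(f"slope using true predictor:        {slope_clean:.3f}")
print(f"slope using mismeasured predictor: {slope_noisy:.3f}")
```

With these choices the second slope should be close to true_slope * Var(x) / (Var(x) + Var(measurement error)), that is, about half the true slope, whereas the first should be close to the true slope itself.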

Also, what is meant might vary among fields; see the question “What is the difference between conditioning on regressors vs. treating them as fixed?”

To repeat: what matters is the definition used by the authors you are reading now, and not some metaphysics about what it “really is”.

