# Regression definition

From Wikipedia:

> In statistical modeling, regression analysis is a statistical process for estimating the relationships among variables. It includes many techniques for modeling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables (or ‘predictors’).

Isn’t the same true of classification? In the end, isn’t that the purpose of machine learning?

Regression is far broader in purpose and scope than classification or machine learning (however the latter might be understood). There is, however, much overlap.

### Relationships

Relationships analyzed by regression may consist of

• Association

• Dependence

• Causation

Classification provides information about the first two, but is silent about causation. Both regression and machine learning have been used (sometimes successfully, often problematically) to draw conclusions about causation.

### Purposes of Regression

1. To get a summary of multivariate data.

2. To set aside the effect of a variable that might confuse the issue.

3. Contribute to attempts at causal analysis.

4. Measure the size of an effect.

5. Try to discover a mathematical or empirical law.

6. Prediction.

7. Exclusion: getting $x$ “out of the way” when we want to study the relationship between two other variables that might be affected by $x$.

(After Mosteller & Tukey, *Data Analysis and Regression*, Chapter 12B.)
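Purposes 2 and 7 can be made concrete with a small simulation. The sketch below is illustrative only and is not taken from Mosteller & Tukey: the data are synthetic, and the names `simple_ols`, `residuals`, `x`, `z`, and `y` are my own assumptions. A confounder `x` drives both `z` and `y`, while `z` has no direct effect on `y`. Regressing `y` on `z` alone suggests a relationship. Regressing after `x` has been taken out of both variables (which gives the same coefficient for `z` as a multiple regression of `y` on `z` and `x`) makes it largely disappear.

```python
import random


def simple_ols(u, v):
    """Return (intercept, slope) of the least-squares line v = a + b*u."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    sxx = sum((ui - mean_u) ** 2 for ui in u)
    sxy = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    slope = sxy / sxx
    return mean_v - slope * mean_u, slope


def residuals(u, v):
    """Return what is left of v after removing its least-squares fit on u."""
    a, b = simple_ols(u, v)
    return [vi - (a + b * ui) for ui, vi in zip(u, v)]


# Synthetic data (an assumption made purely for illustration).
rng = random.Random(0)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]        # the confounder
z = [xi + rng.gauss(0, 1) for xi in x]         # z depends on x
y = [2.0 * xi + rng.gauss(0, 1) for xi in x]   # y depends on x only, not on z

# Naive analysis: y against z, ignoring x.
_, naive_slope = simple_ols(z, y)

# "Exclusion": take x out of both z and y, then relate what remains.
_, adjusted_slope = simple_ols(residuals(x, z), residuals(x, y))

print(f"slope of y on z, ignoring x:      {naive_slope:.3f}")
print(f"slope of y on z, adjusting for x: {adjusted_slope:.3f}")
```

Under these assumptions the naive slope should come out near 1 and the adjusted slope near 0, which is the sense in which regression gets `x` "out of the way."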

Classification achieves almost none of these purposes. In limited ways it might provide some kind of summary (1) and help with discovery (5).

Machine learning aims at prediction (6) almost exclusively. Most techniques of machine learning, ranging from random forests through neural networks to support vector machines, are opaque to the understanding: they specifically do not aim to summarize data (1), remove the effects of confounding variables (2 and 7), or help us discover regularities that can be embodied in an empirical law (5).
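A toy contrast may help here. In the sketch below (synthetic data; `fit_line`, `knn_predict`, and the choice of a nearest-neighbor averager as a stand-in for a "black box" predictor are all my own assumptions, not a claim about any particular machine learning library), both methods predict the response at a new point about equally well. Only the regression also hands back a slope, which serves as a summary (1) and a measure of effect size (4).

```python
import random


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def knn_predict(x_train, y_train, x_new, k=25):
    """Predict by averaging the responses of the k training points nearest x_new."""
    nearest = sorted(range(len(x_train)), key=lambda i: abs(x_train[i] - x_new))[:k]
    return sum(y_train[i] for i in nearest) / k


# Synthetic data (an assumption): a linear law y = 3 + 0.5*x plus noise.
rng = random.Random(1)
x = [rng.uniform(0, 10) for _ in range(500)]
y = [3.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]

intercept, slope = fit_line(x, y)
x_new = 4.0
line_pred = intercept + slope * x_new
knn_pred = knn_predict(x, y, x_new)

print(f"regression prediction at x={x_new}: {line_pred:.2f}")
print(f"nearest-neighbor prediction:        {knn_pred:.2f}")
print(f"estimated change in y per unit x:   {slope:.3f}")  # only the regression offers this
```

The nearest-neighbor averager answers "what will `y` be at this `x`?" and nothing more. The fitted line answers the same question and also states how `y` changes with `x`.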

This post is a slight expansion of an introductory presentation I made recently for a semester course in regression. Many more materials on the aims and practice of regression are available there.