When building models with the `glm` function in R, one needs to specify the family. A family specifies an error distribution (or variance) function and a link function. For example, when I perform a logistic regression, I use the `binomial(link = "logit")` family.

What are (or what do they represent) the error distribution (or variance) and link function in R?

I assume that the link function is the type of model built (hence the `logit` function for logistic regression), but I am not too sure about the error distribution function. I had a look at R’s documentation but could not find detailed information other than how to use them and what parameters can be specified.

**Answer**

You don’t specify the “error” distribution, you specify the conditional distribution of the response.

When you type the name of the family (such as `binomial`), that specifies the conditional distribution to be binomial, and that implies the variance function (e.g. in the case of the binomial it is $\mu(1-\mu)$). If you choose a different family you get a different variance function (for Poisson it’s $\mu$, for Gamma it’s $\mu^2$, for Gaussian it’s constant, for inverse Gaussian it’s $\mu^3$, and so on).
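To see the variance functions concretely, here is a small sketch that queries R's family objects directly. It relies on the `variance` component that each family object from the `stats` package carries; the values in `mu` are arbitrary illustrative means, not anything from the question.

```r
# Arbitrary illustrative means (all valid for every family used here)
mu <- c(0.1, 0.25, 0.5)

# Each family object stores its variance function in the `variance` component
v_binomial <- binomial()$variance(mu)          # mu * (1 - mu)
v_poisson  <- poisson()$variance(mu)           # mu
v_gamma    <- Gamma()$variance(mu)             # mu^2
v_gaussian <- gaussian()$variance(mu)          # constant (a vector of 1s)
v_invgauss <- inverse.gaussian()$variance(mu)  # mu^3

# Compare against the formulas quoted in the answer
all.equal(v_binomial, mu * (1 - mu))
all.equal(v_poisson,  mu)
all.equal(v_gamma,    mu^2)
all.equal(v_invgauss, mu^3)
```

Note that these are variance functions of the mean only; the actual variance of the response also involves a dispersion parameter (fixed at 1 for the binomial and Poisson families).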

[For some cases (e.g. logistic regression) you can take a latent-variable approach to the GLM – and in that case, you might possibly regard the distribution of the latent variable as a form of “error distribution”.]
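As a sketch of that latent-variable view for logistic regression (the symbols $y_i^*$, $\varepsilon_i$ and $F$ are notation introduced here for illustration, not part of the original answer): suppose an unobserved variable follows a linear model with standard logistic noise, and we only observe whether it is positive.

$$
\begin{aligned}
y_i^* &= \eta_i + \varepsilon_i, \qquad \varepsilon_i \sim \text{Logistic}(0, 1), \qquad y_i = \mathbf{1}[\,y_i^* > 0\,] \\
\mu_i = P(y_i = 1) &= P(\varepsilon_i > -\eta_i) = 1 - F(-\eta_i) = F(\eta_i) = \frac{1}{1 + e^{-\eta_i}}
\end{aligned}
$$

Here $F$ is the standard logistic CDF, and the step $1 - F(-\eta_i) = F(\eta_i)$ uses its symmetry about zero. Inverting gives $\log\!\left(\frac{\mu_i}{1-\mu_i}\right) = \eta_i$, i.e. the logit link. Replacing the logistic noise with standard normal noise gives the probit link in the same way.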

The link function determines how the mean ($\mu$) and the linear predictor ($\eta = X\beta$) are related. Specifically, if $\eta = g(\mu)$ then $g$ is called the link function.
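A minimal sketch of this relationship in R, using the `linkfun` and `linkinv` components stored in a family object; the means in `mu` are made-up values for illustration.

```r
fam <- binomial(link = "logit")

# Made-up means on the probability scale
mu <- c(0.2, 0.5, 0.9)

# g(mu): map the mean to the linear-predictor scale
eta <- fam$linkfun(mu)

# g^{-1}(eta): map the linear predictor back to the mean
mu_back <- fam$linkinv(eta)

all.equal(eta, log(mu / (1 - mu)))  # the logit link is the log-odds
all.equal(mu_back, mu)              # the inverse link undoes the link

# A different link for the same family: probit uses the normal quantile function
eta_probit <- binomial(link = "probit")$linkfun(mu)
all.equal(eta_probit, qnorm(mu))
```

Changing the link changes how $\mu$ maps to $\eta$, but not the family's variance function.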

You can find tables of the variance functions and the canonical link functions (which have some convenient properties) for commonly-used members of the exponential class in many standard books as well as all over the place on the internet. Here’s a small one:

$$
\begin{array}{lcll}
\textit{Family} & \textit{Variance function} & \textit{Canonical link function} & \textit{Other common links} \\
\hline
\text{Gaussian} & \text{constant} & \mu \quad \text{(identity)} & \\
\text{Binomial} & \mu(1-\mu) & \log\left(\frac{\mu}{1-\mu}\right) \quad \text{(logit)} & \text{probit, cloglog} \\
\text{Poisson} & \mu & \log(\mu) \quad \text{(log)} & \text{identity} \\
\text{Gamma} & \mu^2 & 1/\mu \quad \text{(inverse)} & \text{log} \\
\text{Inverse Gaussian} & \mu^3 & 1/\mu^2 & \text{log}
\end{array}
$$

(R implements these in fairly typical fashion and, in the cases mentioned above, will use the canonical link if you don’t specify one.)
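One way to check the defaults is to inspect the `link` component of each family object, and to confirm that a fit with the default matches a fit with the link written out. The data below are simulated purely for illustration (the coefficients `-0.5` and `1.2` are arbitrary choices), so treat this as a sketch rather than a recipe.

```r
# Default link of each family when none is specified
families <- list(
  gaussian         = gaussian(),
  binomial         = binomial(),
  poisson          = poisson(),
  Gamma            = Gamma(),
  inverse.gaussian = inverse.gaussian()
)
default_links <- sapply(families, function(f) f$link)
print(default_links)

# Simulated binary response (illustrative only)
set.seed(1)
x <- rnorm(200)
y <- rbinom(200, size = 1, prob = plogis(-0.5 + 1.2 * x))

# Same family, link left at its default vs. spelled out explicitly
fit_default  <- glm(y ~ x, family = binomial)
fit_explicit <- glm(y ~ x, family = binomial(link = "logit"))
all.equal(coef(fit_default), coef(fit_explicit))

# Same conditional distribution, non-canonical link
fit_probit <- glm(y ~ x, family = binomial(link = "probit"))
fit_probit$family$link
```

The probit fit keeps the binomial conditional distribution and variance function; only the mapping between the mean and the linear predictor changes, so its coefficients are on a different scale from the logit fit.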

**Attribution**

*Source: Link, Question Author: user5365075, Answer Author: gung – Reinstate Monica*