I am trying to understand Ben Bolker’s answer to this question.

First, we create a data frame:

`set.seed(101) d <- data.frame(x=sample(1:4,size=30,replace=TRUE)) d$y <- rnorm(30,1+2*d$x,sd=0.01)`

Then Mr. Bolker says:

## x as ordered factor

`coef(lm(y~ordered(x),d)) ## (Intercept) ordered(x).L ordered(x).Q ordered(x).C ## 5.998121421 4.472505514 0.006109021 -0.003125958`

Now the intercept specifies the value of

`y`

at themeanfactor level (halfway between 2 and 3); the`L`

(linear) parameter gives a measure of the linear trend (notquitesure I can explain the particular value …),`Q`

and`C`

specify quadratic and cubic terms (which are close to zero in this case because the pattern is linear); if there were more levels the higher-order contrasts would be numbered 5, 6, …

My question is, what does the regression formula look like explicitly?

I thought

`lm()`

makes a model like this:y = 5.9981 + 4.4725 (x_1) + 0.0061 (x_2) – 0.00312 (x_3)

where, since the

`x_i`

are categories, they can only be either 0 or 1.I do not understand what quadratic and cubic terms have to do with a linear model. Even so, squaring/cubing any of the variables would not make a difference, since 0 ^ 3= 0 and 1^3 = 1.

**Answer**

In this case, what `lm()`

is doing is converting your “categorical” variable into a numeric sequence in order.

To make this clearer, I’ll adapt Bolker’s code a bit to make the X variable more obviously categorical:

```
set.seed(101)
d <- data.frame(x=sample(1:4,size=30,replace=TRUE))
d$y <- rnorm(30,1+2*d$x,sd=0.01)
d$x = factor(d$x, labels=c("none", "some", "more", "a lot"))
coef(lm(y~x, d))
# (Intercept) xsome xmore xa lot
# 3.001627 1.991260 3.995619 5.999098
```

So here, the mean of `x=None`

is in the intercept, and the deviation from that is indicated for each category.

```
coef(lm(y~ordered(x), d))
# (Intercept) ordered(x).L ordered(x).Q ordered(x).C
# 5.998121421 4.472505514 0.006109021 -0.003125958
```

Conceptually, what’s happened here is that the `ordered()`

function converted `x`

into `newx`

using (something similar to):

```
if (x=="None") newx=-.67
if (x=="some") newx=-.22
if (x=="more") newx=.22
if (x=="a lot") newx=.67
```

and then it fitted (something like) the model:

$$y = a + b_0 \times newx + b_1 \times newx^2 + b_2 \times newx^3$$

where you have linear $newx$, quadratic $newx^2$, and cubic $newx^3$ components.

Note, I said that it’s **something like** that, because the problem with the model described there is that $newx$, $newx^2$, and $newx^3$ are not at all independent. What `lm()`

does instead is uses a set of contrasts generated by `contr.poly(4)`

. These contrasts ensure orthogonality, so that the linear, quadratic and cubic components are independent. But the principle is similar – when fitting ordered factors, `lm()`

fits a linear, quadratic, cubic, etc… component.

You can see this by comparing:

```
coef(lm(y~ordered(x), d))
# (Intercept) ordered(x).L ordered(x).Q ordered(x).C
# 5.998121421 4.472505514 0.006109021 -0.003125958
```

with

```
contrasts(d$x) <- contr.poly(4)
coef(lm(y~x, d))
# (Intercept) ordered(x).L ordered(x).Q ordered(x).C
# 5.998121421 4.472505514 0.006109021 -0.003125958
```

Exactly identical. So if you want a fuller understanding of what happened, take a closer look at `contr.poly()`

and orthogonal polynomial contrasts in general.

One thing to note is that there is an implicit assumption hidden in here: the difference between each two levels is assumed to be equal. So “None” is as far from “some” as “some” is from “more”, and “more” is from “a lot”.

**Attribution***Source : Link , Question Author : Ovi , Answer Author : S Robidoux*