Results of lm() function with a dependent ordered categorical variable?

I am trying to understand Ben Bolker’s answer to this question.

First, we create a data frame:

set.seed(101)
d <- data.frame(x=sample(1:4,size=30,replace=TRUE))
d$y <- rnorm(30,1+2*d$x,sd=0.01)

Then Mr. Bolker says:

x as ordered factor

coef(lm(y~ordered(x),d))
##  (Intercept) ordered(x).L ordered(x).Q ordered(x).C 
##  5.998121421  4.472505514  0.006109021 -0.003125958 

Now the intercept specifies the value of y at the mean factor level (halfway between 2 and 3); the L (linear) parameter gives a measure of the linear trend (not quite sure I can explain the particular value …), Q and C specify quadratic and cubic terms (which are close to zero in this case because the pattern is linear); if there were more levels the higher-order contrasts would be numbered 5, 6, …


My question is, what does the regression formula look like explicitly?

I thought lm() makes a model like this:

y = 5.9981 + 4.4725 (x_1) + 0.0061 (x_2) – 0.00312 (x_3)

where, since the x_i are categories, they can only be either 0 or 1.

I do not understand what quadratic and cubic terms have to do with a linear model. Even so, squaring/cubing any of the variables would not make a difference, since 0 ^ 3= 0 and 1^3 = 1.

Answer

In this case, what lm() is doing is converting your “categorical” variable into a numeric sequence in order.

To make this clearer, I’ll adapt Bolker’s code a bit to make the X variable more obviously categorical:

set.seed(101)
d <- data.frame(x=sample(1:4,size=30,replace=TRUE))
d$y <- rnorm(30,1+2*d$x,sd=0.01)
d$x = factor(d$x, labels=c("none", "some", "more", "a lot"))
coef(lm(y~x, d))
#  (Intercept)       xsome       xmore      xa lot 
#     3.001627    1.991260    3.995619    5.999098

So here, the mean of x=None is in the intercept, and the deviation from that is indicated for each category.

coef(lm(y~ordered(x), d))
#  (Intercept) ordered(x).L ordered(x).Q ordered(x).C 
#  5.998121421  4.472505514  0.006109021 -0.003125958

Conceptually, what’s happened here is that the ordered() function converted x into newx using (something similar to):

if (x=="None") newx=-.67
if (x=="some") newx=-.22
if (x=="more") newx=.22
if (x=="a lot") newx=.67

and then it fitted (something like) the model:
$$y = a + b_0 \times newx + b_1 \times newx^2 + b_2 \times newx^3$$

where you have linear $newx$, quadratic $newx^2$, and cubic $newx^3$ components.

Note, I said that it’s something like that, because the problem with the model described there is that $newx$, $newx^2$, and $newx^3$ are not at all independent. What lm() does instead is uses a set of contrasts generated by contr.poly(4). These contrasts ensure orthogonality, so that the linear, quadratic and cubic components are independent. But the principle is similar – when fitting ordered factors, lm() fits a linear, quadratic, cubic, etc… component.

You can see this by comparing:

coef(lm(y~ordered(x), d))
#  (Intercept) ordered(x).L ordered(x).Q ordered(x).C 
#  5.998121421  4.472505514  0.006109021 -0.003125958

with

contrasts(d$x) <- contr.poly(4)
coef(lm(y~x, d))
#  (Intercept) ordered(x).L ordered(x).Q ordered(x).C 
#  5.998121421  4.472505514  0.006109021 -0.003125958

Exactly identical. So if you want a fuller understanding of what happened, take a closer look at contr.poly() and orthogonal polynomial contrasts in general.

One thing to note is that there is an implicit assumption hidden in here: the difference between each two levels is assumed to be equal. So “None” is as far from “some” as “some” is from “more”, and “more” is from “a lot”.

Attribution
Source : Link , Question Author : Ovi , Answer Author : S Robidoux

Leave a Comment