Can someone please tell me how to have R estimate the break point in a piecewise linear model (as a fixed or random parameter), when I also need to estimate other random effects?
I’ve included a toy example below that fits a hockey stick / broken stick regression with random slope variances and a random y-intercept variance for a break point of 4. I want to estimate the break point instead of specifying it. It could be a random effect (preferable) or a fixed effect.
library(lme4) str(sleepstudy) #Basis functions bp = 4 b1 <- function(x, bp) ifelse(x < bp, bp - x, 0) b2 <- function(x, bp) ifelse(x < bp, 0, x - bp) #Mixed effects model with break point = 4 (mod <- lmer(Reaction ~ b1(Days, bp) + b2(Days, bp) + (b1(Days, bp) + b2(Days, bp) | Subject), data = sleepstudy)) #Plot with break point = 4 xyplot( Reaction ~ Days | Subject, sleepstudy, aspect = "xy", layout = c(6,3), type = c("g", "p", "r"), xlab = "Days of sleep deprivation", ylab = "Average reaction time (ms)", panel = function(x,y) { panel.points(x,y) panel.lmline(x,y) pred <- predict(lm(y ~ b1(x, bp) + b2(x, bp)), newdata = data.frame(x = 0:9)) panel.lines(0:9, pred, lwd=1, lty=2, col="red") } )
Output:
Linear mixed model fit by REML Formula: Reaction ~ b1(Days, bp) + b2(Days, bp) + (b1(Days, bp) + b2(Days, bp) | Subject) Data: sleepstudy AIC BIC logLik deviance REMLdev 1751 1783 -865.6 1744 1731 Random effects: Groups Name Variance Std.Dev. Corr Subject (Intercept) 1709.489 41.3460 b1(Days, bp) 90.238 9.4994 -0.797 b2(Days, bp) 59.348 7.7038 0.118 -0.008 Residual 563.030 23.7283 Number of obs: 180, groups: Subject, 18 Fixed effects: Estimate Std. Error t value (Intercept) 289.725 10.350 27.994 b1(Days, bp) -8.781 2.721 -3.227 b2(Days, bp) 11.710 2.184 5.362 Correlation of Fixed Effects: (Intr) b1(D,b b1(Days,bp) -0.761 b2(Days,bp) -0.054 0.181
Answer
Another approach would be to wrap the call to lmer in a function that is passed the breakpoint as a parameter, then minimize the deviance of the fitted model conditional upon the breakpoint using optimize. This maximizes the profile log likelihood for the breakpoint, and, in general (i.e., not just for this problem) if the function interior to the wrapper (lmer in this case) finds maximum likelihood estimates conditional upon the parameter passed to it, the whole procedure finds the joint maximum likelihood estimates for all the parameters.
library(lme4)
str(sleepstudy)
#Basis functions
bp = 4
b1 <- function(x, bp) ifelse(x < bp, bp - x, 0)
b2 <- function(x, bp) ifelse(x < bp, 0, x - bp)
#Wrapper for Mixed effects model with variable break point
foo <- function(bp)
{
mod <- lmer(Reaction ~ b1(Days, bp) + b2(Days, bp) + (b1(Days, bp) + b2(Days, bp) | Subject), data = sleepstudy)
deviance(mod)
}
search.range <- c(min(sleepstudy$Days)+0.5,max(sleepstudy$Days)-0.5)
foo.opt <- optimize(foo, interval = search.range)
bp <- foo.opt$minimum
bp
[1] 6.071932
mod <- lmer(Reaction ~ b1(Days, bp) + b2(Days, bp) + (b1(Days, bp) + b2(Days, bp) | Subject), data = sleepstudy)
To get a confidence interval for the breakpoint, you could use the profile likelihood. Add, e.g., qchisq(0.95,1)
to the minimum deviance (for a 95% confidence interval) then search for points where foo(x)
is equal to the calculated value:
foo.root <- function(bp, tgt)
{
foo(bp) - tgt
}
tgt <- foo.opt$objective + qchisq(0.95,1)
lb95 <- uniroot(foo.root, lower=search.range[1], upper=bp, tgt=tgt)
ub95 <- uniroot(foo.root, lower=bp, upper=search.range[2], tgt=tgt)
lb95$root
[1] 5.754051
ub95$root
[1] 6.923529
Somewhat asymmetric, but not bad precision for this toy problem. An alternative would be to bootstrap the estimation procedure, if you have enough data to make the bootstrap reliable.
Attribution
Source : Link , Question Author : lockedoff , Answer Author : jbowman