I’m fitting a natural spline fit to some data points. I’d like to estimate the prediction error for the predicted value. In linear regression (I agree that natural spline is also a linear regression with a specific type of design matrix), we know:
ˆβ=(XTX)−1XTY→ assuming var(Y) = σ2I then : var(ˆβ)=(XTX)−1σ2
Now consider ˆY=XiTˆβ+ϵi. We can then write:
This is easy to calculate for linear regression. How should I do it with natural spline? I can get the design matrix for natural spline. I can get (XTX)−1σ2 but how can I get the rest of it:
Here is an example in R:
set.seed(12345) x <- c(1:100) y <- sin(pi*x/50) epsilon <- rnorm(100, 0, 3) knots <- c(10, 20, 30, 40, 50, 60, 70, 80, 90) myFit <- lm(y ~ ns(x, knots = knots))
Now consider x = 32.5 . How can I get the variance for the ˆY corresponding to x = 32.5 ? I know we can use the predict function. however, what I do really want is to get calculate it similar to linear regression by getting the design matrix and multiplying them together.
I really appreciate your help.
You can get the design matrix for a linear model in R using
X <- model.matrix(myFit) sigma <- summary(myFit)$sigma var.Yhat <- (diag(X %*% solve(t(X) %*% X) %*% t(X)) + 1) * sigma^2
Or, if you want to get the prediction variance for new values of X, use
ns() to transform into the natural spline basis first:
X.new <- cbind(1, ns(x.new, knots=knots)) var.Yhat <- (diag(X.new %*% solve(t(X) %*% X) %*% t(X.new)) + 1) * sigma^2