I have a dataset that contains, let’s say, some measurements for position, speed and acceleration. All come from the same “run”. I could construct a linear system and fit a polynomial to all of those measurements.

But can I do the same with splines? What is an ‘R’ way of doing this?

Here is some simulated data I would like to fit:

`f <- function(x) 2+x-0.5*x^2+rnorm(length(x), mean=0, sd=0.1) df <- function(x) 1-x+rnorm(length(x), mean=0, sd=0.3) ddf <- function(x) -1+rnorm(length(x), mean=0, sd=0.6) x_f <- runif(5, 0, 5) x_df <- runif(8, 3, 8) x_ddf <- runif(10, 4, 9) data <- data.frame(type=rep('f'), x=x_f, y=f(x_f)) data <- rbind(data, data.frame(type=rep('df'), x=x_df, y=df(x_df))) data <- rbind(data, data.frame(type=rep('ddf'), x=x_ddf, y=ddf(x_ddf))) library(ggplot2) ggplot(data, aes(x, y, color=type)) + geom_point() library(splines) m <- lm(data$y ~ bs(data$x, degree=6)) # but I want to fit on f, df, ddf. possible?`

**Answer**

We will describe how a spline can be used through Kalman Filtering

(KF) techniques in relation with a State-Space Model (SSM). The fact

that some spline models can be represented by SSM and computed with KF

was revealed by C.F. Ansley and R. Kohn in the years 1980-1990. The

estimated function and its derivatives are the expectations of the

state conditional on the observations. These estimates are computed by

using a *fixed interval smoothing*, a routine task when using a SSM.

For the sake of simplicity, assume that the observations are made at

times t1<t2<⋯<tn and that the observation number k at

tk involves only *one* derivative with order dk in

{0,1,2}. The observation part of the model writes as

y(tk)=f[dk](tk)+ε(tk)

where f(t) denotes the unobserved *true* function and ε(tk)

is a Gaussian error with variance H(tk) depending on the derivation

order dk. The (continuous time) transition equation takes the general form

\tag{T1}

\frac{\text{d}}{\text{d}t}\boldsymbol{\alpha}(t) =

\mathbf{A} \boldsymbol{\alpha}(t) + \boldsymbol{\eta}(t)

where \boldsymbol{\alpha}(t) is the unobserved state vector and

\boldsymbol{\eta}(t) is a Gaussian white noise with covariance \mathbf{Q},

assumed to be independent of the observation noise r.vs \varepsilon(t_k).

In order to describe a spline, we consider a state obtained by stacking the m

first derivatives, i.e. \boldsymbol{\alpha}(t) := [f(t),\, f^{[1]}(t), \,

\dots,\, f^{[m-1]}(t)]^\top. The transition is

\begin{bmatrix}

f^{[1]}(t) \\

f^{[2]}(t) \\

\vdots \\

f^{[m-1]}(t) \\

f^{[m]}(t)

\end{bmatrix}

=

\begin{bmatrix}

0 & 1 & 0 & &\\

0 & 0 & 1 & & \\

\vdots & & & \ddots &\\

& & & & 1\\

0 & \dots & & & 0

\end{bmatrix}

\begin{bmatrix}

f(t) \\

f^{[1]}(t) \\

\vdots \\

f^{[m-2]}(t)\\

f^{[m-1]}(t)

\end{bmatrix} +

\begin{bmatrix}

0 \\

0 \\

\vdots\\

0 \\

\eta(t)

\end{bmatrix}

and we then get a *polynomial spline with order* 2m (and degree

2m-1). While m=2 corresponds to the usual cubic spline, a higher

order will be required to use derivatives with order >1. In order to

stick to a classical SSM formalism we can rewrite (O1) as

\tag{O2}

y(t_k) = \mathbf{Z}(t_k) \boldsymbol{\alpha}(t_k) + \varepsilon(t_k),

where the observation matrix \mathbf{Z}(t_k) picks the suitable

derivative in \boldsymbol{\alpha}(t_k) and the variance H(t_k) of

\varepsilon(t_k) is chosen depending on d_k. So \mathbf{Z}(t_k) =

\mathbf{Z}^\star_{d_k + 1} where \mathbf{Z}^\star_1 := [1,\,0,\,\dots,\,0],

\mathbf{Z}^\star_2 := [0,\,1,\,\dots\,0] and \mathbf{Z}^\star_3 := [0,\,0,\,1, 0,\,\dots].

Similarly H(t_k) = H^\star_{d_k+1} for three variances H^\star_1,

H^\star_2, and H^\star_3.

Although the transition is in continuous time, the KF is actually a

standard *discrete time* one. Indeed, we will in practice focus on

times t where we have an observation, or where we want to estimate

the derivatives. We can take the set \{t_k\} to be the union of

these two sets of times and assume that the observation at t_k can

be missing: this allows to estimate the m derivatives at any time

t_k regardless of the existence of an observation. There remains to

derive the discrete SSM.

We will use indices for discrete times, writing \boldsymbol{\alpha}_k for

\boldsymbol{\alpha}(t_k) and so on. The discrete-time SSM takes the

form

\begin{align*}

\tag{DT}

\boldsymbol{\alpha}_{k+1} &= \mathbf{T}_k

\,\boldsymbol{\alpha}_{k} + \boldsymbol{\eta}^\star_{k}\\

y_k &= \mathbf{Z}_k\boldsymbol{\alpha}_k + \varepsilon_k

\end{align*}

where the matrices \mathbf{T}_k and \mathbf{Q}_k^\star :=

\text{Var}(\boldsymbol{\eta}_k^\star) are derived from (T1) and (O2)

while the variance of \varepsilon_k is given by

H_k=H^\star_{d_k+1} provided that y_k is not missing.

Using some algebra we can find the transition

matrix for the discrete-time SSM

\mathbf{T}_k = \exp\left\{ \delta_k \mathbf{A} \right\} =

\begin{bmatrix}

1 & \frac{\delta_k^1}{1!} & \frac{\delta_k^2}{2!} &

\dots & \frac{\delta_k^{m-1}}{(m-1)!}\\

0 & 1 & \frac{\delta_k^1}{1!} & & \\

\vdots & & & \ddots &\\

& & & & \frac{\delta_k^1}{1!}\\

0 & \dots & & & 1

\end{bmatrix}, \qquad

where \delta_k:= t_{k+1} - t_{k} for k<n. Similarly the

covariance matrix \mathbf{Q}^\star_k = \text{Var}

(\boldsymbol{\eta}_k^\star) for the discrete-time SSM can be

given as

\mathbf{Q}^\star_k=

\sigma_\eta^2 \, \left[\frac{\delta_k^{2m-i-j+1}}{(m-i)!(m-j)! (2m-i-j+1)}\right]_{i,j}

where the indices i and j are between 1 and m.

Now to carry over the computation in R we need a package devoted to KF

and accepting time-varying models; the CRAN package **KFAS** seems a

good option. We can write R functions to compute the matrices

\mathbf{T}_k and \mathbf{Q}^\star_k from the vector of times t_k

in order to encode the SSM (DT). In the notations used by the package,

a matrix \mathbf{R}_k comes to multiply the noise

\boldsymbol{\eta}^\star_k in the transition equation of (DT): we take

it here to be the identity \mathbf{I}_m. Also note that a diffuse

initial covariance must be used here.

**EDIT** The \mathbf{Q}^\star as initially written was wrong. Fixed (aslo in R code and image).

```
smoothWithDer <- function(t, y, d, m = 3,
Hstar = c(3, 0.2, 0.1)^2, sigma2eta = 1.0^2) {
## define the SSM matrices, depending on 'delta_k' or on 'd_k'
Tfun <- function(delta) {
mat <- matrix(0, nrow = m, ncol = m)
for (i in 0:(m-1)) {
mat[col(mat) == row(mat) + i] <- delta^i / gamma(i + 1)
}
mat
}
Qfun <- function(delta) {
im <- (m - 1):0
x <- delta^im / gamma(im + 1)
mat <- outer(X = x, Y = x, FUN = "*")
im2 <- outer(im, im, FUN = "+")
sigma2eta * mat * delta / (im2 + 1)
}
Zfun <- function(d) {
Z <- matrix(0.0, nrow = 1, ncol = m)
Z[1, d + 1] <- 1.0
Z
}
Hfun <- function(d) ifelse(d >= 0, Hstar[d + 1], 0.0)
Rfun <- function() diag(x = 1.0, nrow = m)
## define arrays by stacking the SSM matrices. We need one more
## 'delta' at the end of the series
n <- length(t)
delta <- diff(t)
delta <- c(delta, mean(delta))
Ta <- Qa <- array(0.0, dim = c(m, m, n))
Za <- array(0.0, dim = c(1, m, n))
Ha <- array(0.0, dim = c(1, 1, n))
Ra <- array(0.0, dim = c(m, m, n))
for (k in 1:n) {
Ta[ , , k] <- Tfun(delta[k])
Qa[ , , k] <- Qfun(delta[k])
Za[ , , k] <- Zfun(d[k])
Ha[ , , k] <- Hfun(d[k])
Ra[ , , k] <- Rfun()
}
require(KFAS)
## define the SSM and perform Kalman Filtering and smoothing
mod <- SSModel(y ~ SSMcustom(Z = Za, T = Ta, R = Ra, Q = Qa, n = n,
P1 = matrix(0, nrow = m, ncol = m),
P1inf = diag(1.0, nrow = m),
state_names = paste0("d", 0:(m-1))) - 1)
out <- KFS(mod, smoothing = "state")
list(t = t, filtered = out$att, smoothed = out$alphahat)
}
## An example function as in OP
f <- function(t, d = rep(0, length = length(t))) {
f <- rep(NA, length(t))
if (any(ind <- (d == 0))) f[ind] <- 2.0 + t[ind] - 0.5 * t[ind]^2
if (any(ind <- (d == 1))) f[ind] <- 1.0 - t[ind]
if (any(ind <- (d == 2))) f[ind] <- -1.0
f
}
set.seed(123)
n <- 100
t <- seq(from = 0, to = 10, length = n)
Hstar <- c(3, 0.4, 0.2)^2
sigma2eta <- 1.0
fTrue <- cbind(d0 = f(t), d1 = f(t, d = 1), d2 = f(t, d = 2))
## ============================================================================
## use a derivative index of -1 to indicate non-observed values, where
## 'y' will be NA
##
## [RUN #0] no derivative m = 2 (cubic spline)
## ============================================================================
d0 <- sample(c(-1, 0), size = n, replace = TRUE, prob = c(0.7, 0.3))
ft0 <- f(t, d0)
## add noise picking the right sd
y0 <- ft0 + rnorm(n = n, sd = c(0.0, sqrt(Hstar))[d0 + 2])
res0 <- smoothWithDer(t, y0, d0, m = 2, Hstar = Hstar)
## ============================================================================
## [RUN #1] Only first order derivative: we can take m = 2 (cubic spline)
## ============================================================================
d1 <- sample(c(-1, 0:1), size = n, replace = TRUE, prob = c(0.7, 0.15, 0.15))
ft1 <- f(t, d1)
y1 <- ft1 + rnorm(n = n, sd = c(0.0, sqrt(Hstar))[d1 + 2])
res1 <- smoothWithDer(t, y1, d1, m = 2, Hstar = Hstar)
## ============================================================================
## [RUN #2] First and second order derivative: we can take m = 3
## (quintic spline)
## ============================================================================
d2 <- sample(c(-1, 0:2), size = n, replace = TRUE, prob = c(0.7, 0.1, 0.1, 0.1))
ft2 <- f(t, d2)
y2 <- ft2 + rnorm(n = n, sd = c(0.0, sqrt(Hstar))[d2 + 2])
res2 <- smoothWithDer(t, y2, d2, m = 3, Hstar = Hstar)
## plots : a ggplot with facets would be better here.
for (run in 0:2) {
resrun <- get(paste0("res", run))
drun <- get(paste0("d", run))
yrun <- get(paste0("y", run))
matplot(t, resrun$smoothed, pch = 16, cex = 0.7, ylab = "", xlab = "")
matlines(t, fTrue, lwd = 2, lty = 1)
for (dv in 0:2) {
points(t[drun == dv], yrun[drun == dv], cex = 1.2, pch = 22, lwd = 2,
bg = "white", col = dv + 1)
}
title(main = sprintf("run %d. Dots = smooothed, lines = true, square = obs", run))
legend("bottomleft", col = 1:3, legend = c("d0", "d1", "d2"), lty = 1)
}
```

**Attribution***Source : Link , Question Author : dani , Answer Author : Yves*