Is a spline interpolation considered to be a nonparametric model?

I am aware of the basic differences between nonparametric and parametric statistics. In parametric models, we assume the data follow a particular distribution and fit that distribution using a fixed number of parameters. With KDE, for instance, this is not the case, because we don’t assume that the modeled distribution has a particular shape.

I am wondering how this relates to interpolation in general, and to spline interpolation in particular. Are all interpolation approaches considered nonparametric? Are there “mixed” approaches? And what is the case with spline interpolation?


This is a good question. Frequently, one will see smoothing regressions (e.g., splines, but also smoothing GAMs, running lines, LOWESS, etc.) described as nonparametric regression models.

These models are nonparametric in the sense that using them does not involve reporting estimated quantities like $\widehat{\beta}$, $\widehat{\theta}$, etc. (in contrast to linear regression, GLMs, etc.). Smoothing models are extremely flexible ways to represent properties of y conditional on one or more x variables, and they do not make a priori commitments to, for example, linearity, a simple integer-degree polynomial, or a similar functional form relating y to x.

On the other hand, these models are parametric in the mathematical sense that they do involve parameters: the number of splines, the functional form of the splines, the arrangement of the splines, the weighting function for data fed to the splines, and so on. In application, however, these parameters are generally not of substantive interest. They are not the exciting bit of evidence reported by researchers; the smoothed curves (along with CIs and measures of model fit based on deviation of observed values from the curves) are the evidentiary bits. One motivation for this agnosticism about the actual parameters underlying a smoothing model is that different smoothing algorithms tend to give pretty similar results; for a good comparison of several, see Buja, A., Hastie, T., & Tibshirani, R. (1989). Linear Smoothers and Additive Models. The Annals of Statistics, 17(2), 453–510.
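To make the "parametric in the mathematical sense" point concrete, here is a minimal sketch, assuming SciPy's `make_interp_spline` as the interpolator and a made-up sine curve as data (neither comes from the discussion above; the helper name `spline_parameter_count` is hypothetical). It shows that an interpolating cubic spline does carry coefficients and knots, but that their number grows with the number of data points rather than being fixed in advance.

```python
import numpy as np
from scipy.interpolate import make_interp_spline


def spline_parameter_count(n_points, degree=3):
    """Fit an interpolating B-spline to n_points samples of a toy curve
    and return (number of coefficients, number of knots)."""
    # Illustrative data only: evenly spaced samples of one sine period.
    x = np.linspace(0.0, 1.0, n_points)
    y = np.sin(2.0 * np.pi * x)

    # make_interp_spline returns a BSpline object with attributes
    # t (knot vector), c (coefficients), and k (degree).
    spline = make_interp_spline(x, y, k=degree)

    # Sanity check: an interpolating spline reproduces the data exactly
    # (up to floating-point error).
    assert np.allclose(spline(x), y)

    return spline.c.shape[0], spline.t.shape[0]


# The coefficient count tracks the sample size instead of being fixed.
for n in (10, 50, 200):
    n_coef, n_knots = spline_parameter_count(n)
    print(f"{n} data points: {n_coef} coefficients, {n_knots} knots")
```

The fitted object has plenty of parameters, but nobody would report them individually; what gets reported is the curve they produce.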

If I understand you correctly, your “mixed” approaches are what are called “semi-parametric models”. Cox regression is one highly specialized example: the baseline hazard function relies on a nonparametric estimator, while the effects of the explanatory variables are estimated in a parametric fashion. GAMs (generalized additive models) permit us to decide which x variables’ effects on y we will model using smoothers, which we will model using parametric specifications, and which we will model using both, all in a single regression.
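As a rough illustration of the semi-parametric idea, the sketch below fits y = β·z + f(x) + noise, where the effect of z is a single reported coefficient and f(x) is left to a flexible spline basis. Everything here is an assumption made for the example: the simulated data, the true β of 2.0, the knot positions, and the helper names `truncated_power_basis` and `fit_partial_linear`. It uses an unpenalized regression spline fitted by ordinary least squares, which is simpler than the penalized smoothers a real GAM package would use.

```python
import numpy as np


def truncated_power_basis(x, knots, degree=3):
    """Cubic regression-spline basis: 1, x, x^2, x^3, and (x - knot)_+^3."""
    powers = [x**d for d in range(degree + 1)]
    hinges = [np.clip(x - k, 0.0, None) ** degree for k in knots]
    return np.column_stack(powers + hinges)


def fit_partial_linear(z, x, y, knots):
    """Least-squares fit of y = beta * z + f(x), with f a regression spline.

    Returns the parametric coefficient beta and the spline coefficients.
    """
    design = np.column_stack([z, truncated_power_basis(x, knots)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1:]


# Simulated data (hypothetical): linear effect of z, smooth effect of x.
rng = np.random.default_rng(0)
n = 400
x = rng.uniform(0.0, 1.0, n)
z = rng.normal(size=n)
y = 2.0 * z + np.sin(2.0 * np.pi * x) + rng.normal(scale=0.3, size=n)

knots = np.linspace(0.1, 0.9, 5)  # arbitrary choice for illustration
beta_hat, smooth_coef = fit_partial_linear(z, x, y, knots)

# beta_hat is the quantity one would report; smooth_coef only serves
# to draw the fitted curve for x.
print("estimated beta:", beta_hat)
print("number of spline coefficients:", smooth_coef.size)
```

The split mirrors the distinction drawn above: one part of the model yields a coefficient of substantive interest, while the other part yields a curve whose underlying coefficients are rarely examined.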

Question author: John Doe. Answer author: Alexis.
