I am interested in learning (and implementing) an alternative to polynomial interpolation.

However, I am having trouble finding a good description of how these methods work, how they relate, and how they compare.

I would appreciate your input on the pros/cons/conditions under which these methods or alternatives would be useful, but some good references to texts, slides, or podcasts would be sufficient.

**Answer**

Basic OLS regression is a very good technique for fitting a function to a set of data. However, simple regression only fits a straight line whose slope is constant over the entire range of X. This may not be appropriate for a given situation. For instance, data sometimes show a *curvilinear* relationship. This can be dealt with by regressing Y onto a transformation of X, f(X). Different transformations are possible. In situations where the relationship between X and Y is monotonic but continually tapers off, a log transform can be used. Another popular choice is to use a polynomial, where new terms are formed by raising X to a series of powers (e.g., $X^2$, $X^3$, etc.). This strategy is easy to implement, and you can interpret the fit as telling you how many 'bends' exist in your data (where the number of bends is equal to the highest power needed minus 1).
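As a sketch of the polynomial idea (the data here are made up for illustration), you can build the powers of X by hand and fit them by ordinary least squares:

```python
import numpy as np

# Hypothetical data with one 'bend': a quadratic truth plus a little noise
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 100)
y = 1 + 2 * x - 3 * x**2 + rng.normal(scale=0.05, size=x.size)

# Polynomial regression: regress y on [1, X, X^2] by OLS
X = np.column_stack([np.ones_like(x), x, x**2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# One bend needs the quadratic term; beta should be close to (1, 2, -3)
print(beta)
```

The quadratic coefficient being clearly nonzero is what "one bend" looks like in this framework; a second bend would call for an $X^3$ term.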

However, regressions based on the logarithm or an exponent of the covariate will fit optimally only when that is the exact nature of the true relationship. It is quite reasonable to imagine that there is a curvilinear relationship between X and Y that is different from the possibilities those transformations afford. Thus, we come to two other strategies. The first approach is loess, a series of weighted linear regressions computed over a moving window. This approach is older, and better suited to exploratory data analysis.
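A crude sketch of the loess idea, using only numpy (real implementations, such as `statsmodels.nonparametric.lowess`, add robustness iterations and smarter windowing): at each point, fit a weighted straight line to its nearest neighbours with tricube weights.

```python
import numpy as np

def loess_1d(x, y, frac=0.5):
    """Crude loess sketch: at each x[i], fit a weighted straight line to
    the nearest `frac` share of the points, using tricube weights."""
    n = len(x)
    k = max(2, int(frac * n))              # window size
    fitted = np.empty(n)
    for i in range(n):
        d = np.abs(x - x[i])
        idx = np.argsort(d)[:k]            # k nearest neighbours
        w = (1 - (d[idx] / d[idx].max()) ** 3) ** 3   # tricube weights
        sw = np.sqrt(w)                    # weighted least squares via sqrt-weights
        A = np.column_stack([np.ones(k), x[idx]]) * sw[:, None]
        beta, *_ = np.linalg.lstsq(A, y[idx] * sw, rcond=None)
        fitted[i] = beta[0] + beta[1] * x[i]
    return fitted

# Sanity check on exactly linear data: the local fits recover the line
x = np.linspace(0, 1, 100)
y = 2 * x + 1
smooth = loess_1d(x, y, frac=0.3)
```

Because each local fit is an ordinary weighted regression, loess inherits the flexibility of fitting a different slope in every window, which is why it shines for exploratory work.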

The other approach is to use splines. At its simplest, a spline is a new term that applies to *only a portion* of the range of X. For example, X might range from 0 to 1, and the spline term might only range from .7 to 1. In this instance, .7 is the *knot*. A simple, linear spline term would be computed like this:

$$X_{spline} = \begin{cases} 0 & \text{if } X \le .7 \\ X - .7 & \text{if } X > .7 \end{cases}$$

and would be added to your model, *in addition* to the original X term. The fitted model will show a sharp break at .7, with a straight line from 0 to .7 and the line continuing with a different slope from .7 to 1. However, a spline term need not be linear. Specifically, it has been determined that cubic splines are especially useful (i.e., $X_{spline}^3$). The sharp break needn't be there, either. Algorithms have been developed that constrain the fitted parameters such that the first and second derivatives match at the knots, which makes the knots impossible to detect in the output. The end result of all this is that just a few knots (usually 3-5) in well-chosen locations (which software can determine for you) can reproduce pretty much *any* curve. Moreover, the degrees of freedom are calculated correctly, so you can trust the results, which is not true when you look at your data first and then decide to fit a squared term because you saw a bend. In addition, all of this is just another (albeit more complicated) version of the basic linear model. Thus, everything that we get with linear models comes with this (e.g., predictions, residuals, confidence bands, tests, etc.). These are *substantial* advantages.
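The linear-spline case above can be sketched in a few lines of numpy (the data and the knot at .7 are made up to match the running example):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 200)

# Hypothetical truth: slope 1 up to the knot at .7, slope 1 + 3 = 4 beyond it
y = x + 3 * np.clip(x - 0.7, 0, None) + rng.normal(scale=0.02, size=x.size)

# Spline term from the piecewise definition: 0 below the knot, X - .7 above it
x_spline = np.where(x > 0.7, x - 0.7, 0.0)

# Fit y ~ 1 + X + X_spline by OLS: just a linear model with one extra column
X = np.column_stack([np.ones_like(x), x, x_spline])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# beta[1] is the slope before the knot; beta[1] + beta[2] is the slope after it
```

Because the spline term is simply another column in the design matrix, all the usual linear-model machinery (standard errors, tests, predictions) applies unchanged, which is the point made above.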

The simplest introduction to these topics that I know of is:

- Fox, J. (2000).
*Nonparametric Simple Regression: Smoothing Scatterplots*, Sage.

**Attribution**
*Source: Link, Question Author: David LeBauer, Answer Author: gung – Reinstate Monica*