The R plotting package ggplot2 has an awesome function called stat_smooth for plotting a regression line (or curve) with the associated confidence band.

However, I am having a hard time figuring out exactly how this confidence band is generated for each type of regression line (or "method"). How can I find this information?

**Answer**

From the `Details` section of the help:

"Calculation is performed by the (currently undocumented) predictdf generic function and its methods. For most methods the confidence bounds are computed using the predict method – the exceptions are loess which uses a t-based approximation, and for glm where the normal confidence interval is constructed on the link scale, and then back-transformed to the response scale."

So `predictdf` will generally call `stats::predict`, which in turn will call the correct `predict` method for the smoothing method. The other arguments to `stat_smooth` are also useful to consider.

Most model fitting functions will have a `predict` method associated with the `class` of the model. These will usually take a `newdata` object and an argument `se.fit` that denotes whether standard errors should be returned (see `?predict` for further details).
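As a minimal sketch of this dispatch, assuming a plain `lm` fit on the built-in `mtcars` data (the model and the `newdata` values are illustrative only, not something `stat_smooth` prescribes):

```r
# Fit a simple linear model on a built-in dataset.
fit <- lm(mpg ~ wt, data = mtcars)

# predict() is a generic; for an object of class "lm" it dispatches to predict.lm.
class(fit)

# Points at which to evaluate the fit (the column name must match the predictor).
newdata <- data.frame(wt = c(2, 3, 4))

# se.fit = TRUE asks for standard errors alongside the fitted values.
pred <- predict(fit, newdata = newdata, se.fit = TRUE)
pred$fit
pred$se.fit
```

The same pattern applies to other model classes, although which extra arguments are accepted depends on the method.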

`se`

display confidence interval around smooth? (TRUE by default, see `level` to control)

This is passed directly to the `predict` method to return the appropriate standard errors (method dependent).

`fullrange`

should the fit span the full range of the plot, or just the data

This defines the `newdata` values for `x` at which the predictions will be evaluated.

`level`

level of confidence interval to use (0.95 by default)

Passed directly to the `predict` method so that the confidence interval can use the appropriate critical value (e.g. `predict.lm` uses `qt((1 - level)/2, df)` as the multiplier for the standard errors).

`n`

number of points to evaluate smoother at

Used in conjunction with `fullrange` to define the `x` values in the `newdata` object.
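To see how these arguments fit together for `method = "lm"`, here is a rough sketch that rebuilds the band by hand. The variable names (`xseq`, `band`, `tcrit`) and the choice of `n <- 80` are assumptions made for illustration; this mirrors what the arguments are described as doing rather than reproducing ggplot2's internal code.

```r
fit   <- lm(mpg ~ wt, data = mtcars)
level <- 0.95
n     <- 80   # number of evaluation points (assumed value for this sketch)

# Equivalent of fullrange = FALSE: evaluate only over the range of the data.
xseq    <- seq(min(mtcars$wt), max(mtcars$wt), length.out = n)
newdata <- data.frame(wt = xseq)

# Let predict.lm build the confidence interval directly.
band <- predict(fit, newdata = newdata, interval = "confidence", level = level)

# Rebuild the same interval from the standard errors and the t critical value.
p     <- predict(fit, newdata = newdata, se.fit = TRUE)
tcrit <- qt((1 - level) / 2, df = p$df)   # negative, since it is a lower-tail quantile
lwr   <- p$fit + tcrit * p$se.fit
upr   <- p$fit - tcrit * p$se.fit

# Compare the hand-built bounds with the ones returned by predict.lm.
all.equal(unname(band[, "lwr"]), unname(lwr))
all.equal(unname(band[, "upr"]), unname(upr))
```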

Within a call to `stat_smooth` you can define `se`, which is partially matched to `se.fit` (or `se`) and will define the `interval` argument if necessary. `level` gives the level of the confidence interval (default 0.95).

The `newdata` object is defined within the processing, depending on your setting of `fullrange`, as a sequence of length `n` spanning either the full range of the plot or the range of the data.
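For the `glm` exception noted in the help text (a normal interval built on the link scale and then back-transformed), a hedged sketch might look like the following. The logistic model on `mtcars`, the 50-point grid, and the output column names `y`, `ymin`, `ymax` are assumptions for illustration; check the ggplot2 source for the exact implementation.

```r
# A logistic regression as an example of a glm with a non-identity link.
fit   <- glm(am ~ wt, data = mtcars, family = binomial)
level <- 0.95

newdata <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 50))

# Predictions and standard errors on the link (here: logit) scale.
p <- predict(fit, newdata = newdata, type = "link", se.fit = TRUE)

# Normal critical value for a two-sided interval at the given level.
z <- qnorm(level / 2 + 0.5)

# Inverse link function, used to map back to the response (probability) scale.
linkinv <- family(fit)$linkinv

band <- data.frame(
  wt   = newdata$wt,
  y    = linkinv(p$fit),
  ymin = linkinv(p$fit - z * p$se.fit),
  ymax = linkinv(p$fit + z * p$se.fit)
)
head(band)
```

Building the interval on the link scale keeps the back-transformed bounds inside the valid range of the response (between 0 and 1 here), which a symmetric interval on the response scale would not guarantee.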

In your case, using `rlm`, this will use `predict.rlm`, which is defined as:

```
predict.rlm <- function (object, newdata = NULL, scale = NULL, ...)
{
    ## problems with using predict.lm are the scale and
    ## the QR decomp which has been done on down-weighted values.
    object$qr <- qr(sqrt(object$weights) * object$x)
    predict.lm(object, newdata = newdata, scale = object$s, ...)
}
```

So it is internally calling `predict.lm`, with the `qr` decomposition recomputed from the weighted design matrix and the `scale` argument set to the robust scale estimate `object$s`.
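As a rough check of what this gives you, the band for an `rlm` fit can be requested directly through `predict`. The model formula, the `mtcars` data, and the 80-point grid below are illustrative assumptions, not taken from your plot:

```r
library(MASS)  # provides rlm() and its predict method

# Robust linear regression on a built-in dataset.
fit <- rlm(mpg ~ wt, data = mtcars)

# Grid of x values over the range of the data.
newdata <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 80))

# Dispatches to predict.rlm, which forwards interval and level to predict.lm.
band <- predict(fit, newdata = newdata, interval = "confidence", level = 0.95)
head(band)   # columns: fit, lwr, upr
```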

