# Is bootstrapping standard errors and confidence intervals appropriate in regressions where homoscedasticity assumption is violated?

If in standard OLS regressions two assumptions are violated (normal distribution of errors, homoscedasticity), is bootstrapping standard errors and confidence intervals an appropriate alternative to arrive at meaningful results with respect to the significance of regressor coefficients?

Do significance tests with bootstrapped standard errors and confidence intervals still “work” with heteroscedasticity?

If yes, what would be applicable confidence intervals that can be used in this scenario (percentile, BC, BCA)?

Finally, if bootstrapping is appropriate in this scenario, what would be the relevant literature that needs to be read and cited to arrive at this conclusion? Any hint would be greatly appreciated!

There are at least three (maybe more) approaches to bootstrapping a linear regression with independent but not identically distributed data. (If you have other violations of the “standard” assumptions, e.g., autocorrelation in time series data or clustering due to the sampling design, things get even more complicated.)

1. You can resample whole observations (the pairs bootstrap), i.e., take a sample with replacement of $(y_j^*, {\bf x}_j^*)$ from the original data $\{ (y_i, {\bf x}_i) \}$. This is asymptotically equivalent to performing the Huber-White heteroskedasticity correction.
2. You can fit your model, obtain the residuals $e_i = y_i - {\bf x}_i' \hat\beta$, and resample ${\bf x}_j^*$ and $e_j^*$ independently, with replacement, from their respective empirical distributions. This destroys any heteroskedasticity pattern in the data, so I doubt this bootstrap is consistent.
3. You can perform the wild bootstrap, in which you resample the sign of each residual. This controls for the conditional second moment (and, with some extra tweaks, for the conditional third moment, too). This is the procedure I would recommend (provided that you can understand it and defend it to others when asked, “What did you do to control for heteroskedasticity? How do you know that it works?”).
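
As a rough illustration of approaches 1 and 3, here is a minimal NumPy sketch. Everything in it is an assumption made for the example rather than part of any reference implementation: the function names `pairs_bootstrap_se` and `wild_bootstrap_se`, the simulated data, the choice of 999 replications, and the use of plain sign flips (Rademacher weights) without any rescaling of the residuals.

```python
import numpy as np

def ols(X, y):
    # Least-squares coefficient vector
    return np.linalg.lstsq(X, y, rcond=None)[0]

def pairs_bootstrap_se(X, y, n_boot=999, seed=0):
    # Approach 1: resample whole rows (y_i, x_i) with replacement
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        draws[b] = ols(X[idx], y[idx])
    return draws.std(axis=0, ddof=1)

def wild_bootstrap_se(X, y, n_boot=999, seed=0):
    # Approach 3: keep each x_i fixed and flip the sign of its own residual
    rng = np.random.default_rng(seed)
    fitted = X @ ols(X, y)
    resid = y - fitted
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        signs = rng.choice([-1.0, 1.0], size=len(y))
        draws[b] = ols(X, fitted + signs * resid)
    return draws.std(axis=0, ddof=1)

# Simulated data (hypothetical): the error spread grows with x
rng = np.random.default_rng(42)
n = 500
x = rng.uniform(0, 2, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(0, 0.5 + x, n)

se_pairs = pairs_bootstrap_se(X, y)
se_wild = wild_bootstrap_se(X, y)
print("pairs bootstrap SE:", se_pairs)
print("wild bootstrap SE: ", se_wild)
```

Because each residual stays attached to its own ${\bf x}_i$ in the wild bootstrap, the pattern of error variances across observations is preserved, which is exactly what approach 2 loses.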

The ultimate reference is Wu (1986), but the Annals of Statistics is not exactly picture-book reading.

I believe that generally similar corrections for the lack of distributional assumptions can be obtained with Huber/White standard errors. Cameron & Trivedi’s textbook discusses the equivalence of the pairs bootstrap and White’s heteroskedasticity correction. The equivalence follows from the general robustness theory for $M$-estimates: both corrections aim to relax the distributional assumptions, whatever they may be, requiring only finite second moments of the residuals and independence between observations. See also Hausman and Palmer (2012) for more specific finite-sample comparisons between the bootstrap and heteroskedasticity corrections (a version of this paper is available on one of the authors’ websites).
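
To make the pairs-bootstrap/White comparison concrete, the following sketch computes the basic sandwich (HC0) standard errors by hand and sets them next to pairs-bootstrap standard errors on simulated data. The helper names `hc0_se` and `pairs_se`, the data-generating process, and the number of replications are all assumptions for illustration; this is not code from any of the cited references.

```python
import numpy as np

def hc0_se(X, y):
    # White/HC0 sandwich: (X'X)^{-1} X' diag(e_i^2) X (X'X)^{-1}
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    bread = np.linalg.inv(X.T @ X)
    meat = X.T @ (X * resid[:, None] ** 2)
    return np.sqrt(np.diag(bread @ meat @ bread))

def pairs_se(X, y, n_boot=999, seed=0):
    # Pairs bootstrap: resample rows with replacement and refit each time
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        draws[b] = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
    return draws.std(axis=0, ddof=1)

# Simulated heteroskedastic data (hypothetical example)
rng = np.random.default_rng(1)
n = 500
x = rng.uniform(0, 2, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(0, 0.5 + x, n)

se_white = hc0_se(X, y)
se_boot = pairs_se(X, y)
print("HC0 (White) SE:    ", se_white)
print("pairs bootstrap SE:", se_boot)
```

In a sample of this size the two sets of standard errors should be close to each other, in line with the asymptotic equivalence; how they compare in small samples is the kind of question the finite-sample literature addresses.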