How to perform a bootstrap test to compare the means of two samples?

I have two heavily skewed samples and am trying to use bootstrapping to compare their means using a t-statistic.

What is the correct procedure to do it?


The process I am using

I am concerned about the appropriateness of using the standard error of the original/observed data in the final step when I know that the data are not normally distributed.

Here are my steps:

  • Bootstrap – randomly sample with replacement (N=1000)
  • Calculate the t-statistic for each bootstrap sample to build a distribution of t-statistics:
    $$
    T(b) = \frac{(\overline{X}_{b1}-\overline{X}_{b2})-(\overline{X}_1-\overline{X}_2) }{\sqrt{ \sigma^2_{xb1}/n + \sigma^2_{xb2}/n }}
    $$
  • Estimate t confidence intervals by getting $\alpha/2$ and $1-\alpha/2$ percentiles of t-distribution
  • Get confidence intervals via:

    $$
    CI_L = (\overline{X}_1-\overline{X}_2) - T_{CI_L} \cdot SE_{original}
    $$
    $$
    CI_U = (\overline{X}_1-\overline{X}_2) + T_{CI_U} \cdot SE_{original}
    $$
    where
    $$
    SE = \sqrt{ \sigma^2_{X1}/n + \sigma^2_{X2}/n }
    $$

  • Look where the confidence intervals fall to determine if there is a significant difference in means (i.e. non-zero)
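
For concreteness, the steps above can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not a vetted implementation: the function name `bootstrap_t_ci` and its parameters are made up for this example, each group is resampled separately at its own size, and the endpoints follow the usual studentized (bootstrap-t) convention, in which the upper percentile of the bootstrap t-values is subtracted to get the lower limit and vice versa. That convention differs from the way the endpoints are written above.

```python
import numpy as np

def bootstrap_t_ci(x, y, n_boot=1000, alpha=0.05, seed=0):
    """Studentized (bootstrap-t) interval for mean(x) - mean(y).

    Hypothetical helper for illustration; assumes continuous data so that
    no bootstrap sample has zero variance.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Observed difference in means and its standard error.
    diff = x.mean() - y.mean()
    se = np.sqrt(x.var(ddof=1) / x.size + y.var(ddof=1) / y.size)

    # Resample each group with replacement; one row per bootstrap sample.
    xb = rng.choice(x, size=(n_boot, x.size), replace=True)
    yb = rng.choice(y, size=(n_boot, y.size), replace=True)

    # T(b): bootstrap difference centred on the observed difference,
    # divided by the standard error of that bootstrap sample.
    se_b = np.sqrt(xb.var(axis=1, ddof=1) / x.size + yb.var(axis=1, ddof=1) / y.size)
    t_b = (xb.mean(axis=1) - yb.mean(axis=1) - diff) / se_b

    # Percentiles of the bootstrap t-values.
    t_lo, t_hi = np.quantile(t_b, [alpha / 2, 1 - alpha / 2])

    # Standard bootstrap-t endpoints use the observed-data standard error.
    return diff - t_hi * se, diff - t_lo * se
```

An interval that excludes zero would then be read as evidence of a difference in means at level `alpha`.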

I have also looked at the Wilcoxon rank-sum but it is not giving very reasonable results due to the very heavily skewed distribution (e.g. the 75th == 95th percentile). For this reason I would like to explore the bootstrapped t-test further.

So my questions are:

  1. Is this an appropriate methodology?
  2. Is it appropriate to use the SE of observed data when I know it is heavily skewed?

Possible duplicate: What method is preferred, a bootstrapping test or a nonparametric rank-based test?

Answer

I would just do a regular bootstrap test:

  • compute the t-statistic in your data and store it
  • change the data so that the null hypothesis is true. In this case, subtract the group 1 mean from every observation in group 1 and add the overall mean, then do the same for group 2 using the group 2 mean; that way the mean in both groups will be the overall mean.
  • Take bootstrap samples from this dataset, probably on the order of 20,000.
  • compute the t-statistic in each of these bootstrap samples. The distribution of these t-statistics is the bootstrap estimate of the sampling distribution of the t-statistic in your skewed data if the null-hypothesis is true.
  • The proportion of bootstrap t-statistics that are larger than or equal to your observed t-statistic is your estimate of the $p$-value. You can do a bit better by using $(k+1)/(B+1)$, where $k$ is the number of bootstrap t-statistics that are larger than or equal to the observed t-statistic and $B$ is the number of bootstrap samples. However, the difference is going to be small when the number of bootstrap samples is large.
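
The steps above could look roughly like this in Python with NumPy. Treat it as a sketch under assumptions rather than a definitive implementation: the function name `bootstrap_t_test` is invented for the example, the t-statistic uses unpooled variances, and bootstrap samples are drawn within each shifted group at the original group sizes. The two-sided $p$-value it also returns is an added variant based on absolute values and is not part of the steps above.

```python
import numpy as np

def bootstrap_t_test(x, y, n_boot=20_000, seed=0):
    """Bootstrap test of equal means using null-shifted data.

    Hypothetical helper for illustration; assumes continuous data so that
    no bootstrap sample has zero variance.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Step 1: t-statistic in the observed data.
    se = np.sqrt(x.var(ddof=1) / x.size + y.var(ddof=1) / y.size)
    t_obs = (x.mean() - y.mean()) / se

    # Step 2: shift each group so both have the overall mean (null is true).
    overall_mean = np.concatenate([x, y]).mean()
    x0 = x - x.mean() + overall_mean
    y0 = y - y.mean() + overall_mean

    # Step 3: bootstrap samples from the shifted groups, one row per sample.
    xb = rng.choice(x0, size=(n_boot, x0.size), replace=True)
    yb = rng.choice(y0, size=(n_boot, y0.size), replace=True)

    # Step 4: t-statistic in every bootstrap sample.
    se_b = np.sqrt(xb.var(axis=1, ddof=1) / x0.size + yb.var(axis=1, ddof=1) / y0.size)
    t_boot = (xb.mean(axis=1) - yb.mean(axis=1)) / se_b

    # Step 5: (k + 1) / (B + 1) estimate of the p-value.
    p_upper = (np.sum(t_boot >= t_obs) + 1) / (n_boot + 1)
    # Added variant (assumption): two-sided version using absolute values.
    p_two_sided = (np.sum(np.abs(t_boot) >= abs(t_obs)) + 1) / (n_boot + 1)
    return t_obs, p_upper, p_two_sided
```

Holding all bootstrap samples in memory at once keeps the code short; for large groups a loop over bootstrap replicates would use less memory.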

You can read more on that in:

  • Chapter 4 of A.C. Davison and D.V. Hinkley (1997) Bootstrap Methods and their Application. Cambridge: Cambridge University Press.

  • Chapter 16 of Bradley Efron and Robert J. Tibshirani (1993) An Introduction to the Bootstrap. Boca Raton: Chapman & Hall/CRC.

  • Wikipedia entry on bootstrap hypothesis testing.

Attribution
Source: Link, Question Author: CatsLoveJazz, Answer Author: dfrankow
