Why is the jackknife less computationally intensive than the bootstrap?

It’s often claimed that the jackknife is less computationally intensive. Why is that the case?

My understanding is that the jackknife involves the following steps:

  1. Remove 1 data point
  2. Estimate the statistic of interest (e.g. sample mean) on the remaining n − 1 points
  3. Repeat steps 1) and 2) for each data point to get a sampling distribution of the statistic of interest.

The bootstrap involves the following steps:

  1. Generate a bootstrap sample (sample with replacement)
  2. Estimate the statistic of interest (e.g. sample mean) on the bootstrap sample
  3. Repeat steps 1) and 2) many times to get a sampling distribution of the statistic of interest.

It seems to me that Step 2 is the much more computationally intensive part, and is exactly the same between the jackknife and the bootstrap. If so, then how is the jackknife less computationally intensive?
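
To make the comparison concrete, here is a minimal sketch of the two procedures with the sample mean as the statistic. The data, the seed, and the choice of B = 1000 resamples are illustrative assumptions, not part of the question.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)   # illustrative data
n = len(x)

# Jackknife: exactly n leave-one-out estimates of the statistic (here, the mean).
jackknife_means = np.array([np.delete(x, i).mean() for i in range(n)])

# Bootstrap: B resamples drawn with replacement; B is conventionally around 1000.
B = 1000
bootstrap_means = np.array(
    [rng.choice(x, size=n, replace=True).mean() for _ in range(B)]
)
```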

Answer

The jackknife is not intrinsically faster than the bootstrap, as Cliff AB points out below. Nevertheless, two factors sometimes make it faster than the bootstrap in practice.

  1. Convention: During a jackknife, the estimation step is done exactly n times: one data point is omitted from each jackknife estimate. If you had a dataset of n = 50 points, you’d therefore run the estimation procedure 50 times, leaving out the 1st, 2nd, …, 50th point in turn. Bootstraps, by comparison, are almost always run “a large number of times” (~1000); bootstrapping with only 50 resamples is virtually unheard of, and people rarely compute jackknife estimates from absolutely massive samples (n = 10^9), in part because it would be pointlessly slow.

  2. Optimization: Since the entire bootstrap sample is drawn anew on each iteration, each bootstrap sample can be totally different from the others, so the statistic needs to be computed from scratch every time. Each jackknife sample, however, is almost identical to the previous one, differing in only two data points: the one removed during the last iteration (and now added back) and the one removed for the current iteration (which was previously present). This opens the door to some computational optimizations.

For example, suppose you want to estimate the mean. For the bootstrap, you’re stuck adding all n values together each time, so $bn$ additions are required for $b$ bootstrap iterations. For the jackknife estimate, you can instead add all n numbers *once* to find $S = \sum_{i=1}^{n} x_i$. Then the mean of the sample with the $i$th data point removed is $(S - x_i)/(n - 1)$. This requires only about $2n$ additions/subtractions for the whole jackknife. Similar tricks exist for other statistics.
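
A small sketch of that running-sum trick, checked against the naive loop (the data here are just an illustrative example):

```python
import numpy as np

x = np.random.default_rng(1).normal(size=50)
n = len(x)

# Naive jackknife: recompute the mean from scratch for every leave-one-out
# sample, roughly n additions per estimate (about n^2 in total).
naive = np.array([np.delete(x, i).mean() for i in range(n)])

# Optimized jackknife: sum the data once to get S, then each leave-one-out
# mean is (S - x_i) / (n - 1), about 2n additions/subtractions overall.
S = x.sum()
fast = (S - x) / (n - 1)

assert np.allclose(naive, fast)
```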

In fact, closed-form expressions can be derived for the jackknife estimates of certain quantities, allowing you to skip the (re)sampling altogether! For example, Bandos, Guo, and Gur provide a closed-form solution for the variance of auROC here.
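
As a simpler illustration of the same idea (not the auROC result from that paper), the jackknife variance estimate of the sample mean reduces algebraically to the familiar $s^2/n$, so no leave-one-out loop is needed at all. A quick check, assuming the standard jackknife variance formula $\frac{n-1}{n}\sum_i (\hat\theta_{(i)} - \hat\theta_{(\cdot)})^2$:

```python
import numpy as np

x = np.random.default_rng(2).normal(size=50)
n = len(x)

# Jackknife variance estimate of the mean, computed by explicit resampling.
loo_means = np.array([np.delete(x, i).mean() for i in range(n)])
v_resampled = (n - 1) / n * np.sum((loo_means - loo_means.mean()) ** 2)

# Closed form: for the sample mean this reduces to s^2 / n, where s^2 is the
# unbiased sample variance, so no resampling is required.
v_closed_form = x.var(ddof=1) / n

assert np.allclose(v_resampled, v_closed_form)
```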

Attribution
Source : Link , Question Author : Heisenberg , Answer Author : Matt Krause
