# Lower than expected coverage for importance sampling with simulation

I was trying to answer the question *Evaluate integral with Importance sampling method in R*. Basically, the user needs to calculate

$$\int_0^\pi \frac{1}{\cos^2(x)+x^2}\,dx$$

using the exponential distribution as the importance distribution,

$$g(x)=\lambda e^{-\lambda x},\qquad x\ge 0,$$

and find the value of $\lambda$ which gives the best approximation to the integral (it’s self-study). I recast the problem as the evaluation of the mean value $\mu$ of $f(x)=\frac{1}{\cos^2(x)+x^2}$ over $[0,\pi]$: the integral is then just $\pi\mu$.
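As a quick sanity check on this recasting (a standalone sketch, reusing the integrand defined in the code below), plain Monte Carlo under $\mathcal{U}(0,\pi)$ should already reproduce the quadrature value:

```r
# integrand, as defined later in the post
f <- function(x) 1 / (cos(x)^2 + x^2)

set.seed(1)
exact <- integrate(f, 0, pi)$value        # deterministic quadrature
mc    <- pi * mean(f(runif(1e5, 0, pi)))  # integral = pi * E[f(X)], X ~ U(0, pi)
c(exact, mc)
```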

Thus, let $p(x)$ be the pdf of $X\sim\mathcal{U}(0,\pi)$, and let $Y=f(X)$: the goal now is to estimate

$$\mu = E[Y] = \int_0^\pi f(x)\,p(x)\,dx$$

using importance sampling. I performed a simulation in R:

# clear the environment and set the seed for reproducibility
rm(list=ls())
gc()
graphics.off()
set.seed(1)

# function to be integrated
f <- function(x){
  1 / (cos(x)^2+x^2)
}

# importance sampling
importance.sampling <- function(lambda, f, B){
  x <- rexp(B, lambda)
  f(x) / dexp(x, lambda) * dunif(x, 0, pi)
}
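The quantity returned here is $f(x)\,p(x)/g(x)$, with $g=\mathrm{Exp}(\lambda)$ and $p$ the $\mathcal{U}(0,\pi)$ density; any draw with $x>\pi$ gets `dunif(x, 0, pi) = 0` and hence zero weight. A self-contained sketch checking that the weighted mean matches the quadrature value at, say, $\lambda=1$:

```r
f <- function(x) 1 / (cos(x)^2 + x^2)

importance.sampling <- function(lambda, f, B){
  x <- rexp(B, lambda)
  # draws beyond pi contribute zero: dunif(x, 0, pi) = 0 there
  f(x) / dexp(x, lambda) * dunif(x, 0, pi)
}

set.seed(1)
mu.hat <- mean(importance.sampling(1, f, 1e5))  # importance-sampling estimate
mu.num <- integrate(f, 0, pi)$value / pi        # quadrature benchmark
c(mu.hat, mu.num)
```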

# mean value of f
mu.num <- integrate(f,0,pi)$value/pi

# initialize code
means  <- 0
sigmas <- 0
error  <- 0
CI.min <- 0
CI.max <- 0
CI.covers.parameter <- FALSE

# set a value for lambda: we will repeat importance sampling N times to verify
# coverage
N <- 100
lambda <- rep(20,N)

# set the sample size for importance sampling
B <- 10^4

# - estimate the mean value of f using importance sampling, N times
# - compute a confidence interval for the mean each time
# - CI.covers.parameter is set to TRUE if the estimated confidence
#   interval contains the mean value computed by integrate, otherwise
#   is set to FALSE
j <- 0
for(i in lambda){
  I <- importance.sampling(i, f, B)
  j <- j + 1
  mu <- mean(I)
  std <- sd(I)
  lower.CB <- mu - 1.96*std/sqrt(B)
  upper.CB <- mu + 1.96*std/sqrt(B)
  means[j] <- mu
  sigmas[j] <- std
  error[j] <- abs(mu-mu.num)
  CI.min[j] <- lower.CB
  CI.max[j] <- upper.CB
  CI.covers.parameter[j] <- lower.CB < mu.num & mu.num < upper.CB
}

# build a dataframe in case you want to have a look at the results for each run
df <- data.frame(lambda, means, sigmas, error, CI.min, CI.max, CI.covers.parameter)

# so, what's the coverage?
mean(CI.covers.parameter)
# 0.19

The code is basically a straightforward implementation of importance sampling, following the notation used here. The importance sampling is then repeated $N$ times to get multiple estimates of $\mu$, and each time a check is made on whether the 95% confidence interval covers the actual mean or not. As you can see, for $\lambda=20$ the actual coverage is just 0.19, and increasing $B$ to values such as $10^6$ doesn’t help (the coverage is even smaller, 0.15). Why is this happening?

## Answer

Importance sampling is quite sensitive to the choice of the importance distribution. Since you chose $\lambda = 20$, the samples you draw using rexp have mean $1/20$ and variance $1/400$, so nearly all of them fall very close to zero. However, the integral you want to evaluate goes from 0 to $\pi \approx 3.14$, so you want a $\lambda$ that gives you draws over that range. I use $\lambda = 1$: with it I can explore the full integration range from 0 to $\pi$, and only a few draws beyond $\pi$ are wasted. Now I rerun your code, changing only $\lambda = 1$.

# clear the environment and set the seed for reproducibility
rm(list=ls())
gc()
graphics.off()
set.seed(1)

# function to be integrated
f <- function(x){
  1 / (cos(x)^2+x^2)
}

# importance sampling
importance.sampling <- function(lambda, f, B){
  x <- rexp(B, lambda)
  f(x) / dexp(x, lambda) * dunif(x, 0, pi)
}

# mean value of f
mu.num <- integrate(f,0,pi)$value/pi

# initialize code
means  <- 0
sigmas <- 0
error  <- 0
CI.min <- 0
CI.max <- 0
CI.covers.parameter <- FALSE

# set a value for lambda: we will repeat importance sampling N times to verify
# coverage
N <- 100
lambda <- rep(1,N)

# set the sample size for importance sampling
B <- 10^4

# - estimate the mean value of f using importance sampling, N times
# - compute a confidence interval for the mean each time
# - CI.covers.parameter is set to TRUE if the estimated confidence
#   interval contains the mean value computed by integrate, otherwise
#   is set to FALSE
j <- 0
for(i in lambda){
  I <- importance.sampling(i, f, B)
  j <- j + 1
  mu <- mean(I)
  std <- sd(I)
  lower.CB <- mu - 1.96*std/sqrt(B)
  upper.CB <- mu + 1.96*std/sqrt(B)
  means[j] <- mu
  sigmas[j] <- std
  error[j] <- abs(mu-mu.num)
  CI.min[j] <- lower.CB
  CI.max[j] <- upper.CB
  CI.covers.parameter[j] <- lower.CB < mu.num & mu.num < upper.CB
}

# build a dataframe in case you want to have a look at the results for each run
df <- data.frame(lambda, means, sigmas, error, CI.min, CI.max, CI.covers.parameter)

# so, what's the coverage?
mean(CI.covers.parameter)
# 0.95


If you play around with $\lambda$, you will see that making it really small (e.g. 0.00001) or really large again gives poor coverage probabilities.
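To see this effect directly, here is a compact, self-contained sketch (the `coverage` helper is a hypothetical name, not from the original code) that estimates the empirical coverage for a few values of $\lambda$:

```r
f <- function(x) 1 / (cos(x)^2 + x^2)
mu.num <- integrate(f, 0, pi)$value / pi

# empirical coverage of the nominal 95% CI over N importance-sampling replications
coverage <- function(lambda, N = 100, B = 1e4){
  covered <- replicate(N, {
    x <- rexp(B, lambda)
    I <- f(x) / dexp(x, lambda) * dunif(x, 0, pi)
    abs(mean(I) - mu.num) < 1.96 * sd(I) / sqrt(B)
  })
  mean(covered)
}

set.seed(1)
cov <- sapply(c(0.00001, 1, 20), coverage)  # too small, reasonable, too large
cov
```

With a too-small rate almost every draw lands beyond $\pi$ and gets zero weight; with a too-large rate almost no draw reaches the bulk of $[0,\pi]$; only the middle value gives coverage near the nominal 95%.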

EDIT:

Regarding the coverage probability decreasing when you go from $B = 10^4$ to $B = 10^6$: that is just a random occurrence, due to the fact that you use only $N = 100$ replications. The (Wald) confidence interval for the coverage probability at $B = 10^4$ is

$$0.19 \pm 1.96\sqrt{\frac{0.19\,(1-0.19)}{100}} = (0.113,\ 0.267).$$

Since 0.15 lies inside this interval, you can’t really say that increasing $B$ to $10^6$ significantly lowers the coverage probability.

In fact, in your code with the same seed, change $N = 100$ to $N = 1000$: with $B = 10^4$ the coverage probability is 0.123, and with $B = 10^6$ it is 0.158.

Now, the confidence interval around 0.123 is

$$0.123 \pm 1.96\sqrt{\frac{0.123\,(1-0.123)}{1000}} = (0.103,\ 0.143).$$

Since 0.158 lies above this interval, with $N = 1000$ replications you get that the coverage probability significantly increases when moving to $B = 10^6$.
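The interval arithmetic above can be reproduced with a one-line helper (a sketch; `wald.ci` is a hypothetical name, not part of the original code):

```r
# Wald interval for a coverage estimate p.hat based on N replications
wald.ci <- function(p.hat, N) p.hat + c(-1, 1) * 1.96 * sqrt(p.hat * (1 - p.hat) / N)

wald.ci(0.19, 100)    # N = 100 replications: approx. (0.113, 0.267)
wald.ci(0.123, 1000)  # N = 1000 replications: approx. (0.103, 0.143)
```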