# Generating random samples obeying the exponential distribution with a given min and max

Random samples obeying the exponential distribution can be generated by the inverse sampling technique by using the quantile function of the exponential distribution:

$$x=F−1(u)=−1λln(u) x = F^{-1}(u) = - \frac{1}{\lambda} \ln(u)$$

where $$uu$$ is a sample drawn from the uniform distribution on the unit interval $$(0,1)(0, 1)$$.

In OpenFOAM software, a distribution model called exponential (here) can be used to generate exponential-distribution random samples, and its users can, supposedly, choose a minimum and maximum value for the exponential-distribution samples prior to the random-number generation.

The governing expression implemented into this software is as follows:

$$x=−1λln[exp(−λtmin)+u{exp(−λtmax)−exp(−λtmin)}] x = -\frac{1}{\lambda} \ln \left[ \exp(-\lambda t_{min}) + u \{\exp(-\lambda t_{max}) - \exp(-\lambda t_{min})\} \right]$$

where $$tmint_{min}$$ and $$tmaxt_{max}$$ are user-defined minimum and maximum values, respectively.

Two (rather broad) questions arise:

1. Have you ever come across the above expression (or similar one) in the literature for generating exponential-distribution random samples with a given min-max? Or do you think this expression looks like (or is) a heuristic solution?
2. If heuristic, would you suggest a way to carry out verification tests on this expression to test whether the expression produces samples obeying the exponential distribution within [min,max]? (Plotting normalised histogram (i.e. counts in a bin divided by the number of observations times the bin width) and comparing it with the analytical exponential distribution seem to be problematic due to the min/max limits).

You describe truncation to an interval. I will elaborate.

Suppose $$XX$$ is any random variable (such as an exponential variable) and let $$FXF_X$$ be its distribution function,

$$FX(x)=PrF_X(x) = \Pr(X\le x).$$

For an interval $$[a,b],[a,b],$$ the truncation limits $$XX$$ to that interval. That lops off some probability from $$X,X,$$ namely the chance that $$XX$$ either is less than $$aa$$ or greater than $$b.b.$$ The chance that is left is

$$\Pr(X\in[a,b]) = \Pr(X\le b) – \Pr(X\le a) + \Pr(X=a) = F_X(b) – F_X(a) + \Pr(X=a).\Pr(X\in[a,b]) = \Pr(X\le b) - \Pr(X\le a) + \Pr(X=a) = F_X(b) - F_X(a) + \Pr(X=a).$$

Thus, to make the total probability come out to $$1,1,$$ the distribution function for the truncated $$XX$$ must be zero when $$x\lt a,x\lt a,$$ $$11$$ when $$x\ge b,x\ge b,$$ and otherwise is

$$F_X(x;a,b) = \frac{\Pr(X\in[a,x])}{\Pr(X\in[a,b])}= \frac{F_X(x) – F_X(a) + \Pr(X=a)}{F_X(b) – F_X(a) + \Pr(X=a)}.F_X(x;a,b) = \frac{\Pr(X\in[a,x])}{\Pr(X\in[a,b])}= \frac{F_X(x) - F_X(a) + \Pr(X=a)}{F_X(b) - F_X(a) + \Pr(X=a)}.$$

When you can compute the inverse of the distribution function–which almost always means $$XX$$ is a continuous variable–it’s straightforward to generate samples: draw a uniform random probability $$UU$$ (from the interval $$[0,1],[0,1],$$ of course) and find a number $$xx$$ for which $$F_X(x) = U.F_X(x) = U.$$ This value is written

$$x = F^{-1}_X(U).x = F^{-1}_X(U).$$

$$F_X^{-1}F_X^{-1}$$ is called the “percentage point function” or “inverse distribution function.”

For example, when $$XX$$ has an Exponential distribution with rate $$\lambda \gt 0,\lambda \gt 0,$$

$$U = F_X(x) = 1 – \exp(-\lambda x),U = F_X(x) = 1 - \exp(-\lambda x),$$

which we can solve to obtain

$$F_X^{-1}(U) = -\frac{1}{\lambda}\log(U).F_X^{-1}(U) = -\frac{1}{\lambda}\log(U).$$

This is called “inverting the distribution” or “applying the percentage point function.”

It turns out–and this is the point of this post–that when you can invert $$F_X,F_X,$$ you can also invert the truncated distribution. Given $$U,U,$$ this amounts to solving

$$U = F_X(x;a,b) = \frac{F_X(x)-F_X(a)}{F_X(b) – F_X(a)},U = F_X(x;a,b) = \frac{F_X(x)-F_X(a)}{F_X(b) - F_X(a)},$$

because (since we are now assuming $$XX$$ is continuous) the terms $$\Pr(X=a)=0\Pr(X=a)=0$$ drop out. The solution is

$$x = F_X^{-1}(U;a,b) = F_X^{-1}\left(F_X(a)+\left[F_X(b) – F_X(a)\right]U\right).x = F_X^{-1}(U;a,b) = F_X^{-1}\left(F_X(a)+\left[F_X(b) - F_X(a)\right]U\right).$$

That is, the only change is that after drawing $$U,U,$$ you must rescale and shift it to make its value lie between $$F_X(a)F_X(a)$$ and $$F_X(b),F_X(b),$$ and then you invert it.

This yields the second formula in the question.

An equivalent procedure is to draw a uniform value $$VV$$ from the interval $$[F_X(a),F_X(b)][F_X(a),F_X(b)]$$ and compute $$F_X^{-1}(V).F_X^{-1}(V).$$ This works because the scaled and shifted version of $$UU$$ has a uniform distribution in this interval. I use this method in the code below. The figure illustrates the results of this algorithm with $$\lambda=1/2\lambda=1/2$$ and truncation to the interval $$[2,7].[2,7].$$ I think it alone is a pretty good verification of the procedure.

The R code is general-purpose: replace ff (which implements $$F_XF_X$$) and f.inv (which implements $$F^{-1}_XF^{-1}_X$$) with the corresponding functions for any continuous random variable.

#
# Provide a CDF and its percentage point function.
#
lambda <- 1/2
ff <- function(x) pexp(x, lambda)
f.inv <- function(q) qexp(q, lambda)
#
# Specify the interval of truncation.
#
a <- 2
b <- 7
#
# Simulate data and truncated data.
#
n <- 1e6
x <- f.inv(runif(n))
x.trunc <- f.inv(runif(n, ff(a), ff(b)))
#
# Draw histograms.
#
dx <- (b - a) / 25
bins <- seq(a - ceiling((a - min(x))/dx)*dx, max(x)+dx, by=dx)

h <- hist(x.trunc, breaks=bins, plot=FALSE)
hist(x, breaks=bins, freq=FALSE, ylim=c(0, max(h\$density)), col="#e0e0e0",
xlab="Value",
main="Histogram of X and its truncated version")
plot(h, add=TRUE, freq=FALSE, col="#2020ff40")

abline(v = c(a,b), lty=3, lwd=2)
mtext(c(expression(a), expression(b)), at = c(a, b), side=1, line=0.25)