# Inverse transform method, theoretical graph not matching sample

Suppose a density function is

f_X(x)=\left\{\begin{aligned}\frac{1}{(1+x)^2} & \quad x \ge 0 \\ 0 & \quad \text{otherwise.}\end{aligned}\right.

I am trying to write an algorithm to generate a sample of size $$n = 10^4$$ from this distribution by using the inverse transform method. Why does my histogram not look “nice” like the one below?

n <- 10^4
u <- runif(n)
x <- (-u)/(u-1)
# histogram
hist(x, prob = TRUE, main = bquote(f(x)==1/((1+x)^2))) #density histogram of sample
y <- seq(0, 100, .01)
lines(y, 1/((1+y)^2))    #density curve f(x)


Nice histogram for another density: This happens all the time with distributions that have infinite variance. (This one even has infinite expectation.) One or more extreme values can swamp all the others.

When all values are positive, a nice solution is to plot the histogram on a log scale.

Fortunately, you don’t have to do the math to succeed with this.

The idea is that when the probability element of your distribution is $$f_X(x)\mathrm{d}x,$$ on a log scale $$y = \log(x)$$ we have $$x=e^y,$$ so the probability element becomes

$$f_Y(y)\mathrm{d}y = f_X\left(e^y\right)\,\mathrm{d}e^y = f_X\left(e^y\right)e^y\,\mathrm{d}y.$$

Notice this involves only (a) evaluating $$f_X$$ at $$e^y$$ and (b) multiplying that by $$e^y.$$

The changes to the code–which work in all such cases, regardless of $$f$$–are

1. Make a histogram of the logarithms of the data.

hist(log(x), prob = TRUE, main = bquote(f(x)==1/(1+x)^2), breaks=50, col=gray(.9))

2. Overplot it with the adjusted function $$f_Y:$$

f <- function(x) 1 / (1+x)^2 # The original density function
curve(f(exp(y)) * exp(y), add=TRUE, n=201, xname="y", lwd=2, col="Red")


You can get a little fancier with (1) if you like. Here is a version of it with with the values shown on the axis rather than their logarithms. If you prefer this style, here’s some R code for step (1) to get you started.

hist(log(x), prob = TRUE, main = bquote(f(x)==1/(1+x)^2), breaks=50, col=gray(.9),
xaxt="n", xlab="Value")
i <- floor(range(log10(x)))
xlab <- seq(min(i), max(i))[-1]
rug(xlab * log(10))
for (x in xlab) mtext(bquote(10^.(x)), side=1, line=1/2, at=x * log(10))


An alternative approach is to break the histogram up into two or more pieces. Use a high quantile for a flexible choice; or inspect the initial histogram and choose the threshold(s) for splitting the data by eye. Watch out for a pitfall: when you feed only a part of the data to hist, it overestimates the densities. Multiply the densities by the fraction of data being shown. The code demonstrates how to do that in R: save the output of hist and simply scale its densities (making no other changes), and only then plot this object.

The code is starting to get a little fussy, though:

alpha <- 0.02
threshold <- quantile(x, 1 - alpha)
par(mfrow=c(1,2))
h <- hist(x[x <= threshold], breaks=50, plot=FALSE)
# h$density <- h$density * (1-alpha)
plot(h, freq=FALSE, main = bquote(f(x)==1/(1+x)^2), col=gray(.9), xlab="Value",
sub=bquote(paste("The highest ", .(signif(100*alpha, 2)), "% of data are not shown")))
curve(f(x), add=TRUE, col="Red", lwd=2)
h <- hist(x[x > threshold], breaks=seq(threshold, max(x), length.out=21), plot=FALSE)
h$density <- h$density * alpha
plot(h, freq=FALSE, main = bquote(f(x)==1/(1+x)^2), col=gray(.9),
xlab="Value",
sub=bquote(paste("Only the highest ", .(signif(100*alpha, 2)), "% of data are shown")))
curve(f(x), add=TRUE, col="Red", lwd=2)
par(mfrow=c(1,1))


It can be interesting to use variable-width bins. This method can be combined with the previous ones. Here is an example of the lower part of the data. I broke the data by quantiles for this one. Again, the densities have to be adjusted when a subset of the data is being plotted.

b <- quantile(x, seq(0, 1, length.out=21))
k <- 2
n <- length(b)-k
h <- hist(x[x <= b[n]], plot=FALSE, breaks=b[-(n + seq_len(k))])
q <- sum(x <= b[n]) / length(x)
h$$density <- h$$density * q

plot(h, main = bquote(f(x)==1/(1+x)^2),
col=gray(.9), xlab="Value",
sub=bquote(paste("Only the lowest ", .(signif(100*q, 2)),
"% of data are shown")))
curve(f(x), add=TRUE, col="Red", lwd=2)