Generate points efficiently between unit circle and unit square

I’d like generate samples from the blue region defined here:

The naive solution is to use rejection sampling in the unit square, but this provides only a $1-\pi/4$ (~21.4%) efficiency.

Is there some way I can sample more efficiently?

Will two million points per second do?

The distribution is symmetric: we only need work out the distribution for one-eighth of the full circle and then copy it around the other octants. In polar coordinates $(r,\theta)$, the cumulative distribution of the angle $\Theta$ for the random location $(X,Y)$ at the value $\theta$ is given by the area between the triangle $(0,0), (1,0), (1,\tan\theta)$ and the arc of the circle extending from $(1,0)$ to $(\cos\theta,\sin\theta)$. It is thereby proportional to

whence its density is

We may sample from this density using, say, a rejection method (which has efficiency $8/\pi-2 \approx 54.6479\%$).

The conditional density of the radial coordinate $R$ is proportional to $rdr$ between $r=1$ and $r=\sec\theta$. That can be sampled with an easy inversion of the CDF.

If we generate independent samples $(r_i,\theta_i)$, conversion back to Cartesian coordinates $(x_i,y_i)$ samples this octant. Because the samples are independent, randomly swapping the coordinates produces an independent random sample from the first quadrant, as desired. (The random swaps require generating only a single Binomial variable to determine how many of the realizations to swap.)

Each such realization of $(X,Y)$ requires, on average, one uniform variate (for $R$) plus $1/(8\pi-2)$ times two uniform variates (for $\Theta$) and a small amount of (fast) calculation. That’s $4/(\pi-4) \approx 4.66$ variates per point (which, of course, has two coordinates). Full details are in the code example below. This figure plots 10,000 out of more than a half million points generated.

Here is the R code that produced this simulation and timed it.

n.sim <- 1e6
x.time <- system.time({
# Generate trial angles theta
theta <- sqrt(runif(n.sim)) * pi/4
# Rejection step.
theta <- theta[runif(n.sim) * 4 * theta <= pi * tan(theta)^2]
# Generate radial coordinates r.
n <- length(theta)
r <- sqrt(1 + runif(n) * tan(theta)^2)
# Convert to Cartesian coordinates.
# (The products will generate a full circle)
x <- r * cos(theta) #* c(1,1,-1,-1)
y <- r * sin(theta) #* c(1,-1,1,-1)
# Swap approximately half the coordinates.
k <- rbinom(1, n, 1/2)
if (k > 0) {
z <- y[1:k]
y[1:k] <- x[1:k]
x[1:k] <- z
}
})
message(signif(x.time[3] * 1e6/n, 2), " seconds per million points.")
#
# Plot the result to confirm.
#
plot(c(0,1), c(0,1), type="n", bty="n", asp=1, xlab="x", ylab="y")
rect(-1, -1, 1, 1, col="White", border="#00000040")
m <- sample.int(n, min(n, 1e4))
points(x[m],y[m], pch=19, cex=1/2, col="#0000e010")