# How to calculate overlap between empirical probability densities?

I’m looking for a method to calculate the area of overlap between two kernel density estimates in R, as a measure of similarity between two samples. To clarify, in the following example, I would need to quantify the area of the purplish overlapping region:

    library(ggplot2)
    set.seed(1234)
    d <- data.frame(variable=c(rep("a", 50), rep("b", 30)), value=c(rnorm(50), runif(30, 0, 3)))
    ggplot(d, aes(value, fill=variable)) + geom_density(alpha=.4, color=NA)


A similar question was discussed here, the difference being that I need to do this for arbitrary empirical data rather than predefined normal distributions. The overlap package addresses this question, but apparently only for timestamp data, which doesn’t work for me. The Bray-Curtis index (as implemented in vegan package’s vegdist(method="bray") function) also seems relevant but again for somewhat different data.

I’m interested in both the theoretical approach and the R functions I might employ to implement it.

1) The original KDEs have probably been evaluated over some grid. If the grid is the same for both (or can easily be made the same), the exercise could be as easy as taking $\min(K_1(x),K_2(x))$ at each grid point, where $K_1$ and $K_2$ denote the two density estimates, and then integrating with the trapezoidal rule, or even a midpoint rule.
2) You might find the point (or points) of intersection and, on each interval between them, integrate whichever of the two KDEs is the lower one there. In your diagram above you’d integrate the blue curve to the left of the intersection and the pink one to the right, by whatever means you like or have available. This can be done essentially exactly by summing the areas under the individual kernel components $\frac{1}{h}K\left(\frac{x-x_i}{h}\right)$ to the left or right of that cut-off point, each weighted by $1/n$ for a sample of size $n$.
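
Both approaches can be sketched in base R as follows. This is only a minimal illustration under some assumptions: Gaussian kernels with the default `bw.nrd0` bandwidth (the defaults of `density()`), a grid extended three bandwidths beyond the data, and illustrative names (`overlap_grid`, `overlap_exact`, `kde_pdf`, `kde_cdf`) that are not from any package. The second approach also assumes that any crossings outside the grid carry negligible area.

```r
set.seed(1234)
a <- rnorm(50)          # sample "a" from the question
b <- runif(30, 0, 3)    # sample "b" from the question

# Bandwidths from density()'s default rule, so both approaches use the same KDEs.
ha <- bw.nrd0(a)
hb <- bw.nrd0(b)

# Shared evaluation range, extended 3 bandwidths past the data (an assumption).
lo <- min(a, b) - 3 * max(ha, hb)
hi <- max(a, b) + 3 * max(ha, hb)

## Approach 1: pointwise minimum on a common grid + trapezoidal rule
da <- density(a, bw = ha, from = lo, to = hi, n = 1024)
db <- density(b, bw = hb, from = lo, to = hi, n = 1024)
y  <- pmin(da$y, db$y)                       # lower curve at each grid point
overlap_grid <- sum(diff(da$x) * (y[-1] + y[-length(y)]) / 2)

## Approach 2: find the crossings, then integrate the lower KDE via its CDF
# With Gaussian kernels the KDE is a mixture of normals, so its density and
# CDF are just averages of dnorm/pnorm over the data points.
kde_pdf <- function(t, x, h) sapply(t, function(ti) mean(dnorm(ti, mean = x, sd = h)))
kde_cdf <- function(t, x, h) sapply(t, function(ti) mean(pnorm(ti, mean = x, sd = h)))

# Difference of the two KDEs; its roots are the intersection points.
g <- function(t) kde_pdf(t, a, ha) - kde_pdf(t, b, hb)

# Bracket sign changes on the grid, then refine each one with uniroot().
gv    <- g(da$x)
idx   <- which(diff(sign(gv)) != 0)
roots <- vapply(idx, function(i) {
  uniroot(g, c(da$x[i], da$x[i + 1]), tol = 1e-10)$root
}, numeric(1))

# Intervals between crossings; one probe point inside each decides which KDE is lower.
edges  <- c(-Inf, roots, Inf)
inner  <- c(lo, roots, hi)
probes <- (inner[-1] + inner[-length(inner)]) / 2

overlap_exact <- 0
for (i in seq_along(probes)) {
  a_lower <- g(probes[i]) < 0
  x <- if (a_lower) a else b
  h <- if (a_lower) ha else hb
  # Area under the lower KDE on this interval, from its CDF.
  overlap_exact <- overlap_exact + diff(kde_cdf(edges[i:(i + 1)], x, h))
}

c(grid = overlap_grid, exact = overlap_exact)
```

The two numbers should agree closely. The grid version is simpler and works for any kernel that `density()` supports, while the CDF version avoids numerical integration but relies on the kernel having a known CDF (here `pnorm`).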