Computing the mode of data sampled from a continuous distribution

What are the best methods for fitting the ‘mode’ of data sampled from a continuous distribution?

Since the mode is technically undefined (right?) for a continuous distribution, I’m really asking ‘how do you find the most common value’?

If you assume the parent distribution is Gaussian, you could bin the data and take the mode to be the location of the bin with the greatest count. However, how do you determine the bin size? Are there robust implementations available (i.e., robust to outliers)? I use python/scipy/numpy, but I can probably translate R without too much difficulty.


In R, here is a method that isn’t based on parametric modelling of the underlying distribution and instead uses the default kernel density estimator, applied to 10000 gamma-distributed variables:

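x <- rgamma(10000, 2, 5)   # shape = 2, rate = 5
z <- density(x)            # default Gaussian kernel density estimate
plot(z) # always good to check visually
z$x[which.max(z$y)]        # x at which the estimated density is highest

returns 0.199, which is the value of x estimated to have the highest density (the density estimates are stored as “z$y”). The exact figure varies from sample to sample; the true mode of this gamma distribution is (2 - 1)/5 = 0.2.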
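Since the question mentions Python, here is a rough NumPy-only sketch of the same idea: evaluate a Gaussian kernel density estimate on a grid and take the grid point where it peaks. This is not a port of R's density(). The helper name kde_mode, the 512-point grid, the padding of 3 bandwidths, and the fixed seed are all my own assumptions, chosen to resemble R's defaults, and the bandwidth is Silverman's rule of thumb rather than anything tuned to the data.

```python
import numpy as np


def kde_mode(samples, grid_size=512):
    """Return (mode, grid, density) from a Gaussian KDE evaluated on a grid.

    Hypothetical helper for illustration; not part of NumPy or SciPy.
    """
    x = np.asarray(samples, dtype=float)
    n = x.size

    # Rule-of-thumb bandwidth (Silverman): 0.9 * min(sd, IQR / 1.34) * n^(-1/5)
    sd = x.std(ddof=1)
    q75, q25 = np.percentile(x, [75, 25])
    bandwidth = 0.9 * min(sd, (q75 - q25) / 1.34) * n ** (-0.2)
    if bandwidth <= 0:
        raise ValueError("bandwidth is zero; the data have no spread")

    # Evenly spaced grid extending 3 bandwidths beyond the data range
    grid = np.linspace(x.min() - 3 * bandwidth, x.max() + 3 * bandwidth, grid_size)

    # Sum of Gaussian kernels centred on each sample, one grid point at a time
    norm = n * bandwidth * np.sqrt(2 * np.pi)
    density = np.array(
        [np.exp(-0.5 * ((g - x) / bandwidth) ** 2).sum() for g in grid]
    ) / norm

    # The mode estimate is the grid point with the highest estimated density
    return grid[np.argmax(density)], grid, density


rng = np.random.default_rng(0)  # fixed seed, for reproducibility only
# NumPy parameterises the gamma by scale, so R's rate = 5 becomes scale = 1/5
x = rng.gamma(shape=2.0, scale=1.0 / 5.0, size=10_000)

mode, grid, density = kde_mode(x)
print(f"estimated mode: {mode:.3f}")
```

The estimate should land near the true mode of 0.2, but it depends on the bandwidth: a wider kernel smooths the peak and, for a skewed distribution like this one, nudges the estimate slightly to the right. As in the R version, plotting density against grid is a good sanity check.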

Source: Link, Question Author: keflavich, Answer Author: Peter Ellis
