# Distribution for percentage data

I have a question about the correct distribution to use when modelling my data. I conducted a forest inventory with 50 plots, each measuring 20 m × 50 m. For each plot, I estimated the percentage of tree canopy that shades the ground, so each plot has a single canopy cover value. Expressed as proportions, the values range from 0 to 0.95 (that is, 0% to 95%). I am modelling percent tree canopy cover (the Y variable) with a matrix of independent X variables based on satellite imagery and environmental data.

I am not sure whether I should use a binomial distribution, since a binomial random variable is the sum of n independent trials (i.e., Bernoulli random variables). My percentage values are not sums of trials; they are the percentages themselves. Should I use a gamma distribution, even though it has no upper limit? Should I convert the percentages to integers and model them as Poisson counts? Should I just stick with a Gaussian? I have not found many examples in the literature or in textbooks that model percentages in this way. Any hints or insights are appreciated.

The following article discusses a good way to transform a beta-distributed response variable when the observed values include true 0s and/or 1s:
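As a concrete illustration of this kind of transformation, one that is often cited is from Smithson and Verkuilen (2006): $y' = (y(n-1) + 0.5)/n$, where $n$ is the sample size. Whether this is exactly what the article above recommends is an assumption on my part. A minimal sketch in Python, where the helper name `squeeze_unit_interval` and the sample values are hypothetical:

```python
def squeeze_unit_interval(y):
    """Compress proportions in [0, 1] into the open interval (0, 1).

    Applies y' = (y * (n - 1) + 0.5) / n, where n is the number of
    observations (assumed here to be the Smithson-Verkuilen form).
    """
    n = len(y)
    if n == 0:
        raise ValueError("need at least one observation")
    if any(v < 0.0 or v > 1.0 for v in y):
        raise ValueError("values must be proportions in [0, 1]")
    return [(v * (n - 1) + 0.5) / n for v in y]


# Illustrative values only (not the poster's data): canopy cover as proportions.
canopy = [0.0, 0.10, 0.50, 0.95]
squeezed = squeeze_unit_interval(canopy)
```

With 50 plots, this maps an observed 0 to 0.01 and an observed 1 to 0.99, so every value falls strictly inside $(0, 1)$. The shift depends on $n$, which is worth keeping in mind when comparing fits across data sets of different sizes.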

@DimitriyV.Masterov raises the good point that your data include $0$s, whereas the beta distribution is supported only on $(0, 1)$. This prompts the question of what should be done with such values. Some ideas can be gleaned from this excellent CV thread: "How small a quantity should be added to x to avoid taking the log of 0?"
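To see why exact zeros are a problem, it may help to look at the beta log-density directly: it contains a $\log y$ term, which is undefined at $y = 0$. The sketch below assumes the mean-precision parameterization commonly used in beta regression ($a = \mu\phi$, $b = (1-\mu)\phi$); the function name `beta_logpdf` and the parameter values are illustrative, not taken from the question.

```python
import math


def beta_logpdf(y, mu, phi):
    """Log density of Beta(a, b) with a = mu * phi and b = (1 - mu) * phi.

    mu is the mean (0 < mu < 1) and phi > 0 is a precision parameter;
    this parameterization is an assumption made for illustration.
    """
    a, b = mu * phi, (1.0 - mu) * phi
    # The formula below needs log(y) and log(1 - y), so y must lie
    # strictly inside (0, 1); reject boundary values explicitly.
    if not 0.0 < y < 1.0:
        raise ValueError("beta log-density requires 0 < y < 1")
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y)


inside = beta_logpdf(0.30, mu=0.4, phi=5.0)  # fine: 0.30 is inside (0, 1)

try:
    beta_logpdf(0.0, mu=0.4, phi=5.0)  # a plot with zero canopy cover
    zero_ok = True
except ValueError:
    zero_ok = False  # the likelihood cannot be evaluated at y = 0
```

A plot with a canopy cover of exactly 0 therefore cannot contribute to a beta likelihood as it stands, which is why such values need to be shifted, transformed, or modelled separately.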