I know very little about Probability and Statistics, and I would like to learn. I see the word “distribution” used all over the place in different contexts.

For example, a discrete random variable has a “probability distribution.” I know what this is.

A continuous random variable has a probability density function; for x∈R, the integral from −∞ to x of the probability density function is the cumulative distribution function evaluated at x. And apparently just “distribution function” is synonymous with “cumulative distribution function,” at least when talking about continuous random variables (question: are they always synonyms?).

Then there are many famous distributions: the Γ distribution, the χ² distribution, etc. But what exactly is a Γ distribution? Is it the cumulative distribution function of a Γ random variable? Or the probability density function of a Γ random variable?

But then a frequency distribution of a finite data set appears to be a histogram.

Long story short: in Probability and Statistics, what is the definition of the word “distribution”?

I know the definition of distribution in Mathematics (an element of the dual space of the collection of test functions equipped with the inductive limit topology), but not Probability and Statistics.

**Answer**

The following is for R-valued random variables. The extension to other spaces is straightforward if you are interested. I would argue that the following slightly more general definition is more intuitive than separately considering density, mass and cumulative distribution functions.

I include some mathematical / probabilistic terms in the text to make it correct. If one is not familiar with those terms, the intuition is equally well grasped by just thinking of “Borel sets” as “any subset of R that I can think of”, and of the random variable as the numerical outcome of some experiment with an associated probability.

Let (Ω,F,P) be a probability space and X(ω) an R-valued random variable on this space.

*The set function Q(A) := P({ω∈Ω : X(ω)∈A}), where A is a Borel set, is called the distribution of X.*

In words, the distribution tells you (loosely speaking), for any subset of R, the probability that X takes on a value in that set. One can prove that Q is completely determined by the function F(x):=P(X≤x) and vice versa. To do that (and I skip the details here), construct a measure on the Borel sets that assigns the probability F(x) to every set (−∞,x] and argue that this finite measure agrees with Q on a π-system generating the Borel σ-algebra.
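To make the link between Q and F concrete, here is a minimal sketch, assuming X is a standard normal random variable (my choice for illustration, not part of the definition). For a half-open interval A = (a, b], the distribution gives Q(A) = F(b) − F(a), and a simulation of X should land in A with roughly that frequency. The function names are hypothetical.

```python
import math
import random


def normal_cdf(x):
    """F(x) = P(X <= x) for a standard normal X (assumed example)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def q_from_cdf(a, b):
    """Q((a, b]) computed from the distribution function: F(b) - F(a)."""
    return normal_cdf(b) - normal_cdf(a)


def q_from_samples(a, b, n=100_000, seed=0):
    """Estimate Q((a, b]) as the fraction of simulated outcomes X(w) in (a, b]."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if a < rng.gauss(0.0, 1.0) <= b)
    return hits / n


if __name__ == "__main__":
    print("Q((-1, 1]) from F:      ", q_from_cdf(-1.0, 1.0))
    print("Q((-1, 1]) by simulation:", q_from_samples(-1.0, 1.0))
```

The two numbers should agree up to simulation noise, which is the practical content of "Q is determined by F" on intervals.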

If it so happens that Q(A) can be written as $Q(A) = \int_A f(x)\,dx$, then f is a density function for Q. Although this density is not uniquely determined (consider changes on sets of Lebesgue measure zero), it makes sense to also speak of f as the distribution of X. Usually, however, we call it the probability density function of X.
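As a rough numerical illustration of recovering Q from a density, the sketch below assumes the exponential density f(x) = e^(−x) for x ≥ 0 (an example of my choosing) and approximates the integral over an interval A = [a, b] with a midpoint rule. The helper names are hypothetical.

```python
import math


def exp_density(x):
    """Assumed example density: f(x) = exp(-x) for x >= 0, and 0 otherwise."""
    return math.exp(-x) if x >= 0 else 0.0


def q_interval(f, a, b, steps=100_000):
    """Approximate Q([a, b]) = integral of f over [a, b] by the midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


if __name__ == "__main__":
    approx = q_interval(exp_density, 1.0, 2.0)
    exact = math.exp(-1.0) - math.exp(-2.0)  # closed form for this density
    print(approx, exact)
```

Since a single point has Lebesgue measure zero, Q([a, b]) and Q((a, b)) coincide here, which is why the choice of endpoints does not matter when a density exists.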

Similarly, if it so happens that Q(A) can be written as $Q(A) = \sum_{i \in A \cap \{\dots, -1, 0, 1, \dots\}} f(i)$, then it makes sense to speak of f as the distribution of X, although we usually call it the probability mass function.
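The discrete case can be sketched the same way. The example below assumes a Poisson mass function with rate 2 (my choice for illustration) and describes the set A by a membership test; the sum runs over a finite range of integers, which is an approximation that only works because the mass outside that range is negligible here. All names are hypothetical.

```python
import math


def poisson_pmf(i, lam=2.0):
    """Assumed example mass function: Poisson with rate lam on {0, 1, 2, ...}."""
    if i < 0:
        return 0.0
    return math.exp(-lam) * lam ** i / math.factorial(i)


def q_discrete(pmf, in_A, lo=-100, hi=100):
    """Q(A) = sum of pmf(i) over the integers i in A (truncated to lo..hi)."""
    return sum(pmf(i) for i in range(lo, hi + 1) if in_A(i))


if __name__ == "__main__":
    # A = [0.5, 3.2] contains the integers 1, 2 and 3.
    print(q_discrete(poisson_pmf, lambda x: 0.5 <= x <= 3.2))
    # A = R should give total mass (approximately) 1.
    print(q_discrete(poisson_pmf, lambda x: True))
```

Note that A itself need not consist of integers; only the integers inside A contribute to Q(A).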

Thus, whenever you read something like “X follows a uniform distribution on [0,1]”, it simply means that the function Q(A), which tells you the probability that X takes on values in certain sets, is characterized by the probability density function $f(x) = I_{[0,1]}(x)$ or the cumulative distribution function $F(x) = \int_{-\infty}^{x} f(t)\,dt$.
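For the uniform example, integrating the indicator density gives F(x) = 0 for x < 0, F(x) = x on [0,1], and F(x) = 1 for x > 1. A small sketch, with hypothetical function names, shows how this F assigns a probability to a set that is not a single interval, by adding up F(b) − F(a) over disjoint pieces (a, b].

```python
def uniform_cdf(x):
    """F(x) for the uniform distribution on [0, 1]: x clipped to [0, 1]."""
    return min(max(x, 0.0), 1.0)


def q_union(intervals):
    """Q(A) for A a union of disjoint intervals (a, b], via F(b) - F(a)."""
    return sum(uniform_cdf(b) - uniform_cdf(a) for a, b in intervals)


if __name__ == "__main__":
    # A = (0.2, 0.5] united with (0.9, 1.5]; the part beyond 1 carries no mass.
    print(q_union([(0.2, 0.5), (0.9, 1.5)]))
```

The part of A lying outside [0,1] contributes nothing, which matches the density being zero there.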

A final note on the case where there is no mention of a random variable, but only a distribution. One may prove that given a distribution function (or a mass, density or cumulative distribution function), there exists a probability space with a random variable that has this distribution. Thus, there is essentially no difference in speaking about a distribution, or about a random variable having that distribution. It’s just a matter of one’s focus.
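One standard way to see this existence claim in practice is the quantile (inverse) transform: take a uniform random number U on [0,1) and set X = F⁻¹(U). The sketch below assumes the exponential distribution function F(x) = 1 − e^(−x), chosen because its inverse has a closed form; it is an illustration under that assumption, not the general proof, and the names are hypothetical.

```python
import math
import random


def exp_cdf(x):
    """Assumed example distribution function: F(x) = 1 - exp(-x) for x >= 0."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0


def exp_quantile(u):
    """Inverse of exp_cdf on [0, 1): F^{-1}(u) = -log(1 - u)."""
    return -math.log(1.0 - u)


def sample_from_cdf(n, seed=0):
    """Build n outcomes of a random variable X = F^{-1}(U), with U uniform."""
    rng = random.Random(seed)
    return [exp_quantile(rng.random()) for _ in range(n)]


def empirical_cdf(samples, x):
    """Fraction of samples that are <= x."""
    return sum(1 for s in samples if s <= x) / len(samples)


if __name__ == "__main__":
    xs = sample_from_cdf(100_000)
    print(empirical_cdf(xs, 1.0), exp_cdf(1.0))
```

Here the probability space is, in effect, the unit interval with the uniform measure, and the constructed X has the prescribed F as its distribution function, up to simulation noise in the check.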

**Attribution** *Source: Link, Question Author: danzibr, Answer Author: ekvall*