# Are “random sample” and “iid random variable” synonyms?

I have been having a hard time understanding the meaning of “random sample” as well as “iid random variable”. I tried to find the meaning in several sources, but just got more and more confused. Here is what I tried and what I found:

DeGroot’s Probability & Statistics says:

Random Samples / i.i.d. / Sample Size: Consider a given probability distribution on the real line that can be represented by either a p.f. or a p.d.f. $f$. It is said that $n$ random variables $X_1,\dots,X_n$ form a random sample from this distribution if these random variables are independent and the marginal p.f. or p.d.f. of each of them is $f$. Such random variables are also said to be independent and identically distributed, abbreviated i.i.d. We refer to the number $n$ of random variables as the sample size.

But another statistics book I have says:

In a Random Sampling, we guarantee that every individual unit in the population gets an equal chance (probability) of being selected.

So, I have a feeling that i.i.d. random variables are the elements that make up a random sample, and that the procedure for obtaining a random sample is random sampling. Am I right?

You don’t say what the other statistics book is, but I’d guess that it is a
book (or section) about finite population sampling.

When you sample random variables, i.e. when you consider a set
$X_1,\dots,X_n$ of $n$ random variables, you know that if they are
independent, $f(x_1,\dots,x_n)=f(x_1)\cdots f(x_n)$, and identically
distributed, in particular $E(X_i)=\mu$ and $\text{Var}(X_i)=\sigma^2$ for all $i$, then:

$$\overline{X}=\frac{\sum_{i=1}^n X_i}{n},\qquad E(\overline{X})=\mu,\qquad \text{Var}(\overline{X})=\frac{\sigma^2}{n}$$

where $\sigma^2$ is the second central moment.
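The standard i.i.d. results $E(\overline{X})=\mu$ and $\text{Var}(\overline{X})=\sigma^2/n$ can be checked exactly on a small discrete case. The sketch below is only an illustration: the distribution (a fair six-sided die) and the sample size $n=3$ are assumptions made for the example, not something taken from the question.

```python
from fractions import Fraction
from itertools import product

# Hypothetical p.f. chosen for illustration: a fair six-sided die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
n = 3  # sample size (assumed)

mu = sum(x * p for x, p in pmf.items())                  # E(X_i)
sigma2 = sum((x - mu) ** 2 * p for x, p in pmf.items())  # Var(X_i)

# Enumerate every possible i.i.d. sample (x_1, ..., x_n). By independence,
# its probability is the product of the marginal probabilities.
mean_of_xbar = Fraction(0)
second_moment_of_xbar = Fraction(0)
for sample in product(pmf, repeat=n):
    prob = Fraction(1)
    for x in sample:
        prob *= pmf[x]
    xbar = Fraction(sum(sample), n)
    mean_of_xbar += prob * xbar
    second_moment_of_xbar += prob * xbar ** 2

var_of_xbar = second_moment_of_xbar - mean_of_xbar ** 2

print(mean_of_xbar == mu)          # E(X-bar) equals mu
print(var_of_xbar == sigma2 / n)   # Var(X-bar) equals sigma^2 / n
```

Exact fractions are used instead of a simulation so that the two identities hold with equality rather than approximately.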

Sampling a finite population is somewhat different. If the population is of
size $N$, in sampling without replacement there are $\binom{N}{n}$ possible
samples $s_i$ of size $n$ and they are equiprobable:

$$p(s_i)=\frac{1}{\binom{N}{n}},\qquad i=1,\dots,\binom{N}{n}$$

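For example, if $N=5$ and $n=3$, the sample space is $\{s_1,\dots,s_{10}\}$
and, labelling the individuals $1,\dots,5$, the possible samples are:

$$\{1,2,3\},\ \{1,2,4\},\ \{1,2,5\},\ \{1,3,4\},\ \{1,3,5\},$$
$$\{1,4,5\},\ \{2,3,4\},\ \{2,3,5\},\ \{2,4,5\},\ \{3,4,5\}$$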

If you count the number of occurrences of each individual, you can see that
each one appears six times, i.e. each individual has an equal chance of being selected (6/10). So each $s_i$ is a random sample according to the second definition. Roughly, it is not an i.i.d. random sample because individuals
are not random variables: you can consistently estimate $E[X]$ by a sample mean but will
never know its exact value, whereas you can know the exact population mean if $n=N$ (let me repeat: roughly).${}^1$
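The counting argument for $N=5$ and $n=3$ can be reproduced by enumeration. This is a minimal sketch; labelling the individuals $1,\dots,5$ is an assumption made only for the illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

N, n = 5, 3
population = range(1, N + 1)  # individuals labelled 1..5 (assumed labels)

# All samples of size n drawn without replacement, order ignored.
samples = list(combinations(population, n))

# How many of the equiprobable samples contain each individual.
counts = Counter(unit for s in samples for unit in s)

# Probability that a given individual ends up in the selected sample.
inclusion_prob = {unit: Fraction(c, len(samples)) for unit, c in counts.items()}

print(len(samples))     # number of possible samples
print(dict(counts))     # occurrences of each individual
print(inclusion_prob)   # equal for every individual
```

Every individual appears in the same number of samples, so every individual has the same chance of being selected, which is what the second definition requires.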

Let $\mu$ be some population mean (mean height, mean income, …). When $n<N$
you can estimate $\mu$ like in random variable sampling:

$$\overline{y}_s=\frac{\sum_{i=1}^n y_i}{n},\qquad E(\overline{y}_s)=\mu$$

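but the variance of the sample mean $\overline{y}_s$ is different:

$$\text{Var}(\overline{y}_s)=\frac{\tilde\sigma^2}{n}\left(1-\frac{n}{N}\right)$$

where $\tilde\sigma^2$ is the population quasi-variance:
$\frac{\sum_{i=1}^N(y_i-\overline{y})^2}{N-1}$ (here $\overline{y}$ is the mean over all $N$ units, i.e. $\mu$).
The factor $(1-n/N)$ is usually called the "finite population correction factor".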
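The role of the finite population correction can be checked exactly on a tiny population, assuming the usual formula $\text{Var}(\overline{y}_s)=\frac{\tilde\sigma^2}{n}\left(1-\frac{n}{N}\right)$ for sampling without replacement. The population values below are hypothetical and chosen only for the illustration.

```python
from fractions import Fraction
from itertools import combinations

y = [2, 3, 5, 7, 11]   # hypothetical population values
N, n = len(y), 3

mu = Fraction(sum(y), N)                              # population mean
quasi_var = sum((v - mu) ** 2 for v in y) / (N - 1)   # population quasi-variance

# Mean of every equiprobable sample of size n drawn without replacement.
sample_means = [Fraction(sum(s), n) for s in combinations(y, n)]

# Exact expectation and variance of the sample mean over the sample space.
mean_of_means = sum(sample_means) / len(sample_means)
var_of_means = sum((m - mean_of_means) ** 2 for m in sample_means) / len(sample_means)

with_fpc = quasi_var / n * (1 - Fraction(n, N))   # formula with the correction
without_fpc = quasi_var / n                        # i.i.d.-style formula

print(mean_of_means == mu)         # the sample mean is unbiased for mu
print(var_of_means == with_fpc)    # matches the corrected formula
print(var_of_means < without_fpc)  # smaller than the uncorrected sigma^2 / n analogue
```

With $n=N$ the correction factor is zero, which matches the remark that the population mean is known exactly when the whole population is sampled.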

This is a quick example of how a (random variable) i.i.d. random sample and a
(finite population) random sample may differ. Statistical inference is mainly
about random variable sampling, while sampling theory is about finite
population sampling.

${}^1$ Say you are manufacturing light bulbs and wish to know their average life
span. Your "population" is just a theoretical or virtual one, at least if you
keep manufacturing light bulbs. So you have to model a data generation process
and interpret a set of light bulbs as a (random variable) sample. Say
now that you find a box of 1000 light bulbs and wish to know their average
life span. You can select a small set of light bulbs (a finite population
sample), but you could also select all of them. If you select a small sample, this
doesn't transform light bulbs into random variables: the random variable is
generated by you, as the choice between "all" and "a small set" is up to
you. However, when a finite population is very large (say your country's
population), so that choosing "all" is not viable, the second situation is better
handled like the first one.