I am trying to learn statistics because it is so prevalent that not understanding it properly keeps me from learning other things. I am having trouble understanding the notion of a sampling distribution of the sample means. I can’t understand the way some books and sites have explained it. I think I have an understanding, but am unsure if it’s correct. Below is my attempt to understand it.

When we talk about some phenomenon taking on a normal distribution, it is generally (not always) concerning the population.

We want to use inferential statistics to draw conclusions about a population, but we don’t have all the data. We use random sampling, where each sample of size n is equally likely to be selected.

So we take lots of samples, let’s say 100, and then the distribution of the means of those samples will be approximately normal according to the central limit theorem. The mean of the sample means will approximate the population mean.
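The repeated-sampling process described above is easy to simulate. The sketch below (assuming NumPy is available; the population and its parameters are made up for illustration) draws 100 samples of size 100 from a skewed population and compares the mean of the sample means to the population mean:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical skewed population of 100,000 values (mean ≈ 2)
population = rng.exponential(scale=2.0, size=100_000)

# Draw 100 samples of size 100 and record each sample's mean
sample_means = [rng.choice(population, size=100).mean() for _ in range(100)]

print(np.mean(sample_means))   # close to...
print(population.mean())       # ...the population mean
```

Despite the skewed population, the mean of the 100 sample means lands very close to the population mean, which is the intuition behind the sampling distribution of the mean.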

Now, what I don’t understand is that a lot of the time you see “A sample of 100 people…”. Wouldn’t we need tens or hundreds of samples of 100 people to approximate the population mean? Or is it the case that we can take a single sample that’s large enough, say 1000, and say that its mean will approximate the population mean? Or do we take a sample of 1000 people, then take 100 random samples of 100 people each from that original 1000, and use that as our approximation?

Does taking a large enough sample to approximate the mean (almost) always work? Does the population even need to be normal for this to work?

**Answer**

I think you might be confusing the expected sampling distribution of a mean (which we would calculate based on a single sample) with the (usually hypothetical) process of repeatedly sampling from the same population.

For any given sample size (even n = 2) we would say that the sample mean (from the two people) estimates the population mean. But the estimation accuracy — that is, how good a job we’ve done of estimating the population mean based on our sample data, as reflected in the standard error of the mean — will be poorer than if we had 20 or 200 people in our sample. This is relatively intuitive (larger samples give better estimation accuracy).
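You can see this directly from the formula for the standard error of the mean, s/√n: it shrinks as n grows. A quick sketch (assuming NumPy; the population parameters are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: 1,000,000 values with mean 50, sd 10
population = rng.normal(loc=50, scale=10, size=1_000_000)

ses = {}
for n in (2, 20, 200):
    sample = rng.choice(population, size=n)
    # Standard error of the mean: sample sd divided by sqrt(n)
    ses[n] = sample.std(ddof=1) / np.sqrt(n)
    print(f"n={n}: mean={sample.mean():.2f}, SE={ses[n]:.2f}")
```

The single sample of 200 gives a much smaller standard error than the sample of 20, which is exactly the “larger samples give better estimation accuracy” point.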

We would then use the standard error to calculate a confidence interval, which (in this case) is based around the Normal distribution. (We’d probably use the t-distribution in small samples, since the standard deviation of the population is often underestimated in a small sample, leading to overly optimistic standard errors.)

In answer to your last question: no, we don’t always need a Normally distributed population to apply these estimation methods — the central limit theorem indicates that the sampling distribution of a mean (estimated, again, from a single sample) will tend to follow a normal distribution even when the underlying population has a non-Normal distribution. This is usually appropriate for “bigger” sample sizes.
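A simulation makes this concrete (assuming NumPy and SciPy; the exponential population is a made-up example of a non-Normal distribution): the population below is strongly right-skewed, yet the distribution of sample means is nearly symmetric.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Strongly right-skewed hypothetical population (skewness ≈ 2)
population = rng.exponential(scale=1.0, size=200_000)

# Means of 2,000 samples, each of size 50
means = np.array([rng.choice(population, size=50).mean() for _ in range(2_000)])

print("population skewness:", stats.skew(population))
print("sample-mean skewness:", stats.skew(means))
```

The skewness of the sample means is a small fraction of the population’s skewness, illustrating the central limit theorem at work even for a clearly non-Normal population.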

Having said that, when you have a non-Normal population that you’re sampling from, the mean might not be an appropriate summary statistic, even if the sampling distribution for that mean could be considered reliable.

**Attribution**
*Source: Link, Question Author: mergesort, Answer Author: James Stanley*