I’m teaching an intro stats class and was reviewing the types of sampling, including systematic sampling where you sample every kth individual or object.
A student asked if sampling every person with a particular characteristic would accomplish the same thing.
For example, would sampling every person wearing a blue t-shirt be random enough, and representative enough of the whole population, at least if you’re asking a question other than “What color t-shirt do you prefer wearing?” My sense is no, but I wondered if anyone here had any thoughts on this.
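The answer, in general, to your question is “no”. Obtaining a random sample from a population (especially of humans) is notoriously difficult. By conditioning on a particular characteristic, you are by definition not obtaining a random sample of the whole population; at best you get a sample of the subpopulation that has that characteristic. How much bias this introduces is another matter altogether: it depends on how strongly the characteristic is associated with the quantity you want to measure.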
As a slightly absurd example, you wouldn’t want to sample this way at, say, a football game between the Bears and the Packers, even if your population was “football fans”. (Bears fans may have different characteristics than other football fans, even when the quantity you are interested in may not seem directly related to football.)
There are many famous examples of hidden bias resulting from obtaining samples in this way. For example, in phone polls conducted for recent US elections, it is believed that people who own only a cell phone and no landline are (perhaps dramatically) underrepresented in the sample. Since these people also tend to be, by and large, younger than those with landlines, a biased sample is obtained. Furthermore, younger people tend to hold different political views than older people. So, this is a simple example of a case where the sample ended up conditioned on a particular characteristic even though nobody intended it. And, even though the poll had nothing to do with the conditioning characteristic (i.e., whether or not one uses a landline), the effect of that characteristic on the poll’s conclusions was significant, both statistically and practically.
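To make the phone-poll example concrete, here is a minimal simulation sketch in Python. Everything in it is a made-up assumption for illustration, not real polling data: the share of younger voters, the landline ownership rates, the support rates for a hypothetical "candidate A", and the function name `simulate_poll`. Note that support depends only on age, never directly on landline ownership, yet sampling only landline owners still shifts the estimate.

```python
import random

def simulate_poll(n_people=100_000, sample_size=5_000, seed=0):
    """Compare a simple random sample with a landline-only sample.

    All probabilities below are illustrative assumptions, not real data.
    """
    rng = random.Random(seed)
    population = []
    for _ in range(n_people):
        young = rng.random() < 0.4  # assumed share of younger voters
        # Assumption: younger people are much less likely to have a landline.
        has_landline = rng.random() < (0.3 if young else 0.9)
        # Assumption: support for candidate A depends on age only.
        supports_a = rng.random() < (0.65 if young else 0.40)
        population.append((young, has_landline, supports_a))

    def share_a(people):
        # Fraction of the given people who support candidate A.
        return sum(p[2] for p in people) / len(people)

    true_share = share_a(population)

    # Simple random sample from the whole population.
    srs = rng.sample(population, sample_size)

    # Sample drawn only from people who can be reached by landline.
    landline_frame = [p for p in population if p[1]]
    landline_sample = rng.sample(landline_frame, sample_size)

    return true_share, share_a(srs), share_a(landline_sample)

true_share, srs_est, landline_est = simulate_poll()
print(f"true share:       {true_share:.3f}")
print(f"random sample:    {srs_est:.3f}")
print(f"landline sample:  {landline_est:.3f}")
```

Under these assumed rates, the population share is about 0.50, while the landline-only frame has an expected share of roughly 0.45, because younger voters make up only about 18% of landline owners instead of 40% of the population. The same mechanism applies to the blue t-shirt idea: the sample is only as good as the (usually unverifiable) assumption that shirt color is unrelated to everything you care about.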