I have been counting stomata on fossil leaf material to apply a known relationship between stomatal index and CO2. I thought that the material was all from one population (one species at a given site). However, exploration of the data suggests there may be two populations. I interpret these to be the species I was targeting and a hybrid, which are difficult to distinguish by leaf morphology (for reasons of stratigraphy, we can rule out that the material comes from two different times, and thus two different ‘real’ CO2 values).
I have been able to find information on how to determine whether two samples come from different populations, but not on what to do when a single sample appears to contain two different populations.
Would it be acceptable to divide the distribution (say, split it at 6.5) and use a Wilcoxon-Mann-Whitney test to determine whether the two parts are significantly different?
What is an unbiased way to determine whether these really are two populations?
These are the stomatal index results for the 40 leaves.
 5.172414 5.246914 5.276382 5.278592 5.288462 5.306122 5.323194 5.325444 5.357143 5.366726
 5.367232 5.376344 5.384615 5.504587 6.053269 6.854839 6.910569 7.006369 7.036247 7.112069
 7.156673 7.231920 7.311828 7.416268 7.440476 7.448494 7.491857 7.526882 7.526882 7.534247
 7.547170 7.559395 7.605634 7.671233 7.749077 7.925408 7.964602 8.064520 8.247423 8.252427
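For concreteness, the split-and-test idea above could be sketched as follows (this assumes SciPy's `mannwhitneyu`; the 6.5 cutoff is the one proposed above, chosen after looking at the data, which is part of what I am unsure about):

```python
from scipy.stats import mannwhitneyu

# Stomatal index values for the leaves listed above.
stomatal_index = [
    5.172414, 5.246914, 5.276382, 5.278592, 5.288462, 5.306122, 5.323194,
    5.325444, 5.357143, 5.366726, 5.367232, 5.376344, 5.384615, 5.504587,
    6.053269, 6.854839, 6.910569, 7.006369, 7.036247, 7.112069, 7.156673,
    7.231920, 7.311828, 7.416268, 7.440476, 7.448494, 7.491857, 7.526882,
    7.526882, 7.534247, 7.547170, 7.559395, 7.605634, 7.671233, 7.749077,
    7.925408, 7.964602, 8.064520, 8.247423, 8.252427,
]

# Split at the proposed cutoff of 6.5, then compare the two parts.
low = [x for x in stomatal_index if x < 6.5]
high = [x for x in stomatal_index if x >= 6.5]
stat, p = mannwhitneyu(low, high, alternative="two-sided")
print(len(low), len(high), stat, p)
```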
Let’s start with terminology. In statistics, a population is the “set of entities under study”. When designing a study, we define the population of interest and then draw samples from this population. So a sample cannot “consist of” multiple populations. More appropriate wording would be to talk about “groups”, “clusters”, or “subpopulations”.
To find clusters in your data, you could use clustering algorithms, which try to split your data into a predefined number of groups according to some criterion. Usually we aim for the samples within each cluster to be as similar to each other as possible, while the clusters themselves are as dissimilar as possible. Notice the logical problem here: if you first group the data so that the groups are dissimilar from each other, and then test whether they differ, the procedure becomes circular. If the test fails, maybe the clustering algorithm was not good enough, or the test not sensitive enough? This opens many ways of “torturing the data until it confesses” and is generally a bad idea.
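To illustrate what such a “black box” clustering step looks like, here is a minimal sketch using scikit-learn’s `KMeans` (an assumption on my part; any clustering algorithm would do). It only labels the observations; as explained above, running a significance test on the groups it produces would be circular:

```python
import numpy as np
from sklearn.cluster import KMeans

# Your 40 stomatal index values.
x = np.array([
    5.172414, 5.246914, 5.276382, 5.278592, 5.288462, 5.306122, 5.323194,
    5.325444, 5.357143, 5.366726, 5.367232, 5.376344, 5.384615, 5.504587,
    6.053269, 6.854839, 6.910569, 7.006369, 7.036247, 7.112069, 7.156673,
    7.231920, 7.311828, 7.416268, 7.440476, 7.448494, 7.491857, 7.526882,
    7.526882, 7.534247, 7.547170, 7.559395, 7.605634, 7.671233, 7.749077,
    7.925408, 7.964602, 8.064520, 8.247423, 8.252427,
]).reshape(-1, 1)

# Ask for exactly two clusters; the algorithm will oblige whether or
# not two subpopulations actually exist.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(x)
print(np.bincount(labels))
```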
One approach that can be justified is to use model-based clustering (i.e. a mixture model, as mentioned in the other answer by Stephan Kolassa) with one or two clusters, and then conduct a likelihood-ratio test to compare the two models. If the data are more “likely” under the two-cluster model, then you can say that the two-cluster solution “fits the data better”, though this does not prove that there were actual subpopulations. This approach requires you to define a statistical model that describes the data, so it is more complicated than using a “black box” clustering algorithm.
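The model-based approach could be sketched as below, using scikit-learn’s `GaussianMixture` (an assumption; any mixture-model package would do). One caveat: the usual chi-squared reference distribution for the likelihood-ratio test is not valid here, because the one-component model lies on the boundary of the two-component parameter space, so the raw LR statistic is printed alongside BIC as a simpler model-comparison criterion; a bootstrap LRT would be the rigorous option:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Your 40 stomatal index values.
x = np.array([
    5.172414, 5.246914, 5.276382, 5.278592, 5.288462, 5.306122, 5.323194,
    5.325444, 5.357143, 5.366726, 5.367232, 5.376344, 5.384615, 5.504587,
    6.053269, 6.854839, 6.910569, 7.006369, 7.036247, 7.112069, 7.156673,
    7.231920, 7.311828, 7.416268, 7.440476, 7.448494, 7.491857, 7.526882,
    7.526882, 7.534247, 7.547170, 7.559395, 7.605634, 7.671233, 7.749077,
    7.925408, 7.964602, 8.064520, 8.247423, 8.252427,
]).reshape(-1, 1)

results = {}
for k in (1, 2):
    gm = GaussianMixture(n_components=k, n_init=10, random_state=0).fit(x)
    loglik = gm.score(x) * len(x)  # score() returns the mean log-likelihood
    results[k] = (loglik, gm.bic(x))

# Likelihood-ratio statistic: twice the log-likelihood gain of the
# two-component model over the one-component model.
lr_stat = 2 * (results[2][0] - results[1][0])
print("LR statistic:", lr_stat)
print("BIC, 1 component:", results[1][1], "| 2 components:", results[2][1])
```

A lower BIC for the two-component fit supports (but does not prove) the two-subpopulation reading.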