In Steven Pinker’s book The Better Angels of Our Nature, he notes that:
Probability is a matter of perspective. Viewed at sufficiently close
range, individual events have determinate causes. Even a coin flip can
be predicted from the starting conditions and the laws of physics, and
a skilled magician can exploit those laws to throw heads every time.
Yet when we zoom out to take a wide-angle view of a large number of
these events, we are seeing the sum of a vast number of causes that
sometimes cancel each other out and sometimes align in the same
direction. The physicist and philosopher Henri Poincaré explained
that we see the operation of chance in a deterministic world either
when a large number of puny causes add up to a formidable effect, or
when a small cause that escapes our notice determines a large effect
that we cannot miss. In the case of organized violence, someone may
want to start a war; he waits for the opportune moment, which may or
may not come; his enemy decides to engage or retreat; bullets fly;
bombs burst; people die. Every event may be determined by the laws of
neuroscience and physics and physiology. But in the aggregate, the
many causes that go into this matrix can sometimes be shuffled into
extreme combinations. (p. 209)
I am particularly interested in the sentence about Poincaré's explanation (bolded in the original), but I give the rest for context. My question: are there statistical ways of describing the two processes that Poincaré described? Here are my guesses:
1) “A large number of puny causes add up to a formidable effect.” The “large number of causes” and “add up” sound to me like the central limit theorem. But in (the classical definition of) the CLT, the causes need to be random variables, not deterministic effects. Is the standard method here to approximate these deterministic effects as some sort of random variable?
2) “A small cause that escapes our notice determines a large effect that we cannot miss.” It seems to me that you could think of this as some sort of hidden Markov model. But the (unobservable) state transition probabilities in an HMM are just that, probabilities, which are, once again, by definition not deterministic.
Interesting thought (+1).
In cases 1) and 2), the problem is the same: we do not have complete information. And probability is a measure of the lack of information.
1) The puny causes may be purely deterministic, but we cannot know deterministically which particular causes are operating. Think of molecules in a gas. The laws of mechanics apply, so what is random here? The information hidden from us: where each molecule is and how fast it is moving. So the CLT applies, not because there is randomness in the system, but because there is randomness in our representation of the system.
2) An HMM has a time component that is not necessarily present in this case. My interpretation is the same as before: the system may be non-random, but our access to its state has some randomness.
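To make the two points concrete, here is a minimal Python sketch under stated assumptions; it is an illustration, not a claim about what Poincaré had in mind. For case 1), each "puny cause" is a hypothetical term cos(phase), the total is a fully deterministic function of the phases, and our ignorance of the phases is modeled as uniform draws. For case 2), I use the logistic map x -> 4x(1-x) as a stand-in for a deterministic system in which a tiny unobserved difference in the starting state grows into a large one. The names `total_effect`, `sample_totals`, and `logistic_orbit` are made up for this example.

```python
import math
import random
import statistics

# Case 1: many puny deterministic causes whose initial state is hidden from us.
def total_effect(phases):
    """Deterministic rule: the same phases always give the same total."""
    return sum(math.cos(p) for p in phases)

def sample_totals(n_causes=200, n_systems=2000, seed=0):
    """Model our ignorance of the phases as uniform draws (an assumption)."""
    rng = random.Random(seed)
    return [
        total_effect([rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_causes)])
        for _ in range(n_systems)
    ]

totals = sample_totals()
mean = statistics.fmean(totals)
sd = statistics.pstdev(totals)
# Fraction of systems within one standard deviation of the mean;
# a Gaussian would give roughly 0.68.
within_one_sd = sum(abs(t - mean) <= sd for t in totals) / len(totals)

# Case 2: one small cause that escapes our notice, under a deterministic rule.
def logistic_orbit(x0, n_steps):
    """Iterate x -> 4x(1-x), a standard deterministic chaotic map."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(4.0 * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_orbit(0.3, 100)
b = logistic_orbit(0.3 + 1e-10, 100)  # differs by an unmeasurably small amount
gaps = [abs(x - y) for x, y in zip(a, b)]

print(f"case 1: mean={mean:.2f}, sd={sd:.2f}, within 1 sd={within_one_sd:.3f}")
print(f"case 2: gap after 5 steps={gaps[5]:.2e}, largest later gap={max(gaps[50:]):.2f}")
```

In both halves nothing in the rule itself is random. The only randomness is in how we represent what we do not know about the starting state, which is the point made above.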
EDIT: I don’t know if Poincaré was thinking of a different statistical approach for these two cases. In case 1) we know the variables, but we cannot measure them because there are too many and they are too small. In case 2) we don’t know the variables. Either way, we end up making assumptions and modeling the observable as best we can, and quite often we assume normality in case 2).
But still, if there were one difference, I think it would be emergence. If all systems were determined by sums of puny causes, then all random variables of the physical world would be Gaussian. Clearly, this is not the case. Why? Because scale matters. Why? Because new properties emerge from interactions at smaller scales, and these new properties need not be Gaussian. Actually, we have no statistical theory for emergence (as far as I know), but maybe one day we will. Then it will be justified to have different statistical approaches for cases 1) and 2).
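As a rough illustration of why sums of puny causes look Gaussian while interacting causes need not, here is a small Python sketch. It is not a model of emergence: as a deliberately simple stand-in for "interaction", I assume the same small causes combine by multiplication instead of addition. The factor range (0.9 to 1.1), the sample sizes, and the helper names `skewness` and `simulate` are all assumptions made for this example.

```python
import math
import random
import statistics

def skewness(values):
    """Sample skewness: about 0 for a symmetric (e.g. Gaussian) distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def simulate(n_causes=200, n_systems=3000, seed=1):
    """Combine the same small causes additively and multiplicatively."""
    rng = random.Random(seed)
    sums, products = [], []
    for _ in range(n_systems):
        # Each cause is a small factor close to 1 (hypothetical choice).
        factors = [rng.uniform(0.9, 1.1) for _ in range(n_causes)]
        sums.append(sum(factors))            # causes add up
        products.append(math.prod(factors))  # causes interact multiplicatively
    return sums, products

sums, products = simulate()
skew_sum = skewness(sums)
skew_product = skewness(products)

print(f"skewness of sums:     {skew_sum:.2f}")
print(f"skewness of products: {skew_product:.2f}")
```

The sums come out close to symmetric, as the CLT suggests, while the products are strongly right-skewed (their logarithm is the sum, so the product itself is closer to log-normal). Changing only how the same small causes combine is enough to leave the Gaussian regime.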