From the Wikipedia page titled correlation does not imply causality,

For any two correlated events, A and B, the different possible relationships include:

- A causes B (direct causation);
- B causes A (reverse causation);
- A and B are consequences of a common cause, but do not cause each other;
- A and B both cause C, which is (explicitly or implicitly) conditioned on;
- A causes B and B causes A (bidirectional or cyclic causation);
- A causes C which causes B (indirect causation);
- There is no connection between A and B; the correlation is a coincidence.

What does the fourth point mean: “A and B both cause C, which is (explicitly or implicitly) conditioned on”? If A and B cause C, why do A and B have to be correlated?

**Answer**

“Conditioning” is a term from probability theory: https://en.wikipedia.org/wiki/Conditional_probability

Conditioning on C means that we only look at cases where C is true. “Implicitly” means that we may not make this restriction explicit, and may not even be aware of making it.

The point means that, when A and B both cause C, observing a correlation between A and B among the cases where C is true does not mean there is a real relationship between A and B. It is the conditioning on C (perhaps unwitting) that creates an artificial correlation.

Let’s take an example.

In a country there exist exactly two diseases, perfectly independent of each other. Call A: “person has the first disease”, B: “person has the second disease”. Assume P(A)=0.1, P(B)=0.1.

Now any person who has at least one of these diseases goes to see the doctor, and nobody else does. Call C: “person goes to see the doctor”. We have C = A or B.

Now let’s calculate a few probabilities:

- P(C) = 1 − P(not A)P(not B) = 1 − 0.9 × 0.9 = 0.19
- P(A|C) = P(B|C) = 0.1/0.19 ≈ 0.53
- P(A and B|C) = 0.01/0.19 ≈ 0.053
- P(A|C)P(B|C) ≈ 0.28
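These figures can be checked with a few lines of exact arithmetic. The following is only a minimal sketch: the variable names are made up for illustration, and it relies on the two assumptions of the example (A and B independent, C = A or B).

```python
from fractions import Fraction

# Assumptions taken from the example: P(A) = P(B) = 0.1, A and B independent.
p_a = Fraction(1, 10)
p_b = Fraction(1, 10)

# Independence in the whole population: P(A and B) = P(A) * P(B).
p_ab = p_a * p_b

# C = A or B, so P(C) follows from inclusion-exclusion.
p_c = p_a + p_b - p_ab

# A implies C, hence P(A and C) = P(A); the same holds for B and for (A and B).
p_a_given_c = p_a / p_c
p_b_given_c = p_b / p_c
p_ab_given_c = p_ab / p_c

# If A and B were independent given C, the last two printed values would be equal.
print("P(C)           =", float(p_c))
print("P(A|C)         =", float(p_a_given_c))
print("P(A and B|C)   =", float(p_ab_given_c))
print("P(A|C)*P(B|C)  =", float(p_a_given_c * p_b_given_c))
```

The gap between the last two values is what “not independent given C” means here.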

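Clearly, when conditioned on C, A and B are very far from being independent: P(A and B|C) is much smaller than P(A|C)P(B|C). In fact, conditioned on C, “not A” seems to “cause” B, since a patient who does not have the first disease must have the second.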

If you use the list of people who were recorded by their doctor(s) as a data source for an analysis, then there seems to be a strong (negative) correlation between diseases A and B. You may not be aware that your data source is actually a conditioning. This is also called a “selection bias”.
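The effect can also be seen in a small simulation. This is a sketch under the assumptions of the example (two independent diseases, each with probability 0.1, and only sick people visiting the doctor); the population size, the seed and the helper `pearson` are arbitrary choices made for illustration. The correlation should come out close to zero in the whole population and strongly negative among the doctor’s patients (working it out by hand from the probabilities above gives −0.9).

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
n_people = 100_000      # arbitrary population size

# Each person gets disease A and disease B independently, each with probability 0.1.
population = [(rng.random() < 0.1, rng.random() < 0.1) for _ in range(n_people)]

# Selection step: only people with at least one disease see the doctor (C = A or B).
patients = [(a, b) for a, b in population if a or b]

corr_population = pearson([int(a) for a, _ in population],
                          [int(b) for _, b in population])
corr_patients = pearson([int(a) for a, _ in patients],
                        [int(b) for _, b in patients])

print("correlation in whole population:", round(corr_population, 3))
print("correlation among patients:     ", round(corr_patients, 3))
```

Nothing about the diseases changes between the two calculations; only the selection of who ends up in the data differs.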

**Attribution**

*Source: Link, Question Author: matt, Answer Author: kjetil b halvorsen*