I am referring to this article: http://www.nytimes.com/2011/01/11/science/11esp.html

Consider the following experiment. Suppose there was reason to believe that a coin was slightly weighted toward heads. In a test, the coin comes up heads 527 times out of 1,000.

> Is this significant evidence that the coin is weighted?
>
> Classical analysis says yes. With a fair coin, the chances of getting 527 or more heads in 1,000 flips is less than 1 in 20, or 5 percent, the conventional cutoff. To put it another way: the experiment finds evidence of a weighted coin “with 95 percent confidence.”
>
> Yet many statisticians do not buy it. One in 20 is the probability of getting any number of heads above 526 in 1,000 throws. That is, it is the sum of the probability of flipping 527, the probability of flipping 528, 529 and so on.
>
> But the experiment did not find all of the numbers in that range; it found just one — 527. It is thus more accurate, these experts say, to calculate the probability of getting that one number — 527 — if the coin is weighted, and compare it with the probability of getting the same number if the coin is fair.
>
> Statisticians can show that this ratio cannot be higher than about 4 to 1, according to Paul Speckman, a statistician, who, with Jeff Rouder, a psychologist, provided the example.

**First question:** This is new to me. Does anybody have a reference where I can find the exact calculation? Alternatively, can you give me the exact calculation yourself, or point me to material with similar examples?

> Bayes devised a way to update the probability for a hypothesis as new evidence comes in.
>
> So in evaluating the strength of a given finding, Bayesian (pronounced BAYZ-ee-un) analysis incorporates known probabilities, if available, from outside the study.
>
> It might be called the “Yeah, right” effect. If a study finds that kumquats reduce the risk of heart disease by 90 percent, that a treatment cures alcohol addiction in a week, that sensitive parents are twice as likely to give birth to a girl as to a boy, the Bayesian response matches that of the native skeptic: Yeah, right. The study findings are weighed against what is observable out in the world.
>
> In at least one area of medicine — diagnostic screening tests — researchers already use known probabilities to evaluate new findings. For instance, a new lie-detection test may be 90 percent accurate, correctly flagging 9 out of 10 liars. But if it is given to a population of 100 people already known to include 10 liars, the test is a lot less impressive.
>
> It correctly identifies 9 of the 10 liars and misses one; but it incorrectly identifies 9 of the other 90 as lying. Dividing the so-called true positives (9) by the total number of people the test flagged (18) gives an accuracy rate of 50 percent. The “false positives” and “false negatives” depend on the known rates in the population.
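The arithmetic in the quoted lie-detector passage can be checked with a short sketch (Python here, purely as an illustration; it is not from the article):

```python
# Checking the lie-detector arithmetic from the quoted passage:
# a 90%-accurate test applied to 100 people, 10 of whom are liars.
liars, honest = 10, 90
true_positives = int(0.9 * liars)    # 9 liars correctly flagged
false_positives = int(0.1 * honest)  # 9 honest people incorrectly flagged
flagged = true_positives + false_positives

# Fraction of flagged people who are actually liars: 9 / 18 = 50 percent.
print(true_positives / flagged)  # 0.5
```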

**Second question:** How exactly do you judge whether a new finding is “real” or not with this method? And isn’t this as arbitrary as the 5% barrier, because of the use of some preset prior probability?

**Answer**

I will answer the first question in detail.

> With a fair coin, the chances of getting 527 or more heads in 1,000 flips is less than 1 in 20, or 5 percent, the conventional cutoff.

For a fair coin the number of heads in 1,000 trials follows the binomial distribution with number of trials $n=1000$ and success probability $p=1/2$. The probability of getting 527 or more heads is then

$$P(B(1000,1/2)\ge 527).$$

This can be calculated with any statistical software package. R gives us

```
> pbinom(526,1000,1/2,lower.tail=FALSE)
0.04684365
```

So the probability that with a fair coin we get more than 526 heads (that is, 527 or more) is approximately 0.047, which is close to the 5% cutoff mentioned in the article.
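As a cross-check (an illustrative sketch in Python, not part of the original answer), the same tail probability can be computed exactly with integer arithmetic, since every binomial term $\binom{1000}{k}$ is an integer and the denominator is $2^{1000}$:

```python
from math import comb

# P(X >= 527) for X ~ Binomial(1000, 1/2), computed exactly:
# sum the integer binomial coefficients, then divide by 2^1000.
n = 1000
tail = sum(comb(n, k) for k in range(527, n + 1))
prob = tail / 2 ** n
print(prob)  # approximately 0.04684365, matching R's pbinom above
```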

The following statement

> To put it another way: the experiment finds evidence of a weighted coin “with 95 percent confidence.”

is debatable. I would be reluctant to phrase it this way, since “95 percent confidence” can be interpreted in several ways.

Next we turn to

> But the experiment did not find all of the numbers in that range; it found just one — 527. It is thus more accurate, these experts say, to calculate the probability of getting that one number — 527 — if the coin is weighted, and compare it with the probability of getting the same number if the coin is fair.

Here we compare two events: $B(1000,1/2)=527$ for a fair coin and $B(1000,p)=527$ for a weighted coin. Substituting the formulas for the probabilities of these events and noting that the binomial coefficients cancel, we get

$$\frac{P(B(1000,p)=527)}{P(B(1000,1/2)=527)}=\frac{p^{527}(1-p)^{473}}{(1/2)^{1000}}.$$

This is a function of $p$, so we can look for its minima or maxima. From the article we may infer that we need the maximum:

> Statisticians can show that this ratio cannot be higher than about 4 to 1, according to Paul Speckman, a statistician, who, with Jeff Rouder, a psychologist, provided the example.

To make the maximisation easier, take the logarithm of the ratio and differentiate with respect to $p$:

$$\frac{d}{dp}\left(527\ln p+473\ln(1-p)+1000\ln 2\right)=\frac{527}{p}-\frac{473}{1-p}.$$

Setting this to zero gives $527(1-p)=473p$, so the solution is

$$p=\frac{527}{1000}.$$

We can check that this is really a maximum, for example with the second derivative test. Substituting $p=527/1000$ into the ratio gives

$$\frac{(527/1000)^{527}(473/1000)^{473}}{(1/2)^{1000}}\approx 4.3$$

So the ratio is about 4.3 to 1, which agrees with the article.
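As a numerical sanity check (a Python sketch of the calculation above, not part of the original answer), we can evaluate the ratio on the log scale and confirm that $p=527/1000$ is indeed the maximiser by comparing it with nearby values of $p$:

```python
from math import exp, log

n, k = 1000, 527

def log_ratio(p):
    # log of the likelihood ratio p^527 * (1-p)^473 / (1/2)^1000
    return k * log(p) + (n - k) * log(1 - p) + n * log(2)

p_hat = k / n  # the maximiser found by setting the derivative to zero

print(exp(log_ratio(p_hat)))                       # approximately 4.3
print(log_ratio(p_hat) > log_ratio(p_hat - 0.01))  # True: nearby p gives a smaller ratio
print(log_ratio(p_hat) > log_ratio(p_hat + 0.01))  # True
```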

**Attribution**

*Source: Link, Question Author: vonjd, Answer Author: mpiktas*