Context: In Gelman’s 8-school example (Bayesian Data Analysis, 3rd edition, Ch 5.5), there are eight parallel experiments in 8 schools testing the effect of coaching. Each experiment yields an estimate of the effectiveness of coaching and an associated standard error.

The authors then build a hierarchical model for the 8 data points of coaching effect as follows:

$$y_i \sim N(\theta_i, \mathrm{se}_i), \qquad \theta_i \sim N(\mu, \tau)$$

Question

In this model, they assume that $\mathrm{se}_i$ is known. I do not understand this assumption: if we feel that we have to model $\theta_i$, why don’t we do the same for $\mathrm{se}_i$?

I’ve checked Rubin’s original paper introducing the 8-school example, and there too the author says (p. 382):

“the assumption of normality and known standard error is made routinely when we summarize a study by an estimated effect and its standard error, and we will not question its use here.”

To summarize, why don’t we model $\mathrm{se}_i$? Why do we treat it as known?

**Answer**

On p. 114 of the same book you cite: “The problem of estimating a set of means with unknown variances will require some additional computational methods, presented in sections 11.6 and 13.6”. So it is for simplicity: with known variances, the equations in your chapter work out in closed form, whereas if you model the variances they do not, and you need the MCMC techniques from the later chapters.
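To make the "closed form" point concrete, here is a minimal sketch of two conditional posteriors that are available analytically once the standard errors are treated as known. It assumes the second argument of $N(\cdot,\cdot)$ is a standard deviation, a flat prior on $\mu$, and an arbitrary illustrative value of $\tau$; the function names are hypothetical, and the numbers are the eight-schools estimates and standard errors as tabulated in BDA3.

```python
import numpy as np

# Eight-schools estimates and standard errors as tabulated in BDA3 (Table 5.2).
y = np.array([28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0])
se = np.array([15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0])

def mu_given_tau(y, se, tau):
    """Posterior mean and variance of mu given tau (assumes a flat prior on mu)."""
    w = 1.0 / (se**2 + tau**2)          # precision of each y_i around mu
    mu_hat = np.sum(w * y) / np.sum(w)  # precision-weighted average
    return mu_hat, 1.0 / np.sum(w)

def theta_given_mu_tau(y, se, mu, tau):
    """Conditional posterior mean and variance of each theta_i given mu and tau."""
    prec = 1.0 / se**2 + 1.0 / tau**2   # data precision plus prior precision
    theta_hat = (y / se**2 + mu / tau**2) / prec
    return theta_hat, 1.0 / prec

tau = 5.0  # illustrative value only; in a full analysis tau has its own posterior
mu_hat, v_mu = mu_given_tau(y, se, tau)
theta_hat, v_theta = theta_given_mu_tau(y, se, mu_hat, tau)
print(mu_hat, theta_hat)
```

Both functions use `se` as a fixed input. If each $\mathrm{se}_i$ were given its own prior, these precision-weighted formulas would no longer be the full answer, which is the complication the quoted passage defers to later chapters.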

In the school example, they rely on the large sample sizes to treat the variances as known “for all practical purposes” (p. 119), and I expect they estimate them using $\frac{1}{n-1}\sum_i (x_i - \bar{x})^2$ and then treat those estimates as the exact known values.
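As a rough illustration of that plug-in step, the sketch below computes the sample variance with the $n-1$ divisor and then the standard error of the mean. It assumes, purely for illustration, that a school's estimate is a plain sample mean of student-level outcomes; the data are simulated and hypothetical, since the source gives no raw data.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical student-level outcomes for a single school (simulated, not real data).
x = rng.normal(loc=10.0, scale=30.0, size=40)

n = x.size
s2 = np.sum((x - x.mean()) ** 2) / (n - 1)  # sample variance with the n-1 divisor
se_mean = np.sqrt(s2 / n)                   # standard error of the sample mean

print(s2, se_mean)
```

Here `s2` matches `np.var(x, ddof=1)`. Under this assumption, the quantity then plugged into the model as "known" would be `se_mean`, which divides the sample variance by a further factor of $n$.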

**Attribution** *Source: Link, Question Author: Heisenberg, Answer Author: Drew N*