The role of scale parameter in GEE

I am learning generalized estimating equations (GEE) and the geepack R package. There are a few points I am confused about.

In a GEE model, we have Var(Y_it) = ϕ_it V(μ_it), where ϕ is the scale parameter. We further decompose Var(Y) into V^{1/2} R(α) V^{1/2}, where α is the correlation parameter. Three link models are specified in geepack, one each for μ, ϕ, and α. See this PDF file for details.
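To make the decomposition concrete, here is a small sketch in base R that assembles the working covariance for a single cluster of size 3 under an exchangeable working correlation. All the numbers (μ, ϕ, α) are made up for illustration; nothing here depends on geepack.

```r
## Build Var(Y_i) = phi * V^{1/2} R(alpha) V^{1/2} for one cluster of size 3.
mu    <- c(0.2, 0.5, 0.8)   # illustrative fitted means (binomial-type model)
V     <- mu * (1 - mu)      # binomial variance function V(mu)
phi   <- 1                  # scale parameter
alpha <- 0.3                # exchangeable correlation parameter

V_half <- diag(sqrt(V))                        # V^{1/2}, diagonal
R      <- matrix(alpha, 3, 3); diag(R) <- 1    # exchangeable R(alpha)

working_cov <- phi * V_half %*% R %*% V_half
working_cov
```

Note the diagonal of the result is ϕ V(μ_it), matching the marginal variance formula above.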

(1) In GEE1, can I say we only need to make sure that the mean structure is correctly specified, i.e. the link model g(μ)=Xβ is correct, while it doesn’t matter whether the link models for ϕ and α are correctly specified?

(2) By default, the function sets scale.value = 1.0; does that mean ϕ = 1.0? It is understandable that the default is alpha = NULL, since people can specify different correlation structures and the program will assign appropriate α values accordingly. My question is: how often do people explicitly model the scale parameter ϕ?

(3) This question is closely related to (2) about the scale parameter ϕ. Recall the variance function: Var(Y_it) = ϕ_it V(μ_it). In the Gaussian case, we have ϕ = σ² and V(μ_it) = 1; in the binomial case, ϕ = 1 and V(μ_it) = μ_it(1 − μ_it); in the Poisson case, ϕ = 1 and V(μ_it) = μ_it. Can I say that, in the negative binomial case, ϕ = 1 and V(μ_it) = μ_it + φμ_it²? Here φ is the NB2 dispersion parameter, NOT the scale parameter ϕ in GEE.
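The family-specific variance functions listed in the question can be written out directly; this is a minimal sketch in base R, where `phi_nb` stands for the NB2 dispersion φ (not the GEE scale parameter ϕ):

```r
## Variance functions V(mu) for the families discussed above.
var_gaussian <- function(mu) rep(1, length(mu))           # V(mu) = 1
var_binomial <- function(mu) mu * (1 - mu)                # V(mu) = mu(1 - mu)
var_poisson  <- function(mu) mu                           # V(mu) = mu
var_nb2      <- function(mu, phi_nb) mu + phi_nb * mu^2   # V(mu) = mu + phi_nb * mu^2

mu <- 2
var_poisson(mu)            # 2
var_nb2(mu, phi_nb = 0.5)  # 2 + 0.5 * 2^2 = 4
```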

Thank you very much!



If you use the sandwich (robust) estimator for the covariance, GEE's coefficient estimates and standard errors are consistent even if your working models for α and ϕ are misspecified; only the mean model needs to be correctly specified. This is well covered elsewhere, e.g. in the thread "Sandwich estimator intuition".
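You can see this robustness empirically. The sketch below (assuming the geepack package is installed, using its bundled `dietox` longitudinal dataset) refits the same mean model under two different working correlation structures; the point estimates barely move, and the reported standard errors are the sandwich ones:

```r
## Same mean model, two working correlation structures.
library(geepack)
data(dietox)  # pig weight data shipped with geepack

fit_ind <- geeglm(Weight ~ Time, id = Pig, data = dietox,
                  family = gaussian, corstr = "independence")
fit_exc <- geeglm(Weight ~ Time, id = Pig, data = dietox,
                  family = gaussian, corstr = "exchangeable")

## Coefficients are nearly identical across working structures.
cbind(independence = coef(fit_ind), exchangeable = coef(fit_exc))

## The "Std.err" column in the summary is the robust (sandwich) SE.
summary(fit_ind)$coefficients
```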


I don’t know if it was the same back in 2012, but nowadays at least, scale.value is only used if scale.fix is TRUE; with the default scale.fix = FALSE, ϕ is estimated from the data.
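A short sketch of the difference (again assuming geepack and its `dietox` data), fixing ϕ at 1 versus letting geeglm estimate it:

```r
## scale.value is honored only when scale.fix = TRUE.
library(geepack)
data(dietox)

fit_fixed <- geeglm(Weight ~ Time, id = Pig, data = dietox,
                    family = gaussian, corstr = "exchangeable",
                    scale.fix = TRUE, scale.value = 1)
fit_est   <- geeglm(Weight ~ Time, id = Pig, data = dietox,
                    family = gaussian, corstr = "exchangeable")  # default: phi estimated

summary(fit_fixed)$dispersion  # held at 1
summary(fit_est)$dispersion    # estimated from the data
```

As discussed in the answer below about ϕ being multiplicative, fixing the scale leaves the coefficient estimates essentially unchanged; it mainly affects the model-based (non-sandwich) variance.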


Yes, you’re exactly right. The scale parameter ϕ is distinct from the NB dispersion φ. They are similar in a vague sense, both addressing overdispersion, but they enter the variance differently. For example, if μ = 1, then φ is added to the variance (μ + φμ² = 1 + φ), whereas ϕ multiplies the variance (ϕ · V(μ)).

One key qualitative difference is that ϕ does not help re-weight observations to lend more credence to those with lower variance, whereas φ does. You can plug in anything for ϕ and get the same coefficients out. Not so for φ; huge values will cause your estimates to depend overly on observations with small fitted values (for μ). There is a nice recap of closely related techniques in this paper, which, as a bonus, involves a cute kind of animal (harbor seals).
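The claim that ϕ never changes the coefficients can be checked in base R without geepack: a quasi-Poisson fit estimates a multiplicative ϕ but solves the same estimating equations as the Poisson fit, so the coefficients match exactly and only the standard errors are rescaled. (The simulated data here are purely illustrative.)

```r
## phi cancels out of the estimating equations, so it cannot move the coefficients.
set.seed(1)
x <- rnorm(200)
y <- rpois(200, exp(0.5 + 0.8 * x))

fit_pois  <- glm(y ~ x, family = poisson)        # phi fixed at 1
fit_quasi <- glm(y ~ x, family = quasipoisson)   # phi estimated

all.equal(coef(fit_pois), coef(fit_quasi))  # identical coefficients
summary(fit_quasi)$dispersion               # phi only rescales the standard errors
```

An NB2 dispersion φ, by contrast, enters the working weights themselves, which is why a negative binomial fit generally produces different coefficients than a Poisson fit on the same data.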

Source: Link, Question Author: alittleboy, Answer Author: eric_kernfeld