Horseshoe priors and random slope/intercept regressions

I’m interested in using the horseshoe prior (or the related hierarchical-shrinkage family of priors) for the regression coefficients of a traditional multilevel regression (e.g., random slopes/intercepts). Horseshoe priors are similar to the lasso and other regularization techniques, but have been found to perform better in many situations. A regression coefficient $\beta_i$, where $i \in \{1, \dots, D\}$ indexes the predictors, has a horseshoe prior if its standard deviation is the product of a local ($\lambda_i$) and a global ($\tau$) scaling parameter.
$$
\begin{aligned}
\beta_i &\sim \text{Normal}(0, \lambda_i) \\
\lambda_i &\sim \text{Cauchy}^+(0, \tau) \\
\tau &\sim \text{Cauchy}^+(0, 1)
\end{aligned}
$$
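
For concreteness, here is a minimal Stan sketch of this prior in an ordinary (non-hierarchical) linear regression; the data block, the residual scale `sigma`, and its weakly informative prior are my own assumptions added to make it self-contained:

```stan
data {
  int<lower=1> N;               // observations
  int<lower=1> D;               // predictors
  matrix[N, D] X;
  vector[N] y;
}
parameters {
  vector[D] beta;
  vector<lower=0>[D] lambda;    // local scales, one per coefficient
  real<lower=0> tau;            // global scale
  real<lower=0> sigma;          // residual sd (assumed, not part of the prior above)
}
model {
  tau ~ cauchy(0, 1);           // tau ~ Cauchy+(0, 1); the lower bound makes it half-Cauchy
  lambda ~ cauchy(0, tau);      // lambda_i ~ Cauchy+(0, tau)
  beta ~ normal(0, lambda);     // beta_i ~ Normal(0, lambda_i)
  sigma ~ cauchy(0, 2);         // assumed weakly informative prior
  y ~ normal(X * beta, sigma);
}
```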

I am uncertain as to the best way to extend this to a random slope/intercept framework. For example, group $j$'s $i$th coefficient is often normally distributed around a group-level mean ($\gamma_i$) with a group-level standard deviation ($\sigma_i$).

$$
\begin{aligned}
\beta_{i,j} &\sim \text{Normal}(\gamma_i, \sigma_i) \\
\gamma_i &\sim \text{Normal}(0, \psi) \\
\sigma_i &\sim \text{Cauchy}^+(0, c)
\end{aligned}
$$
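
In Stan, that standard partial-pooling structure might be sketched roughly as follows; the group index `group`, the likelihood, the residual scale `sigma_y`, and treating $\psi$ and $c$ as fixed data are assumptions I added so the model runs on its own:

```stan
data {
  int<lower=1> N;
  int<lower=1> D;                         // predictors
  int<lower=1> J;                         // groups
  matrix[N, D] X;
  vector[N] y;
  array[N] int<lower=1, upper=J> group;   // group membership (assumed)
  real<lower=0> psi;                      // scale for gamma, assumed fixed
  real<lower=0> c;                        // scale for sigma, assumed fixed
}
parameters {
  matrix[D, J] beta;                      // beta[i, j]: coefficient i in group j
  vector[D] gamma;                        // group-level means
  vector<lower=0>[D] sigma;               // group-level sds
  real<lower=0> sigma_y;                  // residual sd (assumed)
}
model {
  gamma ~ normal(0, psi);
  sigma ~ cauchy(0, c);
  for (j in 1:J)
    col(beta, j) ~ normal(gamma, sigma);  // beta[i, j] ~ Normal(gamma_i, sigma_i)
  sigma_y ~ cauchy(0, 2);                 // assumed
  {
    vector[N] mu;
    for (n in 1:N)
      mu[n] = X[n] * col(beta, group[n]);
    y ~ normal(mu, sigma_y);
  }
}
```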

This tends to shrink the estimates of $\beta_{i,j}$ towards $\gamma_i$ based on the average dispersion around the coefficient mean. However, if only a small number of groups differ substantially from the mean, I’m concerned that the predictive or explanatory ability of the model may suffer. If I wanted to place a horseshoe prior on these coefficients, would it be appropriate to give each group’s coefficient its own independent $\lambda$?

$$
\begin{aligned}
\beta_{i,j} &\sim \text{Normal}(\gamma_i, \lambda_{i,j}) \\
\gamma_i &\sim \text{Normal}(0, \lambda_{i,0}) \\
\lambda_{i,j} &\sim \text{Cauchy}^+(0, \tau) \\
\tau &\sim \text{Cauchy}^+(0, 1)
\end{aligned}
$$
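
A direct Stan translation of this first option might look like the sketch below, where every $\beta_{i,j}$ gets its own half-Cauchy local scale (again, the data block, likelihood, and `sigma_y` are assumed additions):

```stan
data {
  int<lower=1> N;
  int<lower=1> D;
  int<lower=1> J;
  matrix[N, D] X;
  vector[N] y;
  array[N] int<lower=1, upper=J> group;
}
parameters {
  matrix[D, J] beta;
  vector[D] gamma;
  matrix<lower=0>[D, J] lambda;   // lambda[i, j]: local scale for beta[i, j]
  vector<lower=0>[D] lambda0;     // lambda[i, 0]: scale for gamma_i
  real<lower=0> tau;              // shared global scale
  real<lower=0> sigma_y;          // residual sd (assumed)
}
model {
  tau ~ cauchy(0, 1);
  lambda0 ~ cauchy(0, tau);
  to_vector(lambda) ~ cauchy(0, tau);              // lambda[i, j] ~ Cauchy+(0, tau)
  gamma ~ normal(0, lambda0);
  for (j in 1:J)
    col(beta, j) ~ normal(gamma, col(lambda, j));  // beta[i, j] ~ Normal(gamma_i, lambda[i, j])
  sigma_y ~ cauchy(0, 2);                          // assumed
  {
    vector[N] mu;
    for (n in 1:N)
      mu[n] = X[n] * col(beta, group[n]);
    y ~ normal(mu, sigma_y);
  }
}
```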

Would it be better for the $\lambda_{i,j}$'s to have an extra level of hierarchy that controls the dispersion around $\gamma_i$?

$$
\begin{aligned}
\beta_{i,j} &\sim \text{Normal}(\gamma_i, \lambda_{i,j}) \\
\gamma_i &\sim \text{Normal}(0, \lambda_{i,0}) \\
\lambda_{i,j} &\sim \text{Cauchy}^+(0, \phi_i) \\
\lambda_{i,0} &\sim \text{Cauchy}^+(0, \tau) \\
\phi_i &\sim \text{Cauchy}^+(0, \tau) \\
\tau &\sim \text{Cauchy}^+(0, 1)
\end{aligned}
$$
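
This second option would only change the prior on the local scales, routing them through a predictor-specific $\phi_i$; a corresponding Stan sketch (same assumed data block, likelihood, and `sigma_y` as above) could be:

```stan
data {
  int<lower=1> N;
  int<lower=1> D;
  int<lower=1> J;
  matrix[N, D] X;
  vector[N] y;
  array[N] int<lower=1, upper=J> group;
}
parameters {
  matrix[D, J] beta;
  vector[D] gamma;
  matrix<lower=0>[D, J] lambda;
  vector<lower=0>[D] lambda0;
  vector<lower=0>[D] phi;            // phi_i: predictor-specific scale for lambda[i, j]
  real<lower=0> tau;
  real<lower=0> sigma_y;             // residual sd (assumed)
}
model {
  tau ~ cauchy(0, 1);
  phi ~ cauchy(0, tau);              // phi_i ~ Cauchy+(0, tau)
  lambda0 ~ cauchy(0, tau);          // lambda[i, 0] ~ Cauchy+(0, tau)
  for (j in 1:J)
    col(lambda, j) ~ cauchy(0, phi); // lambda[i, j] ~ Cauchy+(0, phi_i)
  gamma ~ normal(0, lambda0);
  for (j in 1:J)
    col(beta, j) ~ normal(gamma, col(lambda, j));
  sigma_y ~ cauchy(0, 2);            // assumed
  {
    vector[N] mu;
    for (n in 1:N)
      mu[n] = X[n] * col(beta, group[n]);
    y ~ normal(mu, sigma_y);
  }
}
```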

I’ve played around with modeling some of these options in Stan, but I would appreciate thoughts or advice on whether or not these formulations make statistical sense.

Answer

Attribution
Source: Link, Question Author: C.R. Peterson, Answer Author: Community
