I am modelling
invertebrate.biomass ~ habitat.type * calendar.day + habitat.type * calendar.day ^ 2
with a random intercept of transect.id (50 transects, each repeated 5 times).
My response is zero-heavy (about 25% of observations are 0) and the non-zero values are strongly right-skewed.
I understand that one possible way of dealing with this is to fit two models: one modelling presence/absence as a binary response in a logistic regression, and the other modelling only the non-zero responses in (e.g.) a Gamma regression. I’m working in R and following the ideas in this post.
I want to check the method for combining the results of these two models in order to generate quantitative predictions (ultimately with CIs). Am I correct in multiplying the predicted probabilities from the logistic regression by the predicted (non-zero) biomass from the Gamma regression? That way, the predicted (non-zero) biomass gets down-weighted according to the probability of any invertebrates being present at all. This makes sense in my head, but feels too easy to be true.
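In symbols, what I am proposing is the following. The notation is my own: for a given covariate combination x, p-hat is the predicted probability of a non-zero response from the logistic model, and mu-hat is the predicted mean of the non-zero responses from the Gamma model, both back-transformed to the response scale.

$$
\hat{E}[Y \mid x] = \hat{p}(x)\,\hat{\mu}(x),
\qquad
\hat{p}(x) = \widehat{\Pr}(Y > 0 \mid x),
\quad
\hat{\mu}(x) = \hat{E}[Y \mid Y > 0,\, x]
$$

This is just the identity E[Y] = Pr(Y > 0) E[Y | Y > 0] + Pr(Y = 0) * 0 with the fitted values plugged in, so it relies on both predictions being on the response scale rather than the link scale.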
See the plots below, which demonstrate my method in its current form.
Assuming I’m right so far, how would I then go about generating an SE / CI for predictions that combine the two models?
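One candidate I am aware of is a first-order delta-method approximation for the variance of a product. It assumes the two fitted models can be treated as independent, which I have not verified for my data, and it uses my own notation (p-hat for the predicted probability, mu-hat for the predicted non-zero mean, both on the response scale):

$$
\widehat{\mathrm{Var}}\!\left(\hat{p}\,\hat{\mu}\right)
\approx
\hat{\mu}^{2}\,\widehat{\mathrm{Var}}(\hat{p})
+ \hat{p}^{2}\,\widehat{\mathrm{Var}}(\hat{\mu})
$$

The square root of this would give an approximate SE for the combined prediction, but I am unsure whether this is appropriate here or whether something like a bootstrap over transects would be safer.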