I’m coinvestigator on a clinical trial where the we are studying the effect of an intervention. The study is powered to detect a clinically meaningful change for some primary endpoint. My questions are around the planned secondary endpoints.
There are two secondary endpoints: 1) whether or not individuals randomized in the trial were prescribed medications (in this case statins) and 2) among those who were prescribed medications, whether or not individuals adhered to those medications. The first secondary endpoint is the main one and we have decided to use a regression model to estimate the intervention effect. The latter secondary endpoint hinges on the results of the first one. The PI wants to look at whether the intervention had an effect on statin adherence among those who were prescribed. Specifically, he wants to use a regression model to do this.
My concerns are 1) by subsetting to this group, we no longer have the randomization with respect to the intervention, so that the “effect of intervention on adherence” is meaningless, and 2) in line with the first point, the underlying mechanisms may vary between the main secondary outcome and the adherence outcome. In other words, we may have many unmeasured factors. My suggestion was to keep the analysis descriptive: tabulate the binary adherence outcome by the binary intervention, and restricting the analysis to those who were prescribed medications. However, a reviewer (nonstatistician) insists that we can still use regression.
Are there things that I can add in order to make a convincing case for keeping it descriptive?
Answer

You are right that the specified analysis is among a nonrandomized set. You are wrong however that the intervention is meaningless because we restrict analyses to this nonrandomized set. One does, however, need to adjust for possible confounders. In fact, this analysis does not “depend” on the significance of the first secondary hypothesis as one might suppose. Given statins are the most widely prescribed medication (period), I’m assuming the issue doesn’t concern whether any people start statins, but it rather has to do with the failure of the intervention to delay the time at which one starts statin medications. The failure to declare statistical significance of the first secondary endpoint simply means you did not have enough to power to detect a change if there was one. So the second secondary hypothesis needs to consider that intervention might affect compliance regardless. Identifying the right confounds to adjust for in an analysis is a serious statistical hurdle that needs careful deliberation in the protocol (via an amendment) or the SAP (if database lock has not been achieved yet).

You’re absolutely right that intervention may have opposite effects in the risk of starting a medication versus the risk of adhering to that medication. I can think of numerous examples. Consider fluoridation of water: it offsets time to dental carries, but it does nothing to improve dental hygiene so gum recess, gingivitis, etc. so the major surgical dental adverse events may not be affected at all. In other words: this is why we need a complete set of secondary endpoints, otherwise we are doing salami science.
Attribution
Source : Link , Question Author : stats134711 , Answer Author : AdamO