I am fitting a Cox proportional hazards regression in R using coxph, with many variables in the model. The martingale residuals look fine, and the Schoenfeld residuals look fine for ALMOST all of the variables. Three variables have Schoenfeld residuals that are not flat, and given the nature of those variables it is plausible that their effects vary with time.
These are variables I’m not really interested in, so making them strata would be fine. However, all of them are numeric rather than categorical, so stratification does not look like a viable route*. I have tried building interactions between the variables and time, as described here, but I get the warning:
In fitter(X, Y, strats, offset, init, control, weights = weights, : Ran out of iterations and did not converge
I’m working with nearly 1000 data points and half a dozen variables with many levels each, so it feels like I’m pushing the limits of how this data can be sliced and diced. Unfortunately, all the simpler models I’ve tried with fewer variables are clearly worse (e.g. the Schoenfeld residuals look bad for more of the variables).
What are my options? Since I don’t care about these particular poorly-behaved variables, I’d like to just ignore their output, but I suspect that isn’t a valid interpretation!
*One is continuous, one is an integer with a range of over 100, and one is an integer with a range of 6. Perhaps binning?
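For reference, binning a continuous variable and then stratifying on the bins could look like the sketch below. It uses the veteran dataset that ships with the survival package as a stand-in for the real data; the variable karno and the cut points are illustrative assumptions, not a recommendation for any particular dataset.

```r
library(survival)

# Stand-in data: the `veteran` lung cancer dataset bundled with `survival`.
vet <- veteran

# Bin a continuous covariate into three groups (cut points chosen arbitrarily
# for illustration; karno ranges from 10 to 99 in this dataset).
vet$karno_grp <- cut(vet$karno, breaks = c(0, 40, 70, 100))

# Stratify on the binned variable: each bin gets its own baseline hazard,
# and no coefficient is estimated for karno at all.
fit_strat <- coxph(Surv(time, status) ~ trt + prior + strata(karno_grp),
                   data = vet)
print(fit_strat)
```

The trade-off is that stratifying gives no estimate for the binned variable and the result depends on where the cut points are placed.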
The most elegant way would be to use a parametric survival model (Gompertz, Weibull, Exponential, …) if you have some idea what the baseline hazard might look like.
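As a rough sketch of what that could look like, survreg from the survival package fits several parametric families. The example uses the bundled veteran dataset rather than your data, and the choice of covariates is only an assumption for illustration. Note that survreg does not offer a Gompertz distribution (the flexsurv package is one option for that), and that exponential and Weibull models still imply proportional hazards, whereas the lognormal and log-logistic models do not.

```r
library(survival)

# Same right-hand side for every model so the fits are comparable.
form <- Surv(time, status) ~ trt + prior + karno

# Weibull and exponential are proportional hazards models;
# lognormal is an accelerated failure time model that is not.
fit_wb   <- survreg(form, data = veteran, dist = "weibull")
fit_exp  <- survreg(form, data = veteran, dist = "exponential")
fit_lnrm <- survreg(form, data = veteran, dist = "lognormal")

# Compare the candidate distributions by AIC (lower is better).
aics <- c(weibull     = AIC(fit_wb),
          exponential = AIC(fit_exp),
          lognormal   = AIC(fit_lnrm))
print(aics)
```

Keep in mind that survreg reports coefficients on the log-time (accelerated failure time) scale, so their signs are reversed relative to log hazard ratios from coxph.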
If you want to stay with your Cox model, you can use an extended Cox model with time-dependent coefficients. Bear in mind that there are also extended Cox models with time-dependent covariates; these do not solve your problem!
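A minimal sketch of a time-dependent coefficient in coxph uses the tt() mechanism, following the pattern in the survival package's vignette on time-dependent covariates. The veteran dataset, the covariate karno, and the functional form log(t + 20) are all illustrative assumptions; a suitable form for your variables would have to be judged from your own Schoenfeld residual plots.

```r
library(survival)

# Ordinary Cox fit plus the proportional hazards check.
fit0 <- coxph(Surv(time, status) ~ trt + prior + karno, data = veteran)
print(cox.zph(fit0))

# Time-dependent coefficient for karno:
#   beta(t) = b1 + b2 * log(t + 20)
# `karno` carries b1 and `tt(karno)` carries b2. The tt function is
# evaluated at each event time for everyone still at risk.
fit_tt <- coxph(Surv(time, status) ~ trt + prior + karno + tt(karno),
                data = veteran,
                tt = function(x, t, ...) x * log(t + 20))
print(fit_tt)

# Note: writing karno:time (or karno * time) directly in the formula is NOT
# equivalent, because it uses each subject's final follow-up time rather
# than the time at which each risk set is evaluated.
```

Using a slowly growing function such as log(t) rather than raw t also keeps the interaction term on a modest scale, which may help with convergence.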