I know that traditional statistical models such as Cox proportional hazards regression and Kaplan-Meier models can be used to predict the number of days until the next occurrence of an event (say, a failure), i.e. survival analysis.

Questions

- How can regression versions of machine learning models such as GBMs and neural networks be used to predict the number of days until an event occurs?
- I believe that simply using days until occurrence as the target variable and running a regression model will not work. Why won't it work, and how can it be fixed?
- Can we convert the survival analysis problem into a classification problem and then obtain survival probabilities? If so, how should the binary target variable be created?
- What are the pros and cons of the machine learning approach compared with Cox proportional hazards regression, Kaplan-Meier models, etc.?

Imagine the sample input data is in the format below.

Note:

- The sensor sends data at 10-minute intervals, but data can occasionally be missing because of network issues etc., as represented by the row with NA.
- var1, var2 and var3 are the predictors (explanatory variables).
- failure_flag indicates whether the machine failed or not.
- We have the last 6 months of data at 10-minute intervals for each of the machine ids.

EDIT:

The expected output prediction should be in the format below.

Note: I want to predict the probability of failure for each machine over the next 30 days, at a daily level.

**Answer**

For the case of neural networks, this is a promising approach: WTTE-RNN – Less hacky churn prediction.

The essence of this method is to use a Recurrent Neural Network to predict parameters of a Weibull distribution at each time-step and optimize the network using a loss function that takes censoring into account.
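To make the censoring-aware loss concrete, here is a minimal NumPy sketch of a right-censored Weibull negative log-likelihood, plus a helper that turns predicted parameters into daily failure probabilities. It is not the author's implementation: the function names, the parameter names `alpha` (scale) and `beta` (shape), and the use of the continuous rather than a discretized Weibull are assumptions made for illustration. In the actual method the parameters would come from the RNN output at each time step.

```python
import numpy as np


def weibull_censored_nll(t, observed, alpha, beta, eps=1e-9):
    """Mean negative log-likelihood of a continuous Weibull with right censoring.

    t        : time to event, or time to end of observation if censored
    observed : 1 if the event was seen, 0 if the record is censored
    alpha    : Weibull scale (> 0); hypothetical name for a model output
    beta     : Weibull shape (> 0); hypothetical name for a model output
    """
    t = np.asarray(t, dtype=float) + eps  # avoid log(0)
    observed = np.asarray(observed, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)

    log_ratio = np.log(t) - np.log(alpha)
    cum_hazard = np.exp(beta * log_ratio)  # (t / alpha) ** beta
    log_hazard = np.log(beta) - np.log(alpha) + (beta - 1.0) * log_ratio

    # Observed events contribute log f(t) = log h(t) - H(t);
    # censored records contribute only log S(t) = -H(t).
    loglik = observed * log_hazard - cum_hazard
    return -np.mean(loglik)


def failure_probability_within(days, alpha, beta):
    """P(T <= days) = 1 - exp(-(days / alpha) ** beta) for a Weibull."""
    days = np.asarray(days, dtype=float)
    return 1.0 - np.exp(-(days / alpha) ** beta)


# Example: cumulative failure probability for each of the next 30 days,
# given illustrative (made-up) parameters for one machine.
daily_probs = failure_probability_within(np.arange(1, 31), alpha=20.0, beta=1.5)
```

Censored records (machines that have not failed by the end of the observation window) still contribute to the loss through the survival term, which is what a plain regression on days-to-failure cannot do.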

The author has also released his implementation on GitHub.

**Attribution**

*Source: Link, Question Author: GeorgeOfTheRF, Answer Author: liori*