# Is the logit function always the best for regression modeling of binary data?

Is the logit link, whose inverse is the S-shaped logistic curve, always the best choice for modeling binary data? You might have reason to believe your data do not follow the usual S-shaped curve but a different type of curve with range $(0,1)$.

Is there any research into this? You could model it with a probit function or something similar, but what if the true curve is something else entirely? Could choosing a different curve lead to better estimation of the effects?

As Macro alluded to in his comment on your question, one common choice is a probit model, which uses the quantile function of a Gaussian as the link in place of the logit (equivalently, the Gaussian CDF in place of the logistic function). I’ve also heard good things about using the quantile function of a Student’s $t$ distribution, although I’ve never tried it.

They all have the same basic S-shape, but they differ in how quickly they saturate at each end. Probit models approach 0 and 1 very quickly, which can be dangerous if the true probabilities tend to be less extreme. $t$-based models can go either way, depending on how many degrees of freedom the $t$ distribution has. Andrew Gelman says (in a mostly unrelated context) that $t_7$ is roughly like the logistic curve. Lowering the degrees of freedom gives you fatter tails, so the fitted probabilities stay at intermediate values over a broader range of the linear predictor. When the degrees of freedom go to infinity, you’re back to the probit model.
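To make the saturation comparison concrete, here is a minimal sketch that evaluates each curve at a few values of the linear predictor. The function name `inverse_links` and the choice to rescale every distribution to unit variance are my own assumptions, made only so the curves are comparable; in an actual regression that scale is absorbed into the coefficients.

```python
import numpy as np
from scipy.special import expit
from scipy.stats import norm, t


def inverse_links(eta, dfs=(3, 7)):
    """Evaluate S-shaped curves (CDFs) at linear-predictor values eta.

    Every distribution is rescaled to unit variance (an assumption made
    here for comparability), so each df must be greater than 2.
    """
    eta = np.asarray(eta, dtype=float)
    curves = {
        # Standard normal CDF: the inverse of the probit link.
        "probit": norm.cdf(eta),
        # Logistic CDF with scale sqrt(3)/pi, which has variance 1.
        "logistic": expit(eta * np.pi / np.sqrt(3.0)),
    }
    for df in dfs:
        # Student t CDF with scale sqrt((df - 2)/df), which has variance 1.
        curves[f"t_{df}"] = t.cdf(eta * np.sqrt(df / (df - 2.0)), df)
    return curves


if __name__ == "__main__":
    eta = np.array([-4.0, -2.0, 0.0, 2.0, 4.0])
    for name, p in inverse_links(eta).items():
        print(f"{name:>8}: " + "  ".join(f"{v:.5f}" for v in p))
```

Under this scaling, the probit curve should give the smallest probability at the far left and the low-df $t$ curve the largest. Passing a very large value in `dfs` should reproduce the probit row almost exactly.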