# What is the difference between decision_function, predict_proba, and predict for logistic regression?

I have been going through the sklearn documentation, but I am not able to understand the purpose of these functions in the context of logistic regression.
For decision_function, it says that it's the distance between the hyperplane and the test instance. How is this particular information useful, and how does it relate to the predict and predict_proba methods?

Recall that the functional form of logistic regression is

$$f(x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)}}$$

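This is the probability of the positive class, and it is what predict_proba returns. For a binary problem the output has two columns: the first holds $1 - f(x)$ and the second holds $f(x)$.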

The term inside the exponential

$$d(x) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$$

is what is returned by decision_function. The “hyperplane” referred to in the documentation is

$$\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k = 0$$

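This terminology is a holdover from support vector machines, which literally estimate a separating hyperplane. For logistic regression this hyperplane is a bit of an artificial construct: it is the plane of equal probability, where the model has determined both target classes are equally likely. Strictly speaking, $d(x)$ is a signed score proportional to the distance from that plane (dividing by the norm of $(\beta_1, \ldots, \beta_k)$ gives the geometric distance), and it is also the log-odds of the positive class.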
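To make the relationship between the two outputs concrete, here is a minimal sketch on synthetic data. The dataset, its size, and the variable names are illustrative assumptions, and it relies on the default binary behaviour of LogisticRegression, where the second column of predict_proba is the sigmoid of decision_function.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary data; sizes and seed are arbitrary choices for illustration.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
model = LogisticRegression().fit(X, y)

# d(x) = beta_0 + beta_1 x_1 + ... + beta_k x_k, computed by hand
d_manual = X @ model.coef_.ravel() + model.intercept_[0]
d = model.decision_function(X)

# f(x) = 1 / (1 + exp(-d(x))) should match the second column of predict_proba
f_manual = 1.0 / (1.0 + np.exp(-d))
proba = model.predict_proba(X)

scores_match = np.allclose(d, d_manual)
probs_match = np.allclose(proba[:, 1], f_manual)

# Dividing by the coefficient norm turns the score into a geometric signed distance
signed_distance = d / np.linalg.norm(model.coef_)

print(scores_match, probs_match)
```

Both checks should come out true for a default binary fit, which is one way to convince yourself that decision_function is just the linear term before the sigmoid is applied.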

The predict function returns a class decision, assigning the positive class when

$$f(x) > 0.5$$

which is equivalent to $d(x) > 0$.
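The decision rule can be checked the same way. The sketch below again uses made-up synthetic data, and the alternative threshold of 0.3 is a hypothetical value chosen only to show that working from the probabilities lets you pick the cutoff yourself.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary data; sizes and seed are arbitrary choices for illustration.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
model = LogisticRegression().fit(X, y)

d = model.decision_function(X)      # linear score d(x)
p = model.predict_proba(X)[:, 1]    # probability of the positive class, f(x)
labels = model.predict(X)           # hard class decisions

# Reproduce predict from the score and from the probability
from_scores = model.classes_[(d > 0).astype(int)]
from_probs = model.classes_[(p > 0.5).astype(int)]
rule_matches_scores = np.array_equal(labels, from_scores)
rule_matches_probs = np.array_equal(labels, from_probs)

# Hypothetical custom cutoff: flag anything with at least a 30% probability
threshold = 0.3
custom_labels = model.classes_[(p >= threshold).astype(int)]

print(rule_matches_scores, rule_matches_probs)
```

Lowering the cutoff can only add positive predictions, so custom_labels flags at least as many instances as predict does.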

At the risk of soapboxing, the predict function has very few legitimate uses, and I view using it as a sign of error when reviewing others' work. I would go so far as to call it a design error in sklearn itself (the predict_proba function should have been called predict, and predict should have been called predict_class, if anything at all).