Using the generalized method of moments (GMM) to calculate logistic regression parameters

I want to calculate the coefficients of a regression that is very similar to logistic regression (actually, logistic regression with an additional coefficient:
$$ \frac{A}{1 + e^{- (b_0 + b_1 x_1 + b_2 x_2 + \ldots)}},$$
where $A$ may be given).
I thought of using GMM to calculate the coefficients, but I'm not sure what moment conditions I should use.

Can anyone help me with that?

Thanks!

Answer

Assuming $A\leq 1$, this model has a Bernoulli response variable $Y_i$ with

$$
\Pr(Y_i = 1) = \frac{A}{1+e^{-X_i'b}},
$$

where $b$ (and possibly $A$, depending on whether it is treated as a constant or a parameter) are the fitted coefficients and $X_i$ is the data for observation $i$. I assume the intercept term is handled by adding a variable with constant value 1 to the data matrix.

The moment conditions are:

\begin{align*}
\mathbb{E}\bigg[\bigg(Y_i-\frac{A}{1+e^{-X_i'b}}\bigg)X_i\bigg] &= 0.
\end{align*}
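These are just the statement that the residual is mean-independent of the covariates: since the model says $\mathbb{E}[Y_i \mid X_i] = \frac{A}{1+e^{-X_i'b}}$, the law of iterated expectations gives

$$
\mathbb{E}\bigg[\bigg(Y_i-\frac{A}{1+e^{-X_i'b}}\bigg)X_i\bigg]
= \mathbb{E}\bigg[\mathbb{E}\Big[Y_i-\frac{A}{1+e^{-X_i'b}}\,\Big|\,X_i\Big]\,X_i\bigg] = 0.
$$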

We replace this with the sample counterpart of the condition, assuming $N$ observations:

$$
m = \frac{1}{N}\sum_{i=1}^N \bigg[\bigg(Y_i-\frac{A}{1+e^{-X_i'b}}\bigg)X_i\bigg] = 0.
$$

In practice this is solved by minimizing $m'm$ over the coefficient values $b$ (below we will use the Nelder-Mead simplex to perform this optimization).
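One detail worth noting: there are as many moment conditions as coefficients (one per column of $X_i$), so the system is exactly identified. In that case the choice of GMM weighting matrix $W$ in the general criterion

$$
\hat b = \arg\min_b\; m(b)'\,W\,m(b)
$$

does not affect the solution, and the minimized objective should be essentially zero; taking $W = I$ gives the plain $m'm$ criterion used here.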

Borrowing from an excellent R-bloggers tutorial on the topic, it is pretty straightforward to implement this in R with the gmm package. As an example, let’s work with the iris dataset, predicting if an iris is versicolor based on its sepal length and width and petal length and width. I’ll assume $A$ is constant and equal to 1 in this case:

dat <- as.matrix(cbind(data.frame(IsVersicolor = as.numeric(iris$Species == "versicolor"), Intercept=1), iris[,1:4]))
head(dat)
#      IsVersicolor Intercept Sepal.Length Sepal.Width Petal.Length Petal.Width
# [1,]            0         1          5.1         3.5          1.4         0.2
# [2,]            0         1          4.9         3.0          1.4         0.2
# [3,]            0         1          4.7         3.2          1.3         0.2
# [4,]            0         1          4.6         3.1          1.5         0.2
# [5,]            0         1          5.0         3.6          1.4         0.2
# [6,]            0         1          5.4         3.9          1.7         0.4

Here are the coefficients fitted using logistic regression:

summary(glm(IsVersicolor~., data=as.data.frame(dat[,-2]), family="binomial"))
# Coefficients:
#              Estimate Std. Error z value Pr(>|z|)    
# (Intercept)    7.3785     2.4993   2.952 0.003155 ** 
# Sepal.Length  -0.2454     0.6496  -0.378 0.705634    
# Sepal.Width   -2.7966     0.7835  -3.569 0.000358 ***
# Petal.Length   1.3136     0.6838   1.921 0.054713 .  
# Petal.Width   -2.7783     1.1731  -2.368 0.017868 *  

The main piece we need in order to use gmm is a function that returns the per-observation moment conditions, i.e. the rows $(Y_i-\frac{A}{1+e^{-X_i'b}})X_i$ for each observation $i$:

moments <- function(b, X) {
  A <- 1  # A treated as a known constant here
  # X[,1] is the response (IsVersicolor); X[,-1] is the design matrix, including the intercept.
  # Each row of the result is (Y_i - A/(1 + exp(-X_i'b))) * X_i for one observation.
  as.vector(X[,1] - A / (1 + exp(-(X[,-1] %*% cbind(b))))) * X[,-1]
}
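As a quick sanity check on this function (using only the objects defined above): when $A = 1$ these moment conditions are exactly the score equations of ordinary logistic regression, so evaluating the sample moments at the glm coefficients fitted earlier should give values that are essentially zero:

glm.coef <- coef(glm(IsVersicolor ~ ., data = as.data.frame(dat[,-2]), family = "binomial"))
colMeans(moments(glm.coef, dat))  # each entry should be ~0, up to numerical noise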

We can now numerically fit coefficients $b$, using the linear regression coefficients as a convenient initial point (as suggested in the tutorial linked above):

init.coef <- lm(IsVersicolor~., data=as.data.frame(dat[,-2]))$coefficients
library(gmm)
fitted <- gmm(moments, x = dat, t0 = init.coef, type = "iterative", crit = 1e-19,
              wmatrix = "optimal", method = "Nelder-Mead",
              control = list(reltol = 1e-19, maxit = 20000))
fitted
#  (Intercept)  Sepal.Length   Sepal.Width  Petal.Length   Petal.Width  
#      7.37849      -0.24536      -2.79657       1.31364      -2.77834  
# 
# Convergence code =  0 

The convergence code of 0 indicates the procedure converged, and the parameters are identical to those returned by logistic regression.
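If you also need standard errors for the GMM estimates, the gmm package provides a summary method for the fitted object, which should report the coefficient table with GMM standard errors:

summary(fitted)  # coefficient estimates with GMM standard errors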

A quick look at the gmm package source (the functions momentEstim.baseGmm.iterative and gmm:::.obj1 for the parameters provided) shows that it is indeed minimizing $m'm$ as described above. The following equivalent code calls the R optim function directly, performing the same optimization we achieved above with the call to gmm:

gmm.objective <- function(theta, x, momentFun) {
  avg.moment <- colMeans(momentFun(theta, x))
  sum(avg.moment^2)
}
optim(init.coef, gmm.objective, x=dat, momentFun=moments,
      control = list(reltol = 1e-19, maxit = 20000))$par
#  (Intercept) Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
#    7.3784866   -0.2453567   -2.7965681    1.3136433   -2.7783439 
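As a final check (using the standard coef() extractor on the gmm fit), the minimized criterion at the fitted coefficients should be essentially zero, consistent with the exactly identified moment conditions:

gmm.objective(coef(fitted), dat, moments)  # essentially zero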

Attribution
Source: Link, Question Author: user5497, Answer Author: josliber
