The logistic model (or logit model) belongs to the generalized linear models family (GLM). It is widely used in regression analysis to model a binary dependent variable.

Since GLMs are commonly used R has already built-in functionality to estimate GLMs. Specifically the glm function from the stats package, with a binomial family and logit link can be used to estimate a logistic model.

The following logistic regression example is from Brian S. Everitt and Torsten Hothorn (2017).

options(width = 10000)
library(HSAUR)
## Loading required package: tools
m <- glm(ESR ~ fibrinogen, data = plasma, family = binomial(link = "logit"))
coef(m)
## (Intercept)  fibrinogen
##   -6.845075    1.827081

Making use of maximum likelihood estimation the logistic regression model can also be estimated in ROI. Here either a conic solver or a general purpose solver can be used. The conic solvers have the advantages that they are designed to find the global optimum and are (typically) faster.

# Log-likelihood

The log-likelihood $\underset{\beta}{\text{maximize}} \sum_{i = 1}^n ~ y_i ~ \log \left(\frac{\exp(X_{i*} \beta)}{1 + \exp(X_{i*} \beta)} \right) + \sum_{i = 1}^n ~ (1 - y_i) ~ \log \left(1 - \frac{\exp(X_{i*} \beta)}{1 + \exp(X_{i*} \beta)} \right)$ can be simplified to $\underset{\beta}{\text{maximize}} ~ \sum_{i = 1}^n y_i ~ X_{i*} \beta - \sum_{i = 1}^n \log(1 + \exp(X_{i*} \beta)).$

# Estimation

Sys.setenv(ROI_LOAD_PLUGINS = FALSE)
library(ROI)
library(ROI.plugin.optimx)
## Loading required package: minqa
library(ROI.plugin.ecos)
X <- cbind(Intercept = 1, fibrinogen = plasma$fibrinogen) y <- as.integer(plasma$ESR) - 1L

# General purpose solver

mle <- function(beta) {
drop(y %*% X %*% beta - sum(log(1 + exp(X %*% beta))))
}

op_gps <- OP(F_objective(mle, n =  ncol(X)), maximum = TRUE,
bounds = V_bound(ld = -Inf, nobj = ncol(X)))
s1 <- ROI_solve(op_gps, "optimx", start = double(ncol(X)))
## Maximizing -- use negfn and neggr
## Warning in max(logpar): no non-missing arguments to max; returning -Inf
## Warning in min(logpar): no non-missing arguments to min; returning Inf
solution(s1)
##  -6.845075  1.827081

# Conic solver

For conic optimization we need to build the matrices necessary to express the objective function.

library(slam)
logistic_regression <- function(y, X, solver = "ecos", ...) {
stm <- simple_triplet_matrix
stzm <- simple_triplet_zero_matrix
stopifnot(is.vector(y), length(y) == nrow(X))
m <- nrow(X); n <- ncol(X)
i <- 3 * seq_len(m) - 2
op <- OP(c(-(y %*% X), rep.int(1, m), double(m)), maximum = FALSE)
C11 <- stm(rep(i, n), rep(seq_len(n), each = m), -drop(X), 3 * m, n)
C12 <- stm(i, seq_len(m), rep.int(1, m), 3 * m, m)
C13 <- stm(i + 2, seq_len(m), rep.int(-1, m), 3 * m, m)
C1 <- cbind(C11, C12, C13)
C2 <- cbind(stzm(3 * m, n), C12, -C13)
C <- rbind(C1, C2)
cones <- K_expp(2 * m)
rhs <- c(rep(c(0, 1, 1), m), rep(c(0, 1, 0), m))
constraints(op) <- C_constraint(C, cones, rhs)
bounds(op) <- V_bound(ld = -Inf, nobj = ncol(C))
ROI_solve(op, solver = solver, ...)
}

s2 <- logistic_regression(y, X)
head(solution(s2), ncol(X))
##  -6.845075  1.827081