To model count data and contingency tables often Poisson regression is used. Poisson regression models belong to the generalized linear models family (GLM).
Since GLMs are commonly used R has already built-in functionality to estimate GLMs. Specifically the glm
function from the stats package, withpoisson family and log link can be used to estimate a Poisson model.
The following poisson regression example is from the glm
manual page and based on Dobson (1990).
options(width = 10000)
c(18, 17, 15, 20, 10, 20, 25, 13, 12)
counts <- gl(3, 1, 9)
outcome <- gl(3, 3)
treatment <- glm(counts ~ outcome + treatment, family = poisson(link = "log"))
glm.D93 <-round(coef(glm.D93), 4)
## (Intercept) outcome2 outcome3 treatment2 treatment3
## 3.0445 -0.4543 -0.2930 0.0000 0.0000
Making use of maximum likelihood estimation the logistic regression model can also be estimated in ROI. Here either a conic solver or a general purpose solver can be used. The conic solvers have the advantages that they are specifically designed to find the global optimum and are (often) faster.
The maximum likelihood estiamte can be obtained be solving the following optimzation problem. \[ \begin{equation} \underset{\beta}{\text{maximize}} ~~ \sum_{i = 1}^n y_i ~ log(\lambda_i) - \lambda_i ~~ \text{where} ~~ \lambda_i = exp(X_{i*} \beta) \end{equation} \]
Sys.setenv(ROI_LOAD_PLUGINS = FALSE)
library(ROI)
library(ROI.plugin.nloptr)
library(ROI.plugin.ecos)
model.matrix(glm.D93)
X <- counts y <-
function(beta) {
log_likelihood <- drop(X %*% beta)
xb <-sum(y * xb - exp(xb))
}
OP(F_objective(log_likelihood, n = ncol(X)), maximum = TRUE,
op_gps <-bounds = V_bound(ld = -Inf, nobj = ncol(X)))
ROI_solve(op_gps, "nloptr.lbfgs", start = rnorm(ncol(X)))
s1 <-round(solution(s1), 4)
## [1] 3.0445 -0.4543 -0.2930 0.0000 0.0000
This problem can also be estimated by making use of conic optimization.
library(slam)
function(y, X) {
poisson_regression <- nrow(X); n <- ncol(X)
m <- 3 * seq_len(m) - 2
i <- OP(c(-(y %*% X), rep.int(1, m)))
op <- simple_triplet_matrix
stm <- cbind(stm(rep(i, n), rep(seq_len(n), each = m), -drop(X), 3 * m, n),
A <-stm(i + 2, seq_len(m), rep.int(-1, m), 3 * m, m))
rep(c(0, 1, 0), m)
rhs <- K_expp(m)
cones <-constraints(op) <- C_constraint(A, cones, rhs)
bounds(op) <- V_bound(ld = -Inf, nobj = ncol(A))
op
}
poisson_regression(y, X)
op <- ROI_solve(op, solver = "ecos")
s <-round(head(solution(s), n = NCOL(X)), 4)
## [1] 3.0445 -0.4543 -0.2930 0.0000 0.0000