The least absolute deviation (LAD) — also known as L1 regression — is a statistical optimization technique attempting to find a function f(xi) which closely approximates a data set of the points (xi, yi) with i=1,…,n. It minimizes the sum of absolute errors between points generated by the function (ˆyi) and the corresponding data points. minimize n∑i|yi−ˆyi| As such can be classified as a non-linear optimization problem.
However, following, e.g., Brooks and Dula (2013) the objective function can be expressed as
minimizeβ0,β,e+,e− n∑i=1e+i+e−i subject to β0+β⊤xi+e+i−e−i=0 i=1,…,n βj=−1 e+i,e−i≥0 i=1,…,n given a set of points xi∈Rm, i=1,…,n and the jth column representing the dependent variable y. In comparison to the original formulation we differentiate between positive and negative deviations/errors (e+i and e−i, respectively) to get to the linear programming problem shown above.
The linear programming formulation of the L1 regression problem as shown above can be constructed using ROI methods via the following R function.
function(x, j) {
create_L1_problem <- ncol(x) + 1L
m <- 2 * nrow(x)
n <- double(m + n)
beta <-+ 1] <- 1
beta[j OP(objective = L_objective(c(rep(0, m), rep(1, n))),
constraints = rbind(
L_constraint(L = cbind(1, x, diag(nrow(x)), -diag(nrow(x))),
dir = eq(nrow(x)), rhs = rep(0, nrow(x))),
L_constraint(L = beta, dir = "==", rhs = -1)),
bounds = V_bound(li = seq_len(m), lb = rep(-Inf, m),
nobj = length(beta)))
}
With ROI one can solve e.g., Brownlee’s stack loss plant data example from the stats package using the above OP and the solver GLPK as follows.
Sys.setenv(ROI_LOAD_PLUGINS = FALSE)
library(ROI)
library(ROI.plugin.glpk)
data(stackloss)
create_L1_problem(as.matrix(stackloss), 4)
l1p <- ROI_solve(l1p, solver = "glpk") L1_res <-
The first value corresponds to the intercept and the others to the model coefficients.
For comparison purposes we solve the same problem with R package quantreg.
library("quantreg")
rq(stack.loss ~ stack.x, 0.5) fm <-
ncol(stackloss)
n <-cbind(ROI = head(solution(L1_res), n),
quantreg = coef(fm))
## ROI quantreg
## (Intercept) -39.68985507 -39.68985507
## stack.xAir.Flow 0.83188406 0.83188406
## stack.xWater.Temp 0.57391304 0.57391304
## stack.xAcid.Conc. -0.06086957 -0.06086957
doi:10.1016/j.aml.2012.03.031