
@(@ \newcommand{\R}[1]{{\rm #1}} @)@@(@ \newcommand{\B}[1]{{\bf #1}} @)@
L2 Objective with L1 Regularization

[x_out, info] = l1_quadratic(max_itr, A, b, lambda, delta)


We use @(@ 1_n \in \B{R}^n @)@ to denote the vector with all elements equal to one. Given a vector @(@ w \in \B{R}^n @)@ we use @(@ D(w) \in \B{R}^{n \times n} @)@ to denote the corresponding diagonal matrix.

Purpose

We are given a positive semi-definite matrix @(@ A \in \B{R}^{n \times n} @)@, a vector @(@ b \in \B{R}^n @)@, and a scalar @(@ \lambda \in \B{R}_+ @)@. The problem we wish to solve is @[@ \begin{array}{lll} \R{minimize} & \frac{1}{2} x^\R{T} A x + b^\R{T} x + \lambda \| x \|_1 & \R{w.r.t} \; x \in \B{R}^n \end{array} @]@ The @(@ \delta @)@ relaxed first order conditions for a minimum are: for @(@ i = 1 , \ldots , n @)@ @[@ \begin{array}{l} \R{if} \; x_i > \delta \; , \; \delta \geq \left| \sum_{j=1}^n A_{i,j} x_j + b_i + \lambda \right| \\ \R{if} \; x_i < - \delta \; , \; \delta \geq \left| \sum_{j=1}^n A_{i,j} x_j + b_i - \lambda \right| \\ \R{if} \; | x_i | \leq \delta \; , \; \lambda + \delta \geq \left| \sum_{j=1}^n A_{i,j} x_j + b_i \right| \end{array} @]@
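The relaxed conditions above can be checked directly for a candidate point. The sketch below is plain Python rather than MATLAB (it is not part of this package; the function name and the small test problem are made up for illustration): given A, b, lambda, delta, and a candidate x, it reports whether every component satisfies its relaxed condition.

```python
def check_relaxed_conditions(A, b, lam, delta, x):
    """Check the delta-relaxed first order conditions componentwise.

    A is an n x n list of lists, b and x are length-n lists,
    lam (lambda) >= 0 and delta > 0 are scalars.
    """
    n = len(x)
    ok = True
    for i in range(n):
        # gradient of the smooth part: (A x + b)_i
        g_i = sum(A[i][j] * x[j] for j in range(n)) + b[i]
        if x[i] > delta:
            ok = ok and abs(g_i + lam) <= delta
        elif x[i] < -delta:
            ok = ok and abs(g_i - lam) <= delta
        else:  # |x_i| <= delta
            ok = ok and abs(g_i) <= lam + delta
    return ok

# Hypothetical example: A = identity, b = (-3, 0), lambda = 1.
# The unrelaxed minimizer of (1/2) x'x + b'x + lambda * ||x||_1
# is x = (2, 0), so the relaxed conditions hold there.
A = [[1.0, 0.0], [0.0, 1.0]]
b = [-3.0, 0.0]
print(check_relaxed_conditions(A, b, 1.0, 1e-8, [2.0, 0.0]))  # True
```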

The integer scalar max_itr specifies the maximum number of iterations of the algorithm to execute. It must be greater than or equal to zero. Note that if it is zero, the first row of the info return value will still be computed. This can be useful for deciding what is a good value for the argument delta .

The argument A is an @(@ n \times n @)@ positive semi-definite real matrix (see Purpose ).

The argument b is a real column vector of length @(@ n @)@.

The argument lambda is a real scalar. It corresponds to @(@ \lambda @)@ in the discussion above (see Purpose ).

The argument delta is a real scalar greater than zero. It corresponds to @(@ \delta @)@ in the discussion above.

The result x_out contains the approximate solution of the optimization problem at the last iteration; the corresponding first order condition measure is the value info(end, 1) described below.

The result info is a matrix with each row corresponding to an iteration of the algorithm; the row info(k, :) corresponds to iteration @(@ k-1 @)@ of the algorithm. The routine l1_quadratic has satisfied the convergence condition if and only if
info(end, 1) <= delta

First Order Condition
We use @(@ x^k @)@ to denote the value of @(@ x @)@ at iteration @(@ k @)@ of the algorithm. The value info(k+1, 1) is defined by @[@ info(k+1, 1) = \max_i \left\{ \begin{array}{ll} \left| \sum_{j=1}^n A_{i,j} x_j^k + b_i + \lambda \right| & \R{if} \; x_i^k > \delta \\ \left| \sum_{j=1}^n A_{i,j} x_j^k + b_i - \lambda \right| & \R{if} \; x_i^k < - \delta \\ \max \left[ 0 , \left| \sum_{j=1}^n A_{i,j} x_j^k + b_i \right| - \lambda \right] & \R{if} \; | x_i^k | \leq \delta \end{array} \right\} @]@
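For illustration, this scalar measure can be computed componentwise exactly as displayed; it is zero precisely when the unrelaxed first order conditions hold at @(@ x^k @)@. The following is a plain Python sketch (not the package's MATLAB code; the function name and test data are hypothetical):

```python
def first_order_measure(A, b, lam, delta, x):
    """Maximum over i of the residuals defining info(k+1, 1)."""
    n = len(x)
    vals = []
    for i in range(n):
        # gradient of the smooth part: (A x + b)_i
        g_i = sum(A[i][j] * x[j] for j in range(n)) + b[i]
        if x[i] > delta:
            vals.append(abs(g_i + lam))
        elif x[i] < -delta:
            vals.append(abs(g_i - lam))
        else:  # |x_i| <= delta
            vals.append(max(0.0, abs(g_i) - lam))
    return max(vals)

# Hypothetical example: A = identity, b = (-3, 0), lambda = 1.
# At the minimizer x = (2, 0) the measure is zero.
A = [[1.0, 0.0], [0.0, 1.0]]
b = [-3.0, 0.0]
print(first_order_measure(A, b, 1.0, 1e-8, [2.0, 0.0]))  # 0.0
```

The convergence test info(end, 1) <= delta is then just a comparison of this measure against the relaxation parameter.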

Relaxation Factor
The value info(k+1, 2) is the relaxation factor used during iteration @(@ k @)@. As this factor decreases to zero, the relaxed problem approaches the original problem, so the iterates should approach a solution of the original problem.

Step Size
The value info(k+1, 3) is the line search step size used during iteration @(@ k @)@. Values near one indicate good convergence. Small values indicate that the relaxation factor is decreasing too quickly. (The exception is that info(1, 3) is always zero.)

Example
The file quadratic_ok.m contains an example and test of l1_quadratic. It returns true if the test passes and false otherwise.

Newton Step
The routine l1_newton_quad determines the Newton step used by l1_quadratic.
Input File: l1_quadratic.m