Prev Next

@(@ \newcommand{\R}[1]{{\rm #1}} @)@@(@ \newcommand{\B}[1]{{\bf #1}} @)@
L2 Objective with L1 Regularization

Syntax
[x_outinfo] = l1_quadratic(max_itrAblambdadelta)

See Also
l1_with_l2

Notation
We use @(@ 1_n \in \B{R}^n @)@ to denote the vector will all elements equal to one. Given a vector @(@ w \in \B{R}^n @)@ we use @(@ D(w) \in \B{R}^{n \times n} @)@, to denote the corresponding diagonal matrix.

Purpose
We are given a positive semi-definite matrix @(@ A \in \B{R}^{n \times n} @)@, a vector @(@ b \in \B{R}^n @)@ and a scalar @(@ \lambda \in \B{R}_+ @)@. The problem we wish to solve is @[@ \begin{array}{lll} \R{minimize} & \frac{1}{2} x^\R{T} A x + b^\R{T} x + 2 \lambda \| x \|_1 & \R{w.r.t} \; x \in \B{R}^n \end{array} @]@ The @(@ \delta @)@ relaxed first order conditions for a minimum are: for @(@ i = 1 , \ldots , n @)@ @[@ \begin{array}{l} \R{if} \; x_i > \delta \; , \; \delta \geq \left| \sum_{j=1}^n A_{i,j} x_j + b_i + \lambda \right| \\ \R{if} \; x_i < - \delta \; , \; \delta \geq \left| \sum_{j=1}^n A_{i,j} x_j + b_i - \lambda \right| \\ \R{if} \; | x_i | \leq \delta \; , \; \lambda + \delta \geq \left| \sum_{j=1}^n A_{i,j} x_j + b_i \right| \end{array} @]@

max_itr
The integer scalar max_itr specifies the maximum number of iterations of the algorithm to execute. It must be greater than or equal to zero. Note that if it is zero, the first row of the info return value will still be computed. This can be useful for deciding what is a good value for the argument delta .

A
The argument A is an @(@ n \times n @)@ positive definite real matrix.

b
The argument b is real column vector of length @(@ n @)@.

lambda
The argument lambda is a real scalar. It corresponds to @(@ \lambda @)@ in the discussion above (see Purpose ).

delta
The argument delta is a real scalar greater than zero. It corresponds to @(@ \delta @)@ in the discussion above.

x_out
The result x_out contains the approximate solution of the optimization problem at the last iteration; i.e., it corresponds to the value in info(end, 1) described below.

info
The result info is a matrix with each row corresponding to an iteration of the algorithm. The row info(k, :) corresponds to iteration @(@ k-1 @)@ of the algorithm. If l1_quadratic has satisfied the convergence condition if and only if
      
info(end, 1) <= delta

First Order Condition
We use @(@ x^k @)@ to denote the value of @(@ x @)@ at iteration @(@ k @)@ of the algorithm. The value info(k+1, 1) is defined by @[@ info(k, 1) = \max_i \left\{ \begin{array}{ll} \left| \sum_{j=1}^n A_{i,j} x_j^k + b_i + \lambda \right| & \R{if} \; x_i^k > \delta \\ \left| \sum_{j=1}^n A_{i,j} x_j^k + b_i - \lambda \right| & \R{if} \; x_i^k < - \delta \\ \max \left [ 0 , \left| \sum_{j=1}^n A_{i,j} x_j^k + b_i \right| - \lambda \right] & \R{if} \; | x_i^k | \leq \delta \end{array} \right\} @]@

Relaxation Factor
The value info(k+1, 2) is the relaxation factor used during iteration @(@ k @)@. As this factor decreases to zero, the solution should correspond to the original problem.

Step Size
The value info(k+1, 3) is the line search step size used during iteration @(@ k @)@. Values near one indicate good convergence. Small values indicate that the relaxation factor is decreasing to quickly (with the exception that info(1, 3) is always zero).

Example
The file quadratic_ok.m contains an example and test of l1_quadratic. It returns true if the test passes and false otherwise.

Newton Step
The routine l1_newton_quad determines the Newton step used by l1_quadratic.
Input File: l1_quadratic.m