Nonlinear Optimization
Overview of methods; the Newton method with line search

Niclas Börlin
Department of Computing Science
Umeå University
niclas.borlin@cs.umu.se

November 19, 2007

Methods for unconstrained optimization

Most deterministic methods for unconstrained optimization have the following features:

- They are iterative, i.e. they start with an initial guess x_0 of the variables and try to find better points {x_k}, k = 1, 2, ....
- They are descent methods, i.e. at each iteration k,

      f(x_{k+1}) < f(x_k)

  is (at least) required.
- At each iteration k, the nonlinear objective function f is replaced by a simpler model function m_k that approximates f around x_k. The next iterate x_{k+1} = x_k + p is sought as the minimizer of m_k.
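As a schematic illustration of this framework, the Python sketch below shows the generic loop; the helper model_minimizer is a hypothetical placeholder for whatever method defines and minimizes m_k, not something from the lecture:

import numpy as np

def descent_method(f, model_minimizer, x0, max_iter=100):
    """Generic descent framework: minimize a local model m_k of f
    around x_k and require a decrease in f at every iteration."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        p = model_minimizer(f, x)   # step to the minimizer of m_k
        if not f(x + p) < f(x):     # descent is (at least) required
            break
        x = x + p
    return x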
Trust-region

In the trust-region strategy, the algorithm defines a region of trust around x_k where the current model function m_k is trusted. The region of trust is usually defined as

    ‖p‖_2 ≤ Δ.

In the line search strategy, the direction is chosen first, followed by the distance. In the trust-region strategy, the maximum distance is chosen first, followed by the direction.

[Figure: contour plots contrasting the two strategies: a line search step along a chosen direction p from x_k, and a trust-region step confined to a disk of radius Δ around x_k.]
Convergence rate

In order to compare different iterative methods, we need an efficiency measure. Since we do not know the number of iterations in advance, the computational complexity measure used by direct methods cannot be used. Instead the concept of a convergence rate is defined.

Assume we have a sequence {x_k} that converges to a solution x*. Define the sequence of errors as

    e_k = x_k − x*

and note that lim_{k→∞} e_k = 0.

We say that the sequence {x_k} converges to x* with rate r and rate constant C if

    lim_{k→∞} ‖e_{k+1}‖ / ‖e_k‖^r = C

and C < ∞.
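To make the definition concrete, the short Python sketch below (an illustration of mine, not from the lecture) prints the ratios ‖e_{k+1}‖/‖e_k‖^r for two artificial error sequences, one linear (r = 1) and one quadratic (r = 2):

def ratios(errors, r):
    """Print ||e_{k+1}|| / ||e_k||**r; the values settle near C."""
    for ek, ek1 in zip(errors, errors[1:]):
        print(ek1 / ek**r)

# Linear convergence: e_{k+1} = 0.5 e_k, so r = 1 and C = 0.5.
ratios([0.5**k for k in range(1, 10)], r=1)
# Quadratic convergence: e_{k+1} = e_k**2, so r = 2 and C = 1.
ratios([0.1**(2**k) for k in range(5)], r=2)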
Overview

- Methods for unconstrained optimization
- Convergence
  - Convergence rate
  - Linear convergence
  - Quadratic convergence
  - Local vs. global convergence
  - Globalization strategies
- Descent directions
- Line search
- The Newton Method
Descent directions

Since cos θ = −pᵀ∇f_k / (‖p‖ ‖∇f_k‖), where θ is the angle between the search direction p and the negative gradient, descent directions satisfy pᵀ∇f_k < 0, i.e. they lie in the same half-plane as the negative gradient.

If the search direction has the form

    p_k = −B_k⁻¹ ∇f_k,

it is a descent direction whenever B_k is symmetric positive definite, since then p_kᵀ∇f_k = −∇f_kᵀ B_k⁻¹ ∇f_k < 0.

The search direction corresponding to the negative gradient, p = −∇f_k, is called the direction of steepest descent.

[Figure: contour plot showing the negative gradient at x_k and the half-plane of descent directions.]
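A quick numerical check of these statements (the example values are mine, not from the slides):

import numpy as np

g = np.array([2.0, -1.0])               # gradient at x_k
B = np.array([[3.0, 1.0], [1.0, 2.0]])  # a symmetric positive definite B_k

p_sd = -g                               # steepest-descent direction
p_B = -np.linalg.solve(B, g)            # p_k = -B_k^{-1} grad f_k

for p in (p_sd, p_B):
    slope = p @ g                       # p^T grad f_k; negative => descent
    cos_t = -slope / (np.linalg.norm(p) * np.linalg.norm(g))
    print(slope < 0, np.degrees(np.arccos(np.clip(cos_t, -1, 1))))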
Overview

- Methods for unconstrained optimization
- Convergence
- Descent directions
- Line search
  - Exact and inexact line searches
  - The Sufficient Decrease Condition
  - Backtracking
  - The Curvature Condition
  - The Wolfe Conditions
- The Newton Method
Exact and inexact line searches

Each iteration of a line search method computes a search direction p_k and then decides how far to move along that direction. The next iterate is given by

    x_{k+1} = x_k + α p_k

for some step length α > 0. Consider the one-dimensional function

    φ(α) = f(x_k + α p_k), α > 0.

The Sufficient Decrease Condition

The sufficient decrease condition states that the new point must at least produce a fraction 0 < c_1 < 1 of the decrease predicted by the first-order Taylor approximation f(x_k + α p_k) ≈ f(x_k) + α ∇f_kᵀ p_k, i.e.

    f(x_k + α p_k) < f(x_k) + c_1 α ∇f_kᵀ p_k.

This condition is sometimes called the Armijo condition.

[Figure: plot of φ(α) with candidate step lengths α = 1, 1/2, 1/4, 1/8, 1/16 marked on the axis.]
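A backtracking line search based on this condition fits in a few lines of Python (a minimal sketch; the parameter values c_1 = 1e-4 and ρ = 1/2 are common defaults, not prescribed by the slides):

def backtracking(f, gk, xk, pk, alpha=1.0, c1=1e-4, rho=0.5):
    """Shrink the step length until the sufficient decrease (Armijo)
    condition f(xk + a pk) < f(xk) + c1 a gk^T pk holds."""
    fk = f(xk)
    slope = gk @ pk                 # negative for a descent direction
    while f(xk + alpha * pk) >= fk + c1 * alpha * slope:
        alpha *= rho                # candidates: 1, 1/2, 1/4, 1/8, ...
    return alpha

With ρ = 1/2 the candidate step lengths are exactly the values 1, 1/2, 1/4, 1/8, 1/16 marked in the figure. Termination is guaranteed for a descent direction, since sufficiently small steps always satisfy the condition.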
The Wolfe Conditions

The sufficient decrease condition

    f(x_k + α p_k) ≤ f(x_k) + c_1 α ∇f_kᵀ p_k

and the curvature condition in its strong form

    |∇f(x_k + α p_k)ᵀ p_k| ≤ c_2 |∇f_kᵀ p_k|,

where 0 < c_1 < c_2 < 1, are collectively called the strong Wolfe conditions. Step length methods that use the Wolfe conditions are more complicated than backtracking. Several popular implementations of nonlinear optimization routines are based on the Wolfe conditions, notably the BFGS quasi-Newton method.

The Newton-Raphson method in ℝ¹

If f′(x_k) ≠ 0 we solve the linear equation

    f(x_k) + p f′(x_k) = 0

for p and get

    p = −f(x_k)/f′(x_k).

The new iterate is given by

    x_{k+1} = x_k + p_k = x_k − f(x_k)/f′(x_k).
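A direct Python transcription of this iteration (the stopping rule and the example function are illustrative choices of mine):

def newton_raphson(f, fprime, x0, tol=1e-12, max_iter=50):
    """Root finding via x_{k+1} = x_k - f(x_k)/f'(x_k)."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            break
        x -= fx / fprime(x)         # valid as long as f'(x_k) != 0
    return x

# Example: the positive root of f(x) = x^2 - 2 is sqrt(2) = 1.414...
print(newton_raphson(lambda x: x*x - 2, lambda x: 2*x, x0=1.0))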
The Classical Newton minimization method in ℝⁿ

In order to use Newton's method to find a minimizer, we apply the first-order necessary conditions on a function f:

    ∇f(x*) = 0    (in ℝ¹: f′(x*) = 0).

This results in the Newton sequence

    x_{k+1} = x_k − (∇²f(x_k))⁻¹ ∇f(x_k)    (in ℝ¹: x_{k+1} = x_k − f′(x_k)/f″(x_k)).

This is often written as x_{k+1} = x_k + p_k, where p_k is the solution of the Newton equation:

    ∇²f(x_k) p_k = −∇f(x_k).

This formulation emphasizes that a linear equation system is solved in each step, usually by other means than calculating an inverse.

Geometrical interpretation; the model function

The approximation of the non-linear function ∇f(x) with the polynomial, linear in p,

    ∇f(x_k + p) ≈ ∇f(x_k) + ∇²f(x_k) p,

corresponds to approximating the non-linear function f(x) with the Taylor expansion, quadratic in p,

    m_k(x_k + p) = f(x_k) + ∇f(x_k)ᵀ p + ½ pᵀ ∇²f(x_k) p,

i.e. B_k = ∇²f(x_k). Newton's method can thus be interpreted as follows: at each iteration k, f is approximated by the quadratic Taylor expansion m_k around x_k, and x_{k+1} is calculated as the minimizer of m_k.
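Both formulations are easy to verify numerically. The sketch below (the quadratic test function is my choice for illustration) computes the Newton direction by solving the Newton equation as a linear system, and confirms that for a quadratic f, where m_k = f, a single step reaches the minimizer:

import numpy as np

def newton_step(grad, hess, x):
    """Solve grad^2 f(x) p = -grad f(x); no explicit inverse is formed."""
    return np.linalg.solve(hess(x), -grad(x))

# f(x) = 0.5 x^T A x - b^T x with grad f = A x - b and hess f = A.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 2.0])
x0 = np.zeros(2)
x1 = x0 + newton_step(lambda x: A @ x - b, lambda x: A, x0)
print(np.allclose(x1, np.linalg.solve(A, b)))   # True: minimizer in one step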
Overview

- Methods for unconstrained optimization
- Convergence
- Descent directions
- Line search
- The Newton Method
  - The Newton-Raphson method in ℝ¹
  - The Classical Newton minimization method in ℝⁿ
  - Geometrical interpretation; the model function
  - Properties of the Newton method
  - Ensuring a descent direction
  - The modified Newton algorithm with line search
Ensuring a descent direction

The positive definite approximation B_k of the Hessian may be found with minimal extra effort. The search direction p is calculated as the solution of

    ∇²f(x) p = −∇f(x).

If ∇²f(x) is positive definite, the matrix factorization used to solve this system (e.g. the Cholesky factorization) succeeds and the Newton direction can be used as is; otherwise the factorization is modified to produce a positive definite B_k.

The modified Newton algorithm with line search

Specify a starting approximation x_0 and a convergence tolerance ε.
Repeat for k = 0, 1, ...
  1. If ‖∇f(x_k)‖ < ε, stop.
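Combining the pieces, a complete modified Newton method with line search might look like the Python sketch below. It is an illustrative composition under stated assumptions: the multiple-of-identity Hessian modification and the Armijo parameters are common choices, not necessarily the ones intended by the slides:

import numpy as np

def modified_newton(f, grad, hess, x0, eps=1e-8, max_iter=200):
    """Modified Newton with backtracking line search (sketch)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < eps:          # convergence test
            return x
        B, tau, I = hess(x), 0.0, np.eye(x.size)
        while True:                          # increase tau until B + tau*I
            try:                             # is positive definite
                np.linalg.cholesky(B + tau * I)
                break
            except np.linalg.LinAlgError:
                tau = max(2.0 * tau, 1e-3)
        p = np.linalg.solve(B + tau * I, -g) # guaranteed descent direction
        alpha, fx, slope = 1.0, f(x), g @ p
        while f(x + alpha * p) >= fx + 1e-4 * alpha * slope:  # Armijo
            alpha *= 0.5
        x = x + alpha * p
    return x

# Example on the Rosenbrock test function; the minimizer is (1, 1).
f = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
g = lambda x: np.array([-2*(1 - x[0]) - 400*x[0]*(x[1] - x[0]**2),
                        200*(x[1] - x[0]**2)])
H = lambda x: np.array([[2 - 400*(x[1] - 3*x[0]**2), -400.0*x[0]],
                        [-400.0*x[0], 200.0]])
print(modified_newton(f, g, H, np.array([-1.2, 1.0])))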