Levenberg-Marquardt algorithm

From Wikipedia, the free encyclopedia

(Redirected from Levenberg-Marquardt)
Jump to: navigation, search

In mathematics and computing, the Levenberg-Marquardt algorithm (or LMA) provides a numerical solution to the problem of minimizing a function, generally nonlinear, over a space of parameters of the function. These minimization problems arise especially in least squares curve fitting and nonlinear programming.

The LMA interpolates between the Gauss-Newton algorithm (GNA) and the method of gradient descent. The LMA is more robust than the GNA, which means that in many cases it finds a solution even if it starts very far off the final minimum. On the other hand, for well-behaved functions and reasonable starting parameters, the LMA tends to be a bit slower than the GNA.

The LMA is a very popular curve-fitting algorithm; most software with generic curve-fitting capabilities provides an implementation of it.

Contents

Its main application is in the least squares curve fitting problem: given a set of empirical data pairs (ti, yi), optimize the parameters p of the model curve f(t|p) so that the sum of the squares of the deviations

S(\mathbf{p}) = \sum_{i=1}^m [y_i - f(t_i \mid \mathbf{p}) ]^2\,

becomes minimal.

(A word on notation: we avoid the letter x because it is used sometimes in the place of our p, sometimes in the place of our t).

Like other numeric minimization algorithms, the Levenberg-Marquardt algorithm is an iterative procedure. To start a minimization, the user has to provide an initial guess for the parameter vector p. In many cases, an uninformed standard guess like pT=(1,1,...,1) will work fine; in other cases, the algorithm converges only if the initial guess is already somewhat close to the final solution.

In each iteration step, the parameter vector p is replaced by a new estimate p + q. To determine q, the functions fi(p + q) are approximated by their linearizations

 f(p+q) \approx f(p) + J q \,\!

where J is the Jacobian of f at p.

At a minimum of the sum of squares S, the gradient of S with respect to q is 0. Differentiating the square of the right hand side of the equation above and setting to zero leads to:

(J^{T}J) q = J^{T} [y - f(p)] \,\!

from which q can be obtained by inverting JTJ.

The key hallmark of the LMA is to replace this equation by a 'damped version'

(J^{T}J + \lambda I) q = J^{T} [y - f(p)], \,\!

where I is the identity matrix, giving as the increment q to the estimated parameter vector p

 q = (J^{T}J + \lambda I)^{-1} J^{T} [y - f(p)]. \,\!

The (non-negative) damping factor λ is adjusted at each iteration. If reduction of S is rapid a smaller value can be used bringing the algorithm closer to the GNA, whereas if an iteration gives insufficient reduction in the residual λ can be increased giving a step closer to the gradient descent direction. A similar damping factor appears in Tikhonov regularization, which is used to solve linear ill-posed problems.

If either the calculated step length q, or the reduction of sum of squares reduction from the latest parameter vector p + q, fall below predefined limits, the iteration is aborted and the last parameter vector p is considered to be the solution.

Various more or less heuristic arguments have been put forward for the best choice for the damping parameter λ. Theoretical arguments exist showing why some of these choices guaranteed local convergence of the algorithm; however these choices can make the global convergence of the algorithm suffer from the undesirable properties of steepest-descent, in particular very slow convergence close to the optimum.

The absolute values of any choice depends on how well-scaled the initial problem is. Marquardt recommended starting with a value λ0 and a factor ν>1. Initially setting λ=λ0 and computing the residual sum of squares S(p) after one step from the starting point with the damping factor of λ=λ0 and secondly with λ/ν. If both of these are worse than the initial point then the damping is increased by successive multiplication by ν until a better point is found with a new damping factor of λνk for some k.

If use of the damping factor λ/ν results in a reduction in squared residual then this is taken as the new value of λ (and the new optimum location is taken as that obtained with this damping factor) and the process continues; if using λ/ν resulted in a worse residual, but using λ resulted in a better residual then λ is left unchanged and the new optimum is taken as the value obtained with λ as damping factor.

In this example we try to fit the function y = acos(bX) + bsin(aX) using the Levenberg-Marquardt (Lev-Mar) algorithm implemented in GNU Octave as the leasqr function. The 3 graphs Fig 1,2,3 show progressively better fitting for the parameters a=100, b=102 used in the initial curve. Only when the parameters in Fig 3 are chosen closest to the original,are the curves fitting exactly. This equation is an example of very sensitive initial conditions for the Lev-Mar algorithm.

Poor Fit Better Fit Best Fit

The algorithm was first published by Kenneth Levenberg, while working at the Frankford Army Arsenal. It was rediscovered by Donald Marquardt who worked as a statistician at DuPont.

Available implementations:

Advanced Search
Included Web Search Engines


Safe Search

close

Top Matching Results

Occasionally Search.com will highlight specialized results that are based on the context of your query. Examples of specialized results include specific links to news, images, or video.

Top Matching Results may highlight information from other Search.com pages, content from the CNET Network of sites, or third party content. The listings are based purely on relevance. Search.com does not receive payment for listings in this section but our partners that provide this data may get paid for listing these products.

Sponsored Links

This section contains paid listings which have been purchased by companies that want to have their sites appear for specific search terms and related content. These listings are administered, sorted and maintained by a third party and are not endorsed by Search.com.

Search Results

Search.com sends your search query to several search engines at one time and integrates the results into one list which has been sorted by relevance using Search.com's proprietary algorithm. You can customize the list of search engines included in your metasearch from the preferences.

The search engines that are used in your metasearch may allow companies to pay to have their Web sites included within the results. To view the Paid Inclusion policy for a specific search engine, please visit their Web site. Search.com does not accept payment or share revenue with any search engine partner for listings in this section.