Document string from the mpfit moduleΒΆ


Perform Levenberg-Marquardt least-squares minimization, based on MINPACK-1.

AUTHORS

The original version of this software, called LMFIT, was written in FORTRAN
as part of the MINPACK-1 package by XXX.

Craig Markwardt converted the FORTRAN code to IDL. The information for the
IDL version is:
Craig B. Markwardt, NASA/GSFC Code 662, Greenbelt, MD 20770
UPDATED VERSIONs can be found on my WEB PAGE:
Mark Rivers created this Python version from Craig’s IDL version.
Mark Rivers, University of Chicago
Building 434A, Argonne National Laboratory
9700 South Cass Avenue, Argonne, IL 60439
Updated versions can be found at http://cars.uchicago.edu/software

Sergey Koposov converted the Mark’s Python version from Numeric to numpy
Sergey Koposov, Max Planck Institute for Astronomy
Heidelberg, Germany, D-69117

DESCRIPTION

MPFIT uses the Levenberg-Marquardt technique to solve the
least-squares problem. In its typical use, MPFIT will be used to
fit a user-supplied function (the “model”) to user-supplied data
points (the “data”) by adjusting a set of parameters. MPFIT is
based upon MINPACK-1 (LMDIF.F) by More’ and collaborators.

For example, a researcher may think that a set of observed data
points is best modelled with a Gaussian curve. A Gaussian curve is
parameterized by its mean, standard deviation and normalization.
MPFIT will, within certain constraints, find the set of parameters
which best fits the data. The fit is “best” in the least-squares
sense; that is, the sum of the weighted squared differences between
the model and data is minimized.

The Levenberg-Marquardt technique is a particular strategy for
iteratively searching for the best fit. This particular
implementation is drawn from MINPACK-1 (see NETLIB), and is much faster
and more accurate than the version provided in the Scientific Python package
in Scientific.Functions.LeastSquares.
This version allows upper and lower bounding constraints to be placed on each
parameter, or the parameter can be held fixed.

The user-supplied Python function should return an array of weighted
deviations between model and data. In a typical scientific problem
the residuals should be weighted so that each deviate has a
gaussian sigma of 1.0. If X represents values of the independent
variable, Y represents a measurement for each value of X, and ERR
represents the error in the measurements, then the deviates could
be calculated as follows:

DEVIATES = (Y - F(X)) / ERR

where F is the analytical function representing the model. You are
recommended to use the convenience functions MPFITFUN and
MPFITEXPR, which are driver functions that calculate the deviates
for you. If ERR are the 1-sigma uncertainties in Y, then

TOTAL( DEVIATES^2 )

will be the total chi-squared value. MPFIT will minimize the
chi-square value. The values of X, Y and ERR are passed through
MPFIT to the user-supplied function via the FUNCTKW keyword.

Simple constraints can be placed on parameter values by using the
PARINFO keyword to MPFIT. See below for a description of this
keyword.

MPFIT does not perform more general optimization tasks. See TNMIN
instead. MPFIT is customized, based on MINPACK-1, to the
least-squares minimization problem.


USER FUNCTION

The user must define a function which returns the appropriate
values as specified above. The function should return the weighted
deviations between the model and the data. It should also return a status
flag and an optional partial derivative array. For applications which
use finite-difference derivatives – the default – the user
function should be declared in the following way:

def myfunct(p, fjac=None, x=None, y=None, err=None)
# Parameter values are passed in “p”
# If fjac==None then partial derivatives should not be
# computed. It will always be None if MPFIT is called with default
# flag.
model = F(x, p)
# Non-negative status value means MPFIT should continue, negative means
# stop the calculation.
status = 0
return([status, (y-model)/err]

See below for applications with analytical derivatives.

The keyword parameters X, Y, and ERR in the example above are
suggestive but not required. Any parameters can be passed to
MYFUNCT by using the functkw keyword to MPFIT. Use MPFITFUN and
MPFITEXPR if you need ideas on how to do that. The function must
accept a parameter list, P.

In general there are no restrictions on the number of dimensions in
X, Y or ERR. However the deviates must be returned in a
one-dimensional Numeric array of type Float.

User functions may also indicate a fatal error condition using the
status return described above. If status is set to a number between
-15 and -1 then MPFIT will stop the calculation and return to the caller.


ANALYTIC DERIVATIVES

In the search for the best-fit solution, MPFIT by default
calculates derivatives numerically via a finite difference
approximation. The user-supplied function need not calculate the
derivatives explicitly. However, if you desire to compute them
analytically, then the AUTODERIVATIVE=0 keyword must be passed to MPFIT.
As a practical matter, it is often sufficient and even faster to allow
MPFIT to calculate the derivatives numerically, and so
AUTODERIVATIVE=0 is not necessary.

If AUTODERIVATIVE=0 is used then the user function must check the parameter
FJAC, and if FJAC!=None then return the partial derivative array in the
return list.
def myfunct(p, fjac=None, x=None, y=None, err=None)
# Parameter values are passed in “p”
# If FJAC!=None then partial derivatives must be comptuer.
# FJAC contains an array of len(p), where each entry
# is 1 if that parameter is free and 0 if it is fixed.
model = F(x, p)
Non-negative status value means MPFIT should continue, negative means
# stop the calculation.
status = 0
if (dojac):
pderiv = zeros([len(x), len(p)], Float)
for j in range(len(p)):
pderiv[:,j] = FGRAD(x, p, j)
else:
pderiv = None
return([status, (y-model)/err, pderiv]

where FGRAD(x, p, i) is a user function which must compute the
derivative of the model with respect to parameter P[i] at X. When
finite differencing is used for computing derivatives (ie, when
AUTODERIVATIVE=1), or when MPFIT needs only the errors but not the
derivatives the parameter FJAC=None.

Derivatives should be returned in the PDERIV array. PDERIV should be an m x
n array, where m is the number of data points and n is the number
of parameters. dp[i,j] is the derivative at the ith point with
respect to the jth parameter.

The derivatives with respect to fixed parameters are ignored; zero
is an appropriate value to insert for those derivatives. Upon
input to the user function, FJAC is set to a vector with the same
length as P, with a value of 1 for a parameter which is free, and a
value of zero for a parameter which is fixed (and hence no
derivative needs to be calculated).

If the data is higher than one dimensional, then the last
dimension should be the parameter dimension. Example: fitting a
50x50 image, “dp” should be 50x50xNPAR.


CONSTRAINING PARAMETER VALUES WITH THE PARINFO KEYWORD

The behavior of MPFIT can be modified with respect to each
parameter to be fitted. A parameter value can be fixed; simple
boundary constraints can be imposed; limitations on the parameter
changes can be imposed; properties of the automatic derivative can
be modified; and parameters can be tied to one another.

These properties are governed by the PARINFO structure, which is
passed as a keyword parameter to MPFIT.

PARINFO should be a list of dictionaries, one list entry for each parameter.
Each parameter is associated with one element of the array, in
numerical order. The dictionary can have the following keys
(none are required, keys are case insensitive):

‘value’ - the starting parameter value (but see the START_PARAMS
parameter for more information).

‘fixed’ - a boolean value, whether the parameter is to be held
fixed or not. Fixed parameters are not varied by
MPFIT, but are passed on to MYFUNCT for evaluation.

‘limited’ - a two-element boolean array. If the first/second
element is set, then the parameter is bounded on the
lower/upper side. A parameter can be bounded on both
sides. Both LIMITED and LIMITS must be given
together.

‘limits’ - a two-element float array. Gives the
parameter limits on the lower and upper sides,
respectively. Zero, one or two of these values can be
set, depending on the values of LIMITED. Both LIMITED
and LIMITS must be given together.

‘parname’ - a string, giving the name of the parameter. The
fitting code of MPFIT does not use this tag in any
way. However, the default iterfunct will print the
parameter name if available.

‘step’ - the step size to be used in calculating the numerical
derivatives. If set to zero, then the step size is
computed automatically. Ignored when AUTODERIVATIVE=0.

‘mpside’ - the sidedness of the finite difference when computing
numerical derivatives. This field can take four
values:

0 - one-sided derivative computed automatically
1 - one-sided derivative (f(x+h) - f(x) )/h
-1 - one-sided derivative (f(x) - f(x-h))/h
2 - two-sided derivative (f(x+h) - f(x-h))/(2*h)

Where H is the STEP parameter described above. The
“automatic” one-sided derivative method will chose a
direction for the finite difference which does not
violate any constraints. The other methods do not
perform this check. The two-sided method is in
principle more precise, but requires twice as many
function evaluations. Default: 0.

‘mpmaxstep’ - the maximum change to be made in the parameter
value. During the fitting process, the parameter
will never be changed by more than this value in
one iteration.

A value of 0 indicates no maximum. Default: 0.

‘tied’ - a string expression which “ties” the parameter to other
free or fixed parameters. Any expression involving
constants and the parameter array P are permitted.
Example: if parameter 2 is always to be twice parameter
1 then use the following: parinfo(2).tied = ‘2 * p(1)’.
Since they are totally constrained, tied parameters are
considered to be fixed; no errors are computed for them.
[ NOTE: the PARNAME can’t be used in expressions. ]

‘mpprint’ - if set to 1, then the default iterfunct will print the
parameter value. If set to 0, the parameter value
will not be printed. This tag can be used to
selectively print only a few parameter values out of
many. Default: 1 (all parameters printed)


Future modifications to the PARINFO structure, if any, will involve
adding dictionary tags beginning with the two letters “MP”.
Therefore programmers are urged to avoid using tags starting with
the same letters; otherwise they are free to include their own
fields within the PARINFO structure, and they will be ignored.

PARINFO Example:
parinfo = [{‘value’:0., ‘fixed’:0, ‘limited’:[0,0], ‘limits’:[0.,0.]}
for i in range(5)]
parinfo[0][‘fixed’] = 1
parinfo[4][‘limited’][0] = 1
parinfo[4][‘limits’][0] = 50.
values = [5.7, 2.2, 500., 1.5, 2000.]
for i in range(5): parinfo[i][‘value’]=values[i]

A total of 5 parameters, with starting values of 5.7,
2.2, 500, 1.5, and 2000 are given. The first parameter
is fixed at a value of 5.7, and the last parameter is
constrained to be above 50.


EXAMPLE

import mpfit
import numpy.oldnumeric as Numeric
x = arange(100, float)
p0 = [5.7, 2.2, 500., 1.5, 2000.]
y = ( p[0] + p[1]*[x] + p[2]*[x**2] + p[3]*sqrt(x) +
p[4]*log(x))
fa = {‘x’:x, ‘y’:y, ‘err’:err}
m = mpfit(‘myfunct’, p0, functkw=fa)
print ‘status = ‘, m.status
if (m.status <= 0): print ‘error message = ‘, m.errmsg
print ‘parameters = ‘, m.params

Minimizes sum of squares of MYFUNCT. MYFUNCT is called with the X,
Y, and ERR keyword parameters that are given by FUNCTKW. The
results can be obtained from the returned object m.


THEORY OF OPERATION

There are many specific strategies for function minimization. One
very popular technique is to use function gradient information to
realize the local structure of the function. Near a local minimum
the function value can be taylor expanded about x0 as follows:

f(x) = f(x0) + f’(x0) . (x-x0) + (1/2) (x-x0) . f’‘(x0) . (x-x0)
—– ————— ——————————- (1)
Order 0th 1st 2nd

Here f’(x) is the gradient vector of f at x, and f’‘(x) is the
Hessian matrix of second derivatives of f at x. The vector x is
the set of function parameters, not the measured data vector. One
can find the minimum of f, f(xm) using Newton’s method, and
arrives at the following linear equation:

f’‘(x0) . (xm-x0) = - f’(x0) (2)

If an inverse can be found for f’‘(x0) then one can solve for
(xm-x0), the step vector from the current position x0 to the new
projected minimum. Here the problem has been linearized (ie, the
gradient information is known to first order). f’‘(x0) is
symmetric n x n matrix, and should be positive definite.

The Levenberg - Marquardt technique is a variation on this theme.
It adds an additional diagonal term to the equation which may aid the
convergence properties:

(f’‘(x0) + nu I) . (xm-x0) = -f’(x0) (2a)

where I is the identity matrix. When nu is large, the overall
matrix is diagonally dominant, and the iterations follow steepest
descent. When nu is small, the iterations are quadratically
convergent.

In principle, if f’‘(x0) and f’(x0) are known then xm-x0 can be
determined. However the Hessian matrix is often difficult or
impossible to compute. The gradient f’(x0) may be easier to
compute, if even by finite difference techniques. So-called
quasi-Newton techniques attempt to successively estimate f’‘(x0)
by building up gradient information as the iterations proceed.

In the least squares problem there are further simplifications
which assist in solving eqn (2). The function to be minimized is
a sum of squares:

f = Sum(hi^2) (3)

where hi is the ith residual out of m residuals as described
above. This can be substituted back into eqn (2) after computing
the derivatives:

f’ = 2 Sum(hi hi’)
f’’ = 2 Sum(hi’ hj’) + 2 Sum(hi hi’‘) (4)

If one assumes that the parameters are already close enough to a
minimum, then one typically finds that the second term in f’’ is
negligible [or, in any case, is too difficult to compute]. Thus,
equation (2) can be solved, at least approximately, using only
gradient information.

In matrix notation, the combination of eqns (2) and (4) becomes:

hT’ . h’ . dx = - hT’ . h (5)

Where h is the residual vector (length m), hT is its transpose, h’
is the Jacobian matrix (dimensions n x m), and dx is (xm-x0). The
user function supplies the residual vector h, and in some cases h’
when it is not found by finite differences (see MPFIT_FDJAC2,
which finds h and hT’). Even if dx is not the best absolute step
to take, it does provide a good estimate of the best direction,
so often a line minimization will occur along the dx vector
direction.

The method of solution employed by MINPACK is to form the Q . R
factorization of h’, where Q is an orthogonal matrix such that QT .
Q = I, and R is upper right triangular. Using h’ = Q . R and the
ortogonality of Q, eqn (5) becomes

(RT . QT) . (Q . R) . dx = - (RT . QT) . h
RT . R . dx = - RT . QT . h (6)
R . dx = - QT . h

where the last statement follows because R is upper triangular.
Here, R, QT and h are known so this is a matter of solving for dx.
The routine MPFIT_QRFAC provides the QR factorization of h, with
pivoting, and MPFIT_QRSOLV provides the solution for dx.


REFERENCES

MINPACK-1, Jorge More’, available from netlib (www.netlib.org).
“Optimization Software Guide,” Jorge More’ and Stephen Wright,
SIAM, Frontiers in Applied Mathematics, Number 14.
More’, Jorge J., “The Levenberg-Marquardt Algorithm:
Implementation and Theory,” in Numerical Analysis, ed. Watson,
G. A., Lecture Notes in Mathematics 630, Springer-Verlag, 1977.


MODIFICATION HISTORY

Translated from MINPACK-1 in FORTRAN, Apr-Jul 1998, CM
Copyright (C) 1997-2002, Craig Markwardt
This software is provided as is without any warranty whatsoever.
Permission to use, copy, modify, and distribute modified or
unmodified copies is granted, provided this copyright and disclaimer
are included unchanged.

Translated from MPFIT (Craig Markwardt’s IDL package) to Python,
August, 2002. Mark Rivers
Converted from Numeric to numpy (Sergey Koposov, July 2008)