Class LBFGS
The limited-memory BFGS (L-BFGS) algorithm is a quasi-Newton method that approximates the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm using a limited amount of computer memory. Unlike the original BFGS method, which maintains a dense n-by-n approximation of the inverse Hessian, L-BFGS stores only a few vectors that represent the approximation implicitly. This makes it particularly well suited for optimization problems with a large number of variables.
This implementation uses a line search procedure to ensure sufficient decrease in the objective function and to maintain positive definiteness of the Hessian approximation.
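To make the calling pattern concrete, a minimal setup sketch follows; the minimize calls themselves are sketched under Method Details below. This assumes OptimizationInterface exposes a single energyAndGradient(double[] x, double[] g) method that returns the function value and fills in the gradient array; the actual interface may declare additional methods, so treat this as illustrative pseudocode rather than the real API.

    // Hedged setup sketch: a simple quadratic objective f(x) = sum_i (x_i - 1)^2.
    // The energyAndGradient method name is an assumption about OptimizationInterface;
    // consult the actual interface before use.
    OptimizationInterface quadratic = new OptimizationInterface() {
      @Override
      public double energyAndGradient(double[] x, double[] g) {
        double f = 0.0;
        for (int i = 0; i < x.length; i++) {
          double d = x[i] - 1.0;
          f += d * d;          // accumulate (x_i - 1)^2
          g[i] = 2.0 * d;      // analytic gradient component
        }
        return f;
      }
    };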
- Since:
- 1.0
- Author:
- Michael J. Schnieders
Derived from:
Robert Dodier's Java translation of original FORTRAN code by Jorge Nocedal.
J. Nocedal, "Updating Quasi-Newton Matrices with Limited Storage", Mathematics of Computation, 35, 773-782 (1980)
D. C. Liu and J. Nocedal, "On the Limited Memory BFGS Method for Large Scale Optimization", Mathematical Programming, 45, 503-528 (1989)
J. Nocedal and S. J. Wright, "Numerical Optimization", Springer-Verlag, New York, 1999, Section 9.1
Field Summary
Modifier and Type      Field               Description
static final double    DEFAULT_ANGLEMAX    The default maximum angle between search direction and gradient.
static final double    DEFAULT_CAPPA       Controls the accuracy of the line search.
static final int       DEFAULT_INTMAX      The default maximum number of interpolations during line search.
static final double    DEFAULT_SLOPEMAX    The default projected gradient above which step size is reduced.
static final double    DEFAULT_STEPMAX     This specifies the default upper bound for the step in the line search.
static final double    DEFAULT_STEPMIN     This specifies the default lower bound for the step in the line search.
Method Summary
static int minimize(int n, int mSave, double[] x, double f, double[] g, double eps, int maxIterations, OptimizationInterface potential, OptimizationListener listener)
    This method solves the unconstrained minimization problem.
static int minimize(int n, int mSave, double[] x, double f, double[] g, double eps, OptimizationInterface potential, OptimizationListener listener)
    This method solves the unconstrained minimization problem.
Field Details
DEFAULT_CAPPA

public static final double DEFAULT_CAPPA

Controls the accuracy of the line search. If the function and gradient evaluations are inexpensive with respect to the cost of the iteration (which is sometimes the case when solving very large problems), it may be advantageous to set CAPPA to a small value. A typical small value is 0.1.
Restriction: CAPPA should be greater than 1e-4.
DEFAULT_STEPMIN

public static final double DEFAULT_STEPMIN

This specifies the default lower bound for the step in the line search. The default value of 1.0e-16 does not need to be modified unless the problem is extremely badly scaled (in which case the exponent should be increased).
DEFAULT_STEPMAX

public static final double DEFAULT_STEPMAX

This specifies the default upper bound for the step in the line search. The default value of 5.0e0 does not need to be modified unless the problem is extremely badly scaled (in which case the exponent should be increased).
DEFAULT_SLOPEMAX

public static final double DEFAULT_SLOPEMAX

The default projected gradient above which step size is reduced.
DEFAULT_ANGLEMAX

public static final double DEFAULT_ANGLEMAX

The default maximum angle between search direction and gradient.
DEFAULT_INTMAX

public static final int DEFAULT_INTMAX

The default maximum number of interpolations during line search.
Method Details
minimize

public static int minimize(int n, int mSave, double[] x, double f, double[] g, double eps, int maxIterations, OptimizationInterface potential, @Nullable OptimizationListener listener)

This method solves the unconstrained minimization problem

    min f(x), x = (x_1, x_2, ..., x_n),

using the limited-memory BFGS method. The routine is especially effective on problems involving a large number of variables. In a typical iteration of this method, an approximation Hk to the inverse of the Hessian is obtained by applying m BFGS updates to a diagonal matrix Hk0, using information from the previous m steps.

The user specifies the number mSave, which determines the amount of storage required by the routine. This is the number of previous steps that will be used to approximate the Hessian. Larger values of mSave can lead to better convergence but require more memory and computation per iteration.

The algorithm works as follows (the search-direction computation in step 2 is sketched in code after this list):
- Initialize with the current point, function value, and gradient
- Compute a search direction using the L-BFGS approximation of the inverse Hessian
- Perform a line search along this direction to find a new point with sufficient decrease in the function value
- Update the L-BFGS approximation using the new point and gradient
- Repeat until convergence or maximum iterations reached
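As referenced above, step 2 is commonly implemented with the standard two-loop recursion of Nocedal (1980). The following Java sketch is illustrative only and is not drawn from this class; the names twoLoopRecursion and dot are hypothetical helpers, and the correction pairs are assumed to be stored newest first.

    // Illustrative sketch of the standard L-BFGS two-loop recursion.
    // s[i] = x_{k-i} - x_{k-i-1}, y[i] = g_{k-i} - g_{k-i-1}, newest first.
    // Returns the search direction d = -H_k * g. All names are hypothetical.
    static double[] twoLoopRecursion(double[] g, double[][] s, double[][] y, double gamma) {
      int m = s.length;
      int n = g.length;
      double[] q = g.clone();
      double[] alpha = new double[m];
      // First loop: newest to oldest correction pairs.
      for (int i = 0; i < m; i++) {
        double rho = 1.0 / dot(y[i], s[i]);
        alpha[i] = rho * dot(s[i], q);
        for (int j = 0; j < n; j++) q[j] -= alpha[i] * y[i][j];
      }
      // Scale by the initial diagonal approximation Hk0 = gamma * I.
      for (int j = 0; j < n; j++) q[j] *= gamma;
      // Second loop: oldest to newest correction pairs.
      for (int i = m - 1; i >= 0; i--) {
        double rho = 1.0 / dot(y[i], s[i]);
        double beta = rho * dot(y[i], q);
        for (int j = 0; j < n; j++) q[j] += (alpha[i] - beta) * s[i][j];
      }
      for (int j = 0; j < n; j++) q[j] = -q[j];  // negate to get a descent direction
      return q;
    }

    static double dot(double[] a, double[] b) {
      double sum = 0.0;
      for (int i = 0; i < a.length; i++) sum += a[i] * b[i];
      return sum;
    }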
The step length is determined at each iteration by means of the line search routine lineSearch, which is a slight modification of the routine CSRCH written by Moré and Thuente. This ensures that the function value decreases sufficiently and that the Hessian approximation remains positive definite.

Parameters:
n - The number of variables in the minimization problem. Must be positive.
mSave - The number of corrections used in the BFGS update. Values of mSave less than 3 are not recommended; large values of mSave will result in excessive computing time. 3 <= mSave <= 7 is recommended. Must be non-negative. If mSave is 0, the method will use steepest descent.
x - On initial entry this must be set by the user to the values of the initial estimate of the solution vector. On exit, it contains the values of the variables at the best point found (usually a solution). The array must have length at least n.
f - The value of the function f at the point x.
g - The components of the gradient g at the point x. On exit, it contains the gradient at the final point. The array must have length at least n.
eps - Determines the accuracy with which the solution is to be found. The subroutine terminates when the root-mean-square of the gradient, G RMS, falls below EPS. Should be positive.
maxIterations - Maximum number of optimization steps. Must be positive.
potential - Implements the Potential interface to supply function values and gradients. Cannot be null.
listener - Implements the OptimizationListener interface and will be notified after each successful step. Can be null, in which case progress will be logged but no callbacks will be made.

Returns:
- status code:
- 0 = success (convergence achieved)
- 1 = maximum iterations reached without convergence
- -1 = optimization failed (e.g., line search failure, invalid inputs)
- Since:
- 1.0
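As a hedged illustration of calling this overload, the sketch below reuses the quadratic potential defined in the class-level example above and checks the returned status code; the energyAndGradient signature remains an assumption about OptimizationInterface.

    int n = 100;
    double[] x = new double[n];                    // initial estimate: the origin
    double[] g = new double[n];
    double f = quadratic.energyAndGradient(x, g);  // assumed accessor; see class-level note
    int status = LBFGS.minimize(n, 5, x, f, g, 1.0e-6, 200, quadratic, null);
    if (status == 0) {
      // Converged: G RMS < eps; x holds the best point found.
    } else if (status == 1) {
      // Hit maxIterations (200) before convergence.
    } else {
      // status == -1: line search failure or invalid inputs.
    }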
minimize

public static int minimize(int n, int mSave, double[] x, double f, double[] g, double eps, OptimizationInterface potential, OptimizationListener listener)

This method solves the unconstrained minimization problem

    min f(x), x = (x_1, x_2, ..., x_n),

using the limited-memory BFGS method. The routine is especially effective on problems involving a large number of variables. In a typical iteration of this method, an approximation Hk to the inverse of the Hessian is obtained by applying m BFGS updates to a diagonal matrix Hk0, using information from the previous m steps.

The user specifies the number mSave, which determines the amount of storage required by the routine. The user is required to calculate the function value f and its gradient g.

The step length is determined at each iteration by means of the line search routine lineSearch, which is a slight modification of the routine CSRCH written by Moré and Thuente.

Parameters:
n - The number of variables in the minimization problem. Restriction: n > 0.
mSave - The number of corrections used in the BFGS update. Values of mSave less than 3 are not recommended; large values of mSave will result in excessive computing time. 3 <= mSave <= 7 is recommended. Restriction: mSave > 0.
x - On initial entry this must be set by the user to the values of the initial estimate of the solution vector. On exit, it contains the values of the variables at the best point found (usually a solution).
f - The value of the function f at the point x.
g - The components of the gradient g at the point x.
eps - Determines the accuracy with which the solution is to be found. The subroutine terminates when G RMS < EPS.
potential - Implements the Potential interface to supply function values and gradients.
listener - Implements the OptimizationListener interface and will be notified after each successful step.

Returns:
- status code (0 = success, -1 = failed)
- Since:
- 1.0
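For completeness, a minimal sketch of this convenience overload (which omits the explicit iteration cap) follows, again assuming the quadratic potential from the class-level example.

    int status = LBFGS.minimize(n, 5, x, f, g, 1.0e-6, quadratic, null);
    // 0 = converged (G RMS < eps); -1 = failed.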