24.5. Newton Trust-Region Method

The Newton trust-region method solves the constrained quadratic programming problem

to obtain a direction dk, where Hk is the Hessian of the objective function at xk, gk is the gradient of the objective function at xk and is the trust-region radius. If xk + dk sufficiently reduces the nonlinear objective function, then the step is accepted and the trust-region radius is updated. However, if xk + dk does not sufficiently reduce the nonlinear objective function, then the step is rejected, the trust-region radius is reduced, and the quadratic program is re-solved using the updated trust-region radius. The Newton trust-region method can be set using TaoMethod tao_ntr. For the best efficiency, function and gradient evaluations should be performed separately when using this algorithm.

The quadratic optimization problem is approximately solved by applying the Steihaug-Toint conjugate gradient method or generalized Lanczos method to the symmetric system of equations Hk d = -gk. The method used to solve the system of equations is specified with the command line argument -tao_ntr_ksp_type <stcg,gltr>; stcg is the default. Internally, the PETSc implementations for the Steihaug-Toint method and the generalized Lanczos method are used. See the PETSc manual for further information on changing the behavior of these linear system solvers.

A good preconditioner reduces the number of iterations required to compute the direction. For the Steihaug-Toint conjugate gradient method and generalized Lanczos method, this preconditioner must be symmetric and positive definite. The available options are to use no preconditioner, the absolute value of the diagonal of the Hessian matrix, a limited-memory BFGS approximation to the Hessian matrix, or one of the other preconditioners provided by the PETSc package. These preconditioners are specified by the the command line argument -tao_ntr_pc_type <none,ahess,bfgs,petsc>, respectively. The default is the bfgs preconditioner. When the preconditioner type is set the to petsc, the preconditioner set with the PETSc -pc_type command line argument is used. For example, to use an incomplete Cholesky factorization for the preconditioner, one would use the command line arguments -tao_ntr_pc_type petsc -pc_type icc. See the PETSc manual for further information on changing the behavior of the preconditioners.

The choice of scaling matrix can have a significant impact on the quality of the Hessian approximation when using the bfgs preconditioner and affect the number of iterations required by the linear system solver. The choices for scaling matrices are the same as those discussed for the limited-memory, variable-metric algorithm. For Newton methods, however, the option exists to use a scaling matrix based on the true Hessian matrix. In particular, the implementation supports using the absolute value of the diagonal of the Hessian matrix. The scaling matrix to use with the bfgs preconditioner is set with the command line argument -tao_ntr_bfgs_scale_type <ahess,bfgs>; ahess is the default. The bfgs scaling matrix is derived from the BFGS options. The ahess scaling matrix is the absolute value of the diagonal of the Hessian matrix.

The method for computing an initial trust-region radius is set with the command line argument -tao_ntr_init_type <constant,direction,interpolation>; interpolation, which chooses an initial value based on the interpolation scheme found in [(ref CGT)], is the default. This scheme performs a number of function and gradient evaluations to determine a radius such that the reduction predicted by the quadratic model along the gradient direction coincides with the actual reduction in the nonlinear function. The iterate obtaining the best objective function value is used as the starting point for the main line-search algorithm. The constant method initializes the trust-region radius by using the value specified with the -tao_trust0 <double> command line argument, where the default value is 100. The direction technique solves the first quadratic optimization problem by using a standard conjugate gradient method and initializes the trust-region to .

Finally, the method for updating the trust-region radius is set with the command line argument -tao_ntr_update_type <reduction,interpolation>; reduction is the default. The reduction method computes the ratio of the actual reduction in the objective function to the reduction predicted by the quadratic model for the full step, , where qk is the quadratic model. The radius is then updated as:

where and are constants. The interpolation method uses the same interpolation mechanism as in the initialization to compute a new value for the trust-region radius.