References and Developments Based on Avi Perry's Conjugate Gradient Approach to Solving Non-Linear Unconstrained Mathematical Optimization Problems
Published in OPERATIONS RESEARCH
Vol. 26, No. 6, November-December 1978, pp. 1073-1078. DOI: 10.1287/opre.26.6.1073
This work has been referenced numerous times in the mathematical programming literature (books and research papers) and has been incorporated into commercial software packages, including Microsoft Excel.
The following is a SAMPLE of articles that propose extensions and expansions of Perry's algorithm.
Globally convergent modified Perry's conjugate gradient method
Authors: Ioannis E. Livieris, Panagiotis Pintelas
Symmetric Perry Conjugate Gradient Method
Dongyi Liu, Genqi Xu. Received: 11 May 2011 / Published online: 29 March 2013. © The Author(s) 2013.
This article is published with open access at Springerlink.com
ABSTRACT
A family of new conjugate gradient methods is proposed based on Perry's idea, which satisfies the descent property or the sufficient descent property for any line search. In addition, based on the scaling technique and the restarting strategy, a family of scaling symmetric Perry conjugate gradient methods with restarting procedures is presented. The memoryless BFGS method and the SCALCG method are special forms of the two families of new methods, respectively. Moreover, several concrete new algorithms are suggested. Under Wolfe line searches, the global convergence of the two families of new methods is proven by spectral analysis for uniformly convex functions and non-convex functions. Preliminary numerical comparisons with the CG_DESCENT and SCALCG algorithms show that these new algorithms are very effective for large-scale unconstrained optimization problems. Finally, a remark for further research is suggested.
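Note (added for reference): the Perry direction that these papers build on is usually written, with s_k = x_{k+1} - x_k, y_k = g_{k+1} - g_k and g_k = \nabla f(x_k), as

\[ d_{k+1} = -g_{k+1} + \frac{g_{k+1}^T (y_k - s_k)}{d_k^T y_k}\, d_k
           = -\Big( I - \frac{s_k y_k^T - s_k s_k^T}{y_k^T s_k} \Big) g_{k+1}. \]

The symmetric variants referred to in the abstract replace the non-symmetric matrix in parentheses by a symmetric one; the exact family of matrices is defined in the paper and is not reproduced here.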
A SPECTRAL VERSION OF PERRY'S CONJUGATE GRADIENT METHOD FOR NEURAL NETWORK TRAINING
D.G. Sotiropoulos, A.E. Kostopoulos, and T.N. Grapsa
4th GRACM Congress on Computational Mechanics
GRACM 2002
Patra, 27-29 June, 2002
©GRACM
University of Patras, Department of Mathematics,
Division of Computational Mathematics & Informatics,
GR-265 00 Rio, Patras, Greece.
e-mail: {dgs, arkostop, grapsa}@math.upatras.gr
Keywords: back propagation, supervised training, conjugate gradient methods, Perry's method, spectral steplength, non-monotone Wolfe conditions.
Abstract.
In this work, an efficient training algorithm for feedforward neural networks is presented. It is based on a scaled version of the conjugate gradient method suggested by Perry, which employs the spectral steplength of Barzilai and Borwein that contains second-order information without estimating the Hessian matrix. The learning rate is automatically adapted at each epoch, using the conjugate gradient values and the learning rate of the previous one. In addition, a new acceptability criterion for the learning rate is utilized based on non-monotone Wolfe conditions. The efficiency of the training algorithm is demonstrated on standard tests, including XOR, 3-bit parity, font learning and function approximation problems.
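For reference, the Barzilai-Borwein spectral steplength mentioned in this abstract is, in the standard notation s_{k-1} = x_k - x_{k-1}, y_{k-1} = g_k - g_{k-1},

\[ \alpha_k^{BB} = \frac{s_{k-1}^T s_{k-1}}{s_{k-1}^T y_{k-1}}
   \qquad \text{(or the companion choice } \alpha_k = \frac{s_{k-1}^T y_{k-1}}{y_{k-1}^T y_{k-1}} \text{)}. \]

Since s_{k-1}^T y_{k-1} = s_{k-1}^T \bar{H}_k s_{k-1} for the average Hessian \bar{H}_k along the step, this quantity carries curvature (second-order) information without ever forming the Hessian, which is the point made in the abstract. Which of the two choices the authors use is specified in the paper itself.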
A New Perry Conjugate Gradient Method with the Generalized Conjugacy Condition
Dongyi Liu, Yingfeng Shang
Dept. of Math., Tianjin Univ., Tianjin, China
This paper appears in: 2010 International Conference on Computational Intelligence and Software Engineering (CiSE)
Issue Date: 10-12 Dec. 2010
On page(s): 1 - 4
Location: Wuhan
Print ISBN: 978-1-4244-5391-7
INSPEC Accession Number: 11706744
Digital Object Identifier: 10.1109/CISE.2010.5677114
Date of Current Version: 30 December 2010
Abstract
The Perry conjugate gradient method is generalized, from which a new descent algorithm, PDCGy, is presented and its global convergence is proven under Wolfe line searches. Preliminary numerical results for a set of 720 unconstrained optimization test problems verify the performance of the algorithm and show that the PDCGy algorithm is competitive with the CG_DESCENT algorithm.
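Note (added for reference): Perry derived his direction from the conjugacy-type condition d_{k+1}^T y_k = -g_{k+1}^T s_k, which remains meaningful under inexact line searches. A common generalization, the Dai-Liao condition, introduces a parameter t >= 0:

\[ d_{k+1}^T y_k = -t\, g_{k+1}^T s_k, \]

recovering Perry's condition at t = 1 and classical conjugacy (d_{k+1}^T y_k = 0) at t = 0. The abstract does not state which generalized conjugacy condition PDCGy adopts, so this is offered only as the standard background form, not as the paper's definition.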
Self-scaled conjugate gradient training algorithms
Authors: A. E. Kostopoulos and T. N. Grapsa, University of Patras, Department of Mathematics, GR-265 04 Rio, Patras, Greece
Published in: Neurocomputing, Volume 72, Issues 13-15, August 2009
Elsevier Science Publishers B.V., Amsterdam, The Netherlands
DOI: 10.1016/j.neucom.2009.04.006
Abstract
This article presents some efficient training algorithms, based on conjugate gradient optimization methods. In addition to the existing conjugate gradient training algorithms, we introduce Perry's conjugate gradient method as a training algorithm [A. Perry, A modified conjugate gradient algorithm, Operations Research 26 (1978) 1073-1078]. Perry's method has been proven to be a very efficient method in the context of unconstrained optimization, but it has never been used in MLP training. Furthermore, a new class of conjugate gradient (CG) methods is proposed, called self-scaled CG methods, which are derived from the principles of Hestenes-Stiefel, Fletcher-Reeves, Polak-Ribiere and Perry's method. This class is based on the spectral scaling parameter introduced in [J. Barzilai, J.M. Borwein, Two point step size gradient methods, IMA Journal of Numerical Analysis 8 (1988) 141-148]. The spectral scaling parameter contains second-order information without estimating the Hessian matrix. Furthermore, we incorporate into the CG training algorithms an efficient line search technique based on the Wolfe conditions and on safeguarded cubic interpolation [D.F. Shanno, K.H. Phua, Minimization of unconstrained multivariate functions, ACM Transactions on Mathematical Software 2 (1976) 87-94]. In addition, the initial learning rate parameter, fed to the line search technique, is automatically adapted at each iteration by a closed formula proposed in [D.F. Shanno, K.H. Phua, Minimization of unconstrained multivariate functions, ACM Transactions on Mathematical Software 2 (1976) 87-94; D.G. Sotiropoulos, A.E. Kostopoulos, T.N. Grapsa, A spectral version of Perry's conjugate gradient method for neural network training, in: D.T. Tsahalis (Ed.), Fourth GRACM Congress on Computational Mechanics, vol. 1, 2002, pp. 172-179]. Finally, an efficient restarting procedure is employed in order to further improve the effectiveness of the CG training algorithms. Experimental results show that, in general, the new class of methods can perform better with a much lower computational cost and better success performance.
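For reference, the Wolfe conditions used as the line-search acceptance test here (and in several of the other papers sampled on this page) require a steplength \alpha_k > 0 satisfying

\[ f(x_k + \alpha_k d_k) \le f(x_k) + c_1 \alpha_k g_k^T d_k,
   \qquad \nabla f(x_k + \alpha_k d_k)^T d_k \ge c_2\, g_k^T d_k, \]

with constants 0 < c_1 < c_2 < 1 (c_1 = 10^{-4} and c_2 between 0.1 and 0.9 are typical choices). The cubic-interpolation safeguards and the closed-form initial steplength cited above are described in the referenced Shanno-Phua paper.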
A new efficient variable learning rate for Perry's spectral conjugate gradient training method (2004)
http://www.math.upatras.gr/~dgs/04/ic_scce1.pdf
by A. E. Kostopoulos, D. G. Sotiropoulos, and T. N. Grapsa. 1st International Conference “From Scientific Computing to Computational Engineering”, 8-10 September 2004.
Abstract:
Since the presentation of the backpropagation algorithm, several adaptive learning algorithms for training a multilayer perceptron (MLP) have been proposed. In a recent article, we have introduced an efficient training algorithm based on a nonmonotone spectral conjugate gradient. In particular, a scaled version of the conjugate gradient method suggested by Perry, which employs the spectral steplength of Barzilai and Borwein, was presented. The learning rate was automatically adapted at each epoch according to Shanno's technique, which exploits the information of conjugate directions as well as the previous learning rate. In addition, a new acceptability criterion for the learning rate was utilized based on nonmonotone Wolfe conditions. A crucial issue of these training algorithms is the learning rate adaptation. Various variable learning rate adaptations have been introduced in the literature to improve the convergence speed and avoid convergence to local minima. In this contribution, we incorporate in the previous training algorithm a new effective variable learning rate adaptation, which increases its efficiency. Experimental results in a set of standard benchmarks of MLP networks show that the proposed training algorithm improves the convergence speed and success percentage over a set of well known training algorithms.
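Note (added for reference): a standard way to write a nonmonotone acceptance test of the kind referred to here, following Grippo, Lampariello and Lucidi, is to relax the sufficient-decrease half of the Wolfe conditions to

\[ f(x_k + \alpha_k d_k) \le \max_{0 \le j \le m(k)} f(x_{k-j}) + c_1 \alpha_k g_k^T d_k,
   \qquad m(k) \le \min\{k, M\}, \]

while keeping the curvature condition unchanged; M is a bounded nonmonotone memory. The precise nonmonotone Wolfe test and the new learning-rate adaptation are given in the paper; the line above is only the common template.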
Convergence of the Nonmonotone Perry and Shanno Method for Optimization
Authors: Guanghui Liu, Department of Industrial Engineering and Management Sciences, Northwestern University, Evanston, Illinois 60208, USA; Lili Jing, College of Economics and Management, Beijing Forestry University, Beijing 100083, People's Republic of China
Published in: Computational Optimization and Applications, Volume 16, Issue 2, July 2000
Kluwer Academic Publishers, Norwell, MA, USA
Abstract
In this paper a new nonmonotone conjugate gradient method is introduced, which can be regarded as a generalization of the Perry and Shanno memoryless quasi-Newton method. For convex objective functions, the proposed nonmonotone conjugate gradient method is proved to be globally convergent. Its global convergence for non-convex objective functions has also been studied. Numerical experiments indicate that it is able to efficiently solve large-scale optimization problems.
Scaled conjugate gradient algorithms for unconstrained optimization
Neculai Andrei
Computational Optimization and Applications Volume 38, Number 3, 401-416, DOI: 10.1007/s10589-007-9055-7
Abstract
In this work we present and analyze a new scaled conjugate gradient algorithm and its implementation, based on an interpretation of the secant equation and on the inexact Wolfe line search conditions. The best spectral conjugate gradient algorithm SCG by Birgin and Martínez (2001), which is mainly a scaled variant of Perry's (1977), is modified in such a manner as to overcome the lack of positive definiteness of the matrix defining the search direction. This modification is based on the quasi-Newton BFGS updating formula. The computational scheme is embedded in the restart philosophy of Beale-Powell. The parameter scaling the gradient is selected as spectral gradient or in an anticipative manner by means of a formula using the function values in two successive points. In very mild conditions it is shown that, for strongly convex functions, the algorithm is globally convergent. Preliminary computational results, for a set consisting of 500 unconstrained optimization test problems, show that this new scaled conjugate gradient algorithm substantially outperforms the spectral conjugate gradient SCG algorithm.
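For reference, the secant equation interpreted in this work is B_{k+1} s_k = y_k (equivalently H_{k+1} y_k = s_k for an inverse-Hessian approximation), and the "spectral gradient" choice of the parameter scaling the gradient is the Barzilai-Borwein quantity

\[ \theta_{k+1} = \frac{s_k^T s_k}{s_k^T y_k}. \]

The alternative anticipative formula based on function values at two successive points is specific to the paper and is not reproduced here.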
Conjugate gradient methods using quasi-Newton updates with inexact line searches
Hanif D. Sherali (a) and Osman Ulular (b)
(a) Department of Industrial Engineering and Operations Research, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061-0118, USA
(b) AT&T Laboratories, Room 35B34, 2 Oak Way, Berkeley Heights, NJ 07922, USA
Received 17 February 1987. Submitted by Augustine O. Esogbue. Available online 29 June 2004.
Journal of Mathematical Analysis and Applications Volume 150, Issue 2, August 1990, Pages 359-377
Abstract
Conjugate gradient methods are conjugate direction or gradient deflection methods which lie somewhere between the method of steepest descent and Newton's method. Their principal advantage is that they do not require the storage of any matrices as in Newton's method, or as in quasi-Newton methods, and they are designed to converge faster than the method of steepest descent. Unlike quasi-Newton or variable-metric methods, these are fixed-metric methods in which the search direction at each iteration is based on an approximation to the inverse Hessian constructed by updating a fixed, symmetric, positive definite matrix, typically the identity matrix. The resulting approximation is usually not symmetric, although some variants force symmetry and hence derive memoryless quasi-Newton methods. In this paper, we present a scaled modified version of the conjugate gradient method suggested by Perry, which employs the quasi-Newton condition rather than conjugacy under inexact line searches, in order to derive the search directions. The analysis is extended to the memoryless quasi-Newton modification of this method, as suggested by Shanno. Computational experience on standard test problems indicates that the proposed method, along with Beale and Powell's restarts, improves upon existing conjugate gradient strategies.
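To make the generic computational pattern behind these papers concrete, the sketch below implements a plain Perry-direction conjugate gradient loop with a Wolfe line search (scipy.optimize.line_search). It is an editorial illustration only: the tolerance, safeguards and steepest-descent restart are arbitrary choices, and it is not the scaled, Beale-Powell-restarted algorithm of Sherali and Ulular or of Shanno.

import numpy as np
from scipy.optimize import line_search, rosen, rosen_der

def perry_cg(f, grad, x0, tol=1e-6, max_iter=500):
    # Minimal Perry-direction CG loop with a Wolfe line search (illustrative only).
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        alpha = line_search(f, grad, x, d, gfk=g)[0]
        if alpha is None:                      # line search failed: restart along -g
            d = -g
            alpha = line_search(f, grad, x, d, gfk=g)[0] or 1e-4
        x_new = x + alpha * d
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        denom = d @ y
        # Perry's beta: g_{k+1}^T (y_k - s_k) / (d_k^T y_k)
        beta = (g_new @ (y - s)) / denom if abs(denom) > 1e-12 else 0.0
        d = -g_new + beta * d
        if g_new @ d >= 0:                     # safeguard: keep a descent direction
            d = -g_new
        x, g = x_new, g_new
    return x

# Example: minimize the Rosenbrock function from the standard starting point.
x_star = perry_cg(rosen, rosen_der, [-1.2, 1.0])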
A Spectral Conjugate Gradient Method for Unconstrained Optimization
E. G. Birgin and J. M. Martínez
Applied Mathematics & Optimization Volume 43, Number 2, 117-128, DOI: 10.1007/s00245-001-0003-0
Abstract
A family of scaled conjugate gradient algorithms for large-scale unconstrained minimization is defined. The Perry, the Polak-Ribière and the Fletcher-Reeves formulae are compared using a spectral scaling derived from Raydan's spectral gradient optimization method. The best combination of formula, scaling and initial choice of step-length is compared against well known algorithms using a classical set of problems. An additional comparison involving an ill-conditioned estimation problem in Optics is presented.
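Note (added for reference): in the scaled family studied here the search direction has the form d_{k+1} = -\theta_{k+1} g_{k+1} + \beta_k s_k, where \theta_{k+1} is the spectral (Raydan/Barzilai-Borwein) parameter; the Perry-type member is usually quoted with

\[ \beta_k = \frac{(\theta_{k+1} y_k - s_k)^T g_{k+1}}{y_k^T s_k}, \]

but readers should consult the paper for the authors' exact conventions and for the Polak-Ribière and Fletcher-Reeves counterparts.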
A scaled nonlinear conjugate gradient algorithm for unconstrained optimization
Author: Andrei, Neculai
Source: Optimization, Volume 57, Number 4, January 2008, pp. 549-570
Publisher: Taylor and Francis Ltd
Abstract:
The best spectral conjugate gradient algorithm by Birgin and Martinez (Birgin, E. and Martinez, J.M., 2001, A spectral conjugate gradient method for unconstrained optimization. Applied Mathematics and Optimization, 43, 117-128), which is mainly a scaled variant of Perry's (Perry, A., 1977, A class of conjugate gradient algorithms with a two-step variable metric memory, Discussion Paper 269, Center for Mathematical Studies in Economics and Management Science, Northwestern University), is modified in such a way as to overcome the lack of positive definiteness of the matrix defining the search direction. This modification is based on the quasi-Newton BFGS updating formula. The computational scheme is embedded into the restart philosophy of Beale-Powell. The parameter scaling the gradient is selected as spectral gradient or in an anticipative way by means of a formula using the function values in two successive points. In very mild conditions it is shown that, for strongly convex functions, the algorithm is globally convergent. Computational results and performance profiles for a set consisting of 700 unconstrained optimization problems show that this new scaled nonlinear conjugate gradient algorithm substantially outperforms known conjugate gradient methods, including: the spectral conjugate gradient SCG by Birgin and Martinez, the scaled Fletcher and Reeves, the Polak and Ribiere algorithms, and CONMIN (Shanno, D.F. and Phua, K.H., 1976, Algorithm 500, Minimization of unconstrained multivariate functions. ACM Transactions on Mathematical Software, 2, 87-94).
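For reference, the quasi-Newton BFGS updating formula invoked here is, for an inverse-Hessian approximation H_k and \rho_k = 1/(y_k^T s_k),

\[ H_{k+1} = (I - \rho_k s_k y_k^T)\, H_k\, (I - \rho_k y_k s_k^T) + \rho_k s_k s_k^T, \]

which preserves symmetry and positive definiteness whenever y_k^T s_k > 0 (a condition guaranteed by the Wolfe line search). How this update is combined with the scaled Perry direction to restore positive definiteness is detailed in the paper.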
Conjugate Gradient Methods with Inexact Searches
David F. Shanno
Mathematics of Operations Research, Vol. 3, No. 3 (August 1978), pp. 244-256
Abstract
Conjugate gradient methods are iterative methods for finding the minimizer of a scalar function f(x) of a vector variable x which do not update an approximation to the inverse Hessian matrix. This paper examines the effects of inexact linear searches on the methods and shows how the traditional Fletcher-Reeves and Polak-Ribiere algorithms may be modified in a form discovered by Perry [A. Perry, A modified conjugate gradient algorithm, Operations Research 26 (1978) 1073-1078] to a sequence which can be interpreted as a memoryless BFGS algorithm. This algorithm may then be scaled optimally in the sense of Oren and Spedicato. This scaling can be combined with Beale restarts and Powell's restart criterion. Computational results show that this new method substantially outperforms known conjugate gradient methods on a wide class of problems.
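Note (added for reference): the memoryless BFGS direction referred to in this abstract is obtained by applying the BFGS inverse update to the identity matrix at every iteration, which gives

\[ d_{k+1} = -g_{k+1} + \frac{s_k^T g_{k+1}}{y_k^T s_k}\, y_k
   + \left( \frac{y_k^T g_{k+1}}{y_k^T s_k}
   - \Big(1 + \frac{y_k^T y_k}{y_k^T s_k}\Big) \frac{s_k^T g_{k+1}}{y_k^T s_k} \right) s_k. \]

This is Perry's (non-symmetric) update completed to the symmetric, positive definite BFGS form; the optimal Oren-Spedicato scaling and the Beale/Powell restarts are then applied on top of it, as described in the paper.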
Spectral conjugate gradient methods with sufficient descent property for large-scale unconstrained optimization
Authors: Gaohang Yu and Lutai Guan, Department of Scientific Computation and Computer Applications, Sun Yat-Sen University, Guangzhou, People's Republic of China; Wufan Chen, School of Biomedical Engineering, Southern Medical University, Guangzhou, People's Republic of China
Published in: Optimization Methods & Software (issue dedicated to Professor Michael J.D. Powell on the occasion of his 70th birthday), Volume 23, Issue 2, April 2008
Taylor & Francis, Inc. Bristol, PA, USA
Abstract
A class of new spectral conjugate gradient methods is proposed in this paper. First, we modify the spectral Perry conjugate gradient method, which is the best spectral conjugate gradient algorithm SCG by Birgin and Martinez [E.G. Birgin and J.M. Martinez, A spectral conjugate gradient method for unconstrained optimization, Appl. Math. Optim. 43 (2001), 117-128], such that it possesses the sufficient descent property for any (inexact) line search. It is shown that, for strongly convex functions, the method is globally convergent. Further, a global convergence result for nonconvex minimization is established when the line search fulfils the Wolfe line search conditions. Some other spectral conjugate gradient methods with guaranteed descent are presented here. Numerical comparisons are given with both the SCG and CG_DESCENT methods using the unconstrained optimization problems in the CUTE library.
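For reference, the sufficient descent property sought in this work means that the generated directions satisfy

\[ g_k^T d_k \le -c\, \|g_k\|^2 \quad \text{for all } k \text{ and some constant } c > 0, \]

independently of the (possibly inexact) line search; this keeps every d_k a genuine descent direction and is the key ingredient in the global convergence arguments mentioned above.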
Conjugate Gradient Algorithms in Nonconvex Optimization, a book by Radoslaw Pytlak