Data article
Title: Data and performance profiles applying an adaptive truncation criterion, within linesearch-based truncated Newton methods, in large scale nonconvex optimization

In this paper, we report data and experiments related to the research article entitled “An adaptive truncation criterion, for linesearch-based truncated Newton methods in large scale nonconvex optimization” by Caliciotti et al. [1]. In particular, in [1], large scale unconstrained optimization problems are considered by applying linesearch-based truncated Newton methods. In this framework, a key point is the reduction of the number of inner iterations needed, at each outer iteration, to approximately solve the Newton equation. A novel adaptive truncation criterion is introduced in [1] to this aim. Here, we report the details of the numerical experiments over a commonly used test set, namely CUTEst [2]. Moreover, comparisons are reported in terms of performance profiles [3], adopting different parameter settings. Finally, our linesearch-based scheme is compared with a renowned trust region method, namely TRON [4].

Specifications Table
Subject area: Operations Research and Management Science
More specific subject area: Nonlinear Optimization
Type of data: Table, graph
How data was acquired: http://www.cuter.rl.ac.uk/, experimental output data
Data format: Raw and filtered
Experimental factors: None
Experimental features: Different codes have been tested over the CUTEst test set; then, comparisons among their performance are provided in terms of performance profiles
Data accessibility: Test problems available at http://www.cuter.rl.ac.uk/. Complete output data available upon request from the authors

Value of the data
• Output data reported represent a significant benchmark for future comparisons, among different algorithms for large scale unconstrained optimization.
• Output data may be used by other researchers for tuning novel strategies, within truncated Newton methods.
• Output data illuminate the comparison between the linesearch and the trust region approaches, as globalization methods.

1. Data
Data from different experimental settings are reported, along with performance profiles, which highlight the advantages of adopting the proposal in [1]. Performance profiles [3] are commonly recommended in the Nonlinear Optimization community, since they summarize in one plot the comparison among several codes over an entire test set. We obtain such profiles after filtering the test set from the CUTEst collection, in order to guarantee a fair comparison among different codes. In particular, for any test problem, we declare that a code fails to solve the problem whenever (i) a given stopping criterion is not satisfied within 100000 outer iterations, or (ii) the CPU time exceeds 900 seconds. Moreover, in comparing any two algorithms, we consider only those problems where the algorithms converge to the same stationary point. This is checked by using the test (see [5]) $|f_1^* - f_2^*| \le 10 \min\{|f_1^*|, |f_2^*|\} + 10$, where $f_1^*$ and $f_2^*$ are the optimal function values obtained by the two algorithms. Finally, we discarded all the test problems where the compared algorithms required a CPU time below 0.1 seconds to solve them.

2. Experimental Design, Materials and Methods
In order to assess the Adaptive Truncation Criterion proposed in [1] (named ATC), we consider a standard implementation of a truncated Newton method, namely the linesearch-based truncated scheme described in [6]. Inner iterations are performed using the Conjugate Gradient (CG) method. The novel criterion ATC is adopted in order to avoid over-solving the Newton equation at each outer iteration. In the ATC scheme (see [1]) the maximum number of CG inner iterations allowed at the k-th outer iteration (max_it_k) is initialized to n, and then adaptively adjusted according to ATC. As regards the parameters in the ATC scheme, we set $\gamma_1 = 10$, $\gamma_2 = 10$, $\sigma_1 = 2$, $\sigma_2 = 1.1$, $\sigma_3 = 0.2$, $\theta_1 = 10$, $\theta_2 = 10$. This choice is suggested by a preliminary coarse tuning on the chosen test set.
Moreover, since we tested ATC both within the unpreconditioned and the preconditioned framework proposed in [6], the value of the parameter l is set to 7, in order to allow the construction of an effective preconditioner (see also the discussion about the choice of the parameter $h_{\max}$ in [6]).

The algorithms were coded in FORTRAN 90 and compiled with the GFortran compiler under Linux Ubuntu 14.04. The stopping criterion for the outer iterations is the standard one given by $\|g_k\| \le 10^{-5} \max\{1, \|x_k\|\}$, where $x_k$ denotes the k-th iterate, $g_k$ indicates the gradient of the objective function at $x_k$ and $\|\cdot\|$ stands for the Euclidean norm. As regards the set of test problems, we selected all the unconstrained convex and nonconvex large problems available in the CUTEst collection [2], and when a problem is of variable dimension, we considered two different dimensions (usually 1000 and 10000 variables). The resulting test set consists of 112 problems.

As regards the stopping criterion for the CG inner iterations, we tested both the criteria reported in Section 2 of [1]: a) the residual-based criterion; b) the quadratic model reduction-based criterion. Since the criterion a) with $\eta_k = \min\{1/k, \|g_k\|\}$ proved to yield poorer performance in practice, we preferred to use the more reliable residual-based criterion adopted in [6]. This criterion sets $\eta_k = \max\{\|g_k\|, \sqrt{\|g_k\|^3}\} \min\{\sqrt{n}/k, \|g_k\|\}$, which both takes into account the size n of the problem and allows a coarser solution when far from a stationary point. The criterion b) adopts $\eta_k = 0.5$, as suggested in [7].

In the sequel we adopt the following terminology:
• ATC-true stands for algorithms which use the ATC scheme;
• ATC-false stands for algorithms which do not use the ATC scheme.

2.1 Choice of $C_k$ in the ATC scheme
Two different formulae were adopted for the parameter $C_k$ in [1]:
$C_k = \min\{1, |f(x_k)|\}$; (1)
$C_k = \max\{1, |f(x_k)|\}$. (2)
Figures 1-3 report performance profiles of the comparison among schemes where our proposal is adopted, with the two choices (1) and (2) for $C_k$.
Figure 1: Unpreconditioned truncated Newton method using the residual-based criterion a) with ATC-true: the choice of $C_k$ in (1) (solid line) vs. the choice of $C_k$ in (2) (dashed line), in terms of CG inner iterations.
Figure 2: Unpreconditioned truncated Newton method using the residual-based criterion a) with ATC-true: the choice of $C_k$ in (1) (solid line) vs. the choice of $C_k$ in (2) (dashed line), in terms of function evaluations.
Figure 3: Unpreconditioned truncated Newton method using the residual-based criterion a) with ATC-true: the choice of $C_k$ in (1) (solid line) vs. the choice of $C_k$ in (2) (dashed line), in terms of CPU time.

2.2 Numerical comparisons among different truncated Newton schemes
Figures 4-7 report performance profiles of the comparison between the two algorithmic choices ATC-true vs. ATC-false, where the residual-based criterion a) is adopted in the unpreconditioned and preconditioned cases.
Figure 4: Unpreconditioned truncated Newton method using the residual-based criterion a): comparison ATC-true vs. ATC-false, in terms of CG inner iterations.
Figure 5: Unpreconditioned truncated Newton method using the residual-based criterion a): comparison ATC-true vs. ATC-false, in terms of CPU time.
Figure 6: Preconditioned truncated Newton method using the residual-based criterion a): comparison ATC-true vs. ATC-false, in terms of CG inner iterations.
Figure 7: Preconditioned truncated Newton method using the residual-based criterion a): comparison ATC-true vs. ATC-false, in terms of CPU time.
Figures 8-9 refer to the comparison, in terms of CPU time, between the adoption of the residual-based criterion a) and the quadratic model reduction-based criterion b) in the algorithm which uses ATC in the unpreconditioned and preconditioned cases.
Figure 8: Unpreconditioned truncated Newton method: comparison between the residual-based criterion a) with ATC-true and the quadratic model reduction-based criterion b), in terms of CPU time.
Figure 9: Preconditioned truncated Newton method: comparison between the residual-based criterion a) with ATC-true and the quadratic model reduction-based criterion b), in terms of CPU time.

2.3 Comparison with a trust region approach
Figures 10-12 report performance profiles of the comparison between our proposal of a truncated Newton method, where ATC is adopted (ATC-true), and the trust region-based code TRON [4].
Figure 10: Comparison between preconditioned truncated Newton method with the residual-based criterion a) and ATC-true vs. TRON, in terms of number of function evaluations. Abscissa axis is in logarithmic scale.
Figure 11: Comparison between preconditioned truncated Newton method with the residual-based criterion a) and ATC-true vs. TRON, in terms of CG inner iterations. Abscissa axis is in logarithmic scale.
Figure 12: Comparison between preconditioned truncated Newton method with the residual-based criterion a) and ATC-true vs. TRON, in terms of CPU time. Abscissa axis is in logarithmic scale.

Table 1 reports comparisons among the outputs of different versions of TRON and our proposals, on a selection of test problems.
Table 1: This table reports the detailed output for all the problems where at least one of the algorithms fails to converge. On problem FLETCBV3 the algorithms converge towards different points, so that the outputs obtained are not comparable.

The output data reported show how the use of the Adaptive Truncation Criterion proposed in [1] makes it possible to efficiently address the problem of “over-solving” the Newton equation, within linesearch-based truncated Newton methods. The adoption of this criterion could have important implications for future implementations of such methods, for solving large scale unconstrained optimization problems.
Indeed, it leads to a noticeable reduction of the CG inner iterations, that is, to significant savings in the overall computational burden.

Acknowledgements
The work of G. Fasano is partially supported by the Italian Flagship Project RITMARE, coordinated by the Italian National Research Council


Data
Data from different experimental settings are reported, along with performance profiles, which highlight the advantages of adopting the proposal in [1]. Performance profiles [3] are commonly recommended in the Nonlinear Optimization community, since they summarize in one plot the comparison among several codes over an entire test set. We obtain such profiles after filtering the test set from the CUTEst collection, in order to guarantee a fair comparison among different codes. In particular, for any test problem, we declare that a code fails to solve the problem whenever (i) a given stopping criterion is not satisfied within 100,000 outer iterations, or (ii) the CPU time exceeds 900 s. Moreover, in comparing any two algorithms, we consider only those problems where the algorithms converge to the same stationary point. This is checked by using the test (see [5]) $|f_1^* - f_2^*| \le 10 \min\{|f_1^*|, |f_2^*|\} + 10$, where $f_1^*$ and $f_2^*$ are the optimal function values obtained by the two algorithms. Finally, we discarded all the test problems where the compared algorithms required a CPU time below 0.1 s to solve them.
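The filtering rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Run` record, the function names, and the arguments `rel_tol` and `abs_tol` are hypothetical, and the two tolerance constants of the test from [5] are left as arguments rather than hard-coded. The sketch also assumes that a problem on which one code fails stays in the comparison and is counted as a failure for that code.

```python
from dataclasses import dataclass

MAX_OUTER_ITERS = 100_000  # failure rule (i)
MAX_CPU_SECONDS = 900.0    # failure rule (ii)
MIN_CPU_SECONDS = 0.1      # problems solved faster than this are discarded


@dataclass
class Run:
    """Hypothetical record of one code on one test problem."""
    converged: bool     # True if the stopping criterion was satisfied
    outer_iters: int
    cpu_seconds: float
    f_star: float       # final objective function value


def failed(run: Run) -> bool:
    """A run fails if it did not converge within the iteration or time limits."""
    return (not run.converged
            or run.outer_iters > MAX_OUTER_ITERS
            or run.cpu_seconds > MAX_CPU_SECONDS)


def same_stationary_point(f1: float, f2: float,
                          rel_tol: float, abs_tol: float) -> bool:
    """Test |f1 - f2| <= rel_tol * min(|f1|, |f2|) + abs_tol."""
    return abs(f1 - f2) <= rel_tol * min(abs(f1), abs(f2)) + abs_tol


def keep_for_comparison(a: Run, b: Run,
                        rel_tol: float, abs_tol: float) -> bool:
    """Decide whether a problem enters the comparison of two codes."""
    if a.cpu_seconds < MIN_CPU_SECONDS and b.cpu_seconds < MIN_CPU_SECONDS:
        return False  # too easy for both codes
    if failed(a) or failed(b):
        return True   # assumption: failures stay in the profile
    return same_stationary_point(a.f_star, b.f_star, rel_tol, abs_tol)
```

Calling `keep_for_comparison` once per problem and per pair of codes gives the filtered set on which a profile is drawn.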

Experimental design, materials and methods
In order to assess the Adaptive Truncation Criterion proposed in [1] (named ATC), we consider a standard implementation of a truncated Newton method, namely the linesearch-based truncated scheme described in [6]. Inner iterations are performed using the Conjugate Gradient (CG) method. The novel criterion ATC is adopted in order to avoid over-solving the Newton equation at each outer iteration. In the ATC scheme (see [1]) the maximum number of CG inner iterations allowed at the k-th outer iteration (max_it_k) is initialized to n, and then adaptively adjusted according to ATC. As regards the parameters in the ATC scheme, we set $\gamma_1 = 10$, $\gamma_2 = 10$, $\sigma_1 = 2$, $\sigma_2 = 1.1$, $\sigma_3 = 0.2$, $\theta_1 = 10$, $\theta_2 = 10$. This choice is suggested by a preliminary coarse tuning on the chosen test set. Moreover, since we tested ATC both within the unpreconditioned and the preconditioned framework proposed in [6], the value of the parameter l is set to 7, in order to allow the construction of an effective preconditioner (see also the discussion about the choice of the parameter $h_{\max}$ in [6]).
The algorithms were coded in FORTRAN 90 and compiled with the GFortran compiler under Linux Ubuntu 14.04. The stopping criterion for the outer iterations is the standard one given by $\|g_k\| \le 10^{-5} \max\{1, \|x_k\|\}$, where $x_k$ denotes the k-th iterate, $g_k$ indicates the gradient of the objective function at $x_k$ and $\|\cdot\|$ stands for the Euclidean norm. As regards the set of test problems, we selected all the unconstrained convex and nonconvex large problems available in the CUTEst collection [2], and when a problem is of variable dimension, we considered two different dimensions (usually 1000 and 10,000 variables). The resulting test set consists of 112 problems.
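As a small illustration, the outer stopping test $\|g_k\| \le 10^{-5} \max\{1, \|x_k\|\}$ can be written as follows. The function names are hypothetical and the sketch uses plain Python lists instead of the FORTRAN 90 arrays of the actual codes.

```python
import math
from typing import Sequence


def euclidean_norm(v: Sequence[float]) -> float:
    """Euclidean norm of a vector given as a sequence of floats."""
    return math.sqrt(sum(c * c for c in v))


def outer_converged(x: Sequence[float], g: Sequence[float],
                    tol: float = 1e-5) -> bool:
    """Check ||g|| <= tol * max(1, ||x||) at the current iterate x."""
    return euclidean_norm(g) <= tol * max(1.0, euclidean_norm(x))
```

The factor $\max\{1, \|x_k\|\}$ makes the test relative for iterates of large norm: a gradient of norm 0.005 passes at an iterate of norm 1000 but not at the origin.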
As regards the stopping criterion for the CG inner iterations, we tested both the criteria reported in Section 2 of [1]: a) the residual-based criterion; b) the quadratic model reduction-based criterion. Since the criterion a) with $\eta_k = \min\{1/k, \|g_k\|\}$ proved to yield poorer performance in practice, we preferred to use the more reliable residual-based criterion adopted in [6]. This criterion sets $\eta_k = \max\{\|g_k\|, \sqrt{\|g_k\|^3}\} \min\{\sqrt{n}/k, \|g_k\|\}$, which both takes into account the size n of the problem and allows a coarser solution when far from a stationary point. The criterion b) adopts $\eta_k = 0.5$, as suggested in [7].
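To make the role of $\eta_k$ concrete, the sketch below evaluates the simplest choice, $\eta_k = \min\{1/k, \|g_k\|\}$, and applies a residual-based truncation test. The test is assumed here to have the usual inexact Newton form $\|r\| \le \eta_k \|g_k\|$, with $r$ the residual of the Newton equation; the exact statement is the one in Section 2 of [1]. Function names are hypothetical.

```python
def forcing_term(k: int, grad_norm: float) -> float:
    """eta_k = min{1/k, ||g_k||} for the outer iteration index k >= 1."""
    if k < 1:
        raise ValueError("outer iteration index k must be >= 1")
    return min(1.0 / k, grad_norm)


def residual_test(residual_norm: float, grad_norm: float, eta: float) -> bool:
    """Assumed truncation rule: stop the CG iterations once ||r|| <= eta * ||g_k||."""
    return residual_norm <= eta * grad_norm
```

With this choice $\eta_k$ shrinks both as k grows and as the gradient norm decreases, so the Newton equation is solved more accurately near a stationary point.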
In the sequel we adopt the following terminology:
• ATC-true stands for algorithms which use the ATC scheme;
• ATC-false stands for algorithms which do not use the ATC scheme.

Choice of $C_k$ in the ATC scheme
Two different formulae were adopted for the parameter $C_k$ in [1]:
$C_k = \min\{1, |f(x_k)|\}$; (1)
$C_k = \max\{1, |f(x_k)|\}$. (2)
Figs. 1-3 report performance profiles of the comparison among schemes where our proposal is adopted, with the two choices (1) and (2) for $C_k$.
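The two choices, $C_k = \min\{1, |f(x_k)|\}$ and $C_k = \max\{1, |f(x_k)|\}$, differ only in which side of 1 they clip to. A minimal sketch, with hypothetical function names:

```python
def c_k_min(f_value: float) -> float:
    """Choice (1): C_k = min{1, |f(x_k)|}, never larger than 1."""
    return min(1.0, abs(f_value))


def c_k_max(f_value: float) -> float:
    """Choice (2): C_k = max{1, |f(x_k)|}, never smaller than 1."""
    return max(1.0, abs(f_value))
```

For $f(x_k) = 0.01$ the two choices give 0.01 and 1, while for $f(x_k) = -250$ they give 1 and 250; they coincide only when $|f(x_k)| = 1$.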

Numerical comparisons among different truncated Newton schemes
Figs. 4-7 report performance profiles of the comparison between the two algorithmic choices ATC-true vs. ATC-false, where the residual-based criterion a) is adopted in the unpreconditioned and preconditioned cases.
Figs. 8 and 9 refer to the comparison, in terms of CPU time, between the adoption of the residual-based criterion a) and the quadratic model reduction-based criterion b) in the algorithm which uses ATC in the unpreconditioned and preconditioned cases.
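For readers who wish to rebuild such plots from raw output data, the following sketch computes a performance profile under the standard construction, assumed here to be the one of [3]. For each problem p and code s, the ratio $r_{p,s}$ is the cost of s divided by the best cost on p, and the profile value $\rho_s(\tau)$ is the fraction of problems with $r_{p,s} \le \tau$. The function name and the data layout are hypothetical; costs (CG iterations, function evaluations or CPU time) are assumed strictly positive, and a failure is encoded as `math.inf`.

```python
import math
from typing import Dict, List, Sequence


def performance_profile(costs: Dict[str, Sequence[float]],
                        taus: Sequence[float]) -> Dict[str, List[float]]:
    """Return rho_s(tau) for every code s and every tau in taus.

    costs[s][p] is the cost of code s on problem p (math.inf on failure).
    """
    solvers = list(costs)
    n_problems = len(costs[solvers[0]])
    # Best (smallest) cost achieved on each problem by any code.
    best = [min(costs[s][p] for s in solvers) for p in range(n_problems)]

    profile = {}
    for s in solvers:
        ratios = []
        for p in range(n_problems):
            if math.isfinite(costs[s][p]) and math.isfinite(best[p]):
                ratios.append(costs[s][p] / best[p])
            else:
                ratios.append(math.inf)  # failure: never within any tau
        profile[s] = [sum(r <= tau for r in ratios) / n_problems
                      for tau in taus]
    return profile
```

The value at $\tau = 1$ is the fraction of problems on which a code is the best one, and the limit for large $\tau$ is the fraction of problems it solves.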

Comparison with a trust region approach
Figs. 10-12 report performance profiles of the comparison between our proposal of a truncated Newton method, where ATC is adopted (ATC-true), and the trust region-based code TRON [4]. Table 1 reports comparisons among the outputs of different versions of TRON and our proposals, on a selection of test problems.
The output data reported show how the use of the Adaptive Truncation Criterion proposed in [1] makes it possible to efficiently address the problem of "over-solving" the Newton equation, within linesearch-based truncated Newton methods. The adoption of this criterion could have important implications for future implementations of such methods, for solving large scale unconstrained optimization problems. Indeed, it leads to a noticeable reduction of the CG inner iterations, that is, to significant savings in the overall computational burden.

Table 1 (column headers only): the columns distinguish the stopping criteria $\|g_k\| \le 10^{-5}$ and $\|g_k\| \le 10^{-5} \max\{1, \|x_k\|\}$, and the ATC-true and ATC-false variants.