A Direct Integral Pseudospectral Method for Solving a Class of Infinite-Horizon Optimal Control Problems Using Gegenbauer Polynomials and Certain Parametric Maps

We present a novel direct integral pseudospectral (PS) method (a direct IPS method) for solving a class of continuous-time infinite-horizon optimal control problems (IHOCs). The method transforms the IHOCs into finite-horizon optimal control problems (FHOCs) in their integral forms by means of certain parametric mappings, which are then approximated by finite-dimensional nonlinear programming problems (NLPs) through rational collocations based on Gegenbauer polynomials and Gegenbauer-Gauss-Radau (GGR) points. The paper also analyzes the interplay between the parametric maps, barycentric rational collocations based on Gegenbauer polynomials and GGR points, and the convergence properties of the collocated solutions for IHOCs. Some novel formulas for the construction of the rational interpolation weights and the GGR-based integration and differentiation matrices in barycentric-trigonometric forms are derived. A rigorous study of the error and convergence of the proposed method is presented. A stability analysis based on the Lebesgue constant for GGR-based rational interpolation is investigated. Two easy-to-implement pseudocodes of computational algorithms for computing the barycentric-trigonometric rational weights are described. Two illustrative test examples are presented to support the theoretical results. We show that the proposed collocation method, leveraged with a fast and accurate NLP solver, converges exponentially to near-optimal approximations for a coarse collocation mesh grid size. The paper also shows that typical direct spectral/PS- and IPS-methods based on classical Jacobi polynomials and certain parametric maps usually diverge as the number of collocation points grows large, if the computations are carried out using floating-point arithmetic and the discretizations use a single mesh grid, whether of Gauss/Gauss-Radau (GR) type or equally spaced.


Introduction
Arguably, one of the most impactful numerical methods for solving continuous-time optimal control problems (CTOCPs) in the 20th century has been the class of direct pseudospectral (PS) methods, which can accurately reduce CTOCPs into optimization problems of standard forms that can be easily treated using typical optimization methods. The key to the success of these methods lies in their ability to converge to sufficiently smooth solutions with exponential rates using relatively coarse mesh grids. PS methods are considered to be "one of the biggest technologies for solving PDEs"; they were largely developed about half a century ago, since the pioneering works of Orszag [1] and Patterson Jr and Orszag [2], and they have been continuously refined and extended in later decades to solve many problems in various scientific areas that were only tractable by these techniques. A key practical property of spectral/PS expansions is that the fewer the number of terms required, the more efficient the method is in terms of speed and computational complexity. In applications like CTOCPs, this property leads to small-scale optimization problems which can be solved very quickly with reduced computational work at a concrete level using modern optimization software [16,46,47]. With this being mentioned, it is important to realize that Chebyshev and Legendre polynomials are usually optimal for large spectral/PS expansions under the Chebyshev and Euclidean norms, respectively, but they are not necessarily optimal for a small/medium range, an observation that was proven numerically in a number of papers for certain polynomial and rational interpolations and collocations; cf. [5,14,15,48-51]. This gives us another reason to apply Gegenbauer polynomials as basis polynomials that may provide faster convergence rates.
(v) A stability analysis conducted in [17], grounded in the Lebesgue constant for polynomial interpolations in Lagrange-basis form based on flipped-GGR (FGGR) points, showed that the Lebesgue constant is not minimal for Chebyshev polynomials but rather for Gegenbauer polynomials associated with negative α-values. This analysis demonstrated beyond doubt that some Gegenbauer polynomials with negative α-values can be more suitable to employ in basis-form polynomial interpolation/collocation for a short/medium range of mesh grid sizes. This observation is consistent with an earlier work of Light [52], who proved in the late 1970s that the Chebyshev and Legendre projection operators cannot be minimal, as the norms of Gegenbauer projection operators increase monotonically with α for small expansions.
In light of the above arguments, we are motivated in this work to develop a novel direct IPS method for solving IHOCs using Gegenbauer polynomials and to study its convergence. To this end, we derive some accurate and numerically stable GGR-based rational interpolation formulas and describe two computational algorithms for constructing them. We also show how to derive the associated quadrature formulas required for numerical integration in time. We then use these numerical instruments to approximate the optimal state and control variables after transforming the IHOC into a FHOC in integral form (FHOCI) by means of certain parametric maps and rational collocation. In the course of the presentation, we also investigate a number of interesting questions relevant to our work. For instance, which parametric map is more suitable for GGR-based rational collocations? How should we choose the Gegenbauer parameter values to carry out collocations in practice? A "poor" choice of α can largely ruin the accuracy of the numerical scheme, while a "good" choice can in many cases furnish superb approximations with higher accuracy than those enjoyed by Chebyshev and Legendre polynomials for sufficiently smooth functions using relatively coarse mesh grids! Do PS- and IPS-methods based on Chebyshev, Legendre, and Gegenbauer polynomials generally converge to the solutions of IHOCs for large mesh sizes? If they do not, what are the causes? Through rigorous stability, error, and convergence analyses, we prove that such methods often converge with an exponential rate to near-exact solutions using relatively small mesh grids, but they usually diverge for fine meshes under certain parametric maps.
The rest of the article is organized as follows: Sections 2 and 3 describe the IHOC under study and its transformation into a FHOCI via various parametric maps. Section 4 presents the discretization scheme of the FHOCI, passing through the construction of the needed barycentric rational interpolants and their quadratures, and closing with a setup of the IPS rational collocation at the GGR points in Sections 4.1-4.3. Section 4.1.1 is devoted to analyzing the stability and sensitivity of the GGR-based rational interpolation/collocation developed in this paper. The necessary optimality conditions of the NLP obtained through IPS rational collocation are derived in Section 5. Rigorous error and convergence analyses are conducted in Section 6. Some divergence results of typical IPS collocation schemes of the FHOCI for fine meshes of Gauss type using certain parametric maps are derived in Section 6.1. Simulation results are shown in Section 7, followed by some conclusions and future work in Section 8. The derivation of the barycentric rational formulas necessary for constructing the GGR-based differentiation matrix is shown in Appendix A. Two easy-to-implement pseudocodes of computational algorithms for computing the barycentric weights of our new rational interpolation method are described in Appendix B.

The Problem Statement
Consider the following nonlinear, autonomous control system of ordinary differential equations

ẋ(t) = f (x(t), u(t)), t ∈ [0, ∞), (2.1)

subject to the system of initial conditions

x(0) = x_0, (2.2)

where x(t) = [x_1(t), x_2(t), . . . , x_{n_x}(t)]^t ∈ R^{n_x}, u(t) = [u_1(t), u_2(t), . . . , u_{n_u}(t)]^t ∈ R^{n_u}, x_0 = [x_{1,0}, x_{2,0}, . . . , x_{n_x,0}]^t ∈ R^{n_x} is a constant specified vector, and f : R^{n_x} × R^{n_u} → R^{n_x}. The problem is to find the optimal control u(t) and the corresponding state trajectory x(t) on the semi-infinite domain [0, ∞) that satisfy Eqs. (2.1) and (2.2) while minimizing the functional

J = ∫_0^∞ g (x(t), u(t)) dt, (2.3)

where g : R^{n_x} × R^{n_u} → R. We assume that f and g are generally nonlinear, continuously differentiable functions with respect to their arguments, and that the nonlinear IHOC (2.1)-(2.3) has a unique solution. In the rest of the article, for any row/column vector y = (y_i)_{1≤i≤n} with y_i ∈ R ∀i and real-valued function h : Ω ⊆ R → R, the notation h(y) stands for a vector of the same size and structure as y such that h(y_i) is the ith element of h(y). Moreover, by h(y), we mean [h_1(y), . . . , h_m(y)]^t, for any m-dimensional column vector function h, with the realization that the definition of each array h_i(y) follows the former notation rule for each i.
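Collecting the pieces above, the IHOC (2.1)-(2.3) can be stated compactly as follows (a restatement consistent with the surrounding definitions, not a quotation of the original displays):

```latex
\min_{u(\cdot)} \quad J = \int_0^{\infty} g\bigl(x(t), u(t)\bigr)\,dt
\qquad \text{subject to} \qquad
\dot{x}(t) = f\bigl(x(t), u(t)\bigr), \quad t \in [0,\infty), \qquad x(0) = x_0,
```

with f : R^{n_x} × R^{n_u} → R^{n_x} and g : R^{n_x} × R^{n_u} → R as above.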

Transformation of the IHOC into a FHOC
Given a differentiable, strictly monotonic mapping T : [−1, 1) → [0, ∞) defined by t = T (τ), one can transform the IHOC (2.1)-(2.3) into the following FHOC: To take advantage later of the well-conditioning of numerical integration operators during the collocation phase, we rewrite Eq. (3.1b) in its integral formulation. We refer to the FHOC described by Eqs. (3.1a), (3.1c), and (3.1d) as the FHOCI. A wide variety of defining formulas exist for the mapping T. Five common defining formulas for such a mapping that occur in the literature are as follows: where L ∈ R+ is a scaling parameter that can stretch the image of the interval [−1, 1] in the codomain [0, ∞) as desired. An "optimal" choice of the L-value can significantly improve the quality of the discrete approximations, as we shall demonstrate later in Section 7. We refer to L as "the map scaling parameter." The parametric maps T_{i,L}, i = 1, 2, are often referred to as the "algebraic" and "logarithmic" maps in the literature, respectively. Notice that the maps defined by Eqs. (3.2c) and (3.2d) are special cases of the parametric maps T_{i,L}, i = 1, 2; in particular, T_{1,1} = ϕ_a, T_{2,1} = ϕ_b, and T_{2,2} = ϕ_c. Since the value of either parametric map varies when α varies for arguments of type GGR points, it is more convenient to denote them by T^(α)_{i,L}, i = 1, 2, to emphasize this fact. Mesh-like surfaces of both parametric mappings are shown in Figures 1 and 2 for several values of n, L, and α. Both figures show that the parametric mappings (i) increase monotonically for decreasing values of α while holding n and L fixed, (ii) increase monotonically for increasing values of L while holding n and α fixed, and (iii) increase monotonically for increasing values of n while holding L and α fixed. Moreover, near τ = 1, the rate of increase of T^(α)_{1,L} with respect to any of the arguments n, L, and α while holding the others fixed is much larger than that of T^(α)_{2,L}, which grows very slowly.
Loosely put, the stretching of the mesh grid near τ = 1 is stronger for T^(α)_{1,L} than for T^(α)_{2,L}.
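The defining formulas (3.2a)-(3.2e) are not reproduced above; the sketch below assumes the standard algebraic and logarithmic forms T_{1,L}(τ) = L(1 + τ)/(1 − τ) and T_{2,L}(τ) = L ln(2/(1 − τ)), which are consistent with the derivative behavior quoted later in Section 6.1, and illustrates the stretching behavior described for Figures 1 and 2:

```python
import numpy as np

L = 2.0  # map scaling parameter (value chosen for illustration only)

def T1(tau, L=L):
    """Algebraic map [-1, 1) -> [0, inf): assumed standard form L(1+tau)/(1-tau)."""
    return L * (1.0 + tau) / (1.0 - tau)

def T2(tau, L=L):
    """Logarithmic map [-1, 1) -> [0, inf): assumed standard form L*ln(2/(1-tau))."""
    return L * np.log(2.0 / (1.0 - tau))

tau = np.linspace(-1.0, 0.999, 400)
# Both maps send tau = -1 to t = 0 and increase monotonically toward +inf as
# tau -> 1; the algebraic map stretches the grid near tau = 1 far more
# aggressively than the logarithmic map, matching the discussion above.
t1, t2 = T1(tau), T2(tau)
```

Under these assumed forms, T1(0.999) is already in the thousands while T2(0.999) stays below twenty, which mirrors the "much larger rate of increase near τ = 1" noted above.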

Numerical Discretization of the FHOCI
In this section, we provide a description of the proposed numerical discretization of the FHOCI using an IPS method based on Gegenbauer polynomials and GGR points.

Barycentric Rational Interpolation at the GGR Points
Let Z+ be the set of positive integers and G^(α)_n(τ) be the nth-degree Gegenbauer polynomial associated with the parameter α > −1/2. These polynomials satisfy the discrete orthonormality relation in which ̟_j, j = 0, 1, . . . , n, are the corresponding Christoffel numbers of the GGR quadrature formula on the interval [−1, 1]. Given a set of n + 1 data points {(τ_i, f_i)}^n_{i=0}, the Gegenbauer polynomial interpolant P_n f in Lagrange form is defined by P_n f(τ) = ∑_{i=0}^{n} f_i L_{n,i}(τ), where L_{n,i} are the Lagrange polynomials. P_n f can be evaluated fast and more stably by evaluating the Lagrange polynomials through the "true" barycentric formula, which brings into play the barycentric weights ξ_i, i = 0, . . . , n. An interpolation in Lagrange form with Lagrange polynomials defined by Eqs. (4.5) is often referred to as "a barycentric rational interpolation." The barycentric weights associated with the GGR points can be expressed explicitly in terms of the corresponding Christoffel numbers through the following theorem.
Theorem 4.1. The barycentric weights for the GGR points are given by Proof. Let P^(α,β)_n(τ) be the Jacobi polynomial of degree n associated with the parameters α, β > −1, as normalized by Szegö [56]. The result then follows from the identities given in Szegö [56]. Recall that the GGR points cluster near ±1 as n → ∞, so Eq. (4.7b) may suffer from cancellation errors for values of τ_i sufficiently close to 1. The following theorem provides two alternative trigonometric forms of Formula (4.7b).
Theorem 4.2. The barycentric weights corresponding to the interior GGR points are given by Proof. Through the change of variables τ = cos θ and the double angle rule for the cosine function, we find that from which Formula (4.8a) is derived. Formula (4.8b) is established by realizing that which completes the proof.
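The motivation for the trigonometric forms can be seen in miniature: with τ = cos θ, the quantity 1 − τ suffers catastrophic cancellation near τ = 1, while the double-angle form 2 sin²(θ/2) does not. The following is a minimal illustration of this effect, not the paper's Formulas (4.8a)-(4.8b) themselves:

```python
import math

theta = 1e-8                                # tau = cos(theta) is extremely close to 1
naive = 1.0 - math.cos(theta)               # cancellation: essentially all digits lost
stable = 2.0 * math.sin(theta / 2.0) ** 2   # double-angle form of 1 - cos(theta)
exact = theta ** 2 / 2.0                    # leading Taylor term, about 5e-17
# In double precision 'naive' collapses to 0.0 (or a single ulp of 1.0),
# while 'stable' matches the true value to machine precision.
```

This is exactly why the barycentric weights are switched to trigonometric form when an abscissa approaches 1.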
We refer to Formulas (4.8a) and (4.8b) as the trigonometric-barycentric weights. Formula (4.7b) is faster to compute and requires a smaller number of arithmetic operations compared with Formulas (4.8a) and (4.8b), but the latter two formulas may produce smaller errors near τ = 1, as we observed through numerical experiments. This suggests using Formula (4.7b) for all values of τ except when τ is sufficiently close to 1, where we switch to the trigonometric forms. To this end, we introduce "a switching parameter," 0 < ε ≪ 1, at which the interchange of formulas is performed. The crossover value of ε, where it becomes more accurate to use the trigonometric form, will depend on the implementation; a prescription of this strategy is outlined in Algorithms B.1 and B.2. We refer to the explicit formulas used in Algorithms B.1 and B.2 to compute the barycentric weights as the "first- and second-switching formulas" of the barycentric weights for the GGR points, respectively. We also denote the errors in computing the barycentric weights using Formula (4.7b), Algorithm B.1, and Algorithm B.2 by E^(α)_{i,n}, for i = 1, 2, and 3, respectively. Figures 3 and 4 show comparisons between the three strategies in terms of error for a certain value range of ε, n, and α. While Algorithm B.1 does not look promising against the usual Formula (4.7b) relative to the given input data, as clearly seen from Figure 3, Figure 4 shows that Algorithm B.2 is more numerically stable near ε = 0.1 and provides better approximations. We shall therefore use the latter algorithm with ε = 0.1 for the computation of the barycentric weights, and refer to the rational interpolation with barycentric weights obtained through the second switching formulas as the "switched rational (SR) interpolation."

Figure 3: Color maps showing whether E^(α)_{1,n} is smaller than, larger than, or equal to E^(α)_{2,n}, together with the percentage frequency with which each case occurs in each color map. All computations were carried out using MATLAB in double-precision floating-point arithmetic.
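The "true" barycentric formula referenced above can be sketched generically as follows, using the textbook weights ξ_i = 1/∏_{j≠i}(τ_i − τ_j); the paper's GGR-specific switched weights from Algorithm B.2 are not reproduced here:

```python
import numpy as np

def bary_weights(nodes):
    """Textbook barycentric weights xi_i = 1 / prod_{j != i} (tau_i - tau_j)."""
    return np.array([1.0 / np.prod(nodes[i] - np.delete(nodes, i))
                     for i in range(len(nodes))])

def bary_eval(x, nodes, w, f):
    """Evaluate the interpolant through the 'true' barycentric formula."""
    d = x - nodes
    if np.any(d == 0.0):              # evaluation point coincides with a node
        return f[np.argmin(np.abs(d))]
    c = w / d
    return np.dot(c, f) / np.sum(c)

# a degree-4 interpolant reproduces a cubic exactly
nodes = np.cos(np.pi * np.arange(5) / 4)      # 5 Chebyshev-Lobatto nodes
f = nodes ** 3
val = bary_eval(0.3, nodes, bary_weights(nodes), f)
```

The same evaluation routine applies unchanged once the weights are replaced by those produced by the second switching formulas.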

Stability and Sensitivity Analyses of GGR-Based SR-Interpolation/Collocation
A valuable device for measuring the quality and numerical stability of polynomial interpolations is the Lebesgue constant, as it provides a measure of how close the interpolant of a function is to the best polynomial approximant of the function. The Lebesgue constant is also very useful in assessing the quality of approximate solutions obtained through collocation (aka collocated solutions), as their accuracy is related to the rate at which the Lebesgue constant increases.
Let ‖f‖_S = sup {|f(x)| : x ∈ S} be the uniform norm (or supremum norm) of a real-valued, bounded function f defined on a set S ⊆ R. Suppose that y_i, i = 0, . . . , n, and y_{c,n}(τ) denote the exact solution values at the GGR points τ_i, i = 0, . . . , n, and the corresponding collocated solution, respectively. Let also p*_n y and p_n y be the best polynomial approximation to the exact solution y on [−1, 1) and the Lagrange interpolating polynomial of degree at most n that interpolates the data set {(τ_i, y_i)}^n_{i=0}, respectively. Through the uniqueness of Lagrange interpolation, one can easily show that p_n y(τ) = ∑_{i=0}^{n} y_i L_{n,i}(τ), and Λ^(α)_n = max_{τ∈[−1,1)} ∑_{i=0}^{n} |L_{n,i}(τ)| denotes the Lebesgue constant associated with GGR-based SR-interpolation.
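The Lebesgue constant Λ_n = max_τ ∑_i |L_{n,i}(τ)| can be estimated numerically for any node family. The sketch below uses Chebyshev and equispaced nodes as stand-ins (GGR nodes require Gegenbauer root-finding, which is not shown) and reproduces the well-known gap between the two families:

```python
import numpy as np

def lebesgue_constant(nodes, m=4000):
    """Estimate max_x sum_i |L_i(x)| on [-1, 1] via the barycentric form."""
    w = np.array([1.0 / np.prod(nodes[i] - np.delete(nodes, i))
                  for i in range(len(nodes))])
    lam = 1.0                         # the Lebesgue function equals 1 at the nodes
    for x in np.linspace(-1.0, 1.0, m):
        d = x - nodes
        if np.any(np.abs(d) < 1e-14): # skip (near-)exact node hits
            continue
        c = w / d
        lam = max(lam, np.sum(np.abs(c)) / abs(np.sum(c)))
    return lam

n = 10
cheb = np.cos((2 * np.arange(n + 1) + 1) * np.pi / (2 * n + 2))  # Chebyshev, 1st kind
equi = np.linspace(-1.0, 1.0, n + 1)                             # equispaced
lam_cheb = lebesgue_constant(cheb)    # roughly 2.5 for 11 Chebyshev points
lam_equi = lebesgue_constant(equi)    # near 30 for 11 equispaced points
```

The barycentric identity ∑_i L_{n,i}(x) = 1 makes the ratio form above safe away from the nodes; the dramatic equispaced growth is the ill-conditioning cited in Conjecture 6.1.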
Figure 4: The first row displays the color maps corresponding to the switching parameter values ε = 0.13, 0.1, 0.07, 0.04, and 0.01. Each color map shows areas delineated by blue, cyan, and yellow colors. Each color specifies whether E^(α)_{1,n} is smaller than, larger than, or equal to E^(α)_{3,n}, for n = 10(10)100 and α = −0.49, −0.4(0.1)2. The second row shows the percentage frequency with which E^(α)_{1,n} occurs as smaller than, larger than, or equal to E^(α)_{3,n} in each color map. All computations were carried out using MATLAB in double-precision floating-point arithmetic.

This leads to Ineq. (4.10), where δy_{c,n}(τ) is the difference between the Lagrange interpolating polynomial and the collocated solution. When ‖δy_{c,n}‖_{[−1,1)} ≈ 0, Λ^(α)_n roughly bounds the collocation error in the sense that it nearly quantifies how much larger the collocation error ‖y − y_{c,n}‖_{[−1,1)} is compared to the smallest possible error, ‖y − p*_n y‖_{[−1,1)}, in the worst case. In this case, it is obvious from Ineq. (4.10) that the smaller the Lebesgue constant, the better the predicted collocated solution is in the uniform norm. In other words, the collocation error is at most about a factor 1 + Λ^(α)_n worse than the best possible polynomial approximation. One can clearly see also that Λ^(α)_n depends on the location of the collocation points τ_i, i = 0, . . . , n, but not on the solution values y_i, i = 0, . . . , n. Since the positions of the GGR points change as α and n vary, we are interested in learning the apt choices of α that make Λ^(α)_n as small as possible while holding n fixed. This can provide some useful insight into how we should select the candidate range of α-values often used for collocations based on GGR points. In [17], our findings uncovered that Λ^(α)_n for FGGR-based polynomial interpolation in Lagrange-basis form blows up as α → −0.5 and increases monotonically for increasing positive values of α; cf. [17, Eqs. (3.7) and (3.8)].
Moreover, it was noticed that 'Λ^(α)_n does not decrease monotonically for increasing negative values of α', indicating that Λ^(α)_n is not minimal for Chebyshev polynomials but rather attains its smallest value for Gegenbauer polynomials associated with some negative values of α; cf. [17, Figures 1 and 3]. It was roughly estimated in many earlier works, through theoretical and numerical evidence, that reasonably good α-values for polynomial interpolation in basis form typically belong to the "Gegenbauer collocation interval of choice," I^G_{ε,r}, for reasons pertaining to the stability and accuracy of numerical schemes employing Gegenbauer polynomials as basis polynomials; cf. [4,17,47,50,58] and the references therein.
On the other hand, it was discovered in a number of works that Lebesgue constants for rational interpolation at equally-spaced nodes are much smaller than those associated with classical polynomial interpolation; cf. [59-63]. Moreover, the Lebesgue constant of Berrut's rational interpolation [64] at equidistant nodes is smaller than the Lebesgue constant for polynomial interpolation at Chebyshev nodes; cf. [65]. Figure 5 shows the surface of the Lebesgue constant for GGR-based rational interpolation characterized by Eqs. (4.4) and (4.5). Several observations can be drawn from the figure: (i) Λ^(α)_n does not blow up as α → −0.5, but rather remains bounded; (ii) the associated Lebesgue constant grows logarithmically in the number of collocation nodes; and (iii) it is interesting also to see how the Lebesgue constant drops monotonically as α increases while holding n fixed. This suggests that Legendre polynomials are generally more suited for GGR-based SR-interpolations than Chebyshev polynomials. In fact, we can also observe from Figure 5 that Gegenbauer polynomials with increasing α-values are associated with smaller Lebesgue constant values. This suggests that Gegenbauer polynomials with α > 1/2 may also be plausible to employ in SR-interpolation for a short/medium range of n-values. However, the work of Elgindy and Smith-Miles [50] shows that Gegenbauer quadratures 'may become sensitive to round-off errors for positive and large values of the parameter α due to the narrowing effect of the Gegenbauer weight function,' which drives the quadratures to become more extrapolatory with greater uncertainty in integral approximations; thus, the collocation is subject to a higher risk of producing meaningless results. In other words, ‖δy_{c,n}‖_{[−1,1)} may become large, and the collocation error grows accordingly. It was also observed in [50] that the weight function effectively vanishes near the boundaries τ = ±1, its support being nonzero only on a subinterval centered at τ = 0, for increasing values of α > 1.
If we refer to GGR-based collocations employing SR-interpolations as "SR-collocations," then this analysis suggests that, for a relatively large collocation mesh size, SR-collocations are expected to produce higher-order approximations for nonnegative α-values, with apparently optimal α-values within/near the "Gegenbauer SR-collocation interval of choice (SRCIC)," Υ^G_{α−c,α+c}; in addition, Gegenbauer polynomials with positive and large α-values are generally not apt for SR-interpolation/collocation. For small mesh sizes, however, there is no rule of thumb as to how we should select α, since all Lebesgue constant curves converge to the same limit as n → 1. The analysis in this section assumes that the problem under study is well-conditioned. For sensitive problems, the interval of choice Υ^G_{α−c,α+c} may change depending on the sources of sensitivity. In Section 6, we shall expound that proper collocations of the FHOCI using any of the maps (3.2a)-(3.2d) entail shifting the right boundary, α+c, of Υ^G_{α−c,α+c} rightward as the mesh size grows large to reduce the divergence rate of the collocated solutions from the exact solutions. From another perspective, it is interesting to mention that the Lebesgue constant is also a useful instrument in observing how the collocated solutions change as the input data are varied. Closely following the convention in [17], suppose that ỹ_i, i = 0, . . . , n, and ỹ_{c,n}(τ) are the perturbed solution values due to round-off or input data errors, and the perturbed collocated solution, respectively. Moreover, assume that p̃_n y(τ) is the Lagrange interpolating polynomial of degree at most n that interpolates the data set {(τ_i, ỹ_i)}^n_{i=0}. An analogous bound then holds, where δỹ_{c,n}(τ) is the difference between the perturbed Lagrange interpolating polynomial and the perturbed collocated solution.
When ‖δy_{c,n} − δỹ_{c,n}‖_{[−1,1)} is relatively small, Λ^(α)_n nearly quantifies how much larger the perturbation error of the collocated solution, ‖y_{c,n} − ỹ_{c,n}‖_{[−1,1)}, is compared to the maximum possible perturbation error of the solution at the collocation points, max_{0≤i≤n} |y_i − ỹ_i|, or to the maximum solution perturbation error, ‖y − ỹ‖_{[−1,1)}, in the worst case.

The Barycentric GRIM and Quadratures
Consider a real-valued function f defined on the interval [−1, 1] and its GGR-based SR-interpolation given by Eqs. (4.4), (4.5), and the second switching formulas of the barycentric weights. Following the work of Elgindy [66], the formulas needed to construct the nonzero rows of the barycentric GRIM can be derived by integrating Eq. (4.4) on the successive intervals [−1, τ_j], j = 1, . . . , n, to obtain Eq. (4.12), which we can rewrite in a compact form. Since the polynomials L_{n,i}(t; −1, τ_j), i = 0, . . . , n, are of degree n, the integrals ∫_{−1}^{1} L_{n,i}(t; −1, τ_j) dt can be computed exactly using an N = ⌈(n + 1)/2⌉-point LG quadrature, where ⌈.⌉ denotes the ceiling function. Let {τ̄_k, ̟̄_k}^N_{k=0} be the set of LG quadrature nodes and weights, respectively, where the weights involve L′_{N+1}, the derivative of the (N + 1)st-degree Legendre polynomial L_{N+1}. Hence, Eqs. (4.14) and (4.16) yield Eq. (4.17), where f = [f_0, . . . , f_n]^t, and q_{j,i}, i, j = 0, . . . , n, are the elements of the first-order barycentric GRIM Q given by Eq. (4.18). We denote the jth row of Q by Q_j ∀j. The derivation of the formulas required to construct the GGR-based differentiation matrix (GRDM) in barycentric form is described in Appendix A.
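The construction just described, integrating each Lagrange basis polynomial over [−1, τ_j] with an exactness-preserving Legendre-Gauss rule, can be sketched as follows. Generic nodes stand in for the GGR points, and numpy's leggauss supplies the LG nodes and weights:

```python
import numpy as np

def integration_matrix(nodes):
    """Q[j, i] = integral over [-1, nodes[j]] of the i-th Lagrange basis
    polynomial, computed with an LG rule exact for degree-n integrands."""
    n = len(nodes) - 1
    N = (n + 1) // 2 + 1                       # LG rule exact to degree 2N-1 >= n
    g, gw = np.polynomial.legendre.leggauss(N)
    w = np.array([1.0 / np.prod(nodes[i] - np.delete(nodes, i))
                  for i in range(n + 1)])      # barycentric weights
    def lagrange(i, t):
        # L_i(t) = w_i * prod_{k != i} (t - nodes[k])
        num = np.ones_like(t)
        for k in range(n + 1):
            if k != i:
                num *= t - nodes[k]
        return w[i] * num
    Q = np.zeros((n + 1, n + 1))
    for j in range(n + 1):
        a, b = -1.0, nodes[j]
        t = 0.5 * (b - a) * g + 0.5 * (a + b)  # LG nodes mapped to [a, b]
        for i in range(n + 1):
            Q[j, i] = 0.5 * (b - a) * np.dot(gw, lagrange(i, t))
    return Q

nodes = np.linspace(-1.0, 1.0, 6)              # any distinct nodes work here
Q = integration_matrix(nodes)
# sanity: Q applied to samples of 1 returns the antiderivatives tau_j + 1
```

Because the basis polynomials have degree n, the mapped LG rule integrates them exactly, so Q carries no quadrature error, only rounding error.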

IPS Rational Collocation of the FHOC at the GGR Points
Collocating Eq. (3.1d) at the GGR nodes yields Eq. (4.19), where f_i = [f_{i,0}, . . . , f_{i,n}]^t ∀i, 1_n is the all-ones column vector of size n, and [., .], "vec", •, and ⊗ denote the horizontal matrix concatenation, the vectorization of a matrix, the Hadamard product, and the Kronecker product, respectively. Let τ_{n+1} = 1 and define Q_{n+1} = [q_{n+1,i}]_{0≤i≤n} : q_{n+1,i} = ∑_{k=0}^{N} ̟̄_k L_{n,i}(τ̄_k) ∀i; then the cost functional J can be approximated numerically using the LG quadrature as in Eq. (4.20), where g = [g_0, . . . , g_n]^t. Hence, the FHOCI (3.1a), (3.1c), and (3.1d) is now converted into a nonlinear programming problem (NLP) in which the goal is to minimize the discrete cost functional (4.20) subject to the nonlinear system of equations (4.19). If we define the image of the collocation points set S_n under the transformation T by S^T_n = {t_k : t_k = T(τ_k), k = 0, . . . , n}, and denote x(t_i) and u(t_i) by x_i and u_i ∀i, respectively, then the NLP can be solved using well-developed optimization software for the unknowns x̃_i = x_i, i = 1, . . . , n, and ũ_j = u_j, j = 0, . . . , n. The approximate optimal state and control variables can then be calculated at any point t ∈ [0, ∞) through the PS expansions, where t_n = [t_0, t_1, . . . , t_n]^t and L_n(t) = [L_{n,0}(t), L_{n,1}(t), . . . , L_{n,n}(t)]^t. In the special case when T = T^(α)_{1,L}(τ), one can easily show that the NLP can be written in the form of Eqs. (4.22a)-(4.22b). Furthermore, when T = T^(α)_{2,L}(τ), the NLP can be formulated as Eqs. (4.24a)-(4.24b). We refer to the NLPs described by Eqs. (4.22a)-(4.22b) and Eqs. (4.24a)-(4.24b) as NLP1 and NLP2, respectively. We also refer to the present collocation method as the "GGR-IPS" method; the acronyms "GGR-IPS1" and "GGR-IPS2" stand for the GGR-IPS method performed using the parametric maps T^(α)_{1,L} and T^(α)_{2,L}, respectively, while "GGR-IPS12" stands for the GGR-IPS method performed using either of the maps T^(α)_{1,L} and T^(α)_{2,L}.
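The structure of the resulting discrete system can be illustrated on a linear finite-horizon analogue: for ẋ = −x, x(−1) = 1 on [−1, 1], the integral form x(τ) = 1 − ∫_{−1}^{τ} x(s) ds collocated at the nodes gives the linear system (I + Q)x = 1, whose solution converges spectrally to e^{−(τ+1)}. A self-contained sketch, with Chebyshev-Lobatto nodes standing in for the GGR points:

```python
import numpy as np

def integration_matrix(nodes):
    """Q[j, i] = integral of the i-th Lagrange basis over [-1, nodes[j]]."""
    n = len(nodes) - 1
    N = (n + 1) // 2 + 1                       # LG rule exact to degree 2N-1 >= n
    g, gw = np.polynomial.legendre.leggauss(N)
    w = np.array([1.0 / np.prod(nodes[i] - np.delete(nodes, i))
                  for i in range(n + 1)])
    Q = np.zeros((n + 1, n + 1))
    for j in range(n + 1):
        t = 0.5 * (nodes[j] + 1.0) * g + 0.5 * (nodes[j] - 1.0)
        scale = 0.5 * (nodes[j] + 1.0)
        for i in range(n + 1):
            num = np.ones_like(t)
            for k in range(n + 1):
                if k != i:
                    num *= t - nodes[k]
            Q[j, i] = scale * np.dot(gw, w[i] * num)
    return Q

n = 16
nodes = -np.cos(np.pi * np.arange(n + 1) / n)  # Chebyshev-Lobatto on [-1, 1]
Q = integration_matrix(nodes)
# integral-form collocation of x' = -x, x(-1) = 1:  x_j = 1 - sum_i Q[j, i] x_i
x = np.linalg.solve(np.eye(n + 1) + Q, np.ones(n + 1))
err = np.max(np.abs(x - np.exp(-(nodes + 1.0))))   # spectral accuracy expected
```

In the actual GGR-IPS method the linear solve is replaced by the NLP (4.19)-(4.20), since f and g are nonlinear; the well-conditioned integration operator Q is the common ingredient.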

Error and Convergence Analyses
In this section we derive the truncation error bounds for Eqs. (4.19) and (4.20) and their convergence rates. Theorem 6.1. The truncation error of the quadrature formula (4.17) is given by Proof. By definition, we can write the quadrature error, for some ξ ∈ (−1, 1), in terms of E_n f, the interpolation truncation error at the GGR points. The proof is established by a direct realization of this identity. The truncation errors of the collocated Integral Constraints System (3.1d) at each point τ_j ∈ S_n are given by

Divergence of Typical IPS Collocation Schemes of the FHOCI at Any Large Mesh Grid of Gauss Type When T ∈ {T^(α)_{1,L}, T^(α)_{2,L}}
In this section we derive some striking results regarding the convergence of typical collocation schemes of the FHOCI described by Eqs. (3.1a), (3.1c), and (3.1d) when T ∈ {T^(α)_{1,L}, T^(α)_{2,L}} and the mesh grid is large and of Gauss type. While the proof pertains to the FHOCI and employs the GGR points as the collocation points, it can be generalized to the usual form of the FHOC described by Eqs. (3.1a)-(3.1c) and any large set of collocation points of Gauss type, whence it becomes of considerably greater interest. We derive these divergence results in the following two corollaries. Proof. By the General Leibniz Rule, the (n + 1)st derivative of ψ̂_k is given by the stated expression. From Theorem 6.2, there exist some positive constants B̂^(α)_2 and Ĉ^(α)_2, dependent on α and independent of n, such that the stated bound holds, from which we observe that the upper bound of the truncation error of ψ̂_k at each collocation point τ_j diverges as n → ∞. Theorem 6.2 and Corollary 6.2 reveal an interesting fact: under their assumptions, the proposed method is expected to converge with an exponential rate to near-optimal solutions for increasing n-values within a relatively small n-value range, as indicated by Inequality (6.7); but as n grows large, the constant A_{ψ̂_k,n} grows exponentially fast and ultimately dominates the error bounds as n → ∞, as implied by Inequalities (6.16a)-(6.17b), regardless of how well we choose the map scaling parameter value L. In fact, the asymptotic results of Corollary 6.2 show that, for increasingly large n-values, reducing the L-value abates the divergence of the approximations at the outset; but as n grows larger, this approach fails to cope with the soaring values of the nth powers of the factors 1/(1 − τ_j) at the mesh points τ_j, for mesh point values sufficiently close to τ = 1; thus, divergence is inevitable!
While the above forward error analysis may be too pessimistic and may reject solutions that are sufficiently accurate, another concern arises when we analyze the sensitivity of NLP1 and NLP2, associated with the maps T^(α)_{1,L} and T^(α)_{2,L}, to input data errors. Observe that both problems require the computation of the map derivatives T′_{1,L} and T′_{2,L}, which are ill-conditioned for arguments near 1. In particular, suppose that τ ≈ 1 with a small perturbation h to τ. Then the absolute errors in computing T′_{1,L}(τ) and T′_{2,L}(τ) lead to the relative errors 2|h|/(1 − τ) and |h|/(1 − τ), respectively, which blow up as τ → 1. Recall that the GGR points cluster near ±1 as n → ∞, so the sensitivity of the problem of calculating the map derivative functions T′_{1,L} and T′_{2,L} at arguments near 1 increases for increasing values of n. For example, let τ = 0.9999999999999 be an exact argument value and consider its approximation τ̃ = 0.9999999999998, a small perturbation of about 9.99 × 10^{−14} to τ. Then the relative error in the input value is about 10^{−11} %. However, the relative errors in computing T′_{1,L}(τ) and T′_{2,L}(τ) are nearly 75% and 50%. Hence, the relative changes in evaluating T′_{1,L}(τ) and T′_{2,L}(τ) are about 7.5 and 5 trillion times larger than the relative change in the input value, in respective order! This example shows that increasing the mesh size shifts the positive collocation points closer and closer towards τ = 1, and wild ill-conditioning ultimately rears its ugly head, as the sensitivity of NLP2 progressively stiffens for arguments near 1. Therefore, one should keep in mind that reducing L may still improve the approximations for a certain range of n-values; nonetheless, this strategy is not likely to produce accurate approximations for relatively large values of n, in general, since both NLP1 and NLP2 are ill-conditioned near τ = 1.
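The worked example above can be reproduced directly. Assuming the standard derivative forms T′_{1,L} = 2L/(1 − τ)² and T′_{2,L} = L/(1 − τ) (the maps themselves are defined in Eqs. (3.2a)-(3.2b) and are not repeated here), a perturbation in the thirteenth decimal place of τ produces the quoted 75% and 50% relative errors:

```python
tau, tau_pert = 0.9999999999999, 0.9999999999998   # perturbation of ~1e-13
L = 1.0

def d_T1(t):
    """Assumed derivative of the algebraic map: 2L/(1 - t)^2."""
    return 2.0 * L / (1.0 - t) ** 2

def d_T2(t):
    """Assumed derivative of the logarithmic map: L/(1 - t)."""
    return L / (1.0 - t)

rel_input = abs(tau - tau_pert) / tau                  # ~1e-13, i.e. ~1e-11 percent
rel_T1 = abs(d_T1(tau_pert) - d_T1(tau)) / d_T1(tau)   # ~0.75, i.e. 75 percent
rel_T2 = abs(d_T2(tau_pert) - d_T2(tau)) / d_T2(tau)   # ~0.50, i.e. 50 percent
```

The amplification factors rel_T1/rel_input and rel_T2/rel_input land in the trillions, matching the discussion above.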
It is noteworthy that this sensitivity of NLP1 and NLP2 near τ = 1 is foreseen to relax or disappear if g and f_k ∀k decay exponentially fast, such that lim_{τ→1} T′(τ)g(x̃(τ), ũ(τ)) = 0 and lim_{τ→1} T′(τ)f(x̃(τ), ũ(τ)) = 0. Under a proof similar to that of Corollary 6.2, one can derive the following second divergence result. The present analysis raises another interesting question: Which map should we use if we desire to implement the proposed method? For a small/medium range of n-values, the answer is a bit elusive; however, for large values, we have a crystal clear answer, as shown by the following corollary.

Corollary 6.4. A Gegenbauer-Gauss collocation of the FHOCI using the map T^(α)_{1,L} generally diverges faster than applying the method lumped with the map T^(α)_{2,L} when n → ∞.
Corollary 6.4 shows that the map T^(α)_{2,L} is more likely a better choice than T^(α)_{1,L} for large n-values. We end this section by drawing the reader's attention to the fact that integral reformulations of various mathematical models have received considerable attention in the literature because they often produce well-conditioned linear systems. While numerical quadratures and integration matrices are generally more stable than numerical differentiation operators and matrices, there is no strong reason to expect that standard PS collocations of the FHOC in its strong differential form using a single mesh grid of Gauss type and maps like T^(α)_{1,L} and T^(α)_{2,L} would exhibit any merits over the current method, and they would ultimately diverge for a large mesh grid size. These considerations lead naturally to the following interesting conjecture.

Conjecture 6.1. Classical Jacobi polynomial collocations of the FHOC in differential/integral form obtained through maps like T^{(α)}_{1,L} and T^{(α)}_{2,L} will likely diverge as the mesh size grows large, if the computations are carried out using floating-point arithmetic and the discretizations use a single mesh grid, whether of Gauss/Gauss-Radau (GR) type or equally spaced. The former divergence case is a direct result of the present divergence analysis, while the latter case is due to Runge's phenomenon and the ill-conditioning of polynomial interpolation at equally spaced nodes as the degree of the polynomial grows.

Numerical Experiments
This section presents the results of some numerical experiments on two test examples, which demonstrate the accuracy and efficiency of the proposed GGR-IPS12 methods for small/medium-range mesh grid sizes and verify the inevitable divergence as the mesh size grows large. All numerical experiments were carried out using MATLAB R2022a installed on a personal laptop equipped with a 2.9 GHz AMD Ryzen 7 4800H CPU and 16 GB of memory, running a 64-bit Windows 11 operating system. The NLPs obtained through the GGR-IPS12 methods were solved using either (i) the MATLAB fmincon solver with the interior-point algorithm (fmincon-int) and the sqp algorithm (fmincon-sqp), or (ii) the augmented Lagrange multiplier method [67,68] integrated with a modified BFGS method and a Chebyshev PS line search method (MBFGS-CPSLSM) [69], henceforth referred to as the "EALMM." It should be clearly understood that when we coin the name of the current collocation method with any NLP solver, we are implementing both to solve the FHOCI. For example, the acronym GGR-IPS12-EALMM stands for the GGR-IPS12 methods combined with the EALMM. In all numerical tests, the exact optimal state and control variables were calculated using MATLAB with 30 digits of precision maintained in internal computations. The fmincon solver was run using the stopping criteria TolFun = TolX = 10^{−12}, 10^{−15} for Examples 1 and 2, respectively; similarly, the augmented Lagrange multiplier method was terminated when the lower bound on the change in the augmented Lagrangian function value during a step fell below the prescribed tolerance. For Example 1, x_0 = 2; the exact state and control variables can be found in [40]. The exact cost function value is J* = (ln 2)²(√2 + 1)/2 ≈ 0.5799580911421756, rounded to 16 significant digits.
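As a quick arithmetic check (not part of the original experiments), the quoted closed-form cost of Example 1 can be evaluated in double precision:

```python
import math

# Closed-form optimal cost of Example 1: J* = (ln 2)^2 (sqrt(2) + 1) / 2,
# which should agree with the quoted value 0.5799580911421756.
J_star = math.log(2.0) ** 2 * (math.sqrt(2.0) + 1.0) / 2.0
print(J_star)
```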
Through the change of variables z(t) = ln x(t) (7.2), the IHOC (2.1)-(2.3) can be rewritten as an equivalent linear-quadratic optimal control problem on an infinite horizon with g(z(t), u(t)) = (z²(t) + u²(t))/2, f(z(t), u(t)) = z(t) + u(t), and z_0 = ln 2. We refer to the former and latter forms of the IHOC as Forms A and B, respectively. Form A of the example was previously solved by Garg et al. [40] using LG- and LGR-PS methods and the maps (3.2c) and (3.2d); the obtained NLPs were solved using SNOPT [70,71]. Table 1 shows a comparison between the LGR- and LG-PS methods and the GGR-IPS12-EALMM using the same initial guesses x̃(τ) = 2 and ũ(τ) = τ ∀τ ∈ [−1, 1]. Notice how the GGR-IPS2-EALMM generally enjoys superior stability properties and achieves higher-order approximations in this example for n = 5(5)30 compared with the other approaches, except for the LG-PS method, where both achieve the same order of accuracy at n = 30. It is interesting to recognize how the GGR-IPS2-EALMM outperforms the LG-PS method for n = 5(5)25, although the latter employs a Gauss quadrature that is more accurate than the GR quadrature used by the former. One may attribute the success of the GGR-IPS2-EALMM here to several factors, namely (i) the clever change of variables (7.2), which converts the problem into a linear-quadratic optimal control problem that can be collocated more accurately; (ii) the integral form of the system dynamics, which allows for gaining more digits of accuracy via numerical quadratures, well known for their numerical stability; (iii) the highly accurate built-in Algorithm B.2, which applies the latest SR-interpolation technology; (iv) the parametric logarithmic map T^{(α)}_{2,L}, favored over T^{(α)}_{1,L} for its slower growth and lower sensitivity near τ = 1; and (v) the map scaling parameter L, which permits faster convergence rates when "optimally" chosen.
On the other hand, we observe that the errors of the GGR-IPS12-EALMM generally decline gradually as the mesh grid size initially grows up to a certain limit, yet bounce back beyond that limit as the mesh grid size continues to grow, in agreement with the theoretical results of Section 6. It is interesting to see similar phenomena in the control error profiles of [40], in the sense that (i) the control error plot of the LG-PS method does not appear as a (near) straight line in the shown log-scaled chart but rather as a convex-shaped curve, as it curves outward, cf. Garg et al. [40, Fig. 1(b)]; and (ii) the control error plot of the LGR-PS method suddenly increases at n = 30, well before reaching the round-off plateau, cf. Garg et al. [40, Fig. 2(b)]. Another interesting remark lies in the smallest errors of the current methods: they were all recorded at/near α = 0.5 with α ∈ Υ^G_{0.5,0.6}, while several optimal values of L were detected. Figures 7 and 8 show the plots of the exact state and control variables, in addition to their collocated solutions and absolute errors obtained by the GGR-IPS12 methods integrated with three NLP solvers, using the same initial guess set and several values of n, L, and α. The shown graphical data indicate that the GGR-IPS2-EALMM generally achieves better accuracy and stability properties than the other methods.

Table 1: The uncertainty intervals of the smallest MAE_{x,u} obtained by the LGR- and LG-PS methods of Garg et al. [40] at the collocation points, and the corresponding smallest MAE_{x,u} obtained by the GGR-IPS12-EALMM using the same initial guesses x̃(τ) = 2 and ũ(τ) = τ ∀τ ∈ [−1, 1]. All approximations were rounded to 5 significant digits.
      LGR-PS [40]              LG-PS [40]               GGR-IPS1-EALMM (Form A)   GGR-IPS2-EALMM (Form B)
 n    MAE_{x,u} interval       MAE_{x,u} interval       MAE_{x,u}/α/L             MAE_{x,u}/α/L
 5    (1e-03, 1e-02)           (1e-04, 1e-03)           5.3830e-03/0.6/2.25       4.2453e-05/0.5/3.5
 10   (1e-06, 1e-05)           (1e-06, 1e-05)           7.0615e-05/0.5/5.75       1.8735e-09/0.5/4.25
 15   (1e-07, 1e-06)           (1e-07, 1e-06)           6.0288e-07/0.5/9.25       1.9736e-09/0.5/5
 20   (1e-08, 1e-07)           (1e-08, 1e-07)           1.3181e-08/0.5/8.5        2.0583e-09/0.5/2.5
 25   (1e-08, 1e-07)           (1e-08, 1e-07)           2.4368e-08/0.5/2.25       1.6175e-09/0.5/3
 30   (1e-08, 1e-07)           (1e-09, 1e-08)           2.7958e-08/0.5/2.

It is interesting to observe by visual inspection from the error mesh surface plots of the GGR-IPS1-EALMM in Figures 9 and 10 that, when holding L fixed, the global minima occur near α = 0.5 and the mesh surfaces rise gradually as we move away, except when α ∈ {−0.4, −0.3}, where sharp peaks may emerge suddenly for growing values of n. This suggests that Legendre polynomials seem an optimal choice among Gegenbauer basis polynomials when holding L fixed, while Gegenbauer polynomials associated with α-values near −0.5 may cause numerical instability as n grows large. However, a different story emerges when the GGR-IPS2-EALMM is performed instead, as can be seen from Figures 11 and 12. Notice now that the errors "look" monotonically decreasing for decreasing values of α when holding L fixed at 1 and 2, and the error surface shoots up as n and α increase. Therefore, Gegenbauer polynomials with some negative α-values "seem" optimal for relatively small values of L. Notice also that the sudden peaks observed before with the GGR-IPS1-EALMM in Figures 9 and 10 for α ∈ {−0.4, −0.3} and large n-values disappear. A further array of error mesh surface plots of the GGR-IPS1-EALMM is shown in Figures 13 and 14 for n = 10(10)50, α = −0.2, 0, 0.5, 1, L = 0.5(0.5)6, and (x̃_0, ũ_0) ∈ Ω. While holding α fixed, no general rule of thumb can be laid down from the shown data.
On the other hand, Figures 15 and 16 show the corresponding plots associated with the GGR-IPS2-EALMM, where the errors are very similar and can clearly be seen to surge as L → 0, especially when α = 1, but remain relatively small for L = 2(0.5)6. Figures 17 and 18 show comparisons of the number of iterations required by the GGR-IPS12 methods when combined with three distinct NLP solvers using (x̃_0, ũ_0) ∈ Ω and several L- and α-values. Clearly, the integration of the GGR-IPS12 methods with the EALMM leads to a drastic reduction in the number of iterations in all cases. In fact, while the GGR-IPS12-EALMM often converged in only four/five iterations, the other methods usually required many more iterations; for example, the GGR-IPS12 methods combined with fmincon-int and fmincon-sqp took more than 200 iterations to converge to the solutions of the problem when (n, L, α) = (48, 1, −0.2), starting from any initial guess (x̃_0, ũ_0) ∈ Ω. Table 2 shows the approximate cost function values obtained by the GGR-IPS2-EALMM for several parameter values. The fastest convergence was recorded at α = 0.5 in all cases, with J ≈ J_{16} = 0.579958091127, agreeing with J* to 10 significant digits. It is interesting to see through the tabulated data how Gegenbauer polynomials with α ∈ {−0.4, 0.25} exhibit faster convergence rates than Chebyshev polynomials (α = 0), capturing 4 correct significant digits as early as n = 12, whereas Chebyshev polynomials still lag behind by one digit even when n increases by 2 units. Gegenbauer polynomials with α = −0.2 also performed better than Chebyshev polynomials for n ∈ {14, 16}, while the poorest stability was that of Gegenbauer polynomials with α = 2, scoring only one correct significant digit in all cases! Table 3 shows the MAE_{x,u} of the GGR-IPS12-EALMM obtained using n = 5(5)80 and (x̃_0, ũ_0) ∈ Ω.
The table shows the capacity of the GGR-IPS12-EALMM to achieve improved near-optimal solutions for increasing values of n within a small/medium range of mesh grid sizes; however, the accuracy deteriorates beyond a certain limit as the mesh grid size grows larger, in agreement with the theoretical results proven in Section 6. The GGR-IPS2-EALMM is clearly superior to the GGR-IPS1-EALMM in terms of accuracy in all cases, and it is interesting to see how the GGR-IPS1-EALMM diverges faster than the GGR-IPS2-EALMM for growing mesh sizes, as prophesied earlier by the divergence analysis presented in Section 6.1. The best approximations obtained experimentally by the GGR-IPS2-EALMM were recorded at/near α = 0.5 with α ∈ Υ^G_{0.4,0.6}. The smallest errors of the GGR-IPS1-EALMM were also recorded at/near α = 0.5 for n = 5(5)55 with α ∈ Υ^G_{0.4,0.6}; however, the algorithm tends to favor larger positive α-values beyond n = 55, where we noticed the right boundary α⁺_c of Υ^G_{0.4,α⁺_c} travel rightward from α⁺_c = 0.6 to α⁺_c = 1.8 as n reaches 80. This is no surprise! In fact, recall that T^{(α)}_{1,L} increases monotonically for decreasing values of α while holding n and L fixed, and we can observe from Figure 1 that the mapping escalates wildly as we continue decreasing the α-values. The byproduct of this behavior is that increasing the α-values moves the collocation points associated with large values of t leftward and relocates them closer to regions where the solution changes rapidly. This graphical interpretation is consistent with the fact that the interior GGR points move monotonically toward the center of the interval (−1, 1) as the parameter α increases; cf. [56,72]. From another perspective, the leftward movement of the collocation points near the right boundary τ = 1 mitigates the effect of the ill-conditioning of T′_{1,L} for arguments near 1, as T′_{1,L} is evaluated at mesh points that gradually depart from the vicinity of τ = 1.
This argument adds more tenability to using Gegenbauer polynomials as basis polynomials for numerical collocations of FHOCIs obtained from IHOCs via T^{(α)}_{1,L} or T^{(α)}_{2,L}, in the sense that, while Chebyshev and Legendre polynomials cease to downgrade the errors as the mesh grid size grows large, Gegenbauer polynomials have the additional advantage of alleviating the growth rates of both T^{(α)}_{1,L} and T^{(α)}_{2,L} by increasing the α-value whenever we wish, while sustaining the luxury of applying either Chebyshev or Legendre polynomials with the push of a button: simply set α = 0 or 0.5 in the solver code! If we now turn our attention to the recorded L-values in the table, we can quickly spot that the smallest computed errors initially span a wide range of numerically optimal L-values; however, the solvers ultimately show a bias towards smaller values of L in an attempt to damp the error, in agreement with the forward error analysis presented in Section 6. Notice that the numerically optimal L-value for the GGR-IPS2-EALMM stays at 0.75 for n = 55(5)80, while the corresponding values for the GGR-IPS1-EALMM occur at the smallest feasible L-value among the input range of experimental data. One may pin this peculiar behavior of the solvers on the fact that the logarithmic map T^{(α)}_{2,L} increases at a much slower rate than the algebraic map T^{(α)}_{1,L}; cf. Figures 1 and 2. Example 2 was previously studied in the literature; cf. [41,54,73]. It is a linear-quadratic regulator problem with an optimal cost functional value J* = 19.85335656362790, rounded to 16 significant digits, as obtained by MATLAB using the Symbolic Math Toolbox. Figure 19 shows the plots of the exact optimal state and control variables and their approximations obtained through the GGR-IPS2-EALMM using some parameter values. Table 4 shows the MAE_{x,u} of the LGR-PS method of Garg et al. [41] and the smallest corresponding MAE_{x,u} and AE_J pairs of the GGR-IPS2-EALMM at the collocation points using (x̃_0, ũ_0) ∈ Ω and several values of n.
The GGR-IPS2-EALMM proves again to be superior in terms of accuracy for a small range of mesh sizes, as it converges rapidly to near-optimal solutions at a much higher rate than that of Garg et al. [41]. However, the superb accuracy of the method starts to decline as n grows larger, as anticipated earlier.
Notice again here that the best accuracy in all cases was recorded at α = 0.5 with SRCIC Υ^G_{0.5,0.5} = {0.5}. Another comparison between the GGR-IPS2-fmincon-int, the GGR-IPS2-fmincon-sqp, and the transformed LGR method of Shahini and Mehrpouya [42] is shown in Table 5. We can clearly see that the former two methods generally yield smaller AE_J values. The rise and fall of accuracy as the mesh size grows is again evident in the observed approximations, in agreement with the divergence analysis presented in Section 6.1. Remarkably, a match with the exact J* to full machine precision was recorded as early as n = 20, indicating an exceedingly accurate numerical scheme with exponential convergence for coarse meshes. All smallest errors reported by the current methods occurred at α = 0.5, except for n ∈ {90, 100}, where collocations at α = 0 and 0.8 furnished higher accuracy. On the other hand, the rounded errors in [42] decay to 1.35 × 10^{−09} as soon as n = 30 using the algebraic map, but surprisingly cease to vary any further for n = 40(10)

Conclusion and Future Work
Direct IPS methods for solving IHOCs using the logarithmic map T^{(α)}_{2,L} and the developed SR-interpolation and barycentric quadrature formulas can produce excellent approximations to the optimal state and control variables for relatively small/medium mesh grids. However, this class of methods often suffers from numerical instability for fine meshes when endowed with either of the parametric maps T^{(α)}_{i,L}, i = 1, 2; therefore, as the mesh size grows, they are not as useful as one might hope for computing the optimal state and control trajectories to high precision. In fact, the current paper has shown that two sources of difficulty arise in handling the horizon in IHOCs by a domain transformation that maps the infinite horizon to the finite horizon [−1, 1) through the algebraic and logarithmic maps T^{(α)}_{i,L}, i = 1, 2, namely: (i) the exponential growth of the mappings' slopes near the right boundary τ = 1, which increases the truncation errors produced in the FHOCI discretization without bound as τ → 1; and (ii) although both mappings T^{(α)}_{i,L}, i = 1, 2 have a singularity at τ = 1, and we never actually evaluate them at the singularity, since the GGR collocation points are strictly less than 1, their derivatives are sensitive to input data errors for arguments near τ = 1; thus, both NLP1 and NLP2 are ill-conditioned for τ ≈ 1. These theoretical facts, as well as the observed empirical data, are considerable reasons to say that typical direct spectral/PS- and IPS-methods based on classical Jacobi polynomials and the parametric maps T^{(α)}_{i,L}, i = 1, 2 are foreseen to diverge as the mesh size grows large, if the computations are carried out using floating-point arithmetic and the discretizations use a single mesh grid, whether of Gauss type or equally spaced.
While Gegenbauer polynomials associated with certain nonpositive α-values are well suited for FGGR-based polynomial interpolations in Lagrange-basis form over fine meshes, as shown by Elgindy and Refat [17], this paper asserts that Gegenbauer polynomials associated with certain nonnegative α-values are more apt for GGR-based SR-interpolations over fine meshes. Moreover, for coarse mesh grids, Legendre polynomials are particularly (near) optimal basis polynomials for GGR-based SR-collocations of FHOCIs, as argued in Section 4.1.1 and supported through numerical simulations. On the other hand, Gegenbauer polynomials associated with certain positive values of α ∈ (1/2, 2] are optimal for IHOC collocations over fine mesh grids, as they can largely slow down the exponential growth of both parametric maps T^{(α)}_{i,L}, i = 1, 2, and their associated GGR collocation points are less dense near τ = 1; thus, the sensitivity of computing T′_{i,L}, i = 1, 2 at arguments near τ = 1 is significantly attenuated. The paper also shows that the parametric map T^{(α)}_{1,L} is more severely sensitive for τ ≈ 1 than T^{(α)}_{2,L}, and the derivative family {T^{(α),(m)}_{1,L}}_{m=0}^∞ grows faster than {T^{(α),(m)}_{2,L}}_{m=0}^∞ as τ → 1; therefore, T^{(α)}_{2,L} is more apt for the domain transformation of IHOCs than T^{(α)}_{1,L} for collocation points of Gauss/GR type.
It is worth mentioning that direct IPS methods based on the proposed Gegenbauer SR-collocation can exhibit faster convergence rates for coarse meshes by regulating the map scaling parameter L and the Gegenbauer parameter α. In light of the stability analysis conducted in Section 4.1.1, GGR-based SR-collocations of well-conditioned problems are generally endorsed for α-values within/near the SRCIC Υ^G_{1/2,1}; the current study also supports this rule of thumb for IHOCs converted into FHOCIs through the parametric maps T^{(α)}_{i,L}, i = 1, 2 and then collocated at relatively coarse mesh grids. However, the question of how to find the optimal map scaling parameter L* for IHOCs remains open. An interesting direction for future work may involve the study of new mappings with smaller growth rates and derivatives less sensitive to input data errors.

Appendix A. The Barycentric GRDM
To construct the barycentric GRDM, we follow the derivation presented in [16] and multiply both sides of Eq. (4.5) by x − τ_j ∀j to render them differentiable at x = τ_j. Since S(τ_j) = ξ_j, S′(τ_j) = Σ_{k≠j} ξ_k/(τ_j − τ_k), and L_{n,i}(τ_j) = 0 ∀i ≠ j, the off-diagonal elements of the differentiation matrix D = (d_{j,i})_{0≤j,i≤n} can be calculated by the following formula:

d_{j,i} = (ξ_i/ξ_j)/(τ_j − τ_i), i ≠ j.
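As an illustrative sketch of this construction (not the paper's GGR-specific implementation), the snippet below builds a differentiation matrix from arbitrary distinct nodes via the off-diagonal barycentric formula above, filling the diagonal with the standard negative-sum trick d_{j,j} = −Σ_{i≠j} d_{j,i}; Chebyshev points stand in for the GGR points, and the brute-force weight formula ξ_j = 1/Π_{k≠j}(τ_j − τ_k) replaces the paper's trigonometric weight algorithms:

```python
import math

def bary_weights(tau):
    """Barycentric weights xi_j = 1 / prod_{k != j} (tau_j - tau_k)."""
    n = len(tau)
    w = []
    for j in range(n):
        p = 1.0
        for k in range(n):
            if k != j:
                p *= tau[j] - tau[k]
        w.append(1.0 / p)
    return w

def diff_matrix(tau, w):
    """Differentiation matrix: d_{j,i} = (xi_i/xi_j)/(tau_j - tau_i) off the
    diagonal; the diagonal uses the negative-sum trick (row sums vanish)."""
    n = len(tau)
    D = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(n):
            if i != j:
                D[j][i] = (w[i] / w[j]) / (tau[j] - tau[i])
        D[j][j] = -sum(D[j][i] for i in range(n) if i != j)
    return D

# Sanity check: the matrix differentiates low-degree polynomials exactly.
tau = [math.cos(math.pi * j / 5) for j in range(6)]   # 6 Chebyshev sample nodes
w = bary_weights(tau)
D = diff_matrix(tau, w)
p = [t ** 4 - 2.0 * t ** 2 for t in tau]              # p(t)  = t^4 - 2t^2
dp_exact = [4.0 * t ** 3 - 4.0 * t for t in tau]      # p'(t) = 4t^3 - 4t
dp_num = [sum(D[j][i] * p[i] for i in range(6)) for j in range(6)]
```

Since the interpolant of a degree-4 polynomial on 6 nodes reproduces the polynomial itself, `dp_num` matches `dp_exact` to rounding error, which is a convenient unit test for any differentiation-matrix code.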