
Fast Convergence of Inertial Dynamics with Hessian-Driven Damping Under Geometry Assumptions

Applied Mathematics & Optimization

Abstract

First-order optimization algorithms can be viewed as discretizations of ordinary differential equations (ODEs) (Su et al. in Adv Neural Inf Process Syst 27, 2014). From this perspective, studying the properties of the corresponding trajectories may lead to convergence results that can be transferred to the numerical scheme. In this paper we analyse the following ODE introduced by Attouch et al. (J Differ Equ 261(10):5734–5783, 2016):

$$\begin{aligned} \forall t\geqslant t_0,~\ddot{x}(t)+\frac{\alpha }{t}{\dot{x}}(t)+\beta H_F(x(t)){\dot{x}}(t)+\nabla F(x(t))=0, \end{aligned}$$

where \(\alpha >0\), \(\beta >0\) and \(H_F\) denotes the Hessian of F. This ODE can be discretized into numerical schemes which do not require F to be twice differentiable, as shown in Attouch et al. (Math Program 1–43, 2020) and Attouch et al. (Optimization 72:1–40, 2021). We provide strong convergence results on the error \(F(x(t))-F^*\) and integrability properties of \(\Vert \nabla F(x(t))\Vert \) under geometry assumptions on F, such as quadratic growth around the set of minimizers. In particular, we show that the decay rate of the error for a strongly convex function is \(O(t^{-\alpha -\varepsilon })\) for any \(\varepsilon >0\). These results are briefly illustrated at the end of the paper.
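As a complement, here is a minimal numerical sketch (our own illustration, not the discrete schemes of the papers cited above): it integrates the ODE for the strongly convex quadratic \(F(x)=\frac{\mu }{2}\Vert x\Vert ^2\), for which \(\nabla F(x)=\mu x\) and \(H_F(x)=\mu I\); all parameter values and tolerances below are arbitrary demo choices.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative setup: F(x) = (mu/2)||x||^2, grad F(x) = mu*x, H_F(x) = mu*I.
mu, alpha, beta, t0 = 1.0, 4.0, 0.5, 1.0

def rhs(t, z):
    """First-order form of x'' + (alpha/t) x' + beta H_F(x) x' + grad F(x) = 0."""
    x, v = z[:2], z[2:]
    return np.concatenate([v, -(alpha / t) * v - beta * mu * v - mu * x])

z0 = np.array([1.0, -1.0, 0.0, 0.0])           # x(t0) = (1, -1), zero velocity
sol = solve_ivp(rhs, (t0, 40.0), z0, rtol=1e-10, atol=1e-13, dense_output=True)

for t in (5.0, 10.0, 20.0, 40.0):
    x = sol.sol(t)[:2]
    print(f"t = {t:4.0f}   F(x(t)) - F* = {0.5 * mu * x @ x:.3e}")
# For a quadratic, beta * H_F is a constant damping, so the decay is in fact
# exponential -- consistent with, and faster than, the O(t^(-alpha-eps)) bound.
```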


References

  1. Adly, S., Attouch, H.: Finite convergence of proximal-gradient inertial algorithms combining dry friction with Hessian-driven damping. SIAM J. Optim. 30(3), 2134–2162 (2020)

  2. Alvarez, F., Attouch, H., Bolte, J., Redont, P.: A second-order gradient-like dissipative dynamical system with Hessian-driven damping: application to optimization and mechanics. J. Math. Pures Appl. 81(8), 747–779 (2002)

  3. Apidopoulos, V., Aujol, J.F., Dossal, C., Rondepierre, A.: Convergence rates of an inertial gradient descent algorithm under growth and flatness conditions. Math. Program. 187(1), 151–193 (2021)


  4. Attouch, H., Balhag, A., Chbani, Z., Riahi, H.: Accelerated gradient methods combining Tikhonov regularization with geometric damping driven by the Hessian. arXiv preprint arXiv:2203.05457 (2022)

  5. Attouch, H., Balhag, A., Chbani, Z., Riahi, H.: Fast convex optimization via inertial dynamics combining viscous and Hessian-driven damping with time rescaling. Evol. Equ. Control Theory 11(2), 487–514 (2022)

  6. Attouch, H., Chbani, Z., Fadili, J., Riahi, H.: First-order optimization algorithms via inertial systems with Hessian driven damping. Math. Program. 193(1), 113–155 (2020)

  7. Attouch, H., Chbani, Z., Fadili, J., Riahi, H.: Convergence of iterates for first-order optimization algorithms with inertia and Hessian driven damping. Optimization 72, 1–40 (2021)

  8. Attouch, H., Chbani, Z., Peypouquet, J., Redont, P.: Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity. Math. Program. 168(1), 123–175 (2018)


  9. Attouch, H., Chbani, Z., Riahi, H.: Rate of convergence of the Nesterov accelerated gradient method in the subcritical case \(\alpha \le 3\). ESAIM Control Optim. Calc. Var. 25, 2 (2019)

  10. Attouch, H., Fadili, J., Kungurtsev, V.: On the effect of perturbations, errors in first-order optimization methods with inertia and Hessian driven damping. arXiv preprint arXiv:2106.16159 (2021)

  11. Attouch, H., Goudou, X., Redont, P.: The heavy ball with friction method, I. The continuous dynamical system: global exploration of the local minima of a real-valued function by asymptotic analysis of a dissipative dynamical system. Commun. Contemp. Math. 2(1), 1–34 (2000)

  12. Attouch, H., Maingé, P.E., Redont, P.: A second-order differential system with Hessian-driven damping; application to non-elastic shock laws. Differ. Equ. Appl. 4(1), 27–65 (2012)

  13. Attouch, H., Peypouquet, J., Redont, P.: Fast convex optimization via inertial dynamics with Hessian driven damping. J. Differ. Equ. 261(10), 5734–5783 (2016)

  14. Aujol, J.F., Dossal, C., Rondepierre, A.: Optimal convergence rates for Nesterov acceleration. SIAM J. Optim. 29(4), 3131–3153 (2019)


  15. Aujol, J.F., Dossal, C., Rondepierre, A.: Convergence rates of the heavy-ball method for quasi-strongly convex optimization. HAL preprint: hal-02545245v2 (2021)

  16. Aujol, J.F., Dossal, C., Rondepierre, A.: FISTA is an automatic geometrically optimized algorithm for strongly convex functions. HAL preprint: hal-03491527 (2021). https://hal.archives-ouvertes.fr/hal-03491527

  17. Aujol, J.F., Dossal, C., Rondepierre, A.: Convergence rates of the heavy-ball method under the Łojasiewicz property. Math. Program. 198, 1–60 (2022)

  18. Balti, M., May, R.: Asymptotic for the perturbed heavy ball system with vanishing damping term. arXiv preprint arXiv:1609.00135 (2016)

  19. Bolte, J., Daniilidis, A., Ley, O., Mazet, L.: Characterizations of Łojasiewicz inequalities: subgradient flows, talweg, convexity. Trans. Am. Math. Soc. 362(6), 3319–3363 (2010)

  20. Bolte, J., Nguyen, T.P., Peypouquet, J., Suter, B.W.: From error bounds to the complexity of first-order descent methods for convex functions. Math. Program. 165, 471–507 (2017)


  21. Boţ, R.I., Csetnek, E.R., László, S.C.: Tikhonov regularization of a second order dynamical system with Hessian driven damping. Math. Program. 189(1), 151–186 (2021)

  22. Cabot, A., Engler, H., Gadat, S.: On the long time behavior of second order differential equations with asymptotically small dissipation. Trans. Am. Math. Soc. 361(11), 5983–6017 (2009)


  23. Garrigos, G., Rosasco, L., Villa, S.: Convergence of the forward-backward algorithm: beyond the worst case with the help of geometry. arXiv preprint arXiv:1703.09477 (2017)

  24. Jendoubi, M.A., May, R.: Asymptotics for a second-order differential equation with nonautonomous damping and an integrable source term. Appl. Anal. 94(2), 435–443 (2015)


  25. Li, B., Shi, B., Yuan, Y.X.: Linear convergence of Nesterov-1983 with the strong convexity. arXiv preprint arXiv:2306.09694 (2023)

  26. Maulen, J.J., Peypouquet, J.: A speed restart scheme for a dynamics with Hessian driven damping. arXiv preprint arXiv:2301.12240 (2023)

  27. Nesterov, Y.: A method of solving a convex programming problem with convergence rate \(O(1/k^2)\). Sov. Math. Dokl. 27(2), 372–376 (1983)

  28. Sebbouh, O., Dossal, C., Rondepierre, A.: Nesterov’s acceleration and Polyak’s heavy ball method in continuous time: convergence rate analysis under geometric conditions and perturbations. arXiv preprint arXiv:1907.02710 (2019)

  29. Shi, B., Du, S.S., Jordan, M.I., Su, W.J.: Understanding the acceleration phenomenon via high-resolution differential equations. Math. Program. 195(1), 79–148 (2021)


  30. Su, W., Boyd, S., Candès, E.: A differential equation for modeling Nesterov's accelerated gradient method: theory and insights. J. Mach. Learn. Res. 17(153), 1–43 (2016)


Funding

The authors acknowledge the support of the French Agence Nationale de la Recherche (ANR) under reference ANR-PRC-CE23 MaSDOL, the support of the FMJH Program PGMO 2019-0024, and the support to this program from EDF, Thales and Orange.

Author information


Corresponding author

Correspondence to Hippolyte Labarrière.

Ethics declarations

Competing Interests

The authors have not disclosed any competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A Appendix

A.1 Supplementary Material for Remark 1

Let \(\alpha =1+\frac{2}{\gamma }\). We consider the same Lyapunov energy as in the case \(\alpha >1+\frac{2}{\gamma }\), i.e.

$$\begin{aligned} {\mathcal {E}}(t)=\left( t^{2}+t \beta (\lambda -\alpha )\right) \left( F(x(t))-F^{*}\right) +\frac{1}{2}\left\| \lambda \left( x(t)-x^{*}\right) +t\left( {\dot{x}}(t)+\beta \nabla F(x(t))\right) \right\| ^{2}, \end{aligned}$$

where \(\lambda =\frac{2\alpha }{\gamma +2}\).

Observe that Lemmas 2 and 3 are both valid for this value of \(\alpha \). By noticing that \(K(\alpha )=0\) and \(\alpha -\lambda =1\), we then get that for all \(t> \max \{t_0,\beta \}\),

$$\begin{aligned} {\mathcal {E}}^\prime (t)+\beta t(t-\beta )\Vert \nabla F(x(t))\Vert ^2\leqslant \frac{\beta }{t(t-\beta )}{\mathcal {E}}(t). \end{aligned}$$
(78)
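Since the gradient term in (78) is nonnegative and \(\frac{\beta }{t(t-\beta )}\leqslant \frac{\beta }{(t-\beta )^2}\) for all \(t>\beta \), inequality (78) implies

$$\begin{aligned} \frac{d}{dt}\left( {\mathcal {E}}(t)e^{\frac{\beta }{t-\beta }}\right) =\left( {\mathcal {E}}^\prime (t)-\frac{\beta }{(t-\beta )^2}{\mathcal {E}}(t)\right) e^{\frac{\beta }{t-\beta }}\leqslant 0. \end{aligned}$$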

Hence \(t\mapsto {\mathcal {E}}(t)e^{\frac{\beta }{t-\beta }}\) is decreasing on \([t_0+\beta ,+\infty )\). Consequently, for all \(t\geqslant t_0+\beta \),

$$\begin{aligned} {\mathcal {E}}(t)\leqslant {\mathcal {E}}(t_0+\beta )e^{-\frac{\beta }{t-\beta }+\frac{\beta }{t_0}}\leqslant {\mathcal {E}}(t_0+\beta )e^{\frac{\beta }{t_0}}. \end{aligned}$$

Considering the expression of \({\mathcal {E}}\), this directly implies that:

$$\begin{aligned} \forall t\geqslant t_0+\beta , \quad F(x(t))-F^*\leqslant \frac{e^\frac{\beta }{t_0}{\mathcal {E}}(t_0+\beta )}{t(t-\beta )}. \end{aligned}$$
(79)

The above inequality implies the first claim of Remark 1. Note that the growth condition \({\mathcal {G}}^2_\mu \) is not used in this proof.

However, this geometry condition can be used to obtain an upper bound on \(F(x(t))-F^*\) in terms of the mechanical energy \(E_m\), defined for all \(t\geqslant t_0+\beta \) by:

$$\begin{aligned} E_{m}(t)=\left( 1+\dfrac{\beta \alpha }{t}\right) \left( F(x(t))-F^{*}\right) +\dfrac{1}{2}\left\| {\dot{x}}(t)+\beta \nabla F(x(t))\right\| ^{2}. \end{aligned}$$

The assumption \({\mathcal {G}}^2_\mu \) and the decreasing behaviour of \(E_m\) ensure that

$$\begin{aligned} {\mathcal {E}}(t_0+\beta )&=t_0\left( t_0+\beta \right) \left( F(x(t_0+\beta ))-F^*\right) \\&\quad +\frac{1}{2}\left\| \lambda (x(t_0+\beta )-x^*)+(t_0+\beta )({\dot{x}}(t_0+\beta )+\beta \nabla F(x(t_0+\beta )))\right\| ^2\\&\leqslant \left( (t_0+\beta )^2+\frac{\lambda ^2+\sqrt{\mu }}{\mu }\right) \left( F(x(t_0+\beta ))-F^*\right) \\&\quad +\frac{(t_0+\beta )^2+\frac{1}{\sqrt{\mu }}}{2}\left\| {\dot{x}}(t_0+\beta )+\beta \nabla F(x(t_0+\beta ))\right\| ^2\\&\leqslant \left( (t_0+\beta )^2+\frac{\lambda ^2+\sqrt{\mu }}{\mu }\right) E_m(t_0+\beta )\leqslant \left( (t_0+\beta )^2+\frac{\lambda ^2+\sqrt{\mu }}{\mu }\right) E_m(t_0), \end{aligned}$$

using inequality (62). Hence, for all \(t\geqslant t_0+\beta \),

$$\begin{aligned} F(x(t))-F^*\leqslant \left( (t_0+\beta )^2+\frac{\lambda ^2+\sqrt{\mu }}{\mu }\right) e^{\frac{\beta }{t_0}}\frac{E_m(t_0)}{t(t-\beta )}. \end{aligned}$$
(80)

Inequality (78) also guarantees that

$$\begin{aligned} t\mapsto {\mathcal {E}}(t)e^{\frac{\beta }{t-\beta }}+\int _{t_0+\beta }^t\beta u(u-\beta )e^{\frac{\beta }{u-\beta }}\Vert \nabla F(x(u))\Vert ^2du, \end{aligned}$$

is bounded on \((t_0+\beta ,+\infty )\). As \({\mathcal {E}}(t)e^{\frac{\beta }{t-\beta }}\) is positive for all \(t\geqslant t_0+\beta \), we can deduce that there exists \(M>0\) such that for all \(t\geqslant t_0+\beta \),

$$\begin{aligned} \int _{t_0+\beta }^t (u-\beta )^2\Vert \nabla F(x(u))\Vert ^2du\leqslant \int _{t_0+\beta }^t u(u-\beta )e^{\frac{\beta }{u-\beta }}\Vert \nabla F(x(u))\Vert ^2du<M, \end{aligned}$$

and thus,

$$\begin{aligned} \int _{t_0+\beta }^{+\infty } (u-\beta )^2\Vert \nabla F(x(u))\Vert ^2du<+\infty . \end{aligned}$$
(81)

By using the same arguments as in the proof of Theorem 1, we conclude that:

$$\begin{aligned} \int _{t_0}^{+\infty } u^2\Vert \nabla F(x(u))\Vert ^2du<+\infty . \end{aligned}$$
(82)

A.2 Proof of Corollary 1

The first claim is obtained by combining Theorem 2 with the following lemma, whose proof is given in Appendix A.5.

Lemma 10

Let \(F: \mathbb {R}^{n} \rightarrow \mathbb {R}\) be a convex function with a non-empty set of minimizers, and let \(F^*=\inf \limits _{x\in \mathbb {R}^n}F(x)\). Assume that, for some \(t_1>0\) and \(\delta >0\),

$$\begin{aligned} \int _{t_1}^{+\infty }u^\delta (F(x(u))-F^*)du<+\infty . \end{aligned}$$

Let \(z:t\mapsto \frac{\int _{t/2}^tu^\delta x(u)du}{\int _{t/2}^tu^\delta du}\). Then, as \(t\rightarrow +\infty \),

$$\begin{aligned} F(z(t))-F^*=o\left( t^{-\delta -1}\right) . \end{aligned}$$
(83)

The second and third claims are proved by applying Lemma 11 to \(\phi :x\mapsto F(x)-F^*\). The proof of this lemma is given in Appendix A.6.

Lemma 11

Let \(\phi : \mathbb {R}^{n} \rightarrow \mathbb {R}^+\) be such that, for some \(t_1>0\) and \(\delta >0\),

$$\begin{aligned} \int _{t_1}^{+\infty }u^\delta \phi (x(u))du<+\infty . \end{aligned}$$

Then, as \(t\rightarrow +\infty \),

$$\begin{aligned} \inf \limits _{u\in [t/2,t]}\phi (x(u))=o\left( t^{-\delta -1}\right) \quad \text{ and } \quad \liminf \limits _{t\rightarrow +\infty } t^{\delta +1}\log (t)\phi (x(t))=0. \end{aligned}$$
(84)
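The averaged point \(z(t)\) of Lemma 10 is cheap to form from a stored trajectory. Below is a minimal Python sketch (our illustration; the sampling grid, the trapezoidal quadrature and all names are assumptions, not part of the paper):

```python
import numpy as np

def averaged_point(t, u_grid, x_grid, delta):
    """z(t) = (int_{t/2}^t u^delta x(u) du) / (int_{t/2}^t u^delta du),
    approximated by trapezoidal quadrature on the stored samples."""
    mask = (u_grid >= t / 2) & (u_grid <= t)
    u, x = u_grid[mask], x_grid[mask]
    w = u ** delta
    num = np.trapz(w[:, None] * x, u, axis=0)   # componentwise int u^delta x(u) du
    den = np.trapz(w, u)                        # int u^delta du
    return num / den

# Toy usage: a decaying trajectory sampled on a grid, averaged at t = 100.
u_grid = np.linspace(1.0, 100.0, 2000)
x_grid = np.exp(-u_grid)[:, None] * np.ones((1, 2))
z = averaged_point(100.0, u_grid, x_grid, delta=2.0)
```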

A.3 Proof of Corollary 2

Let \(F: \mathbb {R}^{n} \rightarrow \mathbb {R}\) be a convex \(C^2\) function having a unique minimizer \(x^*\). Assume that F satisfies \({\mathcal {H}}_{\gamma _1}\) and \({\mathcal {G}}_\mu ^{\gamma _2}\) for some \(\gamma _1>2\), \(\gamma _2>2\) such that \(\gamma _{1}\geqslant \gamma _{2}\) and \(\mu >0\). Let x be a solution of (DIN-AVD) for all \(t\geqslant t_0\) where \(t_0>0\), \(\alpha \geqslant \frac{\gamma _{1}+2}{\gamma _{1}-2}\) and \(\beta >0\). Theorem 3 ensures that:

$$\begin{aligned} \int _{t_0}^{+\infty }u^{\frac{2\gamma _1}{\gamma _1-2}}\Vert \nabla F(x(u))\Vert ^2du<+\infty . \end{aligned}$$

Moreover, as F satisfies \({\mathcal {G}}_\mu ^{\gamma _2}\) for some \(\gamma _2>2\), Lemma 1 implies that:

$$\begin{aligned} \int _{t_0}^{+\infty }u^{\frac{2\gamma _1}{\gamma _1-2}}\left( F(x(u))-F^*\right) ^{\frac{2(\gamma _2-1)}{\gamma _2}}du<+\infty . \end{aligned}$$
(85)

By applying Lemma 11 to \(\phi :x\mapsto \left( F(x)-F^*\right) ^{\frac{2(\gamma _2-1)}{\gamma _2}}\), we get that as t tends to \(+\infty \),

$$\begin{aligned} \inf \limits _{u\in \left[ t/2,t\right] }\left( F(x(u))-F^*\right) ^{\frac{2(\gamma _2-1)}{\gamma _2}} =o\left( t^{-\frac{3\gamma _1-2}{\gamma _1-2}}\right) . \end{aligned}$$

Hence,

$$\begin{aligned} \inf \limits _{u\in \left[ t/2,t\right] }F(x(u))-F^*=o\left( t^{-\frac{(3\gamma _1-2)\gamma _2}{2(\gamma _1-2)(\gamma _2-1)}}\right) . \end{aligned}$$
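Here \(\frac{3\gamma _1-2}{\gamma _1-2}=\frac{2\gamma _1}{\gamma _1-2}+1=\delta +1\) for the weight \(\delta =\frac{2\gamma _1}{\gamma _1-2}\) used above, and the final rate follows from raising the previous estimate to the power \(\frac{\gamma _2}{2(\gamma _2-1)}\):

$$\begin{aligned} \left( t^{-\frac{3\gamma _1-2}{\gamma _1-2}}\right) ^{\frac{\gamma _2}{2(\gamma _2-1)}}=t^{-\frac{(3\gamma _1-2)\gamma _2}{2(\gamma _1-2)(\gamma _2-1)}}. \end{aligned}$$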

A.4 Proof of Lemma 6

Let \(F: \mathbb {R}^{n} \rightarrow \mathbb {R}\) be a convex \(C^2\) function with a non-empty set of minimizers \(X^*\). Let \(\delta \in (0,1]\) and \(x^*\in X^*\).

We introduce the following lemma which is proved in Appendix A.8.

Lemma 12

Let \(F: \mathbb {R}^{n} \rightarrow \mathbb {R}\) be a \(C^2\) function. Then, for all \(x\in \mathbb {R}^n\) and \(\varepsilon >0\), there exists \(\nu >0\) such that for all \(y\in B(x,\nu )\):

$$\begin{aligned} (1-\varepsilon )(y-x)^TH_F(x)(y-x)\leqslant (y-x)^TH_F(y)(y-x)\leqslant (1+\varepsilon )(y-x)^TH_F(x)(y-x). \end{aligned}$$
(86)

As F is a \(C^2\) function, Lemma 12 ensures that there exists \(\nu >0\) such that for all \(x\in B\left( x^*,\nu \right) \):

$$\begin{aligned} \left( 1-\frac{\delta }{4-\delta }\right) K(x)\leqslant (x-x^*)^T H_F(x) (x-x^*)\leqslant \left( 1+\frac{\delta }{4-\delta }\right) K(x), \end{aligned}$$
(87)

where \(K(x)=(x-x^*)^T H_F(x^*) (x-x^*)\).

Let \(\phi _{x,x^*}\) be defined as follows:

$$\begin{aligned} \phi _{x,x^*}:[0,1]&\rightarrow \mathbb {R}\\ t&\mapsto F\left( tx+(1-t)x^*\right) , \end{aligned}$$

for some \(x\in B\left( x^*,\nu \right) \). The function \(\phi _{x,x^*}\) is twice differentiable and we have that for all \(t\in [0,1]\):

$$\begin{aligned} \begin{aligned} \phi _{x,x^*}^\prime (t)&=(x-x^*)^T\nabla F(tx+(1-t)x^*),\\ \phi _{x,x^*}^{\prime \prime }(t)&=(x-x^*)^TH_F(tx+(1-t)x^*)(x-x^*). \end{aligned} \end{aligned}$$

By rewriting (87) at the point \(tx+(1-t)x^*\) it follows that for all \(t\in [0,1]\):

$$\begin{aligned} \left( 1-\frac{\delta }{4-\delta }\right) \phi _{x,x^*}^{\prime \prime }(0) \leqslant \phi _{x,x^*}^{\prime \prime }(t)\leqslant \left( 1+\frac{\delta }{4-\delta }\right) \phi _{x,x^*}^{\prime \prime }(0). \end{aligned}$$
(88)

By integrating the left-hand inequality of (88) and noticing that \(\phi _{x,x^*}^{\prime }(0)=0\) (since \(\nabla F(x^*)=0\)), we get that:

$$\begin{aligned} \forall t\in [0,1],~\left( 1-\frac{\delta }{4-\delta }\right) \phi _{x,x^*}^{\prime \prime }(0)t\leqslant \phi _{x,x^*}^{\prime }(t). \end{aligned}$$

By integrating the right-hand inequality of (88), we get that:

$$\begin{aligned} \forall t\in [0,1],~ \phi _{x,x^*}(t)-\phi _{x,x^*}(0)\leqslant \left( 1+\frac{\delta }{4-\delta }\right) \phi _{x,x^*}^{\prime \prime }(0)\frac{t^2}{2}, \end{aligned}$$

and consequently, combining this with the lower bound on \(\phi _{x,x^*}^{\prime }(t)\) above and noting that \(\frac{1+\delta /(4-\delta )}{1-\delta /(4-\delta )}=\frac{2}{2-\delta }\),

$$\begin{aligned} \forall t\in [0,1],~ \phi _{x,x^*}(t)-\phi _{x,x^*}(0)\leqslant \frac{1}{2-\delta }t\phi _{x,x^*}^{\prime }(t). \end{aligned}$$

By choosing \(t=1\) and rewriting \(\phi _{x,x^*}\) and \(\phi _{x,x^*}^\prime \) we deduce that

$$\begin{aligned} F(x)-F^*\leqslant \frac{1}{2-\delta }\langle \nabla F(x), x-x^*\rangle . \end{aligned}$$
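For instance, for the quadratic \(F(x)=\frac{1}{2}\Vert x-x^*\Vert ^2\) we have \(\langle \nabla F(x), x-x^*\rangle =2(F(x)-F^*)\), so the stated inequality holds for any \(\delta \in (0,1]\).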

A.5 Proof of Lemma 10

Let \(F: \mathbb {R}^{n} \rightarrow \mathbb {R}\) be a convex function with a non-empty set of minimizers, and let \(F^*=\inf \limits _{x\in \mathbb {R}^n}F(x)\). Assume that, for some \(t_1>0\) and \(\delta >0\),

$$\begin{aligned} \int _{t_1}^{+\infty }u^\delta (F(x(u))-F^*)du<+\infty . \end{aligned}$$
(89)

Let \(\varepsilon >0\). Assumption (89) ensures that there exists \(t_2\geqslant 2t_1\) such that:

$$\begin{aligned} \forall t\geqslant t_2,\quad \int _{t/2}^{t}u^\delta (F(x(u))-F^*)du<\varepsilon . \end{aligned}$$

Let z be defined as follows:

$$\begin{aligned} z:t\mapsto \frac{\int _{t/2}^tu^\delta x(u)du}{\int _{t/2}^tu^\delta du}. \end{aligned}$$

Let \(t\geqslant t_2\). We define \(\nu \) as:

$$\begin{aligned} \nu : {\mathcal {B}}([t/2,t])&\rightarrow [0,1]\\ A&\mapsto \frac{\int _Au^\delta du}{\int _{t/2}^tu^\delta du}, \end{aligned}$$

where \({\mathcal {B}}([t/2,t])\) is the Borel \(\sigma \)-algebra on [t/2, t]. Then, we can write that \(z(t)=\int _{t/2}^t x(u)d\nu (u)\). As \(\nu ([t/2,t])=1\) and F is a convex function, Jensen’s inequality ensures that:

$$\begin{aligned} F(z(t))-F^*&=F\left( \int _{t/2}^t x(u)d\nu (u)\right) -F^*\\&\leqslant \int _{t/2}^t F(x(u))d\nu (u)-F^*\\&\leqslant \int _{t/2}^t \left( F(x(u))-F^*\right) d\nu (u)\\&\leqslant \frac{\varepsilon }{\int _{t/2}^tu^\delta du}. \end{aligned}$$

Hence, since \(\int _{t/2}^tu^\delta du=\frac{1-2^{-\delta -1}}{\delta +1}t^{\delta +1}\) and \(\varepsilon \) is arbitrary, \(F(z(t))-F^*=o\left( t^{-\delta -1}\right) \) as t tends to \(+\infty \).

A.6 Proof of Lemma 11

Let \(\phi : \mathbb {R}^{n} \rightarrow \mathbb {R}^+\) be such that, for some \(t_1>0\) and \(\delta >0\),

$$\begin{aligned} \int _{t_1}^{+\infty }u^\delta \phi (x(u))du<+\infty . \end{aligned}$$
(90)

Let \(\varepsilon >0\). Assumption (90) guarantees that there exists \(t_2\geqslant 2t_1\) such that

$$\begin{aligned} \forall t\geqslant t_2,\quad \int _{t/2}^t u^\delta \phi (x(u))du<\varepsilon . \end{aligned}$$

Consequently, for all \(t\geqslant t_2\),

$$\begin{aligned} \inf \limits _{u\in [t/2,t]}\phi (x(u))\int _{t/2}^t u^\delta du<\varepsilon , \end{aligned}$$

and

$$\begin{aligned} \inf \limits _{u\in [t/2,t]}\phi (x(u))<\frac{\varepsilon (\delta +1)}{t^{\delta +1}-\left( \frac{t}{2}\right) ^{\delta +1}}. \end{aligned}$$

Hence, since \(\varepsilon \) is arbitrary and \(t^{\delta +1}-\left( \frac{t}{2}\right) ^{\delta +1}=\left( 1-2^{-\delta -1}\right) t^{\delta +1}\), as \(t\rightarrow +\infty \),

$$\begin{aligned} \inf \limits _{u\in [t/2,t]}\phi (x(u))=o\left( t^{-\delta -1}\right) . \end{aligned}$$
(91)

We recall that \(\liminf \limits _{t\rightarrow +\infty } f(t)=\lim \limits _{t\rightarrow +\infty }\left[ \inf \limits _{\tau \geqslant t}f(\tau )\right] \). As \(\phi \) is a positive function, we get that:

$$\begin{aligned} \liminf \limits _{t\rightarrow +\infty } t^{\delta +1}\log (t)\phi (x(t))=l\geqslant 0. \end{aligned}$$

Suppose that \(l>0\). Then there exists \({\hat{t}}>t_1\) such that:

$$\begin{aligned} \forall t\geqslant {\hat{t}},\quad t^{\delta +1}\log (t)\phi (x(t))\geqslant \frac{l}{2}, \end{aligned}$$

and hence:

$$\begin{aligned} \forall t\geqslant {\hat{t}},\quad t^\delta \phi (x(t))\geqslant \frac{l}{2t\log (t)}. \end{aligned}$$

This inequality cannot hold: since \(u\mapsto \frac{l}{2u\log (u)}\) is not integrable on \([{\hat{t}},+\infty )\), it would contradict (90). We can deduce that \(l=0\).

A.7 Proof of Lemma 5

Let \(u\in \mathbb {R}^n\), \(v\in \mathbb {R}^n\) and \(a>0\). The first inequality comes from the following inequalities:

$$\begin{aligned} \langle u,v\rangle =\frac{1}{2}\left\| \sqrt{a}u+\frac{v}{\sqrt{a}}\right\| ^2-\frac{a}{2}\Vert u\Vert ^2-\frac{1}{2a}\Vert v\Vert ^2\geqslant -\frac{a}{2}\Vert u\Vert ^2-\frac{1}{2a}\Vert v\Vert ^2, \end{aligned}$$

and

$$\begin{aligned} \langle u,v\rangle =\frac{a}{2}\Vert u\Vert ^2+\frac{1}{2a}\Vert v\Vert ^2-\frac{1}{2}\left\| \sqrt{a}u-\frac{v}{\sqrt{a}}\right\| ^2\leqslant \frac{a}{2}\Vert u\Vert ^2+\frac{1}{2a}\Vert v\Vert ^2. \end{aligned}$$

The second inequality is proved by rewriting \(\Vert u\Vert ^2\) as follows:

$$\begin{aligned} \Vert u\Vert ^2=\Vert u+v\Vert ^2+\Vert v\Vert ^2-2\langle u+v,v\rangle , \end{aligned}$$

and by applying the first inequality to \(\langle u+v,v\rangle \).
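These identities are easy to get wrong by a sign, so here is a quick numerical sanity check of the first inequality of Lemma 5 (a throwaway Python script of ours, not part of the proof):

```python
import numpy as np

# Check  -a/2 ||u||^2 - 1/(2a) ||v||^2 <= <u,v> <= a/2 ||u||^2 + 1/(2a) ||v||^2.
rng = np.random.default_rng(0)
for _ in range(10_000):
    n = int(rng.integers(1, 10))
    u, v = rng.standard_normal(n), rng.standard_normal(n)
    a = float(rng.uniform(0.1, 10.0))
    bound = a / 2 * (u @ u) + 1 / (2 * a) * (v @ v)
    assert -bound - 1e-12 <= u @ v <= bound + 1e-12
print("Lemma 5, first inequality: OK")
```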

A.8 Proof of Lemma 12

Let \(F: \mathbb {R}^{n} \rightarrow \mathbb {R}\) be a \(C^2\) function. We denote the second order partial derivatives of F by \(\partial _{ij} F =\frac{\partial ^2 F}{\partial x_i\partial x_j}\) for all \((i,j)\in \llbracket 1,n\rrbracket ^2\).

Let \(x\in \mathbb {R}^n\) and \(\varepsilon >0\). For all \((i,j)\in \llbracket 1,n\rrbracket ^2\), \(\partial _{ij} F\) is continuous on \(\mathbb {R}^n\) and consequently,

$$\begin{aligned} \exists {\tilde{\nu }}>0,~ \forall y\in B(x,{\tilde{\nu }}),~(1-\varepsilon )\partial _{ij} F(x)\leqslant \partial _{ij} F(y)\leqslant (1+\varepsilon )\partial _{ij} F(x). \end{aligned}$$

By taking the minimum of these radii \({\tilde{\nu }}\) over all \((i,j)\in \llbracket 1,n\rrbracket ^2\), we get that there exists \({\tilde{\nu }}>0\) such that:

$$\begin{aligned} \forall (i,j)\in \llbracket 1,n\rrbracket ^2,~\forall y\in B(x,{\tilde{\nu }}),~(1-\varepsilon )\partial _{ij} F(x)\leqslant \partial _{ij} F(y)\leqslant (1+\varepsilon )\partial _{ij} F(x). \end{aligned}$$
(92)

Let \(\nu =\min \left\{ {\tilde{\nu }},\left( n\max \limits _{(i,j)\in \llbracket 1,n\rrbracket ^2}|\partial _{ij}F(x)|\right) ^{-\frac{1}{2}}\right\} \), \(y\in B(x,\nu )\) and \(h=y-x\). Equation (92) gives us that for all \((i,j)\in \llbracket 1,n\rrbracket ^2\):

$$\begin{aligned} \partial _{ij} F(x)h_ih_j-\varepsilon |\partial _{ij} F(x)h_ih_j|\leqslant \partial _{ij} F(y)h_ih_j\leqslant \partial _{ij} F(x)h_ih_j+\varepsilon |\partial _{ij} F(x)h_ih_j|. \end{aligned}$$
(93)

We recall that for all \((i,j)\in \llbracket 1,n\rrbracket ^2\), \(\left( H_F(x)\right) _{i,j}=\partial _{ij}F(x)\) and therefore:

$$\begin{aligned} \forall (x,h)\in \mathbb {R}^n\times \mathbb {R}^n,~h^T H_F(x) h=\sum _{i=1}^n\sum _{j=1}^n\partial _{ij}F(x)h_ih_j. \end{aligned}$$

By summing (93) for all \((i,j)\in \llbracket 1,n\rrbracket ^2\), we get that:

$$\begin{aligned} h^TH_F(x)h-\varepsilon \sum _{i=1}^n\sum _{j=1}^n|\partial _{ij} F(x)h_ih_j|\leqslant h^TH_F(y)h\leqslant h^TH_F(x)h+\varepsilon \sum _{i=1}^n\sum _{j=1}^n|\partial _{ij} F(x)h_ih_j|. \end{aligned}$$

Noticing that \(|h_ih_j|\leqslant \frac{1}{2}\left( h_i^2+h_j^2\right) \) for all \((i,j)\in \llbracket 1,n\rrbracket ^2\), we can deduce that:

$$\begin{aligned} \sum _{i=1}^n\sum _{j=1}^n|\partial _{ij} F(x)h_ih_j|&\leqslant \max \limits _{(i,j)\in \llbracket 1,n\rrbracket ^2}|\partial _{ij}F(x)|\sum _{i=1}^n\sum _{j=1}^n|h_ih_j|\\&\leqslant n\max \limits _{(i,j)\in \llbracket 1,n\rrbracket ^2}|\partial _{ij}F(x)|\Vert h\Vert ^2\\&\leqslant n\max \limits _{(i,j)\in \llbracket 1,n\rrbracket ^2}|\partial _{ij}F(x)|\nu ^2\leqslant 1. \end{aligned}$$

Hence,

$$\begin{aligned} (1-\varepsilon ) h^TH_F(x)h\leqslant h^TH_F(y)h\leqslant (1+\varepsilon )h^TH_F(x)h. \end{aligned}$$

\(\square \)
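As a complement (not part of the proof), the bound (86) can be checked numerically for a concrete \(C^2\) function. A minimal sketch, assuming the quartic \(F(x)=\sum _{i}\left( x_i^2+x_i^4\right) \), whose Hessian is diagonal and explicit; the radius \(\nu \) below is an ad hoc small value, since the lemma only asserts existence:

```python
import numpy as np

def hess(x):
    # Hessian of F(x) = sum_i (x_i^2 + x_i^4): diagonal, entries 2 + 12 x_i^2.
    return np.diag(2.0 + 12.0 * x ** 2)

rng = np.random.default_rng(1)
x = rng.standard_normal(3)
eps, nu = 0.1, 1e-3        # ad hoc small radius for this particular F
H_x = hess(x)
for _ in range(1000):
    h = rng.standard_normal(3)
    h *= nu * rng.uniform() / np.linalg.norm(h)   # y = x + h lies in B(x, nu)
    qx, qy = h @ H_x @ h, h @ hess(x + h) @ h
    assert (1 - eps) * qx <= qy <= (1 + eps) * qx
print("(86) holds on all sampled points")
```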


About this article


Cite this article

Aujol, JF., Dossal, C., Hoàng, V.H. et al. Fast Convergence of Inertial Dynamics with Hessian-Driven Damping Under Geometry Assumptions. Appl Math Optim 88, 81 (2023). https://doi.org/10.1007/s00245-023-10058-6

