Abstract
First-order optimization algorithms can be viewed as discretizations of ordinary differential equations (ODEs) (Su et al. in Adv Neural Inf Process Syst 27, 2014). From this perspective, studying the properties of the corresponding trajectories may lead to convergence results that can be transferred to the numerical scheme. In this paper we analyse the following ODE, introduced by Attouch et al. (J Differ Equ 261(10):5734–5783, 2016):
\[ \ddot{x}(t)+\frac{\alpha }{t}\dot{x}(t)+\beta H_F(x(t))\dot{x}(t)+\nabla F(x(t))=0, \qquad \text{(DIN-AVD)} \]
where \(\alpha >0\), \(\beta >0\) and \(H_F\) denotes the Hessian of F. This ODE can be discretized to build numerical schemes which do not require F to be twice differentiable, as shown in Attouch et al. (Math Program 1–43, 2020) and Attouch et al. (Optimization 72:1–40, 2021). We provide strong convergence results on the error \(F(x(t))-F^*\) and integrability properties of \(\Vert \nabla F(x(t))\Vert \) under geometry assumptions on F such as quadratic growth around the set of minimizers. In particular, we show that the decay rate of the error for a strongly convex function is \(O(t^{-\alpha -\varepsilon })\) for any \(\varepsilon >0\). These results are briefly illustrated at the end of the paper.
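As a rough illustration of these dynamics, the sketch below integrates a one-dimensional instance of the ODE for the strongly convex toy function \(F(x)=x^2/2\) (so \(H_F\equiv 1\) and \(\nabla F(x)=x\)). The parameter values and the explicit Euler discretization are illustrative assumptions only, not the schemes analysed in the cited papers.

```python
# Sketch: integrate x''(t) + (alpha/t) x'(t) + beta*H_F(x(t)) x'(t) + grad F(x(t)) = 0
# in 1D for F(x) = x^2/2, where H_F = 1 and grad F(x) = x.  Explicit Euler with a
# small step; illustrative only (parameters and scheme are assumptions).

def simulate(alpha=3.1, beta=1.0, t0=1.0, T=50.0, dt=1e-3):
    x, v, t = 1.0, 0.0, t0                       # initial position, velocity, time
    while t < T:
        a = -(alpha / t) * v - beta * v - x      # acceleration prescribed by the ODE
        x += dt * v
        v += dt * a
        t += dt
    return x

x_end = simulate()
F_err = 0.5 * x_end ** 2                         # F(x(T)) - F*, with F* = 0
```

With the persistent Hessian-driven damping term (here simply \(\beta \dot{x}\)), the trajectory is heavily damped and the error \(F(x(t))-F^*\) collapses by many orders of magnitude over the integration window.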
References
Adly, S., Attouch, H.: Finite convergence of proximal-gradient inertial algorithms combining dry friction with Hessian-driven damping. SIAM J. Optim. 30(3), 2134–2162 (2020)
Alvarez, F., Attouch, H., Bolte, J., Redont, P.: A second-order gradient-like dissipative dynamical system with Hessian-driven damping: application to optimization and mechanics. J. Math. Pures Appl. 81(8), 747–779 (2002)
Apidopoulos, V., Aujol, J.F., Dossal, C., Rondepierre, A.: Convergence rates of an inertial gradient descent algorithm under growth and flatness conditions. Math. Program. 187(1), 151–193 (2021)
Attouch, H., Balhag, A., Chbani, Z., Riahi, H.: Accelerated gradient methods combining Tikhonov regularization with geometric damping driven by the Hessian. arXiv preprint arXiv:2203.05457 (2022)
Attouch, H., Balhag, A., Chbani, Z., Riahi, H.: Fast convex optimization via inertial dynamics combining viscous and Hessian-driven damping with time rescaling. Evol. Equ. Control Theory 11(2), 487–514 (2022)
Attouch, H., Chbani, Z., Fadili, J., Riahi, H.: First-order optimization algorithms via inertial systems with Hessian-driven damping. Math. Program. 193(1), 113–155 (2020)
Attouch, H., Chbani, Z., Fadili, J., Riahi, H.: Convergence of iterates for first-order optimization algorithms with inertia and Hessian-driven damping. Optimization 72, 1–40 (2021)
Attouch, H., Chbani, Z., Peypouquet, J., Redont, P.: Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity. Math. Program. 168(1), 123–175 (2018)
Attouch, H., Chbani, Z., Riahi, H.: Rate of convergence of the Nesterov accelerated gradient method in the subcritical case \(\alpha \le 3\). ESAIM Control Optim. Calc. Var. 25, 2 (2019)
Attouch, H., Fadili, J., Kungurtsev, V.: On the effect of perturbations and errors in first-order optimization methods with inertia and Hessian-driven damping. arXiv preprint arXiv:2106.16159 (2021)
Attouch, H., Goudou, X., Redont, P.: The heavy ball with friction method, I. The continuous dynamical system: global exploration of the local minima of a real-valued function by asymptotic analysis of a dissipative dynamical system. Commun. Contemp. Math. 2(1), 1–34 (2000)
Attouch, H., Maingé, P.E., Redont, P.: A second-order differential system with Hessian-driven damping; application to non-elastic shock laws. Differ. Equ. Appl. 4(1), 27–65 (2012)
Attouch, H., Peypouquet, J., Redont, P.: Fast convex optimization via inertial dynamics with Hessian-driven damping. J. Differ. Equ. 261(10), 5734–5783 (2016)
Aujol, J.F., Dossal, C., Rondepierre, A.: Optimal convergence rates for Nesterov acceleration. SIAM J. Optim. 29(4), 3131–3153 (2019)
Aujol, J.F., Dossal, C., Rondepierre, A.: Convergence rates of the heavy-ball method for quasi-strongly convex optimization. HAL preprint: hal-02545245v2 (2021)
Aujol, J.F., Dossal, C., Rondepierre, A.: FISTA is an automatic geometrically optimized algorithm for strongly convex functions. HAL preprint: hal-03491527 (2021). https://hal.archives-ouvertes.fr/hal-03491527
Aujol, J.F., Dossal, C., Rondepierre, A.: Convergence rates of the heavy-ball method under the Łojasiewicz property. Math. Program. 198, 1–60 (2022)
Balti, M., May, R.: Asymptotic for the perturbed heavy ball system with vanishing damping term. arXiv preprint arXiv:1609.00135 (2016)
Bolte, J., Daniilidis, A., Ley, O., Mazet, L.: Characterizations of Łojasiewicz inequalities: subgradient flows, talweg, convexity. Trans. Am. Math. Soc. 362(6), 3319–3363 (2010)
Bolte, J., Nguyen, T.P., Peypouquet, J., Suter, B.W.: From error bounds to the complexity of first-order descent methods for convex functions. Math. Program. 165, 471–507 (2017)
Boţ, R.I., Csetnek, E.R., László, S.C.: Tikhonov regularization of a second order dynamical system with Hessian-driven damping. Math. Program. 189(1), 151–186 (2021)
Cabot, A., Engler, H., Gadat, S.: On the long time behavior of second order differential equations with asymptotically small dissipation. Trans. Am. Math. Soc. 361(11), 5983–6017 (2009)
Garrigos, G., Rosasco, L., Villa, S.: Convergence of the forward-backward algorithm: beyond the worst case with the help of geometry. arXiv preprint arXiv:1703.09477 (2017)
Jendoubi, M.A., May, R.: Asymptotics for a second-order differential equation with nonautonomous damping and an integrable source term. Appl. Anal. 94(2), 435–443 (2015)
Li, B., Shi, B., Yuan, Y.X.: Linear convergence of Nesterov-1983 with the strong convexity. arXiv preprint arXiv:2306.09694 (2023)
Maulen, J.J., Peypouquet, J.: A speed restart scheme for a dynamics with Hessian-driven damping. arXiv preprint arXiv:2301.12240 (2023)
Nesterov, Y.: A method of solving a convex programming problem with convergence rate \(O(1/k^2)\). Sov. Math. Dokl. 27(2), 372–376 (1983)
Sebbouh, O., Dossal, C., Rondepierre, A.: Nesterov’s acceleration and Polyak’s heavy ball method in continuous time: convergence rate analysis under geometric conditions and perturbations. arXiv preprint arXiv:1907.02710 (2019)
Shi, B., Du, S.S., Jordan, M.I., Su, W.J.: Understanding the acceleration phenomenon via high-resolution differential equations. Math. Program. 195(1), 79–148 (2021)
Su, W., Boyd, S., Candès, E.: A differential equation for modeling Nesterov’s accelerated gradient method: theory and insights. J. Mach. Learn. Res. 17(153), 1–43 (2016)
Funding
The authors acknowledge the support of the French Agence Nationale de la Recherche (ANR) under reference ANR-PRC-CE23 MaSDOL, the support of the FMJH Program PGMO 2019-0024, and the support to this program from EDF-Thales-Orange.
Ethics declarations
Competing Interests
The authors have not disclosed any competing interests.
A Appendix
1.1 A.1 Supplementary Material for Remark 1
Let \(\alpha =1+\frac{2}{\gamma }\). We consider the same Lyapunov energy as in the case \(\alpha >1+\frac{2}{\gamma }\), i.e.,
where \(\lambda =\frac{2\alpha }{\gamma +2}\).
Observe that Lemmas 2 and 3 are both valid for this value of \(\alpha \). By noticing that \(K(\alpha )=0\) and \(\alpha -\lambda =1\), we then get that for all \(t> \max \{t_0,\beta \}\),
We can deduce that \(t\mapsto {\mathcal {E}}(t)e^{\frac{\beta }{t-\beta }}\) is decreasing on \([t_0+\beta ,+\infty )\). Consequently, for all \(t\geqslant t_0+\beta \),
Considering the expression of \({\mathcal {E}}\), this directly implies that:
The above inequality implies the first claim of Remark 1. Note that the growth condition \({\mathcal {G}}^2_\mu \) plays no role in this proof.
However, we can use this geometry condition to get an upper bound on \(F(x(t))-F^*\) depending on the mechanical energy \(E_m\) defined for all \(t\geqslant t_0+\beta \) by:
The assumption \({\mathcal {G}}^2_\mu \) and the decreasing behaviour of \(E_m\) ensure that
using inequality (62). Hence, for all \(t\geqslant t_0+\beta \),
Inequality (78) also guarantees that
is bounded on \((t_0+\beta ,+\infty )\). As \({\mathcal {E}}(t)e^{\frac{\beta }{t-\beta }}\) is positive for all \(t\geqslant t_0+\beta \), we can deduce that there exists \(M>0\) such that for all \(t\geqslant t_0+\beta \),
and thus,
By using the same arguments as in the proof of Theorem 1, we conclude that:
1.2 A.2 Proof of Corollary 1
The first claim is obtained by combining Theorem 2 with the following lemma, whose proof is given in Appendix A.5.
Lemma 10
Let \(F: \mathbb {R}^{n} \rightarrow \mathbb {R}\) be a convex function with a nonempty set of minimizers, and let \(F^*=\inf \limits _{x\in \mathbb {R}^n}F(x)\). Assume that for some \(t_1>0\) and \(\delta >0\), F satisfies:
Let \(z:t\mapsto \frac{\int _{t/2}^tu^\delta x(u)du}{\int _{t/2}^tu^\delta du}\). Then, as \(t\rightarrow +\infty \),
The second and third claims are proved by applying Lemma 11 to \(\phi :x\mapsto F(x)-F^*\). The proof of this lemma is given in Appendix A.6.
Lemma 11
Let \(\phi : \mathbb {R}^{n} \rightarrow \mathbb {R}^+\) be such that for some \(t_1>0\) and \(\delta >0\), \(\phi \) satisfies:
Then, as \(t\rightarrow +\infty \),
1.3 A.3 Proof of Corollary 2
Let \(F: \mathbb {R}^{n} \rightarrow \mathbb {R}\) be a convex \(C^2\) function having a unique minimizer \(x^*\). Assume that F satisfies \({\mathcal {H}}_{\gamma _1}\) and \({\mathcal {G}}_\mu ^{\gamma _2}\) for some \(\gamma _1>2\), \(\gamma _2>2\) such that \(\gamma _{1}\geqslant \gamma _{2}\) and \(\mu >0\). Let x be a solution of (DIN-AVD) for all \(t\geqslant t_0\) where \(t_0>0\), \(\alpha \geqslant \frac{\gamma _{1}+2}{\gamma _{1}-2}\) and \(\beta >0\). Theorem 3 ensures that:
Moreover, as F satisfies \({\mathcal {G}}_\mu ^{\gamma _2}\) for some \(\gamma _2>2\), Lemma 1 implies that:
By applying Lemma 11 to \(\phi :x\mapsto \left( F(x)-F^*\right) ^{\frac{2(\gamma _2-1)}{\gamma _2}}\), we get that as t tends to \(+\infty \),
Hence,
1.4 A.4 Proof of Lemma 6
Let \(F: \mathbb {R}^{n} \rightarrow \mathbb {R}\) be a convex \(C^2\) function with a nonempty set of minimizers \(X^*\). Let \(\delta \in (0,1]\) and \(x^*\in X^*\).
We introduce the following lemma which is proved in Appendix A.8.
Lemma 12
Let \(F: \mathbb {R}^{n} \rightarrow \mathbb {R}\) be a \(C^2\) function. Then, for all \(x\in \mathbb {R}^n\) and \(\varepsilon >0\), there exists \(\nu >0\) such that for all \(y\in B(x,\nu )\):
As F is a \(C^2\) function, Lemma 12 ensures that there exists \(\nu >0\) such that for all \(x\in B\left( x^*,\nu \right) \):
where \(K(x)=(x-x^*)^T H_F(x^*) (x-x^*)\).
Let \(\phi _{x,x^*}\) be defined as follows:
for some \(x\in B\left( x^*,\nu \right) \). The function \(\phi _{x,x^*}\) is twice differentiable and we have that for all \(t\in [0,1]\):
By rewriting (87) at the point \(tx+(1-t)x^*\) it follows that for all \(t\in [0,1]\):
By integrating the left-hand inequality of (88) and noticing that \(\phi _{x,x^*}^{\prime }(0)=0\) (since \(\nabla F(x^*)=0\)), we get that:
By integrating the right-hand inequality of (88), we get that:
and consequently,
By choosing \(t=1\) and rewriting \(\phi _{x,x^*}\) and \(\phi _{x,x^*}^\prime \) we deduce that
1.5 A.5 Proof of Lemma 10
Let \(F: \mathbb {R}^{n} \rightarrow \mathbb {R}\) be a convex function with a nonempty set of minimizers, and let \(F^*=\inf \limits _{x\in \mathbb {R}^n}F(x)\). Assume that for some \(t_1>0\) and \(\delta >0\), F satisfies:
Let \(\varepsilon >0\). Assumption (89) ensures that there exists \(t_2\geqslant 2t_1\) such that:
Let z be defined as follows:
Let \(t\geqslant t_2\). We define \(\nu \) as:
where \({\mathcal {B}}([t/2,t])\) is the Borel \(\sigma \)-algebra on [t/2, t]. Then, we can write that \(z(t)=\int _{t/2}^t x(u)d\nu (u)\). As \(\nu ([t/2,t])=1\) and F is a convex function, Jensen’s inequality ensures that:
Hence, as t tends towards \(+\infty \), \(F(z(t))-F^*=o\left( t^{-\delta -1}\right) .\)
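The key step above is Jensen's inequality applied to the probability measure \(\nu \): since \(z(t)\) is the \(u^\delta \)-weighted average of \(x(u)\) over \([t/2,t]\), convexity of F gives \(F(z(t))\leqslant \int _{t/2}^t F(x(u))d\nu (u)\). The sketch below checks this numerically on a hypothetical one-dimensional trajectory \(x(u)=1/u\) with the test function \(F(x)=x^2\) (both are illustrative choices, not objects from the paper).

```python
# Discretized Jensen step: z(t) is the u^delta-weighted average of x(u) on [t/2, t],
# so for convex F we must have F(z(t)) <= weighted average of F(x(u)).
# Hypothetical trajectory x(u) = 1/u and convex test function F(x) = x^2.

def jensen_check(t=10.0, delta=2.0, n=1000):
    us = [t / 2 + (t / 2) * k / n for k in range(n + 1)]   # grid on [t/2, t]
    w = [u ** delta for u in us]                           # weights u^delta
    W = sum(w)
    xs = [1.0 / u for u in us]                             # hypothetical x(u)
    z = sum(wi * xi for wi, xi in zip(w, xs)) / W          # discretized z(t)
    F = lambda x: x ** 2                                   # convex, F* = 0
    lhs = F(z)                                             # F(z(t))
    rhs = sum(wi * F(xi) for wi, xi in zip(w, xs)) / W     # weighted mean of F(x(u))
    return lhs, rhs

lhs, rhs = jensen_check()
```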
1.6 A.6 Proof of Lemma 11
Let \(\phi : \mathbb {R}^{n} \rightarrow \mathbb {R}^+\) be such that for some \(t_1>0\) and \(\delta >0\), \(\phi \) satisfies:
Let \(\varepsilon >0\). Assumption (90) guarantees that there exists \(t_2\geqslant 2t_1\) such that
Consequently, for all \(t\geqslant t_2\),
and
Hence, as \(t\rightarrow +\infty \),
We recall that \(\liminf \limits _{t\rightarrow +\infty } f(t)=\lim \limits _{t\rightarrow +\infty }\left[ \inf \limits _{\tau \geqslant t}f(\tau )\right] \). As \(\phi \) is a positive function, we get that:
Suppose that \(l>0\). Then there exists \({\hat{t}}>t_1\) such that:
and hence:
This inequality cannot hold since we assume that (90) is satisfied. We deduce that \(l=0\).
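The displayed assumption and conclusion of Lemma 11 did not survive extraction; the proof structure suggests a statement of the form "if \(t\mapsto t^\delta \phi (t)\) is integrable on \([t_1,+\infty )\), then \(\liminf _{t\rightarrow +\infty } t^{\delta +1}\phi (t)=0\)". Under that assumed shape, \(\phi (t)=t^{-(\delta +2)}\) is a concrete instance that can be checked numerically:

```python
# Assumed shape of Lemma 11, illustrated on phi(t) = t^{-(delta+2)}:
# the integrand t^delta * phi(t) = t^{-2} is integrable on [1, +inf),
# and t^{delta+1} * phi(t) = 1/t indeed tends to 0.

delta = 1.5
phi = lambda t: t ** (-(delta + 2))

def tail_integral(T, n=100_000, t1=1.0):
    # midpoint Riemann sum of t^delta * phi(t) over [t1, T]
    h = (T - t1) / n
    return sum((t1 + (k + 0.5) * h) ** delta * phi(t1 + (k + 0.5) * h)
               for k in range(n)) * h

I_100 = tail_integral(100.0)        # ~ 1 - 1/100  = 0.99
I_1000 = tail_integral(1000.0)      # ~ 1 - 1/1000 = 0.999  (integral stays bounded)
decay_at_1000 = 1000.0 ** (delta + 1) * phi(1000.0)   # equals 1/1000
```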
1.7 A.7 Proof of Lemma 5
Let \(u\in \mathbb {R}^n\), \(v\in \mathbb {R}^n\) and \(a>0\). The first inequality comes from the following inequalities:
and
The second inequality is proved by rewriting \(\Vert u\Vert ^2\) as follows:
and by applying the first inequality to \(\langle u+v,v\rangle \).
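The statement of Lemma 5 is not reproduced in this extract; the proof structure above suggests the classical weighted Young bound together with the polarization-type rewriting of \(\Vert u\Vert ^2\). Both can be verified numerically (the vectors and the weight \(a\) below are arbitrary illustrative choices):

```python
# Checks of the two elementary facts the proof of Lemma 5 appears to rely on:
#   (i)  2<u, v> <= a||u||^2 + (1/a)||v||^2  for any a > 0 (weighted Young),
#   (ii) ||u||^2 = ||u+v||^2 - 2<u+v, v> + ||v||^2.
import random

def check(n=5, a=0.7, seed=0):
    rng = random.Random(seed)
    u = [rng.uniform(-1, 1) for _ in range(n)]
    v = [rng.uniform(-1, 1) for _ in range(n)]
    dot = lambda p, q: sum(pi * qi for pi, qi in zip(p, q))
    # (i): follows from 0 <= ||sqrt(a)*u - v/sqrt(a)||^2
    young_ok = 2 * dot(u, v) <= a * dot(u, u) + dot(v, v) / a + 1e-12
    # (ii): expand ||(u+v) - v||^2
    w = [ui + vi for ui, vi in zip(u, v)]
    identity_gap = abs(dot(u, u) - (dot(w, w) - 2 * dot(w, v) + dot(v, v)))
    return young_ok, identity_gap

young_ok, identity_gap = check()
```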
1.8 A.8 Proof of Lemma 12
Let \(F: \mathbb {R}^{n} \rightarrow \mathbb {R}\) be a \(C^2\) function. We denote the second-order partial derivatives of F by \(\partial _{ij} F =\frac{\partial ^2 F}{\partial x_i\partial x_j}\) for all \((i,j)\in \llbracket 1,n\rrbracket ^2\).
Let \(x\in \mathbb {R}^n\) and \(\varepsilon >0\). For all \((i,j)\in \llbracket 1,n\rrbracket ^2\), \(\partial _{ij} F\) is continuous on \(\mathbb {R}^n\) and consequently,
Taking the minimum of \({\tilde{\nu }}\) over all \((i,j)\in \llbracket 1,n\rrbracket ^2\), we get that there exists \({\tilde{\nu }}>0\) such that:
Let \(\nu =\min \left\{ {\tilde{\nu }},\left( n\max \limits _{(i,j)\in \llbracket 1,n\rrbracket ^2}|\partial _{ij}F(x)|\right) ^{-\frac{1}{2}}\right\} \), \(y\in B(x,\nu )\) and \(h=y-x\). Equation (92) gives us that for all \((i,j)\in \llbracket 1,n\rrbracket ^2\):
We recall that for all \((i,j)\in \llbracket 1,n\rrbracket ^2\), \(\left( H_F(x)\right) _{i,j}=\partial _{ij}F(x)\) and therefore:
By summing (93) for all \((i,j)\in \llbracket 1,n\rrbracket ^2\), we get that:
Noticing that \(|h_ih_j|\leqslant \frac{1}{2}\left( h_i^2+h_j^2\right) \) for all \((i,j)\in \llbracket 1,n\rrbracket ^2\), we can deduce that:
Hence,
\(\square \)
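The displayed inequality of Lemma 12 did not render in this extract; the surrounding proofs suggest a uniform second-order Taylor bound of the form \(|F(y)-F(x)-\langle \nabla F(x),y-x\rangle -\frac{1}{2}(y-x)^T H_F(x)(y-x)|\leqslant \varepsilon \Vert y-x\Vert ^2\) on \(B(x,\nu )\). The sketch below checks that the Taylor remainder shrinks relative to \(\Vert y-x\Vert ^2\) for a hypothetical smooth test function (the function, base point and direction are illustrative assumptions):

```python
# Taylor-remainder check for F(x1, x2) = x1^4 + x1*x2 + x2^2 at x = (1, 1):
# the ratio |F(x+h) - second-order model| / ||h||^2 should vanish as h -> 0.

def remainder_ratio(s):
    x1, x2 = 1.0, 1.0
    h1, h2 = s, -s                                   # h = s * (1, -1)
    F = lambda a, b: a ** 4 + a * b + b ** 2
    g1, g2 = 4 * x1 ** 3 + x2, x1 + 2 * x2           # gradient at x
    H11, H12, H22 = 12 * x1 ** 2, 1.0, 2.0           # Hessian at x
    model = (F(x1, x2) + g1 * h1 + g2 * h2
             + 0.5 * (H11 * h1 * h1 + 2 * H12 * h1 * h2 + H22 * h2 * h2))
    return abs(F(x1 + h1, x2 + h2) - model) / (h1 * h1 + h2 * h2)

r_big = remainder_ratio(1e-1)      # exact value 2s + s^2/2 = 0.205
r_small = remainder_ratio(1e-3)    # ~ 0.002: the ratio shrinks with ||h||
```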
Aujol, JF., Dossal, C., Hoàng, V.H. et al. Fast Convergence of Inertial Dynamics with Hessian-Driven Damping Under Geometry Assumptions. Appl Math Optim 88, 81 (2023). https://doi.org/10.1007/s00245-023-10058-6