1 Introduction

Numerical simulation has established itself as an independent and indispensable branch of research in the natural sciences, on equal footing with theory and experiment. To be truly useful, numerical approaches must face and master the multiscale nature that is ubiquitous in almost all relevant applications. This is particularly true for soft matter, where spatial scales may span from the electronic scale up to the millimeter scale of biomaterials or polymers. For examples we refer to the excellent survey by Noid [31], the collected volume edited by Monticelli and Salonen [28], or the recent special issue [40] of Journal of Physics: Condensed Matter.

An important technique for adapting numerical algorithms to the particular needs of multiscale applications is coarse-graining, cf., e.g., Noid [32], or Peter and Kremer [33]: small-scale features are discarded on the coarser scale by replacing detailed descriptions of molecules or matter by artificial beads of a certain shape. The simulation then focuses only on these beads and their interaction with the other constituents of the system. When and where necessary, fine details can be reinserted into the simulation for better accuracy, as, e.g., in the AdResS scheme [6, 34].

Of course, to evaluate the equations of motion for the coarse-grained model, it is necessary to derive the prevailing effective forces on these beads. Concerning the transition from an atomic microscale to a molecular macroscale in thermodynamic equilibrium, for example, there are essentially two ways to settle this problem. On the one hand, one can employ an ab initio bottom-up approach and evaluate and assemble the resulting forces from the eliminated details of the fine-grained description, cf. Ercolessi and Adams [5], or run a fine-grained simulation, compute the corresponding forces, and approximate them on the coarse level as suggested, e.g., by Izvekov and Voth [20], and Wang et al. [46]. The other alternative is to follow a top-down strategy and use the given structural information about the location of the beads on the coarse scale to formulate an inverse problem: What are the appropriate forces or interactions on the coarse level that define ensembles with the same structural properties?

In this work, we treat one of the simplest incarnations of the latter approach. We presume that the probabilities of the snapshots of the coarse-grained bead ensembles are in good agreement with a model which only uses additive pairwise translation invariant interactions of the beads. Such interactions can be formulated in terms of a scalar pair potential which only depends on the relative position of the respective pair of beads. The pair correlation function, which measures the empirical likelihood of observing two beads at a given relative position, is an adequate piece of data for determining the corresponding potential, because in this case both the data and the unknown are scalar functions of the space variable. In fact, in a celebrated paper, Henderson [18] argued that the pair potential is uniquely specified this way, i.e., under given conditions of temperature and density, no two different pair potentials can give rise to the same piece of empirical data. Finding a pair potential that reproduces the given pair correlation function is therefore sometimes called the inverse Henderson problem.

The numerical solution of the inverse Henderson problem is demanding because there is no closed mathematical formula for computing the pair correlation function from a given pair potential, or vice versa. Traditionally, approximate identities like the Percus–Yevick or hypernetted chain integral equations have been developed for this purpose, cf. Hansen and McDonald [17], or parametric ansatz functions like, e.g., Lennard-Jones potentials have been used, with the corresponding parameters optimized numerically. Today, the state of the art is to use nonparametric potentials and employ iterative schemes which, in each iteration, simulate the equilibrium ensemble for the current guess of the pair potential and use the associated data misfit to generate a new guess. Well-known examples are the iterative Boltzmann inversion (IBI), cf. Soper [42] and Reith, Pütz, and Müller-Plathe [35], and the inverse Monte Carlo method (IMC) by Lyubartsev and Laaksonen [26]. We recommend the valuable reviews by Toth [44] and Rühle et al. [37] for a comparison of these and further methods.
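To fix ideas, the IBI update can be written down in a few lines. The following sketch assumes that the pair potential and the radial distribution functions are tabulated on a common radial grid; the routine rdf_from_simulation mentioned in the comments is a hypothetical placeholder for an equilibrium simulation with the current potential and is not part of the text above.

```python
import numpy as np

def ibi_update(u, g_model, g_target, kT=1.0, eps=1e-12):
    """One iterative Boltzmann inversion (IBI) step on a radial grid:
    u_new(r) = u(r) + kT * log(g_model(r) / g_target(r)),
    where g_model is the radial distribution function obtained with the
    current potential u and g_target is the reference data."""
    return u + kT * np.log((g_model + eps) / (g_target + eps))

# Schematic outer loop (rdf_from_simulation is a hypothetical placeholder):
#
#   for k in range(max_iter):
#       g_model = rdf_from_simulation(u)
#       u = ibi_update(u, g_model, g_target, kT)
```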

We emphasize that the setting of the inverse Henderson problem is generally accepted to be far too simplistic to capture all the relevant features of a real system, cf., e.g., [25, 32, 45], mostly because multibody interactions are neglected. In particular, thermodynamic properties of the coarse-grained model may differ from the real system, especially at other temperatures or densities.

But its simplicity offers a great opportunity for a mathematical analysis, which in turn may lead to a better understanding of other, more flexible coarse-graining techniques that are routinely employed in practice. Still, only a few rigorous mathematical results have been obtained so far, e.g., in [2, 7, 16, 23, 24, 30], the reason being, again, the lack of explicit formulae to attack the problem.

The aim of this paper is to point out and advocate an alternative access point for theoretical investigations, which goes back to a nice observation by Shell [41] from within the chemical physics community: He argues that the Henderson potential minimizes the (information theoretic) relative entropy

$$\begin{aligned} S_{\textrm{rel}}\,=\, \sum _\gamma {{\mathcal {P}}}_*(\gamma )\, \log \frac{{{\mathcal {P}}}_*(\gamma )}{{{\mathcal {P}}}(\gamma )}\,, \end{aligned}$$
(1.1)

where—in his words—the summation is over all possible (coarse grained) ensemble configurations \(\gamma \), \(\mathcal {P}_*(\gamma )\) denotes the target (or observed) probability of \(\gamma \), and \(\mathcal {P}(\gamma )\) is the corresponding probability of a model with a given pair potential. (Compare, for example, Georgii [12] for background on the concept of relative entropy in stochastics.) Subsequently, Murtola, Karttunen, and Vattulainen [29] pointed out that the functional (1.1) is convex, and that the Newton iteration for minimizing the relative entropy coincides with the aforementioned IMC iteration (see also Rosenberger et al [36]).
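For a finite configuration space, the quantity (1.1) is the ordinary Kullback-Leibler divergence of the two probability distributions. A minimal sketch, assuming that the target and model probabilities are given as vectors over the same finite set of configurations, reads as follows.

```python
import numpy as np

def relative_entropy(p_target, p_model):
    """Discrete analogue of (1.1): sum over configurations gamma of
    P_*(gamma) * log(P_*(gamma) / P(gamma)).  Both arguments are
    probability vectors over the same finite set of configurations."""
    p_target = np.asarray(p_target, dtype=float)
    p_model = np.asarray(p_model, dtype=float)
    mask = p_target > 0          # terms with P_*(gamma) = 0 contribute nothing
    return float(np.sum(p_target[mask] * np.log(p_target[mask] / p_model[mask])))
```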

Like Henderson’s paper [18], the results in [29, 36, 41] lack mathematical rigor, because their arguments are restricted to bounded domains, whereas an unambiguous definition of a “translation invariant ensemble” is only possible in the full space. Concerning the Henderson theorem, this shortcoming has recently been fixed in [7], building on fundamental work by Ruelle and by Georgii. Here, we focus on the proposal by Shell and his colleagues, provide a rigorous justification of their results, and elaborate further on them.

To be specific, we outline in Sect. 2 that the (appropriately formulated) relative entropy (1.1) divided by the volume of the bounded domain converges in the thermodynamic limit to the specific relative entropy, first introduced for continuum systems by Gallavotti and Miracle-Sole [8] in the case of hard-core interactions, and further investigated by Georgii and Zessin in [9, 10, 13] for general additive pair interactions. Under mild assumptions on the model and target ensembles to be utilized, we prove that this specific relative entropy is a strictly convex functional on the corresponding set of pair potentials (which include Lennard-Jones type and hard-core potentials), and that its (unique) minimizer is the particular potential which solves the inverse Henderson problem (see Sects. 3 and 4).

From Sect. 5 onward, we restrict ourselves to low densities (the “gas phase”), where the specific relative entropy is a differentiable function of the pair potential. We calculate its Hessian, and in Sect. 6, we verify that the Newton iteration for minimizing the specific relative entropy does indeed coincide with the IMC iteration formulated in the thermodynamic limit. In Sect. 7, we investigate the Hessian in more detail and show that it can be represented by a self-adjoint positive semidefinite operator in \(L^2\). The mapping properties of this operator can be analyzed somewhat further for the particular class of Lennard-Jones-type pair potentials; we conclude with a corresponding result in Sect. 8.

We hope that this variational framework for the inverse Henderson problem opens a possibility to discuss the convergence of the IMC iteration, or to come up with measures to stabilize or regularize this popular iterative scheme.

2 The relative entropy in the thermodynamic limit

To reformulate Shell’s approach within a rigorous mathematical framework we start with the assumption that the target ensemble is given by a translation invariant probability measure \({\mathbb {P}}_*\) on the configuration space

$$\begin{aligned} \Gamma \,=\, \{\,\gamma \subset {\mathbb {R}}^d\,:\, \triangle \subset {\mathbb {R}}^d \ \text {bounded} \,\Rightarrow \, \#(\gamma \cap \triangle ) < \infty \,\} \end{aligned}$$

with density \(\rho _*\) and finite local second moments, compare Georgii [9]. For the model ensemble, we restrict ourselves to ensembles of classical particles with additive pairwise interactions defined by a measurable even pair potential \(u:{\mathbb {R}}^d\rightarrow {\mathbb {R}}\cup \{+\infty \}\). Concerning the latter, we assume that there exist \(r_0>0\) and decreasing positive functions \(\varphi :(0,r_0)\rightarrow {\mathbb {R}}^+_0\) and \(\psi : [0,\infty )\rightarrow {\mathbb {R}}^+\) with

$$\begin{aligned} \int _{0}^{r_0} r^{d-1}\varphi (r)\,\textrm{d}r= +\infty \qquad \text {and} \qquad \int _{0}^\infty r^{d-1}\psi (r)\,\textrm{d}r<\infty \,, \end{aligned}$$
(2.1)

such that

$$\begin{aligned} \begin{aligned} u(x)&\,\ge \, \varphi (|x|) \qquad \text { for }\ 0< |x| < r_0\,, \\ |u(x)|&\,\le \, \psi (|x|) \qquad \text { for }\ |x| \ge r_0\,, \end{aligned} \end{aligned}$$
(2.2)

holds true almost everywhere; for our convenience, we take \(\psi \) to be a bounded and decreasing function defined for all \(r\ge 0\). We denote by \({{\mathscr {U}}}_0\) the subset of the above pair potentials that also belong to \(L^\infty _{\textrm{loc}}({\mathbb {R}}^d\setminus \{0\})\), and define \({{\mathscr {U}}}\) to be the union of \({{\mathscr {U}}}_0\) with the hard-core potentials, which satisfy (2.2) with \(\psi \) as above and \(\varphi \) replaced by \(+\infty \); accordingly, \(r_0\) is taken to be the hard-core radius in this case. Technically, we do not distinguish between potentials (and functions in general) which differ on sets of Lebesgue measure zero.
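For illustration, consider the (12,6) Lennard-Jones potential in dimension \(d=3\),

$$\begin{aligned} u(x) \,=\, 4\varepsilon \Bigl (\bigl (\tfrac{\sigma }{|x|}\bigr )^{12}-\bigl (\tfrac{\sigma }{|x|}\bigr )^{6}\Bigr )\,, \qquad \varepsilon ,\,\sigma \,>\,0\,. \end{aligned}$$

One convenient (by no means unique) choice of majorants is \(r_0=2^{-1/6}\sigma \),

$$\begin{aligned} \varphi (r) \,=\, 2\varepsilon \bigl (\tfrac{\sigma }{r}\bigr )^{12} \quad \text { for }\ 0<r<r_0\,, \qquad \psi (r) \,=\, 8\varepsilon \,\min \bigl \{2,\bigl (\tfrac{\sigma }{r}\bigr )^{6}\bigr \} \quad \text { for }\ r\ge 0\,, \end{aligned}$$

for which (2.1) and (2.2) are readily verified; since u is also locally bounded away from the origin, it belongs to \({{\mathscr {U}}}_0\).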

Remark 2.1

The set \({{\mathscr {U}}}\) is convex. To see this let \(u_i\in {{\mathscr {U}}}\), \(i=1,2\), be such that (2.2) holds with \(r_{0,i}>0\) and decreasing functions \(\varphi _i\) and \(\psi _i\) satisfying (2.1), respectively. Without loss of generality, we may assume that \(r_{0,1}\le r_{0,2}\). Let \(u=tu_1+(1-t)u_2\) for some fixed \(t\in (0,1)\). We distinguish two cases. If \(u_2\) is a hard-core potential, then u is also a hard-core potential with hard-core radius \(r_{0,2}\). In this case, we can choose \(r_0=r_{0,2}\),

$$\begin{aligned}\begin{aligned} \varphi (r)&\,=\, +\infty&\quad&\text { for }\ 0< r < r_{0,2}\,,\\ \psi (r)&\,=\, t\psi _1(r)+(1-t)\psi _2(r)&\quad&\text { for }\ r \ge 0\,, \end{aligned} \end{aligned}$$

to achieve the assumption (2.2) for u. On the other hand, if \(u_2\) is not a hard-core potential, then \(u_2\) is bounded on \([r_{0,1},r_{0,2})\), and there exists \(c\ge 1\), such that

$$\begin{aligned} |u_2(x)| \,\le \, c\,\psi _2(|x|) \qquad \text { for }\ r_{0,1} \le |x| < r_{0,2}. \end{aligned}$$

It follows that u satisfies the inequalities (2.2) with \(r_0=r_{0,1}\) and

$$\begin{aligned} \begin{aligned} \varphi (r)&\,=\, t\varphi _1(r)+(1-t)\varphi _2(r)&\quad&\text { for }\ 0< r < r_{0,1}\,,\\ \psi (r)&\,=\, t\psi _1(r)+(1-t)c\,\psi _2(r)&\quad&\text { for }\ r\ge 0. \end{aligned} \end{aligned}$$

In either case, we have verified that \(u\in {{\mathscr {U}}}\), hence \({{\mathscr {U}}}\) is convex. \(\diamond \)

Let \(\Lambda =[-\ell ,\ell ]^d\) be a bounded box in \({\mathbb {R}}^d\). In analogy to Shell, who considered the relative entropy framework for canonical ensembles in \(\Lambda \), we take the restriction \({\mathbb {P}}_{*,\Lambda }\) of \({\mathbb {P}}_*\) as target, and the grand canonical ensemble in \(\Lambda \) with chemical potential \(\mu \in {\mathbb {R}}\) and pair potential \(u\in {{\mathscr {U}}}\) as model. For a configuration \(\gamma \in \Gamma \) of \(N=N(\gamma )\) particles located at \(\{x_1,\dots ,x_N\}\subset \Lambda \), the probability density of the model is thus given by

$$\begin{aligned} {{\mathcal {P}}}(\gamma ) \,=\, \frac{1}{\ \Xi _\Lambda }\,e^{\beta \mu N(\gamma )}e^{-\beta U(\gamma )}\,, \end{aligned}$$

where \(\beta >0\) is the inverse temperature,

$$\begin{aligned} U(\gamma ) \,=\, U(x_1,\dots ,x_N)\,=\!\! \sum _{1\le i<j\le N}\!\! u(x_i-x_j) \end{aligned}$$
(2.3)

is the interaction energy, and

$$\begin{aligned} \Xi _\Lambda \,=\,\sum _{N=0}^\infty \frac{e^{\beta \mu N}}{N!} \int _{\Lambda ^N} e^{-\beta U(x_1,\dots ,x_N)}\,\textrm{d}x_1\cdots \,\textrm{d}x_N \end{aligned}$$

is the corresponding grand canonical partition function.
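For small systems, \(\Xi _\Lambda \) can be approximated directly from its defining series. The following sketch truncates the sum over particle numbers and estimates each configurational integral by plain Monte Carlo sampling; the truncation level, the sample size, and the pair potential u passed as a callable on difference vectors are choices of this illustration, not of the text.

```python
import math
import numpy as np

def grand_partition_function(u, beta, mu, ell, d=1, n_max=6, n_mc=20000, seed=0):
    """Monte Carlo estimate of the grand canonical partition function
    Xi_Lambda for the box Lambda = [-ell, ell]^d, truncating the sum over
    particle numbers at n_max; u maps difference vectors in R^d to energies."""
    rng = np.random.default_rng(seed)
    vol = (2.0 * ell) ** d
    xi = 1.0                                   # the N = 0 term
    for n in range(1, n_max + 1):
        acc = 0.0
        for _ in range(n_mc):                  # crude estimate of the integral over Lambda^n
            x = rng.uniform(-ell, ell, size=(n, d))
            en = sum(u(x[i] - x[j]) for i in range(n) for j in range(i + 1, n))
            acc += math.exp(-beta * en)
        xi += math.exp(beta * mu * n) / math.factorial(n) * (vol ** n) * acc / n_mc
    return xi
```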

Note that most of the quantities we deal with depend on \(\beta \); however, since we will keep the temperature fixed throughout this paper, we refrain from making this dependency explicit in our notation.

Let \(S({\mathbb {P}}_{*,\Lambda })\) denote the entropy associated with \({\mathbb {P}}_{*,\Lambda }\). In the spirit of (1.1), we then specify the relative entropy

$$\begin{aligned} S_{\textrm{rel},\Lambda }&\,=\, S({\mathbb {P}}_{*,\Lambda })+ \int _{\Gamma _\Lambda } \bigl (\log {\Xi _\Lambda } -\beta \mu N(\gamma )+\beta \,U(\gamma )\bigr )\,\textrm{d}{\mathbb {P}}_{*,\Lambda }(\gamma )\\&\,=\, S({\mathbb {P}}_{*,\Lambda })+ \log \Xi _\Lambda - \beta \mu \,{\mathbb {E}}_{*,\Lambda }[N] \,+\, \beta \,{\mathbb {E}}_{*,\Lambda }[U]\,, \end{aligned}$$

where \({\mathbb {E}}_{*,\Lambda }[\,\cdot \,]\) denotes expectation with respect to \({\mathbb {P}}_{*,\Lambda }\). Take note that the entropy and the expected interaction energy may be \(+\infty \). Dividing by \(\beta |\Lambda |\), we arrive at

$$\begin{aligned} \frac{1}{\beta |\Lambda |}\,S_{\textrm{rel},\Lambda }\,=\, \frac{1}{\beta |\Lambda |} \,S({\mathbb {P}}_{*,\Lambda }) + \frac{1}{\beta |\Lambda |}\log \Xi _\Lambda - \mu \rho _*+ \,\frac{1}{|\Lambda |}\,{\mathbb {E}}_{*,\Lambda } [U]. \end{aligned}$$
(2.4)

Now, we want to derive the analog of this identity for the thermodynamic limit \(\ell \rightarrow \infty \). Concerning this limit it is known that there is a sequence \((\ell _k)_k\) with \(\ell _k\rightarrow \infty \), such that the probability densities associated with the corresponding grand canonical ensembles converge in a local topology to a translation invariant tempered Gibbs measure on \(\Gamma \), cf. Ruelle [39], for brevity called \((\mu ,u)\)-Gibbs measure in the sequel. In general, different sequences may lead to different \((\mu ,u\))-Gibbs measures, e.g., when different phases coexist. For any such sequence, however, the limit

$$\begin{aligned} p(\mu ,u) \,=\,\lim _{\ell \rightarrow \infty }\,\frac{1}{\beta |\Lambda |}\,\log \Xi _\Lambda \end{aligned}$$
(2.5)

is always the same well-defined and nonnegative finite number, namely the pressure of the ensemble in the thermodynamic limit, cf. Ruelle [38]. Therefore, passing in (2.4) to the thermodynamic limit \(\ell \rightarrow \infty \), we can (uniquely) define the relative entropy \(S_{\textrm{rel}}(\mu ,u)\) of all these \((\mu ,u)\)-Gibbs measures with respect to the target model \({\mathbb {P}}_*\) as

$$\begin{aligned} \frac{1}{\beta }\,S_{\textrm{rel}}(\mu ,u) \,=\, \frac{1}{\beta }\,S_* \,+\, p(\mu ,u) \,-\, \mu \rho _* \,+\, E(u,{\mathbb {P}}_*)\,, \end{aligned}$$
(2.6)

where the individual terms on the right-hand side of (2.6) are the corresponding limits of the respective terms in (2.4): \(S_*=S({\mathbb {P}}_*)\) is the specific entropy and \(E(u,{\mathbb {P}}_*)\) is the specific interaction energy (with respect to the potential u) of the target ensemble, both of which have been shown to be well-defined in \({\mathbb {R}}\cup \{+\infty \}\), cf. [11, 13]. If \({\mathbb {P}}_*\) satisfies a Ruelle condition (compare (A.1) in the appendix), then the specific entropy is finite; this is the case, e.g., when \({\mathbb {P}}_*\) is a \((\mu _*,u_*)\)-Gibbs measure for some \(u_*\in {{\mathscr {U}}}\) and \(\mu _*\in {\mathbb {R}}\), compare [39, Corollary 5.3]. For this latter particular case, it has further been shown in [7] that

$$\begin{aligned} E(u,{\mathbb {P}}_*) \,=\, \frac{1}{2} \int _{{\mathbb {R}}^d} u(x)\rho _*^{(2)}(x)\,\textrm{d}x\,, \end{aligned}$$
(2.7)

where \(\rho _*^{(2)}(x)\) is the pair correlation function associated with \({\mathbb {P}}_*\). Here, and throughout this paper, we deliberately use the short-hand notation \(\rho ^{(2)}(x)\) instead of \(\rho ^{(2)}(x,0)\) for the pair correlation function of a translation invariant point process.
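If the pair potential and the pair correlation function are radially symmetric and available on a grid, the integral (2.7) reduces to a one-dimensional quadrature. A minimal numerical sketch, with hypothetical callables u_radial and rho2_radial standing in for tabulated data, could look as follows.

```python
import math
import numpy as np

def specific_interaction_energy(u_radial, rho2_radial, r_min=1e-3, r_max=10.0, n=20000, d=3):
    """Evaluate E(u, P_*) = 1/2 * integral over R^d of u(x) * rho_*^{(2)}(x) dx,
    cf. (2.7), for radially symmetric u and rho_*^{(2)} given as functions of r = |x|."""
    surface = 2.0 * math.pi ** (d / 2) / math.gamma(d / 2)   # area of the unit sphere S^{d-1}
    r = np.linspace(r_min, r_max, n)
    dr = r[1] - r[0]
    integrand = u_radial(r) * rho2_radial(r) * r ** (d - 1)
    return 0.5 * surface * float(np.sum(integrand) * dr)

# e.g. with the low-density closure rho2(r) ~ rho**2 * exp(-beta * u(r)) as stand-in data
```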

In [10], Georgii studied the relative entropy \(S_{\textrm{rel}}(\mu ,u)\) in detail and established the following Gibbs variational principle:

Theorem A

(Gibbs variational principle) Let \({\mathbb {P}}_*\) be a translation invariant probability measure on \(\Gamma \) with finite local second moments. Then, the relative entropy (2.6) is nonnegative real or \(+\infty \) for every \(u\in {{\mathscr {U}}}\) and \(\mu \in {\mathbb {R}}\). Moreover, \(S_{\textrm{rel}}(\mu ,u)=0\) if and only if \({\mathbb {P}}_*\) is a \((\mu ,u)\)-Gibbs measure.

We have utilized this result in [7] to prove a rigorous version of the Henderson theorem:

Theorem B

(Henderson theorem) Let \(u_1,u_2\in {{\mathscr {U}}}\) and \(\mu _1,\mu _2\in {\mathbb {R}}\). If \({\mathbb {P}}_1\) and \({\mathbb {P}}_2\) are \((\mu _1,u_1)\)- and \((\mu _2,u_2)\)-Gibbs measures, respectively, which share the same density and the same pair correlation function, then \(u_1=u_2\) and \(\mu _1=\mu _2\).

Combining Georgii’s Gibbs variational principle and the Henderson theorem, we can formulate an alternative version of the Gibbs variational principle. This version is the one that we will mostly use below.

Theorem 2.2

(Gibbs variational principle, alternative form) If the target \({\mathbb {P}}_*\) is a \((\mu _*,u_*)\)-Gibbs measure for some \(u_*\in {{\mathscr {U}}}\) and \(\mu _*\in {\mathbb {R}}\), then the relative entropy \(S_{\textrm{rel}}(\mu ,u)\) attains its minimum if and only if \(u=u_*\) and \(\mu =\mu _*\).

Proof

According to Theorem A, it remains to investigate the case when \(S_{\textrm{rel}}(\mu ,u)=0\) for some \(u\in {{\mathscr {U}}}\) and \(\mu \in {\mathbb {R}}\). The Gibbs variational principle states that \({\mathbb {P}}_*\) is then a \((\mu _*,u_*)\)- and a \((\mu ,u)\)-Gibbs measure at the same time. In particular, this means that these two Gibbs measures share the same density and pair correlation function. The assertion thus follows from the Henderson theorem. \(\square \)

Georgii investigated the relative entropy for fixed interaction and chemical potentials u and \(\mu \), and varied the target model \({\mathbb {P}}_*\). In connection with the inverse Henderson problem, our interest is sort of dual to this: we assume that \({\mathbb {P}}_*\) is fixed, and consider the relative entropy as a function of \(\mu \) and u.

3 Strict convexity of the pressure and the relative entropy

It is well-known that the pressure is a convex function of the chemical potential, cf. [38, Theorem 3.4.6]. This convexity is strict, whenever a Gibbs variational principle is valid, cf., e.g., Hughes [19, Sect. 4.3]. In the sequel, we show that under our assumptions the pressure is also strictly convex in \(\mu \) and u.

Theorem 3.1

The pressure \(p=p(\mu ,u)\) of (2.5) is a strictly convex function of \(\mu \in {\mathbb {R}}\) and \(u\in {{\mathscr {U}}}\). Moreover, for \(u\in {{\mathscr {U}}}_0\), there holds

$$\begin{aligned} \frac{p(\mu ,u)}{\mu } \,\rightarrow \, +\infty \qquad \text {as}\, \mu \rightarrow +\infty . \end{aligned}$$
(3.1)

Proof

We first observe that \({\mathbb {R}}\times {{\mathscr {U}}}\) is convex by virtue of Remark 2.1.

Now let \(\mu _1,\mu _2\in {\mathbb {R}}\) and \(u_1,u_2\in {{\mathscr {U}}}\) be arbitrarily chosen with \((\mu _1,u_1)\ne (\mu _2,u_2)\), and define

$$\begin{aligned} \mu \,=\, t\mu _1 + (1-t)\mu _2 \quad \text {and} \quad u \,=\, tu_1 + (1-t)u_2 \end{aligned}$$

for some fixed \(t\in (0,1)\). When choosing for \({\mathbb {P}}_*\) a corresponding \((\mu ,u)\)-Gibbs measure, then it follows from (2.6) and the Gibbs variational principle of Theorem 2.2 that

$$\begin{aligned} p(\mu ,u)\,=\, -\frac{1}{\beta }\,S_*\,+\,\mu \rho _*\,-\, E(u,{\mathbb {P}}_*) . \end{aligned}$$
(3.2)

In particular, the specific entropy \(S_*\) is finite because \({\mathbb {P}}_*\) is a Gibbs measure, and hence, so is the specific interaction energy. Likewise, we obtain

$$\begin{aligned} p(\mu _1,u_1)\,>\, -\frac{1}{\beta }\,S_*\,+\,\mu _1\rho _*\,-\, E(u_1,{\mathbb {P}}_*) \end{aligned}$$

and

$$\begin{aligned} p(\mu _2,u_2)\,>\, -\frac{1}{\beta }\,S_*\,+\,\mu _2\rho _*\,-\, E(u_2,{\mathbb {P}}_*)\,, \end{aligned}$$

which gives

$$\begin{aligned} t\,p(\mu _1,u_1) \,+\, (1-t)\,p(\mu _2,u_2)\,>\, -\frac{1}{\beta }\,S_* \,+\, \mu \rho _* \,-\, E(u,{\mathbb {P}}_*) \end{aligned}$$

because of the linearity of the specific interaction energy. A comparison with (3.2) thus shows that the pressure is strictly convex.

Consider now a fixed pair potential \(u\in {{\mathscr {U}}}_0\). It has been shown in the proof of [9, Lemma 7.1] that for any \(\rho >0\) there exists a translation invariant probability measure \({\mathbb {P}}_\rho \) on \(\Gamma \) with density \(\rho \), such that \(S({\mathbb {P}}_\rho )<\infty \), and \(E(u,{\mathbb {P}}_\rho ) < \infty \). Choosing \({\mathbb {P}}_*={\mathbb {P}}_\rho \) in (2.6), Georgii’s Gibbs variational principle (Theorem A) yields the inequality

$$\begin{aligned} p(\mu ,u) - \mu \rho \,\ge \, -\frac{1}{\beta }S({\mathbb {P}}_\rho )-E(u,{\mathbb {P}}_\rho ) \,=:\, c_\rho \,>\, -\infty \end{aligned}$$
(3.3)

for every \(\mu \in {\mathbb {R}}\). In other words,

$$\begin{aligned} \frac{p(\mu ,u)}{\mu } \,\ge \, \rho \,+\, \frac{c_\rho }{\mu }\,, \end{aligned}$$

and hence,

$$\begin{aligned} \liminf _{\mu \rightarrow +\infty } \frac{p(\mu ,u)}{\mu }\,\ge \, \rho . \end{aligned}$$

Since \(\rho >0\) was arbitrary, this implies (3.1). \(\square \)

We thus have shown that the relative entropy (2.6) is the sum of a strictly convex functional and an affine function of \((\mu ,u)\). Accordingly, the relative entropy is also a strictly convex functional of \(\mu \) and u, as long as it is finite.

Another immediate consequence of Theorem 2.2 is the following inequality for the pressure.

Corollary 3.2

For \(\mu _1,\mu _2 \in {\mathbb {R}}\) and \(u_1,u_2\in {{\mathscr {U}}}\), there holds

$$\begin{aligned} \begin{aligned}&(\mu _2-\mu _1) \rho _1-\frac{1}{2}\int _{{\mathbb {R}}^d} (u_2-u_1)(x)\rho ^{(2)}_{1}(x)\,\textrm{d}x\\&\qquad \,\le \, p(\mu _2,u_2)-p(\mu _1,u_1) \,\le \, (\mu _2-\mu _1)\rho _{2}-\frac{1}{2}\int _{{\mathbb {R}}^d} (u_2-u_1)(x)\rho ^{(2)}_{2}(x)\,\textrm{d}x\,, \end{aligned} \end{aligned}$$
(3.4)

whenever \(\rho _i\) and \(\rho ^{(2)}_i\) are the density and pair correlation function of a \((\mu _i,u_i)\)-Gibbs measure, respectively. Both inequalities are strict, unless \(\mu _1=\mu _2\) and \(u_1=u_2\). In particular, if \(u_1=u_2\) and \(\mu _1<\mu _2\), then \(\rho _1<\rho _2\).

Proof

The two inequalities (3.4) follow readily from Theorem 2.2 by choosing for \({\mathbb {P}}_*\) the corresponding \((\mu _i,u_i)\)-Gibbs measures, respectively. They are strict, unless \(\mu _1=\mu _2\) and \(u_1=u_2\). \(\square \)

4 The relative entropy functional for fixed density

Returning to the inverse Henderson problem formulated in the introduction, we now constrain our model ensembles to have the same density \(\rho _*\) as the target ensemble. We do so because numerical simulations are often done in the canonical ensemble, where the density is the crucial quantity, and because this is also the setting used by Henderson and by Shell, as well as in the definition of the IMC scheme mentioned before. Since the attainable densities for hard-core potentials are bounded, we need to distinguish whether u is a hard-core potential or not. We focus our analysis on the latter case and mention the necessary modifications for hard-core potentials in Remark 4.4 later in this section.

The first fundamental question is whether and how the prescribed density \(\rho _*\) can be attained by some \((\mu ,u)\)-Gibbs measure for a given \(u\in {{\mathscr {U}}}_0\). When the chemical potential is sufficiently small, i.e., when the system is in the gas phase (see Sect. 5 for a specification of this term), then it is known that there is a one-to-one relation between the corresponding chemical potentials and the associated densities, and in this case, the inverse map \(\rho \mapsto \mu \) can even be computed by means of cluster expansions, cf., e.g., Jansen, Kuna, and Tsagkarogiannis [22]. Outside the gas phase, where phase transitions may occur, the problem becomes more difficult. Adopting a method from Chayes and Chayes [2], we can establish the following result, which holds for the full range of possible densities.

Theorem 4.1

Let \(u\in {{\mathscr {U}}}_0\) and \(\rho _*>0\) be fixed. Then, there is a unique chemical potential \(\mu _*=\mu _*(u)\in {\mathbb {R}}\), for which there exists a \((\mu _*,u)\)-Gibbs measure with density \(\rho _*\).

Proof

For every \(\mu \in {\mathbb {R}}\) and the given \(u\in {{\mathscr {U}}}_0\), let \({\mathbb {P}}_{\mu ,u}\) be a \((\mu ,u)\)-Gibbs measure, and denote by \(p(\mu )\) and \(\rho (\mu )\) the associated pressure and density, respectively.

It is well-known, cf. [38, Theorem 4.3.1], that for small chemical potentials, the pressure is a differentiable function with

$$\begin{aligned} p'(\mu ) \,=\, \rho (\mu ). \end{aligned}$$
(4.1)

In other words, if \(\rho _*\) is sufficiently small, then the corresponding chemical potential \(\mu _*\) from the formulation of the theorem is the unique minimizer of the function

$$\begin{aligned} \Theta (\mu ) \,=\, p(\mu )-\mu \rho _*\,, \end{aligned}$$
(4.2)

the latter being strictly convex by virtue of Theorem 3.1. We will proceed by showing that the minimizer of \(\Theta \) is also the appropriate chemical potential to choose for larger values of \(\rho _*\).

To see this, we first observe that

$$\begin{aligned} \Theta (\mu ) \,\ge \, -\mu \rho _*\,, \end{aligned}$$

because the pressure is nonnegative, whereas

$$\begin{aligned} \Theta (\mu ) \,\ge \, c_{(1+\varepsilon )\rho _*} + \,\varepsilon \mu \rho _* \end{aligned}$$
(4.3)

for any suitable \(\varepsilon >0\) and corresponding constant \(c_{(1+\varepsilon )\rho _*}\) by virtue of (3.3) (with \(\rho =(1+\varepsilon )\rho _*\)). This shows that \(\Theta \) is bounded from below and that

$$\begin{aligned} \Theta (\mu ) \,\rightarrow \, +\infty \,, \qquad \text {whenever}\, |\mu |\rightarrow \infty . \end{aligned}$$

Therefore, \(\Theta \) attains its minimum for a uniquely defined value \(\mu =\mu _*\).

For any \(\mu \in {\mathbb {R}}\), \(\mu \ne \mu _*\), we now conclude from (3.4) and (4.2) that

$$\begin{aligned} (\mu _*-\mu )\rho (\mu ) \,<\, p(\mu _*)-p(\mu ) \,=\, \Theta (\mu _*) + \mu _*\rho _* - \Theta (\mu ) - \mu \rho _*\,, \end{aligned}$$

i.e.,

$$\begin{aligned} (\mu _*-\mu )\bigl (\rho (\mu )-\rho _*\bigr ) \,<\, \Theta (\mu _*)-\Theta (\mu ) \,<\,0 . \end{aligned}$$

This means that

$$\begin{aligned} \begin{array}{c} \rho (\mu ) \,<\, \rho _* \qquad \text {for } \ \mu <\mu _*\,, \\ \rho (\mu ) \,>\, \rho _* \qquad \text {for } \ \mu >\mu _*\, . \end{array} \end{aligned}$$
(4.4)

Now, let \((\mu _k^+)_k\) be a strictly decreasing sequence and \((\mu _k^-)_k\) a strictly increasing sequence of chemical potentials, both of which converge to \(\mu _*\). By virtue of Lemma A.1 from the appendix, there exist \((\mu _*,u)\)-Gibbs measures \({\mathbb {P}}^-\) and \({\mathbb {P}}^+\) with densities

$$\begin{aligned} \rho ({\mathbb {P}}^-) \,=\, \lim _{k\rightarrow \infty } \rho (\mu _k^-) \qquad \text {and} \qquad \rho ({\mathbb {P}}^+) \,=\, \lim _{k\rightarrow \infty } \rho (\mu _k^+) . \end{aligned}$$

The inequalities (4.4) imply that

$$\begin{aligned} \rho ({\mathbb {P}}^-) \,\le \, \rho _* \,\le \, \rho ({\mathbb {P}}^+)\,, \end{aligned}$$

and hence, there is some \(t\in [0,1]\), for which \({\mathbb {P}}=t{\mathbb {P}}^-+(1-t){\mathbb {P}}^+\) has density

$$\begin{aligned} \rho ({\mathbb {P}}) \,=\, t\rho ({\mathbb {P}}^-) + (1-t)\rho ({\mathbb {P}}^+) \,=\, \rho _*\,. \end{aligned}$$

Since the set of \((\mu _*,u)\)-Gibbs measures is convex, \({\mathbb {P}}\) has all the desired properties from the statement of this theorem, and the proof is done. \(\square \)

In the light of the above theorem, we can now restrict our attention to \((\mu ,u)\)-Gibbs measures with density \(\rho _*\), i.e., with chemical potential \(\mu =\mu _*(u)\), when looking at the relative entropy functional. Further, we drop the constant offset \(S_*\) in (2.6), as it is independent of u. This leads to the functional

$$\begin{aligned} \Phi (u) = p_*(u) - \mu _*(u) \rho _* + E(u,{\mathbb {P}}_*) \end{aligned}$$
(4.5)

for \(u\in {{\mathscr {U}}}_0\), where we have set

$$\begin{aligned} p_*(u) \,=\, p(\mu _*(u),u) \end{aligned}$$
(4.6)

for brevity. By a slight abuse of wording we will keep calling \(\Phi \) the relative entropy functional. As we will see next, although \(p_*\) may fail to be convex in general, the functional \(\Phi \) is again strictly convex.

Theorem 4.2

Let \({\mathbb {P}}_*\) be as in Theorem A. Then, the functional \(\Phi :{{\mathscr {U}}}_0 \rightarrow {\mathbb {R}}\) of (4.5) is strictly convex as long as the specific interaction energy \(E(u,{\mathbb {P}}_*)\) is finite.

Proof

Let \(u_1\) and \(u_2\) be two different pair potentials from \({{\mathscr {U}}}_0\). For any fixed \(0<t<1\), define

$$\begin{aligned} u \,=\, tu_1 \,+\, (1-t)u_2\,, \end{aligned}$$

and, as in Theorem 4.1, denote by \(\mu =\mu _*(u)\) the chemical potential associated with u and density \(\rho _*\). Finally, let \(\rho ^{(2)}\) be the pair correlation function of the associated \((\mu ,u)\)-Gibbs measure constructed in Theorem 4.1.

Then, we obtain from (4.6) and (3.4)—with \(\mu _i=\mu _*(u_i)\) for \(i=1,2\)—that

$$\begin{aligned}&p_*(u) - t p_*(u_1)- (1-t)p_*(u_2) \\&\quad =\, t \bigl (p_*(u)-p_*(u_1)\bigr )\,+\, (1-t)\bigl (p_*(u)-p_*(u_2)\bigr )\\&\quad < \,t\left( (\mu -\mu _1)\rho _* \,-\, \frac{1-t}{2}\int _{\mathbb {R}^d}(u_2-u_1)(x)\rho ^{(2)}(x) \,\textrm{d}x\right) \\&\qquad + (1-t)\left( (\mu -\mu _2)\rho _*\,-\, \frac{t}{2}\int _{\mathbb {R}^d} (u_1-u_2)(x)\rho ^{(2)}(x)\,\textrm{d}x\right) \\&\quad =\,\mu \rho _* \,-\ t\mu _1\rho _* \,-\, (1-t)\mu _2\rho _*\,. \end{aligned}$$

Note that this inequality is strict because u is different from \(u_1\) and \(u_2\) by construction. Reordering terms, we thus arrive at

$$\begin{aligned} p_*(u)- \mu \rho _*\,<\, t \bigl (p_*(u_1)-\mu _1\rho _*\bigr ) \,+\,(1-t)\bigl (p_*(u_2)-\mu _2\rho _*\bigr )\,. \end{aligned}$$
(4.7)

It thus follows from (4.5) and (4.7) that if the specific interaction energy \(E(\,\cdot \,,{\mathbb {P}}_*)\) stays finite, then \(\Phi \) is strictly convex, because \(E(\,\cdot \,,{\mathbb {P}}_*)\) is linear in the first argument. \(\square \)

Theorem 4.2 implies that the relative entropy functional has at most one local minimizer, which is then also a global one. Concerning this minimizer, we have the following result.

Theorem 4.3

Let \({\mathbb {P}}_*\) be a \((\mu ,u_*)\)-Gibbs measure for some \(u_*\in {{\mathscr {U}}}_0\) and \(\mu \in {\mathbb {R}}\), and let \(\rho _*\) be its density. Then, the relative entropy functional (4.5) attains its minimum for \(u=u_*\).

Proof

The given Gibbs measure \({\mathbb {P}}_*\) has finite specific entropy \(S_*\), and hence, (2.6) implies that

$$\begin{aligned} \Phi (u) \,=\, \frac{1}{\beta }\,S_{\textrm{rel}}\bigl (\mu _*(u),u\bigr ) \,-\, \frac{1}{\beta }\,S_* \end{aligned}$$
(4.8)

for every \(u\in {{\mathscr {U}}}_0\). Since \({\mathbb {P}}_*\) has density \(\rho _*\) the chemical potential \(\mu \) associated with \({\mathbb {P}}_*\) must be given by \(\mu =\mu _*(u_*)\) according to Theorem 4.1. From (4.8) and the Gibbs variational principle (Theorems A and 2.2), it follows that

$$\begin{aligned} \Phi (u_*) \,=\, -\,\frac{1}{\beta }\,S_*\,<\, \frac{1}{\beta }\,S_{\textrm{rel}}\bigl (\mu _*(u),u\bigr ) \,-\, \frac{1}{\beta }\,S_* \,=\, \Phi (u) \end{aligned}$$
(4.9)

for every \(u\in {{\mathscr {U}}}_0\setminus \{u_*\}\), and this was to be shown. \(\square \)

Note from (4.9) that the minimal value of \(\Phi \) depends on the specific entropy of the target Gibbs measure, and is therefore unknown in general.

Remark 4.4

For a hard-core potential \(u\in {{\mathscr {U}}}\) the density \(\rho \) of any associated \((\mu ,u)\)-Gibbs measure is bounded from above by a finite closest-packing density \({\rho _{\text {cp}}}={\rho _{\text {cp}}}(u)\): This bound only depends on the hard-core radius \(r_0\) of (2.2) and is a decreasing function of \(r_0\), cf. [9, Sect. 7]. We formally set \({\rho _{\text {cp}}}(u)=+\infty \) for \(u\in {{\mathscr {U}}}_0\).

If the given density \(\rho _*\) of the target happens to be below \({\rho _{\text {cp}}}(u)\) for a given \(u\in {{\mathscr {U}}}\), then Theorem 4.1 is still valid for this pair potential; to establish (4.3) in its proof, \(\varepsilon >0\) must be so small that \((1+\varepsilon )\rho _*\) is still below \({\rho _{\text {cp}}}(u)\), because this is needed for [9, Lemma 7.1], and hence, for (3.3). The relative entropy functional (4.5), however, is only well-defined on the domain

$$\begin{aligned} {\mathcal D}(\Phi )\,=\, \{\,u\in {{\mathscr {U}}}:\, {\rho _{\text {cp}}}(u) > \rho _*\,\}\,, \end{aligned}$$

but this is again a convex set, compare Remark 2.1, and \(\Phi \) is strictly convex on \({\mathcal D}(\Phi )\). Theorem 4.3 extends verbatim to every \(u_*\in {{\mathscr {U}}}\) with this understanding of the domain of \(\Phi \). \(\diamond \)

Theorem 4.3 is the basis for a variational setting of the inverse Henderson problem: If the density and the pair correlation function of a \((\mu _*,u_*)\)-Gibbs measure target \({\mathbb {P}}_*\) are given, then the unique minimizer of the strictly convex relative entropy functional (4.5) with \(E(u,{\mathbb {P}}_*)\) of (2.7) yields the corresponding pair potential \(u_*\), i.e., the solution of the inverse Henderson problem. Nevertheless, this approach also has some pitfalls as discussed in the following remark.

Remark 4.5

If the target \({\mathbb {P}}_*\) fails to be some \((\mu ,u)\)-Gibbs measure, then it is not clear to us whether the relative entropy functional \(\Phi \) will still be bounded from below on \({{\mathscr {U}}}\), and even if it is, its infimum need not be attained on \({{\mathscr {U}}}\).

Vice versa, if \(u_*\in {{\mathscr {U}}}\) happens to be the minimizer of \(\Phi \), then this does not imply that \({\mathbb {P}}_*\) is a \((\mu _*(u_*),u_*)\)-Gibbs measure. To see the latter, consider the following example: Let \(u_*\) be any hard-core potential in \({{\mathscr {U}}}\) and \(\rho _*<{\rho _{\text {cp}}}(u_*)\); compare Remark 4.4. Then, provided \(\rho _*\) is sufficiently small, a result by Kuna, Lebowitz, and Speer [24, Corollary 4.3] states that there exist uncountably many distinct translation invariant probability measures with finite specific entropy and density \(\rho _*\), which share the pair correlation function \(\rho ^{(2)}_*\) with the \((\mu _*(u_*),u_*)\)-Gibbs measure \({\mathbb {P}}\), but \({\mathbb {P}}\) is the only \((\mu ,u)\)-Gibbs measure among them according to the Henderson theorem. Because the specific interaction energy is given by (2.7) for all of these probability measures, the relative entropy functional does not differ, and hence, \(u_*\) is its unique minimizer, regardless of which of them is the target \({\mathbb {P}}_*\).

Finally, we mention that even if the target \({\mathbb {P}}_*\) is a \((\mu ,u_*)\)-Gibbs measure, and hence, the functional (4.5) is minimized by \(u_*\) according to Theorem 4.3, then this does not seem to imply that the model and the target have the same pair correlation function. To illustrate this, imagine that a fluid corresponding to a pair potential \(u_*\in {{\mathscr {U}}}\) exhibits a so-called triple point (compare, e.g., [17]), where three different phases coexist at the same thermodynamical state point, i.e., for the same values of pressure (or chemical potential \(\mu _*\), say) and temperature (or inverse temperature \(\beta \)). It can be expected that the different phases have linearly independent pair correlation functions; taking convex combinations of the corresponding Gibbs measures, one can thus determine two \((\mu _*,u_*)\)-Gibbs measures which have the same density \(\rho _*\), but have different pair correlation functions. One of them could be the target and the other one the minimizing model. We will see in Sect. 5 that if the density \(\rho _*\) belongs to the gas phase of the minimizer of \(\Phi \), then the pair correlation function of the minimizing Gibbs measure always coincides with the given \(\rho ^{(2)}_*\); see Remark 5.5. \(\diamond \)

5 Differentiability of the relative entropy functional

In the remainder of this paper, we further analyze the case when \(\Phi \) is a differentiable function of u. For this, we build upon our earlier work [14, 15]. A conceptual difficulty in this context is the fact that \({{\mathscr {U}}}\) lacks a universal topology. Therefore, following [14], we consider a tailored neighborhood for any given \(u_0\in {{\mathscr {U}}}\) by introducing a corresponding Banach space \({{\mathscr {V}}}\) of perturbations, consisting of all even measurable functions \(v:{\mathbb {R}}^d\rightarrow {\mathbb {R}}\), for which the associated norm

$$\begin{aligned} \Vert v\Vert _{{\mathscr {V}}}\,=\, \sup _{x\in {\mathbb {R}}^d} \frac{|v(x)|}{\psi _0(|x|)} \end{aligned}$$
(5.1)

is finite. Here, \(\psi _0\) is the majorant \(\psi \) of (2.1) associated with \(u_0\). Note that the norm (5.1) is somewhat stronger than the one employed in [14] because it is more restrictive in the core region \(0<r< r_0\), and hence, the resulting space of perturbations is smaller. We believe that this restriction allows a more natural formulation of our results.

Remark 5.1

It can easily be seen that for any \(v\in {{\mathscr {V}}}\) the sum \(u_0+v\) belongs to \({{\mathscr {U}}}\): If \(u_0\) is a hard-core potential, then \(r_0\) is its hard-core radius and the corresponding function \(\varphi \) from (2.1) equals \(+\infty \), and we have

$$\begin{aligned} \begin{array}{ll} \quad u_0(x)+v(x) \,\ge \, \varphi (|x|) &{}\quad \text { for }\, 0< |x| < r_0\,, \\ \bigl |u_0(x)+v(x)\bigr | \,\le \, \bigl (1\,+\,\Vert v\Vert _{{\mathscr {V}}}\bigr )\psi _0(|x|) &{}\quad \text { for }\, |x| \ge r_0\,. \end{array} \end{aligned}$$

Otherwise, there exists \(r_1\in (0,r_0]\) such that \(\varphi (r) \ge \Vert v\Vert _{{\mathscr {V}}}\psi _0(0)\) for \(0 < r \le r_1\), and therefore

$$\begin{aligned} \begin{array}{ll} \quad u_0(x)+v(x) \,\ge \, \varphi (|x|) - \Vert v\Vert _{{\mathscr {V}}}\psi _0(0) &{}\quad \text { for }\, 0< |x| < r_1\,, \\ \bigl |u_0(x)+v(x)\bigr | \,\le \, {\displaystyle \left( \sup _{|x'|\ge r_1}\frac{|u_0(x')|}{\psi _0(|x'|)} \,+\,\Vert v\Vert _{{\mathscr {V}}}\right) \psi _0(|x|)} &{}\quad \text { for }\, |x| \ge r_1\,. \end{array} \end{aligned}$$

In either case, this shows that \(u_0+v\in {{\mathscr {U}}}\). \(\diamond \)

We now consider the ball

$$\begin{aligned} {\mathcal {B}}_{{\mathscr {V}}}(u_0) \,=\, \bigl \{ u=u_0+v\,:\, \Vert v\Vert _{{\mathscr {V}}}<\delta _0\bigr \}\,\subset \,{{\mathscr {U}}}\end{aligned}$$
(5.2)

around \(u_0\) for a suitable radius \(\delta _0\) with \(0<\delta _0<1\); we further specify \(\delta _0\) in the context of (5.7) below. As has been verified in [14, Proposition 2.1], there are positive constants \(c_\beta \) and B, such that every \(u\in {\mathcal {B}}_{{\mathscr {V}}}(u_0)\) satisfies

$$\begin{aligned} \int _{{\mathbb {R}}^d} |e^{-\beta u(x)}-1|\,\textrm{d}x\,\le \, c_\beta \,, \end{aligned}$$
(5.3)

and the interaction energy of every configuration \(\gamma \subset {\mathbb {R}}^d\) with \(N(\gamma )\) particles is bounded from below by

$$\begin{aligned} U(\gamma ) \,\ge \, -B\,N(\gamma )\,. \end{aligned}$$
(5.4)

Moreover, for every \(u\in {\mathcal {B}}_{{\mathscr {V}}}(u_0)\) and \(\mu \in {\mathbb {R}}\) and every associated \((\mu ,u)\)-Gibbs measure, the sequence

$$\begin{aligned} {\varvec{\rho }}(\mu ,u)\,=\,[\rho ,\rho ^{(2)},\dots ]^T \end{aligned}$$

of its m-particle correlation functions is a solution of the so-called Kirkwood-Salsburg equations (cf., e.g., [39, Corollary 5.3], or compare (A.3) in the appendix), which can be written in the form

$$\begin{aligned} \bigl (I-e^{\beta \mu } A(u)\bigr ){\varvec{\rho }}(\mu ,u) \,=\, e^{\beta \mu } {\varvec{e_1}}\,, \end{aligned}$$
(5.5)

where \({\varvec{e_1}}=[1,0,\dots ]^T\), I is the identity operator, and A(u) is a bounded operator in an associated Banach space \({\mathscr {X}}\) of sequences of \(L^\infty \) functions. This operator A(u) is given by a certain infinite-dimensional matrix of integral operators, compare [14, 38], and is Fréchet differentiable with respect to \(u\in {\mathcal {B}}_{{\mathscr {V}}}(u_0)\).

We fix

$$\begin{aligned} \mu _0 \,<\, \frac{1}{\beta }\,\log \frac{1}{c_\beta e^{2\beta B+1}}\,, \end{aligned}$$
(5.6)

and call the range \(\mu \le \mu _0\) of chemical potentials the gas phase associated with the pair potentials in \({\mathcal {B}}_{{\mathscr {V}}}(u_0)\). In this gas phase, the Kirkwood-Salsburg equations (5.5) have a unique solution \({\varvec{\rho }}(\mu ,u)\), which can be developed in a converging Neumann series in \({\mathscr {X}}\); in particular, this means that for \(u\in {\mathcal {B}}_{{\mathscr {V}}}(u_0)\), there exists only one \((\mu ,u)\)-Gibbs measure for each chemical potential \(\mu \) within the gas phase. The individual components \(\rho ^{(m)}(\mu ,u)\) of \({\varvec{\rho }}(\mu ,u)\) are Fréchet differentiable (in \(L^\infty \)) with respect to u, analytic with respect to \(\mu <\mu _0\) and continuous in \(\mu \le \mu _0\). It is also easy to see that \({\varvec{\rho }}\) is a \(C^1\) function of \(\mu \le \mu _0\) and \(u\in {\mathcal {B}}_{{\mathscr {V}}}(u_0)\).
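The structure of (5.5) and of its Neumann series solution can be mimicked in finite dimensions. The following sketch replaces the operator A(u) on \({\mathscr {X}}\) by a matrix, assumes a smallness condition of the type \(e^{\beta \mu }\Vert A(u)\Vert <1\) as in the gas phase, and checks the series against a direct solve; it is an illustration only, not a discretization prescribed by the text.

```python
import numpy as np

def neumann_solve(A, z, n_terms=200):
    """Solve (I - z*A) rho = z*e1 by the Neumann series rho = sum_k (z*A)^k (z*e1),
    a finite-dimensional stand-in for the Kirkwood-Salsburg equations (5.5);
    the series converges whenever z * ||A|| < 1."""
    e1 = np.zeros(A.shape[0])
    e1[0] = 1.0
    rho, term = np.zeros_like(e1), z * e1
    for _ in range(n_terms):
        rho += term
        term = z * (A @ term)
    return rho

# sanity check against a direct solve in the contractive ('gas phase') regime
rng = np.random.default_rng(1)
A = rng.uniform(size=(5, 5))
A /= 2.0 * np.linalg.norm(A, 2)            # now ||A|| = 1/2
z = 0.9                                    # plays the role of exp(beta*mu); z*||A|| < 1
rho_direct = np.linalg.solve(np.eye(5) - z * A, z * np.eye(5)[:, 0])
assert np.allclose(neumann_solve(A, z), rho_direct)
```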

Further, for fixed \(u\in {\mathcal {B}}_{{\mathscr {V}}}(u_0)\), the density \(\rho =\rho (\mu ,u)\) of the associated Gibbs measure, i.e., the first entry of \({\varvec{\rho }}(\mu ,u)\), is differentiable and strictly increasing in \(\mu \) up to \(\mu =\mu _0\) by virtue of Corollary 3.2. Since it is also continuous in \(u\in {\mathcal {B}}_{{\mathscr {V}}}(u_0)\), we can choose \(\delta _0\) in (5.2) so small that

$$\begin{aligned} \rho _0\,=\, \inf \bigl \{ \rho (\mu _0,u)\,:\, u\in {\mathcal {B}}_{{\mathscr {V}}}(u_0)\bigr \} \,\ge \, (1-\varepsilon )\,\rho (\mu _0,u_0) \end{aligned}$$
(5.7)

for any given \(\varepsilon >0\), and then every density \(\rho \in (0,\rho _0)\) belongs to the gas phase of all pair potentials in \({\mathcal {B}}_{{\mathscr {V}}}(u_0)\). Finally, for fixed \(u\in {\mathcal {B}}_{{\mathscr {V}}}(u_0)\), the pressure \(p(\mu ,u)\) is also differentiable and strictly increasing in \(\mu \) up to \(\mu =\mu _0\), and its derivative is given by the density, cf. (4.1).

Proposition 5.2

Let \(0<\rho _*<\rho _0\) with \(\rho _0\) as in (5.7). Then, the chemical potential \(\mu _*=\mu _*(u)\) defined in Theorem 4.1 is differentiable with respect to \(u\in {\mathcal {B}}_{{\mathscr {V}}}(u_0)\) with derivative \(\mu _*'(u)\in {{\mathscr {V}}}'\) given by

$$\begin{aligned} \mu _*'(u)\,v\,=\, - \frac{\partial _u\rho (\mu _*(u),u)\,v}{\partial _\mu \rho (\mu _*(u),u)}\qquad \text {for}\, v\in {{\mathscr {V}}}, \end{aligned}$$
(5.8)

where \(\partial _\mu \rho \) and \(\partial _u\rho \) denote the partial derivatives of \(\rho (\mu ,u)\), and \({{\mathscr {V}}}'\) is the dual space of \({{\mathscr {V}}}\).

Proof

Since the pressure and the density of a fixed pair potential \(u\in {\mathcal {B}}_{{\mathscr {V}}}(u_0)\) are strictly increasing and differentiable functions of the chemical potential in the gas phase, we can rewrite each of these functions as a strictly increasing function of the other. The pressure, for example, can be written as a function of the density, which we denote by \(\pi \) to avoid confusion, i.e.,

$$\begin{aligned} p(\mu ,u) \,=\, \pi \bigl (\rho (\mu ,u),u\bigr )\,. \end{aligned}$$

Then, the chain rule can be applied to obtain

$$\begin{aligned} \partial _\mu p(\mu ,u) \,=\,\partial _\rho \pi \bigl (\rho (\mu ,u),u\bigr ) \partial _\mu \rho (\mu ,u)\,, \end{aligned}$$
(5.9)

because it has been shown in [39, Theorem 4.3] that \(\pi \) is a differentiable function of the density in the gas phase. For \(\mu =\mu _*(u)\), we thus conclude from (4.1) and (5.9) that

$$\begin{aligned} \rho _* \,=\, \rho \bigl (\mu _*(u),u\bigr ) \,=\, \partial _\mu p\bigl (\mu _*(u),u\bigr ) \,=\, \alpha \,\partial _\mu \rho \bigl (\mu _*(u),u\bigr ) \end{aligned}$$
(5.10)

with

$$\begin{aligned} \alpha \,=\, \partial _\rho \pi (\rho _*,u)\,>\, 0\,. \end{aligned}$$

It thus follows from (5.10) that \(\partial _\mu \rho \bigl (\mu _*(u),u\bigr )>0\), so that the implicit function theorem is applicable to the equation

$$\begin{aligned} \rho (\mu _*(u+v),u+v) \,=\, \rho _* \end{aligned}$$

near \(v=0\). From this, we readily obtain (5.8); moreover, \(\mu _*'(u)\) belongs to \({{\mathscr {V}}}'\) because \(\partial _u\rho \in {{\mathscr {V}}}'\). \(\square \)

Now, we return to our analysis of the relative entropy functional (4.5).

Theorem 5.3

Let \(u_0\in {{\mathscr {U}}}\) be a fixed pair potential and \({\mathcal {B}}_{{\mathscr {V}}}(u_0)\subset {{\mathscr {U}}}\) be defined as in (5.2) with \(\delta _0\) so small that (5.7) holds. Furthermore, let the target \({\mathbb {P}}_*\) of the relative entropy functional have density \(\rho _*<\rho _0\), and assume that the associated specific interaction energy \(E(\,\cdot \,,{\mathbb {P}}_*)\) is bounded on \({\mathcal {B}}_{{\mathscr {V}}}(u_0)\). Then, \(\Phi \) is Fréchet-differentiable in every \(u\in {\mathcal {B}}_{{\mathscr {V}}}(u_0)\) with derivative \(\Phi '(u)\in {{\mathscr {V}}}'\), given by

$$\begin{aligned} \Phi '(u)\,v \,=\, E(v,{\mathbb {P}}_*) \,-\, \frac{1}{2}\int _{{\mathbb {R}}^d}v(x)\rho ^{(2)}(x)\,\textrm{d}x\end{aligned}$$
(5.11)

for \(v\in {{\mathscr {V}}}\), where \(\rho ^{(2)}\) is the pair correlation function of the associated \((\mu _*(u),u)\)-Gibbs measure.

Proof

Let \(u\in {\mathcal {B}}_{{\mathscr {V}}}(u_0)\) and \(v\in {{\mathscr {V}}}\). We apply Corollary 3.2 to \(u_1=u\) and \(u_2=u+v\) with \(\mu _1=\mu _*(u)\) and \(\mu _2=\mu _*(u+v)\). This yields

$$\begin{aligned} - \frac{1}{2}\int _{{\mathbb {R}}^d} v(x)\rho ^{(2)}(x)\,\textrm{d}x&\,\le \,p_*(u+v)-\mu _*(u+v)\rho _* \,-\, \bigl (p_*(u) -\mu _*(u)\rho _*\bigr )\\&\,\le \, -\frac{1}{2}\int _{{\mathbb {R}}^d} v(x)\widetilde{\rho }^{(2)}(x)\,\textrm{d}x\,, \end{aligned}$$

where \(\widetilde{\rho }^{(2)}\) denotes the pair correlation function of the corresponding \((\mu _*(u+v),u+v)\)-Gibbs measure. It follows that

$$\begin{aligned} E(v,{\mathbb {P}}_*) \,-\, \frac{1}{2}\int _{{\mathbb {R}}^d} v(x)\rho ^{(2)}(x)\,\textrm{d}x&\,\le \,\Phi (u+v) \,-\, \Phi (u)\\&\,\le \, E(v,{\mathbb {P}}_*) \,-\, \frac{1}{2}\int _{{\mathbb {R}}^d} v(x)\widetilde{\rho }^{(2)}(x)\,\textrm{d}x\,, \end{aligned}$$

i.e., that

$$\begin{aligned} 0&\,\le \, \Phi (u+v)-\Phi (u)\,-\,\Bigl (E(v,{\mathbb {P}}_*) - \frac{1}{2}\int _{{\mathbb {R}}^d}v(x)\rho ^{(2)}(x)\,\textrm{d}x\Bigr ) \\&\,\le \, \frac{1}{2} \int _{{\mathbb {R}}^d} v(x)\big (\rho ^{(2)}-\widetilde{\rho }^{(2)}\big )(x)\,\textrm{d}x\,. \end{aligned}$$

This shows that

$$\begin{aligned}&\left| \Phi (u+v)-\Phi (u) \,-\,\Bigl (E(v,{\mathbb {P}}_*) - \frac{1}{2}\int _{{\mathbb {R}}^d}v(x)\rho ^{(2)}(x)\,\textrm{d}x\Bigr ) \right| \\&\qquad \,\le \, \frac{C_{\psi _0}}{2}\,\Vert v\Vert _{{{\mathscr {V}}}} \Vert \widetilde{\rho }^{(2)}-\rho ^{(2)}\Vert _{L^\infty ({\mathbb {R}}^d)}\,, \end{aligned}$$

where

$$\begin{aligned} C_{\psi _0} \,=\, |{\mathbb {S}}^{d-1}| \int _0^\infty r^{d-1}\psi _0(r)\,\textrm{d}r\,<\, \infty \end{aligned}$$

by virtue of (2.1). We finally observe that

$$\begin{aligned} \widetilde{\rho }^{(2)} \,=\, \rho ^{(2)}(\mu _*(u+v),u+v) \end{aligned}$$

converges to \(\rho ^{(2)}=\rho ^{(2)}(\mu _*(u),u)\) in \(L^\infty ({\mathbb {R}}^d)\) as \(\Vert v\Vert _{{\mathscr {V}}}\rightarrow 0\) because of Proposition 5.2 and the smoothness of \(\rho ^{(2)}\) as a function of \(\mu \) and u. We have therefore established that

$$\begin{aligned} \left| \Phi (u+v)-\Phi (u) \,-\,\Bigl (E(v,{\mathbb {P}}_*) - \frac{1}{2}\int _{{\mathbb {R}}^d}v(x)\rho ^{(2)}(x)\,\textrm{d}x\Bigr ) \right| \,=\, o(\Vert v\Vert _{{\mathscr {V}}}) \end{aligned}$$

as \(\Vert v\Vert _{{\mathscr {V}}}\rightarrow 0\), which yields (5.11). Since \(E(\,\cdot \,,{\mathbb {P}}_*)\) is bounded on \({\mathcal {B}}_{{\mathscr {V}}}(u_0)\), it is easy to see that \(E(\,\cdot \,,{\mathbb {P}}_*)\in {{\mathscr {V}}}'\), and hence, it follows in the same way as above that \(\Phi '(u)\in {{\mathscr {V}}}'\) with

$$\begin{aligned} \Vert \Phi '(u)\Vert _{{{\mathscr {V}}}'} \,\le \, \Vert E(\,\cdot \,,{\mathbb {P}}_*)\Vert _{{{\mathscr {V}}}'} \,+\, \frac{C_{\psi _0}}{2}\,\Vert \rho ^{(2)}\Vert _{L^\infty ({\mathbb {R}}^d)}\,. \end{aligned}$$

This concludes the proof. \(\square \)

Remark 5.4

We emphasize that the dual space \({{\mathscr {V}}}'\) of \({{\mathscr {V}}}\) is a complicated Banach space. However, the representation (5.11), in combination with (2.7), reveals that if \({\mathbb {P}}_*\) has a bounded and measurable pair correlation function \(\rho ^{(2)}_*\), then \(\Phi '(u)\) can be identified with

$$\begin{aligned} \nabla \Phi (u) \,=\, \frac{1}{2}\bigl (\rho _*^{(2)}-\rho ^{(2)}(\mu _*(u),u)\bigr ) \,\in \, L^\infty ({\mathbb {R}}^d)\,, \end{aligned}$$
(5.12)

when using the natural dual pairing

$$\begin{aligned} \Phi '(u)v \,=\, \langle \,v,\nabla \Phi (u)\,\rangle \,=\, \int _{{\mathbb {R}}^d} v(x)\bigl (\nabla \Phi (u)\bigr )(x)\,\textrm{d}x\end{aligned}$$
(5.13)

of \(L^2({\mathbb {R}}^d)\).

This is the case, for example, if \({\mathbb {P}}_*\) is a \((\mu ,u_*)\)-Gibbs measure for some \(u_*\in {{\mathscr {U}}}\) (with \(\mu =\mu _*(u_*)\)), and then (5.12) can further be rewritten as

$$\begin{aligned} \nabla \Phi (u) \,=\, \frac{1}{2}\bigl (\omega ^{(2)}(\mu ,u_*) -\omega ^{(2)}(\mu _*(u),u)\bigr )\,, \end{aligned}$$

where

$$\begin{aligned} \omega ^{(2)}(\mu ,u) \,=\, \rho ^{(2)}(\mu ,u) - \rho (\mu ,u)^2\,. \end{aligned}$$
(5.14)

It follows from (5.17) below that \(\nabla \Phi (u)\) then also belongs to \(L^1({\mathbb {R}}^d)\), and hence, to \(L^2({\mathbb {R}}^d)\) as well. Note that \({{\mathscr {V}}}\) is embedded in \(L^2({\mathbb {R}}^d)\) for the same reason. We refer to Sect. 7 for further arguments which support the choice of the \(L^2\)-pairing. \(\diamond \)
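The representation (5.12) suggests a simple steepest-descent update for \(\Phi \) once the pair correlation function of the current model has been computed, by simulation or otherwise. The following sketch is purely schematic: it works on a radial grid, and the step size as well as the computation of rho2_model are left unspecified and are not prescribed by the theory above.

```python
import numpy as np

def gradient_step(u, rho2_model, rho2_target, step=1.0):
    """One steepest-descent update based on (5.12): on a radial grid,
    grad Phi(u) = 0.5 * (rho2_target - rho2_model), so that
    u_new = u - step * grad Phi(u).  Here rho2_model stands for the pair
    correlation function of the (mu_*(u), u)-Gibbs measure at density rho_*
    and has to be supplied, e.g., by a simulation."""
    return u - step * 0.5 * (rho2_target - rho2_model)
```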

Remark 5.5

Assume that the target \({\mathbb {P}}_*\) has a bounded and measurable pair correlation function \(\rho ^{(2)}_*\), and that the relative entropy functional \(\Phi \) is minimized by some \(u_*\in {{\mathscr {U}}}\). Assume further that the given density \(\rho _*\) belongs to the gas phase of \(u_*\). Then, it follows from Theorem 5.3 that \(\Phi \) is differentiable at \(u_*\), and that \(\nabla \Phi (u_*) = 0\). From (5.12), we thus conclude that the target and the model share the same pair correlation function in this case; compare Remark 4.5. \(\diamond \)

Remark 5.6

For a \((\mu ,u)\)-Gibbs measure the cluster functions, sometimes also called Ursell functions (e.g., Stell [43], and [15]), or truncated correlation functions (e.g., Jansen [21]), are defined recursively by the constant function \(\omega ^{(1)} =\rho \) and, for \(m\ge 2\), by

$$\begin{aligned} \omega ^{(m)}(x_1,\dots ,x_m) \,=\, \rho ^{(m)}(x_1,\dots ,x_m) \,-\, \sum _{k=2}^m\, \sum _{\pi \in \Pi _k^m}\, \prod _{i=1}^k \omega ^{(|\pi _i|)}({\varvec{x}}_{\pi _i})\,, \end{aligned}$$
(5.15)

cf. [38, p. 87], where \(\Pi _k^m\) denotes the set of partitions \(\pi \) of \(x_1,\dots ,x_m\) into k subsets \({\varvec{x}}_{\pi _1},\dots ,{\varvec{x}}_{\pi _k}\). Note that it immediately follows from (5.15) that each cluster function inherits the translation and permutation invariance of the correlation functions.

For \(m=2\), we recover from (5.15) the definition (5.14), and for fixed \(\mu \) and u, we adopt the short-hand notation

$$\begin{aligned} \omega ^{(2)}(x) \,=\, \rho ^{(2)}(x) \,-\, \rho ^2 \end{aligned}$$

for \(\omega ^{(2)}(x,0)\) from the pair correlation function. Note that

$$\begin{aligned} \omega ^{(2)}(x) \,=\, \omega ^{(2)}(-x) \qquad \text {for every}\, x\in {\mathbb {R}}^d \end{aligned}$$
(5.16)

because of the translation invariance.
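For example, for \(m=3\), the recursion (5.15) reads, in the short-hand notation above,

$$\begin{aligned} \omega ^{(3)}(x_1,x_2,x_3) \,=\, \rho ^{(3)}(x_1,x_2,x_3) \,-\, \rho \,\bigl (\omega ^{(2)}(x_1-x_2)+\omega ^{(2)}(x_1-x_3)+\omega ^{(2)}(x_2-x_3)\bigr ) \,-\, \rho ^3\,, \end{aligned}$$

since each of the three partitions into two subsets contributes \(\omega ^{(1)}\omega ^{(2)}=\rho \,\omega ^{(2)}\), and the partition into three singletons contributes \(\rho ^3\).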

We will make repeated use of [38, Theorem 4.4.8] which states that

$$\begin{aligned} \int _{({\mathbb {R}}^d)^{(m-2)}} \bigl |\omega ^{(m)}(\,\cdot \,,0,x_3,\dots ,x_m)\bigr | \,\textrm{d}x_3\cdots \,\textrm{d}x_m \,\in \, L^1({\mathbb {R}}^d)\,, \end{aligned}$$
(5.17)

provided that \(u\in {{\mathscr {U}}}\) and \(\mu \le \mu _0\); see also Lemma A.4 in the appendix. In particular, (5.17) shows that \(\omega ^{(2)}\in L^1({\mathbb {R}}^d)\). \(\diamond \)

Based on Theorem 5.3, we now compute the Hessian of \(\Phi \).

Theorem 5.7

Under the assumptions of Theorem 5.3, the relative entropy functional (4.5) is twice differentiable, and its Hessian at \(u\in {\mathcal {B}}_{{\mathscr {V}}}(u_0)\) is given by

$$\begin{aligned} \Phi ''(u)(v,\widetilde{v})\,=\, - \frac{1}{\partial _\mu \rho } \bigl (\partial _u \rho \,\widetilde{v}\bigr )\bigl (\partial _u \rho \,v\bigr ) \,-\, \frac{1}{2}\,\langle \,v,\partial _u\rho ^{(2)}\,\widetilde{v}\,\rangle \end{aligned}$$
(5.18)

in terms of the dual pairing (5.13) of \(L^2({\mathbb {R}}^d)\), where \(v,\widetilde{v}\in {{\mathscr {V}}}\), and the partial derivatives \(\partial _u\rho \) and \(\partial _u\rho ^{(2)}\) are evaluated at the pair potential u and the chemical potential \(\mu _*(u)\).

Proof

We differentiate (5.11) with respect to u in the direction \(\widetilde{v}\in {{\mathscr {V}}}\). Keeping in mind that \(\rho ^{(2)}\) in (5.11) stands for \(\rho ^{(2)}(\mu _*(u),u)\), the chain rule gives

$$\begin{aligned} \begin{aligned} \Phi ''(u)(v,\widetilde{v})&\,=\, -\frac{1}{2}\int _{{\mathbb {R}}^d} v(x) \Bigl (\partial _\mu \rho ^{(2)}\mu _*'(u)\,\widetilde{v}\,+\, \partial _u \rho ^{(2)}\,\widetilde{v}\Bigr )(x)\,\textrm{d}x\\&\,=\, -\frac{\mu _*'(u)\,\widetilde{v}}{2}\, \langle \,v,\partial _\mu \rho ^{(2)}\,\rangle \,-\, \frac{1}{2}\,\langle \,v,\partial _u\rho ^{(2)}\,\widetilde{v}\,\rangle \,, \end{aligned} \end{aligned}$$
(5.19)

where we have adopted the notation (5.13), and the partial derivatives of \(\rho ^{(2)}\)—which belong to \(L^\infty ({\mathbb {R}}^d)\)—are evaluated at the pair potential u and the chemical potential \(\mu _*(u)\).

To determine \(\partial _\mu \rho ^{(2)}(\mu ,u)\) for an arbitrary chemical potential \(\mu <\mu _0\) we apply Corollary 3.2 with \(u_1=u\) and \(u_2=u+w\) for any \(w\in {{\mathscr {V}}}\), and with \(\mu _1=\mu _2=\mu \): As in the proof of Theorem 5.3, we conclude that the pressure is differentiable with respect to u with derivative \(\partial _u p(\mu ,u)\in {{\mathscr {V}}}'\) given by

$$\begin{aligned} \partial _u p(\mu ,u)\,w \,=\, -\frac{1}{2}\int _{{\mathbb {R}}^d}w(x)\rho ^{(2)}(x)\,\textrm{d}x\,=\, -\frac{1}{2} \langle \,w,\rho ^{(2)}\,\rangle \,, \end{aligned}$$

where \(\rho ^{(2)}\) is the pair correlation function of the \((\mu ,u)\)-Gibbs measure. Schwarz’s theorem and (4.1) therefore yield

$$\begin{aligned} \langle \,w,\partial _\mu \rho ^{(2)}\,\rangle \,=\, \partial _\mu \langle \,w,\rho ^{(2)}\,\rangle \,=\, -2\,\partial _\mu \bigl ((\partial _u \,p)w\bigr ) \,=\, -2\,\partial _u (\partial _\mu \,p)w \,=\, -2\,\partial _u \rho \,w\,. \end{aligned}$$
(5.20)

Inserting this into (5.19), we obtain

$$\begin{aligned} \Phi ''(u)(v,\widetilde{v}) \,=\, \bigl (\mu _*'(u)\,\widetilde{v}\bigr )\bigl (\partial _u\rho \, v\bigr ) \,-\, \frac{1}{2}\langle \,v,\partial _u\rho ^{(2)}\,\widetilde{v}\,\rangle \,, \end{aligned}$$

and assertion (5.18) thus follows from Proposition 5.2, which provides the identity \(\mu _*'(u)\,\widetilde{v} = -(\partial _u\rho \,\widetilde{v})/\partial _\mu \rho \). \(\square \)

Note that \(\Phi ''(u)\) is a continuous bilinear form: This follows readily from the result in [14] that \(\partial _u\rho \in {{\mathscr {V}}}'\) and \(\partial _u\rho ^{(2)}\in {\mathscr {L}}\bigl ({{\mathscr {V}}},L^\infty ({\mathbb {R}}^d)\bigr )\). We further observe that \(\Phi ''\) is continuous in u because \(\rho \) and \(\rho ^{(2)}\) are \(C^1\) functions of \(\mu \) and u. This implies that \(\Phi ''(u)\) is a symmetric bilinear form. Finally, this bilinear form is positive semidefinite, because \(\Phi \) is strictly convex.

6 The connection to the IMC iterative scheme

In the preceding section, we have seen that the relative entropy functional is twice differentiable at any \(u_0\in {{\mathscr {U}}}\), provided that the target density \(\rho _*\) belongs to its gas phase. The convex quadratic approximation

$$\begin{aligned} \Phi _2(u_0+v) \,:=\, \Phi (u_0) \,+\, \Phi '(u_0)v \,+\, \frac{1}{2}\,\Phi ''(u_0)(v,v) \,\approx \, \Phi (u_0+v)\,, \end{aligned}$$
(6.1)

valid for small enough \(v\in {{\mathscr {V}}}\), attains its minimum for every solution \(v_0\in {{\mathscr {V}}}\) of the variational problem

$$\begin{aligned} \Phi ''(u_0)(v_0,w) \,=\, - \Phi '(u_0)w \qquad \text {for all}\, w\in {{\mathscr {V}}}, \end{aligned}$$
(6.2)

and the well-known Newton scheme for minimizing \(\Phi \) consists in updating \(u_0\) via

$$\begin{aligned} u_1 \,=\, u_0 \,+\, v_0 \end{aligned}$$
(6.3)

to obtain a (hopefully) better approximation of the unique global minimizer of \(\Phi \). Unfortunately, however, we only know that \(\Phi ''(u_0)\) is positive semidefinite; it is therefore not clear whether (6.2) has a solution, and if it does, whether this solution is unique.
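To make the update (6.2)–(6.3) concrete in a discretized setting, the following sketch performs one regularized Newton step on a grid. It is only an illustration: the arrays `grad_phi` and `hess_phi` are assumed to be supplied by some discretization of \(\Phi '(u_0)\) and \(\Phi ''(u_0)\), and the small Tikhonov term added to the Hessian is our own safeguard against the fact that \(\Phi ''(u_0)\) is merely known to be positive semidefinite.

```python
import numpy as np

def newton_step(u0, grad_phi, hess_phi, damping=1e-8):
    """One regularized Newton update u1 = u0 + v0 for minimizing Phi.

    u0       : current pair potential sampled on a grid (1d array)
    grad_phi : discretization of Phi'(u0) on the same grid
    hess_phi : discretization of Phi''(u0) (symmetric, positive
               semidefinite matrix)
    damping  : small Tikhonov parameter, added because Phi''(u0) is only
               known to be semidefinite, so the plain system (6.2) need
               not be uniquely solvable
    """
    n = u0.size
    # discrete analogue of (6.2):  Phi''(u0) v0 = -Phi'(u0)
    v0 = np.linalg.solve(hess_phi + damping * np.eye(n), -grad_phi)
    # Newton update (6.3)
    return u0 + v0
```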

The bilinear form \(\Phi ''(u_0)\) defines a linear operator \(A:{{\mathscr {V}}}\rightarrow {{\mathscr {V}}}'\) via the identity

$$\begin{aligned} \Phi ''(u_0)(v,\widetilde{v}) \,=\, \langle \,\widetilde{v},Av\,\rangle _{{{\mathscr {V}}}\times {{\mathscr {V}}}'} \qquad \text {for all}\, v,\widetilde{v}\in {{\mathscr {V}}}\,. \end{aligned}$$

As in Remark 5.4 the dual form on the right-hand side can be replaced by the dual pairing induced by \(L^2({\mathbb {R}}^d)\), when A is identified with the (bounded) operator

$$\begin{aligned} A:{{\mathscr {V}}}\rightarrow L^\infty ({\mathbb {R}}^d)\,, \qquad A:v\,\mapsto \,-\frac{1}{2}\,\partial _u\rho ^{(2)}v \,-\, \frac{1}{\partial _\mu \rho }(\partial _u\rho \,v)\,\nabla _u\rho \,, \end{aligned}$$
(6.4)

where \(\nabla _u\rho \in L^\infty ({\mathbb {R}}^d)\) denotes the representative of the functional \(\partial _u\rho \in {{\mathscr {V}}}'\) for this pairing. This is an immediate consequence of Theorem 5.7. According to [15, Sect. 6] there holds

$$\begin{aligned} \nabla _u\rho (x) \,=\, -\beta \rho ^{(2)}(x) \,-\, \frac{\beta }{2} \int _{{\mathbb {R}}^d} \chi ^{(3)}(0,x+x',x')\,\textrm{d}x'\,, \end{aligned}$$
(6.5)

where

$$\begin{aligned} \chi ^{(3)}(0,x+x',x') \,=\, \omega ^{(3)}(0,x+x',x') \,+\, \rho \,\omega ^{(2)}(x') \,+\, \rho \,\omega ^{(2)}(x+x') \end{aligned}$$

is given in terms of the second and third cluster functions associated with \(\mu \) and u; compare Remark 5.6. Using the translation and permutation invariance of \(\omega ^{(3)}\), this can be rewritten as

$$\begin{aligned} \chi ^{(3)}(0,x+x',x') \,=\, \omega ^{(3)}(x,0,-x') \,+\, \rho \,\omega ^{(2)}(x') \,+\, \rho \,\omega ^{(2)}(x+x')\,. \end{aligned}$$
(6.6)

Note that it has been shown in [15, Proposition 4.2] that the integral in (6.5) is absolutely convergent and that it defines a bounded function of \(x\in {\mathbb {R}}^d\). Since \(\omega ^{(2)}\in L^1({\mathbb {R}}^d)\), it thus follows from (6.6) that

$$\begin{aligned} \int _{{\mathbb {R}}^d} \bigl |\omega ^{(3)}(\,\cdot \,,0,-x')\bigr | \,\textrm{d}x' \,\in \, L^\infty ({\mathbb {R}}^d)\,, \end{aligned}$$
(6.7)

and inserting (6.6) into (6.5), we finally arrive at the representation

$$\begin{aligned} \nabla _u\rho (x) \,=\, -\beta \rho ^{(2)}(x) \,-\, \frac{\beta }{2} \int _{{\mathbb {R}}^d} \omega ^{(3)}(x,0,x')\,\textrm{d}x' \,-\, \beta \rho \int _{{\mathbb {R}}^d} \omega ^{(2)}(x')\,\textrm{d}x'\,, \end{aligned}$$
(6.8)

which will be used later on.

Now, we introduce the (nonlinear) “Henderson operator”

$$\begin{aligned} F:u \,\mapsto \, \rho ^{(2)}\bigl (\mu _*(u),u\bigr )\,, \end{aligned}$$
(6.9)

which maps \(u\in {\mathcal {B}}_{{\mathscr {V}}}(u_0)\) to its associated pair correlation function at the given density \(\rho _*\), and which is an injective operator according to the Henderson theorem. Since \(F(u)+2\nabla \Phi (u)\) is independent of u according to (5.11), see also (5.12), it follows that \(F:{\mathcal {B}}_{{\mathscr {V}}}(u_0)\rightarrow L^\infty ({\mathbb {R}}^d)\) is differentiable for \(\rho _*<\rho _0\) with

$$\begin{aligned} \langle \,\widetilde{v},F'(u_0)v\,\rangle \,=\, -2\,\Phi ''(u_0)(v,\widetilde{v})\,. \end{aligned}$$
(6.10)

This shows that if the target \({\mathbb {P}}_*\) has a pair correlation function \(\rho ^{(2)}_*\in L^\infty ({\mathbb {R}}^d)\), then the variational problem (6.2) is equivalent to the linear operator equation

$$\begin{aligned} F'(u_0)v_0 \,=\, \rho ^{(2)}_* \,-\, F(u_0)\,. \end{aligned}$$
(6.11)

The step (6.3) for minimizing \(\Phi \) with Newton’s method is thus equivalent to one iteration of the IMC method, where the update \(v_0\) is determined by solving (6.11). This is the thermodynamic limit analog of the observation in [29, 36], referred to in the introduction.

Remark 6.1

As follows from Theorem 5.3 and (2.7), the Henderson operator satisfies

$$\begin{aligned} \langle \,v,F(u_0)\,\rangle \,=\, 2\,E(v,{\mathbb {P}}_0)\,, \end{aligned}$$

where \({\mathbb {P}}_0\) denotes the corresponding \((\mu _*(u_0),u_0)\)-Gibbs measure. In the IMC scheme, as introduced in [26], this interaction energy is approximated by the corresponding expectation value

$$\begin{aligned} E(v,{\mathbb {P}}_0) \,\approx \, \frac{1}{|\Lambda |}\,\langle \,V\,\rangle _{\rho _*,\Lambda } \end{aligned}$$

of the canonical ensemble corresponding to the pair potential \(u_0\) at density \(\rho _*\) in the bounded box \(\Lambda \subset {\mathbb {R}}^d\), where V is the observable

$$\begin{aligned} V(\gamma ) \,=\, \sum _{1\le i<j\le N} v(x_i-x_j) \end{aligned}$$

associated with the perturbation \(v\in {{\mathscr {V}}}\), compare (2.3). Likewise, the derivative \(F'(u_0)\) is approximated in the IMC scheme via the cross-correlation matrix

$$\begin{aligned} -\frac{2\beta }{|\Lambda |}\Bigl (\langle \,V\widetilde{V}\,\rangle _{\rho _*,\Lambda } - \langle \,V\,\rangle _{\rho _*,\Lambda }\langle \,\widetilde{V}\,\rangle _{\rho _*,\Lambda }\Bigr )\ \,\approx \,\langle \,\widetilde{v},F'(u_0)v\,\rangle \,. \end{aligned}$$

It has to be emphasized that the density constraint \(\rho (u)=\rho _*\) is inherently built into the canonical ensemble. Accordingly, this approximation of \(F'(u_0)\) does respect this constraint. When working in the grand canonical ensemble instead (or in the thermodynamic limit), this constraint necessitates a correction term to project the unconstrained derivative into the tangent plane of the manifold of pair correlation functions corresponding to particle systems with the correct density. This is the reason for the first term on the right-hand side of (5.18) which does not (and need not) occur in the original IMC scheme. \(\diamond \)
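For illustration only, a minimal sampling sketch for the two quantities appearing in this remark could look as follows. It assumes stored configurations from a canonical simulation at density \(\rho _*\) in the box \(\Lambda \), and it uses piecewise-constant radial perturbations (one per histogram bin) as a hypothetical basis; this particular choice is made here for concreteness and is not prescribed by [26].

```python
import numpy as np

def imc_estimates(configs, box_volume, bin_edges, beta):
    """Monte Carlo estimates used in one IMC iteration (cf. Remark 6.1).

    configs    : list of particle configurations, each an (N, d) array
    box_volume : volume |Lambda| of the simulation box
    bin_edges  : radial bins; the k-th bin plays the role of a
                 piecewise-constant perturbation v_k (an assumption made
                 here for illustration)
    beta       : inverse temperature

    Returns estimates of (1/|Lambda|) <V_k> and of the cross-correlation
    matrix -(2*beta/|Lambda|) (<V_k V_l> - <V_k><V_l>), which approximates
    <v_l, F'(u0) v_k>.
    """
    counts = []
    for x in configs:
        # all pair separations |x_i - x_j| with i < j
        diff = x[:, None, :] - x[None, :, :]
        r = np.sqrt((diff ** 2).sum(-1))
        iu = np.triu_indices(len(x), k=1)
        # V_k(gamma): number of pairs whose separation falls into bin k
        counts.append(np.histogram(r[iu], bins=bin_edges)[0].astype(float))
    counts = np.array(counts)                    # shape (snapshots, bins)
    mean_V = counts.mean(axis=0)
    cov_V = np.cov(counts, rowvar=False, bias=True)
    energy_per_volume = mean_V / box_volume      # approximates E(v_k, P0)
    jacobian = -2.0 * beta / box_volume * cov_V  # approximates <v_l, F'(u0) v_k>
    return energy_per_volume, jacobian
```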

7 Extension of the Hessian to a semidefinite bilinear form on \(L^2({\mathbb {R}}^d)\)

In the remainder of this paper, we study the Hessian of the relative entropy functional somewhat further. More precisely, we examine the mapping properties of the Jacobian of the Henderson map, which is connected to the Hessian \(\Phi ''\) via (6.10).

To begin with, recall from Remark 5.4 that the right-hand side of (6.11) belongs to \(L^\infty ({\mathbb {R}}^d)\cap L^1({\mathbb {R}}^d)\), when the target \({\mathbb {P}}_*\) is a \((\mu ,u_*)\)-Gibbs measure. As our first result of this section, we show that \(F'(u_0)\) has matching mapping properties.

Theorem 7.1

Under the assumptions of Theorem 5.3, the Jacobian \(F'(u_0)\) of the Henderson operator (6.9) belongs to \({\mathscr {L}}({{\mathscr {V}}},L^\infty ({\mathbb {R}}^d))\cap {\mathscr {L}}({{\mathscr {V}}},L^1({\mathbb {R}}^d))\).

Proof

We already know from Sect. 6 that \(F'(u_0)=-2A\in {\mathscr {L}}({{\mathscr {V}}},L^\infty ({\mathbb {R}}^d))\). It thus remains to establish that \(F'(u_0)\in {\mathscr {L}}({{\mathscr {V}}},L^1({\mathbb {R}}^d))\). To this end, we rewrite the Henderson operator in terms of the cluster function (5.14), i.e.,

$$\begin{aligned} F(u_0) \,=\, \omega ^{(2)}(\mu _*(u_0),u_0)\,+\, \rho _*^2\,, \end{aligned}$$

which yields

$$\begin{aligned} F'(u_0)\,v \,=\, \partial _\mu \omega ^{(2)}(\mu _*,u_0)\,\mu _*'(u_0)\,v \,+\, \partial _u\omega ^{(2)}(\mu _*,u_0)\,v\,, \end{aligned}$$
(7.1)

where \(\mu _*=\mu _*(u_0)\). The partial derivative of the cluster function with respect to the chemical potential is given by

$$\begin{aligned} \begin{aligned} \partial _\mu \omega ^{(2)}(\mu ,u)&\,=\, \partial _\mu \bigl (\rho ^{(2)}(\mu ,u)\,-\, \rho (\mu ,u)^2\bigr ) \\&\,=\, \partial _\mu \rho ^{(2)}(\mu ,u) \,-\, 2\rho (\mu ,u)\partial _\mu \rho (\mu ,u)\,. \end{aligned} \end{aligned}$$
(7.2)

From (5.20) and (6.8), we conclude that

$$\begin{aligned} \partial _\mu \rho ^{(2)}(x)&\,=\, -2\,\nabla _u\rho (x)\\&\,=\, 2\beta \, \rho ^{(2)}(x) \,+\, \beta \int _{{\mathbb {R}}^d} \omega ^{(3)}(x,0,x') \,\textrm{d}x' \,+\, 2\beta \rho \int _{{\mathbb {R}}^d}\omega ^{(2)}(x')\,\textrm{d}x'\,, \end{aligned}$$

and hence, together with Lemma A.2, it follows that

$$\begin{aligned} \partial _\mu \omega ^{(2)}(x) \,=\, 2\beta \,\omega ^{(2)}(x) \,+\, \beta \int _{{\mathbb {R}}^d} \omega ^{(3)}(x,0,x')\,\textrm{d}x' \,. \end{aligned}$$
(7.3)

In particular, this shows that \(\partial _\mu \omega ^{(2)}\in L^1({\mathbb {R}}^d)\) by virtue of (5.17). Together with Proposition 5.2 and (7.1), we thus obtain

$$\begin{aligned}&\Vert F'(u_0)\,v\Vert _{L^1({\mathbb {R}}^d)}\\&\qquad \le \Bigl (\Vert \partial _\mu \omega ^{(2)}(\mu _*,u_0)\Vert _{L^1({\mathbb {R}}^d)} \Vert \mu _*'(u_0)\Vert _{{{\mathscr {V}}}'} +\Vert \partial _u\omega ^{(2)}(\mu _*,u_0)\Vert _{{\mathscr {L}}({{\mathscr {V}}},L^1({\mathbb {R}}^d))}\Bigr )\Vert v\Vert _{{\mathscr {V}}}\,, \end{aligned}$$

because it has been proved in [15] that \(\partial _u\omega ^{(2)}\in {\mathscr {L}}({{\mathscr {V}}},L^1({\mathbb {R}}^d))\). This shows that \(F'(u_0)\in {\mathscr {L}}({{\mathscr {V}}},L^1({\mathbb {R}}^d))\). \(\square \)

Since \(L^\infty ({\mathbb {R}}^d)\cap L^1({\mathbb {R}}^d)\subset L^2({\mathbb {R}}^d)\), Theorem 7.1 shows that \(F'(u_0)\) is a bounded operator from \({{\mathscr {V}}}\) to \(L^2({\mathbb {R}}^d)\). As \({{\mathscr {V}}}\) is a dense subspace of \(L^2({\mathbb {R}}^d)\), and because we have repeatedly identified the dual pairing \({{\mathscr {V}}}\times {{\mathscr {V}}}'\) with the dual pairing of \(L^2({\mathbb {R}}^d)\), this raises the question of whether \(F'(u_0)\) extends to a bounded operator in \(L^2({\mathbb {R}}^d)\), or, in terms of the relative entropy functional, whether \(\Phi ''(u_0)\) extends to a quadratic form on \(L^2({\mathbb {R}}^d)\). This will be answered in the affirmative in the remainder of this section.

Lemma 7.2

Under the assumptions of Theorem 5.3, the Jacobian of the Henderson map is given by

$$\begin{aligned} \bigl (F'(u_0)\,v\bigr )(x)&= -\beta \rho ^{(2)}(x)v(x) \end{aligned}$$
(7.4a)
$$\begin{aligned}&\quad -\, 2\beta \rho \,(\omega ^{(2)}*v)(x) \,-\, \beta \,(\omega ^{(2)}*\omega ^{(2)}*v)(x) \end{aligned}$$
(7.4b)
$$\begin{aligned}&\quad +\, \frac{1}{2\partial _\mu \rho }\,\partial _\mu \omega ^{(2)}(x) \int _{{\mathbb {R}}^d}\partial _\mu \omega ^{(2)}(x')\,v(x')\,\textrm{d}x' \end{aligned}$$
(7.4c)
$$\begin{aligned}&\quad -\, 2\beta \int _{{\mathbb {R}}^d}\omega ^{(3)}(x,0,x')\,v(x')\,\textrm{d}x' \end{aligned}$$
(7.4d)
$$\begin{aligned}&\quad -\, \frac{\beta }{2}\int _{{\mathbb {R}}^d} \int _{{\mathbb {R}}^d} \omega ^{(4)}(x,0,x'',x'+x'')\,\textrm{d}x''\,v(x')\,\textrm{d}x' \end{aligned}$$
(7.4e)

for \(v\in {{\mathscr {V}}}\) and \(x\in {\mathbb {R}}^d\).

Proof

According to (6.4), we have

$$\begin{aligned} F'(u_0)\,v \,=\, -2Av \,=\, \partial _u\rho ^{(2)}\,v \,+\, \frac{2}{\partial _\mu \rho }(\partial _u\rho \,v)\nabla _u\rho \,, \end{aligned}$$

where

$$\begin{aligned} \begin{aligned}&(\partial _u\rho ^{(2)}\,v)(x) \,=\, -\beta \rho ^{(2)}(x)v(x) \,-\, 2\beta \int _{{\mathbb {R}}^d} \rho ^{(3)}(x,0,x')v(x')\,\textrm{d}x'\\&\qquad \,-\, \frac{\beta }{2}\int _{{\mathbb {R}}^d} v(x') \int _{{\mathbb {R}}^d} \bigl (\rho ^{(4)}(x,0,x'',x''+x') \,-\, \rho ^{(2)}(x)\rho ^{(2)}(x')\bigr )\,\textrm{d}x''\,\textrm{d}x' \end{aligned} \end{aligned}$$
(7.5)

by virtue of [15, Eq. (6.9)]. Concerning the nested integral in the second line of (7.5), it has further been shown in [15, Proposition 4.3] that the inner integral converges absolutely and defines an \(L^\infty \) function of x and \(x'\). It follows that

$$\begin{aligned} \bigl (F'(u_0)v\bigr )(x) \,=\, -\beta \rho ^{(2)}(x)\,v(x) \,+\, \int _{{\mathbb {R}}^d} k(x,x')v(x')\,\textrm{d}x' \end{aligned}$$
(7.6)

with integral kernel

$$\begin{aligned} \begin{aligned} k(x,x')&\,=\, \frac{2}{\partial _\mu \rho } \nabla _u\rho (x)\nabla _u\rho (x') \,-\,2\beta \,\rho ^{(3)}(x,0,x')\\&\,-\, \frac{\beta }{2} \int _{{\mathbb {R}}^d} \bigl (\rho ^{(4)}(x,0,x'',x''+x') \,-\, \rho ^{(2)}(x)\rho ^{(2)}(x')\bigr )\,\textrm{d}x''\,. \end{aligned} \end{aligned}$$
(7.7)

We now rewrite the individual terms of this kernel function. To begin with, we recall from (5.20) and (7.2) that

$$\begin{aligned} \nabla _u\rho \,=\, -\frac{1}{2}\,\partial _\mu \rho ^{(2)} \,=\, -\frac{1}{2}\,\partial _\mu \omega ^{(2)} \,-\, \rho \,\partial _\mu \rho \,, \end{aligned}$$

so that

$$\begin{aligned} \frac{2}{\partial _\mu \rho }\nabla _u\rho (x)\nabla _u\rho (x')&\,=\, \frac{1}{2\partial _\mu \rho }\,\partial _\mu \omega ^{(2)}(x)\, \partial _\mu \omega ^{(2)}(x')\\&\qquad +\, \rho \,\partial _\mu \omega ^{(2)}(x) \,+\, \rho \,\partial _\mu \omega ^{(2)}(x') \,+\, 2\rho ^2\partial _\mu \rho \,. \end{aligned}$$

The partial derivatives with respect to \(\mu \) in the final three terms of the right-hand side can be eliminated with the help of (7.3) and Lemma A.2: This yields

$$\begin{aligned} \begin{aligned}&\frac{2}{\partial _\mu \rho }\nabla _u\rho (x)\nabla _u\rho (x') \,=\, \frac{1}{2\partial _\mu \rho }\,\partial _\mu \omega ^{(2)}(x)\, \partial _\mu \omega ^{(2)}(x')\\&\qquad \,+\, 2\beta \rho \,\omega ^{(2)}(x) \,+\, \beta \rho \int _{{\mathbb {R}}^d}\omega ^{(3)}(x,0,x'')\,\textrm{d}x'' \,+\, 2\beta \rho \,\omega ^{(2)}(x')\\&\qquad \,+\, \beta \rho \int _{{\mathbb {R}}^d}\omega ^{(3)}(x',0,x'')\,\textrm{d}x'' \,+\, 2\beta \rho ^3 \,+\, 2\beta \rho ^2\int _{{\mathbb {R}}^d} \omega ^{(2)}(x'')\,\textrm{d}x''\,. \end{aligned} \end{aligned}$$
(7.8)

Concerning the second term of the integral kernel (7.7), we apply definition (5.15) of the cluster functions to rewrite

$$\begin{aligned} \rho ^{(3)}(x,0,x')\,=\, \omega ^{(3)}(x,0,x') \,+\, \rho \,\omega ^{(2)}(x) \,+\, \rho \,\omega ^{(2)}(x-x') \,+\, \rho \,\omega ^{(2)}(x')+\rho ^3\,, \end{aligned}$$
(7.9)

where we have also used (5.16).

In the same way, we can rewrite the integrand of the integral in (7.7) in terms of the cluster functions:

$$\begin{aligned}&\rho ^{(4)}(x,0,x'',x''+x') \,-\, \rho ^{(2)}(x)\rho ^{(2)}(x') \\&\qquad \,=\, \omega ^{(4)}(x,0,x'',x''+x') \,+\, \omega ^{(3)}(x,0,x'')\, \rho \,+\, \omega ^{(3)}(x,0,x''+x')\, \rho \\&\qquad \quad \,+\,\omega ^{(3)}(x,x'',x''+x')\, \rho \,+\, \omega ^{(3)}(0,x'',x''+x')\,\rho \\&\qquad \quad \,+\, \omega ^{(2)}(x-x'')\,\omega ^{(2)}(x''+x') \,+\, \omega ^{(2)}(x-x'-x'')\,\omega ^{(2)}(x'') \\&\qquad \quad \,+\, \omega ^{(2)}(x''-x)\,\rho ^2 \,+\, \omega ^{(2)}(x''+x'-x)\,\rho ^2 \\&\qquad \quad \,+\, \omega ^{(2)}(x'')\,\rho ^2 \,+\, \omega ^{(2)}(x''+x')\,\rho ^2 \\&\qquad \,=\, \omega ^{(4)}(x,0,x'',x''+x') \,+\, \omega ^{(3)}(x,0,x'')\, \rho \,+\, \omega ^{(3)}(x,0,x''+x')\, \rho \\&\qquad \quad \,+\,\omega ^{(3)}(x',0,x-x'')\, \rho \,+\, \omega ^{(3)}(x',0,-x'')\,\rho \\&\qquad \quad \,+\, \omega ^{(2)}(x-x'')\,\omega ^{(2)}(x''+x') \,+\, \omega ^{(2)}(x-x'-x'')\,\omega ^{(2)}(x'') \\&\qquad \quad \,+\, \omega ^{(2)}(x''-x)\,\rho ^2 \,+\, \omega ^{(2)}(x''+x'-x)\,\rho ^2 \\&\qquad \quad \,+\, \omega ^{(2)}(x'')\,\rho ^2 \,+\, \omega ^{(2)}(x''+x')\,\rho ^2 , \end{aligned}$$

where we have used in the second step that the cluster functions are translation and permutation invariant. Integrating over \(x''\in {\mathbb {R}}^d\), we thus obtain

$$\begin{aligned} \begin{aligned}&\int _{{\mathbb {R}}^d} \bigl (\rho ^{(4)}(x,0,x'',x''+x') \,-\, \rho ^{(2)}(x)\rho ^{(2)}(x')\bigr )\,\textrm{d}x''\\&\quad = \int _{{\mathbb {R}}^d} \omega ^{(4)}(x,0,x'',x''+x') \,\textrm{d}x'' + 4\rho ^2\int _{{\mathbb {R}}^d}\omega ^{(2)}(x'')\,\textrm{d}x''\\&\qquad +\, 2\rho \int _{{\mathbb {R}}^d}\omega ^{(3)}(x,0,x'')\,\textrm{d}x'' \,+\, 2\rho \int _{{\mathbb {R}}^d}\omega ^{(3)}(x',0,x'')\,\textrm{d}x'' \\&\qquad +\, \bigl (\omega ^{(2)}*\omega ^{(2)}\bigr )(x+x') +\, \bigl (\omega ^{(2)}*\omega ^{(2)}\bigr )(x-x'), \end{aligned} \end{aligned}$$
(7.10)

where all the terms on the right-hand side are bounded functions of x and \(x'\): This follows from (6.7) and the fact that \(\omega ^{(2)}\in L^1({\mathbb {R}}^d)\cap L^\infty ({\mathbb {R}}^d)\).

Inserting (7.8), (7.9), and (7.10) into (7.7), we conclude that

$$\begin{aligned} k(x,x')&= \frac{1}{2\partial _\mu \rho }\,\partial _\mu \omega ^{(2)}(x)\, \partial _\mu \omega ^{(2)}(x') \,-\, 2\beta \,\omega ^{(3)}(x,0,x') \,-\, 2\beta \rho \,\omega ^{(2)}(x-x')\\&\quad -\,\frac{\beta }{2}\int _{{\mathbb {R}}^d}\omega ^{(4)}(x,0,x'',x''+x')\,\textrm{d}x''\\&\quad -\,\frac{\beta }{2}\,\bigl (\omega ^{(2)}*\omega ^{(2)}\bigr )(x+x') \,-\,\frac{\beta }{2}\,\bigl (\omega ^{(2)}*\omega ^{(2)}\bigr )(x-x')\,. \end{aligned}$$

Note that the final two terms of this integral kernel representation define the same convolution operator

$$\begin{aligned} v \, \mapsto \,-\frac{\beta }{2} \int _{{\mathbb {R}}^d}\bigl (\omega ^{(2)}* \omega ^{(2)}\bigr )(\,\cdot \, -x')\,v(x')\,\textrm{d}x'\,, \end{aligned}$$

because every \(v \in {{\mathscr {V}}}\) is an even function. The assertion (7.4) thus follows readily from (7.6). \(\square \)
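As a purely numerical aside, the multiplication, convolution, and rank-one contributions (7.4a)–(7.4c) are straightforward to apply to a discretized perturbation. The one-dimensional sketch below does so with FFT-based (circular) convolutions on a uniform grid and accepts optional matrices `K3` and `K4` as hypothetical discretizations of the integral operators (7.4d) and (7.4e); it is a schematic under these assumptions, not an implementation advocated in this paper.

```python
import numpy as np

def apply_jacobian(v, rho2, omega2, dmu_omega2, dmu_rho, rho, beta, dx,
                   K3=None, K4=None):
    """Apply a discretized version of F'(u0) from (7.4a)-(7.4e) to v.

    v, rho2, omega2, dmu_omega2 : perturbation, pair correlation, cluster
                                  function and its mu-derivative, all
                                  sampled on the same uniform 1d grid
    dmu_rho, rho, beta, dx      : scalars (d rho / d mu, density, inverse
                                  temperature, grid spacing)
    K3, K4                      : optional matrices discretizing the kernels
                                  in (7.4d) and (7.4e); omitted terms are
                                  simply skipped in this sketch
    """
    # circular convolution; adequate only if the grid is large compared to
    # the decay length of omega2 (an assumption of this sketch)
    conv = lambda f, g: np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(g))) * dx

    out = -beta * rho2 * v                                              # (7.4a)
    out += -2.0 * beta * rho * conv(omega2, v)                          # (7.4b), first term
    out += -beta * conv(conv(omega2, omega2), v)                        # (7.4b), second term
    out += (0.5 / dmu_rho) * dmu_omega2 * np.sum(dmu_omega2 * v) * dx   # (7.4c)
    if K3 is not None:
        out += -2.0 * beta * (K3 @ v) * dx                              # (7.4d)
    if K4 is not None:
        out += -0.5 * beta * (K4 @ v) * dx                              # (7.4e)
    return out
```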

Now, we can prove the main result of this section.

Theorem 7.3

Under the assumptions of Theorem 5.3, the operator \(F'(u_0)\) extends to a self-adjoint negative semidefinite operator on \(L^2({\mathbb {R}}^d)\).

Proof

We already know from Theorem 7.1 that \(F'(u_0)\in {\mathscr {L}}({{\mathscr {V}}},L^2({\mathbb {R}}^d))\). Since \({{\mathscr {V}}}\) is a dense subset of \(L^2({\mathbb {R}}^d)\), it remains to discuss the continuity from \(L^2({\mathbb {R}}^d)\) to \(L^2({\mathbb {R}}^d)\) of each of the terms defined in lines (7.4a)–(7.4e) of Lemma 7.2.

For the multiplication operator in (7.4a), this is true because the pair correlation function \(\rho ^{(2)}\) is bounded. The autoconvolution \(\omega ^{(2)}*\omega ^{(2)}\) of the cluster function \(\omega ^{(2)}\in L^1({\mathbb {R}}^d)\) belongs to \(L^1({\mathbb {R}}^d)\) again, hence the two convolution operators in (7.4b) are bounded operators in \(L^2({\mathbb {R}}^d)\). From (7.3), it follows that \(\partial _\mu \omega ^{(2)}\) belongs to \(L^1({\mathbb {R}}^d)\cap L^\infty ({\mathbb {R}}^d)\): the \(L^1\) property has been verified in the argument following (7.3); the \(L^\infty \) property of the second term in (7.3) has been established in (6.7). This shows that \(\partial _\mu \omega ^{(2)}\in L^2({\mathbb {R}}^d)\), and the continuity of the operator in (7.4c) is therefore a consequence of the Cauchy–Schwarz inequality. By virtue of (5.17), the cluster function \(\omega ^{(3)}(\,\cdot \,,0,\,\cdot \,)\) belongs to \(L^1({\mathbb {R}}^d\times {\mathbb {R}}^d)\). Since it is bounded, it also belongs to \(L^2({\mathbb {R}}^d\times {\mathbb {R}}^d)\). Therefore, (7.4d) defines a compact operator in \({\mathscr {L}}(L^2({\mathbb {R}}^d))\). Concerning (7.4e), we have already pointed out after (7.10) that the inner integral of \(\omega ^{(4)}\) is a bounded function of x and \(x'\). It also belongs to \(L^1({\mathbb {R}}^d\times {\mathbb {R}}^d)\) by virtue of (5.17), again. The continuity of the operator defined in (7.4e) therefore follows in the same way as in the previous case.

In summary, this shows that \(F'(u_0)\) extends to an operator in \({\mathscr {L}}(L^2({\mathbb {R}}^d))\). By virtue of (6.10), this extension is self-adjoint and negative semidefinite. \(\square \)

Concerning the Hessian of the relative entropy functional, we readily conclude from (6.10) the following corollary.

Corollary 7.4

Under the assumptions of Theorem 5.3 the Hessian \(\Phi ''(u_0)\) of the relative entropy functional (4.5) defines a symmetric and positive semidefinite bilinear form on \(L^2({\mathbb {R}}^d)\).

As indicated before, we leave it as an open problem whether \(F'(u_0)\) is injective, i.e., whether the local quadratic approximation (6.1) of the relative entropy functional is strictly convex.

8 Lennard-Jones-type pair potentials

In this final section, we establish a stronger regularity result for the derivative of the Henderson map and, correspondingly, for the Hessian of the relative entropy functional, which is valid when the potential \(u_0\in {{\mathscr {U}}}\) satisfies (2.2) with a majorant of the form

$$\begin{aligned} \psi _0(r) \,=\, C(1+r^2)^{-\alpha /2} \end{aligned}$$
(8.1)

for some \(C>0\) and \(\alpha >d\). This class of potentials includes the so-called potentials of Lennard-Jones type, cf. [38].

It is known that in the gas phase, the cluster function \(\omega ^{(2)}\) corresponding to such a pair potential exhibits the same rate of decay at infinity as \(\psi _0\). This entails the following result.

Theorem 8.1

Let \(u_0\) satisfy (2.2) for some \(\varphi \) as in (2.1) and \(\psi =\psi _0\) of (8.1). Let the space \({{\mathscr {V}}}\) of perturbations of \(u_0\) be defined as before, cf. (5.1), and let \(\rho _*\) belong to the gas phase of \(u_0\). Then, \(F'(u_0)\in {\mathscr {L}}({{\mathscr {V}}})\).

Proof

The aforementioned result about the cluster function states that \(\omega ^{(2)}\in {{\mathscr {V}}}\) under the given assumptions; compare Lemma A.4 in the appendix.

Now, we discuss the continuity of each of the operators corresponding to the individual terms in (7.4). Continuity obviously holds for the multiplication operator in (7.4a), because \(\rho ^{(2)}\) is bounded. Concerning the convolutions in (7.4b), we refer to [16, Proposition 4.1] for the fact that the convolution is a continuous bilinear mapping from \({{\mathscr {V}}}\times {{\mathscr {V}}}\) to \({{\mathscr {V}}}\). Accordingly, the two terms in (7.4b) also define operators in \({\mathscr {L}}({{\mathscr {V}}})\).

For the remaining terms, we make use of Lemma A.4 again. For \(m=3\), this auxiliary result gives

$$\begin{aligned} \int _{{\mathbb {R}}^d} \bigl |\omega ^{(3)}(\,\cdot \,,0,x')\bigr | \,\textrm{d}x' \ \in \ {{\mathscr {V}}}, \end{aligned}$$
(8.2)

and since

$$\begin{aligned} \left| \int _{{\mathbb {R}}^d} \omega ^{(3)}(x,0,x')\,v(x')\,\textrm{d}x'\right| \,\le \, \Vert v\Vert _{L^\infty ({\mathbb {R}}^d)} \int _{{\mathbb {R}}^d} \bigl |\omega ^{(3)}(x,0,x')\bigr | \,\textrm{d}x' \end{aligned}$$

for every \(x\in {\mathbb {R}}^d\), it follows that

$$\begin{aligned} \left\| \int _{{\mathbb {R}}^d}\omega ^{(3)}(\,\cdot \,,0,x')\,v(x')\,\textrm{d}x'\right\| _{{\mathscr {V}}}\,\le \, \Vert v\Vert _{L^\infty ({\mathbb {R}}^d)} \left\| \int _{{\mathbb {R}}^d} \bigl |\omega ^{(3)}(\,\cdot \,,0,x')\bigr | \,\textrm{d}x' \right\| _{{\mathscr {V}}}. \end{aligned}$$

Much the same argument applies to the term in (7.4e):

$$\begin{aligned}&\left| \int _{{\mathbb {R}}^d}\int _{{\mathbb {R}}^d} \omega ^{(4)}(x,0,x'',x'+x'')\,\textrm{d}x''\,v(x')\,\textrm{d}x'\right| \\&\qquad \,\le \, \Vert v\Vert _{L^\infty ({\mathbb {R}}^d)} \int _{{\mathbb {R}}^d}\int _{{\mathbb {R}}^d} \bigl |\omega ^{(4)}(x,0,x'',x')\bigr | \,\textrm{d}x'\,\textrm{d}x'' \end{aligned}$$

for every \(x\in {\mathbb {R}}^d\), and the right-hand side again belongs to \({{\mathscr {V}}}\) as a function of x by virtue of Lemma A.4 for \(m=4\). Since \({{\mathscr {V}}}\) is continuously embedded in \(L^\infty ({\mathbb {R}}^d)\), this shows that the two terms in (7.4d) and (7.4e) correspond to operators in \({\mathscr {L}}({{\mathscr {V}}})\).

Finally, concerning the term in (7.4c), we conclude from (7.3), (8.2), and the fact that \(\omega ^{(2)}\in {{\mathscr {V}}}\) that \(\partial _\mu \omega ^{(2)}\) belongs to \({{\mathscr {V}}}\) as well. Accordingly, this term also represents a bounded operator from \({{\mathscr {V}}}\) to \({{\mathscr {V}}}\), and the proof is done. \(\square \)

Note that if the target \({\mathbb {P}}_*\) and the model \({\mathbb {P}}_0\) are Gibbs measures corresponding to pair potentials \(u_*\) and \(u_0\) in \({{\mathscr {U}}}\) and their associated chemical potentials, where \(u_*\) and \(u_0\) satisfy (2.2) for the same majorant \(\psi =\psi _0\) given by (8.1), and if the density \(\rho _*\) belongs to the gas phase of both pair potentials, then the cluster functions \(\omega ^{(2)}_*\) and \(\omega ^{(2)}_0\) of \({\mathbb {P}}_*\) and \({\mathbb {P}}_0\), respectively, both belong to \({{\mathscr {V}}}\). It thus follows from Remark 5.4 that the gradient \(\nabla \Phi (u_0)\) of the relative entropy functional belongs to \({{\mathscr {V}}}\) as well, and therefore the Newton equation (6.11) of the IMC iterative scheme is an operator equation in \({{\mathscr {V}}}\) by virtue of Theorem 8.1. As shown in [16, Remark 6.5], a valid choice \(u_0\in {{\mathscr {U}}}\), which satisfies (2.2) for the same majorant (8.1), is given by the potential of mean force,

$$\begin{aligned} u_0 \,=\, -\frac{1}{\beta }\,\log \bigl (\rho ^{(2)}_*/\rho _*^2\bigr ) \,=\, -\frac{1}{\beta }\,\log \bigl (1+\omega ^{(2)}_*/\rho _*^2\bigr )\,, \end{aligned}$$

provided the density \(\rho _*\) is sufficiently small. Moreover, \(u_*-u_0\) belongs to \({{\mathscr {V}}}\) for this initial guess.
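As a practical note, this potential of mean force can be tabulated directly from a given target pair correlation function. The sketch below does so on a grid and clips \(\rho ^{(2)}_*/\rho _*^2\) away from zero before taking the logarithm; this cutoff is a purely numerical safeguard added here and is not part of the discussion in [16].

```python
import numpy as np

def potential_of_mean_force(rho2_star, rho_star, beta, floor=1e-12):
    """Initial guess u0 = -(1/beta) * log(rho2_star / rho_star**2).

    rho2_star : target pair correlation function on a grid (1d array)
    rho_star  : target density (scalar)
    floor     : lower cutoff avoiding log(0) where rho2_star vanishes,
                e.g. inside the core of the potential (an assumption of
                this sketch)
    """
    g = np.maximum(rho2_star / rho_star ** 2, floor)
    return -np.log(g) / beta
```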