1 The Optimization Problem

In many areas the optimal control of an interface evolution towards an anisotropic shape is desired. For example in chemistry or materials science one wishes to steer the solidification process of crystals [1,2,3,4]. For the time evolution of shapes phase-field models have shown great promise in many application areas and anisotropies can be incorporated (see, e.g., [5] and references therein). In this ansatz the interface is modeled with a diffuse interface layer, and an order parameter y—the so called phase-field— reflects the pure phases with the values \(\pm 1 \), e.g. the liquid phase for \(y \approx 1\) and the solid phase when \(y \approx -1\), and the diffuse interface with values between \(-1\) and 1. The gradient flow of the Ginzburg–Landau energy, having the form

$$\begin{aligned} {\mathcal {E}}(y) :=\int _{\Omega } A(\nabla y) + \psi (y) \, \mathrm dx \end{aligned}$$
(1)

then determines the time evolution of the shape, and with it the state equation for the control problem. Here the first term represents the surface energy, where \(A:{\mathbb {R}}^d \rightarrow {\mathbb {R}}\) is an (an-)isotropy function, and the potential \(\psi \) can be thought of as being symmetric with its global minima at \(\approx \pm 1 \). Let us mention that typically the energy involves a parameter \(\varepsilon >0\) related to the interfacial thickness, which we set to 1 in this paper without loss of generality. Considering in particular the \(L^2\)-gradient flow with a smooth potential, we obtain the Allen–Cahn equation. For a further introduction to phase-field models we refer to [6] and references therein. The following analysis and numerical ansatz are not only valid for the Allen–Cahn equation but apply in general to differential equations arising from gradient flows of energies of the form (1). For the gradient of A we use the notation \(A'(p)\) instead of \(\nabla A(p)\), as is common for anisotropic phase-field models.
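For orientation, consider the standard isotropic special case \(A(p)=\tfrac{1}{2}\vert p \vert ^2\) with the double well potential \(\psi (y)=\tfrac{1}{4}(y^2-1)^2\) (cf. Remark 2). Then \(A'(\nabla y)=\nabla y\) and \(\psi '(y)=y^3-y\), and the (uncontrolled) \(L^2\)-gradient flow of (1) is the classical Allen–Cahn equation

$$\begin{aligned} \partial _t y - \Delta y + y^3 - y = 0 . \end{aligned}$$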

The goal is now to determine the distributed control u driving the solution y of the gradient flow equation

$$\begin{aligned} \partial _t y - \nabla \cdot A'(\nabla y) + \psi '(y) = u \end{aligned}$$
(2)

from an initial configuration \(y_0\) at time \(t_0\)—say \(t_0=0\)—to a given target function \(y_\Omega \) at a given final time T (or a target function \(y_Q\) over the whole time horizon as considered in (43)). Hence the optimal control problem is described by the following setting:

Let \(\Omega \subset {\mathbb {R}}^d\) be a bounded Lipschitz domain and \(y_\Omega \in L^2(\Omega )\) be a given target function. Let a final time \(0<T<\infty \) be given and denote the space-time cylinder by \(Q :=[0,T]\times \Omega \) and its lateral boundary by \(\Sigma :=[0,T] \times \partial \Omega \). Our objective is to find for a given initial state \(y_0 \in H^1(\Omega )\) a solution to the optimal control problem

$$\begin{aligned} \min J(y,u) :=\dfrac{1}{2} \Vert y(T) - y_\Omega \Vert ^2_{L^2(\Omega )} + \dfrac{\lambda }{2} \Vert u\Vert ^2_{L^2(Q)}, \end{aligned}$$
(3)

subject to the quasilinear, possibly nonsmooth parabolic state equation

$$\begin{aligned} \begin{aligned} \int _Q \partial _t y \eta + A'(\nabla y)^T \nabla \eta + \psi '(y) \eta&= \int _Q u\eta \quad \forall \eta \in L^2(0, T; H^1(\Omega )),\\ y(0)&=y_0 \quad \text{ in } \Omega , \end{aligned} \end{aligned}$$
(4)

where \(u\in L^2(Q)\cong L^2(0,T;L^2(\Omega ))\) and \(y \in L^2(0,T;H^1(\Omega ))\cap H^1(0,T;H^1(\Omega )')\). Note that J is well defined due to the embedding \(L^2(0,T;H^1(\Omega ))\cap H^1(0,T;H^1(\Omega )') \hookrightarrow C([0,T]; L^2(\Omega ))\). The weak formulation implies the boundary condition \(A'(\nabla y)^T\nu =0\) on \(\Sigma \) where \(\nu \) is the outer normal.

In contrast to the well-studied isotropic Allen–Cahn equation, where \(A'=Id\), in the anisotropic case \(A:{\mathbb {R}}^d \rightarrow {\mathbb {R}}\) is an absolutely 2-homogeneous function. As a consequence, \(A'\) is in general not differentiable at 0. Let us mention that, roughly speaking, y is constant in the pure phases, and hence \(\nabla y \approx 0\) holds except on the interface. However, for solving an optimal control problem numerically one typically uses the differentiability of the control-to-state operator, which would require differentiability of \(A'\). This paper also serves as a preparation for [7], where we show the differentiability and first order conditions of an implicitly time discretized problem with a regularized A. Hence, to allow the use of possibly regularized anisotropies A, we relax the requirement of 2-homogeneity. Moreover, we account for various potentials \(\psi \).

Assumptions 1

Assume \(A\in C^1({\mathbb {R}}^d)\) with \(A'\) being strongly monotone, i.e.,

$$\begin{aligned} (A'(p)-A'(q),p-q) \ge C_A \vert p-q\vert ^2 \qquad \forall p,q\in {\mathbb {R}}^d, \end{aligned}$$

with \(C_A>0\) and such that it fulfills the growth condition \( \vert A'(p)\vert \le {\overline{C}}_A\vert p \vert \) with \({\overline{C}}_A>0\).

Furthermore let \(\psi \in C^1({\mathbb {R}})\) be bounded from below and such that it can be approximated by \(f_n \) satisfying

$$\begin{aligned} f_n \in C^2({\mathbb {R}}),\quad f_n \rightarrow \psi \quad \text {in } C^1_{\text {loc}},\quad -c \le f_n \le c (\psi +1) , \quad f_n'' \ge -C_\psi , \quad \vert f_n'' \vert \le C_n , \nonumber \\ \end{aligned}$$
(5)

with \(c, C_n,C_\psi \ge 0\) and \(\psi (y_0)\in L^1(\Omega )\) for the given initial data \(y_0\in H^1(\Omega )\).

In the following we use the convention \(\tfrac{1}{C_\psi }:=\infty \) for \(C_\psi =0\). Note that \(A'\) being strongly monotone on \({\mathbb {R}}^d\) is equivalent to A being strongly convex. Further, the assumptions on \(\psi \) in particular imply that

$$\begin{aligned} \left( \psi '(y_1) - \psi '(y_2), y_1-y_2\right) \ge -C_\psi \vert y_1-y_2\vert ^2 \qquad \forall y_1, y_2 \in {\mathbb {R}}. \end{aligned}$$
(6)

Some examples of A and \(\psi \) with respect to Allen–Cahn equations are mentioned in Remark 2.

In this paper we study the existence of an optimal control to (3)–(4) in arbitrary space dimension, the existence of a solution to the corresponding time discretized control problem, and the convergence of the time discrete optimal controls to a time continuous one. Here an implicit time discretization using piecewise constant functions is employed. Under some additional smoothness requirements on A and \(\psi \) the first order optimality condition is addressed in [7], where its derivation relies on results of this paper. Therein one can also find numerical results.

To the best of our knowledge there does not exist any mathematical treatment of the optimal control of anisotropic phase-field models so far. Optimal control of isotropic Allen–Cahn variational equations is studied, e.g., in [8,9,10,11,12], and of Cahn–Hilliard variational (in-)equalities in [13,14,15] and references therein. Let us mention results given in the context of anisotropic Allen–Cahn equations. One possible anisotropy was introduced in a pioneering paper by Kobayashi [16], and existence and uniqueness of a solution are studied in [17,18,19]. For quite general anisotropies the solution to Allen–Cahn equations with obstacle potential is analyzed in [20]. Among other tools, the authors use the 2-homogeneity of A, an approximation of the potential similar to (5) and an implicit time discretization (without showing convergence of the discretization). Explicit and semi-implicit approximations are discussed in the survey paper [5], where also many references are given. For convex Kobayashi anisotropies several time discretizations are considered in [21]. In [22, 23] particular suggestions for the anisotropies are given, and an efficient semi-implicit method using a particular linearization of \(A'\) and a convex/concave splitting is presented together with a proof of energy stability. Several numerical experiments comparing the anisotropies are also presented there.

Literature on the optimal control of quasilinear parabolic equations of the form (4) is still in its infancy. Most literature known to us treats quasilinearities with coefficients depending on x, t and on the function y, but not on its gradient [24,25,26,27,28]. For quasilinearities involving spatial derivatives of y see for example [29, 30]. In particular, let us mention that the latter reference contains the problem most similar to ours, as the authors allow a rather general quasilinearity with a particular polynomial growth condition. However, they require the nonlinearity \(\psi '\) to be monotone. All the literature listed here assumes the quasilinear term to be rather well behaved; in particular, none of its derivatives may be singular at the origin. In the present context such difficulties have, to our knowledge, only been considered for elliptic equations [31].

The outline of the paper is the following.

As a first step we study the state equations. To this end we introduce in Sect. 2 the time discretization. Then we discuss the existence and uniqueness of the solution of the discretized state equation as well as the Lipschitz-continuity of the control-to-state operator. Furthermore, for a set of bounded controls we obtain bounds on the states that are independent of the discretization level. Using these results we pass to the limit with respect to the time discretization and obtain corresponding results for the time continuous state equation (4). As a by-product we also obtain convergence of the discretization. In addition we show energy stability of the discretization.

Finally, in Sect. 3 the existence of optimal controls in the time continuous and time discrete case is shown. In addition the convergence of a subsequence of time discrete optimal controls to an optimal control of the original problem is obtained. These results hold not only for steering towards a state at the final time but also for steering towards a state over the whole time horizon.

2 Solution to the Time Discretized and to the Time Continuous State Equations

First we introduce the time discretization. Then a certain boundedness property as in [20] (see (12)) is shown, which is essential not only for the existence of the solution of the state equation but also for proving the existence of an optimal control and the convergence of the solution of the discretized problem to the time continuous solution. To obtain this result, the potential \(\psi \) is approximated (as, e.g., in [20] and [32]) by a sequence of functions \(f_n\) with bounded second derivatives, such that the dominated convergence theorem can be used. Following the lines of [20] we have no restriction on the space dimension d.

The existence for the time continuous problem will then be shown by taking the limit with respect to the time resolution, which also shows convergence of the discretization method. (In [20] first the limit in the time discretization and then in the approximation of \(\psi \) is taken.)

From now on, if no subscripts are provided, \((\cdot ,\cdot )\) and \(\Vert \cdot \Vert \) denote the \(L^2\)- or \(\ell _2\)-scalar product and norm, respectively. The space should be clear from the context. For a Banach space V we denote its dual by \(V'\) and the duality pairing by \(\langle {\cdot , \cdot }\rangle \).

Next we introduce a time discretization and show the existence of a solution of the discretized state equation. We divide the interval [0, T] into subintervals \(I_j :=(t_{j-1}, t_j]\) for \(j = 1,\ldots , N\) with \(0=t_0<t_1<\ldots <t_N=T\) and define \(\tau _j:=t_j-t_{j-1}\) and \(\tau :=\max _j {\tau }_j\). We discretize the state equation in time with a discontinuous Galerkin method (dG(0)). To this end, let us define

$$\begin{aligned} \begin{aligned}&Y_{\tau }:=\{ y_\tau :Q\rightarrow {\mathbb {R}} \mid y_\tau (t,.)\in H^1(\Omega ) {\;\forall t}, y_\tau (.,x) \text{ a.e. } \text{ constant } \text{ in } I_j \text{ for } j=1,\ldots ,N \} ,\\&U_{\tau }:=\{ u_\tau :Q\rightarrow {\mathbb {R}} \mid u_\tau (t,.)\in L^2(\Omega ) {\;\forall t}, u_\tau (.,x) \text{ a.e. } \text{ constant } \text{ in } I_j \text{ for } j=1,\ldots ,N \} , \end{aligned}\nonumber \\ \end{aligned}$$
(7)

and for each interval we label the constant by a subscript, e.g., \(y_j:=y_\tau \vert _{I_j}\). The vector containing these constants will be denoted by \((y_j)_{j = 1,\ldots , N} \in H^1(\Omega )^N\). The time discretized variant of (3)–(4) is then given by

$$\begin{aligned} \min _{Y_\tau \times U_\tau } {J}({y_\tau },{u_\tau }) = \dfrac{1}{2}\Vert y_N-y_\Omega \Vert ^2 + \dfrac{\lambda }{2} \sum _{j=1}^N \tau _j \Vert u_{j}\Vert ^2 \end{aligned}$$
(8)

subject to the time discretized state equation

$$\begin{aligned} (y_{j}, \varphi _{}) + {\tau _{j}} (A'(\nabla y_{j}), \nabla \varphi _{}) + {\tau _{j}}(\psi '(y_{j}), \varphi _{}) ={\tau _{j}}(u_{j}, \varphi _{}) + (y_{j-1},\varphi _{}) \quad \forall \varphi \in H^1(\Omega ) \nonumber \\ \end{aligned}$$
(9)

for \(j = 1,\ldots ,N\), where \(y_\tau (0,.):=y_0\in H^1(\Omega )\) is given.

We note that the state equation could equally well have arisen from an implicit Euler discretization, and we will use the notation \(\partial _t^{-\tau } y_\tau \) with

$$\begin{aligned} \partial _t^{-\tau } y_{\tau \vert I_j} :=\tfrac{1}{\tau _j} (y_j - y_{j-1}) \end{aligned}$$

in \(L^2(\Omega )\) for \(j=1,\ldots ,N\). One may favour a splitting approach for \(\psi \) or an approximation of the quasilinear term A as in [22, 23] instead of the fully implicit method. However, to our knowledge there exists no proof that these discretizations of the state equation converge to the time continuous one in the limit \(\tau \rightarrow 0\). Moreover, while for the implicit time discretization the differentiability of the control-to-state operator is obtained in [7] under additional smoothness properties of A and \(\psi \), it is not known whether this property holds for semi-implicit discretizations. The additional computational cost of the fully implicit discretization is nearly negligible when solving the optimal control problem.
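To make the fully implicit scheme (9) concrete, the following minimal sketch (our illustration, not the implementation used in [7]) performs one dG(0)/implicit Euler step for the isotropic special case \(A(p)=\tfrac{1}{2}\vert p \vert ^2\) with the double well potential \(\psi (y)=\tfrac{1}{4}(y^2-1)^2\), discretized in one space dimension by finite differences with homogeneous Neumann boundary conditions; the nonlinear equation per time step is solved by Newton's method. Grid size, step sizes and tolerances are illustrative assumptions.

```python
import numpy as np

def implicit_step(y_prev, u_j, tau, h, newton_tol=1e-10, max_newton=50):
    """One step of the fully implicit scheme (9) for the isotropic model
    A(p) = |p|^2/2, psi(y) = (y^2 - 1)^2/4 in 1D (finite differences,
    homogeneous Neumann boundary conditions): solve
        y - tau * y'' + tau * psi'(y) = tau * u_j + y_prev
    for y by Newton's method."""
    n = y_prev.size
    # finite-volume style Neumann Laplacian on a uniform grid of spacing h
    L = np.zeros((n, n))
    for i in range(n):
        L[i, i] = -2.0
        if i > 0:
            L[i, i - 1] = 1.0
        if i < n - 1:
            L[i, i + 1] = 1.0
    L[0, 0] = L[-1, -1] = -1.0
    L /= h ** 2

    y = y_prev.copy()
    for _ in range(max_newton):
        F = y - tau * (L @ y) + tau * (y ** 3 - y) - tau * u_j - y_prev
        J = np.eye(n) - tau * L + tau * np.diag(3.0 * y ** 2 - 1.0)
        dy = np.linalg.solve(J, -F)
        y += dy
        if np.linalg.norm(dy) < newton_tol:
            break
    return y

# usage: relax a random initial state with zero control
h, tau = 1e-2, 1e-3
x = np.arange(0.0, 1.0 + h / 2, h)
rng = np.random.default_rng(0)
y = 0.1 * rng.standard_normal(x.size)
for _ in range(100):
    y = implicit_step(y, np.zeros_like(x), tau, h)
```

In the anisotropic case the discrete Laplacian is replaced by a discretization of \(\nabla \cdot A'(\nabla y)\); the structure of the step, a strongly convex minimization per time interval (cf. (13) below), remains the same.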

The first step is given by the subsequent lemma.

Lemma 2

Let A fulfill the conditions in Assumptions 1. Furthermore let \(y_0 \in H^1(\Omega )\) and \(u_\tau \in U_\tau \). Let \(f\in C^2({\mathbb {R}})\) be a function such that \(\vert f'' \vert \in L^\infty ({\mathbb {R}})\) and \(f''\ge -C_\psi \) on \({\mathbb {R}}\) for some constant \(C_\psi \ge 0\). Then, for any \(0<\tau <\tfrac{1}{C_\psi }\), there exists a function \(y_\tau \in Y_\tau \) with \( y_\tau (0)=y_0 \) in \(\Omega \) such that for all \(j=1,\ldots , N\) and all \(\eta \in H^1(\Omega )\) it holds

$$\begin{aligned} \begin{gathered} \int _\Omega \partial ^{-\tau }_t {y_{\tau \vert I_j}} (\eta -y_j) + A(\nabla \eta )- A(\nabla y_j) + f'(y_j)(\eta -y_j) -u_j(\eta -y_j) \ge 0. \end{gathered} \nonumber \\ \end{aligned}$$
(10)

In addition, if \(\Lambda >0\) and \(u_\tau , y_0, A, f\) fulfill that

\(\Vert u_\tau \Vert _{L^2(0,T; L^2(\Omega ))}, \Vert y_0\Vert _{L^2(\Omega )} \le \Lambda \), \(-\Lambda +\Lambda ^{-1}\vert p\vert ^2 \le A(p) \text { and } A'(p)^Tp \le \Lambda \vert p \vert ^2\) as well as

$$\begin{aligned} \int _{\Omega }(A(\nabla y_0) + f(y_0)) \le \Lambda \quad \text { and } \quad f\ge -\Lambda , \quad f''\ge -\Lambda , \end{aligned}$$
(11)

then there exist constants \(\tau _\Lambda >0\) and \(C(\Lambda )>0\), depending on \(\Lambda \), such that for all \(0<\tau <\tau _\Lambda \) the solutions \(y_\tau \) of (10) satisfy

$$\begin{aligned} \begin{gathered} \Vert \partial _t^{-\tau }y_\tau \Vert _{L^2(0,T;L^2(\Omega ))}+ \Vert y_\tau \Vert _{L^\infty (0, T; H^1(\Omega ))} + \Vert f'(y_\tau )\Vert _{L^2(0, T; L^2(\Omega ))} \le C(\Lambda ). \end{gathered} \nonumber \\ \end{aligned}$$
(12)

Note that under Assumptions 1 one can always find, for given \(u_\tau ,y_0,A\) and f, a constant \(\Lambda \) such that the above required estimates hold. In particular, for A the growth condition induces \(A'(p)^Tp \le {\overline{C}}_A \vert p \vert ^2\) as well as \(A'(0)=0\), and then the strong monotonicity provides \(A(0)+\tfrac{1}{2}C_A\vert p\vert ^2 \le A(p)\).
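These implications can be verified directly; for instance, the last estimate follows by integrating \(A'\) along the segment from 0 to p and using the strong monotonicity together with \(A'(0)=0\):

$$\begin{aligned} A(p)-A(0) = \int _0^1 A'(tp)^T p \, \mathrm dt = \int _0^1 \tfrac{1}{t} \big (A'(tp)-A'(0)\big )^T (tp-0) \, \mathrm dt \ge \int _0^1 C_A t \vert p \vert ^2 \, \mathrm dt = \tfrac{1}{2} C_A \vert p \vert ^2 . \end{aligned}$$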

Proof

We note that f and \(f'\) induce continuous Nemytskii operators \(f:L^2(\Omega ) \rightarrow L^1(\Omega )\) and \( f':L^2(\Omega ) \rightarrow L^2(\Omega )\) due to the bounds on \(f''\).

Starting with \(y_{0}:=y(0)\), define \(y_j\in H^1(\Omega )\) successively for \(j\ge 1\) to be the unique minimizer of

$$\begin{aligned} \begin{gathered} \Phi _{j,\tau }(\eta ) :=\int _{\Omega } \left( \tfrac{1}{2\tau _j} \vert \eta - y_{j-1} \vert ^2 + A(\nabla \eta ) + f(\eta ) - u_j\eta \right) \end{gathered} \end{aligned}$$
(13)

where the integrands are strongly convex since \(\tfrac{1}{\tau }+f''(s) \ge \tfrac{1}{\tau }-C_\psi >0\). For \(\eta \in H^1(\Omega )\), \(\delta > 0\) we obtain with \(\eta _\delta :=y_j + \delta (\eta - y_j)\), using the convexity of A as well as \(\Phi _{j,\tau }(y_j) \le \Phi _{j,\tau }(\eta _\delta )\),

$$\begin{aligned} \begin{aligned}&\int _\Omega (A(\nabla \eta ) - A(\nabla y_j) ) \ge \int _\Omega \dfrac{1}{\delta } (A(\nabla \eta _\delta ) - A(\nabla y_j))\\&\quad ~~ \ge - \int _\Omega (\tfrac{1}{2\tau _j\delta } (\vert \eta _\delta - y_{j-1}\vert ^2 - \vert y_j-y_{j-1}\vert ^2) + \tfrac{1}{\delta }(f(\eta _\delta ) - f(y_j)) - u_j(\eta -y_j) ) \\&\quad ~~ {\mathop {\rightarrow }\limits ^{\delta \rightarrow 0}} -\int _\Omega \left( \dfrac{y_{j} - y_{j-1}}{\tau _j}\right) (\eta - y_j) + f'(y_j)(\eta - y_j) - u_j(\eta -y_j). \end{aligned} \nonumber \\ \end{aligned}$$
(14)

The last term is obtained using the mean value theorem and applying dominated convergence, given \(\vert f'(x)\vert \le C(1+\vert x\vert )\) (see also Remark 1). Altogether we have shown (10). Now we want to deduce the estimate (12). Summing \(\Phi _{l,\tau }(y_l) \le \Phi _{l,\tau }(y_{l-1})\) over \(l=1,\ldots ,j\) yields

$$\begin{aligned}&\int _{0}^{t_j} \int _{\Omega } \tfrac{1}{2} \vert \partial _t^{-\tau } y_\tau \vert ^2 + \int _{\Omega } (A(\nabla y_j) + f(y_j)) \nonumber \\&~~\le \int _{\Omega } (A(\nabla y_0) + f(y_0)) + \int _{0}^{t_j} \int _{\Omega } u_\tau \partial _t^{-\tau } y_\tau \le C(\Lambda ) + \tfrac{1}{4} \int _{0}^{t_j} \int _{\Omega } \vert \partial _t^{-\tau } y_\tau \vert ^2. \end{aligned}$$
(15)

Using the assumptions \(-\Lambda +\Lambda ^{-1}\vert p\vert ^2 \le A(p)\), \(- f\le \Lambda \), \(\Vert y_0\Vert _{L^2(\Omega )}\le \Lambda \) as well as \(y_j= y_0 + \int _{0}^{t_j} \partial _t^{-\tau }y_\tau \), we obtain

$$\begin{aligned} \begin{gathered} \Vert \partial _t^{-\tau }y_\tau \Vert _{L^2(0,T;L^2(\Omega ))}+ \Vert \nabla y_\tau \Vert _{L^\infty (0,T; L^2(\Omega ))} + \Vert y_\tau \Vert _{L^\infty (0, T; L^2(\Omega ))} \le C(\Lambda ). \end{gathered} \end{aligned}$$
(16)

Then, choosing \(\eta :=y_{j} - \delta f'(y_{j}), \delta >0\) in (14), we obtain

$$\begin{aligned} \int _{\Omega } f'(y_{j})^2 \le \int _{\Omega } - \tfrac{y_{j} - y_{j-1}}{\tau _j}f'(y_{j}) + u_j f'(y_{j}) - \dfrac{1}{\delta } (A(\nabla y_{j}) - A(\underbrace{\nabla (y_{j}-\delta f'(y_{j}))}_{= {\nabla y_j-\delta f''(y_j)\nabla y_j}})). \nonumber \\ \end{aligned}$$
(17)

To the third integral we applied the mean value theorem pointwise almost everywhere in \(\Omega \), with the intermediate point between 1 and \(1-\delta f''(y)\) denoted by \(\xi _\delta (y)\).

Note that due to the boundedness of \(f''\) the function \(\xi _\delta (\cdot )\) is also bounded and \(\xi _\delta \rightarrow 1\) pointwise as \(\delta \rightarrow 0\). Now we can use \(0 \le A'(p) ^T p \le \Lambda \vert p\vert ^2\) and \(-f''\le \Lambda \) as well as dominated convergence to obtain

$$\begin{aligned} \int _{\Omega } f'(y_{j})^2\le & {} \int _{\Omega } - \tfrac{y_{j} - y_{j-1}}{\tau _j}f'(y_{j}) + u_j f'(y_{j}) + C(\Lambda ) \vert \nabla y_{j}\vert ^2\\\le & {} \int _{\Omega } \left( \tfrac{y_{j} - y_{j-1}}{\tau _j}\right) ^2 + \tfrac{1}{4}f'(y_{j})^2 + \tfrac{1}{2} u_j^2 + \tfrac{1}{2}f'(y_{j})^2 + C(\Lambda ) \vert \nabla y_{j}\vert ^2, \end{aligned}$$

and hence with (16)

$$\begin{aligned} \left( \sum _{j=1}^N \tau _j \Vert f'(y_{j} )\Vert ^2 \right) ^{1/2}\le C(\Lambda ). \end{aligned}$$
(18)

\(\square \)

Remark 1

As was mentioned, the strong monotonicity of \(A'\) and the growth condition \( \vert A'(p)\vert \le {\overline{C}}_A\vert p\vert \) induce \(A'(0)=0\), \(A'(p)^Tp \le {\overline{C}}_A \vert p\vert ^2\) and \(A(0)+\tfrac{1}{2}C_A\vert p\vert ^2 \le A(p) \). Furthermore \(A(p) \le c(1+\vert p\vert ^2)\) also holds with some \(c>0\). Hence for all \(\eta ,\xi \in L^2(0,T;H^1(\Omega ))\) we have \(A(\nabla \eta )\in L^1(Q)\) and (using Young’s inequality) \(A'(\nabla \eta )^T\nabla \xi \in L^1(Q)\). It also induces the pointwise estimate \( \vert A'(\nabla y + s \delta \nabla \xi )^T\nabla \xi \vert \le C(\vert \nabla y\vert ^2 + \vert \nabla \xi \vert ^2) \) for \(0\le s \delta \le 1\), providing an integrable majorant, which allows us to take the limit \(\delta \searrow 0\) in the integral below. Hence we obtain

$$\begin{aligned} \begin{aligned} \lim _{\delta \searrow 0}\dfrac{1}{\delta } \int _Q (A(\nabla y + \delta \nabla \xi ) - A(\nabla y) )&= \lim _{\delta \searrow 0}\int _Q \int _{0}^{1} A'(\nabla y+s \delta \nabla \xi )^T \nabla \xi \, \mathrm ds \\&= \int _Q A'(\nabla y)^T \nabla \xi . \end{aligned} \end{aligned}$$
(19)

The same holds analogously when integrating over \(\Omega \) instead of Q. Together with the monotonicity of \(A'\) this enables the usual steps of the proof that solving the variational inequality (10) is equivalent to solving the variational equality (9) with f instead of \(\psi \).

With Lemma 2 at hand we can show the existence of a unique weak solution to the time discretized state equation (9). Note that the following bound (20) (and likewise (33) in the time continuous case) will be crucial for showing the existence of an optimal control later.

Theorem 3

Let Assumptions 1 be fulfilled. If \(\tau =\max _j \tau _j < 1/C_\psi \) then for every \(u_\tau \in U_\tau \) the time discretized state equation (9) has a unique solution \(y_\tau \in Y_\tau \).

The solution operator is denoted by \(S_\tau :U_\tau \rightarrow Y_\tau \).

Furthermore, if \(\Vert u_\tau \Vert _{L^2(0,T;L^2(\Omega ))}\le {\bar{c}}\) for all \(0<\tau <1/C_\psi \), then there exists a constant \(C_{A,\psi ,y_0}(\bar{c})\), independent of \(\tau \), such that

$$\begin{aligned} \begin{gathered} \Vert \partial _t^{-\tau }y_\tau \Vert _{L^2(0,T;L^2(\Omega ))}+ \Vert y_\tau \Vert _{L^\infty (0, T; H^1(\Omega ))} + \Vert \psi '(y_\tau )\Vert _{L^2(0, T; L^2(\Omega ))} \le C_{A,\psi ,y_0}({\bar{c}}). \end{gathered} \nonumber \\ \end{aligned}$$
(20)

Proof

We consider the approximations of \(\psi \) by \(f_n\) according to Assumptions 1. Then, due to \(\psi (y_0)\in L^1(\Omega )\) for given \(y_0\in H^1(\Omega )\), \(-c\le f_n\le c(\psi +1)\) and \( -C_\psi \le f_n''\), one can find \(\Lambda \), depending only on \({A,\psi ,y_0}\), large enough such that

$$\begin{aligned} \begin{gathered} \Vert y_0\Vert _{L^2(\Omega )}, \, \int _{\Omega }(A(\nabla y_0) +{f_n}(y_0)), \, -\inf _{t\in {\mathbb {R}}} f_n(t), \, -\inf _{t\in {\mathbb {R}}} f_n''(t) \le \Lambda ,\\ -\Lambda +\Lambda ^{-1}\vert p\vert ^2 \le A(p) \text { and } A'(p)^Tp \le \Lambda \vert p\vert ^2. \end{gathered} \end{aligned}$$
(21)

We denote by \(y _{j,n}\) the solutions of (10) with \(f=f_n\), which exist according to Lemma 2; we remark that they exist for \( \tau <\tfrac{1}{C_\psi }\), where the integrands of (13) are strongly convex due to \(\tfrac{1}{\tau }+f_n''(s) \ge \tfrac{1}{\tau }-C_\psi >0\). Lemma 2 also provides the estimates (12), i.e., for all \(\tau \) and n it holds

$$\begin{aligned} \begin{gathered} \Vert \partial _t^{-\tau }y_{\tau ,n}\Vert _{L^2(0,T;L^2(\Omega ))}+ \Vert y_{\tau ,n}\Vert _{L^\infty (0, T; H^1(\Omega ))} + \Vert f_n'(y_{\tau ,n})\Vert _{L^2(0, T; L^2(\Omega ))} \le C(\Lambda ,\bar{c}). \end{gathered}\nonumber \\ \end{aligned}$$
(22)

Then [33, Lemma 1.3], together with \(f_n \rightarrow \psi \) in \(C^1_{\text {loc}}\) for \(n \rightarrow \infty \) and the weak lower semicontinuity of the functional \( w\in L^2(\Omega ;{\mathbb {R}}^d)\mapsto \int _\Omega A(w)\, \mathrm dx \in {\mathbb {R}}\), allows us to take, for a subsequence of \(y_{j,n} \), the limit inferior as \(n\rightarrow \infty \) of all terms in (22) and the limit superior as \(n\rightarrow \infty \) of those in (10). This yields (20) and, furthermore, for all \(\eta \in H^1(\Omega )\) it holds

$$\begin{aligned} \int _\Omega (A(\nabla \eta ) - A(\nabla y_{j}) ) \ge \int _\Omega - \left( \dfrac{y_{j} - y_{j-1}}{\tau _j}\right) (\eta - y_j) - \psi '(y_j)(\eta - y_j) + u_j(\eta -y_j) . \nonumber \\ \end{aligned}$$
(23)

Finally we can pass to the equality (9) by the reasoning from Remark 1.

The uniqueness of the solution of (9) can be shown separately for each time step, one after another. For this purpose assume the existence of two solutions. Subtracting their defining equations, testing with their difference and using the strong monotonicity of \(A'\) and of \(s\mapsto s+\tau _j\psi '(s) \) (due to \(\tau < 1/C_\psi \)) shows that the \(H^1\)-norm of their difference vanishes. \(\square \)

With a further (minor) restriction on the maximal time step \(\tau \) we obtain Lipschitz-continuity of the solution operator for (9) with a constant independent of \(\tau \).

Theorem 4

Let Assumptions 1 and \(\tau \le \tfrac{1}{1+2C_\psi }\) hold. Then the mapping \({\tilde{S}}_\tau :(y_0, u_\tau ) \mapsto y_\tau \), where \( y_\tau \) is the solution of (9), is Lipschitz-continuous in the sense that

$$\begin{aligned} \begin{aligned} \Vert y^{(1)}_\tau -y^{(2)}_\tau&\Vert _{L^\infty (0,T;L^2(\Omega ))} + \Vert \nabla y^{(1)}_\tau -\nabla y^{(2)}_\tau \Vert _{L^2(0,T;L^2(\Omega ))} \\&\le {C}_{A,\psi ,T}\left( \Vert y^{(1)}_0 - y^{(2)}_0\Vert _{L^2(\Omega )} + \Vert u^{(1)}_\tau - u^{(2)}_\tau \Vert _{L^2(0,T;H^1(\Omega )')}\right) , \end{aligned} \end{aligned}$$
(24)

where \(y_{\tau }^{(i)}= {\tilde{S}}_\tau (y_0^{(i)}, u_\tau ^{(i)}) \) for \(i=1,2\).

Proof

We denote differences by a prefixed \(\delta \), e.g., \(\delta y_\tau :=y_{\tau }^{(1)} - y_{\tau }^{(2)}\). With \(\tfrac{1}{2}(a^2-b^2) \le (a-b)a\) in mind, testing the defining equalities (9) with \(\delta y_j\) and using that \(A'\) is strongly monotone as well as (6), we obtain

$$\begin{aligned} \tfrac{1}{2}&\left( \Vert \delta y_j\Vert ^2 - \Vert \delta y_{j-1}\Vert ^2\right) + \tau _j C_A \Vert \nabla \delta y_j\Vert ^2\\&\le \left( \delta y_j - \delta y_{j-1}, \delta y_j\right) + \tau _j\left( A'(\nabla y_{j}^{(1)})-A'(\nabla y_j^{(2)}), \nabla \delta y_j \right) \\&= \tau _j \left( \delta u_j, \delta y_j\right) - \tau _j \left( \psi '(y_{j}^{(1)}) -\psi '(y_j^{(2)}), \delta y_j\right) \\&\le \tfrac{\tau _j}{2\epsilon } \Vert \delta u_j\Vert _{{H^1}'}^2 + \tfrac{\tau _j\epsilon }{2}\Vert \delta y_j\Vert _{H^1}^2 + \tau _jC_\psi \Vert \delta y_j\Vert ^2. \end{aligned}$$

In the last step we used scaled Young’s inequality with \(0<\epsilon <\min (1, 2C_A)\). We now sum over \(j = 1,\ldots ,J\) and get

$$\begin{aligned} \tfrac{1}{2}\Vert \delta y_J\Vert ^2 + \tilde{C}_A \sum _{j=1}^{J} {\tau _j} \Vert \nabla \delta y_j\Vert ^2 \le \tfrac{1}{2} \left( \Vert \delta y_0\Vert ^2 + \sum _{j=1}^{J} \tfrac{\tau _j}{\epsilon } \Vert \delta u_j\Vert _{{H^1}'}^2\right) + \tfrac{1}{2}\tilde{C}_\psi \sum _{j=1}^{J} \tau _j \Vert \delta y_j\Vert ^2 \end{aligned}$$
(25)

for all \(1 \le J \le {N}\). Here we defined \(\tilde{C}_A:=C_A-\tfrac{\epsilon }{2}\) and \(\tilde{C}_\psi :={\epsilon }+2C_\psi \). Omitting the gradient term on the left and absorbing the J-th term from the right, we obtain

$$\begin{aligned} \Vert \delta y_J\Vert ^2&\le \dfrac{1}{ (1-\tilde{C}_\psi \tau _J)} \left( \Vert \delta y_0\Vert ^2 + \sum _{j=1}^{J} \tfrac{\tau _j}{\epsilon } \Vert \delta u_j\Vert _{{H^1}'}^2\right) + \dfrac{\tilde{C}_\psi }{1-\tilde{C}_\psi \tau _J}\sum _{j=1}^{J-1} \tau _j \Vert \delta y_j\Vert ^2\\&\le C_{\psi ,\tau } \left( \Vert \delta y_0\Vert ^2 + \sum _{j=1}^{{N}} \tfrac{\tau _j}{\epsilon } \Vert \delta u_j\Vert _{{H^1}'}^2\right) + C_{\psi ,\tau }{\tilde{C}_\psi }\sum _{j=1}^{J-1} \tau _j \Vert \delta y_j\Vert ^2, \end{aligned}$$

where we had to suppose smallness of \(\tau \) to ensure \(C_{\psi ,\tau }:= \tfrac{1}{1-\tilde{C}_\psi \tau } >0\). Now we apply the discrete Gronwall Lemma (see, e.g. [34, Lemma A.3]), which yields

$$\begin{aligned} \begin{aligned} \Vert \delta y_J\Vert ^2&\le \left( \Vert \delta y_0\Vert ^2 + \sum _{j=1}^{{N}} \tfrac{\tau _j}{\epsilon } \Vert \delta u_j\Vert _{{H^1}'}^2\right) C_{\psi ,\tau } \exp \left( C_{\psi ,\tau }{\tilde{C}_\psi } \sum _{j=1}^{J-1} \tau _j\right) \\&\le \left( \Vert \delta y_0\Vert ^2 + \sum _{j=1}^{{N}} \tfrac{\tau _j}{\epsilon } \Vert \delta u_j\Vert _{{H^1}'}^2\right) C_{\psi ,\tau } \exp \left( C_{\psi ,\tau }{\tilde{C}_\psi T}\right) . \end{aligned} \end{aligned}$$
(26)

Inserting this into (25) we finally get for all \(J=1,\ldots ,{N}\)

$$\begin{aligned} \tilde{C}_A \sum _{j=1}^{J} {\tau _j} \Vert \nabla \delta y_j\Vert ^2 \le \tfrac{1}{2} \left( \Vert \delta y_0\Vert ^2 + \sum _{j=1}^{{N}} \tfrac{\tau _j}{\epsilon } \Vert \delta u_j\Vert _{{H^1}'}^2\right) \left( 1 + C_{\psi ,\tau }{\tilde{C}_\psi T} \exp \left( C_{\psi ,\tau }{\tilde{C}_\psi T}\right) \right) , \nonumber \\ \end{aligned}$$
(27)

which together with (26) and the boundedness of \(C_{\psi ,\tau }\) independently of \(\tau \) yields the inequality (24). \(\square \)
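For the reader's convenience, a standard version of the discrete Gronwall lemma sufficient for the step to (26) reads as follows (stated here for orientation; the precise formulation of [34, Lemma A.3] may differ slightly): if \(a_J, b, c_j \ge 0\) satisfy

$$\begin{aligned} a_J \le b + \sum _{j=1}^{J-1} c_j a_j \quad \text {for all } J\ge 1, \qquad \text {then} \qquad a_J \le b \prod _{j=1}^{J-1} (1+c_j) \le b \exp \Big ( \sum _{j=1}^{J-1} c_j \Big ). \end{aligned}$$

Here it is applied with \(a_j = \Vert \delta y_j\Vert ^2\), \(c_j = C_{\psi ,\tau }\tilde{C}_\psi \tau _j\) and \(b= C_{\psi ,\tau } \big ( \Vert \delta y_0\Vert ^2 + \sum _{j=1}^{N} \tfrac{\tau _j}{\epsilon } \Vert \delta u_j\Vert _{{H^1}'}^2 \big )\).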

A result similar to Theorem 3 could also be obtained by using results on monotone operators, see, e.g., [33]. Together with an argument going back to Stampacchia one would obtain the regularity \(y_j\in L^\infty (\Omega ) \cap H^1(\Omega )\) at each step of our time discretization [32, 35]. These results are applicable if \(\tau \) is sufficiently small such that the term \(y_j + \tau _j \psi '(y_j)\) becomes monotone. However, this regularity comes with a restriction on the space dimension d.

Our approach also allows taking the limit \(\tau \rightarrow 0\), providing the convergence of the time discrete solutions \(y_\tau \) and with it the existence of a solution in the time continuous case. The following lemma serves as a preparation and gives a result similar to [36].

Lemma 5

Let \(\{y_\tau \}_{\tau }\) with \(y_\tau \in Y_\tau \) and \(\tau \rightarrow 0\) be a sequence satisfying

$$\begin{aligned} \Vert \partial _t^{-\tau }y_\tau \Vert _{L^2(0,T;L^2(\Omega ))}+ \Vert y_\tau \Vert _{L^\infty (0, T; H^1(\Omega ))} \le C, \end{aligned}$$
(28)

where \(C>0\) is independent of \(\tau \). Then there exists a subsequence (again denoted by \(\{y_\tau \}_{\tau }\)) and a function \(z\in C([0,T]; L^2(\Omega ))\), such that

$$\begin{aligned} \max _{t\in [0,T]} \Vert y_\tau (t) -z(t)\Vert _{L^2(\Omega )} \rightarrow 0 \qquad \text {as} \qquad \tau \rightarrow 0. \end{aligned}$$
(29)

Proof

Using the definitions from (7), for given \(y_\tau \) we define its linear interpolant \(z_\tau \), i.e., \( z_\tau (t)_{\vert I_{j}} = y_{j-1} + (t-t_{j-1})\partial _t^{-\tau } y_\tau (t_j). \) Note that from (28) we have \(\Vert z_\tau \Vert _{H^1(0,T;L^2(\Omega ))\cap L^\infty (0,T;H^1(\Omega ))}\le C\) for some constant C, independent of \(\tau \). By the compact imbedding \(L^\infty (0,T;H^1(\Omega ))\cap H^1(0,T;L^2(\Omega )) \hookrightarrow C([0,T];L^2(\Omega ))\) (see the Aubin–Lions–Simon compactness theorem, e.g., in [37]) we deduce the existence of a function z such that (possibly for a subsequence) \(z_\tau \rightarrow z\) in \(C([0,T]; L^2(\Omega ))\).

In addition, for \(t = \beta t_j + (1-\beta ) t_{j-1}\) with \(\beta \in (0,1]\) we find

$$\begin{aligned} \begin{aligned} \Vert y_\tau (t) - z_\tau (t)\Vert _{L^2(\Omega )}^2&=(1-\beta )^2(t_j-t_{j-1})^2\Vert \partial _t^{-\tau } y_\tau (t_j) \Vert _{L^2(\Omega )}^2\\&\le (1-\beta )^2 \tau \Vert \partial _t^{-\tau } y_\tau \Vert _{L^2(0,T;L^2(\Omega ))}^2 \le C^2 \tau \end{aligned} \end{aligned}$$
(30)

independent of t. Consequently, by the triangle inequality, it holds \(\max _{t\in [0,T]} \Vert y_\tau (t) -z(t)\Vert _{L^2(\Omega )} \rightarrow 0 \) as \(\tau \rightarrow 0\).

\(\square \)

We are now prepared to deduce the convergence of the numerical solutions to the time continuous solution.

Theorem 6

Let Assumptions 1 hold. Then for every \(u \in L^2(0,T;L^2(\Omega ))\) there exists a unique weak solution \(y\in L^\infty (0,T;H^1(\Omega )) \cap H^1(0,T;L^2(\Omega ))\) to (4), i.e.,

$$\begin{aligned} \int _Q \partial _t y \eta + A'(\nabla y)^T \nabla \eta + \psi '(y) \eta = \int _Q u\eta \qquad \forall \eta \in L^2(0, T; H^1(\Omega )), \end{aligned}$$

subject to \(y(0)=y_0\in H^1(\Omega )\) a.e. in \(\Omega \), and it also holds \(\psi '(y)\in L^2(Q)\).

Moreover, if \(\{u_\tau \}\) is a sequence with \(u_\tau \rightharpoonup u\) in \(L^2(0,T;L^2(\Omega ))\) as \(\tau \rightarrow 0\), then for the corresponding sequence of time discrete solutions \(\{y_\tau \}\) and \(y:=S(u)\) the following convergences hold:

$$\begin{aligned} \begin{aligned} y_\tau \rightharpoonup y \text { in } L^2(0,T;H^1(\Omega )),\\y_\tau {\mathop {\rightharpoonup }\limits ^{*}} y \text { in } L^\infty (0,T;H^1(\Omega )),\\ y_\tau \rightarrow y \text { in } C([0,T];L^2(\Omega )),\\ \partial ^{-\tau }_t y_\tau \rightharpoonup \partial _t y \text { in } L^2(0,T;L^2(\Omega )),\\ \psi '(y_\tau ) \rightharpoonup \psi '(y) \text { in } L^2(0,T;L^2(\Omega )). \end{aligned} \end{aligned}$$
(31)

Proof

Given \(u\in L^2(0,T;L^2(\Omega ))\) we choose a sequence of discretizations \(u_\tau \in U_\tau \) with \(u_\tau \rightharpoonup u\) in \(L^2(0,T;L^2(\Omega ))\) for \(\tau \rightarrow 0\). This allows for the choice of a constant \({\bar{c}} >0\) satisfying \(\Vert u_\tau \Vert \le {\bar{c}}\). Let \(y_\tau \) be the solution of (9) corresponding to \(u_\tau \). Then the estimates (20) hold. Hence, for \(\tau \rightarrow 0\) there exists a (sub-)sequence satisfying the first and last two convergences in (31). The strong convergence of \(\{y_\tau \}\) in \(C([0,T];L^2(\Omega ))\) is obtained from Lemma 5. Note that for later convenience we already denote the limit by y; the latter two limits in (31) were identified using the pointwise almost everywhere convergence of \(y_\tau \) (following from the strong convergence), the continuity of \(\psi '\) and an application of [33, Lemma 1.3]. Altogether we can take the limit in the variational inequality (23) to obtain that y satisfies

$$\begin{aligned} \int _Q \partial _t y(\eta -y)&+ A(\nabla \eta )- A(\nabla y) + \psi '(y)(\eta -y) -u(\eta -y) \nonumber \\&\ge 0 \quad \forall \eta \in L^2(0,T;H^1(\Omega )). \end{aligned}$$
(32)

Remark 1 yields that y solves also the variational equality (4). Furthermore, using weak (\(*\)) lower-semicontinuity the solution y satisfies

$$\begin{aligned} \Vert \partial _t y\Vert _{L^2(0,T;L^2(\Omega ))} + \Vert y\Vert _{L^\infty (0,T;H^1(\Omega ))} + \Vert \psi '(y)\Vert _{L^2(0,T;L^2(\Omega ))}\le C_{A,\psi ,y_0}({\bar{c}}). \qquad \end{aligned}$$
(33)

The uniqueness follows by subtracting the defining equations for two solutions and using a Gronwall argument to deduce that their difference vanishes.

Recall that the choice of discretization (given by the choice of the intervals) was arbitrary. Furthermore, on the way from \(y_{j,n}\) to y we had to take subsequences twice. Whatever choice is made, we would have obtained a y satisfying the same variational inequality. Since this variational inequality has a unique solution, the whole sequence has to converge. In summary, for every discretization we get a sequence \(y_{j,n}\) that for \(n\rightarrow \infty \) and then \(\tau \rightarrow 0\) results in the same limit y satisfying the variational inequality. \(\square \)

Finally, we also obtain Lipschitz-continuity of the time continuous solutions.

Theorem 7

Let Assumptions 1 hold. Then the solution of (4) depends Lipschitz-continuously on \((y_0, u)\) in the sense that

$$\begin{aligned} \begin{aligned}&\Vert y_1-y_2\Vert _{C([0,T];L^2(\Omega ))\cap L^2(0,T;H^1(\Omega ))}\\&\quad \le C_{\psi , A, T} \left( \Vert y_{1,0}-y_{2,0}\Vert _{L^2(\Omega )} + \Vert u_1-u_2\Vert _{L^2(0,T;H^1(\Omega )')}\right) , \end{aligned} \end{aligned}$$
(34)

where \(y_1,y_2\) are the solutions to the data \((y_{1,0}, u_1)\) and \((y_{2,0}, u_2)\) respectively.

Proof

Using sequences \(u_\tau ^{(1)}\), \(u_\tau ^{(2)} \in U_\tau \) converging to \(u_{1}\) and \(u_{2}\) in \(L^2(0,T;L^2(\Omega ))\) we obtain a sequence \(y_\tau ^{(i)} :=S_\tau (u_\tau ^{(i)})\) with \(y_\tau ^{(i)}\) converging to \(y_{i}= S(u_{i})\) in the sense of (31) for \(i=1,2\). Then applying Theorem 4 one obtains the Lipschitz estimate (34) since the constant in (24) is independent of \(\tau \). \(\square \)

We remark that this result can be obtained in a similar way as for the discrete solutions using the continuous Gronwall inequality.

In addition, let us mention that our discretization of the state equation inherits an important property of the time continuous case: since the state equation results from the gradient flow of the energy \({\mathcal {E}}\) given in (1), the energy should decrease in time when there is no input, i.e., \(u=0\).

Theorem 8

Let Assumptions 1 and \(\tau \le {2}/{C_\psi }\) hold. Then the scheme (9) for the state equation is energy stable, i.e., for \(u_\tau =0\) the energy functional \({\mathcal {E}}\) is nonincreasing in time.

Proof

We test (9) with the difference \(y_{j}-y_{j-1}\) and obtain

$$\begin{aligned} \dfrac{1}{\tau _j}\Vert y_{j} - y_{j-1}\Vert ^2 + \left( A'(\nabla y_j), \nabla y_{j} - \nabla y_{j-1}\right) + \left( \psi '(y_{j}), y_{j}- y_{j-1}\right) = 0. \end{aligned}$$
(35)

The convexity of A (recall \(A'\) is strongly monotone) yields \( \left( A'(\nabla y_j), \nabla y_{j} - \nabla y_{j-1}\right) \ge A(\nabla y_{j}) - A(\nabla y_{j-1})\) for the second term. The third term can be estimated by the following relation

$$\begin{aligned} \psi '(y_{j})( y_{j}- y_{j-1}) \ge \psi (y_{j}) - \psi (y_{j-1}) - \tfrac{C_\psi }{2} (y_{j}- y_{j-1})^2. \end{aligned}$$
(36)

This follows from a Taylor expansion of the approximations \(f_n\) of \(\psi \) from Assumptions 1: since \(f_n''\ge -C_\psi \), the analogous estimate holds for each \(f_n\), and passing to the limit \(n\rightarrow \infty \) using \(f_n \rightarrow \psi \) in \(C^1_{\text {loc}}\) yields (36).

Collecting terms and using the definition of the Ginzburg–Landau energy (1) one finds

$$\begin{aligned} \left( \dfrac{1}{\tau _j} -\dfrac{C_\psi }{2 }\right) \Vert y_{j}-y_{j-1}\Vert ^2 + {\mathcal {E}}(y_{j}) - {\mathcal {E}}(y_{j-1}) \le 0 \end{aligned}$$
(37)

and thus \({\mathcal {E}}(y_{j}) \le {\mathcal {E}}(y_{j-1})\) if \(\tau \le 2/C_\psi \). \(\square \)

The result of Theorem 8 applies in particular to the discretizations considered in Theorems 3 and 4, since its assumption on the step length \(\tau \) is the less restrictive one.
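As an illustration of this energy stability, the following sketch (under the same illustrative assumptions as the step sketched after (9): isotropic \(A(p)=\tfrac{1}{2}\vert p \vert ^2\), double well \(\psi \), uniform 1D grid of spacing h) evaluates the discrete Ginzburg–Landau energy (1); applied to the iterates of the implicit scheme with \(u=0\), Theorem 8 predicts the computed values to be nonincreasing.

```python
import numpy as np

def gl_energy(y, h):
    """Discrete Ginzburg-Landau energy (1) for the isotropic example
    A(p) = |p|^2/2, psi(y) = (y^2 - 1)^2/4 on a uniform 1D grid of spacing h."""
    grad = np.diff(y) / h                                # one-sided gradient
    surface = 0.5 * h * np.sum(grad ** 2)                # approximates int A(grad y)
    potential = 0.25 * h * np.sum((y ** 2 - 1.0) ** 2)   # approximates int psi(y)
    return surface + potential

# With a list `ys` of iterates y_0, y_1, ... produced by the implicit step
# with u = 0 (hypothetical variable from the earlier sketch), one can check
#   all(gl_energy(ys[j + 1], h) <= gl_energy(ys[j], h) + 1e-12
#       for j in range(len(ys) - 1))
```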

Finally let us comment on possible choices for the function A for the quasilinear term and for the function \(\psi \), in particular regarding the application to optimal control of anisotropic Allen–Cahn equations.

Remark 2

The Assumptions 1 on A are fulfilled, e.g., in the following cases:

1. \(A\in C^1({\mathbb {R}}^d)\) is convex, absolutely 2-homogeneous and satisfies \(A(p)>0\) for \(p\ne 0\), as in [20]. Writing \(A(p) =\tfrac{1}{2} \gamma (p)^2\), sufficient conditions on \(\gamma \) for these properties can be found, e.g., in [21]. One can actually require only \(A\in C^0({\mathbb {R}}^d)\). Then the above results still hold for the solution of the variational inequality (32), using the absolute 2-homogeneity in the estimate (17).

2. A is given as \(A(p) =\tfrac{1}{2} \left( \sum _{l=1}^{L} {(p^T G_lp+\delta )}^{1/2}\right) ^2 \) with symmetric positive definite matrices \(G_l\in {\mathbb {R}}^{d\times d}\) and \(\delta \ge 0\), see [7] for details. If \(\delta =0\), such anisotropies are studied in [22, 23] for the Allen–Cahn equation. These may be regularized using \(\delta >0\) to obtain \(A \in C^2({\mathbb {R}}^d)\) while losing the absolute 2-homogeneity (see the sketch following this remark).

The Assumptions 1 on \(\psi \) are fulfilled in the following cases:

1. \(\psi \in C^2({\mathbb {R}})\) is bounded from below, \(\psi '' \ge -C_\psi \) for some \(C_\psi \ge 0\) and \(\lim _{t\rightarrow \pm \infty }\psi ''(t) =+\infty \). This can be shown by choosing a value \(x_{-1}>0\) large enough such that \(\psi '' \ge 1\) and \(\psi ' \ge 0\) on \([x_{-1},\infty )\). Then, with \(x_n:= \text {argmin}_{x\in [x_{n-1}+1,\infty )}\psi ''(x)\), one can define the approximation on \([0,x_n]\) by \(f_n:=\psi \) and for \(x> x_n\) by \(f_n(x):=\psi (x_n)+\psi '(x_n) (x-x_n) +\tfrac{1}{2} \psi ''(x_n)(x-x_n)^2\). Analogously one can construct \(f_n\) on \((-\infty , 0]\).

2. \(\psi \) is the double well potential \(\psi (y) = \tfrac{1}{4}(y^2-1)^2\), since then the conditions in 1. hold. In addition to the regularity \(y \in L^\infty (0,T; H^1(\Omega ))\) of the solution of (4) that was shown in Theorem 6, from the estimate (33) we also obtain the regularity \(y\in L^6(0,T; L^6(\Omega ))\) for all space dimensions if we use this potential.

3. \(\psi \) is one of the following regularizations of the obstacle potential \(\psi _{obst}\), given by \(\tfrac{1}{2}(1-x^2) \) on \([-1,1]\) and \(\infty \) elsewhere:

    • \(\psi \) is the regularization considered in [38] for analyzing the solution of the isotropic Allen–Cahn or Cahn–Hilliard variational inequalities. There \(\psi _{obst}\) is regularized to \(\psi \in C^2\) by a smooth continuation of \(\psi _{obst}\) on \([-1,1]\) with a cubic polynomial in a neighborhood \(\pm (1,1+\delta )\) and then with a quadratic polynomial (cf. formula (2.9) there).

    • \(\psi \) is the Moreau–Yosida regularization of \(\psi _{obst}\), i.e., \(\psi \in C^1\) with \(\psi (x)=\tfrac{1}{2}(1-x^2) + s (\min \{x+1,0\})^2 + s (\max \{x-1,0\})^2 \) where the penalty parameter \(s\in {\mathbb {R}}^+\) is possibly very large. It is, e.g., used in [13] to study the optimal control of isotropic Allen–Cahn inequalities and to obtain a numerical approach.
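To make these choices concrete, here is a small numerical sketch (our illustration, not part of the cited works) of the regularized anisotropy from case 2 above and of two of the potentials just mentioned; the matrices \(G_1, G_2\), the regularization \(\delta \) and the penalty parameter s are example values.

```python
import numpy as np

def make_anisotropy(Gs, delta=1e-6):
    """Regularized anisotropy from case 2 of Remark 2:
    A(p) = 1/2 * (sum_l sqrt(p^T G_l p + delta))^2 with SPD matrices G_l;
    delta > 0 makes A smooth at p = 0, delta = 0 gives the 2-homogeneous case.
    Returns A and its gradient A'."""
    def parts(p):
        return [np.sqrt(p @ G @ p + delta) for G in Gs]
    def A(p):
        return 0.5 * sum(parts(p)) ** 2
    def A_prime(p):
        gs = parts(p)
        gamma = sum(gs)
        return gamma * sum((G @ p) / g for G, g in zip(Gs, gs))
    return A, A_prime

def psi_double_well(y):
    """Smooth double well potential psi(y) = (y^2 - 1)^2 / 4."""
    return 0.25 * (y ** 2 - 1.0) ** 2

def psi_moreau_yosida(y, s=1e3):
    """Moreau-Yosida regularization of the obstacle potential (last item above)."""
    return (0.5 * (1.0 - y ** 2)
            + s * np.minimum(y + 1.0, 0.0) ** 2
            + s * np.maximum(y - 1.0, 0.0) ** 2)

# example: a simple two-fold anisotropy in 2D
G1 = np.array([[1.0, 0.0], [0.0, 0.04]])
G2 = np.array([[0.04, 0.0], [0.0, 1.0]])
A, A_prime = make_anisotropy([G1, G2], delta=1e-6)
p = np.array([0.3, -0.7])
print(A(p), A_prime(p), psi_double_well(0.5), psi_moreau_yosida(1.2))
```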

3 Existence of the Optimal Control in the Time Discretized and in the Continuous Setting

Having shown the existence of solutions to the discretized state equation (9) and to the time continuous state equation (4) that satisfy the bounds (20) and (33), respectively, we are now able to establish the existence of solutions to the pertinent control problems (8) and (3).

Theorem 9

Let Assumptions 1 be fulfilled and let \(\tau :=\max _j \tau _j < \tfrac{1}{1+ 2C_\psi }\) hold. Then for every \(y_\Omega \in L^2(\Omega )\) the control problem (8)–(9) has at least one solution in \(U_\tau \times Y_\tau \).

Proof

The requirements ensure that Theorem 3 is applicable, and for every \(u_\tau \in U_\tau \) we find a unique solution \(S_\tau (u_\tau )=y_\tau \in Y_\tau \) of (9). Since the feasible set \(\{ (u_\tau , y_\tau )\mid y_\tau =S_\tau (u_\tau ) \text{ for } u_\tau \in U_\tau \}\) is nonempty and the cost functional in (8) is bounded from below, we can deduce the existence of an infimum \(\iota \) and of a minimizing sequence \(((u_\tau ^{(m)},y_\tau ^{(m)}) )_m \) with \(\iota :=\lim _{m\rightarrow \infty } {J}(y_\tau ^{(m)},u_\tau ^{(m)})\). If \((u_\tau ^{(m)})_m\) were unbounded in \(L^2(0,T;L^2(\Omega ))\), then \({J}(y_\tau ^{(m)},u_\tau ^{(m)})\) would be unbounded as well, contradicting its convergence to the infimum. Hence there exists a constant \(\bar{c}_\tau > 0\), possibly depending on \(\tau \), with \( \Vert u_\tau ^{(m)}\Vert _{L^2(0,T;L^2(\Omega ))}\le {\bar{c}}_\tau \) for all m, and we can extract a weakly convergent subsequence, denoted in the same way, with \( u_\tau ^{(m)} \rightharpoonup u_\tau ^*\) in \(L^2(0,T;L^2(\Omega ))\). From Theorem 3 we obtain, independently of m,

$$\begin{aligned} \Vert y_\tau ^{(m)}\Vert _{L^\infty (0, T; H^1(\Omega ))} + \Vert \psi '(y_\tau ^{(m)})\Vert _{L^2(0, T; L^2(\Omega ))} \le C_{A,\psi ,y_0}({\bar{c}}_\tau ) . \end{aligned}$$
(38)

Since \(y^{(m)}_\tau \) and \(y_\tau ^*\) are each determined by N functions in \(H^1(\Omega )\), the bound (38) and the compact imbedding \(H^1(\Omega ) \hookrightarrow L^2(\Omega )\) yield, possibly for a subsequence, \(y_\tau ^{(m)} \rightarrow y_\tau ^*\) in \(L^\infty (0, T; L^2(\Omega ))\) and pointwise almost everywhere, as well as \(\psi '(y_\tau ^{(m)} ) \rightharpoonup \psi '(y_\tau ^*)\) in \(L^2(0, T; L^2(\Omega ))\). Since \(U_\tau \) is finite dimensional in time and due to the compact imbedding \(L^2(\Omega )\hookrightarrow H^1(\Omega )'\) we obtain \(u_\tau ^{(m)} \rightarrow u_\tau ^*\) in \(L^2(0,T; H^1(\Omega )')\). The Lipschitz-continuity stated in Theorem 4 in addition yields \(y_\tau ^{(m)} \rightarrow y_\tau ^*\) in \(L^2(0, T; H^1(\Omega ))\). Now we can take the limit in the state equation and obtain

$$\begin{aligned} (y_j^*- y_{j-1}^*, \varphi ) + \tau _j(A'(\nabla y_j^*), \nabla \varphi ) + \tau _j(\psi '(y_j^*), \varphi ) = \tau _j (u_j^*, \varphi ) \qquad j=1,\ldots ,N. \nonumber \\ \end{aligned}$$
(39)

The convergence of the second term arises from the fact that \(A'\) induces a continuous Nemytskii operator from \(L^2(\Omega ;{\mathbb {R}}^d)\) to \(L^2(\Omega ;{\mathbb {R}}^d)\). From (39) we conclude that \(y_\tau ^*= S_\tau (u_\tau ^*)\), hence \((u_\tau ^*,y_\tau ^*)\) is feasible, and its optimality follows using the weak lower-semicontinuity of J. \(\square \)

Similarly we can show the existence of an optimal control in the time continuous setting, given the control-to-state operator \(S:u\mapsto y\) and the estimates (33) for y provided in the proof of Theorem 6.

Theorem 10

If Assumptions 1 and \(y_\Omega \in L^2(\Omega )\) hold, then there exists a solution to the optimization problem (3)–(4).

Proof

As in the proof of Theorem 9 we obtain a minimizing sequence \((u_m,y_m)\) with \(y_m=S(u_m)\), where \((u_m)\) is bounded; consequently there is a constant \({\bar{c}}\) such that (33) holds independently of m, i.e.,

$$\begin{aligned} \Vert \partial _t y_m\Vert _{L^2(0,T; L^2(\Omega ))} + \Vert y_m\Vert _{{L^\infty (0,T; H^1(\Omega ))}} + \Vert \psi '(y_m)\Vert _{L^2(0,T; L^2(\Omega ))} \le \bar{c}. \end{aligned}$$
(40)

From this we get a subsequence \((u_m,y_m) \) with \(u_m\) converging weakly to some \(\bar{u}\) in \(L^2(0,T;L^2(\Omega ))\), and \(y_m\) converging to a function \(\bar{y} \) weakly-\(*\) in \(L^\infty (0,T;H^1(\Omega ))\), weakly in \(H^1(0,T;H^1(\Omega )')\), and therefore, by the Aubin–Lions lemma, strongly in \( C([0,T]; L^2(\Omega ))\) with \(\bar{y}(0)=y_0\) and pointwise almost everywhere in Q.

Moreover, \(\partial _t y_m \) and \(\psi '(y_m) \) converge to \(\partial _t \bar{y}\) and \(\psi '(\bar{y})\), respectively, in the weak topology of \(L^2(0, T; L^2(\Omega ))\).

In order to obtain \({\bar{y}} = S({\bar{u}})\) we need to be able to pass to the limit also in the \(A'\)-term of (4). Given that \(A'\) induces a continuous Nemytskii operator from \(L^2(Q;{\mathbb {R}}^d)\) to \(L^2(Q;{\mathbb {R}}^d)\), it is sufficient to show the strong convergence \(\nabla y_m \rightarrow \nabla \bar{y}\) in \(L^2(0,T;L^2(\Omega ))\). Then, finally, the weak lower-semicontinuity of J provides that \((\bar{y}, \bar{u})\) is a minimizer of J.

The time derivative is monotone since \(y_m(0)-\bar{y}(0) = 0\). Hence we have \(\langle \partial _t \bar{y}, y_m-\bar{y}\rangle \le \langle \partial _t y_m, y_m-\bar{y}\rangle \), and \(y_m=S(u_m)\) yields

$$\begin{aligned} (A'(\nabla y_m), \nabla y_m - \nabla \bar{y}) \le (u_m,y_m-\bar{y}) - (\psi '(y_m),y_m-\bar{y}) - \langle {\partial _t \bar{y},y_m-\bar{y}}\rangle . \end{aligned}$$

Recalling the convergence properties of \(y_m\) together with \(\Vert u_m\Vert + \Vert \psi '(y_m)\Vert \le C\), the right hand side vanishes in the limit \(m\rightarrow \infty \). From strong monotonicity we obtain

$$\begin{aligned} C\Vert \nabla y_m-\nabla \bar{y}\Vert ^2 \le (A'(\nabla y_m), \nabla y_m - \nabla \bar{y}) - (A'(\nabla \bar{y}), \nabla y_m - \nabla \bar{y}), \end{aligned}$$

where the second term on the right hand side vanishes in the limit by weak convergence and we have just shown that the limit of the first one can be bounded by 0 from above. This finally yields the desired strong convergence of \(\nabla y_m\) in \(L^2(Q)\). \(\square \)

Note that for the convergence \(\nabla y_m \rightarrow \nabla \bar{y}\) in \(L^2(0,T;L^2(\Omega ))\) we could not use the Lipschitz estimate (34) for the time continuous problem in the way we used the analogous estimate (24) for the time discrete problem in Theorem 9. The reason is that in the time continuous case we do not have an analogue of the compact imbedding \(L^2(\Omega )^N \hookrightarrow H^1(\Omega )'^N\) (the imbedding \(L^2(0,T;L^2(\Omega )) \hookrightarrow L^2(0,T;H^1(\Omega )')\) is not compact). Therefore we had to show the convergence more directly.

As in [39] for elliptic and in [40] for parabolic problems, we finally consider the convergence of the minimizers \(u_\tau \) of the discretized problem to a minimizer of the problem in the continuous setting. Note that in [40] either a target function \(y_Q\) is given over the whole time horizon or the solution has to have a higher regularity with respect to time. Having shown the strong convergence result \(y_\tau (T,\cdot )\rightarrow y(T,\cdot )\) in Theorem 6, we can nevertheless derive the following result.

Theorem 11

Let Assumptions 1 be fulfilled and \(y_\Omega \in L^2(\Omega )\) hold. Consider a sequence of global optimal controls \((u_\tau , y_\tau )_\tau \) of (8) subject to (9) belonging to a sequence of discretizations with \(\tau \rightarrow 0\). Then there exists a subsequence with \(u_\tau \rightarrow u\) in \(L^2(0,T;L^2(\Omega ))\) and with \(y_\tau \) converging to \(y=S(u)\) in the sense of (31) where (uy) solves (3) subject to (4).

Proof

First we choose an arbitrary \(u^*\in L^2(0,T;L^2(\Omega ))\) and a sequence \(u^*_\tau \in U_\tau \) with \(u^*_\tau \rightarrow u^*\) in \(L^2(0,T;L^2(\Omega ))\). Hence \( y^*_{\tau }=S_\tau (u^*_\tau )\) is bounded in \( L^\infty (0, T; H^1(\Omega ))\) due to (20). Now let \((u_\tau )_\tau \) be the sequence of global minimizers to (8) subject to (9) and denote \(y_\tau =S_\tau (u_\tau )\). Then \(J(y_\tau , u_\tau ) \le J(y^*_\tau , u^*_\tau ) \le c \) implies that \((u_\tau )_\tau \) is bounded in \(L^2(0,T;L^2(\Omega ))\), and we deduce a subsequence with \(u_\tau \rightharpoonup u\) in \(L^2(0,T;L^2(\Omega ))\). Then the bounds (20) are fulfilled, and Theorem 6 yields the strong convergence \(y_\tau (T,\cdot )\rightarrow y(T,\cdot )\) in \(L^2(\Omega )\), where \(y=S(u)\). Analogously, given some arbitrary sequence \(\tilde{u}_\tau \) with \(\tilde{u}_\tau \rightarrow \tilde{u}\) in \(L^2(0,T;L^2(\Omega ))\), we obtain the latter also for \(\tilde{y}_\tau =S_\tau (\tilde{u}_\tau )\) and \(\tilde{y}=S(\tilde{u})\). This yields

$$\begin{aligned} J(y,u) \le \liminf _{\tau \rightarrow 0} J(y_\tau , u_\tau ) \le {\limsup _{\tau \rightarrow 0} J(y_\tau , u_\tau ) \le \lim _{\tau \rightarrow 0} J(\tilde{y}_\tau , \tilde{u}_\tau )} = J(\tilde{y}, \tilde{u}). \nonumber \\ \end{aligned}$$
(41)

Since \(\tilde{u}\) was arbitrary, this yields the global optimality of u. Plugging in \(\tilde{u} = u\) shows \(\lim _{\tau \rightarrow 0} J(y_\tau , u_\tau ) = J(y,u)\); since \(y_\tau (T,\cdot )\rightarrow y(T,\cdot )\) in \(L^2(\Omega )\), this yields the convergence \(\Vert u_\tau \Vert \rightarrow \Vert u\Vert \), and together with the weak convergence we obtain the strong convergence \(u_\tau \rightarrow u\) in \(L^2(0,T;L^2(\Omega ))\). \(\square \)

Remark 3

If instead of the cost functionals (3) and (8) one considers the cost functionals with a target function \(y_Q\) in \(L^2(0,T;L^2(\Omega ))\) given over the whole time horizon

$$\begin{aligned} J(y,u) :=\dfrac{1}{2} \Vert y - y_Q\Vert ^2_{L^2(Q)} + \dfrac{\lambda }{2} \Vert u\Vert ^2_{L^2(Q)} \end{aligned}$$
(42)

and its discrete counterpart

$$\begin{aligned} {J}_\tau ({y_\tau },{u_\tau }) :=\dfrac{1}{2}\sum _{j=1}^N \tau _j \Vert y_j-y_{Q,j}\Vert ^2 + \dfrac{\lambda }{2} \sum _{j=1}^N \tau _j \Vert u_{j}\Vert ^2, \end{aligned}$$
(43)

with \(y_{Q,\tau }\in Y_\tau \) and \(y_{Q,\tau }\rightarrow y_Q\) in \(L^2(0,T;L^2(\Omega ))\), the theorems of this section still hold true with proofs following the same lines.
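For completeness, a minimal sketch of evaluating the discrete cost (43) under the same illustrative assumptions as the earlier snippets (uniform 1D grid of spacing h, \(L^2(\Omega )\) norms approximated by h-weighted Euclidean norms of nodal values):

```python
import numpy as np

def discrete_cost(y_slices, u_slices, yQ_slices, taus, lam, h):
    """Discrete tracking-type cost (43): y_slices, u_slices, yQ_slices are
    sequences of nodal-value arrays y_j, u_j, y_{Q,j}; taus are the step sizes."""
    track = 0.5 * sum(t * h * np.sum((yj - yQj) ** 2)
                      for t, yj, yQj in zip(taus, y_slices, yQ_slices))
    control = 0.5 * lam * sum(t * h * np.sum(uj ** 2)
                              for t, uj in zip(taus, u_slices))
    return track + control
```

The end-time cost (8) is obtained by replacing the tracking sum with \(\tfrac{1}{2}\Vert y_N-y_\Omega \Vert ^2\), approximated in the same way.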