The boundedness-by-entropy principle for cross-diffusion systems

A novel principle is presented which allows for the proof of bounded weak solutions to a class of physically relevant, strongly coupled parabolic systems exhibiting a formal gradient-flow structure. The main feature of these systems is that the diffusion matrix may be generally neither symmetric nor positive semi-definite. The key idea of the principle is to employ a transformation of variables, determined by the entropy density, which is defined by the gradient-flow formulation. The transformation yields at the same time a positive semi-definite diffusion matrix, suitable gradient estimates as well as lower and/or upper bounds of the solutions. These bounds are a consequence of the transformation of variables and are obtained without the use of a maximum principle. Several classes of cross-diffusion systems are identified which can be solved by this technique. The systems are formally derived from continuous-time random walks on a lattice modeling, for instance, the motion of ions, cells, or fluid particles.


Introduction
Many applications in physics, chemistry, and biology can be modeled by reaction-diffusion systems with cross diffusion, which describe the temporal evolution of the densities or mass fractions of a multicomponent system. Physically, we expect that the concentrations are nonnegative or even bounded (examples are given in Section 2). Since generally no maximum principle holds for parabolic systems, the proof of these bounds is a challenging problem. A second difficulty arises from the fact that in many applications, the diffusion matrix is neither symmetric nor positive semi-definite.
In this paper, we introduce a general technique which allows us, under certain assumptions, to prove simultaneously the global existence of a weak solution as well as its boundedness from below and/or above. The key idea is to exploit the so-called entropy structure of the parabolic system, which is assumed to exist, leading at the same time to gradient estimates and lower/upper bounds. More specifically, we consider reaction-diffusion systems of the form (1) ∂ t u − div(A(u)∇u) = f (u) in Ω, t > 0, subject to the boundary and initial conditions (2) (A(u)∇u) · ν = 0 on ∂Ω, t > 0, u(0) = u 0 in Ω.
Here, u(t) = (u 1 , . . . , u n )(·, t) is a vector-valued function (n ≥ 1), representing the densities or mass fractions u i of the components of the system, A(u) = (a ij (u)) ∈ R n×n is the diffusion matrix, and the reactions are modeled by the components of the function f : R n → R n . Furthermore, Ω ⊂ R d (d ≥ 1) is a bounded domain with Lipschitz boundary and ν is the exterior unit normal vector to ∂Ω. The divergence div(A(u)∇u) and the expression (A(u)∇u)· ν consist of the components d j=1 n k=1 ∂ ∂x j a ik (u) ∂u k ∂x j , d j=1 n k=1 a ik (u) ∂u k ∂x j ν j , i = 1, . . . , n, respectively. From the applications, we expect that the solution u has values in an open set D ⊂ R n . When u models concentrations, we expect that D ⊂ (0, ∞) n (positivity); when u models mass fractions, we expect that the values of each component u i are between zero and one, i.e. D ⊂ (0, 1) n (boundedness and positivity). Before summarizing the state of the art of cross-diffusion systems and detailing our main results, we explain the key idea of our principle and provide an illustrating example.
1.1. Idea of the principle. The main assumption in this paper is that system (1) possesses a formal gradient-flow structure, i.e., (1) can be formulated as where B is a positive semi-definite matrix and δH/δu is the variational derivative of the entropy (or free energy) functional H[u] = Ω h(u)dx. The function h : D → [0, ∞) is called the entropy density. Identifying δH/δu with its Riesz representative Dh(u) (the derivative of h) and introducing the entropy variable w = Dh(u), the above formulation can be understood as where B = B(w) = A(u)(D 2 h(u)) −1 and D 2 h is the Hessian of h. For transforming back from the w-to the u-variable, we need to assume that Dh : D → R n is invertible such that u = (Dh) −1 (w). The gradient-flow formulation has two important consequences. First, calculating the formal time derivative of H, (3) and integrating by parts yields (4) dH dt = where A : B = i,j a ij b ij for matrices A = (a ij ) and B = (b ij ). Thus, if f (u) · w ≤ 0 and since B(w) is assumed to be positive semi-definite, H is a Lyapunov functional. We refer to H as an entropy and to the integral of ∇w : B∇w as the corresponding entropy dissipation. Under certain conditions, it gives gradient estimates for u needed to prove the global-in-time existence of solutions to (2) and (3). We remark that the positive semi-definiteness of B(w) is in fact a consequence of the existence of an entropy. It was shown in [19,35] that both properties are equivalent and moreover, B(w) may be even symmetric.
Second, supposing that there exists a weak solution w to (3), the invertibility of Dh : D → R n shows that the original variable u = (Dh) −1 (w) satisfies u(·, t) ∈ D for t > 0. Thus, if D is bounded, we automatically obtain L ∞ bounds for u, without using a maximum principle. If D is only a cone, for instance D = (0, ∞) n , we conclude the positivity of u(t).
We call the above technique the boundedness-by-entropy principle since it provides lower and/or upper bounds for the solutions to (1)-(2) by the use of the entropy density. Summarizing, the principle is based on two main hypotheses: H1: There exists a function h ∈ C 2 (D; [0, ∞)) whose derivative is invertible on R n . This yields the bound u(·, t) ∈ D. H2: The matrix D 2 h(u)A(u) is positive semi-definite for all u ∈ D. This condition is necessary to derive a priori estimates for u. Note that the positive semi-definiteness of D 2 h(u)A(u) is equivalent to that of B(w) = A(u)(D 2 h(u)) −1 since for all z ∈ R n , z ⊤ D 2 h(u)A(u)z = (D 2 h(u)z) ⊤ B(w)(D 2 h(u)z). Hypothesis H2 avoids the inversion of D 2 h(u).
In fact, we need a stronger hypothesis than H2 since it does not allow us to infer gradient estimates. We need to suppose that D 2 h(u)A(u) is positive definite in such a way that we obtain L 2 gradient estimates for u m i , where m > 0 is some number. Moreover, we need an estimate for the time derivative of u i which makes it necessary to impose some growth conditions on the coefficients of A(u). We explain these requirements with the help of the following example.
Notice that the nonnegativity of u 1 and u 2 can be proved easily by a maximum principle argument but the proof of upper bounds is less clear. The logarithmic entropy density (6) h(u) = u 1 (log u 1 − 1) + u 2 (log u 2 − 1) + (1 − u 1 − u 2 )(log(1 − u 1 − u 2 ) − 1) for u = (u 1 , u 2 ) ∈ D, where (7) D = {(u 1 , u 2 ) ∈ (0, 1) 2 : u 1 + u 2 < 1} satisfies Hypothesis H1, since the inverse transformation of variables gives (8) u = (Dh) −1 (w) = e w 1 1 + e w 1 + e w 2 , e w 2 1 + e w 1 + e w 2 ∈ D, where w = (w 1 , w 2 ) ∈ R 2 . Thus, once the existence of weak solutions w to the transformed system (3) is shown, we conclude the bounds 0 < u 1 , u 2 < 1 automatically by transforming back to the original variable. In particular, no maximum principle is used. In fact, the inverse transformation even shows that 1 − u 1 − u 2 > 0, as required from the application. Furthermore, the matrix with δ(u) = u 1 u 2 (1 − u 1 − u 2 )(2 + 4u 1 + u 2 ) is symmetric and positive definite, thus fulfilling Hypothesis H2. However, the matrix B = A(u)(D 2 h(u)) −1 degenerates at u 1 = 0 or u 2 = 0, and we cannot expect to conclude gradient estimates for w from (4). Instead, it is more appropriate to derive these estimates for the original variable u. Indeed, calculating the time derivative of the entropy according to (4) and using ∇w = D 2 h(u)∇u, we find that and the entropy dissipation can be estimated according to Thus, assuming that the integral involving the reaction terms can be bounded uniformly in u, we obtain H 1 (Ω) estimates for u Remark 1 (Notion of entropy). There exists an intimate relation between the boundednessby-entropy principle and non-equilibrium thermodynamics. In particular, the entropy variable w = Dh(u) is related to the chemical potentials of a mixture of gases and the special transformation (8) is connected with a special choice of thermodynamic activities; see Appendix A for details. The entropy density defined above equals the negative thermodynamic entropy.
Since the physical entropy is increasing and we wish to investigate nonincreasing functionals, we have reversed the sign as usual in entropy methods. Moreover, the logarithmic entropy (6) is motivated by Boltzmann's entropy for kinetic equations. For these reasons, we refer to the functional H[u] = Ω h(u)dx as a (mathematical) entropy. We note that in some applications, free energy may be a more appropriate notion.  [56]. In view of these counterexamples, it is not surprising that additional conditions are required to prove that (local in time) weak solutions are bounded and that they can be continued globally in time.
Ladyzenskaya et al. [38,Chap. VII] reduce the problem of finding a priori estimates of local-in-time solutions u to quasilinear parabolic systems to the problem of deriving L ∞ bounds for u and ∇u. Under some growth conditions on the nonlinearities, the global-in-time existence of classical solutions was shown. A fundamental theory of strongly coupled systems was developed by Amann [2]. He formulated the concept of W 1,p weak solutions and their local existence and proved that the solutions exist globally if their L ∞ and Hölder norms can be controlled. The above mentioned counterexamples show that the control on both norms is necessary. Le and Nguyen [41] proved that bounded weak solutions are Hölder continuous if certain structural assumptions on the diffusion matrix are imposed. The regularity of the solutions to systems with diagonal or full diffusion matrix was investigated in, for instance, [3,26,44,50].
The boundedness of weak solutions to strongly coupled systems has been proved using various methods. Invariance principles were employed by Küfner [37] and Redlinger [51], requiring severe restrictions on the initial data. Truncated test functions, which are nonlinear in the solutions to 2×2 systems, were suggested by Le [39] who proved the boundedness under rather restrictive structural assumptions. In the work of Lepoutre et al. [42], the existence of bounded solutions to strongly coupled systems with spatially regularized arguments in the nonlinearities was shown using Hölder theory for nondivergence parabolic operators. Other methods are based on the derivation of L p bounds uniform in p and the passage to the limit p → ∞ (Moser-type or Alikakos-type iterations [1]).
The idea of proving the boundedness of weak solutions using the entropy density (6) was, to our best knowledge, first employed by Burger et al. [8] in a size-exclusion model for two species (see Section 2.1). It was applied to a tumor-growth model in [33] and extended to Maxwell-Stefan systems for fluids with arbitrary many components [34]. A different entropy density was suggested in [29] to prove L ∞ bounds in one space dimension. In fact, the idea of using entropy variables already appeared in the analysis of parabolic systems from non-equilibrium thermodynamics [19], originally used to remove the influence from the electric potential, and goes back to the use of so-called Slotboom variables in semiconductor modeling [46,Section 3.2].
In this paper, we identify the key elements of this idea and provide a general global existence result for bounded weak solutions to certain systems. Furthermore, the technique is applied to a variety of cross-diffusion systems derived from a random-walk master equation on a lattice, underlying the strength and flexibility of the method.
1.4. Main results. Our first main result concerns the existence of bounded weak solutions to (1)-(2) under some general structural assumptions. Motivated by the comments in Sections 1.1 and 1.2, we impose the following hypotheses: H1: There exists a convex function h ∈ C 2 (D; [0, ∞)) (D ⊂ R n open, n ≥ 1) such that its derivative Dh : D → R n is invertible on R n .
H2': Let D ⊂ (a, b) n for some a, b ∈ R with a < b and let α * i , m i ≥ 0 (i = 1, . . . , n) be such that for all z = (z 1 , . . . , z n ) ⊤ ∈ R n and u = (u 1 , . . . , u n ) ∈ D, H2": There exists a * > 0 such that for all u ∈ D and i, j = 1, . . . , n for which m j > 1, it holds that |a ij (u)| ≤ a * |α j (u j )|. H3: It holds A ∈ C 0 (D; R n×n ) and there exists C f > 0 such that for all u ∈ D, f (u) · Dh(u) ≤ C f (1 + h(u)). Hypothesis H1 shows that the inverse transformation u = (Dh) −1 (w) is well defined. If w(t) is a weak solution to (3), we conclude that u(t) = (Dh) −1 (w(t)) ∈ D, yielding the desired L ∞ bounds on u(t). Assumption H2', which implies H2, is employed to prove a priori estimates for ∇u m i i . Hypothesis H2" is needed to show a bound on ∂ t u i . The condition on f (u) in Hypothesis H3 is needed to derive an a priori estimate for the solution. For instance, if the entropy density is given by (6), we may choose Section 6]. In contrast to the structural assumptions of [39], Hypotheses H1-H3 are easy to verify as soon as an entropy density is found (often motivated from the application at hand). Under the above conditions, the following theorem holds.
Note that since D is bounded by Hypothesis H2', the theorem yields global L ∞ bounds on u. We remark that we may also assume the slightly weaker condition u 0 (x) ∈ D. Indeed, we may approximate u 0 by u 0 η (x) ∈ D satisfying u 0 η → u 0 a.e., apply the theorem to u 0 η , and perform the limit η → 0. We refer to [17] for details.
The assumptions of the theorem are satisfied for the tumor-growth model and the Maxwell-Stefan equations -detailed in Section 2.1 -, including the example of Section 1.2. Furthermore, the diffusion matrix with entropy density (6) and domain (7) also satisfies Hypotheses H1, H2', and H2". The model with this diffusion matrix describes the aggregation of two population species with cross-diffusion terms related to the drift term of the Keller-Segel system. For the proof of Theorem 2, we first semi-discretize (3) in time with step size τ > 0 and regularize this equation by the expression ε( |α|=m (−1) m D 2α w + w), where ε > 0, D 2α is a partial derivative of order 2|α|, and m ∈ N is such that H m (Ω) ֒→ L ∞ (Ω). This regularization ensures that the approximate solution w (τ ) is bounded. For the proof of approximate solutions, we only need Hypotheses H1, H2, and H3. The discrete version of the entropy identity (4) and Hypothesis H2' yield gradient estimates for u (τ ) = (Dh) −1 (w (τ ) ). Furthermore, uniform estimates on the discrete time derivative of u (τ ) can be shown with the help of Hypothesis H2". Then, by the discrete Aubin lemma from [25], the limit (τ, ε) → 0 can be performed yielding the existence of a weak solution to (1)- (2).
Assumptions H2' and H2" may be too restrictive in certain applications. For instance, it may be impossible to find a bounded set D ⊂ R n satisfying H1 or the inequality in H2' is not satisfied (see the examples in Section 2). However, we show that our technique can be adapted to situations in which variants of Hypotheses H2' and H2" hold. To illustrate this idea, we choose a class of cross-diffusion systems modeling the time evolution of two population species. These models are derived from a random-walk master equation in the diffusive limit (see Appendix B). They generalize several models from the literature (see Section 2).
We consider two situations. In the first case, we assume that volume limitations lead to a limitation of the population densities (volume-filling case). Then the densities are nonnegative and bounded by a threshold value which is normalized to one. This situation occurs, for instance, in the volume-filling Keller-Segel model [48] and in ion-channel modeling [30]. The diffusion matrix in (1) reads as (see Appendix B) where u 1 and u 2 are the densities of the species with bounded total density, u 1 + u 2 ≤ 1, and u 3 = 1 − u 1 − u 2 . The function q is related to the transition probability of a species to move from one cell to a neighboring cell, and β > 0 is the ratio between the transition rates of both species. Biologically, q vanishes when the cells are fully packed, i.e. if u 1 + u 2 = 1, so q(0) = 0 and q is nondecreasing. When only one species is considered, the diffusion equation corresponds to the equation for the cell density in the volume-filling chemotaxis model [60]. The special case q(u 3 ) = u 3 was analyzed by Burger et al. [8]. We are able to prove the global existence of bounded weak solutions for q(u 3 ) = u s 3 for any s ≥ 1 but also more general functions are allowed (see Theorems 6 and 9).
We may also consider reaction terms f (u) satisfying a particular structure; see Remark 8.
The key idea of the proof is to introduce the entropy density (12) h(u) = u 1 (log u 1 − 1) + u 2 (log u 2 − 1) + u 3 c log q(y)dy, where u = (u 1 , u 2 ) ∈ D = {(u 1 , u 2 ) ∈ (0, 1) 2 : u 1 + u 2 < 1} and 0 < c < 1. A computation shows that Hypotheses H1 and H2 are satisfied but not H2'. Indeed, we will prove in Section 4 (see (32)) that and the factors degenerate at u 1 + u 2 = 1 since q(0) = 0. From this, we are able to conclude a gradient estimate for u 3 but not for u 1 or u 2 . However, this is sufficient to infer the existence of solutions, employing an extension of Aubin's compactness lemma (see Appendix C).
In the second case, volume-filling effects are not taken into account. Then the diffusion matrix reads as (see Appendix B) where p 1 and p 2 are related to the transition probabilities of the two species and ∂ i p j = ∂p j /∂u i . Note that each row of A(u) is the gradient of a function such that div(A(u)∇u) i = ∆(u i p i (u)), i = 1, 2, which allows for additional L 2 estimates for u 1 and u 2 , using the duality estimates of Pierre and Schmitt [49]. This observation was exploited in [23]. When the transistion probabilities depend linearly on the densities, p i (u) = α i0 + α i1 u 1 + α i2 u 2 (i = 1, 2), we obtain the well-known population model of Shigesada, Kawashima, and Teramoto [54] with the diffusion matrix (14) A(u) = α 01 + 2α 11 u 1 + α 12 u 2 α 12 u 1 α 21 u 2 α 20 + α 21 u 1 + 2α 22 u 2 .
The maximum principle implies that u 1 and u 2 are nonnegative. Less results are known concerning upper bounds. In fact, in one space dimension and with coefficients α 10 = α 20 , Shim [55] proved uniform upper bounds. Moreover, if cross-diffusion is weaker than selfdiffusion (i.e. α 12 < α 22 , α 21 < α 11 ), weak solutions are bounded and Hölder continuous [39]. Upper bounds without any restrictions are not known so far (at least to our best knowledge). This model has received a lot of attention in the mathematical literature. One of the first existence results is due to Kim [36] who neglected self-diffusion (α 11 = α 22 = 0) and assumed equal coefficients (α ij = 1). The tridiagonal case α 21 = 0 was investigated by Amann [2], Le [40], and more recently, by Desvillettes and Trescaces [24]. Yagi [61] proved an existence theorem under the assumption that the diffusion matrix is positive definite (α 12 < 8α 11 , α 21 < 8α 22 , α 12 = α 21 ). The first global existence result without any restriction on the diffusion coefficients (except positivity) was achieved in [29] in one space dimension and in [14,15] in several space dimensions. The case of concave functions p 1 and p 2 , for instance, (15) p i (u) = α i0 + a i1 (u 1 ) + a i2 (u 2 ), where a i1 (u 1 ) = α i1 u s 1 , a i2 (u 2 ) = α i2 u s 2 , and i = 1, 2, u = (u 1 , u 2 ), 0 < s < 1, was analyzed by Desvillettes et al. [23]. We are able to generalize this result to the case 1 < s < 4 but we need to restrict the size of the crossdiffusion coefficients α 12 and α 21 . More general functions p 1 and p 2 are possible; see Section 5.
Theorem 4 (No volume-filling case). Let (15) hold with 1 < s < 4 where h is defined in (16) below. Then there exists a weak solution u = (u 1 , u 2 ) to (1)- (2) with diffusion matrix (13) and Nonvanishing reaction terms f (u) can be treated if they satisfy an appropriate growth condition such that f (u (τ ) ) is bounded in some L p space with p > 1, where u (τ ) is a solution to an approximate problem. We leave the details to the reader.
The idea of the proof is to employ the entropy density It is an open problem whether there exists another entropy density fulfilling Hypothesis H2 without any restriction on α ij (except positivity). In order to satisfy Hypothesis H1, we need to regularize the entropy density (16) as in [23]: This regularization is motivated from the population model with diffusion matrix (14), see Section 2.2. The range of Dh ε equals R 2 , as required in Hypothesis H1. However, this regularization makes necessary to regularize also the diffusion matrix, Then the regularized product D 2 h ε (u)A ε (u) is positive semi-definite (Hypothesis H2) only if s < 4. This restriction may be improved by developing a better regularization procedure. This paper is organized as follows. In Section 2, we consider some examples of crossdiffusion systems, studied in the literature, and discuss the validity of our hypotheses. The general existence Theorem 2 is shown in Section 3. The proofs of Theorems 3 and 4 are presented in Sections 4 and 5, respectively. Some further results and open problems are discussed in Section 6. The appendix is concerned with some relations of our method to non-equilibrium thermodynamics; the derivation of a general population diffusion model, containing the diffusion matrices (10) and (13) as special cases; and the proof of a variant of the Aubin compactness lemma, needed in the proof of Theorem 3. To simplify the presentation, we write in the following L p (0, T ; H k (Ω)) instead of L p (0, T ; H k (Ω; R n )).

Examples
We present some diffusion problems from physical and biological applications and discuss the validity of Hypotheses H1-H3. Some examples do not satisfy the hypotheses but similar ideas as detailed in the introduction apply.

Examples with volume filling.
Volume-filling model of Burger. Burger et al. [9] derived from a lattice-based hopping model a cross-diffusion system for n species, incorporating volume-filling effects and modeling the ion transport through narrow pores. The system consists of equation (1) and D i > 0 are some constants. It was shown in [9] that the entropy with density The case of n = 2 species was investigated in [8]. Then the diffusion matrix becomes which is a special case of (10) with q( Consequently, assuming without loss of generality that D 2 > D 1 (otherwise, change the indices 1 and 2), Hypothesis H2 is satisfied but not Hypothesis H2' since the quadratic form degenerates at ρ = 1. Burger et al. have shown in [8] that a global existence analysis is still possible. We generalize this result to general functions q in Theorem 3.
In the general case n > 2 and the case of equal diffusivity constants D 1 = D i for all i = 2, . . . , n, summing up all equations yields ∂ t ρ − ∆ρ = 0 in Ω with homogeneous Neumann boundary conditions. Thus, by the classical maximum principle, ρ is bounded from below and above, and this yields L ∞ bounds for the components u i ≥ 0. The analysis of different diffusivity constants and n > 2 is much more delicate and an open problem.
Tumor-growth model. Jackson and Byrne [32] derived a continuous mechanical model for the growth of symmetric avascular tumors in one space dimension. In this model, the mass balance equations for the volume fractions of the tumor cells u 1 , the extracellular matrix u 2 , and the water phases u 3 = 1 − u 1 − u 2 are supplemented by equations for the velocity, obtained from a force balance. For a small cell-induced pressure coefficient (i.e. θ = β = 1 in [32]), the resulting diffusion matrix reads as .
With the entropy density (6), defined in D which is given by (7), we compute for z = (z 1 , Thus, since u 2 ≤ 1, Hypotheses H1, H2', and H2" are satisfied and the global existence result follows from Theorem 3 if the reaction terms satisfy Hypothesis H3. This result was first proven in [33].
Maxwell-Stefan equations. The Maxwell-Stefan equations describe the diffusive transport of multicomponent gaseous mixtures. In the case of mixtures of ideal gases consisting of n components under isobaric, isothermal conditions with vanishing barycentric velocity, the equations read as where D ij > 0 are the binary diffusion coefficients. The variables u i are the molar concentrations of the mixture and they satisfy n j=1 u j = 1. The inversion of the flux-gradient relations is not straightforward since the linear system in J i has a singular matrix; see [4,45]. The idea of [34] was to replace the last component u n by the remaining ones by u n = 1 − n−1 j=1 u i and to analyze the remaining n − 1 equations. For n = 3 components, the inversion can be made explicit, leading to system (1) for the remaining n − 1 = 2 equations with the diffusion matrix . The diffusion matrix (5) in Section 1.2 is obtained after setting d 0 = 3, d 1 = 2, and d 2 = 1.
Employing the entropy density (6) as in the previous example, we compute Therefore, Hypothesis H1 and H2 are fulfilled with D given by (7). Moreover, Hypothesis H2' is satisfied with m 1 = m 2 = 0, and also Hypothesis H2" holds. Theorem 2 yields the existence of global bounded weak solutions to this problem. It can be shown that this result holds true in the case of n > 3 components; see [34].

Models without volume filling.
Population model of Shigesada, Kawashima, and Teramoto. The model describes the evolution of two population densities u 1 and u 2 governed by equation (1) with the diffusion matrix where the coefficients α ij are nonnegative. Introducing the entropy density a computation shows that Hypothesis H2 is satisfied, i.e. for all z = (z 1 , Thus, we see that Hypothesis H2' with m i = 0 or m i = 1 and Hypothesis H2" hold true. Since D = (0, ∞) 2 is not bounded and the range of Dh is not the whole R 2 , we cannot apply Theorem 2. However, as the coefficients of A(u) depend only linearly on u, the proof of Theorem 4 can be adapted to this case and we conclude the existence of global nonnegative weak solutions to this problem. This statement was first proved in [14] using a different approximation procedure. In Theorem 4 we generalize this result.
Semiconductor model with electron-hole scattering. The carrier transport through a semiconductor device with strong electron-hole scattering but vanishing electric field can be modeled by equation (1) with diffusion matrix where u 1 and u 2 are the electron and hole densities, respectively, and µ 1 and µ 2 denote the corresponding (positive) mobility constants. This model was formally derived from the semiconductor Boltzmann equation with a collision operator taking into account strong electron-hole scattering [52]. The global existence of weak solutions was shown in [16].
Introducing the entropy density Thus, Hypothesis H2 is satisfied but not H2' since the quadratic form is not uniformly positive. However, as shown in [16], bounds on the electron and hole masses (i.e. the integrals of u 1 and u 2 ) together with estimates from the entropy dissipation yield an H 1 bound for u 1/2 1 and u 1/2 2 . Then, with the entropy variables w i = ∂h/∂u i = log u i , the existence of global weak solutions was proved in [16].

Proof of Theorem 2
The proof is based on the solution of a time-discrete and regularized problem, for which only Hypotheses H1, H2, and H3 are needed.
Step 1. Solution of an approximate problem.
Lemma 5 (Existence for the regularized system (18)). Let Assumptions H1, H2, and H3 hold. Let u 0 ∈ L ∞ (Ω; R n ) be such that u 0 (x) ∈ D for x ∈ Ω, and let 0 < τ < 1/C f , where C f > 0 is defined in H3. Then there exists a weak solution w k ∈ H m (Ω; R n ) to (18) satisyfing u(w k (x)) ∈ D for x ∈ Ω and the discrete entropy-dissipation inequality Proof. Let y ∈ L ∞ (Ω; R n ) and δ ∈ [0, 1] be given. We solve first the following linear problem: The forms a and F are bounded on H m (Ω; R n ). The matrix B(y) = A(u(y))(D 2 h) −1 (u(y)) is positive semi-definite since, by Assumption H2, Hence, the bilinear form a is coercive: The last inequality follows from the generalized Poincaré inequality for H m spaces [58, Chap. II.1.4, Formula (1.39)], and C > 0 is some constant only depending on Ω. Therefore, we can apply the Lax-Milgram lemma to obtain the existence of a unique solution (20). This defines the fixed-point operator S : (20). It holds S(y, 0) = 0 for all y ∈ L ∞ (Ω; R n ). We claim that the operator S is continuous. Indeed, let δ η → δ in R and y η → y strongly in L ∞ (Ω; R n ) as η → 0 and set w η = S(y η , δ η ) ∈ H m (Ω; R n ). Then, by continuity, u(y η ) → u(y), B(y η ) → B(y), and f (u(y η )) → f (u(y)) strongly in L ∞ (Ω; R n ), and the above coercivity estimate gives a uniform bound for (w η ) in H m (Ω; R n ). This implies that, for a subsequence which is not relabeled, w η ⇀ w weakly in H m (Ω; R n ). Then, performing the limit η → 0 in (20), it follows that w = S(y, δ). In view of the compact embedding H m (Ω; R n ) ֒→ L ∞ (Ω; R n ), we infer that for a subsequence, S(y η , δ η ) = w η → w = S(y, δ) strongly in L ∞ (Ω; R n ). Since the limit w is unique, the whole sequence (w η ) is converging which shows the continuity. Furthermore, the compact embedding H m (Ω; R n ) ֒→ L ∞ (Ω; R n ) yields the compactness of S.
It remains to prove a uniform bound for all fixed points of S(·, δ) in L ∞ (Ω; R n ). Let w ∈ L ∞ (Ω; R n ) be such a fixed point. Then w solves (20) with y replaced by w. With the test function φ = w, we find that The for all x, y ∈ D. Choosing x = u(w) and y = u(w k−1 ) and using Dh(u(w)) = w, this gives Taking into account the positive semi-definiteness of B(w) and Assumption H3, (21) can be estimated as follows: Choosing τ < 1/C f , this yields an H m bound for w uniform in δ (not uniform in τ or ε). The Leray-Schauder fixed-point theorem shows that there exists a solution w ∈ H m (Ω; R n ) to (20) with y replaced by w and δ = 1.
Proof. First we observe that we may assume that β ≥ 1 since otherwise we rescale the equations by t → βt, which removes this factor in the second equation but yields the factor 1/β > 1 in the first equation.
To verify Assumption H2, we show the following lemma.
Step 2: A priori estimates. By Lemma 5, there exists a sequence (w k ) of weak solutions to (18) satisfying the entropy-dissipation inequality (19) with C f = 0. Taking into account the identity B(w k )∇w k = A(u k )∇u k , where u k = u(w k ), w k is a solution to for all φ ∈ H m (Ω; R 2 ). Furthermore, because of ∇w k : B(w k )∇w k = ∇u k : (D 2 h)(u k )A(u k )∇u k and (29), the discrete entropy inequality (19) can be written as where u k = (u k 1 , u k 2 ) and u k Resolving the recursion, we infer that Using the generalized Poincaré inequality [58, Chap. II.1.4], we deduce that The assumption q(y) ≤ κq ′ (y) 2 (see (28)) implies an H 1 estimate for u k 3 : The boundedness of (u k i ) in L ∞ yields an L 2 estimate for (u k i ) and hence, also for (u k 3 ). Therefore, We cannot perform the simultaneous limit (τ, ε) → 0 as in the proof of Theorem 2 since we need a compactness result of the type of Lemma 13 (see Appendix C) in which the discrete time derivative is uniformly bounded in H 1 (Ω) ′ and not in the larger space H m (Ω) ′ with m > d/2. Therefore, we pass to the limit in two steps.
Theorem 9. Let β > 1 and q(y) = y s with s ≥ 1. Let u 0 ∈ L 1 (Ω) with 0 < u 0 < 1 in Ω. Then there exists a weak solution to (1)-(2) satisfying the weak formulation (11), u(0) = u 0 in H 1 (Ω; R 2 ) ′ , and Proof. It is sufficient to consider the case s > 2 since the case s ≤ 2 is contained in Theorem 6. Assumptions H1 and H2 hold since this requires only q(0) = 0 and q ′ (s) ≥ 0. By Lemma 5, there exists a sequence (w k ) of weak solutions to (18) satisfying the entropy-dissipation inequality (31), which reads as Observing that Compared to the proof of Theorem 2, we cannot conclude an H 1 bound for u k 3 but only for (u k 3 ) s/2 . Set u (ε) i = u k i for i = 1, 2, 3. The function z → z 2/s for z ≥ 0 is Hölder continuous since s > 2. Therefore, by the lemma of Chavent and Jaffre [13, p. 141], Since the embedding W 2/s,s (Ω) ֒→ L s (Ω) is compact, we conclude the existence of a subsequence (not relabeled) such that u (ε) 3 → u 3 strongly in L s (Ω). At this point, we can proceed as in Step 3 of the proof of Theorem 2.
For the limit τ → 0, we need another compactness argument. Let u (τ ) i be defined as in the proof of Theorem 2. We have the uniform estimates Then, by the generalization of the Aubin lemma in [18,Theorem 3], we obtain, up to a subsequence, as τ → 0, The remaining proof is exactly as Step 4 of the proof of Theorem 2.
Since the entropy density h(u), defined in (16), may not fulfill Hypothesis H1, we need to regularize: Then Dh ε (u) = Dh(u) + ε(log u 1 , log u 2 ) ⊤ , and the range of Dh ε is R 2 . The Hessian of h ε equals showing that each component of Dh ε is strictly increasing, and thus, (Dh ε ) −1 : R 2 → D = (0, ∞) 2 is well defined. Hence, h ε fulfills Hypothesis H1. We also regularize the diffusion matrix by setting The entropy density is regularized similarly in [23] but the diffusion matrix is regularized here in a different way.
Step 1: Verification of Hypotheses H1 and H2. Set H = D 2 h(u). First, we show that HA and H ε A ε are positive definite under additional conditions on the functions a ij .
showing the claim.
Estimates (52) and (55) allow us also to apply the Aubin lemma in the version of [18], such that, for a subsequence, u (τ ) → u strongly in L 2s (0, T ; L 2q (Ω)) as (τ, ε) → 0, where q ≥ 2 is such that H 1 (Ω) ֒→ L q (Ω). Because of the uniform bound of (u (τ ) i ) in L 2s(d+1)/d (Ω T ) and the a.e. convergence of (u (τ ) i ), we find that, up to a subsequence, u (τ ) i → u i strongly in L p (Ω T ) for all p < 2s(d + 1)/d and in particular for p = 2s. By (52) and the above strong convergence, again up to a subsequence, u i p i (u (τ ) ) → u i p i (u) a.e. in Ω T . Because of the bound (54), the dominated convergence theorem shows that u (τ ) The above convergence results are sufficient to pass to the limit (τ, ε) → 0, which yields This proves the theorem.

Further results and open problems
In this section, we discuss some further results and open problems related to cross-diffusion systems and entropy methods.
(i) Dirichlet boundary conditions: Dirichlet boundary conditions may be treated under some conditions on the nonlinearites. To see this, we assume that u i = u D,i on ∂Ω for i = 1, . . . , n and we set w D = Dh(u D ). For simplicity, we assume that u D = (u D,1 , . . . , u D,n ) depends on the spatial variable only. Then, using the test function w − w D in (1), we obtain If h(u) grows superlinearly (such as u i log u i ) and w D is smooth, we may estimate the first integral on the left-hand side as Ω (h(u) − w D u)dx ≥ 1 2 Ω h(u)dx − C for some constant C > 0. Under Hypotheses H2' and H3, the second integral on the left-hand side and the second integral on the right-hand side are estimated as follows: Then, assuming that A(u) and f (u) are such that, for instance, we can estimate the remaining integrals on the right-hand side of (56). In a similar way, we may treat Robin boundary conditions by combining the ideas for Dirichlet and homogeneous Neumann conditions. (ii) Uniqueness of weak solutions: It is possible to prove the uniqueness of weak solutions to (1)-(2) (with f = 0) under the assumption that A(u) can be written as a gradient, i.e. A(u) = ∇Φ(u) for some monotone function Φ : D → R n , by employing the H −1 method. Unfortunately, the monotonicity assumption is rather strong since it requires that ∇Φ(u) = A(u) is positive semi-definite. Another uniqueness result can be obtained if we are able to write the diffusion equation in terms of the entropy variable (see (3)), for instance in the approximated setting. For this result, we need to suppose that B(w) = ∇Φ(w) and that Φ • Dh is monotone. Let w (1) and w (2) be two weak solutions to (2)-(3) (with f = 0) with the same initial data. Furthermore, let v ∈ L 2 (0, T ; H 1 (Ω; R n )) be the weak solution to −∆v = u(w (1 )−u(w (2) ) in Ω and ∇v · ν = 0 on ∂Ω. Then by the monotonicity of Φ • Dh, where u (i) = u(w (i) ) for i = 1, 2. This shows that v = 0 and hence u(w (1) ) = u(w (2) ). Uniqueness results under weaker conditions are an open problem (at least to our best knowledge). In fact, it is well known that this question is delicate since non-uniqueness is well known even for some scalar equations; see, e.g., [31]. (iii) Quadratic reaction terms: In this paper, our focus was rather on the diffusion part than on the reaction part. The question is whether the results can be extended to diffusion systems with, say, quadratic reaction rates as they arise in reversible chemistry. For instance, a global existence result for the following problem is generally unknown: for i = 1, 2, 3, 4, with boundary and initial conditions (2). If A(u) is the diagonal matrix A = diag(d 1 , . . . , d n ) with smooth functions d i = d i (x, t) ≥ 0, the global existence of weak solutions was shown in [22]. This result is based on the observation that h(u) = 4 i=1 u i (log u i − 1) possesses an L 2 estimate. This is proved by using the duality method of M. Pierre [49], and the proof is based on the diagonal structure of A. A global existence result for a system modeling the more general reaction A 1 + A 2 ⇋ A 3 + A 4 was given in [10] in the two-dimensional case, the three-dimensional case being an open problem. Reaction-diffusion systems with diffusivities d i depending on u and quadratic rate functions were recently analyzed in [6], but still only in the diagonal case. (iv) Long-time behavior of solutions: We expect that the long-time behavior of weak solutions to (1)-(2) with f = 0 can be proven under some additional conditions. For specific diffusion matrices, the exponential decay has been already shown; see [33,34]. The idea is to estimate the entropy dissipation in terms of the entropy. For instance, let u ∞ = (u ∞,1 , . . . , u ∞,n ) be a constant steady-state to (1)- (2). To simplify, we assume that the entropy is given by and that the entropy dissipation can be estimated from below according to for some λ > 0. Then the entropy-dissipation inequality (4) becomes and Gronwall's lemma implies exponential convergence in terms of the relative entropy H * . By the Csiszár-Kullback inequality, we conclude the exponential decay in the L 1 norm, u(t) − u ∞ L 1 (Ω) ≤ C exp(−λt/2) for t > 0. For details, we refer to [11]. The main task is to derive the entropy-dissipation relation (57). According to Hypothesis H2', we need to prove If α i (u i ) = u 1/2 i (for instance), this inequality is a consequence of the logarithmic Sobolev inequality. For more general functions α i and power-type entropy densities, one may employ Beckner-type inequalities; see [12,Lemma 2]. The general case, however, is an open problem.
(v) Entropies for population models: It is an open problem to find an entropy for the general population model with diffusion matrix A = (a ij ) defined in (67). Just a simple combination of the entropies for the systems analyzed in Theorems 3 and 4, i.e. summing (12) and (16), seems to be not sufficient. It is also an open problem if an entropy for the population system with matrix (13), where p 1 and p 2 are given by (15), exists without restrictions on the coefficients α ij (except positivity). (vi) Gradient systems and geodesic convexity: There might to be a relation between our entropy formulation and the gradient structure for reaction-diffusion systems developed by Liero and Mielke [43] but no results are known so far. Using purely differential methods, they have shown the geodesic λ-convexity for particular reaction-cross-diffusion systems.
It is an open problem if such geometric properties also hold for the examples presented in Section 2. (vii) Coupled PDE-ODE problems: Coupled reaction-diffusion-ODE problems occur, for instance, in chemotaxis-haptotaxis systems modeling cancer invasion [57]. Denoting by u 1 , u 2 , u 3 the densities of the cancer cells, matrix-degrading enzymes, extracellular matrix, respectively, the evolution is governed by (1) with the exemplary diffusion matrix It is an open problem whether these systems possess an entropy structure and whether our method can be extended to such problems. One idea could be to reduce the problem to the subsystem (u 1 , u 2 ) and to estimate terms involving ∇u 3 directly by differentiating the ODE for u 3 . (viii) Bounded weak solutions: A challenging task is to determine all diffusion systems whose weak solutions are bounded. As a first step in this direction, we consider n = 2 and diffusion matrices A(u) = (a ij ) depending linearly on the variables, where α ij , β ij , γ ij ∈ R. We wish to find conditions on the coefficients for which the logarithmic entropy density (6) satisfies Hypothesis H2. Requiring that B = A(D 2 h) −1 is symmetric, we can fix some coefficients, The matrix (c ij ) = (D 2 h)A is positive semi-definite if and only if which is the case if 0 ≤ c 22 < ∞. However, it seems to be difficult to solve inequalities (58)- (59) in the general situation. Possibly, techniques from quadratic optimization with inequality constraints and quantifier elimination may help.
Appendix A. Relations to non-equilibrium thermodynamics We show that the entropy variable w = Dh(u), defined in Section 1, is strongly related to the chemical potentials of a fluid mixture and that the particular change of unknowns associated with the logarithmic entropy density (6) is related to a special choice of the thermodynamic activities.
We introduce first the thermodynamic setting. Consider a fluid consisting of N components with the same molar mass under isobaric and isothermal conditions. We write ρ i instead of u i to denote the mass density of the ith component. The evolution of the mass densities is governed by the mass balance equations where J i are the diffusion fluxes. We have assumed that the barycentric velocity vanishes and that there are no chemical reactions. Furthermore, we assume for simplicity that the total mass density is constant, N j=1 ρ j = 1. Let s(ρ) = s(ρ 1 , . . . , ρ N ) be the thermodynamic entropy of the system. Then the chemical potentials µ i are defined (in the isothermal case) by where here and in the following we set physical constants (like temperature) equal to one. Neglecting also body forces and the (irreversible) stress tensor, the diffusion fluxes J i can be written as (see [21, Chapter IV, (15)] or [5,Formula (170)]) where L ij are some diffusion coefficients such that (L ij ) is positive definite. Once an explicit expression for the thermodynamic entropy is determined, equations (60)-(61) are closed. Now, we explain the relation of the entropy variables to the above setting. Since ρ N = 1 − N −1 j=1 ρ j , we may express the density of the last component in terms of the others such that we can introduce and in fact, h corresponds to the (mathematical) entropy introduced in Section 1. With this notation, the entropy variables become which relates the entropy variables to the chemical potentials. Moreover, comparing the flux vector J = B(w)∇w from (3) with (61), we see that the diffusion matrix B(w) coincides with (L ij ).
For the second statement, we recall that the chemical potentials of a mixture of ideal gases can be formulated as i is the Gibbs energy which generally depends on temperature and pressure. Since we have supposed an isobaric, isothermal situation, µ 0 i is constant and, for simplicity, we set µ 0 i = 0 for i = 1, . . . , N . In order to model non-ideal gases, it is usual in thermodynamics to introduce the thermodynamic activity a i and the activity coefficient γ i by we recover the ideal-gas case. In the volume-filling case, Fuhrmann [28] has chosen γ i = 1 + N −1 j=1 a j for numerical purposes. Then and since a i = exp(µ i ), it follows that which corresponds to the inverse transformation (8) if we identify µ i with w i . This expression can be derived directly from (62). Indeed, if µ 0 j=1 ρ j , and inverting these relations, we find that Appendix B. Formal derivation of two-species population models We derive formally cross-diffusion systems from a master equation for a continuous-time, discrete-space random walk in the macroscopic limit. We consider only random walks on one-dimensional lattices but the derivation extends to higher dimensions in a straightforward way. The lattice is given by cells x i (i ∈ Z) with the uniform cell distance h > 0. The densities of the populations in the ith cell at time t > 0 are denoted by u 1 (x i , t) and u 2 (x i , t), respectively. We assume that the population species u 1 and u 2 move from the ith cell into the neighboring (i ± 1)th cells with transition rates S ± i and T ± i , respectively. Then the master equations can be formulated as follows [47]: where i ∈ Z. We further suppose that the transition rates S ± i and T ± i depend on the departure cell i and the arrival cells i ± 1: , where σ 0 > 0 is some number. Abbreviating p j (x i ) = p j (u 1 (x i ), u 2 (x i )) and q j (x i ) = q j (1 − u 1 (x i ) − u 2 (x i )), the master equations (63)-(64) become The functions p 1 (x i ) and p 2 (x i ) model the tendency of the species to leave the cell i, whereas q 1 (x i ) and q 2 (x i ) describe the probability to move into the cell i. The latter functions allow us to model the so-called volume-filling effect. Indeed, we may interpret u 1 and u 2 as volume fractions satisfying u 1 + u 2 ≤ 1. Then 1 − u 1 − u 2 describes the volume fraction not occupied by the two species. If the ith cell is fully occupied, i.e. u 1 (x i ) + u 2 (x i ) = 1, and q j (1 − u 1 (x i ) − u 2 (x i )) = q j (0) = 0, the probability to move into the ith cell is zero.
It seems very difficult-if not impossible-to explore the entropy structure of this equation in full generality. Therefore, we investigated two special cases in this paper. First, we assumed that p 1 = p 2 = 1, q 1 = q, and q 2 = βq, where β > 0. This corresponds to the volume-filling case (10). Second, we have set q 1 = q 2 = 1. This gives the matrix (13).
Note that the result would follow from the Aubin compactness lemma if y τ was bounded from below by a positive constant, since in this situation, it would suffice to apply the Aubin lemma in the version of [25] to infer the strong convergence of (a subsequence of) (z τ ) in L 2 (0, T ; L 2 (Ω)) which, together with the strong convergence of (y τ ), would give the result.
Proof. The proof is inspired from [8,Section 4.4] but parts of the proof are different. The idea is to prove that lim (h,k)→0 Ω T (y τ z τ )(x + h, t + k) − (y τ z τ )(x, t) 2 dx dt = 0 uniformly in τ > 0, where Ω T = Ω × (0, T ) and y τ (·, t), z τ (·, t) are extended by zero for T ≤ t ≤ T + k. Then the result follows from the lemma of Kolmogorov-Riesz [7,Theorem 4.26] and the L ∞ boundedness of y τ z τ . We write For the estimate of I 1 , we integrate over the line segment [x, x + h] and employ a standard extension operator in L 2 : where here and in the following, C > 0 denotes a generic constant.
Because of the relative compactness of (y τ ) in L 2 , the right-hand side converges to zero uniformly in τ > 0. This finishes the proof.