Inhomogeneous Patlak-Keller-Segel models and Aggregation Equations with Nonlinear Diffusion in $\Real^d$

Aggregation equations and Patlak-Keller-Segel (PKS) models for chemotaxis with nonlinear diffusion are popular models for nonlocal aggregation phenomena and are a source of a number of interesting mathematical problems in nonlinear PDE. The purpose of this work is twofold. First, we continue our previous work, which focused on nonlocal aggregation modeled with a convolution. The goal of that work was to unify the local and global theory of these convolution-type models, including the identification of a sharp critical mass; however, some cases involving unbounded domains were left open. In particular, the biologically relevant case $\Real^2$ was not treated. In this paper, we present an alternative proof of local existence, which now applies to $\Real^d$ for all $d \geq 2$, and give global results that were left open. The proof departs from previous work in that it uses a more direct and intuitive regularization that constructs approximate solutions on $\Real^d$ instead of on sequences of bounded domains. Second, this work develops the local, subcritical, and small data critical theory for a variety of Patlak-Keller-Segel models with spatially varying diffusion and decay rate of the chemo-attractant.


Introduction
In this paper we study several types of aggregation models with nonlinear diffusion and nonlocal self-attraction. The primary focus is to develop and extend the relevant local, subcritical, and small data critical/supercritical theory. These results belong to the perturbative regime, as they all fundamentally treat the PDE as a nonlinear perturbation of the diffusion equation (see below for more information). We also present several non-perturbative global existence results for a class of critical problems.
The first general class of systems we study are those where the nonlocal self-attraction arises as the result of a convolution operator,
$$u_t + \nabla \cdot (u \nabla K * u) = \Delta A(u), \qquad u(0,x) = u_0(x) \in L^1_+(\mathbb{R}^d; (1+|x|^2)dx) \cap L^\infty(\mathbb{R}^d), \qquad (1)$$
where $L^1_+(\mathbb{R}^d; \mu) := \{ f \in L^1(\mathbb{R}^d; \mu) : f \geq 0 \}$. Equation (1) is of interest in mathematical biology as it models the competition between a species' desire to aggregate and to disperse. Dispersal is modeled via the potentially nonlinear diffusion term $\Delta A(u)$, and aggregation is modeled via the nonlocal advective term $\nabla \cdot (u \nabla K * u)$. The most well-known example of (1) is the parabolic-elliptic Patlak-Keller-Segel model, based on the original models of Patlak [37] and Keller and Segel [29]. For more information on the modeling aspects, see [26,25] for reviews of various chemotaxis models and [44,15,34,24,16] for more general swarming and aggregation models. In this paper, we extend our recent work [5] to provide a more satisfactory and complete local and global theory for (1) on $\mathbb{R}^d$ for $d \geq 2$. In [5], we studied the local and global existence and uniqueness of bounded, integrable solutions to (1) in bounded domains for $d \geq 2$ and all space for $d \geq 3$. The primary goal of the previous work was to unify the existing Patlak-Keller-Segel global existence theory [41,42,43,14,31,17,11] with the local existence and uniqueness theory for less singular versions of (1) [8]. In that work, the $\mathbb{R}^2$ case was not treated due to several technical difficulties. Since this case is very important for mathematical biology, we make a specific effort to treat it and discuss the difficulties in more detail below.
We present an alternative proof of local existence for (1) for a wide range of $A$ and $K$ which applies on $\mathbb{R}^d$, $d \geq 2$, for solutions with bounded second moment. The new proof is based on a regularization directly on $\mathbb{R}^d$, in contrast to [5]. One of the benefits of this regularization is that it allows one to rigorously justify the application of homogeneous Sobolev embeddings in formal arguments, which are crucial in deducing small data global existence and uniform boundedness in supercritical cases. We prove such results below following iteration techniques similar to those employed in [20,41,3]. We also expand the global existence results of [5] to estimate the critical mass for kernels with a logarithmic singularity at the origin in $\mathbb{R}^d$.
The second class of nonlocal aggregation models we study is the variable-coefficient parabolic-elliptic Patlak-Keller-Segel model,
$$\begin{cases} u_t + \nabla \cdot (u \nabla c) = \Delta A(u), \\ -\nabla \cdot (a(x) \nabla c) + \gamma(x) c = u, \\ u(0,x) = u_0(x) \in L^1_+(\mathbb{R}^d; (1+|x|^2)dx) \cap L^\infty(\mathbb{R}^d). \end{cases} \qquad (2)$$
We assume $a(x) \in C^1$ is strictly positive and $\gamma(x) \in L^\infty$ is non-negative. This PDE system is in many ways similar to (1); however, the two are not of the same form, as the solution to the equation for the chemical concentration, $c$, cannot be written in convolution form unless $a(x)$ and $\gamma(x)$ are both constant. In this paper we develop the local, subcritical and small data critical/supercritical theory for (2). The proofs are analogous to those for (1), with additional complications arising due to the different nature of the estimates for $c$, $\nabla c$, $D^2 c$ in terms of $u$ (see Appendix). Analysis of the critical case and the identification of the critical mass has been completed by I. Kim and one of the authors in a separate work [4].
One of the most important properties of (1) and (2) is that each dissipates the following free energy:
$$\mathcal{F}(u(t)) = \mathcal{S}(u(t)) - \mathcal{W}(u(t)). \qquad (3)$$
The entropy, $\mathcal{S}(u(t))$, and the interaction energy (potential energy), $\mathcal{W}(u(t))$, are given by
$$\mathcal{S}(u(t)) = \int \Phi(u(x,t))\,dx, \qquad \mathcal{W}(u(t)) = \frac{1}{2} \int u(x,t) c(x,t)\,dx,$$
with $c(x,t) = K * u$ if the system is of convolution type. The entropy density (internal energy density), $\Phi(z)$, is a strictly convex function satisfying $\Phi''(z) = A'(z)/z$. In fact, both (1) and (2) are formally the gradient flows for (3) in the Euclidean Wasserstein metric (see e.g. [1]). For (1) and (2) there is no fully developed theory for making this precise; however, some aspects can be recovered and have proven very useful [9,10]. In any case, the free energy (3) plays an important role, especially in the global theory, as in for example [41,14,17,11,12,5].
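The dissipation property can be checked formally in a few lines. The following is only a sketch for the convolution-type system, under assumptions that are not all visible in the text above: the relation $\Phi''(z) = A'(z)/z$, a symmetric kernel $K(x) = K(-x)$, and a solution smooth and decaying enough to integrate by parts.

```latex
% Formal computation only. Assumptions: \Phi''(z) = A'(z)/z, K(x) = K(-x),
% and u smooth with enough decay that all boundary terms vanish.
\begin{align*}
  \nabla A(u) &= A'(u)\nabla u = u\,\Phi''(u)\nabla u = u\,\nabla \Phi'(u), \\
  u_t &= \nabla \cdot \big( u \nabla ( \Phi'(u) - c ) \big), \qquad c = K * u, \\
  \frac{d}{dt}\mathcal{W}(u(t)) &= \frac12 \int \big( u_t\, K*u + u\, K*u_t \big)\,dx
     = \int u_t\, c \,dx \quad \text{(symmetry of } K\text{)}, \\
  \frac{d}{dt}\mathcal{F}(u(t)) &= \int \big( \Phi'(u) - c \big)\, u_t \,dx
     = -\int u \,\big| \nabla ( \Phi'(u) - c ) \big|^2 dx \;\leq\; 0 .
\end{align*}
```

For the variable-coefficient system (2), the same computation goes through formally because the elliptic operator $-\nabla \cdot (a \nabla \cdot) + \gamma$ is self-adjoint, so $\int u_t\, c\,dx = \int u\, c_t\,dx$.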

Notation and Conventions
We work on $\mathbb{R}^d$ for $d \geq 2$. For notational simplicity we denote the parabolic domain by $\mathbb{R}^d_T = \mathbb{R}^d \times [0,T]$ and the standard $L^p$ norm by $\|u\|_p = \|u\|_{L^p(\mathbb{R}^d)}$. We also introduce the following notation for the $k$th moments: We let $\eta(x) \in C_c^\infty(\mathbb{R}^d)$ with $0 < \eta(x) \leq 1$ for $|x| < 1$, $\eta(x) = 0$ for $|x| \geq 1$, and $\eta(x) \equiv 1$ for $|x| \leq 1/2$ be our canonical cut-off function and denote the standard mollifier $J_\epsilon v := \eta * v$.
We use $N$ to denote the Newtonian potential: By 'weighted Young's inequality' we mean for $a, b > 0$ and $1 = \frac{1}{p} + \frac{1}{q}$ and $\epsilon > 0$, Since we will be working with many largely irrelevant constants, we use the notation $f \lesssim_{p,k,\dots} g$ to denote $f \leq C(p,k,\dots)\,g$, where $C(p,k,\dots)$ is a generic constant that depends on $p$, $k$, etc.
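For reference, the two conventions named in this paragraph are commonly written as follows. This is a sketch under stated assumptions: the sign and normalization of $N$ are chosen so that $\Delta N = \delta$ (which agrees with the two-dimensional formula $N(x) = \frac{1}{2\pi}\log|x|$ used later in the text), $\omega_d$ denotes the volume of the unit ball, and the constants in Young's inequality are one standard choice that may differ from the authors'.

```latex
% Newtonian potential, normalized so that \Delta N = \delta (assumed convention);
% \omega_d is the volume of the unit ball in R^d.
\[
  N(x) =
  \begin{cases}
    \dfrac{1}{2\pi} \log |x|, & d = 2, \\[2ex]
    -\dfrac{1}{d(d-2)\,\omega_d}\, |x|^{2-d}, & d \geq 3 .
  \end{cases}
\]
% Weighted Young's inequality: apply ab <= a^p/p + b^q/q to the pair
% (\epsilon^{1/p} a, \epsilon^{-1/p} b), with 1/p + 1/q = 1 and a, b, \epsilon > 0.
\[
  ab = \big( \epsilon^{1/p} a \big) \big( \epsilon^{-1/p} b \big)
     \;\leq\; \frac{\epsilon\, a^p}{p} + \frac{\epsilon^{-q/p}\, b^q}{q} .
\]
```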

Definitions and Assumptions
We consider the general class of kernels introduced in [5], which includes fundamental solutions to elliptic PDEs and other commonly considered attractive kernels.
This definition ensures that $K$ is attractive, well-behaved at the origin, and has second derivatives that define bounded distributions on $L^p$ for $1 < p < \infty$. The obvious example of an admissible kernel is the Newtonian potential, which is effectively the most singular admissible kernel both at the origin and at infinity (in the sense that it decays the slowest). We remark that many of our results (§2.1 and §3) still hold if we replace condition (KN) with the assumption $K(x) = K(-x)$, allowing for non-radially symmetric and general attractive/repulsive type kernels.
We limit ourselves to diffusions that do not spread mass faster than linear diffusion; however, using the techniques of [17] one could also treat cases with fast diffusion. This is more general than the diffusion considered in [8,5], which was restricted to degenerate diffusion.
Definition 2 (Admissible Diffusion Functions). We say that the function $A(u)$ is an admissible diffusion function if:

Following [5,8,7] we use the following notion of weak solution, which is stronger than traditional distribution solutions. In $d \geq 3$, test functions are taken in $\dot{H}^1$, whereas in $d = 2$ minor adjustments must be made, as discussed below and in [7]. By density arguments, this is basically the same as taking test functions in $C_c^\infty$ and requiring various regularity assumptions on the solution. However, we prefer the current statement of the definition to emphasize the kind of test functions that we are interested in. Taking test functions in these spaces is important for the proof of uniqueness, which is based on an $\dot{H}^{-1}$ stability estimate [8,7,5,2].
Remark 2. The additional complication in $\mathbb{R}^2$ is due to the fact that the norm $\|\nabla f\|_2$ is not well-behaved in $\mathbb{R}^2$, since $\|\nabla f\|_2$ scales like $L^\infty$ in $\mathbb{R}^2$. Indeed, there exists a sequence of Schwartz functions with $\|\nabla f_n\|_2 = 1$ and $f_n \to \infty$ pointwise a.e. (consider $f(x) = \log\log(1+|x|^{-1})\mathbf{1}_{|x|<1}$ and the scaling $f_\lambda(x) = f(\lambda x)$, $\lambda \to 0$). In order to remove such pathologies from our space of test functions we follow [7], which also requires test functions to lie in $L^\infty(\mathbb{R}^2)$.
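The scaling claim in Remark 2 can be verified by a change of variables. The short computation below is a sketch for a generic $f$ with $\nabla f \in L^2$; it only illustrates why the dilation $f_\lambda(x) = f(\lambda x)$ leaves $\|\nabla f\|_2$ unchanged exactly when $d = 2$, just as it leaves $\|f\|_\infty$ unchanged in every dimension.

```latex
% Change of variables y = \lambda x, so dx = \lambda^{-d} dy.
\begin{align*}
  \|\nabla f_\lambda\|_2^2
    &= \int_{\mathbb{R}^d} \lambda^2 \, |(\nabla f)(\lambda x)|^2 \, dx
     = \lambda^{2-d} \int_{\mathbb{R}^d} |\nabla f(y)|^2 \, dy
     = \lambda^{2-d} \, \|\nabla f\|_2^2 , \\
  \|f_\lambda\|_\infty &= \|f\|_\infty .
\end{align*}
% For d = 2 the factor \lambda^{2-d} equals 1, so \|\nabla f\|_2 is dilation
% invariant and cannot control the pointwise size of f.
```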
Following [5], we now define a notion of criticality for (1), which in general has no scaling symmetries. However, a kind of scaling symmetry can be recovered in the limit of mass concentration, which in turn is expected to govern blow-up (see Theorem 3 below). Suppose K(x) ∼ |x| −d/p as From here we see the limit is independent of u unless d − dm = −d/p. If the limit is −∞ we expect aggregation to dominate near mass concentration and if the limit is +∞ we expect diffusion to dominate. As mass concentration should occur on vanishing length-scales, we may use this scaling heuristic to define a notion of criticality. The limiting case of p → ∞ occurs when considering K(x) ∼ − log |x| as x → 0 and is discussed more below. Of course this corresponds to the 2D classical parabolic-elliptic PKS model, but we can consider the same singularity in higher dimensions (as done in for example [28]).
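The scaling heuristic described above can be made explicit in a model case. The following is only a sketch under simplifying assumptions that are not part of the general setting: a pure power-law diffusion $A(u) = u^m$ with $m > 1$ (so one may take $\Phi(z) = z^m/(m-1)$), an exactly homogeneous kernel $K(x) = |x|^{-d/p}$, and the mass-preserving dilation $u_\lambda$ defined in the first line.

```latex
% Model case only: A(u) = u^m, \Phi(z) = z^m/(m-1), K(x) = |x|^{-d/p}.
\[
  u_\lambda(x) := \lambda^{-d}\, u(x/\lambda), \qquad \|u_\lambda\|_1 = \|u\|_1 .
\]
\begin{align*}
  \mathcal{S}(u_\lambda)
    &= \frac{1}{m-1} \int \lambda^{-dm}\, u(x/\lambda)^m \, dx
     = \lambda^{d - dm}\, \mathcal{S}(u), \\
  \mathcal{W}(u_\lambda)
    &= \frac12 \iint u_\lambda(x)\, |x-y|^{-d/p}\, u_\lambda(y) \, dx \, dy
     = \lambda^{-d/p}\, \mathcal{W}(u), \\
  \mathcal{F}(u_\lambda)
    &= \lambda^{d - dm}\, \mathcal{S}(u) - \lambda^{-d/p}\, \mathcal{W}(u) .
\end{align*}
% As \lambda \to 0 the two powers balance exactly when d - dm = -d/p,
% i.e. m = 1 + 1/p. For the Newtonian potential in d \geq 3, d/p = d - 2,
% which gives m = 2 - 2/d.
```

In this model case, $m > 1 + 1/p$ makes the entropy term dominate as $\lambda \to 0$ (diffusion wins), while $m < 1 + 1/p$ makes the interaction term dominate (aggregation wins).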
Then the critical exponent associated to K is given by For the variable-coefficient Patlak-Keller-Segel system (2) we take m ⋆ = 2 − 2/d.
Remark 3. Due to the monotonicity assumptions in Definition 1 (see also Lemma 1 above), for p < ∞ the definition is equivalent to requiring that K(x) = O(|x| −d/p ) as x → 0 which is the same as requiring K ∈ L p,∞ loc . Similarly, when m ⋆ = 1 we have that K(x) = O(log |x|) as x → 0.
Remark 4. The variable coefficient system (2) should be roughly as singular as the constant coefficient case, hence the corresponding definition in this case. Now we define the notion of criticality by relating the critical exponent of the kernel to the diffusion, again focusing on the limit of mass concentration. It is easier to define this notion in terms of the quantity A ′ (z), as opposed to using Φ(z) directly. This is not so surprising as A ′ (z) is precisely the local coefficient of diffusivity, and directly measures the strength of the diffusion relative to the mass density.
Definition 5 (Criticality). We say that the problem is subcritical if The following lemma, from [5], enumerates several important aspects of admissible kernels. Part (c) in particular provides a useful characterization of kernels with m ⋆ < 2 − 2/d. Lemma 1. Let K be admissible. Then each of the following is true: Moreover, C(p) p as p → ∞.
(c) Let 1 ≤ m * < 2 − 2/d and β be such that loc for some β > 1, then, D 2 K ∈ L β,∞ . We will also need the logarithmic Hardy-Littlewood-Sobolev inequality in order to relate the interaction energy to the Boltzmann entropy, as in for instance [21,14,5].
As noted above, for $d = 2$ the admissible kernels are generally only in $BMO(\mathbb{R}^2)$ and hence can grow logarithmically at infinity. This introduces a number of complications for the local and global well-posedness. To begin with, in the proof of the energy dissipation inequality, one must ensure that the interaction energy of the approximate solutions converges to the interaction energy of the weak solution being constructed. However, $K * f$ will be unbounded for general $f \in L^1 \cap L^\infty$; hence more care must be taken than in $d \geq 3$. $BMO$ is the dual of the Hardy space $H^1$ [40], a strict subset of $L^1$, which we define via duality, Accordingly, we have the natural analogue of Hölder's inequality [40], $\left| \int K f \, dx \right| \lesssim \|K\|_{BMO} \|f\|_{H^1}$, which in particular implies $K * f \in L^\infty(\mathbb{R}^d)$ whenever $K \in BMO$ and $f \in H^1$. The following lemma, found in [2], provides sufficient conditions on $f \in L^1$ for $f \in H^1$ and a useful estimate of the norm that supplies the convergence of the interaction energies.

Statement of Results
Theorem 1 (Local Existence and Energy Dissipation for Convolution-type Systems). Let $d \geq 2$, $K$ be admissible and $u_0 \in L^1_+(\mathbb{R}^d; (1+|x|^2)dx) \cap L^\infty(\mathbb{R}^d)$. Then there exists a $T > 0$ and a weak solution $u(t)$ of (1) which satisfies $u(t) \in C([0,T]; L^1)$ and $u(0) = u_0$. Moreover, $\mathcal{F}(u_0) < \infty$ and $u(t)$ satisfies the energy dissipation inequality (10).

For a rigorous interpretation of the free energy dissipation, see for example [19]. As mentioned above, the local existence for general models (1) for $d \geq 3$ was proved in [5], which extended the existence results of [41,14,8]. We present an alternative proof, specialized to $\mathbb{R}^d$, that has certain advantages. In particular, we treat $\mathbb{R}^2$. We also prove the corresponding theorem for variable-coefficient Patlak-Keller-Segel systems.
Theorem 2 (Local Existence and Energy Dissipation for Variable-Coefficient Systems). Let $d \geq 2$, $a(x) \in C^1$ be strictly positive such that $a + |\nabla a| \in L^\infty$ and let $\gamma(x) \in L^\infty$ be non-negative. In $d = 2$, further suppose that $\gamma(x)$ is strictly positive. If $u_0 \in L^1_+(\mathbb{R}^d; (1+|x|^2)dx) \cap L^\infty(\mathbb{R}^d)$, then there exists a $T > 0$ and a weak solution $u(t)$ of (2) which satisfies $u(t) \in C([0,T]; L^1)$ and $u(0) = u_0$. Moreover, $\mathcal{F}(u_0) < \infty$ and $u(t)$ satisfies the energy dissipation inequality (11).

Uniqueness for convolution-type systems is proven in [5] for $d \geq 3$; the same proof works for (2) using the elliptic estimates found in the Appendix. More recent work undertaken by J. Azzam and one of the authors [2] proves uniqueness for the $d = 2$ case. For completeness, we state a continuation theorem proved in [5], which extends previous theorems stated in [27,17,11]. The extension to cover (2) is straightforward and is briefly discussed below in §3.1.
Theorem 3 (Continuation [5]). The weak solution to (1) or (2) has a maximal time interval of existence One of the primary tools in the proofs of Theorems 1-3 is the use of Alikakos iteration methods commonly used in the study of these PDEs as for example [27,31,17,5,3]. These methods are fundamentally perturbative in nature (as are the related methods of [38,13]), depending on relatively crude Gagliardo-Nirenberg inequalities to overpower the nonlinear aggregation with diffusion only in certain regimes. In subcritical regimes this is sufficient and these methods prove global existence and uniform boundedness in L ∞ , as in [31,17]. In the critical and supercritical cases, one can prove the same provided that the initial condition is small in the corresponding critical norm and that the nonlinear diffusion compares favorably with the homogeneous diffusion even at low densities. As shown in for example [20,41,38,3], stronger decay estimates may also be deduced using various refinements of similar iteration methods.

Theorem 4 (Subcritical and Small Data Theory).
(i) (subcritical) Suppose that for z sufficiently large there exists some δ > 0 such that for m > m ⋆ . Then the solution is global and (ii) (small data in critical and supercritical cases) Suppose instead (13) is satisfied for all z > 0 is not strictly positive and d ≥ 3 then there exists a constant ǫ 0 = ǫ 0 (δ, m, d) > 0 such that if then the solution is global and u(t) ∈ L ∞ ((0, ∞) × R d ).
Remark 6. Part (i) follows easily from the techniques used to prove Theorem 3 found in [31,17,5] and will not be proved here. However, (ii) is not so immediate, especially for the variable-coefficient system (2) when $\gamma(x)$ is not strictly positive.
For critical problems, deducing global bounds and decay estimates for larger data requires fully non-perturbative techniques that depend heavily on the energy dissipation inequalities (10) and (11). The use of sharp functional inequalities to determine when mixed-sign energies such as (3) are coercive is the classical method of treating large data and determining the sharp critical mass, for example for the focusing nonlinear Schrödinger equations, marginally unstable thin film equations and, of course, PKS [45,46,30,14,17,11,5]. Our previous work [5] treated the case $m^\star > 1$ for critical convolution-type systems in $\mathbb{R}^d$, and in this work we complete the $m^\star = 1$ case, for example now covering the variants of the classical Patlak-Keller-Segel system in $\mathbb{R}^2$. For Patlak-Keller-Segel, the proof that the critical mass is sharp follows easily from a standard virial argument (see e.g. [14,41,43,11,5] or the more classical [35]), which can be modified in a straightforward manner to treat more general problems [5]. The corresponding program for the variable-coefficient systems (2) is a more difficult problem, completed by I. Kim and one of the authors elsewhere [4]. We also stress that (ii) is already known, e.g. [14,11], but we restate and prove it to provide comparison with the other problems that do not have scaling symmetries.
and let u(t) be the local-in-time weak solution to (1) with u(0) = u 0 and define M = u 0 1 .
(i) Suppose the problem is critical. Then there exists a critical mass M c such that if M < M c then u(t) exists globally. The estimate of M c is given below by (14). If K is bounded from below and (ii) Suppose m ⋆ = 2 − 2/d, A(u) = u 2−2/d and K = N , the Newtonian potential. Furthermore, suppose that M < M c . Then u(t) exists globally and satisfies Remark 9. Note the additional requirements in part (i) of Theorem 5 in order to assert global boundedness. Although we believe that solutions are bounded (and in many cases likely decay as in (ii)) our proof cannot rule out an unbounded increase of entropy or interaction energy as the solution spreads without the additional assumptions.
2 Local Existence and Energy Dissipation

Local Existence for Convolution-type Systems
This section focuses on the proof of local existence of weak solutions with bounded second moment. The proofs included here are simplifications of our work in [5] and combines techniques from the PKS model found in [14] and the non-singular aggregation-diffusion equations found in [8]. One advantage of the new techniques used here is the treatment of R 2 . Furthermore, in contrast to the work in [5], we obtain solutions directly in R d , removing the need for the intermediate step of finding solutions in bounded domains. Consider the regularized aggregation-diffusion equation We define, Proposition 2 (Local Existence for the Regularized System). Let ǫ > 0 be fixed and u 0 ( To prove the above proposition we need some preliminary definitions. Define the Hilbert space [22]. We will construct a weak solution to (15) with an analogous definition of weak solutions to Definition 3. For the remainder of the paper we denote the mass of the initial data, u 0 , by M , i.e. u 0 1 = M .
We prove Proposition 2 using the Schauder fixed point theorem, see e.g. [6]. The necessary compactness for the application is obtained via the Aubin-Lions lemma [39]. We now state and prove some a priori estimates that will be of use in the proof of Proposition 2. Some of these estimates are the same as, or closely related to, estimates proved elsewhere (e.g. [14,8,5]).
Lemma 4 (A priori bounds with linear advection). For fixed ǫ > 0 letũ ∈ L 2 (0, T ; L 2 )∩L ∞ (0, T ; L 1 ) be given. Let u ǫ be the global strong solution to Proof. In what follows denote M := u ǫ (t) 1 = u 0 1 . By (17) and integration by parts, once on the diffusion term and twice on the aggregation term we obtain Integrating implies, which gives (i). The bound (ii) follows similarly by estimating the growth of u(t) p and passing to the limit p → ∞. To continue, we need a bound on the L 1 norm of A ǫ (u ǫ ). Condition (D3) of Definition 2 implies that A ǫ (z) ≤ (C A + 2ǫ) z for some C A > 0 and sufficiently small z. Hence by Chebyshev's inequality, We now turn to the less trivial estimate (iii). Let η R (x) := η(xR −1 ) for some R > 0, where η is the smooth cut-off function defined above. Now takeÃ = A ǫ (u ǫ )η R as a test function in the definition of weak solution (Definition 3), which implies, Note also, we can apply the chain rule (see Lemma 14 in [5] or Lemma 6 in [8]) and get The latter error term can be bounded as follows Therefore, since A ǫ (u ǫ ) ∈ L 1 ∩ L ∞ , we have for all T < ∞, by taking R → ∞, This completes the proof of (iii). This bound, along with the fact that a ′ ǫ ≥ ǫ, gives us the bound in (iv).
The bound on the second moment of u ǫ , (v), follows from the bound on the L 1 norm of A ǫ (u): An application of Grönwall's inequality then provides the quantitative result. Let φ ∈ C([0, T ]; H 1 ), then by the definition of the weak solution (Definition 3, Remark 1) and Cauchy-Schwarz, Therefore the bounds we have already obtained can be combined to imply ∂ t u ǫ L 2 (0,T ;H −1 ) ≤ C(T,ũ, u 0 ). Now that we have all the required a priori estimates we are ready to prove Proposition 2.
Proof. (Proposition 2) Define the compact and convex subset of $L^2(0,T; L^2)$ for some $C_0$ to be chosen below. Compactness in the $L^2(0,T; L^2)$ topology follows from the Aubin-Lions lemma (since $V \subset\subset L^2$) and the fact that $S_T$ is closed, which is due to the weak compactness of $V$ and $L^2(0,T; H^{-1})$ and the lower semi-continuity of the $L^1$ norm.
Define the map $J : S_T \to S_T$ as follows: given $\tilde{u} \in S_T$ we define $J\tilde{u} = u$, where $u$ is the solution to (17). Our goal is to apply the Schauder fixed point theorem. First, we verify that $S_T$ is invariant under $J$.
Step 1: (Invariance of $S_T$) Let $\tilde{u} \in S_T$. Recall from classical regularity theory (e.g. [33] or [22]) that since $u$ satisfies the a priori estimate (ii) from Lemma 4 it is a global strong solution (of course it is not necessarily classical due to the potential irregularity of $\tilde{u}$). Furthermore, the additional bounds provided by Lemma 4 plus conservation of mass give that $u \in S_T$ for $C_0$ sufficiently large and $T$ sufficiently small. Note that $T$ and $C_0$ depend only on $M$, the $L^\infty$ norm of the initial data, and $\epsilon$. Indeed, (i) provides a bound on $\|u\|_{L^2(0,T;L^2)}$, (iv) provides a bound on $\|\nabla u\|_{L^2(0,T;L^2)}$, and (vi) provides a bound on $\|\partial_t u\|_{L^2(0,T;H^{-1})}$. Moreover, the bounds (ii) and (v) along with, provide an estimate on the first moment of the $L^2$ norm of $u$. Hence, $J : S_T \to S_T$. We are left to show that $J$ is a continuous map in $L^2(0,T; L^2)$.
Step 2: (Continuity in $L^2(0,T; L^2)$) We show $J$ is continuous as a mapping from $L^2(0,T; L^2)$ to $C([0,T]; \dot{H}^{-1})$, which, by interpolation against the uniform bounds in $H^1$ provided by (iv), implies continuity in $L^2(0,T; L^2)$ (as can be seen from the Fourier transform). This approach, as opposed to working in $L^2$ directly, is convenient due to the nonlinear diffusion, which behaves most naturally in $\dot{H}^{-1}$. This argument is reminiscent of the $\dot{H}^{-1}$ stability estimate used to prove uniqueness of weak solutions [8,7,5,2]. Let $\{v_n\}_{n \geq 0} \subset S_T$ be such that $v_n \to v$ in $L^2(0,T; L^2(\mathbb{R}^d))$. We show that $Jv_n \to Jv$ in $C([0,T]; \dot{H}^{-1}(\mathbb{R}^d))$. To this end, let $\phi_n := -N * (Jv_n - Jv)$ and denote $u_n := Jv_n$ and $u := Jv$. It is important to note that while the $v_n$ may not have constant $L^1$ norm, $u_n$ and $u$ do since they have the same initial data.
In order for us to proceed, we must show $-\Delta\phi_n = u_n - u$ in the sense of distributions, and that $\|\nabla\phi_n\|_\infty + \|\nabla\phi_n\|_2 \lesssim_{C_0, M, u_0, T} 1$. To prove the former, it suffices to show that $\phi_n(t) \in L^\infty(\mathbb{R}^d)$, which implies $-\Delta\phi_n$ is a bounded distribution. All of these are proven by arguments found in the proof of Theorem 1 in [2], so we only briefly recall them here. In dimensions $d \geq 3$, these facts can all be established using the $L^p$ estimates on $u_n$ and $u$ combined with Young's inequality and the fact that $N \in L^{d/(d-2),\infty}$ and $\nabla N \in L^{d/(d-1),\infty}$. However, in $d = 2$ the problem is more delicate as $N(x) = \frac{1}{2\pi}\log|x|$ grows at infinity. Lemma 3 implies $\phi_n(t) \in L^\infty$, as $\int (u_n - u)\,dx = 0$ and $u_n$, $u$ have uniformly bounded first moments. The bound $\|\nabla\phi_n(t)\|_2 \lesssim 1$ can be shown using the Fourier transform: $\int (u_n - u)\,dx = 0$ implies $\widehat{u_n - u}(0,t) \equiv 0$, and the uniform boundedness of the first moments of $u_n$, $u$ implies Lipschitz continuity of $\widehat{u_n - u}$. See [2] for more information. Now we may compute the evolution of $\|\nabla\phi_n\|_2^2 = \|u_n - u\|_{\dot{H}^{-1}}^2$: Therefore, using $\phi_n(t)$ in the definition of weak solution, $) \cdot \nabla\phi_n \, dx$.
We have dropped the time dependence for notational simplicity. Since $A_\epsilon$ is increasing, we have the desired monotonicity of the diffusion, Using integration by parts we have, Hence, Moreover, By the uniform bound on $\|u\|_\infty$ (by part (ii) above), the uniform bound on the mass in $S_T$ and the regularization of $K$, Integrating implies for some $C > 0$ depending on the uniform bounds and $\epsilon$ (using $\phi_n(0) \equiv 0$), Since $t \leq T$ we have, By assumption, $\|v_n(s) - v(s)\|_2 \to 0$ in $L^2((0,T))$ and hence also pointwise a.e. on $(0,T)$. Since $\|v_n(s) - v(s)\|_2^2$, by the dominated convergence theorem we have $\|\nabla\phi_n\|_2 \to 0$ uniformly on $(0,T)$. Therefore, on $S_T$, $J$ is a continuous mapping from $L^2(0,T; L^2)$ to $C([0,T]; \dot{H}^{-1})$ and hence, by interpolation against the uniform $H^1$ bounds provided by (iv) in Lemma 4, is also a continuous mapping from $L^2(0,T; L^2)$ to $L^2(0,T; L^2)$.
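The interpolation step invoked at the end of Step 2 can be spelled out as follows. This is a sketch assuming a unitary normalization of the Fourier transform, with $\|f\|_{\dot{H}^{s}} := \||\xi|^{s} \hat{f}\|_2$; other normalizations change only constants.

```latex
% Plancherel plus Cauchy-Schwarz in frequency space.
\begin{align*}
  \|f\|_2^2 = \int |\hat{f}(\xi)|^2 \, d\xi
    &= \int \big( |\xi|^{-1} |\hat{f}(\xi)| \big) \big( |\xi| \, |\hat{f}(\xi)| \big) \, d\xi
     \;\leq\; \|f\|_{\dot{H}^{-1}} \, \|\nabla f\|_2 .
\end{align*}
% Applied to f = u_n - u and integrated in time (Cauchy-Schwarz in t):
\begin{align*}
  \int_0^T \|u_n - u\|_2^2 \, dt
    &\leq \sup_{t \in [0,T]} \|u_n(t) - u(t)\|_{\dot{H}^{-1}}
       \; T^{1/2} \, \|\nabla (u_n - u)\|_{L^2(0,T;L^2)} .
\end{align*}
```

Since the last factor is bounded uniformly on $S_T$, convergence in $C([0,T]; \dot{H}^{-1})$ gives convergence in $L^2(0,T; L^2)$.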
Finally, we apply the Schauder fixed point theorem, which implies there exists a solution $Ju = u$ with $u \in S_T$ for some $T > 0$. By the regularization of $K$, it is straightforward to extend this solution to the regularized system indefinitely and to use a bootstrap argument to show that $u$ is a classical solution to (15). Proposition 2 provides a global family of classical solutions $\{u_\epsilon\}_{\epsilon > 0}$; however, we need to prove a priori bounds that are independent of the regularizing parameter $\epsilon$ in order to obtain any compactness and pass to a solution of the original system (1). The proof of compactness will strongly depend on the following iteration result.
Lemma 5 (Iteration Lemma [31,17]). Let $0 < T \leq \infty$ and assume that there exists a $c > 0$ and

The primary a priori bound required to obtain compactness is a bound on the $L^\infty$ norm which is independent of $\epsilon$. The proof of Lemma 8 in [5] applies directly to the following lemma; however, we include the proof for completeness. In this setting we simplify the proof by applying a homogeneous Gagliardo-Nirenberg inequality, Lemma 13.
Lemma 6 (L ∞ Bound of Solution). Let {u ǫ } ǫ>0 be the classical solutions to (15) on D T , with smooth, non-negative, and bounded initial data J ǫ u 0 . Then there exists C = C( u 0 1 , u 0 ∞ ) and T = T ( u 0 1 , u 0 p ) for any p > d such that for all ǫ > 0, Proof. For notational simplicity we drop the ǫ. The first step is to obtain an interval for which the L p norm of u is bounded for some p > d. Following the techniques commonly used, see for example [27,31,17,11,5], we define the function u ǫ k = (u ǫ − k) + , for k > 0. Due to conservation of mass the following inequality provides a bound for the L p norm of u given a bound on the L p norm of u k , We look at the time evolution of u k p .
Step 1: where we used the fact that for $l > 0$ Integrating by parts once more and using Lemma 1 (b) gives, Now, using (19) we obtain that An application of the homogeneous Gagliardo-Nirenberg inequality gives that for any $p$ such that $d < 2(p+1)$ (e.g. Lemma 13 in the Appendix): , Above $\beta_1, \beta_2 > 1$. For $k$ large enough we have that $A'(u) > c > 0$ over $\{u > k\}$; hence, if we choose $\delta$ small enough we obtain the final differential inequality: The inequality (21) in turn gives a $T_p = T(p) > 0$ such that $\|u_k\|_p$ is bounded on $[0, T_p]$, and by (19) $\|u\|_p$ then remains bounded on the same time interval.
Step 2: We are not done since the bounds in (21) blow up as $p \to \infty$. We follow the by now standard procedure and prove that the velocity field is bounded in $L^\infty(\mathbb{R}^d)$ on some time interval $[0,T]$. This allows us to invoke Lemma 5 and obtain the desired bound. Since $\nabla K \in L^1_{loc}$ and $\nabla K \mathbf{1}_{\mathbb{R}^d \setminus B_1(0)} \in L^q$ for all $q > d/(d-1)$ (by Lemma 1 (a)), then for all $p > d/(d-1)$ By Lemma 1 (b) we also have, for all $p$, $1 < p < \infty$, Then by Morrey's inequality we have $\nabla K * u_\epsilon \in L^\infty(\mathbb{R}^d)$ uniformly in $\epsilon$ by choosing some $p > d$ and invoking Step 1. Then Lemma 5 concludes the proof.
Proof. (Theorem 1) For all $\epsilon > 0$, let $u_\epsilon$ be the solution to (15) provided by Proposition 2. Lemma 6 provides a uniform-in-$\epsilon$ a priori upper bound on $\|u_\epsilon(t)\|_\infty$ on some time interval $(0,T)$ as well as a uniform bound on $\|\nabla K * u\|_\infty$. Hence, we may easily deduce the following a priori bound on the second moment: The first term is bounded in terms of $\|u\|_1$ and $\|u\|_\infty$ as in the proof of Lemma 4. By Grönwall's inequality we therefore have a uniform-in-$\epsilon$ a priori upper bound on the second moment. Similarly, one may alter the proof of (iii) in Lemma 4 to bound $\|\nabla A\|_{L^2(0,T;L^2)}$ independent of $\epsilon$ using the uniform bound on $\|u_\epsilon\|_\infty$. Using these a priori bounds we may follow the arguments of [5] (see also [8]) and prove the following lemma.
Equi-continuity relies on the fact that $\|A(u_\epsilon)\|_{L^2(0,T;H^1(D))} \leq C$ uniformly in $\epsilon$. The proof depends on the domain size, but when combined with tightness obtained from a priori bounds on the second moment (see below), this suffices. Hence, using Lemma 7 we may extract a subsequence $u_{\epsilon_k} \to u$ in $L^1((0,T) \times \mathbb{R}^d)$. Additionally, since $\|u_{\epsilon_k}\|_\infty$ is uniformly bounded, by lower semi-continuity we also have $u \in L^\infty((0,T) \times \mathbb{R}^d)$. Once the limit is extracted, we may further upgrade the convergence to $C([0,T]; L^1)$ as in [8,5] and hence to $C([0,T]; L^p)$ for all $1 \leq p < \infty$ due to the uniform $L^\infty$ bound. The last remaining technical point is to ensure that the limit is indeed a weak solution of the original system in the sense of Definition 3. This may be done using a limiting argument and the a priori bounds available on $u$ as in [5,8].
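The second-moment bound that supplies tightness in the proof of Theorem 1 can be sketched formally as follows. The computation is written for the unregularized equation (1) with $c = K * u$ and $M = \|u\|_1$, and assumes enough decay to integrate by parts; the precise form used by the authors for the regularized system may differ.

```latex
% Formal second-moment estimate for (1); uses \Delta |x|^2 = 2d, \nabla |x|^2 = 2x
% and the elementary bound 2|x| \leq 1 + |x|^2.
\begin{align*}
  \frac{d}{dt} \int |x|^2 u \, dx
    &= \int |x|^2 \, \Delta A(u) \, dx - \int |x|^2 \, \nabla \cdot (u \nabla c) \, dx \\
    &= 2d \int A(u) \, dx + 2 \int u \, x \cdot \nabla c \, dx \\
    &\leq 2d \, \|A(u)\|_1 + \|\nabla c\|_\infty \Big( M + \int |x|^2 u \, dx \Big) .
\end{align*}
% With \|A(u)\|_1 and \|\nabla c\|_\infty bounded on [0,T], Gronwall's inequality
% gives a bound on the second moment on [0,T].
```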
Proving the energy dissipation inequality in R 2 also presents an additional complication. As K is potentially logarithmically unbounded at infinity and hence the convergence of the interaction energy is non-trivial, we cannot follow the exact argument from [5]. Hence, let u(t) be the unique weak solution constructed above and {u ǫ } ǫ>0 be the solutions to the regularized system (15). We may follow the proof in [5] to prove convergence of the entropy and the energy dissipation, hence, we need only focus on the interaction energy. To this end, Since K ǫ (x) = K(x) for all |x| > 1 we have, Now consider T 2. By the duality of BM O and H 1 and Lemma 3 we have, for any 1 < p < ∞. However, u ǫ → u in C([0, T ]; L p ) for all such p, so the first term is not an issue.
To deal with the second term we first use Cauchy-Schwarz However, since u ǫ → u in C([0, T ]; L 1 ) and both u ǫ and u have uniformly bounded second moments on [0, T ] we have that T 2 → 0. The final term, T 3, follows similarly. Hence, the energy dissipation inequality holds in R 2 .

Local Existence for Spatially Inhomogeneous Patlak-Keller-Segel Systems
In this section we prove local existence of solutions of (2). Once again, we begin with a regularization, which in this case is given by where $A_\epsilon(u_\epsilon)$ is the parabolic regularization given by (16). This particular regularization is convenient because (22) satisfies a natural energy dissipation inequality, Indeed, using one can formally obtain Moreover, the regularization of the chemo-attractant only occurs on the right-hand side, which proves to be of some technical use. The proof of Theorem 2 is very similar to that of Theorem 1: one simply needs to check that $c$ satisfies properties sufficiently similar to those satisfied by convolution-type systems. Most of these properties are obtained via standard elliptic estimates, which we state and prove in the Appendix to be referenced as needed. As was done for the convolution-type systems, we first need estimates equivalent to those provided by Lemma 4 for the system where $\tilde{u} \in L^2(0,T; L^2) \cap L^\infty(0,T; L^1)$ is given. As above, this will allow us to obtain the necessary compactness and a priori estimates to apply the Schauder fixed point theorem and obtain a family of classical solutions for (23), as in Proposition 2.
Proof. (Theorem 2) As this proof follows that of Theorem 1, we simply mention and work out the differences. The first step is to prove the existence of solutions to the regularized system (23). This is done, as in the proof of Proposition 2, by a fixed point argument. Hence, we need to show that for a given $\tilde{u}$ the system (23) has a solution that satisfies the appropriate bounds. Examining the proof of Lemma 4, we see that all the bounds hold provided we have bounds on $\|\Delta J_\epsilon c\|_\infty$ and $\|\nabla J_\epsilon c\|_\infty$. Using Young's inequality for convolutions, when $\gamma > 0$ we obtain The estimates are different in the case when $\gamma \geq 0$, as we only have estimate (36) (necessarily we are assuming $d \geq 3$). We instead have, $\|\Delta J_\epsilon c\|_\infty \lesssim_\epsilon \|c\|_{\frac{6d}{5d-12}}$ Although this estimate involves a lower power of $\|\tilde{u}\|_2$, this does not pose a problem for our needs. A similar bound for $\|\nabla J_\epsilon c\|_\infty$ can be obtained. These bounds are sufficient to obtain a family of solutions $\{u_\epsilon\}_{\epsilon>0}$ on $[0,T_\epsilon)$ by following the proof of Proposition 2. Next, we need to prove a uniform-in-$\epsilon$ $L^\infty$ bound on the solutions, similar to Lemma 6. Recall that the first step in the proof of Lemma 6 was to obtain a bound on $\|u_\epsilon\|_p$ for $p > 1$. This required a bound of the form for example, to obtain the inequalities after (20). If $\gamma(x)$ is strictly positive, a proof of (24) can be found in Lemma 12 in the Appendix. For the remaining cases, $d \geq 3$, this estimate is replaced by which is obtained from Lemma 11, noting that $\frac{pd}{2p+d} < p$ and interpolating between the $L^p$ and $L^1$ norms. The next step was to use the bounds on the $L^p$ norms of $u_\epsilon$ to deduce a bound on $\|\nabla c_\epsilon\|_\infty$, independent of $\epsilon$. By Morrey's inequality and then the Gagliardo-Nirenberg inequality (38), If $\gamma$ is strictly positive then we may use Lemma 12 and we have If $\gamma$ is not strictly positive we have to use the $L^p$ estimate (36) and Lemma 11, which holds in $d \geq 3$ for $p > d/(d-2)$: $\|\nabla c\|_\infty \lesssim_{a,p,d} \|u\|_p + \|u\|_{\frac{pd}{d+2}} \lesssim \|u\|_p + \|u\|_1$.
Hence, we may proceed as in Lemma 6, apply Lemma 5, and deduce the requisite local-in-time, uniform-in-$\epsilon$ $L^\infty$ estimates. The last primary estimate we need before we can obtain enough pre-compactness to extract a subsequence which converges towards a solution to the original model (2) is a uniform estimate on $\|A(u)\|_{L^2(0,T;H^1)}$. Examining the proof of estimate (iii) in Lemma 4, one can see that the bound on $\|\nabla c\|_\infty$ is sufficient. As above, pre-compactness of $\{u_\epsilon\}_{\epsilon>0}$ follows from these uniform estimates. Finally, to obtain a solution to (2) the only real difference with convolution-type systems is that we need to check that $\|u_\epsilon \nabla c_\epsilon - u\nabla c\|_{L^2} \to 0$ as $\epsilon \to 0$. Indeed, we know that Furthermore, again using (38) we have The result follows from Lemma 12 or Lemma 11 and the convergence in $L^p$ of $J_\epsilon u_\epsilon$ to $u$. Hence, one may deduce that $u$ is a weak solution to (2) in the sense of Definition 3. We are left to prove the energy dissipation inequality (11). Convergence of the entropy term follows exactly as in the proof of inequality (10), given in Theorem 1. The interaction energy, on the other hand, requires a little more. As above, Given that $u_\epsilon \to u$ in $L^1$ and that we have uniform bounds on $\|c_\epsilon\|_\infty$ and $\|u_\epsilon\|_{L^1} = M$, we only have to verify the uniform convergence of $c_\epsilon$ to $c$. It follows from similar arguments as above, using Morrey's inequality and (38) Hence, the result follows from Lemma 11 and (36) or Lemma 12 and the $L^p$ uniform-in-time convergence of $u_\epsilon$ to $u$. The last step that requires attention is proving the convergence of the free energy dissipation (also called the generalized Fisher information), However, as in the convolution-type case in [8,5], it follows from a weak lower-semicontinuity result due to Otto [36] (see also [19]).

Continuation for Spatially Inhomogeneous Patlak-Keller-Segel Systems
The proof of Theorem 3 is very similar to that of Lemma 6 and largely follows the work of [17,11]. Both of these results quantify the strength of the nonlocal advection term with the inequality $\|\Delta K * u\|_p \lesssim_{p,d} \|u\|_p$.
For the system (2), unless $\gamma(x)$ is strictly positive, we only have the estimate provided by Lemma 11 in the Appendix: for $p > d/(d-2)$, $\|\Delta c\|_p \lesssim_{p,d} \|u\|_p + \|u\|_{\frac{pd}{2p+d}} \lesssim \|u\|_p + \|u\|_1$.
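The final step of this estimate can be unpacked as follows. This is a minimal sketch, assuming only that $u \in L^1 \cap L^p$; the names $r$ and $\theta$ are introduced here for illustration and are not notation from the original argument.

```latex
% Sketch: why \|u\|_{pd/(2p+d)} \lesssim \|u\|_p + \|u\|_1 when p > d/(d-2), d >= 3.
% r and \theta are illustrative names (assumptions), not notation from the paper.
\begin{align*}
r := \frac{pd}{2p+d} < p,
\qquad
r \geq 1 \iff pd \geq 2p + d \iff p \geq \frac{d}{d-2} .
\end{align*}
% Hence 1 <= r < p, so L^r interpolates between L^1 and L^p,
% and Young's inequality a^{1-\theta} b^{\theta} <= (1-\theta) a + \theta b finishes the bound:
\begin{align*}
\frac{1}{r} = (1-\theta) + \frac{\theta}{p}, \quad \theta \in [0,1),
\qquad
\|u\|_r \leq \|u\|_1^{1-\theta}\,\|u\|_p^{\theta}
\leq (1-\theta)\,\|u\|_1 + \theta\,\|u\|_p
\leq \|u\|_1 + \|u\|_p .
\end{align*}
```

In particular, the restriction $p > d/(d-2)$ is exactly what places the exponent $\frac{pd}{2p+d}$ strictly between $1$ and $p$.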
The lower-order term (which is constant due to conservation of mass) can be safely added into other lower-order terms in the argument which are already present due to the inequality (19). The difference between Lemma 6 and Theorem 3 is the use of a better Gagliardo-Nirenberg inequality to control the highest order contribution from the advection term: with $\alpha_2 > 0$ determined via scale invariance and homogeneity (Lemma 13). The latter factor can be related to the nonlinear diffusion and the former factor is the quantity appearing in (12). The key point is that the condition (12) then implies we may make the highest order contribution of the nonlinear advection arbitrarily small compared to the diffusion by choosing $k$ large.

Subcritical and Small Data Theory
Proof. (Theorem 4) We proceed with the formal computations, noting that they can be made completely rigorous by appealing to the parabolic regularization above. The proof follows a similar outline to the proof of Lemma 6 or Theorem 3, which is now standard in the literature on PKS and related models. As remarked above, the subcritical case is a standard variant of the proof of Theorem 3 [31,17,5]; therefore, we only prove (ii). We first prove the result for convolution-type systems and for variable-coefficient systems where $\gamma$ is strictly positive.
Step 2 ($L^\infty$ bound): Once more we use Lemma 5, which implies that if $\nabla c \in L^\infty((0,\infty)\times\mathbb{R}^d)$ then $u \in L^\infty((0,\infty)\times\mathbb{R}^d)$. If $m^\star = 2 - 2/d$, by Lemma 1 or Lemma 12, For systems of convolution type we have by weak Young's inequality, and for variable coefficient systems we may use (38) and Lemma 12. By Step 1, we may choose $\|u_0\|_{\bar q}$ sufficiently small such that both of these norms are uniformly bounded, which is sufficient to apply Morrey's inequality and conclude that $\nabla c \in L^\infty((0,\infty)\times\mathbb{R}^d)$.
As above $\alpha_1(p+1)/2 = 1$. Let $\theta \in (0,1)$ be such that $\|u\|_{\frac{(p+1)d}{2p+2+d}} \leq M^{1-\theta}\|u\|_{p+1}^{\theta}$. Therefore by (28) we have, If $\bar q = 1$ we may choose $M$ sufficiently small (depending on $p$) so that the first term is negative (in fact, as negative as we require, say $< -\delta/2$). Since again by interpolation $\|u\|_p^p \leq \|u\|_{p+1}^{p+1} + M$ we then have Hence for all $p$ we may choose $M$ sufficiently small such that the $L^p$ norm of $u(t)$ is uniformly bounded in time. From here we may proceed similarly to above (using Lemma 11 instead of Lemma 12) and eventually invoke Lemma 5. If $\bar q > 1$ more thought is necessary since $\|u(t)\|_{\bar q}$ is not a conserved quantity and the continuity argument used for convolution-type systems will need to be refined due to the presence of the lower-order term. From the previous argument in the $\bar q = 1$ case it is clear that the result will follow if for any $\epsilon > 0$, we may choose $\|u_0\|_{\bar q}$ sufficiently small to ensure $\|u(t)\|_{\bar q} < \epsilon$ uniformly. To this end we note that (30) holds for all $\bar q$: The result will follow from a continuity argument. If $\|u_0\|_{\bar q}$ is chosen such that then for some time (depending on $M$) this implies, which in turn implies, From here it is clear that if the mass is chosen sufficiently small then $\|u(t)\|_{\bar q}$ can be bounded uniformly by a sufficiently small constant. Hence we may proceed as after (30) above to finish the proof.
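The two interpolation facts used in this argument can be checked by hand. The following is a sketch under the assumptions $u \geq 0$, $\|u\|_1 = M$ and $p \geq 1$; the symbol $r$ is introduced here for illustration, and the elementary pointwise argument in (b) is one possible justification, not necessarily the route intended in the original proof.

```latex
% (a) The exponent \theta in \|u\|_r <= M^{1-\theta} \|u\|_{p+1}^{\theta},
%     with r := (p+1)d/(2p+2+d) (r is an illustrative name):
\begin{align*}
\frac{1}{r} = \frac{2}{d} + \frac{1}{p+1} = (1-\theta) + \frac{\theta}{p+1}
\quad\Longrightarrow\quad
\theta = 1 - \frac{2(p+1)}{dp},
\end{align*}
% so \theta lies in (0,1) exactly when dp > 2(p+1).
%
% (b) A pointwise bound for z >= 0 and p >= 1, integrated against dx:
\begin{align*}
z^p \leq
\begin{cases}
z^{p+1}, & z \geq 1, \\
z, & 0 \leq z < 1,
\end{cases}
\qquad\text{hence}\qquad
\|u\|_p^p = \int u^p\,dx \leq \int u^{p+1}\,dx + \int u\,dx = \|u\|_{p+1}^{p+1} + M .
\end{align*}
```

Note that the condition $dp > 2(p+1)$ in (a) cannot hold when $d = 2$, which is consistent with this part of the argument being restricted to $d \geq 3$.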

Large Data Theory for Critical Convolution-type Systems
As in [5,14,11], global existence in critical cases will follow from continuation (Theorem 3) in [5] provided $\int_{\{u>k\}} u \log u \, dx$ is bounded for large $k$, which will be deduced using the energy dissipation inequality (10). There are two difficulties here. Firstly, due to (D3), the nonlinear entropy $\int \Phi(u)\,dx$ and the Boltzmann entropy $\int u \log u\,dx$ are no longer uniformly bounded from below as the solution spreads out. Secondly, and perhaps more fundamentally, $K$ may not be bounded from below, which again can cause the potential energy to grow unboundedly as the solution spreads out. Both difficulties will be overcome by the following two standard lemmas. The first shows that the decay of the entropy is bounded from below by controllable quantities. and Proof. Following [14], by Jensen's inequality for probability measures ($d\mu = \frac{u}{M}\,dx$), Therefore, (8) holds. Next, by (D3), for some $\delta > 0$, $A'(z) \leq C_A$ for $z < \delta$. Let $h(z) = \int_1^z A'(s)s^{-1}\,ds$ and note that $\int \Phi(u)\,dx = \int_{\mathbb{R}^d}\int_0^u h(z)\,dz\,dx$. For $z < 1$ we have, Therefore, since $\log z$ is integrable at zero, we have the following by Chebyshev's inequality, Therefore, using (31) we obtain (32).
The following lemma establishes a uniform bound on the second moments for critical problems with $m^\star = 1$.
Lemma 9 (Second Moment Estimate). Let $A$ and $K$ be admissible and critical with $m^\star = 1$. Then, Proof. We argue formally, noting that the computations can easily be made rigorous with standard arguments. Computing the time evolution of the second moment, By (D1) and (D3) of Definition 2, and criticality in the sense of Definition 5, we necessarily have $A'(z) \lesssim 1$, and hence $\int A(u)\,dx \lesssim M$. By Definition 1 and $m^\star = 1$ we have, Therefore by integration, the lemma follows.
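For orientation, the formal second moment computation typically takes the following shape. This is a sketch based only on equation (1), assuming that $K$ is even (so that $\nabla K$ is odd) and that $u$ decays fast enough to integrate by parts without boundary terms; the definition $M_2(t) = \int |x|^2 u(t,x)\,dx$ is also an assumption here, and the precise form used in the original proof may differ.

```latex
% Formal evolution of the second moment M_2(t) = \int |x|^2 u dx for
% u_t + \nabla\cdot(u \nabla K * u) = \Delta A(u).
% Assumptions (not from the source): K even, enough decay to integrate by parts.
\begin{align*}
\frac{d}{dt} M_2(t)
&= \int |x|^2\, \Delta A(u)\,dx - \int |x|^2\, \nabla\cdot\big(u\, \nabla K * u\big)\,dx \\
&= 2d \int A(u)\,dx + 2 \int u(x)\, x \cdot (\nabla K * u)(x)\,dx \\
&= 2d \int A(u)\,dx + \iint (x-y)\cdot \nabla K(x-y)\, u(x)\,u(y)\,dx\,dy .
\end{align*}
% Line 2 uses \Delta |x|^2 = 2d and \nabla |x|^2 = 2x; line 3 symmetrizes in (x,y)
% using \nabla K(-z) = -\nabla K(z).
```

The first term is controlled by $\int A(u)\,dx \lesssim M$, while the second is where the structural assumptions on $K$ enter.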
We now prove Theorem 5.
Proof. (Theorem 5) We proceed formally, noting that the arguments can be made rigorous with the regularization procedure of the local existence theory. We begin by proving (i). We first prove the result for the case when either $K$ is not bounded below or the diffusion is not degenerate. As noted above, we will not get uniform-in-time bounds. Recall the energy dissipation inequality (10), By the asymptotic expansion of the kernel assumed in Proposition 1 and (BD), we have that for all $\epsilon > 0$, there exist $\delta, R > 0$ such that, Note that for $R > 0$ sufficiently large, By the logarithmic Hardy-Littlewood-Sobolev inequality (6), By Lemma 9, $M_2(t) \lesssim 1 + t$ for all $t < \infty$. Since $M < M_c$ as defined in (14), it is possible to choose $k$ large enough and $\epsilon$ small enough such that $\int_{\{u>k\}} u \log u\,dx$ is bounded on any finite time interval. This is sufficient to imply equi-integrability on any finite time interval and hence by Theorem 3 the solution $u(t)$ must be global.
We now refine the argument under the additional hypotheses that $K$ is bounded below and the diffusion is degenerate. First note (see e.g. [5]) that We now turn to the proof of (ii), which shows that with additional homogeneity assumptions the energy dissipation inequality can be used to deduce optimal decay results for critical problems when $M < M_c$. Recall this result is already known [14,11]. The key is that the scaling invariance can be used to transform the energy dissipation inequality into something significantly stronger. We proceed by considering the self-similar variables, as in [14,11,13,3], defining $\theta(\tau,\eta)$ such that $e^{-d\tau}\theta(\tau,\eta) = u(t,x)$, with coordinates $e^{\tau}\eta = x$ and $e^{d\tau} - 1 = d\,t$. In these coordinates, if $u(t,x)$ solves (1), by the homogeneity of the Newtonian potential we have, $\partial_\tau\theta = \nabla\cdot(\eta\theta) + \Delta\theta^{2-2/d} - \nabla\cdot(\theta\nabla N * \theta)$.
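The exponents in the rescaled equation can be checked term by term. The following is a sketch, assuming the equation in the original variables is (1) with $A(u) = u^{2-2/d}$ and $K = N$, and that $\nabla N$ is homogeneous of degree $1-d$; these assumptions are read off from the rescaled equation rather than stated explicitly in this passage.

```latex
% Change of variables: u(t,x) = e^{-d\tau}\theta(\tau,\eta), x = e^{\tau}\eta,
% t = (e^{d\tau}-1)/d, so that dt/d\tau = e^{d\tau}.
% Assumptions (not from the source): A(u) = u^{2-2/d}, K = N, \nabla N homogeneous of degree 1-d.
\begin{align*}
\partial_t u
&= e^{-2d\tau}\big(\partial_\tau\theta - \nabla_\eta\cdot(\eta\,\theta)\big), \\
\Delta_x u^{2-2/d}
&= e^{-2\tau}\, e^{-(2d-2)\tau}\, \Delta_\eta \theta^{2-2/d}
 = e^{-2d\tau}\,\Delta_\eta\theta^{2-2/d}, \\
\nabla_x\cdot\big(u\,\nabla_x N * u\big)
&= e^{-\tau}\, e^{-d\tau}\, e^{(1-d)\tau}\, \nabla_\eta\cdot\big(\theta\,\nabla_\eta N * \theta\big)
 = e^{-2d\tau}\,\nabla_\eta\cdot\big(\theta\,\nabla_\eta N * \theta\big).
\end{align*}
% Every term carries the same factor e^{-2d\tau}; dividing it out of
% u_t + \nabla\cdot(u\nabla N * u) = \Delta u^{2-2/d} gives the equation for \theta.
% Mass is preserved by the rescaling:
\begin{align*}
\int \theta(\tau,\eta)\,d\eta = \int e^{d\tau}\,u(t, e^{\tau}\eta)\,d\eta = \int u(t,x)\,dx .
\end{align*}
```

The fact that all three terms scale with the same factor is a reflection of the critical exponent $2 - 2/d$: for any other power the diffusion term would pick up a different exponential weight in $\tau$.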
Lemma 11 (Homogeneous gradient estimates). Let $d \geq 3$ and let $c$ be a strong solution, vanishing at infinity in $\mathbb{R}^d$, of $-\nabla\cdot(a(x)\nabla c) = f$.
Suppose $a(x) \in C^1$ is strictly positive and bounded, and $\nabla a$ is also uniformly bounded. Then if $p > d/(d-2)$ we have, Moreover, as $p \to \infty$, $C(a,p,d) \lesssim p$.
We now have to deal with the latter error terms. Using the interpolation inequality on page 250 of [23], which is essentially just (38), we have, $\|a\nabla c\|_{L^p(B_{\sigma R}(x_0))} \leq \|a\|_\infty \|\nabla c\|_{L^p(B_{\sigma R}(x_0))} \leq \|a\|_\infty R\epsilon\, \|D^2 c\|_{L^p(B_{\sigma R}(x_0))} + \|a\|_\infty \frac{C}{R\epsilon}\, \|c\|_{L^p(B_{\sigma R}(x_0))}$, where crucially the constant $C$ does not depend on $\sigma$ or $R$ ($C$ is some power of a constant of a Gagliardo-Nirenberg inequality which depends only on the geometry and not the diameter). Hence by choosing $\epsilon$ sufficiently small independent of $R$ we have, Since $p > d/(d-2)$, (36) ensures that the latter term will add up as we cover $\mathbb{R}^d$ with balls (using that we may take $R \gtrsim 1$ due to the bound on $\nabla a$). Therefore, $\|D^2 c\|_p \lesssim \|f\|_p + \|f\|_{\frac{dp}{2p+d}}$ Proof. Easy variant of the above using the better $L^p$ norm estimate (37) to control the lower order terms in (39).

Appendix: Gagliardo-Nirenberg
Lemma 13 (Homogeneous Gagliardo-Nirenberg). Let $d \geq 2$ and $f : \mathbb{R}^d \to \mathbb{R}$ satisfy $f \in L^p \cap L^q$ and $\nabla f^k \in L^r$. Moreover let $1 \leq p \leq rk \leq dk$, $k < q < rkd/(d-r)$ and Then there exists a constant $C_{GNS}$ which depends on $s, p, q, r, d$ such that where $0 < \alpha_i$ satisfy