A superposition principle for the inhomogeneous continuity equation with Hellinger-Kantorovich-regular coefficients

We study measure-valued solutions of the inhomogeneous continuity equation $\partial_t \rho_t + {\rm div}(v\rho_t) = g \rho_t$ where the coefficients $v$ and $g$ are of low regularity. A new superposition principle is proven for positive measure solutions and coefficients for which the recently-introduced dynamic Hellinger-Kantorovich energy is finite. This principle gives a decomposition of the solution into curves $t \mapsto h(t)\delta_{\gamma(t)}$ that satisfy the characteristic system $\dot \gamma(t) = v(t, \gamma(t))$, $\dot h(t) = g(t, \gamma(t)) h(t)$ in an appropriate sense. In particular, it provides a generalization of existing superposition principles to the low-regularity case of $g$ where characteristics are not unique with respect to $h$. Two applications of this principle are presented. First, uniqueness of minimal total-variation solutions for the inhomogeneous continuity equation is obtained if characteristics are unique up to their possible vanishing time. Second, the extremal points of dynamic Hellinger-Kantorovich-type regularizers are characterized. Such regularizers arise, e.g., in the context of dynamic inverse problems and dynamic optimal transport.


Introduction
The main objective of this paper is to present a new superposition principle for positive measure solutions to the linear inhomogeneous continuity equation, assuming natural regularity on the velocity field and on the source term. Such assumptions are substantially weaker than those currently available in the literature, as we will discuss below. To be more precise, given $\Omega \subset \mathbb{R}^d$ the closure of an open bounded domain, we consider narrowly continuous curves of positive measures $t \mapsto \rho_t$ in $\mathcal{M}^+(\Omega)$ solving
$$\partial_t \rho_t + {\rm div}(v\rho_t) = g\rho_t \quad \text{in } (0,1) \times \Omega \tag{1}$$
in the sense of distributions, where $v : (0,1) \times \Omega \to \mathbb{R}^d$ is a velocity field satisfying no-flux boundary conditions on $\partial\Omega$ and $g : (0,1) \times \Omega \to \mathbb{R}$ is a source term encoding the inhomogeneity of the equation. We assume that the coefficients $v$ and $g$ are Hellinger-Kantorovich-regular, namely, they are Borel measurable and satisfy the bound
$$\int_0^1 \int_\Omega |v(t,x)|^2 + |g(t,x)|^2 \, d\rho_t(x) \, dt < \infty . \tag{2}$$
In the following we will clarify the role of (2) in connection to recent advancements in the theory of Unbalanced Optimal Transport. Our task is to provide a superposition principle for (1) that allows one to represent any positive solution $t \mapsto \rho_t$ as a superposition of elementary solutions, that is, curves of measures of the form $t \mapsto h(t)\delta_{\gamma(t)}$, where the trajectories $\gamma : [0,1] \to \Omega$ and the weights $h : [0,1] \to [0,\infty)$ solve, in an appropriate sense, the system of characteristics for (1):
$$\text{(i)} \ \ \dot\gamma(t) = v(t,\gamma(t)) , \qquad \text{(ii)} \ \ \dot h(t) = g(t,\gamma(t))\,h(t) \quad \text{in } (0,1) . \tag{3}$$
Notice that (i) describes all possible elementary trajectories which follow the flow given by $v$, while (ii) encodes the lack of mass preservation for solutions to (1), due to the inhomogeneity. The precise statement of this superposition principle is given in Theorem 1.1 below. Subsequently we provide two applications of the superposition principle for (1). First, we prove uniqueness for minimal norm solutions to (1) under the assumption of uniqueness for solutions to (3) up to their possible vanishing time (see Theorem 1.2). Second, we characterize the extremal points of regularizers closely related to the energy at (2), and apply this result to sparsity for dynamic inverse problems regularized via unbalanced optimal transport (see Theorem 1.3).
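To see why curves of the form $t \mapsto h(t)\delta_{\gamma(t)}$ are natural elementary solutions, the following short computation may help. It is only a sketch, under the additional assumption (not made in the text at this point) that $\gamma$ and $h$ are absolutely continuous on $[0,1]$ and solve (3) almost everywhere. The test function $\varphi \in C^1_c((0,1) \times \Omega)$ is introduced purely for this illustration.

```latex
\begin{align*}
\frac{d}{dt}\Big[ h(t)\,\varphi(t,\gamma(t)) \Big]
 &= \dot h(t)\,\varphi(t,\gamma(t))
  + h(t)\,\Big[ \partial_t\varphi(t,\gamma(t))
  + \nabla\varphi(t,\gamma(t))\cdot\dot\gamma(t) \Big] \\
 &= h(t)\,\big( \partial_t\varphi + \nabla\varphi\cdot v + g\,\varphi \big)(t,\gamma(t)) .
\end{align*}
% Integrating over (0,1) and using that \varphi vanishes near t = 0 and t = 1:
\[
\int_0^1\!\!\int_\Omega \big( \partial_t\varphi + \nabla\varphi\cdot v + g\,\varphi \big)
 \, d\rho_t \, dt = 0 ,
\qquad \rho_t := h(t)\,\delta_{\gamma(t)} .
\]
```

The last identity is the distributional form of (1) tested against $\varphi$, so under these assumptions each such curve is a solution concentrated on a single moving point with varying mass.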
Concerning relevant literature, we mention that the superposition principle for narrowly continuous curves of probability measures $t \mapsto \rho_t$ solving the homogeneous continuity equation
$$\partial_t \rho_t + {\rm div}(v\rho_t) = 0 \quad \text{in } (0,1) \times \Omega \tag{4}$$
is by now classical. It was first introduced in the Euclidean setting by Ambrosio in [4], where it was employed to investigate uniqueness and stability of Lagrangian flows in the context of DiPerna-Lions Theory [25]. Since then it has been applied to different tasks [5,10,11,12,13] and extended to various settings [15,34,37,44]. In [3] the velocity field $v$ is assumed to satisfy
$$\int_0^1 \int_\Omega |v(t,x)|^2 \, d\rho_t(x) \, dt < \infty . \tag{5}$$
An elementary solution to (4) is of the form $t \mapsto \delta_{\gamma(t)}$, where $\gamma : [0,1] \to \Omega$ is an absolutely continuous curve solving the characteristic equation (i) in (3). Due to the lack of regularity of $v$, solutions to the initial value problem associated to (i) are in general not unique. Such non-uniqueness is reflected in the superposition formula, which in this case is achieved by constructing a probability measure $\sigma$ on the set $\Gamma := C([0,1];\Omega)$. To be more precise, it can be shown that if $\rho_t \in \mathcal{M}^+(\Omega)$ is a narrowly continuous solution to (4) and $v$ satisfies (5), then there exists a measure $\sigma \in \mathcal{M}^+(\Gamma)$ concentrated on absolutely continuous curves satisfying (i), with the property that $\rho_t$ can be represented by the pushforward of $\sigma$ via the evaluation map $e_t(\gamma) := \gamma(t)$, that is,
$$\int_\Omega \varphi(x) \, d\rho_t(x) = \int_\Gamma \varphi(\gamma(t)) \, d\sigma(\gamma) \quad \text{for all } \varphi \in C(\Omega) , \ t \in [0,1] . \tag{6}$$
We refer the reader to [3, Theorem 8.2.1] for a proof of (6) with $\Omega = \mathbb{R}^d$ and to [20, Theorem 7] for the case of $\Omega$ being the closure of a bounded domain. A generalization of (6) for positive measure solutions to the inhomogeneous continuity equation (1) in $\Omega = \mathbb{R}^d$ is presented in [37]. Specifically, the following is proven in [37, Theorem 4.1]: suppose that $\rho_t \in \mathcal{M}^+(\Omega)$ is a narrowly continuous solution to (1), that $v$ satisfies (5) and $g$ is bounded; then there exists a representing measure $\sigma \in \mathcal{M}^+(\Gamma \times \Omega)$, concentrated on pairs $(\gamma, x)$ with $\gamma$ an absolutely continuous curve solving (i) in (3) with the initial condition $\gamma(0) = x$, and such that $\rho_t$ is represented via the implicit formula
$$\int_\Omega \varphi(x) \, d\rho_t(x) = \int_{\Gamma \times \Omega} \varphi(\gamma(t)) \, d\sigma(\gamma, x) + \int_0^t \int_\Omega \int_\Gamma \varphi(\gamma(t)) \, d\sigma^x_s(\gamma) \, g(s,x) \, d\rho_s(x) \, ds \tag{7}$$
for all $\varphi \in C(\Omega)$, where for fixed $t$, the family $\{\sigma^x_t\}_{x \in \Omega}$ is the disintegration of $\sigma$ with respect to $(\tilde e_t)_\# \sigma \in \mathcal{M}^+(\Omega)$, with $\tilde e_t(\gamma, x) := \gamma(t)$. There are two main drawbacks with the superposition principle from [37]: first, the representation formula (7) is implicit; second, the source term $g$ is required to be bounded. This assumption on $g$ is substantial, as it implies uniqueness of solutions to (ii) in (3) along any trajectory. This fact essentially allows the author of [37] to construct the measure $\sigma$ in (7) in the same way as the one in (6). Another limitation of [37] is that it is not possible to provide a representation via (7) for solutions whose mass vanishes or is generated from zero during the evolution (for an example, see Remark 4.6).
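A simple example, given here only as a hedged illustration and not taken from [37] or from Remark 4.6, indicates why a bounded source term cannot generate mass from zero while a quadratic energy bound can. The point $x_0 \in \Omega$ and the constant $c \ge 0$ are hypothetical choices, and the energy is computed assuming that (2) is the integral of $|v|^2 + |g|^2$ against $dt \otimes \rho_t$.

```latex
% Hypothetical data: v \equiv 0, \quad g(t,x) := 2/t, \quad \rho_t := c\,t^2\,\delta_{x_0}.
\begin{align*}
\partial_t \rho_t
 &= 2c\,t\,\delta_{x_0}
  = \frac{2}{t}\,\big( c\,t^2\,\delta_{x_0} \big)
  = g\,\rho_t ,
 \qquad \rho_0 = 0 , \\
\int_0^1\!\!\int_\Omega |g(t,x)|^2 \, d\rho_t(x)\, dt
 &= \int_0^1 \frac{4}{t^2}\; c\,t^2 \, dt
  = 4c < \infty .
\end{align*}
% If instead |g| \le M, testing (1) with \varphi \equiv 1 and applying
% Gronwall's inequality gives \rho_t(\Omega) \le e^{Mt}\,\rho_0(\Omega),
% so \rho_0 = 0 would force \rho_t = 0 for all t.
```

Here $g$ is unbounded near $t = 0$, yet the energy is finite for every $c \ge 0$, and all of these curves start from the same datum $\rho_0 = 0$.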
The main focus of this paper is to obtain a superposition principle for (1) which overcomes the above mentioned limitations of [37]. Indeed we obtain an explicit representation formula for (1) that resembles (6). In addition, we remove the boundedness assumption on g, and we replace it by the growth condition (2). Removing such assumption on g is far from straightforward, as it requires a new functional analytic framework for constructing a representation measure σ. In fact, the low regularity of g implies non-uniqueness for the initial value problem associated with (ii) in (3). This suggests that a measure σ representing a solution t → ρ t to (1) has to account for non-uniqueness both for the trajectories γ and the weights h. Therefore, σ cannot just be a measure on Γ, but rather on a space of pairs (γ, h), as discussed in Theorem 1.1 below. We now discuss the coupling of the continuity equation at (1) with the energy at (2), which is at the center of recent important developments in the theory of Unbalanced Optimal Transport. The classical theory of Optimal Transport, in its Monge-Kantorovich formulation [29,42,47], concerns the problem of transporting mass from a probability measure into a target one, while minimizing a given cost. Benamou and Brenier [9] made the crucial observation that the classical formulation of optimal transport has a dynamic counterpart, which links the continuity equation (4) with the energy at (5). More precisely they observed that it is possible to compute the optimal transport between two probability measures ρ 0 and ρ 1 by minimizing the dissipation at (5) among all the curves of probability measures t → ρ t and velocity fields v solving the continuity equation (4) with initial and final conditions given by ρ 0 and ρ 1 respectively. 
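For orientation, the dynamic formulation of Benamou and Brenier mentioned above can be written as follows. This is the standard statement for the quadratic cost, recalled as a sketch under the assumption that $\Omega$ is convex (or $\Omega = \mathbb{R}^d$ with finite second moments). The symbol $W_2$ for the quadratic Wasserstein distance is notation introduced only for this display.

```latex
\[
W_2(\rho_0,\rho_1)^2
 = \min \left\{ \int_0^1\!\!\int_\Omega |v(t,x)|^2 \, d\rho_t(x)\, dt \;:\;
   \partial_t \rho_t + {\rm div}(v\rho_t) = 0 , \
   \rho_{t=0} = \rho_0 , \ \rho_{t=1} = \rho_1 \right\} .
\]
```

The unbalanced theory discussed next follows the same pattern, with the homogeneous constraint replaced by (1) and the kinetic term complemented by a penalty on the source $g$.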
This dynamic formulation makes it possible to endow the space of probability measures with a differentiable structure [3], bringing to light deep connections between optimal transport and functional analytic issues, such as the characterization of differential equations as gradient flows in spaces of measures [3,7,8,27,28,41,40] or the derivation of sharp inequalities [1,24,35,36,38,39]. Particularly in connection to applications, the assumption of mass preservation during the evolution is quite restrictive. Overcoming this limitation is at the core of the so-called unbalanced optimal transport theory. Among the various formulations, we highlight the one introduced in [22,31,33]. There, transporting a positive measure $\rho_0$ into a target one $\rho_1$ corresponds to minimizing a weighted version of (2) among all curves of positive measures $t \mapsto \rho_t$ and fields $v$, $g$ satisfying the inhomogeneous continuity equation (1) with initial and final conditions given by $\rho_0$ and $\rho_1$ respectively. The quantity at (2) takes the name of Wasserstein-Fisher-Rao or Hellinger-Kantorovich energy in the literature. Such an approach has been successfully employed in applications where mass preservation is violated [21,23,32,43]. In particular in [33] it is shown that the above minimization procedure induces a distance which is compatible with a differentiable structure on the space $\mathcal{M}^+(\Omega)$. This distance can also be derived from the dynamic formulation of the Logarithmic-Entropy Optimal Transport problem [33] or can be regarded as dissipation energy for a certain class of scalar reaction-diffusion equations [32]. We conclude this introduction by discussing in more detail the superposition principle we propose for (1), as well as the applications provided in this paper. The rest of the manuscript is organized as follows. In Section 2 we introduce basic notation and present some results on continuity equations and optimal transport energies.
In Section 3 we set the functional analytic framework needed in order to prove our superposition principle. In particular we investigate properties of the Hellinger-Kantorovich energy (2) when restricted to elementary solutions to (1). In Section 4 we provide a proof for the main result of this paper, that is, the superposition principle in Theorem 1.1 below. Finally, in Sections 5, 6 we detail applications of the superposition principle to uniqueness for solutions to (1) and to sparsity for dynamic inverse problems with Hellinger-Kantorovich-type regularizers.
1.1. Main result. To obtain a superposition principle for (1) under the energy bound (2) we construct a positive measure $\sigma$ on the set $S_\Omega$ of narrowly continuous curves $t \mapsto \rho_t$ with values in
$$C_\Omega := \{ h\delta_\gamma \in \mathcal{M}(\Omega) : h \ge 0 , \ \gamma \in \Omega \} .$$
We endow $C_\Omega$ with the flat distance of measures and $S_\Omega$ with the respective supremum distance. In this way $S_\Omega$ becomes a separable metric space. Notice that $S_\Omega$ plays the role of the set of continuous curves $\Gamma$ in (6). As we will see, cf. Remark 3.2, the construction of $C_\Omega$ closely resembles the cone space introduced in [32,33] to study absolutely continuous curves with respect to the Hellinger-Kantorovich distance. It is immediate to check that elements of $S_\Omega$ can be represented by $\rho_t = h(t)\delta_{\gamma(t)}$, for some nonnegative weight $h \in C[0,1]$ and curve $\gamma \in C(\{h > 0\}; \Omega)$, where we set $\{h > 0\} := \{t \in [0,1] : h(t) > 0\}$. Thus, the mass of the elements of $S_\Omega$ varies continuously in time and is allowed to vanish, reflecting the behavior of solutions to (1). The measure $\sigma$ we construct is concentrated on elements $\rho_t = h(t)\delta_{\gamma(t)} \in S_\Omega$, with $h$ and $\gamma$ solving the system of ODEs
$$\dot\gamma(t) = v(t,\gamma(t)) \ \ \text{a.e. in } \{h > 0\} , \qquad \dot h(t) = g(t,\gamma(t))\,h(t) \ \ \text{a.e. in } (0,1) . \tag{8}$$
Notice that, in comparison to the system of characteristics at (3), we are restricting the first ODE to the set {h > 0}. Indeed, if h(t) = 0, then ρ t = 0 and thus we lose any information on the trajectories for that time instant. The above observations are formalized in the following theorem, which is the main result of our paper (c.f. Theorem 4.3).  (2) and such that v has no flux on ∂Ω. Then there exists a measure σ ∈ M + (S Ω ) concentrated on curves of measures ρ t = h(t)δ γ(t) with h, γ solving (8) and such that Conversely, assume that σ ∈ M + (S Ω ) is concentrated on solutions to (8) and satisfies Then (9) defines a narrowly continuous curve of positive measures solving (1).
Notice that the growth condition (10) is natural, in the sense that if a measure $\sigma$ represents $\rho_t$ and (2) holds, then automatically $\sigma$ satisfies (10). We refer the reader to Remark 4.4 below for more details. We also remark that the set $\Omega$ in Theorem 1.1 is required to be bounded. It would be interesting to extend our result to unbounded domains, in the spirit of [3,37] where $\Omega = \mathbb{R}^d$ is considered. However, it seems that a different proof strategy or stronger assumptions are required, see Remark 4.7 below for details. Moreover, similarly to [3,37], it should be possible to prove a version of Theorem 1.1 in which (2) is replaced by an $L^p$ bound for $1 \le p \le \infty$. Such analysis falls outside the scope of our paper. The proof of Theorem 1.1 is presented in Section 4. It is based on a smoothing strategy similar to the one employed in [4] to prove (6). However, in this case there are two main differences: first, one needs to establish compactness properties for a coercive version of the Hellinger-Kantorovich energy when restricted to elements of $C_\Omega$, see Proposition 3.10; second, the smoothing needs to take into account the possibility of the measure $\rho_t$ vanishing at some time instant, as detailed in Remark 4.6 below.

1.2. Uniqueness of solutions to the continuity equation. In Section 5 we present the first application of the superposition principle of Theorem 1.1. Our aim is to show that uniqueness of solutions for the system of ODEs at (3), up to their possible vanishing time, implies uniqueness for measure solutions to the inhomogeneous continuity equation (1) satisfying the bound (2) and with minimal total variation. The key ingredient of the proof is formula (9), which allows one to decompose any solution of (1) satisfying the bound (2) into a superposition of elementary curves $t \mapsto h(t)\delta_{\gamma(t)}$ such that $(\gamma, h)$ are solutions to (8). Such a representation allows one to link uniqueness for (3) with uniqueness for (1). The main difference between our result and the classical one for the homogeneous continuity equation [6, Theorem 9] lies in the fact that elementary solutions $\rho_t = h(t)\delta_{\gamma(t)}$ are allowed to vanish in time. In this case uniqueness for (3) is not enough to ensure uniqueness of solutions to the inhomogeneous continuity equation. Indeed, when the mass of a solution vanishes at a given time instant $\bar t \in (0,1)$, the uniqueness assumption for (3) does not provide any information on the behavior of the solutions for $t > \bar t$: this is because the measure $\sigma$ is concentrated on solutions to (8), where the first ODE is only valid in the set $\{h > 0\}$. Therefore, in order to recover uniqueness for (1), we impose an extra constraint on the total variation of its solutions. More precisely, we show that solutions to (1) with minimal mass can be represented, invoking Theorem 1.1, by a measure $\sigma$ concentrated on curves $t \mapsto h(t)\delta_{\gamma(t)}$ such that $(\gamma, h)$ solves (8) and $h$ is strictly positive in an interval $[0, \tau) \cap [0,1]$ for some $\tau \in \mathbb{R}$. This observation allows one to employ uniqueness for the system of characteristics at (3), up to their possible vanishing time, to infer uniqueness for measure solutions to (1) with minimal total variation. We obtain the following theorem, cf. Theorem 5.1.
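The role of the minimal total variation constraint can be illustrated by a hedged toy example, which is not taken from Section 5. The vanishing time $1/2$, the point $x_0 \in \Omega$ and the constant $c \ge 0$ are hypothetical choices made only for this illustration.

```latex
% Hypothetical data: v \equiv 0, \quad g(t,x) := \frac{2}{t - 1/2} \quad (t \neq 1/2).
\[
h_c(t) :=
\begin{cases}
 (1/2 - t)^2    & \text{if } 0 \le t \le 1/2 , \\
 c\,(t - 1/2)^2 & \text{if } 1/2 < t \le 1 ,
\end{cases}
\qquad
\rho^c_t := h_c(t)\,\delta_{x_0} .
\]
% Each h_c is C^1 and satisfies \dot h_c = g\,h_c for t \neq 1/2, with
\[
\int_0^1 |g(t)|^2\, h_c(t)\, dt
 = 4\cdot\tfrac{1}{2} + 4c\cdot\tfrac{1}{2}
 = 2 + 2c < \infty .
\]
```

All curves $\rho^c$ share the initial datum $\rho_0 = \tfrac14 \delta_{x_0}$ and the same coefficients, and the weight is uniquely determined only up to the vanishing time $1/2$. Among them, the choice $c = 0$ has the smallest total mass $\int_0^1 h_c(t)\,dt$, in line with the selection mechanism described above.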

1.3.
Extremal points of the Hellinger-Kantorovich energy. In the context of inverse problems, the knowledge of the structure of extremal points of the regularizer allows to numerically reconstruct sparse solutions, i.e., solutions given by the superposition of finitely many extremal points [17,18]. It has been recently proposed [21] to regularize dynamic inverse problems via an energy related to the one at (2). To be more specific, the energy at (2) can be recast into a convex functional B δ over the space M := M((0, 1) × Ω) d+2 defined by (11) B δ (ρ, m, µ) : if ρ ≥ 0, m, µ ≪ ρ, and set to ∞ otherwise, where δ > 0 is a parameter (c.f. Section 2.2). The regularizer studied in [21] consists in the energy at (11) to which the total variation of ρ is added, while enforcing the continuity equation constraint ∂ t ρ + div m = µ. An analysis of the extremal points of such energy is currently missing in the literature: Therefore, in this paper, we employ the superposition principle of Theorem 1.1 to characterize the extremal points of the set where α, β > 0 are parameters. Notice that we do not impose boundary conditions in the continuity equation at (12). Moreover the total variation of ρ is added to the functional B δ , in order to enforce coercivity, and thus compactness of B. We prove the following result (c.f. Theorem 6.3).
In the above we denote by AC 2 the set of absolutely continuous functions with a.e. derivative in L 2 (see [3, Section 1.1] for a precise definition). Theorem 1.3 is a generalization of the results obtained in [20], where the Benamou-Brenier energy with homogeneous continuity equation constraint is considered. In Section 6.2 we apply Theorem 1.3 to understand the structure of sparse solutions for dynamic inverse problems with unbalanced optimal transport regularization. In particular, we consider the inverse problem proposed in [21], where the minimization of the energy at (12) is coupled with a fidelity term penalizing the distance between ρ and some fixed observation. Applying recent results on sparsity [16,19] we show that the minimization problem in [21] admits a solution which is a finite linear combination of extremal points of B, that is, of curves as described in Theorem 1.3.

Preliminaries
For measure theory notation and definitions we follow [2]. Given a metric space $Y$ we denote by $\mathcal{M}(Y)$, $\mathcal{M}(Y; \mathbb{R}^d)$, $\mathcal{M}^+(Y)$ the spaces of bounded Borel measures, bounded vector Borel measures, bounded positive Borel measures on $Y$, respectively. Throughout the paper, whenever we say that a set or a function is measurable, we always mean Borel measurable, i.e., measurable with respect to the Borel $\sigma$-algebra. For a measure $\mu$ we denote its total variation measure by $|\mu|$. We say that a sequence of measures $\{\mu_n\}_n$ on $Y$ converges narrowly to $\mu$ if $\int_Y \varphi(y) \, d\mu_n(y) \to \int_Y \varphi(y) \, d\mu(y)$ for all $\varphi \in C_b(Y)$, where $C_b(Y)$ denotes the set of real valued continuous and bounded functions on $Y$.
Let Ω ⊂ R d be the closure of a bounded domain, with d ∈ N, d ≥ 1, and define the time-space domain X Ω := (0, 1) × Ω. We say that ρ ∈ M(X Ω ) disintegrates with respect to time if there exists a Borel family of measures is continuous for each fixed ϕ ∈ C(Ω). The family of narrowly continuous curves is denoted by C w ([0, 1]; M(Ω)). Notice that if t → ρ t is narrowly continuous, by the principle of uniform boundedness, it follows that ρ := dt ⊗ ρ t belongs to M(X Ω ). We also introduce C w ([0, 1]; M + (Ω)) as the family of narrowly continuous curves with values into M + (Ω). The above definitions extend verbatim to the case Ω = R d . . We say that the triple (ρ, m, µ) ∈ M Ω solves the continuity equation (13) ∂ t ρ + div m = µ in X Ω , whenever (13) holds in the sense of distributions, i.e., ˆX Here, ρ represents a density, m a momentum field advecting ρ, while µ is a source term accounting for mass change. The above definition also holds for unbounded spatial domains, e.g., Ω = R d . Moreover the time interval (0, 1) can be replaced by (0, T ) with T > 0. We remark that (14) includes no flux boundary conditions for m on ∂Ω, and no initial conditions for ρ are prescribed. Moreover (14) can be equivalently tested with maps in C 1 c (X Ω ) [3, Remark 8.1.1]. The following lemma provides some properties of solutions to (14) which will be needed in the coming analysis. The statement holds both in bounded domains as well as in R d . For a proof in bounded domains see, e.g., Propositions 2.2, 2.4 in [21], which can be easily generalized to R d .
In the rest of the paper we will identify $\rho_t$ with its narrowly continuous representative, whenever the assumptions of Lemma 2.1 hold.

2.2.
Optimal transport energy. We now introduce the Wasserstein-Fisher-Rao energy, also known as the Hellinger-Kantorovich energy, as originally done in [22,31,33]. To this end, let δ > 0 be a fixed parameter. Define the convex, one-homogeneous and lower semi-continuous where ∞y 2 = ∞ for y = 0 and ∞y 2 = 0 for y = 0. The Wasserstein-Fisher-Rao energy is given by the map B δ : where λ ∈ M + (X Ω ) is an arbitrary measure such that ρ, m, µ ≪ λ. Definition (16) does not depend on the choice of λ, as Ψ δ is one-homogeneous. Properties of the energy B δ which are relevant in the following analysis are summarized in Lemma A.4 (for a proof see [21, Proposition 2.6]). We now introduce a coercive version of B δ : Set and define the functional J α,β,δ : where α > 0 and β > 0 are fixed constants. We remark that adding the total variation of ρ to B δ enforces the balls of J α,β,δ to be compact in the weak* topology of M Ω . Such property, together with others, is the object of Lemma A. 5 Then for each x ∈ R d the ODE admits a unique absolutely continuous solution t → X x (t) defined for all t ∈ [0, 1].
Next we provide a representation formula for measure solutions of the continuity equation (13). This is the analogue of [3, Lemma 8.1.6] for the inhomogeneous continuity equation, and a generalization of [37,Proposition 3.6] to the case of g unbounded.
Proof. Narrow continuity of t → ρ t follows immediately from (20), dominated convergence and the continuity of t → X x (t) for each x. Let now ϕ ∈ C 1 c ((0, 1) × R d ). Then for ρ 0 -a.e. x in R d , the map t → ϕ(t, X x (t)) is absolutely continuous in (0, 1), with a.e. derivative given by thanks to Proposition 2.2. By (20) we also have that t → ϕ(t, X x (t))e´t 0 g(s,Xx(s)) ds is absolutely continuous in (0, 1), and for a.e. t ∈ (0, 1) it holds In particular, it is immediate to check that x)| dt, which are finite by (18), (20). Therefore, we can apply Fubini's theorem and (21), (22), (23), to computê where ρ = dt ⊗ ρ t . Now notice that the above right-hand side vanishes since ϕ is compactly supported, concluding the proof.
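The differentiation step used in the proof can be spelled out as follows. This is a sketch consistent with the surrounding argument rather than a reproduction of the omitted displays. The shorthand $G_x(t) := \int_0^t g(s, X_x(s))\,ds$ is introduced only here, and $X_x$ is assumed to solve $\dot X_x(t) = v(t, X_x(t))$ with $X_x(0) = x$.

```latex
\begin{align*}
\frac{d}{dt}\Big[ \varphi(t,X_x(t))\, e^{G_x(t)} \Big]
 &= \Big[ \partial_t\varphi(t,X_x(t))
      + \nabla\varphi(t,X_x(t))\cdot v(t,X_x(t)) \Big]\, e^{G_x(t)} \\
 &\quad + \varphi(t,X_x(t))\, g(t,X_x(t))\, e^{G_x(t)} \\
 &= \big( \partial_t\varphi + \nabla\varphi\cdot v + g\,\varphi \big)(t,X_x(t))\; e^{G_x(t)}
 \qquad \text{for a.e. } t \in (0,1) .
\end{align*}
```

Integrating in $t$ over $(0,1)$ gives zero because $\varphi$ is compactly supported in time, and integrating in $x$ against $\rho_0$ then yields the weak form of the continuity equation for the weighted push-forward of $\rho_0$.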
The next proposition states that, under some regularity assumptions, every solution of (14) can be represented as in (21).
is a narrowly continuous solution to the continuity equation ∂ t ρ t + div(vρ t ) = gρ t in (0, 1) × R d in the sense of (14), for some Borel (20) and Then for ρ 0 -a.e. x ∈ R d the ODE (19) admits a solution X x (t) for t ∈ [0, 1], and where the push-forward is with respect to the space variable.

Functional analytic setting
In this section we discuss the functional analytic setting that is instrumental in proving the superposition principle in Theorem 1.1. Throughout the section, V will be the closure of a bounded domain of R d , with d ∈ N, d ≥ 1. We recall the notations X V := (0, 1) × V and 3.1. Curves in cones of measures. We start by introducing the set and the space of narrowly continuous curves with values in C V , i.e., Notice that if t → ρ t belongs to S V , then ρ := dt ⊗ ρ t belongs to M(X V ). With a little abuse of notation, in what follows, we will denote by ρ both the curve t → ρ t and the measure dt ⊗ ρ t .
We endow the set C V with the flat distance on M(V ), that is, for We then define a distance over S V , by setting Remark 3.2. In [32,33] the authors introduced the cone space over V given by However in [32,33] the cone space is equipped with the cone distance for all ρ 1 , ρ 2 ∈ C V . By elementary calculations, and employing (29) below, it is possible to show that H 2 and d F induce equivalent topologies on C V , e.g., there exists a constant C > 0 such that The following characterization for d F holds.
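It may help to sketch the computation behind such a characterization. The sketch assumes that the flat distance is defined through test functions $\varphi$ with $\|\varphi\|_\infty \le 1$ and Lipschitz constant at most $1$, which is a common convention but an assumption here, and that $h_1 \ge h_2$ without loss of generality.

```latex
\begin{align*}
d_F(h_1\delta_{\gamma_1}, h_2\delta_{\gamma_2})
 &= \sup_{\varphi} \big\{ h_1\,\varphi(\gamma_1) - h_2\,\varphi(\gamma_2) \big\} \\
 &= \sup_{\varphi} \big\{ (h_1 - h_2)\,\varphi(\gamma_1)
      + h_2\,\big( \varphi(\gamma_1) - \varphi(\gamma_2) \big) \big\} \\
 &\le (h_1 - h_2) + h_2\,\min\big( |\gamma_1 - \gamma_2| ,\, 2 \big) ,
\end{align*}
% with equality attained, e.g., by \varphi(x) := 1 - \min(|x - \gamma_1|, 2).
```

Under this convention one obtains $d_F(h_1\delta_{\gamma_1}, h_2\delta_{\gamma_2}) = |h_1 - h_2| + \min(h_1, h_2)\,\min(|\gamma_1 - \gamma_2|, 2)$, which is consistent with the appearance of $\min(h_n(t), h(t))$ and of the threshold $2$ in the arguments of this section.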
Proof. By definition it follows that By symmetry we can assume The thesis follows since the supremum is achieved by ( We will now show that the metric space In order to achieve that, we need a preliminary lemma. Then the following statements are equivalent: (i) ρ t is narrowly continuous, Proof. Assume (i), so that the map t → h(t)ϕ(γ(t)) is continuous for each ϕ ∈ C(V ). By choosing ϕ ≡ 1 we conclude that h is continuous. If we pick ϕ(x) := x i coordinate function, for all i = 1, . . . , d, we also infer continuity for hγ, so that γ is continuous in {h > 0}. Conversely, att by boundedness of ϕ and continuity of h, while if h(t) > 0, we conclude by (ii).
Then the following statements are equivalent: In particular, we have that for sufficiently large n. By continuity of h, (ii), and the assumption h(t) > 0, we conclude continuity for γ, and hence (i). The final part of the statement follows from the first part and from the definition of d.
For the space (S V , d) the following holds.
Proposition 3.6. We have that (S V , d) is a complete separable metric space.
The above statement is somewhat classical. However, due to the lack of a reference, we provide a proof in Section A.4. We conclude this section with a useful lemma that provides sufficient conditions for continuity and measurability for scalar maps on (S V , d).
If ϕ is measurable (resp. continuous), then Ψ t is measurable (resp. continuous) with respect to d.
Proof. Notice that the condition ϕ(x, 0) = 0 for all x ∈ V implies that Ψ t is well defined. Suppose first that ϕ is continuous and assume that d(ρ n , ρ) → 0 as n → ∞.
If h(t) = 0, then ρ t = 0 and Ψ t (ρ) = 0. By continuity of ϕ and compactness of V we infer that Ψ t (ρ n ) → 0. If h(t) > 0, the usual argument by contradiction implies that |γ n (t) − γ(t)| ≤ 2 for n sufficiently large. Thus by (29) and the convergences min(h n (t), h(t)) → h(t) > 0 and d F (ρ n t , ρ t ) → 0, we have that γ n (t) → γ(t). By continuity of ϕ we conclude Ψ t (ρ n ) → Ψ t (ρ). Suppose now that ϕ is measurable. Define the evaluation map e t : S V → C V by e t (ρ) := ρ t and the projection π : where p ∈ V is arbitrary but fixed. Notice that by construction e t is continuous from (S V , d) into (C V , d F ). Additionally the map hδ γ → (γ, h) is continuous in C V {(0, 0)} by repeating the above arguments. Hence π is measurable, being sum of measurable functions. Noting that Ψ t = ϕ • π • e t , we see that Ψ t is measurable.

3.2.
Properties of the Hellinger-Kantorovich energy over C V . In this section we investigate some properties of the coercive version of the Hellinger-Kantorovich energy at (17) when restricted to measures belonging to S V . To be more precise, we consider the functional where J α,β,δ is defined at (17) and α, β, δ > 0. We start by introducing the subset of S V (32) As already mentioned in the introduction, we denote by AC 2 the set of absolutely continuous functions with a.e. derivative in L 2 (see [3, Section 1.1] for a precise definition).
From the regularity assumed, we immediately infer the product rule at (33) Then the following properties hold: iii) The curve t → ρ t belongs to H V . Moreover the energy J α,β,δ can be computed by Then (ρ, m, µ) belongs to M V and solves the continuity equation (14) in X V . Moreover J α,β,δ (ρ, m, µ) < ∞ and (34) holds.
Moreover F is lower semi-continuous and its sublevel sets are compact.
Proof. We start by showing that the domain of F is given by H V and that (37) holds. Assume first that ρ * ∈ S V and F (ρ * ) < ∞. We claim that exists a pair (m * , Indeed the functional (m, µ) → J α,β,δ (ρ, m, µ) is weak* lower semi-continuous by Lemma A.5. Invoking (120) and the direct method, we conclude that the infimum at (31) is achieved, showing (38). Hence we can apply the direct implication of Proposition 3.9 to (ρ * , m * , µ * ) to obtain that ρ * ∈ H V and that (37) holds. Conversely, assume that ρ * t = h(t)δ γ(t) ∈ H V and set m := γρ * , µ := (ḣ/h)ρ * . By the converse implication of Proposition 3.9 we know that (ρ * , m, µ) ∈ M V and J α,β,δ (ρ * , m, µ) < ∞, from which we infer F (ρ * ) < ∞. Thus there exists a pair (m * , µ * ) ∈ M(X V ; R d ) × M(X V ) such that (38) holds. An application of the direct implication of Proposition 3.9 to (ρ * , m * , µ * ) yields (37). We now prove that F is lower semi-continuous with respect to d. To this end, assume that d(ρ n , ρ) → 0 as n → ∞. We claim that dt ⊗ ρ n t * ⇀ dt ⊗ ρ t weakly* in M(X V ). By density, it is sufficient to prove convergence for test functions where the first term in the first line was estimated by (29), and the second one by (27). Since the estimate does not depend on t, and ε is arbitrary, we conclude that dt ⊗ ρ n t * ⇀ dt ⊗ ρ t . We now claim that F is weak* lower semi-continuous in S V considered as a subset of M(X V ): Indeed assume that ρ n * ⇀ ρ in M(X V ). Without loss of generality we can assume that sup n F (ρ n ) < ∞ along a subsequence, so that there exist (m n , µ n ) ∈ M(X V ; R d ) × M(X V ) such that, up to subsequences, F (ρ n ) = J α,β,δ (ρ n , m n , µ n ). By (120) we infer the existence of a pair (m, µ) such that, up to subsequences, m n * ⇀ m, µ n * ⇀ µ. We can now invoke weak* lower semi-continuity of J α,β,δ (Lemma A.5) to conclude weak* lower semi-continuity of F . 
Since dt ⊗ ρ n t * ⇀ dt ⊗ ρ t in M(X V ) whenever d(ρ n , ρ) → 0, we infer lower semi-continuity of F with respect to d. Finally, we show that the sublevel sets of F are compact with respect to d. As F ≥ 0 and is positively one-homogeneous, it is enough to show that In order to show compactness of S F we first provide some preliminary estimates for the maps h and hγ. By (37) we immediately infer where we used thatḣ = 0 almost everywhere in {h = 0} ([26, Theorem 4.4]), Hölder's inequality, and the fact that F (ρ) ≤ 1 in conjunction with (37). Since h ≥ 0, choosing t 1 ∈ arg min h in the above estimate yields where R := max{|p| : p ∈ V }, C := 2/(βδ 2 √ α) + 1/α. Recall that R < ∞ as V is bounded. Thus, by (39) and (40), Moreover, by (39)-(40) we can estimatê where we used Hölder's inequality, (37), (41), and F (ρ) ≤ 1. By Lemma 3.8 and the above estimates we thus infer for every 0 ≤ t 1 ≤ t 2 ≤ 1. Hence, considering a sequence {ρ n } n in S F with ρ n t = h n (t)δ γn(t) , by (40)- (42) we have that h n and h n γ n are equibounded and equicontinuous. Therefore Ascoli-Arzelà's theorem implies that, up to subsequences, h n → h and γ n h n → f uniformly, where h ∈ C[0, 1], h ≥ 0 and f ∈ C([0, 1]; R d ). Define γ(t) := f (t)/h(t) if h(t) > 0. By the uniform convergence h n → h we have that γ(t) ∈ V for t ∈ {h > 0}. Therefore, by setting ρ t := h(t)δ γ(t) , Lemma 3.4 implies that ρ ∈ S V . Since h n → h pointwise and γ n → γ pointwise in {h > 0}, and since h n ∞ ≤ C, by dominated convergence one immediately concludes that dt ⊗ ρ n t * ⇀ dt ⊗ ρ t in M(X V ). We can then invoke the weak* lower semi-continuity of F to conclude that ρ ∈ S F . We are left to prove that ρ n → ρ with respect to d. Fix ε > 0. By the uniform convergences h n → h and h n γ n → hγ, there exists N (ε) ∈ N such that (43) |h Let t ∈ {h ≥ ε} and n ≥ N (ε). Using the above condition we infer Set m n (t) := min(h n (t), h(t)). Then, by (29), Let now t ∈ {h ≤ ε}. By triangle inequality and (29), (43) . 
In total we infer d(ρ n , ρ) < Cε for n ≥ N (ε), concluding the proof.

The main decomposition theorem
In this section we will prove the decomposition result in Theorem 1.1 anticipated in the introduction. Specifically, the proof is presented in Sections 4.1, 4.3, while Section 4.2 contains auxiliary results which are instrumental to the proof. For reader's convenience we will recall a few notations and the statement of Theorem 1.1. Let d ∈ N, d ≥ 1 and V ⊂ R d be the closure of a bounded domain of R d . We denote the time-space cylinder by X V := (0, 1) × V . We also recall the definitions of C V and S V at (25)- (26). The set C V is equipped with the flat metric d F defined at (27), while S V is equipped with the supremum distance d defined at (28). We remind the reader that (S V , d) is a complete metric space (Proposition 3.6). Moreover we will also consider the set H V introduced at (32). Let v : X V → R d , g : X V → R be given measurable maps and consider the system of ODEṡ For v and g as above, we define the following subset of H V : (44) H where the notation dσ(γ, h) is a shorthand for expressing that the integral is computed on all curves ρ t = h(t)δ γ(t) ∈ S V .
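Definition 4.1. For a measure $\sigma \in M^+(S_V)$ we define the set function $\rho^\sigma_t$ as (46) $\rho^\sigma_t(E) := \int_{S_V} h(t)\,\delta_{\gamma(t)}(E)\,d\sigma(\gamma,h)$ for all Borel sets $E \subset V$ and $t \in [0,1]$.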
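As an illustration of Definition 4.1, consider a finitely supported measure $\sigma$. The sketch below assumes that (46) defines $\rho^\sigma_t(E)$ as the $\sigma$-integral of $h(t)\,\delta_{\gamma(t)}(E)$, with the convention that $h(t)\delta_{\gamma(t)} = 0$ whenever $h(t) = 0$; the number $N$, the coefficients $c_i$ and the curves $(\gamma_i, h_i)$ are hypothetical names introduced only for this example.

```latex
% Assumption: \sigma is a finite positive combination of Dirac masses on S_V,
%   \sigma = \sum_{i=1}^N c_i \, \delta_{(\gamma_i, h_i)}, \quad c_i > 0 .
% Then the superposition \rho^\sigma_t is a finite sum of weighted Dirac masses in V.
\begin{align*}
  \rho^\sigma_t(E)
    &= \int_{S_V} h(t)\, \delta_{\gamma(t)}(E) \, d\sigma(\gamma, h)
     = \sum_{i=1}^N c_i \, h_i(t)\, \delta_{\gamma_i(t)}(E), \\
  \int_V \psi(x) \, d\rho^\sigma_t(x)
    &= \sum_{i \,:\, h_i(t) > 0} c_i \, h_i(t)\, \psi(\gamma_i(t))
     \qquad \text{for } \psi \in C(V).
\end{align*}
```

In this special case the total mass at time $t$ is $\sum_i c_i h_i(t)$, which is finite because each $h_i$ is continuous on $[0,1]$.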
therefore the integral is well defined, possibly unbounded. Assume in addition that σ ∈ M + 1 (S V ). It is easy to check that ρ σ t at (46) belongs to This fact can be shown by mimicking the proof of [14, Theorem 3.6.1], in conjunction with (47) holds.
We are now ready to state the main decomposition result of the paper.

4.1.
Proof of the converse implication of Theorem 4.3. We now prove the converse statement in Theorem 4.3. To this end, assume that σ ∈ M + (S Ω ) is concentrated on H v,g Ω and (50) holds. Let us first show that σ ∈ M + 1 (S Ω ). Let ρ t = h(t)δ γ(t) ∈ S Ω and t * ∈ arg min h, which exists by continuity of h (see Lemma 3.4). Using the definition of H v,g Ω we can estimate for all t ∈ [0, 1]. In particular, concluding that σ ∈ M + 1 (S Ω ), thanks to (50). We now show that the curve t → ρ σ t defined by (46) belongs to C w ([0, 1]; M + (Ω)). First, Remark 4.2 implies that ρ σ t ∈ M + (Ω) for all t ∈ [0, 1]. For the narrow continuity, fix ϕ ∈ C(Ω) and notice that by definition the map t → h(t)ϕ(γ(t)) is continuous for all ρ t = h(t)δ γ(t) ∈ S Ω . Since σ ∈ M + 1 (S Ω ) we can apply dominated convergence and conclude that also t →´Ω ϕ(x) dρ σ t (x) is continuous. We are left to show that ρ σ solves the continuity equation ∂ t ρ σ t + div(vρ σ t ) = gρ σ t in X Ω . To this end, fix b ∈ C 1 (Ω). By Lemma 3.8 the map t → h(t)b(γ(t)) is differentiable almost everywhere and (33) holds. Therefore, for all 0 ≤ s ≤ t ≤ 1 the following holdŝ where in the last equality we used that σ is concentrated on H v,g Ω and applied Fubini's Theorem, which we are allowed to do as the integrand is absolutely integrable by (50), triangle inequality, and the fact that b ∈ C 1 (Ω). In particular, the map t →´Ω b(x) dρ σ t (x) is absolutely continuous with almost everywhere derivative given by The second equality in (52) follows because v and g are measurable and hence Ψ(t, x) := b(x)g(t, x) + ∇b(x) · v(t, x) is measurable in Ω for a.e. t fixed. From (50) we have that (γ, h) → h(t)Ψ(t, γ(t)) belongs to L 1 σ (S Ω ) for a.e. t, and hence by Remark 4.2 we can apply (47) to Ψ(t, ·) and obtain the second equality in (52). 
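The pointwise computation behind identity (52) can be sketched as follows. This is a formal derivation, assuming that (O1)-(O2) hold at the time $t$ under consideration and that $h(t) > 0$; the rigorous statement is the one provided by Lemma 3.8 and (33).

```latex
% Product and chain rule along a characteristic (\gamma, h), for b \in C^1(\Omega):
\begin{align*}
  \frac{d}{dt} \big[ h(t)\, b(\gamma(t)) \big]
    &= \dot{h}(t)\, b(\gamma(t)) + h(t)\, \nabla b(\gamma(t)) \cdot \dot{\gamma}(t) \\
    &= h(t) \big[ b(\gamma(t))\, g(t, \gamma(t))
                  + \nabla b(\gamma(t)) \cdot v(t, \gamma(t)) \big] \\
    &= h(t)\, \Psi(t, \gamma(t)),
  \qquad \Psi(t,x) := b(x)\, g(t,x) + \nabla b(x) \cdot v(t,x).
\end{align*}
% Integrating against \sigma and using the definition of \rho^\sigma_t:
\begin{align*}
  \frac{d}{dt} \int_\Omega b(x) \, d\rho^\sigma_t(x)
    = \int_{S_\Omega} h(t)\, \Psi(t, \gamma(t)) \, d\sigma(\gamma, h)
    = \int_\Omega \Psi(t, x) \, d\rho^\sigma_t(x).
\end{align*}
```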
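Identity (52) implies that $\rho^\sigma_t$ solves the continuity equation in $X_\Omega$ in the sense of (14) for all $\varphi \in C^1_c(X_\Omega)$ of the form $\varphi(t,x) = a(t)b(x)$ with $a \in C^1_c(0,1)$, $b \in C^1(\Omega)$, and hence, by density, for all elements of $C^1_c(X_\Omega)$.

4.2. Regularized solutions of the continuity equation. Before starting the proof of the direct statement in Theorem 4.3, we provide some smoothing arguments which will be employed to construct the measure $\sigma$. To this end, let $\Omega \subset \mathbb{R}^d$, $d \in \mathbb{N}$, $d \geq 1$, be the closure of a bounded domain. Let $v : X_\Omega \to \mathbb{R}^d$, $g : X_\Omega \to \mathbb{R}$ be given measurable maps, and let $\rho_t \in C_w([0,1]; M^+(\Omega))$ be such that $\partial_t \rho_t + \mathrm{div}(v\rho_t) = g\rho_t$ in $X_\Omega$ in the sense of (14). We extend $v$ and $g$ by zero to $(0,1) \times \mathbb{R}^d$. Similarly, we extend $\rho_t$ by zero, so that $\rho_t \in M^+(\mathbb{R}^d)$. Notice that the extensions $(\rho, v, g)$ satisfy the continuity equation in $(0,1) \times \mathbb{R}^d$, due to the no-flux boundary conditions. For $x \in \mathbb{R}^d$, $r > 0$ let $B_r(x) := \{y \in \mathbb{R}^d : |y - x| < r\}$ and let $\xi \in C^\infty(\mathbb{R}^d)$ be such that $\xi \geq 0$, $\mathrm{supp}\,\xi \subset B_1(0)$ and $\int_{\mathbb{R}^d} \xi \, dx = 1$. For every $0 < \varepsilon < 1$ and $x \in \mathbb{R}^d$ set $\xi_\varepsilon(x) := \varepsilon^{-d}\xi(x\varepsilon^{-1})$. Note that $\mathrm{supp}\,\xi_\varepsilon \subset B_\varepsilon(0)$. Let $R > 0$ be such that (53) holds, and let $\rho^\varepsilon_t$, $v^\varepsilon_t$, $g^\varepsilon_t$ be defined by (54), where $v^\varepsilon_t$ and $g^\varepsilon_t$ are set to be zero in the region where $\rho^\varepsilon_t(x) = 0$, i.e., in $(0,1) \times (\mathbb{R}^d \setminus V)$. Here, with a slight abuse of notation, we denote $v_t = v(t,\cdot)$, $v^\varepsilon_t = v^\varepsilon(t,\cdot)$, $g_t = g(t,\cdot)$, $g^\varepsilon_t = g^\varepsilon(t,\cdot)$.
Lemma 4.5. Let $\rho_t \in C_w([0,1]; M^+(\Omega))$ and let $v : X_\Omega \to \mathbb{R}^d$, $g : X_\Omega \to \mathbb{R}$ be measurable. Suppose that $\partial_t \rho_t + \mathrm{div}(v\rho_t) = g\rho_t$ in $X_\Omega$ in the sense of (14) and that (48) holds. Let $(\rho^\varepsilon_t, v^\varepsilon_t, g^\varepsilon_t)$ be defined as in (54). Then $(\rho^\varepsilon_t\,dx, v^\varepsilon_t, g^\varepsilon_t)$ is a solution to $\partial_t \rho^\varepsilon_t\,dx + \mathrm{div}(v^\varepsilon_t \rho^\varepsilon_t\,dx) = g^\varepsilon_t \rho^\varepsilon_t\,dx$ in $(0,1) \times \mathbb{R}^d$, and $\rho^\varepsilon_t\,dx \to \rho_t$ narrowly in $M(V)$ as $\varepsilon \to 0$, for all $t \in [0,1]$. Moreover, $v^\varepsilon$ and $g^\varepsilon$ satisfy (18) and (20), respectively. Finally, for every $t \in [0,1]$ the estimates (55) hold.
Proof.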
By the interplay between weak differentiation and mollification, it is immediate to check that (ρ ε t dx, v ε t , g ε t ) solves the continuity equation in (0, 1) × R d for all 0 < ε < 1. The fact that ρ ε t dx → ρ t narrowly is an immediate consequence of the properties of convolutions and of the convergence η ε → 0 as ε → 0. We now prove that v ε satisfies (18). Notice that by definition v ε t (x) = 0 in R d (Ω + B 1 (0)) for every t ∈ [0, 1]. Moreover ρ ε t ≥ ε in V for all t. Thereforê As t → ρ t (Ω) is continuous, the quantity C(ρ) := max t |ρ t (Ω)| is well defined. Thereforê where the last term is finite by (48). By similar computations and by (48), one can easily show that g ε satisfies (20). We now prove the first estimate in (55). Fix t ∈ [0, 1]. If ρ t = 0, there is nothing to prove. Otherwise we havê where in the last inequality we used Proposition A.7. Since v(t, ·) vanishes in R d Ω, we conclude the first estimate in (55). A similar argument yields the second estimate in (55).  (14) and (48) holds. The above is the reason why we add η ε to the definition of ρ ε t in (54), since otherwise, we could have ρ t * ξ = 0 for some t, independently on the chosen mollifier. We remark that the addition of η ε is the main difference to the smoothing results [3, Lemma 8.1.9] and [37, Lemma 3.10], where narrowly continuous measure solutions ρ t to (14) are smoothed via ρ t * ξ with ξ being a mollifier.

4.3. Proof of the direct implication of Theorem 4.3. We divide the proof of the direct implication of Theorem 4.3 into three steps: first, we construct a measure $\sigma \in M^+(S_\Omega)$ satisfying (49); then we prove that $\sigma$ is concentrated on $H^{v,g}_\Omega$; finally, we prove that $\sigma$ is concentrated on $H^1_\Omega$.
Step 1 - Construction of the measure $\sigma$.
Let V := B R (0), with R > 0 as in (53). For each 0 < ε < 1 define ρ ε t , v ε t , g ε t according to (54). By Lemma 4.5 the triple (ρ ε t dx, v ε t , g ε t ) solves ∂ t ρ ε t dx + div(v ε t ρ ε t dx) = g ε t ρ ε t dx in (0, 1) × R d and satisfies the bounds (18), (20), (55). As (48) holds, we can then apply Proposition 2.4 and obtain the representation where X ε x and R ε x are the unique solutions to the ODEs system , for all t ∈ [0, 1]. We define σ ε by duality as . Here we adopted the notation ϕ(γ, h) to denote that ϕ is evaluated on the curve t → h(t)δ γ(t) . We claim that σ ε ∈ M + (S V ). First, we show that σ ε is well-defined. Indeed, notice that ρ ε t ≥ ε in V by construction. Hence by (56) and (20) we estimate for all x ∈ V , where C(ε) > 0 is a constant depending only on ε. Also, by construction, v ε (t, x) = 0 for x ∈ R d Ω + B 1 (0) and t ∈ [0, 1]. Therefore from (57) we deduce that X ε x (t) ∈ V for each initial datum x ∈ V and 0 < ε < 1. Thanks to Lemma 3.4, we then obtain that the is continuous from R d to (S V , d), which is a consequence of the stability of solutions to (57) with respect to the initial datum x ∈ R d , and of the fact that uniform convergence of weights and curves implies d-convergence of measures in S V , thanks to (29). This proves that the definition at (58) is well posed. We now estimate the total variation of σ ε . By (54) and standard properties of convolutions we have that ρ ε t dx M(V ) ≤ ρ t M(Ω) + ε|V | for all t ∈ [0, 1], ε ∈ (0, 1). Hence, by testing σ ε against ϕ ≡ 1 and using (56) we infer Moreover σ ε ≥ 0 by (59), showing that σ ε ∈ M + (S V ). We also remark that σ ε is concentrated on H V , given that the curve t → (´1 0 R ε x (s) ds) −1 R ε x (t)δ X ε x (t) belongs to H V for each x ∈ V , thanks to the regularity of solutions to (57). 
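The two-sided control of the weights $R^\varepsilon_x$ used above can be illustrated by the explicit solution of a linear ODE. The following is only a sketch, assuming that the weight equation in (57) has the form of (O2) with coefficient $g^\varepsilon$ evaluated along the curve $X^\varepsilon_x$; the initial value $r_0 > 0$ and the constant $G$ are names introduced here for illustration and do not appear in the text.

```latex
% Assumed form of the weight equation along the characteristic X^\varepsilon_x:
%   \dot{R}(t) = g^\varepsilon(t, X^\varepsilon_x(t)) \, R(t), \qquad R(0) = r_0 > 0 .
% Its unique solution is given by the exponential formula
\begin{align*}
  R(t) = r_0 \exp\!\left( \int_0^t g^\varepsilon(s, X^\varepsilon_x(s)) \, ds \right),
  \qquad t \in [0,1].
\end{align*}
% If G := \int_0^1 | g^\varepsilon(s, X^\varepsilon_x(s)) | \, ds is finite, then
\begin{align*}
  r_0 \, e^{-G} \;\le\; R(t) \;\le\; r_0 \, e^{G}
  \qquad \text{for all } t \in [0,1],
\end{align*}
% so the weight stays strictly positive and bounded on [0,1].
```

In particular, under these assumptions the normalizing factor $\big(\int_0^1 R(s)\,ds\big)^{-1}$ is well defined, since the integral is bounded below by $r_0 e^{-G} > 0$.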
We now show that the family σ ε is tight as 0 < ε < 1, by proving that is the functional defined at (31): Indeed assume that (61) holds; by Proposition 3.10 we know that F is d-measurable and its sublevels are compact. Moreover (S V , d) is a complete separable metric space (see Proposition 3.6). Thus we can apply Proposition A.1 to conclude tightness for σ ε . Let us proceed with the proof of (61). First notice that (58) can be tested against F , as σ ε ≥ 0 and F is lower semi-continuous with respect to the metric d (Proposition 3.10). Since σ ε is concentrated on H V , by formula (37) and one-homogeneity of F with respect to h we have By (57), (56) and (55) we estimatê and, in a similar fashion, Finally, by (60),ˆS From the above estimates, and (62), (48), we conclude (61), proving that {σ ε } ε is tight. Since {σ ε } ε is uniformly bounded by (60), we can apply the compactness result [14,Theorem 8.6.2] to infer the existence of σ ∈ M + (S V ) such that σ ε → σ narrowly as ε → 0. In particular, as F is d-lower semi-continuous, F ≥ 0 and (61) holds, we can apply (115) in Proposition A.2 to infer´S V F (γ, h) dσ(γ, h) < ∞. From the latter, we see that σ is concentrated on the domain of F , that is, on the set H V (Proposition 3.10). We now prove that σ satisfies the representation formula (49). To this end, let ϕ ∈ C c (X V ) and define the map Ψ(γ, h) :=´1 0 h(t)ϕ(t, γ(t)) dt for ρ = hδ γ ∈ S V . We claim that σ ε according to (58) can be tested against Ψ: indeed, first notice that Ψ is d-continuous. This is because the map (γ, h) → h(t)ϕ(t, γ(t)) is continuous for t fixed, by Lemma 3.7; if d(ρ n , ρ) → 0, then h n − h ∞ ≤ d(ρ n , ρ) by (29), so that {h n } n is uniformly bounded; thus by dominated convergence we conclude continuity for Ψ, since ϕ is bounded, and since ϕ(t, γ n (t)) → ϕ(t, γ(t)) when h(t) > 0. Moreover, thanks to (63), we can estimate showing that the right-hand side of (58) tested against |Ψ| is finite. 
The fact that σ ε can be tested against Ψ follows immediately. By (58), the latter yields where in the last equality we used (56). We want to pass to the limit as ε → 0 in (65). Notice that the right-hand side passes to the limit since dt ⊗ ρ ε t dx * ⇀ dt ⊗ ρ t in M(X V ): Indeed ρ ε t dx → ρ t narrowly in M(V ) for all t (Lemma 4.5) and ρ ε t dx is uniformly bounded in M(V ), as previously shown. Concerning the left-hand side of (65), we first claim that the map |Ψ| is uniformly integrable with respect to σ ε according to definition (116). To this end, for k > 0 define A k := {(γ, h) ∈ S V : |Ψ(γ, h)| ≥ k}. By the definition of σ ε and by (63) we get concluding uniform integrability for |Ψ|. Therefore we can invoke (117) and pass to the limit as ε → 0 in the left-hand side of (65). After one application of Fubini's Theorem we obtain We claim that (49) descends from (66). In order to show it, we first derive a pointwise in time version of (66). We start by showing that Θ(t) :=´S V h(t)ϕ(t, γ(t)) dσ(γ, h) is continuous for all ϕ ∈ C c (X V ) fixed. Indeed, the map t → h(t)ϕ(t, γ(t)) is continuous for each fixed (γ, h) ∈ S V , by Lemma 3.4. Moreover, by recalling that σ ε is concentrated on solutions of (57), and by arguing as in the proof of (51), we can show that for all ε it holds that Therefore, by employing (63), (58), (56), (55), and setting C := ρ M(X Ω ) + |V |, we obtain , and the last term is bounded by assumption (48). Finally, the map (γ, h) ∈ S V → h ∞ is d-continuous and non-negative, therefore by the narrow convergence σ ε → σ and (115) we infeŕ S V h ∞ dσ(γ, h) < ∞. By dominated convergence we then conclude continuity of Θ. As a byproduct of this argument, we have additionally shown that σ ∈ M + 1 (S V ). Notice that also the map t →´V ϕ(t, x) dρ t (x) is continuous, as a consequence of the narrow continuity of t → ρ t . 
Testing (66) against ϕ(t, x) := a(t)b(x) for a ∈ C c (0, 1), b ∈ C(V ), yields Fix t ∈ [0, 1] and b ∈ C(V ) such that b = 0 in Ω and b > 0 in V Ω. Recalling that ρ t is concentrated on Ω, from (67) we obtain a set E t ⊂ S V such that σ(S V E t ) = 0 and h(t)b(γ(t)) = 0 for all (γ, h) ∈ E t . In particular, by definition of b, for all (γ, h) ∈ E t . Let Q ⊂ [0, 1] be a dense countable subset and define E := ∩ t∈Q E t , so that σ(S V E) = 0 and (68) holds for all (γ, h) ∈ E, t ∈ Q, that is, γ({h > 0} ∩ Q) ⊂ Ω for σ-a.e. (γ, h) ∈ S V . By density of Q and continuity of h, γ we deduce γ({h > 0}) ⊂ Ω for σ-a.e. (γ, h) ∈ S V , from which we conclude concentration of σ on S Ω . Since we already showed that σ is concentrated on H V we also infer that σ is concentrated on H Ω . It is immediate to check that S Ω is d-closed in S V , and hence d-measurable. Therefore we can restrict σ to S Ω to obtain a measure in M + 1 (S Ω ) satisfying (49), as claimed.
Step 2 - $\sigma$ is concentrated on $H^{v,g}_\Omega$.
Step 3 - $\sigma$ is concentrated on $H^1_\Omega$. We are left to prove that $\sigma$ is concentrated on $H^1_\Omega$. Recall (53). Also recall that we have proven $\sigma^\varepsilon \to \sigma$ narrowly. Moreover, note that the set $H^1_V$ is closed in $S_V$, as an immediate consequence of (29). Hence, by (115), we get that $\sigma$ is concentrated on $H^1_V$. As $\sigma$ is concentrated on $S_\Omega$, we conclude. This ends the proof of Theorem 4.3.
Remark 4.7. As mentioned in the introduction, it would be interesting to extend Theorem 4.3 to the case $\Omega = \mathbb{R}^d$. Notice, however, that our construction of the measure $\sigma$ relies heavily on the boundedness of $\Omega$: first, this assumption is needed to prove compactness of the sublevels of the functional $F$ (see (40) and the estimates after it), which in turn allows one to show tightness of the family $\sigma^\varepsilon$ (see (61) and the argument immediately after it); second, boundedness of $\Omega$ is employed to provide the uniform bound (60) on the norm of $\sigma^\varepsilon$. These arguments are crucial to obtain compactness for $\sigma^\varepsilon$ and, consequently, the representing measure $\sigma$ as their limit.

Uniqueness of characteristics and uniqueness for the PDE
The aim of this section is to apply Theorem 4.3 to relate uniqueness of the characteristics to uniqueness of solutions of the continuity equation with given initial data and minimal total variation. Throughout the section, $\Omega \subset \mathbb{R}^d$ with $d \geq 1$ is the closure of a bounded domain. We denote $X_\Omega := (0,1) \times \Omega$. Moreover, $S_\Omega$ denotes the set defined at (26), equipped with the distance $d$ at (28). We remind the reader that $(S_\Omega, d)$ is a complete metric space (Proposition 3.6 with $V = \Omega$). Let $v : X_\Omega \to \mathbb{R}^d$ and $g : X_\Omega \to \mathbb{R}$ be measurable maps and recall the definition of $H^{v,g}_\Omega$ at (44), i.e., the set of regular characteristics of the ODE system (O1)-(O2). Also recall the definition of $H^1_\Omega$ at (45). Finally, we define the set $D^{v,g} := \{ (t \mapsto \rho_t) \in C_w([0,1]; M^+(\Omega)) : (\rho_t, v_t, g_t) \text{ satisfy } \partial_t \rho_t + \mathrm{div}(v\rho_t) = g\rho_t \text{ and (48)} \}$.
We will prove the following result: Then, for any initial data ρ 0 ∈ M + (Ω) concentrated on A, the continuity equation ∂ t ρ t + div(vρ t ) = gρ t admits at most one solution ρ ∈ D v,g with initial data ρ 0 and such that In the next section we provide several auxiliary lemmas and definitions, which will be instrumental in proving Theorem 5.1. The proof of Theorem 5.1 will be carried out in Section 5.2.
5.1. Auxiliary results. Define the following subset of S Ω : for some τ ∈ R} . The first step is to prove that condition (81) implies that the measure σ obtained by Theorem 4.3 is concentrated on S * Ω . To this aim, we define a cut-off operator on the space S Ω .
Next we show that we can disintegrate any measure obtained by the application of Theorem 4.3 into a family of Borel measures parametrized by x ∈ Ω and concentrated on the set Notice that E x is measurable for every x ∈ Ω. Indeed, by employing similar arguments to the ones in Lemma 3.7, one can show that the map π : S * Ω {0} → Ω defined as π(γ, h) := γ(0) is continuous. Therefore, as S * Thus E x is measurable, given that H 1 Ω is closed and H v,g Ω is measurable by Lemma 5.3. Lemma 5.5. Let v : X Ω → R d , g : X Ω → R be measurable. Let ρ ∈ D v,g and σ ∈ M + 1 (S Ω ) be such that (49) holds. Then there exists a Borel family of measures {σ x } x∈Ω ⊂ M + (S Ω ) such that for every f ∈ L 1 σ (S Ω ) we have where E x is defined as in (89). Moreover σ x is concentrated on E x for ρ 0 -a.e. x ∈ Ω.

5.2.
Proof of Theorem 5.1. Assume that t → ρ t belongs to D v,g . Moreover suppose that ρ 0 is concentrated on A ⊂ Ω and that (81) holds. By Theorem 4.3, there exists σ ∈ M + 1 (S Ω ) concentrated on H v,g Ω ∩ H 1 Ω that represents ρ t , that is, (49) holds. Using Lemma 5.4, we infer that σ is concentrated on H := H v,g Ω ∩ H 1 Ω ∩ S * Ω . Thanks to Lemma 5.5 we can disintegrate σ into a Borel family {σ x } x∈Ω ⊂ M + (S Ω ) such that (90) holds, with σ x concentrated on E x for ρ 0 -a.e. x ∈ Ω. We claim that assumption (Hyp) implies that E x contains at most one point for all x ∈ A. Indeed, suppose that (γ . Now notice that by linearity of (O2) and assumption (Hyp), we have that γ x and by the continuity of h i we also obtain that h x 1 (τ 1 ) = h x 2 (τ 2 ) = 0. By definition of τ i we conclude that τ 1 = τ 2 = τ , so that (γ Since h x 1 (t) = h x 2 (t) = 0 for all t ≥ τ , we conclude that E x contains at most one point. Thus, for ρ 0 -a.e. x ∈ E, E := {x ∈ Ω : E x = ∅}, we have We claim that c x = 1/h x (0). Indeed, by definition of E x , we have γ x (0) = x. Using (49), (90), and σ(S Ω H) = 0, we then obtain for all ϕ ∈ C(Ω), showing that c x h x (0) = 1 for ρ 0 -a.e. x ∈ E. Again by (49) and (90) we get for every ϕ ∈ C(Ω), where we also used that σ x ∈ E. Thus ρ t depends only on the initial data ρ 0 , ending the proof.
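The role played by the linearity of (O2) in the argument above can be sketched with a short computation. It assumes that two weights $h_1$, $h_2$ solve (O2) along the same curve $\gamma$ on an interval $I \subset [0,1]$ on which $h_2 > 0$; the interval $I$ is notation introduced here for illustration.

```latex
% Quotient rule combined with (O2) for both weights along the same curve \gamma:
\begin{align*}
  \frac{d}{dt} \left( \frac{h_1(t)}{h_2(t)} \right)
    &= \frac{\dot{h}_1(t)\, h_2(t) - h_1(t)\, \dot{h}_2(t)}{h_2(t)^2} \\
    &= \frac{g(t,\gamma(t))\, h_1(t)\, h_2(t) - h_1(t)\, g(t,\gamma(t))\, h_2(t)}{h_2(t)^2}
     = 0
  \qquad \text{for a.e. } t \in I .
\end{align*}
% Hence h_1 = c \, h_2 on I for some constant c \ge 0: two weights along the same
% curve can differ only by a multiplicative constant while both are positive.
```

This is why a normalization of the weights, together with uniqueness of the curve $\gamma$, can leave at most one admissible pair $(\gamma, h)$ for a given starting point.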

Extremal points of the Wasserstein-Fisher-Rao energy
Let Ω ⊂ R d with d ≥ 1 be the closure of a bounded domain of R d . Let α, β > 0, δ ∈ (0, ∞] and define B to be the unit ball of the functional J α,β,δ defined at (17), that is, The aim of this section is to characterize the extremal points Ext B. Notice that J α,β,∞ corresponds to the coercive version of the Benamou-Brenier energy, whose extremal points were characterized in [20]. Hence here we focus on the case δ < ∞. After the characterization of Ext B is obtained, we will show how this information can be applied to the analysis of dynamic inverse problems which are regularized via the optimal transport energy J α,β,δ [21]. In particular we will obtain a sparse representation formula for regularized solutions to the dynamic problem. Before stating the characterization theorem we remind the reader the notations C Ω , S Ω , H Ω introduced at (25), (26), (32). In the following S Ω is equipped with the distance d at (28), making it a complete metric space (Proposition 3.6). We now define the set of characteristics of (14) with energy J α,β,δ = 1, which will play a role in the characterization of Ext B. Definition 6.1 (Characteristics). Define the set C of all the triples (ρ, m, µ) ∈ M Ω of the form ρ = h(t) dt ⊗ δ γ(t) , m =γ(t)ρ, µ =ḣ(t) dt ⊗ δ γ(t) that satisfy the following properties: i) t → h(t)δ γ(t) belongs to H Ω , ii) the set {h > 0} := {t ∈ [0, 1] : h(t) > 0} is connected, iii) the energy satisfies J α,β,δ (ρ, m, µ) = 1.
The above definition is well-posed since (ρ, m, µ) belongs to M Ω and solves the continuity equation (14) in X Ω (by the converse of Proposition 3.9 with V = Ω). Hence (iii) is compatible with the definition of J α,β,δ . Remark 6.2. If (ρ, m, µ) ∈ M Ω with ρ ∈ H Ω , then an application of Proposition 3.9 (with V = Ω) yields the representation In particular J α,β,δ is d-measurable, as a consequence of Proposition 3.10. For a measurable set E ⊂ [0, 1] we define the localized energy We are now ready to state the characterization theorem. The proof of Theorem 6.3 will be carried out in the next section, while in Section 6.2 we will detail the application of Theorem 6.3 to dynamic inverse problems.
Claim: $h_1/h_2$ is constant in each connected component of $E$.
In order to prove that (ρ, m, µ) ∈ C we are left to show that the set {h > 0} is connected. To this end, assume by contradiction that {h > 0} = E 1 ∪ E 2 with E 1 , E 2 relatively open, non-empty and disjoint. For t ∈ [0, 1] set ρ i t := h(t) χ E i (t)δ γ(t) . Note that as {h > 0} is relatively open we have that ∂ {h>0} E i = ∂ [0,1] E i ∩ {h > 0} where we denote by ∂ A the relative boundary with respect to the set A. Hence as ∂ {h>0} E i = ∅ we deduce that h(t) = 0 for every t ∈ ∂ [0,1] E i . In particular the map t → h(t) χ E i (t) is continuous in [0, 1]. Moreover γ ∈ C({h χ E i > 0}; R d ), hence Lemma 3.4 ensures that the curve t → ρ i t belongs to S Ω . We claim that t → ρ i t belongs to H Ω . In order to show this, we make use of the information (t → ρ t ) ∈ H Ω . Notice that the set for every ϕ ∈ C 1 c (0, 1), where we used that h = 0 on ∂ Thanks to Proposition 3.9 we have that (ρ i , m i , µ i ) belongs to M Ω and J(ρ i , m i , µ i ) =ˆE (t) 2 h(t) + αh(t) dt < ∞ .
A.3. Properties of $B_\delta$ and $J_{\alpha,\beta,\delta}$. In this section we gather some of the properties of the functionals $B_\delta$ and $J_{\alpha,\beta,\delta}$ introduced in Section 2.2. The interested reader can find the proofs of these results in [21, Proposition 2.6 and Lemmas 4.5, 4.6].