Topologies on unparameterised path space

The signature of a path, introduced by K.T. Chen [5] in $1954$, has been extensively studied in recent years. The $2010$ paper [12] of Hambly and Lyons showed that the signature is injective on the space of continuous finite-variation paths up to a general notion of reparameterisation called tree-like equivalence. The signature has been widely used in applications, underpinned by the result [15] that guarantees uniform approximation of a continuous function on a compact set by a linear functional of the signature. We study in detail, and for the first time, the properties of three candidate topologies on the set of unparameterised paths (the tree-like equivalence classes). These are obtained through properties of the signature and are: (1) the product topology, obtained by equipping the tensor algebra with the product topology and requiring $S$ to be an embedding, (2) the quotient topology derived from the 1-variation topology on the underlying path space, and (3) the metric topology associated to $d( [ \gamma] ,[ \sigma] ) := \vert\vert \gamma^*-\sigma^*\vert\vert_{1}$ using suitable representatives $\gamma^*$ and $\sigma^*$ of the equivalence classes. The topologies are ordered by strict inclusion, (1) being the weakest and (3) the strongest. Each is separable and Hausdorff, (1) being both metrisable and $\sigma$-compact, but not a Baire space and so neither Polish nor locally compact. The quotient topology (2) is not metrisable and the metric $d$ is not complete. An important function on (unparameterised) path space is the (fixed-time) solution map of a controlled differential equation. For a broad class of such equations, we prove measurability of this map for each topology. Under stronger regularity assumptions, we show continuity on explicit compact subsets of the product topology (1). We relate these results to the expected signature model of [15].


Introduction
A continuous bounded variation path $\gamma : [0, 1] \to V$ taking values in a finite-dimensional vector space $V$ has associated to it a sequence of tensors in the product space $\prod_{i=0}^{\infty} V^{\otimes i}$ obtained by taking iterated integrals of successively higher degree. The resulting sequence, its signature $S(\gamma)$, was first studied by K.T. Chen [4,5] and has, in more recent work [12], been shown to characterise $\gamma$ up to an equivalence relation $\sim_\tau$ on the space of continuous bounded variation paths called tree-like equivalence. The signature map $S$ accordingly becomes well-defined and injective on the $\sim_\tau$-equivalence classes: the space $\mathcal{C}_1$ of so-called unparameterised paths.
The significance of this body of theory has been given additional impetus by recent applications, see e.g. [15], [21], in which the invariance of the signature under $\sim_\tau$ provides a form of dimensional reduction. These applications benefit from wider properties of the signature, particularly the fact that the collection of monomials on the range $\mathcal{S}$ of the signature map coincides with the restriction to $\mathcal{S}$ of linear functionals on the tensor algebra, see [15]. This allows an approximation theory using the signature to be developed, the most basic form of which provides that any continuous function $f$ on a (locally) compact subset $K$ of $\mathcal{C}_1$ can be uniformly approximated by (the restriction of) a linear functional on the tensor algebra. The choice of topology on the space of unparameterised paths, necessary to a complete understanding of this method, is often left unspecified in applications or is otherwise chosen to suit the application at hand. An instance of this is the paper [7], where the authors prove universality and characteristicness for kernels derived from the signature, and in doing so impose the quotient topology derived from the variation topology on the underlying space of parameterised paths.
The purpose of this paper is to broaden this discussion by evaluating and comparing the properties of three topologies which are arrived at by leveraging different properties of $\mathcal{C}_1$. The qualities we emphasise are:
(1) The injectivity of the signature map from $\mathcal{C}_1$ onto a subset of the product space $\prod_{i=0}^{\infty} V^{\otimes i}$. This allows one to equip the range of the signature map with a (subspace) topology, and hence to transfer this topology onto $\mathcal{C}_1$ by requiring the signature map to be an embedding. We refer to this as the product topology.
(2) The fact that tree-like equivalence on the space of continuous bounded variation paths is an equivalence relation, which means that $\mathcal{C}_1$ can be endowed with the quotient topology derived from the 1-variation topology, as in the reference [7] referred to above.
(3) The existence, for each equivalence class $[\gamma]$ in $\mathcal{C}_1$, of a so-called tree-reduced representative [12], which is characterised by having minimal length. By considering the constant-speed parameterised version $\gamma^*$ of this representative, we can define the metric topology on $\mathcal{C}_1$ induced by
$$d([\gamma], [\sigma]) := \|\gamma^* - \sigma^*\|_1.$$
The product and quotient topologies can be seen as opposite extremes for how one can topologise $\mathcal{C}_1$. The former utilises only the topology of the range of the signature map, while the latter relies only on the topology of the domain. The metric topology lies somewhere in the middle, defined through both the topology on the domain and properties of the signature. Variants of (1) may also be of interest, e.g. where a topology on $\mathcal{C}_1$ is induced from a subspace topology on the tensor algebra, but the principle remains the same [6]. We focus on the product topology as it is the weakest one in which all projections of the signature map are continuous. Given the central role played by uniform approximation in a now-growing body of applications, it seems important to establish the basic topological features of these approaches. The main conclusions of Section 3 are captured in the following list:
• All three topologies listed above are separable and Hausdorff.
• The collections of open sets are strictly ordered by inclusion, the product topology being the weakest and the metric topology being the strongest.
• The product topology is metrisable and $\sigma$-compact, but not a Baire space, and so neither Polish nor locally compact.
• A completion of the product topology is given by the subspace $G^* \subset \prod_{i=0}^{\infty} V^{\otimes i}$ of group-like elements.
• The quotient topology is not metrisable.
• The metric d is not complete.
The condition of metrisability is stated as an assumption in a number of recent works [3,11,14] and our results would therefore seem to preclude the use of the quotient topology in these cases. A careful reading of the reference [10] underpinning these results, however, shows that the property of complete regularity, and not metrisability, is the key assumption. A relevant resulting question, which we have not been able to resolve, is whether the quotient topology is completely regular. We will discuss this and related points in more detail throughout the text.
The uniform approximation theory referred to above invites the study of two points. The first is the availability of compact subsets of $\mathcal{C}_1$, which should ideally be explicitly describable, and the second is to understand the classes of continuous functions on these subsets. With respect to both points, the underlying topology has a direct bearing. We investigate this in the final section of the paper, proving that:
• The subset $B(r)$ of unparameterised paths with tree-reduced length bounded by $r < \infty$ is a compact subset of $\mathcal{C}_1$ in the product topology.
• The fixed-time solution of a differential equation $dy_t = f(y_t)\,d\gamma_t$, under suitable regularity and growth conditions on $f$, induces a well-defined function on $\mathcal{C}_1$. This function is continuous on $B(r)$.
The relevance of these results is accounted for by the uses of the signature; physical limits on the ability to record and store data impose a bound on the (tree-reduced) lengths of the paths that can practically be considered. From this point of view, restricting attention to functions on $B(r)$ is a natural step. The second item ensures that a rich class of input-output response pairs are admissible as the true underlying causal relationship, e.g. in regression analysis.
From the perspective of doing probability on $\mathcal{C}_1$, it can be desirable to work with measures defined on the Borel $\sigma$-algebra of a Polish topology; see e.g. the monograph [9] for a detailed overview. The results above exclude having this structure for both the product and the quotient topologies. Nevertheless, the $\sigma$-compactness of the product topology still offers a way to obtain some of the benefits of Polishness; notably, versions of Ulam's and Lusin's Theorems still hold. In Section 4 we illustrate how this approach can be used to validate and extend the framework of the expected signature model proposed in [15]. If a Polish space is genuinely needed, then an alternative is to consider completions. In the case of the product topology, this leads to the subspace of group-like elements $G^*$.
1.1. More general unparameterised paths.
We comment here briefly on the notion of unparameterised paths that we adopt. In this article, as is customary in the literature, see for example [13,7], we work with paths over a fixed pre-defined interval (which we take to be $[0, 1]$ for convenience). A plausible alternative might be to adapt our definitions to take account of paths defined on possibly different compact subintervals of $\mathbb{R}$. While the results for the product topology would remain unchanged, this would result in a different topology for the domain of the signature map, and the conclusions for the quotient and metric topologies would need further definitions to be interpreted. Technical obstacles, for instance, prevent the easy construction of a 1-variation type distance between a path defined on an interval $[a, b]$ and another defined on $[c, d]$.¹
¹ One approach is to consider reparameterisations (non-decreasing surjections) of the interval $[a, b]$ onto $[c, d]$. However, the inverse of a reparameterisation is not guaranteed to be a reparameterisation. Additionally, whilst every path is a continuous reparameterisation of itself run at constant speed, the reverse does not necessarily hold. Combined, these problems pose difficulties in defining a distance that is symmetric and satisfies the triangle inequality.

Signatures and unparameterised paths
We work with spaces of paths defined on the closed interval $[0, 1]$ taking values in a finite-dimensional vector space $V$. We will use $|\cdot|$ to denote a fixed but arbitrary norm which we assume to be derived from an inner product $\langle \cdot, \cdot \rangle$ on $V$. The properties of interest to us are invariant under translation by a constant vector; as such, we will assume throughout that all paths start at the origin $0 \in V$. The notion of finite $p$-variation is well known.
Definition 2.1. Let $1 \le p < \infty$. We denote by $C_p$ the space of paths $\gamma : [0, 1] \to V$ such that $\gamma_0 = 0$ and which have finite $p$-variation in the sense that
$$\|\gamma\|_p := \Big( \sup_{D} \sum_{t_i \in D} |\gamma_{t_{i+1}} - \gamma_{t_i}|^p \Big)^{1/p} < \infty,$$
where the supremum is taken over all partitions $D$ of $[0, 1]$.
The signature of a path will be a central object in the later discussion.
Definition 2.2. Let $1 \le p < 2$ and assume that $\gamma$ belongs to $C_p$. The signature of $\gamma$ is defined to be the element of the product space
$$S_p(\gamma) := (1, S^1(\gamma), S^2(\gamma), \ldots) \in \prod_{i=0}^{\infty} V^{\otimes i},$$
where for every $n = 0, 1, 2, \ldots$ the expression $S^n(\gamma)$ denotes the $n$-fold iterated Young integral
$$S^n(\gamma) := \int_{0 < t_1 < \cdots < t_n < 1} d\gamma_{t_1} \otimes \cdots \otimes d\gamma_{t_n} \in V^{\otimes n},$$
with the convention $S^0(\gamma) := 1$. We write $\mathcal{S}_p := S_p(C_p) \subset \prod_{i=0}^{\infty} V^{\otimes i}$ for the image of the signature map. When $p = 1$ we write $S_1 = S$ and $\mathcal{S}_1 = \mathcal{S}$.
We let $\pi_n : \prod_{i=0}^{\infty} V^{\otimes i} \to \prod_{i=0}^{n} V^{\otimes i}$ denote the canonical projection and we call $S^{(n)}(\gamma) := \pi_n S(\gamma)$ the $n$-step truncated signature of $\gamma$. The map $S^{(n)} : C_p \to \prod_{i=0}^{n} V^{\otimes i}$ is continuous, see [16, Corollary 2.11]. The set $\prod_{i=0}^{\infty} V^{\otimes i}$ becomes an algebra, the tensor algebra, when equipped with an appropriate collection of vector space operations and an associative product (the tensor product, which we shall write implicitly). The definition is standard and we do not repeat it; the reader is referred to [19]. As is usual, we use $T((V))$ when we wish to emphasise the role of the algebra structure in addition to the underlying set $\prod_{i=0}^{\infty} V^{\otimes i}$. The identity element of this algebra is $\mathbf{1} := (1, 0, 0, \ldots)$.
Definition 2.4. The set of group-like elements $G^*$ is the subset of $\prod_{i=0}^{\infty} V^{\otimes i}$ consisting of those $x$ for which $\pi_n x$ belongs to $G^n := S^{(n)}(C_1)$ for every $n$.
Remark 2.5. In other words, $x$ is in $G^*$ if and only if every projection $\pi_n x$ can be realised as the $n$-step truncated signature of a continuous bounded variation path. There are alternative ways to characterise $G^*$. For example, if $\exp : T((V)) \to T((V))$ is the exponential map with respect to the tensor product on $T((V))$, then $G^* = \exp(\mathfrak{g})$, where $\mathfrak{g}$ is the Lie algebra generated by $V$. See Theorem 3.2 of [19] for further equivalent algebraic characterisations.
Lemma 2.6. The product space $\prod_{i=0}^{\infty} V^{\otimes i}$ endowed with the product topology is a Polish space. The set $G^*$ is closed in $\prod_{i=0}^{\infty} V^{\otimes i}$ with respect to this topology and $\mathcal{S} \subseteq G^* = \overline{\mathcal{S}}$.
Proof. A countable product of Polish spaces is a Polish space from general theory. To see that $G^*$ is closed we show that the limit $x$ of any convergent sequence $(x_m)_{m=1}^{\infty}$ in $G^*$ is again in $G^*$. Writing $x_m = (x_{m,0}, x_{m,1}, \ldots, x_{m,j}, \ldots)$, the convergence $\lim_{m \to \infty} x_m = x$ in the product topology holds if and only if $x_{m,j} \to x_j$ in $V^{\otimes j}$ as $m \to \infty$ for every $j = 0, 1, 2, \ldots$ For any $n$ it follows that $\pi_n x_m \to \pi_n x$ as $m \to \infty$, and since it is known that $G^n = \pi_n(G^*)$ is closed in the truncated tensor algebra, see for example [8, Theorem 7.30] or [16, Proposition 2.25], it follows that $\pi_n x$ is also in $G^n$. By definition this implies that $x$ is in $G^*$. To see the final statement, it is clear by definition that $\mathcal{S} \subseteq G^*$. Then if $x$ is in $G^*$, for every $n$ there exists by definition $\gamma_n$ such that $S^{(n)}(\gamma_n) = \pi_n x$. The sequence determined by $x_n := S(\gamma_n)$ lies in $\mathcal{S}$ and converges to $x$ in the product topology.
Remark 2.7. It is well known that the inclusion $\mathcal{S} \subset G^*$ is in fact strict, and the results of Section 3 provide a new (non-constructive) proof.
The tensor algebra plays a special role in the study of the signature because of the way it interacts with a natural binary operation on path space. Recall that concatenation $* : C_p \times C_p \to C_p$ is the binary operation defined by
$$(\gamma * \sigma)_t := \begin{cases} \gamma_{2t}, & t \in [0, 1/2], \\ \gamma_1 + \sigma_{2t-1}, & t \in [1/2, 1]. \end{cases}$$
Another important operation is the unary operation $\overleftarrow{\,\cdot\,} : C_p \to C_p$ which reverses the order of the path:
$$\overleftarrow{\gamma}_t := \gamma_{1-t} - \gamma_1.$$
The following relations hold [5]:
$$S(\gamma * \sigma) = S(\gamma) S(\sigma), \qquad S(\overleftarrow{\gamma}) = S(\gamma)^{-1}, \qquad (2.2)$$
where multiplication is in $T((V))$. Distinct paths can have equal signatures. For example, for any $\gamma \neq o$ we have that $\gamma * \overleftarrow{\gamma} \neq o$, while from (2.2) $S(\gamma * \overleftarrow{\gamma}) = \mathbf{1} = S(o)$. Similarly, suppose that $\tau$ is any parameterisation of $[0, 1]$, i.e. a continuous, non-decreasing surjection of the interval $[0, 1]$ onto itself. Then $S(\gamma \circ \tau) = S(\gamma)$, namely the signature is invariant under reparameterisation. For $1 \le p < 2$ we can define an equivalence relation on $C_p$ by identifying paths with the same signature. One of the main results of [12] is to show that this coincides with the notion of tree-like equivalence and hence to give a complete description of the equivalence classes. We recall that a path $\gamma$ is tree-like if there exists a real tree $T$ such that $\gamma$ admits a factorisation $\gamma = \varphi \circ \rho$ through a pair of continuous maps $\rho : [0, 1] \to T$ and $\varphi : T \to V$, where $\rho(0) = \rho(1)$. We then have the following definition.
Definition 2.8 (Tree-like equivalence). Let $1 \le p < 2$ and assume that $\gamma$ and $\sigma$ are in $C_p$. If $\gamma * \overleftarrow{\sigma}$ is tree-like then we call $\gamma$ and $\sigma$ tree-like equivalent and write $\gamma \sim_\tau \sigma$.
In [12] for the case when $p = 1$, and later in [1] for the case $p > 1$, which even covers weakly geometric rough paths, the following collection of results is proved.
The relation $\sim_\tau$ defines an equivalence relation on $C_p$ which we call tree-like equivalence.
(3) In the case $p = 1$, each $\sim_\tau$-equivalence class contains an element of minimal length. This element is called the tree-reduced representative. It is unique up to its parameterisation.
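The relations $S(\gamma * \sigma) = S(\gamma) S(\sigma)$ and $S(\overleftarrow{\gamma}) = S(\gamma)^{-1}$, together with invariance under reparameterisation, can be checked numerically for piecewise linear paths. The following is a minimal sketch, assuming NumPy and a hypothetical representation of a path as a list of straight-line increments; the helper names (`tensor_exp`, `tensor_mul`, `signature`, `reverse`, `same`) are illustrative and not taken from any library. It uses the standard fact that the signature of a straight line with increment $v$ is the tensor exponential of $v$.

```python
import numpy as np

def tensor_exp(v, depth):
    """Truncated signature of a straight line with increment v: levels v^{(x)k} / k!."""
    v = np.asarray(v, dtype=float)
    levels = [np.array(1.0)]
    for k in range(1, depth + 1):
        levels.append(np.multiply.outer(levels[-1], v) / k)
    return levels

def tensor_mul(a, b):
    """Truncated tensor product: level k is the sum of a_i (x) b_j over i + j = k."""
    return [sum(np.multiply.outer(a[i], b[k - i]) for i in range(k + 1))
            for k in range(len(a))]

def signature(increments, depth):
    """Signature of a concatenation of straight lines, built up with Chen's relation."""
    dim = len(increments[0])
    sig = tensor_exp(np.zeros(dim), depth)  # identity element (1, 0, 0, ...)
    for v in increments:
        sig = tensor_mul(sig, tensor_exp(v, depth))
    return sig

def reverse(increments):
    """Increments of the reversed path: opposite directions in opposite order."""
    return [-np.asarray(v, dtype=float) for v in reversed(increments)]

def same(a, b):
    """Level-by-level comparison of two truncated tensor series."""
    return all(np.allclose(x, y) for x, y in zip(a, b))

gamma = [np.array([1.0, 0.0]), np.array([0.0, 2.0]), np.array([-0.5, 0.3])]
identity = tensor_exp(np.zeros(2), 4)

# gamma followed by its reversal is tree-like: its signature is the identity.
loop_is_trivial = same(signature(gamma + reverse(gamma), 4), identity)

# Splitting the first segment in two only reparameterises the path.
split = [gamma[0] / 2, gamma[0] / 2] + gamma[1:]
reparam_invariant = same(signature(split, 4), signature(gamma, 4))
```

Both flags evaluate to true up to floating-point tolerance, which is the numerical counterpart of the statement that distinct parameterised paths can share a signature.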
2.1. Tree-reduced paths.
It will be useful for us to work with a concrete representative of an equivalence class. When $p = 1$ the tree-reduced representative is a natural choice. It is only necessary to fix the parameterisation.
Definition 2.11. Let $\gamma$ be in $C_1$. We will use $\gamma^*$ to denote the tree-reduced representative of $[\gamma]$ parameterised at constant speed.
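We can manufacture simple examples of paths that are already tree-reduced by considering axis paths defined with respect to an orthonormal basis. Given $v$ in $V$ we let $\gamma_v$ denote the linear path in $V$ given by $\gamma_v(t) = tv$ for $t \in [0, 1]$. For vectors $v_1, \ldots, v_m$ in $V$ we consider the piecewise linear path $\gamma = \gamma_{v_1} * \cdots * \gamma_{v_m}$. When parameterised at constant speed it is defined by
$$\gamma_t = \sum_{j=1}^{i-1} v_j + \frac{t - t_{i-1}}{t_i - t_{i-1}}\, v_i, \qquad t \in [t_{i-1}, t_i],$$
where $t_0 = 0$ and $t_i = \sum_{j=1}^{i} |v_j| \big/ \sum_{j=1}^{m} |v_j|$ for $i = 1, \ldots, m$.
From now on, unless stated otherwise, we assume that all piecewise linear paths as in Example 2.13 are parameterised at constant speed. The following result is known more generally for $C^2$ paths [12] and irreducible piecewise linear paths [17], but the arguments rely delicately on estimates of the hyperbolic development of the path. In the case of axis paths, it is possible to present a simplified and direct proof using elementary tools. The use of axis paths will be fundamental to our subsequent discussion, and so we include a proof for completeness.
Lemma 2.14. Let $\gamma = \gamma_{v_1} * \cdots * \gamma_{v_m}$ be a piecewise linear path as in Example 2.13. Assume further that $\langle v_i, v_{i+1} \rangle = 0$ for every consecutive pair of vectors, $i = 1, \ldots, m - 1$. Then $\gamma$ is tree-reduced.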
Proof. We prove the result indirectly, not through the definition of tree-like equivalence but by using asymptotic properties of the signature. We recall that the projective tensor norm is the largest cross norm on $V^{\otimes n}$. For a tensor $A$ in the algebraic $n$-fold tensor product space it is defined by […]. It can be deduced that if […]. Note here that $\varphi$ is a bounded linear map on $V^{\otimes n}$. Consider $\gamma$ with the constant-speed parameterisation described in Example 2.13. Then, using the previous inequality, we obtain the lower bound […], where […]. The event $B$ is required, since it is possible for $X_n$ to be non-zero even if one of the $N_i$ is odd: the order statistics may "skip" a segment. Their union is $\Omega$, and it can easily be checked that […] on $C$, and therefore that […]. Letting $\{E_i : i = 1, 2, \ldots, m\}$ be a collection of i.i.d. Rademacher random variables and $r = \min_{i=1,\ldots,m} |t_i - t_{i-1}| \in (0, 1)$, simple arguments can be employed to show […]. Using these inequalities together with (2.4) we obtain easily that $\limsup$ […]. For any path $\tilde{\gamma}$ in $C_1$ with length $L$ such that $S(\tilde{\gamma}) = S(\gamma)$ we deduce […], where the second inequality follows from [16, Proposition 2.2], and hence $\gamma$ is tree-reduced.
The following construction can be found in [17]. It illustrates how the class of paths considered in the previous lemma can be used to gain insight into properties of the signature. It will be an integral tool in the next section.

Example 2.15 ([17]). Let $W$ be a two-dimensional vector subspace of $V$ which is identified with $\mathbb{R}^2$ through an orthonormal basis $\{v_i : i = 1, 2\}$. We can define two sequences $\{\rho_n : n = 1, 2, \ldots\}$ and $\{\sigma_n : n = 1, 2, \ldots\}$ of so-called axis paths by $\rho_1 := \gamma_{v_1} * \gamma_{v_2}$, $\sigma_1 := \gamma_{v_2} * \gamma_{v_1}$ and then for $n = 2, 3, \ldots$ by
$$\rho_n := \rho_{n-1} * \sigma_{n-1}, \qquad \sigma_n := \sigma_{n-1} * \rho_{n-1}.$$
If two consecutive line segments $v_i$ and $v_{i+1}$ are positively collinear, then by replacing $\gamma_{v_i} * \gamma_{v_{i+1}}$ with $\gamma_{v_i + v_{i+1}}$, we may write each path in the form required for Lemma 2.14. By virtue of this Lemma, each of these paths is tree-reduced. For every $n$, the paths $\rho_n$ and $\sigma_n$ have length $2^n$ and it was further shown in [17] that the terms in their signatures coincide up to degree $n$, i.e.
$$S^k(\rho_n) = S^k(\sigma_n) \quad \text{for } k = 0, 1, \ldots, n.$$
Moreover, it can be shown that $S^{n+1}(\rho_n)$ and $S^{n+1}(\sigma_n)$ are not equal for any $n$. Consequently, each tree-reduced path $\Gamma_k := \sigma_k * \overleftarrow{\rho_k}$ has length $2^{k+1}$ and satisfies
$$S^{(k)}(\Gamma_k) = \pi_k(\mathbf{1}),$$
since the projection at level $k$ of the algebraic inversion and multiplication depends only on the level $k$ projections of the inputs.
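The agreement of the signatures of $\rho_n$ and $\sigma_n$ up to degree $n$ can be explored numerically. The sketch below assumes NumPy and takes as an assumption the recursion $\rho_n = \rho_{n-1} * \sigma_{n-1}$, $\sigma_n = \sigma_{n-1} * \rho_{n-1}$, which is consistent with the stated lengths $2^n$; the function names are hypothetical. Paths are stored as lists of unit increments and signatures are computed with Chen's relation.

```python
import numpy as np

def tensor_exp(v, depth):
    """Truncated signature of a straight line with increment v."""
    levels = [np.array(1.0)]
    for k in range(1, depth + 1):
        levels.append(np.multiply.outer(levels[-1], v) / k)
    return levels

def tensor_mul(a, b):
    """Truncated tensor product of two series of the same depth."""
    return [sum(np.multiply.outer(a[i], b[k - i]) for i in range(k + 1))
            for k in range(len(a))]

def signature(increments, depth):
    """Signature of a planar piecewise linear path via Chen's relation."""
    sig = tensor_exp(np.zeros(2), depth)
    for v in increments:
        sig = tensor_mul(sig, tensor_exp(v, depth))
    return sig

def axis_path_pair(n):
    """Increments of (rho_n, sigma_n) under the assumed recursion."""
    e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    rho, sigma = [e1, e2], [e2, e1]
    for _ in range(n - 1):
        rho, sigma = rho + sigma, sigma + rho  # list concatenation = path concatenation
    return rho, sigma

def first_differing_level(n, tol=1e-6):
    """Lowest tensor level at which S(rho_n) and S(sigma_n) differ (searched up to n + 1)."""
    rho, sigma = axis_path_pair(n)
    s_rho, s_sigma = signature(rho, n + 1), signature(sigma, n + 1)
    for k in range(n + 2):
        if np.max(np.abs(s_rho[k] - s_sigma[k])) > tol:
            return k
    return None

def loop_is_trivial_to_level(k, tol=1e-6):
    """Check that Gamma_k = sigma_k * reverse(rho_k) has trivial k-step signature."""
    rho, sigma = axis_path_pair(k)
    loop = sigma + [-v for v in reversed(rho)]
    sig = signature(loop, k)
    identity = tensor_exp(np.zeros(2), k)
    return all(np.max(np.abs(s - e)) <= tol for s, e in zip(sig, identity))

levels = [first_differing_level(n) for n in range(1, 5)]
```

Under the assumed recursion, `first_differing_level(n)` returns `n + 1` for the small values of `n` tried here, matching the statement that the signatures coincide up to degree $n$ and differ at degree $n + 1$.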

Topologies on unparameterised path space
Our aim now is to explore the basic properties of three topologies on the space $\mathcal{C}_1$ of unparameterised paths. This will allow us, in the next section, to put forward a framework within which one can understand uniform approximation results for continuous functions defined on compact subsets of $\mathcal{C}_1$. As noted in [12], there is no canonical topology, but there are at least three principled approaches to constructing one:
(1) To use the injectivity of the signature map $S : \mathcal{C}_1 \to \prod_{i=0}^{\infty} V^{\otimes i}$. By requiring that $S$ is a topological embedding, any topology on the range $\mathcal{S}$ induces a unique topology on $\mathcal{C}_1$. We focus subsequently on the subspace topology on $\mathcal{S}$ with respect to the product topology on $\prod_{i=0}^{\infty} V^{\otimes i}$. We denote this by $(\mathcal{C}_1, \chi_{pr})$ and refer to it as the product topology.
(2) To use the quotient topology on $\mathcal{C}_1$ inherited from the 1-variation norm topology on the space $C_1$ of parameterised paths. We denote this topological space by $(\mathcal{C}_1, \chi_\tau)$.
(3) To define a metric using the tree-reduced representatives parameterised at constant speed:
$$d([\gamma], [\sigma]) := \|\gamma^* - \sigma^*\|_1. \qquad (3.1)$$
It is easily seen that this defines a metric on $\mathcal{C}_1$. The associated metric topology will be denoted by $(\mathcal{C}_1, \chi_d)$.
Let us now show some simple properties of these topological spaces.
Proof. Since $V$ is Hausdorff, so is the product topology on $\prod_{i=0}^{\infty} V^{\otimes i}$. Hence singletons are closed in $\prod_{i=0}^{\infty} V^{\otimes i}$, and consequently singletons are closed in $\chi_{pr}$, since $S$ is a topological embedding.
The following proposition shows that the three topologies we have proposed are distinct. In fact, the collections of open sets are strictly ordered, with $\chi_{pr}$ defining the weakest and $\chi_d$ the strongest of the topologies.
Proposition 3.3. We have the strict inclusions $\chi_{pr} \subset \chi_\tau \subset \chi_d$.
Proof. Letting $\|\cdot\|_{V^{\otimes k}}$ be a family of norms on the tensor powers $V^{\otimes k}$, a subbase for $\chi_{pr}$ is given by the collection of sets $\mathcal{V}$ of the form […]. For any such set, the preimage […]. This latter set is closed in $C_1$ by Lemma 3.2 and the fact that $\chi_{pr} \subseteq \chi_\tau$. The limit $\gamma$ is then also in this set, which is a subset of $\pi^{-1}(A)$.
To prove that $\chi_\tau \subseteq \chi_d$ we let $[\gamma]$ be in $\mathcal{C}_1$ and prove that every open neighbourhood […]. The tree-reduced representative $\gamma^*$ belongs to $[\gamma]$ and consequently there exists $\delta > 0$ such that […]. To prove the strict inclusion we find a neighbourhood of the constant path […]. To do so, we consider as above a two-dimensional subspace of $V$ spanned by an orthonormal set $\{v_1, v_2\}$. Using this basis we define a family of axis paths given by: […]. For every $\epsilon > 0$ the curve $\gamma_\epsilon$ is tree-reduced by Lemma 2.14. On the other hand, $\gamma_0$ is tree-like equivalent to the constant path $o$. It follows that […], while an easy calculation using (2.3) shows that $\|\gamma_\epsilon - \gamma_0\|_1 \le 6\epsilon$. Hence any $\chi_\tau$-neighbourhood of $[o]$ must contain $[\gamma_\epsilon]$ for some $\epsilon > 0$. In view of (3.2), the set $B_d([o], \delta)$ can never contain such a neighbourhood whenever $\delta < 2$.
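The mechanism behind the strictness of the inclusion of the quotient topology in the metric topology can be illustrated with a small computation. The sketch below assumes NumPy and uses a hypothetical family chosen for illustration (it is not claimed to be the family used in the proof): $\gamma_\epsilon$ has increments $v_1, \epsilon v_2, -v_1, -\epsilon v_2$ on four fixed sub-intervals of $[0, 1]$, and $\gamma_0$ has increments $v_1, 0, -v_1, 0$ on the same sub-intervals, so that $\gamma_0$ is tree-like. Consecutive increments of $\gamma_\epsilon$ are orthogonal, so Lemma 2.14 applies and $d([\gamma_\epsilon], [o])$ equals the length of $\gamma_\epsilon$.

```python
import numpy as np

def length(increments):
    """Length (1-variation) of a piecewise linear path given by its increments."""
    return float(sum(np.linalg.norm(v) for v in increments))

def consecutive_orthogonal(increments, tol=1e-12):
    """Hypothesis of Lemma 2.14: consecutive increments are orthogonal."""
    return all(abs(np.dot(a, b)) <= tol for a, b in zip(increments, increments[1:]))

def one_variation_gap(inc_a, inc_b):
    """1-variation of the difference of two paths that are linear on the same
    sub-intervals of [0, 1]; a zero increment means the path pauses there."""
    return float(sum(np.linalg.norm(a - b) for a, b in zip(inc_a, inc_b)))

def compare(eps):
    """Return (1-variation gap between representatives, metric distance to [o])."""
    v1, v2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    zero = np.zeros(2)
    gamma_eps = [v1, eps * v2, -v1, -eps * v2]  # hypothetical family, tree-reduced
    gamma_0 = [v1, zero, -v1, zero]             # out and back along v1: tree-like
    assert consecutive_orthogonal(gamma_eps)
    gap = one_variation_gap(gamma_eps, gamma_0)  # distance between representatives
    d_to_o = length(gamma_eps)                   # d([gamma_eps], [o]) = tree-reduced length
    return gap, d_to_o

gap, d_to_o = compare(0.01)
```

For this family the gap between representatives is $2\epsilon$ while the metric distance to $[o]$ is $2 + 2\epsilon$, so the classes $[\gamma_\epsilon]$ approach $[o]$ in the quotient topology but stay outside every $d$-ball of radius less than $2$.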
The inclusions $\chi_{pr} \subset \chi_\tau \subset \chi_d$ and the fact that $\chi_{pr}$ is the subspace topology of a Hausdorff space immediately yield the following corollary. […] $[\gamma]^{-1} := [\overleftarrow{\gamma}]$ forms a topological group. This is not the case for the metric topology.
We provide below a proof from first principles; however, the continuity may also be seen from existing results in the literature, for example by using that $\pi_n G^*$ is a closed Lie subgroup of the closed (with respect to the product topology) set $\pi_n \tilde{T}((V))$, where $\tilde{T}((V)) := \{a \in T((V)) : a_0 = 1\}$. The manifold topology of the Lie group is the same as the subspace topology of the product topology; for further details we refer the reader to [8, Chapter 7].

Proof. We first show continuity of the group operations on $(\mathcal{C}_1, \chi_{pr})$. For $[\cdot]^{-1}$, we define the following map on $\prod_{i=1}^{\infty} V^{\otimes i}$, which when restricted to $\mathcal{S}$ is the algebraic inversion: […]. By definition of the subspace topology and the fact that $\mathcal{S} \subseteq \psi^{-1}(\mathcal{S})$, it suffices to show continuity of $\psi$ on the tensor algebra. Moreover, by definition of the product topology it is enough to show continuity in each factor. Since the projection onto $V^{\otimes n}$ of $\psi(a)$ depends only on $\pi_n a$, we may consider the factorisation […]. By continuity of $\pi_n$ we only need to show continuity of $\psi_n$. This follows from the continuity of the tensor product, the canonical isomorphisms $V^{\otimes k} \otimes V^{\otimes l} \cong V^{\otimes (k+l)}$, and the continuity of addition. The continuity of group multiplication follows similarly.
We now prove discontinuity of group multiplication with respect to the metric topology. Let $\{v_1, v_2\}$ be orthonormal vectors and define the two sequences of axis paths […]. For each $n$, the path $\rho_n$ (resp. $\sigma_n$) is tree-reduced so that $\rho_n^* = \rho_n$ (resp. $\sigma_n^* = \sigma_n$). Moreover the concatenation $\rho_n * \sigma_n$ is tree-reduced and parameterised at constant speed. It follows that […]. On the other hand, for every $n$ […]. Hence multiplication is not continuous with respect to $\chi_d$.
Remark 3.6. The question of continuity of the group operations on $(\mathcal{C}_1, \chi_\tau)$ is of particular interest. Showing continuity would imply that the quotient topology is completely regular, since every topological group is uniformisable (and every uniform space is completely regular).

Complete metrisability of candidate topologies?
We start by answering this question in the negative for the product topology (C 1 , χ pr ).
Proof. The proof has two parts: we first show that every non-empty open set in $\chi_{pr}$ is unbounded with respect to the metric $d$, and then that $\mathcal{C}_1$ may be written as the countable union of closed sets that are bounded with respect to $d$.
Consider the two-dimensional subspace of $V$ spanned by two orthonormal vectors $\{v_1, v_2\}$, and let $(\Gamma_k)_{k=1}^{\infty}$ be the sequence of axis paths defined in Example 2.15. Each $\Gamma_k$ is tree-reduced by Lemma 2.14, thus $d([\Gamma_k], [o]) = \|\Gamma_k\|_1 = 2^{k+1}$. As shown in the proof of Proposition 3. […] which is continuous by Proposition 3.5 and bijective with its image $\mathcal{V} := \lambda(U)$. The inverse map is also continuous by Proposition 3.5. Therefore $\lambda(U)$ is an open neighbourhood of $[o]$. If $U$ were bounded in $d$ from $[o]$ by some constant $K > 0$, then by construction of $\lambda$, the length of the tree-reduced representative of every $[\sigma]$ in $\mathcal{V}$ would be at most $2K$. This would imply that $\mathcal{V}$ is bounded in $d$, a contradiction.
For any $r > 0$, define the set
$$B(r) := \{ [\gamma] \in \mathcal{C}_1 : \|\gamma^*\|_1 \le r \}.$$
We will show later, in Proposition 4.2, that $B(r)$ is compact in $\chi_{pr}$, and so closed by the Hausdorff property. Since each $B(r)$ is bounded in $d$, it must have empty interior by the preceding. Writing
$$\mathcal{C}_1 = \bigcup_{n=1}^{\infty} B(n),$$
we see that $\mathcal{C}_1$ is the countable union of nowhere dense sets and so is not a Baire space.
Remark 3.8. It can also be shown that every open set in $\chi_\tau$ is unbounded in $d$, though we do not include a proof here.
As a consequence of the above, and the Baire Category Theorem, we obtain the following corollary.
The following is one of the main results of this paper.
Theorem 3.10. The three topologies on the space $\mathcal{C}_1$ of unparameterised paths are separable and have the following properties.
(3) The metric $d$ defined by (3.1) is not complete.
Proof. We will show separability of $\chi_d$ and use the inclusions […] $\prod_{i=0}^{\infty} V^{\otimes i}$ with the product topology is a Polish space. It follows that $\chi_{pr}$ coincides with the metric topology of the restriction of any metric which generates the product topology on $\prod_{i=0}^{\infty} V^{\otimes i}$. Since $\mathcal{C}_1$ with the product topology is not a Baire space, the Baire Category Theorem implies that no metric generating $\chi_{pr}$ is complete.
We will prove item 2 using the fact that any metrisable space must be both first countable and regular [18]. By assuming first-countability we show that regularity cannot hold. Thus we assume that there exists a countable neighbourhood basis $\{U_i\}_{i=1}^{\infty}$ of $[o]$ which, without loss of generality, is assumed to consist of open sets. As before, we consider sequences of axis paths in a two-dimensional subspace of $V$ defined relative to two orthonormal vectors $\{v_1, v_2\}$. The first sequence will consist of tree-like paths; for every $n \in \mathbb{N}$ we take […].
Every $\gamma_n$ belongs to the equivalence class $[o]$ and hence […].
We introduce a strictly decreasing sequence of positive real numbers $\{a_n\}_{n=1}^{\infty}$ by […]. We let $\Gamma_n$ be the path […], where $\epsilon_n = a_n / 6$. For every $n$, the path $\Gamma_n$ is tree-reduced by Lemma 2.14 and has length $\|\Gamma_n\|_1 = 2n + 2\epsilon_n$. By the same argument as in the proof of Proposition 3.3 these properties yield that […]. On the other hand, we can obtain the estimate […]. The collection $\{U_n\}_{n=1}^{\infty}$ is a neighbourhood basis at $[o]$, and therefore any set $U$ in $\chi_\tau$ which contains $[o]$ must contain $U_k$ as a subset for some $k$. Using the estimate (3.3) above and the definition of the sequence $\{a_n\}_{n=1}^{\infty}$ we have that $\Gamma_k \in B(\gamma_k, a_k)$ and therefore $[\Gamma_k] \in U_k$. In other words, we have shown that any neighbourhood of $[o]$ must have non-empty intersection with $F$ and therefore $(\mathcal{C}_1, \chi_\tau)$ cannot be regular.
To prove the final item we define another sequence of paths […] and there exists a constant $c > 0$ such that for any $1 \le n < m$ we have […].
As mentioned in the introduction, several recent references, see for example [3,11,14], state metrisability as an assumption. This premise may be traced to [7], which itself derives from an application of Theorems 2.4, 3.1, and 4.6 of Giles in [10]. The results of Giles are refashioned as Theorem 2.6 in [7] to include metrisability as a condition. A careful examination of the underlying reference shows that metrisability is not necessary and, indeed, that the first two assertions of Theorem 2.6 hold for any topological space. The third and final assertion of Theorem 2.6 relies on Theorem 4.6 of [10], which is proved under the hypothesis that the space is completely regular. It is thus of interest to know whether the quotient topology has this property (or, equivalently, if it is uniformisable). With respect to this question, our proof of non-metrisability of the quotient topology does not yield an answer, except that first countability would then imply a lack of (complete) regularity. Alternatively, since $\chi_\tau$ is Hausdorff, local compactness would imply complete regularity.
The non-metrisability of the quotient topology may present challenges when using it for advanced applications in probability and stochastic analysis. As pointed out in the monograph [9], many probability measures in practice are Borel measures on Polish spaces. This assumption greatly simplifies the analysis in many applications by circumventing the complexities of working with general topological spaces. The development of theory reflects this. Useful tools are available in this setting, including the facts that finite Borel measures on Polish spaces are tight (Ulam's Theorem), any Borel measurable function on such a space is continuous on a large compact set (Lusin's Theorem), and a family of Borel measures is tight if and only if it is relatively compact in the topology of weak convergence (Prokhorov's Theorem). In the case of the product topology $\chi_{pr}$, Ulam's Theorem, Lusin's Theorem, and one direction of Prokhorov's Theorem still hold. In the next section we illustrate how these methods can be used to support an analysis of the expected signature model of [15].
Nevertheless, it may still be desirable to identify a Polish topology on $\mathcal{C}_1$. The previous theorem excludes $\chi_\tau$ as a possibility, while $\chi_{pr}$ can be metrised by using any metric which induces the topology on the product space. Taking the completion of $\mathcal{C}_1$ with respect to any such metric is a direct way to generate a Polish space. Alternatively, the completion of $\mathcal{C}_1$ with respect to the metric $d$ will also generate a Polish space, should a stronger topology be desired. An interesting question we have not answered is whether $\chi_d$ is completely metrisable.
Proposition 3.11. Let $\rho$ be any metric that induces the product topology on $\prod_{i=0}^{\infty} V^{\otimes i}$, and let $\rho|_{\mathcal{S}}$ denote its restriction to $\mathcal{S}$. Then the set of group-like elements $G^*$ equipped with the metric $\rho|_{G^*}$ is a valid $\rho|_{\mathcal{S}}$-completion of $(\mathcal{C}_1, \chi_{pr})$.
Proof. This follows more or less immediately from Lemma 2.6.
Definition 3.12. We denote the topological space obtained in Proposition 3.11 by $(\overline{\mathcal{C}}_1, \chi_{pr})$, and the $d$-completion by $(\overline{\mathcal{C}}_1, \chi_d)$.
In contrast to $(\overline{\mathcal{C}}_1, \chi_{pr})$, it is not clear whether there is a natural space with which to identify the $d$-completion of $(\mathcal{C}_1, \chi_d)$.

Uniform approximation, linear regression and the Expected Signature Model
The contemporary use of the signature in applications in regression analysis, and its wider use in statistical and machine learning applications, is in large part underpinned by the following fundamental result.
Theorem 4.1 (The Fundamental Theorem of Uniform Approximation by the Signature, [15]). Let $(\mathcal{C}_1, \chi)$ be the space of unparameterised paths equipped with a topology $\chi$ for which $S : \mathcal{C}_1 \to T((V))$ is continuous. Let $\Phi : \mathcal{C}_1 \to \mathbb{R}$ be a continuous function on a compact subset $K$ of $\mathcal{C}_1$. Then for every $\epsilon > 0$, there exists a linear functional $L$ on $T((V))$ such that
$$\sup_{[\gamma] \in K} |\Phi([\gamma]) - L(S(\gamma))| < \epsilon.$$
A version of this theorem has appeared in various guises in earlier work, the first we believe being [15]. The choice of topology is however rarely addressed explicitly; an exception, as noted earlier, is the paper [7] where the quotient topology is selected. The matter of which space to work with in the context of Theorem 4.1 seems to merit some consideration. Our aim is not to advocate for any particular choice, but to audit our selected choice of candidates. We will prioritise three questions:
(1) How readily can compact subspaces of the chosen topology be found?
(2) Given a compact subspace, how easy is it to exhibit continuous functions on these subspaces?
(3) To what extent do these pairs of sets (compact subspaces and their set of continuous functions) relate to the practical use of Theorem 4.1?
We address these points below. Our focus will mainly be on the product topology, where it is simplest to provide positive answers to these questions.
[…] for convenience. We write $\gamma$ for the limit and note from (4.1) that $\|\gamma\|_1 \le r$ and therefore $\|\gamma^*\|_1 \le r$, so that $[\gamma]$ is in $B(r)$. Let $1 < p < 2$; then a standard inequality gives […] is a Cauchy sequence and thus convergent to $\gamma$ in $p$-variation. For each $m$ the map […]
Proof. The inclusion map $\iota : \mathcal{S} \to G^*$ is continuous, and so $B(r)$ is a compact subset of $G^*$ for every $r > 0$. Since $G^*$ is metrisable, each $B(r)$ is closed and measurable. Thus $\mathcal{S}$ is a countable union of measurable sets, hence it too is measurable.
Remark 4.4. It is not difficult to exhibit compact subsets $K \subset \mathcal{C}_1$ which are not contained in any $B(r)$, for $r > 0$. For instance, by using the example of Proposition 3.3, the set […] is compact since any (non-trivial) sequence has a subsequence converging to $[o]$ in the product topology. On the other hand, each of the tree-reduced paths $\Gamma_k$ has length $2^{k+1}$.
An important example of a function on C 1 is the solution of an ordinary differential equation. Suppose that W is a finite-dimensional vector space. Let T(W) denote the space of smooth vector fields on W and suppose that f : V → T(W) is linear, then we can solve uniquely the differential equation
(4.2) $dy_t = f_{d\gamma_t}(y_t)$, started at $y_0$.
To relate this to the signature we introduce the following notation.
Notation 4.5. Let f : V → T(W) be linear and let I : W → W be the identity map. Let D(W) denote the space of smooth differential operators on W and for k ∈ N let f^(k) : V^{⊗k} → D(W) be the unique linear map which is determined by
We can prove the following.
Proposition 4.7. Suppose that W is a finite-dimensional vector space. Let f : V → T(W) be linear and assume that there exists C < ∞ such that the bound (4.4) holds for every k ∈ N, where ||f^(k) I(y)|| denotes the operator norm of the linear map f^(k) I(y). Define a function (the Itô-map) by Ψ y0,f : C 1 → W, γ ↦ y 1, where y is the unique solution over [0, 1] of the differential equation (4.2). Then
(1) The function Ψ y0,f is constant on every equivalence class of ∼ τ and, for every γ in C 1, Ψ y0,f (γ) is given by the convergent series
(4.5) $\Psi_{y_0,f}(\gamma) = y_0 + \sum_{k=1}^{\infty} f^{(k)}(S_k(\gamma))\, I(y_0)$.
(2) The Itô-map is a well-defined function from C 1 into W. For every r > 0, the restriction of this function to B(r) is continuous with respect to the topology χ pr on C 1.
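As a numerical illustration of item (1), the following minimal Python sketch checks that inserting an excursion which is immediately retraced (a tree-like piece) leaves the truncated signature of a piecewise linear path unchanged. The helper names (`segment_signature`, `chen_product`, `signature`), the truncation depth and the two example paths are assumptions made for illustration and do not come from the text; the sketch relies only on Chen's identity and on the fact that a straight segment with increment v has signature exp(v) in the tensor algebra.

```python
import numpy as np

def segment_signature(delta, depth):
    """Truncated signature of a straight segment: level k is delta^{(x)k} / k!."""
    levels = [np.array(1.0)]
    for k in range(1, depth + 1):
        levels.append(np.multiply.outer(levels[-1], delta) / k)
    return levels

def chen_product(a, b):
    """Truncated tensor product: level k is the sum over j of a_j (x) b_{k-j}."""
    return [sum(np.multiply.outer(a[j], b[k - j]) for j in range(k + 1))
            for k in range(len(a))]

def signature(points, depth):
    """Truncated signature of the piecewise linear path through `points`."""
    increments = np.diff(np.asarray(points, dtype=float), axis=0)
    sig = segment_signature(increments[0], depth)
    for delta in increments[1:]:
        # Chen's identity: concatenating paths multiplies their signatures.
        sig = chen_product(sig, segment_signature(delta, depth))
    return sig

# Hypothetical example paths in R^2: the second one makes a detour from (1, 0)
# to (1.5, 0.7) and immediately retraces it, so the two are tree-like equivalent.
base = [(0, 0), (1, 0), (1, 1)]
with_excursion = [(0, 0), (1, 0), (1.5, 0.7), (1, 0), (1, 1)]

depth = 4
max_gap = max(
    float(np.max(np.abs(s - t)))
    for s, t in zip(signature(base, depth), signature(with_excursion, depth))
)
print(max_gap)  # expected to be at the level of floating-point rounding
```

Any function that factors through the signature, such as the series in item (1), therefore takes the same value on both paths up to rounding.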
Remark 4.8. From (4.3) we can see that condition (4.4) will hold if the derivatives
Proof. The signature is invariant on the tree-like equivalence classes and hence so will be the function Ψ y0,f once we prove (4.5). To see this, we observe first by iterated use of the change-of-variable formula that for any N ≥ 1 we have
so that indeed (4.5) holds.
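The series representation of the Itô-map can be checked numerically in the simplest special case of linear vector fields. The sketch below is an illustration under stated assumptions, not the construction of the text: it takes V = R^2, W = R^2 and f_v(y) = (v^1 A_1 + v^2 A_2) y for two hypothetical matrices A_1, A_2, for which Picard iteration gives the word (i_1, ..., i_k) the coefficient A_{i_k} ... A_{i_1} y_0 (this ordering is our assumption about how f^(k) acts on I). For a piecewise linear path the exact solution is a product of matrix exponentials, which gives an independent reference value. All names and numerical values are illustrative.

```python
import itertools
import numpy as np

def segment_signature(delta, depth):
    """Truncated signature of a straight segment: level k is delta^{(x)k} / k!."""
    levels = [np.array(1.0)]
    for k in range(1, depth + 1):
        levels.append(np.multiply.outer(levels[-1], delta) / k)
    return levels

def chen_product(a, b):
    """Truncated tensor product: level k is the sum over j of a_j (x) b_{k-j}."""
    return [sum(np.multiply.outer(a[j], b[k - j]) for j in range(k + 1))
            for k in range(len(a))]

def signature(points, depth):
    increments = np.diff(points, axis=0)
    sig = segment_signature(increments[0], depth)
    for delta in increments[1:]:
        sig = chen_product(sig, segment_signature(delta, depth))
    return sig

def expm_series(M, terms=40):
    """Matrix exponential by its power series (adequate for small matrices)."""
    out, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for n in range(1, terms):
        term = term @ M / n
        out = out + term
    return out

def ito_map_series(sig, A, y0):
    """Truncated series y0 + sum_k sum_words A[i_k] ... A[i_1] y0 * S^(i_1..i_k)."""
    y = y0.copy()
    for k in range(1, len(sig)):
        for word in itertools.product(range(len(A)), repeat=k):
            v = y0
            for i in word:  # A[i_1] acts first, A[i_k] last
                v = A[i] @ v
            y = y + sig[k][word] * v
    return y

# Hypothetical data: two matrices, a starting point and a short piecewise linear path.
A = [np.array([[0.0, 1.0], [-1.0, 0.0]]), np.array([[0.5, 0.0], [0.0, -0.5]])]
y0 = np.array([1.0, 0.0])
points = np.array([[0.0, 0.0], [0.5, 0.0], [0.5, 0.5], [0.2, 0.7]])

y_series = ito_map_series(signature(points, depth=12), A, y0)

# Reference value: on each straight segment the equation is linear with constant
# coefficients, so the flow is a matrix exponential.
y_exact = y0.copy()
for delta in np.diff(points, axis=0):
    y_exact = expm_series(delta[0] * A[0] + delta[1] * A[1]) @ y_exact

error = float(np.max(np.abs(y_series - y_exact)))
print(error)
```

The truncation depth 12 is chosen so that the factorial decay of the signature terms makes the remainder negligible for this short path; longer paths would need a larger depth.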

4.2. Revisiting the expected signature model in regression analysis
The chief motivation for Theorem 4.1 is to provide a theoretical justification for the so-called expected signature model introduced by Levin, Lyons and Ni in [15]. To explore the relation between this model and the foundations developed above, suppose that Γ is a random variable taking values in the space (S, χ pr). Let Y be another random variable, defined on the same probability space as Γ, which takes values in a finite-dimensional vector space W. Then the goal of regression analysis is to learn the conditional expectation $E[Y \mid \Gamma]$, where Y is interpreted as the response of some system to the input Γ. Another way of saying this is that we want to approximate the Borel-measurable function f : S → W defined by
(4.6) $f(\Gamma) = E[Y \mid \Gamma]$.
Suppose the law of Γ, a Borel probability measure on S, is denoted by µ. Then, as we will see in Corollary 4.10, by a version of Lusin's Theorem [2], the function f in (4.6) is almost continuous in the sense that for any δ > 0 there exists a compact set K = K δ such that µ(S \ K) < δ and such that f is continuous on K. By Theorem 4.1, it is then reasonable to adopt the model Y = L(Γ) + ε, where L is the restriction to S of a linear function from T((V)) to W and ε is a W-valued random variable satisfying E[ε | Γ] = 0. This obviates the need to find an explicit compact set and continuous function relating the independent and dependent variables.
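To make the model Y = L(Γ) + ε concrete, here is a toy Python sketch of a least-squares fit of a linear functional on truncated signature features. It is an illustration under our own assumptions and not the procedure of [15]: the paths are hypothetical random piecewise linear paths in R^2, the signature is truncated at level 2, the response is chosen to be the Lévy area (an exact linear functional of level 2), and the noise ε is set to zero so that the fit can be checked exactly. The function name `signature_level2` is illustrative.

```python
import numpy as np

def signature_level2(points):
    """Flattened signature levels 0, 1 and 2 of the piecewise linear path through `points`."""
    d = points.shape[1]
    s1 = np.zeros(d)
    s2 = np.zeros((d, d))
    for delta in np.diff(points, axis=0):
        # Chen's identity for appending one straight segment with increment delta.
        s2 += np.outer(s1, delta) + np.outer(delta, delta) / 2.0
        s1 += delta
    return np.concatenate(([1.0], s1, s2.ravel()))

rng = np.random.default_rng(0)

# Hypothetical sample of 200 random piecewise linear paths with 6 vertices each.
paths = [np.cumsum(rng.normal(size=(6, 2)), axis=0) for _ in range(200)]
X = np.stack([signature_level2(p) for p in paths])

# Columns of X: 0 is the constant term, 1-2 are level 1, and 3-6 are the level-2
# words (1,1), (1,2), (2,1), (2,2). The response is the Levy area, with no noise.
y = 0.5 * (X[:, 4] - X[:, 5])

# Least-squares estimate of the linear functional L restricted to levels <= 2.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
max_residual = float(np.max(np.abs(X @ coef - y)))
print(max_residual)
```

In practice the response is not an exact linear functional of finitely many signature terms, so the truncation level and any regularisation would have to be chosen from data; the sketch only shows the shape of the linear model.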
The typical case is where Γ = S(γ) for some stochastic process γ in C 1, so that the probability measure µ = S_* µ_γ is the push-forward of the law of γ under the signature map, and where Y is the solution to an ordinary differential equation driven by γ. The following result describes conditions for the well-definedness and measurability of the functions C 1 → W which result from this construction.
Proposition 4.9. Let W be a finite-dimensional vector space. Suppose that f : V → T(W) is a linear function which satisfies the following conditions:
(1) For every v in V, f v : W → W is Lipschitz continuous.
(2) For every R > 0 there exists a finite positive C = C(R) such that for every k in N the following bound holds
Then the Itô map γ ↦ y 1 derived from the ordinary differential equation $dy_t = f_{d\gamma_t}(y_t)$ starting at y 0 is invariant with respect to the tree-like equivalence relation on C 1 and induces a well-defined Borel measurable function Ψ y0,f : C 1 → W, where C 1 is equipped with any of the candidate topologies. Additionally, for χ pr and χ d, there exists a Borel measurable function Ψy0,f : C1 → W which agrees with Ψ y0,f on C 1.
A general version of Lusin's Theorem yields the following corollary for the product topology χ pr.
Corollary 4.10.Consider the setup as in Proposition 4.9, and let µ be a Borel probability measure on (C 1 , χ pr ).Then, for every δ > 0, there is a compact set K ⊂ C 1 such that µ(C 1 \ K) < δ and Ψ y0,f | K is continuous.
Proof. Since C 1 is σ−compact with respect to χ pr, µ is tight. Separability of (C 1, χ pr) and [2, Proposition 7.2.2] then imply µ is a Radon measure. Measurability of Ψ y0,f and Lusin's Theorem for finite Radon measures [2, Theorem 7.1.13] show the existence of compact sets with the desired property.

Definition 2.10. Let γ be in C 1, then we denote the ∼ τ equivalence class of γ by [γ]. The quotient space C 1 τ and hence that χ pr ⊆ χ τ. To see the strict inclusion we consider a two-dimensional subspace W of V spanned by two orthonormal vectors {v 1, v 2}. In this basis we consider the sequence of axis paths (Γ k)_{k=1}^∞ constructed in Example 2.15. Define the set A := {[Γ k] : k = 1, 2, . . .} ⊂ C 1. Then A is not closed in the product topology since for every m we have S m (Γ k) → 0 as k → ∞ while A does not contain the equivalence class of tree-like paths [o]. On the other hand, A is closed in the quotient topology because any sequence (γ n)_{n=1}^∞ in π^{−1}(A) which converges to γ in C 1 must be such that sup n ||γ n|| 1 < ∞. Using the fact that each Γ k is tree-reduced and ||Γ k|| 1 = 2^{k+1} then shows that there exists N such that {γ

Corollary 3.4. All three candidate topologies are Hausdorff.
Proposition 3.5. (C 1, χ pr) with the operations [γ] • [σ] := [γ * σ] and [γ] 3, [Γ k] converges to [o] in χ pr, and so every open neighbourhood of [o] contains infinitely many terms of the sequence (Γ k)_{k=1}^∞. Hence every open neighbourhood of [o] is unbounded in d. Now let [γ] ∈ C 1 and U be any open neighbourhood of γ. Define the map for every k ∈ N and since χ pr ⊂ χ d, the only possible limit point of the sequence is [o]. This cannot happen owing to d([o], [ρ n]) ≥ 2, which holds for all n.

Proof. The product topology is metrisable and hence it suffices to prove sequential compactness. Let ([γ n])_{n=1}^∞ be a sequence in B(r) then, by definition of the path γ* and the set B(r), the sequence (γ* n) satisfies |γ* n (t) − γ* n (s)| ≤ r(t − s) for all s ≤ t in [0, 1] and n ∈ N; that is, the paths are equicontinuous. The Arzelà-Ascoli theorem gives a uniformly convergent subsequence which we again call (γ* n)
is continuous and it therefore holds that S m (γ* n) → S m (γ) as n → ∞. We have shown that [γ n] → [γ] in B(r) as n → ∞ in the product topology.

Corollary 4.3. The range of the signature S is measurable with respect to the Borel sigma-algebra on G*.

Proof. The ordinary differential equation has a unique solution for every γ in C 1. A repetition of the same argument in the proof of Proposition 4.7 leads to the estimate
$$y_1 = y_0 + \sum_{k=1}^{N} f^{(k)}(S_k(\gamma))\, I(y_0) + E_{N+1}, \quad \text{where } |E_N| \le C(|y_0| + L_\gamma)^N \frac{L_\gamma^N}{N!},$$
which allows one to deduce invariance on the equivalence classes by taking the limit N → ∞. To see that the resulting function Ψ y0,f : C 1 → W is measurable with respect to B(C 1), the Borel sigma-algebra of C 1, and B(W), we notice that it is the pointwise limit of the functions L N ∘ S, where L N denotes the restriction to S of the linear function
$$L_N : (a_0, a_1, \ldots) \mapsto y_0 + \sum_{k=1}^{N} f^{(k)}(a_k)\, I(y_0) \in W.$$
The existence of a measurable extension follows from the fact that (W, B(W)) is a Polish space; see [20, Theorem 1], noting that B(C 1) = {A ∩ C 1 : A ∈ B( C1 )}.
are the order statistics of an i.i.d. sample of 2n uniform [0, 1] random variables defined on a probability space (Ω, F, P). Let N i, for i = 1, . . ., m, denote the number of realisations in this sample which are contained in the interval [t i−1, t i]. We define three events by
conclude separability of χ pr and χ τ. By [8, Proposition 1.31, Corollary 1.35], the space of absolutely continuous paths with respect to ||•|| 1 is separable. Consider the subspace A of C 1 of tree-reduced paths parameterised at constant speed; that is, all representatives of equivalence classes seen by the metric d. Since every path parameterised at constant speed is Lipschitz, we may consider A as a subspace of absolutely continuous paths. Since any subspace of a separable metric space is separable, A is separable with respect to ||•|| 1. This is equivalent to the separability of χ d.
Item 1 is a consequence of the definition of χ pr and Proposition 3.7. The space ∞ i=0
=y , for a suitable class of smooth test functions ϕ.
Remark 4.6. The right-hand side is a k th-order differential operator. If (y 1, . . ., y n) denotes coordinates on W and f v