Rotatable random sequences in local fields

An infinite sequence of real random variables $(\xi_1, \xi_2, \dots)$ is said to be rotatable if every finite subsequence $(\xi_1, \dots, \xi_n)$ has a spherically symmetric distribution. A celebrated theorem of Freedman states that $(\xi_1, \xi_2, \dots)$ is rotatable if and only if $\xi_j = \tau \eta_j$ for all $j$, where $(\eta_1, \eta_2, \dots)$ is a sequence of independent standard Gaussian random variables and $\tau$ is an independent nonnegative random variable. Freedman's theorem is equivalent to a classical result of Schoenberg which says that a continuous function $\phi : \mathbb{R}_+ \to \mathbb{C}$ with $\phi(0) = 1$ is completely monotone if and only if $\phi_n: \mathbb{R}^n \to \mathbb{R}$ given by $\phi_n(x_1, \ldots, x_n) = \phi(x_1^2 + \cdots + x_n^2)$ is nonnegative definite for all $n \in \mathbb{N}$. We establish the analogue of Freedman's theorem for sequences of random variables taking values in local fields using probabilistic methods and then use it to establish a local field analogue of Schoenberg's result. Along the way, we obtain a local field counterpart of an observation variously attributed to Maxwell, Poincar\'e, and Borel which says that if $(\zeta_1, \ldots, \zeta_n)$ is uniformly distributed on the sphere of radius $\sqrt{n}$ in $\mathbb{R}^n$, then, for fixed $k \in \mathbb{N}$, the distribution of $(\zeta_1, \ldots, \zeta_k)$ converges to that of a vector of $k$ independent standard Gaussian random variables as $n \to \infty$.


Introduction
An n-vector ξ of real-valued random variables is rotatable if Uξ has the same distribution as ξ for every n × n orthogonal matrix U; that is, the distribution of ξ is spherically symmetric. Similarly, an infinite sequence (ξ_i)_{i∈N} of real-valued random variables is rotatable if the vectors (ξ_i)_{i∈[n]} are rotatable for every n ∈ N (here we use the notation [n] := {1, . . . , n}). Because permutation matrices are orthogonal, it follows that a rotatable infinite sequence is exchangeable and hence, by de Finetti's theorem, distributed as a mixture of independent, identically distributed sequences. A famous result of Maxwell [Max75, Max78] says that if a real random vector is spherically symmetric and has independent (necessarily identically distributed) entries, then the distribution of the entries is centered Gaussian. Combining these two observations makes plausible the celebrated theorem of Freedman [Fre62] that a rotatable infinite sequence of real-valued random variables is a scale mixture of sequences of independent standard Gaussian random variables; that is, if ξ is an infinite rotatable sequence, then ξ = τη, where (η_i)_{i∈N} is a sequence of independent standard Gaussian random variables and τ is a nonnegative random variable that is independent of η.
We refer the reader to the Historical and Bibliographical Notes in [Kal05] for an indication of the later literature around Freedman's theorem and for remarks on the connection between this result and the classical theorem of Schoenberg which says that a continuous function φ : R_+ → C with φ(0) = 1 is completely monotone if and only if φ_n : R^n → R given by φ_n(x_1, . . . , x_n) = φ(x_1^2 + · · · + x_n^2) is nonnegative definite for all n ∈ N.
Our primary goal, realized in Theorem 3.3, is to obtain an analogue of Freedman's theorem for sequences of random variables taking values in fields other than the fields of real and complex numbers. More specifically, we consider the local fields; a local field is any locally compact, nondiscrete topological field other than R or C (a local field is, for some prime p, a finite algebraic extension of either the field of p-adic numbers or the field of Laurent series over the finite field of integers modulo p). We recall some facts about the structure of local fields and vector spaces over them in Section 2. Just as in the real case, Freedman's theorem is equivalent to a structure theorem for nonnegative definite functions, and we present such a result in Section 4.
In order to search for a counterpart of Freedman's theorem we need to have a parallel for the group of orthogonal matrices so that we can say what it means for a distribution to be "spherically symmetric" in the local field setting. The n × n orthogonal matrices are, of course, the matrices that preserve the Euclidean metric on n-dimensional space and so, denoting our local field by K, we take as our parallel of the orthogonal matrices those matrices that are isometries of K^n, where K^n is equipped with the natural metric described in Subsection 2.2; see Subsection 2.3.
Not surprisingly, our counterpart of Freedman's theorem also involves a suitable parallel of the class of Gaussian measures in a local field setting. Such an analogue was considered in [Eva89,Eva01]. The idea is that, given one has a notion of spherical symmetry, one can take the property embodied in Maxwell's theorem to be the definition of Gaussianity for local fields. We recall some of the elementary properties of Gaussian probability measures on local fields in Subsection 2.6 after we have laid some groundwork on Haar measure and Fourier theory on local fields in Subsection 2.4 and Subsection 2.5, respectively.

Local fields
From now on, let K be a fixed local field. We refer the reader to [Sch06,vR78,Tai75] for in-depth treatments of various aspects of analysis on local fields. Any result that we state without a proof or a bibliographic citation may be found in these references.

2.1. Basics. There is a distinguished real-valued mapping on K which we denote by | · |; this map has the properties
(1.1) |x| = 0 if and only if x = 0,
(1.2) |xy| = |x| |y|,
(1.3) |x + y| ≤ max(|x|, |y|),
and takes the values {0} ∪ {q^m : m ∈ Z}, where q = p^c for some prime p and positive integer c.
A map with properties (1.1)-(1.3) is called a non-Archimedean valuation. Property (1.3) is known as the ultrametric inequality or the strong triangle inequality. The mapping (x, y) → |x − y| on K × K is a metric on K which induces the topology of K. The metric space K is a complete, totally disconnected, ultrametric space under this metric.
Write D for the closed unit ball {x : |x| ≤ 1}. Choosing ρ ∈ K so that |ρ| = q^{-1}, we have ρ^k D = {x ∈ K : |x| ≤ q^{-k}} for each k ∈ Z; in particular, ρ^k D is both open and closed for each k ∈ Z.
The set D is a ring, called the ring of integers of K. Each of the sets ρ^k D, k ∈ Z, is a compact D-submodule of K, and every non-trivial compact D-submodule of K is of this form. For ℓ < k the additive quotient group ρ^ℓ D/ρ^k D has order q^{k−ℓ}. Consequently, D is the union of q disjoint translates (that is, cosets) of ρD. Each of these cosets is, in turn, the union of q disjoint translates of ρ^2 D, and so on.
Remark 2.1. We can thus think of the collection of balls contained in D as being arranged in an infinite rooted q-ary tree: the root is D itself, the nodes at level k are the balls of radius q^{-k} (cosets of ρ^k D), and the q "children" of such a ball are the q cosets of ρ^{k+1} D that it contains. We can uniquely associate each point in D with the sequence of balls that contain it, and so we can think of the points in D as the boundary of this tree.
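The digit picture behind Remark 2.1 can be made concrete. Below is a minimal sketch, assuming K = Q_p (so that D = Z_p, ρ = p, and q = p) and representing an element of D to finite precision by its tuple of base-p digits; `ball` and `children` are hypothetical helper names introduced for illustration, not notation from the text.

```python
# Sketch (assuming K = Q_p, so D = Z_p, rho = p, q = p): an element of Z_p,
# truncated to m digits, is a tuple (a_0, ..., a_{m-1}) with 0 <= a_i < p,
# standing for a_0 + a_1 p + ... + a_{m-1} p^{m-1}.  The ball of radius
# p^{-k} containing a point x is the coset x + p^k Z_p, which is determined
# by the first k digits of x, so the balls form a rooted p-ary tree.

from itertools import product

p = 3  # a prime; here q = p because K = Q_p

def ball(x, k):
    """Level-k tree node (ball of radius p^{-k}) containing the point x."""
    return x[:k]

def children(node):
    """The p child balls (cosets of p^{k+1} Z_p) of a ball at level k."""
    return [node + (d,) for d in range(p)]

# D itself is the root; its children are the p cosets of p * Z_p.
root = ()
level1 = children(root)

# Every point of Z_p (to 4 digits of precision) lies in exactly one
# level-1 ball, illustrating the disjoint-coset decomposition of D.
points = list(product(range(p), repeat=4))
assert all(sum(ball(x, 1) == c for c in level1) == 1 for x in points)
```

The nesting of balls (each child contained in its parent) is exactly the statement that a tuple extends its prefixes.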
2.2. Norms. A norm on a vector space E over the local field K is a real-valued mapping ‖·‖ on E with the properties
‖x‖ = 0 if and only if x = 0,
‖αx‖ = |α| ‖x‖ for α ∈ K and x ∈ E,
‖x + y‖ ≤ max(‖x‖, ‖y‖) for x, y ∈ E.
The norm induces a metric on E by (x, y) → ‖x − y‖. The resulting metric space is an ultrametric space. We always take the norm on K^n to be the one given by ‖(x_1, . . . , x_n)‖ := max_{i∈[n]} |x_i|. The space K^n is complete under this metric. In general, a K-Banach space is a vector space E over K that is equipped with a norm such that the resulting metric space is complete.
2.3. Orthogonality. The following definition of orthogonality in a normed vector space over K mimics a characterization of orthogonality in an inner product space over R that does not explicitly involve the inner product and instead is in terms of the induced metric.
Definition 2.2. Given a subset G of a normed vector space E over K, write ⟨G⟩ for the linear span of G. A subset F of E is K-orthogonal if ‖x − y‖ ≥ ‖x‖ for all x ∈ F and y ∈ ⟨F \ {x}⟩; that is, the best approximation of x in the vector space ⟨F \ {x}⟩ is 0. Equivalently, a subset F is K-orthogonal if for any finite subset {x_1, . . . , x_n} ⊆ F and collection of scalars α_1, . . . , α_n ∈ K,
‖α_1 x_1 + · · · + α_n x_n‖ = max_{i∈[n]} |α_i| ‖x_i‖.
A K-orthogonal set F is K-orthonormal if ‖x‖ = 1 for all x ∈ F.
Remark 2.3. If F is K-orthonormal, then, for any finite subset {x_1, . . . , x_n} ⊆ F and collection of scalars α_1, . . . , α_n ∈ K, ‖α_1 x_1 + · · · + α_n x_n‖ = max_{i∈[n]} |α_i|. Conversely, suppose that for any finite subset {x_1, . . . , x_n} ⊆ F and collection of scalars α_1, . . . , α_n ∈ K this equality holds; then F is K-orthonormal.
Theorem 2.4. The following are equivalent for an n × n matrix U.
(i) The matrix U is an isometry of K n .
(ii) The matrix U is invertible and the entries of both U and U −1 lie in D.
(iii) The columns of U are K-orthonormal.
(iv) The rows of U are K-orthonormal.
(v) The entries of U lie in D and | det U | = 1.
Proof. (i) ⇐⇒ (ii) ⇐⇒ (iii): The matrix U is an isometry of K^n if and only if U is invertible and both U and U^{-1} are norm-nonincreasing, and a matrix M is norm-nonincreasing if and only if its columns Me_1, . . . , Me_n (equivalently, its entries) lie in D; this gives (i) ⇐⇒ (ii). Moreover, writing u_1, . . . , u_n for the columns of U, the matrix U is an isometry if and only if ‖α_1 u_1 + · · · + α_n u_n‖ = max_{i∈[n]} |α_i| for all α_1, . . . , α_n ∈ K, which holds if and only if the columns of U are K-orthonormal, where we applied Remark 2.3 for the last equivalence.
(i) ⇐⇒ (iv): We have already shown that (i), (ii), and (iii) are equivalent. It remains to observe that U is invertible with U and U −1 both having entries in D if and only if the transpose U ⊤ is invertible with U ⊤ and (U ⊤ ) −1 both having entries in D.
(ii) ⇐⇒ (v): If (ii) holds, then it follows from the properties of the valuation that |det U| ≤ 1 and |det U^{-1}| ≤ 1; because |det U| |det U^{-1}| = |det(U U^{-1})| = 1, it must be that |det U| = 1, and (v) holds. If (v) holds, then (ii) follows from Cramer's rule.
Notation 2.5. In light of Theorem 2.4, we write GL n (D) for the group of matrices that satisfy the equivalent conditions of the theorem and say that these matrices are K-orthogonal.
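Condition (v) of Theorem 2.4 gives a convenient membership test for GL_n(D). The following sketch assumes K = Q_p (with p = 5) and matrices with rational entries; `val`, `absp`, and `is_K_orthogonal` are hypothetical helper names introduced here. It checks that the entries lie in D and that |det U| = 1, and confirms the isometry property ‖Ux‖ = ‖x‖ on a few vectors.

```python
# Sketch (assuming K = Q_p for a prime p): the p-adic absolute value of a
# nonzero rational r is p**(-v), where v is the exponent of p in the
# factorization of r.  We test condition (v) of Theorem 2.4 -- entries in D
# and |det U| = 1 -- and verify the isometry property in the max-norm.

from fractions import Fraction

p = 5

def val(r):
    """p-adic valuation of a nonzero rational r, so |r|_p = p**(-val(r))."""
    r = Fraction(r)
    v, num, den = 0, r.numerator, r.denominator
    while num % p == 0:
        num //= p; v += 1
    while den % p == 0:
        den //= p; v -= 1
    return v

def absp(r):
    """p-adic absolute value, with |0|_p = 0."""
    return 0.0 if Fraction(r) == 0 else float(p) ** (-val(r))

def norm(x):
    """Max-norm on Q_p^n."""
    return max(absp(c) for c in x)

def det2(U):
    return Fraction(U[0][0]) * U[1][1] - Fraction(U[0][1]) * U[1][0]

def is_K_orthogonal(U):
    """Condition (v) of Theorem 2.4 for a 2 x 2 matrix."""
    entries_in_D = all(absp(c) <= 1 for row in U for c in row)
    return entries_in_D and absp(det2(U)) == 1.0

U = [[2, 5], [1, 3]]  # det = 1 and integer entries, so U is in GL_2(D)
V = [[5, 0], [0, 1]]  # entries in D but |det V|_p = 1/5 < 1: not an isometry

assert is_K_orthogonal(U) and not is_K_orthogonal(V)
for x in ([1, 0], [Fraction(1, 5), 7], [25, Fraction(3, 2)]):
    Ux = [U[0][0] * Fraction(x[0]) + U[0][1] * Fraction(x[1]),
          U[1][0] * Fraction(x[0]) + U[1][1] * Fraction(x[1])]
    assert norm(Ux) == norm(x)  # K-orthogonal matrices preserve the norm
```

Note that V scales the first coordinate by |5|_5 = 1/5, so it has entries in D yet fails to be an isometry, matching the role of the determinant condition in (v).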
Definition 2.6. Write S^(n) := {x ∈ K^n : ‖x‖ = 1} for the unit sphere in K^n.
Lemma 2.7. The group of matrices GL_n(D) acts transitively on the set S^(n); that is, given x, y ∈ K^n with ‖x‖ = ‖y‖ = 1, there exists U ∈ GL_n(D) such that Ux = y.
Proof. Let e i , i ∈ [n], be the coordinate vectors in K n ; that is e i is the vector with 1 in the i th coordinate and 0 elsewhere. Because GL n (D) is a group, it suffices to show that for any x ∈ K n with x = 1 there exists a matrix U ∈ GL n (D) such that U e 1 = x.
Fix such an x. Because ‖x‖ = 1, there is at least one coordinate, say i, such that |x_i| = 1. Let U be an n × n matrix that has first column equal to x and remaining columns given by the n − 1 vectors e_j, j ≠ i, listed in some order. Note that Ue_1 = x. Clearly, det U = ±x_i, so that |det U| = |x_i| = 1 and U ∈ GL_n(D).
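The construction in the proof of Lemma 2.7 is easy to carry out explicitly. The sketch below assumes K = Q_p with p = 5; `absp` and `matrix_sending_e1` are hypothetical helper names for this illustration.

```python
# Sketch of the construction in the proof of Lemma 2.7 (K = Q_p): given x
# with ||x|| = 1, pick a coordinate i with |x_i| = 1, make x the first
# column, and fill the remaining columns with the coordinate vectors e_j,
# j != i.  The resulting U has entries in D and satisfies U e_1 = x.

from fractions import Fraction

p = 5

def absp(r):
    """p-adic absolute value of a rational."""
    r = Fraction(r)
    if r == 0:
        return 0.0
    v, num, den = 0, r.numerator, r.denominator
    while num % p == 0:
        num //= p; v += 1
    while den % p == 0:
        den //= p; v -= 1
    return float(p) ** (-v)

def matrix_sending_e1(x):
    """An n x n matrix with first column x and columns e_j, j != i, after it."""
    n = len(x)
    i = next(j for j in range(n) if absp(x[j]) == 1.0)  # a unit coordinate
    cols = [list(x)] + [[1 if r == j else 0 for r in range(n)]
                        for j in range(n) if j != i]
    return [[cols[c][r] for c in range(n)] for r in range(n)]

x = [Fraction(5), Fraction(3), Fraction(1)]   # |x_i| = (1/5, 1, 1), ||x|| = 1
U = matrix_sending_e1(x)

assert [row[0] for row in U] == list(x)                # U e_1 = x
assert all(absp(c) <= 1 for row in U for c in row)     # entries lie in D
```

As in the proof, det U = ±x_i (here ±3), so |det U| = 1 and U ∈ GL_3(D) by condition (v) of Theorem 2.4.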

2.4. Haar measure. There is a unique measure λ on K which has the properties λ(x + B) = λ(B) and λ(xB) = |x| λ(B) for all Borel sets B ⊆ K and each x ∈ K, and λ(D) = 1; the measure λ is just the suitably normalized Haar measure on the additive group of K.
For n ∈ N, the measure λ^⊗n on K^n has the properties λ^⊗n(x + B) = λ^⊗n(B) for each x ∈ K^n and λ^⊗n(MB) = |det M| λ^⊗n(B) for each n × n matrix M and Borel set B ⊆ K^n. In particular, λ^⊗n is GL_n(D)-invariant.
Notation 2.8. Write γ for the restriction of the measure λ to D and, for n ∈ N, set γ n := γ ⊗n . Denote by σ n the probability measure obtained by conditioning the probability measure γ n on the set S (n) ; that is, σ n is γ n (· ∩ S (n) ) normalized to be a probability measure.
Proposition 2.9. The probability measure σ n is the unique probability measure on S (n) that is invariant under the action of GL n (D).
Proof. It is clear that γ n is invariant under the action of GL n (D). Because S (n) is a subset of D n that has positive γ n measure (namely, 1 − q −n ) that is invariant under the action of GL n (D), it follows that σ n is invariant under the action of GL n (D). It therefore remains to establish the uniqueness claim. This, however, is immediate from Theorem 2.11 below and Lemma 2.7.
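The value γ_n(S^(n)) = 1 − q^{-n} used in the proof can be checked by counting leading digits. A minimal sketch, assuming K = Q_p (so q = p): under γ_n the leading base-p digits of the n coordinates are independent and uniform, and x ∈ S^(n) exactly when some leading digit is nonzero.

```python
# Sketch (K = Q_p, q = p): a coordinate x_i in D satisfies |x_i| = 1 exactly
# when its leading base-p digit is nonzero, and under gamma_n the n leading
# digits are independent and uniform on {0, ..., p-1}.  Counting digit
# tuples therefore gives gamma_n(S^(n)) = 1 - p**(-n).

from fractions import Fraction
from itertools import product

p, n = 3, 4

tuples = list(product(range(p), repeat=n))
hit = sum(1 for t in tuples if any(d != 0 for d in t))
mass = Fraction(hit, p ** n)   # gamma_n measure of S^(n)

assert mass == 1 - Fraction(1, p ** n)
```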
Remark 2.10. (i) The uniqueness claim in Proposition 2.9 may be established by an alternative route. Because GL_n(D) is a compact, second countable, Hausdorff group, it possesses a Haar measure which is unique if we normalize it to be a probability measure. Denote this probability measure by µ. Suppose that ν is a GL_n(D)-invariant probability measure on S^(n). For any positive measurable function f on S^(n), the invariance of ν and Fubini's theorem give ∫ f(s) ν(ds) = ∫∫ f(Us) µ(dU) ν(ds). For fixed s ∈ S^(n), the push-forward of µ by the map U → Us is a GL_n(D)-invariant probability measure on S^(n) which, by the invariance of µ and the transitivity established in Lemma 2.7, does not depend on s; hence ν coincides with this common push-forward and is uniquely determined.
(ii) It is possible to describe the Haar measure on GL_n(D) quite concretely. Let γ_{n×n} be the probability measure on n × n matrices given by γ_{n×n}((dm_{i,j})_{i,j∈[n]}) = ∏_{i,j∈[n]} γ(dm_{i,j}); that is, if M is distributed according to γ_{n×n}, then the entries of M are independent and each is distributed according to γ. The Haar measure on GL_n(D) is just γ_{n×n} conditioned on the set GL_n(D); that is, the Haar measure is the restriction of γ_{n×n} to GL_n(D) normalized to be a probability measure. Because we don't need this result in what follows, we leave the (straightforward) proof to the reader. We note in passing that the γ_{n×n} measure of GL_n(D) is ∏_{i∈[n]} (1 − q^{-i}); see [Eva02, Theorem 4.1].
(iii) As we have seen, the probability measure γ_n is GL_n(D)-invariant. On the other hand, if the push-forward of γ_n by an n × n matrix M is again γ_n, then property (v) of Theorem 2.4 holds and M ∈ GL_n(D). We may therefore add another equivalent property to the list in Theorem 2.4. A similar remark holds with γ_n replaced by σ_n.
For the sake of completeness, we state the following classical result on the existence and uniqueness of measures invariant under the action of a group (see, for example, [Kal02, Theorem 2.29]).
Theorem 2.11. Let G be a locally compact, second countable, Hausdorff group of measurable transformations of a locally compact, second countable, Hausdorff space S. Suppose that G acts properly (that is, the set {g ∈ G : gs ∈ C} ⊆ G is compact for all s ∈ S and all compact C ⊆ S) and transitively (that is, given s, t ∈ S there exists g ∈ G such that gs = t). Then, up to normalization, there is a unique non-zero G-invariant Radon measure on S.
The following corollary is an easy consequence of Proposition 2.9, but we include the proof for the sake of completeness.
Corollary 2.12. Let ξ be a K^n-valued random variable such that Uξ has the same distribution as ξ for all U ∈ GL_n(D). Possibly on some extension (Ω̄, F̄, P̄) of (Ω, F, P) there is a σ_n-distributed, S^(n)-valued random variable ϑ and a {ρ^m : m ∈ Z} ∪ {0}-valued random variable R independent of ϑ such that ξ = Rϑ. Consequently, if ν is a GL_n(D)-invariant probability measure on K^n, then ν is the push-forward of π ⊗ σ_n by the map (r, x) → rx for some probability measure π on {ρ^m : m ∈ Z} ∪ {0}.
Proof. Set Ω̄ := Ω × S^(n), let F̄ be the product of F and the Borel σ-field of S^(n), and let P̄ be the product of P and σ_n. For m ∈ Z write A_m := {‖ξ‖ = q^{-m}} ∈ F and put A_∞ := {ξ = 0} ∈ F. Set R := ρ^m on the event A_m × S^(n), m ∈ Z, and R := 0 on the event A_∞ × S^(n). Put ϑ := R^{-1} ξ on the event {R ≠ 0} and ϑ(ω, s) := s for (ω, s) ∈ A_∞ × S^(n). It is clear that ξ = Rϑ. For any r ∈ {ρ^m : m ∈ Z} ∪ {0} the conditional distribution of ϑ given the event {R = r} is obviously invariant under the action of GL_n(D) and so, by Proposition 2.9, the random variable ϑ is independent of R with distribution σ_n.
A classical theorem often attributed to Poincaré says that for fixed k ∈ N the distribution of the first k coordinates of a point uniformly distributed over the sphere of radius √ n in R n converges to the distribution of a vector of k independent standard Gaussian variables as n → ∞. See Section 6 of [DF87] for a discussion of the history of this result leading to the conclusion that a more appropriate attribution is to the work of Borel in [Bor06] (see also Chapter V of [Bor14]). It is shown in [DF87] that indeed the total variation distance between the distribution of the first k coordinates of a point uniformly distributed over the sphere of radius √ n in R n and the distribution of a vector of k independent standard Gaussian variables converges to zero as n → ∞ provided that k = o(n).
The analogue of such a result in the local field setting is the following. Note that the relevant total variation distance converges to zero as n → ∞ regardless of the relative size of k with respect to n.
Theorem 2.13. For n ∈ N and k ∈ [n], let σ n,k be the push-forward of σ n by the map that sends (x i ) i∈[n] ∈ K n to (x i ) i∈[k] ∈ K k . The total variation distance between the probability measures σ n,k and γ k is q −n (1 − q −k )/(1 − q −n ).
Proof. Let (ξ n,i ) i∈[n] have distribution σ n and let (η i ) i∈[n] have distribution γ n so that (ξ n,i ) i∈ [k] has distribution σ n,k and (η i ) i∈[k] has distribution γ k .
By definition, the distribution of (ξ_{n,i})_{i∈[n]} is the same as the distribution of (η_i)_{i∈[n]} conditioned on the event {‖(η_i)_{i∈[n]}‖ = 1}. Write δ_{n,i} for the indicator of the event {|ξ_{n,i}| = 1} and ǫ_i for the indicator of the event {|η_i| = 1}. Let ν_0 (resp. ν_1) be γ conditioned on the event {x ∈ K : |x| < 1} (resp. {x ∈ K : |x| = 1}); that is, ν_0 (resp. ν_1) is the conditional distribution of η_i given the event {ǫ_i = 0} (resp. {ǫ_i = 1}). If (e_i)_{i∈[n]} ∈ {0, 1}^n is not identically 0, then the conditional distribution of (ξ_{n,i})_{i∈[k]} given the event {(δ_{n,i})_{i∈[n]} = (e_i)_{i∈[n]}} is the same as that of (η_i)_{i∈[k]} given the event {(ǫ_i)_{i∈[n]} = (e_i)_{i∈[n]}}; both conditional distributions are ⊗_{i∈[k]} ν_{e_i}. It follows that the total variation distance we seek is the same as the total variation distance between the distribution of (δ_{n,i})_{i∈[k]} and the distribution of (ǫ_i)_{i∈[k]}. Furthermore, the conditional distribution of (δ_{n,i})_{i∈[k]} given the event {Σ_{i∈[k]} δ_{n,i} = j} is the same as the conditional distribution of (ǫ_i)_{i∈[k]} given the event {Σ_{i∈[k]} ǫ_i = j}. Thus, it further suffices to compute the total variation distance between the distribution of Σ_{i∈[k]} δ_{n,i} and the distribution of Σ_{i∈[k]} ǫ_i. Put X = Σ_{i∈[n]} ǫ_i and Y = Σ_{i∈[k]} ǫ_i. Noting that the distribution of Σ_{i∈[k]} δ_{n,i} is the same as the conditional distribution of Y given the event {X ≠ 0}, the total variation distance we seek is
(1/2) Σ_{y=0}^{k} |P{Y = y | X ≠ 0} − P{Y = y}|.
For y ≠ 0 we have
P{Y = y | X ≠ 0} = P{Y = y} / (1 − q^{-n}),
while
P{Y = 0 | X ≠ 0} = (P{Y = 0} − P{X = 0}) / (1 − q^{-n}) = (q^{-k} − q^{-n}) / (1 − q^{-n}).
Therefore, the total variation distance is
(1/2) [(1 − q^{-k}) q^{-n} / (1 − q^{-n}) + q^{-n} (1 − q^{-k}) / (1 − q^{-n})] = q^{-n} (1 − q^{-k}) / (1 − q^{-n}),
as claimed.
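The reduction in the proof can be checked by exact arithmetic: with ǫ_i i.i.d. Bernoulli(1 − 1/q), the claimed total variation distance should equal the distance between L(Y | X ≠ 0) and L(Y). A sketch for small parameter values:

```python
# Sketch verifying the total variation value in Theorem 2.13 along the
# reduction in the proof: eps_i are i.i.d. Bernoulli(1 - 1/q) indicators,
# X = eps_1 + ... + eps_n, Y = eps_1 + ... + eps_k, and the distance is the
# total variation distance between L(Y | X != 0) and L(Y).

from fractions import Fraction
from math import comb

q, n, k = 2, 6, 3
pz = Fraction(1, q)              # P(eps_i = 0) = 1/q

def binom_pmf(m, j):
    """P(Binomial(m, 1 - 1/q) = j), computed exactly."""
    return comb(m, j) * (1 - pz) ** j * pz ** (m - j)

pY = [binom_pmf(k, j) for j in range(k + 1)]
# P(Y = j, X = 0) is nonzero only for j = 0, where it equals q**(-n).
pY_cond = [(pY[j] - (pz ** n if j == 0 else 0)) / (1 - pz ** n)
           for j in range(k + 1)]

tv = sum(abs(a - b) for a, b in zip(pY_cond, pY)) / 2
claimed = pz ** n * (1 - pz ** k) / (1 - pz ** n)

assert tv == claimed
```

With q = 2, n = 6, k = 3 both sides work out to 1/72, in agreement with the formula q^{-n}(1 − q^{-k})/(1 − q^{-n}).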
2.5. Fourier theory. Recall that a character on a locally compact Abelian group G with the group operation written additively is a map κ : G → T := {z ∈ C : |z| = 1} such that κ(g + h) = κ(g)κ(h) for all g, h ∈ G. It is possible to fix a character χ for K such that χ restricted to the subgroup D is trivial (that is, always takes the value 1) while χ restricted to the subgroup ρ −1 D is non-trivial. Fixing any such choice of the character χ, an arbitrary character on K is of the form x → χ(ax) for some a ∈ K. More generally, an arbitrary character on K n is of the form x → χ(a · x) for some a ∈ K n , where a · x is the usual dot product of the vectors a and x.
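For K = Q_p a standard choice of such a character sends x to exp(2πi{x}_p), where {x}_p is the p-adic fractional part. The sketch below restricts to inputs of the hypothetical form a/p^m with a an integer and m ≥ 0, for which the fractional part is (a mod p^m)/p^m; the helper name `chi` is ours.

```python
# Sketch (K = Q_p): chi(x) = exp(2*pi*i*{x}_p), with {x}_p the p-adic
# fractional part.  For x = a / p**m (a an integer, m >= 0) the fractional
# part is (a mod p**m) / p**m.  Then chi is trivial on Z_p and nontrivial
# on p**(-1) * Z_p, as required of the character fixed in Subsection 2.5.

import cmath

p = 7

def chi(a, m):
    """chi(a / p**m) for an integer a and m >= 0."""
    frac = (a % p ** m) / p ** m if m > 0 else 0.0
    return cmath.exp(2j * cmath.pi * frac)

# Trivial on D = Z_p: integers have fractional part 0.
assert chi(12, 0) == 1
# Nontrivial on p^{-1} D: chi(1/p) is a primitive p-th root of unity.
assert abs(chi(1, 1) - cmath.exp(2j * cmath.pi / p)) < 1e-12
# Character property chi(x + y) = chi(x) chi(y), here for x = 3/p^2, y = 5/p^2.
assert abs(chi(3 + 5, 2) - chi(3, 2) * chi(5, 2)) < 1e-12
```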
2.6. Gaussian random variables. For an introduction to an analogue of Gaussian probability measures on local fields and the proofs of any results stated without proof in this subsection, see [Eva89, Eva01].
The following definition mimics a standard definition/characterization of Gaussian random variables taking values in a Banach space over the real numbers: in line with the discussion in the Introduction, the invariance appearing in Maxwell's theorem is taken as the definition of Gaussianity.
Definition 2.14. Let E be a K-Banach space. An E-valued random variable ξ is K-Gaussian if, whenever ξ′ and ξ′′ are independent random variables each with the same distribution as ξ and U = (U_{i,j})_{i,j∈[2]} ∈ GL_2(D), the pair (U_{1,1} ξ′ + U_{1,2} ξ′′, U_{2,1} ξ′ + U_{2,2} ξ′′) has the same distribution as (ξ′, ξ′′).
Theorem 2.15. Let E be a separable K-Banach space and let ξ be an E-valued random variable.
(i) The random variable ξ is K-Gaussian if and only if the distribution of ξ is the normalized Haar measure on some compact D-submodule of E. In particular, if E = K, then ξ is K-Gaussian if and only if the distribution of ξ is either the point mass at 0 or the restriction of the Haar measure λ to one of the sets ρ^m D, m ∈ Z, normalized to be a probability measure.
(ii) The random variable ξ is K-Gaussian if and only if the K-valued random variable Tξ is K-Gaussian for all T ∈ E*, where E* is the dual space of continuous, linear maps from E to K.
(iii) If ξ is K-Gaussian, then the compact D-module in (i) is the set {x ∈ E : |Tx| ≤ ‖Tξ‖_∞ for all T ∈ E*}, where ‖ζ‖_∞ := ess sup{|ζ(ω)| : ω ∈ Ω} for a K-valued random variable ζ.
(iv) The random variable ξ is K-Gaussian if and only if ‖Tξ‖_∞ is finite for all T ∈ E* and, for all t ∈ K and T ∈ E*, E[χ(t Tξ)] = 1 if |t| ‖Tξ‖_∞ ≤ 1 and E[χ(t Tξ)] = 0 otherwise.
Definition 2.16. A K-valued random variable is standard K-Gaussian if its distribution is γ.
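The characteristic function of a standard K-Gaussian random variable can be checked numerically. A sketch assuming K = Q_p, where a standard K-Gaussian ζ has distribution γ (Haar measure on Z_p) and ‖ζ‖_∞ = 1; `char_fn` is a hypothetical helper that averages χ(tx) over x uniform modulo p^m, which suffices for t = a/p^j with j ≤ m.

```python
# Sketch (K = Q_p): for a standard K-Gaussian zeta (distribution gamma,
# i.e. Haar measure on Z_p), the formula of the form in Theorem 2.15(iv)
# should read E[chi(t*zeta)] = 1 if |t| <= 1 and = 0 otherwise, since
# ||zeta||_infty = 1.  We average chi(t*x) over x uniform mod p^m, which is
# exact for t = a / p**j with 0 <= j <= m.

import cmath

p, m = 5, 3

def char_fn(a, j):
    """E[chi((a / p**j) * zeta)], averaging over x uniform mod p^m."""
    total = sum(cmath.exp(2j * cmath.pi * ((a * x) % p ** j) / p ** j)
                if j > 0 else 1.0
                for x in range(p ** m))
    return total / p ** m

assert abs(char_fn(0, 0) - 1) < 1e-9   # |t| <= 1: characteristic function 1
assert abs(char_fn(1, 1)) < 1e-9       # |t| = p > 1: a full root-of-unity sum
assert abs(char_fn(2, 2)) < 1e-9       # |t| = p^2 > 1: vanishes as well
```

The vanishing for |t| > 1 is just the fact that a complete sum of p-th (or p^2-th) roots of unity is zero.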

Rotatable random sequences
The following analogue of Theorem 2 in [DF87] follows from a combination of Corollary 2.12 and Theorem 2.13.
Proposition 3.1. Suppose that ν is a GL n (D)-invariant probability measure on K n so that ν is the push-forward of π⊗σ n by the map (r, x) → rx for some probability measure π on {ρ m : m ∈ Z}∪{0}. The total variation distance between ν and the push-forward of π ⊗ γ n by the map (r, x) → rx is at most q −n .
Definition 3.2. An infinite sequence (ξ i ) i∈N of K-valued random variables is rotatable if U (ξ i ) i∈ [n] has the same distribution as (ξ i ) i∈[n] for every n ∈ N and U ∈ GL n (D).
Theorem 3.3. A K-valued random infinite sequence ξ = (ξ_i)_{i∈N} is rotatable if and only if sup_{i∈N} |ξ_i| is almost surely finite and (possibly on some extension of (Ω, F, P)) ξ_i = τ η_i, i ∈ N, where
• η_1, η_2, . . . are independent, identically distributed, standard K-Gaussian random variables,
• τ is the random variable taking values in {ρ^m : m ∈ Z} ∪ {0} determined by |τ| = sup_{i∈N} |ξ_i|, and
• the random variable τ is independent of the sequence η = (η_i)_{i∈N}.
Proof. For each n ∈ N, let τ_n be the random variable taking values in {ρ^m : m ∈ Z} ∪ {0} determined by |τ_n| = ‖(ξ_i)_{i∈[n]}‖, and let (ξ_{n,i})_{i∈[n]} be distributed according to σ_n and independent of τ_n. Observe that, by Corollary 2.12, (ξ_i)_{i∈[n]} has the same distribution as τ_n (ξ_{n,i})_{i∈[n]}. Now let η̄ be an infinite sequence of independent, identically distributed, standard K-Gaussian random variables independent of (τ_n)_{n∈N}. Writing L(ζ) for the distribution of a random variable ζ and ‖· − ·‖_TV for the total variation distance, we have, for k ≤ n,
‖L((ξ_i)_{i∈[k]}) − L((τ_n η̄_i)_{i∈[k]})‖_TV = ‖L((τ_n ξ_{n,i})_{i∈[k]}) − L((τ_n η̄_i)_{i∈[k]})‖_TV ≤ ‖σ_{n,k} − γ_k‖_TV,
which goes to 0 as n → ∞ by Theorem 2.13. So τ_n η̄ certainly converges in distribution to ξ. Now |τ_n| = ‖(ξ_i)_{i∈[n]}‖ = max_{i∈[n]} |ξ_i| is nondecreasing in n ∈ N and sup_{i∈N} |ξ_i| is almost surely finite, so τ_n → τ almost surely for some random variable τ taking values in {ρ^m : m ∈ Z} ∪ {0}. Thus ξ has the same distribution as τ η̄. The almost sure result then follows from the transfer lemma (Corollary 6.11 from [Kal02]); we thus have random variables η̃, τ̃ such that (τ̃, η̃) has the same distribution as (τ, η̄) and τ̃ η̃ = ξ almost surely. It remains to observe that sup_{i∈N} |τ̃ η̃_i| = |τ̃| almost surely (because sup_{i∈N} |η̃_i| = 1 almost surely), so that τ̃ = τ almost surely.
There are several extensions and variants of Freedman's theorem in the literature; see [DEL92] for a review. One extension is to consider infinite random sequences where the individual entries take values in R^m for some m ∈ N or, more generally, in some separable Banach space over R as in [Daw78]. The analogue of the result in [Daw78] holds in the local field setting, as we now explain. We say that a random sequence ξ with entries in a separable K-Banach space E is rotatable if for any n ∈ N and U ∈ GL_n(D) we have that (Σ_{j∈[n]} U_{i,j} ξ_j)_{i∈[n]} has the same distribution as (ξ_i)_{i∈[n]}. Write Γ(E) for the set of K-Gaussian probability measures on E. The set Γ(E) is a closed subset of the Polish space of probability measures on the Polish space E, where we equip the space of probability measures with the topology of weak convergence. By Theorem 2.15, Γ(E) is in a bijective correspondence with the set of compact D-modules in E. The analytic content of Theorem 3.3 is that the distribution of a rotatable sequence of K-valued random variables is ∫_{Γ(K)} µ^⊗N π(dµ) for some probability measure π on Γ(K), and it is not too difficult to establish the following generalization that is a local field counterpart to the result in [Daw78]. We omit the proof.
Theorem 3.4. An infinite sequence ξ of random variables in a separable K-Banach space E is rotatable if and only if the distribution of ξ is ∫_{Γ(E)} µ^⊗N π(dµ) for some probability measure π on Γ(E).
Other variants of Freedman's theorem involve considering infinite random sequences ξ such that for all n ∈ N the random vector U (ξ i ) i∈[n] has the same distribution as (ξ i ) i∈[n] for all U ∈ G n , where G n is some subgroup of the n×n orthogonal group that contains the subgroup of permutation matrices. For example, [Smi81] considers the case where G n is the subgroup that fixes the vector (1, . . . , 1) ⊤ and shows that in this case ξ = α+βη, where η is an independent, identically distributed, standard Gaussian sequence and (α, β) is an independent R × R + -valued random vector. It would be interesting to investigate the counterparts of such results in the local field setting, but we leave this to a future paper.

A local field analogue of Schoenberg's theorem
In the case of real-valued rotatable random variables over R, Freedman's theorem is equivalent to a result about nonnegative definite functions (see Theorem 1.31 and Appendix 4 in [Kal05]) which we recalled in the Introduction. Here, we provide an analogue of the latter result in the local field setting.
Recall that a function φ defined on an Abelian group G is nonnegative definite if Σ_{i,j∈[n]} φ(g_i − g_j) z_i z̄_j ≥ 0 for all g_1, . . . , g_n ∈ G, z_1, . . . , z_n ∈ C, and n ∈ N.
Theorem 4.1. Let φ : {0} ∪ {q^m : m ∈ Z} → C be a function with φ(0) = 1 and, for n ∈ N, define φ_n : K^n → C by φ_n(x) := φ(‖x‖). Then φ_n is nonnegative definite for every n ∈ N if and only if φ is nonnegative and nonincreasing.
Proof. By Bochner's theorem for locally compact groups (see Section 36 of [Loo53]; also see [Wei40]), the function φ_n is nonnegative definite if there exists some K^n-valued random variable ξ^(n) such that φ_n(t) = E[χ(t · ξ^(n))] for t ∈ K^n; that is, φ_n is the characteristic function of ξ^(n). Using the definition of φ_n, this is E[χ(t · ξ^(n))] = φ(‖t‖) for t ∈ K^n.
We next claim that the random sequence ξ = (ξ_i)_{i∈N}, whose initial segments (ξ_i)_{i∈[n]} have the same distributions as the ξ^(n) (such a sequence exists by Kolmogorov's extension theorem, because the characteristic function identity makes the distributions of the ξ^(n) consistent), is rotatable. Given n ∈ N and U ∈ GL_n(D),
E[χ(t · U(ξ_i)_{i∈[n]})] = E[χ((U^⊤ t) · (ξ_i)_{i∈[n]})] = φ(‖U^⊤ t‖) = φ(‖t‖) = E[χ(t · (ξ_i)_{i∈[n]})], t ∈ K^n,
which shows that U(ξ_i)_{i∈[n]} has the same distribution as (ξ_i)_{i∈[n]} (here we used that U^⊤ ∈ GL_n(D), by Theorem 2.4). Using Theorem 3.3, we have that ξ = τη almost surely, where η is a sequence of independent, identically distributed, standard K-Gaussian random variables independent of τ. So φ_n is nonnegative definite for each n ∈ N if and only if φ(‖t‖) = E[χ(t · τ(η_i)_{i∈[n]})] for each n ∈ N and t ∈ K^n. Thus
φ(‖t‖) = E[1{|τ| ‖t‖ ≤ 1}] = P{|τ| ≤ ‖t‖^{-1}}, t ∈ K^n \ {0}.
This is equivalent to φ having the desired properties.
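The correspondence at the end of the proof is easy to make explicit: φ(q^j) = P{|τ| ≤ q^{-j}} means the point masses of |τ| are the successive differences of φ. A sketch with a hypothetical φ supported on finitely many scales (the dictionary `phi` and its values are an invented example):

```python
# Sketch of the correspondence in the proof: a nonnegative, nonincreasing
# phi with phi(0) = 1 determines the mixing law of tau through
# phi(q^j) = P(|tau| <= q^{-j}), so the point masses are the differences
# P(|tau| = q^{-j}) = phi(q^j) - phi(q^{j+1}).  The phi below is a
# hypothetical example supported on finitely many scales.

from fractions import Fraction

q = 3
# phi(q^j) for j = -2, ..., 2; nonincreasing in the argument q^j.
phi = {-2: Fraction(1), -1: Fraction(1), 0: Fraction(1, 2),
       1: Fraction(1, 3), 2: Fraction(0)}

# Point masses of |tau|: P(|tau| = q^{-j}) = phi(q^j) - phi(q^{j+1}).
mass = {j: phi[j] - phi[j + 1] for j in range(-2, 2)}

assert all(m >= 0 for m in mass.values())            # nonincreasing <=> masses >= 0
assert sum(mass.values()) == phi[-2] - phi[2] == 1   # a probability measure
```

Monotonicity of φ is exactly nonnegativity of the masses, which is the "only if" content of the equivalence.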