The stochastic telegraph equation limit of the stochastic higher spin six vertex model

In this paper, we prove that the stochastic telegraph equation arises as a scaling limit of the stochastic higher spin six vertex (SHS6V) model with general spin $I/2, J/2$. This extends results of Borodin and Gorin, which focused on the $I = J = 1$ six vertex case, and demonstrates the universality of the stochastic telegraph equation in this context. We also provide a functional extension of the central limit theorem obtained in [Borodin and Gorin 2019, Theorem 6.1]. The main idea is to generalize, using fusion, the four point relation established in [Borodin and Gorin 2019, Theorem 3.1].

Here, $R(X, Y; x, y)$ is the Riemann function, defined by $R(X, Y; x, y) = \frac{1}{2\pi i}\oint(\cdots)$, where the contour of the complex integration is a small circle in the positive direction which includes only the pole at $-\beta_1$. When $f$ is given by $f(X, Y) = \theta(X, Y)\eta(X, Y)$, where $\eta$ is the space-time white noise with Dirac delta correlation function and $\theta$ is a deterministic integrable function, formula (1.2) shows that the solution to the stochastic telegraph equation is a Gaussian field with covariance function expressed through the discrete Riemann function $R^{\mathrm{d}}$, which equals (see [BG19, Eq. 45]) (1.7). Here, the contour is a small circle in the positive direction which encircles only the pole at $-\frac{1}{b_2(1-b_1)}$.
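The displayed form of the telegraph equation is elided in this extraction. Assuming it takes the Goursat form $\partial_X\partial_Y f + \beta_1 \partial_X f + \beta_2 \partial_Y f = 0$ with data prescribed on the two coordinate axes (consistent with the Riemann function representation above), a minimal finite-difference sketch reads:

```python
import numpy as np

def telegraph_goursat(chi, psi, beta1, beta2, X, Y, n):
    """First-order finite-difference scheme for the (assumed) Goursat problem
       f_XY + beta1*f_X + beta2*f_Y = 0  on [0,X]x[0,Y],
       with f(x,0) = chi(x), f(0,y) = psi(y) and chi(0) = psi(0)."""
    dx, dy = X / n, Y / n
    f = np.zeros((n + 1, n + 1))
    f[:, 0] = chi(np.linspace(0, X, n + 1))
    f[0, :] = psi(np.linspace(0, Y, n + 1))
    for i in range(n):
        for j in range(n):
            # discretize (f11 - f10 - f01 + f00)/(dx*dy)
            #            + beta1*(f10 - f00)/dx + beta2*(f01 - f00)/dy = 0
            f[i + 1, j + 1] = (f[i + 1, j] + f[i, j + 1] - f[i, j]
                               - beta1 * dy * (f[i + 1, j] - f[i, j])
                               - beta2 * dx * (f[i, j + 1] - f[i, j]))
    return f

# sanity check: with beta1 = beta2 = 0, the scheme reproduces
# f(x, y) = chi(x) + psi(y) - chi(0) exactly
f = telegraph_goursat(lambda x: x**2, np.sin, 0.0, 0.0, 1.0, 1.0, 64)
assert np.allclose(f[-1, -1], 1.0 + np.sin(1.0))
```

The hyperbolic (mixed-derivative) structure is what makes the Goursat data on the axes, rather than Cauchy data, the natural boundary condition here.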
In the first version of the arXiv paper [BG18], Borodin and Gorin showed that under a special scaling regime, in which the weight of the corner-type vertex goes to zero, the height function of the stochastic six vertex model converges to the telegraph equation. They also conjectured, with some heuristic arguments, that the fluctuation field converges to the stochastic telegraph equation, and proved this result in a special situation called the low density boundary regime. The result for general boundary conditions was later proved in [ST19] and [BG19] via two distinct approaches. This result comes as a surprise: from [GS92, BCG16] we know that the stochastic six vertex model belongs to the KPZ universality class, and the one-point fluctuations of models in this class are governed by the Tracy-Widom distribution [TW94]. However, the solution to the stochastic telegraph equation does not lie in this universality class (being a Gaussian field). In addition, [CGST20] shows that under weakly asymmetric scaling (which is a different scaling from the one in [BG19]), the stochastic six vertex model converges to the KPZ equation [KPZ86, Cor12], a parabolic stochastic PDE, while the stochastic telegraph equation is hyperbolic!
It is natural to ask whether the stochastic telegraph equation also arises as a scaling limit of other probabilistic models. In this paper, we show that the stochastic higher spin six vertex (SHS6V) model, which is a higher spin generalization of the stochastic six vertex model, converges to the stochastic telegraph equation under a certain scaling regime. This extends the universality of the stochastic telegraph equation. In addition, [Lin20] showed that under a different scaling than the one considered in this paper, the SHS6V model converges to the KPZ equation. This tells us that the SHS6V model converges to two distinct types of stochastic PDEs under different choices of scaling.
1.2. The SHS6V model. The SHS6V model is a four-parameter family of quantum integrable systems, first introduced in [CP16], that has been intensely studied in recent years from the perspectives of symmetric polynomials [Bor17, Bor18], exact solvability [BCPS15, CP16, BP18], Markov duality [CP16, Kua18, Lin19] and scaling limits [CT17, IMS20, Lin20]. In particular, it is a higher spin generalization of the stochastic six vertex model from spin parameters $I = J = 1$ to general $I, J \in \mathbb{Z}_{\geq 1}$. In this paper, we discover a scaling regime for the SHS6V model (which degenerates to the scaling in [BG19] when $I = J = 1$) under which we prove that: 1) the hydrodynamic limit of the SHS6V model is a telegraph equation; 2) the fluctuation field of the model converges to a stochastic telegraph equation. To explain our results in more detail, we start with a brief review of the SHS6V model. Definition 1.1 ($J = 1$ L-matrix). We define the $J = 1$ L-matrix to be a matrix with rows and columns indexed by $\mathbb{Z}_{\geq 0} \times \{0, 1\}$. The elements of the $J = 1$ L-matrix are specified as follows. As a convention, throughout the paper we set $\nu = q^{-I}$ for some fixed $I \in \mathbb{Z}_{\geq 1}$. Note that $L^{(1)}_\alpha(I, 1; I + 1, 0) = 0$; hence the $J = 1$ L-matrix preserves the subspace $\{0, 1, \ldots, I\} \times \{0, 1\}$, and we will restrict ourselves to this subspace.
Proof. This follows from [CP16, Proposition 2.3], which can also be verified directly.
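Since the display specifying the $J = 1$ L-matrix is elided in this extraction, the following sketch records one standard form of the weights (our assumption, modeled on [CP16], and consistent with the facts stated in the text: $L^{(1)}_\alpha(0,1;0,1) = \frac{\alpha+\nu}{1+\alpha}$, $L^{(1)}_\alpha(1,0;1,0) = \frac{1+\alpha q}{1+\alpha}$, $L^{(1)}_\alpha(I,1;I+1,0) = 0$) and checks symbolically that every row sums to 1:

```python
import sympy as sp

def L1(I):
    """J = 1 L-matrix entries of the SHS6V model with nu = q**(-I).

    These are the standard stochastic weights of [CP16] (our assumption
    for the elided formula):
      L(m,0; m,0)   = (1 + alpha*q**m) / (1 + alpha)
      L(m,0; m-1,1) = alpha*(1 - q**m) / (1 + alpha)
      L(m,1; m,1)   = (alpha + nu*q**m) / (1 + alpha)
      L(m,1; m+1,0) = (1 - nu*q**m) / (1 + alpha)
    """
    a, q = sp.symbols('alpha q', positive=True)
    nu = q ** (-I)
    L = {}
    for m in range(I + 1):
        L[(m, 0, m, 0)] = (1 + a * q**m) / (1 + a)
        L[(m, 0, m - 1, 1)] = a * (1 - q**m) / (1 + a)   # vanishes when m = 0
        L[(m, 1, m, 1)] = (a + nu * q**m) / (1 + a)
        L[(m, 1, m + 1, 0)] = (1 - nu * q**m) / (1 + a)  # vanishes when m = I
    return L

I = 3
L = L1(I)
# the entry L(I,1;I+1,0) vanishes identically, so {0,...,I}x{0,1} is preserved
assert sp.simplify(L[(I, 1, I + 1, 0)]) == 0
# every row of the L-matrix sums to 1 (stochasticity, given nonnegative entries)
for i1 in range(I + 1):
    for j1 in (0, 1):
        row = sum(w for (a_, b_, _, _), w in L.items() if (a_, b_) == (i1, j1))
        assert sp.simplify(row - 1) == 0
```

For parameter ranges making all entries nonnegative (those of Corollary 1.4, elided here), each row is therefore a genuine probability distribution.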
For an entry $L^{(1)}_\alpha(i_1, j_1; i_2, j_2)$, we interpret the four-tuple $(i_1, j_1, i_2, j_2)$ as a vertex configuration, in the sense that the vertex has $i_1$ input lines coming from the bottom and $j_1$ input lines coming from the left, and $i_2$ output lines flowing to the top and $j_2$ output lines flowing to the right; see Figure 1. The quantity $L^{(1)}_\alpha(i_1, j_1; i_2, j_2)$ gives the weight of this vertex configuration. Note that for a vertex associated with $L^{(1)}_\alpha$, we allow up to $I$ vertical lines and up to one horizontal line. We say that the L-matrix is conservative in lines, as it assigns zero weight to the entry $L^{(1)}_\alpha(i_1, j_1; i_2, j_2)$ unless $i_1 + j_1 = i_2 + j_2$. Figure 1. Left panel: a vertex associated with $L^{(1)}_\alpha(i_1, j_1; i_2, j_2)$, which absorbs $i_1 \in \{0, 1, \ldots, I\}$ input lines from the bottom and $j_1 \in \{0, 1\}$ input lines from the left, and produces $i_2 \in \{0, 1, \ldots, I\}$ output lines to the top and $j_2 \in \{0, 1\}$ output lines to the right. Right panel: visualization of the vertex configuration $(i_1, j_1; i_2, j_2) = (2, 1; 3, 0)$ in terms of lines.
We want to relax the restriction that the multiplicity of the horizontal lines is bounded by 1 and instead allow multiplicities bounded by any fixed $J$. This motivates us to define the $L^{(J)}_\alpha$ matrix, whose construction follows the so-called fusion procedure, which was invented in a representation-theoretic context [KRS81, KR87] to produce higher-dimensional solutions of the Yang-Baxter equation from lower-dimensional ones. The explicit expression of the general $J$ L-matrix was derived separately in [Man14] and [CP16]: (1.8) Here, ${}_4\bar\phi_3$ is the regularized terminating basic hypergeometric series. It is a simple exercise to check that when $J = 1$, the expression of $L^{(J)}_\alpha$ matches $L^{(1)}_\alpha$ in Definition 1.1. We will show momentarily that $L^{(J)}_\alpha$ is stochastic (Corollary 1.4). This allows us to view the matrix element $L^{(J)}_\alpha(i_1, j_1; i_2, j_2)$ as the weight of a vertex configuration in the manner we described in the $J = 1$ case. Note that now we allow up to $J$ lines in the horizontal direction.
Despite its explicitness, the expression of the L-matrix above is too complicated to manipulate directly. For instance, using (1.8) directly, it would be hard to demonstrate the stochasticity of $L^{(J)}_\alpha$. To this end, we recall a probabilistic derivation of $L^{(J)}_\alpha$ given in [CP16] using the idea of fusion, which goes back to [KR87]. We start by introducing some notation.
Define the stochastic matrix $\Xi$ with rows and columns indexed by $\{0, 1\}^{\otimes J}$ and $\{0, 1, \ldots, J\}$. In terms of the right part of Figure 2, its matrix elements provide the transition probabilities from the lines coming into a column from the bottom and the left to those leaving to the top and the right.
The following lemma allows us to decompose a vertex with horizontal spin $J/2$ (i.e. a vertex associated with $L^{(J)}_\alpha$) into a sequence of horizontal spin $1/2$ vertices; see Figure 2 for a visualization. Lemma 1.3. The following identity holds: Proof. This was shown in [CP16, Theorem 3.15].
Proof. Note that under the range imposed on $q, \alpha$, by Lemma 1.2 the matrix $L^{(1)}_{\alpha q^i}$ is stochastic for each $i = 0, 1, \ldots, J - 1$. Since a product of stochastic matrices is again stochastic, the stochasticity of $L^{(J)}_\alpha$ follows directly from Lemma 1.3.
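The one-line argument above can be illustrated generically; a minimal numerical sketch (the matrices here are arbitrary row-stochastic matrices, not the L-matrices themselves):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_stochastic(n):
    """A random row-stochastic matrix: nonnegative entries, rows summing to 1."""
    M = rng.random((n, n))
    return M / M.sum(axis=1, keepdims=True)

# Composing J single-layer transfer operators, as in Lemma 1.3: the product
# of row-stochastic matrices is row-stochastic, since for stochastic A, B,
#   sum_k (AB)_{ik} = sum_j A_{ij} * sum_k B_{jk} = sum_j A_{ij} = 1.
J = 4
P = random_stochastic(6)
for _ in range(J - 1):
    P = P @ random_stochastic(6)
assert np.allclose(P.sum(axis=1), 1.0) and (P >= 0).all()
```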
We proceed to define the SHS6V model on the first quadrant $\mathbb{Z}^2_{\geq 0}$. To each vertex $(x, y) \in \mathbb{Z}^2_{\geq 0}$ we associate a four-tuple $(v_{x,y}, h_{x,y}, v_{x,y+1}, h_{x+1,y}) \in \mathbb{Z}^4_{\geq 0}$, where $v_{x,y}, h_{x,y}$ represent the number of lines entering the vertex from the bottom and the left, and $v_{x,y+1}, h_{x+1,y}$ denote the number of lines flowing out of the vertex to the top and the right. The configurations chosen for two neighboring vertices need to be compatible, in the sense that the lines keep flowing: for instance, $v_{x,y+1}$ also represents the number of vertical input lines entering $(x, y+1)$, and $h_{x+1,y}$ equals the number of horizontal lines entering $(x+1, y)$ (see the right part of Figure 3). Given the input lines of a vertex, its output lines are sampled according to the probabilities $L^{(J)}_\alpha(v_{x,y}, h_{x,y}; \cdot, \cdot)$. Proceeding with this sequential sampling, we obtain a collection of paths going in the up-right direction, and we call this the SHS6V model.
We associate a height function $H : \mathbb{Z}^2_{\geq 0} \to \mathbb{Z}$ to the path ensemble, where the paths play the role of level lines of the height function (see Figure 3). Define, for any $x, y \in \mathbb{Z}_{\geq 0}$, Graphically, when we move across $i$ vertical lines from left to right, the height function decreases by $i$; when we move across $j$ horizontal lines, the height function increases by $j$. We further extend $H(x, y)$ to all $(x, y) \in \mathbb{R}^2_{\geq 0}$ by linearly interpolating the height function, first in the $x$-direction and then in the $y$-direction. It is obvious that the resulting function is Lipschitz and monotone.
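As an illustration of the sequential sampling and the height function, here is a minimal sketch for $J = 1$. The weights are the standard stochastic $J = 1$ weights (our assumption for the elided formula of Definition 1.1), and the parameter choice $I = 2$, $q = 2$, $\alpha = -0.1$ is one instance for which the weights below are genuine probabilities (the general admissible range is the one in Corollary 1.4, elided here). The boundary condition sends one horizontal line into every row and no vertical lines through the bottom:

```python
import numpy as np

rng = np.random.default_rng(1)

def step_probs(v, h, I, alpha, q):
    """Output distribution over (v', h') of a J = 1 vertex with input (v, h),
    using the assumed [CP16]-style stochastic weights."""
    nu = q ** (-I)
    if h == 0:
        p = (1 + alpha * q**v) / (1 + alpha)   # keep all lines going up
        return [((v, 0), p), ((v - 1, 1), 1 - p)]
    p = (alpha + nu * q**v) / (1 + alpha)      # keep the horizontal line
    return [((v, 1), p), ((v + 1, 0), 1 - p)]

def sample_height(N, I=2, alpha=-0.1, q=2.0):
    """Sample the J = 1 SHS6V model on [0,N]^2 and return the height
    function H, with H(0,y) = y and H(x,0) = 0."""
    v = np.zeros(N, dtype=int)                 # vertical lines below the current row
    H = np.zeros((N + 1, N + 1), dtype=int)    # H[x, y]
    H[0, :] = np.arange(N + 1)                 # crossing the incoming horizontal lines
    for y in range(N):
        h = 1                                  # horizontal line entering from the left
        for x in range(N):
            outs = step_probs(v[x], h, I, alpha, q)
            k = 0 if rng.random() < outs[0][1] else 1
            v[x], h = outs[k][0]
            # crossing v[x] vertical lines decreases the height by v[x]
            H[x + 1, y + 1] = H[x, y + 1] - v[x]
    return H

H = sample_height(30)
# Lipschitz/monotone: increments in y lie in {0, 1}, in x they lie in {-I, ..., 0}
assert ((np.diff(H, axis=1) >= 0) & (np.diff(H, axis=1) <= 1)).all()
assert ((np.diff(H, axis=0) <= 0) & (np.diff(H, axis=0) >= -2)).all()
```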
For later use, we call I/2, J/2 the vertical and horizontal spin respectively. If a vertex is of horizontal spin 1/2, we call it a J = 1 vertex, otherwise we call it a general J vertex.
1.3. Four point relation. [BG19] shows that the stochastic six vertex model height function converges to a telegraph equation and that its fluctuation field converges to a stochastic telegraph equation. The key observation is the following four point relation: if we define Here $b_1, b_2$ are the weights of the six vertex model configurations (in our notation, $b_1 = \frac{\alpha + \nu}{1 + \alpha}$, $b_2 = \frac{1 + \alpha q}{1 + \alpha}$), then the conditional expectation and variance of $\xi$ read $\mathbb{E}\big[\xi_{\mathrm{S6V}}(x+1, y+1) \,\big|\, \mathcal{F}(x, y)\big] = 0$, (1.9) where $\mathcal{F}(x, y)$ is the sigma-algebra generated by $\{H(u, v) : u \leq x \text{ or } v \leq y\}$ and $\Delta_x := q^{H(x+1,y)} - q^{H(x,y)}$, $\Delta_y := q^{H(x,y+1)} - q^{H(x,y)}$. The parameters $\gamma_i$, $i = 1, 2, 3$, depend on $b_1, b_2$.
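The displayed definition of $\xi_{\mathrm{S6V}}$ is elided above. One consistent reading of the martingale relation (1.9), which the sketch below verifies symbolically under our height-function conventions, is $\mathbb{E}[q^{H(x+1,y+1)} \,|\, \mathcal{F}(x,y)] = b_2 q^{H(x+1,y)} + b_1 q^{H(x,y+1)} + (1 - b_1 - b_2) q^{H(x,y)}$; this is our reconstruction, not the paper's exact display. The check enumerates the four possible inputs of a six vertex ($I = J = 1$) vertex:

```python
import sympy as sp

q, a = sp.symbols('q alpha', positive=True)
nu = 1 / q                    # nu = q**(-I) with I = 1: the six vertex case
b1 = (a + nu) / (1 + a)       # weight of the (0,1;0,1) configuration
b2 = (1 + a * q) / (1 + a)    # weight of the (1,0;1,0) configuration

# For each input (v, h) (vertical/horizontal lines entering the vertex), the
# possible outputs (v', h') with their probabilities; with nu = 1/q the input
# (1,1) keeps its lines with probability (alpha + nu*q)/(1 + alpha) = 1.
transitions = {
    (0, 0): [((0, 0), sp.Integer(1))],
    (1, 0): [((1, 0), b2), ((0, 1), 1 - b2)],
    (0, 1): [((0, 1), b1), ((1, 0), 1 - b1)],
    (1, 1): [((1, 1), sp.Integer(1))],
}

# Everything scales by q**H(x,y), so set H(x,y) = 0.  Then, under our height
# conventions, H(x+1,y) = -v, H(x,y+1) = h and H(x+1,y+1) = -v + h'.
ok = True
for (v, h), outs in transitions.items():
    lhs = sum(p * q**(hp - v) for (vp, hp), p in outs)   # E[q^H(x+1,y+1)]
    rhs = b2 * q**(-v) + b1 * q**h + (1 - b1 - b2)
    ok = ok and sp.cancel(lhs - rhs) == 0
assert ok
```

Note that the $(1,1)$ case forces the compatibility relation $q\, b_1 = b_2$, which indeed holds for the $b_1, b_2$ stated in the text.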
In our paper, we generalize the above relations to the SHS6V model. We define $\xi_{\mathrm{SHS6V}}(x+1, y+1)$ in terms of the four height values $H(x, y)$, $H(x+1, y)$, $H(x, y+1)$, $H(x+1, y+1)$, and we prove (respectively in Theorem 2.3 and Theorem 2.5) that (1.11) and (1.12) hold, where $R(x, y)$ is an error term that is negligible under our scaling. From now on, we may also use $\xi$ to denote $\xi_{\mathrm{SHS6V}}$.
Why does such a generalization exist? In the context of the stochastic six vertex model, (1.9) is related to the self-duality discovered in [CP16, Proposition 2.20], though it is more of a local relation than the way duality is usually stated (it is unclear to us how to prove (1.9) from the duality directly). In fact, [CP16, Corollary 3.3] shows that the SHS6V model with general $I, J$ enjoys the same self-duality, so it is natural to expect that (1.11), as a generalized version of (1.9), holds. For the quadratic variation, the situation is more subtle for the SHS6V model. We have not found a simple reason why (1.12) holds, though it may be understandable from our proof, which is briefly explained in the next paragraph. Here, we just emphasize that, as shown in Remark 2.6, there exist no $\gamma_i$, $i = 1, 2, 3$, such that the identity holds without an error term for the SHS6V model. We also emphasize that it is only under our scaling (1.13) that $R(x, y)$ is negligible.
Let us explain the ideas and techniques used in proving (1.11) and (1.12). In [BG19], the authors prove (1.9) and (1.10) via a direct computation, which amounts to enumerating all possible six vertex configurations. In our case, the situation is more involved: when $J$ is large, the expression of $L^{(J)}_\alpha$ is so complicated that it is hopeless to check these relations directly. Instead, we first verify them directly for $J = 1$, in which case the L-matrix has the simple expression given in Definition 1.1. For general $J$, we use fusion, which allows us to decompose the general $J$ vertex into a sequence of $J = 1$ vertices (see Figure 2). Repeatedly using the $J = 1$ version of (1.11) (with the spectral parameter $\alpha$ replaced by $\alpha q^i$ in the expression of $\xi$), we obtain $J$ identities. Summing these identities in a suitable way, we see a telescoping property, and (1.11) follows. To prove (1.12), besides using fusion, we need a property of our scaling (1.13): with probability converging to 1, the lines entering a vertex keep flowing in the same direction (see Lemma 2.4).
In [CP16], fusion was stated in such a way that the spectral parameters progress geometrically by $q$ from bottom to top when we decompose the general $J$ vertex into a column of $J = 1$ vertices. It turns out (Lemma 2.1) that we can also reverse the direction and let the parameters progress geometrically by $q$ from top to bottom (meanwhile changing the probability distribution assigned to the input lines from the left). We have not seen this result elsewhere. Note that it is only after this reversal of the spectral parameters that we obtain the telescoping property mentioned in the previous paragraph.
1.4. Stochastic telegraph equation as a scaling limit of the SHS6V model. Having established the four point relation, we are ready to state our results. We show that under our scaling: (i). (Hydrodynamic limit (or law of large numbers) - Theorem 1.6): The SHS6V model height function converges uniformly in probability to the solution of a telegraph equation. (ii). (Functional central limit theorem - Theorem 1.7, see also Corollary 1.9): The fluctuation field of the height function around its hydrodynamic limit (viewed as a random continuous function) converges weakly to the solution of a stochastic telegraph equation.
Once we have proved the four point relation for the SHS6V model, the proof of the law of large numbers is akin to [BG19, Theorem 5.1]. For the functional central limit theorem, our proof breaks down into proving finite dimensional weak convergence (Proposition 3.1) and tightness (Proposition 3.2). For the finite dimensional convergence, the proof follows a similar idea as [BG19, Theorem 6.1], subject to certain generalizations. For the tightness, we rely on the Burkholder inequality and a careful control of the joint moments of $\xi$ at different locations (Lemma 3.3). We remark that the proof of tightness may not fit into the framework of classical functional martingale CLT results (e.g. [Bro71, Section 6]); see Remark 3.4 for more discussion.
To present our results, let us first introduce our scaling. Fix $I, J \in \mathbb{Z}_{\geq 1}$ and positive $\beta_1 \neq \beta_2$; we scale the parameters $q, \alpha$ so that (1.13) holds. It is straightforward to check that as $L \to \infty$, $\alpha$ and $q$ always satisfy one of the conditions given in Corollary 1.4. The limiting height profile $h$ below is subject to the boundary condition specified by $q^{h(x,0)} = q^{\chi(x)}$ and $q^{h(0,y)} = q^{\psi(y)}$.
Having established the law of large numbers for the height function, we proceed to show the functional central limit theorem. As a convention, we endow the space $C(\mathbb{R}^2_{\geq 0})$ with the topology of uniform convergence over compact subsets and use "$\Rightarrow$" to denote weak convergence. Recall that we linearly interpolate $H(x, y)$ for non-integer $x, y$, so that $H(x, y) \in C(\mathbb{R}^2_{\geq 0})$.
Theorem 1.7. Assuming further that $\chi(x)$ and $\psi(y)$ are piecewise $C^1$-smooth, we have the weak convergence as $L \to \infty$, where $\varphi(x, y)$ is a random continuous function which solves the stochastic telegraph equation

(1.15)

Here, $q^h_x := \partial_x(q^{h(x,y)})$ and $q^h_y := \partial_y(q^{h(x,y)})$, and the boundary data of $\varphi$ is zero.

Remark 1.8. By (1.4), it is clear that ϕ is a Gaussian field with covariance function
where R IJ is the Riemann function in (1.3) with β 1 and β 2 replaced by Iβ 1 and Jβ 2 respectively, i.e.
As a corollary of the previous results, we have the following. Corollary 1.9. As $L \to \infty$, The rest of the paper is organized as follows. In Section 2, we first establish an identity (Lemma 2.1) that gives an alternative way to apply fusion. Then, we prove our four point relation (Theorem 2.3 and Theorem 2.5). We also discuss some properties of our scaling (Lemma 2.4). In Section 3, we first use the four point relation to prove the law of large numbers (Theorem 1.6) and the finite dimensional version of the CLT (Proposition 3.1). Then we establish tightness (Proposition 3.2) and upgrade our CLT to the functional level (Theorem 1.7).
1.5. Acknowledgment. The author wants to thank Ivan Corwin for many valuable comments on the paper; Vadim Gorin for helpful comments and discussion; and Shalin Parekh for an inspiring discussion about the tightness result. The author was supported by Ivan Corwin through the NSF grants DMS-1811143, DMS-1664650 and also by the Minerva Foundation Summer Fellowship program.

Four point relation
In this section, we prove the four point relations (1.11) and (1.12) mentioned in Section 1.3. To begin with, we present a lemma that allows us to reverse the spectral parameters upside down when we decompose the general $J$ vertex into a column of $J = 1$ vertices; see Figure 4 for a visualization. The key to our proof is an identity that allows us to swap a pair of vertices with different spectral parameters; see Figure 5. We did not find such an identity in the literature, and it seems to us that it does not follow directly from the Yang-Baxter equation.
Define the stochastic matrix $\Lambda$. Note that, comparing the expression of $\Lambda$ with that of $\Xi$, the term $q^{i-1}$ is replaced by $q^{J-i}$, which corresponds to reversing the spectral parameters upside down. (2.1) Consequently, we have an alternative expression for the general $J$ vertex weight (2.2). Proof. By Lemma 1.3, it is clear that (2.1) implies (2.2). It suffices to prove (2.1), which, graphically, says the following (see Figure 4). Figure 4. Pictorial representation of the identity (2.1). The weight (wt) of a diagram is given by a summation of products of L-matrices over $h_1, \ldots, h_J$, subject to the conditions $h_1 + \cdots + h_J = h$ and $h'_1 + \cdots + h'_J = h'$. Each product on the left (resp. right) hand side of the summation is reweighted by $\Lambda(h; h_1, \ldots, h_J)$ (resp. $\Lambda(h'; h'_1, \ldots, h'_J)$).
When $J = 1$, the proof is trivial. When $J = 2$, the identity (2.1) reduces to the identity depicted in Figure 5; since $h, h' \in \{0, 1, 2\}$, there are nine cases in total. One can verify each case directly; here we only show the verification for $h = 1$ and $h' = 1$, in which case the computation is more involved. The LHS in Figure 5 equals (2.3) and the RHS equals (2.4). It is not hard to see directly that the right hand sides of (2.3) and (2.4) are both the sum of the following four terms (divided by the common denominator $(1+q)(1+\alpha)(1+\alpha q)$): For the other $h, h' \in \{0, 1, 2\}$, we omit the details of the verification.
For general $J$, we look at the column of vertices on the LHS of the equation illustrated in Figure 4. From bottom to top, we label the vertices from $1$ to $J$. Sequentially for $i = 1, \ldots, J - 1$, we apply the $J = 2$ identity (that we just verified) to the vertices $i$ and $i + 1$ in the column. The spectral parameters of the vertices (from bottom to top) then change from $(\alpha, \alpha q, \ldots, \alpha q^{J-1})$ to $(\alpha q, \alpha q^2, \ldots, \alpha q^{J-1}, \alpha)$; note that the vertex with spectral parameter $\alpha$ moves from the bottom to the top, and $\Lambda$ changes accordingly. Next, we apply the $J = 2$ identity for $i = 1, \ldots, J - 2$ to move the spectral parameter $\alpha q$ to the second place from the top. Iterating this procedure, we finally obtain a column of vertices with spectral parameters $(\alpha q^{J-1}, \alpha q^{J-2}, \ldots, \alpha)$, with the left input lines weighted by $\Lambda$.
Remark 2.2. It turns out that, following the same argument, the identities (2.1) and (2.2) also hold when we replace the stochastic matrix $\Lambda$ with where $\sigma$ is an arbitrary permutation of $\{1, 2, \ldots, J\}$. We do not include this generalization in the lemma since we will not use it.
then we have, Proof. Since our model is homogeneous, i.e. every vertex is assigned the same L-matrix, we suppress the dependence on $x, y$ in our notation and denote by (2.7). We prove this identity in two steps. Step 1 ($J = 1$): We assume $J = 1$, in which case the vertex weights (1.8) reduce to the weights in Definition 1.1. Let us verify (2.7) directly. Since $J = 1$, $h$ is either $0$ or $1$; we discuss the two cases separately.
If $h = 0$, i.e. $H(x, y+1) = H$, by Definition 1.1, Hence, Step 2 (general $J$): (2.11) Since $h = h_1 + \cdots + h_J$, we have $H_J = H(x, y+1)$. Furthermore, $H'_J = H(x+1, y+1)$ in law. It suffices to prove (2.12). This is equivalent to We define the sigma-algebra $\mathcal{F}_i = \sigma(H_i, H'_i, H_{i+1})$ for $i = 0, 1, \ldots, J - 1$. Since all the vertices are now of horizontal spin $1/2$, using the $J = 1$ version of (2.6) (proved in Step 1) for the $i$-th vertex (with spectral parameter $\alpha q^{J-i}$) counting from the bottom, we have In other words, Iterating the above equation from $i = J$ to $i = 1$, one concludes the desired (2.12). To prove relation (1.12), we need the following fact, which says that under our scaling (1.13), it is unlikely that a vertex changes the direction of the lines entering it. More specifically, if a vertex has $i$ vertical input lines and $j$ horizontal input lines, then with probability going to 1 it produces $i$ vertical and $j$ horizontal output lines.
We use $O(a)$ to denote a quantity bounded by a constant times $a$ when the scaling parameter $L$ is large. $\mathbb{E}\big[\xi(x+1, y+1)^2 \,\big|\, \mathcal{F}(x, y)\big]$

where $R(x, y)$ is a random field with the uniform upper bound
$|R(x, y)| \leq CL^{-4}$, (2.14)
for all $x \in [0, LA] \cap \mathbb{Z}$ and $y \in [0, LB] \cap \mathbb{Z}$, where $C$ is a constant that depends only on $A$ and $B$.
Proof. We only need to show that the random field $R(x, y)$ defined above satisfies (2.14). Using the same notation as in the proof of Theorem 2.3, it is clear that $\mathbb{E}[\xi(x+1, y+1)^2 \,|\, \mathcal{F}(x, y)] = \mathbb{E}[\xi^2 \,|\, \mathcal{F}]$. Our proof is divided into two steps.
If h = 0, The second equality in the above display follows from a straightforward calculation.
Step 2 (general $J$): Similarly to what we did in Theorem 2.3, we apply fusion (see Figure 6). Recall $H_i, H'_i$ from (2.10) and (2.11) and define where the second equality follows from the telescoping property of the summation.
By Theorem 2.3, the $\xi_i$ are martingale increments, so $\mathbb{E}[\xi_i \xi_j \,|\, \mathcal{F}] = 0$ for $i \neq j$. It follows from (2.22) that (2.23). Using the $J = 1$ version of (2.16), proved in Step 1, for the $i$-th vertex counting from the bottom (here, although the spectral parameter changes from $\alpha$ to $\alpha q^i$, this does not affect our scaling), we obtain (2.25). By conditioning, $\mathbb{E}[\xi_i^2 \,|\, \mathcal{F}] = \mathbb{E}\big[\mathbb{E}[\xi_i^2 \,|\, \mathcal{F}_{i-1}] \,\big|\, \mathcal{F}\big]$ (note that here we are using not the tower property but the sequential update rule). Using (2.23) and (2.24), we get Note that under our scaling, $\lim_{L \to \infty} \frac{1 + \alpha q^{J-i}}{1 + \alpha} = 1$, which together with (2.25) gives Furthermore, by Lemma 2.4, Hence, we can simplify (2.26) and get The last line holds because $\Delta_x$, $\Delta_y$ and $H$ are measurable with respect to $\mathcal{F}$.
Since $\Delta_y = 0$, the right hand side of (1.10) reduces to So for all $v \in \{0, 1, \ldots, I\}$, Canceling the factor $(q^{-v} - 1)q^{2H}$ on both sides, we get Since $\gamma_2$ does not depend on $v$, the previous equation cannot hold for $v = 1, 2$ simultaneously.
The following corollary is a direct consequence of Theorem 2.5.
Proof. It is clear that there exists C such that for any x ∈ [0, LA] ∩ Z and y ∈ [0, LB] ∩ Z, Similarly, |∆ y | ≤ CL −1 . Referring to Theorem 2.5 (note that q H(x,y) is bounded), the corollary follows.

Proof of the main results
Having established the four point relation, we move on to proving Theorem 1.6 and Theorem 1.7. Corollary 1.9 follows from a straightforward argument once Theorem 1.7 is proved. In the ensuing discussion, we will usually write $C$ for constants, and we will generally not specify when irrelevant terms are absorbed into the constants. We may also write, for example, $C(n)$ when we want to specify which parameters the constant depends on. We first demonstrate (i). By Theorem 2.3, Summing this equation over $x = 0, 1, \ldots, LX - 1$ and $y = 0, 1, \ldots, LY - 1$ yields Combining this with (3.1) and taking the $L \to \infty$ limit, $q^{\tilde h}$ satisfies the integral equation In other words, any limit point $q^{\tilde h}$ of $\mathbb{E}\, q^{\frac{1}{L} H(Lx, Ly)}$ as $L \to \infty$ satisfies the telegraph equation By our assumption on the boundary, we also know that $q^{\tilde h(x,0)} = q^{\chi(x)}$ and $q^{\tilde h(0,y)} = q^{\psi(y)}$. This implies that $\tilde h = h$, which concludes (i).
Summing over $x$ and $y$, since $\xi(x, y)$ is a martingale increment, using Corollary 2.7 we obtain Applying Doob's $L^p$ maximal inequality, it is clear that Observe that the functions $U(L\cdot, L\cdot)$ are uniformly bounded and uniformly Lipschitz on $[0, A] \times [0, B]$. Therefore, their laws are tight, and any subsequential limit $U$ has continuous trajectories and must solve the $L = \infty$ version of (3.3), which reads (the right hand side being zero by (3.4)) According to [BG19, Prop 4.1], the only solution to the above equation is $U = 0$, which implies (ii).
We move on to proving the functional CLT for the SHS6V model. The proof of Theorem 1.7 consists of showing finite dimensional weak convergence and tightness, which are formulated in the following two propositions.
Denote by Proposition 3.1 (finite dimensional convergence). With the same setup as in Theorem 1.7, we have weak convergence of the finite dimensional distributions as $L \to \infty$, Recall that we linearly interpolate $H(x, y)$ for non-integer $x, y$; thus $H$ is a function in $C(\mathbb{R}^2_{\geq 0})$, and so is $U_L(x, y)$.
Consequently, the sequence of random functions $U_L(\cdot, \cdot) \in C(\mathbb{R}^2_{\geq 0})$ is tight. Proof of Theorem 1.7. The proof is a direct combination of Proposition 3.1 and Proposition 3.2.
We first prove the finite dimensional weak convergence.
The key to the proof is to study the conditional variance of $M_L(t)$ at $t = L^2 XY$. We show that as $L \to \infty$, it converges in probability to the variance of the solution $\varphi$ of (1.15). In other words, we need to prove (3.9), where the RHS is the variance of $\varphi(X, Y)$; see Remark 1.8.
To prove this convergence, we first use Theorem 2.5, where $o(1)$ represents a term converging to zero as $L \to \infty$. Using these expansions, we have Using the law of large numbers proved in Theorem 1.6, uniformly for $y$ Consequently, it follows from (3.14) that Note that in the last line we used the fact that the solution $q^h$ to (1.14) is piecewise $C^1$ (since we additionally assume that the boundary data $\chi$ and $\psi$ are piecewise $C^1$-smooth). By (3.15), (3.16) By first letting $L \to \infty$ and then $\theta \to 0$, we conclude the desired (3.10). The approximation for (3.11) is similar, and we omit the details.
Things become more involved when we show (3.12). Note that $H(x, y)$. In the last equality, we used the approximation in (3.13) and Note that $-\Delta_x$, $\Delta_y$ indicate the number of lines entering the vertex $(x, y)$ from the bottom and the left. Each unusual vertex changes the LHS summation by at most $2IJ\theta L$. As an analogue of [BG19, Eq. 93], It follows from Lemma 2.4 that the probability that a vertex is unusual is upper bounded by $CL^{-1}$, where $C$ is a constant. Thus, with high probability as $L \to \infty$. Noting that This being the case, combining (3.17) and (3.18) (together with Theorem 2.3) yields Using a similar approximation as in (3.16), by first letting $L \to \infty$ and then $\theta \to 0$, we demonstrate (3.12). Having proved (3.10)-(3.12), we obtain the desired (3.9).
We conclude the theorem using the martingale CLT [HH14, Section 3]. Recall that We need to verify: (i) the conditional variance $R^{\mathrm{d}}(LX, LY; x+1, y+1)^2\, \mathbb{E}[\xi(x+1, y+1)^2 \,|\, \mathcal{F}(x, y)]$ has the same $L \to \infty$ behavior as its unconditional variance, in the sense that their ratio tends to 1 in probability; (ii) Lindeberg's condition, i.e. $\lim_{L \to \infty}$ Using Corollary 2.7, it is clear that the conditional variance on the LHS of (3.9) is uniformly bounded. By the convergence in (3.9) together with the dominated convergence theorem, we know that both the conditional and the unconditional variance of $M_L(t)$ at $t = L^2 XY$ converge to the RHS of (3.9) (which equals the variance of $\varphi(X, Y)$ given by (1.16)); this concludes (i).
Lindeberg's condition (ii) follows directly from how $\xi$ is defined: by a straightforward computation, there exists a constant $C$ such that $|\xi(x+1, y+1)| \leq CL^{-1}$ for all $x \in [0, LA]$ and $y \in [0, LB]$. In addition, $R^{\mathrm{d}}(LX, LY; x, y)$ is uniformly bounded. So when $L$ is large enough, which implies that for every $i \in [1, L^2 XY]$, Having verified (i) and (ii), we conclude our proof using the martingale central limit theorem.
We move on to proving Proposition 3.2. Before presenting our proof, we need the following result.
Proof. It suffices to prove that for $(x, y)$ We first finish the proof of the lemma assuming (3.19). Consider the ordering (3.8) of integer points in $[1, LA] \times [1, LB]$; without loss of generality, we assume $(x_i, y_i) = (x(s_i), y(s_i))$ with $s_1 < \cdots < s_n$. Recall that $\mathcal{F}(x, y) = \sigma(H(i, j) : i \leq x \text{ or } j \leq y)$, so $\xi(x_i, y_i) \in \mathcal{F}(x_n - 1, y_n - 1)$ for $i = 1, \ldots, n - 1$. By (3.19) and conditioning, It is evident that we can rewrite $\xi(x+1, y+1)$ in (2.5) as Referring to (3.20), we conclude that for fixed $A$ and $B$ there exists a constant $C$ such that for arbitrary $L > 1$, Note that Proof of Proposition 3.2. Using the Kolmogorov-Chentsov criterion, the tightness of $U_L(\cdot, \cdot)$ follows directly from (3.5). It suffices to show that there exists a constant $C$ such that for $X \in [0, LA]$ and $0 \leq Y_1 \leq Y_2 \leq LB$, Taking the $n$-th power of both sides in the above display and applying the inequality $(a + b)^{2n} \leq 2^{2n-1}(a^{2n} + b^{2n})$ to the right hand side, we have Denote the first and second terms above (without the constant multiplier) by $M_1$ and $M_2$ respectively. We proceed to upper bound $M_1$ and $M_2$.
For $M_1$, since $\xi(x, y)$ is a martingale increment, by the Burkholder-Davis-Gundy inequality (for the discrete martingale increments $\xi$), we have Here, the summation is taken over partitions $\lambda$ of $n$, that is, $\lambda = (\lambda_1 \geq \cdots \geq \lambda_s) \in \mathbb{Z}^s_{\geq 1}$ with $\sum_{i=1}^s \lambda_i = n$, where $\ell(\lambda) = s$ is the length of the partition $\lambda$. We want to upper bound the right hand side of the above display; by Lemma 3.3, we know that Using Lemma 3.3 and a similar argument to the one used in upper bounding $M_1$, we have The last inequality in the above display is due to our assumption $Y_2 - Y_1 \geq L^{-1}$.

Remark 3.4.
It is worth remarking that the classical theory of functional martingale CLTs (e.g. [Bro71, Section 6]) might not be helpful for proving our tightness. In order to obtain tightness, the classical theory requires $U_L(X, Y)$ to be a martingale in $(X, Y)$, so as to control (using martingale inequalities) the modulus of continuity for small $\delta > 0$ and then apply Arzelà-Ascoli (see [Bil13, Theorem 7.3]). In our case, though $\xi(x, y)$ is a martingale increment, $U_L(X, Y)$ fails to be a martingale due to the dependence of $R^{\mathrm{d}}$ on $X, Y$ in (3.24). $\cdots = \dfrac{\cdots}{\mathbb{E}\, q^{H(Lx, Ly)} \log q} = \dfrac{\varphi(x, y)}{q^{h(x,y)} \log q}.$
To get the second equality above, we apply Theorem 1.6 and Theorem 1.7 to the denominator and the numerator respectively. By a straightforward computation, $\phi(x, y) := \frac{\varphi(x, y)}{q^{h(x,y)} \log q}$ solves (1.17), which concludes the corollary.