Ensemble averages, Poisson processes and Microstates

We consider ensemble-averaged theories with discrete random variables and propose a suitable measure for the ensemble average. We also provide a mathematical description of such ensemble averages in terms of Poisson point processes. Moreover, we demonstrate that averaging theories of this type has an equivalent description as tracing over part of the microscopic degrees of freedom in a suitable continuum limit of a single microscopic theory. The results from both approaches can be identified with Liouville gravity, and we further address some of its implications for the microscopic theory, including avenues to look for quantum effects from the viewpoint of the averaged theory. Generalizations to other point processes are also discussed.


Introduction
An increasing amount of evidence from recent studies suggests that the holographic dual of classical gravity might be the average of an ensemble of field theories [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19]. Most of the analyses so far focus on models with Gaussian-type continuous random variables. On the other hand, there have been explicit computations of the gravitational path integral in some simple toy models whose boundary dual is shown to be an average of theories in which quantities are subject to discrete distributions [20]; see also [21,22] for related discussion. Given their possible connection to the microscopic discreteness of quantum theories, averages weighted by discrete distributions are clearly interesting and worth in-depth study in their own right.
With these motivations, we study properties of an average of field theories with random variables drawn from the Poisson distribution. We show that the effective theory after the average is a Liouville theory. Along the derivation, we demonstrate the importance of choosing an appropriate measure for the discrete averaging process to give a well behaved effective theory. To provide a mathematically more accurate description of such averaging, we show that our setting can be cast precisely into a point process. In this language, the averaging to get the effective action is nothing but the Laplace functional of the Poisson process.
Moreover, it is interesting to understand the nature of averaging over random theories [6,8][23][24][25][26][27], namely whether the average is genuinely among different theories, whether it is simply a useful trick for certain computations, or whether it originates from averaging over an ensemble of states in a "parent" theory. We address this question quantitatively in our model and show that one can rewrite the average over theories with Poisson randomness as a trace over part of the microscopic degrees of freedom in a single (suitably double-scaled) microscopic model. This connection is different from previous discussions in the literature, and it gives a concrete realization of an equivalence between the average over an ensemble of theories and the average over an ensemble of states in a given theory. From this microscopic point of view, the above requirement of choosing an appropriate measure in the random-average approach is reflected in a careful definition of how to trace out part of the underlying degrees of freedom.
We further discuss the average over random theories subject to the Skellam distribution. The averaged effective theory is a Sinh-Gordon type model. One can also obtain this theory by tracing out some fermionic degrees of freedom in a double-scaling limit of a microscopic theory. As in the Liouville case, this gives a concrete realization of a gravitational theory as an effective description of some microscopic model after we choose to erase (part of) the model's microscopic information.

Averaged Poisson random models
Motivated by the above derivation, we consider a real scalar field with a chemical potential. In Euclidean signature, this is given by the Lagrangian (2.1). The discussion in this section applies to general dimension, so we first keep the dimension unspecified. In the following we consider a general case where the source $J$ has two components, as in (2.2). The component $J_0(x)$ is a conventional classical source, and the other, $J_1(x)$, is a random source related to a Poisson distribution. We are interested in the effective theory obtained after averaging over the random source $J_1(x)$. In practice, this means we would like to find $S_{\rm eff}$, schematically, from $e^{-S_{\rm eff}} = \int DJ_1(x)\, P(J_1(x))\, e^{-\int d^d x\, L(\phi)}$ (2.3). The crucial question is how to pick the correct measure $DJ_1(x)\, P(J_1(x))$. One might think this is in exact parallel to the average over a Gaussian-type random coupling, as in the SYK model [1,[28][29][30][31][32]] or in models with Gaussian random sources [33][34][35][36][37][38], with $P(J_1(x)) = \mathrm{Pois}(J_1(x), \lambda(x))$ being simply the Poisson distribution with parameter $\lambda(x)$. However, as we show in appendix B, this naive definition does not give a sensible average over the discrete-valued random sources. In fact, it turns out to be far from trivial to pick a correct measure for averaging over such a set of theories; an inappropriate choice could lead to resulting theories that are pathological to various degrees. In the following, we give an example of finding a sensible measure on the set of theories to be averaged. We provide two different approaches to analyse this discrete random model, one more physically oriented and the other more mathematically rigorous, and we show that they lead to the same result.

A physical point of view
To make the averaging process well defined, it turns out that we should treat the combination $J_1(x)\,dV(x)$ as the random variable, which in the Poisson example is distributed as in (2.4), where $dV(x)$ is a volume element around position $x$ and $\lambda(x)\,dV(x)$ is the Poisson parameter. In writing this expression we have chosen a given discretization of the spacetime, so that the volume is $V(M) = \sum_n dV(x_n)$, and we will consider the fine-grained limit of such a discretization.
Notice that the Poisson probability distribution considered here has a position-dependent parameter, and the distribution at each point depends only on local information. The latter is equivalent to the assertion that the currents on different $dV(x_n)$ are mutually independent for any discretization, which is the reason we can write the distribution as a product, as in (2.4). The probability function superficially depends on how we discretize the spacetime $M$. However, as we show in the following, since the mutual independence holds for any discretization, the results after averaging over this probability distribution do not depend on the concrete discretization, and hence we can take the continuum limit of the spacetime discretization smoothly.
It is crucial that the distribution is integer valued, rather than real valued, so that we can treat it as a counting measure supported on a measure-zero subset in an arbitrarily small $dV$; the discretized value $J_1(x)\,dV(x)$ can be thought of as counting the number of random points in the volume $dV(x)$. Therefore putting $dV$ into the distribution is as well defined as an integral over a sum of Dirac delta functions in $dV(x)$. In addition, we have rescaled $\lambda(x_n)$ accordingly so that the distribution itself is not singular, i.e. $\lambda(x)$ is not zero, as the volume element tends to zero. To put it another way, consider the rescaled sources $J_1(x_n)\,dV(x_n)$ subject to a Poisson distribution: their mean is the Poisson parameter. Physically, we would like to normalize the mean value appropriately, which means we would like the mean value of $dV(x_n)\,J_1(x_n)$ to again be proportional to the volume element $dV(x_n)$. Therefore we choose to rescale the mean value to $\lambda(x_n)\,dV(x_n)$. Since this mean value is just the parameter of the Poisson distribution, the above scaling argument indicates that we have to consider the Poisson distribution with parameter $\lambda(x_n)\,dV(x_n)$. This justifies the rescaled distribution (2.4). With this choice of the random ensemble, we can work out the average over the random sources of the model. We will formulate the computation in more rigorous mathematical language in the next section. Here we provide an instructive derivation in which we discretize the integral as a Riemann sum and then take the continuum limit. In this derivation we use the fact that the probability distributions at different positions are mutually independent, and we use $n$ to collectively label the different grid points.
This leads to the averaged effective potential, where the "path integral" $\int DJ_1(x)$ contains both a sum over all the sources at different spacetime points $x$ and a sum over all possible values of $J_1(x)\,dV(x)$ at each point $x$. We therefore obtain the effective action (2.14). The result (2.14) is a generalized Liouville theory, with an effective background curvature indicated by the value of $J_0(x)$ and a "cosmological constant" $\lambda(x)$. In this derivation $\lambda(x)$ remains a function, or a background field, without dynamics; therefore for any given $\lambda(x)$ we get a different effective action. Clearly, when the Poisson parameter takes a homogeneous value $\lambda(x) = \lambda$, the action is exactly a Liouville action. We will provide a detailed interpretation of this position-dependent expression and its relation to 2D gravity in section 5.3.
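The discretization independence used in this derivation can be checked numerically. The following is a minimal sketch, assuming a one-dimensional carrier space $[0,1]$, an illustrative intensity `lam(x) = 1 + x` and an exponent profile `u(x) = -x` that stands in for the field-dependent exponent. These choices, the function names and the sign convention of the exponent are assumptions made for the example and are not taken from the text. Each cell contributes the Poisson sum $\sum_k \mathrm{Pois}(k;\lambda\,dV)\,e^{uk}$, and the logarithm of the product over cells approaches $\int_0^1 \lambda(x)\,(e^{u(x)}-1)\,dx$ as the grid is refined.

```python
import math

def poisson_average(u, lam, terms=200):
    """Return sum_k Pois(k; lam) * exp(u * k), truncated after `terms` terms."""
    total = 0.0
    weight = math.exp(-lam)          # Pois(0; lam)
    for k in range(terms):
        total += weight * math.exp(u * k)
        weight *= lam / (k + 1)      # Pois(k + 1; lam) from Pois(k; lam)
    return total

def log_cell_product(u, lam, cells, length=1.0):
    """Log of the product of independent per-cell averages on a uniform grid."""
    dv = length / cells
    total = 0.0
    for n in range(cells):
        x = (n + 0.5) * dv           # representative point of the n-th cell
        total += math.log(poisson_average(u(x), lam(x) * dv))
    return total

def u(x):
    """Hypothetical exponent profile (assumption for this example)."""
    return -x

def lam(x):
    """Hypothetical intensity function (assumption for this example)."""
    return 1.0 + x

# Continuum value of int_0^1 lam(x) * (exp(u(x)) - 1) dx for the choices above.
EXACT = 0.5 - 3.0 / math.e

if __name__ == "__main__":
    for cells in (4, 40, 400):
        print(cells, log_cell_product(u, lam, cells), EXACT)
```

Refining the grid changes the result only through the usual Riemann-sum error, which is the numerical counterpart of the statement that the average does not depend on the concrete discretization.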
Further notice that the sign of the Liouville potential in (2.14) is "wrong": the probability interpretation requires $\lambda(x) \geq 0$, whereas the potential in Euclidean signature with this sign is unstable. To cure this problem, we can define a slightly modified Poisson averaging procedure. In particular, we consider the same set of theories with the random potential as in (2.1). The source still contains a fixed piece $J_0(x)$ and a random piece $J_1(x)$, and the $J_1(x)$ piece is again related to a Poisson distribution. The crucial difference is that we do not average this family of theories naively according to the probability distribution. Instead, we consider the average with an extra insertion of the $(-1)^F$ "operator" in the measure. In the resulting average of the potential term, a $J_1$-independent factor is introduced so that the average is correctly normalized. The inclusion of this twist operator does not affect other properties of the definition. With the twist operator inserted, we obtain the effective action (2.21).

Let us summarize what we have done so far. Conceptually, the logic we follow here is to consider different ensemble-average schemes and check whether any of them has a clear physical interpretation. As in many other recent discussions of ensemble averaging of theories, e.g. [10,11], at present we do not have a clear criterion to determine which family of theories should be grouped together and averaged over, or which measure should be used for the average. In this work, we determine a better averaging scheme for a given set of theories by comparing the result with other well-defined theories. An alternative interpretation of the previous analysis is therefore that we find a proper measure for the average of the set of theories (2.1), and that this measure necessarily includes a "twisted" factor (2.19).
In fact, this twist factor resembles the $(-1)^F$ factor in the computation of the Witten index [39] in supersymmetric theories. The need for this factor, together with the fact that the Poisson distribution is discrete, gives a clear indication that there must be a microscopic origin of this random averaged model, and in addition that there must be fermionic degrees of freedom in the microscopic model. We will see in section 3 that this is indeed the case, and that the fermion number operator $(-1)^F$ in the microscopic theory plays the role of the $(-1)^F$ operator inserted here in the measure of the random averaged model.
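The effect of the alternating sign can be illustrated on a single cell. The sketch below assumes that the twisted weight for $k$ points is $(-1)^k\,\mathrm{Pois}(k;\lambda)$ and that the $J_1$-independent normalization is the sum of these weights; both the precise form of the normalization and the function name `twisted_average` are assumptions for illustration. Under these assumptions the normalized average of $e^{uk}$ equals $e^{-\lambda(e^{u}-1)}$, i.e. the exponent has the opposite sign compared with the untwisted Poisson average $e^{+\lambda(e^{u}-1)}$.

```python
import math

def twisted_average(u, lam, terms=200):
    """Normalized average of exp(u * k) with weight (-1)^k * Pois(k; lam).

    The normalization (sum of the signed weights) is an assumption of
    this sketch; it makes the average equal to 1 at u = 0.
    """
    numerator = 0.0
    normalization = 0.0
    weight = math.exp(-lam)                  # Pois(0; lam)
    for k in range(terms):
        sign = -1.0 if k % 2 else 1.0        # the (-1)^k twist
        numerator += sign * weight * math.exp(u * k)
        normalization += sign * weight
        weight *= lam / (k + 1)              # Pois(k + 1; lam) from Pois(k; lam)
    return numerator / normalization

if __name__ == "__main__":
    u, lam = 0.3, 0.7
    print(twisted_average(u, lam), math.exp(-lam * (math.exp(u) - 1.0)))
```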

A Poisson process point of view
In this section we provide a more mathematical, yet still intuitive, description of the above computation. The model we are interested in is still the one introduced above, where we now interpret the source $J_1(x)$ as a Poisson process on the carrier space. There are different equivalent definitions of Poisson processes. Here we adopt the following intuitive definition [40]. A Poisson process $\Pi$ describes a random set of points on a given carrier space whose occurrences are mutually independent and obey a Poisson distribution. Concretely, let $B$ be a Borel measurable subset of the carrier space $X$, and let $N(B)$ be the number of points appearing in this region. Then $N(B)$ obeys a Poisson distribution, $P(N(B) = k) = \frac{\Lambda(B)^k}{k!}\, e^{-\Lambda(B)}$, where the Poisson parameter $\Lambda(B)$, also known as the mean measure, is determined by $\Lambda(B) = \int_B \lambda(x)\, dV(x)$, with $dV(x)$ a volume element. The integrable function $\lambda(x)$ is commonly referred to as the intensity function. A numerical simulation of a Poisson process is shown in figure 1.
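A Poisson process with a position-dependent intensity can be simulated with a few lines of code. The following is a minimal sketch using the standard thinning construction on the interval $[0,1]$; it is not the code behind figure 1. The intensity `20 * x`, the bound `lam_max` and all function names are assumptions chosen for illustration. The empirical mean counts in two disjoint regions can be compared with the mean measure $\Lambda(B) = \int_B \lambda(x)\,dx$, which is $2.5$ on $[0, 0.5)$ and $7.5$ on $[0.5, 1)$ for this intensity.

```python
import random

def sample_poisson_process(intensity, lam_max, length, rng):
    """Sample the points of a Poisson process on [0, length] by thinning.

    `intensity(x)` must satisfy 0 <= intensity(x) <= lam_max on the interval.
    """
    points = []
    x = 0.0
    while True:
        # Next candidate point of a homogeneous process with rate lam_max.
        x += rng.expovariate(lam_max)
        if x > length:
            return points
        # Keep the candidate with probability intensity(x) / lam_max.
        if rng.random() < intensity(x) / lam_max:
            points.append(x)

def count_in(points, a, b):
    """Number of points in the region [a, b)."""
    return sum(1 for p in points if a <= p < b)

def estimate_means(trials=20000, seed=0):
    """Empirical mean counts in [0, 0.5) and [0.5, 1) for intensity 20 * x."""
    rng = random.Random(seed)
    left = right = 0
    for _ in range(trials):
        pts = sample_poisson_process(lambda x: 20.0 * x, 20.0, 1.0, rng)
        left += count_in(pts, 0.0, 0.5)
        right += count_in(pts, 0.5, 1.0)
    return left / trials, right / trials

if __name__ == "__main__":
    print(estimate_means())   # compare with the mean measures 2.5 and 7.5
```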
In particular, for an infinitesimal $B$, where the integral is well approximated by the volume element, the mean measure is locally $\lambda(x)\,dV(x)$. The functions $\Lambda(x)$ and $\lambda(x)$ parametrize the mean value of the Poisson distribution as a function of $x$; effectively, they describe the spatial profile of the Poisson distributions on the carrier space. Not every $\Lambda(x)$ ($\lambda(x)$) can be a mean (intensity) measure. As shown in [40], for a Poisson process to exist the mean measure needs to be nonatomic, which means that $\lambda(x)$ should not have any delta-function support. Furthermore, the mean measure $\Lambda(dx)$ should satisfy a very mild finiteness condition. By the restriction theorem [40], we can always formally consider a discretization of the carrier space, $M = \bigcup_{n=1}^{\infty} dV(x_n)$, and further decompose the mean measure $\Lambda$ into a sum of $\Lambda_n$, each of which is supported only on $dV(x_n)$, which means $\Lambda_n(A) = \Lambda_n(A \cap dV(x_n))$ for all $A \subset M$. Therefore, as long as $\Lambda_n(M) = \Lambda(dV(x_n)) < \infty$, there is a Poisson process with the given mean measure $\Lambda$, or equivalently the intensity $\lambda(x)$, even if $\Lambda(M) \to \infty$.

Now we come back to our model. The source being a Poisson process means that the source, accumulated on a given open set of the carrier space, is identified with the Poisson random variable, (2.30). In other words, we identify the counting measure on a volume element in our model as in (2.31). The interaction part of the action can then be rewritten in terms of this counting measure, and in this language the average over the random source is nothing but the expectation value of the exponential, (2.33). Given the identification (2.31), this is simply the Laplace functional of the Poisson process. For a general Poisson process with counting measure $N(dx)$ and mean measure $\Lambda(x)$, the Laplace functional of a test function $f(x)$ is known in closed form; for completeness, we provide some details in appendix C.
Using this result, we can evaluate the quantity we would like to compute in our model. As in the previous, more physical derivation, there is a sign flip of the Liouville potential term. In this approach the effect of inserting a $(-1)^n$ term can be achieved equivalently by considering a slightly modified point process, where the distribution is the Poisson distribution with an extra alternating factor. Although the extra sign could obscure the classical probability interpretation of the $P$ function, it is perfectly compatible with the definition of a point process, in particular with the independence among different spatial regions. We can therefore simply regard it as a different measure defining a new point process, with which we can compute the expectation value. The Laplace functional that is crucial in the above definition is modified accordingly. The rest of the computation is identical to that in the previous section, and we again arrive at the effective Lagrangian (2.21).
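The Laplace functional of an ordinary (untwisted) Poisson process can be checked by Monte Carlo. The sketch below assumes the standard convention $E\big[e^{-\sum_{p\in\Pi} f(p)}\big] = \exp\!\big(-\int (1-e^{-f(x)})\,\lambda(x)\,dx\big)$, which may differ in sign conventions from the expression used in the text, and it uses a homogeneous intensity $\lambda = 3$ on $[0,1]$ with the test function $f(x) = x$. These choices and the function name are assumptions for illustration. For them the right-hand side is $e^{-\lambda/e}$.

```python
import math
import random

def laplace_functional_mc(f, lam, trials, rng):
    """Monte Carlo estimate of E[exp(-sum_p f(p))] for a homogeneous
    Poisson process of rate `lam` on the interval [0, 1]."""
    total = 0.0
    for _ in range(trials):
        s = 0.0
        x = rng.expovariate(lam)         # first point of the process
        while x <= 1.0:
            s += f(x)
            x += rng.expovariate(lam)    # exponential gap to the next point
        total += math.exp(-s)
    return total / trials

def laplace_functional_exact(lam):
    """exp(-lam * int_0^1 (1 - exp(-x)) dx) = exp(-lam / e) for f(x) = x."""
    return math.exp(-lam / math.e)

if __name__ == "__main__":
    rng = random.Random(0)
    print(laplace_functional_mc(lambda x: x, 3.0, 40000, rng),
          laplace_functional_exact(3.0))
```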

Quenched vs annealed: the "wormhole" contribution
We can compute the partition function of the averaged theory, as well as the average of the partition function of the individual theories. The difference between the two should be related to the contributions from "wormholes" connecting different realizations [8,20,24,41].
To be precise, we consider $n$ replicas with either quenched or annealed random variables among them. The annealed partition function of the averaged theory with the Lagrangian (2.21) is simply the $n$-th power of the averaged partition function, (2.40), where the random variables fluctuate in each replica. The quenched partition function is the average of the partition functions on the $n$ replicas, where the random variables are not averaged over in each replica and are only averaged over for the $n$ replicas as a whole; the result is (2.43). It is clear that the two results (2.40) and (2.43) are different, which indicates that in the gravitational dual of this model the wormhole solutions connecting the different boundaries should give significant contributions to the gravitational path integral.
To better illustrate the difference between (2.40) and (2.43), we consider a special case where $\lambda(x) = \lambda \gg 1$, so that the path integrals can be approximated by the contributions from the saddle points. In this limit, (2.40) is dominated by the saddle point at $\phi_j(x) = 0$ for any $j$. For (2.43), on the other hand, we get a divergent result. From this it is clear that the quenched and the annealed partition functions are significantly different, confirming the contribution from wormhole-type topologies. Moreover, there is a rather simple explanation of the divergence in the "quenched" partition function $Z_n$: it is due to the appearance of $n-1$ zero modes in the theory, and from the potential gravity-dual interpretation they should correspond to $n-1$ free moduli parameters characterizing different topologies connecting $k = 2, 3, \ldots, n$ replicated boundaries. Further notice that since we are considering the partition function, instead of extensive quantities such as the free energy or the entanglement entropy, the second replica discussed in detail in [41] is not necessary in our discussion.

Poisson random average from tracing over microstates
In spite of the recent progress demonstrating the success and power of ensemble averaging of theories, a general subtlety of considering an ensemble average of theories is its tension with the traditional point of view of quantum theories. Quantization is usually carried out for a given theory with a single fixed action, which could be an obstruction to further understanding ensemble-averaged theories and, in particular, their quantum counterparts. A way out is to consider the ensemble average of theories as an effective description of the low-energy limit of (a subsystem of) a microscopic theory. In this section, we turn this idea into an explicit connection that reformulates the above average over the Poisson random potentials as a trace over a large number of microscopic degrees of freedom in a single refined model.

The microscopic setting
We consider the following microscopic model. The model is defined on a spatial lattice, on each site of which resides a $d$-level spin system. The lattice points are labelled by a "position" vector $x$. The full system thus has a total Hilbert space that is the tensor product of the Hilbert spaces on the lattice points. We denote the Hilbert space of the spin system at position $x$ by $H_x$ and label its state vectors as $|i\rangle$. For simplicity, we take all the Hilbert spaces to be identical. One simple example is a theory of $N$ pairs of free complex fermions $\psi_i, \bar\psi_i$, $i = 1, \ldots, N$, on each site, so that the dimension of the Hilbert space is $d_x = 2^N$. We can choose the ground state $|0\rangle$ to be annihilated by $\psi_i$, so that the Hilbert space is spanned by $\bar\psi_{i_1} \cdots \bar\psi_{i_k} |0\rangle$. The simplest example is $N = 1$, where the Hilbert space is that of a single quantum bit, of dimension 2, whose basis states we can conveniently label as $|0\rangle$ and $|1\rangle$. We define a number operator $N_x$ at each site; in the $N = 1$ example it has eigenvalue 0 on $|0\rangle$ and 1 on $|1\rangle$. We start with the system completely free, with neither on-site nor inter-site interactions.

Tracing over states
Now we can turn on a source $\phi_x$ conjugate to the number operator on each site $x$; the single-site Hamiltonian is then given by (3.3). At this point we do not add any kinetic term for $\phi_x$, so it is just a classical chemical potential. We will later consider the continuum limit, where a kinetic term could emerge. The theory describing this web of fermionic theories is defined by the corresponding total Hamiltonian. Notice that in determining the dynamics of the system, we also need to provide information about the quantum state of the spin. Usually one considers the source to be classical and not to change much. In the following, however, we will make this source a dynamical field. The first step is to add conjugate momentum terms for the $\phi_x$ on each site, with $H_x$ defined in (3.3).
We can in fact allow other terms involving only the $\phi_x$ fields. Putting everything together, we consider a system described by the Hamiltonian (3.6). To proceed further, we allow the fermions to back-react on $\phi_x$ and try to find an effective theory of $\phi_x$. This can be done by tracing over the microscopic spin fields to get an effective action for $\phi_x$. In the Hamiltonian formalism, the effective description of $\phi_x$ is governed by the effective Hamiltonian $H_{\rm eff}$ defined in (3.7). In addition to (3.7), there is another, seemingly more natural, definition of $H_{\rm eff}$, (3.8), which is the analogue of the partition function of the fermionic sector. It is, however, not precisely the partition function, since the theory also couples to the $\phi$ field.
In addition, what we want is not the "partition function" of the fermionic system; instead we want to integrate out the fermionic fields in a basis-independent manner. For this reason (3.7) is a better definition. To see this, consider a general basis $|i\rangle$ of the Hilbert space, in which an operator takes the form $T^i{}_j$. Under a general change of basis $|i\rangle \to |i'\rangle = A^j{}_i\, |j\rangle$, which might include changes that mix bosonic with fermionic components, the operator transforms with an extra sign factor $(-1)^{|k|}$, where $(-1)^{|k|}$ is the action of $(-1)^F$ on the outgoing state and $|k|$ labels the parity under the fermion number operator. The extra sign factor comes from moving $A^{-1}$ through the $(-1)^F$ in the trace. It is clear that equation (3.8) does not satisfy this condition, which justifies the definition (3.7).

We prepare the system in a state such that the fermions on different sites are not entangled with each other; the density matrix is therefore a tensor product of the density matrices at each site. On each site we prepare the system in a mixed state with the density matrix (3.10). Next we would like to consider a "continuum limit", or large-$n$ limit, where $n$ is the total number of lattice points per unit volume. Here we have implicitly embedded the lattice into an ambient space $M = \mathbb{R}^d$, and the volume refers to the volume measured in the ambient space. This ambient space plays no role in the following discussion except for setting up a scale. We consider the limit in which the total number of excitations is fixed and finite. This imposes a very non-trivial constraint on the probability function in the mixed-state density matrix. For simplicity we put the sites of the fermionic systems on a square lattice with lattice parameter (i.e. the interval between neighbouring lattice points in each direction) $a$. The above continuum limit can then be reached by taking $a = 1/m$ with $m \to \infty$.
It is clear that this grid of points is in bijection with the grid of rational points in this limit. It is also clear that any open set on the background $M = \mathbb{R}^d$ contains a dense subset of points in the above $m \to \infty$ limit. Furthermore, the density of grid points (per unit volume) is $n = m^d$. The above continuum limit is therefore taken as a scaling limit, where we have indicated by $dx$ that $\Lambda(dx)$ as a function of $x$ is also closely related to the set $dV(x)$. We can thus factor out the $dV(x)$ dependence on both sides. An illustration of this limit is shown in figure 2.
Consider the fermionic systems on the sites inside the element $dV(x)$. We assume $p(x)$ to be smooth enough so that in the small region $dV(x)$ it is approximately constant. The fermionic factor in the density matrix that comes from the region $dV(x)$ is given in (3.14). In the particle-number basis in $dV(x)$ it takes a simple form, where, as above, $n\,dV(x)$ is the total number of grid points in the volume element and $n_x$ is the total number of sites in the volume element $dV(x)$ that are in the $|1\rangle$ state; equivalently, $n_x$ is the eigenvalue of the operator $\sum_{x \in dV(x)} N_x$. (In principle there is still a $dV(x)$ dependence in $\lambda(x)$ due to the different choices of the representative point in $dx$. As we have assumed $p(x)$ to be smooth enough, this dependence drops out. In the following, we will always assume this and drop the $dx$ label in $\lambda(x)$.) From this we read off the probability of having $n_x = k$ fermionic excitations in this volume. In the limit (3.12), this probability becomes a Poisson distribution with parameter $\Lambda(x)$. Next we would like to trace over the Hilbert space of the spin degrees of freedom. For the reasons discussed above, we consider a trace of the form (3.7). Because $e^{i\pi F}|n\rangle = e^{i\pi N}|n\rangle$ and the bosonic fields $\phi_x$ do not act on the fermionic components of the state (so they should be considered as functions in this computation), $H_{x,1}$ and $F_x$ are diagonal on the states described by the density matrix. This, together with (3.17), (3.18) and (3.22), reduces the trace to a sum over occupation numbers, where in the last step we take the continuum limit and use (3.12) and the identity $\sum_{n=0}^{\infty} \frac{\lambda^n}{n!}\, e^{-\lambda}\, e^{un+c} = e^{\lambda(e^{u}-1)+c}$ (3.28). In addition, the discretization is fine enough so that $\phi_x$ is approximately constant in each element $dV(x)$.
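The passage from the occupation statistics of independent sites to a Poisson distribution is the familiar binomial-to-Poisson limit. The sketch below assumes that each of the $n$ sites in a volume element is excited independently with probability $p = \lambda/n$, so that the expected number of excitations stays fixed as $n \to \infty$; this parametrization of the scaling limit and the function names are assumptions for illustration, since the precise form of (3.12) is not reproduced here. The largest pointwise difference between the two distributions shrinks as $n$ grows.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of k excited sites among n independent sites."""
    if k > n:
        return 0.0
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Poisson probability of k points with parameter lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def max_gap(n, lam, kmax=20):
    """Largest pointwise difference between Binomial(n, lam / n) and
    Poisson(lam) over k = 0..kmax."""
    return max(abs(binomial_pmf(k, n, lam / n) - poisson_pmf(k, lam))
               for k in range(kmax + 1))

if __name__ == "__main__":
    for n in (10, 100, 10000):
        print(n, max_gap(n, 2.0))
```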
In addition, we notice that the above derivation did not specify any details about the decomposition of $M$ into the countable union of open sets $dV(x)$. First, it is straightforward to show the existence of such decompositions: since the lattice points are countable and each open set contains a large number of lattice points, we can use any lattice point in a given open set $dV(x)$ to label it. Since all the open sets are disjoint, there is no ambiguity, such as repeated labelling, in this process. Second, the fact that the above derivation does not depend on the details of the decomposition simply means that the result is the same for different choices of the decomposition.
What we have done so far is to consider a special scaling limit (3.13) of a lattice system, in which a Poisson process description of the system is available. To see this clearly, we start with (3.24), where we trace over the fermions. In the limit (3.12) this trace can be cast into the form (3.25), where the numbers of excited lattice points in the subsets $dV(x)$ are mutually independent, since there are no fermionic hopping terms between different sites. Furthermore, the probability of $n_x$ in the mixed state represented by the density matrix (3.10) becomes, in the limit (3.13), a Poisson distribution with Poisson parameter $\Lambda(x)$. Together these mean that the limit can be described by a Poisson process with $n_x$ being the counting measure (2.23), and with the quantities $n\,p(x)\,dV(x)$ and $n\,p(x)$ in the special limit (3.13) being the mean measure and the intensity of the Poisson process. Therefore, as discussed in sections 2.1 and 2.2, this theory, and in particular equation (3.24), can be considered as an ensemble of theories with a source $N_x$ that is related to a discrete Poisson distribution, of the same type as the theory defined by (2.1) and (2.7). On the other hand, the theory clearly has another, microscopic, description as explained in this section. This sets up an equivalence with the ensemble-averaged Poisson random theory discussed in sections 2.1 and 2.2, and gives an explicit example in which an ensemble average of theories is equivalent to an average over an ensemble of states in a single (microscopic) theory. With this understanding, we do not actually have to carry out the computation (3.24)-(3.27); the result is nothing but the Laplace functional of a Poisson process whose mean measure is identified with the $\Lambda_{dx}(x)$ function in the continuum limit.
Therefore we get an effective potential for the bosonic source function $\phi$, which has the form of a (generalized) Liouville potential. Notice that the Hamiltonian is not exactly of the Liouville form, since now $\mu$ is a function rather than a constant. This is similar to the results in the previous sections 2.1 and 2.2. Again we defer a more detailed discussion of the position-dependent case to section 5.3.

The low energy limit
Next we focus on the low-energy modes, which are expected to have a canonical kinetic term in continuous spacetime and to be described exactly by a Liouville theory. This amounts to going to frequency space and extracting the effective action near zero momentum.
In the following, we consider the simplest case with 1 spatial dimension. Then the position label x in φ x is equivalently labelled by the order of the φ x on the chain, namely x = ja where j ∈ Z and a is just a scale of the grid interval. Equivalently, we can use this j index to label the different fields.
Our main focus is on the quadratic terms (3.32) in (3.6), where $t_{ij} = t_{ji}$. For simplicity we consider a homogeneous chain, so that the only non-vanishing hopping coupling is $t_{j,j+1} = t$ (3.34). It is well known that the change of variables $\phi_k \to \tilde\phi_k = \frac{1}{\sqrt{N+1}} \sum_j \sin\!\left(\frac{\pi (j+N+1)\, k}{2(N+1)}\right) \phi_j$, $k = 1, \ldots, 2N+1$ (3.35), diagonalizes the Hamiltonian (3.32). Furthermore, it is instructive to define a spatial momentum $p_k$, (3.36). In terms of this momentum the new field can be written as in (3.37), where $x_0 = -(N+1)a$. From this expression it is evident that $p_k$ has the meaning of a momentum, since the change of variables (3.37) is in fact a discrete Fourier transform with the momentum space represented by $p_k$.
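That sine modes of this type diagonalize a homogeneous nearest-neighbour coupling can be verified directly. The sketch below assumes a chain of $2N+1$ sites labelled $j = -N, \ldots, N$ with open ends, a coupling matrix with unit entries between neighbouring sites, and the normalization $1/\sqrt{N+1}$ for the modes; these are assumptions for illustration, since the quadratic terms (3.32) are not reproduced here. Under these assumptions the $k$-th mode is an eigenvector with eigenvalue $2\cos\!\big(\pi k / (2(N+1))\big)$, and the modes are orthonormal.

```python
import math

def sine_mode(k, N):
    """Components on sites j = -N..N of the k-th sine mode, k = 1..2N+1."""
    norm = 1.0 / math.sqrt(N + 1)
    return [norm * math.sin(math.pi * (j + N + 1) * k / (2 * (N + 1)))
            for j in range(-N, N + 1)]

def hop(vec):
    """Apply the nearest-neighbour matrix A[j][j+1] = A[j+1][j] = 1 (open ends)."""
    size = len(vec)
    return [(vec[i - 1] if i > 0 else 0.0) + (vec[i + 1] if i < size - 1 else 0.0)
            for i in range(size)]

def max_residual(N):
    """Largest violation of A v_k = 2 cos(pi k / (2 (N + 1))) v_k over all modes."""
    worst = 0.0
    for k in range(1, 2 * N + 2):
        mode = sine_mode(k, N)
        eig = 2.0 * math.cos(math.pi * k / (2 * (N + 1)))
        worst = max(worst, max(abs(h - eig * m) for h, m in zip(hop(mode), mode)))
    return worst

def overlap(k, l, N):
    """Inner product of the k-th and l-th modes."""
    return sum(a * b for a, b in zip(sine_mode(k, N), sine_mode(l, N)))

if __name__ == "__main__":
    print(max_residual(4), overlap(3, 3, 4), overlap(2, 5, 4))
```

For small $k$ the eigenvalue $2\cos\theta_k$ with $\theta_k = \pi k/(2(N+1))$ behaves as $2 - \theta_k^2$, which is the origin of a quadratic dependence on the mode label at low momentum.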

With this new variable the diagonalized Hamiltonian takes the form (3.38). In the continuum limit $a \to 0$, $Na \to \infty$, so that $p_k \to 0$, the dispersion relation simplifies. In particular, if we choose the hopping coupling as in (3.40), with $c$ being a constant, we recover the desired dispersion relation. Therefore, to leading order and with the simplifications (3.34) and (3.40), we recover the spatial momentum term of a relativistic particle in a flat background. Explicitly, the Hamiltonian is obtained using (3.37), where we regard $x_j$ formally as the discrete values of a continuous position coordinate with respect to which the $\partial_x$ operator is defined; $\partial_x$ should be considered as the continuum limit of the difference operator. Therefore, together with (3.3), (3.5), (3.30), (3.31), (3.32) and (3.47), we get the action for the low-energy continuum limit of the theory obtained from tracing over the microscopic fermionic degrees of freedom, where we have taken the $n \to \infty$ limit of (3.47). We find that this is the Hamiltonian of the Liouville theory, similar to the one obtained from averaging over Poisson-distributed random sources. Further notice that in this derivation we have tuned the parameters in the microscopic model so that in the continuum limit the scalar field has a classical relativistic kinetic term on flat Euclidean spacetime. We could in fact consider more general parameters in the microscopic model, which would lead to a more general kinetic term.

Sinh-Gordon type models
In the previous example, the random source is drawn from a Poisson process, which has the property that only non-negative values are supported in the distribution. One may ask whether this discussion is specific to the Poisson distribution or can be made more general. In addition, one would naturally ask what happens for a distribution in which the random variable can take both positive and negative (discrete) values, which is more similar to the Gaussian distribution. In the following we consider such a model from both the random ensemble-average point of view and the microscopic point of view. We will show that the average of this model leads to Sinh-Gordon type theories.

The random ensemble point of view
First, at the level of a random average over different theories, we can consider a different probability distribution of the source. For example, we can consider a similar model of where I x is the modified Bessel function of the first kind and µ 1 and µ 2 are parameters characterizing this random distribution. The random averaging can be done following the above Liouville discussion, either in the more physical way discussed in section 2.1 or in the more mathematical way of section 2.2. In the latter approach, the random sources can still be regarded as coming from a point process; the only difference is that now the probability distribution of N (B) for any open set B on M is a Skellam distribution. Following a similar discussion, we arrive at the solution with the following effective interaction Hamiltonian where µ 1 (x) and µ 2 (x) are two functions parametrizing the point process with the Skellam distribution measure on different open sets.
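As a numerical cross-check of the Skellam average, the following Python sketch sums e^{-φk} against the Skellam probability mass function and compares the result with the closed form exp[µ1(e^{-φ} − 1) + µ2(e^{φ} − 1)], which reduces to exp[2µ(cosh φ − 1)] when µ1 = µ2 = µ. It is only an illustration under simplifying assumptions: the source is taken to enter through a single weight e^{-φk} with all couplings set to one, and the function names are ours rather than the paper's.

```python
import math

def bessel_i(k, x, terms=60):
    """Modified Bessel function of the first kind I_k(x), integer k, from its power series."""
    k = abs(k)                              # I_{-k}(x) = I_k(x) for integer k
    half = x / 2.0
    term = half ** k / math.factorial(k)    # m = 0 term of the series
    total = 0.0
    for m in range(terms):
        total += term
        term *= half * half / ((m + 1) * (m + k + 1))
    return total

def skellam_pmf(k, mu1, mu2):
    """P(k; mu1, mu2) = exp(-(mu1+mu2)) (mu1/mu2)^(k/2) I_k(2 sqrt(mu1 mu2))."""
    return (math.exp(-(mu1 + mu2)) * (mu1 / mu2) ** (k / 2)
            * bessel_i(k, 2.0 * math.sqrt(mu1 * mu2)))

def skellam_average(phi, mu1, mu2, kmax=40):
    """E[exp(-phi k)] by direct summation over k (assumed form of the source weight)."""
    return sum(skellam_pmf(k, mu1, mu2) * math.exp(-phi * k)
               for k in range(-kmax, kmax + 1))

def closed_form(phi, mu1, mu2):
    """exp[mu1 (e^{-phi} - 1) + mu2 (e^{phi} - 1)]."""
    return math.exp(mu1 * (math.exp(-phi) - 1.0) + mu2 * (math.exp(phi) - 1.0))

print(skellam_average(0.4, 1.3, 0.7), closed_form(0.4, 1.3, 0.7))
```

For equal parameters the exponent is 2µ(cosh φ − 1), which is the hyperbolic structure behind the name "Sinh-Gordon type".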

The microscopic point of view
As in the Poisson process case above, there is again a microscopic setting which realizes the above theory as an average over an ensemble of microstates in a single theory. This provides another example showing that averaging over different theories can be understood as averaging over an ensemble of states within one theory in an appropriate limit. Explicitly, we again consider a microscopic model with a grid of sites on which some fermionic quantum mechanical modes live. We consider the simplest model, with two complex fermions ψ 1 x and ψ 2 x on each site labelled by x. The interaction Hamiltonian of the collection of the fermionic systems reads Notice that there is a relative minus sign between the two terms. This reflects the fact that we assign opposite counting charges to the two fermions under the particle number operator, mimicking the electron-hole pair in more familiar systems. One can again consider the microstates to be in an ensemble represented by a factorized density matrix where Here we emphasize again that this is only a special mixed state that we choose the microscopic fermions to be in. The system could very well be in a different mixed state, and in that case the effective theory of φ could be very different. The Hamiltonian is closely related to the "net" number operator Next we would like to consider the "continuum limit" in the same sense as in the previous Liouville discussion in section 3, n → ∞ , (4.10) where n is the total number of lattice points per unit volume. Concretely, we again assume p i (x) to be smooth enough so that in the small region dV (x) it is approximately constant, so we consider the limit (4.11) and similarly Once again, we can consider Λ i,dV (x) as an induced measure on the carrier space Then the probability of the total net excitation in this volume element dV (x) being m dV = k with k > 0 is (4.14) which can be counted simply, using the fact that the systems on different sites are independent.
We have only computed the k > 0 case so far. For k < 0, we can rewrite the sum in terms of n 1 = n 2 + k, and the result is the same as P (m dV = −k).
In the limit (4.11), the above probability becomes where I k (x) is the modified Bessel function of the first kind. For k < 0 we follow the same procedure, and the result can be put into a uniform expression for all values of k Next we would like to trace over the Hilbert space of the Fermi system, which is equivalent to integrating out the spin system background. We again consider the following trace which follows from a derivation similar to that in the previous Liouville theory case. For simplicity we drop the terms that depend only on the bosonic field. As shown in the previous case, those terms do not affect the evaluation of the trace and can be put back at the end. The evaluation of this trace reads where we have used $e^{i\pi F}|n_1, n_2\rangle = e^{i\pi(n_1+n_2)}|n_1, n_2\rangle = e^{i\pi|n_1-n_2|}|n_1, n_2\rangle$, and we have also adopted the identification Λ i,dx (x) = µ i (x)dx as an induced measure to rewrite the weighted sum in the exponential as an integral over M with a dynamical measure. We thus find that the resulting Hamiltonian has a Sinh-Gordon type potential. We can follow the same treatment as in section 3 to show the emergence of a relativistic kinetic term and obtain the full Hamiltonian of the Sinh-Gordon theory.
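The counting argument can be illustrated numerically. The sketch below assumes that a volume element contains n sites and that on each site the two fermion species are excited independently with probabilities p_1 = µ_1/n and p_2 = µ_2/n (our reading of the factorized density matrix; the function names are illustrative). It computes the exact law of the net excitation N_1 − N_2 and compares it with the law of the difference of two independent Poisson variables with means µ_1 and µ_2, which is the Skellam distribution.

```python
import math

def log_binom_pmf(n, p, j):
    """log of C(n, j) p^j (1-p)^(n-j), via lgamma to avoid overflow at large n."""
    return (math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
            + j * math.log(p) + (n - j) * math.log1p(-p))

def net_excitation_pmf(k, n, p1, p2):
    """P(N1 - N2 = k) for independent N_i ~ Binomial(n, p_i), summed over n2 with n1 = n2 + k."""
    lo, hi = max(0, -k), min(n, n - k)
    return sum(math.exp(log_binom_pmf(n, p1, n2 + k) + log_binom_pmf(n, p2, n2))
               for n2 in range(lo, hi + 1))

def poisson_pmf(x, mu):
    return math.exp(x * math.log(mu) - mu - math.lgamma(x + 1))

def skellam_pmf(k, mu1, mu2, cutoff=100):
    """Skellam law, written as the law of the difference of two independent Poisson variables."""
    return sum(poisson_pmf(n2 + k, mu1) * poisson_pmf(n2, mu2)
               for n2 in range(max(0, -k), cutoff))

def max_error(n, mu1, mu2, ks=range(-4, 5)):
    """Largest pointwise gap between the finite-n law and the Skellam law."""
    return max(abs(net_excitation_pmf(k, n, mu1 / n, mu2 / n) - skellam_pmf(k, mu1, mu2))
               for k in ks)

for n in (50, 500, 4000):
    print(n, max_error(n, 1.5, 0.8))
```

The printed gap shrinks as n grows with n p_i held fixed, which is the sense in which the finite-lattice counting approaches the Bessel-function expression.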

Discussion
In this section we discuss some interesting questions and extensions related to the material in the above main text.

Relation to Gaussian randomness
Focusing on discrete distributions is a crucial difference between this work and the previous literature on ensemble averages of continuous Gaussian type random variables. It is often the case that a further limit of the discrete distribution, for example the Poisson distribution that we mainly discussed here, gives a Gaussian distribution, which is guaranteed by the central limit theorem as long as the events that the distribution describes are mutually independent. Therefore we can also regard the discussion in this paper as a first attempt towards refining the results in previous Gaussian random (holographic) theories, including SYK type models and the random matrix theories, in the sense of reverting the limit to go back from Gaussian to discrete distributions.
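A small numerical illustration of this limit, assuming nothing beyond the standard Poisson probability mass function: the sketch compares a Poisson distribution of mean λ with the Gaussian of the same mean and variance. Since both densities shrink like 1/√λ, the gap is rescaled by √λ to make the comparison meaningful. The helper names are ours.

```python
import math

def poisson_pmf(k, lam):
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def max_gap(lam):
    """Largest pointwise gap between pmf and matching Gaussian density, rescaled by sqrt(lam)."""
    width = 6.0 * math.sqrt(lam)
    lo, hi = max(0, int(lam - width)), int(lam + width)
    gap = max(abs(poisson_pmf(k, lam) - normal_pdf(k, lam, lam)) for k in range(lo, hi + 1))
    return gap * math.sqrt(lam)

for lam in (10.0, 100.0, 1000.0):
    print(lam, max_gap(lam))
```

The rescaled gap decreases as λ grows, consistent with the central limit theorem applied to a sum of many independent Poisson contributions.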

The quantum mechanics dual description
We can also integrate out the φ x fields and ask what the resulting microscopic model looks like. We expect the result, which is a 0+1d quantum mechanical model with x being discrete flavour labels of the fermions, to be a purely field theoretical description of the same system.
Explicitly, it is easier to work in the Lagrangian formalism, where the coupled theory is defined as Notice that in this section we again work in Euclidean signature. Treating x as a flavour index, we can integrate out the φ x to get where ∼ simply means up to an irrelevant constant and the matrix M is defined to be We observe that the quantum mechanical description of the same model is characterized by a complex fermion system with a "nearest neighbour" charge-charge coupling. We can further regard the M −1 factor as the coupling constants of the charge-charge interaction. This interaction can be analysed by a derivative expansion of the inverse of (5.3). In particular, the leading term in this expansion is simply the inverse of (5.3) with the derivative term turned off. The next-to-leading term is a 2-derivative term, which is irrelevant in the quantum mechanical system. As a result, for the purpose of obtaining a low-energy effective theory we can simply drop all the derivative corrections to get the following effective action where We can again consider the special values (3.33) and (3.34), where the inverse can be computed as Clearly g xy = g yx , so we can focus on the y ≥ x cases, where the sum over j reduces to half of the range Adding back the y < x terms, we simply get (5.14) For the special value (3.40), the parameter in the above result behaves as In this range, the coupling considered above is in general real and positive. However, when the inequality is saturated, the coupling diverges. The physical reason for this divergence is clear. When the inequality is saturated, m φ = 0 according to (3.40), so the effective mass of the infrared φ mode vanishes. When this happens, integrating out this massless mode is inconsistent and leads to divergences. What we have observed is just a realization of this well-known phenomenon, see e.g. [42,43], in our very simple model, where no gauge symmetry or supersymmetry is involved.
Further notice that the results in this section are a simplified version owing to the assumptions (3.33) and (3.34). Without these assumptions we expect a richer structure in this dual description.
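To make the structure of such an inverse concrete, the following sketch uses a stand-in for the matrix in (5.3), whose explicit form is not reproduced here. We assume a periodic chain of N flavours with a mass term m² and a nearest-neighbour hopping of strength κ, i.e. M_xy = (m² + 2κ) δ_xy − κ (δ_{x,y+1} + δ_{x,y−1}). Both this matrix and the names `coupling` and `apply_M` are hypothetical choices for illustration. For a translation-invariant M the inverse is a Fourier sum over a mode label j.

```python
import math

def coupling(d, N, m2, kappa=1.0):
    """g(d) = (M^{-1})_{x, x+d} for the assumed periodic nearest-neighbour matrix M."""
    return sum(math.cos(2.0 * math.pi * j * d / N)
               / (m2 + 2.0 * kappa * (1.0 - math.cos(2.0 * math.pi * j / N)))
               for j in range(N)) / N

def apply_M(g, N, m2, kappa=1.0):
    """One row of M @ g, using translation invariance; should be (1, 0, ..., 0)."""
    return [(m2 + 2.0 * kappa) * g[d] - kappa * (g[(d + 1) % N] + g[(d - 1) % N])
            for d in range(N)]

N, m2 = 16, 0.5
g = [coupling(d, N, m2) for d in range(N)]
print([round(v, 4) for v in g[:5]])        # positive, decaying with separation
print(coupling(0, N, 1e-6))                # grows like 1/(N m^2) as m^2 -> 0
```

In this toy version the couplings are symmetric and positive, and the j = 0 mode contributes 1/(N m²), so the coupling blows up as the mass is sent to zero. This mirrors, in a simplified setting, the divergence attributed above to integrating out a massless mode.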

Interpretation as emergent gravity?
From the results of the previous sections, we observe that Liouville theory and its generalizations can arise from averaging over an ensemble of theories with random sources. The latter type of theories can have a microscopic origin, where one traces over a set of underlying fermionic degrees of freedom in a specific state to generate an effective theory of the remaining degrees of freedom. In both scenarios a key point is that, by relating such averaging or tracing out of the microscopic degrees of freedom to the mathematical description of (Poisson) point processes, an emergent probabilistic measure on the carrier space naturally appears. One is then tempted to identify this emergent measure, which ultimately comes from tracing over the fermionic microstates, with a geometric metric which facilitates an emergent gravity theory whose action takes the form of the Liouville action.
This is not unreasonable, given that the 2D Liouville theory itself emerges from the response of conformal matter fields to a Weyl transformation of the background metric, see e.g. [44]. The only difference in our setting is that we do not start with conformal matter fields that couple to a gravitational background. Instead we start from a large number of microscopic fermionic modes that couple to a bosonic mode; then, in a double-scaled continuum limit, an effective Liouville action emerges for the bosonic modes.
Concretely, recall that the effective action of the Liouville theory can be obtained from the response of the matter field to a Weyl transformation of the metric $h_{\mu\nu} \to e^{\phi} h_{\mu\nu}$, with the effective action where µ is a cosmological constant and Q is a background charge related to properties of the matter fields, such as the central charge. In our analysis, the effective action of the free scalar theory with a Poisson random source has been shown in (2.21) to be Given the resemblance of the two actions, we would like to match them by identifying the dilaton Φ in (5.16) with the boson φ in (2.21) for d = 2. However, given the form of the two actions, it is simplest to fix a gauge of the gravitational theory with dynamical metric field $h_{\mu\nu}$. This can be seen from a simple comparison: fluctuations of the metric, and not only of its determinant, are dual to a change of the kinetic term that can be traced to a different hopping structure on the microscopic lattice. Thus choosing a different hopping structure on the lattice corresponds to a change of the metric. Fixing the metric to a certain gauge is then mapped to a choice of the hopping terms on the lattice. For example, for the choice of hoppings discussed above, the effective kinetic term is simply the Laplacian on the conformally flat spacetime. To make the comparison transparent, we fix the metric in (5.16) to the conformal gauge In this gauge, the above duality simplifies to The third line reduces to an identity as a result of our gauge fixing (5.18), while the first two equations imply a condition relating the canonical source J 0 (x) and λ(x), where the latter can be understood either as the "parameter" characterizing the point process or as characterizing the state that the effective action describes in the microscopic ensemble average description.
In particular, the relation is where in the last expression we have assumed that all the other parameters in the microscopic setting are independent of x. This is a very interesting relation, stating that J 0 can also be regarded as the source of log(λ(x)) when the latter is treated as a classical field. This means that for a given J 0 (x) the function λ(x) is determined once a boundary condition is provided. Microscopically, this relation has a clearer interpretation: for any given J 0 (x), the microscopic state in which the system has a gravitational description can be identified by requiring that the density matrix of the state (3.10) scales as in (3.12), with the limiting function λ(x) satisfying the condition (5.22). This agrees with the general philosophy of gauge/gravity duality, where the content of the duality depends crucially on which state the quantum field theory is in. It is true that the limit (3.12) and the condition (5.22) cannot isolate a single state in the microscopic theory. But this is what we would expect, since in our picture different states giving the same limit (3.12) are not distinguishable in the semi-classical gravity interpretation. It is likely that different states with the same limit (3.12) carry information about quantum corrections to the classical gravity/geometry. We plan to understand this proposal better in the future.
Further notice that the requirement of a gravitational interpretation only fixes how the shape of J 0 (x) depends on λ(x); the size of the J 0 (x) field can vary, and it is proportional to Q in the gravity interpretation. We know that in the derivation of the Liouville theory as a response of matter fields to a Weyl transformation of the gravity background they couple to, Q roughly counts the number of degrees of freedom of the matter fields, and is hence an indication of the strength of the matter coupling. In our derivation, the scalar field φ(x) can be interpreted as the dilaton in 2d gravity of the Liouville type, and the size of J 0 (x) can then be understood as the strength of the coupling to some other matter fields, which would be conformal matter fields in the gravity interpretation.
Of course this identification and interpretation are specific to the conformal gauge (5.18). It is definitely an interesting question to set up a concrete relation covariantly. One simple idea is that one should be able to impose conditions on the allowed probability measure Λ(dx) = λ(x)dV (x) so that there is an emergent diffeomorphism on λ(x) that can be identified with the diffeomorphism in the gravity description. This is beyond the scope of this paper and we will discuss it in more detail in future work. Nevertheless, in either case, namely whether our result is gauge dependent or whether in the future we consider a diffeomorphism invariant λ(x) measure as the effective background metric, our construction, and in particular the interpretation as emergent gravity, is not forbidden by the Weinberg-Witten theorem.
Finally, we note that connections between dilaton gravity models and Liouville-type theories have been discussed recently [1,14,46-49], as have relations between dilaton gravity and the sinh-Gordon and other types of theories [9,46,51]. There have also been some discussions of the connection between 1d Liouville theory and the Schwarzian theory. In [47,48], the author considered a rewriting of the Schwarzian theory as a Liouville theory with some regularization. In [49] an indirect connection between the two theories is made via the reduction from integrating in and out different Lagrange multiplier fields.

Concluding remarks
In this paper, we consider averaging over theories with discrete probability distributions, and show an explicit equivalence between this type of theory and the effective theory obtained from tracing over part of the microscopic degrees of freedom of a single theory in an appropriate limit. Further notice that in this discussion, especially from the point of view of the Poisson point process, the spatial dimension of the carrier space is not crucial. Therefore it is possible to compare the above approach with other known random deformations of free field theories in higher dimensions [53][54][55][56][57][58][59].
While finishing the paper, we noticed other discussions of a probabilistic construction of Liouville conformal field theory [60]. The approach there is very different from ours: there the Liouville field itself is considered as a random field under a specific probability measure. In our analysis in sections 2.1 and 2.2, we only take the source to be related to a Poisson process, while the field φ(x) itself is conventional. So at the moment we have not observed any direct connection between our analysis and the approach used in [60] and the references therein. It will be interesting to better understand possible connections in the future.
supported by funds from the Kavli Institute for Theoretical Science (KITS) and a startup funding from the University of Chinese Academy of Science (UCAS), this work is also supported in part by the U.S. Department of Energy grant DE-SC0019480 under the HEP-QIS QuantISED program and by funds from the University of California.

A Discrete distributions
For the convenience of later computation it is useful to first list some properties of the distributions discussed in the main text.
• The Poisson distribution has the probability mass function Its moment generating function (MGF) and characteristic function (CF) are where E is the mean value of the distribution assuming the random variable to be x.
• The Skellam distribution [64] has the probability mass function $P(x; \mu_1, \mu_2) = e^{-(\mu_1+\mu_2)} \left(\frac{\mu_1}{\mu_2}\right)^{x/2} I_x\!\left(2\sqrt{\mu_1 \mu_2}\right)$, where I k (x) is the modified Bessel function of the first kind. The MGF and CF are To understand better the meaning of the Skellam distribution we consider two independent Poisson distributions P (x 1 , µ 1 ) and P (x 2 , µ 2 ). Then the joint distribution is It is easier to consider directly the generating function For the current Poisson distribution, this gives Then the above joint distribution (A.8) is simply the z Consider the special case where the generating function becomes The coefficient of z x 1 in the above generating function leads to the Skellam distribution. To see this, we make use of the summation identity The derivation for the CF is similar; one only needs to substitute Φ → iΦ.
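The two statements in this appendix can be checked numerically with a short sketch. We assume the standard Poisson probability mass function P(x; µ) = µ^x e^{−µ}/x!, whose moment generating function is exp[µ(e^Φ − 1)], and we verify that summing the joint distribution of two independent Poisson variables over x_2 at fixed x_1 − x_2 = k reproduces the Bessel-function form of the Skellam distribution. The function names are illustrative.

```python
import math

def poisson_pmf(x, mu):
    """Standard Poisson pmf mu^x e^{-mu} / x!, evaluated in log space."""
    return math.exp(x * math.log(mu) - mu - math.lgamma(x + 1))

def poisson_mgf_sum(Phi, mu, cutoff=200):
    """E[exp(Phi x)] by direct summation; should equal exp(mu (e^Phi - 1))."""
    return sum(poisson_pmf(x, mu) * math.exp(Phi * x) for x in range(cutoff))

def difference_pmf(k, mu1, mu2, cutoff=200):
    """P(x1 - x2 = k): sum of the joint Poisson distribution over x2, with x1 = x2 + k."""
    return sum(poisson_pmf(x2 + k, mu1) * poisson_pmf(x2, mu2)
               for x2 in range(max(0, -k), cutoff))

def bessel_i(k, x, terms=60):
    """Modified Bessel function of the first kind I_k(x), integer k, from its power series."""
    k = abs(k)
    half = x / 2.0
    term = half ** k / math.factorial(k)
    total = 0.0
    for m in range(terms):
        total += term
        term *= half * half / ((m + 1) * (m + k + 1))
    return total

def skellam_pmf(k, mu1, mu2):
    """exp(-(mu1+mu2)) (mu1/mu2)^(k/2) I_k(2 sqrt(mu1 mu2))."""
    return (math.exp(-(mu1 + mu2)) * (mu1 / mu2) ** (k / 2)
            * bessel_i(k, 2.0 * math.sqrt(mu1 * mu2)))

for k in (-2, 0, 3):
    print(k, difference_pmf(k, 1.2, 0.5), skellam_pmf(k, 1.2, 0.5))
```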

B A meaningless averaging scheme
In Euclidean signature, we try to get an effective action from If we naively take the measure P (J 1 , λ 1 ) to be the Poisson distribution function of the random variable J 1 at each point x, the average over the random source is To understand how to perform this average, we first discretize the spacetime so that the above expression reduces to where in the second line we have used the sum over the Poisson distribution for any fixed x n (A.3); this is just the standard expression for the Laplace transform of the Poisson distribution with parameter λ(x n ). However, the result of this direct computation looks strange, since ∆ d x sits in the exponent of a second, nested exponential. This result is difficult to interpret; in particular, it is not clear how to take its continuum limit.
Comparing the derivation in this appendix with the results in section 2.1, we find the crucial difference between the naive random average and the Poisson process is that in the latter case the quantity that obeys the Poisson distribution is the measure on the carrier space, rather than the "rate of event" J 1 (x).
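The contrast between the two schemes can be seen in a small numerical sketch. It assumes, for illustration only, a one-dimensional carrier space [0, 1] cut into cells of size ∆, a source term of the form J(x_n) φ(x_n) ∆ in the naive scheme, and a cell count N ~ Poisson(λ∆) weighted by e^{−φ N} in the point-process scheme. The function names and the constant test profile are our own choices.

```python
import math

def poisson_laplace(s, lam, cutoff=200):
    """sum_J P(J; lam) exp(-s J), term by term; equals exp(lam (e^{-s} - 1))."""
    return sum(math.exp(J * math.log(lam) - lam - math.lgamma(J + 1) - s * J)
               for J in range(cutoff))

def naive_log_average(phi, lam, delta):
    """log prod_n sum_J P(J; lam) exp(-J phi_n delta): delta sits inside the inner exponential."""
    return sum(lam * (math.exp(-p * delta) - 1.0) for p in phi)

def process_log_average(phi, lam, delta):
    """log of the Laplace functional with N(cell) ~ Poisson(lam delta): delta multiplies the exponent."""
    return sum(lam * delta * (math.exp(-p) - 1.0) for p in phi)

lam = 2.0
for delta in (0.1, 0.01, 0.001):
    cells = round(1.0 / delta)
    phi = [1.0] * cells                      # constant test profile phi(x) = 1 on [0, 1]
    print(delta, naive_log_average(phi, lam, delta), process_log_average(phi, lam, delta))
```

For this constant profile the point-process expression is a Riemann sum of λ(e^{−φ} − 1) and does not depend on the cell size, whereas the naive expression changes as ∆ is refined because ∆ appears inside the nested exponential.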

C More about the Poisson process

C.1 Random points versus counting measures
The notion of a Poisson process has at least two interpretations. One is similar to an arrival process, like the times at which a bus comes to a bus stop, where it is really considered as a process or a sequence. The other puts more emphasis on a measure theoretical interpretation, where the (Poisson) point process is considered as a set of points on the carrier space, and the number of points in any given subset naturally leads to a counting measure on the subset. Notice that this counting measure has nothing to do with the original measure of the carrier space; rather it is a characteristic property of the point process only. In this language, the measure N (dx) in (2.31) is precisely a counting measure: it counts how many points of the (Poisson) point process appear in the subset volume dV (x) ⊂ X. For this special case it is furthermore a random counting measure, in the sense that the number N (dx) is a random function of the subset dV (x). With this interpretation, an integral over this counting measure is nothing but [65] where Π denotes the point process and x i are the points of the point process that fall in X. The LHS emphasizes that the point process N can be considered as a random counting measure, while the RHS emphasizes that the point process can be considered as a counting process whose elements are just a set of discrete points/events.
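Both interpretations can be put side by side in a short sketch. It assumes a homogeneous Poisson process on the interval [0, 1), generated in the arrival-process picture from independent exponential gaps, and a step function f that is constant on four equal cells. The integral of f against the counting measure, Σ_i f_i N(A_i), is then compared with the sum of f over the points of the process. All names are illustrative.

```python
import random

def sample_poisson_process(rate, length, rng):
    """Arrival-process picture: points on [0, length) built from exponential gaps."""
    points, x = [], rng.expovariate(rate)
    while x < length:
        points.append(x)
        x += rng.expovariate(rate)
    return points

def counting_measure(points, a, b):
    """N([a, b)): number of points of the process that fall in [a, b)."""
    return sum(1 for x in points if a <= x < b)

def integral_against_counting_measure(values, points):
    """Counting-measure picture: sum_i f_i N(A_i) with A_i = [i/K, (i+1)/K)."""
    K = len(values)
    return sum(values[i] * counting_measure(points, i / K, (i + 1) / K) for i in range(K))

def sum_over_points(values, points):
    """Random-points picture: sum of f(x_i) over the points x_i of the process."""
    K = len(values)
    return sum(values[int(K * x)] for x in points)

rng = random.Random(0)
points = sample_poisson_process(rate=50.0, length=1.0, rng=rng)
values = [0.5, -1.0, 2.0, 3.5]            # f on the four cells of [0, 1)
print(len(points), integral_against_counting_measure(values, points), sum_over_points(values, points))
```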

C.2 A derivation of the Laplace functional
One derivation follows from the property of sums over Poisson processes [40]. We can consider integrals of the type where the function f (x) takes only a finite number of values f (x) ≡ f k (x) = f 1 , . . . , f k . We can then decompose M into a union of A i , i.e. $M = \bigcup_{i=1}^{k} A_i$, such that Given this, we can compute