Tail probabilities for triangular arrays

Different discrete-time triangular arrays representing a noisy signal of players' activities can lead to the same limiting diffusion process yet lead to different limit equilibria. Whether the limit equilibria are equilibria of the limiting continuous time game depends on the limit properties of test statistics for whether a player has deviated. We provide an estimate of the tail probabilities along these arrays that allows us to determine the asymptotic behavior of the best test and thus of the best equilibrium.


Introduction
It is frequently difficult to determine the set of equilibrium payoffs in discrete-time repeated games with imperfect public monitoring when the discount factor is bounded away from one. In continuous time, Sannikov [2007] and Sannikov and Skrzypacz [2007] have obtained striking characterizations of the equilibrium set in games where the public signals are modeled as a diffusion process, with the players' actions altering the diffusion's drift but not its volatility. These continuous-time models are motivated as modeling the limit of very high frequency interactions, which raises the question of what sorts of high-frequency limits the models capture. This in turn depends on the relationship between the signal processes in discrete and continuous time. Fudenberg and Levine [2009] (hereafter referred to as FL) show by example that the same limiting diffusion process can arise as the limit of different discrete-time structures that have very different limit equilibria.
In characterizing the "cooperative" equilibria of a repeated game it is necessary to understand which "punishment schemes" are incentive compatible for players. This can be thought of as testing for whether a deviation has occurred combined with a punishment if the test is failed. Intuitively, as with the normal distribution, the tails of a diffusion process permit a very accurate test for the difference in means by using a cutoff for the signal, above which the test is considered to have "failed." However, since the worst possible punishment in a repeated game is bounded, what matters is not just the accuracy of the test but whether defections can be detected with sufficient probability. As we approach continuous time as the limit of shorter discrete intervals, the question becomes how rapidly the probability with which defections can be detected decreases relative to the size of available punishment. If the only way to create a sufficiently accurate test is to send the cutoffs very quickly to infinity, then punishment will occur too rarely to provide sufficient incentives for cooperation. In this case we can expect that there will only be static equilibria in the limit. Consequently a key question is whether it is possible to design a test that finds an appropriate balance between accuracy and frequency of punishment as the period length shrinks. For concreteness we will illustrate this idea in a simple principal-agent game instead of the repeated game studied in FL.
In many, if not most, cases of interest, the public signal is not literally continuously distributed; rather, the diffusion process arises as the limit of the aggregate of many small discrete events such as price changes. In this case we are interested not in the normal distribution per se, but rather in a distribution that approaches normality in the limit.
It might be hoped that a version of the central limit theorem could be used to examine the convergence properties of the test statistic. Unfortunately, as periods shrink the optimal cutoff increases in such a way that the probability of detection decreases (the cutoff normalized by the standard deviation increases), so the standard central limit theorem is not useful. Instead what is required is an estimate of the tail probabilities, that is, of the probabilities of very unlikely but informative signals.[2] The most closely related result in the literature is what Feller [1971] calls a "large deviations" theorem, although that term is now used for other things. Feller's result applies only to i.i.d. random variables, and not to triangular arrays; this note provides the additional uniformity assumptions needed to adapt the Feller proof to the case of triangular arrays and adapts the proof to show how these uniformity assumptions are used. The result reported here can then be used to show that the equilibria of discrete-time games whose signals are binomial arrays do indeed converge to the equilibria of the associated continuous-time game, as in FL's study of games with a long-run player against a myopic opponent. In the next section we sketch a simpler one-shot agency problem where the tail probability estimates can be used in a similar way.[3]

A Motivating Example
The information issues that arise in repeated-game settings arise in simplified form even in a principal-agent problem, as we now show. Suppose that there is a period of length τ. At the beginning of the period the agent may choose not to be employed by the principal, in which case he receives zero. If he chooses employment he must decide between working (W) and shirking (S). If he works he is paid an amount Wτ proportional to the length of time he works. If he shirks he gets a bonus of Gτ. At the end of the period, the principal observes a noisy signal y of the agent's lack of effort, and if this signal exceeds a threshold ȳ he imposes a fixed penalty P. Notice that P is not proportional to the length of the period; the idea is that the principal can impose a long-term punishment on the agent if he feels the agent has shirked even for a short period of time. For example, if the principal can fire the agent, then we would expect that P = W/r, which is the amount that the agent would have earned from a lifetime of employment with the principal.

[2] This issue is delicate because the likelihood ratio between two normal distributions with a common variance and different means becomes unbounded in the tail: this was originally exploited by Mirrlees [1974].

[3] Sadzik and Stacchetti [2012] study the limit of discrete-time agency problems when the discrete-time signals have a continuous density, as opposed to being the sum of discrete random variables. Their "hidden action" case corresponds to the example presented here.
The question we wish to address is whether, for particular distributions of y, it is possible to set the threshold ȳ so that the agent can be induced to work rather than shirk.
Notice that whether or not it is desirable to do this depends on the payoffs to the principal, which we do not specify.
Let p represent the probability that the punishment is received if the agent works and q the probability of punishment if the agent shirks. If it is to be optimal for the agent to work rather than shirk, the incentive constraint (q − p)P ≥ Gτ must hold; writing ρ(τ) ≡ (q − p)/τ, this is ρ(τ)P ≥ G. This is similar to (1) in FL. If it is to be optimal to choose employment, the participation constraint Wτ − pP ≥ 0 should also be satisfied; writing µ(τ) ≡ p/τ, this is W ≥ µ(τ)P. If in the limit as τ → 0 both of these are to hold for some fixed values of G, P, W, then it must be that lim ρ(τ) > 0 and lim µ(τ) < ∞. This is analogous to Corollary 2 in FL.
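The algebra of the two constraints can be sketched numerically. The snippet below uses purely illustrative parameter values, and writes ρ(τ) = (q − p)/τ and µ(τ) = p/τ for the per-unit-time quantities whose limits matter:

```python
# Illustrative check of the incentive and participation constraints,
# with hypothetical parameter values; rho(tau) = (q - p)/tau and
# mu(tau) = p/tau are the per-unit-time detection gain and
# false-punishment rate.

def constraints(p, q, tau, G, P, W):
    rho = (q - p) / tau               # detection gain per unit time
    mu = p / tau                      # false-punishment rate per unit time
    incentive = rho * P >= G          # (q - p) P >= G tau
    participation = W >= mu * P       # W tau >= p P
    return rho, mu, incentive, participation

rho, mu, ic, pc = constraints(p=0.005, q=0.02, tau=0.1, G=0.5, P=10.0, W=1.0)
print(rho, mu, ic, pc)                # both constraints hold at these values
```

Driving τ to zero while preserving both constraints for fixed G, P, W requires exactly that ρ(τ) stay bounded away from zero while µ(τ) stays bounded above.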
We suppose that the signal y is generated by a stochastic process S⁰ if the agent works and by a process S¹ if the agent shirks. The state of the appropriate process is observed at the terminal time τ, and we shall be interested in the case where τ is small. The simplest and quite standard specification is to assume that the S^d are diffusions with common volatility σ² and drift d = 0, 1 respectively, so that the signal is distributed N(dτ, σ²τ). Consider first the incentive constraint. It is easy to ensure that ρ remains bounded away from 0 as τ → 0: for example, holding the normalized cutoff z ≡ ȳ/(σ√τ) fixed does so, but then p also remains bounded away from zero, so that p/τ → ∞ and in the limit the participation constraint would be violated. Hence we must allow z → ∞ as τ → 0 to have p/τ bounded above. Thus the question becomes whether it is possible to keep p/τ bounded above while at the same time allowing z to grow sufficiently slowly that ρ(τ) remains bounded away from zero. The answer depends on the behavior of the normal distribution Φ in the upper tail where z is large, and using bounds for the normal distribution Fudenberg and Levine [2007] show that in fact it is impossible to do so. Hence the agent cannot be induced to work when the time period is very short.
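The tension can be seen numerically. In the sketch below (with σ = 1 and a hypothetical target rate p/τ = 1, both of which are illustrative choices rather than values from the text) the cutoff z(τ) is chosen so that the false-punishment rate p/τ stays constant; the resulting ρ(τ) = (q − p)/τ then shrinks as τ → 0:

```python
import math
from statistics import NormalDist

# Numerical sketch of the impossibility argument for exactly normal
# signals (sigma = 1, hypothetical target p/tau = 1): pinning down the
# false-punishment rate p/tau forces the cutoff z to infinity, and
# then rho(tau) = (q - p)/tau shrinks toward zero.

N = NormalDist()

def rho_when_p_over_tau_fixed(tau, c=1.0, sigma=1.0):
    z = N.inv_cdf(1 - c * tau)                    # cutoff chosen so p = c * tau
    p = 1 - N.cdf(z)                              # punish an innocent worker
    q = 1 - N.cdf(z - math.sqrt(tau) / sigma)     # punish a shirker
    return (q - p) / tau

for tau in (1e-1, 1e-2, 1e-3, 1e-4):
    print(tau, rho_when_p_over_tau_fixed(tau))
```

The printed values of ρ(τ) decrease as τ shrinks, illustrating why no choice of cutoff reconciles the two requirements in the normal case.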
The problem with this analysis in an economic setting is that economic signals are unlikely to exactly follow a diffusion process, and in many cases will not have a continuous density when examined at a sufficiently fine scale. For example, the observed signal might be aggregate sales, which is the sum of a number of discrete random variables representing individual sales opportunities. As such we might expect from the central limit theorem that to a good approximation it follows a diffusion, and thus that the probabilities of correctly detecting a deviation and of falsely suspecting one could both be computed using the normal distribution. However, as we saw, in order to reach conclusions about incentives it is necessary to know the probability of the signal exceeding the cutoff for very large values of the normalized cutoff z, and the central limit theorem does not help with this. Instead, to analyze a sequence of games where the signals are a sum that is approaching a diffusion limit, an extension of the central limit theorem to tail probabilities is needed.
To illustrate the usefulness of the tail probability bound, consider signals that are generated as sums of binomial random variables. Divide the period into intervals of length ∆, and take k ≡ τ/∆ to be the number of intervals of length ∆ that occur during a period of length τ, where we assume that τ is an integral multiple of ∆. In each interval the increment of the signal takes one of two values, with variance σ²∆ and mean either 0 or ∆ depending on the action taken. We take the signal y to be the sum of these binomial increments during the period. Hence y has variance σ²τ and mean either 0 or τ depending on whether the action taken is work or shirk.
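One binomial construction consistent with these moments (FL's exact specification may differ) takes each increment to be ±σ√∆, with the up-probability tilted by the drift. The sketch below verifies that the period sums then have mean dτ and variance approximately σ²τ:

```python
import math

# One binomial construction matching the stated moments (FL's exact
# specification may differ): each increment is +sigma*sqrt(Delta) with
# probability (1 + d*sqrt(Delta)/sigma)/2 and -sigma*sqrt(Delta)
# otherwise, giving mean d*Delta and variance sigma^2*Delta - (d*Delta)^2.

def increment_moments(d, sigma, Delta):
    up = sigma * math.sqrt(Delta)
    prob_up = 0.5 * (1 + d * math.sqrt(Delta) / sigma)
    mean = prob_up * up - (1 - prob_up) * up
    var = up ** 2 - mean ** 2
    return mean, var

def signal_moments(d, sigma, tau, Delta):
    k = round(tau / Delta)            # number of increments per period
    mean, var = increment_moments(d, sigma, Delta)
    return k * mean, k * var          # moments of the period sum

print(signal_moments(0, 1.0, 1.0, 1e-4))   # mean 0, variance sigma^2 * tau
print(signal_moments(1, 1.0, 1.0, 1e-4))   # mean tau, variance ~ sigma^2 * tau
```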
To apply the central limit theorem, we should assume that the number of observations per period k grows large even as τ → 0. The key issue is that for k large, while y is approximately normal it is not exactly normal, so the normal bounds used in Fudenberg and Levine [2007] do not apply directly. Moreover, Fudenberg and Levine [2007] use bounds in the upper tail of the normal, and convergence in the upper tail is not guaranteed by the central limit theorem. Hence we need a version of the central limit theorem that applies to the tail probabilities, one that controls how fast the cutoff may grow as the period length shrinks. The best theorem we know of is the "large deviations" result of Feller [1971, pp. 548-553], which gives conditions under which the c.d.f.s F_n of the normalized sums satisfy (1 − F_n(x_n))/(1 − Φ(x_n)) → 1 as the cutoff x_n → ∞. Feller's theorem is proven and applies only in the context of the standard central limit theorem, that is, for sums of i.i.d. random variables. In our setting we are dealing with a triangular array, so we must extend Feller's result to that case. The main part of the paper proves the relevant theorem (the "main theorem"), which gives four conditions that enable us to reach the same conclusion for triangular arrays.
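The tail-ratio phenomenon can be illustrated exactly in the simplest i.i.d. case: sums of n independent ±1 signs, with a cutoff x_n = n^(1/8) that grows slowly with n (the exponent is an illustrative choice, not the rate appearing in the theorem):

```python
import math
from statistics import NormalDist

# Exact illustration of the tail-ratio phenomenon for i.i.d. sums: for
# S_n a sum of n independent +-1 signs, compare the exact upper tail of
# S_n/sqrt(n) with the normal tail at a slowly growing cutoff
# x_n = n**(1/8) (an illustrative rate, not the one in the theorem).

Phi = NormalDist().cdf

def exact_tail(n, x):
    """P(S_n / sqrt(n) >= x), computed exactly from binomial weights."""
    jmin = math.ceil((n + x * math.sqrt(n)) / 2)   # S_n = 2j - n for j up-moves
    if jmin > n:
        return 0.0
    return sum(math.comb(n, j) for j in range(jmin, n + 1)) / 2 ** n

for n in (64, 256, 1024, 4096):
    x = n ** 0.125
    print(n, exact_tail(n, x) / (1 - Phi(x)))
```

The ratios stay close to 1 even though the tail probabilities themselves become tiny; it is this relative accuracy in the tail, not the absolute error controlled by the central limit theorem, that the argument requires.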
The first two conditions are technical conditions on the cumulant generating function that are easily shown to be satisfied in the binomial case; see FL Lemma A.2.
Thus it remains to verify the third and fourth conditions, which restrict how the normalized cutoff behaves as the number of summands grows. In our case, as we vary τ and k (and implicitly ∆) we will generally wish to alter the cutoff ȳ and its normalized version z ≡ ȳ/(σ√τ). Suppose first that the cutoff is asymptotically very large, in the sense that the third condition of the main theorem fails. Then it is shown in FL Lemma A.5 that the cutoff is sufficiently far out in the tail that there is inadequate punishment: that is, q/τ → 0 (and consequently, since q ≥ p, also p/τ → 0). Hence we may assume that the third condition of the main theorem holds. The fourth condition of the main theorem requires z → ∞; if not, then the punishment probability does not go to zero, and as noted above this results in a trivial equilibrium. Hence we may apply the main theorem, and since its conclusion is that the tail probabilities behave like those of the normal distribution, the normal bounds used in Fudenberg and Levine [2007] can be applied to conclude that all limit equilibria are trivial.

The Setup
As indicated, we extend an argument concerning i.i.d. random variables from Feller [1971, pp. 548-553] to the case of triangular arrays. We adopt Feller's notation to the maximum extent feasible. We suppose that we are given, for each n, a sequence of i.i.d. random variables; the collection of these sequences over n forms a triangular array.

Basic Facts
The following version of the central limit theorem is taken from Feller.

Berry-Esseen Theorem: If X_1, …, X_n are i.i.d. with mean zero, variance σ² and finite third absolute moment, and F_n is the c.d.f. of the normalized sum (X_1 + … + X_n)/(σ√n), then |F_n(x) − Φ(x)| ≤ 3E|X_1|³/(σ³√n) for all x.
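For i.i.d. ±1 signs we have σ = 1 and E|X|³ = 1, so the bound with Feller's constant reads sup_x |F_n(x) − Φ(x)| ≤ 3/√n. The sketch below computes the exact supremum over the lattice of the normalized sum and compares it with the bound:

```python
import math
from statistics import NormalDist

# Numerical check of the Berry-Esseen bound for sums of n i.i.d. +-1
# signs: here sigma = 1 and E|X|^3 = 1, so the bound (with Feller's
# constant 3) is sup_x |F_n(x) - Phi(x)| <= 3/sqrt(n).

Phi = NormalDist().cdf

def sup_discrepancy(n):
    total, cdf, worst = 2 ** n, 0, 0.0
    for j in range(n + 1):
        s = (2 * j - n) / math.sqrt(n)   # lattice point of S_n/sqrt(n)
        left = cdf / total               # F_n just below the jump
        cdf += math.comb(n, j)
        right = cdf / total              # F_n at the jump
        worst = max(worst, abs(left - Phi(s)), abs(right - Phi(s)))
    return worst

for n in (16, 64, 256):
    print(n, sup_discrepancy(n), 3 / math.sqrt(n))
```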
We also use some basic results about the standard normal distribution.

The "Associated" Distribution
Feller's proof replaces the normalized sum z_n and its c.d.f. F_n* with a different random variable. This "associated" random variable has the probability measure obtained by exponential tilting: it puts mass proportional to e^{sx} at x, with the moment generating function of F_n* as the normalizing constant. Notice that the associated distribution V_n* has a thicker right tail than F_n*. The idea is that by applying the Berry-Esseen theorem to V_n*, we can pull this back to the thinner-tailed F_n* to get a bound that will apply even for large values of x.
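The tilting construction can be made concrete on a small discrete example, here a sum of 20 independent ±1 signs with an arbitrary tilt parameter s = 0.3 (both illustrative choices): the tilted measure puts mass proportional to e^{sx} at each point x, which shifts the mean rightward and thickens the right tail.

```python
import math

# Exponential tilting on a small discrete example: a sum of 20
# independent +-1 signs, tilted with an arbitrary parameter s = 0.3.
# The tilted measure puts mass proportional to e^{s x} at each point x,
# shifting the mean rightward and thickening the right tail.

n, s = 20, 0.3
points = [2 * j - n for j in range(n + 1)]
probs = [math.comb(n, j) / 2 ** n for j in range(n + 1)]

weights = [p * math.exp(s * x) for x, p in zip(points, probs)]
mgf = sum(weights)                          # normalizer: E[exp(s * S_n)]
tilted = [w / mgf for w in weights]

mean = sum(x * p for x, p in zip(points, probs))
tilted_mean = sum(x * p for x, p in zip(points, tilted))
tail = sum(p for x, p in zip(points, probs) if x >= 10)
tilted_tail = sum(p for x, p in zip(points, tilted) if x >= 10)
print(mean, tilted_mean, tail, tilted_tail)
```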

Sketch of the Proof
We want to give a sufficient condition for the tail of F_n* to be asymptotically equivalent to the normal tail, that is, for (1 − F_n*(x_n))/(1 − Φ(x_n)) → 1. The idea is to introduce an intermediate quantity A_n and give sufficient conditions for F_n* to be close to A_n and for A_n to be close to the normal tail, the two together then giving the desired result. The first step will follow by applying the Berry-Esseen theorem to the thick-tailed V_n*. The second step shows that when we thicken the tail by multiplying by a carefully chosen exponential, we do not shift the distribution by too much.

Third step
Define the quantity A_n by replacing V_n* in the expression from step 1 with its normal approximation.