Poisson approximation

We overview results on the topic of Poisson approximation that are missing from existing surveys. Poisson approximation to the distribution of a sum of integer-valued random variables is treated as well. We do not restrict ourselves to a particular method, and cover the whole range of issues, including the general limit theorem, estimates of the accuracy of approximation, asymptotic expansions, etc. Related results on the accuracy of compound Poisson approximation are presented as well. We indicate a number of open problems and discuss directions of further research.


Introduction
Poisson approximation appears natural in situations where one deals with a large number of rare events. The topic has attracted a considerable body of research. It has important applications in insurance, extreme value theory, reliability theory, mathematical biology, etc. (cf. [7,12,39,48,61]). However, existing surveys are surprisingly sketchy, and miss not only a number of results obtained during the last three decades but even some classical results going back to the 1930s.
The paper aims to fill the gap. We present a comprehensive list of results on the topic of Poisson approximation, and formulate a number of open problems. Related results on the topic of compound Poisson approximation are presented as well.

Weak convergence to a Poisson law
We denote by Π(λ) a Poisson law with parameter λ.
The following Poisson limit theorem is due to Gnedenko [36] and Marcinkiewicz [54].
Let {X_{n,1}, ..., X_{n,k_n}}_{n≥1}, where {k_n} is a non-decreasing sequence of natural numbers, be a triangular array of independent random variables (r.v.s).
Theorem 1 [36,54] If {X_{n,k}} are infinitesimal r.v.s, then as n → ∞ if and only if for any ε ∈ (0; 1), as n → ∞,
The following corollary presents necessary and sufficient conditions for the weak convergence of a sum of independent and identically distributed (i.i.d.) integer-valued r.v.s to a Poisson random variable.
Let IN denote the set of natural numbers, and let Z_+ := IN ∪ {0}. Corollary 2 deals with the case of non-negative integer-valued random variables.

Dependent Bernoulli random variables
The topic of Poisson approximation to the distribution of a sum of dependent Bernoulli r.v.s has applications in extreme value theory, reliability theory, etc. (cf. [7,12,48,61]). Let {X_{n,1}, ..., X_{n,n}}_{n≥1} be a triangular array of 0-1 random variables such that the sequence X_{n,1}, ..., X_{n,n} is stationary for each n ∈ IN. For instance, in extreme value theory one often has X_{n,i} = 1I{Y_i > u_n}, where {Y_i, i ≥ 1} is a stationary sequence of random variables and {u_n} is a sequence of "high" levels. The special case where {Y_i, i ≥ 1} is a moving average is related to the topic of the Erdös-Rényi partial sums (cf. [61], ch. 2). Let F_{l,m}(τ) be the σ-field generated by the events {X_{n,i}}, l ≤ i ≤ m. Set
Condition ∆ is said to hold if α_n(l_n) → 0 for some sequence {l_n} of natural numbers such that 1 ≪ l_n ≪ n.
Class R. If ∆ holds, then there exists a sequence {r_n} of natural numbers such that (for instance, one can take r_n = [ n max{l_n; nα_n(l_n)} ]). We denote by R the class of all such sequences {r_n}.
Let ζ_{r,n} be a r.v. with the distribution (8). In extreme value theory L(ζ_{r,n}) is known as the cluster size distribution.
Condition (10) prohibits asymptotic clustering of rare events. In the case of independent r.v.s taking values in Z_+ assumption (10)
Remark 2.1. The following condition (D′) has been widely used in extreme value theory (cf. [48,61]): for any sequence {r = r_n} such that n ≫ r_n ≫ 1. Condition (D′) means that there is no asymptotic clustering of extremes. Condition (D′) was introduced by Loynes [52].
A generalisation of Corollary 2 to the case of stationary ϕ-mixing r.v.s has been given by Utev [82], Theorem 10.1, who has shown that conditions (3′) and (D′) are necessary and sufficient for (9). Sufficient conditions for Poisson convergence without assuming stationarity have been provided by Sevastyanov [75]. A Poisson limit theorem in the case of a two-dimensional random field {X_{i,j}} has been given by Banis [8].

Accuracy of Poisson approximation
The problem of evaluating the accuracy of Poisson approximation to the distribution of a sum S_n = X_1 + ... + X_n of independent 0-1 random variables has attracted a lot of attention among researchers (cf. [12,61] and references therein). A natural task is to obtain a sharp estimate of the accuracy of Poisson approximation to L(S_n). In this section we overview available estimates.
Historically, the accuracy of Poisson approximation was first studied in terms of the uniform distance (sometimes called the Kolmogorov distance).
The uniform distance d_K(X; Y) ≡ d_K(F_X; F_Y) between the distributions of random variables X and Y with distribution functions (d.f.s) F_X and F_Y is defined as d_K(X; Y) = sup_x |F_X(x) − F_Y(x)|. Many authors evaluated the accuracy of Poisson approximation to L(S_n) in terms of the total variation distance. Recall that the total variation distance d_TV(X; Y) between the distributions of r.v.s X and Y is defined as d_TV(X; Y) = inf IP(X′ ≠ Y′), where the infimum is taken over all random pairs (X′, Y′) such that L(X′) = L(X) and L(Y′) = L(Y) [32,21].
The Gini-Kantorovich distance between the distributions of r.v.s X and Y with finite first moments (known also as the Kantorovich-Wasserstein distance) is d_G(X; Y) = inf IE|X′ − Y′|, where the infimum is taken over all random pairs (X′, Y′) such that L(X′) = L(X) and L(Y′) = L(Y) [83]. If X and Y take values in Z_+, then [67] d_G(X; Y) = Σ_{k≥0} |IP(X ≤ k) − IP(Y ≤ k)|. Distance d_G was introduced by Kantorovich [43] (to be precise, Kantorovich has introduced a class of distances that includes d_G). We add the name of Gini since Gini [35] used IE|X − Y|-type quantities. Barbour et al. [12] called d_G the "Wasserstein distance" after Dobrushin [32] attributed it to Vasershtein [84].
If distributions P_1 and P_2 have densities f_1 and f_2 with respect to a measure µ, set
Then d_H denotes the Hellinger distance. It is known that
Certain other distances can be found in [71,61]. Below we present estimates of the accuracy of Poisson approximation for L(S_n) in terms of the d_K, d_TV and d_G distances.
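For laws on Z_+ all three distances can be computed directly from the probability mass functions. The following sketch is ours (not part of the survey); it evaluates d_K, d_TV and d_G for B(n, p) versus the Poisson law with the same mean, using the elementary reductions of the definitions above to sums over the integers.

```python
# A minimal numerical sketch (assumptions ours): for distributions on Z_+ the
# distances of this section reduce to
#   d_K  = sup_k |F_X(k) - F_Y(k)|,
#   d_TV = (1/2) * sum_k |P(X = k) - P(Y = k)|,
#   d_G  = sum_k |F_X(k) - F_Y(k)|,
# evaluated here for B(n, p) versus the Poisson law with parameter n*p.
from scipy.stats import binom, poisson

def discrete_distances(pmf1, pmf2):
    """d_K, d_TV, d_G for two pmfs given as equal-length lists over 0, 1, 2, ..."""
    d_tv = 0.5 * sum(abs(a - b) for a, b in zip(pmf1, pmf2))
    F1 = F2 = d_k = d_g = 0.0
    for a, b in zip(pmf1, pmf2):
        F1, F2 = F1 + a, F2 + b
        d_k = max(d_k, abs(F1 - F2))   # uniform (Kolmogorov) distance
        d_g += abs(F1 - F2)            # Gini-Kantorovich distance on Z_+
    return d_k, d_tv, d_g

n, p = 50, 0.04
grid = range(3 * n)                    # truncation; the neglected mass is negligible here
pmf_binom = [binom.pmf(k, n, p) for k in grid]
pmf_pois = [poisson.pmf(k, n * p) for k in grid]
print(discrete_distances(pmf_binom, pmf_pois))
```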

Independent Bernoulli r.v.s
We denote by B(n, p) the Binomial distribution with parameters n and p. Let Π(λ) denote the Poisson distribution with parameter λ. A Poisson Π(λ) random variable is denoted by π_λ.
Let X_1, X_2, ..., X_n be independent Bernoulli B(p_i) r.v.s. Denote S_n = X_1 + ... + X_n and λ = p_1 + ... + p_n. Many authors worked on the problem of evaluating the accuracy of Poisson approximation to L(S_n) in terms of the uniform distance d_K, the total variation distance d_TV and the Gini-Kantorovich distance d_G.
It seems natural to approximate B(n, p) by the Poisson distribution. For instance, in the case of identically distributed Bernoulli B(p) r.v.s {X_i} one has
where N_n ≡ n is the total number of 0's and 1's among X_1, X_2, ..., X_n and {π_n(t), t ∈ [0; 1]} is a Poisson jump process on [0; 1] with intensity rate n. Thus,
Prohorov [66] has established the existence of an absolute constant c such that
Tsaregradskii [81] has shown that
if X and Y are integer-valued r.v.s. Using inequality (22), he has derived the estimate
Inequality (23) seems to be the first estimate of the accuracy of Poisson approximation with an explicit constant.
In the case of non-identically distributed Bernoulli B(p_i) random variables Shorgin [78] has proved that
if θ < 1. Kontoyiannis et al. [46] have shown that
Many authors worked on the problem of evaluating the total variation distance d_TV(S_n; π_λ) (cf. [12,61] and references therein). LeCam [49] presents the following bound:
This bound is sharp: according to (2.10) in Deheuvels & Pfeifer [30], in the case of i.i.d. Bernoulli B(p) r.v.s
if np → 0. Note that (24) is a consequence of the property of d_TV and the following fact:
Indeed, denote X = (X_1, ..., X_n), π = (π_{p_1}, ..., π_{p_n}), where {π_{p_i}} are independent Poisson Π(p_i) r.v.s. Then
Kerstan [44] has shown that
if p*_n := max_{i≤n} p_i ≤ 1/4. According to Romanowska [68]
Barbour and Eagleson [9] have derived the popular estimate
Presman [65] has established an estimate of d_TV(S_n; π_λ) with the constant 0.83 at the leading term. In the case of i.i.d. Bernoulli B(p) r.v.s Presman's bound becomes
Xia [85] has derived an estimate with the constant 0.6844 at the leading term. Roos [71] (see also Čekanavičius & Roos [25]) has obtained a bound with the correct constant 3/(4e) at the leading term:
Roos [71] has shown that if θ → 0 and λ → 1 as n → ∞, then
Thus, the constant 3/(4e) cannot be improved. Denote
The following inequality from Novak [61], Theorem 4.12, sharpens the second-order term of the right-hand side of estimate (30):
An estimate in terms of the Gini-Kantorovich distance is available as well (cf. [61], formula (4.53)).
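To see how estimates of this kind behave numerically, the following sketch (ours; the survey contains no code) computes the exact distribution of S_n for non-identically distributed Bernoulli summands and compares d_TV(S_n; π_λ) with LeCam's bound Σ p_i² and the Barbour-Eagleson bound (1 − e^{−λ})λ^{−1} Σ p_i², quoted above as (24) and (28). The particular values of the p_i are an arbitrary choice.

```python
# A small numerical sketch (assumptions ours): exact Poisson-binomial law of S_n
# by convolution, compared with Poisson(lambda) and with two classical bounds.
import math
from scipy.stats import poisson

def pmf_of_sum(probs):
    """Exact pmf of a sum of independent Bernoulli(p_i) r.v.s (Poisson-binomial)."""
    pmf = [1.0]
    for p in probs:
        new = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            new[k] += mass * (1 - p)      # contribution of X_i = 0
            new[k + 1] += mass * p        # contribution of X_i = 1
        pmf = new
    return pmf

probs = [0.05, 0.02, 0.04, 0.01, 0.03] * 20          # n = 100 small success probabilities
lam = sum(probs)
pmf = pmf_of_sum(probs)
d_tv = 0.5 * sum(abs(mass - poisson.pmf(k, lam)) for k, mass in enumerate(pmf))
d_tv += 0.5 * poisson.sf(len(pmf) - 1, lam)          # Poisson mass beyond the support of S_n
lecam = sum(p * p for p in probs)                    # LeCam's bound, sum p_i^2
be = (1 - math.exp(-lam)) / lam * lecam              # Barbour-Eagleson bound
print(f"d_TV = {d_tv:.5f}, LeCam = {lecam:.5f}, Barbour-Eagleson = {be:.5f}")
```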
Asymptotics of d_TV(S_n; π_λ). The asymptotics of d_TV(S_n; π_λ) in the case of identically distributed Bernoulli B(p) r.v.s has been established by Prohorov [66]:
Deheuvels & Pfeifer [29] and Roos [70] have generalised (37) to the case of non-identically distributed {X_i}. The following result concerning the asymptotics of d_TV(S_n; π_λ) uses the notation from [61], ch. 4. Given a non-negative integer-valued random variable Y, we denote by Y⋆ a random variable with the distribution
One can check that
if λ → ∞ and θ → 0 as n → ∞. Deheuvels & Pfeifer [29] present also the asymptotics of Poisson approximation to the multinomial distribution. Results on the accuracy of Poisson approximation to the distribution of a sum of Bernoulli r.v.s can be generalised to the case of a multinomial distribution.
Let S̄_n be a random vector with the multinomial distribution B(n, p_1, ..., p_m):
IP(S̄_n = (k_1, ..., k_m)) = n! p_1^{k_1} ··· p_m^{k_m} / (k_1! ··· k_m!),
where k_1, ..., k_m ∈ Z_+ and k_1 + ... + k_m = n. Formula (38) describes, in particular, the joint distribution of the increments of the empirical d.f.

Note that S̄_n = ξ̄_1 + ... + ξ̄_n,
where ξ̄, ξ̄_1, ..., ξ̄_n are i.i.d. random vectors with the distribution IP(ξ̄ = ē_j) = p_j (j = 1, ..., m); vector ē_j has the j-th coordinate equal to 1 and the other coordinates equal to 0. Let π = (π_1, ..., π_m) be a vector of independent Poisson r.v.s with parameters np_1, ..., np_m, and let π_n(•) denote a Poisson jump process on [0; 1] with intensity rate n. Then π is a vector of increments of the process π_n(•):
Arenbaev [6] has shown that
if n → ∞ (the term 1/√(np) in (40) apparently needs to be replaced with p + 1/√(np)).

Shifted Poisson approximation.
Shifted Poisson approximation to B(n, p) has been considered by a number of authors (see [16,18,62] and references therein). The accuracy of shifted Poisson approximation can be sharper than that of pure Poisson approximation. Another advantage of using shifted Poisson approximation is the possibility to derive a more general result (e.g., an estimate of d_TV(B(n, p); Π(np)) that is uniform in p). We present such a result below.
An estimate of the accuracy of shifted Poisson approximation to the distribution of a sum of Bernoulli B(p_i) r.v.s in terms of the Gini-Kantorovich distance is given by Barbour & Xia [18].

Dependent Bernoulli r.v.s
Let X_1, ..., X_n be (possibly dependent) Bernoulli r.v.s. Set p_i = IP(X_i = 1 | X_1, ..., X_{i−1}). A generalisation of (24) has been given by Serfling [74]:
We now present a generalisation of (28) to the case of dependent Bernoulli r.v.s. Let {X_a, a ∈ J} be a family of dependent Bernoulli B(p_a) random variables. Assign to each a ∈ J a "neighborhood" B_a ⊂ J. The idea of splitting the sample into "strongly dependent" and "almost independent" parts goes back to Bernstein [19] (see also [75]). Denote S = Σ_{a∈J} X_a, λ = IES, and let
The following Theorem 6 is due to Arratia et al. [2] and Smith [79].
Theorem 6 There holds
In the case of independent random variables one can choose B_a = {a}; then (46) coincides with (28).
Theorem 6 has applications to the problem of Poisson approximation to the distribution of the number of long head runs in a sequence of Bernoulli r.v.s, and to the problem of Poisson approximation to the distribution of the number of long match patterns in two sequences (e.g., DNA sequences, see [12,61] and references therein).
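For the head-run application the local-dependence structure is explicit, so the quantities entering bounds of this type can be computed directly. The sketch below is ours (the exact right-hand side of (46) is not reproduced in this excerpt); it computes, for the "declumped" run-start indicators whose sum is the number of head runs of length ≥ k, the quantity b_1 = Σ_a Σ_{b∈B_a} IP(X_a = 1) IP(X_b = 1) that typically drives Chen-Stein bounds. The companion term b_2 = Σ_a Σ_{b∈B_a, b≠a} IE X_a X_b vanishes here, because two runs of length ≥ k cannot start within k positions of each other.

```python
# Sketch (assumptions ours): Chen-Stein local-dependence quantity b1 for the
# declumped indicators of head runs of length >= k in an i.i.d. Bernoulli(p)
# sequence xi_1, ..., xi_n:  X_a = 1{xi_{a-1} = 0, xi_a = ... = xi_{a+k-1} = 1}
# (no condition on xi_0 when a = 1), so that W_n(k) = sum_a X_a.
# With neighbourhoods B_a = {b : |b - a| <= k} one has E[X_a X_b] = 0 for
# 0 < |b - a| <= k (two runs cannot start that close), hence b2 = 0.
def b1_head_runs(n: int, k: int, p: float) -> float:
    q = 1.0 - p
    probs = [p ** k] + [q * p ** k] * (n - k)      # P(X_a = 1), a = 1, ..., n - k + 1
    b1 = 0.0
    for a, pa in enumerate(probs):
        lo, hi = max(0, a - k), min(len(probs), a + k + 1)
        b1 += pa * sum(probs[lo:hi])               # b runs over the neighbourhood B_a
    return b1

n, k, p = 10_000, 12, 0.5
lam = p ** k + (n - k) * (1 - p) * p ** k          # E W_n(k)
print(f"lambda = {lam:.4f}, b1 = {b1_head_runs(n, k, p):.6f}")
```

The small value of b_1 relative to λ is what makes Poisson approximation for the number of long head runs accurate in this regime.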
The topic concerning L(S_n) in the case of stationary dependent r.v.s {X_i} has applications in extreme value theory [48,61]. The case where the sequence X_1, ..., X_n is a moving average is related to the topic concerning the so-called Erdös-Rényi maximum of partial sums ([61], ch. 2).
A generalisation of Theorem 6 to the case of compound Poisson approximation has been given by Roos [69].

Independent integer-valued r.v.s
The topic of Poisson approximation to the distribution of a sum of integer-valued r.v.s has applications in extreme value theory, insurance, reliability theory, etc. (cf. [7,12,48,61]). For instance, in insurance applications the sum S_n = Σ_{i=1}^n Y_i 1I{Y_i > y_i} of integer-valued r.v.s accounts for the total loss from the claims exceeding excesses {y_i}. One would like to know whether Poisson approximation to L(S_n) is applicable.
In extreme value theory one often deals with the number of extreme (rare) events represented by a sum S_n = ξ_1 + ... + ξ_n of 0-1 r.v.s (indicators of rare events). The r.v.s ξ_1, ..., ξ_n can be dependent. One way to cope with dependence is to split the sample into blocks, which can be considered almost independent (the so-called Bernstein's blocks approach [19]). The number of r.v.s in a block is an integer-valued r.v.; thus, the number of rare events is a sum of almost independent integer-valued r.v.s.
In all such situations one deals with a sum of integer-valued r.v.s that are non-zero with small probabilities, and Poisson or compound Poisson approximation to L(S_n) appears plausible. An estimate of the accuracy of Poisson approximation to the distribution of S_n can indicate whether Poisson approximation is applicable.
The problem of evaluating the accuracy of Poisson approximation to the distribution of a sum of independent non-negative integer-valued r.v.s has been considered, e.g., in [10,11,61]. LeCam's inequality (24) and the Barbour-Eagleson inequality (28) have been generalised to the case of non-negative integer-valued r.v.s by Barbour [10]. Theorem 7 below presents another result of that kind ([61], ch. 4).
Let X_1, X_2, ..., X_n be independent non-negative integer-valued r.v.s,
Distribution (47)
In the case of Bernoulli B(p_i) r.v.s one has X*_i ≡ 0, and (48) coincides with (28). In the case of i.i.d. r.v.s (48) becomes
Here {X*} may be chosen independent of {X}, although one would prefer to define X and X* on a common probability space in order to make IE|X − X*| smaller.

Shifted Poisson approximation.
A number of authors dealt with shifted Poisson approximation to the distribution of a sum S_n of integer-valued r.v.s (see [16,62] and references therein). Let
where [x] and {x} = x − [x] denote the integer and the fractional parts of x.
Let {X_a, a ∈ J} be a family of r.v.s taking values in Z_+. Suppose one can choose the "neighborhoods" {B_a} so that the r.v.s {X_b, b ∈ J \ B_a} are independent of X_a. We call this assumption the "local dependence" condition.
In Theorem 9 we drop the local dependence condition assumed in Theorem 8.

Asymptotic expansions
Let X_1, ..., X_n be independent Bernoulli B(p_i) r.v.s. Shorgin [78] presents asymptotic expansions of IP(S_n ≤ x) in the case of non-i.i.d. {X_i}. Asymptotic expansions for IP(S_n ∈ A), A ⊂ Z_+, and IEh(S_n), where the function h obeys certain restrictions, are given by Barbour [10].
The formulation of the full asymptotic expansions is cumbersome and will be omitted. We present below the first-order asymptotics of IEh(S_n) for particular classes of functions h.

Sum of a random number of random variables
Let ν, X, X_1, X_2, ... be independent non-negative random variables, where the r.v. ν takes values in Z_+ and X, X_1, X_2, ... are i.i.d. random variables.
Set S_ν = X_1 + ... + X_ν. A natural task is to evaluate the accuracy of Poisson approximation to L(S_ν).
We now consider the situation where the r.v. ν depends on {X_i}. Let X, X_1, X_2, ... be i.i.d. non-negative integer-valued r.v.s. Set S_0 = 0, and let µ(t) denote the stopping time
Theorems 12-13 below are cited from [61], ch. 3. They provide estimates of the accuracy of Poisson approximation to the distribution of the number of exceedances of a "high" level x ∈ [0; t] till µ(t).
Note that M_t is the largest observation among {X_1, ..., X_{µ(t)}, t − S_{µ(t)}}. Let X_{k,t} denote the k-th largest element among {X_1, ..., X_{µ(t)}, t − S_{µ(t)}}. Then
The topic has applications in finance. For instance, suppose a bank has opened a credit line for a series of operations, and the total amount of credit is t units of money. The cost of the i-th operation is denoted by X_i. What is the probability that the bank will ever pay x or more units of money at once? That there will be a certain number of such payments? Information on the asymptotic properties of the distributions of the random variables M_t and N_t(x) can help to answer these questions.
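A small Monte Carlo sketch (ours) illustrates the approximation behind Theorems 12-13 in this credit-line setting. We take µ(t) = max{n : X_1 + ... + X_n ≤ t}, i.e. the number of operations that fit into the credit line, and compare the empirical law of N_t(x), the number of payments of size ≥ x among X_1, ..., X_{µ(t)}, with the Poisson law with parameter p_x t/IEX introduced below; the claim-size law is an arbitrary choice.

```python
# Monte Carlo sketch (assumptions ours): X_i i.i.d. uniform on {1, ..., 100},
# mu(t) = max{n : X_1 + ... + X_n <= t}, N_t(x) = #{i <= mu(t) : X_i >= x}.
# The empirical law of N_t(x) is compared with Poisson(p_x * t / E X), p_x = P(X >= x).
import random, math
from collections import Counter

def simulate(t, x, runs=10_000, seed=1):
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(runs):
        total = n_exceed = 0
        while True:
            xi = rng.randint(1, 100)          # cost of the next operation
            if total + xi > t:                # the credit line would be exceeded
                break
            total += xi
            n_exceed += xi >= x
        counts[n_exceed] += 1
    return {k: counts[k] / runs for k in sorted(counts)}

t, x = 5050, 96                               # p_x = 0.05, E X = 50.5, so p_x * t / E X = 5
emp = simulate(t, x)
lam = 0.05 * t / 50.5
for k in sorted(emp):
    print(k, round(emp[k], 4), round(math.exp(-lam) * lam ** k / math.factorial(k), 4))
```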
Let {X^<_i, i ≥ 1}, {X^>_j, j ≥ 1} be independent r.v.s with the distributions
We set p_x = IP(X ≥ x),
Let K_*, K^* denote the end-points of L(X), and set
In Theorems 12-13 we assume the following condition:
Condition (63) means that the tail of L(X) is light (cf. (3.15) in [61]). Inequality (63) holds if the function g(x) = e^{cx} IP(X ≥ x) is non-increasing for x > 1/c (for some c > 0). The equality in (63) for all x ≥ 0 may be attained only if L(X) is exponential with IEX = D.
Let π(t, x) denote a Poisson r.v. with parameter p x t/IEX.
Theorem 3.7 in [61] presents asymptotic expansions for IP(N_x(t) = k). Note that the asymptotic expansions for L(M_t) have been established under a weaker moment assumption (cf. [61], ch. 3).

The number of intervals between consecutive jumps of a Poisson process.
Consider a Poisson jump process {π_λ(s), s ≥ 0} with parameter λ > 0, and let η_i denote the moment of its i-th jump. Set X_i = η_i − η_{i−1}. Then N_t(x) is the number of intervals between consecutive jumps with lengths greater than or equal to x. If the points of jumps represent catastrophic/rare events, then N_t(x) can be interpreted as the number of "long" intervals without catastrophes. Let π_{t,x} be a Poisson r.v. with parameter tλe^{−λx}. Then for any k ∈ Z_+, as t → ∞,
(cf. (3.12) in [61]).
Open problems. 2.5. Will asymptotic expansions for L(N_x(t)) hold under a weaker moment assumption? 2.6. Generalise the results of Theorems 12-13 to the case of

Applications
Applications of the theory of Poisson approximation to meteorology, reliability theory and extreme value theory have been discussed in [7,39,48,61]. In this section we present a number of results that are not fully covered in existing surveys.

Long head runs
Let {ξ_i, i ≥ 1} be a sequence of 0-1 random variables. We say a head run (a series of 1's)
For instance, if n = 5 and ξ_1 = ξ_2 = ξ_3 = 1, ξ_4 = 0, ξ_5 = 1, there is one series (head run) of length 3 and one series of length 1. Denote
Then W_n(k) is the number of head runs of length ≥ k among ξ_1, ..., ξ_n (NLHR). Set L_n = max{k : W_n(k) ≥ 1};
L_n is the length of the longest head run (LLHR) among ξ_1, ..., ξ_n. Obviously,
The problem of approximating the distribution of LLHR is a topic of active research; it has applications in reliability theory and psychology (cf. [7,61]).
There is a close relation between N_t(x) and W_n(k). Let η_0 = 0,

Theorem 13 entails
Corollary 15 For any j ∈ Z_+, as n → ∞,
According to Theorem 3.13 in [61], the rate n^{−1} ln n in (68) cannot be improved.
The number of long non-decreasing runs. Let ξ_i = 1I{Y_i ≤ Y_{i+1}}, where {Y_i} are i.i.d. r.v.s with a continuous d.f. Then the NLHR W_n(k) is the number of non-decreasing runs of length ≥ k (NLNR), and the LLHR is the length of the longest non-decreasing run (LLNR) among Y_1, ..., Y_{n+1}. We denote the LLNR by L^+_n and the NLNR by W^+_n(k). The topic concerning the LLNR and NLNR has applications in finance. It is well known that prices of shares and financial indexes evolve in cycles of growth and decline. Knowing the asymptotics of L^+_n and W^+_n(k) can help in evaluating the length of the longest period of continuous growth/decline of a particular financial instrument as well as the distribution of the number of such long periods.
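As an informal illustration (ours, not from the survey), the following simulation counts the maximal non-decreasing runs of length ≥ k among i.i.d. continuous observations and compares the empirical law of this count with a Poisson law whose mean is the empirical mean; the Poisson shape predicted by the limit theorems for the NLNR can be checked in this way. The run-length convention used in the code (length = number of observations in the run) is ours.

```python
# Simulation sketch (assumptions ours): W^+_n(k) = number of maximal non-decreasing
# runs of length >= k among i.i.d. Uniform(0,1) observations Y_1, ..., Y_{n+1},
# compared with a Poisson law of the same (empirical) mean.
import random, math
from collections import Counter

def count_long_runs(y, k):
    """Number of maximal non-decreasing runs of length >= k in the list y."""
    count, run = 0, 1
    for a, b in zip(y, y[1:]):
        run = run + 1 if a <= b else 1
        if run == k:                       # each maximal run of length >= k is counted once
            count += 1
    return count

rng = random.Random(5)
n, k, runs = 2_000, 6, 3_000
sample = [count_long_runs([rng.random() for _ in range(n + 1)], k) for _ in range(runs)]
lam = sum(sample) / runs
freq = Counter(sample)
for j in range(6):
    print(j, round(freq[j] / runs, 4), round(math.exp(-lam) * lam ** j / math.factorial(j), 4))
```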
Pittel [63] has proved a Poisson limit theorem for NLNR (see also Chryssaphinou et al. [27] concerning the case of a Markov chain).
We proceed with the case of i.i.d. r.v.s with a continuous d.f. The accuracy of compound Poisson approximation to the distribution of the number of non-decreasing runs of fixed length has been evaluated by Barbour & Chryssaphinou [15], p. 982 (continuous d.f.) and Minakov [59] (discrete d.f.).
Open problem.

Long match patterns
Closely related to the number of long head runs is the number of long match patterns (NLMP) between sequences of independent r.v.s. Information on the distribution of the NLMP and the length of the longest match pattern (LLMP) can help in recognising "valuable" fragments of DNA sequences (see [2,3,58,60]).
In this section we present results on the accuracy of Poisson approximation to the distribution of the NLMP. Theorems 17, 19 and Lemma 21 below have been established by the author (see [61], ch. 4).
Let X, X_1, ..., X_m, Y, Y_1, ..., Y_n be independent non-degenerate random variables taking values in a discrete state space A. Denote
(k ∈ IN);
is the length of the longest match pattern between (X_1 ... X_m) and (Y_1 ... Y_n). In the rest of this section we assume that the r.v.s X, X_1, ..., X_m, Y, Y_1, ..., Y_n are identically distributed. We set
and let
where log is to the base 1/p. Note that
Taking into account Hölder's inequality, we conclude that
Note that c_+ = c_* = 1 if L(X) is uniform over a finite alphabet. Let π_{m,n} denote a Poisson random variable with parameter λ_{k,m,n}.
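The role of the quantity p and of logarithms to the base 1/p can be seen from a simple count (ours; the exact definition of λ_{k,m,n} is not reproduced in this excerpt). For identically distributed sequences, the expected number of pairs of aligned length-k windows that match letter-by-letter equals (m − k + 1)(n − k + 1)p^k with p = IP(X = Y), so this expectation stays bounded precisely when k is of the order log_{1/p}(mn).

```python
# Sketch (assumptions ours): expected number of aligned matching windows of length k
# between two independent i.i.d. sequences over a finite alphabet, and the critical
# length k ~ log_{1/p}(m n), where p = P(X = Y) = sum of squared letter probabilities.
import math

def expected_matches(m, n, k, letter_probs):
    p = sum(q * q for q in letter_probs)          # P(X = Y) for identically distributed X, Y
    return (m - k + 1) * (n - k + 1) * p ** k

probs = [0.25] * 4                                # uniform four-letter alphabet (e.g. DNA)
m = n = 10_000
p = sum(q * q for q in probs)
print(f"critical length ~ {math.log(m * n, 1 / p):.2f}")   # log to the base 1/p
for k in (10, 13, 16, 19):
    print(k, f"{expected_matches(m, n, k, probs):.3f}")
```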
The following theorem shows that the distribution of the number of long match patterns can be well approximated by the Poisson law.
Theorem 17 has been derived using Theorem 6 and Lemma 21.
If m → ∞ and n → ∞ in such a way that (ln mn)/ min{m, n} → 0, then
It is easy to see that the accuracy of estimate (72) depends on the relation between m and n. If L(X) is uniform over a finite alphabet and (ln mn)/(m ∧ n) → 0, then Corollary 18 implies that
If L(X) is uniform over a finite alphabet and
for some constant c > 0, then the right-hand side of (72) becomes O(n^{−1} ln n). We conjecture that the correct rate of convergence in (74) for the uniform
The reason why (71) does not yield such a rate is the lack of the factor e^{−λ} on the right-hand side. Results obtained for the LLHR by the method of recurrent inequalities do produce such a factor (cf. Theorem 3.12 in [61]).
In a more general situation one can consider the NLMP with, say, r mismatches allowed. An estimate of the accuracy of Poisson approximation to the distribution of the number of long r-interrupted match patterns among X_1, ..., X_m, Y_1, ..., Y_n (match patterns of length ≥ k with ≤ r "interruptions") can be found in [58,61].
N*_n is the number of long match patterns in one and the same sequence X_1, ..., X_n. The statistic N*_n was introduced by Zubkov & Mihailov [93], who have shown that
is the length of the longest match pattern among X_1, ..., X_n. Obviously,
The next theorem evaluates the accuracy of Poisson approximation to L(N*_n).
Theorem 19 has been derived using Theorem 6 and Lemma 21.
If L(X) is uniform over a finite alphabet, then the right-hand side of (75) is O(n^{−1} ln n), and the right-hand side of (76)
The key result behind Theorems 17 and 19 is the following
Lemma 21 For all natural i, j, i′, j′ such that (i, j) ≠ (i′, j′),
Denote by τ_k = min{n : N*_n(k) ≠ 0} the first instance a match pattern of length k appears in the sequence {X_i, i ≥ 1}. Then
The results on the asymptotics of τ_k can be derived from the corresponding results on the NLMP. The NLMP with a small number of mismatches has been considered by several authors (see [58,61] and references therein).
A number of authors evaluated the accuracy of compound Poisson approximations to the distribution of NLMP (see [58,61,76] and references therein).
Open problems. 3.2. Derive uniform-in-k estimates of (possibly shifted) Poisson approximation to L(W_{m,n}) and L(N*_n). 3.3. Find the 2nd-order asymptotic expansions for IP(W_{m,n} ∈ •) and IP(N*_n ∈ •). 3.4. Check if the correct rate of convergence in (73) and (76) in the case of uniform L(X) is O(n^{−1} ln n). 3.5. Improve the estimate of the rate of convergence in the limit theorem for the length of the longest r-interrupted match pattern.

Compound Poisson approximation
The topic of compound Poisson (CP) approximation is vast. From a theoretical point of view, interest in the topic arises in connection with Kolmogorov's problem concerning the accuracy of approximation of the distribution of a sum of independent r.v.s by infinitely divisible laws (see [5,50,64,66] and references therein).
Recall that the class of infinitely divisible distributions coincides with the class of weak limits of compound Poisson distributions [45].
The topic has applications in extreme value theory, insurance, reliability theory, pattern matching, etc. (cf. [7,12,48,61]). For instance, in (re)insurance applications the sum S_n = Σ_{i=1}^n Y_i 1I{Y_i > x_i} of integer-valued r.v.s accounts for the total loss from the claims {Y_i} that exceed excesses {x_i}. If the probabilities IP(Y_i > x_i) are small, L(S_n) can be accurately approximated by a Poisson or a compound Poisson law.
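A simulation sketch of this kind of approximation (ours; the particular claim law, common threshold and sample size are arbitrary choices): the reinsurance-type sum S_n = Σ Y_i 1I{Y_i > x} is compared with the natural compound Poisson law built from a Poisson(n IP(Y > x)) number of i.i.d. summands distributed as L(Y | Y > x).

```python
# Monte Carlo sketch (assumptions ours): i.i.d. integer claims Y_i uniform on
# {1, ..., 20}, common retention x = 18, n = 50 policies.  The law of
# S_n = sum of claim amounts exceeding x is compared with the compound Poisson
# approximation "Poisson(n * P(Y > x)) summands drawn from L(Y | Y > x)".
import numpy as np

rng = np.random.default_rng(3)
n, x, runs = 50, 18, 20_000
lam = n * 2 / 20                                   # n * P(Y > 18) = 5

claims = rng.integers(1, 21, size=(runs, n))       # direct simulation of the claims
s_direct = np.where(claims > x, claims, 0).sum(axis=1)

counts = rng.poisson(lam, size=runs)               # CP approximation
s_cp = np.array([rng.integers(19, 21, size=c).sum() for c in counts])  # L(Y | Y > 18) is uniform on {19, 20}

for k in (0, 19, 20, 38, 39, 40):
    print(k, round(float((s_direct == k).mean()), 4), round(float((s_cp == k).mean()), 4))
```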
In extreme value theory one deals with the number of extreme (rare) events represented by a sum of 0-1 r.v.s (indicators of rare events). The indicators can be dependent. A well-known approach consists of grouping observations into blocks which can be considered almost independent [19]. The number of r.v.s in a block is an integer-valued r.v., hence the number of rare events is a sum of almost independent integer-valued r.v.s. In all such situations the block sums are non-zero with small probabilities. More information concerning applications can be found in [7,12,34,48].
This section concentrates on results concerning compound Poisson (CP) approximation that can be derived from the results concerning pure Poisson approximation.

CP limit theorem
A compound Poisson (CP) distribution is the distribution of a r.v.
where τ_p and ζ′ are independent r.v.s, L(ζ′) = L(ζ | ζ ≠ 0), L(τ_p) = B(p). Note that
(cf. (6.26) in [61]). Let {X_{n,1}, ..., X_{n,n}}_{n≥1} be a triangular array of stationary dependent 0-1 random variables, i.e., the sequence X_{n,1}, ..., X_{n,n} is stationary for each n ∈ IN. Set
Let ζ_{r,n} be a r.v. with the distribution (8). The following Theorem 22 generalises Theorem 3 to the case of CP approximation. It states that under certain assumptions weak convergence of the cluster size distribution (see (80) below) is necessary and sufficient for the CP limit theorem for S_n.
In Theorem 22 below we will assume (11) and the following condition:
Note that relation (11) does not imply (79); for example, consider the case X_{n,1} ≡ X.
Theorem 22 Assume conditions (11), (79) and ∆. If
for a sequence {r = r_n} ∈ R, then
The limit in (81) does not depend on the choice of a sequence {r_n} ∈ R.
Zaitsev [89] has derived an estimate of the accuracy of compound Poisson approximation that can be sharper than (24*) if λ = p_1 + ... + p_n is "large". The following Theorem 23 presents Zaitsev's result.
Theorem 23 There exists an absolute constant C such that
Inequality (82) has been generalised to the multidimensional situation by Zaitsev [90].
We consider now the situation where
Presman [65] was probably the first to notice that in such a situation an estimate of the accuracy of compound Poisson approximation to L(S_n) follows from the estimate of the accuracy of pure Poisson approximation to L(τ_1 + ... + τ_n). Indeed, denote
It is easy to check (see, e.g., [65]) that
Besides, according to [61], Lemma 5.4,
Presman [65] has evaluated d_TV(ν_n; π_λ) (and hence d_TV(S_n; Y)) using (83) and (29). Michel [53] has applied (83) and the Barbour-Eagleson estimate (28).

CP approximation to B(n, p)
Below we present an estimate of the accuracy of compound Poisson approximation to the Binomial law related to the topic of pure Poisson approximation.
Let X, X_1, ... be independent Bernoulli B(p) r.v.s. Presman [64] has shown that
sup
where the compound Poisson distribution F_{n,p} is constructed via Poisson distributions (a similar result in terms of d_K is due to Meshalkin [55]). We present Presman's result in Theorem 24 below (see also [5], ch. 4).

Denote
Let η_1, η_2, η_3 be independent r.v.s with the distributions
(multiplication is superior to division). Set
Note that Y is a CP r.v. One can check that IE(X − p)^3 = pq(q − p),
where {Y_i} are independent copies of Y.
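The moment identity just quoted is elementary; the following one-line symbolic check (ours) verifies it.

```python
# Symbolic check (ours) of E(X - p)^3 = p q (q - p) for X ~ Bernoulli(p), q = 1 - p.
import sympy as sp

p = sp.symbols('p')
q = 1 - p
third_central_moment = p * (1 - p) ** 3 + (1 - p) * (0 - p) ** 3   # average over the two atoms of X
print(sp.simplify(third_central_moment - p * q * (q - p)))          # prints 0
```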
Unit measure (signed measure) approximations. A number of authors evaluated the accuracy of unit measure (signed measure) approximation to the distribution of a sum of independent Bernoulli r.v.s (see, e.g., [22,16]). In particular, Borovkov [22] has generalised LeCam's inequality (24). Barbour & Čekanavičius [16] present a unit measure approximation to the distribution of a sum of independent integer-valued r.v.s. Note that asymptotic expansion (56) is an example of a unit measure approximation.
Dependent 0-1 r.v.s. Let X, X_1, ... be a stationary sequence of 0-1 r.v.s. The following Theorem 25 is an application of (83) in the case of dependent r.v.s.
Let π, ζ_2, . . . be independent random variables, where
The distribution of S_n = X_1 + ... + X_n can be approximated by a CP distribution L(Y_n).
If the random variables {X_i} are m-dependent, then one can choose l = m and r = ⌈√(mn)⌉, the smallest integer greater than or equal to √(mn), and get the estimate
For further results on the accuracy of compound Poisson approximation to a sum of dependent r.v.s, see [26] and references therein.

Poisson process approximation
Point process counting locations of rare events. Let {ξ_i, i ≥ 1} be Bernoulli r.v.s (e.g., ξ_i = 1I{X_i > u_n}, where u_n is a "high" level). Then
is called a "Bernoulli process". S_n(•) counts locations of extreme/rare events represented by the r.v.s {ξ_i}.
For instance, let X, X_1, X_2, ... be a stationary sequence of random variables, and let {u_n} be a sequence of levels. Set ξ_i = 1I{X_i > u_n}. Then S_n(•) = N_n(•, u_n), where
Process N_n(•, u_n) counts locations of exceedances of the level u_n. Let {r = r_n} be a sequence obeying (7). We denote by ζ_{r,n} a r.v. with distribution (8).
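A quick numerical illustration (ours, with an i.i.d. sequence and a level u_n chosen so that n IP(X > u_n) stays moderate): the exceedance counts over disjoint blocks of indices behave like independent Poisson r.v.s, which is the content of the Poisson process approximation for N_n(•, u_n). For i.i.d. data the block counts are exactly independent; the Poisson character shows up in the count distribution and in the mean being approximately equal to the variance.

```python
# Sketch (assumptions ours): i.i.d. Exp(1) observations, level u with
# n * P(X > u) = 10.  We check mean ~ variance for the exceedance count over the
# first half of the indices and the (near-zero) correlation between the two halves.
import numpy as np

rng = np.random.default_rng(4)
n, runs = 1_000, 5_000
u = np.log(n / 10)                          # P(X > u) = 10 / n for Exp(1)
exceed = rng.exponential(size=(runs, n)) > u
left = exceed[:, : n // 2].sum(axis=1)      # count over the first half of the indices
right = exceed[:, n // 2 :].sum(axis=1)     # count over the second half
print(left.mean(), left.var(), np.corrcoef(left, right)[0, 1])   # ~5, ~5, ~0
```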
The necessity part of Theorem 26 is given by Theorem 3: if (92) holds, then so does (10). The proof of Theorem 26 follows the lines of the proof of Theorem 7.1 in [61].
Xia [86] presents an estimate of the accuracy of Poisson process approximation in terms of a d G -type distance.
In the general case (when the limiting distribution of ζ r,n is not degenerate) the limiting distribution of N n (•, u n ) is necessarily compound Poisson (Hsing et al. [40], see also [61], ch. 7).
Excess process. Let X, X_1, X_2, ... be a stationary sequence of observations. A typical example of a rare event is an exceedance of a high threshold.
If one is interested in the joint distribution of exceedances of several levels among X_1, ..., X_n, a natural tool is the excess process N^ε_n(•). We give the definition of the excess process below.
Theorem 27 Assume mixing condition ∆, and let π(•) denote a Poisson process with intensity rate 1. Then
if and only if condition C′ holds.
Necessary and sufficient conditions for the weak convergence of N * n to a Poisson cluster process are given in [61], ch. 8.
An estimate of the accuracy of the approximation N*_n(•) ≈ Σ_{j=1}^{π(T)} γ_j(•) in terms of a d_G(X; Y)-type distance is given in [17].
Open problem. 5.1. Improve the estimate of the accuracy of the approximation N*_n ≈ N* presented in [17].
Denote by W_{m,n} = Σ_{(i,j)∈J} T*_{ij} the number of long match patterns (patterns of length ≥ k). Then {M*_{m,n} < k} = {W_{m,n} = 0}.