On some aspects of a bivariate alternative zero-inflated logarithmic series distribution

In this paper, we discuss some important aspects of the bivariate alternative zero-inflated logarithmic series distribution (BAZILSD), whose marginals are the alternative zero-inflated logarithmic series distributions of Kumar and Riyaz (2015. An alternative version of zero-inflated logarithmic series distribution and some of its applications. Journal of Statistical Computation and Simulation, 85(6), 1117–1127). We study some important properties of the distribution by deriving expressions for its probability mass function, factorial moments and conditional probability generating functions, and recursion formulae for its probabilities, raw moments and factorial moments. The parameters of the BAZILSD are estimated by the method of maximum likelihood, and certain test procedures are also considered. Further, certain real-life data applications are cited to illustrate the usefulness of the model. A simulation study is conducted to assess the performance of the maximum likelihood estimators of the parameters of the BAZILSD.

It is important to note that the BAZILSD possesses a bivariate random sum structure, as shown in Section 2. Certain bivariate random sum distributions have been studied in the literature; for example, see Kumar (2007, 2013). The random sum structure arises in several areas of scientific research, particularly in actuarial science, agricultural science, biological science and physical science. Chapter 9 of Johnson et al. (2005) is fully devoted to univariate random sum distributions.
For simplicity in the notations, we adopt the following notations throughout in the manuscript. and (1.13)

A genesis and some properties of the BAZILSD
First, we derive the BAZILSD and discuss some of its properties. Consider the sequence $\{Y_n = (Y_{1n}, Y_{2n});\ n \ge 1\}$ of independent and identically distributed bivariate Bernoulli random vectors, each with pgf in which $\lambda_j = \theta_j/\theta$, $j = 1, 2, 3$, with $\theta = \theta_1 + \theta_2 + \theta_3$ such that $\theta_1 > 0$, $\theta_2 > 0$ and $\theta_3 \ge 0$. Let $X$ be a non-negative integer-valued random variable having the AZILSD with pgf (1.6), in which $\theta = \theta_1 + \theta_2 + \theta_3$. Assume that $\{Y_n : n \ge 1\}$ and $X$ are independent. Define $S_n = (S_{1n}, S_{2n})$ for each $n \ge 0$, in which $(S_{10}, S_{20}) = (0, 0)$ and $S_{rm} = \sum_{j=1}^{m} Y_{rj}$ for $r = 1, 2$ and $m \ge 1$. Set $S_X = \sum_{n=0}^{\infty} S_n I_{[X=n]}$, where $I_{[X=n]}$ denotes the indicator function of the event $[X = n]$. Then the pgf of $S_X$ is where is defined in (1.9). We call a distribution with pgf (2.1) 'the bivariate alternative zero-inflated logarithmic series distribution' or, in short, 'the BAZILSD'. Clearly, when $\alpha = 0$, the pgf given in (2.1) reduces to the following pgf of the BZILSD with pmf (1.2).
which shows that the proposed bivariate model of the AZILSD can be considered more flexible than the BZILSD from a practical point of view. Further, it can be noted that the marginals of the BAZILSD are AZILSD, whereas the marginals of the BZILSD are not ZILSD.
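The random sum construction of $S_X$ described above can be sketched in Python as follows. This is a minimal illustration, not the authors' procedure. It assumes that the bivariate Bernoulli vector takes the values (1, 0), (0, 1) and (1, 1) with probabilities $\lambda_1$, $\lambda_2$ and $\lambda_3$ (the pgf itself is not reproduced in the text above), and the function names and the `sample_x` argument are hypothetical. A sampler for the AZILSD count $X$ must be supplied by the caller.

```python
import random
from typing import Callable, Tuple


def sample_bivariate_bernoulli(theta1: float, theta2: float, theta3: float,
                               rng: random.Random) -> Tuple[int, int]:
    """Draw one Y_n with weights proportional to (theta1, theta2, theta3).

    Assumption: the support is {(1, 0), (0, 1), (1, 1)}, so that
    lambda_j = theta_j / theta sums to one over these three outcomes.
    """
    outcomes = [(1, 0), (0, 1), (1, 1)]
    # random.choices normalises the weights, which gives lambda_j = theta_j / theta.
    return rng.choices(outcomes, weights=[theta1, theta2, theta3])[0]


def sample_random_sum(sample_x: Callable[[random.Random], int],
                      theta1: float, theta2: float, theta3: float,
                      rng: random.Random) -> Tuple[int, int]:
    """Draw S_X = Y_1 + ... + Y_X, with S_X = (0, 0) when X = 0."""
    n = sample_x(rng)  # sample_x is a user-supplied sampler for X (hypothetical name)
    s1 = s2 = 0
    for _ in range(n):
        y1, y2 = sample_bivariate_bernoulli(theta1, theta2, theta3, rng)
        s1 += y1
        s2 += y2
    return s1, s2
```

For example, `sample_random_sum(lambda r: r.randint(0, 3), 0.4, 0.3, 0.1, random.Random(1))` draws one pair using a placeholder uniform count in place of an AZILSD sampler.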
Proposition 2.1: If $V = (V_1, V_2)$ follows the BAZILSD, then the marginal distribution of $V_j$, for $j = 1, 2$, is the AZILSD with pgf given below.
The proof follows from the fact that $H_{V_1}(t) = H(t, 1)$ and $H_{V_2}(t) = H(1, t)$.

Proposition 2.2:
The pgf of the conditional distribution of $V_1$ given $V_2 = v$ is the following: for any non-negative integer $v$,
Proof: For any non-negative integer $v$, assume that $P(V_2 = v) > 0$. Now, we have the following partial derivatives of order $(0, v)$ of $H(t_1, t_2)$ with respect to $t_2$, evaluated at $(t_1, t_2) = (t, 0)$.
where for $j = 0, 1, 2, \ldots$, and $R_j(t)$ is defined in (1.8). Now, applying the formula for the conditional pgf in terms of partial derivatives of the joint pgf developed by Subrahmaniam (1966), we obtain the conditional pgf of $V_1$ given $V_2 = v$ as which implies (2.3) in the light of (1.8).
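For reference, the partial-derivative formula for a conditional pgf used in the proof can be written in generic form as below. This is a sketch valid for any bivariate pgf $H(t_1, t_2)$ with pmf $f(m, n)$ and $P(V_2 = v) > 0$; the superscript notation $H^{(0,v)}$ is introduced here for illustration and may differ from the notation of the original equations.

```latex
\[
H^{(0,v)}(t_1, t_2) = \frac{\partial^{v}}{\partial t_2^{v}} H(t_1, t_2),
\qquad
H^{(0,v)}(t, 0) = v! \sum_{m=0}^{\infty} f(m, v)\, t^{m},
\]
\[
H_{V_1 \mid V_2 = v}(t)
  = \frac{\sum_{m=0}^{\infty} f(m, v)\, t^{m}}{P(V_2 = v)}
  = \frac{H^{(0,v)}(t, 0)}{H^{(0,v)}(1, 0)} .
\]
```

The factor $v!$ cancels between numerator and denominator, so only the derivatives of order $(0, v)$ at $t_2 = 0$ are needed.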
Remark 2.1: The conditional pgf of $V_1$ given $V_2 = v$, as given in (2.3), can be written as $H_{Z_1}(t)\,H_{Z_2}(t)$, where $H_{Z_1}(t)$ is the pgf of a binomial random variable with parameters $z_1$ and $p = \theta_3/(\theta_2 + \theta_3)$, and $H_{Z_2}(t)$ is the pgf of a random variable following the AZILSD with parameters $v$, $\theta_1$ and $\alpha$. Thus, clearly, the conditional distribution of $V_1$ given $V_2 = v$ is the distribution of the sum of two independent random variables $Z_1$ and $Z_2$.
By using Remark 2.1, we obtain the following proposition.
Remark 2.2: By a similar approach, for any non-negative integer $v$ with $P(V_1 = v) > 0$, we can obtain the conditional pgf of $V_2$ given $V_1 = v$ by interchanging $\theta_1$ and $\theta_2$ in (2.3). Therefore, comments similar to those in Remark 2.1 are valid regarding the conditional distribution of $V_2$ given $V_1 = v$, and explicit expressions for $E(V_2 \mid V_1 = v)$ and $\mathrm{Var}(V_2 \mid V_1 = v)$ can be obtained by interchanging $\theta_1$ and $\theta_2$ in the right-hand side expressions of (2.6) and (2.7), respectively.

Proposition 2.4:
Let $V = (V_1, V_2)$ follow the BAZILSD with pgf (2.1) and let $m$, $n$ be any non-negative integers. The pmf $f(m, n)$ and the $(m, n)$-th factorial moment $\mu_{[m,n]}$ of the BAZILSD are
Proof: In order to obtain the probability mass function of the BAZILSD, we need the following derivatives of
The following derivatives are needed in the sequel, in which $0 \le i \le r$ and $j \ge 1$.
The proof follows from (2.9) in the light of the relations:
Proposition 2.6: Let $V = (V_1, V_2)$ follow the BAZILSD with pgf (2.1). Then $U = V_1 + V_2$ follows the modified AZILSD studied by Kumar and Riyaz (2013).
The proof follows from the fact that the pgf of $U$ is $H_U(t) = H(t, t)$.

Recursion formulae
In this section, we develop certain recursion formulae for probabilities, raw moments and factorial moments. Let V = (V 1 , V 2 ) be a random vector with pgf (2.1). For the sake of computational simplicity, we define u Now we obtain the following propositions.

Proposition 3.1:
The probability mass function f (m, n; u) of the BAZILSD satisfies the following recurrence formulae, in which δ j is defined in Proposition 2.5.
Proof: From (2.10) with m = 1, we have the following.
On differentiating both sides of (3.1) with respect to $t_1$, we have
$\frac{\partial}{\partial t_1} H(t_1, t_2) = \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} (m + 1)\, f(m + 1, n; u)\, t_1^{m} t_2^{n}$. (3.7)
From (3.1), we also have the following.
(3.8)
Now, by using (3.7) and (3.8) in (3.6), we get
(3.9)
On equating the coefficients of $t_1^{m} t_2^{0}$ on both sides of (3.9), we get (3.2). By equating the coefficients of $t_1^{m} t_2^{n}$ on both sides of (3.9), we get the relation (3.3). We omit the proof of relations (3.4) and (3.5), as it is similar to that of relations (3.2) and (3.3).
Proposition 3.2: Two recurrence formulae for the $(m, n)$-th raw moment $\mu_{m,n}(u)$ of the BAZILSD are the following, for $m, n \ge 0$.
Now, on expanding the exponential functions, rearranging the terms and using standard properties of double sums, we obtain the following. (3.14)
On equating the coefficients of $(it_1)^{m} (it_2)^{n} / (m!\,n!)$ on both sides of (3.14), we get the relation (3.10). A similar procedure gives (3.11).

Proposition 3.3:
The $(m, n)$-th order factorial moment $\mu_{[m,n]}(u)$ of the BAZILSD satisfies the following recurrence formulae, for $m, n \ge 0$, in which $\mu_{[0,0]}(u) = 1$.
Proof: Let $V = (V_1, V_2)$ be a random vector having the BAZILSD with pgf $H(t_1, t_2)$ as given in (3.1). Then the factorial moment generating function $F(t_1, t_2)$ of the BAZILSD is
On differentiating (3.16) with respect to $t_1$, we get
In the light of (3.17), we can write this as
Equating the coefficients of $t_1^{m} t_2^{n} / (m!\,n!)$ on both sides of (3.18), we get (3.15). Similar procedures will lead to (3.16).

Estimation and testing
In this section, we discuss the estimation of the parameters $\theta_1$, $\theta_2$, $\theta_3$ and $\alpha$ of the BAZILSD by the method of maximum likelihood and construct certain test procedures for testing the significance of the additional parameter $\alpha$ of the BAZILSD.

Maximum likelihood estimation
Let $a(m, n)$ be the observed frequency of the $(m, n)$-th cell of the bivariate data, and let (4.2) denote the resulting log-likelihood function, in which $D_r^{*}$ is defined in Proposition 2.4. Let $\hat{\theta}_1$, $\hat{\theta}_2$, $\hat{\theta}_3$ and $\hat{\alpha}$ denote the maximum likelihood estimators of the parameters $\theta_1$, $\theta_2$, $\theta_3$ and $\alpha$ of the BAZILSD. On differentiating (4.2) partially with respect to the parameters $\theta_1$, $\theta_2$, $\theta_3$ and $\alpha$, respectively, and equating to zero, we get the following likelihood equations, in which where $D_j$ and $R_j(\theta)$ are defined in (2.5) and (1.8), respectively. On solving the likelihood equations (4.3)-(4.6) numerically with mathematical software such as MATLAB, Mathcad or Mathematica, one can obtain the maximum likelihood estimators of the parameters $\theta_1$, $\theta_2$, $\theta_3$ and $\alpha$.
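As an illustration of how a grouped-data log-likelihood of this kind can be evaluated numerically, the following Python sketch sums $a(m, n) \log f(m, n)$ over the observed cells. It is not the authors' implementation: the BAZILSD pmf is not reproduced here, so `toy_pmf` is a hypothetical stand-in (a product of two independent geometric pmfs) used only to make the example runnable, and the frequencies in the usage note are made up.

```python
import math
from typing import Callable, Dict, Tuple


def log_likelihood(freq: Dict[Tuple[int, int], int],
                   pmf: Callable[[int, int], float]) -> float:
    """Return the sum over cells (m, n) of a(m, n) * log f(m, n)."""
    total = 0.0
    for (m, n), count in freq.items():
        if count == 0:
            continue  # empty cells contribute nothing
        p = pmf(m, n)
        if p <= 0.0:
            return float("-inf")  # an observed cell with zero probability
        total += count * math.log(p)
    return total


def toy_pmf(m: int, n: int, p: float = 0.5, q: float = 0.5) -> float:
    """Hypothetical stand-in pmf (independent geometrics), NOT the BAZILSD pmf."""
    return (1.0 - p) * p ** m * (1.0 - q) * q ** n
```

With the made-up table `{(0, 0): 2, (1, 0): 1}`, the call `log_likelihood({(0, 0): 2, (1, 0): 1}, toy_pmf)` evaluates `2 * log(0.25) + log(0.125)`. In practice the pmf of Proposition 2.4 would replace `toy_pmf`, and a numerical optimiser would maximise the result over the parameters.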

Testing of the hypothesis
For testing the hypothesis $H_0: \alpha = 0$ against the alternative hypothesis $H_1: \alpha \ne 0$, we construct the generalized likelihood ratio test (GLRT) and Rao's efficient score test (REST) as follows.
In the case of the GLRT, the test statistic is where $\hat{\Omega}$ is the maximum likelihood estimator of $\Omega = (\theta_1, \theta_2, \alpha)$ with no restrictions, and $\hat{\Omega}^{*}$ is the maximum likelihood estimator of $\Omega$ when $\alpha = 0$. The test statistic $-2 \log \lambda$ given in (4.7) is asymptotically distributed as chi-square with one degree of freedom. For details, see Rao (1973).
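The GLRT decision can be sketched in Python as follows, assuming the two maximised log-likelihoods are already available. The function names are hypothetical. The p-value uses the identity that a chi-square variable with one degree of freedom is the square of a standard normal variable, so its upper tail probability at $x$ equals $\mathrm{erfc}(\sqrt{x/2})$.

```python
import math


def glrt_statistic(loglik_unrestricted: float, loglik_restricted: float) -> float:
    """Return -2 log(lambda) = 2 * (log L at the unrestricted MLE - log L under alpha = 0)."""
    stat = 2.0 * (loglik_unrestricted - loglik_restricted)
    # Guard against tiny negative values caused by numerical optimisation error.
    return max(stat, 0.0)


def chi2_sf_1df(x: float) -> float:
    """Upper tail probability of the chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def glrt_p_value(loglik_unrestricted: float, loglik_restricted: float) -> float:
    """Asymptotic p-value of the GLRT for H0: alpha = 0."""
    return chi2_sf_1df(glrt_statistic(loglik_unrestricted, loglik_restricted))
```

For instance, made-up log-likelihoods of -100.0 (unrestricted) and -102.0 (restricted) give a statistic of 4.0, which exceeds the 5% critical value of about 3.84, so $H_0$ would be rejected at that level.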
In case of (REST), the following test statistic can be used.
where $T = (T_1, T_2, T_3, T_4)$ is the vector of efficient scores and $\phi = (I_{rs})_{4 \times 4}$ is the Fisher information matrix, in which $T_i$ and $I_{rs}$, for $i = 1, 2, 3, 4$ and $r, s = 1, 2, 3, 4$, are as given in the Appendix. The test statistic given in (4.8) asymptotically follows a chi-square distribution with one degree of freedom (see Rao, 1973).

Applications
For numerical applications, we consider two real-life data sets. The first data set, from Mitchell and Paulson (1981), consists of the number of aborts by 109 aircraft in two consecutive six-month periods of a one-year period. The second data set, taken from Partrat (1993), is the yearly frequency of hurricanes affecting tropical cyclones in two zones belonging to the North Atlantic coastal states in the USA. We have fitted the BZILSD, the BAZILSD and the bivariate Poisson distribution (BPD) to these data sets by the method of maximum likelihood. For the first data set, the maximum likelihood estimates (MLEs) of the parameters in the case of the BZILSD are $\hat{\theta}_1 = 0.75$, $\hat{\theta}_2 = 0.17$ and $\hat{\theta}_3 = 0.01$; those in the case of the BAZILSD are $\hat{\theta}_1 = 0.65$, $\hat{\theta}_2 = 0.23$, $\hat{\theta}_3 = 0.04$ and $\hat{\alpha} = 0.02$; and those in the case of the BPD are $\hat{\lambda}_1 = 0.67$, $\hat{\lambda}_2 = 0.47$ and $\hat{\lambda}_3 = 0.01$. For the second data set, the MLEs of the parameters in the case of the BZILSD are $\hat{\theta}_1 = 0.55$, $\hat{\theta}_2 = 0.36$ and $\hat{\theta}_3 = 0.02$; those in the case of the BAZILSD are $\hat{\theta}_1 = 0.35$, $\hat{\theta}_2 = 0.31$, $\hat{\theta}_3 = 0.04$ and $\hat{\alpha} = 0.01$; and those in the case of the BPD are $\hat{\lambda}_1 = 0.62$, $\hat{\lambda}_2 = 0.61$ and $\hat{\lambda}_3 = 0.01$. The computed values of the expected frequencies of the BZILSD, the BAZILSD and the BPD are presented in Tables 1 and 2.
(In each cell of Tables 1 and 2, the first row represents the observed frequency, the second row the theoretical frequency of the BZILSD, the third row the theoretical frequency of the BAZILSD and the last row the theoretical frequency of the BPD.)

Simulation
It is quite difficult to examine theoretically the performance of the maximum likelihood estimators of the parameters of the BAZILSD, so we have carried out a simulation study to assess their performance. We have simulated data sets of sample sizes 150, 300 and 600 in both the positively correlated and the negatively correlated situations of the BAZILSD by using a Markov chain Monte Carlo (MCMC) procedure, with 200 replications in each case. We have considered the following two sets of parameters: (i) $\theta_1 = 0.4361$, $\theta_2 = 0.2679$, $\theta_3 = 0.1905$, $\alpha = 0.0110$ (positively correlated) and (ii) $\theta_1 = 0.0847$, $\theta_2 = 0.0439$, $\theta_3 = 0.0216$, $\alpha = 0.0112$ (negatively correlated) as initial values of the parameters while simulating the data sets.
The computed values of the bias and standard error of each of the estimators are given in Table 5. From Table 5, it can be observed that both the bias and the standard errors of the estimators decrease as the sample size increases.
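The summary computed for each estimator can be sketched in Python as below. This is a minimal illustration under two assumptions not stated in the text: the bias is taken as the mean of the replicate estimates minus the true parameter value, and the standard error is taken as the sample standard deviation of the replicate estimates. The function name and the numbers in the usage note are hypothetical.

```python
import statistics
from typing import Sequence, Tuple


def bias_and_standard_error(estimates: Sequence[float],
                            true_value: float) -> Tuple[float, float]:
    """Summarise replicate estimates of one parameter.

    bias           = mean(estimates) - true_value
    standard error = sample standard deviation of the estimates (assumption)
    """
    mean_estimate = statistics.fmean(estimates)
    bias = mean_estimate - true_value
    std_error = statistics.stdev(estimates)  # requires at least two replicates
    return bias, std_error
```

For example, made-up replicate estimates `[0.9, 1.0, 1.1]` of a parameter whose true value is 1.0 give a bias of approximately 0 and a standard error of approximately 0.1. In a study like the one described, the function would be applied to the 200 replicate estimates of each parameter at each sample size.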