A Bivariate Model based on Compound Negative Binomial Distribution

A new bivariate model is introduced by compounding negative binomial and geometric distributions. Distributional properties, including joint, marginal and conditional distributions, are discussed. Expressions for the product moments, covariance and correlation coefficient are obtained. Some properties such as ordering, unimodality, monotonicity and self-decomposability are studied. Parameter estimators using the method of moments and maximum likelihood are derived. Applications to traffic accident data are illustrated.

The univariate compound negative binomial models arise naturally in insurance and actuarial sciences and have been studied by several authors (see Drekic & Willmot (2005)). Panjer & Willmot (1981) studied the compound negative binomial with exponential distribution. Subrahmaniam (1966) derived the Pascal-Poisson distribution (compound negative binomial with Poisson distributions) as a limiting case of a more general contagious distribution (see Johnson, Kemp & Kotz (2005)). Subrahmaniam (1978) investigated parameter estimation for the Pascal-Poisson distribution by the method of moments and maximum likelihood. Jewell & Milidiu (1986) suggested three methods to approximate the evaluation of the compound Pascal distribution where the compounding distribution is defined on both negative and positive integers. Ramsay (2009) derived an expression for the cumulative distribution function of the compound negative binomial where the compounding distribution is Pareto. Wang (2011) presented a recursion for the pdf of the compound beta negative binomial distribution. Willmot & Lin (1997) constructed an upper bound for the tail of the compound negative binomial distribution. Cai & Garrido (2000) derived two-sided bounds for the tails of compound negative binomial distributions. Vellaisamy & Upadhye (2009b) studied convolutions of compound negative binomial distributions. Gerber (1984), Dhaene (1991), Vellaisamy & Upadhye (2009a) and Upadhye & Vellaisamy (2014) considered the problem of approximating a compound negative binomial distribution by a compound Poisson distribution. Hanagal & Dabade (2013) introduced a compound negative binomial frailty model with three baseline distributions.
proposed a bivariate compound Poisson distribution and introduced bivariate versions of the Neyman Type A, Neyman Type B, geometric-Poisson and Thomas distributions. Earthquake data were used to illustrate the application of these distributions. Özel (2011) defined a bivariate compound Poisson distribution to model the occurrences of foreshock and aftershock sequences in Turkey.
In this paper, we study the random vector $(Y_1, Y_2)$, where $Y_1 \sim NB(r, p_1)$, $Y_2 = W_1 + \cdots + W_{Y_1}$ (with $Y_2 = 0$ when $Y_1 = 0$) and $W_i \sim \mathrm{geo}(p_2)$. We refer to this distribution as BGNBD, which stands for bivariate geometric-negative binomial distribution. The BGNBD can be used as an appropriate model for many problems of a social, economic and physical nature. Examples include the number of purchase orders and the total number of items sold per day, the total number of insurance claims and the number of claimants per unit time, the total number of injury accidents and the number of fatalities, and the number of visits and the number of drugs prescribed.
Our paper is organized as follows. In Section 2, the bivariate geometric-negative binomial distribution is derived and its distributional properties are discussed. Parameter estimators of the BGNBD are derived using the method of moments and maximum likelihood in Section 3. Applications to real data sets are presented in Section 4 to illustrate the BGNBD. Finally, some conclusions are drawn in Section 5.

Bivariate Geometric Negative Binomial
The random vector $(Y_1, Y_2)$ with $Y_2 = W_1 + \cdots + W_{Y_1}$, where $Y_1$ is a negative binomial variable given in (1) and the $W_i$'s are i.i.d. geometric variables with parameter $p_2$, independent of $Y_1$, is said to have a bivariate geometric-negative binomial distribution with parameters $r$, $p_1$ and $p_2$. This distribution is denoted by $BGNBD(r, p_1, p_2)$.
The random variable $Y_2$ is distributed according to the compound geometric-negative binomial distribution (CGNB) with parameters $r$, $p_1$ and $p_2$, denoted by $CGNB(r, p_1, p_2)$.
The skewness of $Y_2 \sim CGNB(r, p_1, p_2)$ is given by where $\mu_i$, $i = 1, 2, 3$, are the first three moments about zero of $W$. As $r$, $p_1$ and the moments of $W$ are positive, it follows that the compound geometric-negative binomial distribution is positively skewed.
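As an illustration of the definition, the following Python sketch evaluates the joint pmf and draws random pairs. It rests on assumptions that are not spelled out in the surviving text: $Y_1$ counts failures before the $r$-th success, $W$ is geometric on $\{0, 1, 2, \ldots\}$ with success probability $p_2$, so that $Y_2$ given $Y_1 = y_1$ is $NB(y_1, p_2)$. The function names `nb_pmf`, `bgnbd_pmf` and `sample_bgnbd` are hypothetical helpers introduced here, not part of the paper.

```python
import math
import random

def nb_pmf(k, r, p):
    """P(N = k) for N ~ NB(r, p), counting failures before the r-th success."""
    if r == 0:  # empty sum of geometrics: all mass at zero
        return 1.0 if k == 0 else 0.0
    log_coef = math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
    return math.exp(log_coef + r * math.log(p) + k * math.log(1.0 - p))

def bgnbd_pmf(y1, y2, r, p1, p2):
    """Joint pmf under the assumption Y1 ~ NB(r, p1), Y2 | Y1 = y1 ~ NB(y1, p2)."""
    return nb_pmf(y1, r, p1) * nb_pmf(y2, y1, p2)

def sample_bgnbd(r, p1, p2, rng):
    """Draw one (Y1, Y2) pair."""
    # Y1 by inverse-transform sampling on the NB(r, p1) pmf.
    u, y1 = rng.random(), 0
    cdf = nb_pmf(0, r, p1)
    while cdf < u and y1 < 100_000:  # cap guards against rounding in the tail
        y1 += 1
        cdf += nb_pmf(y1, r, p1)
    # Y2 as a sum of y1 geometric variables on {0, 1, 2, ...}.
    log_q2 = math.log(1.0 - p2)
    y2 = sum(int(math.log(1.0 - rng.random()) / log_q2) for _ in range(y1))
    return y1, y2

if __name__ == "__main__":
    rng = random.Random(0)
    print([sample_bgnbd(2.0, 0.4, 0.5, rng) for _ in range(5)])
```

The sampler builds $Y_2$ literally as a random sum, while the pmf uses the equivalent conditional negative binomial form; agreement between the two is a quick sanity check on the parameterization.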
Proposition 1. If $r = 1$, then the CGNBD random variable has the representation which is a mixture of a degenerate distribution at 0 with probability $p_1$ and a $\mathrm{Geo}\left(\frac{p_1 p_2}{1 - p_2 q_1}\right)$ distribution with probability $q_1$, where $q_1 = 1 - p_1$. Hence, the proof is complete.
Since the negative binomial distribution can be represented as a compound Poisson distribution with a logarithmic compounding distribution, the compound negative binomial distribution is a compound Poisson distribution with a compound logarithmic distribution as the compounding distribution. This is stated in the following proposition.
Note that log-concavity is equivalent to strong unimodality, and it implies that the distribution is unimodal and has an increasing hazard (failure) rate.
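The mixture representation for $r = 1$ can be checked numerically. The sketch below assumes that $Y_1$ is geometric on $\{0, 1, \ldots\}$ with success probability $p_1$ and that $W$ is geometric on $\{0, 1, \ldots\}$ with success probability $p_2$; under these assumptions the degenerate component sits at zero, the value $Y_2$ takes when $Y_1 = 0$. The helper names are hypothetical.

```python
import math

def cgnb_pmf_r1(y2, p1, p2, terms=400):
    """P(Y2 = y2) for r = 1, summing the joint pmf over y1 (truncated series)."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    total = p1 if y2 == 0 else 0.0  # contribution of y1 = 0
    for y1 in range(1, terms):
        # P(Y1 = y1) = p1 * q1**y1 and Y2 | Y1 = y1 ~ NB(y1, p2)
        log_term = (math.log(p1) + y1 * math.log(q1)
                    + math.lgamma(y1 + y2) - math.lgamma(y1) - math.lgamma(y2 + 1)
                    + y1 * math.log(p2) + y2 * math.log(q2))
        total += math.exp(log_term)
    return total

def mixture_pmf(y2, p1, p2):
    """Point mass at 0 (weight p1) mixed with Geo(theta) on {0, 1, ...} (weight q1)."""
    q1 = 1.0 - p1
    theta = p1 * p2 / (1.0 - p2 * q1)
    geometric_part = theta * (1.0 - theta) ** y2
    return (p1 if y2 == 0 else 0.0) + q1 * geometric_part

if __name__ == "__main__":
    for y in range(5):
        print(y, cgnb_pmf_r1(y, 0.3, 0.6), mixture_pmf(y, 0.3, 0.6))
```

The two columns printed by the script should agree up to the truncation error of the series.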
• Divisibility and Self-decomposability
Useful theorems from Steutel & van Harn (2004) regarding the representation of infinitely divisible and self-decomposable distributions on the set of nonnegative integers are quoted here. The results of these theorems enable us to prove the self-decomposability of the compound negative binomial distribution.
Note that the probability generating function of the compound negative binomial distribution is given by Therefore, by Theorem 1 and Proposition 2, the compound negative binomial distribution is infinitely divisible.
Proposition 4. The compound negative binomial distribution has canonical sequence representation of the form
Proof. From Proposition 2, the compound negative binomial distribution can be regarded as a compound Poisson distribution with $\lambda = -r \log p_1$ and compounding distribution with pgf of the form Since $f_W^{*i}$ is the probability mass function of $G_W^i$, we get It is easily seen that the canonical representation of the compound Poisson distribution is given by Substituting $\lambda = -r \log p_1$ and (9) in (10), we get the relation (8).
Example 1. In the case of the compound geometric-negative binomial distribution, we have which is a non-increasing function. Hence, the compound geometric-negative binomial distribution is self-decomposable.
Then $X_1$ is less than $X_2$ in likelihood ratio order (denoted by $X_1 \le_{lr} X_2$) if the ratio of the pmf of $X_2$ to the pmf of $X_1$ is increasing in $x$.
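The compound Poisson representation used in the proof can be verified at the level of generating functions. The sketch below assumes the standard forms of the pgfs, which are not printed in the surviving text: $(p_1 / (1 - q_1 s))^r$ for the negative binomial, $\log(1 - \theta s) / \log(1 - \theta)$ with $\theta = q_1$ for the logarithmic distribution, and $p_2 / (1 - q_2 s)$ for a geometric $W$ on $\{0, 1, \ldots\}$. Function names are hypothetical.

```python
import math

def geometric_pgf(s, p2):
    """pgf of W, geometric on {0, 1, ...} with success probability p2."""
    return p2 / (1.0 - (1.0 - p2) * s)

def cgnb_pgf_direct(s, r, p1, p2):
    """pgf of Y2 as the NB(r, p1) pgf evaluated at the pgf of W."""
    return (p1 / (1.0 - (1.0 - p1) * geometric_pgf(s, p2))) ** r

def cgnb_pgf_compound_poisson(s, r, p1, p2):
    """Same pgf written as compound Poisson with lambda = -r log p1."""
    lam = -r * math.log(p1)
    theta = 1.0 - p1
    # Compound logarithmic pgf: logarithmic pgf evaluated at the pgf of W.
    inner = math.log(1.0 - theta * geometric_pgf(s, p2)) / math.log(1.0 - theta)
    return math.exp(lam * (inner - 1.0))

if __name__ == "__main__":
    for s in (0.0, 0.3, 0.7, 1.0):
        print(s, cgnb_pgf_direct(s, 2.5, 0.4, 0.6),
              cgnb_pgf_compound_poisson(s, 2.5, 0.4, 0.6))
```

Both expressions reduce algebraically to the same function of $s$, so the printed pairs should coincide up to rounding.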

Proposition 5. Let {W
Proof . The result follows from application of Theorem 1.C.11 of Shaked & Shanthikumar (2007), and likelihood ordering of negative binomial distribution and log-concavity of geometric distribution.
The covariance matrix of $(Y_1, Y_2)$ takes the form and the correlation coefficient of $Y_1$ and $Y_2$ is where $C.V(W)$ denotes the coefficient of variation of $W$. It is interesting to note that the correlation does not depend on $r$. This gives more flexibility in modeling, as one can let the mean and the variance vary without affecting the correlation. Also, one can see that the correlation coefficient is a decreasing function of $p_1$ and $p_2$ and assumes only positive values. Obviously, the correlation is bounded by 0 and 1, where the lower bound is attained if $p_2 = 0$ and the upper bound is attained when $p_2 = 1$, which correspond to the trivial cases $Y_2 = 0$ and $Y_1 = Y_2$, respectively.
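The dependence of the correlation on the parameters can be explored with a short computation. The sketch below uses the standard random-sum identities $E(Y_2) = E(Y_1)E(W)$, $Var(Y_2) = E(Y_1)Var(W) + Var(Y_1)E(W)^2$ and $Cov(Y_1, Y_2) = E(W)Var(Y_1)$, and assumes that $Y_1$ counts failures and that $W$ is geometric on $\{0, 1, \ldots\}$ with mean $(1 - p_2)/p_2$. Under these assumptions the correlation equals $(1 + p_1\, C.V(W)^2)^{-1/2}$. The function name `bgnbd_moments` is hypothetical.

```python
import math

def bgnbd_moments(r, p1, p2):
    """Means, variances, covariance and correlation of (Y1, Y2)."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    mean_y1, var_y1 = r * q1 / p1, r * q1 / p1 ** 2   # NB(r, p1), failures counted
    mean_w, var_w = q2 / p2, q2 / p2 ** 2             # geometric on {0, 1, ...}
    mean_y2 = mean_y1 * mean_w
    var_y2 = mean_y1 * var_w + var_y1 * mean_w ** 2   # random-sum variance
    cov = mean_w * var_y1
    return {
        "mean": (mean_y1, mean_y2),
        "var": (var_y1, var_y2),
        "cov": cov,
        "corr": cov / math.sqrt(var_y1 * var_y2),
    }

if __name__ == "__main__":
    for r in (0.5, 2.0, 10.0):
        print(r, bgnbd_moments(r, 0.3, 0.6)["corr"])
```

Running the script for several values of $r$ with $p_1$ and $p_2$ fixed prints the same correlation each time, in line with the remark that the correlation does not depend on $r$.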
• Product moments and joint cumulants.
• Conditional distribution and regression functions
1. It is obvious that the conditional distribution of $Y_2$ given $Y_1 = y_1$ is negative binomial with parameters $y_1$ and $p_2$. Thus $E(Y_2 \mid Y_1 = y_1) = y_1 \frac{1 - p_2}{p_2}$, which is linear in $y_1$ with regression coefficient $\frac{1 - p_2}{p_2}$. As the coefficient is non-negative, the conditional mean of $Y_2$ increases with $y_1$. Also, the conditional variance is $Var(Y_2 \mid Y_1 = y_1) = y_1 \frac{1 - p_2}{p_2^2}$, which has similar properties to the conditional mean.
The following proposition gives the distribution of the random sum S = Y 1 + Y 2 .
and, the proof is complete.
i=0 W i ) be mutually independent BGNBD for $i = 1, 2, \ldots, n$, $Y_{1i}$ is a negative binomial random variable with parameters $r_i$, $p_1$, and the $W_i$'s are iid random variables distributed as geometric with parameter $p_2$ and independent of the $Y_{1i}$'s, then the distribution of Then, the mgf of the sum of the $n$ random vectors which is the mgf of BGNBD with parameters $r = \sum_{i=1}^{n} r_i$, $p_1$ and $p_2$.
• Limiting Distribution.
Since the negative binomial distribution with parameters $r$ and $p_1$ converges to the Poisson distribution with parameter $\lambda = r(1 - p_1)$ as $r \to \infty$ and $p_1 \to 1$, we have the following proposition.
Proposition 9. If $(Y_1, Y_2) \sim BGNBD(r, p_1, p_2)$, then the function $f_{Y_1,Y_2}(y_1, y_2)$ defined in (11) is $TP_2$.
Proof. For $z_1 < z_2$, we have which is a decreasing function in $y_1$; hence $f_{Y_1,Y_2}(y_1, y_2)$ defined in (11) is $TP_2$.
The $TP_2$ property is a very strong form of positive dependence between random variables; in particular, it implies association and positive quadrant dependence, and hence a nonnegative covariance (see, for example, Barlow & Proschan (1975)).
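The Poisson limit of the negative binomial marginal can be illustrated numerically. In the sketch below, $\lambda$ is held fixed and $p_1 = 1 - \lambda / r$, so that $r(1 - p_1) = \lambda$ while $r$ grows; this particular way of coupling $r$ and $p_1$ is an assumption made for the illustration. The helper names are hypothetical.

```python
import math

def nb_pmf(k, r, p):
    """P(N = k) for N ~ NB(r, p), counting failures before the r-th success."""
    log_coef = math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
    return math.exp(log_coef + r * math.log(p) + k * math.log(1.0 - p))

def poisson_pmf(k, lam):
    """P(N = k) for N ~ Poisson(lam)."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def max_pmf_gap(r, lam, kmax=60):
    """Largest pointwise gap between NB(r, 1 - lam/r) and Poisson(lam) on 0..kmax."""
    p1 = 1.0 - lam / r  # requires r > lam
    return max(abs(nb_pmf(k, r, p1) - poisson_pmf(k, lam)) for k in range(kmax + 1))

if __name__ == "__main__":
    for r in (10, 100, 1000, 10000):
        print(r, max_pmf_gap(r, 2.0))
```

The printed gaps shrink as $r$ increases, which is the behaviour the limiting argument relies on.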
(i) For $r < 1$ and $y_2 = 0$, the joint pmf of BGNBD given in (11) is log-convex in $y_1$; otherwise it is log-concave.
(ii) The joint pmf of BGNBD given in (11) is log-concave in $y_2$.
Proof.
(i) In order to prove that the joint pmf of BGNBD given in (11) is log-concave in $y_1$, we need to show that the ratio $f_{Y_1,Y_2}(y_1 + 1, y_2) / f_{Y_1,Y_2}(y_1, y_2)$ is decreasing in $y_1$ for every $y_2$. But
$$\frac{f_{Y_1,Y_2}(y_1 + 1, y_2)}{f_{Y_1,Y_2}(y_1, y_2)} = (1 - p_1)\, p_2 \left(1 + \frac{y_2 + r - 1}{y_1 + 1} + \frac{r\, y_2}{y_1 (y_1 + 1)}\right).$$
Thus, the ratio is decreasing in $y_1$ for $r \ge 1$. For $r < 1$, we have two cases: if $y_2 = 0$, the ratio is increasing in $y_1$; if $y_2 > 0$, the ratio is clearly decreasing in $y_1$.
(ii) The log-concavity of BGNBD in $y_2$ follows from the fact that the $y_1$-th convolution of the geometric distribution with parameter $p_2$ is the negative binomial distribution with parameters $y_1$ and $p_2$, which is log-concave.
• Stochastic Order
Proof. The result follows from an application of Theorem 6.B.3 of Shaked & Shanthikumar (2007) and Proposition 5.
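The closed form of the ratio used in the proof of (i) can be compared with the ratio computed directly from the joint pmf. The sketch assumes the joint pmf is the product of an $NB(r, p_1)$ pmf for $Y_1$ (failures counted) and an $NB(y_1, p_2)$ pmf for $Y_2$ given $Y_1 = y_1$, which is consistent with the ratio displayed in the proof; the function names are hypothetical.

```python
import math

def nb_pmf(k, r, p):
    """P(N = k) for N ~ NB(r, p), counting failures before the r-th success."""
    log_coef = math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
    return math.exp(log_coef + r * math.log(p) + k * math.log(1.0 - p))

def bgnbd_pmf(y1, y2, r, p1, p2):
    """Joint pmf for y1 >= 1 (the ratio below is only used for y1 >= 1)."""
    return nb_pmf(y1, r, p1) * nb_pmf(y2, y1, p2)

def ratio_closed_form(y1, y2, r, p1, p2):
    """f(y1 + 1, y2) / f(y1, y2) as written in the proof."""
    return (1.0 - p1) * p2 * (1.0 + (y2 + r - 1.0) / (y1 + 1.0)
                              + r * y2 / (y1 * (y1 + 1.0)))

if __name__ == "__main__":
    r, p1, p2 = 0.5, 0.4, 0.5
    for y2 in (0, 2):
        ratios = [ratio_closed_form(y1, y2, r, p1, p2) for y1 in range(1, 6)]
        print("y2 =", y2, [round(v, 4) for v in ratios])
```

For $r = 0.5$ the printed ratios increase in $y_1$ when $y_2 = 0$ and decrease when $y_2 = 2$, matching the two cases distinguished in the proof.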
• Method of moments
The moment estimates $\hat{p}_{1MM}$, $\hat{p}_{2MM}$ and $\hat{r}_{MM}$ of $p_1$, $p_2$ and $r$ are obtained by solving the moment equations. Using the moments As the value of $r$ is non-negative, the estimate $\hat{r}_{MM}$ is meaningful only when $s_1^2 > \bar{y}_1$.
• Maximum Likelihood
Maximum likelihood estimates (MLE) of the parameters $p_1$, $p_2$ and $r$ can be derived by considering the likelihood function given by Then it can be seen that the MLEs satisfy Note that the MLE and MM estimates of $p_2$ are identical. Under mild regularity conditions, the maximum likelihood estimator $\hat{\Theta} = (\hat{r}, \hat{p}_1, \hat{p}_2)$ has, for large samples, approximately a multivariate normal distribution $N_3(\Theta, I^{-1}(\Theta))$, where $I(\Theta) = -E\left(\frac{\partial^2 \log L}{\partial \Theta\, \partial \Theta^{\top}}\right)$. In order to obtain the asymptotic variance-covariance matrix of $\hat{p}_1$, $\hat{p}_2$ and $\hat{r}$, we need the second partial derivatives of the log-likelihood function. These are given by Hence, $Cov(\hat{p}_2, \hat{r}) = Cov(\hat{p}_1, \hat{p}_2) = 0$.
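A minimal sketch of moment estimation is given below. It assumes that the moment equations match $\bar{y}_1$, $s_1^2$ and $\bar{y}_2$ to $E(Y_1) = r q_1 / p_1$, $Var(Y_1) = r q_1 / p_1^2$ and $E(Y_2) = E(Y_1)(1 - p_2)/p_2$; the exact equations used in the paper are not recoverable from the text, so this choice is an assumption, although it reproduces the condition $s_1^2 > \bar{y}_1$. The function name and the toy sample in the usage part are made up for illustration.

```python
def mm_estimates(y1, y2):
    """Moment estimates (r, p1, p2) from paired samples y1, y2 (assumed equations)."""
    n = len(y1)
    ybar1 = sum(y1) / n
    ybar2 = sum(y2) / n
    s2_1 = sum((v - ybar1) ** 2 for v in y1) / (n - 1)  # sample variance of Y1
    if s2_1 <= ybar1:
        raise ValueError("moment estimate of r requires s1^2 > ybar1")
    p1_hat = ybar1 / s2_1                   # from E(Y1) / Var(Y1) = p1
    r_hat = ybar1 ** 2 / (s2_1 - ybar1)     # from E(Y1) = r (1 - p1) / p1
    p2_hat = ybar1 / (ybar1 + ybar2)        # from E(Y2) / E(Y1) = (1 - p2) / p2
    return r_hat, p1_hat, p2_hat

if __name__ == "__main__":
    # Toy numbers chosen only to exercise the formulas; not data from the paper.
    toy_y1 = [0, 1, 2, 5, 7]
    toy_y2 = [0, 2, 1, 6, 6]
    print(mm_estimates(toy_y1, toy_y2))
```

The explicit error for $s_1^2 \le \bar{y}_1$ mirrors the remark that $\hat{r}_{MM}$ is only meaningful for overdispersed counts.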
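The likelihood equations can be solved numerically along the following lines. The sketch assumes the joint pmf is the product of an $NB(r, p_1)$ pmf for $Y_1$ and an $NB(y_1, p_2)$ pmf for $Y_2$ given $Y_1 = y_1$; under that assumption $\hat{p}_2 = \sum y_{1i} / (\sum y_{1i} + \sum y_{2i})$, $\hat{p}_1 = \hat{r} / (\hat{r} + \bar{y}_1)$, and $\hat{r}$ maximizes a one-dimensional profile log-likelihood. A golden-section search over a fixed bracket is used here as a simple stand-in for whatever numerical method the authors employed; the bracket, the function names and the toy sample are all assumptions.

```python
import math

def nb_profile_loglik(r, y1):
    """Log-likelihood of NB(r, p1) in r with p1 replaced by r / (r + ybar1).
    Terms that do not depend on r are dropped."""
    n = len(y1)
    ybar1 = sum(y1) / n
    p1 = r / (r + ybar1)
    return (sum(math.lgamma(v + r) - math.lgamma(r) for v in y1)
            + n * r * math.log(p1) + n * ybar1 * math.log(1.0 - p1))

def mle_estimates(y1, y2, r_lo=1e-3, r_hi=1e3, iters=100):
    """Return (r_hat, p1_hat, p2_hat); assumes the maximizer of r lies in [r_lo, r_hi]."""
    a, b = math.log(r_lo), math.log(r_hi)     # search on the log scale
    g = (math.sqrt(5.0) - 1.0) / 2.0          # golden-ratio step
    for _ in range(iters):
        c, d = b - g * (b - a), a + g * (b - a)
        if nb_profile_loglik(math.exp(c), y1) > nb_profile_loglik(math.exp(d), y1):
            b = d                             # maximum lies in [a, d]
        else:
            a = c                             # maximum lies in [c, b]
    r_hat = math.exp(0.5 * (a + b))
    ybar1 = sum(y1) / len(y1)
    p1_hat = r_hat / (r_hat + ybar1)
    p2_hat = sum(y1) / (sum(y1) + sum(y2))
    return r_hat, p1_hat, p2_hat

if __name__ == "__main__":
    # Toy numbers chosen only to exercise the code; not data from the paper.
    print(mle_estimates([0, 1, 2, 5, 7], [0, 2, 1, 6, 6]))
```

Because $\hat{p}_2$ depends on the data only through the two sample totals, it coincides with the moment estimate based on $\bar{y}_1$ and $\bar{y}_2$, in agreement with the remark above.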

Numerical Example
For comparison purposes, the BGNBD was fitted to the same sets of accident data used by Leiter & Hamdan (1973) and Cacoullos & Papageorgiou (1980), i.e., the total number of injury accidents recorded during 639 days (in 1969 and 1970) in a 50-mile stretch of highway in eastern Virginia ($Y_1$), and the corresponding number of fatalities ($Y_2$) for individual years. We treat the data as three data sets. The first is the entire study; the second and third represent the total number of injury accidents in 1969 and 1970, respectively. Descriptive statistics of the considered data are presented in Table 1.
As the estimation criterion holds ($s_1^2 > \bar{y}_1$), we estimated the parameters using both the method of moments and maximum likelihood. The results are reported in Table 2. Comparing the MM and ML estimates of the parameters shows that they are quite similar. The estimated variance-covariance matrices of the maximum likelihood estimators are computed for each data set.
In order to investigate the performance of the BGNBD, we compared the fit of this model with the results of fitting the bivariate Poisson-Poisson (BPPD), bivariate binomial-Poisson (BBPD), bivariate geometric-Poisson (BGPD), and bivariate negative binomial-Poisson (BNBPD) distributions to the data (for more information about these distributions, see Alzaid et al. (2017)). The BBPD is fitted assuming different values of the parameter $m$, and the BNBPD assuming different values of the parameter $r$ for the first two data sets; in this case the moment estimates coincide with the maximum likelihood estimates. The fit of each model was measured using the Akaike information criterion (AIC), SSE values and the chi-square goodness-of-fit criterion, where the SSE is defined by $SSE = \sum_{\text{all } y_1, y_2} (\text{observed} - \text{expected})^2$. The observed and expected values for the bivariate models, along with the log-likelihood, AIC, $\chi^2$ values, degrees of freedom (d.f.), corresponding p-values and SSE, are given in Tables 3-5. Figure 1 shows the fitted distributions. The values of $\chi^2$ were computed after grouping the bolded cells in the table. The results show that the log-likelihood and AIC values of all the bivariate models are essentially the same. Note that the fit of the models BPPD, BBPD, BGPD and BNBPD is much better for the individual years than it is for the entire 639 days. It is obvious from the $\chi^2$ and SSE values in Table 3 that the models BPPD, BBPD, BGPD and BNBPD could not give a satisfactory fit to the data.
The fit by BGNBD yields smaller $\chi^2$ and SSE values than the other models, which implies that this model fits the data well; this is also reflected by the p-value. The same conclusion is reached from Table 4. The p-values of the models in Table 5 suggest acceptable fits, with the superiority of BGNBD as judged by its larger p-value and smaller SSE.
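The three fit measures used in this comparison can be computed from a table of observed and expected cell counts as sketched below. The AIC is taken in its usual form $2k - 2 \log L$ with $k$ the number of estimated parameters, and the chi-square statistic in its usual Pearson form; the text does not print either formula, so both are assumptions, and the sketch omits the grouping of sparse cells mentioned above. The function names and the small tables in the usage part are made up for illustration and are not the accident data.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion, 2k - 2 log L."""
    return 2.0 * n_params - 2.0 * log_likelihood

def sse(observed, expected):
    """Sum over all cells of (observed - expected)^2."""
    return sum((o - e) ** 2
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))

def pearson_chi2(observed, expected):
    """Sum over all cells of (observed - expected)^2 / expected (no cell grouping)."""
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))

if __name__ == "__main__":
    # Hypothetical 2 x 2 tables, used only to show the calls.
    toy_observed = [[1, 2], [3, 4]]
    toy_expected = [[1.5, 2.0], [2.0, 4.0]]
    print(sse(toy_observed, toy_expected))
    print(pearson_chi2(toy_observed, toy_expected))
    print(aic(-100.0, 3))  # the BGNBD has three parameters: r, p1, p2
```

In practice the expected counts would be the sample size multiplied by the fitted joint pmf in each cell, with sparse cells grouped before the chi-square statistic is formed.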

Conclusions
In this paper, the moments, cumulants and skewness of the univariate CGNBD are derived. Some monotonicity and distributional properties of the univariate CGNBD are provided. Then, the BGNBD is defined and some important probabilistic characteristics such as moments, cumulants, covariance and the coefficient of correlation are obtained. Some applications to accident data have been presented to illustrate the usage of the BGNBD. The results showed the superiority of the BGNBD over other competing models in the presented applications.