Applied Mathematics and Computation 217 (2010) 3069–3087
A class of bivariate negative binomial distributions with different index parameters in the marginals

Choung Min Ng (a), Seng-Huat Ong (a), H.M. Srivastava (b,*)

(a) Institute of Mathematical Sciences, University of Malaya, 50603 Kuala Lumpur, Malaysia
(b) Department of Mathematics and Statistics, University of Victoria, Victoria, British Columbia V8W 3R4, Canada

Keywords and Phrases: Extension of trivariate reduction; Meixner class of polynomials and Srivastava's triple hypergeometric series; Canonical expansions and quadrant dependence; Mixed Poisson distribution; Computer generation of bivariate samples; Multivariate extensions and goodness-of-fit

Abstract
In this paper, we consider a new class of bivariate negative binomial distributions having marginal distributions with different index parameters. This feature is useful in statistical modelling and simulation studies, where different marginal distributions and a specified correlation are required. This feature also makes the class more flexible than the existing bivariate generalizations of the negative binomial distribution, which have a common index parameter in the marginal distributions. Various interesting properties, such as canonical expansions and quadrant dependence, are obtained. The potential application of the proposed class of bivariate negative binomial distributions as a bivariate mixed Poisson distribution, and the computer generation of samples, are examined. Numerical examples as well as goodness-of-fit to simulated and real data are also given here in order to illustrate the application of this family of bivariate negative binomial distributions. © 2010 Elsevier Inc. All rights reserved.
1. Introduction

The univariate negative binomial distribution has the following probability mass function:

r(x) = [(m)_x / x!] p^x (1 - p)^m   (0 < p < 1; m > 0; x = 0, 1, 2, . . .),
where p is the probability parameter and m is the index (dispersion) parameter. The negative binomial distribution has important applications in various disciplines [17]. Consequently, there are a number of univariate generalizations (see, for example, Gupta and Ong [14]) and bivariate extensions of the negative binomial distribution [9,48]. Kocherlakota and Kocherlakota [22] and Mardia [31] have reviewed these bivariate negative binomial distributions as well as other bivariate generalizations of important univariate discrete distributions. Bivariate generalizations of important univariate distributions are of continuing interest (see, for example, Kundu and Gupta [24]). These bivariate generalizations can be constructed in a wide variety of ways, such as mixing or compounding, sampling, and trivariate reduction. The method of trivariate reduction, or random element in common, is a popular method of construction because of its simplicity and the ease of generating samples on a computer given a univariate generator. This method is described below. Given independent random variables Y1, Y2 and W from the same family, a bivariate generalization (X1, X2) is given by
X1 = Y1 + W   and   X2 = Y2 + W.   (1)
However, this method of genesis imposes a restrictive range on the correlation coefficient ρ.

* Corresponding author.
E-mail addresses: [email protected] (C.M. Ng), [email protected] (S.-H. Ong), [email protected] (H.M. Srivastava).

0096-3003/$ - see front matter © 2010 Elsevier Inc. All rights reserved. doi:10.1016/j.amc.2010.08.040
Apart from having a restricted range of the correlation coefficient, the bivariate extensions of the negative binomial distribution found in the existing literature also suffer from a lack of flexibility because their marginal distributions share the same index parameter m [9,32,48]. In many Monte Carlo simulation experiments, there is a need to vary the degree of dependence [7, p. 573] or to specify different marginal distributions, especially when the bivariate structure of the distribution is not well understood. Clearly, the possession of different marginal distributions makes a bivariate distribution more flexible for empirical modelling. In this paper, we introduce a class of bivariate negative binomial distributions, based upon an extension of trivariate reduction, which have different marginal distributions. The proposed bivariate negative binomial distribution may be formulated as a mixed Poisson model. Mixed Poisson models form a useful class of distributions in practical applications (see, for example, Johnson et al. [17]). Applications of bivariate mixed Poisson distributions have been examined by (among others) Stein and Juritz [47], Aitchison and Ho [1] and Chib and Winkelmann [5]. Edwards and Gurland [9] and Subrahmaniam [48] have considered their bivariate negative binomial distributions for modelling accident proneness in the context of the bivariate mixed Poisson model. We examine the potential application of the proposed extended bivariate negative binomial distribution as a bivariate mixed Poisson distribution, the computer generation of samples, and its application to data fitting. Various interesting properties, such as canonical expansion and quadrant dependence, will also be considered. Quadrant dependence [29] is a useful concept of bivariate dependence which is easier to verify than the usual linear dependence.
As special cases, the bivariate negative binomial distributions of Edwards and Gurland [9] and Subrahmaniam [48] are shown to be quadrant dependent. A canonical expansion of a bivariate distribution is a single series expansion in terms of its marginal distributions and the corresponding orthogonal polynomials. The class of bivariate distributions formulated from the extension of trivariate reduction is shown to have canonical expansions. This extends the result of Eagleson [8]. For a particular case of the proposed extended bivariate negative binomial distribution, the explicit canonical expansion of the joint probability distribution is derived. The canonical expansion of a bivariate probability density function is a useful tool in the study of the structure of bivariate distributions [23]. Recently, Cuadras [6] derived a canonical expansion in terms of distribution functions. The pertinent theory of canonical expansions of bivariate probability density functions is outlined in the Appendix. As an example, the bivariate negative binomial distribution of Edwards and Gurland [9] and Subrahmaniam [48], with the joint probability generating function given by
G(z1, z2) = Σ_{x,y} z1^x z2^y h(x, y) = [Θ/(1 - θ1 z1 - θ2 z2 - θ3 z1 z2)]^m

(Θ = 1 - θ1 - θ2 - θ3; θi ≥ 0 (i = 1, 2, 3); Θ ≤ 1), and the joint bivariate probability mass function given by
h(x, y) = Θ^m Σ_{i=0}^{min(x,y)} [Γ(m + x + y - i)/(Γ(m) i! (x - i)! (y - i)!)] θ1^{x-i} θ2^{y-i} θ3^i
has the canonical (diagonal) expansion of the bivariate probability mass function [37] given by

h(x, y) = f(x) g(y) Σ_{i=0}^{∞} ρ_i m̃_i(x; m, (θ1 + θ3)/(1 - θ2)) m̃_i(y; m, (θ2 + θ3)/(1 - θ1)),
where

f(x) = [(m)_x / x!] [Θ/(1 - θ2)]^m [(θ1 + θ3)/(1 - θ2)]^x,   g(y) = [(m)_y / y!] [Θ/(1 - θ1)]^m [(θ2 + θ3)/(1 - θ1)]^y
and
m̃_i(x; m, p) = m_i(x; m, p)/√((m)_i i! p^{-i})

is the ith orthonormal Meixner polynomial, with the ith Meixner polynomial defined by

m_i(x; m, p) = (m)_i 2F1(-i, -x; m; 1 - 1/p)

in terms of the Gauss hypergeometric function 2F1(a, b; c; z) defined by

2F1(a, b; c; z) = Σ_{n=0}^{∞} [(a)_n (b)_n/(c)_n] z^n/n!   (|z| < 1).
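The orthonormality of m̃_i under the negative binomial weight can be checked directly; in this sketch (helper names are ours) the terminating 2F1 series is summed exactly as written above.

```python
import math

def poch(a, n):
    # Pochhammer symbol (a)_n = a(a+1)...(a+n-1); valid for negative integer a too
    out = 1.0
    for k in range(n):
        out *= a + k
    return out

def meixner(i, x, m, p):
    # m_i(x; m, p) = (m)_i * 2F1(-i, -x; m; 1 - 1/p); the 2F1 series terminates
    z = 1.0 - 1.0 / p
    s = sum(poch(-i, n) * poch(-x, n) / poch(m, n) * z ** n / math.factorial(n)
            for n in range(min(i, x) + 1))
    return poch(m, i) * s

def meixner_orthonormal(i, x, m, p):
    # normalized so that sum_x f(x) m~_i(x) m~_j(x) = delta_ij,
    # using the norm (m)_i i! p^{-i}
    return meixner(i, x, m, p) / math.sqrt(poch(m, i) * math.factorial(i) * p ** (-i))
```

Summing f(x) m̃_i(x) m̃_j(x) over the NB weight numerically reproduces the Kronecker delta, which is the property used for the canonical expansions below.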
Here, and in what follows, (λ)_ν denotes the Pochhammer symbol defined, in terms of the familiar Gamma function, by

(λ)_ν := Γ(λ + ν)/Γ(λ) = { 1   (ν = 0; λ ∈ ℂ \ {0});   λ(λ + 1) · · · (λ + n - 1)   (ν = n ∈ ℕ; λ ∈ ℂ) },

ℕ and ℂ being, as usual, the sets of positive integers and complex numbers, respectively, it being understood conventionally that (0)_0 := 1.

This paper is organized as follows. In Section 2, an extension of the bivariate negative binomial distribution is proposed and formally defined, which is based upon an extension of the method of trivariate reduction and as a bivariate mixed Poisson distribution. The multivariate extension is also given. Distributional properties are considered in Section 3. A result on the canonical expansion of distributions derived from the extension of trivariate reduction is given in Section 4. This extends the result of Eagleson [8] for the Meixner class of distributions. The canonical expansion of the extended bivariate negative binomial distribution is then derived. In Section 5, quadrant dependence of the extended bivariate negative binomial distribution is considered. Section 6 discusses the computer generation of bivariate samples with varying dependence. Applications and numerical examples as well as goodness-of-fit are considered in Section 7. Several concluding remarks and observations are presented in Section 8.

2. A new bivariate negative binomial distribution

2.1. Extended trivariate reduction

The method of trivariate reduction given by (1) may be extended by considering
X1 = Y1 + W1   and   X2 = Y2 + W2,   (2)
where (W1, W2) is a pair of correlated random elements independent of Y1 and Y2. Lai [25] examined the situation where
W1 = I1 W   and   W2 = I2 W,
I1 and I2 being the indicator random variables such that (I1, I2) has a joint probability distribution (p00, p01, p10, p11). For the genesis by (2), the general form of the bivariate probability generating function is given by
G_(X1,X2)(z1, z2) = G_Y1(z1; a1, p1) G_Y2(z2; a2, p2) G_(W1,W2)(z1, z2; m, θ1, θ2, θ3),   (3)
where G_Y1, G_Y2 and G_(W1,W2) are the corresponding probability generating functions of Y1, Y2 and (W1, W2). Furthermore, Y1, Y2, W1 and W2 are from the same family of univariate distributions, with (W1, W2) being jointly distributed. We note that, if (W1, W2) has a distribution formulated by trivariate reduction, then (X1, X2) again has a distribution formed by the usual trivariate reduction. We now consider the extension of the bivariate negative binomial distribution when (W1, W2) is distributed as the bivariate negative binomial distribution of Edwards and Gurland [9].

2.1.1. Extension of the bivariate negative binomial distribution

Let
Y1 ~ NB(a1, p1) and Y2 ~ NB(a2, p2)

denote negative binomial random variables and let

(W1, W2) ~ BNB(m, θ1, θ2, θ3)

be the Edwards-Gurland bivariate negative binomial or compound correlated bivariate Poisson random variable with the joint probability generating function given by
G_(W1,W2)(z1, z2) = [Θ/(1 - θ1 z1 - θ2 z2 - θ3 z1 z2)]^m   (Θ = 1 - θ1 - θ2 - θ3).
We note that this bivariate negative binomial distribution has a correlation coefficient ρ in the range 0 ≤ ρ ≤ 1. When θ3 = 0, this reduces to the bivariate negative binomial distribution which may be formulated through inverse sampling (see Kocherlakota and Kocherlakota [22, p. 122]). By (3), this extended bivariate negative binomial distribution of (X1, X2) has the joint probability generating function
G_(X1,X2)(z1, z2) = [q1/(1 - p1 z1)]^{a1} [q2/(1 - p2 z2)]^{a2} [Θ/(1 - θ1 z1 - θ2 z2 - θ3 z1 z2)]^m,   (4)

where

p1 = (θ1 + θ3)/(1 - θ2),   p2 = (θ2 + θ3)/(1 - θ1),   qi = 1 - pi   (i = 1, 2)
and

Θ = 1 - θ1 - θ2 - θ3,

with the marginal probability generating functions of X1 and X2 given by

G_Xi(z) = [qi/(1 - pi z)]^{ai+m}   (i = 1, 2),

that is,

X1 ~ NB(a1 + m, p1) and X2 ~ NB(a2 + m, p2).

This extended bivariate negative binomial distribution has a correlation in the range 0 ≤ ρ ≤ 1, as seen from Eq. (12), because the term (θ3 + θ1 θ2) is positive.

2.1.2. Multivariate extensions

The following formulation:
X1 = Y1 + W1 and X2 = Y2 + W2

can be extended to develop multivariate distributions. A multivariate negative binomial distribution (MNB) with different parameters in the marginals is obtained as follows. Let

Xi = Yi + Wi   (i = 1, 2, . . . , k),
where Y1, Y2, . . . , Yk are independent negative binomial random variables and

W = (W1, W2, . . . , Wk) ~ MNB,

with the probability generating function given by

G_W(z1, z2, . . . , zk) = [Θ/(1 - Σ_{i=1}^{k} θi zi - ΣΣ_{1≤i<j≤k} θij zi zj - · · · - θ12...k z1 z2 · · · zk)]^m,

where

Θ = 1 - Σ_{i=1}^{k} θi - ΣΣ_{1≤i<j≤k} θij - · · · - θ12...k.
We thus find that

X = (X1, X2, . . . , Xk) ~ MNB

has its probability generating function given by

G_X(z1, z2, . . . , zk) = Π_{i=1}^{k} [qi/(1 - pi zi)]^{ai} [Θ/(1 - Σ_{i=1}^{k} θi zi - ΣΣ_{1≤i<j≤k} θij zi zj - · · · - θ12...k z1 z2 · · · zk)]^m,

with the marginal probability generating functions given by

G_Xi(z) = [qi/(1 - pi z)]^{ai+m},
where

φi = θi + Σ_{j=1 (j≠i)}^{k} θij + ΣΣ_{1≤r<s≤k (r≠i, s≠i)} θirs + · · · + θ12...k,

pi = φi/(Θ + φi)   and   qi = 1 - pi   (i = 1, 2, . . . , k).

Therefore, each of the marginals

Xi ~ NB(ai + m, pi)   (i = 1, 2, . . . , k)

will have different parameters.
2.2. Formulation as a mixed Poisson distribution

(X1, X2) is said to have a mixed Poisson distribution if its joint probability mass function is given by

h(x1, x2; ξ) = ∫ f_X1(x1 | θ) f_X2(x2 | θ) g(θ; ξ) dθ,

where f_Xi(xi | θ) is the Poisson probability mass function with parameter θ, and where θ is a realization of a random variable with mixing distribution having probability density function g(θ; ξ) with a vector of parameters ξ. An extension of this is provided by

h(x1, x2; ξ) = ∫∫ f_X1(x1 | θ1) f_X2(x2 | θ2) g(θ1, θ2; ξ) dθ1 dθ2,   (5)
where θ1 and θ2 are jointly distributed with probability density function g(θ1, θ2; ξ) (see, for example, Ong [36]). We show that the extended bivariate negative binomial distribution can be formulated as a mixed Poisson distribution given by (5). The bivariate gamma distribution considered by Gupta [11] has its moment generating function given by
M_(U,V)(t1, t2) = [β1/(β1 - t1)]^{a1-m} [β2/(β2 - t2)]^{a2-m} [(β1 - t1)(β2 - t2)/(β1 β2) - ρ² t1 t2/(β1 β2)]^{-m}   (0 ≤ ρ² ≤ 1; m < min(a1, a2)).   (6)
It may be formulated by the extended trivariate reduction (2), where

Y1 ~ Γ(a1 - m, β1) and Y2 ~ Γ(a2 - m, β2)

are independent gamma random variables with ai - m > 0 (i = 1, 2) and

(W1, W2) ~ BΓ(m, β1, β2)

has the Wicksell-Kibble bivariate gamma distribution [21]. Let X ~ Poi(λ) denote the Poisson random variable with parameter λ. The mixed Poisson formulation is given as follows. Suppose that
X1 | U ~ Poi(U) and X2 | V ~ Poi(V),

where U and V have the joint bivariate gamma distribution given by (6). Then, clearly, the unconditional (X1, X2) has the extended bivariate negative binomial distribution. The formulation is easily proved by using the following relation between the moment generating function of the mixing distribution and the probability generating function of the mixed distribution (see Ong [34]):

G_(X1,X2)(z1, z2) = M_(U,V)(z1 - 1, z2 - 1).

This leads to the extended bivariate negative binomial probability generating function given by
G_(X1,X2)(z1, z2) = [q1/(1 - p1 z1)]^{a1-m} [q2/(1 - p2 z2)]^{a2-m} [Θ/(1 - θ1 z1 - θ2 z2 - θ3 z1 z2)]^m,   (7)

where

pi = 1/(1 + βi) and qi = 1 - pi   (i = 1, 2),

θ1 = p1(1 - ρ² p2)/(1 - ρ² p1 p2),   θ2 = p2(1 - ρ² p1)/(1 - ρ² p1 p2),   θ3 = -p1 p2 (1 - ρ²)/(1 - ρ² p1 p2)

and

Θ = 1 - θ1 - θ2 - θ3 = q1 q2/(1 - ρ² p1 p2).

We note that

-1 < θ3 < 0 with θ3 + θ1 θ2 > 0.

Rewriting in terms of θ1, θ2 and θ3, we have

p1 = (θ1 + θ3)/(1 - θ2) and p2 = (θ2 + θ3)/(1 - θ1).
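The mixed Poisson genesis also suggests a simple simulation route: draw a bivariate gamma pair, then conditionally independent Poisson counts. For brevity the sketch below uses a shared-component bivariate gamma rather than Gupta's distribution (6) (an assumption on our part), but it still yields NB(ai, pi) marginals with pi = 1/(1 + βi) and positive correlation; like (6), it requires m < min(a1, a2).

```python
import math
import random

def poisson_sample(lam, rng):
    # Knuth's method; adequate for the moderate means used here
    k, prod, target = 0, rng.random(), math.exp(-lam)
    while prod > target:
        k += 1
        prod *= rng.random()
    return k

def mixed_poisson_pair(a1, a2, m, b1, b2, rng):
    # shared-component bivariate gamma (NOT Gupta's (6); a simplifying assumption):
    # U = (Z0 + Z1)/b1 ~ Gamma(a1, rate b1), V = (Z0 + Z2)/b2 ~ Gamma(a2, rate b2)
    z0 = rng.gammavariate(m, 1.0)
    z1 = rng.gammavariate(a1 - m, 1.0)
    z2 = rng.gammavariate(a2 - m, 1.0)
    u, v = (z0 + z1) / b1, (z0 + z2) / b2
    # X1 | U ~ Poi(U), X2 | V ~ Poi(V): marginally X_i ~ NB(a_i, 1/(1 + b_i))
    return poisson_sample(u, rng), poisson_sample(v, rng)
```

The shared component Z0 induces Cov(X1, X2) = m/(b1 b2) > 0 while leaving the gamma (and hence NB) marginals intact.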
The marginal probability generating functions of X1 and X2 are given by

G_X1(z1) = [q1/(1 - p1 z1)]^{a1} and G_X2(z2) = [q2/(1 - p2 z2)]^{a2},

which yields

X1 ~ NB(a1, p1) and X2 ~ NB(a2, p2).

As a special case of this extended bivariate negative binomial distribution, the bivariate negative binomial distribution with (U, V) having the Wicksell-Kibble bivariate gamma distribution (see Ong [34]) is included, that is, when

a1 = a2 = m

in (6) and (7).

2.3. A general form of the extended bivariate negative binomial distribution

In the mixed Poisson formulation, it is found that -1 < θ3 < 0, whereas by formulation (2), 0 ≤ θ3 < 1. Hence, an extended bivariate negative binomial distribution with -1 < θ3 < 1 can now be defined.

Definition. The joint probability generating function of the extended bivariate negative binomial distribution is given by
G_(X1,X2)(z1, z2) = [q1/(1 - p1 z1)]^{a1} [q2/(1 - p2 z2)]^{a2} [Θ/(1 - θ1 z1 - θ2 z2 - θ3 z1 z2)]^m,   (8)

with the following parameters:

p1 = (θ1 + θ3)/(1 - θ2),   p2 = (θ2 + θ3)/(1 - θ1),   qi = 1 - pi   (i = 1, 2)

and

Θ = 1 - θ1 - θ2 - θ3,

and the following restrictions:

0 < p1, p2, θ1, θ2, Θ < 1;   -1 < θ3 < 1;   θ3 + θ1 > 0;   θ3 + θ2 > 0;   θ3 + θ1 θ2 > 0;   a1, a2, m > 0.

This is obtained by combining (4) and (7) (by substituting a1 - m for a1 and a2 - m for a2). We note that the marginal distributions

X1 ~ NB(a1 + m, p1) and X2 ~ NB(a2 + m, p2)

have different parameters. The correlation is in the range (0, 1), as indicated by Eq. (12).
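The definition can be checked numerically: (8) must equal 1 at z1 = z2 = 1, a central difference in z1 recovers the marginal mean (a1 + m)p1/q1, and the mixed second derivative at (1, 1) gives E[X1 X2], from which the correlation should agree with formula (12) of Section 3.2. A sketch with our own helper names:

```python
import math

def ebnb_pgf(z1, z2, a1, a2, m, t1, t2, t3):
    # joint pgf (8) with p1 = (t1+t3)/(1-t2), p2 = (t2+t3)/(1-t1)
    p1 = (t1 + t3) / (1.0 - t2)
    p2 = (t2 + t3) / (1.0 - t1)
    q1, q2 = 1.0 - p1, 1.0 - p2
    theta = 1.0 - t1 - t2 - t3
    return ((q1 / (1.0 - p1 * z1)) ** a1
            * (q2 / (1.0 - p2 * z2)) ** a2
            * (theta / (1.0 - t1 * z1 - t2 * z2 - t3 * z1 * z2)) ** m)

def ebnb_corr(a1, a2, m, t1, t2, t3):
    # correlation coefficient, Eq. (12)
    num = m * (t3 + t1 * t2)
    den = math.sqrt((a1 + m) * (a2 + m) * (1 - t1) * (1 - t2)
                    * (t1 + t3) * (t2 + t3))
    return num / den
```

The finite-difference check below is crude but adequate, since the pgf is smooth near (1, 1).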
3. Distributional properties

This section discusses the distributional properties of the extended bivariate negative binomial distribution obtained through the extended trivariate reduction and mixed Poisson formulations.

3.1. Joint probability mass function

In order to obtain the probability mass function of the extended bivariate negative binomial distribution from the corresponding probability generating function, the right-hand side of Eq. (4) is expanded in powers of z1 and z2. The coefficient of the term z1^{x1} z2^{x2} gives the probability mass function, denoted by
Pr(X1 = x1, X2 = x2) = f(x1, x2)

as follows:

f(x1, x2) = f_X1(x1) f_X2(x2) [Θ/(q1 q2)]^m Σ_{r=0}^{x1} Σ_{s=0}^{x2} [(a1)_{x1-r}/(a1 + m)_{x1}] C(x1, r) (θ1/p1)^r [(a2)_{x2-s}/(a2 + m)_{x2}] C(x2, s) (θ2/p2)^s Σ_{i=0}^{min(r,s)} (m)_{r+s-i} C(r, i) C(s, i) i! [θ3/(θ1 θ2)]^i   (x1, x2 = 0, 1, 2, . . .),

where C(n, r) denotes the binomial coefficient,
and where

f_Xi(xi) = [(ai + m)_{xi}/xi!] pi^{xi} qi^{ai+m}   (i = 1, 2)

are the probability mass functions of the marginal distributions of X1 and X2; the parameters are as defined in Section 2.1.1. Another way to obtain this probability mass function is to differentiate the given probability generating function repeatedly and then extract the coefficient of z1^{x1} z2^{x2} [22, p. 2]:

f(x1, x2) = [1/(x1! x2!)] ∂^{x1+x2}/∂z1^{x1} ∂z2^{x2} G_(X1,X2)(z1, z2) |_{z1=z2=0}.
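For 0 ≤ θ3 < 1 the genesis (2) offers an equivalent and easily coded route to f(x1, x2): a double convolution of two univariate negative binomial mass functions with the Edwards-Gurland mass function. A sketch (helper names are ours):

```python
import math

def poch(a, n):
    # Pochhammer symbol (a)_n
    out = 1.0
    for k in range(n):
        out *= a + k
    return out

def nb_pmf(x, m, p):
    return poch(m, x) / math.factorial(x) * p ** x * (1.0 - p) ** m

def eg_pmf(x, y, m, t1, t2, t3):
    # Edwards-Gurland mass function (finite single sum; t3 >= 0 assumed)
    theta = 1.0 - t1 - t2 - t3
    return theta ** m * sum(
        poch(m, x + y - i)
        / (math.factorial(i) * math.factorial(x - i) * math.factorial(y - i))
        * t1 ** (x - i) * t2 ** (y - i) * t3 ** i
        for i in range(min(x, y) + 1))

def ebnb_pmf(x1, x2, a1, a2, m, t1, t2, t3):
    # f(x1, x2) as the double convolution implied by X1 = Y1 + W1, X2 = Y2 + W2
    p1 = (t1 + t3) / (1.0 - t2)
    p2 = (t2 + t3) / (1.0 - t1)
    return sum(nb_pmf(x1 - r, a1, p1) * nb_pmf(x2 - s, a2, p2)
               * eg_pmf(r, s, m, t1, t2, t3)
               for r in range(x1 + 1) for s in range(x2 + 1))
```

Summing ebnb_pmf over one argument numerically recovers the NB(ai + m, pi) marginal, as it should.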
For the particular case when θ3 = 0, f(x1, x2) reduces to the following form:

f(x1, x2) = f_Y1(x1) f_Y2(x2) Θ^m F2[m; -x1, -x2; 1 - a1 - x1, 1 - a2 - x2; 1 - θ2, 1 - θ1],

where F2 is the second-kind Appell function of two variables [43, p. 53]:

F2(a; b, b′; c, c′; x, y) = Σ_{m,n=0}^{∞} [(a)_{m+n} (b)_m (b′)_n/((c)_m (c′)_n)] x^m y^n/(m! n!)   (|x| + |y| < 1)

and

f_Yi(xi) = [(ai)_{xi}/xi!] pi^{xi} qi^{ai}   (i = 1, 2).
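Because b = -x1 and b′ = -x2 are negative integers, the F2 series terminates, and the θ3 = 0 identity above can be checked exactly against the convolution form (helper names are ours; a1, a2 are taken non-integer so that no denominator parameter of F2 hits a non-positive integer):

```python
import math

def poch(a, n):
    out = 1.0
    for k in range(n):
        out *= a + k
    return out

def nb_pmf(x, m, p):
    return poch(m, x) / math.factorial(x) * p ** x * (1.0 - p) ** m

def appell_f2(a, b1, b2, c1, c2, x, y, nmax):
    # F2(a; b1, b2; c1, c2; x, y); terminates when b1, b2 are negative integers
    return sum(poch(a, mm + n) * poch(b1, mm) * poch(b2, n)
               / (poch(c1, mm) * poch(c2, n) * math.factorial(mm) * math.factorial(n))
               * x ** mm * y ** n
               for mm in range(nmax + 1) for n in range(nmax + 1))

def ebnb_pmf_t3zero(x1, x2, a1, a2, m, t1, t2):
    # convolution form of f(x1, x2) when theta3 = 0
    p1, p2 = t1 / (1.0 - t2), t2 / (1.0 - t1)
    theta = 1.0 - t1 - t2
    return sum(nb_pmf(x1 - r, a1, p1) * nb_pmf(x2 - s, a2, p2)
               * theta ** m * poch(m, r + s)
               / (math.factorial(r) * math.factorial(s)) * t1 ** r * t2 ** s
               for r in range(x1 + 1) for s in range(x2 + 1))
```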
The extended bivariate negative binomial probability mass function can also be expressed in terms of Srivastava's general triple hypergeometric series F^(3)[x, y, z] [43, p. 69] by using the bivariate mixed Poisson formulation given in Section 2.3. We thus get
(9)
where

φ = ρ²/(1 - ρ²) and pi = 1/(1 + βi)   (i = 1, 2)
and (see, for example, [41, p. 44, Eqs. (14) and (15)])

F^(3)[x, y, z] = F^(3)[(a) :: (b); (b′); (b″) : (c); (c′); (c″); (e) :: (g); (g′); (g″) : (h); (h′); (h″); x, y, z] = Σ_{ℓ,m,n=0}^{∞} K(ℓ, m, n) x^ℓ/ℓ! y^m/m! z^n/n!,   (10)

where, for convenience,

K(ℓ, m, n) = [Π_{j=1}^{A} (aj)_{ℓ+m+n} Π_{j=1}^{B} (bj)_{ℓ+m} Π_{j=1}^{B′} (b′j)_{m+n} Π_{j=1}^{B″} (b″j)_{ℓ+n} Π_{j=1}^{C} (cj)_ℓ Π_{j=1}^{C′} (c′j)_m Π_{j=1}^{C″} (c″j)_n] / [Π_{j=1}^{E} (ej)_{ℓ+m+n} Π_{j=1}^{G} (gj)_{ℓ+m} Π_{j=1}^{G′} (g′j)_{m+n} Π_{j=1}^{G″} (g″j)_{ℓ+n} Π_{j=1}^{H} (hj)_ℓ Π_{j=1}^{H′} (h′j)_m Π_{j=1}^{H″} (h″j)_n],

provided that the defining triple hypergeometric series in (10) converges. Details of the derivation of the probability mass function (9) are given in the Appendix.

3.2. Factorial moments and correlation

By differentiating the probability generating function (4) repeatedly, we obtain
G^{(x1,x2)}(z1, z2) = G_(X1,X2)(z1, z2) x1! x2! Σ_{r=0}^{x1} Σ_{s=0}^{x2} [(a1)_{x1-r}/(x1 - r)!] [p1/(1 - p1 z1)]^{x1-r} [(a2)_{x2-s}/(x2 - s)!] [p2/(1 - p2 z2)]^{x2-s} Σ_{i=0}^{min(r,s)} [(m)_{r+s-i}/(i! (r - i)! (s - i)!)] [(θ1 + θ3 z2)/(1 - θ1 z1 - θ2 z2 - θ3 z1 z2)]^{r-i} [(θ2 + θ3 z1)/(1 - θ1 z1 - θ2 z2 - θ3 z1 z2)]^{s-i} [θ3/(1 - θ1 z1 - θ2 z2 - θ3 z1 z2)]^i
and this gives the factorial moments as

μ_[x1,x2]^(X1,X2) = G^{(x1,x2)}(1, 1) = μ_[x1]^(X1) μ_[x2]^(X2) Σ_{r=0}^{x1} Σ_{s=0}^{x2} [(a1)_{x1-r}/(a1 + m)_{x1}] [(a2)_{x2-s}/(a2 + m)_{x2}] C(x1, r) C(x2, s) Σ_{i=0}^{min(r,s)} (m)_{r+s-i} C(r, i) C(s, i) i! [Θ θ3/((θ1 + θ3)(θ2 + θ3))]^i,   (11)

where

μ_[xi]^(Xi) = (ai + m)_{xi} (pi/qi)^{xi}   (i = 1, 2)

are the factorial moments of the marginal distributions of X1 and X2, respectively. From (11) and the factorial moments of the marginals X1 and X2, the correlation coefficient is found to be
ρ_(X1,X2) = m(θ3 + θ1 θ2)/√((a1 + m)(a2 + m)(1 - θ1)(1 - θ2)(θ1 + θ3)(θ2 + θ3)) = [m/√((a1 + m)(a2 + m))] ρ_(W1,W2),   (12)

where ρ_(W1,W2) is the correlation coefficient of (W1, W2).

3.3. Conditional distributions and regressions

The probability generating function of the conditional distribution of X1 (given X2 = x2) can be found by using
G_X1(z | x2) = G^{(0,x2)}(z, 0)/G^{(0,x2)}(1, 0),

where

G^{(x1,x2)}(u, v) = ∂^{x1+x2}/∂z1^{x1} ∂z2^{x2} G_(X1,X2)(z1, z2) |_{z1=u, z2=v}.
Consequently, we have

G_X1(z | x2) = [(1 - p1)/(1 - p1 z)]^{a1} Σ_{s=0}^{x2} [Pr(Y2 = x2 - s) Pr(W2 = s)/Pr(X2 = x2)] [1 + (θ3/(θ2 + θ3))(z - 1)]^s [(1 - θ1)/(1 - θ1 z)]^{m+s}.   (13)
From (13), the conditional distribution of X1 given X2 = x2 is observed to be the convolution of V1 and V2, where

V1 ~ NB(a1, p1)

and V2 is a finite mixture of the convolutions

(U_1s + U_2s)   (s = 0, 1, . . . , x2),

with

U_1s ~ Bin(s, θ3/(θ2 + θ3)) and U_2s ~ NB(m + s, θ1)

when 0 < θ3 < 1. In the case when -1 < θ3 < 0, V2 is a mixture of convolutions of pseudo-binomial and negative binomial random variables, as described in [20]. The regression of X1 on X2 is thus shown to be as follows:
E[X1 | X2 = x2] = E[V1] + E[V2]
= a1 p1/q1 + Σ_{s=0}^{x2} [Pr(Y2 = x2 - s) Pr(W2 = s)/Pr(X2 = x2)] (E[U_1s] + E[U_2s])
= a1 p1/q1 + Σ_{s=0}^{x2} [Pr(Y2 = x2 - s) Pr(W2 = s)/Pr(X2 = x2)] [m θ1/(1 - θ1) + s(θ3 + θ1 θ2)/((1 - θ1)(θ2 + θ3))].
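The regression displayed above can be validated against the conditional mean computed directly from the joint mass function (in its convolution form, valid for θ3 ≥ 0); the helper names below are ours:

```python
import math

def poch(a, n):
    out = 1.0
    for k in range(n):
        out *= a + k
    return out

def nb_pmf(x, m, p):
    return poch(m, x) / math.factorial(x) * p ** x * (1.0 - p) ** m

def eg_pmf(x, y, m, t1, t2, t3):
    theta = 1.0 - t1 - t2 - t3
    return theta ** m * sum(
        poch(m, x + y - i)
        / (math.factorial(i) * math.factorial(x - i) * math.factorial(y - i))
        * t1 ** (x - i) * t2 ** (y - i) * t3 ** i
        for i in range(min(x, y) + 1))

def ebnb_pmf(x1, x2, a1, a2, m, t1, t2, t3):
    p1 = (t1 + t3) / (1.0 - t2)
    p2 = (t2 + t3) / (1.0 - t1)
    return sum(nb_pmf(x1 - r, a1, p1) * nb_pmf(x2 - s, a2, p2)
               * eg_pmf(r, s, m, t1, t2, t3)
               for r in range(x1 + 1) for s in range(x2 + 1))

def regression_closed_form(x2, a1, a2, m, t1, t2, t3):
    # E[X1 | X2 = x2] from the display above; weights use Y2 ~ NB(a2, p2),
    # W2 ~ NB(m, p2) and X2 ~ NB(a2 + m, p2)
    p1 = (t1 + t3) / (1.0 - t2)
    p2 = (t2 + t3) / (1.0 - t1)
    q1 = 1.0 - p1
    px2 = nb_pmf(x2, a2 + m, p2)
    total = a1 * p1 / q1
    for s in range(x2 + 1):
        w = nb_pmf(x2 - s, a2, p2) * nb_pmf(s, m, p2) / px2
        total += w * (m * t1 / (1.0 - t1)
                      + s * (t3 + t1 * t2) / ((1.0 - t1) * (t2 + t3)))
    return total
```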
We note here that V1 is equivalent to Y1 ~ NB(a1, p1) and that the convolution of U_1s and U_2s gives the conditional distribution of W1 (given W2 = s). Similarly, the probability generating function of the conditional distribution of X2 (given X1 = x1), as well as the regression of X2 on X1, are obtained as follows:
G_X2(z | x1) = [(1 - p2)/(1 - p2 z)]^{a2} Σ_{s=0}^{x1} [Pr(Y1 = x1 - s) Pr(W1 = s)/Pr(X1 = x1)] [1 + (θ3/(θ1 + θ3))(z - 1)]^s [(1 - θ2)/(1 - θ2 z)]^{m+s}
and

E[X2 | X1 = x1] = a2 p2/q2 + Σ_{s=0}^{x1} [Pr(Y1 = x1 - s) Pr(W1 = s)/Pr(X1 = x1)] [m θ2/(1 - θ2) + s(θ3 + θ1 θ2)/((1 - θ2)(θ1 + θ3))].
Furthermore, we have

E[X1^k | X2 = x2] = Σ_{i=0}^{k} Σ_{s=0}^{x2} C(k, i) [Pr(Y2 = x2 - s) Pr(W2 = s)/Pr(X2 = x2)] E[Y1^{k-i}] E[W1^i | W2 = s].
4. Canonical expansion of bivariate distributions formed by extended trivariate reduction

Many well-known bivariate generalizations of the univariate Poisson, binomial, negative binomial, normal and gamma distributions are constructed by trivariate reduction. The aforementioned univariate distributions, considered as weights for orthogonal polynomials, have generating functions for their orthogonal polynomials of the form G(t, x) = f(t) e^{x u(t)}. Eagleson [8] has shown that their bivariate distributions obtained from trivariate reduction have canonical expansions (see Barrett and Lampard [3]; see also Lancaster [26]). Theorem 4.1 from Eagleson [8] is given below as a reference.

Theorem (see Eagleson [8, Theorem 4.1]). If, for a particular distribution,

(i) the orthogonal polynomials are generated by a function of the form

G(t, x) = f(t) e^{x u(t)},

where f(t) is a power series in t with f(0) = 1 and u(t) is a power series in t with u(0) = 0 and u′(0) = 1,

(ii) the distribution is additive, and

(iii) a bivariate distribution is generated by using the additive property (1),

then the matrix of correlations of the pairs of orthonormal polynomials on the marginals is diagonal. Further, ρ_rr = ρ_r depends only on the normalizing factor of the rth orthogonal polynomial.

The following result extends Eagleson's result (the above Theorem) to bivariate distributions which can be formed by the extended trivariate reduction given in Section 2.1.

Result 1. If (W1, W2) has a bivariate distribution with a canonical expansion in terms of orthogonal polynomials, then another bivariate distribution (X1, X2) generated by using the additive property
X1 = Y1 + W1 and X2 = Y2 + W2,

where Y1 and Y2 are independent, also has a canonical expansion in terms of orthogonal polynomials.

Proof. Let ξ̃_r(u) denote the orthonormalized version of the rth orthogonal polynomial ξ_r(u) on a distribution U, and let

c_r^(U) = ∫_{-∞}^{∞} ξ_r²(u) dF(u),

where F(u) is the distribution function of U. We also let ρ_rs^(U,V) denote the correlation of an (r, s) pair of such orthonormal polynomials on the marginals of the bivariate distribution of (U, V), that is,

ρ_rs^(U,V) = E[ξ̃_r(u) ξ̃_s(v)].

Extending the above Theorem of Eagleson (see [8, p. 1211]), we obtain
[c_r^(X1) c_s^(X2)]^{1/2} ρ_rs^(X1,X2) = ∫ ξ_r(x1) ξ_s(x2) dF(x1, x2)

= ∫∫∫ [Σ_{i=0}^{r} C(r, i) ξ_i(y1) ξ_{r-i}(w1)] [Σ_{j=0}^{s} C(s, j) ξ_j(y2) ξ_{s-j}(w2)] dF1(y1) dF2(y2) dF12(w1, w2)

= Σ_{i=0}^{r} Σ_{j=0}^{s} C(r, i) C(s, j) [∫∫ ξ_i(y1) ξ_j(y2) dF1(y1) dF2(y2)] [∫ ξ_{r-i}(w1) ξ_{s-j}(w2) dF12(w1, w2)],

where C(r, i) denotes the binomial coefficient.
Using a corollary from Lancaster's work [27, p. 535], which states that a necessary and sufficient condition for the independence of the marginal variables of a bivariate statistical distribution is that

ρ_ij = 0   (i > 0; j > 0),
together with another result of Lancaster [27, p. 532, Theorem A], which states that

∫ ξ_i(u) ξ_j(v) dF(u, v) = 0   (i ≠ j),

we find that

[c_r^(X1) c_s^(X2)]^{1/2} ρ_rs^(X1,X2) = ∫ ξ_r(w1) ξ_s(w2) dF12(w1, w2) = δ_rs [c_r^(W1) c_s^(W2)]^{1/2} ρ_rs^(W1,W2),

where δ_rs is the Kronecker delta. This clearly indicates that the matrix of correlations is diagonal. □
Remark 1. In general, the existence of the canonical expansion of a bivariate distribution may be proved by using the criterion of Brown [4], which requires that the conditional moments E[X^n | y] and E[Y^n | x] be polynomials of degree less than or equal to n. This criterion may not be easy to apply.

Result 1 shows that the extended bivariate negative binomial distribution has a canonical expansion in terms of orthogonal polynomials. We next derive this canonical expansion. The factorial moment generating function for

f(x) m_r(x; m, p)/(m)_r

is given by

Σ_{x=0}^{∞} (1 + t)^x [m_r(x; m, p)/(m)_r] f(x) = (-p)^{-r} (tp/q)^r (1 - tp/q)^{-(m+r)},   (14)
where

m_r(x; m, p) = (m)_r 2F1(-r, -x; m; 1 - 1/p)

is the rth Meixner polynomial and

f(x) = [(m)_x/x!] p^x q^m
is the negative binomial probability mass function. The factorial moment generating function of the extended bivariate negative binomial distribution from (4) is given by

H(t1, t2) = (1 - A1 t1)^{-a1-m} (1 - A2 t2)^{-a2-m} [1 - (A3 + A1 A2) t1 t2/((1 - A1 t1)(1 - A2 t2))]^{-m},

where

A1 = p1/q1,   A2 = p2/q2   and   A3 = θ3/Θ.
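Since H(t1, t2) is, by definition, the probability generating function (4) evaluated at zi = 1 + ti, the closed form above is easy to sanity-check numerically (helper names are ours):

```python
def ebnb_pgf(z1, z2, a1, a2, m, t1, t2, t3):
    # joint pgf (4)
    p1 = (t1 + t3) / (1.0 - t2)
    p2 = (t2 + t3) / (1.0 - t1)
    q1, q2 = 1.0 - p1, 1.0 - p2
    theta = 1.0 - t1 - t2 - t3
    return ((q1 / (1.0 - p1 * z1)) ** a1 * (q2 / (1.0 - p2 * z2)) ** a2
            * (theta / (1.0 - t1 * z1 - t2 * z2 - t3 * z1 * z2)) ** m)

def ebnb_fmgf(u1, u2, a1, a2, m, t1, t2, t3):
    # closed form of H(u1, u2) above, with A1 = p1/q1, A2 = p2/q2, A3 = t3/Theta
    p1 = (t1 + t3) / (1.0 - t2)
    p2 = (t2 + t3) / (1.0 - t1)
    q1, q2 = 1.0 - p1, 1.0 - p2
    theta = 1.0 - t1 - t2 - t3
    A1, A2, A3 = p1 / q1, p2 / q2, t3 / theta
    return ((1.0 - A1 * u1) ** (-a1 - m) * (1.0 - A2 * u2) ** (-a2 - m)
            * (1.0 - (A3 + A1 * A2) * u1 * u2
               / ((1.0 - A1 * u1) * (1.0 - A2 * u2))) ** (-m))
```

The two functions agree identically on the common domain, since A1 = (θ1 + θ3)/Θ, A2 = (θ2 + θ3)/Θ and A3 = θ3/Θ.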
Following Kocherlakota and Kocherlakota [22, p. 135], H(t1, t2) is expanded to give the following result:

H(t1, t2) = Σ_{i=0}^{∞} [(m)_i/i!] d^i (A1 t1)^i (1 - A1 t1)^{-a1-m-i} (A2 t2)^i (1 - A2 t2)^{-a2-m-i},   (15)

where

d = ρ_(W1,W2) √((1 - θ1)(1 - θ2))/(Θ √(A1 A2)).
Using the relation in (14), the canonical expansion of the extended bivariate negative binomial probability mass function is found to be given by

f(x1, x2) = f_X1(x1) f_X2(x2) Σ_{i=0}^{∞} d^i [(m)_i (p1 p2)^i/((a1 + m)_i (a2 + m)_i i!)] m_i(x1; a1 + m, p1) m_i(x2; a2 + m, p2)

= f_X1(x1) f_X2(x2) Σ_{i=0}^{∞} [(m)_i ρ_(W1,W2)^i/√((a1 + m)_i (a2 + m)_i)] m̃_i(x1; a1 + m, p1) m̃_i(x2; a2 + m, p2),   (16)
where

f_Xj(xj) = [(aj + m)_{xj}/xj!] pj^{xj} qj^{aj+m}   (j = 1, 2)

are the marginal probability mass functions of X1 and X2, and

m̃_i(xj; aj + m, pj) = m_i(xj; aj + m, pj)/√((aj + m)_i i! pj^{-i})   (j = 1, 2)

is the ith orthonormal Meixner polynomial.

5. Quadrant dependence

Various forms of bivariate dependence have been used in statistical analysis. One of them is positive quadrant dependence, introduced by Lehmann [29], which is a very useful measure of dependence. Compared to other types of dependence, positive quadrant dependence is easier to verify, as seen from the following definition.

Definition. Two random variables X and Y are said to be positive quadrant dependent if
Pr(X ≤ x, Y ≤ y) ≥ Pr(X ≤ x) Pr(Y ≤ y)   (∀x, ∀y).   (17)
Jensen [16] has extended (17) to regions other than quadrants with the concept of positive dependence when the marginal distributions are identical. Jensen's definition of positive dependence is recalled here as follows. Two random variables X and Y are said to be positively dependent if

Pr(X ∈ A, Y ∈ A) ≥ Pr(X ∈ A) Pr(Y ∈ A)

for every measurable set A with respect to the marginal measure. When the marginals are identical, it is easy to show from the canonical expansion (16) of its joint probability mass function that the extended bivariate negative binomial distribution is positively dependent. We now show that the extended bivariate negative binomial distribution is positive quadrant dependent when the marginal parameters are different.

Result 2. The extended bivariate negative binomial distribution with the joint probability generating function (8) is positive quadrant dependent when
0 < ρ_(W1,W2) < 1.

Proof. First of all, we note that

G_(X,Y)(z1, z2) ≥ G_X(z1) G_Y(z2)

implies that

Pr(X ≤ x, Y ≤ y) ≥ Pr(X ≤ x) Pr(Y ≤ y)   ∀x and ∀y,

that is, positive quadrant dependence. This follows by extracting the (x, y)th term from the probability generating function. Upon rewriting (8) by using the relation in (15), we obtain
G(z1, z2) = H(z1 - 1, z2 - 1),

so that

G_(X1,X2)(z1, z2) = [1 + A1(1 - z1)]^{-a1-m} [1 + A2(1 - z2)]^{-a2-m} {1 + Σ_{i=1}^{∞} [(m)_i/i!] [ρ_(W1,W2) √((1 - θ1)(1 - θ2))/(Θ √(A1 A2))]^i [A1 A2 (1 - z1)(1 - z2)/((1 + A1(1 - z1))(1 + A2(1 - z2)))]^i},   (18)

where

A1 = p1/q1 and A2 = p2/q2.

Since

|zi| ≤ 1   (i = 1, 2),

we have

1 - zi ≥ 0
and we thus find that the terms Ai(1 - zi) are also positive. When

0 < ρ_(W1,W2) < 1,

the infinite series in braces in (18) is positive. Hence (18) implies that

G_(X1,X2)(z1, z2) ≥ G_X1(z1) G_X2(z2).

From the remark at the beginning of the proof, we conclude that the extended bivariate negative binomial distribution is positive quadrant dependent when

0 < ρ_(W1,W2) < 1.  □
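Result 2 can be spot-checked by accumulating the joint and marginal distribution functions from the mass function (convolution form, θ3 ≥ 0; helper names are ours):

```python
import math

def poch(a, n):
    out = 1.0
    for k in range(n):
        out *= a + k
    return out

def nb_pmf(x, m, p):
    return poch(m, x) / math.factorial(x) * p ** x * (1.0 - p) ** m

def eg_pmf(x, y, m, t1, t2, t3):
    theta = 1.0 - t1 - t2 - t3
    return theta ** m * sum(
        poch(m, x + y - i)
        / (math.factorial(i) * math.factorial(x - i) * math.factorial(y - i))
        * t1 ** (x - i) * t2 ** (y - i) * t3 ** i
        for i in range(min(x, y) + 1))

def ebnb_pmf(x1, x2, a1, a2, m, t1, t2, t3):
    p1 = (t1 + t3) / (1.0 - t2)
    p2 = (t2 + t3) / (1.0 - t1)
    return sum(nb_pmf(x1 - r, a1, p1) * nb_pmf(x2 - s, a2, p2)
               * eg_pmf(r, s, m, t1, t2, t3)
               for r in range(x1 + 1) for s in range(x2 + 1))

def pqd_holds(a1, a2, m, t1, t2, t3, upto=8):
    # check Pr(X1 <= x, X2 <= y) >= Pr(X1 <= x) Pr(X2 <= y) on a small grid
    grid = [[ebnb_pmf(i, j, a1, a2, m, t1, t2, t3)
             for j in range(upto + 1)] for i in range(upto + 1)]
    p1 = (t1 + t3) / (1.0 - t2)
    p2 = (t2 + t3) / (1.0 - t1)
    for x in range(upto + 1):
        for y in range(upto + 1):
            joint = sum(grid[i][j] for i in range(x + 1) for j in range(y + 1))
            m1 = sum(nb_pmf(i, a1 + m, p1) for i in range(x + 1))
            m2 = sum(nb_pmf(j, a2 + m, p2) for j in range(y + 1))
            if joint < m1 * m2 - 1e-12:
                return False
    return True
```

Such a grid check is of course not a proof; it merely confirms (17) at the tabulated points for one admissible parameter set.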
Remark 2. Positive quadrant dependence of the bivariate negative binomial distributions of Edwards and Gurland [9] and Subrahmaniam [48] follows upon setting

a1 = a2 = 0.

Remark 3. For the extended trivariate reduction (2), it is easy to show that, if (W1, W2) is positive quadrant dependent, then (X1, X2) is also positive quadrant dependent.

6. Computer generation of bivariate samples

In this section, we give algorithms to generate random samples from the extended bivariate negative binomial distribution. The algorithms are also applicable to the generation of random samples from the extended bivariate binomial and gamma distributions.

6.1. Mixture method

By the formulation
X1 = Y1 + W1 and X2 = Y2 + W2,

the general form of the probability generating function for the extended bivariate negative binomial distribution is given as in (3). The correlation for the extended bivariate negative binomial distribution is given by

ρ(X1,X2) = [m / √((a1 + m)(a2 + m))] ρ(W1,W2),

where a1, a2 and m are the corresponding index parameters of the Y1, Y2 and (W1, W2) distributions, and ρ(W1,W2) is the correlation of the bivariate distribution of (W1, W2). The extended bivariate binomial and gamma distributions can also be shown to satisfy the same correlation relation. For any of these bivariate distributions, given the marginals:
X1 ~ g(a, θ1) and X2 ~ g(b, θ2), as well as the correlation ρ(X1,X2), it can be deduced that

Y1 ~ g(a − m, θ1) and Y2 ~ g(b − m, θ2), W1 ~ g(m, θ1) and W2 ~ g(m, θ2),

and

ρ(W1,W2) = [√((a1 + m)(a2 + m)) / m] ρ(X1,X2),
with g(·) being one of the corresponding univariate distributions. Ong [34,35] has given several mixture models as well as algorithms for the computer generation of the bivariate negative binomial, binomial and gamma distributions of (W1, W2) with given marginals and correlation. Utilizing this and the above formulation, an algorithm to generate bivariate data from one of the three distributions with different marginals is given below.

Algorithm 1
1. Set 0 < m < min(a, b), a1 = a − m, a2 = b − m
and

ρ(W1,W2) = [√((a1 + m)(a2 + m)) / m] ρ(X1,X2).
2. Generate y1 ~ g(a1, θ1) and y2 ~ g(a2, θ2).
3. Use the known marginals W1 ~ g(m, θ1), W2 ~ g(m, θ2) and the correlation ρ(W1,W2) in the bivariate negative binomial algorithm of Ong and Lee [37] to generate (w1, w2).
4. Set x1 = y1 + w1 and x2 = y2 + w2.

6.2. Conditional distribution technique

The conditional distribution of X1 (given X2 = x2) is the convolution of V1 and V2, as given in Section 3.3. We are given the marginals:
X1 ~ NB(a, p1), X2 ~ NB(b, p2) and the correlation ρ(X1,X2).
When 0 < θ3 < 1, it is found that

V1 ~ NB(a − m, p1), U1s ~ Bin(s, θ3/(θ2 + θ3)) and U2s ~ NB(m + s, θ1) (s = 0, 1, ..., x2),

and

Y2 ~ NB(b − m, p2) and W2 ~ NB(m, p2).

When −1 < θ3 < 0, V2 is a mixture of convolutions of a pseudo-binomial and a negative binomial random variable, which can easily be generated using the standard inverse transform method.

Algorithm 2
1. Set 0 < m < min(a, b), a1 = a − m, a2 = b − m.
2. Set θ1, θ2 and θ3 such that

ρ(X1,X2) = m(θ3 + θ1θ2) / √((a1 + m)(a2 + m)(1 − θ1)(1 − θ2)(θ1 + θ3)(θ2 + θ3)).

3. Generate x2 ~ NB(b, p2) and v1 ~ NB(a1, p1).
4. Set v2 = 0. For s = 0 to x2:
   (a) When 0 < θ3 < 1,
       (i) generate u1s ~ Bin(s, θ3/(θ2 + θ3)) and u2s ~ NB(m + s, θ1);
       (ii) v2 = v2 + [Pr(Y2 = x2 − s) Pr(W2 = s) / Pr(X2 = x2)] (u1s + u2s).
   (b) When −1 < θ3 < 0, generate v2 using the inverse transform method based on the probabilities given in [20].
5. Set x1 = v1 + v2.
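A minimal sketch of the mixture method (Algorithm 1) in Python. Since the Ong–Lee step for generating the correlated pair (W1, W2) is not reproduced here, it is replaced by a shared-gamma mixture that yields the correct NB(m, p1) and NB(m, p2) marginals but a fixed (rather than prescribed) positive correlation; all function names are our own.

```python
import math
import random

def poisson(lam, rng):
    """Knuth's Poisson sampler (adequate for small rates)."""
    limit = math.exp(-lam)
    k, prod = 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= limit:
            return k
        k += 1

def neg_binomial(index, p, rng):
    """NB(index, p) with pmf (index)_x p^x (1 - p)^index / x!,
    sampled as a gamma-mixed Poisson."""
    return poisson(rng.gammavariate(index, p / (1.0 - p)), rng)

def algorithm1(n, a, b, m, p1, p2, rng):
    """Mixture method: X1 = Y1 + W1, X2 = Y2 + W2, with 0 < m < min(a, b).
    Step 3 uses a shared-gamma stand-in for the Ong-Lee algorithm."""
    a1, a2 = a - m, b - m                       # step 1
    out = []
    for _ in range(n):
        y1 = neg_binomial(a1, p1, rng)          # step 2
        y2 = neg_binomial(a2, p2, rng)
        g = rng.gammavariate(m, 1.0)            # step 3: shared mixing gamma
        w1 = poisson(g * p1 / (1.0 - p1), rng)  # W1 ~ NB(m, p1)
        w2 = poisson(g * p2 / (1.0 - p2), rng)  # W2 ~ NB(m, p2)
        out.append((y1 + w1, y2 + w2))          # step 4
    return out

rng = random.Random(7)
data = algorithm1(5000, a=1.5, b=3.5, m=1.0, p1=0.4, p2=0.5, rng=rng)
m1 = sum(x for x, _ in data) / len(data)   # E[X1] = 1.5 * 0.4/0.6 = 1.0
m2 = sum(y for _, y in data) / len(data)   # E[X2] = 3.5 * 0.5/0.5 = 3.5
cov = sum((x - m1) * (y - m2) for x, y in data) / len(data)
```

The parameter choice matches the Example 1 setting with marginals X1 ~ NB(0.4, 1.5) and X2 ~ NB(0.5, 3.5); the sample covariance `cov` should come out positive, reflecting the positive dependence induced by (W1, W2).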
7. Applications

In this section, we illustrate applications of the two formulations of the extended bivariate negative binomial distribution to accident data, bearing in mind that it is also applicable in other contexts.

7.1. Extended trivariate reduction

Suppose that accidents or injuries are due to (a) individual characteristics and (b) environmental factors [2]. Let Y1 and Y2 represent the number of accidents due to cause (a) in two different time periods. Suppose that accidents due to cause (b) vary from one time period to another as a pair of correlated random variables (W1, W2). The total number of accidents in each period will be given by
X1 = Y1 + W1 and X2 = Y2 + W2.
If Y1 and Y2 are assumed to be Poisson-distributed, and accident proneness due to individual characteristics varies from individual to individual according to a gamma distribution, then Y1 and Y2 will have negative binomial distributions. A similar
reasoning for accidents due to environmental factors leads to the assumption that (W1, W2) has the bivariate negative binomial distribution of Edwards and Gurland [9]. Hence, accidents in the two time periods will have the extended bivariate negative binomial distribution.

7.2. Mixed Poisson formulation

Let X and Y represent the number of accidents or injuries sustained by a group of individuals in two different time periods, each of unit length, with Poisson distributions Poi(λ1) and Poi(λ2), respectively. Suppose that the population consists of individuals whose proneness to accidents varies from individual to individual (see Edwards and Gurland [9] and Subrahmaniam [48]); that is, λi (i = 1, 2) differs from individual to individual. If λ1 and λ2 have the joint bivariate gamma distribution given by (6), we get the extended bivariate negative binomial distribution.

7.3. Numerical examples

Two examples are considered in this section: fits of the extended bivariate negative binomial distribution to a simulated data set and to the rain-forest data set (see Holgate [15]). The parameters are estimated by maximum likelihood estimation (MLE), and the fits are compared with Edwards and Gurland's bivariate negative binomial distribution. The log-likelihood function is maximized using the numerical method of simulated annealing to obtain globally optimum parameter estimates. Suitable bounds are set for the unbounded parameters a1, a2 and m to assist the numerical parameter searches. Bounds for the parameters p1, p2 and θ3 are as given in Section 2.3.

Example 1. A sample of size 500 is simulated from the extended bivariate negative binomial distribution with p1 = 0.4, p2 = 0.5, θ3 = 0.3, a1 = 0.5, a2 = 2.5 and m = 1.0, where the marginals X1 ~ NB(0.4, 1.5) and X2 ~ NB(0.5, 3.5) clearly have different index parameters. Simulation is done according to the algorithm outlined in Section 6.1. Observed frequencies for the data are shown in Table 1.
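The goodness-of-fit comparisons below reduce to the Pearson statistic Σ (O − E)²/E over (possibly grouped) frequency cells. A minimal computation sketch, using illustrative frequencies rather than the paper's full tables:

```python
def chi_square(observed, expected):
    """Pearson chi-square statistic over matched frequency cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Illustrative cell counts (not the paper's data); cells with small
# expected frequencies would first be grouped to keep each expected
# frequency above a minimum threshold.
obs = [37, 7, 2, 54, 21, 5]
exp = [37.90, 6.98, 2.09, 52.70, 20.67, 4.93]
stat = chi_square(obs, exp)
# degrees of freedom = (number of cells) - 1 - (number of fitted parameters)
```

The resulting statistic is then referred to a chi-square distribution with the stated degrees of freedom to obtain the p-values quoted in the text.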
The extended bivariate negative binomial and Edwards and Gurland's bivariate negative binomial distributions are fitted to the data, with grouping of frequencies at the cell (16, 8). The fits are compared by means of the chi-square (χ²) goodness-of-fit statistic. The parameter estimates and corresponding χ² values, as well as the degrees of freedom (d.f.), are given in Table 2. The expected frequencies for the two distributions are given in Table 3. It is obvious from the χ² values in Table 2 that the bivariate negative binomial could not give a satisfactory fit (p-value = 0.09) when the index parameters of the marginals are different, as compared with the extended bivariate negative binomial (p-value = 0.56).

Example 2. The abundance of two different plant species in the rain-forest data [15] can be due to an individual growth factor and to environmental factors such as climate and space. We fit the data to the extended bivariate negative binomial and Edwards and Gurland's bivariate negative binomial distributions. The ML estimates of the parameters are tabulated as follows.
Table 1 Simulated sample of size 500 from extended bivariate negative binomial distribution. (p1 = 0.4, p2 = 0.5, h3 = 0.3, a1 = 0.5, a2 = 2.5, m = 1.0)
Note: The dashed lines indicate grouping of the data for the χ² goodness-of-fit test to yield a minimum expected frequency of 1.
Table 2
Extended bivariate negative binomial and bivariate negative binomial: parameter estimates and χ² values.

Distribution  ML estimates                                                        χ²
EBNB          p1 = 0.415158, p2 = 0.506243, θ3 = 0.350500, a1 = 0.443790,         51.84 (d.f. = 54)
              a2 = 2.492638, m = 0.825251
BNB           p1 = 0.245742, p2 = 0.546149, θ3 = 3.16 × 10⁻⁸, m = 2.823620        70.81 (d.f. = 56)

EBNB: extended bivariate negative binomial. BNB: bivariate negative binomial.
Table 3
Expected extended bivariate negative binomial (bivariate negative binomial) frequencies: simulated data.

x2\x1   0              1              2              3            4            5            6            7            8+
0       37.90 (36.40)  6.98 (13.24)   2.09 (3.26)    0.71 (0.68)  0.25 (0.13)  0.09 (0.02)  0.04 (0.00)  0.01 (0.00)  0.01 (0.00)
1       52.70 (48.90)  20.67 (24.09)  4.93 (7.48)    1.59 (1.87)  0.56 (0.41)  0.20 (0.08)  0.08 (0.02)  0.03 (0.00)  0.02 (0.00)
2       49.13 (44.48)  26.00 (27.64)  9.34 (10.37)   2.50 (3.04)  0.84 (0.77)  0.30 (0.17)  0.11 (0.04)  0.04 (0.01)  0.03 (0.00)
3       38.47 (34.03)  23.94 (25.53)  11.20 (11.22)  3.91 (3.77)  1.11 (1.07)  0.38 (0.27)  0.14 (0.06)  0.05 (0.01)  0.03 (0.00)
4       27.32 (23.57)  18.78 (20.72)  10.36 (10.44)  4.58 (3.96)  1.57 (1.25)  0.47 (0.35)  0.16 (0.09)  0.06 (0.02)  0.04 (0.01)
5       18.23 (15.31)  13.40 (15.43)  8.24 (8.77)    4.31 (3.70)  1.83 (1.29)  0.62 (0.39)  0.19 (0.11)  0.07 (0.03)  0.04 (0.01)
6       11.66 (9.50)   8.99 (10.79)   5.95 (6.83)    3.50 (3.17)  1.76 (1.21)  0.73 (0.40)  0.24 (0.12)  0.08 (0.03)  0.04 (0.01)
7       7.22 (5.70)    5.77 (7.21)    4.03 (5.02)    2.57 (2.55)  1.46 (1.05)  0.71 (0.38)  0.29 (0.12)  0.10 (0.03)  0.05 (0.01)
8       4.37 (3.33)    3.58 (4.64)    2.61 (3.53)    1.77 (1.95)  1.10 (0.87)  0.61 (0.33)  0.29 (0.11)  0.11 (0.03)  0.06 (0.01)
9       2.59 (1.90)    2.17 (2.90)    1.63 (2.40)    1.16 (1.42)  0.77 (0.68)  0.47 (0.28)  0.25 (0.10)  0.12 (0.03)  0.07 (0.01)
10      1.52 (1.07)    1.29 (1.77)    1.00 (1.58)    0.73 (1.00)  0.51 (0.51)  0.33 (0.22)  0.20 (0.08)  0.10 (0.03)  0.07 (0.01)
11      0.87 (0.59)    0.76 (1.06)    0.60 (1.01)    0.45 (0.69)  0.33 (0.37)  0.22 (0.17)  0.14 (0.07)  0.08 (0.03)  0.07 (0.01)
12      0.50 (0.33)    0.44 (0.62)    0.35 (0.63)    0.27 (0.46)  0.20 (0.26)  0.15 (0.13)  0.10 (0.05)  0.06 (0.02)  0.06 (0.01)
13      0.28 (0.18)    0.25 (0.36)    0.20 (0.39)    0.16 (0.30)  0.12 (0.18)  0.09 (0.09)  0.06 (0.04)  0.04 (0.02)  0.05 (0.01)
14      0.16 (0.10)    0.14 (0.21)    0.12 (0.24)    0.09 (0.19)  0.07 (0.12)  0.06 (0.07)  0.04 (0.03)  0.03 (0.01)  0.04 (0.01)
15      0.09 (0.05)    0.08 (0.12)    0.07 (0.14)    0.05 (0.12)  0.04 (0.08)  0.03 (0.05)  0.03 (0.02)  0.02 (0.01)  0.03 (0.01)
16+     0.11 (0.06)    0.10 (0.14)    0.08 (0.20)    0.07 (0.19)  0.06 (0.14)  0.05 (0.09)  0.04 (0.05)  0.03 (0.02)  0.06 (0.02)
Table 4
Observed and expected frequencies for rain-forest data.

x1      x2      Observed   Expected
                           EBNB     BNB      Type II BNNB
0       0       34         31.93    30.19    28.80
1       0       12         16.10    16.94    17.64
2       0       4          6.35     6.79     7.53
3       0       5          2.30     2.36     2.66
4       0       2          0.80     0.75     0.84
0       1       8          10.20    10.94    11.14
1       1       13         9.74     8.99     9.52
2       1       3          4.73     4.75     5.05
3       1       3          1.87     2.04     2.11
4       1       0          0.68     0.78     0.76
0       2       3          2.88     2.84     3.00
1       2       6          2.92     3.07     3.19
2       2       1          2.28     2.01     2.00
3       2       2          1.14     1.04     0.96
4       2       0          0.46     0.46     0.39
0       3       1          0.78     0.64     0.67
1       3       1          0.81     0.85     0.84
2       3       0          0.67     0.67     0.61
3       3       1          0.51     0.40     0.33
4       3       0          0.27     0.21     0.15
x1 ≤ 4  x2 ≥ 4  0          1.18     0.99     0.75
x1 ≥ 5          1          1.39     1.31     1.04

χ²                         13.16    14.28    13.44
d.f.                       6        8        9

BNNB: bivariate non-central negative binomial.
(i) Extended bivariate negative binomial distribution:

p1 = 0.308341, p2 = 0.245690, θ3 = 0.225283, a1 = 1.463361, a2 = 1.300684, m = 0.638345,

with marginals X1 ~ NB(0.3083, 2.1017) and X2 ~ NB(0.2457, 1.9390).

(ii) Edwards and Gurland's bivariate negative binomial distribution:

p1 = 0.288067, p2 = 0.208444, θ3 = 0.003166, m = 2.336087,
with marginals X1 ~ NB(0.2881, 2.3361) and X2 ~ NB(0.2084, 2.3361). The expected frequencies obtained from both distributions are shown in Table 4. Expected frequencies from the Type II bivariate non-central negative binomial fit of Ong and Lee [38] are also given for comparison. We note that, for the rain-forest data, the marginal distributions of the extended bivariate negative binomial and the bivariate negative binomial are similar. As expected, the fit by the extended bivariate negative binomial yields a smaller χ² value than the bivariate negative binomial, since more flexibility is allowed in the marginals. This χ² value is also smaller than that obtained from the Type II bivariate non-central negative binomial distribution.

8. Concluding remarks and observations

This paper has considered an extension of trivariate reduction and a mixed Poisson formulation to obtain a bivariate negative binomial distribution with different marginal parameters and a correlation coefficient in the range (0, 1). Distributional properties have been derived. For the extended trivariate reduction, a result on the existence of a canonical expansion of the bivariate distribution has been obtained. The extended bivariate negative binomial distribution has been shown to be positive quadrant dependent. The generation of bivariate data with varying dependence has also been considered. A fit to simulated data shows that, when the negative binomial marginals are very different, the extended bivariate negative binomial distribution is to be preferred. For the rain-forest data, the extended bivariate negative binomial distribution fits better than the bivariate negative binomial distribution as judged by the χ² value (see also a recent work by Nadarajah et al. [33]). We note that a bivariate binomial distribution can be constructed in a manner similar to the extended bivariate negative binomial distribution.
Furthermore, bivariate continuous distributions such as the bivariate normal and gamma distributions
can also be constructed through this extended trivariate reduction method by utilising moment generating functions instead of probability generating functions. Here, in our present investigation, we have successfully applied such special functions as the Laguerre and Meixner polynomials (as well as hypergeometric functions in one, two and three variables) to provide a systematic study of an extension of trivariate reduction and a mixed Poisson formulation for obtaining a bivariate negative binomial distribution with different marginal parameters and a correlation coefficient in the range (0, 1). Traditionally, various classes of special functions and polynomials of mathematical physics and analytic number theory in one and more variables have found interesting applications not only in the mathematical, physical and engineering sciences, but also in probability theory and other disciplines in the statistical sciences (see, for example, [40–42]; see also [18,19]). We choose to conclude this paper by referring the interested reader to several recent investigations on the subject by (among others) Gupta et al. [12,13] (see also [10,30]), Lee et al. [28] and Srivastava et al. [44–46].

Appendix A

A.1. Canonical expansion

The canonical expansion of a bivariate distribution is defined as follows. Let h(x, y) be a bivariate probability density function with marginal probability density functions f(x) and g(y), where the parameters have been suppressed for simplicity. Let {ψ_i^{(1)}(x)} and {ψ_i^{(2)}(y)} be complete sets of orthonormal functions with respect to f(x) and g(y), respectively. Then h(x, y) can be expanded as a double series in the form:
h(x, y) = f(x) g(y) Σ_{i=0}^∞ Σ_{j=0}^∞ ρ_{ij} ψ_i^{(1)}(x) ψ_j^{(2)}(y),   (A.1)

where

ρ_{ij} = ∫∫ h(x, y) ψ_i^{(1)}(x) ψ_j^{(2)}(y) dx dy,

and the series Σ_{i=0}^∞ Σ_{j=0}^∞ ρ_{ij}² is convergent [26]. Let
φ² = ∫∫ [dH(x, y) / (dF(x) dG(y))]² dF(x) dG(y) − 1,

where H(x, y), F(x) and G(y) are the distribution functions corresponding, respectively, to h(x, y), f(x) and g(y). If φ² is bounded (see Lancaster [26]), then (A.1) can be expressed in the canonical form such that
ρ_{ij} = 0 (i ≠ j); that is, the coefficient matrix [ρ_{ij}] is diagonal. The double series in (A.1) then yields
h(x, y) = f(x) g(y) Σ_{i=0}^∞ ρ_i ψ_i^{(1)}(x) ψ_i^{(2)}(y)   (ρ_i := ρ_{ii}),   (A.2)

with

φ² = Σ_{i=1}^∞ ρ_i².
The series in (A.2) is the canonical expansion of the bivariate probability density function h(x, y), and ρ_i is known as the ith canonical coefficient.

A.2. Probability mass function of the extended bivariate negative binomial distribution

In terms of the classical Laguerre polynomials L_n^{(α)}(x) of order (or index) α and degree n in x, defined by
L_n^{(α)}(x) = Σ_{k=0}^n C(n + α, n − k) (−x)^k / k! = C(n + α, n) ₁F₁(−n; α + 1; x),

where

₁F₁(a; c; z) = Σ_{n=0}^∞ (a)_n z^n / ((c)_n n!) = lim_{|b|→∞} ₂F₁(a, b; c; z/b)
denotes the Kummer confluent hypergeometric function, the canonical form of the bivariate gamma probability density function [25, p. 41] with the moment generating function (6) is given by

f(x, y) = [e^{−x} x^{a1+m−1} / Γ(a1 + m)] [e^{−y} y^{a2+m−1} / Γ(a2 + m)] Σ_{k=0}^∞ a_k ρ^{2k} L_k^{(a1+m−1)}(x) L_k^{(a2+m−1)}(y),

where

a_k = k! (m)_k / [(a1 + m)_k (a2 + m)_k].
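The Laguerre polynomials entering this canonical expansion are easy to evaluate directly from their finite-series definition. The small check below is our own code (restricted to nonnegative integer order α so that the binomial coefficient is an ordinary `math.comb`), and confirms the series against the closed form L_2^{(0)}(x) = 1 − 2x + x²/2.

```python
from math import comb, factorial

def laguerre(n, alpha, x):
    """L_n^{(alpha)}(x) = sum_{k=0}^n C(n + alpha, n - k) (-x)^k / k!
    (alpha taken as a nonnegative integer in this sketch)."""
    return sum(comb(n + alpha, n - k) * (-x) ** k / factorial(k)
               for k in range(n + 1))

# Closed form check: L_2^{(0)}(2) = 1 - 2*2 + 2**2 / 2 = -1
value = laguerre(2, 0, 2.0)
```

For non-integer order, the binomial coefficient would be replaced by its gamma-function form Γ(n + α + 1)/(Γ(n − k + 1) Γ(α + k + 1)).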
By applying the following result of Srivastava [39, p. 306, Eq. (2.3)] (see also Srivastava and Manocha [43, p. 221, Eq. (18)]):

Σ_{n=0}^∞ [n! (λ)_n / ((α + 1)_n (β + 1)_n)] L_n^{(α)}(x) L_n^{(β)}(y) z^n
 = (1 − z)^{−λ} exp(−xz/(1 − z)) Φ^{(3)}[α − λ + 1, λ; α + 1, β + 1; xz/(1 − z), yz/(1 − z), xyz/(1 − z)²]   (|z| < 1)

to the infinite sum in the probability density function, and setting x = b1λ and y = b2θ, we obtain

f(λ, θ) = [e^{−b1λ} b1^{a1+m} λ^{a1+m−1} / Γ(a1 + m)] [e^{−b2θ} b2^{a2+m} θ^{a2+m−1} / Γ(a2 + m)] (1 − ρ²)^{−m} exp(−b1λρ²/(1 − ρ²))
 × Φ^{(3)}[a1, m; a1 + m, a2 + m; b1λρ²/(1 − ρ²), b2θρ²/(1 − ρ²), b1b2λθρ²/(1 − ρ²)²],
where Φ^{(3)} is a certain confluent hypergeometric function of three variables defined by Srivastava [39] by (see, for example, Srivastava and Manocha [43, p. 222])

Φ^{(3)}[a, b; c, d; x, y, z] = Σ_{ℓ,m,n=0}^∞ [(a)_ℓ (b)_{m+n} / ((c)_{ℓ+n} (d)_{m+n})] x^ℓ y^m z^n / (ℓ! m! n!),

which can, in fact, be expressed in terms of a very specialized case of Srivastava's general triple hypergeometric series F^{(3)}[x, y, z] defined by (10). By the mixed Poisson formulation, the joint probability mass function of the extended bivariate negative binomial distribution is then given by
f(x1, x2) = ∫₀^∞ ∫₀^∞ [e^{−λ} λ^{x1} / x1!] [e^{−θ} θ^{x2} / x2!] f(λ, θ) dλ dθ
 = Σ_{l,m,n=0}^∞ [(a1)_l (m)_{m+n} / ((a2 + m)_{m+n} (a1 + m)_{l+n})] [b1^{a1+m} b2^{a2+m} / (Γ(a1 + m) Γ(a2 + m))]
   × [(b1φ)^l (b2φ)^m (b1b2φ(1 − φ))^n / (l! m! n!)]
   × ∫₀^∞ [e^{−λ} λ^{x1} / x1!] e^{−b1λ} λ^{a1+m−1+l+n} dλ ∫₀^∞ [e^{−θ} θ^{x2} / x2!] e^{−b2θ} θ^{a2+m−1+m+n} dθ
 = [(a1 + m)_{x1} / x1!] (1/(1 + b1))^{a1+m} (b1/(1 + b1))^{x1} [(a2 + m)_{x2} / x2!] (1/(1 + b2))^{a2+m} (b2/(1 + b2))^{x2}
   × Σ_{l,m,n=0}^∞ [(a1)_l (m)_{m+n} (x2 + a2 + m)_{m+n} (x1 + a1 + m)_{l+n} / ((a2 + m)_{m+n} (a1 + m)_{l+n})]
   × (1/l!) (b1φ/(1 + b1))^l (1/m!) (b2φ/(1 + b2))^m (1/n!) (b1b2φ(1 − φ)/((1 + b1)(1 + b2)))^n,
which gives the probability mass function (9) in terms of one of Srivastava's general triple hypergeometric series as defined by (10).

References

[1] J. Aitchison, C.-H. Ho, The multivariate Poisson-log normal distribution, Biometrika 75 (1989) 621–629.
[2] A.G. Arbous, J.E. Kerrich, Accident statistics and the concept of accident proneness, Biometrics 7 (1951) 340–432.
[3] J.F. Barrett, D.G. Lampard, An expansion for some second-order distributions and its application to noise problems, IRE Trans. Inform. Theory IT-1 (1955) 10–15.
[4] J.J. Brown, A criterion for the diagonal expansion of a second-order probability distribution in orthogonal polynomials, IRE Trans. Inform. Theory IT-4 (1958) 172.
[5] S. Chib, R. Winkelmann, Markov chain Monte Carlo analysis of correlated count data, J. Business Econ. Statist. 19 (2001) 428–435.
[6] C.M. Cuadras, Correspondence analysis and diagonal expansions in terms of distribution functions, J. Statist. Plan. Infer. 103 (2002) 137–150.
[7] L. Devroye, Non-Uniform Random Variate Generation, Springer-Verlag, New York, Berlin, Heidelberg, 1986.
[8] G.K. Eagleson, Polynomial expansions of bivariate distributions, Ann. Math. Statist. 35 (1964) 1208–1215.
[9] C.B. Edwards, J. Gurland, A class of distributions applicable to accidents, J. Amer. Statist. Assoc. 56 (1961) 503–517.
[10] M. Garg, K. Jain, H.M. Srivastava, Some relationships between the generalized Apostol–Bernoulli polynomials and Hurwitz–Lerch Zeta functions, Integral Transforms Spec. Funct. 17 (2006) 803–815.
[11] A.K. Gupta, On the expansion of bivariate gamma distribution, J. Indian Statist. Assoc. 17 (1979) 41–50.
[12] P.L. Gupta, R.C. Gupta, S.-H. Ong, H.M. Srivastava, A class of Hurwitz–Lerch Zeta distributions and their applications in reliability, Appl. Math. Comput. 196 (2008) 521–531.
[13] R.C. Gupta, S.N.U.A. Kirmani, H.M. Srivastava, Local dependence functions for some families of bivariate distributions and total positivity, Appl. Math. Comput. 216 (2010) 1267–1279.
[14] R.C. Gupta, S.-H. Ong, A new generalization of the negative binomial distribution, Comput. Statist. Data Anal. 45 (2004) 287–300.
[15] P. Holgate, Bivariate generalizations of Neyman's type A distribution, Biometrika 53 (1966) 241–245.
[16] D.R. Jensen, A note on positive dependence and the structure of bivariate distributions, SIAM J. Appl. Math. 20 (1971) 749–753.
[17] N.L. Johnson, A.W. Kemp, S. Kotz, Univariate Discrete Distributions, third ed., Wiley Series in Probability and Statistics, John Wiley and Sons, Hoboken, New Jersey, 2005.
[18] N.L. Johnson, S. Kotz, N. Balakrishnan, Continuous Univariate Distributions, second ed., vol. 1, Wiley Series in Probability and Mathematical Statistics, John Wiley and Sons, New York, 1994.
[19] N.L. Johnson, S. Kotz, N. Balakrishnan, Continuous Univariate Distributions, second ed., vol. 2, Wiley Series in Probability and Mathematical Statistics, John Wiley and Sons, New York, 1995.
[20] A.W. Kemp, Convolutions involving binomial pseudovariables, Sankhyā Ser. A 41 (1979) 232–243.
[21] W.F. Kibble, A two-variate gamma type distribution, Sankhyā 5 (1941) 137–150.
[22] S. Kocherlakota, K. Kocherlakota, Bivariate Discrete Distributions, Marcel Dekker, New York, 1992.
[23] S. Kotz, Multivariate distributions at a cross-road, in: Statistical Distributions in Scientific Work, vol. 1, Reidel, Dordrecht, 1974.
[24] D. Kundu, R.D. Gupta, Bivariate generalized exponential distribution, J. Multivariate Anal. 100 (2009) 581–593.
[25] C.-D. Lai, Construction of bivariate distributions by a generalised trivariate reduction technique, Statist. Probab. Lett. 25 (1995) 265–270.
[26] H.O. Lancaster, The structure of bivariate distributions, Ann. Math. Statist. 29 (1958) 719–736.
[27] H.O. Lancaster, Correlations and canonical forms of bivariate distributions, Ann. Math. Statist. 34 (1963) 532–538.
[28] P.-A. Lee, S.-H. Ong, H.M. Srivastava, Some integrals of the products of Laguerre polynomials, Internat. J. Comput. Math. 78 (2001) 303–321.
[29] E.L. Lehmann, Some concepts of dependence, Ann. Math. Statist. 37 (1966) 1137–1153.
[30] S.-D. Lin, H.M. Srivastava, P.-Y. Wang, Some expansion formulas for a class of generalized Hurwitz–Lerch Zeta functions, Integral Transforms Spec. Funct. 17 (2006) 817–827.
[31] K.V. Mardia, Families of Bivariate Distributions, Griffin, London, 1970.
[32] C.R. Mitchell, A.S. Paulson, A new bivariate negative binomial distribution, Naval Res. Logist. Quart. 28 (1981) 359–374.
[33] S. Nadarajah, H.M. Srivastava, A.K. Gupta, Skewed Bessel function distributions with application to rainfall data, Statistics 41 (2007) 333–344.
[34] S.-H. Ong, Mixture formulations of a bivariate negative binomial distribution with applications, Comm. Statist. Theory Methods 19 (1990) 1303–1322.
[35] S.-H. Ong, The computer generation of bivariate binomial variables with given marginals and correlation, Comm. Statist. Simulation Comput. 21 (1992) 285–299.
[36] S.-H. Ong, Canonical expansions, correlation structure and conditional distributions of bivariate distributions generated by mixtures, Comm. Statist. Theory Methods 22 (1993) 2527–2547.
[37] S.-H. Ong, P.-A. Lee, On the bivariate negative binomial distribution of Mitchell and Paulson, Naval Res. Logist. Quart. 32 (1985) 457–465.
[38] S.-H. Ong, P.-A. Lee, Bivariate non-central negative binomial distribution: another generalization, Metrika 33 (1986) 29–46.
[39] H.M. Srivastava, An extension of the Hille–Hardy formula, Math. Comput. 23 (1969) 305–311.
[40] H.M. Srivastava, K.C. Gupta, S.P. Goyal, The H-Functions of One and Two Variables with Applications, South Asian Publishers, New Delhi, Madras, 1982.
[41] H.M. Srivastava, P.W. Karlsson, Multiple Gaussian Hypergeometric Series, Halsted Press, Ellis Horwood Limited, John Wiley and Sons, Chichester, New York, Brisbane, Toronto, 1985.
[42] H.M. Srivastava, B.R.K. Kashyap, Special Functions in Queuing Theory and Related Stochastic Processes, Academic Press, New York and London, 1982.
[43] H.M. Srivastava, H.L. Manocha, A Treatise on Generating Functions, Halsted Press, Ellis Horwood Limited, John Wiley and Sons, Chichester, New York, Brisbane, Toronto, 1984.
[44] H.M. Srivastava, S. Nadarajah, Some families of Bessel distributions and their applications, Integral Transforms Spec. Funct. 17 (2006) 65–73.
[45] H.M. Srivastava, S. Nadarajah, S. Kotz, Some generalizations of the Laplace distribution, Appl. Math. Comput. 182 (2006) 223–231.
[46] H.M. Srivastava, R.K. Saxena, C. Ram, A unified presentation of the Gamma-type functions in diffraction theory and associated probability distributions, Appl. Math. Comput. 162 (2005) 931–947.
[47] G. Stein, J.M. Juritz, Bivariate compound Poisson distributions, Comm. Statist. Theory Methods 16 (1987) 3591–3607.
[48] K. Subrahmaniam, A test of 'intrinsic correlation' in the theory of accident proneness, J. Roy. Statist. Soc. Ser. B 35 (1966) 131–146.