On a wide plurimodal class of distributions suitable for asymmetric data sets

C. Satheesh Kumar and G. V. Anila

Department of Statistics, University of Kerala, Trivandrum-695581, India.
drcsatheeshkumar@gmail.com

Department of Statistics, University of Kerala, Trivandrum-695581, India.
anilasaru@gmail.com

Abstract

Asymmetric normal distributions have received much attention in the literature during the last three decades. Plurimodal asymmetric normal distributions, however, have not been studied much, even though they are highly relevant in practical situations. Here we propose a new class of plurimodal asymmetric normal distributions and investigate several of its statistical properties, including certain reliability aspects. A location-scale extension of the proposed model is developed and its properties are studied. The maximum likelihood method is employed for estimating the parameters of the proposed extended class of distributions, and a generalized likelihood ratio test procedure is given for testing the parameters of the distribution. Three real-life data sets are considered for illustrating the usefulness of the model, and a brief simulation study is carried out for examining the performance of the maximum likelihood estimators of the proposed model.

Keywords: Asymmetric distributions, Maximum likelihood estimation, Model selection, Plurimodality, Simulation

1. Introduction

The normal distribution is the most important and most widely used distribution in statistics. It is an indispensable tool for the analysis and interpretation of data. In many practical applications, however, real-life data sets are not symmetric, so the normal distribution is not an acceptable model for them. To overcome this drawback, [2] considered an asymmetric form of the normal distribution by introducing a skewness parameter into its probability density function (p.d.f) and named it "the skew normal distribution". The skew normal distribution is defined by [2] as follows:

Let $\phi(\cdot)$ and $\Phi(\cdot)$ be the p.d.f and cumulative distribution function (c.d.f) of a standard normal variate. Then a random variable $X$ is said to follow the skew normal distribution with parameter $\lambda \in R = (-\infty, \infty)$ if its p.d.f $g(x; \lambda)$, for $x \in R$, is given by

$$g(x; \lambda) = 2\,\phi(x)\,\Phi(\lambda x). \tag{1}$$

A distribution with p.d.f (1) is denoted as SND($\lambda$) throughout the manuscript. The SND($\lambda$) has been further studied by several authors such as [3], [4], [5], [6], [7], [8] and [10].

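A generalized form of the skew normal distribution was developed by [1] through the following p.d.f.

$$g_1(x; \lambda_1, \lambda_2) = 2\,\phi(x)\,\Phi\!\left(\frac{\lambda_1 x}{\sqrt{1+\lambda_2 x^2}}\right), \tag{2}$$

in which $x \in R$, $\lambda_1 \in R$, $\lambda_2 > 0$. A distribution with p.d.f (2) is denoted as SGND($\lambda_1, \lambda_2$).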

The SGND($\lambda_1, \lambda_2$) of [1] is log-concave and hence it is not suitable for plurimodal data. To overcome this drawback, [11] considered an extended version of the SGND($\lambda_1, \lambda_2$) under the name "extended skew generalized normal distribution (ESGND($\lambda_1, \lambda_2, \alpha$))", which has the following p.d.f.

$$g_2(x; \lambda_1, \lambda_2, \alpha) = \frac{2}{\alpha+2}\,\phi(x)\left[1 + \alpha\,\Phi\!\left(\frac{\lambda_1 x}{\sqrt{1+\lambda_2 x^2}}\right)\right], \tag{3}$$

where $x \in R$, $\lambda_1 \in R$, $\lambda_2 > 0$ and $\alpha > -1$. In the present work we propose a wide class of plurimodal asymmetric normal distributions as a modified version of the ESGND($\lambda_1, \lambda_2, \alpha$), which we name the "modified skew generalized normal distribution (MSGND)". In Section 2 we present the definition of the MSGND and in Section 3 some of its properties. In Section 4 we present the characteristic function and moments of the MSGND. In Section 5 certain reliability measures, such as the reliability function and the mean residual life function, are derived, and conditions for unimodality and plurimodality are obtained. In Section 6 a location-scale extension of the MSGND is defined and its properties, such as the characteristic function and reliability measures, are obtained. In Section 7 maximum likelihood estimation of the parameters of the distribution is discussed, and in Section 8 we construct a generalized likelihood ratio test (GLRT) procedure. Real-life data applications illustrating the usefulness of the model are given in Section 9, and a brief simulation study is presented in Section 10. For certain real-life data sets the ESGND does not give a good fit, whereas the MSGND fits better. For example, see the illustrations in Section 9, where the MSGND is found to be suitable for modelling data arising from athletic as well as agricultural studies.

2. Modified skew generalized normal distribution

Here we define a new class of skew normal distributions, namely the "modified skew generalized normal distribution (MSGND)", and derive some of its important distributional properties.

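Definition 2.1. A random variable $X$ is said to follow the modified skew generalized normal distribution if its p.d.f is of the following form, in which $x \in R$, $\lambda_1 \in R$, $\lambda_2 > 0$, $\beta \in R$ and $\alpha > -1$.

$$f(x; \lambda_1, \lambda_2, \alpha, \beta) = \frac{\phi(x)}{\alpha+2}\left[2 + \alpha[\Phi(\beta)]^{-1}\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(x)\right)\right], \tag{4}$$

where $\Lambda(x) = \dfrac{\lambda_1 x}{\sqrt{1+\lambda_2 x^2}}$, and $\phi(\cdot)$ and $\Phi(\cdot)$ are the p.d.f and c.d.f of a standard normal variate. A distribution with p.d.f (4) is denoted as MSGND($\lambda_1, \lambda_2, \alpha, \beta$). Note that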

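1. When $\alpha = 0$ or $\lambda_1 = 0$, MSGND($\lambda_1, \lambda_2, \alpha, \beta$) reduces to the standard normal distribution $N(0,1)$.

2. When $\beta = 0$, MSGND($\lambda_1, \lambda_2, \alpha, \beta$) reduces to the ESGND($\lambda_1, \lambda_2, \alpha$).

3. When $\alpha = -1$ and $\beta = 0$, MSGND($\lambda_1, \lambda_2, \alpha, \beta$) reduces to the SGND($\lambda_1, \lambda_2$).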

For some particular choices of $\alpha$, $\lambda_1$, $\lambda_2$ and $\beta$, the p.d.f $f(x; \lambda_1, \lambda_2, \alpha, \beta)$ of MSGND($\lambda_1, \lambda_2, \alpha, \beta$) given in (4) is plotted in Figure 1.

[Figure 1 panels: $\alpha = -1$, $\lambda_1 = 10$, $\lambda_2 = 3$ with $\beta = 0.4$, $\beta = 0.2$ and $\beta = 0.5$.]

Figure 1: Probability plots of MSGND($\lambda_1, \lambda_2, \alpha, \beta$) for fixed values of $\lambda_1$, $\lambda_2$, $\alpha$ and various values of $\beta$
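The density in Definition 2.1 is straightforward to evaluate numerically. The following is a minimal sketch, assuming (4) reads $f(x) = \frac{\phi(x)}{\alpha+2}\left[2 + \alpha\,\Phi(\beta\sqrt{1+\lambda_1^2} + \Lambda(x))/\Phi(\beta)\right]$ with $\Lambda(x) = \lambda_1 x/\sqrt{1+\lambda_2 x^2}$. The function names (`phi`, `Phi`, `big_lambda`, `msgnd_pdf`, `esgnd_pdf`) are illustrative choices and are not taken from the paper, which reports using MATHCAD.

```python
import math


def phi(x):
    """Standard normal p.d.f."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal c.d.f., written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def big_lambda(x, lam1, lam2):
    """Lambda(x) = lam1 * x / sqrt(1 + lam2 * x^2)."""
    return lam1 * x / math.sqrt(1.0 + lam2 * x * x)


def msgnd_pdf(x, lam1, lam2, alpha, beta):
    """Density (4), under the reading of (4) stated in the lead-in.

    The paper states lam2 > 0 and alpha > -1; this sketch only guards
    against values that break the arithmetic.
    """
    if lam2 < 0.0 or alpha + 2.0 <= 0.0:
        raise ValueError("need lam2 >= 0 and alpha > -2")
    eta = beta * math.sqrt(1.0 + lam1 * lam1) + big_lambda(x, lam1, lam2)
    return phi(x) / (alpha + 2.0) * (2.0 + alpha * Phi(eta) / Phi(beta))


def esgnd_pdf(x, lam1, lam2, alpha):
    """ESGND density (3), used here only as a cross-check for beta = 0."""
    inner = Phi(big_lambda(x, lam1, lam2))
    return 2.0 / (alpha + 2.0) * phi(x) * (1.0 + alpha * inner)


# Special cases listed after Definition 2.1, checked at a single point.
print(msgnd_pdf(0.7, 10.0, 3.0, 0.0, 0.4), phi(0.7))            # alpha = 0
print(msgnd_pdf(-0.3, 2.0, 3.0, 1.5, 0.0), esgnd_pdf(-0.3, 2.0, 3.0, 1.5))  # beta = 0
```

The two printed pairs should agree, which mirrors notes 1 and 2 above. The sketch does not check that the density integrates to one for general parameter values.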

3. Results

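Result 3.1. If $X$ has MSGND($\lambda_1, \lambda_2, \alpha, \beta$), then $Y_1 = -X$ has MSGND($-\lambda_1, \lambda_2, \alpha, \beta$).

Proof. The p.d.f $f_1(y)$ of $Y_1 = -X$ is the following, for $y \in R$, $\lambda_1 \in R$, $\lambda_2 > 0$, $\beta \in R$ and $\alpha > -1$.

$$\begin{aligned}
f_1(y) &= f(-y; \lambda_1, \lambda_2, \alpha, \beta)\left|\frac{dx}{dy}\right| \\
&= \frac{\phi(-y)}{\alpha+2}\left[2 + \alpha[\Phi(\beta)]^{-1}\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(-y)\right)\right] \\
&= f(y; -\lambda_1, \lambda_2, \alpha, \beta),
\end{aligned}$$

since $\phi(-y) = \phi(y)$ and $\Lambda(-y) = -\Lambda(y)$.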

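Result 3.2. If $X$ has MSGND($\lambda_1, \lambda_2, \alpha, \beta$), then $Y_2 = |X|$ has the p.d.f (5), in which $\delta(y) = \Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(y)\right) + \Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(-y)\right)$.

Proof. The p.d.f $f_2(y)$ of $Y_2 = |X|$ is the following, for $y > 0$.

$$f_2(y) = f(y; \lambda_1, \lambda_2, \alpha, \beta)\left|\frac{dx}{dy}\right| + f(-y; \lambda_1, \lambda_2, \alpha, \beta)\left|\frac{dx}{dy}\right|.$$

In the light of Result 3.1 we have

$$\begin{aligned}
f_2(y) &= f(y; \lambda_1, \lambda_2, \alpha, \beta) + f(y; -\lambda_1, \lambda_2, \alpha, \beta) \\
&= \frac{\phi(y)}{\alpha+2}\left[2 + \alpha[\Phi(\beta)]^{-1}\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(y)\right)\right] + \frac{\phi(y)}{\alpha+2}\left[2 + \alpha[\Phi(\beta)]^{-1}\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(-y)\right)\right] \\
&= \frac{\phi(y)}{\alpha+2}\left\{4 + \alpha[\Phi(\beta)]^{-1}\left[\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(y)\right) + \Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(-y)\right)\right]\right\} \\
&= \frac{\phi(y)}{\alpha+2}\left[4 + \alpha[\Phi(\beta)]^{-1}\,\delta(y)\right].
\end{aligned} \tag{5}$$

Result 3.3. If $X$ has MSGND($\lambda_1, \lambda_2, \alpha, \beta$), then $Y_3 = X^2$ has the p.d.f (6), in which $\delta(y)$ is as defined in Result 3.2.

Proof. For $y > 0$, the p.d.f $f_3(y)$ of $Y_3 = X^2$ is

$$\begin{aligned}
f_3(y) &= f(\sqrt{y}; \lambda_1, \lambda_2, \alpha, \beta)\left|\frac{dx}{dy}\right| + f(-\sqrt{y}; \lambda_1, \lambda_2, \alpha, \beta)\left|\frac{dx}{dy}\right| \\
&= \frac{\phi(\sqrt{y})}{\alpha+2}\left[2 + \alpha[\Phi(\beta)]^{-1}\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(\sqrt{y})\right)\right]\frac{1}{2\sqrt{y}} + \frac{\phi(-\sqrt{y})}{\alpha+2}\left[2 + \alpha[\Phi(\beta)]^{-1}\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(-\sqrt{y})\right)\right]\frac{1}{2\sqrt{y}} \\
&= \frac{\phi(\sqrt{y})}{2(\alpha+2)\sqrt{y}}\left[4 + \alpha[\Phi(\beta)]^{-1}\,\delta(\sqrt{y})\right].
\end{aligned} \tag{6}$$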

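Result 3.4. The c.d.f of MSGND($\lambda_1, \lambda_2, \alpha, \beta$) with p.d.f (4) is the following, for $x \in R$.

$$F(x) = \frac{\Phi(x)}{\alpha+2}\left\{2 + \frac{\alpha[\Phi(\beta)]^{-1}}{2}\right\} - \frac{\alpha[\Phi(\beta)]^{-1}}{\alpha+2}\,\xi_\beta(x, \Lambda(v)), \tag{7}$$

where

$$\xi_\beta(x, \Lambda(v)) = \int_x^{\infty}\int_0^{\beta\sqrt{1+\lambda_1^2}+\Lambda(v)} \phi(v)\,\phi(u)\,du\,dv, \tag{8}$$

with $\Lambda(v) = \dfrac{\lambda_1 v}{\sqrt{1+\lambda_2 v^2}}$. For particular values of $\lambda_1$, $\lambda_2$, $\beta$ and $x$, (8) can be evaluated by using mathematical software such as MATHCAD or MATHEMATICA.

Proof.

$$\begin{aligned}
F(x) &= \int_{-\infty}^{x} f(v; \lambda_1, \lambda_2, \alpha, \beta)\,dv \\
&= \frac{2\,\Phi(x)}{\alpha+2} + \frac{\alpha[\Phi(\beta)]^{-1}}{\alpha+2}\int_{-\infty}^{x} \phi(v)\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(v)\right)dv \\
&= \frac{2\,\Phi(x)}{\alpha+2} + \frac{\alpha[\Phi(\beta)]^{-1}}{\alpha+2}\left[\frac{1}{2}\Phi(x) - \xi_\beta(x, \Lambda(v))\right] \\
&= \frac{\Phi(x)}{\alpha+2}\left\{2 + \frac{\alpha[\Phi(\beta)]^{-1}}{2}\right\} - \frac{\alpha[\Phi(\beta)]^{-1}}{\alpha+2}\,\xi_\beta(x, \Lambda(v)).
\end{aligned}$$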

4. Characteristic function and Moments

In this section we obtain the characteristic function and moments of MSGND.

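Result 4.1. The characteristic function $\psi_X(t)$ of MSGND($\lambda_1, \lambda_2, \alpha, \beta$) with p.d.f (4) is the following, for any $t \in R$ and $i = \sqrt{-1}$.

$$\psi_X(t) = \frac{e^{-t^2/2}}{\alpha+2}\left\{2 + \alpha[\Phi(\beta)]^{-1}\,E\!\left[\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(u+it)\right)\right]\right\}, \tag{9}$$

where $\Lambda(u+it) = \dfrac{\lambda_1 (u+it)}{\sqrt{1+\lambda_2 (u+it)^2}}$.

Proof. Let $X$ follow MSGND($\lambda_1, \lambda_2, \alpha, \beta$) with p.d.f (4). Then by the definition of the characteristic function, we have the following for any $t \in R$ and $i = \sqrt{-1}$.

$$\begin{aligned}
\psi_X(t) &= E(e^{itX}) \\
&= \frac{2}{\alpha+2}\int_{-\infty}^{\infty} e^{itx}\phi(x)\,dx + \frac{\alpha[\Phi(\beta)]^{-1}}{\alpha+2}\int_{-\infty}^{\infty} e^{itx}\phi(x)\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(x)\right)dx \\
&= \frac{e^{-t^2/2}}{\alpha+2}\left[2 + \alpha[\Phi(\beta)]^{-1}\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-\frac{(x-it)^2}{2}}\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(x)\right)dx\right].
\end{aligned} \tag{10}$$

On substituting $x - it = u$ in (10), we obtain

$$\psi_X(t) = \frac{e^{-t^2/2}}{\alpha+2}\left\{2 + \alpha[\Phi(\beta)]^{-1}\,E\!\left[\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(u+it)\right)\right]\right\}, \tag{11}$$

which implies (9). ■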

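Expressions for the even and odd moments of MSGND($\lambda_1, \lambda_2, \alpha, \beta$) are obtained through the following results.

Result 4.2. If $X$ follows MSGND($\lambda_1, \lambda_2, \alpha, \beta$), then for $k = 1, 2, \ldots$,

$$E(X^{2k}) = \frac{2^{k+\frac{1}{2}}}{(\alpha+2)\sqrt{2\pi}}\,\Gamma\!\left(k+\frac{1}{2}\right) + \frac{\alpha[\Phi(\beta)]^{-1}}{2(\alpha+2)}\,A_k(\lambda_1, \lambda_2, \beta), \tag{12}$$

in which

$$A_k(\lambda_1, \lambda_2, \beta) = \int_0^{\infty} u^{k-\frac{1}{2}}\,\phi(\sqrt{u})\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(\sqrt{u})\right)du,$$

where $\Lambda(\sqrt{u}) = \dfrac{\lambda_1\sqrt{u}}{\sqrt{1+\lambda_2 u}}$, $\lambda_1 \in R$, $\lambda_2 > 0$, $\beta \in R$, which can be easily evaluated by using software such as MATHCAD and MATHEMATICA.

Proof. By the definition of raw moments, for any integer $k > 0$,

$$E(X^{2k}) = \int_{-\infty}^{\infty} x^{2k} f(x; \lambda_1, \lambda_2, \alpha, \beta)\,dx. \tag{13}$$

On substituting $x^2 = u$ in (13), in the light of (4) we have

$$\begin{aligned}
E(X^{2k}) &= \frac{1}{\alpha+2}\int_0^{\infty} u^{k}\,\phi(\sqrt{u})\,\frac{1}{\sqrt{u}}\,du + \frac{\alpha[\Phi(\beta)]^{-1}}{2(\alpha+2)}\int_0^{\infty} u^{k}\,\phi(\sqrt{u})\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(\sqrt{u})\right)\frac{1}{\sqrt{u}}\,du \\
&= \frac{1}{\alpha+2}\left[\int_0^{\infty} u^{k-\frac{1}{2}}\,\phi(\sqrt{u})\,du + \frac{\alpha[\Phi(\beta)]^{-1}}{2}\int_0^{\infty} u^{k-\frac{1}{2}}\,\phi(\sqrt{u})\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(\sqrt{u})\right)du\right],
\end{aligned}$$

which leads to (12).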

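Result 4.3. If $X$ follows MSGND($\lambda_1, \lambda_2, \alpha, \beta$), then for $k = 0, 1, 2, \ldots$,

$$E(X^{2k+1}) = \frac{2^{k+1}}{(\alpha+2)\sqrt{2\pi}}\,\Gamma(k+1) + \frac{\alpha[\Phi(\beta)]^{-1}}{2(\alpha+2)}\,A_{k+1}(\lambda_1, \lambda_2, \beta), \tag{14}$$

in which

$$A_{k+1}(\lambda_1, \lambda_2, \beta) = \int_0^{\infty} u^{k}\,\phi(\sqrt{u})\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(\sqrt{u})\right)du,$$

for $\lambda_1 \in R$, $\lambda_2 > 0$, $\beta \in R$, which can be easily evaluated using software such as MATHCAD and MATHEMATICA.

Proof. By the definition of raw moments,

$$E(X^{2k+1}) = \int_{-\infty}^{\infty} x^{2k+1} f(x; \lambda_1, \lambda_2, \alpha, \beta)\,dx. \tag{15}$$

On substituting $x^2 = u$ in (15), in the light of (4) we get

$$\begin{aligned}
E(X^{2k+1}) &= \frac{1}{\alpha+2}\int_0^{\infty} u^{k+\frac{1}{2}}\,\phi(\sqrt{u})\,\frac{1}{\sqrt{u}}\,du + \frac{\alpha[\Phi(\beta)]^{-1}}{2(\alpha+2)}\int_0^{\infty} u^{k+\frac{1}{2}}\,\phi(\sqrt{u})\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(\sqrt{u})\right)\frac{1}{\sqrt{u}}\,du \\
&= \frac{1}{\alpha+2}\left[\int_0^{\infty} u^{k}\,\phi(\sqrt{u})\,du + \frac{\alpha[\Phi(\beta)]^{-1}}{2}\int_0^{\infty} u^{k}\,\phi(\sqrt{u})\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(\sqrt{u})\right)du\right],
\end{aligned}$$

which leads to (14).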
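As an alternative to evaluating the integrals $A_k$ in dedicated software, raw moments can be approximated by integrating $x^r f(x)$ directly. The sketch below does this with a composite Simpson rule. It assumes the density (4) in the form $\frac{\phi(x)}{\alpha+2}[2 + \alpha\,\Phi(\beta\sqrt{1+\lambda_1^2}+\Lambda(x))/\Phi(\beta)]$; the helper names, the integration range $[-12, 12]$ and the number of subintervals are illustrative assumptions, not choices made in the paper.

```python
import math


def phi(x):
    """Standard normal p.d.f."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def msgnd_pdf(x, lam1, lam2, alpha, beta):
    """Density (4) as read in the lead-in (an assumption of this sketch)."""
    eta = beta * math.sqrt(1.0 + lam1 * lam1) + lam1 * x / math.sqrt(1.0 + lam2 * x * x)
    return phi(x) / (alpha + 2.0) * (2.0 + alpha * Phi(eta) / Phi(beta))


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def raw_moment(r, lam1, lam2, alpha, beta, bound=12.0):
    """Approximate the integral of x^r f(x) over [-bound, bound]."""
    return simpson(lambda x: x ** r * msgnd_pdf(x, lam1, lam2, alpha, beta),
                   -bound, bound)


# With alpha = 0 the density is standard normal, so the first four raw
# moments should be close to 0, 1, 0 and 3.
print([raw_moment(r, 10.0, 3.0, 0.0, 0.4) for r in (1, 2, 3, 4)])
```

Calling `raw_moment(0, ...)` gives the total mass of (4) for a given parameter vector, which is a quick sanity check before interpreting higher moments.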

5. Reliability measures and mode

Here we obtain some properties of MSGND($\lambda_1, \lambda_2, \alpha, \beta$) with p.d.f (4) that are useful in reliability studies. Let $X$ follow MSGND($\lambda_1, \lambda_2, \alpha, \beta$) with p.d.f (4). From the definitions of the reliability function $R(t)$, the failure rate $r(t)$ and the mean residual life function $\mu(t)$ of $X$, we obtain the following results.

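Result 5.1. The reliability function $R(t)$ of $X$ is the following, in which $\xi_\beta(t, \Lambda(x))$ is as defined in Result 3.4.

$$R(t) = \frac{1-\Phi(t)}{\alpha+2}\left\{2 + \frac{\alpha[\Phi(\beta)]^{-1}}{2}\right\} + \frac{\alpha[\Phi(\beta)]^{-1}}{\alpha+2}\,\xi_\beta(t, \Lambda(x)).$$

Result 5.2. The failure rate $r(t)$ of $X$ is given by

$$r(t) = \frac{\phi(t)\left[2 + \alpha[\Phi(\beta)]^{-1}\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(t)\right)\right]}{(1-\Phi(t))\left\{2 + \frac{\alpha[\Phi(\beta)]^{-1}}{2}\right\} + \alpha[\Phi(\beta)]^{-1}\,\xi_\beta(t, \Lambda(x))}.$$

Result 5.3. The mean residual life function of MSGND($\lambda_1, \lambda_2, \alpha, \beta$) is

$$\mu(t) = \frac{2\,\phi(t)}{(\alpha+2)R(t)} + \frac{\alpha[\Phi(\beta)]^{-1}}{(\alpha+2)R(t)}\left[\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(t)\right)\phi(t) + \Delta_\beta(t; \lambda_1, \lambda_2)\right] - t, \tag{16}$$

where

$$\Delta_\beta(t; \lambda_1, \lambda_2) = \int_t^{\infty} \phi(x)\,\frac{d}{dx}\left(\int_{-\infty}^{\beta\sqrt{1+\lambda_1^2}+\Lambda(x)} \phi(u)\,du\right)dx.$$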

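Proof. By definition, the mean residual life function (MRLF) of $X$ is given by

$$\mu(t) = E(X - t \mid X > t) = E(X \mid X > t) - t, \tag{17}$$

where

$$E(X \mid X > t) = \frac{2}{R(t)(\alpha+2)}\int_t^{\infty} x\,\phi(x)\,dx + \frac{\alpha[\Phi(\beta)]^{-1}}{R(t)(\alpha+2)}\int_t^{\infty} x\,\phi(x)\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(x)\right)dx. \tag{18}$$

Since $\phi(\cdot)$ is the p.d.f of a standard normal variate, $\phi'(x) = -x\,\phi(x)$. Therefore (18) becomes

$$E(X \mid X > t) = \frac{2}{(\alpha+2)R(t)}\int_t^{\infty} -\phi'(x)\,dx + \frac{\alpha[\Phi(\beta)]^{-1}}{(\alpha+2)R(t)}\int_t^{\infty} -\phi'(x)\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(x)\right)dx. \tag{19}$$

On integrating (19) by parts, we obtain the following.

$$E(X \mid X > t) = \frac{2\,\phi(t)}{(\alpha+2)R(t)} + \frac{\alpha[\Phi(\beta)]^{-1}}{(\alpha+2)R(t)}\left[-\phi(x)\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda(x)\right)\right]_t^{\infty} + \frac{\alpha[\Phi(\beta)]^{-1}}{R(t)(\alpha+2)}\int_t^{\infty} \phi(x)\,\frac{d}{dx}\left(\int_{-\infty}^{\beta\sqrt{1+\lambda_1^2}+\Lambda(x)} \phi(u)\,du\right)dx. \tag{20}$$

On solving (20) and substituting in (17), we get (16).

The functions $R(t)$, $r(t)$ and $\mu(t)$ are equivalent in the sense that if one of them is given, the other two can be uniquely determined. ■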
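The three reliability measures can also be approximated directly from their definitions, $R(t) = \int_t^\infty f(x)\,dx$, $r(t) = f(t)/R(t)$ and $\mu(t) = E(X \mid X > t) - t$, without evaluating $\xi_\beta$ or $\Delta_\beta$. The sketch below takes that route. It assumes the density (4) in the form used in Definition 2.1; the function names, the truncation point `upper = 12.0` and the Simpson step count are illustrative assumptions.

```python
import math


def phi(x):
    """Standard normal p.d.f."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def msgnd_pdf(x, lam1, lam2, alpha, beta):
    """Density (4) as read in Definition 2.1 (an assumption of this sketch)."""
    eta = beta * math.sqrt(1.0 + lam1 * lam1) + lam1 * x / math.sqrt(1.0 + lam2 * x * x)
    return phi(x) / (alpha + 2.0) * (2.0 + alpha * Phi(eta) / Phi(beta))


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def reliability(t, pdf, upper=12.0):
    """R(t): integral of the density from t to a large truncation point."""
    return simpson(pdf, t, upper)


def failure_rate(t, pdf, upper=12.0):
    """r(t) = f(t) / R(t)."""
    return pdf(t) / reliability(t, pdf, upper)


def mean_residual_life(t, pdf, upper=12.0):
    """mu(t) = E(X | X > t) - t, with the expectation computed numerically."""
    numerator = simpson(lambda x: x * pdf(x), t, upper)
    return numerator / reliability(t, pdf, upper) - t


# Standard normal special case (alpha = 0): R(0) = 0.5 and
# r(0) = mu(0) = phi(0) / 0.5.
normal = lambda x: msgnd_pdf(x, 10.0, 3.0, 0.0, 0.4)
print(reliability(0.0, normal), failure_rate(0.0, normal), mean_residual_life(0.0, normal))
```

Passing the density as a function keeps the helpers reusable for the location-scale extension of Section 6.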

Next, through the following result we derive certain conditions under which the MSGND($\lambda_1, \lambda_2, \alpha, \beta$) is log-concave.

Result 5.4. The p.d.f of MSGND($\lambda_1, \lambda_2, \alpha, \beta$) is log-concave under the following two cases.

Case 1: For $x > 0$,

(i) when $\lambda_1 < 0$, for all $\alpha > 0$ and $\beta > 0$, and

(ii) when $\lambda_1 > 0$, provided $\left|\dfrac{3\lambda_1\lambda_2^2 x^3}{(1+\lambda_2 x^2)^{5/2}}\right| < \left|\dfrac{3\lambda_1\lambda_2 x}{(1+\lambda_2 x^2)^{3/2}}\right|$.

Case 2: For $x < 0$,

(i) when $\lambda_1 > 0$, for all $\alpha > 0$ and $\beta > 0$, and

(ii) when $\lambda_1 < 0$, provided $\left|\dfrac{3\lambda_1\lambda_2^2 x^3}{(1+\lambda_2 x^2)^{5/2}}\right| < \left|\dfrac{3\lambda_1\lambda_2 x}{(1+\lambda_2 x^2)^{3/2}}\right|$.

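Proof. To prove that $\log[f(x; \lambda_1, \lambda_2, \alpha, \beta)]$ is a concave function of $x$, it is enough to show that its second derivative is negative for all $x$. Thus

$$\frac{d}{dx}\log[f(x; \lambda_1, \lambda_2, \alpha, \beta)] = -x + \frac{\alpha[\Phi(\beta)]^{-1}\,\phi(\eta)\,\eta'}{2 + \alpha[\Phi(\beta)]^{-1}\,\Phi(\eta)}$$

and

$$\frac{d^2}{dx^2}\log[f(x; \lambda_1, \lambda_2, \alpha, \beta)] = -1 - \Delta_1 - \Delta_2 + \Delta_3,$$

in which

$$\Delta_1 = \frac{\alpha[\Phi(\beta)]^{-1}\,\eta\,\phi(\eta)\,\eta'^2}{2 + \alpha[\Phi(\beta)]^{-1}\,\Phi(\eta)}, \tag{21}$$

$$\Delta_2 = \frac{\alpha^2[\Phi(\beta)]^{-2}\,(\phi(\eta))^2\,\eta'^2}{\left[2 + \alpha[\Phi(\beta)]^{-1}\,\Phi(\eta)\right]^2} \tag{22}$$

and

$$\Delta_3 = \frac{\alpha[\Phi(\beta)]^{-1}\,\phi(\eta)\,\eta''}{2 + \alpha[\Phi(\beta)]^{-1}\,\Phi(\eta)}, \tag{23}$$

where

$$\eta = \Lambda(x) + \beta\sqrt{1+\lambda_1^2}, \qquad \eta' = \frac{\lambda_1}{\sqrt{1+\lambda_2 x^2}} - \frac{\lambda_1\lambda_2 x^2}{(1+\lambda_2 x^2)^{3/2}}, \qquad \eta'' = \frac{3\lambda_1\lambda_2^2 x^3}{(1+\lambda_2 x^2)^{5/2}} - \frac{3\lambda_1\lambda_2 x}{(1+\lambda_2 x^2)^{3/2}}.$$

Note that $\Delta_1 > 0$ for $\alpha > 0$ and $\eta > 0$. Here $\eta > 0$ for all values of $\lambda_1 > 0$ and $\beta > 0$. Consequently $\Delta_2 > 0$ for all values of $\alpha > 0$, $\beta > 0$ and $\lambda_1 > 0$. Also, $\Delta_3 < 0$ either when $\alpha < 0$ and $\eta'' > 0$ or when $\alpha > 0$ and $\eta'' < 0$. Hence (4) is log-concave in these situations. ■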

As a consequence of the above result, we have the following result.

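Result 5.5. The MSGND($\lambda_1, \lambda_2, \alpha, \beta$) density is strongly unimodal under the following two cases.

Case 1: For $x > 0$,

(i) if $\lambda_1 < 0$, for all $\alpha > 0$ and $\beta > 0$, and

(ii) if $\lambda_1 > 0$, provided $\left|\dfrac{3\lambda_1\lambda_2^2 x^3}{(1+\lambda_2 x^2)^{5/2}}\right| < \left|\dfrac{3\lambda_1\lambda_2 x}{(1+\lambda_2 x^2)^{3/2}}\right|$.

Case 2: For $x < 0$,

(i) if $\lambda_1 > 0$, for all $\alpha > 0$ and $\beta > 0$, and

(ii) if $\lambda_1 < 0$, provided $\left|\dfrac{3\lambda_1\lambda_2^2 x^3}{(1+\lambda_2 x^2)^{5/2}}\right| < \left|\dfrac{3\lambda_1\lambda_2 x}{(1+\lambda_2 x^2)^{3/2}}\right|$.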

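Result 5.6. The MSGND($\lambda_1, \lambda_2, \alpha, \beta$) density is plurimodal under the following two cases.

Case 1: For $x > 0$,

(i) if $\lambda_1 < 0$, for all $\alpha < 0$ and $\beta > 0$, and

(ii) if $\lambda_1 > 0$, provided $\left|\dfrac{3\lambda_1\lambda_2^2 x^3}{(1+\lambda_2 x^2)^{5/2}}\right| > \left|\dfrac{3\lambda_1\lambda_2 x}{(1+\lambda_2 x^2)^{3/2}}\right|$.

Case 2: For $x < 0$,

(i) if $\lambda_1 > 0$, for all $\alpha < 0$ and $\beta > 0$, and

(ii) if $\lambda_1 < 0$, provided $\left|\dfrac{3\lambda_1\lambda_2^2 x^3}{(1+\lambda_2 x^2)^{5/2}}\right| > \left|\dfrac{3\lambda_1\lambda_2 x}{(1+\lambda_2 x^2)^{3/2}}\right|$.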
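For a given parameter vector, the unimodal and plurimodal cases can be explored numerically by evaluating the density on a grid. The sketch below counts local maxima and checks the sign of the second differences of $\log f$. This is a numerical heuristic under stated assumptions, not a proof: it assumes the density (4) in the form of Definition 2.1, the grid limits and size are arbitrary, and the function names are illustrative.

```python
import math


def phi(x):
    """Standard normal p.d.f."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def msgnd_pdf(x, lam1, lam2, alpha, beta):
    """Density (4) as read in Definition 2.1 (an assumption of this sketch)."""
    eta = beta * math.sqrt(1.0 + lam1 * lam1) + lam1 * x / math.sqrt(1.0 + lam2 * x * x)
    return phi(x) / (alpha + 2.0) * (2.0 + alpha * Phi(eta) / Phi(beta))


def grid(lo, hi, n):
    """n equally spaced points from lo to hi inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def count_modes(pdf, lo=-6.0, hi=6.0, n=4001):
    """Number of interior grid points that are local maxima of pdf."""
    ys = [pdf(x) for x in grid(lo, hi, n)]
    return sum(1 for i in range(1, n - 1)
               if ys[i] > ys[i - 1] and ys[i] >= ys[i + 1])


def log_concave_on_grid(pdf, lo=-6.0, hi=6.0, n=4001, tol=1e-9):
    """True if every second difference of log(pdf) on the grid is <= tol.

    Requires pdf > 0 on the whole grid, otherwise math.log raises.
    """
    logs = [math.log(pdf(x)) for x in grid(lo, hi, n)]
    return all(logs[i - 1] - 2.0 * logs[i] + logs[i + 1] <= tol
               for i in range(1, n - 1))


# Parameter values taken from one panel of Figure 1.
figure_panel = lambda x: msgnd_pdf(x, 10.0, 3.0, -1.0, 0.4)
print(count_modes(figure_panel), log_concave_on_grid(figure_panel))
```

A grid check of this kind can miss narrow modes between grid points, so a finer grid is advisable when $\lambda_1$ is large.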

6. Extended form of MSGND

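In this section we discuss an extended form of MSGND($\lambda_1, \lambda_2, \alpha, \beta$) obtained by introducing a location parameter $\mu$ and a scale parameter $\sigma$.

Definition 6.1. Let $X \sim$ MSGND($\lambda_1, \lambda_2, \alpha, \beta$) with p.d.f given in (4). Then $Y = \mu + \sigma X$ is said to have an extended MSGND with the following p.d.f.

$$f^*(y; \mu, \sigma, \lambda_1, \lambda_2, \alpha, \beta) = \frac{\phi\!\left(\frac{y-\mu}{\sigma}\right)}{\sigma(\alpha+2)}\left[2 + \alpha[\Phi(\beta)]^{-1}\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda^*(y)\right)\right], \tag{24}$$

in which $\Lambda^*(y) = \dfrac{\lambda_1 (y-\mu)}{\sqrt{\sigma^2+\lambda_2 (y-\mu)^2}}$, $y \in R$, $\mu \in R$, $\lambda_1 \in R$, $\beta \in R$, $\sigma > 0$, $\lambda_2 > 0$ and $\alpha > -1$. A distribution with p.d.f (24) is denoted as EMSGND($\mu, \sigma; \lambda_1, \lambda_2, \alpha, \beta$). Clearly,

(i) when $\beta = 0$, the EMSGND($\mu, \sigma; \lambda_1, \lambda_2, \alpha, \beta$) reduces to the ESGND($\mu, \sigma; \lambda_1, \lambda_2, \alpha$) of [11];

(ii) when $\beta = 0$ and $\lambda_1 = 0$, the EMSGND($\mu, \sigma; \lambda_1, \lambda_2, \alpha, \beta$) reduces to the normal distribution;

(iii) when $\beta = 0$ and $\lambda_2 = 0$, the EMSGND($\mu, \sigma; \lambda_1, \lambda_2, \alpha, \beta$) reduces to the EGMNSN($\mu, \sigma; \alpha, \lambda$) of [9].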

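Now we obtain the following results for EMSGND($\mu, \sigma; \lambda_1, \lambda_2, \alpha, \beta$), in the same way as the corresponding results of Sections 3, 4 and 5.

Result 6.1. The cumulative distribution function (c.d.f) $F^*(y)$ of EMSGND($\mu, \sigma; \lambda_1, \lambda_2, \alpha, \beta$) with p.d.f (24) is the following, for $y \in R$.

$$F^*(y) = \frac{\Phi\!\left(\frac{y-\mu}{\sigma}\right)}{\sigma(\alpha+2)}\left\{2 + \frac{\alpha[\Phi(\beta)]^{-1}}{2}\right\} - \frac{\alpha[\Phi(\beta)]^{-1}}{\sigma(\alpha+2)}\,\xi_\beta(y, \Lambda^*(y)),$$

where $\xi_\beta(y, \Lambda^*(y))$ is as defined in Result 3.4.

Result 6.2. The characteristic function of EMSGND($\mu, \sigma; \lambda_1, \lambda_2, \alpha, \beta$) is given by

$$\psi_Y(t) = \frac{e^{it\mu - \frac{\sigma^2 t^2}{2}}}{\alpha+2}\left\{2 + \alpha[\Phi(\beta)]^{-1}\,E\!\left[\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda^*(z+\sigma^2 it)\right)\right]\right\},$$

where $\Lambda^*(z+\sigma^2 it) = \dfrac{\lambda_1 (z+\sigma^2 it)}{\sqrt{\sigma^2+\lambda_2 (z+\sigma^2 it)^2}}$.

Result 6.3. The reliability function $R^*(t)$ of $Y$ is the following, in which $\xi_\beta(t, \Lambda^*(y))$ is as defined in Result 3.4, with $\Lambda^*(y) = \dfrac{\lambda_1 (y-\mu)}{\sqrt{\sigma^2+\lambda_2 (y-\mu)^2}}$.

$$R^*(t) = \frac{1-\Phi\!\left(\frac{t-\mu}{\sigma}\right)}{\sigma(\alpha+2)}\left\{2 + \frac{\alpha[\Phi(\beta)]^{-1}}{2}\right\} + \frac{\alpha[\Phi(\beta)]^{-1}}{\sigma(\alpha+2)}\,\xi_\beta(t, \Lambda^*(y)).$$

Result 6.4. The failure rate $r^*(t)$ of $Y$ is given by

$$r^*(t) = \frac{\phi\!\left(\frac{t-\mu}{\sigma}\right)\left[2 + \alpha[\Phi(\beta)]^{-1}\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda^*(t)\right)\right]}{\left[1-\Phi\!\left(\frac{t-\mu}{\sigma}\right)\right]\left\{2 + \frac{\alpha[\Phi(\beta)]^{-1}}{2}\right\} + \alpha[\Phi(\beta)]^{-1}\,\xi_\beta(t, \Lambda^*(t))},$$

where $\Lambda^*(t) = \dfrac{\lambda_1 (t-\mu)}{\sqrt{\sigma^2+\lambda_2 (t-\mu)^2}}$.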

7. Maximum likelihood estimation

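The log-likelihood function, $\ln L$, of a random sample $y_1, \ldots, y_n$ of size $n$ from a population following EMSGND($\mu, \sigma; \lambda_1, \lambda_2, \alpha, \beta$) is the following.

$$\ln L = n\ln\!\left(\frac{1}{\sqrt{2\pi}}\right) - n\ln\sigma - n\ln(\alpha+2) - \frac{1}{2}\sum_{i=1}^{n}\frac{(y_i-\mu)^2}{\sigma^2} + \sum_{i=1}^{n}\ln\!\left(2 + \alpha[\Phi(\beta)]^{-1}\,\Phi\!\left(\beta\sqrt{1+\lambda_1^2} + \Lambda^*(y_i)\right)\right). \tag{25}$$

On differentiating (25) with respect to the parameters $\mu$, $\sigma$, $\lambda_1$, $\lambda_2$, $\alpha$ and $\beta$ and then equating to zero, we obtain the following normal equations, in which $\eta_i = \beta\sqrt{1+\lambda_1^2} + \Lambda^*(y_i)$ and $D_i = 2 + \alpha[\Phi(\beta)]^{-1}\,\Phi(\eta_i)$ for $i = 1, \ldots, n$.

$$\sum_{i=1}^{n}\frac{y_i-\mu}{\sigma^2} - \sum_{i=1}^{n}\frac{\alpha[\Phi(\beta)]^{-1}\,\phi(\eta_i)}{D_i}\,\frac{\lambda_1\sigma^2}{\left[\sigma^2+\lambda_2(y_i-\mu)^2\right]^{3/2}} = 0, \tag{26}$$

$$-\frac{n}{\sigma} + \sum_{i=1}^{n}\frac{(y_i-\mu)^2}{\sigma^3} - \sum_{i=1}^{n}\frac{\alpha[\Phi(\beta)]^{-1}\,\phi(\eta_i)}{D_i}\,\frac{\lambda_1\sigma(y_i-\mu)}{\left[\sigma^2+\lambda_2(y_i-\mu)^2\right]^{3/2}} = 0, \tag{27}$$

$$\sum_{i=1}^{n}\frac{\alpha[\Phi(\beta)]^{-1}\,\phi(\eta_i)}{D_i}\left[\frac{\Lambda^*(y_i)}{\lambda_1} + \frac{\beta\lambda_1}{\sqrt{1+\lambda_1^2}}\right] = 0, \tag{28}$$

$$-\frac{1}{2}\sum_{i=1}^{n}\frac{\alpha[\Phi(\beta)]^{-1}\,\phi(\eta_i)}{D_i}\,\frac{\lambda_1(y_i-\mu)^3}{\left[\sigma^2+\lambda_2(y_i-\mu)^2\right]^{3/2}} = 0, \tag{29}$$

$$-\frac{n}{\alpha+2} + \sum_{i=1}^{n}\frac{[\Phi(\beta)]^{-1}\,\Phi(\eta_i)}{D_i} = 0 \tag{30}$$

and

$$\sum_{i=1}^{n}\frac{\alpha[\Phi(\beta)]^{-1}\,\phi(\eta_i)\sqrt{1+\lambda_1^2}}{D_i} - \sum_{i=1}^{n}\frac{\alpha[\Phi(\beta)]^{-2}\,\phi(\beta)\,\Phi(\eta_i)}{D_i} = 0. \tag{31}$$

On solving the equations (26) to (31), we get the maximum likelihood estimates of the parameters of EMSGND($\mu, \sigma; \lambda_1, \lambda_2, \alpha, \beta$).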
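In practice the normal equations have no closed-form solution, so the log-likelihood is usually maximized numerically. The sketch below only evaluates the log-likelihood (25) for a given parameter vector, which is the objective any numerical optimizer would need. It assumes the density (24) with the term $\beta\sqrt{1+\lambda_1^2}$ inside $\Phi$; the function name `emsgnd_loglik` and its argument order are illustrative, and the paper itself reports using MATHCAD for the fitting.

```python
import math


def Phi(x):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def emsgnd_loglik(data, mu, sigma, lam1, lam2, alpha, beta):
    """Log-likelihood (25) of an EMSGND sample, under the stated reading of (24)."""
    if sigma <= 0.0 or alpha + 2.0 <= 0.0:
        raise ValueError("need sigma > 0 and alpha > -2")
    n = len(data)
    shift = beta * math.sqrt(1.0 + lam1 * lam1)   # beta * sqrt(1 + lam1^2)
    norm_beta = Phi(beta)                         # Phi(beta), underflows for very negative beta
    total = (-0.5 * n * math.log(2.0 * math.pi)
             - n * math.log(sigma)
             - n * math.log(alpha + 2.0))
    for y in data:
        d = y - mu
        lam_star = lam1 * d / math.sqrt(sigma * sigma + lam2 * d * d)
        total += -0.5 * (d / sigma) ** 2
        total += math.log(2.0 + alpha * Phi(shift + lam_star) / norm_beta)
    return total


# First few observations of Data set 1, evaluated at arbitrary trial values.
sample = [91, 102, 100, 117, 122, 115, 97, 109]
print(emsgnd_loglik(sample, 105.0, 8.0, 0.8, 0.3, 2.0, 8.0))
```

A general-purpose optimizer can then be applied to the negative of this function, with the constraints $\sigma > 0$, $\lambda_2 > 0$ and $\alpha > -1$ enforced by bounds or reparameterization.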

8. Generalized likelihood ratio test

In this section we discuss a test procedure for testing the parameter $\beta$ of the EMSGND. For testing the null hypothesis $H_0: \beta = 0$ against the alternative hypothesis $H_1: \beta \neq 0$ by using the generalized likelihood ratio test, the test statistic is

$$-2\ln\Lambda(x) = 2\left[\ln L(\hat{\Theta}; x) - \ln L(\hat{\Theta}^*; x)\right],$$

where $\hat{\Theta}$ is the maximum likelihood estimator of $\Theta = (\mu, \sigma, \lambda_1, \lambda_2, \alpha, \beta)$ with no restriction, and $\hat{\Theta}^*$ is the maximum likelihood estimator of $\Theta$ when $\beta = 0$. The test statistic is asymptotically distributed as $\chi^2$ with 1 degree of freedom. For further details see [12].
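The test reduces to a short computation once the two maximized log-likelihoods are available. The sketch below computes the statistic and its asymptotic p-value, using the identity $P(\chi^2_1 > x) = \operatorname{erfc}(\sqrt{x/2})$, which holds for one degree of freedom only. The function names are illustrative; the two log-likelihood values in the usage line are the ones reported for Data set 1 in Table 1.

```python
import math


def chi2_sf_1df(x):
    """Upper tail probability of a chi-square variate with 1 degree of freedom."""
    if x <= 0.0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0))


def glrt(loglik_full, loglik_null, level=0.05):
    """Return (statistic, p-value, reject?) for H0: beta = 0 against H1: beta != 0."""
    stat = 2.0 * (loglik_full - loglik_null)
    p_value = chi2_sf_1df(stat)
    return stat, p_value, p_value < level


# Log-likelihoods of EMSGND (unrestricted) and ESGND (beta = 0) for Data set 1.
stat, p_value, reject = glrt(-183.43, -193.257)
print(stat, p_value, reject)
```

With the rounded log-likelihoods the statistic comes out as about 19.654, which agrees with the value 19.6557 reported in Section 9 up to rounding.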

9. Applications

In this section we consider three real-life data applications of the EMSGND. The first data set is taken from [9]. It gives the Otis IQ scores for 52 non-white males hired by a large insurance company in 1971. The observed data are given below.

Data set 1:
91, 102, 100, 117, 122, 115, 97, 109, 108, 104, 108, 118, 103, 123, 123, 103, 106, 102, 118, 100, 103, 107, 108, 107, 97, 95, 119, 102, 108, 103, 102, 112, 99, 116, 114, 102, 111, 104, 122, 103, 111, 101, 91, 99, 121, 97, 109, 106, 102, 104, 107, 95.

The second data set is taken from [8]. It relates to the milk production of 28 cows; the variable under study is the daily milk production in kilograms, recorded for cows milked three times a day.

Data set 2:
34.6, 27.7, 29.2, 25.3, 27.6, 37.9, 32.6, 32, 30.7, 29.6, 38.3, 32.9, 30.8, 32.2, 32.9, 28.1, 33.9, 28.6, 28.1, 35.9, 34.8, 40.3, 30.9, 34.4, 19.8, 25.8, 37.3, 32.4.

The third data set is taken from [8]. The source data include 100 females and 102 males with 13 variables such as height, weight and body mass index (BMI). The variable under study is the BMI of the second 50 females. The data are given below.

Data set 3:
24.47, 23.99, 26.24, 20.04, 25.72, 25.64, 19.87, 23.35, 22.42, 20.42, 22.13, 25.17, 23.72, 21.28, 20.87, 19.00, 22.04, 20.12, 21.35, 28.57, 26.95, 28.13, 26.85, 25.27, 31.93, 16.75, 19.54, 20.42, 22.76, 20.12, 22.35, 19.16, 20.77, 19.37, 22.37, 17.54, 19.06, 20.30, 20.15, 25.36, 22.12, 21.25, 20.53, 17.06, 18.29, 18.37, 18.93, 17.79, 17.05, 20.31.

We obtained the maximum likelihood estimates (MLE) of the parameters for these data sets with the help of the MATHCAD software. The numerical results are presented in Table 1, which includes the estimated values of the parameters and the corresponding Kolmogorov-Smirnov statistic (KSS) values for the models ESGND($\mu, \sigma; \lambda_1, \lambda_2, \alpha$) and EMSGND($\mu, \sigma; \lambda_1, \lambda_2, \alpha, \beta$). The values of Akaike's Information Criterion (AIC), the Bayesian Information Criterion (BIC) and the corrected Akaike's Information Criterion (AICc) are also reported.

Table 1: Estimated values of the parameters for the models ESGND($\mu, \sigma; \lambda_1, \lambda_2, \alpha$) and EMSGND($\mu, \sigma; \lambda_1, \lambda_2, \alpha, \beta$), with the respective values of KSS, P-value, log-likelihood, AIC, BIC and AICc, for Data sets 1, 2 and 3.

Data set   Quantity          ESGND                 EMSGND
1          $\mu$             102.18167             106.654
           $\sigma$          4.94535               8
           $\alpha$          1.89352               2
           $\beta$           -                     8
           $\lambda_1$       6.23351               0.809
           $\lambda_2$       0.38068               0.349
           KSS               0.5                   0.129961
           P-value           3.91952 x 10^-12      0.315666
           Log-likelihood    -193.257              -183.43
           AIC               396.515               378.859
           BIC               406.271               390.567
           AICc              397.819               380.726

2          $\mu$             31.43211              31.5934
           $\sigma$          2.0174                4.464
           $\alpha$          0.96229               0.5
           $\beta$           -                     20.006
           $\lambda_1$       0.64746               6.002
           $\lambda_2$       15.91043              8
           KSS               0.451625              0.0783872
           P-value           9.83689 x 10^-6       0.98998
           Log-likelihood    -100.348              -81.1221
           AIC               210.697               174.244
           BIC               217.358               182.237
           AICc              213.424               178.244

3          $\mu$             20.1321               21.865
           $\sigma$          1.07639               3.33
           $\alpha$          1.17325               0.6
           $\beta$           -                     8
           $\lambda_1$       7.85813               7
           $\lambda_2$       10.13911              5
           KSS               0.5                   0.121453
           P-value           1.07824 x 10^-11      0.418815
           Log-likelihood    -324.56               -130.59
           AIC               659.119               273.18
           BIC               668.68                284.652
           AICc              660.483               275.134

It is clear from Table 1 that the EMSGND($\mu, \sigma; \lambda_1, \lambda_2, \alpha, \beta$) is a more appropriate model for all three data sets than the existing model ESGND($\mu, \sigma; \lambda_1, \lambda_2, \alpha$): in every case it has the smaller KSS, AIC, BIC and AICc values and the larger log-likelihood. We have plotted the histograms of the respective data sets along with the corresponding fitted ESGND and EMSGND densities in Figures 2, 3 and 4 respectively. They show that the EMSGND yields a better fit than the ESGND in all the cases. Thus, the model discussed in this paper provides more flexibility in modelling all three data sets due to the presence of the extra parameter.
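The information criteria in Table 1 follow from the maximized log-likelihood, the number of parameters $k$ and the sample size $n$ through the standard definitions $\mathrm{AIC} = 2k - 2\ln L$, $\mathrm{BIC} = k\ln n - 2\ln L$ and $\mathrm{AICc} = \mathrm{AIC} + 2k(k+1)/(n-k-1)$. The sketch below applies them to Data set 1, assuming $k = 5$ for the ESGND and $k = 6$ for the EMSGND (the counts of parameters in the two models) and $n = 52$; the function name is illustrative.

```python
import math


def information_criteria(loglik, k, n):
    """Return (AIC, BIC, AICc) for a model with k parameters fitted to n observations."""
    aic = 2.0 * k - 2.0 * loglik
    bic = k * math.log(n) - 2.0 * loglik
    aicc = aic + 2.0 * k * (k + 1) / (n - k - 1)
    return aic, bic, aicc


# Data set 1 (n = 52): ESGND with 5 parameters, EMSGND with 6 parameters.
print(information_criteria(-193.257, 5, 52))
print(information_criteria(-183.43, 6, 52))
```

The printed values can be compared with the AIC, BIC and AICc rows of Table 1 for Data set 1; small differences in the last digit come from the rounding of the reported log-likelihoods.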

Figure 2: Histogram of Data set 1 and fitted distributions (ESGND and EMSGND)

Figure 3: Histogram of Data set 2 and fitted distributions (ESGND and EMSGND)

Figure 4: Histogram of Data set 3 and fitted distributions (ESGND and EMSGND)

We also conduct a generalized likelihood ratio test to illustrate the usefulness of the model, as described below.

Consider the problem of testing the hypothesis $H_0: \beta = 0$ against $H_1: \beta \neq 0$ in the case of Data set 1. The MLEs and values of the likelihood for the ESGND and the EMSGND are

$\hat{\mu} = 102.18167$, $\hat{\sigma} = 4.94535$, $\hat{\lambda}_1 = 6.23351$, $\hat{\lambda}_2 = 0.38068$, $\hat{\alpha} = 1.89352$, $L(\hat{\Theta}^*; x) = 1.17315 \times 10^{-84}$

and

$\hat{\mu} = 106.654$, $\hat{\sigma} = 8$, $\hat{\lambda}_1 = 0.809$, $\hat{\lambda}_2 = 0.349$, $\hat{\alpha} = 2$, $\hat{\beta} = 8$, $L(\hat{\Theta}; x) = 2.17542 \times 10^{-80}$,

respectively. The value of the likelihood ratio (LR) test statistic is 19.6557. Since the critical value for the test at significance level 0.05 with one degree of freedom is 3.84, the null hypothesis is rejected.

Similarly, we consider the problem of testing $H_0: \beta = 0$ against $H_1: \beta \neq 0$ using Data set 2. The MLEs and values of the likelihood for the ESGND and the EMSGND are

$\hat{\mu} = 31.43211$, $\hat{\sigma} = 2.0174$, $\hat{\lambda}_1 = 0.64746$, $\hat{\lambda}_2 = 15.91043$, $\hat{\alpha} = 0.96229$, $L(\hat{\Theta}^*; x) = 2.62584 \times 10^{-44}$

and

$\hat{\mu} = 31.5934$, $\hat{\sigma} = 4.464$, $\hat{\lambda}_1 = 6.002$, $\hat{\lambda}_2 = 8$, $\hat{\alpha} = 0.5$, $\hat{\beta} = 20.006$, $L(\hat{\Theta}; x) = 5.87655 \times 10^{-36}$,

respectively. The value of the LR test statistic is 38.4525. Since the critical value for the test at significance level 0.05 with one degree of freedom is 3.84, the null hypothesis is rejected.

Similarly, we consider the problem of testing $H_0: \beta = 0$ against $H_1: \beta \neq 0$ using Data set 3. The MLEs and values of the likelihood for the ESGND and the EMSGND are

$\hat{\mu} = 20.1321$, $\hat{\sigma} = 1.07639$, $\hat{\lambda}_1 = 7.85813$, $\hat{\lambda}_2 = 10.13911$, $\hat{\alpha} = 1.17325$, $L(\hat{\Theta}^*; x) = 1.11045 \times 10^{-141}$

and

$\hat{\mu} = 21.865$, $\hat{\sigma} = 3.33$, $\hat{\lambda}_1 = 7$, $\hat{\lambda}_2 = 5$, $\hat{\alpha} = 0.6$, $\hat{\beta} = 8$, $L(\hat{\Theta}; x) = 1.92965 \times 10^{-57}$,

respectively. The value of the LR test statistic is 387.939. Since the critical value for the test at significance level 0.05 with one degree of freedom is 3.84, the null hypothesis is rejected.

10. Simulation Study

In order to assess the performance of the maximum likelihood estimators of the parameters of the EMSGND($\mu, \sigma; \lambda_1, \lambda_2, \alpha, \beta$), we have conducted a brief simulation study as follows. We simulated data sets of sizes 30, 50 and 100 from the EMSGND for the parameter values $\mu = 2$, $\sigma = 0.5$, $\lambda_1 = 0.8$, $\lambda_2 = 0.3$, $\alpha = 1$ and $\beta = 8$. We then obtained the maximum likelihood estimates of these parameters and computed the bias and mean square error (MSE). The results are presented in Table 2.

Table 2: Estimates of the parameters and the corresponding bias and mean square error (MSE).

Sample size   Parameter     Estimate     Bias                 MSE
30            $\mu$         1.988331     0.1883313            0.03546867
              $\sigma$      0.4833721    -0.0166279           0.0002764871
              $\lambda_1$   0.83         0.54                 0.2916
              $\lambda_2$   0.29         -1.71                2.9241
              $\beta$       7.98         1.98                 3.9204
              $\alpha$      1.37         1.07                 1.1449

50            $\mu$         1.99147      -0.008530334         7.27666 x 10^-5
              $\sigma$      0.4910136    -0.00898637          8.075485 x 10^-5
              $\lambda_1$   0.78         -0.02                4 x 10^-4
              $\lambda_2$   0.285        -0.015               0.000225
              $\beta$       7.87         -0.13                0.0169
              $\alpha$      1.36         0.36                 0.1296

100           $\mu$         1.994749     -0.005251242         2.757554 x 10^-5
              $\sigma$      0.5006707    0.0006706597         4.497845 x 10^-7
              $\lambda_1$   0.795        -0.005               2.5 x 10^-5
              $\lambda_2$   0.29         -0.01                1 x 10^-4
              $\beta$       7.9          -0.1                 0.01
              $\alpha$      1            -3.248735 x 10^-12   1.055428 x 10^-23

From Table 2, it can be observed that both the absolute bias and the MSE decrease as the sample size increases.
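The paper does not state how the samples were generated. One possible approach, sketched below as an assumption rather than as the authors' method, is rejection sampling with a standard normal proposal: for $\alpha \geq 0$ the bracket in (4) is at most $2 + \alpha/\Phi(\beta)$, so a normal draw $x$ can be accepted with probability $[2 + \alpha\,\Phi(\eta(x))/\Phi(\beta)]/[2 + \alpha/\Phi(\beta)]$ and then mapped to $y = \mu + \sigma x$. The function names are illustrative, and the sketch is restricted to $\alpha \geq 0$.

```python
import math
import random


def Phi(x):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sample_emsgnd(n, mu, sigma, lam1, lam2, alpha, beta, rng):
    """Draw n values by rejection sampling from the kernel of (24); alpha >= 0 only."""
    if alpha < 0.0:
        raise ValueError("this sketch only covers alpha >= 0")
    c = alpha / Phi(beta)
    shift = beta * math.sqrt(1.0 + lam1 * lam1)
    draws = []
    while len(draws) < n:
        x = rng.gauss(0.0, 1.0)                       # proposal from N(0, 1)
        eta = shift + lam1 * x / math.sqrt(1.0 + lam2 * x * x)
        if rng.random() * (2.0 + c) <= 2.0 + c * Phi(eta):
            draws.append(mu + sigma * x)              # location-scale transform
    return draws


def bias_and_mse(estimates, true_value):
    """Bias and mean square error of a list of estimates of one parameter."""
    m = len(estimates)
    bias = sum(estimates) / m - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / m
    return bias, mse


# Parameter values of the simulation study, sample size 30.
rng = random.Random(2024)
sample = sample_emsgnd(30, 2.0, 0.5, 0.8, 0.3, 1.0, 8.0, rng)
print(len(sample), min(sample), max(sample))

# Bias and MSE for a single estimate of beta = 8, as in the last block of Table 2.
print(bias_and_mse([7.9], 8.0))
```

Repeating the draw-and-fit step many times and passing the resulting estimates to `bias_and_mse` gives Monte Carlo versions of the quantities reported in Table 2.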

11. Summary and Conclusion

In this paper we proposed a wide class of distributions suitable for asymmetric as well as plurimodal situations. Certain structural properties of the distribution are derived, and its reliability properties as well as its unimodal and plurimodal properties are discussed. A location-scale extension of this class of distributions is also considered and its analogous properties are obtained. Further, we discussed the maximum likelihood estimation of the parameters of the model, illustrated the procedure using three real-life data sets, and showed that the model is more suitable for all the data sets than the existing model. We also constructed a test procedure for establishing the significance of the additional parameter $\beta$. In order to assess the performance of the maximum likelihood estimation procedure, we carried out a brief simulation study. The proposed model is shown to be more appropriate for asymmetric as well as plurimodal data sets. Certain characteristic properties as well as inferential aspects of the model are yet to be studied, and we hope to publish them in another article. Even though the proposed model is more flexible than the existing model from a practical point of view, there is scope for developing a further generalized version of it so as to model more complicated data sets. Such possibilities are under investigation, and we hope to publish them in another article shortly.

Acknowledgments

The authors would like to express their sincere thanks to the Editor in Chief, the Associate Editor and the anonymous referees for their valuable comments, which have helped to improve the quality and presentation of this paper.

References

[1] Arellano-Valle, R. B., Gomez, H. W. and Quintana, F. A. (2004). A new class of skew-normal distributions. Communications in Statistics-Theory and Methods, 33:1465-1480.

[2] Azzalini, A. (1985). A class of distributions which includes the normal ones. Scandinavian Journal of Statistics, 12:171-178.

[3] Azzalini, A. (1986). Further results on a class of distributions which includes the normal ones. Statistica, 46:199-208.

[4] Azzalini, A. and Dalla Valle, A. (1996). The multivariate skew-normal distribution. Biometrika, 83:715-726.

[5] Branco, M. D. and Dey, D. K. (2001). A general class of multivariate skew-elliptical distributions. Journal of Multivariate Analysis, 79:99-113.

[6] Henze, N. (1986). A probabilistic representation of the 'skew-normal' distribution. Scandinavian Journal of Statistics, 13:271-275.

[7] Kumar, C. S. and Anila, G. V. (2018). Asymmetric Curved Normal Distribution. Journal of Statistical Research, 52:173-186.

[8] Kumar, C. S. and Anila, G. V. (2018). On Some Aspects of a Generalized Asymmetric Normal Distribution. Statistica, 77:161-179.

[9] Kumar, C. S. and Anusree, M. R. (2011). On a generalized mixture of standard normal and skew normal distributions. Statistics & Probability Letters, 81:1813-1821.

[10] Kumar, C. S. and Anusree, M. R. (2014). On a modified class of generalized skew normal distribution. South African Statistical Association, 48:111-124.

[11] Kumar, C. S. and Anusree, M. R. (2015). On an extended version of skew generalized normal distribution and some of its properties. Communications in Statistics-Theory and Methods, 44:573-586.

[12] Rohatgi, V. K. and Saleh, A. M. E. (2015). An Introduction to Probability and Statistics. John Wiley & Sons, New York.
