Statistical Analysis of Neonatal Mortality: A Case Study of Ethiopia

Background: Neonatal mortality is a significant public health concern worldwide. It is estimated that four million neonatal deaths occur annually, 98% of which occur in developing countries. In Ethiopia, neonatal mortality is a major problem, accounting for more than 42% of under-five deaths. This study investigates the determinants of neonatal mortality in Ethiopia using data collected in the Ethiopia Demographic and Health Survey. Methods: The survey collected information from a total of 16,515 women aged 15-49 years, of whom 9209 were considered in this study. Descriptive statistics and Poisson, negative binomial, zero-inflated Poisson and zero-inflated negative binomial regression models were used for data analysis, with household, maternal, socio-demographic, socio-economic and environmental variables as explanatory variables and the number of neonatal deaths per woman as the response variable. Results: The four models were compared using a variety of statistical techniques, and the zero-inflated negative binomial model was found to fit better than the other models. Based on the descriptive statistics, 23.2% of mothers have experienced at least one neonatal death in their lifetime. According to the zero-inflated negative binomial regression model, being born to mothers whose age at first birth is at least 20 years, whose level of education is secondary and above, who reside in urban areas and who attended at least four antenatal care visits significantly decreases the risk of neonatal mortality. Conclusion: Neonatal mortality must decline more rapidly to achieve the Millennium Development Goal (MDG) target for under-five mortality in Ethiopia. Increasing access to maternal and child health services in rural areas, improving the level of education of mothers, encouraging utilization of antenatal care services and improving access to safe piped drinking water are recommended as possible interventions.
*Corresponding author: Berhanu Teshome Woldeamanuel, Department of Statistics, Mekelle University, Ethiopia, Tel: +251910118464; E-mail: berteshome19@gmail.com
Received: February 09, 2018; Accepted: April 23, 2018; Published: April 30, 2018
Citation: Woldeamanuel BT (2018) Statistical Analysis of Neonatal Mortality: A Case Study of Ethiopia. J Preg Child Health 5: 373. doi:10.4172/2376127X.1000373
Copyright: © 2018 Woldeamanuel BT. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Maternal health before, during and after pregnancy, conditions at the time of labour and delivery and post-natal care of babies play a significant role in reducing neonatal mortality. The major direct causes of neonatal deaths vary from region to region and from country to country. Countries with high estimates of Neonatal Mortality Rate (NMR) (NMR>45/1,000 live births) have a larger proportion of deaths (nearly 50%) that is attributable to infections such as pneumonia, diarrhea and tetanus.
In Ethiopia, mortality trends can be examined by comparing data from the DHS conducted in 2000, 2005 and 2011. Infant and under-five mortality rates obtained from these surveys show a continuous declining trend. The neonatal mortality rate, on the other hand, decreased from 49 deaths per 1,000 live births in 2000 to 39 deaths per 1,000 live births in 2005 but then remained almost stable, at 37 deaths per 1,000 live births in 2011. Approximately 42% of under-five mortality in Ethiopia is attributable to neonatal deaths. According to the 2011 Ethiopia Demographic and Health Survey (DHS), the country is experiencing a high neonatal mortality rate of 37 per 1,000 live births, comparable to the average rate of 35.9 per 1,000 live births for the African region overall [3,4].

Introduction
The neonatal period (birth to the 28th day of life) is the most vulnerable and high-risk time in life, because the incidence of mortality and morbidity is higher during this period than at any other. Neonatal mortality refers to the probability that a live-born infant dies within the first 28 days of life. An estimated 40 percent of deaths in children less than five years of age occur during the first 28 days of life [1]. The remaining 60 percent of deaths occur during the subsequent 1800 days of the first five years of life. The average daily mortality rate during the neonatal period is close to 30-fold higher than during the post-neonatal period (one month to one year of age). During 2010, an estimated 7.7 million children under five years of age died worldwide. This included 3.1 million neonatal deaths, 2.3 million post-neonatal deaths (age one month to one year) and 2.3 million childhood deaths (age 1-5 years) [2].
Despite the very high burden of mortality, the problem of neonatal mortality received little attention until relatively recently. There is now a growing consensus within the international community that increased efforts are needed to reduce newborn deaths if further progress is to be made in reducing child mortality. Neonatal Mortality (NM) has become a significant public health concern worldwide. Of the 130 million babies born every year, a total of 4 million die in the neonatal period. In Africa, the neonatal mortality rate is high (41 neonatal deaths per 1000 live births), and in the Sub-Saharan region (Eastern, Western and Central Africa) it ranges between 42 and 49 neonatal deaths per 1000 live births. Every year about 300,000 African babies die on the day of their birth. As in many other countries in the Sub-Saharan region, babies born in Ethiopia are at risk of dying in the neonatal period; the neonatal mortality rate remained at 37 deaths per 1,000 live births in 2011 [5].
In Ethiopia, there are limited studies that look at the association between each determinant and NM. In addition, to our knowledge no studies have used the recent Demographic and Health Survey conducted in Ethiopia in 2011 to address this problem. Moreover, NM in Ethiopia is an important concern, and studying it is essential for monitoring current health programs and formulating policies to improve the current situation. From such an analysis, concrete recommendations may be given to policy makers and health program managers for improving health policies and the maternal and child care strategy. An understanding of the factors related to neonatal mortality is therefore important to guide the development of focused and evidence-based health interventions to prevent neonatal deaths. The purpose of this study is therefore to identify and analyse the possible risk factors of neonatal mortality in Ethiopia using count data models.

Materials and Methods

Data
The current study is based on the Ethiopia Demographic and Health Survey 2011 data. The survey is part of the worldwide MEASURE DHS project funded by the United States Agency for International Development (USAID) and was conducted by the Central Statistical Agency (CSA). It was designed to provide estimates of the health and demographic characteristics of the country. Nationally, 16,515 women aged 15-49 were included in the survey.
This study is based on retrospective data from women of childbearing age (15-49 years). Information on neonatal mortality was obtained by asking mothers relevant questions about their live births, including when each child was born and whether the child was still alive. If the baby had died within the first month after birth, the age at death in days was recorded; neonatal mortality refers to children who died within the first 28 days after birth. Finally, responses from 9,209 women aged 15-49 who had given birth were analysed using count regression models to determine the predictors of neonatal mortality in the region.

Variables in the study
The dependent variable of the study is the number of neonatal deaths that a mother has experienced, counted as 0, 1, 2, … The most important expected determinants of neonatal mortality included in this study, chosen on the basis of their theoretical justification, are given in Table 1.

Methodology
Modeling count variables is a common task in public health research and the social sciences. The classical Poisson regression model for count data is often of limited use in these disciplines because empirical count data sets typically exhibit overdispersion and/or an excess number of zeros. Hence, in modeling count data, it is usual practice to fit a Poisson regression model first and then to correct for overdispersion if it exists. One approach to modeling overdispersion is the quasi-likelihood estimation technique [1,6]. Besides overdispersion, it is also common for real-life count data to exhibit excess zeros. While the negative binomial model can deal with overdispersion, the hurdle model and the zero-inflated Poisson and negative binomial regression models can handle excess zeros, each in its own way. In many practical cases, count data are skewed, non-negative, overdispersed and have excess zeros. These features have motivated the application of various methods and models for count data regression. Hence, this paper provides a practical approach to modeling count data, focusing on data that exhibit overdispersion and excess zeros [6,7].

Count data models
One of the crucial questions in statistical analysis of count data is how to formulate an adequate probability model to describe observed variation of counts. Because of the restrictive nature of equidispersion assumption in standard Poisson model, researchers have developed techniques and tests that allow detecting the overdispersion (or under-dispersion) in the population. Recently, Zero-inflated models have been developed to take into account the excess of zeroes in the data. The Zero-inflated model can be seen as a finite mixture model where one distribution is considered as a degenerate process with a unit point mass at zero [8].

Poisson regression model (PR)
Consider n independent random variables Y_1, Y_2, …, Y_n, where y_i denotes the number of neonatal deaths experienced by the i-th mother in her lifetime. The Poisson regression model for count data assumes that each observed count y_i is drawn from a Poisson distribution with rate parameter λ_i, i = 1, 2, …, n. Let X_i denote the vector of covariates for the i-th mother.
Then the Poisson probability mass function with rate parameter λ_i is given by:

$$P(Y_i = y_i) = \frac{e^{-\lambda_i}\lambda_i^{y_i}}{y_i!}, \quad y_i = 0, 1, 2, \ldots$$

The mean of the response variable, λ_i, is related to the linear predictor through the so-called link function [9]. Now let X be an n × (k+1) matrix of explanatory variables. The relationship between Y_i and the i-th row of X, X_i, is expressed through the canonical (log) link function g(λ_i):

$$g(\lambda_i) = \log(\lambda_i) = X_i^{T}\beta = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik}$$

where X_i = (x_{i0}, x_{i1}, …, x_{ik}) is the i-th row of the covariate matrix (with x_{i0} = 1 representing the intercept term) and β = (β_0, β_1, …, β_k) is the unknown (k+1)-dimensional vector of regression parameters.
Together, the above two equations are called the Poisson regression model or log-linear model.
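As a small illustration of how the log link turns a covariate vector into a Poisson rate, the following sketch evaluates the rate and the probability mass function for two covariate profiles. The coefficient values and the single "rural residence" indicator are hypothetical and are not estimates from this study.

```python
import math

def poisson_rate(x, beta):
    """Log link: lambda_i = exp(x_i' beta); x carries a leading 1 for the intercept."""
    return math.exp(sum(xj * bj for xj, bj in zip(x, beta)))

def poisson_pmf(y, lam):
    """P(Y = y) for a Poisson count with mean lam (computed on the log scale)."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

# Hypothetical coefficients: (intercept, rural residence indicator).
beta = [-1.5, 0.3]
x_rural = [1.0, 1.0]
x_urban = [1.0, 0.0]

lam_rural = poisson_rate(x_rural, beta)
lam_urban = poisson_rate(x_urban, beta)

# With a log link, exp(coefficient) is the ratio of expected counts.
rate_ratio = lam_rural / lam_urban

# Probabilities over a wide range of counts sum to (almost exactly) one.
probs = [poisson_pmf(y, lam_rural) for y in range(50)]
print(round(rate_ratio, 4), round(sum(probs), 6))
```

Because the link is logarithmic, a one-unit change in a covariate multiplies the expected count by exp(β_j), which is why exponentiated coefficients are reported as ratios.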

Negative binomial regression model (NBR)
The standard Poisson regression model is inadequate when the observations are overdispersed or when the data contain excess zeros. One way of modeling such overdispersed count data is the negative binomial (NB) distribution, which can arise as a gamma mixture of Poisson distributions.
Let Y_1, Y_2, ..., Y_n be a set of n independent random variables. Under one parameterization, the negative binomial probability mass function is:

$$P(Y_i = y_i) = \frac{\Gamma(y_i + 1/\alpha)}{y_i!\,\Gamma(1/\alpha)} \left(\frac{1}{1+\alpha\lambda_i}\right)^{1/\alpha} \left(\frac{\alpha\lambda_i}{1+\alpha\lambda_i}\right)^{y_i}, \quad y_i = 0, 1, 2, \ldots$$

with mean E(Y_i) = λ_i and variance Var(Y_i) = λ_i(1 + αλ_i), where α ≥ 0 is the dispersion parameter. Covariates can be introduced into a regression model based on the NB distribution via the relationship:

$$\log(\lambda_i) = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij}$$

where x_{ij} is the i-th observation on the j-th covariate, k is the number of covariates in the model and β_j is the j-th regression parameter [10,11].
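The sketch below evaluates a negative binomial probability mass function numerically and checks its first two moments. It assumes the common parameterization with mean λ and variance λ + αλ² (often called NB2); the values of λ and α are hypothetical.

```python
import math

def nb_pmf(y, lam, alpha):
    """Negative binomial pmf with mean lam and dispersion alpha (NB2 form, assumed)."""
    r = 1.0 / alpha
    log_p = (math.lgamma(y + r) - math.lgamma(y + 1) - math.lgamma(r)
             + r * math.log(1.0 / (1.0 + alpha * lam))
             + y * math.log(alpha * lam / (1.0 + alpha * lam)))
    return math.exp(log_p)

lam, alpha = 0.3, 1.5          # hypothetical values, not estimates from the study
support = range(500)           # wide enough that the truncated tail is negligible
probs = [nb_pmf(y, lam, alpha) for y in support]

mean = sum(y * p for y, p in zip(support, probs))
var = sum((y - mean) ** 2 * p for y, p in zip(support, probs))

# The variance exceeds the mean by alpha * lam^2, which is the overdispersion.
print(round(mean, 4), round(var, 4), round(lam + alpha * lam ** 2, 4))
```

Setting α close to zero makes the variance approach the mean, which is the sense in which the Poisson model is a limiting case of the negative binomial.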

Zero-inflated regression models
One characteristic of the Poisson distribution is that its mean is equal to its variance; however, when there are excess zeros, the standard model predicts fewer zeros than are actually observed. The tendency of standard models to underpredict zeros and overpredict ones is very common, and it can be severe when the distribution contains a lot of zeros. In such cases, zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models can be used to account for the excess zeros. The zero values in the ZIP model can be viewed as comprising two parts. One portion of the zero counts arises from the inflated part of the distribution and the other portion is what would be expected from a Poisson distribution with parameter λ [12].

Zero-inflated Poisson regression model (ZIP):
Generally, the zero-inflated probability mass function has the form:

$$P(Y_i = y_i) = \begin{cases} \Phi_i + (1-\Phi_i)\,f(0), & y_i = 0 \\ (1-\Phi_i)\,f(y_i), & y_i > 0 \end{cases}$$

where f(·) is the probability mass function of the underlying count distribution. If the Y_i are independent random variables having a zero-inflated Poisson distribution, the zeros are assumed to arise in two ways corresponding to distinct underlying states. The first state occurs with probability Φ_i and produces only zeros, while the other state occurs with probability (1 - Φ_i) and leads to a standard Poisson count with mean λ_i. In general, the zeros from the first state are called structural zeros and those from the Poisson distribution are called sampling zeros. This two-state process gives a simple two-component mixture distribution with probability mass function:

$$P(Y_i = y_i) = \begin{cases} \Phi_i + (1-\Phi_i)\,e^{-\lambda_i}, & y_i = 0 \\ (1-\Phi_i)\,\dfrac{e^{-\lambda_i}\lambda_i^{y_i}}{y_i!}, & y_i = 1, 2, \ldots \end{cases}$$

where λ_i is the mean of the underlying Poisson component, which can be modelled with the associated explanatory covariates using a natural logarithmic link function as:

$$\log(\lambda_i) = X_i^{T}\beta$$

where X_i = (1, x_{i1}, x_{i2}, …, x_{ik}) is a (k+1) × 1 vector of explanatory variables for the i-th subject and β is the (k+1) × 1 vector of regression coefficients. Φ_i (0 < Φ_i < 1) is the probability of an excess zero (being in the zero-mortality state), determined by a logit model. To predict membership in the "Always Zero" group, we can use the same variables, a smaller subset of the variables or even different variables altogether, that is:

$$\text{logit}(\Phi_i) = \log\left(\frac{\Phi_i}{1-\Phi_i}\right) = Z_i^{T}\gamma$$

where Z_i is the vector of covariates used for the zero-inflation part and γ is the corresponding vector of coefficients.
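To make the split between structural and sampling zeros concrete, the following sketch evaluates a zero-inflated Poisson probability mass function. The linear predictor for the logit part and the Poisson mean are hypothetical values chosen only for illustration.

```python
import math

def logistic(eta):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def zip_pmf(y, lam, phi):
    """Zero-inflated Poisson pmf: excess-zero probability phi, Poisson mean lam."""
    poisson = math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))
    if y == 0:
        return phi + (1.0 - phi) * poisson
    return (1.0 - phi) * poisson

phi = logistic(0.4)   # hypothetical linear predictor for the "always zero" state
lam = 0.8             # hypothetical mean of the Poisson state

p_zero = zip_pmf(0, lam, phi)
structural_zeros = phi                          # zeros from the always-zero state
sampling_zeros = (1.0 - phi) * math.exp(-lam)   # zeros produced by the Poisson state

support = range(60)
total = sum(zip_pmf(y, lam, phi) for y in support)
mean = sum(y * zip_pmf(y, lam, phi) for y in support)
print(round(p_zero, 4), round(structural_zeros, 4), round(sampling_zeros, 4))
```

The overall mean of the mixture is (1 - Φ)λ, so a model that ignores the structural zeros must lower its single rate parameter to match the data, which is what leads to underpredicted zeros.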

Zero-inflated negative binomial regression model (ZINB):
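The main difference between the ZIP and ZINB models is that the Poisson distribution for the count data is replaced by the negative binomial distribution. The probability mass function of a zero-inflated negative binomial distribution is a simple modification of the ZIP and is given by:

$$P(Y_i = y_i) = \begin{cases} \Phi_i + (1-\Phi_i)\left(\dfrac{1}{1+\alpha\lambda_i}\right)^{1/\alpha}, & y_i = 0 \\ (1-\Phi_i)\,\dfrac{\Gamma(y_i + 1/\alpha)}{y_i!\,\Gamma(1/\alpha)} \left(\dfrac{1}{1+\alpha\lambda_i}\right)^{1/\alpha} \left(\dfrac{\alpha\lambda_i}{1+\alpha\lambda_i}\right)^{y_i}, & y_i = 1, 2, \ldots \end{cases}$$

where λ_i is the mean of the underlying negative binomial component, which can be modelled with the associated explanatory covariates using the natural logarithm link function defined for the ZIP model, and Φ_i is the probability of excess zeros, which can be estimated by the logit model defined for the ZIP model [12].
The ZINB model, like the ZIP model, is a special case of a two-class finite mixture model, with mean and variance

$$E(Y_i) = (1-\Phi_i)\lambda_i, \qquad \mathrm{Var}(Y_i) = (1-\Phi_i)\lambda_i\left(1 + \lambda_i(\Phi_i + \alpha)\right)$$

where the parameters λ_i and Φ_i depend on the covariates and α ≥ 0 is a scalar. Thus we have overdispersion whenever either Φ_i or α is greater than 0. The ZINB probability mass function reduces to the NB when Φ_i = 0 and to the ZIP when α = 0.
The likelihood function for the ZINB model is:

$$L = \prod_{i:\,y_i = 0}\left[\Phi_i + (1-\Phi_i)\left(\frac{1}{1+\alpha\lambda_i}\right)^{1/\alpha}\right] \prod_{i:\,y_i > 0}(1-\Phi_i)\,\frac{\Gamma(y_i + 1/\alpha)}{y_i!\,\Gamma(1/\alpha)} \left(\frac{1}{1+\alpha\lambda_i}\right)^{1/\alpha} \left(\frac{\alpha\lambda_i}{1+\alpha\lambda_i}\right)^{y_i}$$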
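A minimal sketch of the corresponding log-likelihood is given below, assuming the NB parameterization with mean λ and variance λ + αλ². The counts and parameter values are toy numbers, and the function names are illustrative rather than taken from any package.

```python
import math

def nb_log_pmf(y, lam, alpha):
    """Log of the negative binomial pmf (mean lam, dispersion alpha; NB2 form assumed)."""
    r = 1.0 / alpha
    return (math.lgamma(y + r) - math.lgamma(y + 1) - math.lgamma(r)
            + r * math.log(1.0 / (1.0 + alpha * lam))
            + y * math.log(alpha * lam / (1.0 + alpha * lam)))

def zinb_log_likelihood(ys, lams, phis, alpha):
    """Sum of log ZINB probabilities; zeros and positive counts are handled separately."""
    total = 0.0
    for y, lam, phi in zip(ys, lams, phis):
        if y == 0:
            nb_zero = (1.0 + alpha * lam) ** (-1.0 / alpha)
            total += math.log(phi + (1.0 - phi) * nb_zero)
        else:
            total += math.log(1.0 - phi) + nb_log_pmf(y, lam, alpha)
    return total

# Toy data and hypothetical parameter values (not from the study).
ys = [0, 0, 0, 1, 0, 2, 0, 0, 3, 0]
lams = [0.5] * len(ys)
phis = [0.3] * len(ys)
alpha = 0.8

ll_zinb = zinb_log_likelihood(ys, lams, phis, alpha)
# With no zero inflation (phi = 0) the ZINB log-likelihood equals the NB one.
ll_no_inflation = zinb_log_likelihood(ys, lams, [0.0] * len(ys), alpha)
ll_nb = sum(nb_log_pmf(y, lam, alpha) for y, lam in zip(ys, lams))
print(round(ll_zinb, 4), round(ll_no_inflation, 4), round(ll_nb, 4))
```

In practice λ_i and Φ_i would be computed from covariates through the log and logit links, and the log-likelihood would be maximized numerically over the regression coefficients and α.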

Assessing model fit
Overall regression test: After fitting a model, it is good practice to assess how well it fits. The null and alternative hypotheses for the overall test may be stated symbolically as:

$$H_0: \beta_1 = \beta_2 = \beta_3 = \cdots = \beta_k = 0 \quad \text{versus} \quad H_a: \text{not all } \beta_j = 0, \; j = 1, 2, 3, \ldots, k$$

This can be tested using the deviance test with k degrees of freedom. The deviance (log-likelihood ratio statistic) is given by Long [13]:

$$D = -2\left(\ell_0 - \ell_1\right)$$

where ℓ_0 is the maximized log-likelihood of the intercept-only model and ℓ_1 is that of the fitted model. We reject the null hypothesis that all regression coefficients are zero for large values of the test statistic at a given level of significance.
Test for significance of a single variable: The null hypothesis for this test may be stated as "Factor X_i does not add any value to the prediction of the response given that the other factors are already included in the model." To test such a null hypothesis, one can use:

$$t = \frac{\hat{\beta}_i}{SE(\hat{\beta}_i)}$$

where β̂_i is the estimated regression coefficient and SE(β̂_i) is the estimate of its standard error. This test statistic has the t-distribution with n-k-1 degrees of freedom.

Test for over dispersion parameter:
The negative binomial regression model reduces to the Poisson regression model when the overdispersion parameter is not significantly different from zero. To assess the adequacy of the negative binomial model over the Poisson regression model, we can test the hypothesis:

$$H_0: \alpha = 0 \quad \text{versus} \quad H_a: \alpha > 0$$

This is a test of the significance of the overdispersion parameter α.
The presence of the overdispersion parameter α in the NB regression model is justified when the null hypothesis H_0: α = 0 is rejected. A score test statistic is proposed for testing this hypothesis. The general score test statistic for testing H_0 can be given by:

$$S = \frac{\left[\sum_{i=1}^{n}\left((y_i - \hat{\lambda}_i)^2 - y_i\right)\right]^2}{2\sum_{i=1}^{n}\hat{\lambda}_i^2}$$

where λ̂_i is the predicted value from the Poisson regression model.
Under the null hypothesis that the data follow a Poisson model, the limiting distribution of the score statistic is chi-squared with one degree of freedom [14,15].
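The score test can be computed from the observed counts and the fitted Poisson means alone. The sketch below assumes the commonly used form of the statistic, in which the squared sum of (y_i - λ̂_i)² - y_i is divided by twice the sum of squared fitted means; the data are a toy sample and the fitted values come from an intercept-only Poisson model (the sample mean), not from the study.

```python
import math

def score_test_overdispersion(ys, fitted):
    """Score statistic for H0: alpha = 0 and its chi-square(1) p-value."""
    numerator = sum((y - m) ** 2 - y for y, m in zip(ys, fitted)) ** 2
    denominator = 2.0 * sum(m * m for m in fitted)
    stat = numerator / denominator
    # Upper tail of a chi-square with 1 df: P(X > t) = erfc(sqrt(t / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Toy counts with many zeros and a long right tail (hypothetical data).
ys = [0] * 77 + [1] * 10 + [2] * 5 + [4] * 4 + [6] * 4
sample_mean = sum(ys) / len(ys)
fitted = [sample_mean] * len(ys)   # intercept-only Poisson fit

stat, p_value = score_test_overdispersion(ys, fitted)
print(round(stat, 2), p_value)
```

A large statistic relative to the chi-square distribution with one degree of freedom indicates that the variance exceeds what the Poisson model allows.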

Goodness of-fit tests:
In this section, several goodness-of-fit measures are briefly discussed, including the likelihood ratio test, the Vuong statistic, the Akaike information criterion (AIC) and the Bayesian information criterion (BIC).
Likelihood ratio: An advantage of using the maximum likelihood method is that the likelihood ratio test may be employed to assess the adequacy of the negative binomial over the Poisson, because the negative binomial reduces to the Poisson when the dispersion parameter, α, equals zero. In this study a likelihood ratio test was used to compare the Poisson with the negative binomial and the zero-inflated Poisson with the zero-inflated negative binomial, since the Poisson is nested in the negative binomial and the zero-inflated Poisson is nested in the zero-inflated negative binomial. However, it was not used to compare the Poisson or negative binomial with the zero-inflated Poisson or zero-inflated negative binomial, because these models are not nested in one another.
The likelihood ratio statistic is given by:

$$T = -2\left(\ell_0 - \ell_1\right)$$

where ℓ_1 and ℓ_0 are the model's log-likelihoods under the alternative and null hypotheses, respectively. T has a chi-square distribution with one degree of freedom. This method is not appropriate for models that are not nested. In such situations, we use other methods such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC).

Akaike information criterion (AIC)
AIC is the most common means of identifying the best-fitting model among two or more candidate models. It tries to balance the goodness of fit against the complexity of the model. It is given by:

$$AIC = -2\ell + 2k$$

where ℓ denotes the log-likelihood of the model being compared with the other models and k is the number of parameters in the model including the intercept. A good model is the one with the minimum AIC value [16,17].
The BIC is defined as:

$$BIC = -2\ell + k\log(n)$$

where ℓ denotes the log-likelihood of the model, n is the sample size and k is the number of parameters in the model including the intercept. For this measure, the smaller the BIC, the better the model [18].
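The likelihood ratio statistic and the two information criteria are simple functions of the maximized log-likelihood, as the sketch below shows. The log-likelihood values and parameter counts are hypothetical placeholders; only the sample size of 9209 women is taken from the study.

```python
import math

def lr_statistic(ll_null, ll_alt):
    """Likelihood ratio statistic T = -2 (l0 - l1) for nested models."""
    return -2.0 * (ll_null - ll_alt)

def chi2_sf_1df(t):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(t / 2.0))

def aic(ll, k):
    return -2.0 * ll + 2.0 * k

def bic(ll, k, n):
    return -2.0 * ll + k * math.log(n)

n = 9209                               # number of women analysed in the study
ll_poisson, k_poisson = -6950.0, 28    # hypothetical log-likelihood and parameter count
ll_nb, k_nb = -6620.0, 29              # hypothetical; one extra parameter (alpha)

t_stat = lr_statistic(ll_poisson, ll_nb)
p_value = chi2_sf_1df(t_stat)
print(t_stat, p_value)
print(aic(ll_poisson, k_poisson), aic(ll_nb, k_nb))
print(round(bic(ll_poisson, k_poisson, n), 2), round(bic(ll_nb, k_nb, n), 2))
```

BIC penalizes each additional parameter by log(n), which is about 9.1 for this sample size, so it favours simpler models more strongly than AIC does.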

Vuong statistic
Vuong introduced a test that is well suited to comparing zero-inflated regression models with standard, non-nested models for count data. Suppose f_1(y_i|x_i) and f_2(y_i|x_i) denote the probability mass functions of the zero-inflated model (ZIP or ZINB) and the parent (traditional) model (Poisson or NB), respectively, and let m_i = log[f_1(y_i|x_i) / f_2(y_i|x_i)]. We want to test the following hypotheses:

$$H_0: E(m_i) = 0 \quad \text{versus} \quad H_a: E(m_i) \neq 0$$

The test statistic is

$$V = \frac{\sqrt{n}\,\bar{m}}{s_m}$$

where m̄ and s_m are the mean and standard deviation of the m_i. For large sample sizes and under the null hypothesis, the statistic V has an asymptotic standard normal distribution; significantly positive values favour the zero-inflated model. Within the family of ZIP models, testing whether a Poisson model is adequate corresponds to testing:

$$H_0: \Phi = 0 \quad \text{versus} \quad H_a: \Phi > 0$$

Possible tests include the likelihood ratio test (LRT) and the score test. For a general ZIP regression model, the LRT for zero inflation is given by:

$$LRT = 2\left[\ell(\hat{\lambda}, \hat{\Phi}) - \ell(\hat{\lambda}_0)\right]$$

where ℓ(λ̂_0) and ℓ(λ̂, Φ̂) are, respectively, the maximized log-likelihoods under the Poisson regression and the ZIP regression models [16,19].

Results
Table 2 shows various descriptive statistics of neonatal deaths. The results show that 76.8% of the mothers have not faced any neonatal death in their lifetime. The overall pattern of neonatal deaths at the national level is highly skewed to the right with excess zeros (skewness=3.353, kurtosis=17.419) (Table 2). Table 3 presents summary statistics of the variables that are assumed to affect neonatal mortality. The variables included are region, place of residence, mother's education level, husband/partner's education level, number of antenatal visits, place of delivery, postnatal care, mother's occupation, religion, mother's age at first birth, source of water supply and availability of toilet facility (Table 3).
The total number of women considered in this study was 9209, of whom 2140 had experienced at least one neonatal death. The percentage of neonatal death was 14.3% in urban areas and 25.9% in rural areas. Another maternal variable that possibly has a strong bearing on the survival prospects of a child is the mother's age at first birth. Regarding mothers' age at first birth, 5827 (63.3%) were less than 20 years old, while the remaining 3382 (36.7%) were aged 20 years and above; the corresponding percentages of neonatal death were 26.6% and 14.8%, respectively. Of the 9209 women included, 5987 (65%) delivered at home and only 3038 (33%) received postnatal care. The percentage of neonatal death was 26.5% for mothers who attended postnatal care and 21.6% for mothers who did not.
Overall, more than three-fourths of the respondents (mothers), 7126 (77.4%), live in rural areas, while less than one-fourth, 2083 (22.6%), live in urban areas. From a theoretical perspective, place of residence is an important determinant of child survival. Mothers living in urban areas have a higher chance of getting health services and are more aware of the benefits of medication than mothers who reside in rural areas. Table 3 reveals that the percentages of neonatal death for mothers who reside in rural and urban areas are about 26% and 14.3%, respectively. Table 3 also reveals a decreasing trend in neonatal deaths with mothers' education level (that is, neonatal deaths decrease as education level increases). In particular, the percentage of neonatal mortality was 27.4% for those with no education, 16.6% for those with primary education and 7.1% for those with secondary and higher education. Similarly, fathers' education shows higher mortality for neonates of illiterate fathers (27.8%) and lower mortality for fathers with secondary and higher education (9.3%).
Regarding regional variation in neonatal death, Amhara and Benishangul-Gumuz had the highest proportions of neonatal death (29.1% and 28.8%, respectively), while Addis Ababa city administration had the lowest (only 8.4%). As far as the number of antenatal visits is concerned, the percentage of neonatal death decreased with an increase in the number of visits. Specifically, it was 26.5% for women who did not attend antenatal care, 19.9% for those who had between one and three visits and 14.7% for those who attended four or more visits (the minimum number recommended by WHO). Table 3 also shows the summary statistics of the sanitation indicators: source of water and type of toilet facility. Piped water supply reduces neonatal mortality directly, by reducing the incidence of diarrhea that arises from the ingestion of contaminated water and food, and indirectly, when caregivers are able to devote more time to childcare instead of water collection. The results indicate that 3426 (37.2%) households had piped water while 3912 (42.5%) used water from an unprotected source. The percentage of neonatal death was 26.4% for mothers with an unprotected source of water supply and 19% for mothers with a piped water supply (Table 3).

Statistical data analysis and model comparison
The variable of interest in this study was the number of neonatal deaths per woman through her life time. Such data can be well fitted by the count data models rather than the linear regression model.

The likelihood ratio (LR) test, Akaike information criterion (AIC) and Bayesian information criterion (BIC) were used to compare the candidate models and to identify the most appropriate one.

Development of count data regression models for determinants of neonatal mortality: Model identification
An initial step in the model building process is to identify sets of explanatory variables that have the potential to be included in the linear component of a multivariable count data regression model. We start by fitting univariable Poisson regression models, from which candidate covariates for the multivariable model can be identified. Inclusion of a covariate is based on a significant reduction in the value of -2 log L, where the bigger the reduction in this value, the better the fit. Including highest educational level of women, husband/partner's educational attainment, number of antenatal visits, postnatal care, place of residence, region, place of delivery, age of mother at first birth, religion, source of water supply, availability of toilet facility and occupation of mother in the model, one at a time, results in a significant reduction in -2 log L according to the chi-square test. Hence all these covariates are considered in the multivariable count data regression models. Initially, we fitted a Poisson model to identify the risk factors of neonatal mortality. The fitted Poisson model was then tested for overdispersion. However, overdispersion might be due to variation among observations or to excess zeros. This brings the negative binomial model and the zero-inflated models into the picture. Thus, in order to select an appropriate model that fits the data well, four different models were considered, namely the standard Poisson, negative binomial, zero-inflated Poisson and zero-inflated negative binomial models.
We start by checking the overall goodness of fit using the deviance (likelihood ratio) statistic. The deviance-based chi-square test gave a chi-square value of 776.51 (p<0.0001), which would imply a good fit for the fitted Poisson model. Table 4 displays the SAS and STATA fit statistics for the Poisson, negative binomial, zero-inflated Poisson and zero-inflated negative binomial regression models (Table 4).
The score test statistic for overdispersion is χ² = 1322.77 with p-value<0.0001, indicating that there is overdispersion and that the Poisson model is inappropriate. This is also supported by the Pearson chi-square value divided by the degrees of freedom; a value that departs from one under the Poisson regression model is one possible indication of overdispersion.
Next we fit a negative binomial regression model with the same explanatory variables. The likelihood ratio test was used to compare the fit of the standard Poisson with that of the NB regression model. The statistic was found to be statistically significant, indicating that the NB regression model is a better fit to the data than the standard Poisson regression model (χ²(1) = 655.11, p-value<0.0001). Thus, the observed data are better explained by the negative binomial than by the Poisson model. Further, as can be seen from Table 4, the NB regression model is a better fit than the Poisson model since it has the smaller AIC (13289.57) and BIC (13496.28) values.
The reason for the overdispersion is still unknown; thus, we need to fit the zero-inflated count models. After fitting the zero-inflated models, we test H_0: α = 0 versus H_a: α > 0 and H_0: Φ = 0 versus H_a: Φ > 0 to identify whether the overdispersion is due to the presence of excess zeros or to high variability in the non-zero outcomes. If the null hypothesis for the inflation parameter, H_0: Φ = 0, is not rejected, then the negative binomial model is appropriate and the overdispersion is due to high variability in the non-zero outcomes. However, if both parameters are significantly different from zero, then the zero-inflated negative binomial regression model is more appropriate for the data.
Our results indicate that the overdispersion parameter (α) in the ZINB regression model is significantly different from zero (χ² = 136.61, p-value<0.001) and that the likelihood ratio chi-square test statistic for the inflation parameter is also significant (χ² = 600.317, p-value<0.0001).
Hence there was high variability in the non-zero outcomes and the data contained excess zeros. Thus, it is better to use models that take both into account simultaneously. Figure 1 plots the difference between the observed probability of each count and the prediction from each of the four models. The result shows that the ZINB distribution reproduces the zeros in the population better than the ZIP distribution, which means that the ZINB model gives a good prediction of the number of neonatal deaths.
The other measure used to identify the appropriate model in this paper is the Vuong test. The Vuong test compares the ZIP model with the standard Poisson model by testing the null hypothesis that both models are equally close to the observed distribution. The resulting statistic was statistically significant (V=9.17, p-value<0.0001), demonstrating that the ZIP model reflects the observed data better than the standard Poisson because of the presence of excess zeros. Similarly, the Vuong test comparing the ZINB with the negative binomial model yields a test statistic of V=4.73 with p-value<0.0001. Thus, the zero-inflated negative binomial regression model fits the number of neonatal deaths more accurately than the standard negative binomial model.
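The Vuong statistic only needs the per-observation log-probabilities under the two competing models. The sketch below computes it for a ZIP versus Poisson comparison on toy counts with fixed, hypothetical parameter values (no model is actually fitted here), using the mean and standard deviation of the pointwise log-likelihood differences.

```python
import math

def poisson_log_pmf(y, lam):
    return -lam + y * math.log(lam) - math.lgamma(y + 1)

def zip_log_pmf(y, lam, phi):
    if y == 0:
        return math.log(phi + (1.0 - phi) * math.exp(-lam))
    return math.log(1.0 - phi) + poisson_log_pmf(y, lam)

def vuong_statistic(loglik_1, loglik_2):
    """V = sqrt(n) * mean(m) / sd(m), with m_i = log f1(y_i) - log f2(y_i)."""
    m = [a - b for a, b in zip(loglik_1, loglik_2)]
    n = len(m)
    m_bar = sum(m) / n
    s_m = math.sqrt(sum((v - m_bar) ** 2 for v in m) / n)
    return math.sqrt(n) * m_bar / s_m

# Toy counts with excess zeros (hypothetical data).
ys = [0] * 80 + [1] * 8 + [2] * 6 + [3] * 4 + [5] * 2

# Hypothetical parameter values; both models imply the same overall mean of 0.42.
ll_zip = [zip_log_pmf(y, lam=1.4, phi=0.7) for y in ys]
ll_poisson = [poisson_log_pmf(y, lam=0.42) for y in ys]

v = vuong_statistic(ll_zip, ll_poisson)
print(round(v, 2))
```

Values of V above about 1.96 favour the first (zero-inflated) model at the 5% level, values below about -1.96 favour the second, and values in between do not discriminate between the two.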
Finally, the goodness of fit of the fitted zero-inflated negative binomial regression model was assessed using the Pearson chi-square statistic. The Pearson chi-square test gave a chi-square value of 14144.59 (p-value<0.0001), which would imply a good fit for the neonatal mortality data. Therefore, the zero-inflated negative binomial (ZINB) regression model was found to be the most appropriate model, fitting the data better than the other candidate models. Thus, the ZINB model is used in the analyses discussed below (Figure 1).
Table 5 provides estimates of the effect of selected characteristics of mothers on the mortality of their neonates. The first set of coefficients is from the equation predicting counts for the "not always zero" group. These coefficients can be interpreted in the same way as regular negative binomial coefficients. Region, place of residence, education level of mother, source of water supply, religion, age of mother at first birth, antenatal care, postnatal care and education level of husband/partner are statistically significant predictors of this count outcome. Availability of toilet facility, place of delivery and occupation of mother are not statistically significant (Table 5). Concerning regional disparity in neonatal mortality, the results in Table 5 show that there is no significant difference in the risk of neonatal death between Affar region and Addis Ababa city administration.

Being born to a mother with secondary or higher schooling was associated with a 64% lower risk of neonatal death compared to being born to a mother with no education, while primary education of the mother decreases the risk of neonatal mortality by 32%, keeping all other covariates constant. Results in Table 5 indicate that children born to mothers whose age at first birth is 20 years or more have a significantly lower risk of mortality than those born to mothers whose age at first birth is less than 20. The risk of neonatal death was about 25% lower for births to mothers aged 20 and above compared with births to mothers under 20 years of age (OR: 0.751, CI: 0.666-0.846). This finding is consistent with previous studies [20][21][22]. Children living in rural areas had an increased risk of death compared to children living in urban areas. The risk of neonatal death for a child whose mother resides in a rural area was about 1.315 times that of their urban counterparts (OR: 1.315, 95% CI: 1.092-1.583). Theoretically, all things being equal, living in urban areas should be associated with a higher standard of living, better sanitation and better health facilities, among other things (Table 5).
The risk of neonatal mortality for children whose mothers use an unprotected source of water is 23% higher than for those whose mothers use a piped water supply (OR: 1.232, 95% CI: 1.069-1.420). There is no significant difference in neonatal mortality between those using piped water and those using a protected source of water supply.
Further, Table 3 shows that children of mothers whose husbands have primary education or secondary and higher education have a significantly lower neonatal mortality risk than children of mothers whose husbands have no education. Neonates born to mothers attending 1-3 antenatal care visits have a 20% lower risk of mortality than neonates born to mothers attending no antenatal care visits. Moreover, neonates born to mothers with four or more antenatal care visits have a 32% lower risk than those born to mothers with no antenatal care visits.
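The percentage statements in this part of the results follow directly from the reported ratios: a ratio r corresponds to a change of 100 × (r − 1) percent relative to the reference category. The short sketch below applies this to the three ratios quoted in the text; the function name and dictionary labels are illustrative only.

```python
def percent_change(ratio):
    """Percentage change in risk implied by a ratio relative to the reference group."""
    return (ratio - 1.0) * 100.0

# Ratios as reported in the text (reference category in parentheses).
reported = {
    "age at first birth >= 20 (vs < 20)": 0.751,
    "rural residence (vs urban)": 1.315,
    "unprotected water (vs piped)": 1.232,
}

for label, ratio in reported.items():
    print(f"{label}: {percent_change(ratio):+.1f}%")
# age at first birth >= 20 (vs < 20): -24.9%
# rural residence (vs urban): +31.5%
# unprotected water (vs piped): +23.2%
```

A ratio below 1 therefore reads as a percentage reduction (0.751 gives about 25% lower), and a ratio above 1 as a percentage increase (1.232 gives about 23% higher).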
The second set of coefficients, in the bottom half of Table 5, predicts the dichotomous outcome of group membership (i.e., "not always zero" versus "always zero" groups). The results show that women who delivered at a health facility are less likely than those who delivered at home to be in the "always zero" group. The odds of being in the "always zero" group are 92% lower for health facility delivery than for delivery at home. The odds of having no neonatal mortality (being in the "always zero" group) are 61% lower for mothers with access to a toilet facility than for those without. The results also indicate that the odds of membership in the "always zero" group (no neonatal death) increase by a factor of 3.67 for mothers whose religion is Protestant, holding all other variables constant (Table 5).
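To make the two-part structure concrete, the sketch below writes out a zero-inflated negative binomial probability as a mixture of an "always zero" group and a negative binomial count process. The parameterization (mean mu, dispersion alpha, mixing probability pi_zero) and all numeric values are assumptions chosen for illustration; they are not estimates from Table 5.

```python
import math

def nb_pmf(y, mu, alpha):
    """Negative binomial probability with mean mu and variance mu + alpha * mu**2."""
    r = 1.0 / alpha
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + mu))
             + y * math.log(mu / (r + mu)))
    return math.exp(log_p)

def zinb_pmf(y, mu, alpha, pi_zero):
    """ZINB probability: pi_zero is the probability of the 'always zero' group."""
    count_part = (1.0 - pi_zero) * nb_pmf(y, mu, alpha)
    # Zeros arise from both the 'always zero' group and the count process.
    return pi_zero + count_part if y == 0 else count_part

# Hypothetical parameter values, for illustration only.
mu, alpha, pi_zero = 0.5, 1.2, 0.4
for y in range(4):
    print(y, round(nb_pmf(y, mu, alpha), 4), round(zinb_pmf(y, mu, alpha, pi_zero), 4))
```

In this form the logit part of the model governs pi_zero, while the count coefficients govern mu, which is why the two halves of the coefficient table are interpreted separately.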

Discussion
The study has empirically investigated and identified the factors that are associated with the risk of neonatal mortality in Ethiopia based on EDHS 2011 data using four count regression methods. The study used a retrospective design. The main limitation of this study is that not all of the necessary variables could be obtained, and some were not available in the form required. This may affect the association between the outcome variable and the determinant factors. The confounding effect of unmeasured variables cannot be controlled. Some factors, such as mode of delivery, birth weight and maternal smoking habits, are associated with neonatal mortality. Although they are important determinants, we could not include them in the analysis because the data were incomplete. Readers of this manuscript should therefore take these limitations into account.
The findings of the study show that being born to an educated mother was associated with a decreased risk of neonatal death compared to being born to a mother with no education. Thus, the educational level of mothers is an important and significant determinant of neonatal mortality risk in Ethiopia. Most studies in the literature also report a negative relationship between neonatal death and mothers' education level. They indicate that education improves the ability of mothers to implement simple health knowledge and facilitates their capacity to manage their environment, including health care facilities, interact more effectively with health professionals, comply with treatment recommendations and keep their environment clean. Furthermore, educated women have greater control over health choices for their children [23][24][25].
The results also show that children born to mothers whose age at first birth is 20 years or more have a significantly lower risk of mortality than those born to mothers whose age at first birth is less than 20. Place of residence was also found to be associated with neonatal mortality: children living in rural areas had an increased risk of death compared to children living in urban areas. Theoretically, all things being equal, living in urban areas should be associated with a higher standard of living, better sanitation and better health facilities, among other things. This finding is in agreement with studies which show a significant association between place of residence, age at first birth and neonatal death [20][21][22][26].
Lack of a clean water supply combined with little or no health care knowledge can provide routes for infection. The findings of this study show that mothers who use water from unprotected sources are at a higher risk of experiencing neonatal death than those who use piped water. There is no significant difference in neonatal mortality between those using piped water and those using a protected source of water supply.
Finally, husband's education was found to be a significant predictor of neonatal mortality. The results show that children of mothers whose husbands have primary education or secondary and higher education have a significantly lower neonatal mortality risk than children of mothers whose husbands have no education. This finding is consistent with [24,27]. As an indicator of health care service utilization during pregnancy, antenatal care also demonstrated a significant association with neonatal mortality.

Conclusion
The study has empirically investigated and identified the factors that are associated with the risk of neonatal mortality in Ethiopia based on Ethiopian DHS 2011 data using methods of count data analysis [5]. Descriptive statistics and multivariable analyses were used to achieve this. After fitting four different count models, it was found that the NB and ZINB regression models fitted the data better than the Poisson and ZIP models. Furthermore, the zero-inflated negative binomial model gave the best fit to these data, which are characterized by excess zeros and high variability in the non-zero outcomes.
According to the results of the multivariable ZINB regression model, age of the mother at the time of first birth, highest educational level attained by the mother, number of antenatal visits during pregnancy, husband/partner's educational attainment, postnatal care (seeking care after birth), place of residence, region, source of water supply and religion are the significant determinants of neonatal mortality in Ethiopia. However, place of delivery, access to toilet facility and mother's occupation were not found to be statistically significant factors of neonatal mortality in Ethiopia.
In recent decades, Ethiopia has achieved significant declines in under-five and infant mortality rates. However, neonatal mortality rates have stayed higher than post-neonatal rates. Further interventions are needed to reduce mortality among neonates. The findings presented in this study have the following research and policy implications:
• The survival of children in rural areas is much lower than that of children in urban areas. Thus, there is a need to increase access to maternal and child health services in rural areas.
• Improving the level of education of mothers is vital. Education of mothers plays an important role in the survival of neonates, and thus health programs need to focus on supporting women with little or no education.
• Encourage, advocate and promote the utilization of antenatal care services and the importance of follow-up for both the mother and the newborn.
• The study revealed that children of households that use unprotected sources of drinking water are at a higher risk of neonatal death. Thus, efforts should be made to improve access to safe/piped drinking water. Effective programs to reduce early childbearing among women should also be implemented so as to decrease neonatal mortality.