Modeling the Infant's Age at Hospital Admission in Neonates with Jaundice at the Ghaem Mashhad Hospital Using Count Models with Excess Zeros

Bazzazzadeh V, Doosti H, Boskabadi H, Saffari SE, Peyman N and Chesneau C
1 Department of Biostatistics, Mashhad University of Medical Sciences, Iran
2 Assistant Professor of Statistics, Mashhad University of Medical Sciences, Iran
3 Department of Pediatrics, Mashhad University of Medical Sciences, Iran
4 Centre for Quantitative Medicine, Office of Clinical Sciences, Duke-NUS Medical School, Singapore
5 Associate Professor of Health Education, Mashhad University of Medical Sciences, Iran
6 Associate Professor of Statistics, Université de Caen Normandie, France


Introduction
Neonatal jaundice (hyperbilirubinemia) is a prevalent condition that causes yellowing of the skin and sclera in newborns. It is the most common disease in neonates within the first week of life and the prevailing cause of hospitalization of both healthy and preterm infants. Neonatal jaundice is defined as a total serum bilirubin level above 5 mg/dl (86 micromoles per liter). In this disease, an increase in bilirubin can affect the brain and eventually lead to encephalopathy and kernicterus. It can also lead to death, mental retardation and neurological disorders [1,2]. Reducing infant mortality worldwide and recognizing the preventable factors that cause readmission of infants after birth are very important. This problem has become clearer with the global emphasis on earlier discharge of mothers and infants from hospital and on reducing the complications and costs of subsequent hospitalization. Infants may thus be discharged within the first 24 hours after birth, a period during which clinical signs of jaundice have not yet appeared. Prevention of encephalopathy and severe hyperbilirubinemia depends on early identification and treatment of infants at risk [3]. Given the prevalence and importance of this disease in Iran, more studies are still needed to identify the factors that influence it [1].
In statistics, a count regression model is used when the response variable is a non-negative integer. Count models typically belong to the class of nonlinear regression models. The Poisson regression model is the simplest and most widely used model for count data. Its interpretation is straightforward because it resembles the linear regression model in many respects. One of the basic assumptions of the Poisson model is equality of the mean and the variance, which often does not hold in practice, especially when the response variable is skewed. When there is over-dispersion, that is, when the variance is greater than the mean, Poisson regression is no longer appropriate and can lead to non-robust results such as unreliable estimates, underestimated standard errors, an inflated type I error rate (alpha risk) and overly narrow confidence intervals. One alternative model for dealing with over-dispersion is the negative binomial regression model.
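A quick way to see whether the Poisson equality of mean and variance is plausible is to compare the two sample moments. The sketch below is only illustrative: the function name `dispersion_index` and the toy counts are assumptions made for this example and are not taken from the study data.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return the sample variance divided by the sample mean.

    Values close to 1 are compatible with a Poisson model;
    values well above 1 point to over-dispersion.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m


# Hypothetical toy counts (NOT the study data): many zeros and a long right tail.
toy_counts = [0, 0, 0, 0, 1, 1, 2, 3, 5, 11]
index = dispersion_index(toy_counts)
overdispersed = index > 1
print(f"variance/mean = {index:.2f}, over-dispersed: {overdispersed}")
```

A ratio well above 1, as in this toy sample, is the situation in which a negative binomial model is usually considered instead of a Poisson model. A formal assessment would rely on the fitted dispersion parameter rather than on this raw ratio.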
A high frequency of zeros (excess zeros) is another common issue in count data, and zero-inflated models have been proposed to handle it. In a count model, the expected frequency of zeros is the share of zeros generated by the count distribution. If the observed number of zeros exceeds this share, the data are zero-inflated. In medical data, over-dispersion and excess zeros usually occur in right-skewed count data. A zero-inflated model can handle both over-dispersion and excess zeros [4-6]. Since the day of admission is a discrete variable, the infant's age at hospital admission after birth is a discrete count variable [7,8].
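The comparison between the observed share of zeros and the share a count distribution would generate can be sketched as follows for the Poisson case. This is a minimal illustration under stated assumptions: the Poisson mean is estimated by the sample mean, the helper names are hypothetical, and the toy counts are not the study data.

```python
import math


def poisson_zero_probability(mu):
    """Probability of a zero under a Poisson distribution with mean mu."""
    return math.exp(-mu)


def excess_zero_share(counts):
    """Return (observed, expected, excess) zero proportions.

    The expected proportion assumes a plain Poisson model whose mean
    is estimated by the sample mean (a simplifying assumption).
    """
    n = len(counts)
    observed = sum(1 for y in counts if y == 0) / n
    mu = sum(counts) / n
    expected = poisson_zero_probability(mu)
    return observed, expected, observed - expected


# Hypothetical toy counts (NOT the study data).
toy_counts = [0, 0, 0, 0, 1, 1, 2, 3, 5, 11]
observed, expected, excess = excess_zero_share(toy_counts)
print(f"observed zeros: {observed:.2f}, Poisson-expected zeros: {expected:.2f}")
```

A clearly positive excess suggests that a zero-inflated or hurdle model may be worth fitting. It is a descriptive check only and does not replace a formal model comparison.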
Because early hospital admission and proper management of infants with jaundice improve infant health and reduce the problems arising from jaundice, we seek the best count model for assessing how common factors associated with neonatal jaundice affect the age at hospital admission after birth.

Participants
This is a cross-sectional study. Data were collected from March 2005 to September 2015 on 3130 infants referred to the infant emergency unit and the neonatal intensive care unit of the Mashhad Ghaem Hospital (MGH). We excluded patients whose information was incomplete, whose parents were not willing to take part in this research, or for whom sufficient fetal information was not available. Finally, 1565 infants were enrolled. The study was approved by the Ethics Committee of Mashhad University of Medical Sciences, and parental consent was obtained before entry into the study. All patient records were collected with a researcher-made questionnaire specifically designed for this purpose. The validity of the questionnaire was confirmed by four faculty members of the medical school. For each neonate, age at hospital admission, age at disease onset, symptoms at admission and treatment status were recorded, and the infant was fully examined. Laboratory tests to investigate the causes of jaundice were performed, including hematocrit, direct and indirect bilirubin, maternal and neonatal blood group, culture and urinalysis, sodium, and other tests.
Sepsis, pneumonia and urinary tract infections were combined into a single group of infectious causes. Rh incompatibility was diagnosed if the mother was Rh-negative, the infant was Rh-positive and the direct Coombs test was positive. ABO incompatibility was diagnosed if the mother's blood group was O, the infant's blood group was A or B, and at least two of the following conditions were present: (1) jaundice on the first day, (2) a positive direct Coombs test, (3) microspherocytes seen in the peripheral blood, (4) a positive indirect Coombs test. If there was no ABO or Rh incompatibility but the direct Coombs test was positive, the case was classified as a subgroup incompatibility. Rh, ABO and subgroup incompatibilities were combined into a larger group of blood incompatibility. Jaundiced infants born to mothers with diabetes, jaundiced infants with hypothyroidism and infants with Beckwith-Wiedemann syndrome were grouped as endocrine causes of jaundice. Infants with cephalhematoma, adrenal hemorrhage, cerebral hemorrhage or skin ecchymosis were classified as having occult bleeding [9].
In this study, the questionnaire data include sex, age at hospital admission, bilirubin level, sodium level, blood groups of mother and neonate, diagnosis of jaundice and treatment status. Data were extracted for statistical analysis using SAS software version 9.3 for Windows (SAS Institute Inc., Cary, NC). Count regression models, including the Poisson, negative binomial, zero-inflated Poisson, zero-inflated negative binomial and hurdle Poisson regression models, were fitted to the data. The goodness-of-fit statistics AIC and BIC were calculated to find the best-fitting model. Among all models, the best fit belongs to the model with the minimum AIC/BIC value.
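The selection rule based on the minimum AIC/BIC can be sketched as follows, using the standard definitions AIC = 2k - 2 log L and BIC = k log(n) - 2 log L. The model labels, log-likelihoods, parameter counts and sample size below are hypothetical placeholders chosen for illustration; they are not results from this study.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k log(n) - 2 log L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fits (NOT results from the study): name -> (log-likelihood, parameters).
hypothetical_fits = {
    "model_a": (-520.0, 4),
    "model_b": (-500.0, 6),
}
n_obs = 200  # hypothetical sample size

best_by_aic = min(hypothetical_fits, key=lambda m: aic(*hypothetical_fits[m]))
best_by_bic = min(hypothetical_fits, key=lambda m: bic(*hypothetical_fits[m], n_obs))
print(best_by_aic, best_by_bic)
```

Both criteria reward a higher log-likelihood and penalize extra parameters. BIC penalizes them more heavily once the sample size exceeds about 7 observations, so the two criteria can disagree for models of different size.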

Data analysis procedure
The main motivation for zero-inflated count regression is that real data consistently show high dispersion and a high proportion of zeros [10]. In such cases, regression models that account for the additional zeros, called zero-inflated models, are used. Zero-inflated count regression is a way of modeling count data with excess zeros and substantial dispersion. For this case, methods such as zero-inflated Poisson regression, zero-inflated negative binomial regression and the hurdle Poisson model have been suggested [4-6].

Zero-inflated model
Zero-inflated Poisson and zero-inflated negative binomial regression models fit a two-component mixture model that directly models the excess zeros in a count variable. The mixture combines a point mass at zero with a count distribution such as the Poisson, geometric or negative binomial. There are therefore two sources of zeros: the point mass and the count distribution. A binary model is used to model the zeros against the counts. The first component, typically a logistic regression, models whether or not the event occurs at all. The second component, a Poisson or negative binomial model, evaluates the frequency of the event's occurrence. The output of a zero-inflated model therefore contains two sets of coefficients: one predicts the probability of non-occurrence of the event (logistic) and the other predicts the frequency of the event (Poisson or negative binomial). The variables used in the two components can be the same or different. The zero-inflated density is a combination of a point mass at zero, $I_{\{0\}}(y)$, and a count distribution $f_{\mathrm{count}}(y; x, \beta)$:
$$f_{\mathrm{zeroinfl}}(y; x, z, \beta, \gamma) = f_{\mathrm{zero}}(0; z, \gamma)\, I_{\{0\}}(y) + \bigl(1 - f_{\mathrm{zero}}(0; z, \gamma)\bigr)\, f_{\mathrm{count}}(y; x, \beta).$$
The proportion of observations accumulated at zero is inflated with probability $\pi = f_{\mathrm{zero}}(0; z, \gamma)$.
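The mixture of a point mass at zero and a count distribution can be made concrete with a small numerical sketch for the Poisson case. The function names and the parameter values (`mu = 2.0`, `pi = 0.3`) are assumptions chosen for illustration; in a regression setting `mu` and `pi` would each depend on covariates, which is omitted here.

```python
import math


def poisson_pmf(y, mu):
    """Poisson probability mass function."""
    return math.exp(-mu) * mu ** y / math.factorial(y)


def zip_pmf(y, mu, pi):
    """Zero-inflated Poisson pmf: pi * I{y == 0} + (1 - pi) * Poisson(y; mu).

    pi is the weight of the point mass at zero and mu is the mean of the
    Poisson count component.
    """
    point_mass = pi if y == 0 else 0.0
    return point_mass + (1.0 - pi) * poisson_pmf(y, mu)


def zip_mean(mu, pi):
    """Mean of the zero-inflated Poisson distribution."""
    return (1.0 - pi) * mu


# Hypothetical parameter values, for illustration only.
mu, pi = 2.0, 0.3
p_zero_zip = zip_pmf(0, mu, pi)
p_zero_poisson = poisson_pmf(0, mu)
total = sum(zip_pmf(y, mu, pi) for y in range(60))
print(f"P(Y=0): ZIP {p_zero_zip:.3f} vs Poisson {p_zero_poisson:.3f}")
```

The zero probability receives contributions from both components, which is why a zero-inflated model can produce more zeros than the count distribution alone while still summing to one.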

Hurdle Poisson model
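The hurdle model combines a count model $f_{\mathrm{count}}(y; x, \beta)$, left-truncated at $y = 1$, with a zero hurdle model $f_{\mathrm{zero}}(y; z, \gamma)$, right-censored at $y = 1$. The combination of the two models is as follows:
$$f_{\mathrm{hurdle}}(y; x, z, \beta, \gamma) = \begin{cases} f_{\mathrm{zero}}(0; z, \gamma), & y = 0, \\ \bigl(1 - f_{\mathrm{zero}}(0; z, \gamma)\bigr)\, \dfrac{f_{\mathrm{count}}(y; x, \beta)}{1 - f_{\mathrm{count}}(0; x, \beta)}, & y > 0. \end{cases}$$
The parameters $\beta$ and $\gamma$, and one or two additional dispersion parameters (when a negative binomial density is used), are estimated by maximum likelihood (ML). One advantage is that the likelihood function can be maximized separately for the count component and the hurdle component. The corresponding mean regression equation is as follows [9]:
$$\log(\mu_i) = x_i^{\top}\beta + \log\bigl(1 - f_{\mathrm{zero}}(0; z_i, \gamma)\bigr) - \log\bigl(1 - f_{\mathrm{count}}(0; x_i, \beta)\bigr).$$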
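The two-part structure of a hurdle model, in which all zeros come from the hurdle component and all positive counts from a zero-truncated count component, can be sketched for the Poisson case as follows. The function names and the parameter values (`mu = 2.0`, `p_zero = 0.4`) are assumptions for illustration; covariates are omitted.

```python
import math


def poisson_pmf(y, mu):
    """Poisson probability mass function."""
    return math.exp(-mu) * mu ** y / math.factorial(y)


def hurdle_poisson_pmf(y, mu, p_zero):
    """Hurdle Poisson pmf.

    p_zero is the probability of a zero from the hurdle component.
    Positive counts follow a Poisson distribution truncated at zero.
    """
    if y == 0:
        return p_zero
    truncated = poisson_pmf(y, mu) / (1.0 - poisson_pmf(0, mu))
    return (1.0 - p_zero) * truncated


def hurdle_poisson_mean(mu, p_zero):
    """Mean of the hurdle Poisson distribution."""
    return (1.0 - p_zero) * mu / (1.0 - math.exp(-mu))


# Hypothetical parameter values, for illustration only.
mu, p_zero = 2.0, 0.4
total = sum(hurdle_poisson_pmf(y, mu, p_zero) for y in range(60))
numeric_mean = sum(y * hurdle_poisson_pmf(y, mu, p_zero) for y in range(60))
print(f"total probability {total:.6f}, mean {hurdle_poisson_mean(mu, p_zero):.4f}")
```

Unlike a zero-inflated model, the zero probability here does not depend on the count component at all, which is what allows the two parts of the likelihood to be maximized separately.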
In this study, we examine how the causes of jaundice affect the infant's age at hospital admission during the first fourteen days after birth. The response variable is the age at hospital admission (the number of days elapsed from birth to the day of admission) within the first fourteen days, which is a discrete count variable. Because the first three days of life are of particular importance and we want to investigate the factors involved in admission during these days, we define a new variable from the age at admission (in days): admission on days 1, 2 or 3 is coded 0, admission on the fourth day is coded 1, admission on the fifth day is coded 2, and so on, up to admission on the fourteenth day, which is coded 11. The new response variable therefore takes the values y = 0, 1, 2, …, 11. The explanatory variables include the infant's sex, the mother's blood group, the infant's blood group, bilirubin, sodium, treatment and diagnosis of neonatal jaundice.
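The recoding of the admission day into the response variable can be written as a one-line rule. The function name `recode_admission_day` is hypothetical and only illustrates the mapping described above.

```python
def recode_admission_day(day):
    """Map the age at admission in days (1 to 14) to the response y.

    Days 1, 2 and 3 are coded 0, day 4 is coded 1, day 5 is coded 2,
    and so on up to day 14, which is coded 11.
    """
    if not 1 <= day <= 14:
        raise ValueError("age at admission must be between 1 and 14 days")
    return max(0, day - 3)


recoded = [recode_admission_day(day) for day in range(1, 15)]
print(recoded)
```

Collapsing the first three days into a single value is what creates the large mass at zero that motivates the zero-inflated and hurdle models.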
To select the final model that best fits the data, the candidate models must be compared on the same criteria and with the same set of variables, although the sample size is not the same for all variables. Therefore, the significance of each variable was first checked individually in the zero-inflated negative binomial regression model. If the calculated p-value was below the required significance level (p < 0.05), the variable was considered significant and was entered in the primary model.

Results
Of the 3130 neonates evaluated, 1565 were enrolled. A total of 238 infants (15.2%) were admitted in the first three days, and 238 infants (15.2%) on the fourth day. Only 29 infants (1.9%) were admitted on the fourteenth day, which is natural given that jaundice occurs in the first three days of life. The distribution of the age at admission after the first three days is shown in Figure 1. The mean bilirubin level of the infants with jaundice was 20.73±5.78 mg/dl and the mean sodium level was 143.89±10.9. The lowest bilirubin level was 6.1 mg/dl and the highest was 57.7 mg/dl. The lowest and highest sodium levels of infants with jaundice were 120 and 205 mmol/l. When the zero-inflated negative binomial regression model was fitted to each variable individually, only bilirubin, sex and the mother's blood group were significant in the zero-inflated part, while sodium, the diagnosed cause of jaundice and treatment were significant in the negative binomial part. Accordingly, only the variables that were significant in the zero-inflated part or in the count part were entered in the final models, in order to assess their significance simultaneously.

In fitting the models, the reference levels were "other factors" for the diagnosis of jaundice, "other methods" for treatment, male for sex, and AB+ for the mother's and the infant's blood groups. The estimated coefficients (standard errors) of the five models fitted to the data are shown in Table 1. In the Poisson regression model, only the diagnosed cause of neonatal jaundice was significant. Endocrine disorders had a direct relationship with the logarithm of the expected age at admission (p-value < 0.05). The estimated ratio for infants whose jaundice was caused by endocrine disorders, relative to other diagnostic causes, was 1.55. Relative to other factors, a diagnosis of endocrine disorders increased the expected logarithm of the age at admission by 0.53. In the negative binomial regression model, only sodium (p-value < 0.05) was an effective factor for the infant's age at hospital admission. For a one-unit increase in the infant's sodium level, the expected logarithm of the age at admission increased by 0.004. The significance of the dispersion parameter indicates over-dispersion in the data. In the zero-inflated Poisson regression model, only the diagnosed cause of neonatal jaundice was significant.
Endocrine disorders and occult bleeding had a direct relationship with the expected logarithm of the age at admission (p-value < 0.05). The estimated ratio for infants with endocrine disorders was 1.59 relative to other factors, and the ratio for infants with occult bleeding in the first fourteen days was 1.78 relative to other factors. Relative to other causes of jaundice, endocrine disorders and occult bleeding increased the expected logarithm of the age at admission by 0.47 and 0.57, respectively. Only the mother's blood group affected the probability of admission in the first three days rather than on the other days. Maternal blood group B+ had a negative relationship with the log-odds of admission in the first three days. The estimated zero-inflation proportion in the zero-inflated Poisson model was 47%, meaning that, if the age at hospital admission followed a Poisson distribution, the data contain 47% more zeros than that distribution would produce. In the zero-inflated negative binomial model, the diagnosed cause of neonatal jaundice and the treatment were significantly associated with the age at hospital admission. A diagnosis of occult bleeding and treatment by phototherapy had a direct, positive relationship with the expected logarithm of the age at hospital admission (p-value < 0.05).
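In a count model with a log link, a coefficient is usually converted to a multiplicative ratio by exponentiation. The sketch below applies this to the two zero-inflated Poisson coefficients quoted above (0.47 and 0.57). The text does not state how the reported ratios (1.59 and 1.78) were obtained, so treating them as exponentiated coefficients is an assumption, and the small differences would then be explained by rounding of the coefficients.

```python
import math


def rate_ratio(coefficient):
    """Exponentiate a log-link coefficient to obtain a multiplicative ratio."""
    return math.exp(coefficient)


# Coefficients as quoted in the text for the zero-inflated Poisson count part.
coefficients = {"endocrine disorders": 0.47, "occult bleeding": 0.57}
ratios = {name: round(rate_ratio(b), 2) for name, b in coefficients.items()}
for name, ratio in ratios.items():
    print(f"{name}: exp(coefficient) = {ratio}")
```

Read this way, a ratio above 1 means that the expected value of the recoded age at admission is multiplied by that factor relative to the reference category, with the other covariates held fixed.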
Relative to other diagnoses, a diagnosis of occult bleeding increased the expected logarithm of the age at admission by 0.8, and relative to other treatments, phototherapy increased it by 0.81. In the zero-inflated part, none of the variables had a significant effect on admission in the first three days. The significant dispersion parameter indicates over-dispersion in the data (p-value < 0.05). If the age at admission increases by one unit, the logarithm of the odds of zero inflation increases by 0.31. In other words, the estimated zero-inflation proportion in the zero-inflated negative binomial model was 31%, meaning that, if the age at hospital admission followed a negative binomial distribution, the data contain 31% more zeros than that distribution would produce. When the hurdle Poisson model was fitted to the data, the diagnosis of jaundice and the treatment were significant in the Poisson part. A diagnosis of occult bleeding, a diagnosis of infection and treatment by phototherapy had a direct relationship with the expected logarithm of the age at admission, increasing it by 0.65, 0.53 and 0.61, respectively, relative to other factors.
In the zero part of the hurdle model, only the mother's blood group was significant; sex and bilirubin were not significantly associated with the probability of admission in the first three days. Maternal blood groups O-, A+ and B+ were associated with the chance of admission in the first three days (p-value < 0.05). For an infant whose mother had blood group O-, the chance of admission was 4.06 times that of infants whose mothers had other blood groups. For infants whose mothers had blood group A+ or B+, the chance of admission in the first three days was lower, at 0.77 and 0.94 times that of other infants. The estimated zero-inflation proportion in the hurdle Poisson model was 66%, meaning that, if the age at admission followed a Poisson distribution, the data contain 66% more zeros than that distribution would produce. In this study, the AIC and BIC criteria showed that the zero-inflated negative binomial regression model fits the data best, since it has the smallest values of the information criteria. Given the over-dispersion in the data and the high frequency of zeros in the response variable, this result was expected.

Discussion
Appropriate diagnosis, treatment and follow-up of jaundice is always one of the major challenges in neonatology. Prevention of jaundice, early diagnosis, appropriate treatment and prevention of complications can reduce jaundice-related problems in infants. One important task in these newborns is investigating the cause of jaundice. Identifying the determinants of jaundice can help the physician take appropriate action to prevent complications. The number of infants admitted to hospital in the first three days was 238 (15.2%), and on the fourteenth day only 11 infants (9.2%). The mean bilirubin level was 25.7±3.7 mg/dl in girls and 29.41±5.3 mg/dl in boys. The mean sodium level was 144.4±10.9 in boys and 143.46±10.9 in girls. Our results indicate that the determinants of jaundice were diagnosed in only 28.5% of infants. In 217 infants (13.4%) the cause of jaundice was blood group incompatibility, and in only 43 infants (2.7%) it was occult bleeding.
Unknown causes of jaundice were most frequent among infants admitted on the fourth to sixth day, whereas blood group incompatibility and occult bleeding were detected more often in the first three days. Endocrine disorders and infection were most commonly seen at hospitalization on the sixth and seventh days. Recent studies have indicated that the most common time of admission to hospital for Rh incompatibility, occult bleeding and endocrine disorders was days 4 to 6, and for infection after the seventh day [9]. The most common cause of jaundice on the day of birth was Rh incompatibility (1.39%), and on the other days of admission the cause was unknown. After unknown causes, the most common cause of jaundice on days 2-9 was ABO incompatibility (14.1%) and after 13 days was infection (15.5%), which is close to our results. The causes of neonatal jaundice showed no statistically significant difference by the infant's age at hospital admission (P < 0.001).
A previous study showed that the mean age at admission of newborns with hypothyroidism was higher than that of jaundiced newborns with unknown causes (P = 0.001), which is consistent with our results [11]. In another study, the rate of ABO incompatibility (mother's blood group O and infant's blood group A or B) was reported as 40.4% [12]. The age at admission was 3 to 8 days for infants with ABO incompatibility, from the first 24 hours to 7 days after birth for infants with Rh incompatibility, and 2 to 7 days for infants with subgroup incompatibility. The age at disease onset was 3 days and the age at admission was 6 days. A previous study showed that the most prevalent risk factors for premature jaundice were, in order, ABO incompatibility, Rh incompatibility and cephalhematoma [13]. The most common infant blood groups were A and AB, and the most common maternal blood group was O. No significant relationship was found between sex and the prevalence of premature jaundice, but a significant relationship was found between ABO incompatibility and the prevalence of premature jaundice. These results are consistent with ours.
One study showed that ABO incompatibility, urinary tract infection and hypothyroidism can be risk factors for neonatal jaundice [14]. Unfortunately, we did not find many studies that examine the causes of jaundice by age at hospital admission. In the Poisson regression model, only occult bleeding and endocrine disorders had a direct, positive effect on the mean age at hospital admission. In the negative binomial regression model, the significance of the dispersion parameter showed that the data are over-dispersed and that the Poisson regression model is not suitable for them. In this model, only the sodium level had a direct, significant relationship with the mean age at admission.
Given that a considerable proportion of infants were admitted in the first three days, zero-inflated regression models may be appropriate for the data. In the zero-inflated Poisson regression model, the diagnosed cause of jaundice and the type of treatment had a significant relationship with the age at admission, while the mother's blood group was significantly associated with the likelihood of admission in the first three days. Infants whose mothers had blood group B+ had 0.08 times more chance of admission in the first three days than other infants. The infant's sex and bilirubin level were individually associated with the likelihood of admission in the first three days, but were not significant in the presence of the other factors.
In the zero-inflated negative binomial regression model, occult bleeding and phototherapy treatment were significant. The relationship was positive: the chance of admission in the first 14 days for infants with occult bleeding was 1.82 times, and for infants treated with phototherapy 1.76 times, that of other infants. In the hurdle Poisson model, occult bleeding, infection and phototherapy treatment were effective factors for the infant's age at hospital admission in the first fourteen days. Infants whose mothers had blood group O- had a higher chance of admission in the first three days, whereas those whose mothers had blood group A+ or B+ had a lower chance.
In one study, the negative binomial regression model gave the best estimates among the regression models according to the MSE criterion, but an artificial neural network performed better than the regression models [4]. A previous study compared the Poisson and negative binomial regression models based on the AIC criterion and found that the negative binomial regression model fitted the data better than the Poisson regression model [13]. In another study, the performance of zero-inflated regression models was compared with that of the Poisson and negative binomial regression models, and the zero-inflated negative binomial regression model gave the best fit to the data [14]. In one study, a hurdle model was used to analyze data with excess zeros [15]. In another study [7], four models (Poisson, negative binomial, zero-inflated Poisson and zero-inflated negative binomial) were used to fit count data in the field of health sciences. Other studies showed that the negative binomial and zero-inflated negative binomial models fit the data well when there are over-dispersion and excess zeros and are superior to the Poisson model, which is consistent with our results [5,6,16,17].

Conclusion
When there is over-dispersion in the data, the negative binomial model is recommended rather than the Poisson model. Because of the excess zeros in the response variable, count models that can handle this issue (such as zero-inflated and hurdle models) should be used. The zero-inflated negative binomial model is superior to the zero-inflated Poisson model based on goodness-of-fit criteria. The zero-inflated negative binomial regression model is proposed as a predictive model for the infant's age at hospital admission.