STUNTING DETERMINANTS AMONG TODDLERS IN PROBOLINGGO DISTRICT OF INDONESIA USING PARAMETRIC AND NONPARAMETRIC ORDINAL LOGISTIC REGRESSION MODELS

Abstract: Stunting is a chronic nutritional problem in toddlers characterized by a shorter height than other children of their age, and it is a major nutritional problem faced by Indonesia. This research aimed to develop a risk model for the incidence of stunting in toddlers. The research was conducted in stunting-locus villages within the Public Health Center areas of Probolinggo District that were selected as the sample. Data were collected in the villages of Alaspandan, Bucorwetan, Petunjungan


INTRODUCTION
Stunting is one of the strategic issues prioritized in Indonesian health development, as identified at the 2019 National Health Working Meeting. Stunting is a chronic nutritional problem in toddlers, characterized by a shorter height than other children their age. Although the 2018 Basic Health Research reported a decrease in the stunting rate from 37.2% (in 2013) to 30.8% (in 2017), this figure is still higher than the maximum stunting rate set by WHO, which is 20%.
The government must deal with stunting seriously because it can reduce the future productivity of Indonesia's human resources. It is important to reduce the incidence of stunting in toddlers as early as possible to avoid long-term adverse impacts such as impaired child development.
Stunting can affect brain development so that the child's intelligence level is not optimal, which risks reducing productivity in adulthood. Stunting also makes children more susceptible to disease. Children who are stunted have a higher risk of developing chronic diseases in adulthood [1].
There are two approaches to regression modeling, namely the parametric and the nonparametric regression approaches [2]. Parametric regression, known as the global approach, assumes that the regression model has the same parameters for every observation, while nonparametric regression, known as the local approach, assumes that not all observations share the same parameters [3]. Regarding child growth, physical growth increases rapidly from birth to about one year of age and then slows as the child gets older [4]. Therefore, the local nonparametric regression approach was applied to this type of data pattern [5].
Many researchers have discussed not only parametric regression but also nonparametric regression using various estimators [6][7][8][9][10][11][12][13][14][15]. In addition, several studies examining the factors that influence the incidence of stunting in toddlers have been carried out parametrically by [16][17][18][19]. This study aimed to develop a risk model for the incidence of stunting in toddlers using parametric and nonparametric ordinal logistic regression approaches, and then to determine the best model.

PRELIMINARIES
In this section, we provide a brief overview of the dataset, the parametric ordinal logistic regression model, and the nonparametric ordinal logistic regression model.

Dataset
The data used in this research were data on toddlers' nutritional status provided in the applications

Parametric Ordinal Logistic Regression Model
Parametric logistic regression models are used to model the relationship between categorical response variables and predictor variables which are categorical or continuous. If the response variable consists of two categories, it is called a dichotomous or binary logistic regression model.
If the response variable is divided into more than two categories, it is called a polytomous logistic regression model, and if the categories are ordered (ordinal scale), it is called an ordinal logistic regression model [20].
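Models that can be used for ordinal logistic regression are cumulative logit models. Suppose the response variable $Y$ has $G$ categories on the ordinal scale, and $\mathbf{x}_i = (x_{i1}, x_{i2}, \ldots, x_{ip})^T$ represents the vector of the predictor variables in the $i$-th observation; then the cumulative logit model can be presented as follows:
$$P(Y_i \le g \mid \mathbf{x}_i) = \frac{\exp(\alpha_g + \mathbf{x}_i^T \boldsymbol{\beta})}{1 + \exp(\alpha_g + \mathbf{x}_i^T \boldsymbol{\beta})}, \quad g = 1, 2, \ldots, G-1, \tag{1}$$
where $\alpha_g$ is the intercept of category $g$ and $\boldsymbol{\beta}$ is the vector of regression coefficients of $\mathbf{x}_i$. The cumulative logit is also defined as follows:
$$\operatorname{logit}\left[P(Y_i \le g \mid \mathbf{x}_i)\right] = \ln\left(\frac{P(Y_i \le g \mid \mathbf{x}_i)}{1 - P(Y_i \le g \mid \mathbf{x}_i)}\right). \tag{2}$$
Based on Eq. (1) and Eq. (2), the ordinal logistic regression model can be presented as follows:
$$\operatorname{logit}\left[P(Y_i \le g \mid \mathbf{x}_i)\right] = \alpha_g + \mathbf{x}_i^T \boldsymbol{\beta}. \tag{3}$$
Hence, we have:
$$P(Y_i \le g \mid \mathbf{x}_i) = \pi_1(\mathbf{x}_i) + \pi_2(\mathbf{x}_i) + \cdots + \pi_g(\mathbf{x}_i), \tag{4}$$
where $\pi_g(\mathbf{x}_i) = P(Y_i = g \mid \mathbf{x}_i)$ represents the probability that the response variable in the $i$-th observation has category $g$ given $\mathbf{x}_i$. Thus, the probability of each response category can be expressed as follows:
$$\pi_g(\mathbf{x}_i) = P(Y_i \le g \mid \mathbf{x}_i) - P(Y_i \le g-1 \mid \mathbf{x}_i), \quad g = 1, 2, \ldots, G, \tag{5}$$
with $P(Y_i \le 0 \mid \mathbf{x}_i) = 0$ and $P(Y_i \le G \mid \mathbf{x}_i) = 1$.
The probability score for each response category is used as a guide for classification. An observation is assigned to the response category $g$ that has the greatest probability value.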
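The classification rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the sign convention logit P(Y <= g | x) = alpha_g + x'beta, and the function names and the numeric values of `alphas` and `beta` are hypothetical, not estimates from this study.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def category_probabilities(alphas, beta, x):
    """Return [P(Y=1), ..., P(Y=G)] under a cumulative logit model.

    alphas: G-1 increasing intercepts; beta, x: coefficient and predictor lists.
    Assumed convention: logit P(Y <= g | x) = alpha_g + x'beta.
    """
    eta = sum(b * xj for b, xj in zip(beta, x))
    # Cumulative probabilities P(Y <= g) for g = 1..G (the last one is 1).
    cumulative = [sigmoid(a + eta) for a in alphas] + [1.0]
    probs = [cumulative[0]]
    for g in range(1, len(cumulative)):
        # Each category probability is a difference of adjacent cumulatives.
        probs.append(cumulative[g] - cumulative[g - 1])
    return probs

def classify(alphas, beta, x):
    """Assign the category (coded 1..G) with the greatest probability."""
    probs = category_probabilities(alphas, beta, x)
    return probs.index(max(probs)) + 1

# Hypothetical three-category example (values are illustrative only).
alphas = [-1.5, 0.5]
beta = [0.8]
probs = category_probabilities(alphas, beta, [1.0])
predicted = classify(alphas, beta, [1.0])
```

The intercepts must be increasing so that every category probability is nonnegative.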
Estimators for the parameters of the ordinal logistic regression model can be obtained using the maximum likelihood estimation (MLE) method. The principle of the MLE method is to estimate the parameters by maximizing the likelihood function. The maximum is found by taking the first partial derivatives of the likelihood function with respect to the parameters and setting them equal to zero. The resulting equations are non-linear in the parameters to be estimated, so a numerical method is needed to obtain the parameter estimates. A numerical method that can be used is the Newton-Raphson iteration method.
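To make the Newton-Raphson iteration concrete, the sketch below applies it to the simplest special case, a binary logistic model (G = 2) with one predictor. This is a simplification chosen for illustration; it is not the ordinal model fitted in this study, and the function name and toy data are hypothetical.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logit_newton(xs, ys, max_iter=25, tol=1e-10):
    """MLE of (a, b) in P(y=1 | x) = sigmoid(a + b*x) by Newton-Raphson."""
    a, b = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = 0.0          # score vector (first derivatives)
        h00 = h01 = h11 = 0.0  # observed information (negative Hessian)
        for x, y in zip(xs, ys):
            p = sigmoid(a + b * x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: theta_new = theta + information^{-1} * score.
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b

# Hypothetical toy data (not separable, so the MLE exists).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
a_hat, b_hat = fit_logit_newton(xs, ys)
```

At convergence the score equations are approximately zero, which is the condition described in the paragraph above.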

Nonparametric Ordinal Logistic Regression Model
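The nonparametric ordinal logistic regression model is a development of the ordinal logistic regression model using the nonparametric regression approach. According to [21], the general form of the nonparametric ordinal logistic regression model is as follows:
$$\operatorname{logit}\left[P(Y_i \le g \mid \mathbf{x}_i)\right] = \alpha_g + \sum_{j=1}^{p} f_j(x_{ij}), \quad g = 1, 2, \ldots, G-1, \tag{6}$$
where $\mathbf{x}_i$ is a vector of the predictor variables; $\alpha_g$ is the intercept; and $f_j(\cdot)$ is the unknown regression function of the $j$-th predictor variable, which will be estimated using a nonparametric regression approach based on the local linear estimator.
According to [22], if $x_{ij}$ is in the neighborhood of $x_{0j}$, then the function $f_j(\cdot)$ is approximated by a Taylor expansion to first order, which can be presented as follows:
$$f_j(x_{ij}) \approx f_j(x_{0j}) + f_j'(x_{0j})\,(x_{ij} - x_{0j}). \tag{7}$$
Based on Eq. (7), Eq. (6) can be expressed as follows:
$$\operatorname{logit}\left[P(Y_i \le g \mid \mathbf{x}_i)\right] \approx \alpha_g + \sum_{j=1}^{p} \left[ f_j(x_{0j}) + f_j'(x_{0j})\,(x_{ij} - x_{0j}) \right]. \tag{8}$$
RIFADA, CHAMIDAH, NINGRUM, MUNIROH
Based on Eq. (6) and Eq. (8), the cumulative probability of the response category $g$ can be expressed as follows:
$$P(Y_i \le g \mid \mathbf{x}_i) \approx \frac{\exp\left(\alpha_g + \sum_{j=1}^{p} \left[ f_j(x_{0j}) + f_j'(x_{0j})\,(x_{ij} - x_{0j}) \right]\right)}{1 + \exp\left(\alpha_g + \sum_{j=1}^{p} \left[ f_j(x_{0j}) + f_j'(x_{0j})\,(x_{ij} - x_{0j}) \right]\right)}. \tag{9}$$
The parameters of the nonparametric ordinal logistic regression model were estimated using the local maximum likelihood estimation (LMLE) method. The first partial derivatives obtained are in implicit form, so a numerical method is needed to obtain the parameter estimates. A numerical method that can be used is the Newton-Raphson iteration method [2].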
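The objective that a local maximum likelihood method maximizes can be sketched as a kernel-weighted log-likelihood. The sketch below is illustrative only and rests on several assumptions that are not stated in the text: a single predictor, a Gaussian kernel, and local thresholds that absorb the intercept and the local level of the function (alpha_g + f(x0)). The function names and example values are hypothetical.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def gaussian_kernel(u):
    """Gaussian kernel (an assumed choice of kernel)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def local_log_likelihood(thresholds, slope, x0, xs, ys, h):
    """Kernel-weighted ordinal log-likelihood around the fixed point x0.

    thresholds: increasing list of G-1 local thresholds, alpha_g + f(x0)
    slope:      local slope f'(x0) from the first-order Taylor expansion
    ys:         response categories coded 1..G
    h:          bandwidth
    """
    total = 0.0
    for x, y in zip(xs, ys):
        eta = slope * (x - x0)  # local linear term
        cumulative = [0.0] + [sigmoid(t + eta) for t in thresholds] + [1.0]
        p = cumulative[y] - cumulative[y - 1]
        weight = gaussian_kernel((x - x0) / h)
        # Floor p to avoid log(0) when a probability underflows.
        total += weight * math.log(max(p, 1e-300))
    return total

# Hypothetical data: three nearby points and one far from x0 = 2.0.
xs = [1.0, 2.0, 3.0, 50.0]
ys = [1, 2, 3, 1]
near = local_log_likelihood([-1.0, 1.0], 0.5, 2.0, xs[:3], ys[:3], h=1.0)
full = local_log_likelihood([-1.0, 1.0], 0.5, 2.0, xs, ys, h=1.0)
```

Observations far from the fixed point receive almost no weight, so `near` and `full` are practically equal; this is what makes the estimate local. Maximizing this objective over the thresholds and slope would require an iterative method such as Newton-Raphson.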

MAIN RESULTS
The response variable (Y) used in this research was the nutritional status of toddlers based on the height/age index which was categorized into three categories namely severely stunted

Characteristics of Research Variables
The characteristic analysis of the research variables describes the condition of the toddlers who were used as the research sample. The characteristic analysis of the response variable is presented in Fig. 1. From Fig. 1, it can be seen that the prevalence of toddlers who experience stunting is 26.7%, consisting of 21.3% moderately stunted and 5.4% severely stunted. Furthermore, based on the characteristic analysis of the predictor variables, birth weight has an average of 3.019 kg, but its minimum value is 1.6 kg, which is below 2 kg. This suggests that some toddlers were born prematurely. Maternal age has a median value of 30 years, but the minimum and maximum ages of 20 years and 43 years fall outside the age range in which mothers are expected to give birth to healthy babies.
So, there is a possibility that some toddlers were born less healthy. Among the other variables, parenting pattern has a median value of 80 and a minimum value of 20, which indicates that good parenting practices for toddlers are not yet applied optimally. Also, hygiene and sanitation has a median value of 60 and a minimum value of 0, which indicates that cleanliness and sanitation are still poorly implemented in the environments where toddlers live.
Before analyzing the data using regression, data exploration was carried out for each predictor variable. Two predictor variables had VIF values of more than 10, which indicates multicollinearity, so those variables were excluded from the model. Thus, 9 predictor variables were used for further analysis.
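A VIF screen of this kind can be sketched as follows. The sketch uses the standard identity that the VIF of predictor j equals the j-th diagonal element of the inverse correlation matrix of the predictors. The function names and the toy data are hypothetical; only the threshold of 10 comes from the text.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(k)] for i in range(k)]

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * c for v, c in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def vif(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]

# Hypothetical predictors whose correlation is 0.8.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
values = vif([x1, x2])
flagged = [j for j, v in enumerate(values) if v > 10]  # threshold from the text
```

With two predictors the VIF reduces to 1 / (1 - r^2), where r is their correlation, which gives a simple check of the computation.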

Ordinal Logistic Regression
The first step in building the risk model for the incidence of stunting in toddlers in Probolinggo District was parameter testing. After backward elimination, ordinal logistic regression testing was carried out again with the variables retained in the model, namely birth length, maternal height, and health services. The results of simultaneous parameter testing after backward elimination showed that there were predictor variables that significantly affected the model. Furthermore, a partial test was carried out, and its outputs are given in Table 2.
Furthermore, the probability function of each response category can be obtained as follows:
(i) The probability of the nutritional status of toddlers who are severely stunted is given by:
(ii) The probability of the nutritional status of toddlers who are moderately stunted is given by:
The next step is testing the suitability of the model to find out whether the model equation that has been formed is appropriate. The deviance test gives a p-value of 1, so the hypothesis that the model fits the data is not rejected and the model obtained is considered suitable. The final step is calculating the classification accuracy by comparing the actual values with the values predicted by the model. The classification accuracy of the risk model for the incidence of child stunting in Probolinggo District using the parametric ordinal logistic regression approach was 72.45%.
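Classification accuracy of this kind is the percentage of observations whose predicted category equals the actual category. A minimal sketch follows; the function names and the category labels in the example are hypothetical and are not data from this study.

```python
def classification_accuracy(actual, predicted):
    """Percentage of observations whose predicted category matches the actual one."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return 100.0 * correct / len(actual)

def confusion_counts(actual, predicted):
    """Count each (actual, predicted) pair, useful for seeing where errors occur."""
    counts = {}
    for pair in zip(actual, predicted):
        counts[pair] = counts.get(pair, 0) + 1
    return counts

# Hypothetical labels: 1 = severely stunted, 2 = moderately stunted, 3 = normal.
actual = [1, 2, 3, 3, 2, 3, 1, 3]
predicted = [1, 3, 3, 3, 2, 3, 2, 3]
accuracy = classification_accuracy(actual, predicted)
counts = confusion_counts(actual, predicted)
```

The pair counts complement the single accuracy figure, since an ordinal model can reach high accuracy mainly by predicting the most frequent category.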

Nonparametric Ordinal Logistic Regression
The first step in estimating the parameters of the nonparametric ordinal logistic regression model using the local maximum likelihood estimation method is to determine the optimal bandwidth for each predictor variable, that is, the bandwidth with the minimum cross-validation (CV) value. The optimal bandwidths for the significant predictor variables, i.e. birth length, maternal height, and health services, were 4.34, 8.84, and 50.01, respectively, with a CV value of 39.08.
Furthermore, these optimal bandwidths are used to estimate the parameters for each observation at an arbitrary fixed point. The parameter estimates for each observation are presented in Table 3. These estimates are used to estimate the probability that each toddler suffers from stunting. The classification accuracy obtained from modeling the risk of stunting in children under five in Probolinggo District using the nonparametric ordinal logistic regression approach was 73.98%.
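The idea of choosing the bandwidth with the minimum CV value can be sketched with a simpler estimator. The sketch below is an assumption-laden illustration: it uses a one-predictor local linear least-squares smoother with a Gaussian kernel and a leave-one-out squared-error criterion, whereas the CV criterion actually used in this study is not specified in the text. All names, the candidate grid, and the simulated data are hypothetical.

```python
import math
import random

def local_linear_fit(x0, xs, ys, h):
    """Local linear estimate of f(x0) using Gaussian kernel weights."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = math.exp(-0.5 * (d / h) ** 2)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    det = s0 * s2 - s1 * s1
    # Intercept of the weighted least-squares line centered at x0.
    return (s2 * t0 - s1 * t1) / det

def cv_score(xs, ys, h):
    """Leave-one-out mean squared prediction error for bandwidth h."""
    total = 0.0
    for i in range(len(xs)):
        x_rest = xs[:i] + xs[i + 1:]
        y_rest = ys[:i] + ys[i + 1:]
        prediction = local_linear_fit(xs[i], x_rest, y_rest, h)
        total += (ys[i] - prediction) ** 2
    return total / len(xs)

def select_bandwidth(xs, ys, candidates):
    """Return the candidate bandwidth with the minimum CV score, and all scores."""
    scores = {h: cv_score(xs, ys, h) for h in candidates}
    best = min(scores, key=scores.get)
    return best, scores

# Simulated data (illustrative only): a smooth curve plus noise.
rng = random.Random(0)
xs = [0.5 * i for i in range(21)]
ys = [math.sin(x) + rng.gauss(0.0, 0.1) for x in xs]
candidates = [0.5, 1.0, 2.0, 4.0]
best, scores = select_bandwidth(xs, ys, candidates)
```

Very small bandwidths relative to the spacing of the data can make the weighted fit numerically unstable, so the candidate grid should be chosen with the data scale in mind.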

Determination of the Best Model
In summary, the comparison of the classification accuracy values of the parametric and nonparametric regression approaches for modeling the risk of stunting in toddlers is presented in Table 4. Based on the validation results in Table 4, the best model for modeling the risk of stunting in toddlers in Probolinggo District is the model obtained from the nonparametric ordinal logistic regression approach, with a classification accuracy of 73.98%. The nonparametric approach therefore improves on the classification accuracy of the parametric approach (72.45%), although the gain of 1.53 percentage points is small.

CONCLUSIONS
The analysis of variable characteristics shows that the prevalence of stunting among toddlers is 26.7%, consisting of 21.3% moderately stunted and 5.4% severely stunted.