LOCAL LINEAR NEGATIVE BINOMIAL NONPARAMETRIC REGRESSION FOR PREDICTING THE NUMBER OF SPEED VIOLATIONS ON TOLL ROAD: A THEORETICAL DISCUSSION

CHAMIDAH, WIDYANTI, TRAPSILAWATI, SYAFITRI

In this paper, we present a theoretical discussion of local linear negative binomial regression for predicting the number of speed violations on toll roads. Data on the number of speed violations on toll roads are count data, that is, non-negative integer data generated by a counting process. Poisson regression is usually used to analyze a count response variable, but one common violation of the Poisson regression assumptions is over-dispersion. To overcome over-dispersion, we can use a negative binomial nonparametric regression approach. The negative binomial nonparametric regression model is a development of the negative binomial parametric regression model. In this research, we theoretically discuss estimation of the negative binomial nonparametric regression model based on the local linear estimator, applied to data on the number of speed violations on toll roads. The estimated model can then be used to predict the number of speed violations on toll roads, so that the Ministry of Transportation, together with the police, can use it to take preventive measures.


INTRODUCTION
Regression analysis is a method for determining the functional relationship between predictor variables and response variables. One such regression model is Poisson regression, a popular model for a count response variable [1]. A common departure from the Poisson model is failure of the restriction that the mean equals the variance (equi-dispersion); in practice, over-dispersion, where the variance exceeds the mean, occurs most often. Negative binomial regression can be used to handle over-dispersion in the Poisson regression model [2].
Over-dispersion in Poisson regression has been studied by several researchers. Generalized Poisson regression was used to model the number of traffic accidents in [3]. According to Ismail and Jemain [4], generalized Poisson and negative binomial regression can both handle over-dispersion in Poisson regression. Negative binomial regression models for handling over-dispersion were developed in [5] and [6]. In addition, [7] compared negative binomial regression and Poisson regression in analyzing the AIDS death rate and concluded that the negative binomial approach handles over-dispersion better than the Poisson approach.
In parametric regression modeling we assume that the regression curve follows a certain pattern, such as linear or quadratic. In practice, however, there are many cases where the regression curve does not follow any such pattern. In these cases, nonparametric regression modeling is more suitable [8]. The nonparametric regression approach is an alternative analysis tool when the pattern is uncertain because it is highly flexible: it does not assume a form for the regression curve but estimates the regression function from the behavior of the data alone. Several estimators are used in the nonparametric regression approach, for example the local polynomial estimator ([9], [10]), the local linear estimator ([11]-[14]), and spline and kernel estimators ([15]-[29]).
Although there have been studies of negative binomial regression and of nonparametric regression models, for count data and for other data types, those studies were limited to a single predictor variable. Therefore, in this research we extend those models to a negative binomial nonparametric regression model with more than one predictor variable, applied to count data, namely the number of speed violations on toll roads. We discuss theoretically how to estimate the model using the local linear estimator. The local linear estimator was chosen because it estimates the function locally at each point, so that the resulting estimate follows the actual pattern of the observed data more closely. The estimated model can then be used to predict the number of speed violations on toll roads.

PRELIMINARIES
A model with properties very similar to the Poisson model is the Poisson-Gamma model, in which the dependent variable is modeled as a Poisson variable whose mean includes a model error assumed to follow a Gamma distribution. As its name implies, the Poisson-Gamma is a mixture of two distributions; it was first derived by [30]. This mixture distribution was developed to account for the over-dispersion that is commonly observed in discrete or count data [31]. It became very popular because the conjugate distribution (same family of functions) has a closed form and leads to the negative binomial (NB) distribution.
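The effect of the Gamma-distributed error on the dispersion can be illustrated by simulation. The following is a minimal sketch, assuming a multiplicative Gamma error with mean 1 and variance 1/phi; the function names, the seed, and the values of `mu` and `phi` are hypothetical choices for illustration and are not taken from the paper.

```python
import math
import random
from statistics import fmean, pvariance


def sample_poisson(rng, mu):
    """Draw one Poisson(mu) count with Knuth's multiplication method."""
    threshold = math.exp(-mu)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_poisson_gamma(rng, mu, phi):
    """Draw a count whose Poisson mean is mu times a Gamma error.

    The error has shape phi and scale 1/phi (mean 1, variance 1/phi);
    this parameterization is an assumption made for the sketch.
    """
    error = rng.gammavariate(phi, 1.0 / phi)
    return sample_poisson(rng, mu * error)


rng = random.Random(0)      # fixed seed so the run is repeatable
mu, phi = 4.0, 2.0          # hypothetical mean and inverse dispersion
n_draws = 20000

plain = [sample_poisson(rng, mu) for _ in range(n_draws)]
mixed = [sample_poisson_gamma(rng, mu, phi) for _ in range(n_draws)]

plain_mean, plain_var = fmean(plain), pvariance(plain)
mixed_mean, mixed_var = fmean(mixed), pvariance(mixed)

print("Poisson       mean, variance:", plain_mean, plain_var)
print("Poisson-Gamma mean, variance:", mixed_mean, mixed_var)
```

Under these assumptions both samples should have a mean near `mu`, while only the mixed sample should show a variance clearly above its mean, which is the over-dispersion the mixture is meant to capture.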
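As discussed by [32], the name of this distribution comes from applying the binomial theorem with a negative exponent. Two major parameterizations have been proposed, known as NB1 and NB2, the latter being the more commonly known and used. NB2 is therefore described first in this sub-section. Suppose that we have a series of random counts that follows the Poisson distribution:

$$f(y_i; \mu_i) = \frac{\mu_i^{y_i} e^{-\mu_i}}{y_i!} \tag{1}$$

where $y_i$ is the observed number of counts for $i = 1, 2, \dots, n$, and $\mu_i$ is the mean of the Poisson distribution. If the Poisson mean is assumed to have a random intercept term and this term enters the conditional mean function in a multiplicative manner, we get the following relationship [2]:

$$\tilde{\mu}_i = \exp\Big(\beta_0 + \sum_{j=1}^{p} x_{ij}\beta_j + \varepsilon_i\Big) = \exp(\beta_0 + \varepsilon_i)\,\exp\Big(\sum_{j=1}^{p} x_{ij}\beta_j\Big) \tag{2}$$

where $\exp(\beta_0 + \varepsilon_i)$ is the random intercept, $\mu_i = \exp\big(\beta_0 + \sum_{j=1}^{p} x_{ij}\beta_j\big)$ is the log-link between the Poisson mean and the covariates or independent variables $x_{ij}$, and the $\beta_j$ are the regression coefficients. The relationship can also be formulated using vectors, such that $\mu_i = \exp(x_i'\beta)$. The probability density function (PDF) of the NB2 model is therefore [33]:

$$f(y_i; \phi, \mu_i) = \frac{\Gamma(y_i + \phi)}{\Gamma(\phi)\, y_i!} \left(\frac{\phi}{\mu_i + \phi}\right)^{\phi} \left(\frac{\mu_i}{\mu_i + \phi}\right)^{y_i} \tag{3}$$

where $\phi$ is the inverse dispersion parameter. The first two moments of the NB2 are the following:

$$E[y_i] = \mu_i \tag{4}$$

$$Var[y_i] = \mu_i + \frac{\mu_i^2}{\phi} \tag{5}$$

The NB1 is very similar to the NB2, but the parameterization of the variance (the second moment) differs slightly from equation (5). The first two moments of the NB1 are the following:

$$E[y_i] = \mu_i \tag{6}$$

$$Var[y_i] = \mu_i + \frac{\mu_i}{\phi} \tag{7}$$

The coefficients of the negative binomial (NB) regression model are estimated by taking the first-order conditions of the log-likelihood and setting them equal to zero.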
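The first two moments of the NB2 can be checked numerically. The following is a minimal sketch, assuming the standard NB2 probability function with mean `mu` and inverse dispersion `phi`, for which the variance is `mu + mu**2 / phi`. The function names, the truncation point `upper`, and the parameter values are hypothetical and chosen only for illustration.

```python
import math


def nb2_log_pmf(y, mu, phi):
    """Log NB2 probability of count y, with mean mu and inverse dispersion phi."""
    return (math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
            + phi * math.log(phi / (mu + phi))
            + y * math.log(mu / (mu + phi)))


def nb2_moments(mu, phi, upper=2000):
    """Sum the probabilities over 0..upper-1 and return (total, mean, variance).

    The sum is truncated at `upper`; the neglected tail is negligible for
    the moderate mu and phi used in this sketch.
    """
    probs = [math.exp(nb2_log_pmf(y, mu, phi)) for y in range(upper)]
    total = sum(probs)
    mean = sum(y * p for y, p in enumerate(probs))
    variance = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return total, mean, variance


mu, phi = 3.0, 1.5  # hypothetical values
total, mean, variance = nb2_moments(mu, phi)

print("probabilities sum to:", total)
print("mean:", mean, "expected:", mu)
print("variance:", variance, "expected:", mu + mu ** 2 / phi)
```

Working on the log scale with `math.lgamma` avoids overflow in the Gamma functions for large counts, which is why the probability is exponentiated only at the end.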

MAIN RESULTS
In this section we describe theoretically the estimation of the negative binomial nonparametric regression model using the local linear estimator, as applied to data on the number of speed violations on toll roads. The estimated model can then be used to estimate the number of speed violations on toll roads.
Next, suppose that $(x_i, y_i)$, $i = 1, 2, \dots, n$, are paired observations, where $m(x_i)$ is an unknown function, that is, a function not bound to the assumption of any particular functional form.
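As an illustration of the kernel-weighted local likelihood used with a local linear estimator in this section, the following minimal sketch evaluates such a likelihood at a target point. It rests on several assumptions that are not taken from the paper: a product Gaussian kernel, the standard NB2 probability function with a log link, and hypothetical toy data, bandwidths, and function names. It only evaluates the objective; the maximization step is not shown.

```python
import math


def nb2_log_pmf(y, mu, phi):
    """Log NB2 probability of count y, with mean mu and inverse dispersion phi."""
    return (math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
            + phi * math.log(phi / (mu + phi))
            + y * math.log(mu / (mu + phi)))


def product_gaussian_kernel(x, x0, bandwidths):
    """Multi-predictor weight: product of scaled Gaussian kernels (assumed kernel)."""
    weight = 1.0
    for xj, x0j, hj in zip(x, x0, bandwidths):
        u = (xj - x0j) / hj
        weight *= math.exp(-0.5 * u * u) / (hj * math.sqrt(2.0 * math.pi))
    return weight


def local_log_likelihood(beta, phi, X, y, x0, bandwidths):
    """Kernel-weighted NB2 log-likelihood around the target point x0.

    beta[0] is the local intercept and beta[1:] are the local slopes, so the
    local linear predictor is beta[0] + sum_j beta[j] * (x_ij - x0_j).
    """
    total = 0.0
    for xi, yi in zip(X, y):
        eta = beta[0] + sum(b * (xj - x0j)
                            for b, xj, x0j in zip(beta[1:], xi, x0))
        mu = math.exp(eta)  # log link keeps the mean positive
        weight = product_gaussian_kernel(xi, x0, bandwidths)
        total += weight * nb2_log_pmf(yi, mu, phi)
    return total


# Hypothetical toy data: two predictors per observation and a count response.
X = [(0.0, 0.0), (0.5, 0.2), (1.0, 0.4), (1.5, 0.1), (2.0, 0.3)]
y = [1, 2, 2, 4, 6]
x0, bandwidths, phi = (1.0, 0.2), (0.5, 0.5), 2.0

ll_plausible = local_log_likelihood((math.log(2.5), 0.8, 0.0), phi, X, y, x0, bandwidths)
ll_implausible = local_log_likelihood((5.0, 0.0, 0.0), phi, X, y, x0, bandwidths)

print("local log-likelihood, plausible beta:  ", ll_plausible)
print("local log-likelihood, implausible beta:", ll_implausible)
```

Centering each predictor at `x0` means that the local intercept `beta[0]` equals the estimated function value at the target point, so the fit at a new point is obtained by repeating the evaluation and maximization with a different `x0`.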
The function $m$ is approximated by nonparametric regression based on the local linear estimator, defined in equation (11), which can be written in matrix notation as equation (12). Substituting equation (12) into equation (10) gives equation (13). To estimate the parameter vector β in equation (13), we use the local maximum likelihood method. For that purpose, and based on equation (13), we form the weighted local likelihood function with a multi-predictor kernel function, given in equation (14). Next, we take the derivatives of the local likelihood function in (14) with respect to the parameter vector β and the dispersion parameter, which gives the equations in (15). The maximum of the local likelihood function in equation (14) is reached when the equations in (15) are equal to 0, but because these two equations cannot be solved directly, a numerical approximation method is needed to obtain the solution. One of the numerical methods that can be used is