Combination Estimation of Smoothing Spline and Fourier Series in Nonparametric Regression

Most researchers so far have developed a single type of estimator in nonparametric regression. In practice, however, data with mixed patterns are often encountered, especially data in which part of the pattern changes at certain subintervals while another part follows a recurring pattern along a certain trend. The estimator proposed for such data is a mixed estimator of smoothing spline and Fourier series, in which the regression model is approached by a smoothing spline component and a Fourier series component. The mixed estimator is obtained in two estimation stages. The first stage is estimation with penalized least squares (PLS), and the second stage is estimation with least squares (LS). The estimators were then applied to simulated data. The simulated data were generated from two different functions, namely, a polynomial function and a trigonometric function, with a sample size of 100. The whole process was repeated 50 times. The two functions were modeled using a mixture of the smoothing spline and Fourier series estimators with various smoothing and oscillation parameters. The model with the minimum generalized cross validation (GCV) value was selected as the best model. The simulation results showed that the mixed estimator gave a minimum GCV value of 11.98. At the minimum GCV, the mean square error (MSE) was 0.71 and R² was 99.48%. These results indicate that the mixed estimator of smoothing spline and Fourier series fits the data well.


Introduction
Regression curves are commonly estimated with either a parametric or a nonparametric regression approach. However, not every relationship pattern can be handled parametrically, because there may be no information about the form of the relationship between the response variable and the predictor variables. If the shape of the curve is unknown and the pattern is scattered, the regression curve can be estimated with a nonparametric regression model. Two widely used nonparametric regression estimators are the spline [1] and Fourier series estimators [2].
Estimation methods have attracted a lot of attention from nonparametric regression researchers. One popular method is the smoothing spline estimator. A smoothing spline estimates a nonparametric regression function that is assumed to be smooth in the sense that it belongs to a particular function space, usually a Sobolev space. Smoothing splines are also very capable of handling data whose behavior changes at certain subintervals [3][4][5][6][7]. Besides the spline estimator, another popular estimation technique in nonparametric regression is the Fourier series estimator. A Fourier series is a trigonometric polynomial that is flexible enough to adapt effectively to the local nature of the data. The Fourier series estimator is generally used when the pattern of the data is unknown and the data tend to follow a repeating pattern [8,9]. Among nonparametric regression models, the Fourier series is one that has a good statistical and visual interpretation [10]. The advantages of the Fourier series estimator are that it can handle data that follow repeating patterns at certain trend intervals and that it has a good statistical interpretation. In these previous studies, only one type of estimator was developed.
As research on nonparametric regression has progressed, mixed estimators have recently been developed. Sudiarsa et al. developed a combined estimator of Fourier series and truncated spline [11]. Budiantara et al., Rismal et al., and Ratnasari et al. developed a mixture of kernel and truncated spline estimators [12][13][14]. Other research on mixed estimators was conducted by Afifah et al. and Nisa et al., who developed a mixture of kernel and Fourier series [15,16]. A more recent mixed estimator, combining local polynomial and truncated spline, was developed by Suparti and Santoso [17]. A mixed smoothing spline and kernel estimator was developed by Hidayat et al. [18][19][20]. In daily life, mixed data patterns often appear: some data patterns change at certain subintervals, and some follow recurring patterns along certain trends. To handle such data patterns, in this study, we develop a combined estimator of smoothing spline and Fourier series. Building on the previous research described above, this paper focuses on the nonparametric regression model that combines smoothing spline and Fourier series, obtained through optimization of penalized least squares (PLS). This combined estimator is then applied to simulated data. These simulated data are generated from two different functions to represent two different patterns of predictor variables, so that the setting matches the combined estimator that was developed.

Materials and Methods
The data are given as pairs $(x_{1i}, x_{2i}, \ldots, x_{pi}, t_{1i}, t_{2i}, \ldots, t_{qi}, y_i)$, where $(x_{1i}, x_{2i}, \ldots, x_{pi}, t_{1i}, t_{2i}, \ldots, t_{qi})$ are the predictor variables and $y_i$ is the response variable. The relationship between the predictors and the response follows the multivariable nonparametric regression model

$$y_i = \mu(x_{1i}, x_{2i}, \ldots, x_{pi}, t_{1i}, t_{2i}, \ldots, t_{qi}) + \varepsilon_i, \quad i = 1, 2, \ldots, n. \tag{1}$$

Assume the multivariable nonparametric regression model is additive, so that the regression model becomes

$$y_i = \sum_{j=1}^{p} g_j(x_{ji}) + \sum_{k=1}^{q} h_k(t_{ki}) + \varepsilon_i. \tag{2}$$

In the estimation process using the PLS function, $h_k(t_{ki})$ is treated as a fixed model. The function $h_k(t_{ki})$ is then approximated by the Fourier series function. The function $g_j(x_{ji})$ is assumed to be in a Sobolev space.
Based on equation (2), for $i = 1, 2, \ldots, n$, the following equations are obtained:

$$\begin{aligned}
y_1 &= g_1(x_{11}) + g_2(x_{21}) + \cdots + g_p(x_{p1}) + h_1(t_{11}) + h_2(t_{21}) + \cdots + h_q(t_{q1}) + \varepsilon_1,\\
y_2 &= g_1(x_{12}) + g_2(x_{22}) + \cdots + g_p(x_{p2}) + h_1(t_{12}) + h_2(t_{22}) + \cdots + h_q(t_{q2}) + \varepsilon_2,\\
&\;\;\vdots\\
y_n &= g_1(x_{1n}) + g_2(x_{2n}) + \cdots + g_p(x_{pn}) + h_1(t_{1n}) + h_2(t_{2n}) + \cdots + h_q(t_{qn}) + \varepsilon_n.
\end{aligned} \tag{3}$$

Its matrix form is equation (4), which can be rewritten as equation (5) and, more compactly, as

$$y = g + h + \varepsilon, \tag{6}$$

with $y = (y_1, y_2, \ldots, y_n)'$, where $g$, $h$, and $\varepsilon$ are the corresponding $n$-vectors of the smoothing spline components, the Fourier series components, and the random errors.

The component regression curve $g_j(x_{ji})$, $j = 1, 2, \ldots, p$, is assumed to be smooth in the sense that it is contained in the Sobolev space, $g_j \in W_2^m$, $j = 1, 2, \ldots, p$. The component regression curve $h_k(t_{ki})$, $k = 1, 2, \ldots, q$, is approached by the Fourier series function:

$$h_k(t_{ki}) = b_k t_{ki} + \tfrac{1}{2}\alpha_{0k} + \sum_{u=1}^{K} \alpha_{uk} \cos(u\, t_{ki}), \quad k = 1, 2, \ldots, q.$$

The combined smoothing spline and Fourier series estimator in nonparametric regression is obtained in two stages. The first stage estimates the smoothing spline component using the PLS method, and the second stage estimates the Fourier series component using the LS method. To estimate the smoothing spline component, equation (6) is modified to the form

$$z = g + \varepsilon,$$

where $z = y - h$. The smoothing spline component is estimated by PLS optimization. The PLS criterion combines a goodness-of-fit term, based on the squared deviations of $z_i$ from $\sum_{j=1}^{p} g_j(x_{ji})$, with the penalties $\int [g_j^{(m)}(x_j)]^2\, dx_j$, each weighted by a smoothing parameter $\lambda_j$ with $0 < \lambda_j < \infty$. The results estimated in the first stage are then substituted into regression equation (8). In the second stage, the estimates of the Fourier series components are obtained by LS optimization. Finally, the results of the two-stage estimation are substituted into equation (6) to obtain the combined smoothing spline and Fourier series estimator in multivariable nonparametric regression.
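The two-stage idea described above can be illustrated with a minimal numerical sketch. It is not the paper's closed-form estimator: the matrix `S` below is a second-difference penalized smoother used as a stand-in for the smoothing spline hat matrix, and the function names, the test functions, and the value `lam=10.0` are all assumptions made for illustration. Stage 1 gives the smooth part as `S @ (y - T @ alpha)` for fixed Fourier coefficients, and stage 2 chooses the coefficients by least squares on the remaining residuals.

```python
import numpy as np

def second_difference_smoother(n, lam):
    # Stand-in for the smoothing spline hat matrix (an assumption, not the
    # paper's formula): S = (I + lam * D'D)^{-1}, D = second-difference operator.
    D = np.diff(np.eye(n), n=2, axis=0)              # shape (n - 2, n)
    return np.linalg.inv(np.eye(n) + lam * D.T @ D)

def two_stage_fit(y, S, T):
    n = len(y)
    R = np.eye(n) - S
    # Stage 1 (PLS): for fixed alpha the smooth part is S (y - T alpha),
    # so the residual vector is R (y - T alpha).
    # Stage 2 (LS): minimise the residual sum of squares over alpha.
    # The constant column of T is absorbed by the smoother, so R @ T is rank
    # deficient; rcond truncates that direction (minimum-norm solution).
    alpha, *_ = np.linalg.lstsq(R @ T, R @ y, rcond=1e-10)
    h_hat = T @ alpha
    g_hat = S @ (y - h_hat)
    return g_hat, h_hat, g_hat + h_hat

# Hypothetical example data: smooth trend in x plus a periodic effect in t.
rng = np.random.default_rng(0)
n, K = 100, 1
x = np.linspace(0.0, 1.0, n)                         # sorted, equally spaced
t = rng.uniform(0.0, 2.0 * np.pi, n)
y = 2.0 + 3.0 * x**2 + 2.0 * np.cos(t) + rng.normal(0.0, 0.1, n)

# Fourier design for one predictor: columns t, 1/2, cos(t), ..., cos(K t).
T = np.column_stack([t, np.full(n, 0.5)] + [np.cos(u * t) for u in range(1, K + 1)])
S = second_difference_smoother(n, lam=10.0)
g_hat, h_hat, mu_hat = two_stage_fit(y, S, T)
mse = float(np.mean((y - mu_hat) ** 2))
```

The split of a constant level between `g_hat` and `h_hat` is not identified in this sketch, but their sum `mu_hat` is, which is why the fit is assessed on the combined estimate.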

Results and Discussion
The function $g$ in equation (6) is a function whose form is unknown and which is assumed to be smooth in the sense of being contained in space $W$. Space $W$ can then be decomposed into a direct sum of two mutually orthogonal spaces $W_0$ and $W_1$, that is, $W = W_0 \oplus W_1$ with $W_0 \perp W_1$. The form of the function $g$ and of the goodness of fit is given in the following theorem.

Theorem 1. If the goodness of fit is given as in equation (9), then the goodness of fit can be written in the form derived in Appendix A, where $z = y - h$.

The proof of Theorem 1 is provided in Appendix A.

Lemma 1. If the penalty of the PLS is given as in equation (9), then the penalty can be written in the form derived in Appendix B.

The proof of Lemma 1 is provided in Appendix B. Then, according to Theorem 1 and Lemma 1, the smoothing spline component estimator $\hat{g}_{(\lambda,K)}(x, t)$, the Fourier series component estimator $\hat{h}_{(\lambda,K)}(x, t)$, and the combined smoothing spline and Fourier series estimator $\hat{\mu}_{(\lambda,K)}(x, t)$ are derived. The whole process is described in Theorem 2.

Theorem 2. If the regression model is given as in equation (16), then the estimators $\hat{g}_{(\lambda,K)}(x, t)$, $\hat{h}_{(\lambda,K)}(x, t)$, and $\hat{\mu}_{(\lambda,K)}(x, t)$ are obtained through the PLS optimization of equation (9), as derived in Appendix C.

The proof of Theorem 2 is provided in Appendix C.

Simulation. A simulation is conducted to show the ability of the combined estimator of smoothing spline and Fourier series in multivariable nonparametric regression. Data are generated from a polynomial function for the smoothing spline component and a trigonometric function for the Fourier series component, with normally distributed errors. The simulation uses sample size n = 100 and oscillation parameters K = 1, K = 2, and K = 3. Each oscillation parameter setting is repeated fifty times. The regression equation designed for this simulation study is the sum of the polynomial component, the trigonometric component, and a random error $\varepsilon_i$ generated from the N(0, 1) distribution. A scatterplot of the simulated data is shown in Figure 1.
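A data-generating step of this kind can be sketched as follows. The sample size, the N(0, 1) errors, and the fifty replications follow the description above; the specific polynomial and trigonometric functions, the predictor ranges, and the function name `simulate_once` are hypothetical choices, since the simulation equation itself is not reproduced here.

```python
import numpy as np

def simulate_once(n=100, seed=0):
    """Generate one simulated data set (x, t, y) of size n."""
    rng = np.random.default_rng(seed)
    x = np.sort(rng.uniform(0.0, 1.0, n))        # predictor for the spline part
    t = rng.uniform(0.0, 2.0 * np.pi, n)         # predictor for the Fourier part
    g = 4.0 - 6.0 * x + 9.0 * x**2               # hypothetical polynomial component
    h = 3.0 * np.cos(t)                          # hypothetical trigonometric component
    eps = rng.normal(0.0, 1.0, n)                # errors from N(0, 1)
    return x, t, g + h + eps

# Fifty replications, each with its own seed so the runs are reproducible.
replications = [simulate_once(n=100, seed=s) for s in range(50)]
```

Each replication would then be fitted for K = 1, 2, 3 over a grid of smoothing parameters.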
The two-dimensional plot of the simulated data is shown in Figure 1. From Figure 1, it can be seen that part of the data tends to change on particular subintervals, as in a smoothing spline pattern, and part of the data tends to follow a repeating pattern, as in a Fourier series pattern. Using the two-stage estimation on the simulated data, the combined estimator is fitted for various smoothing parameters and oscillation parameters. The optimal smoothing and oscillation parameters are those that minimize the GCV value. The GCV values for the oscillation parameters K = 1, K = 2, and K = 3 are given in Table 1. Table 1 and Figure 2 show that the minimum GCV value is 39.79, the optimal smoothing parameter is 0.9, and the optimal oscillation parameter is K = 1.
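Model selection by GCV can be sketched for any estimator that is linear in the response, $\hat{\mu} = A y$. The sketch below assumes the common form GCV = MSE / (tr(I − A)/n)², which may differ in detail from the criterion used in this study. The smoother, the test signal, and the grid of smoothing parameters are illustrative assumptions, and in the full procedure the same score would also be compared across K = 1, 2, 3.

```python
import numpy as np

def gcv(y, A):
    """GCV score for a linear fit mu_hat = A @ y (assumed standard form)."""
    n = len(y)
    resid = y - A @ y
    mse = np.mean(resid ** 2)
    return float(mse / (np.trace(np.eye(n) - A) / n) ** 2)

def second_difference_smoother(n, lam):
    # Illustrative linear smoother, standing in for the combined hat matrix.
    D = np.diff(np.eye(n), n=2, axis=0)
    return np.linalg.inv(np.eye(n) + lam * D.T @ D)

rng = np.random.default_rng(1)
n = 100
x = np.linspace(0.0, 1.0, n)
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.3, n)   # hypothetical signal

grid = [0.1, 1.0, 10.0, 100.0, 1000.0]                   # hypothetical lambda grid
scores = {lam: gcv(y, second_difference_smoother(n, lam)) for lam in grid}
best_lam = min(scores, key=scores.get)                   # lambda with minimum GCV
```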
This model gives satisfactory results, with GCV = 39.79, MSE = 0.13, and R² = 99.48%. The plot of the estimation results against the original simulated data is presented in Figure 3. Based on the theory, a mixed estimator of smoothing spline and Fourier series was obtained. The theory was then examined through simulation, in which the estimator produced a high R² value. This research can be extended by other studies using different estimators for mixed-pattern data. In addition, the optimization methods can be used to solve other mixed estimator problems, such as multiresponse models and longitudinal data.
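For reference, the two fit measures reported above can be computed as in the following sketch. It assumes the usual definitions, MSE as the mean of squared residuals and R² as one minus the ratio of residual to total sum of squares; the normalization used in this study is not stated, and the function name and the toy numbers are illustrative only.

```python
import numpy as np

def mse_and_r2(y, y_hat):
    """Return (MSE, R^2) under the usual definitions (assumed here)."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    sse = np.sum((y - y_hat) ** 2)               # residual sum of squares
    sst = np.sum((y - y.mean()) ** 2)            # total sum of squares
    return float(sse / len(y)), float(1.0 - sse / sst)

# Toy check: a perfect fit, and a fit that only predicts the mean.
mse_perfect, r2_perfect = mse_and_r2([1, 2, 3, 4], [1, 2, 3, 4])
mse_mean, r2_mean = mse_and_r2([1, 2, 3, 4], [2.5, 2.5, 2.5, 2.5])
```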

Conclusions
Based on the discussion, the following conclusions can be drawn: (a) Based on PLS optimization, the smoothing spline component estimator $\hat{g}_{(\lambda,K)}(x, t)$, the Fourier series component estimator $\hat{h}_{(\lambda,K)}(x, t)$, and the combined smoothing spline and Fourier series estimator $\hat{\mu}_{(\lambda,K)}(x, t)$ are obtained as stated in Theorem 2. (b) The simulation shows that the mixed estimator of smoothing spline and Fourier series performs well, with R² = 99.48% and MSE = 0.13.

Appendix

A. Proof for Theorem 1
If the basis of space $W_0$ has as many elements as the spline order and the basis of space $W_1$ is $\psi_1, \psi_2, \ldots, \psi_n$, with $n$ being the number of observations, then each function $g_j \in W$ can be written as $g_j = u_j + v_j$, where $u_j \in W_0$ and $v_j \in W_1$ are linear combinations of the respective basis functions, with constant coefficients $c_j$ and $d_j$. Next, let $L_x$ be a bounded linear functional on space $W$ and let $g_j \in W$, so that $L_x g_j = g_j(x_{ji})$. Because $L_x$ is a bounded linear functional on $W$, there is a unique $\eta_i \in W$ that represents $L_x$ and satisfies the following equation:

$$L_x g_j = \langle \eta_i, g_j \rangle. \tag{A.7}$$
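The Fourier series regression equation can be written as follows:

$$h_k(t_{ki}) = b_k t_{ki} + \tfrac{1}{2}\alpha_{0k} + \sum_{u=1}^{K} \alpha_{uk} \cos(u\, t_{ki}). \tag{A.17}$$

Equation (A.17) can be written in matrix form as

$$h_k = T_k \alpha_k, \tag{A.18}$$

with $\alpha_k = (b_k, \alpha_{0k}, \alpha_{1k}, \ldots, \alpha_{Kk})'$. In equation (A.16), the Fourier series function for the nonparametric regression component with $q$ predictors can be expressed in the following form:

$$\sum_{k=1}^{q} T_k \alpha_k = T_1 \alpha_1 + T_2 \alpha_2 + \cdots + T_q \alpha_q = T\alpha, \tag{A.19}$$

where

$$T = \begin{bmatrix}
t_{11} & \tfrac{1}{2} & \cos t_{11} & \cdots & \cos K t_{11} & \cdots & t_{q1} & \tfrac{1}{2} & \cos t_{q1} & \cdots & \cos K t_{q1}\\
t_{12} & \tfrac{1}{2} & \cos t_{12} & \cdots & \cos K t_{12} & \cdots & t_{q2} & \tfrac{1}{2} & \cos t_{q2} & \cdots & \cos K t_{q2}\\
\vdots & \vdots & \vdots & & \vdots & & \vdots & \vdots & \vdots & & \vdots\\
t_{1n} & \tfrac{1}{2} & \cos t_{1n} & \cdots & \cos K t_{1n} & \cdots & t_{qn} & \tfrac{1}{2} & \cos t_{qn} & \cdots & \cos K t_{qn}
\end{bmatrix}.$$

With $z = y - h = y - T\alpha$, the goodness of fit component takes the form stated in Theorem 1, which completes the proof. The penalty component of the PLS optimization is treated next.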
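The block structure of the matrix T in (A.19) can be made concrete with a short sketch. The function name `fourier_design_matrix` and the convention that `t` is an n-by-q array (one column per predictor) are assumptions for illustration; the column order within each block follows (A.17): the linear term, the constant 1/2, then the cosines up to order K.

```python
import numpy as np

def fourier_design_matrix(t, K):
    """Build T = [T_1, ..., T_q]; block k has columns t_k, 1/2, cos(t_k), ..., cos(K t_k)."""
    t = np.asarray(t, dtype=float)
    if t.ndim == 1:                              # a single predictor given as a vector
        t = t[:, None]
    blocks = []
    for k in range(t.shape[1]):
        tk = t[:, k]
        cols = [tk, np.full_like(tk, 0.5)]
        cols += [np.cos(u * tk) for u in range(1, K + 1)]
        blocks.append(np.column_stack(cols))
    return np.hstack(blocks)                     # shape (n, q * (K + 2))

# One observation, two predictors (t_1 = 0, t_2 = pi), oscillation parameter K = 2.
T_example = fourier_design_matrix(np.array([[0.0, np.pi]]), K=2)
```

With this layout, the Fourier component for all q predictors is the single product `T @ alpha`, where `alpha` stacks the coefficient vectors of the individual predictors.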

B. Proof for Lemma 1
The penalty component $\int [g_j^{(m)}(x_j)]^2\, dx_j$ can be obtained as in equation (B.1).

Journal of Mathematics
By substituting equation (B.1) into the penalty component, the form stated in Lemma 1 is obtained, which proves the lemma.

C. Proof for Theorem 2
Theorem 1 discussed the function and the goodness of fit for the spline function. The first step is to construct the mixed estimator using PLS, as described below.

After $\hat{g}_{(\lambda,K)}(x, t)$ is obtained in equation (C.8), the next quantity sought is $\hat{h}_{(\lambda,K)}(x, t)$, as explained in the second step below. Equation (C.8) is substituted into equation (4), which gives equation (C.10). After equation (C.10) is obtained, the next step is to find $\varepsilon'\varepsilon$, which leads to equation (C.14). Based on equation (C.14), the combined estimator of smoothing spline and Fourier series is obtained, as stated in Theorem 2.
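As a hedged illustration of the second step, suppose the first step yields an estimate that is linear in $z$, written here as $\hat{g} = S_\lambda (y - T\alpha)$. The symbol $S_\lambda$ is a hypothetical name for the smoothing spline hat matrix and is not the notation of the original derivation. Under that assumption, and provided the inverse below exists, the least squares step can be sketched as:

```latex
\begin{aligned}
\varepsilon &= y - \hat{g} - T\alpha = (I - S_\lambda)(y - T\alpha),\\
\varepsilon'\varepsilon &= (y - T\alpha)'(I - S_\lambda)'(I - S_\lambda)(y - T\alpha),\\
\frac{\partial\, \varepsilon'\varepsilon}{\partial \alpha} = 0
  \;&\Longrightarrow\;
  \hat{\alpha} = \left[ T'(I - S_\lambda)'(I - S_\lambda) T \right]^{-1}
                 T'(I - S_\lambda)'(I - S_\lambda)\, y,\\
\hat{h} &= T\hat{\alpha}, \qquad
\hat{g} = S_\lambda (y - T\hat{\alpha}), \qquad
\hat{\mu} = \hat{g} + \hat{h}.
\end{aligned}
```

Each of the three estimates is then a fixed matrix applied to $y$, which is what makes a GCV criterion based on the trace of the combined hat matrix applicable.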

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.