Statistical inference on nonparametric regression model with approximation of Fourier series function: Estimation and hypothesis testing

Nonparametric regression is an approximation method in regression analysis that is not constrained by the assumption of a known regression curve. One function that can approximate the curve is the Fourier series function. The nonparametric regression model with approximation of a Fourier series function has been widely discussed by several researchers. However, discussions of statistical inference, particularly partial hypothesis testing, have not been carried out previously. Therefore, the purpose of this research is to discuss statistical inference on the nonparametric regression model with approximation of a Fourier series function. The discussion includes parameter and model estimation as well as simultaneous and partial hypothesis testing. In the application, we use life expectancy data from East Java Province for 2022. Based on the data analysis, we obtain a model estimate with an R-square value of 96.24 %. At a 5 % significance level, the parameters simultaneously have a significant influence on the model. Partially, four parameters are not significant. However, overall, the predictor variables significantly influence the life expectancy data.
• The Fourier series function used is the one introduced by Bilodeau (1992).
• The model estimate is obtained by selecting the optimal number of oscillation parameters.
• The test statistic is obtained using the LRT method.


Background
Nonparametric regression is an approximation method in regression analysis that is not constrained by the assumption of a known regression curve. Moreover, nonparametric regression offers high flexibility, as the regression curve can adapt to the local nature of the data [1]. However, a nonparametric regression curve cannot be determined arbitrarily without information from the data, such as data patterns examined in a scatterplot [2][3]. In nonparametric regression, there are several estimator approaches for the regression curve, such as the Spline function [4][5][6], the Kernel function [7][8][9], the Fourier series function [10][11][12][13], and many others. Among these approaches, the Fourier series function is particularly intriguing to many researchers in the field of nonparametric regression because of its ability to handle data that exhibit repeated patterns at certain intervals. According to Srivani et al. and Steinerberger, a Fourier series is a trigonometric polynomial function that combines cos and sin functions [14, 15]. However, Bilodeau introduced a new Fourier series function for the nonparametric regression model, which includes only the cos function and adds a trend to the function [10]. The Fourier series estimator introduced by Bilodeau has been widely adopted and further developed by numerous researchers for estimating the nonparametric regression curve [see, 13, 16-19]. In applications, this estimator has been used for modelling various data [16, 17, 19-21].
In regression analysis, statistical inference plays a crucial role in obtaining the best model estimates. Generally, statistical inference in regression analysis is divided into two main areas: estimation and hypothesis testing. Estimation in regression analysis involves estimating the parameters in the model. These parameter estimates provide a model estimate that is useful for drawing conclusions about the response variable based on the given predictor variables. Hypothesis testing in regression analysis, in turn, aims to determine whether the parameters significantly influence the model. It is generally divided into two categories: simultaneous and partial hypothesis testing. Simultaneous hypothesis testing aims to determine the combined influence of all parameters on the model. If this test indicates that the parameters simultaneously have a significant influence, then partial hypothesis testing is performed to identify which specific parameters influence the model. In the field of regression analysis, statistical inference has been widely discussed by several researchers [22][23][24].
The discussion of statistical inference in the nonparametric regression model with approximation of a Fourier series function has so far focused primarily on parameter and model estimation [10, 12, 16-18, 20, 21]. Ramli et al. are the only researchers who have discussed hypothesis testing in this model, focusing on simultaneous hypothesis testing [19]. Meanwhile, partial hypothesis testing in this model has not been carried out. The significant role of statistical inference in regression analysis, especially in the nonparametric regression model with approximation of a Fourier series function, makes this an interesting topic for discussion in this research. Since several aspects of statistical inference in this model, such as estimation and simultaneous hypothesis testing, have been previously discussed both theoretically and practically, this research discusses these areas only briefly. The primary focus is on the theoretical development of partial hypothesis testing, which has not been carried out previously. Furthermore, to apply statistical inference in the nonparametric regression model with approximation of a Fourier series function, we use life expectancy data from 38 districts and cities in East Java Province during 2022.

The model and its estimation
In this section, we provide a general description of the nonparametric regression model with approximation of a Fourier series function. The Fourier series function we use follows the formulation introduced by Bilodeau [10]. Parameter estimation in this model is conducted using the Ordinary Least Squares (OLS) method, a technique employed by several previous researchers. Additionally, the optimal parameter estimates for the model are selected using the Generalized Cross Validation (GCV) method.

The nonparametric regression model with approximation of a Fourier series function
Suppose $y_i$ is the response variable and $x_{1i}, x_{2i}, \ldots, x_{pi}$ are the $p$ predictor variables, where $i = 1, 2, \ldots, n$. We assume that the relationship between the response and predictor variables follows the nonparametric regression model described in Eq. (1).
Assuming that the predictor variables are not correlated with each other, the nonparametric regression model in Eq. (1) can be written as an additive model.
Since the regression curve of each predictor is the nonparametric regression curve, which we assume to be an unknown function, it can be approximated using one of the estimators in the nonparametric regression model, specifically the Fourier series function [10], which is given in Eq. (3).
where the constant parameter represents the baseline level, the trend parameter adds a linear trend to the function, and the parameters of the cos terms add periodic components to the function, for each of the $p$ predictor variables and each of the $T$ cos terms, where $T$ is the oscillation parameter. Furthermore, since the regression curve is approximated with a Fourier series function, substituting Eq. (3) into Eq. (2) gives the nonparametric regression model with approximation of a Fourier series function as follows.
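As an illustration of how the Fourier series terms enter the model, the following minimal sketch builds the columns for a single predictor. It assumes the commonly cited Bilodeau-type form (a trend term, a constant scaled by 1/2, and cos(t x) terms for t = 1, ..., T). The function name `fourier_design`, the 1/2 scaling, and the column order are assumptions made for illustration and are not taken from the equations of this paper.

```python
import numpy as np

def fourier_design(x, T):
    """Columns for one predictor: trend x, constant 1/2, cos(t x) for t = 1..T.

    Assumed form (illustrative): b*x + a0/2 + sum_t a_t*cos(t*x).
    """
    x = np.asarray(x, dtype=float)
    cols = [x, np.full_like(x, 0.5)]
    cols += [np.cos(t * x) for t in range(1, T + 1)]
    return np.column_stack(cols)

# Six illustrative points on [0, pi] with oscillation parameter T = 2.
x = np.linspace(0.0, np.pi, 6)
X = fourier_design(x, T=2)
print(X.shape)  # (6, 4): one predictor contributes T + 2 columns
```

Under this assumed form, each predictor contributes T + 2 columns, one per parameter of that predictor.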

Parameter and model estimations
In regression analysis, obtaining model estimates is equivalent to estimating the parameters in the model. In the nonparametric regression model with approximation of a Fourier series function, parameter estimates can be obtained using several estimation methods, such as the Penalized Least Squares (PLS) method [10, 12, 18, 20] and the OLS method [16, 17, 19, 21]. The choice between these two methods depends on the form of the regression curve. If the curve is assumed to be a smooth function, then the PLS method is used. On the other hand, if we simply aim to obtain parameter estimates without assuming that the curve must be smooth, we can use the OLS method. In general, the OLS method is a classic approach in regression analysis for obtaining parameter estimates by minimizing the sum of squared errors. In this research, we employ the OLS method for parameter estimation. Therefore, based on the nonparametric regression model with approximation of a Fourier series function in Eq. (5), the parameter estimate using the OLS method is obtained as follows [16, 17, 19, 21].
where $\hat{\delta}$ in Eq. (6) is the estimate of the parameter vector $\delta$ in Eq. (5). Furthermore, based on $\hat{\delta}$ in Eq. (6), we obtain the estimate of the nonparametric regression model with approximation of a Fourier series function as follows.
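The OLS step can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the helper names `fourier_design` and `ols_fit` are hypothetical, the design uses an assumed Bilodeau-type form for a single predictor, and the data are synthetic and noise-free so that the least-squares solution should reproduce the generating parameters.

```python
import numpy as np

def fourier_design(x, T):
    """Trend, constant (scaled by 1/2) and cos(t x) columns for one predictor."""
    x = np.asarray(x, dtype=float)
    cols = [x, np.full_like(x, 0.5)] + [np.cos(t * x) for t in range(1, T + 1)]
    return np.column_stack(cols)

def ols_fit(X, y):
    """Least-squares estimate of the parameter vector and the fitted values."""
    delta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return delta_hat, X @ delta_hat

# Synthetic, noise-free data (not the paper's data).
x = np.linspace(0.0, 2.0 * np.pi, 30)
true_delta = np.array([0.8, 2.0, -1.5, 0.4])  # trend, constant, cos(x), cos(2x)
X = fourier_design(x, T=2)
y = X @ true_delta
delta_hat, y_hat = ols_fit(X, y)
print(np.round(delta_hat, 6))
```

`np.linalg.lstsq` is used here instead of forming the inverse of the cross-product matrix explicitly; for a full-rank design both routes give the same estimate.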

Model selection
In contrast to parametric regression models, where estimates are obtained directly from the parameters without the need to select an optimal model, nonparametric regression models require the selection of an optimal model estimate. Whether the regression curve is approximated with Spline, Kernel, or Fourier series functions, there is always at least one parameter that must be chosen to obtain the optimal model estimate. These parameters are the knot points in a Spline function [4][5][6], the bandwidth in a Kernel function [7][8][9], and the oscillation parameters in a Fourier series function [16, 19, 21]. Therefore, a method is needed to select the optimal model. One such method is the GCV method, which was first introduced by Craven and Wahba [25] and has been widely used in nonparametric regression models for optimal model selection. Selecting an optimal nonparametric regression model with approximation of a Fourier series function involves choosing the oscillation parameters that give the minimum GCV value. The GCV function for selecting the optimal oscillation parameters is given in Eq. (8) [16, 19, 21].
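The selection of the oscillation parameter by minimum GCV can be sketched as follows. The sketch assumes the usual Craven and Wahba form, GCV = MSE / (trace(I - A) / n)^2 with A the hat matrix of the fit; the exact expression in Eq. (8) may differ in notation. The helper names, the single-predictor design, and the synthetic data are assumptions for illustration only.

```python
import numpy as np

def fourier_design(x, T):
    """Trend, constant (scaled by 1/2) and cos(t x) columns for one predictor."""
    x = np.asarray(x, dtype=float)
    cols = [x, np.full_like(x, 0.5)] + [np.cos(t * x) for t in range(1, T + 1)]
    return np.column_stack(cols)

def gcv(X, y):
    """GCV = MSE / (trace(I - A) / n)^2, with A the hat matrix of the OLS fit."""
    n = len(y)
    A = X @ np.linalg.pinv(X)          # hat matrix, fitted values are A @ y
    resid = y - A @ y
    mse = resid @ resid / n
    return mse / (np.trace(np.eye(n) - A) / n) ** 2

# Synthetic data with a cos(2x) component (not the paper's data).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0 * np.pi, 38)
y = 70.0 + 0.3 * x + 1.5 * np.cos(2 * x) + rng.normal(scale=0.2, size=x.size)

# Evaluate T = 1, ..., 5 and keep the value with the smallest GCV.
scores = {T: gcv(fourier_design(x, T), y) for T in range(1, 6)}
best_T = min(scores, key=scores.get)
print(scores, best_T)
```

With several predictors, the same score would be evaluated over every combination of oscillation parameters rather than over a single T.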
Table 1
Partial hypotheses form.
Null Hypothesis | Alternative Hypothesis

Simultaneous hypothesis testing
Simultaneous hypothesis testing is used to evaluate the influence of all parameters simultaneously in a regression model. Because this type of testing, particularly for the nonparametric regression model with approximation of a Fourier series function, was developed by Ramli et al. [19], it is explained only briefly here to support the application to the data in this research. The first step in simultaneous hypothesis testing is to formulate the hypotheses. Based on the nonparametric regression model with approximation of a Fourier series function in Eq. (4), we can define the hypothesis form for simultaneous hypothesis testing as follows.
The hypothesis form given in Eq. (9) is a specific hypothesis form used to evaluate the influence of all parameters on the model without testing the magnitude of their influence. To test the hypothesis in Eq. (9), a formula for the test statistic is required. One method for obtaining a test statistic is the Likelihood Ratio Test (LRT) method. According to Ramli et al., using the LRT method, the test statistic for simultaneous hypothesis testing with the hypothesis form given in Eq. (9) is as follows [19].

Partial hypothesis testing
In this section, we discuss partial hypothesis testing for the parameters in the nonparametric regression model with approximation of a Fourier series function. Partial hypothesis testing continues from the simultaneous hypothesis testing explained in the previous section. Following the same framework, the formula for the test statistic in partial hypothesis testing is derived using the LRT method. We also provide a more detailed discussion of the distribution of the test statistic and the rejection region of the null hypothesis. The initial step in partial hypothesis testing is to formulate the hypotheses. Based on the nonparametric regression model with approximation of a Fourier series function described in Eq. (4), we can formulate the hypothesis forms for partial hypothesis testing as presented in Table 1 below.
Table 1 presents a partial hypothesis formulation for all parameters in the nonparametric regression model with approximation of a Fourier series function. The number of hypothesis formulations in Table 1 corresponds to the number of parameters in the model. Consequently, testing these hypotheses individually is time-consuming and involves lengthy mathematical elaboration. To address this, we combine the hypotheses in Table 1 into a simpler form to obtain the test statistic, its distribution, and the rejection region for the null hypothesis. This is achieved by formulating the hypotheses in Table 1 in the following vector form.
where the row vector in Eq. (11) has 1 in the column of the parameter being tested and 0 elsewhere, and the number of rows in the vector $\delta$ is $p(T + 2)$.
To obtain the formula for the test statistic in partial hypothesis testing, where the hypothesis form is given in Eq. (11), we use the LRT method. The LRT method has been widely used in hypothesis testing across various regression models [see, 19, 26-28]. The main concept of the LRT method is to compare the likelihood function of the model under the null hypothesis with the likelihood function of the model under the alternative hypothesis. Based on the nonparametric regression model in Eq. (2), where the error $\varepsilon_i$ is normally distributed with mean 0 and variance $\sigma^2$, the likelihood function is obtained as follows.
Furthermore, the likelihood function in Eq. (12) can be written in the following vector form.
where Eq. (14) is the likelihood function of the response vector, which follows a multivariate normal distribution. Furthermore, let $\omega$ be the parameter space under the null hypothesis and $\Omega$ be the unrestricted parameter space. Therefore, based on the hypothesis form in Eq. (11) and the likelihood function in Eq. (14), the parameter spaces $\omega$ and $\Omega$ can be determined as follows.
Based on the likelihood function in Eq. (14) and the parameter space under the null hypothesis ($\omega$) in Eq. (15), the likelihood function under the parameter space $\omega$ is obtained as follows.
where the likelihood function in Eq. (17) is subject to the constraint that the tested parameter equals 0. Similarly, based on the likelihood function in Eq. (14) and the unrestricted parameter space ($\Omega$) defined in Eq. (16), the likelihood function under the parameter space $\Omega$ is obtained as follows.
The test statistic
The test statistic for partial hypothesis testing with the hypothesis form in Eq. (11) is obtained using the LRT method. The LRT method yields a likelihood ratio that compares the maximum of the likelihood function under the parameter space $\omega$ with its maximum under the parameter space $\Omega$. From this likelihood ratio, we derive a formula for the test statistic, which is stated in Theorem 1. Before presenting Theorem 1, we first obtain the likelihood ratio using the following equation.
where the quantity in Eq. (19) is the likelihood ratio, $L(\hat{\omega})$ is the maximum likelihood under the parameter space $\omega$, and $L(\hat{\Omega})$ is the maximum likelihood under the parameter space $\Omega$. According to Ramli et al., if $\Omega$ is the unrestricted parameter space given in Eq. (16), with the likelihood function under $\Omega$ provided in Eq. (18), then the maximum likelihood under the parameter space $\Omega$ is obtained as follows [19].
Furthermore, $L(\hat{\omega})$ can be obtained by estimating the parameters within the parameter space $\omega$, namely $\delta_{\omega}$ and $\sigma_{\omega}^{2}$. Since the parameter space $\omega$ imposes the constraint of the null hypothesis in Eq. (11) on $\delta_{\omega}$, we can estimate $\delta_{\omega}$ using the Lagrange Multiplier (LM) method, which is commonly applied when parameters are subject to constraints. This method is extensively discussed in the context of general linear hypotheses in parametric regression, as explained in Searle [29]. Based on the likelihood function under the parameter space $\omega$ in Eq. (17), with $\omega$ defined in Eq. (15), the LM function is given as follows.
Since the LM function in Eq. (21) contains two unknowns, namely $\delta_{\omega}$ and the Lagrange multiplier, their estimates are obtained by taking the partial derivatives of the LM function with respect to each of them and setting these derivatives equal to 0. The parameter estimates are obtained as follows.
Furthermore, for the estimation of  is obtained as follows.
Based on Eq. ( 24) , since   is estimated by δ in Eq. ( 23) then the estimation of  is obtained as follows.
For the estimation of the parameter $\sigma_{\omega}^{2}$, we can directly use the Maximum Likelihood Estimation (MLE) method, by setting the derivative of the log-likelihood with respect to $\sigma_{\omega}^{2}$ equal to 0. Based on the likelihood function under the parameter space $\omega$ in Eq. (17), we obtain the log-likelihood as follows.
Therefore, based on Eq. (27), we obtain Eq. (28). Furthermore, by elaborating on Eq. (28), and considering that $\delta_{\omega}$ is estimated by $\hat{\delta}_{\omega}$ in Eq. (26), the estimate of $\sigma_{\omega}^{2}$ is obtained as follows.
Therefore, based on Eqs. (26) and (29), we obtain the maximum likelihood under the parameter space $\omega$ as follows.
where $L(\hat{\omega})$ in Eq. (30) is the maximum likelihood under the parameter space $\omega$. Furthermore, based on the maximum likelihood under $\omega$ in Eq. (30) and the maximum likelihood under $\Omega$ in Eq. (20), we can express the likelihood ratio in Eq. (19) as follows.
Based on the likelihood ratio in Eq. (33), the test statistic for partial hypothesis testing in the nonparametric regression model with approximation of a Fourier series function, with the hypothesis form given in Eq. (11), is presented in Theorem 1.

Theorem 1.
Given the hypothesis form in Eq. (11), by using the LRT method, the test statistic for testing $H_0$ against $H_1$ is as follows.
Proof. Note that partial hypothesis testing in the nonparametric regression model with approximation of a Fourier series function employs the LRT method. According to Casella and Berger, the LRT method specifies a rejection region in which the likelihood ratio is less than or equal to a constant between 0 and 1 [30]. Therefore, based on this rejection region, the likelihood ratio in Eq. (33) can be expressed as Eq. (35). Multiplying Eq. (35) by the degrees of freedom obtained later in Theorem 2 gives Eq. (36). Since the vector in Eq. (11) is a row vector and $\hat{\delta}_{\Omega}$ is a column vector, the products involving them in Eq. (36) are scalars, so Eq. (36) can be written as Eq. (37). Furthermore, Eq. (37) is squared on both sides, and since $\hat{\delta}_{\Omega}$, the OLS estimator, equals $\hat{\delta}$ in Eq. (6), Eq. (37) can be written in terms of the test statistic for partial hypothesis testing with the hypothesis form given in Eq. (11).

Distribution of the test statistic
Based on Theorem 1, the statistic in Eq. (34) is the test statistic for partial hypothesis testing in the nonparametric regression model with approximation of a Fourier series function using the hypothesis form in Eq. (11), and the rejection region for testing $H_0$ against $H_1$ is defined through a critical value that contains a constant. To determine this constant, we derive the distribution of the test statistic, which is obtained in Theorem 2. Before presenting Theorem 2, Lemma 1 is provided to facilitate its proof.

Lemma 1.
Let the test statistic be as given in Theorem 1. Then its square can be written in quadratic form as follows.
where  2 describes the quadratic form of  with  ′  and  ′ ( −  )  are being quadratic in .
Theorem 2. Consider the test statistic for testing $H_0$ against $H_1$ in Eq. (11). If the square of the test statistic follows an $F$ distribution whose numerator degrees of freedom equal 1, then the test statistic follows Student's $t$ distribution with the denominator degrees of freedom of that $F$ distribution, as follows.
Proof. Note that the test statistic for testing $H_0$ against $H_1$ is given in Theorem 1. Furthermore, based on Lemma 1, we have its square as follows.
MethodsX 13 (2024) 102922
The rejection region for the null hypothesis
The purpose of determining the rejection region for the null hypothesis is to find out whether the null hypothesis is rejected or fails to be rejected at a significance level $\alpha$. Note that partial hypothesis testing in the nonparametric regression model with approximation of a Fourier series function, with the hypothesis form in Eq. (11), uses the LRT method. Since the LRT method rejects the null hypothesis when the likelihood ratio is less than or equal to a constant, the critical region for testing $H_0$ against $H_1$ can, based on Theorem 1, be written as follows.
If a significance level $\alpha$ is given, then based on Eq. (44), the critical region for rejecting the null hypothesis when testing $H_0$ against $H_1$ is given by Eq. (45). Based on Theorem 2, we know that the test statistic follows Student's $t$ distribution with $n - (p(T + 1) + 1)$ degrees of freedom. Furthermore, since the critical value in Eq. (45) depends on a constant, it can be obtained by integrating the probability density function of Student's $t$ distribution with these degrees of freedom at the significance level $\alpha$ as follows.
where the integrand is the probability density function of Student's $t$ distribution with $n - (p(T + 1) + 1)$ degrees of freedom. Based on Eq. (46), the critical value is the value of Student's $t$ distribution with these degrees of freedom at the level $\alpha/2$. Therefore, based on Eq. (45), the critical region for rejecting the null hypothesis when testing $H_0$ against $H_1$ can be written as Eq. (47).
Based on Eq. (47), the null hypothesis is rejected at a significance level $\alpha$ if and only if the absolute value of the test statistic is greater than or equal to the critical value of Student's $t$ distribution at $\alpha/2$, or equivalently, if the p-value is less than $\alpha$.
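The decision rule can be illustrated with a short numerical sketch. It assumes the familiar linear-model form of the partial statistic, namely the estimate divided by the square root of the error mean square times the matching diagonal entry of the inverse cross-product matrix; this form is an assumption here, since Eq. (34) is not reproduced in this text. The helper names, the single-predictor design, and the synthetic data are hypothetical. The critical value 2.11 for 17 degrees of freedom is the one quoted later in the application.

```python
import numpy as np

def fourier_design(x, T):
    """Trend, constant (scaled by 1/2) and cos(t x) columns for one predictor."""
    x = np.asarray(x, dtype=float)
    cols = [x, np.full_like(x, 0.5)] + [np.cos(t * x) for t in range(1, T + 1)]
    return np.column_stack(cols)

def partial_t_tests(X, y, t_crit):
    """t statistic per parameter, reject decisions, and degrees of freedom."""
    n, k = X.shape
    df = n - k                                  # residual degrees of freedom
    xtx_inv = np.linalg.inv(X.T @ X)
    delta_hat = xtx_inv @ X.T @ y               # OLS estimate
    resid = y - X @ delta_hat
    mse = resid @ resid / df                    # error mean square
    t_stats = delta_hat / np.sqrt(mse * np.diag(xtx_inv))
    return t_stats, np.abs(t_stats) >= t_crit, df

# Synthetic data (not the paper's): the cos(2x) parameter is truly 0.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 2.0 * np.pi, 21)
X = fourier_design(x, T=2)                      # 4 columns, so df = 21 - 4 = 17
y = X @ np.array([0.8, 140.0, -1.5, 0.0]) + rng.normal(scale=0.1, size=x.size)

t_stats, reject, df = partial_t_tests(X, y, t_crit=2.11)
print(df, np.round(t_stats, 3), reject)
```

Passing the critical value as an argument keeps the sketch free of a statistics library; in practice it would be read from a t table or computed from the t distribution.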

Real data application
Statistical inference on the nonparametric regression model with approximation of a Fourier series function has been discussed theoretically in the previous section. To apply these concepts to real data, we gathered life expectancy data from all districts and cities in East Java Province. East Java has 29 districts and 9 cities, providing a total of $n = 38$ observations for this research. The variables used in this research are divided into two categories: the response variable ($y$) and the predictor variables ($x$). The response variable is the life expectancy of the 38 districts and cities in East Java Province. The predictor variables are poverty percentage ($x_1$), open unemployment rate ($x_2$), labor force participation rate ($x_3$), average years of schooling ($x_4$), and population percentage ($x_5$). Life expectancy and the predictor variables were sourced from the website of Badan Pusat Statistik (BPS) for East Java Province in 2022.
Before modelling the life expectancy data in East Java Province, we first examined the relationship between life expectancy and each predictor variable. This preliminary investigation aims to discern the patterns exhibited by each predictor, thereby guiding the selection of an appropriate analytical approach. The relationship between life expectancy and each predictor variable is expected to exhibit both unknown patterns and repeated patterns at certain intervals. Therefore, life expectancy data can be modelled using the nonparametric regression model with approximation of a Fourier series function. We explore these relationships visually through scatter plots. The scatter plots of life expectancy against poverty percentage ($x_1$), open unemployment rate ($x_2$), labor force participation rate ($x_3$), average years of schooling ($x_4$), and population percentage ($x_5$) are presented in Fig. 1.
Based on Fig. 1, the relationship between life expectancy and each predictor variable does not exhibit a specific pattern, such as linear or nonlinear, of the kind typically assumed in parametric regression models. Therefore, the relationship pattern between life expectancy and each predictor variable is considered to be unknown, and a suitable approach for modelling it is a nonparametric regression model. One effective estimator within nonparametric regression, particularly for curve estimation of life expectancy data, is the Fourier series function. Based on Eq. (4) with $n = 38$ and $p = 5$, the general form of the nonparametric regression model with approximation of a Fourier series function for life expectancy data in East Java Province during 2022 can be written as follows.

Table 2
The GCV values when $T$ is consistent across all predictor variables.
Since the parameter estimates in the nonparametric regression model with approximation of a Fourier series function depend on the number of oscillation parameters $T$, achieving optimal estimation results involves selecting the optimal $T$. The optimal $T$ is determined using the GCV method and is identified by the smallest GCV value. In this research, the maximum value of $T$ considered is $T = 5$. Furthermore, the selection of $T$ is approached in two scenarios: one where $T$ is consistent across all predictor variables and one where $T$ varies for each predictor variable. Based on the analysis results, the GCV values obtained when $T$ is consistent across all predictor variables are presented in Table 2.
Table 2 shows the values of GCV, $R^2$, Mean Square Error (MSE), Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Mean Absolute Percent Error (MAPE) for each $T$, where $T$ is consistent across all predictor variables. According to Table 2, the smallest GCV value obtained is 1.641. Therefore, the optimal number of oscillation parameters is $T = 2$ for all predictor variables. Furthermore, because each predictor variable shows different behavioural changes with respect to the response variable, a second scenario was explored in which $T$ varies for each predictor variable. With $T = 5$ as the overall maximum, the combination of $T$ for each predictor variable is selected for maximum values of $T = 2$, $T = 3$, $T = 4$ and $T = 5$. Based on the analysis results presented in Table 3, the minimum GCV values are obtained for each maximum $T$, specifically $T = 2$, $T = 3$, $T = 4$ and $T = 5$.
Table 3 only shows the minimum GCV values as well as the $R^2$, MSE, MAE, RMSE, and MAPE values for each optimal combination of $T$. This is because of the substantial number of combinations for each maximum $T$: there are 32 combinations for $T = 2$, 243 combinations for $T = 3$, 1024 combinations for $T = 4$, and 3125 combinations for $T = 5$. For example, among the 243 combinations for a maximum of $T = 3$, the smallest GCV value was 1.345, obtained with the combination $T = 2$ for variable $x_1$, $T = 2$ for variable $x_2$, $T = 3$ for variable $x_3$, $T = 3$ for variable $x_4$, and $T = 2$ for variable $x_5$. Based on Table 3, we obtain the smallest GCV value of 0.706, which corresponds to the maximum $T = 5$ with the optimal combination of $T = 1$ for variable $x_1$, $T = 2$ for variable $x_2$, $T = 4$ for variable $x_3$, $T = 3$ for variable $x_4$, and $T = 5$ for variable $x_5$.
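The combination counts quoted above follow from assigning one value of the oscillation parameter to each of the five predictors. A minimal sketch of the enumeration is given below; the function name `count_combinations` is hypothetical.

```python
from itertools import product

def count_combinations(max_t, n_predictors=5):
    """Count assignments of an oscillation parameter in 1..max_t to each predictor."""
    return sum(1 for _ in product(range(1, max_t + 1), repeat=n_predictors))

counts = {max_t: count_combinations(max_t) for max_t in (2, 3, 4, 5)}
print(counts)  # {2: 32, 3: 243, 4: 1024, 5: 3125}
```

In a full search, the GCV value would be evaluated inside the same loop for each tuple produced by `product`, and the tuple with the smallest value would be kept.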

Table 3
The minimum GCV values for the optimal combination of $T$.
Overall, based on Tables 2 and 3, comparing the minimum GCV value when $T$ is consistent across all predictor variables with the minimum when $T$ varies for each predictor variable, the smallest GCV value is 0.706. This minimum GCV value is achieved with the combination $T = 1$ for variable $x_1$, $T = 2$ for variable $x_2$, $T = 4$ for variable $x_3$, $T = 3$ for variable $x_4$, and $T = 5$ for variable $x_5$. Therefore, we obtain the nonparametric regression model with approximation of a Fourier series function for life expectancy data in East Java Province during 2022 as follows.
Furthermore, the estimated parameters in the nonparametric regression model with approximation of a Fourier series function for life expectancy data in East Java Province during 2022 are presented in Table 4.
Note that life expectancy data in East Java Province is modelled using the nonparametric regression model with approximation of a Fourier series function as given in Eq. (48). To determine whether the parameters in Eq. (48) simultaneously influence the model, simultaneous hypothesis testing is conducted. Based on Eq. (48), the hypothesis form for simultaneous hypothesis testing in the nonparametric regression model with approximation of a Fourier series function for life expectancy data in East Java Province can be expressed as follows.
Since $\Lambda$ is the test statistic for simultaneous hypothesis testing in the nonparametric regression model with approximation of a Fourier series function, as specified in Eq. (10), it follows an $F$ distribution with degrees of freedom $p(T + 1) + 1$ and $n - (p(T + 1) + 1)$. Based on the analysis results for $p = 5$ and $n = 38$ with the optimal combination of $T$ given in Table 3, we obtain a value of $\Lambda$ of 29,767.82 with a p-value of $7.384 \times 10^{-35}$. Using a significance level $\alpha$ of 5 %, we obtain an $F(0.05, 21, 17)$ value of 2.219. Since the value of $\Lambda$ is greater than the value of $F(0.05, 21, 17)$, or alternatively, since the p-value is less than 5 %, we reject the null hypothesis. Therefore, it can be concluded that at a significance level of 5 %, there is evidence that at least one parameter in Eq. (48) is not equal to 0. In other words, the parameters in the nonparametric regression model with approximation of a Fourier series function for life expectancy data in East Java Province simultaneously have a significant influence.
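The degrees of freedom 21 and 17 can be checked with simple arithmetic. The sketch below assumes that, when the oscillation parameter differs by predictor, the expression p(T + 1) + 1 generalises to the sum of (T + 1) over the predictors plus 1; this generalisation is an assumption, checked only against the values quoted in the text. The variable names are hypothetical.

```python
# Optimal oscillation parameters reported in the text for x1, ..., x5.
t_per_predictor = [1, 2, 4, 3, 5]
n = 38

# Assumed generalisation of p(T + 1) + 1 to predictor-specific T.
df_model = sum(t + 1 for t in t_per_predictor) + 1
df_error = n - df_model

# Each predictor has T + 2 parameters: trend, constant, and T cos terms.
n_parameters = sum(t + 2 for t in t_per_predictor)

print(df_model, df_error, n_parameters)  # 21 17 25
```

The count of 25 parameters matches the number of partial hypotheses tested in the application.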
Following the simultaneous hypothesis testing, which concluded that at least one parameter is significant in the model, the next step is to determine which specific parameters are significant through partial hypothesis testing. Based on the nonparametric regression model with approximation of a Fourier series function for life expectancy data in East Java Province as described in Eq. (48), the hypothesis forms for partial hypothesis testing are provided in Table 5.
Table 5 describes the hypothesis forms for partial hypothesis testing in the nonparametric regression model with approximation of a Fourier series function for life expectancy data in East Java Province during 2022, where the general model is given in Eq. (48). According to Theorem 1, the test statistic for partial hypothesis testing in this model is defined by Eq. (34), and based on Theorem 2, it follows Student's $t$ distribution. Based on the analysis results for all 25 parameters, with $p = 5$, $n = 38$, and the optimal combination of $T$ given in Table 3, we obtain the absolute values of the test statistic and the p-values presented in Table 6. Table 6 covers the parameters in Eq. (48) and shows the estimate, the test statistic, the p-value, and the decision for each parameter. Based on Eq. (47), the null hypothesis in partial hypothesis testing is rejected at the significance level $\alpha$ if and only if the absolute value of the test statistic is greater than the critical value of Student's $t$ distribution at $\alpha/2$, or alternatively, if the p-value is less than $\alpha$. Using a significance level $\alpha$ of 5 % and 17 degrees of freedom, we obtain the critical value $t(0.025, 17)$ of 2.11. Based on Table 6, the decisions to reject or fail to reject the null hypothesis are made by comparing each absolute test statistic with the critical value of 2.11, or alternatively, by comparing each p-value with $\alpha$ of 5 %. For example, for parameter $\delta_1$, we obtain an absolute test statistic of 3.581. This value is greater than 2.11, and the p-value of 0.002 is less than 5 %. Therefore, the decision is to reject the null hypothesis, indicating that parameter $\delta_1$ has a significant influence on the model. Overall, at a significance level of 5 %, all parameters have a significant influence on the model, except for parameters $\delta_2$, $\delta_{13}$, $\delta_{24}$ and $\delta_{25}$, which do not have a statistically significant influence on the model. Although several parameters are not significant, overall, the predictor variables significantly influence the life expectancy data at a 5 % significance level.

Model interpretation
The life expectancy data were modelled using the nonparametric regression model with approximation of a Fourier series function, where theoretically the model depends on the optimal number of oscillation parameters $T$. In this research, the optimal number of oscillation parameters $T$ is selected based on the smallest GCV value. The smallest GCV value obtained was 0.706, with the optimal combinations of $T$ given in Table 3. Therefore, the nonparametric regression model with approximation of a Fourier series function for the life expectancy data is written in Eq. (48), with the parameter estimates provided in Table 4. Furthermore, based

Table 5
Partial Hypotheses Form.