Article

Generalized Partially Functional Linear Model with Unknown Link Function

1
School of Science, North China University of Technology, Beijing 100144, China
2
Department of Statistics, University of Leeds, Leeds LS2 9JT, UK
*
Author to whom correspondence should be addressed.
Axioms 2023, 12(12), 1089; https://doi.org/10.3390/axioms12121089
Submission received: 13 October 2023 / Revised: 16 November 2023 / Accepted: 22 November 2023 / Published: 28 November 2023
(This article belongs to the Special Issue Advances in Mathematics: Theory and Applications)

Abstract

In existing models with an unknown link function, the issue of predictors containing both multiple functional data and multiple scalar data has not been studied. To fill this gap, we propose a generalized partially functional linear model, which not only models the relationship between multiple scalar and functional predictors and responses, but also automatically estimates the link function. Specifically, we use the functional principal component analysis method to reduce the dimensionality of functional predictors, estimate the regression coefficients using the maximum likelihood estimation method, estimate the link function using the method of local linear regression, iteratively obtain the final estimator, and establish the asymptotic normality of the estimator. The asymptotic normality is illustrated through simulation experiments. Finally, the proposed model is applied to study the influence of environmental, economic, and medical levels on life expectancy in China. In the study, functional predictors are the daily air quality index, temperature, and humidity of 58 cities in 2020, and scalar predictors are GDP and the number of beds in hospitals. The experimental results indicate that the unknown link function model has a smaller prediction error and better performance than both the model with the known link function and the model without a link function.

1. Introduction

In 1982, Ramsay [1] first proposed the definition of functional data, laying a foundation for the development of functional data analysis. In 2005, Ramsay and Silverman provided a detailed introduction to the general methods and steps of functional data analysis, including functional principal component analysis and functional linear regression models in their book [2]. In 2012, Horváth and Kokoszka [3] focused on the inferential methods in functional data analysis.
In 2009, Shin [4] proposed a partial functional linear model (PFLM), which explores the relationship between a scalar response variable and mixed-type predictors. In 2012, Shin and Lee [5] derived the asymptotic prediction rate of PFLM and compared it with that of other functional regression models.
In 2002, James [6] proposed generalized linear models with functional predictors and applied them to standard missing data problems. In 2005, Müller and Stadtmüller [7] proposed a generalized functional linear regression model where the response variable is a scalar and the predictor is a random function. They also considered the situation where the link and variance functions were unknown. In 2015, Shang and Cheng [8] proposed a roughness regularization approach in making nonparametric inference for generalized functional linear models with known link functions. In 2019, Wong et al. [9] investigated a class of partially linear functional additive models that predict a scalar response by both the parametric effects of a multivariate predictor and the non-parametric effects of a multivariate functional predictor.
In a generalized linear model, sometimes the link function may not be known exactly, but can be assumed to be of some general ‘parametric’ form. In 1984, Scallan et al. [10] showed how generalized linear models can be extended to fit models with such link functions.
In 1994, Weisberg and Welsh [11] used kernel smoothing estimation to estimate the link function and estimated regression coefficients through the link function, then alternated between these two steps, which effectively solves the fitting problem when the link function is unknown. However, kernel smoothing estimation may have problems at the boundary, so local polynomial fitting is introduced, which performs better near the boundary.
In 1998, Chiou and Müller [12] considered the condition of the link and the variance functions to be unknown but smooth. Consistency results for the link and the variance function estimators, as well as the sampling distribution of the regression coefficients, were obtained. In 2005, Chiou and Müller [13] introduced a flexible marginal modeling approach for statistical inference for clustered and longitudinal data under minimal assumptions. The predictor was longitudinal data in the model. The estimated estimating equation approach was semi-parametric. The semi-parametric model proposed was fitted by quasi-likelihood regression. The consistency of the estimates of the link and variance functions and the asymptotic limit distribution of regression coefficients were given. In addition, there are other methods to estimate unknown functions. In 2009, Bai et al. [14] focused on single-index models for longitudinal data. They proposed a procedure to estimate the single-index component and the unknown link function based on the combination of the penalized splines and quadratic inference functions. In 2012, Pang and Xue [15] generalized the single-index models to the scenarios with random effects. The link function was estimated by using the local linear smoother. A new set of estimating equations modified for the boundary effects was proposed to estimate the index coefficients. In 2017, Yuan and Diao [16] developed a sieve maximum likelihood estimation for generalized linear models, in which the estimator of the unknown link function was assumed to lie in a sieve space. Various methods of sieves including the B-spline and P-spline-based methods were introduced.
In 2017, Kokoszka and Reimherr [17] wrote a book that introduced the basic concepts, methods, and applications of functional data analysis. The book provided a clear and systematic overview, covering key areas such as representation, smoothing, interpolation, statistical modeling, and inference for functional data. It also included detailed explanations of practical examples and computational methods. In 2023, Rao and Reimherr [18] introduced a novel neural network-based nonlinear model of functional data designed to exploit the structure of functional data and fit it with a derived function gradient optimization algorithm, demonstrating the effectiveness of these methods in dealing with complex functional models and providing new breakthroughs for deep learning applications in the field of functional data analysis.
The relationship between environmental factors and human health has been a topic of significant research interest in recent years. In 2012, Huang et al. [19] explored the relationship between temperature and years of life lost (YLL). The study found that both high and low temperatures lead to an increase in YLL, with high temperatures having a greater impact. In 2020, Yang et al. [20] applied a generalized additive model to assess the associations between daily PM2.5 exposure and YLL due to respiratory diseases in 96 Chinese cities during 2013–2016. They further estimated the avoidable YLL and potential gains in life expectancy under the assumption that daily PM2.5 level met World Health Organization standards. In 2021, Deryugina and Molitor [21] explored the factors influencing life expectancy across the United States. The study found that individuals living in areas with severe air pollution, poor water quality, and inadequate healthcare facilities generally had shorter life expectancy and poorer health conditions.
In summary, the existing models with unknown link functions have not addressed the issue of the generalized partially functional regression model, which involves regressing the response variable on multiple functional and scalar predictors. To fill this gap, this study proposes a generalized partially functional linear model with an unknown link function. The proposed model avoids the problem of decreased model accuracy caused by selecting an incorrect link function. The predictors in the proposed model include both multiple functional data and multiple scalar data. It reveals the complex relationships between variables and provides a flexible and effective modeling approach. It can achieve better prediction and explanation.
The paper is organized as follows. Section 2 introduces the published works and definitions that are used in the proofs of the theorems. Section 3.1 lists the abbreviations used in the article. Section 3.2 proposes the generalized partially functional linear model with an unknown link function, and Section 3.3 discusses the estimation of the regression coefficients and the link function. Section 4 derives the asymptotic normality of the estimators. Simulation results are reported in Section 5. The study of average life expectancy in 58 cities in China is given in Section 6. Section 7 provides a brief summary, discusses the limitations of the research, and presents possible applications and future directions.

2. Preliminaries

In this section, we provide an overview of the published works and definitions that are relevant to our research. These preliminary concepts and references lay the foundation for a better understanding of the subsequent discussion.
(1) In 1982, Mack and Silverman [22] provided a comprehensive analysis of the weak and strong uniform consistency of kernel regression estimates, highlighting their theoretical properties and practical significance in non-parametric regression modeling. We apply their Proposition 4 directly as Lemma 1, which is used in the proof of Theorem 1.
(2) In 1995, Masry and Tjøstheim [23] discussed the estimation and identification of nonlinear time series of ARCH type. They provided an estimation method to obtain consistent estimates of the parameters and proved the asymptotic normality. They also explored model identification methods. Their studies are of significant importance for modeling and analyzing financial time series. Theorem 3.3 in their work is used to prove Theorem 1 in this paper.
(3) In 1999, Chiou and Müller [24] focused on the study of non-parametric quasi-likelihood methods. They provided the theoretical derivation process of this method, and explored its applications in statistical inference. Theorem 4.1 in their paper is used to prove Lemma 2 and Lemma 3 for Theorem 2 in this paper.
(4) In 2021, Xiao et al. [25] proposed a generalized partially functional linear regression model where the response variable is 0 or 1 and the predictors were multiple functional and scalar, and the asymptotic property of the estimated coefficients in the model was established. The proof method of Theorem 1 in [25] is used to prove Theorem 2 in this work.

3. Model and Estimation

The data we observe for the $i$-th subject are $\{Y_i, X_{i1}(t), X_{i2}(t), \ldots, X_{id}(t), Z_i\}$, $i = 1, \ldots, n$. We assume that these data are independent and identically distributed (i.i.d.) copies of $\{Y, X_1(t), \ldots, X_d(t), Z\}$. For $j = 1, \ldots, d$, the functional predictor $X_j(t)$ is a random curve. The curves $X_{ij}(t)$, $i = 1, 2, \ldots, n$, are samples of $X_j(t)$ and are square integrable on a bounded real interval $T$, i.e., $X_{ij}(t) \in L^2(T)$, where $L^2(T)$ is the space of square integrable functions defined on $T$. The scalar predictor vector $Z = (Z_1, Z_2, \ldots, Z_q)^T$ is a $q$-dimensional random vector. The response $Y$ is a real-valued random variable that may be binary or count.

3.1. Abbreviation Introduction

Table 1 is a list of the abbreviations we use in this work along with their corresponding full forms:

3.2. Model

We establish a model for the relationship between the response variable $Y_i$ and the predictors $X_{ij}(t)$, $j = 1, 2, \ldots, d$, and $Z_i$:
$$Y_i = g\left(\sum_{j=1}^{d}\int_T X_{ij}(t)\beta_j(t)\,dt + Z_i^T\gamma\right) + \varepsilon_i, \tag{1}$$
where $\beta_j(\cdot)$ is the regression coefficient function to be estimated for the functional predictor $X_{ij}(t)$, and $\gamma = (\gamma_1, \gamma_2, \ldots, \gamma_q)^T$ is the $q$-dimensional vector of regression coefficients for the scalar predictors $Z_i$, which also needs to be estimated. Here the $\varepsilon_i$ are i.i.d. copies of the random error $\varepsilon = Y - g(\eta)$, with $E[\varepsilon \mid X_j(t), Z] = 0$, where
$$\eta = \sum_{j=1}^{d}\int_T X_j(t)\beta_j(t)\,dt + Z^T\gamma.$$
The relationship between the response variable $Y$ and $\eta$ is established through $g(\cdot)$, i.e., $E[Y \mid X_j(t), Z] = \mu = g(\eta)$. Here $g(\cdot)$ is the link function, which is unknown and needs to be estimated in this paper.
Let $\sigma^2(\cdot)$ be a variance function that satisfies $\sigma^2(\cdot) \ge c > 0$ for a constant $c > 0$, such that
$$\mathrm{Var}[Y \mid X_j(t), Z] = \sigma^2(\mu) = \sigma^2(g(\eta)),$$
$$\mathrm{Var}[\varepsilon] = E[\varepsilon^2] = \sigma^2(E[Y \mid X_j(t), Z]).$$
To reduce the dimensionality of the functional predictors $X_{ij}(t)$, we adopt FPCA in this paper. First, we standardize the original data by centering them, so that $E[X_{ij}(t)] = 0$, $j = 1, \ldots, d$, and $E[Z_l] = 0$, $l = 1, \ldots, q$.
By the Karhunen–Loève (KL) expansion and Mercer's theorem, $X_{ij}(t)$ can be expanded as
$$X_{ij}(t) = \sum_{k=1}^{\infty}\xi_{ijk}\,\rho_{jk}(t), \tag{2}$$
where $\xi_{ijk}$ are the functional principal component scores, and $\rho_{jk}(\cdot)$ are the functional principal components, i.e., the eigenfunctions of the covariance operator of $X_{ij}(t)$. Notice that $\rho_{jk}(\cdot)$, $k = 1, 2, \ldots$, form an orthonormal basis of the function space $L^2(T)$. Then the regression coefficient function $\beta_j(t) \in L^2(T)$ can be expanded as
$$\beta_j(t) = \sum_{k=1}^{\infty}\chi_{jk}\,\rho_{jk}(t), \tag{3}$$
where $\chi_{jk}$ is the coefficient of $\beta_j$ with respect to the basis function $\rho_{jk}$.
After plugging the above two expansions into (1) and using the orthonormality of the $\rho_{jk}$, we have
$$Y_i = g\left(\sum_{j=1}^{d}\sum_{k=1}^{m_j}\xi_{ijk}\chi_{jk} + Z_i^T\gamma\right) + \varepsilon_i. \tag{4}$$
In (4), we truncate the expansions at $m_j$ (depending on the sample size $n$), and $m_j$ increases asymptotically with $n$.
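The FPCA step and the truncated linear predictor in (4) can be sketched numerically as follows. This is only an illustration under stated assumptions: the curves are observed on a common, equally spaced grid; the covariance operator is approximated by a Riemann sum; and the truncation level is chosen by a cumulative-variance threshold. The function names `fpca_scores` and `linear_predictor` are hypothetical and do not come from the paper.

```python
import numpy as np

def fpca_scores(X, t, var_explained=0.85):
    """Estimate FPCA scores for curves X (n x p) observed on an equally spaced grid t.

    Returns (scores, eigfuns, m): scores is (n, m), eigfuns is (p, m),
    and m is the number of retained components.
    """
    dt = t[1] - t[0]                           # grid spacing (equally spaced grid assumed)
    Xc = X - X.mean(axis=0)                    # centre the curves so that E[X(t)] = 0
    cov = Xc.T @ Xc / X.shape[0]               # sample covariance function on the grid
    evals, evecs = np.linalg.eigh(cov * dt)    # discretised covariance operator
    order = np.argsort(evals)[::-1]            # largest eigenvalue first
    evals = np.clip(evals[order], 0.0, None)
    evecs = evecs[:, order]
    ratio = np.cumsum(evals) / evals.sum()     # cumulative proportion of variance
    m = min(int(np.searchsorted(ratio, var_explained)) + 1, len(evals))
    eigfuns = evecs[:, :m] / np.sqrt(dt)       # rescale so that sum(rho^2) * dt = 1
    scores = Xc @ eigfuns * dt                 # xi_k = integral of X(t) rho_k(t) dt
    return scores, eigfuns, m

def linear_predictor(scores_list, chi_list, Z, gamma):
    """eta_i = sum_j sum_k xi_ijk * chi_jk + Z_i^T gamma, as in the truncated model (4)."""
    eta = Z @ gamma
    for scores, chi in zip(scores_list, chi_list):
        eta = eta + scores @ chi
    return eta
```

With several functional predictors, `fpca_scores` would be applied to each predictor separately, giving one score matrix and one truncation level $m_j$ per predictor.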

3.3. Estimation

Define a parameter vector $\theta_0$, where
$$\theta_0 = (\chi_{11}, \chi_{12}, \ldots, \chi_{1m_1}, \ldots, \chi_{d1}, \chi_{d2}, \ldots, \chi_{dm_d}, \gamma_1, \ldots, \gamma_q)^T.$$
For the estimation of the parameter vector $\theta$ and the link function $g$, we use an iterative estimation method to obtain the final estimates. For a constant $c > 0$, define $\theta_n = \{\theta : \|\theta - \theta_0\| \le c\,n^{-1/2}\}$. The norm on finite-dimensional spaces used in this paper is the Euclidean norm. The overall iterative process is briefly described below:
Step 1 Assume that the link function $g(\cdot)$ is known, and obtain the estimate $\theta^{(0)}$ of $\theta_0$ by solving Equation (5). The link function $g(\cdot)$ is required to be twice continuously differentiable to ensure the existence of the Hessian matrix; moreover, the variance function $\sigma^2(\cdot)$ is defined on the range of the link function and is strictly positive.
$$U(\theta) = \sum_{i=1}^{n}\frac{(Y_i - \mu_i)\,g'(\eta_i)}{\sigma^2(\mu_i)}\,\Delta_i = 0, \tag{5}$$
where $\eta_i = \sum_{j=1}^{d}\sum_{k=1}^{m_j}\xi_{ijk}\tilde\chi_{jk} + Z_i^T\tilde\gamma$, $\tilde\chi$ is near $\chi$, $\tilde\gamma$ is near $\gamma$, $\mu_i = g(\eta_i)$, and
$$\Delta_i = (\xi_{i11}, \xi_{i12}, \ldots, \xi_{i1m_1}, \ldots, \xi_{id1}, \xi_{id2}, \ldots, \xi_{idm_d}, z_{i1}, \ldots, z_{iq})^T.$$
Here, $\tilde\chi$ and $\tilde\gamma$ denote the estimated values obtained in Step 1, not the final estimates.
We introduce the following matrices:
$$D_0 = D_{n,q} = (z_{il})_{1\le i\le n,\ 1\le l\le q},$$
$$D_j = D_{n,m_j} = (\xi_{ijk})_{1\le i\le n,\ 1\le k\le m_j}, \quad 1\le j\le d,$$
$$D = D_{n,\,q+\sum_{j=1}^{d}m_j} = (D_0, D_1, \ldots, D_d),$$
$$V = \mathrm{diag}\left(\sigma^2(\mu_1), \sigma^2(\mu_2), \ldots, \sigma^2(\mu_n)\right),$$
$$G = \mathrm{diag}\left(g'(\eta_i)\right)_{1\le i\le n},$$
$$Y = (Y_1, Y_2, \ldots, Y_n)^T,$$
$$\mu = (\mu_1, \mu_2, \ldots, \mu_n)^T.$$
Then, Equation (5) can be expressed in matrix form, i.e.,
$$D^T V^{-1} G\,(Y - \mu) = 0.$$
We can solve it by the weighted least squares method. A first-order Taylor expansion of $g^{-1}(Y)$ around $\mu$ gives
$$g^{-1}(Y) \approx g^{-1}(\mu) + (g^{-1})'(\mu)\,(Y - \mu) = \eta + G^{-1}(Y - \mu),$$
so that $Y - \mu \approx G\,(g^{-1}(Y) - \eta)$, and then we get
$$D^T W\left(g^{-1}(Y) - \eta\right) = 0,$$
where $W = V^{-1}G^2$. Simplification yields the estimates
$$\tilde\chi_j^{(0)} = (D_j^T W D_j)^{-1} D_j^T W\,g^{-1}(Y),$$
$$\tilde\gamma^{(0)} = (D_0^T W D_0)^{-1} D_0^T W\,g^{-1}(Y),$$
where $\tilde\chi_j^{(0)} = (\tilde\chi_{j1}^{(0)}, \ldots, \tilde\chi_{jm_j}^{(0)})^T$, $j = 1, 2, \ldots, d$, and $\tilde\gamma^{(0)} = (\tilde\gamma_1^{(0)}, \tilde\gamma_2^{(0)}, \ldots, \tilde\gamma_q^{(0)})^T$.
Let
$$\tilde\theta^{(0)} = (\tilde\chi_{11}^{(0)}, \tilde\chi_{12}^{(0)}, \ldots, \tilde\chi_{1m_1}^{(0)}, \ldots, \tilde\chi_{d1}^{(0)}, \tilde\chi_{d2}^{(0)}, \ldots, \tilde\chi_{dm_d}^{(0)}, \tilde\gamma_1^{(0)}, \tilde\gamma_2^{(0)}, \ldots, \tilde\gamma_q^{(0)})^T.$$
Step 2 By local linear regression, the estimates $g^{(0)}$ and $g'^{(0)}$ of the link function $g$ and its derivative $g'$ are obtained.
Let the bandwidth $b = b_n$ of the kernel function $k(\cdot)$ converge to zero and define $k_b(\cdot) = b^{-1}k(\cdot/b)$. Since the convergence rates of the estimators of $g(\cdot)$ and $g'(\cdot)$ are different, their bandwidth choices should also be different. Let $h_0 = h_{0n}$ denote the bandwidth for $g(\cdot)$ and $h_1 = h_{1n}$ the bandwidth for $g'(\cdot)$; in this paper, for simplicity, the common bandwidth $h = h_0 = h_1$ is chosen. Let the distributions of both the functional predictors $X_j(t)$ and the scalar predictors $Z$ have a compact support set $U$, and let $\Omega = \{u = \eta_i : X_j(t), Z \in U\}$. To simplify the expressions, we write $g = g(u; \theta)$ and $g' = g'(u; \theta)$. For a fixed $\theta$, we apply local linear regression to obtain the initial estimates $\tilde g^{(0)}$ and $\tilde g'^{(0)}$ of $g$ and $g'$, respectively. At any point $u$, we minimize the weighted sum of squares
$$\sum_{i=1}^{n}\left[Y_i - g - g'(\eta_i - u)\right]^2 k_h(\eta_i - u). \tag{6}$$
By minimizing (6), we obtain $\tilde g^{(0)}$ and $\tilde g'^{(0)}$, which can be represented as $\tilde g^{(0)} = \sum_{i=1}^{n}\omega_i(u; \theta)Y_i$ and $\tilde g'^{(0)} = \sum_{i=1}^{n}\tilde\omega_i(u; \theta)Y_i$, where
$$\omega_i(u; \theta) = \frac{k_h(\eta_i - u)\left[\varphi_{n,2}(u; \theta, h) - (\eta_i - u)\,\varphi_{n,1}(u; \theta, h)\right]}{\varphi_{n,0}(u; \theta, h)\,\varphi_{n,2}(u; \theta, h) - \varphi_{n,1}^2(u; \theta, h)},$$
$$\tilde\omega_i(u; \theta) = \frac{k_h(\eta_i - u)\left[(\eta_i - u)\,\varphi_{n,0}(u; \theta, h) - \varphi_{n,1}(u; \theta, h)\right]}{\varphi_{n,0}(u; \theta, h)\,\varphi_{n,2}(u; \theta, h) - \varphi_{n,1}^2(u; \theta, h)},$$
$$\varphi_{n,l}(u; \theta, h) = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{\eta_i - u}{h}\right)^{l} k_h(\eta_i - u), \quad l = 0, 1, 2.$$
Step 3 Using the method of Step 1, the link function and its derivative are replaced by the estimates $\tilde g^{(\alpha)}$ and $\tilde g'^{(\alpha)}$, where $\alpha = 0, 1, 2, \ldots$. To update $\tilde\theta^{(\alpha)}$, solve the estimating Equation (5) for $\theta$. From this we obtain the estimate
$$\tilde\theta^{(\alpha)} = (\tilde\chi_{11}^{(\alpha)}, \tilde\chi_{12}^{(\alpha)}, \ldots, \tilde\chi_{1m_1}^{(\alpha)}, \ldots, \tilde\chi_{d1}^{(\alpha)}, \tilde\chi_{d2}^{(\alpha)}, \ldots, \tilde\chi_{dm_d}^{(\alpha)}, \tilde\gamma_1^{(\alpha)}, \tilde\gamma_2^{(\alpha)}, \ldots, \tilde\gamma_q^{(\alpha)})^T.$$
Step 4 Using the method of Step 2, the parameter vector is replaced by the estimate $\tilde\theta^{(\alpha)} = (\tilde\chi_{j1}^{(\alpha)}, \tilde\chi_{j2}^{(\alpha)}, \ldots, \tilde\chi_{jm_j}^{(\alpha)}, \tilde\gamma_1^{(\alpha)}, \tilde\gamma_2^{(\alpha)}, \ldots, \tilde\gamma_q^{(\alpha)})^T$, where $\alpha = 1, 2, 3, \ldots$. From this we obtain the estimates $\tilde g^{(\alpha)}$ and $\tilde g'^{(\alpha)}$ of $g$ and $g'$, where $\alpha = 1, 2, 3, \ldots$.
Step 5 Repeat Steps 3 and 4 until $\|\tilde\theta^{(\alpha+1)} - \tilde\theta^{(\alpha)}\|$ converges, and then stop the iteration.
Step 6 The final estimate of the regression coefficient vector $\theta$ is denoted by $\hat\theta$, and the final estimate of the link function $g$ is denoted by $\hat g$.
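The alternating scheme in Steps 1 to 6 can be sketched as below. This is a minimal illustration, not the authors' implementation, and it rests on several assumptions that are not in the text: a Gaussian kernel, a constant variance function $\sigma^2 \equiv 1$, an initial link $g(\eta) = \eta$ (so Step 1 reduces to ordinary least squares), a single Fisher scoring update per iteration in Step 3, and a unit-norm normalisation of $\theta$ to fix its scale. The local linear estimator is written in its standard unnormalised form, i.e., the solution of the normal equations of (6). The names `local_linear` and `fit_unknown_link` are hypothetical.

```python
import numpy as np

def gauss_kernel(v):
    """Standard normal density, used here as the kernel k (an assumption)."""
    return np.exp(-0.5 * v ** 2) / np.sqrt(2.0 * np.pi)

def local_linear(u, eta, y, h):
    """Local linear estimates of g(u) and g'(u) obtained by minimising (6)."""
    u = np.atleast_1d(u)
    d = eta[None, :] - u[:, None]              # matrix of eta_i - u, one row per point u
    w = gauss_kernel(d / h) / h                # kernel weights k_h(eta_i - u)
    s0, s1, s2 = w.sum(1), (w * d).sum(1), (w * d ** 2).sum(1)
    t0, t1 = (w * y).sum(1), (w * d * y).sum(1)
    det = s0 * s2 - s1 ** 2
    g = (s2 * t0 - s1 * t1) / det              # intercept of the local line
    dg = (s0 * t1 - s1 * t0) / det             # slope of the local line
    return g, dg

def fit_unknown_link(D, y, h=0.4, max_iter=20, tol=1e-4):
    """Alternate between updating theta and re-estimating (g, g')."""
    theta = np.linalg.lstsq(D, y, rcond=None)[0]   # Step 1 with g(eta) = eta
    theta = theta / np.linalg.norm(theta)          # scale normalisation (assumption)
    for _ in range(max_iter):
        eta = D @ theta
        mu, dmu = local_linear(eta, eta, y, h)     # Steps 2 and 4: g and g' at each eta_i
        GD = D * dmu[:, None]                      # rows g'(eta_i) * Delta_i
        # Step 3: one Fisher scoring step for (5) with sigma^2 = 1;
        # the tiny ridge term only guards against a singular matrix.
        step = np.linalg.solve(GD.T @ GD + 1e-8 * np.eye(D.shape[1]), GD.T @ (y - mu))
        new_theta = theta + step
        new_theta = new_theta / np.linalg.norm(new_theta)
        converged = np.linalg.norm(new_theta - theta) < tol   # Step 5
        theta = new_theta
        if converged:
            break
    g_hat = lambda u: local_linear(u, D @ theta, y, h)[0]     # Step 6
    return theta, g_hat
```

Here `D` plays the role of the design matrix whose rows are the vectors $\Delta_i$. Because the link is unrestricted, the scale of $\theta$ is absorbed by $g$, which is why the sketch normalises $\theta$; a different identifiability convention would work equally well.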

4. Asymptotic Properties

To derive the asymptotics of the estimates of the link function g ^ and the regression coefficients θ ^ , some additional assumptions are required:
(C1) There exists $b = \max(4, c)$ for a constant $c > 0$, such that $E\left[\int_T X_j(t)^b\,dt\right] < \infty$, $j = 1, \ldots, d$, $E[\|Z\|^b] < \infty$, and $E[\varepsilon] < \infty$.
(C2) The density function $f(\cdot)$ of $\eta_i$ is strictly positive, and $f(\cdot)$ satisfies the first-order Lipschitz condition for $\theta$ near $\theta_0$.
(C3) The kernel function $k(\cdot)$ satisfies the first-order Lipschitz condition, is a bounded and continuous symmetric probability density function, and satisfies $\int u^2 k(u)\,du \ne 0$ and $\int u^2 k(u)\,du < \infty$.
(C4) $nh^4/\log^2 n \to \infty$ and $nh^5 = O(1)$. Here, $h$ is the bandwidth of the kernel function.
(C5) For $j = 1, \ldots, d$, $m_j n^{-1/4} \to 0$ as $n \to \infty$.
Remark 1.
(C1) is a necessary condition for the asymptotic normality of the estimator. (C2) ensures that $\tilde g^{(\alpha)}$ and $\tilde g'^{(\alpha)}$ are far from 0 when $\tilde\theta^{(\alpha)}$ is close enough to $\theta$. (C3) contains the usual assumptions about the kernel function. (C4) contains the usual assumptions about the bandwidth. (C5) controls the growth of $m_j$ in order to make the convergence faster.
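As a quick check of the bandwidth condition, consider the common choice $h = c_0 n^{-1/5}$ with a constant $c_0 > 0$ (this particular choice is an illustration, not one prescribed in the text). Assuming the two requirements of (C4) are $nh^4/\log^2 n \to \infty$ and $nh^5 = O(1)$, both hold:

$$nh^5 = c_0^5 = O(1), \qquad \frac{nh^4}{\log^2 n} = \frac{c_0^4\,n^{1/5}}{\log^2 n} \to \infty \quad (n \to \infty),$$

since any positive power of $n$ grows faster than any power of $\log n$.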

4.1. Asymptotic Convergence of $g^{(\alpha)}$

Lemma 1.
Let $(\zeta_1, \iota_1), \ldots, (\zeta_n, \iota_n)$ be independent and identically distributed random vectors. Furthermore, assume that for any $s > 0$, $E|\iota_i|^s < \infty$, $i = 1, \ldots, n$, and $\sup_{\zeta}\int|\iota|^s f(\zeta, \iota)\,d\iota < \infty$, where $f(\cdot, \cdot)$ is the joint density function of $(\zeta, \iota)$. Let $k(\cdot)$ be a bounded and strictly positive kernel function that satisfies the Lipschitz condition. Then we have
$$\sup_{\zeta}\left|\frac{1}{n}\sum_{i=1}^{n}\left\{k_h(\zeta_i - \zeta)\,\iota_i - E[k_h(\zeta_i - \zeta)\,\iota_i]\right\}\right| = O_p\left(\left\{\frac{\log(1/h)}{nh}\right\}^{1/2}\right).$$
Proof.
See Proposition 4 in Mack and Silverman (1982) [22]. □
Theorem 1.
If (C1)–(C5) hold, then for $\sigma^2 > 0$ we have
$$\sqrt{nh}\left[\tilde g^{(\alpha)}(u; \theta) - g(u) - I(u)\right] \xrightarrow{D} N(0, \vartheta^2(u)),$$
where $I(u) = \frac{1}{2}h^2\mu_2\,g''(u)$, $\vartheta^2(u) = \nu_2\sigma^2 \big/ \sum_{j=1}^{d}f_j(u)$, and, for the kernel function,
$$\mu_l = \int u^l k(u)\,du, \quad \nu_l = \int k^l(u)\,du, \quad l = 1, 2.$$
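Proof.
$$\left(\tilde g^{(\alpha)}(u; \theta_0),\ h\,\tilde g'^{(\alpha)}(u; \theta_0)\right)^T = \Gamma_n^{-1}(u; \theta_0)\,\Phi_n(u; \theta_0), \tag{7}$$
where
$$\Gamma_n(u; \theta_0) = \begin{pmatrix}\varphi_{n,0}(u; \theta_0) & \varphi_{n,1}(u; \theta_0)\\ \varphi_{n,1}(u; \theta_0) & \varphi_{n,2}(u; \theta_0)\end{pmatrix},$$
$$\Phi_n(u; \theta_0) = \left(\phi_{n,0}(u; \theta_0),\ \phi_{n,1}(u; \theta_0)\right)^T,$$
$$\varphi_{n,l}(u; \theta_0) = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{\eta_i^0 - u}{h}\right)^{l}k_h(\eta_i^0 - u), \quad l = 0, 1, 2,$$
$$\phi_{n,l}(u; \theta_0) = \frac{1}{n}\sum_{i=1}^{n}Y_i\left(\frac{\eta_i^0 - u}{h}\right)^{l}k_h(\eta_i^0 - u), \quad l = 0, 1,$$
$$\eta_i^0 = \sum_{j=1}^{d}\sum_{k=1}^{m_j}\xi_{ijk}\chi_{jk} + Z_i^T\gamma.$$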
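By expanding $\varphi_{n,l}(u; \theta)$, $l = 0, 1, 2, 3$, we obtain that
$$E\left[\varphi_{n,l}(u; \theta)\right] = E\left[\frac{1}{n}\sum_{i=1}^{n}\left(\frac{\eta_i - u}{h}\right)^{l}k_h(\eta_i - u)\right] = \sum_{j=1}^{d}f_j(u)\,\mu_l + O(h). \tag{8}$$
From Lemma 1, it can be proved that for $l = 0, 1, 2, 3$,
$$\varphi_{n,l}(u; \theta) - E\left[\varphi_{n,l}(u; \theta)\right] = O_p\left(\left\{\frac{\log(1/h)}{nh}\right\}^{1/2}\right). \tag{9}$$
Substituting (8) into (9), we obtain that
$$\varphi_{n,l}(u; \theta) = \sum_{j=1}^{d}f_j(u)\,\mu_l + O_p\left(\left\{\frac{\log(1/h)}{nh}\right\}^{1/2} + h\right).$$
Then
$$\Gamma_n(u; \theta) = \Gamma(u) + O_p\left(\left\{\frac{\log(1/h)}{nh}\right\}^{1/2} + h\right),$$
where $\Gamma(u) = \sum_{j=1}^{d}f_j(u) \otimes \mathrm{diag}(1, \mu_2)$, and $\otimes$ indicates the Kronecker product.
Inverting the matrix $\Gamma_n(u; \theta)$, we get
$$\Gamma_n^{-1}(u; \theta) = \Gamma^{-1}(u) + O_p\left(\left\{\frac{\log(1/h)}{nh}\right\}^{1/2} + h\right).$$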
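Let
$$\phi_{n,l}^{*}(u; \theta) = \frac{1}{n}\sum_{i=1}^{n}\left[Y_i - g(\eta_i)\right]\left(\frac{\eta_i - u}{h}\right)^{l}k_h(\eta_i - u), \tag{10}$$
where $l = 0, 1$, and
$$\Phi_n^{*} \equiv \Phi_n^{*}(u; \theta) = \left(\phi_{n,0}^{*}(u; \theta),\ \phi_{n,1}^{*}(u; \theta)\right)^T.$$
By expanding $\phi_{n,l}^{*}(u; \theta)$, we obtain, when $l = 0, 1, 2$,
$$E\left[\phi_{n,l}^{*}(u; \theta)\right] = E\left[\frac{1}{n}\sum_{i=1}^{n}\left[Y_i - g(\eta_i)\right]\left(\frac{\eta_i - u}{h}\right)^{l}k_h(\eta_i - u)\right] = O\left(n^{-1/2}\right). \tag{11}$$
From Lemma 1, combined with (10) and (11), we can prove that
$$\phi_{n,l}^{*}(u; \theta) = O_p\left(\left\{\frac{\log(1/h)}{nh}\right\}^{1/2} + n^{-1/2}\right).$$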
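A Taylor expansion of $g(\eta_i)$ at $u$ gives
$$\Phi_n - \Phi_n^{*} = \Gamma_n\begin{pmatrix}g(u)\\ h\,g'(u)\end{pmatrix} + \frac{1}{2}h^2\begin{pmatrix}\varphi_{n,2}\,g''(u)\\ \varphi_{n,3}\,g''(u)\end{pmatrix} + o_p(h^2), \tag{12}$$
where $\Gamma_n = \Gamma_n(u; \theta)$ and $\varphi_{n,l} = \varphi_{n,l}(u; \theta)$, $l = 2, 3$.
Combining (7) and (12), we can obtain
$$\begin{pmatrix}\tilde g^{(\alpha)} - g\\ h\,[\tilde g'^{(\alpha)} - g']\end{pmatrix} = \Gamma_n^{-1}\Phi_n^{*} + \frac{1}{2}\Gamma_n^{-1}h^2\begin{pmatrix}\varphi_{n,2}\,g''(u)\\ \varphi_{n,3}\,g''(u)\end{pmatrix} + o_p(h^2) = \Gamma^{-1}(u)\,\Phi_n^{*} + \frac{1}{2}h^2\begin{pmatrix}\mu_2\,g''(u)\\ \frac{\mu_3}{\mu_2}\,g''(u)\end{pmatrix} + o_p\left(h^2 + n^{-1/2}\right), \tag{13}$$
where the first component is
$$\tilde g^{(\alpha)} - g = \left[\sum_{j=1}^{d}f_j(u)\right]^{-1}\phi_{n,0}^{*}(u; \theta) + \frac{1}{2}h^2\mu_2\,g''(u) + o_p\left(h^2 + n^{-1/2}\right).$$
Since $\|\theta - \theta_0\| = O(n^{-1/2})$, (10) can be transformed into
$$\phi_{n,0}^{*}(u; \theta) = \frac{1}{n}\sum_{i=1}^{n}\left[Y_i - g(\eta_i^0)\right]k_h(\eta_i^0 - u) + O_p\left(n^{-1/2}\right).$$
Substituting this into (13) and combining it with Theorem 3.3 in Masry and Tjøstheim (1995) [23], Theorem 1 is proved. □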
Corollary 1.
If we further strengthen the condition in assumption (C4) so that $nh^5 \to 0$, then it follows that
$$\sqrt{nh}\left[\tilde g^{(\alpha)}(u; \theta) - g(u)\right] \xrightarrow{D} N(0, \vartheta^2(u)).$$
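The reason the bias term disappears under $nh^5 \to 0$ can be seen by scaling the bias $I(u) = \frac{1}{2}h^2\mu_2\,g''(u)$ from Theorem 1 by the normalising rate $\sqrt{nh}$:

$$\sqrt{nh}\;I(u) = \frac{1}{2}\,\mu_2\,g''(u)\,\sqrt{nh}\,h^2 = \frac{1}{2}\,\mu_2\,g''(u)\,\sqrt{nh^5} \to 0 \quad \text{as } nh^5 \to 0,$$

provided $g''(u)$ is finite. Under $nh^5 = O(1)$ alone, the same quantity stays bounded but need not vanish, which is why the centring by $I(u)$ is needed in Theorem 1.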

4.2. Asymptotic Convergence of $\hat\theta$

First, we give some more specific explanations of the iterative estimation process described in Section 3.3, in preparation for Theorem 2.
(1) Solving $U(\theta) = 0$ in Equation (5) under the assumption that the link function is known. Let $Q_{\theta_0} = -\frac{\partial U(\theta_0)}{\partial\mu}\frac{\partial\mu}{\partial\theta_0^T}$; then it follows that
$$Q_{\theta_0} = \sum_{i=1}^{n}\frac{g'(\eta_i^0)^2}{\sigma^2(\mu_i^0)}\Delta_i\Delta_i^T + \sum_{i=1}^{n}(Y_i - \mu_i^0)\left\{\frac{g'(\eta_i^0)\,[\sigma^2(\mu_i^0)]'}{[\sigma^2(\mu_i^0)]^2} - \frac{g''(\eta_i^0)}{\sigma^2(\mu_i^0)}\right\}\Delta_i\Delta_i^T.$$
Let $\bar\theta \in \theta_n$, where
$$\bar\theta = (\bar\chi_{j1}, \bar\chi_{j2}, \ldots, \bar\chi_{jp_j}, \bar\gamma_1, \ldots, \bar\gamma_q)^T, \quad j = 1, 2, \ldots, d,$$
and satisfies $\bar\eta_i = \sum_{j=1}^{d}\sum_{k=1}^{m_j}\xi_{ijk}\bar\chi_{jk} + Z_i^T\bar\gamma$, $\bar\mu_i = g(\bar\eta_i)$. Similarly, we can obtain
$$Q_{\bar\theta} = \sum_{i=1}^{n}\frac{g'(\bar\eta_i)^2}{\sigma^2(\bar\mu_i)}\Delta_i\Delta_i^T + \sum_{i=1}^{n}(Y_i - \bar\mu_i)\left\{\frac{g'(\bar\eta_i)\,[\sigma^2(\bar\mu_i)]'}{[\sigma^2(\bar\mu_i)]^2} - \frac{g''(\bar\eta_i)}{\sigma^2(\bar\mu_i)}\right\}\Delta_i\Delta_i^T.$$
(2) Solving $U^{*}(\theta) = 0$ when the link function is unknown, where
$$U^{*}(\theta_0) = \sum_{i=1}^{n}\frac{(Y_i - \tilde\mu_i^0)\,\tilde g'^{(\alpha)}(\eta_i^0)}{\sigma^2(\tilde\mu_i^0)}\Delta_i = 0,$$
and $\tilde\mu_i^0 = \tilde g^{(\alpha)}(\eta_i^0)$. Similarly, we can obtain
$$Q_{\bar\theta}^{*} = \sum_{i=1}^{n}\frac{\tilde g'^{(\alpha)}(\bar\eta_i)^2}{\sigma^2(\bar\mu_i^{*})}\Delta_i\Delta_i^T + \sum_{i=1}^{n}(y_i - \bar\mu_i^{*})\left\{\frac{\tilde g'^{(\alpha)}(\bar\eta_i)\,[\sigma^2(\bar\mu_i^{*})]'}{[\sigma^2(\bar\mu_i^{*})]^2} - \frac{\tilde g''^{(\alpha)}(\bar\eta_i)}{\sigma^2(\bar\mu_i^{*})}\right\}\Delta_i\Delta_i^T,$$
where $\bar\mu_i^{*} = \tilde g^{(\alpha)}(\bar\eta_i)$.
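Lemma 2.
If the assumptions (C1)–(C5) hold, we have
$$\sup_{\bar\theta\in\theta_n}\frac{1}{n}\left\|Q_{\bar\theta}^{*} - Q_{\bar\theta}\right\| = o_p(1).$$
Proof.
Let $M_i = \left|\frac{1}{\sigma^2(\bar\mu_i^{*})} - \frac{1}{\sigma^2(\bar\mu_i)}\right|$ and $N_i = \left|\frac{[\sigma^2(\bar\mu_i^{*})]'}{[\sigma^2(\bar\mu_i^{*})]^2} - \frac{[\sigma^2(\bar\mu_i)]'}{[\sigma^2(\bar\mu_i)]^2}\right|$. By Theorem 4.1 of Chiou and Müller [24], we know that $\max_{1\le i\le n}M_i = o_p(1)$ and $\max_{1\le i\le n}N_i = o_p(1)$. Then
$$\frac{1}{n}\left\|Q_{\bar\theta}^{*} - Q_{\bar\theta}\right\| = A + B + C, \tag{14}$$
where $A$, $B$, and $C$ can be expressed as
$$A = \frac{1}{n}\left\|\sum_{i=1}^{n}\left[\frac{\tilde g'^{(\alpha)}(\bar\eta_i)^2}{\sigma^2(\bar\mu_i^{*})} - \frac{g'(\bar\eta_i)^2}{\sigma^2(\bar\mu_i)}\right]\Delta_i\Delta_i^T\right\| \le \frac{1}{n}\max_{1\le i\le n}M_i,$$
$$B = \frac{1}{n}\left\|\sum_{i=1}^{n}(y_i - \bar\mu_i)\left[\frac{\tilde g'^{(\alpha)}(\bar\eta_i)\,[\sigma^2(\bar\mu_i^{*})]'}{[\sigma^2(\bar\mu_i^{*})]^2} - \frac{g'(\bar\eta_i)\,[\sigma^2(\bar\mu_i)]'}{[\sigma^2(\bar\mu_i)]^2} - \frac{\tilde g''^{(\alpha)}(\bar\eta_i)}{\sigma^2(\bar\mu_i^{*})} + \frac{g''(\bar\eta_i)}{\sigma^2(\bar\mu_i)}\right]\Delta_i\Delta_i^T\right\| \le \frac{1}{n}\sum_{i=1}^{n}|y_i - \bar\mu_i|\left(\max_{1\le i\le n}N_i + \max_{1\le i\le n}M_i\right),$$
$$C = \frac{1}{n}\left\|\sum_{i=1}^{n}(\bar\mu_i^{*} - \bar\mu_i)\left[\frac{\tilde g'^{(\alpha)}(\bar\eta_i)\,[\sigma^2(\bar\mu_i^{*})]'}{[\sigma^2(\bar\mu_i^{*})]^2} - \frac{\tilde g''^{(\alpha)}(\bar\eta_i)}{\sigma^2(\bar\mu_i^{*})}\right]\Delta_i\Delta_i^T\right\| = o_p(1).$$
Then, by (14), we get
$$\frac{1}{n}\left\|Q_{\bar\theta}^{*} - Q_{\bar\theta}\right\| = o_p(1).$$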
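Lemma 3.
If the assumptions (C1)–(C5) hold, we have
$$\frac{1}{\sqrt{n}}\left\|U^{*}(\theta_0) - U(\theta_0)\right\| = o_p(1).$$
Proof.
Combining Theorem 1 and $\max_{1\le i\le n}M_i = o_p(1)$ from Lemma 2, and writing $\mu_i^0 = g(\eta_i^0)$, we can prove that
$$\frac{1}{\sqrt{n}}\left(U^{*}(\theta_0) - U(\theta_0)\right) = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left\{(Y_i - \mu_i^0)\,g'(\eta_i^0)\left[\frac{1}{\sigma^2(\tilde\mu_i^0)} - \frac{1}{\sigma^2(\mu_i^0)}\right] + \frac{Y_i - \mu_i^0}{\sigma^2(\tilde\mu_i^0)}\left(\tilde g'^{(\alpha)}(\eta_i^0) - g'(\eta_i^0)\right) - \frac{\tilde g'^{(\alpha)}(\eta_i^0)}{\sigma^2(\tilde\mu_i^0)}\left(\tilde\mu_i^0 - \mu_i^0\right)\right\}\Delta_i = o_p(1).$$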
Theorem 2.
If we assume that (C1)–(C5) hold, we have
$$\left(\frac{n\,d_G^2(\hat\beta_1, \beta_1) - m_1}{\sqrt{2m_1}},\ \ldots,\ \frac{n\,d_G^2(\hat\beta_d, \beta_d) - m_d}{\sqrt{2m_d}},\ \sqrt{n\,o_1}\,(\gamma_1 - \hat\gamma_1),\ \ldots,\ \sqrt{n\,o_q}\,(\gamma_q - \hat\gamma_q)\right)^T \xrightarrow{d} N(0, I).$$
In the case of models truncated at $m_j$, let $\hat\chi_j$ be the estimator of $\chi_j = (\chi_{j1}, \chi_{j2}, \ldots, \chi_{jm_j})^T$ and $\tilde\Lambda_j = (\lambda_{j,k_1k_2})_{1\le k_1, k_2\le m_j}$, where $\lambda_{j,k_1k_2} = E\left[\frac{g'(\eta)^2}{\sigma^2(\mu)}\,\xi_{jk_1}\xi_{jk_2}\right]$. We define $\bar\chi_j = (\chi_{j(m_j+1)}, \chi_{j(m_j+2)}, \ldots)^T$. Therefore, we have the following expression:
$$d_G^2(\hat\beta_j, \beta_j) = (\hat\chi_j - \chi_j)^T\tilde\Lambda_j(\hat\chi_j - \chi_j) + \sum_{k_1, k_2 = m_j}^{\infty}\lambda_{j,k_1k_2}\,\bar\chi_j^{\,2}, \quad j = 1, \ldots, d.$$
Furthermore, let $o_l = E\left[\frac{g'(\eta)^2}{\sigma^2(\mu)}\,z_{il}^2\right]$, where $l = 1, \ldots, q$. Here, $I$ represents a $(q + \sum_{j=1}^{d}m_j)\times(q + \sum_{j=1}^{d}m_j)$ dimensional identity matrix.
Proof.
By using a Taylor expansion with a suitable mean value $\bar\theta$, we can obtain
$$U^{*}(\hat\theta) = U^{*}(\theta_0) - Q_{\bar\theta}^{*}(\hat\theta - \theta_0) = 0. \tag{15}$$
Then, by Lemmas 2 and 3, (15) can be rewritten as
$$U^{*}(\hat\theta) = U(\theta_0) - Q_{\bar\theta}(\hat\theta - \theta_0) + o_p(\sqrt{n}) = 0.$$
Then, we can get
$$\hat\theta - \theta_0 = Q_{\bar\theta}^{-1}U(\theta_0) + o_p\left(\frac{1}{\sqrt{n}}\right).$$
By combining the above equation with
$$U(\tilde\theta^{(\alpha)}) = U(\theta_0) - Q_{\bar\theta}(\tilde\theta^{(\alpha)} - \theta_0) = 0,$$
we can get
$$\sqrt{n}\,(\hat\theta - \theta_0) = \sqrt{n}\,(\tilde\theta^{(\alpha)} - \theta_0) + o_p(1). \tag{16}$$
Equation (16) transforms the relationship between $\hat\theta$ and $\theta_0$ in the case of an unknown link function into the relationship between $\tilde\theta^{(\alpha)}$ and $\theta_0$ in the case of a known link function. Combined with Theorem 1 in [25], this proves Theorem 2. □

4.3. Asymptotic Convergence of $\hat g$

Theorem 3.
If we assume that (C1)–(C5) hold, then for $\sigma^2 > 0$ we have
$$\sqrt{nh}\left[\hat g(u; \hat\theta) - g(u) - I(u)\right] \xrightarrow{D} N(0, \vartheta^2(u)).$$
Proof.
$$\sqrt{nh}\left[\hat g(u; \hat\theta) - g(u) - I(u)\right] = \sqrt{nh}\left[\hat g(u; \hat\theta) - \hat g(u; \theta) + \tilde g^{(\alpha)}(u; \theta) - g(u) - I(u)\right] \approx \sqrt{nh}\,\hat g'(u; \theta)\,(\hat\theta - \theta) + \sqrt{nh}\left[\tilde g^{(\alpha)}(u; \theta) - g(u) - I(u)\right] = \sqrt{nh}\left[\tilde g^{(\alpha)}(u; \theta) - g(u) - I(u)\right] + o_p(1).$$
The above expression transforms the relationship between $\hat g$ and $g$ into the relationship between $\tilde g^{(\alpha)}$ and $g$ (i.e., Theorem 1). Therefore, Theorem 3 follows from Theorem 1. □
Corollary 2.
If we further strengthen the condition in assumption (C4) so that $nh^5 \to 0$, then it follows that
$$\sqrt{nh}\left[\hat g(u; \hat\theta) - g(u)\right] \xrightarrow{D} N(0, \vartheta^2(u)).$$
Remark 2.
Let $(e_1, \lambda_1), (e_2, \lambda_2), \ldots, (e_{m_j}, \lambda_{m_j})$ represent the eigenvectors and eigenvalues of $\Omega$, where
$$e_k = (e_{j1}, \ldots, e_{jm_j}), \quad w_k(t) = \sum_{j=1}^{d}\rho_{jk}(t)\,e_{jk}, \quad k = 1, 2, \ldots, m_j,$$
$$\Omega = \frac{1}{n}E\left[\frac{\hat g'(\hat\eta_i)^2}{\sigma^2(\hat\mu_i)}\right]D_j^T D_j, \quad i = 1, \ldots, n, \quad j = 1, \ldots, d.$$
Then, the 95% confidence band for the regression coefficient function $\hat\beta_j(t)$ can be expressed as
$$\hat\beta_j(t) \pm r(\alpha)\sqrt{\sum_{k=1}^{m_j}\frac{w_k(t)^2}{e_k}},$$
where $r(\alpha) = \sqrt{\left[m_j + \sqrt{2m_j}\,\Phi(1-\alpha)\right]}$, $\alpha = 0.05$, $\Phi(1-\alpha) = 1.96$.

5. Simulation

We consider a binary response, two functional predictors, and three scalar predictors. The functional predictors $X_{i1}(t)$ and $X_{i2}(t)$ ($i = 1, \ldots, n$) are observed at 50 equidistant time points on the interval $[0, 1]$.
The sample sizes are $n = 50, 100, 300$. Let the scores $\xi_{ijk}$ of each functional predictor satisfy the following assumptions:
$$\xi_{i1k} \sim N(0, \lambda_{1k}), \quad k = 1, 2, 3, 4,$$
where $\lambda_{11} = 1$, $\lambda_{12} = \sqrt{2}/2$, $\lambda_{13} = 1/2$, $\lambda_{14} = \sqrt{2}/4$, and
$$\xi_{i2k} \sim N(0, \lambda_{2k}), \quad k = 1, 2, 3,$$
where $\lambda_{21} = 1$, $\lambda_{22} = \sqrt{2}/2$, $\lambda_{23} = 1/2$.
We define the orthonormal basis functions $\rho_{1k}(t)$ and $\rho_{2k}(t)$, $t \in [0, 1]$, which satisfy
$$\rho_{1k}(t) = \sqrt{2}\sin(2k\pi t), \quad k = 1, 2, 3, 4,$$
$$\rho_{2k}(t) = \sqrt{2}\cos(2k\pi t), \quad k = 1, 2, 3.$$
Then, $X_{ij}(t)$ can be represented through the Karhunen–Loève expansion as follows:
$$X_{i1}(t) = \sum_{k=1}^{4}\xi_{i1k}\,\rho_{1k}(t),$$
$$X_{i2}(t) = \sum_{k=1}^{3}\xi_{i2k}\,\rho_{2k}(t).$$
Figure 1 shows the 50 trajectories of the two functional predictors X 1 ( t ) and X 2 ( t ) .
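The construction of the two functional predictors can be sketched as follows. This is an illustrative sketch rather than the authors' simulation code: it assumes that the second argument of $N(0, \lambda)$ is a variance, takes $\lambda_{j2} = \sqrt{2}/2$ and $\lambda_{14} = \sqrt{2}/4$ for the score variances, and uses the hypothetical function name `simulate_curves`.

```python
import numpy as np

def simulate_curves(n, rng, n_grid=50):
    """Generate n trajectories of X_1(t) and X_2(t) from their truncated KL expansions."""
    t = np.linspace(0.0, 1.0, n_grid)                             # equidistant points on [0, 1]
    lam1 = np.array([1.0, np.sqrt(2) / 2, 0.5, np.sqrt(2) / 4])   # score variances for X_1
    lam2 = np.array([1.0, np.sqrt(2) / 2, 0.5])                   # score variances for X_2
    k1 = np.arange(1, 5)
    k2 = np.arange(1, 4)
    rho1 = np.sqrt(2) * np.sin(2 * np.pi * np.outer(k1, t))       # (4, n_grid) sine basis
    rho2 = np.sqrt(2) * np.cos(2 * np.pi * np.outer(k2, t))       # (3, n_grid) cosine basis
    xi1 = rng.normal(size=(n, 4)) * np.sqrt(lam1)                 # xi_i1k ~ N(0, lam_1k)
    xi2 = rng.normal(size=(n, 3)) * np.sqrt(lam2)                 # xi_i2k ~ N(0, lam_2k)
    X1 = xi1 @ rho1                                               # X_i1(t) = sum_k xi_i1k rho_1k(t)
    X2 = xi2 @ rho2                                               # X_i2(t) = sum_k xi_i2k rho_2k(t)
    return t, X1, X2, xi1, xi2
```

Calling `simulate_curves(50, np.random.default_rng(0))` returns two arrays of 50 curves each, evaluated at the 50 grid points, together with the scores that generated them.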
The scalar predictor Z = ( Z 1 , Z 2 , Z 3 ) T satisfies the following assumption
Z 1 N ( 0 , 1 ) , Z 2 N ( 0 , 3 3 ) , Z 3 N ( 0 , 5 5 ) .
We assume that the regression coefficient functions of the functional predictors satisfy
$$\beta_1(t) = \sum_{k=1}^{4}\chi_{1k}\,\rho_{1k}(t),$$
$$\beta_2(t) = \sum_{k=1}^{3}\chi_{2k}\,\rho_{2k}(t),$$
where χ 1 k = 1 3 k , k = 1 , 2 , 3 , 4 and χ 2 k = 1 3 k , k = 1 , 2 , 3 . Moreover, we assume that the regression coefficients γ = ( γ 1 , γ 2 , γ 3 ) T of the scalar predictors satisfy γ 1 = 2 2 2 2 , γ 2 = 3 3 3 3 , γ 3 = 1 / 2 .
Define
$$P(X, Z) = g\left(\sum_{j=1}^{2}\int_T X_j(t)\beta_j(t)\,dt + Z^T\gamma\right),$$
and select the link function as
$$g(x) = \frac{\exp(x)}{1 + \exp(x)}.$$
We generate the binary response
$$Y(X, Z) \sim \mathrm{Binomial}(1, P(X, Z)),$$
i.e., a single Bernoulli trial with success probability $P(X, Z)$, as a pseudo-random sequence.
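Generating the binary response from the linear predictor can be sketched as follows. The helper names `logistic` and `draw_binary_response` are hypothetical, and the linear predictor `eta` is assumed to have been computed beforehand from the functional and scalar predictors.

```python
import numpy as np

def logistic(x):
    """Logistic link g(x) = exp(x) / (1 + exp(x)), written in an equivalent form."""
    return 1.0 / (1.0 + np.exp(-x))

def draw_binary_response(eta, rng):
    """Draw Y_i ~ Binomial(1, g(eta_i)); returns the responses and the probabilities."""
    p = logistic(np.asarray(eta, dtype=float))
    y = rng.binomial(1, p)          # one Bernoulli trial per observation
    return y, p
```

Each response is a single Bernoulli draw, so the sample mean of `y` at a fixed value of `eta` estimates $g(\eta)$ at that value.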
We obtain a sample
$$(Y_i, X_{i1}(t), X_{i2}(t), Z_i), \quad i = 1, \ldots, n,$$
where $n$ is the sample size. For $n = 50, 100, 300$, the numbers of functional principal components needed to explain 85% of the cumulative variance are $m_1 = 3, 3, 4$ and $m_2 = 2, 3, 3$, respectively. We run 100 simulations.
Figure 2 shows the asymptotic behavior of the link function under different sample sizes. The black line in Figure 2 shows the relationship between $\eta$ and $\mu$, where
$$\eta = \sum_{j=1}^{2} \int_T X_j(t) \beta_j(t) \, dt + Z^T \gamma, \quad \mu = g(\eta) = \frac{\exp(\eta)}{1 + \exp(\eta)} \in [0, 1].$$
The colored lines in Figure 2 represent the estimated link function $\hat{g}$ for the different sample sizes. They are obtained iteratively, starting from the initial link $g(\eta) = \eta$ and stopping when either 100 iterations have been performed or the error in the regression coefficients is less than 0.01. These lines illustrate the relationship between $\hat{\eta}$ and $\hat{\mu}$, where
$$\hat{\eta} = \sum_{j=1}^{2} \int_T X_j(t) \hat{\beta}_j(t) \, dt + Z^T \hat{\gamma}, \quad \hat{\mu} = \hat{g}(\hat{\eta}) \in [0, 1].$$
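The link-update step of such an iteration can be sketched as a local linear smoother of the responses on the current index values. This is an illustrative sketch, not the authors' implementation: the Gaussian kernel, the bandwidth `h`, the clipping of the fit to $[0, 1]$, and the function name `local_linear` are assumptions.

```python
import numpy as np

def local_linear(x0, x, y, h):
    """Local linear regression estimate of E[y | x] at the points x0.

    For each x0, minimizes sum_i K((x_i - x0) / h) * (y_i - a - b (x_i - x0))**2
    with a Gaussian kernel K and returns the intercept a.
    """
    x0 = np.atleast_1d(np.asarray(x0, dtype=float))
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    d = x[None, :] - x0[:, None]         # shape (len(x0), n)
    w = np.exp(-0.5 * (d / h) ** 2)      # Gaussian kernel weights
    s0 = w.sum(axis=1)
    s1 = (w * d).sum(axis=1)
    s2 = (w * d * d).sum(axis=1)
    t0 = (w * y).sum(axis=1)
    t1 = (w * d * y).sum(axis=1)
    # Closed-form intercept of the weighted least-squares fit.
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 ** 2)

# Example: smooth simulated binary responses on a hypothetical index eta_hat.
rng = np.random.default_rng(0)
eta_hat = rng.uniform(-2.0, 2.0, size=300)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta_hat)))
grid = np.linspace(-2.0, 2.0, 41)
g_hat = np.clip(local_linear(grid, eta_hat, y, h=0.5), 0.0, 1.0)
```

In a full procedure this step would alternate with re-estimating the regression coefficients given the current $\hat{g}$ until the stopping rule is met.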
Since in this case both $\hat{\eta}$ and $\eta$ are in $[-2, 2]$, we denote the argument of $g$ and $\hat{g}$ by $\eta$, and the x-axis in Figure 2 is denoted by $\eta$ and is shown in the interval $[0, 1]$. Table 2 presents the estimates of $\hat{g}$ evaluated through RMISE under different sample sizes. The RMISE is defined as follows:
$$\mathrm{RMISE} = \frac{1}{Q} \int_{-2}^{2} \big( \hat{g}(\eta) - g(\eta) \big)^2 \, d\eta,$$
where $Q = 100$ is the number of simulations here. In summary, Figure 2 and Table 2 demonstrate that as the sample size increases, the estimated link function $\hat{g}$ becomes closer and closer to the true link function $g$.
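An error measure of this kind can be computed numerically as sketched below. The sketch integrates the squared error over $[-2, 2]$ with the trapezoidal rule, averages over the simulation runs, and takes the square root. The averaging over runs and the square root are one reading of "root mean integrated square error" and are assumptions here, as is the function name `rmise`.

```python
import numpy as np

def rmise(g_hats, g_true, grid):
    """Root of the run-averaged integrated squared error.

    g_hats: array of shape (Q, G), one estimated link per simulation run.
    g_true: array of shape (G,), true link on the same grid.
    grid:   increasing evaluation points covering the integration range.
    """
    g_hats = np.atleast_2d(np.asarray(g_hats, dtype=float))
    sq = (g_hats - np.asarray(g_true, dtype=float)) ** 2
    # Trapezoidal rule along the grid, one integral per run.
    ise = np.sum(0.5 * (sq[:, 1:] + sq[:, :-1]) * np.diff(grid), axis=1)
    return float(np.sqrt(ise.mean()))

grid = np.linspace(-2.0, 2.0, 201)
g_true = np.exp(grid) / (1.0 + np.exp(grid))
# Toy input: every run is off by a constant 0.1, so the integral is 0.01 * 4.
g_hats = np.tile(g_true + 0.1, (100, 1))
value = rmise(g_hats, g_true, grid)
```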
In Table 3, it can be seen that both the SD and RMISE of the estimated regression coefficient functions $\hat{\beta}_1(t)$ and $\hat{\beta}_2(t)$ decrease as the sample size $n$ increases.
Figure 3 displays the estimated functional regression coefficients $\hat{\beta}_1(t)$ and $\hat{\beta}_2(t)$, as well as their 95% confidence intervals under different sample sizes. The red curves in the figure represent the theoretical values of $\beta_1(t)$ and $\beta_2(t)$, while the blue curves represent the estimated values $\hat{\beta}_1(t)$ and $\hat{\beta}_2(t)$. The gray shaded area represents the 95% confidence interval of the estimates. It can be seen that as the sample size increases, the estimated values become closer to the true values.
Table 4 presents the estimated scalar regression coefficients $\hat{\gamma}$ and the corresponding standard deviations under different sample sizes. It can be seen that as the sample size $n$ increases, $\hat{\gamma} = (\hat{\gamma}_1, \hat{\gamma}_2, \hat{\gamma}_3)^T$ becomes closer to the true values $\gamma = (\sqrt{2}/2, \sqrt{3}/3, 1/2)^T$. Moreover, as the sample size $n$ increases, the SD becomes smaller, indicating that the estimates are more precise.
Table 5 presents the M1 and M2 values for different sample sizes, where $M_1 = \frac{1}{Q} \sum_{i=1}^{Q} \mathrm{MAE}$, $\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |Y_i - \hat{Y}_i|$, $M_2 = \frac{1}{Q} \sum_{i=1}^{Q} \mathrm{MSE}$, $\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$, and $Y_i$ and $\hat{Y}_i$ represent the real and the predicted values of the response variable, respectively. As the sample size increases, the values of M1 and M2 become smaller, indicating that the predictive performance of the model improves.
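The two summary measures can be computed as sketched below, averaging the per-run MAE and MSE over the simulation runs. The function name `m1_m2` and the toy inputs are assumptions for illustration.

```python
import numpy as np

def m1_m2(y_true_runs, y_pred_runs):
    """Average the per-run MAE and MSE over all simulation runs."""
    maes, mses = [], []
    for y, y_hat in zip(y_true_runs, y_pred_runs):
        err = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
        maes.append(np.mean(np.abs(err)))   # MAE of this run
        mses.append(np.mean(err ** 2))      # MSE of this run
    return float(np.mean(maes)), float(np.mean(mses))

# Two toy runs: hard 0/1 predictions in the first, probabilities in the second.
y_true_runs = [[1, 0, 1, 0], [1, 1]]
y_pred_runs = [[1, 0, 0, 0], [0.5, 0.5]]
M1, M2 = m1_m2(y_true_runs, y_pred_runs)
```

If $\hat{Y}_i$ were hard 0/1 labels, the MAE and MSE of a run would coincide, as in the first toy run. Table 5 reports different values for M1 and M2, which suggests that $\hat{Y}_i$ is a fitted probability.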

6. Application

Research on average life expectancy is crucial for social development, health policy, and population management. It helps governments, health departments, and social institutions develop policies and plans to improve people’s quality of life and health conditions. Knowledge of life expectancy makes it possible to evaluate the efficiency of healthcare systems and the effectiveness of social welfare and public health policies, providing a basis for resource allocation and planning. It also helps in understanding population structure and trends, providing references for socioeconomic development, pension systems, and labor market planning. Therefore, in the application of our proposed model, we investigate factors that influence average life expectancy, including air quality index (AQI), temperature, GDP, and number of beds in hospitals.

6.1. Data Description

We collected average daily temperature (Temp) data for 58 cities in China in 2020 from the National Meteorological Science Data Sharing Service Platform, and average daily Air Quality Index (AQI) data from the National Environmental Monitoring Station. We also collected GDP, number of beds in hospitals, and life expectancy data for each city from local statistical bulletins and government documents. The two functional predictors are the daily AQI and the daily temperature, observed on each of the 366 days from 1 January to 31 December 2020 in the 58 cities. The two scalar predictors are GDP and the number of beds in hospitals for the 58 cities in 2020. The response variable is the life expectancy of residents in each city in 2020.
Figure 4 shows the daily AQI and temperature for 58 cities in 2020.

6.2. Data Analysis

According to a report released by the National Health Commission, the average life expectancy of Chinese residents in 2020 was 77.9 years. Therefore, we dichotomize the response variable as follows: when the life expectancy of a city is greater than 77.9 years, we code it as 1; otherwise, we code it as 0. For the functional predictors, we first center the data. Second, we conduct FPCA and select the number of functional principal components that explain 75% of the variation. The numbers of components for AQI and temperature are $p_{AQI} = 10$ and $p_{Temp} = 3$, respectively. We use GCV to assess the predictive accuracy of the estimators. In this application, $GCV = 0.135$.

6.3. Results Analysis

By fitting the generalized partially functional linear model to the data, we obtain the regression coefficient function $\hat{\beta}(t)$ for the functional predictors and the regression coefficients $\hat{\gamma}$ for the scalar predictors. The results are shown in Figure 5 and Table 6, respectively.
Table 6 presents the estimated values of the regression coefficients $\hat{\gamma}$ for the scalar predictors. Both GDP and number of beds in hospitals have a positive relationship with life expectancy, and both are significant at the 5% level. This means that a region with a higher GDP and more hospital beds tends to have a longer life expectancy. In other words, the better the economic development and medical resources of a region, the longer the life expectancy.
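As a quick consistency check on Table 6, each t value is the estimate divided by its standard error. The arithmetic below uses only the values reported in the table:

```latex
t_{\mathrm{GDP}} = \frac{0.6776}{0.339} \approx 1.9988,
\qquad
t_{\mathrm{Beds}} = \frac{0.7354}{0.367} \approx 2.0038 .
```

Both ratios reproduce the tabulated t values, and the corresponding reported p-values, 0.04639 and 0.04585, fall just below 0.05.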
Figure 5 shows the estimated values of the regression coefficient function $\hat{\beta}(t)$. For AQI, there is in general a negative relationship between AQI and life expectancy: the higher the AQI, the more serious the air pollution and the lower the corresponding life expectancy. However, there is a more pronounced positive relationship from February to April, which may be influenced by other external factors. For temperature, the effect on life expectancy varies with the seasons. In spring, summer, and fall (March to October), temperature is negatively associated with life expectancy, and in winter (November to February), it is positively associated, which is consistent with the conclusion in Huang et al. [19].
To confirm the necessity of considering the unknown link function model, we fit a model without a link function and a model with the logit link function (i.e., $g(\eta) = \frac{e^{\eta}}{1 + e^{\eta}}$) and compare them with our proposed model with an unknown link function. To evaluate the prediction performance of the three models, we use MAE, MSE, and $R^2$. Additionally, we calculate the accuracy using the confusion matrix, where we define TP as the number of samples correctly classified as positive, TN as the number of samples correctly classified as negative, FP as the number of samples incorrectly classified as positive, and FN as the number of samples incorrectly classified as negative (missed detections). We obtain the model’s accuracy using the formula
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}.$$
Smaller values of MAE and MSE indicate a smaller prediction error and better performance. An $R^2$ closer to 1 indicates that the model has a stronger ability to explain the response variable. The experimental results are shown in Table 7. The proposed model has the best performance on every criterion, with the lowest MAE (0.2584) and MSE (0.1399) and the highest $R^2$ (0.8916) and accuracy (81.03%).
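The accuracy formula can be illustrated with a short sketch. With 58 cities, the accuracies reported in Table 7 (81.03%, 75.86%, and 74.14%) are consistent with 47, 44, and 43 correctly classified cities, respectively. The split of the counts between TP, TN, FP, and FN used below is hypothetical, as is the function name `accuracy`.

```python
def accuracy(tp, tn, fp, fn):
    """Share of correctly classified samples in a 2x2 confusion matrix."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total

# Hypothetical split of 58 cities with 47 correct classifications.
acc = accuracy(tp=25, tn=22, fp=6, fn=5)
print(f"{100 * acc:.2f}%")  # prints 81.03%
```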

7. Conclusions

This article proposes a generalized partially functional linear model for scalar response and predictor variables that include both functional and scalar components, without specifying a link function. We use functional principal component analysis to reduce the dimensionality of functional data, estimate the regression coefficients using the maximum likelihood estimation method, estimate the link function using the method of local linear regression, iteratively obtain the final estimator, and establish the asymptotic normality of the estimator. The accuracy of the proposed model is validated through simulation studies.
The article applies the proposed model to the study of average life expectancy. Using daily AQI, temperature, GDP, and number of beds in hospitals for 58 cities in China in 2020, the study explores the impact of environmental, economic, and medical factors on life expectancy. The results indicate that GDP and number of beds in hospitals have a positive correlation with the life expectancy, while the AQI has an overall negative correlation. Temperature has a negative correlation with the average life expectancy in spring, summer, and autumn, and a positive correlation in winter. Overall, the study concludes that the average life expectancy is higher in areas with better environmental, economic, and medical development.
This model can be used in various fields, including economics, biomedicine, and engineering. However, it still has certain limitations. For example, the relationship between air quality and temperature needs further consideration, since the two are correlated: in general, an increase in temperature can intensify the volatilization and diffusion of pollutants in the air, thereby reducing air quality. In the next phase of research, we will consider the interactions between functional predictors to make the results more accurate. In addition, the algorithms and optimization methods of the model can be further improved to enhance computational efficiency, and combining this model with other machine learning methods may further improve predictive performance.

Author Contributions

W.X.: methodology, software, validation, writing—review, supervision, funding acquisition. S.L.: methodology, software, data curation, writing—original draft. H.L.: writing—review, supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This research is supported by the Yujie Talent Project of North China University of Technology (Grant No. 107051360023XN075-04).

Data Availability Statement

The original data supporting the results of this study can be obtained from the National Meteorological Science Data Sharing Service Platform, the National Environmental Monitoring Station, and local statistical bulletins.

Acknowledgments

The authors would like to thank the referees and the editor for their useful suggestions, which helped us improve the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ramsay, J.O. When the data are functions. Psychometrika 1982, 47, 379–396.
  2. Ramsay, J.O.; Silverman, B.W. Functional Data Analysis, 2nd ed.; Springer: New York, NY, USA, 2005.
  3. Horváth, L.; Kokoszka, P. Inference for Functional Data with Application; Springer: New York, NY, USA, 2012.
  4. Shin, H. Partial functional linear regression. J. Stat. Plan. Inference 2009, 139, 3405–3418.
  5. Shin, H.; Lee, M.H. On prediction rate in partial functional linear regression. J. Multivar. Anal. 2012, 103, 93–106.
  6. James, G.M. Generalized linear models with functional predictors. J. R. Stat. Soc. Ser. B 2002, 64, 411–432.
  7. Müller, H.G.; Stadtmüller, U. Generalized functional linear models. Ann. Stat. 2005, 33, 774–805.
  8. Shang, Z.F.; Cheng, G. Nonparametric inference in generalized functional linear models. Ann. Stat. 2015, 43, 1742–1773.
  9. Wong, R.K.W.; Li, Y.; Zhu, Z.Y. Partially Linear Functional Additive Models for Multivariate Functional Data. J. Am. Stat. Assoc. 2019, 114, 406–418.
  10. Scallan, A.; Gilchrist, R.; Green, M. Fitting Parametric Link Functions in Generalized Linear Models. Comput. Stat. Data Anal. 1984, 2, 37–49.
  11. Weisberg, S.; Welsh, A.H. Adapting for the missing link. Ann. Stat. 1994, 22, 1674–1700.
  12. Chiou, J.M.; Müller, H.G. Quasi-likelihood regression with unknown link and variance functions. J. Am. Stat. Assoc. 1998, 93, 1376–1387.
  13. Chiou, J.M.; Müller, H.G. Estimated estimating equations: Semiparametric inference for clustered and longitudinal data. J. R. Stat. Soc. Ser. B 2005, 67, 531–553.
  14. Bai, Y.; Fung, W.K.; Zhu, Z.Y. Penalized quadratic inference functions for single-index models with longitudinal data. J. Multivar. Anal. 2009, 100, 152–161.
  15. Pang, Z.; Xue, L. Estimation for the single-index models with random effects. Comput. Stat. Data Anal. 2012, 56, 1837–1853.
  16. Yuan, M.; Diao, G. Sieve maximum likelihood estimation in generalized linear models with an unknown link function. Wiley Interdiscip. Rev. Comput. Stat. 2017, 10, e1425.
  17. Kokoszka, P.; Reimherr, M. Introduction to Functional Data Analysis; CRC Press: Boca Raton, FL, USA, 2017.
  18. Rao, A.R.; Reimherr, M. Nonlinear Functional Modeling Using Neural Networks. J. Comput. Graph. Stat. 2023, 32, 1248–1257.
  19. Huang, C.; Barnett, A.G.; Wang, X.; Tong, S. The impact of temperature on years of life lost in Brisbane, Australia. Nat. Clim. Chang. 2012, 2, 265–270.
  20. Yang, Y.; Qi, J.L.; Ruan, Z.L.; Yin, P.; Zhang, S.Y.; Liu, J.M.; Liu, Y.N.; Li, R.; Wang, L.J.; Lin, H.L. Changes in Life Expectancy of Respiratory Diseases from Attaining Daily PM2.5 Standard in China: A Nationwide Observational Study. Innovation 2020, 1, 100064.
  21. Deryugina, T.; Molitor, D. The Causal Effects of Place on Health and Longevity. J. Econ. Perspect. 2021, 35, 147–170.
  22. Mack, Y.P.; Silverman, B.W. Weak and strong uniform consistency of kernel regression estimates. Probab. Theory Relat. Fields 1982, 63, 405–415.
  23. Masry, E.; Tjøstheim, D. Estimation and Identification of Nonlinear ARCH Time Series: Strong Convergence and Asymptotic Normality. Econom. Theory 1995, 11, 258–289.
  24. Chiou, J.M.; Müller, H.G. Nonparametric quasi-likelihood. Ann. Stat. 1999, 27, 36–64.
  25. Xiao, W.W.; Wang, Y.X.; Liu, H.Y. Generalized partially functional linear model. Sci. Rep. 2021, 11, 23428.
Figure 1. The predictors $X_1(t)$ and $X_2(t)$.
Figure 2. Asymptotic properties of the link function $g$. The black line in the graph represents the true link function $g = \exp(\eta) / (1 + \exp(\eta))$. The purple, yellow, and red lines in the graph represent the estimated link functions $\hat{g}$ under sample sizes of $n = 50$, $n = 100$, and $n = 300$, respectively.
Figure 3. Estimated values of the regression coefficient functions $\hat{\beta}_1(t)$, $\hat{\beta}_2(t)$ (blue curves) and their 95% confidence intervals (grey areas) for different sample sizes, where the red curves are the theoretical regression coefficient functions $\beta_1(t)$, $\beta_2(t)$.
Figure 4. Daily AQI (left plot) and daily temperatures (right plot) for 58 cities in 2020; each curve represents one city.
Figure 5. Estimated values of the regression coefficient function $\hat{\beta}(t)$ and their 95% confidence intervals.
Table 1. The abbreviations and their corresponding full forms.

| Abbreviation | Full Form |
|---|---|
| FPCA | Functional principal component analysis |
| KL expansion | Karhunen–Loève expansion |
| RMISE | Root Mean Integrated Square Error |
| SD | Standard Deviation |
| GCV | Generalized Cross Validation |
| MAE | Mean Absolute Error |
| MSE | Mean Squared Error |
| TP | True Positive |
| TN | True Negative |
| FP | False Positive |
| FN | False Negative |
Table 2. RMISE of $g$ and $\hat{g}$ for different sample sizes $n$.

| $n$ | RMISE |
|---|---|
| 50 | 0.3540 |
| 100 | 0.2734 |
| 300 | 0.1449 |
Table 3. SD and RMISE of the estimated values of $\hat{\beta}_1(t)$ and $\hat{\beta}_2(t)$ for different sample sizes $n$.

| Coefficient function | $n$ | SD | RMISE |
|---|---|---|---|
| $\hat{\beta}_1(t)$ | 50 | 0.2475 | 0.3405 |
| $\hat{\beta}_1(t)$ | 100 | 0.1344 | 0.2517 |
| $\hat{\beta}_1(t)$ | 300 | 0.0552 | 0.1204 |
| $\hat{\beta}_2(t)$ | 50 | 0.2536 | 0.3232 |
| $\hat{\beta}_2(t)$ | 100 | 0.1261 | 0.2863 |
| $\hat{\beta}_2(t)$ | 300 | 0.0239 | 0.1033 |
Table 4. Estimated values of scalar regression coefficients $\hat{\gamma}$ and their SD in brackets for different sample sizes $n$.

| $n$ | $\hat{\gamma}_1$ | $\hat{\gamma}_2$ | $\hat{\gamma}_3$ |
|---|---|---|---|
| 50 | 0.7298 (0.191) | 0.5928 (0.177) | 0.5307 (0.232) |
| 100 | 0.6892 (0.092) | 0.5832 (0.071) | 0.4894 (0.096) |
| 300 | 0.7105 (0.019) | 0.5732 (0.018) | 0.4988 (0.016) |
Table 5. The M1 and M2 values for different sample sizes $n$.

| $n$ | M1 | M2 |
|---|---|---|
| 50 | 0.3182 | 0.1579 |
| 100 | 0.3028 | 0.1498 |
| 300 | 0.2921 | 0.1406 |
Table 6. Regression coefficients $\hat{\gamma}$ and their significance levels.

| Coefficient | Estimate | Std. Error | t Value | $\Pr(>\lvert t \rvert)$ |
|---|---|---|---|---|
| $\hat{\gamma}_{GDP}$ | 0.6776 | 0.339 | 1.9988 | 0.04639 |
| $\hat{\gamma}_{Beds}$ | 0.7354 | 0.367 | 2.0038 | 0.04585 |
Table 7. Comparison between Unknown Link Function Model, Logit Link Function Model, and Model without a Link Function.

| Link Function | MAE | MSE | $R^2$ | Accuracy |
|---|---|---|---|---|
| Unknown | 0.2584 | 0.1399 | 0.8916 | 81.03% |
| Logit | 0.2872 | 0.2511 | 0.6673 | 75.86% |
| Without | 0.4777 | 0.3146 | 0.4118 | 74.14% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
