Abstract
A goodness-of-fit test for the Functional Linear Model with Scalar Response (FLMSR) with responses Missing at Random (MAR) is proposed in this paper. The test statistic relies on a marked empirical process indexed by the projected functional covariate and its distribution under the null hypothesis is calibrated using a wild bootstrap procedure. The computation and performance of the test rely on having an accurate estimator of the functional slope of the FLMSR when the sample has MAR responses. Three estimation methods based on the Functional Principal Components (FPCs) of the covariate are considered. First, the simplified method estimates the functional slope by simply discarding observations with missing responses. Second, the imputed method estimates the functional slope by imputing the missing responses using the simplified estimator. Third, the inverse probability weighted method incorporates the missing response generation mechanism when imputing. Furthermore, both cross-validation and LASSO regression are used to select the FPCs used by each estimator. Several Monte Carlo experiments are conducted to analyze the behavior of the testing procedure in combination with the functional slope estimators. Results indicate that estimators performing missing-response imputation achieve the highest power. The testing procedure is applied to check for linear dependence between the average number of sunny days per year and the mean curve of daily temperatures at weather stations in Spain.
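As a rough illustration of the first two estimators described above, the following sketch fits an FPC regression on the complete cases only (the simplified estimator) and then refits on the full sample after imputing the missing responses with the simplified fit (the imputed estimator). It is a minimal sketch under stated assumptions, not the authors' implementation: curves are assumed to be observed on a common equispaced grid (so the grid spacing is absorbed into the discretized slope), the number of components `k` is fixed by hand rather than chosen by cross-validation or LASSO, and the function names `fpc_fit`, `predict`, `simplified_estimator`, and `imputed_estimator` are hypothetical.

```python
import numpy as np

def fpc_fit(X, y, k):
    """FPC regression of a scalar response y on discretized curves X (n x p).

    Assumption: common equispaced grid, so quadrature weights are absorbed
    into the returned discretized slope.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # Rows of Vt are the discretized eigenfunctions of the sample covariance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T                       # p x k matrix of the first k FPCs
    scores = Xc @ V                    # n x k matrix of FPC scores
    coef, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    beta = V @ coef                    # slope expressed on the grid
    return x_mean, y_mean, beta

def predict(fit, X):
    """Predicted responses for curves X from a fitted (x_mean, y_mean, beta)."""
    x_mean, y_mean, beta = fit
    return y_mean + (X - x_mean) @ beta

def simplified_estimator(X, y, observed, k):
    """Discard observations with missing responses, then fit."""
    return fpc_fit(X[observed], y[observed], k)

def imputed_estimator(X, y, observed, k):
    """Impute missing responses with the simplified fit, then refit on all data."""
    fit_s = simplified_estimator(X, y, observed, k)
    y_imp = np.where(observed, y, predict(fit_s, X))
    return fpc_fit(X, y_imp, k)
```

Here `observed` is a boolean vector flagging the available responses, and the entries of `y` at missing positions may be `NaN` because they are never used directly. The inverse probability weighted variant would additionally require an estimate of the probability of observing each response, which this sketch does not attempt.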
Acknowledgements
The authors acknowledge financial support by MCIN/AEI/10.13039/501100011033: the first and fourth authors from grant PID2020-116587GB-I00, the second author from grant PID2019-108311GB-I00, and the third author from grant PID2021-124051NB-I00. Comments from two anonymous reviewers are acknowledged.
About this article
Cite this article
Febrero-Bande, M., Galeano, P., García-Portugués, E. et al. Testing for linearity in scalar-on-function regression with responses missing at random. Comput Stat (2024). https://doi.org/10.1007/s00180-023-01445-2
DOI: https://doi.org/10.1007/s00180-023-01445-2