Abstract
The importance of discovering significant variables from a large candidate pool is now widely recognized in many fields. A number of variable selection algorithms exist in the literature. Some are computationally efficient but provide only a necessary condition, not a necessary and sufficient condition, for testing whether a variable contributes to the system output; the others are computationally expensive. The goal of this paper is to develop a directional variable selection algorithm that performs as well as or better than the leading variable selection algorithms, but under weaker technical assumptions and with much lower computational complexity. It provides a necessary and sufficient condition for testing whether a variable contributes to the system. In addition, since the indicators for redundant variables are not exactly zero, it is difficult to decide whether a variable is redundant when its indicator is small. This is critical in the variable selection problem because a variable is either selected or unselected. To solve this problem, a penalty optimization algorithm is proposed to ensure convergence of the selected variable set. Simulation and experimental studies verify the effectiveness of the directional variable selection method proposed in this paper.
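As the abstract notes, indicators for redundant variables are small but not exactly zero, so a hard selected/unselected decision must be made. The paper's penalty optimization algorithm is not reproduced here; the snippet below uses a naive largest-gap threshold as a stand-in, purely to illustrate the decision problem (the function name and values are illustrative):

```python
import numpy as np

def select_by_gap(indicators):
    """Split variables into 'selected' and 'redundant' at the largest
    gap in the sorted indicator values -- a naive stand-in for the
    paper's penalty-optimization step, which drives the indicators of
    redundant variables toward zero."""
    s = np.sort(np.asarray(indicators))[::-1]  # descending order
    gaps = s[:-1] - s[1:]
    cut = np.argmax(gaps)                      # largest drop separates the groups
    threshold = (s[cut] + s[cut + 1]) / 2
    return [i for i, v in enumerate(indicators) if v > threshold]

# contributing variables have clearly larger indicators than redundant ones
print(select_by_gap([0.93, 0.04, 0.88, 0.02, 0.05]))  # → [0, 2]
```

Any gap-based rule of this kind fails when indicators decay gradually, which is precisely why a convergence-guaranteeing optimization step is needed in practice.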
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Funding
This work was supported by the Basic Research Project (Grant No. 20195208003).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendices
Appendix A
Following the definition of \(G\),
Further define
It follows that
Thus, \(G\) can be expressed as Eq. (6).
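The display equations of this appendix are not shown above. As a hedged sketch only, assuming the paper follows the standard directional regression construction of Li and Wang (2007) with predictor covariance \(\Sigma\), the expansion that Eq. (6) plausibly states is the following; its three terms are exactly those analyzed in Appendix B:

```latex
G = 2E\left[ E^{2}\left[ \mathbf{xx}^{\text{T}} - \Sigma \mid y \right] \right]
  + 2E^{2}\left[ E[\mathbf{x}\mid y]\, E\left[ \mathbf{x}^{\text{T}}\mid y \right] \right]
  + 2E\left[ E\left[ \mathbf{x}^{\text{T}}\mid y \right] E[\mathbf{x}\mid y] \right]
      E\left[ E[\mathbf{x}\mid y]\, E\left[ \mathbf{x}^{\text{T}}\mid y \right] \right]
```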
Appendix B
Proof of Theorem 1
Since \(y\) is independent of \(x_{i} ,i = d + 1, \ldots ,q\), and \(x_{i} \in S_{c}\) and \(x_{j} \in S_{r}\) are independent, \(E\left[ {{\mathbf{xx}}^{{\text{T}}} - \Sigma |y} \right]\) can be represented as
If \(E\left[ {x_{i}^{2} |y} \right],\;\;i = 1, \ldots ,d\), is non-degenerate, then we have
where \(\left| {\sigma_{i} } \right| > 0,\;\;i = 1, \ldots ,d\).
Otherwise, if \(E\left[ {x_{i} |y} \right],\;\;i = 1, \ldots ,d\), is non-degenerate, i.e., it is random and almost surely not a constant, we have
where \({\text{diag}} \left( {U_{11} } \right) = \left[ {\eta_{1} , \ldots ,\eta_{d} } \right] > 0\). Then
and \({\text{diag}} \left( {E^{2} \left[ {E[{\mathbf{x}}|y]E\left[ {{\mathbf{x}}^{{\text{T}}} |y} \right]} \right]} \right) \ge \left[ {\eta_{1}^{2} , \ldots ,\eta_{d}^{2} } \right] > 0\).
In addition, the scalar \(E\left[ {E\left[ {{\mathbf{x}}^{{\text{T}}} |y} \right]E[{\mathbf{x}}|y]} \right]\), which multiplies \(E\left[ {E[{\mathbf{x}}|y]E\left[ {{\mathbf{x}}^{{\text{T}}} |y} \right]} \right]\) in \(G\), equals the trace of \(E\left[ {E[{\mathbf{x}}|y]E\left[ {{\mathbf{x}}^{{\text{T}}} |y} \right]} \right]\), which is positive by the preceding inequality; hence \(E\left[ {E\left[ {{\mathbf{x}}^{{\text{T}}} |y} \right]E[{\mathbf{x}}|y]} \right] > 0\). Under Assumptions 1 and 2, we have
The above equality and inequality imply that \(e_{i}^{{\text{T}}} Ge_{i} > 0\;\;{\text{for}}\;\;i = 1, \ldots ,d\) and
This completes the proof.□
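The proof establishes that \(e_{i}^{{\text{T}}} Ge_{i} > 0\) exactly for the contributing variables, which suggests a numerical sanity check. Below is a minimal slice-based sketch of the directional regression candidate matrix in the sense of Li and Wang (2007); it is not the paper's estimator, and the function name, slice count, and simulated system are all illustrative assumptions:

```python
import numpy as np

def dr_matrix(x, y, n_slices=10):
    """Slice-based estimate of the directional-regression candidate
    matrix G; large diagonal entries flag contributing variables."""
    n, p = x.shape
    # standardize so the sample covariance of z is the identity
    L = np.linalg.cholesky(np.cov(x, rowvar=False))
    z = (x - x.mean(axis=0)) @ np.linalg.inv(L).T
    # partition the sample into slices along the ordered response
    slices = np.array_split(np.argsort(y), n_slices)
    A = np.zeros((p, p))  # accumulates E[E^2(zz^T - I | y)]
    B = np.zeros((p, p))  # accumulates E[E(z|y) E(z^T|y)]
    for idx in slices:
        zs, ph = z[idx], len(idx) / n
        M = zs.T @ zs / len(idx) - np.eye(p)  # E[zz^T|slice] - I
        m = zs.mean(axis=0)                   # E[z|slice]
        A += ph * M @ M
        B += ph * np.outer(m, m)
    # three-term expansion; trace(B) is the scalar E[E(z^T|y)E(z|y)]
    return 2 * A + 2 * B @ B + 2 * np.trace(B) * B

# sanity check: two contributing and three redundant variables
rng = np.random.default_rng(0)
x = rng.standard_normal((3000, 5))
y = x[:, 0] + x[:, 1] ** 2 + 0.1 * rng.standard_normal(3000)
d = np.diag(dr_matrix(x, y))
# the diagonal entries for x1 and x2 should dominate those of x3, x4, x5
```

Note that the squared term \(x_2^2\) is picked up by the first (second-moment) term of the expansion even though \(E[x_2 \mid y]\) is nearly zero by symmetry, which is the advantage of directional regression over first-moment methods.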
About this article
Cite this article
Sun, B., Cai, Q.Y., Peng, Z.K. et al. Variable selection and identification of high-dimensional nonparametric nonlinear systems by directional regression. Nonlinear Dyn 111, 12101–12112 (2023). https://doi.org/10.1007/s11071-023-08488-6