Abstract
The importance of discovering significant variables from a large candidate pool is now widely recognized in many fields. A number of variable selection algorithms exist in the literature. Some are computationally efficient but provide only a necessary condition, not a necessary and sufficient condition, for testing whether a variable contributes to the system output; the others are computationally expensive. The goal of this paper is to develop a directional variable selection algorithm that performs as well as or better than the leading variable selection algorithms, but under weaker technical assumptions and with much lower computational complexity. It provides a necessary and sufficient condition for testing whether a variable contributes to the system. In addition, since the indicators for redundant variables are not exactly zero, it is difficult to decide whether a variable is redundant when its indicator is small. This is critical in the variable selection problem because a variable is either selected or unselected. To solve this problem, a penalty optimization algorithm is proposed to ensure convergence of the selected variable set. Simulation and experimental studies verify the effectiveness of the directional variable selection method proposed in this paper.
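As the abstract notes, indicators for redundant variables are small but not exactly zero, so a hard selected/unselected decision must be made. The paper's penalty optimization algorithm is not reproduced here; the snippet below uses a naive largest-gap threshold as a stand-in, purely to illustrate the decision problem (the function name and values are illustrative):

```python
import numpy as np

def select_by_gap(indicators):
    """Split variables into 'selected' and 'redundant' at the largest
    gap in the sorted indicator values -- a naive stand-in for the
    paper's penalty-optimization step, which drives the indicators of
    redundant variables toward zero."""
    s = np.sort(np.asarray(indicators))[::-1]  # descending order
    gaps = s[:-1] - s[1:]
    cut = np.argmax(gaps)                      # largest drop separates the groups
    threshold = (s[cut] + s[cut + 1]) / 2
    return [i for i, v in enumerate(indicators) if v > threshold]

# contributing variables have clearly larger indicators than redundant ones
print(select_by_gap([0.93, 0.04, 0.88, 0.02, 0.05]))  # → [0, 2]
```

Any gap-based rule of this kind fails when indicators decay gradually, which is precisely why a convergence-guaranteeing optimization step is needed in practice.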
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Funding
This work was supported by the Basic Research Project (Grant No. 20195208003).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendices
Appendix A
Following the definition of \(G\),
Further define
It follows that
Thus, \(G\) can be expressed as Eq. (6).
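The display equations of this appendix are not shown above. As a hedged sketch only, assuming the paper follows the standard directional regression construction of Li and Wang (2007) with predictor covariance \(\Sigma\), the expansion that Eq. (6) plausibly states is the following; its three terms are exactly those analyzed in Appendix B:

```latex
G = 2E\left[ E^{2}\left[ \mathbf{xx}^{\text{T}} - \Sigma \mid y \right] \right]
  + 2E^{2}\left[ E[\mathbf{x}\mid y]\, E\left[ \mathbf{x}^{\text{T}}\mid y \right] \right]
  + 2E\left[ E\left[ \mathbf{x}^{\text{T}}\mid y \right] E[\mathbf{x}\mid y] \right]
      E\left[ E[\mathbf{x}\mid y]\, E\left[ \mathbf{x}^{\text{T}}\mid y \right] \right]
```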
Appendix B
Proof of Theorem 1
Since \(y\) is independent of \(x_{i} ,i = d + 1, \ldots ,q\), and \(x_{i} \in S_{c}\) and \(x_{j} \in S_{r}\) are independent, \(E\left[ {{\mathbf{xx}}^{{\text{T}}} - \Sigma |y} \right]\) can be represented as
If \(E\left[ {x_{i}^{2} |y} \right],\;\;i = 1, \ldots ,d\), is non-degenerate, then we have
where \(\left| {\sigma_{i} } \right| > 0,\;\;i = 1, \ldots ,d\).
Otherwise, if \(E\left[ {x_{i} |y} \right],\;\;i = 1, \ldots ,d\), is non-degenerate, i.e., it is random and almost surely not a constant, we have
where \({\text{diag}} \left( {U_{11} } \right) = \left[ {\eta_{1} , \ldots ,\eta_{d} } \right] > 0\). Then
and \({\text{diag}} \left( {E^{2} \left[ {E[{\mathbf{x}}|y]E\left[ {{\mathbf{x}}^{{\text{T}}} |y} \right]} \right]} \right) \ge \left[ {\eta_{1}^{2} , \ldots ,\eta_{d}^{2} } \right] > 0\).
In addition, the scalar \(E\left[ {E\left[ {{\mathbf{x}}^{{\text{T}}} |y} \right]E[{\mathbf{x}}|y]} \right]\), which multiplies \(E\left[ {E[{\mathbf{x}}|y]E\left[ {{\mathbf{x}}^{{\text{T}}} |y} \right]} \right]\) in \(G\), equals the trace of \(E\left[ {E[{\mathbf{x}}|y]E\left[ {{\mathbf{x}}^{{\text{T}}} |y} \right]} \right]\), which is positive by the preceding inequality; hence \(E\left[ {E\left[ {{\mathbf{x}}^{{\text{T}}} |y} \right]E[{\mathbf{x}}|y]} \right] > 0\). Under Assumptions 1 and 2, we have
The above equality and inequality imply that \(e_{i}^{{\text{T}}} Ge_{i} > 0\;\;{\text{for}}\;\;i = 1, \ldots ,d\) and
This completes the proof.□
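The proof establishes that \(e_{i}^{{\text{T}}} Ge_{i} > 0\) exactly for the contributing variables, which suggests a numerical sanity check. Below is a minimal slice-based sketch of the directional regression candidate matrix in the sense of Li and Wang (2007); it is not the paper's estimator, and the function name, slice count, and simulated system are all illustrative assumptions:

```python
import numpy as np

def dr_matrix(x, y, n_slices=10):
    """Slice-based estimate of the directional-regression candidate
    matrix G; large diagonal entries flag contributing variables."""
    n, p = x.shape
    # standardize so the sample covariance of z is the identity
    L = np.linalg.cholesky(np.cov(x, rowvar=False))
    z = (x - x.mean(axis=0)) @ np.linalg.inv(L).T
    # partition the sample into slices along the ordered response
    slices = np.array_split(np.argsort(y), n_slices)
    A = np.zeros((p, p))  # accumulates E[E^2(zz^T - I | y)]
    B = np.zeros((p, p))  # accumulates E[E(z|y) E(z^T|y)]
    for idx in slices:
        zs, ph = z[idx], len(idx) / n
        M = zs.T @ zs / len(idx) - np.eye(p)  # E[zz^T|slice] - I
        m = zs.mean(axis=0)                   # E[z|slice]
        A += ph * M @ M
        B += ph * np.outer(m, m)
    # three-term expansion; trace(B) is the scalar E[E(z^T|y)E(z|y)]
    return 2 * A + 2 * B @ B + 2 * np.trace(B) * B

# sanity check: two contributing and three redundant variables
rng = np.random.default_rng(0)
x = rng.standard_normal((3000, 5))
y = x[:, 0] + x[:, 1] ** 2 + 0.1 * rng.standard_normal(3000)
d = np.diag(dr_matrix(x, y))
# the diagonal entries for x1 and x2 should dominate those of x3, x4, x5
```

Note that the squared term \(x_2^2\) is picked up by the first (second-moment) term of the expansion even though \(E[x_2 \mid y]\) is nearly zero by symmetry, which is the advantage of directional regression over first-moment methods.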
About this article
Cite this article
Sun, B., Cai, Q.Y., Peng, Z.K. et al. Variable selection and identification of high-dimensional nonparametric nonlinear systems by directional regression. Nonlinear Dyn 111, 12101–12112 (2023). https://doi.org/10.1007/s11071-023-08488-6