
Variable selection and identification of high-dimensional nonparametric nonlinear systems by directional regression

  • Original Paper
  • Published in: Nonlinear Dynamics

Abstract

The importance of discovering significant variables from a large candidate pool is now widely recognized in many fields, and a number of variable selection algorithms exist in the literature. Some are computationally efficient but provide only a necessary condition, not a necessary and sufficient condition, for testing whether a variable contributes to the system output; the others are computationally expensive. The goal of this paper is to develop a directional variable selection algorithm that performs comparably to or better than the leading variable selection algorithms, but under weaker technical assumptions and with much reduced computational complexity. It provides a necessary and sufficient condition for testing whether a variable contributes to the system. In addition, since the indicators for redundant variables are not exactly zero, it is difficult to decide whether a variable is redundant when its indicator is small. This is critical in the variable selection problem because a variable is either selected or not. To solve this problem, a penalty optimization algorithm is proposed to ensure convergence of the selected variable set. Simulations and experiments verify the effectiveness of the directional variable selection method proposed in this paper.


Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.


Funding

This work was supported by the Basic Research Project (Grant No. 20195208003).

Author information

Corresponding author

Correspondence to C. M. Cheng.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.


Appendices

Appendix A

Following the definition of \(G\),

$$ \begin{aligned} G & = E\left[ 2\Sigma - E\left[ ({\textbf{x}} - {\tilde{\textbf{x}}})({\textbf{x}} - {\tilde{\textbf{x}}})^{{\text{T}}} |y,\tilde{y} \right] \right]^{2} \\ & = 4\Sigma^{2} - 4\Sigma E\left[ E\left[ ({\textbf{x}} - {\tilde{\textbf{x}}})({\textbf{x}} - {\tilde{\textbf{x}}})^{{\text{T}}} |y,\tilde{y} \right] \right] \\ & \quad + E\left[ \left( E\left[ ({\textbf{x}} - {\tilde{\textbf{x}}})({\textbf{x}} - {\tilde{\textbf{x}}})^{{\text{T}}} |y,\tilde{y} \right] \right)^{2} \right]. \end{aligned} $$

Further define

$$ \begin{aligned} E\left[ ({\textbf{x}} - {\tilde{\textbf{x}}})({\textbf{x}} - {\tilde{\textbf{x}}})^{{\text{T}}} |y,\tilde{y} \right] & = \underbrace{E\left[ {\textbf{x}}{\textbf{x}}^{{\text{T}}} |y \right] - E[{\textbf{x}}|y]E\left[ {\tilde{\textbf{x}}}^{{\text{T}}} |\tilde{y} \right]}_{Q(y,\tilde{y})} \\ & \quad \underbrace{-\,E[{\tilde{\textbf{x}}}|\tilde{y}]E\left[ {\textbf{x}}^{{\text{T}}} |y \right] + E\left[ {\tilde{\textbf{x}}}{\tilde{\textbf{x}}}^{{\text{T}}} |\tilde{y} \right]}_{Q(\tilde{y},y)}. \end{aligned} $$
$$ \begin{aligned} E\left[ \left( E\left[ ({\textbf{x}} - {\tilde{\textbf{x}}})({\textbf{x}} - {\tilde{\textbf{x}}})^{{\text{T}}} |y,\tilde{y} \right] \right)^{2} \right] & = 2E\left[ Q^{2}(y,\tilde{y}) \right] + 2E\left[ Q(y,\tilde{y})Q(\tilde{y},y) \right], \\ E\left[ Q^{2}(y,\tilde{y}) \right] & = E\left[ E^{2}\left[ {\textbf{x}}{\textbf{x}}^{{\text{T}}} |y \right] \right] + E^{2}\left[ E[{\textbf{x}}|y]E\left[ {\textbf{x}}^{{\text{T}}} |y \right] \right], \\ E\left[ Q(y,\tilde{y})Q(\tilde{y},y) \right] & = \Sigma^{2} + E\left[ E\left[ {\textbf{x}}^{{\text{T}}} |y \right] E[{\textbf{x}}|y] \right] E\left[ E[{\textbf{x}}|y]E\left[ {\textbf{x}}^{{\text{T}}} |y \right] \right]. \end{aligned} $$

It follows that

$$ \begin{aligned} G & = 2E\left[ E^{2}\left[ {\textbf{x}}{\textbf{x}}^{{\text{T}}} - \Sigma |y \right] \right] + 2E^{2}\left[ E[{\textbf{x}}|y]E\left[ {\textbf{x}}^{{\text{T}}} |y \right] \right] \\ & \quad + 2E\left[ E\left[ {\textbf{x}}^{{\text{T}}} |y \right] E[{\textbf{x}}|y] \right] E\left[ E[{\textbf{x}}|y]E\left[ {\textbf{x}}^{{\text{T}}} |y \right] \right]. \end{aligned} $$

Thus, \(G\) can be expressed as Eq. (6).
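The expression in Eq. (6) lends itself to a simple sliced-moment estimate of \(G\): each conditional expectation given \(y\) is replaced by a within-slice sample average over quantile slices of \(y\). The sketch below is an illustrative reconstruction under the assumption that \(\textbf{x}\) is standardized (zero mean, \(\Sigma = I\)); the function name, slice count, and toy system are ours, not the paper's.

```python
import numpy as np

def directional_regression_G(x, y, n_slices=10):
    """Sliced-moment estimate of the matrix G in Eq. (6).

    Assumes x is standardized to zero mean with Sigma = I.
    Slices are formed from the order statistics of y. This is
    an illustrative sketch, not the authors' implementation.
    """
    n, q = x.shape
    sigma = np.eye(q)
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)   # quantile slices of y
    term1 = np.zeros((q, q))   # E[ E^2[x x^T - Sigma | y] ]
    M = np.zeros((q, q))       # E[ E[x|y] E[x^T|y] ]
    trace_term = 0.0           # E[ E[x^T|y] E[x|y] ], a scalar
    for idx in slices:
        w = len(idx) / n                        # slice probability
        xs = x[idx]
        mu = xs.mean(axis=0)                    # E[x | y in slice]
        second = xs.T @ xs / len(idx) - sigma   # E[x x^T | y] - Sigma
        term1 += w * second @ second
        M += w * np.outer(mu, mu)
        trace_term += w * (mu @ mu)
    # the three terms of Eq. (6)
    return 2 * term1 + 2 * M @ M + 2 * trace_term * M

# toy system: only x0 (through its second moment) and x1 matter
rng = np.random.default_rng(0)
x = rng.standard_normal((5000, 5))
y = x[:, 0] ** 2 + x[:, 1] + 0.1 * rng.standard_normal(5000)
g = directional_regression_G(x, y)
print(np.round(np.diag(g), 2))  # large diagonal entries flag contributing variables
```

Note that \(x_0\) enters \(y\) only through \(x_0^2\), so it is picked up by the second-moment term of Eq. (6) rather than by the conditional mean, which is exactly the case the first part of Theorem 1 covers.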

Appendix B

Proof of Theorem 1

Since \(y\) is independent of \(x_{i}\) for \(i = d + 1, \ldots, q\), and \(x_{i} \in S_{c}\) and \(x_{j} \in S_{r}\) are independent, \(E\left[ {\mathbf{xx}}^{{\text{T}}} - \Sigma |y \right]\) can be represented as

$$ E\left[ {{\mathbf{xx}}^{{\text{T}}} - \Sigma |y} \right] = \left( {\begin{array}{*{20}c} {K_{11} } & 0 \\ 0 & 0 \\ \end{array} } \right),\quad K_{11} \in R^{d \times d} . $$

If \(E\left[ x_{i}^{2} |y \right],\;i = 1, \ldots, d\), is non-degenerate, then we have

$$ {\text{diag}} \left( {K_{11} } \right) = \left[ {\sigma_{1} , \ldots ,\sigma_{d} } \right] $$

where \(\left| {\sigma_{i} } \right| > 0,\;\;i = 1, \ldots ,d\).

$$ \begin{aligned} E^{2}\left[ {\textbf{x}}{\textbf{x}}^{{\text{T}}} - \Sigma |y \right] & = \left( \begin{array}{cc} K_{11} & 0 \\ 0 & 0 \end{array} \right)\left( \begin{array}{cc} K_{11} & 0 \\ 0 & 0 \end{array} \right) = \left( \begin{array}{cc} K_{11}^{2} & 0 \\ 0 & 0 \end{array} \right), \\ {\text{diag}}\left( K_{11}^{2} \right) & \ge \left[ \sigma_{1}^{2}, \ldots, \sigma_{d}^{2} \right] > 0, \end{aligned} $$
$$ \begin{aligned} \Rightarrow\; e_{i}^{{\text{T}}}\, 2E\left[ E^{2}\left[ {\textbf{x}}{\textbf{x}}^{{\text{T}}} - \Sigma |y \right] \right] e_{i} & = 2E\left[ e_{i}^{{\text{T}}} E^{2}\left[ {\textbf{x}}{\textbf{x}}^{{\text{T}}} - \Sigma |y \right] e_{i} \right] > 0, \quad i = 1, \ldots, d, \\ e_{i}^{{\text{T}}}\, 2E\left[ E^{2}\left[ {\textbf{x}}{\textbf{x}}^{{\text{T}}} - \Sigma |y \right] \right] e_{i} & = 2E\left[ e_{i}^{{\text{T}}} E^{2}\left[ {\textbf{x}}{\textbf{x}}^{{\text{T}}} - \Sigma |y \right] e_{i} \right] = 0, \quad i = d + 1, \ldots, q. \end{aligned} $$

If, instead, \(E\left[ x_{i} |y \right],\;i = 1, \ldots, d\), is non-degenerate, i.e., it is random and almost surely not a constant, we have

$$ E\left[ {E[{\mathbf{x}}|y]E\left[ {{\mathbf{x}}^{{\text{T}}} |y} \right]} \right] = \left( {\begin{array}{*{20}c} {U_{11} } & 0 \\ 0 & 0 \\ \end{array} } \right),\quad U_{11} \in R^{d \times d} $$

where \({\text{diag}} \left( {U_{11} } \right) = \left[ {\eta_{1} , \ldots ,\eta_{d} } \right] > 0\). Then

$$ E^{2} \left[ {E[{\mathbf{x}}|y]E\left[ {{\mathbf{x}}^{{\text{T}}} |y} \right]} \right] = \left( {\begin{array}{*{20}c} {U_{11} } & 0 \\ 0 & 0 \\ \end{array} } \right)\left( {\begin{array}{*{20}c} {U_{11} } & 0 \\ 0 & 0 \\ \end{array} } \right) $$

and \({\text{diag}} \left( {E^{2} \left[ {E[{\mathbf{x}}|y]E\left[ {{\mathbf{x}}^{{\text{T}}} |y} \right]} \right]} \right) \ge \left[ {\eta_{1}^{2} , \ldots ,\eta_{d}^{2} } \right] > 0\).

In addition, \(E\left[ E\left[ {\textbf{x}}^{{\text{T}}} |y \right] E[{\textbf{x}}|y] \right]\), the scalar coefficient multiplying \(E\left[ E[{\textbf{x}}|y]E\left[ {\textbf{x}}^{{\text{T}}} |y \right] \right]\), is strictly positive: \(E\left[ E\left[ {\textbf{x}}^{{\text{T}}} |y \right] E[{\textbf{x}}|y] \right] > 0\). Under Assumptions 1 and 2, we have

$$ \begin{aligned} & e_{i}^{{\text{T}}} \left\{ 2E^{2}\left[ E[{\textbf{x}}|y]E\left[ {\textbf{x}}^{{\text{T}}} |y \right] \right] + 2E\left[ E\left[ {\textbf{x}}^{{\text{T}}} |y \right] E[{\textbf{x}}|y] \right] E\left[ E[{\textbf{x}}|y]E\left[ {\textbf{x}}^{{\text{T}}} |y \right] \right] \right\} e_{i} > 0, \quad i = 1, \ldots, d, \\ & e_{i}^{{\text{T}}} \left\{ 2E^{2}\left[ E[{\textbf{x}}|y]E\left[ {\textbf{x}}^{{\text{T}}} |y \right] \right] + 2E\left[ E\left[ {\textbf{x}}^{{\text{T}}} |y \right] E[{\textbf{x}}|y] \right] E\left[ E[{\textbf{x}}|y]E\left[ {\textbf{x}}^{{\text{T}}} |y \right] \right] \right\} e_{i} = 0, \quad i = d + 1, \ldots, q. \end{aligned} $$

The above equality and inequality imply that \(e_{i}^{{\text{T}}} G e_{i} > 0\) for \(i = 1, \ldots, d\) and

$$ e_{i}^{{\text{T}}} G e_{i} = 0 \quad {\text{for}} \quad i = d + 1, \ldots, q. $$

This completes the proof.□
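Theorem 1 says the diagonal indicator \(e_{i}^{{\text{T}}} G e_{i}\) is strictly positive exactly for contributing variables. With a finite-sample estimate \(\hat{G}\), the indicators of redundant variables are small but nonzero, which is where the paper's penalty optimization comes in. The sketch below substitutes a simple relative cutoff as a hypothetical stand-in for that step; the function name and the `penalty` value are our assumptions, not the authors' algorithm.

```python
import numpy as np

def select_variables(G_hat, penalty=0.05):
    """Pick contributing variables from an estimated G matrix.

    By Theorem 1, e_i^T G e_i > 0 exactly for contributing
    variables. The relative threshold `penalty` is a
    hypothetical simplification of the paper's penalty
    optimization: indicators are scaled by the largest one
    and small values are declared redundant.
    """
    ind = np.diag(G_hat).copy()
    ind = ind / ind.max()            # scale-free indicators in [0, 1]
    return np.flatnonzero(ind > penalty)

# toy estimated G: variables 0 and 1 contribute, 2-4 are redundant
G_hat = np.diag([1.8, 0.9, 0.01, 0.02, 0.005])
print(select_variables(G_hat))  # → [0 1]
```

A sharper cutoff rule would tie `penalty` to the sample size, since the spurious indicators of redundant variables shrink as the estimate of \(G\) converges.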


About this article


Cite this article

Sun, B., Cai, Q.Y., Peng, Z.K. et al. Variable selection and identification of high-dimensional nonparametric nonlinear systems by directional regression. Nonlinear Dyn 111, 12101–12112 (2023). https://doi.org/10.1007/s11071-023-08488-6

