Strong Consistency of Log-Likelihood-Based Information Criterion in High-Dimensional Canonical Correlation Analysis

Sankhya A

Abstract

We consider the strong consistency of a log-likelihood-based information criterion in a normality-assumed canonical correlation analysis between q- and p-dimensional random vectors for a high-dimensional case such that the sample size n and number of dimensions p are large but p/n is less than 1. In general, strong consistency is a stricter property than weak consistency; thus, sufficient conditions for the former do not always coincide with those for the latter. We derive the sufficient conditions for the strong consistency of this log-likelihood-based information criterion for the high-dimensional case. It is shown that the sufficient conditions for strong consistency of several criteria are the same as those for weak consistency obtained by Yanagihara et al. (J. Multivariate Anal. 157, 70–86: 2017).

References

  • Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. Akadémiai Kiadó, Budapest, Akaike, H. (ed.), p. 995–1010.

  • Akaike, H. (1974). A new look at the statistical model identification. Institute of Electrical and Electronics Engineers Transactions on Automatic Control AC-19, 716–723.

  • Bozdogan, H. (1987). Model selection and Akaike’s information criterion (AIC): the general theory and its analytical extensions. Psychometrika 52, 345–370.

  • Fukui, K. (2015). Consistency of log-likelihood-based information criteria for selecting variables in high-dimensional canonical correlation analysis under nonnormality. Hiroshima Math. J. 45, 175–205.

  • Fujikoshi, Y. (1982). A test for additional information in canonical correlation analysis. Ann. Inst. Statist. Math. 34, 523–530.

  • Fujikoshi, Y. (1985). Selection of variables in discriminant analysis and canonical correlation analysis. North-Holland, Fujikoshi, Y. (ed.), p. 219–236.

  • Fujikoshi, Y., Ulyanov, V. V. and Shimizu, R. (2010). Multivariate statistics: High-dimensional and large-sample approximations. John Wiley & Sons Inc., Hoboken.

  • Hannan, E. J. and Quinn, B. G. (1979). The determination of the order of an autoregression. J. Roy. Statist. Soc. Ser. B 41, 190–195.

  • McKay, R. J. (1977). Variable selection in multivariate regression: an application of simultaneous test procedures. J. Roy. Statist. Soc., Ser. B 39, 371–380.

  • Nishii, R., Bai, Z. D. and Krishnaiah, P. R. (1988). Strong consistency information criterion for model selection in multivariate analysis. Hiroshima Math. J. 18, 451–462.

  • Oda, R. and Yanagihara, H. (2019). A fast and consistent variable selection method for high-dimensional multivariate linear regression with a large number of explanatory variables. TR No 19–1, Statistical Research Group, Hiroshima University.

  • Ogura, T. (2010). A variable selection method in principal canonical correlation analysis. Comput. Statist. Data Anal. 54, 1117–1123.

  • Srivastava, M. S. (2002). Methods of multivariate statistics. Wiley, New York.

  • Schwarz, G. (1978). Estimating the dimension of a model. Ann. Statist. 6, 461–464.

  • Timm, N. H. (2002). Applied multivariate analysis. Springer-Verlag, New York.

  • Yanagihara, H., Oda, R., Hashiyama, Y. and Fujikoshi, Y. (2017). High-Dimensional asymptotic behaviors of differences between the log-determinants of two Wishart matrices. J. Multivariate Anal. 157, 70–86.

Acknowledgments

The authors would like to thank the reviewers for valuable comments. Ryoya Oda was supported by a Research Fellowship for Young Scientists from the Japan Society for the Promotion of Science, #18J12123. Hirokazu Yanagihara and Yasunori Fujikoshi were partially supported by Grants-in-Aid for Scientific Research (C) from the Ministry of Education, Science, Sports, and Culture, #18K03415 and #16K00047, respectively.

Author information

Corresponding author

Correspondence to Ryoya Oda.

Appendices

Appendix A: Proof of Lemma 1

From the assumption of Lemma 1, the following reductions can be derived:

$$ \begin{array}{@{}rcl@{}} 1&=&P\left( \bigcap\limits_{k=1}^{\infty}\bigcup\limits_{m=1}^{\infty}\bigcap\limits_{n=m}^{\infty}\left\{\left|\frac{1}{h_{j,\ell}}\left\{{\mathrm{LLIC}}(\ell)-{\mathrm{LLIC}}(j)\right\}-\tau_{j,\ell}\right|<\frac{1}{k}\right\}\right)\\ &=&P\left( \bigcap\limits_{k=1}^{\infty}\bigcup\limits_{m=1}^{\infty}\bigcap\limits_{n=m}^{\infty}\left\{-\frac{1}{k}+\tau_{j,\ell}\!<\!\frac{1}{h_{j,\ell}}\left\{{\mathrm{LLIC}}(\ell)-{\mathrm{LLIC}}(j)\right\}\!<\!\frac{1}{k}+\tau_{j,\ell}\right\}\right)\\ &\leq & P\left( \bigcap\limits_{k=1}^{\infty}\bigcup\limits_{m=1}^{\infty}\bigcap\limits_{n=m}^{\infty}\left\{-\frac{1}{k}+\tau_{j,\ell}<\frac{1}{h_{j,\ell}}\left\{{\mathrm{LLIC}}(\ell)-{\mathrm{LLIC}}(j)\right\}\right\}\right)\\ &\leq& P\left( \bigcup\limits_{m=1}^{\infty}\bigcap\limits_{n=m}^{\infty}\left\{\frac{1}{h_{j,\ell}}\left\{{\mathrm{LLIC}}(\ell)-{\mathrm{LLIC}}(j)\right\}>\tau_{j,\ell}\right\}\right)\\ &\leq& P\left( \bigcup\limits_{m=1}^{\infty}\bigcap\limits_{n=m}^{\infty}\left\{\frac{1}{h_{j,\ell}}\left\{{\mathrm{LLIC}}(\ell)-{\mathrm{LLIC}}(j)\right\}>0\right\}\right). \end{array} $$

Hence, we have

$$ \begin{array}{@{}rcl@{}} P(\hat{j} \rightarrow j)&=&P\left( \bigcup\limits_{m=1}^{\infty}\bigcap\limits_{n=m}^{\infty}\{\hat{j}= j\}\right)\\ &=&1-P\left( \bigcup\limits_{\ell\in\mathcal{J}\backslash\{j\}}\bigcap\limits_{m=1}^{\infty}\bigcup\limits_{n=m}^{\infty}\left\{{\mathrm{LLIC}}(\ell)<{\mathrm{LLIC}}(j)\right\}\right)\\ &\geq& 1-\sum\limits_{\ell\in\mathcal{J}\backslash\{j\}}P\left( \bigcap\limits_{m=1}^{\infty}\bigcup\limits_{n=m}^{\infty}\left\{{\mathrm{LLIC}}(\ell)-{\mathrm{LLIC}}(j)<0\right\}\right)\\ &=&1. \end{array} $$

This completes the proof of Lemma 1.

Appendix B: Proof of Lemma 2

Let us take an arbitrary ε > 0, and let k be a natural number such that \(k > (2\varepsilon)^{-1}\). By using Markov’s inequality, for all δ > 0, we have

$$ \begin{array}{@{}rcl@{}} P(np^{-1/2-\varepsilon}|t_{1}-E[t_{1}]|>\delta)&\leq&\frac{1}{(n^{-1}p^{1/2+\varepsilon}\delta)^{2k}}E[(t_{1}-E[t_{1}])^{2k}]\\ &=&O(p^{-2k\varepsilon}),\\ P(n^{-3/4-\varepsilon}||\boldsymbol{T}-E[\boldsymbol{T}]||>\delta)& \leq&\frac{1}{(n^{3/4+\varepsilon}\delta)^{4}}E\left[||\boldsymbol{T}-E[\boldsymbol{T}]||^{4}\right]\\ &=&O(n^{-1-\varepsilon}). \end{array} $$

Then, since p = p(n) and \(k > (2\varepsilon)^{-1}\), it holds that \(\sum ^{\infty }_{n=1}p^{-2k\varepsilon }<\infty \) and \(\sum ^{\infty }_{n=1}n^{-1-\varepsilon }<\infty \). These bounds and the Borel-Cantelli lemma complete the proof of Lemma 2.

Appendix C: Proof of Theorem 1

To prove Theorem 1, we use three lemmas from Yanagihara et al. (2017) and Oda and Yanagihara (2019). Before the first lemma is introduced, let \(\boldsymbol{Q}\) be an n × (n − 1) matrix satisfying \(\boldsymbol {I}_{n}-n^{-1}\boldsymbol {1}_{n}\boldsymbol {1}^{\prime }_{n}=\boldsymbol {Q}\boldsymbol {Q}^{\prime }\) and \(\boldsymbol{Q}^{\prime}\boldsymbol{Q}=\boldsymbol{I}_{n-1}\). Further, let \(\boldsymbol{X}=(\boldsymbol{x}_{1},\ldots ,\boldsymbol{x}_{n})^{\prime}\), where \(\boldsymbol{x}_{i}\) is the i-th observation of \(\boldsymbol{x}\). The following lemma is Lemma C.1 of Yanagihara et al. (2017).

Lemma C.1.

For a subset \(j \in \mathcal {J}\), let \(\boldsymbol {\mathcal {E}}\), \(\boldsymbol{A}_{j}\), and \(\boldsymbol{B}_{j}\) be mutually independent random matrices, which are distributed according to

$$ \begin{array}{@{}rcl@{}} &\boldsymbol{\mathcal{E}} \sim N_{(n-1)\times p}(\boldsymbol{O}_{n-1,p},\boldsymbol{I}_{p}\otimes \boldsymbol{I}_{n-1}),\\ &\boldsymbol{A}_{j}\sim N_{(n-1)\times (q-q_{j})}(\boldsymbol{O}_{n-1,q-q_{j}}, \boldsymbol{I}_{q-q_{j}}\otimes \boldsymbol{I}_{n-1}),\\ &\boldsymbol{B}=\boldsymbol{Q}^{\prime}\boldsymbol{X}=(\boldsymbol{B}_{j}, \boldsymbol{B}_{\bar{j}})\sim N_{(n-1)\times q}(\boldsymbol{O}_{n-1,q}, \boldsymbol{\varSigma}_{xx}\otimes \boldsymbol{I}_{n-1}), \end{array} $$

where \(\boldsymbol {\mathcal {E}}\) and \(\boldsymbol{B}\) are independent and do not depend on j, and \(\boldsymbol{B}_{j}: (n-1)\times q_{j}\). Then, we have

$$ \begin{array}{@{}rcl@{}} (n-1)\boldsymbol{S}_{yy\cdot x}&=&\boldsymbol{\varSigma}^{1/2}_{yy\cdot x}\boldsymbol{\mathcal{E}}^{\prime}(\boldsymbol{I}_{n-1}-\boldsymbol{P})\boldsymbol{\mathcal{E}}\boldsymbol{\varSigma}^{1/2}_{yy\cdot x},\\ (n-1)\boldsymbol{S}_{yy\cdot j}&=&\boldsymbol{\varSigma}^{1/2}_{yy\cdot x}(\boldsymbol{A}_{j}\boldsymbol{\Gamma}^{\prime}_{j}+\boldsymbol{\mathcal{E}})'(\boldsymbol{I}_{n-1}-\boldsymbol{P}_{j})(\boldsymbol{A}_{j}\boldsymbol{\Gamma}^{\prime}_{j}+\boldsymbol{\mathcal{E}})\boldsymbol{\varSigma}^{1/2}_{yy\cdot x}, \end{array} $$

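where \(\boldsymbol{P}=\boldsymbol{B}(\boldsymbol{B}^{\prime}\boldsymbol{B})^{-1}\boldsymbol{B}^{\prime}\), \(\boldsymbol {P}_{j}=\boldsymbol {B}_{j}(\boldsymbol {B}^{\prime }_{j}\boldsymbol {B}_{j})^{-1}\boldsymbol {B}^{\prime }_{j}\), and \(\boldsymbol{\Gamma}_{j}\) is defined in Eq. 3.3.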
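
As a purely illustrative aside, a matrix \(\boldsymbol{Q}\) with the two properties required before Lemma C.1 can be built numerically from the eigenvectors of the centering matrix. The sketch below is one possible construction, not the one used by the authors; the helper name `centering_factor` and the size n = 6 are assumptions made for the example.

```python
import numpy as np

def centering_factor(n):
    """Return an n x (n-1) matrix Q with Q Q' = I_n - 11'/n and Q'Q = I_{n-1}.

    Hypothetical helper: any orthonormal basis of the orthogonal
    complement of the vector of ones would serve equally well.
    """
    C = np.eye(n) - np.ones((n, n)) / n   # centering matrix (symmetric, idempotent)
    _, eigvecs = np.linalg.eigh(C)        # eigenvalues are returned in ascending order
    # C has eigenvalue 0 once (direction of 1_n) and eigenvalue 1 with
    # multiplicity n-1, so dropping the first eigenvector leaves the basis we need.
    return eigvecs[:, 1:]

n = 6                                     # illustrative sample size (assumption)
Q = centering_factor(n)
C = np.eye(n) - np.ones((n, n)) / n
print(np.allclose(Q @ Q.T, C), np.allclose(Q.T @ Q, np.eye(n - 1)))
```

With such a \(\boldsymbol{Q}\), the product \(\boldsymbol{Q}^{\prime}\boldsymbol{X}\) removes the sample mean and leaves n − 1 rows, which is why the matrices in Lemma C.1 have n − 1 rows.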

The following lemma is given by using (23) and (B.6) in Yanagihara et al. (2017).

Lemma C.2.

For a subset \(j \in \mathcal {J}\), let \(\boldsymbol{U}_{1}\) and \(\boldsymbol{U}_{2}\) be independent random matrices distributed according to

$$ \begin{array}{@{}rcl@{}} \boldsymbol{U}_{1}&\sim& N_{(n-q-1)\times p}(\boldsymbol{O}_{n-q-1,p},\boldsymbol{I}_{p} \otimes \boldsymbol{I}_{n-q-1}),\\ \boldsymbol{U}_{2}&\sim& N_{(q-q_{j})\times p}(\boldsymbol{O}_{q-q_{j},p},\boldsymbol{I}_{p} \otimes \boldsymbol{I}_{q-q_{j}}). \end{array} $$

Further, let \(\boldsymbol{W}_{1}\) and \(\boldsymbol{W}_{2}\) be random matrices distributed according to

$$ \begin{array}{@{}rcl@{}} \boldsymbol{W}_{1}&\sim& W_{q-q_{j}}(n-p+q-2q_{j}-1,\boldsymbol{I}_{q-q_{j}}),\\ \boldsymbol{W}_{2}&\sim& W_{q-q_{j}}(n-p+q-2q_{j}-1,\boldsymbol{I}_{q-q_{j}}). \end{array} $$

Then, we have

$$ \log{\frac{|\boldsymbol{S}_{yy\cdot j}|}{|\boldsymbol{S}_{yy\cdot x}|}}=\delta_{j}+\log{\frac{|\boldsymbol{U}^{\prime}_{1}\boldsymbol{U}_{1}+\boldsymbol{U}^{\prime}_{2}\boldsymbol{U}_{2}|}{|\boldsymbol{U}^{\prime}_{1}\boldsymbol{U}_{1}|}}+\log{\frac{|\boldsymbol{W}_{1}|}{|\boldsymbol{W}_{2}|}}, $$

where \(\delta_{j}\) is defined in Eq. 3.3.

The following lemma is Lemma C.2 in Oda and Yanagihara (2019).

Lemma C.3.

Suppose that N − 4k > 0 for \(k\in \mathbb {N}\). Let u and v be independent random variables distributed according to \(u\sim \chi^{2}(N)\) and \(v\sim \chi^{2}(p)\). Then, we have

$$ E\left[\left( \frac{v}{u}-\frac{p}{N-2}\right)^{2k}\right]=O(p^{k}N^{-2k}). $$
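
As a side calculation (not part of the cited lemma), the case k = 1 can be checked by hand from standard chi-square moments, namely \(E[v]=p\), \(E[v^{2}]=p(p+2)\), \(E[u^{-1}]=(N-2)^{-1}\), and \(E[u^{-2}]=\{(N-2)(N-4)\}^{-1}\) for N > 4. Since \(E[v/u]=p/(N-2)\), the left-hand side is the variance of v/u:

$$ E\left[\left( \frac{v}{u}-\frac{p}{N-2}\right)^{2}\right]=\frac{p(p+2)}{(N-2)(N-4)}-\frac{p^{2}}{(N-2)^{2}}=\frac{2p(p+N-2)}{(N-2)^{2}(N-4)}. $$

Under the additional assumption, made here only for illustration, that p/N stays bounded, the right-hand side is \(O(pN^{-2})\), in agreement with the stated order for k = 1.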

First, we consider the case of \(j \in \mathcal {J}_{+}\backslash \{j_{*}\}\). Denote the distinct elements of \(j\cap \bar {j}_{*}\) by \(a_{1},\ldots ,a_{q_{j}-q_{*}}\). Let \(j_{0}=j\) and \(j_{i}=j_{i-1}\backslash \{a_{i}\}\) \((1\leq i\leq q_{j}-q_{*})\). Then, \(j_{q_{j}-q_{*}}=j_{*}\) holds, and we can express \({\mathrm{LLIC}}(j)-{\mathrm{LLIC}}(j_{*})\) as follows:

$$ \begin{array}{@{}rcl@{}} {\mathrm{LLIC}}(j)-{\mathrm{LLIC}}(j_{*})\!\!&=&\!\!(n-1)\log{\frac{|\boldsymbol{S}_{yy\cdot j}|} {|\boldsymbol{S}_{yy\cdot j_{*}}|}}+m(j)-m(j_{*})\\ \!\!&=&\!\!(n-1)\sum\limits_{i=1}^{q_{j}-q_{*}}\log{\frac{|\boldsymbol{S}_{yy\cdot j_{i-1}}|}{|\boldsymbol{S}_{yy\cdot j_{i}}|}}+m(j)-m(j_{*}).{\kern23pt} \end{array} $$
(C.1)

Then, from Lemma C.1, \(\boldsymbol {S}_{yy\cdot j_{i-1}}\) can be expressed as follows:

$$ (n-1)\boldsymbol{S}_{yy\cdot j_{i-1}}=\boldsymbol{\varSigma}^{1/2}_{yy \cdot x}\boldsymbol{\mathcal{E}}^{\prime}(\boldsymbol{I}_{n-1}-\boldsymbol{P}_{j_{i-1}})\boldsymbol{\mathcal{E}}\boldsymbol{\varSigma}^{1/2}_{yy \cdot x}, $$
(C.2)

where \(\boldsymbol {\mathcal {E}} \sim N_{(n-1)\times p}(\boldsymbol {O}_{n-1,p},\boldsymbol {I}_{p}\otimes \boldsymbol {I}_{n-1})\), \(\boldsymbol {P}_{j_{i-1}}=\boldsymbol {B}_{j_{i-1}}(\boldsymbol {B}^{\prime }_{j_{i-1}}\boldsymbol {B}_{j_{i-1}})^{-1}\boldsymbol {B}^{\prime }_{j_{i-1}}\), \(\boldsymbol {B}_{j_{i-1}}\sim N_{(n-1)\times (q_{j}-i+1)}(\boldsymbol {O}_{n-1,q_{j}-i+1},\boldsymbol {\varSigma }_{j_{i-1}j_{i-1}}\otimes \boldsymbol {I}_{n-1})\), and \(\boldsymbol {\mathcal {E}}\) is independent of \(\boldsymbol {B}_{j_{i-1}}\). Moreover, by applying Lemma C.1 to \(\boldsymbol {S}_{yy\cdot j_{i}}\), we have

$$ (n-1)\boldsymbol{S}_{yy\cdot j_{i}}=\boldsymbol{\varSigma}^{1/2}_{yy \cdot x}\boldsymbol{\mathcal{E}}^{\prime}(\boldsymbol{I}_{n-1}-\boldsymbol{P}_{j_{i}})\boldsymbol{\mathcal{E}}\boldsymbol{\varSigma}^{1/2}_{yy \cdot x}, $$
(C.3)

where \(\boldsymbol {P}_{j_{i}}=\boldsymbol {B}_{j_{i}}(\boldsymbol {B}^{\prime }_{j_{i}}\boldsymbol {B}_{j_{i}})^{-1}\boldsymbol {B}^{\prime }_{j_{i}}\), and \(\boldsymbol {B}_{j_{i}}\) is the \((n-1)\times (q_{j}-i)\) submatrix of \(\boldsymbol {B}_{j_{i-1}}=(\boldsymbol {B}_{j_{i}},\boldsymbol {b}_{j_{i}})\). Let

$$ \boldsymbol{V}_{i,1}=\boldsymbol{\mathcal{E}}^{\prime}(\boldsymbol{I}_{n-1}-\boldsymbol{P}_{j_{i-1}})\boldsymbol{\mathcal{E}}, \ \boldsymbol{V}_{i,2}=\boldsymbol{\mathcal{E}}^{\prime}(\boldsymbol{P}_{j_{i-1}}-\boldsymbol{P}_{j_{i}})\boldsymbol{\mathcal{E}}. $$
(C.4)

Since \((\boldsymbol {I}_{n-1}-\boldsymbol {P}_{j_{i-1}})(\boldsymbol {P}_{j_{i-1}}-\boldsymbol {P}_{j_{i}})=\boldsymbol {O}_{n-1,n-1}\) holds, we observe that \(\boldsymbol{V}_{i,1}\) and \(\boldsymbol{V}_{i,2}\) are independent, and \(\boldsymbol{V}_{i,1}\sim W_{p}(n-q_{j}+i-2,\boldsymbol{I}_{p})\), \(\boldsymbol{V}_{i,2}\sim W_{p}(1,\boldsymbol{I}_{p})\) from a property of the Wishart distribution and Cochran’s Theorem (see, e.g., Fujikoshi et al. (2010), Theorem 2.4.2). By using Eqs. C.2, C.3, and C.4, we have

$$ \begin{array}{@{}rcl@{}} \frac{|\boldsymbol{S}_{yy\cdot j_{i}}|}{|\boldsymbol{S}_{yy\cdot j_{i-1}}|}&=&\frac{|\boldsymbol{\mathcal{E}}^{\prime}(\boldsymbol{I}_{n-1}-\boldsymbol{P}_{j_{i}})\boldsymbol{\mathcal{E}}|}{|\boldsymbol{\mathcal{E}}^{\prime}(\boldsymbol{I}_{n-1}-\boldsymbol{P}_{j_{i-1}})\boldsymbol{\mathcal{E}}|}\\ &=&\frac{|\boldsymbol{\mathcal{E}}^{\prime}(\boldsymbol{I}_{n-1}-\boldsymbol{P}_{j_{i-1}})\boldsymbol{\mathcal{E}}+\boldsymbol{\mathcal{E}}^{\prime}(\boldsymbol{P}_{j_{i-1}}-\boldsymbol{P}_{j_{i}})\boldsymbol{\mathcal{E}}|}{|\boldsymbol{\mathcal{E}}^{\prime}(\boldsymbol{I}_{n-1}-\boldsymbol{P}_{j_{i-1}})\boldsymbol{\mathcal{E}}|}\\ &=&\frac{|\boldsymbol{V}_{i,1}+\boldsymbol{V}_{i,2}|}{|\boldsymbol{V}_{i,1}|}. \end{array} $$
(C.5)

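Since \(\boldsymbol{V}_{i,2}\sim W_{p}(1,\boldsymbol{I}_{p})\), we can express \(\boldsymbol {V}_{i,2}=\boldsymbol {v}_{i}\boldsymbol {v}^{\prime }_{i}\), where \(\boldsymbol{v}_{i}\sim N_{p}(\boldsymbol{0}_{p},\boldsymbol{I}_{p})\) and \(\boldsymbol{v}_{i}\) is independent of \(\boldsymbol{V}_{i,1}\). Then, Eq. C.5 is calculated as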

$$ \begin{array}{@{}rcl@{}} \frac{|\boldsymbol{S}_{yy\cdot j_{i}}|}{|\boldsymbol{S}_{yy\cdot j_{i-1}}|}&=&|\boldsymbol{I}_{p}+\boldsymbol{V}^{-1}_{i,1}\boldsymbol{V}_{i,2}|\\ &=&1+\boldsymbol{v}^{\prime}_{i}\boldsymbol{V}^{-1}_{i,1}\boldsymbol{v}_{i}\\ &=&1+\frac{||\boldsymbol{v}_{i}||^{2}}{\left( ||\boldsymbol{v}_{i}||^{-1}\boldsymbol{v}^{\prime}_{i}\boldsymbol{V}^{-1}_{i,1}\boldsymbol{v}_{i}||\boldsymbol{v}_{i}||^{-1}\right)^{-1}}. \end{array} $$
(C.6)
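
The step from the determinant to the scalar expression in Eq. C.6 is the rank-one determinant identity \(|\boldsymbol{I}_{p}+\boldsymbol{V}^{-1}\boldsymbol{v}\boldsymbol{v}^{\prime}|=1+\boldsymbol{v}^{\prime}\boldsymbol{V}^{-1}\boldsymbol{v}\). A minimal numerical sketch of that identity is given below; the dimensions, the seed, and all variable names are assumptions chosen for illustration and do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
p, dof = 5, 12                        # illustrative sizes (assumed)

E = rng.standard_normal((dof, p))
V1 = E.T @ E                          # one draw from W_p(dof, I_p), positive definite a.s.
v = rng.standard_normal(p)            # v ~ N_p(0, I_p)
V2 = np.outer(v, v)                   # rank-one matrix v v'

# Left-hand side of (C.5): |V1 + V2| / |V1|
lhs = np.linalg.det(V1 + V2) / np.linalg.det(V1)

# Middle line of (C.6): 1 + v' V1^{-1} v
quad = v @ np.linalg.solve(V1, v)
rhs = 1.0 + quad

# Last line of (C.6): 1 + ||v||^2 / u_tilde, with u_tilde = ||v||^2 / (v' V1^{-1} v)
v_tilde = v @ v
u_tilde = v_tilde / quad
print(lhs, rhs, 1.0 + v_tilde / u_tilde)
```

The three printed values should coincide up to floating-point error, which is all the algebra in Eq. C.6 asserts; the distributional claims that follow are a separate matter.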

Let \(\tilde {v}_{i}=||\boldsymbol {v}_{i}||^{2}\) and \(\tilde {u}_{i}=\left (||\boldsymbol {v}_{i}||^{-1}\boldsymbol {v}^{\prime }_{i}\boldsymbol {V}^{-1}_{i,1}\boldsymbol {v}_{i}||\boldsymbol {v}_{i}||^{-1}\right )^{-1}\). Then, from a property of the Wishart distribution (see, e.g., Fujikoshi et al. (2010), Theorem 2.3.3), we see that \(\tilde {v}_{i}\) and \(\tilde {u}_{i}\) are independent, and \(\tilde {v}_{i} \sim \chi ^{2}(p)\) and \(\tilde {u}_{i} \sim \chi ^{2}(n-p-q_{j}+i-1)\). Then, Eq. C.6 is expressed as

$$ \frac{|\boldsymbol{S}_{yy\cdot j_{i}}|}{|\boldsymbol{S}_{yy\cdot j_{i-1}}|} =1+\frac{\tilde{v}_{i}}{\tilde{u}_{i}}. $$

From Lemma C.3, by applying Eq. 3.1 in Lemma 2 to the above equation, for all ε satisfying 0 < ε ≤ 1/2, the following equation can be derived:

$$ \begin{array}{@{}rcl@{}} \log{\frac{|\boldsymbol{S}_{yy\cdot j_{i}}|}{|\boldsymbol{S}_{yy\cdot j_{i-1}}|}}&=&\log{\left( 1+\frac{p}{n-p-q_{j}+i-3}+o(p^{1/2+\varepsilon}n^{-1})\right)}\\ &=&\log{\frac{n}{n-p}}+o(p^{1/2+\varepsilon}n^{-1}), \ a.s. \end{array} $$

From the above equation, we have

$$ \begin{array}{@{}rcl@{}} \log{\frac{|\boldsymbol{S}_{yy\cdot j}|}{|\boldsymbol{S}_{yy\cdot j_{*}}|}} &=&-\sum\limits_{i=1}^{q_{j}-q_{*}}\log{\frac{|\boldsymbol{S}_{yy\cdot j_{i}}|}{|\boldsymbol{S}_{yy\cdot j_{i-1}}|}}\\ &=& (q_{j}-q_{*})\log{\left( 1-\frac{p}{n}\right)}+o(p^{1/2+\varepsilon}n^{-1}), \ a.s. \end{array} $$
(C.7)
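
The leading term \(\log\{n/(n-p)\}\) for a single removal step can be illustrated by simulation. The sketch below draws one realization of the matrices in Eqs. C.2 and C.3 and compares the simulated log-determinant ratio with the leading term. It is an illustration under stated assumptions, not a verification of the almost-sure statement: the function name `log_det_ratio`, the sizes n = 1200 and p = 300, the seed, and the use of identity covariance for the columns of \(\boldsymbol{B}\) are all choices made here for the example.

```python
import numpy as np

def log_det_ratio(n, p, q_prev, rng):
    """Simulate log(|E'(I - P_i)E| / |E'(I - P_{i-1})E|) for one draw.

    P_{i-1} projects onto q_prev Gaussian columns and P_i onto the
    first q_prev - 1 of them (q_prev >= 2 is assumed here).
    """
    E = rng.standard_normal((n - 1, p))
    B_prev = rng.standard_normal((n - 1, q_prev))

    def residual_gram(B):
        Qb, _ = np.linalg.qr(B)          # orthonormal basis of the column space of B
        R = E - Qb @ (Qb.T @ E)          # (I - P) E without forming the n x n projector
        return R.T @ R

    _, logdet_prev = np.linalg.slogdet(residual_gram(B_prev))
    _, logdet_curr = np.linalg.slogdet(residual_gram(B_prev[:, :-1]))
    return logdet_curr - logdet_prev

rng = np.random.default_rng(1)
n, p = 1200, 300                          # illustrative sizes with p/n = 0.25 (assumed)
simulated = log_det_ratio(n, p, q_prev=3, rng=rng)
leading = np.log(n / (n - p))
print(simulated, leading)
```

For sizes of this order the two printed numbers are expected to be close but not equal, since the remainder term in the expansion is random and only vanishes asymptotically.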

Therefore, from Eqs. C.1 and C.7, we can expand \(p^{-1}\{{\mathrm{LLIC}}(j)-{\mathrm{LLIC}}(j_{*})\}\) as follows:

$$ \begin{array}{@{}rcl@{}} &&\frac{1}{p}\left\{{\mathrm{LLIC}}(j)-{\mathrm{LLIC}}(j_{*})\right\}\\ &=&(q_{j}-q_{*})\frac{n}{p}\log{\left( 1-\frac{p}{n}\right)}+\frac{1}{p}\{m(j)-m(j_{*})\}+o(p^{-1/2+\varepsilon}), \ a.s.{\kern19pt} \end{array} $$
(C.8)
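
To read Eq. C.8, it may help to look at the first term on the right-hand side in the illustrative case where p/n converges to a constant c with 0 < c < 1 (an assumption restated here only for the example). Because \(\log (1-x)<-x\) for 0 < x < 1,

$$ \frac{n}{p}\log{\left( 1-\frac{p}{n}\right)}\rightarrow \frac{1}{c}\log{(1-c)}<-1. $$

For instance, c = 1/2 gives \(2\log (1/2)\approx -1.386\). In this case the first term tends to a negative multiple of \(q_{j}-q_{*}\), so the right-hand side of Eq. C.8 can have a positive limit only if the penalty difference \(p^{-1}\{m(j)-m(j_{*})\}\) eventually outweighs it.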

Next, we consider the case of \(j \in \mathcal {J}_{-}\). By using Lemma C.2, we have

$$ \log{\frac{|\boldsymbol{S}_{yy\cdot j}|}{|\boldsymbol{S}_{yy\cdot x}|}}=\delta_{j}+\log{\frac{|\boldsymbol{U}^{\prime}_{1}\boldsymbol{U}_{1}+\boldsymbol{U}^{\prime}_{2}\boldsymbol{U}_{2}|}{|\boldsymbol{U}^{\prime}_{1}\boldsymbol{U}_{1}|}}+\log{\frac{|\boldsymbol{W}_{1}|}{|\boldsymbol{W}_{2}|}}, $$
(C.9)

where \(\boldsymbol{U}_{1}\), \(\boldsymbol{U}_{2}\), \(\boldsymbol{W}_{1}\), and \(\boldsymbol{W}_{2}\) are defined in Lemma C.2. Let

$$ \tilde{\boldsymbol{U}}=(\boldsymbol{U}_{2}\boldsymbol{U}^{\prime}_{2})^{1/2}\{\boldsymbol{U}_{2}(\boldsymbol{U}_{1}^{\prime}\boldsymbol{U}_{1})^{-1}\boldsymbol{U}^{\prime}_{2}\}^{-1}(\boldsymbol{U}_{2}\boldsymbol{U}^{\prime}_{2})^{1/2}. $$

From a property of the Wishart distribution, we observe that \(\tilde {\boldsymbol {U}}\) and \(\boldsymbol{U}_{2}\) are independent and \(\tilde {\boldsymbol {U}} \sim W_{q-q_{j}}(n-p-q_{j}-1,\boldsymbol {I}_{q-q_{j}})\). Then, Eq. C.9 is expressed as

$$ \log{\frac{|\boldsymbol{S}_{yy\cdot j}|}{|\boldsymbol{S}_{yy\cdot x}|}}=\delta_{j}+\log{|\boldsymbol{I}_{q-q_{j}}+\tilde{\boldsymbol{U}}^{-1}\boldsymbol{U}_{2}\boldsymbol{U}^{\prime}_{2}|}+\log{\frac{|\boldsymbol{W}_{1}|}{|\boldsymbol{W}_{2}|}}. $$
(C.10)

By a simple calculation, we can note that \(E[||\tilde {\boldsymbol {U}}-E[\tilde {\boldsymbol {U}}]||^{4}]=O(n^{2})\), \(E[||\boldsymbol {U}_{2}\boldsymbol {U}^{\prime }_{2}-E[\boldsymbol {U}_{2}\boldsymbol {U}^{\prime }_{2}]||^{4}]=O(p^{2})\), and \(E[||\boldsymbol{W}_{1}-E[\boldsymbol{W}_{1}]||^{4}]=O(n^{2})\). Hence, we can apply Eq. 3.2 in Lemma 2 to \(\tilde {\boldsymbol {U}}\), \(\boldsymbol {U}_{2}\boldsymbol {U}^{\prime }_{2}\), \(\boldsymbol{W}_{1}\), and \(\boldsymbol{W}_{2}\). By Taylor expansion, for all δ satisfying 0 < δ < 1/4, the following equations can be derived:

$$ \begin{array}{@{}rcl@{}} &&\log{|\boldsymbol{I}_{q-q_{j}}+\tilde{\boldsymbol{U}}^{-1}\boldsymbol{U}_{2}\boldsymbol{U}^{\prime}_{2}|}\\ &=&(q-q_{j})\log{\frac{n}{n-p}}+o\left( p^{3/4+\delta}n^{-1}\right)+o\left( pn^{-5/4+\delta}\right), \ a.s., \end{array} $$
(C.11)
$$ \begin{array}{@{}rcl@{}} \log{\frac{|\boldsymbol{W}_{1}|}{|\boldsymbol{W}_{2}|}}\!\!&=&\!\!o(n^{-1/4+\delta}), \ a.s. \end{array} $$
(C.12)

From (C.10)-(C.12), we have

$$ \log{\frac{|\boldsymbol{S}_{yy\cdot j}|}{|\boldsymbol{S}_{yy\cdot x}|}}=\delta_{j}+(q-q_{j})\log{\frac{n}{n-p}}+o(1), \ a.s. $$
(C.13)

Therefore, from Eqs. C.7 and C.13, we can expand \(n^{-1}\{{\mathrm{LLIC}}(j)-{\mathrm{LLIC}}(j_{*})\}\) as follows:

$$ \begin{array}{@{}rcl@{}} &&\frac{1}{n}\{{\mathrm{LLIC}}(j)-{\mathrm{LLIC}}(j_{*})\}\\ &=&\frac{1}{n}\left\{(n-1)\log{\frac{|\boldsymbol{S}_{yy\cdot j}|}{|\boldsymbol{S}_{yy\cdot x}|}}+(n-1)\log{\frac{|\boldsymbol{S}_{yy\cdot x}|}{|\boldsymbol{S}_{yy\cdot j_{*}}|}}+m(j)-m(j_{*})\right\}\\ &=&\left( 1-\frac{1}{n}\right)\delta_{j}+(q_{j}-q_{*})\log{\left( 1-\frac{p}{n}\right)}+\frac{1}{n}\{m(j)-m(j_{*})\}+o(1), \ a.s. \\ \end{array} $$
(C.14)

Lemma 1, Eqs. C.8 and C.14 complete the proof of Theorem 1.

Appendix D: Proof of Corollary 1

First, we derive Condition C1 from Condition C1′. When \(j \in \mathcal {J}_{+}\backslash \{j_{*}\}\), denote the distinct elements of \(j\cap \bar {j}_{*}\) by \(a_{1},\ldots ,a_{q_{j}-q_{*}}\) in the same way as in the proof of Theorem 1. Let \(j_{0}=j\) and \(j_{i}=j_{i-1}\backslash \{a_{i}\}\) \((1\leq i\leq q_{j}-q_{*})\). Then, we have

$$ m(j)-m(j_{*})=\sum^{q_{j}-q_{*}}_{i=1}\{m(j_{i-1})-m(j_{i})\}\geq (q_{j}-q_{*})\{m(\ell_{2})-m(\ell_{1})\}. $$

Since \(q_{j}-q_{*}>0\), it follows from Condition C1′ and the above inequality that Condition C1 holds.

Next, we derive Condition C2 from Condition C2′. When \(j \in \mathcal {J}_{-}\), let \(j_{+}=j\cup j_{*}\), and let \(k_{0}=j\), \(k_{i}=k_{i-1}\cup \{b_{i}\}\) \((1 \leq i \leq q_{j_{*}\cap \bar {j}})\) and \(s_{0}=j_{*}\), \(s_{i}=s_{i-1}\cup \{c_{i}\}\) \((1 \leq i \leq q_{j\cap \bar {j}_{*}})\), where \(b_{1},\ldots ,b_{q_{j_{*}\cap \bar {j}}}\) and \(c_{1},\ldots ,c_{q_{j\cap \bar {j}_{*}}}\) are the distinct elements of \(j_{*}\cap \bar {j}\) and \(j\cap \bar {j}_{*}\), respectively, so that both chains end at \(j_{+}\). Then, we have

$$ \begin{array}{@{}rcl@{}} m(j)-m(j_{+})&=&-\sum\limits_{i=1}^{q_{j_{*}\cap \bar{j}}}\{m(k_{i})-m(k_{i-1})\}\geq -q_{j_{*}\cap \bar{j}}\{m(\ell_{q})-m(\ell_{q-1})\},\\ m(j_{+})-m(j_{*})&=&\sum\limits_{i=1}^{q_{j\cap \bar{j}_{*}}}\{m(s_{i})-m(s_{i-1})\}\geq q_{j\cap \bar{j}_{*}}\{m(\ell_{2})-m(\ell_{1})\}. \end{array} $$

Therefore, Condition C2 follows from Condition C2′ and the above inequalities.

Cite this article

Oda, R., Yanagihara, H. & Fujikoshi, Y. Strong Consistency of Log-Likelihood-Based Information Criterion in High-Dimensional Canonical Correlation Analysis. Sankhya A 83, 109–127 (2021). https://doi.org/10.1007/s13171-019-00174-3
