Accurate Inference for Repeated Measures in High Dimensions

Abstract: This paper proposes inferential methods for high-dimensional repeated measures in factorial designs. High-dimensional refers to the situation where the dimension grows with the sample size such that either one could be larger than the other. The most important contribution is the high accuracy of the methods, in the sense that p-values, for example, are accurate up to the second order. Second-order accuracy in the sample size as well as the dimension is achieved by obtaining an asymptotic expansion of the distribution of the test statistics, and by estimating the parameters of the approximate distribution with second-order consistency. The methods are presented in a unified and succinct manner that covers general factorial designs as well as any comparisons among the cell means. Expressions for the asymptotic power are derived under two reasonable local alternatives. A simulation study provides evidence for a gain in accuracy and power compared to limiting distribution approximations and other competing methods for high-dimensional repeated measures analysis. The application of the methods is illustrated with real data from an electroencephalogram (EEG) study of alcoholic and control subjects.


Introduction
Repeated measures data arise in various disciplines of the sciences, social sciences, engineering and humanities. Study designs such as time course studies, cross-over designs and split-plot designs naturally lead to repeated measures data. The distinctive feature of repeated measures data is that the observations from the same study unit (observational or experimental) are commensurate and exhibit correlations. Analyses of continuous repeated measures data for inference on the effects of one or more between- or within-subject crossed or nested factors fall into three broad categories: multivariate, univariate and mixed-model analyses. Mixed-model analyses involve some assumption concerning the correlations of the repeated measures. Despite their generality in modeling the correlation and their exact inference, classical multivariate methods are not applicable when the number of repeated measures is larger than the error degrees of freedom.
Univariate methods on the other hand focus on adjusting the analysis of variance (ANOVA) for the within-unit correlation. It is well known that when all observations are independent, ANOVA test statistics have exact F distribution.
In the presence of within-unit correlation, the ANOVA tests are valid only if these correlations satisfy a condition known as sphericity (Bock, 1963; Huynh and Feldt, 1970). Box (1954) suggested a correction which involves adjustment of the numerator and denominator degrees of freedom of the F-distribution by a constant multiplying factor, commonly referred to as Box's ε. Since the constant factor ε depends on the unknown within-unit covariance matrix, solutions such as using a lower bound for ε (Geisser and Greenhouse, 1958) or estimates of the within-unit covariance matrix in the calculation of ε (Huynh and Feldt, 1976; Huynh, 1978; Lecoutre, 1991) have been implemented in practical applications. These solutions have been shown to work satisfactorily in terms of controlling the type-I error rate when the number of repeated measures is low compared to the degrees of freedom for estimating the covariance matrix. However, the univariate approach is obviously approximate, and Rencher and Christensen (2012, p. 219) argue that it has no power advantage over the exact multivariate test. They continue to say "... The only case in which we need to fall back on univariate test is when there are insufficient degrees of freedom to perform multivariate test...", i.e., when the number of repeated measures is larger than the error degrees of freedom for estimating the within-unit covariance.
Well before most researchers embarked on the high-dimension-low-sample-size (HDLSS) problem, Collier et al. (1967), Stoloff (1970) and Maxwell and Arvey (1982) numerically demonstrated that the univariate approaches for repeated measures tend to be very conservative. In an attempt to improve the estimation of ε in the high-dimensional situation, Chi et al. (2012) used "dual" forms of the within sum-of-squares and cross-products matrices. They claim that, besides giving stable estimates of ε, the use of the "dual" version has a computational advantage. The approaches of Brunner et al. (2012) and Happ et al. (2015), on the other hand, overcome the high-dimensional problem by using the so-called ANOVA-type statistic (Brunner et al., 1997, 1999) and then approximate its null distribution by matching the first two moments of the numerator and denominator quadratic forms with those of a scaled gamma distribution, an approach shown to be successful in a related problem for independent observations by Brunner et al. (1997). Although these approaches were shown to be numerically satisfactory, they are only approximate solutions. On the other hand, by deriving asymptotic distributions of some suitable statistics in the high-dimensional asymptotic framework, Pauly et al. (2015), Takahashi and Shutoh (2016) and Harrar and Kong (2016) devised asymptotically valid tests. Pauly et al. (2015) consider high-dimensional repeated measures analysis for the one-sample situation but with the possibility of several within-subject factors. The two-sample situation was considered by Takahashi and Shutoh (2016) assuming equal covariance matrices for the two populations. More generally, Harrar and Kong (2016) addressed the multi-group as well as the unequal-covariance cases. Other works such as Wang and Akritas (2010a,b) and Wang et al. (2010) also provide high-dimensional asymptotic results applicable to repeated measures, but they additionally assume that the repeated measurements are inherently ordered and that the dependence between the measurements decays as the separation between them increases. High-dimensional asymptotic mean vector comparisons have recently received attention in the statistics literature (see Bai and Saranadasa, 1996; Chen and Qin, 2010; Katayama et al., 2013; Cai et al., 2014, and the references therein) under assumptions different from those of Pauly et al. (2015), Takahashi and Shutoh (2016), and Harrar and Kong (2016). These recent results are based on limiting distributions (i.e. first-order asymptotics) and are not applicable in multi-factor repeated measures settings.
More specifically, Harrar and Kong (2016) proved asymptotic normality for their test statistics under certain assumptions on the covariances. In their simulation study, Harrar and Kong (2016) noticed that the error of approximation from these asymptotic distributions could be considerable unless both the number of repeated measurements and the replication sizes are large. The present paper aims to derive second-order asymptotics for the tests considered in Harrar and Kong (2016). In addition, the results in the current manuscript are more general in the sense that they are applicable in situations where there are multiple within- and/or between-unit crossed and/or nested factors. However, to obtain second-order asymptotic results that are valid for general designs, we need to assume equality of covariance matrices in this paper. This paper is organized as follows. Section 2 introduces the statistical model, hypotheses and notations used in the remainder of the paper. Test statistics for the various effects are presented in Section 3 together with asymptotic expansions for their null distributions. The asymptotic powers are derived in Section 4. Numerical studies are carried out in Section 5. First, Monte Carlo simulations are used to show the gain in accuracy from the asymptotic expansions for a selection of covariance matrices and wide choices of values for the number of repeated measures and replication sizes. Data from a large electroencephalogram (EEG) study of alcoholic and control subjects are used to illustrate the application of the results in Section 6. Simulation results obtained by generating data with design parameters similar to those of the real data are also considered later in that section. Section 7 contains discussions and conclusions. All proofs and preliminary results are placed in the Appendix.

Model and hypotheses
Suppose n i independent b-dimensional observations, for i = 1, . . . , a, are available from multivariate normal populations N b (μ i , Σ), denoted by X i1 , . . . , X ini , and assume that the a samples are mutually independent. The number of samples a is assumed fixed. The aim of this paper is to derive second-order asymptotic results for testing hypotheses in repeated measures analysis when both the total sample size n = n 1 + · · · + n a and the number of repeated measurements b tend to infinity. Writing X ijk for the jth component of X ik , the usual setting gives the interpretation that X ijk is the response from the kth subject treated with the ith level of factor A (e.g., treatment) and the jth level of factor B (e.g., time). In this model, X ijk and X i′j′k′ are assumed to be independent only if i ≠ i′ or k ≠ k′. Otherwise the dependence is completely unspecified.
Throughout this paper, 0 will denote a matrix of all zeros where the dimension will be clear from the context, and 1 k denotes the k-dimensional vector consisting of all ones. The matrix I k is the identity matrix, whereas J k and P k are defined as J k = 1 k 1 k and P k = I k − k −1 J k , respectively. We will use extensively the Kronecker (or direct) product A ⊗ B and the direct sum A ⊕ B of matrices A and B. The symbol D → stands as an abbreviation for "converges in distribution to" and P → for "converges in probability to". In estimating a sequence of parameters θ b = O(1) by a sequence of estimators T n,b , consistency is meant in the sense of E(T n,b − θ b ) 2 → 0 as (n, b) go to infinity.
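For readers following along computationally, the notational building blocks above can be formed directly. The sketch below is in NumPy; the function names are ours, not the paper's.

```python
import numpy as np

def J(k):
    """k x k matrix of all ones: J_k = 1_k 1_k'."""
    return np.ones((k, k))

def P(k):
    """Centering matrix P_k = I_k - J_k / k (symmetric and idempotent)."""
    return np.eye(k) - J(k) / k

def direct_sum(A, B):
    """Direct sum A (+) B: block-diagonal arrangement of A and B."""
    out = np.zeros((A.shape[0] + B.shape[0], A.shape[1] + B.shape[1]))
    out[:A.shape[0], :A.shape[1]] = A
    out[A.shape[0]:, A.shape[1]:] = B
    return out

# P_k is symmetric and idempotent, and the Kronecker product of two
# symmetric idempotent matrices is again symmetric and idempotent,
# e.g. K = P_3 (x) P_4:
K = np.kron(P(3), P(4))
```

The idempotency of K is what makes the quadratic-form formulation of the hypotheses in the next subsection convenient.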
The hypotheses of interest can be expressed with K = T 1 ⊗ T 2 , where T 1 and T 2 are a × a and b × b matrices, respectively. We require that the two matrices T 1 and T 2 be symmetric. In fact, we can apply a linear transformation G to the data X and use the symmetric and idempotent matrix G 1/2 in its place. With this manipulation, the new K still defines the same hypotheses as in (1). Without loss of generality, we can therefore assume that T 1 , T 2 and, consequently, K are symmetric and idempotent matrices. For such T i , K is a positive semidefinite matrix. In the following theorem we establish an equivalent quadratic-form expression for the hypotheses (1).
Theorem 2.1. The null hypotheses (1) are equivalent to H 0 : μ Kμ = 0.

The above setup may give the impression that the paper is dealing with one between-subject and one within-subject factor with levels a and b, respectively. However, the indices i = 1, . . . , a and j = 1, . . . , b are to be viewed as the lexicographic order of the between-subject factor level combinations and within-subject factor level combinations, respectively. Therefore, the setup covers repeated measures in factorial designs with crossed and nested factors. The factors T 1 and T 2 of the matrix K can be viewed as parts of the contrast matrix concerning the between-subject factors and the within-subject factors, respectively. More specifically, suitable choices of T 1 and T 2 allow between-subject and within-subject mean comparisons. For a concrete example, consider a factorial design in which there are two between-subject crossed factors, say A and C with a and c levels, respectively, and two within-subject factors, say B and D, where the levels of D are nested within those of B (see also other specific designs considered in Section 5). Suppose B has b levels and the jth level of B has d j levels of D nested within it. The mean vector in this setup would be μ = (μ 11 , . . . , μ 1c , . . . , μ a1 , . . . , μ ac ) . To test the interaction effect of A and B, for instance, we would use a contrast built from a matrix Q, where (Q Q) − is a generalized inverse of Q Q. Further, the setup above can be reset accordingly. For example, a would be replaced by ac, and D = diag{n 11 , . . . , n 1c , . . . , n a1 , . . . , n ac } −1 . See also Brunner et al. (2017) and Konietschke et al. (2015) for other covered designs. Another example is found in the data analysis section (Section 6).
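Because K is positive semidefinite under the stated assumptions, the equivalence in Theorem 2.1 is easy to check numerically. The small illustration below uses a hypothetical 3-group, 4-occasion design; the variable names are ours.

```python
import numpy as np

def P(k):
    """Centering matrix P_k = I_k - J_k / k (symmetric, idempotent)."""
    return np.eye(k) - np.ones((k, k)) / k

# K = T1 (x) T2 with T1 = P_3 and T2 = P_4 (a hypothetical small design)
K = np.kron(P(3), P(4))

rng = np.random.default_rng(0)
v = rng.standard_normal(4)
# A mean vector with the same within-subject profile in every group lies
# in the null space of K, so both K*mu and mu'K*mu vanish:
mu_null = np.kron(np.ones(3), v)
mu_alt = rng.standard_normal(12)   # a generic mean vector
```

For mu_null both Kμ and μ′Kμ are zero, while a generic μ gives a strictly positive quadratic form, matching the equivalence in the theorem.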

Higher-order asymptotic tests
We have seen in Theorem 2.1 that Kμ = 0 if and only if μ Kμ = 0. A reasonable estimator of μ Kμ is given by H = X KX. The mean and variance of the quadratic form H are E(H) = tr(T 1 D)c 1 + μ Kμ and Var(H) = 2 tr{(T 1 D) 2 }c 2 + 4μ K(D ⊗ Σ)Kμ. In this section, we will devise a test statistic for testing (1) under a high-dimensional asymptotic framework consisting of three conditions: a sparsity condition on the covariance matrix (A 1 ), a proportional-growth condition on n and b (A 2 ), and a local alternative condition (A 3 ). It is apparent that assumption A 1 is a sparsity condition on the covariance matrix. Using the well-known trace inequality (e.g. Yang et al., 2001) tr{(AB) m } ≤ {tr(A 2m ) tr(B 2m )} 1/2 , for any positive semidefinite matrices A and B, the intermediate traces can be bounded. For example, when k = 4 the inequality yields c 4 ≤ (c 2 c 6 ) 1/2 and, similarly, when k = 6, c 6 ≤ (c 8 c 4 ) 1/2 . The conditions used in Pauly et al. (2015), Takahashi and Shutoh (2016), and Harrar and Kong (2016) are c k = O(b) for k = 1, 2, 3, 4. These conditions are stronger compared to (2) for k = 1, 2, 3, 4. To get higher-order accuracy, however, we would in addition need c k = O(b) for k = 5, 6, 7, 8, and condition A 1 is satisfied under these assumptions. Assumption A 2 states that the sample size and dimension grow at the same rate but otherwise either one can be larger than the other. It is weaker than the usual requirement that each of the sample sizes diverges and has the same relation with b. Condition A 3 is a local alternative assumption which, according to Theorem 4.1, implies that the power of the test can be accurately calculated for alternatives that approach the null at a rate slower than b −3/2 . Assume Σ is known. A centered and suitably scaled version of H = X KX yields a reasonable test statistic T for testing H 0 . In order to generalize the test statistic T to the unknown-covariance case, we need to estimate c 1 and c 2 to the appropriate order. The estimators ĉ 1 = tr(T 2 S) and ĉ 2 = n 2 /{(n − 1)(n + 2)} [tr{(T 2 S) 2 } − {tr(T 2 S)} 2 /n] have the desirable asymptotic properties given in Theorem 3.1.
Here, it should be noted that ĉ 1 and ĉ 2 are unbiased estimators of c 1 and c 2 , respectively (Srivastava, 2005; Harrar and Kong, 2016).
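The two estimators can be sketched in code as follows. This is a sketch assuming S is a covariance estimator with n degrees of freedom and uses the Srivastava (2005)-type bias correction cited above; the paper's exact definitions are the ones referenced as (3).

```python
import numpy as np

def estimate_c1_c2(S, T2, n):
    """Plug-in estimators of c1 = tr(T2 Sigma) and c2 = tr((T2 Sigma)^2).

    c1_hat = tr(T2 S) is unbiased; c2_hat applies a Srivastava (2005)-type
    bias correction so that it is unbiased as well (sketch: S is assumed
    to be a covariance estimator with n degrees of freedom)."""
    T2S = T2 @ S
    c1_hat = np.trace(T2S)
    c2_hat = n**2 / ((n - 1) * (n + 2)) * (np.trace(T2S @ T2S) - c1_hat**2 / n)
    return c1_hat, c2_hat
```

The correction removes the upward bias of the naive tr{(T 2 S) 2 }, which otherwise inflates at the rate tr(T 2 S) 2 /n.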
Theorem 3.1. Under the high-dimensional asymptotic frameworks A 1 and A 2 , the estimators ĉ 1 and ĉ 2 have the asymptotic properties required for the second-order expansions developed below.

Let δ k = tr{(T 1 D) k }/{tr(T 1 D)} k for k = 1, 2, 3, 4 (δ 1 = 1). Since T 1 is a symmetric and idempotent matrix, one can see that 0 < δ k ≤ 1 and, hence, δ k = O(1) as n → ∞. Next we study the asymptotic sampling distribution of the test statistic, which is obtained from T by replacing c 1 and c 2 with their empirical counterparts under H 0 . It is shown in the Appendix that T admits a stochastic expansion whose coefficient terms are O p (1) by Theorem 3.1. The characteristic function of T can be expanded as given in the following theorem.
Theorem 3.2. If the null hypothesis H 0 holds, then under the high-dimensional asymptotic frameworks A 1 and A 2 , the characteristic function of T admits an expansion in powers of b −1/2 with coefficients involving the quantities η 3 and η 4 .

Note that, by Assumption A 1 , η 3 and η 4 are O(1). Inverting the characteristic function term by term, we get an asymptotic expansion for the distribution function of T as follows.
Theorem 3.3. If the null hypothesis H 0 holds, then under the high-dimensional asymptotic frameworks A 1 and A 2 , the distribution function of T can be expanded in powers of b −1/2 .

More specifically, Theorem 3.3 expresses the expansion G T (x) in terms of the standard normal density function φ(x) and the Hermite polynomials h i (x). The first five Hermite polynomials are h 1 (x) = x, h 2 (x) = x 2 − 1, h 3 (x) = x 3 − 3x, h 4 (x) = x 4 − 6x 2 + 3 and h 5 (x) = x 5 − 10x 3 + 15x. It should be emphasized that when the terms of orders b −1/2 and b −1 are ignored, assumptions A 1 and A 2 can be relaxed as follows: (i) the assumption of proportional divergence of n and b in A 2 is not needed (Harrar and Kong, 2016), and (ii) the sparsity condition on the covariance matrix (assumption A 1 ) is needed only for c 4 /c 2 2 = o(1), in which case the assumption reduces to that of Chen and Qin (2010).
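The (probabilists') Hermite polynomials appearing in the expansion satisfy a standard three-term recurrence, which is a convenient way to evaluate them; the helper name below is ours.

```python
import numpy as np

def hermite(i, x):
    """Probabilists' Hermite polynomial h_i(x), evaluated via the
    recurrence h_0 = 1, h_1 = x, h_{k+1}(x) = x*h_k(x) - k*h_{k-1}(x)."""
    h_prev = np.ones_like(np.asarray(x, dtype=float))
    h = np.asarray(x, dtype=float)
    if i == 0:
        return h_prev
    for k in range(1, i):
        h_prev, h = h, x * h - k * h_prev
    return h
```

Evaluating the recurrence reproduces the closed forms listed above, e.g. h 4 (x) = x 4 − 6x 2 + 3.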
Let u(z) be defined by P(T ≤ u(z)) = P(Z ≤ z), where Z is a standard normal random variable. In what follows, an asymptotic expansion of u(z) in terms of z, known as the Cornish-Fisher expansion (Hill and Davis, 1968), is given in Corollary 3.4.

Corollary 3.4. If the null hypothesis H 0 holds, then under the high-dimensional asymptotic frameworks A 1 and A 2 , the quantile u(z) admits an expansion u A (z) in powers of b −1/2 .
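For orientation, a generic fourth-order Cornish-Fisher quantile correction has the following shape, with skewness and excess-kurtosis inputs standing in for the roles played by η 3 and η 4 in Corollary 3.4. This is only a sketch of the general device; the paper's exact coefficients are those of the corollary.

```python
def cornish_fisher(z, k3, k4):
    """Generic fourth-order Cornish-Fisher correction of a standard
    normal quantile z for a standardized statistic with skewness k3 and
    excess kurtosis k4 (a sketch; not the paper's exact formula)."""
    return (z
            + k3 * (z**2 - 1) / 6
            + k4 * (z**3 - 3 * z) / 24
            - k3**2 * (2 * z**3 - 5 * z) / 36)
```

With k3 = k4 = 0 the correction vanishes and the normal quantile is recovered, mirroring how u A (z) reduces to z when the higher-order terms are dropped.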
The expansions G T (x) and u A (z) are approximations for the CDF and quantile, respectively, of T under the null hypothesis. In these approximations, η 3 and η 4 depend on c 2 , c 3 and c 4 , which are unknown quantities. Therefore, for practical applications, we need an estimated version of the expansions. To that end, let ĉ 1 and ĉ 2 be as defined in (3), and define estimators ĉ 3 and ĉ 4 involving the normalizing constants m 1 = (n − 2)(n − 1)(n + 2)(n + 4) and m 2 = m 1 (n + 1)(n − 3)(n + 6). These estimators are unbiased and enjoy higher-order asymptotic properties that make them suitable for use in asymptotic expansions.
Theorem 3.5. Under the high-dimensional asymptotic frameworks A 1 and A 2 , the estimators ĉ 2 , ĉ 3 and ĉ 4 possess higher-order consistency properties analogous to those in Theorem 3.1.

Now, by using Theorems 3.1 and 3.5, consistent estimators η̂ 3 and η̂ 4 of η 3 and η 4 are obtained. Therefore, we can define the estimated version Ĝ T (x) of G T (x) of Theorem 3.3 by replacing η 3 and η 4 with η̂ 3 and η̂ 4 , respectively. Before we close this section, we shed some light on how the asymptotics (b, n) → ∞ work. Let r = rank(T 2 ) be the number of nonzero eigenvalues of T 2 Σ, and denote the eigenvalues by λ b,k , arranged in decreasing order.
If the rank of T 2 is finite, then r is bounded and, consequently, condition A 1 is violated. Thus, for condition A 1 to be satisfied, the rank of T 2 needs to grow with b. In particular, when all eigenvalues are bounded, A 1 holds only if r/b stays bounded away from zero. If r grows with b at a slower rate, a limiting normal distribution still holds for T but the second-order accuracy results do not. This case is interesting, and we plan to investigate asymptotic expansions under this framework in future research.
Next we provide an approximate solution for the situation where r is bounded. In this case, T has the same distribution as a linear combination of centered and independent chi-square random variables with one degree of freedom (see also Pauly et al., 2015), the weights being the nonzero eigenvalues λ b,k , k = 1, . . . , r. The exact distribution cannot be used directly because the eigenvalues are unknown and consistent estimators for them are difficult to come by. Instead, we may use an approximation obtained by matching the first two moments with those of a scaled chi-square distribution under H 0 . Further, it is known that H is independent of ĉ 1 . Thus, a reasonable test statistic T can be formed whose null distribution can be approximated by F c 1 2 /(c 2 δ 2 ), nc 1 2 /c 2 , where F ν 1 ,ν 2 is the F distribution with degrees of freedom ν 1 and ν 2 . When the rank of T 2 is 1, it turns out that T 2 Σ has only one nonzero eigenvalue c 1 , and c 2 = c 1 2 . Then T has an exact F 1/δ 2 , n distribution. For example, for testing the main effect of A or C or their interaction in the four-factor design mentioned in the last paragraph of Section 2, we would use contrast matrices for which rank(T 2 ) = 1. For another example, let T 1 = diag(n 1 , . . . , n a ) − (n 1 , . . . , n a ) (n 1 , . . . , n a )/(n + a) and T 2 = J b /b. These are matrices of special interest in testing the equality of mean vectors given that they are parallel (see Harrar and Kong, 2016). Under this setup, T has an exact F a−1, n distribution. In the more general case, ĉ 1 2 /ĉ 2 is a consistent estimator of c 1 2 /c 2 : from the proof of Theorem 3.1, we know that E[ĉ k ] = c k for k = 1, 2 and Var(ĉ 1 ) = 2c 2 /n, so that, by Slutsky's theorem and the continuous mapping theorem, ĉ 1 2 /ĉ 2 converges in probability to c 1 2 /c 2 and may be plugged into the degrees of freedom.
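The two-moment matching used above is the classical Box/Satterthwaite device: a weighted sum of independent chi-squares Σ k λ k χ 2 1 is approximated by g·χ 2 f , with g and f chosen to match the first two moments. A generic sketch (not the paper's exact statistic):

```python
import numpy as np

def satterthwaite(weights):
    """Approximate sum_k w_k * chi^2_1 by g * chi^2_f by matching the
    first two moments (mean sum(w), variance 2*sum(w^2)), the classical
    Box/Satterthwaite device."""
    w = np.asarray(weights, dtype=float)
    s1, s2 = w.sum(), (w**2).sum()
    g = s2 / s1        # scale
    f = s1**2 / s2     # degrees of freedom
    return g, f
```

When all weights are equal the approximation is exact (g equals the common weight and f equals r), which is the rank-one/equal-eigenvalue situation in which the F distribution above becomes exact.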

Power under local alternatives
It turns out that the power functions under the local alternatives A 3 depend on the mean vectors through Δ = μ Kμ. Specifically, define the power function of T by β(Δ) = P(T > u A (z α )), where z α is the upper-α quantile of the standard normal distribution and α is the significance level. Then, we can obtain the power functions under the local alternative A 3 as given in Theorem 4.1.
Theorem 4.1. Under assumptions A 1 , A 2 and A 3 , the function β(Δ) can be asymptotically expanded in powers of b −1/2 .

Therefore, Theorem 4.1 can still provide an accurate approximation to the power function of the test that rejects H 0 if T > u A (z α ). It can easily be seen that when terms of orders b −1/2 and b −1 are ignored from the quantile u A (z α ), the power function β(Δ) reduces to its first-order form. When a = 2, T 1 = P a and T 2 = I b , this form is the same as in Bai and Saranadasa (1996) and Chen and Qin (2010). It should be noted, however, that for this latter test, Theorem 4.1 gives a second-order approximation to the power function which is more accurate than those of Bai and Saranadasa (1996) and Chen and Qin (2010).

Simulation study
To exhibit the improvement resulting from the asymptotic expansion and, hence, facilitate comparison with the limiting distributions in Harrar and Kong (2016), the simulation study will mainly focus on the model with one between- and one within-subject factor. We generate 10,000 replications of data from N b (μ, Σ). Although the assumed asymptotic frameworks stipulate that n grows proportionally with b, in reality the actual ratio varies from application to application. To investigate the effect of various proportions of growth, we consider several combinations of a, b and the n i . For practical reasons, we also consider small b with large n 1 , . . . , n a (and vice versa), with balanced as well as unbalanced designs. For the number of groups (number of levels of factor A), we consider a = 2, 3, 4, 6, and we set the significance level α at 0.01 and 0.05. Tables 2-7 present actual type I error rates (test sizes) for the covariance structures Σ = (1 − ρ)I b + ρJ b , Σ = (ρ |j−j′| ) and Σ = (σ jj′ ), where σ jj = 1 and, for j ≠ j′, σ jj′ = ρ/|j − j′| 1/4 . We consider a range of values for ρ. For the first covariance structure, the assumptions in A 1 are satisfied uniformly in ρ because c i = (1 − ρ) i (b − 1). However, for the other two covariance structures these assumptions do not hold except for ρ = 0. In particular, when ρ is close to 1, the quantities b 2 c 6 /c 3 2 and b 3 c 8 /c 4 2 can diverge very fast. To illustrate this concretely, Table 1 shows the values of b 3/2 c 5 /c 5/2 2 and b 3 c 8 /c 4 2 for different values of b and ρ.

Table 1: Covariance sparsity assumptions evaluated for Σ = (σ jj′ ), where σ jj = 1 and, for j ≠ j′, σ jj′ = ρ/|j − j′| 1/4 .

It should be noted that the covariance matrix structure Σ = (1 − ρ)I b + ρJ b is positive definite if and only if −1/(b − 1) < ρ < 1. The empty cells in Tables 2-7 correspond to cases where the b and ρ combinations do not yield positive definite covariance matrices.
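The three covariance structures are easy to generate for the simulations; the sketch below writes the compound-symmetry form so that it is positive definite exactly when −1/(b − 1) < ρ < 1 and, with T 2 = P b , gives c i = (1 − ρ) i (b − 1), consistent with the statements above. The function name is ours.

```python
import numpy as np

def cov_structures(b, rho):
    """The three covariance structures used in the simulations:
    compound symmetry (1-rho)I + rho*J, AR(1) with entries rho^{|j-j'|},
    and the polynomial-decay structure rho/|j-j'|^{1/4} off the diagonal."""
    Jb = np.ones((b, b))
    cs = (1 - rho) * np.eye(b) + rho * Jb                # compound symmetry
    idx = np.arange(b)
    d = np.abs(idx[:, None] - idx[None, :]).astype(float)
    ar1 = rho ** d                                       # (rho^{|j-j'|})
    poly = np.where(d == 0, 1.0, rho / np.maximum(d, 1.0) ** 0.25)
    return cs, ar1, poly
```

For compound symmetry, P b Σ = (1 − ρ)P b , so all traces c i are available in closed form, which is why A 1 holds uniformly in ρ for this structure only.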
In all the tables we consider the contrast matrices T 1 = P a or J a /a and T 2 = P b . Another matrix of particular interest in repeated measures analysis is T 2 = J b /b, which is useful for testing the group effect. However, the distribution of T in this case does not depend on b. Hence, we do not carry out simulations for this contrast matrix.
First and foremost, comparing Tables 2, 4 and 6 (results for α = 0.05) with the results in Harrar and Kong (2016), one can clearly see a marked gain in accuracy resulting from the inclusion of higher-order terms in the asymptotic expansion. We can see in Table 2 that, for both tests (i.e. T 1 = P a and J a /a), a large number of the achieved error rates are within a tenth of the nominal values. This phenomenon occurs more for weaker correlations than for stronger ones. Further, it is clear from the tables that the performance of the tests in controlling type I error rates is excellent when either the sample sizes or the dimension is large. This might be due to the homoscedastic nature of the data. It also seems to be the case that when the n i are small, the tests control the type I error rate better as a gets larger. For example, looking at the rows for a = 6, performance appears to be satisfactory for small sample sizes but large dimension. Tables 4 and 6 exhibit similar patterns and behaviors. Likewise, for α = 0.01, the asymptotic expansion provides a gain in accuracy in controlling type I error rates (see Tables 3, 5 and 7).
To investigate performance in terms of power, we compare the power of the method of Chi et al. (2012) with the methods proposed in this paper, taking T 2 = P b and setting T 1 to either P a or J a /a. The power comparison is restricted to Chi et al. (2012) because, to the best of our knowledge, it is the only work for high-dimensional repeated measures data that covers the general factorial design. Their approach uses a property of the Wishart random matrix to avoid the singularity problem that arises due to high dimensionality. This manipulation could, however, potentially compromise the power properties of the test. Assessing this is one objective of the power comparison, in addition to evaluating the power performance of the proposed method. To keep the comparison manageable, we fix a = 3 and Σ = (1 − ρ)I b + ρJ b with ρ = 0.2. As regards sample sizes and dimension, we use the combinations (b; n 1 , n 2 , n 3 ) = (10; 5, 10, 10), (10; 50, 100, 100), (100; 5, 10, 10) and (100; 50, 100, 100).

Table 2: Achieved type I error rates (×100%) for the testing procedures when T 2 = P b and sampling from N b (μ, Σ) where Σ = (1 − ρ)I b + ρJ b . The nominal size is α = 0.05.
Table 3: Achieved type I error rates (×100%) for the testing procedures when T 2 = P b and sampling from N b (μ, Σ) where Σ = (1 − ρ)I b + ρJ b . The nominal size is α = 0.01.
Table 4: Achieved type I error rates (×100%) for the testing procedures when T 2 = P b and sampling from N b (μ, Σ) where Σ = (ρ |j−j′| ). The nominal size is α = 0.05.
Table 5: Achieved type I error rates (×100%) for the testing procedures when T 2 = P b and sampling from N b (μ, Σ) where Σ = (ρ |j−j′| ). The nominal size is α = 0.01.
Table 6: Achieved type I error rates (×100%) for the testing procedures when T 2 = P b and sampling from N b (μ, Σ) where Σ = (σ jj′ ) with σ jj′ = ρ/|j − j′| 1/4 . The nominal size is α = 0.05.
Table 7: Achieved type I error rates (×100%) for the testing procedures when T 2 = P b and sampling from N b (μ, Σ) where Σ = (σ jj′ ) with σ jj′ = ρ/|j − j′| 1/4 . The nominal size is α = 0.01.
For the alternative hypotheses, when T 1 = P a , we take μ 2 = μ 3 = 0 and consider two structures for μ 1 . The first represents a dense alternative, namely μ 1i = 1 + δ for i odd and μ 1i = 1 − δ for i even; the other represents a sparse alternative, namely μ 1 = (1 + δ, 1 b−1 ) . In both cases δ varies from 0 to 1. When T 1 = J a /a, we take μ 1 = 1 b + μ 2 and μ 2 = μ 3 , and consider two structures of μ 2 representing dense and sparse alternatives. For the first, we take μ 2i = δ for i odd and μ 2i = −δ for i even; for the second, we take μ 2 = (0 b−1 , δ) . Here also, δ varies from 0 to 1. The latter structure for both values of T 1 represents departures that approach the null hypotheses at the rate b −1/2 . More precisely, the scaled departures from the null, ||μ 1 − 1 b ||/tr(Σ) 1/2 , are δ and |δ|/ √ b for the dense and sparse alternatives, respectively. Figures 1 and 2 show the power results for T 1 = P a and T 1 = J a /a, respectively. For dense alternatives (left panels), our methods have a clear advantage in all cases. More pronounced dominance is observed, in particular, when n is small. On the other hand, Chi et al. (2012) has an advantage for sparse alternatives (right panels).

Figure 1: Power comparison with Chi et al. (2012) for T 1 = P a . In both panels of the plot, μ 2 = μ 3 = 0. In the left panel μ 1i = 1 + δ for i odd and μ 1i = 1 − δ for i even, and in the right panel μ 1 = (1 + δ, 1 b−1 ) .
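The dense and sparse alternative mean structures described above are straightforward to construct; the helper names below are ours.

```python
import numpy as np

def alt_means_Pa(b, delta):
    """Dense and sparse alternatives for mu_1 when T1 = P_a
    (with mu_2 = mu_3 = 0)."""
    i = np.arange(1, b + 1)
    dense = np.where(i % 2 == 1, 1 + delta, 1 - delta)
    sparse = np.concatenate(([1 + delta], np.ones(b - 1)))
    return dense, sparse

def alt_means_Ja(b, delta):
    """Dense and sparse choices of mu_2 when T1 = J_a/a
    (with mu_1 = 1_b + mu_2 and mu_2 = mu_3)."""
    i = np.arange(1, b + 1)
    dense = np.where(i % 2 == 1, delta, -delta)
    sparse = np.concatenate((np.zeros(b - 1), [delta]))
    return dense, sparse
```

In the sparse cases only one coordinate departs from the null, which is what makes the scaled departure shrink at the rate |δ|/√b as b grows.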

Real data analysis
In this section, we analyze publicly available data obtained from the University of California-Irvine Machine Learning Repository. 1

Figure 2: Power comparison with Chi et al. (2012) for T 1 = J a /a. In both panels of the plot, μ 1 = 1 b + μ 2 and μ 2 = μ 3 . In the left panel μ 2i = δ for i odd and μ 2i = −δ for i even, and in the right panel μ 2 = (0 b−1 , δ) .

The data arose from a large study to examine electroencephalogram (EEG) correlates of genetic predisposition to alcoholism. Measurements from 64 electrodes placed on the subjects' scalps were recorded 256 times over 1 second. The study involved two groups of subjects: alcoholic (n 1 = 77) and control (n 2 = 45). Each subject was exposed to either a single stimulus (S1) or two stimuli (S1 and S2), which were pictures of objects chosen from a picture set. The sixty-four electrodes (channels) are divided into groups based on their location on the scalp (frontal, temporal, parietal and occipital lobes). To illustrate the application of the methods concisely, we focus the analysis on data from the stimulus S1 and the seventeen frontal-lobe channels. The outcome measurements are event-related potentials (ERP) indicating the level of electrical activity (in volts) in the region of the brain where each of the electrodes is placed. This repeated measures data set has two within-subject factors (time and channel) and one between-subject factor (alcohol use). The within-subject factors time and channel have 256 and 17 levels, respectively.
To assess the plausibility of the assumptions in the model, we checked channel-by-channel marginal normality after testing for equality of covariance matrices in the alcoholic and control groups. To date, there does not exist a satisfactory test of multivariate normality in high-dimensional repeated measures. Neither of the high-dimensional tests of Li and Chen (2012) and Zhang et al. (2018) indicated a violation of equality of covariances for any of the channels (p-values ≥ 0.512). However, the tests did not give a clean bill of normality. The proportion of failures to reject (not adjusted for multiplicity) ranged from 24% to 41% for all but five of the frontal periphery channels. For these latter channels, the proportions (unadjusted) ranged from 57% to 82%. Considering the multiplicity of the tests, this may represent only a mild violation of normality.
The main research questions of interest are: (H 01 ) whether the ERP profiles over time differ between channels and groups (three-way interaction: alcohol × time × channel); (H 02 ) whether ERP profiles are similar between the channels when averaged over groups (similar time profiles for all the channels); (H 03 ) whether the time profiles of ERP are similar between the two groups averaged over channels; (H 04 ) whether the ERP profiles are constant (flat) when averaged over channels and groups. For describing the contrast matrices, we assume the data vectors from each subject are arranged by grouping the 17 channels within each time point, i.e. the data vector from the jth subject in the ith group is X ij = (X ij,1,1 , . . . , X ij,1,17 , . . . , X ij,256,1 , . . . , X ij,256,17 ) . In the notation of the paper, the four hypotheses of interest, H 0i for i = 1, 2, 3, 4, can be tested by using the contrast matrices T 1 = P 2 and T 2 = P 256 ⊗ P 17 ; T 1 = J 2 /2 and T 2 = P 256 ⊗ P 17 ; T 1 = P 2 and T 2 = P 256 ⊗ J 17 /17; and T 1 = J 2 /2 and T 2 = P 256 ⊗ J 17 /17, respectively. The results of the analysis are presented in Table 8. Overall time-profile similarity across groups (averaged over channels) cannot be rejected (p-value = 0.205). In fact, channel-by-channel similarity of the time profiles of ERP across groups cannot be rejected (p-values ≥ 0.196). However, flatness over time is rejected overall for all channels as well as channel-by-channel.
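The four contrast pairs can be assembled with Kronecker products. The helper below is parameterized by the numbers of time points and channels (t = 256, c = 17 for the EEG data) so that the construction can be sanity-checked on smaller dimensions; the function names are ours.

```python
import numpy as np

def P(k):
    """Centering matrix P_k = I_k - J_k / k."""
    return np.eye(k) - np.ones((k, k)) / k

def Jbar(k):
    """Averaging matrix J_k / k (symmetric, idempotent)."""
    return np.ones((k, k)) / k

def eeg_contrasts(t=256, c=17):
    """Contrast pairs (T1, T2) for the four EEG hypotheses; t is the
    number of time points and c the number of channels."""
    return {
        "H01": (P(2), np.kron(P(t), P(c))),        # alcohol x time x channel
        "H02": (Jbar(2), np.kron(P(t), P(c))),     # time x channel
        "H03": (P(2), np.kron(P(t), Jbar(c))),     # alcohol x time
        "H04": (Jbar(2), np.kron(P(t), Jbar(c))),  # flatness over time
    }
```

Each factor is symmetric and idempotent, so each K = T 1 ⊗ T 2 is symmetric and idempotent as required by the setup of Section 2.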
As a way of ascertaining the reproducibility and reliability of the results in Table 8, we conducted a simulation study using parameters similar to those of the EEG data. For Table 9, we generate 1000 replications of data from N bd (0, Σ i ). We take b = 256, d = 17, a = 2, n 1 = 77 and n 2 = 45, and set α = 0.05. Table 9 presents actual type I error rates (test sizes) for the covariance structure Σ 1 = (1 − ρ)I bd + ρJ bd with ρ = 0.2 and for random matrices Σ i , i = 2, 3, 4, defined as follows. Let Σ i = Q i Λ i Q i , where Λ i is a diagonal matrix with diagonal entries taken from Unif(0, 1) and Q i is an orthogonal matrix. Indeed, Q i can be defined from the QR decomposition of a random matrix Z i = (Z i,jj′ ) where the Z i,jj′ are iid random variables. Here, we consider three distributions for Z i,jj′ , namely Z 2,jj′ = 1 {j=j′} with probability 1, Z 3,jj′ ∼ Exp(1) and Z 4,jj′ ∼ N(0, 1). It is clear that assumption A 1 is satisfied only for Σ 1 . It is also clear from Table 9 that the achieved type I error rates are satisfactorily close to 5% regardless of the covariance matrix assumed.
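The random covariance matrices Σ i = Q i Λ i Q i can be generated as described; a sketch (`random_cov` is our helper name, not the paper's):

```python
import numpy as np

def random_cov(b, dist, rng):
    """Random covariance Sigma = Q Lambda Q': Lambda is diagonal with
    Uniform(0,1) entries and Q comes from the QR decomposition of a
    b x b matrix of iid draws from `dist`."""
    lam = rng.uniform(0.0, 1.0, size=b)
    Z = dist(size=(b, b))
    Q, _ = np.linalg.qr(Z)
    return Q @ np.diag(lam) @ Q.T

rng = np.random.default_rng(2024)
Sigma_normal = random_cov(50, rng.standard_normal, rng)  # Z ~ N(0,1) case
Sigma_exp = random_cov(50, rng.exponential, rng)         # Z ~ Exp(1) case
```

By construction the eigenvalues of each Σ i are exactly the Uniform(0,1) diagonal entries of Λ i , so the matrices are positive definite with probability one.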

Discussion and conclusion
The paper derives approximations for the null distributions and quantiles of some test statistics in repeated measures analysis. The approximations ensure that the errors are of order O(b −3/2 ), where b is the dimension, i.e. the number of repeated measures. Factorial designs are treated in a unified manner in which multiple between- and within-subject factors, which may be crossed or nested, are allowed. A general covariance structure is allowed in that no pre-determined ordering is assumed among the repeated measurements. Therefore, the repeated measurements could be over time or under different treatment conditions. The asymptotic results require some regularity conditions on the covariance matrix. Such assumptions appear to be inevitable as long as one prefers to work with an unstructured covariance matrix. Our observation from the simulations is that this assumption does not appear to restrict the utility of the results for application in more general situations. Nevertheless, we impose somewhat milder requirements compared to similar works (see, for example, Bai and Saranadasa, 1996; Takahashi and Shutoh, 2016). The paper also assumes proportional divergence of the sample size and dimension, i.e. n/b → γ 0 ∈ (0, ∞), but either one can be larger than the other. We should point out that this assumption can be relaxed to cover other cases, namely n = O(b τ ) for τ > 1 or τ < 1. However, the expanded cumulative distribution function may then have terms with order different from b −j/2 for j = 1, 2, . . ., in which case the standard Cornish-Fisher formula for the quantile will not apply, and non-standard expansions will need to be derived for the quantiles. Regardless, our impression from the simulations is that the effect of these terms is likely to be insignificant. This is an open problem that needs further investigation.
The development of the paper is under normality. We recommend assessing the plausibility of this assumption before applying the methods. Transformations that improve normality could also be attempted in the event that non-normality is detected or suspected. In the proofs, multivariate normality of the repeated measures is mostly needed for its property that uncorrelatedness implies independence and for the independence of certain linear and quadratic forms. Limiting distribution results under statistical models that include an independence-up-to-correlation assumption have been derived in Bai and Saranadasa (1996) and Chen and Qin (2010) for two-sample, and in Katayama et al. (2013) for multiple-sample, comparisons of mean vectors. In the interest of space, we have opted to relegate the investigation of these models, for limiting distributions as well as asymptotic expansions in the repeated measures context, to a follow-up manuscript.