Correlation Coefficients for a Study with Repeated Measures

Repeated measures are increasingly collected in a study to investigate the trajectory of measures over time. One of the first research questions is to determine the correlation between two measures. The following five methods for correlation calculation are compared: (1) Pearson correlation; (2) correlation of subject means; (3) partial correlation for subject effect; (4) partial correlation for visit effect; and (5) a mixed model approach. Pearson correlation coefficient is traditionally used in a cross-sectional study. Pearson correlation is close to the correlations computed from mixed-effects models that consider the correlation structure, but Pearson correlation may not be theoretically appropriate in a repeated-measure study as it ignores the correlation of the outcomes from multiple visits within the same subject. We compare these methods with regard to the average of correlation and the mean squared error. In general, correlation under the mixed-effects model with the compound symmetric structure is recommended as its correlation is close to the nominal level with small mean square error.


Introduction
Repeated-measure designs are increasingly used in practice to evaluate the trajectory of measures. e Alzheimer's Disease Neuroimaging Initiative (ADNI) study is a longitudinal study to investigate the progression of Alzheimer's disease (AD) [1,2]. is study evaluates the normal cognitive aging with the focus on mild cognitive impairment (MCI) and early AD. Brain structure and function are two research areas of interest in the ADNI study. As expected, brain structure volumes are often highly associated with results from cognitive tests [3][4][5]. In a longitudinal study, correlation for repeated measures should be calculated and reported. However, recent articles still only reported the Pearson correlation coefficient that ignores the correlation of outcomes from the same subject. For these reasons, it is important to compare the existing correlations for repeated measures and make recommendations for other researchers to use.
Bland and Altman [6,7] discussed several approaches to compute correlations for repeated measures. ey proposed calculating subject means to compute the Pearson correlation, where subject means eliminate the correlation of outcomes from the same subject. e second approach is to fit a linear regression model with one measure as the dependent variable and the other measure and the subject as the predictor variables. e second approach is similar to the one proposed by Christensen [8] who suggested computing correlation after adjusting for the subject effect [9][10][11][12]. In a repeated-measure study, the visit effect is the correlation within the subject. Lipsitz et al. [13] proposed computing partial correlation adjusting the visit effect. When data are correlated, mixed-effects models may be utilized to analyze data while controlling for these additional correlations. Lam et al. [14] were among the first to propose computing correlation between repeated measures under the compound symmetric (CS) correlation structure. Later, Hamlett et al. [15] developed programs to compute correlation under the CS structure by using the commercially available statistical software, SAS. In the work by Lam et al. [14], they also computed the correlation under the autoregressive correlation structure, AR (1). After that, Roy [16] developed SAS macros to compute correlation under the AR(1) structure and compared the correlations for repeated measures under these two correlation structures with limited simulation studies. e objective of this manuscript is to conduct extensive simulation studies to compare the existing correlations for repeated measures with regard to the average of correlation and the mean squared error (MSE) and identify the correlation method that has the best performance to be used in practice. In addition to the parameter of interest (correlation for repeated measures), there are several nuisance parameters in the variance-covariance matrix: variances, correlations within each outcome, and correlation between outcomes from different visits [17][18][19][20]. It is computationally intensive for these comparisons. We have to use supercomputers for simulation studies. However, it is computationally feasible to calculate correlations for an observed data set. We use one example from the ADNI study to illustrate the application of the considered methods to calculate correlation between hippocampal volumes and a neuropsychological assessment to evaluate verbal memory.
We organize this article as follows. In Section 2, we introduce the existing methods to calculate correlations for repeated measures. In Section 3, we conduct extensive Monte Carlo simulation studies to compare the performance of the considered correlations with regard to the average of correlation and the MSE. A real example from the ADNI study is then used to illustrate the application of these correlations. Lastly, we provide conclusions in Section 4 on computing correlation for repeated measures when heterogeneity of correlation is observed across visits.

Methods
For a repeated-measure study with n participants, each participant has several scheduled visits (m i visits for the i-th subject). Suppose U and W are the two measures in a repeated-measure study and U ij and W ij are the outcomes of the i-th subject at the j-th visit, where i � 1, 2, . . ., n and j � 1, 2, . . ., m i . e correlation between U and W, ρ UW , is the parameter of interest to quantify a relationship between them. Several methods have been proposed to calculate ρ UW , including independence models, partial correlation models, and mixed-effects models. [6,7] were among the first to provide methods to compute longitudinal correlation coefficient. One of their approaches assumes the independence between outcomes from the same subject: U ij ⊥ U ij′ and W ij ⊥W ij ′ . e longitudinal correlation ρ UW is computed as the Pearson correlation by ignoring the correlation structure from repeated measures. is approach is referred to as the I approach, with the computed correlation as ρ I . is is a naive approach that is easy to apply. Irimata and Li [21] found that ρ I for a pharmacokinetics data set is very close to other correlations computed from other complicated models.

Subject Means.
As suggested by Bland and Altman [6], the correlation can be computed by using the averages at the subject level to eliminate the subject effect in repeated measures. is correlation is able to address the research question whether the average of one measure is related to the average of another. When correlation within each measure is large, ρ UW at different visits should be similar to each other, and this average correlation model would have good performance. We refer to this correlation approach as the M approach with the notation of ρ M . ese two correlations for repeated measures, ρ I and ρ M , are the Pearson correlation and can be computed by using many statistical software: such as the Proc corr procedure in SAS and the function cor or cor.test in R [22]. e next five correlations are computed from regression models (e.g., mixed-effects models), and we would like to suggest using SAS Proc mixed procedure for implementation. Detailed SAS programs are provided in the Appendix.

Correlation Adjusting for the Subject Effect.
Christensen [8] proposed computing correlation for repeated measures by partialling out the subject effect. e subject effect can be removed from the two measures by fitting a multivariate regression model with both measures being the outcomes and the subject ID as the only covariate. e residuals are used to compute the final correlation, which is essentially a partial correlation method for repeated data. is correlation is referred to as the PS correlation that partials out the subject effect, ρ PS .

Correlation Adjusting for the Visit Effect.
In the ρ PS calculation, the correlation between the two measures is included in the multivariate model. In addition to that correlation, another correlation between measures at different visits may be considered. Lipsitz et al. [13] proposed computing partial correlation between outcome and one of the covariates by using this approach. When one of the two measures (e.g., measure U) is considered as the dependent variable, the other measure (W) is considered as the covariate. e correlation structure between visits is assumed to be compound symmetric. We refer this correlation as the ρ PVa correlation. We use ρ PVb for another correlation when W is considered as the dependent variable in the model. One of the properties for correlation is ρ UW � ρ WU , but this property is not met here: ρ PVa is generally not equal to ρ PVb .

Mixed-Effects
be the outcomes from the i-th subject, with the vector length of 2m i . e complete data can be reorganized in a long format, with the columns subject ID, visit, mtype, and outcome, where mtype � "U" for the U measure and mtype � "W" for the W measure. e long format utilizes 2m i rows for the outcomes from Y i . e linear mixed-effects model is presented as where X i and Z i are the design matrices for the fixed effect and the random effect, respectively. e random effect b i follows a multivariate normal distribution N (0, D), and the measurement error ϵ i follows a multivariate normal distribution N (0, R i ). e detailed formula for D and R i may be found in the article by Hamlett et al. [15]. e fixed effect is β � (β 0 , β U , β W )′, where β 0 is the intercept, and β U and β W are the fixed effects of U and W, respectively. Correlation between U and W is computed as which is assumed to be independent of the visit. Each subject has multiple visits, correlation within U is , and the correlation within W is Since W ij is correlated with both U ij and W ij′ , therefore, U ij and W ij′ are correlated and their correlation is assumed to be δρUW, where δ is a factor which is generally less than 1. Let σ 2 U and σ 2 W be the variances of U and W, respectively. ese variances and covariances are used to derive the variancecovariance matrix under the CS structure (see Lam et al. [14] and Hamlett et al. [15]) and that under the AR(1) structure (see Lam et al. [14] and Roy [16]).

Results
We conduct simulation studies to compare the performance of the considered 7 methods for the correlation between repeated measures for a study with four visits. e mean values of U and W are assumed to be (2.0, 1.9, 1.7, 1.4) and (0.8, 0.7, 0.6, 0.5), with both measures decreasing as time goes. Such data are commonly available from cognitive tests on elderly population and other studies. e prespecified correlation for repeated measures is ρ UW � 0.2, 0.5, and 0.8.
In the simulation studies for the AR(1) structure for the visit effect, the correlation within U is Corr(U ij , U ij′ ) � ρ |j−j ′ | U , with ρ U � 0.2, 0.5, and 0.8, and the correlation within W is Corr(W ij , W ij′ ) � ρ |j−j ′ | W , with ρ W � 0.2, 0.5, and 0.8. e factor δ in the correlation between U ij and W ij′ is assumed to be 0.6 in all simulations. e considered variances are σ 2 U � 1 and 3 and σ 2 W � 0.5 and 1. e variance-covariance matrix can be separated into two parts: Z i DZ i ′ and R i . We assume that a quarter of variance is from R i and the remaining is from Z i DZ i ′ . is weight is needed in order to calculate the covariances. For each configuration, we simulate B � 2,000 data sets.
Under the AR(1) structure for the visit effect, Figure 1 presents the average of correlation ρ UW and the MSE when ρ UW � 0.2, σ 2 U � 1, and n � 60 subjects. e MSE is defined as where ρ UW (b) is the estimator of ρ UW by using the b-th simulated data set. It can be seen that the correlations adjusting the visit effect, ρ PVa and ρ PVb , often underestimate the correlation, while the correlation adjusting the subject effect, ρ PS , always overestimate the correlation. e remaining methods have correlations close to the nominal level. Although ρ M is the best with the correlation around the nominal level, its MSE is much larger than the ones that have the correlations close to the nominal level. In the calculation of ρ M , each subject only has one outcome for each measure, as compared to multiple outcomes in other correlation calculations. Due to the reduced number of outcomes, the variance of ρ M is much large that leads to a large MSE. It is noted that ρ PVa or ρ PVb could have the lowest MSE in some cases, but their estimated correlations are generally much below the nominal level. For this reason, we exclude ρ PVa and ρ PVb in the following simulation studies. When a study has the same number of visits for each subject, the estimated correlation by using the mixed-effects model with the CS structure, ρ CS , is very similar to ρ I under the independent assumption. e other mixed-effects model correlation ρ AR has a similar correlation as ρ CS and ρ I . e MSE of ρ AR is slightly smaller than the MSEs of ρ CS and ρ I when the correlations within U or W are small, and this trend is reversed when ρ U and ρ W are large. Similar results are observed when σ 2 U is increased to 3. When ρ U is increased to 0.5 (the top plot in Figure 2), the averages of ρ I , ρ CS , and ρ AR are generally above the nominal level, and the first two correlations are closer to the nominal level as compared to the third correlation ρ AR . We also present the correlation estimates when sample size n is 100 in Figure 2. It can be seen that the MSEs become smaller as compared to the MSEs in the top plot ( Figure 2) when sample size is 60. Figure 3 shows the results when data sets are simulated under the CS structure given ρ U � 0.5, σ 2 U � 1, and n � 60. Correlation ρ PS does not perform well with the average correlations much below the nominal level in many configurations. We also found that ρ M is likely to overestimate the correlation. It seems that ρ M and ρ PS have different trajectories as ρ W increases. Both of these methods do not have satisfactory performance with regard to correlation under the CS structure, although ρ PS has very good correlation estimates under the AR(1) structure. e other three correlations (ρ I , ρ CS , and ρ AR ) have similar good performance with regard to correlation and the MSE. It should be noted that the variance-covariance matrix is not positively defined when ρ U � ρ W � 0.8. erefore, data sets cannot be generated for that configuration. We also simulate data under the unstructured correlation structure and found that ρ I , ρ CS , and ρ AR are still the best correlation estimates. e aforementioned simulations have data sets that each subject has the same number of visits. In practice, it is possible that the number of visits may not be exactly the same for all subjects. We assume the number of visits is either 2, 3, or 4. Each subject is randomly assigned to have 2, 3, or 4 visits with the same probability. We present the results with n � 60 in Figure 4 when variances are small (σ 2 U � 1 and σ 2 W � 0.5 and 1) and large (σ 2 U � 20 and σ 2 W � 10 and 30). e MSE of ρ CS is slightly smaller than that of ρ I , and their biggest difference occurs when both ρ U and ρ W are large. ρ AR is more likely to overestimate the correlation. Although ρ M has the correlation very close to the nominal level, it has the largest MSE as compared to other correlations. When variance is large, ρ I and ρ CS are the best correlations with the estimated correlations much closer to the Computational and Mathematical Methods in Medicine 3 nominal level as to the configurations with small variances. e mixed-effects model correlation ρ CS performs slightly better than ρ I with regard to the average of correlation and the MSE.

Example.
We use one data set from the ADNI study to illustrate the application of the considered correlation methods, with 47 participants who had 5-year visits and completed imaging volumes and memory scores. Hippocampal volumes are found to be highly associated with the delayed recall scores from the Rey Auditory Verbal Learning Test (RAVLT delayed recall) [23]. e RAVLT delayed recall has the possible integer score from 0 to 15, which is often used to assess verbal memory. e higher the score is, the better the memory is.  Table 1. Participants in this data set have the same number of visits. For this reason, ρ I is very similar to ρ CS . ρ M is slightly larger than them, and ρ AR is smaller than them. Correlation adjusted by the subject effect ρ PS is much smaller than ρ CS . Correlations adjusted by the visit effect highly depend on which variable is considered as the dependent variable in the linear regression model. When hippocampal volumes are used as the dependent variable, the estimated correlation is high (0.686), and it becomes too low (0.016) when RAVLT delayed recalls are considered as the dependent variable.
It was reported by Wang et al. [23] that the Pearson correlation ρ I between hippocampal volumes and RAVLT delayed recall scores is slightly above 0.4. ey also provided the Pearson correlations for each group (AD, MCI, and control) which are all below the correlation using combined samples. e correlation within the control group is the lowest.   Table 1, RAVLT delayed recall scores always have a larger correlation with left hippocampal volumes than the correlation with right hippocampal volumes for each correlation method. We also add RAVLT immediate recall scores to further illustrate the application of the considered methods. Its correlation with left hippocampal volumes is often larger than its correlation with right hippocampal volumes. e estimated ρ CS between hippocampal volumes and RAVLT delayed recalls is larger than that between hippocampal volumes and RAVLT immediate recalls.

Conclusions
From the simulation studies, ρ I under the independence assumption and ρ CS using the mixed-effects model with CS variance-covariance structure are shown to have similar correlation estimates when subjects have the same number of visits. But, ρ CS is appropriate as it models the data properly.
e mixed-effects model correlation ρ CS is recommended for use as its correlation is close to the nominal level with small mean square error.

Discussions
Lam et al. [14] derived the detailed variance and covariance. e variances σ 2 U and σ 2 W and covariance σ UW are used to calculate ρ UW . ese variances and covariance estimates are not exactly the same from the independent model and the mixed-effects model with the CS structure: σ 2 W � 16.6846 in the ρ I calculation and 16.6136 from the CS model. Because these estimated variances and covariance are very close between these two methods, the final estimated correlations are very similar. When a study has different number of follow-up for each participant, ρ I and ρ CS differ from each other [18,[24][25][26]. For a study with some possible outliers as seen in the data testing association between pH and PaCO 2 [6], their difference is substantial. We provide the SAS programs by using that example in the Appendix.
When CS or AR(1) correlation structure for the visit effect is applied in the mixed-effects models [10,25,27], the computed correlation is the same at different visits. In the observation of the heterogeneity of correlations at different visits, the unstructured correlation may be considered for the visit effect. Under the heterogeneity assumption, correlation can be computed at each visit from a mixed-effects model [28][29][30].
is brings some challenges to explain the results, such as the overall correlation and the trend of correlation. Alternatively, when a study has a monotonic relationship between correlation and visit, one may include an additional predictor: visit, in the statistical model, to calculate a monotonic correlation for repeated measures.
Data Availability e data used in preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (http://adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at http://adni. loni.usc.edu/wp-content/uploads/how_to_apply/ ADNI_Acknowledgement_List.pd.

Conflicts of Interest
e authors declare that there are no conflicts of interest regarding the publication of this paper.