Pruning and thresholding approach for methylation risk scores in multi-ancestry populations

ABSTRACT Recent efforts have focused on developing methylation risk scores (MRS), a weighted sum of the individual’s DNA methylation (DNAm) values of pre-selected CpG sites. Most of the current MRS approaches that utilize Epigenome-wide association studies (EWAS) summary statistics only include genome-wide significant CpG sites and do not consider co-methylation. New methods that relax the p-value threshold to include more CpG sites and account for the inter-correlation of DNAm might improve the predictive performance of MRS. We paired informed co-methylation pruning with P-value thresholding to generate pruning and thresholding (P+T) MRS and evaluated its performance among multi-ancestry populations. Through simulation studies and real data analyses, we demonstrated that pruning provides an improvement over simple thresholding methods for prediction of phenotypes. We demonstrated that European-derived summary statistics can be used to develop P+T MRS among other populations such as African populations. However, the prediction accuracy of P+T MRS may differ across multi-ancestry population due to environmental/cultural/social differences.


Introduction
DNA methylation (DNAm), one of the most studied epigenetic mechanisms, regulates the mode of expression of DNA segments independent of alterations of their sequence by adding a methyl group at cytosine residues, hence contributing to variation in cellular phenotypes [1] . With current advances in and reduction of cost of array-based profiling technologies, increasing numbers of large-scale epigenome-wide association studies (EWAS) have been conducted to study DNAm in association with complex human diseases as well as environmental and social factors [2,3] . EWAS have thus far been successful in identifying dozens of cytosine guanine dinucleotide sites (CpGs) associated with various diseases and exposures, which could potentially be used for disease diagnosis and prediction, development of drug targets, and monitoring of drug response [3][4][5][6][7][8] . However, differential DNAm in individual CpGs often shows a weak prediction capacity and can only explain a small fraction of phenotype variance. Polyepigenetic approaches that aggregate information on differential DNAm from multiple CpGs might produce a more accurate biomarker for clinical usage [ 9,10] .
A well-known polygenic approach for genotype data is polygenic risk scores (PRS), which are weighted sums of risk alleles of a pre-selected number of genetic variants [11] . Recently, many efforts have focused on transferring PRS approaches to DNA methylation data to construct methylation risk scores (MRS), which are defined as weighted sums of the individuals' DNAm values of a pre-selected number of CpGs [10] . However, there are many methodological challenges in constructing DNA methylation risk scores [10,12,13] . One of the problems is that DNAm is influenced by ancestry, which captures genetic ancestry (differences in the genome related to ancestry) as well as social determinants of health such as racism and discrimination, socioeconomic status, and environmental effects [14] . Thus, ideally, when external weights are used for the calculation of MRS, these weights should be assessed in a population with the same ancestry as the study samples. However, current epigenetic literature remains limited by the lack of diversity, with most focusing on European populations [15] , therefore making it difficult to identify appropriate weights for MRS for other populations. While it is well known that PRSs are not applicable across different ancestries [16,17] , little is known about the performance of MRS across multi-ancestry populations.
Currently, there are two popular approaches to construct MRS. The first one is to use penalized regression models such as Elastic Net and LASSO regularization [10,18,19] , which usually requires individual-level DNAm data. When only summary-level statistics are available, individual CpGs that reached genome-wide significance in an external EWAS are selected and the beta-coefficients estimated from EWAS are used as weights to generate MRS [10] . However, research in PRS has shown that the optimal p-value threshold strongly depends on the data [20] , and including a larger proportion of variants could potentially capture more of the phenotype variation [21] . Moreover, most MRS from the second approach do not consider DNA comethylation, defined as proximal CpGs with correlated DNAm across individuals [22] , which could potentially bias the generation of MRS. Shah et al 2015 proposed to remove redundant CpGs by keeping the most significant CpGs in co-methylation [23,24] , however, a window to define DNA co-methylation needs to be predefined and the effect of accounting for DNA co-methylation was not evaluated.
One of the most widely used PRS approaches to deal with single nucleotide polymorphisms (SNPs) in high linkage disequilibrium (LD) and to identify p-value thresholds with the best prediction accuracy is the pruning and threshold (P+T) method [25] . In the P+T approach, the correlation square (R 2 ) for SNPs within a close genetic distance is calculated and less significant SNPs that are correlated with an R 2 greater than a particular value (LD pruning) [26] are removed. Next, several p-value thresholds are tested to maximize the prediction accuracy of the derived PRS (p-value thresholding) [26,27] . Theoretically, the P + T approach could be applied to generate MRS, however, there is no standard procedure on how to conduct pruning for DNAm data and the performance of such MRS across multi-ancestry populations remains unknown.
Here, we propose to use the Co-Methylation with genomic CpG Background (CoMeBack) approach, a tool that uses a sliding window to estimate DNA co-methylation, to account for correlations of DNAm at proximal CpG sites [22] , and pair it with p-value thresholding to construct P+T CoMeBack MRS. CoMeBack uses unmeasured intermittent CpGs from the human reference genome to link array probes in hope of reducing false positives while improving the identification of biologically relevant co-methylation [22] . We conducted simulation studies based on data from an adult population consisting of three groups of different ancestries (Indian, White and Black, n = 1,199) to evaluate the prediction performance of P +T CoMeBack MRS and how it changes across multi-ancestry population. Next, we applied the P +T CoMeBack approach to DNAm data from the Drakenstein Child Health Study (n = 260) [28] , a multi-ancestry birth cohort from South Africa, to evaluate the performance of MRS for maternal smoking status. Our simulation study and real data application demonstrated that the P+T approach improves the predictive accuracy of MRS over methods that do not account for comethylation and has similar performance as LASSO regression, which requires access to the raw DNAm data. We also showed that MRS built upon the data from a population of one genetic ancestry could achieve high prediction performance among populations of other genetic ancestries, but the performance might differ in the presence of environmental/cultural/social differences associated with ancestry.

P+T CoMeBack approach for MRS
P+T CoMeBack method refers to the calculation of MRS using informed co-methylation pruning (P) with CoMeBack and P-value thresholding (T). First, summary statistics from an EWAS (typically include the effect size, standard error, and P-value of each CpG site) need to be estimated in an independent dataset (training dataset) to avoid overfitting, and then applied to generate MRS in a testing dataset (the samples used to evaluate the performance of MRS).
In our P+T CoMeBack method, comethylation pruning is completed by applying CoMeBack to DNAm data of the testing dataset or a reference panel [22] . Specifically, CoMeBack chains two adjacent array probes if the following requirements are met: 1. two probes are less than 2kb apart; 2. the reference human genome annotation shows a set of unmeasured genomic CpGs between them; 3. the density of unmeasured genomic CpGs between them is at least one CpG every 400bp. Chaining of adjacent array probes continues until an array probe does not meet the requirements, which will form a unit where multiple CpGs are chained together. Correlations between DNAm levels will then be calculated for all array probes inside each unit. If all pairs of adjacent probes in a unit have a correlation square (R 2 ) greater than 0.3, such unit will be declared as a comethylated region (CMR). Pruning is conducted by only keeping one CpG site per CMR in the dataset, the one with the lowest (most significant) P-value in the EWAS summary statistics.
P+T CoMeBack will be compared to the standard pruning approach, in which less significant CpGs that are correlated with an R 2 >0.3 and located within 2000bp of each other are being removed.
Next, P-value thresholding step (T) is performed for the pruned set of CpG sites. Specifically, the P-value thresholding step (T) is performed by applying different P-value thresholds (e.g., P-value thresholds ε [0.05, 0.005, 5 × 10 −4 , 5 × 10 −5 , . . .]) and only including those CpG sites in the final MRS calculation that reached a P-value below those thresholds in the EWAS summary statistics.
Finally, for each P-value threshold, MRS are calculated as a weighted sum of DNAm β values (β value = methylated allele intensity/(unmethylated allele intensity + methylated allele intensity +100), ranging from 0 representing unmethylation to 1 for complete methylation) of the selected CpGs, where the weights are the corresponding effect sizes for each CpG from the EWAS summary statistics. The squared correlation (R 2 ) between the phenotype of interest and MRS obtained using each P-value threshold is calculated to represent the prediction accuracy. The P-value threshold that produces MRS with the highest prediction accuracy in the testing data set is selected as the optimal P-value and the corresponding MRS is used for downstream analysis. The pipeline for generating P+T MRS (P+T CoMeBack as well as P+T with the standard pruning approach) is written in an R script, which is available at GitHub (https://github.com/jche453/ Pruning-Thresholding-MRS.git).
In our simulation studies and real data application, we compare the P+T CoMeBack MRS approach to the standard P+T and T approach, which refers to an approach in which the MRS is calculated by only thresholding, not accounting for correlations between included CpGs (no pruning).

Simulation studies
To validate the performance of the proposed P+T MRS approach, we conducted simulation studies based on whole blood Illumina Infinium Human Methylation 450K BeadChip data from an ethnically heterogeneous discovery cohort composed of several publicly available datasets (GSE55763, GSE84727, GSE80417, GSE111629, and GSE72680) [22] . Intra-dataset normalization and batch effects correction were performed using ComeBat function in R-package sva [29] , followed by merging of datasets and correction for inter-dataset batch effects using the same function. After removing XY chromosome binding, non-CpG, cross-hybridizing probes and probes that are in close distance (1bp) with common SNPs, there were 386,362 CpGs left for MRS analysis. We randomly selected 1,199 adults (898 Indians, 136 Blacks and 165 Whites) to conduct the simulation studies.
CoMeBack was applied to the DNA methylation β values of the 386,362 CpGs to obtain CMR. In each simulation, 10 of the 386,362 CpGs were randomly selected to be causal, k% (k = 30, 50, 70 or 100) of which are in a CMR with other CpGs. At most, one CpG would be causal in each CMR.
The causal CpGs were randomly assigned a 'true' effect size from a uniform distribution as w i ,U À 0:5; 0:5 ð Þ. We then simulated a phenotype for the j-th subject as follows: where m ij is the DNAm β value of causal CpG site i of the j-th subject, and ε j is an error term that follows a normal distribution. Differentδ 2 were set to ensure that the targeted variance of phenotype explained by DNAm alone equals 10%, 30% 50% or 80%.
We also simulated a second phenotype Y* j for the j-th subject, which was directly affected by ancestry using European ancestry as reference: Effect of ancestry in our simulations is simulated as the effect of genetic ancestry assuming there were no complex social determinants involved in the causal pathway. Different δ �2 were used so that the variance of phenotype that was explained by DNAm and ancestry together equals 20%, 50% or 80. For our simulations, effect a was set to 0.1 and b to 0.2. In each simulation, both simulated phenotypes Y j and Y � j share the same epigenetic liability ( P 10 i¼1 w i m ij ). In each simulation, for fair comparison, 762 Indians were randomly chosen as the training dataset so that there were at least 136 people left for each race group in the testing dataset. Associations between CpGs and each of the two simulated phenotypes were assessed by robust linear regression model using limma R package [30] in the training dataset. We calculated top 10 principal components (PCs) from DNAm of 386,362 CpGs [31] and used EpiDISH to estimate cell type proportions of each CpGs [32] . We observed that in our simulation dataset, top 10 PCs are highly correlated with cell type proportions (Supplement Figure 1a), and using either summary statistics adjusted for top 10 PCs or summary statistics adjusted for cell type proportions would lead to almost identical prediction performance of MRS (Supplement Figure 1b). Thus, to account for population stratification and cell type difference, we adjusted for the top 10 PCs in our main analyses. The summary statistics (effect size and P-values) obtained from association tests in the training data were saved and later used to construct MRS in the testing dataset. We repeated 1000 simulations per scenario to evaluate the prediction accuracy (R 2 ), statistical power, and type 1 error rate of the P+T MRS. Linear regression analysis was used to access the association between MRS and simulated phenotypes in the test data, and statistical power is defined as the proportion of simulations where MRS were significantly associated with the simulated phenotype with at α level of 0.05. To estimate type 1 error rate, we first obtained a null association between MRS values and simulated phenotype values by permutation of MRS values. Linear regression analysis was used to access the association between permutated MRS and simulated phenotypes, and type 1 error rate is defined as the proportion of simulations where permutated MRS were significantly associated with the simulated phenotype.
We evaluated the performance of the MRS not only in scenarios of A) same ancestry in training and test data, but also B) across different ancestry groups (training data: Indian, test data: European or African) and C) in multi-ancestry populations (training data: Indian, test data: Indian, European, and African). For scenario C), we evaluated two analysis strategies: 1. Joint-analysis: MRS analyses in the whole testing dataset where subjects from all racial groups were merged; 2. Standardization: scale MRS to have a standard normal distribution within each racial group before merging all subjects for analyses.

Application study of smoking MRS
To evaluate the performance of the P+T CoMeBack approach in a real data setting, we applied the P+T CoMeBack approach to calculate a MRS for maternal smoking status during pregnancy using cord blood DNAm data from newborns in the South African Drakenstein Child Health Study (DCHS), a multi-ancestry longitudinal study investigating determinants of early child development [33] . There were 145 Black African infants and 115 Mixed ancestry infants in the DCHS. A detailed description of the enrolment process, inclusion criteria, variables measurement, and ethical approval of the study have been previously published [33,34] .
Cotinine levels were measured in urine provided by mothers within four weeks of enrolment and classified as <499 ng/ml (nonsmoker), or ≥500 ng/ml (active smoker) [28] . Cord blood was collected at time of delivery and used to measure DNA methylation by either MethylationEPIC BeadChips (EPIC, n = 145) or the Illumina Infinium HumanMethylation450 BeadChips (450K, n = 103) [33,34] , followed by quality control and normalization to calculate β values (details have been published elsewhere) [35] .
Summary statistics for the calculation of MRS were obtained from a study that meta-analysed the associations between newborn blood DNA methylation and sustained maternal smoking during pregnancy among 5,648 mother-child pairs as part of the Pregnancy and Childhood Epigenetics (PACE) Consortium (Table 1) [36] . The participants of all cohorts used in the meta-analysis except one were of European ancestry.
In addition, we compared P+T CoMeBack MRS to three previously published MRS for maternal smoking during pregnancy (Reese MRS, Richmond [37] . Richmond 19 MRS is a weighted sum of DNAm β values of 19 CpGs that were significantly associated with prenatal smoking in an EWAS conducted in peripheral blood from children of averaged 6.8 years age ( Richmond 19 MRS) [38,39] . In the same study, Richmond 568 MRS was proposed based on 568 CpGs that were significantly associated with prenatal smoking in cord blood [38,39] . We obtained the weights of reported CpGs from the mentioned studies and applied them to DNAm data in the DCHS to generate Reese MRS, Richmond 19 MRS and Richmond 568 MRS.
Linear regressions were used to assess the associations between maternal smoking status and each MRS, using maternal smoking as the independent variable and MRS as the dependent variable. Each model was adjusted for ancestry (in pooled samples), cell type proportions and top 5 PCs calculated from genotypes. In order to obtain comparable beta-coefficients and standard errors across different MRS, each MRS was divided by their interquartile range (IQR) before linear regression analysis.

Simulation results
We compared the prediction performance of P+T CoMeBack MRS to the T method among 136 Indians in the test data across different simulation scenarios (Figure 1). Figure 1a shows that P+T CoMeBack MRS that account for co-methylation between CpGs have stable prediction performance when proportion of causal CpGs located in a CMR (k%) varies. P+T without CoMeBack had similar prediction performance while the T method had a slightly lower prediction performance. While the P+T CoMeBack MRS showed subtle improvement over T method when proportions of phenotype variance explained by DNA methylation is 80% (Figure 1a), the difference between P+T CoMeBack MRS and the T method decreases as the proportions of phenotype variance explained by DNA methylation decreases. This is likely because as variance explained by DNAm decreases, there is less power for association testing, and it becomes increasingly difficult to distinguish real signals from statistical noise while generating the summary statistics.
Next, we assessed the performance of P+T CoMeBack, P+T, and T method across different ancestries and among multi-ancestry populations. All three methods achieved a high power (>95%) and a low type 1 error rate (~5%) within each ancestry in most scenarios for both phenotypes except when the phenotype variance explained by DNA methylation is 10% or 30% (Supplement Table 1

-4).
Whether the simulated phenotypes were independent of ancestry or not, MRS among Whites and Blacks achieved a prediction R 2 as high as among Indians, which should have the best prediction of the simulated phenotypes since weights were obtained from Indian training samples (Figure 2). Findings were similar when the phenotype variance explained by DNA methylation was reduced from 80% to 10%, 30% or 50% (Supplement Figure 2). When the phenotypes are not associated with ancestry (Figure 2a), the three MRS analyses strategies (stratification, joint analysis, and standardization) lead to nearly identical results. However, when the phenotypes are ancestrydependent, both joint analysis and standardization of MRS showed very poor prediction of the phenotypes (Figure 2b).  [37] White newborns (N = 1,068)* Sustained smoking during pregnancy obtained from combined information of cotinine-based and self-report based classification Reese et al. 2017 [37] Logistic LASSO regression 28 Richmond 568 MRS Joubert et al. 2016 [38] Multi-ethnic newborns (N = 6,685)*

Maternal smoking during pregnancy via questionnaires
Richmond et al. 2018 [39] Robust linear regression; Bonferroni corrected P-value<0.05 568 Richmond 19 MRS Joubert et al. 2016 [38] Multi-ethnic older children (average age = 6.  Figure 3 shows the prediction performance of MRS for maternal smoking status among DCHS newborns. As the p-value threshold decreases, the prediction accuracy of the resulting MRS increases before reaching a plateau, demonstrating the importance of P-value thresholding in MRS to control for noise. Among mixed ancestry newborns, P+T CoMeBack MRS of smoking status excluded 22 CpGs in pruning and achieved a prediction R 2 of 29.5% using P-value threshold of 5 × 10 −22 , while the best standard P+T without CoMeBack had a lower prediction R 2 of 26.2% using P-value threshold of 5 × 10 −10 and the best T method MRS had the lowest prediction accuracy (24.5%) using P-value threshold 5 × 10 −9 , confirming the benefits of pruning in MRS calculation ( Figure 3a). All three MRS had lower prediction performance for maternal smoking among Black African infants (10.9%, and 8.0% respectively) (Figure 3b), which is likely due to the low prevalence of smokers among mothers of Black African infants in DCHS (13%) compared to mothers of mixed ancestry infants (49%) (Supplement Figure 3). Additionally, the distributions of all MRS in Black African infants and mixed ancestry infants were similar within each category of maternal smoking status (Supplement Figure 4), confirming that the difference of prediction R 2 between Mixed and Black infants is less likely due to ancestry-related factors other than prevalence of maternal smoking. Jointanalysis of P+T CoMeBack MRS showed a prediction accuracy of 20.4%, which is between the prediction accuracy of P+T CoMeBack MRS among Black African infants and mixed ancestry infants (Figure 3c). Standardization approach did not improve the performance of MRS (Figure 3d). We next compared the prediction accuracy R 2 and distribution of P+T CoMeBack MRS to other established MRS for maternal smoking during pregnancy and newborn DNAm (Figure 4). Overall, P+T CoMeBack and Reese MRS had stable and similar classification performance in all analyses compared to other MRS. P+T CoMeBack MRS and Reese MRS showed a similar prediction R 2 among both Black and Mixed ancestry infants, which were better than other smoking MRS ( Figure 4A). P+T CoMeBack MRS had the largest AUC (0.820) in the ROC curve among mixed infants (Figure 4b) but a smaller AUC than Reese MRS in Black infants and joint-analysis (Figure 4c-d). Further, all 6 MRS showed significant association with smoking status in Mixed infants, Black infants and jointanalysis (Table 2), showing the promise of using MRS to capture the overall DNAm signals in association testing.

Discussion
Based on the well-established P+T framework in PRS, we developed P+T CoMeBack MRS, which aggregates EWAS signals into a MRS [44,40,41] . The proposed P+T CoMeBack MRS approach uses CoMeBack for co-methylation pruning and evaluates multiple P-value thresholds to maximize prediction performance. Such MRS could potentially serve as a powerful dimension reduction approach for mediation and multi-omics integration analyses [44 , 40,41,42,43] as well as biomarkers of individual disease risk in a clinical setting [44,45,46] .
Overall, our simulation studies demonstrated good performance of P+T CoMeBack MRS for predicting phenotypes of interest with good statistical power and well-controlled type 1 error. We demonstrated that the prediction accuracy of MRS reflects the variance of phenotype that is explained by DNAm. By accounting for inter-correlation between CpGs, P+T CoMeBack MRS and P+T without CoMeBack showed a slightly better performance than the standard T method. In the real data application, we observed the best prediction of maternal smoking status when using P+T CoMeBack, which confirms the usefulness of accounting for comethylation and demonstrates the ability of CoMeBack to control for false discover of CMR and usefulness in constructing MRS [22] . However, we note that P+T CoMeBack MRS could still have poor prediction performance if the external EWAS is underpowered or subject to bias.
In the prediction of maternal smoking status, P +T CoMeBack MRS showed comparable performance to Reese MRS, which was derived using the LASSO method [37] . When predictors are highly correlated, LASSO typically selects one of the correlated predictors and shrinks the effect size of the rest to zero, which might produce similar results to our pruning procedure in developing MRS. One of the advantages of P+T CoMeBack MRS is that it is based on EWAS summary statistics which are often publicly available, hence making it a valuable approach, as it is often difficult to obtain individual DNAm data from an external cohort. Additionally, P+T CoMeBack MRS can make use of meta-analysis-type summary statistics, which aggregates results from multiple studies to improve association estimates. In contrast, to construct MRS like Reese MRS, individual DNAm data are usually required to perform a LASSO regression, and these are often not accessible. Furthermore, approaches like LASSO regression require the data to be split into training and test data to avoid overfitting. Consequently, larger sample sizes are needed to develop MRS using LASSO regression than when using publicly available EWAS summary statistics. Recently, novel penalized regressions have been proposed to generate PRS with only GWAS summary statistics and publicly available reference data [47] , but their applications to EWAS summary statistics for MRS have not been investigated. To develop MRS for different exposures and outcomes, we urge EWAS studies to make their epigenomewide summary statistics publicly available.
In our simulation studies, weights obtained from Indian training samples were applied to generate P +T CoMeBack MRS, thus MRS among Indian testing samples were assumed to have the best prediction of the simulated phenotypes. However, MRS among Whites and Blacks also achieved a prediction accuracy as high as among Indians for both simulated phenotypes suggesting that genetic ancestry does not contribute to difference in prediction abilities of MRS across multi-ancestry population. This is likely because we assumed all ancestries share the same causal CpGs and effect sizes. However, in the real world, this assumption could possibly be violated for many phenotypes. Unlike ancestry in our simulation studies, ancestry in the real world is complex. The meaning of ancestry could be different in different regions/nations, and 'effect of ancestry' involves the joint effects of ancestry-associated social determinants of health and environmental effects, and cultural context [48] . Ancestry, along with environment and social differences associated with it, could affect both MRS and phenotypes in numerous causal pathways and potentially modify the effect of MRS on the phenotypes. Thus, even if all ancestries indeed share the same causal CpGs and effect sizes, it might still not be sufficient to disentangle the relationship between ancestry, DNAm and phenotype of interest. This may greatly impact the transferability of MRS across different ancestries, which could be the reason why we observed an inconsistency of performance of P+T CoMeBack MRS in terms of their distributions and predictions across multi-ancestry population in the real data analyses. In practice, we recommend that researchers conduct MRS analyses stratified by ancestry first and evaluate the effect of ancestry on MRS analyses before pooling participants together for a joint analysis. In our real data application, summary statistics for smoking were obtained from a cohort with mainly people of European ancestry [36] . MRS of smoking among mixed ancestry infants achieved a prediction accuracy of nearly 30%. However, the prediction accuracy of P+T CoMeBack MRS among Black African infants was only 10.9%. We suspect that the difference was largely due to the prevalence of active smoking among mothers of Black African infants being lower than those of mixed ancestry infants (13% vs 49%), which is similar to how the prevalence of outcome affects the predictive ability of PRS [49] .
To the best of our knowledge, this is the first study to propose using CoMeBack for pruning MRS among multi-ancestry populations. However, there are several potential limitations that warrant mention. First, in the simulation dataset, ancestry information was self-reported and it may not fully represent the genetic ancestry estimated from genetic data. Second, the sample size of the simulation testing dataset and real data analyses was relatively small. Third, lack of different ancestry-specific summary statistics made it impossible to compare the use of external weights from population of different ancestries (e.g., European ancestry vs other ancestries). Fourth, the prevalence of active maternal smoking was different in different ancestries and has influenced the performance of P+T CoMeBack MRS in the real data application. As a result, real data analysis of smoking MRS could not provide firm evidence about the transferability of MRS between Black African and mixed ancestry infants. Last, we mainly focused on the prediction performance of P+T CoMeBack MRS. Further studies are needed to assess the performance of P+T CoMeBack MRS in mediation analysis.
In conclusion, P+T in general and P+T using CoMeBack in particular, provides an improvement for prediction of phenotype of interest, over T method that does not account for co-methylation between CpGs. In contrast to PRS, using existing summary statistics that were derived from European populations can be used to calculate MRS in other ancestries, thus reducing the ancestry/ethnicity disparity in medical research. However, caution is needed in the analyses and interpretation of MRS results across multi-ancestry populations, especially due to environmental/cultural/social differences associated with ancestry. More investigations of MRS are urged to further improve their prediction accuracy and translational values, also in combination with other clinical and non-clinical variables, especially among multi-ancestry population. With the current increase of large consortia-led EWAS for different exposures and health outcomes (e.g., the PACE consortium), we believe the predictive performance of MRS will continue to increase, and the P+T CoMeBack method has the potential to be widely used for risk prediction and association testing.