Identification of novel functional CpG-SNPs associated with Type 2 diabetes and birth weight

Genome-wide association studies (GWASs) have identified hundreds of genetic loci for type 2 diabetes (T2D) and birth weight (BW); however, a large proportion of the total trait heritability remains unexplained. The previous studies were generally focused on individual traits and largely failed to identify the majority of the variants that play key functional roles in the etiology of the disease. Here, we aim to identify novel functional loci for T2D, BW and the pleiotropic variants shared between them by performing a targeted conditional false discovery rate (cFDR) analysis that integrates two independent GWASs with summary statistics for T2D (n = 26,676 cases and 132,532 controls) and BW (n = 153,781) which entails greater statistical power than individual trait analyses. In this analysis, we considered CpG-SNPs, which are SNPs that may influence DNA methylation status, and are therefore considered to be functionally important. We identified 103 novel CpG-SNPs for T2D, 182 novel CpG-SNPs for BW (cFDR < 0.05), and 52 novel pleiotropic loci for both (conjunction cFDR [ccFDR] < 0.05). Among the identified novel CpG-SNPs, 33 were annotated as methylation quantitative trait loci (meQTLs) in whole blood, and 145 displayed at least some effects on meQTL, metabolic QTL (metaQTL), and/or expression QTL (eQTL). These findings may provide further insights into the shared biological mechanisms and functional genetic determinants that overlap between T2D and BW, thereby providing novel potential targets for treatment/intervention development.


INTRODUCTION
Type 2 diabetes (T2D) is a common chronic metabolic disorder characterized by hyperglycemia, insulin resistance, and impaired insulin secretion [1] It is estimated that in 2017 there were 451 million people suffering from diabetes worldwide [2] 90% of which were classified as T2D [3] Due to the reduced quality of life, increased mortality, and a significant burden on the healthcare system, T2D represents a severe global public health concern. Therefore, it is imperative to gain a better understanding of the pathophysiological mechanisms AGING involved in the onset of T2D for the enhanced development of intervention/treatment strategies.
Birth weight (BW) is a clinical indicator of a variety of metabolic conditions that manifest with age. Studies have demonstrated that low BW is associated with the increased risk of developing T2D [4,5]. The concept of "developmental programming" holds that events occurring during the early development of an individual and specifically during intrauterine life have profound consequences on future disease such as diabetes [6,7]. Both T2D and BW are believed to be highly influenced by genetic factors with heritability estimates of over 50% and 37%, respectively [8,9]. Additionally, the phenotypic correlation between T2D and BW suggests that these traits may share overlapping genetic determinants [10]. Several studies have also proposed the genetic correlation between BW and T2D. For instance, a Mendelian randomization study demonstrated the genetic effects on retarded fetal growth and increased diabetes risk by showing that lower birth weight was associated with increased risk of T2D and higher fasting glucose concentration [11]. Our previous study also identified several loci that associated with both T2D and BW [12]. Therefore, studying genetic relationships between BW and T2D could yield insights into the genetic regulation of T2D risk during the fetal stage.
Hundreds of single nucleotide polymorphisms (SNPs) associated with T2D or BW have been identified by genome-wide association studies (GWASs) [13,14]. However, these SNPs can only explain a small proportion of the total heritability for T2D (~10%) [15] and BW (~25%) [16]. The large majority of the missing heritability is likely attributed to the well-documented limitations of the single-trait GWAS analysis [17]. Due to the polygenic architecture of most complex traits, many SNPs have associations too weak to be identified with the relatively small sample sizes of current GWASs [18]. Therefore, it is essential to employ statistical approaches that can increase the effective sample size by incorporating more information embedded in the existing univariate datasets. For example, by incorporating the pleiotropic effects among correlated traits, it may be possible to augment the sample sizes of standard GWAS for individual traits and identify more trait-associated loci that would otherwise be missed.
Previous studies suggest that epigenetic mechanisms, which are a crucial link between the genetic factors and environmental exposures [19], may also account for some of the missing heritability [20]. DNA methylation occurs mainly at the fifth position of the cytosine ring in CpG dinucleotides [21], and SNPs that are associated with methylation status are commonly referred to as CpG-SNPs. Although it was previously believed that methylation of the promoter region is responsible for transcriptional silencing, emerging evidence suggests that DNA methylation is closely related to the expression across all genomic regions [19]. DNA methylation is the most well-explored epigenetic mark and is also involved in T2D pathogenesis [22]. For instance, Ma et al. [23] identified 30 CpGs representing the whole blood DNA methylation signatures that are associated with cardiovascular disease risk factors and all-cause mortality. Our previous work also proposed that peripheral bloodderived meQTL loci were related to the increasing risk of diabetes and coronary artery disease [23,24]. Focusing on these CpG-SNPs may be an effective strategy to improve the detection of novel potential functional variants associated with T2D and BW.
A genetic-pleiotropy-informed conditional false discovery rate (cFDR) method was developed to improve the gene discovery in complex traits by integrating two independent GWASs for related traits [25]. A major advantage of this approach is that it only requires the summary statistics rather than the individual level genotype data, which are usually not easily accessible, as well as improves statistical power for identifying novel polygenic effects [25]. Using this approach, our group has analyzed multiple sets of genetically associated traits and successfully identified many novel trait-associated loci [12,26].
In this study, we applied a targeted cFDR analysis on CpG-SNPs for T2D and BW [13,14] to identify novel functional loci that are shared between these two traits. Our findings will provide novel insights into the shared pathophysiology between T2D and BW.

Assessment of pleiotropic enrichment
We observed a clear separation between each curve in the conditional Q-Q plot, and the enrichment of effects in T2D varied on different levels of association for BW ( Figure 1A and 1C). We can intuitively observe and graphically assess a strong enrichment of T2D-associated SNPs conditioning on various strengths of associations with the BW. The similar separation between the different curves and similar enrichment pattern was also observed in BW conditioned on T2D ( Figure 1B and  1D). This result indicates a strong pleiotropy between T2D and BW, regardless of whether T2D is conditioned on BW or BW is conditioned on T2D.

T2D loci identified by cFDR
Conditional on association with BW, we identified 127 significant CpG-SNPs (cFDR ≤ 0.05) for T2D variation, AGING which were located on 21 different chromosomes and annotated to 110 genes ( Figure 2A, Supplementary  Table 1). Among these significant SNPs, 103 were not reported compared with the original T2D GWAS study [13], our earlier work [12] and other previous studies [15,27]. Using the more conservative threshold of cFDR ≤ 0.01, 64 significant loci remained.
To explore the potential regulatory functions of these CpG-SNPs, we conducted a series of bioinformatics analyses. Our results found that 27 CpG-SNPs showed significant meQTL effects in whole blood (Supplementary Table 2). Additionally, there were 18 loci that mapped to metaQTLs. Interestingly, two novel SNPs (rs677042 and rs7816345) were associated with bile acids (i.e., ursodeoxycholate, hyodeoxycholate), and a third novel SNP (rs10774563) was related to branched chain fatty acids (i.e., ethylmalonate, methylsuccinate). Bile acid and branched-chain fatty acids are known to be involved in the mechanisms of glucose metabolism [28]. Multiple recent studies reported that short-chain fatty acids and branched shortchain fatty acids may have beneficial health effects on adipocyte lipid and glucose metabolism that can contribute to improved insulin sensitivity in individuals with disturbed metabolism [29,30]. Furthermore, the SNP rs11659412 (Supplementary Table 3) has associations with methylamine, which has previously been shown to have connections with susceptibility for T2D [31].
We also detected four pathways associated with the metabolites that are linked to the metaQTLs. Pathway analysis suggested that these metabolites were linked AGING mainly to the lipid metabolism (alpha-Linolenic acid metabolism and Glycerophospholipid metabolism), energy metabolism (Methane metabolism), and genetic Information Processing (aminoacyl-tRNA biosynthesis) (Supplementary Table 4). Lastly, we detected the SNPs enriched in the eQTLs, which may regulate gene expression levels. Eight CpG-SNPs showed the eQTL effect and were associated with enhancer, promoter, and DNAse elements in various tissues. Notably, two novel CpG-SNPs, rs7787720, and rs579459 showed meQTL, eQTL, and metaQTL effects simultaneously (Table 2).

BW loci identified with cFDR
We identified a total of 188 significant CpG-SNPs (cFDR ≤ 0.05) for BW variation on their association with T2D, which were mapped to 20 different chromosomes ( Figure 2B, Supplementary Table 5). Other than seven confirmed CpG-SNPs for BW [12,14], the remaining 182 are novel CpG-SNPs. Using the more conservative threshold of conditional cFDR ≤ 0.01, 85 significant loci remained.

Pleiotropic loci in T2D and BW identified with ccFDR
We computed ccFDR and constructed a ccFDR Manhattan plot to investigate whether any of the CpG-SNPs were associated with both T2D and BW. We found 55 independent pleiotropic SNPs detected by our analysis (The detailed annotations are listed in Supplementary Table 1 and 5), which were located on 16 chromosomes that reached a significance level of ccFDR ≤ 0.05 ( Figure 2C, Table 1). With the more stringent significance threshold of ccFDR ≤ 0.01, 23 pleiotropic CpG-SNPs remained. Among the identified loci, 52 CpG-SNPs were suggested to be novel. In total, five SNPs have been associated with T2D, while none of these SNPs has previously been identified for BW. All the identified CpG-SNPs annotated to 45 different genes, and 35 of these were not detected as pleiotropic genes in previous related research [12]. Finally, we found 35 SNPs have at least one eQTL, meQTL, or metaQTL effect. One pleiotropic CpG-SNP, rs579459 (ABO), showed eQTL, meQTL, and metaQTL effects simultaneously ( Table 2).

GO enrichment analysis and protein-protein interaction analysis.
We conducted GO enrichment analysis for the T2Dand BW-associated genes that were annotated to the identified CpG-SNPs to explore the potential regulatory functions. The identified SNPs were enriched in biological processes related to "response to oxygencontaining compound" and a molecular function of "scaffold protein binding". We also found that genes associated with T2D were significantly enriched in the pathway of "regulation of hormone levels" (FDR = 3.04 × 10 -3 ) and "regulation of insulin secretion" (FDR = 4.19 × 10 -3 ) (Supplementary Table 7). Using STRING 11.0 database, we performed protein-protein interaction analysis to further investigated the functional partnership among identified T2D-and BW-associated genes (Figure 3), respectively. PPI results showed several genes were well-connected in the interaction network in both traits, such as ADCY5, KCNQ1, IGF2, and CDKAL1, suggesting these genes are essential in the genetic network that coupling of both traits.

Results of the validation study
For cFDR analysis between BW and FG, we observed similar significant separation between the different curves, which indicates a strong pleiotropy between AGING those two traits (Supplementary Figure 1A and 1B). Conditional on association with FG, we identified 160 significant CpG-SNPs (cFDR ≤ 0.05) for BW variation, and 131 of them were replicated compared with the main cFDR analysis. We identified a total of 104 significant CpG-SNPs for FG variation on their association with BW, and 25 of them were replicated. And we replicated 18 pleiotropic CpG-SNPs for FG and BW (Supplementary Table 8).
For cFDR analysis between BW and FI, we identified clearly separation between the different curves, which indicates a strong pleiotropy between those two traits (Supplementary Figure 1C and 1D). Conditional on association with FI, we identified 140 significant CpG-SNPs for BW variation, and 125 of them were replicated compared with the main cFDR analysis. We identified a total of 13 significant CpG-SNPs for FI variation on their association with BW, and 5 of them were replicated. And we replicated 3 pleiotropic CpG-SNPs for FI and BW (Supplementary Table 9).
For cFDR analysis between BW_maternal and T2D_corBMI, significant deflection between curves suggested strong pleiotropy between those two phenotypes (Supplementary Figure 2A and 2B). Conditional on association with T2D_corBMI, we identified 90 CpG-SNPs for BW_ maternal variation, and 13 of them were replicated compared with the main cFDR analysis. We identified a total of 622 significant CpG-SNPs for T2D_corBMI variation on their association with BW_ maternal, and 79 of them were replicated. And we replicated 8 pleiotropic CpG-SNPs for BW_ maternal and T2D_corBMI (Supplementary  Table 10).
For cFDR analysis between BW_fetal and T2D_corBMI, similar deflection between curves demonstrated significant pleiotropy between those two phenotypes (Supplementary Figure 2C and 2D). Conditional on association with T2D_corBMI, we identified 133 CpG-SNPs for BW_fetal variation, and 59 of them were replicated compared with the main cFDR analysis. We identified a total of 669 significant CpG-SNPs for T2D_corBMI variation on their association with BW_ fetal, and 87 of them were replicated. And we replicated 25 pleiotropic CpG-SNPs for BW_ fetal and T2D_corBMI (Supplementary Table 11).

DISCUSSION
In this study, by performing the cFDR on the two independent GWAS datasets from T2D and BW, we identified 103 novel loci for T2D and 182 for BW when focusing on CpG-SNPs. Meanwhile, we identified 55 pleiotropic CpG-SNPs suggesting a shared genetic mechanism among them. Interestingly, only three of these CpG SNPs were identified as pleiotropic loci in the previous studies.
Since the genetic variants located at CpG-SNPs could affect the gene expression and regulation via epigenetic mechanisms, we investigated these 103 CpG-SNPs, which were regarded as novel SNPs associated with T2D. Among those CpG-SNPs, 55 showed at least one effect on meQTL, metaQTL, and/or eQTL, and 35 of them were located at novel risk genes for T2D. For example, rs11073964 is a novel CpG-SNP which showed both eQTL and meQTL effects and mapped to VPS33B (15q26. We also detected seven CpG-SNPs (rs151216, rs163171, rs163177, rs2237892, rs231354, rs234857, and rs3852527) which were annotated to KCNQ1. Two of these loci (rs163177 and rs231354) were shown to have eQTL and meQTL effects, and another SNP (rs2237892) was shown to have metaQTL effect. It is plausible that these multiple neighboring CpG-SNPs might synergistically regulate gene expression and play some roles in T2D.
We also investigated the significant CpG-SNPs associated with BW conditioned on T2D, in which 120  CpG-SNPs were novel loci and annotated to genes that were not reported in previous study [13]. All these SNPs showed at least one effect on eQTL, meQTL, and/or metaQTL. There were two notable SNPs, rs3184504, which was located on the gene SH2B3, and rs492602, which was located on the gene FUT2. Both SNPs showed eQTL, meQTL, and metaQTL effects simultaneously. The gene SH2B3 acts as a negative regulator of cytokine signaling and cell proliferation [38], which is known to be associated with type 1 diabetes and celiac disease [39]. The gene FUT2 encodes a specific fucosyltransferase enzyme, which is crucial for the synthesis of histo-blood group antigens [40]. Whether these genes have a function on BW is unclear, but our findings imply that epigenetic alteration deserves attention and presents new insights for further exploration.

AGING
Importantly, 19 of the 55 pleiotropic variants were novel loci that showed at least one QTL effect. Notably, seven of these 19 loci (rs6770420 in KLF7P1, rs6794193 in SETD2, rs7816345 in AC090453.1, rs6565531 in BAIAP2, rs878619 in SPATA20, rs926345 in PLCG1, and rs137848 in IL17REL) showed eQTL and either metaQTL or meQTL effect. These facts suggest that these SNPs might be involved in the shared pathogenesis of both T2D and BW.
There are several advantages in the current study. First, identifying shared genetic factors between these two traits can facilitate our understanding of the genetic correlation between BW and T2D. To our knowledge, this study is the first to use a targeted cFDR method by focusing on functional CpG-SNPs that are associated with both T2D and BW. Compared with our previous work on the same two traits [12], this work was based on the updated GWAS datasets with a larger population and sophisticated study design [13,14]. Second, in this study, we only focused on functional genetic variants-CpG-SNPs, which not only significantly reduced the multiple testing burden but also shed light on the biological interpretation of the results. Furthermore, we performed the analysis on meQTL, metaQTL, and eQTL effects, which facilitate the identification of candidate functional variants associated with T2D and/or BW. Third, our MR analysis IVW approach showed causal association between BW and T2D while other methods did not, therefore, we cannot define the direct causal relationship between BW and T2D. This result suggested that instead of acting like a direct risk factor, the BW might regulate T2D through multiple intermediate variants such as DNA methylation or regulation of metabolism, as we demonstrated in this study. Fourth, considering BW is influenced both by inherited fetus genotypes and maternal genotypes, we performed validation analysis by using GWAS datasets of direct fetal and indirect maternal genetic effects, which further support the story. Finally, multiple validations in different GWAS datasets were performed in the current study to partially support our findings, which demonstrated the credibility and significance of these findings.
There are also some limitations to our study. First, we are unable to evaluate the effect estimates of pleiotropic SNPs on the traits due to our inability to access the individual-level data. Second, it is difficult to distinguish between the pleiotropic scenarios where a SNP directly influences both BW and T2D, or the SNP affects BW, and the resulting change in phenotype influences T2D susceptibility. However, our MR analysis results may suggest that the SNPs affect T2D susceptibility via affecting BW. Thirdly, although using LD r 2 ≥ 0.2 as the threshold for SNP pairwise pruning is a common practice in similar integration studies [41-43], some of our findings could still be secondary to the signals, especially for SNPs identified in the same gene without special features such as eQTL (e.g., rs151216, rs234857 in KCNQ1). However, since the identified SNPs are functional CpG-loci, the CpG-SNPs identified in this study are more likely to be the leading signals, compared with the previous results. Finally, these findings are based on a bioinformatics analysis of GWAS data. Without further molecular validation, some of these results are suggestive rather than conclusive, such as suggested CpG-loci or functional genes. The aim of our study was to find more potential novel T2D associated variants, so we hope that this limitation could partially be addressed in the future by follow up with fine-mapping studies or molecular validation experiments.

AGING
In conclusion, by using cFDR method on functional CpG-SNPs, we successfully improved the identification of novel genetic variants of both T2D and BW. Our findings offer an improved understanding of the potential shared genetic mechanisms in T2D and BW, which may provide a new direction for further biological studies and clinical trials.

GWAS datasets
We obtained GWAS summary statistics for T2D and BW from publicly available sources [13,14]. The dataset for T2D contained meta-analysis summary statistics from 18 studies performed by DIAbetes Genetics Replication and Meta-analysis (DIAGRAM) Consortium (n = 26,676 case and 132,532 control, European) [13]. The dataset for BW contained metaanalysis summary statistics from 37 studies conducted by Early Growth Genetics (EGG) Consortium (n = 153,781, predominantly European) [14]. Both of these datasets have large enough sample sizes (n > 100,000) for statistical power. Each dataset contains summary statistics for each SNP with the p values that have undergone genomic control at the individual study level. The detailed inclusion criteria and phenotype characteristics from different GWAS are described in the original publications.

Identification of potentially functional CpG-SNPs
The CpG-SNPs in the human genome were identified by interrogating the comprehensive catalog of both common and rare genetic variants from the 1000 Genomes reference panel [44], and our in-house wholegenome high-coverage deep re-sequencing study [45]. A SNP is defined as a CpG-SNP if it introduces or disrupts a CpG site. A total of 50,278,228 CpG-SNPs was identified throughout the human genome. The details of the identification of potentially functional CpG-SNPs were described in our previous study [46].

Data processing
First, we combined the 8,099,761 common SNPs included in these two datasets, then overlapped these common SNPs with the above-identified CpG-SNPs and retrieved a total of 2,478,365 common CpG-SNPs with association summary statistics for both T2D and BW. We then used HapMap 3 genotypes as a reference, and performed a linkage disequilibrium (LD) based pruning method by PLINK 1.9 to remove pairs of SNPs with substantial correlations [47]. The process begins using a window of 50 SNPs, where LD between each pair of SNPs is calculated. If pairs have an R 2 > 0.2, one of that pair of SNPs is removed. Following this initial removal of SNPs, the window shifts 5 SNPs forward, and the process is repeated until there are no pairs of SNPs that are in high LD. After pruning, 96,312 independent CpG-SNPs remained to be used in the subsequent analysis.

Statistical analysis
We constructed conditional quantile-quantile plots (Q-Q plot) to evaluate the enrichment of pleiotropic effects by evaluating the increase in the number of trait-associated SNPs for the first trait (principal trait) when conditioning on SNPs with varying strengths of association in the second trait (conditional trait). We also constructed foldenrichment plots, which quantify the pleiotropic enrichment within each conditional subset compared with the baseline group, which includes all SNPs.
By leveraging two GWASs from T2D and BW, we applied the cFDR approach to obtain the probability that a random SNP is null for association with the principal phenotype given that the p-values for the principal and conditional phenotypes are both less than observed p-values. The method was applied for both orderings of the two phenotypes, cFDR(T2D|BW) and cFDR (BW|T2D). Then, we computed the conjunction cFDR (ccFDR), taken to be the maximum of the two cFDR values, to identify pleiotropic SNPs for both T2D and BW. Finally, we present conditional Manhattan plots to visualize the localization of the SNPs associated with T2D conditional on the strength of association with BW and the reverse. We also present a conjunction Manhattan plot to visualize the locations of the variants with a pleiotropic effect on both phenotypes. We identified a SNP as novel if it has not been reported in previous GWASs [13,14] or our previous cFDR studies [12]. The details were presented in the Supplementary Materials and Methods.

Functional annotation of the pleiotropic CpG-SNPs
To explore the biological functions of the individual trait associated CpG-SNPs and pleiotropic CpG-SNPs, we annotated each identified CPG-SNP to corresponding DNA features or regulatory elements using functional analysis tools such as HaploReg (http://www.broadinstitute.org/mammals/haploreg/hapl oreg.php) and SNPnexus (http://www.snp-nexus.org/). These tools provide the ENCODE [48] and RoadMap [49] annotations for the CpG-SNPs of interest as well as other SNPs in high LD (r 2 ≥ 0.8).
The gene ontology (GO) terms database (http://omicslab.genetics.ac.cn/GOEAST/index.php) was used to perform gene enrichment analysis among the list of genes associated with pleiotropic CpG-SNPs. Among the significant genes we identified gene sets enriched in certain biological processes, cellular components, and molecular functions. To investigate the interaction and functional relationship of the identified T2D and BW genes, protein-protein interaction analyses were constructed by using the STRING 11.0 database (http://string-db.org/).

Validation study using different GWAS datasets
Considering the long duration of action between BW and T2D, we re-performed cFDR analysis between BW and diabetes-related indicators (fasting glucose (FG), and fasting insulin (FI) [53]) to validate the results of the main cFDR analysis, and QQ plots were also generated to demonstrate the pleiotropic enrichment.
Additionally, considering BMI may be highly related with both birth weight and T2D, and BW is influenced both by inherited fetus genotypes and maternal genotypes with intrauterine environment. We further conducted cFDR analysis between BW_maternal [54] and BW_fetal [54] with T2D with correction of BMI (T2D_corBMI) [55], separately."

Bi-directional Mendelian Randomization (MR) analysis
First, we selected independent SNPs (r 2 < 0.001) that achieved genome-wide significance (p < 5 × 10−8) in the BW GWAS datasets as instrumental variables (IVs), then summary statistics of those SNPs were further extracted from T2D GWAS datasets. Next, inverse variance weighted (IVW) regression [56] was performed as main MR analysis to estimate the causal relationship between BW and T2D. MR-Egger regression [57] was performed to estimate the pleiotropy effect among selected SNPs. Simple mode, weighted mode and weighted median [56] were then conducted to validate the main results. P values less than 0.05 was considered significant. Then bidirectional MR analysis was repeated using T2D as exposure, BW as outcome.

Ethics approval and patient consent
We obtained genome-wide association study (GWAS) results published online. The relevant institutional review boards or ethics committees approved the research protocol of the individual GWAS used in the current analysis, and all human participants gave written informed consent, which was demonstrated in the respective original papers.

AUTHOR CONTRIBUTIONS
Hong-Wen Deng conceived and initiated this study，he is responsible for the general development and design of the study and contributed to critical revisions. Jie Shen gave constructive suggestions and finalization of the manuscript. Rui-ke Liu is the first author who performed data analysis and drafted the manuscript. Xu Lin and Chuan Qiu contributed to data analysis. Zun Wang, Jonathan Greenbaum, and Chun-Ping Zeng contributed to critical revisions. Yong-Yao Zhu gave constructive suggestions during the whole process. All authors have given approval to the final version of the manuscript. All authors agree to be accountable for the work and ensure that any questions relating to the accuracy and integrity of the paper are investigated and properly resolved.

CONFLICTS OF INTEREST
The authors declare that they have no conflicts of interest.

Pleiotropic enrichment estimation
We performed conditional Q-Q plots based on varying levels of significance in the conditional phenotype to visualize the difference between observed distribution in the principal trait and theoretical distribution. We plotted the QQ curve for the quantile of nominal -log10(p) values for the association of the subset of SNPs that were below each different significance threshold in the conditional phenotype. The quantile of the nominal p-values are plotted on the x-axis, and the nominal -log10(p) values are plotted on the y-axis for T2D and BW respectively. Under the null hypothesis, the strength of pleiotropy enrichment can be reflected by the degree of the leftward shift from the null line. The Q-Q plot falls on the X = Y, which means no enrichment of pleiotropic genetic effect. By contrast, an earlier leftward shift from the null line indicates the existence of pleiotropic enrichment. Greater spacing in the Q-Q plots shows a stronger trend of pleiotropic enrichment shared between the principal and conditional phenotypes.
Then, we presented fold-enrichment plots with "ggplot2 package in R software" [1] to assess further the pleiotropic enrichment between T2D and BW. The plots were formed by nominal -log10(p) values at different stratifications which were divided by the p-value of SNPs for the conditional phenotype (p ≤ 1; p ≤ 0.1; p ≤ 0.01; p ≤ 0.001). Nominal p values (-log10(p)) are plotted on the x-axis and fold enrichments are plotted on the y-axis. In each cut-off category, we computed the fold-enrichment values (En, as defined below) for all possible p values on the x-axis (between 0 and 10),

En[i] =
and Ni is the proportion of SNPs with -log10(p) ≥ x, N0 is the number of all SNPs in each cut-off category, and i is from 1 to N0. We can observe an upward shift from the expected baseline as the presence of pleiotropy. Also, the greater separation between different stratification indicated a stronger pleiotropy.

The calculation of cFDR and conjunctional cFDR (ccFDR)
The cFDR is an extension of traditional FDR, and this method is well-established and has been widely applied [2], [3][4][5]. We performed to integrating the two independent GWASs with summary statistics to assess the probability that an SNP has a false positive association with the principal phenotypes under the given p-value for both the principal and conditional phenotypes are smaller than the pre-defined significance thresholds. cFDR was expressed as: Where pi is the strength of association for the SNP with the 'principal phenotype', and pj is the strength of association for the same SNP with the 'conditional phenotype'. Then the H0 (i) stand for the null hypothesis that there is no association between this given SNP and the principal trait. After the data preparation, we computed the cFDR for each SNP where T2D is the principal phenotype conditioned on the strength of association with BW (T2D|BW) and vice versa (BW|T2D). Using this approach, we identified the loci significantly associated with T2D and BW (FDR < 0.5), respectively.
After the calculation of cFDR, we computed the conjunction cFDR (ccFDR) to find the pleiotropic loci. The maximum cFDR value of the two traits was token as the ccFDR value of each variant. An SNP with the ccFDR value smaller than 0.05 was considered to be significantly associated with both T2D and BW.

Manhattan plots for conditional statistics and conjunction statistics
Using "qqman" package in R software [6], we constructed Manhattan plots to visualize the locations of the genetic markers. All SNPs were present in relation to their chromosomal locations. We plotted locations of the 22 chromosomal on the x-axis, and plotted the −log10 the SNPs' values on the y-axis. The SNP with a -log10(cFDR) ≥ 1.3 was considered as a locus associated with the principal phenotype given the conditional phenotype. Then, an SNP with − log10 conjunction FDR value is > 1.3 was determined to be associated with both the principal trait.

Supplementary Tables
Supplementary Table 1