Pervasive horizontal pleiotropy in human genetic variation is 1 driven by extreme polygenicity of human traits and diseases 2

4 The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at 5 Mount Sinai, 1468 Madison Avenue, New York, NY, USA 6 The Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at 7 Mount Sinai, 1425 Madison Ave, New York, NY, USA 8 Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 9 1425 Madison Avenue, New York, NY, USA 10 *Dr. Jordan and Dr. Verbanck contributed equally to this work. 11 12


33
The term "pleiotropy" refers to a single genetic variant having multiple distinct phenotypic 34 effects. In general terms, the existence and extent of pleiotropy has far-reaching implications on 35 our understanding of how genotypes map to phenotypes (1), of the genetic architectures of 36 traits (2,3), of the biology underlying common diseases (4) and of the dynamics of natural 37 selection (5). However, beyond this general idea of the importance of pleiotropy, it quickly 38 becomes difficult to discuss in specifics, because of the difficulty in defining what counts as a 39 direct causal effect and what counts as a separate phenotypic effect. 40 One particularly important dividing line in these conflicting definitions is the distinction between 41 vertical pleiotropy and horizontal pleiotropy (6,7). When a genetic variant has a phenotypic 42 effect that then has its own downstream effects in turn, that variant exhibits "vertical" pleiotropy. 43 For example, a variant that increases low density lipoprotein (LDL) cholesterol might also have 44 an additional corresponding effect on coronary artery disease risk due to the causal relationship 45 between these two traits, thus exhibiting vertical pleiotropy. Vertical pleiotropy has been 46 conceptualized and measured by explicit genetic methods like Mendelian randomization. 47 In contrast, a genetic cause that directly influences multiple traits, without one trait being 48 mediated by another, would exhibit "horizontal" pleiotropy. Horizontal pleiotropy contains some 49 conceptual difficulties, and consequently can be difficult to measure. In principle, we might 50 imagine selecting a variant and counting how many phenotypes are associated with it. Indeed, 51 several versions of this analysis have been performed for different lists of traits (8,2,3,9). 52 However, the results of these analyses are highly dependent on the exact list of traits used, and 53 traits of interest to researchers previously tend to involve only a small number of phenotypes 54 and/or be heavily biased towards a small set of disease-relevant biological systems and 55 processes. Due to these limitations, it is unknown to what extent horizontal pleiotropy affects 56 genetic variation in the human genome at the genome-wide level. 57 The proliferation of data sources like large-scale biobanks and metabolomics data that include a 58 wide array of phenotypes in one dataset, combined with the growing public availability of 59 genome-wide association studies (GWASs) summary statistic data, especially for extremely 60 large meta-analyses, has allowed the development of methods that use these summary 61 statistics to gain insight into human biology, and particularly into the genetic architecture of 62 complex traits and diseases (10). 63 Here, we present a method to measure horizontal pleiotropy using publicly available GWAS 64 summary statistics. We focus on measuring horizontal pleiotropy of SNVs on observable traits, 65 meaning a scenario where a single SNV affects multiple phenotypes without relying on a 66 detectable causal relationship between those phenotypes. Using this framework, we are able to 67 score each SNV in the human genome for horizontal pleiotropy, giving us broad insight into the 68 genetic architecture of pleiotropy. Because our framework explicitly removes correlations 69 between the input phenotypes, and because these phenotypes represent a diverse array of 70 traits and diseases, these insights are largely robust to the specific list of traits studied, and 71 pertain to human biology overall rather than relationships between specific traits. 72

73
Defining pleiotropy 74 We narrowly define the scope of pleiotropy as applying only to genetic variants, and particularly 75 variants investigated as part of GWASs. As effects, we are considering phenotypic outcomes 76 measured by GWASs. By our definition, then, pleiotropy means that one variant shows 77 significant associations across GWASs of multiple traits. We additionally restrict the scope of 78 pleiotropy we are considering to include only horizontal pleiotropy, and to exclude vertical 79 pleiotropy (Figure 1). To elaborate on this distinction, suppose we have identified a variant that 80 influences two different traits, trait A and trait B. In vertical pleiotropy, the traits themselves are 81 biologically related, so that the variant's effect on trait A actually causes the effect on trait B. A 82 key feature of vertical pleiotropy is that two traits that are biologically related should be related 83 regardless of which specific gene or variant is causing the effect. This induces correlation 84 between GWAS effect sizes on the two traits across an entire set of variants. For example, we 85 expect that any variant that increases LDL cholesterol also increases risk of coronary artery 86 disease, because we suspect that it is the increase in LDL cholesterol itself that causes 87 increased disease risk. This results in a correlation between variant effect sizes for LDL 88 cholesterol and coronary artery disease, which has been detected in multiple studies (11)(12)(13). 89 The methodology of Mendelian Randomization uses this predicted correlation within a given set 90 of variants to formulate a statistical test for causal relationships among traits, which is now 91 widely used for biological discovery (14,15). We extend this methodology to use the entire set of 92 SNVs evaluated by GWAS, treating a GWAS-wide correlation between two traits as evidence of 93 a vertical pleiotropic relationship between these traits. 94 In the case of horizontal pleiotropy, an individual variant acts on traits A and B without mirroring 95 any trait-level relationship between them. Unlike vertical pleiotropy, since we are not considering 96 the variant-level effect as evidence of a relationship between the two traits, we cannot detect 97 horizontal pleiotropy by detecting correlations between traits. Instead, each horizontally 98 pleiotropic variant acts by its own unique mechanism. These particular pleiotropic variants, 99 therefore, should show a relationship between the two traits that deviates from the relationship 100 we would infer from the genome-wide correlation of effect sizes between them. This deviation 101 from the correlation between traits is not a prediction of any kind of model of pleiotropy, but 102 simply follows from our definition of the term "horizontal pleiotropy": any pair of traits whose 103 effect sizes are correlated across all variants is by definition related by vertical pleiotropy, while 104 any variant whose effects on two traits substantially deviate from the trait-level relationship 105 between those traits is by definition exhibiting horizontal pleiotropy. 106 A quantitative score for pleiotropy 107 We have developed a method to measure horizontal pleiotropy using summary statistics data 108 from GWASs on multiple traits. Our method relies on applying a statistical whitening procedure 109 to a set of input variant-trait associations, which removes correlations between traits caused by 110 vertical pleiotropy and normalizes effect sizes across all traits. Using the decorrelated 111 association Z-scores, we measure two related but distinct components of pleiotropy: the total 112 magnitude of effect on whitened traits ("magnitude" score, denoted ܲ ) and the total number of 113 whitened traits affected by a variant ("number of traits" score, denoted ܲ ሻ . Both scores are then 114 traits, but the effects of each variant on each trait are independent and the number of 138 variants with effects on multiple traits is no more than would be expected by chance. 139 This empirical correction for polygenicity is required because polygenicity is a major factor that 140 can produce pleiotropy. For example, it has been estimated that approximately 100,000 141 independent loci are causal for height in humans (16). If the total number of independent loci in 142 the human genome is approximately 1 million, this corresponds to about 10% of the human 143 genome having an effect on height. If we imagine multiple phenotypes with this same highly 144 polygenic genetic architecture, we should expect substantial overlap between causal loci for 145 multiple different traits, even in the absence of any true causal relationship between the traits, 146 resulting in horizontal pleiotropy (Figure 2). 147 Power to detect pleiotropy in simulations 148 We conducted a simulation study to evaluate the performance of our two-component pleiotropy 149 score. We simulated 800,000 variants controlling 100 traits, varying the per-trait liability scale 150 heritability of all traits ݄ ଶ and the proportion of pleiotropic and non-pleiotropic causal variants. To 151 introduce LD in the simulations, we used real LD architecture from 800000 SNVs from 1000 152 Genomes European population. We simulated Z-scores independently for each SNV and then 153 propagate LD for a given SNV by "contaminating" its Z-score according to the Z-scores of the 154 SNVs in LD with it. Under the null model, all trait-variant associations were independent, and no 155 horizontal pleiotropy was added. Under the added-pleiotropy models, we randomly chose a 156 fraction of causal variants and forced them to have simultaneous associations with multiple 157 traits. The simulation study showed that both components of the pleiotropy score were well-158 powered to detect horizontal pleiotropy (Figure 4), and that the LD correction dramatically 159 reduces the dependence of the pleiotropy score on LD (Supplementary Figure 1) Table 1). 207 Taken together, these results suggest that extreme polygenicity drives horizontal pleiotropy, and 208 that this has an extremely large effect on the genetic architecture of human phenotypes. 209 Genome-wide distribution of pleiotropy score gives insight into 210 genetic architecture 211 In addition to observing genome-wide inflation of the pleiotropy score, we can also gain insight 212 from the distribution of the pleiotropy score on a more granular level. 213  observed that both components of the pleiotropy score are higher on average in transcribed 226 regions (coding and UTR) than in intergenic noncoding regions. This result was confirmed and 227 expanded by annotations from Roadmap Epigenomics (22), which showed that regions whose 228 chromatin configurations were associated with actively transcribed regions, promoters, 229 enhancers, and transcription factor binding sites had significantly higher levels of both 230 components of the pleiotropy score, while heterochromatin and quiescent chromatin states had 231 significantly lower levels. Investigating individual histone marks, we found that both the 232 repressive histone mark H3K27me3 and the activating histone mark H3K27ac were associated 233 with elevated levels of pleiotropy, although the activating mark H3K27ac had a larger effect. 234 This may indicate that being under active regulation at all produces higher levels of pleiotropy, 235 whether that regulation is repressive or activating. 236 We also used data from the Genotype-Tissue Expression (23) project to measure the 237 connection between transcriptional effects and our pleiotropy score (Table 1). Consistent with 238 the previous observation that functional regions had higher pleiotropy scores, we found that 239 variants that were identified as cis-eQTLs for any gene in any tissue had higher pleiotropy Finally, we found that variants that are eQTLs for genes whose orthologs are associated with 246 multiple measurable phenotypes in mice or yeast have higher pleiotropy scores, demonstrating 247 that our pleiotropy score is also related to pleiotropy in model organisms. variants covered by the UK Biobank, we were able to compute our pleiotropy score 282 independently using these two datasets (Figure 7) This high level of replication using independent sets of GWAS summary statistics suggests that 287 our pleiotropy score is capturing an underlying biological property, rather than an artifact of the 288 UK Biobank study. 289

290
To characterize the phenotypic associations of these loci, we used our replication dataset of 291 published GWAS summary statistics for 73 human quantitative traits and diseases, plus nine 292 additional traits we excluded from our replication dataset for a total of 82 (Methods). We are not 293 able to compute directly the degree of pleiotropy exhibited by these traits, since our definition of 294 horizontal pleiotropy applies only to individual variants and does not apply to traits. However, we 295 can identify traits whose GWAS variant associations are correlated to our pleiotropy score, 296 which in some sense represents the traits that contribute most to our signal of pervasive 297 horizontal pleiotropy. Figure 6c shows the correlations between our LD-corrected pleiotropy 298 score (ܲ and ܲ ) and the association statistics for these 82 traits and diseases. The most 299 strongly correlated traits were anthropometric traits like body mass index, waist and hip 300 circumference, and height; certain blood lipid levels, including total cholesterol and triglycerides; 301 and schizophrenia. These are all known to be highly polygenic and heterogeneous traits. The 302 least correlated traits include several measurements of insulin sensitivity and glucose response, 303 such as the insulin sensitivity index (ISI), certain features of brain morphology, and the 304 inflammatory biomarker lipoprotein(a). This may be partly due to low sample size of the 305 corresponding GWASs. However, these correlations do not appear to be driven exclusively by 306 sample size: in cases where multiple GWASs for the same trait have been performed on 307 subsamples of the population (for example, males only, female only, and combined), the sample 308 size only marginally affects the correlation (Supplementary Table 4). Another contributing 309 factor may be heritability: height, in particular, is among the most heritable traits we examined, 310 while ISI and the brain morphology features are among the least. 311

312
We have presented a framework for scoring horizontal pleiotropy across human genetic 313 variation. In contrast to previous analyses, our framework explicitly distinguishes between 314 horizontal pleiotropy and vertical pleiotropy or biological causation. After applying both 315 components of our pleiotropy score to 372 heritable medical traits from the UK Biobank, we 316 made the following observations: 1) horizontal pleiotropy is pervasive and widely distributed 317 across the genome; 2)) horizontal pleiotropy is driven by extreme polygenicity of traits; 3) 318 horizontal pleiotropy is significantly enriched in actively transcribed regions and active regulatory 319 regions, and is correlated with the number of genes and tissues for which the variant is an 320 eQTL; 4) there are thousands of loci that exhibit extreme levels of horizontal pleiotropy, a 321 majority of which have no previously reported associations; and 5) pleiotropic loci are enriched 322 in specific complex traits including body mass index, height, and schizophrenia. These findings 323 are largely consistent between the magnitude of pleiotropy score ܲ and the number of traits 324 score ܲ , although we note some differences where some variants are primarily associated with 325 ܲ but not ܲ . This indicates that these signals are driven by loci that both influence a large 326 number of traits and have relatively large combined effects, and secondarily by loci that have 327 large combined effects but only influence a handful of traits each, with minimal contribution from 328 loci that influence a large number of traits but have small combined effects. Conversely, after 329 applying the correction for polygenicity, we only observe variants that are significant for ܲ , but 330 not for ܲ . This indicates that, while there do exist horizontal pleiotropic master control loci that 331 affect more traits than we would expect from the random overlap of multiple highly polygenic 332 traits, the overall effect of these loci is not noticeably larger than we would expect. 333 This analysis is enabled by the technique of whitening trait associations to remove correlations 334 between traits. This lets us count pleiotropic effects in a more objective and systematic way, as 335 opposed to manually selecting putatively independent traits to count, or manually grouping traits 336 into independent blocks. However, it does come with three major limitations compared to these 337 approaches. First, it is somewhat more difficult to tell which specific traits are driving a signal of 338 pleiotropy at a particular locus. Our whitened traits are combinations of real observed traits, and 339 do not necessarily correspond to any specific biological traits of interest. However, it is relatively 340 easy to inspect the input GWAS summary statistics for a particular variant of interest to see 341 which traits it is associated with. Furthermore, since pleiotropic loci are by definition associated 342 with a large cross-section of traits, this kind of inspection is not likely to be very informative 343 about specific traits. Second, the whitening procedure has the counterintuitive property that a 344 variant that has a narrow effect on a single trait without also affecting correlated traits can 345 appear to be highly pleiotropic. For example, if a variant had a strong risk-increasing effect on 346 coronary artery disease (CAD), but no effect on any of the known upstream risk factors of CAD 347 While our results largely support this network pleiotropy hypothesis, we have also demonstrated 392 an alternate view of horizontal pleiotropy in the context of highly polygenic causation. In our 393 simulations, introducing extreme polygenicity at the levels suggested by these papers inherently 394 results in high levels of horizontal pleiotropy detectable by our score, independent of any 395 assumptions about the mechanism of pleiotropy or of polygenicity. Indeed, our null hypothesis 396 of no horizontal pleiotropy, that 5% of the genome is independently causal to each trait with P < 397 0.05, is trivially rejected when a single trait is influenced by an unexpectedly large fraction of the 398 genome. This means that, on some level, widespread horizontal pleiotropy in human genetic 399 variation is simply a logical consequence of widespread polygenicity of human traits, regardless 400 of the specific mechanism of either. In simple terms, the more loci are associated with each trait, 401 the more chances there are for associations with multiple traits to overlap. Supporting this 402 result, we find that controlling for the polygenic architecture of the input traits significantly 403 attenuates our signal of pleiotropy, as does restricting to oligogenic traits. It may be the case 404 that horizontal pleiotropy is only truly widespread among the most complex and polygenic 405 subset of human traits. 406

407
In this study, we have presented a quantitative score for horizontal pleiotropy in human genome 408 variation. Using this score, we have identified a genome-wide trend of highly inflated levels of 409 horizontal pleiotropy, an underappreciated relationship between horizontal pleiotropy with 410 polygenicity and functional biology, and a large number of specific novel loci with high levels of 411 horizontal pleiotropy. We expect further investigations using this score to yield deep insights into 412 the genetic architecture of human traits and to uncover important novel biology. 413

414
We developed a statistical method to measure horizontal pleiotropy using a two-component 415 pleiotropy score. For a given variant, we measured 1) the total magnitude of pleiotropic effect 416 the variant has and 2) the number of whitened traits affected by the variant. 417   Second, we retrieved publicly available genome-wide association (GWA) summary statistics 496 data for 82 complex traits and diseases (31-66) ( Table S9). For each dataset, we retrieved the 497 appropriate variant annotation (build, rsid, chromosome, position, reference and alternate 498 alleles) and summary statistics (effect size, standard errors, P-values and sample size of the 499 study). All variant coordinates (chr, pos) were lifted over to hg19 using the UCSC Genome 500

Z-scores decorrelation strategy
Browser LiftOver Tool and aligned to the reference and alternate alleles retrieved from the 501 Ensembl variation database. After performing the same pruning of highly correlated phenotypes, 502 we were left with 73 traits and diseases. 503 Third, we retrieved GWA summary statistics datasets from a GWAS of 453 blood metabolites in 504 7,824 individuals (67). After performing the same pruning of highly correlated phenotypes, we 505 were left with 430 metabolites. 506

507
Using the two components of the pleiotropy score, we can run a genome-wide pleiotropy study 508 (GWPS) which consists of computing two P-values for each component of the score (ܲ and 509 ܲ ) and for all variants genome-wide. We computed the pleiotropy score separately for each of 510 the three datasets described above (372 UK Biobank phenotypes, 73 traits and diseases, and 511 430 blood metabolites). Additionally, we computed the pleiotropy score on a subset of 372 traits 512 with genome-wide significant heritability as calculated by LD Score Regression (20) (univariate 513 heritability significant after Bonferroni correction). The 372 UK Biobank heritable traits were 514 used for discovery, and the 73 traits and diseases and 430 blood metabolites datasets were 515 used for replication. There was a total of 768,756 variants genotyped across all three datasets. 516

517
To study the effect of polygenicity on horizontal pleiotropy, we first estimated the liability-scale 518 heritability of all 372 traits in our UK Biobank dataset using LD score regression, and stratified 519 all traits into four equally-sized classes of heritability, in order to control for the effect of high 520 heritability separate from the effect of high polygenicity. Next, we estimated the polygenicity of from 0 to 127, where each score represents the number of epigenomes for which that site is 545 assigned to the corresponding category. We did the same for two specific chromatin marks: the 546 activating mark H3K27ac and the repressive mark H3K27me3. For these annotations, we used 547 a combination of all other categories as a reference set. In other words, the reference set for 548 each category is all variants that are not in that category. 549 In addition to these molecular annotations, we used expression-related annotations from the 550 Genotype-Tissue Expression project (23). For each variant, we retrieved the number of genes 551 for which the variant is referenced as a cis eQTL (expression quantitative trait loci) in any tissue 552 (eGenes), and the number of tissues where the variant is annotated as a cis eQTL for any gene 553 (eTissues). We divided variants into bins by number of eGenes (below 10, between 10 and 15, 554 between 15 and 20, and over 20) and eTissues (below 30, between 30 and 35, between 35 and 555 40, and above 40). The reference set used for these analyses were variants that are not 556 annotated as eQTLs in any gene or tissue. 557 Finally, we used model organism phenotypes measured by the International Mouse 558 Phenotyping Consortium (IMPC) (68) and the Saccharomyces Cerevisiae Morphological 559 Database (SCMD) (69). To map ortholog genes from IMPC and SCMD to human variants, we 560 used orthology annotations of gene orthologs, and eQTLs from GTEx. Thus, variants annotated 561 as associated with a mouse or yeast phenotype are those that are annotated as cis eQTLs in 562 any tissue for a gene whose ortholog in mouse or yeast is associated with that phenotype. The 563 reference set for this analysis was variants annotated as cis eQTLs for genes that are not 564 associated with mouse or yeast phenotypes. 565 Genome-wide significant pleiotropy loci 566 To detect loci with a genome-wide significant pleiotropy, we used the LD-corrected two-567 component pleiotropy score (ܲ and ܲ ) computed on the dataset of 372 heritable traits from 568 UK Biobank described above. We used LD clumping as implemented in PLINK to cluster linked 569 variants, with an ‫ݎ‬ ଶ threshold of 0.1, a distance threshold of 100 kb, and P-value thresholds of 5 570 x 10 -8 for genome-wide significance and 0.05 for nominal significance. The resulting loci were 571 assigned to genes using 1) localization of variants within a gene, as annotated by Ensembl 572 Variant Effect predictor, and 2) annotation as a cis eQTL in any tissue, as annotated by GTEx. 573 We submitted the resulting list of genome-wide significant genes to DAVID for enrichment 574 analysis (70-72). 575

576
In general, we should expect only 5% of loci to replicate by chance in each replication dataset; 577 however, it is possible that this number might increase because of polygenicity in the underlying 578 GWAS statistics and the resulting inflation in our pleiotropy score, which may cause 579 substantially more than 5% of the genome to be assigned P < 0.05. To correct for this, we 580 performed random permutations of the whitened Z-scores independently for each trait and used 581 these permuted Z-scores to compute our LD-corrected pleiotropy score components (ܲ and 582 ܲ ). This generates a null expectation that preserves the polygenicity and inflation within each 583 dataset. For both datasets, our null model expected that 5% of loci for ܲ loci and 6% of loci for 584 ܲ should replicate. The fraction that replicated in the actual data was substantially higher 585 (Figure 7). polygenicity (a,c,e,g) and with the correction (b,f,d,h), with per-variant heritability ranging from 846 0.0002 to 0.2, proportion of non-pleiotropic causal loci ranging from 0 to 1%, and proportion of 847 pleiotropic causal loci ranging from 0.1% to 1%. Our method has good power to detect 848 pleiotropy for highly heritable traits, though its power is reduced by extreme polygenicity. 849 Extreme polygenicity also increases the false positive rate, though this effect is corrected by our 850 polygenicity correction. 851 diseases. Contribution of pleiotropic variants is calculated as the correlation coefficient between 869 the absolute value of Z-scores and the pleiotropy score among variants that are genome-wide 870 significant for the pleiotropy score (P < 5 × 10 -8 for ܲ and ܲ respectively). 871

879
We grouped variants by (i) molecular function as annotated by Ensembl, (ii) predicted chromatin state as annotated by the NIH