Pharmacogenomics of statin-related myopathy: Meta-analysis of rare variants from whole-exome sequencing

Aims Statin-related myopathy (SRM), which includes rhabdomyolysis, is an uncommon but important adverse drug reaction because the number of people prescribed statins world-wide is large. Previous association studies of common genetic variants have had limited success in identifying a genetic basis for this adverse drug reaction. We conducted a multi-site whole-exome sequencing study to investigate whether rare coding variants confer an increased risk of SRM. Methods and results SRM 3–5 cases (N = 505) and statin treatment-tolerant controls (N = 2047) were recruited from multiple sites in North America and Europe. SRM 3–5 was defined as symptoms consistent with muscle injury and an elevated creatine phosphokinase level >4 times the upper limit of normal without another likely cause of muscle injury. Whole-exome sequencing and variant calling were coordinated from two analysis centres, and results of single-variant and gene-based burden tests were meta-analysed. No genome-wide significant associations were identified. Given the large number of cases, we had 80% power to identify a variant with minor allele frequency of 0.01 that increases the risk of SRM 6-fold at genome-wide significance. Conclusions In this whole-exome sequencing study of severe statin-related muscle injury, the largest conducted to date, we did not find evidence that rare coding variants are responsible for this adverse drug reaction. Larger sample sizes would be required to identify rare variants with small effects, but it is unclear whether such findings would be clinically actionable.

Introduction Lipid-lowering drugs that inhibit 3-hydroxy-3-methylglutaryl-coenzyme A (HMG-CoA) reductase, known as statins, are widely used for the primary and secondary prevention of cardiovascular disease (CVD). All statins can cause muscle toxicity that ranges in severity from mild muscle pain to severe myopathy or rhabdomyolysis, which can lead to kidney failure and death [1]. The most severe forms of statin-related muscle injury are uncommon, suggesting a genetic predisposition. For instance, severe statin-related myopathy (SRM), restricted to cases with creatine phosphokinase (CK) levels >10x the upper limit of normal (ULN) (SRM 4) requiring hospitalisation, occurs in approximately 1-10 out of every 10,000 individuals taking standard statin doses [2].
Previous genome-wide association studies (GWAS) have identified a common variant in the drug transporter gene SLCO1B1 that increases the risk of SRM, defined by varying CK elevation criteria, 2- to 4-fold through a pharmacokinetic mechanism, but the underlying causes of this adverse drug reaction remain largely unexplained [3-5]. To investigate whether rare variants in coding regions of the genome may cause SRM, we conducted whole-exome

Study population and overall study design
Identification and recruitment of case and control subjects, whole-exome sequencing, variant calling, and statistical analyses were coordinated by two analysis centers: one based at the University of Washington that included 291 cases and 1540 controls from North America and the United Kingdom ("US-UK") and one based in Liverpool that included 214 cases and 507 controls from Europe ("PREDICTION-ADR"). Within each of these two large case-control studies, participants were recruited from multiple sites. Because there were few non-European ancestry participants in either case-control study, and many sites had none, we restricted analyses to participants of European ancestry. The details of recruitment at each site are described in the Supporting Information files. Whole-exome sequencing and variant calling were harmonized within each of the two case-control studies, and investigators from both studies agreed prospectively to implement a coordinated statistical analysis plan, which included protocols for meta-analysis of results from both single-variant and gene-based burden tests.

Case definition
Using a previously published case definition (S1 Table) [1], SRM 3-5 cases were receiving statin treatment at the time of onset of muscle injury symptoms, had a peak CK level > 4x ULN (SRM 3) or supporting clinical evidence of rhabdomyolysis (SRM 5), and had no likely cause of muscle injury other than statin treatment. In each study, non-statin causes of muscle injury were excluded based on a review of medical records and/or interview of patients by physicians or study staff. Severe SRM cases (SRM 4, sometimes called severe myopathy, and SRM 5, sometimes called rhabdomyolysis in previous studies [6]) met these criteria and had a peak CK level > 10x ULN (S1 Table). SRM cases with myopathy and evidence of anti-3-hydroxy-3-methyl-glutaryl-coenzyme A reductase (HMGCR) autoantibodies (SRM 6), which are suggestive of autoimmune myopathy, were excluded at all sites [7]. In both case-control studies, statin treatment-tolerant control subjects were users of statins who had no evidence of muscle injury during follow-up.
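The CK-based part of this case definition can be illustrated with a minimal sketch. It covers only the peak CK criterion; the exclusion of non-statin causes, the clinical evidence of rhabdomyolysis, and the HMGCR autoantibody exclusion are not modelled. The function name, the labels, and the treatment of values exactly at a cutoff (taken as inclusive here) are assumptions made for illustration.

```python
from typing import Optional


def srm_stratum(peak_ck_over_uln: float,
                moderate_cutoff: float = 4.0,
                severe_cutoff: float = 10.0) -> Optional[str]:
    """Map a peak CK level (in multiples of ULN) to an SRM severity stratum.

    Returns "severe" (SRM 4 and 5), "moderate" (SRM 3), or None when the
    CK criterion for SRM 3-5 is not met. Boundary values are treated as
    inclusive, which is an assumption of this sketch.
    """
    if peak_ck_over_uln >= severe_cutoff:
        return "severe"
    if peak_ck_over_uln >= moderate_cutoff:
        return "moderate"
    return None


# Hypothetical peak CK values, expressed as multiples of ULN.
for ck in (3.0, 5.0, 32.0):
    print(ck, srm_stratum(ck))
```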

Whole-exome sequencing, variant calling, quality control
Exome sequencing, variant calling, and quality control of sequence data were performed separately in each case-control study (S1 Appendix and S2 Table). Each study conducted sequencing and called variants for cases and controls together, to eliminate batch effects across case-control sampling. We created a combined variant annotation file including all quality-controlled variant sites observed in either study. Variants were annotated using ANNOVAR [8] and dbNSFP v3.0 [9] according to the reference genome GRCh37.

Statistical analysis
Within each case-control study, which included only European ancestry participants, population structure was adjusted for using principal components derived from a genetic relationship matrix. To estimate associations between single variants and the risk of SRM, the Firth bias-corrected score test was used [10]. The primary single-variant analysis included all SRM cases and all control subjects. Single-variant analyses were restricted to variants with at least 10 copies of the minor allele, but not further restricted by annotated variant function. Results from the two case-control studies were combined using a Z-based meta-analysis [11]. The following secondary analyses were also conducted: restriction to users of simvastatin, atorvastatin, and cerivastatin (the three most commonly used statins in these studies); stratification of SRM cases into moderate (SRM 3: CK ≥ 4x ULN and <10x ULN) and severe (SRM 4 and 5: CK ≥ 10x ULN) categories; and inclusion of all missense variants. In the US-UK case-control study, information on fibrate use was collected. Because fibrates can cause drug-drug interactions with statins through inhibition of the metabolism of statins, and because strong drug-drug interactions can mask the presence of a genuine drug-gene interaction that may be less potent, we conducted secondary analyses restricted to non-users of fibrates. Rare variants (MAF < 1%) were collapsed into gene-burden scores and tested for association with the risk of SRM [12]. Gene-burden results from the two case-control studies were combined using a Z-based meta-analysis. The primary gene-based analysis included variants with MAF < 1% that are missense and bioinformatically predicted to be damaging (MetaLR > 0.5) [13], stop-gain/loss, coding indels, or splice site variants.
Secondary gene-burden analyses were restricted to users of simvastatin or atorvastatin, stratified SRM cases into moderate (SRM 3: CK ≥ 4x ULN and <10x ULN) and severe (SRM 4 and 5: CK ≥ 10x ULN) categories, and included rare missense variants regardless of their prediction score for function.
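As an illustration of how per-study test statistics can be combined, the following is a minimal sketch of one common form of Z-based meta-analysis, in which each study's Z statistic is weighted by the square root of its effective sample size. The weighting scheme, the function names, and the per-study Z values are assumptions made for illustration; only the case and control counts come from the study description.

```python
import math


def effective_n(n_cases: int, n_controls: int) -> float:
    """Effective sample size of a case-control study: 4 / (1/cases + 1/controls)."""
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)


def meta_z(z_scores, weights):
    """Weighted Z combination: sum(w_i * z_i) / sqrt(sum(w_i ** 2))."""
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z: float) -> float:
    """Two-sided P value for a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Case and control counts for US-UK and PREDICTION-ADR, as given in the text.
weights = [math.sqrt(effective_n(291, 1540)), math.sqrt(effective_n(214, 507))]

# Hypothetical per-study Z statistics for one variant (not study results).
z_combined = meta_z([2.0, 1.5], weights)
print(z_combined, two_sided_p(z_combined))
```

Weighting by sample size rather than by inverse variance does not require effect estimates on a common scale, which can be convenient when score-test statistics are the quantity shared between analysis centres.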
For each analysis, the Bonferroni-corrected threshold for statistical significance was P = 0.05 / number of single variants or genes tested. For the primary single-variant analysis, we conducted post-hoc power calculations using QUANTO 1.1 [14,15] to determine study power in the primary analysis population across a range of effect sizes and MAFs, based on α = 0.05 / number of single variants included in the primary analysis and an estimated disease probability among control subjects of 0.0001 [16].
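The Bonferroni thresholds follow directly from the number of tests. The short sketch below reproduces the arithmetic using the variant and gene counts reported in the Results; the function name is illustrative.

```python
def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Counts reported in the Results: 162,813 single variants and 10,244 genes.
single_variant_threshold = bonferroni_threshold(162_813)
gene_threshold = bonferroni_threshold(10_244)

print(f"{single_variant_threshold:.2e}")  # 3.07e-07
print(f"{gene_threshold:.2e}")            # 4.88e-06
```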
The full results from each meta-analysis will be available on dbGaP within the next release of the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium summary results (accession phs000930) (http://www.chargeconsortium.com/main/results). Information on how to access individual-level study data is provided in the Supporting Information files.

Results
Across the two case-control studies, there were 505 SRM cases and 2,047 treatment-tolerant controls who passed sample-level quality control (Table 1). Most of the US-UK cases (mean CK 159x ULN) were severe (65%) and used cerivastatin or simvastatin, while most of the PREDICTION-ADR cases (mean CK 32x ULN) were moderate (61%) and used simvastatin or atorvastatin.

Single-variant results
The mean sequencing depth in the US-UK case-control study was 78x, and in PREDICTION-ADR it was 56x. After restricting to variants with at least 10 copies of the minor allele, 162,813 variants passed quality control in one or both studies and were included in meta-analyses (Fig 1). In primary analyses that included all SRM cases and all statin-tolerant controls, there were no variants that met the Bonferroni-corrected threshold for statistical significance (P < 3.07 × 10⁻⁷) (Table 2). The widely replicated locus for statin-related muscle injury in SLCO1B1 (rs4149056) was associated with a 1.59-fold increased risk of SRM (95% confidence interval, 1.41-1.77), but this finding was not significant after correction for the number of single variants tested.
Secondary analyses restricted by statin type and stratified by moderate or severe cases of SRM did not yield any associations that met criteria for statistical significance (S3 Table). Among non-users of fibrates in the US-UK case-control study, the SLCO1B1 variant rs4149056 was associated with a 4.01-fold increased risk of SRM (95% confidence interval, 2.61-6.17), and this association met criteria for statistical significance (P = 5.46 × 10⁻¹¹) (S4 Table).

Gene-burden results
In total, 10,244 genes passed quality control thresholds in one or both studies and were included in the meta-analyses of burden test results. In primary analyses of all SRM cases and all controls, no genes met the Bonferroni-corrected threshold for statistical significance (P < 4.9 × 10⁻⁶) (Table 3). Similarly, none of the secondary burden test analyses yielded genome-wide significant findings (S5 Table).
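To make the input to a gene-burden test concrete, the sketch below collapses the qualifying rare variants within one gene into a per-person burden score by summing minor-allele counts. This simple unweighted sum is an assumption made for illustration and is not necessarily the collapsing scheme of the cited method [12]; the function name, data layout, and example genotypes are hypothetical.

```python
def burden_scores(genotypes, mafs, qualifying, maf_cutoff=0.01):
    """Collapse rare qualifying variants in one gene into per-person scores.

    genotypes:  one list per person of minor-allele counts (0, 1, or 2),
                ordered by variant.
    mafs:       minor allele frequency of each variant.
    qualifying: True where the variant's annotation passes the functional
                filter (for example, predicted damaging missense).
    """
    keep = [j for j, (maf, ok) in enumerate(zip(mafs, qualifying))
            if ok and maf < maf_cutoff]
    return [sum(person[j] for j in keep) for person in genotypes]


# Hypothetical gene with three variants genotyped in three people.
genotypes = [[0, 1, 0],
             [1, 0, 2],
             [0, 0, 0]]
mafs = [0.004, 0.20, 0.009]
qualifying = [True, True, False]

# Only the first variant is both rare and qualifying.
print(burden_scores(genotypes, mafs, qualifying))  # [0, 1, 0]
```

The resulting score would then serve as the predictor in a per-gene regression of case status on burden, adjusted for the same principal components as the single-variant tests.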

Post-hoc power calculations
We conducted post-hoc power calculations to estimate study power in the primary analysis population to identify single variants associated with SRM given a range of MAFs and effect sizes (Fig 2). For a variant with a MAF of 0.01, we had 98% power to detect an OR of 5, and for a variant with a MAF of 0.005, we had 84% power to detect an OR of 6.

Discussion
We recruited more than 500 patients with statin-related muscle injury from multiple sites in North America and Europe to conduct the largest whole-exome sequencing study of this adverse drug reaction to date. Using high-throughput sequencing technology, joint variant calling at two experienced analysis centres, and rigorous statistical methods, we did not identify any novel coding variants associated with SRM 3-5. We also evaluated hypotheses about the burden of rare coding variants within genes, muscle injury due to specific statin drugs, and the severity of muscle injury; the findings from these analyses were null as well.
In 2008, Link et al. found that a common nonsynonymous variant in the drug transporter SLCO1B1 (rs4149056) was associated with a 4-fold increase in the risk of myopathy among users of high-dose simvastatin at genome-wide levels of statistical significance [4]. This finding has been widely replicated in other settings for other statin drugs [3,5,17-19], and in our study this common coding variant was associated with a 4-fold increased risk of SRM 3-5 among non-users of fibrates. Interestingly, we did not find an association between the SLCO1B1 polymorphism and mild forms of SRM. Attempts to discover additional common variants related to statin-related muscle injury have failed to yield replicable findings [20-22]. A recent study from the Czech Republic conducted whole-exome sequencing on 88 patients with mild statin-associated muscle toxicity, reporting an increased burden of rare variation in 24 genes [23]. We did not replicate these findings in our study, perhaps because of differing phenotypes, as most of our patients had more severe SRM. The validity of the earlier findings may also be limited because the referent group was drawn from publicly available sequencing repositories that lack information on medication use, rather than from statin-tolerant control subjects.
Strengths of our study include the recruitment of patients from multiple international sites; the use of a standardized case definition for different forms of myopathy, with the mildest being CK >4x ULN (SRM 3) and rhabdomyolysis ranging from CK >10x to >40x ULN with end-organ impairment and muscle symptoms [1]; the depth of whole-exome sequencing in both case-control studies; the use of statin-exposed controls to minimize genetic associations with indications for statin therapy; the coordinated statistical analysis plan with correction for multiple testing to reduce the risk of false-positive findings; and the prospective plan for a meta-analysis of results across two large case-control studies. There were several limitations. As with any adverse drug reaction, it is difficult to attribute causality to a specific drug, even with prospectively developed case definitions and expert adjudication of cases. In addition, unknown drug-drug interactions with statins that increase the risk of SRM may have masked genuine drug-gene interactions. We recognize that our inability to recruit a large number of cases from diverse ancestries limits the generalizability of our findings to variants observed in European ancestry participants. Half of the SRM cases in the US-UK case-control study were attributed to cerivastatin, and we were unable to identify a large number of control subjects with whole-exome sequencing data who used this drug, since it was removed from the market in 2001. If there is heterogeneity in the effects of genetic variants on the risk of SRM by the type of statin drug or severity of SRM, this might have reduced our overall power to identify genetic loci for this adverse drug reaction in our primary analysis, which included all samples.
Finally, because of our stringent correction for multiple comparisons, we may have missed an association with rare, loss-of-function variants in biologically plausible metabolising enzyme or transporter genes that could have led to a large increase in drug exposure and therefore muscle damage in some individuals [24].
While our post-hoc calculations demonstrated that our study was well-powered to identify rare coding variants (MAF 1%) with large effect sizes (OR 5), we had limited power to detect rare variants with small- or moderate-sized effects, or to detect even rarer variants with large effects. For instance, to have 90% power to identify a coding variant with a frequency of 0.001 among controls that increases the risk of SRM 6-fold, we would have to recruit and conduct whole-exome sequencing on approximately 2,700 cases with this rare adverse drug reaction, matched with 10,800 treatment-tolerant controls.
It is possible that rare variants exist that increase the risk of SRM, perhaps in non-coding regions of the genome, but they are unlikely to be detected with the sample sizes available in ongoing studies. Based on the null findings from our study, genotyping rare coding alleles for the prediction of severe statin-related muscle injury seems unlikely to yield clinically actionable findings.