Breast Cancer Clinical Trial of Chemotherapy and Trastuzumab: Potential Tool to Identify Cardiac Modifying Variants of Dilated Cardiomyopathy

Doxorubicin and the ERBB2-targeted therapy, trastuzumab, are routinely used in the treatment of HER2+ breast cancer. In mouse models, doxorubicin is known to cause cardiomyopathy, and conditional cardiac knock-out of Erbb2 results in dilated cardiomyopathy and increased sensitivity to doxorubicin-induced cell death. In humans, these drugs also result in cardiac phenotypes, but severity and reversibility are highly variable. We examined the association of decline in left ventricular ejection fraction (LVEF) with 15,204 single nucleotide polymorphisms (SNPs) spanning 72 cardiomyopathy genes in 800 breast cancer patients who received doxorubicin and trastuzumab. For 7033 common SNPs (minor allele frequency (MAF) > 0.01), we performed single marker linear regression. For all SNPs, we performed gene-based testing with SNP-set (Sequence) Kernel Association Tests: SKAT, SKAT-O and SKAT-common/rare, under rare variant non-burden tests; rare variant optimized burden and non-burden tests; and a combination of rare and common variants, respectively. Single marker analyses identified seven missense variants in OBSCN (p = 0.0045–0.0009, MAF = 0.18–0.50) and two in TTN (both p = 0.04, MAF = 0.22). Gene-based rare variant analyses, SKAT and SKAT-O, performed very similarly (ILK, TCAP, DSC2, VCL, FXN, DSP and KCNQ1, p = 0.042–0.006). Gene-based tests of rare/common variants were significant at the nominal 5% level for OBSCN as well as TCAP, DSC2, VCL, NEXN, KCNJ2 and DMD (p = 0.044–0.008). Our results suggest that rare and common variants in OBSCN, as well as in other genes, could have modifying effects in cardiomyopathy.


Introduction
Dilated cardiomyopathy (DCM) is the underlying cause of >50% of heart transplants. The high morbidity and mortality associated with this disease underscore the need for a better understanding of the underlying molecular defects. Efforts to identify these defects have made great progress and acknowledge the complexity of the genetic architecture of DCM. Historically, however, they have relied on a somewhat circular argument: the genetic cause of DCM is described as predominantly autosomal dominant transmission with reduced penetrance and a high degree of locus (>40 genes) and allelic (>197 variants) heterogeneity [1], with most mutations being rare or "private" to each affected family. However, proof of variant pathogenicity within a family, at least for publication in the scientific literature, relies largely on fulfilment of these criteria. In total, ~40% of individuals test positive at known DCM loci [2,3], and even within families who fulfil these criteria, age at onset, response to treatment and disease progression are variable [4]. Despite this fact, the search for the 'missing' genetic cause and the definition of pathogenicity rely predominantly on a familial-based strategy and criteria, making it particularly difficult to demonstrate pathogenicity in sporadic cases, even when all possible causes other than genetic (coronary artery disease, chemotherapy-induced cardiotoxicity, valvular disease or repair) are ruled out.
Clearly, the traditional Mendelian paradigm as the causative genetic contribution to DCM genomic architecture is incomplete, and there is a need for alternative and perhaps novel genomic strategies. Genetic association studies of sporadic cases provide a potential alternative. In the first reported genome-wide association study of DCM cases (1179 cases and 1108 controls), common variants at two loci (BAG3 and HSPB7) were associated at genome-wide significance [5]. These data were encouraging because one of the top loci, BAG3, was also identified as a DCM gene in multiply affected families [6]. These overlapping but independent reports of common genetic variants (allele frequency = 87.5% in sporadic, "idiopathic" DCM cases and 79.2% in controls, OR = 1.89) associated with increased disease risk at a DCM locus that was originally determined by Mendelian rare variant criteria are not hugely surprising, and they lend weight and confidence to further investigation of known DCM genes and common risk variants. The caveat of such an approach is that DCM is a common disease (prevalence now estimated at 1:250) and often late onset [4]. Many members of the general population (potentially used as controls) could be asymptomatic, but due to expense, echocardiographic screening of large control populations is unlikely. In this study, we present an alternative approach to identify potential common risk variants, using a large clinical trial of breast cancer patients.
We postulate that phenotypic variability between individuals with the same DCM mutation is the result of cardiac modifying variants (i.e., we hypothesize that the severity of known DCM mutations could be influenced by individual genetic background). Differences in genetic background have been observed in animal models of DCM. For example, conditional knock-out of cardiac Erbb2 in two different lines of mice resulted in a DCM phenotype in both lines, but one showed much later disease onset [7]. However, the homogeneity within such models can be a disadvantage when extrapolating to the human population, and strategies to tease out the human genetic architecture of DCM are required. For example, ERBB2 itself is not a known DCM gene in the Mendelian sense, but it is the target of the monoclonal antibody and breast cancer drug, trastuzumab (Herceptin), the current standard of care for HER2+ breast cancer patients [8]. In vitro assays of trastuzumab and human iPSC-derived cardiomyocytes demonstrate complete loss of ERBB2 within 48 h [9], a close parallel between the mouse conditional knock-out model and use of trastuzumab in human patients. Indeed, in the first clinical trial of trastuzumab in the metastatic setting [10], the major clinical side-effect was congestive heart failure in up to 27% of patients, although notably, this figure related to patients who received trastuzumab following anthracyclines, already a well-known cause of dose-dependent, irreversible heart failure, often ending in a phenotype of cardiomyopathy [11]. Nonetheless, the incidence of cardiac events was considerably higher in patients who received both anthracycline and trastuzumab than in patients who received anthracycline alone, hence subsequent trials of trastuzumab employed serial echocardiographic monitoring of patients.
These patients may represent an important population to identify cardiac modifying variants because: (1) Phase III clinical trials are typically large (N in the thousands); (2) Patients receive echocardiography as a standard of care, with left ventricular ejection fraction (LVEF) monitoring at baseline, throughout treatment and on completion of treatment; (3) Patients must have baseline LVEF >50% to be eligible for trastuzumab, so are unlikely to have asymptomatic disease prior to treatment; (4) The average age of breast cancer patients entered into phase III trials of Herceptin was >60 years, hence more age representative of DCM patients in the general population; (5) Family history of dilated cardiomyopathy is a risk factor for anthracycline-induced cardiomyopathy [12,13], suggesting an overlap between disease development following chemotherapy and genetic variants at DCM loci.
In this study, we analyzed the association of genetic variants across 72 known cardiomyopathy genes with decline in LVEF in 800 patients from the N9831 clinical trial [8]. All patients in this group were treated with doxorubicin and trastuzumab. We report results of single variant associations of common genetic variants (minor allele frequency (MAF) > 0.01) as well as those of gene-based association testing. These analyses highlight genetic variants at OBSCN, ILK, TCAP, DSC2, VCL, FXN, DSP and KCNQ1 as potential cardiac modifying variants that may be relevant to the development or progression of cardiomyopathy.

Materials and Methods
N9831 Clinical Trial: N9831 was a pivotal clinical trial that led to the use of trastuzumab as the standard of care for early HER2+ breast cancer. To be eligible for the study, patients in the N9831 trial were required to have histologically confirmed adenocarcinoma of the breast with 3+ immunohistochemical staining for HER2 or amplification of the HER2 gene by fluorescence in situ hybridization (≥2.0 ratio), and either lymph node-positive or high-risk lymph node-negative disease. The trial compared adjuvant chemotherapy only (Arm A) vs. adjuvant chemotherapy followed by trastuzumab, either sequentially (Arm B) or concurrently (Arm C), in operable HER2+ breast cancer [8,14]. Patients received serial echocardiograms (ECHO) or multigated acquisition scans (MUGA) for up to 6 years: at baseline, at 3, 6, and 9 months after registration, and after completion of chemotherapy (Figure 1). Long-term cardiac safety analysis was completed in 2016 [15]. The most common cardiac symptom was decline in LVEF by ≥10 points, observed in 26.2% of patients in Arm A (chemotherapy only) and 37.3% of patients who received trastuzumab (Arms B and C). Prevalence of congestive heart failure (CHF) was also significantly higher in patients receiving trastuzumab (3%) compared to those receiving chemotherapy only (0.9%) [15]. The majority of patients who developed CHF received cardiac medications, which included diuretics, beta-blockers, and angiotensin-converting enzyme inhibitors.
DNA extraction and genotyping: Genomic DNA was available for a total of 1446 patients from the trial. DNA was isolated from peripheral blood with the Flexigene kit (Qiagen Inc, Germantown, MD, USA) as per the manufacturer's instructions, normalized to 15 ng/μL and shipped to Affymetrix (Affymetrix Inc, Santa Clara, CA, USA) for full service genotyping. Each 96-well plate contained one duplicate patient sample and two DNA samples routinely used as positive controls by Affymetrix. Genotyping was performed using a customized Axiom genotyping array (Affymetrix Inc, Santa Clara, CA, USA) covering a total of 762,792 single nucleotide polymorphisms (SNPs).
A total of 16 duplicate controls were nested within 1462 DNA samples (1446 unique samples, one duplicate pair per 96-well plate), yielding 100% genotyping concordance across 793,571 SNPs. Primary analyses were confined to White/non-Hispanic patients with complete LVEF data. A total of 188 patients were reported as non-White/Hispanic, principal components analyses identified a further 27 outliers, and 40 patients were missing either baseline or post-treatment LVEF, leaving 1191 patients for analyses (Arm A, N = 391; Arms B + C, N = 800; Supplementary Figure S1). Custom shell and R scripts were used to convert these data to PLINK format, and all quality control (QC) was done using PLINK 1.07.
No samples had a call-rate under 95%. A total of 13,987 SNPs had a call-rate under 95% and were removed from further analyses. Of the remaining 779,584 SNPs, 160,721 had MAF < 1%.
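The SNP-level filters described above can be sketched as follows. This is a minimal illustration in Python, not the PLINK 1.07 procedure actually used in the study; the function name `snp_qc`, the toy genotype matrix and the 0/1/2 coding with NaN for missing calls are all assumptions made for the example.

```python
import numpy as np

def snp_qc(genotypes, min_call_rate=0.95, maf_cutoff=0.01):
    """Hypothetical helper: per-SNP call rate and MAF filters.

    genotypes: samples x SNPs array of allele counts (0/1/2), np.nan = no call.
    """
    called = ~np.isnan(genotypes)
    call_rate = called.mean(axis=0)
    # Frequency of the counted allele among called genotypes only.
    with np.errstate(invalid="ignore", divide="ignore"):
        freq = np.nansum(genotypes, axis=0) / (2 * called.sum(axis=0))
    maf = np.minimum(freq, 1 - freq)
    keep = call_rate >= min_call_rate          # passes the 95% call-rate filter
    common = keep & (maf >= maf_cutoff)        # passes and is "common"
    return call_rate, maf, keep, common

# Toy data (assumed): 20 samples, 3 SNPs.
g = np.zeros((20, 3))
g[:4, 0] = 1          # SNP 0: fully called, MAF = 4/40 = 0.10
g[:2, 1] = np.nan     # SNP 1: call rate 0.90, fails the filter
                      # SNP 2: monomorphic, passes call rate but is not common
call_rate, maf, keep, common = snp_qc(g)

# Arithmetic check of the counts reported in the text.
remaining_snps = 793_571 - 13_987   # 779,584 SNPs after the call-rate filter
```

The thresholds are keyword arguments so that the same sketch covers both the 95% call-rate cut and the 1% MAF boundary between rare and common variants.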
Deviation of the genotype distributions from Hardy-Weinberg equilibrium was tested in those patients whose LVEF did not drop by >10% to below 50%. All SNPs with Fisher's exact test for Hardy-Weinberg equilibrium p < 1.0 × 10⁻⁴ were excluded.
Principal components were calculated on 277,190 independent SNPs (none within a moving window of 50 SNPs could have a variance inflation factor (VIF) > 2) to assess correlation with self-reported race. The set of independent SNPs was also used to determine relatedness. There was no cryptic relatedness apart from duplicates; in total, 18 non-control pairs of samples were considered identical based on high PI_HAT (a PLINK statistic based on estimated IBD) and concordance values.
Gene and SNP selection: In this study, we focus on known DCM genes from the current literature. We report single marker association of common genetic variants (MAF > 0.01) at 72 loci (Table 1), of which 71 are listed in the review of DCM genetic architecture [4] and one, obscurin (OBSCN), was more recently identified as a DCM gene [16]. We also report gene-based analyses, which include both common (MAF > 0.01) and rare (MAF < 0.01) SNPs. The Affymetrix Axiom GWAS genotyping platform has the option to include custom SNPs on a GWAS backbone. We included custom SNPs for all 71 genes in the Hershberger DCM review [4]. We did not include custom SNPs at the OBSCN locus, as the array was designed prior to publication of [16]. The study included a total of 15,203 variants at these 72 loci (median SNPs per gene = 68, range 1–3512, interquartile range = 178), of which 7018 had MAF > 0.01. Each gene and the number of SNPs per gene are listed in Supplementary Table S1.
Table 1. Genetic variants were tested for association with decline in LVEF in the following genes.

Gene Symbol   Gene Name
ABCC9         ATP-binding cassette, sub-family C, member 9
ACTC1         Actin, Alpha, Cardiac Muscle 1
ACTN2         Actinin Alpha 2
AKAP9         A-Kinase Anchoring Protein 9
ANK2

Definition of cardiotoxicity: Several oncology and cardiology organizations provide definitions for cardiotoxicity that encompass overt clinical events and subclinical injury, although there is no universally accepted clinical cut point [17]. The 2014 American Society of Echocardiography and European Association of Cardiovascular Imaging consensus defined cancer therapeutics-related cardiac dysfunction (CTRCD) as a decrease in the LVEF of >10%, to <53% [11]. Reports of cardiotoxicity in the literature range in LVEF from <50% to <55%, in some cases requiring decreases of >15% or 20% [18]. We aimed to avoid the arbitrary nature of this definition by using as our primary endpoint the maximum decline in LVEF observed from baseline during follow-up until three months after discontinuation of trastuzumab or until two years post-treatment, whichever was earliest.
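As an illustration of the primary endpoint, the sketch below computes the maximum decline in LVEF from baseline within a follow-up window. The function name `lvef_change`, the (month, LVEF) data layout and the example cutoff months are hypothetical, and including the baseline value when taking the minimum (so that the change is never positive) is an assumption consistent with the outcome being described as negative or zero.

```python
def lvef_change(baseline, followup, window_end_months):
    """Lowest LVEF within the window minus baseline LVEF (always <= 0).

    followup: list of (months_since_registration, lvef_percent) tuples.
    window_end_months: last month counted toward the endpoint.
    """
    in_window = [lvef for month, lvef in followup if month <= window_end_months]
    # Assumption: baseline is included, so a patient with no decline scores 0.
    lowest = min([baseline] + in_window)
    return lowest - baseline

# Hypothetical cutoffs: three months after stopping trastuzumab falls at
# month 15 and two years post-treatment at month 36; the earlier one applies.
window_end = min(15, 36)

patient_a = lvef_change(62, [(3, 58), (6, 49), (9, 55), (30, 40)], window_end)
patient_b = lvef_change(60, [(3, 63), (6, 61)], window_end)
print(patient_a, patient_b)
```

In this toy example the measurement at month 30 is ignored because it falls outside the window, so the first patient's change is driven by the month 6 value.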
Statistical analyses: Single SNP statistical analyses were performed for 7033 common variants (MAF ≥ 0.01), using R version 3.1.1 and PLINK version 1.07. Linear regression was used with change in LVEF (lowest recorded LVEF − baseline LVEF) as the outcome variable and the number of copies of the minor allele of the variant of interest as the primary predictor variable. Analyses were adjusted for age, baseline LVEF, anti-hypertensive medications and the first two principal components in the 800 patients in Arms B and C who received chemotherapy (doxorubicin, cyclophosphamide and paclitaxel) and trastuzumab.
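The additive single-marker model described above can be sketched with ordinary least squares. This is an illustrative Python version on simulated data, not the R/PLINK analysis used in the study; the function name `additive_snp_regression`, the simulated covariates and the effect size of −1.5 are assumptions. It returns the per-allele effect (β), its standard error and the t statistic; a p-value would follow from the t distribution with n − k degrees of freedom.

```python
import numpy as np

def additive_snp_regression(delta_lvef, minor_allele_count, covariates):
    """OLS of change in LVEF on minor allele count (0/1/2) plus covariates."""
    n = len(delta_lvef)
    X = np.column_stack([np.ones(n), minor_allele_count, covariates])
    coef, _, _, _ = np.linalg.lstsq(X, delta_lvef, rcond=None)
    resid = delta_lvef - X @ coef
    dof = n - X.shape[1]
    sigma2 = resid @ resid / dof                 # residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return coef[1], se[1], coef[1] / se[1]       # beta, SE, t for the SNP

# Simulated cohort (all values are made up for illustration).
rng = np.random.default_rng(0)
n = 800
g = rng.binomial(2, 0.2, n)                      # minor allele counts, MAF 0.2
age = rng.normal(55, 10, n)
baseline = rng.normal(63, 5, n)
antihtn = rng.binomial(1, 0.3, n)                # anti-hypertensive medication
pcs = rng.normal(0, 1, (n, 2))                   # first two principal components
covariates = np.column_stack([age, baseline, antihtn, pcs])
delta = (-5 - 1.5 * g + 0.05 * (age - 55) - 0.1 * (baseline - 63)
         + rng.normal(0, 1, n))

beta, se, t = additive_snp_regression(delta, g, covariates)
```

Because the outcome is a change that is negative or zero, a negative β here means each copy of the minor allele is associated with a larger decline, and a positive β with a smaller one.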
The study included a total of 15,203 variants at 72 genes/SNP-sets (median SNPs per gene = 68, range 1–3512, interquartile range = 178), of which 7018 had MAF > 0.01. Each gene and the number of SNPs in each gene set are listed in Supplementary Table S1. Gene-based statistical analyses were performed with the SNP-set (Sequence) Kernel Association Test (SKAT), which aggregates individual test-score statistics for each of the 72 gene sets to compute gene-level p-values, while adjusting for age, baseline LVEF, anti-hypertensive medications and the first two principal components. Three variations of this test were performed: (1) SKAT [19], a rare variant non-burden test (more powerful when a large fraction of the variants in a gene are non-causal or the effects of causal variants are in different directions); (2) SKAT-O [20], a rare variant test that optimally combines burden tests (more powerful when most variants in a region are causal and the effects are in the same direction) and non-burden tests; (3) SKAT-common/rare [21], a combination of rare and common variants (weighting rare and common variants equally).
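To make the non-burden idea concrete, the sketch below computes a SKAT-style variance-component statistic: covariate-adjusted residuals from the null model are projected onto each variant, and the squared, weighted per-variant scores are summed, so effects in opposite directions do not cancel. This is a simplified illustration and not the SKAT R package. The Beta(1, 25) weighting on MAF follows the default described for SKAT, but the p-value here comes from permuting residuals, which is swapped in for the analytic mixture-of-chi-squares p-value that SKAT computes. The function names and simulated data are assumptions.

```python
import numpy as np

def null_residuals(y, covariates):
    """Residuals from the null model (intercept + covariates, no genotypes)."""
    X = np.column_stack([np.ones(len(y)), covariates])
    coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef

def skat_q(resid, G, a=1.0, b=25.0):
    """Q = sum_j (sqrt_w_j * S_j)^2, with S_j = g_j' resid.

    sqrt_w_j is the Beta(a, b) density at MAF_j up to a constant factor,
    which does not affect a permutation p-value.
    """
    maf = G.mean(axis=0) / 2
    sqrt_w = maf ** (a - 1) * (1 - maf) ** (b - 1)
    score = G.T @ resid
    return float(np.sum((sqrt_w * score) ** 2))

def skat_permutation_p(y, G, covariates, n_perm=499, seed=0):
    """Approximate p-value by permuting null-model residuals (a stand-in)."""
    rng = np.random.default_rng(seed)
    resid = null_residuals(y, covariates)
    q_obs = skat_q(resid, G)
    hits = sum(skat_q(rng.permutation(resid), G) >= q_obs
               for _ in range(n_perm))
    return q_obs, (1 + hits) / (n_perm + 1)

# Simulated gene set (made-up values): 200 patients, 10 mostly rare variants.
rng = np.random.default_rng(1)
n, p = 200, 10
mafs = rng.uniform(0.005, 0.05, p)
G = rng.binomial(2, mafs, size=(n, p)).astype(float)
covariates = rng.normal(0, 1, (n, 3))
y = covariates @ np.array([0.5, -0.3, 0.2]) - 2.0 * G[:, 0] + rng.normal(0, 1, n)

q_obs, p_perm = skat_permutation_p(y, G, covariates)
```

A burden test would instead collapse the variants into a single weighted count per patient before testing, which gains power when effects share a direction; SKAT-O searches over a mixture of the two statistics.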

Single Marker Analyses of Common Variants
In total, 13/72 genes (VCL, DMD, OBSCN, RYR2, TPM1, KCNQ1, JAG1, SGCD, SCN5A, RBM20, SCN4B, TTN and CACNA1C) showed at least one SNP (MAF > 0.01) with evidence of association with chemotherapy- and trastuzumab-induced decline in LVEF, p < 0.05 (Table 2).
The most significant association was a DMD intronic variant, rs12559939, p = 0.0005. This association is supported by a highly correlated (r² = 0.90, D' = 0.95) flanking intronic SNP within 3 kb, rs141927233, p = 0.0006, MAF = 0.19, both with relatively small effect size, β = 1.48 and 1.45, respectively. As our analysis was based on an additive model, and the change in LVEF response variable was negative or zero (by definition), this would suggest that for these SNPs each copy of the minor allele results in a smaller decline in LVEF following combination doxorubicin and trastuzumab, i.e., if the association is true, the minor allele is protective against therapy-induced decline in LVEF.
We next looked within the common variant analyses for associated missense variants. Missense variants in two genes, OBSCN and TTN, were significantly associated at the p < 0.05 level with decline in LVEF. In total, 12 of the 55 OBSCN variants were associated with change in LVEF, seven of which were missense variants (Table 2) with minor allele frequencies ranging from 0.14 to 0.50; all had small estimated effect sizes, ranging from −1.43 to 1.05. Under Bonferroni correction, genotyping 55 SNPs at this locus would require a p-value of 0.0009 to remain significant after correction. Three missense variants met this criterion: rs56021350 Thr4399Met (p = 0.0009), rs4653942 Arg4534His (p = 0.0007), and rs1188710 Gln5891Glu (p = 0.0008). The minor allele at each variant was associated with a greater decline in LVEF following therapy. Linkage disequilibrium values show some correlation between these variants (Figure 2). rs56021350 and rs4653942 (MAF = 0.18 and 0.20) are correlated, r² = 0.87, but clearly this signal is independent of rs1188710 (MAF = 0.47), suggesting, if these associations are true, there are at least two common, independent missense variants, each with negative effect on LVEF following doxorubicin and trastuzumab treatment.
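The Bonferroni threshold quoted for the 55 OBSCN SNPs is simple arithmetic, shown here as a short Python check. The p-values are the rounded figures reported in the text, so the comparison for rs56021350 (reported as 0.0009 against a threshold of about 0.00091) is borderline and depends on the unrounded value.

```python
alpha = 0.05
n_tests = 55                      # SNPs genotyped at the OBSCN locus
threshold = alpha / n_tests       # about 0.00091, quoted as 0.0009 in the text

# Rounded p-values as reported for the three OBSCN missense variants.
reported = {
    "rs56021350": 0.0009,   # Thr4399Met
    "rs4653942": 0.0007,    # Arg4534His
    "rs1188710": 0.0008,    # Gln5891Glu
}
passing = [rs for rs, p in reported.items() if p <= threshold]
print(round(threshold, 5), passing)
```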
Figure 2. OBSCN variants, p < 0.05, linkage disequilibrium plot. Three missense variants were significant at OBSCN following Bonferroni correction for testing 55 variants at this locus: rs56021350/Thr4399Met, rs4653942/Arg4534His and rs1188710/Gln5891Glu (p = 0.0009, 0.0007 and 0.0008, respectively). rs56021350 and rs4653942 are in linkage disequilibrium, but clearly, rs1188710 is not correlated with these variants, suggesting multiple independent variants at this locus.
In total, 2/19 associated variants at TTN were missense variants, rs3829746 Ile26134 and rs1001238 Asn17060Asp, p = 0.04. None remained significant after correction for multiple testing, but linkage disequilibrium analyses showed all variants to be in high linkage disequilibrium (Figure 3), suggesting they are not independent tests. Again, the effect of the minor allele(s) is positive (β ranging 0.79–1.08), suggesting a protective effect against decline of LVEF following therapy.

Gene-Based Analyses
We next moved to examine rare variant gene-based significance. As these analyses are exploratory, we used both the non-burden sequence kernel association test, SKAT (which is more powerful when a large fraction of variants in a region are non-causal or the effects of causal variants are in different directions), and the optimal unified burden and non-burden test, SKAT-O. Both tests performed similarly. Eight genes (ILK, TCAP, DSC2, VCL, DSG2, FXN, DSP and KCNQ1) were significant at the nominal p < 0.05 level with SKAT-O, all of which were also significant with SKAT, with the exception of DSG2 (Table 3).
Table 3. Rare and common variant gene-based analyses of decline in LVEF following treatment with doxorubicin and trastuzumab, p < 0.05. Gene-based analyses were performed with SKAT, SKAT-O and SKAT CR (common/rare) functions. N.Marker All represents the total number of variants. N.Marker Test represents the number of variants used in the gene-based analyses (monomorphic variants are excluded). For the SKAT common/rare function, N.Marker.Rare is the number of analyzed variants with MAF < 0.025 and N.Marker.Common is the number of analyzed variants with MAF ≥ 0.025.
Under the expectation that cardiac-modifying variants could be common with small effects, or both rare and common, we also used the rare/common function in SKAT, weighting rare and common variants equally. Seven genes were significant under the rare/common function, three of which were already identified under rare variant scenarios: TCAP, DSC2 and VCL (p = 0.025, 0.008 and 0.026, respectively). The rare/common function of SKAT also identified four additional genes, NEXN, KCNJ2, DMD and OBSCN (p = 0.044, 0.031, 0.009 and 0.019, respectively), two of which were not identified in the initial single marker analysis of common variants (NEXN and KCNJ2).

Discussion
The genomic architecture of dilated cardiomyopathy is complex, with a high degree of phenotypic variability that could be accounted for by cardiac modifying variants. As an exploratory effort to identify putative modifying variants, we conducted a genetic association study of decline in LVEF following treatment with combination doxorubicin (known to induce cardiomyopathy in animal models and humans) and trastuzumab (a targeted therapy for ERBB2, which is crucial in prevention of dilated cardiomyopathy in mice [7], with known cardiotoxicity in clinical trials [10,15]). The study included 800 patients from a breast cancer clinical trial and covered 72 genes that are causative of cardiomyopathies.
Perhaps the strongest result from these analyses is the association with obscurin (OBSCN), a large gene (two giant isoforms, >100 exons, spanning 170 kb). In an initial screen of OBSCN as a candidate for hypertrophic cardiomyopathy (HCM), Arimura et al. [22] identified a variant, OBSCN Arg4344Gln (within the Ig48-49 domain), in a 19-year-old affected male. Functional analyses demonstrated that the Arg4344Gln variant affected binding of obscurin to the Z9-Z10 domains of Titin [22]. Our own single marker, common variant analyses identified associations with decline in LVEF with two missense variants in this domain: rs56021350/Thr4399Met and rs61825301/His4489Gln. Both variants were present at MAF = 0.18 in 800 patients treated with doxorubicin and trastuzumab, with the minor allele associated with larger decline in LVEF, p = 0.001, following treatment. Our study also observed association with rs3795801/Gly4666Ser, MAF = 0.18, p = 0.001, again with the minor allele associated with larger decline in LVEF following treatment. This variant maps to the calmodulin binding region (Ig51/52) domain and was also identified in the Arimura study [22] of 144 unrelated HCM patients, but disregarded because it was present in the SNP database.
OBSCN was also recently identified as causative of dilated cardiomyopathy (DCM) [16] based on the observation of five potentially disease-causing mutations in four of 30 patients screened by whole exome sequencing. Marston et al. [16] reported that 15% of the potentially disease-causing variants were in the OBSCN gene, which the authors likened to the frequency of truncating mutations in TTN, which has been proposed as a major causative gene of DCM, suggesting mutations in OBSCN may also be significant contributors to DCM burden. Our single marker analyses of common variants and also our gene-based analyses (including 38 common and 44 rare variants) are in agreement, and we further suggest that common and rare variants in OBSCN may contribute to DCM burden or perhaps modify disease progression/outcome.
In a study of 312 DCM patients, TTN truncating variants were reported in 25% of familial and 18% of sporadic cases [23]. A subsequent study identified TTN truncating variants in 6/17 DCM families [24], not all of which segregated with disease, illustrating the difficulty of determining variant pathogenicity. We had hoped that our exploratory study might shine some light on causality at this locus, but we observed only minimal evidence for the association of common variants, despite the large coding region (>300 exons) and the inclusion of 275 common variants in our analyses. The association we did observe appeared to come from variants with a positive value of beta (suggesting lesser decline in LVEF following treatment), all in high linkage disequilibrium, including 17 non-coding and two missense variants (p = 0.019–0.047). If this signal is real, the predicted effect on LVEF would be protective against doxorubicin and trastuzumab.
In summary, our data are suggestive of genetic modifying variants that may increase risk of, or protect against development and/or progression of cardiomyopathy. Several of the associated variants in our study have been previously identified in sequencing studies of familial cardiomyopathy, but likely discarded because they were present in public SNP databases, even at low frequency.
All associated common variants (MAF > 0.01) in this study are shown in Table 2. Given the heterogeneity observed within DCM, even within family members carrying the same "causative" variant, a potential strategy would be to ask whether those family members with the worst outcome (earliest onset) were also positive for modifying alleles in the same gene, reported to have negative impact on LVEF.
The limitations of the study are the exploratory nature and testing of multiple genes under multiple scenarios of rare and common variants. Given that several of the associated 'modifying' variants are coding, perhaps the next steps are testing in model organisms. This functional testing would also discern whether specific variants are modifiers of the effects of doxorubicin, trastuzumab or combination therapy.
Supplementary Materials: The following are available online, Table S1: 72 known cardiomyopathy genes and SNPs used in common and rare variant analyses. Figure S1: N9831 PCA plot, Arms A, B and C GWAS data.