A predictive model for canine dilated cardiomyopathy—a meta-analysis of Doberman Pinscher data

Dilated cardiomyopathy is a prevalent and often fatal disease in humans and dogs. Indeed, dilated cardiomyopathy is the third most common form of cardiac disease in humans, reported to affect approximately 36 per 100,000 individuals. In dogs, dilated cardiomyopathy is the second most common cardiac disease and is most prevalent in the Irish Wolfhound, Doberman Pinscher and Newfoundland breeds. Dilated cardiomyopathy is characterised by ventricular chamber enlargement and systolic dysfunction, which often leads to congestive heart failure. Although multiple human loci have been implicated in the pathogenesis of dilated cardiomyopathy, the identified variants are typically associated with rare monogenic forms of dilated cardiomyopathy. The potential for multigenic interactions contributing to human dilated cardiomyopathy remains poorly understood. Consistent with this, several known human dilated cardiomyopathy loci have been excluded as common causes of canine dilated cardiomyopathy, although canine dilated cardiomyopathy resembles the human disease functionally. This suggests additional genetic factors contribute to the dilated cardiomyopathy phenotype. This study represents a meta-analysis of available canine dilated cardiomyopathy genetic datasets with the goal of determining potential multigenic interactions relating the sex chromosome genotype (XX vs. XY) with known dilated cardiomyopathy-associated loci on chromosome 5 and the PDK4 gene in the incidence and progression of dilated cardiomyopathy. The results show an interaction between known canine dilated cardiomyopathy loci and an unknown X-linked locus. Our study is the first to test a multigenic contribution to dilated cardiomyopathy and suggests a genetic basis for the known sex disparity in dilated cardiomyopathy outcomes.


INTRODUCTION
Dilated cardiomyopathy (DCM) is a prevalent and often fatal disease requiring clinical management in humans and dogs (Egenvall, Bonnett & Häggström, 2006; Hershberger, Morales & Siegfried, 2010). DCM is the second most common cardiac disease in dogs and is characterised by ventricular chamber enlargement and systolic dysfunction, which often leads to congestive heart failure (Egenvall, Bonnett & Häggström, 2006). The aetiology of DCM is complex. Genetic factors, myocardial ischemia, hypertension, toxins, infections and metabolic defects have been implicated (McNally, Golbus & Puckelwartz, 2013). To date, mutations in over 50 genes have been associated with DCM in humans; however, mutations in the most prevalent DCM-related genes only account for approximately 50% of patients with DCM (Posafalvi et al., 2013). In human DCM genetic testing, where a panel of approximately 50 loci is tested concurrently, more than one locus can often be implicated in the disease (McNally, Golbus & Puckelwartz, 2013), suggesting that multiple genetic factors cooperate in DCM aetiology.
Canine DCM is phenotypically similar to human DCM (Shinbane et al., 1997). As outlined below, mutations in only two genes (PDK4 and STRN) and a single nucleotide polymorphism (SNP) on chromosome 5 have been associated with canine DCM to date (Mausberg et al., 2011; Meurs et al., 2012; Meurs et al., 2013), suggesting additional genetic causes remain unknown. While canine studies have sometimes been limited by small sample size (typically fewer than 10 individuals), those studies with larger sample numbers (greater than 50 individuals) have also frequently failed to find significant associations with DCM (e.g., Philipp et al., 2007; Philipp, Vollmar & Distl, 2008; Wiersma et al., 2008). One possible explanation for the challenges in identifying DCM-associated loci in humans and dogs is that even within an extended family or breed, genetic variation at a single locus cannot explain the development of DCM. Indeed, dog breeds can be considered as large families, with dogs within a breed more related to each other than to dogs of other breeds (Parker et al., 2004). In the same way that some human families are affected by DCM, a subset of dog breeds are affected by DCM more frequently than others (Egenvall, Bonnett & Häggström, 2006). Doberman Pinschers (hereafter Dobermans) are particularly affected by DCM, with both a high prevalence (58.2% in European Dobermans) and high severity, with DCM-associated death often occurring within 8 weeks of diagnosis (Calvert et al., 1997; Wess et al., 2010). In dogs, diagnosis is usually at the onset of clinical symptoms of heart failure. However, there is an extended pre-clinical phase, during which treatment can be effective by delaying the onset of heart failure (Summerfield et al., 2012).
In this phase, left ventricular dilation and dysfunction begin, and can be accompanied by ventricular premature complexes (Singletary et al., 2012). Median life expectancy of DCM-affected European Dobermans is 7.8 years, compared with 11 years for unaffected European Dobermans (Proschowsky, Rugbjerg & Ersbøll, 2003; Egenvall, Bonnett & Häggström, 2006). A deletion in a splice site of the PDK4 gene (Meurs et al., 2012) and a SNP on chromosome 5 (Mausberg et al., 2011) in Dobermans are two of only three canine DCM mutations identified to date. While these two loci are associated with Doberman DCM, individually neither locus explains all cases of Doberman DCM (Mausberg et al., 2011; Meurs et al., 2012). Individuals heterozygous at the Chr5 SNP are more likely to develop DCM, but there are many DCM cases homozygous for the healthy allele (Mausberg et al., 2011). While PDK4 genotypes are less definite predictors of DCM, with both affected and unaffected individuals possessing the three possible genotypes, the 16 bp PDK4 splice site deletion is found more frequently in North American Dobermans with DCM than in those without DCM (Meurs et al., 2012). However, an analysis of European Dobermans failed to identify an association between PDK4 and DCM (Owczarek-Lipska et al., 2013), suggesting additional unknown factors influence the effect of PDK4 in predisposing individuals to DCM. Thus novel genetic causes of canine DCM remain to be identified (Mausberg et al., 2011; Philipp et al., 2012).
In this study we developed genetic models to test the influence of unknown genetic factors and to predict which DCM-associated genotype combinations are likely to lead to DCM. Using this method, we provide evidence for a sex-linked genetic influence on known DCM loci in the pathogenesis of canine DCM. Our study is the first to propose a multigenic contribution to canine DCM.

Determining which genotypes develop DCM
Six genetic models incorporating genotypes at multiple observed and hypothetical loci were developed: 1. two known DCM loci; 2. two known loci + 50% of the population more susceptible to developing DCM; 3. two known loci + a novel autosomal dominant DCM locus; 4. two known loci + a novel autosomal recessive DCM locus; 5. two known loci + a novel additive DCM locus; and 6. two known loci + a novel X-linked DCM locus. For each model, different biologically feasible phenotype outcomes were tested for each genotype combination to establish the best fit of the model to the observed DCM incidence data. Each model was subject to the following constraints: individuals that are homozygous CC at the Chr5 SNP develop DCM, and individuals with no susceptibility alleles are healthy.
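As a rough illustration of how the genotype combinations and the two constraints fit together, the sketch below enumerates the nine PDK4 × Chr5 combinations of the two-locus model and fixes only the phenotypes that the constraints dictate. The names `PDK4_GENOTYPES`, `CHR5_GENOTYPES` and `constrained_phenotype`, and the use of `None` for combinations left open to fitting, are assumptions made for illustration and are not the procedure used in the study.

```python
from itertools import product

# Genotype labels as used in the text: Wt/Del at PDK4, T/C at the Chr5 SNP.
PDK4_GENOTYPES = ["WtWt", "WtDel", "DelDel"]
CHR5_GENOTYPES = ["TT", "TC", "CC"]


def constrained_phenotype(pdk4, chr5):
    """Return the phenotype fixed by the two constraints, or None if open.

    None marks combinations whose outcome is left for model fitting
    (an illustrative convention, not taken from the study).
    """
    if chr5 == "CC":
        return "DCM"      # constraint 1: CC homozygotes develop DCM
    if pdk4 == "WtWt" and chr5 == "TT":
        return "healthy"  # constraint 2: no susceptibility alleles
    return None


# Map each two-locus genotype combination to its constrained phenotype.
combinations = {
    f"{pdk4}-{chr5}": constrained_phenotype(pdk4, chr5)
    for pdk4, chr5 in product(PDK4_GENOTYPES, CHR5_GENOTYPES)
}

for name, phenotype in combinations.items():
    print(name, phenotype if phenotype else "to be fitted")
```

Under these assumptions, four of the nine combinations are fixed by the constraints and the remaining five are the ones for which alternative phenotype outcomes would be tested.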

Model testing
For each model, the frequency of each genotype combination was calculated by multiplying the genotype frequencies, using PDK4 and Chr5 frequencies (Table 1) obtained from Mausberg et al. (2011) and Owczarek-Lipska et al. (2013).

Table 1 Genotype frequencies at the PDK4 locus and Chr5 SNP. Frequencies obtained from Owczarek-Lipska et al. (2013) and Mausberg et al. (2011).

A range of frequencies was tested for each hypothetical locus. For example, for the model incorporating only PDK4 and Chr5 variants, one genotype combination is WtWt-TT. The frequency of this genotype combination is the product of the frequency of WtWt and the frequency of TT in the population. From the combined genotype frequencies, the expected number of individuals with each genotype combination was calculated by multiplying the frequency by the number of individuals in the study used for comparison (182 for Mausberg et al. (2011) and Owczarek-Lipska et al. (2013)). Thus, the numbers of individuals in the model that were, for example, WtWt healthy and WtWt DCM were obtained by summing the numbers in each category. Having obtained the numbers of affected and unaffected individuals that the model predicts for each genotype, these were tested against the observed data using a χ² test. Where additional putative DCM loci were included in the model, several allele frequencies were tested. However, as GWAS have previously been carried out (Mausberg et al., 2011; Meurs et al., 2012), it is unlikely that additional DCM alleles are at higher frequencies than those already identified. For this reason, DCM allele frequencies over 0.5 were not tested. If the model is a good fit to the observed data, the χ² test statistic will be non-significant. The proportion of the population that the model predicts to have DCM was determined by taking the sum of all the combined genotype frequencies that lead to DCM in the model. For example, for the model incorporating just the two known loci this is 0.0144 + 0.0624 + 0.0052 + 0.0048 + 0.0004 = 0.0872 (Table S1). This proportion was then compared to the observed DCM frequency of 0.582 (Wess et al., 2010).
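The calculation described above can be sketched in a few lines. The genotype frequencies in `pdk4_freq` and `chr5_freq` are placeholder values chosen for illustration and are not the Table 1 values; only the study size of 182 and the five combined frequencies summed at the end are taken from the text. The function names are hypothetical.

```python
# Placeholder genotype frequencies (NOT the Table 1 values).
pdk4_freq = {"WtWt": 0.70, "WtDel": 0.25, "DelDel": 0.05}
chr5_freq = {"TT": 0.60, "TC": 0.30, "CC": 0.10}

N_INDIVIDUALS = 182  # study size used for comparison in the text


def combined_frequency(pdk4, chr5):
    """Frequency of a two-locus combination as the product of the
    single-locus genotype frequencies."""
    return pdk4_freq[pdk4] * chr5_freq[chr5]


def expected_count(pdk4, chr5, n=N_INDIVIDUALS):
    """Expected number of individuals carrying the combination."""
    return combined_frequency(pdk4, chr5) * n


print("WtWt-TT frequency:", combined_frequency("WtWt", "TT"))
print("WtWt-TT expected count:", expected_count("WtWt", "TT"))

# Predicted DCM proportion for the two-locus model, using the five
# combined frequencies quoted in the text (Table S1).
dcm_frequencies = [0.0144, 0.0624, 0.0052, 0.0048, 0.0004]
predicted_dcm = sum(dcm_frequencies)
print("Predicted DCM proportion:", round(predicted_dcm, 4))  # 0.0872
```

The last figure is the one that is compared with the observed DCM frequency of 0.582.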
For most models, it must be assumed that there is no difference in DCM incidence between the sexes, as an effect of sex has not been included. For the model testing a 50% increased susceptibility, where it is biologically feasible that males are more susceptible, and for the models incorporating an additional X-linked locus, it is possible to calculate the proportions of males and females that develop DCM. While males develop clinical symptoms earlier and appear to be more severely affected, there are indications that the sex ratio of those affected by DCM is close to 50% male, 50% female (Wess et al., 2010), so we would expect our model to reflect this.
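To show why an X-linked locus makes sex-specific predictions possible, the sketch below computes genotype frequencies at such a locus separately for males and females. It assumes Hardy-Weinberg proportions in females and a single X chromosome in males, which the text does not state explicitly; the allele labels `S` (susceptibility) and `N` (alternate) and the function name are hypothetical. The allele frequency of 0.46 is the value reported later in the text for the X-linked model.

```python
def x_linked_genotype_frequencies(q):
    """Genotype frequencies at an X-linked locus.

    q is the frequency of the susceptibility allele S. Males carry one
    X chromosome, females two (Hardy-Weinberg proportions assumed).
    """
    p = 1.0 - q
    males = {"S": q, "N": p}
    females = {"SS": q * q, "SN": 2 * p * q, "NN": p * p}
    return males, females


males, females = x_linked_genotype_frequencies(0.46)
for genotype, freq in males.items():
    print("male", genotype, round(freq, 4))
for genotype, freq in females.items():
    print("female", genotype, round(freq, 4))
```

Multiplying these sex-specific frequencies by the PDK4 and Chr5 genotype frequencies gives combined frequencies for each sex. Which of those combinations lead to DCM depends on the phenotype assignments of the model, which are not reproduced here.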
Table 2 Genotype odds ratios from the original studies reporting an association. Ratios from the PDK4 locus (Meurs et al., 2012) and Chromosome 5 SNP (Mausberg et al., 2011). The PDK4 χ² test results indicate that the WtWt genotype is significantly associated with non-DCM and the WtDel genotype is significantly associated with DCM at the 0.01 significance level; the DelDel genotype odds ratio, whilst different from the null result of 1, is not significantly so. For the chromosome 5 SNP, all individuals that are CC in the original study developed DCM; thus an odds ratio and confidence interval cannot be calculated, but χ² tests can be performed on the data. TT is significantly associated with non-DCM and the TC and CC genotypes are significantly associated with DCM at the 0.01 significance level.

Table 3 Allele odds ratios from the original studies reporting an association. Ratios from the PDK4 locus (Meurs et al., 2012) and Chromosome 5 SNP (Mausberg et al., 2011). The χ² test results indicate that each susceptibility allele (Del and C, respectively) is significantly associated with DCM and the alternate allele is significantly associated with non-DCM at the 0.01 significance level.

Odds ratios of each genotype and allele developing DCM for each model were obtained by testing each genotype against the other two combined and each allele against the other. An odds ratio is the odds of an individual with a particular genotype or allele developing DCM divided by the odds of an individual with any other genotype or allele developing DCM; an odds ratio greater than one indicates that the genotype or allele is associated with the trait of interest, and an odds ratio of less than one that it is not (Bland & Altman, 2000). For example, the odds ratio for TT in the published data from Mausberg et al. (2011) is calculated as follows. There are 45 TT individuals with DCM and 85 healthy TT individuals, so the odds of a TT individual developing DCM are 45/85 (0.53). There are 43 TC or CC individuals with DCM and 9 healthy TC or CC individuals, so the odds of these individuals developing DCM are 43/9 (4.78). Dividing the odds for the genotype of interest by the odds for the other genotypes gives an odds ratio of 0.11. To assess the significance of these ratios, χ² tests were performed on the 2 × 2 tables; in the above example the four groups are TT-DCM, TT-healthy, TC or CC-DCM and TC or CC-healthy. If the model is a good fit to the observed data, the odds ratios are expected to show a similar pattern and significance, e.g., TT, small and significantly not associated with DCM; TC, large and significantly associated with DCM; CC, not testable, as for the Chr5 SNP in Table 2. Odds ratios of both genotypes and alleles were obtained from the original studies (Tables 2 and 3).
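The worked TT example can be reproduced with the short sketch below, which uses the four counts quoted in the text. The χ² statistic is computed as a plain Pearson statistic without continuity correction; whether the original analysis applied a correction is not stated, so this choice is an assumption, as are the function names.

```python
def odds_ratio(a_case, a_control, b_case, b_control):
    """Odds ratio for group A versus group B from a 2 x 2 table."""
    return (a_case / a_control) / (b_case / b_control)


def chi_square_2x2(a_case, a_control, b_case, b_control):
    """Pearson chi-square statistic (1 df), no continuity correction."""
    observed = [[a_case, a_control], [b_case, b_control]]
    total = a_case + a_control + b_case + b_control
    row_totals = [sum(row) for row in observed]
    col_totals = [a_case + b_case, a_control + b_control]
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed[i][j] - expected) ** 2 / expected
    return statistic


# Counts quoted in the text for the Chr5 SNP (Mausberg et al., 2011).
tt_dcm, tt_healthy = 45, 85
other_dcm, other_healthy = 43, 9  # TC or CC

or_tt = odds_ratio(tt_dcm, tt_healthy, other_dcm, other_healthy)
chi2_tt = chi_square_2x2(tt_dcm, tt_healthy, other_dcm, other_healthy)

CRITICAL_1_PERCENT_1DF = 6.635  # chi-square critical value, 1 df, alpha = 0.01
print("Odds ratio for TT:", round(or_tt, 2))  # 0.11
print("Significant at the 1% level:", chi2_tt > CRITICAL_1_PERCENT_1DF)
```

The same two functions apply to any genotype-versus-others or allele-versus-allele comparison, provided none of the four cells is zero. When a cell is zero, as for CC-healthy, the odds ratio is undefined, which is why only the χ² test is reported for CC.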

RESULTS
Following the constraints stated in the methods and using biologically feasible reasoning, each model was optimised to best fit the observed data. For each model, the genotype-phenotype decision descriptions are shown in Table 4. Tables of each model are in Supplemental Information.

Comparing model predictions with observed data
The χ² test values comparing predicted numbers with observed numbers of DCM and healthy individuals at each genotype ranged from 4.35 to 7766.06. A χ² value of less than 11.07 indicates that there is no significant difference between predicted and observed genotype-phenotype data (5% significance level, 5 degrees of freedom). Values less than 15.09 represent predictions not significantly different from observed values at the 1% significance level. χ² values less than these critical values are indicated in Table 5.
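A minimal sketch of this comparison is given below. The two critical values are those quoted above for 5 degrees of freedom; the observed and predicted counts are invented for illustration (both sum to 182) and are not data from the study, and the function names are hypothetical. The sketch assumes every predicted count is non-zero.

```python
# Critical values quoted in the text for 5 degrees of freedom.
CRITICAL_5_PERCENT = 11.07
CRITICAL_1_PERCENT = 15.09


def chi_square_fit(observed, predicted):
    """Pearson chi-square statistic comparing observed with predicted
    counts. Assumes every predicted count is non-zero."""
    return sum((o - p) ** 2 / p for o, p in zip(observed, predicted))


def classify_fit(statistic):
    """Place a chi-square statistic relative to the two critical values."""
    if statistic < CRITICAL_5_PERCENT:
        return "not significantly different at the 5% level"
    if statistic < CRITICAL_1_PERCENT:
        return "not significantly different at the 1% level"
    return "significantly different"


# Invented counts for six genotype-phenotype categories (illustrative only).
observed = [40, 80, 30, 12, 15, 5]
predicted = [44, 78, 27, 14, 13, 6]

statistic = chi_square_fit(observed, predicted)
print(round(statistic, 2), classify_fit(statistic))

# The extremes of the range reported in the text.
print(4.35, classify_fit(4.35))
print(7766.06, classify_fit(7766.06))
```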

Model predicted DCM population frequency and sex incidence
For each model, the predicted DCM frequency was calculated to provide an additional test of the accuracy of the model. The DCM frequency in the European Doberman population is estimated to be 58.2% (Wess et al., 2010); therefore accurate models should predict a similar frequency. The frequencies predicted by each model are displayed in Table 6 (see also Table S2), with those within 10% of the reported frequency highlighted as accurate models. In addition, the proportions of males and females that each model predicts to develop DCM were calculated. Whilst most models do not account for sex and assume equal numbers of males and females affected, two models tested either a 50% increase in male susceptibility or an additional X-linked locus. Based on reported DCM incidence, for a model to fit the observed data it is expected that similar proportions of males and females develop DCM. Table 7 shows that, irrespective of the frequency of the novel susceptibility allele, the model incorporating a novel X-linked DCM locus gives similar proportions of affected males and females.

Odds ratios
For the Chr5 SNP there is no odds ratio for CC, as all individuals that are CC develop DCM in both the original study (Mausberg et al., 2011) and the models, so odds ratios cannot be calculated. Despite this, a χ² test can be performed on the counts of affected and unaffected individuals observed and predicted with the genotype, so the significance of the results was obtained. For the Chr5 SNP, the genotype odds ratios of 12 of 18 models (Table 9) and 15 of the allele odds ratios (Table 11) are consistent with the original studies. The PDK4 deletion association was identified in the North American Doberman population; in the European population, the odds ratios (Tables 8 and 10) are not significantly different from the null result of 1. Once combined with additional loci, significant likelihood ratios similar to those of the North American population are obtained for 13 of 18 models (Tables 8 and 10).

Notes. * significant at 5% level; ** significant at 1% level.

Selecting the most realistic model
For a model to be considered plausible, it should predict similar numbers of affected and unaffected individuals at each genotype as observed in Mausberg et al. (2011) and Owczarek-Lipska et al. (2013), predict a similar DCM frequency to that reported in the population (Wess et al., 2010), and give odds ratios of genotypes and alleles similar to those from the studies which report an association. To assist in determining which models meet these requirements, Table 12 shows which conditions each model meets (Tables S3-S6).

Table 9 Odds ratios of each Chr5 SNP genotype with χ² significance.

DISCUSSION
This study used publicly available data to test the prediction that genetic models incorporating multiple factors can better explain and predict the incidence of canine DCM than those utilising a single factor. Until now, the possibility that multiple genes combine to influence DCM phenotype has been proposed, but has not yet been tested, despite an established role for multiple loci in related diseases (Ingles et al., 2005; Xu et al., Rampersaud et al., 2011; Posafalvi et al., 2013). This is the first study to investigate the combined effect of multiple factors on the predisposition to DCM. Although our models do not explain all cases of canine DCM, by combining three factors (PDK4, Chr5 TIGRP2P73097 SNP and an X-linked locus) we show that DCM incidence can be more accurately predicted (Tables 6-12). Furthermore, as noted above, the PDK4 splice site deletion is not significantly associated with DCM in the European population. However, in the model incorporating only the two known loci, the PDK4 variant improves the odds ratio for the Chr5 SNP. Collectively, these findings indicate that models incorporating multiple factors are more effective than those incorporating a single factor. This result is important because it has implications for future studies of the genetics and management of DCM. A better understanding of the genetic basis of DCM will permit monitoring of, and earlier clinical intervention in, high-risk individuals, thus potentially improving the outcome for affected individuals.
To assess the accuracy of each model, we performed several statistical tests. For any model to be considered an accurate representation of observed data, it should predict similar numbers of affected and unaffected individuals at each genotype as have been reported in the published data. It should also predict a similar DCM frequency to that found in the population. In addition, the odds ratios of genotypes and alleles should support an association of the specific variants with DCM. The model incorporating the two known DCM loci and an additional X-linked locus, with a frequency of 0.46 for the novel susceptibility allele, met all such conditions. It is important to note that an allele at this frequency should have been identified by the previous GWAS (Mausberg et al., 2011; Meurs et al., 2012). It is therefore possible that additional cases and controls are required to complete a comprehensive GWAS analysis of DCM in Dobermans and to establish the function and frequency of this predicted DCM-associated locus.
Most predictive models are based on either known or simulated genotypes at multiple loci (Janssens et al., 2006; Pencina, D'Agostino & Vasan, 2008). Such models do not account for known effects of genotypes or allow the inclusion of additional, as yet unknown, loci. For example, in this study all individuals possessing the Chr5 CC genotype have DCM. Our methodology is unique and useful where there are multiple known and unknown factors which do not fully account for the phenotype. In particular, our approach allows specific gene combinations to lead to disease, rather than relying on incremental risk factors as is the case in other predictive models (Janssens et al., 2006; Pencina, D'Agostino & Vasan, 2008). A limitation of our methodology is that the number of factors that can be modelled is restricted by the available data. Despite this, our methodology could be used in other situations. While many phenotypes are the consequence of multiple loci, some loci can make a comparatively more important contribution to the phenotype (e.g., Strange et al., 2011; Papa et al., 2013). Identifying these loci can be the first step in predicting phenotypes (e.g., Hayes et al., 2010; Papa et al., 2013). Following the identification of loci associated with a trait, our methodology can be used to indicate what type of additional loci may be influencing the trait of interest, which may simplify the identification of additional loci.

CONCLUSIONS
There are many unknown factors involved in the aetiology of canine and human DCM. In Dobermans, we have identified multigenic effects and a possible X-linked locus as novel variables influencing DCM risk. While the PDK4 splice site deletion and the Chr5 SNP have both been tested for association with DCM in the European population of Dobermans, the combined genotype of individuals has not yet been considered (Mausberg et al., 2011; Owczarek-Lipska et al., 2013). Our model would benefit from further genotyping of Dobermans at both the PDK4 and Chr5 variants for validation. Future work is also required to identify X-linked DCM loci if the model is verified for the known loci. It is also possible that the different combinations of alleles leading to DCM in the model could affect the time taken to progress from one disease stage to the next, as reported by Wess et al. (2010). If validated, our model has implications for current canine breeding practices and the welfare of individuals within the breed. Individuals with allele combinations more likely to lead to DCM can be monitored more intensely than those with less genetic risk, and mating pairs resulting in deleterious genotypes can be avoided. This will improve welfare by reducing the prevalence of DCM-associated alleles within the population and potentially improving the longevity of affected dogs by enabling monitoring and earlier clinical management. By utilising similar methodology, equivalent multigenic effects and possible additional loci could be identified in human DCM, giving similar benefits to those described for Dobermans.