Benefits of Dominance over Additive Models for the Estimation of Average Effects in the Presence of Dominance

In quantitative genetics, the average effect at a single locus can be estimated by an additive (A) model, or an additive plus dominance (AD) model. In the presence of dominance, the AD-model is expected to be more accurate, because the A-model falsely assumes that residuals are independent and identically distributed. Our objective was to investigate the accuracy of an estimated average effect ($\hat{\alpha}$) in the presence of dominance, using either a single locus A-model or AD-model. Estimation was based on a finite sample from a large population in Hardy-Weinberg equilibrium (HWE), and the root mean squared error of $\hat{\alpha}$ was calculated for several broad-sense heritabilities, sample sizes, and sizes of the dominance effect. Results show that with the A-model, both sampling deviations of genotype frequencies from HWE frequencies and sampling deviations of allele frequencies contributed to the error. With the AD-model, only sampling deviations of allele frequencies contributed to the error, provided that all three genotype classes were sampled. In the presence of dominance, the root mean squared error of $\hat{\alpha}$ with the AD-model was always smaller than with the A-model, even when the heritability was less than one. Remarkably, in the absence of dominance, there was no disadvantage of fitting dominance. In conclusion, the AD-model yields more accurate estimates of average effects from a finite sample, because it is more robust against sampling deviations from HWE frequencies than the A-model. Genetic models that include dominance, therefore, yield higher accuracies of estimated average effects than purely additive models when dominance is present.

In quantitative genetics, dominance is the phenomenon where the genotypic value of the heterozygote deviates from the mean genotypic value of the two homozygotes (Falconer and Mackay 1996). Dominance has been shown to play an important role in production traits of livestock species (Morris and Binet 1966; Sellier 1976; Visscher et al. 2000) and plant crops (Xiao et al. 1995; Stuber 2010; Huang et al. 2016). In livestock genetic improvement, however, research has been focused on the estimation of average effects, because average effects capture all heritable variation (Lynch and Walsh 1998). The average effect of a single gene ($\alpha$), also known as the allele substitution effect, is defined as the linear regression coefficient of genotypic values on allele counts (Falconer and Mackay 1996). Under Hardy-Weinberg equilibrium (HWE), $\alpha$ at a biallelic locus is a function of the additive ($a$) and dominance ($d$) parts of gene effects, and the population allele frequency $p$:

$$\alpha = a + (1 - 2p)d. \tag{1}$$

The part of the genotypic value that is not captured by the average effect is called the dominance deviation (Falconer and Mackay 1996).
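As a minimal numerical sketch of the definition above, the Python snippet below regresses genotypic values on allele counts under HWE and compares the slope with $a + (1 - 2p)d$. The function names, and the coding of genotypic values as $-a$, $d$, and $a$ for genotypes with 0, 1, and 2 copies of the counted allele, are assumptions made for illustration only.

```python
def average_effect(a, d, p):
    """Average effect under HWE: alpha = a + (1 - 2p) d."""
    return a + (1.0 - 2.0 * p) * d


def regression_slope(a, d, p):
    """Slope of genotypic value on allele count, weighted by HWE frequencies."""
    q = 1.0 - p
    freqs = (q * q, 2.0 * p * q, p * p)   # genotypes with 0, 1, 2 copies
    counts = (0.0, 1.0, 2.0)
    values = (-a, d, a)                   # assumed genotypic values
    mean_x = sum(f * x for f, x in zip(freqs, counts))
    mean_g = sum(f * g for f, g in zip(freqs, values))
    cov = sum(f * (x - mean_x) * (g - mean_g)
              for f, x, g in zip(freqs, counts, values))
    var = sum(f * (x - mean_x) ** 2 for f, x in zip(freqs, counts))
    return cov / var


if __name__ == "__main__":
    print(average_effect(1.0, 1.0, 0.1), regression_slope(1.0, 1.0, 0.1))
```

For $a = d = 1$ and $p = 0.1$, both functions agree on a value of 1.8 up to floating-point rounding.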
When the A-model is used, dominance deviations are not modeled and thus become part of the residual. As a consequence, the residuals are not independent and identically distributed (IID), because dominance deviations are different across genotypes (Ott and Longnecker 2010). The A-model may therefore give inaccurate estimates of $\alpha$, because it falsely assumes that the residuals are IID. When the AD-model is used, dominance deviations are explicitly modeled, and the residuals will more likely be IID. In the presence of dominance, the AD-model may therefore yield more accurate estimates of $\alpha$ than the A-model. In contrast to the A-model, however, the AD-model requires the estimation of two effects instead of one (for a single locus), which may reduce the accuracy with which these effects are estimated. Additionally, dominance effects are generally smaller and therefore harder to estimate than additive effects (Lynch and Walsh 1998). For these reasons, the AD-model may require more individuals to be sampled for an accurate estimation of $\alpha$, compared with the A-model. Furthermore, estimating dominance effects when there is very little or no dominance may lead to overfitting (Ott and Longnecker 2010). Hence, while the AD-model may better fit the data in the presence of dominance, the A-model may be preferred when the sample size is relatively small and dominance is negligible. It is, however, not yet clear how sample size and dominance effect size affect the accuracy of $\hat{\alpha}$ with the A-model vs. the AD-model.
The objective of this work, therefore, was to investigate the root mean squared error (RMSE) of the estimated average effect ($\hat{\alpha}$) at a single locus in the presence or absence of dominance, using either an A-model or an AD-model. We start with some theory of a single locus model, then derive the expected estimate of $\alpha$, and calculate the RMSE of $\hat{\alpha}$ for several broad-sense heritabilities, dominance effects, sample sizes, and allele frequencies.
We then calculate the mean RMSE for several degrees of dominance over the distribution of allele frequency, and identify mechanisms that underlie the differences between the A-model and AD-model.

THEORY
Our interest is to estimate the average effect ($\alpha$) at a single locus in a large population that is in HWE, from data collected as a finite sample of that population. The average effect will be treated as a fixed effect (as in quantitative genetics), and not a random variable (e.g., as in genomic prediction) (de los Campos et al. 2015). In quantitative genetics, $\alpha$ at a single locus can be estimated from the sample by linear regression using an A-model or an AD-model. The A-model estimates $\alpha$ directly through linear regression of phenotypic values on allele counts,

$$\mathbf{y} = \mathbf{x}\alpha + \mathbf{e}, \tag{2}$$

where $\mathbf{y}$ is a vector of centered phenotypes, $\mathbf{e}$ is a vector of residuals, and $\mathbf{x}$ is a vector of centered allele counts with $(0 - 2p_s)$ for individuals with 0 copies of the alternative allele, $(1 - 2p_s)$ for individuals with one copy, and $(2 - 2p_s)$ for individuals with two copies. The term $p_s$ is the allele frequency of the alternative allele, observed in the sample. Throughout this paper, we will use the term genotypes to indicate the three allele count classes, with values of 0, 1, or 2. With the A-model, the ordinary least squares estimate (LSE) of $\alpha$ is

$$\hat{\alpha} = (\mathbf{x}'\mathbf{x})^{-1}\mathbf{x}'\mathbf{y}. \tag{3}$$

The AD-model estimates the additive ($a$) and dominance ($d$) gene effects by multiple linear regression,

$$\mathbf{y} = \mathbf{x}a + \mathbf{m}d + \mathbf{e}, \tag{4}$$

where $\mathbf{m}$ is a dominance indicator vector with $(0 - 2p_s(1 - p_s))$ for homozygous individuals, and $(1 - 2p_s(1 - p_s))$ for heterozygous individuals. Vectors $\mathbf{y}$ and $\mathbf{x}$ are the same as in the A-model, and $\mathbf{e}$ is a vector of residuals. Note that this is the genotypic parameterization as described by Vitezica et al. (2013). With the AD-model, the LSE of $a$ and $d$ are

$$\begin{bmatrix} \hat{a} \\ \hat{d} \end{bmatrix} = \left([\mathbf{x}\ \mathbf{m}]'[\mathbf{x}\ \mathbf{m}]\right)^{-1}[\mathbf{x}\ \mathbf{m}]'\mathbf{y}. \tag{5}$$

The $\hat{\alpha}$ from the AD-model is subsequently calculated as

$$\hat{\alpha} = \hat{a} + (1 - 2p_s)\hat{d}. \tag{6}$$

By definition, $\hat{\alpha}$ from both models gives an estimate of the average effect in the sample (Falconer and Mackay 1996). Because the size of the sample is finite, genotype and allele frequencies in the sample might deviate from the frequencies in the total population. These deviations might introduce error in the estimation of $\alpha$. To investigate the effects of finite sample size in the presence of dominance, the estimates from the A-model ($\hat{\alpha}_A$) and the AD-model ($\hat{\alpha}_{AD}$) were compared by computing their RMSE for several scenarios.
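The two estimators can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: both fits include an intercept (so centering of the covariates is immaterial), the function names are hypothetical, and the AD-model fit uses the fact that a model with an intercept, an additive covariate, and a dominance covariate is saturated for three genotype classes, so its least squares solution reproduces the class means.

```python
from statistics import mean


def fit_a_model(counts, y):
    """A-model: OLS slope of phenotypes y on allele counts (0, 1, 2)."""
    mx, my = mean(counts), mean(y)
    sxy = sum((x - mx) * (v - my) for x, v in zip(counts, y))
    sxx = sum((x - mx) ** 2 for x in counts)
    return sxy / sxx


def fit_ad_model(counts, y):
    """AD-model: return (a_hat, d_hat, alpha_hat) from genotype class means.

    Requires all three genotype classes to be present in the sample.
    """
    class_mean = {}
    for g in (0, 1, 2):
        vals = [v for x, v in zip(counts, y) if x == g]
        if not vals:
            raise ValueError("all three genotype classes must be sampled")
        class_mean[g] = mean(vals)
    a_hat = (class_mean[2] - class_mean[0]) / 2.0
    d_hat = class_mean[1] - (class_mean[0] + class_mean[2]) / 2.0
    p_s = mean(counts) / 2.0              # sample allele frequency
    return a_hat, d_hat, a_hat + (1.0 - 2.0 * p_s) * d_hat


if __name__ == "__main__":
    # Hypothetical sample with p_s = 0.1 but genotype counts away from HWE,
    # no residual variance, and a = d = 1 (genotypic values -1, 1, 1).
    counts = [0] * 246 + [1] * 48 + [2] * 6
    y = [-1.0] * 246 + [1.0] * 48 + [1.0] * 6
    print(fit_a_model(counts, y), fit_ad_model(counts, y)[2])
```

In this hypothetical sample the A-model slope is about 1.64, whereas the AD-model recovers $a + (1 - 2p_s)d = 1.8$, which illustrates how a chance deviation from HWE frequencies affects the two estimators differently.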

Expectation of $\hat{\alpha}$
If we take a random sample of $N$ individuals from a large population in HWE that has allele frequency $p$, the expectation of $\hat{\alpha}$ can be computed using probabilities and estimates of each possible sample composition. We define $c$ as a set of variables $\{n_0, n_1, n_2\}$ that describe unique sample compositions, where $n_0$ is the number of individuals with genotype 0, $n_1$ is the number of individuals with genotype 1, and $n_2$ is the number of individuals with genotype 2. The probability of sampling $c$ is calculated from the multinomial probability function

$$P(c \mid N, p) = \frac{N!}{n_0!\, n_1!\, n_2!}\, g_0^{n_0} g_1^{n_1} g_2^{n_2}. \tag{7}$$

Conditional variables $N$ and $p$ are hereafter omitted to improve readability, so that $P(c \mid N, p)$ is abbreviated as $P(c)$. The quantities $g_0$, $g_1$, and $g_2$ are the genotype frequencies in the HWE population, and follow from the population allele frequency $p$ ($g_0 = (1 - p)^2$, $g_1 = 2p(1 - p)$, $g_2 = p^2$). The expectation of $\hat{\alpha}$ is computed as the sum over all products of probabilities $P(c)$ and corresponding estimates $\hat{\alpha}(c)$, where $\hat{\alpha}(c)$ is the LSE of $\alpha$ given $c$. The $\alpha$ cannot be estimated when the sample consists of individuals that all have the same genotype, so we use $I(c)$ as an indicator variable to exclude such samples:

$$E(\hat{\alpha}) = \sum_{n_0 = 0}^{N - 1} \sum_{n_1 = 0}^{N - n_0} I(c)\, P(c)\, \hat{\alpha}(c), \tag{8}$$

with $n_2 = N - n_0 - n_1$ and

$$I(c) = \begin{cases} 0 & \text{if } n_1 = N \text{ or } n_2 = N, \\ 1 & \text{otherwise.} \end{cases} \tag{9}$$

Note that samples including only genotype 0 are excluded from Equation 8 by summing $n_0$ from 0 to $N - 1$, instead of from 0 to $N$. After excluding samples with $I(c) = 0$, the probabilities $P(c)$ of the remaining samples were rescaled so that they sum to 1.
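The enumeration of sample compositions described above can be sketched as follows. This is a hedged illustration: the function names are hypothetical, probabilities are computed on the log scale with `math.lgamma` to avoid overflow for larger $N$, and $0 < p < 1$ is assumed.

```python
import math


def hwe_freqs(p):
    """HWE genotype frequencies (g0, g1, g2) for allele frequency p."""
    q = 1.0 - p
    return (q * q, 2.0 * p * q, p * p)


def log_multinomial(n0, n1, n2, g):
    """Log of the multinomial probability of composition (n0, n1, n2)."""
    n = n0 + n1 + n2
    logp = math.lgamma(n + 1)
    for k, gk in zip((n0, n1, n2), g):
        logp += k * math.log(gk) - math.lgamma(k + 1)
    return logp


def compositions(n, p):
    """List of ((n0, n1, n2), rescaled P(c)) with at least two genotype classes."""
    g = hwe_freqs(p)
    kept = []
    for n0 in range(n):                   # n0 = N is excluded by the loop bound
        for n1 in range(n - n0 + 1):
            n2 = n - n0 - n1
            if n1 == n or n2 == n:        # single-genotype samples are skipped
                continue
            prob = math.exp(log_multinomial(n0, n1, n2, g))
            kept.append(((n0, n1, n2), prob))
    total = sum(prob for _, prob in kept)
    return [(c, prob / total) for c, prob in kept]


if __name__ == "__main__":
    comps = compositions(20, 0.3)
    print(len(comps), sum(prob for _, prob in comps))
```

The expectation of $\hat{\alpha}$ then follows by summing the rescaled probability times $\hat{\alpha}(c)$ over the returned list.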

Root mean squared error
The RMSE is defined as the root of the expected squared difference between the $\hat{\alpha}$ estimated from the sample and the true value of $\alpha$,

$$\mathrm{RMSE} = \sqrt{E\left[(\hat{\alpha} - \alpha)^2\right]} = \sqrt{\sum_{c} \delta(c)}, \tag{10}$$

where $\delta(c)$ is the contribution of finite sampling deviation to the RMSE,

$$\delta(c) = P(c)\left(\hat{\alpha}(c) - \alpha\right)^2. \tag{11}$$

We define $\delta(c)$ here because we will later on focus on the contribution of a single finite sample $c$ to the RMSE. The above expressions will be used to investigate the effect of $N$, $H^2$, $p$, and $d$ on the RMSE of $\hat{\alpha}$ with the A-model and the AD-model.
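A minimal sketch of this calculation for the special case without residual variance ($H^2 = 1$) is given below, so that $\hat{\alpha}(c)$ is a deterministic function of the sample composition. The function names, the genotypic values $(-a, d, a)$, and the use of an intercept in both models are assumptions for illustration. When a genotype class is missing, the AD-model estimate falls back to the A-model estimate.

```python
import math


def log_prob(n0, n1, n2, p):
    """Log multinomial probability of a composition under HWE (0 < p < 1)."""
    g = ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)
    out = math.lgamma(n0 + n1 + n2 + 1)
    for k, gk in zip((n0, n1, n2), g):
        out += k * math.log(gk) - math.lgamma(k + 1)
    return out


def alpha_hat(n0, n1, n2, a, d, model):
    """Estimate of alpha for one composition when phenotype = genotypic value."""
    n = n0 + n1 + n2
    counts, y = (n0, n1, n2), (-a, d, a)
    p_s = (n1 + 2 * n2) / (2 * n)
    if model == "AD" and min(counts) > 0:
        a_hat = (y[2] - y[0]) / 2           # class means equal genotypic values
        d_hat = y[1] - (y[0] + y[2]) / 2
        return a_hat + (1 - 2 * p_s) * d_hat
    mx = 2 * p_s                            # A-model (also the AD fallback)
    my = sum(k * v for k, v in zip(counts, y)) / n
    sxy = sum(k * (x - mx) * (v - my) for k, x, v in zip(counts, (0, 1, 2), y))
    sxx = sum(k * (x - mx) ** 2 for k, x in zip(counts, (0, 1, 2)))
    return sxy / sxx


def rmse(n, p, a, d, model):
    """RMSE of alpha_hat over all compositions with at least two genotypes."""
    alpha = a + (1 - 2 * p) * d
    total_prob = total_sq = 0.0
    for n0 in range(n + 1):
        for n1 in range(n - n0 + 1):
            n2 = n - n0 - n1
            if n in (n0, n1, n2):           # single-genotype sample
                continue
            prob = math.exp(log_prob(n0, n1, n2, p))
            total_prob += prob
            total_sq += prob * (alpha_hat(n0, n1, n2, a, d, model) - alpha) ** 2
    return math.sqrt(total_sq / total_prob)  # division rescales P(c)


if __name__ == "__main__":
    print(rmse(50, 0.2, 1.0, 1.0, "A"), rmse(50, 0.2, 1.0, 1.0, "AD"))
```

Scenarios with $H^2$ below one would additionally require the sampling variance of the residuals, which this sketch deliberately leaves out.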

METHODS
We aim to illustrate the effect of sample size ($N$), broad-sense heritability ($H^2$), allele frequency ($p$), and dominance effect ($d$) on the RMSE of estimated average effects ($\hat{\alpha}$). As a base scenario, we set both the additive and the dominance effect of the gene to one (i.e., complete dominance). The expected value of $\hat{\alpha}$ was calculated for $N \in \{300, 500, 1000\}$, $H^2 \in \{0.01, 0.05, 1\}$, and $p$ from 0.001 to 0.999 (in increments of 0.001), with the A-model (Equation 3) and AD-model (Equation 5). The variation in broad-sense heritability was achieved by adding random residuals to the phenotypes ($\mathbf{y}$). In addition, we varied the dominance effect ($d \in \{0, 0.1, 0.2, 0.5\}$) for the scenario where $N = 500$ and $H^2 = 0.05$. The $\hat{\alpha}(c)$ from the AD-model were computed using the sample allele frequency ($p_s$) in Equation 6 instead of the population allele frequency ($p$), because the latter is usually unknown. For samples where one of the genotypes was missing, $\hat{\alpha}(c)$ with the AD-model was computed in the same way as with the A-model, because in those cases the vector of genotypes $\mathbf{x}$ was completely confounded with dominance vector $\mathbf{m}$.
Additionally, to quantify the average accuracy of $\hat{\alpha}$, we computed the mean RMSE of $\hat{\alpha}$, assuming a distribution for the allele frequency. For this purpose, we used the RMSE as a function of $p$ and numerically integrated over $p$ using its expected distribution under a drift model,

$$\overline{\mathrm{RMSE}} = \int_{1/(2N_e)}^{1 - 1/(2N_e)} \mathrm{RMSE}(p)\, f(p)\, dp. \tag{12}$$

Here, $N_e$ is the effective population size, $f(p)$ is the distribution of allele frequencies when mutation is ignored, and $p$ ranges from $1/(2N_e)$ to $1 - 1/(2N_e)$ (Wright 1931; Goddard 2009). To ensure that $\int f(p)\, dp = 1$, the scaling constant $k$ of $f(p)$ was given a value of $1/\log(2N_e - 1)$. The resulting distribution of allele frequencies is U-shaped, and a low $N_e$ yields a more uniform distribution than a high $N_e$. We computed the mean RMSE for several $N_e$ (50, 100, and 200), $N$ (200-600), and sizes of dominance effect $d$ (0.5, 1, and 1.5). We considered sample sizes up to 600 instead of 1000 to reduce computation time. In these scenarios, both $H^2$ and the additive gene effect ($a$) were equal to one.
The data used can be regenerated exactly following the descriptions in this paper.

RESULTS

Root mean squared error
Figure 1 shows the RMSE of $\hat{\alpha}$ with the A- or AD-model, for $a = 1$ and $d = 1$. For all scenarios, the RMSE of $\hat{\alpha}$ was smaller with the AD-model than with the A-model. In scenarios where $H^2 = 1$, RMSE was symmetrical around $p = 0.5$ with both the A- and AD-model. For brevity, we will therefore only describe the pattern for $p < 0.5$. For both models and all $N$, the RMSE was smallest when $p$ was close to 0, and increased when allele frequency increased. With the A-model, RMSE was largest around $p = 0.04$ and then decreased when $p$ moved toward 0.5. With the AD-model, RMSE was also largest around $p = 0.04$, then decreased when $p$ moved toward 0.1, after which RMSE slightly increased again until $p = 0.5$.
With $H^2 < 1$, RMSE showed a similar pattern, but was not symmetrical around $p = 0.5$. Compared with $H^2 = 1$, the RMSE was larger for all $p$, but this contrast decreased when $p$ increased. This asymmetry was a result of fixing $H^2$ in the simulations, which caused the ratio of the dominance variance and residual variance to increase with $p$. For all scenarios, RMSE decreased when $N$ increased.
Figure 2 shows the RMSE of $\hat{\alpha}$ with the A- or AD-model, for $a = 1$, $N = 500$, $H^2 = 0.05$, and different dominance effects ($d$). For $d = 0$ and $d = 0.1$, there was almost no difference in RMSE between the A- and AD-model. This indicates that in the absence of dominance, there was no disadvantage of using the AD-model in terms of RMSE. For $d = 0.1$, there was no apparent benefit from using the AD-model. For $d = 0.2$ and $d = 0.5$, however, the AD-model had lower RMSE than the A-model.

Contribution of finite sampling deviation to the root mean squared error
When there is no environmental variance ($H^2 = 1$) and the model is correct, the RMSE of $\hat{\alpha}$ is expected to be zero. The results, however, show that the RMSE is larger than zero with both the A- and AD-model. To gain more insight into the sources of this error, we investigated the contribution of single samples to the RMSE, for one scenario where $H^2 = 1$, $N = 300$, $p = 0.10$, and $a = d = 1$, so that $\alpha = 1.8$. For this purpose, we studied the squared difference between $\hat{\alpha}(c)$ and $\alpha$ (i.e., squared error), as a function of the realized number of individuals with genotype 2 ($n_2$). The samples have different probabilities of occurring, so that some samples may contribute more to the total RMSE than others. We therefore investigated the contribution of finite sampling deviation to the RMSE ($\delta(c)$), by weighting the squared errors of $\hat{\alpha}(c)$ with their probabilities (see Equation 11).
Additive model: Figure 3A shows the squared error as a function of the realized number of individuals with genotype 2, for the A-model. The realized number of individuals with genotype 2 in the sample is expressed as a departure from its expectation (i.e., $\Delta n_2$), where the expectation is $E(n_2) = p^2 N = 3$. The squared error was smallest when $\Delta n_2$ was zero and increased as $\Delta n_2$ moved away from zero. The remaining variance in squared error for a given value of $\Delta n_2$ (as shown by the boxplots) was due to variation in the difference between $p_s$ and $p$ (i.e., $\Delta p$). For example, when $\Delta n_2 = 3$, the allele frequency in the sample can vary, because the number of sampled heterozygotes can vary. This variation in $\Delta p$ affects $\hat{\alpha}(c)$, except when $\Delta n_2 = -3$. In that case, the number of individuals with genotype 2 was zero (in this example) and $\hat{\alpha}(c)$ was always the slope of a line between two data points. Figure 3B shows the effect of $\Delta n_2$ and $\Delta p$ on $\delta(c)$ for the A-model. The sample where $\Delta n_2 = 0$ and $\Delta p = 0$ did not contribute to the RMSE ($\delta(c) = 0$). Samples where $\Delta n_2 < 0$ had the largest contributions to the RMSE, and samples where $\Delta n_2 > 0$ had somewhat smaller contributions. Figure 3B also shows that $\Delta p$ contributed less to the RMSE than $\Delta n_2$, because there were samples where $\Delta p = 0$, but $\delta(c)$ was relatively large.
Additive plus dominance model: Figure 3C shows the squared error as a function of $\Delta n_2$, for the AD-model. The squared error was small and about equal for all $\Delta n_2$, except for $\Delta n_2 = -3$, where the squared error was largest and exactly the same as with the A-model (see Figure 3A), because there were no individuals with genotype 2 in the sample. Similar to the A-model, the remaining variance (as shown by the boxplots) was due to variation in the difference between $p_s$ and $p$ ($\Delta p$). Figure 3D shows the effect of $\Delta n_2$ and $\Delta p$ on $\delta(c)$ for the AD-model. Samples where both $\Delta n_2 \neq -3$ and $\Delta p = 0$ did not contribute to the RMSE ($\delta(c) = 0$). Samples where $\Delta n_2 = -3$ showed the largest contribution, while all other samples showed small $\delta(c)$. Similar to the A-model, Figure 3D shows that $\Delta p$ was not an important source of error.
A- vs. AD-model: In conclusion, even when the locus explains all variance (i.e., $H^2 = 1$), $\hat{\alpha}$ shows error with both the A- and AD-model when it is based on a finite random sample from a population in HWE and dominance is present. With the A-model, the error originated mainly from sampling deviations of genotype frequencies from expected HWE frequencies ($\Delta$HWE), and to a lesser extent from sampling deviations of allele frequencies ($\Delta p$) (Figure 3B). With the AD-model, the error originated from $\Delta p$ only, provided that all three genotype classes were sampled (Figure 3D). These results partly explain the patterns of RMSE in Figure 1 (see Appendix A for more detail).

Mean RMSE across allele frequency distribution
Above, we illustrated the RMSE of $\hat{\alpha}$ as a function of $p$. Now, we present the mean RMSE averaged over the distribution of $p$, for $a = 1$ and $H^2 = 1$, assuming a U-shaped distribution of $p$ as a function of $N_e$, and for different values for $N$ and $d$ (Figure 4). For all scenarios, the mean RMSE with the A-model was about twice as large as the mean RMSE with the AD-model.
With both models, the mean RMSE was zero when $d$ was zero (data not shown) and increased as $d$ increased. The mean RMSE decreased when $N$ increased. The mean RMSE decreased a little when $N_e$ increased, which was caused by differences in the U-shaped distribution of allele frequencies. For example, when $N_e = 50$, the percentage of loci with an allele frequency outside the 0.05-0.95 range was 36%, whereas when $N_e = 200$, this percentage was 51%. Loci with allele frequencies outside the 0.05-0.95 range have a low RMSE (see Figure 1 and Appendix A), and therefore a higher $N_e$ results in a lower mean RMSE. The effect of $N_e$ on the mean RMSE decreased as $N$ increased. Results were identical when $a$ was changed, because the mean RMSE scales linearly with the absolute dominance effect $d$, and not with the dominance coefficient $d/a$.
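The two percentages quoted above can be checked with a short calculation. The sketch below assumes that the drift distribution is proportional to $1/(p(1 - p))$ on the interval from $1/(2N_e)$ to $1 - 1/(2N_e)$; this functional form is an assumption made here for illustration, and the function name is hypothetical. Because only a ratio of integrals is needed, the scaling constant cancels, and the antiderivative $\log(p/(1 - p))$ gives a closed form.

```python
import math


def fraction_outside(n_e, lo=0.05, hi=0.95):
    """Share of loci with allele frequency outside [lo, hi].

    Assumes a density proportional to 1 / (p (1 - p)) on
    [1 / (2 n_e), 1 - 1 / (2 n_e)], with lo and hi inside that interval.
    """
    def logit(p):
        return math.log(p / (1.0 - p))

    p_min = 1.0 / (2.0 * n_e)
    p_max = 1.0 - p_min
    total = logit(p_max) - logit(p_min)   # integral over the full support
    inside = logit(hi) - logit(lo)        # integral over [lo, hi]
    return 1.0 - inside / total


if __name__ == "__main__":
    for n_e in (50, 200):
        print(n_e, round(100 * fraction_outside(n_e)))
```

Under this assumed form, the calculation gives 36% for $N_e = 50$ and 51% for $N_e = 200$, in line with the values reported in the text.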

DISCUSSION
We investigated the accuracy (in terms of RMSE) of estimated average effects ($\hat{\alpha}$) in the presence of dominance, using a single locus model including only an additive or an additive plus dominance effect. In the presence of dominance, the A-model falsely assumes that residuals are IID. The AD-model was therefore expected to better fit the data and give more accurate estimates of $\alpha$, but only when dominance is present and sample size is sufficient for the dominance effect to be accurately estimated. Our results, however, show that the AD-model was always equally or more accurate than the A-model, even with small sample sizes (i.e., $N = 300$), a heritability lower than one (i.e., $H^2 < 1$), or in the absence of dominance.
With the A-model, both sampling deviations of genotype frequencies from HWE frequencies ($\Delta$HWE) and sampling deviations of allele frequencies ($\Delta p$) contributed to the error. With the AD-model, only sampling deviations of allele frequencies contributed to the error, provided that all three genotype classes were sampled. The contribution of $\Delta p$ to the error was much smaller than the contribution of $\Delta$HWE. The AD-model was therefore more accurate than the A-model. Thus, even when the locus explained all variance (i.e., $H^2 = 1$), the mean RMSE decreased as sample size increased, because with larger sample sizes, deviations from HWE that considerably affect $\hat{\alpha}$ had a lower probability of occurring. Additionally, with larger sample sizes, the chance of missing one of the genotype classes was smaller, which further reduced the RMSE. The (mean) RMSE of $\hat{\alpha}$ was always smaller with the AD-model than with the A-model. The RMSE of $\hat{\alpha}$ scaled linearly with $d$; if $d$ doubled, the RMSE also doubled. Remarkably, in the absence of dominance, there was no disadvantage of using the AD-model. Hence, the AD-model yielded equally or more accurate estimates of average effects than the A-model for all scenarios considered.
With the A-model, $\hat{\alpha}$ is computed as the linear regression coefficient of genotypic values on allele counts (Fisher 1941), which yields the average effect in the sample ($\alpha_s$), rather than the average effect in the whole population ($\alpha$). Hence, the expectation of $\hat{\alpha}_A$ is equal to

$$E(\hat{\alpha}_A) = \alpha_s = a + (1 - 2p_s)\, d\, \frac{1 - F_s}{1 + F_s}, \tag{14}$$

where $F_s$ measures the deviation from HWE in the sample ($\Delta$HWE) (Haldane 1954; Falconer 1985). Here, $F_s$ is defined as one minus the ratio between the observed number of heterozygotes and the expected number of heterozygotes based on the sample allele frequency (Haldane 1954; Wright 1969). With the AD-model, $\hat{\alpha}$ is computed from $\hat{a}$ and $\hat{d}$, which are simultaneously estimated from the data. Unlike the A-model (where $E(\hat{\alpha}) = \alpha_s$), the expectation of $\hat{\alpha}_{AD}$ is equal to

$$E(\hat{\alpha}_{AD}) = a + (1 - 2p_s)\, d \tag{15}$$

when all three genotype classes are sampled. Comparison of Equation 14 and Equation 15 shows that the error in $\hat{\alpha}_A$ originates from both $\Delta$HWE and $\Delta p$, while the error in $\hat{\alpha}_{AD}$ originates from $\Delta p$ only, except when one of the genotypes is missing in the sample. When only two genotype classes are sampled, the AD-model reduces to the A-model. With the AD-model, the contrast between the mean genotypic value of the homozygotes and the genotypic value of the heterozygotes ($d$) does not depend on the number of individuals in these two groups. This is why the AD-model is more robust against deviations from HWE than the A-model. These results were confirmed by mathematical derivations of the error with the two models (Appendix B). In theory, the error from the A-model can be quantified when $p_s$, $p$, $F_s$, and $d$ are known, and from the AD-model when $p_s$, $p$, and $d$ are known. In real data, however, $p$ (and also $d$ with the A-model) is not known, and therefore the error cannot be quantified. As a result, the error cannot be removed from either of the two models. In conclusion, the AD-model is preferred for the estimation of average effects when dominance is present, because it yields more accurate estimates than the A-model, particularly when sample sizes are small.
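The dependence of the A-model estimate on $F_s$ can be checked numerically. The sketch below assumes the standard expression for the average effect in a sample that deviates from HWE, $\alpha_s = a + (1 - 2p_s)\, d\, (1 - F_s)/(1 + F_s)$, and compares it with the least squares slope of genotypic values on allele counts for a given sample composition. Function names and the genotypic values $(-a, d, a)$ are assumptions for illustration.

```python
def sample_p_and_f(n0, n1, n2):
    """Sample allele frequency p_s and deviation from HWE F_s."""
    n = n0 + n1 + n2
    p_s = (n1 + 2 * n2) / (2 * n)
    expected_het = 2 * n * p_s * (1 - p_s)
    return p_s, 1 - n1 / expected_het


def alpha_s_formula(n0, n1, n2, a, d):
    """Average effect in the sample from p_s and F_s (assumed expression)."""
    p_s, f_s = sample_p_and_f(n0, n1, n2)
    return a + (1 - 2 * p_s) * d * (1 - f_s) / (1 + f_s)


def alpha_s_regression(n0, n1, n2, a, d):
    """OLS slope of genotypic values (-a, d, a) on allele counts (0, 1, 2)."""
    n = n0 + n1 + n2
    counts, y = (n0, n1, n2), (-a, d, a)
    mx = (n1 + 2 * n2) / n
    my = sum(k * v for k, v in zip(counts, y)) / n
    sxy = sum(k * (x - mx) * (v - my) for k, x, v in zip(counts, (0, 1, 2), y))
    sxx = sum(k * (x - mx) ** 2 for k, x in zip(counts, (0, 1, 2)))
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical sample: p_s = 0.1 as in the population, fewer heterozygotes
    # than expected under HWE (48 instead of 54).
    print(alpha_s_formula(246, 48, 6, 1.0, 1.0),
          alpha_s_regression(246, 48, 6, 1.0, 1.0))
```

For this hypothetical sample, $F_s = 1/9$ and both routes give a value of about 1.64, compared with $a + (1 - 2p_s)d = 1.8$ for the AD-model.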
In this study, we used the so-called genotypic parameterization of the AD-model, as opposed to the breeding parameterization (Vitezica et al. 2013). The results, however, were identical to the breeding parameterization (results not shown), because the two parameterizations are equivalent.
In addition to the contribution of dominance to additive variance, evidence for the contribution of epistasis is increasing (Mackay 2015; Monnahan and Kelly 2015). Our results show that modeling dominance improves estimated average effects, and it may therefore be tempting to hypothesize that modeling epistasis may also improve estimates. However, investigating the benefit of modeling epistasis for the accuracy of $\hat{\alpha}$ is not straightforward, because it requires extension to multiple loci.
Taking a finite sample from a large population, which was done in this study, closely resembles a sharp reduction to a small population size, known as a bottleneck. In a small population, genotype frequencies deviate from HWE even under random mating. The expected genotype frequency for heterozygotes is equal to $2p_s(1 - p_s)(1 - F_s)$ (Haldane 1954). In turn, the expectation of $F_s$ depends on the size of the bottleneck (or sample size, $N$), and is equal to $-1/(2N - 1)$ with random mating (Kimura and Crow 1963). This indicates that the expected heterozygosity in the sample is larger than the HWE frequency calculated from the sample allele frequency. The effect of $\Delta$HWE on estimated average effects was studied by Wang et al. (1998), who focused on the consequences for the additive genetic variance. In agreement with our results, they showed that the average effect was not influenced by $\Delta$HWE when $d = 0$, or when $p \approx 0.5$. Furthermore, the effect of $\Delta$HWE on estimated average effects depended on the size of the bottleneck (or sample size, $N$) and the size of dominance effect ($d$) (Wang et al. 1998). Because the effects of a bottleneck are very similar to the effects of taking a small sample from a large population, the results of our study also apply to populations in a bottleneck.
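The expectation of $F_s$ quoted above can be verified by enumerating all sample compositions of a small sample drawn from a population in HWE. This is a sketch under stated assumptions: the function name is hypothetical, $F_s$ is taken as one minus the ratio of observed to expected heterozygotes based on $p_s$, and samples that are fixed for one allele are excluded because $F_s$ is undefined for them.

```python
import math


def expected_f(n, p):
    """Mean of F_s over all non-fixed samples of size n from an HWE population."""
    g = ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)
    num = den = 0.0
    for n0 in range(n + 1):
        for n1 in range(n - n0 + 1):
            n2 = n - n0 - n1
            k = n1 + 2 * n2                 # copies of the counted allele
            if k == 0 or k == 2 * n:        # fixed sample: F_s undefined
                continue
            logp = math.lgamma(n + 1)
            for c, gc in zip((n0, n1, n2), g):
                logp += c * math.log(gc) - math.lgamma(c + 1)
            prob = math.exp(logp)
            p_s = k / (2 * n)
            f_s = 1 - n1 / (2 * n * p_s * (1 - p_s))
            num += prob * f_s
            den += prob
    return num / den


if __name__ == "__main__":
    n = 15
    print(expected_f(n, 0.3), -1 / (2 * n - 1))
```

The two printed values agree up to floating-point rounding, and the result does not depend on the population allele frequency $p$.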
We quantified the error in estimates of $\alpha$ that originated from $\Delta$HWE in random finite samples from a population of unrelated individuals. We purposefully used relatively small sample sizes to illustrate the effect. Although sample sizes taken in empirical studies may be larger, effective sample size may be much smaller, because actual populations often have small effective population size ($N_e$) (Hall 2016). This low $N_e$ is related to the family structure in the population, where many individuals are bred from a limited number of parents, so that $N_e \ll N$. Hence, the effective sample size may be much smaller than $N$, because the sample will partly consist of related individuals. Because of this relatedness, sampling deviations in allele and genotype frequencies can be larger than expected based on sample size. The sample sizes chosen in this study may therefore be similar to effective sample sizes in empirical studies. As an example, we investigated the SD of $F_s$ across allele frequencies in a dataset of 3500 pigs (Cleveland et al. 2012). The resulting value was comparable to the expected SD of $F_s$ for samples of 500-1000 animals (see Appendix C), which supports our expectation that the effective number of sampled individuals may be smaller than the actual number of sampled individuals. Furthermore, in many studies that use genotype data, markers are removed if they show a significant deviation from HWE. The significance threshold that is used for HWE filtering, however, is often very liberal (Gondro et al. 2013). Consequently, there are still many markers left in the data that deviate from HWE and may give inaccurate estimates of average effects. As a result, we expect that the magnitude of $\Delta$HWE simulated in this study may be similar to $\Delta$HWE in empirical studies.
The estimation of average effects at single loci, as presented in this study, may be relevant for genome-wide association studies (GWAS). In GWAS, a large number of markers spread across the genome are each tested for an association with the observed phenotype (Gondro et al. 2013). Most GWAS test these associations by using an additive model which treats the marker genotypes as fixed (Hayes 2013). Only few studies have used the AD-model in GWAS to explicitly estimate $a$ and $d$ (e.g., Lopes et al. 2014; Aliloo et al. 2015; Huang et al. 2015; Bennewitz et al. 2017) and, to our knowledge, none have investigated differences in accuracy of estimated average effects between the A-model and AD-model. The effects of sampling genotypes on $\hat{\alpha}$ shown in this study apply to $\hat{\alpha}_m$ in GWAS, because $\hat{\alpha}_m$ are usually estimated by ordinary least squares. Using the AD-model in GWAS will therefore yield more accurate estimates of average effects and explained variance of markers.
The results presented in this study may also be relevant for genomic prediction. In genomic prediction, genomic estimated breeding values (GEBVs) are calculated as the sum of many estimated average effects multiplied by their marker genotypes (Meuwissen et al. 2001). Differences in accuracy of GEBVs may therefore be related to differences in accuracy of the estimated average effects. Our results, however, cannot be extrapolated directly to accuracy of GEBVs for several reasons. In this study, we considered a single locus, estimated $\alpha$ as a fixed effect, and assumed known genotypes of the quantitative trait locus (QTL). By contrast, GEBVs are based on many marker loci, for which all $\alpha$'s are estimated simultaneously as random effects (Meuwissen et al. 2001). In genomic prediction, the effect of a single QTL is likely to be explained by multiple markers, and errors of individual marker effects may cancel out to some extent when accumulated within individuals to compute their GEBVs. Additionally, random effect models shrink average effects toward zero (Whittaker et al. 2000), which may shrink the sampling error as well. In conclusion, to translate our results to accuracy of GEBVs, this research should be extended to the estimation of multiple random effects based on marker genotypes.
Neither GWAS nor genomic prediction is based on the genotypes at QTL directly; both rely on linkage disequilibrium (LD, measured by $r$) between observed markers and unknown QTL (Lewontin and Kojima 1960). For the additive effect at the QTL, the fraction captured by the marker is proportional to $r$, whereas for the dominance effect, the fraction captured by the marker is proportional to $r^2$ (Weir 2008; Zhu et al. 2015). The proportion of the signal of the dominance part of $\alpha_m$ that is captured is therefore expected to be smaller than that of the additive part, because $r^2 \le r$. For this reason, a marker should be very close to a QTL to pick up its dominance effect (Wellmann and Bennewitz 2012). As a result, the benefit of dominance models over additive models may be smaller with lower marker densities. We therefore argue that, when dominance is present and markers are able to capture dominance, the dominance model yields more accurate estimates of $\alpha$ than the additive model.

CONCLUSIONS
When a single locus average effect is estimated in a random finite sample from a large population in HWE, both A-models and AD-models yield error in their estimates, even when the locus explains all variance (i.e., $H^2 = 1$). Estimates from the AD-model, however, are more robust against chance deviations from HWE frequencies than estimates from the A-model. Genetic models that include dominance, therefore, yield higher accuracies of estimated average effects at single loci than purely additive models when dominance is present. In the absence of dominance, there was no penalty for fitting dominance. These results are important for GWAS, and potentially also for genomic prediction.