Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genome

Imputation accuracy among other things depends on the size of the reference panel, the marker’s minor allele frequency (MAF), and the correct placement of single nucleotide polymorphism (SNP) on the reference genome assembly. Using high-density genotypes of 3938 Nellore cattle from Brazil, we investigated the accuracy of imputation from 50 K to 777 K SNP density using Minimac3, when map positions were determined according to the bovine genome assemblies UMD3.1 and ARS-UCD1.2. We assessed the effect of reference and target panel sizes on the pre-phasing based imputation quality using ten-fold cross-validation. Further, we compared the reliability of the model-based imputation quality score (Rsq) from Minimac3 to the empirical imputation accuracy. The overall accuracy of imputation measured as the squared correlation between true and imputed allele dosages (R2dose) was almost identical using either the UMD3.1 or ARS-UCD1.2 genome assembly. When the size of the reference panel increased from 250 to 2000, R2dose increased from 0.845 to 0.917, and the number of polymorphic markers in the imputed data set increased from 586,701 to 618,660. Advantages in both accuracy and marker density were also observed when larger target panels were imputed, likely resulting from more accurate haplotype inference. Imputation accuracy increased from 0.903 to 0.913, and the marker density in the imputed data increased from 593,239 to 595,570 when haplotypes were inferred in 500 and 2900 target animals. The model-based imputation quality scores from Minimac3 (Rsq) were systematically higher than empirically estimated accuracies. However, both metrics were positively correlated and the correlation increased with the size of the reference panel and MAF of imputed variants. Accurate imputation of BovineHD BeadChip markers is possible in Nellore cattle using the new bovine reference genome assembly ARS-UCD1.2. The use of large reference and target panels improves the accuracy of the imputed genotypes and provides genotypes for more markers segregating at low frequency for downstream genomic analyses. The model-based imputation quality score from Minimac3 (Rsq) can be used to detect poorly imputed variants but its reliability depends on the size of the reference panel and MAF of the imputed variants.


Background
Genotype imputation is a cost-effective statistical approach to infer genotypes at untyped loci [1][2][3][4]. Target panels genotyped with low-density single nucleotide polymorphism (SNP) chips can be imputed to higher density using a reference panel typed at higher density. Imputation is important to increase the power and resolution of genome-wide association studies [2,5] and the accuracy of genomic breeding values for individuals typed using cheap low-density SNP genotyping arrays [6]. Imputation accuracy depends on several factors including the size and composition of the reference panel [3,7], the minor allele frequency (MAF) of imputed variants [8], and correct placement of variants on the reference genome assembly [7,9,10].
Large and informative reference panels are required to facilitate accurate imputation of rare and low-frequency variants [3,7,11]. The effect of the target panel size, on the other hand, has been barely studied. The software that implement a pre-phasing-based strategy [12], for example, Minimac3 [13] and IMPUTE2 [14], impute genotypes into phased haplotypes. These tools are probably more sensitive to the size of the target panel because the phasing accuracy depends on the number and the relatedness of the samples considered [15][16][17]. Erroneously inferred haplotypes might compromise the quality of subsequent imputation.
The correct physical placement of SNPs along the genome is also crucial for accurate phasing and genotype imputation [7,10,18,19]. Several iterations of improvement resulted in a highly contiguous and accurate bovine reference sequence [20]. However, the correct placement of markers and haplotypes inference remains challenging in repetitive and duplicated regions [21]. Recently a highly contiguous version (ARS-UCD1.2) of the bovine reference sequenceassembled using long sequencing readsfilled gaps and resolved repetitive regions of the previous UMD3.1 assembly [21].
Identifying and filtering out poorly imputed variants is important to avoid bias in downstream genomic analyses [2]. The empirical accuracy of imputation can be obtained by masking and subsequently predicting genotypes in a cross-validation setting. To avoid this computationally demanding and time-consuming process, the per-locus imputation quality scores from imputation programs are often used as indicators for imputation quality. These quality scores are correlated with true empirical estimates [22], however, their reliability with respect to the size of the reference panel and MAF of the imputed variants has not been tested extensively.
We herein study the quality of imputation from 50 K to 777 K SNP density in Nellore cattle from Brazil using a pre-phasing-based workflow, implemented using the software packages Eagle2.4 for phasing [16], and Minimac3 for imputation [13]. First, we investigate if the improvements in the new bovine genome assembly (ARS-UCD1.2) affect the imputation quality. Next, we test the effect of reference and target panel size on imputation quality. In addition to the commonly used imputation quality metrics, we study the marker density in the imputed data set, i.e. the number of markers segregating in the target panel, after imputation. Finally, we study the reliability of model-based quality scores obtained from the imputation software by comparing it to empirical measures obtained using crossvalidation.

Results
Comparison of the accuracy of imputation between the UMD3.1 and ARS-UCD1.2 genome assemblies We considered 3938 Nellore cattle genotyped at 777,962 SNPs using the Illumina BovineHD BeadChip to study the effect of reference genome assemblies on the imputation accuracy. Following quality control (QC) on the genotype data, we determined the position of 684,561 and 683,590 autosomal markers of the Illumina Bovi-neHD BeadChip according to the UMD3.1 and ARS-UCD1.2 bovine genome assemblies, respectively. An intersection of 683,504 autosomal SNPs mapped on both assemblies. The genotypes of 1938 randomly selected animals (target) were reduced to the markers of the Illumina BovineSNP50 genotyping array to mimic low SNP density. The masked genotypes were subsequently inferred in silico using 777 K genotypes of 2000 animals (reference) in ten-fold cross-validation. The empirical accuracy of imputation was assessed by the squared correlation between true and imputed alleles dosage (R 2 dose). There was almost no difference in imputation accuracy when the markers were aligned to either UMD3.1 or ARS-UCD1.2 with the respective mean R 2 dose of 0.916 and 0.917. Only a few SNPs showed large differences in the accuracy of imputation between the two assemblies ( Fig. 1a, Additional file 1b). Genotype imputation was less accurate for low-frequency variants (MAF < 5%) (Additional file 1a) using either the ARS-UCD1.2 (0.818) or UMD3.1 (0.817) assembly.
The new assembly placed 1507 of 1735 markers that were previously unmapped (i.e., not assigned to a chromosome on UMD3.1) on autosomes; 762 of these passed our QC parameters and segregated in the target panel after imputation. Three hundred fifty-six did not pass QC, mainly due to MAF filtering applied, and 389 markers were monomorphic after the imputation. The newly placed autosomal markers were imputed with a mean R 2 dose value of 0.802. Compared to the UMD3.1 assembly, 2874 markers were placed on a different chromosome in the ARS-UCD1.2 assembly. For 1116 of these markers that passed QC in our data, the mean R 2 dose was low but improved from 0.515 (UMD3.1) to 0.586 (ARS-UCD1.2) (Fig. 1b).
The analysis of imputation accuracy enabled us to identify regions with poor imputation quality most likely due to misplaced SNPs. We partitioned autosomes into segments of 100 kb and counted the number of SNPs that were poorly imputed despite moderate minor allele frequency (MAF > 1% and R 2 dose < 0.8). All SNPs in segments with more than three poorly imputed SNPs were considered as putatively misplaced (Fig. 2). A total of 246 segments containing 5447 SNPs (Additional file 2) and 182 segments containing 3549 SNPs (Additional file 3) were identified as misplaced on the UMD3.1 and ARS-UCD1.2 assembly, respectively. Of the 5447 putatively misplaced SNPs on the UMD3.1 assembly, 4769 segregated in a taurine cattle population [7], where using a slightly different approach 1680 (35%) of these markers were identified as likely misplaced. A subset of 2205 SNPs was identified as misplaced on both assemblies with respective mean R 2 dose of 0.65 and 0.67 when aligned to the UMD3.1 and ARS-UCD1.2 assembly. Three thousand two hundred forty-two SNPs in segments identified as misplaced on the UMD3.1 assembly were imputed with better accuracies (not on poorly imputed segments) on the ARS-UCD1.2 assembly. The mean R 2 dose for these SNPs increased from 0.78 to 0.93 when aligned to ARS-UCD1.2. The increase in R 2 dose was even more substantial (from 0.49 to 0.95) for 386 SNPs that were mapped to a different chromosome on the new build. On the other hand, 1344 SNPsnot on poorly imputed segments when mapped to the UMD3.1 assemblywere identified as putatively misplaced on the ARS-UCD1.2 assembly. The mean R 2 dose for these SNPs decreased from 0.89 to 0.73 when mapped to the ARS-UCD1.2 assembly. The drop in the accuracy was even more substantial (from 0.89 to 0.14) for 257 SNPs that were mapped to a different chromosome on the new build.

Effect of reference panel size on imputation accuracy and marker density
We used SNPs mapped to ARS-UCD1.2 to study the effect of reference panel size on the accuracy of imputation and marker density in the imputed data set. Using four reference panels containing 250, 500, 1000, and 2000 randomly selected animals, we imputed genotypes for 1938 animals from low (50 K) to high density (777 K) and compared their imputed and true genotypes using ten-fold cross validation to take sampling errors and population substructure into account. The mean squared correlation between true and imputed allele dosage (R 2 dose) increased with an increasing size of reference panels. For instance, R 2 dose increased from 0.845 to 0.917 (8.5%) when the reference panel size increased from 250 to 2000 (Table 1). The variance in R 2 dose values was similar across replicates, suggesting population substructure had little or no effect on imputation accuracy in our data (Additional file 4).
The increase in R 2 dose was observed for markers in all allele frequency classes ( Fig. 3a) but it was greater (from 0.688 to 0.818; 19%) for low-frequency (MAF < 5%) than for common variants (MAF > 5%; 0.909 to 0.965; 6%). The correlation-based empirical accuracy metric R 2 dose does not reflect the realized marker density (number of informative markers) in the imputed data set. Therefore, we studied the marker density in the imputed dataset when the genotypes were imputed using different reference panels. Larger reference panels captured a greater fraction of genetic variation in the population, allowing the imputation of a larger number of variants ( Fig. 3b and Table 2). In the smallest (n = 250) and the largest Fig. 1 Difference in imputation accuracy according to reference genome assemblies. Difference in imputation accuracy (R 2 dose) between the ARS-UCD1.2 and UMD3.1 assembly for all makers (a), and for markers that were remapped to a different chromosome (b), when genotypes were aligned to the two Bovine assemblies. To facilitate the interpretation, the y-axis is presented on log-scale in (a) (n = 2000) reference panels, 597,095 and 636,201 markers were polymorphic, respectively, of which 595, 609 and 629,025 were also polymorphic in the 1938 target animals.
When we reduced the marker density of the target animals to the content of the BovineSNP50 BeadChip genotyping array and imputed the missing genotypes using reference panels of increasing size, the number of informative markers increased from 586,701 to 618,660. The increase in the imputed marker density with larger reference panels was mainly (~90%) due to a higher number of low-frequency variants ( Table 2). The largest reference panel contained~31% more (150,407 vs. 114, 516) rare alleles (MAF < 2%) than the smallest reference panel (n = 250 animals), of which 132,931 (92.2%) and 104,222 (92.8%) remained informative post-imputation. The additional markers available in the imputed data when genotypes were inferred from the largest reference panels (compared to the smallest reference panel) were imputed with the mean and median R 2 dose of 0.674 and 0.740 with 42% of the markers being imputed with R 2 dose > 0.80. Table 1 Empirical accuracies (R 2 dose, R 2 gt, CR) and modelbased (Rsq) imputation quality score from Minimac3 using different reference panels

Effect of target panel size on imputation accuracy and markers density
We investigated the effect of target panel size on imputation accuracy and number of polymorphic markers after imputation using a target panel of 500 (base) animals. The 50 K genotypes of the base animals were phased together with 0, 600, 1200, 1800, or 2400 additional animals, and subsequently imputed to 777 K using the same reference panel of 1000 animals. The mean squared correlation between true and imputed allele dosages (R 2 dose) increased when genotypes were imputed into haplotypes that were inferred in larger target panels ( Fig. 4a and Table 3). For instance, R 2 dose increased from 0.903 to 0.913 (1.1%) when 2400 additional  Number of markers that were imputed as polymorphic but were monomorphic in the target animals (considering the true genotypes) c MAF < 2% animals were included for haplotype inference. Increase in accuracy was observed for markers in all allele frequency classes (Fig. 4a). Rare variants (MAF < 2%) benefited the most from large target panels. For common variants, the marginal increase in accuracy was low beyond 600 additional animals (Table 3).
In the 1000 reference animals, 625,538 markers were polymorphic of which 607,560 were also polymorphic in the 500 base animals. When we combined the base animals with different numbers of additional animals (0, 600, 1200, 1800, and 2400) and imputed the missing genotypes, the number of markers that were polymorphic in the imputed data increased with the number of additional animals used. For instance, the number of polymorphic markers in the imputed data increased by 2331 markers (from 593,239 to 595,570) when 2400 additional animals were considered to infer the haplotypes of the 500 base animals. Most (n = 2323) of them were rare markers segregating at MAF < 2% (Fig. 4b).
Of 17,978 rare markers that had MAF less than 2% in the reference panel, 14.6% were monomorphic in the true genotypes of the target panel. However, some of these rare markers were polymorphic in the imputed dataset likely due to erroneously imputed genotypes. The fraction of fixed markers in the target data that turned polymorphic after imputation increased with the number of additional animals used for phase inference in target animals (9.5 and 10.5% using 0 and 2400 additional animals for haplotype inference, respectively) ( Table 4).

Reliability of the model-based quality score from Minimac3
Next, we compared the model-based quality score reported by Minimac3 (Rsq) to our three empirical accuracy estimates -(i) squared Pearson's correlation coefficient between true genotypes and imputed allele dosages (R 2 dose) (ii) squared Pearson's correlation coefficient between true and imputed "best-guess" genotypes (R 2 gt), and (iii) proportion of correctly imputed genotypes (concordance rate -CR) -. Irrespective of the size of the reference panel and MAF of the imputed variants, the Rsq values were higher than R 2 dose and R 2 gt but lower than CR values (Table 1, Additional file 5). However, the discrepancy between these metrics decreased  when either the MAF of the imputed variants or the size of the reference panels increased (Additional file 6, Table 1). The Rsq values reported by Minimac3 were highly correlated with R 2 dose (0.77) and R 2 gt (0.75) but not with CR (0.03) when a reference panel of 2000 samples was used to impute 1939 target samples. The very low correlation between Rsq and CR was mainly due to large discrepancies of both metrics for markers with low MAF. The mean correlations of Rsq with R 2 dose, R 2 gt, and CR (excluding markers with MAF < 0.5%) increased from 0.80, 0.81 and, 0.33 to 0.84, 0.84 and, 0.61, respectively, when the reference panel size increased from 250 to 2000 (Fig. 5). These correlations were also stronger for common than for the rare variants; when imputing with the largest (n = 2000) reference panel, the correlations of Rsq with R 2 dose, R 2 gt and CR were 0.81, 0.81 and, 0.64 and 0.93, 0.92 and, 0.84 for low-frequency (MAF < 5%) and common (MAF > 5%) variants, respectively.

Discussion
Phasing and imputation accuracies of high-density genotypes using the ARS-UCD1.2 bovine reference genome assembly have been investigated in different Bos taurus taurus breeds [11]. In Bos taurus indicus cattle, the accuracy of imputing 50 K genotypes to higher density has been investigated only using the UMD3.1 assembly of the bovine genome [23,24]. To the best of our knowledge, our study is the first to evaluate the accuracy of imputing 50 K to 777 K genotypes in a Bos taurus indicus cattle breed using the ARS-UCD1.2 assembly of the bovine genome [21].
Higher contiguity of the ARS-UCD1.2 genome assembly might improve haplotype inference in regions that contained phasing and imputation errors in the previous assembly (UMD3.1) [7,10,18]. While the global accuracy of imputation was identical for both assemblies, we detected a number of segments for which the accuracy of imputation differed greatly. Moreover, we identified fewer putatively misplaced SNPs on ARS-UCD1.2 compared to UMD3.1 and observed a substantial improvement in imputation accuracy for 60% of misplaced SNPs on UMD3.1 when they were mapped to the new build. However, we also identified SNPs that had considerably higher imputation accuracy when aligned to UMD3.1 than to ARS-UCD1.2 indicating that some physical ARS-UCD1.2 coordinates of the 777 K markers are wrong. A limitation of our study is that we considered only SNPs interrogated with commercial genotyping chips. These SNPs suffer from ascertainment bias as they are predominantly located in more accessible regions of the genome [25]. Hence differences in imputation accuracy between the two builds might be higher than observed in our study when imputing whole-genome sequence variant genotypes. A haplotype-resolved Brahman genome assembly [26] might further improve the accuracy of imputing genotypes for animals from indicine breeds, particularly within regions that are not co-linear between Bos taurus taurus and Bos taurus indicus assemblies.
The accuracy of imputing Illumina BovineHD genotypes was assessed previously in indicine cattle. Accuracy greater than 0.957 was obtained when reference panels of 793 and 171 Nellore and Gyr animals were used to impute 50 K genotypes to 777 K [23,24]. Using map positions of the UMD3.1 assembly and the software FImpute [4], these studies assessed the correlation between true and imputed (best-guess) genotypes (Rgt). Using a pre-phasing-based approach that does not explicitly consider pedigree information, we obtained a lower mean Table 4 Number of polymorphic and monomorphic markers in n = 1000 reference and n = 500 target animals when different numbers of additional animals were included for phase inference in the 500 target animals imputation accuracy (Rgt) of 0.952 in Nellore cattle, although the number of reference animals (n = 2000) was considerably higher and the pre-phasing-based approach applied in our study is expected to be at least as accurate as FImpute [27]. The slightly lower accuracy obtained in our study likely resulted from including more lowfrequency markers, that are difficult to impute, rather than from differences in the imputation software used. Applying the same MAF filter used in the previous studies (MAF > 2%), we indeed obtained a comparable accuracy (Rgt) of 0.976 using the ARS-UCD1.2 assembly with pre-phasing-based genotype imputation. Accurate genotype imputation requires large reference panels that are genetically related to the target animals [3,7,28]. In our study, the mean accuracy of imputation (R 2 dose) increased by 8.52% when the size of the reference panel increased from 250 to 2000. Our findings also show that the size of the reference panel is correlated to the number of informative markers, i.e., polymorphic, in the imputed data set. Large reference panels facilitate imputing genotypes for true rare variants. Importantly, genotypes for additional variants that were only available when the largest reference panel was considered, were imputed with moderately high accuracy (mean R 2 dose = 0.674). Correlation-based accuracy estimates (R 2 dose and R 2 gt) do not reflect the number of monomorphic sites post imputation, thus the overall benefit from large reference panels is even higher than indicated by these measures [12,29].
Our approach allowed for an indirect investigation of the effect of sample size on phasing accuracy by studying the accuracy of imputing genotypes into the inferred haplotypes. We show that the size of the target panel also influences pre-phasing-based imputation quality. Both imputation accuracy and marker density in the target animals increased when their haplotypes were inferred in larger cohorts. Haplotype inference using statistical methods benefits from large cohorts [15] due to better modeling of linkage disequilibrium and increased prevalence of identical by descent tracts [16]. Although the pre-phasing-based strategy employed here is computationally efficient (and accurate when using larger reference and target panels), our results showed that care needs to be taken when it is applied to impute genotypes into smaller cohorts. The new paradigm of reference-based phasing implemented in Eagle2 and SHAPEIT3 [30], or use of pedigree information (e.g., with FImpute [4] or LINKPHASE3 [31]) might improve phase inference in small reference and target panels. As noted by Loh and coworkers [16], the reference panel used for imputation could also be used as a reference panel for phase inference. If the size of the target panel increases, it seems advisable to infer target haplotypes and impute genotypes in the pooled data again instead of imputing only the new data. Our results may be generalized to other applications of genotype phasing as they highlight that the sample size needs to be considered in choosing an appropriate phasing method.
Many genotype imputation software tools estimate a SNP-level model-based quality score to indicate imputation accuracy. This value may be used to detect and discard poorly imputed variants from downstream analyses [28,32]. We compared the model-based imputation quality score from Minimac3 (Rsq) to three empirical measures of imputation accuracy (R 2 dose, R 2 gt, and CR). In agreement with previous studies that investigated the model-based accuracies from Beagle [33] and Minimac3 [22], we found that Rsq values were highly correlated with correlation-based empirical measures (R 2 dose, R 2 gt). The correlation between empirical and model-based quality scores increased with the size of the reference panel. However contrary to the results from a previous study [22], we found Rsq values to be systematically higher than the R 2 dose and R 2 gt for makers across all allele frequency classes. The third empirical measure, CR was not correlated with Rsq when all markers were considered. This measure, unlike the first two, does not account for the frequency of imputed alleles and considerably overestimates the accuracy of imputation for rare variants [34,35]. However, the correlation between CR and model-based quality score is also high for variants with MAF greater than 5%.
The calculation of empirical accuracy using crossvalidation is computationally intensive. Our results showed that the model-based quality score is highly correlated with empirical correlation-based measures. However, Minimac3 reported moderately high Rsq values (increasing with the size of the reference panel used; > 0.640) for variants that were falsely imputed (i.e. monomorphic in the target panel, considering the true genotypes, but polymorphic in the imputed data). Because thresholds between 0.3 and 0.6 are commonly used to discard markers of insufficient imputation quality [36] these erroneously imputed variants would qualify for downstream analyses. Variants that are imputed into target animals although they are actually not polymorphic, may result in conflicting results in association studies [37]. Our findings show that model-based imputation quality scores, although useful, need to be treated with caution, particularly for low-frequency and rare variants that have been imputed from less informative reference panels.

Conclusions
Accurate imputation of BovineHD BeadChip markers is possible in Nellore cattle using the new bovine reference genome assembly ARS-UCD1.2. The improvement in the new assembly affected the imputation of only a small fraction of SNPs present on the Bovine HD SNP chip. The global accuracy of imputation was almost identical to the previous build (UMD3.1). Accuracy of imputation and the number of informative markers in the imputed data benefit from large reference panels. The size of the target panel also has an influence on both metrics when genotypes are inferred using pre-phasing-based approaches. The model-based imputation quality scores from Minimac3 (Rsq) were correlated with the empirical correlation-based estimates of accuracy. These values can be used to filter out poorly imputed variants but their reliability is low for rare and low-frequency variants particularly when genotypes are imputed from small reference panels.

Genotype data
We considered microarray-derived SNP genotypes of 4098 Nellore (Bos taurus indicus) cattle from farms participating in the DeltaGen® and Paint® (CRV Lagoa) breeding programs in Brazil. All animals were genotyped using the high-density Illumina BovineHD BeadChip (Illumina, Inc., San Diego, CA, USA), comprising 777, 962 SNPs. Following the genotyping, all animals remained as either breeding or fattening animals at the farms. No animals were raised or slaughtered for the purpose of this study. The map positions of the SNPs were determined according to the Bos taurus taurus reference genome assembly UMD3.1 [38]. Quality control (QC) on the raw genotype data was performed using the PLINK (v1.9) software [39]. Animals and SNPs with call rates < 95% were not considered. SNPs that were annotated to non-autosomal chromosomes or SNPs for which the map positions were unknown were discarded as well as SNPs that deviated from Hardy-Weinberg proportions (p < 10 − 5 ). After filtering, 3938 animals with genotypes for 684,561 autosomal SNPs remained in the dataset. Sporadically missing genotypes (max 5%) were imputed using Beagle5 [40].

Imputation procedure and assessment of accuracy
To assess the accuracy of imputation for different compositions of reference and target panels and two bovine genome assemblies (see below), we carried out a ten-fold cross-validation using the 3938 animals. For each fold, we randomly grouped animals into target and reference panels. The marker density in the target panels was reduced to match the BovineSNP50 (version 2) BeadChip comprising 56,206 SNPs. Subsequently, the genotypes of the target panels were imputed to higher density using information from the reference panel using a prephasing-based imputation workflow. The Eagle v2.4 [16] software was used to infer haplotypes for the reference and target panels and Minimac3 [13] was used to impute genotypes.
Minimac3 provides imputed allele dosages (continuously distributed values between 0 and 2) and best-guess genotypes (discrete values of 0, 1, or 2). We assessed empirical imputation accuracy as the squared Pearson's correlation coefficient between either true genotypes and imputed allele dosages (R 2 dose), or true genotypes and best-guess genotypes (R 2 gt). We also calculated the proportion of correctly imputed genotypes (concordance rate -CR). R 2 dose values were used to assess imputation quality in different imputation scenarios and all three empirical accuracies were used to investigate the reliability of the model-based quality score estimate from Mini-mac3 (Rsq).
Values for R 2 dose and R 2 gt cannot be calculated for markers that are imputed to the major allele in all target animals because the imputed doses and "bestguess" genotypes for such markers have zero variance. Thus, the loss of such otherwise segregating markers (usually at very low MAF) post-imputation is not taken into account by the correlation-based imputation accuracy measures. To study the effect of reference and target panel composition on the realized marker density in the imputed dataset, we also calculated the proportion of markers that are imputed to the major allele in all target samples but were segregating in the real data set.

Identification of misplaced SNPs
We analyzed the accuracy of genotypes imputed from the largest reference panel to identify regions with poor imputation qualitydue most likely to misplacement of SNPswhen genotypes were aligned to the two bovine genome assemblies. We partitioned autosomes into segments of 100 kb and counted the number of poorly imputed (MAF > 1% and R 2 dose < 0.8) SNPs. All SNPs in segments with more than three poorly imputed SNPs were considered as putatively misplaced.

Test scenarios Effect of reference and target panel
We used 3938 animals with genotypes for 683,590 markers aligned to the ARS-UCD1.2 assembly to investigate the effect of reference and target panel sizes on imputation quality. To study the effect of the reference panel, we imputed a target panel of 1938 animals from 50 K to 777 K using reference panels with 250, 500, 1000, and 2000 randomly sampled animals in 10 replicates. To study the effect of target panel size, we imputed genotypes in pre-phased target panels of varying sizes using a reference panel of fixed size (n = 1000). Genotypes for 500 animals (base panel) were phased with an additional of 0, 600, 1200, 1800 or 2400 randomly sampled animals. The resulting pre-phased target panels respectively containing 500, 1100, 1700, 2300, and 2900 animals were imputed to higher density using a phased reference panel of 1000 animals. The imputation was carried out in 10-fold cross-validation, and the accuracy of imputation was assessed only for the 500 animals in the base panel.

Comparison between UMD3.1 and ARS-UCD1.2 genome assemblies
We compared the accuracy of imputation in SNP genotypes aligned either the UMD3.1 or ARS-UCD1.2 genome assembly by imputing 1938 animals from 50 K to 777 K using a reference panel of 2000 animals in ten-fold cross-validation (see above). Target and reference animals were selected randomly. The global imputation accuracies were reported as mean R 2 dose (see above) from all autosomal markers across all replicates. We also studied the difference in accuracies for (i) "newly mapped" markers i.e., markers that are unmapped when aligned to UMD3.1 and on autosomes when aligned to ARS-UCD1.2, (ii) markers aligned to different chromosomes for the assemblies, and. (iii)markers identified as putatively misplaced.