Analysis of Hematological Traits in Polled Yak by Genome-Wide Association Studies Using Individual SNPs and Haplotypes

Yak (Bos grunniens) is an important domestic animal living in high-altitude plateaus. Due to inadequate disease prevention, each year, the yak industry suffers significant economic losses. The identification of causal genes that affect blood- and immunity-related cells could provide preliminary reference guidelines for the prevention of diseases in the population of yaks. The genome-wide association studies (GWASs) utilizing a single-marker or haplotype method were employed to analyze 15 hematological traits in the genome of 315 unrelated yaks. Single-marker GWASs identified a total of 43 significant SNPs, including 35 suggestive and eight genome-wide significant SNPs, associated with nine traits. Haplotype analysis detected nine significant haplotype blocks, including two genome-wide and seven suggestive blocks, associated with seven traits. The study provides data on the genetic variability of hematological traits in the yak. Five essential genes (GPLD1, EDNRA, APOB, HIST1H1E, and HIST1H2BI) were identified, which affect the HCT, HGB, RBC, PDW, PLT, and RDWSD traits and can serve as candidate genes for regulating hematological traits. The results provide a valuable reference to be used in the analysis of blood properties and immune diseases in the yak.


Introduction
Yak (Bos grunniens) is an iconic symbol of the Qinghai-Tibet Plateau and adjacent high-altitude areas [1]. This region is well known for its high-altitude, pristine natural environment, and extreme seasonal changes [2]. During the early Holocene period, the ancient Qiang people began to domesticate wild yaks [3,4]. While only a few animal species can survive in the Qinghai-Tibet Plateau, yaks thrive in the extreme conditions present in this area. They provide food such as meat and milk, a mode of transportation, shelter, and fuel for the residents of this high-altitude region [5,6]. As typical representatives of indigenous animals, yaks are characterized by a high degree of adaptability and low sensitivity to harsh ecological pressures [7,8]. However, due to relatively primitive breeding technology used by the herdsmen, and the difficulty or even lack of implementing immunizations and other measures preventing epidemics, yaks are very susceptible to parasites, which has caused significant economic losses to the yak-breeding industry.
Genes 2019, 10, 463 3 of 13 260/280 between 1.8 and 2.0 was considered to reflect acceptable quality. Qualified DNA samples were diluted to yield the concentration of 50 ng/µL and stored at −20 • C for subsequent SNP chip typing.
Standard sets of hematological data were recorded using the Sysmex pocH-100iV Diff whole blood analyzer (Sysmex Corporation, Japan) within 24 h of specimen collection at the Lanzhou Institute of Husbandry and Pharmaceutical Sciences of Chinese Academy of Agricultural Sciences. The remaining portion of the blood sample was stored at −20 • C until DNA extraction (see above). To perform single-marker GWASs, 15 blood traits were used. They included 8 erythroid traits: hematocrit (HCT), hemoglobin (HGB), mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC), mean corpuscular volume (MCV), red blood cell count (RBC), red blood cell volume distribution width-SD (RDW-SD), and red blood cell volume distribution width (RDW). Three leukocyte traits were also analyzed: lymphocyte count (LYM), white blood cell count (WBC), and medium white blood cell count (OTHR). The remaining four traits were related to the platelets and included platelet distribution width (PDW), platelet count (PLT), platelet-large cell ratio (P-LCR), and mean platelet volume (MPV). The presence of correlations between the 15 hematological parameters analyzed was established using the R psych package (http://personality-project.org/r/psych.manual.pdf).

SNP Array Genotyping and Quality Control
Genomic DNA was processed according to Illumina's chip protocol and the chip was scanned using the iScan instrument (Illumina Inc., San Diego, CA, USA). The results of the chip scan were analyzed with GenomeStudio (version 2011.1) software. Quality control was performed using PLINK (version 1.07) [32] and results with a parameter of detection rates >95.0% and minor allele frequencies >0.01, HWE (p-value < 1×10 −5 ), were excluded. Moreover, yaks with more than 10% of missing genotypes or with Mendelian errors greater than >5% were excluded. Only SNP markers that were consistent with family separation rules have been included in subsequent statistical analyses.

Statistical Analyses
The Bonferroni calibrated multiple test [33] was used to determine the significance threshold, with a genomic significant level threshold of 0.05 per effective SNP locus and a chromosomal significant horizontal threshold of 1 per effective SNP locus. For the purpose of this study, the corresponding thresholds were set as 9.5 × 10 −6 (0.05/104,679) and 4.7 × 10 −7 (1/104,679).

Single-Marker GWAS
For each SNP, the linear trend of alleles and phenotypic traits was assessed by the general linear mixed model [34,35]. The linear model contains polygenic random effects, and a variance-covariance matrix to form a structural Q matrix. The Q matrix was computed using admixture [36]. Since the CV value (Corss validation) was found to be the smallest when K = 3, therefore K = 3 was chosen for the Q matrix.
The model was described as follows: where y is the phenotype, b i is the regression coefficient of phenotype on SNP genotypes, X represents the vector of the SNP genotype indicators and assumes values of 0, 1, or 2 corresponding, respectively, to the three genotypes 11, 12 and 22, v is a variable dependent on population structure, Q is the corresponding principal components matrix, and e is the vector of residual errors with e, N (0, Iσ e2). The single-marker GWASs were conducted using the Tassel 5.2.43 software package [35].

Haplotype Analysis
Individual haplotype blocks and haplotype frequency were detected using the standard expectation maximization (EM) [37] algorithm. Reference thresholds and partitioning criteria for SNPs within closely linked haplotypes were established following the recommendations of the user manual of the PLINK software [32]. The recessive multi-alleles were used as a random effect in the association study model. The haplotype analysis model used in the current research is similar to that employed in the single-marker model, except for the replacement of the SNP effect in the single-marker GWAS by the haplotype effect.

SNP Characteristics and Phenotype Statistics
The mean, standard deviation and coefficient of variation (CV) for 15 phenotypic observations of blood traits in the experimental population are presented in Table 1. The CV values ranged from a minimum of 2.92 to a maximum of 57.71 for MCHC and P-LCR, respectively.
After quality control, none of the animals had a genotyping call rate lower than 95%; all 315 yaks were included in the association analyses. A total of 42,276 SNPs failed the missingness test (GENO > 0.05), 653,185 SNPs had a minor allele frequency (MAF) of less than 0.05, and 8710 SNPs were severely departing from the Hardy Weinberg Equilibrium (HWE) (p-value less than 10 −5 ). These SNPs were excluded from further analysis so that a total of 104,864 SNPs remained. Additional 185 SNPs, including unmapped SNPs, were also removed, yielding a total of 104,679 SNPs used for subsequent GWASs.

Loci for Erythrocyte Traits
Sixteen significant SNPs were found by using the single-marker GWAS method. They included nine suggested and seven genome-wide significant SNPs. The SNPs were identified for three erythrocyte traits, of which six were for HCT, five for HGB, and five for RBC (Table 2). Together, 16 single markers were detected which were located within five annotated genes. They include nine markers in the region of genes region, and one marker was located in a region 69,046 bp away from the nearest annotated gene. All identified markers were distributed on BTA (Bos taurus chromosome) 5, 16, 14, and 23. The most significant association of SNP was with the HCT trait (p = 6.21 × 10 −8 ) ( Figure 1).
In haplotype analysis, a total of six significant haplotype blocks, including two genome-wide and four suggestive blocks, were identified for four erythrocyte traits. One block each was associated with HCT, HGB, and RDWSD, while three blocks were associated with RBCs (Table 3). These significant SNPs were located on chromosome 23 ( Figure 2). The most significant haplotype block was located in the GPLD1 gene and was associated with the HGB trait.

White Blood Cell Counts
By single-marker GWAS, seven single suggestive SNPs were found for two white blood cell counts traits. Three of them were identified for LYM and four for OTHR. The most significant (p = 7.71 × 10 −7 ) marker, bovine HD0100040184 was associated with OTHR and was located in the ATP2C1 gene on chromosome 1 (Figure 1).
Haplotype analysis revealed only 1 significant haplotype block, which was associated with OTHR traits. The most significant annotation blocks located in the LOC785941 gene on chromosome 8 was also associated with OTHR traits (Figure 2).

Platelet Traits
Single-marker GWAS detected 14 significant suggestive SNPs. All of them were associated with PLT. Expected PLT traits markers were included within genes or within the nearest regions of genes; other trait makers were outside the gene region. The most significant (p = 3.87 × 10 −7 ) PLT trait marker was bovine HD2100007261, located in the SH3GL3 gene on chromosome 4 ( Figure 1).
Haplotype analysis detected two haplotype blocks for PLT. Both blocks were located on chromosome 23 at position from 31,565,453 to 31,686,075 bp position ( Figure 2).

Discussion
The present investigation employed a single-marker and haplotype GWAS method based on the BovineHD BeadChip to analyze 15 hematological traits in polled yaks. The accumulated data resulted in the identification of 43 significant SNPs by single-marker GWAS, including 35 suggestive and eight genome-wide significant SNPs, as well as nine significant haplotype blocks, including seven suggestive and two genome-wide blocks, by haplotype analysis. The combination of single-marker and haplotype analyses revealed significant associations of 15 hematological quantitative traits in comparable genomic regions. However, the distribution of genomic P values obtained with these two methods were slightly different. This apparent lack of consistency points to the fact that the efficacy of the analysis is data dependent. Additionally, the results obtained with the single-marker method were characterized by more significant p-values than those acquired with the haplotype method. The reason for this discrepancy may be due to the close relationship between continuous SNPs located in the same region; clustering of SNPs may reduce the number of significant correlations. It is often difficult to weigh between increasing LD and increasing freedom which results in reduced detection efficiency. Furthermore, the LDs distributed in the genome are not uniform, i.e., in some places, the LD is high, whereas in others places the LD is low, so we compared the two methods to capture more associated SNPs.
The current investigation represents the first GWAS of hematological traits in the yak. Previous studies using this methodology in livestock were focused on hematological traits in pigs. By employing a similar method, Zhang and coworkers have determined that four SNPs on SSCs 2 have pleiotropic effects when single-marker analysis was used and are located at 24,777,963 bp on chromosome 7 when haplotype analysis was employed [38]. Seven genes have been selected as potential candidates after SNP annotations and expression variation were analyzed. Zhang studied selective White Duroc 6 Erhualian F2 intercross in three growth stages and found 185 significant SNPs for 18 hematological traits. In the GWAS analysis of the dairy cow, hematologic traits were not addressed, but the association of leptin with immunological [39] and hematological traits was identified. The study of Orrù and colleagues [30] revealed that haplotype polymorphism affects the hematologic variables of the dairy cow during the perinatal period. Our study identified several novel SNPs affecting blood traits. However, they did not overlap with significant loci of the same traits in other species. The possible reasons for this discrepancy include differences among species, the use of cattle chips that yielded only 10,486 SNPs after the quality control caused insufficient marker density, and the genetic complexity of blood cells.
Genes 2019, 10, x; doi: FOR PEER REVIEW www.mdpi.com/journal/genes discrepancy may be due to the close relationship between continuous SNPs located in the same region; clustering of SNPs may reduce the number of significant correlations. It is often difficult to weigh between increasing LD and increasing freedom which results in reduced detection efficiency. Furthermore, the LDs distributed in the genome are not uniform, i.e., in some places, the LD is high, whereas in others places the LD is low, so we compared the two methods to capture more associated SNPs.  The current investigation represents the first GWAS of hematological traits in the yak. Previous studies using this methodology in livestock were focused on hematological traits in pigs. By employing a similar method, Zhang and coworkers have determined that four SNPs on SSCs 2 have pleiotropic effects when single-marker analysis was used and are located at 24,777,963 bp on The results of this GWAS showed a significant association pattern of erythrocyte phenotypes. The Manhattan plot pattern was similar for HGB, HCT, and RBC and the three erythrocyte traits which shared common regions at BTA23, from 32,982 Mb to 32,983 Mb, containing two SNPs (BovineHD2300009627 and BovineHD2300009628). Both of these parameters are directly related to Erythrocytes, respectively correlation analysis of hematological traits by comparison (Additional file 1: Table S1). Moreover, the results indicated a high correlation between the three erythrocyte traits (r1 = 0.96, r2 = 0.93, r3 = 0.94; p < 0.001). A significant degree of correlation implies that the QTL on BTA23 might be pleiotropic. However, this association was not found in leukocyte and platelet traits.
By applying a single-marker analysis, the current study identified 43 significant SNPs on 12 different chromosomes associated with 15 hematological traits (Additional file 1: Table S1). Together, 15 genes occupying regions from 63 to 77,309 bp were annotated. The subsequent verification of the function of these genes resulted in the selection of three genes as potential candidates. They included endothelin receptor type A (EDNRA), apolipoprotein B (APOB), and glycosylphosphatidylinositol-specific phospholipase D1 (GPLD1)-all functionally associated with blood-related cells or immune function.
The SNP BovineHD1700003099 located in the gene explained 13.35% of phenotypic variants of PDW (Table 2). EDNRA, a homologous isomer of the endothelin receptor, belongs to the rhodopsin family of the G protein-coupled receptor superfamily [40]. EDNRA is an ET-1 selective receptor, expressed mostly in vascular smooth muscle cells. Activation of EDNRA results in prolonged vasoconstriction. The endothelin receptors are closely related to the onset and development of cardiovascular and kidney diseases, diabetes mellitus, autoimmune diseases, and cancer [41][42][43]. The EDNRA gene is also involved in the vasoconstriction mechanism, which is closely related to pulmonary hypertension [44]. In addition, the results indicated that 14 markers were significantly correlated with the PLT traits, and the association of the BovineHD 1100022374 marker was most significant (p =1.68 × 10 −6 ). Moreover, all these markers were located within the APOB gene. The expression of the APOB gene is positively correlated with the degree of atherosclerosis, and the composition and function of platelets in patients with hyperlipidemia is altered. These changes may be related to the effect of circulating lipoproteins on platelet phenotype [45]. Since yaks are chronically exposed to low atmospheric pressure and hypoxia, the compensatory increase in the number of erythrocytes leads to a higher blood viscosity, which results in the disorder of lipid metabolism and an elevation in blood lipid levels. In this regard, hypoxia promotes the formation of atherosclerotic plaque through the increase of permeability of vascular walls for lipids [46]. There is a coordination relationship between the platelet activation mechanism and the cholesterol-rich microdomain [47]. In the species of cattle with close relatives of yak, the base mutation and insertion mutation of the APOB gene can be found to cause cholesterol deficiency in the individual [48,49]. Furthermore, the data obtained in the present study demonstrate that two significant markers (BovineHD4100016160, BovineHD2300009636) overlapped and fell into the GPLD1 gene, which is also associated with erythrocyte traits. The product of the GPLD1 gene, also known as GPI-PLD and PLD, belongs to the glycosyl phosphatidyl inositol (GPI) family of proteolytic enzymes. The concentration of GPLD1 in human serum can be as high as 5-10 µg/mL [50]. GPI has a variety of anchoring proteins widely expressed in the cardiovascular system. During the onset and development of atherosclerosis, GPI acts not only as a physiological receptor, but also activates signaling pathways leading to growth or apoptosis of vascular cells. The deficiency of the homologous complement increases the sensitivity of erythrocytes, becoming the most important factor in the pathogenesis of paroxysmal nocturnal hemoglobinuria. It has been documented that the overexpression of GPLD1 increases its ability to anchor CD55 and CD59 by nearly 50%, resulting in the inhibition of cell proliferation, and increase in the rate of apoptosis and complement-mediated cell death. These effects are attributed to the blockade of the CD55 and CD59 signaling pathway and the decrease or inhibition of the resistance to complement activity, constituting an important factor in the immune escape of cancer cells [51]. Together, these findings point to the involvement of the GPLD1 gene in the pathways affecting the properties of red blood cells, making GPLD1 a candidate gene controlling the characteristics of red blood cells.
In the present work, nine significant haplotype blocks located mostly in BTA8, BTA23, and BTA24 were identified by haplotype analysis. Among them, three significant blocks for RBC were found within two annotated genes, GPLD1 and TINAg, located on BTA23. Also, RBC, HCT, and HGB were shown to have common significant blocks located on the GPLD1 gene. Another erythrocyte trait, RDW-SD, was associated with the block at BTA23, from 3150 to 3156 Mb containing seven SNPs (BovineHD2300009102, BovineHD2300009103, BovineHD2300009105, BovineHD2300009110, BovineHD2300009112, BovineHD2300009113 and BovineHD2300009117), a region that also includes the HIST1H1E and HIST1H2BI genes. These genes are associated with the function of blood and immune cells. Kordbacheh and coworkers have demonstrated that histones promote erythrocyte aggregation and sedimentation in a concentration-dependent manner, increasing their osmotic fragility and lysis [52]. In addition, histones negatively interfere with the denaturation of red blood cells by reducing their ability to pass through an artificial spleen. In vivo, histones induce anemia, increase hemoglobin content in the spleen, and decrease the number of platelets and white blood cells within minutes.

Conclusions
In summary, our research provides important information about the genetic determinants of hematological traits in yaks and points out five genes, namely, GPLD1, EDNRA, APOB, HIST1H1E, and HIST1H2BI, affecting the HCT, HGB, RBC, PDW, PLT, and RDWSD traits, and serving as candidate genes for the future dissection of molecular mechanisms regulating hematological traits. The present investigation provides a valuable reference for the analysis of blood properties in the yak and immune diseases impacting this species.
Supplementary Materials: The following are available online at http://www.mdpi.com/2073-4425/10/6/463/s1, Table S1: Description of correlation and p-value among the hematological traits, Table S2: Description of all identified SNPs showing significant association with hematological traits by single-marker GWAS.
Author Contributions: X.M. and C.J. conceived and designed the experiments; D.F. performed the experiments; M.C. and J.P. analyzed the data; X.G. and X.D. contributed reagents/materials/analysis tools; X.W. and P.B. prepared the figures and/or tables; C.L. and P.Y. approved the final draft of the manuscript submitted for review and publication.

Conflicts of Interest:
The authors declare no conflict of interest.