Genome wide association study identifies SNPs associated with fatty acid composition in Chinese Wagyu cattle

Fatty acids are important traits that affect meat quality and nutritive values in beef cattle. Detection of genetic variants for fatty acid composition can help to elucidate the genetic mechanism underpinning these traits and promote the improvement of fatty acid profiles. In this study, we performed a genome-wide association study (GWAS) on fatty acid composition using high-density single nucleotide polymorphism (SNP) arrays in Chinese Wagyu cattle. In total, we detected 15 and 8 significant genome-wide SNPs for individual fatty acids and fatty acid groups in Chinese Wagyu cattle, respectively. Also, we identified nine candidate genes based on 100 kb regions around associated SNPs. Four SNPs significantly associated with C14:1 cis-9 were embedded with stearoyl-CoA desaturase (SCD), while three SNPs in total were identified for C22:6 n-3 within Phospholipid scramblase family member 5 (PLSCR5), Cytoplasmic linker associated protein 1 (CLASP1), and Chymosin (CYM). Notably, we found the top candidate SNP within SCD can explain ~ 7.37% of phenotypic variance for C14:1 cis-9. Moreover, we detected several blocks with high LD in the 100 kb region around SCD. In addition, we found three significant SNPs within a 100 kb region showing pleiotropic effects related to multiple FA groups (PUFA, n-6, and PUFA/SFA), which contains BAI1 associated protein 2 like 2 (BAIAP2L2), MAF bZIP transcription factor F (MAFF), and transmembrane protein 184B (TMEM184B). Our study identified several significant SNPs and candidate genes for individual fatty acids and fatty acid groups in Chinese Wagyu cattle, and these findings will further assist the design of breeding programs for meat quality in cattle.


Background
Fatty acids of beef products have received considerable attention for their significance in human health, the improvement of salutary fatty acid (FA) content can offer more economic benefits in the beef market [1][2][3][4][5][6]. Like most economically important traits in beef cattle, FA composition are complex traits influenced by both genetic and environmental factors. Several studies have suggested that FA composition are lowly or moderately heritable traits and can be altered by feeding strategies [7][8][9][10]. However, recent studies from estimates of genetic parameters suggested that investigation of the genetic basis for FA composition can enable us to promote genetic improvement for them [7,9,11,12]. Several studies have also reported genetic parameter analyses and genome-wide association studies of fatty acid profiles from milk in dairy cattle [13,14]. Application of molecular genetic approaches can provide more opportunities to design genomic selection strategies for meat quality in beef cattle [15,16].
In the last decade, genome-wide association studies (GWASs) have emerged as a powerful approach for detecting the candidate variants and genes for complex traits in beef cattle [8,[17][18][19]. Many studies have identified candidate markers associated with FA composition in different populations, such as Japanese Black cattle, Hereford, Angus, and Shorthorn [16,[20][21][22][23]. However, most of these studies were carried out using low density SNP arrays. Only a few studies were conducted using the high-density BovineHD (770 K) SNP array in limited breeds [8,24,25]. The genetic basis of FA composition may vary among populations, and the use of genomic technologies for improving fatty acid profiles in beef cattle have not been comprehensively addressed [8,16,21,[23][24][25][26][27][28][29].
Wagyu is especially well-known for its remarkable marbling score and meat quality [6,30]. While the marbling is mainly fat tissue containing abundant monounsaturated fatty acids (MUFAs), it also reflects a lower melting point and contributes positively to beef flavor and tenderness [6,31]. Several studies have been reported to investigate the genetic basis of fatty acids in Wagyu cattle, and a list of associated SNPs and candidate genes have been identified for these traits [6,16,[32][33][34]. Chinese Wagyu cattle is a hybrid population from Wagyu and Fuzhou cattle, Fuzhou cattle is an indigenous population raised in Liaoning Province, China. Therefore, a GWAS of Chinese Wagyu cattle can contribute valuable knowledge for understanding the genetic basis of fatty acid composition. The objectives of the current study were to 1) identify the associated genomic variants and relative candidate genes for FA composition and 2) elucidate the genetic architecture of FA composition at the whole genome level in Chinese Wagyu cattle.

Ethics statement
All animals were treated following the guidelines established by the Council of China Animal Welfare. Experimental protocols were approved by the Science Research Department of the Institute of Animal Sciences, Chinese Academy of Agricultural Sciences (CAAS) (Beijing, China).

Animals and phenotypes
The Chinese Wagyu population (464 animals) was established in Dalian, Liaoning Province, China, and all animals were born between 2012 and 2013. After weaning, they were fattened using the same feeding conditions for 20-24 months and slaughtered at an average of 28 months. During slaughtering, we measured meat quality traits in strict accordance with the guidelines proposed by the Institutional Meat Purchase Specifications for fresh beef. Meat samples were selected from the longissimus lumborum muscle, between the 12 th and 13 th ribs from each animal, after storage for 48 h, and samples were vacuum packed and chilled at − 80°C. In addition, approximately 10 g of sample were taken for subsequent analyses. Total lipids were extracted from samples according to protocols described previously [35]. Approximately 2 mg of extracted lipid was re-dissolved in 2 mL of n-hexane and 1 mL of KOH (0.4 mol/L) for saponification and methylation. A total of 21 individual fatty acids, including six saturated fatty acids, four monounsaturated FAs, and eleven polyunsaturated fatty acids, were measured using gas chromatography (GC-2014 CAFsc, Shimadzu Scientific Instruments). Each FA was quantified as a weight of percentage of total FAs. In addition, FAs groups were measured as total saturated fatty acid (SFA), total monounsaturated (MUFA), total polyunsaturated (PUFA), total omega 3 (n-3), total omega 6 (n-6), ratio between PUFA and SFA (PUFA/SFA), ratio between n-6 and n-3 (n-6/ n-3) and health index (HI). The estimation of various FA groups follows the same process as previously described [8,11].

Genotyping and quality control
Blood samples were obtained with the regular quarantine inspection of the farms. DNA was extracted from the blood samples using a routine procedure. In total, 464 individuals were genotyped using the Illumina Bovine HD BeadChip (Illumina, Inc., San Diego, CA). SNPs were pre-processed based on the following filters using PLINK v1.07 [36]: Minor allele frequency (> 0.05), proportion of missing genotypes (< 0.05), and the Hardy-Weinberg equilibrium (P > 10E-6). Moreover, individuals with more than 10% missing genotypes were excluded. After quality control, the final data consisted of 364 individuals and 503,579 autosomal SNPs.

Heritability and genetic correlation estimation
Phenotypic and genetic (co)variances of fatty acids were estimated using the pairwise bivariate animal model implemented in the ASReml v3.0 package [37]. The model is where y 1 and y 2 are vectors of phenotypic values of trait 1 and 2, respectively; X 1 and X 2 are incidence matrices for fixed effects; b 1 and b 2 are vectors of the fixed effects; Z 1 and Z 2 are incidence matrices relating the phenotypic observations to vectors of the polygenic (a 1 + a 2 ) effects for two traits; and e 1 and e 2 are random residuals for two traits. Variances of the random effects are defined as V ðaÞ ¼ Gσ 2 a for the polygenes and V ðeÞ ¼ Iσ 2 e for the residuals, where G is the additive genetic relationship matrix, I is the identity matrix, σ 2 a is the additive genetic variance, and σ 2 e is the residual variance. The G matrix was constructed using the SNP genotypes based on the proportion of total loci shared by two individuals [38], which was defined as where M is an n × m matrix of the number of animals (n) and number of marker loci (m), and it specifies the marker genotype coefficient at each locus, p i is the frequency of allele A of SNP, and (1 − p i ) is the frequency of allele B. In MM ′ the number of alleles shared by relatives was reported on the off-diagonals and an individual's relationship with itself was reported on the diagonals. Farm and carcass grade were considered as fixed effects in the model. In addition, the duration of fattening and back-fat thickness were considered as covariates in the model. Genomic heritability of each trait was estimated using h 2 ¼ σ 2 a =ðσ 2 a þ σ 2 e Þ . The phenotypic and genetic correlation coefficients were calculated using r P ¼ cov P , where r P and r G are phenotypic and genetic correlation coefficients, respectively. σ 2 P X and σ 2 P Y are the phenotypic variance of trait X and trait Y. σ 2 P X and σ 2 P Y are the additive genetic variance of trait X and trait Y. cov P XY and cov G XY are the phenotypic and genetic covariance.

Genome-wide association analysis using FarmCPU
Fatty acid composition were adjusted for fixed effects and covariates using linear mixed models. The Fixed and random model Circulating Probability Unification (FarmCPU) model was used to test the single-SNP association. This algorithm takes into account the confounding problem between covariates using both the Fixed Effect Model (FEM) and Random Effect Model (REM) [39,40]. The first three principal components were calculated using GAPIT, which were considered as the covariates [41]. The quantile-quantile (Q-Q) plot was generated to assess population stratification [42]. The linkage disequilibrium (r 2 ) was estimated using PLINK v1.07 software. Linkage disequilibrium between SNPs around the target regions were estimated and visualized using Haploview v4.3 software [43]. Region plots were generated using the asplot function in the R package "gap" [44]. Positional candidate genes were investigated for 100 kb windows around SNPs using UCSC Genome Browser, which was based on the Bos taurus genome assembly UMD 3.1. The proportion of phenotypic variance explained by each SNP was calculated as follows: where p i and q i represent the frequencies of two alleles for the i th SNP, β i is the effect of the i th SNP, and σ 2 p denotes the phenotypic variance.

GWAS results and candidate regions
We performed GWAS for 21 individual FAs and 8 FA groups using FarmCPU, and only results for traits with The concentrations of fatty acids were expressed as a percentage of total fatty acid methyl esters quantified genomic heritability ≥0.10 were reported in the current study. As the Bonferroni correction was highly conservative for the GWAS using the high-density SNP array, we considered P < 1.36E-06 (0.1/73,531) as the suggestive significant level which was proposed by Zhu et al. [8] and Duggal et al. [45]. This strategy evaluates the approximate number of "independent" SNPs by counting one SNP per linkage disequilibrium (LD) block, plus all SNPs outside of the LD blocks (inter-block SNPs). The summary of the results from the GWAS using FarmCPU methods were shown in Table 2. In total, we identified 15 and 8 candidate SNPs for individual FAs and FA groups, respectively. Seven SNPs associated with C22:6 n-3 were detected that located at six chromosomes. Among them, three significant SNPs (P-value = 3.38E-10, 9.19E-08, and 1.31E-07) overlapped with phospholipid scramblase family member 5 (PLSCR5), CLIP-associating protein 1 (CLASP1), and Chymosin (CYM), respectively, while no genes were found for other four SNPs (Fig. 1i). Notably, we observed that four candidate SNPs at BTA26 within SCD for C14:1 cis-9 had significant P values (P = 1.02E-07) (Fig. 1c). Among them, three SNPs were imbedded within the intron of SCD, while one SNP located at its exon region. Our results revealed that the highly significant SNPs contribute~7.37% of phenotypic variance for C14:1 cis-9, and some nearby SNPs display high LD around the SCD gene. In addition, we observed one block with 6 kb at this region (Fig. 2). Thus, these identified SNPs are possible candidate markers for further application of maker assisted selection. In addition, one SNP at 42.65 Mb in BTA8 (P = 4.89E-07) for C20:1 cis-11 overlapped with SWI/SNF Related, Matrix Associated, Actin Dependent Regulator of Chromatin, Subfamily A, Member 2 (SMARCA2) (Fig. 1g). We also  Fig. 1a) and two SNPs (at BTA6:64.22 Mb and BTA6:64.28 Mb) were associated with C18:1 cis-9, respectively (Fig. 1e). However, no putative candidate gene was identified near this SNP. Q-Q plots for five FAs were presented in Fig. 1b, d, f, h, and j.
We also detected eight associated SNPs for five FA groups (Table 2 and Fig. 3). Of these SNPs, we observed three, two, one, one, and one candidate SNPs for PUFA/ SFA, MUFA, PUFA, SFA, and n-6, respectively. QQ plots for five FA groups were presented in Fig. 3b, d, f, h, and j. Interestingly, we found one SNP (BovineHD0500031844) Fig. 1 a Manhattan plot of association results for C22:0, where the Y-axis was defined as -log 10 (P) and the genomic position was represented along the X-axis. The green line indicated P = 1.36E-06. b Quantile-quantile plot of 503,579 SNPs in the genome-wide association study for C22:0. c Manhattan plot showing P-values of association for C14:1 cis-9. d Quantile-quantile plot for C14:1 cis-9. e Manhattan plot of association results for C18:1 cis-9. f Quantile-quantile plot for C18:1 cis-9. g Manhattan plot showing P-values of association for C20:1 cis-11. h Quantile-quantile plot of for C20:1 cis-11. i Manhattan plot showing P-values of association for C22:6 n-3. j Quantile-quantile plot for C22:6 n-3 associated with pleiotropic effects for multiple FA groups (PUFA, n-6, and PUFA/SFA). Additionally, three candidate genes were identified at 100 kb widows around this SNP. Among them, one gene, BAIAP2L2, was located in the upstream region, while two genes, MAFF and TMEM184B, were located at the downstream region. Moreover, we found several large blocks around the significant SNPs, ranging from 110.3 to 110.5 Mb (Fig. 4d).

Discussion
Fatty acids have generally been recognized as essential contributors to the tenderness and flavor of meat [46]. To our knowledge, this study is the first attempt to investigate the molecular mechanisms underpinning FAs using a high-density SNP array in Chinese Wagyu cattle. Our analyses showed that the estimated heritabilities varied among FAs, which is in agreement with previous publications [47][48][49]. This difference could be explained by the genetic architecture of the studied traits, and the effects of candidate SNPs may vary among diverse populations [25,47]. In the present study, C14:1 cis-9 has the highest heritability (0.43) among MUFAs. Inoue et al. [50] and Ekine et al. [11] estimated that the heritability for C14:1 cis-9 were 0.86 and 0.51, which were also the highest among the MUFAs. These reports suggested that the amount of C14:1 cis-9 is likely to be influenced by genetic factors more than other MUFAs.
The heritability for C18:1 cis-9 was 0.25 in our analysis, whereas a relatively high heritability for C18:1 cis-9 (0.42 to 0.78) were reported in Japanese Black cattle [12,32,50]. Indeed, previous studies have reported that the SNPs within the SCD gene showing significant associations with C18:1 cis-13 of Canadian commercial steers as well as in Spanish breeds [51]. The heritability for eight FA groups were higher than those reported by previous studies [9,10]. However, our results are similar to previous reports for the FA groups in Japanese Black cattle [12,32,50]. These results may suggest that the genetic structure of FAs in our population was more similar to Japanese Black cattle. The estimates of heritability for the FAs and FA groups are different across studies. This is particularly evident when the studied breeds are different, which may indicate differences in the genetic architecture of FAs in different populations. However, other factors, such as sample sizes and the statistical models, may also contribute to the difference of heritability estimates across studies [11]. We also observed high SE for some of FA except C16:0, C20:0, C18:2 n-6, C20:2 n-6 and C20:5 n-3, this is probably be explained by the small sample size. Also, population genetic structure and environmental conditions can affect the estimation of heritability.
In total, we identified 23 SNPs associated with nine candidate genes for FA composition in Chinese Wagyu cattle. Among them, we observed four significant SNPs for C14:1 cis-9 located at 21.14 Mb on BTA26, these newly identified SNPs were embedded with Stearoyl-CoA desaturase (SCD). Also, several LD blocks were observed within this gene (Fig. 2). SCD is the key enzyme involved in the endogenous synthesis of conjugated linoleic acid (CLA) and the conversion of saturated fatty acids into mono-unsaturated fatty acids (MUFA) in mammalian adipocytes [33]. Many SNPs near SCD have been reported, which are associated with C14:1 cis-9, C16:1 cis-9, C18:1 cis-9, and C18:2 n-6 in different cattle populations [6,20,23,54]. For instance, two associated SNPs for C14:1 cis-9 were detected at 17.39 Mb and 17.58 Mb on BTA26 in Gifu cattle [6]. Three associated SNPs at 18.99~21.26 Mb on BTA26 for C14:1 cis-9, C16:1 cis-9, C18:1 cis-9, and C18:2 n-6 were detected in Angus and Hereford-Angus crossbred populations in Canada [23]. We suspected that different candidate SNPs for FA composition may be due to different genetic bases across populations. However, no significant SNPs around SCD was found that associated with C18:1 cis-9 in Chinese Wagyu cattle, which was consistent with previous findings in other populations [16,20,21,51]. In addition, the direct effect of polymorphism within SCD on FA composition of milk has been extensively reported [6,8,18,[21][22][23][24][25]. Changes in the enzymatic activity as a result of SCD polymorphism and regulation have been recognized to cause diet-independent variations of CLA content in milk [55].
We also identified three genes, namely, CLIP-associating protein 1 (CLASP1), chymosin (CYM), and phospholipid scramblase family member 5 (PLSCR5) for C22:6 n-3. Among them, CLASP1 located within a QTL region that was related to Warner-Bratzler shear force in Nelore beef cattle [56]. A previous study reported that the CYM is related to immune response and milk fat percentage [57]. In addition, we observed one SNP at 123 Mb in PLSCR, which was associated with C22:6 n-3. PLSCR has been previously reported to be related to reproductive traits in pigs [58,59]. In our study, highly positive phenotypic and genetic correlations were observed between each pairwise comparison of PUFA, PUFA/SFA, and n-6. The phenotypic correlation coefficients between PUFA vs. PUFA/SFA, PUFA vs. n-6, and PUFA/SFA vs. n-6 were 0.825, 0.99, and 0.817, while the genetic correlation coefficients were 0.751, 0.948, and 0.632, respectively. These results suggest that FA groups with high positive correlations may be regulated by the same genes. Indeed, we identified three genes, including BAIAP2L2, MAFF, and TMEM184B, for multiple FA groups (PUFA, n-6, and PUFA/SFA), which may imply their pleiotropic effects for these traits in beef cattle. Among them, we found that MAFF has a high expression in fat tissues as reported by Fagerberg et al. [60]. Another gene, named BAIAP2L2, has been previously reported as related to the marbling score in Korean cattle, and the differentially expressed pattern of this gene is correlated with the expression of several miRNAs [61].
Previous studies suggested that Fatty acid synthase (FASN) and Elongation of very long chain fatty acids protein 5 (ELOVL5) are associated with fatty acid compositions including C14:0, C14:1 cis-9, C16:0, C16:1 cis-9, C18:0, and C18:1 cis-9 in cattle [8]. However, these two genes were not identified in the present study, thus we suspected that heterogeneous genetic architecture of fatty acids exists among different populations. We accurately detected one SNP at 120 Mb on BTA1 for C20:4 n-6, this SNP was embedded with ELOVL7 (suggested P-value = 1.05E-05). In addition, several studies have found candidate SNPs within ELOVL7 associated with FA composition in porcine muscle and abdominal fat tissues [62,63]. In mammals, seven enzymes have been identified in the ELOVL family (ELOVL1-7). The ELOVL enzyme has a distinct distribution in different tissues, and different enzymes exhibit different preferences for the FA substrate. The ELOVL5 and ELOVL6 genes are involved in the production/synthesis of palmitic (C16:0), palmitoleic (C16:1 cis-9), stearic (C18:0), and oleic (C18:1 cis-9) fatty acids. Therefore, the role of ELOVL5 and ELOVL6 in the synthesis of FAs is of great importance for beef cattle breeding programs [24,64,65]. The investigation into the molecular mechanism of the ELOVL gene family can provide valuable insight into improving the composition of beneficial FA in cattle and expanding our knowledge of transcriptional regulation mechanisms in domestic animals [66].

Conclusions
We identified several candidate SNPs and genes for individual FAs and FA groups in Chinese Wagyu cattle. Our findings highlight four novel SNPs in SCD with large effects for C14:1 cis-9 and one SNP with pleiotropic effects related to FA groups.