Characterization of casein variants in the Guaymi and Guabala breeds through a low-density chip arrangement

ABSTRACT Studies of the genetic diversity of the Guaymi and Guabala cattle breeds have shown the need to evaluate various components, including the polymorphism of casein genes. The objective of this work was to characterize the casein variants in the Guaymi and Guabala landraces by means of a low-density SNP array. Twenty-four SNP markers were typed in samples of Guabala and Guaymi Creole cattle. The values of Ho, He, and Fis, considering only the polymorphic loci in the Guabala breed, were 0.438, 0.449, and 0.011, respectively. In the Guaymi breed, Ho, He, and Fis at the polymorphic loci were 0.513, 0.405, and −0.281, respectively. The effective number of alleles was 1.167 in Guabala and 1.257 in Guaymi. This study determined the genetic diversity of the casein group in the Guaymi and Guabala breeds; however, few polymorphic loci were observed, particularly in the Guabala breed. Both breeds had high frequencies of the A2A2 genotype at rs43703011 (CSN2), which is considered favourable for the production of quality milk. The identified markers will allow the design of strategies to reduce the levels of inbreeding and to better understand the aptitudes of both breeds in terms of productivity.

Casein genes are closely related to the quality and productivity of milk and its derivatives. CSN1S1 is associated with high milk production, as well as protein content (Eenennaam and Medrano 1991). CSN2 is of particular importance since it is related not only to high milk yield and quality (Kučerova et al. 2006) but also to a healthier product, particularly its A2A2 variant, unlike the A1A1 variant, which has been associated with the bioactive peptide β-casomorphin and human health risk factors such as ischaemic heart disease, arteriosclerosis, type I diabetes, sudden infant death syndrome, and autism (Kaminski et al. 2007; Kost et al. 2009; Cieslinska et al. 2012). CSN1S2 is associated with protein yield (Nilsen et al. 2009). CSN3 codes for a milk protein that is very important for the stability of the structure of casein micelles, milk production, and cheese quality (Alexander et al. 1988; Alim et al. 2014). Laible et al. (2016) showed that milk protein genes have the potential to be used to improve bovine milk composition. The studies of genetic diversity that have been carried out in the Guaymi and Guabala breeds (Delgado et al. 2011; Ginja et al. 2019; Villalobos-Cortes et al. 2021a) have highlighted the need to evaluate various factors of productivity and milk quality, such as the variability of casein genes. The objective of this study is to characterize the caseins in the genomes of the Guaymi and Guabala breeds of Panama by genotyping single-nucleotide polymorphisms (SNPs).

Sample collection
The polymorphism of 24 SNP markers of the caseins CSN1S1 (2), CSN2 (8), CSN1S2 (2), and CSN3 (12) was evaluated in 34 samples of Criollo Guabala (15) and Guaymi (19) cattle. The animals were selected within the conservation centres on the basis of a previous genetic characterization, to guarantee the purity of both breeds (Villalobos-Cortes et al. 2020). These SNPs were selected from an array of 10,000 SNP markers on an Affymetrix Axiom OrcunSNP Array platform, as part of the Innovative Management of Animal Genetic Resources (IMAGE) project in the Horizon 2020 framework programme.

Genomic DNA isolation
Five-millilitre samples of venous blood were taken from each animal and kept cold until arrival at the laboratory. DNA was extracted using the commercial kit DNeasy Blood and Tissue from Qiagen, obtaining an average concentration of 45 ng/ml and a volume of 50 µl per sample, with a total amount of 2.5 µg of DNA. Affymetrix analyses complied with the Nagoya protocol. Of the 10,000 SNPs selected, 8416 met the criteria recommended by the company, with a conversion threshold of 0.6. All SNPs were aligned with the UMD 3.1.1 reference genome (Elsik et al. 2016). The results obtained in VCF format were validated and transformed into GDA format using the program PGDSpider 2.1.1.5 (Lischer and Excoffier 2012), then converted to text and Excel formats. To verify the positions of the SNPs, the Integrative Genomics Viewer program IGV 2.9.4.03 (Robinson et al. 2011) was used along with the Genome Data Viewer of the National Center for Biotechnology Information (NCBI), with the same reference genome, UMD 3.1.1. SNPs that had a reference number (RefSNP) were located in the reference genome ARS-UCD1.2 using Ensembl (Howe et al. 2021) and the European Variation Archive (Cezard et al. 2021).
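As an illustration of the kind of tally that underlies the conversion step described above, the following minimal Python sketch counts genotype classes for one SNP from a single VCF data line. It is not the PGDSpider workflow used in this study; the function name `genotype_counts`, the identifier `snp_example`, and the genotype values are hypothetical, and the sketch assumes a tab-delimited VCF line with a `GT` entry in its FORMAT column.

```python
from collections import Counter


def genotype_counts(vcf_line):
    """Tally genotype classes for one tab-delimited VCF data line.

    Returns (variant_id, Counter), where keys are unphased genotypes
    such as "0/0", "0/1", "1/1", plus "missing" for uncalled samples.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    variant_id = fields[2]                 # third VCF column is the ID
    format_keys = fields[8].split(":")     # ninth column describes the sample fields
    gt_index = format_keys.index("GT")
    counts = Counter()
    for sample in fields[9:]:
        genotype = sample.split(":")[gt_index].replace("|", "/")
        alleles = genotype.split("/")
        if "." in alleles:
            counts["missing"] += 1
        else:
            # Sort alleles so that "1/0" and "0/1" fall in the same class.
            counts["/".join(sorted(alleles))] += 1
    return variant_id, counts


if __name__ == "__main__":
    # Hypothetical line for illustration only (not data from this study).
    line = "6\t100\tsnp_example\tC\tA\t.\tPASS\t.\tGT\t0/0\t0/1\t1/0\t./.\t0|0"
    print(genotype_counts(line))
```

Counting per line keeps the sketch independent of any VCF library; a real pipeline would also need to skip header lines beginning with `#` and handle multi-allelic sites.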

Genetic diversity analysis
To evaluate the genetic variability within each population, the following parameters were calculated: percentage of polymorphic loci, observed heterozygosity (Ho), expected heterozygosity (He), effective number of alleles (Ne), and deviations from Hardy-Weinberg (HW) equilibrium in each population, calculated by the exact test using the Markov chain method with a chain length of 1,000,000 and 100,000 dememorization steps (Guo and Thompson 1992). Gene and genotypic frequencies and Fis, Fst, and Fit values were also calculated (Wright 1965; Weir and Cockerham 1984). These analyses were performed with GENETIX 4.02 (Belkhir et al. 2003), GenAlEx 6.501 (Peakall and Smouse 2012), and ARLEQUIN 3.5 (Excoffier et al. 2005); the Shannon diversity index was calculated using GenAlEx 6.501. The polymorphic variants were queried in Cattle QTLdb (Zhi-Liang et al. 2007) to identify possible associations with traits of economic interest.
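For reference, the per-locus parameters named above are conventionally defined as follows for a locus with k alleles at frequencies p_i in a sample of N animals. These are the standard textbook forms, given here as an assumption about what the cited programs compute; the programs may apply sample-size corrections or the Weir and Cockerham (1984) estimators, so values obtained from these expressions need not match the reported ones exactly.

```latex
\begin{align}
H_o &= \frac{\text{number of heterozygous animals}}{N}, \\
H_e &= 1 - \sum_{i=1}^{k} p_i^{2}, \\
N_e &= \frac{1}{\sum_{i=1}^{k} p_i^{2}} = \frac{1}{1 - H_e}, \\
I   &= -\sum_{i=1}^{k} p_i \ln p_i, \\
F_{is} &= 1 - \frac{H_o}{H_e}, \qquad
(1 - F_{it}) = (1 - F_{is})(1 - F_{st}).
\end{align}
```

A monomorphic locus gives H_e = 0, N_e = 1, and I = 0, which is why means taken over all loci are pulled towards these values when few loci are polymorphic.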

Results
Of the 24 markers analysed, 23 were considered usable; in the Guabala breed, 20.83% of polymorphic loci were obtained (5), and in the Guaymi breed, 37.50% of polymorphic loci were obtained (8). Most of the variants were identified by a RefSNP number; the exceptions, one belonging to the CSN2 gene (6:87183034) and three belonging to the CSN3 gene (6:87390198, 6:87390448 and 6:87390604), are given by their location in the reference genome (UMD 3.1.1, Genome Data Viewer of NCBI). The values of Ho, He, and Fis (Table 1), considering only the polymorphic loci in the Guabala breed, were 0.438, 0.449, and 0.0108, respectively. In the Guaymi breed, Ho, He, and Fis, also considering the polymorphic loci, were 0.513, 0.405, and −0.281, respectively. The Ne obtained in the Guabala breed was 1.167 and in the Guaymi breed 1.257, both considered low.
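The way such per-locus values arise from genotype counts can be sketched in Python for a single biallelic SNP. This is a minimal illustration using uncorrected textbook formulas, not the estimators implemented in GENETIX, GenAlEx, or ARLEQUIN; the function name `diversity_stats` and the genotype counts in the example are hypothetical and are not data from this study.

```python
import math


def diversity_stats(n_ref_hom, n_het, n_alt_hom):
    """Uncorrected diversity statistics for one biallelic SNP.

    Arguments are counts of reference homozygotes, heterozygotes,
    and alternate homozygotes.
    """
    n = n_ref_hom + n_het + n_alt_hom
    if n == 0:
        raise ValueError("no genotyped individuals")
    p = (2 * n_ref_hom + n_het) / (2 * n)   # reference allele frequency
    q = 1.0 - p                             # alternate allele frequency
    homozygosity = p * p + q * q            # expected under random mating
    ho = n_het / n                          # observed heterozygosity
    he = 1.0 - homozygosity                 # expected heterozygosity
    ne = 1.0 / homozygosity                 # effective number of alleles
    # Shannon index with natural logarithm; absent alleles contribute 0.
    shannon = -sum(f * math.log(f) for f in (p, q) if f > 0)
    # Fis is undefined at a monomorphic locus (He = 0).
    fis = 1.0 - ho / he if he > 0 else float("nan")
    return {"p": p, "Ho": ho, "He": he, "Ne": ne, "I": shannon, "Fis": fis}


if __name__ == "__main__":
    # Hypothetical counts for 20 animals: 10 ref/ref, 8 ref/alt, 2 alt/alt.
    print(diversity_stats(10, 8, 2))
```

A negative Fis from this function indicates more heterozygotes than expected, and a positive value indicates a deficit; averaging Ne over all loci, including monomorphic ones where Ne equals 1, lowers the population mean.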
The general mean of the Shannon index considering the polymorphic loci was 0.173; by population, it was 0.130 in Guabala and 0.215 in Guaymi. In the Guabala breed, most of the markers showed heterozygote deficits but no deviations from HW equilibrium. An excess of heterozygotes was obtained at all the markers evaluated in the Guaymi breed, with the greatest difference being observed at rs133474041 (p < 0.05).
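For a biallelic SNP, the conditional distribution of heterozygote counts on which the HW exact test is based can be enumerated completely, which makes the logic of the test easy to inspect. The Python sketch below does this by full enumeration; it is a swapped-in simplification and not the Markov chain procedure of Guo and Thompson (1992) used in this study. The function names and the genotype counts in the example are hypothetical.

```python
from math import exp, lgamma, log


def _log_prob(n_het, n_minor, n):
    """Log probability of n_het heterozygotes given the allele counts."""
    n_major = 2 * n - n_minor
    n_hom_minor = (n_minor - n_het) // 2
    n_hom_major = (n_major - n_het) // 2
    return (
        n_het * log(2)
        + lgamma(n + 1)
        - lgamma(n_hom_minor + 1) - lgamma(n_het + 1) - lgamma(n_hom_major + 1)
        + lgamma(n_minor + 1) + lgamma(n_major + 1)
        - lgamma(2 * n + 1)
    )


def hwe_exact_p(n_aa, n_ab, n_bb):
    """Two-sided exact HW p-value for one biallelic locus.

    Sums the probabilities of all heterozygote counts that are no more
    probable than the observed one, conditional on the allele counts.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped individuals")
    n_a = 2 * n_aa + n_ab
    n_minor = min(n_a, 2 * n - n_a)
    # Heterozygote counts must share the parity of the allele count.
    probs = {
        h: exp(_log_prob(h, n_minor, n))
        for h in range(n_minor % 2, n_minor + 1, 2)
    }
    p_obs = probs[n_ab]
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical counts: 3 AA, 14 AB, 2 BB (an excess of heterozygotes).
    print(hwe_exact_p(3, 14, 2))
```

Enumeration is exact and fast for two alleles; Markov chain sampling becomes useful mainly when many alleles make the number of possible genotype tables too large to list.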
Table 2 describes the allelic frequencies of polymorphic variants of the casein gene group. In the Guabala population, the SNP rs133474041 of the CSN1S1 gene showed the highest reference allele (G) frequency, 0.800. In the Guaymi population, the RefSNPs rs43703011 (CSN2), rs441966828 (CSN1S2), and rs439304887 (CSN3) all presented the highest frequencies (0.842) of their reference alleles G, C, and A, respectively. The SNP rs43703011 of the CSN2 gene was monomorphic for allele C (1.000) in the Guabala breed and had a frequency of 0.842 in the Guaymi breed. The CSN1S2 gene was polymorphic at rs441966828 in the Guaymi breed, with a C frequency of 0.842. In the case of CSN3, of the 12 SNPs evaluated, 5 were polymorphic in the Guaymi breed and 4 in Guabala. The genotypic frequencies by population, considering the polymorphic markers, showed higher values of homozygosity in the Guabala breed than in the Guaymi breed: GG at rs133474041 (0.640 and 0.468), CC at rs450402006 (0.537 and 0.433), CC at rs43703015 (0.321 and 0.250), AA at rs43703016, and AA at rs110014544 (0.321 and 0.223), respectively.
Regarding the fixation indices, or F statistics, for both populations, the values of Fis, Fit, and Fst were −0.174, −0.135, and 0.033, respectively, none of which was significant. (2021b) in Guabala and Guaymi populations. The high percentage of monomorphic loci could be counterproductive, since it may reflect homozygous regions produced by inbreeding, as also suggested by the low effective number of alleles (Ne), for which the Guabala breed has the lowest value (1.167). This pattern is common in local breeds with small population sizes, where the increase in inbreeding is one of the most relevant problems and entails negative effects such as the reduction of phenotypic values (Mastrangelo et al. 2016). This trend could be reversed by reorganizing the mating systems between producers' farms (new breeders' organizations of Guaymi and Guabala cattle have recently been identified, with whom new crossbreeding strategies could be developed) and conservation centres or in vitro germplasm banks (FAO 2007). Another alternative that cannot be ruled out would be to implement absorption crossing strategies with genetically close populations.
The allelic frequencies of the SNP rs133474041 are lower than those reported by Kolenda and Sitkowska (2021), who report a frequency of 0.994 for the G variant and 0.061 for the A variant in the Holstein-Friesian breed from Poland. Regarding rs109299401 (CSN2), the T variant was more common in the Guabala breed than reported by Kolenda and Sitkowska (2021), who described a frequency of 0.930. In the Guaymi population, it was lower, at 0.789. The frequencies of allele C at SNP rs43703011 of the CSN2 gene in both Guaymi and Guabala are higher than those reported by Kolenda and Sitkowska (2021) and by Bisutti et al. (2022) in Holstein cattle, with a frequency of C of 0.560.
This group of alleles has been of growing interest because some studies suggest that the A1A1 variant may produce intolerance and gastrointestinal problems (Jianqin et al. 2016; Nuhriawangsa et al. 2021), type 1 diabetes mellitus in infants (Elliott et al. 1999; Chia et al. 2017), and ischaemic heart disease in adults (McLachlan 2001), effects associated with the release of beta-casomorphin-7 owing to the presence of histidine (His67), unlike the A2A2 variant, which is associated with health benefits (Brooke-Taylor et al. 2017), although some evidence goes against this (Venn et al. 2005; Cass et al. 2008). The two alleles differ by one amino acid: the original codon CCT, which codes for proline in variant A2, mutates to CAT, which codes for histidine, in variant A1 at position 67 of CSN2 (Bâlteanu et al. 2010; Oleński et al. 2012). Allele A2 represents the original form of the gene in the genus Bos. It encodes the A2 allelic form of beta-casein, which is present in the milk of many mammals, such as humans, sheep, goats, and bovines (Ng-Kwai-Hang and Grosclaude 2003).
The frequency of allele C at the rs441966828 locus of the CSN1S2 gene was lower than the values reported by Vanvanhossou et al. (2021) in the African breeds Lagune and Somba (1.000) and the crosses of Borgou and Pabli (0.980). Likewise, Meier et al. (2019) reported that the German Black Pied and Holstein-Friesian populations were monomorphic for C. The frequencies of allele C at rs450402006 in Guaymi (0.658) and Guabala (0.733) were lower than the 0.939 reported by Kolenda and Sitkowska (2021). As for the T allele at rs43703015 of the CSN3 gene, the frequencies reported in this study are lower than those obtained in Germany by Meier et al. (2019) in the German Black Pied (0.867) and Holstein-Friesian (0.797) populations and in the Holstein-Friesian breed (0.992) by Kolenda and Sitkowska (2021), and similar to those reported by Vanvanhossou et al. (2021) in Benin and Nigeria (0.500).
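The A1/A2 distinction discussed above rests on a single base change in codon 67 of CSN2 (CCT for proline in A2, CAT for histidine in A1). A minimal Python sketch of that mapping follows; the function names are hypothetical, and the lookup table is deliberately restricted to the two codons named in the text rather than the full genetic code.

```python
# Only the two codons described for position 67 of beta-casein.
CODON_TO_RESIDUE = {"CCT": "Pro", "CAT": "His"}
RESIDUE_TO_VARIANT = {"Pro": "A2", "His": "A1"}


def classify_codon67(codon):
    """Return (amino acid, beta-casein variant) for a codon-67 sequence."""
    residue = CODON_TO_RESIDUE.get(codon.upper())
    if residue is None:
        raise ValueError(f"codon {codon!r} is not covered by this sketch")
    return residue, RESIDUE_TO_VARIANT[residue]


def differing_positions(codon_a, codon_b):
    """Return the 1-based codon positions at which two codons differ."""
    pairs = zip(codon_a.upper(), codon_b.upper())
    return [i + 1 for i, (a, b) in enumerate(pairs) if a != b]


if __name__ == "__main__":
    print(classify_codon67("CCT"))
    print(classify_codon67("CAT"))
    print(differing_positions("CCT", "CAT"))
```

The sketch works on the coding sequence only; relating it to the alleles reported for a given SNP on a genotyping array would additionally require knowing which strand the array reports.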
The frequencies of the C allele of the SNP rs43703016 in the creole breeds of this study were lower than those obtained in the Holstein-Friesian populations in Germany (Kolenda and Sitkowska 2021) but similar to those reported in the Lagune and Somba breeds in Benin (Vanvanhossou et al. 2021). The frequency of A (1.000) at the SNP rs439304887 in the Guabala breed was equal to that obtained in the Holstein-Friesian breed of Poland (Kolenda and Sitkowska 2021), while in Guaymi, an allele C frequency equal to 0.842 was observed. At the SNP rs110014544 of CSN3, the breeds had G allele frequencies of 0.433 (Guabala) and 0.528 (Guaymi), which were slightly higher than those seen in the dairy breeds cited above.
These high levels of homozygosity in both populations can be attributed to the small number of animals of these breeds within conservation programmes. Population censuses reported by Delgado et al. (2018) in Panama estimate that the Guaymi breed represents 0.08% and the Guabala 0.05% of the livestock population. It is necessary to continue working on crossing models and conservation strategies that include conservation centres with in situ, in vivo, and in vitro modalities. It is also vital to promote the creation of breeders' associations that ensure the sustainability of the breeds over time, which would increase the number of animals, prevent inbreeding, and initiate genetic improvement processes, which have not yet been developed. It is also important to consider new analysis tools, such as genomic evaluation, to determine with greater precision the population structures of these landraces, the presence of homozygous segments, and genomic inbreeding, among other analyses (Kardos et al. 2015).
When we consulted the polymorphic variants in Cattle QTLdb, four of them, two in the CSN2 gene and two in the CSN3 gene, were positive for traits of economic interest. In the CSN2 gene, the rs109299401 variant has been associated with somatic cell count, longevity, milk yield, and protein yield, and the rs43703011 variant with somatic cell count, longevity, fat yield, and protein yield. In the CSN3 gene, the rs43703015 variant has been associated with curd firmness and fat yield, and the rs43703016 variant with protein percentage (Schopen et al. 2011; Fontanesi et al. 2014; Viale et al. 2017).

Conclusion
This study determined, for the first time, the genetic diversity of the casein group in the Guaymi and Guabala populations, which had few polymorphic loci, particularly Guabala. Genotypic and allelic frequencies for Guaymi and Guabala cattle were similar to those reported in several Bos taurus breeds. Both breeds had a high prevalence of the A2A2 genotype at the SNP rs43703011 of CSN2, which is considered favourable for the production of good-quality milk. With the emergence of a new association of Guaymi and Guabala cattle breeders in Panama (ACCRIPA) and the results obtained in this work, it is proposed to redesign the mating system so that it includes these new herds. This will allow greater efficiency in conservation programmes and a reduction in the inbreeding generated by the low number of animals, in addition to allowing the breeds to be used for commercial purposes, such as the production of type A2A2 milk.

Table 1. Mean Shannon index (I), observed heterozygosity (Ho), expected heterozygosity (He), and HW equilibrium of casein gene variants of the Guaymi and Guabala breeds.

Table 2. Allelic frequencies of polymorphic variants of casein genes of the Guaymi and Guabala breeds (reference genome UMD 3.1.1).