Genome-Wide Signal Selection Analysis Revealing Genes Potentially Related to Sheep-Milk-Production Traits

Simple Summary In our research, candidate genes related to sheep-milk production were revealed by a genome-resequencing analysis and a genome-signal-selection analysis, and a RT-qPCR experiment was performed to prove the expression levels of these candidate genes, the results showed that the FCGR3A gene’s expression level had a significant negative relationship with sheep-milk production. Abstract Natural selection and domestication have shaped modern sheep populations into a vast range of phenotypically diverse breeds. Among these breeds, dairy sheep have a smaller population than meat sheep and wool sheep, and less research is performed on them, but the lactation mechanism in dairy sheep is critically important for improving animal-production methods. In this study, whole-genome sequences were generated from 10 sheep breeds, including 57 high-milk-yield sheep and 44 low-milk-yield sheep, to investigate the genetic signatures of milk production in dairy sheep, and 59,864,820 valid SNPs (Single Nucleotide Polymorphisms) were kept after quality control to perform population-genetic-structure analyses, gene-detection analyses, and gene-function-validation analyses. For the population-genetic-structure analyses, we carried out PCA (Principal Component Analysis), as well as neighbor-joining tree and structure analyses to classify different sheep populations. The sheep used in our study were well distributed in ten groups, with the high-milk-yield-group populations close to each other and the low-milk-yield-group populations showing similar classifications. To perform an exact signal-selection analysis, we used three different methods to find SNPs to perform gene-annotation analyses within the 995 common regions derived from the fixation index (FST), nucleotide diversity (Ɵπ), and heterozygosity rate (ZHp) results. In total, we found 553 genes that were located in these regions. These genes mainly participate in the protein-binding pathway and the nucleoplasm-interaction pathway, as revealed by the GO- and KEGG-function-enrichment analyses. After the gene selection and function analyses, we found that FCGR3A, CTSK, CTSS, ARNT, GHR, SLC29A4, ROR1, and TNRC18 were potentially related to sheep-milk-production traits. We chose the strongly selected genes, FCGR3A, CTSK, CTSS, and ARNT during the signal-selection analysis to perform a RT-qPCR (Reale time Quantitative Polymerase Chain Reaction) experiment to validate their expression-level relationship with milk production, and the results showed that FCGR3A has a significant negative relationship with sheep-milk production, while other three genes did not show any positive or negative relations. In this study, it was discovered and proven that the candidate gene FCGR3A potentially contributes to the milk production of dairy sheep and a basis was laid for the further study of the genetic mechanism underlying the strong milk-production traits of sheep.


Introduction
The sheep is one of the earliest domesticated-farm-animal species and has experienced evolution and domestication over thousands of years [1]. Dairy sheep are traditionally farmed in southern Europe (France, Italy, Spain, Greece), central Europe (Hungary and the Czech and Slovak Republics), eastern Europe (Romania and Ukraine), and countries in the Middle East, such as Turkey and Iran [2]. Organized sheep-breeding programs were developed from at least the 1960s [3]. The most efficient selection scheme for local dairy sheep is based on the pyramidal management of the population, with the breeders of the nucleus flocks at the top, and pedigree, official milk recording, AI (artificial insemination) breeding, controlled natural mating, and breeding-value estimation (i.e., BLUP) are carried out to make genetic progress. In 2013, dairy small ruminants accounted for a minor part of the total agricultural output in France, Italy, and Spain (0.9 to 1.8%) and a larger part in Greece (8.8%) [4]. In these European countries, the dairy-sheep industry is based on local breeds and crossbreeds raised under semi-intensive and intensive systems, and it is concentrated in a few regions. While with the development of dairy-sheep breeding and the emergence of dairy products emerging, dairy ruminants have become a major part of European agricultural income [5,6]. The average flock size varies from small to medium (140 to 333 ewes/farm), and the average milk yield ranges from low to middle (170 to 500 L/ewe) [7], showing substantial space for improvement in relation to cows' milk [8]. Furthermore, sheep milk has higher protein, fat, lactose, total non-fat solids, and ash contents and a higher nutritional value than cows' and goats' milk [9], which makes it suitable for processing into various types of dairy products. Most sheep milk is sold to industries and then processed into traditional cheese products, most of which are made into Protected Denomination of Origin (PDO) cheeses for gourmets. The animals' udder health is also critical in sheep-milk quantity and quality. Mastitis is an inflammation of the mammary gland that is usually caused by pathogens, mainly bacteria, which develop in the udder tissue after infections in the teat canal. It is one of the most prevalent and costly diseases in the dairy industry due to the significant reductions in milk production and physical harm that it causes [10][11][12]. In previous studies, some quantitative trait loci (QTLs) and SNPs associated with SCC (somatic cell counts) that serve as markers of mastitis were identified based on linkage-disequilibrium analyses of different dairysheep breeds [10,13,14], leading to significant improvements in ewes' ability to resist mastitis [15,16].
The dramatic decreases in the costs of whole-genome sequencing (WGS) and RT-qPCR experiments on animals have made it possible to scan the complete genomes of thousands of animals. Using genome information makes it possible to explain parts of the total genetic variance that are difficult to measure, such as low-heritability, sex-limited, and postmortem traits. To help clarify the link between the genome sequence and real data of important traits, this research was designed to gain a better understanding of sheep lactating genes by comparing the SNP-variant information on different sheep breeds with significantly different dairy-production characteristics. These breeds significantly differ at two dairy levels, since there are high-milk-yield (East Friesian sheep, 700 kg/lactation, Dairy Meade sheep, 500 kg/lactation and Awassi sheep) and low-milk-yield sheep breeds (Hu sheep, small-tailed Han sheep, and Churra sheep) [17]. To this end, we performed a whole-genome sequencing (WGS) analysis of these sheep breeds, as we found that in previous studies, little research was performed using whole-genome sequencing on these dairy sheep populations. First, we re-sequenced the crossed breed of Dairy Meade sheep and small-tailed Han sheep. The information on this sheep breed will be uploaded to a public genome-sequence database for researchers to use freely (NCBI). In our research, a genetic-structure analysis was performed to establish the relationships and geographical distances between 10 sheep breeds, and then genome-wide-signal scans were carried out to identify significant genes associated with sheep-milk production. The genes validated by the RT-qPCR results and their relationship with milk production represent significant progress in our knowledge of the gene-expression levels of lactating sheep and, thus, provide potentially valuable information for future dairy-sheep studies.

Whole-Genome-Resequencing Analysis
We sequenced 41 sheep genomes at an average depth-of-coverage of 10X, which contained 4 sheep breeds. To this end, genomic DNA was extracted from ear tissue using the Wizard ® Genomic DNA purification kit (Promega, A1125, Madison, WI, USA). The A260/A280 ratio using Nano-Drop ND-2000 (Thermo Fisher Scientific, MA, USA) was used to check the quality and integrity of the extracted DNA and agarose-gel electrophoresis was used to check its quality. Good-quality DNA from each collected sample was sequenced on the MGI-SEQ2000 platform from the Beijing Compass Biotechnology Company (Beijing, China). The satisfactory sequence data obtained were used to characterize individual genomes at a minimum of 10X depth of coverage, with more than 30 G of raw data from each sample. After quality control, data were used to perform genome mapping and SNP calling. Resequencing data from a further 60 sheep, downloaded from NCBI, were analyzed by Plink parameters on Linux (San Francisco, CA, USA) (Version V1.90). The details are listed in Section 2.3. All high-quality double-trimmed read pairs were aligned against the reference assembly Oar v.4.0 genome [29] (https://www.ncbi.nlm.nih.gov/assembly/GC A_000298735.2, accessed on 15 March 2022) using BWA software (https://sourceforge.net/ projects/bio-bwa/files/, accessed on 15 March 2022) (version: 0.7.12). Paired-end reads that mapped to exactly the same position on the reference genome were deleted by the Mark Duplicates in Picard (picard-tools-1.56, at http://picard.sourceforge.net, accessed on 15 March 2022). Additional realignment of indels and SNPs was performed by using the Genome Analysis Toolkit (GATKV4.0) (https://gatk.broadinstitute.org/hc/en-us, accessed on 15 March 2022) [30] and sequence alignment (SAM tools) [31]. The GATK was used to identify SNP variation in each individual sample. The SNPs that did not meet the following criteria were excluded: (1) SNP call rate > 99.6%; (2) minor-allele frequency > 0.01; (3) missing rate is lower than 0.1; and (4) all loci followed the Hardy-Weinberg rule; (5) linkage-disequilibrium loci were excluded (R 2 < 2). All SNPs were annotated using the Bio Mart 4.0 (https://asia.ensembl.org/info/data/biomart/index.html, accessed on 15 March 2022) [32] based on the gene-reference genome provided by the Oar v.4.0 genome from NCBI.

Genome-Wide Signal-Selection Scan and Gene Annotation
We used all SNPs that passed quality control to detect signatures of selection in high-and low-milk-yield sheep within 101 sheep genomes. Genome-wide signal-selection analysis was performed by using F ST fixation indices (population-differentiation value), (θπ), the nucleotide-diversity ratio (θπ-HMY/θπ-LMY), HMY (high milk yield), LMY (low milk yield), and the transformed heterozygosity score (ZH P ). The window-based ZH P method was calculated by the formula ZH P = (Hp − µHp)/σHp, where µ is the overall average signal value of heterozygosity and σ is the standard deviation of all windows of each group. High-milk-yield group included 57 samples from five breeds (Dairy Meade sheep = 20, DM (F1) = 15, DM (F2) = 10, East Friesian sheep = 10, and Awassi sheep = 2), while low-milk-yield group included 44 individuals across five breeds (STHS = 5, Hu sheep = 14, Fin sheep = 9, Churra sheep = 6 and Suffolk sheep = 10). The F ST fixation, ZHp [39], and log 2 θπ (high/low), were calculated within 100-kb sliding windows and 10-kb steps to obtain overlapping regions. The variation regions within the highest 5% of all three statistics were considered as candidate selections, and then candidate genes were annotated by a genomic-database search and annotating engine, Bio Mart (https: //asia.ensembl.org/info/data/biomart/index.html, accessed on 15 March 2022) [40].

Gene Ontology and Kyoto Encyclopedia of Genes and Genomes
Functional enrichments for Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were performed using g: Profiler (https:// biit.cs.ut.ee/gprofiler/gost, accessed on 1 June 2022) for all selected genes. The enrichment significance was assessed using a condition for g: SCS 'Set Counts and Sizes', and the p value was set at <0.05.

Validation of RT-qPCR Experiment
We randomly selected 11 Lacaune sheep in from a farm in Belgium (https://lesfauve slaineux.wordpress.com/le-troupeau/, accessed on 1 September 2022) to determine their genes' expression levels. These were frequently selected in our research, since Lacaune sheep are among the most famous and heavily produced dairy sheep in France, and imported into Belgium. This farm efficiently recorded the milk-production data from different milk lactations and information on newly lambing ewes. Fresh milk was sampled from 11 newly lambed ewes. From each ewe, 50 mL of milk was sampled in non-RNA se tubs. All these steps complied with the requirements of European Animal Welfare Committee. The milk was stored at 4 • C during transport from farm to laboratory and processed immediately for somatic cell isolation. Subsequently, 50 uL of 0.5 M EDTA was added for each 50-mL milk sample. The samples were then centrifuged for 10 min at 2000× g, after which the supernatant containing cream and skin milk was removed, followed by washing in 10 mL PBS. Samples were then centrifuged a second time, the maximum amount of supernatant was removed, and the lysed samples were stored at −80 • C. The RNA was extracted according to the procedure of the isolation of RNA from non-fibrous tissue (Promega Inc., Madison, WI, USA), after which concentration and A260/A280 of RNA were tested; the standards were concentration >10 ng/ul and 2.2 > A260/A280 > 1.8. Satisfactory RNA was used to perform reverse transcription in accordance with Protocole reverse transcription (Promega Inc., Madison, WI, USA). The conditions of reverse transcription were as follows: 25 • C, 5 min; 42 • C, 60 min; 70 • C, 15 min, 4 • C, 20 min; 4 cycles in total. The cDNA was stored at −20 • C. Quantitative real-time PCR was performed using DNA, premiers were designed to amplify target genes and reference genes, reference genes were selected from NCBI database, which were similarly expressed in all tissues, the genes' premier pairs for RT-qPCR were designed by Eurogentec CO., (Brussel, Belgium), and RT-qPCR analyses were performed using Rotor Gene 6000 system (Corbett) and SYBR green (Gembloux, Belgium). Each amplification reaction contained 2 µL of cDNA and 18 µL of SYBR master mix (Thermo Fisher Scientific, MA, USA), including 500 nM of primers at a final volume of 20 µL. Reactions were performed as follows: 95 • C for 5 s, 60 • C for 30 s, 72 • C for 45 s, for 40 cycles in total, with data collection at the annealing step.

The Relationship between Gene Expression and Milk Production
Candidate genes from RT-qPCR experiment were used to perform a linear correlation analysis between genes' relative expression values and milk production. The milkproduction recordings were from Belgian farm with the same Lacaune sheep as those in 2022 (Data S4), as well as the production data from 2015 to 2021 (Data S5). We used the BLUPF90+ (http://nce.ads.uga.edu/html/projects/programs/, accessed on 15 March 2023) program to obtain estimated values (similar to estimated breeding values) of milkproduction traits of each ewe from 2015 to 2021. Usually, higher estimated values meant the animal had stronger production traits, so we used these values to conduct a linear analysis with ∆CT gene values and actual milk production from 2022, as well as the estimated values from period of 2015-2021 to show their relationships. Animal model is needed in BLUPf90+. In our research, the animal model was as follows: y ijk = A i + S j + βx ijk + e ijk , where y ijk is observed milk production, A i is fixed effect of lactation days, lactation number, birth year, and milking times, S j is random effect of animals, βx ijk is regression coefficient of variance of 1.0, and e ijk is residual effect. The estimations were given after data analysis using BLUPF90+ program.

Genomic Variants and Principal Component Analysis
We combined the genome sequences of 101 high-and low-production dairy sheep. The genomes were mapped and the SNPs were called using PLINK. After the joint calling and quality control, we detected 4,751,642 SNPs in total. To study the population stratification, three different approaches were employed, of which PCA is the first to be discussed here. The intention behind the use of PCA was to reduce the dimensionality of the genomic relationship matrix so that individuals (and breeds) were separated along different principal components (PCs). The first three principal components were able to explain most of the variation: PC1 explained about 50.26%, PC2 explained about 23.51%, and PC3 explained about 14.95%. Principal component 1 (PC1) differentiated the breeds in this study into three main clusters ( Figure 1A): Cluster1 (high-yield breeds, EFR, DM, DMF1, and DMF2); Cluster2 (Mongolian low-yield breeds, HS, STHS, CS, and AWS) and Cluster3 (low-yield breeds, FS and SFK). However, overlap was observed among the DM sheep, DMF1 and DMF2, within Cluster 1. The reason for this may have been that these sheep have extremely close genetic relations. The PCA separated the different sheep breeds in this study, with SFK the most clearly separated due to its longer distance from and non-blood relations with the other sheep. In the PCA plots, the high-yield breeds were closer to the Mongolian sheep than to the low-yield sheep, and in the high-yield cluster, the EFR sheep were clearly separated from the DM sheep. The separation of the clusters corresponded to the geographical origins and blood mixes of these sheep breeds, and may have also revealed their genetic distance.

Ancestor Analysis
To further explore the population structure among individuals from the different breeds in this study, a model-based hierarchical clustering analysis was undertaken for K-values of 2-7 (with K the user-defined number of biological ancestral populations). The cross-validation (CV) estimates revealed the K-value of 3 to be the best fit for the nine populations; it demonstrated the lowest CV error among all the K values. The bar plot of the ADMIXTURE results from the K-values of 2 and 3 is presented in Figure 1C. The analysis with K = 2 separated the high-milk-yield sheep breeds from the low-yield breeds. When K = 3, the FS and SFK were separated from the other breeds in the low-yield group. The DM and EFR sheep had the highest milk yield among all the populations, the Mongolian CHS, HS, and STHS sheep had average milk production [26], and the FS and SFK sheep produced the least milk. It is known that FS and SFK sheep are clearly separated from the other two sheep populations, and this was validated by the results obtained from the PCA analysis and the admixture analysis.

Neighbor-Joining-Tree Analysis
In this study, the phylogenetic neighbor-joining tree divided the high-and lowproduction groups into nine different clusters ( Figure 1B). The nine main branches were as follows: DM, DMF1, DMF2, EFR, CS, FS, HS, STHS, and SFK. According to the tree, we observed that the DM and DMF2 sheep branches were close to each other and that some of the DM, DMF1, and DMF2 sheep were mixed together, indicating their genetic similarity and probable sharing of a single ancestral line. The DM sheep is the hybrid offspring of EFR breeds and New Zealand sheep [41], so it was the second-nearest to the EFR sheep. Our phylogenetic results showed two breeds (DMF1 and DMF2) in the different sub-branches, which might have been due to the hybridization between the DM breeds and the STHS breeds. For the HS and STHS sheep, all the Mongolian sheep had similar genetic information, and their branches were close to each other. Similar to the PCA, there was a close genetic distance between the FS and SFK sheep, since there are shorter geographical distances between them. In the NJ-tree picture, the Awassi breed was not clearly separated, possibly due to its small numbers. It should be noted that the results of the phylogenetic analysis were reliable due to the large number of SNPs from the 10 sheep breeds used in this study.

Ancestor Analysis
To further explore the population structure among individuals from the different breeds in this study, a model-based hierarchical clustering analysis was undertaken for K-values of 2-7 (with K the user-defined number of biological ancestral populations). The cross-validation (CV) estimates revealed the K-value of 3 to be the best fit for the nine populations; it demonstrated the lowest CV error among all the K values. The bar plot of the ADMIXTURE results from the K-values of 2 and 3 is presented in Figure 1C. The analysis with K = 2 separated the high-milk-yield sheep breeds from the low-yield breeds. When K = 3, the FS and SFK were separated from the other breeds in the low-yield group. The DM and EFR sheep had the highest milk yield among all the populations, the Mongolian CHS, HS, and STHS sheep had average milk production [26], and the FS and SFK sheep produced the least milk. It is known that FS and SFK sheep are clearly separated from the other two sheep populations, and this was validated by the results obtained from the PCA analysis and the admixture analysis.

Signal-Selection Analysis and Gene Ontology
We scanned the genomes of 101 high-and low-milk-yield sheep for signals of positive selection with different traits. To achieve this, we calculated three complementary statistics along the sheep reference genome Oar v.4.0 (https://www.ncbi.nlm.nih.gov/assembl y/GCA_000298735.2, accessed on 15 March 2022) using 100-kb-long sliding windows and 15-kb step sizes. The first statistic was the population-differentiation index F ST , for identifying genomic regions with different allelic frequencies between high and low groups. The second statistic measured the differences in nucleotide diversity between the two groups (θπ(high/low)), with high or low θπ values indicating positive selection of different milk-production sheep, respectively. The third measure was also used to calculate the selected candidate SNPS using the index of the transformed heterozygosity score (ZH P ) ( Figure 2). With regard to the limiting of the false-positive identifications, we considered the 5% most heavily overlapping regions from all three scans, which provided a total of 995 overlapping SNP regions and 553 protein-coding genes and small RNA. The selection candidates identified in the high-and low-yield groups are provided in Data S1. Most of the protein-coding genes were significantly enriched using the functional gene ontology and KEGG categories related to the protein binding, RNA binding, molecular transport, and cytokine-receptor activity (p = 9.23 × 10 −3 ) (Figure 3, Data S2). Importantly, through the gene annotation, the three highest 5% genomic regions providing the strongest signatures of selection containing FCGR3A, CTSK, CTSS, ARNT, GHR, and SLC29A4 were selected as the strongest annotation genes by comparing the high-and low-milk-yield groups.

RT-qPCR Results
We chose the four genes that were most frequently selected by the signature-selectio analysis shown in Figure 2, which were FCGR3A, CTSK, CTSS, and ARNT, according the NCBI database and previous studies. We chose GAPDH gene as reference gene to pe form RT-qPCR validation. The premier sequence is listed in Table 2. All the gene-expre sion values were derived from 11 ewes; the CT and ΔCT values of each gene are listed DataS3, with ΔCT values meaning the relative expression of each gene.  20 10 nmol TE-100 um *-F means the forward premiers of genes; -R means the reverse premiers of genes.

The Relationship between Gene Expression and Milk Production
The milk-production data were sourced from the Belgian farm. They covered the a erage daily milk production from four months in 2022: September, October, Novembe and December (Data S4). A linear analysis of the ΔCT values of each gene was performe

RT-qPCR Results
We chose the four genes that were most frequently selected by the signature-selection analysis shown in Figure 2, which were FCGR3A, CTSK, CTSS, and ARNT, according to the NCBI database and previous studies. We chose GAPDH gene as reference gene to perform RT-qPCR validation. The premier sequence is listed in Table 2. All the gene-expression values were derived from 11 ewes; the CT and ∆CT values of each gene are listed in Data S3, with ∆CT values meaning the relative expression of each gene.

The Relationship between Gene Expression and Milk Production
The milk-production data were sourced from the Belgian farm. They covered the average daily milk production from four months in 2022: September, October, November, and December (Data S4). A linear analysis of the ∆CT values of each gene was performed using the daily average milk production from 2022 and values estimated using the program of BLUPf90+ (Data S5), and the correlations between daily milk production and the estimated values are also shown in Figure 4. The results showed that the FCGR3A gene had significant correlations with milk production and estimate values. The higher ∆CT values meant lower expression values during the lactation period. It was revealed that the FCGR3A gene had a significantly negative relationship with milk production and the estimate values (R 2 = 0.8452, R 2 = 0.7382); furthermore, as predicted, the relationship between actual milk production in 2022 and the estimate value was positive (R 2 = 0.6988. The CTSK, CTSS, and ARNT did not show significant differences between their expression values, the milk production values, and the estimate values, but the correlation coefficients between actual milk production in 2022 and the estimate values for the 2015-2021 milk production were positive (R 2 = 0.6988, R 2 = 0.628, R 2 = 0.628).
using the daily average milk production from 2022 and values estimated using the program of BLUPf90+ (Data S5), and the correlations between daily milk production and the estimated values are also shown in Figure 4. The results showed that the FCGR3A gene had significant correlations with milk production and estimate values. The higher ΔCT values meant lower expression values during the lactation period. It was revealed that the FCGR3A gene had a significantly negative relationship with milk production and the estimate values (R 2 = 0.8452, R 2 = 0.7382); furthermore, as predicted, the relationship between actual milk production in 2022 and the estimate value was positive (R 2 = 0.6988. The CTSK, CTSS, and ARNT did not show significant differences between their expression values, the milk production values, and the estimate values, but the correlation coefficients between actual milk production in 2022 and the estimate values for the 2015-2021 milk production were positive (R 2 = 0.6988, R 2 = 0.628, R 2 = 0.628).

Discussion
In this study, we sequenced the genomes of 101 sheep with high and low levels of milk production. Using a neighbor-joining-tree analysis, we divided these sheep into 10 groups according to their genetic relationships and background, and then, using a principal component analysis and a structure analysis, these 10 groups were divided into three clusters: high-milk-yield sheep (EFR, DM, DMF1, DMF2), Mongolian low-milk-yield sheep (HS, CS, AWS, STHS), and low-milk-yield sheep (SFK and FS). The scanning of the genomes of high-and low-yield sheep breeds revealed that the FCGR3A ARNT, CTSK, CTSS, and GHR genes were the strongest candidates. According to the RT-qPCR experiment and the analysis of the relationship between these candidate genes' ∆CT values and milk production, the FCGR3A gene had a significant relation with Lacune sheep milk production. All of these genes have also been reported to be highly expressed during the lactation period in cattle [42] and buffalo, reflecting the similar biological functions of these genes when they are expressed in lactating mammary glands. The GO and KEGG analyses showed that most of these genes were significantly enriched for mammary-gland-specific GO terms (cytokine-receptor activity and protein binding), as well as establishing the cellular functions and protease bindings. These genes were found to have high and low levels of milk production, milk protein, milk fat, milk lactose, cheese traits, somatic cell counts, etc. For some of these traits, it is not easy to find a direct relationship with milk quantity. This was probably the reason why we did not find a significant correlation coefficient between the CTSK, CTSS, or ARNT expression values and milk production. Furthermore, as we found in many publications, some of the most significant genes in our research play a crucial role in sheep-mastitis traits and in the immune-response process, which improves milk production. Sheep-mammary-gland mastitis is also a common source of economic losses on sheep farms, so it is also important for us to focus on some of the genes related to mammary-disease resistance.

Candidate Genes Potentially Associated with Some Milk Traits
The FCGR3A (Fc fragment of IgG, low-affinity IIIa, receptor) gene is considered a novel and promising candidate for relieving stress, inflammation, and disease [43], as well as dairy-cattle mastitis. The binding of FCGRs to the Fc region of immunoglobulins mediates a variety of immune functions, such as antigen presentation, the clearance of immune complexes, the phagocytosis of pathogens, and cytokine production [44], they work together to resist viral injection to improve udder health and, thus, increase the production of milk. It is reasonable to suggest that lactation periods cause the increase in expression of a large number of genes, resulting in improvements in performance, as mastitis or Staphylococcus aureus infections often occur during lactation [45,46]. With regard to CTSS (Cathepsin S)-encoding protease, Sodhi et al. (2021) found a higher expression of most of the cathepsin genes (CTSS, CTSD, and CTSK) during the mid-to-late lactation stages, emphasizing their potential roles in milk synthesis, since the expressions of most of the proteases were higher during peak lactation [47]. The higher expression of cathepsin-encoding genes during late lactation stages could be attributed to the fact that CTSS plays a crucial role in mammary-gland involution [48]. Previous studies found the role of the CTSK gene in influencing cheese traits, as it is a protease-encoding casein that increases milk protein [49]. Similar results were found when evaluating the expression patterns of important proteasepathway-associated CTSK genes derived from different lactation stages in Sahiwal cows and Murrah buffalo. In both breeds, the RNA-expression levels of these genes were higher in the late lactation stages than in the early lactation stages. Lactation induces bone loss in order to provide sufficient calcium in milk, and to prevent this from taking place, the CTSK gene elevates its expression to increase osteocyte numbers, in order to maintain the balance of bone and milk calcium [50,51]. This proves that the CTSK gene plays an important role in milk-calcium traits. Furthermore, a differential expression analysis of Churra and Assaf sheep allowed us to notice some genes that were significantly and differentially expressed between the two breeds. These genes were mainly associated with protein-protease activity. Furthermore CTSK was differentially expressed and was selected as a candidate gene associated with cheese traits in this study [49]. These findings confirmed previous results, which highlighted the importance of the expression of genes encoding for some proteases in sheep milk. The aryl hydrocarbon receptor nuclear translocator (ARNT) can interact with the AHR aryl hydrocarbon receptor). The AHR is restricted to the cytoplasm in its unbound state [52,53]. Once activated, the AHR is combined with the nucleus of the ARNT and forms an active complex with the AHR nuclear translocator (ARNT) to alter the expressions of target genes [54]. It was reported that an association between AHR activation and ARNT causes changes in milk production [55]. More specifically, pregnant mice exposed to TCDD (an ARNT agonist) in vivo produced lower levels of the milk proteins, β-casein and whey protein [56]. The GHR is a regulator of developing growth and, as a growth hormone, it has important effects on carbohydrate, protein, and lipid metabolism. In cattle, mutations in GHR have been associated with milk yield and composition in Ayrshire, Holstein, and Jersey cattle. Dettori et al. (2018) reported that variations of the ovine GHR gene might affect milk-quality traits in Sarda sheep [57]. The rs55631463 GHR SNP genotype affects milk fat and protein yield, and the rs411154235 SNP is associated with lactose content in milk, as it encourages the transformation of glucose into glycogen to help glycogen cellular deposits [58].
In our study, except for FCGR3A, the strongest gene candidates, CTSK, CTSS, ARNT, were not found to be significantly relevant to milk-production quantity. This was potentially due to the small number of Lacune sheep selected to validate their function. Since our objective was to detect milk-yield genes, not milk-composition genes, it was difficult to directly find the relationships between these genes, which may participate in milk-protein or milk-fat synthesis. Thus, further measures need to be applied in the future to prove that these genes are correlated with other sheep-milk traits. In particular, Dairy Meade should be tested, since this breed, or crosses in which it is involved, comprised a high proportion of the high-milk-yield group.

Conclusions
In conclusion, our research is the first attempt to report a genome-signature-selection scan revealing genes that are associated with high-and low-milk-yield sheep breeds by using whole genome sequencing. Some of the most significant candidate genes associated with milk yield were identified through a combination of genome-signature-selection scanning and RT-qPCR experimentation. In the high-and low-yield groups, 553 genes were detected that were enriched with significant protein-receptors-combing pathways. Under selective pressure, we selected the four genes, FCGR3A, CTSS, CTSK, ARNT, that were most likely to be related to milk-production traits, in order to perform a correlations analysis, in which FCGR3A was found to have a significant relationship with daily milk yield, as it participates in the process of immune reaction. Furthermore, since higher gene-expression values mean severer mastitis, they are negatively correlated with milk production. These results were supported by the correlation between the FCGR3A ∆CT values and the estimated values of the corresponding ewes. Therefore, these results could provide a better genetic perspective on the phenotypic differences between different-milk-yield groups for similar studies. Our findings provide an insight into the dynamic characterization of sheep-mammary-gland gene expression, and the identified candidate genes can provide valuable information for future functional characterization, as well as contributing to a better understanding of the genetic mechanisms underlying the milk-production traits in sheep.