The Genetic Diversity of Mink (Neovison vison) Populations in China

Simple Summary Mink (Neovison vison) is well-known as one of the most important sources of fur and is used as feed across the North zone of China. Using double-digest restriction site-associated DNA sequencing, this study evaluated the genetic diversity and population structure of a five color of mink populations in China (ddRAD-seq). The black mink population was found to be genetically differentiated from other color type populations, whereas a clustering of the red mink and black mink populations was observed; further, other color populations clustered separately. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes analysis showed that the genes with a selection signature were enriched in the melanogenesis pathway. The results of our study provide basic information on mink diversity, and theoretical concepts for the conservation and exploitation of mink breeds in China. Abstract The American mink (Neovison vison) is a semiaquatic species of Mustelid native to North America that is now widespread in China. However, the knowledge of genetic diversity of mink in China is still limited. In this study, we investigated the genetic diversity and identified significant single nucleotide polymorphisms (SNPs) in mink populations of five different color types in three different mink farms in China. Using double-digest restriction site-associated DNA sequencing, we identified a total of 1.3 million SNPs. After filtering the SNPs, phylogenetic tree, Fst, principal component, and population structure analyses were performed. The results demonstrated that red mink and black mink grouped, with separate clustering of all other color types. The population divergence index (Fst) study confirmed that different mink populations were distinct (K = 4). Two populations with different coat colors were subjected to the selection signature analysis, and 2300 genes were found to have a clear selection signature. The genes with a selection signature were subjected to Gene Ontology (GO) categorization and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis, the results revealed that the genes with a selection signature were enriched in the melanogenesis pathway. These study’s findings have set the stage for improved breeding and conservation of genetic resources in real-world practical mink farming.


Introduction
The farm mink (Neovison vison) is a semi-aquatic carnivore that traditionally originated from the Northern Hemisphere [1]. Due to their importance in the fur industry, these animals and their skins were transported from Western Europe to the Far East, especially China in the late 20th century [2]. According to the data for 2021, approximately 20 million mink skins are produced globally; China contributed 25% of the total mink skin produced A total of 97 unrelated minks were obtained from three different farms in China and included 50 white minks (WM), 20 black minks (BLM), 10 sliverblue minks (SM), 10 brown minks (BM), and 7 red minks (RM). WM, SM, and BM were from Huilongwan breeding mink farm in Yichun City; BLM was from Zhuojia mink farm in Jilin City; and RM was from Shandong mink farm in Weifang City. Blood samples were obtained from the paws of minks and stored at −20 • C. Genomic DNA was extracted from these whole blood samples using Easy Pure Blood Genomic DNA kit according to the manufacturer's instructions, and each DNA sample was evaluated using gel electrophoresis and stored at −20 • C. Animal management and sampling procedures were performed following the standards of the Principle of Laboratory Animal Care and the guidelines prescribed by the Animal Research Committee of the Institute of Special Animals and Plants Sciences, Chinese Academy of Agricultural Sciences (Protocol code no. ISAPSAEC-2022-62M).

Library Construction and Variant Calling
Double digest RAD sequencing (ddRAD-seq) libraries were constructed following a modified protocol [21]. Genomic DNA was digested by PstI and MspI enzymes, and resulting DNA fragments were amplified and subsequently purified by VAHTSTM DNA Clean Beads. After separation on a 2% agarose gel, DNA in the range of 220-450 bp (with indices and adaptors) was excised using a gel extraction kit (Qiagen, Hilden, Germany). Finally, the samples were sequenced by the Illumina NovaSeq platform at Shanghai Personal Biotechnology Co., Ltd., Shanghai, China.
Each sample was assessed in terms of the quality of sequencing reads using the Fast QC software (v0.11.7) [22]; Low-quality reads were trimmed l, and overlapping reads (at least 11 nucleotides, per default) were merged (collapsed). The resulting high-quality reads were aligned to the latest reference genome of the standard black mink, which accessed on 16 December 2022 (https://www.ncbi.nlm.nih.gov/assembly/GCF_020171115.1/) using BWA-mem v 0.7.12 [23]. High confidence SNPs were called by GATK (V3.8). Variants were filtered by GATK to using the criteria a quality by depth (QD) < 2.0, mapping quality (MQ) < 40.0, Fisher strand (FS) > 60.0, mapping quality rank-sum test < −12.5, and read position rank sum test < −8.0. Furthermore, all SNPs with a minor allele frequency (MAF) < 0.05, call rate < 0.90 and those deviating from the Hardy-Weinberg equilibrium (p < 10 −6 ) were filtered out. In addition, individuals with >0.15 missing genotype were discarded from the dataset using VCFtools 35. Minor allele frequency was computed for all SNPs and the proportion of SNPs was determined for MAF ranges of <0.05, 0.05 to <0.1, 0.1 to <0.2, 0.2 to <0.3, 0.3 to <0.4 and 0.4 to ≤0.5. Functional annotation of the SNPs and indels was performed using the ANNOVAR2017 software (2017-07-16) [24].

Genetics Analyses
A total of 1,396,257 SNPs were used to calculate the phylogenetic distance among mink populations. The phylogenetic tree was constructed using FastTree (version 2.1.11) software to determine the evolutionary relationship among mink populations. Principal component analysis (PCA) was performed using the GCTA software (version 1.26.0). Plink was used to analyze the genetic structure of the mink populations based on different color type. The number of genetic clusters was predefined from K = 2 to 10 and repeated each Kvalue three times. Admixture software (Version 1.3.0) was used to calculate ∆K. Observed heterozygosity (Obs Het), observed homozygosity(Obs Hom), expected heterozygosity (Exp Het), expected homozygosity(Exp Hom), nucleotide diversity (π) and inbreeding coefficients (Fis) were calculated using the populations' program in STACKS. The Fst values were calculated for each SNP of all pairwise subgroups using variant call format (VCF) tools. We studied the genetic structure of different mink populations using phylogenetic tree construction, PCA, and population structure analysis, whereas the genetic analysis of population structure was carried out using ADMIXTURE software (Version 1.3.0).

Identification of Selective Sweeps
Candidate genes in the sweep regions of different mink populations were analyzed using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Top GO software (Version: 2.38.1) was used to perform the GO enrichment analysis and the pathway information is fetched from KEGG website. phyper is used to do the hypergeometric tests. Calculated p-values were corrected for the false discovery rate (FDR) while applying an FDR threshold of ≤0.05. Pathways meeting the criteria were considered significantly enriched pathways.

ddRAD-Seq and Data Filtering
Before quality filtering, ddRAD-seq generated 390.08G of raw data for 97 normally sequenced individuals, with an average of 4.02 Gb per sample and a range of 2.31 to 6.23 Gb. 362.63 Gb of clean data (2.16 Gb to 5.21 Gb for each sample, with an average of 3.74 Gb) were retained after the sequence data underwent quality filtering. Each sample received an average of 26.71 million clean reads that were retained. The number of reads on the alignment was mostly above 93.21%, the percentage of high-quality clean reads was above 94.52%, and the average effective mapping rate was above 98.86%. Hence, our sequencing data were of a high quality (Q20 > 96%, Q30 > 91%), with a stable GC content that ranged from 49.89 to 54.95% (Supplementary Tables S1-S3).

Variation Calling
A total of 1,396,257 high-quality SNPs were identified after SNP calling on 97 samples, with sequencing depths ranging from 19.05 to 20.95 (Table 1). Each animal had the same number of SNPs, which ranged from 9,211,330 in the BM population to 1,4738,128 in the red mink, but the number of SNPs in each individual was the same, and the number of identified homozygous SNPs varied from 203,639 to 234,273 and from 74,386 to 88,124 (

Genetic Diversity
We detected deficits in all Obs values in all mink populations and the Obs values were compared with the Exp values. RM showed the lowest Exp Het, whereas the highest Obs Het was observed in BLM, indicating the occurrence of self-crossing in the populations. The intra-population inbreeding coefficient of Fis ranged from 0.2441 to 0.4402. We estimated the genome-wide nucleotide diversity using the SNP data. π values, which ranged from 0.2099 to 0.3503, demonstrated that the nucleotide diversity of BLM was the highest. The average Obs Het, Exp Het, π, and Fis were 0.1226, 0.1948, 0.2272, and 0.2308, respectively ( Table 4). The estimated genetic distances according to the Fst metric varied widely between 0.0833 and 0.1657 among the populations ( Table 5). The highest and lowest genetic distances were observed between RM and SM and BM and WM, respectively.

Genetic Diversity
We detected deficits in all Obs values in all mink populations and the Obs values were compared with the Exp values. RM showed the lowest Exp Het, whereas the highest Obs Het was observed in BLM, indicating the occurrence of self-crossing in the populations. The intra-population inbreeding coefficient of Fis ranged from 0.2441 to 0.4402. We estimated the genome-wide nucleotide diversity using the SNP data. π values, which ranged from 0.2099 to 0.3503, demonstrated that the nucleotide diversity of BLM was the highest. The average Obs Het, Exp Het, π, and Fis were 0.1226, 0.1948, 0.2272, and 0.2308, respectively ( Table 4). The estimated genetic distances according to the Fst metric varied widely between 0.0833 and 0.1657 among the populations ( Table 5). The highest and lowest genetic distances were observed between RM and SM and BM and WM, respectively.

Genetic Structure
Similar patterns were shown by genetic analysis of the phylogenetic tree and PCA. The findings indicated that SM, BM and WM were grouped independently. Some minks were congregated, with RM minks clustered with BLM minks, while others distinguished themselves and created a new group (Figure 2A,B). The ideal value of K was 3 based on each k's cross-validation error ( Figure 2C). However, the clusters were not produced from BLM, BM, or SM. When K was adjusted to 2, RM was separated from other populations, and the latter WM was further separated when K was set to 3. Several mink populations were divided at K = 4.

Genetic Structure
Similar patterns were shown by genetic analysis of the phylogenetic tree and PCA. The findings indicated that SM, BM and WM were grouped independently. Some minks were congregated, with RM minks clustered with BLM minks, while others distinguished themselves and created a new group (Figure 2A,B). The ideal value of K was 3 based on each k's cross-validation error ( Figure 2C). However, the clusters were not produced from BLM, BM, or SM. When K was adjusted to 2, RM was separated from other populations, and the latter WM was further separated when K was set to 3. Several mink populations were divided at K = 4.

Selection Signatures and Enrichment
Fst and π, which were selected in the top 5% of regions,were used in selection sweeps. Five mink populationsʹ comparisons of 2300 genes with a selection signature revealed 10 highly enriched terms (p < 0.05), including three functional terms for biological processes (BP), four for molecular functions (MF), and three for cellular components (CC). For BP terms, the cysteinyl-tRNA aminoacylation, columnar cuboidal epithelial cell growth, and sulfur amino acid catabolic process were determined to have the strongest

Selection Signatures and Enrichment
Fst and π, which were selected in the top 5% of regions, were used in selection sweeps. Five mink populations' comparisons of 2300 genes with a selection signature revealed 10 highly enriched terms (p < 0.05), including three functional terms for biological processes (BP), four for molecular functions (MF), and three for cellular components (CC). For BP terms, the cysteinyl-tRNA aminoacylation, columnar cuboidal epithelial cell growth, and sulfur amino acid catabolic process were determined to have the strongest GO connections. For the MF keywords, thiamine binding (GO: 0006772), flavin-linked sulfhydryl oxidase activity, cysteine-type peptidase activity, and melatonin receptor activity (GO: 0008502). Contrarily, CC keywords were connected to the melanosome (GO: 0042470), pigment granule (GO: 0048770), and photoreceptor outer segment. Some of the most important genes were found to be abundant in the melanogenesis pathway, cysteine and methionine metabolism, Wnt signaling route, MAPK signaling system, and phototransduction in the KEGG enrichment analysis of the genes with a selection signature. The development of melatonin and the hair structure was also linked to 12 genes, including Wnt11, Wnt5B, PSMB3, CCND1, MTNR1B, EGF, DN-EGF, WD, WIPI1, IGFII, IGFBP7, BMP6, and TGF-β ( Figure 3, Supplementary Tables S5 and S6).

Discussion
ddRAD-seq and other related approaches can provide a picture of the distribution of SNPs throughout the whole genome, they have been widely used in genetic and population structure analyses [25]. In the present study, a total of 1.39 M SNPs were identified from 97 animals belonging to five different color populations, enabling us to predict the relationships among these populations of animals. This number of SNPs identified was significantly higher than that reported previously, for instance, 52,714 SNPs of the LD pattern across the American mink genome by GBS [26] and 34,816 SNPs with body size and pelt length in American mink by GBS [16]. This should be attributed to the use of different restriction enzymes that reduce the complexity of the genome for sequencing. Furthermore, 98.86% of clean reads were mapped to the mink reference genome, with subsequent SNP calling. These data are consistent with the population genomic research (on average, 98.24%) performed using whole genome sequencing [20] in American minks. This indicated that the identified SNPs can be used in downstream analysis.
The Fst varied between 0.0839 and 0.2241 for various color types, which is within the range of another study conducted at 12 locations in southern Chile (Fst varied between 0.017 and 0.364) [10]. According to the findings of this research, Thirstrup et al. [18] revealed modest genetic divergence (Fst) across mink color types (0.076) and between mink from two distinct geographic origins (0.087), with the highest Fst (0.14) reported between Pastel and Stardust color-types in American mink.
From the PCA, phylogenetic tree map, and structure analyses, most animals of RM and BM populations clustered together, whereas other populations (SM, BM, and WM) clustered separately. It indicates that the genetic relationship of between BM and RM populations was relatively close. The standard type of American mink in nature was more

Discussion
ddRAD-seq and other related approaches can provide a picture of the distribution of SNPs throughout the whole genome, they have been widely used in genetic and population structure analyses [25]. In the present study, a total of 1.39 M SNPs were identified from 97 animals belonging to five different color populations, enabling us to predict the relationships among these populations of animals. This number of SNPs identified was significantly higher than that reported previously, for instance, 52,714 SNPs of the LD pattern across the American mink genome by GBS [26] and 34,816 SNPs with body size and pelt length in American mink by GBS [16]. This should be attributed to the use of different restriction enzymes that reduce the complexity of the genome for sequencing. Furthermore, 98.86% of clean reads were mapped to the mink reference genome, with subsequent SNP calling. These data are consistent with the population genomic research (on average, 98.24%) performed using whole genome sequencing [20] in American minks. This indicated that the identified SNPs can be used in downstream analysis.
The Fst varied between 0.0839 and 0.2241 for various color types, which is within the range of another study conducted at 12 locations in southern Chile (Fst varied between 0.017 and 0.364) [10]. According to the findings of this research, Thirstrup et al. [18] revealed modest genetic divergence (Fst) across mink color types (0.076) and between mink from two distinct geographic origins (0.087), with the highest Fst (0.14) reported between Pastel and Stardust color-types in American mink.
From the PCA, phylogenetic tree map, and structure analyses, most animals of RM and BM populations clustered together, whereas other populations (SM, BM, and WM) clustered separately. It indicates that the genetic relationship of between BM and RM populations was relatively close. The standard type of American mink in nature was more brownish than the present type of standard black and standard brown, and standard black was the result of a selection of dark colors among the original dark brown wild minks [27]. The WM of this study was Albino, also called pink-eyed white mink, which was a recessive mutant of standard mink. The first mutant of the regular mink, the Sliverblue mink, first appeared in Wisconsin, USA, in 1931. Farm samples displayed a distinct pattern of genetic subdivision in a prior study, and White, Pearl, Silver, and Sapphire were easily distinguished as distinct clusters with high odds of population membership [18]. The majority of the RM in China was imported from Russia in the 1950s, while the black mink in our study was imported from North America in the 1990s when the American mink was first introduced as a furbearer to the eastern and southern former Soviet Union [28]. Russian mink were huge with long, coarse guard hair, brown fur, and subpar reddish wool.
To enhance the quality of the Red mink's fur, Black minks, which were little but had silky, short guard hair and dense wool, were utilized [29]. Our findings were in line with a previously published study of several color types of mink, in which structural analysis revealed that Mahogany looked to represent an admixture between Black and Brown mink.Since the color-phase breeding approach is used, genetic selection or genetic drift may have varied effects on mink depending on their line and farm [30]. Due to the frequent occurrence of the drift in cconfined populations of animals created by tiny, isolated populations, American mink are particularly vulnerable to this genetic mechanism [31].
Variation in coat color is thought to be a marker of domestication and a distinct trait of a few breeds [32].. In contrast, coat color and fur quality were essential features in mink breeding generally and in the mink industry specifically because of their effects on the eventual economic worth of fur [33].
Over a century of the occurrence of new mutations and selection on the initial wild brown and black phenotype resulted in 35 mutations and more than 100 forms from their combination ( Table 6). Mammals' various coat colors are initially related to the sort of melanin that melanocytes generate. In our work, GO analysis was enriched to demonstrate significant population variation in melanosome, pigment granule, melatonin receptor function, phototransduction, and photoreceptor outer segment. Coat color is determined by the amounts and types of melanosomes produced and released by melanocytes resident in the skin [34]. The principal role of the melanocyte is to produce the pigment granule within melanosomes and to transfer these organelles to keratinocytes [35]. Melatonin, a significant indoleamine neuromodulator implicated in circadian rhythms and sleep patterns, regulates diverse rhythmic functions by activating its high-affinity G-protein-coupled receptors [35]. In the adult rat retina, the melatonin receptor MT 1 immunocytochemically locates in the IPL, OPL, and ipRGCs, while MT 1 receptor immunoreactivity is found in many mouse retinal neurons, including rod and cone photoreceptors and ipRGCs [36]. In our research, the melanogenesis pathway, Wnt signaling pathway, MAPK signaling pathway, Cysteine, and methionine metabolism, and phototransduction were enriched by 11 genes, including Wnt 11, Wnt5B, PSMB3, CCND1, TGFβ, MTNR1B, EGF, DN-EGF, WD, WIPI1, and IGFII. In the International Federation of Pigment Cell Societies(IFPCS), 661 genes related to coat color have been published so far [37]. The research on fur development mainly focuses on the molecular mechanism of hair follicle development. Most of these key signaling molecules of hair follicles belong to Wnt (Wingless-related), Shh (Sonic Hedgehog), BMP (Bone morphogenetic protein), FGFs (Fibroblast growth factors), TGF(Transforming growth factor), and Notch signaling pathway [38]. Hsiang Ho [39] found that WD repeats domain, phosphoinositide-interacting 1 (WIPI1) coordinates melanogenic gene transcription and melanosome formation. Fibroblasts act on melanocytes directly and indirectly through neighboring cells by secreting a large number of cytokines, proteins, and growth factors (KGF, bFGF, TGF-β) which bind to receptors and modulate intracellular signaling cascades (MAPK/ERK, Wnt/β-catenin) related to melanocyte functions [40]. The Wnt pathway was significant in fur quality, which is the biological pathway considered to be the key regulator of hair follicle morphogenesis [41]. The canonical Wnt pathway has been reported for its important role in skin pigmentation and melanogenesis in chickens [18]. In our research, we found Wnt 11, Wnt5B, PSMB3, and CCND1 in the Wnt signaling pathway. Many cell growth factors have been reported to be regulators of hair follicle growth and development, including hepatocyte growth factor (HGF), insulin-like growth factor-1 (IGF-1), and epidermal growth factor (EGF). Wu [42] discovered that BMP6 and Wnt10b cooperate to control the activation of HFSCs and the transition of hair follicles from telogen to anagen. Consequently, our findings show that a range of genes, including WD, WIPI1, TGF, Wnt, and EGF, can influence melanin deposition in minks in different ways to change the color of the mink coat. In conclusion, selection signature analysis was used to investigate the genetic analyses and genomic resources related to the coat color of Chinese mink breeds. Our findings revealed 1.39 million SNPs in five color populations, which may be used to create SNP arrays for upcoming genomics initiatives in the Chinese mink population.

Conclusions
In this study, we explored the genetic diversity, and genetic organization of different mink in China using the ddRAD-seq technology to discover SNPs at the genome level. Also, we discovered the potential genes that were associated with melanogenesis. The findings of our study offer important data for the preservation and use of Chinese mink breeds, as well as a clear direction for upcoming breeding efforts.
Supplementary Materials: The following supporting information can be downloaded at: https://www. mdpi.com/article/10.3390/ani13091497/s1, Table S1: List of quality filtering of reads on 97 American mink genomes; Table S2: List of sequencing data alignment on 97 American mink genomes; Table S3: List of sequencing data quality on 97 American mink genomes; Table S4: Indel test result statistics; Table S5: GO classification of the selected genes in BLM_mink and SM mink; Table S6: KEGG enrichment of the selected genes in BLM mink and SM mink.