Integrative Analysis of Methylation and Transcriptional Profiles to Reveal the Genetic Stability of Cashmere Traits in the Tβ4 Overexpression of Cashmere Goats

Simple Summary Cashmere goats have double coats consisting of non-medullated fine inner hairs or cashmere fibers produced by secondary hair follicles (SHFs) and guard hairs produced by primary hair follicles (PHFs). Cashmere is an important economic product worldwide and the world market for cashmere is increasing while the current production of cashmere is limited. Thymosin β4 (Tβ4), a 4.9-kDa protein, contains 43 amino acids. Here, we produced Tβ4 overexpression (Tβ4-OE) offspring using two methods. The somatic cell nuclear transfer (SCNT) goats had increased hair follicle development and higher cashmere yields than wild type (WT) and natural mating (NM) goats. Taken together, our results showed that DNA methylation affected the expression of differentially expressed genes (DEGs) between generations and the genetic stability of cashmere traits. Abstract DNA methylation alteration is frequently observed in exogenous gene silencing and may play important roles in the genetic stability of traits. Cashmere is derived from the secondary hair follicles (SHFs) of cashmere goats, which are morphogenetically distinct from primary hair follicles (PHFs). Here, in light of having initially produced 15 Tβ4 overexpression (Tβ4-OE) cashmere goats which had more SHFs than the wild type (WT) goats, and produced more cashmere, we produced Tβ4-OE offsprings both via somatic cell nuclear transfer (SCNT) and via natural mating (NM). However, the desired trait exhibited lower fixation in the line-bred offspring compared to the SCNT offspring. Integrative analysis of methylation and transcriptional profiles showed that this might be due to the influence of methylation on the expression of differentially expressed genes (DEGs) between generations, which was mutually consistent with the results of the functional and pathway enrichment analysis of differentially methylated regions (DMRs) and DEGs. Overall, our study systematically describes the DNA methylation characteristics between generations of cashmere goats and provides a basis for improving genetic stability.


Introduction
Offspring, which can stably inherit the fine trait, always attract interests of researchers in line breeding throughout the world and great attentions of industry [1]. In recent years, with the development of molecular biology and gene editing technology, we can obtain a large number of individuals with excellent target traits [2][3][4][5][6]. Therefore, we are eager to pass this trait on to future generations, so as to expand its production scale and better serve human beings. However, due to the genetic stability of traits [7], it is not always possible to achieve the desired goals. In addition, the region and time of breed breeding also affect the final breeding process [8]. Therefore, we need to determine the reasons that affect the genetic stability of traits and effectively avoid genetic instability, so as to select and breed new strain lines with good production values for promotion and utilization.
DNA methylation alteration has been observed in various animal development stages and is considered to be a cause of genetic instability [9,10]. Global hypomethylation is frequently seen in highly and moderately repeated DNA sequences and plays a key role in chromosomal instability [7]. Hypermethylation in gene promoter regions, such as in exogenous genes, is usually related to gene silencing [11]. Therefore, studying the methylation degree of genes between generations can lay a foundation for determining the factors that affect genetic stability.
Cashmere is derived from the secondary hair follicles (SHFs) of cashmere goats, which are morphogenetically distinct from primary hair follicles (PHFs). In contrast to PHFs (which produce guard hairs), SHFs do not contain medulla and instead generate fibers that are soft and delicate [12,13]. Indeed, one of the key goals of cashmere goat breeding is to increase the number of SHFs carried by the goat.
Thymus beta 4 (Tβ4), first isolated by Goldstein and White from the thymus of a calf in 1966 [14] and consisting of 43 amino acid residues, is a highly conserved G-actin sequestering protein with a wide range of functions, such as promoting cell migration, angiogenesis, and wound healing [15]. Recent studies have shown that Tβ4 induces rapid hair growth around the wound margin and promotes hair growth on the dorsal skin of mice and on the depilated areas of rats [16,17]. Moreover, the overexpression of Tβ4 in hair follicles can improve cashmere yield in cashmere goats [18]. In the last 10 years, our team has initially produced 15 Tβ4 overexpression (Tβ4-OE) cashmere goats, which had more secondary hair follicles than the WT goats and produced more cashmere (unpublished observations).
In this study, the offspring of Tβ4-OE cashmere goats were bred through somatic cell nuclear transfer (SCNT) and natural mating (NM), and the genetic stability of their cashmere production performance was further evaluated. At the same time, the strategy of genome-wide methylation sequencing combined with transcriptome sequencing was adopted to analyze the methylation differences between the original generation and the offspring of the Tβ4-OE cashmere goats, thereby providing a theoretical basis for further explaining the genetic instability between generations of genetically modified cashmere goats.

Cell Culture
Goat fetal fibroblasts (GFbs) were isolated from 35-day-old fetal Tβ4-OE and WT goats as previously described [19]. The Tβ4-OE and WT GFbs, as the donor cells for SCNT, were cultured in Dulbecco's modified Eagle's medium/F12 containing 10% fetal bovine serum in 5% CO 2 at 37 • C in a humidified incubator for 3-5 days and then passaged and frozen for future use.

Generation of Cloned Goats via Somatic Cell Nuclear Transfer (SCNT)
SCNT was performed as described previously [19]. Briefly, oocytes matured in vitro were enucleated, and selected polyclonal and donor cells were injected into the perivitelline space. The oocyte-donor cell pairs were fused and activated. After being cultured for 20 h, cloned embryos were surgically transferred into the oviducts of the surrogates.

Measurement of Body-Related Indices
The body mass of all first filial generation of Tβ4-OE and WT were determined at birth. Body mass, cashmere weight, cashmere thickness, and fiber length were monitored yearly until the goat died.

RNA Extraction and Quantitative Real Time-PCR (qRT-PCR)
Total RNA was isolated from tissue samples of six typical representatives, two parents and their twin offsprings with RNAiso Plus* (TaKaRa Bio, Shiga, Japan). cDNA was synthesized from 1 µg of total RNA with a PrimeScript RT reagent Kit with a gDNA Eraser (Perfect Real Time) (TaKaRa Bio, Shiga, Japan), following the manufacturer's instructions. qPCR was performed using SYBR Premix Ex Taq II (TaKaRa Bio, Shiga, Japan) on a 7500 Real-Time PCR System (Applied Biosystems, Munich, Germany) [20].

Western Blot Analysis
Expression of the Tβ4 in the skin samples of six typical representatives was detected by western blot analysis. Total protein was extracted from the skin samples obtained from the body side of the goats, separated by 10% sodium dodecyl sulfate polyacrylamide gel electrophoresis, and transferred to a nitrocellulose membrane. This membrane was blocked with 5% skimmed milk at room temperature for 1 h and incubated overnight at 4 • C using an anti-Tβ4 antibody (1:1000 dilution) and an anti-GAPDH antibody (1:10,000 dilution) as a loading control. After incubation, the membrane was rinsed sequentially with phosphate-buffered saline and phosphate-buffered saline containing 0.05% Tween-20 solution. Subsequently, the membrane was treated with a secondary goat anti-rabbit antibody. Protein bands were visualized using a Tanon-5200 image-analysis system (Tanon Science and Technology Co., Ltd, Shanghai, China) [21].

Hematoxylin-Eosin (H&E) and Immunohistochemistry (IHC) Staining
Skin tissues obtained from the six typical representatives were prefixed with 4% paraformaldehyde for 48 h, dehydrated in a series of alcohol concentrations, transferred into xylene, and embedded in paraffin. We cut 5 mm sections from each embedded tissue samples, and stained the sections with H&E. Stained sections were dehydrated and sealed with a cover slip [2]. The Tβ4 expression levels were detected by immunohistochemistry following the previous description [22,23]. Moreover, the ratio of secondary and primary hair follicles (SHFs/PHFs ratio) in sections of skin tissue from six typical representatives was statistically determined.

DNA Extraction
Goat genomic DNA samples used for the whole-genome bisulphite sequencing (WGBS) were extracted and purified from skin tissues of two parents and their twin offsprings (200-500 mg per sample) using a DNA extraction kit (Promega, Madison, WS, USA). A total of 1 µg qualified gDNA (concentration ≥50 ng/µL, A260/A280 = 1.8-2.0, without RNA, protein and degradation) was used for each library construction. A K5500 spectrophotometer was used to detect the purity of DNA, and 1% agarose gel electrophoresis was used for genome integrity identification. A Qubit ® 3.0 Fluorometer was used for the quantitation of DNA.

Whole Genome Bisulphite Sequencing (WGBS) and Data Processing
The volume of the combined gDNA sample and unmethylated lambda DNA controlled was adjusted to a total volume of 80 µL using 1× TE, and the DNA was fragmented to around 300 bp. Blunt-ended fragments was created by filling in or chewing back 3' and 5' overhangs. Phosphorylation of the 5' ends ensured that the fragments were suitable for ligation. The next step was dA-tailing of the 3'end of the end-repaired, phosphorylated fragments. The single A overhang enabled ligation to adaptors with single T overhangs. The methylated adaptor containing sequences required downstream in the sequencing workflow were ligated to the dA-tailed fragment. 2% agarose gel selected the 350-500 bp fragments. The bisulfite conversion technique involved treating DNA with bisulfite, which converted unmethylated cytosines into uracil. Methylated cytosines remained unchanged during the treatment. The uracil-binding pocket of KAPA HiFi DNA polymerase was inactivated, enabling amplification of uracil containing DNA. Furthermore, converted DNA fragments were PCR amplified and sequenced using HiSeq ×10 platform (Illumina, San Diego, CA, USA) [24]. To obtain high-quality clean reads, raw reads were filtered using the following criteria: (1) remove reads containing more than 10% unknown nucleotides (N); (2) remove low quality reads containing more than 40% low quality (Q-value ≤ 20) bases.

Methylation Level Analysis
The obtained clean reads were mapped to the species reference genome using the BSMAP software by default [25]. Then, a custom Perl script was used to determine the amounts of methylated cytosine and calculate methylation based on the methylated cytosine percentage in the entire genome, in each chromosome, and in different regions of the genome for each sequence's context (CG, CHG, and CHH; H = A, C, or T). To assess the different methylation patterns in different genomic regions, methylation profiles at 5 -flanking 2kb regions and gene sequences (or transposable elements, TEs) were plotted based on the average methylation for each 100 bp interval.

Differentially Methylated Regions (DMRs) Analysis
The DMRs for each sequence context (CG, CHG, and CHH; H = A, C, or T) between the samples were identified according to the following stringent criteria: more than five methylated cytosine molecules in at least one sample; the total depth of sequencing for each methylation cytosine site was >10, and the depth of support for methylation cytosine was >4; the region length measured between 40 bp and 10 kb; the distance between the adjacent methylated sites was <200 bp; the fold change of the average amount of methylation was >2; the Pearson's chi-square test (χ 2 ) value was p ≤ 0.05. At the adjacent 2 kb (upstream or downstream) or body regions of the genes or TEs; the putative overlapping DMRs were sorted for further study.

Transcriptome Analysis
We generated sequencing reads of 12 mRNA libraries on a HiSeq X10 platform. After removing the reads, we retained >6 Gb clean data for each sample. The reads were aligned using Bowtie2 v2.2.3 based on the Capra_hircus_ARS1 reference genome [26]. We successfully aligned 88.89-98.13% of the paired reads with concordant alignments to the reference genome. The expected fragments per kilobase of transcript sequence per millions base pairs sequenced (FPKM) value for each gene was calculated to estimate expression, and DESeq2 v1.6.3 was used to identify the DEGs [27]. The obtained p-values were corrected using the Benjamini and Hochberg algorithm for controlling the false discovery rate; genes were considered differentially expressed when q ≤ 0.05 and |log2_ratio| ≥ 1.

Enrichment Analysis of Differentially Expressed Genes (DEGs) and Differentially Methylated Regions (DMRs)
We conducted Gene Ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses using the R package "clusterprofiler" [18]. GO terms or KEGG pathways with adjusted p < 0.05 were considered statistically significant.

Statistical Analyses
Means, SDs, and SEMs were analyzed using Graphpad Prism 7.0. Two-way analyses of variance (ANOVAs) with Dunnett's multiple comparison tests were used to compare the statistical differences in groups. We considered p-values <0.05 statistically significant.

Establishment of the Germline of Thymosin β4 Overexpression (Tβ4-OE) Cashmere Goats
To investigate whether the desired trait (i.e., increased cashmere yield) could be stably inherited, we produced Tβ4-OE offspring both via SCNT and via NM ( Figure 1A). Using these methods, we successfully established a flock of 23 goats, all of which overexpressed Tβ4 ( Figure 1B,C). Of these, there were 11 F1-1 (two via SCNT and nine via NM) and 12 F1-2 (4 via SCNT and 8 via NM) (Tables 2  and 3). We monitored the body weights of all Tβ4-OE cashmere goats and physiological indices of the six typical representatives; the average body weights and physiological indices of the F1-Tβ4-OE were not significantly different from those of the F1-WT goats of the same age (p > 0.05; Figure 1D,E). This suggests that the Tβ4-OE offspring are just as healthy as the WT ones.

Cashmere Yield and SHFs/PHFs Ratio Increased in F1-SCNT
To investigate whether Tβ4 is expressed in the hair follicles, we used IHC to detect Tβ4 expression in hair follicles. The results suggested greater Tβ4 expression in the hair follicles of the F1-SCNT compared to the WT and F1-NM (Figure 2A). Meanwhile, our qPCR and western blot analyses also indicated that the Tβ4 protein and mRNA expression levels were higher in the F1-SCNT goats than in the F1-WT and F1-NM goats ( Figure 2B). Moreover, we investigated whether F1-Tβ4-OE had more hair follicles than the F1-WT. We found that the SHFs/PHFs ratio in the F1-SCNT was significantly greater than that of the F1-NM and the F1-WT ( Figure 3A,B). We also measured the cashmere's weight, thickness, and fiber length for all F1-Tβ4-OE and F1-WT goats ( Figure 3C). The average cashmere yield of the F1-SCNT was significantly greater than that of the F1-NM, as well as that of the F1-WT (p < 0.05). There were no significant differences in the cashmere thickness and fiber length among the three groups of goats.

Differential Analysis of Methylation and Expression
In order to investigate the genetic stability of the cashmere traits between generations of cashmere goats, we selected the two parents (P0-GM, a female and a male from P0 for an generational methylation study) and their twin offspring (F1-GM, from F1-2-NM for an generational methylation study) as research models. Methylation data of the P0-GM and F1-GM skin samples from WGBS were used for differential methylation analysis. There were 336 hypermethylated and 753 hypomethylated mC between the P0-GM and F1-GM, which correspond to 214 hypermethylated and 560 hypomethylated genes. Subsequently, the distribution of the methylation level of methylated C base was described based on the chromosome level, and a Circos diagram was drawn ( Figure 4A). Interestingly, we observed that the F1-GM had thinner CHG and CHH than the P0-GM. Then, to further understand the distribution of the methylation level in the whole genome, the methylation level in the 2Kb window of the whole gene was calculated, and a Violin graph was drawn ( Figure 4B). We analyzed DMRs distribution in different genomic regions ( Figure 4C). The largest numbers were on the introns. Meanwhile, the hyper and hypo DMRs lengths were calculated ( Figure 4D). Expression data for the P0-GM and F1-GM skin samples from sequencing were used for differential expression analysis. We found 36 highly expressed ('DE-high') and 58 lowly expressed ('DE-low') genes in F1-GM skin samples ( Figure 5A,B).

Roles of Methylation in Regulating Gene Expression
We analyzed the intersection between differentially expressed genes and differentially methylated genes. One gene was hypermethylated with a low expression in HCC (NCBI query: Symbol, LOC102185508; Description, histone-lysine N-methyltransferase PRDM9-like; Type, protein), and one gene was hypomethylated with high expression (NCBI query: Symbol, LOC108633356; Description, uncharacterized; Type, lncRNA) ( Figure 5C).

Comparison of Functional Analysis of DEGs and DMRs with GO and KEGG
A functional enrichment analysis of genes with overlapping promoter and genomic region with DMRs was carried out by R. The GO analysis results showed that changes in the biological processes (BP) of DMRs were significantly enriched in the cellular process, biological regulation, and metabolic process. Changes in the molecular function (MF) were mainly enriched in binding, catalytic activity, and molecular transducer activity. Changes in the cell component (CC) of DMRs were mainly enriched in the cell part, organelle, and membrane ( Figure 6A). Interestingly, the above analysis is highly consistent with the results of the GO function enrichment of DEGs in the P0-GM and F1-GM cashmere goats ( Figure 6B). A KEGG pathway analysis revealed that the DMRs were mainly enriched in pathways in cancer, axon guidance, and the phosphatidylinositol signaling system ( Figure 7A). Meanwhile, the KEGG pathway analysis revealed that the DEGs were mainly enriched in protein digestion and absorption, complement and coagulation cascades, and natural killer cell mediated cytotoxicity ( Figure 7B).

Discussion
For the past 10 years, we have studied the effects of Tβ4 gene overexpression on cashmere goat hair follicles, in order to select and breed high-yield cashmere goats. After observing that Tβ4 overexpression had a significant effect on cashmere yield in cashmere goats [18], we used SCNT and natural breeding methods to develop a new goat breed that stably inherited overexpression of Tβ4. This was particularly difficult because, in addition to the many general constraints on large-scale mammal breeding (such as the environment and the operation), unfavorable factors are particularly prominent on the Inner Mongolia Plateau, which has a fragile ecological environment, drastically fluctuating temperatures, sparse vegetation, harsh winds, and sandy soils [28]. Despite these challenges, we have to date successfully established a flock of nearly 40 Tβ4-OE cashmere goats. Importantly, the Tβ4-OE goats were healthy, with normal growth rates, normal activity levels, and no obvious behavioral defects. This is related to the insertion of a promoter (KAP6.1) specific to the expression of Tβ4 in the hair follicle into the vector. Therefore, it did not affect the embryonic development of cashmere goats. Thus, this flock provides a solid foundation for further analysis of the genetic stability of the Tβ4 gene over several generations.
We carefully selected the six representatives for study based on the properties of cashmere among first filial generation of Tβ4-OE and WT goats. Each representative fully reflects the average level of trait of that type. Therefore, the differences among different types are reflected in the results. Meanwhile, the goats for the intergenerational methylation study were rare and unique. We selected two Tβ4 overexpressed cashmere goats (a male and a female) for mating and obtained twin offspring. Such groupings are difficult to obtain, since cashmere goats have little chance of producing double lambs, especially genetically modified ones, and such groupings can eliminate interference, especially the temporal and spatial interference between individuals.
The advantage of SCNT is that this procedure transfers the parental genome to the offspring without alteration; its disadvantage remains its low production efficiency [29][30][31]. This was reflected by the results of our offspring breeding program: the SCNT Tβ4-OE offspring were equivalent to their parents in all measured traits, but only a small number of these offspring were successfully born. In contrast, a large number of NM Tβ4-OE offspring were successfully born. However, NM Tβ4-OE offspring lost some of the desired traits. Our study confirmed that the genetic instability between generations was caused by methylation regulating the expression of some genes, such as the LOC102185508 and LOC108633356 obtained by the combined analysis of transcriptome and methylation. Although these effects may lead to genetic instability, NM also produced far more live offspring than SCNT. Therefore, it is most effective to improve livestock genomes by combining SCNT and NM.
In addition, we found that the GO function enrichment results of DMRs and DEGs were highly similar. This suggests that intergenerational methylation changes affect the genetic stability of traits by regulating intergenerational DEGs. This suggests that we can improve the genetic stability of traits by changing the methylation degree of DEGs between generations. This result still needs further experimental verification. However, the inconsistency of the KEGG results also indicates that DMRs and DEGs use different paths to affect the genetic stability of traits.

Conclusions
In summary, the desired trait exhibited lower fixation in the line-bred offspring compared to the SCNT offspring. An integrative analysis of the methylation and transcriptional profiles showed that this result might be due to the influence of methylation on the expression of DEGs between generations, which was mutually consistent with the results of the GO and KEGG analysis of DMRs and DEGs. To eliminate the effects of DNA methylation, it would be more efficient to produce animals carrying a gene integrated at specific genomic loci. It is clear CRISPR/Cas9-based genome engineering will accelerate improvements in desirable livestock traits. The use of CRISPR/Cas9 avoids the target gene silencing caused by methylation and will ensure the stable expression of the target gene and the presence of the target trait in all offspring.