Characterization and Comparative Analysis of Transcriptional Proles of Porcine Colostrum and Mature Milk Across Multiple Parities

The expression of genes and their regulation during lactation in swine is not well-understood. In order to gain a better understanding of genes and pathways involved in sow lactation, total RNA from colostrum and mature milk samples were sequenced from 65 sows across four parities. A stringent bioinformatic pipeline was used to identify and characterize 44,234 transcripts. The 44,234 identied transcripts included 41,875 previously annotated transcripts and 2,359 novel transcripts. Differential gene expression analysis was conducted using a generalized linear model coupled with the Lancaster method for P-value aggregation across transcripts. In total, 1,900 differentially expressed genes (DEG) were identied for the milk type main effect, and 373 DEG were identied for the milk type x parity interaction. Several gene ontology (GO) terms related to immune response were signicant for the milk type main effect, supporting the widely-accepted hypothesis that immunoglobulins and immune cells are transferred to the neonate via colostrum. This is the rst study to perform global transcriptome analysis from whole milk samples in sows from multiple parities. Our results provide important information and insight of and innate and targets for future improvement of and piglet development. born to multiparous sows [4, 5]. It has been hypothesized that differences in lifetime performance between gilt progeny and sow progeny may be due to differences in lactation performance, specically lower levels of immunoglobulin G (IgG) and other energetic components in the colostrum and milk of gilts. However, data from Craig et al. [6] showed no parity differences in total IgG, fat, protein, lactose, and net energy concentrations. These results suggest that the poorer performance of gilt progeny is unlikely due to insucient nutrient levels and is more likely due to differences in colostrum and milk intake and their ability to digest and absorb each component [5].

distance (trans-acting) in the genome via different mechanisms [17]. LncRNA have also been associated with metabolic, immunological, and developmental regulation, as well as phenotypic variation in livestock [18].
A limited number of transcriptomic studies related to porcine milk production have been reported [19,20]. These studies focused on the mammary gland transcriptome during late gestation [19] and during the transition from colostrogenesis to lactogenesis [20]. To our knowledge, there have been no studies that have reported direct sequencing of porcine whole milk samples. In fact, only six livestock RNA-sequencing (RNA-Seq) studies have reported sequencing of milk, including three in cattle [9,21,22], one in goat [23], one in sheep [24], and one in buffalo [25]. The emphasis of these studies was gene expression related to the lactation process, and as such, milk somatic cells were sequenced as a proxy for the mammary gland tissue [26].
In the current study, we performed total RNA-Seq on porcine colostrum and mature milk samples, from dams in parities one through four, in order to characterize both transcriptomes. We identi ed novel protein-coding and lncRNA transcripts and quanti ed expression of both known and novel transcripts. Expression pro les were compared to identify differentially expressed genes (DEG) between colostrum and mature milk across parities.

Results
High-throughput sequencing RNA-Seq libraries were sequenced generating over 6 billion 75 bp paired-end reads, with an average of 46.2 million reads per library (Table S1). The number of reads in the colostrum libraries ranged from 22.6 to 81.8 million reads with an average of 44.4 million reads, while the number of reads in the mature milk libraries ranged from 24.2 to 97.8 million reads with an average of 48.0 million reads. After adapter removal and read trimming, the resulting high-quality reads were mapped to the Sscrofa 11.1 genome assembly with an average 99.6% read mapping rate per library. The number of reads aligning to known messenger RNA (mRNA), miscellaneous RNA (miscRNA), non-coding RNA (ncRNA), and pseudogenes in the swine genome are presented in Table S2. It was observed that most reads mapped to known mRNA, while 50.5% of colostrum reads and 44.5% of milk reads were mapped outside of annotated loci, potentially harboring novel transcripts ( Figure 1).

Transcript identi cation and characterization
Transcripts, assembled individually for each library, were merged into a single set of 460,853 putative transcripts. This set was subjected to several ltering steps in order to remove transcriptional noise and classify transcripts ( Figure 2). Transcripts identi ed in only one library and lowly expressed transcripts were removed, as these were considered transcriptional noise. The remaining set of transcripts was ltered to include only those with class codes '=', 'u', 'x', and 'i' ( Figure S1). The transcripts with class codes 'u', 'x', and 'i' were further ltered by length, and number of exons. This set of 8,699 putative novel transcripts were then subjected to classi cation by open reading frame (ORF) length and protein coding potential score to complete transcript characterization. In total, 44,234 transcripts were identi ed in the porcine milk transcriptome, including 41,875 previously annotated transcripts as well as 2,359 novel transcripts.
Transcripts corresponded to 17,740 unique gene loci. Previously annotated transcripts corresponded to 16,515 known gene loci in the Sus scrofa genome, while unannotated protein-coding and non-coding transcripts corresponded to 971 and 571 loci, respectively. Genomic coordinates of the identi ed novel transcripts are given in Tables S3 and S4. Among the novel lncRNA transcripts, 224 and 171 were long intergenic non-coding RNA (lincRNA) and identi ed, unknown long non-coding RNA (ilncRNA), respectively, while 286 lncRNA anked a protein-coding gene in a divergent orientation (long non-coding natural antisense transcripts; lncNAT) ( Figure 3A). Using the BLAST algorithm, a total of 364 lncRNA exhibited homology with transcripts in the porcine NONCODE database, 25 lncRNA exhibited homology with non-coding transcripts in other species, and 167 lncRNA were homologous to noncoding transcripts in both swine and other species ( Figure 3B; Table S5). A similar analysis identi ed that 1,463 of the novel mRNA transcripts were homologous to known transcripts in swine and other species (Figure 4).
Basic sequence features of the novel transcripts, including length, exon number, expression, and ORF length, are shown in Figure 5 and Table 1. Novel lncRNA were expressed at signi cantly lower levels than novel mRNA and known transcripts ( Figure 5A). The exon number and size of the novel lncRNA and coding transcripts were notably smaller than that of known transcripts ( Figure  5B,C). The ORF length of novel lncRNA was signi cantly shorter than ORF length in known and novel coding transcripts, while the ORF length of novel coding transcripts was longer than that of known transcripts ( Figure 5D).

PCA and differential expression analysis
The principal component analysis (PCA) plot ( Figure 6) showed that colostrum and mature milk transcript expression pro les seem to fall into distinct clusters, while there was no clear clustering of samples by parity. After multiple testing correction, we identi ed 518 differentially expressed transcripts (DET) for the milk type x parity interaction and 2,613 DET for the milk type main effect (Tables S6 and S7). Table 2 shows the classi cations of DET. The DET set for the milk type main effect was comprised of 2,244 known transcripts, 286 novel coding transcripts, and 83 novel lncRNA, while the interaction DET set included 464 known transcripts and 38 and 15 novel coding transcripts and lncRNA, respectively. P-values of transcripts were aggregated for each gene loci to obtain DEG. A total of 1,900 DEG were identi ed for the milk type main effect, and 373 DEG were identi ed for the milk type x parity interaction (Tables S8 and S9).

Gene ontology and pathway analysis
Gene ontology (GO) analysis of the DEG indicated that genes associated with the milk type main effect were predominantly involved in binding (38.9%), catalytic activity (28.4%), transporter activity (8.5%), molecular function regulation (7.6%), and transcription regulation activity (7.3%). A total of 487 biological process, 65 molecular function, and 77 cellular component GO terms were signi cantly enriched in this gene set (Table S10). Additionally, 4 KEGG pathways were signi cantly enriched.

Discussion
Milk production by the sow is a major limiting factor in the growth and survival of her litter. Knowledge of porcine milk composition, as well as understanding genetic factors underlying its variation, is a matter of ongoing interest. In this study, we performed the rst exhaustive characterization of the porcine milk transcriptome derived from whole milk samples. The goal was to highlight differences in samples collected during early and mid-lactation and compare transcriptome pro les of dams across multiple parities.
A limited number of milk transcriptome studies in livestock have been conducted [9,[21][22][23][24][25], and to date, RNA-Seq of porcine milk has not been reported. Non-invasive sampling of the transcriptome of milk-producing cells via RNA secreted into porcine milk provides a powerful window into the biology of swine lactation. Lemay et al. [26] showed that RNA extracted from milk provides a better representation of RNA from mammary epithelial cells than RNA from whole mammary tissue.
Total RNA was isolated from 130 fresh whole milk samples (65 colostrum and 65 mature milk) from dams across four parities. In most milk transcriptome studies, milk is fractionated, and RNA is extracted from somatic cells, milk fat, or whey. Total RNA concentrations tend to be higher in the milk fat and somatic cells than in the whey fraction, while RNA integrity of somatic cells is higher than those of milk fat and whey [27,28]. Low RIN values in this study (average RIN = 4.0) are likely due to the presence of small amounts of cytoplasmic material in milk fat globules [29], bacteria and small RNAs (miRNA) in the fat fraction [30], and degraded and/or free RNAs. Each milk fraction has its own place in research settings. The advantages and disadvantages of each RNA source has previously been summarized [26]. In this study, we chose to utilize whole milk samples in order to capture the broader transcriptomic signatures of porcine colostrum and milk. We were able to process the samples much more quickly than had we fractionated the milk, and our sample represents the entirety of what is being ingested by the growing piglet.
Libraries were sequenced to an average depth of 46 million reads per library. A depth of 40 million reads is considered su cient for reliable detection of major splice isoforms for abundant and moderately abundant transcripts [31]. When generating our sequence data, we targeted a depth of 50 million reads per library. However, there was considerable variation in sequence depth across libraries. Some of this variation can be attributed to technical aspects of NGS technology, such as the stochasticity of sequencing, RNA quality, and library preparation.
A total of 44,234 transcripts were identi ed in this study, of which approximately 95% are annotated in the current swine genome build. Transcripts corresponded to 17,740 unique gene loci, including 16,515 known porcine genes. The number of expressed genes is comparable to those reported in similar studies in sheep [24] and goat [23]. A smaller number of expressed genes (~13,500) was reported in the buffalo milk transcriptome [25]. This discrepancy is likely to due to the swine, sheep, and goat reference genomes being more complete and of higher quality.
In general, gene expression values covered a wide range of magnitudes (Figure 7), and the gene expression pro les of colostrum and mature milk were not highly correlated (Pearson correlation coe cient 0.222). There was a large overlap (12 out of 15) in the top fteen most abundantly expressed genes in colostrum and mature milk (Table 3). Among the top expressed genes were CSN3, CSN2, CSN1S1, LALBA, FASN, EEF1A1, TPT1, SAA3, and WAP, which have been previously identi ed among the top expressed genes in milk samples from other species [23-25, 32, 33].
As expected, many of the top expressed genes were related to biosynthesis of milk proteins. Expression levels of CSN2, CSN3, CSN1S1,LALBA, and WAP, which encode for the synthesis of the main milk proteins casein and whey, increased from early to midlactation stages. A similar gene expression pattern has been identi ed in a previous swine study [34], as well as in goat [23], cattle [33], and sheep [24]. High expression of the EEF1A1 gene is also related to high levels of milk protein synthesis, as EEF1A1 is one of the most abundant protein synthesis factors [23]. Additionally, ribosomal proteins RPLP0 and RPS2 were among the top expressed genes in colostrum and exhibited a slight decrease in expression during mid-lactation. These genes were also found to be highly expressed in early lactation in buffalo [25].
In addition to milk protein synthesis genes, genes associated with milk fat were among the top expressed genes, and their expression was nearly constant across lactation stages. Milk fat composition is known to in uence piglet growth and development [35]. The FABP3 gene, which is involved in the uptake and transport of fatty acids, has been linked to milk fat synthesis in cattle [36]. FASN is directly involved in most of the short and medium-chain fatty acids in milk [37], and XDH is involved in the formation of the lipid droplet in milk [38].
The 20 most signi cant DET for milk type and interaction are given in Tables 4 and 5, respectively. Many of the most signi cant DET were associated with genes involved in milk fat synthesis. Transcript MSTRG.335722.71455, which is associated with insulin induced gene 1 (INSIG1), was found to be up-regulated in mature milk samples. INSIG1 is known to regulate the expression of sterol regulatory element-binding protein 1 (SREBP1; also denoted SREBF1), which is central to milk fat synthesis [20,39]. Although not one of the most signi cant DET, transcript MSTRG.239932.53831, associated with SREBP1, was identi ed as a DET for the milk type x parity interaction. Transcripts MSTRG.189750.42732 (THRSP gene), MSTRG.108970.23990 (SP1 gene), MSTRG.283058.62377 (ANXA7 gene) are other milk fat synthesis genes among the most signi cant DET. In general, expression of milk fat synthesis genes was up-regulated in mature milk samples compared to colostrum, which agreed with expression patterns observed across bovine lactation stages [39]. Our results highlight that the transition from swine colostrum to mature milk is marked by a shift from high protein contents to high fat and lactose contents [40].
In this study, DEG were identi ed by aggregating P-values across transcripts associated with each gene via the Lancaster method, rather than using gene read counts directly. Using this approach not only maintains both transcript and gene-level resolution, but also bypasses issues of different variances and directions of change across constituent transcripts. This method outperforms other gene-level methods and provides a coherent analysis between transcripts and genes [41].
One of the major aims of this study was to evaluate DEG between lactation stages across multiple parities. Progeny born to multiparous sows generally exhibit superior growth performance compared to those born to primiparous sows. However, colostrum and milk composition pro les, are highly similar across parities [6].
Glucose transport is a major precursor to lactose synthesis, which is synthesized in the Golgi vesicle of mammary secretory alveolar epithelial cells during lactation [42]. Glucose-6-phosphate transporters SLC37A2 and SLC37A3, glucose cotransporter SLC5A8, and glucose transporter SLC2A5 were identi ed as DEG for the milk type main effect. Glucose transport across the plasma membrane of mammalian cells is carried out by two distinct processes one of which involves glucose transporters from the GLUT gene family (encoded by SLC2A genes) and the other which involves glucose transporters from the SGLT family (encoded by SLC5A genes). Both the SLC2A5 and SLC5A8 genes were up-regulated in colostrum. Crisá et al. [23] identi ed signi cant up-regulation of members of the SLC2A gene family and polysaccharide and glycosamino-glycan binding molecular function to be enriched in goat colostrum samples compared to mature milk.
Members of the SLC35 gene family encode nucleotide sugar transporters localizing at the Golgi apparatus and/or the endoplasmic reticulum. These transporters transport nucleotide sugars pooled in the cytosol into the lumen of these organelles, where most glycoconjugate synthesis occurs [47]. Currently, the SLC35 gene family is comprised of 31 genes which are divided into 7 subfamilies, SLC35A to SLC35G [48]. GDP-fucose transporters SLC35C1 and SLC35C2 were identi ed as DEG for the milk type main effect, with SLC35C2 up-regulated in colostrum and SLC35C1 down-regulated. SLC35A5 and SLC35G1 were identi ed as DEG for the milk type x parity interaction, with both genes being up-regulated in mature milk in the rst three parities and up-regulated in colostrum during the fourth parity (Figure 8). Crisà et al. [23] identi ed 3 DEG from the SLC35 family that were up-regulated in goat colostrum compared to mature milk, as well as enrichment of glycosaminoglycan binding molecular in colostrum. Consistent with this result, we also identi ed the enrichment of glycosaminoglycan binding molecular function, with 29 of the 124 porcine genes associated with the GO term being present in our milk type DEG set. These results support the hypothesis that oligosaccharide metabolism decreases over the course of lactation as oligosaccharide concentrations in milk have been shown to peak in colostrum and decrease over the remainder of lactation in human [49], goat [50], and pig [51].
Forty-two genes from the PI3K-Akt pathway were found to be differentially expressed. Although not statistically signi cant after FDR-correction, this pathway had a nominal P-value of 0.025 in the enrichment analysis. The PI3K-Akt pathway is a key signaling node for lactogenic expansion and differentiation of the luminal mammary epithelium, as numerous signaling pathways that regulate lactogenic development converge on PI3K-Akt, including the insulin-like growth factor 1 receptor (IGF1R), RANKL and RANK, integrins, and PRLR-to-JAK2-to-STAT5A pathways [52]. In general, expression of DEG in the PI3K-Akt pathway was upregulated in colostrum. The janus kinase 1 (JAK1) gene, a key component of the PI3K-Akt pathway, had an FDR-corrected P-value of 0.02 but was ltered out of the DEG list based on the log2 fold change threshold. JAK1 was more highly expressed in mature milk, and the difference in expression level between colostrum and mature milk was more pronounced in the later parities ( Figure  9). Using a novel mammary gland-speci c JAK1 knockout mouse model, Sakamoto et al. [53] demonstrated that JAK1 is essential to involution, the mammary gland remodeling process at weaning where the milk-producing epithelial cells are replaced with adipocytes [54]. The potential effects of involution on mammary development and milk yield during the next lactation are not well understood. Ford et al. [55] noted that mammary glands in sows that were suckled during lactation were larger than non-suckled glands at the end of the involution process, suggesting a possible bene cial effect on redevelopment during the next gestation.
Several GO terms related to immune response, particularly leukocyte activation, were signi cantly enriched in the DEG for the milk type main effect (Table S10). The majority of genes in these pathways were up-regulated in colostrum. This nding is consistent with the widely accepted theory that immunoglobulins and immune cells are transferred to the neonate via colostrum. In pigs, the epitheliochorial nature of the placenta prohibits transfer of maternal immune cells and immunoglobulins to the fetus, and thus, the piglet relies on the successful absorption of colostral components to acquire maternal immunity [56]. Proin ammatory cytokines play an important role in the development of the neonatal immune system by mediating the early local and systemic responses to microbial challenges [57]. A total of 37 DEG were associated with cytokine secretion, including interleukin 17 receptor C (IL17RC), interleukin 27 receptor A (IL27RA), tumor necrosis receptor superfamily members 1b and 15 (TNFRSF1B and TNFRSF15), and tumor necrosis factor superfamily member 4 (TNSFF4). Several other genes in these gene families were shown to be up-regulated in early porcine lactation by Palombo et al. [20].
Antimicrobial proteins naturally present in colostrum and milk can kill and inhibit a broad spectrum of bacteria [58]. Milk is also known to exert chemotactic activity on neutrophils [59], an important innate host defense against microorganisms. The chemokine superfamily encodes secreted proteins involved in immunoregulatory and in ammatory processes. The CXC chemokine ligand 16 (CXCL16), which encodes a chemokine antimicrobial protein [60], was up-regulated in colostrum samples ( Figure 10). Palombo et al. [20] identi ed signi cant up-regulation of CXCL2 and CXCL10 in day 1 postpartum swine milk samples compared to colostrum. Many of the main chemokines (CXCL2, CXCL8, CXCL9, CXCL10, CXCL11, CXCL13, CXC14, and CXCL16) were expressed in our samples, and expression patterns for CXCL10 and CXCL2 were consistent with the ndings of Palombo et al. [20]. Our results differed from previously published results in that CXCL8 was the most abundantly expressed chemokine in both colostrum and mature milk, and CXCL3 was not expressed in our samples. One factor contributing to this discrepancy was the use of the improved reference genome (Sscrofa 11.1), where many of the gaps and misassemblies present in the Sscrofa 10.2 genome build were resolved and the annotation was signi cantly improved. These results suggest that chemokine ligands may play an important role in the transition from colostrum to mature milk in swine, likely helping prompt recruitment of neutrophils.

Conclusions
This is the rst study to describe the whole milk transcriptomic pro le of porcine colostrum and milk across multiple parities. Using total RNA-Seq, we identi ed and characterized 2,359 novel transcripts expressed in milk and detected several hundred genes that were differentially expressed between colostrum and milk and across parity. Our ndings have produced several highly specialized and functional candidate genes that may contribute to postnatal development and growth of piglets, as well as lactation in the sow.

Population and sampling
A four-breed composite line (Maternal Landrace × High-lean Landrace × Duroc × Yorkshire) maintained at U.S. Meat Animal Research Center (USMARC) for at least 18 generations was used for the collection of data in this project and have been previously described [61,62]. Litter sizes were adjusted within 48 hours of farrowing to ensure litters were approximately equal in size but did not exceed the number of functional teats. Mammary excretion samples were collected on day of farrowing (d 0; colostrum) and again on day 10 post-farrowing (d 10; mature milk) from a total of 65 dams, 16 rst parity (P1), 25 second parity (P2), 15 third parity (P3), and 9 fourth parity (P4). The power calculation for this experiment was conducted using the online RNASeqSampleSize tool ( [63]; http://cqs.mc.vanderbilt.edu/shiny/RnaSeqSampleSize/). The power of using 65 animals to detect ~1,000 DEG, with maximum dispersion 0.5 and minimum fold change of 2.82, at FDR level 0.05 from 17,740 expressed genes was found to be 0.99.After sample collection, animals remained at USMARC and progressed through the breeding system according to standard operating procedures.
In most cases, no external stimulant (i.e., oxytocin) was needed to collect colostrum at time of farrowing, as farrowing stimulates endogenous oxytocin production and milk letdown activity. However, if enough colostrum could not be collected within 10-30 minutes, an intramuscular injection of oxytocin (20 IU) was administered to stimulate colostrum letdown. Teats were sprayed with iodine (5%) and ethanol (70%) and wiped clean with a chem-wipe, and then 10 mL of colostrum was collected manually from the third and fourth teat on one side of the sow. On d 10, piglets were separated from the sow for approximately one hour, and sows were given an intramuscular injection of 20 IU oxytocin to stimulate milk letdown. Teats were cleaned, and 10 mL of milk was collected manually from the third and fourth teat on one side of the sow. Fresh samples were transported to the laboratory on ice. Samples (250 µL) were aliquoted into individual lysis D matrix tubes (MP Biomedicals, LLC, Solon, OH) with 1 mL TRIzol reagent (Invitrogen, Thermo Fisher Scienti c, Waltham, MA) and stored at -80°C until RNA isolation.

RNA isolation and sequencing
RNA was isolated using the FastPrep-24 5G Instrument (MP Biomedicals, LLC) with cryogenic lysis. Brie y, RNA was isolated by high-speed cellular disruption using multi-directional, simultaneous bead beating of sample material (i.e., colostrum or milk) with a cool adapter for cryogenic lysis at 6.0 m/sec for 40 sec. Lysed samples were transferred into a clean tube, and completion of isolation occurred following manufacturer's recommended protocol for TRIzol. The nal RNA pellet was dried at RT for 10 min and resuspended in 30 µL water (Invitrogen UltraPure DNase/RNase-free, Thermo Fisher Scienti c). RNA was quanti ed using a NanoDrop UV-Vis spectrophotometer (Thermo Fisher Scienti c) and RNA integrity was assessed using an Agilent Bioanalyzer System (Agilent, Santa Clara, CA).
Total RNA samples extracted from colostrum or milk were prepared for RNA sequencing with the TruSeq Stranded Total RNA sample preparation kit (Illumina, San Diego, CA) following the guidelines of the manufacturer. Libraries were quanti ed with RT-qPCR using the NEBNext Library Quant Kit (New England Biolabs, Inc., Beverly, MA, USA) on a CFX384 thermal cycler (Bio-Rad, Hercules, CA, USA), and the size and quality of the library was evaluated with an Agilent Bioanalyzer DNA 1000 kit (Santa Clara, CA, USA). The libraries were diluted to 4nM with Illumina RSB. Libraries were paired-end sequenced with 150 cycle high output sequencing kits on an Illumina NextSeq 500 instrument.
Processing RNA-Seq data Alignment of RNA-Seq reads was carried out as follows. First, quality of the raw paired-end sequence reads in individual fastq les was assessed using FastQC (Version 0.11.5; www.bioinformatics.babraham.ac.uk/projects/fastqc), and reads were trimmed to remove adapter sequences and low-quality bases using the Trimmomatic software (Version 0.35) [64]. The remaining reads were mapped to the Sscrofa 11.1 genome assembly (NCBI accession AEMK00000000.2) using Hisat2 (Version 2.1.0) [65] with default parameters.
Mapped transcripts were assembled for each library using Stringtie (Version 1.3.3) [66]. The NCBI Sscrofa 11.1 reference annotation (Release 106) was used to guide the assembly process. Transcripts from all samples were merged together using Stringtie merge mode to build a consensus set of transcripts.

Identi cation and characterization of novel transcripts
Transcript expression levels were quanti ed for each library using fragments per kilobase of exon per million mapped reads (FPKM) [67]. Transcripts expressed in a single sample, and transcripts with FPKM < 0.3 in all samples were removed. Gffcompare (Version 0.11.2) [68] was used to compare the list of assembled transcripts with the Sus scrofa reference annotation (NCBI release 106). Transcripts overlapping known transcript classes in the reference annotation (gffcompare class code '=') were assigned to the appropriate annotation class, while transcripts with gffcompare class codes 'x' (exonic overlap on the opposite strand), 'i' (fully contained in reference intron), and 'u' (unknown, intergenic) were considered to be potential novel transcripts.
The BLASTN algorithm from the BLAST+ package [75] was used to identify homology between (1) novel lncRNA and the NONCODE database (Version 5) and (2) novel mRNA and the NCBI Non-redundant Nucleotide database (nt; Version 5, https://ftp.ncbi.nlm.nih.gov/blast/db/). BLASTN was run with default parameters, and an E-value cutoff of 10.0 was used to de ne homologous sequences.

Differential expression and functional analyses
Raw read counts for the 44,234 transcripts and the 17,740 corresponding genes were normalized using DESeq2 (Version 1.26) [76]. The PCA plot, using normalized read counts, was generated using the plotPCA function from the DESeq2 package. Differential expression analysis of transcripts was performed using DESeq2 with the following generalized linear model: Transcripts with FDR-adjusted P-value ≤ 0.01 were considered DET for the type x parity interaction. Transcripts that were not DET for the interaction term with |log2 fold change| ≥ 2 and FDR-adjusted P-value ≤ 0.01 were considered DET for the milk type main effect.
Transcript P-values were aggregated for each gene using the Lancaster method [77] in order to generate gene-level analysis. This approach has been described in detail by Yi