Functional Annotation of Transcripts Obtained from Fall-Grown Tall Fescue as Influenced by Endophyte Infection

Tall fescue is one of the primary sources of forage for livestock. It grows well in the marginal soils of the temperate zones. It hosts a fungal endophyte (Epichloë coenophiala), which helps the plants tolerate abiotic and biotic stresses. The genetic and biological mechanisms underlying freezing stress tolerance in tall fescue are still unknown because of its complex genetic background, outbreeding mode of pollination, and limited genomic and transcriptomic resources. The aim of this study was to identify differentially expressed genes (DEGs) in two tissues, pseudostem (S) and leaf (L), between novel endophyte-positive (E+) and endophyte-free (E-) tall fescue genotypes at three diurnal time points, in the morning (-3.0 to 0.5°C), afternoon (11 to 12°C), and evening (12 to 10°C), in the field environment by exploring the transcriptional landscape via RNA sequencing. For the first time, we generated 226,054 and 224,376 transcripts from E+ and E- Texoma MaxQ II tall fescue, respectively, by de novo assembly. Fewer upregulated than downregulated transcripts were detected in both tissues (S: 803 up and 878 down; L: 783 up and 846 down) under the freezing temperatures in the morning. Gene Ontology (GO) enrichment analysis found 10 GO terms only under the freezing stress in the morning. In the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, metabolic pathway and biosynthesis of secondary metabolites genes showed the lowest number of DEGs under morning freezing stress and the highest number under the evening cold condition. The DEGs expressed under the morning stress condition and the nine candidate genes that we identified using GO analysis might represent a possible route underlying cold tolerance in tall fescue.
5 µL of 2x Sigma KiCqStart SYBR Green qPCR Ready-Mix (Cat no.: KCQS01), 1 µL molecular biology grade water, and 2 µL cDNA (1:20). qRT-PCR amplifications were performed on a QuantStudio 7 Flex Real-Time PCR system (Thermo Fisher Scientific, Singapore) using a 2-step PCR cycle (an initial denaturation of cDNA at 95°C for 3 min, followed by 40 cycles of denaturation at 95°C for 15 sec and annealing at 60°C for 45 sec) and a 3-step melting curve analysis (95°C for 15 sec, 60°C for 1 min, and 95°C for 15 sec). Experiments were performed with three technical replicates of each S tissue of E+ and E- tall fescue collected under freezing temperature in the morning. Gene expression was quantified using the 2^-(ΔΔCT) method 53 . The tall fescue Actin gene was used as an internal reference gene. Primers used to amplify 53-63 bp of the genes were designed using Primer Express software (v3.0.1) (Thermo Fisher Scientific) (Supplementary Table S5) and by MO, USA.
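As an illustration of the 2^-(ΔΔCT) calculation, the following minimal Python sketch computes a fold change from technical-replicate Ct values. The function name and all Ct values are hypothetical and chosen only for illustration; they are not data from this study, and averaging the technical replicates before taking differences is an assumption.

```python
from statistics import mean


def relative_expression(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Return the fold change given by the 2^-(ddCT) method.

    Each argument is a list of Ct values from technical replicates.
    """
    # dCT = mean Ct(target gene) - mean Ct(reference gene), per condition
    d_ct_test = mean(ct_target_test) - mean(ct_ref_test)
    d_ct_ctrl = mean(ct_target_ctrl) - mean(ct_ref_ctrl)
    # ddCT = dCT(test condition) - dCT(control condition)
    dd_ct = d_ct_test - d_ct_ctrl
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values (three technical replicates each), illustration only
fold_change = relative_expression(
    ct_target_test=[24.0, 24.1, 23.9],  # candidate gene in E+ tissue
    ct_ref_test=[18.0, 18.0, 18.0],     # Actin in E+ tissue
    ct_target_ctrl=[26.0, 26.1, 25.9],  # candidate gene in E- tissue
    ct_ref_ctrl=[18.0, 18.0, 18.0],     # Actin in E- tissue
)
print(round(fold_change, 2))
```

In this toy case the target gene crosses the threshold two cycles earlier in E+ than in E- after normalization to the reference, which corresponds to a four-fold higher relative expression.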


Introduction
Tall fescue (Festuca arundinacea Schreb.) is a cool-season perennial grass species grown in temperate zones worldwide. It grows well in the transition zone of the United States 1 , where both cool- and warm-season grasses are cultivated successfully. Tall fescue is highly productive and provides quality forage. It is therefore grown for pasture, hay, and silage and is used as a primary source of herbage protein in livestock feed. Tall fescue can grow across a wide range of temperatures (4-35°C), with an optimum of 20-25°C. It can tolerate extreme cold stress, which includes chilling (0-12°C) and/or freezing (< 0°C) temperatures, for a short period during fall. Forage production of tall fescue is reduced during January and February, and growth resumes when temperatures rebound to ≥ 12°C in spring.
Increasing evidence suggests that a wild-type endophyte (Epichloë coenophiala) living in the intercellular space helps the tall fescue plant fight abiotic and biotic stresses [2][3][4][5][6][7] . However, the endophyte produces alkaloids that are harmful to grazing animals 8 . A novel endophyte strain, AR584, purified by AgResearch, New Zealand, from tall fescue originating from Morocco 9 , provides similar benefits to tall fescue plants without being harmful to grazing animals [10][11][12] . To elucidate the genetic relationship between the novel endophyte and tall fescue, genome resources for the Festuca grasses and their endophytes are needed.
The genome of hexaploid tall fescue is ~6 Gb in size, which makes genome sequencing studies challenging 13 , though a draft genome sequence of a related diploid, perennial ryegrass (Lolium perenne), is available 14 . Next-generation sequencing of mRNA (RNA-seq) is a powerful method to evaluate transcriptional responses to biotic and abiotic stresses without a reference genome. RNA-seq offers the opportunity to identify transcripts either by mapping RNA-seq reads onto a genome or, for organisms without a reference genome, by first assembling the reads de novo into contigs and then mapping the contigs onto the transcriptome 15 . Transcriptome analyses have been used to identify differentially expressed genes (DEGs) responsible for the expression of traits in contrasting plant materials. The first leaf transcriptome of tropical guinea grass (Panicum maximum Jacq.) 16 was developed from four different genotypes. De novo transcriptome assemblies from two buffalograss (Bouteloua dactyloides Nutt. Columbus) cultivars 17 , multiple tissues from two highly inbred perennial ryegrass (Lolium perenne L.) genotypes 18 , and four Lolium-Festuca species 19 have also been reported.
By comparing the transcriptome profiles of two tall fescue genotypes (heat tolerant and heat sensitive), candidate genes involved in plant heat tolerance were identified 20 . To identify DEGs under lead (Pb) stress, the leaf transcriptomes of two tall fescue cultivars (Pb tolerant and Pb sensitive) were compared 21 . To investigate the molecular mechanism of tall fescue adaptability to cadmium (Cd) stress, candidate genes were reported by comparing the root and leaf transcriptomes with and without Cd treatment 22 .
Dinkins et al. 23 analyzed tall fescue transcriptomes and compared DEGs in different tissues of endophyte-positive (E+) and endophyte-free (E-) clones. To study the effect of the endophyte on drought tolerance, contrasting tall fescue genotypes (water stress tolerant and susceptible) infected with endophyte AR584 24 and genotypes harboring different endophyte strains 25 were analyzed to identify DEGs.
To improve our understanding of cold/freezing-responsive genes, transcripts have been compared under different temperatures in Kentucky bluegrass (Poa pratensis L.) 26 , zoysiagrass (Zoysia spp. Willd.), and sheepgrass (Leymus chinensis) 27 . However, it is not known how the novel endophyte responds under extreme cold stress and how the symbiotic relationship is maintained. Transcriptomic studies have revealed mechanisms of cold tolerance in the absence of an endophyte in winter rapeseed (Brassica rapa L.) 28 and Lotus japonicus 29 . It is therefore important for grass breeders to understand whether tall fescue alone can withstand freezing conditions or needs support from the novel endophyte to survive extreme cold stress. To address these questions, we analyzed transcripts from pseudostem (S) and leaf (L) tissues of E+ and E- Texoma MaxQ II tall fescue genotypes at three different time points in a day in a natural field environment using RNA-seq. Tissue type, cellular conditions, and environmental factors all shape transcript profiles and may influence regulatory events such as splicing and the expression of genes or their isoforms 30 . These findings encouraged us to use two tissue types of E+ and E- tall fescue for transcriptome analysis to monitor changes in plant gene expression under cold stress in the natural field environment, considering genetic and environmental interaction, plant responses, and the endophyte's influence on the host responses. The aims of this study were to: (i) investigate the genome-wide transcriptomic profile of E+ and E- Texoma tall fescue, (ii) identify DEGs in two tissues at three time points, and (iii) identify candidate gene(s) responsible for cold tolerance under freezing conditions in the natural field environment. The results could provide useful information on how the endophyte influences genes and their regulatory pathways associated with the cold/freezing response in tall fescue.

Results
Sequencing and de novo assembly of the transcriptome
After quality assessment and data filtering, a total of 553.8 million high-quality paired-end reads were obtained from the 18 E+ samples and 484.8 million from the 18 E- samples (Table 1 and Supplementary Figure S1). The filtered reads were de novo assembled into 5,520,386 and 5,133,272 contigs in the E+ and E- samples, respectively, for downstream analysis. From these contigs, the number of unique transcripts identified per sample ranged from 186,653 to 200,380 in the E+ samples and from 188,468 to 194,606 in the E- samples. Overall, we identified a total of 226,054 transcripts in the E+ samples and 224,376 transcripts in the E- samples. The length of these transcripts ranged from 177 to 27,968 bp, and the N50 ranged from 1,288 to 1,326 bp. Finally, we found 234,883 transcripts across all the samples collected in this study (Table 1). In addition, about 5.18% and 0.95% more transcripts were expressed in endophyte-infected S and L, respectively, than in endophyte-free S and L under freezing temperatures in the morning.
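For readers unfamiliar with the N50 statistic reported above, the following minimal Python sketch shows how it can be computed from a list of transcript lengths. The lengths are hypothetical toy values, not data from this study, and the assembly software may apply its own conventions.

```python
def n50(lengths):
    """Return the N50 of a set of sequence lengths.

    The N50 is the length L such that sequences of length >= L together
    contain at least half of the total number of assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downward, accumulating bases
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical transcript lengths in bp, illustration only
toy_lengths = [177, 400, 900, 1300, 2500, 5000]
print(n50(toy_lengths))
```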

Identification of orthologue genes from other plant species
Among the combined transcripts (234,883), about 13.5% had hits in the switchgrass (31,780), rice (31,622), and Arabidopsis (31,604) genomes. A total of 5,919, 4,476, and 4,002 orthologue genes were identified in switchgrass, rice, and Arabidopsis, respectively (Supplementary Figure S2). Because a well-annotated tall fescue genome is lacking, the function of the majority of fescue transcripts remains unknown when compared with the reference genomes of the related species.

Analysis of DEGs between different cold stresses and E+/E- Texoma tall fescue
Differential gene expression in E+ tall fescue was analyzed relative to E- tissues at three different time points in the natural field environment. A total of 5,757 significant DEGs (p-value ≤ 0.05) with log2 fold change (FC) ≤ -5 or ≥ 5 were identified in at least one of the following six comparisons: E+MS vs. E-MS, E+ML vs. E-ML, E+NS vs. E-NS, E+NL vs. E-NL, E+ES vs. E-ES, and E+EL vs. E-EL (Figure 1 and Supplementary Table S1). Analysis of DEGs showed that a higher number of transcripts were differentially expressed in S than in L tissues at each time point (Figure 1). Fewer upregulated than downregulated transcripts were detected in both tissues under the freezing temperatures (-3°C to 0.5°C) in the morning (S: 803 up and 878 down; L: 783 up and 846 down). In the afternoon (11°C to 12°C), fewer upregulated (678) than downregulated (928) transcripts were identified in S tissues, but the reverse (688 up- vs. 633 down-regulated) was observed in L tissues. In contrast to the afternoon, in the evening (12°C to 10°C), more upregulated (1,309) than downregulated (916) transcripts were identified in S tissues, and fewer upregulated (590) than downregulated (705) transcripts were detected in L tissues (Figure 1).
The specific and overlapping DEGs among the comparisons were visualized in DiVenn (Figure 2). The result showed that 463 DEGs were specific to E+NS vs. E-NS, 321 to E+NL vs. E-NL, 961 to E+ES vs. E-ES, and 470 to E+EL vs. E-EL under the normal cold conditions in the afternoon and evening. Among the transcripts expressed in the morning, 97 DEGs were common between S (E+MS vs. E-MS) and L (E+ML vs. E-ML), of which 42 were upregulated in one comparison but downregulated in the other (Figure 2). In addition, 556 DEGs were specific to E+MS vs. E-MS and 529 were specific to E+ML vs. E-ML, for a total of 1,085 DEGs that were significantly up- or down-regulated under the morning freezing conditions and were not differentially expressed under the normal cold conditions in the afternoon and evening.
The DEGs were used for linkage hierarchical clustering analysis (Figure 3). We observed distinct patterns of gene expression at the transcriptional level at the different time points. Cluster analysis showed that some genes upregulated in the morning were downregulated in the afternoon and evening, or vice versa. The heat map also showed that the expression profiles of the majority of genes differed between the S and L tissues at all time points (Figure 3). This result indicates that tall fescue responded to the stress conditions in a time- and tissue-specific manner.

Gene Ontology analysis of DEGs
Out of 5,757 significant DEGs, 1,099 had hits to 732 rice genes in the six comparisons and were used for GO analysis (Supplementary Table S2). We obtained 98 significant GO terms in three major categories: biological process (BP), molecular function (MF), and cellular component (CC) (Figure 4, Supplementary Table S3). Some of the identified GO terms shared common genes. Since our key objective was to investigate the influence of the novel endophyte on freezing tolerance, our primary interest was in the GO terms associated with samples collected under the freezing temperatures (-3°C to 0.5°C) (Table 2).
There were three significant GO terms under BP ('amino acid activation', 'tRNA aminoacylation', and 'tRNA aminoacylation for protein translation'), four under MF ('protein transporter activity', 'aminoacyl-tRNA ligase activity', 'ligase activity forming carbon-oxygen bonds', and 'ligase activity forming aminoacyl-tRNA and related compounds'), and two under CC ('membrane coat' and 'coated membrane') that were enriched only in E+MS vs. E-MS under the freezing stress (-3°C to 0.5°C) in the morning (Figure 4). Most of the genes associated with these nine GO categories were related to ligase activity, kinase activity, binding, signaling, and transporter activity. Interestingly, only one molecular function GO term, 'lyase activity', was significantly enriched in E+ML vs. E-ML during the freezing stress in the morning and not under the other cold stresses. There were 12 genes in the 'lyase activity' category, of which one gene, LOC_Os04g37920.1 (Fa.63716.1), was also found in GO:0033554 (cellular response to stress) and GO:0006974 (response to DNA damage stimulus).
In this study, we identified nine DEGs specifically expressed during the freezing stress in the morning rather than the cold stress in the afternoon or evening, of which five DEGs (Fa. 8356

Discussion
The objectives of this study were to characterize the tall fescue transcriptome and to identify genes that were differentially expressed due to the presence of a fungal symbiont under freezing conditions. Because a well-annotated tall fescue reference genome is lacking, we generated 36 de novo assemblies, comprising S and L tissues of E+ and E- Texoma tall fescue genotypes at three time points with three replications. Individual assemblies were performed to keep track of the DEGs in E+ and E- Texoma tall fescue under different levels of cold in the field, as well as to determine transcript abundance in the individual samples 31 .
Our transcriptome assemblies resulted in 234,883 transcripts, which may constitute important transcriptomic resources for understanding the cold tolerance mechanism of this allohexaploid forage species. In a previous transcriptome study, de novo assembly yielded 199,399 contigs from two tall fescue genotypes infected with the novel endophyte (AR584) under water stress in a greenhouse, using the Illumina Genome Analyzer IIx system 24 . Recently, Dinkins et al. 25 generated transcriptome resources from two tall fescue genotypes infected with the common toxic endophyte (CTE), one with a non-toxic strain (NTE19) and the other with a hybrid endophyte species (FaTG-4), under water deficit in the greenhouse, and assembled them against the tall fescue TF153K transcriptome assembly developed by Dinkins et al. 23 . Both studies were performed under controlled greenhouse conditions, whereas we performed this transcriptomic study of AR584-infected Texoma tall fescue in a natural field environment. Field conditions are more variable than those of a controlled environment because of the direct effects of sunlight, day length, the soil microbial community, and genotype-environment interaction, but they can provide a more naturalistic outcome.
By comparing transcript abundance within E+ and E- Texoma tall fescue, we observed that the number of unique transcripts was higher in S than in L tissue at all three temperatures. More importantly, our results showed that the novel endophyte had a positive influence on gene expression over E- (Table 1). The number of unique transcripts was almost the same in S tissues, but slightly different in L tissues, between E+ and E- tall fescue at the afternoon and evening temperatures (10-12°C). Thus, we speculated that the plant does not need support from the endophyte under normal cold conditions but does need assistance to survive freezing temperatures by altering its gene expression. Although Dinkins et al. 23 reported that the presence or absence of the endophyte does not change global expression, the larger number of transcripts obtained in E+ than in E- samples in all tissues examined under the three temperature conditions in this study (Table 1) might be due to the endophyte's response.
DiVenn showed that 1,085 DEGs were specifically expressed under the morning freezing conditions; these DEGs might play a key role in maintaining the symbiotic relationship between the novel endophyte (AR584) and its host tall fescue under morning freezing stress.
Under morning freezing temperatures, plants triggered genes in response to extreme cold stress, as evidenced by the GO analyses, in which 10 GO terms (three under BP, five under MF, and two under CC) were found only in the morning (Fig. 4). However, we were not able to analyze all the DEGs because of the lack of available information on the tall fescue genome and on orthologous genes in related species. The genes expressed under the morning freezing conditions may represent a possible route to cold stress tolerance. This study should be useful for developing hypotheses that can further the understanding of the genetics underlying cold tolerance in tall fescue. Mahmood et al. 32 reported that data obtained from a transcriptome study can be the starting point for formulating hypotheses to investigate the genetics of ergot resistance.  (Table 2 and Supplementary Table S2

Conclusion
This study represents the first transcriptome analysis of E+ and E- Texoma tall fescue under freezing and chilling temperatures in the natural field environment. We generated 234,883 unique transcripts from 36 de novo assemblies. A total of 5,757 DEGs were identified between E+ and E- samples under three diurnal temperature conditions, of which 1,085 were up- or down-regulated only under freezing temperatures in the morning. We were not able to analyze all the genes differentially expressed in the two tissues under the three temperature conditions because of the lack of available information in related species. Using GO analysis, nine candidate genes were identified from E+ vs. E- samples collected at the morning freezing temperature; these genes might help to explain the endophyte's influence on the genetic basis of freezing tolerance in tall fescue. Moreover, the transcriptomic resources generated in this study should serve as valuable resources for grass breeders and the research community for further structural annotation of the tall fescue genome.

Plant material
Novel endophyte (AR584)-positive (E+) and endophyte-free (E-) tall fescue genotypes of cv. Texoma MaxQ II (referred to as "Texoma") (Pennington, USA, https://www.pennington.com/) were developed at the Noble Research Institute, Ardmore, Oklahoma, USA. Texoma MaxQ II is a commercial cultivar freely available for cultivation in the USA. The E+ and E- plants were transplanted in the field for seed production via open pollination among them. Since the endophyte is not transmitted through pollen, seeds were harvested from the E+ and E- mother plants separately. The seeds obtained from the E+ and E- Texoma genotypes were sown in rows at the experimental farm located at Dupy (Latitude: 34°17'12.106"N, Longitude: 96°59'36.608"W), Gene Autry, Oklahoma. Before collecting samples for transcriptome analyses, we collected S tissues separately in ice-cold 15 mL falcon tubes from 15 genotypes each of E+ and E- from three random rows of the plot to test their endophyte status. In an earlier study, Takach et al. 41 confirmed that the endophyte resides in the S, not in the L, in Texoma tall fescue. The S samples were freeze-dried and ground separately in the presence of liquid N2 using a mortar and pestle. Genomic DNA was extracted using the MagAttract 96 DNA Plant Core Kit (QIAGEN Cat. No. 67163, Hilden, Germany) according to the manufacturer's recommendation. PCR amplifications were performed using primers described in 42 to confirm the endophyte status of the Texoma tall fescue genotypes.

Quality assessment and assembly of the RNA-seq reads
The raw reads of the 36 samples (Figure 1) were quality trimmed with Trimmomatic (v. 0.36) using default settings 43 to remove low-quality bases and primer/adapter sequences before assembly. Reads shorter than 30 bases after trimming were discarded, along with their mate pair.
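The pair-wise length rule stated above can be illustrated with a short Python sketch. This is a toy restatement of the rule only, not the Trimmomatic implementation; the function name and the example reads are hypothetical.

```python
MIN_LENGTH = 30  # minimum read length after trimming, as stated in the text


def filter_pairs(pairs, min_length=MIN_LENGTH):
    """Keep a read pair only if both mates are at least min_length bases long."""
    return [
        (read1, read2)
        for read1, read2 in pairs
        if len(read1) >= min_length and len(read2) >= min_length
    ]


# Hypothetical trimmed read pairs, illustration only
toy_pairs = [
    ("A" * 100, "C" * 100),  # both mates pass
    ("A" * 29, "C" * 100),   # first mate too short, so the whole pair is dropped
    ("A" * 30, "C" * 30),    # exactly 30 bases, kept
]
kept = filter_pairs(toy_pairs)
print(len(kept))
```

Dropping both mates together keeps the two read files synchronized, which paired-end assemblers and aligners generally require.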
Endophyte-derived reads were identified by mapping the trimmed reads to the Epichloë coenophiala transcriptome 44 (http://csbio-l.csr.uky.edu/ec/), and successfully mapped reads were excluded from further analysis. The trimmed and filtered reads from each sample were independently de novo assembled using Trinity (v. 2.8.5) with default parameters 45 . These assemblies were then combined by randomly selecting one as a starting transcriptome and then iteratively aligning the transcriptome with each assembly, identifying that assembly's novel transcripts, and adding those transcripts to the combined transcriptome. Each sample was then mapped to the combined transcriptome using HISAT2 (v. 2.0.5) (https://daehwankimlab.github.io/hisat2/) with 24 threads and the default mapping parameters 46 . The expressed transcripts in each sample were quantified using StringTie (v. 1.2.4) with the default assembly parameters to produce more complete and accurate reconstructions of transcripts and better estimates of their expression levels 47 .
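The iterative combination of assemblies described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the alignment tool and the criteria used to call a transcript novel are not given in the text, so the novelty test is passed in as a function, and the toy test used here (exact sequence identity) is only a stand-in. All names and sequences are hypothetical.

```python
import random


def combine_assemblies(assemblies, is_novel, seed=0):
    """Iteratively merge per-sample assemblies into one combined transcriptome.

    assemblies: dict mapping sample name -> list of transcript sequences.
    is_novel:   callable(sequence, combined_list) -> bool; a stand-in for the
                alignment-based novelty test, whose details are not specified.
    """
    rng = random.Random(seed)
    names = sorted(assemblies)
    start = rng.choice(names)            # randomly chosen starting transcriptome
    combined = list(assemblies[start])
    for name in names:
        if name == start:
            continue
        for seq in assemblies[name]:
            if is_novel(seq, combined):  # add only transcripts not yet represented
                combined.append(seq)
    return combined


def exact_match_novelty(seq, combined):
    """Toy novelty test (assumption): novel if no identical sequence is present."""
    return seq not in combined


# Hypothetical toy assemblies, illustration only
toy = {
    "E+MS": ["ATGC", "GGCC"],
    "E-MS": ["ATGC", "TTAA"],
    "E+ML": ["GGCC", "CCGG"],
}
merged = combine_assemblies(toy, exact_match_novelty)
print(sorted(merged))
```

With an exact-identity test the set of transcripts in the result does not depend on which assembly is chosen first; with a real alignment-based test the representative sequences retained could depend on the starting assembly.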

Identi cation of differentially expressed genes
To identify genes expressed under the different temperature conditions in the morning, afternoon, and evening with or without the presence of the endophyte, pairwise differential gene expression testing was performed using DESeq2 with default parameter settings 48 . DESeq2 tests for differential expression based on read counts per gene in RNA-seq data, using shrinkage estimation for dispersions and fold changes to improve the stability of estimates across experimental conditions. A log2 FC ≤ -5 or ≥ 5 and an adjusted p-value ≤ 0.05 were used to determine significant differences in gene expression between two samples. DEGs with negative and positive log2 FC values are downregulated and upregulated genes, respectively.
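The thresholds described above can be expressed as a small classification rule. The following Python sketch applies them to rows of a results table; it is an illustration only, the function name and the example rows are hypothetical, and treating a missing adjusted p-value as not significant is an assumption.

```python
def classify_deg(log2_fc, padj, fc_cutoff=5.0, p_cutoff=0.05):
    """Return 'up', 'down', or None for one transcript, using the stated cutoffs."""
    # A missing or too-large adjusted p-value means the transcript is not a DEG
    if padj is None or padj > p_cutoff:
        return None
    if log2_fc >= fc_cutoff:
        return "up"
    if log2_fc <= -fc_cutoff:
        return "down"
    return None


# Hypothetical rows: (transcript id, log2 fold change, adjusted p-value)
toy_results = [
    ("t1", 6.2, 0.001),   # passes both cutoffs, upregulated
    ("t2", -5.0, 0.05),   # exactly at both cutoffs, downregulated
    ("t3", 4.9, 0.0001),  # significant but fold change too small
    ("t4", -7.5, 0.20),   # large fold change but not significant
]
calls = {tid: classify_deg(fc, p) for tid, fc, p in toy_results}
print(calls)
```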

Hierarchical clustering of differentially expressed genes
Hierarchical clustering analysis of DEGs from the six comparisons was performed using the function heatmap.2 in the R package gplots 49 in RStudio.

Visualization of differentially expressed genes
The DEGs that were biologically significant were visualized using the web-based software DiVenn 50 . The red and blue nodes represent up- and down-regulated genes, respectively. The yellow nodes represent genes upregulated in one dataset but downregulated in the other dataset.
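The comparison that underlies this kind of visualization can be sketched with set operations in Python. This is not the DiVenn implementation; it only illustrates how specific DEGs, shared DEGs, and shared DEGs with opposite regulation (the yellow nodes) can be separated. The function name and the gene identifiers are hypothetical.

```python
def compare_deg_sets(a, b):
    """Compare two DEG dicts mapping transcript id -> 'up' or 'down'.

    Returns ids specific to each set, ids shared with the same direction,
    and ids shared with opposite directions.
    """
    shared = a.keys() & b.keys()
    return {
        "only_a": set(a) - set(b),
        "only_b": set(b) - set(a),
        "same_direction": {g for g in shared if a[g] == b[g]},
        "opposite_direction": {g for g in shared if a[g] != b[g]},
    }


# Hypothetical toy DEG lists for two comparisons, illustration only
stem = {"g1": "up", "g2": "down", "g3": "up"}
leaf = {"g2": "down", "g3": "down", "g4": "up"}
result = compare_deg_sets(stem, leaf)
print(result["opposite_direction"])
```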

Identification of orthologous genes using tall fescue transcripts
As an annotation of the tall fescue genome was not available as of May 30, 2021, the complete and accurate tall fescue transcripts were aligned against the switchgrass non-redundant protein sequences in the Phytozome v13 database (https://phytozome.jgi.doe.gov/pz/portal.html) using BLASTX searches to identify the best-matched switchgrass orthologues. Using the switchgrass orthologues, we also obtained rice and Arabidopsis orthologues of the tall fescue transcripts from the Phytozome database.

Gene Ontology analysis of differentially expressed genes
We performed GO enrichment analysis using orthologue genes of rice (Supplementary Table S2