Comparative transcriptomics analysis of compatible wild type and incompatible ΔlaeA mutant strains of Epichloë festucae in association with perennial ryegrass

Epichloë festucae fungi form bioprotective endophytic symbioses with perennial ryegrass. Although this interaction is economically important, relatively little is known about the molecular processes and regulatory genes involved in establishing a compatible symbiosis. The present study used next-generation sequencing to investigate the genes required for establishing a compatible symbiotic interaction between E. festucae and perennial ryegrass. A comparative transcriptomics study, comparing the compatible symbiotic interaction of E. festucae/perennial ryegrass with the incompatible interaction of a ΔlaeA mutant strain of E. festucae/perennial ryegrass, was performed two weeks after inoculation. Differentially expressed genes were identified and classified according to gene ontology and functional annotation analyses. The raw data of this study have been deposited in the SRA database under the BioProject ID PRJNA513830.


Data
The data reported here describe the results of a comparative transcriptomics study between compatible symbiotic interactions of Epichloë festucae/perennial ryegrass and incompatible interactions of ΔlaeA mutant strains of E. festucae/perennial ryegrass [1]. Illumina HiSeq sequencing produced 12 raw sequence data sets, which have been deposited in the NCBI SRA database and can be accessed with the BioProject accession number PRJNA513830. Differentially expressed genes (DEGs) were identified and further analysed using Gene Ontology (GO) (Supplementary Fig. 1A–C) and functional annotation (Supplementary Table 1).
Inoculated seedlings were grown for two weeks under 16 h of 650 W/m2 light and 8 h dark. Seedlings were frozen in liquid nitrogen, and samples from 4 cm upwards and 0.5 cm downwards from the meristem were collected for RNA extraction. For each treatment, three replicate samples were prepared, each a pool of 100 seedlings. RNA quality and quantity were determined using an Agilent 2100 Bioanalyzer (Agilent Technologies) and a Nanodrop Lite spectrophotometer (Thermo Scientific), and by running the samples on a 1% agarose gel.

RNA-seq (HiSeq) analysis
RNA samples were sent on dry ice to the Beijing Genomics Institute (BGI, Hong Kong) for sequencing, and 2 µg of each RNA sample was used to prepare libraries following the BGI standard methodology (http://www.bgi.com/services/genomics/rna-seqtranscriptome/#tab-id-2). Samples were sequenced in two lanes of an Illumina HiSeq4000 (paired end, 100-bp reads) (Table 1).
The reads were trimmed using flexbar version 2.4 [5] and mapped against the prepared database using RNA-star version 2.5.0c [4] (Table 1). The non-directional counts of uniquely mapped read pairs were summed for each gene and analysed using the edgeR package version 3.10.5 [6] in the R statistical software environment version 3.2.1. Quasi-likelihood negative binomial generalized linear models were generated from the counts within each sample type. Fold changes and p-values were generated using exact tests for differences between two groups of negative binomial counts. Of the 8547 genes in E. festucae, 216 were differentially expressed (a two-fold or greater difference and an FDR of 0.05 or less).
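The selection thresholds above (two-fold or greater difference, FDR of 0.05 or less) can be illustrated with a minimal Python sketch. It is not the edgeR analysis used in the study: it assumes that the FDR is a Benjamini-Hochberg adjustment of the per-gene p-values and that fold changes are given on the log2 scale, so that a two-fold difference corresponds to an absolute log2 fold change of at least 1. The function names and the example gene identifiers are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log2_fold_changes, pvalues,
                min_abs_log2fc=1.0, max_fdr=0.05):
    """Keep genes with |log2FC| >= 1 (two-fold) and FDR <= 0.05."""
    fdr = benjamini_hochberg(pvalues)
    return [
        gene
        for gene, lfc, q in zip(genes, log2_fold_changes, fdr)
        if abs(lfc) >= min_abs_log2fc and q <= max_fdr
    ]


if __name__ == "__main__":
    # Hypothetical values, not data from this study.
    genes = ["geneA", "geneB", "geneC", "geneD"]
    log2fc = [2.5, -1.2, 0.4, 3.0]
    pvals = [0.001, 0.01, 0.0005, 0.8]
    print(select_degs(genes, log2fc, pvals))
```

Both up-regulated and down-regulated genes pass the filter because the threshold is applied to the absolute log2 fold change.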

Gene ontology (GO) analysis
Transcript sequences for E. festucae were searched against the NCBI nr nucleotide database with an e-value cut-off of 1E-5, and the top 10 hits were kept. The XML output, along with the corresponding InterProScan output, was run through the Blast2GO Pipeline Version 2.5.0 using all the default settings. A non-redundant list of GO terms was made for each gene (multiple transcripts and proteins deriving from each gene were collapsed into a single gene). This non-redundant list was used in GOEAST and AgriGO to test for the enrichment of GO terms in differentially expressed gene lists using Fisher exact tests [7,8] (Supplementary Fig. 1).
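The two steps described above, collapsing transcript-level GO terms into one non-redundant set per gene and testing a term for enrichment with a Fisher exact test, can be sketched in Python as follows. This is an illustration only and not the GOEAST or AgriGO implementation: it assumes a one-sided test for over-representation, computed as the upper tail of the hypergeometric distribution, and it applies no multiple-testing correction. All function names and identifiers are hypothetical.

```python
from math import comb


def collapse_go_terms(transcript_to_gene, transcript_go):
    """Merge the GO terms of all transcripts of a gene into one set."""
    gene_go = {}
    for transcript, terms in transcript_go.items():
        gene = transcript_to_gene[transcript]
        gene_go.setdefault(gene, set()).update(terms)
    return gene_go


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher exact p-value for over-representation.

    N: genes in the background, K: background genes annotated with the term,
    n: genes in the DEG list, k: DEGs annotated with the term.
    Returns P(X >= k) under the hypergeometric distribution.
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(k, upper + 1))
    return favourable / comb(N, n)


if __name__ == "__main__":
    # Hypothetical transcripts and GO identifiers.
    t2g = {"t1": "g1", "t2": "g1", "t3": "g2"}
    t_go = {"t1": ["GO:0000001", "GO:0000002"],
            "t2": ["GO:0000002"],
            "t3": ["GO:0000003"]}
    print(collapse_go_terms(t2g, t_go))
    # 2 of 4 DEGs carry a term that annotates 3 of 10 background genes.
    print(fisher_enrichment_p(k=2, n=4, K=3, N=10))
```

Using sets for the per-gene terms means that a GO term shared by several transcripts of the same gene is counted once, which matches the non-redundant list described in the text.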

Functional annotation
Official protein sequences for endophyte genes were run through InterProScan 5RC4 to find matches against the InterPro protein signature databases using the default settings. The protein sequences were also searched using BLASTP version 2.2.28+ against the entire Swiss-Prot database, as well as the fungal division of UniProt and the NCBI reference sequence protein plant and fungi subsets, with an e-value cut-off of 1E-20. The official transcript sequences were searched against the NCBI reference sequence RNA plant and fungi subsets using BLASTN version 2.2.28+ with an e-value cut-off of 1E-20. In addition, the official protein sequences were searched using BLASTP version 2.2.28+ against the fungal KEGG genes database with an e-value cut-off of 1E-20 (Supplementary Table 1).
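As an illustration of the e-value cut-off used in these searches, the following Python sketch filters BLAST hits from tabular output. The output format is an assumption, since the text does not state which one was used: the sketch expects the default 12-column BLAST+ tabular layout (`-outfmt 6`), in which the e-value is the eleventh column, and it treats the cut-off as inclusive. The function name and sequence identifiers are hypothetical.

```python
def filter_hits(tabular_lines, max_evalue=1e-20):
    """Group hits by query, keeping those with e-value <= max_evalue.

    Assumes the default 12-column BLAST+ tabular layout:
    qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore
    """
    hits = {}
    for line in tabular_lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue = float(fields[10])
        if evalue <= max_evalue:
            hits.setdefault(query, []).append((subject, evalue))
    return hits


if __name__ == "__main__":
    # Hypothetical hits, not results from this study.
    example = [
        "prot1\tsubjA\t90.0\t200\t5\t0\t1\t200\t1\t200\t1e-50\t400",
        "prot1\tsubjB\t40.0\t100\t50\t2\t1\t100\t1\t100\t1e-10\t60",
        "prot2\tsubjC\t35.0\t80\t40\t3\t1\t80\t1\t80\t1e-20\t90",
    ]
    for query, kept in filter_hits(example).items():
        print(query, kept)
```

In practice the same threshold can be applied during the search itself with the BLAST+ `-evalue` option, in which case post-filtering of this kind is only needed when a stricter cut-off is applied afterwards.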