Translatome and transcriptome analysis of TMA20 (MCT-1) and TMA64 (eIF2D) knockout yeast strains

TMA20 (MCT-1), TMA22 (DENR) and TMA64 (eIF2D) are eukaryotic translation factors involved in ribosome recycling and re-initiation. They operate with P-site bound tRNA in post-termination or (re-)initiation translation complexes, thus participating in the removal of the 40S ribosomal subunit from mRNA stop codons after termination and controlling translation re-initiation on mRNAs with upstream open reading frames (uORFs), as well as de novo initiation on some specific mRNAs. Here we report ribosome profiling data of S. cerevisiae strains with individual deletions of TMA20, TMA64 or both TMA20 and TMA64 genes. We provide RNA-Seq and Ribo-Seq data from yeast strains grown in rich YPD or minimal SD medium. We illustrate our data by plotting the differential distribution of ribosome-bound mRNA fragments across uORFs in the 5′-untranslated region (5′ UTR) of GCN4 mRNA and on mRNA transcripts encoded in the MAT locus in the mutant and wild-type strains, thus providing a basis for investigating the role of these factors in the stress response, mating and sporulation. We also document a shift of the transcription start site of the APC4 gene, which occurs when the neighboring TMA64 gene is replaced by the standard G418-resistance cassette used for the creation of the Yeast Deletion Library. This shift results in dramatic deregulation of APC4 gene expression, as revealed by our Ribo-Seq data, which can probably be used to explain the strong genetic interactions of TMA64 with genes involved in the cell cycle and mitotic checkpoints. Raw RNA-Seq and Ribo-Seq data as well as all gene counts are available in the NCBI Gene Expression Omnibus (GEO) repository under GEO accession GSE122039 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE122039).

Saccharomyces cerevisiae BY4741 wild-type strain and BY4741-based strains with TMA20, TMA64 or both TMA20 and TMA64 knockouts were maintained in rich (YPD) or minimal (SD) media.

Experimental features
Yeast cells in the mid-exponential (mid-log) growth phase were pretreated with cycloheximide and collected. cDNA libraries of ribosome-bound mRNA and total mRNA from wild-type and knockout strains were prepared as described previously [1]. Sequenced reads were trimmed, mapped and counted.

Value of the data
The data provides a gene expression landscape of yeast strains lacking TMA20 and/or TMA64 proteins, which are orthologous to mammalian translation factors MCT-1 and eIF2D, thus expanding our knowledge about individual functional roles of these two translation factors in a living cell.
Abnormal translation of the MATa2 mRNA derived from the MAT locus of the MATa yeast strain is detected, which can be used to explain sporulation defects previously detected in the TMA64 deletion strain.
Quantitative Ribo-Seq data provide essential information on translational changes in the knockout strains, including altered uORF translation in the 5' UTR of the mRNA encoding the important transcription regulator GCN4, thus providing a basis for investigating the role of these proteins in the stress response.
The RNA-Seq data highlight transcription abnormalities within the APC4 gene locus, caused by replacement of the adjacent TMA64 gene by the standard G418 or HYG resistance cassettes commonly used for generating gene deletions, which can probably be used to explain previously observed strong genetic interactions of TMA64 with genes involved in the cell cycle and mitotic checkpoints.
The deeply sequenced Ribo-Seq and RNA-Seq data sets are suitable for detailed bioinformatics analysis of translation events, such as prediction of alternative open reading frames.

Data
In this study we present ribosome profiling data generated from the wild-type BY4741 S. cerevisiae strain and strains lacking translation factors TMA20 (MCT-1), TMA64 (eIF2D) or both of them at the same time. Information on all performed experiments is shown in Table 1. Raw Ribo-Seq and RNA-Seq data are available online in the NCBI Gene Expression Omnibus repository (GEO accession: GSE122039, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE122039). Supplementary Table S1 and Supplementary Figure S1 contain analyzed NGS data, as described below. Examples of differentially translated and transcribed genes in wild-type and knockout strains are presented in Figs. 1-3.
In contrast to a dataset previously obtained in a study by Young et al. [2], here we present ribosome profiling data not only for double knockout yeast strains, but also for strains with individual deletions of TMA20 and TMA64. This allows studying transcriptional and translational changes caused specifically by the absence of individual translation factors TMA20 (MCT-1) or TMA64 (eIF2D). Due to cycloheximide addition to yeast culture medium before harvesting the cells, distribution of mapped reads from Ribo-Seq sets may slightly vary from data described by Young et al. According to a previous study [3], use of the inhibitor in such a fashion is likely to cause blurring of local density effects, but also strengthens Ribo-Seq signals at translation initiation sites, facilitating analysis of ribosome distribution over uORFs.
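Paired Ribo-Seq and RNA-Seq gene counts of this kind are often compared through translation efficiency (TE), the ratio of normalized footprint counts to normalized mRNA counts. The sketch below is illustrative only and is not the analysis used to produce the deposited data: the function names, the counts-per-million normalization and the pseudocount of 0.5 are assumptions, and the gene names in any input would come from the user's own count tables.

```python
import math


def cpm(counts):
    """Scale raw gene counts to counts per million counted reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("count table is empty")
    return {gene: 1e6 * n / total for gene, n in counts.items()}


def log2_te(ribo_counts, rna_counts, pseudo=0.5):
    """log2 translation efficiency per gene: Ribo-Seq CPM over RNA-Seq CPM.

    The pseudocount (an assumption) avoids division by zero for
    genes without RNA-Seq reads.
    """
    ribo, rna = cpm(ribo_counts), cpm(rna_counts)
    return {
        gene: math.log2((ribo[gene] + pseudo) / (rna[gene] + pseudo))
        for gene in ribo
        if gene in rna
    }


def delta_log2_te(ko_ribo, ko_rna, wt_ribo, wt_rna):
    """Change in log2 TE between a knockout strain and the wild type."""
    ko = log2_te(ko_ribo, ko_rna)
    wt = log2_te(wt_ribo, wt_rna)
    return {gene: ko[gene] - wt[gene] for gene in ko if gene in wt}
```

Computing the difference separately for Δtma20, Δtma64 and ΔΔtma20tma64 against wt is one way to separate the effects of the individual deletions from those of the double knockout. A real analysis would also need replicate-aware statistics, which this sketch omits.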

Cell maintenance and cDNA libraries preparation
RNA-Seq and Ribo-Seq cDNA libraries were prepared from total RNA samples or ribosome-bound RNA samples, respectively, for both the wild-type BY4741 yeast strain and three knockout strains (with individually deleted TMA20 or TMA64 genes, or with a double deletion of TMA20 and TMA64, hereafter referred to as wt, Δtma20, Δtma64 and ΔΔtma20tma64, respectively). The libraries were sequenced, resulting in 9 RNA-Seq and 9 Ribo-Seq data sets. Table 1 summarizes information about all of the sequencing experiments.
The experimental procedure in general followed the ribosome profiling protocol described in [1]. Briefly, yeast cells were grown to exponential phase in either rich YPD (1% yeast extract, 2% peptone, 2% glucose) or minimal SD (0.67% YNB w/o amino acids with ammonium sulfate, 2% glucose, complete amino acid supplementation) media. Cycloheximide was added to the yeast media to a final concentration of 100 μg/ml and growth was continued for 3 more minutes; then cells were harvested by filtration, resuspended in polysome lysis buffer (20 mM Tris pH 8.0, 140 mM KCl, 1.5 mM MgCl2, 100 μg/ml cycloheximide, 1% Triton), flash frozen in liquid nitrogen and homogenized by grinding. A portion of each cell lysate was then used for total RNA isolation, while another part was treated with RNase I for polysome disassembly and applied to a sucrose gradient for fractionation, followed by isolation of the monosome fraction and extraction of ribosome-protected mRNA fragments for ribosome profiling. mRNA was isolated using Oligo(dT) beads, and ribosome-bound RNA was isolated from sucrose fractions using acidic phenol extraction. Further ribosome profiling and RNA-Seq library preparations were performed as described previously [1]. Two biological replicates, indicated as WT1 and WT2, were prepared for the wild-type strain maintained in YPD.

Data analysis
Ribosome profiling of yeast strains lacking TMA20 and/or TMA64 genes
Translation factor TMA64 and homologs of its N- and C-terminal regions, TMA20 and TMA22 respectively (eIF2D, MCT-1, and DENR in mammals), are proteins involved in translation termination, re-initiation, and ribosome recycling. Initially, eIF2D and the heterodimer MCT-1•DENR were assumed to provide a non-canonical translation initiation pathway, as they facilitate GTP-independent delivery of Met-tRNAiMet and some elongator tRNAs to the 40S ribosomal P-site [11,12]. In addition, in vitro and in vivo studies demonstrated that TMA64/eIF2D, TMA20/MCT-1, and TMA22/DENR are able to promote post-termination tRNA and mRNA release from the 40S ribosomal subunit both in yeast and mammals [2,13]. The absence of these factors, together with the 40S recycling failure, led to deregulated translation re-initiation downstream of both short and full-size translated open reading frames in different organisms [2,14-16]. The C-terminal regions of TMA64/eIF2D and TMA22/DENR contain the SUI1 domain, which is also present in the translation factor SUI1/eIF1. Structural data indicate that the SUI1 domains of all three factors occupy similar positions in the P-site of the 40S ribosomal subunit, with a conserved β-loop protruding toward the codon-anticodon duplex formed by mRNA and a P-site tRNA [17,18]. In accordance with biochemical data, this suggests that during recycling, TMA64/eIF2D and the heterodimer TMA20•TMA22 (MCT-1•DENR) may operate in a manner similar to SUI1/eIF1 in translation initiation, or control initiator tRNA access to re-initiating ribosomal complexes after uORF translation.
Raw and analyzed Ribo-Seq and RNA-Seq data sets for wild-type, individual Δtma20 and Δtma64, as well as double ΔΔtma20tma64 knockout yeast strains were obtained and uploaded into the NCBI Gene Expression Omnibus repository (GEO accession: GSE122039, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE122039). The initial analysis revealed that they included deep RNA-Seq and Ribo-Seq data with more than 10 million uniquely mapped and counted reads within gene CDS. In general, in half of the samples, uniquely mapped reads made up 50% of the total library. Across all samples, 60 to 90% of the uniquely mapped reads were located within annotated CDS and included in the gene counts. Metagene start- and stop-centric profiles for Ribo-Seq data exhibited clear triplet periodicity (Supplementary Fig. S1). Supplementary Table S1 provides an overview of mapped reads as well as gene-level read counts for all experiments and descriptive statistics on the generated sequencing data.
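Triplet periodicity of a metagene profile can be summarized as the fraction of footprints assigned to each of the three reading frames. The following is a minimal sketch under stated assumptions, not the procedure used to generate Supplementary Fig. S1: the function name is hypothetical, footprint 5' ends are assumed to be given relative to the first nucleotide of the start codon, and the default P-site offset of 12 nt is a commonly used value that should be checked for each library.

```python
from collections import Counter


def frame_fractions(five_prime_ends, psite_offset=12):
    """Fraction of footprints whose inferred P-site falls in frame 0, 1 or 2.

    five_prime_ends: footprint 5' end positions relative to the first
        nucleotide of the start codon (0 = A of AUG, negatives = 5' UTR).
    psite_offset: assumed distance from the footprint 5' end to the P-site.
    """
    frames = Counter((pos + psite_offset) % 3 for pos in five_prime_ends)
    total = sum(frames.values())
    if total == 0:
        raise ValueError("no footprints supplied")
    return [frames[frame] / total for frame in range(3)]
```

A library with clear periodicity would show one frame strongly enriched over the other two, whereas RNA-Seq reads would be expected to give roughly equal fractions.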

Examples of data illustrating differential transcription and translation in wt and knockout strains
Two different gene cassettes, MATα and MATa, either of which can be present in the MAT locus of the S. cerevisiae genome, define the mating type of yeast. Each mating-specific cassette encodes two transcripts directed from opposite DNA strands by a shared bidirectional promoter: either MATa1 and MATa2 in a-type strains, or MATα1 and MATα2 in α-type strains (Fig. 1) [9]. All of the transcripts except MATa2 encode functional proteins, namely transcription factors that determine the mating type or the diploid phenotype (reviewed in [19,20]). While MATa2 is considered to be non-functional [21,22], it nevertheless contains two ORFs (Fig. 1), presumably originating from an original coding region with similarity to MATα2 via an internal frameshift [9,23]. The second ORF could still encode a remarkably conserved amino acid sequence with high similarity to a portion of MATα2 [9,24] that represents its DNA-binding domain [25,26]. The corresponding protein (MATa2-2) could compete with MATα2 for DNA binding or even have its own transcription factor activity [27]. However, its synthesis should be inhibited by the presence of the first ORF in the MATa2 mRNA, which can be regarded as a uORF for the MATa2-2 coding region. Since TMA20 and TMA64 knockout strains have upregulated translation re-initiation and/or readthrough activities [2,16,28], it was of interest to illustrate our ribosome profiling data with the footprint coverage of the MATa locus present in BY4741 strain derivatives. As the sequenced S288C strain is MATα, Ribo-Seq and RNA-Seq reads were re-mapped to the MATa locus sequence taken from GenBank (accession number V01313.1) [9]. Fig. 1 provides data on Ribo-Seq and RNA-Seq read coverage of the MATa locus of the studied yeast strains. These data can be used to explain the role played by the TMA20 and TMA64 translation factors in mating and sporulation programs [29-32].
Another example involves GCN4, the global transcriptional regulator, which is activated during amino acid starvation. Expression of the GCN4 mRNA is controlled by a peculiar mechanism based on differential translation re-initiation on four short uORFs in its 5' UTR (reviewed in [33]). Fig. 2 shows the 5' proximal region of the GCN4 transcript, with differential ribosome footprint coverage of uORFs in different strains. Our data can be used for further investigation of TMA20 and TMA64 roles in uORF-mediated translational control of stress response.
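One simple way to compare uORF occupancy between strains is to express footprint density on each uORF relative to the density on the main coding region of the same transcript. The sketch below is an illustration under stated assumptions rather than the method behind Fig. 2: the function names are hypothetical, intervals are half-open transcript coordinates supplied by the user, and no real GCN4 coordinates are implied.

```python
def count_in(positions, start, end):
    """Number of footprint positions inside the half-open interval [start, end)."""
    return sum(start <= pos < end for pos in positions)


def uorf_to_cds_ratio(positions, uorfs, cds):
    """Footprint density on each uORF divided by density on the main ORF.

    positions: footprint positions (for example inferred P-sites) in
        transcript coordinates.
    uorfs: dict mapping a uORF name to its (start, end) interval.
    cds: (start, end) interval of the main coding region.
    """
    cds_start, cds_end = cds
    cds_density = count_in(positions, cds_start, cds_end) / (cds_end - cds_start)
    if cds_density == 0:
        raise ValueError("no footprints on the main ORF")
    return {
        name: (count_in(positions, start, end) / (end - start)) / cds_density
        for name, (start, end) in uorfs.items()
    }
```

Because the ratio is internal to one transcript in one library, it is insensitive to differences in sequencing depth and mRNA abundance between strains, although it remains sensitive to the cycloheximide-related effects discussed above.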
APC4, the gene encoding a subunit of the anaphase-promoting complex, is located in the same genetic locus as TMA64 and shares a 238-bp promoter region with it. The corresponding mRNAs are synthesized from opposite DNA strands. In the Δtma64 and ΔΔtma20tma64 strains the TMA64 coding sequence was replaced with G418 or hygromycin resistance gene cassettes (KanMX or HygMX), respectively. Fig. 3 shows differential RNA-Seq coverage of the 238-bp region, flanked by segments of the APC4 and TMA64 coding regions or KanMX/HygMX cassettes, in different yeast strains. These data may account for the observed strong genetic interactions of TMA64 with genes involved in the cell cycle and mitotic checkpoints [34] and for the cell cycle abnormalities of TMA64 knockout strains [35].
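Coverage tracks such as those compared between strains can be derived from aligned read intervals and scaled by library size so that different samples are comparable. The following sketch assumes half-open (start, end) read intervals in the same coordinate system as the region of interest; the function name and the reads-per-million scaling are assumptions made for illustration and do not describe how Fig. 3 was produced.

```python
def coverage_per_million(reads, region_start, region_end, library_size):
    """Per-base read coverage over [region_start, region_end), per million reads.

    reads: iterable of (start, end) half-open aligned read intervals.
    library_size: number of mapped reads in the library, used for scaling.
    """
    length = region_end - region_start
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (length + 1)
    for start, end in reads:
        lo = max(start, region_start) - region_start
        hi = min(end, region_end) - region_start
        if lo < hi:  # the read overlaps the region
            diff[lo] += 1
            diff[hi] -= 1
    coverage = []
    depth = 0
    for step in diff[:length]:
        depth += step
        coverage.append(depth * 1e6 / library_size)
    return coverage
```

Applied to a window spanning the shared promoter and the flanking coding or cassette sequences, normalized profiles from wt and knockout libraries can be overlaid to look for a change in where transcript coverage begins.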