Proteomic data of the Trypanosoma cruzi insect-dwelling epimastigotes overexpressing the RNA-binding protein UBP1

We present data on the proteome of the Trypanosoma cruzi epimastigote cells overexpressing the U-rich RNA-binding protein 1 (UBP1). The role of this regulatory protein during the epimastigote-to-metacyclic trypomastigote stage transition was clearly established by our group at the transcriptome level; nevertheless, the impact of UBP1 overexpression on protein synthesis is not known. To address this question, we performed shotgun label-free quantification proteomics using an in vitro system based on the tetracycline-inducible expression of TcUBP1 and epimastigote wildtype cells. Using tryptic peptide digestion and LC-MS/MS analysis with Orbitrap technology, this data file describes the proteome of three biological samples per condition and yields 1637 correctly quantified proteins. The statistical comparisons of the two analyzed groups within the Proteome Discoverer platform identified 379 differentially expressed proteins, with 207 being up-regulated and 172 being down-regulated. In addition, profile plots and heatmap analysis to visualize the distribution of protein abundances within replicates are also presented. Data are available via ProteomeXchange with identifier PXD047761.


a b s t r a c t
We present data on the proteome of the Trypanosoma cruzi epimastigote cells overexpressing the U-rich RNA-binding protein 1 (UBP1).The role of this regulatory protein during the epimastigote-to-metacyclic trypomastigote stage transition was clearly established by our group at the transcriptome level; nevertheless, the impact of UBP1 overexpression on protein synthesis is not known.To address this question, we performed shotgun label-free quantification proteomics using an in vitro system based on the tetracycline-inducible expression of TcUBP1 and epimastigote wildtype cells.Using tryptic peptide digestion and LC-MS/MS analysis with Orbitrap technology, this data file describes the proteome of three biological samples per condition and yields 1637 correctly quantified proteins.The statistical comparisons of the two analyzed groups within the Proteome Discoverer platform identified 379 differentially expressed proteins, with 207 being up-regulated and 172 being down-regulated.In addition, profile plots and heatmap analysis to visualize the distribution of protein abundances within replicates are also pre-

Value of the Data
• This is the first proteome analysis of Trypanosoma cruzi RNA-binding protein UBP1overexpressing parasites.• The proteomic dataset underlines the quantitative differences of significant up-or downregulated proteins as a direct effect of UBP1-overexpression. • The proteome changes might, in the future, lead to a better understanding of how epimastigote-to-metacyclic form differentiation is orchestrated.

Background
Extending the dataset previously obtained in the related study by Sabalette et al. [1] , here we present proteomic data for insect-dwelling Trypanosoma cruzi epimastigotes overexpressing TcUBP1 (TcCLB.507093.220),an RNA-binding protein with post-transcriptional roles in gene expression regulation [2] .We demonstrated that UBP1 induction in the epimastigote form regulates gene expression by affecting mRNA levels.This leads to the up-regulation of genes associated with the trypomastigote infective stage and the downregulation of genes related to ribosomal and synthetic pathway proteins, resembling the transition to the infective metacyclic trypomastigote stage [3] .The current study was initiated due to a lack of understanding regarding the influence of UBP1 expression at the level of protein synthesis.

Data Description
This data consists of a shotgun label-free proteomic experiment to investigate the differential protein expression profiles of T. cruzi epimastigote cells overexpressing the TcUBP1-GFP protein [3] .We used two different sam ples with three biological replicates each, of wild-type RNA-binding protein TcUBP1 and tetracycline-induced epimastigotes for 4 days, CTRL and UBP1-OE, respectively.In more detail, the proteomes of these cells were reduced, alkyated, digested, and analyzed by liquid chromatography-tandem mass spectrometry (LC-MS/MS) using a Q-Exactive mass spectrometer with Orbitrap technology.The obtained spectra were analyzed using the Proteome Discoverer program, utilizing the database specific to the T. cruzi (strain CL Brener) UP0 0 0 0 02296.The LFQ analysis revealed a total of 1637 correctly quantified proteins over 1749 identified proteins.Of these, 379 proteins were found to be two-fold differentially expressed between UBP1-GFP and CTRL samples: 207 were up-regulated and 172 were down-regulated.Supplementary Table 1 contains a list of the differential proteins characterizing the proteome of UBP1-overexpressing epimastigote cells.The list includes, for each protein: UniProt accession number, systematic gene name, protein description, and fold change value.Then, the datasets were examined to check the amount of UBP1 levels within the two conditions, with UBP1 being the experimental internal control of our proteomic analysis.The profile plots showed in Fig. 1 stress the trend of LFQ abundance of identified proteins and, separately, for UBP1 among samples (CTRL and UBP1-OE replicates).Furthermore, the R software [4] was utilized to create a heatmap that illustrates the distribution of protein abundances among the replicates ( Fig. 2 ).Some differences are observed between replicates, especially for the UBP1-OE_1 sample, and we cannot rule out that this may be due to biological variability within this group.However, in essence, these variations do not stem from different levels of UBP1, as the induction values in the triplicates ( Fig. 1 , Right) are consistent with those before obtained in our laboratory [3] .While some quantitative proteomics in trypanosomatids have shown that the transcriptome and the proteome positively correlate, it is not always possible to correctly predict protein levels based solely on mRNA levels due to extensive posttranscriptional mechanisms that control gene expression [5 , 6] .In addition, the proteome correlates much better with the translatome than with the transcriptome [5] .Taking this into consideration, we generated a log-log scatter plot illustrating fold change values for transcriptomic and proteomic experiments using UBP1-overexpressing cells ( Fig. 3 ).We found a positive but low correlation between both datasets ( r = 0.17).The obtained Pearson correlation coefficient was even lower than r = 0.31, which Smircich et al. reported for metacyclic trypomastigote cells [5] .However, it is interesting to note the presence of several downregulated ribosomal proteins in both approaches (in particular, 35 out of the  172 downregulated proteins in Supplementary Table 1 are related to the translation process).This result is consistent with previous findings from RNA-Seq experiments [1] and with parasites exhibiting an expression profile similar to the infective form.Finally, it is noteworthy that along with UBP1, other RNA-binding proteins are also up-regulated (TRRM1A, TcCLB.509317.60 [7] ; ZC3H39, TcCLB.508895.50 [8] ; and snoRBP, TcCLB.507649.80 [9] ), so we cannot ignore that the observed effect in UBP1-overexpressing epimastigotes may result from a regulatory cascade involving these proteins as well.

Cell sample preparation and LC-MS/MS analysis
TcUBP1 was inducibly expressed in the T. cruzi CL Brener epimastigote strain as described [3] .Control wild-type UBP1 and four-day tetracycline-induced parasites were cultured in epimastigote form at 28 °C.Proteins were extracted and analyzed by mass spectrometry at the Protemics Core Facility of the CEQUIBIEM, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Argentina.The protein bands that were excised from the Coomassie blue-stained SDS-PAGE gels were washed and destained using a series of solutions: 50 mM ammonium bicarbonate, 25 mM ammonium bicarbonate, 50% acetonitrile, and 100% acetonitrile.Then, protein bands were reduced with 20 mM dithiothreitol for 45 min to 56 °C and alkylated with 50 mM iodoacetamide for 45 min.After that, the proteins were in-gel digested with 100 ng trypsin (Promega V5111) in 25 mM ammonium bicarbonate overnight at 37 °C.The peptides were recovered by elution with 50% acetonitrile and 0.5% trifluoroacetic acid.This extraction process included brief sonication, and then the solution was further concentrated by using a speed-vacuum drying method.The samples were diluted in 15 μL of water with 0.1% formic acid, purified using C18 ZipTip (Merck Millipore), and extracted with 10 μL of a mixture of water, acetonitrile, and formic acid in a ratio of 40:60:0.1%.The digests were analyzed using nanoLC-MS/MS on a Thermo Scientific Q-Exactive Mass Spectrometer, which was coupled with a nanoHPLC EASY-nLC 10 0 0 (Thermo Scientific).For LC-MS/MS analysis, approximately 2 micrograms of peptides were introduced into a reverse-phase column with the following specifications: C18, 2 micrometers, 100 Å, 50 micrometers in diameter, and 150 mm in length.The Easy-Spray Column PepMap RSLC (P/N ES801) is designed for effectively separating protein complexes with exceptional resolution.The nano-column was operated at a flow rate of 300 nanoliters per minute, with the solvent composition ranging from 7% B for 5 min to 35% B for 120 min.Solvent A consisted of water with a 0.1% concentration of formic acid, while solvent B consisted of acetonitrile with a 0.1% concentration of formic acid.The volume of the injection was 2 microliters (μL).An electric potential of 1.5-3.5 kV was applied for the process of Electrospray Ionization using the Thermo Scientific EASY-SPRAY system.Full-scan mass spectra were obtained using an Orbitrap analyzer, covering a mass range of 40 0-20 0 0 m/z at a resolution of 70,0 0 0 at 400 m/z .In each cycle, the twelve most intense ions were consecutively isolated, subjected to higher-energy collision dissociation for fragmentation, and then measured in the Orbitrap analyzer.Peptides with a charge of + 1 or with an undetermined charge state were excluded from MS/MS fragmentation.

Proteomic analysis
The raw data from the Q-Exactive mass spectrometer was subjected to analysis using the Proteome Discoverer software (version 2.2, Thermo Scientific) and searched against the freely available database of the T. cruzi CL Brener strain (Organism ID: 353153, Proteome ID: UP0 0 0 0 02296, total proteins: 19,245) with trypsin specificity, allowing a maximum of two missed cleavages per peptide.Carbamidomethylation of cysteine residues was assigned as a fixed modification, and methionine oxidation was considered a variable modification.Searches in Proteome Discoverer utilized a precursor mass tolerance of 10 ppm and a product ion tolerance of 0.05 Da.Protein hits were filtered based on high-confidence peptide matches with a peptide false discovery rate of 1%, calculated using a reverse database strategy.

Limitations
Not applicable.

Fig. 2 .
Fig. 2. Heatmap of LFQ protein abundances to show the separation of the two analyzed groups (CTRL and UBP1-OE) using pheatmap function in R [4] .
sented.Data are available via ProteomeXchange with identifier PXD047761.© 2024 The Author(s).Published by Elsevier Inc.This is an open access article under the CC BY license ( http://creativecommons.org/licenses/by/4.0/ )