Enrichment and identification of Δ9-Tetrahydrocannabinolic acid synthase from Pichia pastoris culture supernatants

This data article refers to the report "Δ9-Tetrahydrocannabinolic acid synthase (THCAS) production in Pichia pastoris enables chemical synthesis of cannabinoids" (Lange et al., 2015) [2]. THCAS was produced on a 2 L lab scale using recombinant P. pastoris KM71 KE1. THCAS was enriched to a technically pure enzyme by dialysis and cationic exchange chromatography (CIEX). nLC-ESI-MS/MS analysis identified THCAS in different fractions obtained by cationic exchange chromatography.


Specifications
Type of data: Gene expression data (SDS PAGE), enrichment and isolation of protein (CIEX chromatogram), Proteome Discoverer search results (xls and ProtXML files), specific activities
How data was acquired: Fermentation, dialysis, CIEX, HPLC, SDS PAGE, one-dimensional nLC-ESI-MS/MS
Data format: Analyzed
Experimental factors: THCAS was produced with P. pastoris KM71 KE1; activity was determined; bands of protein gel were analyzed
Experimental features: THCAS is secreted in two active fractions with differences in glycosylation patterns
Data source location: N/A
Data accessibility: Data are provided as Supplementary material directly with this article; data are also related to Lange et al. [2]

Value of the data
Heterologous production of plant proteins in P. pastoris benefits from glycosylation, but one should be aware that it might result in a mixture of proteins with different glycosylation patterns and thus different enzymatic properties.
nLC-ESI-MS/MS is an efficient method to detect and identify low amounts of recombinant protein.
The simple two-step protocol for THCAS enrichment might also be applied for the fast isolation of other positively charged proteins from P. pastoris culture supernatants.
Chemical synthesis of complex and hydrophobic natural products can benefit from the implementation of biocatalytic reactions derived from the natural source.

1. Data, experimental design, materials and methods

Cloning
A codon-optimized sequence of the thcas gene (Fig. 1) was cloned into the plasmid pPICZαA (Life Technologies GmbH, Darmstadt, Germany) using EcoRI and NotI restriction sites. The native N-terminal signal sequence of thcas was removed and replaced by the α-mating factor signal peptide of Saccharomyces cerevisiae (encoded on the plasmid). P. pastoris MutS strain KM71 was transformed with 1 μg of the SacI-linearized plasmid according to existing protocols, resulting in the strain P. pastoris KM71 KE1 [1]. Afterwards, cells were plated on YPDS agar plates containing zeocin (100 μg mL⁻¹). To identify colonies of P. pastoris able to secrete active THCAS, the cells were cultivated in BMGY medium and BMMY medium, and activity was tested in cell-free culture supernatant.
The following media were used for transformation of the P. pastoris strain: YPD (Yeast extract Peptone Dextrose medium): 10 g yeast extract, 20 g peptone, 20 g glucose. For growth on solid medium, 15 g L⁻¹ agar-agar was added to the medium. YPDS: the same composition as YPD medium, but with additional 182.2 g of sorbitol per 20 g of agar.
Cultivation of P. pastoris KM71 KE1 on a technical lab scale was performed in a 2 L stirred tank bioreactor KLF2000 (Bioengineering AG, Wald, Switzerland) as described in [2]. During the fermentation, secreted protein was monitored by SDS PAGE (Fig. 3), indicating two bands of THCAS in the culture supernatant (~59 kDa non-glycosylated and ~74 kDa glycosylated). After the fermentation, cells were removed by centrifugation (12,300 × g, 4 °C, 1 h) in order to obtain the culture supernatant containing active THCAS. The supernatant was further used for activity assays or for the enrichment of THCAS (Fig. 4).

Enrichment/isolation of THCAS
A total of 1 L culture supernatant could be obtained via lab scale fermentation. Enrichment of THCAS was performed stepwise from this liter (Fig. 4). 50 mL of the culture supernatant was dialyzed against 5 L of 20 mM sodium citrate buffer at 4 °C using a ZelluTrans membrane (MWCO 6000-8000, Carl Roth, Karlsruhe, Germany). The buffer was exchanged twice. 50 mL of the obtained dialysate […] resulted in the elution of two active fractions (Fig. 5). Fig. 7 shows the two active fractions before and after treatment with EndoH. All steps were carried out at 4 °C.
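
The dialysis volumes above imply a large dilution of small solutes from the culture medium. The following is a rough sketch of that arithmetic, not part of the original protocol. It assumes complete equilibration in every round, an unchanged sample volume, and that "exchanged twice" means three buffer rounds in total (the initial fill plus two exchanges). The function name `residual_fraction` is hypothetical.

```python
def residual_fraction(sample_ml: float, buffer_ml: float, rounds: int) -> float:
    """Fraction of the initial small-solute concentration remaining after
    `rounds` full equilibrations, each against fresh buffer (idealized)."""
    per_round = sample_ml / (sample_ml + buffer_ml)
    return per_round ** rounds


# 50 mL sample against 5 L buffer; 3 rounds is an assumption (see lead-in).
frac = residual_fraction(50.0, 5000.0, rounds=3)
print(f"residual fraction of small solutes: {frac:.2e}")
```

Real dialysis rarely reaches full equilibrium, so this figure is best read as an upper bound on the achievable dilution.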

Activity tests
Activity tests were performed with samples of culture supernatant obtained from shaking flask experiments (BMMY medium, pH 6.0), bioreactor experiments (basal salt medium, pH 5.5) or with dialysate (20 mM sodium citrate buffer, pH 5.0) and active fractions (20 mM sodium citrate buffer, pH 5.0 containing 28 mM KCl in fraction 1 and 40 mM KCl in fraction 2). Activity assays (Fig. 6) were performed in 100 μL total volume. The assay was started by the addition of 150 μM CBGA (from a 10 mM stock solution of CBGA in MeOH). The tubes were incubated at 30 °C and 600 rpm in a thermoshaker (Eppendorf, Hamburg, Germany) for 0, 5, 10, 15, 30 and 60 min. The assay was stopped by the addition of 100 μL pure MeOH to the respective sample. After extensive mixing and centrifugation (10 min, 4700 rpm, 4 °C), 50 μL of the respective sample was analyzed by HPLC.
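
As a small worked example of the substrate addition, the stock volume follows from C1·V1 = C2·V2. The sketch below assumes a final CBGA concentration of 150 μM in a 100 μL assay, pipetted from the 10 mM stock; the helper name `stock_volume_ul` is hypothetical.

```python
def stock_volume_ul(final_um: float, stock_um: float, total_ul: float) -> float:
    """Volume of stock (in uL) needed to reach `final_um` in `total_ul` (C1*V1 = C2*V2)."""
    return final_um * total_ul / stock_um


total_ul = 100.0                 # assay volume (assumed to be 100 uL)
cbga_ul = stock_volume_ul(final_um=150.0, stock_um=10_000.0, total_ul=total_ul)
meoh_percent = cbga_ul / total_ul * 100.0  # MeOH carried over with the stock

print(f"CBGA stock to add: {cbga_ul} uL ({meoh_percent} % v/v MeOH in the assay)")
```

Under these assumptions the methanol carried into the assay with the substrate stays at a low percentage of the assay volume.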
The reader is referred to the associated research article [2] for a detailed description of the HPLC analysis method. Specific activity was calculated from the total protein amount in the respective sample, determined by the method of Bradford [3], and expressed as U g⁻¹ total protein (1 U equals the formation of 1 μmol THCA per minute). To follow the specific activity over the course of the assay, the product concentration at each time point was subtracted from the concentration at the following time point, giving the product newly formed between the two time points.
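
The interval-wise calculation described above can be sketched as follows. This is a minimal illustration, not the authors' evaluation script: it assumes 1 U is 1 μmol THCA per minute, that product concentrations are given in μM, and that the protein amount in the assay is known in μg. The function name and all numbers in the example are invented.

```python
def interval_specific_activities(times_min, product_um, assay_volume_ul, protein_ug):
    """Return U per g total protein for each interval between consecutive time points.

    1 U is taken as 1 umol product formed per minute (assumption).
    """
    volume_l = assay_volume_ul * 1e-6   # uL -> L
    protein_g = protein_ug * 1e-6       # ug -> g
    activities = []
    for i in range(1, len(times_min)):
        # Product newly formed in this interval: (umol/L) * L = umol
        new_product_umol = (product_um[i] - product_um[i - 1]) * volume_l
        units = new_product_umol / (times_min[i] - times_min[i - 1])  # umol/min = U
        activities.append(units / protein_g)
    return activities


# Invented numbers for illustration only, not data from the article.
times = [0, 5, 10]
thca_um = [0.0, 10.0, 18.0]
result = interval_specific_activities(times, thca_um, assay_volume_ul=100.0, protein_ug=2.0)
print([round(a, 1) for a in result])
```

Evaluating each interval separately, rather than dividing the final concentration by the total time, shows whether the activity declines during the assay.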

In-gel tryptic digestion
Protein bands were excised from the gels, cut into small cubes (ca. 1 × 1 mm²), and destained according to Schlüsener and colleagues [4]. Gel pieces were dried in a SpeedVac, and trypsin (porcine, sequencing grade; Promega, Mannheim, Germany) solution (12.5 ng μL⁻¹ in 25 mM ammonium bicarbonate, pH 8.6) was added until the gel pieces were completely immersed in digestion solution. The protein digestion was performed overnight at 37 °C with agitation (tempered shaker HLC MHR20, 550 rpm). After digestion, elution buffer (50% acetonitrile, 0.5% TFA, UPLC grade, Biosolve, Netherlands) was added (1 μL elution buffer for each μL of digestion buffer) and the samples were sonicated for 20 min in an ultrasonic bath. Samples were centrifuged and supernatants were transferred to new 1.5 mL tubes. The extracted peptides were dried using a SpeedVac and stored at −20 °C. Prior to MS analysis, peptides were resuspended in 20 μL of buffer A (0.1% formic acid in water, ULC/MS, Biosolve, Netherlands) by sonication for 10 min and transferred to LC-MS grade glass vials (12 × 32 mm² glass screw neck vial, Waters, USA). Each measurement was performed with 8 μL of sample.
Fig. 6. Specific activities over time from fermentation over dialysis and CIEX; (A) substrate concentration in control sample (cell-free culture supernatant before induction); as no activity/product formation of THCA could be observed, the activity is zero before induction; activity was determined with cell-free culture supernatant in basal salt medium (pH 5.5); (B) specific activity in culture supernatant after end of the fermentation in basal salt medium (pH 5.5); (C) specific activity in culture supernatant after dialysis in 20 mM sodium citrate buffer (pH 5.0); (D) specific activity in flow-through of CIEX step in 20 mM sodium citrate buffer (pH 5.5); (E) specific activity of active fraction 1 (59 kDa) in 20 mM sodium citrate buffer containing 28 mM KCl (pH 5.5); (F) specific activity of active fraction 2 (74 kDa) in 20 mM sodium citrate buffer containing 40 mM KCl (pH 5.5); specific activity was determined based on duplicate assays; specific activity was calculated based on total protein amount in the respective fraction, which was determined by the method of Bradford.

Protein identification
Proteins were identified using the SEQUEST [5] algorithm embedded in Proteome Discoverer 1.4 (Thermo Electron © 2008-2012), searching against the complete proteome database of Komagataella (Pichia) pastoris (strain GS115 / ATCC 20864) containing 5073 entries obtained from UniProt (UP000000314), supplemented with the protein sequence of THCAS. The mass tolerance for precursor ions was set to 10 ppm; the mass tolerance for fragment ions was set to 0.6 Da. Only tryptic peptides with up to two missed cleavages were accepted, and the oxidation of methionine was admitted as a variable peptide modification. The false discovery rate (FDR) was determined with the percolator validation in Proteome Discoverer 1.4 and the q-value was set to 1% [6]. For protein identification, the mass spec format (msf) files were filtered for peptide confidence "high" and two unique peptides per protein. Results were exported from Proteome Discoverer as Excel tables (Supplementary information S1-1 to S4-1) and as ProtXML files (S1-2 to S4-2).
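
To illustrate two of the search settings named above, the sketch below enumerates tryptic peptides with up to two missed cleavages and checks a precursor mass against a 10 ppm tolerance. It is an illustration only and not the SEQUEST or Proteome Discoverer implementation. The cleavage rule (after K or R, but not before P) is an assumed convention, and the toy sequence and m/z values are invented.

```python
import re


def tryptic_peptides(sequence: str, max_missed: int = 2) -> list:
    """Enumerate peptides from an in-silico tryptic digest.

    Assumed rule: cleave C-terminal to K or R unless the next residue is P.
    """
    # Zero-width split points after K/R that are not followed by P (Python 3.7+).
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", sequence) if f]
    peptides = []
    for i in range(len(fragments)):
        # Joining up to `max_missed` additional fragments models missed cleavages.
        for j in range(i, min(i + max_missed + 1, len(fragments))):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


def within_ppm(observed_mz: float, theoretical_mz: float, tol_ppm: float = 10.0) -> bool:
    """True if the relative mass error is within `tol_ppm` parts per million."""
    return abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6 <= tol_ppm


# Toy sequence, invented for illustration (not the THCAS sequence).
peptides = tryptic_peptides("MKAGRPLKTR", max_missed=2)
print(peptides)
print(within_ppm(1000.005, 1000.0), within_ppm(1000.02, 1000.0))
```

The fragment ion tolerance in the search is an absolute value (0.6 Da) rather than a relative one, so it would be checked as a plain difference instead of in ppm.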

Funding
This project was supported by funds from the Ministry of Innovation, Science and Research of North Rhine-Westphalia in the frame of CLIB-Graduate Cluster Industrial Biotechnology, contract no: 314-108 001 08.