Protein expression/secretion boost by a novel unique 21-mer cis-regulatory motif (Exin21) via mRNA stabilization

Boosting protein production is invaluable in both industrial and academic applications. We discovered a novel expression-increasing 21-mer cis-regulatory motif (Exin21) that inserts between SARS-CoV-2 envelope (E) protein-encoding sequence and luciferase reporter gene. This unique Exin21 (CAACCGCGGTTCGCGGCCGCT), encoding a heptapeptide (QPRFAAA, designated as Qα), significantly (34-fold on average) boosted E production. Both synonymous and nonsynonymous mutations within Exin21 diminished its boosting capability, indicating the exclusive composition and order of 21 nucleotides. Further investigations demonstrated that Exin21/Qα addition could boost the production of multiple SARS-CoV-2 structural proteins (S, M, and N) and accessory proteins (NSP2, NSP16, and ORF3), and host cellular gene products such as IL-2, IFN-γ, ACE2, and NIBP. Exin21/Qα enhanced the packaging yield of S-containing pseudoviruses and standard lentivirus. Exin21/Qα addition on the heavy and light chains of human anti-SARS-CoV monoclonal antibody robustly increased antibody production. The extent of such boosting varied with protein types, cellular density/function, transfection efficiency, reporter dosage, secretion signaling, and 2A-mediated auto-cleaving efficiency. Mechanistically, Exin21/Qα increased mRNA synthesis/stability, and facilitated protein expression and secretion. These findings indicate that Exin21/Qα has the potential to be used as a universal booster for protein production, which is of importance for biomedicine research and development of bioproducts, drugs, and vaccines.


INTRODUCTION
Proteins play key roles in all physiological processes and pathological conditions. Protein expression is critical to biological and biomedical research and to biotechnology. However, generating proteins for large-scale applications has been challenging and costly in many cases. The demand for biologics has been growing steadily over the past three decades. According to a report by precedenceresearch.com, a market research and consulting organization, the global biopharmaceuticals market was estimated at US$ 265.4 billion in 2020 and is expected to reach over US$ 856.1 billion by 2030. In 1986, the FDA approved recombinant human tissue plasminogen activator expressed in mammalian cells as the first recombinant therapeutic protein. Today, recombinant proteins have emerged as vital biopharmaceuticals for the diagnosis and treatment of human diseases. [1] Since proteins produced in mammalian cells have the correct folding, assembly, and post-translational modification, about 60%-70% of recombinant protein pharmaceuticals are produced in mammalian cells, which play a central role in the biopharmaceutical industry. [2][3] The production yield of recombinant proteins in mammalian cells has increased significantly since the 1980s, from a few micrograms to a few grams per liter. This astonishing achievement in mammalian recombinant protein yields is due to long-term and continuous research progress in vector design, cell line engineering, medium, and bioprocess optimization. [1][2][3][4][5] Among these methods, optimization of vector components, such as promoter, Kozak sequence and signal domain, enhancer element, poly(A), intron and splice sites, internal ribosome entry sites, codon optimization, and marker selection, is particularly effective in increasing protein expression. [6][7][8][9] However, even with these strategies, some proteins can still be expressed only at very low or undetectable levels.
Furthermore, protein production in mammalian cells results in a much lower yield compared with E. coli expression systems. Therefore, developing a new, simple, and universal method that can significantly increase protein production at a lower cost remains a research focus, particularly where large-scale protein production is needed.
Studying SARS-CoV-2 has been hampered by the low-level expression of many viral proteins including the spike (S) protein in mammalian cells, [10] limiting the quick response to the COVID-19 pandemic. [11][12] To optimize SARS-CoV-2 viral protein expression, we developed various expression vectors using different promoters and a luciferase/green fluorescent protein (GFP)-based dual reporter system. During the vector optimization process, we serendipitously discovered that the addition of a 21-mer oligonucleotide motif (Exin21, expression-increasing 21) into the vector dramatically increased the expression and secretion of SARS-CoV-2 envelope (E) protein. This unique Exin21 encodes a specific heptapeptide (QPRFAAA), designated as Qα. We found that Exin21/Qα addition increased the productivity of various types of proteins in cells, including SARS-CoV-2 proteins S, nucleocapsid (N), and membrane (M), and accessory proteins (NSP2, NSP16, and ORF3), endogenous proteins human interleukin-2 (IL-2), interferon-γ (IFN-γ), angiotensin-converting enzyme 2 (ACE2), mouse NIK and IKK2-binding protein (NIBP), and anti-SARS-CoV monoclonal antibody (mAb). Exin21 also improved the SARS-CoV-2 S pseudovirus and lentivirus packaging and raised the mRNA synthesis and stability. Exin21/Qα thus demonstrates high potential to be widely applied as a simple and common booster for the production of therapeutic proteins, antibodies, and mRNA vaccines.

Discovery of a heptapeptide Qα in boosting viral protein expression/production
To study SARS-CoV-2 viral protein expression in mammalian cells, we generated a dual reporter system to measure the viral protein expression quantitatively and dynamically. We fused Gaussia-Dura luciferase (gdLuc) and destabilized GFP (dsGFP), abbreviated as LG, onto the C terminus of SARS-CoV-2 E protein (Figures 1A and S1). This design allows dual measures of the secretory gdLuc-fused target protein: in culture medium by sensitive gdLuc assay, and by dsGFP positivity and intensity (for dynamic resolution) [13][14][15] using fluorescence microscopy and flow cytometry. During the cloning of the E protein-expressing vector, we initially screened the correct clones by restriction enzyme digestion and tested positive clones E1 and E7 for protein expression by gdLuc assay (Figure 1B) and fluorescence microscopy (Figures 1C and S1A). To our surprise, the E7 clone exhibited >20-fold higher luciferase activity than E1. We then examined the E7 DNA sequence by Sanger sequencing. Unexpectedly, we discovered that E7 had an additional 21-nucleotide sequence (CAACCGCGGTTCGCGGCCGCT) that encodes 7 amino acids (aa) in frame, downstream of the FLAG tag and upstream of LG (Figure 1A). We designated this heptapeptide as Qα based on the pronunciation of its aa sequence (QPRFAAA) and named its linked Flag-LG as Flag-QLG. We confirmed that transfection of pcDNA6B-E-Flag-QLG (E7) exhibited up to 90-fold higher gdLuc activity than pcDNA6B-E-Flag-LG (E1) in at least 20 independent experiments (Figure 1D). The Qα boosting feature was validated in experiments using an all-in-one vector that includes secreted embryonic alkaline phosphatase (SEAP) or Cypridina luciferase (cLuc) for normalization (Figure 1E), although these constructs showed less fold induction, possibly due to the larger vector size and lower transfection efficiency.
Placing Qα in the 3′ untranslated region (UTR) of the LG reporter produced no boosting activity, confirming the in-frame requirement of Qα in the coding LG reporter (Figure 1F). When the E expression cassette alone or alongside dsGFP was removed, the remaining Qα sequence between Flag and the N terminus of LG or L (Flag-Q-LG or Flag-Q-L, respectively) retained its boosting activity (Figure 1G).
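The coding relationship between the 21-mer and the heptapeptide can be checked directly. The following is a minimal sketch, assuming the standard genetic code; the helper name `translate` is ours and is not part of the study's methods.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# The 21-nucleotide insert reported for clone E7.
EXIN21 = "CAACCGCGGTTCGCGGCCGCT"


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence codon by codon."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3 to stay in frame")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


print(translate(EXIN21))  # QPRFAAA
```

Because the insert is exactly seven codons long, it adds seven residues without shifting the reading frame of the downstream LG reporter.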
We then examined the effect of this Qα addition (X-Flag-Q-LG) on the expression of other SARS-CoV-2 proteins, including S, N, M, NSP2, NSP16, and ORF3. We found that Qα boosts the production of all the tested viral proteins (Figures 1H, 1I, 2A, S1, and S2), with an efficiency ranging from 3- to 3,848-fold, depending upon the respective protein. Such variation of Qα boosting efficiency may result from differences in cellular density/function, transfection efficiency, reporter dosage, and viral protein types as well as 2A-mediated auto-cleaving efficiency and secretion signaling. Taken together, this Qα heptapeptide can boost the expression/production of SARS-CoV-2 viral proteins in mammalian cells.
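The fold changes reported here are ratios of normalized reporter activity (QLG over the corresponding LG group). As a minimal sketch of that arithmetic, assuming per-well normalization of gdLuc by a co-transfected control reporter such as SEAP, and using invented readings purely for illustration (none of the numbers below come from the study):

```python
from math import sqrt
from statistics import mean, stdev


def normalize(reporter, control):
    """Divide each well's reporter reading by its normalization reading."""
    return [r / c for r, c in zip(reporter, control)]


def mean_se(values):
    """Return the mean and the standard error of the mean."""
    return mean(values), stdev(values) / sqrt(len(values))


def fold_change(test, reference):
    """Ratio of mean normalized activity, test group over reference group."""
    return mean(test) / mean(reference)


# Hypothetical quadruplicate wells (arbitrary luminescence units).
lg_norm = normalize([1000, 1200, 800, 1000], [10, 12, 8, 10])
qlg_norm = normalize([30000, 36000, 44000, 20000], [10, 12, 11, 8])

print(mean_se(qlg_norm))
print(fold_change(qlg_norm, lg_norm))  # 31.25
```

Normalizing before taking the ratio keeps well-to-well differences in transfection efficiency from being misread as boosting.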
Figure 1. (A) In-frame LG for Gaussia-Dura luciferase (gdLuc) and destabilized green fluorescent protein (dsGFP) and Qα-tagged LG (QLG) fused with viral protein and potential multiple measures of viral protein expression/production. Exin21/Qα stands for the expression-increasing 21-mer nucleotide motif and its corresponding heptapeptide. (B-D) Representative experiments showing Exin21 boosting of SARS-CoV-2 envelope (E) protein dynamic production (B) determined by gdLuc assay in supernatants 48-72 h after transfection, representative images of Exin21-boosted dsGFP expression detected by fluorescence microscopy (C), and average fold induction with results of 20 experiments with 3-4 replicates (D). Cells were transfected with indicated reporter pcDNA6B vectors (6B, 100 ng/well, quadruplicate) and normalization vector (20 ng/well). Data represent mean ± SE of gdLuc activity after normalization in supernatants at 48 h post-transfection (in most cases) and relative fold changes (in red) in QLG over corresponding LG groups (the same below). (E) Exin21 boosting after normalization with separate or all-in-one vectors. (F) Exin21 insertion in the 3′ untranslated region (UTR) of E-LG exhibiting no boosting activity. (G) Exin21 addition only to the reporter LG or L retaining boosting activity. (H and I) Representative gdLuc assay showing various degrees of Exin21 boosting in other SARS-CoV-2 structural proteins: spike (S), nucleocapsid (N), and accessory proteins: NSP2, NSP16, and ORF3. (J-L) Alanine scanning and deletion mutation (J) as well as degenerate (K) and missense (L) mutation assays showing the critical role of the unique and specific Exin21 in boosting E-LG production. Data represent mean ± SE of gdLuc activity, with relative percentage changes compared with the parent E-QLG group. The inset in (J) shows the heptapeptide structure with the residue position. Insets in (K) and (L) show the mutated nucleotides and corresponding aa residues. dQ, degenerate QLG mutants; mQ, missense QLG mutants.

Unique 21-mer oligonucleotide cis-regulatory motif contributing to Qα boosting
Given that the Qα insertion needs to be in the same open reading frame (ORF) as the targeted genes for protein expression and functional detection, we initially speculated that the heptapeptide Qα plays a critical role in boosting protein production. Thus, we performed alanine scanning and deletion mutation assays (Figure 1J) to determine the role of individual aa residues in regulating Qα function at the peptide level. All tested mutations impaired the boosting activity to various extents, from >57% loss of boosting activity to almost complete loss in the F4A mutation, indicating that each residue of this unique Qα heptapeptide appears to be important for the boosting activity, with the fourth residue, phenylalanine (F), being the most critical. To explore the contribution of the underlying oligonucleotides at the RNA level, we created synonymous (silent, degenerate) mutations that only change nucleotides without affecting the aa sequence. Unexpectedly, we found that all the degenerate mutants tested showed significant loss (>90%) of Qα boosting activity (Figure 1K), indicating that Qα boosting activity derived predominantly from the action of the 21-mer oligonucleotide motif instead of the unique heptapeptide. We next performed nonsynonymous (missense) mutation assays while retaining the ORF sequence required for reporter expression. All the tested mutants lost the boosting activity to various degrees compared with the parent Qα group (Figure 1L). These data suggest that both the sequence (composition) and the number of nucleotides of this 21-mer motif are critical for the Qα boosting activity. The most recent alignment analysis using nucleotide BLAST did not identify any sequence that 100% matches the 21 nucleotides in any species/organisms, but some sequences covering 20 or 19 nucleotides showed 100% identity in some non-mammalian species. In mammals (e.g., human and mouse), 100% identity can be found over ≤14 (for human) or 15 (for mouse) nucleotides.
For the heptapeptide (QPRFAAA), protein BLAST found no 100% match in mammalian genes or viral proteins, although the sequence is present in bacteria, fungi, and other non-mammalian organisms, where it may not have the same function as Exin21, which has not been found in any organism so far. Therefore, we assigned Exin21 as a new name for the unique expression-increasing 21-mer cis-regulatory motif, which encodes an epitope tag (Qα).
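To give a sense of the sequence space probed by the degenerate (synonymous) mutation assays, the following sketch enumerates every 21-mer that still encodes QPRFAAA under the standard genetic code. It is illustrative only; the function names are ours, and the specific mutants tested in the study are those shown in Figure 1K, not this full set.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
EXIN21 = "CAACCGCGGTTCGCGGCCGCT"


def translate(dna):
    """Translate an in-frame DNA sequence with the standard genetic code."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def synonymous_codons(codon):
    """All codons encoding the same amino acid as the given codon."""
    return [c for c, aa in CODON_TABLE.items() if aa == CODON_TABLE[codon]]


def synonymous_variants(dna):
    """Yield every sequence encoding the same peptide, including the input."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    for combo in product(*(synonymous_codons(c) for c in codons)):
        yield "".join(combo)


variants = list(synonymous_variants(EXIN21))
print(len(variants))  # 6144 = 2 (Q) x 4 (P) x 6 (R) x 2 (F) x 4 x 4 x 4 (A, A, A)
```

Only one of these 6,144 peptide-preserving sequences is Exin21 itself, which is why loss of boosting in the synonymous mutants points to the nucleotide sequence rather than the encoded heptapeptide.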

Broad capability of Exin21/Qα addition to boost protein expression/production
To expand the potential applications of Exin21/Qα in boosting protein expression and production, we performed similar assays in different types of proteins, mammalian cells, and species. We observed similar boosting effects for many non-viral proteins (Figures 2B-2E). Interestingly, transfection with a lower amount of plasmid DNA in HEK293T cells yielded higher boosting efficiency for most SARS-CoV-2 viral proteins (Figures 2A and S2), but not for host cellular gene products such as mouse NIBP [16] (Figure 2B) and human ACE2 (hACE2) (Figure 2C), or cytokines such as IFN-γ (Figure 2D) and IL-2 (Figure 2E), although dose-dependent responses in reporter activities remained similar. Exin21/Qα induced stronger boosting of SARS-CoV-2 E protein in the presence of the stronger CAG promoter (Figure 2F). We further found that similar boosting of protein expression and production occurred in other cell types including HeLa, BHK, and others (Figure 2G). In addition to being functional in regular plasmids, Exin21/Qα also exhibited boosting activity in viral transfer vectors such as lentiviral (LV) vectors (Figures 1G, 2D, and 2E). In summary, Exin21/Qα addition has a broad capability of boosting protein expression/production across various gene products, vectors, mammalian cell types, and species.

Exin21/Qα enhancement of antibody production
mAb-based therapeutics require the optimization of antibody production in suitable cell culture platforms, which relies on high-performance expression vectors. To achieve this, genetic elements in mAb production vectors have been widely modified. To determine if Exin21/Qα addition would be able to boost antibody production, we used a human anti-SARS-CoV mAb (Bei, CR3022), which contains the variable regions of heavy and light chains (GenBank: DQ168569 and DQ168570, respectively) as a test platform. We inserted Qα into the C termini of the immunoglobulin heavy (H) and light (L) chains of CR3022 to generate Qα-tagged HQ and LQ (Figure 3A). We co-transfected plasmids encoding HQ and LQ into HEK293T cells to generate Qα-tagged mAb, using the original H and L vectors (NR52399 and NR52400) as controls. We collected mAb-containing supernatants 2-3 days after transfection and measured their mAb levels by enzyme-linked immunosorbent assay (ELISA) using SARS-CoV-2 S protein as the coating antigen (Figures 3B and 3C). We found that Exin21/Qα boosted mAb production by up to 37-fold, with or without normalization of transfection efficiency (Figure 3D). Boosting averaged 13-fold across 16 independent experiments, even with varied experimental conditions (cell density, transfection efficiency, and ELISA variations) (Figure 3E). We further confirmed that Exin21/Qα boosted mAb production by western blot analyses of cell culture supernatants (Figure 3F). These data indicate that Exin21/Qα addition robustly boosts mAb production/secretion.

Exin21/Qα enhancement of SARS-CoV-2 S pseudovirion production
Pseudotyped viruses have been widely used in studies not only for gene delivery, but also for vaccine production, antibody neutralization, cellular entry, and pathogenic exploration. Pseudovirion is an excellent alternative to high-risk viruses such as SARS-CoV-2 and its variants [17][18][19][20][21][22] and does not require the usage of BSL3 facilities. Pseudovirions are virus-like particles (VLPs) coated with viral surface or membrane proteins that harbor specific cellular tropisms. [20][22][23] VLPs pseudotyped with SARS-CoV-2 S protein evoke stronger immune responses than any individual viral protein due to their three-dimensional structures, similar to those of live virus. [20][22][23] SARS-CoV-2 S protein has been widely used to generate S pseudovirion, but the packaging efficiency for lentivirus-like (LVLP) or vesicular stomatitis virus-like (VSVLP) particles has been low in most reports, even with the codon-optimized C-terminal deletion S protein. [17][18][20][24] Given that Exin21/Qα addition boosts S protein production in mammalian cells, we speculated that it might boost the packaging efficiency of S pseudotyped LVLP (S-LVLP). By applying the widely used C-terminal 18 aa-deleted codon-optimized SARS-CoV-2 S protein (Sd18) as a test platform (Figure 4A), we validated that Exin21/Qα addition on the C-terminal Sd18 (Sd18Q) boosted Sd18 expression as demonstrated by western blot analysis (Figure 4B). We also found that Exin21/Qα addition increased S-LVLP packaging efficiency by ~2- to 4-fold in HEK-hACE2 cells (Figure 4C). To provide dynamic measurement of S-pseudovirion transduction, we tested the packaging efficiency of the dual-reporter LV vector pRRL-E-QLG, which harbors inserts larger than the GFP insert alone. As expected, the original Sd18 showed a significantly lower packaging efficiency than that of Exin21/Qα addition (Sd18Q) when using the transfer vector pRRL-E-QLG for the packaging of S-LVLP (Figures 4D and 4E).
These data demonstrate that Exin21/Qα addition in the Sd18 expression system significantly boosts packaging and transduction efficiencies of SARS-CoV-2 S-LVLP.

Exin21/Qα enhancement of lentivirus production
Viral gene therapy has been extensively studied and actively applied to clinical diseases. AAV and LV are the most promising strategies for viral gene therapy, but viral packaging efficiency (production yield) has been a bottleneck for progress in the field. [25] This extends to therapeutic applications of other technologies such as genome editing by CRISPR-Cas, where viral packaging efficiency is also a rate-limiting factor for development of novel therapeutics. [26] Generally, the level of mRNA supplied by the LV transfer vector can affect LV packaging efficiency. We hypothesized that Exin21 addition in the LV transfer vector can elevate transgene mRNA levels during packaging and thereby boost the efficiency of LV packaging and gene delivery. We tested this idea by comparing the LV transfer vectors pRRL-E-LG and pRRL-E-QLG for standard LV packaging (psPAX2 and VSV-G). After LV infection of HEK293T cells, Exin21/Qα increased production of the transgene reporter gdLuc from the transfer vector (Figure 4F), similar to its boosting efficiency in transfected cells (Figures 2D and 2E). However, Exin21/Qα addition in the transfer vector did not increase the packaging efficiency (i.e., the titer of packaged LV; data not shown). We saw similar changes using LV-spCas9-Q-RFP and LV-MS2-spCas9-Q-GFP (Figures 4G and 4H), for which packaging efficiency is usually <1% that of standard LV-RFP or LV-GFP. These data suggest that the Exin21 addition does not increase the mRNA level of the transfer gene during LV packaging and thus does not increase packaging efficiency, but Exin21/Qα addition does enhance production of transgene protein in the transduced cells (Figures 4F and 4G). This is supported by further qRT-PCR analysis at 24-72 h after initiation of LV packaging, showing that the mRNA levels of the transgene E-QLG during LV packaging were reduced significantly by Exin21/Qα addition (Figure 4I).
We also tested if Exin21/Qα addition on LV packaging proteins such as Gag, Pol, and RRE via the packaging vector psPAX2 could boost packaging efficiency. Interestingly, Exin21/Qα addition to Gag significantly impaired, rather than augmented, the LV packaging, but addition to Pol and RRE significantly boosted packaging of pRRL-GFP (Figures 4J-4L). These data suggest that proper insertion of Exin21/Qα in the LV packaging vectors could boost packaging and transduction efficiency.

Exin21/Qα enhancement of vaccine production via increasing mRNA stability
Another immediate application of Exin21/Qα addition may be in the elevation of vaccine yields for the urgent fight against the COVID-19 pandemic. Currently, the most effective vaccines against SARS-CoV-2 and its variants are derived from mRNA or DNA encoding S protein. [27] As shown in our above results, Exin21/Qα addition increased S protein expression by ~3- to 24-fold in a CMV-driven cDNA expression vector (Figure 2A). If we could apply such enhancement to vaccine production at large scale, it would greatly reduce costs and expedite the availability of COVID-19 vaccines. Since mRNA vaccines exhibit numerous advantages over other vaccines and the application of SARS-CoV-2 S protein mRNA-based vaccines is now well established in humans, we hypothesized that Exin21/Qα addition could also boost mRNA-dependent translation of SARS-CoV-2 proteins, increasing vaccination efficiency. To test this idea, we generated a capped mRNA with Exin21 insertion by in vitro transcription (promoter independent) and examined if Exin21/Qα addition would boost production of viral proteins after mRNA transfection in HEK293T cells (Figure S3). Our data showed that the presence of Exin21/Qα significantly increased the production of SARS-CoV-2 S protein from transfected functional mRNAs in a time- and dose-dependent manner (Figure 5A). We further found that this protein production-boosting motif can be universally applicable to mRNAs of other SARS-CoV-2 proteins including N, E, and ORF3 and the host cellular gene hACE2 (Figures 5B, 5C, and S3). The mRNA-based vaccine uses N1-methyl-pseudouridine-modified mRNA to increase the stability and decrease innate immunogenicity. [28][29][30] To further validate the boosting effect of Exin21/Qα addition on mRNA vaccine, we performed in vitro transcription using N1-methyl-pseudouridine.
We found that modified mRNA shows even stronger boosting activity by Exin21/Qα addition compared with non-modified mRNA (Figure 5D). These data suggest that Exin21/Qα addition could increase the production of mRNA vaccine by facilitating mRNA stability and/or translational efficiency in a transcription-independent manner.
To further determine if Exin21/Qα addition regulates mRNA-dependent translation, we measured the dynamic changes of translational products after inhibiting transcription with actinomycin D. In the absence of Exin21/Qα addition, actinomycin D completely blocked the production of viral protein E (Figure 5E) and ORF3 (Figure 5F), measured by gdLuc activity. In contrast, Exin21/Qα addition showed a time-dependent increase of the protein expression and production/accumulation despite transcriptional inhibition (Figures 5E and 5F), suggesting that Exin21/Qα addition acts via post-transcriptional regulation (e.g., increased mRNA stability and/or translation efficiency) independently of transcription. To further determine if Exin21 addition influences mRNA stability of targeted genes, we used a traditional mRNA decay assay for E and S viral proteins. Although E and S viral mRNAs exhibited different patterns of changes during the time course, Exin21/Qα addition on both viral E (Figure 5G) and S (Figure 5H) proteins increased the half-life of the encoded mRNAs by ~3- to 6-fold.
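A half-life comparison of this kind is commonly derived by fitting first-order decay to the mRNA remaining after transcriptional shutoff. The sketch below assumes single-exponential decay, which the observed E and S time courses may not strictly follow, and uses synthetic values chosen only to illustrate the calculation; it is not the analysis pipeline used in this study.

```python
import math


def fit_half_life(times_h, fraction_remaining):
    """Estimate half-life by least-squares fit of ln(fraction) against time.

    Assumes first-order decay: fraction(t) = exp(-k * t), so the slope of
    ln(fraction) versus t is -k and the half-life is ln(2) / k.
    """
    logs = [math.log(v) for v in fraction_remaining]
    n = len(times_h)
    t_bar = sum(times_h) / n
    y_bar = sum(logs) / n
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times_h, logs))
    sxx = sum((t - t_bar) ** 2 for t in times_h)
    decay_rate = -sxy / sxx
    return math.log(2) / decay_rate


# Synthetic time courses (hours after actinomycin D), hypothetical values only.
times = [0, 2, 4, 8]
without_motif = [2 ** (-t / 2) for t in times]  # built with a 2 h half-life
with_motif = [2 ** (-t / 8) for t in times]     # built with an 8 h half-life

ratio = fit_half_life(times, with_motif) / fit_half_life(times, without_motif)
print(round(ratio, 2))  # 4.0
```

Fitting on the log scale weights all time points comparably; with noisy late time points near the detection limit, a nonlinear fit or exclusion of those points may be preferable.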
To assess more accurately the role of Exin21/Qα addition in regulating the dynamic changes of targeted mRNA in cells, we performed single-molecule fluorescent in situ hybridization (smFISH) via hybridization chain reaction (HCR), [31][32] Click-iT 5-ethynyl uridine (EU) nascent RNA capture, [33] and thiol(SH)-linked alkylation for the metabolic sequencing (SLAM-seq). [34][35][36] HCR assay identified the typical puncta signal of E-LG mRNA (detected by LG probe) localized mainly in the cytosol of HEK293T cells at early time points (6 h) after transfection using a lower amount of plasmid DNA (<20 ng per well of 8-well coverglass chamber slide) (Figure 6A). Using an increased amount of plasmid DNA or waiting until later time points after transfection led to a strong, diffuse, smear-like non-puncta signal throughout the cytosol. HCR specificity was validated by experimentation in the absence of gdLuc and/or cLuc probes during the HCR procedure (Figure S4A). Parallel comparison under the same imaging conditions showed that Exin21/Qα addition robustly increased the number and intensity of the puncta signal per cell at 6 h in a dose-dependent manner (Figures 6A and S4). Click-iT nascent RNA capture assay (Figure 6Ba) confirmed the strong boosting of nascent E-LG mRNA synthesis by 97-fold at pulse 1 h in the group with Exin21/Qα addition (E-QLG) compared with the E-LG group (Figure 6Bb). At pulse 3 h, Exin21/Qα boosting was increased to 15,468-fold while the control E-LG showed a 935-fold increase of the nascent mRNA (Figure 6Bb). From pulse 1 to 3 h, the increase of mRNA synthesis in the E-QLG group (159-fold) was less than that in the E-LG group (935-fold), possibly due to saturation at the already high mRNA level in the E-QLG group (Figure 6Bc). After removal of EU (chase phase), the mRNA levels rapidly reduced in the control E-LG group; however, Exin21/Qα addition increased the half-life (Figure 6Bb).
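The fold values quoted for the Click-iT pulse phase are mutually consistent if all of them are expressed relative to the E-LG signal at pulse 1 h. That common reference is our reading of the numbers, not something stated explicitly in the text; under that assumption the arithmetic works out as follows.

```python
# Reported fold values, assumed to share one reference: E-LG at pulse 1 h = 1.
e_qlg_pulse_1h = 97.0
e_qlg_pulse_3h = 15468.0
e_lg_pulse_3h = 935.0

# Increase within the E-QLG group from pulse 1 h to pulse 3 h.
e_qlg_increase = e_qlg_pulse_3h / e_qlg_pulse_1h
print(round(e_qlg_increase))  # 159, matching the reported 159-fold

# E-QLG relative to E-LG at the same 3 h time point.
ratio_at_3h = e_qlg_pulse_3h / e_lg_pulse_3h
print(round(ratio_at_3h, 1))  # 16.5
```

Read this way, the E-QLG advantage over E-LG narrows from 97-fold at 1 h to roughly 16.5-fold at 3 h, in line with the suggested saturation in the E-QLG group.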
SLAM-seq utilizes s4U labeling of the nascent RNA and IAA alkylation (for the T > C conversion) to map RNA synthesis dynamics for the existing individual transcripts. The frequency of T > C conversion can be detected with the cost-effective and fast EZ-amplicon sequencing and CRISPResso2 bioinformatics analysis (Figure S5A). We selected the EGFP reporter as the target transcript for the EZ-amplicon (Figure S5B) to detect the effect of Exin21/Qα addition on the synthesis and degradation of E-LG mRNA in cultured HEK293T cells after transfection with an equal amount of plasmid DNA. The T > C conversion rates across the entire amplicon and around the selected target site were apparently higher in the E-QLG group than in the E-LG group (Figures S5C and S5D). As shown in Figure 6Ca,b, the T > C conversion frequency at pulse 1 and 2 h was around 8% in the control E-LG group but increased to around 18% in the E-QLG group, suggesting that Exin21/Qα addition increased E-LG nascent mRNA synthesis during the pulse phase. The T > C rate continued to increase to around 21% in both groups at chase 1 h (Figure 6Cc) but decreased to 13% in the control E-LG group at chase 2 h, while remaining at 19% in the E-QLG group (Figures 6C and 6D), indicating that Exin21/Qα addition increased E-LG mRNA stability.
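The quantity compared between groups is the fraction of reference T positions read as C. The sketch below shows one simple way to compute such a frequency from reads that are already aligned to the amplicon without gaps. It is a simplified stand-in and not the CRISPResso2 analysis used in the study; the function name and toy sequences are hypothetical.

```python
def tc_conversion_rate(reference, aligned_reads):
    """Fraction of reference T positions observed as C across all reads.

    Assumes each read is gap-free and the same length as the reference.
    Positions where a read shows neither T nor C are ignored.
    """
    t_positions = [i for i, base in enumerate(reference) if base == "T"]
    converted = 0
    unconverted = 0
    for read in aligned_reads:
        if len(read) != len(reference):
            raise ValueError("reads must be aligned to the full reference length")
        for i in t_positions:
            if read[i] == "C":
                converted += 1
            elif read[i] == "T":
                unconverted += 1
    total = converted + unconverted
    if total == 0:
        raise ValueError("no informative T positions covered")
    return converted / total


# Toy example: three T positions in the reference, four reads.
reference = "ATTGCT"
reads = ["ATTGCT", "ACTGCT", "ATCGCC", "ATTGCT"]
print(tc_conversion_rate(reference, reads))  # 0.25
```

In practice, background T > C rates from sequencing error and from unlabeled controls would need to be subtracted before comparing groups.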
Taken together, these data indicate that the addition of Exin21/Qα to a given target mRNA significantly increases nascent mRNA synthesis, mRNA stability, and perhaps translational efficiency, thereby boosting protein expression and production of the targeted mRNA (e.g., S protein mRNA vaccine).

Exin21/Qα boosting of targeted protein secretion
As we found above, Exin21/Qα addition elevated expression of various types of targeted proteins. Aiming to test if Exin21/Qα addition can boost the protein expression of the E protein dual reporter within cells, we unexpectedly found that E-QL protein levels in the cell lysates were remarkably reduced rather than increased in the Exin21/Qα group when detected by western blot analysis with an anti-FLAG antibody (Figure 7A), even though Exin21/Qα addition robustly increased gdLuc activity in culture supernatants (Figure 1D). We found similar reductions by Exin21/Qα addition in the corresponding intracellular levels of other viral proteins (S and N), and host cellular proteins (IFN-γ, IL-2, and hACE2) (Figures 7B and 7C).
Based on these unexpected observations, we hypothesized that robust Exin21-induced increases in supernatant gdLuc activity must involve the protein secretion process. This idea is supported by the Exin21-induced boosting that we observed in antibody secretion (Figure 3) and secretory IFN-γ and IL-2 (Figures 2D and 2E). To corroborate this secretion-boosting activity, we analyzed the protein levels of secretory E-Flag-Q-gdLuc (E-QL) in cell culture supernatants using serum-free media. We found that cleaved E-QL and GFP as well as non-cleaved E-Flag-Q-gdLuc-GFP (E-QLG) were detectable by western blot analyses using anti-gdLuc and anti-GFP antibodies in unconcentrated supernatants (40 μL from 100 μL) of the E-QLG group (Figures 7D and S6). Increases in the level of secretory protein in the supernatant (Figure 7D) were confirmed by the boosting in the gdLuc activity (Figure 7E). Protein secretion was blocked by treatment with brefeldin A (BA), an endoplasmic reticulum (ER)-Golgi protein trafficking inhibitor (Figures 7F, 7G, and S7). To further confirm the secretion-enhancing feature of Exin21/Qα addition, we used non-secretory firefly luciferase (fLuc) expression and activity assays. Cellular levels of fLuc protein expression and enzyme activity were significantly increased in cell lysates, but no or very little fLuc activity was detectable in supernatants, even in the presence of Exin21/Qα addition (Figures 7H and 7I), which is consistent with the increase in non-secretory protein spCas9 expression (Figure 4G). The increase in the cellular E-Q-fLuc mRNA levels was validated by qRT-PCR assay (Figure 7J). Thus, Exin21/Qα addition appears to boost expression of targeted proteins and facilitate their secretion. We noted that auto-cleavage by the 2A system of most of the targeted proteins was incomplete, varying among different proteins (Figures 7C, 7D, 7F, S6, and S7), which has been reported by others. [37][38]

DISCUSSION
In this study, we report the fortuitous discovery of a novel and unique Exin21/Qα cis-regulatory motif that has versatile capabilities of boosting the expression and secretion of targeted proteins. This cis-regulatory Exin21 is reminiscent of the secretion-enhancing cis-regulatory targeting element (SECReTE) that was recently identified by computational analysis to facilitate ER-localized mRNA translation and protein secretion. [39] This SECReTE motif is enriched in nearly all mRNAs encoding secreted/membrane proteins in eukaryotes and its addition results in enhanced protein secretion. [39] It also boosts protein expression and secretion when added to an mRNA for an exogenously expressed protein such as GFP. [39] However, our Exin21 has many features significantly different from SECReTE: (1) no triplet repeats such as NNY or NYN, (2) unique and exclusive composition/order of the 21 nucleotides, (3) smaller size (21-mer) than SECReTE (≥30-mer from ≥10 triplet repeats), and (4) absence in any cellular or viral genes. In addition, Exin21/Qα is also quite different from the activity-enhancing motif that involves the promoter enhancer [40][41][42] or anti-sense activity, [43] and codon optimality-mediated mRNA stability, [44][45][46][47] as well as the cis-acting protein stabilon [48][49] or degron. [50][51][52] Stabilon (13-mer) can also increase mRNA stability in addition to boosting protein production. [49] Our data indicated that adding the Exin21 motif to a given mRNA could remarkably increase the corresponding protein expression and secretion. We clearly demonstrated this feature in different types of proteins including viral, non-viral, intracellular, structural, and secretory proteins. The extent of such boosting power varied, with proteins such as N and ORF3 exhibiting increases of up to thousands of fold. We believe these findings could translate into a paradigm shift in applied protein production in research and industry.
We explored the range, extent, and mechanisms of these Exin21/Qα actions using a variety of tools, approaches, and target proteins. Exin21/Qα addition robustly augmented production of a secretory gdLuc fusion protein derivative of multiple SARS-CoV-2 structural proteins (S, M, N, and E), the accessory proteins (NSP2, NSP16, and ORF3), and the host cellular gene products (Figures 1 and 2). Among those we tested, the protein production-boosting actions of Exin21/Qα were largely independent of the specific promoter used, although it did elicit stronger enhancement of protein production in combination with the stronger CAG promoter (Figure 2F). Exin21/Qα addition enhanced mRNA-dependent production of targeted viral and non-viral protein fusion reporters, as determined by in vitro RNA transcription and mRNA transfection, followed by dual reporter assays (Figure 5). Exin21/Qα enhanced the yield of the S-containing pseudoviruses and standard LV packaging (Figure 4). Exin21/Qα addition increased the release of secreted host proteins, including a robust enhancement of antibody production when Exin21/Qα was placed in antibody heavy and light chains (Figure 3), and augmented the secretion of IFN-γ and IL-2 (Figure 7). Exin21/Qα actions were blocked by an ER-Golgi-trafficking inhibitor BA (Figure 7). These findings point not only to a wide range of activities elicited by Exin21/Qα addition, but also to potentially important and diverse applications in biotechnology areas such as production of vaccines, mAbs, and other biopharmaceuticals where mammalian cell expression systems are needed.
The N-terminal signal (secretion) peptide (SP) on the naturally secretable gdLuc reporter is well known to contribute to its classical secretion in most cases. However, an internal SP can also mediate gdLuc secretion. 53 This supports our observation that gdLuc with an internal SP can be detected in culture media by gdLuc assay and western blotting for all the tested proteins, including both secretory proteins and membrane/non-membrane proteins. Our western blotting with anti-FLAG antibodies confirmed the absence of auto-cleaving activity of the Exin21/Qα insertion (Figures 7A-7C). We found that Exin21/Qα addition robustly boosted the regulated secretion of secretory proteins such as S protein, mAbs, IFN-γ, and IL-2, but not via any SP-like intracellular targeting mechanism, as it did not induce the release of non-secretory proteins such as fLuc (Figure 7H) and spCas9 (Figure 4G). This property could prove invaluable for industrial production of such secreted proteins. For example, Exin21/Qα addition could presumably enhance the production/secretion of S protein in the human body from mRNA-based vaccines against SARS-CoV-2 variants and, owing to the higher levels of S protein released, 27 reduce the amount of mRNA needed per vaccination while still providing the same level of host immune response. This could expedite vaccine availability for new variants while reducing potential toxicity and production cost.
The ability to boost production yields of viruses or pseudotyped viruses can be invaluable to the fields of gene therapy and biomedical research. The use of pseudotyped viruses has facilitated research on high-risk viruses that require BSL3 facilities. Pseudoviruses of SARS-CoV-2 S protein and its variants have been used extensively in the evaluation of neutralizing antibodies and vaccinations, as well as in mechanistic and functional studies. 17,18,22,24,54 The bottleneck for generation of S pseudovirions has been the limited packaging efficiencies for LVLP or VSVLP. 17,18,20,24 Our new approach of adding Exin21/Qα in the Sd18 expression system boosted the packaging and transduction efficiencies of SARS-CoV-2 S-LVLP. This strategy has facilitated our ongoing research on the antiviral effect of EGCG and the protective efficiency of serum from vaccinated patients against continuously emerging SARS-CoV-2 variants. 55-57 A challenge in viral gene therapy is the limited efficiency of viral packaging. 25,26 Using an LV system as a test platform, we found that Exin21/Qα addition to an LV transfer vector did not increase the packaging efficiency because the mRNA levels of the transgene were unexpectedly decreased during LV packaging (Figure 4I), but it still boosted the production of transgene protein in transduced cells (Figures 4F and 4G) or transfected cells (Figures 2D and 2E). It would be interesting to explore how LV packaging proteins such as Gag and Pol could affect the synthesis and stability of the Exin21-containing mRNA. Using the packaging vector psPAX2, Exin21/Qα addition at the C termini of Pol and RRE increased LV packaging efficiency, but addition at the Gag C terminus impaired LV packaging. Thus, optimizing Exin21/Qα locations within the LV packaging vector will be essential in applications to maximize Exin21/Qα boosting efficiency.
Because Exin21/Qα addition boosted both Sd18 expression and the packaging efficiency of S-LVLP, Exin21/Qα addition in VSV-G protein may boost regular LV packaging efficiency. Addition of Exin21/Qα at different locations of VSV-G 58,59 may thus maximize its production-boosting efficiency. Likewise, optimizing Exin21/Qα boosting activity on AAV or other viral packaging systems may prove very valuable in biopharmaceutical applications.
Many varieties of epitope tags, including FLAG, Myc, HA, Ollas, V5, His, C7, and T7, developed earlier have enabled specific research and biotechnological applications such as protein labeling, tracing, immunoaffinity purification, immunostaining, enhancing immunodetection, 60-67 slowing protein degradation, and conferring solubility. [68][69][70][71] Until now, however, no epitope tag has been discovered that can stimulate protein expression and secretion. We performed a series of mutation analyses, including alanine scanning, deletion, and synonymous and nonsynonymous mutations, and showed that the unique 21-mer motif Exin21, with its specific order/number of nucleotides, is essential for the boosting activity, which requires that the motif share an ORF with the targeted gene. Thus, the encoded unique heptapeptide Qα can serve as a novel epitope tag that shares features with well-established epitope tags for the general applications listed above. Importantly, Exin21/Qα addition can enhance the intensity of endogenous protein labeling owing to its boosting capacity and thus improve detection sensitivity in applications such as neural network tracing. 60 A broad area of importance, yet to be explored, is the potential of Exin21/Qα addition to enhance the expression of targeted, highly specific bioengineered proteins in vivo, such as via CRISPR-Cas gene knockin strategies that could facilitate expression of loss-of-function genes. Such applications would be valuable in treating disorders such as haploinsufficient mutagenic diseases, including Angelman syndrome, Pitt-Hopkins syndrome, and others. In genetic engineering, Exin21/Qα boosting of dominant genes may improve organism phenotypes, such as in agricultural applications. Of course, any potential toxicities or off-target effects of such in vivo expression of Qα-tagged proteins are as yet unknown and untested. Nevertheless, based on prior findings with the well-tested epitope tags both in vitro and in vivo, we do not anticipate any propensity for toxicity of the small 7-aa Qα tag.
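The relationship between the 21-nucleotide motif and its encoded heptapeptide, and the distinction between synonymous and nonsynonymous mutations, can be illustrated with a minimal Python sketch. It uses the Exin21 sequence reported in this paper and the standard genetic code; the two variant sequences are hypothetical examples chosen for illustration and are not the mutants actually tested in this study.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

EXIN21 = "CAACCGCGGTTCGCGGCCGCT"  # motif sequence as reported in the paper


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3 to stay in frame")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def classify_variant(variant: str, reference: str = EXIN21) -> str:
    """Label a same-length variant relative to the reference motif."""
    if len(variant) != len(reference):
        raise ValueError("variant must have the same length as the reference")
    if variant.upper() == reference.upper():
        return "identical"
    if translate(variant) == translate(reference):
        return "synonymous"
    return "nonsynonymous"


if __name__ == "__main__":
    print(translate(EXIN21))
    # Hypothetical variants, for illustration only:
    print(classify_variant("CAGCCGCGGTTCGCGGCCGCT"))  # first codon CAA -> CAG
    print(classify_variant("GCACCGCGGTTCGCGGCCGCT"))  # first codon CAA -> GCA
```

Because the length check rejects any sequence that is not a multiple of three, the sketch also reflects the requirement that the motif remain in frame with the target coding sequence.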
The mechanisms by which Exin21/Qα boosts protein expression/secretion remain largely to be delineated. However, our initial findings indicated that the presence of Exin21/Qα significantly increased the level of the targeted mRNA through a combination of increased mRNA synthesis and slowed mRNA decay. The boosting of nascent mRNA synthesis was validated by Click-iT and SLAM-seq technologies. However, post-transcriptional mechanisms (i.e., mRNA stabilization) may play a predominant role in Exin21/Qα boosting because: (1) direct delivery of modified or non-modified mRNA to cells remarkably boosted the production of targeted proteins; (2) the boosting effects persisted during global transcription inhibition by actinomycin D; (3) during the chase phase, EU- or s4U-incorporated mRNA decayed more slowly in the Exin21/Qα group; (4) the boosting action requires in-frame insertion within the coding region of targeted genes; and (5) Exin21/Qα facilitated the secretion of the targeted proteins. This unique cis-regulatory Exin21/Qα supports the previous proof of concept that the coding sequence harbors numerous regulatory sites that may regulate mRNA location, stability, and translation efficiency. 72 However, how Exin21 affects mRNA stability as well as translation/secretion and protein stabilization/degradation remains largely unknown. It would be interesting to determine whether the Exin21/Qα cis-regulatory motif has a special secondary RNA structure that can recruit RNA-binding proteins, 72 directly regulate the mRNA stability of targeted proteins, 73 or bind directly to poly(A) or UTR to exert its stabilizing effects upon mRNA and boosting of translation.
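To make the notion of slower mRNA decay during a chase phase concrete, the following Python sketch converts the fraction of labeled mRNA remaining after a chase interval into a decay rate constant and a half-life. It assumes simple first-order (single-exponential) decay, which is a common modeling convention and not something established for Exin21 in this paper; the function names and any numbers passed to them are hypothetical.

```python
import math


def decay_rate(fraction_remaining: float, elapsed_h: float) -> float:
    """First-order decay constant k (per hour) from one chase time point.

    Assumes N(t) = N0 * exp(-k * t), so k = -ln(N(t) / N0) / t.
    """
    if not 0.0 < fraction_remaining < 1.0:
        raise ValueError("fraction_remaining must be strictly between 0 and 1")
    if elapsed_h <= 0:
        raise ValueError("elapsed_h must be positive")
    return -math.log(fraction_remaining) / elapsed_h


def half_life(k: float) -> float:
    """Half-life (hours) corresponding to a first-order decay constant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return math.log(2) / k


if __name__ == "__main__":
    # Hypothetical fractions of labeled mRNA remaining after a 4 h chase.
    for label, remaining in (("control", 0.25), ("with motif", 0.50)):
        k = decay_rate(remaining, elapsed_h=4.0)
        print(f"{label}: k = {k:.3f} per h, half-life = {half_life(k):.2f} h")
```

Under this model, a larger fraction remaining at the same chase time corresponds to a smaller rate constant and a longer half-life, which is the sense in which one transcript "decays more slowly" than another.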
Because BA, an inhibitor of the conventional ER-Golgi secretion pathway, blocked Exin21/Qα-stimulated protein secretion, we speculate that Exin21/Qα may regulate retrograde or anterograde protein trafficking within the ER-Golgi network [74][75][76][77] and facilitate ER-targeted mRNA translation and protein secretion in a manner similar to SECReTE. 39 Other secretion inhibitors might be useful in identifying additional pathways involved in Exin21/Qα-modulated protein secretion, particularly that of non-conventionally secreted proteins (e.g., cytokines such as IL-1). 78,79 It would also be interesting to determine whether Exin21/Qα can regulate protein stabilization as a novel stabilon 48,49 or degradation weak initiator. 50
In summary, we discovered a novel, small (21-mer), and unique cis-regulatory motif, Exin21/Qα, that can greatly enhance the production of a variety of different types of proteins, ranging from viral transcripts/proteins, endogenous gene products, vaccines, and antibodies to engineered recombinant proteins in mammalian cells. Exin21/Qα has a universal protein production-boosting capacity that should facilitate versatile applications in biomedical research and the biotechnology industry. This revolutionary discovery will also open a new research avenue in the field of RNA biology and
The pCAG vectors encoding E protein were generated by replacing the CMV promoter in the corresponding pcDNA6B-SARS-CoV-2-E-Flag-LG or -QLG vectors with the CAG promoter via SnaBI/KpnI sites.

Mutation vectors
Site-directed or deletion mutagenesis of Exin21/Qα was performed using pcDNA6B-SARS-CoV-2-E-Flag-QLG (TP1479) as a template. Mutagenic primers were designed to change or delete specific nucleotides in the Exin21 sequence. For each mutation, a Phusion High-Fidelity PCR reaction was performed using a universal primer (T1640) matching a region upstream of SARS-CoV-2 E and a mutagenic primer matching the Exin21 sequence except at the site where the desired mutation was introduced. The PCR products carrying Exin21 mutations were gel purified and cloned into EcoRV/NotI-digested 6B-E-QLG DNA using NEB-HiFi.

Luciferase and SEAP assays
For the gdLuc assay, the Coelenterazine (CTZ) substrate (Nanolight Technology, Norman, OK, cat. no. 3032) was dissolved in 10 mL ultra-sterile distilled water to make stock solutions and kept at −20°C until use.

VSV-G or S protein-pseudotyped lentivirus packaging and titration
Recombinant lentivirus carrying the indicated LV vector was produced at small scale using a second-generation LV packaging system according to standard protocols. In brief, HEK293T cells in one well of a six-well plate were cotransfected using the TP5 kit with the indicated transfer LV vector (1.4 μg), the packaging vector psPAX2 or its mutants (1 μg), and the VSV-G or Sd18 vector (0.4 μg). At 2-3 days post-transfection, supernatants containing LV were concentrated and purified with a simplified 10% sucrose purification as described previously. 82 The functional titers of the crude and purified lentivirus were determined by counting GFP-expressing HEK293T cells under fluorescence microscopy at 48 h after infection with serial dilutions of lentivirus. In some cases, flow cytometry analysis was used for LV titration.
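Functional titers obtained by counting GFP-expressing cells after infection with serial dilutions are conventionally reported as transducing units (TU) per mL. The Python sketch below shows the arithmetic commonly used for both readouts mentioned above (direct counting and flow cytometry). The formulas are the generic textbook versions, not necessarily the exact calculation used in this study, and all function names, parameters, and example values are hypothetical.

```python
def titer_from_counts(gfp_positive_cells: int,
                      dilution_factor: float,
                      inoculum_volume_ml: float) -> float:
    """Functional titer (TU/mL) from a direct count of GFP-positive cells.

    dilution_factor is the fold dilution of the virus stock (e.g. 1000 for 1:1000);
    inoculum_volume_ml is the volume of diluted virus added to the well.
    """
    if inoculum_volume_ml <= 0 or dilution_factor <= 0:
        raise ValueError("volume and dilution factor must be positive")
    return gfp_positive_cells * dilution_factor / inoculum_volume_ml


def titer_from_flow(cells_at_transduction: int,
                    fraction_gfp_positive: float,
                    dilution_factor: float,
                    inoculum_volume_ml: float) -> float:
    """Functional titer (TU/mL) from the GFP-positive fraction measured by flow.

    This estimate is usually restricted to dilutions giving a low positive
    fraction, so that most transduced cells carry a single integration.
    """
    if not 0.0 <= fraction_gfp_positive <= 1.0:
        raise ValueError("fraction_gfp_positive must be between 0 and 1")
    transduced = cells_at_transduction * fraction_gfp_positive
    return titer_from_counts(transduced, dilution_factor, inoculum_volume_ml)


if __name__ == "__main__":
    # Hypothetical example: 50 GFP-positive cells, 1:1000 dilution, 0.1 mL inoculum.
    print(f"{titer_from_counts(50, 1000, 0.1):.2e} TU/mL")
    # Hypothetical example: 1e5 cells, 10% GFP-positive, 1:100 dilution, 0.5 mL.
    print(f"{titer_from_flow(100_000, 0.10, 100, 0.5):.2e} TU/mL")
```

Comparing titers computed this way between a control vector and its Exin21/Qα counterpart, at matched dilutions, gives the fold change in packaging yield.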

Western blot analysis
SDS-polyacrylamide gels (10%-12%) were either home-made or purchased as mini-PROTEAN TGX gels (cat. no. 4561093, 4561096) from Bio-Rad. Cell lysates were prepared using either a lysis buffer composed of 50 mM Tris-HCl (pH 7.0), 150 mM NaCl, 5 mM EDTA, and 1% Triton X-100 supplemented with PMSF (100×), aprotinin, and leupeptin (200×), or Universal Lysis Buffer (Nanolight Technology, cat. no. 333). Lysates (50 μL) were prepared from each well after collecting the supernatant. The lysates were incubated at 4°C for 20-30 min and centrifuged at maximum speed in an Eppendorf centrifuge. The cleared lysates were either denatured immediately for 5 min at 98°C in 1× SDS-PAGE loading dye or stored at −80°C until use. Supernatants were stored at 4°C until they were treated with 1× SDS-PAGE loading dye. Denatured 10-20 μL aliquots of cell lysates or 20-30 μL of supernatants were loaded onto SDS-PAGE gels in Tris-glycine/SDS buffers under denaturing and reducing conditions.
Polyacrylamide gels were transferred to 0.2 μm nitrocellulose membranes (Bio-Rad supported nitrocellulose [NC] membrane, cat. no. 162-0097) using either wet transfer or the iBlot2 Dry Blotting System with iBlot2 NC mini (IB23002) or regular stacks (IB23001). For wet transfer, the following transfer buffer was used: 25 mM Tris-HCl (pH 7.6), 192 mM glycine, 20% methanol. The gels were sandwiched together with NC membranes and transfers were performed in 1× transfer buffer at 250 mA at 4°C for 1-2 h.
Dry western blot transfers were performed in an iBlot2 Dry Blotting System (Invitrogen, Thermo Fisher Scientific, Waltham, MA, IB21001) using mini or regular iBlot2 stacks for 7 min according to the manufacturer's guidelines.
After the transfer, the membranes were blocked in 1× TBST buffer containing 5% milk. The membranes were then treated with primary antibodies (1/500-1/2,000 dilutions) overnight at 4°C or for 2 h at room temperature. The membranes were washed three times with 1× TBST buffer per minute each, followed by incubation with secondary antibodies. The infrared-tagged secondary antibodies were diluted 1/10,000-1/20,000 and incubated with the NC membranes for 45 min to 1 h. At the end of incubation, the membranes were washed with 1× TBST buffer three times, 5 min each, and scanned on a LI-COR Odyssey CLx Imaging System. The images were analyzed by densitometric measurement with NIH ImageJ (v.1.53). The data are expressed as integrated density times area and presented as fold change relative to the corresponding control. Membrane staining was performed using a MemCode Reversible Protein Stain kit (Thermo Fisher Scientific, cat. no. 24580).

Antibody detection with ELISA
HEK293T cells were co-transfected with Qα-tagged HQ (TP1574) and LQ (TP1571) at 50 ng/well in 96-well plates in quadruplicate, with or without the normalization vector pGL4.16-CMV (TP329), which was derived from the promoter-less vector pGL4.16 (Promega, cat. no. E6711), or pRRL-E-Flag-LG (TP1578) at 20 ng/well. The original antibody plasmids pFUSEss-CHIg-hG1-SARS-CoV-2-mAb (TP1565) and pFUSE2ss-CLIg-hk-SARS-CoV-2-mAb (TP1566) were used as the control. ELISA was performed using the Human IgG (Total) Uncoated ELISA Kit (Thermo Fisher Scientific, cat. no. 88-50550-88). A 96-well Costar ELISA plate (Corning) was first coated with SARS-CoV-2 Spike (S) protein from BEI (cat. no. NR52724) at 100 μg/well overnight at 4°C. The washing and blocking steps were performed using the buffers and solutions provided in the kit. Supernatants containing secreted antibodies were collected from the transfections at 24 and 48 h and kept at 4°C until use. Aliquots of 0.5, 2.5, and 5.0 μL of antibody supernatant were added to the SARS-CoV-2-S-coated wells. After overnight incubation, the wells were washed (400 μL per well) four times. Horseradish peroxidase-conjugated anti-human IgG detection mAb in assay buffer (1/250) was added to each well and incubated at room temperature for 2-3 h. The wells were then washed three times (400 μL each) and treated with 300 μL of the substrate TMB (3,3′,5,5′-tetramethylbenzidine) for 15 min to develop a blue color, and the reactions were terminated with 2 N HCl. The resulting yellow color was measured at 450 nm using a BioTek Synergy LX multiplate reader. The level of anti-SARS-CoV mAb was quantified using a sigmoidal four-parameter logistic curve fit in GraphPad Prism 9.1.
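For readers unfamiliar with four-parameter logistic (4PL) quantification, the Python sketch below shows one common parameterization of the curve and its analytical inverse, which is what converts a measured absorbance back into a concentration once the standard curve has been fitted. The parameter names and values are hypothetical placeholders; the fitting itself was done in Prism in this study and is not reproduced here, and Prism's own parameterization may differ in form.

```python
def four_pl(x: float, bottom: float, top: float, ec50: float, hill: float) -> float:
    """Four-parameter logistic response at concentration x (x >= 0).

    Uses the form y = top + (bottom - top) / (1 + (x / ec50) ** hill),
    so y equals bottom at x = 0 and approaches top at high x (for hill > 0).
    """
    if x < 0 or ec50 <= 0:
        raise ValueError("x must be non-negative and ec50 positive")
    return top + (bottom - top) / (1.0 + (x / ec50) ** hill)


def inverse_four_pl(y: float, bottom: float, top: float, ec50: float, hill: float) -> float:
    """Concentration that gives response y under the same 4PL model."""
    low, high = sorted((bottom, top))
    if not low < y < high:
        raise ValueError("y must lie strictly between the curve asymptotes")
    ratio = (bottom - top) / (y - top) - 1.0
    return ec50 * ratio ** (1.0 / hill)


if __name__ == "__main__":
    # Hypothetical fitted parameters for an absorbance (450 nm) standard curve.
    params = dict(bottom=0.05, top=3.0, ec50=100.0, hill=1.2)
    absorbance = four_pl(100.0, **params)
    print(f"A450 at x = ec50: {absorbance:.3f}")
    print(f"back-calculated concentration: {inverse_four_pl(absorbance, **params):.1f}")
```

Readings close to either asymptote are rejected by the inverse function because the curve is nearly flat there and back-calculated concentrations become unreliable.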

ER-Golgi transport inhibition with BA
BA (AdipoGen Life Sciences, San Diego, CA, cat. no. AG-CN2-0018) was dissolved in DMSO to make a 1 mg/mL stock solution. HEK293T cells were transfected with the indicated vectors using TP5 transfection reagent in DMEM plus 10% FBS as described above. The transfected cells were incubated overnight, and 10 μg/mL BA was added prior to medium change and incubated for 3 h at 37°C in 5% CO2. The culture medium was then replaced with FreeStyle 293 serum-free medium (Thermo Fisher Scientific, cat. no. 12-338-018) containing 10 μg/mL BA, and incubation was continued for 24 h at 37°C in 5% CO2. The supernatants were withdrawn right after medium replacement and collected after 24 h. The cell lysates were also prepared at the 24 h time point. The supernatants and cell lysates were tested for gdLuc activity and analyzed by western blot.

Quantification and statistical analysis
Quantification of fold changes in Qα groups compared with the corresponding non-Qα groups was performed using Excel software. Statistical analysis was performed using GraphPad Prism 9.1. Significance at *p < 0.05, **p < 0.01, and ***p < 0.001 was determined using a two-tailed Student's t test between two groups or by one-way ANOVA for multiple comparisons. Data are presented as mean ± SE. The size and type of individual samples are indicated and specified in the figure legends.
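The quantities named above (fold change, mean ± SE, and the Student's t statistic) can be sketched in a few lines of standard-library Python. This is an illustrative sketch only: the analyses in this study were run in Excel and Prism, the replicate values below are hypothetical, and the sketch stops at the t statistic and degrees of freedom, since obtaining the two-tailed p value requires the t distribution from a statistics package.

```python
import math
import statistics


def mean_se(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    if len(values) < 2:
        raise ValueError("at least two replicates are needed for a standard error")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def fold_change(treated, control):
    """Ratio of the treated-group mean to the control-group mean."""
    return statistics.mean(treated) / statistics.mean(control)


def student_t(group_a, group_b):
    """Student's t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled_var = (
        (na - 1) * statistics.variance(group_a)
        + (nb - 1) * statistics.variance(group_b)
    ) / df
    se_diff = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / se_diff, df


if __name__ == "__main__":
    # Hypothetical reporter readings (arbitrary units), three replicates each.
    control = [1.0, 2.0, 3.0]
    tagged = [4.0, 5.0, 6.0]
    m, se = mean_se(tagged)
    t, df = student_t(tagged, control)
    print(f"tagged: {m:.2f} ± {se:.2f} (mean ± SE)")
    print(f"fold change: {fold_change(tagged, control):.2f}")
    print(f"t = {t:.3f}, df = {df}")
```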