Challenging Post-translational Modifications in the Cell-free Protein Synthesis System

Abstract: Post-translational modifications (PTMs) are a cornerstone of proteome complexity and contribute substantially to the diversity of protein structure and function. PTMs can considerably influence protein function, stability, localization, and interactions with other molecules. When choosing a protein expression system, it is therefore important to ensure that PTMs are incorporated precisely during protein synthesis, because this is essential for producing biologically active proteins. The cell-free protein synthesis (CFPS) system has emerged as a powerful protein synthesis platform and research toolkit in synthetic biology. The open nature of the system allows the reaction environment to be tailored to a protein of interest to promote specific PTMs, thus allowing the production of proteins with desired modifications. This review presents various PTMs achieved in CFPS systems, providing insights into current challenges, successes, and future prospects.


Introduction
A post-translational modification (PTM) is a biochemical event that modulates the attributes of a protein during or following ribosomal synthesis. Generally, these modifications comprise reversible chemical modifications (phosphorylation, acetylation, methylation, and redox-based modifications, including S-nitrosylation, S-sulfenation, S-sulfination, and thiolation), enzymatic modifications (ubiquitination, ubiquitin-like modification, and SUMOylation), the attachment of complex molecules (glycosylation, ADP-ribosylation, AMPylation, and lipid attachment such as acylation and prenylation), and irreversible alterations such as deamidation, eliminylation, deimination, and proteolytic cleavage (Figure 1) [1]. The most frequently modified residues are serine, threonine, tyrosine, aspartate, asparagine, lysine, arginine, and cysteine, whose side chains carry reactive functional groups such as hydroxy, amino, or thiol groups [1,2]. A PTM can alter the chemical and biological attributes of the modified amino acid residues and/or neighboring polypeptide regions, leading to changes in the protein's conformation, net charge, binding properties, and, ultimately, its function. Consequently, PTMs play a crucial role in a myriad of biological processes, encompassing protein folding and stability, enzymatic activity, protein-protein interactions, cell signaling, and gene regulation [2].
Proteins in eukaryotes, including plants and animals, undergo more complex PTMs than those in bacteria, which perform only a limited set of PTMs such as phosphorylation, acetylation, methylation, and thiolation [3]. Irregular PTMs leading to protein dysfunction have been implicated in a wide range of diseases, including cancers and neurodegenerative conditions such as Alzheimer's [4] and Parkinson's [5] diseases. Despite the strong correlation between abnormal PTMs and disease, PTMs remain challenging to study because of their inherent complexity and diversity, their dependence on environmental conditions, and the lack of practical tools for exploring them.
Hence, there is a growing demand for a versatile research platform to explore these intricate PTMs. The cell-free protein synthesis (CFPS) system has enabled noteworthy progress in examining crucial PTMs. In this review, we revisit PTM investigations conducted using the CFPS system, aiming to outline potential roadmaps for future studies and applications of PTMs in synthetic biology.

Cell-free Protein Synthesis
More than fifty years ago, Nirenberg and Matthaei introduced the CFPS system for the first time [6], establishing a transformative platform that continues to advance our understanding of both fundamental and applied biology [7]. Recent advances in synthetic biology further highlight CFPS as a rapid, high-yielding, and cost-effective platform for producing various proteinaceous and non-proteinaceous bio-based products, which makes the system a compelling alternative to conventional cell-based biomanufacturing (Figure 2) [8,9]. The open environment of the CFPS system is a major driving force, because it allows supplemental materials to be added directly and the pre-existing cellular networks of the cell extract to be controlled. This versatility positions CFPS as an optimal platform for producing complex and hard-to-synthesize proteins originating from higher organisms and functional proteins with precise PTMs [9]. In addition, the system has shown promise in the development of synthetic genetic circuits [10], unnatural amino acid incorporation [11], and therapeutic protein production and prototyping [8,12].
Figure 2. Cell-free protein synthesis (CFPS) system for in vitro biomanufacturing. CFPS requires various components to operate transcription and translation, such as an energy (ATP) regeneration system, chemical substrates and salts, and the cell extract, combined in a microtube.

Human Proteins in CFPS Biomanufacturing
The biopharmaceutical industry is predominantly focused on the development of protein therapeutics, such as monoclonal antibodies, peptides, and recombinant proteins, which represent the most proliferative category of emerging products [13,14]. Recently, a variety of notable recombinant human proteins have been produced in different types of CFPS systems. Sullivan et al. successfully produced both recombinant human erythropoietin (rhEPO) and recombinant human granulocyte-macrophage colony-stimulating factor (rhGM-CSF) in bacteria- and yeast-based cell-free systems [15]. This achievement is significant because both proteins are FDA-approved, which underscores the potential of CFPS to expedite cost-effective production of protein pharmaceuticals. In 2017, recombinant streptokinase, a therapeutic protein crucial in dissolving blood clots, was successfully produced in a CFPS system derived from Chinese hamster ovary (CHO) cells [16]. In addition, the recombinant skin therapeutic protein monomeric filaggrin was synthesized by Kim et al. in the Escherichia coli-based CFPS system, representing another human protein successfully biomanufactured using CFPS [17]. Despite multiple achievements in the production of human proteins via bacterial CFPS systems, these systems frequently suffer from incorrect protein folding and insufficient PTM incorporation. To counter these challenges, researchers have moved to more sophisticated CFPS systems based on eukaryotic sources such as yeast, insect, CHO, and human (HeLa) cells, which facilitate correct protein folding and PTMs [18].

Disulfide Bond Formation
Disulfide bonds are a type of PTM that consists of covalent linkages formed by the oxidation of two cysteine residues on the polypeptide chain [19]. Disulfide bonds are crucial for proper protein folding and strongly influence the structure and functionality of proteins. Defects in disulfide bond formation can lead to misfolded proteins, which can have substantial consequences, including disease development. Disulfide bonds are present in a wide range of proteins from various life forms, including hormones, transport proteins, structural proteins, enzymes, and various classes of antibodies. The complexity of disulfide bonds can be described in terms of their patterns, which refer to the specific arrangement of disulfide bonds within a protein. These patterns vary between proteins, families, and organisms, with some proteins containing only a few disulfide bonds and others containing many. Historically, producing correctly disulfide-bonded proteins has been challenging because of the reducing conditions inside cells. However, recent advances in CFPS systems have provided technological advantages for producing proteins with disulfide bonds, owing to the system's controlled and simplified environment. For example, in a CFPS system, the redox conditions can be precisely tuned to favor the formation of disulfide bonds. Moreover, components such as chaperones and disulfide isomerases, which assist in folding proteins and forming disulfide bonds, can be added in defined quantities. This level of control over the protein synthesis environment allows researchers to experiment with different conditions and components to optimize the production of proteins with specific disulfide bond patterns. Additionally, high-throughput approaches using CFPS systems can be useful for generating large libraries of proteins with varying disulfide bond patterns.

Biomanufacturing Monoclonal IgG Antibodies
In 2019, Murakami et al. utilized the PURE (protein synthesis using recombinant elements) CFPS system to produce disulfide-containing monoclonal IgG antibodies by optimizing the redox conditions and using specific disulfide catalysts and chaperones [20]. The PURE system is a reconstituted CFPS system based on the protein synthesis machinery of E. coli. Unlike cell lysate-based systems, the PURE system contains only purified factors involved in transcription, translation, and energy regeneration [21]. This study revealed that the redox environment could be optimized by adjusting the ratio of reduced to oxidized glutathione (GSH/GSSG) supplemented to the system. Additionally, it was found that the E. coli-derived disulfide bond isomerase DsbC, together with the molecular chaperone DnaK, was sufficient for functional IgG synthesis. Under optimal conditions, peak production of the anti-HER2 antibody trastuzumab reached 124 µg/mL, demonstrating the PURE system's potential for rapid and cost-effective production of therapeutic proteins. Moreover, the study's findings highlight the flexibility of the CFPS approach. The ability to precisely control the reaction conditions enabled the researchers to optimize the environment for efficient disulfide bond formation, a critical factor for the correct folding and function of many proteins.
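
To make the idea of tuning the redox environment through the GSH/GSSG ratio more concrete, the following is a minimal sketch that estimates the half-cell potential of the glutathione couple from the Nernst equation. It is not taken from the study above: the midpoint potential of -240 mV (pH 7.0, 25 °C) is a commonly cited value that is treated here as an assumption, and the function names and the scanned concentrations are hypothetical.

```python
import math

R = 8.314462618     # gas constant, J/(mol*K)
F = 96485.33212     # Faraday constant, C/mol
E0_GSH_MV = -240.0  # ASSUMED midpoint potential of GSSG/2GSH at pH 7.0 and 25 C, in mV


def glutathione_potential_mv(gsh_mM, gssg_mM, temp_c=25.0):
    """Nernst half-cell potential (mV) for GSSG + 2 H+ + 2 e- -> 2 GSH."""
    if gsh_mM <= 0 or gssg_mM <= 0:
        raise ValueError("concentrations must be positive")
    gsh_M = gsh_mM / 1000.0
    gssg_M = gssg_mM / 1000.0
    temp_k = temp_c + 273.15
    # Two electrons are transferred, hence the factor 2 in the denominator.
    slope_mv = 1000.0 * R * temp_k / (2 * F)
    # GSH enters squared because two GSH molecules form one GSSG.
    return E0_GSH_MV - slope_mv * math.log(gsh_M ** 2 / gssg_M)


def scan_gssg_fraction(total_mM, fractions=(0.1, 0.25, 0.5, 0.75, 0.9)):
    """Potential for several hypothetical GSH/GSSG mixes at a fixed summed concentration."""
    rows = []
    for frac in fractions:
        gssg = total_mM * frac
        gsh = total_mM - gssg
        rows.append((gsh, gssg, glutathione_potential_mv(gsh, gssg)))
    return rows


for gsh, gssg, potential in scan_gssg_fraction(5.0):
    print(f"GSH {gsh:.2f} mM, GSSG {gssg:.2f} mM -> {potential:.1f} mV")
```

Because GSH enters the expression squared, the potential depends on the absolute concentrations as well as on the ratio, so a sketch like this is only a rough guide for choosing which mixtures to test experimentally.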

Hydrophobins Production
Hydrophobins, a class of small, surface-active proteins produced by fungi, have unique biochemical properties [22]. They have a strong affinity for hydrophilic-hydrophobic interfaces, which allows them to self-assemble into a coating that can change the behavior of a surface. This makes them potentially useful for a range of biotechnological applications, such as implant coatings and drug delivery systems. However, the production of hydrophobins has been challenging due to their complex structure, which includes multiple disulfide bonds. In 2020, Siddiquee et al. produced six different natively folded hydrophobins in a CFPS system by adjusting the reaction conditions [22]. This study showcased the functional expression of proteins with multiple disulfide bonds.

Recombinant Tumor Necrosis Factor-α
Tumor necrosis factor-α (TNF-α) plays a crucial role in inflammatory responses, immune cell signaling, and the apoptosis of cancer cells [23]. In 2022, a team utilized CFPS to produce human TNF-α, which carries a single disulfide bond between cysteine residues 69 and 101 [23]. The team first improved the yield of soluble TNF-α through codon optimization and then optimized the system using response surface methodology, a statistical approach that analyzes the individual and interaction effects of cell-free parameters simultaneously. They then tested the cytotoxicity of the recombinant TNF-α against three human cancer cell lines (Caco-2, HepG-2, and MCF-7) as well as normal human cells. By optimizing the production process and confirming the protein's therapeutic effects, this research highlighted the use of the CFPS system for the functional production of a therapeutic protein.
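
As an illustration of what response surface methodology does, the sketch below fits a second-order model with an interaction term to two coded factors and locates the stationary point of the fitted surface. It is a generic example rather than the analysis from the study: the two factors, the 3 × 3 factorial design, the coefficients in `true_beta`, and the noise-free synthetic yields are all hypothetical, and the fit uses plain normal equations instead of a statistics package.

```python
from itertools import product


def design_row(x1, x2):
    """Terms of a full second-order model in two coded factors."""
    return [1.0, x1, x2, x1 * x2, x1 * x1, x2 * x2]


def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; add more distinct runs")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_response_surface(points, yields):
    """Least-squares fit via the normal equations (X^T X) beta = X^T y."""
    x = [design_row(x1, x2) for x1, x2 in points]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(x, yields)) for i in range(k)]
    return solve_linear(xtx, xty)


def stationary_point(beta):
    """Coded settings where both partial derivatives of the fitted surface vanish."""
    _, b1, b2, b12, b11, b22 = beta
    # d/dx1: b1 + b12*x2 + 2*b11*x1 = 0 ; d/dx2: b2 + b12*x1 + 2*b22*x2 = 0
    return solve_linear([[2 * b11, b12], [b12, 2 * b22]], [-b1, -b2])


# Hypothetical 3 x 3 factorial design in coded units (-1, 0, +1).
points = list(product((-1.0, 0.0, 1.0), repeat=2))
# Hypothetical "true" surface used only to generate synthetic, noise-free yields.
true_beta = [100.0, 8.0, -4.0, 3.0, -10.0, -6.0]
yields = [sum(b * t for b, t in zip(true_beta, design_row(x1, x2)))
          for x1, x2 in points]

beta = fit_response_surface(points, yields)
print("fitted coefficients:", [round(b, 3) for b in beta])
print("stationary point (coded units):", stationary_point(beta))
```

The coefficient on the `x1 * x2` term is what captures the interaction effect; a one-factor-at-a-time screen would not estimate it.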

Prototyping Disulfide Bond-containing Proteins
Dopp and Reuel reported a simple, functional, and cost-effective cell extract derived from the commercially available E. coli strain SHuffle T7 Express lysY, which expresses both T7 RNA polymerase and DsbC, for in vitro prototyping of proteins with disulfide bonds [24]. This strain has the benefit of requiring only the optimization of IPTG induction and harvest times during cell growth for extract preparation. To determine these parameters experimentally, they used Gaussia luciferase (GLuc), which carries five disulfide bonds and emits a strong luminescent signal, making it an ideal reporter for extract optimization. To demonstrate the versatility and rapid prototyping capability of the SHuffle extract, Dopp and Reuel screened the activity of four luciferase candidates against ten luciferin analogues. Each of these luciferases contains multiple disulfide bonds and acts on the substrate coelenterazine in bioluminescent activity assays. Beyond this, they showcased the broad applicability of the cell extract by producing the enzymes hevamine, endochitinase A (ChitA), and periplasmic AppA, each containing three disulfide bonds. The activity of hevamine and ChitA was significantly improved in the developed system, with 3.4-fold and 2.4-fold increases, respectively.

Virus-like Particles
Bundy and Swartz demonstrated the E. coli-based CFPS system as an alternative platform for virus-like particle (VLP) production that overcomes some of the limitations commonly encountered in traditional VLP production [25], and further investigated intermolecular disulfide bond formation [26], in 2007 and 2011, respectively. In the first study, the PANOx-SP CFPS system enabled the high-yield synthesis and assembly of MS2 bacteriophage coat protein VLP (MS2 VLP, 479 µg/mL) and a C-terminally truncated hepatitis B core protein VLP (HBc VLP, 445 µg/mL) in 30 µL reactions, and was successfully scaled up to 1 mL, yielding 525 µg/mL of MS2 VLP and 436 µg/mL of HBc VLP. Assembly efficiency exceeded 80% at both the small (30 µL) and large (1 mL) scales. However, they could not detect the intermolecular disulfide bond at C61 that dimerizes the HBc VLP. In the second study, Bundy and Swartz developed a CFPS system with optimized conditions for disulfide bond formation (SS-CFPS) by controlling the redox conditions, and achieved greatly improved disulfide bond formation for HBc VLP (~95%). The Qβ VLP effectively formed disulfide bonds after exposure to a reductase-free extracellular aerobic environment. When treated with hydrogen peroxide, which is highly oxidizing yet biologically incompatible, nearly 100% of the disulfide bonds formed in Qβ VLPs as well. Despite a 5-10% loss of these bonds when the oxidant was removed, disulfide bond formation was still significantly higher than in Qβ VLPs not exposed to an oxidizing agent.

Glycosylation
Glycosylation is one of the most common PTMs observed in eukaryotic organisms and plays an integral role in protein folding, movement, and signaling processes [27]. This PTM entails the conjugation of a complex carbohydrate group, known as a "glycan", to an amino acid residue within a protein [28]. Asparagine-linked (N-linked) glycosylation and O-linked glycosylation are both highly prevalent. N-linked glycosylation involves the attachment of a glycan to an asparagine residue, while O-linked glycosylation involves attachment to a serine, threonine, or hydroxylysine residue. The two types can also be distinguished by their slightly different sugar-peptide linkages. Although both are ubiquitously observed, N-glycosylation is comparatively more predominant. This PTM plays crucial roles in regulating protein folding and sorting prior to transport to the Golgi apparatus, thereby impacting immune responses, stem cell fate, and pharmacokinetics [29]. Studying N-glycosylation presents substantial challenges because of the heterogeneity of the resulting proteins and the reliance on mammalian cells, which are costly and give limited yields. However, this PTM has been successfully addressed in the CFPS system.

One-pot Glycoprotein Synthesis
In 2018, a collaboration between the DeLisa and Jewett groups synthesized a glycopeptide using a bacterial CFPS system [29]. The E. coli cell extract, enriched with a glycosyltransferase enzyme and a sugar substrate, facilitated the synthesis of a glycopeptide with a single sugar moiety attached. This "one-pot glycoprotein synthesis system" was a major stride in glycoengineering. The team used the gram-negative bacterium Campylobacter jejuni as a model system, which carries a bacterial protein glycosylation locus (PGL) that directs cells to execute N-glycosylation akin to eukaryotic cells. E. coli strain CLM24, previously optimized for N-glycosylation by inactivating O-glycosylation, was used to prepare the crude cell extract. Subsequently, the team tested whether glycans other than that of C. jejuni would be compatible with the system. They found that native and engineered Campylobacter lari glycans, native Wolinella succinogenes glycans, and many engineered E. coli and Klebsiella pneumoniae glycans, all with diverse structures, supported in vitro glycosylation successfully. Finally, the team developed a genuinely "one-pot" system by creating a cell extract from CLM24 cells overexpressing the oligosaccharide donors and enzymes utilized previously. This enabled complete glycosylation of the DQNAT motif in a single-chain Fv antibody (scFv13-R4 DQNAT) and superfolder green fluorescent protein (sfGFP 217-DQNAT) without adding supplements to the CFPS system. The "one-pot" system gave researchers great flexibility over the components used to create the most efficient system possible. Additionally, the system was ten times more cost-effective than the commercial glycoprotein synthesis system [29].
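
When designing acceptor proteins such as the DQNAT-tagged constructs above, it can help to scan a sequence for candidate N-glycosylation sites. The sketch below does this with regular expressions. The two consensus patterns are assumptions that are not stated in the text: N-X-S/T (X ≠ P) for eukaryotic-type sites and the extended D/E-X-N-X-S/T form generally reported for the C. jejuni machinery. The function name and test peptides are hypothetical, and a match only indicates a candidate site, not that glycosylation actually occurs.

```python
import re

# ASSUMED consensus sequons (not given in the text above):
#   eukaryotic-type: N-X-S/T with X != P
#   bacterial-type:  D/E-X1-N-X2-S/T with X1, X2 != P
# Each entry stores the pattern and the offset of the acceptor Asn in the motif.
# Lookaheads are used so that overlapping motifs are all reported.
SEQUONS = {
    "eukaryotic": (re.compile(r"(?=(N[^P][ST]))"), 0),
    "bacterial": (re.compile(r"(?=([DE][^P]N[^P][ST]))"), 2),
}


def find_sequons(sequence, kind):
    """Return (1-based position of the acceptor Asn, matched motif) pairs."""
    pattern, asn_offset = SEQUONS[kind]
    seq = sequence.upper()
    return [(m.start() + asn_offset + 1, m.group(1))
            for m in pattern.finditer(seq)]


# Hypothetical toy peptide carrying the DQNAT motif mentioned in the text.
peptide = "GGDQNATGG"
print(find_sequons(peptide, "bacterial"))
print(find_sequons(peptide, "eukaryotic"))
```

Keeping the patterns in a small table makes it straightforward to add further motifs if a different glycosylation enzyme with another specificity is used.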

Recombinant Immunotoxin Synthesis
In 2022, Krebs and colleagues worked to produce a recombinant immunotoxin (RIT) based on Pseudomonas exotoxin A, PE24 [30]. Because this potent RIT can induce cell death, the CFPS system provides a suitable production environment. However, producing this immunotoxin is challenging because the anti-CD7 antibody moiety of the RIT requires N-glycosylation. To address this, the team utilized CHO cell-based and E. coli-based CFPS systems. Using liquid scintillation counting, SDS-PAGE, autoradiography, ELISA, and the MTT (3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide) assay, the team confirmed the successful production of the RIT. They then observed that both the CHO and E. coli cell-free synthesized RITs were highly effective in killing cells expressing CD7 and, to a lesser extent, cells without CD7 expression. This work holds significant promise for the treatment of diseases associated with CD7 expression, including T-cell acute lymphoblastic leukemia, acute myeloid leukemia, and others.

GlycoCAP
Recently, the Jewett group successfully synthesized proteins with noncanonical glycan attachments using the E. coli-based CFPS system [31]. Lectins are proteins that bind to specific glycans and are recognized for their potential use in treating allergies and autoimmune disorders. However, it is difficult to create glycan-based drugs because scientists are unable to rapidly create the variety of structures needed to formulate them [31]. Currently, researchers use mammalian cell culture to create glycans, which leads to heterogeneous populations, although homogeneous ones would be preferred. To address this problem, the team developed the GlycoCAP system, which facilitated the installation of noncanonical glycans onto various proteins. They demonstrated the installation of four different glycans onto a dust mite allergen: α2,3 C5-azido-sialyllactose, α2,3 C9-azido-sialyllactose, α2,6 C5-azido-sialyllactose, and α2,6 C9-azido-sialyllactose. This breakthrough opens the door to the development of novel allergy treatments and enhances our understanding of certain neurodegenerative diseases.

Phosphorylation
Phosphorylation is a type of reversible PTM often used to activate or deactivate proteins [1,2]. The addition of a phosphate group introduces a negative charge, rendering the protein more hydrophilic and, in turn, changing its shape. This modification influences various cellular processes, such as cell growth, apoptosis, and signal transduction. Current methods for producing phosphoproteins include using chemical analogs or modifying target sites to mimic phosphorylation, but these are all limited by low yields [32]. Notably, in 2015, Oza et al. demonstrated the site-specific incorporation of phosphoserine into human mitogen-activated ERK-activating kinase 1 (MEK1) using an E. coli-based CFPS system [32]. They successfully produced mono- and doubly phosphorylated MEK1 in milligram quantities and tested its activity in a reconstituted in vitro signaling cascade. This work suggests that the CFPS system can be a high-yielding production platform for the direct phosphorylation of proteins. In addition, it marked a significant advance towards a better understanding of the human phosphoproteome and the exploration of disease-related phosphorylated proteins and potential therapeutic small-molecule inhibitors.

In Vitro Prenylation
In 2022, Kai et al. developed a prenylated protein synthesis system, CFpPS, that combines the E. coli-based CFPS with the eukaryotic prenylation machinery [33], a concept similar to the cell-free one-pot glycoprotein synthesis platform previously developed by the DeLisa and Jewett groups [29]. Protein prenylation is an irreversible PTM that transfers a farnesyl (15-carbon) or geranylgeranyl (20-carbon) group to a cysteine residue located in a C-terminal consensus sequence (CaaX box) of the target protein. These reactions are catalyzed by farnesyltransferase (FTase) and geranylgeranyl transferase type 1 (GGTase-1), respectively. The CFpPS allowed for the co-translational expression of several model proteins, including Kras, Hras, RhoA, RhoC, Rac1, and Cdc4. The system successfully demonstrated soluble protein expression and membrane binding ability. This advance provides compelling evidence for the practical use of the bacterial CFPS system for studying complex eukaryotic PTMs.
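
The CaaX box described above lends itself to a simple sequence check. The sketch below tests whether a sequence ends in a CaaX-like motif and guesses the lipid from the final residue. Both rules are simplifying assumptions that do not come from the study: "a" is restricted to A, V, I, or L, which is narrower than what the enzymes accept, and the rule of thumb that a C-terminal L or F favors geranylgeranylation while S, M, Q, or A favors farnesylation is only a heuristic. The function name and test peptides are hypothetical.

```python
import re

# ASSUMPTION: "a" is limited to the aliphatic residues A, V, I, L.
CAAX = re.compile(r"C[AVIL][AVIL]([A-Z])$")

# ASSUMPTION: rule-of-thumb mapping from the X residue to the likely lipid.
GGTASE_X = set("LF")    # geranylgeranyl (20 carbon), GGTase-1
FTASE_X = set("SMQA")   # farnesyl (15 carbon), FTase


def classify_caax(sequence):
    """Return (motif, predicted lipid), or None if no C-terminal CaaX box is found."""
    match = CAAX.search(sequence.upper())
    if match is None:
        return None
    x = match.group(1)
    if x in GGTASE_X:
        lipid = "geranylgeranyl"
    elif x in FTASE_X:
        lipid = "farnesyl"
    else:
        lipid = "undetermined"
    return match.group(0), lipid


# Hypothetical toy peptides; only the last four residues matter here.
for peptide in ("GSGSCVIM", "GSGSCLVL", "GSGSCVIMK"):
    print(peptide, classify_caax(peptide))
```

Anchoring the pattern with `$` reflects that the motif must sit at the very C-terminus, so appending even one extra residue makes the check fail.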

ALiCE
The Almost Living Cell-free Expression (ALiCE) system could overcome some of the limitations inherent to bacterial CFPS systems [34]. The ALiCE system, derived from the tobacco BY-2 cell line, incorporates microsomal vesicles native to the cells, which allows the system to perform PTMs more efficiently. The ALiCE system successfully showcased the production of various eukaryotic proteins requiring PTMs, such as a hepatitis B core antigen model virus-like particle, glucose oxidase (an enzyme containing multiple disulfide bonds), the SARS-CoV-2 spike protein, and the anti-tumor necrosis factor α monoclonal antibody adalimumab, which requires multiple glycosylation sites and disulfide bonds. Moreover, the system demonstrated the first expression of the human cannabinoid receptor type II (CB2). Lastly, the team produced these proteins with minimal batch-to-batch variability and scalable yields.

Concluding Remarks and Future Challenges
This review summarizes the advances in cell-free systems for PTMs, which have given researchers unprecedented control over the modification processes and the ability to overcome the challenges faced with traditional in vivo systems. Using the CFPS system, proteins with unique modifications that are otherwise difficult or impossible to produce in vivo can now be synthesized efficiently. This extends not only to research purposes, such as understanding protein function and interactions, but also to practical applications, such as industrial pharmaceutical production. The ability to produce therapeutic proteins using the CFPS system at reduced cost and increased yield could revolutionize drug development, making novel therapeutics more accessible and affordable for end users. While significant progress has been made in investigating various PTMs in the CFPS system, many challenges remain to be overcome to reach the ultimate scientific and industrial objectives. The majority of these challenges originate from the distinct PTM environments, which often impede PTM efficiency because of resource competition and inadequate PTM resources within the CFPS setting. Achieving both precise PTMs and high productivity will require meticulous calculation and prediction, for example through computer-aided simulation of target-specific PTMs. It is crucial to continue optimizing and developing the system while ensuring the functional activity of the target proteins. In conclusion, the CFPS system holds great potential for the study and production of post-translationally modified proteins. As scientists continue to improve and expand the PTM repertoire available in the CFPS system, we can expect the system to make a significant contribution to the biomanufacturing industry as well as to synthetic biology research.