Chemical Synthesis and Semisynthesis of Lipidated Proteins

Abstract Lipidation is a ubiquitous modification of peptides and proteins that can occur either co‐ or post‐translationally. An array of different lipid classes can adorn proteins and has been shown to influence a number of crucial biological activities, including the regulation of signaling, cell–cell adhesion events, and the anchoring of proteins to lipid rafts and phospholipid membranes. Whereas nature employs a range of enzymes to install lipid modifications onto proteins, the use of these for the chemoenzymatic generation of lipidated proteins is often inefficient or impractical. An alternative is to harness the power of modern synthetic and semisynthetic technologies to access lipid‐modified proteins in a pure and homogeneously modified form. This Review aims to highlight significant advances in the development of lipidation and ligation chemistry and their implementation in the synthesis and semisynthesis of homogeneous lipidated proteins that have enabled the influence of these modifications on protein structure and function to be uncovered.

Proteins are responsible for orchestrating the vast majority of biological processes that occur in living systems. Advances in genetics and proteomics have revealed that the total number of proteins within a cell (the proteome) is far larger than the size of the genome; for example, in humans, approximately 20 000 genes are estimated to encode over one million protein products. [1][2][3] This enormous diversification of genetic information is largely a result of co- and post-translational modifications (PTMs) that occur during or after protein translation on the ribosome, respectively. [1,3] Hundreds of distinct PTMs have been discovered to date and can occur via enzymatic or non-enzymatic processes. [4,5] The nature of these modifications varies from the addition of small functionalities (e.g. phosphorylation, methylation, sulfation, or acetylation), to the addition of larger and/or structurally complex biomolecules (e.g. ubiquitination, glycosylation, ADP-ribosylation, or lipidation). [1,6] Other common PTMs include subtle modifications of amino acid side chains (e.g. citrullination), polypeptide cleavage, and cyclization events. [5] Although there is growing evidence that PTMs occur on a large proportion of human proteins [7] and are crucial for structure, localization, and/or biological function (including the efficacy of many biologics), [8] the modulatory effects of most modifications are unknown for the majority of the proteome. [2,3]

Lipidation is a widespread modification of proteins that can occur post-translationally or co-translationally. Characterized by the addition of hydrocarbon chains of various lengths to proteins, lipidation increases the protein hydrophobicity, which often leads to membrane anchoring. [9,10] This localization at cell (or intracellular) membranes can serve a range of functions, including modulation of the activity of cell-signaling proteins, sequestration of a protein from a substrate, or the enhancement of protein-substrate association through membrane clustering. [9,10] In addition to the wide-ranging roles of protein-bound lipids in biology, these molecules have also been implicated in human disease; for example, lipidation of the human oncoprotein, Src, leads to delivery of the protein to the plasma membrane, which results in its pathogenicity. [11] These lipid modifications include prenylation at cysteine (Cys) residues (e.g. S-farnesyl and S-geranylgeranyl lipids), fatty acylation at either Cys residues or the N-terminus (e.g. S- and N-palmitoyl or N-myristoyl lipids), and the attachment of cholesterol, glycosylphosphatidylinositol (GPI), or phosphatidylethanolamine (PE) anchors to the C-terminus (Figure 1). Given the functional importance of protein lipidation, and the diversity of lipid modifications that exist in nature, tools which facilitate access to these biomolecules in homogeneous form are vitally important for detailed structure-function studies. [12] This Review aims to highlight the biological significance of protein lipidation as well as provide a detailed account of the synthetic and semisynthetic technologies that have been developed and employed to efficiently access this class of modified proteins.

Prenylation
Prenylation is characterized by the attachment of multiple isoprene units to Cys residues through a thioether linkage within the C-terminal region of a given protein. The two forms of prenylation are farnesylation and geranylgeranylation, which contain three and four isoprene units, respectively (Figure 1). Up to 2% of cellular proteins are known to be prenylated in mammalian cells, most of which are geranylgeranylated. [13] Functionally, prenyl modifications serve to recruit otherwise soluble proteins to the cell membrane: this can be either the plasma membrane or endomembranes surrounding organelles such as the Golgi, ER, lysosomes, and nucleus. [9] Examples of proteins that have their function modulated by post-translational prenylation include members of the Ras superfamily, which play a central role in cellular signaling. Some commonly farnesylated proteins in this family include K-Ras, N-Ras, H-Ras, Rheb, nuclear lamins, and Hdj2, while geranylgeranylated proteins include Rac, Cdc42, RhoA, and Rab proteins. [14]

Until very recently, three protein prenyltransferase enzymes were reported to be operational in eukaryotic cells, being responsible for the installation of modifications to the side chain of Cys residues. Specifically, farnesyl groups are transferred by farnesyltransferase (FTase) using farnesyl pyrophosphate (Fpp) as the substrate, while geranylgeranyl groups are installed by geranylgeranyltransferase (GGTase-I) using geranylgeranyl pyrophosphate (GGpp). [9] Both of these enzyme-catalyzed modifications occur within a C-terminal CAAX box motif, where C is the Cys residue that is modified, A is an aliphatic amino acid, and the nature of the X residue dictates the type of modification installed. For example, when X is serine (Ser), methionine (Met), or glutamine (Gln), the Cys residue is farnesylated, whereas when X is leucine (Leu), the protein is geranylgeranylated. [15,16] There is also evidence to suggest that FTase strongly prefers small hydrophobic residues present at the second A position. [17] After prenylation, the remaining amino acids (AAX) of the box are cleaved, either by an endoplasmic reticulum protease or Ras-converting enzyme 1. The resulting carboxylate group of the side-chain-prenylated Cys residue is subsequently methylated by the enzyme isoprenylcysteine carboxymethyltransferase (ICMT) to provide a C-terminal methyl ester. [18]

In contrast to the FTase and GGTase-I prenyltransferases, the third class of enzyme, Rab (Ras-related in brain) geranylgeranyl transferase (GGTase-II or Rab-GGTase) employs GGpp as a substrate to specifically transfer either one or two geranylgeranyl groups. The C-terminal prenylation motifs that are found within the family of Rab proteins are mostly CC and CXC but also include CCX, CCXX, and CXX. [9,19] A fourth type of protein prenyltransferase called GGTase3 has very recently been discovered and is responsible for geranylgeranylating the ubiquitin ligase FBXL2, thereby linking it to the membrane and allowing polyubiquitylation of membrane-anchored proteins. [20] It similarly modifies SNARE proteins, such as Ykt6, which is a prerequisite for proper assembly of the Golgi SNARE complex. [21]
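The CAAX rule described in this section can be written down as a small lookup. The following Python sketch is illustrative only: the function name `classify_caax` and the returned labels are hypothetical, the aliphatic character of the two A positions is deliberately not checked, and only the X residues named in the text (Ser, Met, Gln, Leu) are assigned. Real substrate specificity is more nuanced than this table suggests.

```python
# Simplified reading of the CAAX rule from the text (not a validated predictor).
FARNESYL_X = {"S", "M", "Q"}   # X = Ser, Met, or Gln: farnesylation (FTase)
GERANYLGERANYL_X = {"L"}       # X = Leu: geranylgeranylation (GGTase-I)


def classify_caax(sequence: str) -> str:
    """Classify the C-terminal four residues of a one-letter protein sequence."""
    seq = sequence.strip().upper()
    if len(seq) < 4 or seq[-4] != "C":
        return "no CAAX box"
    x_residue = seq[-1]
    if x_residue in FARNESYL_X:
        return "farnesyl"
    if x_residue in GERANYLGERANYL_X:
        return "geranylgeranyl"
    # X residues not named in the text are left unassigned rather than guessed.
    return "unassigned"
```

Called on toy C-terminal tetrapeptides, `classify_caax("CVLS")` returns `"farnesyl"` and `classify_caax("CLVL")` returns `"geranylgeranyl"`. The Rab motifs (CC, CXC, and related) handled by GGTase-II fall outside this sketch.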

Fatty Acylation
Myristoylation and palmitoylation represent the two most common forms of protein fatty acylation, and both have been shown to critically influence protein structure, function, and/or localization. [9,22,23] Protein palmitoylation is defined by the attachment of a C16 palmitate fatty acid to a protein and can occur through two different linkages in humans (Figure 1). In S-palmitoylation, the lipid is reversibly attached to the side chain of Cys residues through an enzymatically and hydrolytically labile thioester linkage. In contrast, with N-palmitoylation, the lipid is transferred to the N-terminus of the protein or, more rarely, to a lysine side chain where it is appended through a hydrolytically stable amide bond. Interestingly, no enzymatic machinery is known to remove N-palmitoylation and it is, therefore, thought to be an irreversible modification. [22] Unlike many other PTMs, including prenylation, a specific consensus sequence for predicting protein N-palmitoylation does not exist and, in addition, some reported N-palmitoylation modifications could be the result of S-to-N transfer reactions. [24] S-palmitoylation, however, is often associated with nearby N-myristoylated glycine (Gly) residues, or prenylated C-terminal Cys residues. [22]

Palmitoylation is implicated in protein trafficking, as the imparted hydrophobicity directs the otherwise soluble proteins to different cellular and organelle membranes. In neurons, palmitoylation is important for targeting proteins to the axon terminals, which ultimately regulates synapse activity. [25] For many proteins, S-palmitoylation is not permanent; rather, the protein cycles between palmitoylated and depalmitoylated states, which regulates its function in a dynamic manner. [26] Such dynamic modification cycles are driven enzymatically, with addition of the lipid being carried out by palmitoyltransferases and removal being catalyzed by acylprotein thioesterases, such as acylprotein thioesterase-1 (APT1) [27] or palmitoylthioesterase-1 (PPT1). [28] Members of the Ras protein family of small GTPases are some of the most well-studied examples of S-palmitoylated proteins. Here, the lipidation has been shown to regulate the membrane association of the proteins. [29] There is also evidence to suggest that palmitoylation protects proteins from proteasomal degradation by preventing their ubiquitination. [30,31]

Protein N-myristoylation is defined by the attachment of a C14 myristoyl group to an N-terminal Gly residue of a protein through an amide linkage. In some cases, proteins can also bear myristoyl groups at the ε-amino group of lysine (Lys) residues, as is the case for TNFα and the precursor interleukin 1α protein. [32][33][34] Unlike prenylation and palmitoylation, myristoylation can occur co-translationally as well as post-translationally, with the myristoyl group installed after cleavage of the N-terminal initiator Met residue by methionine aminopeptidase. [35] In eukaryotes, N-myristoyltransferases (NMT1 or NMT2) are primarily responsible for catalyzing the transfer of the myristoyl group from myristoyl-CoA to the substrate protein bearing an N-terminal Gly residue. [36] Like other forms of lipidation, N-myristoylation regulates cell signaling, membrane association, and trafficking, and dual myristoyl modifications are also common. [37,38] In many cases, a single myristoyl group is not sufficient to induce membrane trafficking, as additional lipidation is necessary to enhance hydrophobicity. For this reason, myristoylation and palmitoylation are commonly found together on proteins. [39] It is important to note that rare protein fatty acylations also occur with octanoate (best known as O-octanoate modification on the peptide hormone ghrelin), [40] as well as with unsaturated C16-C20 fatty acid chains.
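When checking a lipidated product by mass spectrometry, it can help to know the mass increment each lipid is expected to add. The sketch below computes monoisotopic mass shifts from elemental compositions. The net formulas are assumptions based on standard chemistry rather than values given in this Review: an acyl group replaces one hydrogen (net gain CnH(2n-2)O for a saturated chain), and a prenyl group replaces the thiol hydrogen of Cys (net gain C15H24 for farnesyl, C20H32 for geranylgeranyl). The names `LIPID_DELTAS` and `mass_shift` are hypothetical.

```python
# Monoisotopic masses of the elements involved (Da).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}

# Assumed net elemental gain on the protein for each modification.
LIPID_DELTAS = {
    "octanoyl": {"C": 8, "H": 14, "O": 1},
    "myristoyl": {"C": 14, "H": 26, "O": 1},
    "palmitoyl": {"C": 16, "H": 30, "O": 1},
    "farnesyl": {"C": 15, "H": 24},
    "geranylgeranyl": {"C": 20, "H": 32},
}


def mass_shift(delta: dict) -> float:
    """Sum the monoisotopic masses of the atoms gained upon lipidation."""
    return sum(MONOISOTOPIC[element] * count for element, count in delta.items())


def expected_mass(unmodified_mass: float, lipids: list) -> float:
    """Expected monoisotopic mass after adding the listed lipid groups."""
    return unmodified_mass + sum(mass_shift(LIPID_DELTAS[name]) for name in lipids)
```

For a dually lipidated species, `expected_mass(m, ["myristoyl", "palmitoyl"])` simply adds both increments to the unmodified mass `m`. C-terminal processing steps such as AAX cleavage and methyl esterification are not modeled here.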

Glycosylphosphatidylinositol Anchors
The attachment of a glycosylphosphatidylinositol (GPI) anchor (also called glypiation) is a post-translational modification found across a broad range of organisms, including mammals, insects, plants, fungi, and protozoa. [41] The modification occurs on the C-terminus of proteins and typically serves to anchor proteins to the extracellular face of plasma membranes. [41] The GPI structure comprises a phosphoethanolamine linker, a highly conserved glycan core (D-Man(α1-2)-D-Man(α1-6)-D-Man(α1-4)-D-GlcN(α1-6)-myo-inositol), and a lipid tail which varies in structure depending on the organism it originates from. Specifically, these lipids can vary in length from 14 to 28 carbon atoms, and can either be saturated or unsaturated. [42] Although many of the biological functions of the GPI anchor are yet to be elucidated, they have already been shown to play key roles in cell-cell adhesion, signal transduction, membrane targeting, and lipid raft partitioning. [43][44][45]

Bertozzi and co-workers have undertaken numerous elegant chemical biology studies to understand the structure-function relationships of GPI anchors. For example, work from the group has shown that the internal glycan of the GPI is important for lateral mobility of proteins to regulate activity. [46,47] Using a powerful cell surface painting strategy, purified GPI-modified proteins have been anchored into cell membranes from exogenous sources in both in vitro and in vivo settings. [48][49][50] GPI anchors have also been shown to be important components of immunodominant epitopes for eukaryotic parasites (e.g. Plasmodium falciparum), which has encouraged the production of synthetic variants for use as vaccine candidates. [51,52] Crucial to these studies is the ability to access sufficient quantities of GPI-anchored peptides or proteins in pure form. When produced through recombinant expression in cells, samples are typically heterogeneous, with a wide variety of structures within the lipid portion of the GPI that are very challenging to separate by chromatographic techniques. [45,53] As such, chemical synthesis has emerged as a potential avenue to access GPI-anchored proteins for functional studies. To avoid the challenging and labor-intensive synthesis of the native GPI molecule, several simplified mimics have been synthesized, studied, and reviewed; however, these will not be discussed in detail in this Review. [45,54,55]

Cholesterol Anchors
Cholesterol anchors are found in the context of hedgehog (Hh) proteins, which are important for embryonic development and malignant tumorigenesis in a variety of tumor types. [56,57] These cholesterol modifications are incorporated during a unique autocleavage process, after which the 3β-hydroxy group of cholesterol is linked to the C-terminus of the processed protein through an ester linkage (Figure 1). [56,57] This reaction is initiated by intramolecular nucleophilic attack on the carbonyl group of a Gly residue by an adjacent thiol side chain of a Cys residue, leading to the intermediate formation of a thioester linkage. This thioester subsequently reacts with the 3β-hydroxy group of cholesterol to generate an ester linkage and liberate the C-terminal autoprocessing domain. [9] C-terminal cholesterol anchors have been shown to be responsible for the release of dually lipidated Hh proteins from the cell surface. This is facilitated by two transporter-like proteins (Scube and Disp) that recognize parts of the cholesterol molecule. [58][59][60] Importantly, although cholesterol is not necessarily required for Hh signaling activity, it has been shown that the absence of this modification reduces signaling potency. [59] Modification with cholesterol has also been shown to be important in regulating the activity of the protein smoothened (SMO), which is modified on an aspartic acid (Asp) residue rather than the C-terminus of the protein. [61]

Phosphatidylethanolamine Anchors
The addition of phosphatidylethanolamine (PE) modifications to generate PE anchors is a rare and relatively understudied PTM. To date, PE anchors have been found on the autophagy-related proteins Atg8 (yeast) and LC3 (mammals). [62] These anchors are covalently attached to proteins through an amide bond between a C-terminal Gly residue and the amino group of the PE. [63] In both the Atg8 and LC3 proteins, the conjugation to PE is essential for their correct localization and function. [64]

Tools for the Preparation of Lipidated Peptides and Proteins

Synthetic Tools
In general, lipopeptides can be routinely accessed by standard solid-phase peptide synthesis (SPPS) techniques. Although several solution-phase syntheses of lipopeptides have been reported, [65][66][67][68] these approaches tend to be laborious and require numerous protecting group manipulation and purification steps. Moreover, the inherent solubility problems that accompany the use of side-chain-protected peptides in solution-phase peptide synthesis are exacerbated for lipopeptides. [64] These issues are mitigated on a solid support as reactions can be driven to completion with excess reagents, which can be removed by simple filtration. Lipopeptides can be accessed by SPPS either by coupling pre-lipidated amino acids to a growing chain, or by selectively lipidating specific unprotected amino acids following complete elongation of the chain. [69]

The production of full-length lipid-modified proteins is considerably more challenging; this is due to the inherent size limits of peptides that can be assembled by standard SPPS methods (typically 40-50 residues). For these reasons, lipidated peptides and proteins of more than 50 amino acids in length are more commonly accessed by peptide ligation chemistry, in which a peptide and lipopeptide fragment can be chemoselectively fused to access larger targets (Figure 2). Larger lipidated protein targets that would be intractable or impractical to produce through total chemical synthesis can instead be generated through semisynthetic methods. As a methodology, protein semisynthesis is broadly categorized as the use of either chemical or chemoenzymatic methods to fuse a synthetic peptide to a larger and (typically) unmodified expressed protein (Figure 2). One of the most widely adopted methods for protein semisynthesis is expressed protein ligation (EPL), which leverages the native chemical ligation (NCL) manifold to facilitate chemical ligation of a recombinant protein with a synthetic peptide. This was first demonstrated through ligation of a recombinant protein thioester (generated through thiolysis of a protein-intein fusion) and a synthetic peptide bearing an N-terminal Cys residue. Alternatively, a number of methods have been developed for chemoenzymatic semisynthesis, the most relevant to this Review being the sortase-mediated ligation, whereby a protein tagged with a C-terminal sortase-recognition sequence can be regioselectively fused to a C-terminal peptide or protein by the sortase enzyme (Figure 2).
Recombinant expression methods for accessing proteins are routine and most commonly performed in E. coli. [70,71] In addition to their ease of genetic manipulation and handling, E. coli grows rapidly and typically provides high yields of a target protein. Nonetheless, some complex mammalian proteins that feature high degrees of structural complexity can be challenging to access in simple bacterial expression systems. Furthermore, although prokaryotes are known to post-translationally modify proteins, the use of E. coli expression systems mostly results in the production of unmodified proteins, and it is not straightforward to use standard bacterial systems for the incorporation of mammalian PTMs. [71] It should be noted that eukaryotic expression systems such as yeast [72] or insect cells [73] bear the necessary enzymatic machinery to generate many higher-order PTMs and have been used to access modified proteins. Such approaches, however, are intrinsically plagued by the formation of heterogeneous mixtures of differentially modified proteins. This severely limits their application for deconvoluting the effects of particular PTMs and for accessing more highly defined, site-specifically modified protein therapeutics, a class of biomolecules predicted to form the bedrock of the burgeoning biotechnology and "biologics" industries. [64,74] For this reason, the semisynthetic generation of lipidated proteins through the chemoselective ligation of an unmodified recombinant protein and a homogeneously lipidated synthetic peptide represents an enormously powerful methodological platform.
Assembly of full-length lipid-modified proteins from segments produced synthetically or recombinantly can be achieved through a range of diverse ligation techniques (Figure 2). [3,75,76] The most commonly used method is native chemical ligation (NCL); [77,78] however, other common ligation methods include the diselenide-selenoester ligation (DSL), [79][80][81] Ser/Thr ligation (STL), [82] α-ketoacid-hydroxylamine (KAHA) ligation, [83] maleimidocaproyl (MIC) ligation, [68] and sortase-mediated enzymatic ligation. [84] In most cases, unprotected peptide and protein fragments are used, which allows reactions to be performed in buffered aqueous solutions at neutral (or near-neutral) pH and ambient-to-moderate temperatures. The NCL method involves a chemoselective reaction between a peptide bearing an N-terminal Cys residue with another peptide derivatized as a C-terminal thioester. Mechanistically, the NCL reaction is initiated by nucleophilic attack of the side chain of the Cys residue (at the N-terminus of one segment) at a thioester (at the C-terminus of the other segment) in a reversible transthioesterification step. This step is followed by a rapid S-to-N acyl shift to produce the native amide bond. [77,78] The synthesis of peptide thioesters can be achieved using a range of solution- and solid-phase procedures. [85]

Larger protein thioesters can also be accessed using engineered inteins, which utilize a natural protein splicing process. [86] In this process, an internal peptide fragment within a protein, termed an intein, is self-excised from the larger protein, which then ligates two flanking segments, termed exteins, thereby forming an amide bond between them. In the first step, a (thio)ester is formed at an N-terminal Cys or Ser residue of the intein by a reversible N-to-S/O acyl shift (Figure 2). This intermediate is then subjected to a trans(thio)esterification after nucleophilic attack by a Cys, Ser, or Thr residue present on the C-terminus of the extein.
The resulting (thio)ester then undergoes an intramolecular cyclization at the conserved asparagine (Asn) residue present on the C-terminus of the intein. This succinimide formation excises the intein and, after a final S/O-to-N acyl shift, ultimately results in the formation of an amide bond between the two exteins. [87,88] In the context of EPL, a fusion construct of the target protein linked to an intein domain that can only undergo the initial thioester formation is employed, and affinity tags such as chitin-binding domains (CBDs) can be incorporated on the C-terminus of this construct to facilitate downstream protein purification. After purification by affinity chromatography, the fusion protein can be cleaved and eluted from the column with an excess of a thiol such as 2-mercaptoethanesulfonate (MESNa), thereby providing the corresponding MESNa thioester of the desired protein segment. This can then be ligated to a synthetic lipopeptide containing an N-terminal Cys by an NCL reaction. [64,87]

The DSL methodology is inspired by the NCL reaction but harnesses the superior reactivity of C-terminal selenoesters with the enhanced nucleophilicity of the 21st amino acid selenocysteine (Sec) on the N-terminus of the other peptide fragment. This enhanced reactivity means that ligation reactions are more facile than NCL and can be performed even at low concentrations. DSL is, therefore, a useful ligation technique when reaction partners are less soluble (e.g. for lipidated fragments), and has been successfully employed for lipidated protein synthesis. [89] STL involves the chemoselective reaction between an unprotected peptide with a C-terminal salicylaldehyde (SAL) ester and another unprotected peptide with an N-terminal Ser or threonine (Thr) residue. [90,91] The high abundance of Ser and Thr residues in native proteins makes this an attractive method for protein synthesis. A final type of peptide ligation reaction covered here is the α-ketoacid-hydroxylamine (KAHA) ligation. [83,92] This reaction proceeds via the decarboxylative condensation of a peptide bearing an α-ketoacid on the C-terminus with a peptide containing an N-terminal hydroxylamine functionality. This enables the fusion of the fragments through an amide bond, typically under acidic organic buffering conditions. Despite the availability of these key techniques, it is anticipated that future innovations based around these peptide ligation concepts will undoubtedly enhance the number of lipidated protein targets that can be accessed by synthesis and semisynthesis.
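The residue requirements of the ligation methods above suggest a simple way to scan a target sequence for possible disconnection sites. The following Python sketch is a planning aid under stated assumptions, not an established tool: `JUNCTION_RULES` and `candidate_junctions` are hypothetical names, selenocysteine is written with the one-letter code U, and only the N-terminal residue requirement of the downstream fragment is considered. KAHA ligation is omitted because its hydroxylamine partner is not encoded by a one-letter residue, and practical factors such as solubility or the identity of the C-terminal residue at the junction are ignored.

```python
# N-terminal residue required on the downstream fragment for each method,
# as described in the text (one-letter codes; U = selenocysteine).
JUNCTION_RULES = {
    "NCL": {"C"},        # N-terminal Cys reacts with a C-terminal thioester
    "DSL": {"U"},        # N-terminal Sec reacts with a C-terminal selenoester
    "STL": {"S", "T"},   # N-terminal Ser/Thr reacts with a C-terminal SAL ester
}


def candidate_junctions(sequence: str) -> list:
    """List (index, method) pairs; index i splits the chain into seq[:i] + seq[i:]."""
    seq = sequence.strip().upper()
    hits = []
    # Start at 1 so that every junction leaves a non-empty upstream fragment.
    for i in range(1, len(seq)):
        for method, residues in JUNCTION_RULES.items():
            if seq[i] in residues:
                hits.append((i, method))
    return hits


def fragments_within_limit(sequence: str, index: int, max_len: int = 50) -> bool:
    """Check both fragments against an assumed SPPS length limit (default 50)."""
    return index <= max_len and len(sequence) - index <= max_len
```

The default of 50 residues mirrors the typical SPPS size limit quoted earlier in this section; it is a parameter and should be adjusted to the synthesis at hand.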
Two further methods that have been used to generate lipid-modified protein analogues are the maleimidocaproyl (MIC) and sortase ligations. Specifically, in the MIC ligation, the synthetic lipopeptide is equipped with an N-terminal maleimide group; [68] this is reacted with a Cys residue within an expressed protein fragment through Michael addition to the side chain thiol. One benefit of this method is that Cys residues on the C-terminus can be modified with a degree of selectivity due to the steric inaccessibility of other Cys residues in the sequence if they are buried within the protein. [93] However, this method is not compatible with multiple C-terminal Lys residues, as the ε-amine functionality can also react with the electrophilic maleimide moiety. [94] In contrast, the sortase ligation capitalizes on the ability of the sortase enzyme to specifically recognize an LPXTG pentapeptide motif [X = preferably Ser or glutamate (Glu)] for chemoselective fusion of two peptide or protein fragments (Figure 2). [95] Following binding to the motif, sortase first initiates thiolysis of the T–G amide bond to produce a thioester-linked acyl enzyme intermediate. This thioester is then intercepted by the α-amine moiety of an N-terminal Gly residue on another peptide or protein fragment to afford the transpeptidation product with regeneration of the active sortase enzyme. [84,95] A major drawback of these strategies is that they generate non-native "scars" within the protein sequence, namely, an unnatural maleimide or LPXTG motif for the MIC and sortase ligations, respectively. Other examples of ligation or bioconjugation techniques that have been successfully employed to access unnatural analogues or mimics of lipidated proteins include the Diels-Alder ligation [96] and the Cu(I)-catalyzed azide-alkyne cycloaddition (CuAAC), the archetypal click reaction. [97]
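The sortase transpeptidation described above can be mimicked at the sequence level to show where the LPXTG "scar" ends up in the product. The sketch below is a string-level illustration only: `sortase_ligate` is a hypothetical helper, the example sequences are invented, any residue is accepted at the X position, and nothing about enzyme kinetics or reversibility is modeled.

```python
import re

# LPXTG recognition motif; X is allowed to be any one-letter residue here.
SORTASE_MOTIF = re.compile(r"LP[A-Z]TG")


def sortase_ligate(protein: str, nucleophile: str) -> str:
    """Return the sequence of the transpeptidation product.

    The protein is cut between Thr and Gly of its last LPXTG motif, and the
    fragment ending in LPXT is joined to a peptide starting with N-terminal Gly.
    """
    protein = protein.strip().upper()
    nucleophile = nucleophile.strip().upper()
    if not nucleophile.startswith("G"):
        raise ValueError("nucleophile needs an N-terminal Gly residue")
    matches = list(SORTASE_MOTIF.finditer(protein))
    if not matches:
        raise ValueError("no LPXTG motif found in the protein sequence")
    # Keep everything up to and including the Thr of the motif (acyl donor).
    acyl_donor = protein[: matches[-1].start() + 4]
    return acyl_donor + nucleophile
```

With the invented inputs `"MKTAYLPETGHHHHHH"` and `"GGGK"`, the product is `"MKTAYLPETGGGK"`: the residues after the cleaved Gly are lost from the protein, and the LPETG sequence persists at the junction as the non-native scar mentioned in the text.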
Tools for Improving the Handling and for Analyzing Lipidated Peptides and Proteins

The introduction of lipid moieties into peptides and proteins can dramatically change their physicochemical properties, leading to poor solubility, and in turn aggregation or the formation of micelles in aqueous buffers. [98] Many efforts have been made to tune the solubility of lipidated peptides and proteins by using buffer additives such as detergents and/or chaotropes, or through the introduction of transient solubility tags. The improved solubility engendered through these strategies aids both ligation and lipid modification reactions, as well as subsequent handling during purification.
In most cases, lipidated peptides and proteins are purified and analyzed by reverse-phase high-performance liquid chromatography (RP-HPLC), as their increased hydrophobicity allows easy separation from unmodified and/or much smaller precursors. Here, the use of short-chain stationary phases with sufficiently large pore size is recommended, for example, a C4 stationary phase with a pore size of 30 nm. The use of columns with longer alkyl chains such as C18 complicates purification because of the higher affinity of the lipid modification for these stationary phases. [99] The choice of organic eluent and modifier for RP-HPLC is also critical, as is the temperature. Notably, optimized purification conditions usually need to be identified for each protein individually (or at least for a class of proteins) and one must ensure that the use of low pH and/or high temperatures during purification does not lead to decomposition of the proteins or cleavage of lipid chains. [100] In many cases, the addition of organic solvents miscible with water, such as acetonitrile or fluorinated alcohols, helps to solubilize lipidated peptides and proteins before and during purification. Fluorinated alcohols such as 2,2,2-trifluoroethanol (TFE) or 1,1,1,3,3,3-hexafluoro-2-propanol (HFIP) are strong hydrogen bond donors and can stabilize α-helical secondary structures, thereby helping to keep peptides and proteins with helical structural elements in solution. Combinations of organic solvents (e.g. acetonitrile with 2-propanol and TFE) with low concentrations of TFA as a modifying agent have been successfully used for the purification of lipidated proteins by RP-HPLC. [101] Although there are even better solvents for dissolving hydrophobic peptides and proteins such as dimethylformamide (DMF) and dimethyl sulfoxide (DMSO), these are not commonly used due to their strong UV absorption at 214 nm (similar to absorption of the peptide bond), which interferes with the detection of peptides and proteins during HPLC purification. [102] The addition of DMSO can also lead to the oxidation of methionine and cysteine side chains. [103]

If no conditions for RP-HPLC can be identified, hydrophobic interaction chromatography (HIC) can serve as an alternative means of purification. For this method, a hydrophobic stationary phase (such as agarose with butyl or phenyl ligands) is used and the samples are applied to the column in a high-salt buffer. A decreasing salt gradient is used to elute proteins from the column in order of increasing hydrophobicity. [104] It should be noted that HIC is not suitable for the purification of peptides and proteins that form aggregates through solely hydrophobic interactions.
In such cases,a queous buffers containing high concentrations of chaotropes such as guanidinium hydrochloride (Gdn·HCl, up to 6M)o ru rea (up to 8M)a re often used in combination with detergents to prevent the aggregation of hydrophobic peptides and proteins by disrupting hydrogen bond donor and acceptor sites and by blocking hydrophobic interaction sites with detergent molecules. [105,106] Above the critical micelle concentration (CMC), detergents provide an ideal environment for lipids;however, they can also denature proteins and are often challenging to remove by standard purification approaches.The most commonly used detergents include the negatively charged sodium dodecyl sulfate (SDS) as well as non-ionic detergents such as Tr iton X-100 and Tw een 20. It is important to note that the polyethylene glycol moieties in Tr iton X-100 and Tw een 20 are easily ionized and, therefore,s uppress other molecules during analysis by mass spectrometry.F or this reason, alternative detergents such as octylglycoside (OG) and n-dodecyl-b-D-maltoside (DDM) are often preferred, as they can be removed by dialysis or with the use of biobeads. 
[102] In cases where the strategies described above (or combinations thereof) do not lead to successful ligation and/or purification, an alternative viable strategy is the covalent attachment of solubility tags that can be removed following the ligation or folding step.S uch transiently attached solubilizing molecules complicate synthesis and lead to decreased yields as ar esult of additional steps,b ut can significantly improve the recovery of hydrophobic peptides and proteins through RP-HPLC.One of the earliest reported strategies involved the introduction of polylysine tags (four or more lysine residues) on the N-or C-terminus of transmembrane peptides,a nd has been successfully used to increase the solubility of peptides derived from the human erythrocyte protein glycophorin A, bacteriophage M13 major coat protein and the hepatitis Cv irus membrane protein NS4A in aqueous buffers. [107] However,these early versions of solubility tags were permanently attached and, owing to their high charge,c an influence the biological function of the peptides and proteins.A saresult, temporary solubilization tags such as polyethylene glycol polyamide,p olyethylene glycol, and polyarginine moieties were developed. These can be attached on the termini, to side chains,o rt ot he polypeptide backbone through av ariety of different linkers that allow cleavage of the tags under acidic [108] or basic [109] conditions or with the use of specific protease enzymes. [110,111] Solubility tags linked through photocleavable linkers have also been developed that avoid exposure to harsh reaction conditions during cleavage. [112] Based on the summary provided above,although there are clearly anumber of strategies available to handle and purify lipidated peptides and proteins, there is no singular generalizable strategy and the practitioner may in some cases need to test anumber of these approaches. 
Overall, it is recommended that if no solubilizing buffer can be found and other alternatives, such as the use of chaperones (see Section 6 for lipidated Rab proteins), are not available, solubility tags on the backbone of the target peptide/protein are a good option for accessing homogeneously lipidated proteins.

Synthesis and Semisynthesis of Palmitoylated Peptides and Proteins
S-Palmitoylated lipopeptides are typically accessed by SPPS using the Fmoc strategy (with adjusted Fmoc removal conditions to avoid thioester hydrolysis and S-to-N acyl transfer) using pre-lipidated amino acid building blocks. An example of this strategy is the synthesis of resin-bound palmitoylated endothelial nitric oxide synthase (eNOS) (1) by Waldmann and co-workers (Scheme 1A). [113] An alternative approach involves the introduction of the lipid modification to a substrate peptide at a late stage, as exemplified by the synthesis of a palmitoylated variant of the matrix protein M2 31-96 (2; Scheme 1B). [114] N-Palmitoylated lipopeptides are relatively straightforward to prepare through direct condensation of palmitic acid to the N-terminus of a solid-supported peptide using standard coupling conditions, as demonstrated in the assembly of the sonic hedgehog N-terminal fragment (ShhN) 1-34 (3; Scheme 1C). [115]
Scheme 1. Solid-phase synthesis of S-palmitoylated peptides by A) coupling of pre-lipidated amino acids [113] or B) direct on-resin palmitoylation of unprotected Cys residues. [114] C) Solid-phase synthesis of N-palmitoylated peptides by direct coupling of palmitic acid to the N-terminus. [115] PG = Standard side-chain protecting groups employed in Fmoc-SPPS.
A common issue encountered with the solid-phase synthesis of S-palmitoylated peptides and proteins is that palmitoyl thioesters are labile to standard Fmoc-deprotection conditions (e.g. 20 vol % piperidine in DMF) and can undergo S-to-N acyl shifts when present on deprotected Cys residues located at the N-terminus. [116] To prevent this unwanted S-to-N acyl shift, an Fmoc deprotection solution containing 1 vol % DBU in DMF can be used during peptide synthesis, followed immediately by the next amino acid coupling step. [116]
A common issue observed during the synthesis of peptides bearing fatty acyl modifications is poor solubility or aggregation leading to the generation of higher order structures. This issue is compounded by the fact that the amphipathic nature of lipopeptides leads to broader elution profiles on stationary phases used for chromatographic purification (e.g. HPLC), thus making it more difficult to remove by-products during purification steps. For this reason, access to full-length lipidated proteins usually necessitates the assembly of multiple peptide and lipopeptide fragments using peptide ligation chemistry, or the use of transient solubilizing modifications that are also employed in the synthesis of membrane peptides and proteins. [117] In this context, the use of alternative ligation strategies such as the STL, DSL, and direct aminolysis methods has become common, as palmitoyl thioesters are often unstable in the presence of thiol additives used to enhance the rates of NCL reactions. [118][119][120]
Numerous palmitoylated proteins have been accessed by total chemical synthesis to date. For example, Palà-Pujadas et al. used an impressive five-segment kinetically controlled NCL strategy to access the 175-residue palmitoylated N-terminal domain of the human Sonic Hedgehog protein. [115] Hirabayashi and co-workers performed a total synthesis and structural characterization of the 178-residue caveolin-1 (10), which is triply S-palmitoylated in the C-terminal region at Cys133, Cys143, and Cys156 (Scheme 2A). [120] Retrosynthetically, the protein was divided into five peptide segments which were fused using four consecutive ligation reactions. The synthesis proceeded through initial direct aminolysis between isopeptide fragments 11 and 12 to generate intermediate 13. This was followed by iterative ligation reactions, affording the fully protected caveolin-1 primary sequence (14). Finally, chemoselective deprotection of Acm, palmitoylation of the three deprotected Cys residues [using the electrophilic N-succinimidyl palmitate (Pal-OSu) reagent], and global deprotection furnished the target lipidated protein 10 (Scheme 2A). It is important to note that the group observed that the peptide segments derived from caveolin-1 were highly insoluble in aqueous buffer. Solubilizing O-acyl isopeptide linkages were, therefore, employed to improve the solubility of the segments; these linkages were converted back into native peptide bonds following protein assembly. Furthermore, to bypass the need for aqueous solvents, the group utilized direct aminolysis reactions (rather than more traditional ligation approaches) to condense each fragment using DMSO as a solvent.
These conditions are favorable for the stability of the palmitoyl thioesters, which are otherwise labile in aqueous media over long periods of time. However, direct aminolysis is not typically the practitioner's method of choice because of the potential for 1) regioselectivity issues when other nucleophilic residues are present on the peptide fragment (e.g. Lys and Cys residues) and 2) epimerization of the α-center at the ligation junction upon activation of the C-terminal residue of one of the fragments. Regioselectivity issues can be avoided by using suitable protecting groups on the Lys and Cys residues, and epimerization can be avoided by judicious choice of ligation sites bearing only Gly, Pro, or isopeptide-derived Ser residues on their N-terminal side. [120]
In another example, Chisholm et al. accessed the membrane protein phospholemman (FXYD1) 1-72 (15) using "reductive DSL" chemistry at low concentration to ensure solubility of a lipopeptide fragment 16 bearing a palmitylated Cys42 residue (Scheme 2B). [89] After the DSL reaction between 16 and a FXYD1 1-39 N-terminal selenoester 17, the Sec residue at the ligation junction (Sec40) was subjected to late-stage alkylation with 1-bromohexadecane to afford a dipalmitylated analogue of the protein. [89] Hanna et al. accessed di- and tri-palmitoylated variants of the Mycobacterium tuberculosis-associated antigen protein ESAT6 using a four-component DSL/NCL strategy. [121] Here, the N-terminal ligation fragment was palmitoylated through direct coupling of the lipidated amino acid and the fragment was later converted into a C-terminal thioester through a side-chain anchoring strategy. [122] A combination of DSL and NCL chemistry was used to prepare the majority of the ESAT6 17-95 protein, which was then fused to the N-terminal palmitoylated sequences in a final NCL reaction.
In a final key demonstration of these synthetic approaches, Huang et al. utilized a removable-backbone-modification (RBM) tag to introduce solubility to a lipopeptide fragment for the synthesis of rabbit S-palmitoylated sarcolipin (SLN) and S-palmitoylated influenza A virus matrix-2 (M2) 1-96 (18) ion channel proteins using STL chemistry. The RBM used in this study, a 2-hydroxy-3-methoxy-4-amidobenzyl group bearing a 4-amidohexalysine moiety to engender solubility, highlights the utility of solubility tags during the synthesis of palmitoylated fragments. [119] The STL method is particularly attractive for the synthesis of these targets given that the reaction does not require a thiol additive (unlike NCL), which would otherwise thiolyze the palmitoyl thioesters. To further expand on this work, Huang et al. developed a method to enable sequential NCL-STL reactions in the N-to-C direction, which was subsequently employed to assemble the same S-palmitoylated M2 ion channel target (18; Scheme 2C), together with an S-palmitoylated interferon-induced transmembrane protein 3 (S-palm IFITM3). [118] The synthesis of 18 was achieved through an initial NCL reaction between thioester 19 and a cysteinyl fragment 20 (bearing a masked SAL ester). C-Terminal activation through treatment with N-chlorosuccinimide (NCS)/AgNO3 generated the SAL active ester intermediate 21, which was reacted with the hydrophobic serinyl fragment 22 bearing the RBM (to aid solubility) under STL conditions to afford the M2 1-96 precursor 23. Target M2 protein 18 was generated through a final acidolysis of the remnant SAL oxazolidine.
Semisynthetic approaches provide a powerful means to access larger lipid-modified proteins and, importantly, can overcome the need for multiple ligation steps that are usually necessary when accessing proteins of more than 100 residues in length by total chemical synthesis. Although semisynthetic approaches have been used widely for the generation of modified proteins, there are limited examples with palmitoylated proteins. One notable example, however, was the generation of palmitoylated variants of the mouse prion protein (PrP; 24) using an EPL strategy (Scheme 2D). [55,123] The semisynthetic strategy relied on expression of recombinant murine PrP fragment 25 in fusion with an Mxe GyrA mini-intein and two affinity tags (His6-tag and chitin-binding domain).
Mercaptoethanesulfonate sodium salt (MESNa) was used to intercept the scissile Ser–Cys amide bond between the intein and the remainder of the rPrP protein, thereby generating the corresponding MESNa thioester 26. This was then used in a thiophenol-promoted ligation reaction with the palmitoylated peptide membrane anchor fragment 27 to provide the target lipidated protein 24. By using this strategy, five different palmitoylated variants were prepared in about 30 % yield and used to study the impact of lipidated PrP on the membrane structure and protein distribution in the membrane. [124] This approach was later elaborated for the construction of PrP variants with either N-terminal or centrally located truncations. [123]

Synthesis of Myristoylated Peptides and Proteins
N-terminal myristoylation is typically achieved by the direct coupling of a fatty acid or pre-activated fatty ester to the N-terminal amine of a protected peptide during SPPS. Given their relative ease of synthesis, there is an abundance of examples in the literature of short, myristoylated peptides that have been produced synthetically. [125][126][127] For example, Waldmann and co-workers prepared an eNOS peptide that was N-myristoylated and S-palmitoylated at two positions, using a combination of orthogonal enzyme-labile, acid-labile, and noble-metal-labile protecting groups in a fragment condensation approach. [128] However, this approach was low yielding (< 1 %) and laborious, which led the group to develop a linear solid-phase approach that made use of the Ellman sulfonamide resin linker (Scheme 3A). [113] Specifically, the authors expanded upon their solid-phase synthesis of resin-bound palmitoylated eNOS 1-16 (1; Scheme 1A), with a final Fmoc-deprotection and on-resin myristoylation with myristoyl chloride (Myr-Cl) to assemble the full-length resin-bound precursor 28. After alkylation of the resin linker (with iodoacetonitrile) and subsequent cleavage, this strategy provided the target myristoylated and dipalmitoylated eNOS 1-26 peptide 29 in a much-improved yield of 24 % after purification. Importantly, 25 mg of the eNOS peptide 29 could be prepared by this strategy, and the synthesis could be completed in days to weeks rather than months.
Access to larger myristoylated proteins is typically achieved by either semisynthesis or multicomponent ligation strategies. In an example of a protein prepared by chemical synthesis, Lu and co-workers accessed the 131-residue N-myristoylated HIV-1 matrix protein p17 1-131 (30) by a three-segment convergent NCL strategy (Scheme 3B). To access each peptide fragment, the group utilized the in situ neutralization method developed by Kent and co-workers for SPPS based on the Boc strategy. [129] Initially, the internal and Acm-protected thioester fragment 31 was ligated to C-terminal cysteinyl fragment 32 to afford the non-lipidated C-terminus of the protein (33). A final NCL reaction with the myristoylated N-terminal fragment 34, derivatized as a thioester, then provided the final myristoyl-p17 protein (30). Importantly, access to this synthetic myristoylated HIV-1 matrix protein 30 allowed the authors to study the "myristoyl switch" hypothesis, which relates to the ability of the protein to interact with the cell membrane in a reversible manner.
Chemoenzymatic and metabolic approaches have also been used to access N-myristoyl analogues substituted with terminal azide or alkyne functionalities, as exemplified by the enzymatic myristoylation of PfARF1 1-15 (35) by Tate and co-workers to produce both myristoylated and azido-myristoyl derivatives of PfARF1 1-15 (36; Scheme 3C). [130][131][132] These reactions are catalyzed by N-myristoyltransferases, which recognize the N-terminal GXXXS motif and, therefore, allow the fluorescent labeling and imaging of myristoylated proteins within cells. These approaches have been reviewed previously and will not be discussed any further here. [133]

Synthesis of Prenylated Peptides
As with other lipidated peptides, the most common method for accessing prenylated peptides is by Fmoc SPPS-based procedures. The lipid can either be incorporated directly on-resin through the use of a pre-prenylated amino acid building block [134] or, alternatively, peptide precursors can be lipidated after cleavage from the resin through solution-phase alkylation reactions. [135][136][137][138] The reactive nature of the prenyl modification and its tendency to isomerize can present a number of challenges during synthesis. For example, the alkene functionality is easily degraded under acidic or reducing conditions; hence, acid-labile or hydrogenolytically labile resin linkers or protecting groups are not suitable for use with prenylated peptides. [116] Moreover, prenyl groups must be compatible with the coupling conditions of the lipidated Cys building block. [116] As Cys is prone to racemization, coupling conditions have been extensively studied and optimized. Importantly, it has been shown that a 1:1 mixture of HBTU/HOBt or HCTU with trimethylpyridine (TMP) as a base in CH2Cl2/DMF (1:1 v/v) leads to minimal racemization of the residue when coupling to the solid phase. [64,139,140]
A common strategy for the synthesis of prenylated peptides relies on the use of hyper-acid-labile 2-chlorotrityl chloride (2-CTC) resin linkers. This approach enables cleavage of the lipopeptide from the resin using very mildly acidic conditions [for example, 1 vol % TFA or fluorinated alcohols such as trifluoroethanol (TFE) or hexafluoroisopropanol (HFIP)], which are compatible with the prenyl modifications. The only drawback of this strategy is that cleavage from the 2-CTC resin liberates a C-terminal carboxylic acid, whereas most prenylated proteins natively possess a C-terminal methyl ester.
Conveniently, peptides can instead be anchored to the resin through the side chain of an amino acid bearing a functionalizable side chain, which allows an appropriate amino acid methyl ester to be coupled to the C-terminus. By using this approach, Waldmann and co-workers prepared a farnesylated K-Ras 4B peptide methyl ester (37) starting from side-chain anchored Fmoc-Lys-OAll (38; Scheme 4A). [141,142] From here, deprotection of the allyl ester and subsequent coupling with H2N-Cys(Far)-OMe generated resin-bound lipopeptide 39. This farnesylated dipeptide was subsequently elongated by Fmoc-SPPS to construct the full-length farnesylated K-Ras 4B (40) on the resin, before a final acidolytic cleavage afforded the target peptide 37 in 11 % overall yield.
An alternative approach to generating the C-terminal methyl ester is to employ an oxidatively labile hydrazide linker, which enables cleavage of the peptide under oxidative conditions that are orthogonal to standard Fmoc-SPPS protecting groups. In this manner, Waldmann and co-workers demonstrated the first differential lipidation of a peptide on an acid-stable solid phase by selective deprotection of a Cys(Trt)-containing intermediate 41 under mild acidolytic conditions and a subsequent alkylation with farnesyl bromide to provide 42 (Scheme 4B). [136] Extension by Fmoc-SPPS, including the coupling of Cys with orthogonal protection of the side chain by monomethoxytrityl (Mmt), provided resin-bound intermediate 43, which was selectively deprotected and acylated with palmitoyl chloride. A final oxidative cleavage from the resin in the presence of methanol provided the palmitoylated and farnesylated N-Ras 180-186 (44) as the corresponding C-terminal methyl ester.
The Ellman sulfonamide linker has also found utility for the preparation of lipopeptide thioesters bearing prenyl and fatty acyl groups. The benefit of this linker in this scenario is that it is very stable to both acid and base treatment, yet can be selectively alkylated, for example with iodoacetonitrile, to activate it for cleavage. Specifically, alkylation of the sulfonamide renders the carbonyl moiety electrophilic, so that it can be reacted with nucleophiles to modify the peptide C-terminus. In the case of farnesylated N-Ras 180-186 (45), Waldmann and co-workers demonstrated that after the generation of resin-bound farnesyl-peptide 46 and subsequent alkylation of the linker with iodoacetonitrile to provide 47, methanol could be used to generate the target N-Ras 180-186 C-terminal methyl ester 45 (Scheme 4C). [113,116] It should be noted, however, that undesired alkylation reactions elsewhere on the peptide can occur when alkylating the sulfonamide linker and, as a result, can lead to diminished yields. It has been shown that similar cassette strategies to those outlined above can also be adapted to the synthesis of geranylgeranylated peptides through alkylation of suitable peptide substrates with geranylgeranyl halides. [143]
In an alternative approach, it has been shown that aziridine-2-carboxylic acids can be incorporated into peptide sequences by SPPS, and subsequent site- and stereoselective opening can be performed on-resin with suitable thiol nucleophiles such as farnesyl thiol to generate prenylated peptides. This elegant approach was showcased by Gin and co-workers, whereby Fmoc-aziridine-2-carboxylic acid (Fmoc-Azy-OH) was installed on a resin-bound tripeptide 48 using Fmoc-SPPS to afford pentapeptide 49 (Scheme 5). [144] Interception of the aziridine functionality with farnesyl thiol under basic conditions, prior to cleavage from the resin and deprotection, then provided S-farnesylated peptide 50. This approach, however, has yet to be demonstrated on larger peptidic systems, and its applicability in the presence of all proteinogenic amino acids (e.g. Cys residues that may cross-react with the aziridine moiety) has not yet been explored.

Synthesis and Semisynthesis of Prenylated Proteins
One group of prenylated proteins that has been intensively studied belongs to the Ras superfamily of GTPases, with a number of important biological studies underpinned by the ability to access pure versions of these modified proteins through semisynthetic methods. [116] The Ras superfamily belongs to the class of monomeric G proteins that are involved in many cellular processes, including signal transduction and cell-cycle regulation. [145] They can switch between an active, GTP-bound state and an inactive, GDP-bound state. In their active form, they can either stimulate or inhibit cellular processes through interaction with various effectors. [64] As they are important molecular switches, dysregulation of Ras proteins is highly relevant in the development of cancer. [146] For this reason, the genes that code for these proteins are some of the most important human oncogenes [147] and continuously activated Ras proteins are found in 30 % of all solid human tumours. [64,148] The best studied proteins in the family are the three isoforms K-Ras, N-Ras, and H-Ras, [116] which share approximately 90 % sequence identity in the first 168 residues; most of the variation between them arises in the C-terminal region (20 residues), which also contains sites of post-translational lipidation. [146] The three most common types of lipidation found in Ras proteins are S-palmitoylation, S-prenylation, and N-myristoylation. These lipid modifications lead to the association of Ras proteins with membranes and are essential for their function.
Given the difficulty associated with isolating the full-length lipidated Ras proteins, many early structural and biochemical studies were performed exclusively with the soluble domain missing the unstructured C-terminal portion. [116] To better understand the effect of prenylation on the activity of Ras proteins, a number of groups subsequently developed several powerful methods to access natively modified Ras proteins. In the early 2000s, efforts by the Kuhlmann and Waldmann groups focused on using an expressed protein MIC-ligation between an expressed N-Ras 1-181 protein 51 (bearing a C-terminal Cys) and a lipidated maleimidyl peptide 52 to access both palmitoylated and farnesylated N-Ras 1-181 (53) with a non-native maleimide linker (Scheme 6A). [149,150] Irrespective of the non-native linker, these mimics could be efficiently incorporated into artificial membranes and exhibited affinity for effector proteins in vivo. [68,151] Furthermore, these semisynthetic proteins were used to study the palmitoylation cycle of N-Ras in cells. [26] The N-Ras protein was also equipped with a photoactivatable geranylbenzophenone analogue of the farnesyl modification, which was subsequently used to interrogate protein-protein and protein-lipid interactions in cells. [151]
Following the landmark studies from the Kuhlmann, Bastiaens, and Waldmann groups using MIC ligations, [68,152] the groups of Goody and Waldmann reported the first synthesis of a native geranylgeranylated Rab7 protein by an intein-mediated EPL strategy. [153] Briefly, they prepared a recombinant Rab7 protein segment C-terminally fused to an intein, which was cleaved by incubation with MESNa to provide the corresponding thioester. This Rab7 thioester segment was ligated to a synthetic geranylgeranylated N-cysteinyl peptide.
This enormously powerful EPL strategy has since been used to access geranylgeranylated Ypt1 GTPase, [154] prenylated Rab7, [143] mono-/digeranylgeranylated Rab7, [155] farnesylated Rheb and K-Ras4B, [94,156] and farnesylated Rheb proteins. [157] In most of these cases, finding a suitable detergent to enable the ligations to proceed at sufficient reaction rates and to keep the prenylated peptides and proteins solubilized in aqueous buffers was crucial. [143,158] It should be noted that, to date, Rab proteins have primarily been the targets of the EPL approach to access C-terminally lipidated proteins. This is in large part due to the availability of "solubilizing" binding partners for these proteins, such as the Rab escorting protein REP-1, which can be used to solubilize the resulting lipidated proteins during ligation and folding. Folding is the final and critical step in generating active, post-translationally modified semisynthetic proteins. Two other solubilizing chaperone-like proteins have also been applied to solubilize prenylated Rab proteins, namely, the GDP-dissociation inhibitor (GDI) [154] and the β-subunit of RabGGTase to renature the geranylgeranylated protein Rab7. [159] Indeed, although some Ras-type proteins including K-Ras4B and D-Ral have been accessed using the EPL strategy, [141] a chaperone is not widely available for many other members of the Ras protein family, which makes folding to the native form following ligation challenging.
Scheme 6. A) MIC-based ligation assembly of a farnesylated N-Ras analogue 53 by Waldmann and co-workers. [149] B) Late-stage prenylation of a cysteinyl subtilisin Bacillus lentus (SBL) mutant 54 via intermediary protein selenylsulfide 55 by Davis and co-workers. [162] C) Pd-catalyzed Tsuji-Trost allylation for the single or double prenylation of UBL3 [bearing one (57) or two (58) C-terminal Cys residues] by Becker and co-workers. [163]
Another avenue to access native prenylated proteins is through the late-stage chemoselective modification of proteins. Despite progress in the field of site-selective protein modification, there are still challenges to overcome with respect to the regioselectivity, chemoselectivity, and stability of the resulting proteins as well as the development of reactions that work efficiently in aqueous buffers at physiological pH and temperature to prevent denaturation of the target proteins. [160,161] For this reason, there are fewer examples of prenylated proteins generated by late-stage modification than for the ligation-based strategies outlined earlier. [160] An example of a late-stage lipidation method was reported by Davis and co-workers, which exploited the unique reactivity of selenylsulfides for the thiol-selective prenylation of proteins. [162] Specifically, the authors were able to pre-activate an S156C mutant of the model protein subtilisin Bacillus lentus (SBL; 54) containing one exposed Cys as a phenyl selenylsulfide 55 through a reaction with phenylselenyl bromide (Scheme 6B). This species was then reacted with a prenylated thiol in aqueous solution (containing 20 vol % DMSO to solubilize the highly hydrophobic prenyl thiols), which led to the formation of the asymmetric prenyl-protein disulfide-linked construct 56. In a similar fashion, both farnesyl and geranyl modifications of subtilisin were obtained (with > 50 % and > 90 % conversion, respectively). It should be noted that it was not possible to install the geranylgeranyl modification by this approach, most likely because of the insolubility of the geranylgeranyl thiol in the aqueous buffer systems required to solubilize the protein. A potential drawback of this method is the non-native disulfide linkage to the lipid modification, which can be cleaved under reducing conditions. [162]
Very recently, Becker, Breinbauer, and co-workers reported a method for the late-stage prenylation of expressed proteins by Pd-catalyzed Tsuji-Trost allylation (Scheme 6C). By using this approach, the authors were able to install farnesyl, geranyl, geranylgeranyl, and other non-native cargos on the C-terminal Cys residue of ubiquitin-like protein 3 (UBL3). In this case, protein variants bearing either one (57) or two (58) Cys residues were used, which afforded singly (59) or doubly (60) lipidated UBL3 proteins, respectively. Importantly, the prenyl modifications installed by the Tsuji-Trost allylation reaction possess a native thioether bond, which makes this approach a particularly promising new strategy for installing native prenyl modifications on peptides and proteins in solution. [163]
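When mass spectrometry is used to check that one of the acyl or prenyl modifications discussed so far has been installed, the expected mass increase follows from the elemental composition gained by the peptide or protein. The following Python sketch is an illustration only and is not taken from any of the cited studies: the names `MONOISOTOPIC`, `LIPID_FORMULAS`, and `mass_shift` are hypothetical, and the use of monoisotopic masses is an assumption (average masses may be more appropriate for intact proteins). It treats acylation as replacement of one hydrogen atom by the acyl group and S-prenylation as replacement of the thiol hydrogen atom by the isoprenoid; further native processing, such as C-terminal methyl ester formation, is not modelled.

```python
# Monoisotopic masses of the relevant elements in Da.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}

# Net elemental gain when the lipid replaces one hydrogen atom
# (assumption: no other change to the peptide or protein).
LIPID_FORMULAS = {
    "myristoyl": {"C": 14, "H": 26, "O": 1},
    "palmitoyl": {"C": 16, "H": 30, "O": 1},
    "geranyl": {"C": 10, "H": 16},
    "farnesyl": {"C": 15, "H": 24},
    "geranylgeranyl": {"C": 20, "H": 32},
}


def mass_shift(lipid: str, copies: int = 1) -> float:
    """Return the monoisotopic mass increase (Da) for `copies` lipid groups."""
    formula = LIPID_FORMULAS[lipid]
    single = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    return copies * single


if __name__ == "__main__":
    for name in LIPID_FORMULAS:
        print(f"{name:15s} +{mass_shift(name):.4f} Da")
```

For a target carrying several identical lipids, such as the triply S-palmitoylated caveolin-1, `mass_shift("palmitoyl", copies=3)` gives the combined increment relative to the unmodified sequence under the same assumptions.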

Synthesis of PE-Linked Peptides and Proteins
The synthetic addition of phosphatidylethanolamine (PE) to the C-terminus of peptides and proteins has proven extremely challenging due to the very hydrophobic nature of the PE moiety. One notable example that addressed the solubility problem was reported by Liu and co-workers, who accessed PE-modified LC3-II protein (61) in practical quantities. [164] Key to their success was the implementation of a photolabile solubilizing tag installed on an orthogonally protected resin-bound hexapeptide 62. After cleavage from the resin under mild acidolytic conditions, the resulting intermediate 63 was then coupled to 1,2-distearoylphosphatidylethanolamine (DSPE) in the presence of DIC/HOAt to afford 64. A final NCL between this cysteinyl PE-modified hexapeptide and an expressed LC3-II MESNa thioester (65) (generated through intein thiolysis) provided access to the target PE-linked protein 61 after UV-mediated removal of the photolabile solubility tag (Scheme 7).
In the same year as the example above, Wu and co-workers reported another semisynthesis of a PE-modified LC3 protein by a similar EPL strategy, and the resulting lipid-modified protein was used to study autophagy. [165] In this case, the LC3 protein was expressed as an N-terminal MBP and C-terminal intein fusion construct in E. coli, and treatment with MESNa gave the MBP-LC3 thioester with sufficient solubility to achieve the subsequent ligation to a PE-carrying peptide. To access the native full-length lipidated LC3, the MBP was eventually cleaved with TEV protease. The semisynthetic lipidated protein was shown to be functional through its interaction with the protease Atg4B and its activity in membrane tethering and fusion, which are key for the role of LC3 in autophagy. [166]

8. Synthesis of Cholesterol-Linked Peptides and Proteins
The C-terminal modification of proteins with a cholesterol molecule is responsible for controlling the localization of proteins at the cell membrane. One notable example is the hedgehog family of proteins, which are commonly modified with a C-terminal cholesterol moiety. [167] Waldmann and co-workers generated mimics of these cholesterol-modified hedgehog proteins by using a MIC ligation strategy involving an expressed protein fragment and a smaller synthetic peptide fragment bearing a C-terminal cholesterol moiety. Although these possess a non-native maleimidyl linker, the constructs enabled the authors to perform key experiments that revealed the ability of cholesterol alone to anchor proteins to membranes with affinities comparable to dual lipidation motifs, such as S-farnesylation with additional geranylgeranylation or S-palmitoylation, found on other lipidated proteins. [171] Teruya et al. have also reported the semisynthesis of GFP bearing a C-terminal cholesterol moiety as a model system.
To access this, the group used a GFP-thioester prepared using intein technology. This was ligated to a small peptide bearing a C-terminal cholesterol moiety, using a detergent to aid the solubility of the lipopeptide fragment. Confocal fluorescence microscopy was then used to study the localization of the protein within membranes. [172]
In a building block approach starting from cholesterol (66), Blixt and co-workers synthesized an azide-containing cholesterol derivative 67 for reaction with an alkynyl-amino acid 68 to generate a triazole-linked modified amino acid 69 (Scheme 8A). The applicability of this cholesterylated cassette 69 was then demonstrated through coupling of the building block to resin to generate 70, which could be elongated through standard Fmoc-SPPS to generate glycosylated model lipopeptide 71. [168] Ingallinella et al. took a solution-phase approach to the derivatization of cholesterol (66) through the reaction of cholesteryl bromide (72) with a C34 peptide 73, bearing an unprotected C-terminally positioned cysteine residue, to generate cholesterylated C34 (74; Scheme 8B). This allowed the authors to increase the antiviral potency of HIV-1 peptide fusion inhibitors by targeting them to the cell compartment where fusion occurs. [169]
Recently, Chilkoti and co-workers developed an elegant enzymatic method for accessing C-terminal cholesterol-modified peptides and proteins, such as elastin-like polypeptide (ELP). [173] This strategy involved the fusion of ELP to a secondary HhC protein (autoprocessing C-terminal domain of hedgehog protein; 75), which recognizes and binds cholesterol.
Upon the binding of cholesterol (66), an inteinlike N-to-S acyl shift, involving the HINT domain shared between the hedgehog protein and inteins,f orms ar eactive thioester intermediate 76.T his thioester subsequently reacts with the 3b-hydroxy group on an associated and proximal cholesterol (66)m olecule,w hich ultimately results in the extrusion of the HhC domain (77)a nd formation of the cholesterol-modified ELP (78;S cheme 8C). [170] Thea uthors used this approach to attach cholesterol to the bioactive peptide exendin-4, an approved peptide drug for type II diabetes.Importantly,the authors showed that the cholesterol modification led to self-assembly of the peptide into micelles, which then activated the glucagon-like peptide Ir eceptor with high potency. Given the ability of cholesterol to direct biomolecules to specific sites on membranes,i ncluding ordered domains (rafts), it is anticipated that the methods described above will continue to find widespread use in an umber of fields,r anging from chemical biology to drug discovery and delivery. [174][175][176] 9. Synthesis of GPI-Linked Peptides and Proteins Thefirst total synthesis of the native GPI anchor molecule was reported in the late 1990s. [177] Since this seminal report, several synthetic routes to the GPI anchor have been reported and have been reviewed elsewhere. [178,179] Although the early syntheses of the GPI anchor represented substantial feats in synthetic organic chemistry,the molecules were not equipped with appropriate functionality for fusion to ap eptide or protein. In am ajor advance in the field of lipidated protein synthesis,i n2 004, Guo and co-workers employed ac onvergent strategy to assemble a1 2-residue GPI-anchored CD52 antigen peptide. 
[180] The group accessed the CD52 glycopeptide and GPI anchor separately, then fused both fragments using an HOBt/EDC-mediated coupling. This work was followed rapidly by reports detailing the synthesis of analogues of GPI-anchored proteins, including a GFP-GPI mimic [181] and an EYFP-GPI mimic. [182] By using an alternative strategy, Guo and co-workers employed a sortase A-mediated ligation to modify peptides and small proteins with GPI anchors. [183] However, this enzymatic approach suffered from two major drawbacks: 1) the attachment of one or two non-native Gly residues to the phosphoethanolamine moiety of the GPI anchor was necessary for the recognition of sortase A, and 2) the recognition sequence (LPXTG) introduced into the protein C-terminus resulted in a non-removable and non-native ligation scar in the final modified protein product. Nevertheless, this strategy could be employed for the efficient preparation of analogues of human CD52 and CD24 antigens as well as a GPI-anchored MUC1, containing a short peptidic sequence of the tumor-associated protein.
Another powerful approach for linking GPI anchors to peptides and proteins is through NCL. Nakahara and co-workers were the first to link thioester peptides and Cys-containing GPI using an NCL-based approach, [184] whereas Bertozzi and co-workers used EPL to fuse GPI analogues to recombinant proteins, specifically using this strategy to access GPI-modified GFP constructs that allowed them to probe the effect of these lipids on protein-membrane targeting and membrane diffusion. [46] Building on these seminal studies, Becker, Seeberger, and co-workers were able to develop a robust and generalizable semisynthetic strategy for the preparation of homogeneously GPI-anchored recombinant prion protein (rPrP; 79) based on an NCL platform. [185] Specifically, a synthetic Cys-tagged GPI anchor (80) was ligated to an expressed rPrP bearing a C-terminal MESNa thioester (81) by NCL (Scheme 9A). The ligation was performed at pH 7.8 in the presence of a thiophenol as a thiol additive, which led to the efficient generation of the GPI-anchored protein. Notably, no addition of detergents or lipids was required during the ligation (which was performed in standard 6 M Gdn·HCl, 0.3 M NaPi buffer) and the excess GPI anchor could be recovered and recycled after the reaction. Recently, Varón Silva and co-workers further improved on this approach by the integration of a one-pot ligation strategy to semisynthetically access complex GPI-anchored proteins (Scheme 9B). [186] For example, a similar synthetic Cys-containing GPI anchor (82) could be ligated with an active eGFP protein thioester formed in situ from the respective protein-Npu intein intermediate (83) to generate homogeneous GPI-anchored eGFP (84). A similar strategy has also been used for the successful semisynthesis of Thy1 and Plasmodium berghei ANKA MSP1₁₉ proteins, both of which bear homogeneous GPI anchors, although extended reaction times were necessary.
Although both sortase A and NCL-based ligation strategies provide an efficient means to link synthetic GPI molecules to proteins, each requires the use of non-native protein-GPI linkages, through either an additional peptidic recognition sequence (in the case of sortase ligation) or a remnant Cys residue following the NCL step. With this in mind, and with a view to generating truly native protein-GPI constructs, Zhu and Guo developed a method for coupling GPI to peptides and proteins through the use of the traceless Staudinger ligation (Scheme 9C). [187] In an example of this approach, the CD52 peptide (85) was synthesized by Fmoc-SPPS on a hyper-acid-sensitive 2-CTC resin and, following cleavage from the resin, was converted into the respective phosphinothioester (86); this could be further deprotected upon acid treatment to afford phosphinothioester 87. Ligation of either 86 or 87 with azide-functionalized GPI (88) proceeded smoothly, thereby providing the human CD52 antigen bearing a fully native linkage between the protein and the GPI anchor (89).
The strategic use of the traceless Staudinger ligation in this manner sets the scene for the generation of many more native GPI-anchored proteins in the future, including important proteins such as the CD48 antigen and carbonic anhydrase IV (both GPI-anchored through a C-terminal Ser) or the Eph receptor ligand ephrin A5 (which is GPI-anchored through a C-terminal Asn), none of which has been studied in homogeneous form to date. Key to such studies will be the implementation of recently developed predictive tools to identify new GPI-anchored proteins, such as PredGPI, [188] and the extension of phosphinothioester generation to larger and recombinant proteins, for which intein-based methods could potentially prove an enabling technology. However, given that the traceless Staudinger ligation can suffer from slower reaction rates compared to other ligation methods, it is possible that such larger protein phosphinothioesters may not ligate as efficiently as the smaller peptide examples explored to date. [189]
A limiting factor in the approaches described above is the availability of sufficient amounts of functionalized GPI anchor, as these species are difficult to prepare by multistep synthetic routes. Furthermore, difficulties encountered while handling these native lipid-modified proteins as a consequence of solubility problems and/or amphipathic properties have meant that many researchers have turned to the use of less-complex GPI core structures in synthetic and semisynthetic campaigns. [190] A potential solution is to harness natural GPI anchors made by cells; however, only a few examples of this approach for the generation of GPI-anchored peptides have been reported to date. For example, Schumacher et al. described the generation of a GPI-anchored peptide with a free N-terminal Cys in yeast that can be used in ligation reactions with peptide or protein thioesters.
[191] Alternatively, Dhar and Mootz reported the innovative use of a split intein-based (Npu DnaE) system that relies on expressed GPI-anchored peptides fused to a C-terminal intein segment, with subsequent trans-splicing with another protein bearing an N-terminal intein (in this case the model protein eGFP). [192]
10. Summary and Outlook
The methods and examples described in this Review summarize the current status of (semi-)synthetic strategies to generate lipidated peptides and proteins. To provide an easily accessible picture of this field, we set out to provide a concise overview of the major classes of lipidation found on peptides and proteins. A succinct description of the most relevant chemical approaches to generate lipidated peptides and proteins has been provided. We highlight advantages and challenges of the individual strategies, which we further elaborate on when highlighting specific examples for each class of lipidation. From these examples, it is clear that our ability to assemble homogeneous lipid-modified proteins from segments made by SPPS and recombinant expression has significantly matured over the past decade and has served as the basis for a number of important fundamental discoveries in biology and medicine.
Scheme 9. A) NCL-based assembly of a Cys-functionalized GPI anchor (blue) and a recombinant prion protein [rPrP bearing a C-terminal MESNa thioester (red)] by Becker, Seeberger, and co-workers. [185] B) One-pot NCL of a Cys-functionalized GPI anchor (blue) with an eGFP-Npu intein intermediate. [186] C) Staudinger-based synthesis of trimannose GPI-modified CD52 by Guo and co-workers. [187] The peptide was synthesized by Fmoc-SPPS on an acid-sensitive 2-CTC resin. Side chain protected CD52 was converted into its cognate phosphinothioester prior to Staudinger ligation with an azide-functionalized GPI. PG = protecting group.
However, the sensitive nature of linkages between lipids and proteins (e.g. thioesters), the chemical complexity of specific lipid modifications (e.g. GPI or PE anchors), and the impact of lipid modifications on ligation yields as a result of increased hydrophobicity or amphipathicity still make these synthetic and semisynthetic endeavors incredibly challenging. Although in some cases these limitations can be side-stepped by introducing non-native linkages between lipid(s) and proteins, this raises concerns about the functional consequences of introducing these artificial variations.
Despite significant progress over the past decade, as highlighted in this Review, our knowledge of the functional roles of different lipidation modifications, and patterns thereof, remains incomplete. However, it is envisaged that further extensions and improvements to ligation-based protein synthesis methods such as NCL (and EPL), DSL, STL, and KAHA will continue to drive the field forward. One key requirement will be to perform ligation reactions at lower concentrations, which can be achieved with further extensions to the DSL method and by employing solubilization tags on lipidated peptide segments to improve NCL, EPL, and even protein trans-splicing (PTS) reactions that rely on split inteins. [2,77] We anticipate that additional progress in the field will be made through the combination of the sophisticated ligation strategies described above with the development of novel chemo- and regioselective modification reactions, for example, lipidation of unprotected cysteine residues. [193,194] A number of new approaches have recently been developed towards this end, driven by the need for efficient conjugation reactions to generate selectively modified protein therapeutics. These methods can now be repurposed for late-stage lipidation, thus avoiding handling problems during protein synthesis. Similarly, enzyme-mediated strategies offer opportunities for the synthesis of lipidated proteins at two distinct steps: first, for protein assembly through the use of enzyme-mediated ligation strategies (e.g. using engineered proteases or specific peptide ligases), [195] and second, for late-stage enzymatic lipidation. [196] The use of these emergent approaches, either alone or in combination, should enable more efficient and robust access to lipidated proteins, thus accelerating efforts to study the roles of lipidation in fundamental biological studies, as well as providing high-quality lipidated proteins for the biotechnology and pharmaceutical sectors.