Making Circles: Recent Advance in Chemical and Enzymatic Approaches in Peptide Macrocyclization

Macrocyclic peptides have emerged as an important class of molecules for drug discovery because of their enhanced stability and bioavailability. Since the 1990s, ligation chemistries have been extensively studied for preparing peptide macrocycles. These chemistries usually proceed through an energetically favored capture step coupled with an acyl transfer reaction, which overcomes the entropic penalty of ring formation and facilitates formation of the cyclic amide bond. Alongside the chemical methods, several enzymatic approaches utilizing peptide ligases have also been explored. These peptide ligases are enzymes involved in the production of naturally occurring cyclic peptides in diverse organisms. Many of them have been isolated and shown to be highly efficient in producing cyclic peptides. Here, we review recent advances in chemical ligation and enzymatic approaches for making cyclic peptides.


Introduction
In the last few decades, peptides have been at the center of drug discovery and have been applied in a wide range of areas, including medicine and biotechnology. Peptides are now among the most popular candidates in clinical trials, and an increasing number of FDA-approved peptide-based drugs are reaching the market. Surprisingly, before the early 1980s, peptides were not regarded as druggable molecules because of their instability and poor bioavailability. They are susceptible to chemical and physical denaturation and to enzymatic hydrolysis, and they show low membrane penetration and poor gut absorption. The turning point in this perception was the introduction of a paradigm-shifting technology, Solid Phase Peptide Synthesis (SPPS) [1]. SPPS allows routine and automated chemical synthesis of long peptide chains. The ease of peptide preparation enables researchers to design or identify novel peptides through rational design and combinatorial peptide libraries. Most importantly, SPPS allows extensive structural and side-chain modifications as well as the incorporation of unnatural amino acids, which marked the beginning of turning peptides and proteins from "food" into drugs. Among all the modification strategies, backbone cyclization is one of the most common and efficient approaches to turn peptides into druggable compounds.
Macrocyclic peptides have emerged as an important class of molecules in drug discovery because of their enhanced conformational rigidity and resistance to proteolytic degradation. In many cases, macrocyclization increases receptor binding affinity, confers druglike properties, and improves oral bioavailability [2,3]. A large number of natural macrocyclic peptides, ranging from 5 to 78 residues, have been found in diverse organisms [4]. A wide variety of bioactivities, including anticancer, antimicrobial, and immunosuppressive activity as well as oxytocin receptor and neurotensin receptor binding, have been identified in the cyclic peptide families. Furthermore, large macrocycles are well suited for inhibiting protein-protein interactions, which are often regarded as "undruggable" targets by conventional drug-discovery approaches [5]. For more detailed study of these cyclic peptides, chemical synthesis was initially the only way to obtain them, and cyclization was a formidable challenge in the early days. The classical method uses side-chain protected peptides and performs cyclization through direct lactamization between the N- and C-termini via strong enthalpic activation. This method was widely used in the 1980s to synthesize cyclic peptides of 5 to 12 amino acids [6]. However, the efficiency of this approach is highly sequence-dependent, and extensive time and effort are required to optimize the coupling conditions so as to minimize polymerization and epimerization and to enhance solubility. Moreover, this method can only cyclize peptides of fewer than 15 amino acids because of the unfavorable entropy of forming large rings. To overcome these problems, advanced chemical and enzymatic ligations have been employed to perform intramolecular ligation between the N- and C-termini [7].
In this review, we discuss state-of-the-art methodologies, including chemical ligation reactions and ligase-mediated ligation, that have been used to produce end-to-end backbone-cyclized peptides.

Chemical Ligations
In contrast to the classical approach, chemical ligation joins two unprotected peptide fragments unambiguously without the use of any coupling reagent. In general, chemical ligation is a two-step reaction in which a chemoselective capture reaction is followed by a proximity-driven acyl transfer reaction to generate a native amide bond at the ligation site [8]. The chemoselective capture forms a new covalent bond between an orthogonal pair of reactive groups; it brings the two peptides together while discriminating against the amino acid side chains, which bear different functionalities. After the capture reaction, a spontaneous, energetically favored, proximity-driven O-to-N or S-to-N acyl transfer occurs and results in an amide bond. Such acyl transfer reactions are common in enzymatic bond-breaking and bond-forming reactions, and these ligations are therefore also called biomimetic ligations. In the 1980s, acyl transfer was first demonstrated by Kemp et al., who used an organic template to join peptides via O-to-N acyl transfer [9]. Based on Kemp's reaction, a series of biomimetic ligation chemistries has been developed and continues to evolve. In this article, the chemical ligations are grouped into two categories: O-to-N and S-to-N acyl transfer-mediated ligation.

O-to-N Acyl Transfer-Mediated Ligation
Pseudoproline ligation: In 1993, Tam developed the pseudoproline ligation, which can be performed in water without protecting groups, coupling reagents, or organic templates. The reaction occurs between a C-terminal ester aldehyde and a bifunctional N-terminal amino acid bearing a free amine and a beta-hydroxyl or beta-thiol group (Cys, Thr, or Ser) [10,11]. The N-terminal amino group forms a reversible imine with the C-terminal ester aldehyde. The nucleophilic group at the beta position then attacks the imine to form an oxazolidine or thiazolidine ring. Finally, a spontaneous O-to-N acyl transfer, driven by the five-membered ring, leads to a stable amide bond bearing a thiazolidine/oxazolidine (Figure 1A). In 1996, Botti et al. used pseudoproline ligation for intramolecular ligation of amino acid sequences derived from the V3 loop of gp120 of HIV-1 to form a series of cyclic peptides. The results showed that the method could cyclize peptides ranging from 5 to 26 amino acids [12].
Serine/Threonine ligation: In 2010, Li reported a modified version of pseudoproline ligation, named Ser/Thr Ligation (STL), that generates a natural Ser/Thr residue at the ligation site [13]. Instead of an ester aldehyde, a salicylaldehyde (SAL) ester is placed at the C-terminus of the peptide. As in pseudoproline ligation, the aldehyde of SAL forms an oxazolidine ring with the N-terminal Ser/Thr, here in 1:1 pyridine:acetic acid as solvent. A 6-membered-ring O-to-N acyl transfer then gives a stable, amide-linked benzylidene acetal intermediate. The unnatural benzylidene acetal linkage can be removed by acid treatment, which generates a natural amide bond with Ser/Thr at the ligation site (Figure 1B) [14]. STL has been used to synthesize different proteins [15][16][17]. It has also been used to cyclize the naturally occurring anticancer peptides yunnanin C and cyclomontanin B as well as cyclic peptide analogues [18][19][20]. This work demonstrated the ease of generating an analogue library of yunnanin C under a single cyclization condition. Lam et al. reported the first chemical synthesis of the FDA-approved cyclic lipopeptide daptomycin via STL cyclization. The successful synthesis of daptomycin enables structure-activity relationship studies and analogue design [21]. In 2013, our group showed that, owing to the unique STL mechanism, highly constrained cyclic tetrapeptides could be cyclized without any flexible residues (Gly), turn-inducing residues (Pro), tertiary amides, or D-amino acids [22]. This cyclization plausibly overcomes the energetically disfavored ring closure through a ring-contraction strategy: a 16-membered ring forms first during oxazolidine formation, and this ring then undergoes O-to-N acyl transfer to give, after acid treatment, the 12-membered cyclic tetrapeptide.
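The 12-membered ring quoted for a cyclic tetrapeptide follows from simple counting of backbone atoms. The following is a minimal sketch of that arithmetic, assuming a head-to-tail macrocycle built only from α-amino acids (three backbone atoms per residue); the function name is hypothetical, and the larger 16-membered ligation intermediate is not modeled.

```python
# Backbone ring size of a head-to-tail cyclic peptide.
# Assumption: every residue is an alpha-amino acid, which contributes
# three backbone atoms (N, C-alpha, C') to the ring.
BACKBONE_ATOMS_PER_ALPHA_RESIDUE = 3


def backbone_ring_size(n_residues: int) -> int:
    """Return the number of ring atoms in a head-to-tail cyclic peptide."""
    if n_residues < 2:
        raise ValueError("a head-to-tail macrocycle needs at least two residues")
    return BACKBONE_ATOMS_PER_ALPHA_RESIDUE * n_residues


if __name__ == "__main__":
    # Tetrapeptide, pentapeptide, and a 12-residue peptide.
    for n in (4, 5, 12):
        print(f"{n} residues -> {backbone_ring_size(n)}-membered ring")
```

Under this assumption, the small rings discussed here (4 to 5 residues) correspond to 12- to 15-membered macrocycles, which is why they are regarded as highly constrained.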
α-ketoacid-hydroxylamine (KAHA) ligation: Bode's group reported the robust KAHA ligation for joining peptides [23]. Three types of KAHA ligation were introduced, which differ in the N-terminal hydroxylamine: a free hydroxylamine, an O-substituted hydroxylamine, or a cyclic alkoxyamine. These reactions proceed through different pathways to form an amide bond on unprotected peptides. In general, the reaction takes place between the N-terminal hydroxylamine and the C-terminal α-ketoacid and results in a native amide bond with homoserine at the ligation site (Figure 1C). KAHA ligation has been used in the synthesis of gramicidin S, tyrocidine A, hymenamide B, semi-gramicidin S, and stylostatin A, as well as a library of cyclic peptide analogues [24,25].

S-to-N Acyl Transfer-Mediated Ligation
Native chemical ligation: After the introduction of pseudoproline ligation in 1994, Kent and coworkers introduced Native Chemical Ligation (NCL) in the same year, and Tam et al. reported it shortly afterwards in 1995 [26,27]. It was the first ligation to generate a natural amino acid at the ligation site. NCL plays a significant role in peptide chemistry and is the most widely used ligation method in protein synthesis [28][29][30]. NCL uses a highly chemoselective thiol-thioester exchange between the N-terminal Cys and the C-terminal thioester, which gives a thioester-linked intermediate. This thioester then undergoes a spontaneous, proximity-driven S-to-N acyl transfer (Figure 1D). The superior nucleophilicity of the thiol in basic buffer permits a robust and efficient thiol-thioester exchange as well as an energetically favored S-to-N acyl transfer. In 1999, Tam and co-workers used NCL for the macrocyclization of cyclotides, naturally occurring cyclic peptides of over 30 amino acids with three intramolecular disulfide bonds [31]. During the synthesis of four cyclotides, kalata B1, cyclopsychotride, circulin A, and circulin B, they discovered that the cyclization rate and yield were enhanced by the presence of internal Cys residues. In the same year, the "thia-zip" mechanism was proposed, which hypothesizes that trans-thioesterification occurs intramolecularly to form thiolactone macrocycles [32]. The reversible chain of trans-thioesterifications begins at the Cys adjacent to the thioester and ends at the N-terminal Cys. Lastly, an S-to-N acyl transfer occurs at the N-terminal Cys and leads to the formation of an amide bond (Figure 2). This ring-expansion process brings the two ends into proximity, which allows cyclization across a long stretch of peptide chain.
Cyclotides and other biologically important cyclic peptides have also been synthesized and engineered by NCL-mediated cyclization, including conotoxin, sunflower trypsin inhibitor, defensins, MCoTI-I and MCoTI-II, and a series of engineered cyclic peptides [33][34][35][36][37]. Collectively, these works showed that the cyclized peptides have higher serum stability and that some of the analogues show superior oral bioavailability and membrane penetration. Besides traditional solution-phase NCL, Barany and coworkers reported solid-phase, on-resin NCL [38]. In this method, the peptide is attached to the resin through the side chain of Fmoc-Asp-allyl. After peptide elongation, the C-terminal allyl protecting group is removed and thiophenol is coupled to the C-terminus to form a peptide thioester on-resin, which is followed by NCL cyclization.
Despite the high efficiency of NCL, the synthesis of the base-labile peptide thioester requires Boc chemistry, which in turn requires a strong acid such as hydrogen fluoride or TfOH for side-chain deprotection. Therefore, new methods for preparing peptide thioesters based on Fmoc chemistry were introduced in the last decade (Figure 3). Tate's group proposed the use of a "safety-catch linker" to perform NCL and the cyclization of the cystine-knot peptide MCoTI-II [39,40]. The C-terminal sulfonamide can be activated by iodoacetonitrile and cleaved by 3-mercaptopropionate to form a peptide thioester. Another well-studied approach uses an N-to-S acyl transfer reaction to generate the peptide thioester. N-4,5-dimethoxy-2-mercaptobenzyl (Dmmb) [41], N-methyl Cys [42], and Weinreb amide derivatives [43] have been used as surrogates to promote the acyl transfer and were demonstrated in the synthesis of model peptides. In 2010, Melnyk and Liu independently reported the use of bis(2-sulfanylethyl)amino (SEA) or N,N-bis(2-mercaptoethyl)-amides (BMEA) to generate peptide thioesters [44][45][46][47]. At pH 7 and in the presence of reducing agents, the C-terminal surrogate group undergoes N-to-S acyl transfer to generate the peptide thioester. Because the N-to-S acyl transfer is controllable, SEA ligation can perform tandem ligations for large protein synthesis as well as controlled peptide cyclization [48]. Taichi et al. designed the thioethylalkylamido (TEA) linker, which can be used under milder acidic conditions at pH 3.0 [49]. They have successfully used this method to produce cyclic peptides including kalata B1, cyclic omega-conotoxin MVIIA, and an engineered sunflower trypsin inhibitor [34,50,51]. Recently, Liu and co-workers introduced the use of peptide hydrazides to generate peptide thioesters via a one-pot, two-step mechanism. Under acidic pH, the peptide hydrazide reacts with NaNO2 to generate an acyl azide at the C-terminus [52][53][54][55]. An added thiol then attacks the acyl azide to give the peptide thioester, and proteins or cyclic peptides can subsequently be synthesized through the NCL mechanism. In 2011, Zheng et al. used the hydrazide method to synthesize a series of cyclic peptides ranging from 5 to 42 amino acids in ~20% yield [56]. With this method, Möbius and bracelet types of cyclotides as well as the trypsin inhibitor MCoTI-II were synthesized.
Cysteine-free ligation: Despite the popularity of NCL, it is limited by the low abundance of Cys in natural peptides and proteins. Thus, tremendous effort has been made to develop methods that perform NCL without a Cys at the ligation site. In 1998, Tam and Yu reported the methionine ligation. They used a homocysteine to perform NCL, followed by S-methylation to generate methionine at the ligation site (Figure 1E) [57]. Another approach is to prepare thiol-containing unnatural amino acids for NCL, followed by desulfurization to create a natural amino acid (Figure 1F). Dawson et al. demonstrated the use of Raney nickel or palladium on alumina to convert Cys to Ala and successfully synthesized the 21-amino acid cyclic antibiotic microcin J25 [58]. Quaderer and Hilvert used selenocysteine instead of cysteine to perform NCL at a much faster rate and form a cyclic RGD-motif peptide. Deselenization with Raney nickel then converted selenocysteine to alanine [59]. Danishefsky and co-workers reported the use of tris(2-carboxyethyl)phosphine (TCEP) and the radical initiator VA-044 to perform desulfurization [60]. They demonstrated this powerful desulfurization by converting Cys to Ala in intermolecular ligation as well as in the cyclization of the naturally occurring peptide crotogossamide.
Traceless Staudinger ligation: The traceless Staudinger ligation was developed by Bertozzi based on the Staudinger reaction. The ligation takes place between a C-terminal phosphine-bearing thioester and an N-terminal azide, which form an iminophosphorane intermediate; acyl transfer from the thioester to the aza-ylide then forms a native amide bond (Figure 1G) [61]. Hackenberger employed the traceless Staudinger ligation to cyclize both protected and unprotected peptides [62].

2-Formylthiophenol ligation: Recently, Tung et al. reported the 2-formylthiophenol ligation, in which a peptide formylthiophenol thioester reacts with an amino group to form an amide bond regardless of the N-terminal amino acid, including Pro [63]. The authors demonstrated the synthesis of five naturally occurring anticancer peptides of different sizes (5-11 amino acids) in over 40% isolated yield. A hemiaminal formed between the aldehyde and the amine is suspected to be the intermediate, which is followed by S-to-N acyl transfer to generate an amide bond in a basic organic solvent. However, this mechanism cannot distinguish the N-terminal amine from the side-chain amine of Lys. Therefore, the cyclic peptides generated cannot contain internal Lys residues.
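The N-terminal residue requirements stated above for the individual chemical ligations can be collected into a small lookup. The following is a hedged sketch only: the function name and the example sequences are hypothetical, it considers only the proteinogenic N-terminal residue and the Lys restriction described in the text, and it ignores the C-terminal handle each method needs (ester aldehyde, SAL ester, thioester). KAHA and Staudinger ligations are omitted because they require non-proteinogenic N-terminal groups.

```python
from typing import List


def candidate_ligations(linear_sequence: str) -> List[str]:
    """List chemical ligations whose stated N-terminal requirement is met.

    The sequence is given in one-letter code, N- to C-terminus.
    """
    seq = linear_sequence.upper()
    if len(seq) < 2 or not seq.isalpha():
        raise ValueError("expected a one-letter amino acid sequence")

    n_term = seq[0]
    candidates: List[str] = []

    # Pseudoproline ligation: N-terminal Cys, Thr, or Ser.
    if n_term in "CST":
        candidates.append("pseudoproline ligation")
    # Ser/Thr ligation: N-terminal Ser or Thr.
    if n_term in "ST":
        candidates.append("Ser/Thr ligation (STL)")
    # Native chemical ligation: N-terminal Cys.
    if n_term == "C":
        candidates.append("native chemical ligation (NCL)")
    # 2-Formylthiophenol ligation: any N-terminal residue (including Pro),
    # but Lys side chains compete; treated conservatively as "no Lys at all".
    if "K" not in seq:
        candidates.append("2-formylthiophenol ligation")

    return candidates


if __name__ == "__main__":
    for example in ("CGGRA", "SAKLG", "PAAGL"):  # hypothetical sequences
        print(example, candidate_ligations(example))
```

Because the ligation site of a cyclic peptide can be placed between any two residues, such a check would in practice be run over every circular permutation of the target sequence.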

Enzymatic Ligation
A chemoenzymatic approach using a peptide ligase provides an attractive alternative and is useful for the cyclization of peptides that do not contain cysteine. Furthermore, the use of enzymes as biocatalysts for cyclization has received increasing attention because of their high chemoselectivity, lack of toxicity, and cost-effectiveness. In this review, we focus on recent progress in chemoenzymatic methods for the preparation of macrocyclic peptides and proteins.

Intein
Inteins are autocatalytic protein splicing elements. They induce self-excision and subsequent ligation of the two flanking sequences, named exteins, through the formation of a native peptide bond. Protein splicing was first observed in 1988 in the yeast trifluoperazine-resistant TFP1-408 gene [64][65][66], and was confirmed in the yeast vacuolar membrane ATPase VMA-1 gene [67] and the yeast TFP-1 gene in 1990 [68]. To date, more than three hundred inteins have been identified, occurring widely in all three domains of life [69]. The general mechanism of intein splicing has been elaborated as a four-step acyl shift reaction: (1) an N-to-O/S acyl shift between the N-extein and the intein affords an active ester/thioester bond at the splice site, where the amide bond is held in a cisoid conformation by the intein [70]; (2) transesterification occurs between the first residue of the C-extein (Cys, Ser) and the newly formed ester/thioester; (3) cyclization of the C-terminal Asn of the intein releases the intein; and (4) an O/S-to-N acyl shift in proximity affords an amide bond between the N- and C-exteins. In application, inteins are usually engineered into shorter mini-inteins by removing the homing endonuclease element [71] or split into two for trans-splicing [72,73].
Two recombinant approaches have been developed for intein-mediated protein cyclization: Expressed Protein Ligation (EPL) [74] and intein-mediated Protein Trans-Splicing (PTS) [75]. The target protein is expressed as a fusion with the selected intein at its C-terminus or at both termini, followed by an in vitro or in vivo cyclization reaction. The intein mediates formation of an ester/thioester at the C-terminus of the target protein, which requires an incoming nucleophile at the N-terminus. Thus, the target protein needs to start with Cys or Ser/Thr [76][77][78][79]. By constructing a series of precursor peptides derived from the cyclotide hedyotide B1, it has been shown that the C-terminal residues adjacent to the intein also influence cyclization efficiency. Steric effects from β-branched residues and backbone hindrance markedly reduce the cyclization yield [80]. The intein-mediated splicing process has been developed into various technologies, including affinity tags [81][82][83][84][85][86][87], ligation and cyclization [74,75,78][88][89][90], protein semi-synthesis, and biosensors [91,92]. Nevertheless, the need for recombinant expression as a fusion, rather than use of the intein as a stand-alone reagent, limits its adoption as a general ligation tool. Its requirement for an N-terminal Cys/Ser/Thr further narrows its versatility.

Subtiligase
Subtiligases are engineered enzymes derived from the protease subtilisin BPN' isolated from Bacillus subtilis. The first such ligase, named thiolsubtilisin, was designed by converting the active hydroxyl group in the catalytic triad to a thiol under organic conditions [93,94], and it was later modified through the single mutation S221C, which allows ligation to take place under aqueous conditions [95]. The double mutant S221C/P225A, introduced by Wells' group in the 1990s, further increased the catalytic efficiency [96]. Structural studies by X-ray crystallography showed that the ligation mechanism involves a thiol-mediated nucleophilic attack on an amide/ester bond in the substrate, which forms a thioester intermediate, followed by aminolysis by the N-terminal amine of the incoming peptide. C-terminal modification to a glycolate-Phe-amide ester facilitates subtiligase-mediated ligation. The optimized subtiligase displays fast catalysis (kcat of up to 20 s⁻¹), making it a powerful tool in protein semi-synthesis and engineering [97]. In general, subtiligase-mediated ligation takes place in aqueous buffer at pH 8.0 and room temperature within 1 to 3 h.
Subtiligase has been shown to cyclize peptides of 12 to 31 amino acids with yields of 30%-85% (Table 1). The catalytic efficiency of subtilisin was found to be influenced by the residues at the P1-P4 and P1′-P3′ positions. These sites have been extensively investigated by peptide libraries and phage display [96,98]. Mutagenesis studies have further expanded the substrate selectivity of subtiligase and enhanced its stability (Table 2) [96,99]. The enzyme also requires an extended structure at the N-terminus of the substrate, so a ligation site within a helix should be avoided during experimental design. Subtiligase mediates fast reactions for protein terminal labelling [100], cyclization [101], and the preparation of peptide thioesters and thioacids [102,103]. Because of the large binding surface involving seven sites, the minimal substrate length for macrocyclization is 12 residues [101]. The required C-terminal activation by esterification is easily achieved by chemical synthesis but remains a limitation for recombinantly prepared substrates.

Sortase A
Sortase A (SrtA) is a membrane-bound Cys transpeptidase found in the Gram-positive bacterium Staphylococcus aureus. It anchors proteins bearing the sorting sequence LPXTG (X can be E, D, K, A, N or Q) to the bacterial cell wall [104]. Sortase A mediates the transpeptidation by forming a thioester linkage between Cys184 in its active site and the Thr of the LPXTG motif [105]. The protein-SrtA thioester intermediate is susceptible to nucleophilic attack by the penta-Gly branch of lipid II, which attaches the protein to the glycopeptide region of lipid II and is followed by transfer of the protein to the cell wall peptidoglycan [106]. Although it is not a cyclase by natural function, sortase A has been used for the cyclization of various peptides and proteins.
Sortase A-Mediated Ligation (SML) was first demonstrated by Mao et al. using the short peptides RE(Edans)LPKTGnK(Dabcyl)R (n = 0 to 5) with Gn-peptides (n = 0 to 5) [107]. They used fluorescence to measure the sortase-mediated cleavage rate and HPLC to monitor the ligation rate. The results showed that the minimal requirement for the N-terminal nucleophile is a single N-terminal Gly and that the number of N-terminal Gly residues affects the ligation rate. An earlier kinetic study revealed that the S1′ position of sortase A is specific for Gly and that the S2′ position also favors Gly [108], suggesting that sortase A prefers GG in the ligation reaction.
SML has been developed for protein labelling [109][110][111], macrocyclization [112,113], and ligation to peptides or non-peptidyl compounds (lipids, amino-poly(ethylene glycol), G3-polystyrene beads) [112,114]. It is compatible with both recombinant and chemically prepared peptides and proteins. However, sortase A has a long recognition signal and low catalytic efficiency, and it often requires an equimolar ratio of enzyme to substrate for efficient cyclization (Table 3). Recently, several independent research groups have made extensive efforts to improve the catalytic properties of sortase A by directed evolution.
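The sequence requirements described for sortase A, an LPXTG sorting signal (X = E, D, K, A, N or Q) together with at least one N-terminal Gly as the nucleophile, can be screened with a regular expression. The following is a minimal sketch under those stated rules only; the function name and the example sequence are hypothetical, and passing the check does not imply that a peptide will actually cyclize.

```python
import re

# LPXTG sorting signal with X restricted to the residues listed in the text.
SORTING_SIGNAL = re.compile(r"LP[EDKANQ]TG")


def sortase_cyclization_check(sequence: str) -> dict:
    """Report whether a linear precursor carries the motifs sortase A needs."""
    seq = sequence.upper()
    match = SORTING_SIGNAL.search(seq)  # first occurrence of the signal
    # Count consecutive Gly residues at the N-terminus (the nucleophile).
    n_terminal_gly = len(seq) - len(seq.lstrip("G"))
    return {
        "has_sorting_signal": match is not None,
        "signal_start": match.start() if match else None,
        "n_terminal_gly_count": n_terminal_gly,
        # Minimal requirement from the text: signal plus a single N-terminal Gly.
        "meets_minimal_requirements": match is not None and n_terminal_gly >= 1,
    }


if __name__ == "__main__":
    print(sortase_cyclization_check("GGAKWRSLPETGG"))  # hypothetical precursor
```

Since the text notes a preference for GG, a stricter screen could require `n_terminal_gly_count >= 2`.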

GmPOPB
GmPOPB is a member of the prolyl oligopeptidase (POP) family isolated from the poisonous mushroom Galerina marginata [115,116]. It shares 36% and 37% sequence identity with PCY1 and porcine POP, respectively. GmPOPB is involved in the biosynthesis of the toxin α-amanitin, an 8-residue cyclic peptide and a potent inhibitor of RNA polymerase II and III [117][118][119]. The amanitin precursor contains a 10-residue leader propeptide, a mature core peptide of 8 residues, and a 17-residue C-terminal propeptide [120][121][122]. Recombinant GmPOPB processes the amanitin precursor in two ordered steps, cleaving after two internal Pro residues to produce the mature cyclic peptide. In the first processing event, GmPOPB acts as a protease and cuts after the first Pro residue to release a 25-residue intermediate consisting of the core peptide and the C-terminal prosequence. In the second event, GmPOPB acts as a ligase: it cleaves after the second Pro and then ligates this residue to the N-terminal Ile to complete the macrocyclization.
GmPOPB is C-terminal specific for Pro at the P1 position and can accommodate precursor peptides of up to 35 amino acids. Little is known about the substrate requirements within the core peptide beyond the C-terminal Pro, because cyclization has been tested on only one example, the 8-residue peptide IWGIGCNP. The C-terminal recognition sequence, on the other hand, is better understood. Cyclization screening of a set of peptide substrates with varying precursor sequences showed that GmPOPB has a very stringent requirement for the 17-residue C-terminal propeptide, which adopts an α-helical structure (Table 4). Truncating the propeptide by five amino acids or changing selected residues to Gly causes a dramatic reduction in cyclization efficiency. It has also been shown that the C-terminal Cys of the propeptide is highly conserved and essential for substrate recognition; changing this residue to Ala causes loss of activity. Interestingly, GmPOPB is able to cyclize the α-amanitin precursor (AbAMA1) from the distantly related species Amanita bisporigera, indicating that the secondary structure rather than the exact primary sequence is important for cyclization. The major advantage of GmPOPB is its ability to cyclize small peptides with relatively high efficiency. Kinetic analysis showed that the kcat and Km values of GmPOPB for the native substrate are about 5.6 s⁻¹ and 25 µM, respectively. However, poor understanding of the substrate scope limits its application in the synthesis of macrocyclic peptides. GmPOPB fails to cyclize phallacidin, a related mushroom toxin, and thus it remains to be determined whether GmPOPB can cyclize peptides unrelated to α-amanitin.

Transglutaminase
Transglutaminases are a group of enzymes that catalyze the acyl transfer reaction between the γ-carboxyamide of glutamines in peptides/proteins and the ε-amino group of Lys (or a wide variety of primary amines) to form a protease-resistant isopeptide bond [123,124]. They are ubiquitous and occur widely in microorganisms, plants, invertebrates and mammals [125][126][127]. Animal and plant transglutaminases are calcium-dependent enzymes, whereas microbial transglutaminases do not require calcium for their activity. They have been used extensively in the food industry as "meat glue" to bond proteins together and thus improve the texture of meat [128,129]. Examples include imitation crabmeat and fish balls. This allows a manufacturer to imitate the texture and taste of a more expensive product, such as lobster tail, using a relatively low-cost material. Recently, transglutaminases have been explored for peptide macrocyclization [130]. Furthermore, it has been reported that about 1% of statherin, a 43-residue phosphopeptide found in human saliva, exists in a cyclic form through the action of transglutaminase [131].
While it is well established that transglutaminases catalyze specific isopeptide bond formation between the side chains of Gln and Lys residues, no clearly defined peptide sequence is preferred by the enzymes. Probing of substrate specificity by phage selection from random peptide libraries showed that certain peptide sequences have higher selectivity as Gln-donor substrates. For example, the peptide Ac-WALQRPH-NH2 shows a 19-fold lower Km value (3 mM) than the commonly used Cbz-Gln-Gly (58 mM) [132,133]. However, the precise mechanism and the requirements for selectivity remain to be determined. Using WALQRPH as the Gln-donor sequence, a 3-residue spacer (GGG), and the acceptor sequence KS, a set of 10 peptides was designed to evaluate the effect of length and amino acid composition on cyclization by transglutaminases (Table 5) [130]. The results show that peptides of variable ring sizes, from 1.2 to 2.6 kDa, can be cyclized efficiently. Truncation of two residues at the N-terminus causes loss of cyclization, which suggests that a certain length is required for efficient activity. In the absence of the acceptor Lys residue, Gln is deamidated to form a Glu residue. Microbial transglutaminases are commercially available at relatively low cost. The cyclization is efficient but lacks specificity if the peptides or proteins contain multiple Gln or Lys residues. Most Gln and Lys residues would act as substrates, with efficiencies that vary with their accessibility to the enzyme.

Butelase
Butelase 1 is the fastest known ligase, with catalytic efficiencies of up to 1,340,000 M⁻¹ s⁻¹ [134]. It is a highly promiscuous enzyme and has been shown to cyclize and ligate a broad range of non-native peptides and proteins of various origins with high efficiency [135][136][137][138][139]. It was recently discovered in the medicinal plant Clitoria ternatea (butterfly pea), which is also a common ornamental plant, fodder and cover crop in Southeast Asia [140]. Butelase 1 belongs to the legumain family of cysteine proteases (family C13, clan CD) and shares 37% and 70% sequence homology with human legumain and VmPE-1 from mungo bean, respectively [140]. Butelase 1 is present in high yield (~5 mg/kg) and can be purified from the pods of C. ternatea [141]. It is a cyclase responsible for the backbone cyclization of cyclotides, the largest family of plant cyclic peptides. The first cyclotide, kalata B1, was discovered in the early 1970s, and the finding that butelase 1 could cyclize kalata B1 provided the first in vitro evidence for its maturation, some 40 years after its discovery [140,142].
Butelase 1 is C-terminal specific for Asx (Asn/Asp) at the P1 position and requires a C-terminal dipeptide His-Val at the P1′ and P2′ positions for substrate recognition [141]. It acts preferentially as a cyclase rather than a protease: it does not hydrolyze the legumain substrate Z-AAN-AMC even after an extended incubation of 30 h. Although the precise mechanism of cyclization remains to be determined, it has been speculated that butelase 1 first recognizes and cleaves the peptide bond between Asx and His-Val to form an acyl-enzyme intermediate, which is subsequently resolved by the amino group of the N-terminal residue to form a head-to-tail macrocycle. Butelase 1 displays broad tolerance for the N-terminal residue at the P1″ position, accepting nearly all 20 natural amino acids except Pro. Interestingly, it has a more stringent specificity for the residue at the P2″ position, where it strongly favors Ile/Leu/Val and, to some extent, Cys. Butelase 1 has been shown to cyclize various peptides and proteins from 1 to 30 kDa with high efficiency (Table 6). It is the only known naturally occurring cyclase able to cyclize proteins. Interestingly, butelase 1 has also been reported to mediate intermolecular peptide ligation and site-specific modification of proteins. This property allows functional probes or tags to be incorporated into a target protein under mild reaction conditions. The high catalytic efficiency together with a simple recognition motif makes butelase 1 highly useful in peptide macrocyclization. It has a broad substrate scope and can cyclize peptides and proteins from 10 to >200 residues with high efficiency. A typical butelase-mediated reaction requires 100- to 1000-fold less enzyme than other peptide ligases. It has been shown that butelase 1 can cyclize peptides up to 200,000-fold faster than sortase A [134]. Butelase 1 also enables the cyclization of non-cysteine-containing peptides, which are difficult to synthesize by chemical methods. Furthermore, butelase 1 has a simple recognition motif that requires no C-terminal hydrazide, thioester or unusual amino acid. Importantly, no recognition sequence except Asx is left in the final cyclized product, making the butelase-mediated reaction a traceless ligation. Butelase 1, however, cannot cyclize denatured proteins and requires a properly folded conformation.
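The recognition rules described for butelase 1 (a C-terminal His-Val dipeptide at P1′/P2′ that is cleaved off, Asx at P1, any N-terminal residue except Pro at P1″, and a preference for Ile/Leu/Val or Cys at P2″) can be expressed as a simple sequence check. The following is a minimal sketch based only on those stated rules; the function name and the example precursor are hypothetical, and folding, which the text notes is also required, is not modeled.

```python
def butelase_substrate_report(precursor: str) -> dict:
    """Check a linear precursor (one-letter code) against the stated rules.

    The precursor is read N- to C-terminus and is expected to end in the
    His-Val dipeptide, which is removed during cyclization.
    """
    seq = precursor.upper()
    if len(seq) < 4 or not seq.isalpha():
        raise ValueError("expected a one-letter amino acid sequence")

    core, tail = seq[:-2], seq[-2:]   # core ends at P1; tail is P1'/P2'
    p1 = core[-1]                     # residue retained at the ligation site
    p1pp, p2pp = core[0], core[1]     # N-terminal residues P1'' and P2''

    recognized = tail == "HV" and p1 in "ND" and p1pp != "P"
    return {
        "c_terminal_HV": tail == "HV",
        "p1_is_asx": p1 in "ND",
        "p1pp_tolerated": p1pp != "P",
        "p2pp_favored": p2pp in "ILVC",
        # Sequence expected in the head-to-tail product (His-Val removed).
        "cyclic_product": core if recognized else None,
    }


if __name__ == "__main__":
    print(butelase_substrate_report("GIGGAKWRNHV"))  # hypothetical precursor
```

In this sketch an unfavored P2″ residue does not block recognition; it is only flagged, since the text describes it as a preference rather than an absolute requirement.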