Identification, Cloning and Heterologous Expression of the Gene Cluster Directing RES-701-3, -4 Lasso Peptides Biosynthesis from a Marine Streptomyces Strain

RES-701-3 and RES-701-4 are two class II lasso peptides originally identified in the fermentation broth of Streptomyces sp. RE-896, which have been described as selective endothelin type B receptor antagonists. These two lasso peptides only differ in the identity of the C-terminal residue (tryptophan in RES-701-3, 7-hydroxy-tryptophan in RES-701-4), thus raising an intriguing question about the mechanism behind the modification of the tryptophan residue. In this study, we describe the identification of their biosynthetic gene cluster through the genome mining of the marine actinomycete Streptomyces caniferus CA-271066, its cloning and heterologous expression, and show that the seven open reading frames (ORFs) encoded within the gene cluster are sufficient for the biosynthesis of both lasso peptides. We propose that ResE, a protein lacking known putatively conserved domains, is likely to play a key role in the post-translational modification of the C-terminal tryptophan of RES-701-3 that affords RES-701-4. A BLASTP search with the ResE amino acid sequence shows the presence of homologues of this protein in the genomes of eight other Streptomyces strains, which also harbour the genes encoding the RES-701-3, -4 precursor peptide, split-B proteins and ATP-dependent lactam synthetase required for the biosynthesis of these compounds.


Introduction
Lasso peptides are a class of ribosomally synthesised and post-translationally modified peptide (RiPP) natural products produced by bacteria [1,2]. They have been shown to possess a wide range of biological properties, including antimicrobial activity against Gram-negative pathogens in the case of capistruin [3] and microcin J25 [4], antimicrobial activity against Gram-positive pathogens such as that exerted by siamycin-I [5] and the inhibition of HIV replication displayed in cell cultures by RP71955 [6]. A striking feature of lasso peptides, which sets them apart from other RiPPs, is their unusual topology. All of them display a macrolactam ring comprising seven to nine residues, formed between the N-terminal α-amino group and the β- or γ-carboxyl group of aspartic or glutamic acid, with the remaining C-terminal peptide tail threaded through the ring. Additionally, the topology of some lasso peptides can be further modified by the presence of one or two disulfide bridges (class III and I lasso peptides, respectively), although most of the known lasso peptides contain none (class II lasso peptides) [2]. The vast majority of lasso peptides have been isolated from terrestrial microorganisms;

Identification and in Silico Analysis of RES-701-3, -4 Biosynthetic Gene Cluster
During our ongoing research with marine microorganisms, we identified Streptomyces caniferus CA-271066 as a producer of new bioactive metabolites [24]. The draft genome of this organism was analyzed with antiSMASH [10], which predicted 34 regions putatively encoding secondary metabolite gene clusters, including NRPS (non-ribosomal peptide synthetase), type I and II PKS (polyketide synthase), siderophores, terpenes and RiPPs. Careful examination of one of the RiPP gene clusters strongly suggested that it directed the biosynthesis of at least RES-701-3, based on the amino acid sequence predicted to be encoded by the precursor peptide gene resA (Figure 2). A sequence analysis revealed the presence of seven ORFs located in a 7.7 Kb region (Figure 3). Interestingly, five of the ORFs (resA, resC, resB1, resB2 and resE) are unidirectionally transcribed, whereas the remaining two (resF and resD) are transcribed from the complementary strand. Additionally, resB1 and resB2 are probably translationally coupled. resA encodes a 44 aa (amino acid) precursor peptide, with the C-terminal 16 aa region containing the core peptide and the remaining 28 aa from the N-terminal region forming the leader peptide required for processing. A BLASTp homology search using the NCBI non-redundant protein sequence database was employed to analyze the proteins encoded by the remaining ORFs.
resC encodes a 612 aa protein, which was found to be similar to SOE12204.1 (630 aa, 84% identity, 89% similarity) from Streptomyces sp. 2323.1. The protein contains a cd01991 domain, which is typically found in asparagine synthases and ATP-dependent lactam synthetases. resB1 encodes an 84 aa protein similar to WP_106430390.1 (86 aa, 89% identity, 90% similarity) from Streptomyces auratus, and shows homology with the lasso peptide biosynthesis PqqD (pyrroloquinoline quinone biosynthesis) family chaperone. resB2 encodes a 145 aa protein similar to SOE12206.1 (145 aa, 88% identity, 92% similarity) from Streptomyces sp. 2323.1, containing the domain pfam13471. resE encodes a 209 aa hypothetical protein similar to WP_006604205.1 (209 aa, 86% identity, 91% similarity) from Streptomyces auratus, and lacks any known conserved domain. resF encodes a 551 aa protein similar to WP_119203701.1 (551 aa, 88% identity, 94% similarity) from Streptomyces sp. 2233. It contains a COG0531 domain and is proposed to be a member of the APC (Amino Acid-Polyamine-Organocation) family of transporters. It is predicted to contain 14 transmembrane helices (TMHMM server v 2.0) [25]. resD encodes a 691 aa protein similar to WP_106430393.1 (648 aa, 78% identity, 82% similarity) from Streptomyces auratus, with high homology to ABC transporters. It contains a COG1132 domain and is predicted to contain six transmembrane helices [25].
The growth of Streptomyces caniferus CA-271066 did not lead to the production of RES-701-3, -4 under any of the fermentation conditions employed. Thus, we conceived a strategy based on heterologous expression in order to establish a link between the lasso peptides and their putative biosynthetic gene cluster.
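As a quick consistency check on the in silico analysis, the seven ORFs can be tabulated and their combined coding length compared with the reported 7.7 Kb region. The sketch below uses only the protein lengths, strand grouping and annotations given in the text; the class name `Orf`, the helper names and the "+"/"-" strand labels are illustrative assumptions, not part of the original analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Orf:
    name: str
    length_aa: int   # protein length as reported in the text
    strand: str      # "+" and "-" are arbitrary labels for the two strands
    annotation: str  # domain or family assignment reported in the text


RES_CLUSTER = [
    Orf("resA", 44, "+", "precursor peptide"),
    Orf("resC", 612, "+", "cd01991 domain (asparagine synthase / lactam synthetase)"),
    Orf("resB1", 84, "+", "PqqD-family chaperone"),
    Orf("resB2", 145, "+", "pfam13471 domain"),
    Orf("resE", 209, "+", "no known conserved domain"),
    Orf("resF", 551, "-", "COG0531 domain, APC-family transporter"),
    Orf("resD", 691, "-", "COG1132 domain, ABC transporter"),
]


def coding_length_nt(orfs):
    """Nucleotides needed to encode all ORFs, counting one stop codon each."""
    return sum((orf.length_aa + 1) * 3 for orf in orfs)


def split_precursor(precursor_len, core_len):
    """Return (leader_len, core_len) for a precursor with a C-terminal core."""
    if not 0 < core_len < precursor_len:
        raise ValueError("core must be shorter than the precursor")
    return precursor_len - core_len, core_len


total_nt = coding_length_nt(RES_CLUSTER)
leader_len, core_len = split_precursor(44, 16)
unannotated = [orf.name for orf in RES_CLUSTER if "no known" in orf.annotation]

print(f"coding length: {total_nt} nt (reported region: ~7700 nt)")
print(f"ResA leader/core: {leader_len}/{core_len} aa")
print(f"ORFs without a known conserved domain: {unannotated}")
```

The summed coding length leaves several hundred nucleotides for intergenic and promoter regions within the 7.7 Kb region, and resE is the only ORF without a domain assignment.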

Cloning of resACB1B2EFD into the Vector pCAP01
A genomic region spanning 9.1 Kb and containing resACB1B2EFD was amplified by employing an overlapping-PCR approach. Over 700 nucleotides upstream of resA and resD were included in the amplified region in order to capture the putative promoter and the transcriptional and ribosome-binding sites. The 9.1 Kb SpeI/XhoI fragment was initially cloned into the pCR™-Blunt vector and transformed into NEB 10-beta E. coli. Clones were checked by restriction analysis, and a digested and purified SpeI/XhoI fragment was then cloned into the pCAP01 vector to generate pCAPRES. pCAP01 is a S. cerevisiae/E. coli/actinobacteria shuttle vector designed for the site-specific integration of the cloned gene cluster into the chromosomes of heterologous actinobacterial hosts, thanks to the φC31 integration element present in the vector backbone [26]. Because pCAPRES contains a kanamycin resistance marker, the direct transformation of the non-methylating CmR KmR E. coli strain ET12567/pUB307 is not possible. Thus, pCAPRES was used to transform the CmR E. coli strain ET12567, followed by triparental intergeneric conjugation employing E. coli ET12567/pCAPRES, ET12567/pUB307, and spores of the actinomycete host (Streptomyces coelicolor M1152, M1154, or Streptomyces albus J1074). Exconjugants were checked by PCR to confirm the integration of the putative biosynthetic gene cluster into the chromosomes of the heterologous hosts.

No production of either RES-701-3 or RES-701-4 could be detected in any of the J1074-pCAPRES exconjugant clones under any of the growth conditions used.

Discussion
Post-translational modifications are unusual in lasso peptides, although a few examples have been reported recently, including the C-terminal phosphorylation in paeninodin [27], citrullination in citrulassin A [14], acetylation in albusnodin [28], C-terminal methylation of lassomycin and lassomycin-like lasso peptides [29,30], and the epimerization of the α-carbon from the C-terminal amino acid residue in MS-271 [31]. In this work, we prove that resACB1B2EFD, identified through a genome-mining approach, is sufficient for the biosynthesis of the lasso peptide RES-701-3 and its 7-hydroxy-tryptophan homologue RES-701-4 employing Streptomyces coelicolor M1152 and M1154 as heterologous hosts. No production of either lasso peptide could be detected in any of the pCAPRES exconjugants from Streptomyces albus under any of the growth conditions, which shows the importance of employing different heterologous hosts, even if they are from the same genus. The heterologous expression of a lasso peptide in Streptomyces coelicolor and a lack of production in Streptomyces albus has also been observed in the case of albusnodin [28].
Based on their homologies and the identified conserved domains, plausible biosynthetic roles for proteins ResA, ResB1, ResB2 and ResC can be proposed. resA encodes the precursor peptide. resB1 and resB2 encode a "split" B protein; therefore, ResB1 is proposed to recognise and bind the leader peptide from ResA to deliver the structural peptide to ResB2 for processing [15]. resC encodes a lactam synthetase, which is proposed to catalyze the ATP-dependent formation of the amide bond between the N-terminal α-amino group of Gly-1 and the β-carboxyl group of Asp-8 [32].
Many lasso peptide BGCs contain an ABC transporter, presumably involved in the secretion of the mature product, which is usually found downstream of the lactam synthetase-encoding gene. In the RES-701-3, -4 gene cluster, resD encodes an ABC-type transporter and is located on the strand complementary to resACB1B2E. resF is adjacent to resD and encodes a 551 aa protein with 14 predicted transmembrane helices that belongs to the Amino Acid-Polyamine-Organocation (APC) family of transporters. To the best of our knowledge, this is the first time that a member of this family of transporters has been found in a lasso peptide biosynthetic gene cluster, and it is not clear what role, if any, it could play in RES-701-3, -4 biosynthesis. Presumably, the concerted action of ResD with additional ABC transporter components encoded elsewhere in the genome could be responsible for RES-701-3, -4 secretion.
On the other hand, resE is located downstream of resB2, and it encodes a medium-sized protein (209 aa), lacking any known conserved domain. Close homologues of ResE are found in the genomes of eight other Streptomyces strains. An analysis of the genomic context for these ResE homologues shows that all of them are encoded within homologous RES-701-3, -4 gene clusters (Figure 5), strongly suggesting that ResE is required for the biosynthesis of these lasso peptides.
A close inspection of the ResA protein shows that the C-terminal 16 aa region containing the structural peptide is identical in all cases, and only minor differences are found in the N-terminal region, corresponding to the leader peptide. In four of the cases (Streptomyces sp. 2314.4, Streptomyces sioyaensis, Streptomyces sp. 2333.5 and Streptomyces sp. 2112.2), the gene cluster organization is identical to that described here for Streptomyces caniferus CA-271066. Streptomyces auratus and Streptomyces angustmyceticus NRRL B-2347 contain the operon resACB1B2E and lack the genes resD and resF. Finally, in the case of Streptomyces sp. TM32 and Streptomyces sp. 2323.1, a number of ORFs encoding proteins unrelated to lasso peptide biosynthesis are found embedded between resF and resD. These data suggest that ResD and ResF might not be strictly required for RES-701-3, -4 biosynthesis. It is also worth mentioning that in the case of Streptomyces angustmyceticus NRRL B-2347, resB2 and resE might be translationally coupled, which could suggest a coordinated action and/or protein-protein interaction between ResB2 and ResE [33]. More distantly related homologues of ResE can be found in the genomes of numerous other Streptomyces species, but there is a noticeable drop in the level of homology, and an analysis of their genomic context shows that they are not encoded within lasso peptide gene clusters.
Taken together, the observations that (i) resACB1B2EFD is sufficient to produce both lasso peptides, (ii) putative roles for ResA, ResC, ResB1 and ResB2 can be proposed based on their homologies, (iii) ResD and ResF are transmembrane transporters and, as such, are unlikely to be involved in the conversion of RES-701-3 into RES-701-4 and (iv) all of the closest homologues of ResE are encoded in homologous RES-701-3, -4 gene clusters found in eight other Streptomyces strains lead us to hypothesise that ResE is likely to play a key role in the hydroxylation of position 7 of the C-terminal tryptophan of RES-701-3, or of its pre-lasso intermediate, in order to afford RES-701-4.
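The conservation argument can be made explicit by treating each homologous cluster as a set of res genes and taking the intersection across strains. The sketch below encodes only the gene content described in the text; the unrelated ORFs embedded between resF and resD in two of the strains are not named in the source and are therefore omitted, and the variable and function names are illustrative.

```python
# Gene content of the nine clusters as described in the text.
CORE_OPERON = frozenset({"resA", "resC", "resB1", "resB2", "resE"})
FULL_CLUSTER = CORE_OPERON | {"resF", "resD"}

CLUSTERS = {
    "Streptomyces caniferus CA-271066": FULL_CLUSTER,
    "Streptomyces sp. 2314.4": FULL_CLUSTER,
    "Streptomyces sioyaensis": FULL_CLUSTER,
    "Streptomyces sp. 2333.5": FULL_CLUSTER,
    "Streptomyces sp. 2112.2": FULL_CLUSTER,
    "Streptomyces auratus": CORE_OPERON,
    "Streptomyces angustmyceticus NRRL B-2347": CORE_OPERON,
    # Unrelated ORFs lie between resF and resD in the next two strains;
    # they are not named in the text and are left out here.
    "Streptomyces sp. TM32": FULL_CLUSTER,
    "Streptomyces sp. 2323.1": FULL_CLUSTER,
}


def conserved_genes(clusters):
    """Genes present in every cluster."""
    return frozenset.intersection(*clusters.values())


def variable_genes(clusters):
    """Genes present in at least one cluster but not in all of them."""
    return frozenset.union(*clusters.values()) - conserved_genes(clusters)


print("conserved:", sorted(conserved_genes(CLUSTERS)))
print("variable: ", sorted(variable_genes(CLUSTERS)))
```

Under this encoding resE falls in the conserved set together with resA, resC, resB1 and resB2, whereas resF and resD are the only variable genes, which is the pattern the hypothesis relies on.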
In summary, we report the identification, cloning and heterologous expression of the gene cluster encoding the biosynthesis of RES-701-3, -4. Our data unequivocally show that resACB1B2EFD is sufficient for the production of both lasso peptides. Additionally, genome mining allowed us to identify eight other Streptomyces strains potentially containing the RES-701-3, -4 biosynthetic gene cluster, in which resE is universally conserved. We hypothesise that ResE is likely to play a key role in the hydroxylation required to generate RES-701-4, but further genetic and/or biochemical characterization work will be required to test this hypothesis and decipher the exact roles of ResE and ResF in the biosynthesis of these two lasso peptides.

Bacterial Strains and Plasmids
The strain Streptomyces caniferus CA-271066 was isolated from an ascidian collected at the seaside at 2 meters depth in São Tomé (São Tomé and Principe). A similarity-based search with the 16S rDNA sequence (1393 nt) against the EzBioCloud database indicated that the strain is closely related to Streptomyces caniferus DSM 41453(T) (100% identity) [34]. NEB 10-beta competent E. coli (New England BioLabs, Ipswich, MA, USA), E. coli ET12567 (LGC Standards, Manchester, NH, USA) and E. coli

Cloning of RES-701-3, -4 Biosynthetic Gene Cluster into the pCAP01 Vector
The 9.1 Kb amplicon containing the RES-701-3, -4 gene cluster was initially cloned into the pCR™-Blunt vector using the Zero Blunt™ PCR Cloning kit (Thermo Fisher Scientific, Waltham, MA, USA). Briefly, ca. 975 ng of the 9.1 Kb amplicon was mixed with 0.5 µL of the pCR™-Blunt vector, 1.5 µL of the ligase buffer and 1.5 µL of the T4 DNA ligase, and the mixture was incubated at room temperature for 90 min. A total of 10 µL of the mixture was then used to transform NEB 10-beta competent E. coli (New England BioLabs, Ipswich, MA, USA). The generated recombinant plasmids (pBLUNT-RES) were confirmed by a restriction analysis. pBLUNT-RES was digested with SpeI and XhoI, and the resulting 9.1 Kb fragment was purified and ligated by T4 DNA ligase with the largest dephosphorylated SpeI/XhoI fragment of pCAP01, followed by the transformation of NEB 10-beta competent E. coli. The recombinant plasmids were analyzed by restriction digestion to obtain pCAPRES.
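For readers who want to relate the mass of amplicon used in the ligation to a molar amount, the conversion can be sketched as follows. The average mass of 650 g/mol per base pair of double-stranded DNA is a commonly used approximation and is an assumption here, not a value taken from the protocol; the function name is illustrative.

```python
AVG_BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one dsDNA base pair


def dsdna_ng_to_pmol(mass_ng, length_bp, bp_mass=AVG_BP_MASS_G_PER_MOL):
    """Convert a mass of double-stranded DNA (ng) into picomoles of molecules."""
    if mass_ng < 0 or length_bp <= 0:
        raise ValueError("mass must be non-negative and length positive")
    # ng -> g is 1e-9 and mol -> pmol is 1e12, which combine to a factor of 1e3.
    return mass_ng * 1e3 / (length_bp * bp_mass)


# ca. 975 ng of the 9.1 Kb amplicon, as stated in the protocol
amplicon_pmol = dsdna_ng_to_pmol(975, 9100)
print(f"{amplicon_pmol:.3f} pmol of amplicon")
```

With these inputs the amplicon amounts to roughly 0.16 pmol; the same helper could be applied to the vector if its mass and length were known.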

Intergeneric Conjugation
Plasmid pCAPRES was conjugated into Streptomyces hosts as previously described [26]. Briefly, purified pCAPRES was used to electroporate non-methylating E. coli ET12567. Cells from E. coli ET12567/pUB307 and E. coli ET12567/pCAPRES collected at an optical density of 0.4-0.6 were washed with LB twice to remove the antibiotics, resuspended and mixed with an adequate amount of freshly activated spores of S. coelicolor M1152, S. coelicolor M1154 or S. albus J1074. The mixtures were plated on MA for triparental mating and overlaid after ca. 16 h with nalidixic acid (25 µg/mL) and kanamycin (50 µg/mL). After a few days of incubation, some of the exconjugants were streaked on MA plates containing nalidixic acid (25 µg/mL) and kanamycin (50 µg/mL), and five colonies from each Streptomyces heterologous host were picked and streaked on ISP-2 plates. The insertion of pCAPRES into the chromosomes of the Streptomyces hosts was checked by PCR employing the genomic DNA of the exconjugants and the primers resBF and resBR.

Heterologous Expression of RES-701-3, -4 Gene Cluster and LC-ESI-TOF Analysis
Seed cultures on ATCC-2 of the recombinant strains S. coelicolor M1152, M1154 and S. albus J1074 harbouring the RES-701-3, -4 BGC, together with the corresponding negative controls, were used to inoculate Petri plates with ISP2, ISP4, MYM, minimal medium and supplemented minimal medium. The plates were incubated at 28 °C for 6 days, and then the agar was sliced and subjected to extraction with n-BuOH. The organic solvent was evaporated to dryness and the extract was resuspended to a final ratio of 20% DMSO/water. The microbial extracts were filtered and analyzed employing a Bruker maXis QTOF mass spectrometer coupled to an HPLC system, as previously described [38].
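Because RES-701-4 differs from RES-701-3 only by hydroxylation of the C-terminal tryptophan, the two peptides are expected to differ by one oxygen atom in the mass spectra. The sketch below computes the corresponding m/z shift for a given charge state; it is an illustration rather than part of the published method, the constants are standard monoisotopic values, and the neutral mass in the usage lines is a placeholder, not a measured value for either peptide.

```python
O_MONOISOTOPIC = 15.994915   # monoisotopic mass of oxygen-16 (Da)
PROTON_MASS = 1.007276       # mass of a proton (Da)


def expected_mz(neutral_mass, charge):
    """m/z of the [M + zH]z+ ion for a given neutral monoisotopic mass."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def hydroxylation_mz_shift(charge, n_hydroxyl=1):
    """m/z difference caused by adding n_hydroxyl oxygen atoms at this charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return n_hydroxyl * O_MONOISOTOPIC / charge


# Placeholder neutral mass, used only to illustrate the calculation.
placeholder_mass = 2000.0
for z in (1, 2):
    parent = expected_mz(placeholder_mass, z)
    hydroxylated = expected_mz(placeholder_mass + O_MONOISOTOPIC, z)
    print(f"z={z}: shift = {hydroxylated - parent:.4f} m/z")
```

In practice this means looking for a pair of ions separated by about 15.995 m/z for singly charged species, or about 7.997 m/z for doubly charged species.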