Molecular basis for carrier protein-dependent amide bond formation in the biosynthesis of lincosamide antibiotics

In the biosynthesis of the lincosamide antibiotic celesticetin, the condensation enzyme CcbD generates the lincosamide pharmacophore by forming an amide bond between the carrier protein (CP)-tethered proline and ergothioneine-conjugated thiooctose. Although the function of CcbD has been investigated, its substrate specificity, structures and catalytic mechanisms remain unclear. Here we show the structure–function analyses of CcbD. Our biochemical analysis revealed that CcbD exhibits promiscuous substrate specificity towards CP-tethered acyl substrates to generate unnatural lincosamides. Furthermore, structural analyses indicated that CcbD possesses an unusual overall fold, while the N-terminal region shows weak similarity to cysteine proteases. Thus, CcbD, like cysteine proteases, utilizes the Cys-His-Glu catalytic triad to form amide bonds in a CP-dependent manner, which is significantly different from other known amide bond-forming enzymes. Furthermore, the structures of the CcbD/thiooctose complex and the cross-linked CcbD/CcbZ-CP complex, as well as structure-based mutagenesis, revealed the intimate structural details of the CP-dependent amide bond formation reaction. Molecular insights into the mechanism of amide bond formation in the biosynthesis of lincosamide antibiotics remain scarce. Now, the crystal structure of the condensation enzyme CcbD that catalyses this reaction is solved, its substrate scope investigated and a catalytic mechanism proposed.

Article https://doi.org/10.1038/s41929-023-00971-y In this Article, we performed structure-function analyses of CcbD by in vitro characterization, crystallization and structure-based mutagenesis. The enzyme reaction analyses with l-prolyl-CoA and unnatural acyl substrates revealed that the enzyme recognizes the CP moiety of the substrate, but shows broad substrate specificity towards the acyl moiety to generate various unnatural lincosamide compounds. The crystallographic analyses of CcbD indicated that its overall structure is not related to known structurally characterized amide bond-forming enzymes, while its N-terminal region is weakly similar to those of cysteine proteases 26 . Furthermore, site-directed mutagenesis experiments indicated that CcbD utilizes Cys-His-Glu residues as a catalytic triad. Lastly, the complex structures of CcbD with 8 and CcbD with CP revealed the molecular basis for the recognition of substrates and CP, and the mechanism of the CcbD-catalysed CP-dependent amide bond formation reaction.

... condensation of CP-tethered PPL, or Pro and ergothioneine (EGT) S-conjugated thiooctose 5, to form an amide bond in 6 and 7, respectively (Fig. 1b). The amino acid selectivity is determined by the LmbC and CcbC specificities, which strictly recognize PPL and Pro to form LmbN-CP-tethered PPL and CcbZ-CP-tethered Pro, respectively 17. In contrast, CcbD shows relaxed substrate specificity towards both of its substrates, CP-tethered Pro and thiooctose 14,17. Specifically, as the aminoacyl substrate, CcbD can utilize both CcbZ-CP-tethered Pro (its natural substrate) and LmbN-CP-tethered PPL (intermediate from 1 biosynthesis), and it also accepts methylthiolincosamide (MTL: 8), in which EGT is substituted with a C1-S-methyl group bearing the opposite stereochemistry compared with the natural substrate 5, to generate 9 and 10, respectively (Fig. 1c) 15.
Although this CP-dependent condensation system is similar to those of non-ribosomal peptide synthetases (NRPSs), LmbD/CcbD show low sequence similarity to any functionally characterized amide bond-forming enzymes (below 15%) 13,14. Furthermore, the model structure of CcbD predicted by AlphaFold2 (ref. 18) does not resemble those of other amide bond-forming enzymes (Supplementary Fig. 1) 19-25. These observations suggest that LmbD/CcbD possess an unusual fold and employ a different mechanism for amide bond formation from other enzymes. Furthermore, although the function of CcbD has been investigated by in vivo and in vitro analyses, its substrate specificities towards CP-tethered acyl-substrates, overall structures, catalytic residues and catalytic mechanisms remain unclear.

Substrate specificity of CcbD
First, we compared the CcbD reaction between 5 and 8 to understand the substrate preference of CcbD for the C1-S-alkyl moiety. The yield of the product from 8 was comparable to that from 5, indicating that CcbD does not recognize the structure of the C1-S-alkyl moiety in the enzyme reaction (Extended Data Fig. 1). Second, while CcbD shows substrate tolerance towards the alkyl-chain length of PPL 17, the requirement of CP and the substrate scope for non-proline acyl-substrates have not been clarified. Therefore, to understand the importance of CP in the enzyme reaction, l-Pro-CoA was incubated with CcbD and 8. We determined that CcbD also uses l-Pro-CoA as a substrate, instead of CP-tethered l-Pro, although the reaction efficiency is significantly lower than that with CP-tethered l-Pro (Extended Data Fig. 2). We next tested various acyl-CPs, including acetyl-CP (9), butanoyl-CP (10), hexanoyl-CP (11), octanoyl-CP (12), alanoyl-CP (13), pipecolyl-CP (14) and phenylalanoyl-CP (15), as substrates of CcbD. Interestingly, CcbD accepted all of them as substrates to generate 16-24 with yields of ~8-34% (Extended Data Fig. 3 and Table 2). Further MS/MS analysis suggested that 21 is N-octanoyl-8, and 18 and 20 are products resulting from ester bond formation at the C7-hydroxyl group (Supplementary Fig. 2c,d). Moreover, we also tested various amino-containing compounds 25-36 as substrates of CcbD (Extended Data Fig. 4). Surprisingly, CcbD accepted 32-36 as substrates to generate dipeptide compounds 37-41, although the yields are quite low, suggesting that CcbD recognizes the ethanolamine moiety of substrates.
These results indicated that the CP moiety is important for the precise substrate recognition by CcbD, while CcbD shows more relaxed specificity for the proline moiety and amino substrates to generate unnatural-type lincosamides and dipeptides.

Overall structure and active site of CcbD
To understand the structural basis for the CP-dependent amide bond-forming reaction by CcbD, we solved the structure of selenomethionine-labelled CcbD at 2.5 Å resolution (Fig. 2 and Supplementary Table 3). CcbD exists as a homodimer and the overall structure of CcbD consists of three domains, including the α/β-fold domain, which contains five α-helices and six β-strands, four anti-parallel β-sheets and the C-terminal five α-helix domain (Fig. 2a). The CcbD monomer possesses a large cleft between the α/β-fold domain and the C-terminal α-helix domain. The four anti-parallel β-sheet domain of the adjoining monomer inserts into this cleft to form a dimer. The dimer is stabilized through electrostatic, hydrophilic and hydrophobic interactions between the amino acid residues on each inserted β-sheet domain, and the dimer interface buries ~2,850 Å². The monomers are nearly identical to each other, with a root-mean-square deviation (RMSD) of 0.3 Å. The dimerization creates a large cavity between the dimer surface, consisting of α1, α5, α10 and a loop between α6 and β5 from monomer A, and α10 and a loop between β1 and β2 from monomer B. The estimated total volume and area of the active-site cavity are 790 Å³ and 920 Å², respectively.
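The RMSD values quoted in this section compare equivalent atom positions after superposition. As a minimal sketch of how that quantity is defined, assuming two already-superposed coordinate lists in matching order (the function name and the coordinates are hypothetical and are not taken from the CcbD structure):

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumes the two sets are already superposed and listed in matching order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates in Å, for illustration only:
# every atom pair is displaced by exactly 0.3 Å.
monomer_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
monomer_b = [(0.3, 0.0, 0.0), (1.5, 0.3, 0.0), (3.0, 0.0, 0.3)]
print(round(rmsd(monomer_a, monomer_b), 2))
```

Structure-comparison programs also perform the optimal superposition and residue matching before this step, which the sketch deliberately leaves out.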
Interestingly, despite the lack of sequence similarity, the α/β-fold domain of CcbD shares weak structural similarity to those of cysteine proteases, including a putative C39-like peptidase protein with unknown function from Bacillus anthracis (PDB code 3ERV) and the phytochelatin synthase (PCS)-like enzyme NsPCS from Nostoc sp. PCC 7120 (PDB code 2BTW), with RMSDs of 3.3 Å and 3.3 Å for the Cα atoms, respectively (13% and 10% identity) 26. However, the four anti-parallel β-sheets and the C-terminal five α-helix domain, which are important for the homodimerization of CcbD, were not observed in cysteine proteases and NsPCS (Extended Data Fig. 5a-c). Comparison of the active sites of CcbD and NsPCS indicated that the active site shape and residues are not conserved, except for the catalytic triad as described below. The ligand binding site of NsPCS is located on the surface of the monomer, whereas CcbD forms a large pocket that specifically accepts acyl-tethered CP (Extended Data Fig. 5d,e).

Identification of catalytic residues
Cysteine proteases and PCSs utilize a Cys-His-Asn/Asp catalytic triad for the enzyme reaction 27-30. CcbD also possesses a similar triad composed of residues Cys17, His131 and Glu148 in the putative active site cavity (Fig. 2b). The C17A, H131A and E148A variants completely lost the amide bond-forming activity, indicating that these residues play a catalytic role in the enzyme reaction of CcbD (Fig. 3a,b). The structural similarities and conservation of the catalytic triad between cysteine proteases and CcbD suggested that the first step of the CcbD reaction is the formation of the CcbD-PPL/Pro complex, by the nucleophilic attack of Cys17 on the thioester group of CP-tethered PPL/Pro.

Binding mode of the thiooctose substrate
We solved the complex structure of CcbD with 8 at 2.25 Å resolution to investigate the binding mode of thiooctose. Compound 8 binds near Cys17 in the catalytic centre through hydrogen bond interactions (Fig. 2c,d). The C2-hydroxy group interacts with Arg263′ from the adjoining monomer, and the C3-hydroxy group also forms a hydrogen bond network with Arg263′ and Asp264′ via a water molecule. Furthermore, the C4-hydroxy group interacts with Tyr270 via a water molecule and with the main chain amides of Met128 and Leu129. The position of the C7-hydroxy group is fixed by hydrogen bonds with the Asp114 side chain and the main chain carbonyl of Leu129. The C6-amino group is positioned close to the catalytic residues Cys17 and His131, at distances of 5.2 Å and 3.1 Å, respectively, suggesting that His131, once activated by Glu148, abstracts the hydrogen atom to activate the amino group of 8, and the activated amino group attacks the thioester of the Cys-tethered PPL/Pro intermediate. The D114A and R263′A substitutions, at residues whose side chains interact with the sugar substrate, abolished the amide bond-forming activity, suggesting that the hydrogen bond networks with the sugar moiety of the substrate are important for substrate recognition (Fig. 3).
The S-methyl group of 8 is oriented towards the entrance of the active site cavity, and there is sufficient space to accept the EGT moiety of the natural substrate 5. The ethanolamine moiety of 32-36 would bind in positions similar to those of the C6-amino and C7-hydroxy groups of 8 and is likely to be recognized by Asp114, Leu129 and the catalytic His131, as in the binding of 8. Furthermore, this sugar binding site is large enough to bind other amino-containing compounds, including the bulky phenylalanol and tryptophanol.

Cross-linking reaction of CcbD and CcbZ-CP
To understand how CcbD selectively recognizes CP during the reaction, we attempted to crystallize CcbD and the CcbZ-CP complex. First, we performed a site-specific cross-linking reaction of CcbD and CcbZ-CP by using a bifunctional maleimide reagent, 1,2-bis(maleimido)ethane (BMOE), to obtain covalently cross-linked CcbD/CcbZ-CP 31. While CcbZ-CP does not have any cysteine residues, CcbD possesses five, with Cys216 and Cys249 located on the surface of the enzyme (Supplementary Fig. 3). Therefore, we substituted these cysteines with serine to prevent non-specific cross-linking reactions, and the resulting C216S/C249S variant was used for cross-linking to the sulfhydryl group of the 4′-phosphopantetheine in CcbZ-CP. The incubation of the C216S/C249S variant and holo-CcbZ-CP in the presence of BMOE efficiently afforded the covalent complex (Extended Data Fig. 6). In contrast, no apparent cross-linking was observed between the C17S/C216S/C249S variant and CcbZ-CP (Extended Data Fig. 6). These results suggested that the specific cross-linking reaction occurred, via BMOE, between the sulfhydryl group of the 4′-phosphopantetheine of CcbZ-CP and that of Cys17 in the active site of CcbD. We purified and crystallized the cross-linked CcbD/CcbZ-CP complex. The structure of the CcbD/CcbZ-CP complex was solved at 2.3 Å resolution. In the complex structure, CcbZ-CP is located between the α/β-fold domain and the C-terminal α-helical domain, and covers the active site entrance of CcbD (Fig. 4a). The structure of CcbD in the complex is almost identical to that in the wild type (WT), with an RMSD of 0.6 Å for the Cα atoms. The interface area between CcbD and CcbZ-CP comprises ∼770 Å², which is 4.8% of the surface area of CcbD and 17% of the surface area of CcbZ-CP. This contact area is similar to those of other complex structures of CPs and enzymes, such as the polyketide synthase and NRPS machineries 32-34.
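The interface percentages above imply approximate total surface areas for the two partners. As a rough worked check, assuming each percentage is simply the interface area divided by that protein's total surface area (the function name is illustrative, and these totals are back-calculated here rather than reported in the text):

```python
def implied_total_surface(interface_area, fraction):
    """Total surface area implied by an interface covering `fraction` of it."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")
    return interface_area / fraction


interface = 770.0  # Å^2, from the text

# Back-calculated estimates, rounded values are approximate
ccbd_surface = implied_total_surface(interface, 0.048)  # roughly 16,000 Å^2
cp_surface = implied_total_surface(interface, 0.17)     # roughly 4,500 Å^2

print(f"CcbD: ~{ccbd_surface:.0f} Å^2, CcbZ-CP: ~{cp_surface:.0f} Å^2")
```

Because the quoted percentages are rounded, these totals should be read as order-of-magnitude estimates only.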
The 4′-phosphopantetheine moiety, which is covalently tethered to Ser36 of CcbZ-CP, is located in the same cavity as 8. The maleimide groups of BMOE form covalent bonds with the sulfhydryl group of the catalytic Cys17 of CcbD and the sulfhydryl group of the 4′-phosphopantetheine of CcbZ-CP (Fig. 4b). The pantetheine moiety does not form tight interactions with the active site residues, probably because BMOE is larger than the natural substrate, proline. The phosphate group of the 4′-phosphopantetheine forms a hydrogen bond network with Tyr322, Arg323 and Glu260′ of CcbD. The C1-carbonyl group of the maleimide, which forms a covalent bond with Cys17, also interacts with the main chain amide of Cys17 and the side chain of His150. Considering that the Cys17-tethered PPL/Pro should be in a similar position to that of maleimide, His150 and the main chain amide group of Cys17 would form an oxyanion hole to stabilize the tetrahedral intermediate during the thioester bond formation and subsequent amide bond formation reactions. The activity of the H150A variant is significantly reduced to 23%, indicating the importance of the oxyanion hole (Fig. 3 and Supplementary Table 4).
Notably, a large space is observed deeper inside the cavity, beyond the maleimide binding site (Fig. 4b and Extended Data Fig. 7). This space is lined by hydrophobic residues and is large enough to accept long alkyl-chains and large side chains of amino acids. These observations suggest that the alkyl-chain of PPL and the other acyl substrates bind in this space, and explain why CcbD shows broad substrate specificity. The model structure of LmbD suggests that the catalytic triad, active site residues and hydrophobic space are well conserved in LmbD, suggesting that LmbD would show a similar property to CcbD (Fig. 5).
The superimposition of the CcbD/8 and CcbD/CcbZ-CP structures and the docking model of 5 in CcbD/CcbZ-CP suggest that the binding sites for 5/8 and 4′-phosphopantetheine overlap and there is no space for the simultaneous binding of 5/8 and Pro-tethered 4′-phosphopantetheine (Extended Data Fig. 8). To further investigate the formation of CcbD-proline intermediate, we also incubated 8 with the presumed Pro-tethered CcbD, which was prepared by incubating CcbD, CcbC, holo-CcbZ-CP and proline (without 8) and removing the substrate by desalting. As a result, even after removal of the proline substrates, 10 was still produced ( Supplementary Fig. 4a). Moreover, when Pro-CoA was incubated with or without CcbD, the non-enzymatic release of proline was not significantly increased in the presence of CcbD, suggesting that CcbD does not catalyse the hydrolysis of either Pro-CoA or Pro-conjugated Cys to release proline in the active site ( Supplementary Fig. 4b). This could be due to the accessibility of water molecules in the active site. In the active site of NsPCS, which catalyses deglycination of glutathione, a water molecule interacting with catalytic His183 is located close to the acylated Cys70 (Extended Data Fig. 5f). In contrast, no activated water molecules interacting with basic or acidic amino acids were observed within 4 Å of Cys17 in the CcbD structures. On the basis of these observations, we believe that the formation of the CcbD-PPL/Pro complex and the ping-pong mechanism are more likely.

Interaction between CcbD and CcbZ-CP
CcbD recognizes CcbZ-CP through salt bridges, hydrogen bond interactions and hydrophobic interactions (Fig. 4c). The Arg312, Arg316 and Arg324 residues on α12 in CcbD form salt bridges with Glu47, Glu40 and Glu41 on α2 in CcbZ-CP, respectively, and Arg323 and Gln319 in CcbD interact with Gln43 in CcbZ-CP via a hydrogen bonding network. Moreover, Trp152 in CcbD is inserted into the hydrophobic region of CcbZ-CP ( Fig. 4c and Supplementary Fig. 5).
To confirm the importance of these interactions, we constructed the CcbD R312A, R316A and R324A variants and their triple variant. The variants of the counterpart residues in CcbZ-CP, including E40A, E41A and E47A and their triple variant, were also constructed. Cross-linking assays to evaluate the interactions of each CcbD variant with CcbZ-CP and CcbD with CcbZ-CP variants revealed that CcbD R316A and the triple variant of CcbD, and E40A, E47A and the triple variant of CcbZ-CP significantly reduced the cross-linking efficiency (Extended Data Fig. 9). Furthermore, the amide bond formation activities of the CcbD R316A and CcbD R312A/R316A/R324A variants and the CcbZ-CP E40A/E41A/E47A variant were significantly decreased by 50%, 5% and 38%, respectively, while the activities of CcbD R312A and R324A were comparable to that of the WT (Extended Data Fig. 10 and Supplementary Table 4). Arg316 forms salt bridges with both Glu40 and Glu47 and a hydrogen bond interaction with Gln43, which is the reason why Arg316 is the most important residue for the activity. These results clearly indicated that the salt bridge and hydrophilic interactions between CcbD and CcbZ-CP are important for CP recognition. The sequence alignment between LmbN-CP and CcbZ-CP (68% identity) revealed that the hydrophilic and hydrophobic residues, observed in the interactions between CcbD and CcbZ-CP, are conserved except for Leu60. This would be the reason why CcbD accepts LmbN-CP-tethered PPL in addition to its natural substrate ( Supplementary Fig. 6).

Discussion
Amide bond formation is one of the most ubiquitous reactions in nature, and a variety of amide bond-forming enzyme families have been identified, including the condensation (C)-domain of NRPS, ATP-dependent amide bond synthetase, N-acyltransferase, aminoacyl-tRNA synthetase, PCS and ATP-grasp enzymes 35 . Among them, the CP-dependent amide bond formation is known only in the C-domains of NRPSs and some Gcn5-related N-acetyltransferase (GNAT) enzymes [35][36][37][38] . The structure of the NRPS C-domain consists of two domains that form V-shaped pseudo-dimers ( Supplementary Fig. 1c) Fig. 1h) 42 .
In contrast, the structure of the N-terminal domain of CcbD is weakly related to those of cysteine proteases and PCSs. The active site of CcbD is constructed on the dimer interface, while those of other cysteine proteases and PCSs are formed in the monomer cavity 26 . The CcbD-specific C-terminal α-helix domain is involved in dimerization, active site formation and interactions with CP, mainly through hydrophilic interactions and salt bridges. Furthermore, the C-terminal α-helix domain in the adjoining monomer also forms the active site and interacts with the 4′-phosphopantetheine.
Regardless of the lack of sequence similarity and the differences in active site architecture, CcbD has a Cys17-His131-Glu148 catalytic triad similar to the catalytic triads of cysteine proteases and PCSs 26-30,43-45. In the reaction of PCSs, which catalyse the deglycination of glutathione and the condensation between γ-Glu-Cys and glutathione to generate phytochelatins, the catalytic Cys residue is initially acylated by the first glutathione to form an enzyme-associated γ-Glu-Cys. Then, the amino group of another glutathione attacks the thioester bond of the enzyme-γ-Glu-Cys intermediate to generate phytochelatins. Similarly, chemoenzymatic peptide synthesis by cysteine proteases has been demonstrated as the reverse reaction of hydrolysis, through thermodynamically or kinetically controlled synthesis 46,47. In these peptide syntheses, the acyl donor substrate, activated with an ester, amide or nitrile bond, is attacked by a nucleophilic cysteine residue to form a thioester bond with the protease. Subsequently, the thioester bond is cleaved by the amino acid substrate to generate the amide bond.
Considering the similarity of the catalytic residues, CcbD is expected to share a catalytic mechanism comparable to those of cysteine proteases and PCSs. On the basis of these observations, we propose the following mechanisms for the condensation between thiooctose and CP-tethered Pro/PPL (Fig. 6). The enzyme reaction is initiated by the binding of the carbonyl group of Pro in the CP-tethered substrate into the oxyanion hole of the active site, through interactions with the backbone of Cys17 and the imidazole ring of His150. The nucleophilic attack from Cys17 to the thioester bond in the Pro-connected CP then generates the CcbD-Pro complex, with the release of CP. Subsequently, the thiooctose binds to the active site of the CcbD-Pro complex, and the thioester at Cys17 is attacked by the amino group of thiooctose, with nucleophilicity enhanced by the proximity of His131 and Glu148, to generate an amide bond. The high sequence similarity (56% identity) and the conservation of the active site architectures between CcbD and LmbD suggest that the reaction mechanism of LmbD is the same as that of CcbD (Fig. 5). Thus, the CcbD/LmbD-catalysed condensation reaction between the amino sugar substrate and Pro/PPL is structurally and mechanistically different from those of the NRPS system but resembles those of cysteine proteases and PCSs.
In conclusion, our structural analysis of CcbD revealed that CcbD/LmbD catalyses the formation of the pharmacophore of lincosamide antibiotics via a mechanism that is structurally and mechanistically significantly distinct from the previously identified CP-dependent amide bond-forming enzymes, and thus provided insights into the diversity of amide bond-forming reactions in nature. Future structure-based engineering of CcbD/LmbD will develop the enzymes as biocatalysts to generate unnatural lincosamides with various acyl and sugar moieties for future drug discovery.

General
Oligonucleotide primers (Supplementary Table 5 and Supplementary Data 1) and DNA sequencing services were provided by Eurofins Genomics. The restriction enzymes and PrimeSTAR GXL DNA polymerase were purchased from Takara Bio. Solvents and chemicals were purchased from Wako Chemicals, Merck KGaA and Hampton Research, unless noted otherwise. PCR was performed using a TaKaRa PCR Thermal Cycler Dice Gradient (Takara Bio). The nuclear magnetic resonance spectra of compounds were recorded on ECX-500 MHz (JEOL) spectrometers.

Construction of plasmids for CcbD expression
The ccbD gene was PCR amplified from the chromosomal DNA of the celesticetin-producing type strain Streptomyces caelestis ATCC 15084, using the primers Fw: CCGCATATGGCCCAATCCAAGGGTTCGGTTGAT and Rv: CCGCTCGAGGAGTTCCTTGAGCAATCGCCG. The ccbD gene was inserted into the pET42b vector (Novagen) via the NdeI and XhoI restriction sites. The resulting plasmid was used to produce C-terminally His₈-tagged CcbD. The plasmid for CcbC expression was used as previously reported 17.
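The two primers carry the cloning sites named above. As a small illustrative check, assuming the standard recognition sequences CATATG (NdeI) and CTCGAG (XhoI), the following sketch locates each site in the primer sequences given in the text (the helper name `find_sites` is hypothetical):

```python
# Standard recognition sequences for the two enzymes named in the text
SITES = {"NdeI": "CATATG", "XhoI": "CTCGAG"}

# Primer sequences copied from the text (5' to 3')
FW = "CCGCATATGGCCCAATCCAAGGGTTCGGTTGAT"
RV = "CCGCTCGAGGAGTTCCTTGAGCAATCGCCG"


def find_sites(primer):
    """Map each enzyme whose site occurs in `primer` to its 0-based start index."""
    return {name: primer.find(seq) for name, seq in SITES.items() if seq in primer}


print("Fw:", find_sites(FW))
print("Rv:", find_sites(RV))
```

In both primers the site follows a three-nucleotide 5′ overhang (CCG), and each primer contains only its own enzyme's site.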

Construction of plasmids for expression of CcbD and CcbZ-CP variants
The primers used for the construction of plasmids for site-directed mutagenesis studies are listed in Supplementary Table 5. The plasmid for the expression of WT CcbD or WT CcbZ-CP was used as the template for PCR-based site-directed mutagenesis, which was performed with a QuikChange Site-Directed Mutagenesis Kit (Stratagene) according to the manufacturer's protocol.

Expression and purification of CcbD, CcbZ-CP, holo-CcbZ-CP and their variants
The pET42a plasmids for the expression of CcbD and its variants were transformed into Escherichia coli BL21(DE3). The resulting strains were cultured in LB medium supplemented with 34 mg l⁻¹ chloramphenicol and 50 mg l⁻¹ kanamycin sodium at 37 °C, with shaking at 160 rpm. Plasmids for the expression of CcbZ-CP and its mutants and sfp were transformed into E. coli BL21(DE3) or E. coli BLR. The resulting strains were cultured in LB medium supplemented with 100 mg l⁻¹ ampicillin sodium and 50 mg l⁻¹ kanamycin sodium, respectively, at 37 °C, with shaking at 160 rpm. When the OD₆₀₀ reached 0.6, the cell cultures were cooled on ice for 30 min, and then isopropyl β-d-1-thiogalactopyranoside (0.3 mM) was added to induce the target protein expression and the cultures were continued at 16 °C, 160 rpm. After 18 h of post-induction incubation, cells were collected by centrifugation at 5,500g for 10 min and suspended in lysis buffer, containing 20 mM Tris-HCl (pH 8.0), 100 mM NaCl, 5 mM imidazole and 5% glycerol. The cell suspension was sonicated for 5 min on ice. After the cell debris was removed by centrifugation at 20,000g for 30 min, the supernatant was mixed with 1 ml Ni-NTA agarose resin and loaded onto a gravity flow column. Unbound proteins were removed with 50 ml lysis buffer containing 30 mM imidazole, and then the His-tagged protein was eluted with lysis buffer containing 300 mM imidazole. Buffers for CcbD H16A, C17A, D114A, H131A, E148A, H150A, R263A, D277A and E278A purification were supplemented with 1 mM dithiothreitol. For the in vitro assay, the eluates of CcbD and its variant proteins were concentrated to 10 mg ml⁻¹ after the removal of imidazole, using a 30 kDa Amicon Ultra-15 filtration unit (Merck Millipore). The eluates containing holo-CcbZ-CP and its variant proteins were concentrated to 20 mg ml⁻¹ after the imidazole was removed, using a 3 kDa or 10 kDa Amicon Ultra-15 filtration unit (Millipore).
For crystallization, the eluate from the Ni-NTA agarose was applied to a 6 ml RESOURCE Q anion exchange chromatography column (4 °C, Cytiva) and a HiLoad 16/60 Superdex 200 pre-packed gel filtration column (4 °C, Cytiva), and eluted with a solution containing 20 mM Tris-HCl (pH 8.0), 100 mM NaCl, 5% glycerol and 1 mM dithiothreitol. The resulting eluate was concentrated to 15 mg ml⁻¹, using an Amicon Ultra-4 (molecular weight cut-off 30 kDa) filter at 4 °C. The purity of the proteins was monitored by SDS-PAGE, and the protein concentrations were determined with a SimpliNano microvolume spectrophotometer.

In vitro assays of CcbD and its variants
For the standard enzymatic reaction of CcbD, a 50 μl reaction mixture containing 50 mM Tris-HCl (pH 7.5), 2 mM l-proline, 2 mM dithiothreitol, 5 mM ATP (pH 7.5), 10 mM MgCl₂, 2 μM CcbC, 50 μM holo-CcbZ-CP or its variants, and 2 mM substrate 5 or 8 was pre-incubated for 30 min. After this pre-incubation, 10 μM CcbD (WT or variants) was added to the reaction and incubated for 5 min at 30 °C. To measure the relative activity between 5 and 8, the consumption of substrates was calculated by liquid chromatography-mass spectrometry (LC-MS). To measure the total turnover number and conversion rate for the l-proline reaction, the reaction buffer containing 50 mM HEPES (pH 7.5), 1 mM l-proline, 1 mM 8, 5 mM ATP (pH 7.5), 5 mM MgCl₂, 20 μM CcbC and 100 μM CcbZ-CP was pre-incubated for 30 min in a 25 μl reaction mixture. After this pre-incubation, 2 μM CcbD was added to the reaction and incubated for 13 h at 30 °C. For the total turnover number and conversion rate analysis of the acyl-substrate reactions, the reaction buffer containing 5 mM MgCl₂, 100 μM acyl-CoA, 100 μM MTL, 30 μM Sfp and 100 μM apo-CcbZ-CP was pre-incubated for 30 min in a 25 μl reaction mixture. After this pre-incubation, 10 μM CcbD was added to the reaction and incubated for 13 h at 30 °C. Each sample was then subjected to LC-MS analyses and the consumption of 8 was calculated.
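The text does not spell out the formulas behind the total turnover number and conversion rate. A minimal sketch, assuming the common definitions (total turnover number as substrate consumed per enzyme, both in the same concentration units, and conversion as the consumed fraction of the initial substrate); the function names and the consumed amount below are hypothetical:

```python
def total_turnover_number(consumed_um, enzyme_um):
    """Assumed definition: substrate consumed divided by enzyme concentration."""
    return consumed_um / enzyme_um


def conversion_percent(consumed_um, initial_um):
    """Assumed definition: percentage of the initial substrate that was consumed."""
    return 100.0 * consumed_um / initial_um


# Upper bounds implied by the stated concentrations (full consumption of 8):
max_ttn_proline = total_turnover_number(1000.0, 2.0)  # 1 mM of 8, 2 uM CcbD
max_ttn_acyl = total_turnover_number(100.0, 10.0)     # 100 uM MTL, 10 uM CcbD

# Hypothetical measurement, for illustration only: 20 uM of 8 consumed
example_ttn = total_turnover_number(20.0, 10.0)
example_conversion = conversion_percent(20.0, 100.0)

print(max_ttn_proline, max_ttn_acyl, example_ttn, example_conversion)
```

Under these assumptions the two assay formats cap the achievable turnover number at 500 and 10, respectively, so values from the two formats are not directly comparable.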

LC-MS methods used for analysing in vitro reactions
For the analysis of the CcbD WT with various substrates, LC-MS samples were injected into a Shimadzu Labsolution LCMS 8045 system. A HILICpak VG-50 2D column (Shodex) or C18-MS-II column (Nacalai tesque) was used for separation. The gradient elution was performed with solvent A (50 mM NH₄Ac) and solvent B (CH₃CN), with a flow rate of 0.

Production and purification of natural thiooctose substrate of CcbD (5)
The seed culture of S. lincolnensis lmbN_ΔCP strain 13 was prepared by inoculating spores into 50 ml of YEME medium in 500 ml flat-bottom boiling flasks, which were incubated at 28 °C for 30 h, 180 rpm. A 2 ml portion of the seed culture was then inoculated into 40 ml of AVM medium 13 in 500 ml flat-bottom boiling flasks and incubated at 28 °C for 120 h. The supernatants from 30 flat-bottom boiling flasks were used in the next steps. The cells were centrifuged at 4,000g at 4 °C for 10 min, and the supernatant was stored at −20 °C. The supernatant was adjusted to pH 2-3 with formic acid and extracted in two steps. First, Amberlite XAD-4 in a glass column was used, and the amount of the sorbent was approximately 5 cm in diameter and 10 cm in height. Methanol (MeOH) followed by 0.1% formic acid was used to equilibrate the column before loading the supernatant. The eluate, containing compounds not interacting with Amberlite and including the compound of interest, was collected. Second, MCX 35 cc (6 g) (Waters) cartridges were used. The cartridge was conditioned and equilibrated with MeOH followed by 2% formic acid, and then the collected eluate from the previous extraction was applied on the cartridge, which was washed with 2% formic acid and MeOH. Thereafter, 200 ml MeOH with 5% of an aqueous solution of ammonium hydroxide (29%) was loaded to elute the compound of interest. For further purification of the natural substrate, two-step preparative high-performance liquid chromatography was performed. The first step was carried out using a Triart Diol-HILIC column (250 × 20 mm, 5 μm particles; YMC) with a two-component mobile phase: A, 50 mM ammonium acetate, pH 4.7, and B, ACN. The flow rate was 8 ml min⁻¹ with a linear gradient (min/% of B) 0/95; 5/95; 28.6/30; and 32/95 with equilibration before the next analysis.
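The gradient of the first preparative step is given as (min/% of B) breakpoints. As an illustrative sketch, assuming the composition changes linearly between every pair of consecutive breakpoints (the text says "linear gradient" but does not state this for each segment), the mobile-phase composition at any time can be read off as follows; `percent_b` is a hypothetical helper name:

```python
# (time in min, % of B) breakpoints copied from the text
GRADIENT = [(0.0, 95.0), (5.0, 95.0), (28.6, 30.0), (32.0, 95.0)]


def percent_b(t, program=GRADIENT):
    """Percentage of solvent B at time t, by linear interpolation between breakpoints."""
    if t < program[0][0] or t > program[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


# Midpoint of the 5 to 28.6 min ramp: halfway between 95% and 30% B
print(round(percent_b(16.8), 1))
```

The same helper applies to any breakpoint list written in this min/% notation, provided the breakpoints are sorted by time.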
The second step was performed on a Luna column (C18, 250 × 15 mm, 5 μm; Phenomenex) using a two-component mobile phase, with A as 0.1% formic acid and B as MeOH. The flow rate was 3 ml min⁻¹ with a gradient of 5% to 90% B, for 70 min. The fractions with the natural thiooctose substrate were monitored using an Acquity UPLC system with a 2996 PDA detection system (194-600 nm), connected to an LCT Premier XE time-of-flight mass spectrometer (Waters). The sample (5 μl) was loaded onto the ACQUITY PREMIER BEH Amide (2.1 mm × 50 mm, 1.7 μm) LC column, maintained at 40 °C. A two-component mobile phase, B and A, containing ACN with 20 mM ammonium formate, pH 4.75, 9:1 (v/v) and ACN with 20 mM ammonium formate, pH 4.75, 1:1 (v/v), respectively, was used for separation. The elution was performed at a flow rate of 0.4 ml min⁻¹ with the following gradient (min/%B) 1/99.0; 7/1.0; 9/1.0; and 10/99.0, with 1.0 min column clean-up (99.0% B) and 2 min equilibration (99% B). The mass spectrometer was used in the positive mode (W) with the cone voltage, +40 V; the capillary voltage, +2,800 V; ion source block temperature, 120 °C; desolvation gas temperature, 350 °C; desolvation gas flow, 800 l h⁻¹; and cone gas flow, 50 l h⁻¹; with an inter-scan delay of 0.01 s and a scan time of 0.15 s. The mass accuracy was maintained using lock spray technology with leucine-enkephalin as the reference compound (2 ng μl⁻¹, 5 μl min⁻¹). The diode array detector detection technique and the same conditions were used to quantify the natural substrate. The chromatograms were extracted at the absorption maxima of the compounds, and the area of the peaks was used to construct a five-point calibration curve (R² = 0.9999) of the EGT standard (Cayman Chemical), which was used to quantify the natural thiooctose substrate (5).
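Quantification against a five-point calibration curve amounts to fitting a straight line to standard concentrations versus peak areas and inverting it for the unknown. A minimal sketch of that procedure, assuming an ordinary least-squares fit; the standard concentrations and peak areas below are hypothetical placeholders, not the measured EGT data:

```python
def fit_line(x, y):
    """Ordinary least-squares line through (x, y); returns slope, intercept and R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def quantify(peak_area, slope, intercept):
    """Invert the calibration line to obtain a concentration from a peak area."""
    return (peak_area - intercept) / slope


# Hypothetical five-point standard series (arbitrary units), for illustration only
standard_conc = [1.0, 2.0, 4.0, 8.0, 16.0]
standard_area = [10.0, 20.0, 40.0, 80.0, 160.0]

slope, intercept, r_squared = fit_line(standard_conc, standard_area)
unknown_conc = quantify(55.0, slope, intercept)
print(slope, intercept, r_squared, unknown_conc)
```

Using the EGT standard to quantify 5 additionally presumes that the two compounds give a comparable detector response, which is the authors' choice and is not examined by this sketch.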

Preparation and purification of the cross-linked CcbD-CcbZ-CP complexes
For the cross-linking assays of CcbD-CcbZ-CP and their variants, 20 μM CcbD or CcbD variant protein was mixed with 60 μM holo-CcbZ-CP or holo-CcbZ-CP variant proteins and 0.2 mM BMOE, in buffer containing 40 mM NaH₂PO₄/K₂HPO₄ (pH 7.0), 5 mM EDTA and 5% glycerol. At 10, 30 and 60 min after the reaction was initiated, a 50 μl aliquot of the reaction was removed, quenched with SDS-PAGE loading buffer and monitored by SDS-PAGE. To obtain the CcbD-CcbZPCP complex proteins, 60 μM His-tag free CcbD C249S/C216S variant was mixed with 600 μM His-tagged holo-CcbZPCP protein, in buffer containing 40 mM NaH₂PO₄/K₂HPO₄ (pH 7.0), 5 mM EDTA and 5% glycerol. The cross-linking reaction was initiated by adding 0.2 mM BMOE, and incubated on ice for 1 h. The BMOE was removed by passage through a PD-10 column, and the complexes were monitored by SDS-PAGE. The remaining CcbD variant proteins and holo-CcbZPCP proteins were removed by chromatography on nickel affinity resin and a HiLoad 16/60 Superdex 200 pre-packed gel filtration column (4 °C, Cytiva).
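For orientation, the molar ratios implied by the two cross-linking set-ups can be worked out from the stated concentrations. The sketch below uses only the concentrations given above; the function name `molar_ratios` and the dictionary keys are illustrative.

```python
def molar_ratios(ccbd_um, cp_um, bmoe_mm):
    """Return molar ratios between carrier protein (CP), CcbD and BMOE.

    Protein concentrations are in micromolar and BMOE in millimolar,
    matching the units used in the text.
    """
    bmoe_um = bmoe_mm * 1000.0  # convert mM to μM
    return {
        "CP:CcbD": cp_um / ccbd_um,
        "BMOE:CcbD": bmoe_um / ccbd_um,
        "BMOE:CP": bmoe_um / cp_um,
    }

# Analytical assay: 20 μM CcbD, 60 μM holo-CP, 0.2 mM BMOE
assay = molar_ratios(20.0, 60.0, 0.2)
# Preparative reaction: 60 μM CcbD, 600 μM holo-CP, 0.2 mM BMOE
prep = molar_ratios(60.0, 600.0, 0.2)

if __name__ == "__main__":
    for label, ratios in (("assay", assay), ("preparative", prep)):
        print(label, {key: round(value, 2) for key, value in ratios.items()})
```

With these numbers, the carrier protein is in 3-fold molar excess over CcbD in the assay and 10-fold in the preparative reaction, where BMOE is sub-stoichiometric relative to the carrier protein.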

Crystallization and structure determination
Crystals of selenomethionine-labelled CcbD were obtained after 3 days at 20 °C, while crystals of CcbD-apo and the CcbD-CcbZPCP complex were obtained after 3 days at 10 °C. All crystals were obtained by using the sitting-drop vapour-diffusion method with the following reservoir solutions: selenomethionine-labelled CcbD: 0.02 M Tris-HCl (pH 6.7), 0.16 M MgCl₂ and 10% w/v PEG 8000; and CcbD WT: 0.02 M Tris-HCl (pH 6.7), 0.16 M MgCl₂ and 10% w/v PEG 8000. The complex crystals with 8 were prepared by incubating CcbD crystals at 10 °C for 1 h with 50 mM 8 in the crystallization drop. The crystallization conditions to prepare the CcbD-CcbZ-CP complex were 0.1 M MES (pH 6.7), 0.16 M MgCl₂, 0.2 M sodium thiocyanate and 8% w/v PEG 8000. The crystals were transferred into the cryoprotectant solution (reservoir solution with 25% (v/v) glycerol), and then flash-cooled at −173 °C in a nitrogen gas stream. The X-ray diffraction datasets were collected at BL-1A (Photon Factory), using a beam wavelength of 1.1 Å. The diffraction datasets for selenomethionine-labelled CcbD, CcbD with 8 and CcbD-CcbZ-CP were processed and scaled using the XDS program package 48 and Aimless in CCP4 (ref. 49). The determination of Se sites and the generation of the initial model were performed with Crank2 in CCP4 (ref. 50). The dataset for selenomethionine-labelled CcbD was slightly twinned. Analysis with Xtriage in PHENIX 51 suggested that the correlation between the intensities related by the twin law h, -k, -h-l, with an estimated twin fraction of 0.15, is most likely due to a non-crystallographic symmetry axis parallel to the twin axis. Further processing was carried out without detwinning. The initial phases of CcbD with 8 and the CcbD-CcbZ-CP structures were determined by molecular replacement, using the selenomethionine-labelled CcbD structure as the search model. Molecular replacement was performed with Phaser 52 in PHENIX 51.
The initial phases were further calculated with AutoBuild in PHENIX 51. The structures were modified manually with Coot 53 and refined with PHENIX.refine 54. The cif parameters of 8 for the energy minimization calculations were obtained by using the PRODRG server 55. The final crystal data and intensity statistics are summarized in Supplementary Table 3. The Ramachandran statistics are as follows: 97.7% favoured and 2.3% allowed for selenomethionine-labelled CcbD; 98.0% favoured, 1.9% allowed and 0.1% outliers for CcbD with 8; and 98.4% favoured and 1.6% allowed for CcbD-CcbZ-CP. A structural similarity search was performed using the Dali program server 56. All crystallographic figures were prepared with PyMOL (DeLano Scientific, http://www.pymol.org). For the docking model with 5, initial docking models were constructed using the AutoDock Vina plugin in UCSF Chimera 57. Then, the conformation of the ligands was manually modified with Coot, on the basis of the binding mode of 8 in the complex structure of CcbD/8, to avoid close contacts between the ligands and the active site residues.

Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability
The data supporting the findings of this study are available within this article and its Supplementary Information files and source files or from the corresponding authors on request. The crystallographic data that support the findings of this study are available from the Protein Data Bank (http://www.rcsb.org). The coordinates and the structure factor amplitudes for the selenomethionine-labelled CcbD and the complex structures of CcbD with 8 and CcbD with CcbZ-CP were deposited under accession codes 7YN1 (ref. 58), 7YN2 (ref. 59) and 7YN3 (ref. 60), respectively. Source data are provided with this paper.