Targeted genome mining for microbial antitumor agents acting through DNA intercalation

Microbial natural products have been one of the most important sources for drug development. In the current postgenomic era, sequence-driven approaches for natural product discovery are becoming increasingly popular. Here, we develop an effective genome mining strategy for the targeted discovery of microbial metabolites with antitumor activities. Our method employs uvrA-like genes as genetic markers; these genes have been identified in the biosynthetic gene clusters (BGCs) of several chemotherapeutic drugs of microbial origin and confer self-resistance on the corresponding producers. Through systematic genomic analysis of gifted actinobacterial genera, identification of uvrA-like gene-containing BGCs, and targeted isolation of the products of a BGC prioritized for metabolic analysis, we identified the timmycins, new tetracycline-type DNA intercalators. Our results thus provide a new genome mining strategy for the efficient discovery of antitumor agents acting through DNA intercalation.


Introduction
Microbial specialized metabolites are rich sources of antibiotics and clinically used drugs [1][2][3]. These small molecules disrupt essential cellular processes through various modes of action. Historically, compounds of interest were isolated through large-scale screening of microbial culture extracts followed by activity tracking based on specific bioactivity assays. However, this traditional bioactivity-guided strategy is becoming less efficient due to the high rate of rediscovery of known compounds. In recent years, great efforts have been made to elucidate the biosynthetic pathways of microbial specialized metabolites with diversified structures and activities [4,5]. These studies not only revealed novel enzymology involved in the microbial specialized metabolism but also significantly increased our understanding of the self-resistance mechanisms of the producers, as it was found that self-resistance genes are frequently clustered with biosynthetic genes [6]. This co-localization property facilitates the identification of biosynthetic genes of a given microbial metabolite and offers genome mining approaches for the discovery of new compounds with predictable modes of action in the postgenomic era. Indeed, several self-resistance-guided strategies for identifying new bioactive compounds have been reported [7][8][9][10][11][12].
One family of bacteria-derived antitumor and antimicrobial agents is the DNA intercalators, which insert into the DNA double helix through noncovalent interactions, leading to DNA structural changes and replication arrest [13,14]. Molecules from this class include the chemotherapeutic anthracycline doxorubicin, the enediyne C-1027, the depsipeptide thiocoraline, and the aureolic acid mithramycin [14]. These molecules are assembled by different enzymatic machinery and are structurally distinct from each other. It was found that the biosynthetic gene clusters (BGCs) of these compounds encode UvrA-like proteins that can confer self-protection on the producers. Examples include DrrC from the daunorubicin BGC [15], Ecm16 from the echinomycin BGC [16], and MtrX from the mithramycin BGC [17] (Fig. 1). Canonical UvrA is part of the UvrABCD nucleotide excision repair (NER) system [18]. UvrA participates in the initial ATP-dependent DNA scanning and damage recognition through direct UvrA-DNA interactions. This is followed by the recruitment of UvrB and UvrC for nucleotide excision. Although the detailed self-resistance mechanism of the UvrA-like protein family is still not fully understood, preliminary studies of DrrC and Ecm16 suggested that they also depend on ATP for their normal functions [19]. However, unlike canonical UvrA, they lack a UvrB-binding domain and can confer self-protection independently of the host UvrABC system.
Peer review under responsibility of KeAi Communications Co., Ltd.
In this study, we explored the potential of the uvrA-like gene as a genetic marker for the targeted isolation of new anticancer and antimicrobial agents through genome mining. Through systematic genome analysis of gifted Actinobacteria genera, prediction and classification of uvrA-like gene-containing BGCs, and metabolic analysis of a strain prioritized for target compound isolation, we identified the timmycins, new tetracycline-type DNA intercalators. Together, our results provide a new genome mining strategy for the efficient isolation of antibacterial and antitumor agents acting through DNA intercalation.

Gene inactivation in Actinomadura strain
To construct a gene knockout mutant of Actinomadura sp. ATCC 31491, two ~3 kb homologous arms flanking the targeted region were amplified by PCR using genomic DNA of Actinomadura sp. ATCC 31491 as the template. These segments were assembled into linearized pYD69 by seamless cloning [20]. After confirmation by DNA sequencing, the resulting vector was introduced into the methylation-deficient E. coli ET12567/pUZ8002 for E. coli-Actinomadura conjugation. Exconjugants were obtained after selection for apramycin resistance. After several rounds of nonselective growth, replica plating was used to identify apramycin-sensitive colonies, which were then screened by PCR for gene knockout mutants (Fig. S1).

Metabolic analysis of Actinomadura strains
For the metabolic analysis, Actinomadura sp. ATCC 31491 and its mutant strain were grown on MSF plates at 30 °C for 10 days until sporulation. The MSF agar plates were then sliced into small pieces, extracted with ethyl acetate, and subjected to HPLC analysis. Analytical HPLC was carried out on an Agilent 1260 HPLC apparatus using an Eclipse EC-C18 column (5 μm, 4.6 mm ID × 150 mm) with detection at 280 nm. Elution was performed at 1.0 mL/min with a linear gradient of water and acetonitrile (both containing 0.05% (v/v) formic acid) as follows: 30%-90% acetonitrile in 18 min, 90%-100% acetonitrile in 2 min, and then 100% acetonitrile for 3 min.
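The gradient program described above can be written as a piecewise-linear function of run time. The following is only a minimal sketch: it assumes that the three segments run back to back starting at 0 min, and the function name `percent_acetonitrile` is introduced here for illustration.

```python
def percent_acetonitrile(t_min: float) -> float:
    """Return the % acetonitrile at run time t_min (minutes).

    Assumed breakpoints (segments taken as consecutive from t = 0):
    30% -> 90% over 18 min, 90% -> 100% over 2 min, hold at 100% for 3 min.
    """
    points = [(0.0, 30.0), (18.0, 90.0), (20.0, 100.0), (23.0, 100.0)]
    if t_min < points[0][0] or t_min > points[-1][0]:
        raise ValueError("time is outside the 0-23 min gradient program")
    # Find the segment containing t_min and interpolate linearly within it.
    for (t0, b0), (t1, b1) in zip(points, points[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole range")
```

Such a function is convenient for estimating the solvent composition at which a given peak elutes, keeping in mind that the system dwell volume delays the gradient actually reaching the column.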

Compound isolation and structural elucidation
For compound isolation, the ethyl acetate extract of Actinomadura sp. ATCC 31491 grown on MSF agar was fractionated on a Sephadex LH-20 column with MeOH elution. After HPLC analysis, the fractions containing metabolites of interest were combined, and the target compounds were purified from these fractions by reversed-phase semipreparative HPLC (YMC-Triart C18, 5 μm, 10 mm ID × 250 mm; UV detection at 280 nm). Compounds 1 (40 mg, tR = 33.0 min) and 2 (30.1 mg, tR = 45.0 min) were purified with 52% (acetonitrile + 0.05% FA)/(H2O + 0.05% FA) at 4 mL/min. The 1H, 13C, and 2D NMR spectra were recorded on a Bruker AV-600 MHz spectrometer using CDCl3 as the solvent.

Antibacterial and antitumor activity assays
S. aureus subsp. aureus Rosenbach ATCC 29213, E. faecalis ATCC 19433, and P. aeruginosa ATCC 27853 were used for the antibacterial assays. Briefly, seed cultures were initiated by inoculating 5 mL of LB and were grown overnight at 37 °C and 220 rpm. Each culture was then diluted to an initial OD600 of 0.02 in a volume of 100 μL per well in a 96-well plate, which gave an inoculum of approximately 5 × 10⁵ to 5 × [...] [...]yl)-2,5-diphenyltetrazolium bromide) (0.83 mg/mL) solution was added to each well [21], and the plates were returned to the incubator for 4 h. After removal of the solution from each well, 150 μL of DMSO was added to lyse the cells and solubilize the purple crystals. The samples were read on a microplate detector at 490 nm. The results were processed using GraphPad Prism 7 software.
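The dilution of an overnight culture to an initial OD600 of 0.02 in 100 μL follows from C1V1 = C2V2. The sketch below is illustrative only: it assumes that OD600 scales linearly with cell density over the range used, and the overnight OD600 value in the usage note is a hypothetical number, not one reported here.

```python
from typing import Tuple


def dilution_volumes(od_stock: float,
                     od_target: float = 0.02,
                     final_volume_ul: float = 100.0) -> Tuple[float, float]:
    """Return (culture volume, medium volume) in microliters.

    Uses C1 * V1 = C2 * V2, assuming OD600 is proportional to cell density.
    """
    if od_stock <= 0 or od_target <= 0:
        raise ValueError("OD600 values must be positive")
    if od_target > od_stock:
        raise ValueError("target OD600 exceeds the OD600 of the stock culture")
    culture_ul = od_target * final_volume_ul / od_stock
    return culture_ul, final_volume_ul - culture_ul
```

For a hypothetical overnight culture with an OD600 of 2.0, `dilution_volumes(2.0)` gives about 1 μL of culture and 99 μL of medium per well. In practice an intermediate dilution is usually prepared, because volumes this small are hard to pipette accurately.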

Genome mining of BGCs with uvrA-like genes
To explore the distribution of UvrA-like proteins in actinobacteria known as gifted natural product producers, we first collected the sequences of six UvrA-like proteins from the BGCs of known antitumor compounds: DrrC (NCBI accession number: AAB39274), QncO (AGD95041), MtrX (CAK50795), SgcB2 (ANY94424), Ecm16 (BAE98165), and Luz28 (UKU0991). We then used these sequences as queries for tblastn searches against the RefSeq genomes (NCBI database, July 2022) of selected Actinobacteria taxa (Actinomycetales, Frankiales, Micromonosporales, Streptomycetales, and Streptosporangiales) [22]. Homologous proteins with sequence identity ≥30% and alignment length >600 amino acids were collected. After dereplication, 3820 actinobacterial genomes containing approximately six thousand UvrA/UvrA-like proteins were recovered. Next, the 40 kb regions upstream and downstream of each uvrA/uvrA-like gene were collected and subjected to antiSMASH 6.0 analysis to detect putative BGCs [23].
These efforts resulted in 642 BGCs in total.
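The hit filtering and flanking-region selection described above can be sketched as follows. This is not the pipeline used in the study. It assumes that the tblastn results were saved in the default 12-column BLAST tabular format (`-outfmt 6`), and the names `Hit`, `parse_hits`, `passes_thresholds`, and `flanking_window`, as well as the example rows, are hypothetical. Dereplication and the antiSMASH step are not shown.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple


class Hit(NamedTuple):
    query: str       # query protein, e.g. DrrC
    subject: str     # subject contig or genome record
    identity: float  # percent identity (column 3)
    aln_len: int     # alignment length in residues (column 4)
    s_start: int     # subject start, nucleotide coordinate (column 9)
    s_end: int       # subject end, nucleotide coordinate (column 10)


def parse_hits(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse default BLAST tabular rows, skipping blank and comment lines."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        f = line.rstrip("\n").split("\t")
        yield Hit(f[0], f[1], float(f[2]), int(f[3]), int(f[8]), int(f[9]))


def passes_thresholds(hit: Hit, min_identity: float = 30.0,
                      min_len: int = 600) -> bool:
    """Keep hits with identity >= 30% and alignment length > 600 residues."""
    return hit.identity >= min_identity and hit.aln_len > min_len


def flanking_window(hit: Hit, contig_len: int,
                    flank: int = 40_000) -> Tuple[int, int]:
    """Return 1-based coordinates of the hit plus 40 kb on each side.

    Minus-strand hits have s_start > s_end, so the coordinates are sorted
    first; the window is clipped to the ends of the contig.
    """
    lo, hi = sorted((hit.s_start, hit.s_end))
    return max(1, lo - flank), min(contig_len, hi + flank)


# Hypothetical rows, for illustration only (not data from the study).
EXAMPLE_ROWS = [
    "DrrC\tcontigA\t45.2\t750\t0\t0\t1\t750\t50000\t52250\t0.0\t600",
    "DrrC\tcontigB\t28.0\t760\t0\t0\t1\t760\t1000\t3280\t1e-50\t200",
    "MtrX\tcontigC\t41.0\t600\t0\t0\t1\t600\t5000\t6800\t1e-90\t350",
    "MtrX\tcontigD\t52.5\t790\t0\t0\t1\t790\t30000\t27631\t0.0\t800",
]
kept = [h for h in parse_hits(EXAMPLE_ROWS) if passes_thresholds(h)]
```

In this sketch the contigB row is rejected on identity and the contigC row on alignment length, since the length cutoff is strict. The windows returned by `flanking_window` would then be cut from the genome sequences and passed to antiSMASH.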
We then used the BiG-SCAPE tool to construct sequence similarity networks of these BGCs and to group them into eight classes: PKSs, other PKSs, NRPSs, PKS-NRPS hybrids, RiPPs, saccharides, terpenes, and others [24] (Fig. 2). In the present study, we focused on the saccharide class. All BGCs from this class contain type II PKS and glycosyltransferase genes and therefore potentially encode aromatic polyketides decorated with saccharide residues. The BGCs of several characterized compounds fall into this category, including those for mithramycin, chromomycin A3, cytorhodin Y, and cosmomycin C [25][26][27][28]. Among them, mithramycin has been approved for clinical anticancer use [25]. The carbohydrate moieties have been shown to be essential for the biological properties of these molecules.

In silico analysis of the tim BGC that is prioritized for further study
Next, we used CORASON to construct a phylogenetic tree of the UvrA-like protein-containing BGCs from the saccharide class [24] (Fig. 3). One BGC from the strain Actinomadura sp. ATCC 31491 caught our attention and was prioritized for metabolite characterization (Fig. 3). This BGC (named here the tim BGC) is phylogenetically related to the BGCs of the aureolic acids mithramycin and chromomycin, but it falls into an isolated clade. Comparison of the tim BGC with those of mithramycin (mtm BGC) and chromomycin (cmm BGC) revealed that the tim BGC lacks a pair of genes (mtmOIV/mtmW, cmmOIV/cmmW) that are conserved in the mtm and cmm BGCs (Fig. 4a, Table S4). These genes encode enzymes responsible for the oxidative cleavage of the fourth ring of a tetracyclic precursor, which yields a tricyclic structure. In addition, the tim BGC encodes a putative amidotransferase (TimD), a homolog of which has been demonstrated to catalyze the transamination of malonate to malonamate in the biosynthesis of oxytetracycline, thereby introducing an amide unit at one terminus of the polyketide backbone [29]. The absence of the ring-cleavage genes and the presence of timD in the tim BGC suggested that the corresponding product possesses a tetracycline-type amidated aglycone.

Identification and structure elucidation of timmycins
To identify the product of the tim BGC, we first generated a mutant strain of Actinomadura sp. ATCC 31491 by deleting the timA and timB genes, which encode the type II PKS ketoacyl synthases KSα and KSβ, respectively. We then compared the metabolic profile of strain ΔtimAB with that of the parental strain ATCC 31491 by HPLC analysis of their culture extracts. After screening different culture media, we identified a set of peaks that were present in the culture extract of the parental strain but absent from that of strain ΔtimAB (Fig. 4b). We speculated that these peaks, which displayed similar UV-vis spectra, are structural analogs derived from the tim BGC.
To characterize these compounds structurally, we isolated the two major components (1 and 2) (Fig. S2b). Their structures were elucidated using a combination of 1D and 2D NMR analysis (Fig. 4c and Figs. S3-S6). Timmycins contain a tetracycline-type aglycone substituted with two oligosaccharide side chains: a disaccharide attached to C-8 and a trisaccharide attached to C-12a of the aglycone (Fig. 4c).

Proposed biosynthetic pathway of timmycins
Based on the established knowledge of the biosynthetic pathways of mithramycin/chromomycin and other aromatic polyketides [25,26], we proposed a biosynthetic route to the timmycins (Fig. 5). The tim BGC encodes the enzymes necessary for the construction of an amidated tetracyclic core. These include KSα (TimA), KSβ (TimB), acyl carrier protein (TimC), amidotransferase (TimD), aromatase (TimQ), and cyclases (TimY and TimX). The oxygenases TimOI and TimOII are suggested to introduce hydroxyl groups into the polyketide core. The putative C-methyltransferase TimMII and the two O-methyltransferases TimMI and TimMIII could be responsible for the three methylation steps occurring on the timmycin aglycone. For the subsequent glycosylation steps, the four glycosyltransferases (TimGI/GII/GIII/GIV) are expected to catalyze the formation of five glycosidic bonds, analogous to those of the mithramycin and chromomycin pathways. More specifically, TimGIV and TimGIII, which are similar to CmmGIV (57% identity) and CmmGIII (53% identity) from the chromomycin pathway (Table S4), are assumed to sequentially transfer sugar C (by TimGIV), sugar D (by TimGIII), and sugar E (by TimGIV) to the aglycone, whereas TimGI and TimGII, which are more closely related to CmmGI (53% identity) and CmmGII (52% identity), are suggested to install sugars A and B, respectively. As the last steps in timmycin biosynthesis, the sugar O-methyltransferase TimF and the two acyltransferases TimKI and TimKII could further tailor the sugar moieties through methylation and acylation.

Biological activities of timmycins
We next tested the antibacterial and cytotoxic activities of the timmycins (Tables 1 and 2). Both 1 and 2 have strong antibacterial activity against Gram-positive bacteria, including Staphylococcus aureus subsp. aureus ATCC 29213 and Enterococcus faecalis ATCC 19433 (Table 1). No antifungal activity or activity against Gram-negative bacteria was observed at the concentrations tested. Compound 1 displays cytotoxicity against cancer cell lines (A375, HCT116, and HeLa) in the submicromolar range (Table 2).
To determine the ability of 1 to interact with DNA, we performed a gel mobility retardation assay by incubating 1 with a 0.7 kb genomic DNA fragment randomly selected from the genome of Actinomadura sp. ATCC 31491 (Fig. S8). We found that 1 affects the mobility of the DNA fragment, indicating that 1 can bind DNA. Next, we investigated the role of the UvrA-like protein TmrX in self-resistance. We introduced tmrX into the heterologous host Streptomyces albus J1074. A disk diffusion test was then used to evaluate the sensitivity of the resulting strain, S. albus + tmrX, to 1 (Fig. S9). Compared with the control strain S. albus, S. albus + tmrX showed significantly increased resistance to 1. Together, these results support the conclusion that 1 exerts its antibacterial and cytotoxic action by interacting with DNA.

Conclusion
In this study, we developed a new genome mining strategy for the targeted discovery of bacteria-derived antitumor agents. This method is based on the previous observation that the biosynthetic gene clusters of natural DNA intercalators encode a family of UvrA-like proteins for self-resistance, which indicates the potential of the uvrA-like gene as a genetic marker to guide the identification of BGCs for new DNA intercalators. As a proof of concept, we successfully isolated a new tetracycline-type DNA intercalator, timmycin, from an Actinomadura strain. Although self-resistance gene-guided genome mining approaches have been described previously, this is the first time that the uvrA-like gene family has been used for the targeted isolation of microbial bioactive metabolites. Considering that hundreds of new uvrA-like gene-containing BGCs were identified in this study, our genome mining strategy has great potential for the discovery of novel anticancer drug candidates.

Declaration of competing interest
The authors declare that they have no conflicts of interest.