Biochemical basis for the regulation of biosynthesis of antiparasitics by bacterial hormones

Diffusible small molecule microbial hormones drastically alter the expression profiles of antibiotics and other drugs in actinobacteria. For example, avenolide (a butenolide) regulates the production of avermectin, derivatives of which are used in the treatment of river blindness and other parasitic diseases. Butenolides and γ-butyrolactones control the production of pharmaceutically important secondary metabolites by binding to TetR family transcriptional repressors. Here, we describe a concise, 22-step synthetic strategy for the production of avenolide. We present crystal structures of the butenolide receptor AvaR1 in isolation and in complex with avenolide, as well as those of AvaR1 bound to an oligonucleotide derived from its operator. Biochemical studies guided by the co-crystal structures enable the identification of 90 new actinobacteria that may be regulated by butenolides, two of which are experimentally verified. These studies provide a foundation for understanding the regulation of microbial secondary metabolite production, which may be exploited for the discovery and production of novel medicines.


Introduction
The rise of drug-resistant pathogens continues to compromise human health and is exacerbated by the decline in the rate of discovery of new anti-infectives (Aminov, 2010;Tenover, 2006). A major limitation is the lack of tools that enable access to the vast library of bacterial natural product antibiotics. Statistical surveys depict the number of antibiotics that are genetically encoded within the Streptomyces genus to be in excess of~300,000 new molecules, but a large repertoire of these compounds cannot be produced when the strain is grown under standard laboratory conditions (Govern, 2013;Watve et al., 2001). The responsible biosynthetic genes are 'silent' under laboratory conditions and are regulated via unknown mechanisms (Arakawa, 2018;Tyurin et al., 2018).
The diffusible small molecule g-butyrolactone (GBL) A-factor plays an essential role in the biosynthesis of the antibiotic streptomycin in Streptomyces gresius. The intracellular target of A-factor has been identified as a member of the TetR family, and these receptors have been shown to regulate antibiotic biosynthesis in several actinobacterial species ( Figure 1A; Tyurin et al., 2018;Takano, 2006;Choi et al., 2003;Onaka et al., 1995). The use of exogenous GBLs has been shown to induce secondary metabolite production from otherwise silent clusters (Thao et al., 2017;Sidda and Corre, 2012). However, the alkali labile nature of g-butyrolactones and the pleiotropic nature of GBL-mediated regulation limit the general use of these hormones.
Related classes of bacterial hormones include the 2-alkyl-3-methyl-4-hydroxybutenolides, the 2alkyl-4-hydroxymethylfuran-3-carboxylic acid, and the alkylbutenolides ( Figure 1B). Butenolides, such as avenolide ( Figure 1B), trigger the production of secondary metabolites with a minimum effective concentration in the low nanomolar range in Streptomyces avermitilis (Corre et al., 2010;Arakawa et al., 2012). Notably, butenolides show greater pH stability and generally regulate fewer processes than g-butyrolactones. The recent discovery of avenolide activity in about 24% of observed actinomycetes (n = 51), suggests that other active actinobacteria can also produce avenolide-like compounds to regulate secondary metabolism (Thao et al., 2017). For example, avenolide regulates the production of the anthelminitic compound avermectin Miller et al., 1979). Ivermectin, a chemical derivative of avermectin, is on the World Health Organization's list of essential medicines, and has lowered the incidence of otherwise untreatable parasitic infections, including river blindness, strongyloidiasis, and lymphatic filariasis (elephantiasis) (Chen et al., 2016;Callaway and Cyranoski, 2015).
Enzymatic and synthetic routes towards the production of g-butyrolactones have been described, but access to butenolides has been restrictive (Kato et al., 2007)., (Seitz and Reiser, 2005) Here, we describe a concise 22-step convergent route towards the total synthesis of avenolide, enabling biochemical and biophysical characterization of its interaction with the AvaR1 receptor. We also present structures of AvaR1, in isolation (2.4 Å resolution), in complex with avenolide (2.0 Å resolution), and bound to a synthetic DNA oligonucleotide (3.09 Å resolution) derived from its natural binding site. Using the primary sequence of AvaR1 and the synteny of genes that are likely to be involved in butenolide biosynthesis, we identify 89 additional putative butenolide receptors. Mapping of residues at the ligand-binding site that form conserved sequences highlights their importance in ligand activation. The identification of these putative avenolide-responsive strains may enable the production of novel metabolites in the presence of the hormone. As proof of principle, we show that the supplementation of synthetic avenolide into growing cultures of two strains that contain homologous receptors results in visible changes to the production media. eLife digest Bacteria that dwell in soil known as actinobacteria are the source of many drugs that are used to treat cancer and infectious diseases in humans. In their natural environments actinobacteria produce these drugs, or at least similar compounds, to compete with neighboring microbes for food or to kill their enemies. However, when researchers culture actinobacteria in the laboratory, the bacteria often produce little or none of these compounds.
Some actinobacteria produce a compound called avermectin. This compound is closely related to a drug used to treat an infectious disease known as river blindness, which is a common cause of sight loss in people living in West Africa. A bacterial hormone known as avenolide regulates when the bacteria produce avermectin by binding to a receptor known as AvaR1. But the precise details of how this process works remained unclear.
To investigate how avenolide binds to AvaR1, Kapoor, Olivares and Nair developed a new strategy to produce large quantities of avenolide in the laboratory from commercially available molecules. A three-dimensional structure of AvaR1 was then generated showing the receptor on its own, bound to avenolide or bound to a short DNA molecule. In the absence of avenolide, AvaR1 sits on DNA. However, binding to avenolide causes AvaR1 to move off the DNA. This revealed how the binding of avenolide changes the receptor protein so that it can be released from DNA to allow the production and release of other small molecule compounds. Further experiments used these structures as guides to identify 90 new species of actinobacteria that may respond to avenolide and other similar bacterial hormones.
Understanding how bacterial hormones stimulate actinobacteria to produce avermectin and other compounds will aid efforts to extract new compounds from soil bacteria that have the potential to treat cancer or infectious diseases.

Results and discussion
Convergent synthesis of (4S, 10R)-avenolide and characterization of binding to AvaR1 We hypothesized that a detailed understanding of the mechanism of regulation of secondary metabolite production by avenolide would benefit from an understanding of the regulatory mechanism. Moreover, as AvaR1 shares~40% sequence identity with the g-butyrolactone receptor ArpA, biochemical studies of AvaR1 may inform on other classes of bacterial hormone receptors while also providing the rationale for ligand specificity amongst these different classes.
The inherent temperature, acidic and alkali stability of butenolides over g-butyrolactones prompted efforts towards the large-scale production of avenolide for biochemical studies. However, little is known about the biosynthetic pathways that elaborate butenolides, as opposed to canonical GBLs that can be produced enzymatically from commercially available precursors. To this end, we undertook a novel retro-synthetic strategy produce avenolide from the convergent synthesis of three key fragments: the iodide (11), the aldehyde (12) and epoxy (10) ( Figure 1C).We followed the protocol reported by Uchida et al., 2011, which begins with commercially available 1-methyl-2-butene. A Sharpless asymmetric dihydroxylation (Kolb et al., 1994) produced the diol intermediate (1) in 82% yield in a single step ( Figure 1D). The desired key intermediates aldehyde (12), iodo alkene (11) and epoxy (10) were also made following the same reported protocol except that TBDPS rather than TBS was used for improved reaction monitoring using TLC. The epoxy (10) was then stereospecifically converted into allyl alcohol (17) in a single step, by using titanocene dichloride and Zn powder A-factor is a g-butyrolactone, avenolide is an alkylbutenolide, SRB1 is a 2-alkyl-3-methyl-4-hydroxybutenolide, and MMF1 is a 2-alkyl-4-hydroxymethylfuran-3-carboxylic acid. (C) Retrosynthetic scheme for avenolide synthesis involving five key reactions. (D) Overall summarized and synthetic scheme for total synthesis of (4S,10R)avenolide with total number of steps and reaction yields. (Wang et al., 2013;Katsuki and Sharpless, 1980). The final product, stereospecific (4S, 10R)-avenolide (13) was then produced by ring-closing metathesis of molecule 19, treating the dialkene with Grubb's second-generation catalyst (Sheddan and Mulzer, 2006).
The synthetic strategy significantly reduced the number of steps from those reported in prior studies. Specifically, the diol intermediate (1) was synthesized from the commercially available 1methyl-2-butene in single step using Sharpless asymmetric dihydroxylation, avoiding multiple protection/deprotection steps. Likewise, the aldehyde intermediate (12) was synthesized effectively in three steps (as compared to ten steps in prior reports), avoiding the use of toxic CuCN. Last, conversion of the epoxy intermediate (10) to the final product occurred through the intermediacy of an allyl alcohol (17), reducing the strategy used in prior reports by four more steps. The final yield of avenolide was 14 mg total from 15 g of starting material. The identities of all intermediates, as well as that of the final product, were determined using 1 H-NMR and 13 C NMR that matched with the reported data. Detailed experimental methods and NMR spectra can be found in Appendix 1.
Crystal structure of AvaR1 and the binary complex with avenolide The structure of AvaR1 was determined to 2.4 Å resolution ( Figure 2A) using crystallographic phases determined from anomalous diffraction data collected from SeMet-labeled protein crystals. The overall structure is reminiscent of that of other TetR-family transcriptional repressors, and consists of an obligate homodimer (Bhukya and Anand, 2017). Each monomer is entirely helical and consists of a DNA-binding domain (DBD; composed of a helices 1-4), and a ligand-binding domain (LBD; consisting of a helices 5-13). The dimer interface is formed via interactions between the two LBDs and is formed mainly through hydrophobic packing interactions.
Co-crystallization efforts for AvaR1 bound to avenolide yielded crystals that diffracted to 2.0 Å resolution, and crystallographic phases were determined using molecular replacement ( Figure 2B; Thorn and Sheldrick, 2013). Clear density for the entire hormone can be visualized bound to the LBD of both monomers in the homodimer ( Figure 2C). Notably, the structure of AvaR1 undergoes conformational shifting upon ligand binding, and a structure-based superposition against the ligand-free structure illustrates that binding of the hormone results in an~10 o shift in the DBD of each monomer ( Figure 2D). This shift results in an increase in the distance between the two DBD in the dimer upon binding of the ligand, which would preclude DNA binding by the ligand-bound homodimer. Additional local changes between the two structures included the movement of Gln165, which swings into the binding pocket to make hydrogen-bonding interactions with the lactone ring, as well with Gln64. Last, Thr108, which is harbored on helix a6, also moves to accommodate interactions with the hormone. The indole side chain of Trp127 likewise shifts to increase the volume of the binding cavity. Other interactions include hydrogen bonds between Thr131 and the C10 hydroxy, and an interaction between Thr162 and the lactone ring of avenolide. The superimposition suggests that the new contacts formed by these residues are likely to couple hormone binding to the conformational shift between the two monomers of AvaR1 ( Figure 2D).
We used isothermal titration calorimetry to characterize the binding interaction between the synthetic avenolide and AvaR1 (measurements were conducted in triplicate). The resultant binding isotherm shows the point of inflection at a molar ratio of N-1, which suggests a 1:1 binding of ligand per monomer ( Figure 2D). The strength of the binding is measured to be K d = 42.5 nM ± 2.1 nM (three independent trials), which correlates with the reported value of~4 nM that was estimated from gel shift-based assays .

Structure-based comparison across different GBL-like receptors
A structure-based sequence alignment of GBL-like receptors for which ligand specificity has been established reveals a strong conservation of residues that have been shown to be critical for ligand binding, suggesting a common mechanism for ligand recognition across disparate classes of receptors ( Figure 3A). Residues that surround the alkyl chain of avenolide include Trp127, Val158, and Phe161, which are almost universally conserved across all members of the receptor family, whereas Leu88, which forms the opposite wall of the binding cavity, is always a hydrophobic residue but of variable size ( Figure 3B). The chain length of the methylenomycin furans (MMFs) is shorter than those of the GBLs and butenolides; correspondingly, residues at the base of the ligand-binding cavity in the AvaR1, such as Ala85, His130, and Thr131, are replaced by bulkier the Glu107, Leu150, and Leu151 in the sequence of the MMF receptor MmrF. Residue Thr161 is within hydrogen-bonding distance to the lactone ring of avenolide, and this residue is conserved in receptors that bind to g-butyrolactones and butenolides, but absent from receptors for other classes of hormones such as MmrF. Likewise, Gln64 in AvaR1 is positioned on the opposite site of the lactone and is conserved among receptors that bind to structurally related classes of hormones, but is absent from the structure of MmrF. The latter sequence contains a Tyr85 at a near equivalent position, which may be necessary for interactions with the carboxyl group of MMF.
The lactone ring of avenolide is situated above helix a6, which contains residues that are nearly universally conserved among GBL-like receptors, including Ser103, Val104, Arg105, Leu106, Val107, and Asp108. Notably, this helix bridges the LBD and the DBD, suggesting that it plays a role in coupling ligand binding to DNA dissociation ( Figure 3C). Specifically, movement of Gln64 and Thr108 into the ligand-binding cavity of AvaR1 upon engagement of the hormone results in the displacement of helix a6 away from the pocket. The orientation of Arg105, located on the opposite side of helix a6, is established through multiple hydrogen-bonding interactions with the backbone carbonyls of conserved residues in helix a1, including the universally conserved Phe22, Gly26, and Tyr27. Hence, the accommodation of hormone binding necessitates movement of the DBD, in order to preserve the suite of hydrogen-bonding interactions with helix a6. As noted, Gln64 and Thr107 are largely conserved among receptors that bind lactone-containing hormones, suggesting a common mechanism for coupling ligand binding to DNA dissociation.
Crystal structure of AvaR1 and the binary complex with the aco ARE In order to gain further insights into the mechanism of hormone-mediated de-repression, we also determined the structure of AvaR1 bound to a synthetic oligonucleotide derived from the autoregulator responsive element (ARE) sequence. DNase foot-printing analysis had previously established the identity of the ARE located upstream of the aco gene, but this response element is pseudopalindromic . Crystallization efforts with the symmetric AvaR1 homodimer yielded crystals that did not diffract beyond 8 Å , presumably as a result of the asymmetry of the ARE operator. Efforts using an artificial palindromic sequence derived by inverting and repeating each half of the pseudo palindrome yielded crystals that diffracted to 3.09 Å resolution ( Figure 3D, Appendix 1-figure 2, Appendix 1-table 1), and the structure was determined by molecular replacement. As a result of the use of this symmetric DNA, each homodimer in the crystallographic asymmetric unit is bound to a monomer from an adjacent ARE.
The structure shows that each DBD interacts with one half of the palindrome of the DNA duplex. Numerous contacts are formed between helix a1 of the DBD and the duplex, including Arg5, which inserts into the major groove and interacts with Thy7, as well as between Lys43 and Ade15 of the ARE ( Figure 3D). Additional non-specific interactions include those between Thr10, Thr42, and Tyr47 and the backbone phosphate of the duplex. A comparison of the ligand-binding sites with that in the hormone-bound structure reveals that the binding pocket is further occluded through movements of Trp127 and the loop harboring Gln64, consistent with the roles of these residues in effecting conformational movements of the DBD upon binding of the hormone.

Genome mining informs on putative butenolide regulatory biosynthetic pathways
Given the improved stability of butenolides over g-butyrolactones, we speculate that these hormones may prove more amenable in attempts to activate antibiotic production. In order to identify actinobacterial strains that are under butenolide regulatory control, we sought to use a bioinformatics approach based on the identification of the corresponding receptor. However, as shown in Figure 3A, the sequence similarity between bona fide g-butyrolactone receptors, such as ArpA, and butenolide receptors is high (40% sequence identity with AvaR1) precluding such analysis. Orphan receptors called pseudo g-butyrolactone receptors that are activated by multiple ligands likewise share 40% sequence identity with AvaR1, confounding simple sequence-based analysis. Prior efforts to discriminate between receptor classes have not proven fruitful, and phylogenetic analyses have failed to discriminate between positive and negative regulators.
In an effort to identify other actinobacteria that are under butenolide regulatory control, we utilized synteny of the putative butenolide biosynthetic genes to distinguish between receptor clades. We first used the Enzyme Similarity Tool (EST) from the Enzyme Function Initiative to create a Sequence Similarity Network (SSN) of all members of the receptor class. An E-value cutoff of 10 À70 was used to produce an SNN in which characterized receptors of the various GBL families were segregated with mutually exclusive co-localization ( Figure 4A). We used the resultant SSN as input for the EFI Genome Neighborhood Network (GNN) tool to identify nodes that are co-localized next to genes with PFams that are associated with putative avenolide biosynthetic genes. The fact that the butenolide receptors often regulate the production of their own ligand further enabled this approach and allowed for inferences about the classes of ligands that are produced and recognized by receptors that have yet to be characterized. Because the EFI tools are only integrated with the Uniprot database, we also manually searched sequences in Genbank for similar operonic architecture to identify putative butenolide receptors on the basis of genomic context.
Our analysis yielded a set of 90 putative actinobacterial genomes (Appendix 1-table 2) that harbor a putative butenolide receptor that is likely to be under butenolide regulatory control ( Figure 4A). We emphasize that, although this is orders of magnitude greater than the currently known number of butenolide receptors, this number represents a significant underestimate but the true number cannot be identified given the limitations of our approach. Mapping of the sequence conservation amongst these 90 receptors onto the co-crystal structure of hormone bound to AvaR1 reveals that residues Gln64, Thr108, and Trp127 , which are proposed to couple ligand binding to domain shift, are conserved among all of these sequences ( Figure 4B). The conservation score is highest for residues that are involved in interactions with the lactone, whereas those that interact with the alkyl tail are more divergent. These data are consistent with the observations that the lactone ring and C10 hydroxyl are general features of butenolides, whereas the length and branching of the alkyl tail vary significantly. The organization of the biosynthetic operon that harbors genes, which is hypothesized to be involved in butenolide biosynthesis, is largely conserved in these organisms ( Figure 4C and Appendix 1-table 2).

Conclusion
Although the bacterial hormone A-factor was discovered nearly a half century ago, significant gaps remain in our understanding of how these signaling molecules regulate gene expression. The discovery of the regulation of secondary metabolite biosynthesis by g-butyrolactones inspired efforts to use these molecules as ex vivo effectors to induce otherwise silent biosynthetic gene clusters, but these efforts met with little success. The labile nature of the lactone ring, as well as the often pleiotropic effects that g-butyrolactones induce, have presumably subverted efforts to utilize these small molecules as chemical inducers. By contrast, the structurally related butenolides show improved stability under strongly acidic and basic conditions. However, the utility of butenolides in biotechnology efforts has been limited by an inability to access these molecules.
Here, we present here an efficient and convergent 22-step total synthetic route for the production of avenolide, which can be extrapolated for the total synthesis of other members of the butenolide class of small signaling molecules. This effort allowed for detailed structure-function studies of the corresponding hormone receptor, including the first crystal structure of any GBL-type receptor bound to its cognate ligand. The structural data informs on the mechanism by which hormone binding induces a conformational change in the AvaR1 receptor, resulting in the formation of a dimeric assembly that occludes efficient DNA binding. We also elaborate a bioinformatics strategy using the genomic neighborhood context of butenolide biosynthetic genes as a marker to identify 90 actinobacterial strains that are likely to be under the regulatory control of butenolides. The addition of avenolide to the growth media for two representative strains results in changes in the color of the culture supernatant. These results support the validity of our bioinformatics approaches and set the framework for further efforts towards the use of butenolides to active antibiotic biosynthesis in otherwise silent gene clusters.

Materials and methods
Total synthetic schemes, experimental procedures, and validation of relevant synthetic intermediates are provided in the 'Supplemental data'.

Expression, purification, crystallization and structure determination of AvaR1
Wild-type protein AvaR1 was amplified from S. avermitilis genomic DNA by PCR using primers that were based on the published sequence of the polypeptide and inserted into a pET-28-MBP vector for expression in Escherichia coli as a maltose binding protein (MBP)-tagged fusion. The resultant plasmid was transformed into E. coli containing the Rosetta plasmid for protein expression. AvaR1 was produced by growing the cells in shaking flask of LB media at a temperature of 37˚C. When the cells reached an O.D. 600 of 0.6, the cells were cooled on ice for 15 min. Following the addition of 0.5 mM isopropyl b-d-1-thiogalactopyranoside (IPTG), the cells were placed in an 18˚C shaking incubator for 18 hr. The cells were then harvested by centrifugation and re-suspended in a buffer composed of 500 mM NaCl, 20 mM Tris base (pH 8.0), and 10% glycerol. Re-suspended cells were lysed by homogenization and the lysate was centrifuged at 14,000 rpm to remove cell debris. The cleared cell lysate was loaded onto a HisTrap column, which was subsequently washed with 1 M NaCl, 30 mM imidazole, and 20 mM Tris base (pH 8.0). MBP-tagged AvaR1 was eluted using a linear gradient beginning with 1 M NaCl, 20 mM Tris base (pH 8.0), and 30 mM imidazole and ending with 250 mM imidazole. Pure fractions, as judged by SDS-PAGE, were combined and diluted two-fold before treatment with thrombin (final ratio of 1:100 [w/w]) for 18 hr at 4˚C to cleave the N-terminal tag. Tag-free AvaR1 was concentrated and loaded onto a size exclusion column (Superdex S75 16/60) pre-equilibrated with 100 mM KCl and 20 mM HEPES free acid (pH 7.5). Pure fractions were collected and concentrated to 25 mg/mL before storage in liquid nitrogen. Production of SeMetlabeled AvaR1 was carried out by repression of methionine synthesis in defined media supplemented with selenomethionine (Doublié, 2007).
Preliminary crystals of AvaR1 were obtained using a sparse matrix screen. Diffraction-quality crystals were grown using hanging drop vapor diffusion. 13.5 mg/mL AvaR1 was added at a 1:1 ratio to mother liquor containing 12% polyethylene glycol (PEG) 1000, 0.1 M sodium citrate tribasic dehydrate (pH 4.2), 0.2 M LiSO 4 and 4% (v/v) tert-butanol, and incubated against the same solution at 4C . Crystals were improved through multiple rounds of micro-seeding. The SeMet-AvaR1 crystals were obtained using 12 mg/mL protein added to a 1:1 ratio of 30% PEG MME 2000, and 0.15 M KBr. Crystals were vitrified by direct immersion without the addition of any cryo-protectives.
All diffraction data were collected at Argonne National Laboratory (IL). The autoPROC (Vonrhein et al., 2011) software package was utilized for the indexing and scaling of the diffraction data. Initial phases for AvaR1 were obtained using anomalous diffraction data collected on crystals of SeMet-labeled protein. Initial models were built using Phenix and Parrot/Buccaneer. Manual refinements were completed by the iterative use of COOT (Emsley et al., 2010) and Phenix.refine. Cross-validation was utilized throughout the model-building process in order to monitor building bias. The stereochemistry of all of the models was routinely monitored using PROCHECK. Crystallographic statistics are provided in Appendix 1-table 3.
For co-crystallization of the hormone-bound complex, purified AvaR1 (14 mg/ml) was incubated with 3 mM avenolide for 30 min on ice. Co-crystals were obtained by vapor diffusion methods and initial crystals were obtained in Index D5 (25% PEG3350 and 0.1M sodium acetate trihydrate [pH 4.5]). Well-diffracting crystals were produced through optimization to a final solution of 23% PEG3350 and 0.1 M sodium acetate trihydrate (pH 4.5) at 4˚C using hanging drop crystallization. Crystals were submerged briefly in the crystallization medium supplemented with 25% ethylene glycol prior to vitrification in liquid nitrogen. The coordinates of apo AvaR1 were used to determine crystallographic phases.
AvaR1 was co-crystallized with different oligonucleotide sequences that were designed on the basis of the AvaR1 DBS upstream of aco gene. Purified and concentrated dimeric protein (14 mg/ mL) was incubated with individual oligonucleotide duplexes (Supporting Appendix 1-table 1) in 1:1.2 molar ratio, for 30 min on ice. Palindromic DNA sequences were first self-annealed and then double stranded DNA was used from 1 mM stock prepared in 20 mM MgCl 2 , 50 mM Tris (pH 8.0) buffer. The order in which reactants were added was: buffer, DNA and finally protein. For some oligonucleotides, white turbid solution was obtained as soon as the protein was added, but the addition of a few microliters of ammonium acetate and incubating at either room temperature or on ice produced clear solution. Crystallization trays were set up at 4˚C and every oligonucleotide was crystallized in different conditions (Appendix 1-figure 2B). Ethylene glycol (25% v/v) was used as cryoprotectant prior to the vitrification of crystals for all AvaR1-oligonucleotide co-crystals. The oligonucleotide sequences used are listed in Appendix 1-table 1 and the sequences that produced diffraction-quality crystals are shown in Appendix 1-figure 2.

Identification of putative butenolide biosynthetic clusters
Using the AvaR1 amino-acid sequence as a handle, tools from the Enzyme Function Initiative (EFI) were used to first create a Sequence Similarity Network (SSN) of 10,000 Uniprot sequences. Once an SSN was created, an iterative process was undertaken to find an E-value that ensured that characterization of receptors of the various GBL families resulted in mutually exclusive co-localization. The resulting E-value was 10 À70 . Using this SSN, the data was run through the EFI's Genome Neighborhood Network (GNN) webtool. Network visualization was performed in Cytoscape (Shannon, 1971). Using Pfams associated with putative avenolide biosynthetic genes along with the knowledge that this family of receptors often regulates their own ligand production, inferences were made as to the class of ligand produced and recognized by uncharacterized receptors. Because the EFI webtools are only integrated with the uniprot database, we also manually scoured through a number of Genbank sequence results derived from BLAST analysis for proper genomic context relative to butenolide production. These BLAST searches used the sequences of any putative butenolide biosynthetic genes as handles. The proper genomic context necessary for butenolide biosynthesis was defined as a TetR_N Pfam receptor surrounded by a gene in the p450 Pfam (PF00067), and either an Acyl-CoA_dh_1 (PF00441) Pfam gene, an Acyl-CoA-dh_2 (PF08028) Pfam gene, or an ACOX (PF01756) Pfam gene. A list of the 90 homologous strains that were identified is provided in Appendix 1-table 2.

Isothermal titration calorimetry
ITC measurements were performed at 25˚C on a MicroCal VP-ITC calorimeter. A typical experiment consisted of titrating 7 mL of a ligand solution (80 mM) from a 250 mL syringe (stirred at 300 rpm) into a sample cell containing 1.8 mL of AvaR1 solution (8 mM) with a total of 35 injections (2 mL for the first injection and 7 mL for the remaining injections). The initial delay prior to the first injection was 60 s, with reference power 10 mCal/s. The duration of each injection was 16 s and the delay between injections was 400 s. All experiments were performed in triplicate. Data analysis was carried out with Origin 5.0 software. Binding parameters, such as the dissociation constant (K d ), enthalpy change (DH), and entropy change (DS), were determined by fitting the experimental binding isotherms with appropriate models (one-site binding model). The ligand stock solution was prepared at 10 mM. The buffer solutions for ITC experiments contained 300 mM KCl and 20 mM HEPES (pH 7.5).

Total synthesis of avenolide
The experimental procedures were adopted from Uchida et al., 2011 with further optimizations and modifications as stated. Detailed experimental procedures for relevant intermediates are specified in the Materials and methods section of Appendix 1.
[ For synthesis of aldehyde fragment 12, the p-methoxybenzyl (PMB)-protected intermediates were synthesized according to the protocol reported by Uchida et al., 2011. Iodo alkene fragment 11 was also synthesized following the reported protocol by substituting the use of the tert-butyl (dimethyl)silyl (TBS) protecting group with a tert-butyldiphenylsilyl group (TBDPS) to enable easy monitoring of the reaction by thin layer chromatography (TLC) visualization under UV light. Epoxy fragment 10 was synthesized through intermediates 14, 15 and 16, as described in Appendix 1.
Chemical structure 2. Single step synthesis of allyl alcohol 17 from epoxy 10.
(R)À8-((2S,3S)À3-(Hydroxymethyl)oxiran-2-yl)À3-((4-methoxybenzyl)oxy)À3-methyloctan-4-one (17) was synthesized following the protocol from the reported literature (Wang et al., 2013). Anhydrous ZnCl 2 (2 mL, 1 M in Et 2 O, 2 mmol) and zinc powder (350 mg, 6.72 mmol) were added to a red solution of Cp 2 TiCl 2 (1.26 g, 5.05 mmol) in anhydrous tetrahydrofuran (THF) (15 mL). The solution was stirred for 1 hr at room temperature until it turned green. Epoxide 10 (590 mg, 1.68 mmol) in anhydrous THF (5 mL) was then added to the resultant mixture. After stirring for 30 min at room temperature, the reaction was quenched with aqueous HCl (1.0 M, 3 mL) and the mixture was extracted three times with Et 2 O (4 mL). Collected Et 2 O fractions were combined and washed with water, 10% aqueous NaHCO 3 , water and brine, dried over Na 2 SO 4 and filtered and concentrated under reduced pressure. The obtained residue was purified using flash column chromatography on silica gel (25% EtOAc/hexane to 40% EtOAc/hexane) to obtain pure allyl alcohol compound 17.  141.3, 128.9, 128.9, 114.9, 114.0, 113.8, 84.8, 73.2, 65.3, 55.5, 37.1, 36.8, 29.4, 25.3, 23.5, 20.2, 8. This resulted in a simplified protocol for the synthesis of allyl alcohol 17, which otherwise was reported to be made in two additional reaction steps starting from epoxy 10. An acrylic group was added to allyl alcohol 17 using DDQ by the reported procedure, which was followed by ring-closing metathesis reaction to yield stereospecific (4S, 10R)-avenolide(13). Detailed synthetic schemes, experimental procedures and yields are reported in the Materials and methods section of Appendix 1. The obtained 1 H and 13 C NMR data support the reported values, and thus the spectra are provided only for key intermediates. The funding agency provided support for facilities and services. The agency also provided partial salary support for all participants. Appendix 1

Supplementary methods
All commercially available chemicals were purchased by the Aldrich Company and Fischer Scientific and were used without further purification unless otherwise specified. Anhydrous solvents were either purchased from Fischer scientific and Aldrich or borrowed from solvent dehydrating system (SDS) in other labs at UIUC. All of the oligonucleotides used in the study were purchased from Integrated DNA Technology (IDT). IPTG and antibiotics were purchased from Gold Biotechnology.

Instrumentation
High-and low-resolution mass spectra were obtained by the Mass Spectrometry Laboratory, School of Chemical Science, University of Illinois. Mass spectra were obtained by field desorption (FD) on a Waters 70-VSE-A and by ESI on a Waters Micromass Q-Tof. Other instruments like rotatory evaporator (Isotemp) and stirring plates (Cole Palmer) were purchased from Fischer Scientific. Nuclear magnetic resonance (NMR) spectra were recorded on a Varian Unity 500, Varian INOVA 500NB or Varian Unity 400 spectrometer at 21 ± 3˚C unless otherwise mentioned. Chemical shifts (d) are reported in parts per million (ppm). Coupling constants (J) are reported in Hertz. 1 H NMR chemical shifts were referenced to the residual solvent peak at 7.26 ppm in chloroform-d (CDCl 3 ) or other deuterated solvents as stated. 13 C NMR chemical shifts were referenced to the center solvent peak at 77.16 ppm for CDCl 3 . Analytical thin-layer chromatography (TLC) was performed on 0.2 mm silica 60 coated on glass plates with F254 indicator for monitoring all of the reactions. PMA, KMnO 4 , anisaldehyde, ninhydrin, and iodine were used for staining TLC plates to detect various functional groups. Flash column chromatography was performed on 40-63 mm silica gel (SiO 2 ). Solvent mixtures used for chromatography are reported as percent volume (%) or as volume ratio (v/v).

Total synthetic schemes and experimental procedures
The experimental procedures were adopted from the publication by Uchida et al., 2011 with further optimizations and modifications as described.
Appendix 1-chemical structure 1. Chemical structures and synthetic scheme for intermediate 9.

Synthesis of aldehyde fragment
For the synthesis of (R)À2-((4-methoxybenzyl)oxy)À2-methylbutanal (12), PPTS (46.85 mg, 0.185 mmol) and p-anisaldehyde dimethylacetal (0.63 ml, 3.73 mmol) at room temperature (RT) under N 2 were added to a solution of 1 (388 mg, 3.73 mmol) in CH 2 Cl 2 (15 mL). After stirring for 5 hr at RT, the reaction was quenched with water and the aqueous phase was extracted with EtOAc (2 Â 30 mL). The combined organic extracts were washed with brine and collected organic extracts were dried over anhydrous Na 2 SO 4 and concentrated in vacuo using rotatory evaporator. The dried residue was purified by flash column chromatography (15% EtOAc/hexanes) to afford the corresponding pmethoxybenzylidene acetal (0.595 g, 92%) as a colorless oil.
1H DIBALH (1.02 M in hexanes, 4.00 mL, 4.05 mmol) at À78˚C under N 2 was added to a solution of the p-methoxybenzylidene acetal (300 mg, 1.35 mmol) in CH 2 Cl 2 (15 mL). After stirring for 1 hr at 0˚C, the reaction was cautiously quenched with MeOH at 0˚C, diluted with CH 2 Cl 2 (30 mL) and treated with celite (1.20 g) and Na 2 SO 4 .10H 2 O (1.20 g). The mixture was allowed to warm to RT. After stirring at RT for 2 hr, the mixture was filtered through a celite pad, washed with DCM three times and the filtrate was concentrated in vacuo. The residue was purified by flash column chromatography (25% EtOAC/hexanes) to give the corresponding alcohol 2 (140 mg, 92%) as a colorless oil.  A solution of the oxalyl chloride (1.15 mL from 2 M stock in DCM, 2.23 mmol) in dry CH 2 Cl 2 (7 mL) was cooled to À78˚C under an atmosphere of Ar. A solution of dry DMSO (316 mL in 2 mL DCM, 4.46 mmol) was added slowly until temperature is maintained below À65˚C. After stirring for 5 min, a solution of 2 (500 mg, 2.23 mmol) in dry CH 2 Cl 2 (3 mL) was added slowly, and the resulting mixture was stirred slowly for 15 min, after which, Et 3 N (1.55 mL, 11.14 mmol) was added slowly. After stirring the reaction for 10 additional minutes at À78˚C, the cooling bath was removed, and the reaction was allowed to warm to RT. Upon reaching RT, water (10 mL) was added and stirring continued for 15 min. The reaction mixture was then washed successively with 5% HCl (10 mL), saturated NaHCO 3 (10 mL) and brine (7 mL) using a separatory funnel. Collected organic layers were combined and dried with anhydrous Na 2 SO 4, filtered and concentrated under reduced pressure to afford crude oil. Flash chromatography (20% EtOAc/hexanes) was performed to get pure aldehyde 12 (430 mg, 94%) as a yellow oil.
Spectroscopic characterization parameters agreed with the reported mass and chemical shift values
[M+Na]+, found m/z 391.2134. PPTS (214.76 mg, 0.854 mmol) and DHP (7.8 mL, 85.4 mmol) were added to a solution of allyl alcohol 7 (3.15 g, 8.54 mmol) in anhydrous CH 2 Cl 2 (40 mL) at 0˚C under N 2 . The reaction mixture was then stirred for 6 hr at 0˚C followed by quenching with H 2 O. The aqueous phase was extracted with EtOAc and combined organic extracts were washed with brine and dried over anhydrous Na 2 SO 4 and concentrated in vacuo. The residue was purified by flash column chromatography (15% EtOAc/hexanes) to afford the corresponding THP ether(7) (3.5 g, 88% yield) as a colorless oil   (7) (3.5 g, 7.74 mmol) in dry THF (55 mL) at RT under N 2 . The reaction mixture turned orangish red in color and it was stirred for 4.5 hr at RT. The reaction was quenched with H 2 O, and the aqueous phase was extracted with CH 2 Cl 2 . The combined organic extracts were washed with brine, and then dried over anhydrous Na 2 SO 4 and concentrated in vacuo. The obtained yellowish-orange crude residue was purified by flash column chromatography (50% EtOAc/hexanes) to give the corresponding TBDPS deprotected alcohol (8) (1.45 g, 92%) as a colorless oil. The spectroscopic data match with the reported data .  For the synthesis of (E)À2-((7-Iodohept-2-en-1-yl)oxy)tetrahydro-2H-pyran (11), Et 3 N (1.3 mL, 9.33 mmol), Me 3 N.HCl (44.5 mg, 0.47 mmol) and MsCl (541 ml, 6.99 mmol) were added to a solution of 8 (1 g, 4.66 mmol) in CH 2 Cl 2 (50 mL)at 0˚C under N 2 atmosphere. The reaction mixture was allowed to warm to RT and stirred for 3 hr at RT. After that, the reaction was quenched with H 2 O and the aqueous phase was extracted with CH 2 Cl 2 . The combined organic extracts were dried over anhydrous Na 2 SO 4 and concentrated under reduced pressure. The obtained yellowish-orange residue was purified by flash column chromatography (40% EtOAc/hexanes) to afford the corresponding mesylate ((E)À7-((tetrahydro-2H-pyran-2-yl) oxy) hept-5-en-1-yl methanesulfonate)(9) (1.35 g, quant.) as a colorless oil. The spectroscopic data match with the reported data   Uchida et al., 2011, the OH was substituted with an iodo group. Briefly, the mesyl ester (9) (3.00 g, 10.46 mmol) was added to a solution of NaCI (2.30 g, 15.4 mmol) in acetone (150 mL) under N 2 atmosphere. After stirring at reflux for 8.5 hr, the reaction mixture was cooled to RT and the aqueous phase was extracted with EtOAc. Organic fractions were combined, dried over anhydrous Na 2 SO 4 and concentrated in vacuo. The residue was purified by flash column chromatography (10% EtOAc/hexanes) to give the corresponding iodo compound (11) (2.0 g, 62%) as a yellow colored oil. The spectroscopic data match with the reported data   Uchida et al., 2011, allyl alcohol (16) was synthesized from intermediates 11 and 12 via intermediates 14 (diastereoisomeric mixture) and ketone (15).