Exploiting the Natural Diversity of RhlA Acyltransferases for the Synthesis of the Rhamnolipid Precursor 3-(3-Hydroxyalkanoyloxy)Alkanoic Acid

The RhlA specificity explains the observed differences in 3-(3-hydroxyalkanoyloxy)alkanoic acid (HAA) congeners. Whole-cell catalysts can now be designed for the synthesis of different congener mixtures of HAAs and rhamnolipids, thereby contributing to the envisaged synthesis of designer HAAs.

Surfactants are amphiphilic molecules that accumulate at interfaces, where they reduce surface and interfacial tensions and stabilize emulsions. These properties are of industrial interest and are exploited in fields as diverse as pharmaceuticals, agriculture, food, detergents, and cosmetics (1-3). Biosurfactants are surfactants of biological origin and are a promising alternative to synthetic surfactants, as they are nontoxic, biodegradable, and produced from renewable feedstocks. Their application window is extensive, as they might be effective in environments with extreme pH, temperature, or salinity (4-6).
The biosurfactant 3-(3-hydroxyalkanoyloxy)alkanoic acid (HAA) is the hydrophobic moiety of rhamnolipids and most often consists of two hydroxy fatty acids linked by an ester bond (4, 7-10) (Fig. 1). HAAs themselves are not reported as typical products of microorganisms; rather, they have been detected in trace amounts during rhamnolipid formation (11).
The carbon chain lengths of HAAs determine their physical properties, such as their abilities to foam and emulsify, and their critical micelle concentration (CMC). Current evidence strongly suggests that the chain lengths are determined by RhlA, an acyltransferase containing an α/β-hydrolase domain that catalyzes the esterification of two activated hydroxy fatty acids to HAA (32). In vitro experiments have shown that acyl carrier protein (ACP)-activated hydroxy fatty acids are the preferred substrate for RhlA (8), whereas in vivo experiments in P. aeruginosa have shown that CoA-activated hydroxy fatty acids are preferentially incorporated into the HAA molecule (33). Within the Gammaproteobacteria, Pseudomonas, Acinetobacter, Enterobacter (17,18), and Pantoea (34) species produce mono- or diglycolipids. Their chain lengths vary, but the most common HAAs have 10 carbon atoms in both hydroxy fatty acids and are thus denoted C10-C10. In contrast, representatives of the Betaproteobacteria, namely, Burkholderia species, predominantly produce HAAs with chain lengths of 14 carbon atoms (Fig. 2). A few species do not follow this general categorization. Pseudomonas chlororaphis, e.g., produces rhamnolipids with one fatty acid chain of 10 carbon atoms and one of 12, resulting in the designation Rha-C10-C12 when these chains are fully saturated and Rha-C10-C12:1 when the C12 chain is unsaturated in one position (15,35,36). In contrast, Burkholderia kururiensis KP23 produces Gammaproteobacteria-like rhamnolipids containing mainly C10-C10 residues (24).
FIG 1 Molecular structure of a rhamnolipid molecule. The chain lengths of the hydroxy fatty acids vary, resulting in different congeners. The main congener produced by P. aeruginosa contains 10 carbon atoms in both hydroxy fatty acid derivatives. Without the two rhamnose units, the molecule is a 3-(3-hydroxyalkanoyloxy)alkanoic acid (HAA). The synthesis of an HAA molecule is catalyzed by RhlA, which fuses two hydroxy fatty acids. RhlB links an activated dTDP-rhamnose to an HAA, resulting in a mono-rhamnolipid, which is the substrate that is transformed by RhlC, the second rhamnosyltransferase, into a di-rhamnolipid.
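The congener designations used throughout (for example, C10-C10 or Rha-C10-C12:1) follow a regular pattern that can be read mechanically. The following Python sketch illustrates one way to do so; the function and type names are hypothetical, and the assumption that each rhamnose unit is written as a leading "Rha-" token is ours rather than a convention stated in the text.

```python
import re
from typing import NamedTuple, Tuple


class Chain(NamedTuple):
    carbons: int        # number of carbon atoms in the hydroxy fatty acid
    double_bonds: int   # 0 for saturated, 1 for a ":1" chain, and so on


class Congener(NamedTuple):
    rhamnose_units: int         # 0 for an HAA, 1 for a mono-rhamnolipid
    chains: Tuple[Chain, ...]


_CHAIN = re.compile(r"^C(\d+)(?::(\d+))?$")


def parse_congener(name: str) -> Congener:
    """Parse labels such as 'C10-C10' or 'Rha-C10-C12:1'."""
    # Spaces left over from typesetting (e.g. 'C 10 -C 10') are ignored.
    parts = name.replace(" ", "").split("-")
    rhamnose = 0
    while parts and parts[0] == "Rha":
        rhamnose += 1
        parts.pop(0)
    chains = []
    for part in parts:
        match = _CHAIN.match(part)
        if match is None:
            raise ValueError(f"unrecognized chain label: {part!r}")
        chains.append(Chain(int(match.group(1)), int(match.group(2) or 0)))
    if not chains:
        raise ValueError("no fatty acid chain found")
    return Congener(rhamnose, tuple(chains))


if __name__ == "__main__":
    print(parse_congener("Rha-C10-C12:1"))
```

Keeping chain length and degree of unsaturation as separate fields makes it straightforward to group congeners by either property later.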
Rhamnolipid production has not been extensively explored in species of the phyla Firmicutes, Deinococcus-Thermus, Actinobacteria, and Ascomycota. Most promising are the results presented for Thermus species belonging to the phylum Deinococcus-Thermus. Pantazaki et al. (29) produced HAAs and rhamnolipids with chain lengths of 8 to 14 carbon atoms with Thermus thermophilus HB8. Rezanka et al. (30) reported the production of rhamnolipids by Thermus sp. strain CCM 2842, mainly containing the C16-C16 HAA congener, which has not been previously reported. Both groups used selective mass spectrometric methods.
Several published reports describe the synthesis of novel rhamnolipids by novel hosts that we could not confirm, which reveals the need for standardization and guidelines for the determination of rhamnolipid and HAA structures. Whereas many analytical methods exist for rhamnolipids, only a few also cover HAAs. High-performance liquid chromatography-tandem mass spectrometry (HPLC-MS/MS) is the method of choice to cover both rhamnolipids and HAAs (37,38). The most comprehensive HPLC-MS/MS method focusing on HAA was presented by Lépine et al. (39). Therefore, our approach was to take known and putative rhlA genes, express them recombinantly in Escherichia coli, and subject the resulting HAAs to a tailored HPLC-MS/MS analysis for confirmation.
The focus of our study was to explore the diversity of RhlAs and their potential to produce "designer HAAs." The results are discussed in a phylogenetic context.

RESULTS
The natural diversity of RhlA, the acyltransferase of the rhamnolipid synthesis pathway, was investigated and exploited for the synthesis of the lipophilic intermediate HAA. We cloned eight rhlA homologs from Betaproteobacteria and Gammaproteobacteria into the Escherichia coli expression vector pET28a. Alternative RhlAs allowed the synthesis of different HAA congeners.
Phylogeny of RhlA. It has been shown that HAA synthesis in E. coli relies only on a recombinantly synthesized RhlA from P. aeruginosa (8,32). Further, the experimental evidence strongly supports that RhlA selectively determines the β-hydroxy fatty acid chain lengths in HAAs (20). As a first step toward tailor-made HAAs, the natural genetic diversity of RhlA was investigated. Representative RhlA protein sequences for all phyla that were detectable by homology searches in GenBank and KEGG were collected. First, the RhlA of P. aeruginosa was used as a template. As the RhlAs from, for example, Pantoea species have limited homology with the protein from P. aeruginosa, homology searches with these sequences were also performed. All identified RhlA proteins are from the classes Betaproteobacteria and Gammaproteobacteria (Fig. 3). Strains from other phyla that are reported to produce rhamnolipids have not been sequenced, and the genes encoding their rhamnolipid synthesis pathways are not known, with two exceptions: an RhlA (GenBank accession number KP202092) was found in the Actinobacteria strain Dietzia maris As-13-3 (28), and the genome sequence of the Deinococcus-Thermus strain T. thermophilus HB8 (29) is known. However, in the latter genome, no rhlA homolog was found.
In general, the identified RhlAs can be divided into three main branches of a currently sparse phylogenetic tree (Fig. 3). In the first branch, the representatives of the genus Pseudomonas form a monophyletic lineage. In the P. aeruginosa strains, represented by strain PAO1, two operons containing structural genes for rhamnolipid synthesis are known. In the first of these operons, rhlA and rhlB, the relevant genes for mono-rhamnolipid synthesis, are clustered with a regulator and inducer for quorum sensing, while rhlC, which enables the strain to produce di-rhamnolipids, is located in a different operon and is clustered with a putative transporter (40). Surprisingly, an analysis of the genetic environment of rhlA homologs detected using BLAST in the Pseudomonas fluorescens group showed two possible locations. Besides the colocalization with rhlB, an rhlA homolog is found in synteny with a putative transporter. In P. fluorescens strain A506, rhlA genes are present in both loci, while in P. fluorescens strain SBW25, only the latter location and no rhlB homolog can be found. Most P. fluorescens strains do not carry the genes for rhamnolipid synthesis (rhlA in synteny with rhlB, data not shown).
FIG 3 Operons associated with rhamnolipid formation are drawn next to the organism names, and genes are labeled with their gene locus or protein accession number. Organisms chosen for HAA production in this study are highlighted in green, while elsewhere-confirmed RhlAs are marked in bold. Others were chosen based on homology searches. S. plymuthica is marked in gray, as we could not confirm an RhlA activity. Double slashes depict independent genomic locations. In Dietzia maris, the synteny of rhlABC is not published. The strains P. fluorescens LMG05825 and P. chlororaphis NRRL-B-30761 are not genome sequenced; therefore, the putative homologous genes are indicated by question marks. The genes for rhamnolipid formation in P. aeruginosa are typically organized in two operons; rhlB (red) is located downstream of rhlA (green) and encodes rhamnosyltransferase I, which is necessary for mono-rhamnolipid formation. The genes rhlA and rhlB are colocalized with the regulator and inducer genes (rhlRI, white) that are involved in regulation via quorum sensing. In a second operon, rhlC (orange), the gene coding for rhamnosyltransferase II, is clustered with a putative transporter (light blue) gene. In the strains P. fluorescens LMG05825 and P. chlororaphis, only the genes shown are sequenced. In the P. ananatis LMG20103 operon containing rhlAB homologs, three genes are present that code for a methyl-accepting chemotaxis citrate transducer (tcp), a putative inner membrane protein (ygbK), and a 2-keto-3-deoxygluconate permease (kgdT1). In the Burkholderia species, the structural genes for di-rhamnolipid formation are organized in a single operon that further includes the genes nodT and hylD, which are potentially involved in the drug resistance systems of the cell. The tree was constructed using the neighbor-joining method in MEGA7 with default settings. Branch lengths shorter than 0.02 are omitted.
In the second branch, all representatives of the Burkholderia genus and the only Actinobacteria species, D. maris As-13-3 (28), are present. However, the RhlA of D. maris As-13-3 is reported to share 96% sequence identity with a Burkholderia cenocepacia protein, indicating that horizontal gene transfer is a probable explanation for its occurrence. In general, in Burkholderia, rhlAB are located on chromosome II and are in synteny with the putative transporter gene and rhlC. Furthermore, nodT and hylD, coding for enzymes related to efflux and secretion processes, are colocated. In B. cenocepacia and Burkholderia ambifaria, an open reading frame encoding a methyl transferase is placed between rhlA and rhlB. A second operon for rhamnolipid formation exists in Burkholderia pseudomallei and Burkholderia thailandensis on chromosome I (not shown).
The third branch includes homologous proteins from representatives of the orders Enterobacterales and Oceanospirillales, the latter represented only by Halomonas. In general, the RhlA proteins in this branch are more divergent than those in the Pseudomonas and Burkholderia branches. An rhlAB-like operon is found only in Pantoea strains (34) and Lonsdalea britanica, while rhlA homologs are found in Serratia and Dickeya strains but not in synteny with an rhlB homolog. No experimental evidence for HAA or rhamnolipid formation exists for the organisms in this branch, with the exception of Pantoea ananatis BRT175 (P. ananatis), which produces the glucolipid ananatoside A, the hydrophobic part of which is an HAA molecule (34,41). In P. ananatis LMG20103, the three genes tcp, ygbK, and kgdT1, which code for a putative methyl-accepting chemotaxis citrate transducer, an effector protein, and a 2-keto-3-deoxygluconate permease, respectively, are encoded in one common operon with the rhlAB homologs.
Determining the synteny of sequences identified by BLAST analyses using an RhlA query requires detailed analysis to distinguish RhlA from the transacylase PhaG, an enzyme that links de novo fatty acid and polyhydroxyalkanoate (PHA) biosynthesis (42-44) by catalyzing the reesterification from acyl carrier protein (ACP) to CoA. In P. aeruginosa, the protein sequences of RhlA and PhaG have a 44% sequence identity (44), which is similar to the 44 to 48% identity between Burkholderia RhlAs and the RhlA of P. aeruginosa. Fig. 3 shows that rhlA in Pseudomonas, Burkholderia, and Pantoea is part of a glycolipid synthesis operon. In contrast, phaG is located upstream of a tRNA gene, and furthermore, homologs of four of the six upstream genes of phaG in Pseudomonas putida can also be found upstream of phaG in P. aeruginosa (Fig. 4). We used this difference in the synteny of rhlA and phaG as a criterion for the identification of rhamnolipid genes in the reported rhamnolipid producer Pseudomonas desmolyticum NCIM-2112. We were especially interested in this strain, as it was reported to produce rhamnolipids with chain lengths of six to eight carbon atoms (45), a congener range not confirmed yet for an isolated RhlA. Full genome sequencing allowed a BLAST search for RhlA; however, only the transacylase-encoding phaG was identified. A gene encoding RhlB was not found in the genome of P. desmolyticum (data not shown). To strengthen the reliability of the rhamnolipid literature, genetic evidence could serve, alongside high-quality analytics, as a means to reduce or ideally avoid the misreporting of rhamnolipid-producing strains (46).
FIG 4 The gene synteny of the phaG homolog in P. desmolyticum is the same as in P. putida. Homologous genes coding for a uracil-DNA glycosylase (UDG), a 3-hydroxyisobutyryl-CoA hydrolase (3-HIB-CoAH), a protein of unknown function (u.f.), and a ribosomal small subunit pseudouridine synthase A (RsuA) located upstream of phaG can also be found in the upstream region of phaG in P. aeruginosa. A tRNA homolog is placed downstream. This difference in synteny can be used as a criterion to distinguish rhlA from phaG in Pseudomonas species.
Considering gene synteny, the rhlA homologs identified by BLAST analysis of the Serratia, Dickeya, and Halomonas species are not well supported. Experimental evidence should confirm or disprove the RhlA activity.
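The synteny criterion described above lends itself to a simple rule-based check. The following Python sketch is only an illustration of that reasoning under stated assumptions: the gene labels, the function name, and the three-way outcome are hypothetical, the rule set is limited to the neighbors named in the text for Pseudomonas, and it is not the procedure used in this study.

```python
from typing import Iterable

# Neighbors named in the text as part of glycolipid synthesis operons.
GLYCOLIPID_NEIGHBORS = frozenset({"rhlB", "rhlC"})
# Neighbors named in the text around phaG in Pseudomonas species.
PHAG_NEIGHBORS = frozenset({"tRNA", "UDG", "3-HIB-CoAH", "RsuA"})


def classify_homolog(neighbors: Iterable[str]) -> str:
    """Classify an RhlA BLAST hit from the labels of its neighboring genes.

    Returns 'rhlA-like', 'phaG-like', or 'unresolved'. A hit without any
    informative neighbor stays 'unresolved' and needs experimental testing.
    """
    found = set(neighbors)
    if found & GLYCOLIPID_NEIGHBORS:
        return "rhlA-like"
    if found & PHAG_NEIGHBORS:
        return "phaG-like"
    return "unresolved"


if __name__ == "__main__":
    print(classify_homolog(["rhlB", "rhlR"]))        # operon with rhlB
    print(classify_homolog(["UDG", "RsuA", "tRNA"]))  # phaG-type context
    print(classify_homolog(["hypothetical protein"]))
```

The "unresolved" outcome corresponds to the situation described for the Serratia, Dickeya, and Halomonas homologs, where synteny alone gives no support and activity has to be tested.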
HAA synthesis with recombinant E. coli. E. coli strains BL21(DE3) and C43(DE3), each equipped with the rhlA gene from P. aeruginosa (pPA2), were grown in LB medium. Defined glucose pulses were given to provide an additional carbon source. When applying E. coli C43(DE3) as the host, glucose addition caused a steep increase in HAA titers 2 h after induction, which subsequently stagnated as glucose was depleted (6 h) (Fig. 5). The high HAA formation and growth rates were restored after the second glucose pulse at 20 h. While the growth rate slowed down 2 h later, the HAA production rate remained high, suggesting that resources were allocated to the HAA synthesis pathway rather than to growth. With this strategy, an HAA titer of 1.7 g/liter was reached 30 h after induction, which, to our knowledge, is the highest concentration reported so far using recombinant microorganisms for HAA synthesis. Using E. coli BL21(DE3) as the host, the glucose supplementation had no enhancing effect on HAA formation at any time but was used for biomass formation. In this host, only 0.4 g/liter was achieved 20 h after induction.
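For a rough comparison of the two hosts, the reported titers can be converted into average volumetric productivities. The following Python sketch does this arithmetic; the function name is hypothetical, and the calculation assumes that the titer was zero at induction and averages over the whole period, so it does not reflect the phases of fast and stagnating production described above.

```python
def mean_productivity(titer_g_per_liter: float, hours_after_induction: float) -> float:
    """Average volumetric productivity in g/liter/h since induction."""
    if hours_after_induction <= 0:
        raise ValueError("time after induction must be positive")
    return titer_g_per_liter / hours_after_induction


if __name__ == "__main__":
    # Titers and time points as reported in the text.
    c43 = mean_productivity(1.7, 30.0)    # E. coli C43(DE3) pPA2
    bl21 = mean_productivity(0.4, 20.0)   # E. coli BL21(DE3) pPA2
    print(f"C43(DE3):  {c43:.3f} g/liter/h")
    print(f"BL21(DE3): {bl21:.3f} g/liter/h")
```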
The main HAA congeners synthesized by E. coli C43(DE3) pPA2 (Table 1) were the same as those produced by a recombinant P. putida KT2440 using the P. aeruginosa rhlA (11,20) or the wild-type P. aeruginosa strain (12, 47-49). The HAA spectrum observed is, however, broader in E. coli, extending to C14-containing congeners. The results support previous data showing that the RhlA enzyme is mainly responsible for HAA congener selectivity, while the host organisms play only a minor role.
Diversification of the HAA spectrum by exploiting natural genetic variance. In order to increase HAA congener diversity, seven additional rhlAs of species representing the identified evolutionary space were used.
The first obvious choice from the Betaproteobacteria was RhlA of Burkholderia plantarii PG1 (formerly Burkholderia glumae PG1), which synthesizes mainly C14 rhamnolipids (20,21). We also chose RhlA of B. ambifaria, which was of particular interest, as the protein shares 91% identity with RhlA of D. maris, which was reported to produce the C10-C10 congener (28). Our purpose was to verify this nontypical main congener for the Burkholderia genus with pAMB.
In contrast to the 16S rRNA phylogeny, in which Enterobacterales, Oceanospirillales, and Pseudomonas all belong to the class Gammaproteobacteria, the RhlA sequences of the Enterobacterales and the Oceanospirillales representative Halomonas form a common third branch (Fig. 3). We thus selected RhlA from P. ananatis LMG20103 as the first representative from the Enterobacterales branch. This strain is fully genome sequenced, and the genes for glycolipid synthesis are present (34, 50). Rooney et al. (17) and Hošková et al. (18) reported that other Enterobacterales synthesize rhamnolipids with mainly C10-C10 HAAs (Fig. 2). The N terminus of RhlA in P. ananatis is longer than those of other RhlA proteins, which might be due to automated annotation. For this reason, two versions of rhlA were cloned, one representing a normal-sized rhlA and one representing the long version. Both rhlAs led to HAA production (data for the long version not shown), suggesting that the normal-sized RhlA is the native protein. Additionally, sequencing indicated a frameshift in the published sequence that led to 13 incorrectly annotated amino acids. A comparison with RhlA from, e.g., Pantoea stewartii A206 confirms this finding, and a corrected sequence was submitted to GenBank (accession number MF671909). As mentioned above, the rhlA homologs in Dickeya dadantii Ech586, Halomonas sp. R57-5, and Serratia plymuthica PRI-2C are not colocalized with genes related to glycolipid synthesis but are isolated in the genome.
To experimentally confirm the activity, we further investigated HAA formation using rhlA genes of these strains. Finally, we included the rhlA from P. fluorescens LMG 05825 (P. chlororaphis ATCC 17813), which is reported to be the same strain as P. chlororaphis NRRL B-30761 (35), a strain producing mainly C10-C12 and C10-C12:1 congeners (15). Solaiman et al. (36) found an operon containing rhlAB and the regulator gene rhlR (Fig. 3) in this strain. While we could confirm the previous results, the rhlA from strain LMG05825 carried two nucleotide changes resulting in one amino acid difference in RhlA.
E. coli strains were equipped with one of the eight rhlA genes and cultivated as described above. Glucose was fed 2 and 22 h after IPTG (isopropyl-β-D-thiogalactopyranoside) induction. Seven of the eight recombinant strains produced HAAs (Table 1); E. coli C43(DE3) pPLY was the exception. Again, while the main HAA congeners were highly similar to reported congener compositions of wild-type strains, the congener spectrum appeared somewhat wider, which, however, could also be a result of the sensitive method used for identification in this study. By combining efficient chromatographic separation with structure-informative tandem mass spectrometric detection, the HPLC-MS/MS method enables selective and sensitive detection of HAAs. A limit of detection in the range of 0.1 mg/liter is achieved, and thus, HAAs with a relative share of <0.1% can be detected (Table 1).
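The relation between the limit of detection and the smallest detectable relative share is simple arithmetic, sketched below in Python. The function name is hypothetical, and the calculation assumes that the share is expressed relative to the total HAA titer of the sample and that all congeners respond equally in detection, which is a simplification.

```python
def min_detectable_share(lod_mg_per_liter: float, total_titer_mg_per_liter: float) -> float:
    """Smallest relative share (in %) that is still above the limit of detection."""
    if total_titer_mg_per_liter <= 0:
        raise ValueError("total titer must be positive")
    return 100.0 * lod_mg_per_liter / total_titer_mg_per_liter


if __name__ == "__main__":
    lod = 0.1  # mg/liter, as stated in the text
    # At the highest titer reported in this study (1.7 g/liter = 1,700 mg/liter).
    print(f"{min_detectable_share(lod, 1700.0):.4f} %")
    # A total titer of 100 mg/liter corresponds to the 0.1% threshold.
    print(f"{min_detectable_share(lod, 100.0):.4f} %")
```

Under these assumptions, shares below 0.1% become detectable once the total HAA titer exceeds about 100 mg/liter.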
As expected from the mono-rhamnolipids produced by the wild-type strain P. chlororaphis NRRL-B-30761 (15), our congener determination with plasmid pFLU revealed a main congener spectrum different from that obtained with pPA2 from P. aeruginosa. Accordingly, we detected C10-C12 and C10-C12:1 among the main congeners, but additionally, we identified C10-C14 and C10-C14:1 in even slightly larger fractions. The C10-C14 congener was also detected by Gunther et al. (15), though in a smaller fraction. In contrast to pPA2, where the longest detected chain contained 14 carbon atoms, with pFLU, congeners containing C16, C16:1, or even C18:1 chains were present.
The two RhlA proteins from the Betaproteobacteria branch (pBUG and pAMB) yielded mainly HAAs with 14 carbon atoms in both chains. For pBUG, this was expected due to the phylogenetic classification with other Burkholderia strains, for which C14-C14 rhamnolipid production has been reported in wild-type (19, 21-23) and recombinant strains (20). With pBUG, 16% of the HAAs incorporated at least one C16 or C16:1 fatty acid. In the phylogenetic tree shown in Fig. 3, the RhlA encoded on pAMB is arranged with RhlA from Dietzia maris (28), which was shown to produce C10-C10-containing rhamnolipids. Therefore, the result with mainly C14 chain lengths was unexpected. Besides the main fraction of C14 chain lengths, we found with this plasmid the most significant fraction of unusual congeners containing odd-numbered chain lengths, namely, 12% containing at least one chain with 13 carbon atoms and traces containing C15 or C15:1. To further confirm the presence of these odd-numbered hydroxy fatty acids, we conducted LC-MS/MS measurements applying high-resolution MS. Besides high resolution, the instrument used also delivers high mass accuracy (<5 ppm relative mass deviation compared to the theoretical value). Hence, elemental compositions can be deduced not only for the intact HAA molecule but also for the fragments in MS/MS mode. Exemplary data are presented in Fig. 6A, where the high-resolution MS/MS mass spectrum of an HAA molecule contain-  (39). Therefore, the detection of these two fragments also demonstrates that two congeners are contained, i.e., HAA C13:0-C14:1 and HAA C14:1-C13:0. Furthermore, the presence of odd-chain hydroxy fatty acids was confirmed using complementary gas chromatography-mass spectrometry (GC-MS) analysis. HAA samples were hydrolyzed and derivatized to yield the corresponding methyl esters. Additional trimethylsilylation of the hydroxy group facilitated the assignment of chain length as well as position of the hydroxy group in the mass spectrum obtained by electron ionization (Fig. 6B).
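The mass accuracy criterion mentioned above (<5 ppm relative mass deviation) can be written out as a short calculation. The following Python sketch uses hypothetical function names, and the measured value is invented purely for illustration. The theoretical value is the monoisotopic m/z of [M-H]- for C16H30O5, which we assume to be the composition of a saturated C8-C8 HAA.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Relative mass deviation in parts per million (signed)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz: float, theoretical_mz: float, tol_ppm: float = 5.0) -> bool:
    """True if the absolute deviation is below the tolerance in ppm."""
    return abs(ppm_error(measured_mz, theoretical_mz)) < tol_ppm


if __name__ == "__main__":
    theoretical = 301.2020   # [M-H]- of C16H30O5 (assumed C8-C8 HAA)
    measured = 301.2030      # hypothetical reading, for illustration only
    print(f"{ppm_error(measured, theoretical):.2f} ppm")
    print(within_tolerance(measured, theoretical))
```

At m/z 301, a 5 ppm window corresponds to roughly 0.0015 m/z units, which is why accurate mass can separate candidate elemental compositions that share the same nominal mass.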
Most surprising and divergent were the results we obtained with the plasmids carrying genes from the Enterobacterales and Halomonas species. Though the RhlA homologs form their own branch (Fig. 3), the HAAs detected with the individual plasmids do not show common characteristics. Again, for pPLY, HPLC-MS/MS did not confirm HAAs in the culture supernatant but did confirm other fatty acids. Notably, these free fatty acids showed retention similar to that of HAAs on the HPLC column used. With unspecific detection, such as charged aerosol detection or evaporative light scattering detection, false annotation therefore cannot be ruled out. With plasmid pDAD, a spectrum comparable to that of pFLU was observed. Strikingly, the main fraction contained C10-C14 or C10-C14:1 (26 and 47%, respectively), indicating a high specificity for these congeners. With pANA, the main congeners contained C10-C10 (31%), C10-C12 or C10-C12:1 (27%), and C10-C14 or C10-C14:1 (21%), which is comparable to the congeners found with pPA2 and pFLU. Plasmid pHAL, in contrast, showed congeners like the Burkholderia strains with saturated and monounsaturated C14 chains.
The congeners that were produced covered the entire HAA spectrum known in wild-type Proteobacteria species. The congener C8-C8 produced by P. aeruginosa 57RP (49, 52) was found in some, but not all, experiments with pPA2 and hence is not listed in Table 1.

DISCUSSION
The esterification of (hydroxy) fatty acids as catalyzed by RhlA is a rare enzyme activity. A similar activity can be found in the black yeast Aureobasidium pullulans. In this strain, liamocin, a glycolipid consisting of mannitol linked with three or four 3,5-dihydroxydecanoic ester groups, is produced (53). Our survey for RhlA in microorganisms showed its presence mainly in the classes Betaproteobacteria and Gammaproteobacteria, with little evidence in other taxa. We exploited the natural diversity of RhlA, allowing the synthesis of distinct HAA congener mixtures using E. coli as a host. The confirmed substrate specificity of RhlA opens the door for the production of tailor-made HAAs. Fig. 2 shows that reported rhamnolipid producers are not restricted to representatives of the Betaproteobacteria and Gammaproteobacteria. However, we and others (46,54) have experienced difficulties in reproducing and confirming previous studies showing rhamnolipid synthesis by bacteria of different phyla. In many cases, we detected neither rhamnolipid production nor genetic evidence for rhamnolipid synthesis despite having cultivated and sequenced, respectively, the reported rhamnolipid producers. Having had similar experiences, Irorere et al. (46) ascertained that unequivocal analytical techniques to determine rhamnolipid production were not used and concluded that particular reports might be erroneous. As mentioned above, Jadhav et al. (45), for example, reported that P. desmolyticum NCIM-2112 produced mono-rhamnolipids with fatty acid chain lengths of six to eight carbon atoms. Our efforts to identify an rhlA homolog after genome sequencing failed. We cultivated the organism as described by the authors but detected no rhamnolipids using HPLC-MS/MS (data not shown). The question of whether P. desmolyticum encodes an enzyme with RhlA activity but of a different phylogeny remains. Our observations are consistent with those of Kügler et al. (54), who found no evidence supporting reports of rhamnolipid production by Actinobacteria. Indeed, we found no RhlA with homology searches in non-Proteobacteria species, with the exception of the actinobacterium Dietzia maris As-13-3. A detailed survey of the analytical methods applied for the identification of novel rhamnolipid-producing strains is necessary. In this regard, the reports of rhamnolipid production in Renibacterium salmoninarum 27BN (27), Tetragenococcus koreensis JS (25), and Aspergillus sp. strain MFS1 (31) do not fulfill the criteria for unequivocal rhamnolipid identification proposed by Irorere et al. (46).
The diversity of HAA congeners might be broadened by identifying RhlAs from Betaproteobacteria and Gammaproteobacteria. The evolutionary relationships between the known RhlAs (Fig. 3) are to a large extent consistent with the species phylogeny based on 16S rRNA gene sequences (Fig. 2). However, it is striking that the RhlA proteins within the genus Pseudomonas do not form a monophyletic lineage with the RhlAs from the Enterobacterales (Serratia, Dickeya, Lonsdalea, and Pantoea) as the 16S rRNA genes do. While pseudomonads and Enterobacterales are both Gammaproteobacteria, the RhlA proteins of the Enterobacterales are outgrouped, forming a separate branch.
Within the pseudomonads, species-specific HAA congeners could be produced. While pPA2 is the most prominent C10-C10 producer, we detected with pFLU a C12 or C14 chain combined with a C10 chain, which confirms the findings of Gunther et al. (15). However, Gunther et al. found C10-C12 or C10-C12:1 as the main congener; in our study, C10-C14 and C10-C14:1 turned out to be even more prominent.
The Burkholderia species B. plantarii, B. thailandensis, and B. mallei synthesize mainly C14-C14 rhamnolipids (19, 21-23, 55). This was confirmed in this study. Using the B. plantarii RhlA, the average carbon chain was determined to have 14.0 carbon atoms. However, in terms of RhlA diversity, the phylogeny of the known RhlA proteins indicates that other Burkholderiaceae might produce HAAs with shorter fatty acids. B. kururiensis, belonging to the genus Paraburkholderia (24), is reported to mainly produce the C10-C10 rhamnolipid congeners. Two explanations for this finding are possible. On the one hand, a C10-specific protein might have been transferred to B. kururiensis from, e.g., P. aeruginosa via horizontal gene transfer. On the other hand, the rhlA might have evolved from the original Burkholderia-type rhlA to be more promiscuous toward shorter fatty acid chain lengths.
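An average carbon chain length such as the 14.0 quoted above can be obtained as a share-weighted mean over the congeners. The following Python sketch shows one plausible way to compute it; the function name is hypothetical, and the exact averaging procedure used for Table 1 is not stated in the text. As input, the example uses the three main pANA fractions given in the Results (saturated and monounsaturated chains pooled), so the remaining minor congeners are ignored and the result is illustrative only.

```python
from typing import Dict, Tuple


def mean_chain_length(shares: Dict[Tuple[int, int], float]) -> float:
    """Share-weighted mean number of carbon atoms per hydroxy fatty acid chain.

    `shares` maps (carbons in chain 1, carbons in chain 2) to a relative
    share; the shares are renormalized, so they need not sum to 100.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("shares must sum to a positive value")
    weighted = sum(share * (a + b) / 2 for (a, b), share in shares.items())
    return weighted / total


if __name__ == "__main__":
    # Main congeners reported for pANA: C10-C10 31%, C10-C12(:1) 27%, C10-C14(:1) 21%.
    pana_main = {(10, 10): 31.0, (10, 12): 27.0, (10, 14): 21.0}
    print(f"{mean_chain_length(pana_main):.2f}")
    # A pure C14-C14 spectrum gives exactly 14.
    print(mean_chain_length({(14, 14): 100.0}))
```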
Most new congeners with odd chain lengths were produced when RhlA from B. ambifaria was applied, which we confirmed by GC-MS analysis (Fig. 6B). It has been shown that, instead of acetyl-CoA, propionyl-CoA can be accepted by the enzyme FabH as a precursor for chain elongation, resulting in odd-chain-length fatty acids (56,57). FabH varies depending on its bacterial origin and accepts acetyl-CoA or propionyl-CoA in bacteria synthesizing straight-chain fatty acids, while in branched-chain fatty acid-producing bacteria, branched-chain acyl-CoAs serve as precursors for chain elongation (57). Our results showing straight C13 chains when using pAMB in E. coli indicate that FabH of E. coli is of the straight-chain type and delivers the substrate for the B. ambifaria RhlA.
Most interesting, and representing the group with the most potential for novel HAA congeners, are the results we obtained with RhlA proteins from the Enterobacterales and Halomonas. Except for Pantoea and Lonsdalea, the genes coding for RhlAs in this group are not colocalized with other genes related to glycolipid formation and thus are difficult to distinguish from phaG. In contrast to the Pseudomonas and Burkholderia branches shown in Fig. 3, RhlA homologs from five genera are combined in the third branch. With only four RhlAs tested, we found a diversity within this group ranging from no HAAs with pPLY, through congeners similar to those of the pseudomonads (pDAD and pANA), to a Burkholderia-like spectrum with pHAL. So far, few results from Enterobacterales have been presented in the literature. Reports about rhamnolipid formation by the wild-type strains Enterobacter asburiae (17,18), Enterobacter hormaechei, P. stewartii (17) (Fig. 2), and P. ananatis BRT175, a strain producing a glycolipid with a sugar moiety other than rhamnose (34,41), show fatty acid chain lengths similar to those we detected with pANA and pDAD. Though the HAA spectra from, e.g., pDAD and pFLU or pHAL and pBUG are similar, the RhlAs are only distantly related and not arranged in the same phylogenetic lineage. The diversity of HAAs within the Enterobacterales indicated by long branch lengths hints at the existence of more proteins with RhlA activity in this and other orders of the Gammaproteobacteria, such as the Oceanospirillales. With confirmed RhlA activity from numerous species, the sparse tree depicted in Fig. 3 might develop toward distinct branches related to genera. A tendency can already be seen with our data obtained with pHAL, pDAD, pPLY, and pANA. Eliminating the unconfirmed S. plymuthica strain, to date, three strains from Lonsdalea and Pantoea form their own lineage. In these species, a colocalization of the rhlA homologs with an rhlB homolog is found.
Our results indicate that the rhlA genes are conserved within microbial genera. Because RhlA mainly determines substrate specificity in the rhamnolipid synthesis pathway, the main fatty acid congener of HAAs and rhamnolipids can be inferred from knowledge of the species of the producing organism. Insights into the correlation between microbial and RhlA phylogeny and RhlA specificity may be fostered by additional genomic and production data from rhamnolipid producers, ideally increasing the number of HAA-producing species and genera for which genetic evidence for rhlA genes exists.
HAAs synthesized by non-Proteobacteria. Toribio et al. (58) argued in 2010, when hundreds of genomes were already available in databases, that the rare occurrence of rhlA homologs outside of the Betaproteobacteria and Gammaproteobacteria suggested that horizontal gene transfer occurs only in rare circumstances. This conclusion agrees with the results for D. maris discussed here (28). Although 360,000 genomes are currently available in the Genomes On Line Database (GOLD), and many more are available in others, the early observation by Toribio et al. (58) is still valid. Although no evidence is presented here, a massive gene loss in most other phyla and genera cannot be excluded. With BLAST searches, RhlAs cannot be detected in, e.g., Thermus. It is possible that, despite a common ancestor, the sequences diverged so far during evolution that homology searches no longer detect them. This hypothesis is supported by the fact that the Gammaproteobacteria RhlA proteins from P. aeruginosa and P. ananatis, which show a similar HAA spectrum, share a mere 35% identity or, to name another example, the proteins from B. plantarii PG1 and Halomonas spp. show only 50% identical positions (data not shown). Alternatively, rhamnolipids might be synthesized by proteins that do not share an evolutionary origin with RhlA. Some evidence exists for alternative genes, especially for strains of the genus Thermus, which is encouraging. Pantazaki et al. (29) reported the production of HAAs and rhamnolipids with chain lengths of 8 to 14 carbon atoms using the fully sequenced strain T. thermophilus HB8. No homologs of rhlA and rhlB were found in the genome using conventional BLAST approaches. The main congener detected in Thermus sp. CCM 2842 was Rha-C16-C16, and fatty acid chain lengths of up to 24 carbon atoms occur in small fractions (30), indicating that an RhlA with different substrate specificity exists; again, no genetic evidence is available.
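Identity values such as the 35% and 50% quoted above depend on how aligned positions are counted. The following Python sketch shows one common definition, the fraction of identical residues among alignment columns without gaps; the function name is hypothetical, the toy sequences are invented, and the definition actually used for the values in the text is not stated.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # columns with a gap are not counted
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Invented toy alignment: 6 gap-free columns, 4 of them identical.
    print(f"{percent_identity('MKR-LLA', 'MRRALLG'):.1f} %")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which yields lower values for the same alignment, so identities from different sources are comparable only when the definition is known.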
The numerous reports of HAA and rhamnolipid synthesis by species not belonging to Betaproteobacteria and Gammaproteobacteria remain something of a mystery, with explanations as divergent as erroneous analytics, horizontal gene transfer, massive gene loss, and parallel evolution. The challenge of identifying the genetic origin of rhamnolipid synthesis in phyla such as Firmicutes, Actinobacteria, and Deinococcus-Thermus thus remains.
detection was carried out with electrospray ionization in negative ionization mode. Structural information was provided by performing additional MS/MS experiments on two different mass spectrometers as follows. Samples of E. coli C43(DE3) pPA2 and C43(DE3) pBUG were analyzed on a Micromass Quattro micro triple quadrupole mass spectrometer (product ion scans) as detailed previously (37). MS/MS characterization of extracts from E. coli C43(DE3) pANA, C43(DE3) pFLU, and BL21(DE3) pPA2 was carried out on a linear ion trap mass spectrometer (LTQ XL; Thermo Fisher Scientific, Inc., San Jose, CA, USA) under the conditions described by Behrens et al. (38).
Additional confirmatory experiments were conducted using high-resolution MS. The analytes were identified by their accurate masses detected on a QExactive hybrid quadrupole Orbitrap (Thermo Fisher Scientific, Waltham, MA, USA) mass spectrometer. The instrument was operated in negative electrospray ionization mode with the following parameters: spray voltage, 3.0 kV; sheath gas, 40 arbitrary units (AU); auxiliary gas, 10 AU; sweep gas, 1 AU; resolution, 140,000× (full width at half maximum [FWHM] at m/z 200); and mass range, m/z 200 to 1,000.
The intact HAAs were detected as deprotonated molecules ([M-H]⁻); e.g., a peak at m/z 301 was observed for C8-C8 HAA. MS/MS product ion spectra were dominated by the cleavage of the ester bond between the two β-hydroxy fatty acids as described by Lépine et al. (39). The product ion spectrum of the parent ion at m/z 301 showed a major fragment at m/z 159, which corresponds to a C8 fatty acid moiety, thus confirming the assignment of the parent as C8-C8 HAA. Fragments with m/z 131 and 187 were also present. These ions indicate the presence of C6 and C10 fatty acid moieties, therefore confirming by LC-MS/MS that not only C8-C8 HAA but also C6-C10 and C10-C6 HAAs were present.
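The mass assignments above can be checked with a short calculation. The following is a minimal sketch, not part of the original analysis; it assumes fully saturated 3-hydroxy fatty acids (CnH2nO3), an HAA formed from two of them with loss of one water, and fragment ions that are the deprotonated free hydroxy fatty acids. The function names are illustrative.

```python
# Monoisotopic masses in unified atomic mass units (standard values).
C, H, O = 12.0, 1.00782503, 15.99491462
PROTON = 1.00727647          # mass removed on deprotonation, [M-H]-
WATER = 2 * H + O            # lost when the ester bond is formed


def hydroxy_fatty_acid_mass(n_carbons: int) -> float:
    """Neutral mass of a saturated 3-hydroxy fatty acid, CnH2nO3 (assumed)."""
    return n_carbons * C + 2 * n_carbons * H + 3 * O


def haa_mass(m: int, n: int) -> float:
    """Neutral mass of a Cm-Cn HAA: two hydroxy fatty acids minus one water."""
    return hydroxy_fatty_acid_mass(m) + hydroxy_fatty_acid_mass(n) - WATER


def deprotonated_mz(neutral_mass: float) -> float:
    """m/z of the singly charged [M-H]- ion."""
    return neutral_mass - PROTON


if __name__ == "__main__":
    # Parent ions: the three congeners named in the text are isobaric.
    for m, n in [(8, 8), (6, 10), (10, 6)]:
        print(f"C{m}-C{n} HAA [M-H]-: {deprotonated_mz(haa_mass(m, n)):.4f}")
    # Fragment ions assumed to be deprotonated hydroxy fatty acids.
    for n in (6, 8, 10):
        mz = deprotonated_mz(hydroxy_fatty_acid_mass(n))
        print(f"C{n} fragment [M-H]-: {mz:.4f}")
```

Under these assumptions the nominal values are m/z 301 for the parent ion and m/z 131, 159, and 187 for the C6, C8, and C10 fragments. Because the C8-C8, C6-C10, and C10-C6 congeners share one parent mass, only the fragment ions distinguish them.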
Confirmation of hydroxy fatty acids by GC-MS. The HAAs were analyzed using gas chromatography-mass spectrometry (GC-MS). For this purpose, an aliquot of each sample was dried under a gentle stream of nitrogen and hydrolyzed with 0.5 M NaOH in MeOH-H2O solution (9:1 [vol/vol], 2 ml, 70°C, 1 h). Afterward, the solution was acidified to pH 3 with 1 M HCl, and the fatty acids were extracted with chloroform (3 × 3 ml). After removal of the solvent, fatty acid methyl esters (FAMEs) were prepared by adding 100 µl of BF3-MeOH (14%, wt/vol) and heating (75°C, 1 h). Then, 2 ml of H2O was added, and the FAMEs were extracted with chloroform (3 × 2 ml). The solvent was evaporated using a gentle stream of nitrogen. The residue was redissolved in 25 µl of pyridine and 50 µl of the silylating agent (BSTFA:TMCS [99:1, vol/vol]) and then heated (70°C, 1 h). Finally, the silylating agent was removed under a gentle stream of nitrogen, and the residue was redissolved in 0.2 ml n-hexane and used for GC-MS analysis.
After derivatization, samples were analyzed using a GCMS-QP-2020 equipped with a Nexis GC-2030 gas chromatograph (both Shimadzu, Kyoto, Japan). A 30-m, 0.25-mm-inside-diameter (i.d.), 0.25-µm-film-thickness DB-5MS column (J&W Scientific, Folsom, CA, USA) was used for the separation. Samples (1 µl) were injected using an AOC-20i Plus autosampler (Shimadzu, Kyoto, Japan) and a programmed temperature vaporization (PTV) inlet (250°C) in splitless mode. Helium (5.0) was used as a carrier gas at a flow rate of 1.22 ml/min. The column oven was programmed as follows: starting at 50°C, the temperature was increased at a rate of 10°C/min to 300°C, which was held for 10 min. Mass spectra were obtained by electron ionization (EI; 70 eV). The temperatures of the ion source and interface were set to 250°C. Data were recorded from m/z 50 to 500 at a rate of 10 scans/s. For comparison of retention times and fragmentation patterns, a bacterial acid methyl ester standard solution (BAME) (47080-U; Sigma-Aldrich, Steinheim, Germany) was used (10-fold diluted with methyl tert-butyl ether).
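For orientation, the run time implied by the oven program can be worked out as follows. This is an illustrative sketch only; it assumes no initial hold at 50°C and that data are acquired over the whole run without a solvent delay, neither of which is stated above. The function name is hypothetical.

```python
def oven_program_minutes(start_c: float, end_c: float,
                         ramp_c_per_min: float, final_hold_min: float,
                         initial_hold_min: float = 0.0) -> float:
    """Total duration of a single-ramp GC oven program in minutes."""
    ramp_min = (end_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# Values from the method: 50°C to 300°C at 10°C/min, then a 10-min hold.
total_min = oven_program_minutes(50, 300, 10, 10)

# At 10 scans/s, an upper bound on spectra per run (assumes no solvent delay).
max_scans = int(total_min * 60 * 10)

if __name__ == "__main__":
    print(f"run time: {total_min:.0f} min, at most {max_scans} scans")
```

With these assumptions the ramp takes 25 min and the full program 35 min per injection.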
Computational methods. The evolutionary history was inferred using the neighbor-joining method (62). Evolutionary analyses were conducted in MEGA7 (63).
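To illustrate the principle of the neighbor-joining method, the following minimal sketch joins taxa from a pairwise distance matrix. It is not the MEGA7 implementation and does not reproduce the analysis reported here; the taxon labels and distances are hypothetical toy values chosen to be exactly additive, and ties are broken by input order.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Neighbor joining on a distance table.

    names: list of taxon labels.
    dist:  dict mapping frozenset({x, y}) -> distance.
    Returns (joins, last_edge), where each join is
    (new_node, child_i, branch_i, child_j, branch_j) and last_edge is
    (node_a, node_b, length) connecting the final two nodes.
    """
    nodes = list(names)
    d = dict(dist)
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all others.
        r = {x: sum(d[frozenset((x, y))] for y in nodes if y != x)
             for x in nodes}

        # Q criterion: the pair with the smallest value is joined.
        def q(pair):
            i, j = pair
            return (n - 2) * d[frozenset(pair)] - r[i] - r[j]

        i, j = min(combinations(nodes, 2), key=q)
        dij = d[frozenset((i, j))]
        # Branch lengths from the joined taxa to the new internal node.
        li = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))
        lj = dij - li
        u = f"node{len(joins) + 1}"
        # Distances from the new node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij)
        nodes = [k for k in nodes if k not in (i, j)] + [u]
        joins.append((u, i, li, j, lj))
    a, b = nodes
    return joins, (a, b, d[frozenset((a, b))])


# Hypothetical toy distances between five taxa (not data from this study).
taxa = ["a", "b", "c", "d", "e"]
matrix = [[0, 5, 9, 9, 8],
          [5, 0, 10, 10, 9],
          [9, 10, 0, 8, 7],
          [9, 10, 8, 0, 3],
          [8, 9, 7, 3, 0]]
toy = {frozenset((taxa[i], taxa[j])): float(matrix[i][j])
       for i in range(5) for j in range(i + 1, 5)}
joins, last_edge = neighbor_joining(taxa, toy)

if __name__ == "__main__":
    for step in joins:
        print(step)
    print("final edge:", last_edge)
```

In practice the distances would be derived from aligned RhlA sequences; the sketch covers only the tree-building step.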
Accession numbers. The corrected sequence of the rhlA gene in P. ananatis LMG 20103 and the sequence containing PhaG in P. desmolyticum were deposited under the GenBank accession numbers MF671909 and MG099922, respectively. The codon-optimized rhlA homologs for the construction of pHAL and pPLY are accessible under MN369027 and MN369028.