Expanding gene families helps generate the metabolic robustness required for antibiotic biosynthesis

Expanding the genetic repertoire of an organism by gene duplication or horizontal gene transfer (HGT) can aid adaptation. Streptomyces species are prolific producers of bioactive specialised metabolites with adaptive functions in nature and some have found utility in human medicine such as antibiotics. Whilst the biosynthesis of these specialised metabolites is directed by dedicated biosynthetic gene clusters (BGCs), little attention has been focussed on how these organisms have evolved robustness into their genomes to facilitate the metabolic plasticity required to provide chemical precursors for biosynthesis. Here we show that specific expansions of gene families in central carbon metabolism have evolved and become fixed in Streptomyces bacteria to enable plasticity and robustness that maintain cell functionality whilst costly specialised metabolites are produced. These expanded gene families, in addition to being a metabolic adaptation, make excellent targets for metabolic engineering of industrial specialised metabolite producing bacteria.



Introduction
A remarkable feature of specialised metabolite producing Actinobacterial genomes is the annotation of multiple genes that encode the same putative biochemical function [1,2]. This expansion of gene families by gene duplication or HGT is thought to introduce robustness into biological systems, which in turn facilitates evolvability and adaptation [3-5]. The expansion of gene families results in relaxed selection following the gene duplication or HGT event, which allows the accumulation of mutations that enable diversification of function [6]. This suggests that gene family expansion within genomes is a key driver of biological innovation by facilitating adaptation [7]. The production of extensive specialised metabolites by certain Actinobacterial lineages is thought to be a key adaptive response to life in complex, highly competitive environments such as soil [8-10] and, as such, may drive the expansion of primary metabolic capability, providing the metabolic robustness that facilitates the evolution of novel biosynthetic functions.

Surprisingly, many central metabolic enzymes are non-essential for survival due to genetic redundancy through the presence of isoenzymes or alternative reactions. This redundancy allows cells to adapt to a variety of habitats and dynamic environmental conditions through provision of metabolic plasticity [11]. Whilst this has been studied in the unicellular enteric bacterium Escherichia coli and the yeast Saccharomyces cerevisiae [12], little attention has been paid to organisms with extensive specialised metabolism. Gene families of Actinobacterial developmental genes have been studied at the genetic level [7,13,14], but little attention has been paid to either primary or specialised metabolism [15] and how the supply of biosynthetic precursors is maintained during the adaptive response under challenging environmental conditions.
In Actinobacteria, production of specialised metabolites is frequently growth phase dependent, usually occurring in response to nutrient starvation and during entry into sporulation [16]. This creates a potential metabolic conflict for an organism, where declining availability of metabolites may constrain certain cellular processes in favour of others, such as reducing cellular pools of metabolites that are used directly for specialised metabolites. Under these conditions it is likely that genetic redundancy can promote robustness and plasticity that help to maintain cellular function in the face of perturbation [17,18].

Here we systematically examine the genetic redundancy within the genomes of specialised metabolite producing Actinobacteria to understand how genetic robustness enables the evolution of extensive specialised metabolism. Moreover, a detailed functional analysis of a redundant pyruvate kinase gene pair from Streptomyces coelicolor A3(2) indicates that biochemical diversification at the enzyme level facilitates the evolution of distinct physiological roles, which enables functionality during the metabolic reprogramming that is associated with physiological differentiation.

Results
Gene expansion events are overrepresented in specialised metabolite producing Actinobacteria
To determine if gene expansion events in central metabolism occur with greater frequency in specialised metabolite producing organisms, a database of 614 Actinobacterial genomes spanning 80 genera was compiled. All genomes were retrieved from GenBank and re-annotated with RAST [19] to ensure consistency of annotation across the database, and were then analysed in a bespoke bioinformatics pipeline based on EvoMining [20] (Supplementary Fig. S1). It was hypothesised that, if precursor supplying pathways are a contributing factor to the adaptive response of specialised metabolite production, then the enzymatic nodes contributing to precursor supply should be overrepresented in the database (Table 1 and Supplementary Table S1).

Expansion events were defined as cases where the number of enzyme family members per suborder was equal to or higher than the mean number of members per phylum plus its standard deviation. The glycolytic pathway showed the highest number of gene expansion events in the Streptomycineae and Catenulisporineae, with 23.3% and 25.0% more genes encoding glycolytic function, respectively, than the average for that pathway in the phylum Actinobacteria (Table 1). Pseudonocardineae showed the highest number of gene expansions in gluconeogenesis (25% higher than the mean phylum value) and in the TCA cycle (28.3% higher than the mean phylum value). This was also true for many amino acid biosynthetic pathways. Where the main precursor is derived from 2-oxoglutarate (Glu, Gln, Pro, Arg), expansion was 20.1% more than the mean suborder value; expansions were also observed for pyruvate derived amino acids (Ala, Ile, Leu, Val; 31.3%), oxaloacetate derived amino acids (Asp, Asn, Thr, Met, Lys; 20.7%), 3-PGA derived amino acids (Gly, Ser, Cys; 24%) and E4P/PEP derived amino acids (Tyr, Phe, Trp; 19.4%).
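The expansion criterion defined above (suborder value at or above the phylum mean plus one standard deviation) can be illustrated with a minimal sketch. It assumes that the phylum mean and standard deviation are computed over per-genome copy numbers and that the population standard deviation is used; the text does not specify either choice. The function names and all copy numbers are hypothetical and are not taken from Table 1.

```python
from statistics import mean, pstdev


def expansion_threshold(phylum_counts):
    """Phylum mean plus one (population) standard deviation."""
    return mean(phylum_counts) + pstdev(phylum_counts)


def find_expansions(suborder_means, phylum_counts):
    """Return the suborders whose mean copy number meets the threshold."""
    threshold = expansion_threshold(phylum_counts)
    return [name for name, value in suborder_means.items() if value >= threshold]


# Hypothetical copy numbers of one enzyme family across eight genomes.
phylum_counts = [1, 1, 1, 1, 2, 2, 1, 1]

# Hypothetical mean copy numbers for two suborders.
suborder_means = {"suborder_A": 1.0, "suborder_B": 2.0}

threshold = expansion_threshold(phylum_counts)
expanded = find_expansions(suborder_means, phylum_counts)
print(f"threshold = {threshold:.3f}, expanded = {expanded}")
```

With these illustrative numbers, only the suborder carrying two copies on average is flagged as an expansion event.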
Focusing on the genus Streptomyces, which is renowned as being amongst the most talented of genera in terms of specialised metabolite production, it was found that 14 enzymatic steps from central metabolism (glycolysis, TCA cycle and amino acid metabolism) represented gene expansion events, such that they are overrepresented in this genus compared to the rest of the database. The following enzyme functions were found to be overrepresented in the genus Streptomyces compared to the whole Actinobacterial phylum: phosphofructokinase (PFK), pyruvate kinase (PK), pyruvate phosphate dikinase (PPDK), malic enzyme (ME), pyruvate dehydrogenase complex E1 (PDHC E1), chorismate mutase, acetylglutamate kinase, diaminopimelate decarboxylase, aspartate aminotransferase, aspartate-semialdehyde dehydrogenase, serine hydroxymethyltransferase, glutamine synthetase, argininosuccinate lyase and methionine synthetase (Table S1). To investigate how gene expansions in Actinobacteria are a potential prerequisite for increasing robustness in specialised metabolism capability, the two pyruvate kinases from Streptomyces were studied further due to their central role in carbon metabolism, linking glycolysis, gluconeogenesis and the TCA cycle.

[...] and Streptosporangineae (Fig. 1A).

A second phylogeny of the annotated PKs of the Actinobacteria was constructed. It indicated a high level of congruence with the RpoB phylogeny, as expected for a central metabolic enzyme (Fig. 1B). However, a bifurcating topology within the Streptomycineae family was observed, which contained the two genes encoding the putative PKs. This topology indicates that a gene duplication event occurred, which gave rise to two PKs within this group. Analysis of 286 Streptomyces species showed that 281 species have duplicate copies of PK, three species possess a single copy (S. somaliensis, S. sp. NRRL F5135 and S.
scabrisporus), two species have three copies (S. olindensis and S. sp. AcH505) and a single species has four copies (S. resistomycificus). Interestingly, S. sp. AcH505 and S. resistomycificus had one copy of pyk in each main branch of the PK tree, and the additional copies were found to be phylogenetically distant, suggesting that these copies were acquired through horizontal gene transfer (HGT). Overall, 92% (302 of 327) of Actinobacterial genomes outside of the genus Streptomyces encoded a single PK, reinforcing the uniqueness of the duplication in this genus (Fig. 1B).

To determine if the duplicate PKs annotated in the Streptomyces genome have pyruvate kinase activity, we used the two PKs from the model streptomycete, S. coelicolor A3(2), in genetic complementation tests of PK mutants of Escherichia coli. E. coli also has two PKs: a primary enzyme, pykF, which is a Type I enzyme regulated allosterically by fructose 1,6-bisphosphate (FBP), and a distinct secondary Type II PK (pykA), regulated allosterically by [...] with the isogenic parental strain (E. coli BW25113) for their ability to grow under a range of physiological conditions (Fig. 1C). In LB (for which PK is dispensable for growth) and M9 plus acetate as the sole carbon source (where PK is also dispensable for growth), little difference was observed in the specific growth rate (h⁻¹) of the strains (data not shown). When the strains were grown in M9 plus glucose as the sole carbon source (where PK is essential for growth), the E. coli ∆pykA∆pykF double mutant was unable to grow, but could be genetically complemented with either pyk1 (SCO2014) or pyk2 (SCO5423) from S. coelicolor. The individual E. coli ∆pykA and ∆pykF mutants had reduced specific growth rates (around 50% of the isogenic parent strain) in M9 plus glucose. Genetic complementation with pyk1 or pyk2 from S. coelicolor was able to fully restore growth of an E.
coli ∆pykF mutant, as expected. The E. coli ∆pykA mutant could only be complemented with pyk1 from S. coelicolor, suggesting a much more limited physiological role for pykA in E. coli (Fig. 1C). These data confirm that both pyk1 and pyk2 from S. coelicolor have retained PK activity following the duplication event, but suggest that each has diverged and evolved different physiological roles.
Given that the PKs in S. coelicolor have diverged following duplication, we assessed the level of selection imposed on the PKs of Streptomyces by calculating the ratio of non-synonymous changes (dN) to synonymous changes (dS). Twenty PK sequences from 10 Streptomyces genomes were chosen to calculate the dN, dS and dN/dS values. Pairs of pyk sequences from each genome yielded dN/dS ratios ranging from 0.407 to 0.500, suggesting that PKs in Streptomyces are under strong purifying selection (Table S2). Such high levels of purifying selection indicate that the duplication event in Streptomyces is likely to be ancient, which is consistent with the PK tree topology (Fig. 1B).
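As a hedged illustration of how a dN/dS ratio is conventionally read (values below 1 indicate purifying selection, values near 1 neutrality, and values above 1 positive selection), the sketch below classifies a ratio. It does not estimate dN or dS from sequence alignments, which requires a dedicated codon-substitution method; the function names, the example dN and dS values and the neutrality tolerance are hypothetical. Only the range 0.407 to 0.500 comes from the text.

```python
def dn_ds_ratio(dn, ds):
    """Ratio of non-synonymous to synonymous substitution rates."""
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio")
    return dn / ds


def classify_selection(omega, tolerance=0.05):
    """Conventional reading of dN/dS; the tolerance band is an assumption."""
    if omega < 1 - tolerance:
        return "purifying"
    if omega > 1 + tolerance:
        return "positive"
    return "neutral"


# Hypothetical dN and dS values chosen to give a ratio of 0.5.
omega = dn_ds_ratio(0.2, 0.4)

# The two ends of the range reported in the text (0.407 to 0.500).
reported = [classify_selection(w) for w in (0.407, 0.500)]
print(omega, reported)
```

Both ends of the reported range fall below 1 and are therefore read as purifying selection under this convention.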

The two pyruvate kinases in Streptomyces have distinct physiological roles
To determine the roles played by the PKs in growth, development and antibiotic production, a series of mutant S. coelicolor strains was constructed and genetically complemented (Tables S3, S4 and Fig. S2). Deletion mutants and transposon insertion mutants showed similar phenotypes (Fig. S2), and all subsequent work was carried out with transposon insertion mutants. Growth on nutrient agar showed no differences between the strains, except when an additional copy of pyk1 was present in WT in trans (Fig. 2A). During culture on solid minimal medium with 1% glucose as carbon source, the strains showed no growth defects when compared to wild-type (Fig. 2A). Interestingly, the pyk1::Tn5062 mutant showed an increase in specialised metabolite production (Fig. 2A, 2C & 2D). The pyk2::Tn5062 strain was marginally affected in growth and showed no overproduction of specialised metabolites (Fig. 2A). No changes in growth rate were observed in rich medium (YEME medium) for the WT, mutants or complemented strains (Fig. 2B). However, growth of the strains in this medium showed an increase in production of coelimycin [24] and undecylprodigiosin (RED) in the pyk1::Tn5062 mutant (Fig. 2C), whereas the pyk2::Tn5062 mutant showed reduced antibiotic yields (Fig. 2C & 2D). These data suggest that each PK isoenzyme plays a distinct [...]

[...] Streptomyces is likely to be controlled at the post-translational level. Expression of pyk2 also decreased on Tween compared to glucose during both phases, but the change was not significant (Fig. 3Bii).

[...] Table 2), while Vmax also increased 3.5-fold (from 21 U/mg to 73.3 U/mg). Pyk2 showed a five-fold increase of Vmax in the presence of 1 mM AMP (from 1.2 U/mg to 6.7 U/mg). S0.5 decreased three-fold (from 0.27 mM to 0.09 mM; Table 2, Fig. S2).
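The kinetic constants quoted in this section (Vmax, S0.5 and the Hill coefficient) are related through the standard Hill equation, v = Vmax · S^n / (S0.5^n + S^n). The sketch below is an illustration only: it assumes this standard form and does not reproduce the authors' fitting procedure. The function name is hypothetical, and the parameter values are the PEP kinetics reported in the text for Pyk1 without and with 1 mM AMP.

```python
def hill_rate(s, vmax, s_half, n):
    """Reaction rate (U/mg) at substrate concentration s (mM), Hill form."""
    if s < 0:
        raise ValueError("substrate concentration must be non-negative")
    return vmax * s**n / (s_half**n + s**n)


# Pyk1 PEP kinetics as reported in the text (Vmax in U/mg, S0.5 in mM).
pyk1_no_amp = {"vmax": 14.05, "s_half": 3.49, "n": 3.7}
pyk1_with_amp = {"vmax": 65.45, "s_half": 1.05, "n": 1.8}

# By definition, the rate at S = S0.5 is half of Vmax.
half_no_amp = hill_rate(pyk1_no_amp["s_half"], **pyk1_no_amp)
half_with_amp = hill_rate(pyk1_with_amp["s_half"], **pyk1_with_amp)

# Compare both conditions at an arbitrary, illustrative PEP concentration.
pep_mM = 1.0
print(hill_rate(pep_mM, **pyk1_no_amp), hill_rate(pep_mM, **pyk1_with_amp))
```

Under this form, a lower S0.5 shifts the curve towards lower substrate concentrations, while a lower Hill coefficient makes the response less switch-like.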
There were profound differences in the PEP kinetics for both PKs, with both isoenzymes demonstrating Hill-type cooperative binding kinetics with AMP. In the presence of 1 mM AMP, Vmax of Pyk1 increased five-fold (14.05 U/mg to 65.45 U/mg), S0.5 decreased more than three-fold (3.49 to 1.05 mM) and the Hill coefficient was approximately halved (from 3.7 to 1.8). For Pyk2 in the presence of 1 mM AMP, Vmax was 9.1 U/mg compared to 0.5 U/mg without AMP, with S0.5 increased from 1.3 mM to 8.6 mM. Under these conditions the Hill coefficient increased from 1.45 to 7.1 (Table 2). Further analysis demonstrated that Pyk1 has a much higher affinity for AMP (S0.5 = 0.01 mM) compared to Pyk2 (S0.5 = 3.8 mM), with a concomitant increase in Vmax (8.2 U/mg for Pyk1 compared to 1 U/mg for Pyk2; Fig. S3C). The turnover rate constant (kcat) for Pyk1 was >20-fold greater (4703 s⁻¹) than that of Pyk2 (215 s⁻¹; Table 2). Interestingly, Pyk1 was also shown to be highly stimulated by ribose-5-phosphate (Fig. S3). Intriguingly, it is known that flux through the pentose phosphate pathway increases during entry into stationary phase in streptomycetes [25], suggesting that Pyk1 activity is stimulated during periods of starvation and during antibiotic production to rebalance reduced glycolytic flux and entry of substrates into the TCA cycle.

[...] metabolites. We also demonstrate that the duplication of the primary metabolic enzyme, PK, promotes metabolic robustness and influences the production of specialised metabolites. Understanding the evolution of central metabolism in conjunction with specialised metabolism can contribute to our fundamental understanding of the ability of Actinobacteria to produce a plethora of useful molecules and can help inform novel approaches to metabolic engineering.

Database generation and bioinformatics analysis
The NCBI database (http://www.ncbi.nlm.nih.gov/genbank/wgs) was the source of actinobacterial genomes having a minimum coverage of 25x and fewer than 30 contigs per Mbp. To ensure a wide phylogenetic range, a selection of 614 species from 80 genera was included. Each genome was re-annotated using RAST [19] [...]

[...] and Δpyk2) were constructed using PCR-targeted gene replacement with an apramycin resistance cassette (aac(3)IV) using the Redirect system [34] and the primers reported in Table S5. S. coelicolor transposon insertion mutants (pyk1::Tn5062 and pyk2::Tn5062) were constructed using Tn5062 mutagenised cosmids as described in Fernández-Martínez et al. [35]. Each cosmid was first verified by restriction analysis before being conjugated into Streptomyces. All strains were verified by PCR and sequencing of the respective products.

[...] LB or M9 medium with 1% (w/v) glucose or 0.4% (w/v) sodium acetate as carbon source. Flasks were inoculated from an overnight culture (1% v/v) including the appropriate antibiotics and 1 mM IPTG to induce expression of the PKs. Growth was followed at OD600 at 37 °C with shaking at 250 rpm. The specific growth rate was determined from the semi-logarithmic plot of biomass concentration.
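Determining a specific growth rate from a semi-logarithmic plot amounts to taking the slope of ln(biomass) against time during exponential growth. The sketch below shows one common way to do this, an ordinary least-squares fit; it is an assumption that a fit of this kind was used, since the text does not state how the slope was obtained. The function name and the OD600 readings are hypothetical, and the points are assumed to lie within the exponential phase.

```python
import math


def specific_growth_rate(times_h, od600):
    """Least-squares slope of ln(OD600) versus time, in h^-1."""
    if len(times_h) != len(od600) or len(times_h) < 2:
        raise ValueError("need at least two paired time/OD600 readings")
    logs = [math.log(od) for od in od600]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    return sxy / sxx


# Hypothetical readings generated from ideal exponential growth at 0.6 h^-1.
times_h = [0.0, 1.0, 2.0, 3.0, 4.0]
od600 = [0.05 * math.exp(0.6 * t) for t in times_h]

mu = specific_growth_rate(times_h, od600)
doubling_time_h = math.log(2) / mu
print(f"mu = {mu:.3f} h^-1, doubling time = {doubling_time_h:.2f} h")
```

The doubling time follows directly from the growth rate as ln(2)/μ, which is convenient when comparing complemented strains with the parental strain.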

Protein overexpression and purification
The coding sequence of pyk1 was codon optimised for E. coli and amplified from the vector pEX-K4 using the primers in Table 2. The native coding sequence of pyk2 was amplified using the primers in Table 2. Both coding sequences were cloned into the pET100 TOPO vector (Invitrogen) according to the manufacturer's instructions.