A metabolic and physiological design study of Pseudomonas putida KT2440 capable of anaerobic respiration

Pseudomonas putida KT2440 is a metabolically versatile, HV1-certified, genetically accessible, and thus interesting microbial chassis for biotechnological applications. However, its obligate aerobic nature hampers production of oxygen sensitive products and drives up costs in large scale fermentation. The inability to perform anaerobic fermentation has been attributed to insufficient ATP production and an inability to produce pyrimidines under these conditions. Addressing these bottlenecks enabled growth under micro-oxic conditions but does not lead to growth or survival under anoxic conditions. Here, a data-driven approach was used to develop a rational design for a P. putida KT2440 derivative strain capable of anaerobic respiration. To come to the design, data derived from a genome comparison of 1628 Pseudomonas strains was combined with genome-scale metabolic modelling simulations and a transcriptome dataset of 47 samples representing 14 environmental conditions from the facultative anaerobe Pseudomonas aeruginosa. The results indicate that the implementation of anaerobic respiration in P. putida KT2440 would require at least 49 additional genes of known function, at least 8 genes encoding proteins of unknown function, and 3 externally added vitamins.

As there is a relatively short evolutionary distance between the strict aerobic P. putida KT2440 and facultative anaerobic Pseudomonas species compared to other anaerobic bacteria [21], it could be reasoned that a minimal set of adaptations would be required to change an aerobic Pseudomonas species into an facultative anaerobic one. Through the implementation of a rational engineering cycle, this strain could be adapted to a facultative anaerobic lifestyle. In an attempt to obtain a P. putida KT2440 derived strain capable of anaerobic fermentation a Design, Build, Test, Learn-engineering cycle [22] was performed in earlier work [23] to obtain an P. putida KT2440 strain capable of anaerobic fermentation. Using genome metabolic models (GSMs) iJP962 and iJP746 combined with a protein domain comparison (PDC) between six aerobic Pseudomonas putida strains including KT2440 and six facultative anaerobic Pseudomonas strains, three key enzymes were selected and included in the final design: acetate kinase (encoded by ackA), dihydroorotate dehydrogenase (pyrK-pyrD B) and ribonucleotide triphosphate reductase class III (nrdD-nrdG). This design was built, and the resulting recombinant strain showed growth under micro-oxic conditions [23]. Earlier work already described an increase in survival rates upon introduction of solely acetate kinase [4,14], and since the model predictions used in the design only considered full anoxic conditions, survival rates of the recombinant strains under anoxic conditions needed to be tested.
Here, we (i) determined the survival rates of the previously constructed recombinant strains under anoxic conditions, (ii) identified limitations for anaerobic growth through respiration, and (iii) composed a new design for a recombinant P. putida KT2440 capable of anaerobic respiration. In pursuit of this goal we expanded upon earlier work using the current wealth of genome data available on P. putida and other Pseudomonas species by inclusion of 1628 strains in an extensive comparison of the protein domain content [24]. Random forest, a machine learning method, was used to identify key protein domains associated with "anaerobic growth". Transcriptome data of the Pseudomonas aeruginosa type strain PA14 cultures grown in 14 different conditions [25] were also considered and integrated with previous and newly obtained GSM simulation results to compose a final design.

Anoxic survival experiment
Oxygen gradients served to allow the recombinant strains to grow in micro-oxic conditions as described in [23]. Anoxic cultivation of P. putida KT2440 recombinants unpassed or passed over oxygen gradients was performed at 30°C in 50 ml glass 20 mm aluminium crimp cap vials with rubber stoppers (Glasgerätebau Ochs Laborfachhandel e.K.) in 30 ml DeBont with 20 g/l gluconic acid, 1 mg/l resazurin and 50 μg/ml kanamycin as selection marker for recombinant strains. Where indicated, a 1000x diluted vitamin mix was added (0.02 g/l biotin, 0.2 g/l nicotinamide, 0.1 g/l p-aminobenzoic acid, 0.2 g/l thiamin, 0.1 g/l pantothenic acid, 0.5 g/l pyridoxamine, g/l cyanocobalamin, 0.1 g/l riboflavin). Before inoculation, the vials were gas exchanged with CO2/N2. Inoculation was done with aerobically pre-cultured bacterial sample at an OD600 of 0.05. Approx. 8 h after inoculation, the resazurin became completely colourless, indicating obtainment of anaerobic conditions. Samples were taken using sterile CO2 flushed 1.5″ needles (BD Microlance) and 3-5 ml syringes (ThermoFisher) to avoid O2 exposure. Anoxic conditions were ensured as the resazurin turned from colourless to bright pink within seconds in extracted samples. Survival rates were analysed by colony forming units (CFU) determination. A dilution series was made and five drops of 10 μl per dilution were applied onto LB-agar plates without selection marker, which were incubated o/n at 30°C. Colonies were counted manually, and photos were taken of the plates. Gram-staining was performed as an intermediate check of culture purity, according to manufacturers' instructions (Gram-staining kit Machery-Nagel, Germany), while genome sequencing was applied to ensure culture purity at the start and end of every experiment.

Statistical analysis
Experiments were independently repeated six times with biological triplicates in each separate experiment. Figures represent the mean values of corresponding biological triplicates and the standard deviation. The level of significance of the differences when comparing results was evaluated by means of analysis of variance (ANOVA), with α = 0.05.

Genome annotation
Information on the oxygen requirements of 16,989 Pseudomonas strains was obtained from the Gold database [27]. Per species, extensive literature research was performed to validate their aerobicity (Data S5). One thousand six hundred twenty-eight Genomes of facultative anaerobic and strict anaerobic strains from the Pseudomonas genus were obtained from the European Nucleotide Archive repository in March 2015 [28]. All genomes were de-novo annotated in SAPP [29] using Prodigal for gene prediction (version 2.6) [30], 2010] and InterProScan version 5.4-47.0 [31] for functional annotation using Pfam [32].

Comparisons of protein domain content
The positions (start and end on the protein sequence) of the protein domains and their order in a protein when multiple domains were present, were used to identify domain architecture (i.e. combinations of protein domains). Protein domain architectures were labelled by the ordered list of Pfam identifiers as described in [33]. Protein domain architectures identified in each genome sequence were stored in a matrix. From this a binarized domain architecture presence-absence matrix was extracted and used as input for principal component analysis using the standard R-package prcomp and hierarchical clustering using the standard R-package hclust.

Gene persistence
The persistence of a gene in a taxonomic group or group of genomes can be defined as where N (orth) is the number of genomes carrying a given ortholog and N is the number of genomes considered [24]. For the set of 1628 considered genomes. Orthologous genes were identified through identity of protein domain architectures considering copy number. Resulting protein domain contents were analysed through protein domain comparison (PDC).

Feature selection using random forest
The random forest classification algorithm was used to classify the genome sequences in aerobic and facultative anaerobic species with the goal to identify the domains (features) responsible for the separation in these two groups (feature selection). Three hundred randomly selected genomes from aerobic and anaerobic Pseudomonas species were selected to train random forest models. The process was repeated one hundred times. The resulting 100 different models were used to weigh 5831 protein domains from both aerobic and anaerobic Pseudomonas species. Variable selection was used to identify the most influential domains for classification in aerobic and facultative anaerobic strains, yielding 100 Gini coefficients, representing the importance of a protein domain for separation per protein domain. Gini coefficients were combined into the cumulative Gini coefficient. The resulting protein domains were separated into aerobic/anaerobic specific protein domains before further analysis.

Transcriptome data analysis
A publicly available P. aeruginosa transcriptome data set was retrieved from GEO database (accession number GSE55197) [25]. This dataset contains 47 samples corresponding to 14 environmental conditions, including changes in growth temperature, growth stage, osmolarity, concentration of ions in the media, and surface attachment and anaerobic respiration. For every gene, the log2 fold change of its expression values was calculated in comparing every possible condition with anaerobic respiration. Missing or infinity values arising from genes with very low counts in some condition(s) were imputed to 0 or ± 4, according to the significance of the differential expression (False discovery rate, fdr < 0.05). Normalization, fold change computations and differential expression analysis were performed using the R package DESeq [34].

Genome-scale metabolic models
In this study we used the P. putida genome-scale metabolic models (GSMs) iJP962, iJN746 and iJN1411 [3,5,35]. iJN1411 was obtained directly from the authors [35]. GSM simulations were performed as described in [23], with uptake rates of up to 1000 mmol gdw − 1 h − 1 (gdw: grams dry weight) of copper, cobalt, iron, protons, water, sodium, nickel, ammonia, phosphate, sulphate, and nitrate, a maximal glucose uptake rate of 6.14 mmol gdw − 1 h − 1 , based on experimentally measured uptake rates [36]. Thus, the in silico medium composition mimics the De Bont minimal medium used for the in vivo experiments.

Insertion of acetate kinase in P. putida KT2440
Previous designs to obtain P. putida strains surviving anoxic conditions were conceptually based on the hypothesis that survival in anoxic conditions was prevented by a lack of energy conservation and redox balancing [4,[13][14][15][16]. Expression of the acetate kinase gene from P. aeruginosa and E. coli was reported to result in an extended survival under anoxic conditions [4,14]. Expression of the acetate kinase gene (ackA) from E. coli combined with class I dihydroorotate hydrogenase (pyrK-pyrD B) and class III ribonucleotide triphosphate reductase (nrdD-nrdG) from L. lactis successfully led to growth under micro-oxic conditions [23].
To determine the tolerance to anoxic conditions of a P. putida KT2440 recombinant strain enriched with ackA, pyrK-pyrD B and nrdD-nrdG and of a negative control strain carrying an empty plasmid to anoxic conditions and to analyse the effect of an adaptation over oxygen gradients as performed earlier [23], an 18-day anoxic survival experiment was performed. After inoculation at a standardized cell density under oxic conditions, cultures were incubated overnight in capped gasexchanged vials in oxygen-depleted medium (see Materials and Methods). To optimise growth gluconic acid was used as the main carbon source. In a previous design, gluconic acid was shown to offer better results compared to a series of other carbon sources [23]. The survival rate was determined by performing colony forming unit (CFU) counts at set time points over a period of 18 days, with T0 being the start of the experiment in anoxic conditions (supplementary Figures S1, S2, S3, S4, S5). The results showed that in anoxic conditions there is no significant difference in survival rates between the negative control and any of the recombinant strains tested (ANOVA α = 0.05). Under these conditions, only the positive control, E. coli BW25113 harbouring an empty plasmid, survived.
Design requirements for a P. putida KT2440 derivative strain capable of anaerobic respiration The failure of the previous, fermentative, design [23] to grow under anoxic conditions could be explained by the heavy reliance on the two state of art genome-scale models (GSMs) used in this design, which currently do not include an accurate representation of the complete redox balance and its intricate involvement in the metabolism. Additionally, while the protein domain comparison performed in the previous study showed apparent differences between aerobic and anaerobic strains in availability of protein domains, this analysis was performed on a limited set of strains.
Many facultative anaerobic Pseudomonas species are incapable of anaerobic fermentation, but rather perform anaerobic respiration. The close phylogenetic distances between some of these facultative anaerobic Pseudomonas species and P. putida KT2440 may suggest that acquiring a facultative anaerobic lifestyle via anaerobic respiration would require less genetic changes. To come to a rational design of P. putida KT2440 capable of anaerobic respiration, the previous methods were thus expanded upon by (i) using significantly more facultative anaerobic and aerobic Pseudomonas strains for domain analysis, (ii) inclusion of iJN1411, the latest metabolic reconstruction of P. putida KT2440 [35], and (iii) incorporation of an elaborate transcriptome analysis of anaerobic respiration of P. aeruginosa strains grown under anoxic conditions in comparison with 13 aerobic growth conditions [25]. Inclusion of such transcriptome data would show gene regulation due to growth under anoxic conditions, improving the design as it complements genome-based methods.
For protein domain comparisons, the Pfam domain content of P. putida KT2440 was compared with 1627 other Pseudomonas strains with fully sequenced genomes. For each strain, a literature search was performed to determine oxygen requirements, yielding 344 obligate aerobic strains including KT2440 and 1284 facultative anaerobic strains. Strain specific differences in protein domain content were visualised using principal component analysis (PCA), and hierarchical clustering using domain presence/absence as input (Fig. 1). Both the PCA and the hierarchical clustering show a separation between several facultative anaerobic strains and the rest of the considered strains (among which P. putida KT2440). However, it should be noted that only a small fraction of the total variance is explained by the first two principal components. This separation is also apparent in the dendrogram, suggesting that significant differences could be found in protein domain content.
We assumed that domains essential for anaerobic respiration are highly persistent in facultative anaerobic strains but show a lower persistence in obligate aerobic strains. The strategy to obtain this protein domain core is outlined in Fig. 2. A "long list" of anaerobic protein domains was generated by comparing domain persistence between aerobic versus anaerobic strains. First a 95% persistence threshold was applied, to obtain a "domain core" of domains present in at least 95% of the genomes of "aerobic" strains and in the "anaerobic" strains analysed. These aerobic and anaerobic domain cores were used as input for subsequent comparative analysis and for the first list were split into "shared between aerobic and anaerobic species" (Shared domain core), "specific for aerobic species" (Aerobe specific domain core) and "specific for anaerobic species" (Anaerobe specific domain core) creating a long list of 427 anaerobe specific protein domains. A second long list was created by  Overview of in silico approaches to identify limitations to anaerobic respiration in P. putida. a Comparative genomics workflow. Genomes of the P. putida group and the anaerobic Pseudomonas group were systematically annotated using SAPP [24,29], the protein domains were extracted, and both all domains or only the domains common to all anaerobic Pseudomonas species (the core domains) were selected using a 95% persistence threshold. Analysis was performed on the whole set of genomes (left) or a genome cluster of closely related strains (right). Each of these methods resulted in a list of protein domains related to an aerobic lifestyle (purple) or an anaerobic lifestyle (light green). b Transcriptome analysis. c GSM simulations. GSM iJP962 [5] and iJN1411 [35] were expanded with indicated reaction sets and tested for anaerobic growth under anaerobic conditions. Colours indicate final implementation in the design (green). Model and genome base predictions were combined to obtain a final design the same input but searching for the reverse, a separation based on domains with a very low persistency in aerobic or anaerobic strains. For this a no more than 1% threshold was applied creating a long list of 167 anaerobe specific protein domains.
The dendrogram presented in Fig. 1 indicated a possible early branch split between a large group of exclusively anaerobic Pseudomonas strains and a mixed group, including P. putida KT2440, containing 138 facultative anaerobic and 87 obligatory aerobic Pseudomonas strains ( Fig. 1 panel C). Using this split, two "restricted" lists were built by comparing domain persistence as outlined above, but now evaluating only Pseudomonas strains present in the mixed branch. For the restricted lists, a 90% persistence threshold and a 1% persistence threshold were used, creating two anaerobic species-specific protein domains lists of 170 and 248 domains, respectively. The four different lists of protein domains essential for anaerobic growth were compared and manually further annotated. Results are summarized in Table 1 and Fig. 2.
As outlined in the Materials and Methods section, the domain content of the facultative anaerobic and the obligatory aerobic Pseudomonas strains were used to train a random forest classifier with the goal to identify those domains (features) that are mostly responsible for classification. Gini coefficients and cumulative Gini coefficients for each domain are provided in Data S9. From the 5831 domains that were used as input for the classifier, 5 have a cumulative Gini coefficient ≥ 100, as summarized in Table 1. Gini scores were added as weight to the four protein domain lists derived above.
Transcriptome data obtained from P. aeruginosa PO14 grown under 14 different environmental conditions including anoxic conditions [25] was re-analysed for genes that were consistently differentially expressed during anaerobic respiration (see the Materials and Methods section for details). By calculating for every gene, the log2fold change of its expression values in every possible condition compared with anaerobic respiration, 175 protein domains were identified. A heatmap was used to visualise up-and down-regulated genes under anoxic conditions. Regulation due to anoxic growth was considered to be significant when the same behaviour (up-or down-regulation) was observed in at least 7 of the 13 pair-wise comparisons and a fold change of at least 4 was observed in at least three of these comparisons. Protein domain architectures corresponding to the selected locus tags were identified. Based on the differential expression and similar efforts in literature [13], 22 genes encompassing 35 protein domains were selected.
Genome-scale models were used to simulate anoxic conditions. The absence of any reaction products impeding growth due to the simulated lack of oxygen were pinpointed and traced back to proteins and their encoding genes that either need oxygen as a substrate or that cannot be made without oxygen present, resulting in substrates that could thus not be produced under anoxic conditions. Genes and substrates were verified through literature analysis to be essential for growth (Table 1).

Design considerations
A comparison was made between the different lists obtained (Table 1) and previous efforts [4,13,14,23] resulting in an extensive overview of the many hurdles to overcome to build a P. putida KT2440 strain capable of anaerobic respiration. The various lists were compared by evaluating the function of each gene starting with the encoded domain annotation, checking for domain co-existence in operonic structures, comparing metabolic functions with GSM data and with gene regulation data. The importance of each protein domain was determined using the random forest analysis (Data S9) as input. In this way, different lists could be combined and reduced to a list of 57 genes. Furthermore, a supplement of 3 vitamins is required.
The selected genes can be separated into various categories based on their functions: Nitrogen metabolism (45 domains in 35 genes), Hydrogenases (9 domains in 9 genes), Cytochrome C (1 domains in 1 genes), Pyrimidine and amino acid biosynthesis (6 domains in 3 genes if 3 vitamins added), ATP production (1 domains in 1 genes), and Domains of Unknown Function (indirectly associated with anaerobic respiration) (8 domains).

Nitrogen metabolism
Most Pseudomonas species capable of anaerobic respiration do so using nitrite or nitrate as alternative terminal electron acceptor. Of the 49 known genes found vital for anaerobic respiration, 35 are either directly or indirectly involved in nitrogen metabolism. With nitrate as the final electron acceptor in anaerobic respiration the largest amount of energy can be conserved when compared to other final electron acceptors such as sulphate, iron (III), manganese (II), or selenate [37]. P. putida KT2440 lacks the nitrate/nitrite respiration pathway, which was resolved in earlier studies by inserting either a Nir-Nar or a Nor plasmid [13]. This resulted in extended survival under anoxic conditions, but not to growth. Our transcriptomics and protein domain analysis indicated that the combination of the operons of both the Nir-Nar and the Nor operon are required ( Table 2). The operons include genes of the Nif, RhfH, Nqr, Rnf, Dau, Nar, Nir and Nor, and Moa protein families, and are required for energy conservation, cofactor biosynthesis, amino acid biosynthesis, nitrogen metabolism, nitrate-, nitrite-and nitrogen transporters, nitrate-, nitrite-, nitric oxide and nitrous oxide reductases and several regulatory proteins ( Table 2 and Additional file 18). Of the 45 protein domains or 35 genes we identified within this category, only 15 genes had been previously found (narK1, narK2, narG, narH, narJ, narI narX, narL, nirF, nirQ, nirM, nirS, nirJ, nirL within the Nir-Nar operon, norC, norB, norD, nosR within the Nor operon) [13]. Ureohydrolases such as Arg1, SpeB, HutG and Pah facilitate the ammonia to urea conversion, with urea as the principle product of nitrogen excretion.

Hydrogenases
Included in the anaerobic respiration design are 9 hydrogenase subunits: HupH, HypA-F, HyaE, and HybE. Hydrogenases catalyse the reversible oxidation of molecular hydrogen, fulfilling a regulatory role in balancing the redox state. The redox state of the cell and the availability of O2 are regulatory signals in facultative anaerobic species [38].
[FeFe] And [NiFe]-hydrogenases are widely distributed under anaerobic species. These hydrogenases are only produced under anoxic conditions, and most [NiFe]-hydrogenases are inactivated by oxygen, only to be re-activated under reducing conditions [39].
P. putida KT2440 lack hydrogenases necessary for proton reduction or coupling H 2 oxidation to energy yielding processes under anoxic conditions, and the necessary hydrogenase chaperones, assembly, maturation and formation proteins ( Table 3).
None of these genes have been recognised in previous research for their importance in anaerobic respiration.

Cytochrome C
Included in the anaerobic respiration design are 3 Ctype cytochromes. C-type cytochromes account for a vital step in ATP bio-generation via the proton motive force (Table 4). Anaerobically, cytochrome C 551 (NirN), C 552 transfer electrons to nitrite reductase (NirS) and nitric-oxide reductase (NorB, NorC). The importance of NirN and NirC (the precursor of NirN) was demonstrated in [13] ( Table 2).
The PDC also indicates the need for cytochrome C 552 (Tables 2, 4). The enzyme cytochrome C nitrite reductase (C 552), amongst other important functions, catalyses the six-electron reduction of nitrite to nitrogen as one of the key steps in denitrification. Nitrogen is then reduced to ammonium in the ammonification pathway. C552 thus participates in the anaerobic energy metabolism of dissimilatory nitrate ammonification.
In addition, cobalamin-independent methionine synthase is important. This methionine synthase is responsible for precursor formation of C 551 that can be produced without using vitamin B12 (see Pyrimidine and amino acid biosynthesis, Table 5). This might be a key component for anaerobic growth, since both the protein domain analysis and the GSM iJN1411 [35] predict that, amongst other vitamins, the active form of vitamin B12 can only be bio-generated in the presence of oxygen in P. putida KT2440.

Pyrimidine and amino acid biosynthesis
Included in the anaerobic respiration design are 2 genes involved in pyrimidine and amino acid synthesis, and additional bottlenecks that can be solved by adding 3 vitamins to the medium. Earlier GSM simulations with iJP962 indicated that alternate genes must be inserted for dihydroorotate dehydrogenase and ribonucleotide triphosphate reductase type II for pyrimidine and ultimately DNA and RNA biosynthesis [23]. Both the protein domain analysis and GSM simulations using the iJN1411 metabolic model predicted that cobalamin (vitamin B12), pyridoxal-5-phosphate (vitamin B6) and menaquinone (vitamin K2) cannot be produced under anoxic conditions.    [40]. Cobalamin is a complex essential cofactor for many enzymes mediating methylation, reduction, and intramolecular rearrangements, and for methionine synthase. There is a recognised distinction between aerobic and anaerobic generation of cobalamin [41,42]. The routes differ in terms of cobalt chelation (via CobNST complex in the aerobic pathway, via precorrin-2 with CbiK in the anaerobic pathway) and oxygen requirements. The enzymes CobI, CobG, CobJ, CobM, CobF, CobK, CobL, CobH, CobB and CobNST form the aerobic pathway. CbiK, CbiL, CbiH, CbiF, CbiG, CbiD, CbiJ, CbiET, CbiC and CbiA form the anaerobic route [31,41,43]. Surprisingly, the protein domain comparison yielded none of the enzymes of the anaerobic pathway for vitamin B12 synthesis, but instead CobT and CbtB, both described as important for the aerobic pathway [41]. According to the extensive analysis, these specific protein domains linked to these genes are not present in aerobic species analysed but only in anaerobic species. It was found that in the anaerobic bacterium Eubacterium limosum, CobT functions as an activator for a range of lower ligand substrates including DMB, determining cobamide diversity. The specific function of CbtB is unknown [41,42].
Vitamin B6 is required for a wide variety of processes [44]. There are many vitamin B6-dependent proteins involved in amino acid biosynthesis, amino acid catabolism, antibacterial functions, iron metabolism, carbon metabolism, nucleotide utilization, cofactors for biotin, folate and heme, NAD biosynthesis, cell wall metabolism, tRNA modification, regulation of gene expression and biofilm formation.
Vitamin K2 is responsible for electron transport during anaerobic respiration. However, knock-out experiments in E. coli showed that upon loss of menaquinone and vitamin K1 only 3% of theoretical yield was obtained, but this was instantly revived to 44% upon supplementing of vitamin K1 or vitamin K2 [45], indicating vitamin K1 can partially make up for the loss of vitamin K2.
Rather than inserting all missing genes, in a minimal design setup, these vitamins can be supplemented to the medium (indicated in Table 5 with * ). To determine any immediate effect on growth or survival rates, medium supplementation of these vitamins was tested, monitoring performance of all recombinant strains under anoxic conditions. In parallel a survival experiment without the  vitamin mix was done. No difference in growth or survival rates was found ( Figure S4, Figure S5, Data S10, Data S11). Lastly within this category, transcriptional regulator DNR was found, its importance in anaerobic respiration described in earlier research [13].

ATP generation
Of the 49 genes of known function required for anaerobic respiration, only one is involved in ATP generation. The protein domain analysis, transcriptomics data and metabolic modelling with iJP962 and iJN1411 all indicated that ATP production remains one of the main bottlenecks to tackle. Earlier work came to the same conclusion and this was tackled by insertion of genes for acetate production [4,14]. The recombinant strain best performing in previous micro-oxic research included acetate kinase (ackA) for ATP generation through acetate production, which was therefore included in the previous design [23] (Table 6). Pseudomonas putida KT2440 has its own functional phosphate acetyltransferase (pta).

Domains of unknown function
The protein domain analysis additionally included 270 unique protein domains of unknown function occurring in the genomes of facultative anaerobic strains but not in aerobic strains. Based on contextual information, eight were identified as important for anaerobic respiration by their physical co-localisation with genes required for anaerobic respiration. These were therefore included in the design (Table 7). Similarly, 28 protein domains of unknown function were associated with virology factors or immunity and on these ground were excluded from the design. The remaining 234 protein domains of unknown function provided no direct contextual hints and thus cannot not be conclusively excluded from the design.

No extended survival under anoxic conditions after acetate kinase expression
Our previous rational design [23] was based on two genome-scale models and genome domain comparison analysis of six facultative anaerobic Pseudomonas species compared to six obligatory aerobic Pseudomonas putida species. Under micro-oxic conditions, the addition of acetate kinase, dihydroorotate dehydrogenase and class II ribonucleotide triphosphate reductase leads to growth.
In our hands there was however no extended survival under anoxic conditions of the recombinant strains upon introduction of ackA. It is extremely challenging to acquire anoxic conditions. Both the medium and the headspace must be treated to completely remove oxygen from the start of the experiment, otherwise oxygen depletion takes up to 12 h. Further, the medium must be prepared with L-cysteine or sodium thioglycollate to actively remove oxygen. Without these precautions, the medium is very easily oxygenated. Small stopper-capped vials are preferred strongly over screw-cap vials, in which oxygen leaks frequently occurred [23]. Addition of the redox-sensitive phenoxazine dye resazurin functions to detect aerobic respiration. Resazurin is generally used as an oxygen indicator as the colour changes from dark purple (high oxygen levels) to pink (low oxygen levels) to transparent (below detectable oxygen levels, as determined by micro-electrode at 0.01 g/l dissolved oxygen [23]). However, resazurin cannot be applied to distinguish between micro-oxic and anoxic conditions.
The lack of improvement in survival rates under anoxic conditions can easily be explained when contemplating the novel design assembled in this research, as numerous essential factors such as an alternative electron acceptor or an anaerobically active cytochrome-C were missing.

Technical design issues
The aim of this research was to determine the requirements for anaerobic respiration, using the industrially interesting workhorse P. putida KT2440 as a concept organism. This fundamental question resulted in a design that required 49 genes of known function and 8 genes encoding protein domains of unknown function, resulting in almost 60 genes to be included in the genome. For solely industrial applications, it might be more beneficial to opt for a facultative anaerobic strain from the start. However, this design does offer a fundamental insight in the elaborate change in genotype to achieve an anaerobic lifestyle.
To enable an anaerobic lifestyle, previous designs included the introduction of between 3 and 24 genes in P. putida KT2440 genome [4,13,14,23] but our in silico methods suggests that approximately three times more genes are required. Novel methods developed specifically for integration of large operons or multiple genes like yTREX [46] allow incorporation of up to 14 genes at the time in P. putida. Albeit an elaborate effort, we consider the inclusion of almost 60 genes in the P. putida KT2440 genome to be technically possible, especially since there are constantly new developments in the field of genome engineering.
The GSMs indicated that to produce vitamin B6 and vitamin K2, oxygen is indirectly used. Although anaerobic alternative routes could be implemented, supplementation of these vitamins in the medium significantly reduces the design. Furthermore, while the design is based on gene presence or absence in anaerobic strains of the same species, the design could be slimmed down by a futher elimination of proposed genes based on more elaborate analyses of their gene activity in facultative anaerobic species when compared to aerobic species. This would for instance include a selection of nitrogen sources the strain would grow on and would serve to eliminate all nitrogen fixation genes up to that point.
The 57 genes in our design do not consider the 234 genes of unknown function, which complicate the task even further. Without knowing their function, these genes cannot be excluded from the design. At least eight of these were found to be closely associated with genes required for growth in anoxic conditions in literature and/or through their physical co-localisation [31]. The crucial roles that genes of unknown function might play was demonstrated by Hutchison and colleagues [47], who in their attempt to make a minimal bacterial genome, unexpectedly found 149 genes of unknown function to be essential for growth.
Many of the genes found in the design are associated with metal chelation and transport, including many hydrogenases and genes required for vitamin biosynthesis. It should be considered that changes in oxygen availability drastically alters metal bioavailability as extensively reviewed in [48]. Interestingly, we found multiple strict aerobic Pseudomonas strains which upon literature research proved able of nitrogen fixation, but not of anaerobic respiration. This might suggest that the evolutionary road between the two first includes nitrogen fixation (or excludes ammonification) before including ammonification (or excluding nitrogen fixation). Either way, experimental validation in anaerobic strains is required to proof which genes required for ammonification or and nitrogen fixation are essential for anaerobic respiration.

The new design compared to previous designs
We predicted that for anaerobic growth both the Nir-Nar and Nor operons are vital. There do exist Pseudomonas species that naturally have only one of these operons and are capable of nitrate to nitrite transformation. However, these strains respire nitrate under oxic conditions only, and have been shown to be incapable of growth in anoxic conditions [49,50]. If P. putida KT2440 would be enriched with both the denitrification pathway and the ammonification pathway it could reduce nitrate or nitrite to ammonium, which could then be assimilated to organic compounds, transforming P. putida KT2440 in a diazotroph of agronomic importance [51].
The most prevalent anaerobic dissimilatory nitrate respiration regulator DNR is a key transcription factor  obtained from the protein domain comparison. In the facultative anaerobic E. coli, knock-out FNR mutants, an ortholog of DNR, were unable to grow by anaerobic respiration under anoxic conditions. By DNA microarray technology it was shown that in E. coli 49% of the genes that differ in expression between anoxic and oxic conditions are regulated by FNR [38]. The two-component aerobic respiratory control system (ArcA and ArcB) controls gene transcription in E. coli under anoxic conditions. Mutations in this system are known to affect expression of over 30 operons. Most of these are repressed under anoxic conditions, but cytochrome C oxidase and pyruvate formate lyase are activated. In E. coli, ArcAB and FNR are deemed essential for anaerobic activation and robustness under micro-oxic conditions [52][53][54][55]. To maintain after incorporating the ability of anaerobic respiration, optimal functionality of this strain under oxic conditions, these genes are included in the final design. We argue that for a lifestyle shift from a strict aerobic lifestyle in P. putida KT2440 to an faculatitve anaerobic respiratory lifestyle, all 49 known genes, at least 8 protein domains of unknown function and 3 added vitamins are required. However, increased strain performance under micro-oxic conditions or prolonged survival rates under anoxic conditions already could significantly improve strain robustness in large scale bioreactors with fluctuating oxygen levels. For enhanced performance under micro-oxic conditions, it was demonstrated that increasing ATP production through acetate production already appears to be enough [23]. For prolonged survival rates, however, these key elements include both Nir-Nar and Nor operons for denitrification and ammonification, cytochrome C 552, and external supplementation of the lacking vitamins. This conclusion is supported by previous findings that energy supply and redox balancing are the main bottlenecks in an anaerobic lifestyle [4,[13][14][15][16]23].

Conclusion
Increased ATP generation by insertion of acetate kinase via a plasmid does not lead to prolonged survival rates of Pseudomonas putida KT2440 under anoxic conditions. This proves that increased performance under micro-oxic conditions does not guarantee prolonged survival under anoxic conditions. A P. putida KT2440 strain capable of anaerobic respiration would require the insertion of at least 57 additional genes into the genome and a supplement of 3 vitamins to the medium. The conversion of a strict aerobic species to a facultative anaerobic lifestyle by anaerobic respiration is a much more elaborate process than was thought before. Especially the function of DUFs and their role in anaerobic respiration must be researched, as it remains unknown how many of these should be added to this design.
Additional file 1: Table S1. Bacterial strains and plasmids used in this study .xlsx file with bacterial strains and plasmids listed, including references or sources Additional file 2: Figure S1. Anoxic survival of P. putida KT2440 transformed strains, grown on De Bont minimal medium with gluconic acid as sole carbon source and kanamycin. The headspace was flushed from oxygen with nitrogen. Survival under anoxic conditions was determined by comparing the number of colony forming units (CFU) over time with the number of CFU at T0. Escherichia coli BW25113 was used as positive control and Pseudomonas putida KT2440 with an empty plasmid (pS2213 -) was used as a negative control. Tested strains were Pseudomonas putida KT2440 with acetate kinase (pS2213 ackA) unpassed (p + 0) or passed three consecutive times over oxygen gradients (p + 3), and Pseudomonas putida KT2440 with acetate kinase, dihydroororotate dehydrogenase and ribonucleotide triphosphate reductase type II (pS2213 ackA-(pyrK-pyrD B)-(nrdD-nrdG) unpassed (p + 0) or passed three consecutive times over oxygen gradients (p + 3).