MCE domain proteins: conserved inner membrane lipid-binding proteins required for outer membrane homeostasis

Bacterial proteins with MCE domains were first described as being important for Mammalian Cell Entry. More recent evidence suggests they are components of lipid ABC transporters. In Escherichia coli, the single-domain protein MlaD is known to be part of an inner membrane transporter that is important for maintenance of outer membrane lipid asymmetry. Here we describe two multi MCE domain-containing proteins in Escherichia coli, PqiB and YebT, the latter of which is an orthologue of MAM-7 that was previously reported to be an outer membrane protein. We show that all three MCE domain-containing proteins localise to the inner membrane. Bioinformatic analyses revealed that MCE domains are widely distributed across bacterial phyla but multi MCE domain-containing proteins evolved in Proteobacteria from single-domain proteins. Mutants defective in mlaD, pqiAB and yebST were shown to have distinct but partially overlapping phenotypes, but the primary functions of PqiB and YebT differ from MlaD. Complementing our previous findings that all three proteins bind phospholipids, results presented here indicate that multi-domain proteins evolved in Proteobacteria for specific functions in maintaining cell envelope homeostasis.

and maintenance of outer membrane asymmetry, yet have ultimately evolved a primary function that differs to that of MlaD.

Results
MCE protein architectures and phylogenetic prevalence. Lipid shuttling between the inner and outer membranes of diderm bacteria is a poorly understood process. One protein known to be involved in this process is the E. coli protein MlaD. MlaD contains a single MCE domain; the PFAM hidden Markov model (HMM) for the MCE domain (PF02470.15) defines an 81 amino-acid long sequence with multiple well-conserved hydrophobic residues 1 . We hypothesised that other proteins involved in lipid shuttling might contain similar MCE domains. Thus, we sought to determine the distribution of these domains amongst bacteria and to investigate the most common MCE protein architectures by scanning the PFAM HMM against the UniProtKB database using HMMER 21 9 . The second most common architecture, type II, consists of a single MCE domain followed by a DUF3407 (Cholesterol Uptake Porter) domain, and is specific to Actinobacteria. DUF3407 domains are also specific to Actinobacteria and are almost always (>99%) associated with a single MCE domain 1 . Type III and IV proteins contain three and seven MCE domains in tandem, respectively. Both types are specific to Proteobacteria; in E. coli these have been designated PqiB (type III) and YebT (type IV). Type III proteins are more prevalent than type IV, which are restricted to Deltaproteobacteria, Epsilonproteobacteria, and Gammaproteobacteria (Supplementary fig. S1). Other multi-MCE domain-containing proteins were detected in Proteobacteria, but at much lower frequencies than type III and IV proteins (Supplementary table S1, Supplementary fig. S1).
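The architecture designations used above can be expressed as a simple lookup on the ordered list of Pfam domains predicted for each protein. The following is a minimal sketch, not the pipeline used in this study: the function name `classify_architecture` and the domain labels "MCE" and "DUF3407" are illustrative assumptions, and type I is assumed here to mean a single MCE domain with no other domain, as in MlaD.

```python
def classify_architecture(domains):
    """Assign an architecture label from an ordered list of Pfam domain names.

    Assumption: proteins carrying any additional domains, or any other number
    of MCE domains, are grouped as "other" in this simplified sketch.
    """
    domains = list(domains)
    if domains == ["MCE"]:
        return "type I"       # single MCE domain (e.g. MlaD)
    if domains == ["MCE", "DUF3407"]:
        return "type II"      # single MCE domain followed by DUF3407
    if domains == ["MCE"] * 3:
        return "type III"     # three MCE domains in tandem (e.g. PqiB)
    if domains == ["MCE"] * 7:
        return "type IV"      # seven MCE domains in tandem (e.g. YebT)
    return "other"
```

For example, `classify_architecture(["MCE", "MCE", "MCE"])` returns "type III", whereas a hypothetical two-domain protein falls into "other".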
Some MCE domains were detected in proteins from eukaryotic genomes (Supplementary fig. S2). Type I proteins were found in plant phyla Chlorophyta and Streptophyta. In Arabidopsis they are involved in the trafficking of phosphatidic acid from the outer to the inner membrane of chloroplasts 18 . A small number of MCE proteins were identified in animal genomes. Manual inspection of the DNA sequences encoding these proteins revealed that all but one could be attributed to contamination with bacterial DNA, the exception being in Trichoplax adhaerens, an animal known to have an unusually large mitochondrial genome 22,23 .
Protein clustering and evolution of multi-domain proteins. To understand the evolutionary relationships between MCE proteins, protein-protein similarity networks were constructed and coloured by architecture type and phylum (Fig. 2). MCE proteins generally cluster within phyla, suggesting that little or no horizontal transmission of these genes has occurred and variants have arisen through speciation. Actinobacterial MCE proteins, whether type I or II, form a single tight cluster, perhaps suggesting functional homogeneity and purifying selective pressure in this phylum. Type I proteins from most other phyla cluster loosely, including Cyanobacteria, Bacteroidetes and a subset of Proteobacterial proteins. All plant MCE proteins cluster closely with Cyanobacteria, and the single Trichoplax protein with Proteobacteria, which is expected given the origins of chloroplasts 24 and mitochondria 25 , respectively. The position of types III and IV proteins in the large Proteobacteria-dominant cluster suggests that these proteins are a functionally divergent population that arose early in Proteobacterial evolution.

Genes encoding MCE proteins co-localise with genes encoding membrane transport proteins. Based on our previous structural data we hypothesised MCE proteins formed part of a supramolecular complex spanning the cell envelope 14 16 whilst the latter is a general NTP binding domain 26 . This operon type encodes the Mla pathway in E. coli 12 (Fig. 3B). In many of these neighbourhoods, for example in Neisseria meningitidis 27 , the outer membrane component MlaA (VacJ) is found in the same operon.
In Alphaproteobacteria, an NADH dehydrogenase subunit domain (NDUFA12) is found upstream of a type I gene instead of ABC transporter proteins.
Although it is unknown whether these genes are functionally related in bacteria, the same two domains are found together in the T. adhaerens MCE protein, suggesting that a gene fusion event might have occurred following the formation of mitochondria from Alphaproteobacteria 25 .
Like many type I proteins, but unlike type IV proteins, type III proteins are commonly associated with genes encoding DUF330 domains, highlighting a distinction between the multi-domain proteins (Fig. 3A). The majority of the type III and type IV MCE protein-encoding genes are associated with an upstream gene encoding two PqiA domains. In E. coli, the predicted operon structures were pqiA-pqiB-ymbA and yebS-yebT (Fig. 3B), which resemble the most common type III and type IV MCE neighbourhoods, respectively. In these cases ymbA encodes the DUF330 domain-containing protein, and pqiA and yebS encode proteins with two PqiA domains. Each PqiA domain is predicted to span the inner membrane four times with N- and C-termini located in the cytoplasm 28,29 . PqiA is similar to NADH dehydrogenase subunit 2 30 , an antiporter domain involved in bidirectional membrane transport that requires energy from ATP hydrolysis but does not directly bind it.
These data suggest that the mechanisms of action of type III and IV transport complexes might differ from each other and from those of type I and II proteins, an observation that is consistent with our previous structural data 14 .

MCE proteins are integral inner membrane proteins.
In agreement with our structural predictions for PqiB and YebT as envelope-spanning complexes, a recent study revealed that the bulk of PqiB and YebT are localised in the periplasm 31 . For both YebT and PqiB, the presence of a single transmembrane α-helix and the lack of a predicted signal sequence suggest that the proteins are associated with the inner membrane (Supplementary fig. S3). This would be consistent with the demonstrated localisation of the type I MCE protein, MlaD, to the inner membrane 12 . However, YebT is the orthologue of the V. parahaemolyticus multivalent adhesion molecule, MAM-7, which was reported to be located in the outer membrane 20 . Therefore, it was essential to determine the cellular locations of PqiB and YebT. To probe their localisation, inner and outer membranes of the parent strain, E. coli K-12 BW25113, and of mutants lacking pqiAB, yebST, or both pqiAB and yebST, were fractionated and separated on sucrose density gradients. Using the known outer and inner membrane markers TolC and AcrB, respectively, we demonstrated that PqiB and YebT were detected only in the inner membrane fraction; neither protein was detected in the outer membrane fraction or in the corresponding mutants (Fig. 4). Therefore, we concluded that in E. coli, all MCE proteins are inner membrane-associated proteins.

Loss of MCE proteins disturbs cell envelope homeostasis.
Given the homology between the MCE proteins, we hypothesised that type III and type IV MCE proteins would contribute to cell envelope homeostasis in a manner consistent with the type I MCE protein, MlaD. Loss of outer membrane homeostasis in an E. coli mlaD mutant is indicated by the inability of the mutant to grow in the presence of SDS-EDTA 12 .
In contrast, the E. coli pqiAB and yebST mutants were as resistant to SDS-EDTA as the parent strain (Supplementary fig. S4). In an attempt to understand more about the roles of PqiB and YebT, we compared the growth of the parent strain and isogenic pqiAB, yebST and pqiAB yebST mutants in over 1900 growth conditions using BiOLOG Phenotype Microarrays. Phenotypes were identified for five compounds: lauryl sulfobetaine (LSB), tetracycline, penimepicycline, azlocillin and clioquinol (Fig. 5A). However, in subsequent growth experiments, clear phenotypic differences were confirmed only for LSB. Growth of both the pqiAB and pqiAB yebST mutants was inhibited by 1% LSB, but the yebST mutant and the BW25113 parent strain were unaffected (Fig. 5).
A complete set of double mutants and the mlaD pqiAB yebST triple mutant were then constructed and screened for growth inhibition by 1% LSB (Fig. 5C). Like the pqiAB mutant, the single mlaD mutant was also sensitive to LSB. In addition, the mlaD pqiAB double mutant was more sensitive than either single mutant, revealing additive phenotypes for these strains. The yebST mutant was not sensitive to LSB, but this mutation further increased LSB sensitivity in the pqiAB strain, suggesting that YebST plays a minor role in LSB resistance. Consistent with this, we observed no growth at all for the triple mutant, the only strain to completely lack MCE domains.
Complementation of these mutants restored growth on LSB, demonstrating that the absence of MCE proteins was the cause of this sensitivity (Supplementary fig. S5).
These results show that all MCE proteins contribute to LSB resistance.
These data clearly show distinct roles for the E. coli MCE proteins in maintaining cell envelope homeostasis. We hypothesised that these differences may be due to differences in substrate specificity, such as variations in fatty acid chain length. To test this hypothesis we screened the mutants for growth on sulfobetaines with varying carbon chain lengths: caprylyl sulfobetaine and myristyl sulfobetaine ( However, additive phenotypes were not observed for the other double mutants. Sensitivity to detergents can often indicate loss of outer membrane integrity. To screen for such outer membrane defects, strains lacking one or a combination of MCE domain proteins were assayed for sensitivity to vancomycin, which does not typically cross the outer membrane. Surprisingly, our data revealed a large increase in vancomycin resistance for all mlaD mutants compared to the parent strain (Fig. 5F).
This was also observed in the pqiAB yebST double mutant. Furthermore, the triple mlaD pqiAB yebST mutant was marginally more resistant than the other mlaD mutants. From these data, we conclude that the MCE proteins in E. coli have distinct but overlapping functions.

MCE proteins contribute to maintenance of lipid asymmetry. MlaD was previously demonstrated to have a role in maintaining the lipid asymmetry of the outer membrane 12 . Given the overlapping functions of MlaD, PqiB and YebT, we hypothesised that PqiB and YebT may also be involved in maintaining outer membrane lipid asymmetry. To test this hypothesis we used the activity of the enzyme PagP as an indirect measure of surface-exposed phospholipids; the enzyme converts lipid A from the hexa- to the hepta-acylated form only when phospholipids are located in the outer leaflet of the outer membrane 32 . Previously it was demonstrated that in a Δmla background, loss of lipid asymmetry could be enhanced by loss of the outer membrane phospholipase PldA 12 . Therefore, radiolabelled lipid A was isolated from the mlaD, pqiAB, and/or yebST mutants in otherwise wild-type and pldA backgrounds and separated by thin layer chromatography. As a positive control, the parent strain was treated with the chelating agent EDTA, which is known to result in increased hepta-acylation of lipid A 33 . As previously reported 12 , there was an increase in hepta-acylated lipid A relative to the parent in all strains lacking MlaD, and this effect was elevated in the absence of pldA (Fig. 6). However, the amounts of hepta-acylated lipid A in the pqiAB, yebST or pqiAB yebST mutants did not change when compared to the parent strain, suggesting that neither PqiAB nor YebST plays a major role in maintaining outer membrane lipid asymmetry. Similarly, there was no increase in hepta-acylated lipid A in the mlaD pqiAB and mlaD yebST double mutants relative to the mlaD single mutant. These results are consistent with the fact that only the mlaD strain, but not the pqiAB or yebST mutants, is sensitive to SDS-EDTA (Supplementary fig. S4). We did, however, observe a significant increase in the levels of hepta-acylated lipid A in the mlaD pqiAB yebST triple mutant when compared with the mlaD single mutant, particularly in the pldA background.
These results indicate that cells lacking all MCE domain proteins show a larger asymmetry defect than those lacking just MlaD. We conclude that under standard laboratory conditions PqiB and YebT may contribute to outer leaflet integrity in the absence of the Mla pathway, but that their primary roles are clearly distinct.

Discussion
Here, for the first time, we have provided in-depth bioinformatic analyses of MCE proteins. We have demonstrated that these proteins are widely distributed across diderm bacteria, that they were present early in such bacterial species 34 fig. S7), and the resistance to vancomycin suggests this is not the case for bacteria deficient in MCE proteins. Furthermore, the phenotypes observed here for the growth of the mutants on sulfobetaines with different carbon chain lengths suggest MCE proteins have an overlap in function but they also highlight mechanistic differences. Indeed, LSB is known to inhibit the carnitine/acyl-carnitine transporter in mitochondria as a substrate analog 37 . A potential explanation for our observed phenotype is that LSB inhibits a similar lipid transport pathway in E. coli, perhaps one that has some overlap in function with the MCE pathways. Alternatively, MCE proteins might be involved in trafficking LSB away from its target; therefore the variable phenotypes observed on alternative sulfobetaines would represent differences in substrate specificity. Indeed, a type I operon in Sphingobium japonicum is essential for the uptake and utilisation of γ-hexachlorocyclohexane 15 and a type I operon in Pseudomonas putida is associated with toluene resistance 16

To provide a percentage measure of prevalence for each MCE architecture in each taxonomic rank, species with fully sequenced genomes (according to UniProt) were examined for the absence or presence of each MCE architecture. To minimise bias due to the sequencing of many strains in well-studied species such as E. coli, one genome sequence per species was selected at random. These results were then summed for each species phylum to calculate the percentages.
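The prevalence calculation described above (one randomly selected genome per species, then presence summed per phylum) could be sketched as follows. This is an illustrative outline under stated assumptions rather than the code used in the study: the record layout (`species`, `phylum`, `architectures`), the function name `prevalence_by_phylum` and the fixed random seed are all hypothetical.

```python
import random
from collections import defaultdict

def prevalence_by_phylum(genomes, architecture, seed=0):
    """Percentage of species per phylum whose sampled genome has `architecture`.

    `genomes` is a list of dicts with hypothetical keys:
    'species', 'phylum' and 'architectures' (a set of labels).
    """
    rng = random.Random(seed)  # fixed seed so the sampling is repeatable
    by_species = defaultdict(list)
    for genome in genomes:
        by_species[genome["species"]].append(genome)

    present = defaultdict(int)
    total = defaultdict(int)
    for species in sorted(by_species):
        # One genome per species, chosen at random, to limit sampling bias
        chosen = rng.choice(by_species[species])
        total[chosen["phylum"]] += 1
        if architecture in chosen["architectures"]:
            present[chosen["phylum"]] += 1

    return {phylum: 100.0 * present[phylum] / total[phylum] for phylum in total}
```

Counting species rather than genomes means that a heavily sequenced species contributes the same weight as a species with a single genome.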

Clustering. A representative set of 1,734 MCE proteins, selected using the CD-HIT Suite at a cut-off of 50% protein identity, was used in all-against-all searches in BLASTP with an e-value threshold of 1e-15. Information about each protein (phylum, architecture, etc.) was incorporated into the BLAST results based on the previous architecture designations and information from UniProt. Cluster diagrams were constructed using Cytoscape (v.3.3.0), where each node represents a single protein sequence and each line (or edge) represents a match below the e-value threshold.
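The conversion of all-against-all BLASTP hits into a similarity network can be illustrated with a short sketch. It assumes that hits have already been parsed into (query, subject, e-value) tuples; the function names are hypothetical, and treating hits exactly at the threshold as matches is an assumption. The connected-component step is included only to show how clusters emerge from the edges; it is not a description of the Cytoscape layout.

```python
def similarity_edges(blast_rows, max_evalue=1e-15):
    """Return undirected edges for hits at or below the e-value threshold."""
    edges = set()
    for query, subject, evalue in blast_rows:
        if query == subject:
            continue  # ignore self-hits
        if evalue <= max_evalue:
            # Sort the pair so that A-B and B-A collapse into one edge
            edges.add(tuple(sorted((query, subject))))
    return edges

def clusters(nodes, edges):
    """Group nodes into connected components using a simple union-find."""
    parent = {node: node for node in nodes}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path compression
            node = parent[node]
        return node

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = {}
    for node in nodes:
        groups.setdefault(find(node), set()).add(node)
    return sorted(groups.values(), key=sorted)
```

Each protein is a node regardless of whether it has any edges, so sequences without a match at the threshold appear as singleton clusters.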

Gene neighbourhoods. The Ensembl gene ID for each MCE protein in Proteobacteria was cross-referenced in conjunction with the NCBI taxonomic identifier to retrieve the sequence database for each organism in Ensembl (using the Ensembl Perl API) 43,44 . Information was retrieved for up to 10 genes located up to 10 kb either side of the MCE gene. Domain architectures were predicted for the encoded proteins by scanning for PFAM domains (PFAM database 27.0) using HMMER (hmmsearch 3.1b2). Where several genomes for a particular species were found, a single representative genome was chosen randomly so that each species was only represented once in the neighbourhood results.
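The neighbourhood window described above can be sketched in a few lines. This is a simplified illustration and not the Ensembl Perl API workflow used in the study: gene records are assumed to be dicts with hypothetical keys `id`, `start` and `end`, the limit of 10 genes is assumed to apply to each side separately, and strand and chromosome circularity are ignored.

```python
def neighbourhood(genes, mce_id, max_genes=10, window_bp=10_000):
    """Return (upstream_ids, downstream_ids) around the gene `mce_id`.

    A neighbour is kept if it lies at least partly within `window_bp`
    of the MCE gene, with at most `max_genes` neighbours per side.
    """
    ordered = sorted(genes, key=lambda gene: gene["start"])
    index = next(i for i, gene in enumerate(ordered) if gene["id"] == mce_id)
    mce = ordered[index]

    # Genes before the MCE gene whose end falls inside the upstream window
    upstream = [g for g in ordered[:index] if g["end"] >= mce["start"] - window_bp]
    # Genes after the MCE gene whose start falls inside the downstream window
    downstream = [g for g in ordered[index + 1:] if g["start"] <= mce["end"] + window_bp]

    # Keep only the genes closest to the MCE gene on each side
    upstream = upstream[-max_genes:]
    downstream = downstream[:max_genes]
    return [g["id"] for g in upstream], [g["id"] for g in downstream]
```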
To display groups of conserved neighbourhoods together, gene neighbourhoods were first clustered using a nearest neighbour joining approach. The similarity measure used in the clustering method gives greater weight to genes closer to the centre of the neighbourhood, based on the assumption that gene positions further away from the centre of the neighbourhood are less likely to be conserved.
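A position-weighted similarity of this kind might be sketched as follows. The exact weighting scheme is not given here, so the inverse-distance weight, the function names and the representation of a neighbourhood as a mapping from gene offset (negative for upstream, positive for downstream, never zero) to a domain architecture string are all assumptions made for illustration.

```python
def position_weight(offset):
    """Weight a position by its distance from the central MCE gene.

    Assumption: inverse-distance weighting. Offsets are never 0 because
    the MCE gene itself is excluded from the comparison.
    """
    return 1.0 / abs(offset)

def neighbourhood_similarity(a, b):
    """Weighted fraction of positions sharing the same domain architecture."""
    positions = set(a) | set(b)
    if not positions:
        return 0.0
    total = sum(position_weight(p) for p in positions)
    shared = sum(
        position_weight(p)
        for p in positions
        if p in a and p in b and a[p] == b[p]
    )
    return shared / total
```

With this weighting, two neighbourhoods that agree only at the gene adjacent to the MCE gene score higher than two that agree only at a distant position, which matches the stated assumption that distal positions are less likely to be conserved.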
This clustering method was used to construct neighbourhood diagrams for each protein architecture, where similar neighbourhoods were clustered together. The most common neighbourhoods were manually selected from these diagrams, and some gene domain architectures were combined (e.g. "ABC_tran" domain architecture was merged with "ABC_tran, AAA_21" architecture due to similar predicted functions and similar neighbourhoods of these genes). The neighbourhoods were split into type I, III and IV proteins. To calculate a percentage for a particular gene neighbourhood, the number of genes with that neighbourhood was divided by the total number of genes that encode the given protein architecture.
Operon predictions. Operons of pqiB and yebT were predicted using ProOpDB 45 and EcoCyc 46 . The domain architecture for each protein encoded in these operons was predicted using HMMER.
Strains, media and growth conditions. E. coli K-12 BW25113 was used as the parent strain 47 . Bacteria were grown in Luria-Bertani (LB) medium or on LB plates (LB supplemented with 1.5% nutrient agar) and incubated at 37°C. If required, the medium was supplemented with 50 µg/ml kanamycin or 100 µg/ml carbenicillin. To construct deletions in pqiAB and yebST, the genes were replaced by a kanamycin resistance cassette, as previously described 48 . To construct gene deletions in mlaD, the mlaD::aph allele was transferred from the Keio collection 47 by P1 transduction as described previously 49 . The kanamycin cassette was removed using the vector pCP20 48 . For dilution plates, overnight cultures were adjusted to an OD600 of 1 and diluted down to 10^-5 in a microtitre plate. A multichannel pipette was used to transfer 2 µl of the dilutions for each strain to LB agar plates supplemented with the desired chemicals.

Separation of the inner and outer membranes for identification of cellular location. This method was adapted from the methods described by Osborn & cells were grown to an OD600 of 0.6-0.8. The cells were pelleted (16,000 g for 10 minutes) and re-suspended in 10 ml of sucrose-Tris buffer (0.75 M sucrose, 10 mM Tris, pH 7.8). To form spheroplasts, the mixture was transferred to an Erlenmeyer flask where 500 µl of 2 mg/ml lysozyme was added. After 2 minutes of incubation on ice, 20 ml of ice-cold 1.5 mM EDTA was slowly added over 10 minutes using a peristaltic pump, with gentle stirring. The spheroplasts were broken using a C3 cell disrupter (at 15,000 psi) and unbroken cells were pelleted at 17,400 g for 20 minutes.
The supernatant was spun again at 48,400 g for 1 hour to pellet cell membranes. To wash the membranes, the pellet was re-suspended in 20 ml of sucrose-Tris-EDTA buffer 1 (0.25 M sucrose, 3.3 mM Tris, pH 7.8, 1 mM EDTA) and re-pelleted at 165,000 g for 1 hour. The membranes were re-suspended again in 10 ml of sucrose-Tris-EDTA buffer 2 (20% sucrose, 0.5 mM EDTA, 10 mM Tris, pH 7.8).
For the gradient, all sucrose was dissolved in EDTA-Tris buffer (0.5 mM EDTA, 10 mM Tris, pH 7.8). The gradient was made up in 38.5 ml Ultra-Clear Thinwall tubes (Beckman Coulter) with 10 ml of 73% sucrose (bottom), 18 ml of 53% sucrose (middle) and the 10 ml of membrane sample in 20% sucrose (top). The gradient was centrifuged at 141,000 g for 40 hours in a SW 28 Ti rotor at 4°C. To obtain the inner membrane a pipette was used to withdraw the membrane through the top of the gradient from the 20%-53% boundary. To obtain the outer membrane the tube was pierced at the bottom and the membrane was collected by gravity flow from the 53%-73% boundary.
The resulting isolated fractions were analysed by western blotting following protein separation by SDS-PAGE. Antibodies against the known membrane markers TolC (outer) and AcrB (inner) were used to assess the success of separation.
Antibodies against PqiB and YebT were used to determine the locations of PqiB and YebT. All primary antibodies were used at a dilution of 1:2,000 in Tris-buffered saline and left to incubate overnight. After washing, the blots were labelled with horseradish peroxidase secondary antibody (Sigma Aldrich) (dilution 1:15,000). The western blots were developed using ECL Prime Western Blotting Detection Reagent (Amersham), and exposed to Hyperfilm ECL (Amersham) for between 5 seconds and 5 minutes.
Analysis of lipid A. Lipid A was extracted and analysed as described previously 52  The TLC plate was dried and exposed to phosphor storage screens (GE Healthcare) and was visualised in a phosphor-imager (Storm 860, GE Healthcare). The spots were analyzed by ImageQuant TL analysis software (version 7.0, GE Healthcare). Spots were quantified and averaged based on three independent experiments of lipid A isolation. Before performing statistical tests the datasets were first tested for normality using a Shapiro-Wilk test. A one-sided unpaired t-test was then used to determine statistical significance between pairs of samples. To correct for multiple testing the Benjamini-Hochberg correction was applied to all p values.
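The Benjamini-Hochberg step-up adjustment mentioned above can be written compactly. The sketch below is a generic implementation of the published procedure, not the software used for this analysis (which is not stated here); the function name is hypothetical. The normality test and the t-test are omitted; statistical libraries such as SciPy provide both.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum ensures that a smaller raw p-value never receives a larger adjusted value than a bigger one.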
Lipid extraction and thin layer chromatography. Lipids were extracted by Bligh-Dyer 48 from outer membrane fractions prepared by sucrose gradient (as described above). A total of 5.7 ml of 1:2 chloroform:methanol was added to the whole outer membrane fraction (approx. 1.5 ml), followed by 1.875 ml of chloroform and then 1.875 ml of water, with thorough mixing at each stage. The mixture was centrifuged at 1000 rpm in an IEC table-top centrifuge for 5 minutes and the lower (organic) phase was collected using a glass Pasteur pipette and transferred to a new tube. To prepare fresh upper phase, the same procedure was repeated but with 1.
[Figure labels, phylum (count): Caldiserica (1), Dictyoglomi (2), Thermotogae (17), Deinococcus-Thermus (17), Synergistetes (8), Fusobacteria (8), Cloacimonetes (1), Fibrobacteres (1), Gemmatimonadetes (2), Ignavibacteriae (2)]