Pan-GWAS of Streptococcus agalactiae Highlights Lineage-Specific Genes Associated with Virulence and Niche Adaptation

GBS is a leading cause of mortality in newborn babies in high- and low-income countries worldwide. Different strains of GBS are characterized by different degrees of virulence: some are harmlessly carried by humans or animals, and others are much more likely to cause disease. The genome sequences of almost 2,000 GBS samples isolated from both animals and humans in high- and low-income countries were analyzed using a pan-genome-wide association study approach. This allowed us to identify 279 genes which are associated with different lineages of GBS that differ in virulence and preferred host. Additionally, we propose that the GBS now carried in humans may have first evolved in animals before expanding clonally once adapted to the human host. These findings are essential to help understand what is causing GBS disease and how the bacteria have evolved and are transmitted.

and countries. This study revealed that GBS CCs possess distinct collections of genes conferring increased potential for persistence, including genes associated with carbohydrate metabolism, nutrient acquisition, and quorum sensing. Within CC17, allelic variants of these crucial genes distinguish carriage from invasive strains. The differences in the GBS CCs analyzed are not geographically restricted, and we postulate that they may have emerged from an original ancestral GBS strain in animal hosts before crossing to humans.

... phylogeny on five different 140-strain subsets using ClonalFrameML, as shown in Fig. S2. CCs clustered in distinct branches of the tree; in particular, CC17 and CC23 produced two clusters. A group of 41 animal-derived genomes clustered within a separate clade (Fig. 2) branching from human-associated CC17, suggesting a human origin for these animal strains. However, 46/87 animal isolates were located in clades associated with human-derived samples and CCs: for instance, three animal isolates (MRI Z1 201, MRI Z1 200, and MRI ZI 203) clustered within the clinical isolates in the CC23 clade. We also observed several animal isolates classified as CC17 by eBURST, which indicates close relatedness to the CC17 clade, that nevertheless clustered within the animal strains (e.g., FSL S3 603 and LDS 623). This pattern could be interpreted as zoonotic transfer, with the human-associated CCs arising in animals before undergoing clonal expansion after infection of the human host. However, given the fragmented sampling frame of this data set, as well as the lack of rooting in this phylogenetic analysis (Fig. 2), we can only speculate on this matter.

Pangenome and pan-GWAS. Scoary has previously been used for a similar pan-GWAS analysis of three CC17 strains (30). In this study, we applied it to a data set of 1,988 strains representing 6 different clonal complexes (Fig. 1): 1,374 genes were included in the core genome (i.e., present in more than 95% of the strains; "core" and "soft-core" genes), and 12,457 genes in the accessory genome (i.e., present in less than 95% of the strains; "shell" and "cloud" genes) (32). We observed that pangenome saturation was achieved. A total of 51, 41, 39, 102, and 64 genes associated, respectively, with CC1, CC10, CC19, CC17, and CC23 were identified (Table 2 and Table S3), with a specificity and sensitivity in defining the CC given the annotated CDS (and vice versa) greater than 90% (P < 0.05). The pipeline was not applied to CC6, which was represented by only 22 genomes in our data set.

BLASTn was used to determine whether gene sequences associated with each CC in the pan-GWAS were completely absent in the other CCs or had accumulated sufficient mutations to escape recognition by automated annotation (i.e., Prokka). We identified 57 such genes in CC17 out of the 102 identified by the Scoary pipeline, 22 genes in CC23, 4 genes in CC1, 9 genes in CC10, and 5 in CC19 (Fig. S3 and Table S3). This suggests that the genes characterizing a particular CC may have been rendered nonfunctional (i.e., pseudogenes) in other CCs. Table S3 highlights which CC-associated genes are completely absent and which are characterized by mutations (SNPs or indels) that alter the protein sequence through point mutations or truncation. Table 2 summarizes the number of genes driving the pan-GWAS analysis and the nature of the mutations encountered.

Gori et al.

FIG 2 Core genome-based population structure of GBS. The phylogenetic tree is annotated with 4 colored strips representing the clonal complex, the country of isolation, the origin, and the serotype of each strain. The three binary heatmaps represent the presence (blue) or absence (yellow) of the genes identified by the pan-GWAS pipeline. The tree is rooted at midpoint. The reference strain used in this analysis was COH1, reference HG939456. The red square in the CC10 heatmap highlights the cluster of CC10-associated genes found in CC19 clones. Trees built with different reference strains are shown in Fig. S1 in the supplemental material and show analogous topology.

TABLE 2 Number of genes driving the pan-GWAS analysis for each CC (the headers of breakdown columns 1 to 4 are missing; the Total column is their sum)

CC      Col. 1   Col. 2   Col. 3   Col. 4   Total   No. of GWAS-gene islands(a)
CC1     47       3        0        1        51      0
CC10    32       3        2        4        41      0
CC17    38       42       10       12       102     0
CC19    34       5        0        0        39      1
CC23    42       14       3        5        64      0

(a) A GWAS-gene island is defined as a region of the genome of at least 200 kbp where >90% of the coding regions are GWAS-driving genes.

The genes identified by the pan-GWAS analyses in CC1, CC10, CC17, and CC23 were evenly spread across the chromosome rather than clustered in particular areas, which is consistent with the observed gene associations not resulting from a chromosomally integrated plasmid or transposon pathogenicity island acquired through horizontal gene transfer (Fig. 3). One exception was CC19, where the majority of the 39 genes were clustered in a 200-kbp region of the chromosome. Gene synteny was conserved across different isolates; in Fig. S4 the parallel vertical lines show how the genes conserved among the CC17 isolates are present in the same relative location (represented as different colored blocks in the Mauve representation) in three representative genomes.
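The core and accessory gene counts reported in this section rest on a simple frequency rule, which the following Python sketch illustrates. It is not the Roary implementation: the function names are hypothetical, the 99% and 95% boundaries are those stated in the text, and the 15% boundary between "shell" and "cloud" genes is an assumption taken from Roary's summary convention rather than from this article.

```python
from typing import Dict, List


def classify_gene(n_present: int, n_strains: int) -> str:
    """Assign a pangenome category from the number of strains carrying a gene.

    Integer arithmetic is used so that boundary cases are not affected by
    floating-point rounding. The 15% shell/cloud boundary is an assumption.
    """
    if n_present * 100 >= 99 * n_strains:
        return "core"
    if n_present * 100 >= 95 * n_strains:
        return "soft-core"
    if n_present * 100 >= 15 * n_strains:
        return "shell"
    return "cloud"


def split_pangenome(presence: Dict[str, List[bool]]) -> Dict[str, List[str]]:
    """Split genes into core genome (core + soft-core) and accessory genome.

    `presence` maps a gene name to one presence/absence call per strain.
    """
    result: Dict[str, List[str]] = {"core genome": [], "accessory genome": []}
    for gene, calls in presence.items():
        category = classify_gene(sum(calls), len(calls))
        if category in ("core", "soft-core"):
            result["core genome"].append(gene)
        else:
            result["accessory genome"].append(gene)
    return result
```

With 1,988 strains, for example, a gene found in 1,900 strains would fall in the soft-core class and therefore count toward the core genome as defined here.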
The majority of the pan-GWAS-identified genes were associated with only one CC, but a particular cluster of genes associated with CC10 (including the galTKEM system for galactose metabolism) was also present in a set of isolates belonging to CC19 (Fig. 2). These isolates were all from Africa (Malawi and Kenya) and belonged to ST-327 and ST-328.
Functional pathways affected by CC-specific genes. A total of 279 genes were found to be CC-specific (Table S3). Genes characteristic of CC17 and CC23 were classified into five functional categories (Table 3): metabolism, environmental information processing, cellular processes, human disease, and genetic information processing. In both CCs, the most-represented categories were metabolism and environmental information processing.
Differences in metabolic pathways between CC17 and CC23 included carbohydrate, amino acid, nitrogen compound, and fatty acid metabolism. Siderophores for the uptake and transport of micronutrients (e.g., iron or nickel), which are essential for successful colonization of the human host in several bacterial pathogens (33-35), also exhibited significant variation; for instance, genes for nickel uptake (nikE and nikD) and iron transport (feuC) were truncated or characterized by SNPs in non-CC17 strains (Table S3).
CC17 and CC23 also showed differences in the genes affecting the environmental information processing functional pathways characterized by the presence of phosphotransferase (PTS) systems and two-component systems (TCS), used for signal transduction and sensing of environmental stimuli. Moreover, in the same functional category, differences were present in secretion systems, transporters, quorum sensing, and bacterial toxins. These pathways are used by GBS not only in colonization of the host, but also to gain competitive advantage against other microorganisms occupying a particular ecological niche (36).
Genes for prokaryotic defense systems, such as the CRISPR-Cas9 system, were also found, as well as proteins involved in genetic information processing such as transcription factors and regulators that may affect the expression of multiple genes (37). Finally, antibiotic resistance also appears among the lineage-specific characteristics; in particular, CC23 is the only CC showing typical genes involved in vancomycin resistance. CC17 also showed the presence of genes belonging to the KEGG group for "nucleotide excision repair" and "DNA repair/recombination protein" (KO numbers 03420/03400; Table 3), which could indicate a variation in mutagenesis rate and thus capacity to respond to changes in environmental conditions and presence of stresses.
In contrast, the genes defining CC1, CC10, and CC19 were confined to metabolism, environmental information processing, and genetic information processing. Genes involved in regulation and environmental sensing (PTS systems), as well as secretion systems, were identified in this group of CCs. In particular, a gene encoding the VirD4 type IV secretion system protein was associated with CC19. CC10 was characterized by an array of genes involved in carbohydrate metabolism and uptake, such as the ABC transport system for multiple sugar transport. The majority of genes characteristic of CC1 were of unknown function, with the exception of genes involved in genetic regulation and a complete toxin/antitoxin system, phd/doc (38). These systems are often described as a tool for stabilizing extrachromosomal DNA (i.e., plasmids), but they are also often found integrated chromosomally in both Gram-positive and Gram-negative bacterial species, although their function in this setting is unclear (39).

TABLE 3 (fragment) KO numbers and functional groups of CC17-associated genes

KO no.   Functional group
         Galactose metabolism
00500    Starch and sucrose metabolism
00520    Amino sugar and nucleotide sugar metabolism
00620    Pyruvate metabolism
00630    Glyoxylate and dicarboxylate metabolism
00640    Propanoate metabolism
00680    Methane metabolism
00910    Nitrogen metabolism
00561    Glycerolipid metabolism
00230    Purine metabolism
00240    Pyrimidine metabolism
00250    Alanine, aspartate, and glutamate metabolism
00260    Glycine, serine, and threonine metabolism
00280    Valine, leucine, and isoleucine degradation
00220    Arginine biosynthesis
01007    Amino acid related enzymes
00430    Taurine and hypotaurine metabolism
01003    Glycosyltransferases
01005    Lipopolysaccharide biosynthesis proteins
01011    Peptidoglycan biosynthesis and degradation proteins
00760    Nicotinate and nicotinamide metabolism
00770    Pantothenate and CoA biosynthesis
01001    Protein kinases
01002    Peptidases
03021    Transcription machinery
03016    Transfer RNA biogenesis
00970    Aminoacyl-tRNA biosynthesis
03110    Chaperones and folding catalysts
03060    Protein export
03420    Nucleotide excision repair
03400    DNA repair and recombination proteins

Environmental Information Processing (09130)
02000    Transporters
02010    ABC transporters
02060    Phosphotransferase system (PTS)
03070    Bacterial secretion system
In relation to the CC17-associated genes, we also checked for allelic variants specific to strains isolated from invasive disease or carriage. Figure S4 shows the proportion of CC17 invasive or carriage strains and the frequency of each allelic variant. We identified 21 genes with alleles that statistically differentiated strains isolated from carriage and invasive disease (Fisher test, P < 0.05; Table 4). The DNA sequence of the allelic variants differed by a single polymorphism in all cases. In 15/21 cases this nucleotide change was translated into an amino acid change (missense), while in a single case the mutation resulted in a truncated protein (nonsense). Figure S6 shows these mutations in relation to the population structure of CC17. The tree shows that these mutations are associated with particular clades rather than with the disease phenotype. This suggests that the mutations were acquired historically by the CC17 bacterial population and have become fixed in each clade. Nonetheless, the CC17 clusters are not equally represented by disease and carriage strains, suggesting that these mutations may have contributed to the development of an invasive phenotype. These genes have the potential to affect the metabolism and virulence of the bacterial strains. For example, the major pilin synthesis gene is known to be characterized by locus variants that are associated with biofilm formation and virulence (namely, variants PI-I, PI-IIa, and PI-IIb) (40), and CC17 is characterized by the presence of PI-I/PI-IIb. Smaller variations within the PI-IIb locus appear to be associated with CC17 strains isolated from carriage, suggesting that this gene may be functionally impaired in those strains.
Similarly, the prtP and efrB genes, encoding, respectively, a virulence-associated protease and the ATP-binding cassette of a multidrug efflux pump, have alleles that are more common in strains isolated from disease, highlighting the potential for these allelic variations to result in a more virulent phenotype (41). To test the impact of the mutations targeting the efrB multidrug efflux pump, we assessed susceptibility (MIC) to a range of unrelated compounds previously shown to be informative in S. pneumoniae (42) (Table 5). For three of the five molecules (erythromycin, chloramphenicol, and norfloxacin), the CC17 strain we tested had the highest MIC, although this was also seen in other non-CC17 strains. The acriflavin MIC did not vary between the different strains. The berberine MICs for CC17 and CC1 were lower than those for CC23 and CC19.

TABLE 5 footnotes: (a) MICs were determined from 3 experiments. (b) A value of 8 µg/ml was the upper detection limit for norfloxacin due to acidification and precipitation of media components.

DISCUSSION
S. agalactiae isolated from human and animal sources is characterized by a range of clonal complexes and sequence types. Each CC appears to be phenotypically different, with CC1 being commonly isolated in adult disease, and CC17 (associated with capsular serotype III) commonly isolated in neonatal disease and demonstrating hypervirulence (1,16). We show that these different CCs are characterized by different gene sets belonging to functional families involved in niche adaptation and virulence. This is in part reflected in the varying potential of different CCs to cause invasive disease in different human hosts, as illustrated by the hypervirulence of CC17 in neonates, the lower neonatal invasive potential of CC1, CC19, and CC23 clones, and the propensity of CC1 to cause disease in adults with comorbidities (15,16). Importantly, these CC-specific genetic characteristics and the pattern of gene presence and absence are independent of geographical origin, with the exception of the CC10 gene cluster present in the strains isolated from Africa belonging to ST327 and ST328. Furthermore, within CC17 we have identified several functionally important allelic variants associated with either carriage or disease.
Among the hypervirulent CC17-specific genes, there were several examples of genes previously associated with human disease due to GBS and other related bacteria. For instance, the transporter Nik, which controls the uptake of nickel, is essential for survival in the human host. A homolog of Nik has been shown to be essential for Staphylococcus aureus to cause urinary tract infections (UTIs) (43). The dldh gene, encoding the dihydrolipoamide dehydrogenase enzyme (44), has been implicated in several virulence-related processes in Streptococcus pneumoniae, such as survival within the host and production of capsular polysaccharide. Mutants lacking the dldh gene are unable to cause sepsis and pneumonia in mouse models (44). Surface proteases in S. agalactiae are described to have several virulence-associated functions, such as inactivation of chemokines that recruit immune cells to the site of infection or facilitation of the invasion of damaged tissue (45,46). We have identified the PrtP and ScpA proteases, both characterized by the presence of C5a peptidase domains, and a signal peptidase, SpsB, as specific to this complex. Genes known to be associated with CC17 hypervirulence have also been identified in this analysis, including the PI-IIb locus (40), part of which is represented by the CC17-associated genes gcc1732, lepB, inlA_2, and gcc1733 (Table S3), supporting the validity of this analysis. Allelic variation of virulence-associated genes has previously been used to identify genes classifying invasive and noninvasive strains in other streptococcal species (41). A proportion of CC17-specific genes also showed unique alleles associated with invasive disease or carriage strains. Sixteen of twenty-one allelic variants resulted in a difference that was translated into the protein sequence, including regulatory proteins and virulence- or metabolism-associated proteins, such as ABC transport systems, a major pilin protein, and a C5a peptidase. These data suggest that there have been further selection processes within hypervirulent CC17 that could result in strains characterized by different virulence levels.
A phenotypic analysis was carried out to identify the potential effect of mutations in the efrB multidrug efflux pump that differentiated CC17 from the other clonal complexes. Using the MIC for a panel of antimicrobial molecules, we were not able to show conclusively that mutations in the efrB gene alone change the MIC. This is likely to be confounded by other differences in the non-CC17 genomes. CC23, for instance, encodes several ABC transporters in its accessory genome (such as the wxdM gene; Table S3), and we cannot exclude that some of these are involved in the efflux of deleterious molecules. Further work is needed to assess the effect of the GWAS-driving mutations in an isogenic CC17 background.
The CC23-specific genes identified are putatively involved in virulence and host invasion, including mntH, a gene encoding a manganese transport protein. During a bacterial infection the host limits access to manganese, among other micronutrients, and it has been shown that S. aureus responds to this host-induced starvation by expressing metal transporters, such as MntH (35). Interestingly, CC23 is also associated with vanY, a gene implicated in vancomycin resistance in other streptococci (47). GBS is typically susceptible to vancomycin (48), an antibacterial glycopeptide obtained from Streptomyces orientalis, which inhibits cell wall synthesis, alters the permeability of the cell membrane, and selectively inhibits RNA synthesis (49). Whether the presence of this gene also facilitates niche adaptation in the context of a complex host-microbiota environment remains to be determined.
Lineage CC10, and the CC19 sublineage that includes the strains belonging to ST327 and ST328, are mostly characterized by metabolic genes, consistent with the lower virulence of these clonal complexes. The galTKEM genes are present in these two lineages only and encode the "Leloir pathway" in other streptococci, such as S. mutans, S. thermophilus, and S. pneumoniae (50-52). This pathway in S. pneumoniae is finely tuned by CbpA and activated in tandem with the tagatose-6-phosphate pathway in order to maximize growth (53). The functionality of this pathway is yet to be described in GBS, but we hypothesize that access to different routes for metabolizing carbohydrates facilitates nutrient competition and survival. Among the nonmetabolic genes that are associated with CC19, we identified a virD4 gene, which is part of a previously identified type IV secretion system (T4SS) (54) present in numerous bacterial species and associated with virulence effector translocation and conjugation (55,56).
GBS is widely thought to be a zoonosis (57-61). Based on the nature and the distribution of CC-characterizing genes, and their location in the GBS genome, we hypothesize that S. agalactiae lineages that colonize humans may have initially evolved in animals and then subsequently expanded clonally in humans. Our analysis is limited by the availability of data sets from across the world and from putative animal hosts, and therefore this hypothesis needs further confirmation. However, in line with the observation that S. agalactiae has undergone genome reduction (62), we postulate that the human-adapted clones evolved in animals through loss of function of redundant genes. Having escaped the animal niche, they were then able to evade the human immune system and establish successful colonization. Recently, the "missing link" between animal and human adaptation of GBS was described to be CC103 (57). However, we have identified animal isolates belonging to human-associated CCs (e.g., CC17 and CC23) which, in the case of the CC23 strains, cluster together with human clinical isolates in the GBS population structure.
Our analysis has a number of additional limitations. First, we were confined to the current publicly available GBS human and animal genomes retrieved from https://pubmlst.org/sagalactiae/ (a total of 3,028 isolates, including the full data set from the Netherlands), plus a further 303 genomes from Malawi. Second, the GWAS pipeline we used relies on the automated annotation software Prokka, whose output was then passed to Roary and Scoary to produce the pangenome and the pan-GWAS. This was extremely efficient for annotating the thousands of bacterial genomes in this analysis, and although the genome annotations and the pangenome were manually screened for consistency and quality (such as saturation of the core and accessory genome), it could potentially introduce artifacts. Confirming the GWAS findings with the sequence alignments allowed us to identify several genes that were characterized by nonsynonymous mutations and small indels, as well as to uncover potential artifacts that require further investigation. Finally, as our analysis is confined to the genomic differences between the different clades, further laboratory and epidemiological analysis will be needed to fully appreciate the biological consequences of these CC-specific genes.
In conclusion, we have shown that the CCs of Streptococcus agalactiae responsible for neonatal meningitis and adult colonization are characterized by the presence of specific gene sets that are not limited to particular geographical areas. In the context of GBS control measures, such as vaccination, we speculate that as the human gastrointestinal and urogenital niches are vacated by vaccine serotypes, serotype replacement could occur as a result of new GBS strains arising from animals, including cattle and fish, which are reservoirs of GBS genetic diversity.

MATERIALS AND METHODS
Bacterial strains, genomes, and origin. Publicly available genome sequences from 1,574 human isolates from Kenya, the United States, Canada, and the Netherlands, together with 87 genomes from animal isolates, were analyzed (16, 63-65) (Table 1). A further 24 genomes derived from human isolates were retrieved from https://pubmlst.org/sagalactiae/ and included. Genome assemblies were not available for the isolates from Kenya and the Netherlands; in those cases, short-read sequence data were retrieved from the European Nucleotide Archive (ENA, https://www.ebi.ac.uk/ena). Raw DNA reads were trimmed of low-quality ends and cleaned of adapters using Trimmomatic software (ver. 0.32) (66). De novo assembly was performed with SPAdes software (ver. 3.8.0) (67), using a sample of 1,400,000 reads and k-mer values of 21, 33, 55, and 77. De novo assemblies were checked for plausible length (between 1,900,000 and 2,200,000 bp), annotated using Prokka (ver. 1.12) (68), and checked for low-level contamination using Kraken (ver. 0.10.5) (69). In cases for which more than 5% of the contigs belonged to a species different from Streptococcus agalactiae, the genome sequence was not included in any further analysis. The resulting assemblies were deposited in the pubmlst.org/sagalactiae database, which runs the BIGSdb genomics platform (70).
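The two acceptance criteria described above (plausible assembly length and limited contamination) amount to a simple filter, sketched below in Python. This is illustrative only: the function name and its arguments are hypothetical, and the sketch assumes that the number of contigs assigned to a species other than S. agalactiae has already been tallied from the classifier output.

```python
# Thresholds are taken from the text; names and inputs are illustrative assumptions.
MIN_LENGTH_BP = 1_900_000
MAX_LENGTH_BP = 2_200_000
MAX_FOREIGN_CONTIG_FRACTION = 0.05


def passes_qc(total_length_bp: int, n_contigs: int, n_foreign_contigs: int) -> bool:
    """Return True if an assembly would be kept for further analysis.

    An assembly is kept when its total length is plausible and no more than
    5% of its contigs are assigned to a species other than S. agalactiae.
    """
    if n_contigs <= 0:
        return False  # an empty assembly cannot be evaluated
    plausible_length = MIN_LENGTH_BP <= total_length_bp <= MAX_LENGTH_BP
    # Compare as integers to avoid floating-point issues at the 5% boundary.
    low_contamination = n_foreign_contigs * 100 <= 5 * n_contigs
    return plausible_length and low_contamination
```

For instance, a 2.05-Mbp assembly of 100 contigs with 3 foreign contigs would be retained, whereas the same assembly with 6 foreign contigs would be excluded.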
In addition, 303 carriage and invasive disease strains isolated in Malawi between 2004 and 2016 in the context of carriage and invasive disease surveillance were sequenced. DNA was extracted from an overnight culture using the DNeasy blood and tissue kit (Qiagen) following the manufacturer's guidelines and sequenced on the HiSeq 4000 platform (paired-end library, 2 × 150) at the Oxford Genomics Centre, UK. Sequences were then assembled as described above.
Multilocus sequence typing (MLST) sequence types (STs) were derived from the allelic profiles of 7 housekeeping genes (adhP, pheS, atr, glnA, sdhA, glcK, and tkt). This grouped strains into 91 unique STs. Strains which did not show a full set of housekeeping gene alleles or were not assigned to any previously described ST (n = 68) were double-checked for sequence contamination and assigned to a non-sequence-typeable (NST) group.
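The ST assignment described above is essentially a lookup of a seven-locus allelic profile, with incomplete or unknown profiles falling into the NST group. The Python sketch below illustrates that logic under stated assumptions: the lookup table, its allele numbers, and the labels "ST-A" and "ST-B" are invented for illustration and are not taken from the PubMLST scheme.

```python
from typing import Dict, Optional, Tuple

# Locus order follows the text.
LOCI = ("adhP", "pheS", "atr", "glnA", "sdhA", "glcK", "tkt")

# Toy lookup table: allele numbers and ST labels are hypothetical.
PROFILE_TO_ST: Dict[Tuple[int, ...], str] = {
    (1, 1, 2, 1, 1, 2, 2): "ST-A",
    (2, 1, 1, 2, 1, 1, 1): "ST-B",
}


def assign_st(alleles: Dict[str, Optional[int]]) -> str:
    """Return the ST for an allelic profile, or "NST" if it cannot be typed.

    A strain is non-sequence-typeable when any locus lacks an allele call
    or when the complete profile does not match a described ST.
    """
    profile = tuple(alleles.get(locus) for locus in LOCI)
    if any(allele is None for allele in profile):
        return "NST"
    return PROFILE_TO_ST.get(profile, "NST")
```

A strain with a missing tkt call, or with a complete but previously undescribed profile, would both be returned as "NST" by this sketch.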
Phylogeny inference. BURST (71) was used to evaluate the relatedness between different STs and to define CCs. Five random subsets, each containing 1,000 of the 1,988 isolates, were analyzed using eBURST on PubMLST (70). The analysis grouped STs sharing at least five of the seven MLST loci and identified the central ST (i.e., the ST with the highest number of single- or double-locus variants). Each of the five subsets showed the same six CCs (CC1, CC6, CC10, CC19, CC17, and CC23) plus a series of singletons (STs not belonging to any CC). CCs were defined as the set of STs associated with a particular CC in at least one eBURST result.
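The grouping rule described above can be illustrated with a short Python sketch. It is a simplified stand-in and not the eBURST implementation: STs are linked when they share at least five of seven loci, groups are the connected components of those links, and the central ST is taken as the member with the most single- or double-locus variants in its group. The function names and the toy profiles in the usage note are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Profile = Tuple[int, ...]


def shared_loci(a: Profile, b: Profile) -> int:
    """Count the loci at which two allelic profiles carry the same allele."""
    return sum(x == y for x, y in zip(a, b))


def group_sts(profiles: Dict[str, Profile], min_shared: int = 5) -> List[List[str]]:
    """Group STs by single linkage on the 'shares >= min_shared loci' relation."""
    neighbors: Dict[str, Set[str]] = {st: set() for st in profiles}
    for a, b in combinations(profiles, 2):
        if shared_loci(profiles[a], profiles[b]) >= min_shared:
            neighbors[a].add(b)
            neighbors[b].add(a)
    groups: List[List[str]] = []
    seen: Set[str] = set()
    for st in profiles:
        if st in seen:
            continue
        seen.add(st)
        stack, component = [st], []
        while stack:  # depth-first walk over linked STs
            current = stack.pop()
            component.append(current)
            for other in neighbors[current]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        groups.append(sorted(component))
    return groups


def central_st(group: List[str], profiles: Dict[str, Profile], min_shared: int = 5) -> str:
    """Return the group member with the most single- or double-locus variants."""
    def n_variants(st: str) -> int:
        return sum(
            1 for other in group
            if other != st and shared_loci(profiles[st], profiles[other]) >= min_shared
        )
    # Ties are broken alphabetically because max() keeps the first maximum.
    return max(sorted(group), key=n_variants)
```

Groups of size one correspond to the singletons mentioned in the text; in this sketch they are simply returned as one-member lists.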
Core-genome phylogeny of GBS data sets was inferred using the software Parsnp (from the Harvest package, ver. 1.1.2) (72), which performs core genome SNP typing and uses FastTree2 (73) (74). The effect of recombination on the phylogeny was assessed with ClonalFrameML ver. 1.12 (75) on five random 140-strain subsets of the entire data set.
Pangenome construction and genome-wide association analysis. A pangenome was generated from the combined human-derived strains (African isolates from Malawi and Kenya [64], Canadian [32], American [65], and Dutch isolates, and isolates of unknown geographical origin [Table 1]) and animal-derived strains using Roary (ver. 3.8.0). Parameters for each run were as follows: a minimum BLASTp identity of 95%; an MCL inflation value of 1.5; and presence in 99% of strains for a gene to be considered "core." Recently, several pipelines have been developed for bacterial genome-wide association studies (GWAS), such as PLINK, PhyC, ROADTRIPS, and SEER (76-79). Scoary (80) was designed to highlight genes in the accessory pangenome of a bacterial data set that are associated with a particular bacterial phenotype. Here, Scoary (ver. 1.6.16) was used to establish which genes were typical of each CC via a pan-genome-wide association study (pan-GWAS). The CC of each isolate was treated as a discrete phenotype, e.g., belonging to CC17 or not, and defined as "positive" or "negative," respectively, with the Scoary algorithm evaluating which gene features are statistically associated with a particular CC (80). The cutoff for a significant association was a P value lower than 1e-10 and a sensitivity and specificity greater than 90%. CC-associated genes were plotted on the circular representation of the chromosome of 5 GBS isolates belonging to each CC (strains ST-1, NCTC8187, 2603V/R, SGM4, 874391, and NGBS572) using BRIG (ver. 0.8) (81). Gene synteny was also evaluated for the CC-associated genes; three genomes belonging to each CC and the genes identified in the pan-GWAS analysis were aligned using ProgressiveMauve (82). Mauve (ver. 2.3.1) (83) was used to produce a graphical representation of the alignment, and gene synteny was qualitatively evaluated.
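To make the association criteria concrete, the following Python sketch computes, for one gene and one CC, the 2 × 2 contingency table, the sensitivity and specificity of gene presence as a marker of CC membership, and a two-sided Fisher exact P value, and then applies the cutoffs stated above. It is a minimal illustration and not Scoary's implementation: the function names are hypothetical, and multiple-testing correction and Scoary's phylogeny-aware pairwise comparisons are deliberately omitted.

```python
from math import comb
from typing import Sequence, Tuple


def contingency(gene_present: Sequence[bool], in_cc: Sequence[bool]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for gene presence versus CC membership."""
    tp = fp = fn = tn = 0
    for present, member in zip(gene_present, in_cc):
        if present and member:
            tp += 1
        elif present:
            fp += 1
        elif member:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def sensitivity_specificity(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float]:
    """Sensitivity = tp/(tp+fn); specificity = tn/(tn+fp)."""
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


def fisher_exact_two_sided(tp: int, fp: int, fn: int, tn: int) -> float:
    """Two-sided Fisher exact test from the hypergeometric distribution."""
    row1, col1, n = tp + fp, tp + fn, tp + fp + fn + tn
    denom = comb(n, col1)

    def prob(k: int) -> float:
        return comb(row1, k) * comb(n - row1, col1 - k) / denom

    p_obs = prob(tp)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    # Sum all tables that are at most as probable as the observed one.
    total = sum(p for p in (prob(k) for k in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def is_cc_associated(gene_present: Sequence[bool], in_cc: Sequence[bool],
                     p_cutoff: float = 1e-10, min_rate: float = 0.90) -> bool:
    """Apply the P value, sensitivity, and specificity cutoffs from the text."""
    table = contingency(gene_present, in_cc)
    sens, spec = sensitivity_specificity(*table)
    return fisher_exact_two_sided(*table) < p_cutoff and sens > min_rate and spec > min_rate
```

Note that with small sample sizes a gene can exceed 90% sensitivity and specificity and still fail the stringent P value cutoff, which is one reason a lineage represented by few genomes is difficult to analyze in this way.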
The sequence diversity of genes identified from the pan-GWAS analysis was investigated by selecting one representative nucleotide gene sequence associated with each CC (sequences reported in supplemental material file 1) and aligning this against each genome included in the analysis using BLASTn. The bitscore value of each gene alignment was used to produce the heatmaps shown in Fig. S2, using the R package pheatmap (ver. 1.0.10; https://CRAN.R-project.org/package=pheatmap). Bitscores were normalized against the highest-scoring isolate for each gene: the normalized bitscore x was in the range 0 ≤ x ≤ 1, where 1 corresponds to the highest identified bitscore, 0 corresponds to the absence of the gene, and values in between reflect different levels of gene similarity. For the identification of alleles that distinguish strains isolated from disease and carriage, we calculated the allelic profiles of the genes identified by the pan-GWAS pipeline in 545 CC17 strains (for which the isolation source was non-animal and known). For each gene we selected the alleles present in at least 10 strains and calculated the proportion of strains isolated from invasive disease and from carriage. Significance for alleles unevenly distributed between carriage and disease was calculated with a Fisher test.
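The bitscore normalization described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name and input layout are hypothetical, and an isolate with no BLASTn hit for a gene is assumed to be recorded as absent from the input and therefore scored as 0.

```python
from typing import Dict, Iterable


def normalize_bitscores(
    bitscores: Dict[str, Dict[str, float]],
    isolates: Iterable[str],
) -> Dict[str, Dict[str, float]]:
    """Scale each gene's bitscores to the range 0 to 1.

    `bitscores[gene][isolate]` holds the raw bitscore of the best alignment;
    isolates without a hit are scored 0 (gene absent), and the best-scoring
    isolate for each gene is scored 1.
    """
    isolate_list = list(isolates)
    normalized: Dict[str, Dict[str, float]] = {}
    for gene, hits in bitscores.items():
        best = max(hits.values(), default=0.0)
        normalized[gene] = {
            isolate: (hits.get(isolate, 0.0) / best if best > 0 else 0.0)
            for isolate in isolate_list
        }
    return normalized
```

The resulting gene-by-isolate matrix is the kind of input a heatmap function would expect, with intermediate values marking diverged or truncated copies of a gene.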
Assessment of ABC transporter activity by MIC. Routine expansion of S. agalactiae strains was done in Todd-Hewitt (TH) broth or on TH agar (TH broth with 15% agar) at 37°C, 5% CO2.
Briefly, GBS strains grown statically in TH broth at 37°C, 5% CO2 to logarithmic phase were harvested and resuspended in prewarmed MH-F broth at ~5 × 10^5 CFU/ml. A bacterial suspension (50 µl) was added to individual wells of a 96-well plate containing 50 µl MH-F supplemented with various concentrations of antimicrobial compounds (ranging from 0.06 to 32 µg/ml). The MIC was read after incubation of the 96-well plate at 37°C, 5% CO2 for 18 to 24 h.
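As an illustration of how a MIC is read from such a plate, the Python sketch below builds a concentration series and returns the lowest concentration at which no growth is observed. Two points are assumptions rather than statements from the text: the concentrations are taken to form a two-fold dilution series (ten steps down from 32 µg/ml give 0.0625 µg/ml, conventionally reported as 0.06), and the MIC is taken to be the lowest concentration without visible growth. The function names are hypothetical.

```python
from typing import Dict, List, Optional


def twofold_series(highest: float, n_steps: int) -> List[float]:
    """Return a two-fold dilution series starting at `highest` (assumed design)."""
    return [highest / 2 ** i for i in range(n_steps)]


def read_mic(growth: Dict[float, bool]) -> Optional[float]:
    """Return the lowest concentration with no visible growth.

    `growth` maps each tested concentration (µg/ml) to True if growth was
    seen in that well. None is returned when growth occurs even at the
    highest concentration, i.e., the MIC lies above the tested range.
    """
    mic: Optional[float] = None
    for concentration in sorted(growth, reverse=True):
        if growth[concentration]:
            break  # first well with growth when moving down the series
        mic = concentration
    return mic
```

For example, if growth is seen only in wells below 2 µg/ml, this sketch reports a MIC of 2 µg/ml.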
Ethical approval for Malawi GBS collection. Collection of carriage isolates was approved by the College of Medicine Research Ethics Committee (COMREC), University of Malawi (P.05/14/1574) and the Liverpool School of Tropical Medicine Research Ethics Committee (14.036). Invasive disease surveillance in Malawi was approved by COMREC (P.11/09/835 and P.08/14/1614).

SUPPLEMENTAL MATERIAL
Supplemental material is available online only. TEXT S1, TXT file, 0.2 MB.