Novel Wolbachia strains in Anopheles malaria vectors from Sub-Saharan Africa

Background: Wolbachia, a common insect endosymbiotic bacterium that can influence pathogen transmission and manipulate host reproduction, has historically been considered absent from the Anopheles (An.) genera, but has recently been found in An. gambiae s.l. populations in West Africa. As there are numerous Anopheles species that have the capacity to transmit malaria, we analysed a range of species across five malaria endemic countries to determine Wolbachia prevalence rates, characterise novel Wolbachia strains and determine any correlation between the presence of Plasmodium, Wolbachia and the competing bacterium Asaia. Methods: Anopheles adult mosquitoes were collected from five malaria-endemic countries: Guinea, Democratic Republic of the Congo (DRC), Ghana, Uganda and Madagascar, between 2013 and 2017. Molecular analysis was undertaken using quantitative PCR, Sanger sequencing, Wolbachia multilocus sequence typing (MLST) and high-throughput amplicon sequencing of the bacterial 16S rRNA gene. Results: Novel Wolbachia strains were discovered in five species: An. coluzzii, An. gambiae s.s., An. arabiensis, An. moucheti and An. species A, increasing the number of Anopheles species known to be naturally infected. Variable prevalence rates in different locations were observed and novel strains were phylogenetically diverse, clustering with Wolbachia supergroup B strains. We also provide evidence for resident strain variants within An. species A. Wolbachia is the dominant member of the microbiome in An. moucheti and An. species A but present at lower densities in An. coluzzii. Interestingly, no evidence of Wolbachia/Asaia co-infections was seen and Asaia infection densities were shown to be variable and location dependent. Conclusions: The important discovery of novel Wolbachia strains in Anopheles provides greater insight into the prevalence of resident Wolbachia strains in diverse malaria vectors. 
Novel Wolbachia strains (particularly high-density strains) are ideal candidate strains for transinfection to create stable infections in other Anopheles mosquito species, which could be used for population replacement or suppression control strategies.


Background
Malaria is a mosquito-borne disease caused by infection with Plasmodium (P.) parasites, with transmission to humans occurring through the inoculation of Plasmodium sporozoites during blood-feeding of an infectious female Anopheles (An.) mosquito. The genus Anopheles consists of 475 formally recognised species with ~40 vector species/species complexes responsible for the transmission of malaria at a level of public health concern 1. During the mosquito infection cycle, Plasmodium parasites encounter a variety of resident microbiota both in the mosquito midgut and other tissues. Numerous studies have shown that certain species of bacteria can inhibit Plasmodium development 2-4. For example, Enterobacter bacteria that reside in the Anopheles midgut can inhibit the development of Plasmodium parasites prior to their invasion of the midgut epithelium 5,6. Wolbachia endosymbiotic bacteria are estimated to naturally infect ~40% of insect species 7 including mosquito vector species that are responsible for transmission of human diseases, such as Culex (Cx.) quinquefasciatus 8-10 and Aedes (Ae.) albopictus 11,12. Although Wolbachia strains have been shown to have variable effects on arboviral infections in their native mosquito hosts 13-15, transinfected Wolbachia strains have been considered for mosquito biocontrol strategies, due to observed arbovirus transmission blocking abilities and a variety of synergistic phenotypic effects. Transinfected strains in Ae. aegypti and Ae. albopictus provide strong inhibitory effects on arboviruses, with maternal transmission and cytoplasmic incompatibility enabling introduced strains to spread through populations 16-22. Open releases of Wolbachia-transinfected Ae. aegypti populations have demonstrated the ability of the wMel Wolbachia strain to invade wild populations 23 and provide strong inhibitory effects on viruses from field populations 24, with releases currently occurring in arbovirus endemic countries such as Indonesia, Vietnam, Brazil and Colombia (https://www.worldmosquitoprogram.org).
The prevalence of Wolbachia in Anopheles species has not been extensively studied, with most studies focused in Asia using classical PCR-based screening; up until 2014 there was no evidence of resident strains in mosquitoes from this genus [25][26][27][28][29] . Furthermore, significant efforts to establish artificially infected lines were, up until recently, also unsuccessful 30 . Somatic, transient infections of the Wolbachia strains wMelPop and wAlbB in An. gambiae were shown to significantly inhibit P. falciparum 31 , but the interference phenotype is variable with other Wolbachia strain-parasite combinations 32-34 . A stable line was established in An. stephensi, a vector of malaria in southern Asia, using the wAlbB strain and this was also shown to confer resistance to P. falciparum infection 35 . One potential reason postulated for the absence of Wolbachia in Anopheles species was thought to be the presence of other bacteria, particularly from the genus Asaia 36 . This acetic acid bacterium is stably associated with several Anopheles species and is often the dominant species in the mosquito microbiota 37 . In laboratory studies, Asaia has been shown to impede the vertical transmission of Wolbachia in Anopheles 36 and was shown to have a negative correlation with Wolbachia in mosquito reproductive tissues 38 .
Recently, resident Wolbachia strains (those naturally present in wild insect populations) have been discovered in the An. gambiae s.l. complex, which consists of multiple morphologically indistinguishable species including several major malaria vector species. Wolbachia strains (collectively named wAnga) were found in An. gambiae s.l. populations in Burkina Faso 39 and Mali 40 , suggesting that Wolbachia may be more abundant in the An. gambiae complex across Sub-Saharan Africa. Globally, there is a large variety of Anopheles vector species (~70) that have the capacity to transmit malaria 41 and could potentially contain resident Wolbachia strains. Additionally, this number of malaria vector species may be an underestimate given that recent studies using molecular barcoding have also revealed a larger diversity of Anopheles species than would be identified using morphological identification alone 42,43 .
Investigating the prevalence and diversity of Wolbachia strains naturally present in Anopheles populations across diverse malaria endemic countries would allow a greater understanding of how this bacterium could be influencing malaria transmission in field populations and identify candidate strains for transinfection. In this study, we collected Anopheles mosquitoes from five malaria-endemic countries: Ghana, Democratic Republic of the Congo (DRC), Guinea, Uganda and Madagascar, from 2013-2017. Wild-caught adult female Anopheles were screened for P. falciparum malaria parasites, Wolbachia and Asaia bacteria. In total, we analysed mosquitoes from 17 Anopheles species that are known malaria vectors or implicated in transmission, and some unidentified species, discovering five species of Anopheles with resident Wolbachia strains: An. coluzzii from Ghana, and An. gambiae s.s., An. arabiensis, An. moucheti and An. species A from DRC. Using Wolbachia gene sequencing, including multilocus sequence typing (MLST), we show that the resident strains in these malaria vectors are diverse, novel strains. Quantitative PCR (qPCR) and 16S rRNA amplicon sequencing data suggest that the strains in An. moucheti and An. species A are higher density infections compared to the strains found in the An. gambiae s.l. complex. We found no evidence for Wolbachia-Asaia co-infections, or for either bacterium having any significant effect on the prevalence of Plasmodium in wild mosquito populations.

Amendments from Version 1
This revised version contains modifications to Table 1 & Table 2 and Figure 1 & Figure 7 to provide greater clarity on these data sets. We have highlighted how our study was undertaken across diverse malaria endemic countries beyond West Africa and the revised manuscript contains minor editing (including the addition of primer sequences) that was suggested by the reviewers. In addition, we have modified our discussion on the correlation between Plasmodium and Wolbachia prevalence in An. gambiae s.s. to provide a more balanced viewpoint on our data.

Study sites & collection methods
Anopheles adult mosquitoes were collected from five malaria-endemic countries in Sub-Saharan Africa (Guinea, Democratic Republic of the Congo (DRC), Ghana, Uganda and Madagascar) between 2013 and 2017 (Figure 1).

DNA extraction and mosquito species identification
DNA was extracted from individual whole mosquitoes or abdomens using QIAGEN DNeasy Blood and Tissue Kits according to manufacturer's instructions. DNA extracts were eluted in a final volume of 100 μl and stored at −20°C. Mosquito species identification was initially undertaken using morphological keys followed by diagnostic species-specific PCR assays to distinguish between the morphologically indistinguishable sibling mosquito species of the An. gambiae 45-47 and An. funestus complexes 48. To determine species identification for samples of interest and for samples that could not be identified by species-specific PCR, Sanger sequences were generated from ITS2 PCR products 49.

Detection of P. falciparum and Asaia
Detection of P. falciparum malaria was undertaken using qPCR targeting a 120-bp sequence of the P. falciparum cytochrome c oxidase subunit 1 (Cox1) mitochondrial gene using primers 5'-TTACATCAGGAATGTTATTGC-3' and 5'-ATATTGGATCTCCTGCAAAT-3' 50. Positive controls from gDNA extracted from a cultured P. falciparum-infected blood sample (parasitaemia of ~10%) were serially diluted to determine the threshold limit of detection, in addition to the inclusion of no template controls (NTCs). Asaia detection was undertaken targeting the 16S rRNA gene using primers Asafor: 5'-GCGCGTAGGCGGTTTACAC-3' and Asarev: 5'-AGCGTCAGTAATGAGCCAGGTT-3' 37,51. Ct values for both P. falciparum and Asaia assays in selected An. gambiae extracts were normalized to Ct values for a single copy An. gambiae rps17 housekeeping gene using primers 5'-GACGAAACCACTGCGTAACA-3' and 5'-TGCTCCAGTGCTGAAACATC-3' (accession no. AGAP004887 on www.vectorbase.org) 52,53. As Ct values are inversely related to the amount of amplified DNA, a higher target gene Ct:host gene Ct ratio represented a lower estimated infection level. qPCR reactions were prepared using 5 μl of FastStart SYBR Green Master mix (Roche Diagnostics), a final concentration of 1 μM of each primer, 1 μl of PCR grade water and 2 μl template DNA, to a final reaction volume of 10 μl. Prepared reactions were run on a Roche LightCycler® 96 System and amplification was followed by a dissociation curve (95°C for 10 seconds, 65°C for 60 seconds and 97°C for 1 second) to ensure the correct target sequence was being amplified. PCR results were analysed using the LightCycler® 96 software (Roche Diagnostics). A sub-selection of PCR products from each assay was sequenced to confirm correct amplification of the target gene fragment.
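
The Ct normalisation described above can be illustrated with a minimal sketch. The function name and the Ct values below are hypothetical and chosen only for illustration; the sketch assumes the normalised value is simply the target gene Ct divided by the host rps17 Ct, as the text describes.

```python
def ct_ratio(target_ct: float, host_ct: float) -> float:
    """Return the target-gene Ct divided by the host-gene (rps17) Ct.

    A higher ratio corresponds to a lower estimated infection level,
    because Ct is inversely related to the amount of template.
    """
    if target_ct <= 0 or host_ct <= 0:
        raise ValueError("Ct values must be positive")
    return target_ct / host_ct


# Hypothetical (target Ct, host Ct) pairs, not data from the study.
samples = {
    "mosquito_A": (24.0, 20.0),
    "mosquito_B": (32.0, 20.0),
}

ratios = {name: ct_ratio(t, h) for name, (t, h) in samples.items()}

# Rank samples from highest to lowest estimated infection level
# (lowest ratio first).
ranked = sorted(ratios, key=ratios.get)
```

With these illustrative values, mosquito_A has the lower ratio and therefore the higher estimated infection level.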

Wolbachia detection
Wolbachia detection was first undertaken targeting three conserved Wolbachia genes previously shown to amplify a wide diversity of strains: the 16S rRNA gene using primers W-Spec-16S-F: 5'-CATACCTATTCGAAGGGATA-3' and W-Spec-16S-R: 5'-AGCTTCGAGTGAAACCAATTC-3' 40,54, the Wolbachia surface protein (wsp) gene using primers wsp81F: 5'-TGGTCCAATAAGTGATGAAGAAAC-3' and wsp691R: 5'-AAAAATTAAACGCTACTCCA-3' 55 and the ftsZ cell cycle gene using primers ftsZqPCR F: 5'-GCATTGCAGAGCTTGGACTT-3' and ftsZqPCR R: 5'-TCTTCTCCTTCTGCCTCTCC-3' 56. DNA extracted from a Drosophila melanogaster fly (infected with the wMel strain of Wolbachia) was used as a positive control, in addition to no template controls (NTCs). The 16S rRNA 54 and wsp 55 gene PCR reactions were carried out in a Bio-Rad T100 Thermal Cycler using standard cycling conditions and PCR products were separated and visualised using 2% E-Gel EX agarose gels (Invitrogen) with SYBR safe and an Invitrogen E-Gel iBase Real-Time Transilluminator. ftsZ 56 and 16S rRNA 40 gene real time PCR reactions were prepared using 5 μl of FastStart SYBR Green Master mix (Roche Diagnostics), a final concentration of 1 μM of each primer, 1 μl of PCR grade water and 2 μl template DNA, to a final reaction volume of 10 μl. Prepared reactions were run on a Roche LightCycler® 96 System for 15 minutes at 95°C, followed by 40 cycles of 95°C for 15 seconds and 58°C for 30 seconds. Amplification was followed by a dissociation curve (95°C for 10 seconds, 65°C for 60 seconds and 97°C for 1 second) to ensure the correct target sequence was being amplified. PCR results were analysed using the LightCycler® 96 software (Roche Diagnostics). To estimate Wolbachia densities across multiple Anopheles mosquito species, ftsZ and 16S qPCR Ct values were compared to total dsDNA extracted, measured using an Invitrogen Qubit 4 fluorometer. A serial dilution series of a known Wolbachia-infected mosquito DNA extract was used to correlate Ct values and amount of amplified target product.
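
One common way to relate Ct values to template amount from a serial dilution is a log-linear standard curve. The sketch below is an assumption about how such a correlation could be computed, not the authors' procedure; the dilution factors and Ct values are hypothetical.

```python
import math


def fit_standard_curve(quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency implied by the slope (1.0 means perfect doubling per cycle)."""
    return 10 ** (-1.0 / slope) - 1.0


def relative_quantity(ct, slope, intercept):
    """Invert the standard curve to estimate quantity relative to the neat extract."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical ten-fold dilution series (relative concentration) and Ct values.
dilutions = [1.0, 0.1, 0.01, 0.001]
observed_cts = [20.0, 23.3, 26.6, 29.9]

slope, intercept = fit_standard_curve(dilutions, observed_cts)
efficiency = amplification_efficiency(slope)
```

A slope near -3.32 corresponds to a doubling of product each cycle; the illustrative values here give a slope of -3.3.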

Wolbachia multilocus sequence typing (MLST)
MLST was undertaken to characterize Wolbachia strains using the sequences of five conserved genes as molecular markers to genotype each strain. In brief, 450-500 base pair fragments of the gatB, coxA, hcpA, ftsZ and fbpA Wolbachia genes were amplified from individual Wolbachia-infected mosquitoes using previously optimised protocols 57. Primers used were as follows: gatB_F1: 5'-GAKTTAAAYCGYGCAGGBGTT-3', gatB_R1: 5'-TGGYAAYTCRGGYAAAGATGA-3', coxA_F1: 5'-TTGGRGCRATYAACTTTATAG-3', coxA_R1: 5'-CTAAAGACTTTKACRCCAGT-3', hcpA_F1: 5'-GAAATARCAGTTGCTGCAAA-3', hcpA_R1: 5'-GAAAGTYRAGCAAGYTCTG-3', ftsZ_F1: 5'-ATYATGGARCATATAAARGATAG-3', ftsZ_R1: 5'-TCRAGYAATGGATTRGATAT-3', fbpA_F1: 5'-GCTGCTCCRCTTGGYWTGAT-3' and fbpA_R1: 5'-CCRCCAGARAAAAYYACTATTC-3'. A Cx. pipiens gDNA extraction (previously shown to be infected with the wPip strain of Wolbachia) was used as a positive control for each PCR run, in addition to no template controls (NTCs). If initial amplification with these primers was unsuccessful, the PCR was repeated using the standard primers but with the addition of M13 adaptors. If no amplification was detected using standard primers, further PCR analysis was undertaken using degenerate primer sets, with or without M13 adaptors, which for the hcpA gene of wAnga-Ghana allowed improved amplification (using hcpA_F3: 5'-ATTAGAGAAATARCAGTTGCTGC-3', hcpA_R3: 5'-CATGAAAGACGAGCAARYTCTGG-3' (no M13 adaptors)) 57. PCR products were separated and visualised using 2% E-Gel EX agarose gels (Invitrogen) with SYBR safe and an Invitrogen E-Gel iBase Real-Time Transilluminator. PCR products were submitted to Source BioScience (Source BioScience Plc, Nottingham, UK) for PCR reaction clean-up, followed by Sanger sequencing to generate both forward and reverse reads. Where PCR primers included M13 adaptors, just the M13 primers alone (M13_adaptor_F: 5'-TGTAAAACGACGGCCAGT-3' and M13_adaptor_R: 5'-CAGGAAACAGCTATGACC-3') were used for sequencing, otherwise the same primers as utilised for PCR were used. Sequencing analysis was carried out in MEGA7 58 as follows. Both chromatograms (forward and reverse traces) from each sample were manually checked, edited, and trimmed as required, followed by alignment with ClustalW and checking to produce consensus sequences. Consensus sequences were used to perform nucleotide BLAST (NCBI) database queries, and searches against the Wolbachia MLST database 59. If a sequence produced an exact match in the MLST database we assigned the appropriate allele number, otherwise we obtained a new allele number for each novel gene locus sequence through submission of the FASTA and raw trace files on the Wolbachia MLST website for new allele assignment and inclusion within the database. Full consensus sequences were also submitted to GenBank and assigned accession numbers. The Sanger sequencing traces from the wsp gene were also treated in the same way and analysed alongside the MLST gene locus scheme, as an additional marker for strain typing.
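
Several of the MLST primers listed above contain IUPAC ambiguity codes (for example K, Y, R and B), meaning each primer is a pool of related sequences. The sketch below, offered only as an illustration, counts and enumerates the concrete sequences represented by one such primer. The helper names are hypothetical; the primer sequence is taken from the text.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list:
    """Enumerate every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


gatB_F1 = "GAKTTAAAYCGYGCAGGBGTT"   # from the primer list above
hcpA_F1 = "GAAATARCAGTTGCTGCAAA"    # from the primer list above

gatB_variants = expand(gatB_F1)
```

Here gatB_F1 carries one K, two Y and one B position, so it represents 2 x 2 x 2 x 3 = 24 sequences, whereas hcpA_F1 has a single R position and represents two.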

Phylogenetic analysis
Alignments were constructed in MEGA7 by ClustalW to include all relevant and available sequences highlighted through searches on the BLAST and Wolbachia MLST databases. Maximum Likelihood phylogenetic trees were constructed from Sanger sequences as follows. The evolutionary history was inferred by using the Maximum Likelihood method based on the Tamura-Nei model 60 . The tree with the highest log likelihood in each case is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. The trees are drawn to scale, with branch lengths measured in the number of substitutions per site. Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated. The phylogeny test was by Bootstrap method with 1000 replications. Evolutionary analyses were conducted in MEGA7 58 .

Microbiome analysis
The microbiomes of selected individual Anopheles were analysed using barcoded high-throughput amplicon sequencing of the bacterial 16S rRNA gene. Sequencing libraries for each isolate were generated using universal 16S rRNA V3-V4 region primers 61 in accordance with Illumina 16S rRNA metagenomic sequencing library protocols. The samples were barcoded for multiplexing using Nextera XT Index Kit v2. Sequencing was performed on an Illumina MiSeq instrument using a MiSeq Reagent Kit v2 (500-cycles). Quality control and taxonomical assignment of the resultant reads were performed using CLC Genomics Workbench 8.0.1 Microbial Genomics Module. Low quality reads containing nucleotides with quality threshold below 0.05 (using the modified Richard Mott algorithm), as well as reads with two or more unknown nucleotides were removed from analysis. Additionally, reads were trimmed to remove sequenced Nextera adapters. Reference-based operational taxonomic unit (OTU) picking was performed using the SILVA SSU v128 97% database 62 . Sequences present in more than one copy but not clustered to the database were then placed into de novo OTUs (97% similarity) and aligned against the reference database with 80% similarity threshold to assign the "closest" taxonomical name where possible. Chimeras were removed from the dataset if the absolute crossover cost was 3 using a k-mer size of 6. Alpha diversity was measured using Shannon entropy (OTU level).
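
As an illustration of the alpha diversity measure named above, the following sketch computes Shannon entropy from a list of OTU read counts. The logarithm base (2 here) and the example counts are assumptions for illustration; the text does not state which base the analysis software uses.

```python
import math


def shannon_entropy(otu_counts, base=2.0):
    """Shannon entropy H = -sum(p_i * log(p_i)) over OTU relative abundances."""
    total = sum(otu_counts)
    if total <= 0:
        raise ValueError("at least one read is required")
    entropy = 0.0
    for count in otu_counts:
        if count > 0:  # OTUs with zero reads contribute nothing
            p = count / total
            entropy -= p * math.log(p) / math.log(base)
    return entropy


# Hypothetical read counts per OTU, not data from the study.
even_community = [25, 25, 25, 25]      # four equally abundant OTUs
dominated_community = [97, 1, 1, 1]    # one OTU dominates the microbiome

h_even = shannon_entropy(even_community)
h_dominated = shannon_entropy(dominated_community)
```

A community dominated by a single OTU gives a much lower entropy than an even community with the same number of OTUs.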

Statistical analysis
Fisher's exact post hoc test in GraphPad Prism 7 was used to compare infection rates. Normalised qPCR Ct ratios were compared using unpaired t-tests in GraphPad Prism 7.
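
For readers unfamiliar with Fisher's exact test, the sketch below shows one standard way to compute a two-sided p-value for a 2x2 table of infected and uninfected counts, by summing the probabilities of all tables that are no more likely than the observed one. This is an independent illustration, not a reproduction of GraphPad Prism's implementation, and the counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. two collection sites); columns are
    infected / uninfected counts. Margins are held fixed.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def table_probability(x: int) -> float:
        # Hypergeometric probability of x infected individuals in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = 0.0
    for x in range(lowest, highest + 1):
        p = table_probability(x)
        # Small relative tolerance guards against floating-point ties.
        if p <= p_observed * (1 + 1e-7):
            p_value += p
    return min(1.0, p_value)


# Hypothetical counts: 3 infected / 1 uninfected versus 1 infected / 3 uninfected.
p_example = fisher_exact_two_sided(3, 1, 1, 3)
```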

Mosquito species and resident Wolbachia strains
Anopheles species composition varied depending on country and mosquito collection sites (Table 1). We detected Wolbachia in An. coluzzii mosquitoes from Ghana (prevalence of 4%; termed wAnga-Ghana) and An. gambiae s.s. from all six collection sites in DRC (prevalence range of 8-24%) in addition to a single infected An. arabiensis from Kalemie in DRC (Figure 1 and Table 1). The molecular phylogeny of the ITS2 gene of Anopheles gambiae s.l. complex individuals (including both Wolbachia-infected and uninfected individuals analysed in our study) confirmed molecular species identifications made using species-specific PCR assays (Figure 2). Novel resident Wolbachia infections were detected in two additional Anopheles species from DRC: An. moucheti (termed wAnM) from Mikalayi, and An. species A (termed wAnsA) from Katana. Additionally, we screened adult female mosquitoes of An. species A (collected as larvae and adults) from Lwiro, a village near Katana in DRC, and detected Wolbachia in 30/33 (91%), indicating this resident wAnsA strain has a high infection prevalence in populations in this region. The molecular phylogeny of the ITS2 gene revealed Wolbachia-infected individuals from Lwiro and Katana are the same An. species A (Figure 3) previously collected in Eastern Zambia 43 and Western Kenya 63. All ITS2 sequences were deposited in GenBank (accession numbers MH598414-MH598445; listed in Supplementary Table 1).

Wolbachia strain typing
Phylogenetic analysis of the 16S rRNA gene demonstrated that the 16S sequences for these strains cluster with other Supergroup B strains such as wPip (99-100% nucleotide identity) (Figure 4a). When compared to the resident Wolbachia strains in An. gambiae s.l. populations from Mali 40 and Burkina Faso 39, wAnga-Ghana is more closely related to the Supergroup B strain of wAnga from Burkina Faso. Although a resident strain was detected in An. gambiae s.s. and a single An. arabiensis from DRC through amplification of 16S rRNA fragments using two independent PCR assays 40,54, we were unable to obtain 16S sequences of sufficient quality to allow further analysis. The Wolbachia wsp gene has been evolving at a faster rate and provides more [...]

Table 1 (footnote). *Adult individuals from Lwiro (Katana), DRC were collected as both larvae and adults so have been excluded from P. falciparum and Asaia prevalence analysis (NT; Not tested).

MLST was undertaken to provide more accurate strain phylogenies. This was done for the novel Wolbachia strains wAnM and wAnsA in addition to the resident wAnga-Ghana strain in An. coluzzii from Ghana. We were unable to amplify any of the five MLST genes from Wolbachia-infected An. gambiae s.s. and An. arabiensis from DRC (likely due to low infection densities). New alleles for all five MLST gene loci (sequences differed from those currently present in the MLST database) and novel allelic profiles confirm the diversity of these novel Wolbachia strains (Table 2). The phylogeny of these three novel strains based on concatenated sequences of all five MLST gene loci confirms they cluster within Supergroup B (Figure 5a). This also demonstrates the novelty as comparison with a wide range of strains (including all isolates highlighted through partial matching during typing of each locus) shows these strains are distinct from currently available sequences (Figure 5a and Table 2). The concatenated phylogeny indicates that wAnM is most closely related to a Hemiptera strain: Isolate number 1616 found in Bemisia tabaci in Uganda, and a Coleoptera strain: Isolate number 20 found in Tribolium confusum. Concatenation of the MLST loci also indicates wAnsA is closest to a group containing various Lepidoptera and Hymenoptera strains from multiple countries in Asia, Europe and America, as well as two mosquito strains: Isolate numbers 1830 and 1831, found in Aedes cinereus and Coquillettidia richiardii in Russia. This highlights the lack of concordance between Wolbachia strain phylogeny and their insect hosts across diverse geographical regions.
We also found evidence of potential strain variants in wAnsA through variable MLST gene fragment amplification and resulting closest-match allele numbers. A second wAnsA-infected sample from Katana, An. sp. A/1 (W+) DRC-KAT2, only successfully amplified hcpA and coxA gene fragments and although identical sequences were obtained for wsp ( Figure 4b) and hcpA, genetic diversity was seen in the coxA sequences, with typing indicating a different, but still novel allele for the coxA sequence from this individual (wAnsA(2) coxA DRC-KAT2) (Figure 5b). Further analysis of the coxA sequence as part of MLST allele submission from this variant suggested the possibility of a double infection, where two differing strains of Wolbachia are present. MLST gene fragment amplification was also variable for wAnga-Ghana-infected An. coluzzii, requiring two individuals to generate the five MLST gene sequences, and for the hcpA locus, more degenerate primers (hcpA_F3/hcpA_R3) were required to generate sequence of sufficient quality for analysis. This is likely due to the low density of this strain potentially influencing the ability to successfully amplify all MLST genes, in addition to the possibility of genetic variation in primer binding regions. Despite the sequences generated for this strain producing exact matches with alleles in the database for each of the five gene loci, the resultant allelic profile, and therefore strain type, did not produce a match, showing this wAnga-Ghana strain is also a novel strain type. The closest matches to the wAnga-Ghana allelic profile were with strains from two Lepidopteran species: Isolate number 609 found in Fabriciana adippe from Russia, and Isolate number 658 found in Pammene fasciana from Greece,  but each of these only produced a match for three out of the five loci. 
The concatenated phylogeny for this strain (Figure 5a) indicates that across the 5 MLST loci, wAnga-Ghana is actually most closely related to a Lepidopteran strain found in Thersamonia thersamon in Russia (Isolate number 132). The phylogeny of Wolbachia strains based on the coxA gene (Figure 5b) highlights the genetic diversity of both the wAnsA strain variants and also wAnga-Ghana, compared to the wAnga-Mali strain 40 ; coxA gene sequences are not available for wAnga strains from Burkina Faso 39 . All Wolbachia MLST sequences were deposited into GenBank (accession numbers MH605286-MH605305; listed in Supplementary Table 3).
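
The idea of an allelic profile that matches individual alleles but not any complete strain type can be illustrated with a small sketch. All allele numbers and isolate names below are hypothetical placeholders, not values from the Wolbachia MLST database, and the function names are introduced only for this example.

```python
LOCI = ("gatB", "coxA", "hcpA", "ftsZ", "fbpA")


def shared_loci(query: dict, reference: dict) -> list:
    """Loci at which two allelic profiles carry the same allele number."""
    return [locus for locus in LOCI if query[locus] == reference[locus]]


def rank_matches(query: dict, database: dict) -> list:
    """Rank database isolates by number of shared loci (best first)."""
    scored = [
        (len(shared_loci(query, profile)), name)
        for name, profile in database.items()
    ]
    return sorted(scored, key=lambda item: (-item[0], item[1]))


# Hypothetical allele numbers for illustration only.
query_profile = {"gatB": 1, "coxA": 2, "hcpA": 3, "ftsZ": 4, "fbpA": 5}
database = {
    "isolate_X": {"gatB": 1, "coxA": 2, "hcpA": 3, "ftsZ": 9, "fbpA": 9},
    "isolate_Y": {"gatB": 7, "coxA": 7, "hcpA": 7, "ftsZ": 7, "fbpA": 7},
}

ranking = rank_matches(query_profile, database)
is_novel_profile = all(score < len(LOCI) for score, _ in ranking)
```

In this toy case the best database match shares three of five loci, so the query profile would be treated as a novel strain type even if each of its alleles were individually known.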

Resident strain densities and relative abundance
The relative densities of Wolbachia strains were estimated using qPCR targeting the ftsZ 56 and 16S rRNA 40 genes. qPCR analysis of ftsZ and 16S rRNA indicated the amount of Wolbachia detected in wAnsA-infected and wAnM-infected females was approximately three orders of magnitude higher (Ct values 20-22) than in Wolbachia-infected An. gambiae s.s., An. arabiensis and wAnga-Ghana-infected An. coluzzii (Ct values 30-33). To account for variation in mosquito body size and DNA extraction efficiency, we compared the total amount of DNA for Wolbachia-infected mosquito extracts. Conversely, we found less total DNA in the wAnsA-infected extract (1.36 ng/μl) and wAnM-infected extracts (5.85 ng/μl) compared to the mean of 6.64 ± 2.33 ng/μl for wAnga-Ghana-infected An. coluzzii. To estimate the relative abundance of resident Wolbachia strains in comparison to other bacterial species, we sequenced the bacterial microbiome using 16S rRNA amplicon sequencing on Wolbachia-infected individuals. We found wAnsA, wAnsA(2) and wAnM Wolbachia strains were the dominant OTUs of these mosquito species (Figure 6). In contrast, the lower-density infection wAnga-Ghana strain represented only ~10% of the OTUs within the microbiome.
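
The link between a Ct difference of roughly ten cycles and a difference of about three orders of magnitude can be checked with a short calculation. The sketch assumes perfect doubling per cycle (efficiency of 1.0), which is an idealisation, and uses illustrative Ct values within the reported ranges.

```python
import math


def fold_difference(ct_low_density: float, ct_high_density: float,
                    efficiency: float = 1.0) -> float:
    """Approximate fold difference in template implied by two Ct values.

    Assumes each cycle multiplies product by (1 + efficiency);
    efficiency = 1.0 is ideal doubling.
    """
    delta_ct = ct_low_density - ct_high_density
    return (1.0 + efficiency) ** delta_ct


# A ten-cycle gap, e.g. Ct 30 versus Ct 20.
ten_cycle_fold = fold_difference(30.0, 20.0)

# Midpoints of the reported ranges (30-33 and 20-22), for illustration.
midpoint_fold = fold_difference(31.5, 21.0)
orders_of_magnitude = math.log10(midpoint_fold)
```

Under these assumptions a ten-cycle gap corresponds to a 1024-fold difference, and the midpoint comparison gives a little over three orders of magnitude.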

P. falciparum, Wolbachia and Asaia prevalence
The prevalence of P. falciparum in female mosquitoes was extremely variable across countries and collection locations (Figure 1 and [...]). For all Wolbachia-infected females collected in our study (including An. coluzzii from Ghana and novel resident strains in An. moucheti and An. species A), we did not detect the presence of Asaia. No resident Wolbachia strain infections were detected [...] (Figure 1). Interestingly, 2/5 of these individuals from Kissidougou (Guinea) were P. falciparum-infected compared to 3/5 individuals from Uganda. To determine if the presence of Asaia had a quantifiable effect on the level of P. falciparum detected, we normalized P. falciparum Ct values from qPCR (n = 61) (Supplementary Figure 2a) and compared gene ratios for An. gambiae s.s. mosquitoes from Guinea, with or without Asaia (Supplementary Figure 2b). Statistical analysis using Student's t-tests revealed no significant difference between normalized P. falciparum gene ratios between the Asaia positive (n = 33) and negative (n = 28) groups (p = 0.51, df = 59). Larger variation of Ct values was seen for Asaia (n = 90) (Supplementary Figure 2c) suggesting the bacterial densities in individual mosquitoes were more variable than P. falciparum parasite infection levels.

Figure 6. The relative abundance of resident Wolbachia strains in Anopheles. Bacterial genus level taxonomy was assigned to operational taxonomic units clustered with a 97% cut-off using the SILVA SSU v128 97% database, and individual genera comprising less than 1% of total abundance were merged into "Others".

[...] comparing two locations with contrasting Asaia infection densities. Bacterial genus level taxonomy was assigned to operational taxonomic units clustered with a 97% cut-off using the SILVA SSU v128 97% database, and individual genera comprising less than 1% of total abundance were merged into "Others".

The variability of Wolbachia prevalence rates in the An. gambiae complex from locations within DRC and Ghana and previous studies in Burkina Faso 39 and Mali 40 suggest the environment is one factor that influences the presence or absence of resident strains. In our study we found no evidence of Wolbachia-Asaia co-infections across all countries, supporting laboratory studies that have shown these two bacterial species demonstrate competitive exclusion in Anopheles species 36,38. We also found that Asaia infection densities (whole body mosquitoes) were variable and location dependent which would correlate with this bacterium being environmentally acquired at all life stages, but also having the potential for both vertical and horizontal transmission 37. Significant variations in overall Asaia prevalence and density across different Anopheles species and locations in our study would also correlate with our data indicating no evidence of an association with P. falciparum prevalence in both Guinea and Uganda populations. Further studies are needed to determine the complex interaction between these two bacterial species and malaria in diverse Anopheles malaria vector species. Horizontal transfer of Wolbachia strains between species (even over large phylogenetic differences) has shaped the evolutionary history of this endosymbiont in insects, and there is evidence for loss of infection in host lineages over evolutionary time 79. Our results showing a novel strain present in An. coluzzii from Ghana (phylogenetically different to strains present in An. gambiae s.l. mosquitoes from both Burkina Faso and Mali), strain variants observed in An. species A, and the concatenated grouping of the novel Anopheles strains with strains found in different Orders of insects, support the lack of congruence between insect host and Wolbachia strain phylogenies 80.
Our qPCR and 16S microbiome analyses indicate the densities of wAnM and wAnsA strains are significantly higher than resident Wolbachia strains in An. gambiae s.l. However, caution must be taken as we were only able to analyse selected individuals, and larger collections of wild populations would be required to confirm these results. Native Wolbachia strains dominating the microbiome of An. species A and An. moucheti is consistent with other studies of resident strains in mosquitoes showing Wolbachia 16S rRNA gene amplicons vastly outnumber sequences from other bacteria in Ae. albopictus and Cx. quinquefasciatus 81,82.

The discovery of novel Wolbachia strains provides the rationale to undertake vector competence experiments to determine what effect these strains are having on malaria transmission. The tissue tropism of novel Wolbachia strains in malaria vectors will be particularly important to characterise given this will determine if these endosymbiotic bacteria are proximal to malaria parasites within the mosquito. It would also be important to determine the additional phenotypic effects novel resident Wolbachia strains have on their mosquito hosts. Some Wolbachia strains induce a reproductive phenotype termed cytoplasmic incompatibility (CI) that results in inviable offspring when an uninfected female mates with a Wolbachia-infected male. In contrast, Wolbachia-infected females produce viable progeny when they mate with both infected and uninfected male mosquitoes. This reproductive advantage over uninfected females allows Wolbachia to spread within mosquito populations.

Conclusions
Wolbachia has been the focus of recent biocontrol strategies in which Wolbachia strains transferred into naïve mosquito species provide strong inhibitory effects on arboviruses 16,18-20,83,84 and malaria parasites 31,35. The discovery of two novel Wolbachia strains in Anopheles mosquitoes that are potentially present at much higher density than resident strains in the An. gambiae complex also suggests the potential for these strains to be transinfected into other Anopheles species to produce inhibitory effects on Plasmodium parasites. Wolbachia transinfection success is partly attributed to the relatedness of donor and recipient host, so the transfer of high density Wolbachia strains between Anopheles species may result in stable infections (or co-infections) that have strong inhibitory effects on Plasmodium development. Finally, if the resident strain present in An. moucheti is at low infection frequencies in wild populations, an alternative strategy known as the incompatible insect technique (IIT) could be implemented where Wolbachia-infected males are released to suppress the wild populations through CI (reviewed by 22). In summary, the important discovery of diverse novel Wolbachia strains in Anopheles species will help our understanding of how Wolbachia strains can potentially impact malaria transmission, through natural associations or being used as candidate strains for transinfection to create stable infections in other species.

Our main comments have been addressed in the revised manuscript. We still don't understand what An. species A is (and even An. species O). It is not referenced in Vectorbase, so is it just a species that the authors were not able to identify, or a recently identified species that we are not aware of? A short explanation on this would be helpful, especially as Wolbachia was specifically found at high prevalence in this mosquito species.

ITS2 GenBank accession numbers are listed in Supplementary
Except for this small point, we think that the manuscript is sound and clear, and that conclusions are drawn adequately.
Competing Interests: No competing interests were disclosed.

We have read this submission. We believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

General comment
In the Methods section the authors say DNA was extracted from whole mosquitoes or abdomens for their analysis. What are the chances that the Wolbachia infection cases reported in the paper could be due to parasites contained in the blood meal rather than true infection of mosquitoes?

Study sites & collection methods
"Democratic Republic of the Congo": change to "Democratic Republic of Congo".
Collection sites: it would be helpful to indicate whether the coordinates are latitude North/South or longitude East/West, as the paper is also for non-specialists in the domain.
Figure 1B, C, D: the legend mentions "P. falciparum prevalence"; is this for humans or mosquitoes? Please be precise (% Positive ???, % Negative ??). P. falciparum should be in italics.
Figure 1A: It would be useful to indicate the names of the study sites. The authors could label the sites with a number for each country (1, 2, 3 …) and then state in the legend what each number stands for.
legend not clear. Figure 6: The legend is not clear (can't read anything).

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes

This research is timely. With the development of new pest control strategies using Wolbachia as a natural biological agent against the transmission of several vector-borne diseases in the field, it is important to have a comprehensive understanding of the diversity of the natural infections already present in the field, but also of the different factors that could affect the efficiency of such control programs, including the presence of competing natural infection by Asaia bacteria, for example. The study is well written and clear, with sufficient information included to support future potential replication. I think this is a fine contribution to the current literature; I have only minor comments to the authors.
It might be worth modifying the text in the abstract, and the introduction, to specify that the previous reports of Wolbachia in Anopheles were only from 2 West-African countries, while the current study is providing data from 5 countries across the Sub-Saharan region.
Method: Please provide information on how the maps of figure 1 were generated. Did you need any approval/licenses for using these maps?
Please provide information on collection permits, if any was needed from the different African countries.
What is CDC standing for in the method section? 'CDC-light trap'
In the Wolbachia detection method section: Edit typo: 'was used AS a positive control'
Table 1: What is the rationale for the authors to provide the information by countries rather than by species? Isn't the most interesting point of the paper about the infection being reported in additional species of Anopheles?
Figure 2: Explain the significance of the different square/circle/triangle shapes and filled vs empty shapes? Also state in the legend that the codes given are the GenBank accession numbers.
Figure 2: Where did you get the sequences from the An. bwambae and An. quadriannulatus? I think this info is missing from the method section.
P. falciparum, Wolbachia and Asaia prevalence section, paragraph 2: Does your analysis include the P. falciparum and Wolbachia infected specimens? Would it make any difference to remove the Wolbachia-infected specimens from the analysis?
Discussion section, end of 4th paragraph: 'New' strain in An. coluzzii from Ghana. 'New' sounds like the infection is more recent than any other infection found in this mosquito species, which the results are not supporting. Would 'unique' or 'different' be good enough?
Figure 7: Where is Asaia from Figure 7b? From the current picture it looks like Asaia is absent from those samples, although the text states that the infection is not a dominant species of those samples. If Asaia is included in the 'Others' maybe it is worth specifying it in the legend, otherwise it could be added as a particular section of the graph like in Figure 7a to ease comparison of the two panels.
Figure S1: Why are some of the circles slightly larger than others? Is it that different samples are overlapping?

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Partly

Are the conclusions drawn adequately supported by the results? Yes
No competing interests were disclosed.

infected with Wolbachia, confirming and widening the recent discovery of the presence of Wolbachia in Anopheles mosquitoes. This is the strongest point of the paper, as an independent confirmation is always welcome and as some populations of Anopheles are even found here to have a high prevalence of Wolbachia.
The authors also checked for the presence of Asaia sp. in the analysed mosquitoes, as this bacterium is thought to compete with Wolbachia in Anopheles. They did not find any mosquito co-infected by Asaia and Wolbachia. This is also an important finding as it corroborates studies performed in the laboratory, but this time with field-collected mosquitoes. They found that in mosquitoes coming from one population, Asaia was actually a dominant species, >99% of the microbiota. Figure 7a is not very clear as one expects the scale to go from 0 to 100%, therefore we suggest to use a discontinued axis to present these interesting results.
Finally the authors investigated the presence of Plasmodium in the studied mosquitoes, as Wolbachia is thought to interfere with some transmitted pathogens. This part is less convincing as the tests have been performed on DNA extraction from whole bodies or abdomens, while the presence of Plasmodium in head and thorax (or more specifically, in salivary glands) is a more suitable method to assess transmission potential. Moreover, the conclusions drawn on the interactions between Plasmodium and Wolbachia are not exactly clear. Considering that 10.16 + 1.56 = 11.72% mosquitoes are infected with Wolbachia and 11.72 + 1.56 = 13.28% are infected with Plasmodium, if there is no effect between Wolbachia and Plasmodium, you expect that 11.72% x 13.28% = 1.56% is infected by both. Surprisingly, this is exactly the result here. Biology is rarely so close to math, for so small numbers… The authors should thus state more clearly that their results favor no interactions, as confirmed by the p value which is very close to 1. On the contrary, the discussion currently suggests that the non significant correlation is due to small numbers. However, one cannot jump to conclusion on the inability of Wolbachia to interfere with Plasmodium, as these results have been performed on abdomens and whole bodies, therefore we do not know whether the co-infected mosquitoes had just blood fed (and/or carried early stages of Plasmodium).
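The reviewers' independence argument can be retraced with a few lines of arithmetic. The sketch below uses only the percentages quoted in the comment above; the variable names are illustrative, and it does not replace a formal test of association on the underlying counts.

```python
# Percentages quoted in the reviewers' comment.
wolbachia_only_pct = 10.16
plasmodium_only_pct = 11.72
coinfected_pct = 1.56

# Marginal prevalence of each infection (single infections plus co-infections).
wolbachia_pct = wolbachia_only_pct + coinfected_pct      # 11.72%
plasmodium_pct = plasmodium_only_pct + coinfected_pct    # 13.28%

# If the two infections were independent, the expected co-infection rate
# is the product of the two marginal prevalences.
expected_coinfected_pct = wolbachia_pct * plasmodium_pct / 100

print(f"Wolbachia prevalence:   {wolbachia_pct:.2f}%")
print(f"Plasmodium prevalence:  {plasmodium_pct:.2f}%")
print(f"Expected co-infection:  {expected_coinfected_pct:.2f}%")
print(f"Observed co-infection:  {coinfected_pct:.2f}%")
```

The expected value under independence matches the observed 1.56% to two decimal places, which is the coincidence the reviewers point out.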
To improve the clarity of the article, it would be interesting to have an additional figure or table summarizing the experimental set up, explaining which mosquitoes are included in which analysis and which Wolbachia strain is found in which mosquitoes.
We also have minor comments on the manuscript: The expression « resident strain » is not clear to us. 16S « rRNA » and « rDNA »: a consistent word may be used, rRNA seems more consensual. The total number of mosquitoes, of Wolbachia infected mosquitoes, of Asaia infected ones, etc would be interesting.
Page 3:
§2: Asaia is not an endosymbiont.
§3: « have » probably needs to be removed in « than would have be identified using morphological identification alone ».
§4 needs a first sentence identifying the gap of knowledge that the authors want to fill.
§5: Can the authors clearly state whether some mosquitoes had blood in their midgut?
Page 4:
Figure 1: scale should be in km, miles is not an SI unit.
§2: « DNA extraction and MOSQUITO species identification ». More generally, it is not always clear whether the authors speak about mosquitoes or Wolbachia strains.
§3: « as preliminary trials revealed this was the optimal method for both sensitivity and specificity »: please add « data not shown » or remove it.
Page 5:
Instead of µL of DNA, the actual quantity in ng would be preferable.
All PCRs: primer sequences are needed.
§3: « Both chromatograms (forward and reverse traces) from each sample WERE manually »
Pages 6-7:
Table 1: probably some mistakes, e.g. An. gambiae in Mikalayi: 11.8% corresponds neither to 1/16 nor to 2/16, so all the numbers should be checked. It would be appropriate to enter the actual numbers in brackets, and to indicate the co-prevalence of Wolbachia and Plasmodium. The legend should be grouped below or above the table and the explanation about mosquitoes in bold is unclear.
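The kind of consistency check suggested for Table 1 can be sketched as follows. The helper `matching_fractions` is hypothetical and only lists small counts whose percentage rounds to a reported value; it illustrates why 11.8% cannot come from a denominator of 16 and makes no claim about which counts the authors actually recorded.

```python
def matching_fractions(target_pct, max_n, decimals=1):
    """Return (k, n) pairs with n <= max_n whose percentage rounds to target_pct."""
    matches = []
    for n in range(1, max_n + 1):
        for k in range(0, n + 1):
            if round(100 * k / n, decimals) == target_pct:
                matches.append((k, n))
    return matches


reported_pct = 11.8  # value quoted for An. gambiae in Mikalayi

# Neither candidate named in the comment reproduces the reported percentage.
for k in (1, 2):
    print(f"{k}/16 = {100 * k / 16:.2f}%")

# Small sample sizes that would be consistent with the reported percentage.
candidates = matching_fractions(reported_pct, max_n=20)
print(candidates)
```

Running the same check over every row of a prevalence table is a quick way to flag entries where the stated percentage and sample size disagree.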
In the text, numbers would be interesting rather than only proportions. « previously named M molecular form OF AN. GAMBIAE » (or remove it, as this precision may now be superfluous). On the contrary, « An. species A » is barely introduced, it would be interesting to mention something about this species and its identification (besides the quick explanation in the introduction).
Page 13 « Approximately 1000-fold higher », it is very much of an approximation (variable Ct values and potential variations in 16S copy number): it may be good to rephrase, mentioning that 1000 is an order of magnitude rather than approximately. §2: « An. moucheti (wAnM-infected) » comes at the 2nd occurrence of wAnM.
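To see why a fold difference derived from Ct values is best read as an order of magnitude, the sketch below converts a Ct difference into a fold difference. It assumes the standard exponential amplification model with perfect doubling per cycle (efficiency = 1.0); whether the authors used exactly this conversion is an assumption, and the function names are illustrative.

```python
import math


def fold_difference(delta_ct, efficiency=1.0):
    """Fold difference in template implied by a Ct difference.

    Assumes each cycle multiplies the product by (1 + efficiency).
    """
    return (1.0 + efficiency) ** delta_ct


def delta_ct_for_fold(fold, efficiency=1.0):
    """Ct difference that corresponds to a given fold difference."""
    return math.log(fold) / math.log(1.0 + efficiency)


# A 1000-fold difference corresponds to roughly 10 cycles at perfect efficiency.
cycles = delta_ct_for_fold(1000)
print(f"1000-fold is about {cycles:.2f} cycles")

# A shift of one cycle either way halves or doubles the estimate.
for delta in (9, 10, 11):
    print(f"delta Ct = {delta}: {fold_difference(delta):.0f}-fold")
```

Because a single cycle of variation changes the estimate two-fold, and 16S copy number can also vary between bacteria, such a figure is more safely described as an order of magnitude.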

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Partly
Competing Interests: No competing interests were disclosed.

We have read this submission. We believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however we have significant reservations, as outlined above.

Dear Mathilde and Ottavia,
Firstly, many thanks for your thoughtful and comprehensive review of our manuscript. We have tried to address all your comments below in bold:

Figure 7a is not very clear as one expects the scale to go from 0 to 100%, therefore we suggest to use a discontinued axis to present these interesting results.

We agree and have modified this figure for clarity.

Finally the authors investigated the presence of Plasmodium in the studied mosquitoes, as Wolbachia is thought to interfere with some transmitted pathogens. This part is less convincing as the tests have been performed on DNA extraction from whole bodies or abdomens, while the presence of Plasmodium in head and thorax (or more specifically, in salivary glands) is a more suitable method to assess transmission potential. Moreover, the conclusions drawn on the interactions between Plasmodium and Wolbachia are not exactly clear. Considering that 10.16 + 1.56 = 11.72% mosquitoes are infected with Wolbachia and 11.72 + 1.56 = 13.28% are infected with Plasmodium, if there is no effect between Wolbachia and Plasmodium, you expect that 11.72% x 13.28% = 1.56% is infected by both. Surprisingly, this is exactly the result here. Biology is rarely so close to math, for so small numbers… The authors should thus state more clearly that their results favor no interactions, as confirmed by the p value which is very close to 1. On the contrary, the discussion currently suggests that the non significant correlation is due to small numbers. However, one cannot jump to conclusion on the inability of Wolbachia to interfere with Plasmodium, as these results have been performed on abdomens and whole bodies, therefore we do not know whether the co-infected mosquitoes had just blood fed (and/or carried early stages of Plasmodium).

We agree and have modified our discussion on these results to make more appropriate conclusions based on our data.

To improve the clarity of the article, it would be interesting to have an additional figure or table summarizing the experimental set up, explaining which mosquitoes are included in which analysis and which Wolbachia strain is found in which mosquitoes.

Many thanks for this suggestion. After careful consideration, we feel that an additional figure or table is not needed given we have figure 1 showing which Anopheles species were Wolbachia-infected and from which locations within countries, and have all the PCR screening data from all samples available from Open Science Framework: DOI 10.17605/OSF.IO/MW6XZ, in addition to sample details for all accession numbers in the supplementary tables.
However, we have also modified table 1 to provide the comparison between Plasmodium-infected, Wolbachia-infected, Asaia-infected, co-infected individuals and uninfected individuals across all collection sites.
We also have minor comments on the manuscript: The expression « resident strain » is not clear to us.

'Resident' Wolbachia strains are considered to have resulted naturally and have an evolutionary association with the host (wAlbA and wAlbB in Ae. albopictus) rather than having been generated artificially through transinfection (e.g. wMel in Ae. aegypti).
We have modified our introduction to make this clearer by the inclusion of 'those naturally present in wild insect populations'.

16S « rRNA » and « rDNA »: a consistent word may be used, rRNA seems more consensual. The total number of mosquitoes, of Wolbachia infected mosquitoes, of Asaia infected ones, etc would be interesting.

We agree and have modified table 1 to include the number of infected mosquitoes for all categories (including uninfected individuals).
Page 3: §2: Asaia is not an endosymbiont

We agree and have modified throughout the manuscript to reflect this mistake.

§3: « have » needs probably to be removed in « than would have be identified using morphological identification alone »

We agree and have corrected this sentence.

§4 needs a first sentence identifying the gap of knowledge that the authors want to fill

We agree and have added the following sentence: "Investigating the prevalence and diversity of Wolbachia strains naturally present in Anopheles populations across diverse malaria endemic countries would allow a greater understanding of how this bacterium could be influencing malaria transmission in field populations and provide candidate strains for transinfection"

§5: Can the authors clearly state whether some mosquitoes had blood in their midgut?

We did not fully determine the Sella score of the mosquitoes used in our study, so our collection likely contained individuals that had undigested blood. However, we have the following sentences in our discussion which we feel acknowledge the limitations of our study: "However, detection of P. falciparum parasites in whole body mosquitoes does not confirm that the species plays a significant role in transmission. Detection could represent infected bloodmeal stages or oocysts present in the midgut wall so further studies are warranted to determine this species ability to transmit human malaria parasites."

Page 4: Figure 1: scale should be in km, miles is not an SI unit

We have changed this to km.

§2: « DNA extraction and MOSQUITO species identification ». More generally, it is not always clear whether the authors speak about mosquitoes or Wolbachia strains.

We have added the word 'mosquito' prior to species identification for clarity.

§3: « as preliminary trials revealed this was the optimal method for both sensitivity and specificity »: please add « data not shown » or remove it

We have removed this as it has been shown before in multiple previous publications and is a well-established PCR assay for detection of Plasmodium.

Page 5: Instead of µL of DNA, the actual quantity in ng would be preferable.

Although we did measure total DNA for selected samples and normalised An. gambiae extracts to Ct values for a single-copy housekeeping gene (An. gambiae rps17), we did not do this for all species across all countries, so for consistency we feel µL of DNA is more representative of our work.

All PCRs: primer sequences are needed

We have added all primer sequences where appropriate.

§3: « Both chromatograms (forward and reverse traces) from each sample WERE manually »

We have corrected this grammatical error.

Pages 6-7 Table 1: probably some mistakes, e.g. An. gambiae in Mikalayi: 11.8% corresponds neither to 1/16 nor to 2/16, so all the numbers should be checked. It would be appropriate to enter the actual numbers in brackets, and to indicate the co-prevalence of Wolbachia and Plasmodium. The legend should be grouped below or above the table and the explanation about mosquitoes in bold is unclear.
In the text, numbers would be interesting rather than only proportions. « previously named M molecular form OF AN. GAMBIAE » (or remove it, as this precision may now be superfluous). On the contrary, « An. species A » is barely introduced, it would be interesting to mention something about this species and its identification (besides the quick explanation in the introduction).

We have modified table 1 for clarity, including numbers, and removed the reference to M and S forms. The legend format is according to WOR guidelines and we have modified the table legend for clarity.
Very little is known about An. species A, and what we were able to find on this species is presented in our discussion: "An. species A should be further investigated to determine if this species is a potential malaria vector, given our study demonstrated P. falciparum infection in one of two individuals screened and ELISA-positive samples of this species were reported from the Western Highlands of Kenya."

Page 13 « Approximately 1000-fold higher », it is very much of an approximation (variable Ct values and potential variations in 16S copy number): it may be good to rephrase, mentioning that 1000 is an order of magnitude rather than approximately.