Evidence of horizontal gene transfer within porB in 19 018 whole-genome Neisseria spp. isolates: a global phylogenetic analysis

The PorB porins are the major pore-forming proteins in the genus Neisseria . The trimeric PorB porins consist of 16 highly conserved transmembrane domains that form an amphipathic β-sheet connected by short periplasmic turns and eight extracellular hydrophilic loops. These loops are immunogenic and also play an important role in mediating antimicrobial influx. This study sought to (i) characterize the variations in Neisserial loop 3(355–438 bp) associated with intermediate resistance to penicillin/tetracycline and (ii) evaluate if there was evidence of horizontal gene transfer in any of the loops. We collated an integrated database consisting of 19 018 Neisseria spp. genomes – 17 882 Neisseria gonorrhoeae , 114 Neisseria meningitidis and 1022 commensal Neisseria spp. To identify the porB alleles, a gene-by-gene approach (chewBBACA) was employed. To evaluate the presence of recombination events, the Recombination Detection Programme (RDP4) was used. In total, 3885 porB alleles were detected. Paralogues were identified in 17 Neisseria isolates. Putative recombination was identified in loop regions. Intraspecies recombination among N. gonorrhoeae isolates and interspecies recombination between N. meningitidis and commensal Neisseria spp., and N. gonorrhoeae and N. lactamica were identified. Here, we present a large-scale study of 19 018 Neisseria isolates to describe recombination and variation in the porB gene. Importantly, we found putative recombination in loop regions between the pathogenic and non-pathogenic Neisseria spp. These findings suggest the need for pheno- and genotypic surveillance of antimicrobial susceptibility in commensal Neisseria spp. to prevent the emergence of AMR in the pathogenic Neisseria . This article contains data hosted by Microreact.

Porins are trimeric β-barrel membrane proteins composed of 35 kDa monomers with a 16-strand β-barrel fold and eight surfaceexposed, variable, hydrophilic loops [11,12]. Porins cross the cell membrane and act as a pore through which molecules, such as nutrients, toxins and antibiotics, can diffuse [13,14]. Neisserial porins not only serve as channels through which water and solutes of less than 1000 MW can diffuse through the outer membrane, but also play an active role in pathogenesis and antibiotic resistance [15,16]. Neisserial porins share sequence homology in the transmembrane domains, but the sequences of extracellular loops (1 through 8) have a high degree of variability between species and strains [17,18].
All Neisseria spp. express porins, and these porins are the most represented outer-membrane protein [19]. These porins play an important role in virulence, immune evasion and resistance [20]. Most Neisseria species express one porin (Por) [11]. N. meningitidis is exceptional because it contains two por loci and expresses two porins, PorA and PorB [21]. Whilst, N. gonorrhoeae has both porA and porB genes that encodes PIA and PIB, respectively; the porA gene is not expressed due to frameshift and promoter mutations [11,20,22]. Gonococcal porB exists in one of two allelic forms, porB1a and porB1b [23][24][25]. Gonococcal porB1a isolates have an increased capacity for invasion of the bloodstream and causing disseminated infections, while porB1b isolates may be more likely to result in ascending infection of fallopian tubes [26,27]. Alterations, modification and reduction in the expression of porins have all been associated with antimicrobial resistance (AMR) [19,20]. PenB (encodes altered forms of PIB) mutations coding for amino acid (aa) positions 120 and 121 (of loop 3 that forms the pore constriction zone of PIB) result in decreased membrane permeability and hence decreased uptake of cephalosporins, penicillins and tetracyclines [20,28,29]. These penB mutations are an important cause of reduced susceptibility to these classes of antimicrobials [20].
Neisserial porins likely play an important role in immune evasion [20]. Antigenic variation of the gonococcal porins remains one of the challenges of vaccine development [30][31][32]. Little is known about the antigenic diversity of the commensal Neisseria and the antigenic relationships between non-pathogenic and pathogenic Neisseria species with the exception of N. lactamica and N. meningitidis [33][34][35].
Previous studies in Escherichia coli and Yersinia spp. have found evidence of extensive intra-and interspecies recombination in the genes coding for the surface exposed regions of OmpF and other major porin proteins [43,44]. Less is known about the extent of HGT in the Neisserial porins and their role in antigenic variation and AMR. A limited number of studies have characterized HGT and antigenic variation in the genus Neisseria. Frequent intraspecies recombination among gonococcal housekeeping genes and hypervariable porB and tbpB have been described [45,46]. Recombination in a collection of 204 isolates collected in a Baltimore STI clinic over a decade showed a higher recombination rate in PIB than PIA and 13 housekeeping genes [47]. In the mature PorB protein, the mutability of 328 aa were examined and found that 308 aa were likely to be mutable and 20 aa were likely to be nonmutable [48]. One important paper found evidence of recombination in the surface loops of porB of N. meningitidis [49]. This analysis was, however, limited to 35 alleles of porB. Another phylogenetic analysis suggested evidence of horizontal gene transfer in the evolution of different porin classes and confirmed the close evolutionary relationships of the porins from N. meningitidis, N. gonorrhoeae, N. lactamica and N. polysaccharea [11].

Impact Statement
The PorB porins are the major pore-forming proteins in the genus Neisseria. They play an important role in regulating the entry of antimicrobials such as beta-lactams and tetracyclines into the cytoplasm. The extracellular loops are important immunogens targeted by anti-gonococcal vaccines, and mutations in loop 3 of PorB have been associated with intermediate resistance to penicillin/tetracycline. In this study, we used a collection of 19 018 Neisseria spp. genomes to characterize the variations in loop 3 and evaluate if there was evidence of horizontal gene transfer in any of the extracellular loops. We found evidence of recombination in the loop regions. Intraspecies recombination among N. gonorrhoeae isolates, and interspecies recombination between N. meningitidis and commensal Neisseria spp., and N. gonorrhoeae and N. lactamica were identified. These findings suggest that the Neisseria commensal species serve as a reservoir of genetic variation that can be taken up by the pathogenic Neisseria species. This has important implications for the development of vaccines targeting PorB and the emergence of antimicrobial resistance. As a result, it may be prudent to include genotypic surveillance of commensal Neisseria spp. in current surveillance programmes of pathogenic Neisseria.
Recombination analyses rely on contrasting evolutionary histories in different parts of the DNA sequences, leading to incongruence between gene trees at individual loci or mosaic structures within gene sequences or measurements of linkage disequilibrium between loci [50][51][52][53][54]. Recombination can be detected using multiple techniques [52,55]. In this study, we sought to detect recombination in porB using the Recombination Detection Programme (RDP4) and by inferring incongruence in the porB gene tree in a population of N. gonorrhoeae and other Neisseria isolates using a dataset of 19 018 globally sourced Neisseria isolates [52,56]. We determined the variations in loop 3(355-438 bp) at amino acid positions G120 and A121 associated with intermediate resistance to penicillin/tetracycline.

Genome collection
Whole-genome sequences (WGS) and metadata were downloaded from pubMLST [57] and pathogenwatch (https://pathogen. watch/ngonorrhoeae) from 86 countries in July 2022. This resulted in an integrated database containing WGS from 19 018 isolates, including 17 882 from N. gonorrhoeae, 114 N. meningitidis (the global data set [58], 625 N. lactamica and 397 other Neisseria spp. (When referring to species in plural spp. is used.) The numbers of isolates per species used in this study are listed in Table S1, available in the online version of this article. The quality of the genomes analysed using Quality Assessment Tool for Genome Assemblies (QUAST) [59] is provided in Table S2.

Allele calling
Allele calling analysis was carried out as described in [38,39] (Fig. S1). In brief, a study-specific schema was created. WGS from 19 018 isolates were analysed using chewBBACA software v.3.1.0 [60], followed by recombination analysis using the recombination detection programme, RDP4 v.4.100 [52]. For the chewBBACA analysis, a training dataset was created from the complete reference genome sequence of N. gonorrhoeae FA1090 using Prodigal v2.6.3 [61]. As the vast majority of genomes used in the study included gonococcal genomes, N. gonorrhoeae FA1090 was used as the reference genome. FA1090 was selected as the reference strain as it is a commonly used reference strain, has been used in the development of vaccine prototypes and carries antigen sequence types identical to the most broadly distributed antigen variants [30,62,63]. This was followed by creating a study-specific Neisseria scheme from 11 complete . Then the FASTA file for each coding sequence (CDS) was generated, followed by the creation of whole-genome (wg) multi-locus sequence typing (MLST) loci. The core-genome (cg) MLST loci with the threshold of 0.95, i.e. loci present in >95 % of the genomes, were then extracted from the wgMLST loci and visualized using a grape tree; the clusters isolate based on their allelic profiles using a minimum spanning algorithm [64]. Finally, the functional information of the CDS was retrieved using UniProtFinder (https://www.uniprot.org/). Multiple sequences were aligned using mafft v7.515, and neighbour-joining (NJ) trees were created using SchemaEvaluator implemented in chewBBACA [65]. PorB was identified based on the UniProt identifier, and the aligned sequences and NJ trees were exported from the SchemaEvaluator. The multiple sequence alignment files were imported into the molecular evolutionary genetic analysis tool, mega X, and the CDS was translated [66]. The NJ trees and the corresponding metadata were visualized using iTOL [67].

Identification of porB1a and porB1b genes and substitutions at amino acid positions 120 and 121
Gonococci can be divided into serogroups WI, WII and WIII by coagglutination [68]. In the current study, the strains were assigned to PorB1a (WI) or PorB1b (WII/III) serogroups [63,68]. The following search was conducted to classify the PorB1a and PorB1b serogroups in the porB FASTA sequences generated during the allele calling: Firstly, a search against NCBI protein databases with the term 'trimeric protein PorB.IA' and 'trimeric protein PorB.IB' retrieved 41 and 411 sequences, respectively (accessed date: March 2022). Secondly, a local blastp database of the PorB.IA and PorB.IB sequences were created in CLC Genomics Workbench (v 20.1.2). Lastly, a blastp was carried out against the PorB allele FASTA files, and sequences with 100 % identity were classified as belonging to either PorB1a or PorB1b serogroups N. gonorrhoeae sequence typing for antimicrobial resistance (NG-STAR) uses a small 60 bp sequence that includes the G120-and A121-encoding region associated with antimicrobial resistance to limit the number of alleles of the hypervariable porB gene [69]. Due to the high homology of PorB1a sequences, the G120/A121 mutations apply only to PorB1b. However, both genes are included in NG-STAR to enhance typing purposes [69]. Therefore, in the present study, amino acid (aa) changes at positions 120 and 121 for both PorB1a and PorB1b were included.
The porB sequences extracted from the SchemaEvaluator were imported to CLC Genomics Workbench v 20.1.2. The nucleotide sequences were translated to protein sequences and aligned using mafft [65]. G120 and A121 aa positions were deduced based on the amino acid sequence ARO:3 000 464, retrieved from the CARD database [70].

Phylogenetic and recombination analyses
Phylogenetic analysis was carried out on all the porB alleles representing 19 018 Neisseria isolates and a subclade comprising the PorB1a serogroup clustered together with N. meningitidis and Neisseria commensals, representative of 2653 isolates.
The nucleotide alignment of porB alleles (n=3,885) and the subclade (n=698) were screened for recombination events using the Recombination Detection Programme (RDP4) [52]. The analysis was carried out as described in [38,39]. In brief, RDP4 examines the nucleotide sequence alignments and identifies recombinant events using recombination detection methods such as RDP, GENECONV, Bootscan, Maxchi, Chimaera, SiSscan and 3Seq [71][72][73][74][75]. Recombinant events supported by at least two of the seven algorithms were used with default settings, except the window size was increased to 60 nt in RDP, 120 nt in MaxChi and Chimaera, and to 500 in BootScan and SiScan. NJ trees were constructed using 1000 bootstrap replicates [76]. The minor parent and major parent were defined as the ones contributing to the smaller and larger fraction of the recombinant, respectively [52].
Statistical analyses, including the calculation of GMs, were performed using XLSTAT (2022, Statistical and data analysis solution, New York, USA). MIC variables were log 2 -transformed for use as continuous outcome variables [39]. To evaluate the correlation between multiple amino acid substitutions and to assess the correlations between phenotypic and genotypic patterns of resistance to penicillin and tetracycline, the two-sided Mann-Whitney U test was used. A value of P of <0.001 was considered statistically significant.
The PorB1a and PorB1b serogroups were spread across 47 and 67 countries, respectively (Fig. 2b, Table S3). The individual variants did not appear to be equally distributed worldwide (Fig. 2b, Table S3). For example, 302 isolates assigned to allele 835 belonging to the porB1a serogroup were spread worldwide (Americas, Australia and Europe). In contrast, 120 of the 122 isolates from allele 19 (PorB1b) were identified in the UK and no geographical data was available for the other two isolates (Fig. 2b, Table S3).

Variations in loop 3 and antimicrobial susceptibility in N. gonorrhoeae
The surface-exposed loops (loops 1 to 8) were analysed individually, and the size of the loops varied for the Neisseria spp. as follows: (1)    variability and antimicrobial susceptibility in N. gonorrhoeae was dissimilar to the corresponding loops in the commensal Neisseria spp., with rarely identical amino acid sequences and with no alleles shared (Figs 2 and S3).
Substitutions in 120 and 121 aa positions that result in reduced susceptibility to penicillin and tetracycline were deduced from the PorB protein sequences. Amino acid positions 120 and 121 corresponded to amino acid positions 172 and 173 in the full-length sequence alignment of all the Neisseria spp., and to amino acid positions 2 and 3 in loop 3, respectively (Fig. S3). G120 and A121 substitutions were detected in 17 882 isolates and were distributed worldwide (Fig. 3a-c). Penicillin and tetracycline MICs were available for 35.1 % of (6 283/17 882) and 25.9 %(4 639/17 882) isolates, respectively.

(i) Serogroup-Porb1a
Four isolates belonging to the Porb1a serogroup were available in 1928 (Fig. 5a). Out of the four isolates, two isolates each had both the G120D and A121G substitutions. Of note, the G120D substitution was first observed in 1928 and then not observed again until 1979. The G120N substitution was first observed in 2008. All the PorB1a isolates from 1928 to 2019 had a glycine at position 121.

(ii) Serogroup-Porb1b
Two isolates each from the year 1930 and 1931 and six isolates from 1932 were available (Fig. 5b). All ten isolates were wild-type (G120 and A121). The first G120 substitution, G120D was observed in 1960, and A121S substitution (n=2) was first observed in 1940. serogroups. X and Y axis denotes the year and the relative abundance of resistance-associated mutations in the N. gonorrhoeae isolates, respectively.

Evidence of horizontal gene transfer
RDP4 analysis and phylogenetic tree construction were carried out on (a) all the porB alleles (n=3,885) and (b) subclade consisting of the PorB1a serogroup clustered together with N. meningitidis and commensal Neisseria isolates (n=698).

(a) porB alleles (n=3885)
Twenty-seven unique recombination events were supported by at least two of seven detection methods using RDP4 analysis (Table S5). These included nine recombination events identified in the loop three regions. Intraspecies recombination among N. gonorrhoeae species and interspecies recombination were identified between N. meningitidis and commensal Neisseria spp.
All the recombinant events are summarized in Table S5, and an example of inter-and intra-species recombination is provided below:  Table S5). The minor parent, N. gonorrhoeae isolate, allele-779, had the G120D/A121G substitutions, and the major parent, N. gonorrhoeae isolates, allele-3654, had the G120N substitution. The recombinant N. gonorrhoeae isolates (alleles-3223; 3805 and 2934) had the A121G substitution (Table S5). The region of the tree containing the subclade of porB1a alleles was clustered together across N. meningitidis and other Neisseria species (Fig. 1a, b). The subclade was further investigated using RDP4, and 36 unique recombination events were supported by at least two out of seven detection methods. Interspecies recombination was identified between N. gonorrhoeae and commensal Neisseria spp. All the recombinant events are summarized in Table S6, and an example is provided below: belonging to the same clade (Fig. 1b). The serogroup of the above alleles could not be determined. The major parent was isolated from a woman and the minor parent from a man who has sex with men. In another instance of recombination, N. gonorrhoeae allele-2,049 (n=2) was identified within one of the clades of N. meningitidis (Fig. 1b).

DISCUSSION
We found evidence of HGT in the porB gene in N. gonorrhoeae and N. meningitidis. Additionally, using an allele-calling approach, we evaluated the substitutions at amino acid positions 120 and 121 to assess decreased susceptibility to penicillin and tetracycline within a global collection of 17882 N. gonorrhoeae genomes.
Using a cgMLST approach, we identified 3885 porB alleles from 17 882 N. gonorrhoeae, 114 N. meningitidis and 1022 Neisseria spp. isolates from around the world. Out of the 3885 porB alleles, 3461 alleles were identified in 17 831 N. gonorrhoeae isolates. The gonococcal alleles were further classified into serogroups as follows: PorB1a-216 alleles (isolates=1,441) and PorB1b-2,359 alleles (isolates=13,915). The remaining 886 alleles comprising 2,526 NTNG isolates could not be assigned to either of the above serogroups.
G120D, A121D, A121G, A121N and A12S have previously been found to be associated with reduced susceptibility to penicillin and tetracycline [20,29,81,82]. Interestingly, two of four Porb1a isolates from 1928 had G120D/A121G (n=2, alleles-1835, ST11688) substitutions, i.e. before the introduction of penicillin. Isolates belonging to Porb1b serogroup were available from 1930 onwards, and no mutations were observed in the isolates from the years 1930-32. The A121G substitution was first observed in two isolates belonging to Porb1b serogroup from 1998. Interestingly, the first resistance-associated mutation at aa position 121 (A121S) was observed in two isolates belonging to PorB1b serogroup from 1940. Of note, few NTNG isolates (n=27) and Neisseria commensal isolates (n=129) had the A121G substitution (Fig. 3a, b). These NTNG isolates were not further investigated and are beyond the scope of the current study. Though G120 and A121 substitutions apply to only Porb1b, both Porb1a and Porb1b are used for typing purposes [69]. Therefore, resolving the amino acid wild-type at position 121 for PorB1a, i.e. G121 instead of A121, will help resolve the NTNG isolates.
The geographical distribution of N. gonorrhoeae isolates with PorB1a, and PorB1b genotypes have been studied in several countries/regions [83]. In the current study, allele 835, consisting of 302 isolates belonging to the PorB1a serogroup, was spread worldwide (Americas, Australia and Europe), Fig. 2b. The high rate of HGT observed in the gonococcus led to the assumption that limited structure would be evident in its populations [54]. Allele 9, consisting of 122 isolates belonging to the PorB1b serogroup, was identified only in the UK, and this is consistent with a semi-clonal population structure [84][85][86][87] HGT among and within the hypervariable porB has been previously described [45]. We found evidence of intra-and inter-species recombination among the Neisseria species. In the interspecies recombination, DNA was acquired by N. meningitidis from two commensal Neisseria species: N. cinerea and N. polysaccharea.
Phylogenetic incongruence was observed in a subclade that consisted of the PorB1a serogroup in the porB gene tree, which could have arisen due to HGT. Since HGT significantly influences bacterial phylogenomic patterns, this subclade with apparent incongruence was further investigated [56]. A subclade consisting of 698 porB alleles was extracted, and a phylogenetic tree was constructed. The multiple sequence alignment was then analysed using RDP4. Interestingly, this approach provided evidence of interspecies recombination between N. gonorrhoeae (minor parent) and Neisseria commensal N. lactamica (major parent).
Caveats of our study include the following: only computational analysis was carried out based on the publicly available data, and variations and recombination events were determined only for the porB gene. Analysis of variations was accounted only for G120 and A 120 amino acid positions, the most frequent antimicrobial-resistant determinant found in PorB. Moreover, due to the lack of availability of MIC data for the commensals, the correlation of MICs between the parents and the recombinants could not be determined. Some of the Pathogenwatch MICs are estimated based on genotype using 485. toml (https://gitlab.com/ cgps/pathogenwatch/amr-libraries/blob/0.0.17/485.toml) and the estimated MICs may differ slightly from MICs obtained via phenotypic testing [38]. Finally, the HGT events were not confirmed experimentally.
Antimicrobial resistance in N. gonorrhoeae and N. meningitidis frequently emerges in commensal Neisseria species and is transferred to the pathogenic Neisseria via natural transformation [36][37][38][39][40][41][42]. In the present study, recombination was seen in the hypervariable loop regions, including loop 3, where the resistance-associated determinants are present [20,29,82]. HGT in loops of porB is worrisome and suggests the need for geno-and phenotypic surveillance of antimicrobial susceptibility in commensal Neisseria spp. and a pan-Neisseria approach to prevent the further emergence of AMR in the pathogenic Neisseria [88]. Similar considerations apply to vaccine strategies that target PorB [89][90][91][92][93].

Funding information
The study was funded by SOFI 2021 grant-'PReventing the Emergence of untreatable STIs via radical Prevention' (PRESTIP).

Conflicts of interest
The author(s) declare that there are no conflicts of interest.