DNA uptake sequences in Neisseria gonorrhoeae as intrinsic transcriptional terminators and markers of horizontal gene transfer

DNA uptake sequences are widespread throughout the Neisseria gonorrhoeae genome. These short, conserved sequences facilitate the exchange of endogenous DNA between members of the genus Neisseria. Often the DNA uptake sequences are present as inverted repeats that are able to form hairpin structures. It has been suggested previously that DNA uptake sequence inverted repeats present 3′ of genes play a role in rho-independent termination and attenuation. However, the experimental evidence for this role is conflicting. The aim of this study was to determine the role of DNA uptake sequences in transcriptional termination. Both bioinformatics predictions, conducted using TransTermHP, and experimental evidence, from RNA-seq data, were used to determine which inverted repeat DNA uptake sequences are transcriptional terminators and in which direction. Here we show that DNA uptake sequences in the inverted repeat configuration occur in N. gonorrhoeae both where the DNA uptake sequence precedes the inverted version of the sequence and also, albeit less frequently, in reverse order. Due to their symmetrical configuration, inverted repeat DNA uptake sequences can potentially act as bi-directional terminators, therefore affecting transcription on both DNA strands. This work also provides evidence that gaps in DNA uptake sequence density in the gonococcal genome coincide with areas of DNA that are foreign in origin, such as prophage. This study differentiates for the first time, to our knowledge, between DNA uptake sequences that form intrinsic transcriptional terminators and those that do not, providing characteristic features within the flanking inverted repeat that can be identified.


Introduction
In Neisseria gonorrhoeae and Neisseria meningitidis, a short, conserved sequence is scattered throughout the genome. This sequence enhances the uptake of external DNA fragments, providing discrimination that favours DNA acquisition from members of the genus Neisseria over foreign DNA (Goodman & Scocca, 1988). These DNA uptake sequences are present in both commensal and pathogenic species of the genus Neisseria (Marri et al., 2010), resulting in horizontal gene transfer not only between different strains of gonococci (Sparling, 1966) but also between the various species of the genus Neisseria (Kriz et al., 1999; Spratt et al., 1992).
There are several variations on the DNA uptake sequence (DUS), the most abundant of which is the 10 bp 5′-GCCGTCTGAA-3′ sequence (Goodman & Scocca, 1988), which occurs almost 2000 times in the N. gonorrhoeae strain FA1090 genome. Most of these are part of an extended 12 bp DUS (eDUS; 5′-ATGCCGTCTGAA-3′) that has been described as having enhanced uptake efficiency (Ambur et al., 2007). Compared with the 10 bp DUS, the 12 bp eDUS has been shown to give an approximately three-fold increase in the rate of endogenous DNA uptake in pathogenic Neisseria (Ambur et al., 2007). Also identified is a variant of the DNA uptake sequence (vDUS; 5′-GTCGTCTGAA-3′), found more commonly in some of the commensal species of the genus Neisseria (Marri et al., 2010). These DNA uptake sequences are scattered throughout the genome; however, they are less abundant in regions that may once have been foreign in origin, such as prophages, and are also less frequent in regions encoding ribosomal proteins (Bentley et al., 2007).
Unlike other repetitive sequences in bacterial genomes, such as transposons and gene duplications, the DNA uptake sequences remain highly conserved. Duplications of sequence will, over time, diverge from each other, such that elements such as transposons and duplications accumulate sequence differences between them (Smith et al., 1999). However, the short DNA uptake sequences are highly conserved between strains of members of the genus Neisseria and across the various neisserial species, with the most common sequence in the non-pathogenic species varying from the pathogenic sequence by only one base out of 10-12 (Marri et al., 2010). It may be that the specificity of the ComP surface protein that recognises the DNA uptake sequence (Cehovin et al., 2013) exerts selective pressure to maintain the conserved sequence with high affinity for this receptor. Differences in ComP proteins between the pathogens and non-pathogens, and corresponding differences in affinity for their variations in the DNA uptake sequences, support the hypothesis that there is pressure to conserve the sequence (Berry et al., 2013).
The DNA uptake sequences of 10-12 bp are also found in the species of the genus Neisseria in an inverted repeat orientation (IR-DUS), often flanking a short central region (Ambur et al., 2007; Goodman & Scocca, 1988). These inverted repeat configurations of the DNA uptake sequence have been found downstream of or internally to neighbouring genes, many of which meet the criteria to function as rho-independent terminators or attenuators (Goodman & Scocca, 1988; Marri et al., 2010). The inverted repeats are not thought to further enhance endogenous DNA uptake above that achieved with a single copy of the DNA uptake sequence (Ambur et al., 2007), yet the frequent occurrence as inverted repeats suggests that these sequences have an additional function.
Using the genome sequence data from N. gonorrhoeae strain NCCP11945 (Chung et al., 2008), the implications of the locations of DNA uptake sequences in this strain are explored. The locations of all of the inverted repeat DNA uptake sequences were identified and their fitness as attenuators and rho-independent terminators assessed using both bioinformatics predictions and RNA-seq experimental evidence of transcription across the inverted repeat hairpin loop.

Impact Statement
The human pathogens Neisseria gonorrhoeae and Neisseria meningitidis are naturally competent for transformation. These species will take up DNA from the environment, engaging in horizontal gene transfer, which has contributed to the rise in antibiotic resistance in N. gonorrhoeae and the evolution of virulence factors such as the capsule in N. meningitidis, as well as capsule serogroup switching in this species. Uptake of DNA carrying the 10-12 bp neisserial DNA uptake sequence is more efficient than uptake of DNA without the sequence. Because these sequences are scattered in the genome and are associated with the 3′ ends of genes, they have also been associated with transcriptional termination. Here we show that DNA uptake sequences terminate transcription only when other sequence features are present, not in all circumstances. In addition, the presence of these uniquely neisserial sequences can be used as an indicator of horizontal gene transfer.
Methods
Intrinsic transcriptional terminator identification. IR-DUS locations for the DUS, eDUS, vDUS and veDUS were determined using the default settings for the Fuzznuc pattern finder (Rice et al., 2000) within xBASE (Chaudhuri et al., 2008) on the N. gonorrhoeae strain NCCP11945 genome sequence, allowing for a central region of one to ten bases followed by the mirrored DNA uptake sequence consensus [e.g. 5′-GCCGTCTGAA X(1,10) TTCAGACGGC-3′]. These were then mapped using the Artemis sequence visualization tool (Rutherford et al., 2000). The inverted repeat DNA uptake sequences that could potentially be intrinsic transcriptional terminators were then identified using TransTermHP (Kingsford et al., 2007) set to a confidence score of 75/100. The intrinsic transcriptional terminator criteria used in the TransTermHP prediction software were based on previous results: the terminator must be −50 to 500 bases 3′ of the stop codon (Ermolaeva et al., 2000; Kingsford et al., 2007); the hairpin stem must be 10-18 bases long with a loop size of 3-10 bases (Lesnik et al., 2001); and the hairpin must be followed immediately by a poly-T region (Ermolaeva et al., 2000; Kingsford et al., 2007). If the intrinsic transcriptional terminator is to be functional on both strands it must also be preceded by a poly-A region (Ermolaeva et al., 2000; Kingsford et al., 2007). Using Artemis (Rutherford et al., 2000) and Progressive Mauve genome alignment software with default settings (Darling et al., 2010), N. gonorrhoeae strain NCCP11945 was also scanned for all IR-DUS located within a gene or a maximum of 500 bases 3′ of a gene.
Ion Torrent RNA-seq. N. gonorrhoeae strain NCCP11945 was grown in standard conditions on GC agar (Oxoid) with Kellogg's (Kellogg et al., 1963) and 5 % Fe(NO3)3 supplements at 37 °C or 40 °C in an incubator with 5 % CO2 overnight. Growth from the agar plates was removed using sterile plastic 10 µl loops and immediately placed into 500 µl of RNALater (Life Technologies), mixed, pelleted, and the RNA extracted using the RNeasy kit (Qiagen). RNA quality was determined on the 2100 Bioanalyzer (Agilent) and RNA was only used for RNA-seq when it had a RIN of 9 or above. The Ion Personal Genome Machine, Ion Total RNA-Seq Kit with ERCC RNA Spike-In Control Mix, Ion Express Template kit, and Ion Sequencing 200 kit (Life Technologies) were used to sequence 1 µg of rRNA-depleted RNA.
Data analysis. Following data trimming on the Ion Server for quality, determined against internal Ion controls and the ERCC RNA Spike-In controls, the BAM files generated for each RNA-seq experiment were loaded into NextGENe (v2.3.0 to v2.3.2) for mapping against the reference genome sequence of N. gonorrhoeae strain NCCP11945 (CP001050) using default settings. Each of the intrinsic transcriptional terminator sites identified using TransTermHP was manually analysed to assess whether transcripts were present at this location, their orientation and whether they terminated at or near the DUS site. Transcripts at the location of a DNA uptake sequence were defined as reads mapping to this region. The orientation of these reads relative to the consensus genome sequence was indicated within the NextGENe viewer. Reads that ended at or near the DNA uptake sequence were recorded, taking account of their orientation (e.g. transcription toward the DUS, transcription through the DUS) and the number of reads that ended at the same base position on the genomic sequence. This experimental evidence was recorded for each site and for each RNA-seq experiment (Table S1, available in the online Supplementary Material). RNA-seq data was deposited into GEO (accession numbers GSE58650 and GSE73032).
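The manual tally of read ends described above can be approximated programmatically. The following is a minimal sketch and not the NextGENe workflow used in this study: it assumes that mapped reads have already been reduced to hypothetical (start, end, strand) tuples, and the 20-base window and the function name `tally_read_ends` are illustrative choices only.

```python
from collections import Counter

def tally_read_ends(reads, ir_start, ir_end, window=20):
    """Count read 3' ends near an inverted repeat, per strand.

    reads: iterable of (start, end, strand) with 1-based inclusive genome
    coordinates and strand '+' or '-'. For '+' reads the 3' end is `end`;
    for '-' reads it is `start`. `window` is an assumed tolerance.
    """
    ends = Counter()          # (strand, position) -> number of reads ending there
    read_through = Counter()  # strand -> number of reads spanning the repeat
    for start, end, strand in reads:
        three_prime = end if strand == "+" else start
        if start < ir_start and end > ir_end:
            # The read covers the whole repeat: evidence of read-through.
            read_through[strand] += 1
        elif ir_start - window <= three_prime <= ir_end + window:
            # The read stops at or near the repeat: evidence of termination.
            ends[(strand, three_prime)] += 1
    return ends, read_through

# Toy data: two '+' reads ending at the same base, one '+' read-through,
# and one '-' read converging on the repeat from the other side.
reads = [(900, 1012, "+"), (950, 1012, "+"), (940, 1100, "+"), (1015, 1150, "-")]
ends, through = tally_read_ends(reads, ir_start=1000, ir_end=1025)
print(dict(ends), dict(through))
```

Keeping the two strands separate mirrors the need, noted above, to record orientation as well as end position for each site.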

Results and Discussion
Uptake sequences and variant uptake sequences
All of the DNA uptake sequences in N. gonorrhoeae strain NCCP11945 were identified (Fig. 1), including 445 copies of the 10 bp DUS, 1520 copies of the extended 12 bp eDUS, 120 copies of the variant 10 bp vDUS found more frequently in the non-pathogenic species of the genus Neisseria, and nine copies of the extended 12 bp version of the variant, veDUS (Table S2). Therefore, the majority of the DNA uptake sequences found in N. gonorrhoeae strain NCCP11945 fit the consensus from the pathogenic species (94 %; 1965 of 2094), as would be expected. There are, however, 129 copies of the vDUS/veDUS most commonly found in the non-pathogenic species of the genus Neisseria (Marri et al., 2010), providing further evidence suggestive of horizontal exchange of DNA between the species (Spratt et al., 1992). These results are similar to those presented previously for other genomes of species of the genus Neisseria, where sequences were also identified using fuzznuc (Frye et al., 2013). These results are also in concordance with previous results demonstrating horizontal transfer of sequences containing DNA uptake sequences, including between pathogens and non-pathogens, albeit at a lower rate (Berry et al., 2013), and the presence of pathogen DUS in the non-pathogens (Marri et al., 2010). It has been suggested that the presence of the DNA uptake sequences ensures that the sequences of conserved regions can be repaired if they accumulate mutations (Davidsen et al., 2004), as well as playing a role in the dissemination of new traits within the genus (Snyder et al., 2007).
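A count of this kind can be sketched in a few lines of Python. This is an illustration rather than the fuzznuc search used here: each 10 bp core is counted as its extended form when it is preceded by AT, and the assumption that the veDUS is the vDUS core with the same AT prefix is ours, since the sequence is not spelled out in the text. The toy sequence is invented for the example.

```python
import re

COMP = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMP)[::-1]

def count_uptake_sequences(genome):
    """Count DUS, eDUS, vDUS and veDUS on both strands of `genome`."""
    counts = {"DUS": 0, "eDUS": 0, "vDUS": 0, "veDUS": 0}
    cores = {"GCCGTCTGAA": ("DUS", "eDUS"), "GTCGTCTGAA": ("vDUS", "veDUS")}
    for strand_seq in (genome, revcomp(genome)):
        for core, (short, extended) in cores.items():
            # A lookahead is used so that overlapping hits are not skipped.
            for m in re.finditer(f"(?={core})", strand_seq):
                prefix = strand_seq[max(0, m.start() - 2):m.start()]
                # Assumption: an AT prefix marks the extended 12 bp form.
                counts[extended if prefix == "AT" else short] += 1
    return counts

# Toy sequence with one copy of each class; the veDUS lies on the reverse strand.
toy = ("AAATGCCGTCTGAACC" + "GGGCCGTCTGAATT"
       + revcomp("CATGTCGTCTGAAG") + "TTGTCGTCTGAACC")
counts = count_uptake_sequences(toy)
print(counts)
```

Because a 12 bp hit is never also counted as a 10 bp hit, the four classes sum to the total number of cores, which is how the 1965 and 129 subtotals above add up to 2094.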

Uptake sequences as inverted repeats and intrinsic transcriptional terminators
Some of the DNA uptake sequence copies are adjacent to one another and present in an inverted repeat configuration. A total of 415 of these inverted repeat instances of paired DNA uptake sequences were found (Table 1, Table S3). The majority of these (83 %, 345 out of 415) are either the extended 12 bp DNA uptake sequence followed by its complement (5′-ATGCCGTCTGAA-Nn-TTCAGACGGCAT-3′; 313 copies) or the shorter 10 bp DNA uptake sequence followed by its complement (5′-GCCGTCTGAA-Nn-TTCAGACGGC-3′; 32 copies), as described previously (Goodman & Scocca, 1988).
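The inverted repeat pattern can be expressed as a regular expression. The sketch below is a stand-in for the Fuzznuc pattern given in the Methods, not a reproduction of it: the function name `find_ir_dus` and the toy sequence are hypothetical, and the spacer range of 1-10 bases follows the search described in the Methods.

```python
import re

COMP = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMP)[::-1]

def find_ir_dus(genome, dus="GCCGTCTGAA", min_gap=1, max_gap=10, reverse=False):
    """Return (start, end, spacer) for each inverted repeat of `dus`.

    Coordinates are 0-based and end-exclusive. With reverse=False the
    pattern is DUS, spacer, reverse complement of DUS; with reverse=True
    the two arms are swapped.
    """
    left, right = dus, revcomp(dus)
    if reverse:
        left, right = right, left
    # The lookahead lets overlapping candidates be reported; the lazy
    # quantifier picks the shortest spacer from each start position.
    pattern = re.compile(
        f"(?=({left}([ACGT]{{{min_gap},{max_gap}}}?){right}))"
    )
    return [(m.start(1), m.end(1), m.group(2)) for m in pattern.finditer(genome)]

toy = "TT" + "ATGCCGTCTGAA" + "CGTT" + "TTCAGACGGCAT" + "GG"
print(find_ir_dus(toy))                      # 10 bp arms
print(find_ir_dus(toy, dus="ATGCCGTCTGAA"))  # 12 bp arms
print(find_ir_dus(toy, reverse=True))        # reverse orientation
```

Only one strand needs to be scanned for each orientation, because an inverted repeat read on the opposite strand has the same two arms in the same order, with only the spacer reverse complemented.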
Two-thirds of the DNA uptake sequence inverted repeats both have an upstream gene in range (−50 to 500 bp) and are predicted by TransTermHP to function as intrinsic transcriptional terminators (66 %; 275 out of 415) (Table 1). Of these, 119 are predicted to terminate transcription on both strands and 156 are predicted to terminate transcription on one strand only (Table 2).
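The criteria listed in the Methods can be written as a simple filter. This is a deliberately simplified sketch, not the TransTermHP scoring scheme: the `Hairpin` record is hypothetical, the poly-T and poly-A tests (at least four of the eight flanking bases) are assumed thresholds chosen for illustration, and distance to a gene is checked for one strand only.

```python
from dataclasses import dataclass

@dataclass
class Hairpin:
    stem_len: int        # bases in each arm of the stem
    loop_len: int        # bases between the two arms
    upstream: str        # sequence immediately 5' of the hairpin
    downstream: str      # sequence immediately 3' of the hairpin
    dist_from_stop: int  # bases 3' of the stop codon (negative = inside the CDS)

def rich_in(seq, base, min_count):
    """True if `seq` contains at least `min_count` copies of `base`."""
    return seq.upper().count(base) >= min_count

def classify(h, tail_len=8, min_count=4):
    """Return 'both', 'one' or 'none' for a candidate hairpin.

    tail_len and min_count are illustrative stand-ins for a poly-T or
    poly-A test; they are not TransTermHP parameters.
    """
    if not (-50 <= h.dist_from_stop <= 500):
        return "none"
    if not (10 <= h.stem_len <= 18 and 3 <= h.loop_len <= 10):
        return "none"
    if not rich_in(h.downstream[:tail_len], "T", min_count):
        return "none"
    # A poly-A tract 5' of the hairpin reads as poly-T on the other strand.
    if rich_in(h.upstream[-tail_len:], "A", min_count):
        return "both"
    return "one"

examples = [
    Hairpin(12, 5, "GGCAAAAAAC", "TTTTTTCG", 20),   # A-rich 5', T-rich 3'
    Hairpin(10, 4, "GGCCGCGG", "TTTTATTC", 35),     # T-rich 3' only
    Hairpin(12, 5, "AAAAAAAA", "TTTTTTTT", 900),    # too far from the stop codon
    Hairpin(12, 1, "AAAAAAAA", "TTTTTTTT", 20),     # loop too short
]
labels = [classify(h) for h in examples]
print(labels)
```

The last example shows why an inverted repeat found with a one-base spacer would fail the loop criterion even when its flanking sequence is ideal.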
A minority of the inverted repeats of DNA uptake sequences are in a reversed orientation (14 %, 57 out of 415), with either a 5′-TTCAGACGGCAT-Nn-ATGCCGTCTGAA-3′ or a 5′-TTCAGACGGC-Nn-GCCGTCTGAA-3′ configuration; none were found for the variant DUS in this strain. These reversed orientation inverted repeats are much less likely to be intrinsic transcriptional terminators; only 25 % (14 out of 57) meet both criteria for transcriptional terminators (Table 1). The reverse orientation of the inverted repeat has been mentioned briefly in a previous analysis of DNA uptake sequences in the related species Neisseria meningitidis, where 360 pairs (89 %) were identified in the standard configuration and 45 (11 %) DNA uptake sequence inverted repeat pairs were found in the reverse orientation (Smith et al., 1999).

Table 1 footnotes:
*On either the positive strand of the genome, the negative strand of the genome, or both strands, the inverted repeat is both within range (−50 to 500 bp) of an annotated CDS from the N. gonorrhoeae strain NCCP11945 genome sequence (CP001050) and predicted to be an intrinsic transcriptional terminator by TransTermHP (Kingsford et al., 2007).
†Inverted repeat of the 12 bp extended DNA uptake sequence (5′-…).
‡Inverted repeat of the 12 bp extended DNA uptake sequence in reverse orientation (5′-TTCAGACGGCAT-N1-10-ATGCCGTCTGAA-3′).
§Inverted repeat of the shorter 10 bp DNA uptake sequence (5′-GCCGTCTGAA-N1-10-TTCAGACGGC-3′).
||Inverted repeat of the shorter 10 bp DNA uptake sequence in reverse orientation (5′-TTCAGACGGC-N1-10-GCCGTCTGAA-3′).
¶Inverted repeat of the DNA uptake sequence variant commonly found in non-pathogenic Neisseria spp. (5′-GTCGTCTGAA-N1-10-TTCAGACGAC-3′).
#Inverted repeat of the DNA uptake sequence variant commonly found in non-pathogenic Neisseria spp. in reverse orientation (5′-…).

Table 2 footnotes:
*The inverted repeat of the DNA uptake sequence is predicted by TransTermHP to be an intrinsic transcriptional terminator.
†The prediction of termination is supported by either or both sets of RNA-seq data from cultures grown at 37 °C.
‡The percentage of predicted transcriptional terminators that have been supported by RNA-seq data to terminate transcription.
§The inverted repeat of the DNA uptake sequence is predicted by TransTermHP to be an intrinsic transcriptional terminator on both strands, therefore making a bidirectional intrinsic transcriptional terminator.
||The inverted repeat of the DNA uptake sequence is predicted by TransTermHP to be an intrinsic transcriptional terminator on only one strand, therefore making an intrinsic transcriptional terminator on either the positive or negative strand of the genomic DNA.

Previously, it was shown experimentally that the DNA uptake sequence inverted repeat within the dcw cluster of N. gonorrhoeae strain FA1090 does not act as a transcriptional terminator 3′ of mraY (Snyder et al., 2003). This gene lies within the division and cell wall synthesis cluster, where transcriptional termination within the cluster could prove problematic for the expression of its essential genes. The prediction results here agree with the previous experimental evidence. This agreement lends support to the predictions made here for the other inverted repeats of DNA uptake sequences. Furthermore, analysis of the inverted repeat DNA uptake sequences in N. meningitidis suggested that some do not have the features to be intrinsic transcriptional terminators, either being within genes or lacking the 3′ T's (Smith et al., 1999). These previous results support our starting hypothesis that not all of the inverted repeats of uptake sequences in the species of the genus Neisseria are intrinsic transcriptional terminators.
In a previous study of N. meningitidis (Ambur et al., 2007), potential transcriptional terminators were identified using GeSTer (Unniraman et al., 2002). This software is largely similar to TransTermHP used here (Kingsford et al., 2007), in that it searches for inverted repeats likely to form hairpins. GeSTer identified approximately one half of the N. meningitidis features as potential intrinsic transcriptional terminators (Ambur et al., 2007). In N. gonorrhoeae strain FA1090, 91 % (361 of 396) of the identified features were predicted to terminate transcription. In addition to the differences in prediction algorithms, some parameters differed, including the number of bases between each repeat and the distances from CDSs. In N. gonorrhoeae strain NCCP11945, we have found that 66 % (275 of 415; Table 1 and Table S3) of all of the inverted repeat DNA uptake sequences identified are predicted by TransTermHP to be intrinsic transcriptional terminators on the basis of hairpin formation, presence of a following poly-T region, and association with the 3′ end of a CDS. Annotation of CDSs within the genome sequence data therefore influences the identification of transcriptional terminators. Of the 415 inverted repeats identified here, 189 are between CDSs predicted to be convergently transcribed and 116 of these (61 %) are predicted to be transcriptional terminators on both strands by TransTermHP.
If the majority of inverted repeat DNA uptake sequences in the gonococcus are able to act as rho-independent terminators or attenuators, this would mean that they are potentially able to regulate and/or terminate the very CDS they helped introduce into the N. gonorrhoeae genome through enhanced uptake of DUS-containing DNA.

RNA-seq data supports TransTermHP predictions for some IR-DUS as transcriptional terminators
As discussed above, not all of the DNA uptake sequence inverted repeats are predicted to be transcriptional terminators by TransTermHP. For example, the copy within the dcw cluster does not have the necessary features to be predicted to terminate transcription by TransTermHP and it has been demonstrated experimentally that there is transcription across this inverted repeat (Snyder et al., 2003). Using RNA-seq, it is possible to visualise the transcriptome against the genome sequence and assess the presence of transcriptional termination or, indeed, transcription through the location of a DNA uptake sequence inverted repeat. Using RNA extracted from two different cultures grown under the same conditions, the transcriptome of N. gonorrhoeae strain NCCP11945 was investigated in these biological replicates at each of the locations containing an inverted repeat of the DNA uptake sequence where one or both sides of the inverted repeat were predicted by TransTermHP to be intrinsic terminators of transcription. The directionality of the transcript was assessed from the RNA-seq data aligned against the reference genome, in each case looking for convergent transcription toward the inverted repeat to assess whether these sequences were acting as bidirectional intrinsic terminators of transcription, even when the annotation did not show convergent CDSs. Pooling of the RNA from the two biological replicates, or merging of the RNA-seq data before analysis, would have made it impossible to identify any differences between the replicates; therefore both datasets were fully analysed and their results compared. Because not all CDSs in the genome are transcribed at any one time, it was expected that there may be no transcript data available for some of the regions under investigation.
Across the RNA samples from the two cultures grown at 37 °C, experimental evidence could be found for transcriptional termination supporting the TransTermHP predictions at 75 % (207 out of 275) of these inverted repeat DNA uptake sequence sites (Table 2). Experimental evidence supporting transcriptional termination was recorded (Table S1), indicating the end point of the transcript data relative to the genome data for all transcripts converging on the inverted repeat. Such analysis required RNA-seq data that retained the strand information for the original mRNA transcripts, as can be achieved through Ion Torrent RNA-seq. Therefore, two-thirds of the inverted repeat DNA uptake sequences are predicted to be intrinsic transcriptional terminators and these predictions are supported by RNA-seq data for three-quarters of the locations. An example of the transcript data for one such site is represented in Fig. 3a.
To assess if the hairpin structure created by the inverted repeat might be influenced by temperature, RNA-seq data was also collected from N. gonorrhoeae grown at 40 °C. When this RNA-seq data is included in the analysis, it provides further supporting evidence of transcriptional termination for some locations (Table S1). Of those inverted repeat DNA uptake sequence sites predicted to terminate transcription on both strands, the 40 °C RNA-seq data contributes evidence supporting the 37 °C data, as well as adding three new sites with evidence for one strand (Table S1). For each of these three additional sites identified only in the 40 °C RNA-seq data, there is no data at the location in the 37 °C RNA-seq data, which suggests that either these genes are not expressed at 37 °C or that their expression is not captured in either of the two 37 °C RNA-seq data sets.
Overall, the bioinformatics predictions and the experimental evidence agree well, and it can clearly be seen that not all copies of the DNA uptake sequence in N. gonorrhoeae present as inverted repeats act as transcriptional terminators. Those that meet the specific criteria set out by TransTermHP, including distance from the gene, formation of a hairpin loop and presence of poly-T regions (Ermolaeva et al., 2000; Kingsford et al., 2007), are excellent candidates for transcriptional terminators, but it should not be assumed that all DNA uptake sequences downstream of a gene are acting as such, particularly if they are placed within essential operons (Snyder et al., 2003). Indeed, only two-thirds of the IR-DUS (275 of 415) possess the necessary sequence features to act as intrinsic transcriptional terminators and these predictions have been supported by the RNA-seq evidence for 75 % of them. This is in concordance with earlier predictions based on N. meningitidis sequence data that suggested that about a quarter of the copies lacked the sequence features to be terminators.
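The proportions quoted in this section follow directly from the counts reported above, and can be checked with a few lines of arithmetic. The last entry, the share of all IR-DUS that are both predicted and supported, is derived here for illustration and is not a figure reported in the tables.

```python
def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)

ir_total = 415                      # all IR-DUS identified
predicted = 275                     # in range of a CDS and predicted by TransTermHP
supported = 207                     # predicted sites with RNA-seq support at 37 °C
both_strands, one_strand = 119, 156
reverse_total, reverse_predicted = 57, 14

summary = {
    "predicted / all IR-DUS": pct(predicted, ir_total),
    "supported / predicted": pct(supported, predicted),
    "reverse orientation / all IR-DUS": pct(reverse_total, ir_total),
    "reverse predicted / reverse": pct(reverse_predicted, reverse_total),
    "supported / all IR-DUS (derived)": pct(supported, ir_total),
}

# The bidirectional and single-strand predictions account for every predicted site.
assert both_strands + one_strand == predicted

for label, value in summary.items():
    print(f"{label}: {value} %")
```

On these counts, roughly half of all inverted repeat DNA uptake sequences have both a terminator prediction and supporting RNA-seq evidence at 37 °C.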

Temperature control of termination
In the two sets of RNA-seq data from N. gonorrhoeae grown at 37 °C, there is only one example of mRNA that reads through an inverted repeat DNA uptake sequence predicted to be an intrinsic transcriptional terminator and into the adjacent sequence. In this case, at the IR-DUS associated with NGK_1855, an ArsR family regulator, there are both transcripts that end with the inverted repeat and those that go through it (Table S1, text in red). There is also a transcript through the IR-DUS in RNA-seq data from cultures grown at 40 °C. In total, the RNA-seq data set from N. gonorrhoeae grown at 40 °C contains 12 examples of mRNA transcripts that go through the inverted repeat region in a direction predicted by TransTermHP to terminate transcription (Table S1, text in red). In addition, at another five locations there is evidence of both termination and read-through of a predicted terminator (Table S1, text in red). These results suggest that the termination of transcription by the inverted repeat of DNA uptake sequences may be a temperature-sensitive phenomenon.
There is an inverted repeat of the 12 bp extended DNA uptake sequence between NGK_0100 and NGK_0101 (Fig. 3b), which is predicted to be an intrinsic terminator for both strands and has experimental supporting evidence from both 37 °C RNA-seq datasets and support for the NGK_0101 transcriptional direction from the 40 °C RNA-seq data. However, the transcript from NGK_0100 goes straight through the inverted repeat in the 40 °C sample, as assessed by RNA-seq data that aligns to this region. NGK_0100 is ileS, encoding isoleucyl-tRNA synthetase, and NGK_0101 is a small hypothetical protein, where antisense transcription through the CDS may seem inconsequential, since the function and indeed candidacy of this 192 bp annotated feature is uncertain. Yet, the preceding gene NGK_0102 is an opa (Fig. 3), although the annotation of this has not included the correct initiation codon 5′ of the CTTCT repeat region. Temperature-sensitive transcriptional read-through at the DNA uptake sequence inverted repeat 5′ of ileS resulting in anti-sense opa NGK_0102 mRNA could attenuate expression of this opa. There are differences by mfold analysis of this inverted repeat sequence in the optimal energies and predicted structures between 37 °C and 40 °C (Fig. 4).
There is a second extended DNA uptake sequence inverted repeat copy in this region, between NGK_0101 and NGK_0102 (Fig. 3b). It is not predicted to be a terminator on the positive strand. As the annotation of NGK_0101 is on the negative strand, this is 5′ of NGK_0101, rather than 3′ where a transcriptional terminator would need to be located. It is predicted to be a transcriptional terminator on the negative strand for opa, NGK_0102, and there is evidence of transcriptional termination in all three RNA-seq datasets, yet there is also evidence of transcription through the inverted repeat in the 40 °C RNA-seq data.
It may therefore be that the function of the intrinsic transcriptional terminators formed by the DNA uptake sequence inverted repeats is influenced by more than just the composition of their surrounding sequences and their position relative to the genes. In another example, NGK_1855 has a DNA uptake sequence inverted repeat that is predicted to be a transcriptional terminator in only one direction, acting only on NGK_1855, which encodes a putative ArsR family transcriptional regulator. The RNA-seq data has transcripts ending within five bases of the inverted repeat in the two 37 °C datasets. However, for one of the 37 °C samples and the 40 °C sample there is also transcription straight through. The next coding sequence is exodeoxyribonuclease III, which has been shown to be critical for the survival of N. meningitidis under conditions of oxidative stress (Silhan et al., 2012). It may be that this transcriptional terminator is relaxed under certain conditions, acting as an attenuator rather than a terminator.

Regions of the chromosome lacking CREE and DUS
When mapped to the circular chromosome, three regions (a, b and c) were identified that lacked the Neisseria-specific Correia repeat enclosed element (CREE), DNA uptake sequence, or both repeat elements (Fig. 1). Region a is divided into two parts that lack DNA uptake sequences entirely. The first, from 542 500 to 562 000, contains putative prophage genes (Spencer-Smith et al., 2012), whilst the second, from 570 000 to 592 000, contains the mafAB system and associated mafB cassettes (Bentley et al., 2007; Parkhill et al., 2000). Between these two regions there are 14 copies of the DNA uptake sequence associated with the LOS sialyltransferase gene lst (Gilbert et al., 1996), one of the 11 opa alleles, some metabolic genes, and a number of small hypothetical CDSs. The absence of DNA uptake sequences from the prophage and mafAB regions supports their foreign origin as previously reported.
Region b is also made up of two parts, from 832 000 to 850 500 and from 863 000 to 882 500, with four DUS between. This region is the Gonococcal Genetic Island (GGI) described previously (Dillard & Seifert, 2001; Hamilton et al., 2005; Snyder et al., 2005), which lacks CREE and has few DUS. This region is also believed to have originated outside of the Neisseria spp. (Hamilton et al., 2005).
Large gaps in DNA uptake sequence density may therefore indicate regions of non-Neisseria origin, particularly if CREE are also absent.
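A scan for such gaps is straightforward once feature coordinates are available. The sketch below is illustrative and was not part of the analysis: the function name `find_gaps`, the toy coordinates and the 20 000 bp threshold are assumptions, the threshold being chosen only because the DUS-free stretches described above are each roughly 20 kb long.

```python
def find_gaps(positions, genome_length, min_gap):
    """Return (start, end, length) for feature-free intervals on a
    circular chromosome that are at least `min_gap` bases long."""
    pos = sorted(set(positions))
    if not pos:
        return [(0, genome_length, genome_length)]
    gaps = []
    for left, right in zip(pos, pos[1:]):
        if right - left >= min_gap:
            gaps.append((left, right, right - left))
    # The chromosome is circular, so the interval from the last feature
    # back round to the first must also be considered.
    wrap = genome_length - pos[-1] + pos[0]
    if wrap >= min_gap:
        gaps.append((pos[-1], pos[0], wrap))
    return gaps

# Toy coordinates on a 100 kb circle.
dus_positions = [1000, 5000, 30000, 31000, 52000, 99000]
gaps = find_gaps(dus_positions, genome_length=100000, min_gap=20000)
print(gaps)
```

Running the same function on CREE coordinates and intersecting the two sets of intervals would highlight regions lacking both repeat families, the situation described for region b.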

Conclusion
DNA uptake sequence locations and sequences can be important in understanding both gene expression and sequence origin. Two-thirds of the DNA uptake sequences present as inverted repeats in N. gonorrhoeae have all of the sequence features to act as transcriptional terminators. Therefore, their presence within operons cannot be assumed to mediate transcriptional termination without additional experimentation, as previously shown within the 20 kb dcw cluster (Snyder et al., 2003). RNA-seq enables the experimental interrogation of transcriptome data across the genome, supporting predictions and identifying transcriptional terminators. In addition, absence of the Neisseria-characteristic repetitive sequences can be a powerful indicator of the presence of foreign sequences, especially when two Neisseria-specific sequence features are both lacking from a genomic region. Through assessment of genomic regions that are not associated with Neisseria-specific DNA uptake sequences and Correia repeat enclosed elements, several prophages and regions of horizontal gene transfer can be identified.

Fig. 1. Circular chromosome map representing the locations of DNA uptake sequences (DUS) and Correia repeat enclosed elements (CREE) in the genome sequence of N. gonorrhoeae strain NCCP11945. From the outer circle to the inner circle, each represents the position of features on the positive (outer) and negative (inner) strand: annotated CDSs (red); 10 bp DUS (blue) overlaid with extended 12 bp eDUS (grey); non-pathogen variant 10 bp vDUS (green) overlaid with extended 12 bp versions of the non-pathogen variant sequence veDUS (grey); and CREE (purple). Gaps in these features are labelled a, b and c. Two gaps in DNA uptake sequence density are present at a, two gaps in DUS density that coincide with a large gap in CREE density are present at b, and c is an area of low CREE density.

Fig. 2. Schematic representation of the ideal requirements needed for a potentially functional inverted repeat DNA uptake sequence bi-directional rho-independent terminator, using the enhanced 12 bp DNA uptake sequence in the most common orientation as an example.

Fig. 3. Transcriptional termination and inverted repeats of the DNA uptake sequence. (a) The mtrRCDE region contains overlapping promoter elements (bent arrows) that drive transcription of the mtrCDE operon (blue CDSs) encoding the MtrCDE efflux pump proteins and the mtrR gene encoding the MtrR negative regulator (Hagman et al., 1995). MtrR binds to the regulator binding site indicated by the blue/green box (Lucas et al., 1997). A Correia repeat enclosed element (beige triangle) sometimes removes this operon from its ancestral regulatory network through insertion between the mtrR gene and promoter (Rouquette-Loughlin et al., 2004). At the 3' end of the mtrCDE operon is an IR-DUS (red diamond) predicted by TransTermHP and supported by RNA-seq (black bar) to be an intrinsic transcriptional terminator (Table S1). The staggered right end of the black bar represents the ends of transcripts from mtrCDE toward the IR-DUS from the RNA-seq data. An IR-DUS is also 3' of mtrA, which encodes the positive regulator MtrA that also binds to the mtrCDE regulator binding site (blue/green box; Rouquette et al., 1999). (b) Schematic of the chromosomal region containing NGK_0100 ileS (orange), NGK_0101 a hypothetical protein CDS (green), and NGK_0102 opa (yellow). The IR-eDUS between NGK_0100 and NGK_0101 (red) is predicted to be an intrinsic transcriptional terminator on both strands. The IR-eDUS between NGK_0101 and NGK_0102 (purple) is predicted to be an intrinsic transcriptional terminator on the negative strand but not the positive strand. The staggered ends of the bars below the operon represent the ends of the transcripts from the RNA-seq data, colour coded to match the genes. The red bar and purple bar represent the RNA-seq data from growth at 40 °C, where there is transcription through the IR-DUSs.

Fig. 4. Boxplots generated by mfold of the region between NGK_0100 and NGK_0101 containing an extended DNA uptake sequence inverted repeat at 37 °C (a) and at 40 °C (b), showing that these differ at the two temperatures. This difference affects the structures predicted by mfold: the three structures predicted at 37 °C are shown in panels c, d and e, and those predicted at 40 °C in panels f, g and h, each in order of likelihood of formation.

Table 1. DNA uptake sequences as inverted repeats in N. gonorrhoeae strain NCCP11945

Table 2. Predicted intrinsic transcriptional terminators that are supported by RNA-seq