Epigenomic Landscape of Lyme Disease Spirochetes Reveals Novel Motifs

ABSTRACT Borrelia burgdorferi, the etiological agent of Lyme disease, persists in nature through an enzootic cycle consisting of a vertebrate host and an Ixodes tick vector. The sequence motifs modified by two well-characterized restriction/modification (R/M) loci of B. burgdorferi type strain B31 were recently described, but the methylation profiles of other Lyme disease Borrelia bacteria have not been characterized. Here, the methylomes of B. burgdorferi type strain B31 and 7 clonal derivatives, along with B. burgdorferi N40, B. burgdorferi 297, B. burgdorferi CA-11, B. afzelii PKo, B. afzelii BO23, and B. garinii PBr, were defined through PacBio single-molecule real-time (SMRT) sequencing. This analysis revealed 9 novel sequence motifs methylated by the plasmid-encoded restriction/modification enzymes of these Borrelia strains. Furthermore, while a previous analysis of B. burgdorferi B31 revealed an epigenetic impact of methylation on the global transcriptome, the current data contradict those findings; our analyses of wild-type B. burgdorferi B31 revealed no consistent differences in gene expression among isogenic derivatives lacking one or more restriction/modification enzymes.

genes; however, these generally revealed R/M enzymes that due to either size or frameshifts would not produce functional proteins (Table S1, REBASE). The phylogenetic trees were assembled from amino acid sequences identified through BLAST searches, with a greater number of BBE02/BBH09 homologs than BBQ67 homologs being identified (Fig. 1), in agreement with previous reports (13).
Single-molecule real-time sequencing of Borrelia reveals methylation of unique nonpalindromic m6A motifs by multiple strains. SMRT sequencing was performed to determine the motifs recognized by the R/M enzymes of 8 different B. burgdorferi B31 derivatives (13,(40)(41)(42)(43), 3 different B. burgdorferi N40 derivatives (44), B. burgdorferi 297 (2), B. burgdorferi CA-11 (45) and clonal CA-11.2A (46), B. garinii PBr (47), and B. afzelii PKo (47) and BO23 (48), resulting in 75-to 776-fold coverage and .99.99% overall concordance with FIG 1 Phylogenetic trees depicting homologs of the BBE02 (Pfam1) (A) and BBQ67 (Pfam2) (B) type II R/M systems of Borrelia burgdorferi B31. The trees were rooted by the outgroup (GenBank accession number YP_002344108). The CAT approximations of the Shimodaira-Hasegawa (SH) test were used to determine the local support values (SH support values) with 1,000 resamplings. The B. burgdorferi sensu stricto (s. s.) single nucleotide polymorphism (SNP) groups are indicated by the boxes next to the R/M protein GeneIDs. Genes belonging to B. burgdorferi B31 are indicated by stars, while additional Borrelia strains that underwent SMRT sequencing are indicated with triangles. Any truncations identified in GenBank and any indels identified through sequencing are indicated next to the strain name. The identified motifs are indicated in the rightmost column next to the R/M enzyme responsible for its methylation; however, if it is not known which protein is responsible for motif recognition, all motifs identified within that strain are shown in parentheses. The colored circles and similarly colored lines indicate orthologs whose clade matches the expected strain phylogeny. The sequences used to create the tree are located in Table S1 in the supplemental material (R/M homologs).
New Insights into the Borrelia Methylome ® the published genome sequences (Table S2, coverage stats). Due to the absence of complete chromosomal sequences, contigs were utilized for chromosomal alignments of B. burgdorferi CA-11 and B. afzelii BO23. Additionally, as B. burgdorferi 297 does not have a sequenced chromosome, a previously sequenced and assembled chromosomal sequence of a close relative, JD1, was used for alignment. Analysis of SMRT sequencing data revealed that, for the most part, genomic DNAs (gDNAs) from different Borrelia strains exhibited distinct methylation motifs, although they carry homologous R/M genes (Fig. 2). The variation of motifs within the same species is consistent with the rapid evolution of the R/M system in Borrelia. While there is a significant strain-specific distribution of motifs, through analysis of the pairwise motif differences and the phylogenetic distributions, orthologous R/M enzymes methylate motifs that are more similar than those of paralogous R/M enzymes ( Fig. 1 and 2). All motifs were of the m6A type, with no confident m4C or m5C motifs being detected. Although the Bbun40_y07 gene product of B. burgdorferi N40 is reported to methylate cytosine (49), cytosine methylation was not detected in either nonclonal or clonal N40 gDNA ( Table 1). The number of unique methylated motifs detected for each B. burgdorferi B31 and N40, B. garinii PKo, and B. afzelii PBr strain was the same as the number of m6A R/M enzymes either known or predicted to be encoded in their genomes (Fig. 2). In contrast, B. burgdorferi N40, CA-11, and 297 and B. afzelii BO23 are predicted to harbor more R/M genes than the number of unique methylated motifs that were identified. While CA-11 harbored intact R/M genes, Bbun40_e01 in N40 contains indels; B. burgdorferi 297 lacks plasmid lp28-5 carrying Bbu297_y09 and Bbu297_y05, along with indels in Bbu297_h03 and Bbu297_j01; and B. afzelii BO23 contains a frameshift in gene BLA32_04945. However, it is unknown if the indels in BbuN40_e01, Bbu297_h03, and Bbu297_j01 would inactivate the resulting enzymes. All other Borrelia strains and derivatives did not contain any indels within their respective R/M genes (Table S2, R/M sequencing). Apart from the H/WCAG motifs (the methylated adenine is underlined), clonal derivatives, nonclonal B. burgdorferi N40 and CA-11, and B. afzelii PKo had efficient methylation (.95%) of m6A motifs encoded throughout the genome, while nonclonal B. burgdorferi B31 MI and 297 and B. garinii PBr did not methylate the majority of sites containing their respective motifs. The incomplete methylation of gDNA extracted from nonclonal populations is presumably due to the loss of plasmids encoding the R/M enzymes by a subset of the population (Table 1 and Fig. 2) (50). The two highly methylated motifs in B. burgdorferi B31, CGRKA modified by BBQ67 and GNAAYG modified by BBE02, are consistent with those identified by Casselli et al. (39). However, the H/WCAG motif of B. burgdorferi B31 has not been previously described (22,51). The methylated motifs identified within B. burgdorferi N40, CA-11, and 297; B. garinii PBr; and B. afzelii PKo and BO23 are novel (22,51). Further REBASE searches identified additional motifs that contain some form of the Borrelia motifs identified within this study, except for GNAAYG, GNAAYC, and HMAAG (Table S2, REBASE motifs).
The number of motifs present in Borrelia DNA is not more than would be expected by chance. To determine if individual motifs confer some type of advantage and are overrepresented in strains carrying the cognate R/M enzyme, the numbers of motifs within each genome were analyzed. This analysis assessed the numbers of motifs present in each replicon versus randomly assembled, noncoding sequences of the same length to determine if there were significantly more motifs than expected by chance in any particular replicon (Table S2, motif analysis). A false-positive "replicon" occurs if a randomly assembled sequence contains more motifs than would be expected by chance. Significance is determined by scanning primary and control sequences for each motif and computing an odds score for each position in each sequence; these scores are combined through the average odds score, and the sum test is applied to determine if the primary sequences have significantly higher scores (52). Of all the genomes analyzed, there was no evidence of selective pressure on the prevalence of R/M motifs (Table S2, motif analysis). Additionally, all genes harbored within each sequenced genome were synonymously shuffled at the third position of each codon to produce 1,000 synonymously shuffled sequences (53). Comparisons revealed that the majority of genes encoded fewer motifs than the synonymously shuffled sequences (Fig. 3).
The number of R/M motifs in the ospC gene is higher than would be expected by chance. Although there is no evidence for selective pressure on the total number of R/M motifs per replicon, we analyzed the distribution of R/M motifs in plasmidborne genes of B. burgdorferi B31 for which there is strong evidence of horizontal gene transfer and recombination ( Fig. 4 and Table S2, genes of interest) (33)(34)(35)(36)(37)(38)(54)(55)(56)(57). For most of the R/M motifs in the 7 strains analyzed, B. burgdorferi B31 ospC contained a significantly higher number than would be expected by chance. Additionally, analysis of the ospC genes of the other 6 strains revealed a significantly higher occurrence of some motifs (Table S2, ospC). In contrast, B. burgdorferi B31 ospD, dbpA, dbpB, and the erp genes revealed higher numbers of some R/M motifs than would be expected by chance, but there was also a high number ($25%) of false-positive sequences. However, there is no appreciable difference (as determined by the log fold change [logFC]) between the numbers of motifs encoded within these genes and their synonymously shuffled counterparts (Table S2, CodonShuffle). Together, these data indicate the clustering of R/M motifs in ospC but not in other plasmid-borne genes for which there is also evidence of horizontal gene transfer, indicating an ospC-specific selective pressure.
The number of motifs and the prevalence of methylated motifs are independent of the replicon and strain, with some striking exceptions. To determine if motif prevalence varies among replicons ( Fig. S2 and S3), the numbers of encoded m6A sites were compared to the length of the replicon and found to correlate (Fig. 5). Therefore, the poor correlation between the number of methylated motifs and the total number of motif sites on a replicon can be explained by either variable methylase activity or the absence of the responsible R/M locus. Analysis of methylated motifs revealed that both the CGRKA and GNAAYG motifs of B. burgdorferi B31 and the GGAYG motif of B. afzelii BO23 significantly correlate with the number of motif sites present on the replicons, regardless of the derivative being analyzed, as would be expected with methylation of ;90% of encoded motifs (Table 1 and Fig. S4A). Similarly, methylation of CTCRRA and GGAYC of B. afzelii PKo, CAGC of B. garinii PBr, VCAAYG of B. burgdorferi N40, and GNAAYC of B. burgdorferi CA-11 and 297 significantly correlated with the number of motifs present on the replicons and methylated about 80% of their encoded motifs despite the fact that these were sequenced as nonclonal cultures (Table 1 and Fig. S4B and C). In striking contrast, however, the numbers of methylated H/WCAG motifs in B. burgdorferi B31, HMAAGG motifs in B. afzelii PKo, and CMAAYC motifs in B. garinii PBr do not correlate with the numbers of the respective motif sites present on the replicon and were found to methylate ,65% of their respective sites (Table 1 and Fig. S4). The partial methylation of gDNA by BBH09 in B. burgdorferi B31 likely reflects the variable activity of the enzyme in clonal B31 derivatives (A3, A3 Dbbe02, A3-68, A3-68 Dbbe02, and A34). However, we cannot say if the incomplete methylation of gDNA from nonclonal cultures of B. afzelii PKo and B. garinii PBr reflects the activities of the enzymes bbe02 and bbq67 transcript levels are higher in culture than in vivo. While the R/M enzymes of B. burgdorferi B31 are active in vitro, with BBE02 and BBQ67 methylating .95% of all genomic sites in clonal populations (Table 1), the transcription of these genes in vivo is unknown. Therefore, we measured bbe02 and bbq67 transcript levels in B. burgdorferi B31-A3 spirochetes grown in vitro and throughout the in vivo mouse-tick infectious cycle using qRT-PCR (Fig. 6). Transcripts for both genes were present at low levels in vitro relative to the constitutive flaB gene, and neither was detectable by qRT-PCR in total RNA isolated from infected mouse tissues. Analysis of B. burgdorferi B31-A3-infected ticks revealed low levels of bbe02 and bbq67 transcripts in larval ticks and unfed nymphs, with a transient increase in the levels of both transcripts in nymphs after feeding to repletion. Thus, bbe02 and bbq67 are expressed at very low levels throughout the mouse-tick infectious cycle, with a relative increase in transcript levels immediately following the nymphal blood meal.
Methylation by BBE02 and BBQ67 does not control the expression of genes in B. burgdorferi B31-A3. Previous analyses detailing the impact of bbq67 on shuttle vector transformation efficiency and epigenetic regulation have relied on strains lacking lp56 (9)(10)(11)(12)(13)39). Therefore, deletion of the bbq67 locus was conducted to assess the effects of gene regulation on cells containing lp56. RNA-seq was performed on B. burgdorferi B31-A3 and B31-A3 derivatives containing isogenic deletions of bbe02, bbq67, FIG 4 Heat map of methylated motifs per 1,000 bp within select genes of sequenced Lyme disease Borrelia spirochetes. The numbers of methylated motifs within each gene are displayed as the ratio of methylated motifs to the gene length per 1,000 bp so that each column (strain) totals 1. Each strain and derivative is displayed as a different color below the motif. and both bbe02 and bbq67. Analysis of the RNA-seq data revealed that there was no obvious difference in gene expression regardless of the presence or absence of the R/M gene(s) when visualized with a principal-component analysis (PCA) plot to determine and visualize the maximum variation between samples (Fig. S5A). Additionally, both edgeR and DESeq2 analyses, which both normalize and analyze the data to determine differentially New Insights into the Borrelia Methylome ® expressed genes, were conducted, with similar results. A comparison of strains containing isogenic deletions of bbe02, bbq67, and both bbe02 and bbq67 revealed two genes that were differentially expressed. No other genes were determined to be significantly differentially expressed by edgeR and DESeq2 (Fig. 7).
As the lack of differential gene expression due to the absence of one or more R/M genes in B31-A3 is quite different than what was reported previously by Casselli et al. (39), additional RNA-seq analyses were performed with independently prepared cDNA libraries of B31-A3, B31-A3 Dbbe02, B31-A3-68, B31-A3-68 Dbbe02, B31-A3 lp28-3 2 , B31-A34, and B31-A34 Tn::H09 (Table S1). However, similar to our previous data set, these seven strains did not reveal significant differences in gene expression that correlated with R/M gene content. The highly passaged B31-A34 and B31-A34 Tn::H09 strains lack 9 plasmids present in the B31-A3 derivatives, causing these strains to cluster separately in the resulting PCA plot. However, the lack of bbh09 in the B31-A34 Tn::H09 strain does not result in differential gene expression compared to B31-A34, the isogenic strain from which it was derived (Fig. S5B). Additionally, edgeR and DESeq2 analyses of B31-A3 derivatives revealed that the majority of differentially expressed genes reflected the absence of lp56 in the B31-A3-68 derivatives (Fig. 8).
To confirm the RNA-seq data, qRT-PCR was performed to include differentially regulated genes with a log fold change of .1. Overall, qRT-PCR could not confirm all differentially expressed genes under a less stringent log fold change of .1 ( Table 2, Fig. 7, and  Table S2, increased and decreased expression).
In addition to the differential expression of numerous genes, Casselli et al. reported the differential expression of the alternative sigma factor rpoS and posttranscriptional regulators of rpoS, such as bbd18, in strains lacking bbe02 and bbq67 (39). As rpoS is a transcription factor that regulates a large number of genes, this would have a significant cDNA of total RNA from B. burgdorferi B31-A3 in vitro or within I. scapularis at different stages underwent qPCR. bbe02 and bbq67 were expressed at lower levels within I. scapularis than in in vitrogrown organisms, with bbq67 being present at undetectable levels in larvae and feeding nymphs. The exception is the expression of bbe02 at nymphal repletion. Significance was determined by Dunn's multiple comparison of the Kruskal-Wallis test (n = 3 biological and 3 technical replicates each). **, P value of ,0.002; ***, P value of ,0.0002.
impact on global gene expression (58)(59)(60)(61)(62)(63)(64). We therefore carefully examined the expression of sigma factors and response regulators in both RNA-seq data sets and further verified the relative levels of rpoN, rpoS, and bbd18 transcripts by qRT-PCR. Consistent with our previous analyses and different from those of Casselli et al. (39), the expression levels of sigma factors and response regulators did not differ significantly between isogenic B31 strains containing and those lacking R/M loci by either analysis (Table S2).

DISCUSSION
DNA modification by R/M enzymes has been recognized as having diverse impacts on microorganisms (14)(15)(16). In B. burgdorferi type strain B31, two plasmid-encoded type II R/M enzymes, BBE02 and BBQ67, significantly impact the stable uptake and incorporation of exogenous DNA (9-13). Analysis of 8 B. burgdorferi B31 derivatives New Insights into the Borrelia Methylome ® along with B. burgdorferi N40, CA-11, and 297; B. afzelii PKo and BO23; and B. garinii PBr revealed 9 methylation motifs. In addition to confirming the recently described GNAAYG (modified by BBE02) and CGRKA (modified by BBQ67) motifs, an H/WCAG motif modified by BBH09 was identified within B. burgdorferi type strain B31 (39). Despite the high sequence identity among BBE02 (;64 to 98%) and BBQ67 (;65%) orthologs analyzed in this study, the majority of methylated motifs are distinct and novel, suggesting a possible impact on genetic exchange between species and strains ( Table 1 and Fig. 1 and 2). However, orthologous genes contain similar motif specificities, as observed in B. burgdorferi CA-11 and 297, which methylate the same motif, GNAAYC, which is similar to the B. burgdorferi B31 BBE02 motif ( Fig. 1 and 2). The cause for motif variability, therefore, is more likely through the acquisition, loss, and pseudogenization of paralogous R/M genes than by sequence evolution among orthologs. While the majority of genomes analyzed carry R/M genes without indels, B. burgdorferi 297 lacked lp28-5, which carries the R/M genes Bbu297_y09 and Bbu297_y05 and  contains several silent and missense variations within its other R/M genes, Bbu297_h03 and Bbu297_j01. Additionally, all B. burgdorferi N40 derivatives contain a single missense variation within the R/M gene Bbun40_e01, and BLA32_04945 in B. afzelii BO23 contains a frameshift (see Table S2 in the supplemental material [R/M sequencing]). However, it is unknown if the identified indels would inactivate the resulting R/M enzyme. All previously and newly identified Borrelia motifs were of the m6A type, and the number of motifs identified within each genome corresponds to the number of encoded, intact m6A R/M enzymes, except for B. burgdorferi CA-11 (Table 1 and Fig. 2). Further work is required to assess and confirm the absence of cytosine methylation within all sequenced B. burgdorferi N40 derivatives that carry Bbun40_y07, a homolog of a Haemophilus haemolyticus cytosine methyltransferase (49). Similar to other bacterial R/M systems, the majority of Borrelia R/M genes analyzed in this study methylated $95% of their respective motifs (20). However, the H/WCAG motif of B. burgdorferi B31, the GNAAYG and CGRKA motifs of B. burgdorferi B31-MI, the GNAAYC motif of B. burgdorferi 297, the HMAAGG motif of B. afzelii PKo, and the CMAAYC and CAGC motifs of B. garinii PBr were not efficiently methylated. Apart from Bbu297_h03, Bbu297_j01, Bbun40_e01, and BLA32_04945, it is not believed that mutations have resulted in this decrease in methylation activities as all other R/M genes lack indels ( Successful transformation of B. burgdorferi strain B31 derivatives largely depends on the presence or absence of R/M genes and on the transforming DNA. As previously noted, B. burgdorferi B31 derivatives that lack one or more R/M genes, such as B31-A34 and B31-S9, are more readily transformed with shuttle vectors (9)(10)(11)(12)(13). Additionally, the transformation of B. burgdorferi N40 was shown to be enhanced after the deletion of Bbun40_e01 (65), and shuttle vector transformation has been successful in B. burgdorferi CA-11.2A (34) and 297 (64) and B. afzelii BO23 (66). However, the impacts of the R/ M genes in B. afzelii PKo and B. garinii PBr on the transformation efficiency are unknown. As the relative number of B. burgdorferi B31 CGRKA and GNAAYG motifs within the shuttle vectors pBSV2, pBSV2G, and pKFSS1 is consistent with previously reported transformation efficiencies (13), it is believed that the large number of B. afzelii PKo and B. garinii PBr motifs in these shuttle vectors would have a negative impact on the transformation efficiency.
In B. burgdorferi B31, it has been shown that in vitro methylation with the CpG methyltransferase M.SssI increased the rate of pBSV2 and pKFSS1 shuttle vector transformation in a BBQ67-dependent manner. Investigation of these shuttle vectors revealed that all CGRKA sites and ;75% of GNAAYG sites contain a cytosine that would be methylated by M.SssI (12). Furthermore, previously identified sites protected from RsaI cleavage by BBQ67 methylation were found to overlap CGRKA methylated motifs in pBSV2, pBSV2G, and pKFSS1 (13) (Fig. S6). As the presence of R/M genes primarily affects shuttle vector transformation and has relatively little influence on the rate of allelic exchange, the construction of a shuttle vector lacking the CGRKA and GNAAYG motifs should allow the more efficient transformation of B. burgdorferi B31 derivatives expressing BBQ67 and BBE02 (9)(10)(11)(12)(13)67).
Our attempts to create a functional shuttle vector devoid of all CGRKA and GNAAYG motifs have been unsuccessful to date. The removal of these sites in the ColE1 origin of replication or within broad-host-range plasmids such as pEP2 resulted in plasmids that were unable to replicate within Escherichia coli (Table S2, ori). However, codon optimization and the removal of CGRKA and GNAAYG motifs within the flg promoter and the antibiotic cassettes aadA1 (GenBank accession number MW473471), aacC1 (accession number MW473470), and aph (accession number MW473472) conferred resistance to the respective antibiotics in E. coli. Furthermore, modified flgp-aph was shown to confer kanamycin resistance to B. burgdorferi B31-A3 following allelic exchange at bbq67.
Recent work has demonstrated that a putative methyltransferase gene of Babesia bigemina, a pathogen associated with tick vectors, is expressed only within the tick New Insights into the Borrelia Methylome ® environment (68). Analysis of the relative levels of bbe02 and bbq67 transcripts showed significantly higher expression levels in vitro than within B. burgdorferi B31-A3-infected larvae and nymphs, except immediately following the detachment of fully engorged nymphs from naive mice (Fig. 6). However, the role of R/M gene expression within the tick, where horizontal gene transfer among Borrelia bacteria likely occurs, remains unknown (67). Analyses of B. burgdorferi B31 genes that have strong evidence for gene transfer, such as ospC (54), ospD (55), the erp genes (37,56), and the decorin binding protein (dbp) genes (57), revealed that there is no appreciable difference in the numbers of motifs when the codons are synonymously shuffled (as determined by the log fold change [ Table S2, CodonShuffle]). This may indicate that motif presence is driven by codon bias. However, a comparison of these genes with randomly assembled noncoding sequences of the same length revealed that 6 out of 15 genes and their flanking regions contained significantly more GNAAYG or CGRKA motifs than would be expected by chance (Table S2, genes of interest). ospC was the only gene analyzed to contain significantly higher numbers of both GNAAYG and CGRKA motifs than would be expected by chance. Additionally, ospC carried by the other strains in this study showed a significantly higher rate of occurrence of at least one of the newly identified motifs. Further analysis into the presence of the other known Borrelia R/M motifs in the B31 ospC gene revealed that the majority of identified motifs were also present within the central portion of the coding region. A previous investigation into recombination via gene transfer at ospC demonstrated the highest recombination breakpoints and diversity in the middle of the coding sequence (54). This leads us to speculate that the presence of Borrelia R/M motifs within ospC could facilitate the transduction of cp26 DNA or recombination via gene transfer (34,54,69). Therefore, while R/M motifs are typically viewed as a means to degrade foreign DNA, in the case of ospC, and perhaps other horizontally transferred DNAs, it may be beneficial by providing breakpoints for packaging DNA into phage or for recombination into the native locus. The utilization of R/M motifs for this purpose would provide a higher probability of generating novel, chimeric, immunodominant serotype-defining ospC alleles (54).
In an attempt to understand the incomplete methylation of HCAG motifs within B. burgdorferi B31, the methylation patterns of each replicon were analyzed. This revealed that the number of CGRKA, GNAAYG, or HCAG motifs correlates with the length of the replicons, while the number of methylated HCAG motifs does not ( Fig. 5 and Fig. S4). Neither the absence of bbe02 and bbq67 in B. burgdorferi B31-A3-68 Dbbe02 and B31-A34 nor the overexpression of bbh09 improves the methylation of HCAG motifs (Table S2, A3-68-LS). While this could be due to the activity of the enzyme, it could also be due to a blockage of the HCAG motif by DNA binding proteins. Of the 16,677 HCAG motifs within the B. burgdorferi B31 genome, there are 8 BosR binding sites (70, 71), 2,203 EbfC binding sites (72), and 10,362 BpaB binding sites (73) within 100 bp surrounding an HCAG motif, while there are 0 and 4 BosR sites, 280 and 394 EbfC sites, and 1,506 and 1,861 BpaB sites within 100 bp of CGRKA and GNAAYG motifs, respectively. Within all B31 derivatives except A34, there are 2,617 HCAG motifs that are consistently methylated and 4,682 that are never methylated. Further analysis of 100 bp around the 4,682 unmethylated motifs revealed that 1 contains the BosR binding site, 540 contain the EbfC binding site, and 2,606 contain the BpaB binding site. In comparison, there are 179 unmethylated CGRKA motifs and 418 GNAAYG motifs within B31-MI, with 0 and 1 BosR sites, 15 and 70 EbfC sites, and 103 and 221 BpaB sites within 100 bp, respectively. While this does not explain all of the unmethylated HCAG motifs, DNA binding proteins may have a negative impact on DNA methylation by BBH09.
Although the majority of methylated motifs occur fairly consistently among the replicons of each Borrelia genome analyzed, the CAGC motif of B. garinii PBr is highly prevalent (174 sites within ;5,000 bp) and methylated at the left telomeric end of lp28-3 until it encounters the 39 end of the bgapbr_h0006 R/M enzyme ( Fig. 4 and Fig. S3). Unfortunately, as B. garinii PBr encodes two R/M enzymes, it is unknown whether this motif is methylated by BGAPBR_H0006. These motifs lie within the vlsE expression locus and vls silent cassettes of B. garinii PBr. In fact, a closer analysis of the vls loci within B. burgdorferi B31, 297, and N40 and B. afzelii PKo revealed a high concentration of CAGC motifs within these strains and derivatives as well. As the high number of Borrelia R/M motifs within the ospC gene is presumed to aid in the horizontal transfer and antigenic variation of immunodominant OspC, the high number of CAGC motifs within the vls genes may aid in antigenic variation through gene conversion of vlsE within spirochetes.
While a previous investigation of the B. burgdorferi B31 methylome concluded that the BBE02 and BBQ67 R/M enzymes had a global impact on gene regulation, our current study failed to find any significant evidence for the epigenetic regulation of the Borrelia transcriptome by either BBE02 or BBQ67 (Table 2 and Fig. 7). Despite lowering the stringency of RNA-seq analysis to identify differentially expressed genes, these could not be consistently validated by qRT-PCR (Table S2, increased and decreased expression). Furthermore, the current study analyzed B. burgdorferi B31-A3, B31-A3 Dbbe02, B31-A3 Dbbq67, and B31-A3 Dbbe02 Dbbq67 that retained all native plasmids, while the previous study analyzed RNA-seq data generated from B. burgdorferi B31-A3 and B31-A3 Dbbe02 that lacked lp38 and from 5A18-NP1 Dbbe02 that lacked lp28-4 in addition to lp56 (39). Additionally, the striking impact of R/M gene content on the expression of sigma factors and posttranscriptional regulators evidenced by Casselli et al. was not reproduced in two independently generated RNA-seq data sets during our current analysis. Variable culture conditions leading to differential sigma factor expression were most likely responsible for the putative epigenetic impact of R/M loci on gene expression in the previous study (39) and would explain this discrepancy.

MATERIALS AND METHODS
Phylogenetic tree construction. The BBE02 and BBQ67 nucleotide and amino acid sequences were used to identify similar genes within both Lyme disease and tick-borne relapsing fever spirochetes within the NCBI database as well as to search for Borrelia R/M genes within REBASE (see Table S1 in the supplemental material [R/M homologs and REBASE]). The identified proteins were aligned with Clustal Omega, and the percentages of identity and similarity between genes were determined with the Ident and Sim sequence analysis tool from the Sequence Manipulation Suite (74,75). The phylogenetic tree was inferred by approximately maximum likelihood methods implemented through FastTree (76). This analysis utilized the JTT (Jones-Taylor-Thornton) model of amino acid evolution with the CAT approximation to account for the various rates of evolution across sites and estimated the local support values with the Shimodaira-Hasegawa (SH) test (77)(78)(79). The resulting tree was visualized with iTOL v 4.4.2 (80).
B. burgdorferi strains and growth conditions. B. burgdorferi strains were cultured in Barbour-Stoenner-Kelly II (BSKII) medium supplemented with 6% rabbit serum (PelFreez Biologicals, Rogers, AZ) and the appropriate antibiotics (streptomycin at 50 mg/ml and kanamycin at 200 mg/ml) at 35°C under 2.5% CO 2 (81). B. burgdorferi B31 MI obtained from MedImmune (now AstraZeneca, Gaithersburg, MD) was subjected to PacBio sequencing as this was the gDNA used to generate the B31 genomic sequencing data for the type strain. B. burgdorferi strain B31-A3, an infectious clone derived from B31 MI lacking cp9, was used as the wild-type strain in this study (41). Cloning vectors were propagated using E. coli strain TOP10 (Invitrogen, Carlsbad, CA). All B. burgdorferi strains and derivatives along with plasmids utilized in this study are shown in Table S1 (bacterial strains).
Assembly of constructs and transformation of B. burgdorferi. B. burgdorferi was transformed by electroporation as previously described (82). Competent B. burgdorferi bacteria were freshly prepared from an exponential-phase culture and electroporated with 15 to 30 mg of plasmid DNA prepared from E. coli. Transformants were confirmed by PCR and sequencing, and the plasmid content was determined to ensure that no plasmids were lost during transformation (83).
The antibiotic cassettes aacC1 (gentamicin 3-N-acetyltransferase), aadA1 (streptomycin 3-adenylyltransferase), and aph (aminoglycoside phosphotransferase) (kanamycin resistance) were optimized for electroporation into B. burgdorferi B31 through the removal of CGRKA and GNAAYG sites along with Borrelia codon optimization (GenScript, Piscataway, NJ). These optimized cassettes were driven by a B. burgdorferi R/M-minus flg promoter and transformed into E. coli TOP10 cells as part of a pUC57 vector. These optimized antibiotic cassettes were shown to confer the appropriate antibiotic resistance (gentamicin at 5 mg/ml, spectinomycin at 100 mg/ml, and kanamycin at 50 mg/ml) in transformed E. coli. Additionally, flg-driven aph was electroporated into B. burgdorferi B31-A3 and conferred kanamycin resistance to transformed cells through allelic exchange at bbq67. gDNA was used to generate 16 barcoded SMRTbell libraries (adapter kits 8A and 8B) and subjected to SMRT sequencing according to the manufacturer's instructions for multiplex microbial SMRTbell libraries v2 (Pacific BioSciences, Menlo Park, CA) with the following modifications. Two pools were independently generated for sequencing on two SMRT cells on the PacBio Sequel platform (Pacific BioSciences, Menlo Park, CA). The primer and polymerase were annealed to the first pool, which consisted of all 16 libraries, according to a non-size-selection protocol. The second pool was the same except for following the optional size selection protocol. Each SMRT cell underwent diffusion loading with a preextension time of 120 min and a 10-h movie time. The non-size-selected pool was loaded at 4.5 pM, while the sizeselected pool was loaded at 7 pM. For each SMRT cell, SMRT Link RunQC showed a P1 value of .50% with N 50 longest subreads of 9,250 bp and 12,750 bp, respectively. Reads were processed and mapped to the appropriate references using the pbsmrtpipe ds_modification_detection and sa3_ds_resequencing_fat pipelines (Pacific BioSciences). The references used for assembly were GenBank accession numbers AE000783 to AE000794 and AE001575 to AE001584 for B31; CP001651 and CP002227 to CP002242 for N40; ABJV02000001 to ABJV02000005 and CP001301 to CP001311 for PBr; CP002933 to CP002950 for PKo; CP001653, CP002312, and CP002253 to CP002270 for 297; ABJY02000001 to ABJY02000014 and CP001473 to CP001484 for CA-11; and CP018262 to CP018293 for BO23. Only base calls that had a quality value of 20 or higher were used for analysis.

New Insights into the Borrelia Methylome
To determine if the number of encoded motifs is higher than would be expected by chance, the Analysis of Motif Enrichment (AME) program of the MEME suite 5.0.5 package was utilized (52,84). This analysis shuffles the FASTA sequences of each genome or gene to conserve 2-mer frequencies and create, at minimum, 1,000 randomized control sequences for motif comparison. AME then determines if each motif is enriched in the primary sequence using one-tailed Fisher's exact test and performs partition maximization over all possible position-weight matrix (PWM) thresholds, with false (control sequences) and true (input sequences) positives determined by their PWM scores (52). Codons within genes were synonymously shuffled by utilizing the N3 script of CodonShuffle in which the third position of each codon was shuffled to generate 1,000 shuffled sequences for each gene (53).
The number of motifs and the number of methylated motifs per 1,000 bp were determined and mapped using Circos version 0. 69-7 (85) and analyzed using Pearson correlation. Graphs were generated using the ggpubr package of the ggplot2 version 3.2.0 tool of the tidyverse package in R (86).
cDNA libraries for RNA-seq and quantitative PCR (qPCR) were generated from these total RNAs. cDNA libraries used for RNA-seq were synthesized using Ribo-Zero and ScriptSeq complete bacterial kits with indexing primers for the synthesis of directional libraries (Illumina, San Diego, CA) according to the manufacturer's instructions. The   FIG 9 bbq67 isogenic deletion. To create the B31-A3 Dbbq67 and B31-A3 Dbbe02 Dbbq67 strains, the 3,261-bp bbq67 gene was replaced with the 919-bp flgB::aphA1 antibiotic resistance cassette in the same orientation as bbq67.
quality and quantity of the resulting cDNA libraries were assessed with an Agilent DNA 1000 assay on an Agilent 2100 Bioanalyzer (Agilent) and with a Kapa library quantification kit (Kapa Biosystems, Wilmington, MA) prior to RNA-seq. RNA-seq libraries were sequenced on an Illumina MiSeq platform using v3 chemistry with 2-by 75-bp reads.
RNA-seq reads were compiled, filtered to remove any reads with PHRED scores of less than 10, and aligned to the B. burgdorferi B31 genome (GenBank accession numbers AE000783 to AE000794 and AE001575 to AE001584) using bowtie2 (87). Reads for annotated genes were determined using featureCounts (88,89). Differential expression analysis was conducted with edgeR (90) and DESeq2 (91).
For analysis of bbe02 and bbq67 expression within the tick vector, the total RNA and resulting cDNA from B. burgdorferi B31-A3-infected Ixodes scapularis ticks that were previously described were used for qPCRs (92,93). To generate cDNA used for qPCR from B. burgdorferi cultured in vitro, 1mg of RNA was reverse transcribed using the high-capacity cDNA reverse transcriptase kit (Life Technologies) according to the manufacturer's instructions. qPCRs were performed using IQ SYBR green supermix (Bio-Rad Life Sciences, Hercules, CA) with gene-specific primer sets (500 nM) (Table S1). Reactions were performed so that each experiment contained a biological and technical triplicate on a Viia7 real-time PCR system (Applied Biosystems, Foster City, CA) and analyzed with PRISM software. Negative-control reactions of primers lacking a template and on RNA samples that underwent cDNA reactions in the absence of reverse transcriptase were performed with each reaction to ensure that threshold cycle (C T ) values were not obtained from primer-primer interactions or from contaminating genomic DNA. Additionally, the melt curve was analyzed for each reaction. All primers used for qPCR are listed in Table S1 (oligonucleotides).
Data availability. Sequences of standard Borrelia selectable markers that were optimized for B. burgdorferi B31 transformation by the removal of CGRKA and GNAAYG motifs are located in GenBank under accession numbers MW473470 (aacC1), MW473471 (aadA1), and MW473472 (aph). The RNA-seq data are available under GEO accession number GSE169460.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only.