A Persisting Nontropical Focus of Burkholderia pseudomallei with Limited Genome Evolution over Five Decades

Burkholderia pseudomallei is predominantly a tropical pathogen uncommonly found in the environment of temperate climatic regions. It is unclear if introduction into temperate regions is sporadic and temporary or if B. pseudomallei can persist in such environments. B. pseudomallei was identified in the environment of southwest Western Australia with melioidosis cases between 1966 and 1991. We report a new cluster with 23 animal fatalities in the same region from 2017, with B. pseudomallei again being recovered from the environment. Comparison of the isolates from the first and second clusters using genomics revealed a single sequence type, high clonality, and limited recombination, even though the time of recovery of the isolates spanned 51 years. This is a major contrast to the extensive genomic diversity seen in the tropics. Our data support the suggestion that B. pseudomallei has the ability to persist in nontropical environments, potentially in a latent state, and has the ability to activate following favorable conditions (rainfall) and then infect animals and humans.

soil samples, and Fig. 3 shows the geographical locations of Moondyne and Gidgegannup in southwest Western Australia in comparison to the locations of previous melioidosis cases in that region. The Moondyne Farm owner reported that the alpacas were located at location 8 on the farm and that soil from this site was culture negative, but the adjacent sampling locations were those that were culture positive (Fig. 2).
All animal and environmental isolates belong to ST-284. In silico multilocus sequence typing (MLST) showed that all nine Moondyne Farm-Avon Valley animal outbreak and soil isolates and the parrot isolate (Table 1,  Phylogenetic analysis of the ST-284 strains from the first and second clusters. Phylogenetic analysis was initially performed solely on the genomes of the ST-284 isolates from the first and second clusters (n ϭ 22) to determine the phylogenetic relatedness of the ST-284 isolates. The 22 genomes were mapped with 60ϫ to 120ϫ coverage against the genome of the reference ST-284 isolate, isolate MSHR0169. Despite the isolation of these B. pseudomallei genomes over more than 5 decades, phylogenetic analysis of the 22 genomes revealed that the 22 ST-284 isolates were clonal, with 532 core-genome single-nucleotide polymorphism (SNP) differences being detected across the ST-284 phylogeny (Fig. 4). The population structure of the 22 isolates was further defined using the RhierBAPS program, based on a core-genome SNP mapping alignment, which divided the 22 isolates into 4 primary sequence clusters (Bayesian analysis of population structure [BAPS] hierarchical level 1). These were further subdivided into 8 lineages (BAPS level 2) (see Table S1 in the supplemental material). Two of the primary sequences that clustered at level 1, sequence clusters 2 and 4, comprised sequences from isolates from both melioidosis time clusters (Table 2), while sequence clusters 1 and 3 consisted of isolates collected during the first southwest Western Australian melioidosis cluster (Table 2). Time cluster 2 included three isolates that were distributed across different branches of the maximum parsimony (MP) tree. The characteristics of each cluster are listed in Table 2. We also constructed a recombination-adjusted phylogeny (Fig. S2); strain clustering was consistent with the phylogenetic clustering observed with the MP SNP tree.
The ST-284 isolates did not completely group together on the basis of the year or the location of isolation but were highly conserved (Fig. 4). Eight strains recovered from Moondyne Farm in 2017 grouped together, belonged to level 1 sequence cluster 4, and were conserved with nine SNPs, and they also clustered most closely to isolates (separated by eight SNPs) from two distinct geographical regions that had been isolated 37 years earlier (in 1980 and 1987). One isolate recovered from a parrot in 2017 was distant to the cluster containing the eight Moondyne Farm 2017 isolates and two isolates recovered in the 1980s, being separated by 116 SNPs. The remaining soil isolate (isolate MSHR1039) from Moondyne Farm formed its own lineage that was distinct from the other Moondyne Farm isolates, which were located 146 SNPs away. Level 1 sequence cluster 3 contained a cluster of seven clonal isolates from four geographical regions outside of the Moondyne Farm that were isolated between 1966 and 1985.
For the cluster that occurred in 2017, there were phylogenetic links between the animals and the soil taken from Moondyne Farm (Fig. 4, purple branches). The isolates from alpacas A (isolate MSHR10305), B (isolate MSHR10306), and D (isolate MSHR10309) all clustered together within level 1 sequence cluster 4 and were closely related to the soil isolate (isolate MSHR10489) from location 10 of Moondyne Farm, being separated from the soil sample isolate by only four SNPs. Identical isolates from alpaca C (isolates MSHR10307 and MSHR10308) clustered together in level 1 sequence cluster 4 and were most closely related to soil isolates (isolates MSHR10320 and MSHR10380) from Moondyne Farm location 9, being separated by only three SNPs. The isolate from the scarlet macaw parrot (isolate MSHR10310) clustered separately from the isolates from the Bootstrapping revealed that the phylogenetic tree had moderately supported to well-supported nodes, with the branch support values ranging from 80 to 100. Branch colors indicate cluster 1 or cluster 2 (with the first cluster occurring between 1966 and 1992 and the second cluster occurring in 2017); the inner strip delineates the location of isolation; the outer strip delineates the isolate source, which is followed by the year in which the sample was collected and the BAPS cluster for each isolate, as defined by the RhierBAPS program; and the gray highlighted regions delineate transmission events. The geographical location for isolate MSHR0164 is unknown, and so the location of MSHR0164 was not included. Black circles on the branches denote bootstrap values of Ͼ80. The consistency index for the tree was 0.9963, and the homoplasy index was 0.0037. alpacas and did not cluster closely to any of the soil isolates from Moondyne Farm, with the closest soil isolate (MSHR10489) being 121 SNPs different from the parrot isolate. The phylogenetic relatedness of the animal and environmental ST-284 isolates from the first southwest Western Australia cluster (Fig. 4, orange branches) has been described previously (10). Phylogenetic reconstruction of ST-284 strains from the first cluster with the isolates from the second cluster did not considerably alter the phylogenetic structure of the ST-284 cluster 1 isolates (10).
Southwest Western Australia isolates form a single clade on the global B. pseudomallei phylogeny. We then undertook a phylogenetic analysis of the genomes of the 22 southwest Western Australia isolates with a set of global genomes to determine the phylogeographical origin of the southwest Western Australia isolates. Within our global phylogeny, isolates from the Avon Valley and surrounding area were highly similar and nested within their own well-supported clade within the Australian clade (Fig. 5A). Isolates from Australia (Northern Territory, Queensland, and Western Australia) were interspersed throughout the Australasian clade, with some geographical structure being detected in the lineage, and other Western Australia isolates were distantly related to the isolates from the Avon Valley and strains from the surrounding area (Fig. 5B).
Limited evidence of recombination among the ST-284 isolates. To determine the contribution of recombination to the limited diversity noted for the ST-284 isolates, we examined the distribution of SNPs using the Gubbins program. We did not identify recombinogenic SNPs or regions associated with recombination in strains (n ϭ 10) from the 2017 cluster. However, we found evidence of recombination in two strains from the first cluster: strains MSHR0162 (isolated from a sheep in 1968; number of recombinogenic SNPs ϭ 35, number of recombinogenic SNPs occurring in recombinogenic blocks ϭ 1, and number of bases associated with the recombinogenic block ϭ 14.4 kb) The node belonging to the non-Australasian strains (indicated by the gray circle, containing 124 strains) has been collapsed. The consistency index for the tree was 0.1620, and the homoplasy index was 0.8380. and MSHR0172 (isolated from a dog in 1985; number of recombinogenic SNPs ϭ 5, number of recombinogenic SNPs occurring in recombinogenic blocks ϭ 1, and number of bases associated with the recombinogenic block ϭ 0.56 kb). The Gubbins program also identified SNPs that arose due to point mutations rather than recombination events, and there was evidence for SNPs occurring outside recombinogenic regions for 18 of the southwest Western Australia strains (range, 1 to 144 SNPs).
Protein-coding regions unique to isolates from southwest Western Australia. Comparative genomics identified 14 protein-coding regions that were present in all temperate southwest Western Australian isolates but that were absent from a set of tropical Australian B. pseudomallei isolates (Fig. 6). Using the corresponding assemblies annotated with the Prokka program, all 14 protein-coding regions were annotated as hypothetical protein-coding regions and did not have any matches to annotated protein-coding regions in the UniProt database (https://www.uniprot.org/), and so their function is unknown. We did not identify protein-coding region differences between cluster 1 and cluster 2 isolates. We then used Island Viewer (v4; https://www .pathogenomics.sfu.ca/islandviewer/) to determine if the 14 protein-coding regions were located within GIs and demonstrated that the 14 protein-coding regions were located on GIs (Table S4).
GIs and virulence genes in ST-284 are ubiquitously present or absent. To further assess the genomic diversity of the 22 ST-284 strains, we examined the distribution of 38 known genomic islands and six variably present virulence genes. Despite the ST-284   Table S2), and GI16b (comprised of 7 genes; Table S2)-and lacked the remaining 30 GIs. At GI1.1, GI7.4, and GI16b, the coding sequences (CDSs) were identical for the ST-284 strains, and minor variations were noted in the CDSs belonging to GI14, GI14.1, GI4.3, and GI15e (see Table S2 for the BLAST score ratio [BSR] scores for the CDSs belonging to the eight identified GIs). Minor variations were present in GI13.4 across the ST-284 strains, and two protein-coding regions belonging to GI13.4 were truncated in MSHR0160 (the sole isolate from Chittering Farm 2): an integrase core domain (protein-coding region BDL_3619; BSR ϭ 0.37) and a putative gp37 (proteincoding region BDL_3620; BSR ϭ 0.29). These truncations were not observed in any other ST-284 strain. Encompassed in the eight GIs (GI1.1, GI4.3, GI7.4, GI13.4, GI14, GI14.1, GI15e, and GI16b) identified in the southwest western Australia strains were genes typically present in GIs, such as genes involved in metabolism, prophage genes, and genes involved in the integration of GIs into the genome, including integrase, transposase, resolvase, and recombinase genes (Table S2). We also determined the presence of the eight GIs in the global strain set (Table S1), and the eight GIs were highly varied across the global data set, with the global genomes either containing complete GIs, lacking GIs, or containing partial GIs.
Lastly, the variably present virulence genes that were screened for in the ST-284 strains were universally present or absent.

DISCUSSION
While melioidosis is traditionally defined as a disease of the tropics and previous models demonstrated that tropical regions have environments favorable for B. pseudomallei, evidence is mounting for the persistence of B. pseudomallei in both arid regions (20) and temperate regions (10,21). Since 1966, melioidosis cases have been described in the Avon Valley and surrounding regions (latitude ϳ31.6°S). B. pseudomallei was first detected in farm animals, and then in 1980 B. pseudomallei was detected in the environment and in 1992 the first human case was reported (16)(17)(18)(19). We hypothesized that additional outbreaks would continue to occur in the region following severe weather events (10). In late January 2017, heavy rain occurred in southwest Western Australia, 17 days prior to the reported alpaca deaths on Moondyne Farm, which fits with the normal incubation period for acute melioidosis of 1 to 21 days and which fits with death usually occurring within several days of illness onset in fatal cases (22). We isolated B. pseudomallei in soil from the farm where the alpacas died, confirming the prolonged presence of B. pseudomallei in the environment of southwest Western Australia. The 10 isolates collected from the 2017 cluster were all ST-284, the same ST identified in the cluster 1 isolates, revealing that the second cluster was not the result of another B. pseudomallei introduction into the region but, rather, represents the long-term low-prevalence endemicity of B. pseudomallei.
Phylogenetic analysis with the B. pseudomallei isolates available revealed the likely origin of the ST-284 melioidosis cases in the four alpacas. The alpaca isolates clustered with culture-positive soil isolates from two separate Moondyne Farm locations, with each alpaca genome being separated from the respective soil samples by only two or four SNPs, suggesting that the alpacas most likely acquired their infections from soil at Moondyne Farm. Necropsy confirmed that the alpacas died from extensive and severe melioidosis pneumonia, consistent with the inhalation of B. pseudomallei which was potentially aerosolized during the severe weather. Studies combining epidemiology (23), climate science (24), clinical and pathological data (25), and bacterial genomics (26,27) are providing new insights into the postulated shift from percutaneous inoculation to aerosol inhalation following severe weather events. Nevertheless, the proportions of melioidosis cases from inhalation, percutaneous inoculation, and ingestion remain unclear (28).
The clonality observed among the southwest Western Australia strains contrasts to the extensive within-ST diversity observed in settings where melioidosis is highly endemic, such as northern Australia (29). The genetic diversity of the B. pseudomallei genome can arise quickly via recombination within B. pseudomallei and with nearneighbor bacteria. We found evidence of some limited recombination among the ST-284 strains in the first cluster but not among the ST-284 strains in the 2017 cluster. In support of this, no additional STs were identified in the Avon Valley or surrounding environment during the first or second cluster (10), and the lack of genetically diverse strains in the environment potentially hindered genetic diversification through recombination. Another explanation for limited recombination in the southwest Western Australian isolates is that under suboptimal or harsh conditions, the bacteria enter a dormant state and so are mostly not metabolically active and are thus not recombining with neighbors (30). In support of dormancy and the limited recombination, we found no unique protein-coding regions in the isolates of the 2017 cluster when their sequences were compared to those of the isolates in the first cluster, which would have likely arisen via recombination events. We hypothesize that the genetic diversity described here has most likely accumulated by point mutation during replication bursts occurring during short periods of favorable environmental conditions, such as with heavy rainfall. In support of this, our genomic analysis identified point mutations outside of recombinogenic regions in almost all the ST-284 strains.
It remains unknown as to when or how B. pseudomallei was introduced into the Avon Valley region of southwest Western Australia. A separate and unrelated cluster of five cutaneous human melioidosis cases in suburban southwest Australia beginning in 2012 also remains unexplained (21). Our global phylogeny demonstrated that all ST-284 strains clustered on a separate branch within the Australasian clade and that all shared a recent common ancestor. It is clear that ST-284 is of Australian origin and that a single introduction into the Avon Valley and the surrounding region is likely, as previously proposed (10). We were unable to identify a likely origin of the ST-284 strains due to deep Australian evolutionary nodes being unsupported and the lack of a close relative. The true geographical origin of ST-284 cannot be determined until it or closely related STs have been identified elsewhere in the Australian environment outside of southwest Western Australia.
Speculation about the importation of B. pseudomallei into southwest Western Australia via horses has been made (10,19,31). In 1965, two horses were imported into southwest Western Australia from tropical Kimberley in northern Western Australia, and autopsy confirmed that the horses died from melioidosis. All clinical isolates from those horses were discarded, so the genotypes of the isolates from that scenario were not available. Further environmental sampling of the Kimberley region may therefore shed light on the origin of ST-284. Alternatively, B. pseudomallei may have been introduced into southwest Western Australia via an extreme weather event. In Australia, B. pseudomallei can become aerosolized during severe weather events, such as cyclones, and severe weather has been implicated in the movement of B. pseudomallei in Australia (32), Taiwan (27), and, more recently, the Caribbean (11). While it is uncertain if weather is directly related to the original arrival of B. pseudomallei to southwest Western Australia, the unusually high level of rainfall in late January 2017 was very likely the driver of the outbreak in the alpacas at Moondyne Farm. In support of this theory, the frequency of melioidosis cases has been shown to fluctuate with rainfall, with increased case rates being associated with periods of higher, more intense rainfall (24). During periods of heavy rainfall and increased surface discharge, B. pseudomallei can be leached out of the soil, which may lead to an increased risk of transmission through contact with broken skin and contaminated soil or via inhalation or ingestion of contaminated water particles. As outlined by the Australian Bureau of Meteorology, as a result of climate change, southwest Western Australia is projected to experience a future decrease in total annual rainfall, with more time in drought and extreme heat waves but with an increase in intense heavy rainfall events. Therefore, despite the predicted generally drier future for southwest Western Australia, future heavy rainfall events may result in further clusters of melioidosis in this particular region of southwest Western Australia, with inhalational melioidosis being a notable concern following severe weather events, such as cyclones (28). In addition, other as yet unknown foci of endemic environmental B. pseudomallei in this and other subtropical locations may become recognized by clusters of melioidosis cases occurring in animals and/or humans. Nevertheless, the specific mechanisms of activation of latent B. pseudomallei bacteria residing in the farm soil, bacterial movement within and across the farm environment under the influence of rainfall and local flooding, and the nature and dynamics of aerosolization all remain unclear.
B. pseudomallei is non-spore forming but is a robust pathogen that can persist and survive for decades in diverse environments in the absence of nutrients (33,34). We have demonstrated the persistence of B. pseudomallei in the environment of southwest Western Australia over a 51-year time period, with limited evolution occurring. It is hypothesized that the bacterium enters a quiescent nonreplicative or minimally replicative state, enabling survival in the nontropical environment of southwest Western Australia. We identified 14 unique hypothetical protein-coding regions in the strains from this region that were absent in other Australian B. pseudomallei. Whether some of these hypothetical protein-coding regions are implicated in the survival of B. pseudomallei in temperate climatic regions requires further functional protein analyses. Of note is the postulated ability of B. pseudomallei and some other non-spore-forming bacteria to exist in the environment in a dormant nonreplicative state. It was hypothesized that the geographical and temporal genomic phylogeny of Francisella tularensis in Sweden was consistent with the bacteria being in a nearly dormant stage with minimal replication in the environment, with replication bursts occurring to explain regional outbreaks over a period of 10 years (35), and this has also been observed for Yersinia pestis, the plague-causing bacterium (36). Weather may also change bacterial generation and replication cycles, which may be higher with higher ambient temperatures (37). In contrast to the findings from temperate southwest Western Australia, high levels of genetic diversity have been observed over multiple spatial scales in tropical Australia and northeast Thailand (38)(39)(40).
Estimating the true evolutionary rate of B. pseudomallei is confounded by the "open genome" and the magnitude of lateral gene transfer among strains. The use of molecular clocks assumes that the substitution rates per site are constant through time and that violation of these conditions can lead to erroneous inferences and incorrect evolutionary estimates. We did not observe a clear trend in our 51-year B. pseudomallei data set, finding limited evidence of a correlation between mutation patterns and time. Isolates from the second melioidosis cluster were most closely related to isolates from 24 years earlier, with one 2017 isolate clustering on another clade with isolates from 51 years earlier. The exact rate of B. pseudomallei evolution in the environment is unknown, although it has been predicted in silico to be highly varied at the same 10 loci among distinct strains (41). Similarly, variable substitution rates per site over time have been reported for other bacterial species (42,43). Due to the lack of a correlation between mutation patterns with time and the associated implications for clock models, we did not infer an evolutionary timescale for the B. pseudomallei 51-year isolate set.
In conclusion, we demonstrated the ongoing presence of B. pseudomallei in the Avon Valley and surrounds of southwest Western Australia, which was unmasked by fatal cases of melioidosis occurring in alpacas and a parrot after unusually high rainfall in January 2017. We used genomics to characterize the genetic changes to the B. pseudomallei isolates from this region of temperate Australia recovered over a period spanning 51 years. All isolates belonged to a single ST and were highly related, with limited evidence of recombination and with conserved GIs. Time and geographical region did not correlate with the mutation patterns, with two isolates from the 1980s being most closely related to the 2017 isolates. These findings challenge the idea that, at least for B. pseudomallei, temporal and geographical distance correlate with evolutionary relatedness. Melioidosis has remained endemic in this temperate region of Australia for over half a century, and further animal and human cases of melioidosis are likely in the future, especially following heavy rainfall events.

MATERIALS AND METHODS
Location, epidemiology, and collection of environmental B. pseudomallei isolates. In 2017, more than 20 alpacas were agisted to Moondyne Farm in southwest Western Australia, coming from two separate nearby properties. In mid-February 2017, the alpacas became unwell and died or were euthanized at Moondyne Farm. This cluster was first noted in the Animal Health Australia surveillance quarterly newsletter (44). Necropsy examination revealed multifocal pulmonary nodules, extensive pulmonary congestion, and hemorrhage in all alpacas examined. Histology examination showed aggregates of small Gram-negative bacillus bacteria present throughout the pulmonary tissue. Lung lesions were cultured from four alpacas (alpacas A to D) ( Table 1), and all grew B. pseudomallei, which was also recovered from a vitreous humor sample from alpaca C. During this period, a scarlet macaw parrot residing in Gidgegannup (approximately 50 km south of Moondyne Farm) also died from melioidosis, with B. pseudomallei being cultured from a tissue sample. Bacterial growth, DNA extraction, and B. pseudomallei confirmation. Culture of B. pseudomallei was performed using previously described methods (26,45,46), and B. pseudomallei was confirmed with a real-time PCR assay targeting a B. pseudomallei-specific 115-bp segment within the type 3 secretion system 1 (TTS1) gene (47). Following B. pseudomallei confirmation, genomic DNA was extracted from purified B. pseudomallei colonies (48).
Sequencing, quality control, and assembly. For this study, we sequenced 10 new B. pseudomallei isolates. Four of these were environmental isolates collected at Moondyne Farm (from three locations, with two isolates being from a single location), and six were animal isolates (five alpaca isolates, including two from a single alpaca, and one scarlet macaw parrot isolate). The 10 isolates were sequenced on an Illumina NovaSeq 6000 sequencing instrument (Illumina, Inc., San Diego, CA), with each sequencing platform generating 150-bp paired-end reads. Analysis included an additional 12 genomes that we had sequenced from the prior cluster from the Avon Valley and surrounds (1966 to 1992) (10) and 296 genomes that represented the global phylogenetic structure and diversity of B. pseudomallei (Table S1). Metadata (ST and geography) corresponding to the global genomes were used to select the 296 global strains to best represent diversity and geography; the 296 genomes belonged to 201 unique STs and were from 38 geographical regions around the globe. Read quality was conducted using the Trimmomatic (v0.39) tool (49) and the FastQC program (https://www.bioinformatics.babraham.ac.uk/projects/ fastqc). Following adapter trimming and quality control, high-quality draft assemblies for all southwest Western Australia genomes (n ϭ 22) were generated using the MGAP pipeline (https://github.com/ dsarov/MGAP---Microbial-Genome-Assembler-Pipeline), and all assemblies were deemed to be of good quality and so were included in downstream genomic analysis (contigs, Ͻ200; N 50 , Ͼ80,000 bp). Final assemblies were annotated with the Prokka (v1.11) program (50).
MLST. The multilocus sequence typing (MLST) profiles for the B. pseudomallei isolates were extracted from the sequencing reads using the BIGSdb tool, which is accessible on the B. pseudomallei MLST website (http://pubmlst.org/bpseudomallei/). Variant calling, phylogenetics, and identification of recombination events. We created two phylogenies; the first phylogeny was created to examine the highest number of identifiable variants among the southwest Western Australia isolates and contained solely the ST-284 isolates (n ϭ 22). The second phylogeny contained the southwest Western Australia isolates (n ϭ 22) and a global set of B. pseudomallei genomes (n ϭ 296) and was constructed to determine if the southwest Western Australia isolates belonged to a single clade and to investigate the presumptive phylogeographical origin of the southwest Western Australia isolates. Both phylogenies were based on orthologous biallelic singlenucleotide polymorphisms (SNPs). For the ST-284 analysis, the genome of isolate MSHR0169 was used as the reference genome (10), and for the global phylogeny, the high-quality genome of Australian B. pseudomallei isolate MSHR1153 was used as the reference for read mapping (N 50 , 4,032,226 bp; number of contigs, 2; size, 7,312,903 bp) (51). Orthologous SNPs were identified from the Illumina whole-genome sequencing data using the Genome Analysis Toolkit (GATK; v4.1.0.0) (52), which is wrapped in the SPANDx (v3.2) algorithm (53). The SNP variants identified by GATK were used for phylogenetic reconstruction using maximum parsimony (MP) in the PAUP* (v4.0.b5) program (54). Bootstrapping was carried out, using 1,000 replicates, to establish the robustness of the tree branches. Phylogenetic trees were visualized and manipulated using the Interactive Tree of Life (ITOL; https://itol.embl.de). Recombinogenic SNPs were identified with the Gubbins (Genealogies Unbiased by Recombination in Nucleotide Sequences, v.2.3.1) program, using the default parameters (55).
Hierarchical Bayesian clustering of the southwest Western Australian isolates. A Bayesian approach was applied to spatially delineate and infer the genetic structure of the southwest Western Australia B. pseudomallei isolates. Hierarchical clustering of the 22 genomes was done using the RhierBAPS (v1.1.2) program, implemented in R (v3.5.1) software with a tree-independent approach, using the core-genome mapping alignment. This method allows the population to be subdivided into groups with closely related genetic backgrounds. Clustering was performed until convergence to a local optimum using two levels with k equal to 20 as the prior upper bound for the number of clusters. Higher levels of clustering were then performed based on the first result using k equal to 15 to k equal to 8.
Marginal likelihood values and level 1 and 2 clustering were the same across all k values tested, and this is likely due to the clonality of the southwest Western Australia isolates.
Pan-genome analysis and identification of unique protein-coding regions. The pan-genome was calculated for the ST-284 isolates from the Avon Valley and surrounding area (n ϭ 22) using the large-scale BLAST score ratio (LS-BSR) pipeline (56) with a 0.9 BSR threshold and the blat alignment option (57). The compare_BSR.py option was used to identify coding sequence (CDS) differences between the first and the second clusters. The conserved and variably conserved coding regions identified in the ST-284 strains were screened against an Australian set of genomes (Table S1), using LS-BSR, to identify protein-coding regions unique to the southwest Western Australia strains, using a BSR threshold of 0.9 for presence.
Identification of 38 genomic islands and six virulence genes. To further assess the genetic similarity of the 22 southwest Western Australia strains, we determined the presence of known genomic islands (GIs) and virulence genes. We screened for the presence of 38 known B. pseudomallei GIs from two Australian strains (strains MSHR0668 and MSHR0305) that ranged from 6 kb to 97 kb; CDS coordinates for the 38 GIs have previously been reported (58 . The presence of the 38 GIs and six virulence genes was determined using the LS-BSR pipeline with the ls_bsr.py option and the tblastn program (38 GIs and six virulence genes) (56). We used a BSR threshold of Ն0.6 or Ն0.9 for the presence of CDSs belonging to the GIs or virulence genes, respectively. CDSs with scores below this cutoff were deemed absent or variable. Genome assembly annotations of the ST-284 strains were reviewed at CDSs with various BSR scores (0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9) to determine the BSR threshold of Ն0.6 for presence of GIs. For the virulence genes, CDSs had a BSR score of either 0 or Ͼ0.9, and so a BSR threshold of Ն0.9 was used for presence of virulence genes.
Data availability. Read data for all isolates are available in the Sequence Read Archive (SRA) database (Table S1) under accession number PRJNA615049.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only.