‘Candidatus Aquirickettsiella gammari’ (Gammaproteobacteria: Legionellales: Coxiellaceae): A bacterial pathogen of the freshwater crustacean Gammarus fossarum (Malacostraca: Amphipoda)

Invasive and non-native species can pose risks to vulnerable ecosystems by co-introducing bacterial pathogens. Alternatively, co-introduced bacterial pathogens may regulate invasive population size and invasive traits. We describe a novel candidate genus and species of bacteria (‘Candidatus Aquirickettsiella gammari’) found to infect Gammarus fossarum, from its native range in Poland. The bacterium develops intracellularly within the haemocytes and cells of the musculature, hepatopancreas, connective tissues, nervous system and gonad of the host. The developmental cycle of ‘Candidatus Aquirickettsiella gammari’ includes an elementary body (496.73 nm ± 37.56 nm in length, and 176.89 nm ± 36.29 nm in width), an elliptical, condensed spherical stage (737.61 nm ± 44.51 nm in length and 300.07 nm ± 44.02 nm in width), a divisional stage, and a spherical initial body (1397.59 nm ± 21.26 nm in diameter). We provide a partial genome for ‘Candidatus Aquirickettsiella gammari’, which clades phylogenetically alongside environmental 16S rRNA sequences from aquatic habitats, and bacterial symbionts from aquatic isopods (Asellus aquaticus), grouping separately from the Rickettsiella, a genus that includes bacterial pathogens of terrestrial insects and isopods. Increased understanding of the diversity of symbionts carried by G. fossarum identifies those that might regulate host population size, or those that could pose a risk to native species in the invasive range. Identification of ‘Candidatus Aquirickettsiella gammari’ and its potential for adaptation as a biological control agent is ex-


Introduction
The Prokaryota includes the diverse group of bacteria (Hugenholtz, 2002; Logares et al., 2014) that are found in a wide range of environments (from ice-sheets to volcanoes) and within a diversity of hosts (from humans to protists), and are considered one of the most ancient lineages of life (3-4 Gya) (Poole et al., 1999; DeLong and Pace, 2001). Many bacterial taxa have adapted to survive through colonisation of a host, acting either as parasites or mutualists (Bhavsar et al., 2007; Chow et al., 2010). The evolutionary systematics of bacterial taxa is being revolutionised through wider application of DNA sequencing techniques and development of improved phylogenetic tools to resolve their position within the tree of life (Konstantinidis and Tiedje, 2007).
Some bacterial taxa reside within the cells of their host, utilising resources within the cell for their own division and development. One such intracellular bacterium is the well-known Chlamydia trachomatis, the cause of a common sexually transmitted disease in humans (Campbell et al., 1987). Some intracellular bacteria are economically important, resulting in diseases that cause significant healthcare costs or yield losses in cultured species (Pospischil et al., 2002). Others are interesting from a biodiversity and wildlife pathogen perspective (Duron et al., 2015).
Several other Rickettsiella-like taxa have been described to infect the cells of aquatic hosts, but their descriptions are based only on morphological information. These include those infecting the aquatic crustaceans: Carcinus mediterraneus (Bonami and Pappalardo, 1980); Paralithodes platypus (Johnson, 1984); Cherax quadricarinatus (Romero et al., 2000); Eriocheir sinensis (Wang and Gu, 2002); three species of penaeid shrimp (Anderson et al., 1987; Brock, 1988; Krol et al., 1991); and two amphipods, Gammarus pulex (Larsson, 1982) and Crangonyx floridanus (Federici et al., 1974). Over 100 16S rRNA gene sequence accessions exist within online databases for bacterial isolates linked to the Rickettsiella, and these include taxa infecting a wide diversity of arthropod hosts, including isolates from aquatic hosts (NCBI). An example from an aquatic host is an isolate from the aquatic isopod Asellus aquaticus (NCBI: AY447041) that lacks morphological and ultrastructural information.
Rickettsiella spp. are considered to have a slow developmental cycle, which involves initially entering a host cell through phagocytosis, dividing within a vacuole, and eventually lysing the cell before completing its lifecycle. In detail, small, dense elementary bodies are first phagocytosed by the host cell, prior to their enlargement (Kleespies et al., 2014). These enlarge into spherical bodies, which in insects at least, often contain a crystalline substance that has not yet been observed in those Rickettsiella infecting crustaceans (Kleespies et al., 2014). Finally, these enlarged cells condense and divide before condensing further into infective stage elementary bodies (Kleespies et al., 2014).
Rickettsiella spp. often cause disease in their host. Some have been associated with clinical signs, leading to descriptions such as "Blue Disease" or "Milky Disease" (Dutky and Gooden, 1952; Kleespies et al., 2011). In insects, disease often results in an iridescent appearance to the infected tissues (Dutky and Gooden, 1952; Kleespies et al., 2011). In crustaceans, clinical signs include an opaque white appearance of fluids and intersegmental membranes (Vago et al., 1970; Federici et al., 1974). In all cases, bacterial colonies are observed in the cytoplasm of host cells, causing displacement of organelles and cellular hypertrophy (Federici et al., 1974; Kleespies et al., 2014). Although genomic information is not available for many taxa, a full genome sequence is available for an R. grylli isolate from an isopod (Leclerque, 2008), and a genome is available for R. isopodorum, along with several others from the Coxiellaceae but outside the Rickettsiella (Seshadri et al., 2003; Mehari et al., 2015).
As part of a disease survey of the amphipod Gammarus fossarum for pathogens and symbionts, we discovered infection and disease associated with a novel bacterium. We utilise high throughput sequencing data to construct a partial genome of the pathogen and provide complementary information obtained from transmission electron microscopy and histopathology to describe a novel genus and species, 'Candidatus Aquirickettsiella gammari', as a candidate sister genus to the Rickettsiella. The pathogen infects the cytoplasm of circulating haemocytes and cells of the gonad, nerve, hepatopancreas, connective tissues and musculature of the amphipod and may have future applicability as a control agent for invasive and non-native G. fossarum.

Animal collection
Gammarus fossarum (n = 140) were collected from the Bzura River in Łódź (Łagiewniki), Poland (N51.824829, E19.459828) in June 2015. One hundred and twenty-seven individuals were fixed for histology on site while 13 were transported live to the University of Łódź for dissection. Dissection involved initial cooling to anaesthetise the individual before removing and dividing the hepatopancreas, gut and muscle tissue for fixation for molecular diagnostics (96% EtOH), histology [Davidson's freshwater fixative (Hopwood, 1996)] and transmission electron microscopy (TEM) (2.5% glutaraldehyde in sodium cacodylate buffer) according to protocols published by our laboratory (Bojko et al., 2015).

Histopathology and transmission electron microscopy
For histology, whole animals or dissected organs and tissues were initially fixed in Davidson's freshwater fixative for 48 hr. After fixation, the tissues were submerged in 70% ethanol and transported to the Cefas Weymouth Laboratory, UK for histological processing. Specimens were decalcified for 30 min before placement in 70% industrial methylated spirit and transfer to an automated tissue processor (Leica, UK) for wax infiltration. Whole animals, or dissected organs and tissues, were embedded in wax blocks and sectioned at 3 μm before transfer to glass slides. Sections were stained using haematoxylin and alcoholic eosin (H&E) and mounted with a glass coverslip using DPX. All slides were read using standard light microscopy (Nikon E800, Nikon, UK). Digital images were captured using an integrated camera (Leica, UK) and Lucia Image Capture software. For TEM, dissected tissues (muscle and hepatopancreas) were processed and analysed according to Bojko et al. (2015). Digital images were obtained on a Jeol JEM 1400 transmission electron microscope using on-board camera and software (Jeol, UK). These two techniques identified a previously unknown bacterial infection, providing the incentive to apply molecular tools for bacterial systematics.

DNA extraction, PCR, sequencing, and in-situ hybridisation (ISH)
Ethanol-fixed muscle biopsies from infected amphipods (n = 3) were initially digested using proteinase K (10 mg/ml) in solution with Lifton's Buffer (0.1 M Tris-HCl, 0.5% SDS, 0.1 M EDTA). The solution extracts were analysed for 16S rRNA sequence in a single-round Taq polymerase PCR protocol using the general bacterial 16S primers fD1 and rP2 according to Weisburg et al. (1991). Amplicons (∼1900 bp) were excised from the gel and forward and reverse sequenced using the 'eurofinsgenomics' service (www.eurofinsgenomics.eu). This sequence length was not used in the phylogenetic comparison, but rather the larger sequence obtained from metagenomic analysis which shared 100% sequence identity. Each specimen positive for infection via histology provided the same 16S rRNA gene sequence.
The amplicon was also used as an ISH probe upon histological section. The band was isolated and purified using Polyethylene Glycol 8000® Sigma-Aldrich (Lis, 1980), and the purified DNA was digoxigenin (DIG)-labelled using the same PCR conditions above, but with altered reagent concentrations (10 μl colourless buffer, 5 μl MgCl2 solution, 5 μl of PCR DIG labelling mix (Roche), 3 μl template DNA, 1 μl of forward and reverse primers, 0.5 μl of GoTaq Polymerase and 24.5 μl molecular water). The control was produced by amplifying the same 16S rDNA gene using non-labelled standard dNTPs. Products were purified as previously, and the amount of DNA was quantified (NanoDrop 1000 Spectrophotometer® Thermo Scientific) and diluted to 1 ng/μl, for a total volume of 50 μl.
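As a quick arithmetic check on the labelling reaction above, the listed reagent volumes can be summed and the final 1 ng/μl dilution worked out with C1V1 = C2V2. The sketch below assumes that "1 μl of forward and reverse primers" means 1 μl of each primer (which gives a 50 μl reaction). The 25 ng/μl stock concentration in the usage note is a hypothetical NanoDrop reading, not a value from this study, and the function names are illustrative.

```python
# Reagent volumes (µl) for one DIG-labelling PCR, as listed in the text.
# Assumption: 1 µl of EACH primer (forward and reverse).
DIG_LABELLING_MIX_UL = {
    "colourless buffer": 10.0,
    "MgCl2 solution": 5.0,
    "PCR DIG labelling mix": 5.0,
    "template DNA": 3.0,
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "GoTaq polymerase": 0.5,
    "molecular water": 24.5,
}


def total_volume_ul(mix):
    """Return the total reaction volume in microlitres."""
    return sum(mix.values())


def dilution_volumes_ul(stock_ng_per_ul, final_ng_per_ul=1.0, final_volume_ul=50.0):
    """Return (stock µl, diluent µl) needed to reach the target concentration.

    Uses C1 * V1 = C2 * V2. The defaults mirror the 1 ng/µl in 50 µl
    stated in the text; the stock concentration must be supplied.
    """
    if stock_ng_per_ul < final_ng_per_ul:
        raise ValueError("stock is already more dilute than the target")
    stock_ul = final_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul
```

With a hypothetical stock at 25 ng/μl, `dilution_volumes_ul(25.0)` gives 2 μl of probe made up with 48 μl of diluent.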
The ISH technique presented below is an adaptation from published protocols (Montagnani et al., 2001; Fabioux et al., 2004). Dry tissue sections were dewaxed and rehydrated: Clearene for 5 min (2 times), followed by 100% industrial denatured alcohol (IDA) for 5 min and 70% IDA for another 5 min. Slides were rinsed in 0.1 M TRIS buffer (0.1 M TRIS base, 0.15 M NaCl, pH adjusted to 7.5 with HCl) and placed in a humid chamber before being covered with 300 μl of 0.3% Triton-X diluted in 0.1 M TRIS buffer (pH 7.5) for 20 min and rinsed with 0.1 M TRIS buffer (pH 7.5). Tissue was covered with Proteinase K (25 μg/ml) in 0.1 M TRIS buffer (pH 7.5) (37°C) and kept for 20 min at 37°C to prevent evaporation. Slides were washed in 70% IDA for 3 min and 100% IDA for 3 min before rinsing in SSC (2×) for 1 min while gently agitating (SSC 1×: 0.15 M NaCl and 0.015 M Sodium Citrate). Slides were kept in 0.1 M TRIS buffer (pH 7.5) until the In-Situ hybridization frame seals (BIO-RAD) were placed. The DIG-labelled probe and the non-labelled probe (control) were both diluted 1:1 with hybridization buffer and added to the slide. After DNA denaturation (94°C for 6 min), slides were hybridised overnight at 44°C.
Samples were washed for 10 min with washing buffer (25 ml of SSC 20×, 6 M Urea, 2 mg/L BSA), before a further 2 washes in preheated (38°C) washing buffer for 10 min. Slides were rinsed with preheated (38°C) SSC (1×) for 5 min (twice) and with 0.1 M TRIS buffer (pH 7.5) (twice). The blocking step used a solution of 6% dried skimmed milk diluted in 0.1 M TRIS buffer (pH 7.5) for 1 hr, followed by washing with 0.1 M TRIS buffer (pH 7.5) for 5 min (twice). Slides were incubated with 1.5 U/ml of Anti-Digoxigenin-AP Fab fragments (Roche) diluted in 0.1 M TRIS buffer (pH 7.5) for 1 hr in darkness. Excess Anti-DIG-AP was removed. Slides were transferred to 0.1 M TRIS buffer (pH 9.5) for 2 min, and the slide was covered with NBT/BCIP stock solution (Roche) diluted in 0.1 M TRIS buffer (pH 9.5) and incubated in darkness until the first clear signs of blue staining appeared (∼30 min). Slides were washed in 0.1 M TRIS buffer (pH 9.5) for 1 min (twice) and stained with 1% Bismarck Brown (6 min) and dehydrated in 70% IDA, 45 s in 100% IDA, then washed twice in Clearene (1 min) prior to coverslipping. The ISH probe confirmed infection in haemocytes and cells of the musculature, gill, gonad, hepatopancreas and nerve tissues of the host. Labelling was not detected in uninfected individuals. In some animals, infection was specifically detected as dense inclusions within the cells of the hepatopancreatic tubules.

Genome sequencing, assembly and annotation
Muscle tissue from an infected G. fossarum carcass, initially fixed in 96% ethanol, was prepared for metagenomic analysis using the Illumina MiSeq platform (Illumina, UK). Corresponding histology for this specimen included only 'Candidatus Aquirickettsiella gammari' pathology, without visible infection/pathologies caused by other bacteria. The specimen was split into 3 sub-samples with 1 ng of DNA from each sub-sample prepared for sequencing by Nextera XT library preparation per manufacturer's protocol (Illumina; www.illumina.com). Libraries were quality and size checked by bioanalyzer (Agilent; www.agilent.com) and quantified by QuantiFluor fluorimeter (Promega, www.promega.com) before being pooled in equimolar concentrations, denatured by sodium hydroxide, and diluted to 10 pM in Illumina HT1 hybridisation buffer for sequencing. Sequencing was done on an Illumina MiSeq system using the MiSeq Reagent Kit v2 (500 cycles). In total, 23,090,904 individual reads were attained (50,178,184 paired reads) from the sequencer and 46,181,808 reads remained after quality trimming and low quality read removal.
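Equimolar pooling depends on converting each library's mass concentration and mean fragment size into molarity. The following is a minimal sketch, assuming the commonly used average mass of 660 g/mol per base pair of double-stranded DNA. The function names and the example values in the usage note are illustrative and are not taken from the Nextera XT protocol or from this study.

```python
AVG_BP_MASS_G_PER_MOL = 660.0  # assumed average mass of one dsDNA base pair


def library_molarity_nm(conc_ng_per_ul, mean_fragment_bp):
    """Convert a dsDNA library concentration (ng/µl) to molarity (nM)."""
    if conc_ng_per_ul < 0 or mean_fragment_bp <= 0:
        raise ValueError("concentration must be >= 0 and fragment size > 0")
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS_G_PER_MOL * mean_fragment_bp)


def equimolar_volumes_ul(molarities_nm, fmol_per_library):
    """Volume (µl) of each library that contributes the same molar amount.

    1 nM equals 1 fmol/µl, so volume = fmol wanted / molarity in nM.
    """
    return [fmol_per_library / molarity for molarity in molarities_nm]
```

For example, a hypothetical library at 3.3 ng/μl with a mean fragment size of 500 bp works out at about 10 nM, so pooling 20 fmol from it would take about 2 μl.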

Phylogenetics
Predicted gene sequences were utilised in combination with available sequence data from NCBI to generate both a Neighbour-Joining (NJ) phylogenetic tree and Maximum-Likelihood (ML) phylogenetic tree for a concatenated 20-gene phylogeny, and production of a NJ tree and ML tree for bacterial 16S data, using MEGA 7.0.21 (Kumar et al., 2016). In both cases the ML tree topology is used in the respective figures.
The concatenated phylogeny was constructed from 20 end-to-end gene sequences [23S rRNA, 16S rRNA, 50S L1-5, 30S S1-5, DNA Pol III alpha/beta/tau/delta/epsilon subunit, DNA primase, Replicative DNA Helicase (DnaB), DNA Pol I], which are primarily house-keeping/conserved genes, for 8 individual bacterial taxa for which data were available, including Chlamydophila pneumoniae to root the tree (NCBI GenBank accession numbers for all the genes used are shown in Supplemental Table 2). Multiple sequence alignments were generated using the ClustalW algorithm with default settings in MEGA 7.0.21, and phylogenetically compared using the Tamura 3-parameter model (Tamura, 1992) of evolution with uniform rate heterogeneity and the complete deletion model selection algorithm to form a final tree using both NJ (genetic distance) (Saitou and Nei, 1987) and ML methods. The NJ method also utilised the Tamura 3-parameter model, with transitions and transversions accounted for, including Gamma distribution with gamma parameter 5.1 and homogeneous pattern among lineages. The NJ also used the complete deletion treatment of the data. The clade credibility for both trees was assessed using bootstrap tests with 200 replicates.
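To make the distance model concrete, the pairwise Tamura 3-parameter distance (Tamura, 1992) can be written as d = -h ln(1 - P/h - Q) - (1/2)(1 - h) ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences, θ is the G+C content and h = 2θ(1 - θ). The sketch below is an illustration under simplifying assumptions: it uses uniform rates (no Gamma correction), skips any site where either sequence has a gap or ambiguous base, and does not reproduce the MEGA implementation. The function name is hypothetical.

```python
import math

PURINES = frozenset("AG")
NUCLEOTIDES = frozenset("ACGT")


def t92_distance(seq_a, seq_b):
    """Tamura 3-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only sites where both sequences carry an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in NUCLEOTIDES and b in NUCLEOTIDES]
    if not pairs:
        raise ValueError("no comparable sites")
    n = len(pairs)
    # Transition: different bases within the same class (purine/pyrimidine).
    transitions = sum(1 for a, b in pairs
                      if a != b and (a in PURINES) == (b in PURINES))
    # Transversion: one purine and one pyrimidine.
    transversions = sum(1 for a, b in pairs
                        if (a in PURINES) != (b in PURINES))
    p = transitions / n
    q = transversions / n
    gc = sum((a in "GC") + (b in "GC") for a, b in pairs) / (2 * n)
    h = 2 * gc * (1 - gc)
    if h == 0:
        raise ValueError("G+C content of 0 or 1 leaves the distance undefined")
    term_ts = 1 - p / h - q
    term_tv = 1 - 2 * q
    if term_ts <= 0 or term_tv <= 0:
        raise ValueError("sequences too divergent for this correction")
    return -h * math.log(term_ts) - 0.5 * (1 - h) * math.log(term_tv)
```

When the G+C content is exactly 0.5 the expression reduces to the Kimura 2-parameter distance, which is a convenient sanity check.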
The phylogenetic analysis of the 16S rRNA gene utilised various bacterial isolates (36 species in total), including two Chlamydophila sp. that acted as an out-group to root the tree. Two trees were constructed, the first using the same ML method as stated above, but with 500 bootstrap replicates and the use of all sites (gaps/missing data). The second method utilised an NJ approach (Saitou and Nei, 1987) in combination with the Jukes and Cantor (1969) algorithm, pairwise deletion model, and uniform rates of heterogeneity, with 500 bootstrap replicates. The ML tree topology is displayed in this manuscript.
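For the NJ analysis of the 16S data, the Jukes and Cantor (1969) correction converts the observed proportion of differing sites p into an estimated number of substitutions per site, d = -(3/4) ln(1 - 4p/3). The sketch below illustrates this with pairwise deletion, meaning that a site is dropped only for the pair of sequences in which one of them has a gap or ambiguous base. It is an illustration rather than the MEGA implementation, and the function name is hypothetical.

```python
import math

NUCLEOTIDES = frozenset("ACGT")


def jc69_distance(seq_a, seq_b):
    """Jukes-Cantor distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Pairwise deletion: ignore sites missing in either of these two sequences.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in NUCLEOTIDES and b in NUCLEOTIDES]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(1 for a, b in pairs if a != b) / len(pairs)
    if p >= 0.75:
        raise ValueError("sequences too divergent for the JC69 correction")
    return -0.75 * math.log(1 - 4 * p / 3)
```

A matrix of such pairwise distances is the input to the NJ algorithm. With complete deletion, by contrast, any alignment column containing a gap in any sequence would be removed before distances are computed.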

Histopathology and ultrastructure of a novel bacterial species
Gammarus fossarum were infected with an intracellular bacterium at a prevalence of 37.8%, which is identified herein as 'Candidatus Aquirickettsiella gammari' (Fig. 1). Externally, infected hosts had a creamy-white appearance due to the heavy burden of bacterial infection, which was iridescent, and included the presence of orange beads running along the carapace. In 14.2% of hosts, the bacterial infection was apparently located only within epithelial cells of the hepatopancreas (Fig. 2), suggesting that this may be the initial seat of infection prior to systemic spread. In other cases, 'Candidatus Aquirickettsiella gammari' was also present within the haemocytes, which were highly hypertrophic and enlarged (Fig. 1a), and cells of the nervous system (Fig. 1b, c), gonad, connective tissues, musculature (Fig. 1d) as well as the hepatopancreas (Fig. 2). In all tissues the bacterial pathogen resulted in a hypertrophic cytoplasm that stained deep purple under H&E. The infection resulted in the aggregation of haemocytes in addition to extreme hypertrophy of all infected tissues (Fig. 1a).
TEM revealed an intracellular bacterium in cells of the hepatopancreas (Fig. 2), the space beneath the sarcolemma of muscle cells (Fig. 3a) and in the cytoplasm of haemocytes (Fig. 3b). The inclusions within the hepatopancreas had a different morphology to those within the connective tissues, primarily the presence of fibrous material within spherical stages, which is not observed in other infected tissues (Fig. 2d). Primary infection may occur within the hepatopancreas epithelial cells, with liberated bacteria potentially being phagocytosed by haemocytes and connective tissue cells. Bacteria with a highly condensed cytoplasm measured 496.73 nm ± 37.56 nm (n = 20) in length, and 176.89 nm ± 36.29 nm in width, contained an electron dense core (Fig. 3c, d) and electron lucent lamella (Fig. 3d). The bacteria apparently developed through four main stages (Fig. 3e-h), but the order of this developmental process is unknown. The putative first stage is the electron dense elementary body (Fig. 3e), followed by an elliptical, condensed spherical stage [737.61 nm ± 44.51 nm (n = 10) in length and 300.07 nm ± 44.02 nm in width (n = 17)], with an electron lucent cytoplasm (Fig. 3f), which then putatively underwent division (Fig. 3g). Spherical initial bodies were the largest stages observed, measuring 1397.59 nm ± 21.26 nm (n = 10) in diameter (Fig. 3h). Federici et al. (1974) utilised specific methods to identify a crystalline cell surface protein layer [S-layer (glyco) proteins] in the bacteria they observed, but this technique was not conducted here so comparison cannot be confidently made. Crystalline inclusion bodies, often observed in insect-infecting Rickettsiella, were not observed during infection by 'Candidatus Aquirickettsiella gammari'.
An in-situ hybridisation (ISH) probe corresponding to the bacterial isolate for which we provide genomic information was observed to bind to bacteria within the muscle (Fig. 4a, b), hepatopancreas (Fig. 4c, d), gonad, gill, connective tissues and nerve tissue, suggesting synonymy between the sequenced isolate and the bacteria infecting each tissue type. Denser staining in infected hepatopancreatic epithelial cells provides evidence that these cells contain large accumulations of the bacteria and likely act as the seat of infection prior to spread to other tissues and organs.

'Candidatus Aquirickettsiella gammari' genome sequence and annotation
Sequence assembly produced 29,089 contigs with a minimum length of 200 bp (N50 = 685), accounting for 16,125,808 bp. Annotation analysis using Diamond and Megan identified 55 contigs with high similarity to Legionellales. Blobplot analysis identified a cluster of 39 contigs with high coverage and similarity to Legionellales (Supplemental Fig. 1) and among these contigs were 10 additional sequences that were not classified as Legionellales by Diamond/Megan analysis. Together, a total of 65 contigs representing 'Candidatus Aquirickettsiella gammari' were found, with lengths ranging from 220 to 149,894 bp and a total partial genomic length of 1,491,410 bp (N50: 73,822 bp).
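The N50 values quoted above summarise assembly contiguity. A minimal sketch of the calculation is given below, assuming the usual definition: the length of the contig at which the running total, taken from the longest contig downwards, first reaches half of the total assembly length. The assembler used in the study may apply a slightly different convention, and the contig lengths in the usage note are made up for illustration.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not contig_lengths:
        raise ValueError("no contigs supplied")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # First contig at which the cumulative length covers half the assembly.
        if running >= half_total:
            return length
```

For the hypothetical lengths `[2, 2, 2, 3, 3, 4, 8, 8]` (total 32), the two longest contigs already cover 16 bp, so the N50 is 8.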
Alignment of the 'Candidatus Aquirickettsiella gammari' genome against the complete R. grylli genome identified conserved segments (Locally Collinear Blocks; LCBs) along the full reference genome sequence (Fig. 5). Additional comparison to the genome of R. isopodorum suggests that this genome has the most LCBs (Fig. 5). Annotation of the 'Candidatus Aquirickettsiella gammari' genome assembly resulted in the identification of 1386 coding regions of which a total of 996 had homologues that most closely associated with those encoded in the R. isopodorum genome isolated from an isopod (Supplementary Table 1). A total of 82.1% of complete BUSCO genes (371 out of 452) specific to Gammaproteobacteria were recovered, increasing to 84.1% if fragmented genes were included. This compared well to the number of complete BUSCO genes recovered from the genomes of R. grylli (2 contigs, total length of 1,581,239 bp) and Legionella sp. 40-6 (656 contigs, total length of 3,142,726 bp), with values of 80.5% and 69%, respectively.
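The BUSCO completeness figure above is a simple proportion of recovered marker genes, which can be checked directly from the counts given in the text (371 complete genes out of 452 in the Gammaproteobacteria set). The helper name below is hypothetical.

```python
def busco_percent(genes_recovered, genes_in_set):
    """Return the percentage of BUSCO marker genes recovered."""
    if genes_in_set <= 0 or not 0 <= genes_recovered <= genes_in_set:
        raise ValueError("counts must satisfy 0 <= recovered <= total, total > 0")
    return 100.0 * genes_recovered / genes_in_set


# Counts taken from the text: 371 complete genes of 452.
print(f"{busco_percent(371, 452):.1f}%")  # prints "82.1%"
```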
Three hundred and sixty-nine of the predicted genes encode hypothetical proteins and have not yet been fully characterised. Sequences for 16S, 23S and 5S rRNA were also featured within the 65 contigs as well as 40 tRNAs (for guiding 19 unique amino acids, excluding selenocysteine and isoleucine) and 1 tmRNA (SsrA) (see NCBI submission: NMOS00000000/PRJNA392245). The genes included on the 65 contigs suggest a wide range of metabolic and physiological capabilities; of interest here are those that may be involved in virulence. These include secretion systems (Vir, Dot, Icm), other Type IV secretion proteins, and conjugal transfer proteins (Tra/Trb), which may aid horizontal gene transfer to conspecifics and host cells. 'Candidatus Aquirickettsiella gammari' encodes 8 genes that show similarity to Vir-like proteins, all of which show closest similarity to species outside of the Rickettsiella, Diplorickettsia and Coxiella, and are more similar to Legionella sp. (45-50% similarity), 'Candidatus Neoehrlichia lotoris' (45.3% similarity), Sphingopyxis sp. (38.4% similarity), Tatlockia micdadei (57.5% similarity), and Virgibacillus senegalensis (73.1% similarity). Ten Dot-like genes primarily show similarity to Rickettsiella sp., and one to Legionella fairfieldensis (61.9% similarity). Eight genes show similarity to Icm-like genes, all with closest similarity to Rickettsiella isopodorum. Eighteen genes are Type-IV-secretion-system-like, primarily with closest similarity to Rickettsiella isopodorum; however, 4 show similarity to members of the Legionella, Bartonella and Sphingopyxis. Finally, TraA-like (Legionella sp.), TraD-like (Legionella sp.) and TrbN-like (Rickettsiella sp.) genes are encoded that show closest similarity to the Legionella and Rickettsiella. Detailed outputs can be found in Supplementary Table 1.
For the most part, the virulence genes encoded by 'Candidatus Aquirickettsiella gammari' are linked with close relatives in the Coxiellaceae and Legionellales; however, those that seem more closely linked with distant species, such as 'Candidatus Neoehrlichia lotoris' (a tick-borne disease of humans), Sphingopyxis sp. (hardy bacteria that thrive in polluted environments), and Virgibacillus senegalensis (a species linked with the human microbiome), may be the result of historic horizontal gene transfers that have contributed to the pathogenicity of 'Candidatus Aquirickettsiella gammari'.
In addition to genes linked directly with virulence, genes that are involved in the production of toxins may also be linked to the pathology observed in this study and contribute to the declining health of the host. 'Candidatus Aquirickettsiella gammari' encodes 7 different toxin-like genes, each showing closest similarity to a different bacterial species, including 6 Type-II-toxin-antitoxin-system genes [RatA

Phylogeny of 'Candidatus Aquirickettsiella gammari'
The 16S rRNA gene of 'Candidatus Aquirickettsiella gammari' was used to search the NCBI database for similar taxa, determining that the closest known species is a Rickettsiella-like bacterium of Asellus aquaticus (similarity = 99%; e-value = 0.0) (AY447040) and that the most closely related species with taxonomic description was R. isopodorum. The 20-gene concatenated phylogeny determined that R. grylli from an isopod and R. isopodorum are the most closely related taxa with genome sequence data to 'Candidatus Aquirickettsiella gammari' (Fig. 6). In Fig. 6, the two isolates from terrestrial isopods group together at 100% bootstrap confidence, and 'Candidatus Aquirickettsiella gammari' branches below them, with a branch distance of 0.4 units (ML) from R. grylli from an isopod and 0.6 units (ML) from R. isopodorum. Diplorickettsia massiliensis is also closely grouped with these three isolates, at a branch distance of 0.4 units (ML) from 'Candidatus Aquirickettsiella gammari'.
The phylogenetic tree representing the 16S rRNA gene of many available uncategorised isolates, Rickettsiella sp., or other Coxiellaceae, outlines an interesting result whereby 'Candidatus Aquirickettsiella gammari' sits outside of the terrestrial Rickettsiella, grouping with only aquatic bacterial isolates (Fig. 7). The single gene phylogeny showed a strong 96/80% bootstrap confidence support (NJ/ML) for the separation between the Rickettsiella spp. isolated from terrestrial environments/hosts, and those isolated from aquatic environments/hosts (Fig. 7). The 16S phylogeny also determined that R. isopodorum and R. armadillidii [now thought to be the same species (Kleespies et al., 2014)] branch separately to those Rickettsiella sp. that infect insect hosts (62/49% bootstrap confidence) and group together with 100% bootstrap confidence. Additionally, the R. grylli isolate (from an isopod) (NZAAQJ02000001) branches just above the Rickettsiella isolates from isopods at low-mid bootstrap confidence (51/49%).
One species, R. viridis, branches early within the tree, and outside of the Rickettsiella, with 100/100% bootstrap confidence. The closest branching species on the tree to R. viridis is Diplorickettsia massiliensis (0.09 substitutions per site), which sits between R. viridis, the Rickettsiella and 'Candidatus Aquirickettsiella'. Whether this suggests that R. viridis is a member of the Diplorickettsia requires further research.
Based upon the 16S rRNA gene sequence of this novel bacterium and closely related rDNA sequences from NCBI, along with ultrastructural differences (such as the lack of crystalline protein formation within the spherical initial body stage) between the terrestrial insect-infecting Rickettsiella and the aquatic crustacean-infecting bacteria described here, we suggest a new candidate genus, 'Candidatus Aquirickettsiella', to contain this set of aquatic, crustacean-infecting bacteria until the bacteria can be cultured for full genus status, at which time this should be reassessed.
Fig. 4. In-situ hybridisation of bacterial gene probes to histopathologies. (a, b) Bacteria in the muscle and haemocytes are detected using an in-situ probe; where an H&E slide (a) with infection (black arrow) is compared to the in-situ slide (b), which has infection stained in blue. Scale = 50 µm. (c, d) Bacteria in the hepatopancreas are detected using the same in-situ probe; where an H&E slide (c) with infection (black arrow) is compared to the in-situ slide (d), which has infection stained in blue. Scale = 100 µm.
Fig. 5. 'Candidatus Aquirickettsiella gammari' scaffold comparison to the genome of Rickettsiella grylli from an isopod (NZAAQJ02000001) and genome of Rickettsiella isopodorum (NZLUKY00000000). These assessments do not determine the actual order of the scaffolds in the true genome of 'Candidatus Aquirickettsiella gammari' but refer to genomic arrangement comparisons. The colours between the two comparisons in this graph do not correspond.
Fig. 6. Phylogenetic placement of 'Candidatus Aquirickettsiella gammari' using a 20-gene concatenated phylogeny, relative to other related bacterial species from other Crustacea, arachnids and humans, with the available gene complement needed for concatenated sequence analysis. The evolutionary history was inferred by NJ/ML based on the Tamura 3-parameter model. The tree with the highest log likelihood of the ML analysis (-166814.4006) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying NJ and BioNJ algorithms to a matrix of pairwise distances using the Maximum Composite Likelihood approach, and then selecting the topology with superior log likelihood value. The tree is to scale, with branch lengths measured in the number of substitutions per site. There was a total of 20,637 positions in the final dataset. For the NCBI references of the genes used to develop this figure please refer to Supplementary Table 2.
Fig. 7. A phylogenetic tree of the available 16S rRNA gene sequences for several bacterial species, closely and distantly related to 'Candidatus Aquirickettsiella gammari' (black arrow). The evolutionary history was inferred using an NJ algorithm based on the Jukes and Cantor model, and an ML algorithm based on the Tamura 3-parameter model. The tree shown is from the ML analysis and annotated with the results from both analyses. Both analyses involved 16S rRNA sequences of 36 species. Each genus/family group is indicated with a coloured box, and the outgroup is indicated in grey. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Intracellular bacterial organisms, which are pathogenic for crustaceans in aquatic environments. Crystalline inclusions present in insect-infecting Rickettsiella are not present in Peracarid-infecting 'Candidatus Aquirickettsiella'; however, fibrous inclusions are present in those bacteria infecting hepatopancreatic tissues.
The bacterium infects the cell cytoplasm of hepatopancreatic epithelia (where morphology can vary), musculature, gill, gonad, nerve and haemocytes, manifesting in late stages as systemic infection. Externally visible pathologies include a white iridescent appearance to infected amphipods, particularly their muscle tissues. The bacterium passes through a four-step development cycle including: the elementary body (smallest developmental stage); an elliptical, condensed sphere stage; division; and a spherical initial body (but not necessarily in that order). All developmental stages take place within a vacuole separating the bacteria from the host cell cytoplasm; however, the elementary body (infective stage) is predicted to be able to survive outside the host cell. Genome sequence data of novel species must show close relatedness through the phylogenetic methods used by this study, and gene conservation relative to the type species.
Type species: 'Candidatus Aquirickettsiella gammari' Bojko, Dunn, Stebbing, van Aerle, Bacela-Spychalska, Bean, Urrutia and Stentiford, 2018. This species is intracellular in organs and tissues of the host, Gammarus fossarum, including the cells of the hepatopancreas, musculature, connective tissues, nervous system, gonad, gill and the haemocytes. Heavy infection causes hosts to appear creamy-white, and often iridescent with orange beads running along either side of the pereon. The ultrastructure of the elementary body is composed of an outer membrane measuring 496.73 nm ± 37.56 nm (n = 20) in length, and 176.89 nm ± 36.29 nm in width and is present with an electron dense core and electron lucent lamella. Development includes the elementary body, an elliptical condensed sphere stage, which undergoes division, and includes an initial spherical body stage. Initial spherical body stages do not appear to contain crystalline substances observed in other members of the family, but some fibrous elements are visible in the separate phenotype displayed by the bacterium infecting the hepatopancreatocytes. 'Candidatus Aquirickettsiella gammari' can be discriminated from other members of the family, and presumably newly discovered members of the genus, by 16S rDNA phylogenies, or construction of concatenated phylogenies based upon the multi-gene sequences as described in this study.
Site of infection: The hepatopancreas is proposed as the seat of infection, which may precede wider dissemination of the pathogen to cells of the musculature, nerves, gills, gonads and connective tissues, and to the haemocytes.
Etymology: The genus name "Aquirickettsiella" is based upon the similarity between this genus and the sister genus Rickettsiella, whilst referring to the aquatic habitat and host in which the type species was detected. The specific epithet "gammari" refers to the aquatic gammarid host of 'Candidatus Aquirickettsiella gammari'.
Type material: Histological, TEM and ethanol-fixed material is deposited within the Registry of Aquatic Pathology, Cefas, UK. Data pertaining to the 16S rDNA gene, next generation sequence data, and assembled scaffolds for the pathogen and the metagenomic dataset generally, are deposited at the NCBI database. This Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accession(s) NMOS00000000/PRJNA392245. The version described in this paper is version NMOS01000000.

Discussion
This study uses histology, ISH, TEM, and single-gene and multi-gene phylogenies to describe a novel intracellular bacterial pathogen infecting G. fossarum in its native range in continental Europe (Poland), named herein 'Candidatus Aquirickettsiella gammari'. 'Candidatus Aquirickettsiella gammari' is closely related to previously described pathogens of terrestrial arthropods and may be of interest as a biological control agent for invasive gammarid species.

Taxonomy of 'Candidatus Aquirickettsiella gammari'
Considering the ultrastructural, histological and genomic data, and the single- and multi-gene phylogenies detailed in this study, the aquatic relations of the Rickettsiella display some significant differences to terrestrial species. Several insects and some terrestrial isopods have been shown to be infected by members of the genus Rickettsiella (Krieg, 1955; Roux et al., 1997; Leclerque and Kleespies, 2008; Leclerque et al., 2011; Kleespies et al., 2011; Leclerque et al., 2012; Tsuchida et al., 2014; Cordaux et al., 2007; Kleespies et al., 2014). Phylogenetic analyses conducted in this study suggest that, within the Rickettsiella, a divergence (62/72% bootstrap support) is seen between those species infecting terrestrial crustaceans and those infecting terrestrial insects (Fig. 7). Expanding upon this, a divergence (96/80% bootstrap support) is seen between those bacteria isolated from aquatic hosts/environments and those collected from terrestrial hosts/environments (Fig. 7), signifying a likely terrestrial clade (Rickettsiella) and an aquatic clade ('Candidatus Aquirickettsiella') of this intracellular bacterial group. The concatenated phylogeny suggests that 'Candidatus Aquirickettsiella gammari' branches closer to D. massiliensis; however, 'Candidatus Aquirickettsiella gammari' is at equal branch distance (0.4 units) from Diplorickettsia and Rickettsiella. This suggests that the true position of this genus in the Coxiellaceae at the phylogenomic level needs to be reconsidered when sequence data from more species are available to further explore the phylogeny. Despite this, the large branch distance from Diplorickettsia and Rickettsiella supports 'Candidatus Aquirickettsiella' as a novel genus (Fig. 6).
Histology showed that this bacterial species causes gross pathology in the host by infecting the haemocytes, hepatopancreas, muscle sarcolemma, connective tissues, gill, gonad, and nerve tissues. This suggests it is pathogenic to the host, but the survival rate of infected hosts is yet to be studied. When bacterial morphology is considered, one primary feature mentioned in the initial genus description of Rickettsiella (Philip, 1956) is the production of crystalline protein within the 'initial body' development stage (see also: Vago et al., 1970; Kleespies et al., 2014). This is apparently missing from those pathogens shown to infect aquatic crustaceans (Federici et al., 1974; Larsson, 1982; this study). The lack of intracellular crystalline protein formation in the initial body development stage; the divergence in the 16S rRNA gene between aquatic and terrestrial isolates (Fig. 7); and the branching distance between 'Candidatus Aquirickettsiella gammari' and other Rickettsiella (Fig. 6) provide the basis for the erection of a novel candidate bacterial genus to include the novel bacterium described herein.
As more Aquirickettsiella spp. are characterised, such as the two Rickettsiella-like bacterial isolates from Asellus aquaticus (AY447040/AY447041) (Fig. 7), or when those from G. pulex and C. floridanus are provided with 16S sequence data, the robustness of this candidate genus and the phylogenetic analyses should be reassessed.
Currently, no information exists on whether 'Candidatus Aquirickettsiella gammari' is acquired vertically or horizontally by the host. In addition, because crustacean cell culture techniques are lacking, it is not currently possible to culture this bacterial species axenically, although this has yet to be formally assessed. This genus and species must therefore remain in candidacy until culture has been attempted and successfully carried out.

Genome composition and annotation for 'Candidatus Aquirickettsiella gammari'
This study identified 65 contigs associated with 'Candidatus Aquirickettsiella gammari' from the tissues of G. fossarum that show closest similarity to R. isopodorum and R. grylli, as well as some LCBs. Several of the genes isolated from the genomic fragments have homologues in well-characterised pathogens, such as Legionella sp. (Edelstein et al., 1999; Albert-Weissenberger et al., 2007). Legionella sp. have been used in model systems to identify which genes are involved in the infection process, and several studies, such as that by Edelstein et al. (1999), have shown that Type IV secretion systems and conjugal transfer proteins are important for virulence. Such studies are yet to be conducted in bacterial taxa more closely related to 'Candidatus Aquirickettsiella'; however, parallels can be drawn for certain homologues in both 'Candidatus Aquirickettsiella gammari' and Rickettsiella sp. isolated from isopods. Both species include Dot-like genes, Icm-like genes and conjugal transfer proteins (Tra) that are homologous to those found in Legionella. Only 'Candidatus Aquirickettsiella gammari' encodes Vir-like proteins homologous to those found in Legionella, Tatlockia and Diplorickettsia. The presence of several genes associated with the Type IV secretion system in the genome of 'Candidatus Aquirickettsiella gammari' suggests it has the capability to introduce genetic material into its host's cells, a process which may be similar to the well-characterised pathway used by Agrobacterium tumefaciens to engineer its host's cell cycle to suit its own developmental needs (Wood et al., 2001; Tzfira and Citovsky, 2006). Plants infected with the wild-type, pathogenic A. tumefaciens produce localised cellular growths that form a "gall" (Wood et al., 2001; Tzfira and Citovsky, 2006).
For 'Candidatus Aquirickettsiella gammari', the histopathology data revealed several infected tissue types, all of which showed considerable hypertrophy; in particular, the infected haemocytes and connective tissues had adhered to one another, forming large masses in the circulatory system of the host (Fig. 1a). Although speculative at this point, this species and the systems encoded by its genome may provide useful insight for future studies exploring the introduction of genetic material into crustacean tissues via bacterially mediated horizontal transfer.

Why characterise the pathogens of native amphipod hosts?
Most taxa are evolutionarily adapted to survive in particular settings, but when transferred to new surroundings those taxa may either thrive and become invasive or perish and be removed from the community. Amphipods are renowned for their capability to spread and colonise water systems, and several studies have assessed their hardiness (Bruijs et al., 2001), behaviour (Dick et al., 2002) and ability to spread (Bacela-Spychalska, 2016), even suggesting that some are "perfect invaders" (Rewicz et al., 2014). With invasion comes the possibility of co-introducing disease (Dunn and Hatcher, 2015), or of escaping from disease, allowing the host to become fitter and more competitive in its new territory (Colautti et al., 2004). Invasions threaten biological diversity (Lambertini et al., 2011), and finding natural enemies that may control invasive species is one possible way to mitigate an invader's impacts. A recent study by Bojko et al. (2018) has shown that pathogens co-introduced alongside an invasive amphipod host can both control invasive amphipod characteristics and threaten native species that are susceptible to infection, signifying the importance of understanding pathogens before using them as control agents in invasive amphipod research.
By screening an amphipod population in its native environment, it is possible to obtain an overview of the naturally associated symbionts before enemy release has taken place. The identification of 'Candidatus Aquirickettsiella gammari' provides an example of a novel organism similar to those selected for biological control in the past (McNeill et al., 2014; Lacey et al., 2015). This novel pathogen could possibly be adapted into a control agent, but not without first conducting further studies on the effects of the pathogen on the survival of the host, and on the pathogen's host range. Such studies would relate to the development of biocontrol agents for agricultural settings (Lacey et al., 2015). They would also assess the potential risk of transfer of this pathogen to native fauna and determine whether it could cause damage to other populations that may co-occur with invasive G. fossarum (Blackman et al., 2017).
'Candidatus Aquirickettsiella gammari' is the first characterised intracellular bacterial species from an amphipod, and this novel genus likely includes the bacteria identified from C. floridanus (Federici et al., 1974), G. pulex (Larsson, 1982) and possibly the intracellular bacterial pathogen from the hepatopancreas of non-native G. roeselii in Poland (Bojko et al., 2017). This new discovery suggests that the native environments of other invasive amphipods that require control, such as D. villosus and Pontogammarus robustoides, may hold similar microbial agents that could benefit their biological control.
When invaders co-occur with native gammaridean fauna, including G. fossarum inhabiting the lowland rivers of Central Europe, these invasive species may face new pathogens, such as the one described in our study, which could be contracted and may also play a role in regulating their populations. When host range data for 'Candidatus Aquirickettsiella gammari' are investigated, infection trials with high impact invasive amphipods would determine whether the pathogen can transmit to, and control, high profile invaders (D. villosus, D. haemobaphes, Echinogammarus trichiatus and P. robustoides) in native Polish freshwater environments.