Interactions of the Intracellular Bacterium Cardinium with Its Host, the House Dust Mite Dermatophagoides farinae, Based on Gene Expression Data

ABSTRACT Dermatophagoides farinae is inhabited by an intracellular bacterium, Cardinium. Using correlations between host and symbiont gene expression profiles, we identified several important molecular pathways that potentially regulate/facilitate their interactions. The expression of Cardinium genes collectively explained 95% of the variation in the expression of mite genes assigned to pathways for phagocytosis, apoptosis, the MAPK signaling cascade, endocytosis, the tumor necrosis factor (TNF) pathway, the transforming growth factor beta (TGF-β) pathway, lysozyme, and the Toll/Imd pathway. In addition, expression of mite genes explained 76% of the variability in Cardinium gene expression. In particular, the expression of the Cardinium genes encoding the signaling molecules BamD, LepA, SymE, and VirD4 was either positively or negatively correlated with the expression levels of mite genes involved in endocytosis, phagocytosis, and apoptosis. We also found that Cardinium possesses a complete biosynthetic pathway for lipoic acid and may provide lipoate, but not biotin, to mites. Cardinium gene expression collectively explained 84% of the variation in expression related to several core mite metabolic pathways, and, most notably, a negative correlation was observed between bacterial gene expression and expression of mite genes assigned to the glycolysis and citric acid cycle pathways. Furthermore, we showed that Cardinium gene expression is correlated with expression levels of genes associated with terpenoid backbone biosynthesis. This pathway is important for the synthesis of pheromones, thus providing an opportunity for Cardinium to influence mite reproductive behavior to facilitate transmission of the bacterium. Overall, our study provided correlational gene expression data that can be useful for future research on mite-Cardinium interactions. IMPORTANCE The molecular mechanisms of mite-symbiont interactions and their impacts on human health are largely unknown. Astigmatid mites, such as house dust and stored-product mites, are among the most significant allergen sources worldwide. Although mites themselves are the main allergen sources, recent studies have indicated that mite-associated microbiomes may have implications for allergen production and human health. The major medically important house dust mite, D. farinae, is known to harbor a highly abundant intracellular bacterium belonging to the genus Cardinium. Expression analysis of the mite and symbiont genes can identify key mite molecular pathways that facilitate interactions with this endosymbiont and possibly shed light on how this bacterium affects mite allergen production and physiology in general.

M icrobes associated with arthropods can have significant impacts on host life history, ecology, digestion, and reproductive physiology and can shape their evolution. Interactions between microbes and their hosts can be broadly categorized as mutualistic, commensal, or parasitic, and these associations can be obligate or facultative. As examples of mutualistic associations, some arthropod species rely on the metabolic capabilities of symbiotic bacteria to colonize otherwise unavailable, nutrient-poor ecological niches (1)(2)(3)(4). Cardinium is a Gram-negative endosymbiotic bacterium associated with insects, arachnids, and nematodes and belongs to the phylum Cytophaga-Flavobacterium-Bacteroides. Cardinium is very common among arthropods, with an estimated 13% of extant arthropod species potentially having associations with this bacterium (5)(6)(7)(8)(9). This bacterium can cause cytoplasmic incompatibility and feminization (10) and affect vector competence, sex ratios, and biotin provisioning in its hosts (11). The functional role of Cardinium associated with mites is unknown thus far.
Here, we investigated the molecular mechanisms underlying the interactions of Cardinium and its acarine host, the American house dust mite, Dermatophagoides farinae Hughes, 1961 (Pyroglyphidae). This is one of the most medically important species of house dust mites (12)(13)(14). House dust mites, including D. farinae, produce almost 40 different allergens causing allergic diseases that affect 15 to 30% of the world's population (15). Mite allergens and biologically active molecules of mite-associated bacteria have diverse molecular functions, affecting both mites and bacteria, which has implications for human health. For example, some mite immunogenic proteins may control bacterial populations based on their hydrolytic function (e.g., cysteine protease [Der f 1], trypsin [Der f 3], chymotrypsin [Der f 6]) and other antibacterial proteins (e.g., a bactericidal permeability-increasing protein-like protein [Der f 7], a bacteriolytic enzyme [Der f 38]) (15). In turn, mite-associated bacteria can also play a role in modulating mite allergen production (16)(17)(18)(19) or induce IgE sensitization to microbial antigens, causing further impacts on human health (20).
Cardinium associated with D. farinae is a common hyperabundant bacterial species in U.S. and European populations (21)(22)(23). Phylogenetically, it is part of Cardinium group A (24) and a subgroup that is nearly exclusively associated with stored-product mites and predatory cheyletid mites (25). In Brevipalpus mites, Cardinium is localized in mesenchymal tissues, the reproductive tract, and appendages (26). A previous study (27) using Cy5-labeled universal bacterial probes indicated the presence of bacteria in the reproductive tract, mesodermal tissues, appendages, and hemolymph of D. farinae, although the specific locations of Cardinium in mite tissues still need to be confirmed. PCR-based analysis with symbiont-specific primers indicated that the bacterium was also associated with eggs and that it was more prevalent in females than in males, representing 92 and 67% of all bacterial taxa in females and males, respectively (25). A whole-genome sequence of Cardinium associated with D. farinae has been assembled. The total assembly length is 1.48229 Mb (GC content, 37.3%), and the genome codes for 1,247 proteins (25). Relative to other Cardinium genomes, Cardinium associated with D. farinae had a larger genome size and coded for larger numbers of protein-coding genes but was comparable in size to the genome of "Candidatus Amoebophilus asiaticus," an intracellular bacterium associated with Acanthamoeba. Due to its smaller genome size relative to free-living bacteria and its intracellular localization, the current hypothesis is that the endosymbiont of D. farinae show hallmarks of obligate endosymbiosis. Mite-Cardinium symbiosis may have an important medical consequence, since the presence of this bacterium is correlated with the presence of bacterial endotoxins, which can interfere with industrial mite-based vaccine production. For example, D. farinae usually has abundant endotoxins derived from Gram-negative bacteria compared to another house dust species, Dermatophagoides pteronyssinus, which typically lacks Cardinium (18,(28)(29)(30). Some D. farinae populations from Asia, however, may host different Gram-negative bacteria, i.e., Bartonella (16,17,31).
Previous attempts to eliminate Cardinium from D. farinae by applying tetracycline at a rate of 1 mg/g of diet to study the impact of these intracellular bacteria on mite physiology were unsuccessful (22). Therefore, we employed correlation analyses among global gene expression profiles of D. farinae and Cardinium to identify potential molecular interactions between hosts and their symbionts (32). Aspects related to cytoplasmic incompatibility, nutrient exchange, and host cell interactions have already been addressed in the literature using the same methodology in nonmite hosts (33,34). We therefore focused on other aspects important for mite-Cardinium symbiosis, particularly mite genes involved in the immune and metabolic pathways.
Effect of culture age on gene expression. In our experiments, old and young mite cultures differed in population structure, as older cultures had a higher proportion of adult mites than younger cultures. Distance-based redundancy analysis (dbRDA) correlation models indicated that the variable "culture age" explained 22% of the variability in predicted Cardinium KEGG gene expression and 48% variability in predicted mite gene KEGG gene expression (Table 1). When the "culture age" variable was entered in dbRDA models using either Cardinium or mite expression as the factor, it was eliminated by the forward selection procedure. Furthermore, the age of the mite culture was correlated with the expression of 79 predicted protein-encoding D. farinae genes. The expression of only 4 Cardinium predicted proteins prevailed in young culture, while the rest of the predicted proteins (e.g., BRU, PORCN, ZFPM1, and B9D) were not influenced by culture age (Fig. S3 and Data Set S1, sheet 8). However, variation in mite and Cardinium expression levels was observed among the biological replications of both the old and young samples, allowing this data set to be analyzed to identify correlations between mite and Cardinium genes that could be important for mediating interactions.
Mite regulatory pathways and Cardinium gene expression. Correlations between Cardinium and D. farinae gene expression were identified using dbRDA. We detected a total of 232 KEGG-annotated D. farinae genes belonging to signaling and regulatory pathways associated with mite immune responses, such as phagocytosis, apoptosis, the mitogen-activated protein kinase (MAPK) signaling cascade, endocytosis, the Toll pathway, the tumor necrosis factor (TNF) pathway, the transforming growth factor beta (TGF-b) pathway, lysozyme, and the Toll/Imd pathway (Data Set S1, sheet 9). In addition, four D. farinae genes (DF_06356, DF_09960, DF_14487, and DF_17932) that were not annotated by KEGG were included (Data Set S1, sheet 9) based on their high protein sequence similarity to Toll-like receptors. When the expression levels of these 236 mite genes were used as independent variables, they explained the same amount of variation (R 2 = 0.756 and 0.750) in Cardinium gene expression as the full data set (3,189 mite genes; R 2 = 0.750) ( Table 1). Cardinium gene expression data entered as independent variables explained the higher variation (R 2 = 0.952) of expression levels in mite regulatory/immune pathways than the inverse (R 2 = 0.756). Partial dbRDA models were constructed using the expression levels of Cardinium genes as the dependent variables (all predicted KEGG proteins) and the expression of mite genes belonging to selected pathways as independent factors (i.e., phagocytosis, 31 predicted KEGG-annotated proteins; apoptosis, 30; MAPK signaling cascade, 48; endocytosis, 78; Toll pathway, 21; TNF pathway, 17; TGF-b pathway, 26; lysozyme, 49; Toll/Imd pathway, 23) (Data Set S1, sheet 9; Table 1). The explained variability in these dbRDA models decreased in the following order: lysosome, apoptosis, phagocytosis, endocytosis, MAPK signaling cascade, Toll/Imd pathway, Toll pathway, TNF pathway, and TGF-b pathway ( Table 1). These models indicate a correlation (R 2 ranging from 0.377 to 0.692) between the expression of predicted Cardinium genes as the dependent variable and the expression of genes in these pathways as the independent variable. The expression of predicted Cardinium genes as an independent variable (R 2 ranging from 0.891 to 0.991) was likewise strongly correlated with the expression of predicted mite genes in immune regulatory pathways, decreasing in the following order: phagocytosis, apoptosis, MAPK signaling cascade, endocytosis, TNF pathway, TGF-b pathway, lysosome, Toll pathway, and Toll/Imd pathway.
For the matrix of the mite KEGG-annotated genes in these 9 pathways, including 236 predicted mite genes and the expression data of 448 predicted Cardinium genes, we calculated Spearman rank correlations (Data Set S1, sheets 10 and 11) and obtained 6,684 significant correlations (P , 0.05). These data were then visualized as a heat map ( Fig. 1) and network graph (Fig. S4). There were a total of 105,020 combinations, and 3.5% of the combinations showed significant positive pairwise correlations while 2.9% showed significant negative pairwise correlations (Fig. 1). A cluster analysis of these  Tables S7 and S9 for details). The models were either full models (marked by #) or partial models inferred by the forward selection procedure. R 2 values describe the explained variability by the models. The significance of the models was tested by a permutation test (n = 999 permutations). For each model, degrees of freedom (df), F, and P values are provided.
pairwise expression levels revealed 11 clusters of mite transcripts (d clusters = 11) with similar expression profiles and nine clusters of Cardinium expression patterns (c clusters = 9). The heat map showed the following features. Peaks of positive correlations were observed for the following cluster combinations: c1 and d3 (98% of the genes in this cluster were significantly and positively correlated with Cardinium expression), c2 and d11 (65%), c3 and d11 (34%), c4 and d5 (27%), and c5 and d8 (23%) (Fig. 1). Peaks of negative correlations were observed for the following combinations . The data were clustered by Ward's method; clusters and predicted KEGG-annotated proteins are described in Data Set S1, sheets 10 and 11, in the supplemental material.
There was a marked pattern of the levels of highly expressed Cardinium predicted proteins (e.g., BamD, LepA, LptF, and VirD4 but not LtrA) having negative correlations with the expression of mite genes belonging to immune/regulatory pathways (Fig. 2), while the reverse was also true, i.e., the levels of weakly expressed Cardinium predicted FIG 2 Expression of genes predicted to code for select Cardinium proteins and their correlations with the expression of genes belonging to mite regulatory and immune pathways. Standardized abundance data were used to generate the heat map; Ward's method was used for clustering. The Cardinium predicted proteins are described in Data Set S1, sheet 10.
proteins (e.g., PltB, SymE, MurC) were mostly positively correlated with expression of genes from mite immune/regulatory pathways. Cardinium and metabolic pathways of D. farinae. For D. farinae, 372 transcripts were assigned to KEGG pathways of interest (Data Set S1, sheets 12 to 15). These predicted mite genes were assigned to 63 pathways, forming three clusters with low, intermediate, and strong expression levels (Fig. 3). The pathway with the highest expression levels included transcripts associated with carbohydrate metabolism, purine metabolism, and glycan biosynthesis (Fig. 3). The expression levels of mite transcripts in metabolic pathways entered as independent variables showed a correlation with the expression of predicted Cardinium KEGG genes as dependent variables in the dbRDA model (R 2 = 0.56) ( Table 1). The dbRDA model using Cardinium KEGG gene expression as an independent variable and mite metabolic pathway expression as a dependent variable explained higher variability (R 2 = 0.841) than the previous model. The "culture age" variable explained substantially lower amounts of variation (R 2 = 0.21) than the Cardinium expression levels.

DISCUSSION
Mite metagenomics. The results of metatranscriptomic analysis again confirmed that cultured D. farinae mites are associated with Cardinium (25). We identified Cardinium as the most abundant mite symbiont (1,358 predicted proteins), followed by other bacteria (260 predicted proteins). Our quantitative PCR (qPCR) abundance estimates suggest that, on average, each mite had 103 to 104 Cardinium cells, a result consistent with previous studies (25). We also identified two unknown and possibly novel eukaryotic mite symbionts: Alveolata (1,145 to 5,500 reads per sample) and Microsporidia (203 to 392 reads per sample). The presence of the following alveolates associated with D. farinae has also been previously shown using light microscopy: Gregarina (Apicomplexa) (35,36) and unidentified flagellated protozoa (36,37). However, to our knowledge, Microsporidia have not yet been detected in D. farinae, although they are known to occur in stored-food mites, such as Tyrophagus putrescentiae (22,27,38). These microsporidians should therefore be identified morphologically in the future.
Effect of culture age. The age of mite cultures can affect the levels of allergens and other proteins (39,40), as well as the microbiome composition (22) in mites. D. farinae cultures show specific growth patterns during cultivations. There are three population growth phases: (i) a latency phase (0 to 10th week) that is characterized by a slow increase in the mite population size; (ii) an exponential phase (10th to 20th week) that is characterized by exponential growth of the mite population; and (iii) a "decline" phase (20th week) that is characterized by a large decrease in the mite population size (41,42). Here, we showed that samples collected in the exponential (young) and decline (old) phases largely did not differ in Cardinium abundance or gene expression, with only 4 Cardinium genes having significant expression differences. This indicates that Cardinium probably does not contribute to mite culture collapse during the decline phase. Remarkably, our rarefaction analysis showed some differences between young and old cultures in the number of expressed transcripts, which was 2-fold higher in the mites and 0.5-fold higher in Cardinium in old cultures than in young cultures. Different demographic structures (i.e., a higher number of juveniles in the young cultures) are probably responsible for this effect on mites.
Effect of Cardinium on mite immune/regulatory pathways and metabolism. It is commonly accepted that Wolbachia interacts with the host immune system, influencing gene expression to ensure its persistence within the host (43). The host immune system expresses effector molecules that promote the proliferation of these intracellular symbionts inside the host (44,45). The proliferation of Cardinium may be affected by a similar mechanism. In particular, the expression of 31% of Cardinium genes was correlated with the expression of mite genes belonging to mite immune and/or regulatory pathways.
Several molecular pathways and receptors influenced by Cardinium were identified. The TGF-b receptor pathway regulates parasites in the host gut (46,47). The Toll pathway is involved in mediating innate immune responses in arthropods and is activated by Toll-like receptors (TLRs) (48). The TNF pathway stimulates inflammation and triggers apoptosis through caspase (49). Two predicted proteins from our de novo transcriptome assembly were candidates for TNF receptors (DF_02523, DF_22218). However, their expression levels were very low. The NF-kappa-B receptor belonging to the Toll pathway was not detected here, although a candidate NF-kappa-B receptor was previously found in D. pteronyssinus (50). We identified four possible Toll-like receptors in the D. farinae transcriptome that were not assigned to any KEGG pathway (DF_06356, DF_14487, DF_09960, DF_17932). Their transcripts showed both positive and negative correlations to transcripts from the Cardinium endosymbiont (Data Set S1, sheet 10). We hypothesize that Cardinium can activate mite TLRs with its surface patterns (see below for membrane-associated proteins), as is known for other invertebrate hosts (51)(52)(53)(54).
The mitogen-activated protein kinase (MAPK) signaling pathway is involved in the regulation of multiple cellular processes, such as inflammation, immunity, cell differentiation, proliferation, and death. Caspases involved in MAPK induce apoptosis when overexpressed in tissue culture cells, and bacteria have evolved molecules that downregulate caspases (55). Here, we identified mite transcripts encoding CASP3 and CASP7. CASP3 expression was correlated with the expression of 24 Cardinium genes (e.g., negative correlations for genes coding for metabolism-associated proteins PqqL and RdgB and signaling molecule protein MlaF, and positive correlations for genes coding for metabolism-associated proteins DLAT, MurF, and SufS). CASP7 expression was negatively correlated with that of 4 Cardinium genes (e.g., coding for membrane transport protein LptF and environmental information processing protein RpoN) and positively correlated with 14 genes (e.g., coding for signaling and cellular process protein CTH and uncharacterized protein SufD and for metabolism protein DacA). By these mechanisms, intracellular bacterial symbionts can manipulate host cells, for example, to avoid apoptosis of nurse cells, which provides nutrients to growing symbiont-infested oocytes (56)(57)(58).
Endocytosis is a mechanism of entry of symbionts into the host cell. For example, Buchnera aphidicola) facilitates TOR pathway-mediated endocytosis to gain entry into host cells (Acyrthosiphon pisum) in the aphid blastula (59,60). Wolbachia utilizes host cell phagocytic and clathrin/dynamin-dependent endocytic machinery for horizontal cell-to-cell transfer (61). Additionally, Wolbachia resides in phagosomes, or vacuoles derived from the host endoplasmic reticulum (ER), which help the bacterium replicate and escape host immune surveillance (62). A similar mechanism for cytoplasmic entry and intracellular localization is expected for Cardinium. Previous studies involving fluorescent in situ hybridization with Cy5-labeled universal bacterial probes identified bacteria in the reproductive tract, appendages, and Standardized abundance data were used to generate the heat map; Ward's method was used for clustering. The Cardinium predicted proteins are described in Data Set S1, sheet 15. mesenchymal tissues and inside hemocytes of D. farinae (27). The majority of the bacterial reads from D. farinae were identified as Cardinium in this and previous studies (25). For this reason, it is currently hypothesized that Cardinium occurs inside hemocytes, indicating the hemocyte-based control of Cardinium through phagocytosis (63). In this study, we found that expression of two mite cathepsins (CTSL and CTSF) was negatively correlated with Cardinium protein expression (Data Set S1, sheet 10). Inside phagocytes, Cardinium may be degraded by the proteolytic action of cathepsins, indicating a potential regulatory function of mite cathepsins for Cardinium.
Predicted Cardinium proteins and mite molecular pathways. Several Cardinium genes were predicted to up-or downregulate genes from mite immune pathways via interaction with pathway receptors. The Toll receptor is an example of such a mite receptor that may recognize several Cardinium proteins, such as DegP (serine protease Do [EC 3.4.21.107]), MlaD (phospholipid/cholesterol/gamma-HCH transport system substrate-binding protein), and BamD (outer membrane protein assembly factor). DegP is a bifunctional bacterial protein acting both as a protease and a chaperone regulating cell growth (64). In the host cell, DegP of "Candidatus Liberibacter asiaticus" can presumably digest host proteins using zinc as a cofactor; therefore, DegP is considered a virulence factor (65). The Mla A-F system is universally conserved in Gram-negative bacteria; this system plays an important role in phospholipid transport to ensure the translocation of phospholipids between the inner membrane and periplasmic proteins (66). BamD is an outer membrane lipoprotein of Gram-negative bacteria that is essential for viability. The protein is translocated across the inner membrane and maintained in an assembly-competent state by periplasmic chaperones (67). In our system, the expression of genes coding for the metabolism-associated protein DegP, membrane transport protein MlaD, and signaling protein BamD showed negative (AP1G1, SEC61A, CANX, TUBA, CDC42, HSPA1s, ARFGAP1, RAB11A, RAB35, SRC, and STK3) or positive (e.g., MRT43, PLD1_2, EMC, and LMNB) correlations with expressed genes from all analyzed mite immune/regulatory pathways. The lipopolysaccharide transport system including LptF (lipopolysaccharide export system permease protein) in Gram-negative bacteria is responsible for transporting lipopolysaccharide from the cytoplasmic surface of the inner membrane (68). In our system, LptF expression showed a positive correlation with mite endocytosis, indicating that an increase in Cardinium lipopolysaccharide production may trigger an endocytic response from the mite host to control Cardinium.
Cardinium assembles Fe-S clusters, used as cofactors for many biological processes, by the sulfur formation system Suf (69). Interestingly, in our system, Cardinium SufS (cysteine desulfurase/selenocysteine lyase; EC:2.8.1.7 and EC:4.4.1.16) had strong correlations with several mite genes, mainly with genes coding for proteins involved in MAKK (n = 16) and endocytosis (n = 21). The prevailing correlations were negative, which may be explained by the need of the bacterium to compete for iron with its host.
The function of the GTP-binding protein LepA in bacterial cells is known in Legionella, in which LepA and LepB were shown to promote the nonlytic release of bacteria from their protozoan host by commandeering the exocytic pathway to release type IV secretion system (T4SS) effectors (70)(71)(72). A similar function is suggested for this protein in Cardinium; however, we did not identify proteins associated with the T4SS, except for VirD4. We found positive correlations of LepA with clathrin-mediated endocytosis.
The protein RpoN (RNA polymerase sigma-54 factor) is a regulator of genes involved in nitrogen metabolism and nitrogen assimilation under nitrogen-limited conditions (73); rpoN has been found in many free-living and symbiotic bacteria, e.g., "Ca. Amoebophilus asiaticus" (74), "Ca. Walczuchella monophlebidarum" (75), Blattabacterium (76), and Sodalis (77). The increase of RpoN expression in Cardinium is correlated with increased expression of genes coding for Toll receptors of mite immune/regulatory pathways, both an increase and decrease of expression of genes involved in the endocytosis pathway, a reduced expression of genes encoding caspases (apoptosis), a reduced expression of genes coding for polyamine biosynthesis and purine metabolism, and a reduced expression of genes coding for proteins involved in glycosaminoglycan and pyrimidine metabolism in mites. These results highlight the importance of rpoN in Cardinium-host interactions.
We also identified the following proteins in Cardinium that had not been previously detected (11,78,79): DdpD (ATP-binding cassette domain-containing protein; GenBank no. WP_186292434), VirD4 (T4SS protein VirD4 [EC 7.4.2.8]), and the toxin protein SymE (Fig. S5 and S6). Our alignment of SymE indicated a high similarity to the UniProtKB A0A368ZU39 protein of Schleiferia thermophila (Fig. S5). Our finding of genes encoding components of the T4SS (VirD4) is remarkable because this secretion system has not yet been found in other known species of Cardinium (11), and this finding is consistent with the presence of LepA (see above), which is also known to interact with this system. The protein encoded by virD4 is involved in the production of a transport membrane protein of the T4SS (80). We observed that the expression of the gene predicted to code for VirD4 was correlated with regulatory and immunological pathways in D. farinae (Fig. 5). However, other components of the T4SS were not found in our Cardinium transcriptome.
Effect of Cardinium on host metabolism. Previous studies have associated the presence of intracellular symbionts with increased levels of mitochondrial oxidative phosphorylation in the host, indicating that these bacteria stimulate host metabolism (81,82). Although the precise mechanism by which these bacteria stimulate oxidative phosphorylation is not known, key nutrients provided by intracellular bacteria, such as vitamins, lipoate, pantothenic acid, biotin, and riboflavin, may play a role (32,81,83,84). Our study indicated that Cardinium gene expression was negatively correlated with the expression of genes involved in several metabolic mite pathways (Data Set S1, sheet 10). Negative expression correlations were 3 times more frequent than positive correlations (Fig. 5). Mite populations heavily infested with Cardinium have significantly lower population growth rates (22) and can thus potentially affect mite energy metabolism. The genomes of many intracellular symbionts are highly reduced in comparison to Escherichia coli, e.g., whose genome is 4.64 Mb and has 4,609 genes (85), while the genome of the Cardinium associated with D. farinae is only 1.48 Mb (1,403 genes). The symbionts are largely dependent on their host cells for several nutrients (86,87). Likewise, the Cardinium genome is reduced, making this bacterium highly dependent on its eukaryotic host for obtaining most key metabolites and energy in the form of ATP (11). Collectively, this could explain the interactions between Cardinium genes and mite genes coding for proteins involved in core metabolism.
Cardinium bacteria associated with D. farinae have a complete pathway for lipoic acid biosynthesis, as is known for Cardinium hertigii (11). Lipoate is a highly conserved sulfur-containing cofactor involved in oxidative reactions in a previously characterized Cardinium strain (11,88). Lipoate produced by Cardinium could help the mite host overcome the negative effect on oxidative phosphorylation. Previous studies have shown that biotin may be a key nutrient provided by Cardinium to some insect hosts (11,79), while this was not true for bacteria associated with a strain of Bemisia tabaci (89) or the nematode Pratylenchus penetrans (5). Interestingly, the mite-associated Cardinium also lacks genes encoding the enzymes BioA, BioB, BioD, and BioF but expresses biotin transporters (BioY, BioN, BioM).
Potential effect of Cardinium on mite pheromone production. Some bacteria are known to mediate the aggregation of their hosts (90) and manipulate their defense strategies (91). For example, a decrease in Wolbachia infection in Thasus neocalifornicus was associated with reduced production of defensive and alarm pheromones by the host (92). D. farinae has opisthosomal glands (93) that produce a variety of compounds, including hydrocarbons, terpenes, aromatics, and alkaloids, with suggested defensive, alarm, aggregation, and sex pheromone functions (94). The terpenoid pathways of D. farinae likely produce geranylgeranyl diphosphate, which is required to produce defense-related secretions in aphids and termites (95,96). In our study, the expression of genes from the pathway responsible for the synthesis of terpenoids was positively correlated with the expression of Cardinium genes from cluster c4 (Fig. 4). We hypothesize that Cardinium can stimulate the production of sexual or aggregation pheromones (97)(98)(99) to encourage mites to congregate and increase its own vertical transmission.
Conclusions. The molecular interactions between the mite D. farinae and Cardinium host included the correlation in the transcriptome of mite immune/regulatory and metabolism pathways (apoptosis, endocytosis, lysozome, MAPK, TGF, TNF, and TOLL/IMD) and Cardinium genes (e.g., BamD, LepA, SymE, and VirD4). Cardinium protein expression was negatively correlated with 3-fold-more predicted mite metabolic pathways (glycolysis and citrate cycles) than those of positive correlations. Cardinium possesses a complete pathway for lipoic acid biosynthesis but not biotin. The correlation analysis indicates the possible effect Cardinium on terpenoid pathways of D. farinae and production of sexual or aggregation pheromones.

MATERIALS AND METHODS
Mite populations. The laboratory cultures of D. farinae originated from rearing facilities of the Crop Research Institute, Prague, Czechia, and were maintained as described previously (22). Mites were mass reared in Iwaki flasks on house dust mite diet (HDMd) and then harvested. The detailed rearing and harvesting methodology was described previously (21).
Sampling. Mites were reared in 12 Iwaki chambers, and 1-month (young) and 3-month (old) cultures were collected for further analysis. Live mites were collected from the plugs and surfaces of the flasks with a brush and placed into sterile microcentrifuge tubes. We modified a previously published protocol to collect the mites for RNA extraction and perform surface sterilization to reduce contamination from environmental microbes that may be present on the exoskeleton (27), as follows: (i) 30-to 40-mg samples of mites were weighed (the fresh weight of a mite is approximately 6 mg, so each sample contained 5,000 to 6,700 mites); (ii) each mite sample was placed into 100% ethanol, vortexed for 5 s, and centrifuged (13,000 Â g for 1 min); (iii) the supernatant was replaced with a 1:10 bleach solution containing 5% sodium hypochlorite, and then samples were mixed by vortexing for 5 s and centrifuged (13,000 Â g for 2 min); and (iv) samples were washed twice with double-distilled water (ddH 2 O) to remove residual bleach. Steps i to iv were performed on ice, and we collected six biological replicates for each 1-and 3month culture for a total of 12 samples. The same methodology was used for mite sampling for DNA isolation.
RNA extraction. RNA isolation was performed using the NucleoSpin RNA kit (catalog no. 740984.50; Macherey-Nagel, Duren, Germany) according to the manufacturer's instructions, with the following modifications: (i) mites were homogenized for 30 s in a glass tissue grinder (Kavalierglass, Prague, Czechia) in 500 ml of lysis buffer; (ii) samples were centrifuged at 2,000 Â g for 3 s; and (iii) DNA was degraded by DNase I at 37°C according to the manufacturer's protocol (Riboclear plus, catalog no. 313-50; GeneAll, Lisbon, Portugal). RNA quality was evaluated using a NanoDrop (NanoDrop One; Thermo Scientific, Waltham, MA, USA) and an Agilent 2100 bioanalyzer (Agilent Technologies, Santa Clara, CA, USA), and samples were shipped on dry ice to the MrDNA laboratory (Shallowater, TX) for downstream processing and sequencing.
DNA extraction for quantification of Cardinium by qPCR. Mites were transferred to 2.0-ml screwcap MCT conical NS tubes (catalog no. CP5912; Alpha Laboratories, Eastleigh, Hampshire, UK) with 0.5 g of an equal mixture of 0.3-mm and 1.0-mm garnet sharp particles (catalog no. 11079103gar and 11079110gar; BioSpec) and 1 glass bead (3-mm diameter; catalog no. R155761; BioSpec). The samples were homogenized in a mini-beadbeater (BioSpec) for 5 min and then centrifuged (10,000 Â g for 2 min). The supernatant was transferred to a new sterile microcentrifuge tube with 20 ml of proteinase K and incubated at 56°C overnight. The remainder of the DNA extraction procedure was performed using the QIAamp DNA micro kit (Qiagen, Hilden, Germany; catalog no. 56304) and by following the manufacturer's protocol for tissue samples. This method was used to extract DNA from pools for mites. In contrast, DNA from individual mites was isolated using 50 ml of a 5% solution of Chelex sodium (catalog no. C7901; Sigma-Aldrich, Saint Louis, MO, USA). Mites were crushed by pipetting, and then 3 ml of proteinase K was added. Samples were incubated at 56°C overnight, and the rest of the DNA extraction procedure was performed according to a previously described protocol (100). The abundance of Cardinium bacteria was also quantified using a previously described protocol (22). In brief, amplification of a 160-bp fragment of the 16S rRNA gene of Cardinium was performed on a StepOnePlus real-time PCR system (Life Technologies, Carlsbad, CA, USA) using 96-well plates with GoTaq qPCR master mix (catalog no. A6001; Promega) containing SYBR green as a double-stranded-DNA-binding dye. The primers Card_6QF3-TGCAAATCTCAAAAGCATGT-5 and Card_6QR 3-TCAAGCTCTACCAACTCCCA-5 (101) were used.
RNA processing and sequencing. The concentration of total RNA was determined using the Qubit RNA assay kit (Life Technologies). For rRNA depletion, first, 1,000 ng of total RNA was used to remove any residual DNA contamination using Baseline-ZERO DNase (Epicentre, Madison, WI, USA) by following the manufacturer's instructions, followed by purification using the RNA Clean & Concentrator kit (Zymo Research, Irvine, CA, USA). DNA-free RNA samples were used for rRNA removal with the Ribo-Zero Gold rRNA removal epidemiology kit (Illumina, San Diego, CA, USA), and final purification was performed using RNA Clean & Concentrator columns (Zymo Research). DNA-free and rRNA-depleted RNA samples were used for library preparation using Kapa mRNA hyperprep kits (Kapa Biosystems, Wilmington, MA, USA) by following the manufacturer's instructions. In all the libraries, DNA concentration was measured using the Qubit dsDNA HS assay kit (Life Technologies), and the average library size was determined using an Agilent 2100 bioanalyzer (Agilent Technologies). The libraries were then pooled in equimolar ratios at 2 nM, and the library pool (10.0 pM) was clustered using cBot (Illumina) and sequenced using the Illumina HiSeq 2500 system as 250 Â 250 bp paired-end (PE) ends (500 cycles).
Sequence processing. Read processing was performed using CLC Genomic Workbench v.20 (Qiagen, Venlo, Netherlands). PE reads were trimmed to remove residual adapter sequences and then filtered by length and quality score according to the default software parameters. The filtered reads were mapped to the previously described genome of Cardinium (25) (BioProject accession no. PRJNA555788; GCA_007559345.1), and unmapped reads were de novo assembled and processed as the D. farinae meta-transcriptome (25) using CLC Genomic Workbench software.
Contig processing. The reads were de novo assembled to generate 6 Â 10 4 contigs with a contig N 50 of 1,423 bp. In this data set, 3 Â 10 4 coding DNA sequences (CDSs) were predicted and annotated from these contigs by using a set of known proteins as the trusted source (see below) using Prokka (102,103) installed on the Galaxy server (104).
Expression analyses. Trimmed reads were mapped against the annotated transcriptome assembly generated as described above using CLC Workbench. The mapped reads ranged from 32 Â 10 6 to 90 Â 10 6 paired reads per sample for D. farinae and 0.16 Â 10 6 to 1.2 Â 10 6 paired reads per sample for Cardinium. The total numbers of reads mapped to each mite transcript and Cardinium locus were used to generate a count matrix for the expression analysis.
PHMMER (115) bundled with the Galaxy server was used to match predicted proteins from the de novo transcriptome to annotated proteins in the reference library above. Matches with bit scores higher than 300 were retained and used for functional predictions. Some proteins were manually annotated with the EMBL-EBI website with reference proteomes and assembled genomes as databases (116). Proteins with the best matches to Archaea, Chordata, Cyanobacteria, plants, and nonmicrosporidian fungi were discarded as possible contaminants of the transcriptome from the mite diet. We also eliminated apicomplexan, microsporidian, and bacterial proteins (except Cardinium) from prior analyses (Data Set S1, sheets 4 and 7).
Selected proteins and metabolic pathway analyses. The KEGG mapper tool was used to infer cellular functions from protein sequences using the GhostKOALLA annotator against the genus prokaryotes 1 family eukaryotes 1 viruses databases (117). A comparison of predicted proteins of D. farinae (with or without expression) to D. pteronyssinus is given in Data Set S1, sheet 16.
Statistical analyses. We tested whether the numbers of Cardinium 16S rRNA gene copies in young and old mite cultures were significantly different using a Mann-Whitney test in PAST (118) on log 10transformed data. Expression analyses of mite, Cardinium, microsporidian, and alveolate genes were performed separately and then subsampled by recalculation to the lowest number of reads per sample (119). The standardized and unstandardized data were compared by nonmetric multidimensional scaling (NMDS) (Fig. S8), and the outliers of unstandardized data are shown in Fig. S9. Rarefaction analyses were performed in PAST (118) to standardize reads. Expression of the genes assigned to KEGG pathways of interest was used for further analyses. Comparisons between the effects of the cultures (old and young) were performed using dbRDA on the Bray-Curtis dissimilarity distance matrix in the R package vegan (120). To identify genes with different expression levels between old and young mite cultures, we compared the expression levels of genes encoding predicted KEGG-annotated proteins by parametric analysis of variance (ANOVA) using the Benjamini-Hochberg correction and visualized the results as volcano plots in XLSTAT (121).
Potential interactions among KEGG-annotated gene expression of Cardinium and mite genes were analyzed by two types of correlation analyses, dbRDA and Spearman correlation analysis. In dbRDA, mite expression levels were set as "species," Cardinium gene expression levels were set as "environmental variables" (122), and the Bray-Curtis dissimilarity distance matrix was used. To identify variables with the strongest influence on the dbRDA model, we applied a variable selection procedure ("both") using the ordistep function (123) to reduce a full data set to a subset of predictor environmental variables. Then, a new model based on a reduced set of "environmental" variables was constructed and analyzed as described above. dbRDA models were generated separately for mite whole-transcriptome data sets and metabolic pathways and for regulatory and immune pathways associated with the host reaction to infection or symbiotic bacteria (apoptosis, endocytosis, lysozyme, MAPK signaling cascade, TGF-b pathway, TNF pathway, and Toll/Imd pathway). Models with the best predictive power were selected based on their explained variability (R 2 ). Spearman correlations using bootstrap permutation were calculated in PAST. Correlation values were filtered based on their significance level (P # 0.05) in R and were visualized in Cytoscape v3.8.0 (124) using a correlation-based network in Metscape 3 (125). Only edges with an absolute R 2 of $0.75 were included in the initial analysis. To further reduce the numbers of edges, only edges with "strength correlations" of $8 (R 2 $ 0.86) were considered. The final network was organized in the Compound Spring Embedder (CoSE) layout (126).
Heat maps were constructed from the standardized abundance of predicted proteins or Spearman correlation coefficients. Data sets were clustered by Ward's method using the Euclidean distance in PAST.
Data availability. Raw reads were deposited into NCBI's Sequence Read Archive (SRA) database under accession numbers SRR11058239 to SRR11058250, associated with BioProject PRJNA605718.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only. DATA SET S1, XLSX file, 19 MB.