Division of labor within psyllids: metagenomics reveals an ancient dual endosymbiosis with metabolic complementarity in the genus Cacopsylla

ABSTRACT Hemipteran insects are well-known for their ancient associations with beneficial bacterial endosymbionts, particularly nutritional symbionts that provide the host with essential nutrients such as amino acids or vitamins lacking in the host’s diet. Therefore, these primary endosymbionts enable the exploitation of nutrient-poor food sources such as plant sap or vertebrate blood. In turn, the strictly host-associated lifestyle strongly impacts the genome evolution of the endosymbionts, resulting in small and degraded genomes. Over time, even the essential nutritional functions can be compromised, leading to the complementation or replacement of an ancient endosymbiont by another, more functionally versatile bacterium. Herein, we provide evidence for a dual primary endosymbiosis in several psyllid species. Using metagenome sequencing, we produced the complete genome sequences of both the primary endosymbiont “Candidatus Carsonella ruddii” and an as yet uncharacterized Enterobacteriaceae bacterium from four species of the genus Cacopsylla. The latter represents a new psyllid-associated endosymbiont clade for which we propose the name “Candidatus Psyllophila symbiotica.” Fluorescent in situ hybridization confirmed the co-localization of both endosymbionts in the bacteriome. The metabolic repertoire of Psyllophila is highly conserved across host species and complements the tryptophan biosynthesis pathway that is incomplete in the co-occurring Carsonella. Unlike co-primary endosymbionts in other insects, the genome of Psyllophila is almost as small as the one of Carsonella, indicating an ancient co-obligate endosymbiosis rather than a recent association to rescue a degrading primary endosymbiont. IMPORTANCE Heritable beneficial bacterial endosymbionts have been crucial for the evolutionary success of numerous insects by enabling the exploitation of nutritionally limited food sources. 
Herein, we describe a previously unknown dual endosymbiosis in the psyllid genus Cacopsylla, consisting of the primary endosymbiont “Candidatus Carsonella ruddii” and a co-occurring Enterobacteriaceae bacterium for which we propose the name “Candidatus Psyllophila symbiotica.” Its localization within the bacteriome and its small genome size confirm that Psyllophila is a co-primary endosymbiont widespread within the genus Cacopsylla. Despite its highly eroded genome, Psyllophila perfectly complements the tryptophan biosynthesis pathway that is incomplete in the co-occurring Carsonella. Moreover, the genome of Psyllophila is almost as small as Carsonella’s, suggesting an ancient dual endosymbiosis that has now reached a precarious stage where any additional gene loss would make the system collapse. Hence, our results shed light on the dynamic interactions of psyllids and their endosymbionts over evolutionary time.

Numerous insects maintain long-lasting associations with heritable bacterial endosymbionts that provide the host with essential nutrients lacking in its diet (1). Plant sap-feeding and blood-feeding insects in particular are well-known to harbor nutrient-providing endosymbionts in specialized cells called bacteriocytes, which may form a tissular structure called a bacteriome (2,3). These so-called primary endosymbionts are obligatory for host survival and reproduction, as they provide essential amino acids and/or vitamins that the host cannot produce or obtain from its food source (4-9). Hence, these bacteria have been crucial for the evolutionary success of numerous insects, enabling the exploitation of nutritionally unbalanced food sources such as vertebrate blood and plant sap.
In turn, the host-associated lifestyle has a strong impact on the genome evolution of the endosymbionts. Their strictly intracellular environment, small effective population size, and frequent bottlenecks due to vertical transmission result in genomic decay through the accumulation of deleterious mutations (Muller's ratchet) and the loss of genes that are no longer needed (10-12). Over evolutionary time, this has produced some of the smallest bacterial genomes known to date (6), enriched in genes involved in the production of nutrients required by the host. However, eventually even these pathways can be degraded, leading to either the complementation or the replacement of the ancient endosymbiont by another, more functionally versatile, bacterium or fungus (13-19).
This dynamic can be observed in several plant sap-feeding hemipterans that rely on more than one primary endosymbiont to produce all necessary nutrients. Notably, the Auchenorrhyncha (cicadas, planthoppers, spittlebugs) are well-known for their ancient dual endosymbiotic consortia, where two co-primary endosymbionts jointly produce the complete set of essential nutrients required by the host, resulting in an intricate metabolic interdependence between the different partners (20). Nonetheless, multiple endosymbiont replacements occurred over time to compensate for the extreme genome erosion of the ancient symbionts (6,16,21-26). A similar pattern occurs in aphids (Sternorrhyncha). While most species harbor a single primary endosymbiont, Buchnera aphidicola, which provides the host with the 10 essential amino acids and the vitamins biotin and riboflavin (27), dual-endosymbiotic systems have evolved repeatedly in multiple aphid lineages to compensate for lost pathways in B. aphidicola (13,14,18,28-30).
Similar dual primary endosymbioses may be widespread in psyllids (Hemiptera: Psylloidea), a species-rich group of phloem-feeding jumping plant lice. Like other plant sap-feeding insects, psyllids harbor a bacteriocyte-associated primary endosymbiont ("Candidatus Carsonella ruddii", hereafter Carsonella), which provides the host with essential amino acids (31-34). Carsonella is present in all investigated psyllid species and exhibits strict host-symbiont co-divergence, suggesting a single infection of a common ancestor of all extant psyllids (31,35,36). Its genome is extremely streamlined and figures among the smallest bacterial genomes known to date (157-175 Kbp) (32). Due to this extreme genome reduction, some Carsonella strains are no longer able to produce the full complement of essential amino acids, questioning their ability to fulfill their symbiotic function without compensation from host genes or co-occurring symbiotic bacteria (34,37).
Additional endosymbionts have indeed been observed to co-inhabit the bacteriome with Carsonella in several species (3,38-40). In these cases, Carsonella is located in bacteriocytes surrounding the bacteriome, while a second bacterium occurs in the syncytium at the center of the bacteriome. Importantly, the taxonomy of the co-primary endosymbiont varies depending on the psyllid species. While the syncytium-symbiont (Y-symbiont) of the mulberry psyllid Anomoneura mori is an uncharacterized Enterobacteriaceae bacterium (Gammaproteobacteria) (38) whose symbiotic role is unknown, the citrus psyllid Diaphorina citri harbors "Candidatus Profftella armatura" (Betaproteobacteria). The latter is a defensive and nutritional endosymbiont that produces vitamins, carotenoids, and a polyketide toxin, i.e., metabolites that are not provided by Carsonella (40,41). In contrast, the psyllid species Ctenarytaina eucalypti and Heteropsylla cubana harbor symbionts closely related to the insect endosymbionts "Ca. Moranella endobia" and Sodalis, whose genomes precisely complement several amino acid biosynthesis pathways missing from the co-occurring Carsonella strains (34). Despite typical hallmarks of vertically transmitted intracellular bacteria, the genomes of both endosymbionts are less reduced (>1 Mbp), suggesting a more recent acquisition relative to Carsonella, presumably to compensate for lost functions in the latter. In addition, numerous psyllid microbiome studies revealed highly abundant but as yet uncharacterized Enterobacteriaceae bacteria in diverse species from several psyllid families (42-47), suggesting that dual primary endosymbioses may be more widespread in psyllids than previously thought.
Herein, we aim to elucidate the evolutionary and metabolic relationships between psyllids of the genus Cacopsylla (Psyllidae) and their Enterobacteriaceae endosymbionts. Many Cacopsylla species have indeed been shown to harbor highly abundant Enterobacteriaceae endosymbionts that are closely related to the Y-symbiont co-inhabiting the syncytium of the bacteriome in A. mori (43,46,47). Furthermore, these symbionts were present in all tested individuals of a given species, suggesting that they may represent co-primary endosymbionts widespread in this genus. In this study, we produced the complete genome sequences of both Carsonella and the Enterobacteriaceae endosymbionts of four Cacopsylla species (C. melanoneura, C. picta, C. pyri, and C. pyricola) known to harbor closely related endosymbionts from our previous metabarcoding studies (46,47). Fluorescent in situ hybridization confirmed the co-localization of both endosymbionts in the bacteriome. Comparative genomic analyses revealed that the Enterobacteriaceae endosymbionts represent a psyllid-associated clade among other insect endosymbionts. Its genome is almost as small as that of Carsonella and complements the tryptophan biosynthesis pathway that is compromised in the co-occurring Carsonella.

All four Cacopsylla species harbor two endosymbionts with tiny genomes
To investigate endosymbiont genetic diversity across different Cacopsylla species and genotypes, 12 insect metagenomes were sequenced. These metagenomes encompassed four different host species: C. melanoneura and C. picta, which complete their development on apple (and also on hawthorn in the case of C. melanoneura) and the pear psyllids C. pyri and C. pyricola. Multiple metagenomes were sequenced for C. melanoneura (N = 8) and C. picta (N = 2), covering different Cytochrome Oxidase I (COI) haplotypes and, in the case of C. melanoneura, different regions of origin (Aosta Valley vs. South Tyrol, Italy), and different host plants (apple vs. hawthorn) (Table 1). As expected, the majority of the Nanopore reads belonged to the insect genome, and only about 5% of the reads (range: 2.94%-9.80%) corresponded to non-host reads. Nonetheless, circular genomes of the primary endosymbiont Carsonella could be assembled from all metagenomes (Table 1) with coverages of 85-408×. Genome size ranged from 169,917 to 171,920 bp with 14.98%-15.53% GC content, similar to previously sequenced Carsonella genomes from other psyllid genera (32,34,40,41). The genomes encoded 182-190 protein-coding genes, 1 ribosomal RNA operon, and 26-27 tRNAs (Table 1). Synteny and gene content were highly conserved across all genomes, with 161 out of 184 orthogroups (87.5%) shared across all 12 genomes (Fig. 1a and b).
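As a minimal illustration of how a GC content such as the 14.98%-15.53% reported above is obtained from an assembled sequence, the following sketch counts G and C among the unambiguous bases. The function name `gc_content` and the toy sequence are illustrative assumptions and are not taken from the study's pipeline.

```python
def gc_content(seq: str) -> float:
    """Return GC content as a percentage of unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored. This is an illustrative helper,
    not the tool used in the study.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


# Toy AT-rich sequence (hypothetical), 20 bases with one G and one C.
toy_sequence = "ATATTTAAGCATATTAAATT"
print(f"GC content: {gc_content(toy_sequence):.2f}%")
```

For a real genome, the sequence would be read from the FASTA file of the finished assembly before calling the function.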
In addition to Carsonella, a second circular genome could be assembled from 10 out of the 12 metagenomes (Table 1) with coverages of 19-431×. These genomes belonged to the uncharacterized Enterobacteriaceae endosymbiont previously identified through 16S rRNA gene metabarcoding (46,47). Contigs of this symbiont were also present in the two remaining metagenomes (both from C. melanoneura), but the coverage was insufficient to assemble complete genomes. Synteny and gene content were highly conserved across all Enterobacteriaceae genomes (Fig. 1c and d).
They contained 205-208 protein-coding genes, 1-3 pseudogenes, 1 ribosomal RNA operon, 27 tRNAs, and 2 ncRNAs (Table 1). Moreover, 196 out of 209 orthogroups (93.78%) were shared across all 10 genomes (Fig. 1d), indicating that the functional repertoire is highly similar across all four host species. Taken together, all four Cacopsylla species harbor two endosymbionts with typical hallmarks of a long intracellular symbiotic lifestyle, such as extremely small genomes and low GC content.
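The shared-orthogroup percentages reported above (161 of 184 for Carsonella, 196 of 209 for the Enterobacteriaceae symbiont) can be reproduced from an orthogroup presence/absence table. The sketch below assumes such a table is available as a mapping from genome name to the set of orthogroups it contains; the genome and orthogroup identifiers in the toy example are hypothetical.

```python
from typing import Dict, Set


def core_orthogroups(presence: Dict[str, Set[str]]) -> Set[str]:
    """Orthogroups present in every genome (the 'shared' set)."""
    genomes = list(presence.values())
    if not genomes:
        return set()
    return set.intersection(*genomes)


def pan_orthogroups(presence: Dict[str, Set[str]]) -> Set[str]:
    """All orthogroups found in at least one genome."""
    return set().union(*presence.values())


def core_percentage(n_core: int, n_total: int) -> float:
    """Percentage of orthogroups shared by all genomes."""
    return 100.0 * n_core / n_total


# Toy presence/absence table with hypothetical identifiers.
toy = {
    "genome_A": {"OG1", "OG2", "OG3"},
    "genome_B": {"OG1", "OG2", "OG4"},
    "genome_C": {"OG1", "OG2", "OG3", "OG4"},
}
core = core_orthogroups(toy)
pan = pan_orthogroups(toy)
print(sorted(core), f"{core_percentage(len(core), len(pan)):.1f}%")

# Counts reported in the text.
print(f"Carsonella: {core_percentage(161, 184):.2f}%")
print(f"Enterobacteriaceae symbiont: {core_percentage(196, 209):.2f}%")
```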

The Enterobacteriaceae symbionts represent a new clade of insect endosymbionts
To determine the phylogenetic position of the newly sequenced psyllid endosymbionts, we performed a maximum-likelihood phylogenomic analysis based on 67 single-copy genes present in 46 genomes, namely, the 10 Enterobacteriaceae endosymbionts of Cacopsylla spp., 33 insect endosymbionts from Gammaproteobacteria, and 3 Pseudomonas entomophila strains as outgroups (Fig. 2). The insect endosymbionts included the two previously sequenced endosymbionts of the psyllid species C. eucalypti and H. cubana, as well as both obligate and facultative endosymbionts of diverse hemipterans (aphids, adelgids, leafhoppers, mealybugs, and stinkbugs), beetles (reed beetles and weevils), and the tsetse fly (Table S1). Interestingly, the Enterobacteriaceae endosymbionts of Cacopsylla spp. were not closely related to the previously sequenced endosymbionts of the psyllids C. eucalypti and H. cubana (34) (Fig. 2). Instead, they formed a clade with full bootstrap support that was most closely related to "Ca. Annandia adelgestsuga" and "Ca. Annandia pinicola," nutritional endosymbionts of adelgids (48), and "Ca. Nardonella sp.," ancient endosymbionts of weevils (49) (Fig. 2). Hence, the Enterobacteriaceae endosymbionts of Cacopsylla spp. represent a new psyllid-associated clade of insect endosymbionts for which we propose the name "Candidatus Psyllophila symbiotica" (hereafter Psyllophila).

Both Cacopsylla endosymbionts are localized in the bacteriome
Fluorescence in situ hybridization with Carsonella- and Psyllophila-specific probes revealed that adults of all Cacopsylla species exhibit the same pattern of endosymbiont co-localization in the same bacteriome (Fig. 3). The bacteriomes are large, paired organs localized in the insect's abdomen. A single bacteriome contains two distinct parts: central and peripheral. Uninucleated bacteriocytes filled with Carsonella are located in the peripheral zone of the bacteriome (Fig. 3), whereas the central part is occupied by a multinucleated syncytium filled with Psyllophila cells as well as some bacteriocytes containing Carsonella (Fig. 3).

Metabolic complementarity between Carsonella and Psyllophila
The Clusters of Orthologous Genes (COG) category "amino acid transport and metabolism" was enriched in all sequenced Carsonella genomes (Fig. 4a), in line with its role as a nutritional symbiont. Indeed, based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway annotation, the biosynthesis pathways for 8 of the 10 essential amino acids are complete or almost complete in all 12 Carsonella strains from the 4 Cacopsylla species (Fig. 4b). Most of the missing functions (hisN in the histidine pathway, dapC in the lysine pathway, thrB in the threonine pathway, and aroE in the shikimate pathway, Fig. 4b) are also missing in all previously sequenced Carsonella genomes (Tables S2 and S3).
The same applies to the methionine biosynthesis pathway, for which only the last reaction (metE) is present in all sequenced Carsonella genomes from this and previous studies (Fig. 4b; Table S3). The only difference between the Cacopsylla-associated Carsonella strains was the absence of aroB in the strains from C. picta and C. pyri, whereas this gene is present in all Carsonella strains from C. melanoneura and C. pyricola (Fig. 4b; Table S3). Interestingly, the tryptophan biosynthesis pathway was incomplete in all 12 Carsonella strains from Cacopsylla spp., in that only trpE and trpG were present, whereas the rest of the pathway was missing (Fig. 4b).
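The logic of pathway complementation can be made explicit with a small set-based check: a pathway is complete in the consortium if every gene is encoded by at least one partner. In the sketch below, the Carsonella gene set (trpE and trpG only) follows the text, whereas the seven-gene pathway definition and the Psyllophila gene set are assumptions chosen for illustration and are not taken from the annotation tables of this study.

```python
from typing import Iterable, List, Set

# Assumed canonical tryptophan pathway gene list (illustrative; in some
# bacteria trpG/trpD or trpC/trpF are fused into single genes).
TRP_PATHWAY = ("trpE", "trpG", "trpD", "trpC", "trpF", "trpB", "trpA")

# From the text: only trpE and trpG are present in Carsonella from Cacopsylla.
CARSONELLA_TRP: Set[str] = {"trpE", "trpG"}

# Hypothetical gene set for the co-primary symbiont, for illustration only.
PSYLLOPHILA_TRP_ASSUMED: Set[str] = {"trpD", "trpC", "trpF", "trpB", "trpA"}


def missing_genes(pathway: Iterable[str], *gene_sets: Set[str]) -> List[str]:
    """Return pathway genes that none of the supplied genomes encodes."""
    combined = set().union(*gene_sets)
    return [gene for gene in pathway if gene not in combined]


print("Carsonella alone lacks:", missing_genes(TRP_PATHWAY, CARSONELLA_TRP))
print(
    "Consortium lacks:",
    missing_genes(TRP_PATHWAY, CARSONELLA_TRP, PSYLLOPHILA_TRP_ASSUMED),
)
```

With real data, the gene sets would be populated from the KEGG annotation of each genome rather than typed in by hand.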

Repeated gene losses throughout Carsonella evolution
Apart from the Carsonella genomes presented herein, complete genome sequences are available for 11 Carsonella strains from 9 psyllid species representing 5 genera and 3 families (Aphalaridae, Psyllidae, and Triozidae) (Table S2). The functional repertoire of these genomes is quite conserved, since 135 out of 197 orthogroups (68.5%) were shared across all 23 genomes and specific orthogroups occurring only in strains from particular host species or genera were rare (14/197) (Fig. 5a). In contrast, host lineage-specific losses of orthogroups were more common. For instance, 12 orthogroups were specifically absent from the 3 Carsonella strains from Heteropsylla texana, Pachypsylla celtidis, and Pachypsylla venusta (Fig. 5a). Similarly, six orthogroups were specifically absent from the Carsonella strains from Ctenarytaina spp., four orthogroups were absent from strains from Cacopsylla spp., and three orthogroups were absent from strains from Pachypsylla spp. (Fig. 5a). These differences are also reflected in repeated losses of genes or entire pathways involved in essential amino acid biosynthesis across the Carsonella phylogeny (Fig. 5b). Notably, the tryptophan pathway has been lost at least three times independently, as it is incomplete or missing in all Carsonella strains associated with the genera Cacopsylla and Heteropsylla (Psyllidae), as well as Ctenarytaina and Pachypsylla (Aphalaridae) (Fig. 5b; Table S3). In contrast, this pathway is complete in the Carsonella strains from Bactericera spp. (Triozidae) and Diaphorina citri (Liviidae) (Fig. 5b; Table S3). Other repeatedly lost functions include the histidine biosynthesis pathway as well as the genes aroB and dapE, implicated in the shikimate and lysine pathways, respectively (Fig. 5b; Table S3). Concomitantly, co-primary endosymbionts complementing the missing amino acid biosynthesis pathways have been identified in several species, i.e., complementing tryptophan in Cacopsylla spp. (this study) and H. cubana (34) and both tryptophan and arginine in C. eucalypti (34) (Fig. 5b).

DISCUSSION
Herein, we present the complete genome sequences of both Carsonella and the uncharacterized Enterobacteriaceae endosymbionts of four Cacopsylla species from different host plants. The Enterobacteriaceae endosymbionts represent a psyllid-associated clade among other insect endosymbionts, for which we propose the name "Ca. Psyllophila symbiotica". Both endosymbionts co-occur within the bacteriome, exhibiting the same co-localization pattern (Carsonella in peripheral bacteriocytes, Psyllophila in the central syncytium) as in the dual endosymbioses of D. citri and A. mori (38-40).
In combination with a small and AT-rich genome, the bacteriome localization confirms that Psyllophila is a co-primary endosymbiont widespread within the genus Cacopsylla. Interestingly, unlike co-occurring endosymbionts in other psyllid species (34,40,41), the Psyllophila genome is almost as small as the genome of Carsonella, indicating an ancient dual endosymbiosis rather than a recent acquisition of a more versatile symbiont to rescue a degrading primary endosymbiont. Despite having a tiny and functionally limited genome, Psyllophila has retained the necessary genes to complement the tryptophan biosynthesis pathway that is compromised in the co-occurring Carsonella. This appears to be a recurring theme across Carsonella evolution since the tryptophan pathway is the most frequently lost amino acid biosynthesis pathway based on the genomes available to date. Specifically, this pathway has been lost multiple times independently, namely, in the Carsonella strains associated with species from the genera Pachypsylla and Ctenarytaina (Aphalaridae) and the psyllid lineage leading to both Heteropsylla and Cacopsylla (Psyllidae). Apart from tryptophan, the arginine and histidine pathways have also been lost in specific Carsonella strains, albeit less frequently (34). However, the currently available Carsonella genomes cover only a few branches on the psyllid tree of life; therefore, we do not know how frequent (or rare) these gene losses actually are. Concomitantly, co-primary endosymbionts complementing the missing amino acid biosynthesis pathways have been identified in several species, namely, Psyllophila and a Sodalis-like symbiont complementing tryptophan in Cacopsylla spp. (this study) and H. cubana (34), respectively, and another Sodalis-like symbiont complementing both tryptophan and arginine in C. eucalypti (34). Intriguing cases in this context are the psyllid species H. texana, P. celtidis, and P. venusta, whose Carsonella strains have lost both the histidine and tryptophan pathways, but no co-primary endosymbionts have been observed to date (34). Possible alternative scenarios are that the missing genes are encoded by the host, e.g., after horizontal transfers of bacterial genes to the host genome, or that the amino acids in question are present in sufficient quantities in the phloem sap of the psyllid's host plant. Although numerous genes of bacterial origin have indeed been identified in the genome of P. venusta, they do not restore the missing amino acid pathways (37).

FIG 5 (caption fragment) See Table S3 for a detailed list of genes identified based on KEGG pathway annotation. The presence of known co-primary endosymbionts is indicated using blue dots for amino acid-providing nutritional co-primary endosymbionts and red dots for the defensive and nutritional endosymbiont "Ca. Profftella armatura".
Based on its genome sequence, Psyllophila not only rescues tryptophan biosynthesis but also encodes partial biosynthesis pathways for the vitamins biotin and riboflavin, as well as all necessary genes for the synthesis of carotenoids, pigments that may protect against oxidative damage to DNA (50). This represents a striking convergence with "Ca. Profftella armatura," the co-primary endosymbiont in several Diaphorina species, which has an almost identical gene set for these pathways as Psyllophila (41). As in Psyllophila, the last step in the riboflavin pathway is missing in "Ca. Profftella armatura," but the relevant gene has been detected in the genomes of the psyllids D. citri and P. venusta, likely due to a horizontal transfer from an unknown bacterium (37). Likewise, we identified a similar riboflavin synthase gene encoded in the genomes of all four Cacopsylla species via BLAST searches of the D. citri gene against preliminary assemblies of the insect genomes from our metagenomic data sets. Hence, it is likely that riboflavin can be jointly synthesized by Psyllophila and its psyllid hosts, just like in the symbiosis of D. citri and Profftella. In any case, the functional similarity between two distantly related psyllid endosymbionts highlights the importance of these metabolites for the psyllid hosts and/or the endosymbionts.
Taken together, our data shed light on the dynamic interactions of psyllids and their endosymbionts over evolutionary time. Notably, the tiny and highly eroded genome of Psyllophila indicates a long-lasting dual endosymbiosis of Carsonella and Psyllophila within the genus Cacopsylla. Based on fossil records, extant psyllids (superfamily Psylloidea) evolved relatively recently, about 48 mya (51). The Psyllidae family is likely younger (52), with some fossils of Cacopsylla spp. estimated to date back 16-23 mya, but this has not been formally validated. In addition, recent microbiome studies confirm the presence of other Enterobacteriaceae symbionts in psyllid species from other families, notably Triozidae and Aphalaridae (42-44,53), suggesting that related co-primary endosymbionts may be more widespread across psyllid families than currently thought. However, in the absence of genomic data for these Enterobacteriaceae symbionts, it is unknown whether they belong to the same genus or share a common ancestor with Psyllophila.
In any case, the Carsonella-Psyllophila dual endosymbiosis in Cacopsylla spp. has likely reached a highly precarious state since no functional redundancy exists between the two endosymbionts and any additional gene loss would destabilize the symbiotic system. Considering the diversity of predominant psyllid-associated bacteria revealed by previous studies (35,42-47,54), it is likely that the ancient endosymbiont Psyllophila has already been replaced by younger symbionts in some psyllid lineages. For instance, this may have been the case in H. cubana and C. eucalypti, which harbor more recently acquired co-primary endosymbionts with larger genomes (34). It is tempting to speculate that species that are not known to harbor co-primary endosymbionts today (e.g., P. venusta) may have harbored a similar dual endosymbiosis in the past, but the co-symbiont was lost without replacement, maybe because its functions were no longer required after a change in ecological conditions (e.g., change of host plant, evolution of gall-forming behavior). This could also explain why the Carsonella strains in these species have lost similar genes and pathways as the strains existing in dual endosymbiotic systems today.
This raises the question of whether these pathways were lost before or after the establishment of the dual primary endosymbiosis.According to the Black Queen Theory on the evolution of dependencies within bacterial communities (55), it is advantageous for bacteria to lose costly metabolic functions (i.e., to streamline their genomes), as long as another species within the community still produces these metabolites as "common goods." Applying this concept to a community with two partners would imply that any essential pathway can be lost in only one of them but has to be retained in the other to maintain all essential functions in the system.Hence, in most psyllid dual endosymbioses, Carsonella may have lost the tryptophan pathway since it was encoded by its symbiotic partner.In turn, the co-primary endosymbionts lost all other genes involved in amino acid biosynthesis since these were maintained in Carsonella, thus establishing the existing metabolic complementarities in different psyllids.An exception to this occurs in Diaphorina spp., whose co-primary endosymbiont Profftella is very similar to Psyllophila in its metabolic repertoire except for the tryptophan pathway, which is complete in the co-occurring Carsonella strains (41).It is therefore conceivable that an ancestral co-primary endosymbiont was replaced by Profftella before Carsonella lost the tryptophan pathway in these species.However, since only a few psyllid endosymbionts have been characterized at the genomic level, more studies across the psyllid tree of life will be necessary to obtain a more complete picture of the evolutionary dynamics of psyllids and their primary endosymbionts.
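The two-partner argument above can be sketched as a simple classification: each essential pathway is either encoded by both partners (redundant, so one copy can still be lost), by exactly one partner (no further loss is tolerated), or by neither (lost from the system). The pathway names and partner repertoires in the example are illustrative assumptions and do not represent the annotated gene content of either symbiont.

```python
from typing import Dict, Iterable, Set


def classify_pathways(
    essential: Iterable[str], partner_a: Set[str], partner_b: Set[str]
) -> Dict[str, str]:
    """Label each essential pathway by how many partners still encode it."""
    status = {}
    for pathway in essential:
        in_a = pathway in partner_a
        in_b = pathway in partner_b
        if in_a and in_b:
            status[pathway] = "redundant"
        elif in_a or in_b:
            status[pathway] = "single provider"
        else:
            status[pathway] = "lost"
    return status


def is_precarious(status: Dict[str, str]) -> bool:
    """True if every pathway depends on exactly one partner (no redundancy left)."""
    return all(label == "single provider" for label in status.values())


# Hypothetical repertoires for a two-partner system.
essential_pathways = ["tryptophan", "lysine", "threonine"]
partner_a = {"lysine", "threonine"}  # stands in for the ancient primary symbiont
partner_b = {"tryptophan"}           # stands in for the co-primary symbiont

status = classify_pathways(essential_pathways, partner_a, partner_b)
print(status)
print("precarious:", is_precarious(status))
```

In this toy system every pathway has a single provider, which corresponds to the situation described for Cacopsylla spp., where no functional redundancy remains between the partners.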

Psyllid samples
Genomic data were obtained from four psyllid species: the apple psyllids Cacopsylla melanoneura and C. picta as well as the pear psyllids C. pyri and C. pyricola. All four species are vectors of plant pathogens, namely, "Ca. Phytoplasma mali" and "Ca. Phytoplasma pyri," respectively, causing apple proliferation and pear decline (56,57). Remigrants (i.e., adults that return to their host plants for reproduction after overwintering on shelter plants) [...]
Since psyllids are too small to obtain sufficient DNA for long-read sequencing from a single individual, several specimens need to be pooled, which introduces genetic variation that can hinder genome assembly. We used two different strategies to reduce the genetic variation among the pooled individuals, depending on the host plant of the different species. For C. melanoneura and C. picta collected on apple trees, we applied the same experimental design as in reference (56). In the greenhouse, the field-caught adults were sorted into mating couples, and each couple was caged on a branch of an apple tree (cultivar Golden Delicious) using nylon nets. Once the offspring of the mating couples had reached adulthood, all newly emerged siblings were collected and stored at −20°C. Since the primary endosymbionts are vertically transmitted from mother to offspring, all siblings harbor genetically identical endosymbionts and can therefore be pooled without introducing genetic variation for the endosymbionts. In addition, the Cytochrome Oxidase I (COI) haplotype was determined for two individuals per sibling group according to reference (58), to determine the genetic diversity among the different populations and mating couples. For all psyllids that do not develop on apples (C. melanoneura from hawthorn and the pear psyllids C. pyri and C. pyricola), adults collected in the field were immediately stored at −20°C. Subsequently, the COI haplotype was determined for numerous individuals of each species in order to select individuals with identical COI haplotype for pooling.
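The haplotype-based pooling strategy can be sketched as a grouping step: individuals are binned by COI haplotype, and only haplotypes with enough individuals yield a pool. The function name, the specimen identifiers, and the haplotype labels below are hypothetical, and the pool size is a parameter rather than a value fixed by the protocol.

```python
from collections import defaultdict
from typing import Dict, List


def pool_by_haplotype(haplotypes: Dict[str, str], pool_size: int) -> Dict[str, List[str]]:
    """Group individuals by COI haplotype and return one pool per haplotype.

    Only haplotypes represented by at least `pool_size` individuals are kept;
    each pool contains the first `pool_size` individuals in sorted order.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for individual, haplotype in sorted(haplotypes.items()):
        groups[haplotype].append(individual)
    return {
        haplotype: members[:pool_size]
        for haplotype, members in groups.items()
        if len(members) >= pool_size
    }


# Hypothetical specimens and haplotype calls.
calls = {"f1": "H1", "f2": "H1", "f3": "H2", "f4": "H1", "f5": "H2", "f6": "H3"}
print(pool_by_haplotype(calls, pool_size=2))
```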

DNA extraction
For each sibling group of the apple psyllids C. melanoneura and C. picta selected for long-read metagenome sequencing, DNA was extracted from two pools, each containing four to six whole females. For the field-caught C. melanoneura from hawthorn and C. pyricola from pears, DNA was first extracted from individual females and subjected to COI haplotype determination as outlined above. Subsequently, two pools, each combining the DNA extracts of five females with identical haplotypes, were established for each species. Only females were used since they are larger and hence provide more DNA, and we reasoned that their endosymbiont titers may be higher since the endosymbionts are harbored in two tissues, the bacteriome and the ovaries. DNA extraction was performed using a modified protocol of the PureGene Tissue Kit (Qiagen, Venlo, Netherlands). Whole insects were ground in 100 µL of cell lysis solution and 5 µL of proteinase K solution and incubated at 56°C for 3 h, followed by an incubation with 1.5 µL of RNase A at 37°C for 30 min. Subsequently, proteins were precipitated by adding 35 µL of protein precipitation solution. DNA was then extracted with 1 vol of chloroform/isoamyl alcohol (24:1 vol/vol) and precipitated in 1 vol of isopropanol after overnight incubation at −20°C. The DNA pellet was resuspended in 40 µL of sterile water and incubated at 65°C for 1 h to increase DNA rehydration. For C. pyri, DNA was extracted from a single female using the QIAamp DNA Micro Kit (Qiagen) according to the manufacturer's instructions.

Metagenome sequencing and assembly
Long-read metagenome sequencing using an Oxford Nanopore-Illumina hybrid approach was performed for C. melanoneura, C. picta, and C. pyricola. For each sample, one pool was used for long-read sequencing on the MinION (Oxford Nanopore Technologies, UK), and the second pool was used for 2 × 150 bp paired-end sequencing on an Illumina NovaSeq (Macrogen Europe, Netherlands). About 1.5 µg of DNA was used for library preparation using the Oxford Nanopore Ligation Sequencing Kit SQK-LSK109 (Oxford Nanopore Technologies, UK). Each library was sequenced on an entire R9.4 flowcell for 43–72 h, depending on pore activity. Basecalling was done using Guppy v5.0.11 (Oxford Nanopore Technologies, UK) in high-accuracy mode. Low-quality (<Q7) and short (<500 bp) reads were discarded, and host reads were removed via mapping against a genome scaffold of C. melanoneura (J.M. Howie & O. Rota-Stabelli, unpublished data) using Minimap2 v2.15 (59). The remaining non-host reads ≥500 bp were assembled using Flye v2.9 (60) with the metagenome option. Contigs belonging to the endosymbionts were identified using BLAST (61). Reads were mapped back onto the endosymbiont contigs using Minimap2 v2.15, and all mapped reads were assembled again with Flye v2.9 using the same parameters. This produced two circular genomes for most data sets. These genomes were first polished with Nanopore reads using Medaka v1.5.0 (https://github.com/nanoporetech/medaka) and subsequently with Illumina reads using several iterations of Polca, a genome polisher integrated in the MaSuRCA toolkit v4.0.7 (62), until no more errors were found. Note that the two endosymbiont genomes must be polished together to avoid introducing errors in highly conserved regions (e.g., the ribosomal RNA operon) of the endosymbiont genome with lower coverage. In rare cases, two rounds of Flye assemblies did not produce circular endosymbiont genomes. Two of these genomes (CRmelAO2 and PSmelET) could be finished using alternative assembly approaches: (i) assembly with Canu v2.1.1 (63) and polishing with Medaka and Polca as outlined above, followed by scaffolding and gap-closing with Redundans v0.14 (64), and (ii) joint assembly of the Nanopore and Illumina reads mapping onto the complete endosymbiont genomes using SPAdes v3.15.1 (65). Genome coverage was estimated by mapping the Nanopore reads onto the finished genomes during the polishing step with Medaka.
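The read-filtering step described above (discarding reads below Q7 or shorter than 500 bp) can be illustrated with a minimal Python sketch. The text does not state which tool applied these thresholds, so this is not the authors' implementation. The function names `mean_phred`, `filter_reads`, and `parse_fastq` are hypothetical, and the definition of mean read quality used here (averaging per-base error probabilities before converting back to a Phred score) is an assumption.

```python
import math


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean read quality, averaged over error probabilities rather than Q scores.

    Assumption: Phred+33 encoding, as in standard FASTQ files.
    """
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality]
    return -10 * math.log10(sum(probs) / len(probs))


def filter_reads(records, min_q: float = 7.0, min_len: int = 500):
    """Yield (name, seq, qual) tuples that pass both thresholds.

    The defaults mirror the cutoffs given in the text (Q7, 500 bp).
    The length check runs first, so empty reads never reach mean_phred.
    """
    for name, seq, qual in records:
        if len(seq) >= min_len and mean_phred(qual) >= min_q:
            yield name, seq, qual


def parse_fastq(handle):
    """Minimal parser for strict 4-line FASTQ records (no line wrapping)."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            break
        seq = handle.readline().rstrip()
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip()
        yield header[1:], seq, qual
```

With an open FASTQ file handle `fh`, `filter_reads(parse_fastq(fh))` would yield only the reads that are retained for host-read removal and assembly.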
The metagenome of C. pyri was assembled from 42 million 2 × 250 bp paired-end reads from a single female sequenced on an Illumina NovaSeq (University of Illinois, Urbana-Champaign, IL, USA). The metagenome was assembled using SPAdes v3.15.1 (65) with the meta option and default k-mer sizes. Endosymbiont contigs were identified based on coverage, which initially produced two contigs for Carsonella and three contigs for the Enterobacteriaceae symbiont. These contigs were ordered based on the complete genomes obtained using long-read sequencing and closed after scaffolding and gap-closing with Redundans v0.14 (64). The completeness of all genomes was assessed using BUSCO (gammaproteobacteria_odb10 data set) (66).
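As an illustration of coverage-based contig selection, the sketch below parses the k-mer coverage that SPAdes writes into its contig names (`NODE_<n>_length_<bp>_cov_<coverage>`) and keeps contigs above a cutoff. The text does not report the coverage or length thresholds that were used, so `min_cov` and `min_len` are placeholder assumptions, and the helper names are hypothetical. In practice, candidates selected this way would normally be confirmed by other evidence, such as sequence similarity searches.

```python
import re

# SPAdes contig names follow the pattern NODE_<n>_length_<bp>_cov_<coverage>.
HEADER = re.compile(r"NODE_(\d+)_length_(\d+)_cov_([\d.]+)")


def parse_spades_header(header: str):
    """Return (length, coverage) parsed from a SPAdes contig name."""
    match = HEADER.search(header)
    if match is None:
        raise ValueError(f"not a SPAdes contig name: {header!r}")
    return int(match.group(2)), float(match.group(3))


def candidate_contigs(headers, min_cov: float, min_len: int = 1000):
    """Keep contigs at or above both cutoffs, sorted by descending coverage.

    min_cov and min_len are illustrative values chosen by the user,
    not thresholds taken from the study.
    """
    picked = []
    for header in headers:
        length, cov = parse_spades_header(header)
        if cov >= min_cov and length >= min_len:
            picked.append((header, length, cov))
    return sorted(picked, key=lambda item: item[2], reverse=True)
```

Endosymbiont contigs typically stand out because the symbionts are far more abundant per host cell than the host nuclear genome, which is why a simple coverage cutoff can serve as a first pass.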
Insects preserved in ethanol were rehydrated and then postfixed in 4% paraformaldehyde for 2 h at room temperature. Next, the specimens were dehydrated again by incubation in increasing concentrations of ethanol and acetone, embedded in Technovit 8100 resin (Kulzer, Wehrheim, Germany), and cut into semithin sections (1 µm). The sections were then incubated overnight at room temperature in hybridization buffer containing the specific probes at a final concentration of 100 nM. After hybridization, the slides were washed three times in PBS, dried, covered with ProLong Gold Antifade Reagent (Life Technologies, Carlsbad, California, USA), and observed using a Zeiss LSM 900 Airyscan 2 confocal laser scanning microscope.

FIG 1 Endosymbiont genomes are highly conserved across Cacopsylla host species. (a and c) Circular genome plots of the 12 Carsonella genomes (a) and the 10 Psyllophila genomes (c) produced in this study. The three outermost circles represent forward coding sequences (CDS), reverse CDS, and the ribosomal RNA operon of a reference genome (CRmelAO1 and PSmelAO1, respectively). The inner circles represent the conserved genes in all other genomes of the same taxon (the order is identical to b and d), with the shading indicating the degree of sequence similarity compared to the reference genome. (b and d) Intersection plots showing the number of shared orthogroups across all Carsonella (b) and Psyllophila (d) genomes. The matrix lines are colored according to host species.

FIG 2 The Enterobacteriaceae symbiont represents a new psyllid-associated genus. Maximum-likelihood tree based on the concatenated amino acid sequence alignment of 67 single-copy orthologous genes from 46 genomes, namely, the 10 Enterobacteriaceae endosymbionts of Cacopsylla spp., 33 insect endosymbionts from Gammaproteobacteria, and 3 Pseudomonas entomophila strains as outgroups. The Enterobacteriaceae endosymbionts of Cacopsylla spp. are color-coded based on host species. Branch support is based on 1,000 bootstrap iterations. Blue dots on branches indicate full bootstrap support.

FIG 4 Metabolic complementarity between Carsonella and Psyllophila. (a) COG functional categories for the Carsonella and Psyllophila genomes show different proportions of genes involved in amino acid transport and metabolism (red). (b) Schematic representation of the metabolic complementarity between the two symbionts for the biosynthesis of essential amino acids, vitamins, and carotenoids. Genes present in Carsonella genomes are shown in blue, and genes present in Psyllophila are shown in red.

FIG 5 Repeated gene losses throughout Carsonella evolution. (a) Intersection plot showing the distribution of orthogroups across 23 Carsonella genomes depending on host species. (b) Maximum-likelihood tree based on the concatenated amino acid sequence alignment of 119 single-copy orthologous genes present in all 23 Carsonella genomes. The Carsonella strains of Cacopsylla spp. are color-coded based on host species. Branch support is based on 1,000 bootstrap iterations. Blue dots on branches indicate full bootstrap support. Losses of genes or pathways involved in the biosynthesis of essential amino acids are indicated on the branches (see Table S3 for a detailed list of genes identified based on KEGG pathway annotation). The presence of known co-primary endosymbionts [...]

[...]) of C. melanoneura were captured in various apple orchards in two Italian regions (Aosta Valley and South Tyrol) in March 2020 and March 2021. Additional C. melanoneura specimens were sampled on hawthorn (Crataegus sp.) in a single location (Aosta Valley) in March 2021. Remigrants of C. picta were captured from apple orchards in Trentino (Italy) in April 2021. Adults of C. pyri and C. pyricola were collected in pear orchards in Litenčice (Czech Republic) in December 2019 and in Starý Lískovec (Czech Republic) in July 2020, respectively. Sampling was done using the beating tray method.
10 complete chromosomes of the Enterobacteriaceae endosymbiont ranged from 221,413 bp in C. pyri to 237,114 bp in a strain from C. melanoneura from Aosta Valley (strain PSmelAO1; Table 1). GC content varied from 17.30% to 18.60%. Despite the variations in genome size, synteny and gene

TABLE 1
Properties of the complete endosymbiont genomes obtained