Assembly and comparative genome analysis of four mitochondrial genomes from Saccharum complex species

The Saccharum complex includes the genera Saccharum, Miscanthus, Erianthus, Narenga, and Tripidium. Because the Saccharum complex/Saccharinae constitutes the gene pool used by sugarcane breeders to introduce useful traits into sugarcane, genomic characterization of the Saccharum complex has become particularly important. Here, we assembled graph-based mitochondrial genomes (mitogenomes) of four Saccharinae species (T. arundinaceum, E. rockii, M. sinensis, and N. porphyrocoma) using Illumina and PacBio sequencing data. The total lengths of the mitogenomes of T. arundinaceum, M. sinensis, E. rockii, and N. porphyrocoma were 549,593 bp, 514,248 bp, 481,576 bp, and 513,095 bp, respectively. We then performed a comparative mitogenome analysis of Saccharinae species, covering genome characterization, organelle-transferred sequences, collinear sequences, phylogenetic analysis, and gene duplication and loss. Our results provided the mitogenomes of four species relevant to sugarcane breeding, enriching the mitochondrial genomic resources of the Saccharinae. Additionally, our study offered new insights into the evolution of mitogenomes at the family and genus levels and enhanced our understanding of organelle evolution in the highly polyploid Saccharum genus.


Introduction
In the Andropogoneae tribe, there are seed-based commercial crops such as maize, sorghum, and Coix lacryma-jobi (Wall and Ross, 1970; Ranum et al., 2014; Xi et al., 2016) and leaf-based commercial crops like Chrysopogon zizanioides (Chahal et al., 2015). Additionally, crops like sugarcane, which primarily yield from their stalks, are of significant commercial importance (Bell and Garside, 2005). Sugarcane belongs to the genus Saccharum and is the principal constituent of the Saccharum complex, which also includes other genera such as Miscanthus, Erianthus, Narenga, Sclerostachya, and Tripidium (Vasquez et al., 2022). Some of these genera are clearly not closely related to Saccharum (Li et al., 2022). Indeed, recent phylogenetic studies indicate that Tripidium is over 11 million years divergent from Saccharum, and Eriochrysis is even more divergent (Evans and Joshi, 2020). Miscanthus species are traditionally believed to possess a basic chromosome number of n = x = 19, and genome sequencing studies have shown that Miscanthus sinensis is a paleotetraploid comprising the A and B subgenomes (Mitros et al., 2020). Initially considered the closest diploid relative of sugarcane, Narenga porphyrocoma (2n = 30) has been estimated to have diverged from sugarcane approximately 2.5 million years ago (Evans and Joshi, 2020). Erianthus rockii (2n = 4x = 30) is a drought- and cold-tolerant wild relative of sugarcane from China (Qi et al., 2022).
To broaden the genetic base of modern sugarcane cultivars, sugarcane breeders aim to enrich the sugarcane gene pool through intergeneric crosses with sugarcane relatives, enhancing yield, stress resistance, and disease resistance. Sugarcane breeders have used species from closely related genera to improve sugarcane varieties. Tripidium, for example, has demonstrated considerable cold hardiness and biomass yields (Mahadevaiah et al., 2023); N. porphyrocoma has excellent characteristics such as high tillering ability, drought tolerance, and mosaic disease resistance (Liu et al., 2018). Wild-type clones of Tripidium arundinaceum and Saccharum spontaneum show the potential to provide resistance to smut and high biomass, fiber, and bioenergy (Anna Durai and Karuppaiyan, 2023; Meena et al., 2024). Given that the Saccharum complex/Saccharinae serves as the gene pool for sugarcane breeders attempting to introgress useful traits into sugarcane, determining the phylogenetic relationships among these genera and species through molecular strategies is of considerable significance and relevance.
Mitochondria are essential for cellular energy production and numerous biological processes, including growth, development, and adaptation to environmental stress, potentially affecting agronomic traits (Mahadevaiah et al., 2023). However, the specific impact may vary depending on plant species and environmental conditions. Plant mitochondria are more complex and have a higher proportion of non-coding regions compared to animal mitochondria (Møller et al., 2021), likely because plants must manage various biotic and abiotic stresses to maintain proper physiological functions. Given their role as the cell's energy supply system and their predominantly matrilineal inheritance in most plants (Li et al., 2021), mitochondria are an excellent research focus for studying parental relationships in polyploidization and the evolution of the energy supply system (Vallejo-Marín et al., 2016).
Currently, only the mitogenomes of Saccharum spp. have been published (Li et al., 2024), leaving numerous properties of Saccharinae mitogenomes yet to be discovered. In this research, we de novo assembled the complete mitogenomes of four Saccharinae species (T. arundinaceum, M. sinensis, E. rockii, and N. porphyrocoma) using a combination of second-generation Illumina sequencing and third-generation PacBio sequencing technologies. Mitogenome organization, characteristics, phylogenetic relationships, and comparative genome analyses were then examined. Our study revealed reticulate mitochondrial conformations featuring multiple junctions. By comparing the organellar genomes of four Saccharinae species, we aimed to identify structural, sequence, and evolutionary differences within Saccharinae that contribute significantly to the improvement of Saccharum cultivars.

Plant materials and genome sequencing
Four accessions from the Saccharum complex, including BM87-36 (T. arundinaceum), M022 (M. sinensis), DZM1 (E. rockii), and HBW017 (N. porphyrocoma), were used for mitogenome investigation (Table 1). Leaf samples of the four accessions were collected (September 1, 2023) at the Guangxi Key Laboratory of Sugarcane Genetic Improvement (22°50′N, 108°15′E). The extracted total genomic DNA was used for library construction with 150-bp and 15-kb insert sizes and then sequenced on the Illumina NovaSeq 6000 sequencing platform (Illumina, San Diego, CA, USA) and the PacBio Revio platform for short and long reads, respectively. Finally, the Illumina and PacBio high-quality reads were obtained and processed (Supplementary Table S1).

Mitogenome assembly and annotation
A hybrid assembly strategy was used for mitogenome assembly. First, GetOrganelle (v1.7.6.1) was used to assemble the short reads into a unitig graph with the parameters '-R 30 -k 85,105,115,127 -F embplant_mt', and the contigs that contained the mitochondrial core genes of Andropogoneae were selected. The mitochondrial contigs obtained from the GetOrganelle assembly were then used as bait to extract mitochondrial PacBio HiFi reads with Seqkit (v2.2.0); these reads were assembled with Flye (2.9.1-b1780) using the parameters '--pacbio-hifi --meta -g 500K -t 20'; and the final mitogenome was visualized and manually adjusted in Bandage (v0.8.1) (Wick et al., 2015).
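As a minimal sketch, the two assembly commands can be written out as argument lists. The parameter values are those reported above, with Flye's long options written with two dashes as its command line expects. The input and output names are hypothetical placeholders, and the `-1`, `-2`, and `-o` options are assumed from the tools' standard command-line usage rather than taken from the text. Nothing is executed here.

```python
from typing import List


def getorganelle_cmd(fq1: str, fq2: str, outdir: str) -> List[str]:
    """Build the GetOrganelle short-read command (file names are placeholders)."""
    return [
        "get_organelle_from_reads.py",
        "-1", fq1, "-2", fq2, "-o", outdir,   # assumed I/O options
        "-R", "30",                            # reported: extension rounds
        "-k", "85,105,115,127",                # reported: k-mer sizes
        "-F", "embplant_mt",                   # reported: plant mitochondrion mode
    ]


def flye_cmd(hifi_reads: str, outdir: str) -> List[str]:
    """Build the Flye command for the baited HiFi reads (names are placeholders)."""
    return [
        "flye",
        "--pacbio-hifi", hifi_reads,           # reported read type
        "--meta",                              # reported: uneven-coverage mode
        "-g", "500K",                          # reported genome size estimate
        "-t", "20",                            # reported thread count
        "-o", outdir,                          # assumed output option
    ]


if __name__ == "__main__":
    print(" ".join(getorganelle_cmd("reads_R1.fq.gz", "reads_R2.fq.gz", "getorg_out")))
    print(" ".join(flye_cmd("mito_hifi_reads.fq.gz", "flye_out")))
```

Keeping the commands as lists makes them easy to log alongside the software versions, and they can be passed to `subprocess.run` unchanged if execution is wanted.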

Validation of linkage junction and PCR amplification
Primers spanning the linkages between graphical contigs were designed within a range of 100-500 bp on either side of each junction site using the Primer 3 program (http://bioinfo.ut.ee/primer3-0.4.0/) (Supplementary Table S2). The DNA isolated from young leaf tissue of each species was used to conduct PCR verification. PCR was carried out in a 20-µL reaction mixture containing 10 µL of 10× reaction buffer, 5 pmol of each primer, 1.25 units of Taq DNA polymerase, and 20 ng of DNA template. The PCR was performed in thermocyclers using the following cycling parameters: 94°C (5 min); 30 cycles of 94°C (30 s), 55°C-57°C (30 s), and 72°C (30 s); then 72°C (7 min). PCR products were visualized on agarose gels (2.0%-3.0%) containing Safe gel stain.
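The 100-500 bp design range on either side of a junction can be illustrated with a small helper, offered as a sketch only. The coordinate convention (0-based, half-open intervals on a linearized path through the junction) and the function name are assumptions for illustration and do not describe the Primer 3 settings actually used.

```python
from typing import Tuple

Interval = Tuple[int, int]  # 0-based, half-open [start, end)


def primer_windows(junction: int, seq_len: int,
                   near: int = 100, far: int = 500) -> Tuple[Interval, Interval]:
    """Return the left and right windows in which primers may be placed.

    Each window spans from `near` to `far` bp away from the junction and is
    clipped to the ends of the sequence.
    """
    if not 0 <= junction <= seq_len:
        raise ValueError("junction must lie within the sequence")
    left = (max(0, junction - far), max(0, junction - near))
    right = (min(seq_len, junction + near), min(seq_len, junction + far))
    return left, right


if __name__ == "__main__":
    # Hypothetical junction at position 1,000 of a 5,000-bp path
    print(primer_windows(1000, 5000))
```

With both primers inside these windows, the expected product spans the junction and is between about 200 bp and 1,000 bp long.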

Phylogenetic analysis
To explore the evolutionary relationships of the Saccharum complex more comprehensively, 12 Poaceae mitogenomes (Supplementary Table S3) were downloaded from the NCBI database. A total of 13 shared protein-coding genes (PCGs) among the analyzed species were identified and extracted using PhyloSuite (v1.2.2) (Zhang et al., 2020a). All the PCGs were aligned in batches with MAFFT (v7.313) (Katoh and Standley, 2013) integrated into PhyloSuite, using normal-alignment mode. Maximum likelihood phylogenies were inferred using IQ-TREE under the Edge-unlinked partition model for 50,000 ultrafast bootstraps, and the tree was visualized using iTOL.
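The two bookkeeping steps behind a partitioned analysis, finding the genes shared by all species and concatenating their alignments with partition boundaries, can be sketched as follows. This is an illustration under stated assumptions, not the PhyloSuite implementation: the function names and the toy gene sets and sequences are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def shared_genes(gene_sets: Dict[str, Set[str]]) -> List[str]:
    """Genes present in every species, sorted by name."""
    if not gene_sets:
        return []
    return sorted(set.intersection(*(set(g) for g in gene_sets.values())))


def concatenate(alignments: Dict[str, Dict[str, str]],
                genes: List[str]) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Concatenate per-gene alignments; return the matrix and 1-based partitions."""
    if not genes:
        raise ValueError("no genes to concatenate")
    species = sorted(set.intersection(*(set(alignments[g]) for g in genes)))
    matrix = {sp: "" for sp in species}
    partitions = []
    start = 1
    for gene in genes:
        lengths = {len(alignments[gene][sp]) for sp in species}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: aligned sequences differ in length")
        n = lengths.pop()
        for sp in species:
            matrix[sp] += alignments[gene][sp]
        partitions.append((gene, start, start + n - 1))
        start += n
    return matrix, partitions


if __name__ == "__main__":
    # Toy gene content, not the real annotations
    toy_sets = {"sp1": {"atp1", "cox1", "nad3"}, "sp2": {"atp1", "cox1", "sdh4"}}
    print(shared_genes(toy_sets))
```

The partition list maps directly onto the per-gene blocks that an edge-unlinked partition model treats as separate units.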

Collinear analysis and comparative genome analysis
Six species within or related to the Saccharum complex were selected for comparative mitogenome analysis and collinearity analysis: S. bicolor (NC013816.1), T. arundinaceum (this study), M. sinensis (this study), E. rockii (this study), N. porphyrocoma (this study), and S. spontaneum (Li et al., 2024). To investigate the similarity of mitogenome sequences within the Saccharum complex and closely related species, homologous sequences between the four relatives were detected using Blastn (2.5.0+) (parameters: e-value 1e-10). Homologous sequences shorter than 0.5 kb were not retained. A multiple synteny plot of those four mitogenomes was generated using TBtools.
A dot plot of pairwise comparisons of conserved collinear blocks was generated using MUMmer (Kurtz et al., 2004). Based on sequence similarity, a multiple synteny plot of the five mitogenomes from this and previous studies with closely related species was plotted using MCScanX in TBtools.
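The filtering of homologous sequences described above can be sketched as a small parser. It assumes that the Blastn results were written in the standard 12-column tabular format (`-outfmt 6`), which the text does not state; the thresholds are the reported ones (e-value 1e-10, minimum length 0.5 kb), and the names `Hit` and `filter_hits` are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float
    length: int
    evalue: float


def filter_hits(lines: Iterable[str], min_len: int = 500,
                max_evalue: float = 1e-10) -> List[Hit]:
    """Keep tabular BLAST hits that are long enough and significant enough."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comment lines
        f = line.split("\t")
        # outfmt 6 columns: qseqid sseqid pident length ... evalue bitscore
        hit = Hit(f[0], f[1], float(f[2]), int(f[3]), float(f[10]))
        if hit.length >= min_len and hit.evalue <= max_evalue:
            kept.append(hit)
    return kept


if __name__ == "__main__":
    # Hypothetical records for illustration only
    rows = [
        "Np\tSs\t99.1\t12040\t30\t2\t1\t12040\t500\t12539\t0.0\t21000",
        "Np\tSs\t95.0\t320\t10\t1\t1\t320\t900\t1219\t1e-50\t400",
    ]
    for hit in filter_hits(rows):
        print(hit.query, hit.subject, hit.length)
```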

Mitogenome assembly, annotation, and gene features
Accurate mitogenomes were obtained by combining Illumina and PacBio HiFi reads. Consistent depths of mapped reads supported a high-quality, gap-free assembly (Supplementary Table S4). First, the mitogenomes of the four Saccharum complex species were assembled into initial graph-based structures. From PacBio data, 13 contigs of T. arundinaceum, six contigs of M. sinensis, three contigs of N. porphyrocoma, and six contigs of E. rockii were obtained (Figure 1). Primer design and PCR validation were performed based on the junctions of the graphical assembly results (Supplementary Figures S1A-D). The junctions and product lengths were as expected, supporting the variable structure of the assembly results and the existence of other mitochondrial conformations. Subsequently, by examining the connections between contigs and mapping with third-generation long sequences, a relatively simplified primary conformation was obtained (Figure 2; Table 2), with a total length of 549,593 bp, 514,248 bp, 481,576 bp, and 513,095 bp in T. arundinaceum, M. sinensis, E. rockii, and N. porphyrocoma, respectively. A total of 51 unique genes were identified from the assembled mitogenomes, comprising 32 PCGs, 16 transfer RNA (tRNA) genes, and three ribosomal RNA (rRNA) genes (Table 3). Notably, the gene copy numbers varied among the four species: the atp1 gene was duplicated in T. arundinaceum, E. rockii, and N. porphyrocoma, while the atp8 gene was duplicated in T. arundinaceum and N. porphyrocoma. Additionally, cox1 was duplicated in T. arundinaceum, and nad3 was duplicated in E. rockii. Eight genes contained one to four introns: ccmFc (1), cox2 (1), nad1 (4), nad2 (4), nad4 (3), nad5 (4), nad7 (4), and rps3 (1). Notably, in all four species, the exons of nad1 and nad5 were located on different chromosomes, requiring trans-splicing to generate complete transcripts.

Chloroplast-derived sequence analysis
During the evolution of mitochondria, chloroplast fragments were transferred to the mitogenome (Wang et al., 2007). Approximately 5%-10% of the sequences in the mitogenome that can be identified as homologs are derived from the chloroplast genome (Rodríguez-Moreno et al., 2011). In this study, the chloroplast genomes of four species were reassembled and annotated based on PacBio HiFi reads, and then the transfer sequences between mitochondrial and chloroplast genomes were analyzed.

Phylogenetic analysis
Mitochondrial genes are a valuable source of information for phylogenetic analyses at large-scale taxonomic levels due to their

[...] complex", was monophyletic as sister to Saccharum. Saccharum spp. were clustered into the same clade, and N. porphyrocoma was the most closely related to the Saccharum genus, followed by E. rockii. T. arundinaceum and M. sinensis were not closely related to Saccharum. The phylogenetic relationships of species in the core "Saccharum complex" were consistent with previous chloroplast-based phylogenies (Li et al., 2022).

Collinearity analysis
Sequence transfer between species was also explored to understand which sequences were retained in the mitochondrial genome during evolution and how these sequences were recombined between species. There were collinear segments with a total length of 335,303 bp, 392,247 bp, 360,963 bp, 420,218 bp, and 433,309 bp for S. bicolor-T. arundinaceum (350 fragments), T. arundinaceum-M. sinensis (369 fragments), M. sinensis-E. rockii (334 fragments), E. rockii-N. porphyrocoma (260 fragments), and N. porphyrocoma-S. spontaneum (277 fragments), respectively (Figure 5). Collinear sequences over 10 kb in length were retained in each species, and short collinear sequences were often lost after species divergence. In the graph assembly results, short collinear sequences were often nodes connecting long and repetitive contigs, which may have played a role in the conformation of the species but were not retained during evolution.
Based on the phylogenetic relationships, we performed dot-plot analyses of pairs of phylogenetically close species. The results showed numerous collinear blocks in T. arundinaceum-M. sinensis and M. sinensis-E. rockii, but all were short and fragmented. In contrast, long collinear blocks were found in both E. rockii-N. porphyrocoma and N. porphyrocoma-S. spontaneum. The relatively short divergence time within each of these two pairs may not have led to a large-scale reorganization of the mitogenome, so large collinear blocks are still retained (Figure 6). At the subtribe level, the mitogenomes had undergone extensive genomic rearrangements even between closely related species, and mitogenome structure was poorly conserved.
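As a simple worked check on these totals, the mean collinear fragment length for each adjacent pair can be computed from the reported total length and fragment count. The pair labels use the species abbreviations from the synteny figure; the variable and function names are illustrative.

```python
# (total collinear length in bp, number of fragments), as reported in the text
collinear = {
    "Sb-Ta": (335_303, 350),
    "Ta-Ms": (392_247, 369),
    "Ms-Er": (360_963, 334),
    "Er-Np": (420_218, 260),
    "Np-Ss": (433_309, 277),
}


def mean_fragment_length(total_bp: int, n_fragments: int) -> float:
    """Average length of a collinear fragment in bp."""
    return total_bp / n_fragments


means = {pair: mean_fragment_length(*values) for pair, values in collinear.items()}

if __name__ == "__main__":
    for pair, mean_bp in means.items():
        print(f"{pair}: {mean_bp:.0f} bp per fragment")
```

On these figures the E. rockii-N. porphyrocoma and N. porphyrocoma-S. spontaneum pairs average roughly 1.6 kb per fragment, against about 1.0-1.1 kb for the other three pairs, which is in line with the longer collinear blocks reported for the two pairs closest to Saccharum.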

Gene duplication and loss, and characteristic difference of mitogenomes
Gene duplication and loss often occur during the evolution of plants, and the genes that are retained by duplication are crucial for normal life activities. Therefore, gene duplication and loss in the Andropogoneae species were compared here (Figure 7). For the mitochondrial PCGs, the ATP synthase genes and the cytochrome c synthesis gene were strongly conserved and had not been lost. However, the succinate dehydrogenase genes (sdh3 and sdh4) were lost in Poaceae, indicating high volatility and weak conservation during evolution. Genes within the ATP and COX families may exist in multiple copies in the Andropogoneae mitogenomes. For instance, Z. mays (OP832500) and S. bicolor (NC008360) had two copies of atp1; Coix lacryma (MT471100) harbored two copies of atp4 and atp9; C. zizanioides (MN635785) exhibited a duplication of atp6. Furthermore, cox1 was found in triplicate in C. zizanioides (MN635785) and duplicated in C. lacryma (MT471100). The mttB gene was not found in some accessions of Zea, which may reflect incomplete annotation.
In Andropogoneae, most published mitogenomes were from the genus Zea, followed by the genera Coix, Saccharum, and Sorghum. The length of mitogenomes ranged from 449,028 bp (S. bicolor subsp. drummondii, MZ506736.1) to 739,719 bp (Z. mays subsp. mays genotype CMS-C, DQ645536.1), with an average length of approximately 500 kb (Table 5). The total lengths in T. arundinaceum, M. sinensis, E. rockii, and N. porphyrocoma were 549,593 bp, 514,248 bp, 481,576 bp, and 513,095 bp, respectively, which were close to the published mitochondrial genome lengths of Andropogoneae. The mitochondrial genome lengths of domesticated species in Andropogoneae, such as domesticated maize in the genus Zea, differ significantly from those of its wild relatives (Zea perennis and Zea luxurians). Among different accessions of C. lacryma-jobi, the length of the genome varied considerably, from 598,321 bp to 673,349 bp. Variations in mitochondrial genome sizes mainly result from the transfer of sequences from the nucleus and chloroplasts, as well as the expansion of repetitive sequences within the mitogenomes. We examined mitochondrion-to-mitochondrion sequence (MTMT) events and the proportion of the genome they occupied for species in Andropogoneae. Z. mays subsp. mays genotype CMS-C had not only the largest genome but also the highest percentage of MTMTs (71.23%); meanwhile, the lowest percentage of MTMT sequences (0.80%) was found in chromosome 2 of Saccharum species.
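One plausible way to express the proportion of a genome occupied by a class of sequences, such as MTMTs, is to merge overlapping intervals and divide the covered length by the genome length. The text does not state how the percentages were actually calculated, so the sketch below is an assumption; the interval convention (0-based, half-open) and the function name are hypothetical.

```python
from typing import Iterable, Tuple


def percent_covered(intervals: Iterable[Tuple[int, int]], genome_len: int) -> float:
    """Percentage of the genome covered by intervals, counting overlaps once."""
    if genome_len <= 0:
        raise ValueError("genome_len must be positive")
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Close the previous merged block and open a new one
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current block
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return 100.0 * covered / genome_len


if __name__ == "__main__":
    # Hypothetical intervals on a 1,000-bp toy genome
    print(percent_covered([(0, 100), (50, 150), (300, 400)], 1000))
```

Merging first matters because repeated or nested hits would otherwise be counted more than once and inflate the percentage.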

Mitogenome characterization in Saccharinae
Advances in sequencing and assembly strategies have made it possible for researchers to recover polymorphic conformations of plant mitochondria (Wang et al., 2024). In this study, we first assembled the mitogenomes of four species in Saccharinae to provide a more comprehensive description of the mitogenome in Saccharinae, including basic mitochondrial features, structure, and homologous sequences. In Saccharinae, the mitogenome length averaged approximately 500 kb. Our study revealed that T. arundinaceum had the longest mitogenome (549,593 bp). While the unique PCGs in Saccharinae mitogenomes were similar across the four species and Saccharum, variations in gene copy numbers were observed among them. Duplication of cox1, atp1, and atp8 genes was found in T. arundinaceum; atp1 and nad3 genes in E. rockii; and atp1 and atp8 in N. porphyrocoma, but these genes were single-copy in M. sinensis. Within Andropogoneae, certain PCGs, such as those associated with ATP synthesis (atp1, atp4, and atp8) and the COX subunit complex (cox1), exhibited varying copy numbers from one to three. C4 plants in Andropogoneae, such as sorghum, maize, and sugarcane, are leading players in global agriculture and can survive in hot and dry conditions. Mitochondria play a crucial role in the internal regulation of organisms under harmful environmental conditions such as hypoxia and high temperature (Xiong et al., 2022). Through our analysis of gene duplications and losses in published Andropogoneae mitogenomes, we observed that most plants harbored multiple copies of ATP synthase and COX subunits, crucial for plant respiration. These findings offer insights into how C4 plants maintain cellular physiological functions in response to high temperatures.
The polymorphic structure of plant mitochondrial genomes often results in an uneven distribution of functional genes on each molecule, with many individual molecules lacking functional genes, and others having multiple copies of functional genes due to their presence in repetitive regions. For example, in N. porphyrocoma and Saccharum species, cox1, atp8, rrn5, rrn18, and rrn26 all have two copies; nad1 and nad5 are usually present on different single molecules in the genomes of the Saccharinae, and their exons are assembled into functional transcripts by trans-splicing. Single molecules lacking functional genes are often shown in graphical results as nodal molecules connecting long contigs, and they may have an important role in shaping the conformation of mitogenomes. We compared mitogenome synteny among six species, sorghum, T. arundinaceum, M. sinensis, E. rockii, N. porphyrocoma, and S. spontaneum, to identify transfer sequences among them. We found that most transfer sequences were long contigs and that short contigs (i.e., single molecules lacking functional genes) were rarely transferred between species, suggesting that they play a role within individual species and that their homology was disrupted by recombination after species formation.

Phylogenetic tree of 20 species in BOP clade and PACMAD clade. The number at each node was the bootstrap probability.

Mitogenome evolution in Saccharinae
Most land plant mitogenomes analyzed to date can be represented as a single circular chromosome (called the major cyclic chromosome or master circle) and a set of secondary chromosomes (called minor cyclic chromosomes), which arise by active recombination of large direct repeats (Wang et al., 2024). Few comparative analyses of plant mitochondrial genes have so far revealed the relationship between species evolution and mitochondrial conformation. The simplest structures of the mitochondrial genomes of Saccharum robustum and S. officinarum in the genus Saccharum can be described as two-ring structures (300 kb and 144 kb), and the structure of S. spontaneum as a 380-kb linear structure and a 100-kb ring structure (Li et al., 2024). The diploid N. porphyrocoma, closely related to Saccharum, exhibited a single cyclic structure. Its secondary conformation can generate substructures of 225 kb and 288 kb. In T. arundinaceum, there was no connection between chromosome 1 (425 kb) and chromosome 2 (124 kb), similar to that in S. robustum and S. officinarum. However, the linear structures of M. sinensis and E. rockii can form a closed structure in the graph-based assembly. During speciation, the conformation of plant mitochondria may have undergone fusion and division to accommodate species formation.
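The way recombination between a pair of direct repeats splits a master circle into two subcircles can be illustrated with simple arithmetic. In the sketch below, the master-circle length is the reported N. porphyrocoma total (513,095 bp), while the two repeat positions are hypothetical values chosen only so that the result resembles the reported 225-kb and 288-kb substructures; the actual repeat coordinates are not given in the text.

```python
from typing import Tuple


def subcircle_sizes(master_len: int, repeat_a: int, repeat_b: int) -> Tuple[int, int]:
    """Sizes of the two subcircles formed by recombination between direct repeats.

    `repeat_a` and `repeat_b` are the start positions of the two repeat copies
    on the master circle. Each subcircle keeps one copy of the repeat.
    """
    for pos in (repeat_a, repeat_b):
        if not 0 <= pos < master_len:
            raise ValueError("repeat positions must lie on the master circle")
    if repeat_a == repeat_b:
        raise ValueError("two distinct repeat copies are required")
    first = abs(repeat_b - repeat_a)
    return first, master_len - first


if __name__ == "__main__":
    # Reported master-circle length; hypothetical repeat positions
    print(subcircle_sizes(513_095, 10_000, 235_000))
```

The two sizes always sum to the master-circle length, which is why the 225-kb and 288-kb substructures together account for the roughly 513-kb N. porphyrocoma mitogenome.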
Many studies have shown that mtDNA evolves faster in structure but slower in sequence change compared to cpDNA or nuclear DNA, making mtDNA a good tracer for genome evolution (Hu et al., 2023). Changes in gene copy number are common in plant mitochondria, and whole-genome duplication (WGD) events occurring in plant nuclear genomes can affect the relative copy numbers of nuclear, mitochondrial, and chloroplast genomes (Zwonitzer et al., 2024). Increases in organelle genome copy number represent a common response to polyploidization, suggesting that maintenance of nuclear stoichiometry is an important aspect of establishing polyploid lineages.
In the evolution of species, understanding how nucleoplasmic equilibrium evolves after polyploidization requires studying the exchange between mitochondrial and nuclear genomes of their diploid ancestors (Korpelainen, 2004). The ploidy levels of species within the Saccharinae vary significantly, with most of the close relatives of sugarcane being diploid (N. porphyrocoma, M. sinensis, and E. rockii in this study are diploid, while T. arundinaceum is tetraploid), some wild sugarcane species ranging from tetraploid to octoploid (Zhang et al., 2022), and cultivated sugarcane species being over octoploid (Zhang et al., 2019). Chromosome-level genome assemblies have been recently published for the genera Erianthus (Erianthus rufipilus) (Wang et al., 2023) and Saccharum (Healey et al., 2024). The structure and copy number of the mitogenome are closely associated with the balance of genetic material within the cell during polyploidization (Zwonitzer et al., 2024), but the evolutionary mechanisms of the mitochondrial genome following nuclear genome polyploidization remain to be studied. The mitogenomes of the four Saccharinae species assembled and analyzed in this study lay the groundwork for exploring nucleoplasmic interactions between the mitogenome and nuclear genome during WGD events in polyploidization within the Saccharinae.

Conclusions
In this study, we assembled graph-based mitochondrial genomes of four Saccharinae species (T. arundinaceum, E. rockii, M. sinensis, and N. porphyrocoma) closely related to sugarcane and provided a more complete characterization of the mitochondrial genomes instead of the "master circular" structure. Second, we performed a comparative mitogenome analysis of Saccharinae [...]

Gene duplication and loss in Andropogoneae species.

Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material. The data presented in the study are deposited in an online repository under the accession numbers Narenga porphyrocoma (PP551632.1), Tripidium arundinaceum (PP664562.1, PP664563.1), Miscanthus sinensis [...]

FIGURE 1 Branched conformation of four Saccharinae species mitogenomes. (A) Tripidium arundinaceum, (B) Miscanthus sinensis, (C) Narenga porphyrocoma, and (D) Erianthus rockii.

FIGURE 3 Gene map of the chloroplast genome of four Saccharinae species.

FIGURE 5 Mitogenome synteny. Bars indicate the mitogenomes, and the lines show the homologous sequences between the adjacent species. The pink line represents the sequence of inheritance between the six species. Sb, Sorghum bicolor; Ta, Tripidium arundinaceum; Ms, Miscanthus sinensis; Er, Erianthus rockii; Np, Narenga porphyrocoma; Ss, Saccharum spontaneum.

FIGURE 6 Dot-plot analysis. The blue and red lines represent reverse and forward sequences, respectively. Dot plot of Tripidium arundinaceum with Miscanthus sinensis is shown in the upper left corner, and the rest are dot plots of M. sinensis with Erianthus rockii, E. rockii with Narenga porphyrocoma, and N. porphyrocoma with Saccharum spontaneum.

TABLE 5
Genome length and mitogenome transfer sequence of Andropogoneae species.