Many species belonging to the genus Colletotrichum are causal agents of plant diseases, generally referred to as anthracnose, in a wide range of hosts worldwide. Colletotrichum spp. affect numerous economically important crops on a global scale. The genus comprises approximately 257 distinct species, which are organized into at least 15 major phylogenetic lineages known as species complexes (Talhinhas and Baroncelli 2021). Virtually every crop grown in the world is susceptible to one or more species of Colletotrichum (Baroncelli et al. 2014). Among these lineages, the Colletotrichum acutatum species complex stands out as a diverse group of closely related plant pathogenic fungi (Baroncelli et al. 2017). Members of the C. acutatum species complex have a wide host range in both domesticated and wild plant species, and their capability to infect insects has also been described (Damm et al. 2012; Marcelino et al. 2008). In this species complex, Colletotrichum limetticola (formerly known as Gloeosporium limetticola; Clausen 1912) was initially described in 2012 as a species predominantly associated with wither tip symptoms on sour lime (Citrus aurantiifolia) in Cuba and the USA during the 1910s (Damm et al. 2012). Later descriptions associated the disease with strains of C. gloeosporioides (Brown et al. 1996) or C. acutatum (Peres et al. 2008). Recent findings in Brazil have revealed C. limetticola causing Glomerella leaf spot on apples, where it occurs at low prevalence but displays high virulence (Moreira et al. 2019). To the best of our knowledge, no further occurrences of C. limetticola have been documented, despite the presence of other known Colletotrichum species that infect citrus and apples (Talhinhas and Baroncelli 2021). This raises concerns regarding the conservation status of C. limetticola, considering the scarcity of records on its original hosts and the occurrence of cross-infections.

In the present study, Colletotrichum limetticola strain KLA-Anderson was isolated from leaf tissue of Citrus × aurantiifolia, commonly known as the Key lime or Mexican lime, in the Lake Alfred region (Florida, USA). The C. limetticola genome was sequenced on the Illumina NovaSeq 6000 system with 150 bp paired-end reads. Read quality was assessed with FastQC (Babraham Bioinformatics). Adapter sequences and low-quality reads were trimmed with TrimGalore! v0.6.10 (Krueger et al. 2021). Paired-end reads were merged with FLASH v1.2.11 (Magoc and Salzberg 2011). Merged and unmerged reads were then assembled using SPAdes v3.15.1 (Bankevich et al. 2012). Scaffolds with low coverage were removed as possible contaminants. Scaffolds corresponding to the mitochondrial DNA (mtDNA) and ribosomal DNA (rDNA) were identified by BLASTN v2.9.0 (Camacho et al. 2009) using query sequences from Colletotrichum lupini (Baroncelli et al. 2021), the closest relative of C. limetticola with a complete genome. The completeness of the assembly was assessed using BUSCO v5.4.7 (Simão et al. 2015), while assembly statistics were evaluated with QUAST v5.2.0 (Gurevich et al. 2013). The total size of the nuclear genome assembly was 50.48 Mb, with an N50 contig length of 68638 bp and an L50 of 229. The nuclear genome assembly comprised 1750 contigs with an average coverage of 90X and was assessed to be 97.7% complete (Table 1). A total of 15248 protein-coding genes were predicted with the MAKER v3.01.02 pipeline (Holt and Yandell 2011) using both self-trained GeneMark-ES v4.10 (Borodovsky and Lomsadze 2011) and AUGUSTUS v3.3 with the “Colletotrichum” model (Becerra et al. 2023). SignalP v5.0 (Almagro Armenteros et al. 2019) predicted that 1981 C. limetticola proteins are secreted, and 624 of these were predicted by EffectorP v3.0 (Sperschneider and Dodds 2022) to be candidate effectors.
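As a rough illustration of the low-coverage scaffold filter mentioned above, the following Python sketch parses the coverage value that SPAdes writes into its FASTA headers (of the form `NODE_<id>_length_<bp>_cov_<value>`) and keeps only scaffolds above a cutoff. The cutoff of 10 and the helper names are assumptions made for this example; the threshold actually used in this study is not stated, and the value in SPAdes headers is a k-mer coverage rather than a read depth.

```python
import re

# SPAdes names scaffolds like "NODE_1_length_250000_cov_91.5".
HEADER_RE = re.compile(r"^NODE_\d+_length_(\d+)_cov_([\d.]+)")


def parse_spades_header(header):
    """Return (length_bp, kmer_coverage) parsed from a SPAdes FASTA header."""
    match = HEADER_RE.match(header.lstrip(">"))
    if match is None:
        raise ValueError(f"not a SPAdes-style header: {header!r}")
    return int(match.group(1)), float(match.group(2))


def keep_scaffold(header, min_cov):
    """True if the scaffold's header coverage is at least min_cov."""
    _, coverage = parse_spades_header(header)
    return coverage >= min_cov


# Toy headers; min_cov=10.0 is a hypothetical cutoff, not the study's value.
headers = [
    ">NODE_1_length_250000_cov_91.5",
    ">NODE_2_length_1200_cov_2.1",
]
kept = [h for h in headers if keep_scaffold(h, min_cov=10.0)]
print(kept)
```

In practice, a cutoff of this kind would normally be chosen after inspecting the coverage distribution of the assembly rather than fixed in advance.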
A comparative analysis of the newly sequenced genome with those publicly available (Baroncelli et al. 2016, 2021, 2022; Goulin et al. 2023) revealed similar genomic features and gene content within closely related species (Fig. 1).
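For readers less familiar with the contiguity metrics reported for the assembly, the sketch below shows how N50 and L50 are conventionally derived from a list of contig lengths: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the assembly size, while L50 is the number of contigs needed to get there. The function name and the toy lengths are illustrative assumptions; the values in this study were produced by QUAST, not by this code.

```python
def n50_l50(lengths):
    """Return (N50, L50) for an iterable of contig lengths in bp."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no contig lengths supplied")
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            # 'length' is the N50; 'count' contigs were needed, which is the L50.
            return length, count


# Toy assembly of five contigs totalling 300 bp (illustrative values only).
toy_lengths = [100, 80, 60, 40, 20]
n50, l50 = n50_l50(toy_lengths)
print(n50, l50)  # prints "80 2"
```

Because N50 is a contig length, it can never exceed the total assembly size, which is a quick sanity check when reading assembly summaries.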

Table 1 Summary statistics of the Colletotrichum limetticola KLA-Anderson strain genome
Fig. 1

Comparative analysis between the newly sequenced genome of C. limetticola and a selection of publicly available genomes of closely related species. The C. limetticola genome is highlighted in bold. On the left side, a multilocus sequence typing (MLST) tree based on the concatenation of partial sequences of the following loci: actin [ACT], beta-tubulin 2 [TUB2], calmodulin [CAL], glyceraldehyde-3-phosphate dehydrogenase [GAPDH], chitin synthase [CHS-1], glutamine synthetase [GS], histone-3 [HIS3], superoxide dismutase 2 [SOD2], mating type 1–2 [MAT1-2] and the Apn2-Mat1-2 intergenic spacer [ApnMat]. Numbers next to the nodes represent, in order, Bayesian posterior probability, FastTree and RAxML bootstrap support values. Bubble plots report assembly fragmentation, genome size, N50 and L50, GC content and completeness. Bubble sizes have been scaled to each panel and are not comparable across panels. Horizontal histograms report secreted and non-secreted protein-coding gene content.

In this study, we present a draft genome sequence of C. limetticola obtained using Illumina sequencing technology, providing a new resource that serves as a useful platform for further research in the comparative genomics of fungi. Further analysis of this genome alongside those of related species will enhance our understanding of the molecular mechanisms underlying the pathogenicity and virulence of Colletotrichum species, facilitating the exploration of targeted and environmentally friendly strategies for their control.