Complete Genome Sequence of a New Chickpea Chlorotic Dwarf Virus Strain Isolated from Tomato in Kenya, Obtained from Illumina Sequencing

High-throughput sequence analysis revealed the complete genome sequence of a novel, hitherto uncharacterized strain of Chickpea chlorotic dwarf virus (CpCDV) from tomato plants in Kenya. The sequence shared its highest nucleotide similarity (88.7%) with two CpCDV isolates from Burkina Faso.

Tomato (Solanum lycopersicum) is one of the world's most important vegetable crops. However, its production is constrained by viral diseases that cause yield losses (1). Many such viruses belong to the family Geminiviridae, which includes the genus Mastrevirus, and have single-stranded DNA genomes (2). Seven Mastrevirus species infect dicotyledonous plants, including Chickpea chlorotic dwarf virus (CpCDV) (3)(4)(5)(6)(7)(8)(9). Nineteen CpCDV strains (strains A to S) have been described (9)(10)(11), but the presence of the virus in East Africa has not yet been reported. In this study, we identified a novel CpCDV strain infecting tomato in Kenya through metagenomic sequencing and phylogenetic analysis.
Tom54 yielded 320,207 reads (Q30, 97%), which were trimmed to 314,556 reads with an average length of 174 bp. De novo genome assembly yielded two contigs; one was 2,469 nucleotides long, with a G+C content of 51.73%, and the other was <138 bp long. A BLASTN-based search revealed both contigs to represent CpCDV, with the larger contig sharing the highest nucleotide similarity (88.7%) with the complete circular genome sequences of two CpCDV strains infecting tomato in Burkina Faso (GenBank accession numbers KY047532 and KY047533) (11). Four ORFs, typical of Mastrevirus sp. genomes, were identified, i.e., V1, V2, C1, and C2. The V1 coat protein ORF encoded 283 amino acids, whereas V2 encoded the putative movement protein of 100 amino acids. C1 and C2 encoded the replication-associated proteins A and B via transcript splicing, with 302 and 143 amino acids, respectively. Phylogenetic analyses revealed a divergence of this isolate into a distinct and previously unreported clade, sharing a common ancestor with CpCDV strains M, R, and S (Fig. 1). Based on these properties, the larger contig was designated a complete, novel CpCDV genome and was deposited in GenBank under accession number MN178605.
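As an illustration of how the summary statistics above relate to the underlying sequence, the following minimal Python sketch computes G+C content and the expected nucleotide length of an unspliced ORF from its amino acid count. The function names `gc_content` and `orf_length_nt` are hypothetical and are not part of the analysis pipeline used in this study; the sketch assumes a plain nucleotide string and the standard three-nucleotide codon with one stop codon.

```python
def gc_content(seq: str) -> float:
    """Return G+C content as a percentage of unambiguous bases (A, C, G, T).

    Ambiguity codes such as N are ignored (an assumption of this sketch).
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


def orf_length_nt(n_amino_acids: int, include_stop: bool = True) -> int:
    """Return the nucleotide length of an unspliced ORF.

    Each amino acid corresponds to one codon (3 nt); the stop codon adds 3 nt.
    """
    if n_amino_acids < 0:
        raise ValueError("amino acid count must be non-negative")
    codons = n_amino_acids + (1 if include_stop else 0)
    return 3 * codons


if __name__ == "__main__":
    # Toy sequence, not the deposited genome.
    print(gc_content("GGCCAATT"))
    # V1 (283 aa) and V2 (100 aa) as reported above, stop codon included.
    print(orf_length_nt(283), orf_length_nt(100))
```

Under these assumptions, the 283-amino-acid V1 product would correspond to an 852-nucleotide ORF and the 100-amino-acid V2 product to a 303-nucleotide ORF. The spliced C1/C2 products are not covered by this simple calculation.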
According to the species demarcation criteria for Mastrevirus spp. (18), our sequence qualifies as a distinct CpCDV strain; therefore, we propose the name CpCDV-T. Although Kenya shares a border with Sudan, where several CpCDV isolates have been reported (8), the closest matches to our isolate are isolates from West Africa, which suggests a possible introduction via trade. To our knowledge, this is the first report of CpCDV in Kenya.
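The demarcation logic can be sketched as a simple threshold check. The cutoffs below (78% genome-wide pairwise identity for species and 94% for strains) are the values commonly cited for mastreviruses and are assumptions of this sketch; they are not stated in the text above. The function name `classify_mastrevirus` is hypothetical, and the BLASTN similarity reported here is used only as a stand-in for a formally computed pairwise identity.

```python
# Assumed demarcation thresholds (percent genome-wide pairwise identity).
SPECIES_THRESHOLD = 78.0  # assumption: below this, a new species
STRAIN_THRESHOLD = 94.0   # assumption: below this, a new strain


def classify_mastrevirus(best_identity_pct: float) -> str:
    """Classify an isolate from its highest identity to any known isolate."""
    if not 0.0 <= best_identity_pct <= 100.0:
        raise ValueError("identity must be a percentage between 0 and 100")
    if best_identity_pct < SPECIES_THRESHOLD:
        return "new species"
    if best_identity_pct < STRAIN_THRESHOLD:
        return "new strain"
    return "variant of an existing strain"


if __name__ == "__main__":
    # 88.7% is the highest similarity reported for the Kenyan isolate.
    print(classify_mastrevirus(88.7))
```

With these assumed thresholds, a best match of 88.7% falls between the species and strain cutoffs, consistent with the proposal of a new strain within CpCDV.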
Data availability. The sequence described here was deposited in GenBank under accession number MN178605. Raw data were deposited under SRA accession number PRJNA556271 with SRA identification number SRR9737059, while the Tom54 sample was deposited under BioSample accession number SAMN12346850.

ACKNOWLEDGMENTS
We  Council GCRF grant number BB/P023223/1. This project was supported by the BecA-ILRI Hub through the ABCF program. The ABCF program is funded by the Australian Department for Foreign Affairs and Trade through the BecA-CSIRO partnership, the Syngenta Foundation for Sustainable Agriculture, the Bill and Melinda Gates Foundation, the UK Department for International Development, and the Swedish International Development Cooperation Agency.