Genetic Diversity of Bemisia tabaci (Gennadius) (Hemiptera: Aleyrodidae) Colonizing Sweet Potato and Cassava in South Sudan

Bemisia tabaci (Gennadius) is a polyphagous, highly destructive pest that is capable of vectoring viruses in most agricultural crops. Currently, information regarding the distribution and genetic diversity of B. tabaci in South Sudan is not available. The objective of this study was to investigate the genetic variability of B. tabaci infesting sweet potato and cassava in South Sudan. Field surveys were conducted in August 2017 and in July and August 2018 in 10 locations in Juba County, Central Equatoria State, South Sudan. The sequences of mitochondrial DNA cytochrome oxidase I (mtCOI) were used to determine the phylogenetic relationships between sampled B. tabaci. Six distinct genetic groups of B. tabaci were identified, including three non-cassava haplotypes (Mediterranean (MED), Indian Ocean (IO), and Uganda) and three cassava haplotypes (Sub-Saharan Africa 1 sub-group 1 (SSA1-SG1), SSA1-SG3, and SSA2). MED predominated on sweet potato and SSA2 on cassava in all of the sampled locations. The Uganda haplotype was also widespread, occurring in five of the sampled locations. This study provides important information on the diversity of B. tabaci species in South Sudan. A comprehensive assessment of the genetic diversity, geographical distribution, population dynamics, and host range of B. tabaci species in South Sudan is vital for its effective management.


Introduction
Cassava (Manihot esculenta) and sweet potato (Ipomoea batatas (L.) Lam.) are key staple root crops that assure food security in sub-Saharan Africa. This is due to their high calorie content, low production inputs, adaptation to different soil types, and resilience to climatic change as compared to other major staple food crops [1–4]. The total production of cassava in Africa amounts to 177.8 million tonnes, while that of sweet potato is 27.7 million tonnes [5]. In South Sudan, cassava is the major food security crop after maize or sorghum in the Greenbelt and the Ironstone Plateau zones, which include Western, Central, and Eastern Equatoria (Greenbelt zone), Western Bahr el Ghazal State (Ironstone Plateau zone), and Lakes State [6]. In 2015, the estimated production area for cassava was 75,910 ha and the total production was 1.1 million tonnes. However, these estimates may not represent the actual production due to the ongoing civil unrest in the country [6]. Like cassava, sweet potato is widely
SSA1 has been separated into five sub-groups: SSA1 sub-group 1 (SSA1-SG1), SSA1-SG2, SSA1-SG3, SSA1-SG4, and SSA1-SG5 [10,58,62–65].
Several molecular techniques have been used to characterize B. tabaci groups. These include esterase isozyme polymorphisms [66] and DNA markers such as RAPDs, PCR-RFLPs, and AFLPs [67]. Improvements in molecular tools led to the application of sequencing of 16S rRNA, cytochrome oxidase I gene portions in the mitochondrial genome, and the nuclear ribosomal intergenic spacer 1 (ITS1) [55,56,67,68]. Microsatellite markers have also been developed and used to study B. tabaci populations [69,70]. Recent advances have utilized SNP genotyping and nuclear genes [71–73]. The mitochondrial DNA cytochrome oxidase I (mtCOI) marker has been the most widely used marker for phylogenetic studies of B. tabaci. It has a high degree of variability and has played an important role in characterizing the genetic relationships between the B. tabaci cryptic species and haplotypes [53,56,74]. MtCOI has been extensively used in assessing the genetic variability, phylogeographic distribution, and identification of new invasive species of B. tabaci in Africa [10,60,62,75] and elsewhere [76–78]. However, recent SNP-genotyping using NextRAD sequencing revealed that mtCOI sequencing is not completely effective at distinguishing cassava-colonizing B. tabaci genotypes [71]. These were reclassified into six major groups designated: Sub-Saharan Africa East and Central Africa (SSA-ECA), Sub-Saharan Africa East and Southern Africa (SSA-ESA), Sub-Saharan Africa Central Africa (SSA-CA), Sub-Saharan Africa West Africa (SSA-WA), Sub-Saharan Africa 2 (SSA2), and Sub-Saharan Africa 4 (SSA4) [71,72]. Information on the occurrence and distribution of B. tabaci in sub-Saharan Africa is available, but there are no data for South Sudan. Assessing the nature of the problem that is posed by B. tabaci and the viruses it transmits in South Sudan and developing appropriate control strategies are currently impeded by the instability that is caused by the ongoing civil war.
Data are urgently required regarding the genetic groups, haplotype diversity, geographical distribution, and the phylogenetic relationships of B. tabaci in South Sudan. As a first step, this study sought to address this need by sampling and characterizing B. tabaci collected from cassava and sweet potato in Juba County, Central Equatoria State, South Sudan. Therefore, we aimed to provide the first description of the diversity of B. tabaci on sweet potato and cassava in South Sudan, despite the civil unrest in the country severely limiting safe collection sites. The south of Sudan has been experiencing civil war for the past four decades, which is the reason for the lack of data in agricultural research and lack of agricultural progress. The civil unrest commenced again in 2013 one month after this project was started in the new nation of South Sudan.

Whitefly Sampling
Adult whiteflies (Bemisia tabaci) were collected from sweet potato and cassava fields across 10 locations in Juba County (Central Equatoria State, South Sudan) between July and August 2018 (Table 1 and Figure 1). The sampling was confined to these locations as a consequence of the widespread civil insecurity in many parts of South Sudan at the time of the study. Whiteflies were also sampled from tomato and squash plants adjacent to sweet potato and cassava fields. B. tabaci adults that were collected from sweet potato plants in greenhouses at the University of Juba in August 2017 were also added to the field collections. In total, 24 fields were sampled from the 10 locations. Whiteflies were aspirated alive and immediately preserved in 95% ethanol in vials, before being stored in the freezer at −20 °C. Sweet potato and cassava leaves that contained B. tabaci nymphs were cut into small pieces, put into vials, and also preserved in 95% ethanol before being stored in the freezer. B. tabaci were collected from several plants in each sampled field and at least 20 whiteflies were collected from each field.

DNA Extraction
DNA was extracted from single whiteflies, which were either adults or fourth instar nymphs. Each insect was added to 3 µL of lysis buffer in a 1.5 mL Eppendorf tube and then macerated. The lysis buffer contained 10 mM Tris-HCl (pH 8.0), 50 mM KCl, 2.5 mM MgCl2, 0.45% Tween-20, 0.01% gelatine, and 60 µg/mL proteinase. The mixture was then vortexed, spun down, and immediately incubated on ice for 15 min. This was followed by incubation at 55 °C in a water bath for 30 min, and the lysate was stored at −20 °C for downstream use. For PCR, the lysate was diluted with sterile DEPC-treated water in a ratio of 1:9.

Mitochondrial COI (MtCOI) PCR Amplification and Sequencing
DNA extracted from 228 individual whiteflies from all the sampled locations was used for PCR amplification. Two sets of primers were used to amplify partial fragments of mtCOI: primer set MM1 (forward 5′-CTGAYATRGCKTTTCCTCG-3′, reverse 5′-TTACTGCAYWTTCTGCCAC-3′; IITA lab), and primer set 2195-Bt-F (5′-TGRTTTTTTGGTCATCCRGAAGT-3′) and C012-Bt-sh2-R (5′-TTTACTGCACTTTCTGCC-3′) [79]. These primer sets amplified ~1300 bp and ~867 bp portions of the mtCOI gene, respectively. The PCR reaction contained 1× QuickLoad Master Mix (New England Biolabs, UK), 1 mM MgCl2, 0.24 µM of each primer, 2 µL DNA, and sterile distilled water to achieve the desired reaction volume of 25 µL. PCR was carried out with an initial denaturation of template DNA at 95 °C for 5 min, followed by 35 cycles of 94 °C for 40 s, 56 °C for 30 s for annealing, and 72 °C for 90 s for extension, with a final extension at 72 °C for 10 min. The PCR products were run on a 1% agarose gel in 1× TAE buffer stained with GelRed™ (Biotium, Fremont, CA, USA). DNA bands were visualized using a Gel Doc™ XR+ Gel Documentation System and only samples with intact bands were selected for sequencing. PCR products were sent to Macrogen Inc. (Rockville, MD, USA) for purification and direct sequencing. DNA sequences were manually edited using Ridom Trace Edit v1.1.0 software (Ridom GmbH., Würzburg, Germany). The sequences were assembled into contigs using CLC Main Workbench 7.0.2 (QIAGEN, Aarhus, Denmark). Multiple alignment of edited sequences was performed using Clustal W in MEGA version 7.0.26 [80] and the sequences were trimmed to 744 nt. A maximum-likelihood phylogenetic tree was constructed in MEGA with 1000 bootstrap replicates. Sequences were queried against GenBank (NCBI) using BLASTn, and selected reference sequences with 99% to 100% identity to our mtCOI sequences were included in the phylogenetic tree for comparison with previously published haplotypes.
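Three of the four primers listed above contain IUPAC ambiguity codes (Y, R, K, W), so each is really a small pool of oligonucleotides. The sketch below is only an illustration, not part of the published protocol: it expands each primer into the unambiguous sequences it encodes. The labels "MM1-F" and "MM1-R" and the function name `expand_degenerate` are naming assumptions made here, and the primer strings are copied from the text.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC",
    "S": "CG", "W": "AT", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Primer sequences as given in the text; the MM1-F/MM1-R labels are assumed.
primers = {
    "MM1-F": "CTGAYATRGCKTTTCCTCG",
    "MM1-R": "TTACTGCAYWTTCTGCCAC",
    "2195-Bt-F": "TGRTTTTTTGGTCATCCRGAAGT",
    "C012-Bt-sh2-R": "TTTACTGCACTTTCTGCC",
}

for name, seq in primers.items():
    variants = expand_degenerate(seq)
    print(f"{name}: {len(seq)} nt, {len(variants)} variant(s)")
```

The number of variants is the product of the degeneracy at each position, so a primer with three two-fold ambiguity codes corresponds to eight distinct oligonucleotides.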
The extent of nt sequence variation within the identified B. tabaci groups was examined. Estimates were obtained for the number of haplotypes, polymorphic sites (S), average number of nucleotide differences (k), nucleotide diversity (Pi), haplotype diversity (Hd), Theta per sequence and Theta per site, and significance values using the mismatch distribution procedure of DnaSP 6.12.03 [81]. Tajima's D and Fu's Fs were calculated using DnaSP 6.12.03 to determine whether the sampled whitefly populations were stable or expanding.
sampled were from sweet potato and cassava, which were the main targeted crops of this study. Consequently, the whiteflies that were collected from tomato and squash were from fields adjacent to either sweet potato or cassava plantings. The number and distribution of collection sites varied, depending on the number of fields that were found in each location and the relative abundance of B. tabaci in those fields. However, most importantly, the sampling provided a representative collection of B. tabaci from the major target crops (cassava and sweet potato). The degree of variability subsequently revealed by the sequencing and phylogenetic analysis confirmed this point.
In total, 183 whitefly samples were sequenced, out of which 162 produced high quality mtCOI sequences. There was a high level of diversity among B. tabaci populations that were collected from the sampled crop plants. The sequences obtained from sweet potato, tomato, and squash were grouped into three phylogenetically distinct groups: Mediterranean (MED), Indian Ocean (IO), and Uganda. The sequences from cassava were grouped into three distinct groups: SSA1-SG1, SSA1-SG3, and SSA2 (Figure 2). These groups were identified based on the topology of the phylogenetic tree and the clustering of the sequences that were obtained from this study relative to the reference sequences retrieved from GenBank.
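For readers unfamiliar with the summary statistics named in the methods (S, k, Pi, and Tajima's D), the following is a minimal sketch of how they can be computed from an alignment, using the standard formulas of Tajima (1989). It is not the DnaSP implementation and omits significance testing and Fu's Fs. The two toy alignments (`balanced`, `singletons`) and all function names are invented here for illustration and are not data from this study.

```python
from itertools import combinations
from math import nan, sqrt


def segregating_sites(seqs):
    """Number of alignment columns with more than one nucleotide (S)."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of nucleotide differences over all sequence pairs (k)."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / len(pairs)


def nucleotide_diversity(seqs):
    """Average pairwise differences per site (Pi)."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def tajimas_d(seqs):
    """Tajima's D; returns nan when the statistic is undefined."""
    n, s = len(seqs), segregating_sites(seqs)
    if n < 2 or s == 0:
        return nan
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return nan
    return (mean_pairwise_differences(seqs) - s / a1) / sqrt(variance)


# Toy alignments (hypothetical): variants at intermediate frequency versus
# variants carried by single sequences only.
balanced = ["AAAAAAAA", "AAAAAAAA", "TTAAAAAA", "TTAAAAAA"]
singletons = ["AAAAAAAA", "TAAAAAAA", "ATAAAAAA", "AATAAAAA"]

for label, aln in (("balanced", balanced), ("singletons", singletons)):
    print(label, segregating_sites(aln),
          round(nucleotide_diversity(aln), 4), round(tajimas_d(aln), 3))
```

In the toy data, variants held at intermediate frequency give a positive D, whereas an excess of singletons gives a negative D, which is the usual signature of a recently expanded population.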
The predominant haplotype MED had a total of 90 whiteflies, which accounted for 55.5% of all the whiteflies collected from the four host plants. Of these, 72 whiteflies (44.4%) were found on sweet potato (Table 3). The second most abundant haplotype was SSA2 with 43 whiteflies (26.5%), all of them being found on cassava. SSA1-SG1, which was also found only on cassava, was represented by 13 whiteflies (8%). The other haplotypes were Uganda, which was present on sweet potato and had a total of 11 whiteflies (6.8%), Indian Ocean with four whiteflies (2.5%), and SSA1-SG3, which was the least frequent haplotype with only one whitefly (0.6%) found on sweet potato (Table 3). A total of 45 selected sequences that represent haplogroups found in this study have been submitted to GenBank under the accession numbers MN318379-MN318423.

Results and Discussion
The clustering of the whiteflies SSA2, SSA1-SG1, and SSA1-SG3 into a distinct major clade separate from B. tabaci whiteflies that do not colonize cassava is consistent with what has been reported in other studies of B. tabaci from various cassava-growing countries in Africa [60,61,71]. The grouping of MED and Indian Ocean haplotypes is also consistent with what has been reported in previous studies [60,61]. Uganda, which was depicted by a clearly defined monophyletic grouping in our mtCOI sequence analysis, has previously been identified as a genetically distinct haplotype that occurs in East Africa [58,70].
We found that B. tabaci MED was predominant on sweet potato, tomato, and squash in all of the sampled locations. MED is a globally important B. tabaci haplotype group which is thought to have originated from Africa. Consequently, there are numerous other reports of its prevalence on a wide range of crop and weed hosts [59,60,64]. B. tabaci MED has been reported to be extremely polyphagous and invasive [54], causing damage to both field and greenhouse crops [82]. It has also developed resistance to various insecticides under intensive production systems [83–85]. The presence of B. tabaci MED in all locations and on all sampled crop plants in our study in South Sudan suggests that this haplotype is an important pest of sweet potato and other crops in Juba County. MED has been reported to widely occur in sub-Saharan Africa, and it seems likely, therefore, that, in addition to Juba County, it is an important B. tabaci haplotype throughout South Sudan. Moreover, as SPCSV transmitted by B. tabaci is one of the most important viruses affecting sweet potato in this region of East Africa [12], it is likely that MED is the main vector of this virus in South Sudan. However, future investigations should determine the relative abilities of each of the three B. tabaci haplotypes occurring on sweet potato to transmit SPCSV.
Since no similar studies have been conducted anywhere else in sub-Saharan Africa, this represents an important gap in the existing understanding of the relationship between B. tabaci haplotype groups and the viruses that they vector.
The MED haplotype analyses revealed six haplotypes amongst the samples that were collected from South Sudan (Table 4). Two of these are previously described African MED haplotypes, whilst the other four are new unique haplotypes that fall within the MED group. Haplotype diversity (0.51), nucleotide diversity (0.012), and a significant positive Tajima's D (2.07283; p < 0.05) suggest that the population is undergoing balancing selection and has not undergone rapid recent expansion. Sixty-two of the 90 MED sequences (Haplotype 1) represent an important African MED haplotype, for which there are a further 17 sequences in GenBank from Cameroon, Uganda, and Nigeria. The samples in this haplotype were predominantly from sweet potato, although there were also individuals from tomato, squash, and cassava, which indicated that it could be sharing host plants. Haplotype 2 had eight sequences from sweet potato that were identical to 13 sequences from GenBank originating from Sudan, Cameroon, Uganda, and Burkina Faso. Haplotype 3 had eight sequences from sweet potato and squash. These were most closely matched (99.7%) with a GenBank sequence from Uganda (KX570768). Haplotypes 4 and 5 occurred on sweet potato and had eight and three sequences, respectively. These were most closely related (99.7%) to a sequence from China (MH908653). Haplotype 6, which was recorded from sweet potato, had one sequence sharing 99.9% identity with MH908653 from China. Currently, GenBank hosts 944 MED sequences that comprise 168 haplotypes. Of these sequences, 673 (71%) cluster in three major haplotypes that are spread worldwide. Of the 168 haplotypes, 137 (81%) have only one sequence in GenBank, although it is possible that some of these may be erroneously considered as unique haplotypes due to the frequent occurrence of sequencing errors in mtCOI data submitted to this database.
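The reported MED haplotype diversity can be checked directly from the six haplotype counts given above (62, 8, 8, 8, 3, and 1 sequences). The sketch below assumes the standard unbiased estimator Hd = n/(n − 1) × (1 − Σp²), where p is the frequency of each haplotype; whether DnaSP applies exactly this form is an assumption here, and the function name is illustrative.

```python
def haplotype_diversity(counts):
    """Haplotype diversity from per-haplotype sequence counts."""
    n = sum(counts)
    if n < 2:
        raise ValueError("at least two sequences are required")
    sum_sq_freq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1 - sum_sq_freq)


# Sequences per MED haplotype (Haplotypes 1 to 6), as listed in the text.
med_counts = [62, 8, 8, 8, 3, 1]

hd = haplotype_diversity(med_counts)
print(f"n = {sum(med_counts)}, Hd = {hd:.2f}")  # prints: n = 90, Hd = 0.51
```

The result matches the reported value of 0.51, and the moderate diversity reflects the dominance of Haplotype 1, which accounts for 62 of the 90 sequences.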
In some scenarios, mitochondrial bar-coding has been found to overestimate the number of species or scale of divergence, where there are nuclear mitochondrial DNA pseudogenes (NUMTs) that can be PCR-amplified with mitochondrial primers [86]. A study has demonstrated that NUMTs were the cause for the incorrect identification of a putative Bemisia species that was given the name MEAM2 based on mtDNA COI data [87]. Another study using whole genome nuclear markers on the major clades of B. tabaci revealed the existence of fewer putative species (five so far), as opposed to the much larger number reported with mtCOI [73].
In this study, B. tabaci Indian Ocean were collected from sweet potato and tomato, and their sequences were most closely related to Reunion 1 from Spain [88], although B. tabaci Indian Ocean has also been widely reported from sub-Saharan Africa and the surrounding islands [60,65,89]. Haplotype analysis revealed the existence of two Indian Ocean haplotypes. The Uganda haplotype sequences were obtained from several whiteflies that were collected from sweet potato and one individual from tomato. The South Sudan 'Uganda' haplotype sequences were identical to the original Uganda haplotype sequence also obtained from a whitefly adult collected from sweet potato in Uganda (33NamSP-AY057174) [58]. However, Sseruwagi et al. [60] reported the occurrence of sweet potato Uganda haplotype on crop plants other than sweet potato, and Wainana [90] made similar observations from western Kenya, noting the presence of the Uganda haplotype on common bean as well as sweet potato. These data suggest that this haplotype is confined to East Africa and it has a relatively narrow host range, specializing on sweet potato. Our results represent the northernmost record of haplotype 'Uganda' and the third country report.
In the studied cassava group of B. tabaci, the largest number of samples were SSA2, and these were distributed through all of the locations. Two SSA2 haplotypes were identified (Table 4). Haplotype diversity (0.509), nucleotide diversity (0.00205), and a significant positive Tajima's D (2.57824; p < 0.05) suggest that the population has not undergone recent expansion, but is instead experiencing balancing selection. SSA1-SG1 was less frequent, as it was only detected at two locations and comprised only one haplotype. These results differ from other recent findings from East and Central Africa, which have shown SSA1-SG1 to be the predominant B. tabaci haplotype on cassava [10,61,71]. SSA2, which was previously associated with the severe CMD epidemic in Uganda [58], has been reported to be absent in more recent whitefly collections from cassava in Uganda and western Kenya [65,91], and replaced by SSA1-SG1 [10], although low frequencies of this haplotype were reported from Uganda and Kenya between 2004 and 2010. Recent studies have noted the occurrence of SSA2 on cassava in western Kenya and weedy hosts in Uganda [72,79,92]. The detection of SSA1-SG1 and SSA2 fourth instar nymphs of B. tabaci confirms that both of the haplotypes colonize cassava in South Sudan. A recent continent-wide assessment of cassava-colonizing B. tabaci in sub-Saharan Africa noted that SSA2 was the most widely distributed of the recorded haplotypes [72]. Significantly, this haplotype co-occurs with others throughout its geographic range (stretching from Sierra Leone in West Africa to Kenya in the East), but it appears to be less frequent than SSA1 haplotypes in all cases. Our data suggest that South Sudan could be an exception to this pattern, since there were more than three times as many SSA2 individuals recorded when compared to those of SSA1-SG1.
In the 1990s, CMD was reported to be highly destructive in the Western Equatoria Province of pre-independence southern Sudan [18]. Furthermore, in a baseline survey that was conducted on cassava in 2005 in Eastern and Western Equatoria states, African cassava mosaic virus (ACMV), East African cassava mosaic virus (EACMV), and East African cassava mosaic virus-Uganda (EACMV-UG) were found to be the viruses that affect cassava [7]. SSA2 was shown to be the most abundant B. tabaci haplotype in areas that were affected by the severe CMD epidemic, which spread through Uganda in the 1990s. It is quite likely that there might have been a similar association between virus and vector in southern Sudan during this period. However, whilst SSA1-SG1 subsequently displaced SSA2 as the predominant B. tabaci haplotype on cassava in Uganda, this change might not have happened further north in southern Sudan, with the result being that SSA2 is currently the main cassava-colonizing B. tabaci haplotype in present day South Sudan. The reasons behind these contrasting patterns of population change in Uganda and South Sudan are not currently apparent, but would be a useful topic for future study.
In this study, a single individual of the non-cassava B. tabaci haplotype MED was collected from cassava. Such rare occurrences have been reported elsewhere [61,75,91]. However, previous studies have demonstrated that non-cassava B. tabaci whiteflies are unable to reproduce on and colonize cassava [93], partly because they are unable to feed effectively on cassava plant hosts [94]. In each of these instances, it has been concluded that whiteflies of non-cassava B. tabaci haplotypes occurring on cassava are present as visitors, and are not colonizing the crop.

Conclusions
This study presents the first report on the genetic diversity of B. tabaci whitefly populations collected from South Sudan. Six B. tabaci haplotype groups, which include three non-cassava groups (MED, Indian Ocean and Uganda) and three cassava groups (SSA1-SG1, SSA1-SG3 and SSA2), were identified. MED and SSA2 were the most prevalent and most widely distributed amongst the sampled locations. The Uganda haplotype is also widespread and it was identified from five of the locations. The discovery of six B. tabaci haplotype groups from the relatively small portion of South Sudan that was sampled does suggest that, like Uganda, this part of East Africa has a high level of whitefly diversity. This provides a strong indication that this part of Africa might have been a source for MED whiteflies that have had devastating global impacts as an invasive pest [95]. It is also significant that the MED species group of B. tabaci includes some of the most insecticide-resistant populations of whiteflies. Therefore, any future management efforts will need to apply extreme caution in the application of chemical insecticides to preclude the development of whitefly resistance, although Bemisia whiteflies may not be present on sweet potato and other host plants at high abundance levels. Whitefly populations that were observed in South Sudan were associated with transmission of viruses causing damaging disease in cassava and sweet potato. Improving the understanding of the dynamic interactions between vector and virus will be important for each of these crop-virus-vector pathosystems. An essential first step in this task will be conducting a comprehensive assessment of the genetic diversity, geographical distribution, population dynamics, and host range of B. tabaci species in South Sudan. This new knowledge will then provide the basis for the development of effective whitefly management strategies.