Assessment and optimization of rolling circle amplification protocols for the detection and characterization of badnaviruses

The genus Badnavirus is characterized by members that are genetically and serologically heterogeneous, which presents challenges for their detection and characterization. The presence of integrated badnavirus-like sequences in some host species further complicates detection using PCR-based protocols. To address these challenges, we have assessed and optimized various RCA protocols including random-primed RCA (RP-RCA), primer-spiked random-primed RCA (primer-spiked RP-RCA), directed RCA (D-RCA) and specific-primed RCA (SP-RCA). Using Dioscorea bacilliform AL virus (DBALV) as an example, we demonstrate that viral DNA amplified using the optimized D-RCA and SP-RCA protocols showed an 85-fold increase in badnavirus NGS reads compared with RP-RCA. The optimized RCA techniques described here were used to detect a range of badnaviruses infecting banana, sugar cane, taro and yam, demonstrating the utility of RCA for detection of diverse badnaviruses infecting a variety of host plant species.


Introduction
The genus Badnavirus (family Caulimoviridae) consists of plant pararetroviruses that infect a wide range of economically important crops and can cause economically significant crop losses (Bhat et al., 2016). Badnaviruses have non-enveloped bacilliform-shaped virions of approximately 30 nm × 120-150 nm (Geering and Hull, 2012) with a circular, double-stranded DNA genome of 7.2-9.2 kb, typically encoding three open reading frames (ORFs) (Geering, 2014). Replication occurs via reverse transcription of a greater-than-genome length RNA which subsequently serves as a template both for the translation of viral proteins and for reverse transcription to replicate the genome (Borah et al., 2013; Geering and Hull, 2012; Iskra-Caruana et al., 2014). The genomes of some badnavirus species are integrated into their host plant genomes, and these sequences are referred to as endogenous badnaviruses (Bhat et al., 2016; Hohn et al., 2008; Staginnus et al., 2009). In some instances, these integrated sequences can give rise to systemic virus infection following recombination events post-exposure to abiotic stress such as in vitro tissue culture process (Côte et al., 2010; Dallot et al., 2001) and interspecific crossing (Lheureux et al., 2003).
Members of the genus Badnavirus are genetically and serologically heterogeneous, having relatively low nucleotide identities even within the same species (Borah et al., 2013; Bouhida et al., 1993; Geering et al., 2000; Harper et al., 2004, 2005; Jaufeerally-Fakim et al., 2006; Kenyon et al., 2008). The heterogeneous nature of badnaviruses complicates both serological- and nucleic acid-based detection, while the presence of endogenous badnavirus sequences presents additional challenges using nucleic acid-based methods (Bhat et al., 2016). One promising strategy to overcome these issues is rolling circle amplification (RCA). RCA is a method that utilizes phi29 polymerase, an enzyme which preferentially amplifies circular DNA and has strong strand displacement and 3′-5′ proofreading abilities. These features and the resulting high-fidelity amplification (Blanco et al., 1989; Rockett et al., 2015) have seen the technique exploited as a sequence-independent (random-primed (RP)) amplification strategy to characterize several groups of DNA viruses infecting humans, animals and plants (Johne et al., 2009). Despite the sequence-independent nature of RP-RCA, off-target amplification can sometimes still occur, with Paprotka et al. (2010) and Homs et al. (2008) reporting amplification of mitochondrial DNA from sweet potato and sugar beet, respectively.
Until 2011, RCA had primarily been used to detect plant viruses with small genomes (< 3 kb) belonging to the families Geminiviridae and Nanoviridae (Grigoras et al., 2009; Haible et al., 2006; Inoue-Nagata et al., 2004). However, James et al. (2011a) used RCA to detect plant viruses with larger genomes such as the badnaviruses, banana streak virus (BSV) and sugarcane bacilliform virus (SCBV), and the caulimovirus, cauliflower mosaic virus (CaMV). Further, they reported an increase in the amplification of the target BSV sequences by the addition of a mixture of degenerate primers with the commercial Illustra TempliPhi kit reagents used for RCA. The study also highlighted the utility of the technique for the differential amplification of episomal badnavirus genomes compared to their integrated counterparts. Although the improved RCA protocol has now been used to characterize novel badnaviruses infecting banana (James et al., 2011b), fig (Laney et al., 2012), taro (Kidanemariam et al., 2018) and yam (Bömer et al., 2016; Sukal et al., 2017), Bömer et al. (2016) reported non-specific amplification of DNA from both circular and linear non-viral templates in yam.
The use of premixed kit components such as those supplied with the Illustra TempliPhi kit, either with or without additional virus-specific primers, has been the standard for most badnavirus RCA applications. However, the nature of premixed kits precludes significant scope for optimization in order to maximize target amplification and minimize non-specific amplification. In this study, we report the development of optimized RCA methods by manipulating the individual components and reaction conditions of the RCA reaction and by the inclusion of improved badnavirus degenerate primers. The use of this method significantly increases episomal badnavirus genome amplification compared to commercial premixed kits and can be used for badnavirus detection and characterization using both Sanger and next-generation sequencing (NGS).

Samples
Leaves from Dioscorea esculenta infected with Dioscorea bacilliform AL virus (DBALV) isolate VUT02_De (GenBank accession MG948562) and Dioscorea bacilliform ES virus (DBESV) isolate FJ14 (GenBank accession KY827394), D. alata infected with Dioscorea bacilliform AL virus 2 (DBALV2) isolate PNG10 (GenBank accession KY827395) and D. rotundata infected with Dioscorea bacilliform RT virus 2 (DBRTV2) isolate SAM01 (GenBank accession KY827393) were obtained from the Centre for Pacific Crops and Trees (CePaCT), Pacific Community (SPC) germplasm collection in Fiji. Banana leaf tissue infected with isolates of banana streak CA virus (BSCAV), banana streak GF virus (BSGFV), banana streak MY virus (BSMYV) and banana streak OL virus (BSOLV), as well as sugarcane infected with sugar cane bacilliform IM virus (SCBIMV) and taro infected with taro bacilliform virus (TaBV), was provided by the Centre for Tropical Crops and Biocommodities (CTCB), Queensland University of Technology (QUT). Total nucleic acid (TNA) was extracted from all samples using a CTAB-based method (Kleinow et al., 2009) and the yield and quality were assessed using a NanoDrop™ spectrophotometer (ThermoFisher Scientific, Australia). The concentration of purified TNA was adjusted to ~500 ng/μl with sterile nuclease-free water (sterile NF-H2O) for RCA experiments.
Episomal DBV-free D. alata accession DA/NGA01 (IITA accession code TDa98/01174), a Nigerian yam accession obtained from the International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria, was used as a negative control for the RCA experiments. This accession tested negative for episomal DBV both at IITA using immunocapture-PCR (IC-PCR) and at SPC-CePaCT using IC-PCR and RCA using protocols described in Seal et al. (2014).

Primer design
The complete genome sequences of 182 badnaviruses, representing 43 species, were accessed from GenBank. For each genome, the ORF 1 and ORF 2 sequences, as well as the ORF 3 conserved domains equivalent to the CaMV movement protein (L43-E243 of ORF 1 protein), coat protein (L261-N429 of ORF 4), aspartic protease (K36-Q120 of ORF 5), reverse transcriptase (K273-G449 of ORF 5) and ribonuclease H (I547-E673 of ORF 5) (Geering and Hull, 2012) were identified. The ORF 1, ORF 2, or individual ORF 3 conserved domain sequences were separately aligned using the CLUSTALW algorithm in MEGA7 (Supplementary Files 1-7). Primers were designed from the consensus sequence of each alignment using Geneious® v11.0.4 (http://www.geneious.com; Kearse et al., 2012). The specificity of each primer was assessed using both Primer-BLAST at NCBI and in silico in Geneious® using the 182 complete genome sequences. To circumvent the DNA exonuclease activity of phi29 polymerase, the two terminal 3′ nucleotides (nt) of each primer were phosphorothioate-modified. A total of 28 primers were synthesized (Table 1), in addition to phosphorothioate-modified Badna-MFP/MRP (Turaki, 2014) and BadnaFP/RP (Yang et al., 2003) primers.
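The in silico specificity check described above can be illustrated with a minimal sketch, assuming simple exact matching of IUPAC-degenerate primers against a circular genome on both strands. This is not the Primer-BLAST or Geneious procedure used in the study; the function names, the toy genome and the toy primers are hypothetical and are not taken from Table 1.

```python
import re

# IUPAC nucleotide codes mapped to the bases each code can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Complement table that also covers degenerate codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a (possibly degenerate) DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def binding_sites(primer, genome):
    """Return sorted (start, strand) hits of a degenerate primer on a circular genome.

    Starts are 0-based coordinates on the given (plus) strand. Matching is exact
    with respect to the IUPAC codes; mismatches are not tolerated (an assumption).
    """
    genome = genome.upper()
    # Append the first len(primer) - 1 bases so sites spanning the origin are found.
    extended = genome + genome[: len(primer) - 1]
    hits = []
    for strand, query in (("+", primer.upper()), ("-", reverse_complement(primer))):
        # A lookahead pattern reports overlapping matches as well.
        pattern = re.compile("(?=" + "".join(f"[{IUPAC[b]}]" for b in query) + ")")
        hits.extend((m.start(), strand) for m in pattern.finditer(extended))
    return sorted(hits)


def primers_binding(primers, genome):
    """Names of the primers with at least one binding site on the genome."""
    return [name for name, seq in primers.items() if binding_sites(seq, genome)]


# Hypothetical toy data, for illustration only.
toy_genome = "ACGTTGCAAT"
toy_primers = {"toy_F": "TTGC", "toy_wrap": "ATAY", "toy_none": "GGGG"}
print(binding_sites("TTGC", toy_genome))
print(primers_binding(toy_primers, toy_genome))
```

Counting the primers returned by `primers_binding` for each genome gives a per-genome coverage figure; a real analysis would probably also allow a small number of mismatches away from the 3′ end.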

Random-primed RCA (RP-RCA)
RP-RCA was done using the Illustra TempliPhi 100 Amplification Kit (GE Healthcare) essentially as described by the manufacturer with some modifications. Briefly, a 1 μl aliquot of TNA (~500 ng) was mixed with 5 μl of kit sample buffer and incubated at 95°C for 3 min, cooled to 4°C and placed on ice. The denatured sample solution was then combined with 5 μl of reaction buffer premixed with 0.2 μl of phi29 polymerase. RP-RCA was either carried out at 30°C (manufacturer's recommendation) or 36°C for 18 h followed by 65°C for 10 min to inactivate the enzyme.

Table 1
Sequences of primers used in primer-spiked random-primed (RP)-RCA, directed (D)-RCA and specific-primed (SP)-RCA protocols.

Primer-spiked random-primed RCA (primer-spiked RP-RCA)
Primer-spiked RP-RCA was essentially as described for RP-RCA with the addition of 1 μl of a mixture of 32 badnavirus-specific primers (Table 1) at a final concentration of 0.4 μM of each primer to 5 μl of the sample buffer as previously described by James et al. (2011a) with either 30°C or 36°C as the incubation temperature.

Directed RCA (D-RCA)
The directed RCA (D-RCA) protocol was a modification of the published two-step RCA used for amplification of low-copy number human polyomaviruses, which consisted of an annealing step followed by an amplification step (Marincevic-Zuniga et al., 2012; Rockett et al., 2015). For the annealing step, the 32 badnavirus-specific primer mix (Table 1) at a final concentration of 0.4 μM of each primer was combined with 1× phi29 buffer (NEB, Australia) and 1 μl TNA in a final volume of 10 μl. The mixture was denatured at 95°C for 3 min, cooled to 4°C and placed on ice. A second mixture consisting of 2.5 μM of exonuclease-resistant random hexamers (ThermoFisher Scientific, Australia), 1× phi29 buffer, 2 ng/μl BSA, 4 mM DTT, 15 mM dNTPs, 5 U of phi29 polymerase (ThermoFisher Scientific, Australia) and sterile NF-H2O to 10 μl was prepared and combined with the denatured primer/template mixture. Reactions were incubated at 36°C for 18 h, followed by enzyme inactivation at 65°C for 10 min.

Specific-primed RCA (SP-RCA)
SP-RCA was carried out essentially as described for D-RCA except that the random hexamers used in the second master mix were substituted with the badnavirus-specific primer mixture at a final concentration of 0.4 μM of each primer.

Optimization of D-RCA and SP-RCA
To determine the optimum incubation temperature, RCA was carried out using TNA from DBALV-infected yam as template. Incubation temperatures ranging from 30 to 40°C, in increments of 2°C, were assessed and all reactions used an 18 h incubation time. To determine the optimal dNTP concentration for amplification, RCA was carried out using final concentrations of 0, 2.5, 5, 10, 15 and 20 mM dNTPs. To investigate the optimum incubation time, RCA was carried out as before using incubation times of 4, 12, 16 and 18 h at the optimum RCA temperatures. The sensitivity of RP-RCA, primer-spiked RP-RCA, D-RCA and SP-RCA was determined by varying the template concentration using 500, 250, 125, 50, 25, 10, 5 and 0 ng of DBALV-infected yam TNA with RCA carried out using the optimized temperature (36°C), incubation time (18 h) and concentration of dNTPs (15 mM). All RCA conditions were kept essentially as described in the four RCA protocol sections above (RP-RCA, primer-spiked RP-RCA, D-RCA and SP-RCA) while varying the parameter under investigation.
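The one-parameter-at-a-time design described above can be laid out as a run list. The following is a minimal sketch, assuming that every parameter not under investigation is held at the optimized value (36°C, 18 h, 15 mM dNTPs, 500 ng TNA); the text states this explicitly only for the sensitivity series and for the 18 h incubation used in the temperature series, so the other baselines are an assumption. The dictionary keys and function name are hypothetical.

```python
# Baseline conditions (assumed to be the optimized values reported in the text).
BASELINE = {"temperature_c": 36, "time_h": 18, "dntp_mm": 15, "template_ng": 500}

# Levels tested for each parameter, as listed in the text.
SWEEPS = {
    "temperature_c": [30, 32, 34, 36, 38, 40],
    "dntp_mm": [0, 2.5, 5, 10, 15, 20],
    "time_h": [4, 12, 16, 18],
    "template_ng": [500, 250, 125, 50, 25, 10, 5, 0],
}


def one_factor_at_a_time(baseline, sweeps):
    """Build one run per tested level, changing a single parameter per run."""
    runs = []
    for factor, levels in sweeps.items():
        for level in levels:
            run = dict(baseline)      # copy so the baseline is never mutated
            run[factor] = level       # vary only the parameter under investigation
            run["varied"] = factor    # record which series the run belongs to
            runs.append(run)
    return runs


runs = one_factor_at_a_time(BASELINE, SWEEPS)
for run in runs:
    print(run)
```

Each run list applies to a single RCA protocol; repeating it per protocol gives the full set of reactions.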

Restriction analysis, cloning and Sanger sequencing
DBALV RCA-amplified DNA was digested either with EcoRI, generating four fragments of ~3.3, 1.7, 1.4 and 1 kb, or SphI, generating a single ~8 kb fragment. RCA products of the other positive samples were independently digested with either KpnI, SphI or StuI, which were selected based on in silico restriction analysis of published badnavirus genome sequences, or from experimental experience, to generate useful restriction profiles. EcoRI, a known multiple cutter of DBVs, was used to digest RCA reaction products of the healthy control sample DA/NGA01. RCA products (10 μl) were digested in a total reaction volume of 20 μl containing 5 U of enzyme, 1× CutSmart® buffer (NEB) and sterile NF-H2O. Reaction mixtures were incubated at 37°C for 2-4 h and the digested RCA products were analyzed by electrophoresis through 1.5% agarose gels stained with SYBR® Safe (ThermoFisher Scientific).
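The in silico restriction analysis mentioned above can be sketched as follows, assuming a circular genome and palindromic recognition sites, so that searching one strand is sufficient. The recognition sequences are the standard ones for these enzymes; the function names and the toy sequence are hypothetical. Because RCA products are concatemers of the circular genome, complete digestion is expected to release the same unit-length fragments as digestion of a single circle.

```python
# Standard recognition sequences of the enzymes named in the text (all palindromic).
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "KpnI": "GGTACC",
    "SphI": "GCATGC",
    "StuI": "AGGCCT",
}


def find_sites(genome, site):
    """0-based start positions of a recognition site on a circular sequence."""
    genome, site = genome.upper(), site.upper()
    # Append the first len(site) - 1 bases so sites spanning the origin are found.
    extended = genome + genome[: len(site) - 1]
    positions = []
    start = extended.find(site)
    while start != -1:
        positions.append(start)
        start = extended.find(site, start + 1)
    return positions


def circular_fragments(genome_length, cut_positions):
    """Fragment sizes, largest first, after cutting a circle at the given positions.

    Using the site start as the cut coordinate does not change the fragment sizes,
    because the same offset applies to every site of a given enzyme.
    """
    cuts = sorted(set(p % genome_length for p in cut_positions))
    if not cuts:
        return []                 # uncut circle: no linear fragments
    if len(cuts) == 1:
        return [genome_length]    # a single cut linearizes the full-length genome
    sizes = [b - a for a, b in zip(cuts, cuts[1:])]
    sizes.append(genome_length - cuts[-1] + cuts[0])  # fragment spanning the origin
    return sorted(sizes, reverse=True)


# Hypothetical 30 nt toy circle with two EcoRI sites.
toy = "GAATTC" + "A" * 10 + "GAATTC" + "C" * 8
print(circular_fragments(len(toy), find_sites(toy, RECOGNITION_SITES["EcoRI"])))
```

A single cutter therefore yields one full-length fragment, which is the basis for selecting enzymes that give putative full-length bands.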
Fragments of interest were purified using Freeze N Squeeze™ DNA Gel Extraction Spin Columns (Bio-Rad, Australia) and cloned into pUC19 as described in Sukal et al. (2017). Sequencing was carried out using either M13F/R or BadnaFP/RP primers.

RCA-NGS and genome assembly
To characterize the specificity of each of the different RCA protocols for amplification of badnaviruses, DBALV DNA amplified using RP-RCA, D-RCA and SP-RCA was purified and sequenced using the Illumina MiSeq platform. RCA products were purified using the Illustra™ GFX™ PCR DNA and Gel Band Purification Kit (GE Healthcare). Sequencing libraries were prepared from purified RCA products using the Nextera™ XT Library Prep Kit (Illumina) and paired-end reads generated using the MiSeq system (Illumina) at the Central Analytical Research Facility (CARF), Queensland University of Technology, Brisbane, Australia. Raw read quality was assessed with FastQC v0.10.1 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/), with residual adapter sequences trimmed, as well as low quality and short reads (< 40 nt) removed, using the BBduk plugin in Geneious®. To determine NGS read identities, the quality corrected reads were mapped against badnavirus genomes (182 complete sequences), the D. alata reference genome (GenBank accessions: CZHE02000001-CZHE0205770), plastid (GenBank accessions: EF380353, KJ490011, KY085893) or mitochondrial (GenBank accession: LC219374) DNA using the Geneious reference mapper algorithm. Unmapped reads were de novo assembled using SPAdes v3.5 (Bankevich et al., 2012) and BLASTn was carried out on contigs > 1000 nt to further determine read identities. Finally, complete viral genomes were generated by reference mapping total NGS reads against the DBALV-[2ALa] genome (GenBank accession KX008571).

Badnavirus RCA optimization
RCA optimization was initially carried out using TNA extracted from D. esculenta originating from Vanuatu infected with DBALV isolate VUT02_De. This sample had previously tested positive at SPC using IC-PCR and the complete virus genome was characterized using NGS in the present study (see below). The effect of incubation temperature on RP-RCA, D-RCA and SP-RCA was evaluated between 30 and 40°C at intervals of 2°C. Following RCA, a 10 μl aliquot of each amplification reaction was digested with EcoRI, to generate four fragments of ~3.3, 1.7, 1.4 and 1 kb. Using RP-RCA, only very low levels of visible digest products were observed irrespective of the RCA incubation temperature (Fig. 1A). In contrast, whereas incubation temperatures of 30 and 40°C also resulted in very poor amplification of virus DNA using D-RCA and SP-RCA, considerably stronger visible digest products were observed using incubation temperatures of 32-36°C for SP-RCA, or 32-38°C for D-RCA (Fig. 1A). Similar results were observed when RCA reactions were digested with SphI (results not shown).
The effect of dNTP concentration on amplification was evaluated by varying the final concentration of dNTPs between 0 and 20 mM in the RCA reactions. When RP-RCA-amplified DNA was digested with EcoRI, very low levels of visible reaction products, similar to, or lower than, those observed in Fig. 1A, were observed irrespective of the dNTP concentration (result not shown). When D-RCA and SP-RCA-amplified DNA was digested with EcoRI, no visible reaction products were observed in reactions with 0 or 2.5 mM dNTPs (Fig. 1B). In contrast, visibly stronger digest products were observed using D-RCA reactions containing 5 mM dNTPs, while no visible digest products were observed from SP-RCA reactions using 5 mM dNTPs. The strongest visible digest products were observed in D-RCA and SP-RCA reactions containing dNTP concentrations in the range of 10-20 mM (Fig. 1B).
Evaluation of RCA reaction incubation times showed that detectable levels of badnavirus DNA were amplified after 12 h in both the D-RCA and SP-RCA reactions, but only after 16 h from RP-RCA reactions (Fig. 1C). However, in all three RCA protocols, optimal levels of visible digest products were observed following incubation for 16-18 h. When the TNA template concentration was varied from 5 to 500 ng, very low levels of visible digest products were only observed using the RP-RCA protocol containing 250-500 ng of TNA template (Fig. 2A). In contrast, digestion of products from primer-spiked RP-RCA generated visible digest products when as little as 50 ng of TNA template was used (Fig. 2B). The levels of visible digest products from both the RP-RCA and primer-spiked RP-RCA protocols were positively correlated with the amount of TNA template added, with higher starting template concentrations resulting in higher amounts of visible digest products. When the amount of TNA template was varied in the D-RCA and SP-RCA protocols, relatively high amounts of visible digest products were observed at all TNA concentrations assessed (Fig. 2C and D, respectively).
Following optimization, all further experiments using D-RCA and SP-RCA were carried out as described in the Directed RCA (D-RCA) and Specific-primed RCA (SP-RCA) sections, respectively, with a final concentration of 15 mM dNTPs and an incubation temperature of 36°C for a duration of 18 h, while RP-RCA and primer-spiked RP-RCA were carried out as described in the Random-primed RCA (RP-RCA) and Primer-spiked random-primed RCA (primer-spiked RP-RCA) sections, respectively. All RCA was carried out using 500 ng of TNA as template.

RP-RCA, primer-spiked RP-RCA, D-RCA and SP-RCA amplification of badnaviruses
To compare the utility of the optimized RCA protocol with previously described RCA protocols for the detection of a broad range of badnaviruses, TNA from bananas infected with BSCAV, BSGFV, BSMYV or BSOLV, yam plants infected with DBALV, DBALV2, DBESV or DBRTV2, sugar cane infected with SCBIMV and taro infected with TaBV were subjected to RP-RCA, primer-spiked RP-RCA, D-RCA and SP-RCA using the optimized reaction conditions. Since temperatures in the range of 34-38°C were shown to increase the efficiency of both RP-RCA (Fig. 1A) and primer-spiked RP-RCA (figure not shown), the performance of RP-RCA and primer-spiked RP-RCA was evaluated at both 30°C (used in previous published studies), and at 36°C.
To obtain putative full-length restriction fragments, RCA products from TNA extracts containing BSCAV, BSGFV, BSMYV, BSOLV, DBESV and SCBIMV were digested with KpnI, RCA products from TNA extracts containing DBALV, DBALV2 and DBRTV2 were digested with SphI, while RCA products from the TNA extract containing TaBV were digested with StuI.

[Figure caption fragment: ... with EcoRI, while RCA products in (C) were digested with SphI. Digest products were electrophoresed through 1.5% agarose gels and stained with SYBR® Safe (Invitrogen, Australia). M: GeneRuler 1 kb DNA Ladder (ThermoFisher Scientific, Australia) with fragments larger than 1 kb visible; labels on the right-hand side indicate the length of selected DNA fragments (bp).]
Putative full-length viral genome fragments observed following digestion of reaction products from RP-RCA (30 and 36°C), primer-spiked RP-RCA (30 and 36°C), D-RCA and SP-RCA were excised, cloned and Sanger sequenced using either M13F/R or BadnaFP/RP primers. In each case, sequencing confirmed that the fragments observed following digestion of RCA reaction products were of the respective virus isolate, with 97-100% sequence identity to published sequences for the virus known to be present in each sample.

RCA-NGS for virus characterization
To investigate the efficiency of the different RCA protocols for the amplification of badnavirus DNA, TNA extracted from DBALV-infected D. esculenta from Vanuatu (DBALV isolate VUT02_De) was used in RP-RCA, D-RCA and SP-RCA. Undigested reaction products were sequenced using the Illumina platform which resulted in paired-end reads of 350,272, 604,382 and 800,560 from NGS of RP-RCA, D-RCA and SP-RCA derived products, respectively. Following adapter removal, quality trimming and removal of short reads (< 40 nt) from the paired-end reads, 317,608, 557,088 and 751,680 respective reads from the RP-RCA, D-RCA and SP-RCA products were obtained. Reference mapping using Geneious revealed that 0.15%, 67.44%, 2.39% and 1.38% of RP-RCA sequences, 85.71%, 6.57%, 0.41% and 0.03% of D-RCA sequences and 84.78%, 7.76%, 0.30% and 0.02% of SP-RCA sequences mapped to badnaviruses, the D. alata reference genome sequence, plastid or mitochondrial sequences, respectively, while 28.64%, 7.28% and 7.14% of RP-RCA, D-RCA and SP-RCA generated sequences remained unmapped. When the unmapped reads were de novo assembled and contigs > 1000 nt were subjected to BLASTn analysis, the majority of hits were to plant genomes other than Dioscorea spp.
Reference mapping was further carried out in Geneious, using DBALV-[2ALa] (GenBank accession KX008571) as a reference, to generate the complete genome of isolate VUT02_De used in this study. The Geneious mapper assigned 477 (mean coverage 11), 477,485 (mean coverage 12,783) and 637,254 (mean coverage 17,145) of the trimmed reads generated using RP-RCA, D-RCA and SP-RCA, respectively, to the reference genome. The D-RCA and SP-RCA libraries generated a complete circular virus genome using the NGS data, whereas the RP-RCA library only generated fragmented sequences, the largest of which was 7203 nt.
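As a worked check of the figures above, the proportion of trimmed reads assigned to the DBALV reference can be recomputed from the read counts reported in this section. The sketch below uses only those counts; the variable and function names are illustrative.

```python
# Trimmed read totals and reads assigned to the DBALV reference, as reported in the text.
TRIMMED_READS = {"RP-RCA": 317_608, "D-RCA": 557_088, "SP-RCA": 751_680}
MAPPED_TO_DBALV = {"RP-RCA": 477, "D-RCA": 477_485, "SP-RCA": 637_254}


def percent_mapped(mapped, total):
    """Percentage of trimmed reads that mapped to the reference."""
    return 100.0 * mapped / total


percentages = {
    protocol: percent_mapped(MAPPED_TO_DBALV[protocol], TRIMMED_READS[protocol])
    for protocol in TRIMMED_READS
}
for protocol, pct in percentages.items():
    print(f"{protocol}: {pct:.2f}% of trimmed reads mapped")
# RP-RCA: 0.15% of trimmed reads mapped
# D-RCA: 85.71% of trimmed reads mapped
# SP-RCA: 84.78% of trimmed reads mapped
```

These values agree with the percentages of reads reported as mapping to badnaviruses for each protocol.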

DBALV isolate VUT02_De genome
The complete NGS-derived sequences of DBALV isolate VUT02_De generated using D-RCA and SP-RCA NGS showed 99.6% nucleotide identity to each other. The consensus genome of DBALV isolate VUT02_De comprised 7509 nt and contained three open reading frames (ORFs). ORF 1 comprised 432 bp and encoded a putative protein of 144 amino acids (aa), while ORF 2 was 378 bp and encoded a putative protein of 126 aa. ORF 3 was 5682 bp and encoded a putative protein of 1894 aa. An intergenic region (IR) of 1022 bp was present between ORF 1 and ORF 3 and contained a putative plant tRNAmet binding site (3′-TGGTATCAGAGCTTGGTT-5′; nt 1-18) complementary to the consensus sequence of plant cytoplasmic initiator tRNAmet (3′-ACCAUAGUCUCGGUCCAA-5′), which was designated as the start of the circular viral genome. The 529 bp partial RT/RNase H-coding region delineated by the BadnaFP/RP primers showed highest nucleotide identity (99.6%) to a partial RT/RNase H-coding sequence of Dioscorea bacilliform virus isolate VU249_Db (GenBank accession AM072705) and 92.5-94.5% similarity to two other partial sequences, VU254_DP and VU252_Db (AM072707 and AM072706, respectively) originating from Vanuatu. When compared to published DBALV full-length sequences, the complete genome sequence of DBALV-VUT02_De had the highest nucleotide sequence identity with DBALV isolate DBALV-

[Figure caption fragment: RCA reaction products were digested with EcoRI and electrophoresed through 1.5% agarose gel stained with SYBR® Safe (Invitrogen, Australia). NT: no template control. M: GeneRuler 1 kb DNA Ladder (ThermoFisher Scientific, Australia) with fragments larger than 1 kb visible; labels on the right-hand side indicate the length of selected DNA fragments (bp).]

Discussion
In this paper, we describe two improved RCA protocols for the amplification and characterization of badnaviruses developed by optimization of the reaction parameters and by the inclusion of a suite of badnavirus degenerate primers. Based on our NGS data, these modified methods were shown to enhance badnavirus genome amplification from 0.15% of total NGS reads to ~85% of total NGS reads, compared to the standard RP-RCA protocol.

[Figure caption fragment: Lanes 1-4, 7 and 9 represent KpnI-digested RCA products of BSCAV, BSGFV, BSMYV, BSOLV, DBESV and SCBIMV, respectively, while lanes 5-6 and 8 represent SphI-digested RCA products of DBALV, DBALV2 and DBRTV2, respectively. Lane 10 represents TaBV digested with StuI. Lane 11 is the healthy control sample DA/NGA01 digested with EcoRI and NT is the no template control. M: GeneRuler 1 kb DNA Ladder with fragments larger than 1 kb visible; labels on the right-hand side indicate the length of selected DNA fragments (bp).]
Although RCA using kit-based protocols, such as TempliPhi, has been used to amplify badnaviruses from various host species, such as banana (Carnelossi et al., 2014; James et al., 2011b; Javer-Higginson et al., 2014; Sharma et al., 2014, 2015; Wambulwa et al., 2012, 2013), cacao (Chingandu et al., 2017a, 2017b; Muller et al., 2018), fig (Laney et al., 2012), mulberry (Chiumenti et al., 2016), Rubus spp. (Diaz-Lara et al., 2015), taro (Kidanemariam et al., 2018) and yam (Bömer et al., 2016, 2018; Sukal et al., 2017; Umber et al., 2014), the method still has several limitations. Due to the sequence-independent nature of the kit-based RCA protocols, plant genome-derived DNAs, such as mitochondrial or chloroplast DNA, are sometimes also amplified (Bömer et al., 2016; Homs et al., 2008; Paprotka et al., 2010). Further, during the initial annealing process, the random hexamer primers in the premixed sample buffer bind to all available nucleic acids and not just the preferred target sequences, enabling non-target DNA amplification in addition to the desired target. Although the addition of virus-targeted primers reported by James et al. (2011a) creates a bias towards the target badnavirus genomes, non-specific amplification still occurs as a consequence of priming to non-target DNA by the random primers present in the premixed sample buffer used in the denaturation step. Modification of the initial denaturation mixture to only contain primers which anneal to the target virus sequences results in amplification which is biased towards the target circular virus genomes. Compared to random-primed RCA using the TempliPhi kit, where < 1% of the NGS reads mapped to badnavirus sequences, the use of either D-RCA or SP-RCA resulted in 85.71% and 84.78% of respective reads mapping to the target badnavirus, showing that these RCA protocols greatly enhance amplification of the target virus genome.
In an effort to optimize the RCA protocol to detect the greatest breadth of badnavirus sequence diversity, in silico primer-binding analysis using the 32 degenerate primers developed in this study was carried out using complete genome sequences of the 43 currently recognized badnavirus species. This analysis showed that at least 10 of the primers were able to bind to every complete genome sequence. Using these primers, the SP- and D-RCA protocols were shown to successfully amplify distantly related badnaviruses including four distinct BSVs, four distinct DBVs, as well as SCBIMV and TaBV, from four different host plant species, highlighting the utility of the method for both detection and characterization of badnaviruses.
We also found that increasing incubation temperature to 36°C greatly improved the performance of kit-based RCA protocols such as the RP-RCA and primer-spiked RP-RCA (Fig. 3B and D, respectively). However, D-RCA was found to be the most consistent and reproducible protocol for generic badnavirus amplification from the different host plants. Although SP-RCA performed to an equivalent level in some cases, there was some variability in amplification levels, possibly due to variability in the number of primers binding to different target virus genomes. However, the reliability of SP-RCA could be improved further by designing specific primers for the target virus species of interest.
RCA post-amplification analysis often involves restriction analysis to confirm viral genome amplification; however, this is dependent on knowledge of suitable restriction enzymes which generate reproducible restriction profiles. The genomic heterogeneity of many badnaviruses, together with the limited availability of complete genome sequences for some virus species, complicates the use of restriction analysis for virus detection. By utilizing NGS of total undigested RCA products, both virus detection and full genome characterization can be accomplished. The combination of RCA and NGS has previously been used for the characterization of circular DNA viruses including geminiviruses (Leke et al., 2016; Zubair et al., 2017; Idris et al., 2014; Kathurima et al., 2016) and badnaviruses (Chingandu et al., 2017a, 2017b; Muller et al., 2018). Previous work using RP-RCA-NGS to characterize the badnaviruses, cacao mild mosaic virus (CaMMV) and cacao yellow vein-banding virus (CYVBV), showed that of the total 2,111,947 and 3,664,739 NGS reads obtained by sequencing of RCA products, only 1,084,938 and 15,355 reads (representing 51.37% and 0.4% of the total reads, respectively) were derived from the target viral sequences (Chingandu et al., 2017b). The SP- and D-RCA methods described herein produced a far greater percentage of reads (~85%) mapping to the target DBALV-[2ALa] genome compared with the RP-RCA protocol (< 1%). This result highlights the significance of the improvement in target sequence amplification by omitting random primers from the initial denaturation/annealing step as well as the utility of NGS for diagnosing and characterizing badnavirus genomes. The costs associated with NGS may preclude its use as a routine diagnostic tool; however, if enough badnavirus genome information can be amassed through initial RCA-NGS efforts, RCA restriction analysis can then be used as an effective diagnostic tool.
The high levels of heterogeneity at both serological and genetic levels and the presence of host genome integrated sequences believed to be remnants of ancient viral sequences make characterization and detection of some badnaviruses difficult. The optimized RCA protocols described in this study coupled with NGS can be used for the characterization and detection of badnaviruses from a range of host species. The potential for using restriction analysis of either D-RCA or SP-RCA products as a diagnostic tool for badnavirus detection remains high, with continual sequencing of complete genomes improving knowledge of suitable restriction enzymes for digestion of reaction products, particularly for the host/badnavirus combinations described in this study. Further, knowledge of the badnaviruses infecting a specific host plant may aid the judicious selection of primers used in the D-RCA or SP-RCA protocols.