Adapterama III: Quadruple-indexed, double/triple-enzyme RADseq libraries (2RAD/3RAD)

Molecular ecologists frequently use genome reduction strategies that rely upon restriction enzyme digestion of genomic DNA to sample consistent portions of the genome from many individuals (e.g., RADseq, GBS). However, researchers often find the existing methods expensive to initiate and/or difficult to implement consistently, especially because it is difficult to multiplex sufficient numbers of samples to fill entire sequencing lanes. Here, we introduce a low-cost and highly robust approach for the construction of dual-digest RADseq libraries that build on adapters and primers designed in Adapterama I. Major features of our method include: (1) minimizing the number of processing steps; (2) focusing on a single strand of sample DNA for library construction, allowing the use of a non-phosphorylated adapter on one end; (3) ligating adapters in the presence of active restriction enzymes, thereby reducing chimeras; (4) including an optional third restriction enzyme to cut apart adapter-dimers formed by the phosphorylated adapter, thus increasing the efficiency of adapter ligation to sample DNA, which is particularly effective when only low quantity/quality DNA samples are available; (5) interchangeable adapter designs; (6) incorporating variable-length internal indexes within the adapters to increase the scope of sample indexing, facilitate pooling, and increase sequence diversity; (7) maintaining compatibility with universal dual-indexed primers and thus, Illumina sequencing reagents and libraries; and, (8) easy modification for the identification of PCR duplicates. We present eight adapter designs that work with 72 restriction enzyme combinations. We demonstrate the efficiency of our approach by comparing it with existing methods, and we validate its utility through the discovery of many variable loci in a variety of non-model organisms. Our 2RAD/3RAD method is easy to perform, has low startup costs, has increased utility with low-concentration input DNA, and produces libraries that can be highly-multiplexed and pooled with other Illumina libraries.


INTRODUCTION
Although next-generation DNA sequencing (NGS) facilitates data collection at low cost, it is not yet economically or computationally feasible for most ecological projects to sequence whole genomes from many individuals or from organisms with large genomes. However, many questions can be addressed with a small fraction of the genome (DeWoody & DeWoody, 2005;Cariou, Duret & Charlat, 2013;Pante et al., 2015;Andrews et al., 2016). Thus, researchers have created a variety of strategies to sample a consistent portion of the genome from large numbers of individuals at low cost (Harvey et al., 2016;Heyduk et al., 2016;. One of the most popular genome sampling strategies uses restriction enzymes to reduce genome complexity and sequence a set of orthologous loci across individuals (Restriction site Associated DNA sequencing, RADseq;Miller et al., 2007;Baird et al., 2008). RADseq approaches have several key advantages. First a reference genome is not required (although it is better to have one (e.g., Shafer et al., 2016)). Second library preparations are relatively low-cost. Third, RADseq techniques can be applied with minimal modification across a broad spectrum of organisms. Finally, mature software that is updated regularly is available for data analyses (e.g., Stacks; Catchen et al., 2011;Catchen et al., 2013;pyRAD;Eaton, 2014).
Limitations of methods that pool DNA samples from multiple individuals within putative populations are particularly acute (Andrews et al., 2016); thus, we focus on methods where individual samples are indexed such that individual organisms can be genotyped. Limitations of current individual-based RADseq methods include: (1) high up-front costs for adapters (∼$4,550 USD, for 12 pools of 48 samples; Peterson et al., 2012;Salas-Linaza & Oono, 2018), which limits experimental flexibility; (2) that adapters are phosphorylated in both ends, and thus, can form adapter dimers; (3) the use of an optional step to incorporate a biotin-containing primer and streptavidin beads to separate correct constructs; (4) the inability to reduce chimera formation; (5) the fact that no sequence diversity is present across the restriction recognition sequence of the resulting libraries, which constrains sequencing options; (6) the need for moderate to high amounts of high-molecular-weight DNA; (7) limited ability to multiplex high numbers of libraries, and thus, high sequencing costs; and (8) workflows of varying complexity. Thus, developing RADseq methods that reduce the cost of adapters and primers, vastly improve sample multiplexing, better accommodate low-concentration DNA samples (low DNA input), improve flexibility, increase consistency, and simplify the workflow would be helpful to many researchers.
Here, we present an alternative approach (2RAD/3RAD) for preparing RADseq libraries, which is similar in spirit to and builds upon the strengths of ddRAD, while also addressing each of the limitations summarized above and described below in our Rationale for methodological approach. In brief, 2RAD/3RAD uses DNA that is digested with multiple restriction enzymes and ligates adapters in the presence of these functional restriction enzymes, followed by PCR with primers developed in Adapterama I (Glenn et al., 2019) to make fully active quadruple-indexed Illumina libraries that can be highly-multiplexed (Figs. 1 and S1). Although some of our adapter designs and working procedures have been implemented and published during the development of our method (e.g., Graham et al., 2015;Hoffberg et al., 2016;Scott, Glenn & Rissler, 2017), additional designs, design details, flexibility, advantages and disadvantages of the system have yet to be described. Below, we explain the design goals and rationale for our approach, detail how we have implemented the method, demonstrate that large numbers of polymorphic loci can be discovered from a broad array of organisms using just one of the possible variations of our method, and discuss these results and additional work to extend this approach.
Our overall goal was to develop a ddRAD-style method with the following characteristics: (1) a simplified workflow with few consecutive buffer exchanges; (2) sequential or simultaneous digestion of DNA and ligation of adapters; (3) reduced chimera formation; (4) increased library efficiency (i.e., increasing the ratio of sample molecules converted into complete library molecules) through suppressed adapter dimer formation; (5) hierarchical combinatorial indexing to facilitate efficient multiplexing of many samples; (6) reduced costs, both for initial buy-in (i.e., cost of all reagents to start using the method) and per sample prepared; and (7) facilitation of pooling with any other Illumina library type. We built upon the adapter design and methods of Glenn & Schable (2005) to achieve the first four goals, whereas we extended the work of Faircloth & Glenn (2012) and Glenn et al. (2019) to achieve the last three goals (Files S1-S2).  Figure 1 Overview of 2RAD/3RAD library construction. Genomic DNA is digested with two restriction enzymes (A and B). Adapters are ligated to the digested DNA, but only the bottom strand has functional adapters. The top strand has shorter, non-functional versions of the adapters. The ligation products are then used in a limited cycle PCR with iTru5 and iTru7 primers to form fully active double-stranded DNA molecules. The color-scheme follows those of Glenn et al. (2019) andHoffberg et al. (2016).

Rationale for methodological approach
To achieve our first three design goals, we use reagents that allow simultaneous digestion of the sample DNA and ligation of the adapters onto the sample DNA (Glenn & Schable, 2005). Simultaneous digestion of DNA with multiple restriction enzymes requires that the enzymes are active in the same buffer and at the same temperature. New England Biolabs (NEB; Ipswich, MA, USA) has developed many enzymes that retain high activity in a single buffer (CutSmart) and describes the activity of their enzymes in their other standard buffer formulations (Table 1). However, this approach is transferable to enzymes and buffers from other companies as long as cut-sites and buffers are carefully chosen. Because T4 DNA ligase can be used in the same buffers as most restriction enzymes, if the buffers are supplemented with ATP, researchers can start by digesting DNA, then add ligase and ATP to the digestion reaction and change the temperature to promote ligation. By cycling between temperatures that promote ligation, then digestion, multiple times, reactions can be driven to highly efficient outcomes (i.e., high proportions of the input DNA will be cut and will have adapters ligated onto the ends; Figs. S1-S2). A major distinction between the methods of Glenn & Schable (2005) and those needed here is that a single blunt-ended 5 phosphorylated adapter (Super SNX) was used in the prior work, resulting in identical adapters on each end of the resulting libraries, whereas Illumina libraries require unique adapter sequences on each end of the library molecules ( Fig. 1). By focusing on a single strand of the template, rather than both strands, it is possible to use only one adapter that is phosphorylated (i.e., Read 1 adapter, bottom strand; Fig. 1, Figs. S1-S2) in 2RAD/3RAD, whereas the other adapter can use plain oligonucleotides for both strands (File S3), forming fewer adapter chimeras. To achieve our fourth design goal, we design custom adapters. During ligation, double-stranded adapters that are modified versions of the TruSeq Read 1 and Read 2 sequences (Table 2) are ligated onto each fragment of DNA. The iTru Read 2 adapters are unphosphorylated on the 5 end and will not self-ligate to form dimers. The iTru Read 1 adapter is a perfect match to the sticky end of the insert DNA, but the adapter does not have the correct bases to recreate the restriction site used to cut the sample DNA. Because the iTru Read 1 adapter is phosphorylated on the 5 end, it can and will form dimers.
For 2RAD, we select sets of low-cost, type II restriction enzymes that form unique cohesive-ends (i.e., incompatible sticky-ends). To further achieve goal four, in our 3RAD protocol, we use these two restriction enzymes (e.g., XbaI and EcoRI; Table 1) with a third restriction enzyme (e.g., NheI) that produces a cohesive-end compatible with one of the other restriction enzymes (e.g., XbaI; Fig. 2). We then assigned the two restriction enzymes with compatible cohesive-ends (e.g., XbaI, and NheI) to Illumina Read 1 adapter stub sequences (Glenn et al., 2019) and assigned the incompatible restriction enzyme (e.g., EcoRI) to Read 2 adapter stubs. Next, we designed the Read 1 stubs such that if they selfligated to form Read 1 adapter-dimers, they create the recognition sequence for the third restriction enzyme (e.g., NheI; File S3; Glenn & Schable, 2005). Similarly, Read 1 adapters ligated to genomic DNA with third restriction enzyme cut-sites recreate the recognition sequence for the third restriction enzyme. See Fig. 2 for a graphical representation of one example of this design using the restriction enzymes XbaI, EcoRI, and NheI.
As described above, we cycle temperatures in the simultaneous digestion and ligation to allow this third restriction enzyme to cut apart adapter-dimers, which increases the consistency and efficiency of 3RAD library preparation, even with limited amounts of sample DNA. However, while genomic DNA cut by this third restriction enzyme and ligated to Read 1 adapters should be recut by the same restriction enzyme, any remaining molecules of this form are suitable for PCR amplification and may be present in final libraries. 2RAD functions without this third restriction enzyme, and in practice, the differences between 2RAD and ddRAD are: (a) simultaneous digestion and ligation reactions; (b) inclusion of variable-length internal indexes on each end (see below); (c) compatibility with iTru primers from Adapterama I that allow for highly-multiplexed libraries to be pooled for sequencing on Illumina platforms (File S3); and (d) potential substitution of the normal iTru5 primer containing specific index for iTru5-8N primer pool with 65,536 indexes, which facilitates identification and removal of PCR duplicates (Hoffberg et al., 2016).
Additionally, we construct 2RAD/3RAD adapters so that each double-stranded adapter has one active strand (i.e., the bottom strand as shown in all figures herein) and one unused strand (i.e., the top strand in all figures herein). The dummy strand is simply used for structural support and correct 3D structure of the adapters and constructs through the ligation process. Both strands fit together on each side of the input DNA during ligation, but the nick between the sample DNA and Read 2 adapter top dummy strand is not ligated Table 1 Enzyme combinations and characteristics. Four design sets each for Read1 (R1) and Read2 (R2) are given. For 2RAD, any two-enzyme combination of Read 1 and Read 2 in black can be used. For 3RAD, the third enzyme (in blue) blocks adapter-dimer formation of the Read 1 adapter (File S3). Digestion efficiency is given for three NEB buffers (2.1, 3.1, and CutSmart R ), with the best conditions highlighted in green, and poor or important non-standard conditions in red. Sensitivity to methylation in the template sequence is given, as is the optimal temperature for digestion and the number of bases in the recognition sequence. Note: some restriction enzymes are available as high-fidelity (HF, i.e.: NheI-HF, SpeI-HF, and NsiI-HF), all these have 100% efficiency in CutSmart R Buffer.

Read 1 adapter sets
Read 2 S1). Thus, the unused strand construct breaks apart during PCR steps, and only those constructs with bottom strands that successfully ligate both kinds of adapters are amplified. This ensures that valid constructs with the correct restriction sites at opposite ends dominate the amplified library pools. Additionally, the oligonucleotides for the top strand (as depicted in Figs. 1, S1, and S2) are not full-length, so they cannot be used as templates for the iTru5 or iTru7 primers. Finally, the top strand of the Read 2 adapter ends in five non-complementary bases so that it cannot serve as an unwanted primer during library amplification. We achieve our next three design goals (i.e., 5-7) by including variable-length internal indexes-also known as ''in-line barcodes'' (Andrews et al., 2016)-within the Read 1 and Read 2 adapter stubs and making the adapter stubs compatible with the primers of Glenn et al. (2019; Fig. 1 and S1). For each adapter stub design, we have made eight versions of the Read 1 adapter stub and 12 versions of the Read 2 adapter stub (File S3). Each adapter stub version includes an internal index of 5, 6, 7, or 8 nucleotides (nt). The purpose of these internal indexes is twofold: (1) combinations of the Read 1 and Read 2 adapters create 96 (8 ×12) index combinations, which facilitates pooling of samples from 96-well plates (File S3); and (2) the variable length of indexes increases base diversity within pools of libraries (Krueger, Andrews & Osborne, 2011), which is important when sequencing libraries derived from restriction enzymes digestion (Mitra et al., 2015;Glenn et al., 2019).
To create full-length libraries, after digestion-ligation cycles, we amplify through reduced cycle PCR using the iTru5 and iTru7 primers of Glenn et al. (2019;Figs. 1 and 3, S1 and S2). Because the 2RAD/3RAD adapters already include internal tags that can identify all samples in a 96-well plate, samples can be pooled prior to PCR and externally tagged with the iTru5 and iTru7 primers (to identify multiple plates of samples), or users can

EcoRI Read 2 Adapter
Undesirable Products  PCR amplify individual wells with unique external tags and pool PCR products, creating redundant indexing (Fig. 3). The resulting libraries are then size-selected and sequenced using the four standard Illumina TruSeq primers, each of which returns a different indexing read (Fig. 4). Because 2RAD/3RAD libraries are constructed using the iTru primers from Adapterama I (Glenn et al., 2019), they are compatible with (i.e., can be pooled with) iTru and Illumina TruSeq libraries prior to sequencing on Illumina sequencing instruments, achieving our final design goal. This potential for highly-multiplexed pools is especially important with the advent of platforms such as the Illumina NovaSeq, which is capable of producing up to 3,000 Gb of data from one flow cell.

Design of adapters
Oligonucleotide sequences for all versions of all adapter combinations were designed using File S3. We ordered the resulting oligonucleotide sequences in plates from Integrated DNA Technologies (IDT; Coralville, IA, USA) and prepared them following directions from File S4.

Library preparation
Step-by-step protocols can be found in our extended material (File S1, Fig. 1). Several variants of the protocol are possible; thus we present specific examples that demonstrate this range (File S1). Briefly, we digested sample DNA with one of the combinations of restriction  Figure 4 Sequencing reads that can be obtained from full length 2RAD/3RAD library molecules. The top double stranded molecule shows a 2RAD/3RAD library molecule prepared as described in the text (File S1). The horizontal arrows beneath the library molecule indicate Illumina sequencing primers (binding to the complementary strand of the library molecules). The tip of the arrowhead indicates the 3 end of the primer and the direction of elongation for sequencing. Four sequencing reads are shown for each library prepared molecule, with one read for each index and each strand of the genomic DNA, including internal indexes. Reads are arranged 1 to 4 (numbered in magenta) from top to bottom, respectively. The arrow immediately 3 of the primers, indicates the data that are obtained from that primer, with coloring that is consistent with 2RAD/3RAD library molecule. Full-size DOI: 10.7717/peerj.7724/ fig-4 enzymes (Table 1), ligated a Read1 and Read2 adapter pair (File S3) and simultaneously digested dimers and chimeras with two alternating cycles of ligation-digestion. Next, we pooled libraries that had unique internal indexes (i.e., samples within 96-well plates), and purified ligation products (Fig. 3); note: this step is optional. To generate full-length libraries, we performed a PCR using iTru primers from Adapterama I that contained unique indexes (i.e., external indexes) to further differentiate individual samples (i.e., to indicate which 96-well plate the samples were from). The resulting libraries were purified and quantified. If samples were not pooled after the ligation step, iTru primers with unique external indexes were used, and then the resulting libraries were pooled. Library pools were size-selected to capture fragments in a Pippin Prep (Sage Science, Beverly, MA, USA) with a 1.5% dye-free Marker K agarose gel cassette (CDF1510) set to capture fragments at 550 bp ±10%. Resulting libraries were quantified, and in some cases, we performed a limited-cycle PCR with P5 and P7 primers to increase library concentration before sequencing.

3RAD efficiency
We tested and compared our 2RAD and 3RAD protocols with the traditional ddRAD protocol using the same restriction enzymes, adapters, and primers. To simplify the comparison between protocols, we used the pUC19 vector, which contains XbaI cut-site at position 423 and EcoRI cut-site at position 396, as template DNA. First, we amplified an approximately 500 bp fragment within the vector using the primers pUC19-215F-AAGGAGAAAATACCGCATCAGG and pUC19-774R-TAACCGTATTACCGCCTTTGAG. Then, we made four 10-fold dilution series. We used these five products (one stock and four dilutions) plus a negative control as input for 2RAD, 3RAD, and ddRAD (i.e., 2RAD with sequential digestion and ligation) libraries (File S5).

3RAD applied case studies for validation of the method
We tested our 3RAD protocol on eight example projects focused on diverse taxa: Kinosternidae ( Each dataset consisted of 12 to 24 samples. These projects span a broad diversity of organisms (e.g., in taxonomic classification, population size, heterozygosity level, and genome size), motivating evolutionary questions, and associated methods (i.e., from population genetics to phylogenetics; Table 3). After preliminary examination of several restriction enzyme combinations, we used XbaI, EcoRI-HF, and NheI and adapter sets R1.A and R2.1 (Design 1 in File S3) for all projects. We prepared all libraries using similar methods to those mentioned above, but some details varied among projects (Files S1, S6 and S7). We used a modification of our 3RAD method to incorporate molecular ID tags for the Wisteria project (see Hoffberg et al., 2016;Files S6 and S7). We included these datasets to validate the utility of our method and present limited results in File S6; more detailed population genetic or phylogenetic results for each project are beyond the scope of this article.

Sequencing and data analyses
We sequenced libraries on multiple Illumina platforms in multiple core labs. We used the Illumina NextSeq 500 platform to generate PE75 data for the Rhodnius, Gambusia, Kinosternidae, and Wisteria projects and Illumina HiSeq 2500 or NextSeq 500 platforms to generate PE150 data for the Ixodidae, Sphyrna, and Eurycea projects. We planned all sequencing runs to produce approximately one million reads per sample, which facilitates comparison of the 3RAD results among species with varying genome sizes, with the exception of the Ixodidae project, where four million reads were targeted per sample.
Finally, we estimated the prevalence and impact of loci with third restriction enzyme cut-sites in our data. We estimated the proportion of these third restriction enzyme cut-sites relative to the first restriction enzyme cut-site (i.e., intended cut-site) for five of the projects, and we evaluated variation among adapters and projects using ANOVA in R v3.5.1 (R Core Team, 2018). To evaluate the effect of these loci in downstream analyses, we reanalyzed data from three of our projects (i.e., both Sphyrna and Amblyomma americanum) after removing third restriction enzyme loci from the datasets. To do this, we reassembled data in Stacks v1.44 Catchen et al., 2013) using process_radtags two independent times: first, ''rescuing barcodes'', cleaning, and trimming the raw sequence data as before, but disabling rad check (-disable_rad_check) to leave the cut-sites intact; and second, using the previous step's output as input, checking only for exact, intended restriction enzyme cut-sites (i.e., XbaI and EcoRI). From this output, we assembled and analyzed data similar to above, as detailed in File S6.

RESULTS
We developed four sets of adapters, each with eight versions of Read 1 adapters and 12 versions of Read 2 adapters (File S3). We modified the iTru_R2_5 index sequence for BamHI because this index creates a BamHI recognition site in the adapter; otherwise, all adapters use a universal set of index sequences. The cost of synthesizing oligonucleotides for the adapters varies with synthesis scale, but it starts as low as ∼$350 (US) per design set, with a recommended scale (100 nmol) costing ∼$500 (US) per design set when synthesized into 96-well plates, which are sufficient for up to ∼4,800 sample libraries.
3RAD libraries can be constructed routinely with approximately 12 h of hands-on time over the course of 2-3 days, with some variation depending mostly upon the step at which samples are pooled. The initial cost of restriction digestion and ligation is ∼$0.85 per sample (File S3, ''Library_prep_costs'' Sheet). If PCRs are conducted individually, this adds >$1 per sample, but if the ligations are pooled before PCR, then this cost averages to $0.06 per sample. Size-selection using the Pippin Prep adds $0.12 per sample, assuming user access to the equipment. A total of $0.25 per sample is required for tips, plates and tubes. Thus, when samples are pooled prior to PCR, the total cost for library preparation is about $1.35 per sample.
In the efficiency test, we demonstrate that 3RAD libraries have fewer adapter-dimers and a higher concentration of library constructs than 2RAD or ddRAD libraries, which is In the Ixodidae example project, we obtained between 1.8-7.3 (mean = 4.2) million reads per sample, and for all other projects, we obtained between 0.6-3.6 (mean = 1.3) million reads per sample. With the exception of one sample from the Ixodidae project, we always recovered a high percentage of retained reads (78.9-99.7%) after cleaning and filtering steps (Table S1). Average coverage per locus varied from 6× for Eurycea to 70× for Gambusia (Table 3; Table S1; Fig. 6).
Our initial assemblies contained between 18,629 loci (Gambusia) and 425,729 loci (Eurycea). After filtering to retain only polymorphic loci found in at least 75% of individuals within each population, we recovered between 30 loci (Eurycea) and 19,843 loci (Ixodidae) containing between 360 and 69,518 SNPs, respectively (Table 3). As expected for RADseq protocols, the number of loci we obtained in the initial steps was proportional to the genome size, with more loci recovered in organisms with larger genomes. Due to the manner in which we filtered these loci, the final number of loci recovered is dependent upon the intrinsic genetic variability of the organism, the scale of sampling, and sequencing coverage (Fig. S3). Detailed results for each project can be found in File S6. Third restriction enzyme loci were present in all datasets, comprising an average of 20.5% (sd = 14.1%) of all reads. The percentage of reads from third restriction enzyme loci varied significantly both among datasets (p < 0.01) and among adapters with different indexes (p < 0.01; Fig. S4). Removing third restriction enzyme loci from our raw reads increased mean coverage of remaining loci (Tables S2, S3 and S4) and reduced the size of our final datasets from 7,183 to 6,738 loci in Sphyrna tiburo, from 5,263 to 4,807 loci in Sphyrna lewini, and from 69,518 loci to 39,605 in Amblyomma americanum. Estimates of F ST were qualitatively similar in analyses including and excluding third restriction enzyme loci (Table S5).

DISCUSSION
We present an efficient, flexible, and low-cost system for preparing dual-digest RADseq libraries. Our method uses the iTru primers from Adapterama I (Glenn et al., 2019) and modifies the adapter stub for RADseq libraries by adding the appropriate overhang bases, as well as variable-length internal indexes which facilitates a single Illumina lane to be shared by many quadruple-indexed libraries. To illustrate the utility of our method, we present summary statistics from analyses of eight small example projects, representing diversity in taxonomy and scientific objectives. For each project, we obtained thousands of loci containing SNPs for downstream population genetic and phylogenetic analyses. Among the example projects, we highlighted the role of genome size, genetic variation, and sequencing coverage in determining the quality and quantity of data recovered. For example, when processed differently, we recovered large numbers of homologous loci both among species in the family Kinosternidae and within a single representative species, Sternotherus depressus, from the same libraries and sequencing reads. These data are informative both for studying variation among populations of S. depressus and across relatively deep evolutionary time (∼55 Myr; File S6; Scott, Glenn & Rissler, 2017). We note that when datasets span deeper evolutionary time (e.g., Eurycea and Kinosternidae), we recover more loci in initial steps, but fewer of these loci are shared among individuals. However, many studies support the utility of these large and sparse data matrices for phylogenetic studies (e.g., Streicher, Schulte & Wiens, 2015;Hosner et al., 2016). For detailed discussion of the analyses for each dataset, see File S6.
The improved performance of 3RAD libraries when low concentration DNA is used as input is a consequence of maximizing the efficiency of the ligation of the adapters to library fragments by the use of the third-enzyme. However, it is important to highlight that: (1) our experiment qualitatively (not quantitatively) evaluated the library performance and (2) our experiment was simplified by using vector DNA that has the restriction sites. We believe the better performance on high complex and diverse samples will be maintained because our mechanism suppresses adapter-dimer formation, but this scenario was not tested here. Similarly, Graham et al. (2015) concluded that our 3RAD protocol performs well even with moderately degraded DNA samples.
By using the same set of restriction enzymes on DNA from a diverse set of organisms, we have demonstrated that our 3RAD method recovers suitable numbers of loci and SNPs from organisms with varying genome characteristics. As expected, when we use the same set of restriction enzymes (and thus similar expected frequency of cut-sites) and the same sizeselection criteria for organisms with varying genome size, the average sequencing coverage per locus decreases as genome size increases (Fig. S3). For example, our dataset generated here from Eurycea (C-value = ∼25; Table 3) produced few loci meeting our coverage criteria, but higher sequencing coverage can remedy this. Alternatively, the number of loci in libraries can be tuned by changing restriction enzymes or size-selection criteria (Peterson et al., 2012). Additionally, using broad size-distributions in the size-selection step retains more loci and allows for greater size tolerance among alleles, whereas narrow distributions yield fewer loci. We suspect that use of narrow size-distributions excludes alleles at loci with significant size variation and may lead to increased levels of incorrect genotype calls due to the missing alleles outside of the selected size-range, but because most researchers are targeting loci without size variation, this bias should be small. A more significant issue related to narrow size-distributions is that a significant proportion of loci will be near the size cut-off and coverage decreases for these loci because some molecules of the targeted size will not be among the fragments retained (e.g., simply due to variance in migration through a gel).
A key advance of the 3RAD workflow (Files S1-S2) is the combination of enzymes and adapters used during digestion, ligation, and PCR steps to create the desired construct while minimizing the presence of dimers, chimeras, and improperly formed library molecules (those lacking restriction sites at both ends). Sequential and simultaneous digestion and ligation without buffer exchange increases the efficiency of the lab workflow and decreases the amount of input DNA required. The 3RAD adapters we designed function with multiple restriction enzymes that leave compatible, cohesive ends (Table 2). Thus, there are at least 72 different restriction enzyme combinations possible with the current adapter sets. Although this flexibility is desirable, having adapters compatible with multiple restriction enzymes means that it was necessary to name the adapters based on the third RE, which can be confusing because the third restriction enzyme is not the desired restriction site in the sample libraries (and their resulting reads). The design spreadsheet (File S3, Sheets: Design_1-4) can be easily modified to accommodate other restriction enzymes to create additional designs. These sheets also incorporate cost calculations so that researchers may easily change the costs to reflect updated pricing from their supplier(s).
It should be noted that certain restriction enzyme combinations work well with some species, but not with others, primarily due to the presence of restriction sites within repetitive elements. Thus, our standard strategy is to empirically determine what restriction enzyme combination is best for any particular organism in a few representative samples. We start by looking at the distribution of post-PCR library DNA run through an agarose gel and exclude restriction enzyme combinations that don't produce even smears or that have dense bands in the desired size range. Sometimes, we then size-select and sequence libraries from one or two restriction enzyme combinations from this small batch of samples to determine which combination produces the most variable loci.
Our results show that XbaI, EcoRI-HF and NheI provide a suitable combination of restriction enzymes for a wide range of organisms, including plants, vertebrates, and invertebrates. We have used all of the adapter designs and many other restriction enzyme combinations from the enzyme list (Table 1) to survey SNPs in a variety of organisms. While not all combinations work well in all organisms, most of the organisms we have studied to date work well with restriction enzymes from Design 1 or Design 2 adapters. Although viable, we only rarely use Designs 3 and 4 (R1.C, R1.D, R2.3, and R2.4).
In our standard protocols (Files S1 and S5), we digest DNA with two different restriction enzymes (2RAD) or three different restriction enzymes (3RAD) to create sticky ends for adapter ligation (similar to ddRAD and 2-enzyme GBS). In 3RAD, the third restriction enzyme digests a recognition site formed by self-ligation of the phosphorylated adapters (Fig. 2). Although the third restriction enzyme facilitates creation of libraries with very little input DNA (≤ 0.1 ng; File S5), it does come with a cost. The third restriction enzyme also cuts genomic DNA that can be ligated to the R1 adapter, but the adapter:DNA ligation product is susceptible to re-cleavage by the restriction enzyme. To encourage this, our digestion/ligation cycling ends with a digestion step to cleave as many of these products as possible. Still, an average of 20% of loci in our assemblies had these third restriction enzyme cut-sites, and the nonrandom relationship between R1 adapter index and the prevalence of these loci suggests that the index has an effect upon the efficiency of the third restriction enzyme in cleaving the re-created recognition site. These loci are, in principle, suitable for downstream analyses, but because the protocol is designed to minimize their retention, they should have lower coverage than those with the intended restriction enzyme cut-site. A high prevalence of these off-target loci can require additional sequencing (and thus, increase costs), but these reads can easily be filtered and removed for all downstream analyses if desired. The third restriction enzyme is not required for this procedure and further investigation into to trade-offs of including this restriction enzyme (i.e., 2RAD vs. 3RAD) and its optimal concentration are warranted. Alternatively, it is possible to engineer adapters that can ligate to genomic DNA, but not self-ligate (e.g., using 3 dideoxycytidine), which we have done. Unfortunately, the adapters are significantly more expensive and were not stable when stored for ≥ ∼6 months, both of which make the method impractical. Further research into other modifications or storage solutions for these adapters is warranted.
As detailed in Hoffberg et al. (2016), our 3RAD approach is easy to modify to include tags that allow the removal of PCR duplicates (File S7) through a process comparable to quaddRAD (Franchini et al., 2017) and to Schweyen, Rozenberg & Leese (2014). Our molecular ID tag protocol, which uses an iTru5-8N primer, can be used to make libraries for different types of downstream processes and/or modified for other types of libraries. For example, Hoffberg et al. (2016) used the iTru5-8N primer and sequence capture to focus sequencing on informative loci, reduce missing data, and remove PCR duplicates.
Due to the release of cutting-edge technologies with higher throughput, such as the Illumina NovaSeq 6000, which can generate > 2 billion paired-end reads in a single run, the ability to multiplex samples becomes paramount to reduce sequencing costs per sample. Our system, introduced in Adapterama I (Glenn et al., 2019), and from which 2RAD/3RAD extends, includes two sets of 384 indexed PCR primers, allowing for the pooling of up to 147,456 dual-indexed samples. Furthermore, the adapters designed in Adapterama III can create quadruple-indexed libraries, enabling the potential multiplexing of up to ∼14 million libraries.
Our 2RAD/3RAD methods are similar to other dual-digest RADseq methods (e.g., ddRAD and 2-enzyme GBS), and most of the advantages of our general approach have been described previously (Andrews et al., 2016;Harvey et al., 2016;Heyduk et al., 2016). Our 2RAD/3RAD method achieves the seven methodological design objectives. We have demonstrated that the third restriction enzyme increases ligation efficiency by reducing adapter-dimers; thus, much less input DNA is necessary. In addition, multiple restriction enzymes are compatible with many of the adapters (Table 1), all indexes used herein conform a minimum edit distance of 3 (Faircloth & Glenn, 2012), we use limited PCR-cycles of pooled ligations to reduce PCR bias, and our method facilitates the easy incorporation of molecular ID tags to detect PCR duplicates in downstream analyses (Hoffberg et al., 2016).

CONCLUSIONS
In summary, 2RAD/3RAD protocols function similar to other dual-digest RADseq methods but are easier to perform, have lower startup costs, have increased utility with lowconcentration input DNA, and produce libraries that can be highly-multiplexed and pooled with other Illumina libraries.