Orthogonal NGS for High Throughput Clinical Diagnostics

Next generation sequencing is a transformative technology for discovering and diagnosing genetic disorders. However, high-throughput sequencing remains error-prone, necessitating variant confirmation in order to meet the exacting demands of clinical diagnostic sequencing. To address this, we devised an orthogonal, dual platform approach employing complementary target capture and sequencing chemistries to improve speed and accuracy of variant calls at a genomic scale. We combined DNA selection by bait-based hybridization followed by Illumina NextSeq reversible terminator sequencing with DNA selection by amplification followed by Ion Proton semiconductor sequencing. This approach yields genomic scale orthogonal confirmation of ~95% of exome variants. Overall variant sensitivity improves as each method covers thousands of coding exons missed by the other. We conclude that orthogonal NGS improves variant calling sensitivity when two platforms are used, provides better specificity for variants identified on both platforms, and greatly reduces the time and expense of Sanger follow-up, thus enabling physicians to act on genomic results more quickly.

High throughput sequencing has transformed the landscape of clinical genetics, enhancing our ability to decipher and treat rare human diseases with underlying genetic causes. There are many examples of patients ending diagnostic odysseys and benefitting from a broad, unbiased examination of their genome [1][2][3][4][5][6][7]. While some advocate whole genome sequencing, the costs of generating, analyzing, interpreting, and confirming the accuracy of such data make it currently impractical for routine use. The great bulk of clinical sequencing is currently performed as whole exome sequencing (WES), representing the best compromise among cost, completeness, accuracy, and timeliness. WES focuses efforts on the most clinically-relevant and interpretable regions of the genome, providing excellent diagnostic value at a reasonable cost to the patient and health care system. Each next generation sequencing (NGS) platform has its own strengths and weaknesses. Raw base-calling error rates are typically 0.5-1% but can range much higher [8,9]. Because of these high error rates, the American College of Medical Genetics (ACMG) practice guidelines [10] recommend that orthogonal or companion technologies be used to ensure that variant calls are independently confirmed and thus accurate. This has generally been carried out using Sanger sequencing, but this is a relatively manual process that does not scale well for genome-wide studies.
In addition to the base-calling errors, WES performance is also impacted by the effectiveness of the method used to target DNA. Many comparisons of exome offerings have been carried out [11][12][13][14][15][16][17][18][19] but the rapid turnover of the underlying methods makes such studies outdated very quickly. Furthermore, published studies have tended to focus on comparisons between platforms rather than how multiple methods might be used productively in tandem.
In this report, we describe a strategy for rapidly generating high quality exome variant calls by leveraging orthogonal and independent NGS technologies for both selection and sequencing of DNA. Many thousands of variants can be simultaneously called and confirmed at a genomic scale. We show that these methods provide high quality and complete exome data compatible with the needs of clinical diagnostics, enhancing the ability of patients to get answers in a timely manner.

Methods
Purified NA12878 DNA for sequencing was obtained from both the National Institute of Standards and Technology (NIST, Gaithersburg, MD) and Coriell Institute for Medical Research (Camden, NJ). Clinical sequencing was performed with DNA isolated using an Autogen FlexStar for blood volumes greater than 2 ml and a QiaCube for lower blood volumes and for saliva. For Illumina-based sequencing, DNA was targeted using the Agilent Clinical Research Exome kit for hybridization capture and then made into libraries using the QXT library preparation kit according to the manufacturer's recommendations. These were then sequenced on either a MiSeq or NextSeq (with version 2 reagents) as recommended by Illumina. Both MiSeq and NextSeq data underwent alignment, cleaning, and variant calling according to GATK best practices [20]. Tools and versions used for analysis include BWA-mem (0.7.10-r789), sambamba (v0.4.7), CalculateHSMetrics.jar (1.84(1332)), picard.jar (Version: 1.124(69ecf101f612fdc0f3d555aa2d3cc0b1ea193c68_1415030499)), IGVTools_2.3.36, bcl2fastq (2.16), bedtools2-2.19.1, and samtools-1.2. For many analyses, final Illumina variant calls were also subjected to minimum depth and quality thresholds of DP > 8 and GQ > 20, chosen to minimize loss of true variants while filtering out as many false positives as possible [21]. Variants filtered out by DP or GQ are retained but classified as NoPass calls for further evaluation.
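The depth and quality filter described above can be sketched as follows. This is a minimal illustration and not the pipeline used in this study: the Call record, its field names, and the example calls are hypothetical. The comparison is written as strictly greater than, following the thresholds as stated in the text; whether the boundary value itself passes is treated here as an assumption.

```python
from dataclasses import dataclass

MIN_DP = 8   # a call needs DP > 8 (boundary handling is an assumption)
MIN_GQ = 20  # a call needs GQ > 20


@dataclass(frozen=True)
class Call:
    """Hypothetical minimal record for one Illumina variant call."""
    chrom: str
    pos: int
    ref: str
    alt: str
    dp: int  # read depth at the site
    gq: int  # genotype quality


def filter_status(call: Call) -> str:
    """Label a call 'Pass' or 'NoPass'; NoPass calls are kept, not discarded."""
    if call.dp > MIN_DP and call.gq > MIN_GQ:
        return "Pass"
    return "NoPass"


# Invented example calls, for illustration only.
calls = [
    Call("chr1", 1000, "A", "G", dp=35, gq=99),
    Call("chr1", 2000, "C", "T", dp=8, gq=60),    # fails on depth
    Call("chr2", 3000, "G", "GA", dp=20, gq=15),  # fails on genotype quality
]
labelled = [(c, filter_status(c)) for c in calls]
```

Every call is retained with its label so that NoPass calls remain available for the downstream cross-platform comparison.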
DNA sequenced on the Ion Torrent Proton was targeted using the Life Technologies AmpliSeq Exome kit as directed by the manufacturer with libraries prepared on the OneTouch system. Libraries were then sequenced on the Ion Proton™ system with HiQ polymerase. Read alignment, cleaning, and variant calling were performed using Torrent Suite v4.4 followed by application of additional custom filters, generated from over 6000 Proton exomes sequenced by Claritas Genomics, to remove strand-specific errors and recurrent false positives (filters to be made available through Life Technologies in future software updates).
Variant calls from Illumina and Ion Torrent were combined using a custom algorithm (Combinator) developed by Claritas for integrating multi-platform VCF files. Briefly, variants are compared across platforms and grouped into classes based on a set of attributes including whether the variant is a SNP or indel, whether the variant call and zygosity match between both platforms, and whether the variant site is well-covered in each platform. To assess the accuracy of each variant class, the algorithm was applied to NA12878 orthogonal sequencing data and compared to the NIST Genome In A Bottle NA12878 truth set (v2.17). Some analyses were also carried out using truth set v2.19. A positive predictive value (PPV) was calculated for each variant class and used for subsequent applications of orthogonal sequencing to guide variant interpretation.
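The per-class PPV calculation can be illustrated with a short sketch. This is not the Combinator algorithm itself, whose implementation is not described here; the function name, the variant key layout, the class labels, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def ppv_by_class(classified_calls, truth):
    """Compute PPV = TP / (TP + FP) for each variant class.

    classified_calls: iterable of (variant_key, class_name) pairs.
    truth: set of variant keys present in the truth set.
    """
    counts = defaultdict(lambda: [0, 0])  # class_name -> [TP, FP]
    for key, class_name in classified_calls:
        counts[class_name][0 if key in truth else 1] += 1
    return {name: tp / (tp + fp) for name, (tp, fp) in counts.items()}


# Hypothetical keys: (chrom, pos, ref, alt, zygosity).
truth_set = {
    ("chr1", 100, "A", "G", "het"),
    ("chr1", 200, "C", "T", "hom"),
    ("chr2", 300, "G", "GA", "het"),
}
toy_calls = [
    (("chr1", 100, "A", "G", "het"), "concordant"),
    (("chr1", 200, "C", "T", "hom"), "concordant"),
    (("chr2", 300, "G", "GA", "het"), "single-platform"),
    (("chr3", 400, "T", "C", "het"), "single-platform"),  # not in truth set
]
class_ppv = ppv_by_class(toy_calls, truth_set)
```

In this toy input the concordant class has a PPV of 1.0 and the single-platform class a PPV of 0.5; real values depend entirely on the data and on how classes are defined.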

Results
To assess the performance of different exome sequencing strategies, we used the reference sample NA12878 from HapMap in conjunction with the gold standard reference call set maintained by NIST [22]. Sequencing libraries were generated in parallel using both the oligo-based Agilent SureSelect Clinical Research Exome (CRE) and the amplification-based AmpliSeq Exome Kit. Three independent libraries were made using each method. These libraries were sequenced on the NextSeq (CRE), MiSeq (CRE), and Proton (AmpliSeq) platforms to an average coverage of 125×, 46×, and 133×, respectively. Variants were called and then compared to the NIST reference for completeness and accuracy.
We chose as our analytic region all RefSeq coding exons (CDS) as well as 10 bp on both sides of each exon in order to capture all clinically-relevant regions including splicing mutations (~37.6 Mb). First, we analyzed how well each sequencing approach covered our analytic target. For comparison purposes, NextSeq and Proton data were numerically normalized to a mean depth of 100×. In Fig. 1, mean coverage on both platforms for each exon is plotted for all 187,475 exons in our analytic region as a function of sequencing method. Exons with no coverage were adjusted up to 1× in order to allow log-log plotting. The graph was split into four quadrants based on mean 20× coverage. The great majority of exons (> 90%) were covered by at least 20 reads by both platforms. 4327 (2.3%) exons failed to achieve at least 20× coverage on both platforms with 2253 (1.2%) of these having less than 10× coverage on both platforms. More than 8% of exons were well covered (> 20× mean coverage) by one platform but not the other (4.7% or 8892 with > 20× coverage on NextSeq only and 3.7% or 6973 with > 20× coverage on Proton only). Many of the exons found on only the NextSeq or only the Proton are difficult to sequence and thus NIST has not included them in their reference. Because they are not in NIST, they do not impact the apparent sensitivity listed in Table 2. Thus, use of two orthogonal platforms improves the overall sensitivity by ~3-4% relative to the use of one platform alone. This estimate is based on the number of exons where variants can be detected on only one platform.
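The normalization and quadrant assignment described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names and per-exon depths are invented, and the threshold is applied inclusively (at least 20×), which is one reading of the text.

```python
def normalize_to_mean(depths, target_mean=100.0):
    """Scale per-exon mean depths so that their average equals target_mean."""
    observed_mean = sum(depths) / len(depths)
    return [d * target_mean / observed_mean for d in depths]


def quadrant(nextseq_depth, proton_depth, threshold=20.0):
    """Assign an exon to one of four coverage quadrants."""
    nextseq_ok = nextseq_depth >= threshold
    proton_ok = proton_depth >= threshold
    if nextseq_ok and proton_ok:
        return "both"
    if nextseq_ok:
        return "NextSeq only"
    if proton_ok:
        return "Proton only"
    return "neither"


def floor_for_log_plot(depth, floor=1.0):
    """Raise zero or sub-1x depths to 1x so they can be shown on log axes."""
    return max(depth, floor)


# Hypothetical per-exon mean depths before normalization.
nextseq_raw = [150.0, 90.0, 0.0, 10.0]
proton_raw = [200.0, 5.0, 120.0, 0.0]

nextseq_norm = normalize_to_mean(nextseq_raw)
proton_norm = normalize_to_mean(proton_raw)
quadrants = [quadrant(n, p) for n, p in zip(nextseq_norm, proton_norm)]
plot_points = [(floor_for_log_plot(n), floor_for_log_plot(p))
               for n, p in zip(nextseq_norm, proton_norm)]
```

Quadrants are assigned on the normalized depths, while the 1× floor is applied only to the plotted values so that it does not alter the classification.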
To better understand which exons are poorly covered, the impact of GC-content on coverage was examined (Supplementary Figure 1). After normalization of coverage on both platforms to 100×, the number of exons that did not achieve 20× coverage in each platform is shown as a function of GC content. Neither platform performs as well at the extremes of GC-content though the Proton tends to be better with AT-rich exons and the NextSeq with GC-rich exons. Both platforms have better coverage with 40-70% GC-content.
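A tally of poorly covered exons by GC-content could be computed along the following lines. This is a sketch only: the exon sequences, depths, 10% bin width, and function names are assumptions and do not reproduce the analysis behind Supplementary Figure 1.

```python
from collections import Counter


def gc_fraction(sequence):
    """Fraction of G and C among the unambiguous bases of a sequence."""
    sequence = sequence.upper()
    gc = sequence.count("G") + sequence.count("C")
    acgt = gc + sequence.count("A") + sequence.count("T")
    return gc / acgt if acgt else 0.0


def gc_bin(fraction, n_bins=10):
    """Map a GC fraction in [0, 1] to a bin index (0 = 0-10%, 9 = 90-100%)."""
    return min(int(fraction * n_bins), n_bins - 1)


def low_coverage_by_gc(exons, threshold=20.0):
    """Count exons below the depth threshold in each GC bin.

    exons: iterable of (sequence, normalized_mean_depth) pairs.
    """
    tally = Counter()
    for sequence, depth in exons:
        if depth < threshold:
            tally[gc_bin(gc_fraction(sequence))] += 1
    return tally


# Invented exons: (sequence, normalized mean depth on one platform).
toy_exons = [
    ("ATATATATAT", 5.0),    # 0% GC, poorly covered
    ("ACGTACGTAC", 110.0),  # 50% GC, well covered
    ("GGCCGGCCGG", 12.0),   # 100% GC, poorly covered
]
low_by_bin = low_coverage_by_gc(toy_exons)
```

Running the same tally separately on each platform's normalized depths would give the two per-platform distributions to compare.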
Next, we analyzed how each individual platform performed with respect to calling accuracy. Within the complete analytic region as well as some representative smaller, clinically-relevant gene subsets [23][24][25], the platforms yielded similar numbers of variant calls. To assess their accuracy, we compared each call (both the variation and its zygosity) to the NIST 2.17 and 2.19 truth sets. The sensitivity (ability to detect true positive variants), specificity (number of false positive variants per Mb), and Positive Predictive Value (PPV) for each platform are listed in Table 2. For both Single Nucleotide Variants (SNVs) and Insertion-Deletions (InDels), NextSeq achieved the highest sensitivity (99.6% of SNVs and 95.0% of InDels) followed by MiSeq (99.0% and 92.8%) and Proton (96.9% and 51.0%). PPV is nearly identical for SNVs among all the platforms. InDel PPV is highest with NextSeq (96.9%) and lowest with Proton (92.2%). After accounting for coverage differences, the NextSeq and MiSeq are nearly equivalent in performance. The apparent sensitivity with the combined platforms is as high as 99.88% for SNVs in the NIST 2.19 consensus regions. The sensitivity across non-NIST regions will likely be less due to lower coverage in many of those regions. Based on the number of exons with low coverage, the true sensitivity may be less than 98%.
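The three metrics can be made concrete with a small sketch. The variant keys, region size, and call sets below are invented for illustration, and including zygosity in the key is one simple way to require that both the variant and its zygosity match the truth set.

```python
def accuracy_metrics(called, truth, region_mb):
    """Sensitivity, PPV, and false positives per Mb for one call set.

    called, truth: sets of (chrom, pos, ref, alt, zygosity) keys.
    region_mb: size of the evaluated region in megabases.
    """
    tp = len(called & truth)   # calls matching the truth set exactly
    fp = len(called - truth)   # calls absent from the truth set
    fn = len(truth - called)   # truth variants that were not called
    return {
        "sensitivity": tp / (tp + fn),
        "ppv": tp / (tp + fp),
        "fp_per_mb": fp / region_mb,
    }


# Hypothetical truth set and per-platform calls.
truth = {
    ("chr1", 100, "A", "G", "het"),
    ("chr1", 200, "C", "T", "hom"),
    ("chr2", 300, "G", "GA", "het"),
    ("chr2", 400, "T", "C", "het"),
}
platform_a = {
    ("chr1", 100, "A", "G", "het"),
    ("chr1", 200, "C", "T", "hom"),
    ("chr2", 300, "G", "GA", "het"),
    ("chr3", 500, "A", "T", "het"),  # false positive
}
platform_b = {
    ("chr1", 100, "A", "G", "het"),
    ("chr2", 400, "T", "C", "het"),
    ("chr1", 200, "C", "T", "het"),  # right site, wrong zygosity
}

metrics_a = accuracy_metrics(platform_a, truth, region_mb=2.0)
metrics_b = accuracy_metrics(platform_b, truth, region_mb=2.0)
metrics_union = accuracy_metrics(platform_a | platform_b, truth, region_mb=2.0)
```

In this toy case the union of the two call sets recovers every truth variant even though neither platform does so alone, which mirrors the way combined-platform sensitivity can exceed single-platform sensitivity.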
Rare variants are the most relevant to clinical sequencing, and detecting them can be more challenging because analysis programs are not tuned to them. To determine the sensitivity for rare variants, the protein-coding NIST regions were also examined in the ExAC database [26]. NA12878 variants detected in ExAC with a population frequency of < 1% were used as the truth set. Sensitivity for detecting such variants was somewhat less than for all variants.
To determine the impact of combining two orthogonal sequencing platforms on accuracy, variant calls from the platforms were compared. Because the VCF formats from the two platforms differ and calling multi-nucleotide variants can generate multiple different but equivalent names for variants, algorithms were developed at Claritas to combine two distinct VCF files (Combinator). We found that the overall concordance between calls was extremely high. In variant calls from three independent replicates of NA12878, nearly 95% of variants are called identically on the two platforms (a total of 49,167 orthogonally concordant variant calls over three replicates in the NIST region of RefSeq ±10 bp).
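One common source of equivalent but differently named variants is padding of the REF and ALT alleles with shared flanking bases. The sketch below trims such shared bases so that two representations of the same change compare equal. It is an assumption about one step such a comparison might include, not a description of Combinator, and it does not perform reference-based left-alignment of indels or split a multi-nucleotide variant into separate SNVs.

```python
def trim_alleles(pos, ref, alt):
    """Remove bases shared by REF and ALT, keeping at least one base in each.

    Returns the adjusted (pos, ref, alt). Shared trailing bases are removed
    first, then shared leading bases (which shifts the position right).
    """
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# The same substitution written as a padded three-base change and as an SNV.
padded = trim_alleles(100, "CAG", "CTG")
plain = trim_alleles(101, "A", "T")

# A one-base deletion written with extra trailing context.
deletion = trim_alleles(100, "CTT", "CT")
```

Here both `padded` and `plain` reduce to `(101, "A", "T")`, and the deletion reduces to `(100, "CT", "C")`.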
When compared to the NIST truth set, nearly all variants matched (> 99.99%). Only four variants were discordant with NIST 2.17 (described in detail in Supplementary Discussion). All were subjected to Sanger sequencing in an attempt to disambiguate these results. Sanger sequencing confirmed that three of these four variants did not match the NIST 2.17 truth set. Inspection of a newer version (NIST 2.19) revealed that these three variants had been removed, indicating that others had also found issues at these positions. Eight additional apparent false positives were found in v2.19 but these were all confirmed to be artifacts by Sanger with 7/8 arising from v2.19 issues with properly deconvoluting multinucleotide variants. The remaining variant was found in both NIST versions and just barely passed the threshold for NextSeq coverage (8 reads, Supplementary Table 1). This single real FP among the total of 49,167 variants called yields a final Positive Predictive Value (PPV) of 99.998% for the orthogonally confirmed variants (Table 3). Raising the coverage depth threshold from 8 to 10 would eliminate the single FP and raise the PPV to 100%. However, this change would come at a sensitivity cost (approximately 0.3% fewer NextSeq TPs). The single FP is actually not a failure to detect a variant but rather an error in zygosity with the heterozygous position incorrectly called homozygous alternate.
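The PPV figure follows directly from the counts given above, as this short calculation shows. Only the variable names are introduced here; the counts are those reported in the text.

```python
total_concordant_calls = 49_167  # orthogonally concordant calls, three replicates
real_false_positives = 1         # the single remaining zygosity error
discordant_with_v217 = 4         # calls initially discordant with NIST 2.17

# PPV = TP / (TP + FP), with TP = total calls minus false positives.
ppv = (total_concordant_calls - real_false_positives) / total_concordant_calls

# Fraction of calls matching NIST 2.17 before any Sanger follow-up.
initial_match = (total_concordant_calls - discordant_with_v217) / total_concordant_calls

ppv_text = f"{ppv:.3%}"  # formats as 99.998%
```

The initial match fraction also exceeds 99.99%, consistent with the statement that nearly all variants matched the truth set.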
We analyzed the remaining variants that were not orthogonally concordant on the two platforms. Less than 5% of all variants are SNVs or InDels that are called only on the NextSeq or only on the Proton. Singleton NextSeq calls have a PPV of ~95% regardless of whether there is Proton coverage or not (Table 3). As a result, we classify them as Likely True Positives. Singleton Proton calls have a similar PPV (and classification) when there is no NextSeq coverage but a significantly worse PPV when there is NextSeq coverage with no call or a different call, so these situations are separated. Most of these variants are NextSeq-specific with the number of TPs and FPs arising from NextSeq/Proton shown in parentheses below the total. Fewer than 1% of SNVs and InDels are either a singleton NextSeq NoPass call or a singleton Proton call opposite a different NextSeq call. These have a PPV of ~20%. We classify this small set of variants as Likely False Positives. The number of TPs and FPs arising from NextSeq/Proton are also shown in parentheses below the total. Even though we did not observe a difference in the quality of the OC variants made with NextSeq Pass versus the 134 NoPass calls, their potential for lower quality calls led us to categorize them separately. We classify them as Reliable (PPV = 100%). Prior to reporting a Reliable variant, we think it wise to inspect or further confirm it.
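The classification rules described in this section can be summarized as a small decision function. This is a simplified reading of the text and not the Combinator implementation: the function signature and argument names are assumptions, and the case of a Proton-only call at a site with NextSeq coverage but no NextSeq call is returned under a placeholder label because the text separates it without naming its class.

```python
def classify_call(platform, concordant, nextseq_pass=True,
                  nextseq_covered=False, nextseq_differs=False):
    """Assign a confidence class to one variant call.

    platform: "NextSeq" or "Proton", the platform that made the call.
    concordant: True if the other platform made the same call and zygosity.
    nextseq_pass: True if the NextSeq call passed the DP/GQ filter.
    nextseq_covered: for Proton-only calls, whether NextSeq covers the site.
    nextseq_differs: for Proton-only calls, whether NextSeq made a different call.
    """
    if concordant:
        return "Orthogonally Confirmed" if nextseq_pass else "Reliable"
    if platform == "NextSeq":
        return "Likely TP" if nextseq_pass else "Likely FP"
    if platform == "Proton":
        if nextseq_differs:
            return "Likely FP"
        if not nextseq_covered:
            return "Likely TP"
        return "Unnamed (NextSeq covered, no call)"  # placeholder label
    raise ValueError(f"unknown platform: {platform}")


examples = [
    classify_call("NextSeq", concordant=True),
    classify_call("NextSeq", concordant=True, nextseq_pass=False),
    classify_call("NextSeq", concordant=False),
    classify_call("NextSeq", concordant=False, nextseq_pass=False),
    classify_call("Proton", concordant=False),
    classify_call("Proton", concordant=False, nextseq_covered=True,
                  nextseq_differs=True),
]
```

Keeping the rules in a single function makes it straightforward to recompute a PPV for each class whenever the thresholds or the truth set change.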
When multiple experiments were compared, we found that the variant classifications remain stable across replicate sequencing runs. As shown in Table 4, nearly 99% of Orthogonally Confirmed variants found in one blood sample were also OC in a replicate blood sample from the same donor. Variants identified initially as Likely TP were found to repeat as Likely TP, Reliable or OC at a rate of ~92%. In contrast, the less certain variants identified as Likely FP were categorized as Likely FP again less than one third of the time and not called at all in the second run > 50% of the time. Very similar results were observed across all categories with technical replicates of the same DNA. This high level of reproducibility lends confidence to these classifications.
Table 2. Sensitivity and specificity of sequencing platforms.
In order to provide maximum flexibility for patients providing biological fluids for DNA extraction, we have examined multiple sample collection methods. DNA collected from cell lines, from blood, and from saliva all performed identically in our hands (Table 4). While sensitivity and specificity could not be determined based on the lack of truth sets, the rate of orthogonal confirmation and the number of variants identified was indistinguishable between blood and saliva using clinically validated methods.

Discussion
Today, the genetic basis for more than half of the > 7000 known Mendelian disorders has been elucidated and the pace of genetic discoveries continues unabated [27]. The benefits to patients of broad genetic testing for previously undiagnosed diseases are clear [7]. However, the quality and cost of such testing, and reimbursement for it, have been problematic. WES provides cost-efficient identification of clinically-relevant variants that eliminates the high expense of testing only a few genes at a time and the resultant lengthy diagnostic odysseys [28,29]. We have demonstrated here that the use of orthogonal DNA selection and sequencing methodologies provides better sensitivity than standard WES and additionally allows immediate confirmation of ~95% of all variants. This improves turnaround time and eliminates the need and cost for most subsequent Sanger confirmation.
While Sanger sequencing is generally considered the gold standard for confirmation, it is subject to the same amplification and repeat-based artifacts that can afflict NGS technologies. For example, primer or other allele-specific effects can cause selective amplification of one allele. We found that G > A variants in very GC-rich amplicons caused differential amplification efficiency in both Sanger and NGS methods. Other sequences can cause other problems when unusual DNA structures are created [30]. The Sanger-based confirmation of three NA12878 variants could have led to two errors if the results were taken at face value (Supplementary Discussion). These difficulties highlight the issue that even the "gold standard" sequencing technology is error-prone and subject to artifacts.
In addition to the immediate confirmation of ~95% of variants and the high accuracy of OC variants, another key benefit of parallel exome sequencing is the increased sensitivity due to the overlapping regions that are covered by each platform. Because the NIST reference is biased for regions that are most easily sequenced, results as shown in Table 2 can be deceptive and overestimate the true sensitivity of both platforms. The singly covered regions allow a greater percentage of variants to be identified and subsequently confirmed by other methods. There are thousands of variants in the exome that are detected only by the NextSeq or the Proton. These variants require confirmation prior to clinical reporting but they would not have been reported at all if only the NextSeq or only the Proton had been used. By using orthogonal and complementary technologies, we are able to quickly confirm variants at a genome-wide scale and provide improved sensitivity for detecting potentially pathogenic variants.
No matter which method of sequencing is chosen, clinical quality data should be confirmed prior to reporting [10]. At the genome-wide level, this is a large burden that is sometimes addressed by prioritizing different variant calls based on pathogenicity and/or call quality. The more confident one wants to be in assuring call accuracy, the more variants require individual attention and likely Sanger sequencing. Use of orthogonal NGS eliminates virtually all needs for prioritization of variants for subsequent confirmation. While the initial cost of sequencing two exomes is higher, the ultimate savings in confirmatory sequencing as well as the improved sensitivity make up for that expense. Sanger sequencing costs vary significantly from lab to lab but, in our hands, a single Proton exome can be prepared and sequenced for about the same cost as 25-50 individual Sanger reactions that each require custom primers. Furthermore, choosing which variants to confirm via Sanger requires waiting for the initial analysis to be completed, so the inclusion of the orthogonal Proton sequencing run results in a faster turnaround time in addition to the potential for lower cost.
Speed of results is also frequently an issue. Ion Torrent systems detect pH changes electronically and so are typically faster than laser/CCD camera-based Illumina sequencing systems. The various Illumina instruments have different speeds and output yields and thus could potentially have different performance as well. At times, the shorter sequencing time and lower computational requirements of the MiSeq would be advantageous for returning results more quickly, and the slight reduction in sensitivity caused by its faster but lower output compared to the NextSeq would be acceptable, especially since much of it would be compensated by Proton coverage. In some cases, the need for extreme speed may override cost and accuracy considerations. We have not yet attempted to minimize handling times but, in our hands, DNA extraction, library preparation and DNA sequencing require about 44 hrs of clock time and just over 7 hrs of hands-on time for the Illumina NextSeq while the Proton requires only half that for both. Reagent costs can vary significantly based on volumes purchased but we find that reagents for extraction through sequencing cost about twice as much for the NextSeq as for the Proton.
Frequently, providers wish to test a defined gene list known to be associated with a patient's disease. When defining a gene list for a particular phenotype-driven investigation of clinically relevant variants, the existing literature is used for determining inclusion. With the continuing pace of novel discoveries, any defined gene list will not include new findings so any novel genes would lie outside the region of interest. Pathogenic variants in poorly characterized genes or novel genes would not be identified in such lists. Additionally, phenotypes or symptoms can change over time which could also affect the list of genes for which testing is desired. The orthogonal approach has the advantage of providing high sensitivity and immediate confirmation of nearly all variants on any gene list while retaining the ability to expand to the whole exome if no pathogenic variants are found. We have found this tiered approach critical with a number of clinical cases where the pathogenic variants were not identified in the initial analysis of the pre-specified gene list. However, convincing candidate variants were identified when the gene list was expanded and the remainder of the exome was examined (data not shown). The improved speed and data quality should translate into improved diagnostic rates for patients with concomitant benefits for them and their families.