Reversible Oligonucleotide Chain Blocking Enables Bead Capture and Amplification of T-Cell Receptor α and β Chain mRNAs

Next-generation sequencing (NGS) has proven to be an exceptionally powerful tool for studying genetic variation and differences in gene expression profiles between cell populations. However, these population-wide studies are limited by their inability to detect variation between individual cells within a population, inspiring the development of single-cell techniques such as Drop-seq, which add a unique barcode to the mRNA from each cell prior to sequencing. Current Drop-seq technology enables capture, amplification, and barcoding of the entire mRNA transcriptome of individual cells. NGS can then be used to sequence the 3′-end of each message to build a cell-specific transcriptional landscape. However, current technology does not allow high-throughput capture of information distant from the mRNA poly-A tail. Thus, gene profiling would have much greater utility if beads could be generated having multiple transcript-specific capture sequences. Here we report the use of a reversible chain blocking group to enable synthesis of DNA barcoded beads having capture sequences for the constant domains of the T-cell receptor α and β chain mRNAs. We demonstrate that these beads can be used to capture and pair TCRα and TCRβ sequences from total T-cell RNA, enabling reverse transcription and PCR amplification of these sequences. This is the first example of capture beads having more than one capture sequence, and we envision that this technology will be of high utility for applications such as pairing the antigen receptor chains that give rise to autoimmune diseases or measuring the ratios of mRNA splice variants in cancer stem cells.

The widespread availability of next-generation sequencing (NGS) instruments has enabled researchers around the world to examine organisms at unprecedented levels of detail at low cost, resulting in a rapid escalation of NGS-based studies. 1 Current technologies generally require large amounts of template material, which is produced from samples containing millions of cells. While data generated using this approach give a view of the population-wide characteristics of these cells, they provide no mechanism for linking information to individual cells. As a result, it is not possible to directly examine the distribution of variation across large populations of cells within a sample. This important limitation means that variation that is present at a low frequency within a population cannot be easily discriminated from sequencing noise. Furthermore, population-based methods are incapable of directly examining how the relative frequency of variation within a population of cells fluctuates over time. These deficiencies of NGS technology pose a particularly large challenge for research examining cancer stem cells, which occur at a low frequency in tumors, but are thought to be central to tumor survival and treatment escape. 2 Recently, high-throughput fluidics-based systems such as Drop-seq have been reported in which individual cells are incubated with an mRNA capture bead in nanoliter droplets, enabling downstream sequencing of mRNA on a single-cell level. 3 This method has allowed significant progress in circumventing the limitations of population-wide studies, but it remains incompatible with the preferred short read length sequencing platforms for many sequences where critical information is distant from the 3′-end of the message. Additionally, the current technology only enables capture of a single mRNA sequence (usually the poly-A tail of mRNA).
The ability to use two or more capture sequences would be of significant value, for example in measuring the ratio of differentially spliced messages in individual cancer stem cells, or obtaining the paired sequences of α and β antigen receptor chains.
The antigen receptors on individual lymphocytes control their specificity and are critical components in disease outcome. 4 These receptors are created by somatic gene rearrangements at two loci on different chromosomes to produce the antibody-like αβ T-cell receptor (TCR) or γδ TCR. The αβ TCR is of particular interest, as this receptor is believed to orchestrate the adaptive immune response. The critical determinant of antigen specificity is the rearrangement of variable (V), diversity (D), and joining (J) segments that make up the complementarity determining regions (CDRs), and each T-cell clone has a unique set of CDR sequences. 5 The antigen receptor V(D)J sequence for each chain lies adjacent to a locus-specific constant domain, enabling capture from the total cellular RNA using primers specific for this sequence. 6 Subsequent primer extension and PCR amplification can then allow complete chain sequence assembly from short NGS reads. Obtaining the paired α and β chain antigen receptor sequences for all of the lymphocytes in a clinical sample, combined with their abundance and activation status, would allow powerful systems level examination of human pathologies and disease, and there is much interest in this area. 7 Unfortunately, current single-cell analyses of this type are extremely laborious, constraining throughput to hundreds of cells, and therefore these studies only scratch the surface of this valuable trove of information.
The ability to obtain paired sequence information for antigen receptor chains within large clinical samples using methodologies such as Drop-seq would be a game-changing technology. However, this requires the construction of capture beads having two different primer sequences attached to bead-specific unique barcodes that allow chain reconstruction from short read-length NGS platforms. Here we overcome this challenge using a novel phosphoramidite monomer that reversibly blocks chain extension to enable divergent synthesis of two primer sequences on each barcoded bead. We demonstrate that two different capture sequences can be built onto a single bead, enabling specific capture and amplification of mRNAs encoding the α and β chains of the αβ TCR.
In principle, capture beads having two different primer sequences could be generated by chemical attachment of the primers to beads. However, a critical component of single-cell analysis techniques is the use of unique barcodes on each bead, located upstream from the primer sequence. This enables the >10⁸ reads acquired from NGS to be regrouped into populations arising from each individual cell. 3 The barcoded beads are generated using split-pool combinatorial synthesis, and thus the primer sequences must be synthesized directly onto the barcode sequence, such that the entire sequence is amplified and read in NGS. An additional consideration for the capture beads is that the DNA must be synthesized in the "reverse" 5′−3′ direction, as this provides the 3′-terminus that is needed for enzymatic primer extension after mRNA capture.
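The relationship between split-pool barcoding and read regrouping can be illustrated with a short Python sketch. It is a toy model under stated assumptions, not the protocol used in this work: the number of split-pool rounds (12), the bead count, the layout of a read as "barcode followed by captured sequence", and all function names are hypothetical choices made for illustration.

```python
import random
from collections import defaultdict

BASES = "ACGT"

def split_pool_barcodes(n_beads, n_rounds, rng):
    """Toy split-pool synthesis: in every round each bead lands in one of
    four pools at random and is extended by that pool's base."""
    barcodes = [""] * n_beads
    for _ in range(n_rounds):
        barcodes = [bc + rng.choice(BASES) for bc in barcodes]
    return barcodes

def group_reads_by_barcode(reads, barcode_len):
    """Regroup reads by their leading barcode. Each read is assumed
    (for illustration) to be barcode + captured sequence."""
    groups = defaultdict(list)
    for read in reads:
        groups[read[:barcode_len]].append(read[barcode_len:])
    return dict(groups)

# Assumed parameters: 1000 beads, 12 rounds (4**12 possible barcodes).
rng = random.Random(0)
barcodes = split_pool_barcodes(n_beads=1000, n_rounds=12, rng=rng)
print(len(set(barcodes)), "distinct barcodes among", len(barcodes), "beads")
print("barcode space:", 4 ** 12)

# Hypothetical reads with a 4-base barcode prefix.
reads = ["ACGTAAAA", "ACGTCCCC", "TTTTGGGG"]
print(group_reads_by_barcode(reads, barcode_len=4))
# {'ACGT': ['AAAA', 'CCCC'], 'TTTT': ['GGGG']}
```

Because the barcode is synthesized first and the primer is built directly onto it, every read derived from a bead carries that bead's barcode, which is what makes the grouping step possible.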
We envisioned the synthetic strategy shown in Figure 1, in which split-pool synthesis would be used to generate barcoded beads, and these would then be reacted with a mixture of 5′ phosphoramidite monomers having two different protecting groups at the 3′-position. This would enable selective deprotection of half of the oligonucleotides and synthesis of the first primer sequence, followed by a second deprotection and primer synthesis. The first monomer could have the typical dimethoxytrityl (DMT) protecting group, which is removed under mildly acidic conditions. The second monomer, however, requires a protecting group having the following characteristics: (1) stable to DMT deprotection conditions; (2) stable to conditions used for oligonucleotide synthesis; (3) deprotected under conditions compatible with DNA oligonucleotides. A search of the literature revealed no examples of 3′-protected phosphoramidite monomers that would meet these requirements. However, our interest was piqued by previous reports using a 2′-levulinyl protecting group for the synthesis of branched oligonucleotides. 8 This protecting group appeared to meet the requirements outlined above, as it had been shown to be stable to the conditions used for RNA synthesis and is deprotected using aqueous hydrazine, which is nondamaging to oligonucleotides and orthogonal to the acidic DMT deprotection conditions.
As shown in Figure 2a, we synthesized 3′-OLev thymine phosphoramidite 2 from commercially available 1 using standard protocols. We then carried out a synthesis of short polyT sequences to test the compatibility of 2 with our divergent oligonucleotide synthesis protocol. To mimic synthesis of the barcode region, we performed four thymine monomer coupling steps on all beads. We then added a 1:1 mixture of 3′-ODMT monomer 3 and 3′-OLev monomer 2. The DMT group was then removed, and the deprotected oligonucleotide chains coupled with a fluorescein phosphoramidite. Next, the Lev group was removed using aqueous hydrazine, and the resulting deprotected oligonucleotide chains were coupled with an additional thymine monomer. Both oligonucleotide chains were then cleaved from the beads and analyzed by HPLC. As shown in Figure 2b, the ratio between T5-FAM DNA (arising from addition of 3) and T6 DNA (arising from addition of 2) is very close to 1:1, demonstrating that the two different monomers are added to a growing oligonucleotide chain with nearly equal efficiency. Additionally, the high purity of the DNA products obtained indicates that the Lev protecting group is not removed during DMT removal or the subsequent coupling step and that deprotection of the Lev group does not damage the DNA molecules.
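The logic of this test synthesis can be followed step by step in a small Python simulation. This is an idealized sketch, not a model of the actual chemistry: it assumes perfect coupling and deprotection yields, assumes each chain picks the DMT- or Lev-protected monomer with equal probability, and uses hypothetical function names and a "terminal" label to mark chains ended by fluorescein.

```python
import random
from collections import Counter

def couple(chains, monomer, cap=None):
    """Add `monomer` to every chain with a free 3'-OH (cap is None).
    The extended chain then carries `cap`; None models a chain that is
    deblocked again before the next step."""
    return [(seq + (monomer,), cap) if old is None else (seq, old)
            for seq, old in chains]

def couple_mixture(chains, monomer, caps, rng):
    """Couple a mixture of one monomer bearing different 3'-protecting
    groups; each free chain receives one cap at random (idealized 1:1)."""
    return [(seq + (monomer,), rng.choice(caps)) if old is None else (seq, old)
            for seq, old in chains]

def deprotect(chains, cap):
    """Remove only the named protecting group (orthogonal deprotection)."""
    return [(seq, None if old == cap else old) for seq, old in chains]

def simulate(n_chains=10000, seed=0):
    rng = random.Random(seed)
    chains = [((), None)] * n_chains
    for _ in range(4):                                  # barcode mimic: four T couplings
        chains = couple(chains, "T")
    chains = couple_mixture(chains, "T", ["DMT", "Lev"], rng)
    chains = deprotect(chains, "DMT")                   # mild acid removes DMT only
    chains = couple(chains, "FAM", cap="terminal")      # fluorescein ends these chains
    chains = deprotect(chains, "Lev")                   # hydrazine removes Lev only
    chains = couple(chains, "T")                        # extends the former Lev chains
    return Counter("-".join(seq) for seq, _ in chains)

products = simulate()
for product, count in sorted(products.items()):
    print(product, count)
```

Under these assumptions only two products appear, corresponding to T5-FAM and T6, in roughly equal amounts. Any Lev loss during DMT removal would show up in such a model as chains carrying fluorescein on the wrong branch, which is the kind of impurity the HPLC analysis rules out.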
Having established that Lev can function as a DNA-compatible orthogonal protecting group, we next sought to generate dual-primer beads for capture and amplification of the mRNAs encoding the α and β chains of the T-cell antigen receptor. Using the same synthetic protocol as described above for the test beads, we synthesized beads capable of targeting both the T-cell receptor alpha chain (TCRα) and beta chain (TCRβ) mRNAs. Specifically, the capture sequences attached to the beads are complementary to the TCRα and TCRβ constant regions (TRAC and TRBC). To test the ability of the capture beads to pull down TCR mRNA, the beads were mixed with total RNA isolated from a known T cell clone (clone GDB4). Unbound RNA was removed by washing, and the bead-bound RNA was subjected to a reverse transcription reaction in which the DNA oligonucleotides attached to the beads served as primers for the synthesis of bead-bound cDNA.
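The capture and primer-extension logic can be sketched in Python as follows. The sequences below are short, made-up placeholders and are not the TRAC or TRBC capture sequences used in this work; the message is written as DNA letters for simplicity, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of a DNA string written 5'->3'."""
    return dna.translate(COMPLEMENT)[::-1]

def capture_site(message, probe):
    """Return the 0-based position in the message (5'->3') where the
    bead-bound probe can anneal, or -1 if there is no exact match."""
    return message.find(reverse_complement(probe))

def extend(message, probe):
    """Idealized reverse transcription: the probe's free 3'-terminus is
    extended along the message toward its 5' end, so the bead-bound cDNA
    is the probe followed by the reverse complement of the upstream region."""
    site = capture_site(message, probe)
    if site < 0:
        return None
    return probe + reverse_complement(message[:site])

# Hypothetical toy message: a "variable" region followed by a "constant" region.
variable = "ATGGCCTTA"
constant = "GACCTGAAC"
message = variable + constant

probe = reverse_complement(constant)   # capture sequence, written 5'->3'
print(capture_site(message, probe))    # 9
print(extend(message, probe) == reverse_complement(message))  # True
```

The sketch shows why a probe complementary to the constant region suffices: extension from its 3′-terminus proceeds into the adjacent variable region, so the resulting cDNA carries the clone-specific sequence.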

Journal of the American Chemical Society, Communication
PCR amplification of the cDNA was then performed using primers complementary to the known sequence of the TCR α and β chain transcripts in the GDB4 clone. As a control, all of the above steps were also carried out using "raw" beads, on which no DNA had been synthesized. Figure 3 shows an agarose gel image of the PCR product from each reaction. As anticipated, we observe amplification of the TCRα and TCRβ sequences when using the capture beads, but not with the raw beads. The bands corresponding to the PCR products were excised from the gel and subjected to Sanger sequencing. The sequencing results shown in Figure S1 are consistent with the known TCRα and TCRβ sequences in the GDB4 clone, confirming the ability of the capture beads to pull down TCR mRNA.
In conclusion, we have demonstrated that beads containing unique oligonucleotide barcodes followed by two independent oligonucleotide capture sequences can be produced. These beads can be used to capture two different specific target sequences, which has high utility in cases such as the T-cell antigen receptor, where receptor chain pairing is critical for determination of antigen specificity, but is highly variable at the single cell level. Importantly, each bead has approximately 5000 copies of each capture sequence, which provides redundancy in the sequencing process to discriminate real polymorphisms from transcriptional or sequencing errors (Figure 4). The key to achieving synthesis of the two different oligonucleotide sequences on a single bead is use of a 3′-Lev-protected phosphoramidite, which is stable to the conditions used for DMT deprotection and oligonucleotide synthesis, and can be subsequently removed without damaging the newly formed oligonucleotides. While the current study only utilized a single differential protection step, we envision that the divergent DMT/Lev protection step could be used iteratively to generate beads having larger numbers of unique DNA capture sequences. This synthesis scheme may also be used to produce capture reagents that can be used as biological probes when linked to sequences that include molecular dyes that can be specifically visualized or quenched.
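One simple way that redundant reads from the same bead could be used to separate real polymorphisms from sporadic errors is a per-position majority vote. The Python sketch below is an assumption for illustration only; no consensus procedure is specified in this work, and the function name, the 0.5 threshold, and the example reads are hypothetical.

```python
from collections import Counter

def consensus(reads, min_fraction=0.5):
    """Per-position majority vote over equal-length reads that share one
    bead barcode and chain. Positions without a base above `min_fraction`
    are reported as 'N'."""
    if not reads:
        return ""
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("reads must be aligned to the same length")
    called = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        called.append(base if count / len(reads) > min_fraction else "N")
    return "".join(called)

# Hypothetical reads: two carry isolated errors at different positions.
reads = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAC", "ACGTTC"]
print(consensus(reads))  # ACGTAC
```

An error introduced in a single copy is outvoted by the other copies, whereas a true polymorphism is present in every copy from that cell and therefore survives the vote.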
The immense diversity of the adaptive immune system relies on production of three antigen receptor types, on both B-cells and T-cells, that are synthesized from transcripts that have undergone germline rearrangements. Identifying these germline rearrangements is only possible through direct sequencing of the corresponding DNA or mRNA. As our understanding of the complexity of the adaptive immune response continues to expand, aberrant immune responses have been identified as the basis for several diseases. 9 These diseases can arise from a single clone, and thus the ability to obtain paired sequence information for antigen receptor α and β chains at the single-cell level could be transformative in identifying these disease-causing autoimmune clones. The research reported here provides the tools necessary to undertake these experiments and demonstrates the feasibility of this approach by capturing, amplifying, and sequencing the α and β chains for a single T-cell receptor clone. Future studies will be aimed at utilizing our capture beads for single-cell sequencing of large populations of T-cell antigen receptors.