Testing thousands of nanoparticles in vivo using DNA barcodes

Nanoparticles improvedrugefficacybydeliveringdrugs to sitesof disease. To effectively deliver a drug in vivo, a nanoparticle must overcome physical and physiological hurdles that are not present in cell culture, yet in vitro screens are used to predict nanoparticle delivery in vivo. An ideal nanoparticle discovery pipeline would enable scientists to study thousands of nanoparticles in vivo. Here,wediscuss technologies that enablehigh throughput in vivo screens, focusing on DNA barcoded nanoparticles. Addresses Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory School of Medicine, Atlanta, GA, USA Corresponding author: Dahlman, James E. (james.dahlman@bme. gatech.edu) Current Opinion in Biomedical Engineering 2018, 7:1–8 This review comes from a themed issue on Molecular and Cellular Engineering: Gene Therapy Edited by Charlie Gersbach https://doi.org/10.1016/j.cobme.2018.08.001 2468-4511/Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


RNA therapies are limited by inefficient delivery to target tissues
Advances in genomics have armed scientists with lists of genes that cause diseases. This has brought about a significant change in the way drugs are designed and discovered. Researchers were previously limited to targeting broad cellular phenotypes; for example, cisplatin intercalates into double stranded DNA, causing toxicity in any cell undergoing cell division. Although many of these drugs successfully treated disease, they also caused severe side effects driven by drug activity in 'off-target' cells. Researchers now often use small molecules that target specific mutations. However, only 15% of the protein coding genome e and a much smaller percentage of the non-coding genome e is 'druggable' using small molecules [1]. This has led scientists to develop technologies to target all genes.
RNA therapies have emerged as a promising solution to the problem of 'undruggable' targets [2]. RNAs can specifically turn any gene in the genome on or off, regardless of its eventual protein structure. Gene silencing can be mediated by RNA interference (RNAi), a well characterized mechanism where base pairing between small interfering RNA (siRNA) and target mRNA catalyzes RISC-mediated mRNA degradation. Gene silencing can also be mediated by DNA nucleases, including those derived from zinc fingers (ZFNs) [3], transcription factor-like effectors (TALENs) [4], or clustered regularly interspaced short palindromic repeats and their associated proteins (CRISPR-Cas) systems [5,6]. Similarly, mRNA-based therapies can transiently upregulate genes [7]. To date siRNA therapies have shown the most promise in patients. In one example, a degenerative disease that was uniformly fatal has been halted (and in some cases, reversed) in Phase III clinical trials [8,9]. More generally, over 1200 patients have been treated with siRNA, some for as long as 36 months. With a few exceptions, primarily limited to antisense nucleotides (which are distinct from siRNA) [10], siRNA therapies have been well tolerated and efficacious. Advances in siRNA biochemistry [11] and improved delivery vehicles [12] raise the exciting prospect that once yearly injections to treat genetic disease are within reach.
Although siRNA is currently the most clinically advanced RNA therapy, DNA nucleases and mRNA therapies are up and coming. They have shown efficacy in mice [13,14] and non-human primates [15], and are poised to treat disease in humans [16e18]. The results described above have one limitation: most clinical trials have involved RNAs locally injected into muscle (e.g., vaccines), eye, or lymph nodes, administered to cells ex vivo, or systemically delivered to hepatocytes. Systemically delivering therapeutic RNA outside the liver remains a substantial unsolved problem [19] that limits the development of gene therapies targeting other organs. RNA therapies will require delivery because naked RNAs are quickly degraded by nucleases, and their large molecular weight and highly negative phosphodiester backbone prevents them from crossing the anionic cell membrane [20].
Thousands of nanoparticles are screened in vitro; this may not predict in vivo delivery many drug delivery vehicles have been used to deliver RNA, we will focus on nanoparticles, which have generated promising clinical data [8]. Here we define a nanoparticle as a structure with all 3 dimensions less than 1000 nm. Scientists have made steady advances in nanoparticle design. Nanoparticles can be made with variable size [21], ionizability [22], hydrophilicity [23], shape [24], and with varying degrees of active targeting Figure 1 (A) Lipid nanoparticle libraries can be generated with a variety of chemical compounds. LNPs are formulated by combining different biomaterials with cholesterols, lipid-PEG compounds, and helper lipids. (B) High-throughput nanoparticle formulation allows for the rapid production of large, diverse libraries. (C) Nanoparticle libraries can be generated by varying the molar ratio of biomaterials, cholesterols, PEG, and helper lipids. (D) Between 1 × 10 8 and 2 × 10 11 chemically distinct nanoparticles can be made by combining these compounds. Notably, these numbers do not include variations in the molar ratio of the compounds (e.g., Figure 1C).
ligands [25]. Large, chemically diverse libraries have been synthesized using simple synthetic routes including, but not limited to, Michael addition [26], epoxides [27], peptide [28], and thiol chemistry [29]. Importantly, advances in nanoparticle formulation e defined here as the process of 'loading' the nucleic acid into the nanoparticle e have also been reported. High throughput microfluidics has been shown to reliably make small, consistent batches of nanoparticles that are stable for weeks [30,31]. It is still difficult to cover the entire nanoparticle chemical space; formulating nanoparticles using available chemistries, it is feasible to formulate between 100 million and 200 billion chemically distinct nanoparticles ( Figure 1).
After nanoparticles are formulated, they are screened in vitro. More specifically, scientists (a) synthesize thousands of different nanoparticles before (b) evaluating whether they deliver RNA in easily expandable cell lines (e.g., HeLa). Scientists have also used primary cells in place of immortalized cells [32]. Based on in vitro results, a very small number of nanoparticles are (c) tested in vivo. This process is only an efficient use of time and resources if in vitro nanoparticle delivery predicts in vivo delivery. To test this, we compared how the same 300 nanoparticles delivered nucleic acids in vitro and in vivo in multiple cell types; we found no correlation [33] (Figure 2A). The discrepancy between in vitro and in vivo delivery is not necessarily surprising. If a nanoparticle is systemically administered, it must travel through the blood stream and enter the target tissue. Several physical factors dictate where the nanoparticles go. For example, it is difficult for nanoparticles to access the brain, which is physically cordoned off by the blood brain barrier. Nanoparticles can also be physically disassembled by the glomerular membrane in the kidney [34]. As a counterexample, nanoparticles often target hepatocytes since nanoparticles can exit the blood via nanoscale pores in sinusoidal blood vessels and basement membrane [35]. Physiological factors also influence nanoparticle delivery. For example, serum proteins can bind to nanoparticles in the blood [36,37], changing nanoparticle interactions with the immune system and target cells [38]. Interestingly, systemic nanoparticle delivery can also change with local disease states; in one example, scientists found that delivery to systemic organs was affected by the presence of a primary tumor [39]. Even if the nanoparticle reaches its target cell, it must enter the cytoplasm, often by escaping an endosome. Endosomal escape is inefficient; a LNP that delivers siRNA to hepatocytes very efficiently (50% target gene silencing after a 0.01 mg/kg injection) still had >95% of its siRNA sequestered within endosomes [40]. The nanoparticle must also evade liver, kidney, and splenic clearance, as well as avoid initiating an immune response. The majority of these obstacles are not recapitulated in vitro. Some processes are required for in vitro delivery and in vivo delivery (e.g., endocytosis and endosomal escape). However, these processes are carefully regulated [41] by gene expression that is likely to change with microenvironmental cues. Increasing evidence suggests that gene expression in cultured cells may not reliably predict gene expression in primary cells or in vivo [42].
Nanoparticle barcoding enables simultaneous analysis of >100 nanoparticles in vivo The lines of evidence described above suggest screening nanoparticles directly in vivo would be useful. However, the expensive nature of in vivo experiments has limited the field to testing a few in vivo. This is a universal problem in nanomedicine, and it has driven groups to design systems that facilitate high throughput in vivo nanoparticle screens. In all cases, these systems utilize 'multiplexed' signals, which are signals that can be quantified without interfering with one another. In the simplest case, nanoparticle 1, with chemical structure 1, is 'barcoded' (i.e., 'tagged') with signal 1; nanoparticle N, with chemical structure N, is barcoded with signal N. The nanoparticles are co-administered, and later, signals 1 and N are quantified simultaneously. In one example, scientists isotopically-barcoded silver nanoparticles with different functional peptides to evaluate potential targeting ligands [43]. Another group has used quantum dots to barcode polymeric nanoparticles; in vivo screens were performed to understand how nanoparticle surface modifications altered delivery through the blood brain barrier [44].
One multiplexed signal that has been used by groups (including ours) is DNA [33, 45,46]. DNA is an excellent molecule for multiplexing. The number of different barcodes scales exponentially faster than any other signaling molecule; 4 N different barcodes can be generated with a DNA sequence N nucleotides long. An 8 nucleotide barcode can create 65,536 different sequences. Additionally, DNA sequence readouts are easy to analyze. The cost of DNA sequencing has decreased more rapidly than Moore's Law; it is now possible to generate 400 million DNA reads for $2000 using a machine that fits on a desktop. Because DNA sequencing has become an increasingly more common tool in several fields [47], easy to use software packages are readily available to help analyze and interpret the data.
Two distinct nanoparticle DNA barcoding systems have been reported to date [33, 45,46]. In one example, DNA was formulated into liposomes alongside different chemotherapies [45] (Figure 2B). By varying the primers or altering the barcode length, different sequences were detected using gel electrophoresis or real time PCR. The authors tested several drugs at once in vivo and concluded that the DNA barcodes found in dead cells corresponded to nanoparticles containing effective chemotherapies. This method may be used in the future to test hundreds of different cancer drugs in vivo. We developed a separate DNA barcoding system that utilizes next generation sequencing. Our barcodes contain (i) universal primer sites, (ii) a 7 nucleotide region with fully randomized sequences (to monitor for biased PCR amplification), and (iii) an 8 nucleotide barcode region in the center [46] (Figure 3A). Initially, we demonstrated that this system predicted siRNA delivery in vivo and generated DNA sequencing outputs that were linear with respect to the administered DNA [46]. Later, we demonstrated that this system, which we named Joint Rapid DNA Analysis of Nanoparticles (JORDAN), could simultaneously analyze over 150 nanoparticles simultaneously in vivo [33] ( Figure 3B).
Although there are different ways to design DNA barcodes, specific traits help increase the robustness of the data. Most importantly, universal primer sites e which are primer sites that do not change -confer an important advantage, maximizing the chance all barcodes are amplified in an unbiased way. This seems counterintuitive, but the vast majority of a DNA barcode should be identical; the barcode region of the DNA should be small. Adding chemical modifications including phosphorothioates to the 5 0 and 3 0 termini increase barcode stability. The barcode regions should also be designed to work together on solid phase next generation sequencing (NGS) platforms like Illumina. Since solid phase NGS relies on fluorophores ( Figure 3C), each individual barcode must have a 'base distance' of at least 3; in other words, each barcode must be different from all other barcodes at 3 of the 8 positions ( Figure 3D). We have designed >200 barcodes with base distances of 3 or more [33]. It is also critical to sequence the DNA'input' administered to the animals. This allows us to normalize the data and compare different cell types to each other within an experiment. Finally, like all big data systems, nanoparticle DNA barcoding experiments should include proper controls in each experiment. Two examples include a liposome loaded with caffeine (instead of a chemotherapeutic) [45] and a naked barcode [33]; in both cases, the negative controls performed poorly relative to the experimental conditions, as expected. A detailed protocol describing barcode design and a nanoparticle bioinformatics pipeline is available on DahlmanLab.org. DNA barcodes enable nanotechnologists to track how hundreds of distinct nanoparticles deliver drugs in vivo for the first time; this enables nanoparticle studies that were not feasible using traditional methods. As an example, we compared how dozens of different nanoparticles delivered DNA to 8 different cell sub-types in the spleen. Using unbiased Euclidean clustering e a common bioinformatics technique that analyzes large datasets e we found that specific types of immune cells tended to be targeted by the same nanoparticles [33].
DNA barcodes have enabled large scale experiments in fields as varied as oncology, developmental biology, and viral delivery. As an example, by using the single guide RNA (sgRNA) sequence as a 'barcode' to denote a gene that was knocked out, scientists performed whole genome screens to identify genes and non-coding regions that regulate cellular response to cancer and immunotherapies [48e51]. More specifically, a system named Genome-Scale CRISPR-Cas9 Knockout Screening (GeCKO) helped determine which genes promote resistance to anti-cancer drugs [50]. GeCKO was used to target 18,080 genes in the same experiment; multiple sgRNAs were designed per gene. The pool of sgRNAs were administered to human cancer cell lines that also expressed Cas9; as a result, each cell e on average e had no gene knocked out, or 1 gene knocked out. After adding an anti-cancer drug, cells were allowed to grow; enriched sgRNA sequences in live cells corresponded to genes that increased drug resistance ( Figure 4A). The same approach has been used to study metastasis in vivo. In this case, a pool of tumor cells were edited using GeCKO and administered in the hind limb of a mouse. After waiting a period of time, the authors isolated metastatic tumor cells from different organs, and thereby identified genes that promoted metastasis [52]. Similar approaches have been used to identify genes that suppress primary tumor growth in the liver and brain ( Figure 4B) [53,54]. Scientists also designed a combinatorial DNA barcode system named 'PolyLox' that identified stem and progenitor cells that led to the development of the immune system [55]. Finally, there are reports of 'capsid shuffling' and other combinatorial cloning strategies; these use sequences that code for amino acids as barcodes that denote which amino acids successfully enabled Adeno-associated viral (AAV) vectors that facilitate cellular entry and DNA delivery [56e 58]. Given that DNA is a highly efficient method to store and access information, it is likely that many other biology and engineering fields will use DNA barcodes in the future.

Outlook
RNA therapies have tremendous clinical potential but will require on-target drug delivery to reach the clinic. siRNA drugs in the liver are a great example; several patient populations are already benefiting from treatments that silence genes in hepatocytes. However, without new delivery vehicles, the clinical impact will be limited to local injection, and to the liver [19]. To target new cell types, it is important to test as many nanoparticles as possible. Traditional screening methods test delivery vehicles in vitro; however, in vitro particle screens may inaccurately predict delivery in vivo. In conjunction with rapid nanoparticle synthesis, high-throughput in vivo nanoparticle screening methods will allow scientists to track more particles simultaneously than ever before. Over time, this may allow scientists to better understand how nanoparticles behave in the body. We envision a time where enough nanoparticles have been tested in vivo that nanoparticles which target a given cell type can be rationally designed. We find it likely these high throughput approaches will help accelerate the development of new genetic therapies.  DNA barcodes are regularly used by biologists to perform whole genome studies. (A) In cancer cells, each gene in the genome of Cas9 expressing cells was knocked out individually using sgRNAs. The sgRNA served as the barcode that denoted which gene was silenced. After knocking every protein coding gene, an anti-cancer drug is added to the cells and time is allotted to facilitate drug resistance. Enriched sgRNA sequences in the live cells correspond to genes that affect cancer drug resistance. In this example, knocking down Gene A promotes drug resistance. (B) The same whole genome studies have identified genes thatwhen knocked downpromote primary tumor growth. These papers describe practical microfluidic devices that enable scientists to formulate many nanoparticles in the same day. Without these devices, high throughput in vitro screens and in vivo screens would be much more challenging. Reference 30 is also an important paper that describes microfluidic systems for nanoparticle formulation.