Hybrid Nanostructures from the Self-Assembly of Proteins and DNA

Proteins and DNA are two commonly used molecules for self-assembling nanotechnology. In this tutorial review, we discuss the hybrid field of "protein-DNA nanotechnology," whereby proteins are integrated with DNA scaffolds for the creation of hybrid nanostructures with distinct properties of each molecular type. We first discuss bioconjugation strategies, both covalent and supramolecular, for integrating proteins with DNA nanostructures. Next, we review seminal work in four emerging areas of protein-DNA nanotechnology: (1) controlling protein orientation on DNA nanoscaffolds, (2) controlling protein function with DNA nanodevices, (3) answering biological questions with protein-DNA nanostructures, and (4) building hybrid structures that integrate both protein and DNA structural units. Finally, we close with a series of forward-looking research propositions and ideas for directions of the field. The emphasis of this work is on integrated nanostructures with precise protein orientation on DNA scaffolds, as well as hybrid assemblies that integrate the structural and functional properties of each molecule.


INTRODUCTION AND MOTIVATION FOR PROTEIN-DNA NANOTECHNOLOGY
Since the inception of nanotechnology, scientists have dreamed of the ability to create tiny structures and machines that can manipulate matter at will. For example, to this day much research in the field is driven by the concept of nanomachines or devices (or, more evocatively, "nanorobots") that can interact with biological or other systems in programmable ways. Such nanostructures could, for example, diagnose and treat disease, synthesize novel materials, harvest and shuttle energy, exert mechanical forces, store and transmit information, or arrange other molecules with atomic precision. Richard Feynman outlined this idea of nanoscience in 1959 in his groundbreaking talk "There's Plenty of Room at the Bottom," and generations of chemists, biologists, engineers, and materials scientists have since probed the limits of nanostructure synthesis and function. Not surprisingly, biology has served as one of the most fertile sources of inspiration in this endeavor. Cells are teeming with nanoscale analogs of macroscopic structures and machines, including architectural scaffolds (the cytoskeleton), programmable "robots" and assembly lines for building materials in a controlled and monodisperse manner (the ribosome, non-ribosomal peptide synthesis, and enzyme cascades), motors and other machines for exerting mechanical force (actin, myosin, and focal adhesion complexes), channels and transport mechanisms for controlling the flow of matter (ion channels and endocytosis), structural materials with exceptional strength (spider silk, bone, and nacre), adaptors that can selectively bind to a target in a crowded sea of competing molecules (antibodies and ligand receptors), and both "hardware" and "software" for information processing (DNA and RNA copying and transcription, signaling pathways, and riboswitches).
Aside from cells, viruses offer another example of complex biological nanodevices with their ability to enter a cell, bypass the cell's defense mechanisms, and create new copies of themselves by hijacking the host machinery.

The Bigger Picture
Nanotechnology as a field seeks to create structures and materials that can manipulate and influence the microscopic world in much the same way that traditional machines and devices work on the macroscopic world. For inspiration, scientists have turned to biology, which has countless examples of nanoscale structures and machines that can carry out complex functions. Biological molecules such as proteins and DNA are particularly attractive for this purpose because of their programmable nature and functional relevance. In this review, we discuss hybrid nanostructures that integrate the structural programmability of DNA nanotechnology with the chemical and functional diversity of proteins. We discuss strategies for creating complex, integrated structures with these two biomolecules, as well as four areas where they have found application. In the long term, the field of "protein-DNA nanotechnology" has the potential to create materials with capabilities that rival, or even surpass, nature.

All of these functions, and many others, are mediated either entirely or in part by proteins. Although composed primarily of the 20 canonical amino acids, proteins have a breathtakingly wide range of functions as a result of their complex folds and their ability to hierarchically self-assemble with other proteins, DNA, RNA, carbohydrates, and lipids. This complexity comes at a cost, however, because the relationship between protein sequence and function or assembly is still imperfectly understood. The past few decades have brought remarkable progress in directed protein evolution, 1 de novo computational design, 2 the repurposing of existing biological scaffolds such as viral capsids, 3 and the abstraction of design rules in simplified building blocks such as self-assembling peptides 4 or proteins. 5 However, there is still a need for generating nanostructures with a high degree of programmability and structural control that can capitalize on the enormous power of native protein function both for recapitulating and probing biological systems and for designing new materials that can outpace nature. Specifically, one key unmet challenge is building highly anisotropic structures de novo from proteins, such as a nanoscale machine or robot that can perform a function given an external stimulus. Cells possess many such multi-protein complexes, but the difficulty in predicting even monomeric protein structures means that most assemblies made to date are highly symmetric-such as polyhedral cages or extended fiber and sheet assemblies-and usually static.
By contrast, oligonucleotides have proved to be highly promising molecules for constructing anisotropic and uniquely addressable assemblies at the nanoscale, as well as imparting programmable dynamic behavior. The field of DNA nanotechnology uses these molecules as "smart" self-assembling building blocks divorced from their natural genetic role. The design rules that drive oligonucleotide hybridization (i.e., the Watson-Crick pairing rules) are well known, a vast number of orthogonal interactions (i.e., sequences) exist, and the structural and physicochemical properties of the double helix, single-stranded DNA, and Holliday junction crossovers have been determined to great precision. In the past three decades, DNA nanotechnology has reported an ever-increasing catalog of structures, including simple 1D and 2D arrays 6,7 (Figures 1A and 1B), 3D crystals (Figure 1C), 8,9 highly complex and anisotropic 2D and 3D nanostructures (commonly known as "DNA origami"; Figures 1D and 1E), 10,11,12 and dynamic 13 or logic-gated 14 machines and devices (Figure 1F). Importantly, because DNA nanotechnology relies on a small subset of key motifs for self-assembly (which are then combined independently and modularly into more complex structures), the design process can be aided by user-friendly, graphical interface software such as Cadnano. 15 This facility allows both rapid entry of nonexperts into the field and the parallel design and testing of multiple structures with a high degree of precision.
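The Watson-Crick pairing rules mentioned above are simple enough to express in a few lines. The following is a minimal sketch, not taken from any DNA design package: the function names and the example sequences are illustrative assumptions, and real design tools also account for thermodynamics, secondary structure, and mismatches, which this sketch ignores.

```python
# Illustrative sketch of Watson-Crick pairing; all names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the strand (written 5'->3') that pairs antiparallel with `seq`."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_complementary_run(a: str, b: str) -> int:
    """Length of the longest contiguous stretch of `a` that can base-pair with `b`.

    Computed as the longest common substring of `a` and reverse_complement(b).
    A long run suggests the strands hybridize; a short run suggests they are
    (to a first approximation) orthogonal.
    """
    target = reverse_complement(b)
    a = a.upper()
    best = 0
    prev = [0] * (len(target) + 1)
    for ch_a in a:
        curr = [0] * (len(target) + 1)
        for j, ch_t in enumerate(target, start=1):
            if ch_a == ch_t:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


handle = "GATTACA"                         # hypothetical handle sequence
anti_handle = reverse_complement(handle)   # its designed binding partner
print(anti_handle)
print(longest_complementary_run(handle, anti_handle))
print(longest_complementary_run("AAAA", "AAAA"))
```

A handle and its reverse complement pair along their full length, whereas two poly-A strands share no complementary bases; screening candidate sequences this way is one crude route to the "orthogonal interactions" the text refers to.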
The programmability and tractability of DNA, however, come at the expense of chemical heterogeneity. With a few notable exceptions, such as aptamers or DNAzymes, DNA does not come close to mimicking the functions of proteins. Thus, in recent years there has been an explosion of interest in merging the chemical and functional diversity of proteins with the structural programmability of nucleic acid nanotechnology to forge a truly hybrid field of "protein-DNA nanotechnology." Research in this field has already shown promise in creating protein-DNA nanostructures for applications that include targeted delivery of therapeutics, biosensing, control over protein activity, elucidation of protein structure, functional biomaterials, and experiments to probe biology in new ways. In this tutorial review, we will discuss recent advances in protein-DNA nanotechnology with an emphasis on truly integrated assemblies that seek a well-defined relationship between the protein and DNA scaffold. To accomplish these endeavors, researchers have had to develop a host of novel chemical methods for synthesizing hybrid nanostructures, including site-specific modification of the protein with one or more oligonucleotides, the use of binding agents (such as DNA-binding fusion proteins) that yield a defined interface between the two components, control over linker length and rigidity between the two molecules, or some combination of the aforementioned factors.
We will begin with an overview of protein-DNA bioconjugation strategies in which we emphasize chemical approaches for site-specific covalent modification and strategies for modifying proteins more than once, as well as discuss several supramolecular methods for associating the two molecules. We will then describe landmark work in four key areas that have seen an explosion of interest in recent years: (1) controlling protein orientation on a DNA scaffold, (2) dynamically controlling protein function by using DNA structures, (3) using proteins on DNA nanoscaffolds to answer biological questions, and (4) synthesizing hybrid nanoscale assemblies with both protein and oligonucleotide structural components. Finally, we will close with a series of research propositions and future directions for the field, where we stress the role of novel chemical and supramolecular methods for improving the integration between proteins and DNA.

Figure 1. [...] 6 (copyright 2003 AAAS) and Winfree et al. 7 (copyright 1998 Springer Nature). (C) 3D self-assembled crystals based on a tensegrity triangle motif. Reprinted with permission from Zheng et al. 8 Copyright 2009 Springer Nature. (D) 2D "DNA origami" with a long scaffold strand folded by many short staple strands. Reprinted with permission from Rothemund. 10 Copyright 2006 Springer Nature. (E) Example of 3D DNA origami shapes. Reprinted with permission from Douglas et al. 11 Copyright 2009 Springer Nature. (F) A dynamically reconfigurable DNA origami "nanorobot" that switches between two states when the concentration of divalent magnesium is varied. Reprinted with permission from Gerling et al. 13 Copyright 2015 AAAS.

We also take this moment to mention what this review will not cover. First and foremost, it is not intended to provide a comprehensive overview of all protein-DNA hybrid materials; rather, it focuses on seminal examples in the four areas described. An extensive and rich literature exists on the merger of proteins with DNA nanotechnology, 12,16-21 so we have restricted the focus to the four key areas described below. Second, because of space limitations, we will not cover work on enzymatic cascades scaffolded by oligonucleotides despite the central role that these systems have played in protein-DNA nanotechnology. We will discuss some examples that pertain to the dynamic control over enzyme function in Using DNA Nanostructures to Dynamically Control Protein Function, but for a more thorough treatment, we refer the interested reader to several comprehensive reviews on this rich topic. 22-24 Third, although RNA will be increasingly used in the future with proteins (thanks to its richer structural diversity, its ability to be transcribed inside cells, and the existence of multiple protein-binding domains), we will restrict this discussion to DNA-based scaffolds because of their preponderance in the field. Fourth, we will discuss hybrid nanostructures containing only proteins produced by recombinant expression or from natural isolates rather than synthetic peptides conjugated to oligonucleotides. 25 Fifth, because the focus of this review is on controlled integration and positioning of proteins on DNA scaffolds (driven by chemical strategies for site-specifically modifying proteins with DNA), we will not cover examples where proteins are used to coat DNA nanostructures via electrostatic effects or non-specific DNA-binding domains. 26-29 Sixth, we will consider only systems where a protein is attached to a DNA nanostructure more complex than a duplex or a simple branched junction or forms a hybrid assembly with both structural components. 
We will not discuss examples where DNA is used as a barcode, as a linker to attach a protein to another material, or for proximity-based ligation or detection strategies, although these approaches have been crucial in a number of other applications reviewed elsewhere. 30,31 Finally, both the work covered and the forward-looking section at the end reflect our own research interests and excitement for future work. We apologize in advance to all the scientists whose work we are not able to discuss because of space limitations or the selection of themes. There are unquestionably many fruitful directions in this hybrid area of nanotechnology, and it is our hope that this work will both highlight emerging directions in the field and spur new ideas in previously untapped disciplines.

STRATEGIES FOR MODIFICATION OF PROTEINS WITH DNA
Given that only a small subset of proteins naturally interact with DNA or RNA, most hybrid protein-DNA nanostructures rely on one of two methods for integrating the two molecules: (1) direct covalent modification of the protein with the oligonucleotide and (2) fusion of the protein of interest (POI) with a DNA-binding protein, a protein for which an aptamer exists, or streptavidin (which binds biotin). The first approach, which results in a covalent linkage between the two molecules, poses distinct challenges in both reactivity and site specificity of the target. Proteins and DNA are both large molecules with many potentially reactive sites, so achieving efficient coupling without compromising their function can be very difficult at the low micromolar concentrations typically used. This challenge is exacerbated when the POI is highly cationic, which can lead to non-specific aggregation with the oligonucleotide. Incomplete or over-modification is common, which in turn raises the issue of separating the desired species from unmodified proteins or proteins with too many strands.
Below we will describe some of the key strategies for protein-oligonucleotide conjugation and outline the advantages and disadvantages of each. Although a great number of bioconjugation reactions exist, 31,32 here we will highlight only approaches already demonstrated for linking proteins to DNA. We divide these approaches into two sections: (1) methods commonly used in most protein-DNA conjugation studies, both historically and currently, and (2) less commonly used strategies that nonetheless have great potential for future applications, especially when two or more modifications with DNA are necessary. We also take a moment to stress that most monomeric proteins are quite small in relation to DNA nanostructures. For comparison, in Figure 2A, we show that green fluorescent protein (GFP), which has a molecular weight of ~27 kDa, is of comparable size to a 20-nt strand of DNA (approximately two helical turns). Thus, larger assemblies such as DNA origami (~5,000 kDa when the M13 scaffold is used) dwarf most proteins, and they often surpass even large multivalent protein assemblies such as viral capsids. 33

Common Approaches for Protein-DNA Conjugation
Lysine Acylation
The most straightforward way to modify a protein with DNA is through lysine acylation with activated esters or iso(thio)cyanates (Figures 2B and 2C). Lysine is one of the most common surface-exposed residues, so most wild-type proteins can be modified without further engineering. The biggest downside of this approach is that usually multiple lysines are accessible (and potentially the N terminus as well), so a mixture of conjugates varying in the location and number of modifications is obtained. This lack of selectivity can be problematic if the DNA strand ends up attached to a critical part of the protein (such as a binding interface) or if it positions the protein on a DNA scaffold in a way that blocks its function. 
Lysine acylation can also be relatively slow and inefficient at the low micromolar concentrations used; with small molecules, a large excess of one coupling partner can be used to circumvent this issue, but this is often not possible with DNA and proteins. Background hydrolysis of amine-reactive reagents such as NHS esters also restricts the utility of this reaction. To link DNA to proteins via lysine chemistry, homo-bifunctional linkers such as disuccinimidyl glutarate (DSG) or tunable bis-NHS polyethylene glycol (PEG) linkers, in conjunction with amine-modified DNA, are typically used ( Figure 2D). Amines are available as a common DNA modification from commercial suppliers and are introduced via solid-phase synthesis with a functionalized phosphoramidite. Modifying the DNA with a thiol and using a hetero-bifunctional crosslinker with a disulfide or maleimide moiety can avoid the crosslinking of two proteins or two oligonucleotides (see Cysteine Modification).
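The heterogeneity of lysine labeling described above can be illustrated with a deliberately simple statistical model. The sketch below assumes that a protein has a fixed number of accessible lysines and that each one is modified independently with the same probability; both assumptions are simplifications made here for illustration (real lysines differ in accessibility and reactivity), and the numbers used are hypothetical rather than measured values.

```python
from math import comb


def label_distribution(n_sites: int, p_modify: float) -> list[float]:
    """Binomial probability of k = 0..n_sites DNA strands per protein.

    Assumes (for illustration only) that every accessible lysine reacts
    independently and with the same probability `p_modify`.
    """
    return [
        comb(n_sites, k) * p_modify**k * (1 - p_modify) ** (n_sites - k)
        for k in range(n_sites + 1)
    ]


# Hypothetical example: 5 accessible lysines, 20% chance each is modified.
dist = label_distribution(n_sites=5, p_modify=0.2)
unmodified = dist[0]
singly_modified = dist[1]
over_modified = sum(dist[2:])

for k, prob in enumerate(dist):
    print(f"{k} strand(s): {prob:.3f}")
print(f"unmodified: {unmodified:.3f}, single: {singly_modified:.3f}, "
      f"multiple: {over_modified:.3f}")
```

Even in this idealized picture, no single reaction condition yields only the singly modified species, which is consistent with the purification problem noted earlier (separating the desired conjugate from unmodified and over-modified protein).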
One key consideration with most of the conjugation strategies described herein is that they involve flexible linkers such as aliphatic carbon chains or oligoethylene glycol in the bifunctional molecule itself, between the DNA and the introduced functional group, or both. Although the flexibility of these linkers helps facilitate the reaction of the two large biomolecules, it often precludes a defined orientation between the protein and DNA scaffold, which could be problematic for some applications. Creating truly hybrid nanostructures with well-defined relationships between the protein and DNA components will most likely require a more rigid connection. Indeed, one of the key challenges in protein-oligonucleotide nanotechnology will be building complex assemblies where the two molecular scaffolds are integrated in a seamless fashion, much like naturally occurring protein-protein interfaces. Although the simplicity of lysine chemistry makes it ubiquitous for making protein-DNA conjugates, we will for the most part not discuss this method; however, we will include a few exceptions for particularly interesting applications within the four areas described or as a secondary reaction in conjunction with a more site-specific strategy.

Cysteine Modification
A second way to modify native protein residues with DNA is through alkylation of thiols with reagents such as maleimides and iodoacetamides or via disulfide formation and exchange (Figure 2E). Because cysteine is one of the rarest amino acids exposed on a protein surface (most are either buried in an active site or tied up in disulfide bonds), this strategy is powerful for achieving site specificity. A mutagenically introduced cysteine can often be targeted selectively, allowing for DNA modification away from any potentially deleterious site on the protein, and side reactions with other nucleophiles such as amines can usually be avoided. Cysteine modification represents the best balance between selectivity and general accessibility for protein-DNA bioconjugation and is the first reaction our lab and many others working on protein-DNA nanotechnology use when site selectivity is required. 32 Furthermore, a number of commercially available linkers allow for modification of the protein with either amine- or thiol-modified DNA (Figure 2F). One potential downside of this method is that a surface-exposed cysteine can lead to protein dimerization and/or aggregation as a result of spontaneous disulfide formation. 34 These disulfides can be broken prior to modification with a reducing agent such as dithiothreitol (DTT) or tris(2-carboxyethyl)phosphine (TCEP), although one must carefully control the reaction conditions to avoid cleaving endogenous disulfides, which can in turn lead to over-modification of the protein.

Biotin-(strept)avidin and Ni-NTA
All of the above methods result in covalent modification of the protein, but strong non-covalent interactions can also be used to functionalize proteins with DNA. The most common approach uses the binding of biotin to streptavidin, which has such a high affinity (Kd ~10⁻¹⁵ M) as to be almost irreversible. DNA strands functionalized with biotin are readily available from commercial suppliers, and tetravalent streptavidin can be used as an intermediary "glue" between DNA and a biotinylated protein; alternatively, monovalent avidin can be fused directly to the POI. 35 The interaction between a fused hexahistidine (His6) tag (which is inherently site specific, though generally restricted to the protein's termini) and DNA functionalized with nickel-nitrilotriacetic acid (Ni-NTA) can also be used for modifying proteins in a reversible manner 36 and attaching them site specifically to DNA origami. 37 In one particularly elegant example, Gothelf and coworkers used a His6 tag (or naturally occurring metal-binding patches) to transiently modify a protein with a DNA handle, which could in turn be used to direct a complementary strand modified with an NHS ester (Figure 3A). 38 In this way, the authors could target only a subset of lysine residues (in close proximity to the directing strand) without resorting to more complex bioconjugation strategies.

Covalent Modification of Self-Ligating Protein Tags
Another common strategy for protein-DNA conjugation, which altogether avoids chemical conjugation of DNA to the POI, involves fusing the POI to protein tags that can in turn link specific substrates to their own surface. Functionalizing the target DNA with the chemical moiety accepted by these self-ligating tags results in conjugation of the oligonucleotide to the POI-tag fusion (Figure 2G). Specific examples include (1) SNAP-tags (which ligate O6-benzylguanine groups), 42 (2) HaloTags (which ligate haloalkanes), 43 (3) CLIP-tags (which ligate O2-benzylcytosine moieties), 44 and (4) the SpyTag/SpyCatcher system (which ligates a short peptide). 45 This approach is powerful because the fusion proteins can be readily generated through standard molecular biology techniques, and commercial suppliers offer many of the target moieties as standard DNA modifications. Because the tag ligates the DNA to a specific site on its surface, a single well-defined conjugate is formed, often efficiently and at high yield, in a single step. The tags are also orthogonal to one another, so they can be used consecutively to modify different locations on a DNA nanostructure with two different proteins (or three, if yet another orthogonal conjugation, such as biotin-streptavidin, is used). 46 Conversely, this method requires fusion with a full-length, folded protein that can be comparable in size to the POI (see Figure 2G, which shows the size of a SNAP-tag relative to GFP). The self-ligating tag is also usually grafted onto the POI through a flexible amino acid linker. Thus, it is useful for applications that require single, site-specific attachment of DNA (e.g., displaying proteins on an origami scaffold) with minimal manipulation or purification of the conjugates. 
The method is less suitable for creating hybrid nanostructures (given that the tag itself takes up a lot of space), when rigid attachment is desired, or when the target protein must be fairly small, for example, to fit inside a DNA origami box.

Less Common Strategies for Protein-DNA Conjugation
Enzymatic Modification of Small-Molecule or Peptide Tags
A different approach to modifying proteins with DNA is the use of enzymes to ligate the two components. This strategy requires introducing into the POI a chemical handle (such as a small molecule or genetically fused peptide tag) that the enzyme can recognize. The other chemical handle is attached to the oligonucleotide, leading to selective linking of the two components (Figure 2H). A key advantage of this method is the inherent site specificity of a fusion peptide, as well as the lack of competing side reactions (like hydrolysis with NHS esters), because the target functional groups react only in the presence of the enzyme. Enzymes such as protein farnesyltransferase (PFTase) can be used to graft modified isoprenoids with a new chemical handle to the short C-terminal peptide tag CVIA for subsequent secondary bioconjugation (e.g., copper click chemistry). 47 Another enzyme, sortase A, ligates an oligo-glycine peptide to the LPXTG sequence, although this strategy requires a second bioconjugation reaction to link one of those peptides to DNA. 48 Additional enzymes that have been used to link peptide-tagged proteins to DNA include transglutaminase 49 and methyltransferase. 50 Avoiding peptide tags altogether, Gothelf and coworkers used a terminal deoxynucleotidyl transferase (TdT) to ligate DNA strands with molecules bearing a pendant nucleotide triphosphate (NTP). 51 By first attaching the NTP to the molecule of interest (e.g., via click chemistry using an azide dCTP), the authors could efficiently attach DNA to peptides, polymers, dendrimers, or full-length proteins. Alternatively, relaxase enzymes ligate themselves to specific oligonucleotide sequences, which can be engineered into an origami scaffold; 52 the use of several different enzymes with differing sequence specificities imparts additional orthogonality to this approach. 
However, using this method to attach an arbitrary protein would require a fusion of the POI with the relaxase (which the authors demonstrated with fluorescent proteins). A similar approach was demonstrated with a fusion of a POI and the phi X174 Gene-A* protein, which covalently links a tyrosine residue in Gene-A* to a specific oligonucleotide sequence. 53 Expressed protein ligation, which usually links a peptide to a protein bearing an intein fusion, can also be used to directly ligate an oligonucleotide bearing a terminal cysteine (or a mimic thereof). 54

Functionalization of Non-canonical Amino Acids with Bio-orthogonal Reactions
When none of the above approaches is suitable, a non-canonical amino acid (NCAA) can be introduced and functionalized with reactions that do not affect the native residues. For example, amino acids such as 4-azidophenylalanine can be installed by the Schultz amber-codon-suppression method 55 and targeted via "click" chemistry (catalyzed either by copper 56 or through a strain-promoted, "copper-free click"; 57 Figures 2I and 2J). DNA can be purchased with both terminal and internal azides or alkynes, usually with an intervening 6- or 12-carbon linker. Alternatively, modified phosphoramidites with a reduced linker length can be used in solid-phase DNA synthesis (Figure 2K), resulting in a more seamless transition between the oligonucleotide and the protein surface. The lack of reactivity with native residues makes NCAA incorporation ideal for site selectivity or if two conjugation reactions are required. Furthermore, the relative inertness and resistance to hydrolysis of the reactive moieties allow for long-term storage, extended reaction times, and elevated temperatures. The primary downside to this method is that it requires additional plasmids for the non-canonical tRNA and tRNA synthetase necessary for incorporating the NCAA, and the expression yields are typically lower than with wild-type expression. 
Although click chemistry is by far the most common method for NCAA modification with DNA, the Francis lab has reported an attractive series of oxidative coupling reactions with residues such as 4-aminophenylalanine (Figure 2L). 58 These reactions proceed in minutes at mild aqueous conditions with oxidants such as sodium periodate or potassium ferricyanide, and their efficiency can rival that of click chemistry. Additional examples of NCAA-mediated DNA coupling include the reaction of aldehyde-containing proteins (e.g., introduction of 4-acetylphenylalanine) 59 with hydroxylamine- or hydrazine-modified DNA to form oximes or hydrazones, respectively (Figure 2M). 60

Aptamers, Fusion Proteins, and Other Binding Agents
Aptamers (antibody-mimetic single-stranded DNA sequences that fold into a tertiary structure and bind to a target 61) represent a particularly attractive strategy for immobilizing proteins on DNA scaffolds 62 because they can be easily introduced into a constituent strand of the structure (Figure 2N). Although aptamers bind noncovalently to proteins, they have the advantage of being inherently "site specific" as they target a unique interface on the protein. Although this strategy has not been extensively employed, fusing the POI with a protein that already has an aptamer would allow for attachment to a DNA scaffold. Compared with antibodies or other binding groups, aptamers often have modest Kd values (~μM). Their binding can, however, be enhanced through spatial control of two aptamers that bind to different interfaces of a protein, "clamping" proteins in a more rigid fashion. 63 It is also possible to covalently photo-crosslink an aptamer to a protein surface through reactive handles such as phenyl azides; 64 to our knowledge, this strategy has never been applied to DNA nanostructures, but it could represent an avenue for future research. 
As an alternative to aptamers, natural protein-protein interactions can be used to functionalize a target with DNA, as demonstrated by de Greef and coworkers with antibodies. 39 Protein G, which binds to the Fc region of antibodies and is more easily expressed, handled, and modified with DNA than a full-size IgG, was functionalized with a DNA handle and a benzophenone moiety, resulting in covalent trapping after UV irradiation (Figure 3B). This approach is particularly useful if, as for protein G, the binding interaction does not affect the function of the protein (antigen binding by the Fv region in this case). This last report was particularly interesting because it relied on site-specific incorporation of both the DNA and the benzophenone (the former through a unique cysteine residue and the latter via the Schultz method with a benzophenone NCAA), demonstrating the potential for multi-functional hybrid protein-DNA structures.
Fusing a DNA-binding protein (such as a zinc finger, leucine zipper, or transcription factor homeodomain) to the POI can also localize it on a DNA nanostructure 65 bearing the cognate DNA sequence to create, for example, enzyme cascades on a DNA origami 66 or to reconstitute functional ion channels. 67 The key drawback of this approach is the reversibility of binding, given that many DNA-binding proteins have Kd values in the nanomolar regime, comparable to the working concentrations for DNA origami. The Morii lab demonstrated an elegant workaround to this issue by using zinc-finger DNA-binding proteins to localize enzymes fused with SNAP-tags, CLIP-tags, and HaloTags to three different regions of a DNA origami modified with the double-stranded DNA (dsDNA) targets. 40 In this fashion, the non-covalent (reversible) DNA binding enhanced the covalent (irreversible) linking of the proteins to the scaffold and enabled an over 90% yield of the modified structure to generate an enzymatic cascade (Figure 3C). Once again, we foresee that such "cooperative" approaches, which use a non-covalent interaction to direct a covalent one, will be particularly useful for creating rigid interfaces with proteins and/or modifying them in more than one location.
As an alternative to aptamers or DNA-binding domains, natural protein-biomolecule interactions can be used in a multivalent fashion to create a binding interface between the two components. In 2017, the Sleiman lab leveraged the affinity of human serum albumin (HSA) for lipids to bind a DNA nanocube by functionalizing it with multiple alkyl tails (Figures 3D and 3E). 41 They could tune the number of tails on the cube by modifying the constituent DNA strands, and a structure with four dendritic chains gave 5-fold stronger binding than a single chain. Instead of alkyl tails, multiple peptides that bind to a protein surface could be attached to a DNA structure to create a mimetic of a protein-protein interface. 
This approach was demonstrated with two different peptides that bind to different faces of the POI (templated on a linear duplex 68 ) or multiple copies of the same peptide that binds to a multivalent protein (templated on a DNA origami 69 ); we will discuss this more thoroughly in Controlling Protein Orientation on a DNA Scaffold below.
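To make the reversibility concern for DNA-binding proteins concrete, the short Python sketch below estimates the equilibrium occupancy of a single binding site for a simple 1:1 interaction. It is an illustration only: the function name, the 1 nM concentrations, and the Kd values are assumptions chosen for the example and are not taken from the studies cited above.

```python
import math


def fraction_bound(protein_nM: float, site_nM: float, kd_nM: float) -> float:
    """Equilibrium fraction of DNA sites occupied for 1:1 binding.

    Solves the standard quadratic for the complex concentration so that
    depletion of free protein is accounted for (no excess-ligand shortcut).
    """
    total = protein_nM + site_nM + kd_nM
    complex_nM = (total - math.sqrt(total**2 - 4.0 * protein_nM * site_nM)) / 2.0
    return complex_nM / site_nM


# Hypothetical working conditions: 1 nM origami sites and 1 nM fusion protein.
for kd in (0.1, 1.0, 10.0, 100.0):
    occupancy = fraction_bound(1.0, 1.0, kd)
    print(f"Kd = {kd:6.1f} nM -> {occupancy:.2f} of sites occupied")
```

Under these assumed conditions, a Kd equal to the working concentration leaves well under half of the sites occupied, which is consistent with the motivation for converting the reversible interaction into a covalent one.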

SEMINAL EXAMPLES OF INTEGRATED PROTEIN-DNA NANOTECHNOLOGY
In this section, we will describe four exciting areas where the integration of proteins and DNA nanoscaffolds has been used for creating novel structures. Most involve covalent protein-oligonucleotide conjugates, but several use affinity interactions instead. Many of these examples could fit into several sections, so we chose the topic where they demonstrated the greatest novelty and potential for applications.

Controlling Protein Orientation on a DNA Scaffold
Interestingly, the entire field of DNA nanotechnology was inspired by the idea of immobilizing proteins on a self-assembled DNA scaffold with a defined orientation. In 1982, Ned Seeman proposed using oligonucleotides as a structural material to create self-assembled 3D crystals through sticky-end cohesion and use these addressable frameworks to immobilize proteins in a periodic 3D array. 70,71 In this way, the structure of the guest molecule would be solved by X-ray crystallography without the laborious process of crystallizing the protein first. Although this goal has not been realized to date (and is indeed an exciting future direction for protein-DNA nanotechnology; see Structural Biology on Proteins Aided by DNA Scaffolds), the controlled positioning of proteins on DNA nanoscaffolds has continued to inspire the field ever since. Potential applications go far beyond structural biology, and indeed the other three topics covered in this review would all benefit from control over the protein orientation relative to the underlying nanoscaffold. For example, nanostructures that control enzyme function (e.g., latched DNA origami boxes in Using DNA Nanostructures to Dynamically Control Protein Function) will not function properly if the protein active site is occluded because it ''points at'' the cavity wall. Likewise, biological studies such as those described in Using Hybrid Protein-DNA Nanostructures to Answer Biological Questions must present the active portion of the protein (e.g., its binding interface) correctly to enable association with its receptor. Finally, the hybrid structures with both protein and DNA components as discussed in Building Nanostructures with Protein and DNA Structural Components must have a specific relationship between the two molecules to create a well-defined final assembly. Future directions, such as protein-actuated DNA nanomachines, will likewise require careful integration of the two components.
Even the enzyme-cascade examples not covered herein rely on the positional control of proteins with respect to one another to enhance substrate flow between them 22-24 and could benefit from orienting the respective pieces with greater precision. Finally, on a conceptual level, biology tightly controls the orientation and interaction of proteins with one another and other molecules, so investing in similarly precise synthetic protein-DNA nanostructures and devices will pay dividends in new and unexpected ways in the future.
In 2006, Turberfield and coworkers reported one of the earliest examples of a nanostructure-guided protein display by using the intrinsic helicity of the DNA duplex to control the relative placement of cytochrome c with respect to a tetrahedral cage. 72 Although the protein was conjugated to a thiolated DNA via its lysine residues in a non-specific fashion, systematically varying its attachment point on the duplex comprising the edge of the cage resulted in a smooth shift from the inside to the outside of the cage (Figure 4B). This change could be probed by gel electrophoresis and demonstrated an elegant mechanism for controlling the relative position of the two macromolecules. The Fan and Yan labs recently imparted site specificity to the cytochrome c by using a mutagenically introduced cysteine and also confirmed the orientation (pointing in versus out of the cage) via cryo-electron microscopy (cryo-EM; Figure 4C). 73 By tethering the DNA cage to a gold substrate through three thiol-modified vertices, the authors could tune the orientation of the protein in relation to the surface, yielding a ~50% increase in electron-transfer rate for the cytochrome c inside the cage. This ability to regulate the spatial relationship between proteins and other functional components or interfaces will be particularly useful for creating nanosystems that control the flow of energy or matter in nanoscale factories or synthetic cells. In a related report, Fromme and coworkers demonstrated that PNA-modified proteins could be incorporated into a tetrahedral DNA cage (Figure 4D). 74 The authors did not explicitly control the protein orientation but rather found that, depending on its charge, it was either repelled from the cage (e.g., negatively charged azurin) or attracted to the anionic environment inside the cage (e.g., positively charged cytochrome c).
This result raises the intriguing possibility of controlling the placement of proteins on a DNA scaffold by either manipulating their surface charge or fusing them with a highly charged partner as a ''directing group'' of sorts.
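The helicity-based placement described above can be illustrated with a small geometric sketch. The Python code below assumes an idealized B-form duplex (10.5 base pairs per turn), takes the attachment point at offset 0 to face the cage interior, and uses a 90-degree cutoff for ''inward'' versus ''outward''. These conventions, and the function names, are assumptions made for illustration and do not reproduce the design rules of the cited studies.

```python
BP_PER_TURN = 10.5  # assumed helical repeat of idealized B-form DNA


def attachment_phase_deg(bp_offset: int) -> float:
    """Rotation of the attachment point about the helix axis, relative to offset 0."""
    return (bp_offset * 360.0 / BP_PER_TURN) % 360.0


def faces_inward(bp_offset: int, inward_phase_deg: float = 0.0) -> bool:
    """True if the attachment point lies within 90 degrees of the inward direction."""
    diff = abs(attachment_phase_deg(bp_offset) - inward_phase_deg) % 360.0
    diff = min(diff, 360.0 - diff)  # shortest angular distance
    return diff < 90.0


# Moving the attachment site one base pair at a time along a cage edge
# rotates the tethered protein by roughly 34 degrees per step.
for offset in range(11):
    side = "inside" if faces_inward(offset) else "outside"
    print(f"offset {offset:2d} bp: phase {attachment_phase_deg(offset):6.1f} deg -> {side}")
```

In this simplified picture, shifting the attachment site by about half a helical turn (five base pairs) moves the protein from one face of the duplex to the other, which is the principle behind the gradual inside-to-outside shift.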
A key limitation of DNA nanostructures is that their resolution for molecular placement is generally limited to the minimal distance between handles attached to staple strands, typically ~3-5 nm. Self-assembling protein scaffolds such as viral capsids, by contrast, can position chemically appended molecules with a higher resolution (~0.5-1 nm). 3 Although the first example of viral capsids immobilized on DNA origami was reported in 2010 by the Yan and Francis groups, 33 in 2018 Wang and coworkers demonstrated that the relative orientation of self-assembling protein scaffolds could be controlled with DNA origami. 75 The authors used the tobacco mosaic virus (TMV), but rather than directly modifying the individual proteins with DNA, they assembled the capsid monomers around RNA strands in a similar fashion to native capsid formation around the RNA genome (Figure 4E). The authors could precisely control the exact number and length of TMV rods by attaching one or more RNA strands to defined locations on several different DNA origami templates. Highly anisotropic assemblies could be formed by this approach (Figures 4F and 4G), demonstrating the power of DNA nanotechnology as a programmable scaffold able to retain the chemical and self-assembly properties of the protein. Moreover, the exact orientation of not just one molecule but an entire self-assembled protein scaffold was controlled with high precision. Further functionalizing the TMV monomers with bioactive molecules (e.g., peptides), polymers, or nanoparticles would yield unprecedented materials with hierarchical control over multiple length scales. DNA nanostructures can also be assembled reversibly through strand-displacement reactions, providing a temporal control not possible with many protein-based systems.
Although many researchers covered herein acknowledged the importance of site-specific protein-DNA chemistry, the Jones and Stulz groups set out to systematically probe this effect by modifying proteins with DNA in different locations and examining the relative effect on their function. 76 The authors chose superfolder GFP (sfGFP) and β-lactamase (BL) for their study, and to ensure site selectivity, they turned to copper-free click chemistry with the NCAA 4-azidophenylalanine. The NCAA was incorporated via the Schultz technique, and the cyclooctyne coupling partner was introduced into the DNA with a modified phosphoramidite. This strategy allowed them to place the modification at any location on the protein surface without any of the restrictions imposed by fusion tags or cysteine residues. The authors first studied energy transfer (fluorescence resonance energy transfer [FRET]) between the sfGFP chromophore and a Texas Red dye attached to an oligonucleotide complementary to the protein-linked DNA handle. The efficiency of FRET could be tuned from ~90% to 75% depending on the modification site on the protein as a result of differing distances between the donor and acceptor dyes (Figure 4H). DNA was also attached to several different locations on the BL surface, and the catalytic efficiency of the protein attached to a DNA origami surface was probed. Both the site of modification and the orientation relative to the origami (pointing ''up'' versus ''down'') affected the catalytic rate, allowing for up to 30-fold enhancement relative to bulk solution. Thus, this work demonstrated the importance of protein orientation on a DNA scaffold and the value of a site-specific bioconjugation strategy. We also note that the cyclooctyne used (appended directly to the 5′ end of the DNA) and the use of an NCAA on the protein resulted in a short linker between the molecules and thus allowed for tight coupling between the two components.
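The link between FRET efficiency and donor-acceptor distance noted above follows the standard Förster relation, E = 1 / (1 + (r/R0)^6). The Python sketch below inverts this relation for the two efficiencies quoted in the text. The Förster radius used here (5 nm) is a placeholder assumption, not a value reported for the sfGFP/Texas Red pair, so only the relative change in distance should be read from the output.

```python
def fret_efficiency(distance_nm: float, r0_nm: float) -> float:
    """Foerster relation: transfer efficiency at a given donor-acceptor distance."""
    return 1.0 / (1.0 + (distance_nm / r0_nm) ** 6)


def distance_from_efficiency(efficiency: float, r0_nm: float) -> float:
    """Invert the Foerster relation to estimate distance from a measured efficiency."""
    return r0_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


R0_NM = 5.0  # hypothetical Foerster radius; the source does not state one

for efficiency in (0.90, 0.75):
    r = distance_from_efficiency(efficiency, R0_NM)
    print(f"E = {efficiency:.2f} -> r = {r:.2f} nm ({r / R0_NM:.2f} x R0)")
```

Because of the sixth-power dependence, the drop from roughly 90% to 75% efficiency corresponds to only a modest increase in separation (from about 0.69 to about 0.83 times R0), which illustrates how sensitive such measurements are to the attachment site.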
One potentially transformational area for protein-DNA nanotechnology, which harkens back to the original motivation for the field, is the use of origami scaffolds to aid in cryo-EM characterization of proteins. Using cryo-EM for single-particle reconstruction of protein structure requires acquiring tens to hundreds of thousands of individual images of the target molecule, classifying them into different orientations, averaging the electron density for each of these orientations, and then combining all the images into a 3D reconstruction of the protein. DNA origami scaffolds (which are large and easily visualized by cryo-EM) are ideal candidates for ''nanoscale sample holders'' to control the orientation of the protein, prevent its adsorption and denaturation at the air-water interface of the sample, and help find small particles with a low signal-to-noise ratio (see Structural Biology on Proteins Aided by DNA Scaffolds). In 2016, the Dietz and Scheres groups demonstrated this principle by using a barrel-like origami to immobilize the transcription factor p53. 77 The use of a DNA-binding protein circumvented the need for bioconjugation, and the researchers could control the orientation of the protein by changing the location of the binding site on a DNA duplex spanning the origami barrel. By relying on the helical nature of DNA, the authors could tune the exact orientation of the protein by using the origami as a ''nanoscale goniometer'' (Figure 5A). The DNA nanostructure protected the protein from adsorption to the air-water interface (which could denature it or bias its orientation), controlled the thickness of the ice, and provided a reference mask for selecting particles and sorting them into different classes. The authors were able to obtain a structural solution for the protein to ~15 Å resolution, which was not sufficient for atomic-scale information but did yield some new insights into the way the protein bound DNA.
A number of factors prevented a higher-resolution structure, but chief among them was the lack of a sufficiently rigid and well-defined protein-DNA interface. The authors also explicitly mentioned that, although their method was amenable to DNA (or RNA) binding proteins, extending it to arbitrary proteins would require ''chemical modifications of some of the DNA staples within the support with a specific tag on the target protein.'' In Structural Biology on Proteins Aided by DNA Scaffolds, we discuss potential approaches for addressing this exact issue for both cryo-EM and X-ray crystallography applications.
In 2018, Mao and coworkers demonstrated a different approach to protein-DNA cryo-EM studies by using a DNA nanobarrel to immobilize the membrane protein α-hemolysin (Figures 5B and 5C). 78 The authors used the DNA scaffold to attach lipids modified with complementary handles, creating a hydrophobic milieu that allowed anchoring of the protein in a local environment mimicking a lipid nanodisc. Key to integrating the two molecules, all while preventing aggregation due to their hydrophobic nature, was first stabilizing the protein with detergent, which could be slowly dialyzed away to promote insertion of the protein into the hydrophobic nanostructure cavity. After single-particle 2D class averaging, the structure of the DNA barrel could be visualized at 7.5 Å resolution, allowing fitting of the DNA backbone. The protein, because it lacked specific orientational control, could be resolved to only around 30 Å (and then only after application of its intrinsic C7 symmetry). However, combining this method with additional bioconjugation techniques to ''pin'' the protein more tightly and prevent rotation in the barrel could potentially enhance the resolution and help determine the structures of membrane proteins, for which crystallization is notoriously difficult. The DNA scaffold can also be readily tuned to match differently sized membrane proteins, an advantage not possible with systems such as nanodiscs or micelles. Indeed, using DNA origami to control liposome size and shape 80 would be a natural way to further immobilize membrane proteins.
In biological systems, the relative orientation of two interacting proteins is controlled by their binding interface, which is in turn dictated by the spatial arrangement of supramolecular interactions (e.g., hydrogen bonds, salt bridges, or hydrophobic packing). In 2017, the Sacca lab demonstrated that a similar interface could be generated between a DNA origami structure and the multivalent protease DegP (which can exist in oligomeric states ranging from 6 to 24 monomers) through the attachment of multiple short binding peptides (Figure 5D). 69 The peptides (sequence: DPMFKV) were attached to thiolated DNA handles through an N-terminal maleimide, allowing up to 18 peptides to be attached to a DNA origami structure composed of hinged sheets, which could in turn ''wrap'' the DegP target through multiple peptide-protein interactions. Key to this work was the controlled positioning of these peptides (driven in turn by the helicity of DNA and the exact attachment site); pointing the peptides ''outward'' dramatically reduced protein binding. The number of peptides on the structure could be controlled such that more peptides yielded tighter binding, effectively converting a rather weak individual peptide-protein interaction (Kd ~5 mM) into a tight, multivalent interface. Interestingly, although the size of the barrel was a better match for the 24-mer protein, the 12-mer DegP bound almost twice as well as the smaller or larger oligomers. The authors surmised that this was due to balancing the size match and energy of binding (which increases with larger assemblies) with the ease of diffusion into the cage (which increases with smaller assemblies). Extending this concept to other targets, especially monovalent proteins (by attaching several different peptides or other binding agents that each bind to a different face of the protein), would enable synthetic antibodies that can take on arbitrary shapes, sizes, and additional functionalization.
Alternatively, generating a binding interface on a DNA scaffold could help rigidly pin down a target protein, which will be critical for structural biology studies.
One of the most impressive examples of controlling protein orientation, which integrated multiple site-specific conjugation approaches with the judicious design of a DNA nanostructure, was reported by the Gothelf lab 79 and built off prior work using polyhistidine-Ni(NTA) interactions to direct an NHS ester conjugation to a specific lysine residue on an antibody (as in Figure 3A). 38 Origami scaffolds (both 2D rectangles and 3D cages) were designed with holes large enough to fit the Fc region of a mouse IgG antibody, flanked by two handles for (NTA)-functionalized DNA strands. Four additional handles for complementary strands bearing NHS esters were added to the other two sides of the cavity in order to covalently trap an antibody after binding to the (NTA) moieties via Fc histidine clusters (Figure 5E). The efficiency of IgG crosslinking to the origami (which the authors determined by adding the nickel chelator EDTA and probing for how many antibodies remained tethered to the structures) increased with the number of NHS esters until it peaked at over 90% for the constructs with four handles. Through both chemical conjugation and steric trapping, the orientation of the antibody could be controlled precisely, and the authors demonstrated two 3D origami objects with the IgG protruding from either the center or the end of the nanostructures (Figures 5F and 5G). The yield for these more complex objects was lower (~50%) but still impressive given the degree of control of a large biomolecular complex on the nanostructure. Finally, the functionality of the antibody (antigen binding) was retained, highlighting the advantage of this approach over non-specific mechanisms that could occlude the Fv region.
Using DNA Nanostructures to Dynamically Control Protein Function
A second key area of application for protein-DNA nanotechnology is in modulating protein function with a DNA scaffold either by hiding the protein in a cage to block its activity or by spatially restricting other co-reagents. One of the seminal examples in this field was reported in 2012 by the Church group, who developed a ''nanorobot'' that could control protein function in a logic-gated, stimulus-responsive manner. 14 The authors designed a hexagonal DNA origami cage mimicking a clamshell, whose top and bottom halves were held together with two DNA ''locks'' (Figure 6A). Placing a protein inside the DNA structure prior to closing it blocked its activity by sterically isolating it inside the origami cage. In order to render the cage stimulus responsive, the two locks holding it closed consisted of aptamers for a specific target; upon exposure to a protein ''key,'' the aptamer-target interaction would outcompete hybridization and open the lock. Using aptamers for two different targets on the cage resulted in an AND logic gate, whereby the cage would open only when both targets were present (e.g., on a cell surface), opening the clamshell and exposing the protein cargo. Six different cell lines, expressing different combinations of the targets for three possible aptamers, were used for demonstrating the function of the system; the correct lock combination exposed an antibody fragment to human leukocyte antigen-A/B/C. The system could selectively label a target cell even in the presence of non-target cells, and various aspects of cell signaling could be modulated with antibodies that activated specific intracellular pathways. In this case, the ''protein function'' controlled by the origami was antibody binding, but this principle could apply to virtually any protein whose function can be blocked by the cage.
Although many other targeted drug-delivery systems exist, the work by Church and colleagues was the first to demonstrate a programmable container that could be opened in a ''smart'' fashion.
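The AND-gate behavior of the two-lock cage can be summarized as a truth table. The Python sketch below is a deliberately idealized abstraction in which each aptamer lock is either fully open or fully closed; the function names and the binary treatment are assumptions made for illustration, whereas the real system responds in a graded, concentration-dependent manner.

```python
def lock_is_open(key_present: bool) -> bool:
    """Idealized aptamer lock: it opens only when its target 'key' is present."""
    return key_present


def cargo_exposed(key_a_present: bool, key_b_present: bool) -> bool:
    """The clamshell opens, exposing its cargo, only if both locks are open (AND gate)."""
    return lock_is_open(key_a_present) and lock_is_open(key_b_present)


# Enumerate all combinations of the two hypothetical cell-surface keys.
for key_a in (False, True):
    for key_b in (False, True):
        state = "open" if cargo_exposed(key_a, key_b) else "closed"
        print(f"key A: {key_a!s:5}  key B: {key_b!s:5}  -> cage {state}")
```

Using the same aptamer for both locks reduces this scheme to a single-input gate, whereas two different aptamers give the two-input AND behavior described above.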
In 2018, a collaborative team (led by the Nie, Yan, Ding, and Zhao labs) extended this ''nanorobot doctor'' concept to an in vivo application for treating tumor cells. 81 A rolled-up origami sheet locked with aptamers against nucleolin, a marker of tumor vasculature, was used to block the action of thrombin, a critical protein in blood coagulation (Figure 6B). The aptamers targeted the robot to a human breast cancer tumor, whereupon the rolled-up sheet opened to expose thrombin, resulting in localized clotting that killed the cancer cells. The aptamer served as both a targeting element and a functional ''lock'' for selective actuation of the structure, and the authors showed effective tumor necrosis in vivo in both mice and miniature pigs. The structures also did not raise an immunological response, as measured by cytokine production. One key limitation to these approaches is the requirement for aptamers that bind to a target of interest; extending this concept to protein-based locking mechanisms (e.g., antibodies, which exist for a wide range of targets) is an exciting future direction for protein-DNA nanotechnology. We also foresee great opportunities for combining this approach with protein- or polymer-based coatings to stabilize the nanobots to degradation and enhance their targeting to the desired cells.
Aside from targeted killing of diseased cells, reversible DNA cages provide a powerful way to switch protein activity on and off, allowing for dynamic control of enzyme function. In 2017, Andersen and coworkers used a DNA ''nanovault'' to reversibly expose and occlude a protein in order to control access to its substrate. 82 The authors designed the nanovault to fully encapsulate the enzyme α-chymotrypsin and could open and close it with DNA strands in order to control its cleavage of casein, a target protein (Figures 6C and 6D). The authors conducted a number of elegant experiments to probe the accessibility of the protein in the DNA cage and had to engineer several additional features into the structure to mitigate the inherent (and non-negligible) porosity of the DNA origami cage. In the end, they were able to switch the activity of the enzyme by roughly 3-fold between the open and closed states. Interestingly, the authors found that the best way to conjugate the enzyme to DNA was via copper click chemistry with an azide-functionalized protein (itself made through non-selective lysine chemistry with an azide-NHS ester) with an alkyne strand already incorporated into the open cage. Attempting to first modify the protein with a DNA handle and then attach it to the cage gave a lower yield of encapsulation overall. Using a more selective bioconjugation method (to ensure that the enzyme is not attached in a way that occludes its active site) could enhance its function in the future. Alternative locking mechanisms for DNA origami boxes, such as pH-switchable locking mechanisms, 87 photocleavable linkers between the structure and the POI, or photoswitchable protein latches (see Protein-Actuated DNA Nanomachines), could further enhance dynamic protein control via this strategy. Around the same time as the Andersen nanovault work, the Kim lab reported a pH-responsive tetrahedral DNA cage that could reversibly control the activity of RNase A. 
83 Their cage, composed of only four strands, was a much simpler structure than the nanovault and functioned through the reversible assembly and disassembly of one side via a pH-switchable i-motif ( Figure 6E). The authors built off Turberfield's previous work 72 ( Figure 4B) to ensure that the RNase A was located inside the cage, and conjugation was achieved by copper-free click chemistry after protein modification with a cyclooctyne-NHS conjugate. Encapsulation protected the enzyme from both degradation by proteases and binding to antibodies, and its activity could be switched by roughly 2-fold between the open and closed states. If the enzyme was positioned outside the cage, by contrast, no such modulation was seen, demonstrating the importance of protein orientational control for creating functional nanostructures.
Rather than expose or occlude a protein's active site, a second mechanism for switching enzymes on and off is by controlling the availability of reagents or other cofactors necessary for catalysis. In 2013, the Liu lab demonstrated the reversible regulation of an enzyme cascade by using a DNA ''nanotweezer'' to control the distance between the two proteins. 88 The tweezer was driven between two states with strand-displacement mechanisms, which in turn modestly affected the efficiency of intermediate transfer between the two enzymes. That same year, the Yan and Fu labs used the same tweezer structure to control the availability of an NAD+ cofactor to the glucose-6-phosphate dehydrogenase (G6pDH) enzyme (Figure 6F). 84 Similar to Liu's work, this DNA nanomachine effectively controlled protein function in a reversible fashion, enhancing catalytic activity by ~5-fold in the ''closed'' versus ''open'' states. Shortly thereafter, Yan and coworkers reported a similar mechanism for guiding protein function: tethering NAD+/NADH on a ''swinging arm'' in order to shuttle electrons between G6pDH and malic dehydrogenase (MDH) with nanoscale control. 89 The authors demonstrated both the dependence of the enzymatic rate on the distance from the swinging arm and its relative orientation to the proteins (i.e., pointing ''towards'' versus ''away''), as controlled by the helicity of DNA. In this case, the proteins were tethered to the scaffolds by non-specific lysine chemistry, so even greater control might be possible through their careful positioning with active sites oriented toward or away from the cofactor. In two follow-up works, the Yan and Yang labs imbued these systems with greater control over protein function by using light to reversibly tether the cofactor-laden swinging arm away from the enzymes 90 or controlling its relative position between two different sets of enzymes (Figure 6G). 
85 Taking a different tack, Ke, Bellot, and coworkers used a DNA origami ''nanoactuator'' to control the distance between two halves of a split GFP; mechanically bringing them into close proximity with DNA locking strands reconstituted the protein and turned fluorescence on ( Figure 6H). 86 Using DNA nanomachines to control protein function in these ways provides a powerful way to build nanoscale ''chemical plants'' (or synthetic cells). Such systems can also be used to more precisely probe protein function in biological contexts, a topic to which we turn next.
Using Hybrid Protein-DNA Nanostructures to Answer Biological Questions
One of the most promising applications of DNA nanotechnology in the past decade has been using structures to answer questions of biological importance. In cells, the nanoscale distribution of proteins is critical to their function, as are their oligomerization state and the forces they apply (or that are applied upon them). DNA origami constructs are particularly good at controlling the spacing and stoichiometry of biological ligands, as well as exerting tunable biophysical forces on proteins, allowing researchers to probe systems with a precision not possible with other approaches. In conjunction with single-molecule imaging techniques, these hybrid nanostructures have opened up a new frontier in science that will undoubtedly spread to diverse subfields of biology. Early experiments with DNA tile arrays modified with peptides demonstrated that antibodies could be patterned with nanoscale accuracy by binding to their antigen, 91 highlighting the potential for single-molecule measurements via atomic force microscopy (AFM). However, it was not until the adoption of the origami approach 10 that DNA nanostructures began to be used widely for advanced biological experiments. We note that in this section we only discuss biological studies using pre-formed hybrid protein-DNA nanostructures. Examples where a DNA nanostructure alone was used for visualizing or probing the function of a protein acting upon it (such as the elegant work by the Endo and Sugiyama labs on DNA-manipulating enzymes 92-94) will not be covered.
From its inception, one of the key applications of DNA origami has been as a ''molecular breadboard'' capable of controlling the positioning of other species with ~5 nm resolution. In cells, the nanoscale presentation of extracellular ligands can cluster their receptors, leading to activation of intracellular pathways. Thus, in the past 5 years there has been an explosion of interest in using DNA-scaffolded proteins to probe how the distance and valence of protein presentation affect cellular behavior. In 2014, the Hogberg and Teixeira labs used a ''nanocaliper'' to present two (or more) copies of ephrin A5 in order to probe the optimal distance for activation of the EphA2 receptor (Figure 7A). 95 Although other approaches for clustering the EphA2 receptor had been developed for probing its signaling (which is often disrupted in cancer), none was able to control the distance between exactly two ligands in a tunable fashion. The ephrin A5 ligand was conjugated to amine-labeled DNA with a bifunctional linker and bound to complementary handles on a rod-like origami structure. The distance between proteins was set at either 40 or 100 nm; structures with only a single ligand (which should not be able to dimerize the receptor) were used as controls. The multivalent structures bound more tightly to EphA2 (as measured by surface plasmon resonance), but only the origami with 40 nm spacing produced more receptor phosphorylation and downstream effects on breast cancer cells than monomeric ligands. Interestingly, the response to the 100 nm spacing was identical to that of the origami bearing a single ligand, and increasing the density of ligands to eight (spaced 14 nm apart) did not yield an increase over the dimers spaced 40 nm apart. Such precise control of both the number of ligands and the distance between them is not possible with any other system, especially when rigidity must be maintained over long (e.g., tens of nanometers) distances.
In 2015, the Hogberg lab extended this nanocaliper concept to probe the binding of antibodies to antigens immobilized with a tunable distance on origami. 96 With this system, the optimal distance between ligands was determined to be 16 nm, and the precise effect of antibody type and linker length or flexibility could be probed as well. Also in 2015, the Niemeyer lab used microarrays modified with single-stranded DNA (ssDNA) handles to immobilize several different rectangular origami scaffolds, each of which displayed a different nanoscale pattern of epidermal growth factor (EGF) ligands. 97 This hierarchical approach, which the authors termed ''multiscale origami structures as interface for cells'' (MOSAIC), allowed for control of EGF density and spacing at the nanometer length scale and top-down patterning of different origami scaffolds at the micron length scale. Cells were then adhered to these surfaces, enabling detailed studies of ligand distributions on bioactivity in a way not possible with other surface immobilization techniques such as supported lipid bilayers or direct surface grafting.
We next discuss three specific fields where protein-DNA nanoassemblies have been used for probing biology: (1) the collective action of motor proteins, (2) density-dependent function of confined proteins, and (3) nucleosome assembly. In 2012, Reck-Peterson and coworkers demonstrated that a rigid DNA origami bundle could be used for attaching the molecular motors dynein and kinesin-1, which walk along microtubules but with opposite polarities. 98 By using a SNAP-tag fusion, the authors could precisely pattern both the number and distribution of these two proteins on the origami ''chassis,'' allowing for unprecedented analysis of their movement on microtubules at the single-molecule level (Figures 7B and 7C). In particular, the authors could probe the ''tug of war'' between these two oppositely oriented motors and see which one dominated as a result of differences in affinity and stall force; attaching one protein type with a photocleavable linker enabled their dynamic release and dominance of the other motor type. Without an addressable scaffold such as DNA origami, it would not be possible to create such controlled protein assemblies. We also note that the size of DNA origami (approximately tens to hundreds of nanometers) is ideally suited for positioning multiple proteins (approximately one to tens of nanometers in size) without interference; smaller nanoscaffolds would be hard pressed to retain such precision. In 2015, Sivaramakrishnan and coworkers used DNA origami scaffolds to attach two different motor proteins, myosin V and myosin VI, with controlled spacing and number. 105 In this way, they could probe the collective action of multiple motors but on a defined system that avoided the complexities of natural actin-myosin ensembles (for example, in muscle fibers). The spacing between motors could be tuned (14, 28, or 42 nm) to match the spacing of various natural filaments.
The authors found that neither the density nor the number of myosins affected the gliding speed of the origami on surfaces coated with actin filaments, which they attributed to the ensemble of motors acting as an ''energy reservoir'' that allowed them to function more consistently over a larger range of loads. One year later, the Iwaki lab reported a DNA origami that served not merely as a scaffold but rather as a ''nanospring'' force sensor for myosin V and VI. 99 A coiled nanostructure, with a precise force-extension curve determined with optical tweezers, was attached to motor proteins bound to actin filaments, and the extension of the spring was used as an output for the force exerted on it (Figure 7D). The nanostructure matched the stall force of the motor proteins but over a much shorter distance than with dsDNA, demonstrating the utility of origami. The authors were able to not only probe the tug of war between myosin V and VI but also demonstrate that myosin VI switched between hand-over-hand- and inchworm-type motions depending on the force exerted on it. The origami-based experiment was also more tractable than traditional approaches using optical tweezers and allowed for precise protein patterning not possible with other methods.
The second area where protein-DNA nanostructures have played a key role is in determining the function of proteins under nanoscale confinement. DNA origami scaffolds are ideal for creating a defined volume, and the exact number of molecules entrapped therein can be tuned through protein-DNA conjugates that orient the proteins into that volume. In 2018, two reports (one by the Lin and Lusk labs 100 and the other by the Dietz and Dekker labs 101) used DNA origami nanorings to assemble the proteins that make up the nuclear pore complex (Figures 7E and 7F). Both studies probed the function of FG-nups, disordered repeat proteins that are rich in phenylalanine-glycine (''FG'') residues and control the flow of molecules such as transcription factors or ribosome components into the cell nucleus. The complexity of the nuclear pore, combined with the unstructured nature of the FG-nup proteins, has made studying their properties (e.g., how they selectively control the flow of different molecules into the nucleus) difficult. The two studies discussed were able to incorporate 32-48 copies of the protein in an inward-directed fashion (through site-specific cysteine chemistry) and probed their assembly by using AFM, transmission electron microscopy (TEM), cryo-EM, and molecular modeling approaches. The dynamics of protein occlusion of the ring could be visualized, and control experiments with outwardly oriented proteins or mutants bearing hydrophilic residues demonstrated that both the geometry and chemical composition of the proteins are important for blocking the nuclear pore. Critically, the DNA origami rings provided a scaffold highly similar to the dimensions and shape of the nuclear pore, allowing the FG-nups to assemble in a biomimetic fashion not possible with other templates. 
These systems also open up the possibility of probing the function of multiple different FG-nups or creating assemblies with multiple types of proteins (as is seen in the native nuclear pore) in order to probe the effect of protein architecture on function and selectivity of molecular transport.
Although the above works investigated nuclear pore proteins, DNA origami provides an attractive platform for nucleating other protein assemblies as well. In 2016, the Shih and Rothman labs used ring-like origami to template SNARE proteins, 102 which drive membrane fusion in processes such as vesicle and neurotransmitter transport. Building off the work by Lin and Shih to template liposomes with DNA origami, 80 the nanoring scaffolds could simultaneously encapsulate a spherical liposome and a defined number of SNARE proteins through programmable DNA handles ( Figure 7G). Additional DNA handles could be used to dock the origami-encircled liposomes with a lipid bilayer, allowing for a detailed study of membrane fusion as a function of SNARE number. By decoupling the docking and fusion steps, the authors showed that only one or two SNARE proteins were necessary for the process, resolving an outstanding debate in the field. It is particularly important to highlight that this system combined several key aspects of protein-DNA nanotechnology: (1) controlled orientation of SNARE proteins, (2) assembly of a defined number of proteins with controlled spacing, and (3) integration of proteins with other molecules, such as lipids, in a highly biomimetic fashion. Aside from SNARE proteins, in 2019 the Fan and Zhong labs demonstrated that DNA-templated CsgB proteins could be used to nucleate bacterial curli nanofibers by polymerizing CsgA proteins. 106 This approach was highly mimetic of natural fiber nucleation and growth from the surface of E. coli cells, and using the origami nucleator as an easily visualized ''molecular landmark'' allowed the growth kinetics of the fibers to be determined by high-speed AFM.
A third recent subfield in biology where DNA nanostructures have found applications for single-molecule biophysics is in probing nucleosome assembly and forces. Three reports in 2016 (one by the Poirier and Castro groups 103 and two by the Dietz lab 104,107) used DNA origami hinge-like structures to investigate either the wrapping of dsDNA by nucleosomes (Figure 7H) or the interaction force of two nucleosomes as they were brought into contact (Figure 7I). In all three cases, the origami served as an easily visualized output for these supramolecular interactions, a lever-like ''amplifier'' of a much smaller-scale molecular interaction, through direct imaging (via TEM) or indirect methods (such as FRET between two dyes attached to the devices). The effect of parameters such as DNA length, salt concentration, nucleosome number, or transcription factor binding on nucleosome wrapping (or the effect of histone acetylation on the interaction forces between nucleosomes) could be probed with unprecedented precision. These approaches all relied on a thorough understanding of the biophysical properties of the hinged origami structures, which are governed by electrostatic and entropic spring effects, and how these properties relate to the force applied to their ends. However, the researchers demonstrated that DNA origami constructs are uniquely suited for single-molecule experiments in that they combine rigidity, ease of modification at precise locations, and multiple possible output modes. Future experiments that directly apply tunable forces at multiple points on a protein's surface (e.g., the ''Bohr-radius'' resolution tweezer developed by the Dietz lab) 108 would allow for precise unfolding experiments akin to optical tweezers but with a much simpler setup. 
We hasten to add that for monomeric proteins, accomplishing this goal will require modification of the protein in two distinct locations with high site specificity, short linkers, and defined rigidity (as discussed in Controlled and Rigid Orientation of Proteins on DNA Nanoscaffolds).

Building Nanostructures with Protein and DNA Structural Components
All of the examples presented in the previous three sections employed a pre-formed DNA scaffold upon which proteins were arrayed for either probing or controlling their function. In this section, we turn to a conceptually distinct area of protein-DNA nanotechnology: integrating proteins and DNA into nanostructures that contain both molecules as structural components. We focus specifically on assemblies where each plays a unique role in the final assembly and could serve as a scaffold for other materials or molecules. Although such structures present new challenges (namely, balancing the self-assembly of two different macromolecules, each with its own requirements and physicochemical behavior), they also offer several distinct advantages. First and foremost, proteins possess a wide range of unique structural motifs that DNA does not (such as α helices, coiled coils, β sheets, and collagen triple helices), with varying mechanical properties and nanoscale display of chemical functionality. Second, proteins have the potential for functionality ranging from catalysis to protein binding to stimulus-responsive behavior. Third, the chemical diversity of the 20 canonical residues (and dozens of reported non-canonical amino acids) 55 opens up novel chemical attributes beyond the uniformly anionic phosphate backbone of DNA. Fourth, proteins are a potentially ''higher-resolution'' scaffold than DNA, allowing functional groups to be positioned in closer proximity; for example, a typical α helix has a pitch of 0.54 nm, compared with 3.4 nm for the B-form DNA helix. Fifth, most proteins do not require the elevated (and supraphysiological) concentrations of divalent cations such as magnesium that many complex DNA nanostructures do, and protein complexes can form highly specific structures in cellular environments without high-temperature annealing. 
Although the reports below focus on full-length folded proteins, we point out that synthetic peptides-such as collagen-mimetic peptides, 109 coiled coils, 110,111 and peptide amphiphiles 112 -have also been integrated with DNA for the creation of nanoassemblies with both structural motifs and represent a promising direction that takes advantage of polypeptide properties while skirting some of the challenges of recombinant expression.
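The ''higher-resolution'' argument above can be made concrete with a short worked calculation. The pitches (0.54 and 3.4 nm) are the values quoted in the text; the repeats per turn (3.6 residues for an α helix and 10 base pairs for the idealized B-form helix) are textbook values assumed here, and B-form DNA in solution is closer to 10.5 base pairs per turn.

```latex
\[
d_{\alpha} = \frac{0.54\ \text{nm}}{3.6\ \text{residues per turn}} = 0.15\ \text{nm per residue},
\qquad
d_{\text{B-DNA}} = \frac{3.4\ \text{nm}}{10\ \text{bp per turn}} = 0.34\ \text{nm per bp}
\]
\[
\frac{p_{\text{B-DNA}}}{p_{\alpha}} = \frac{3.4\ \text{nm}}{0.54\ \text{nm}} \approx 6.3
\]
```

Under these assumptions, a position that faces the same side of the helix recurs about six times more often along an α helix than along a DNA duplex, and adjacent attachment sites are roughly half as far apart along the helical axis.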
One of the first examples of a nanostructure comprising both DNA and protein structural elements was reported in 2012 by Mao and coworkers, who used polyhedral cages constructed from the symmetry-guided self-assembly of branched DNA tiles. 113 The authors attached biotin to one of the strands, and as a result of the symmetry of assembly, this approach yielded three of these molecules projecting into the triangular cavity. Adding streptavidin (which can bind up to four biotin molecules) in a second step effectively ''plugged'' each cavity in a highly multivalent fashion ( Figure 8A). Several protein-DNA cages-including tetrahedral, octahedral, and icosahedral geometries-were designed, and their 3D structure was confirmed by cryo-EM to 29 Å resolution. However, these results were reported prior to the ''resolution revolution'' in cryo-EM arising from the direct electron detector, 114 so it is likely that today much higher-resolution structures of protein-DNA cages can be obtained. The ability to extend beyond streptavidin to more structurally and functionally complex proteins would yield structures that mimic the dense protein shell of viral capsids but with programmable sizes and novel symmetries. However, to rival protein assemblies such as viral capsids (e.g., to minimize undesired porosity), a rigid and preferably tight interface between the DNA frame and the protein ''walls'' will be necessary. Nonetheless, this example demonstrates that hybrid nanocages could be constructed from a small set of building blocks in two steps with a combination of symmetry and multivalency through well-defined protein-DNA interfaces.
In 2015, the Mayo lab achieved a simple yet elegant approach to protein-DNA hybrid structures by using the engrailed homeodomain (ENH) and dsDNA. 115 The ENH protein binds dsDNA (sequence: TAATNN) with nanomolar affinity, and the authors computationally reengineered its surface with Rosetta so that the protein formed a C2-symmetric homodimer. Co-assembling this dimeric protein with dsDNA bearing two identical binding sites rotated by 180° yielded 1D nanofibers (Figures 8B and 8C); importantly, no structure would be possible without the protein (or with only monomeric protein) or without the DNA, so the fibers were truly hybrid nanostructures. The relatively small size of the two components resulted in narrow fibers (~15 nm, although the length could extend up to 300 nm), but extending this approach to larger proteins capped with DNA-binding domains, as well as DNA nanostructure tiles or origami, could give significantly larger assemblies. As we will discuss in Protein-Actuated DNA Nanomachines, such structures are prime candidates for integrating computational protein design (for example, to control the angles and rigidity between DNA-binding domains and the structural domains of a protein building block) with DNA nanotechnology to create components that assemble without chemical modification of the protein.
In 2017, Dietz and Praetorius reported an alternative, and far more complex, method for hybrid protein-DNA structures by using sequence-specific DNA-binding proteins. 116 In DNA origami, short single-stranded ''staples'' are used to fold the long single-stranded bacteriophage M13 ''scaffold'' strand into arbitrary shapes. 10,11,12 The authors realized that the staple strands could be replaced with sequence-specific DNA-binding proteins called transcription-activator-like (TAL) effector proteins, which bound to two distal parts of a double-stranded scaffold, to fold it into distinct shapes through an analogous mechanism ( Figure 8D). TAL proteins consist of 34 amino acid repeats, each of which can bind to a specific DNA base pair ( Figure 8E), so concatenating 21 distinct repeats enabled binding of two turns of B-form DNA. By linking two such binding domains with a flexible linker, the authors could bring distal parts of the scaffold into close proximity. In an experimental tour de force, the authors carried out extensive characterization of this system to probe the exact design of the staples and the design strategies to give well-formed structures, generating shapes such as circles, squares, a Drigalski spatula, and a four-armed tetrahedral structure (Figures 8F and 8G). The approach could be extended to multi-layered assemblies (e.g., a four-helix bundle) reminiscent of 3D DNA origami, 11 and structures could even be generated with a cell-free transcription and translation system from plasmids encoding the staple proteins. This last result was particularly important because it strongly suggests that these nanostructures could be formed isothermally (i.e., without annealing) in a cellular environment, paving the way for protein-DNA nanotechnology in vivo. The authors also fused staples with GFP as a model cargo to demonstrate that the final structures could, in principle, display functional proteins. 
Overall, this landmark work demonstrates several key elements of protein-DNA nanotechnology: orientational control (given that TAL proteins bind rigidly and in a defined manner), hybrid structures with both protein and DNA components, and the possibility for highly functional assemblies for probing biology. Future work integrating this system with other proteins has the potential to generate user-defined, intracellular nanomachines that can carry out complex functions.
In contrast with the non-covalent protein-DNA interactions used in the reports above, recent years have seen additional interest in building hybrid structures from covalent protein-DNA conjugates. In two reports in 2018, the Aida and Mirkin labs independently modified multivalent proteins with multiple DNA strands and then linked them together either directly or with complementary linkers (Figures 9A and 9B). 117,118 Although the proteins used were different (GroEL by the Aida lab and β-galactosidase by the Mirkin lab), both approaches relied on site-specific chemistry (namely, alkylation of mutagenically introduced cysteine residues at defined locations) to generate a well-defined protein-DNA hybrid molecule. But rather than attach these proteins to a preformed DNA scaffold, their DNA-mediated polymerization generated 1D nanofibers alternating between protein and DNA structural units. In both cases, the multivalent association of DNA strands drove the formation of fairly rigid linear structures, and the site specificity gave the assemblies a defined directionality that would not have been possible with non-specific approaches such as lysine acylation. The fibers could also be de-polymerized through the addition of displacement strands to break the DNA hybridization. Protein surfaces are inherently asymmetric, so mutagenesis can create anisotropic functionalization in a way not possible with more isotropic surfaces, such as those of inorganic nanoparticles. Merging these approaches with bioactive proteins, or multivalent DNA nanostructures that can control the diameter and stiffness of the protein fibers, would be particularly useful in functional biomaterials (see Protein-DNA Bionanomaterials).
Aside from linear arrays of proteins, 3D assemblies and crystals with both protein and DNA components can be created from oligonucleotide-functionalized virus capsids 123,124 or smaller multivalent proteins. 119 Although crystals are not nanostructures per se, the systems described herein are composed of nanostructured repeating units, so we feel it is appropriate to include them in this section. In 2015, the Mirkin lab reported that catalase proteins (tetrameric, heme-containing enzymes) could be densely modified with DNA through a two-step strategy: functionalization of surface lysines with an azide-NHS ester and subsequent copper-free click chemistry with cyclooctyne-labeled DNA. 119 This two-step protocol was presumably necessary to get a higher yield (up to 15 strands per tetramer) than with a DNA-NHS ester conjugate directly. Combining two sets of enzymes with complementary sticky ends and thermally annealing them resulted in 3D crystal lattices with body-centered cubic (BCC) symmetry (Figures 9C and 9D). The enzymes were still functional in the crystals and could be co-assembled with DNA-modified nanoparticles for the creation of a hybrid lattice bearing both components. In an elegant follow-up work, the authors could switch the exact lattice symmetry of the hybrid protein-nanoparticle crystals from a BCC to an AB2 packing by moving the modification of DNA from lysines (which were evenly and spherically distributed) to cysteines (which yielded fewer handles that were more asymmetrically distributed), demonstrating the power of site-specific bioconjugation to yield a tunable protein-DNA building block. 125 Creating ''Janus'' protein-DNA nanoparticles by merging cysteine-specific DNA-based dimerization with a dense lysine-specific DNA coating yielded a hexagonal symmetry (Figure 9E). 120 Altogether, these reports show the great potential of geometrically defined assemblies, where the DNA display and valence are controlled by the protein surface, for creating hybrid protein-DNA nanostructures.
Figure 9 (caption fragment). ... 117 and McMillan et al. 118 Copyright 2018 American Chemical Society. (C and D) Strategy for creating a 3D crystal from proteins modified with multiple DNA handles bearing sticky ends (C) and SAXS profile and TEM micrograph of protein-DNA crystals (D). Reprinted with permission from Brodin et al. 119 (E) Creating a ''Janus particle'' by using two proteins for assembly into complex lattice geometries. A unique cysteine residue on the proteins results in an anisotropic homodimer, whereas lysine modification gives a dense DNA shell for crystal formation. Reprinted from Hayes et al. 120 Copyright 2018 American Chemical Society.
Also in 2018, Tezcan and coworkers described a different approach for assembling 3D protein-DNA crystals. Rather than relying solely on DNA hybridization interactions to drive the assembly of proteins, the authors used proteins that included engineered intermolecular interactions. 121 For this purpose they selected the protein RIDC3 (which the Tezcan lab had redesigned to self-assemble into crystalline structures upon the addition of zinc ions) and modified it with DNA at a unique, mutagenically introduced cysteine. Mixing proteins with complementary oligonucleotide handles resulted in the rapid assembly of crystalline materials driven by both DNA hybridization and zinc-mediated protein-protein interactions (Figure 9F). This approach differed from many others described in this section in that the authors did not specifically design the system to form one particular assembly; indeed, many highly divergent arrangements of the protein and DNA building blocks were possible. The authors combined a suite of structural characterization techniques, such as AFM, scanning electron microscopy, small-angle X-ray scattering (SAXS), and cryo-EM, with computational simulation of over 50,000 protein orientations to yield four possible models for the hybrid assemblies. By systematically knocking out putative metal-binding interactions via alanine scanning, they determined that only one model fit all the experimental data. Interestingly, the final model contained several pH-dependent protein-DNA interactions that had not been explicitly designed, including both hydrogen bonds and salt bridges between protein side chains and the phosphate backbone. This result demonstrated both the complexity and difficulty of creating hybrid protein-DNA assemblies with multiple interaction ''modes,'' as well as the opportunities for creating complex protein-DNA interfaces that more directly mimic those between proteins. 
Improving simulation tools for protein-DNA hybrid nanostructures to explicitly engineer these non-obvious interactions would enable a number of the complex applications described in Future Research Directions in Protein-DNA Nanotechnology and endow protein-DNA nanotechnology with many of the impressive capabilities already possible with Rosetta-based protein design. 2
In 2019, our lab reported a novel design approach for hybrid protein-DNA nanostructures by constructing a tetrahedral cage self-assembled from a triangular DNA structure with three identical ssDNA handles and a homotrimeric protein modified with complementary oligonucleotides (Figure 9G). 34 Both the protein and the triangular DNA ''base'' are integral components of the nanostructure: in the absence of either, no cage will form. In this case, site-specific chemistry is critical to give a well-defined structure, and the location must be chosen carefully to avoid deleterious strain or steric hindrance (see below). For the protein, we chose the C3-symmetric KDPG aldolase, which is stable to over 80 °C and readily amenable to mutagenesis and recombinant production in E. coli. We attached 21-nt DNA handles to mutagenic cysteines with a heterobifunctional crosslinker and purified the triply modified trimer away from incompletely modified proteins via anion-exchange chromatography. This protein-DNA building block was then co-assembled with the triangular DNA base bearing complementary handles, resulting in a wireframe DNA cage ''capped'' by the protein trimer. The DNA base could be readily tuned in size (three and four turns yielded structures 10 and 14 nm on an edge, respectively), and a number of indirect experiments demonstrated the cage structure as designed. To prove the versatility of our approach (and avoid disulfide-induced aggregation of the proteins), we also attached the DNA by using copper-free click coupling with trimers bearing a 4-azidophenylalanine residue, introduced by the Schultz method. 
Interestingly, the site of modification affected the yield of cage formation: if the DNA handles were placed too close together, only one arm of the base could bind to the trimer because of electrostatic repulsion. These hybrid protein-DNA cages are, to our knowledge, the first example of a discrete and monodisperse hybrid nanostructure (i.e., not an extended nanofiber or crystal) made from chemical conjugation of oligonucleotides to a protein surface. Future studies fusing targeting peptides or proteins to the trimer, creating nanostructures with multiple copies of the protein, and incorporating stimulus-responsive proteins will yield highly functional structures for multiple applications.
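As a rough consistency check on the edge lengths quoted for the three- and four-turn DNA bases, the sketch below converts helical turns into an approximate duplex length. The constants (10.5 base pairs per turn and 0.34 nm rise per base pair) are textbook values for B-form DNA assumed here, not parameters taken from the original report, and the function name is hypothetical.

```python
# Assumed textbook geometry for B-form DNA (not values from the cited work).
BP_PER_TURN = 10.5      # base pairs per helical turn
RISE_NM_PER_BP = 0.34   # axial rise per base pair, in nm


def edge_length_nm(turns: float) -> float:
    """Approximate length of a straight duplex edge spanning `turns` helical turns."""
    return turns * BP_PER_TURN * RISE_NM_PER_BP


if __name__ == "__main__":
    for turns in (3, 4):
        print(f"{turns} turns: about {edge_length_nm(turns):.1f} nm")
```

With these assumptions the estimate gives roughly 10.7 and 14.3 nm, in line with the 10 and 14 nm edges stated above; the estimate ignores any contribution from junctions or vertices.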
Also in 2019, the Song lab used protein-DNA hybrids to create dendrimeric nanoparticles composed of both DNA and protein structural units. 122 To avoid chemical conjugation of DNA, the authors used tetrameric traptavidin (a more stable mutant of streptavidin) and modified it with four distinct DNA handles through the corresponding biotin conjugates. Key to their approach was the purification of tetramers bearing exactly four distinct handles from conjugates stemming from the statistical mixture of all possible combinations, which they accomplished by using a sequential magnetic-bead purification method (Figure 9H). Although low yielding for the final tetra-functionalized protein (~8%), this method avoids the myriad challenges of site-specific bioconjugation, and the traptavidin-biotin interaction is virtually irreversible. The authors used the multivalent protein-DNA conjugates to assemble dimers, trimers, and tetramers of DNA-functionalized gold particles, as well as hierarchical, size-controlled dendrimers. The dendrimers in particular represent a new paradigm in DNA-mediated protein assembly, which is to create nanostructures with a defined size and both DNA and protein building blocks; extending this approach to multivalent proteins with greater functionality (e.g., through genetic fusions to an oligomeric protein) will greatly expand the applications of these structures. One next logical step for both these traptavidin-DNA conjugates and our protein-DNA cages will be to use addressable DNA scaffolds to direct the conjugation of additional strands and thereby break the intrinsic symmetry of the protein assemblies without laborious purification.
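To illustrate why a tetramer bearing four distinct handles is a minority species in a statistical mixture, the sketch below enumerates every way four binding sites can be filled from four handle types. It is an idealized model and not the authors' analysis: it assumes that each site independently captures any of the four biotinylated handles with equal probability and that all sites are occupied. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def fraction_all_distinct(n_sites: int = 4, n_handles: int = 4) -> Fraction:
    """Fraction of random site assignments in which every site carries a different handle.

    Assumes independent, equally likely binding of each handle type at each site.
    """
    total = 0
    distinct = 0
    for assignment in product(range(n_handles), repeat=n_sites):
        total += 1
        if len(set(assignment)) == n_sites:
            distinct += 1
    return Fraction(distinct, total)


if __name__ == "__main__":
    f = fraction_all_distinct()
    print(f"{f} = {float(f):.1%}")
```

Under these assumptions only 24 of the 256 possible assignments (about 9%) carry one copy of each handle, which is of the same order as the roughly 8% yield reported for the tetra-functionalized protein; real yields also depend on binding kinetics and purification losses that this model ignores.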

FUTURE RESEARCH DIRECTIONS IN PROTEIN-DNA NANOTECHNOLOGY
The four areas listed above demonstrate the great diversity in both fundamental and applied nanostructures that can be achieved through the merging of proteins and DNA scaffolds. We next turn to several key areas of future investigation and application in this field that are of particular interest to our lab and many others. We place a special emphasis on creating nanostructures where the protein and DNA scaffold make up a continuous, hybrid ''macromolecule'' (through either covalent or supramolecular interfaces). This goal, in turn, will require advancements in bioconjugation methods or the integration of these methods with approaches such as affinity interactions (e.g., binding peptides and aptamers) for further control. The net outcome will be to create a set of protein-DNA building blocks that can be combined in a fully modular fashion (akin to the Lego-like DNA ''bricks'' developed by the Yin group) 126 in a highly predictable and computationally designable manner and integrate protein functionality and diversity.

Controlled and Rigid Orientation of Proteins on DNA Nanoscaffolds
Although the examples in Controlling Protein Orientation on a DNA Scaffold demonstrated the potential for controlling protein orientation on DNA scaffolds, many opportunities remain in this field both for chemical approaches and for novel applications. To further enhance the defined relationship between proteins and DNA scaffolds, two key criteria will be crucial (Figure 10A): (1) the reduction of linker length between the DNA strand and the protein surface and (2) the attachment of the protein at two or more points on the resulting nanostructure. We can accomplish the first goal by avoiding commercial linkers (which generally include 6-12 bonds between the two components) and synthesizing custom phosphoramidites with bioconjugation handles directly off the 5′ or 3′ ends or attached to the backbone (Figure 10B). To compensate for decreased efficiency due to shortened linkers, powerful reactions such as inverse-electron-demand Diels-Alder reactions between tetrazines and trans-cyclooctenes 127 or oxidative couplings 58 in conjunction with non-canonical amino acids might be necessary. It will be particularly difficult to attach a second (or third) unique DNA strand to a protein surface, especially if longer strands are used, because of electrostatic and steric effects. In these cases, single nucleotides or short strands can be attached, and then enzymatic ligations can be applied to extend them (Figure 10C), as in the terminal deoxynucleotidyl transferase strategy 51 or a splint-based ligation strategy. Alternatively, the first DNA modification can be used to tether a protein on a DNA structure and enhance the local concentration of the second strand (Figure 10D); strand displacement to remove the nanostructure can then follow. 
This strategy is similar to Gothelf's polyhistidine-Ni(NTA) strategy 38 or the photo-affinity labeling of proteins with DNA with photo-crosslinkable moieties such as diazirines 128 and creates the potential for the size and shape of the nanostructure to be used for directing the second modification.
Although a protein can be attached to a DNA nanostructure at multiple points through two sequential and site-specific bioconjugation steps, another option is to use a binding moiety-such as a peptide or aptamer-to help ''anchor'' the protein on the structure. Protein-protein interactions generally rely on a binding interface composed of multiple weak interactions between the two molecules; DNA structures are ideal nanoscale scaffolds for positioning multiple weak binders to tightly immobilize a protein. Such affinity interactions could be used in concert with a covalent linkage and could bind directly to the protein surface or to an engineered domain such as a fusion (for which binding peptides already exist), a coiled-coil peptide ( Figure 10E), or a peptide that binds to certain DNA sequences, such as an ''A-T hook.'' 129 Novel affinity molecules can in turn be discovered through methods such as phage display (for peptides) or SELEX (for aptamers), and introducing photo-crosslinking moieties can convert a noncovalent interaction into a covalent bond in order to lock it in place. Small, protein-based binding agents such as nanobodies or scFv antibody fragments-which can bind to the protein or a fusion thereof and are more amenable to recombinant expression and DNA modification-can also be used ( Figure 10F). Alternatively, multiple short binding peptides (which each target a different part of the protein surface) 68 could be spatially arranged on a DNA scaffold to best ''fit'' a protein and bind it with high affinity and rigidity ( Figure 10G). The work by Sacca and coworkers 69 mentioned in Controlling Protein Orientation on a DNA Scaffold ( Figure 5D) is an example of such DNA-enabled mimics of traditional protein-protein interfaces, albeit with multiple copies of a single peptide sequence for binding to a multivalent assembly. Extending this approach to monovalent proteins would greatly expand the palette of applications possible. 
(B) Example of a phosphoramidite for introducing an alkyne bioconjugation handle into the DNA backbone. Attaching two such residues 10-11 nt apart will allow for ''pinning'' a protein on a DNA backbone. (C) Attaching a short (4-nt) DNA strand to a protein and then elongating it with an enzyme (e.g., via splint ligation) could circumvent the challenges with attaching full-length DNA strands to a protein.
(D) Rather than modifying a protein with two DNA strands, the first strand can be used to position it close to the second bioconjugation handle, resulting in a proximity-enhanced second reaction. (E) A coiled-coil association between a protein fusion and a peptide linked to a DNA structure can help enhance a rigid interface without requiring a second bioconjugation reaction. (F) A small binding agent such as a nanobody can be used to help anchor a protein on a DNA scaffold. (G) DNA-scaffold-templated peptides bind to different faces of a protein in order to ''pin'' it, akin to antibody-antigen binding.
(H) Comparison of the structure of natural DNA (which is anionic) with those of PNA and DNG, which are neutral and cationic, respectively. The colored spheres represent the natural DNA bases (A, T, C, and G).
In parallel with these chemical methods, new strategies will need to be developed for purifying multiply modified proteins or non-covalent protein-DNA assemblies, including chromatography (for example, anion-exchange chromatography, which can be particularly effective given the large number of negative charges introduced by appending DNA) or gradient-ultracentrifugation methods. The DNA handles can also be used as affinity tags to pull the desired conjugate out of a mixture through either modification of a solid support with the complementary handle or temporary attachment of the protein to a larger DNA nanostructure to aid purification (as demonstrated by Fromme and coworkers). 74 Given that many proteins are cationic (or have cationic domains, such as heparin- or DNA-binding modules), additional issues could arise as a result of non-specific aggregation with the DNA during the conjugation reaction. In this case, uncharged oligonucleotides, such as peptide nucleic acid (PNA), or oligonucleotides with cationic backbones, such as guanidine-PNA 130 (GPNA) or deoxynucleic guanidine 131 (DNG), can be used instead (Figure 10H).

Structural Biology on Proteins Aided by DNA Scaffolds
As mentioned in Controlling Protein Orientation on a DNA Scaffold, Ned Seeman conceived of DNA nanotechnology as a way to solve protein structures by positioning them in 3D on a repeating oligonucleotide scaffold. Several groups have in fact immobilized proteins in the cavities of 2D and 3D DNA scaffolds, though not with sufficient rigidity to yield a structural solution. [132][133][134] The Yan lab, in collaboration with our own, is actively pursuing novel designs to increase the cavity size and improve the crystal resolution 9,135,136 in order to immobilize proteins or peptides to solve their structure. As more designs with larger cavities and channels are reported, ever larger guest molecules can be immobilized in the self-assembled lattices. The key to solving protein structure on such assemblies will be rigid attachment in a defined manner on the DNA that composes the crystal, as well as high occupancy of the available cavities. The techniques and advances outlined in Controlled and Rigid Orientation of Proteins on DNA Nanoscaffolds will be critical to accomplishing a rigid and identical linkage between the protein and the DNA scaffold; binding interfaces (or molecules such as aptamers or nanobodies) could be particularly helpful in avoiding modification of the target protein with multiple DNA handles first. In addition to traditional X-ray crystallography (which requires crystals tens to hundreds of micrometers in size), new methods such as X-ray free-electron lasers 137 and cryo-EM electron diffraction 138 can be used with much smaller DNA crystals bearing proteins (tens to hundreds of nanometers). These approaches, which dramatically reduce the distance the protein must diffuse from the outside to the interior, will be useful if the proteins are not stable to the temperatures used for DNA self-assembly and thus must be soaked into the crystals after assembly.
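To give a feel for why smaller crystals help with post-assembly soaking, the following back-of-the-envelope sketch uses the standard three-dimensional diffusion estimate t ≈ L²/(6D), with L taken as half the crystal size. The effective diffusion coefficient of a protein inside a DNA lattice is not given in this review; the value below is a hypothetical placeholder, and the function name is illustrative. The ratio between the two crystal sizes does not depend on that placeholder, because the time scales with the square of the distance.

```python
def soak_time_s(crystal_size_m: float, diff_coeff_m2_s: float) -> float:
    """Characteristic time (seconds) for a guest to diffuse from the surface
    to the center of a crystal, using t = L^2 / (6 D) with L equal to half
    the crystal size."""
    half = crystal_size_m / 2.0
    return half ** 2 / (6.0 * diff_coeff_m2_s)


# Hypothetical effective diffusion coefficient inside the lattice (assumption,
# chosen only to make the example concrete).
D_EFF = 1e-12  # m^2/s

if __name__ == "__main__":
    sizes = (("100 micrometer crystal", 100e-6), ("100 nanometer crystal", 100e-9))
    for label, size in sizes:
        print(f"{label}: ~{soak_time_s(size, D_EFF):.3g} s")
    ratio = soak_time_s(100e-6, D_EFF) / soak_time_s(100e-9, D_EFF)
    print(f"soaking-time ratio (large/small): {ratio:.3g}")
```

Shrinking the crystal a thousandfold, from 100 micrometers to 100 nanometers, shortens the estimated soaking time by a factor of a million under this simple model, which ignores binding to the lattice and pore-size restrictions.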
A second field where DNA nanoscaffolds can aid in protein structural determination is cryo-EM. As described in Controlling Protein Orientation on a DNA Scaffold, the Dietz lab demonstrated the first use of a DNA origami structure as a fiducial marker in cryo-EM to select particles, protect the protein from adsorption to the air-water interface, and tune the orientation of the target. In order to solve the structure of proteins that do not intrinsically bind DNA, additional methods will be necessary for rigidly attaching them to the origami scaffold. Binding interfaces, or the ability to chemically (and seamlessly) transition from the DNA origami platform into the protein and back out again, will be critical to enforcing a defined relationship between the protein and the scaffold. DNA nanostructures have several key advantages for this purpose. First, the Shih and Lin labs have demonstrated the templating of liposomes 80 or lipid nanodiscs 139 on DNA nanostructures, paving the way for cryo-EM characterization of membrane proteins. Second, asymmetric DNA scaffolds can be used for determining the absolute orientation of the appended protein (as long as it is rigidly attached), thereby aiding particle averaging. Third, nanostructures with repeating sites for protein attachment (in a linear, 6 2D, 6,132 or helical fashion) can be designed, which will in turn allow many orientations of the target to be visualized from a single particle.
One reasonable criticism of the above proposals is that structural biology methods are constantly improving, rendering the need for a DNA scaffold such as a crystal or a cryo-EM nanogoniometer moot. Furthermore, rigid attachment of a protein to a DNA scaffold assumes some level of knowledge of the structure to begin with, so using those scaffolds to solve the structure does not add any additional value. This second point can be addressed with the use of protein-binding agents whose structures are known (e.g., nanobodies and aptamers) for binding a target with unknown structure or probing protein-protein (or protein-DNA or protein-RNA) interactions that are not known even if the individual partners are. However, even if emerging structural biology methods obviate the need for a DNA scaffold altogether, controlled attachment of proteins to crystals or large nanostructures has a range of additional advantages. For example, a 3D crystal can be used to immobilize enzymatic cascades of proteins or as a material to protect them from degradation, neither of which requires solving the structure of the protein on the crystal. Cryo-EM characterization of protein-DNA nanoassemblies could also be useful for characterizing the spacing and orientation of DNA-scaffolded proteins or the exact structure of hybrid protein-DNA nanomachines or nanostructures (as described in the next two sections). Neither of these applications requires atomic resolution, making them useful with current capabilities. Currently, researchers routinely characterize DNA nanostructures by cryo-EM to resolutions of ~10 Å in order to probe their structure in solution (e.g., to prove that an origami cage is actually a 3D object with an interior cavity), so there is great value in applying these methods to the hybrid structures described herein.

Protein-Actuated DNA Nanomachines
One of the most enduring inspirations for nanotechnology is the idea of a ''nanorobot'' that can manipulate molecular targets in programmable ways. DNA nanotechnology is arguably the most powerful method for building complex, highly anisotropic nanostructures (including those that even resemble macroscopic robots; Figure 1F), 13 and a number of actuation mechanisms have been designed to switch them between two or more states. Most of these rely on modulation of DNA hybridization or stacking, but stimulus-responsive proteins represent a powerful and largely unexplored alternative mechanism. Such nanostructures would be particularly useful in triggered cargo delivery or in switching protein activity ''on'' and ''off'' by opening a DNA box containing the protein. If photoswitchable proteins are used to open and close the box (e.g., proteins that reversibly dimerize under two different wavelengths of light 140), such structures would in effect control the protein activity by light, an approach that could be termed ''nano-optogenetics'' (Figure 11A). Combining multiple triggers in one cage would allow it to activate the protein only upon binding an intracellular target (e.g., a protein or mRNA, as demonstrated in the two key nanorobot papers covered in Using Hybrid Protein-DNA Nanostructures to Answer Biological Questions) 14,81 and in the presence of light. It is hard to achieve this complexity with other supramolecular systems, especially if reversibility of protein function is desired. In fact, in 2018 Famulok and coworkers reported a protein-driven DNA machine composed of a T7 RNA polymerase attached to a catenated DNA nanoring that functioned as a ''nanoengine'' by consuming fuel (e.g., ATP) to drive the protein motion in a continuous circular fashion (Figure 11B). 141
Other stimulus-responsive proteins that undergo a conformational change, such as calmodulin or elastin-like polypeptides (ELPs), could be used for reconfiguring DNA nanostructures in a reversible fashion, akin to ''muscles'' acting on a DNA ''skeleton'' (Figure 11C). Fuel-dependent enzymes such as molecular motors would allow for mechanical motion to be stimulus responsive and out of equilibrium. Any proteins actuating a DNA structure would have to be tethered at two (or more) site-specific locations, through relatively rigid or short linkers, in order to exert force on a DNA nanostructure in an efficient and directionally defined manner. The exact relationship between the protein and DNA could be probed with a combination of moderate-resolution cryo-EM (as described in Controlled and Rigid Orientation of Proteins on DNA Nanoscaffolds) and computational simulations.
A range of applications exist for the types of nanomachines proposed above. For example, functionalizing such a nanomachine with additional proteins that bind to biological receptors (as in Figure 7A) would allow for single-molecule studies of the forces required for biological activation. A protein-DNA nanomachine could also serve as a ''nanoinjector'' for a cell by embedding itself into the membrane and then poking it with a protein-actuated motion, similarly to viruses such as bacteriophage T4. Machines that bind to two different faces of a protein (or protein complex) inside cells could also reversibly (de)activate them by applying precise forces. Photoswitchable protein-DNA nanostructures could control the assembly (e.g., of extracellular matrix [ECM] fibers) upon exposure to light to create photoreversible hydrogels. Proteins that are not intrinsically light responsive, such as ELPs, could be rendered so by the attachment of gold nanorods (which locally generate heat upon illumination with infrared light) to the DNA scaffolds, creating truly hybrid assemblies that integrate multiple molecular functionalities. Finally, hierarchical organization of protein-DNA nanomachines into bundles or hydrogels that span multiple length scales could result in stimulus-responsive mechanical materials such as artificial muscles.
(F) Using a DNA origami scaffold to position, link, and then release protein-DNA building blocks in order to create a unique protein nanostructure. Without the DNA scaffold, many assemblies and oligomeric states would be possible.

More Complex Protein-DNA Nanostructures
We also foresee the development of more complex hybrid nanostructures with both protein and DNA components. For example, our lab's work with a trimeric protein building block ''capping'' a DNA structure 34 (Figure 9G) could be extended to cages with proteins at all vertices (Figure 11D). Tuning the rigidity of the protein-DNA interface could lead to cages of varying sizes and symmetries, akin to reports using all-DNA tiles. 142 Protein building blocks could also be used to ''plug'' symmetry-matched holes in a wireframe DNA origami structure (similar to Mao's work with streptavidin-capped cages), 113 creating a semi-closed protein shell akin to a virus capsid. Unlike virus capsids, however, each cavity of these structures could be modified with a different protein-DNA building block, enabling highly anisotropic protein shells and Janus-like particles (Figure 11E). The protein building blocks can also contain fusion peptides or proteins for biological activity (e.g., drug delivery, artificial vaccines, and catalytic cascades) and can be removed selectively through toehold-mediated strand displacement. In addition to covalent attachment of DNA strands, alternative strategies for integrating self-assembling proteins with DNA could yield a tighter and more rigid interface. All of these applications would especially benefit from the introduction of design software that can accurately model both building blocks. Such software (for example, extensions of Rosetta [for proteins] 143 or oxDNA [for oligonucleotides] 144 that incorporate representations of the other macromolecular type) will enable rapid in silico testing of designs to avoid laborious synthetic trial and error.

DNA-Scaffold-Templated Synthesis of Protein Nanostructures
One of the key advantages of DNA nanotechnology is that it allows for scaffolds with a high degree of addressability because of the unique nature of the strands that compose them. Most engineered protein nanostructures, by contrast, are highly symmetric 2 because engineering multiple specific and orthogonal interactions is difficult. Thus, a unique opportunity for protein-DNA nanotechnology is to use oligonucleotide scaffolds to position proteins in an asymmetric fashion with complete stoichiometric control, to covalently or non-covalently link them, and to then remove them from the DNA scaffold by using cleavable linkers ( Figure 11F). The DNA scaffold in effect serves as a ''supramolecular mold'' for building protein structures that could not otherwise be created in solution, all without having to re-engineer (multiple) protein self-assembly interfaces. This approach will require additional bioconjugation strategies beyond those necessary to attach the proteins to DNA in the first place in order to link the pieces together into higher-order structures, in much the same way that traditional organic synthesis requires multiple reactions to link various functional groups together in a selective and site-specific fashion. However, the final outcome will be an all-protein nanostructure with the complexity of DNA origami and the diversity of proteins. Beyond the biological and catalytic function of proteins, such nanostructures could also integrate structural proteins (e.g., actin fibers and collagen fibrils), thereby moving beyond the properties of the DNA double helix while retaining the modularity that makes DNA nanotechnology so powerful. It is still an open question what the best protein building blocks would be for this purpose, but the de novo designed, tunable, and highly stable assemblies employed by protein engineers (such as repeat proteins 145 or helical bundles 146 ) are possible starting points.

Protein-DNA Bionanomaterials
In addition to the myriad areas discussed in Using Hybrid Protein-DNA Nanostructures to Answer Biological Questions, two areas of biology that will benefit greatly from nanomaterials that merge the structural tunability of DNA with the bioactivity of proteins are (1) targeted delivery of therapeutic cargo to cells and (2) biomaterials for regenerative medicine. In this regard, protein-DNA nanostructures will create tunable analogs of (1) viruses and (2) the ECM. For both of these applications, a DNA ''skeleton'' can be coated with a protein or polymer ''skin'' to allow independent control of size, shape, and rigidity (via the DNA) and bioactivity (via the polypeptide or polymer coating). Functionalizing these assemblies with proteins to enhance stability, modulate surface charge, and facilitate targeting, uptake, endosomal escape, and subcellular localization will allow for targeted cargo delivery into cells. For biomaterials, a DNA-based nanofiber could control the diameter, stiffness, and nanoscale morphology of a fiber to mimic the ECM (akin to collagen or laminin), whereas the proteins can interface with cell-surface proteins, such as integrins or growth factor receptors, to influence migration, differentiation, or regeneration. For example, fibronectin is a protein composed of multiple individually folded domains, much like beads on a string, each with a different biological role. 147 Creating a DNA nanostructure coated with these domains would allow for the precise control of bioactivity, such as cell adhesion or growth factor signaling, with simultaneous control over the mechanical and morphological properties of these assemblies. Such a hybrid nanostructure would strike a balance between using much simpler structures that often fail to recapitulate biological complexity and using full-length proteins.
For both targeted delivery and biomaterials, the ability of DNA to spatially control the presentation of multiple signals will be key, e.g., by ''matching'' the spacing of receptors or co-localizing several proteins to enhance bioactivity. 95 DNA also allows for the dynamic presentation of proteins through strand-displacement reactions or reversible crosslinking of DNA fibers, opening up a wealth of possibilities for adaptive biomaterials with spatiotemporal control. 112,148 A combination of non-specific and specific (via direct tethering or binding to defined locations) interactions will most likely be necessary for effectively coating DNA nanoscaffolds with proteins. It should also be noted that recent breakthroughs in biotechnological production of DNA origami (~$100/g) 149 have opened up the possibility of producing DNA nanostructures at scales that match those of recombinant proteins.

Selective Modification of Proteins with a Supramolecular Scaffold
One especially powerful application of the hybrid field described herein is to extend the ideas outlined by Gothelf and coworkers (Figures 3A and 5E) 38,79 to a general platform for DNA-scaffold-enabled, site-specific protein modification. By positioning a protein on a DNA nanostructure, it should be possible to selectively modify it on one face, even on a single residue, akin to the way that enzymes such as kinases can phosphorylate a specific site by binding to a unique location on their target. This modification could be something completely synthetic (e.g., a drug, polymer, or fluorophore) or a natural moiety such as a post-translational modification (e.g., phosphorylation, glycosylation, or lipidation). As with many of the propositions described herein, a somewhat rigid and controlled orientation of the protein on the DNA scaffold will most likely be necessary to prevent the modification of an incorrect site. Chaining together several DNA-based modules with unique site-specific chemistries, and passing the target protein sequentially between them, would in effect create a ''molecular assembly line'' similar to non-ribosomal peptide synthesis. The assembly of these modules could in turn be controlled dynamically with DNA-based circuits and computational elements such as logic gates, creating a truly cell-mimetic molecular factory (see below) that goes far beyond current DNA-templated enzyme cascade capabilities. We highlight that obtaining suitable quantities of proteins with these materials will require the use of simple tiles or scaffolds or the scaling up of DNA origami to gram quantities, as recently reported. 149

Synthetic Cells and Nanoscale ''Chemical Plants''
Creating an artificial cell with complexity rivaling biology but with completely synthetic components is a holy grail of nanotechnology. Such a nanoscale ''chemical plant,'' 150 with the ability to produce novel molecules, sense and respond to the environment, and even self-replicate and evolve, would be the ultimate realization of Feynman's iconic dream. This vision is still far away, but we believe that protein-DNA nanostructures will play a key role in its realization. Applications include switching molecular assembly lines (e.g., enzymes attached to DNA scaffolds) on and off; integrating with DNA molecular computing networks for signaling, feedback, and control; using proteins on DNA ''nanorobotic'' arms to functionalize other molecular species; localizing proteins on DNA cages as nanoreactors; using motor proteins on dynamically controllable DNA tracks to transport cargo; and spatially controlling multiple proteins on addressable DNA ''cytoskeletons.'' Key to these endeavors will be the integrated, hybrid protein-DNA nanostructures we have discussed herein, enabled by novel chemical tools and self-assembly methods.

CONCLUSIONS
We hope that this review has demonstrated the rich potential that lies at the interface of DNA nanotechnology and protein chemistry and engineering. A truly hybrid field of protein-DNA nanotechnology will build on the ever-increasing advances in DNA and RNA nanotechnology, de novo protein design, bioconjugation chemistry, supramolecular self-assembly, and computational simulation of biomolecular systems. The long-term potential for this field is to create a self-assembled, biomolecule-based analog to synthetic organic chemistry: using a palette of building blocks and reactions to build complex final structures with complete control down to the atomic level. The key differences are that protein-DNA nanotechnology will extensively use supramolecular interactions (as opposed to purely covalent bonds), and the final ''molecules'' will be complex assemblies and devices with functions that rival those of biology. This goal will require an intimate interplay among chemists, biologists, engineers, physicists, and materials scientists, but the potential is limitless; the ultimate goal is to create nanostructures and nanosystems that one day rival natural enzymes, cells, and perhaps entire organisms. There truly is plenty of room left at the bottom, and protein-DNA nanotechnology is ready to fill it.