Symmetry in Nucleic-Acid Double Helices

In nature and in the test tube, nucleic acids occur in many different forms. Apart from single-stranded, coiled molecules, DNA and RNA prefer to form helical arrangements, in which the bases are stacked to shield their hydrophobic surfaces and expose their polar edges. Focusing on double helices, we describe the crucial role played by symmetry in shaping DNA and RNA structure. The base pairs in nucleic-acid double helices display rotational pseudo-symmetry. In the Watson–Crick base pairs found in naturally occurring DNA and RNA duplexes, the symmetry axis lies in the base-pair plane, giving rise to two different helical grooves. In contrast, anti-Watson–Crick base pairs have a dyad axis perpendicular to the base-pair plane and identical grooves. In combination with the base-pair symmetry, the syn/anti conformation of paired nucleotides determines the parallel or antiparallel strand orientation of double helices. DNA and RNA duplexes in nature are exclusively antiparallel. Watson–Crick base-paired DNA or RNA helices display either right-handed or left-handed helical (pseudo-) symmetry. Genomic DNA is usually in the right-handed B-form, and RNA double helices adopt the right-handed A-conformation. Finally, there is a higher level of helical symmetry in superhelical DNA in which B-form double strands are intertwined in a right- or left-handed sense.


Introduction
Nucleic acids are inherently asymmetric molecules, yet in their biologically prevalent, base-paired and helical forms, DNA and RNA display several levels of symmetry. Their constituents, nucleobases, nucleosides and nucleotides, do not have any non-trivial symmetry. However, bases tend to form base pairs, and oligo- and polynucleotides tend to form helices, whose structures are governed by symmetry in a hierarchical order. The rotational symmetry of base pairs determines how they can be arranged in double helices, and the conformation of paired nucleotides determines the parallel or antiparallel strand orientation. Helices have helical symmetry, and DNA double helices can be organized into superhelical structures with higher-order helical symmetry.
The formation of helical structures is a unifying concept in structural and molecular biology. Helices are made of repeating units of similar shape and chemical properties, and they serve the purpose of satisfying hydrogen-bonding and other stereochemical requirements of their building blocks. As early as 1951, Pauling and coworkers proposed a model for helical polypeptides [1]. In addition to the well-known 3₁₀-, α- and π-helical conformations of proteins, special helical forms such as the collagen triple helix [2,3] are supported by repetitive polypeptide sequences and/or amino acid modifications. Carbohydrates are also known to form helical structures [4]. The first structural insight into DNA was presented in 1938 by Astbury and Bell [5], and the first detailed models of double-helical DNA [6,7] were famously published in 1953. The review by Goodsell [8] in this volume provides an excellent overview of molecular symmetry in biology.
The symmetries found in nucleic acids are often not exact. Frequently, the execution of a symmetry operation does not lead to an atom-by-atom superposition of molecular structures but to a superposition of shapes or molecular sub-structures. A well-known example of this concept, for which we will use the term "pseudo-symmetry" below, is provided by the Watson-Crick base pairs of double-stranded DNA and RNA, which have similar shapes and whose dyad symmetry axes superimpose their glycosyl bonds after rotation.
Since the 1950s, a vast literature on nucleic acids and their structures has accumulated. For general principles of nucleic-acid structure, we refer the reader to the classic textbook by Saenger [9] and for an introduction to the higher-order structure of DNA to Sinden's book [10]. Basic knowledge covered in these books will be used in this review without any further citation. In this review, we focus on symmetry properties of double-stranded DNA and RNA, an aspect of these structures that has not received much attention so far, and illustrate them with examples from the published literature. We make an attempt to review how our knowledge of the symmetries inherent in DNA and RNA structures has developed over time. Our selection of specific examples is slightly biased towards our own work; we apologize to all authors whose work could not be cited out of considerations of economy and readability.

Symmetry of Base Pairs
The four nucleobases A, C, G and T present in DNA and the bases A, C, G and U of RNA are aromatic heterocycles with a propensity for stacking and hydrogen bonding. Base pairing is an efficient way of satisfying the hydrogen bonding potential of the purine and pyrimidine bases. The bases in Figure 1 are shown in their standard non-protonated keto/amino form generally assumed to be predominantly present in nature. However, their hydrogen-bonding and, consequently, their base-pairing patterns can be altered by keto/enol tautomerism or base protonation. The potential for adopting the rare enol/imino forms of bases was noted early on [11] and received renewed interest recently [12,13] because of its possible involvement in genomic mutagenesis. At neutral pH, the bases in DNA and RNA are uncharged. However, at acidic pH or in a spatial environment that changes the microscopic pKa value, protonation at various sites is possible. Bases violating the Watson-Crick pairing scheme (see below) in DNA or RNA double helices as a consequence of base protonation and their effects on duplex structure have been described [14–16]. In our following discussion of the symmetry properties of nucleic-acid double helices, we will disregard the unusual enol/imino and protonated forms of the bases and focus entirely on their standard unprotonated keto/amino forms.
Adenine has three hydrogen bond acceptor atoms (N1, N3 and N7) and one donor atom (N6), guanine has three acceptors (N3, O6 and N7) and two donors (N1 and N2), cytosine has two acceptors (O2 and N3) and one donor (N4), and thymine/uracil have two acceptors (O2 and O4) and one donor (N3). This is the basis for a large number of base pairs that may form between the bases in addition to the standard Watson-Crick (WC) base pairs (Figure 1a), which are commonly, but not exclusively, observed in DNA and RNA double helices in nature. Out of the many alternative pairings, the WC geometry may have been selected in the course of molecular evolution, because only the WC G:C pair is linked with three hydrogen bonds and may thus confer extra stability to helical arrangements in which it occurs. Base-paired arrangements stabilized by two hydrogen bonds are ubiquitous, and the most common of them are depicted in Figure 1.
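The donor and acceptor inventory above can be written down as a small data structure. The following Python sketch is only an illustration: the dictionary and function names are hypothetical, cytosine's donor is taken to be the exocyclic amino nitrogen N4, and the list of Watson-Crick hydrogen bonds reflects the usual textbook assignment rather than anything specified in the text.

```python
# Hydrogen-bond donors and acceptors of the standard keto/amino bases.
# Assumption of this sketch: cytosine's donor is the amino nitrogen N4.
H_BOND_SITES = {
    "A": {"donors": {"N6"}, "acceptors": {"N1", "N3", "N7"}},
    "G": {"donors": {"N1", "N2"}, "acceptors": {"N3", "O6", "N7"}},
    "C": {"donors": {"N4"}, "acceptors": {"O2", "N3"}},
    "T": {"donors": {"N3"}, "acceptors": {"O2", "O4"}},
    "U": {"donors": {"N3"}, "acceptors": {"O2", "O4"}},
}

# Watson-Crick hydrogen bonds, each written as
# (donor base, donor atom, acceptor base, acceptor atom).
# Textbook assignment, included here as an assumption.
WC_H_BONDS = {
    ("G", "C"): [("G", "N1", "C", "N3"),
                 ("G", "N2", "C", "O2"),
                 ("C", "N4", "G", "O6")],
    ("A", "T"): [("A", "N6", "T", "O4"),
                 ("T", "N3", "A", "N1")],
    ("A", "U"): [("A", "N6", "U", "O4"),
                 ("U", "N3", "A", "N1")],
}


def is_consistent(bonds):
    """Check that every bond links a listed donor to a listed acceptor."""
    return all(
        donor_atom in H_BOND_SITES[donor]["donors"]
        and acceptor_atom in H_BOND_SITES[acceptor]["acceptors"]
        for donor, donor_atom, acceptor, acceptor_atom in bonds
    )


def count_h_bonds(pair):
    """Number of hydrogen bonds in a Watson-Crick pair such as ("G", "C")."""
    return len(WC_H_BONDS[pair])


if __name__ == "__main__":
    for pair, bonds in WC_H_BONDS.items():
        print(pair, count_h_bonds(pair), is_consistent(bonds))
```

Under these assumptions, G:C is the only pair in the table with three hydrogen bonds, which is the property invoked above to rationalize the selection of the WC geometry.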
The standard WC base pairs G:C (depicted in Figure 1a), C:G, A:T and T:A are the core constituents of double-helical DNA in cells. In RNA, of course, A:T is replaced with A:U and T:A with U:A. All WC base pairs are isosteric; superimposing them reveals close shape similarity and accurately places their glycosyl bonds (purine N9 or pyrimidine N1 to sugar C1') atop each other. This is the basis for the formation of regular double-helical structures of random sequence. Note that the (pseudo-) dyad axis relating a G:C base pair to C:G and A:T to T:A is in the plane of the base pair and bisects the angle between the glycosyl bonds. The important consequence of this symmetry is that the two edges of a WC base pair are not equivalent: the edge with the purine N7 and N6/O6 and the pyrimidine N4/O4 (at the top in Figure 1a), defined by the >180° angle between the glycosyl bond vectors, differs from the edge defined by the <180° angle between glycosyl bonds (bottom) in terms of shape and hydrogen-bonding potential [17]. In double-stranded DNA or RNA, the base-pair edge with purine N7 and N6/O6 and pyrimidine N4/O4 will define the major groove, the opposite edge with purine N3 and N2 (in guanine) as well as pyrimidine O2 defines the minor groove. We may formulate a general rule: base pairs with (pseudo-) dyad axis in the base-pair plane have different edge properties and distinct grooves when incorporated in double helices.
Symmetry 2020, 12, x FOR PEER REVIEW  3 of 17
Reverse Watson-Crick base pairs (Figure 1b) use the same hydrogen bonding atoms as WC pairs, but have a radically different arrangement of the bases and do not obey the Watson-Crick pairing rules [6,18]. Most importantly, their dyad axis is perpendicular to the base-pair plane. This gives rise to an approximately antiparallel orientation of the glycosyl bonds and a similar shape of the two base-pair edges, while their hydrogen-bonding patterns may differ, as seen in the example of the A:C anti-WC base pair.
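The difference between a dyad axis lying in the base-pair plane and one perpendicular to it can be illustrated with a few lines of vector arithmetic. The Python sketch below is a toy model, not a structural calculation: the base pair is placed in the xy plane, the in-plane dyad is taken along y, the perpendicular dyad along z, and the glycosyl bond vector is an arbitrary unit vector chosen for illustration.

```python
import math


def twofold_rotation(vector, axis):
    """Rotate a 3D vector by 180 degrees about the x, y or z axis."""
    x, y, z = vector
    images = {"x": (x, -y, -z), "y": (-x, y, -z), "z": (-x, -y, z)}
    return images[axis]


def angle_between(u, v):
    """Angle between two vectors in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cosine))


# Hypothetical glycosyl bond direction of one nucleotide (base pair in xy plane).
glycosyl_bond = (-0.8, -0.6, 0.0)
plane_normal = (0.0, 0.0, 1.0)

# Dyad in the base-pair plane (along y), as in WC pairs: the partner's
# glycosyl bond points towards the same edge (negative y in this model).
wc_partner = twofold_rotation(glycosyl_bond, "y")

# Dyad perpendicular to the plane (along z), as in reverse WC pairs:
# the partner's glycosyl bond is antiparallel to the first one.
reverse_partner = twofold_rotation(glycosyl_bond, "z")

if __name__ == "__main__":
    print(wc_partner, angle_between(glycosyl_bond, wc_partner))
    print(reverse_partner, angle_between(glycosyl_bond, reverse_partner))
    print(twofold_rotation(plane_normal, "y"), twofold_rotation(plane_normal, "z"))
```

In this model the in-plane dyad leaves both glycosyl bonds on one side of the pair, so the two edges subtend different angles, whereas the perpendicular dyad makes the bonds antiparallel and the two edges alike in shape. The in-plane dyad also inverts the plane normal, while the perpendicular dyad does not.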
Two more non-WC base pairs are frequently encountered in DNA or RNA double helices. Hoogsteen base pairs (Figure 1c) are unique in that they involve the purine N7 atoms in hydrogen bonding [19]. A reverse Hoogsteen base pair (not depicted) can be formed by rotating the purine base by 180° against the pyrimidine, facilitating formation of an A (N6) to T (O2) hydrogen bond. The existence of wobble base pairs was proposed by Crick [20] to explain apparently relaxed rules for codon-anticodon pairing during translation. In the G:U wobble base pair (Figure 1d), the pyrimidine base is shifted towards the major groove relative to its position in a WC base pair such that a hydrogen bond between G (N1) and U (O2) is formed. G:U wobble pairs occur frequently in RNA where they may serve regulatory functions [21]. In both Hoogsteen and wobble base pairs, the (pseudo-) dyad axis is in the base-pair plane, creating base-pair edges with distinctly dissimilar features.

Symmetry of Nucleoside Pairs
Nucleoside conformation adds a further level of complexity to DNA or RNA helix structure, as the glycosyl torsion angles determine the polarity of the sugar-phosphate backbones. As a rule, the nucleosides adopt the anti-conformation of their glycosyl bonds, i.e., the χ torsion angle defined by atoms O4'-C1'-N9-C4 in purine nucleosides and by O4'-C1'-N1-C2 in pyrimidine nucleosides is between 90° and 270° (χ ~195° in A-DNA, ~260° in B-DNA, see below). While pyrimidine nucleosides rarely deviate from anti, in purine nucleosides the rotational barrier is lower, and the syn-conformation is populated in equilibrium [22] and may be adopted in polymeric nucleic acids under certain conditions. Double helices are antiparallel if they are formed by nucleosides in standard anti-conformation (Figure 2a) paired with WC, Hoogsteen or wobble geometry. In the figure, the C4'-C3' vectors indicating strand polarity point in opposite directions. If helices could be formed by nucleosides in WC, Hoogsteen or wobble base pairs that are all in syn-conformation, these would also be antiparallel. Conversely, double helices formed by base pairs in reverse WC or reverse Hoogsteen geometry with all glycosyl bonds in standard anti-conformation (not shown) are parallel.
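The definition of the anti range lends itself to a short numerical illustration. The Python sketch below computes a torsion angle from four atomic positions (in the order O4', C1', N9, C4 for purines or O4', C1', N1, C2 for pyrimidines) and classifies it as anti if it falls between 90° and 270°, as stated above. The function names are hypothetical, and the coordinates in the demonstration are invented points, not atoms from a real structure.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Torsion angle p0-p1-p2-p3 in degrees, in the range (-180, 180]."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def glycosyl_conformation(chi_degrees):
    """Classify chi as 'anti' (90 to 270 degrees) or 'syn' (otherwise)."""
    chi = chi_degrees % 360.0  # map negative angles into 0..360
    return "anti" if 90.0 <= chi <= 270.0 else "syn"


if __name__ == "__main__":
    # Invented coordinates giving a torsion angle of 90 degrees.
    chi = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
    print(chi, glycosyl_conformation(chi))
    # Typical values quoted in the text for A-DNA and B-DNA.
    print(glycosyl_conformation(195), glycosyl_conformation(260))
```

Angles reported on a scale from -180° to 180° are handled by the modulo operation, so a value of -100° is treated like 260°.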

WC, Hoogsteen and wobble base pairs with the purine nucleoside in syn and the pyrimidine in anti are consistent with parallel strands (Figure 2b), and the related reverse WC and Hoogsteen base pairs with identical glycosyl torsion angles (not shown) also have parallel strands. An interesting deviation from these general principles is found in Z-form DNA double helices (see below). Here, the purine nucleosides are all in syn-conformation and the pyrimidines in anti (Figure 2c), but the strands in Z-DNA are nevertheless antiparallel because of a unique conformation of the sugar-phosphate backbone. This may be taken as an exception from a second general rule that follows from the above considerations, which posits that base-pair symmetry along with the glycosyl torsion angles determines the strand polarity in double-stranded nucleic acids (Table 1).
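The second general rule can be summarized as a small lookup. The Python sketch below encodes only the combinations that are spelled out explicitly in this section; the names are hypothetical, and any combination not listed raises an error rather than guessing. Z-DNA is not captured, since it is described above as an exception to the rule.

```python
# Base-pair geometries grouped by the orientation of their (pseudo-) dyad axis.
IN_PLANE_DYAD = {"WC", "Hoogsteen", "wobble"}
PERPENDICULAR_DYAD = {"reverse WC", "reverse Hoogsteen"}

# Strand orientation for the combinations stated in the text.
# Keys: (dyad orientation, sorted glycosyl conformations of the two nucleosides).
ORIENTATION = {
    ("in-plane", ("anti", "anti")): "antiparallel",
    ("in-plane", ("syn", "syn")): "antiparallel",
    ("in-plane", ("anti", "syn")): "parallel",
    ("perpendicular", ("anti", "anti")): "parallel",
}


def strand_orientation(pair_geometry, conformation_a, conformation_b):
    """Return 'parallel' or 'antiparallel' for a listed combination."""
    if pair_geometry in IN_PLANE_DYAD:
        dyad = "in-plane"
    elif pair_geometry in PERPENDICULAR_DYAD:
        dyad = "perpendicular"
    else:
        raise ValueError(f"unknown base-pair geometry: {pair_geometry!r}")
    key = (dyad, tuple(sorted((conformation_a, conformation_b))))
    if key not in ORIENTATION:
        raise ValueError(f"combination not covered by this sketch: {key}")
    return ORIENTATION[key]


if __name__ == "__main__":
    print(strand_orientation("WC", "anti", "anti"))          # standard duplex
    print(strand_orientation("Hoogsteen", "syn", "anti"))    # purine syn
    print(strand_orientation("reverse WC", "anti", "anti"))  # perpendicular dyad
```

Note that a WC pair with the purine in syn and the pyrimidine in anti is returned as parallel, whereas Z-DNA with the same glycosyl conformations is antiparallel because of its unusual backbone.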

Guanine Quadruplexes
As they contain two hydrogen-bond donor (N1 and N2) and three acceptor functions (N3, O6 and N7), guanines are unique among nucleobases in their ability to form quadruplex structures of various topologies in G-repeat nucleic acids [23]. In the planar guanine-tetrad core structure of G-quadruplexes (Figure 3a), the hydrogen-bonding potential of the purine bases is nearly fully saturated by the participation of donor and acceptor functions from both their WC and Hoogsteen edges, leaving only N3 at the minor-groove edge uninvolved. The guanine bases in a G-tetrad are related to each other by a fourfold symmetry axis perpendicular to the tetrad plane. G-tetrads and G-quadruplexes have been observed in both DNA and RNA [24–26].

In G-rich nucleic acids, guanine tetrads can stack atop each other to form guanine quadruplexes [23,24]. These quadruplexes often contain three stacked tetrads stabilized by monovalent cations (Na⁺ or K⁺) and may be intermolecular or intramolecular, i.e., formed by several DNA or RNA strands or by a contiguous strand folding back upon itself. The formation of these structures is facilitated by the propensity of guanosine nucleosides to adopt either the anti or the syn conformation of their glycosyl bonds, the former being favored over the latter by just a small difference in energy. As with the base pairs in double-stranded DNA or RNA, the glycosyl bond angles adopted by the nucleosides determine both groove dimensions and strand polarity. If all guanosine glycosyl bonds have identical torsion angles, e.g., all-anti as in most intermolecular G-quadruplexes [24], the globular quadruplex structure will display identical grooves and parallel strands. If adjacent strands have different glycosyl torsion angles, they will be antiparallel, and the quadruplex structure will display grooves of different widths. In intramolecular G-quadruplexes, the combination of different strand polarities with different types of connecting loops gives rise to a bewildering variety of quadruplex topologies [24]. The G-quadruplex based on a DNA sequence found in the human c-Myc promoter (Figure 3b) is an example of an intramolecular parallel-stranded quadruplex with propeller-type loops between strands and all-anti glycosyl-bond conformation [27]. Crystal and NMR structure analyses have revealed a large variety of globular quadruplex topologies [23,24,27,29].
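The relation between glycosyl conformation, strand polarity and groove widths in a G-tetrad can be sketched in a few lines of Python. This is a simplified reading of the rule stated above, with hypothetical names: the four guanosines of one tetrad are listed in cyclic order, neighbors with the same conformation are taken to belong to parallel strands, and neighbors with different conformations to antiparallel strands.

```python
def describe_tetrad(conformations):
    """Summarize strand polarity and grooves for one G-tetrad.

    conformations: glycosyl conformations ("anti" or "syn") of the four
    guanosines, listed in cyclic order around the tetrad.
    """
    if len(conformations) != 4:
        raise ValueError("a G-tetrad has exactly four guanosines")
    if any(c not in ("anti", "syn") for c in conformations):
        raise ValueError("conformations must be 'anti' or 'syn'")

    # Compare each guanosine with its neighbor, closing the cycle at the end.
    adjacent = [
        "parallel" if conformations[i] == conformations[(i + 1) % 4]
        else "antiparallel"
        for i in range(4)
    ]
    all_parallel = all(orientation == "parallel" for orientation in adjacent)
    return {
        "adjacent_strands": adjacent,
        "grooves": "identical" if all_parallel else "different widths",
    }


if __name__ == "__main__":
    print(describe_tetrad(["anti", "anti", "anti", "anti"]))
    print(describe_tetrad(["anti", "syn", "anti", "syn"]))
    print(describe_tetrad(["anti", "anti", "syn", "syn"]))
```

The first call corresponds to the all-anti, parallel-stranded case exemplified by the c-Myc promoter quadruplex; the other two illustrate how mixed conformations lead to antiparallel neighbors and grooves of different widths.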

Figure 3. (a) Guanine tetrad with its fourfold symmetry axis perpendicular to the tetrad plane. With the exception of N3 and N9, all polar atoms of the bases are engaged in hydrogen bonding. (b) NMR (nuclear magnetic resonance spectroscopy) structure of an intramolecular parallel-stranded G-quadruplex with sequence dTGAGGGTGGGTAGGGTGGGTAA present in the human c-Myc promoter [27]. The guanine nucleotides involved in G-tetrad formation are printed in italic letters and shown in blue in the cartoon drawing. Extra-tetrad nucleotides are colored green. Drawn with PyMOL [28].
G-quadruplexes readily form under physiological conditions and have been implicated in a variety of cellular processes ranging from maintenance of genomic stability to replication, transcription and translation [24,30]. They can be stabilized, but also destabilized or unfolded by specific G-quadruplex-binding proteins [31,32]. The first well-documented cellular role of G-quadruplexes was in telomeres, at the ends of human and other eukaryotic chromosomes, where they are thought to serve a structurally protective role [30,33]. More recently, guanine quadruplexes have been characterized as transcriptional regulators in the promoter regions of genes [34,35].

Regular Double-Helical Structures as Defined by X-ray Fiber Diffraction Studies
In the following, we shall restrict ourselves to antiparallel DNA and RNA double helices, neglecting less common structures such as triple helices or parallel double helices. This is not meant to imply that these unusual structures do not exist in nature or are irrelevant. Triple-helical structures are formed when a third strand is inserted into the major groove of B-form DNA and its bases form Hoogsteen pairs with the purines of the WC base pairs [36]. Intramolecular triplexes (H-DNA) can be formed with various strand arrangements in homopurine tracts of supercoiled DNA [36]. The potential use of triplex-forming oligonucleotides as DNA-modifying agents has been discussed [37]. Finally, parallel double-helix structures have been observed in DNA hairpins with 3'-p-3' or 5'-p-5' linkages in the connecting loop [38]. Biochemical and spectroscopic evidence strongly suggests that these molecules contain right-handed parallel duplexes with reverse WC A:T base pairs, which are slightly less stable than their antiparallel counterparts. A parallel double helix mimicking poly(rA) has also been observed at acidic pH in a crystal of rA₇ [39].
One more remark may be allowed before we focus our attention on fiber structures of nucleic acids in their canonical conformations. This concerns the handedness of double helices, a property that has been known since the first models [6,7] were presented, but is thoroughly underappreciated by geneticists and biologists in general. Popular science outlets and, regrettably, also part of the scientific literature are full of images of double helices that have all the features of B-DNA, but show a left-handed mirror-image structure. This structure is obviously impossible because of the chiral nature of nucleotides. The prevalence of these wrong images has led to the creation of a gallery that shows hundreds of examples and is at the same time fun and sad to peruse: The Left Handed DNA Hall of Fame [40].
From the pioneering work of Astbury and Bell [5] in the 1930s until the late 1970s, when chemical synthesis of DNA became possible [41], X-ray fiber diffraction was the method of choice for DNA and RNA structure analysis. This work [42] defined the canonical types of nucleic-acid double helices: the A-form typical for RNA and occasionally found in DNA, the B-form of DNA corresponding to the Watson-Crick model [6] and related, over-twisted double helices, and the Z-form [43]. The latter, incidentally, was initially observed in crystals [44] and only later characterized in oriented fibers [45].
Helical symmetry as observed in the A-, B- and Z-forms of double-stranded nucleic acids is generated by screw operators. Fiber models of DNA are constrained to 11₁ helical symmetry in the A-form or 10₁ symmetry in the B-form, meaning they have exactly 11 or 10 repeating units per right-handed turn of the helix. An rₜ screw axis is characterized by a right-handed rotation of (360/r)° and a translation of t/r of the repeat length, here the pitch of a full helical turn. Hypothetical, but stereochemically impossible left-handed mirror duplexes of A- and B-DNA (see above) would have 11₁₀ or 10₉ symmetry. Z-DNA obeys 6₅ symmetry, because it has six repeating units in one left-handed helical turn. All canonical forms of DNA double helices (A, B and Z) have WC base pairs at their center and antiparallel strands. Fiber models are completely regular (Figure 4). There are no sequence-dependent features, even if the underlying sequence is random, such as in calf thymus DNA. Sequence effects have been studied by analyzing structures of homopolymeric nucleic acids or simple repetitive sequences [46,47], and structural variations were introduced by modulating the ionic medium and relative humidity. Local effects on the double-helix structure, such as those induced by covalent modifications of individual bases or backbone sites, including mutagenic lesions, or ligand binding cannot be studied by X-ray fiber diffraction.
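Why a screw axis with a large subscript, such as 6₅ or 11₁₀, describes a left-handed helix can be made explicit with exact fractions. The Python sketch below is one possible way to show this, with hypothetical function names: since translations along the axis are only defined modulo the repeat length, a right-handed rotation combined with a translation of t/r is equivalent to the same rotation combined with a translation of t/r - 1, and the sign of the smaller of the two shifts indicates the handedness.

```python
from fractions import Fraction


def screw_operation(r, t):
    """Rotation (degrees, right-handed) and translation (fraction of the
    repeat length) of an r_t screw axis."""
    return Fraction(360, r), Fraction(t, r)


def equivalent_step(r, t):
    """Translation per right-handed rotation step, folded into the
    interval (-1/2, 1/2] of the repeat length."""
    shift = Fraction(t, r) % 1
    if shift > Fraction(1, 2):
        shift -= 1
    return shift


def handedness(r, t):
    """Handedness of the helix traced by successive repeating units."""
    shift = equivalent_step(r, t)
    if shift == 0 or shift == Fraction(1, 2):
        return "neither"  # pure rotation, or equally right- and left-handed
    return "right-handed" if shift > 0 else "left-handed"


if __name__ == "__main__":
    for r, t in [(11, 1), (10, 1), (11, 10), (10, 9), (6, 5)]:
        rotation, translation = screw_operation(r, t)
        print(f"{r}_{t}: rotate {float(rotation):.1f} deg, "
              f"translate {translation} of the repeat, "
              f"equivalent step {equivalent_step(r, t)}, {handedness(r, t)}")
```

For 6₅, the equivalent step is -1/6 of the repeat per 60° of right-handed rotation, i.e., six units per left-handed turn, in line with the description of Z-DNA above.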

The A-Form of RNA and DNA
RNA duplexes generally adopt the helical A-form (Figure 4, left). Here, a single base pair (nucleotide pair) is the repeating unit of the double helix. The right-handed A-helix is characterized by 11 base pairs per turn (11₁ symmetry), a rise per base pair along the helix axis of 2.55 Å, an inclination of the base pairs of ~19° against the helix axis, anti-conformation for all glycosyl torsion angles and C3'-endo sugar pucker of all nucleosides. The combination of these parameters gives the helix a stout appearance with a diameter of ~23 Å, a deep and narrow major groove and a shallow minor groove. Due to the deep major groove, the base pairs are pushed away from the helix axis, and the helix appears hollow when looking down its axis. A-RNA helices with 12 base pairs per turn (12₁ symmetry) have also been modeled based on fiber diffraction data [42].
DNA generally prefers to adopt the B-conformation, which is assumed to be the biologically relevant conformation, but can undergo a transition to the A-form under certain conditions. The conformational shift to the A-form is favored under conditions of low relative humidity (high ionic strength) or if the DNA sequence is G/C-rich. Techniques to control the relative humidity in X-ray fiber diffraction experiments have been known for some time [42,46,47,49] and were recently revisited in a study of hydration forces in A-DNA and B-DNA fibers [50]. The formation of A-DNA is not limited to experimental in vitro situations. It has been reported, for example, that the DNA genome of the SIRV2 virus that infects a hyperthermophilic archaeon is entirely in the A-form [51].

Figure 4. Twelve-base-pair fragments of calf thymus A-DNA and B-DNA with arbitrary sequence dATCGATCGATCG and of Z-DNA with alternating sequence dCGCGCGCGCGCG. In the top row, the straight helix axis is vertical, and the view is into the minor groove at the helix center. The bottom row presents the view down the helix axis after 90° rotation. The antiparallel strands are drawn with different colors, and the sugar-phosphate backbone is depicted as smooth tube. Models were generated with Web 3DNA 2.0 [48] based on coordinates derived from X-ray fiber diffraction [46,47] and drawn with PyMOL [28].

The A-Form of RNA and DNA
RNA duplexes generally adopt the helical A-form (Figure 4, left). Here, a single base pair (nucleotide pair) is the repeating unit of the double helix. The right-handed A-helix is characterized by 11 base pairs per turn (11 1 symmetry), a rise per base pair along the helix axis of 2.55 Å, an inclination of the base pairs of~19 • against the helix axis, anti-conformation for all glycosyl torsion angles and C3'-endo sugar pucker of all nucleosides. The combination of these parameters gives the helix a stout appearance with a diameter of~23 Å, deep and narrow major groove and a shallow minor groove. Due to the deep major groove, the base pairs are pushed away from the helix, and the helix appears hollow when looking down its axis. A-RNA helices with 12 base pairs per turn (12 1 symmetry) have also been modeled based on fiber diffraction data [42].
DNA generally prefers to adopt the B-conformation, which is assumed to be the biologically relevant conformation, but can undergo a transition to the A-form under certain conditions. The conformational shift to the A-form is favored under conditions of low relative humidity (high ionic strength) or if the DNA sequence is G/C-rich. Techniques to control the relative humidity in X-ray fiber diffraction experiments have been known for some time [42,46,47,49] and were recently revisited in a study of hydration forces in A-DNA and B-DNA fibers [50]. The formation of A-DNA is not limited to experimental in vitro situations. It has been reported, for example, that the DNA genome of the SIRV2 virus, which infects a hyperthermophilic archaeon, is entirely in the A-form [51].

The B-Form of DNA
The iconic double-helical DNA structure presented by Watson and Crick [6] is the first published example of B-DNA and displays many of the features associated with this conformation (Figure 4, center). As in the A-form, a single base pair (nucleotide pair) is the repeating unit of the double helix. The right-handed B-helix is characterized by 10 base pairs per turn (10₁ symmetry), a rise per base pair along the helix axis of 3.38 Å, base pairs arranged nearly perpendicular to the helix axis, anti-conformation for all glycosyl torsion angles and C2'-endo sugar pucker of all nucleosides. The resulting helix has a diameter of ~20 Å and equally deep grooves, where the major groove is wider than the minor groove. The helix axis passes straight through the base pairs, allowing the helical periodicity to be read out by counting the spokes (the glycosyl bonds) connecting the base pairs to the sugar-phosphate backbone.

The Z-Form of DNA
While the A- and B-forms of DNA as well as A-type RNA helices can accommodate many different nucleotide sequences, Z-DNA is strictly sequence-specific, as it is restricted to alternating purine-pyrimidine sequences. The structure of Z-DNA (Figure 4, right) differs radically from the previously discussed conformations. Here, the repeating unit of the double helix is not a single base pair (nucleotide pair) but two stacked base pairs. The left-handed Z-helix is characterized by six dinucleotide stacks per turn (6₅ symmetry), a rise per repeating unit along the helix axis of 7.25 Å and an inclination of the base pairs of ~−9° against the helix axis. A distinctive feature of Z-DNA is the syn-conformation of all purine nucleosides and the usual anti-conformation of pyrimidines. Purines have C3'-endo sugar pucker, while pyrimidines have C2'-endo pucker in Z-DNA. The combination of the dinucleotide stack repeat with the syn-anti alternation along each strand accounts for the wrinkled appearance of the sugar-phosphate backbone in Z-DNA that differs markedly from the smooth trajectory seen in A- and B-helices. The special backbone conformation is required to allow the formation of an antiparallel double helix, although the WC-paired nucleosides have syn and anti glycosyl bonds on their purines and pyrimidines, respectively. The Z-DNA helix appears unusually slender with a diameter of ~18 Å, a shallow major groove and a deep minor groove. Because of the deep minor groove, the base pairs are displaced from the helix axis, and the helix reveals a small central opening when viewed along its axis.
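The sequence restriction stated above can be illustrated with a short Python check. This is a minimal sketch and not part of the original analysis: the function name `alternates_purine_pyrimidine` is hypothetical, and the test captures only the necessary condition named in the text (strict purine-pyrimidine alternation), not whether a given sequence actually adopts the Z-form under particular conditions.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}


def alternates_purine_pyrimidine(sequence: str) -> bool:
    """Return True if neighboring bases always switch between purine and pyrimidine.

    Hypothetical helper for illustration; it checks sequence alternation only.
    """
    seq = sequence.upper()
    if any(base not in PURINES | PYRIMIDINES for base in seq):
        raise ValueError("sequence may contain only A, C, G, T or U")
    is_purine = [base in PURINES for base in seq]
    # Compare each base with its 3' neighbor; alternation means the classes differ.
    return all(a != b for a, b in zip(is_purine, is_purine[1:]))


print(alternates_purine_pyrimidine("CGCGCGCGCGCG"))  # Z-DNA sequence of Figure 4
print(alternates_purine_pyrimidine("ATCGATCGATCG"))  # arbitrary sequence of Figure 4
```

The alternating dCGCGCGCGCGCG sequence passes the check, whereas the arbitrary sequence dATCGATCGATCG fails at its T-C step, where two pyrimidines are adjacent.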
It is noteworthy that a structural transition between right-handed B-DNA and left-handed Z-DNA is a sterically complex event, because the strand polarities of the two species differ. Simply unwinding the left-handed Z-DNA helix and twisting it to the opposite handedness will not yield B-DNA, but a right-handed helix with inverted strand polarity, i.e., the lower strand, looking into the minor groove, would run upwards (5' to 3') and not downwards, as in B-DNA. The B-Z transition must therefore involve the opening and rejoining of base pairs or the flipping of base pairs around their long axis by 180°. Molecular mechanisms for the B-Z transition have been studied by a variety of biophysical and modeling approaches [52,53].
Since its discovery, the biological significance and function of Z-DNA have been a matter of much debate [54]. It is well established that both covalent DNA modification and superhelical density may trigger the structural transition of oligo-G/C sequences to the Z-form in cells [55]. In addition, proteins were found that associate with Z-DNA in a structure-specific manner [56,57]. Nevertheless, the evidence for a biological function that requires the formation and involvement of Z-DNA is still circumstantial.
To conclude the discussion of DNA and RNA helix types as revealed by X-ray fiber diffraction studies, we may state that the canonical A-, B- and Z-forms are clearly distinct, as they differ in their helical symmetry and several other structural features (Table 2). In these structural models, the helical symmetries are exact and have integral numbers of repeats per turn of the double helix.
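Because the fiber models have exact helical symmetry, the rotation per repeating unit and the helical pitch follow directly from the numbers of repeats per turn and the rise values quoted above. The following Python sketch works through this arithmetic; the class `FiberHelix` and its field names are hypothetical, and only the repeats per turn, rise values and handedness are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FiberHelix:
    """Idealized fiber model with exact helical symmetry (illustrative only)."""
    name: str
    repeats_per_turn: int  # repeating units per helical turn
    rise: float            # rise per repeating unit, in angstrom
    right_handed: bool

    @property
    def twist(self) -> float:
        """Signed rotation per repeating unit in degrees (negative = left-handed)."""
        angle = 360.0 / self.repeats_per_turn
        return angle if self.right_handed else -angle

    @property
    def pitch(self) -> float:
        """Axial length of one full helical turn, in angstrom."""
        return self.repeats_per_turn * self.rise


# Values as quoted in the text for the canonical fiber models.
HELICES = [
    FiberHelix("A", 11, 2.55, True),   # repeat = one base pair
    FiberHelix("B", 10, 3.38, True),   # repeat = one base pair
    FiberHelix("Z", 6, 7.25, False),   # repeat = dinucleotide stack
]

for helix in HELICES:
    print(f"{helix.name}-form: twist {helix.twist:+.1f} deg per repeat, "
          f"pitch {helix.pitch:.2f} angstrom per turn")
```

Note that the Z-form values refer to the dinucleotide repeat, so one turn of Z-DNA comprises 12 base pairs, and the rotation of 60° in the left-handed sense applies to two stacked base pairs taken together.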

Double-Helical Structures as Observed in Single-Crystal X-ray Diffraction Studies
Starting in the late 1970s and early 1980s, the crystal structure analysis of short synthetic pieces of DNA and (a short time later) RNA became possible. This work confirmed the general features of A-DNA [58] and B-DNA [59] as established by X-ray fiber diffraction as described above and led to the discovery of a new left-handed form of DNA, the Z-form [44]. Thus, the basic symmetries of the canonical forms of nucleic-acid helices were preserved. In addition, however, the crystal structures (Figure 5) added a new level of structural understanding of nucleic acids, because they revealed sequence-dependent structural variations of the double helix and structural perturbations introduced by base-pair mismatches, chemical modifications of the nucleotides and the binding of small ligands or proteins.
Some of the distinctive features of A-DNA and B-DNA (see Table 2) became blurred when crystal structures of many different oligonucleotides became available. For example, the clear distinction between the A- and B-form was partly lost when A-DNA structures with only moderate inclination of base pairs against the helix axis (~7°), a B-DNA-like rise per base pair of 3.2 Å, a wide major groove and the occasional C2'-endo pucker were described [60], suggesting a continuum of conformations between A- and B-DNA. However, since several salient distinctive features such as the groove depth, helix axis position and most of the sugar puckers still differ between helix types, the concept of the A- and B-forms of DNA is still valid today.

Sequence-Dependent Helix Modulation Introduced by Base-Pair Stacking Propensities
Soon after the first oligonucleotide crystal structures were published, it became clear that the A-DNA [60–62] and B-DNA [63–65] structures observed at high resolution revealed many subtle sequence-dependent deviations from the fiber models. This becomes apparent, for example, when one compares the B-DNA fiber model (Figure 4, center) with the crystal structure of the "Dickerson dodecamer" (Figure 5, center). The crystal structure appears more irregular with base pairs buckled and propeller-twisted to various degrees, and at its A/T-rich center the minor groove is narrower than in the B-DNA fiber model. This narrow minor groove was linked to the local structure around the A:T base pairs in the center of the helix and to a particular hydration pattern in the minor groove that was made possible by the A/T stretch [66]. This hydration-based concept was validated when G/C-rich and G/C-only B-DNA crystal structures were analyzed [64,65,67,68] that featured wider minor grooves resembling the fiber model of B-DNA and a more irregular hydration in the minor groove.
To comprehend the sequence-dependent features of nucleic-acid helices that were observed in high-resolution crystal structures, universally applicable analytic tools and a new terminology had to be devised. This analysis centers on the geometry and distortions of individual base pairs (Opening, Propeller Twist, Buckle, Stagger, Stretch and Shear), position and orientation of base pairs relative to a helix axis (x-, y-Displacement, Tip and Inclination) and the geometry of base-pair stacking (Twist, Roll, Tilt, Rise, Slide and Shift) [48,69]. Using these analytical tools, the influence of the nucleotide sequence on the helical fine structure could be studied. It became apparent that many of the above parameters of DNA structure vary with rather large amplitudes. The helix Twist, for example, may be as small as 24° and as large as 51° within the same helix [65], but over a full helix turn the Twist values tend to average out at ~36°, the value for the fiber model of B-DNA. Similar observations have been made for other helical parameters. Thus, in spite of significant local structure variations, the overall characteristics of A-, B- and Z-form helices are preserved in crystal structures.
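The averaging of local Twist values can be made concrete with a small numerical example. The ten step values below are invented for illustration and are not taken from reference [65] or from any crystal structure; they were merely chosen to span the quoted range of 24° to 51°.

```python
from statistics import mean

# Hypothetical local Twist angles (degrees) for ten consecutive base-pair steps.
# These numbers are illustrative assumptions, not measured data.
local_twist = [24, 51, 33, 38, 30, 42, 35, 40, 31, 36]

average_twist = mean(local_twist)
# The mean Twist fixes the average number of base pairs per helical turn.
repeats_per_turn = 360 / average_twist

print(f"Twist range: {min(local_twist)} to {max(local_twist)} deg")
print(f"Mean Twist: {average_twist:.1f} deg")
print(f"Average base pairs per turn: {repeats_per_turn:.1f}")
```

Although individual steps in this example deviate from the fiber value by up to 15°, their mean equals the 36° of the B-DNA fiber model and corresponds to 10 base pairs per turn.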
Figure 5. Examples of the main double-helical forms of nucleic acids based on single-crystal X-ray diffraction analysis of dodecamers. Left, A-RNA with sequence rUAAGGAGGUGAU [70] containing the Shine-Dalgarno sequence [71] important for bacterial and archaeal translation initiation; center, the "Dickerson dodecamer" with sequence dCGCGAATTCGCG [63], the first published B-DNA structure; right, Z-DNA with alternating sequence dCGCGCGCGCGCG [72]. In the top row, the helix axis is vertical and the view is into the minor groove at the helix center. The bottom row presents the view down the helix axis after 90° rotation. PDB entry codes are provided in the figure. Drawn with PyMOL [28].
The stacking propensities of the DNA and RNA bases and base pairs are at the origin of the sequence-dependent helical structure modulations. Over the years, many experimental and theoretical approaches have been taken to unravel the energetics of base stacking and the resulting DNA fine structure [73,74]. In spite of all the research effort invested in this field, a simple code relating the DNA sequence to the helical fine structure, which could be exploited by transcription factors and other DNA-binding proteins, has not emerged [75].

Double Helix Structure Modulation by Mis-Pairing and Chemical Modification
The availability of synthetic DNA and RNA fragments for crystallization opened the door for studying the effects of mismatched bases, chemical modifications of bases or backbone structures or other perturbations of helix conformation. Early crystal structures of mis-incorporated bases in A-DNA [16,76,77] and B-DNA [78,79] duplexes showed that these mismatches had a surprisingly small effect on the underlying helix structure. Even purine-purine mismatches could be incorporated without major structural effect [78,79]. The ability to accommodate mismatched bases into a standard double helix without major structural perturbation is shared by RNA [14,80].
Chemical modifications of bases and the sugar-phosphate backbone were also assessed by structure analysis regarding their effect on DNA or RNA helix structure. Cytosine methylation, known as an important epigenetic modification of chromatin and as a stabilizer of the Z-form of DNA, had very small structural consequences for B-DNA [81,82] or Z-DNA [83]. Similar observations were made with 6-methyladenine incorporated in B-DNA and 5-methylcytosine incorporated in an A-DNA helix [84]. The exchange of deoxyriboses by their ribose analogs is a drastic modification of a DNA sugar-phosphate backbone. Nucleic-acid helices with hybrid DNA/RNA backbones invariably adopt the A-conformation [85,86], because ribonucleotides cannot adopt C2'-endo sugar pucker as required in a B-form helix. Other backbone modifications, such as 3'-methylene phosphonate linkages in A-DNA [87] or chiral phosphorothioate groups in B-DNA [88], were tolerated with minor structural distortions.

Double Helix Structure Modulation by Ligand Binding
Crystal structure analysis allows researchers to study the influence of ligand binding on DNA helix structure. DNA-binding drug molecules were among the first ligands that were systematically tested for their binding modes. These molecules adopt various strategies to associate with DNA. Here, we will briefly mention two major categories of DNA-binding drug molecules, groove binders and intercalators. Molecules such as netropsin, distamycin and the DNA fluorochrome Hoechst 33258 [89,90] have optimal shapes to bind to the narrow minor groove of A/T-rich DNA without causing much conformational change. The binding of intercalating drugs like triostin A and echinomycin, on the other hand, has very profound consequences for DNA helix structure, as the intercalation is accompanied by a local unwinding and stretching of the double helix and triggers local changes of the base-pairing pattern [91,92]. Other molecules, such as nogalamycin [93], have even more complex binding patterns, acting as intercalators and groove binders simultaneously and causing significant helix distortion.
Hundreds of crystal structures of DNA- or RNA-bound proteins have been determined since the 1980s, and it is beyond the scope of this paper to review even the most basic patterns of protein-nucleic-acid interactions. Regarding the issue of symmetry in double-helical structures, it may be noted that many proteins, such as the transcription factors KorB [94], Klf4 [95] and Grhl2 [96] analyzed in our own laboratory, bind their target DNA without major helix distortion, preserving the helical symmetry. Of course, there are plenty of examples of proteins that severely bend or distort the helical structure. These include architectural proteins such as the bacterial integration host factor [97] or histone octamers that wrap nuclear DNA into nucleosome structures [98] and transcription factors such as the bacterial catabolite gene activator protein [99] or the eukaryotic TATA-box-binding protein [100]. These proteins achieve DNA helix bending by the intercalation of protein side chains between DNA base pairs and/or asymmetric charge compensation of the DNA backbone, a powerful and well-known mechanism leading to DNA curving [101,102].
To conclude this section, we note that the basic symmetry of DNA and RNA helices is preserved in crystal structures, but it is relaxed such that we no longer find exact 11-, 10-or six-fold helical symmetry, but non-repeating symmetry and non-integral numbers of repeats per helix turn instead.
Crystal structure analyses demonstrate that the canonical A-, B-and Z-conformations are remarkably resistant against nucleotide sequence effects, covalent modifications and ligand binding.

The Biology of Double-Helical DNA Structures
Until this point, we have limited our discussion of nucleic-acid structure to oriented fibers or short DNA or RNA fragments. Under these conditions, where fibers are fixed in an outstretched shape and the length of synthetic fragments remains well below the persistence length of DNA [103], double helices behave like rigid rods with (mostly) straight axes. Above a length of 100-200 base pairs, DNA helices lose their bending stiffness; the uncooked spaghetti turns into cooked spaghetti. These long DNA molecules no longer have internal symmetry. However, there is one state with defined helical symmetry: the DNA superhelix.

Superhelical Structures and Chromatin
When a circularly closed plasmid is run on an agarose gel, one usually observes not one but three bands. These bands represent open circles with single-strand breaks, running the slowest, linearized molecules with intermediate speed of migration and covalently closed circles, running the fastest. These fast-running molecules have a compact shape because they are supercoiled, a state that can be directly observed in an electron microscope. DNA supercoiling arises from the activity of DNA topoisomerases [104], a diverse class of enzymes that catalyze the topological remodeling of DNA in cells by introducing and sealing single- and double-strand breaks in DNA. Topological remodeling can also be achieved in vitro, e.g., by the combined use of intercalating molecules (see above) and topoisomerases. Depending on the topological change achieved by a topoisomerase in vivo or in vitro, DNA may adopt a positively or negatively supercoiled form (Figure 6). Positively supercoiled DNA displays a left-handed superhelix, whereas negative supercoiling is associated with a right-handed superhelix. Obviously, these superhelices have helical symmetry, much the same as the underlying B-form double helices.
If one takes one double-helical turn as the repeating unit of the superhelix, then the right-handed superhelix of the model displayed in Figure 6a can be said to have 10₁ superhelical symmetry, because it organizes 20 double-helical turns into two superhelical turns. By the same logic, the left-handed superhelix of Figure 6b would have 10₉ superhelical symmetry. It remains unclear, however, whether this terminology is helpful for understanding superhelical structures. It should be noted in closing that superhelical structures are topologically equivalent to toroidally wrapped DNA helices like those in nucleosomes, i.e., one form can be converted into the other without nicking and re-sealing the DNA and without breaking any covalent bonds.
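The relation between a symmetry symbol and the handedness of the corresponding helix can be sketched in a few lines of Python. The sketch assumes that the symbols follow the crystallographic screw-axis convention, in which an N_m axis combines a rotation of 360°/N with a translation of m/N of the axial repeat; the function name `signed_twist_from_screw` is hypothetical, and the modular inverse via `pow(m, -1, n)` requires Python 3.8 or later.

```python
def signed_twist_from_screw(n: int, m: int) -> float:
    """Signed rotation (degrees) between axially adjacent repeating units of an N_m helix.

    Assumes the crystallographic convention: rotation by 360/n combined with a
    translation of m/n of the axial repeat. Units that are neighbors along the
    axis (spacing 1/n) are related by k applications of this operation, where
    k * m = 1 (mod n). Positive results are right-handed, negative left-handed.
    """
    k = pow(m, -1, n)  # raises ValueError if m and n share a common factor
    angle = 360.0 * k / n
    # Fold rotations beyond a half turn into the equivalent left-handed rotation.
    return angle - 360.0 if angle > 180.0 else angle


for n, m in [(11, 1), (10, 1), (6, 5), (10, 9)]:
    print(f"{n}_{m}: {signed_twist_from_screw(n, m):+.1f} deg per repeating unit")
```

Under this assumption, 10₁ and 10₉ describe rotations of equal magnitude (36°) but opposite sense, which matches the right- and left-handed superhelices discussed above, and 6₅ yields the left-handed 60° rotation per dinucleotide repeat of the Z-form.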
At higher levels of chromatin organization, the supramolecular entities containing DNA become more and more asymmetric. For that reason, these structures are not covered in this review.

Conclusions
We have tried to make the case that symmetry is of central importance for understanding DNA and RNA helical structure. Symmetry determines essential structural features in a hierarchical manner. The rotational (pseudo-) symmetry of base pairs determines if the helices formed by them have grooves with identical shapes or different shapes, such as the major and minor grooves of the A-, B- and Z-form duplexes. In combination with the base-pair symmetry, the glycosyl bond conformations determine the parallel or antiparallel orientation of the double-helix strands. The double helices in their different canonical conformations can be described in terms of helical symmetry. This helical symmetry is exact in double-helical models derived from X-ray studies of oriented fibers but relaxed in crystal structures of DNA or RNA duplexes. Crystal structures provide information about sequence-dependent features of helices and about perturbations of the helix structures by covalent modifications or ligand binding. These factors often exert only small effects on the helix conformation, but some ligand binding events lead to major structural rearrangements. Right- or left-handed DNA superhelices represent the highest level of structural organization with clearly recognizable symmetry.