The cellular protein nucleolin preferentially binds long-looped G-quadruplex nucleic acids

Background: G-quadruplexes (G4s) are four-stranded nucleic acid structures that form in G-rich sequences. Nucleolin (NCL) is a cellular protein reported for its functions upon G4 recognition, such as induction of neurodegenerative diseases, tumor and virus mechanisms activation. We here aimed at defining NCL/G4 binding determinants.
Methods: Electrophoresis mobility shift assay was used to detect NCL/G4 binding; circular dichroism to assess G4 folding, topology and stability; dimethylsulfate footprinting to detect G bases involved in G4 folding.
Results: The purified full-length human NCL was initially tested on telomeric G4 target sequences to allow for modulation of loop, conformation, length, G-tract number, stability. G4s in promoter regions with more complex sequences were next employed. We found that NCL binding to G4s heavily relies on G4 loop length, independently of the conformation and oligonucleotide/loop sequence. Low stability G4s are preferred. When alternative G4 conformations are possible, those with longer loops are preferred upon binding to NCL, even if G-tracts need to be spared from G4 folding.
Conclusions: Our data provide insight into how G4s and the associated proteins may control the ON/OFF molecular switch to several pathological processes, including neurodegeneration, tumor and virus activation. Understanding these regulatory determinants is the first step towards the development of targeted therapies.
General significance: The indication that NCL binding preferentially stimulates and induces folding of G4s containing long loops suggests NCL ability to modify the overall structure and steric hindrance of the involved nucleic acid regions. This protein-induced modification of the G4 structure may represent a cellular mechanosensor mechanism to molecular signaling and disease pathogenesis.
This article is part of a Special Issue entitled "G-quadruplex" Guest Editor: Dr. Concetta Giancola and Dr. Daniela Montesarchio.


Introduction
G-quadruplexes (G4s) are unique four-stranded nucleic acid structures that may form in guanine-rich sequences [1]. Four Gs bind via Hoogsteen-type hydrogen bonds to yield G-quartets that in turn stack on top of each other to form the G4. The presence of K+ cations specifically supports G4 formation and stability. G4s are highly polymorphic in terms of strand stoichiometry (forming both inter- and intramolecular structures), strand orientation/topology, and loop sequence, length and position [2]. G4s have been detected in humans and other eukaryotes [3,4], in prokaryotes [5-8] and in viruses [9,10]. From a functional point of view, in humans G4s play key regulatory and pathological roles, including transcriptional regulation of gene promoters and enhancers, translation, chromatin epigenetic regulation and DNA recombination [11-13]. Human G4 DNA motifs have been reported to be associated with recombination-prone regions [14] and to show mutational patterns that preserve the potential to form G4 DNA structures [15]. G4 misregulation has been associated with a number of relevant human diseases, such as amyotrophic lateral sclerosis [16], Alzheimer's disease [17] and fragile X syndrome [18], in which expansion of G4-forming regions has been reported. In viruses, G4s have been proposed both as virus cycle modulators and antiviral targets [10]. In the human immunodeficiency virus-1 (HIV-1), the etiologic agent of the acquired immune deficiency syndrome (AIDS), functionally significant G4s in the Nef coding region [19] and in the unique long terminal repeat (LTR) promoter [20-22] have been identified. These studies have shown that G4 folding at the LTR promoter decreases viral transcription, with an effect that is augmented by G4 ligands [23,24]. G4s have also been reported in the herpes simplex virus 1 [25], Epstein-Barr virus [26,27], SARS coronavirus [28], human papilloma, hepatitis C, Zika and Ebola virus genomes [29,30,31,32].
The biological effects of G4s are often modulated by the interaction with proteins [33]: the shelterin complex proteins are involved in telomere homeostasis [34], and other G4-interacting proteins either unfold or stabilize the G4 conformation [35,36]. Mutations in some of these G4-interacting proteins have been linked to genetic diseases, such as the Bloom's and Werner's syndromes, Fanconi anemia [37,38], and cancer [39].
Among G4-interacting proteins, nucleolin (NCL) is the protein most often reported for its biological functions upon G4 recognition. It is widely believed that NCL plays a chaperone role by helping the correct folding of complex nucleic acid structures. Indeed, NCL has been shown to display a marked preference for both endogenous and exogenous G-rich sequences that can fold into G4s [40]. It has been reported that binding of NCL to the endogenous (GGGGCC)n hexanucleotide repeat expansion (HRE) in C9orf72 is responsible for the initiation of molecular cascades that lead to neurodegenerative diseases [16]. At the promoter level, binding of NCL to G4 structures augments the basal effect of the folded conformation [41-44]. For example, in the case of HIV-1, it induces potent inhibition of viral transcription by stabilization of the LTR promoter G4s [45]. NCL is mainly present in the nucleolus, but it can also be found in cell membranes and, upon stress stimuli, to some extent in the nucleoplasm and cytoplasm [46,47]. NCL was initially described as an RNA-binding protein, but recent evidence indicates the relevance, and in some cases the preference, of its binding to DNA structures [45]. Because of the role of NCL in viral mechanisms, such as HIV cell attachment/entry and budding [48-50] and virus replication and transcription [51-56], NCL inhibitors have been developed [57-59]. Interestingly, one of these is a G4-forming oligonucleotide, which displays notable anti-HIV activity [60,61].
Despite the involvement of NCL/G4 complex in several human diseases, a deep understanding of the G4 structural features that drive binding to NCL is lacking. A certain degree of selectivity towards nucleic acids has been reported [41,45]. For instance, in the case of the HIV-1 LTR promoter, where three G4s can form (LTR-II, LTR-III and LTR-IV), NCL binds with high affinity to LTR-III and with intermediate affinity to LTR-II, whereas LTR-IV is not recognized [45].
To understand the structural requirements that stimulate NCL binding to G4s, we tested NCL binding towards different sets of G4 oligonucleotides, considering loop length, topology, sequence, number of G-runs and stability. We found that loop length was always the prevalent determinant of NCL binding to G4s, independently of the conformation and oligonucleotide/loop sequence. These findings both allow a deeper understanding of NCL/G4-regulated physiological and pathological mechanisms and open the possibility of developing new target-specific drugs.

Oligonucleotides and proteins
Desalted oligonucleotides were purchased from Sigma-Aldrich (Milan, Italy). Oligonucleotides were denatured for 5 min at 95°C and gradually cooled to room temperature to achieve proper G4 folding. The human recombinant nucleolin (NCL) was obtained from Origene Technologies (Rockville, USA).

Circular dichroism (CD)
CD spectroscopy was used to probe G4 folding and conformation of the tested DNA oligonucleotides. Oligonucleotides were diluted to a final concentration of 4 μM in lithium cacodylate buffer (10 mM, pH 7.4), KCl or NaCl (100 mM) and 20% of PEG200 where indicated. After heat denaturation, samples were folded at room temperature overnight. CD spectra were recorded on a Chirascan-Plus (Applied Photophysics, Leatherhead, UK) equipped with a Peltier temperature controller using a quartz cell of 5 mm optical path length and an instrument scanning speed of 50 nm/min. The temperature was set at 20°C and the spectra were recorded over a wavelength range of 230-320 nm. For Tm determination, spectra were recorded over a temperature range of 20-95°C, with a temperature increase of 5°C/min. Tm values were calculated according to the van't Hoff equation, applied for a two-state transition from a folded to an unfolded state. Acquired spectra were baseline-corrected for signal contribution due to the buffer, and the observed ellipticities were converted to mean residue ellipticity (θ), expressed in deg × cm² × dmol⁻¹ (mol ellip).

Table 1. Sequences of oligonucleotides used in this study.
Name         Length (nts)   Sequence 5′-3′
(GGGTTA)4    21             GGGTTAGGGTTAGGGTTAGGG
(GGGTTA)6    36             GGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTA
(GGGTTA)8    54             TCGAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAC
(GGGTTT)4    21             GGGTTTGGGTTTGGGTTTGGG
(GGGTTT)6    36             GGGTTTGGGTTTGGGTTTGGGTTTGGGTTTGGGTTT
(GGGTTT)8    48             GGGTTTGGGTTTGGGTTTGGGTTTGGGTTTGGGTTTGGGTTTGGGTTT
(GGGTTT)5
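The CD data treatment described above can be illustrated with a minimal sketch. It assumes the standard conversion of observed ellipticity (in mdeg) to molar ellipticity and a generic two-state van't Hoff model; the function names, the example enthalpy and the interpolation used to locate the midpoint are illustrative assumptions and are not taken from the instrument software or from the fitting procedure actually used in this study.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def molar_ellipticity(theta_mdeg, conc_molar, path_cm):
    """Convert observed ellipticity (mdeg) to deg*cm^2*dmol^-1.

    Standard relation: [theta] = theta_mdeg / (10 * c * l),
    with c in mol/L and l in cm.
    """
    return theta_mdeg / (10.0 * conc_molar * path_cm)

def fraction_folded(temp_c, tm_c, delta_h):
    """Two-state van't Hoff model: fraction folded at temp_c (Celsius).

    delta_h is the (positive) unfolding enthalpy in J/mol; the value used
    in the example below is hypothetical.
    """
    t = temp_c + 273.15
    tm = tm_c + 273.15
    k_unfold = math.exp(-(delta_h / R) * (1.0 / t - 1.0 / tm))
    return 1.0 / (1.0 + k_unfold)

def midpoint_tm(temps_c, folded_fractions):
    """Estimate Tm as the temperature where the folded fraction crosses 0.5,
    by linear interpolation between neighbouring points."""
    points = list(zip(temps_c, folded_fractions))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 >= 0.5 >= f1 and f0 != f1:
            return t0 + (f0 - 0.5) * (t1 - t0) / (f0 - f1)
    return None  # no crossing in the scanned range

# Example with the conditions quoted in the text: 4 uM oligonucleotide,
# 5 mm (0.5 cm) path length, melting scanned from 20 to 95 C.
temps = list(range(20, 100, 5))
curve = [fraction_folded(t, tm_c=62.0, delta_h=200e3) for t in temps]
print(molar_ellipticity(10.0, 4e-6, 0.5))
print(midpoint_tm(temps, curve))
```

In practice a non-linear fit of the full melting curve, including sloping baselines, would normally replace the simple interpolation shown here.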

Electrophoresis Mobility Shift Assay (EMSA)
EMSA was used to detect G4-NCL interaction and the stoichiometry of G4 folding. Oligonucleotides were 5′-end labeled with [γ-32P]ATP by T4 polynucleotide kinase during a 30 min incubation at 37°C. After DNA precipitation, labeled species were resuspended in lithium cacodylate buffer (10 mM, pH 7.4, KCl 100 mM or NaCl 100 mM and 20% (w/v) PEG200 where indicated), heat denatured and folded at room temperature.
For the determination of G4 stoichiometry, samples were run in 20% non-denaturing polyacrylamide gels for ~24 h at 60 V. Gels were exposed without drying and visualized by phosphorimaging (Typhoon FLA 9000, GE Healthcare Europe, Milan, Italy).
For NCL binding to DNA, reactions were performed in binding buffer (Tris-HCl 20 mM pH 8, KCl 30 mM, MgCl2 1.5 nM, DTT 1 mM, glycerol 8%, protease inhibitor cocktail 1% (Sigma-Aldrich, Milan, Italy), NaF 5 mM, Na3VO4 1 mM) with labeled oligonucleotides (40 nM) and human recombinant NCL (300 ng), and incubated for 1 h at 37°C. Binding reactions were supplemented with 20% (w/v) PEG200 where indicated. To load the same amount of DNA in each reaction, counts per second emitted by the radiolabeled oligonucleotides were measured by a Geiger counter (Mini 900 Scintillation Monitor, Thermo Scientific™, Monza, Italy). Samples were run on 8% non-denaturing polyacrylamide gels for 15 h at 40 V. Gels were dried using a gel dryer (Bio-Rad Laboratories, Milan, Italy) and visualized by phosphorimaging (Typhoon FLA 9000, GE Healthcare Europe, Milan, Italy). Quantification of bound DNA molecules was performed by ImageQuant TL Software (GE Healthcare Europe, Milan, Italy). Each gel was run twice to confirm quantification results; only results consistent throughout the repeated gels were retained and reported.
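As an illustration of how bound DNA may be expressed from band intensities, the following minimal sketch computes the fraction of oligonucleotide shifted by NCL in one lane. The intensity values and the function name are hypothetical, and the calculation is a generic one; it is not a description of the ImageQuant TL workflow used here.

```python
def fraction_bound(bound_bands, free_band):
    """Fraction of labeled DNA found in shifted (protein-bound) bands.

    bound_bands: intensities of the retarded bands in one lane
    free_band:   intensity of the free DNA band in the same lane
    """
    total = sum(bound_bands) + free_band
    if total <= 0:
        raise ValueError("lane has no signal")
    return sum(bound_bands) / total

# Hypothetical lane with two shifted bands and one free DNA band
# (arbitrary phosphorimager units).
lane = {"bound": [300.0, 100.0], "free": 600.0}
print(fraction_bound(lane["bound"], lane["free"]))

# Averaging the same lane across two repeated gels (hypothetical values).
gel_1 = fraction_bound([300.0, 100.0], 600.0)
gel_2 = fraction_bound([280.0, 110.0], 610.0)
print((gel_1 + gel_2) / 2.0)
```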

Dimethylsulfate footprinting (DMS)
The DNA substrate of interest was gel-purified before use, 5′-end-labeled with [γ-32P]ATP by T4 polynucleotide kinase, purified using MicroSpin G-25 columns (GE Healthcare Europe, Milan, Italy), resuspended in lithium cacodylate buffer 10 mM pH 7.4 with or without KCl 100 mM, heat denatured and folded. Sample solutions were then treated with dimethylsulfate (DMS, 0.5% in ethanol) for 5 min, and the reaction was stopped by addition of 10% glycerol and β-mercaptoethanol. Samples were loaded onto a 16% native polyacrylamide gel and run until the desired resolution was obtained. DNA bands were localized via autoradiography, excised and eluted overnight. The supernatants were recovered, ethanol-precipitated and treated with piperidine 1 M for 30 min at 90°C. Samples were dried in a speed-vac, washed with water, dried again, and resuspended in formamide gel loading buffer. Reaction products were analyzed on 20% denaturing polyacrylamide gels, visualized by phosphorimaging analysis, and quantified by ImageQuant TL software (GE Healthcare Europe, Milan, Italy).
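One common way to summarize footprinting data of this kind is a per-base protection value comparing cleavage band intensities in the folded (with KCl) and unfolded (without KCl) samples. The sketch below assumes equal loading of the two lanes and uses hypothetical intensities; the formula is a generic illustration and is not stated to be the quantification scheme applied in this work.

```python
def dms_protection(intensity_with_k, intensity_without_k):
    """Per-guanine protection from DMS: 1 - I(+K) / I(-K).

    Values near 1 indicate a G that is shielded (N7 engaged in the
    G-quartet); values near 0 indicate a G that is freely methylated.
    Assumes both lanes were loaded with the same amount of DNA.
    """
    if len(intensity_with_k) != len(intensity_without_k):
        raise ValueError("lanes must cover the same guanines")
    return [1.0 - k / no_k
            for k, no_k in zip(intensity_with_k, intensity_without_k)]

# Hypothetical cleavage band intensities for three consecutive Gs.
with_k = [20.0, 25.0, 50.0]
without_k = [100.0, 100.0, 100.0]
print(dms_protection(with_k, without_k))
```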

Nucleolin binding to G-quadruplexes directly correlates with the number of G-tracts
To investigate the determinants required for efficient NCL binding, we initially tested the human telomeric sequence, formed by GGGTTA repeats, and a simplified sequence formed by GGGTTT repeats, which is found in telomeres of ciliates of the genus Paramecium [62]. The interaction of NCL, its four RNA-binding domains (RBDs) and its Arg-Gly-Gly (RGG) domain with telomeric G4s has been previously reported [41,63,64]. Telomeric sequences are rather simple compared to the G4 sequences in oncogene and HIV-1 promoters and were thus chosen to allow assessment of the contribution of oligonucleotide length, loops, repeats, conformation and sequence to NCL binding. NCL/oligonucleotide binding was tested by electrophoresis mobility shift assays (EMSA) with 32P-labelled nucleic acids and the purified full-length human NCL protein [45].
All sequences folded into mixed type G4s as assessed by circular dichroism (CD) analysis: in fact, all oligonucleotides displayed positive peaks around 290 nm and 260 nm, which are signatures of hybrid/mixed type G4 topologies (Fig. 1A). NCL binding to the G4 oligonucleotides was visualized as two bands running slower than the free unbound DNA. The presence of the two bands has been previously demonstrated to be due to the N-terminus self-cleaving activity of NCL [45]: thus the two bands were ascribed to the binding of the full-length (upper) and the N-terminus-minus (lower) NCL protein to the G4s (Fig. 1B). In these conditions, NCL binding was directly proportional to the length of the telomeric sequences and the difference in loop sequence between human and ciliate G4s did not modify this trend (Fig. 1C).

Long loops favor NCL binding to G4s
While the sequences with 4 and 8 G-tracts can fold into 1 and 2 stacked G4s, respectively, sequences with 5 and 6 G-tracts can fold into one G4 structure leaving one or two spare G-tracts. To test whether there was a preferred involvement of G-tracts in these conformationally dynamic sequences, oligonucleotide (GGGTTT)5 (Table 1) was folded in the presence and absence of K+ and treated with DMS to highlight G bases protected from DMS-induced methylation when in the G4 folded state. The analysis showed that all Gs were equally protected during folding (Fig. 2A), indicating the coexistence of G4 structures that involved alternative G-tracts. Similar results have been previously reported [65]. In particular, in the case of the (GGGTTT)5 oligonucleotide, 5 alternative G4 structures, which comprise G-tracts 1-2-3-4, 2-3-4-5, 1-3-4-5, 1-2-4-5, 1-2-3-5, may be envisaged. To test NCL binding to each of these structures, mutant oligonucleotides where each G-tract was replaced by Ts were employed (Table 1). The (GGGTTT)4 oligonucleotide was used as a control since it displays the same sequence and number of G-tracts as the mutant oligonucleotides, but lacks extra T bases in the loops or at the 5′- or 3′-end (Table 1). All mutant sequences were shown to maintain the G4 folding by CD analysis: oligonucleotides displayed mainly a hybrid/mixed type topology with different levels of parallel/antiparallel contributions (Fig. 2B). The stability of the mutant sequences forming internal long loops was slightly lower than that of the wild-type sequence, as gathered from melting temperature (Tm) values measured by CD thermal unfolding (Table 2). EMSA analysis of NCL binding to the mutant sequences showed mild preferential binding towards oligonucleotides where internal G-tracts were mutated (Fig. 2C and D). Binding to sequences with external Ts was consistently lower, but higher than that to the (GGGTTT)4 control. The (GGGTTT)5 oligonucleotide displayed a binding level close to the average of all the mutant sequences.
These data indicate that NCL preferentially binds G4s with at least one loop longer than 3 nts (e.g. 9 nts; compare n5M2, n5M3 and n5M4 binding, lanes 4-6 NCL, to n5M1 and n5M5, lanes 3 and 7 NCL, Fig. 2C) and with free nts at the 5′ or 3′ termini (compare n5M1, lane 3 NCL, and n5M5, lane 7 NCL, to (GGGTTT)4, lane 1 NCL, Fig. 2C).
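The five alternative G-tract combinations listed above, and the loop and flank lengths that each one implies, can be enumerated with a short sketch. It assumes that the five-tract oligonucleotide consists of five GGG tracts separated by TTT spacers and ends with a G-tract, by analogy with the (GGGTTT)4 entry of Table 1; the exact mutant sequences are not reproduced here, so the layout constants and the function name are illustrative.

```python
from itertools import combinations

TRACT_LEN = 3   # GGG (assumed)
SPACER_LEN = 3  # TTT (assumed)
N_TRACTS = 5

def describe_fold(used):
    """Loop and flank lengths (nts) for a G4 built from the given G-tracts.

    used: tuple of four 1-based G-tract indices, in 5'-to-3' order.
    A skipped tract plus its extra spacer is counted as part of the loop
    (or of the flank, when the skipped tract is terminal).
    """
    unit = TRACT_LEN + SPACER_LEN
    loops = [SPACER_LEN + (b - a - 1) * unit for a, b in zip(used, used[1:])]
    flank_5 = (used[0] - 1) * unit
    flank_3 = (N_TRACTS - used[-1]) * unit
    return {"loops": loops, "flank_5": flank_5, "flank_3": flank_3}

folds = {used: describe_fold(used)
         for used in combinations(range(1, N_TRACTS + 1), 4)}

for used, info in folds.items():
    print("-".join(map(str, used)), info)
```

Under these assumptions, skipping an internal G-tract produces one 9-nt loop, whereas skipping a terminal G-tract leaves three 3-nt loops and a 6-nt flank, which matches the distinction drawn above between internal long loops and free terminal nucleotides.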

More than two long or extremely short loops destabilize NCL interaction with G4s
To explore the impact of loop length on NCL binding to G4s, three series of oligonucleotides with different loop-length combinations were tested: in each series, the two lateral loops were kept constant (1, 3 or 6 nts in each series) and the central loop was modified (1, 3 or 6 nts) (Table 1). Short loops are reported to constrain G4s to the parallel conformation [66,67]. In general, G4s with shorter loops were more stable than those with longer loops (Table 3). EMSA binding analysis showed a general increase in NCL binding upon loop extension. In particular, considering the series where the lateral loops were kept constant, an increase in binding was observed with the increasing length of the central loop; in the series where the central loop was kept constant, an increased binding was observed with the increasing length of the lateral loops (Fig. 3B and C). Two notable exceptions were found: when all loops were 1-nt or 6-nt long, NCL binding was higher or lower, respectively, than expected from the above-described trend (oligonucleotides 111 and 666 in Fig. 3B and C).
The observed behavior was apparently independent of the G4 conformation: oligonucleotides adopted both mixed and parallel-like G4 conformations (Fig. 3A), while no straightforward correlation with the intensity of NCL binding could be found (Fig. 3B and C). However, when testing G4 folding in EMSA, we observed one main band for all oligonucleotides, except oligonucleotides 111 and 131, which displayed two bands running at different migration rates (Fig. 3D). To test the molecularity of the upper running band, each labeled oligonucleotide was supplemented with increasing amounts of the same unlabeled nucleic acid. As shown in Fig. 4A, the amount of the upper running band increased with increasing amounts of the unlabeled oligonucleotide, indicating the dimeric nature of the upper band. It has been reported that the presence of one T base at the end of a G4-forming oligonucleotide induces a preferential monomeric G4 [68,69]. Indeed, when one T base was added at each end of the 111, 131 and 161 oligonucleotides (T111T, T131T, T161T, Table 1), the oligonucleotides maintained the G4 conformation (Fig. 4B) and only the monomeric faster running band was obtained (Fig. 4C). These T-added oligonucleotides bound NCL to a lower extent than those without T, probably due to steric hindrance. However, the binding trend was restored: the increase in the amount of NCL/DNA complex paralleled the lengthening of the loops (Fig. 4D).
The above data indicate that in G4-forming sequences formed by GGG tracts and T loops, NCL binding is in general favored by the presence of one or two long loops (i.e. 6 nt), whereas three long loops are less favored. The presence of one or two 1-nt long loops in the absence of at least one 6-nt loop disfavors NCL binding.
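The naming of the loop-length series (lateral, central and lateral loop lengths, e.g. 131) can be made explicit with a short sketch that builds each oligonucleotide from GGG tracts and T-only loops. The composition rule is taken from the description above and reproduces the (GGGTTT)4 entry of Table 1 for the 333 case; the remaining sequences are generated under that assumption and have not been checked against the original table, and the function name is illustrative.

```python
def g4_oligo(loops, tract="GGG", loop_base="T", flank=""):
    """Build a four-tract G4-forming sequence from three loop lengths.

    loops: (lateral, central, lateral) loop lengths in nts
    flank: optional base(s) added at both the 5' and 3' ends
    """
    parts = [tract]
    for n in loops:
        parts.append(loop_base * n)
        parts.append(tract)
    return flank + "".join(parts) + flank

# Three series: lateral loops fixed at 1, 3 or 6 nts, central loop varied.
series = {
    f"{lateral}{central}{lateral}": g4_oligo((lateral, central, lateral))
    for lateral in (1, 3, 6)
    for central in (1, 3, 6)
}

# T-flanked variants used to favor monomeric folding.
t_flanked = {f"T{name}T": g4_oligo(tuple(int(c) for c in name), flank="T")
             for name in ("111", "131", "161")}

for name, seq in {**series, **t_flanked}.items():
    print(name, len(seq), seq)
```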

NCL binding to G4 oligonucleotides is independent of the G4 conformation
It has been reported that oligonucleotide G4 folding depends on the monovalent cation present in solution. For instance, the human telomeric sequence adopts a hybrid conformation in K+ solution [70], while it is antiparallel in Na+ solution [71]. Moreover, the presence in solution of cosolvents, such as polyethylene glycol (PEG), during G4 folding is commonly used to perturb G4 topology [72]. The human telomeric G4 DNA in the presence of PEG is reported to adopt a parallel G4 conformation, which is not the predominant form in dilute solutions [73-76]. To investigate the effect of G4 conformation on NCL binding, we folded the oligonucleotides previously tested in KCl alone (Fig. 3B) in KCl and NaCl in the presence of PEG 20% (w/v). In these conditions we observed in most cases a clear-cut conformational change (Fig. 5A): for instance, oligonucleotide 333 adopted a mixed conformation with a prevalent antiparallel component in K+, while it shifted to a prevalent parallel topology in K+ + PEG (compare Figs. 3A and 5A). The binding of NCL was assessed on these oligonucleotides folded in PEG (Fig. 5B). The trend of increased NCL binding to sequences with longer loops was maintained (Fig. 5C). These data indicate that binding of NCL to telomeric G4 oligonucleotides is in general independent of the conformation and that loop length is the major determinant of NCL binding efficiency to telomeric G4s.

Sequence dependence: NCL binding to oncogene promoter G4s
To verify whether the NCL binding determinants observed in telomeric (GGGTTT)n G4s were maintained in G4s with more complex sequences, we tested NCL binding towards four oncogene promoter G4s, i.e. c-myc, c-kit1, bcl-2 and hTert 1-4 [77-80]. These were shown to form stable G4s whose conformations have been characterized by in-depth structural analysis. All these oncogene promoter G4s display loops with different length, composition and conformation (Table 1 and Fig. 6A). NCL bound preferentially to bcl-2, followed by c-myc; hTert 1-4 and c-kit1 displayed similar and lower NCL binding (Fig. 6B and C). Among these G4s, bcl-2 is the only structure exhibiting a long loop (7 nts) and two shorter loops (3 and 1 nts), whereas c-myc, c-kit1 and hTert 1-4 all have loops ≤4 nts (Table 1).
It has very recently been shown that the hTert full-length sequence is composed of 12 G-tracts that fold into two adjacent G4s: our tested hTert 1-4 comprises the first four G-tracts; the second G4, spanning G-tracts 5-12, displays a peculiar structure where a 26 nt-long loop forms an extended stem-hairpin [81]. We wished to test NCL binding to the two separate hTert G4s (hTert 1-4 and hTert 5-12) and to the full-length sequence (hTert 1-12). All three hTert G4s have similar CD spectra with a major peak at 260 nm and a shoulder at 295 nm, with molar ellipticity values at 260 nm increasing with the number of G-tracts in the G4-forming oligonucleotides (Fig. 7A). NCL binding to hTert 1-4 and hTert 5-12 was similar, indicating that the stem-hairpin suppresses the binding-enhancing effect of the long loop (Fig. 7B and C). Binding of NCL to hTert 1-12 was slightly enhanced (Fig. 7B and C), retracing the behavior observed in the (GGGTTA)n and (GGGTTT)n oligonucleotides, where NCL preferentially bound to oligonucleotides with longer loops when multiple G-tracts were available for folding (Figs. 1 and 2). It is thus possible that also in the case of hTert 1-12 NCL induces folding of a less stable but favored long-looped G4. Alternatively, the 7-nt spacer between the two G4s (hTert 1-4 and hTert 5-12) may be recognized as a long loop and therefore stimulate NCL binding. We next tested bcl-2 and c-myc in the presence of PEG 20% (w/v) to rule out an effect of PEG on NCL binding. Both oligonucleotides in K+ are folded in a parallel G4 conformation, which is not modified in the presence of PEG (Fig. 7D). NCL maintained a similar trend and degree of binding to these G4s in the absence/presence of PEG, confirming that PEG did not significantly alter NCL binding (Fig. 7E).
These data are in perfect agreement with the NCL binding trend observed in telomeric (GGGTTT)n G4s. In addition, they also fit with NCL binding to HIV-1 LTR promoter G4s, previously reported and obtained by both EMSA and FRET analysis [45]. In that case, as well, the best bound G4, LTR-III, displayed 1, 11, 3 nt-long loops, followed by LTR-II with 8, 11, 1 nt-long loops; in contrast, LTR-IV, which presented 1, 4, 1 nt-long loops, was bound to a negligible extent [21,45].
These data indicate that the results collected on (GGGTTT)n G4s apply in general to G4s with random sequences and that, again, the loop length is the major determinant in driving NCL binding to G4s.
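The loop lengths quoted in this section can be tabulated side by side with a short sketch. Only the loop lengths stated in the text are used; for bcl-2 the order of the loops along the sequence is not given, so the tuple order for that entry is arbitrary. Sorting by longest loop is a simple descriptive summary and not a binding model, and the names used are illustrative.

```python
# Loop lengths (nts) as quoted in the text. The bcl-2 loop order is
# not stated in the text and is arbitrary here.
LOOPS = {
    "bcl-2": (7, 3, 1),
    "LTR-III": (1, 11, 3),
    "LTR-II": (8, 11, 1),
    "LTR-IV": (1, 4, 1),
}

def loop_summary(loops):
    """Longest loop and total loop length for one G4."""
    return {"longest": max(loops), "total": sum(loops)}

# Order G4s by their longest loop (ties keep dictionary order).
ranked = sorted(LOOPS, key=lambda name: max(LOOPS[name]), reverse=True)

for name in ranked:
    print(name, LOOPS[name], loop_summary(LOOPS[name]))
```

LTR-IV, the only G4 in this list without a loop longer than 4 nts, is also the one reported as bound to a negligible extent.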

Discussion
Despite the fact that NCL binding to G4 nucleic acids has been proposed to be responsible for human neurodegenerative diseases [16], pathological molecular mechanisms in cells [41-44] and viruses [48-56], a deep characterization of the G4 features that induce NCL binding has been attempted only by Gonzalez et al. [41]. This group, who performed pioneering studies on NCL binding to the oncogene promoter c-myc G4, reported a preferential binding of NCL to parallel G4s with short loops. In contrast, our present work indicates that long loops in the G4 structure are preferred and that this behavior is independent of the G4 conformation or loop/oligonucleotide sequence. These apparent differences may be explained by several reasons: 1) the use of a recombinant NCL purified from bacteria [41] vs the full-length human NCL purified from human cells (this work): the recombinant protein included all RBDs and the C-terminal domain of the natural human protein but lacked the N-terminal domain and was instead fused to the Escherichia coli maltose-binding protein; even if the nucleic acid binding activity was maintained in these conditions, the specific recognition of G4s, and in particular of their steric features, may be severely distorted by the introduction of a different N-terminal domain. Indeed, the N-terminal domain of NCL is a long unstructured (300 amino acids) domain that, besides its reported interaction with other proteins, may support the binding specificity of the RBD and RGG domains that directly make contact with the nucleic acid.
2) It has been demonstrated that short loops tend to constrain G4 structures into a parallel conformation, regardless of base composition [66,67]; therefore, the suggested preferential binding to short-looped G4s in the parallel conformation may derive from the analysis of two dependent variables. 3) In our present work we performed a systematic analysis of loop sequence and oligonucleotide conformation working on a simplified (GGGTTT)n sequence. We unambiguously showed that NCL prefers G4s with at least one long loop (≥ 3 nts), while three short loops (1 nt) or three long loops (6 nts) are disfavored. The analysis of the simplified sequences allowed us to modify the conformation of the oligonucleotides by working in different salt (K+ or Na+) and crowding conditions (PEG 20% w/v) [68,72,75], and to show that NCL binding is in general independent of the oligonucleotide conformation and that loop length is the driving force that enhances NCL binding to G4s. By extending the results from the telomeric sequences to cellular oncogene and HIV-1 promoter G4s, we were able to show that loop length is the major determinant in the enhancement of NCL binding to G4s even in more variable and complex sequences. This observation is in line with the reported recognition of G4s by the cellular protein EWS: EWS shares with NCL the RGG domain, which mediates recognition of G4s [82], and binds to G4s independently of base composition in long loops [64].
Finally, our results may shed some light on NCL recognition of repeats of G4-forming regions, such as the hexanucleotide repeat involved in the neurodegenerative diseases amyotrophic lateral sclerosis and frontotemporal dementia [16] and telomeric repeats [83]. We (Fig. 2A, present work) and others [65] have shown that regions containing multiple G-tracts can fold into several conformations involving the alternative use of different G-tracts. We here indicate that NCL binding stimulates folding of these regions into G4s containing long loops (i.e. some G-tracts may be included in the loop moiety), therefore modifying the overall 3D structure and steric hindrance of these repeated nucleic acid sequences. Interestingly, NCL preferentially stabilized G4s that were less intrinsically stable, likely because in this condition they can be folded into the best protein-fitting conformation, i.e. G4s with longer and loose loops: this conformational modification may represent a marker for molecular signaling and disease pathogenesis. Our data fit well with the very recent demonstration that protein binding to G4 and i-motif structures modifies complex stability and the overall nucleic acid conformation to provide a mechanosensor-regulated cellular mechanism [84].

Conclusions
Our work has unambiguously shown that loop length of G4 structures is a fundamental feature recognized by NCL to promote its binding to G4-folded nucleic acids. Understanding the determinants that regulate NCL/G4 binding is the first step towards the development of therapies that specifically alter this molecular switch that triggers cellular signaling, such as initiation of transcription in the HIV-1 promoter and hence viral production, virus infection, tumor formation, maintenance and metastasis, and neurodegeneration.

Transparency Document
The Transparency document associated with this article can be found in the online version.