Chasing Tails: Cathepsin-L Improves Structural Analysis of Histones by HX-MS*

Highlights
Cathepsin-L is introduced as a novel protease for HX-MS studies.
Cathepsin-L improves resolution of traditionally challenging histone tails.
Cathepsin-L can be readily combined with pepsin for improved protein coverage.
In-solution dynamics of the H3.1 and H4 monomers reveal extensive EX1 kinetics.

The N-terminal regions (tails) of histone proteins are dynamic elements that protrude from the nucleosome and are involved in many aspects of chromatin organization. Their epigenetic role is well established, and post-translational modifications present on these regions contribute to transcriptional regulation. Considering their biological significance, relatively few structural details have been established for histone tails, mainly because of their inherently disordered nature. Although hydrogen/deuterium exchange mass spectrometry (HX-MS) is well suited for the analysis of dynamic structures, it has seldom been employed in this context, presumably because of the poor N-terminal coverage provided by pepsin. Inspired by histone-clipping events, we profiled the activity of cathepsin-L under HX-MS quench conditions and characterized its specificity employing the four core histones (H2A, H2B, H3 and H4). Cathepsin-L demonstrated cleavage patterns that were substrate- and pH-dependent. Cathepsin-L generated overlapping N-terminal peptides about 20 amino acids long for H2A, H3, and H4, proving its suitability for the analysis of histone tail dynamics. We developed a comprehensive HX-MS method in combination with pepsin and obtained full sequence coverage for all histones. We employed our method to analyze histones H3 and H4. We observe rapid deuterium exchange of the N-terminal tails and cooperative unfolding (EX1 kinetics) in the histone-fold domains of histone monomers in solution. Overall, this novel strategy opens new avenues for investigating the dynamic properties of histones that are not apparent from the crystal structures, providing insights into the structural basis of the histone code.


In Brief
Cathepsin-L, a protease known to proteolyse histone tails in cells, is introduced as a novel enzyme for HX-MS. Cathepsin-L generates overlapping N-terminal peptides, improving coverage of histone tails that are poorly represented in HX-MS employing pepsin alone. Cathepsin-L/pepsin is employed to investigate in-solution dynamics of H3 and H4 monomers. Rapid deuteration of the tails is indicative of unfolded polypeptides, whereas extensive bimodal distributions detected in the histone-folds reveal cooperative unfolding events.

Histones are highly conserved proteins that are integral components of the nucleosome, the repeating unit of chromatin. In the "canonical nucleosome structure," 147 DNA base pairs wrap around a histone octamer that consists of a core H3-H4 tetramer flanked by two H2A-H2B heterodimers (1). Interactions within the octamer core and with the surrounding DNA promote a compact and rigid nucleosome structure bordered by flexible N- and C-terminal tails. An extensive array of covalent post-translational modifications (PTMs)¹ decorating the "tails" combinatorially regulates gene transcription, replication, and DNA repair, a system commonly known as the "histone code" (2). These PTMs are pivotal in modulating interactions of effector proteins with the nucleosomal DNA and with neighboring nucleosomes, and thereby influence the formation of higher-order chromatin structure (3). Decoding the information hidden in this extensive modification network and knowing the structure and dynamics of the tails are crucial for understanding the regulation of gene expression.
Histones share a highly similar structure known as the histone fold, composed of the helix-loop-helix motif (4) and complemented by highly basic, dynamic, flanking termini that extend outside the nucleosome boundaries. Given their disordered nature, parts of the histone tails are absent in high-resolution X-ray nucleosome structures reported to date (1, 5-7), although extension of the DNA sequence has resulted in improved electron density clarity of the tails (8). NMR spectroscopy has provided complete coverage of either all histone tails in single nucleosome particles (9) or of individual tails in condensed nucleosomal arrays (10-14). However, size limitations (15) prohibit the analysis of samples of increasing complexity (i.e. as would be the case of a histone/reader complex), a space in which cryo-EM has already shown great promise (16). Cryo-EM has been employed successfully for elucidating structures of nucleosomes alone or in complex with chromatin modifiers (17-19) and for deciphering the organization of higher-order chromatin (for a recent review see (20)). Finally, computational studies of the tails, including coarse-grained models (21, 22) and molecular dynamics simulations (23-25), have provided additional information on the role the tails play in internucleosomal interactions and chromatin compaction.
Despite the wide applicability of hydrogen-deuterium exchange-mass spectrometry (HX-MS) toward the study of unstructured regions within folded proteins (26-28), there are only a few reports to date applying bottom-up (peptide level) HX-MS to the analysis of histones (29-36), and these mainly focus on features of the histone-fold domains. Histone dynamics of H3 variants and H4 have been reported in the context of tetramers, nucleosomes and nucleosomal arrays (29-31). Further, HX-MS has been employed to study interactions between histone dimers H2A-H2B and H3-H4 in complex with chaperones that facilitate chromatin assembly (32-36). In all these studies, pepsin, the most common enzyme for generating peptide fragments under HX quench conditions (37), was employed and for most histones provided poor coverage of the tails. Neprosin was introduced recently as an alternative enzyme for histone studies, producing N-terminal tail peptides 1-38 for H3 and 1-32 for H4 in acidic conditions (38) and demonstrating its potential applicability for future HX-MS studies. Other novel HX-MS proteases that have emerged over the past few years (39-45) have yet to be tested on this class of proteins; their reported cleavage preferences, however, do not seem to render them appropriate for generating peptides of measurable size (7-25 amino acids) given the sequence of the tails. Recently, an alternative approach employing electron transfer dissociation (ETD) in combination with HX-MS top-down and middle-down approaches for this class of proteins was demonstrated (46). Despite the high-resolution histone maps obtained, however, the widespread implementation of ETD in HX-MS studies lags behind typical bottom-up approaches, as the analysis, although seemingly straightforward, presents technical challenges (47-49).
In this study, we sought to investigate the suitability of cathepsin-L, a protease that has been shown to proteolytically process the N-terminal tail of H3 in vivo (50), for enhancing coverage of histone tails in bottom-up HX-MS studies. Cathepsin-L is one of the 11 cysteine cathepsin proteases that share a common catalytic mechanism and strong sequence similarity with the papain superfamily of proteases. Cathepsins are synthesized as propeptides and are processed into the mature form either autocatalytically or by other proteases (51). Their role as degradative enzymes of the lysosome is well established (52), although their observed localization in the nucleus and in tumor microenvironments suggests a variety of extended functions, i.e. in cell development and differentiation (50), as well as in cancer progression (53). Cysteine cathepsins exhibit broad specificity and cleave various proteins in vitro at acidic pH (52). Cathepsin-L has been shown to generate a "primary" cleavage site at H3 Ala21 and additional sites at downstream residues (50, 54), an event of potential epigenetic importance, commonly referred to as "histone clipping" (55). Additionally, cathepsin-L has been shown to generate multiple cleavage sites downstream of H2A Ala21 in vitro (56). Taken together, these data suggest that cathepsin-L may be applicable to the generation of histone N-terminal peptides of appropriate size for HX-MS studies. Inspired by these reports, we investigated the activity of cathepsin-L at HX-MS quench conditions for all core histones. We show enhanced N-terminal coverage for most of the proteins and develop a method to investigate in-solution dynamics of monomeric H3 and H4 by HX-MS.
In Vitro Enzyme Activity Analysis-Individual histones (1 µL of 1 µg/µL, PBS pH 7.5) were diluted with PBS (9 µL, pH ranging from 7.5 to 2.5, TCEP 5 mM), and cathepsin-L (1 µL of 0.5 µg/µL, 25 mM sodium acetate, pH 5.5, TCEP 5 mM) was immediately added to the samples. The enzymatic digestions were allowed to proceed for 2.5 min either on ice (H3) or at room temperature (H4, H2A3 and H2B.1) and were quenched with SCP0110 (1 µL of 240 µM in PBS, pH 2.5). Samples were subsequently subjected to electrophoresis under denaturing conditions and were stained with EZBlue™ following standard protocol.
Intact Protein Analysis and Deconvolution-Intact histones (2 µL, 50 ng/µL in 5% FA) were flow injected and analyzed on a Q Exactive Plus Orbitrap MS (Thermo Fisher Scientific, Bremen, Germany) using a standard m/z range of 400-2000. MS parameters were: spray voltage 2 kV, source temperature 250°C, AGC target 3e6, maximum injection time (IT) 200 ms and resolution (at m/z 200) 140,000. Data were acquired in profile mode and ESI mass spectra were deconvoluted using Intact software (Protein Metrics Inc., San Carlos, CA).
In-solution Digestions and Analysis Using a Q Exactive Plus MS-Individual histones (1.5 µg, PBS pH 7.5) were diluted with PBS (pH 2.5) and cathepsin-L (0.15 µg, 25 mM sodium acetate, pH 2.5) was added to the samples. For cathepsin-L activity experiments, digestions were performed in the presence of PBS containing TCEP, guanidine HCl, or both (at pH 2.5) at various concentrations (Results and Discussion). The enzymatic digestions were allowed to proceed for 10 min at either 37°C or 0°C and were quenched by boiling samples at 95°C for 5 min. Samples were desalted on Empore C18 silica bead Stagetips (57) following standard protocol (58). Elution mixtures were lyophilized and reconstituted in 5% formic acid before nLC-MS/MS analysis. Peptides were injected using a Proxeon Easy nLC 1000 and analyzed on a Q Exactive Plus Orbitrap MS (Thermo Fisher Scientific). Briefly, 2 µL of sample (1 µg) was loaded onto a microcapillary column (360 µm OD × 75 µm ID) with an integrated electrospray emitter tip (10 µm, New Objective) packed with 20 cm of ReproSil-Pur C18-AQ 1.9 µm beads (Dr. Maisch, GmbH). The column was heated to 50°C using a column heater (Phoenix S&T). Peptides were separated using a flow rate of 200 nl/min and the following gradient: 3% B to 40% B in 30 min, to 90% B in 4 min, held at 90% B for 6 min and returned to initial conditions in 1 min. Samples were analyzed in data-dependent analysis (DDA) mode using a Top-12 method. Ion source parameters were: spray voltage 2 kV, source temperature 250°C. Peptide and fragment tolerances were ±20 ppm, the minimum matched peak intensity was 40%, and peptide false discovery rates were calculated to be less than 1% using the target-decoy approach (59). No fixed or variable modifications were included in the search. Spectra with a score higher than 4 were accepted.

¹ The abbreviations used are: PTMs, post-translational modifications; EM, electron microscopy; ETD, electron transfer dissociation; HX-MS, hydrogen/deuterium exchange mass spectrometry; NMR, nuclear magnetic resonance; PBS, phosphate buffered saline; TCEP, tris(2-carboxyethyl)-phosphine hydrochloride; UPLC, ultra-high performance liquid chromatography.
HX Sample Preparation and Analysis Using a QToF-MS-For hydrogen exchange experiments, individual histones (3 µL, 0.4 µg/µL in PBS; 10 mM Na2HPO4, 1.8 mM KH2PO4, 2.7 mM KCl and 147 mM NaCl, pH 7.35) were equilibrated on ice and were mixed with ice-chilled deuterated PBS (47 µL, final D2O content during reaction 94% v/v), prepared by two cycles of lyophilization and reconstitution in D2O. Samples were quenched at 10, 100, 1000, and 10,000 s with TCEP (5 µL of 55 mM containing 13.8% FA) to a final pH of 2.4 and were snap-frozen in liquid nitrogen. Non-deuterated samples (0 s) were prepared in PBS and full deuteration (FD) controls were prepared by incubating proteins as described above for 10 min at 95°C. Samples were prepared and analyzed in duplicate. Prior to LC-MS analysis, samples were thawed using a mini centrifuge and were mixed with ice-chilled cathepsin-L (5 µL, 100 ng/µL in 5 mM TCEP, pH 2.5). Samples were incubated for 5 min at 3°C using an Eppendorf™ Thermomixer™ R (Thermo Scientific) and were immediately injected into a nanoAcquity UPLC with HX technology (Waters). Samples were digested online for 1 min at 300 µL/min using the Enzymate™ BEH Pepsin Column (130 Å, 2.1 × 30 mm, 5 µm, 186007233; Waters) at 0°C and were subsequently loaded on an Acquity UPLC® BEH C18 VanGuard pre-column (130 Å, 1.7 µm, 2.1 × 5 mm, 186003975; Waters) using FA (0.23% (v/v)). Peptides were separated on an Acquity UPLC® BEH C18 analytical column (130 Å, 1.7 µm, 1 × 100 mm, 186002346; Waters) at 40 µL/min using solvents A (0.23% v/v FA) and B (0.23% v/v FA in ACN). The following gradient was applied: 3% B to 10% B in 0.5 min, to 40% B in 7 min, to 60% B in 1.2 min, to 97% B in 0.3 min; kept at 97% B for 0.6 min and returned to initial conditions in 0.4 min. Samples were analyzed on a Synapt XEVO G2-XS QToF MS (MassLynx™ 4.1, SCN 916; Waters).
Spectra were collected in Resolution mode (m/z 350-1500) using the following parameters: capillary voltage 3 kV, cone voltage 40 V, ion source block temperature 80°C, cone gas flow 100 L/h, desolvation gas flow 800 L/h at 150°C, and nebulizer gas flow 6 bar.
StepWave values were: DC Offset, 15 V; Wave Height 1, 5 V; Wave Height 2, 30 V; Wave Velocity 1, 150 m/s and Wave Velocity 2, 150 m/s. Leucine enkephalin (2 ng/µL in 50% [v/v] ACN and 0.23% [v/v] FA) was infused for lock mass spray at 4 µL/min; three spectra of 0.5 s were acquired every 20 s. For peptide identification, individual histones (1 µg each) were run in data-dependent acquisition (DDA) mode upon digestion with cathepsin-L, pepsin or a serial combination of both (Results and Discussion). MS survey scan time was 0.1 s and the 5 most intense multiply-charged ions with >10000 intensity/s were chosen for MS/MS. Fragmentation scans were acquired for m/z 100-1600 with a total ion current target of 15000 intensity/s for up to 0.8 s. A ramped collision energy profile of 18-40 eV was applied and fragmented precursor ions were excluded from further MS/MS for 8 s. The lock mass spray and UPLC parameters were as described above. Peptides were identified using PLGS (Waters). For all histones, a nonspecific enzyme search was performed with parameters as described above for SpectrumMill Proteomics Workbench identification. Spectra were recalibrated using leucine enkephalin (556.2771 Da) for lock mass.
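The labeling and quench volumes given for the HX samples can be checked with simple dilution arithmetic. The sketch below is illustrative only: it assumes the deuterated PBS is effectively 100% D2O and that volumes are additive, and the function names are hypothetical rather than part of the published workflow.

```python
def d2o_fraction(sample_ul: float, d2o_buffer_ul: float, buffer_purity: float = 1.0) -> float:
    """Volume fraction of D2O after mixing protein stock with deuterated buffer.

    buffer_purity = 1.0 is an assumption (buffer treated as pure D2O).
    """
    return d2o_buffer_ul * buffer_purity / (sample_ul + d2o_buffer_ul)


def final_concentration(stock_conc: float, stock_ul: float, total_ul: float) -> float:
    """Concentration of a stock component after dilution into total_ul."""
    return stock_conc * stock_ul / total_ul


# 3 uL histone stock mixed with 47 uL deuterated PBS
labeling = d2o_fraction(3.0, 47.0)
# 5 uL of 55 mM TCEP quench added to the 50 uL labeling reaction
tcep_mm = final_concentration(55.0, 5.0, 3.0 + 47.0 + 5.0)

print(f"D2O during labeling: {labeling:.0%}")
print(f"TCEP after quench: {tcep_mm:.1f} mM")
```

Under these assumptions the numbers agree with the stated 94% (v/v) D2O content and with the 5 mM TCEP used for the digestions.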
Analysis of Deuterium Uptake-Peptide lists were subsequently loaded into DynamX 3.0 (Waters) and deuterium (D) uptake curves of individual peptides were calculated based on the assigned centroid m/z value of the isotopic cluster envelopes. D-uptake values were normalized using experimental values obtained from the analysis of FD controls divided by the deuterium fraction (0.94) in the solvent during the exchange and the theoretical maximum number of exchangeable amides (60). For peptides displaying bimodal distributions, data for each deuteration time point were imported into HX-Express2 (61) and processed using double binomial fittings. Long peptides (>21 residues) that presented convoluted mass envelopes and required triple binomial fitting were excluded from analysis.
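One common way to express this kind of normalization is sketched below. It is an assumed reading of the procedure, not the DynamX or HX-Express2 implementation: the amide-counting convention (every residue after the first, excluding prolines), the function names, and the example masses are all hypothetical. The peptide sequence is the N-terminal 21 residues of H3 as commonly annotated and should be verified against the construct actually used.

```python
def exchangeable_amides(sequence: str) -> int:
    """Count backbone amides able to report exchange.

    Assumed convention: skip the N-terminal residue and all prolines.
    """
    return sum(1 for residue in sequence[1:] if residue != "P")


def fractional_uptake(m_t: float, m_0: float, m_fd: float) -> float:
    """Uptake at time t relative to the full-deuteration (FD) control."""
    return (m_t - m_0) / (m_fd - m_0)


def deuterium_recovery(m_0: float, m_fd: float, sequence: str, d_fraction: float = 0.94) -> float:
    """FD uptake relative to the theoretical maximum at the given D2O fraction."""
    return (m_fd - m_0) / (exchangeable_amides(sequence) * d_fraction)


h3_tail = "ARTKQTARKSTGGKAPRKQLA"  # H3 residues 1-21 (assumed sequence)
m_0, m_t, m_fd = 2000.0, 2007.4, 2014.8  # made-up centroid masses in Da

print(exchangeable_amides(h3_tail))
print(round(fractional_uptake(m_t, m_0, m_fd), 2))
print(round(deuterium_recovery(m_0, m_fd, h3_tail), 2))
```

Masses are taken here as charge-deconvoluted centroids; working directly with m/z centroids would require multiplying mass differences by the charge state.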

RESULTS AND DISCUSSION
Cathepsin-L Activity Profiling at pH 2.5-Cathepsin-L displays optimal activity at slightly acidic pH (~5.5), though activity in the cell nucleus at pH ~7-8 has also been reported (50). To test whether cathepsin-L is functional at pH 2.5 (required during HX-MS quench conditions (37)), we digested the four core histones individually over a pH range from 7.5 to 2.5 (Fig. 1A). Cathepsin-L demonstrated cleavage patterns that were substrate- and pH-dependent and, interestingly, exhibited optimum activity in the lower acidic range (~3.5-2.5) for all substrates. For H3.1, H2A, and H2B, several sub-bands were detected across the entire pH range tested, although activity against H3.1 was reduced at pH 2.5. Digestions at higher pH values yielded fewer bands, indicating lower enzyme activity, in accordance with similar in vitro experiments of cathepsin-L (73.3% sequence similarity) for H3.1 (50). In contrast, cathepsin-L cleaved H4 exclusively at pH values <5.5, with the highest intensity fragments detected at pH 2.5. These results confirm previous reports showing that cathepsin-L has maximal activity at acidic pH and that it processes substrates with pH-dependent kinetics (52).
We next employed nanoLC-MS/MS to identify peptides generated by cathepsin-L for the four histones (supplemental Fig. S1) and investigated cleavage preferences. Prior to digestion, intact histones were analyzed individually by ESI-MS, confirming that the major protein species present had complete and unmodified sequences (~11-15 kDa, supplemental Fig. S2). Histones were digested in triplicate in a slightly reducing environment that enhances the enzyme's activity (Fig. 1B). The enzyme's reproducibility was assessed by the number of common peptides identified within individual histones across replicates. Specifically, 57% of peptides were identified in all three replicates for H2A, 53% for H2B, 55% for H3.1 and 65% for H4 (supplemental Table S1). Of note, another ~20% of peptides were identified in two replicates for individual histones. The moderate reproducibility observed is attributed to the notable secondary exopeptidase activity detected (supplemental Fig. S3A), concordant with cysteine cathepsins being known to contain critical structural elements for both endopeptidase and exopeptidase activities (52). The median pairwise Pearson's correlation of peptide intensities detected in at least two replicates was 0.93 (supplemental Fig. S3B). The "primary" cleavage sites (derived from peptides accounting for ~80% of total ion current) are depicted in supplemental Fig. S3C. For H2A3, cleavage at G44 is in line with cathepsin-L in vitro proteolysis experiments of H2A2A (98% homologous to H2A3) at pH 6 reported previously (56). For H3.1, the "primary" cleavage sites lie toward the C terminus of the protein. However, cleavage sites at the protein's N terminus agree with H3.1 fragments generated in vivo by cathepsin-L (50). For H4, proteolysis using cathepsin-L has not yet been reported in the literature. Our studies indicate that R23 is the most abundant among "primary" cleavage sites, with a few more detected across the protein's length.
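The replicate-overlap measure described above can be sketched as follows. This is a minimal illustration with made-up peptide labels, assuming each replicate is represented as a set of identified peptide sequences; it is not the authors' analysis script.

```python
from collections import Counter


def replicate_overlap(replicates: list[set[str]]) -> dict[int, float]:
    """Percentage of unique peptides seen in exactly k replicates (k = 1..n)."""
    seen_in = Counter(peptide for replicate in replicates for peptide in replicate)
    total = len(seen_in)
    return {
        k: 100.0 * sum(1 for count in seen_in.values() if count == k) / total
        for k in range(1, len(replicates) + 1)
    }


# Toy data: placeholder peptide names, not real identifications.
toy_replicates = [{"pepA", "pepB", "pepC"}, {"pepA", "pepB"}, {"pepA", "pepD"}]
overlap = replicate_overlap(toy_replicates)
print(overlap)
```

In this toy case one of four unique peptides is found in all three replicates; applied to real identifications, the entry for k = 3 would correspond to the percentages quoted for each histone.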
We then screened for optimal concentrations of reducing agent (TCEP) and denaturant (guanidine hydrochloride), typically added to the samples during the quenching step (0°C, pH 2.5) and prior to proteolysis to improve digestion efficiency (37). Our results indicate that a slightly reducing environment (5 mM TCEP) and no denaturant result in optimal cathepsin-L activity (Fig. 1B-1C). This is consistent with maintenance of a reduced active site Cys while leaving structural disulfides intact.
Next, to determine amino acid cleavage preferences of cathepsin-L at pH 2.5, we combined all histone peptides identified above and discarded redundant cleavage sites (194 peptides remaining). We aligned peptides (P5 to P4′ (62)) and loaded them into the iceLogo web application (63). To account for the high fraction of positively charged amino acids, as a reference set, we employed the four histones and cleaved consecutive amino acids across their entire sequences to generate 9 amino acid peptide sequences (455 peptides). Comparison of the frequency percentage of an amino acid at locations P5 to P4′ between the experimental and reference set revealed a preference for valine, leucine and isoleucine in P2 (positive values in Fig. 2), in agreement with previous studies on cathepsin-L cleavage of hemoglobin (64). In H3, the P2 residue (L20) has been shown to be essential for recognition by cathepsin-L, whereas substitution of P1 and P3 has no significant impact on the proteolytic activity of the enzyme (65).
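The construction of the reference set and the per-position comparison can be illustrated with a short sketch. It assumes the reference peptides are all overlapping 9-residue windows of each histone sequence and reports a plain difference in percent frequency; iceLogo itself applies its own statistics, which are not reproduced here. The sequences below are made up for illustration.

```python
from collections import Counter

# Positions around the scissile bond, which lies between P1 and P1'.
POSITIONS = ["P5", "P4", "P3", "P2", "P1", "P1'", "P2'", "P3'", "P4'"]


def sliding_windows(sequence: str, width: int = len(POSITIONS)) -> list[str]:
    """All overlapping windows of the given width along a sequence."""
    return [sequence[i:i + width] for i in range(len(sequence) - width + 1)]


def position_frequencies(windows: list[str]) -> dict[str, dict[str, float]]:
    """Percent frequency of each residue at each aligned position."""
    freqs = {}
    for idx, name in enumerate(POSITIONS):
        counts = Counter(window[idx] for window in windows)
        total = sum(counts.values())
        freqs[name] = {aa: 100.0 * n / total for aa, n in counts.items()}
    return freqs


def frequency_difference(experimental, reference, position: str, residue: str) -> float:
    """Experimental minus reference percent frequency; positive means enriched."""
    return experimental[position].get(residue, 0.0) - reference[position].get(residue, 0.0)


# Toy data only: a made-up sequence and two made-up aligned cleavage windows.
reference = position_frequencies(sliding_windows("GKLAVRTLKGAVLSKA"))
observed = position_frequencies(["GKALAVRTL", "TRKVGAKLS"])
print(frequency_difference(observed, reference, "P2", "L"))
```

A sequence of length n yields n - 8 windows, so the size of the reference set depends only on the lengths of the four histone sequences used.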
Enhanced Histone N Terminus Coverage by Cathepsin-L/Pepsin under HX-MS Conditions-Next, we assessed the ability of cathepsin-L to generate peptides from the N-terminal regions (<50 amino acids) of histones that have been difficult to access using pepsin in earlier HX studies (29, 30, 33, 34, 36). Although there is no well-defined ideal peptide fragment length in HX-MS studies, fragments typically range between ~7 and 15 residues (66). In addition, generation of overlapping peptides is highly desirable for improved structural resolution (67). We compared peptide maps generated using cathepsin-L proteolysis, online pepsin digestion and in-tandem cathepsin-L/online pepsin digestion under HX-MS conditions (cathepsin-L:histone ratio 1:2, 3°C) (supplemental Table S2). For H3, pepsin produced long peptides (~50 amino acids), whereas cathepsin-L generated similar size peptides but also additional overlapping ones (1-21 and 1-22), improving resolution in this region (Fig. 3, Table I, supplemental Table S2). Peptide 1-21 is the shortest generated H3 peptide (at pH 2.5) reported in the literature to date (neprosin produced peptide 1-32 at similar pH (38)) and contains five amino acids that can be post-translationally modified (K4, K9, S10, K14 and K18), emphasizing the challenging nature of the H3 tail. For H4, cathepsin-L generated peptides ~23 amino acids long and several complementary ones starting at D24, corroborating R23 as a "primary" cleavage site. Pepsin cleaved H4 at D24, generating peptides 1-24 and 25-49, partially demonstrated in a previous HX study of the H3-H4 dimer (29). Combined use of both cathepsin-L and pepsin almost doubled the number of H4 peptides and overlapping peptides and was therefore preferred despite a slight increase observed in the average length that was the result of the larger peptides generated by pepsin.
For H2A, each protease generated several peptides, with few of them being common among cathepsin-L and pepsin; five overlapping peptides with overhanging C-terminal residues were detected up to residue G28, providing the highest resolution among histones. Finally, pepsin proved better for H2B, giving a shorter peptide (1-15) and its complementary one (16-40), whereas cathepsin-L produced peptides that were all >37 amino acids in length. The use of both proteases in this case, however, resulted in shorter average peptide lengths and greater sequence overlap. Given the highly basic content of the N termini, detected peptides were highly charged and therefore excellent candidates for HX studies employing ETD for single-residue resolution (68). In contrast to the tails, both proteases generated many overlapping peptides for the remaining protein sequences (beyond amino acid 50), emphasizing the contrast in the amino acid composition between the tails and the core histone-fold regions. Using cathepsin-L, we were able to obtain full sequence coverage for H2A and H4, and ~85% coverage for H2B and H3. Overall, the use of cathepsin-L/online pepsin digestion in tandem proved more efficient compared with each protease alone, evident by the high number of peptides identified and improved resolution. Of note, the average peptide length was not impacted using single or dual protease digestion and was in the range of 18-25 amino acids. We further tested the developed digestion scheme in intact mononucleosomes containing the histone octamer (H2A1B, H2B1K, H3 and H4). Overlapping N-terminal peptides for H2A (1-24, 1-28), H3 (1-21, 1-23) and H4 (1-23) were detected and full coverage was obtained for all histones (with the exception of H3, 93%), indicating that our method may be deployed successfully in a mononucleosomal context (supplemental Table S2).
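Sequence coverage and average peptide length, the two summary measures used in this comparison, can be computed from a peptide map as sketched below. The helper names are hypothetical, peptides are given as 1-based inclusive (start, end) pairs, and the H4 length of 102 residues is an assumption not stated in the text. The four peptides are the H4 N-terminal fragments mentioned above, so the resulting coverage describes only this subset and not the full map.

```python
def sequence_coverage(peptides: list[tuple[int, int]], protein_length: int) -> float:
    """Percent of residues covered by at least one peptide."""
    covered = set()
    for start, end in peptides:
        covered.update(range(start, end + 1))
    return 100.0 * len(covered) / protein_length


def mean_peptide_length(peptides: list[tuple[int, int]]) -> float:
    """Average peptide length in residues."""
    return sum(end - start + 1 for start, end in peptides) / len(peptides)


# H4 N-terminal peptides named in the text (cathepsin-L and pepsin combined).
h4_tail_peptides = [(1, 23), (1, 24), (24, 49), (25, 49)]
H4_LENGTH = 102  # assumed length of the H4 construct

coverage = sequence_coverage(h4_tail_peptides, H4_LENGTH)
average = mean_peptide_length(h4_tail_peptides)
print(f"{coverage:.1f}% coverage, mean length {average:.1f} residues")
```

Counting covered residues as a set means overlapping peptides raise redundancy, which is what improves spatial resolution, without inflating the coverage figure.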
Flexible H3 and H4 Tails in Monomers-During DNA replication, newly synthesized histones are recognized by chaperones that coordinate their nuclear import and deposit them onto DNA to form nucleosomes. Most histone chaperones have a preference for binding and importing histone heterodimers (69), and HX structural information has been focused mainly on the analysis of heterodimers (34), tetramers (29) and nucleosomes (30, 31) or on heterodimer-chaperone interactions (32-36), disregarding monomeric states. Recent evidence, however, has shown that H3 and H4 are predominantly monomeric in the cytosol and can be rapidly imported into the nucleus bound tightly to importin-β proteins (70). Obtaining structural information at the monomer level, therefore, is crucial for delineating structural elements governing histone-chaperone interactions. We employed our method using dual cathepsin-L/pepsin digestion to probe conformational dynamics of the H3 and H4 monomers. The monomeric states of the proteins were confirmed using size-exclusion chromatography (supplemental Fig. S4). Proteins were incubated in D2O for different periods of time (10 s, 100 s, 1000 s and 10000 s) and exchange was performed on ice to slow HX rates (71). On quenching, proteins were digested for a total of 5 min using cathepsin-L followed by online pepsin digestion and analyzed by LC-MS (Fig. 4A). The D-uptake of individual peptides was subsequently calculated; for peptides with bimodal isotope distributions, the fraction of each population was modeled further (Fig. 4B). In total, 36 peptides were generated for H3 (87% coverage) and 56 for H4 (100% coverage) monomers and a subset of those was analyzed for their D-content (23 for H3 and 30 for H4), depicted as heat maps for individual time points (Fig. 5A and 5B) and mapped onto the histone structures (Fig. 5C and 5D). D-recoveries were 83% for H3 and 80% for H4 (average values, calculated as described in (60)). D-uptake graphs for individual peptides are shown in supplemental Fig. S5.

FIG. 3. Peptide maps following digestion with cathepsin-L, online pepsin and a combination of both. Each line represents a unique peptide identified following digestion with cathepsin-L (5 min, enzyme/protein ratio 1:2, 3°C), online pepsin (1 min, 300 µL/min) and a combination of both using above parameters. All digestions occurred in a slightly reducing environment (TCEP 5 mM). Peptides were desalted online and analyzed by a nanoAcquity-Synapt XEVO G2-XS MS (Waters). Upon identification, peptides were further processed and validated in DynamX 3.0 (Waters) (supplemental Table S2).
Both monomers demonstrated highly flexible N termini, evident by extensive deuteration at the earliest time point. For H3, all amide protons exchanged with deuterons (peptides 1-21 and 1-22), indicating that this region is unstructured. To our knowledge this is the first time that dynamics of the H3 N terminus have been captured at this peptide resolution by bottom-up approaches. In previous HX-MS studies of H3 and variants, poor coverage of the N terminus was obtained using pepsin (29-31, 34). In the present study, the higher coverage obtained was made possible by use of cathepsin-L. Exchange profiles for the longer overlapping peptides (1-47, 1-48, and 1-49) extending to the first residues of helix αN (46-57) were obtained for the monomers. Deuterium exchange indicated a slight increase in protection (~88% at 10 s versus 100% for peptides 1-21 and 1-22 of H3), suggesting participation of amide protons in H-bonds in the 23-49 region.
For H4, the N-terminal tail was characterized by peptides 1-23, 1-24 and a few longer peptides (1-38, 1-48, 1-49) as well as their N-terminal flanking ones (24-38, 24-49, 25-49) that contain segments of helix α1 and loop L1. For peptides 1-23 and 1-24, D-uptake was ~80% throughout the time course of the experiment. This may reflect some helical content that has been observed in molecular dynamics studies (73) and corroborated further by circular dichroism for 1-23 of unmodified H4 (17% helical content) (74). Further, previous HX studies of the H2A-H2B dimer have shown that ionic strength plays a critical role in secondary structure formation, with lower salt concentrations (60 mM NaCl) leading to partial unfolding and higher salt concentrations enhancing H-bond formation (33). These findings suggest that the tail of the H4 monomer may indeed form a stable secondary structure under our experimental conditions (147 mM NaCl). In the absence of salt, the monomeric H4 tail showed rapid exchange at the earliest time point, corroborating further the dependence of structure formation on ionic strength; high salt concentration that was required to maintain the H3-H4 tetrameric complex resulted in reduced deuterium uptake (46). Further, in previous studies of the H3-H4 heterodimer and heterotetramer, H4 peptide 1-24 (pepsin generated) showed rapid exchange at the earliest time point (29, 34), suggesting that the chemical environment also plays a major role in the structural conformation of the tails (34).
Extensive EX1 Kinetics in the Histone Folds of H3 and H4 Monomers-Two mechanisms have been proposed to describe HX kinetics of a protein under native conditions, termed EX2 and EX1 (76,77). In EX2, the chemical exchange rate is slower than the protein's refolding rate, and the protein goes through many unfolding and refolding events before a successful exchange event; in EX1, the chemical exchange rate is faster than the refolding rate, and exchange occurs on multiple exposed residues prior to refolding. EX2 kinetics present as a single isotopic distribution, whereas EX1 kinetics present as two distinct mass envelopes, with the lower mass envelope representing the "closed" state and the higher mass envelope representing the "open" state of a protein (Fig. 4B). Dynamics of the histone folds (4) were dominated by bimodal isotopic distributions (supplemental Fig. S6), indicating exchange kinetics in the EX1 regime, which occur when the rate of refolding is slower than the rate of chemical exchange (77). As a result, for regions in which we observed EX1 kinetics, we included a second heat map for each protein, with the fast-exchanging regions corresponding to the "open" state and the slow-exchanging regions corresponding to the "closed" state (Fig. 5A and 5B), and mapped exchange profiles for both conformations on the respective structures (Fig. 5C and 5D) (1). To avoid erroneous assignment of EX1 kinetics, blanks were injected to preclude carryover between samples (78), and the StepWave ion guide, which has been shown to cause artificial bimodal patterns for highly charged peptides (79), was optimized. For all time points we further calculated the relative population of each state in solution (61) (supplemental Table S3, supplemental Fig. S7). For both histones, relative populations changed over time, suggesting EX1 kinetics rather than two distinct but static conformations in solution, which would be indicated if the relative populations remained constant (61).
Fig. 4 (caption fragment). ... (H3.1, H4). Samples were prepared and analyzed in duplicates. B, Cartoon representation of D-uptake data analysis. Isotopic distributions (EX2) were processed in DynamX 3.0 (Waters) and the D-uptake (centroid depicted with a black line) of peptides over time was calculated. Bimodal isotopic distributions (EX1) were processed in HX-Express2 (60) and the D-uptake was calculated (red and blue dashed lines correspond to a centroid value calculated for each distribution) along with the fraction of each population in solution (the size of each bar corresponds to the "area under the curve" for each binomial distribution observed). D-uptake values from EX2 and EX1 were normalized to full deuteration controls as described in Mayne (60).
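The quantities used in the EX2 and EX1 analyses (envelope centroid, D-uptake normalized to a full deuteration control, and the relative population of each state) can be illustrated with a minimal Python sketch. This is not the DynamX or HX-Express2 implementation. The function names, the normalization formula 100 x (m_t - m_0)/(m_100 - m_0), the 0.05 tolerance, and all numerical values are assumptions introduced here for illustration only.

```python
from typing import Sequence


def centroid(mass: Sequence[float], intensity: Sequence[float]) -> float:
    """Intensity-weighted mean mass (Da) of one isotopic envelope.

    Assumes m/z values were already converted to mass for a single charge state.
    """
    total = sum(intensity)
    if total <= 0:
        raise ValueError("envelope has no signal")
    return sum(m * i for m, i in zip(mass, intensity)) / total


def percent_uptake(m_t: float, m_undeut: float, m_full: float) -> float:
    """D-uptake (%) relative to a fully deuterated control.

    Assumed normalization: 100 * (m_t - m_0) / (m_100 - m_0), where m_0 and
    m_100 are the centroids of the undeuterated and fully deuterated peptide.
    """
    span = m_full - m_undeut
    if span <= 0:
        raise ValueError("full-deuteration control must be heavier than undeuterated")
    return 100.0 * (m_t - m_undeut) / span


def open_fraction(area_closed: float, area_open: float) -> float:
    """Relative population of the 'open' (higher-mass) envelope from fitted areas."""
    total = area_closed + area_open
    if total <= 0:
        raise ValueError("no signal in either envelope")
    return area_open / total


def populations_shift(fractions: Sequence[float], tol: float = 0.05) -> bool:
    """True if the open fraction changes by more than `tol` across time points.

    A changing population is the pattern attributed to EX1 kinetics in the text;
    a constant population would instead point to two static conformations.
    The tolerance is an arbitrary illustrative choice.
    """
    return max(fractions) - min(fractions) > tol


if __name__ == "__main__":
    # Hypothetical envelope and control values, for illustration only.
    m_t = centroid([1000.0, 1001.0, 1002.0], [1.0, 2.0, 1.0])
    print(percent_uptake(m_t, m_undeut=1000.0, m_full=1010.0))
    print(open_fraction(area_closed=25.0, area_open=75.0))
    print(populations_shift([0.2, 0.5, 0.8, 0.95]))
```

In this sketch each envelope of a bimodal distribution would be passed to `centroid` separately, giving one normalized D-uptake value for the "closed" state and one for the "open" state, while `open_fraction` is evaluated at every time point to follow how the populations evolve.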
The histone fold is a conserved structural motif, composed of three α-helices (α1, α2, α3) connected via two loops (L1, L2), that promotes histone dimer formation (4); in H3, there is an extra helix preceding the histone fold, termed αN (1). Bimodal distributions were detected for H3 residues 48-125, encompassing helix αN and the histone fold motif (denoted as α1-L1-α2-L2-α3 in Fig. 5A). Exceptions to this were the central and carboxy-terminal part of α3 and the C-terminus of the protein, which demonstrated EX2 kinetics. Helix αN (residues 48-61) was mostly rigidly packed (~0.8 in the "closed" state at 10 s), as evident from the low D-uptake over the time course of the experiment (<10%); the remaining fraction (~0.2) was unfolded. Helix αN has previously been shown to adopt its state in a context-dependent manner, appearing unfolded in (H3-H4)2 heterotetramers (29,34,46) but stably folded in nucleosomes (1,30,46), nucleosomal arrays (31) and in complex with chaperones (34). α1-L1 indicated dynamic regions (~30% overall D-uptake in the time course of the experiment), in contrast to α2, which showed the least extent of deuterium incorporation in the histone fold. Segment α2-L2-α3, which promotes tetramer formation of the H3-H4 histone pairs (1), is mainly unfolded (>0.75 of population, >89% D-uptake). EX1 kinetics for this region have been observed previously in the (H3-H4)2 tetramer at high ionic strength and have been explained by the transient loss of tetramerization, further supported by the abolishment of the biphasic behavior upon complex formation of the H3-H4 dimer with DAXX, which stabilized this region (34). Further, this region in monomeric H3 and in an H3-H4 dimer complex with ASF1 has been identified to interact with the chaperone NASP (80). Taken together, our data suggest that at physiological concentrations of salt and in the absence of H4, this region presents a high degree of flexibility that may be required to accommodate the breadth of the histone chaperone network. For H4, peptides within the histone fold exhibited extensive EX1 kinetic profiles, with the α1 and α3 helices occupying mainly an "open" state (>0.7 at 10 s) and helix α2 mainly a "closed" state (~0.8 at 10 s). For α1 and α3, the "closed" population exchanged >40% deuterium at 10 s, indicating partially folded elements (supplemental Fig. S5B). This contrasts with α2, where D-uptake was <20% over the time course of the experiment, indicating a tightly folded helix (supplemental Fig. S7). EX1 behavior of the α2 helix has been observed in the (H3-H4)2 tetramer, but not when H4 forms a tetramer complex with the H3 variant CENP-A (29). CENP-A appears to rigidify helix α2 and minimize exchange. Further, cooperative unfolding detected previously for both H2A and H2B in the H2A-H2B heterodimer was also localized at helical regions of the histone fold. Similarly to the H3-H4 dimer with DAXX (34), bimodal distributions were no longer observed upon complex formation of the H2A-H2B dimer with Nap1 (33).
Fig. 5 (caption fragment). ... (1). Heat map color-coding is based on the deuterium uptake (%) calculated upon normalizing D-uptake values to respective values obtained using fully deuterated control samples as described in Mayne (60). Regions with bimodal isotopic envelopes detected for the monomers are depicted with two heat maps, one representing the "open" conformation and one the "closed" conformation. Individual time points of exchange are 10, 10^2, 10^3, and 10^4 s. Individual D-uptake values and profiles at the peptide level are given in supplemental Table S3 and ...
Overall, EX1 kinetics seem to be a feature of certain regions of histones, irrespective of whether these are in a monomeric state, engaged in dimers or tetramers, or in complexes with chaperones. Differences in experimental conditions (such as temperature and salt concentration, which affect H-bond formation) make it difficult to compare prior studies with our current work with respect to the exact boundaries of regions that exhibit EX1 exchange kinetics. However, the EX1 kinetics detected in our study for monomeric histones are extensive and span the entire protein sequence (omitting the N-terminal tails), suggesting high conformational flexibility, in contrast to histones engaged in complexes, where EX1 behavior is confined to short regions. Recent evidence showing that H3 and H4 monomers associate with importin-β proteins and are imported into the nucleus in a monomeric state expands their interaction network further (70). We presume that a high degree of flexibility is a feature of histones required to accommodate the breadth of chaperones that regulate their nuclear import and assembly into nucleosomes; these observations provide exciting directions for future studies.
In summary, cathepsin-L in tandem with pepsin provided a high-resolution peptide map of histones suitable for HX-MS, including their tails, which have thus far eluded crystallographic characterization. The introduction of cathepsin-L as a new tool to facilitate dual-protease techniques could be highly impactful for the study of important biological molecules beyond histones. Such cases may include protein classes known to be challenging for HX-MS, such as antibodies (81), or proteins that have been identified as cathepsin substrates (i.e., elastins, collagens and proteoglycans) (52). For example, the viral fusion glycoprotein F has been reported to be activated through cleavage by cathepsin-L in cells (82) but showed only moderate coverage in an HX-MS study employing pepsin alone (83). We anticipate that the advance shown here for histones will encourage the HX-MS community to consider cathepsin-L when probing challenging proteins; its greatest potential impact may therefore be on the HX-MS field as a whole rather than on histone structural biology alone.