A Quantitative Proteomic Analysis of In Vitro Assembled Chromatin*

The structure of chromatin is critical for many aspects of cellular physiology and is considered to be the primary medium to store epigenetic information. It is defined by the histone molecules that constitute the nucleosome, the positioning of the nucleosomes along the DNA and the non-histone proteins that associate with it. These factors help to establish and maintain a largely DNA sequence-independent but surprisingly stable structure. Chromatin is extensively disassembled and reassembled during DNA replication, repair, recombination or transcription in order to allow the necessary factors to gain access to their substrate. Despite such constant interference with chromatin structure, the epigenetic information is generally well maintained. Surprisingly, the mechanisms that coordinate chromatin assembly and ensure proper assembly are not particularly well understood. Here, we use label free quantitative mass spectrometry to describe the kinetics of in vitro assembled chromatin supported by an embryo extract prepared from preblastoderm Drosophila melanogaster embryos. The use of a data independent acquisition method for proteome wide quantitation allows a time resolved comparison of in vitro chromatin assembly. A comparison of our in vitro data with proteomic studies of replicative chromatin assembly in vivo reveals an extensive overlap showing that the in vitro system can be used for investigating the kinetics of chromatin assembly in a proteome-wide manner.

DNA replication, transcription and repair continuously disturb the conformation of chromatin, which results in a relatively high rate of histone turnover (1) and poses a constant threat to the maintenance of epigenetic information (2,3). Therefore, chromatin assembly has to be tightly controlled to ensure a proper chromatin structure. It is well appreciated that chromatin assembly is a highly regulated multistep process involving synthesis, storage and nuclear transport of histones followed by their deposition onto DNA. Immediately after translation and before the assembly onto DNA, histones are bound by a number of chaperones that assist their folding, posttranslational modification and nuclear transport and prevent nonspecific association with negatively charged cellular molecules (4-6). Once histones are deposited, chromatin adopts a particular conformation containing specific histone modification patterns (7-9) and a defined composition of associated proteins (10-13). Crosslinking experiments show that histones H3 and H4 are first deposited as a tetramer, whereas two dimers of H2A and H2B are added at a subsequent stage (14,15). A similar assembly pathway is also observed in an in vitro assembly system where the process of histone deposition and chromatin contraction occurs within 30 s (16,17). Regardless of this apparently rapid compaction, it takes much longer for new chromatin to become indistinguishable from the bulk chromatin in vivo (9,13).
Recent systematic studies revealed that mature chromatin adopts a complex molecular structure containing a large variety of binding factors that goes far beyond a simple aggregate of DNA and histones (11,12,18,19). This observation raises the question of how this structure is assembled, in which order individual factors bind to the DNA, whether distinct intermediates exist during chromatin assembly and which key players mediate chromatin maturation. Many of those questions are extremely difficult to address experimentally because of the high complexity of chromatin assembly and maturation in vivo and its high level of cooperativity. In particular, functionally important components of chromatin synthesis will be difficult to analyze in vivo, as they are expected to have a severe impact on cell division and viability. Therefore, key aspects of chromatin assembly are more accessible in an in vitro reconstitution system. Embryonic extracts are extremely rich sources of factors required for chromatin assembly, such as storage chaperones (20-22), and can therefore support chromatin assembly in vitro (20,23,24). Although it has been shown that such extracts recapitulate several aspects of chromatin assembly in vivo and can therefore be used to investigate this process (23-25), a systematic comparative study has not been done so far. With the recent development of methods like iPOND (10,26) and NCC (13) to investigate replicative chromatin assembly in vivo and improved techniques of label-free MS-based quantitation of proteins in complex samples (27), such comparative studies have become feasible.
In this study, we used immobilized linear DNA to rapidly isolate in vitro assembled chromatin at different time points and determined its protein composition in a time-resolved manner using sequential window acquisition of all theoretical fragment ions (SWATH)-MS-based label-free protein quantitation. A comparison with the proteomic investigation of chromatin assembled in vivo (13) reveals an almost 80% overlap with the orthologous proteins assembled in vitro. Interestingly, we observe very similar binding kinetics, as proteins enriched in nascent chromatin in vivo also bind preferentially during early time points of in vitro chromatin assembly. The similarities of protein identity, binding kinetics and the largely sequence-independent protein binding to in vitro assembled chromatin further support the usability of such in vitro assembly systems to dissect the general mechanisms of chromatin assembly.

Preparation of Drosophila Embryonic Extract (DREX)-D. melanogaster embryos were collected on agar trays with yeast paste 0-90 min after egg-laying. Using a brush and sieves with descending mesh size (0.71 mm, 0.355 mm, 0.125 mm), embryos were rinsed with cold tap water and allowed to settle into ice-cold embryo wash buffer (0.7% NaCl, 0.05% Triton X-100) to arrest further development. After five successive collections, the wash buffer was decanted and replaced with wash buffer at room temperature. For dechorionation of the embryos, the volume was adjusted to 200 ml and 60 ml of 13% hypochlorite solution was added. The embryos were stirred vigorously for 4 min on a magnetic stirrer, poured back into the collection sieve (0.125 mm), and rinsed with tap water for 5 min. Embryos were allowed to settle in 200 ml of wash buffer for about 3 min. Afterward, the supernatant containing the chorions was removed. Following two more settlings in 0.7% NaCl and in extract buffer (10 mM HEPES (pH 7.6), 10 mM KCl, 1.5 mM MgCl2, 0.5 mM EGTA, 10% glycerol, 10 mM β-glycerophosphate, 1 mM dithiothreitol (DTT), and 0.2 mM phenylmethylsulfonyl fluoride (PMSF), added freshly) at 4°C, the embryos were settled in extract buffer in a 60 ml glass homogenizer on ice. The volume of the packed embryos was estimated before the supernatant was aspirated, leaving the packed embryos with an additional 2 ml of buffer on top. Homogenization was performed with one stroke at 3000 rpm and 10 strokes at 1500 rpm with a pestle connected to a drill press. The homogenate was supplemented with MgCl2 to a final concentration of 5 mM. Nuclei were pelleted by centrifugation for 10 min at 10,000 rpm in an SS34 rotor (Sorvall, Thermo-Fisher Scientific, San Jose, CA). The supernatant was centrifuged again for 2 h at 45,000 rpm in a chilled SW 56 rotor (Beckman-Coulter, Munich, Germany). The clear extract was isolated with a syringe, avoiding the top layer of lipids. Extract aliquots were frozen in liquid nitrogen.
Protein concentration was determined by Nanodrop measurement and titration with chromatin assembly experiments.
Biotinylation of DNA-To obtain linearized and biotinylated DNA, we used a plasmid that contains oligomers of the sea urchin 5S rDNA positioning sequence. Five hundred micrograms of plasmid DNA were linearized using the restriction enzyme SacI. Completion of the digest was verified by agarose gel electrophoresis. Upon completion of the plasmid digestion, we added the restriction enzyme XbaI to the reaction and incubated for at least 3 h at 37°C. Subsequently, the DNA was precipitated and purified, followed by incubation with 80 mM dCTP and dGTP, 3 mM biotinylated dUTP and dATP and the Klenow polymerase. To purify the DNA from excess nucleotides and enzyme, we used G50 Sepharose columns (Roche, Penzberg, Germany) according to the manufacturer's protocol. Finally, the DNA concentration was measured and adjusted to 200 ng/µl. The same procedure was applied for the heterochromatic 359 bp repeat sequence using a pBluescript plasmid containing 4 repetitive elements of the 359 bp repeat.
TSA was added to a final concentration of 50 µM during chromatin assembly experiments as indicated. For time-resolved studies, the assembly reaction was incubated at 26°C for 15 min, 1 h and 4 h, respectively. After two stringent wash steps with EX200, beads were resuspended in elution buffer (EX100 with 0.5 U/µl MNase and 2 mM CaCl2). The supernatant after 10 min of MNase-mediated elution was subjected to mass spectrometry-based protein quantitation.
Micrococcal Nuclease Digestion-Chromatin from 1 µg circular DNA assembled for 15 min, 1 h or 4 h was resuspended in EX50 containing 5 mM CaCl2 and 100 Boehringer units/µl of MNase (Sigma). After incubation at room temperature for 30 s and 90 s, respectively, a 110 µl fraction of the digestion was stopped by adding 40 µl MNase stop solution (2.5% N-lauroylsarcosine, 100 mM EDTA pH 8.0). The suspension was subjected to RNase A and proteinase K treatment and the precipitated DNA was separated on a 1.3% agarose gel. A 100 bp ladder (Invitrogen) was used as a size marker.
Preparation of MS Samples for Proteomics Analysis-Assembled chromatin was subjected to mass spectrometry analysis. 10% of the chromatin-bound proteins were separated on a 4-20% gradient SDS-PAGE and analyzed by silver staining (29). 90% of the chromatin-bound proteins were subjected to MS-sample preparation. Proteins were denatured in 3 M urea, 1 M thiourea and 25 mM DTT for 2 h at 20°C followed by an incubation for 30 min in the dark with a final concentration of 25 mM iodoacetamide at 20°C to carbamidomethylate sulfhydryl groups of free cysteines. Subsequently, DTT was added to a final concentration of 50 mM and incubated for 30 min at 20°C. The samples were diluted with 100 mM ammonium bicarbonate to lower the urea concentration below 1 M for tryptic cleavage with 200 ng of trypsin (Promega, Madison, WI) in 50 mM ammonium bicarbonate. Digestion was completed after 14 h at 25°C. 10% of the tryptic peptide mixture was acidified using trifluoroacetic acid (TFA) and desalted using C18 stage tips prior to mass spectrometry analyses and redissolved in 0.2% TFA (30). The resulting liquid, containing the digested peptides, was dried and redissolved in 17 µl of 0.2% TFA and stored at -20°C until further processing.
Sample Preparation for Histone Modification Analysis by MS-Nuclear-enriched fractions were separated by SDS-PAGE, stained with Coomassie (Brilliant blue G-250) and protein bands in the molecular weight range of histones (15-23 kDa) were excised as a single band/fraction. Gel slices were destained in 50% acetonitrile/50 mM ammonium bicarbonate. Lysine residues were chemically modified by propionylation for 30 min at RT with 2.5% propionic anhydride (Sigma) in ammonium bicarbonate, pH 7.5, to prevent tryptic cleavage. This step adds a propionyl group only to unmodified and monomethylated lysines, whereas lysines carrying other side-chain modifications do not acquire an additional propionyl group. A set of 30 precursors of heavy SILAC-R10-labeled standard peptides (spiketides) coding for common histone modifications was added prior to tryptic digestion. Spiketide abundance was used to normalize for different sample amounts. Subsequently, proteins were digested with 200 ng of trypsin (Promega) in 50 mM ammonium bicarbonate overnight and the supernatant was desalted by carbon Top-Tips (Glygen) according to the manufacturer's instructions.
Proteomic Analysis via LC-MS/MS on Orbitrap Mass Spectrometer-The peptide mixture resulting from tryptic cleavage was injected onto an Ultimate 3000 HPLC system equipped with a C18 trapping column (C18 PepMap, 5 mm × 0.3 mm × 5 µm, 100 Å) and an analytical column (C18RP Reposil-Pur AQ, 120 mm × 0.075 mm × 2.4 µm, 120 Å, Dr. Maisch, Ammerbuch-Entringen, Germany) packed into an ESI-emitter tip (New Objective, Woburn, MA). First, the peptide mixture was desalted on the trapping column for 7 min at a flow rate of 25 µl/min (0.1% FA). For peptide separation, a linear gradient from 5-40% B (HPLC solvents A: 0.1% FA, B: 80% ACN, 0.1% FA) was applied over 120 min. The HPLC was coupled online to an LTQ Orbitrap XL mass spectrometer (Thermo-Fisher Scientific).
The mass spectrometer was operated in DDA mode employing a duty cycle of one survey scan in the orbitrap at 60,000 resolution followed by up to 6 tandem MS scans in the ion trap. Precursors were selected when they had a minimal intensity of 10,000 counts and a charge state of 2+ or higher. Previously analyzed precursors were excluded for 20 s within a mass window of -1.5 to +3.5 Da.
Proteomic Analysis via LC-MS/MS on Q-TOF Mass Spectrometer-Samples were injected into an Ultimate 3000 HPLC system (Thermo Fisher Scientific) for nano-reversed phase separation of tryptic peptide mixtures before MS analysis. Peptides were desalted on a trapping column (5 × 0.3 mm inner diameter; packed with C18 PepMap100, 5 µm particle size, 100 Å pore diameter, Thermo-Fisher Scientific). The loading pump flow of 0.1% formic acid (FA) was set to 25 µl/min with a washing time of 10 min under isocratic conditions. Samples were separated on an analytical column (150 × 0.075 mm inner diameter; packed with C18RP Reposil-Pur AQ, 2.4 µm particle size, 100 Å pore diameter, Dr. Maisch) using a linear gradient from 4% to 40% B in 170 min with a gradient flow of 270 nl/min. Solvents for sample separation were A: 0.1% FA in water and B: 80% ACN, 0.1% FA in water. The HPLC was directly coupled to the 6600 TripleTOF mass spectrometer using a nano-ESI source (both Sciex, Framingham, MA). A data-dependent method was selected for MS detection and fragmentation of eluting peptides comprising one survey scan for 225 ms from 300 to 1800 m/z and up to 40 tandem MS scans for putative precursors (100-1800 m/z). Precursors were selected according to their intensity. Previously fragmented precursors were excluded from reanalysis for 30 s.
Data Analysis of Data-dependent LC-MS Experiments-DDA-MS data recorded on the LTQ Orbitrap mass spectrometer were processed with MaxQuant (version 1.2.2.5) using standard settings with the additional options LFQ and iBAQ (log fit) selected. Data were searched against a combined forward/reversed database (special amino acids: KR) including common contaminants for false-discovery rate filtering of peptide and protein identifications (Dmel_all translation r5.57, 30305 entries). The mass deviation for the precursor mass was set to 20 ppm; fragment ions were matched within 0.5 Da mass accuracy. Fixed modifications of cysteine (Carbamidomethyl (C)) were included as well as variable modifications by oxidation of methionine and acetylation (Acetyl (Protein N-term); Oxidation (M)). Matches were filtered by setting the rate of false peptide and protein hits (PSM FDR and protein FDR) to 1%. The minimum peptide length was set to 6 amino acids and the minimum score for modified peptides to 40. For protein identification, one non-unique razor peptide was required, whereas protein quantitation was only performed if at least two razor peptides were associated with the protein hit. Prior to statistical analysis in Perseus, protein hits associated with the reversed database or common contaminants were filtered in the protein.groups.txt file (supplemental Table S1).
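The forward/reversed database search described above relies on target-decoy filtering. The following minimal Python sketch illustrates the general principle only; it is not MaxQuant's internal procedure, and the function name `filter_by_fdr` and the (score, is_decoy) input format are assumptions made for illustration.

```python
from typing import List, Tuple


def filter_by_fdr(psms: List[Tuple[float, bool]],
                  max_fdr: float = 0.01) -> List[Tuple[float, bool]]:
    """Keep forward-database (target) hits above the lowest score cutoff
    at which the estimated FDR (decoys / targets) is still <= max_fdr.

    psms: (score, is_decoy) pairs; a higher score means a better match.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    n_keep = 0  # number of top-ranked hits that pass the cutoff
    for i, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Reversed-database hits estimate the number of false target hits.
        if targets > 0 and decoys / targets <= max_fdr:
            n_keep = i
    # Report only hits from the forward database.
    return [p for p in ranked[:n_keep] if not p[1]]
```

With a 1% threshold, a list of 100 high-scoring target hits followed by two lower-scoring decoy hits returns the 100 targets, and any hit ranked below the second decoy is discarded.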
Data-dependent experiments performed on the Q-TOF mass spectrometer were analyzed in MaxQuant (version 1.5.1.2) using the Andromeda search engine and the same FlyBase database as for the Orbitrap data. The settings for the database search were as follows: fixed modification carbamidomethyl (C), variable modifications oxidation (M) and acetyl (protein N-term); Δmass = 20 ppm for precursors, Δmass = 50 ppm for TOF fragment ions. Peptide hits required a minimum length of seven amino acids and a minimum score of 20 for unmodified and 40 for modified peptides. Resulting protein hits were FDR filtered for 1% false discoveries on the PSM level and up to 5% false protein hits. Settings for protein identification and quantitation were identical to those used for the Orbitrap data (see above).
SWATH Data Acquisition-Peptides from tryptic digestion were resuspended in 10 µl 0.1% TFA and injected into an Ultimate 3000 nano-chromatography system equipped with a trapping column (C18 AcclaimPepMap, 5 × 0.2 mm, 5 µm, 100 Å) and a separation column (C18RP Reposil-Pur AQ, 150 × 0.075 mm × 2.4 µm, 100 Å, Dr. Maisch) poured into a nano-ESI emitter tip (New Objective). After washing for 10 min on the precolumn with 0.05% TFA, peptides were separated by a linear gradient from 4% to 40% B (solvent A 0.1% FA in water, solvent B 80% ACN, 0.1% FA in water) for 150 min at a flow rate of 270 nl/min. Eluting peptides were detected on a 6600 TripleTOF quadrupole-TOF hybrid mass spectrometer (Sciex, Framingham, MA). First, a mixture of all conditions was run in data-dependent mode to generate an ion library for the data-independent SWATH measurements and to optimize the isolation window distribution over the mass range for SWATH data acquisition. Data-dependent acquisition consisted of a survey scan and up to 40 tandem MS scans for precursors with charge 2-5 and more than 200 cps abundance. Rolling collision energy was set to generate peptide fragments. The overall cycle time for the DDA experiment was 2.676 s. Previously analyzed precursors were excluded from repeated fragmentation for 30 s employing a mass window of 20 ppm around the precursor mass.
MS data with data-independent SWATH acquisition were generated using the same HPLC conditions as used for the generation of the ion library. Based on the distribution of the m/z values of identified peptides in the ion library, the mass range from 300-1200 m/z was split into 40 SWATH mass windows to optimize the number of precursor ions per window. First, precursors were monitored from 300-1500 m/z in a survey scan of 50 ms, followed by the SWATH data acquisition for 65 ms/mass window, resulting in an overall cycle time of 2.7 s. The fragmentation energy was adjusted to fragment 2+ charged ions in the center of the mass window and a collision energy spread over seven units was allowed. For data analysis, SWATH data were mapped to a protein database containing RT, peptide precursor and fragment ion information that was generated in ProteinPilot 4.5 (Sciex) against the previously described Drosophila database using the database search settings described for MaxQuant 1.5.1.2. Settings for SWATH peak extraction in the PeakView 2.1 software (Sciex) were five peptides/protein with at least six mapping fragment ion signals, a confidence interval of 99% and an FDR of 2%.
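The cycle time stated above follows from the scan times, and the idea of balancing the number of precursors per window can be sketched in a few lines of Python. The quantile-based splitting below is only an assumed illustration of such a window optimization, not the vendor algorithm that was actually used; the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def swath_cycle_time_s(n_windows: int, window_ms: float, survey_ms: float) -> float:
    """Duty-cycle time in seconds: one survey scan plus one MS/MS scan per window."""
    return (survey_ms + n_windows * window_ms) / 1000.0


def variable_windows(precursor_mz: Sequence[float], mz_min: float, mz_max: float,
                     n_windows: int) -> List[Tuple[float, float]]:
    """Split [mz_min, mz_max] so that each window holds a similar number of
    library precursors (assumed quantile-based scheme)."""
    inside = sorted(m for m in precursor_mz if mz_min <= m <= mz_max)
    if len(inside) < n_windows:
        raise ValueError("need at least one precursor per window")
    edges = [mz_min]
    for k in range(1, n_windows):
        # Inner edges are placed at the k/n_windows quantiles of the precursors.
        edges.append(inside[k * len(inside) // n_windows])
    edges.append(mz_max)
    return list(zip(edges[:-1], edges[1:]))
```

For the settings reported here, `swath_cycle_time_s(40, 65, 50)` gives 2.65 s, consistent with the stated cycle time of about 2.7 s. Densely populated m/z regions receive narrower windows; heavily duplicated m/z values could produce zero-width windows in this simple scheme.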
Histone Modification-Following carbon stage tip, the dried peptides were resuspended in 5 µl of 0.1% TFA and the complete sample was directly injected onto the reversed-phase separation column (C18RP Reposil-Pur AQ, 120 × 0.075 mm × 2.4 µm, 100 Å, Dr. Maisch) of an Ultimate 3000 nano-chromatography system (Thermo-Fisher Scientific), coupled by a nano-ESI source. A separation gradient from 5% B to 30% B (solvent A 0.1% FA in water, solvent B 80% ACN, 0.1% FA in water) over 32 min was applied to separate the histone peptides at a flow rate of 325 nl/min. Because the column was poured into the nano-ESI emitter tip, peptides were directly analyzed by mass spectrometry using a TripleTOF 6600 q-TOF mass spectrometer (Sciex). A targeted MS/MS method was selected for detection and quantitation of N-terminal peptides of histone 3.1 and histone 4 with specific modifications. The first scan monitored the abundance of the precursor ion for 225 ms; the MRM scans for individual modifications were acquired for 35 ms per precursor. The overall cycle time was 2.05 s. For the list of precursor ions, peptide sequences, modifications and fragmentation conditions see supplemental methods (supplemental Table S2).
Data Analysis-Peptide fragment masses of heavy and light peptide variants were calculated in silico using GPMAW 5.0 software (GPMAW3) and applied to filter the MRM data for abundance of specific modifications using MultiQuant software (Sciex, version 3.0). Peptides with similar precursor masses containing either trimethylation or acetylation of K were distinguished based on the mass accuracy of the instrument and the difference in retention time.
PTM-data analysis was performed with PeakView software (version 2.1, Sciex) by using doubly and triply charged peptide masses for extracted ion chromatograms (XICs). XICs were checked manually and values were exported to Excel for further calculations. Standard deviation of the mean was used for error bar calculation. The mass spectrometry raw data are deposited to the ProteomeXchange Consortium with the data set identifiers PXD002537 and PXD003445. Annotated spectra can be viewed at MS-Viewer (http://prospector2.ucsf.edu/prospector/cgi-bin/msform.cgi?form=msviewer). Spectra from DREX samples can be accessed with search key: hlyzvamoxh. Spectra from chromatin samples can be accessed with search key: 3keubvytc7.
Statistical Methods-Data were handled with Perseus software. Three biological replicates acquired with DDA-MS were analyzed for chromatin and DREX. All three biological replicates from 1 h and 4 h assembled chromatin or wtDREX were Log2(x) transformed. Missing values were replaced by random numbers drawn from a normal distribution (width 0.3, shift -1.8).
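The imputation step can be sketched as follows. This minimal Python example assumes the common interpretation of the width and shift parameters, in which both are expressed in units of the standard deviation of the observed log2 values and the shift is applied downward; the function names are hypothetical and the code is not Perseus itself.

```python
import math
import random
import statistics
from typing import List, Optional


def log2_transform(intensities: List[Optional[float]]) -> List[Optional[float]]:
    """Log2-transform intensities; missing or non-positive values become None."""
    return [math.log2(x) if x is not None and x > 0 else None for x in intensities]


def impute_missing(log_values: List[Optional[float]], width: float = 0.3,
                   shift: float = 1.8,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Replace None by draws from a narrowed, down-shifted normal distribution.

    Assumption: the distribution is centered at mean - shift * sd and has a
    standard deviation of width * sd, where mean and sd describe the observed
    values of the same sample.
    """
    rng = rng or random.Random()
    observed = [v for v in log_values if v is not None]
    mean = statistics.mean(observed)
    sd = statistics.stdev(observed)
    return [v if v is not None else rng.gauss(mean - shift * sd, width * sd)
            for v in log_values]
```

Imputed values therefore fall at the low-intensity end of the measured distribution, which reflects the observation that proteins missing from a replicate tend to be of low abundance.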
DIA-MS SWATH intensities were normalized to the total area sum within the MarkerView software (version 1.2.1.1). In order to compare different assembly times, the median of three biological replicates was calculated for each protein and each time point. To determine early and late binding proteins, medians of SWATH intensities after 15 min assembly were divided by medians at 4 h for each protein. Resulting ratios were Log2(x) transformed and plotted with the Scatter Graph tool, sorted from largest to smallest. Ratios higher than 1 were regarded as early binding proteins. Ratios below -1 were regarded as late binding proteins. Proteins with ratios between both thresholds were regarded as not changing between both time points and therefore as constant binding proteins. A similar technique with a different threshold was applied to chromatin assembly samples treated and untreated with TSA. Medians of SWATH intensities for each protein were calculated and the values for TSA-untreated samples were divided by medians of TSA-treated samples for each time point separately to determine the enrichment upon TSA treatment during chromatin assembly. After Log2(x) transformation of the resulting ratios, proteins with ratios higher than 1 were regarded as enriched in unperturbed chromatin assembly whereas proteins with ratios lower than 0.8 were regarded as enriched upon TSA treatment. Both protein groups were subjected to GO-term analysis. Functional Annotation Clustering for GO-term analysis was performed by means of DAVID Bioinformatics Resources 6.7 (31).
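The classification into early, late and constant binders described above can be summarized in a short Python sketch. The function names and the dictionary-based input format are assumptions for illustration; the thresholds of 1 and -1 on the Log2 scale are those stated in the text.

```python
import math
import statistics
from typing import Dict, List


def normalize_total_area(sample: Dict[str, float]) -> Dict[str, float]:
    """Scale all intensities of one run so that they sum to 1."""
    total = sum(sample.values())
    return {protein: value / total for protein, value in sample.items()}


def classify_binding(early_reps: List[Dict[str, float]],
                     late_reps: List[Dict[str, float]],
                     threshold: float = 1.0) -> Dict[str, str]:
    """Label each protein by log2(median early / median late).

    early_reps and late_reps hold one intensity dictionary per replicate,
    e.g. the 15 min and 4 h assemblies.
    """
    labels = {}
    for protein in early_reps[0]:
        early = statistics.median(rep[protein] for rep in early_reps)
        late = statistics.median(rep[protein] for rep in late_reps)
        log_ratio = math.log2(early / late)
        if log_ratio > threshold:
            labels[protein] = "early"
        elif log_ratio < -threshold:
            labels[protein] = "late"
        else:
            labels[protein] = "constant"
    return labels
```

A protein whose median intensity drops fourfold between 15 min and 4 h has a Log2 ratio of 2 and is labeled early, whereas a fourfold increase gives -2 and the label late.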
For comparison of sequence-specific protein binding, t test-based statistics were applied to SWATH intensities. First, the logarithm (log 2) of the SWATH intensities was taken, resulting in a Gaussian distribution of the data. Statistical outliers for the 5S rDNA compared with the 359 bp repeat DNA were then determined using a two-tailed t test. Multiple testing correction was applied by using a permutation-based false discovery rate (FDR) method in Perseus. A similar technique was used for the statistical evaluation of chromatin-enriched proteins in comparison to beads-only bound control proteins.
For comparison between in vitro and in vivo data, Nascent Chromatin Capture (NCC) repository data were used as in vivo data (13). NCC data are illustrated by the quotient of Log2(x) transformed nascent SILAC ratios divided by Log2(x) transformed mature SILAC ratios. In vitro data are based on median-averaged SWATH intensities of three biological replicates of 15 min divided by median-averaged SWATH intensities of three biological replicates of 4 h. Error bars represent the standard error of the mean (S.E.M.) with n = 3.
Experimental Design and Statistical Rationale-Chromatin assembly experiments were performed in three biological replicates with three independently collected DREX preparations. As negative controls, beads only were incubated with DREX in three biological replicates. Silver gels and agarose gels show a representative example out of three replicates. A pilot study in our lab revealed that three biological replicates allow precise and statistically valid conclusions about chromatin assembly experiments and the composition of proteins at different time points of assembly. Based on the biological function of the identified proteins, we changed the initial settings for statistical analysis to s(0) = 3 and FDR = 0.5%. Using these parameters, we were able to recover almost all chromatin-assembly factors and reduce the unspecific background.

RESULTS
To investigate factors involved during in vitro chromatin assembly and to study proteome dynamics of chromatin maturation, we analyzed the proteins binding to an immobilized array of a nucleosome positioning sequence derived from the sea urchin 5S rRNA gene (32) using a well-characterized S-150 chromatin-assembly extract prepared from early Drosophila embryos (DREX) (24) (Fig. 1A). It has been shown that this protein extract is able to assemble large DNA fragments into an ordered array that closely resembles the chromatin structure seen in early embryos with regards to nucleosome spacing (33) and histone modifications (34,35) (Fig. 1B).
Chromatin matures in vitro over a time of ~4 h, which can be observed by the increased regularity of the nucleosomal array generated by MNase digestion (Fig. 1C). After assembly and extensive washing, newly assembled chromatin was eluted by a nuclease digestion step, which selectively released all chromatin proteins bound during chromatin assembly. This nuclease-mediated elution had the advantage of a lower background compared with an elution at low pH or high detergent concentrations (Fig. 1D), because proteins that nonspecifically interact with the beads are not released (36).
Proteins present in the DREX as well as proteins bound to chromatin were digested with trypsin and the resulting peptides were analyzed on a 2 h LC-MS gradient on an Orbitrap XL mass spectrometer operated in a data-dependent acquisition mode. We identified a total of 977 proteins in the assembly extract with 530 of them being present in all three replicates (54.2%) (Fig. 2B, panel (a)). When analyzing the proteins bound to chromatin after 4 h, we identified a total of 299 proteins using the Orbitrap XL with 174 (58.2%) being present in all three replicates (Fig. 2B, panel (c)). Proteins were quantified using LFQ values provided by the MaxQuant analysis suite. Consistent with a high reproducibility of the assembly system, the intensities of the reproducible proteins correlated well between the individual replicates (red data points Fig. 2B, panels (b) and (d)). Proteins that were not identified in all replicates tended to be of low intensity even in the replicates in which they were identified. In order to get more accurate quantitative values and to reduce the number of missing values, we generated a high-quality ion library of the Drosophila chromatin assembly system that could be used for SWATH-based label-free quantitation (37). To generate this library, we measured the three replicates of the DREX assembly extract and the proteins bound to DNA after 4 h on a 6600 TTOF mass spectrometer in a data-dependent acquisition run using a Top25 fragmentation scheme. Proteins were identified using the ProteinPilot software, which was also used to generate the ion library. On the 6600 TTOF instrument, we identified 1050 proteins in the DREX, and 943 proteins were associated with chromatin after 4 h, with a general overlap of 56.9% or 45.87%, respectively (Fig. 3B, panels (a) and (c)).
Upon retention time calibration using ten conserved peptides that were distributed over the entire LC gradient, we could quantify 1035 proteins in the SWATH runs performed on triplicate samples of chromatin assemblies after 15, 60, and 240 min (Fig. 3B, panel (d)).
Despite the fact that we selectively eluted the assembled chromatin using a digestion with micrococcal nuclease, which shows a relatively low background (Fig. 1D) we still identified and quantified proteins that were eluted from the beads in the absence of chromatin. As we only intended to study proteins specifically assembled on chromatin, we focused all further analysis on the 480 proteins that showed a significant enrichment on chromatin over beads-only control experiments (red dots Fig. 3C). Most of these proteins assemble on chromatin independent of the underlying DNA sequence as we only observe a small number of proteins that show differential binding between the 5S rDNA repeat and a plasmid containing a repeated 359bp sequence from Drosophila pericentromeres (38) (supplemental Fig. S1A).
A hierarchical clustering of averaged log2-transformed SWATH intensities for each sample condition revealed the existence of three main clusters (Fig. 4A and 4B). One cluster contains 104 proteins that have a relatively high value at 15 min, which then drops with longer assembly times. This cluster of early chromatin binding proteins contains most factors known to play a role in DNA replication (PCNA, Gnf1, Rfc3/38, etc.) or histone deposition (Ssrp, Dre4, Acf1, Caf subunits), which was also verified using an orthogonal Western blot assay (Fig. 4C). The largest number of proteins (774) binds early and remains bound to chromatin. This cluster, which we have termed constant binders, contains most canonical histone proteins but also BigH1/tefu (ATM kinase), DNA pol alpha, Smc5, Chrac 16 and MCM 2,3,7. Finally, 146 proteins only bind after 4 h of assembly. Examples of proteins belonging to this class are factors known to be involved in DNA repair like Irbp and Ku80 or RPA3/2 and 70.
We next wondered whether the in vitro chromatin assembly system resembles the general processes that occur during replication coupled chromatin assembly in vivo. To do this, we compared our results with the most comprehensive data on the composition of nascent chromatin, which was measured by Alabert et al. (13). As the nascent chromatin capture assay was done in a human cell line (HeLa) we first determined the number of chromatin associated Drosophila proteins that contained a unique human orthologue using the BioMart algorithm (12). 216 proteins (out of 480) that specifically assembled onto chromatin had a clear orthologue in the human proteome and 171 of them (79%) were also detected as chromatin associated in either nascent or mature chromatin (13). We next wondered if proteins that are enriched in nascent chromatin were also preferentially found at the early time points during in vitro chromatin assembly. Indeed many of the early in vitro binding proteins were also enriched in nascent (early assembled) chromatin in HeLa cells, which reveals a strong evolutionary conservation of chromatin assembly mechanisms and a high degree of similarity between in vitro and in vivo chromatin assembly. Similarly, the late binding proteins also have a preference for being enriched in mature chromatin (Fig. 4D).
The presence of a broad HDAC inhibitor such as Trichostatin A (TSA) during replication coupled chromatin assembly in vivo results in a failure to establish repressive chromatin (39) and a broadening of replication initiation areas (40), suggesting that it results in a more open chromatin structure. To get a better insight into the effect of TSA on chromatin assembly, we performed a time resolved analysis of proteins assembled onto chromatin in vitro in the absence or presence of TSA. We verified the efficiency of the TSA treatment by measuring the acetylation of H4 in the absence or presence of TSA (supplemental Fig. S1D and S1E). It is worth mentioning that the presence of TSA does not result in the histone hyperacetylation that is usually observed in tissue culture cells, but induces a moderate increase in acetylation because of a failure to efficiently remove the acetylation pattern present on histones before assembly (41). The reason for this is unclear, but it is probably because of a general lack of site specific histone acetyltransferases in early embryos (34). Nevertheless, we observe a substantial change in the proteomic composition of chromatin assembled in the presence of TSA. To detect all proteins affected by TSA treatment, we quantified the changes of all proteins rather than focusing only on the ones that were significantly enriched on chromatin. An unsupervised clustering revealed a clear difference between chromatin assembled in the absence or the presence of TSA (Fig. 5A). We found 131, 224, and 270 proteins enriched on chromatin upon TSA treatment and 195, 71, and 121 reduced after 15, 60, and 240 min, respectively (Fig. 5B).
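Counting enriched and reduced proteins per time point, as summarized above, can be sketched as a simple threshold on log2 fold changes (TSA versus untreated). The cutoff of 1.0 and the example values are assumptions made for illustration; the criteria actually used for Fig. 5B are not given in this section, and a real analysis would also include a statistical test across replicates.

```python
def count_tsa_effects(log2_fc, threshold=1.0):
    """Count proteins enriched or reduced upon TSA treatment at each time point.

    log2_fc maps a time point (min) to {protein: log2(TSA / untreated)}.
    The threshold is a hypothetical cutoff, not the one used in the study.
    """
    counts = {}
    for time_point, changes in log2_fc.items():
        enriched = sum(1 for value in changes.values() if value >= threshold)
        reduced = sum(1 for value in changes.values() if value <= -threshold)
        counts[time_point] = (enriched, reduced)
    return counts

# Hypothetical fold changes for three placeholder proteins
log2_fc = {
    15: {"prot_a": 1.4, "prot_b": -1.2, "prot_c": 0.3},
    60: {"prot_a": 2.0, "prot_b": 0.1, "prot_c": 1.1},
    240: {"prot_a": 1.6, "prot_b": -0.2, "prot_c": -1.5},
}
counts = count_tsa_effects(log2_fc)
for time_point, (enriched, reduced) in counts.items():
    print(f"{time_point} min: {enriched} enriched, {reduced} reduced")
```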
A GO term analysis of the chromatin assembled in the presence of TSA suggests that TSA treatment results in an increased association of factors that interact non-specifically with chromatin in comparison to the unperturbed assembly reaction (Table I), which suggests a more open and hence more error prone chromatin assembly. Alternatively, the higher degree of nonspecific binding could also be caused by a hyperacetylation and concomitant regulation of the multiple chaperones we find to be associated with in vitro assembled chromatin and whose job it may be to remove unwanted protein associations. We also find an increased binding of all subunits of the MCM helicase complex to chromatin that is treated with TSA (Fig. 5C), which is interesting in light of earlier findings that show a general broadening of DNA replication and a more widespread binding of MCMs to replication origins in vivo when cells are treated with TSA (40).

Table I. [...] Fig. 5B were processed with the Functional Annotation Tool from DAVID Bioinformatics Resources 6.7, NIAID/NIH. FBgn numbers were uploaded as a list and used to search for GO-terms of biological processes against the Drosophila melanogaster background with a threshold of 2 counts and an EASE of 0.1. The table shows GO-terms with the highest enrichment scores for proteins enriched or repelled upon TSA treatment for samples after all three time points of assembly.

DISCUSSION

The combination of label-free quantitative proteomics with a well-described in vitro chromatin assembly system allowed us to temporally dissect the process of chromatin assembly. By choosing three different time points, we were able to describe the kinetics of chromatin assembly in vitro at an unprecedented depth.
Although the in vitro chromatin assembly system has been shown to recapitulate aspects of chromatin assembly in vivo such as chromatin spacing and the establishment of specific histone modification patterns (41,42) or the coordinated binding and release of chromatin binding factors (41,43,44), the proteomic composition of in vitro and in vivo assembled chromatin has not been compared so far. The recent advancements toward a complete proteomic description of in vivo chromatin assembly (13,26) have now made such comparisons feasible. The use of a deep, high quality ion library generated from the chromatin assembly extract and the chromatin bound factors enabled us to determine the kinetics of chromatin binding for 480 proteins highly enriched in chromatin. Because the in vitro extract is prepared from Drosophila and the nascent chromatin capture was performed in human cells, we could only compare 217 of these 480 factors that had a clear orthologue in the human proteome. Given this limitation, we find 171 (79%) of the proteins detected as bound to chromatin during assembly in vitro also on nascent chromatin. This supports the hypothesis that the general pathways for chromatin assembly are conserved among different eukaryotes.
Interestingly, not only the identity of the bound proteins is conserved but also their dynamics of binding. For example, we find multiple components of the replication machinery, such as the RFC clamp loading complex, PCNA, the single strand binding complex RPA, members of the MCM helicase complex and the subunits of chromatin assembly factor CAF1, enriched at early time points of the assembly reaction, which is consistent with their enrichment in nascent over mature chromatin shown by NCC data. The ability to measure all chromatin associated proteins at three time points using SWATH based quantitation enabled us to include an intermediate measurement of assembly (1 h) rather than only comparing nascent and mature chromatin. This dissection of the binding kinetics verifies earlier in vitro findings that the clamp loader complex first binds to the DNA and facilitates PCNA loading and then dissociates from the template upon loading of the sliding clamp, leaving PCNA bound to the template to stabilize the polymerase. So far, this loading event has only been characterized upon reconstitution in a highly purified system (45,46). The fact that we observe a transient peak of PCNA at the intermediate time point during in vitro chromatin assembly, which decreases substantially at late stages of chromatin assembly and resembles the nascent chromatin findings (13), suggests that this loading process is indeed occurring during chromatin assembly. In contrast to what happens to the replication and histone deposition machinery and similar to what is seen in vivo, most proteins rapidly associate with chromatin and stay bound. Many of these factors are bona fide structural components of chromatin like the canonical histone proteins, the linker histone and HMG proteins like HMGD.
However, we also detect proteins and protein complexes involved in protein folding, like members of the TriC/CCT complex (Tcp1, TCPeta, TCPzeta, CG7033, CG8258, CCT5, and CCTgamma) or Hsp60 and Hsp70, stably bound to chromatin during assembly. This binding of chaperone factors suggests that chromatin assembly is associated with extensive and continuous protein folding and refolding events. Although TriC/CCT has been suggested to be mainly involved in the folding of newly translated proteins (47), there is increasing evidence that it has additional functions such as the prevention of the aggregation of poly-Q proteins (48) or the formation of macromolecular protein complexes in the nucleus (49,50). The TriC/CCT complex can also be observed on nascent and mature chromatin in vivo (13) and in interphase chromatin (12), which also supports its integral role in chromatin dynamics and metabolism. The function of these chaperones in chromatin has so far been enigmatic, and no clear function has been assigned to them. The fact that we find many more factors with no apparent chromatin function associated with chromatin upon TSA treatment suggests that chaperones may be required to prevent such proteins from binding to accessible (i.e. TSA treated) chromatin or to remove mislocalized factors. This hypothesis would fit with previous findings that the formation of regularly spaced chromatin in vitro is entirely dependent on the presence of ATP (20,24), which is a substrate of all nucleosome remodeling factors but also of the Hsp and TriC/CCT chaperones. Alternatively, chaperones might function in the remodeling of chromatin-bound multisubunit complexes. This hypothesis is supported by findings showing that molecular chaperones are responsible for the rapid exchange of hormone receptors during oscillating transcription (51,52). An example of such a complex that gets remodeled while bound to chromatin is the CAF1 complex.
The two large, CAF1 specific, subunits are removed from chromatin at later stages of assembly whereas the small CAF1 histone binding subunit can also be detected at later time points. As the small subunit is also part of the NURF chromatin-remodeling complex, whose other three subunits (ISWI, ACF1, and NURF38) are also present later, it may stay bound to chromatin serving as an additional binding platform for the other components.
To test whether we are able to investigate quantitative changes in the assembly kinetics of specific chromatin factors upon a challenge of the system, we added the broad histone deacetylase inhibitor TSA to a reaction and repeated the quantitative proteomic analysis of the bound proteins. We have shown in the past that TSA treatment results in a moderate increase of histone acetylation on assembled chromatin, in contrast to what is observed in tissue culture cells, where TSA treatment results in a strong hyperacetylation of histones (53-55). During in vitro assembly, the histones are deposited in a preacetylated form, which will not get deacetylated when TSA is present (41). TSA treatment has been shown to prevent the formation of transcriptionally repressed chromatin in Xenopus oocytes (39) and to lead to an altered pattern of DNA replication origin activity in human tissue culture cells (40). The exact mechanism of how TSA mediates these effects has so far remained unclear. Our finding that all MCM proteins, which are key regulators of eukaryotic replication, show increased binding to chromatin when TSA is present provides a potential explanation for the latter finding. However, it remains to be determined whether this is an indirect effect of a more open chromatin structure or a direct effect that is mediated by MCM acetylation (55).
Our results show that chromatin assembly in vitro, when analyzed with a quantitative proteomic method, resembles many aspects of replication dependent chromatin synthesis in vivo. The in vitro system can therefore be used to dissect key regulatory steps in chromatin assembly, which are often difficult to investigate in living cells.