Kinetic proofreading of chromatin remodeling: from gene activation to gene repression and back

ATP-dependent chromatin remodeling is the active displacement of nucleosomes along or off DNA induced by chromatin remodeling complexes. This key process of gene regulation in eukaryotic organisms has recently been argued to be controlled by a kinetic proofreading mechanism. In this paper we discuss the current understanding of this process. We review the case of gene repression via heterochromatin formation by remodelers from the ISWI family and then discuss the activation of the IFN-β gene, where the displacement of the nucleosome is initiated by histone tail acetylations by the enzyme GCN5, which are required for the recruitment of SWI-SNF remodelers. We quantify the specificity of the acetylation step in the remodeling process by peptide docking simulations.


Introduction
A key problem in molecular biology is to understand how genes are switched "on" or "off". For prokaryotes (bacteria) and phages this question has been studied since the 1960s, culminating in the development of the operon model by Monod and Jacob [1]. A particularly well-studied model organism is the λ-phage, whose diverse regulatory mechanisms are beautifully explained in Mark Ptashne's book [2]. The main regulatory mechanism of gene control relies on the combinatorial and cooperative binding of transcription factors at regulatory sites, which either block or favor the recruitment of RNA polymerase, the enzyme that reads the gene and transcribes it into RNA. Gene activation/inhibition is thus physically controlled by the energetics of transcription factor binding, turning the theoretical description of transcription initiation into a problem of statistical physics [3].
While the prokaryotic mechanism of transcription factor binding remains valid in eukaryotes, the presence of the chromatin fiber requires additional regulatory mechanisms, notably to control the positioning of nucleosomes along the chromatin fiber relative to gene and regulatory sequences on DNA. One may therefore speak of a step of 'pre-initiation' of transcription which renders the fiber accessible or inaccessible to the molecular regulators that directly promote the readout of the gene. A particular difference between eukaryotes and prokaryotes is the presence of nucleosomes and the associated numerous post-translational modifications on the nucleosomal histone tails (and cores), whose combinatorial presence has been linked to a potential 'histone code' [4], and further, the existence of a dedicated machinery of 'chromatin remodeling' enzymes which actively, i.e. under ATP consumption, displace nucleosomes on and also from the chromatin fiber [5].
Recently, there has been an attempt to relate specific histone modifications and chromatin remodeling in a mechanistic picture by invoking a kinetic proofreading scenario [6,7,8]. Other kinetic scenarios have also appeared in the literature [9,10,11,12,13]. The advantage of the explicit kinetic proofreading scenario is that it specifically couples a recognition step of the chromatin remodeler, which is controlled by the free energy of binding or dissociation, with a kinetic step, which is a non-equilibrium reaction and consumes ATP. This latter step, in any kinetic proofreading scenario as introduced by Hopfield [14] and Ninio [15], is the one that confers a high specificity to the reaction. In the case of mRNA translation, such a specificity is required in order to guarantee the quality of the protein product. In dynamic intermediates, as is the case in chromatin remodeling, such a high degree of specificity is not needed. Nevertheless, as will be argued here, the mechanism is essential in promoting gene activation or repression in eukaryotes, as it allows a specificity increase by several orders of magnitude, a feat which cannot be achieved by free energy-dependent processes alone. Kinetic proofreading in chromatin remodeling has so far seen experimental support and more detailed theoretical analysis in the case of a remodeler from the ISWI family, ACF [7,8]. ACF displaces nucleosomes to form arrays, and hence specifically acts to repress or 'shut down' genes. The activation of this remodeler is controlled by the unmodified tails of the histones H4, while for gene activation, e.g. by the RSC remodeler, specific histone tail modifications must be present [16].
In this paper, we first review the argument underlying the kinetic proofreading scheme in chromatin remodeling as it has been developed in [6,7,8]. In fact, the original proposal of ref. [6] was concerned with the case of gene activation, while the first experimental case to which the scenario has been applied, ISWI/ACF, concerns gene repression. After briefly reviewing this case, we address the original situation of gene activation and discuss, as a novel application of the kinetic proofreading scenario, the case of the interferon-β (IFN-β) gene, which had earlier been studied experimentally in great detail by D. Thanos and collaborators [17,18,19,20,21,22].

The Kinetic Proofreading Scenario of Chromatin Remodeling
Kinetic proofreading as a biochemical process for error correction was first proposed by Hopfield [14] and Ninio [15]. A particularly lucid introduction to this mechanism can be found in Uri Alon's book [23]. Here, we consider directly the scenario as relevant to the regulation of nucleosome position due to chromatin remodeling. In this context the scenario presupposes only two experimental facts. The first is that histone amino acids can bear different chemical groups (post-translational modifications) which can be read by chromatin remodeler domains. 'Reading' the histone tail state refers to the binding of a remodeler recognition domain with a rate $k_+$, and its unbinding with a rate $k_-$. We take the rate of unbinding as the specific one: the rate of binding $k_+$ is determined by the frequency of molecular collisions, while the stability of the binding between the remodeler recognition domain and the histone tail decides on the rate of unbinding $k_-$. The second fact is that the active engagement of the remodeler in displacing the nucleosome is irreversible and consumes ATP; irreversible thus means that in order to undo the induced motion, additional ATP consumption is required.
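The statement that binding stability sets the unbinding rate can be illustrated with a small numerical sketch. It assumes, as a simplification that is not spelled out in the text, an Arrhenius-type relation in which the unbinding rate scales as exp(-ΔG/RT) with a common prefactor for both substrates, so that a gap ΔΔG in dissociation free energy between two histone tail states translates into a ratio of unbinding rates. The function name and the example gap of 1.5 kcal/mol are hypothetical and serve only as an illustration.

```python
import math

# Gas constant in kcal/(mol K)
R_KCAL = 1.987204e-3


def unbinding_rate_ratio(ddg_kcal_per_mol, temperature_k=300.0):
    """Ratio k_-(weakly bound) / k_-(strongly bound) for a free-energy gap ddG.

    Assumption: k_- ~ exp(-dG_diss / (R T)) with identical prefactors.
    """
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temperature_k))


# Hypothetical gap between a favored and a disfavored tail state
ratio = unbinding_rate_ratio(1.5)
print(f"discrimination from one recognition step: {ratio:.1f}")

# In the idealized Hopfield limit a second, ATP-driven checkpoint that
# discriminates equally well squares the single-step factor.
print(f"idealized limit with one proofreading step: {ratio ** 2:.1f}")
```

The squared factor is an upper bound that holds only when activation and translocation are slow compared with the two dissociation steps; with finite rates the gain is smaller.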
If we denote the remodeler by R, the nucleosome by N, the remodeler-nucleosome complex by I, the activated complex by I* and the mobile complex by M, the proofreading scenario is given by the following reaction scheme [8]

$$ R + N \underset{k_-}{\overset{k_+}{\rightleftharpoons}} I \xrightarrow{\;m\;} I^* \xrightarrow{\;t\;} M, \qquad I^* \xrightarrow{\;\ell\;} R + N, $$

in which $k_+$ and $k_-$ are the binding and unbinding rates, $m$ is the ATP-dependent activation rate, $t$ the translocation rate of the nucleosome-remodeler complex, and $\ell$ the dissociation rate of the activated remodeler complex from the nucleosome. The reaction scheme is also depicted schematically in Figure 1, where we have omitted the translocation reaction $t$, which occurs repeatedly if the remodeler acts in a processive manner. We note that, of course, the rate scheme we adopt is a very simplified one. This is particularly relevant in the case of ISWI, as the remodeler carries both the recognition domain and an accessory domain which binds to the DNA linker; this last feature is not retained in the model. Whenever a back-reaction is missing in the reaction scheme, the reaction is assumed to be irreversible. The reaction scheme can easily be rewritten in the form of rate equations for the concentrations, indicated by brackets,

$$ \frac{d[I]}{dt} = k_+ [R][N] - (k_- + m)\,[I], \qquad \frac{d[I^*]}{dt} = m\,[I] - (\ell + t)\,[I^*]. $$

From these equations the ratio $f = [I^*]/([R][N])$ can be calculated when stationarity of the reaction is assumed, in this way characterizing its discriminatory capacity. One finds

$$ f = \frac{k_+\, m}{(k_- + m)(\ell + t)}. $$

Figure 1. Schematic representation of the remodeling reaction under proofreading. The red circle is the nucleosome; the histone tail (green) emerges from it and carries a modification (yellow triangle). The remodeler is drawn as a blue ellipse.
The expression for the error rate, the key result in kinetic proofreading, follows from the comparison of two reactions ('correct' and 'incorrect', or 'favored' and 'disfavored') described by the ratios $f_1$ and $f_2$, which spell out as

$$ F = \frac{f_1}{f_2} = \frac{k_{+,1}\, m_1\, (k_{-,2} + m_2)(\ell_2 + t_2)}{k_{+,2}\, m_2\, (k_{-,1} + m_1)(\ell_1 + t_1)}. $$

In the original scenario, the presence of a transcription factor was also included in the reaction scheme [6]. A rough estimate for the error factor $F$ yielded a value of ≈ 400, which, as we will see below, is very close to the value found from experiments by G. Narlikar and collaborators [27,28].
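The stationarity argument can be checked numerically with a minimal sketch. It assumes the two rate equations d[I]/dt = k_+[R][N] - (k_- + m)[I] and d[I*]/dt = m[I] - (l + t)[I*], which is our reading of the reaction scheme, and holds the product [R][N] fixed. The rate constants below are hypothetical values chosen only for illustration, and the simple forward-Euler integrator stands in for any standard ODE solver.

```python
def steady_state_ratio(k_plus, k_minus, m, ell, t):
    """Closed-form stationary ratio f = [I*] / ([R][N])."""
    return k_plus * m / ((k_minus + m) * (ell + t))


def relax_to_steady_state(k_plus, k_minus, m, ell, t,
                          rn=1.0, dt=1e-4, steps=50_000):
    """Forward-Euler integration of the rate equations at fixed [R][N] = rn."""
    i, i_star = 0.0, 0.0
    for _ in range(steps):
        di = k_plus * rn - (k_minus + m) * i      # binding, unbinding, activation
        di_star = m * i - (ell + t) * i_star      # activation, dissociation, translocation
        i += dt * di
        i_star += dt * di_star
    return i_star / rn


# Hypothetical rate constants (per minute), for illustration only
params = dict(k_plus=1.0, k_minus=10.0, m=5.0, ell=10.0, t=50.0)

numeric = relax_to_steady_state(**params)
analytic = steady_state_ratio(**params)
print(f"numeric f = {numeric:.6e}, analytic f = {analytic:.6e}")
```

Evaluating the same function for a 'favored' and a 'disfavored' parameter set and taking the quotient gives the error factor F.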

Gene Repression: Remodelers from the ISWI Family
The ISWI family of chromatin remodelers is one of the major known groups of remodeling enzymes in eukaryotes [24]. Figure 2 displays a schematic drawing of the different domains of the remodeling complex. The ISWI ATPase contains two major domains, one of which is the motor domain. In addition to consuming ATP it is also sensitive to the interaction with the unmarked histone H4 tail and to linker DNA on both sides of the nucleosome, but not to DNA sequence. Two accessory motifs, termed AutoN and NegC, regulate this interaction in a competitive way [25]. AutoN contains a basic patch whose amino acid composition mirrors that of the unmarked histone tail. The presence of the histone tail is therefore required to unblock the functioning of the ATPase. The second major domain, Hand-Sant-Slide (HSS), confers the recognition of the DNA linker and thus steers the action of the activated complex. G.J. Narlikar and collaborators have studied the regulation of a member of the ISWI family of remodelers, ACF, which acts as a dimer on a nucleosome [26,27,28]. This allowed several of the parameters in the Hopfield formula to be determined. In the notation used by Narlikar, with $k_I$ the activation rate ($m$ above) and $k_{tr}$ the translocation rate ($t$ above), we can rewrite the equation as

$$ F = \frac{k_{+,1}\, k_{I,1}\, (k_{-,2} + k_{I,2})(\ell_2 + k_{tr,2})}{k_{+,2}\, k_{I,2}\, (k_{-,1} + k_{I,1})(\ell_1 + k_{tr,1})}, $$

where the following parameters were determined by her work: $k_{I,1}$ = 20/min; $k_{I,2}$ = 1/min; $k_{off,1}$ = 8/min; $k_{off,2}$ = 160/min; $k_{tr,1}$ = 80/min; $k_{tr,2}$ = 80/min. For the remaining parameters we assume equal binding rates, $k_{+,1} = k_{+,2}$, and identify both the unbinding rate of the complex and the dissociation rate of the activated complex with the measured off-rate, $k_{-,i} = \ell_i = k_{off,i}$. When these numbers are used, an error factor in favor of the correct substrate of $F$ ≈ 313 is obtained. This estimate is indeed of the same order of magnitude as the one put forward in [6]. More recently, we have extended the ISWI/ACF scenario to include the AutoN and NegC motifs in the kinetic proofreading scheme [29]. As is common in kinetic proofreading scenarios, the additional self-regulatory mechanism of ISWI helps to increase the specificity of kinetic proofreading. A precise quantitative estimate of this effect is currently not available.
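The quoted error factor can be reproduced with a few lines of arithmetic. The sketch below assumes the stationary ratio f = k_+ k_I / ((k_off + k_I)(k_off + k_tr)) for each substrate, i.e. the Hopfield-type expression with equal binding rates and with the same off-rate used for both dissociation steps, as stated in the text. The function and variable names are ours; only the rate values are taken from the text.

```python
def stationary_ratio(k_plus, k_act, k_off, k_tr):
    """Stationary ratio f for one substrate.

    Assumption: unbinding of the complex and dissociation of the activated
    complex share the same measured off-rate k_off.
    """
    return k_plus * k_act / ((k_off + k_act) * (k_off + k_tr))


# Rates in 1/min as quoted in the text; substrate 1 is favored, 2 disfavored.
# k_plus is set to 1.0 for both because only the ratio matters.
favored = dict(k_plus=1.0, k_act=20.0, k_off=8.0, k_tr=80.0)
disfavored = dict(k_plus=1.0, k_act=1.0, k_off=160.0, k_tr=80.0)

F = stationary_ratio(**favored) / stationary_ratio(**disfavored)

# Contribution of each stage to the overall discrimination
activation_factor = favored["k_act"] / disfavored["k_act"]
first_checkpoint = (disfavored["k_off"] + disfavored["k_act"]) / (favored["k_off"] + favored["k_act"])
second_checkpoint = (disfavored["k_off"] + disfavored["k_tr"]) / (favored["k_off"] + favored["k_tr"])

print(f"F = {F:.1f}")  # prints "F = 313.6"
print(activation_factor, first_checkpoint, second_checkpoint)
```

The decomposition shows that the total factor is the product of the activation-rate ratio (20) and the two dissociation checkpoints (5.75 and about 2.7).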

Gene Activation: the IFN-β Gene
We now return to the case of gene activation, for which the proofreading scenario was first proposed in [6] and in which the presence of transcription factors had been included. In fact, as we will see below for the IFN-β gene, the reality of gene activation is more complex. Figure 3 shows a schematic representation of the build-up of the IFN-β gene and its regulatory regions. The first key element is an enhancer region which is positioned in a nucleosome-free region (NFR) in between two positioned nucleosomes (see Figure 3 a). The downstream nucleosome partially occludes the TATA-box transcription factor binding site upstream of the gene. In order to pre-initiate transcription of this gene, the downstream nucleosome must be remodeled so as to shift it further downstream and allow the general transcription factor TFIID to bind.
How can the IFN-β gene go from the inactive state shown in Figure 3 a) to the transcription initiation-ready state shown in Figure 3 b)? The first step is the assembly of the enhanceosome, a transcription factor complex which contains a set of three different transcription factors engaged in triggering the inflammation response following a viral infection, among them the well-known NF-κB factor [17]. The enhanceosome then recruits a histone-tail modification writer, GCN5, which modifies the histone tails of the nucleosome. On histone H4, the lysine residue K8 is acetylated, while on H3 the residues K9 and K14 are acetylated [19]. Acetylation of H4-K8 allows the recruitment of a remodeling complex, SWI-SNF, which senses the modification via its bromodomains. The remodeler then shifts the nucleosome downstream. Acetylation of H3-K9/K14 recruits the general transcription factor TFIID, and hence the IFN-β gene is ready for transcription. While Thanos et al. established the time series of these events, they did not quantify the process by the corresponding free energies of binding or dissociation. At present it is therefore not useful to cast the process into rate equations. Qualitatively, we distinguish between three components:
i) the build-up of the enhanceosome is the analogue of transcription factor recruitment in prokaryotes [3], in which DNA looping also plays a relevant role. For the IFN-β gene, the free energies controlling the build-up of the transcription factor complex could in principle be studied in a similar manner, at least in vitro. This is nevertheless a complex problem, as the binding of an extended transcription-factor complex will certainly modify the elastic properties of the chromatin fiber and hence affect the overall state of the fiber [30,31].
ii) the recruitment of the remodeler via the histone-tail state and its subsequent action can be measured experimentally, as was shown in the case of ISWI/ACF [7]. For RSC, a remodeler related to SWI/SNF, in vitro measurements have been performed earlier which allow the specificity of the binding to the tail and the activation effect the remodeler has on the nucleosome to be estimated [16].
iii) The remaining key intermediate step is the recruitment of the histone tail writer GCN5, which occurs via protein-protein interactions with the enhanceosome, and its action on the histone tails.In the following we will focus on this last aspect in order to understand the specificity inherent to this step.

Free Energies of Histone Tail Peptides in GCN5 Bromodomains
The recognition of (acetylated) histone tails is brought about by bromodomains, which provide key protein interfaces in gene expression mechanisms in eukaryotes [32,33]. GCN5 bromodomains have been studied in yeast and human [34,35]. The GCN5 protein is ∼439 amino acids long and contains a C-terminal bromodomain (BrD) which is composed of ∼115 amino acids and features the general characteristics of BrDs, i.e. a left-handed four-helical bundle (Z, A, B and C helices) and two loop regions, the ZA- and BC-loops, with two (sometimes also three) small single-turn helices in the ZA-loop region. The active site of the GCN5-BrD is comparatively narrow. Figure 4 shows a comparison of the GCN5 bromodomain structures for the two species (crystallographic data: PDB entries 1E61 for yGCN5 and 1F68 for hGCN5). In order to quantify the specificity of the histone tail modifications we have compared different bromodomains and modified histone tails by combining molecular dynamics (MD), peptide docking and umbrella sampling [36]. In the first step we equilibrate a solvated bromodomain with a bound histone tail peptide carrying a specific acetylation state on the lysine residues, with the binding characterized by the binding free energy. We then pull the peptide out of the bound complex and measure the dissociation free energy from steered MD and umbrella sampling. The expectation is that the dissociation free energy of the preferred modification must be lower than that of an unfavorably modified tail, as our selected docking process mimics the release of the enzyme after the modification has already been placed, and not its binding to a still unmodified tail.
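How a dissociation free energy is read off from a pulling simulation can be sketched as follows. Umbrella sampling combined with a reweighting scheme yields a potential of mean force (PMF) along the pulling coordinate, and the dissociation free energy is commonly taken as the difference between the plateau of the PMF at large separation and its minimum in the bound state. The code below uses a synthetic Morse-like profile in place of real simulation output; the function name, the plateau threshold and all numerical values are hypothetical.

```python
import math


def dissociation_free_energy(distances_nm, pmf_kcal, plateau_start_nm):
    """Plateau-minus-minimum estimate of the dissociation free energy.

    distances_nm     : pulling coordinate values (nm)
    pmf_kcal         : PMF values at those distances (kcal/mol)
    plateau_start_nm : distance beyond which the peptide is considered unbound
    """
    plateau_values = [g for r, g in zip(distances_nm, pmf_kcal) if r >= plateau_start_nm]
    if not plateau_values:
        raise ValueError("no PMF points beyond plateau_start_nm")
    plateau = sum(plateau_values) / len(plateau_values)
    return plateau - min(pmf_kcal)


# Synthetic Morse-like PMF standing in for the output of a reweighting routine
depth, width, r_bound = 12.0, 4.0, 0.5            # kcal/mol, 1/nm, nm (hypothetical)
distances = [0.3 + 0.05 * i for i in range(80)]   # 0.30 ... 4.25 nm
pmf = [depth * (1.0 - math.exp(-width * (r - r_bound))) ** 2 for r in distances]

dfe = dissociation_free_energy(distances, pmf, plateau_start_nm=3.0)
print(f"dissociation free energy ~ {dfe:.2f} kcal/mol")
```

Comparing such values across peptides with different acetylation marks then ranks the modifications by how easily the bromodomain is released.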
For the docking calculations of the peptides in the bromodomains we built histone tail peptides of 15 amino acids with a central acetyl-marked lysine flanked by 7 amino acids on each side, in the form (7 aa - acK - 7 aa). Five acetylation marks at different positions along each of the histone tails H3 (lysines K4, K9, K14, K18 and K27) and H4 (lysines K5, K8, K12, K16 and K20) were selected. The corresponding 15 aa peptides were generated using the Tinker modeling package [37]. The lysine residues were mutated into acetyl-lysines with CHIMERA [38], and the peptides were minimized using the steepest-descent algorithm and the GROMOS96 43a2 force field as distributed with Gromacs [39].
In the first MD step we took each of the BrD models and removed water, ions and any other ligands. The BrD was then put back into a simple point charge (SPC) water box, then equilibrated and relaxed. Box volume was 7.8 nm³ and the minimum solute-solvent distance was 1 Å, with the assumption of a normally charged system at pH 7. To counter the net charge of the BrDs at the biological salt concentration of 0.15 M, the system was neutralized using 32 sodium and 34 chloride ions, which were added according to the potential gradient of the simulated system. Following minimization, equilibration was performed with the backbone of the BrDs partially restrained for 1 ns. Subsequently, unrestrained MD was initiated and the time-dependent evolution of trajectories recorded for 5 ns for further analysis. During this production run, bond lengths were constrained using LINCS [40], and SETTLE was used for water molecules [41]. The time step of the simulation was kept at 2 fs and the simulation was performed in the NPT ensemble, using Berendsen algorithms to impose constant P = 1 bar and T = 300 K. The van der Waals cutoff was set to 12 Å. The long-range electrostatic forces were treated with the Particle Mesh Ewald (PME) method.

Each of the 20 MD-optimized peptides was docked to the respective yeast or human BrD using AutoDock 4.2 [42]. The central part of the peptide, i.e. the acK plus the two neighboring amino acids on both sides, was treated as flexible, while the terminal flanking amino acids (5 + 5) were treated as a rigid part of the peptide. For the BrD, only the active site residues within a 5 Å radius (corresponding to the complete ZA-loop and BC-loop regions) were treated as flexible, while the rest of the amino acids were treated as rigid in order to reduce computing time. We employed a knowledge-based docking approach using the Genetic Algorithm scoring function with a grid size of 1 nm³ and 25 million energy evaluations per grid. We assigned the maximum number of torsions allowed in AutoDock, 32, and distributed these around the central amino acid. We also used AutoDock Vina to go beyond the limit on the number of torsions and include all possible torsions [43]. The ten best binding configurations for each of the BrD-acK pairs were selected and subjected to MD, steered MD [44,45] and umbrella sampling simulations [46,47] to obtain the potential of mean force (PMF) for in total 20 complexes (2 BrDs × (5 H3 + 5 H4) tail peptides), from which the free energies were obtained using the WHAM routine.

Table 1. Free energies of binding (FEB) and dissociation free energies (DFE) for the histone tails H4 and H3 to the bromodomain GCN5 for yeast and human. Free energies of binding compare both acetylated (ac) and non-acetylated (nac) 15-mers at the indicated residue, while DFE values are given for the acetylated residue only.

Results and Discussion
We report the results of the pulling experiments in both graphical and tabular form. Table 1 lists all data for human and yeast on H3 and H4 tails together with the amino acid sequences that have been employed in the simulations. In Figure 5 we show the dissociation free energies for the H3 and H4 tails and both GCN5 domains, with the residue positions arranged linearly at integer values along the abscissa, starting from the end of the tail. Values of the dissociation free energy do not vary significantly between species and display similar trends with position. For the H4 tail, a distinctly lower value for the acetylated residue K8 is clearly discernible. This result shows that the positioning of the acetyl mark on residue K8 is associated with a preferred release of the bromodomain from the tail. The case of H3 is distinct: for residue K9 the DFE is lowered in the same way as for K8 in H4, while the value for acetylation of K14 is in fact increased. At present the physical interpretation of these values is not as easy as for H4K8, since we have not studied the dissociation free energy in the case where both modifications have already been set. As far as the comparison with experiment is concerned, the scenario we propose has so far not been tested experimentally.
Experiments in the field so far typically test the enzymatic activity of the histone writer based on mass spectrometry approaches or other bulk measurements (see e.g. [48] and references therein), while our approach is in the spirit of a (simulational) force experiment, providing by design access to specific modifications on individual peptides. Simulations retaining molecular details of the complex formation between the histone writer and the tail are meanwhile also available [49].

Conclusions
To conclude, in this paper we have first briefly reviewed the basic idea of the kinetic proofreading scenario of chromatin remodeling, which was originally proposed with transcriptional activation in mind [6]. As it happens, G. Narlikar and collaborators proposed a related scenario for the repressive ISWI-type remodeler ACF, which was therefore the first case to which this idea could be applied quantitatively. After reviewing these results, we turned back to transcriptional activation and proposed the IFN-β gene as an interesting case for the application of the scenario, since for this gene the sequence of regulatory events has been well studied, although not in a quantitative fashion. We have therefore addressed one key step in the process, the placement of the histone tail modifications, a step which was ignored in [6]. We determined dissociation free energies of the bromodomain of the histone code 'writer' GCN5 for both yeast and human with molecular pulling simulations, in the idealized case of employing representatively constructed tail peptides. Our results indicate that a full quantitative characterization of the kinetic proofreading scheme in the case of gene activation is possible in principle, if not by experiment, then at least by simulation. An example of the latter is the recent work by Teif et al. [30], in which the binding of HP1 to the chromatin fiber is considered, taking into account the folding state of the chromatin fiber.

Figure 2. Schematic representation of the domains of the ISWI remodeler. The ATPase is the motor unit which interacts with the unmodified histone tail H4. The neighbouring regions AutoN and NegC modulate this interaction. Hand-Sant-Slide denotes domains involved in regulating the interaction of the remodeler with DNA.

Figure 3. The regulatory region of the IFN-β gene. a) An enhancer element is located upstream of the gene in a nucleosome-free region (NFR). The TATA-box of the gene is partially occluded by a nucleosome. b) After remodeling, the nucleosome is moved downstream, allowing the general transcription factor TFIID to bind. Drawn after ref. [17,19]; numbers in both graphs denote the position relative to the IFN-β transcription start site on the DNA sequence before remodeling, as the nucleosome has no precise position afterwards.

Figure 4. Ribbon-style representation of bromodomains. α-helices are shown as flat ribbons, loops as thin wires. (A) Yeast GCN5-BrD; (B) structure-based alignment of yeast (purple) and human (green) GCN5-BrD with an RMSD of 0.8; small differences in the ZA- and BC-loops are observed; (C) human GCN5.

Figure 5. Dissociation free energies (in kcal/mol) as determined from the pulling protocol, as taken from Table 1. On the abscissa the modified histone amino acids are placed on arbitrary integer positions; the drawn line is a guide to the eye. a) H4; b) H3. The yellow (lighter) curve is the yeast data.