RNA binding induces an allosteric switch in Cyp33 to repress MLL1-mediated transcription

Mixed-lineage leukemia 1 (MLL1) is a transcription activator of the HOX family, which binds to specific epigenetic marks on histone H3 through its third plant homeodomain (PHD3) domain. Through an unknown mechanism, MLL1 activity is repressed by cyclophilin 33 (Cyp33), which binds to MLL1 PHD3. We determined solution structures of Cyp33 RNA recognition motif (RRM) free, bound to RNA, to MLL1 PHD3, and to both MLL1 and the histone H3 lysine N6-trimethylated. We found that a conserved α helix, amino-terminal to the RRM domain, adopts three different positions facilitating a cascade of binding events. These conformational changes are triggered by Cyp33 RNA binding and ultimately lead to MLL1 release from the histone mark. Together, our mechanistic findings rationalize how Cyp33 binding to MLL1 can switch chromatin to a transcriptional repressive state triggered by RNA binding as a negative feedback loop.


RNA library for selection by ligand binding
A random RNA pool was generated from a dsDNA template containing a T7 polymerase promoter site followed by a 30-nucleotide randomized sequence and a 3' constant primer-binding site. The following primers were used to design the dsDNA template: were prepared by lyophilization and resuspension in D2O.

NMR titrations
Multi-complex assembly/dissociation: H3K4me3 peptide was titrated with MLL1 PHD3 at 310 K in 20 mM KHPO4 pH 7, 40 mM KCl, 50 μM ZnCl2 buffer to a 1:1 ratio. Cyp33 RRM was then titrated with UAAUGU RNA to a 1:1 ratio under the same conditions. Finally, both complexes were mixed and the NMR signals were followed over time. The first serial file of the F1-filtered 2D NOESY was used to follow the evolution of the unlabelled RNA and of the K4me3 group of the unlabelled peptide, and 1H-15N HSQC to follow the evolution of 15  All titration experiments were measured on a Bruker AVII 700 MHz spectrometer equipped with a cryoprobe.

Reaction network model
An ordinary differential equation (ODE) model was formulated by rule-based modeling in the BioNetGen language (65). Model simulations under different conditions were performed using the RuleBender package (66). Model parameters are listed in table S2. The final model was deposited in BioModels (MODEL2201310002).
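As an illustration of the kind of mass-action ODE that such a rule-based model expands into, the following minimal Python sketch integrates a single reversible binding step, A + B <-> AB, and compares the result with the analytic equilibrium. It is not the deposited BioNetGen model. The 100 μM concentrations and the 30 μM KD are taken from the simulation description in this paper (RNA and H3K4me); the kon value, the time step and the function names are hypothetical choices made only for this example.

```python
import math

def equilibrium_complex(a_total, b_total, kd):
    """Analytic equilibrium [AB] for A + B <-> AB (all values in uM)."""
    s = a_total + b_total + kd
    return (s - math.sqrt(s * s - 4.0 * a_total * b_total)) / 2.0

def simulate_binding(a_total, b_total, kd, kon=0.01, t_end=60.0, dt=0.01):
    """Integrate d[AB]/dt = kon*[A]*[B] - koff*[AB] with forward Euler.

    kon (uM^-1 s^-1) is a HYPOTHETICAL value; koff follows from koff = kd * kon,
    so the equilibrium is fixed by kd alone.
    """
    koff = kd * kon
    ab = 0.0
    for _ in range(int(round(t_end / dt))):
        a_free = a_total - ab
        b_free = b_total - ab
        ab += dt * (kon * a_free * b_free - koff * ab)
    return ab

# 100 uM of each partner and KD = 30 uM, as quoted for RNA and H3K4me.
ab_ode = simulate_binding(100.0, 100.0, 30.0)
ab_exact = equilibrium_complex(100.0, 100.0, 30.0)
print(f"ODE: {ab_ode:.2f} uM, analytic: {ab_exact:.2f} uM")
```

Because koff is derived from KD, changing the assumed kon alters only how fast the plateau is reached, not its height.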

Isothermal titration calorimetry (ITC)
In order to prepare the receptor and ligand molecules or complexes, each macromolecular component  Figure S2F was performed by titrating 1 mM H3K4me3 (in the syringe) into a 10 μM RNA solution.
To obtain more data points at the start of the sigmoidal curve, in some experiments the volume of the first 10 injections was set to 4 μl, followed by injections of the normal 8 μl, which leads to the apparent discontinuity in the decrease of the heat response amplitude. The total number of injections was set to 40 with a delay of 5 minutes, and the measurement stopped automatically once the syringe volume was consumed. All experiments were performed at 25 °C on a VP-ITC instrument (Microcal), calibrated according to the manufacturer's instructions. Raw data were integrated, normalized for the molar concentration, and analyzed using the Origin 7.0383 software according to a one-binding-site model.
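The injection scheme described above can be written out as a short Python sketch that lists the per-injection volumes and the cumulative titrant-to-cell molar ratio. The 4 μl and 8 μl volumes, the 40 injections and the 1 mM / 10 μM concentrations come from the text; the 1400 μl cell volume is an assumed nominal value, the function names are hypothetical, and dilution of the cell content is ignored for simplicity.

```python
def injection_schedule(n_total=40, n_small=10, small_ul=4.0, normal_ul=8.0):
    """Per-injection volumes (ul): n_small reduced injections, then normal ones."""
    return [small_ul] * n_small + [normal_ul] * (n_total - n_small)

def molar_ratios(volumes_ul, syringe_um, cell_um, cell_volume_ul):
    """Cumulative titrant/cell molar ratio after each injection.

    Ignores dilution and displacement of the cell content (simplification).
    """
    ratios, injected = [], 0.0
    for v in volumes_ul:
        injected += v
        ratios.append(syringe_um * injected / (cell_um * cell_volume_ul))
    return ratios

volumes = injection_schedule()
# 1 mM syringe and 10 uM cell as in the Figure S2F titration;
# the 1400 ul cell volume is an ASSUMED nominal value, not taken from the text.
ratios = molar_ratios(volumes, syringe_um=1000.0, cell_um=10.0, cell_volume_ul=1400.0)
print(sum(volumes), ratios[9], ratios[10], ratios[-1])
```

The molar ratio advances half as fast during the first 10 injections, so the heat per injection steps up when the volume doubles, which is the apparent discontinuity mentioned in the text.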

PPIase activity assay
The PPIase activity of recombinant wild-type Cyp33 was measured as described (67).

RT-PCR
HEK293T cells were transfected with pCEP4 plasmid constructs expressing full-length FLAG-tagged wild-type and mutated versions of CYP33 using Lipofectamine 2000 (Invitrogen EDTA, 1x Roche protease inhibitor and 25 U/mL Superase-RNAse inhibitor). Finally, RNA was eluted with elution buffer (100 mM Tris-HCl pH 7.5, 50 mM NaCl, 10 mM EDTA, 100 μg Proteinase K, 0.5% SDS) for 1 hour at 55 °C in a Thermomixer at 1100 rpm, extracted using the RNA extraction kit (Ambion) and reverse-transcribed into cDNA using the SuperScript III First-Strand Synthesis SuperMix (Thermo Fisher Scientific). qPCRs were performed using SYBR GreenER qPCR SuperMix Universal (Invitrogen), and data were expressed as relative enrichment to the respective input material.
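One common way to express qPCR data relative to input is the fraction-of-input calculation sketched below in Python. The text does not state the exact formula used, so this is only an illustration under stated assumptions: perfect doubling per cycle by default, a hypothetical 10% input aliquot, and hypothetical Ct values; the function name is also hypothetical.

```python
import math

def fraction_of_input(ct_ip, ct_input, input_fraction, efficiency=2.0):
    """Recovered fraction of input for one target.

    ct_ip          -- Ct of the pulled-down (eluted) sample
    ct_input       -- Ct of the input aliquot
    input_fraction -- fraction of the lysate kept as input (e.g. 0.1 for 10%)
    efficiency     -- amplification factor per cycle (2.0 = perfect doubling)
    """
    # Shift the input Ct to what 100% of the material would have given.
    ct_input_full = ct_input - math.log(1.0 / input_fraction, efficiency)
    return efficiency ** (ct_input_full - ct_ip)

# HYPOTHETICAL Ct values, for illustration only.
enrichment = fraction_of_input(ct_ip=27.0, ct_input=25.0, input_fraction=0.1)
print(f"{100 * enrichment:.2f}% of input")
```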
The following primers were used for real-time analysis: hU1 Rv 5' CAGGGGGAAAGCGCGAACGCA 3'

Figure 6A, with the following differences: (D) Effect of RNA alone: Cyp33 was not recruited during RNA transcription; (E) Effect of Cyp33 alone: RNA was not transcribed; (F) Both RNA and Cyp33 were generated, but the RNA-H3K4me KD was set to 500 μM (17-fold weaker affinity than the experimentally determined 30 μM) to illustrate the robustness of the system behavior. Simulations show equilibrium populations of the different species in the target system, as simulated by the developed ODE model encompassing all underlying reactions. At t = 0 min, the initial system is equilibrated with only MLL1 and H3K4me at 100 μM each. At t = 5.6 min, RNA transcription is triggered, simultaneously recruiting Cyp33 to the system, with each species reaching a maximum concentration of 100 μM, thus giving equimolar concentrations of all four components of the system (MLL1, H3K4me, RNA, Cyp33). Simulations in (D) and (E) show the cases in which only RNA or only Cyp33, respectively, was dynamically added to the system. At t = 22 min, RNA transcription is turned off, allowing gradual removal of RNA and Cyp33 from the system. Left panels show the balance between the "active" (blue trace) and "repressive" (red trace) states of H3K4me. The "active" state combines the binary MLL1-H3K4me and ternary Cyp33-MLL1-H3K4me complexes. The "repressive" state combines free H3K4me and RNA-bound H3K4me. Right panels show the corresponding time-resolved dynamics of each of the above species separately.

Table S2. Parameter values and species concentrations used in the simulations.

Supplementary
-The network simulation conclusions in the paper refer only to equilibrium states, defined by thermodynamic constants. Kinetic constants are estimated primarily for ODE model definition purposes and, within the simulation timescales used in the paper, do not affect the network equilibrium states.
-Unless indicated otherwise, the bimolecular kinetic kon constants are assumed to be 10^2-fold slower than the absolute diffusion limit, to approach the unbiased diffusion-limited kon (70). The diffusion limits are estimated from the spatial dimensions of the corresponding molecules.
-"Experimental" refers to measurements performed in this study: KD constants measured by ITC or NMR at 25 °C (298.15 K).
-Main simulations presented in the paper were done at equimolar 100 μM of MLL-PHD3 and H3K4me (present from the start of the simulation), with 100 μM RNA and Cyp33 first dynamically recruited and then removed during the simulation.
Keq = 0.084
-kon is estimated from the "3D-diffusion-to-capture" mechanism (the time it takes a protein to find its target by simple 3D diffusion) (71): τ3D = V / (D × a), kon = 1/τ3D, where V, D and a are the search volume, the diffusion coefficient and the linear size of the target. Since the α-helix (26 Å long) is tethered at one end, it is assumed that the search occurs in a volume whose radius is defined by the helix. The linear size of the target surface is set equal to the helix length (a = 26 Å). The diffusion coefficient is assumed to be the same as for the helix free in solution (1.9 × 10^-6 cm^2 s^-1). The fact that the helix is attached to the protein at one end is neglected. The two main consequences of this should, to a rough approximation, balance one another: the helix should move (diffuse) more slowly than when free in solution, and at the same time it should find the target faster, since the search space is reduced.
koff is estimated as follows: (1) the α-perp (bound to β-sheet) state should explain the Cyp33-RNA KD enhancement from 319 to 79 μM upon helix "removal" in the WLF mutant (i.e. 75% of the α-perp state at equilibrium); (2) kon of formation should be the same (the diffusion search occurs in the same volume); (3) koff of the α-perp state is 4 times slower than koff (the β-sheet-bound state is expected to be more stable), as judged from the relative interaction interfaces between the molecules.
-Requires that the PHD3 domain is bound to the β-sheet of Cyp33; kon and koff are estimated in the same way.
-kon and koff are set the same in the binary (Cyp33-PHD3) and ternary (Cyp33-PHD3-H3K4me) complexes. (73).
In the absence of Cyp33, the cis isomer is predicted to be preferred on the basis of the structural data (13). Here this is modelled by inverting the cis/trans isomer equilibrium ratio from 0.135 to 1/0.135.
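The numbers quoted in the parameter notes above can be checked with a few lines of Python. This is a rough sketch under explicit assumptions, not the authors' calculation: the search volume is taken as a full sphere whose radius equals the helix length, the 75% population is derived from a simple two-state picture in which the helix-bound state cannot bind RNA, and the isomer ratio is read as [cis]/[trans]. All function names are hypothetical.

```python
import math

ANGSTROM_CM = 1e-8  # 1 Angstrom expressed in cm

def tau_3d(radius_cm, diff_cm2_s, target_cm):
    """Search time tau = V / (D * a); V is a full sphere (ASSUMPTION)."""
    volume = 4.0 / 3.0 * math.pi * radius_cm ** 3
    return volume / (diff_cm2_s * target_cm)

def blocked_population(kd_apparent, kd_unblocked):
    """Population of a binding-incompetent state that raises KD from
    kd_unblocked to kd_apparent (two-state ASSUMPTION)."""
    return 1.0 - kd_unblocked / kd_apparent

def isomer_fractions(ratio):
    """(cis, trans) populations for ratio = [cis]/[trans] (ASSUMED reading)."""
    cis = ratio / (1.0 + ratio)
    return cis, 1.0 - cis

helix = 26 * ANGSTROM_CM            # helix length, used as radius and as a
tau = tau_3d(helix, 1.9e-6, helix)  # D = 1.9e-6 cm^2/s from the notes
print(f"tau_3D = {tau:.2e} s, 1/tau = {1 / tau:.2e} s^-1")
print(f"blocked population = {blocked_population(319.0, 79.0):.2f}")
print("cis fraction, ratio 0.135:", round(isomer_fractions(0.135)[0], 3))
print("cis fraction, ratio 1/0.135:", round(isomer_fractions(1 / 0.135)[0], 3))
```

With these inputs the KD shift from 319 to 79 μM corresponds to a blocked population of about 0.75, in line with the 75% quoted in the notes, and inverting the isomer ratio simply swaps the cis and trans populations.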