Reader domain specificity and lysine demethylase-4 family function

The KDM4 histone demethylases are conserved epigenetic regulators linked to development, spermatogenesis and tumorigenesis. However, how the KDM4 family targets specific chromatin regions is largely unknown. Here, an extensive histone peptide microarray analysis uncovers trimethyl-lysine histone-binding preferences among the closely related KDM4 double tudor domains (DTDs). KDM4A/B DTDs bind strongly to H3K23me3, a poorly understood histone modification recently shown to be enriched in meiotic chromatin of ciliates and nematodes. The 2.28 Å co-crystal structure of KDM4A-DTD in complex with H3K23me3 peptide reveals key intermolecular interactions for H3K23me3 recognition. Furthermore, analysis of the 2.56 Å KDM4B-DTD crystal structure pinpoints the underlying residues required for exclusive H3K23me3 specificity, an interaction supported by in vivo co-localization of KDM4B and H3K23me3 at heterochromatin in mammalian meiotic and newly postmeiotic spermatocytes. In vitro demethylation assays suggest H3K23me3 binding by KDM4B stimulates H3K36 demethylation. Together, these results provide a possible mechanism whereby H3K23me3-binding by KDM4B directs localized H3K36 demethylation during meiosis and spermatogenesis.

Vertebrate KDM4A-C proteins contain a conserved double tudor domain (DTD) and a potential zinc-finger domain at the carboxy terminus (Fig. 1a). These types of chromatin-interacting modules (also known as reader domains) often mediate binding to specific histone modification states [17][18][19] . Tudor domains are part of the 'Royal Family' reader domains, which usually recognize methylated lysine residues 20,21 . In particular, DTD from KDM4A (KDM4A-DTD) was shown to form an unique integral structural unit and recognize methylated lysines [22][23][24][25][26][27] . Deletion of the C-terminal domain in KDM4 proteins resulted in a change of sub-cellular localization, changed demethylase activity and disruption of other KDM4 functions 8,14,15,28 , suggesting functional roles for the C-terminal DTDs. However, there has been no comprehensive investigation of the histone-binding properties for KDM4B and KDM4C DTDs.
To better understand how reader domains regulate the overall chromatin-acting functions among the closely related KDM4 family members, we aimed to determine and compare histone interactomes of the C-terminal DTDs in human KDM4A-C proteins (Fig. 1a). From our biochemical and structural profiling, we find KDM4A, KDM4B and KDM4C DTDs display different histone-binding preferences. We show that these DTDs use an aromatic cage as a general mechanism to coordinate trimethyl lysine and, most importantly, the sequence specificity is largely determined by side-chain interactions with surrounding residues. Specifically, we describe the unique interaction between KDM4-DTDs and H3K23me3, a histone modification enriched in heterochromatin during meiosis in primary spermatocytes. Our crystal structures and homology models explain the origins of H3K23me3 specificity by KDM4B, and these biochemical and structural data are supported by the co-localization of full-length KDM4B with H3K23me3 in vivo. We demonstrate that H3K23me3 binding stimulates H3K36 demethylation by KDM4B. These results suggest H3K23me3 as a novel link between KDM4 family and mammalian germ cell development, potentially to eliminate methylation at H3K36 at heterochromatic regions. The molecular determinants of histone-binding specificity by DTDs serves as a general framework to understand KDM4 family functions.
We further quantified the methylated histone binding of KDM4-DTDs by employing a solution-based binding assay (fluorescence polarization; Fig. 1c,d). Overall, the derived binding constants were consistent with the peptide array analysis. KDM4A-DTD binds H3K23me3 peptide at low micromolar affinity (K d ¼ 2.2 mM), slightly stronger than its interaction with H3K4me3 and H4K20me3 (Fig. 1d), and KDM4C-DTD displayed highly specific binding to H3K4me3 (K d ¼ 6.8 mM; Fig. 1d). In dramatic contrast to KDM4A-DTD and KDM4C-DTD, KDM4B-DTD only displayed striking specificity for H3K23me3 (K d ¼ 10.3 mM), with no significant binding to H3K4me3 ( Fig. 1d and Supplementary Fig. 1c). Collectively, this systematic analysis indicates that KDM4A-C DTDs have distinct and surprising histone-binding preferences that could not have been predicted based on the high similarity in their overall amino acid sequences (460% identity, 480% similarity; see Fig. 1e). Nevertheless, how KDM4-DTDs achieve sequence specificity for residues proximal to the trimethyl lysine remained unclear. To define the mechanism of such sequence selectivity for KDM4-DTD binding, we proceeded to structurally and biochemically detail these interactions.
The structure of the H3K23me3-bound KDM4A-DTD shows that the trimethyl lysine group of H3K23 is coordinated via cation-p and/or hydrophobic interactions by an aromatic cage formed by F932, W967 and Y973 of the reader domain (Fig. 2d). To examine the selectivity of methylation state on H3K23, we determined binding constants of KDM4A-DTD for mono-/di-/ trimethylated H3K23 peptides. The binding results suggest the KDM4A-DTD aromatic cage favours trimethyl lysine (K d ¼ 2.20 mM) over dimethyl lysine (K d ¼ 9.06 mM), or monomethyl lysine (K d ¼ 98.55 mM) at H3K23 (Fig. 2e). Correspondingly, H3K23me3 binding is greatly decreased (K d ¼ 284.90 mM) with the aromatic cage substitution Y973A (Fig. 2e). The conserved role of this aromatic cage in directing PTM selectivity ( Supplementary Fig. 2b) towards trimethyl lysine suggests specific recognition with H3K23me3 is influenced by neighbouring residues besides the trimethyl lysine moiety on histones. The orientation of H3K23me3 peptide in KDM4A is more similar to H3K4me3 than H4K20me3 peptide ( Supplementary Fig. 2b). However, other H3K4me3 reader domains such as ING2-PHD could not bind H3K23me3 ( Supplementary Fig. 2c), because ING2-PHD recognizes the free N terminus (H3A1) of H3K4me3 peptide and does not accommodate the longer N terminus of H3K23me3. In contrast, the more open channel between the two lobes in DTDs can accommodate the longer polypeptide chain N-terminal to H3K23me3.
H3T22, the neighbouring residue of H3K23, plays an important role in the H3K23me3 recognition by KDM4A-DTD. H3T22 side chain forms a hydrogen bond with N940, whereas its main chain forms hydrogen bonds with N940 and S938 side chains (Fig. 2f). The importance of H3T22 in H3K23me3-binding is illustrated by the decreased H3K23me3 binding with phosphorylation on H3T22 (K d ¼ 83.69 mM) or when N940 is changed into Ala in KDM4A (K d ¼ 57.41 mM; Fig. 2g), both of which would interfere with the hydrogen bond of H3T22 side chain. The effect of H3T22ph on H3K23me3 binding is reminiscent of the relationship between H3T3ph and H3K4me3. Indeed, a similar loss of KDM4A binding to H3K4me3 was observed with H3T3ph ( Supplementary Fig. 2d).
Unique contacts with H3R26 also contribute to H3K23me3 recognition by KDM4A. The H3R26 side chain forms a hydrogen bond with KDM4A N931 (2.9 Å), whereas the main chain of a   Supplementary Table 1 for peptide information) for each FP assay. Relative specificity factors (as shown by the shades of red) were normalized to the tightest binding in the matrix (K d ¼ 2.2 mM). Binding affinity for each protein-peptide combination as determined by FP is labelled. (e) Sequence alignment of KDM4A-C DTDs. Secondary structure is labelled on top of the sequence (blue: Tudor 1, orange: Tudor 2). Shades indicate conservation of particular residues (yellow: conserved in all three proteins, cyan: conserved in two proteins). Red stars highlight the residues that interact with H3K23me3 peptide as determined from Fig. 2 H3R26 forms a hydrogen bond with the KDM4A-DTD G935 main chain (Fig. 2h). H3R26 fits nicely into a negatively charged cleft formed by residues spanning from E929 to D933 ( Supplementary Fig. 2e). To provide direct biochemical evidence of the H3R26-N931 interaction pair in H3K23me3 recognition, we modified H3R26 in the context of H3K23me3 peptide and observed decreased binding (32.3 mM) between wild-type KDM4A-DTD and H3K23me3R26A peptide (Fig. 2i). Interestingly, modifying H3R26 with asymmetric dimethylation dropped H3K23me3 binding mildly to 8.7 mM (Fig. 2i), suggesting KDM4A-DTD might read H3K23me3 and H3R26me2a in a combinatorial manner. We also observed a dramatic decrease in H3K23me3 binding (73.9 mM) with N931R substitution (Fig. 2j), which could form electrostatic repulsion with H3R26. Conversely, changing N931 into aspartic acid (N931D) increased H3K23me3 binding by about threefold (K d ¼ 0.85 mM; Fig. 2j), potentially by providing ionic interaction with H3R26. Interestingly, N931D also increased H3K14me3 binding (K d ¼ 12.24 mM; Supplementary Fig. 2f), likely to be due to the enhanced ionic interaction between D931 and H3R17 at the þ 3 site of H3K14me3, reminiscent of the distance between H3K23me3 and H3R26 at the þ 3 position (Supplementary Fig. 2g). Overall, the H3R26-N931 pair mediates H3K23me3 recognition by KDM4A-DTD and this contact is unique to H3K23me3 binding. Besides main-chain hydrogen bonds between H3R26 and G935, other main-chain hydrogen bonds are found between H3A24 and F937, together forming three b-sheet-like main-chain hydrogen bonds. The intermolecular interactions determined from the KDM4A-H3K23me3 co-crystal structure involve all residues from H3L20 to H3R26 (Fig. 2k), consistent with the clear electron density of this region in the structure (Fig. 2b). The molecular arrangement of the H3K23me3-bound structure defines an H3K23me3-specific interaction network and suggests that modifications on H3T22 and H3R26 could play regulatory roles in modulating DTD binding.
H3K23me3 recognition by specific side-chain interactions.
Previously, H3K4me3 and H4K20me3 were characterized as histone PTMs recognized by KDM4A-DTD 23,30 . D945 of KDM4A was identified as the residue important for H3K4me3 binding by forming a salt bridge with H3R2, whereas D939 was reported critical for H4K20me3 binding due to its interaction with H4R19. We noted that neither D945 nor D939 contacts the H3K23me3 peptide in the KDM4A-H3K23me3 structure (Fig. 2k). Using this observation, we generated a series of D945 substitutions (D945A/L/S) and found that they did not affect H3K23me3 binding, although such changes did decrease H3K4me3 binding ( Supplementary Fig. 3a,b). Similarly, D939A only weakens H4K20me3 binding but not H3K23me3 binding or H3K4me3 binding ( Supplementary Fig. 3a-c). The fact that individual KDM4-DTD substitutions at different residues could exclusively disrupt H3K23me3, H3K4me3 and H4K20me3 binding, respectively, without affecting the other two interactions, demonstrates that H3K23me3 recognition is distinct from either H3K4me3 or H4K20me3 interactions. To further confirm the specific H3K23me3 recognition is mediated by unique side chain interactions, we combined amino acid substitutions that reduce H3K4me3 (D945R) and H4K20me3 (D939R) affinity, while increasing H3K23me3 affinity (N931D) as a triple mutant ('DRR') in KDM4A-DTD (Fig. 3a,b). The DRR mutant retained a low micromolar K d value for H3K23me3 binding, while discriminating against other PTM states, leading to overall enhanced H3K23me3 specificity (Fig. 3c). The engineered 'DRR' reader domain specifically recognized H3K23me3 full-length recombinant histone (Fig. 3d). The successful engineering to selectively recognize H3K23me3-containing histones highlights the detailed molecular knowledge gleaned from this study.
The above biochemical analysis validated the intermolecular interactions between KDM4A-DTD and H3K23me3 observed from the co-crystal structure. As KDM4B-DTD also interacts with H3K23me3 with high discrimination (Fig. 1d), we investigated whether KDM4B-DTD engages H3K23me3 using intermolecular interactions similar to KDM4A-DTD. Indeed, KDM4B-DTD binds H3K23me3 with sensitivity to PTM states on H3K23 and neighbouring residues in a similar fashion to that observed for KDM4A-DTD ( Fig. 4a,b), suggesting that the interaction network with H3K23me3 is shared between KDM4A-DTD and KDM4B-DTD. However, our initial analysis showed a unique specificity of KDM4B-DTD: unlike KDM4A-DTD and KDM4C-DTD, KDM4B-DTD displayed dramatically poor interaction with H3K4me3 (Fig. 1b,d and Supplementary Fig. 1c). Such poor interaction with H3K4me3 was also confirmed with more physiologically relevant chromatin substrates, native nucleosomes purified from human cell line MCF-7, and KDM4B-DTD showed no enrichment of H3K4me3-containing chromatin (Fig. 4c). Furthermore, KDM4B-DTD displayed H3K23me3specific binding with reconstituted methyl lysine analogue (MLA) nucleosomes and such binding was abolished with the aromatic cage mutant Y993A (Fig. 4d). To better understand the molecular basis of the specific H3K23me3 binding and the disfavoured H3K4me3 binding by KDM4B-DTD, we crystallized KDM4B-DTD and successfully determined the structure of the apo-form at a resolution of 2.56 Å (PDB ID: 4UC4; Table 1). Analysis of the KDM4B-DTD structure reveals a highly similar overall tertiary structure when compared with KDM4A-DTD and KDM4C-DTD ( Supplementary Fig. 4a). Next, we modelled KDM4B-DTD with the H3(19-27)K23me3 ( Fig. 4e) or H3(1-7)K4me3 ( Fig. 4f-i) peptide, based on the highly conserved structural fold of HTD-2 from the corresponding KDM4A co-crystal structures (PDB ID: 5D6Y and 2GFA). Indeed, the KDM4B-H3K23me3 model displayed a similar interaction network ( Supplementary Fig. 4b,c) as KDM4A-H3K23me3 (Fig. 2k), including the hydrogen bonds with H3T22 and H3R26. Specific KDM4B mutants that impaired its H3K23me3 binding also confirmed the KDM4B-H3K23me3 model ( Supplementary Fig. 4d). Furthermore, wild-type KDM4B-DTD but not the aromatic cage mutant (Y993A) displayed binding to recombinant H3K23me3 histone ( Supplementary  Fig. 4e).
To pinpoint the residues that account for the poor H3K4me3 affinity of KDM4B, we looked for apparent clashes or missing molecular interactions in the modelled KDM4B-H3K4me3  Supplementary Table 1 for peptide information) for each fluorescence polarization assay. Relative specificity factors (as shown by the shades of red) were normalized to the tightest binding in the matrix (K d ¼ 2.2 mM). Binding affinity for each protein-peptide combination as determined by FP is labelled. (d) DRR mutant fails to recognize H3Kc4me3 MLA but retains binding to H3Kc23me3 MLA histone in a far western blot setting. structure ( Fig. 4f-i). We first noticed that N960 in KDM4B-DTD showed a significant steric clash with H3A1 and H3R2 in the model (Fig. 4f). To test whether N960 hinders the interaction between H3K4me3 and KDM4B-DTD, we made the Asn-to-Ala substitution, which should alleviate the steric clash. Indeed, with the N960A variant, H3K4me3 binding was rescued to K d ¼ 303 mM, whereas the corresponding substitutions in KDM4A (N940A) dramatically decreased H3K4me3 binding (Fig. 4g), most probably due to disrupting the hydrogen bond with H3T3. S965 in KDM4B might also contribute to the poor affinity for H3K4me3, probably due to the lack of strong ionic interaction between H3R2 and S965; such ionic interaction is fulfilled between KDM4A-D945 and H3R2 (Fig. 4h). In addition, KDM4B-S965 forms a weaker hydrogen bond (3.9 Å) with H3R2 in the modelled structure than KDM4A-D945 (3.0 Å; Fig. 4h). Consistent with this hypothesis, H3K4me3 binding in KDM4B-S965D (K d B200 mM) is tighter than that in wild-type KDM4B (K d not measurable; Fig. 4i). At the same time, corresponding substitutions of KDM4A-D945S displayed lower H3K4me3 binding than the corresponding wild-type proteins (Fig. 4i). Similar models of KDM4C-DTD with H3K4me3 and H3K23me3 suggests general conserved interactions as with KDM4A-DTD, except for the H3R26 clash in KDM4C-H3K23me3 model ( Supplementary Fig. 4f,g), which might explain KDM4C's lower affinity to H3K23me3. Together, these results suggest that despite overall sequence and structure homology, specific differences of key residues result in unique H3K23me3 specificity of KDM4B-DTD. This also highlights the  multiple side chain interactions that define the sequence specificity of KDM4 reader domains.
H3K23 methylation is enriched in spermatocytes. The specific H3K23me3 interaction by human KDM4 reader domains in vitro prompted us to investigate further the physiological context of such interaction. Only recently has a biological context for the histone PTM H3K23me3 been described in Tetrahymena and C. elegans 32,33 . However, direct evidence of its relevance in mammals has not been documented. To search for a physiologically relevant source of H3K23me3 in mammalian systems, we first acid-extracted endogenous histones from a panel of mammalian cell lines and mouse tissues ( Supplementary  Fig. 5a). We also purified histones from germline micronuclei and non-germline macronuclei in Tetrahymena, as H3K23me3 was previously found exclusively enriched in the micronuclei 32 . Using an H3K23me3-specific antibody produced in-house 32 and profiled on peptide microarray ( Supplementary Fig. 5b,c), we found specific enrichment of H3K23me3 in mouse testes compared with other non-germline sources (Fig. 5a). Higher H3K23me3 levels in germline (micronuclei) histone were detected in Tetrahymena (Fig. 5a), consistent with previous observations. Using the recombinant H3K23me3-specific reader as a probe (DRR mutant), a similar enrichment pattern was observed in testes histones (Fig. 5b).
To test whether H3K23me3 detected in mouse testes also displays association with meiotic chromatin as found previously in Tetrahymena and C. elegans 32 , we performed indirect immunofluorescence of H3K23me3 in mouse testes ( Fig. 5c and Supplementary Fig. 5d). We detected an enrichment of H3K23me3 in mouse primary spermatocytes as they began meiosis I, as indicated by the appearance of the meiosis marker SCP3 (synaptonemal complex protein 3), which coincides with meiotic entry and leptotene. In new spermatids, which had completed meiosis II, H3K23me3 was also detected, albeit in a a MIC MAC  somewhat more compact sub-nuclear distribution than in spermatocytes; however, H3K23me3 was nearly undetectable in mature sperms (Supplementary Fig. 5d). Meanwhile, H3K23me3 signal was much weaker in spermatogonia that have not entered meiosis ( Fig. 5c and Supplementary Fig. 5d). The specificity of the H3K23me3 antibody was validated by peptide competition (Supplementary Fig. 5e). To further validate the higher methylation levels of H3K23 in mouse testes as detected by antibody and the recombinant reader probe, we also quantified different modification levels on H3K23 by mass spectrometry (MS; Supplementary Fig. 6). Indeed, we observed relatively higher global levels of H3K23 methylation (B10% using the quantified levels of monomethylation) in the testes (Fig. 5d) compared with levels (o1%) observed in other mammalian tissues or cell lines 29,34-36 (Levels of di-and trimethylation at H3K23 are difficult to quantify by MS due to the low global abundance in mammals). This observation provides further evidence on H3K23me3 being a conserved histone PTM associated with meiosis in mammalian systems.
Having established the physiological link of H3K23me3 with meiosis in mammalian spermatogenesis, we next investigated the in vivo connection between H3K23me3 and its putative reader, KDM4. As KDM4B harbours the unique specificity to read only the H3K23me3 modification ( Fig. 4 and Supplementary Fig. 4d), we probed rat testes tissue to determine the extent of overlapping nuclear localization between KDM4B full-length protein and H3K23me3. A high degree of overlap was found between KDM4B and H3K23me3 staining in spermatocytes (Fig. 5e). The temporal and sub-nuclear overlap of both H3K23me3 and KDM4B in the same cell population suggests a relevant connection during meiosis. Another heterochromatic histone PTM, H3K27me3, was more broadly distributed than H3K23me3 across nuclei of germline cells; however, we found that in meiotic cells, a subset of H3K27me3 staining was highly correlated with the punctate distribution of H3K23me3 (Fig. 5f). In contrast, the euchromatic histone PTM H3K9ac is more diffused inside the spermatocyte nucleus and does not overlap with H3K23me3 (Fig. 5g). These immunofluorescence results map H3K23me3 to heterochromatin, but not euchromatin. Altogether, our results suggest that H3K23me3 helps target KDM4B within heterochromatin of mammalian meiotic spermatocytes.
H3K23me3-binding stimulates H3K36 demethylation by KDM4B. The co-localization of H3K23me3 and KDM4B in vivo, and the specific ability of KDM4B to bind H3K23me3 in vitro, prompted us to investigate the functional role of H3K23me3 binding by KDM4 proteins. KDM4A-C proteins are conserved histone lysine demethylases that have been linked to removal of H3K9me and H3K36me 5,6,8,13,37,38 . More recently, the catalytic domains of KDM4A-C were shown to demethylate H3K27 in vitro as well, although at a much slower rate 39 . Having determined binding specificity of the C-terminal DTD in KDM4B for H3K23me3 (Fig. 3), we reasoned that this interaction might enhance the activity of full-length KDM4B against certain histone methylations that co-occur with H3K23me3.
To investigate the functional consequences of KDM4 proteins binding to H3K23me3, we built several structural models that incorporate both the catalytic and the DTD. One structural model included KDM4A and dually modified peptide H3K4me3K9me3 ( Supplementary Fig. 6a), and demonstrated the simultaneous engagement of H3K4me3 and H3K9me3. However, when the C-terminal reader domain binds H3K23me3, this binding mode requires the opposite orientation (head to toe) between histone H3 and KDM4 protein, suggesting that H3K27 and H3K36, but not H3K9, are more likely to be the demethylase target when H3K23me3 is bound to the DTD (Fig. 6a). Our structural analysis shows no molecular interaction between KDM4B DTD and the H3K27 or H3K36 residue (Fig. 4), suggesting H3K23me3 binding should not directly impede H3K27 or H3K36 demethylation. We then built a structural model of KDM4B and H3K23me3K27me3 ( Supplementary Fig. 6b), which provided evidence that either H3K27 or H3K36 is accessible for demethylation by the KDM4B catalytic domain.
To experimentally test the ability of KDM4B to demethylate H3K27me3 or H3K36me3 on H3K23me3 binding, we compared demethylation activity towards H3K27me3 or H3K36me3 by full-length KDM4B in the presence of H3K23me3 on the same histone peptide (Fig. 6a). Using MS/MS, we did not detect significant demethylation of H3K27me3 peptide by KDM4B, even after prolonged incubation ( Supplementary Fig. 6c), consistent with previous reports of H3K27me3 being a poor substrate for the KDM4 family 5,6,8,13,37 . The lack of measurable demethylation on the dually modified H3K23me3K27me3 peptide (Fig. 6a) indicated that H3K23me3 binding by the reader domain did not stimulate the negligible H3K27me3 demethylation. In stark contrast, we observed consistent H3K36me3 demethylation by KDM4B ( Fig. 6a and Supplementary Fig. 6c). Using a MS-based demethylation assay, we corroborated the findings: the coexistence of H3K23me3 and H3K36me3 on the same peptide yielded faster rates of H3K36me2 and H3K36me1 product formation, and a corresponding faster depletion of substrate (H3K36me3) compared with peptide with only H3K36me3 present (Fig. 6b). The MS analysis also confirms that increased activity of KDM4B in the presence of H3K23me3 is indeed due to demethylation of H3K36me3 and not the result of altered demethylation at H3K23me3. KDM4B preferentially demethylate H3K36me3 in the context of dually modified peptide when incubated with a 1:1 mixed pool of H3K23me3K36me3 and H3K36me3 peptide substrates (Fig. 6c). Consistent with the ability of both KDM4A and KDM4B but not KDM4C to bind H3K23me3 through its reader domain (Fig. 1), demethylation of H3K36 was enhanced for KDM4A and KDM4B but not KDM4C ( Supplementary Fig. 7d,e). The fact that the H3K23me3K36me3 peptide displayed an accelerated rate of demethylation relative to the K36me3 peptide in the substrate competition experiment (Fig. 6c) supports a cis mechanism for demethylation by KDM4A and KDM4B.
Based on the observation that full-length KDM4B displays enhanced demethylase activity at H3K36me3 in the context of doubly modified H3K23me3K36me3, we determined the subnuclear distribution of KDM4B and H3K36me3 in rat primary spermatocytes (Fig. 6d). Consistent with the in vitro kinetic results, H3K23me3-specific KDM4B and H3K36me3 staining did not overlap and supports the efficient demethylation of H3K36me3 at these loci. Together, the results suggest that the interaction between H3K23me3 and KDM4B-DTD endows the demethylase with enhanced catalytic efficiency for removing H3K36me3 when these chromatin marks exist on the same histone tail.

Discussion
In this study, we described the molecular basis and functional relevance of H3K23me3 recognition by the KDM4A and KDM4B DTDs. Biochemical analysis revealed the intricate molecular interactions that govern PTM specificity and sequence specificity of KDM4-DTDs for H3K23me3. In particular, we demonstrated the significance of side-chain interactions with residues surrounding the trimethyl lysine in mediating KDM4B-DTD's binding preference of H3K23me3 over H3K4me3. This study suggests a role for KDM4B in heterochromatin maintenance during meiosis, mediated by H3K23me3-dependent binding and demethylation of H3K36me3 (Fig. 6e). Lastly, our results support a conserved role of H3K23me3 during germ cell development.
KDM4 demethylases are conserved epigenetic regulators of histone methylation that are required for normal development [5][6][7][8][9][10][11][12] . The C-terminal DTDs in the multiple KDM4 homologues are only present in vertebrates ( Supplementary  Fig. 8), suggesting functional diversification as a result of genome duplication during evolution. The sophisticated intermolecular interactions present in KDM4-DTDs revealed by this study underlies the histone interactomes of human KDM4A-C DTDs and sheds light on distinct functions of KDM4 members [14][15][16] . The mild phenotype of individual KDM4 knockout mice 40,41 implicates some compensating functions among KDM4 members, which is consistent with overlapping histone PTM interactomes among KDM4A-C DTDs (Fig. 1). However, the H3K23me3-specific reader KDM4B-DTD displays poor H3K4me3 binding, whereas KDM4A-DTD and KDM4C-DTD have low-micromolar binding affinity with H3K4me3 (Fig. 1). This observation is in line with previous findings that KDM4C but not KDM4B overlaps with H3K4me3 genomic distribution and the fact that targeting of KDM4C is dependent on its DTD 16,40 . Altogether, the data suggest a model in which chromatin association of KDM4 members is mediated by both their divergent and common reader binding targets, to ensure correct epigenetic programming during development.
Germ cell chromatin is subject to epigenetic regulation 42,43 . Here we propose that H3K23me3 and H3K36me3 might be   involved in mammalian germ cell development, especially during the mitosis-to-meiosis transition, through cross-talk between the KDM4 reader and catalytic domains. We revealed H3K23me3 enrichment in mouse and rat testes, specifically in primary spermatocytes undergoing meiosis (Fig. 5). This provides further evidence about H3K23me3 being a conserved histone modification associated with meiotic chromatin, as previously identified in Tetrahymena and C. elegans 32 . The enrichment of H3K23me3 is also consistent with reduced immunostaining of H3K23ac during preleptotene to pachytene stage in mouse testes tissues 44 . The function and mechanism of this unique association with meiotic chromatin still needs further investigation. It remains to be experimentally tested whether H3K23me3 plays similar roles in meiotic DNA damage in both mammalian systems and Tetrahymena 32 .
Previously, the KDM4 family and H3K36me3 were independently implicated in germ cell development 5,41,45 ; however, the underlying relationship was not clear. Based on our detailed structural and biochemical analysis (Figs 2-4), we would postulate that the reader domain of KDM4B has evolved to discriminate for exclusive binding to H3K23me3 with possible regulation by PTMs on H3T22 and H3R26. The H3K23me3 binding by KDM4A and KDM4B is likely to be conserved across species, given the conservation of the H3K23me3-discriminating residue (N931 in human KDM4A and N951 in human KDM4B) as shown in Supplementary Fig. 8. The functional analyses reported here demonstrate that the distinct mode of H3K23me3 binding stimulates the demethyalse activity of KDM4B towards H3K36me3. These results are in accord with such 'cross-talk' between a histone methylation 'reader' and 'eraser', leading to cishistone demethylation reported for PHF8, KDM7A 46,47 , KDM4A and KDM4C 28,48 . In the case of PHF8, biochemical and structural characterization indicates H3K4me3 strongly promotes H3K9 demethylation 46 . Although our structural model and the preliminary competition kinetic assay suggests a cis-histone cross-talk mechanism between H3K23me3 binding and H3K27/ K36 demethylation (Fig. 6c and Supplementary Fig. 7a,b), more detailed kinetic analysis will be needed to fully investigate cis-histone and trans-histone cross-talk between H3K23me3 and KDM4 demethylation sites (H3K9me, H3K36me and H3K27me), in particular using nucleosome substrates. Interestingly, another KDM4 family member, KDM4D, which lacks the C-terminal reader domain and can only demethylate H3K9, has also been implicated in spermatogenesis 41 . This suggests other histone PTMs (such as H3K9 methylation) might also be involved in germ cell development.
Interestingly, we found H3K23me3 also overlaps significantly with H3K27me3 by immunofluorescence in rat spermatocytes (Fig. 5f). This finding is consistent with previous MS results showing H3K23me3 physically co-exists on a subset of H3K27me3 modified histones in Tetrahymena and C. elegans 32,33 , suggesting that similar positive 'cross-talk' between H3K23me3/H3K27me3 may occur in mammals as well. Furthermore, KDM4B binding to H3K23me3 was not disrupted in the presence of H3K27me3 (Supplementary Fig. 1b). Thus, the combined heterochromatic H3K23me3/H3K27me3 signature may be used to recruit KDM4B to remove H3K36me3 at heterochromatin during the initial stages of meiosis. Indeed, H3K27me3 and H3K36me3 are located in mutually exclusive regions of germline chromatin in C. elegans. Whether H3K23me3-dependent downregulation of H3K36me3 serves to reduce gene expression or increase meiotic DNA damage repair at heterochromatin needs further investigation. Collectively, these observations suggest enrichment of H3K23me3 during meiosis is conserved across multiple species and our molecular characterization provides a mechanistic basis for the future investigation of the relationship between KDM4B and H3K23me3 in the germline.
Peptide microarray. Peptide microarray was synthesized in-house 29,49 . Peptide arrays were handled with care and protected from light. All solutions were filtered before putting on the array. Assays were performed with a modified two-chamber simplex gasket (Intuitive Bioscience). After tightening the gasket onto the slide surface, the peptide array was blocked with blocking solution (1 Â PBS, 0.05% Tween-20 pH 7.4, 1% BSA) at 4°C overnight, to reduce nonspecific binding.
(1) Reader binding: 10-50 nM purified recombinant HT-KDM4 DTDs were first reacted with excessive HaloTag ligand-biotin (Promega) in blocking solution on ice for 15 min and incubated on peptide array under mild rocking at 4°C, 1 h. The slide was washed with PBS with 0.05% Tween (PBST) three times and incubated with 1:2,000 streptavidin-conjugated Alexa-Fluor647 (Invitrogen) in blocking solution at room temperature for 1 h. (2) Antibody binding: after blocking, 1:2,500-1:100 dilution H3K23me3 antibody in blocking solution were incubated with peptide microarray under mild rocking at 4°C, 1 h. The slide was washed with PBST three times and incubated with 1:1,000 anti-rabbit IgG Alexa 647 (Cell Signaling) in blocking solution at room temperature for 1 h. For both assays, after three washes in PBST and a final wash in distilled water, the slide was dried by centrifugation and imaged at dual wavelengths of 532 and 635 nm on Axon GenePix 4000B (Molecular Devices). The laser power was set to 100%, with automatic gain adjustment (0.05% saturation tolerance) for dual photomultipliers. Image was obtained at 5 mm pixel resolution. Features in each block were defined by manual adjustment of 13 Â 13 grid (feature diameter, 280 mm; column spacing and row spacing, 320 mm) to cover every spot. Signal intensities were quantified by GenePix Pro 6.1 software (Molecular Devices). For each spot (feature), the mean intensities for 635 nm wavelength were used for subsequent analysis. For each peptide species, an average was calculated from three replicate spots. The averaged intensities were normalized to the range between 0 and 1 by the lowest and highest values of selected peptides on each library. For comparison between peptides with combinatorial modifications, the intensities were normalized to the peptide with single modification as log 2 ratio. The signal at 532 nm wavelength was used to identify misprinting events.
Fluorescence polarization analysis. Peptide sequences are given in Supplementary Table 1. Fluorescein-labelled histone peptides (constant final concentration at 3 nM) were mixed with titrations of purified KDM4 DTD proteins (0.1-100 mM) in buffer (30 mM HEPES, 150 mM NaCl pH 7.4, 0.01% v/v Triton X-100). A no-protein control with only peptide probe was included. Polarization at each concentration was measured as triplicates in 384-well polystyrene black microplates (Thermo Fisher Scientific 262260) by Biotek Synergy H4 multimode plate reader (light source: xenon flash, offset from top: 7 mm, sensitivity: 60%, excitation: 485/20 nm, emission: 528/20 nm, both parallel and perpendicular, normal read speed). Fraction bound (Fb) at each concentration was calculated based on the corresponding polarization values (P): Fb c ¼ (P c À P min )/(P max À P min ), where P min is the polarization value of no-protein control and P max is the polarization value of the saturation value for that peptide. Dissociation constants NATURE COMMUNICATIONS | DOI: 10.1038/ncomms13387 ARTICLE (K d ) were derived by KaleidaGraph, Synergy Software (version 4.1.3) using the following equation: Matrix-assisted reader chromatin capture. MCF-7 cells (ATCC) were cultured in DMEM medium supplemented with 10% fetal bovine serum. For each nucleosome isolation, MCF-7 cells at E90% confluency from two 10 cm plates (E2 Â 10 7 cells in total) were collected and washed for three times in ice-cold Buffer M (10 mM HEPES, 10 mM KCl, 1.5 mM MgCl 2 , 340 mM sucrose pH 7.9, 10% glycerol, v/v), supplemented with 1 mg ml À 1 trichostatin A, 1 mM DTT, 0.5 mM phenylmethylsulfonyl fluoride, 10 mM b-glycerophosphate, 1 mM leupeptin and 1 mM aprotinin. At the last wash, the cell resuspension was lysed with 0.1% Triton X-100 on ice for 10 min. After lysis, nuclei pellets were resuspended in Buffer M and centrifuged at 1,300 g for 12 min through chilled sucrose cushion buffer (10 mM HEPES pH 7.9, 30% sucrose, w/v, 1.5 mM MgCl 2 ), to further purify the nuclei pellets. After three washes with Buffer M, nuclei was diluted to 1.2 to 1.6 mg ml À 1 DNA concentration and digested with 2,000 gel units of micrococcal nuclease (New England Biolabs) at 37°C for 12 min with constant mixing in the presence of 1 mM final concentration of CaCl 2 . Before assay, the amount of enzyme and digestion time was optimized to obtain above 90% purity of mononucleosomes. MNase activity was stopped with 10 mM EDTA and spun down. Soluble chromatin from the supernatant (S1) was collected. Less soluble chromatin (S2) was recovered from the nuclei pellets resuspended in 5 mM HEPES, 0.2 mM EDTA at 4°C overnight. Pooled chromatin extract (S1 plus S2) was concentrated to E10 mM and dialysed into 30 mM HEPES, 150 mM NaCl pH 7. Nucleosome reconstitution and MLA production. MLAs were prepared as previously described 29 . Briefly, histone H3 (X. laevis) with K4C (or K23C) and C110A mutations was expressed in E. coli and purified by Superdex200 column followed by ion-exchange columns. The purified histone was reduced and the cysteine residue was alkylated with excess (2-bromoethyl) trimethylammonium bromide (Sigma-Aldrich) at elevated temperature. After the reaction was quenched with 2-mercaptoethanol, excess reagents and salts were removed using a PD-10 desalting device. Along with other core histones (H2A, H2B and H4) that are recombinantly prepared, MLA H3 was refolded into histone octamers. The purified octamer and 146 bp DNA fragments (shown to exhibit strong positioning to the histone octamers) were reconstituted into nucleosome core particle by a salt-gradient dialysis method 50 .
Crystallization. Initial crystallization screens were performed using an in-house screen comprising IndexHT, SaltHT, Crystal Screen HT, PegRx MgCl 2 and CaCl 2 , 100 mM Tris and Bicine pH 8.5, 10% (v/v) ethylene glycol and 20% (w/v) PEG 8,000) at 15°C using the sitting drop method. All crystals mentioned above were then flash frozen in liquid nitrogen with additional 15-20% glycerol for data collection. Co-crystallization of KDM4B-DTD with H3K23me3 was also attempted in similar initial screens, but complex crystals were not obtained.
Structure determination and homology modeling. Data collection and structure refinement. X-ray diffraction data were collected at the Life Sciences Collaborative Access Team (LS-CAT) with an X-ray wavelength of 0.98 Å at the Advanced Photon Source (Argonne National Laboratory). Data sets were indexed and scaled using HKL 51 or XDS 52 . For phasing experiments on KDM4A-DTD and KDM4B-DTD, molecular replacement was used using 2QQR 23 as starting model and then phenix.autobuild was used for automatic model building 53 . All the structures were completed with alternating rounds of manual model building with COOT 54 and refinement with phenix.refine 53 . Structure quality was validated by Molprobity 55 . Structures were visually interpreted using a stereographic collaborative commodity 3D TV arrangement 56 .
The H3K23me3-bound structure of KDM4A-DTD contains six subunits in the asymmetric unit: two subunits (Chain A, C) have clear electron density for H3Q19-K27, another two subunits (Chain B, D) have clear electron density for H3L20-R26 and the remaining two subunits (Chain E, F) are partially disordered, which lead to higher R-factors than expected (Table 1). This is consistent with the crystal-packing environment, as chain A-D can form a stable crystal lattice leaving a huge channel for chain E-F to fit in (Supplementary Fig. 2a). We used subunits A and C with better peptide coverage for analysis. Two apo structures of KDM4A-DTD at 1.99 and 2.15 Å (PDB ID: 5D6W, 5D6X; statistics summarized in Table 1) with the same space groups and better R/R free factors compared with previously reported KDM4A apo structures (PDB ID: 2GF7 and 2QQR) 23,30 . Analysis of the KDM4B structure reveals that the crystallographic asymmetric unit comprises two subunits with a limited interaction interface (B480 Å 2 by PISA 31 ). This result is consistent with size exclusion chromatography results (Supplementary Table 2), where the Stoke's radius of gyration of KDM4A-C DTDs indicates the protein exists in solution as a monomer. Our attempts to obtain a structure for KDM4C-DTD yielded crystals with limited diffraction even after optimization; therefore, we used the 1.56 Å resolution KDM4C-DTD structure solved previously (PDB ID: 2XDP 57 ) as a template for modelling.
Homology modelling: based on the already known protein-peptide complex structures of KDM4A-DTD with either H3(1-7)K4me3 (PDB ID: 2GFA) 30  Immunofluorescence microscopy. Mouse immunofluorescence staining. Testes of 7-week-old male C56BL/6 mice were dissected and detunicated in PBS, fixed in 2% paraformaldehyde in PBS for 3 h at 4°C, equilibrated into 30% sucrose solution overnight at 4°C, embedded in O.C.T. (optimal cutting temperature) compound and then frozen. Seven-micrometre-thick sections were cut, adhered to slides, baked at 50°C for 45 min and permeabilized in PBST (0.05% Triton X-100). Sections were blocked for 30 min in 10% normal goat serum and 0.3% BSA in PBST. Primary antibodies (rabbit anti-H3K23me3) were diluted 1:70 in PBS (with 0.1% BSA and 0.1% Tween-20) and incubated at 4°C overnight. Secondary antibodies were goat anti-rabbit conjugated with Cy3 or Alexa-488 (Jackson Immunoresearch and Invitrogen) diluted at 1:200 in PBS (with 0.1% BSA and 0.1% Tween 20) and incubated for 2 h at room temperature. For peptide competition experiments, anti-H3K23me3 was incubated with 5 mg ml À 1 of indicated peptides for 1 h on ice and immune complexes were removed by centrifugation for 10 min at 416,000 g. Pre-absorbed primary antibodies were applied to the tissue samples for 1 h at room temperature, followed by secondary antibody incubation for 1 h at room temperature. Before imaging, DNA was counterstained with 4,6-diamidino-2-phenylindole. Confocal images were acquired with a Cascade QuantEM 512SC camera (Photometrics) attached to a Zeiss Axioimager microscope with Yokogawa spinning disk confocal scanner and Slidebook software (Intelligent Imaging Innovations). Maximal-intensity projections encompassing a single layer of nuclei were generated by the Slidebook software. Image processing was performed in Adobe Photoshop CS4.
Histone acid extraction from tissues. Testes and other tissues were dissected in PBS from C57BL/6J male wild-type mice (3-6 months old) and homogenized in ice-cold hypotonic lysis buffer (10 mM Tris-HCl, 10 mM NaCl, 3 mM MgCl 2 pH 7.4) supplemented with proteinase inhibitors in a 1 ml dounce homogenizer. Homogenized tissue lysate was filtered through 100 mm nylon mesh cell strainer (Fisher Scientific) and pelleted (800 g, 10 min, 4°C). Pelleted nuclei were washed with PBS twice and resuspended with 0.4 N H 2 SO 4 . After incubation at 4°C for 4 h, the nuclear lysate was clarified by centrifugation (3,400 g, 5 min, 4°C) and acid-soluble supernatant was precipitated with 100% trichloroacetic acid at 4°C overnight. The precipitated proteins were rinsed with acetone þ 0.1% HCl and the air-dried pellet was resuspended in water and quantified by Bradford assay (Bio-rad).
For reader-based far western, the membrane was blocked with 5% BSA in HBST (30 mM HEPES, 150 mM NaCl pH 7.4, 0.05% Tween 20) at room temperature for 1 h and then incubated with 100-200 nM HT-KDM4 DTD conjugated with HaloTag-Alexa 660 (Promega) at 4°C overnight. All images were captured by ImageQuant LAS 4000 (GE Healthcare Life Sciences) and processed by ImageJ 59 .
Nano-liquid chromatography and electrospray ionization MS/MS. Quantitative MS for histone PTMs was performed as previously described 34 . Histone extracts were derivatized using propionic anhydride for unmodified and monomethylated lysines, and phenylisocyanate for newly generated N termini on tryptic peptides as described 60 . Briefly, 5 mg of histone extract in 9 ml of water was buffered using 1 ml of 1 M triethylammonium bicarbonate. One microlitre of 1:100 propionic anhydride in H 2 O was then added to each histone extract and incubated at room temperature for 2 min before a 20 min quench with 1 ml of 80 mM hydroxylamine. Histone extracts were trypsinized at 1:50 for 5 h at 37°C, followed by addition of 1 ml 20 mM NaOH. Newly generated N termini were labelled by incubation with 3 ml of 2% phenylisocyanate/acetonitrile for 1 h at 37°C and desalted using in-house prepared C18 stage tips. Approximately 0.5 mg of derivatized histone extract was injected onto a Dionex Ultimate3000 nanoflow HPLC with a Waters NanoEase C18 column (100 mm Â 15 cm, 3 mm) coupled to a Thermo Fisher Q-Exactive MS at 700 nl min À 1 . Mobile phase consisted of water þ 0.1% formic acid (A) and acetonitrile þ 0.1% formic acid (B). Histone peptides were resolved with a two-step gradient of 2-25% mobile phase B over 60 min followed by a gradient of 25-40% mobile phase B over 15 min. The MS was operated in data-dependent acquisition (DDA) mode with dynamic exclusion enabled (exclusion duration ¼ 8 s), MS1 resolution ¼ 70,000, MS1 automatic gain control target ¼ 1 Â 10 6 , MS1 maximum fill time ¼ 100 ms, MS2 resolution ¼ 17,500, MS2 automatic gain control target ¼ 2 Â 10 5 , MS2 maximum fill time ¼ 500 ms and MS2 normalized collision energy ¼ 30. For each cycle, one full MS1 scan range ¼ 300-1,100 m/z, was followed by 10 MS2 scans using an isolation window size of 2.0 m/z. An inclusion list was employed to increase the detection efficiency of histone peptides of interest.
Rodent experiments. The rat tissues were obtained from adult wild-type Sprague-Dawley rats that were killed in accordance with existing Institutional Animal Care and Use Committee (IACUC)-approved protocols. The mouse tissues were obtained from wild-type C57BL/6 surplus male mice that were scheduled for termination under IACUC protocol and provided to our laboratory following killing. This procedure has been reviewed and approved by Johns Hopkins Animal Care and Use Committee and UW-Madison Animal Care and Use Committee (protocol number M02059) and was performed in accordance with the NIH Guide for the Care and Use of Laboratory Animals.
MS-based analysis of peptide demethylation. In a typical demethylase reaction, the assay mixture containing 50 mM Tris pH 7.5, 50 mM NaCl, 50 mM (NH 4 ) 2 Fe(SO 4 ) 2 , 0.5 mM ascorbic acid, 1 mM a-ketoglutarate and 0.2 mM full-length KDM4A-C (Active Motif) was mixed with 5-10 mM of trimethylated histone peptides (detailed sequence information in Supplementary Table 1). Each peptide concentration was quantified by measuring Abs 280nm for peptides containing either tryptophan or tyrosine residue. After incubation of assay mixture at 25°C for 10, 30 and 60 min time point, 10 ml of aliquot was transferred to a clean tube containing 5 ml of 5% formic acid to quench the reaction. Each quenched sample was treated with standard StageTip protocol using C18 extraction disk (Model 2215, Empore, 3 M) to remove excess salts and detergents in the reaction mixture 61 . Each eluate from StageTip containing peptides was dried at a reduced pressure (vacuum centrifuge) and was re-suspended with 10 ml of 5% acetonitrile solution (0.1% formic acid). The 1 ml of the sample was injected at 0.7 ml min À 1 flow rate onto a Dionex Ultimate 3000 Nano LC system equipped with Atlantis dC18 NanoEase reverse-phase column (3 mm, 100 mm Â 150 mm, Waters) and coupled to a hybrid quadrupole-orbitrap MS (Q-Exactive, Thermo Scientific). The peptide was eluted at 4-15 min retention time with a linear gradient of 2-35% acetonitrile (0.1% formic acid). All the solvents used for liquid chromatography MS were HPLC grade (Fisher). For each analysis, MS1 scan range of 300-1,100 m/z was used and data-dependent MS/MS was obtained to identify the site of demethylation. Each peptide peak and its area integration was performed on the XCalibur Qual Browser.
Data availability. X-ray structures have been deposited in the Protein Data bank with PDB ID: 4UC4, 5D6Y, 5D6W and 5D6X.