Caenorhabditis elegans MES-3 is a highly divergent ortholog of the canonical PRC2 component SUZ12

SUMMARY
Polycomb Repressive Complex 2 (PRC2) catalyzes the mono-, di-, and trimethylation of histone protein H3 on lysine 27 (H3K27), which is strongly associated with transcriptionally silent chromatin. The functional core of PRC2 is highly conserved in animals and consists of four subunits. One of these, SUZ12, has not been identified in the genetic model Caenorhabditis elegans, whereas C. elegans PRC2 contains the clade-specific MES-3 protein. Through unbiased, sensitive sequence similarity searches complemented by high-quality structure predictions of monomers and multimers, we demonstrate here that MES-3 is a highly divergent ortholog of SUZ12. MES-3 shares protein folds and conserved residues of key domains with SUZ12 and is predicted to interact with core PRC2 members similarly to SUZ12 in human PRC2. Thus, in agreement with previous genetic and biochemical studies, we provide evidence that C. elegans contains a diverged yet evolutionarily conserved core PRC2, like other animals.


INTRODUCTION
Posttranslational modifications of histone proteins contribute to the organization of genomic DNA and establishment of transcriptionally active versus silent chromatin (Bannister and Kouzarides, 2011). Polycomb group proteins form an important class of transcriptional repressors that function through modification of histone tails (Grossniklaus and Paro, 2014; Margueron and Reinberg, 2011). These proteins assemble into two distinct multi-subunit complexes, Polycomb Repressive Complex 1 and 2 (PRC1 and PRC2) (Bannister and Kouzarides, 2011; Bieluszewski et al., 2021; Grossniklaus and Paro, 2014; Margueron and Reinberg, 2011; Simon and Kingston, 2009). PRC2 catalyzes the mono-, di-, and trimethylation of histone protein H3 on lysine 27 (H3K27), which is strongly associated with transcriptionally silent chromatin and plays an important role in the maintenance of cell identity and developmental regulation of gene expression.
The functional core of PRC2 is highly conserved in animals and consists of four subunits: the H3K27 methyltransferase EZH2/1 and associated proteins EED, SUZ12, and RBBP4/7 (Bieluszewski et al., 2021; Glancy et al., 2021; Simon and Kingston, 2009) (Figures 1A and 1B). SUZ12 interacts with all members of the PRC2 core to form two distinct lobes (Chen et al., 2018; Glancy et al., 2021; Kasinath et al., 2018). The N-terminal region of SUZ12 together with RBBP4/7 forms the targeting lobe, which contributes to the recruitment and regulation of PRC2, and serves as a platform for cofactor binding (Chen et al., 2018; Kasinath et al., 2018). The region of SUZ12 included in this lobe contains five motifs and domains: zinc-finger binding (ZnB), WD-domain binding 1 (WDB1), C2 domain, zinc finger (Zn), and WD-domain binding 2 (WDB2) (Chen et al., 2018; Kasinath et al., 2018) (Figure 1B). The C-terminal region of SUZ12 contains a VEFS domain (Figure 1B), which associates with EZH2/1 and EED to form the catalytic lobe of PRC2 (Chen et al., 2018; Kasinath et al., 2018). Thus, SUZ12 is critical for the assembly, integrity, and function of PRC2, in agreement with the conservation of SUZ12 as a core PRC2 component in animals (Figure 1A).
Genetic and biochemical studies in the nematode C. elegans revealed a functional PRC2 complex without an apparent SUZ12 ortholog (Ahringer and Gasser, 2018; Bender et al., 2004; Capowski et al., 1991; Gaydos et al., 2014; Ketel et al., 2005; Korf et al., 1998; Xu et al., 2001). The components of this complex were originally defined by specific maternal-effect sterile (mes) mutations that cause defects in germline development and silencing of the X chromosome in the hermaphrodite germline (Capowski et al., 1991; Garvin et al., 1998). Molecular characterizations revealed that MES-2 and MES-6 are homologs of the Polycomb group proteins EZH2/1 and EED, respectively (Xu et al., 2001). MES-2 (EZH2/1) and MES-6 (EED) form a protein complex with MES-3, and all three components are required for histone H3K27 methyltransferase activity in vivo and in vitro (Ahringer and Gasser, 2018; Bender et al., 2004; Gaydos et al., 2014; Korf et al., 1998; Xu et al., 2001). Despite the functional similarity with the PRC2 core, MES-3 appeared to lack obvious motifs or sequence similarity to SUZ12 or RBBP4/7 and therefore has been considered a C. elegans specific subunit (Ahringer and Gasser, 2018; Bender et al., 2004; Ketel et al., 2005; Paulsen et al., 1995; Xu et al., 2001). Consequently, PRC2 in C. elegans and in other animals are considered functional analogs, despite a seemingly divergent subunit composition (Ahringer and Gasser, 2018; Bender et al., 2004; Ketel et al., 2005; Xu et al., 2001). In-depth sequence comparisons have recently turned up surprising homologies (Yoshida et al., 2019), which prompted us to investigate whether MES-3 could be a highly diverged homolog of SUZ12 instead of a C. elegans specific invention.

Figure 1. MES-3 is a highly divergent ortholog of the canonical Polycomb Repressive Complex 2 component SUZ12
(A) The Polycomb Repressive Complex 2 (PRC2) core components EZH2/1, EED, RBBP4/7, and SUZ12 are conserved in a broad range of metazoans; the presence of orthologs is indicated by filled boxes. Notably, based on sequence similarity searches, an ortholog of SUZ12 is absent in the nematode model species Caenorhabditis elegans, but present in other, closely related nematodes (Brugia malayi and Trichinella spiralis). C. elegans encodes the PRC2 core component MES-3 that lacks obvious motifs or sequence similarity to SUZ12 (Ahringer and Gasser, 2018; Bender et al., 2004; Ketel et al., 2005; Paulsen et al., 1995; Xu et al., 2001).
(B) Schematic representation of the composition of the core PRC2. The zinc finger binding (ZnB; red), WD-domain binding 1 (WDB1; blue), C2 domain (green), zinc finger (Zn; yellow), WD-domain binding 2 (WDB2; pink), and VEFS (orange) motifs or domains involved in SUZ12 protein interactions are shown in the schematic as well as along the protein sequence (Chammas et al., 2020; Chen et al., 2018; Kasinath et al., 2018). A schematic representation of the protein sequence of MES-3 is also shown, and regions of uncovered sequence (C) and structural (E, F) similarity are highlighted.
(C) Protein sequence alignment between the N-terminal regions of SUZ12 and MES-3, as identified by sensitive profile-vs-profile sequence similarity searches, covers part of the zinc finger binding (ZnB; red), WD-domain binding 1 (WDB1; blue), and C2 domain (green). The conserved RBBP4/7 binding epitope as well as Gly299 are highlighted (Birve et al., 2001; Rai et al., 2013; Schmitges et al., 2011). Identical amino acids are shown in blue and biochemically similar amino acids are shown in turquoise.
(D-F) The predicted aligned error (in Å; based on model 2 ptm) of the MES-3 structure is shown as a heatmap and reveals two separated globular regions at the N- and C-termini. The former overlaps with the profile-vs-profile match (C) and corresponds to the C2 domain of SUZ12 (E; Figure S1I; RMSD = 1.607), while the latter overlaps with the region that structurally resembles the VEFS domain (F; Figure S1J; RMSD = 3.676). The black arrows (E, F) highlight regions that differ considerably between SUZ12 and MES-3 (Figures S1I and S1J), and the structure predictions of SUZ12 and MES-3 (E, F) are shown in gray as well as green (C2) and orange (VEFS), respectively.
(G) Sequence-independent structure alignment of the VEFS regions of SUZ12 and MES-3 reveals significant structural similarity (Dali Z score = 8.3; TMscore = 0.55), especially along the α-helices in the C-terminus; a region previously shown to stimulate histone methyltransferase activity in SUZ12 (Birve et al., 2001) (pos. 580 to 612) is highlighted by a black bar, and individual amino acids important for PRC2 assembly (Birve et al., 2001)

MES-3 is a highly divergent ortholog of the canonical PRC2 component SUZ12
To identify MES-3 homologs in animals, we used unbiased sensitive profile-vs-profile searches to query the predicted human proteome with MES-3 and the worm proteome with SUZ12. Surprisingly, we recovered a consistent, though not statistically significant, bidirectional match between SUZ12 and MES-3 (16% identity; Figure 1C) that is located in approximately the same region of both proteins and covers 223 amino acids in MES-3. This region in SUZ12 spans part of the ZnB motif, the complete WDB1 motif, and most of the C2 domain (Figures 1B and 1C). Notably, the conserved RBBP4/7 binding site of SUZ12 (Schmitges et al., 2011) is also present in MES-3 (MES-3, pos. 108-113; FLxRx[VL]), as is a conserved glycine (MES-3, pos. 299) (Figure 1C); a missense mutation of this glycine in Drosophila leads to a partial loss-of-function phenotype (Birve et al., 2001; Rai et al., 2013). Therefore, we conclude that the N-terminal regions of SUZ12 and MES-3 share extended sequence similarity, including residues previously shown to be critical for function, suggesting that these two proteins are homologs. However, the profile-vs-profile searches did not detect similarity between the C-terminal sequence of MES-3 and the SUZ12 domain that mediates EZH2 and EED interaction (Chen et al., 2018; Kasinath et al., 2018) (Figure 1B).
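The two sequence-level quantities mentioned above, a motif such as FLxRx[VL] and the percent identity of an alignment, can be illustrated with a minimal Python sketch. This is not the pipeline used in this study: the function names and the toy sequences are hypothetical, and percent identity is computed here over ungapped alignment columns only, which is one of several conventions in use.

```python
import re

# The motif FLxRx[VL] from the text, written as a regular expression:
# "x" is any residue, "[VL]" is valine or leucine. The lookahead makes
# overlapping occurrences visible as separate matches.
MOTIF = re.compile(r"(?=(FL.R.[VL]))")


def find_motif(sequence):
    """Return 1-based (start, end) positions of every motif occurrence."""
    return [(m.start() + 1, m.start() + len(m.group(1)))
            for m in MOTIF.finditer(sequence)]


def percent_identity(aligned_a, aligned_b):
    """Percent identical residues over columns without a gap ('-').

    Assumption: the denominator is the number of ungapped columns.
    Other tools may divide by alignment length or by the shorter sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    toy_sequence = "MKTAFLSRKVGG"  # made-up peptide, not MES-3 or SUZ12
    print(find_motif(toy_sequence))
    print(percent_identity("AC-DE", "ACGDF"))
```

Positions are reported 1-based to match the residue numbering style used in the text (e.g., pos. 108-113).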
Protein structure is typically more conserved than primary sequence and better allows detection of diverged homologs (Sanchez-Pulido and Ponting, 2021). Because the protein structure of MES-3 has not yet been experimentally resolved, we used deep-learning driven protein structure prediction of both MES-3 and SUZ12 with AlphaFold2. The SUZ12 structure has six functional motifs and domains that were predicted with high precision, as they closely resemble the experimentally determined structure (RMSD = 0.56-1.14; global TMscore = 0.70; global Dali Z score = 14.8; Figures S1A-S1E). Like SUZ12, the predicted MES-3 structure is partially disordered (Figure 1D; Figures S1F-S1H), but nevertheless has a globular N-terminal region mainly formed by β-sheets and a C-terminal region mainly formed by α-helices (Figures 1D and 1E), and both regions were modeled with high confidence (Figure S1G). Interestingly, the C2 domain of SUZ12 shares significant structural similarity with the N-terminal structural region of MES-3 (Figures 1D and 1E; Figure S1I; RMSD = 1.607; TMscore = 0.60; Dali Z score = 11.6), corroborating the profile-vs-profile results (Figure 1C). The structural similarity (MES-3, pos. 150-365) extends beyond the region of shared sequence similarity identified earlier (MES-3, pos. 150-312) and thus encompasses the complete C2 domain (Figure 1D; Figure S1I). Nevertheless, we also observed some differences in the predicted structures, such as the occurrence of an unmatched α-helix in MES-3 (Figure 1E; Figure S1I) and the absence in MES-3 of amino acids known to be involved in the interaction between SUZ12 and RBBP4/7 (e.g., SUZ12, pos. R196; Chen et al., 2018).
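To make the reported similarity measures concrete, the following Python sketch computes an RMSD and a TM-score for two sets of paired Cα coordinates. It is an illustration under stated assumptions, not the procedure used in this study: the coordinates are assumed to be already superimposed and residue-paired by an external alignment tool, the function names are hypothetical, and the TM-score is evaluated for that single given superposition rather than maximized over superpositions as dedicated programs do.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired, pre-superimposed points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def tm_score(coords_a, coords_b, reference_length):
    """TM-score of one fixed superposition, normalized by reference_length.

    Uses the standard length-dependent scale
    d0 = 1.24 * (L - 15)^(1/3) - 1.8, which is only meaningful for
    chains long enough that d0 is positive.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have equal length")
    if reference_length <= 15:
        raise ValueError("reference_length too short for the d0 formula")
    d0 = 1.24 * (reference_length - 15) ** (1.0 / 3.0) - 1.8
    if d0 <= 0:
        raise ValueError("reference_length too short for the d0 formula")
    total = sum(1.0 / (1.0 + (math.dist(p, q) / d0) ** 2)
                for p, q in zip(coords_a, coords_b))
    return total / reference_length


if __name__ == "__main__":
    # Toy coordinates in Å, invented for illustration only.
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
    print(rmsd(a, b))
    print(tm_score(a, a, reference_length=100))
```

Because the TM-score is normalized by the length of the reference chain and down-weights large deviations, it is less sensitive than RMSD to a few poorly matching regions, which is why both values are typically reported together.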

DISCUSSION
Here, we provide evidence that MES-3, even though diverged, structurally resembles SUZ12 in two large regions that are involved in mediating EZH2/1, EED, and RBBP4/7 binding. It is therefore conceivable that, similarly to SUZ12 (Chen et al., 2018; Kasinath et al., 2018), MES-3 is critical in assembling and maintaining a functional PRC2. The uncovered sequence and structural similarities as well as the peculiar complementary phylogenetic profiles strongly suggest that MES-3 and SUZ12 are in fact orthologs, although MES-3 has undergone rapid sequence divergence and loss of crucial amino acid motifs as well as the Zn domain. In addition, C. elegans specific evolution of the PRC2 assembly and architecture is likely to play a role. The PRC2 catalytic lobe, which consists of the SUZ12 VEFS domain in association with EZH2 and EED (Chen et al., 2018; Kasinath et al., 2018), appears to be the most structurally conserved part of C. elegans PRC2. The most notable differences between SUZ12 and MES-3 reside in the N-terminal targeting lobe, which mediates interaction with RBBP4/7, nucleosomes, and accessory proteins (Chen et al., 2018; Kasinath et al., 2018). From flies to humans, distinct PRC2.1 and PRC2.2 sub-complexes can be distinguished that differ in associated accessory proteins and have specialized functions (Chammas et al., 2020; Hauri et al., 2016; Kasinath et al., 2021; Margueron and Reinberg, 2011). For example, the accessory proteins JARID2 and AEBP2 form part of PRC2.2 and mediate interaction with H2AK119ub1 (Kasinath et al., 2021), the product of the PRC1 E3 ubiquitin ligase complex (Margueron and Reinberg, 2011). Although homologs of JARID2 and other accessory proteins remain to be identified in C. elegans, the reported candidate PRC1 components are not required for germline development, in contrast to PRC2 (Karakuzu et al., 2009). This may explain the lack of conservation of the Zn domain, which in SUZ12 forms part of the JARID2 interaction surface (Chen et al., 2018). Additional characterizations of C. elegans PRC2 and its accessory proteins will be needed to further substantiate this hypothesis.
Figure legend (fragment): ... (Chammas et al., 2020). We note that the region around the potential MCSS/SANT2 domains in MES-2 is substantially diverged compared with EZH2, yet still displays considerable structural similarity.
The similarities and differences between SUZ12 and MES-3 described here should facilitate further experiments to elucidate the specific mechanisms by which MES-3 acts in PRC2 in C. elegans. Our work joins a rapidly growing set of in silico predictions of previously undetected homologies made possible by unprecedented advances in deep-learning driven structure prediction (Bayly-Jones and Whisstock, 2021; Sanchez-Pulido and Ponting, 2021).

Limitation of the study
We capitalized on recent advancements in computational prediction approaches that make it possible to derive high-quality structures of protein monomers or multimers, and thereby to study protein function and evolution at unprecedented scale (Bayly-Jones and Whisstock, 2021; Sanchez-Pulido and Ponting, 2021). We demonstrate that MES-3 is a diverged ortholog of SUZ12, and that MES-3 may associate with MES-2, MES-6, and LIN-53, similarly to the orthologous proteins in human PRC2. However, this study is based strictly on computational predictions, and further experiments will be needed to support these predictions and to elucidate how MES-3 functions in C. elegans PRC2. Such support may come, for instance, from resolving the structure of PRC2 in C. elegans as well as from genetic engineering experiments on MES-3 in which predicted conserved amino acids and interaction surfaces are altered, in combination with biochemical and phenotypic characterization.

STAR★METHODS
Detailed methods are provided in the online version of this paper and include the following: