The heterochronic LIN-14 protein is a BEN domain transcription factor

Heterochrony is a foundational concept in animal development and evolution, first introduced by Ernst Haeckel in 1875 and later popularized by Stephen J. Gould [1]. A molecular understanding of heterochrony was first established by genetic mutant analysis in the nematode C. elegans, revealing a genetic pathway that controls the proper timing of cellular patterning events executed during distinct postembryonic juvenile and adult stages [2]. This genetic pathway is composed of a complex temporal cascade of multiple regulatory factors, including the first-ever discovered miRNA, lin-4, and its target gene, lin-14, which encodes a nuclear, DNA-binding protein [2,3,4]. While all core members of the pathway have homologs based on primary sequences in other organisms, homologs for LIN-14 have never been identified by sequence homology. We report that the AlphaFold-predicted structure of the LIN-14 DNA binding domain is homologous to the BEN domain, found in a family of DNA binding proteins previously thought to have no nematode homologs [5]. We confirmed this prediction through targeted mutations of predicted DNA-contacting residues, which disrupt in vitro DNA binding and in vivo function. Our findings shed new light on potential mechanisms of LIN-14 function and suggest that BEN domain-containing proteins may have a conserved role in developmental timing.

This work is licensed under a Creative Commons Attribution 4.0 International License, which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
To gain insight into the structural features of LIN-14, we first searched more than 100 recently sequenced nematode genomes for homologs of C. elegans LIN-14. BLAST sequence analysis readily recovered LIN-14 sequence homologs within the nematode phylum, but not outside it. Sequence alignment of all 139 LIN-14 homologs defined a conserved region of 149 amino acid residues within a minimal LIN-14 protein fragment previously shown to bind DNA and to be sufficient to rescue lin-14 mutant phenotypes (Figure 1A) [4,6]. Querying the AlphaFold structural database [7], we found that this domain is predicted to be highly structured and composed primarily of several α helices (Figure 1B). Moreover, the predicted structure shows striking similarities to crystal structures of BEN domains from several different proteins, including Drosophila insv and BSG25A and mammalian BEND family proteins (e.g., BEND3) (Figure 1B) [5,8].
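The idea of delimiting a conserved region from a multiple sequence alignment can be illustrated with a minimal Python sketch. This is not the procedure used in the study: the toy alignment, the scoring rule (fraction of sequences sharing the most common residue in a column), and the threshold are all assumptions chosen for illustration, and the function names are hypothetical.

```python
from collections import Counter


def column_conservation(alignment):
    """Per-column fraction of sequences that share the most common non-gap residue."""
    n_seqs = len(alignment)
    scores = []
    for column in zip(*alignment):
        residues = [aa for aa in column if aa != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        # Gaps count against conservation because we divide by all sequences.
        scores.append(top_count / n_seqs)
    return scores


def longest_conserved_run(scores, threshold=0.8):
    """Return (start, end), half-open, of the longest run of columns scoring >= threshold."""
    best = (0, 0)
    start = None
    # A trailing sentinel score of 0.0 closes a run that reaches the last column.
    for i, score in enumerate(scores + [0.0]):
        if score >= threshold and start is None:
            start = i
        elif score < threshold and start is not None:
            if i - start > best[1] - best[0]:
                best = (start, i)
            start = None
    return best


# Toy alignment (hypothetical sequences, NOT real LIN-14 homologs).
toy_alignment = [
    "MK-TRRLEKAQ",
    "MQ-TRRLEKSE",
    "-KATRRLEKAD",
    "MRSTRRLEK--",
]

scores = column_conservation(toy_alignment)
start, end = longest_conserved_run(scores, threshold=0.75)
conserved = toy_alignment[0][start:end]
print(start, end, conserved)
```

In a real analysis the alignment would come from a dedicated aligner and the conservation measure would typically account for residue similarity, so this sketch only conveys the windowing logic.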
While BEN domains have not previously been described in C. elegans, the BEN domain is found in proteins across the animal kingdom, many of which are involved in transcriptional regulation and chromatin organization [5,8]. Because BEN domain proteins have been crystallized together with DNA target sequences [5], we were able to predict that several basic, positively charged residues in the fifth (and last) α helix of the LIN-14 BEN domain contact DNA directly (Figure 1C). To test this prediction experimentally, we pursued both in vitro and in vivo approaches. In both types of analysis, we relied on our recent identification of genomic targets of LIN-14, i.e., genes that are bound by LIN-14 in vivo, as assessed by ChIP-Seq analysis, and are transcriptionally dysregulated (i.e., derepressed) in lin-14 mutant animals [9]. One such target is the neuropeptide-encoding gene nlp-45.
We expressed the predicted LIN-14 BEN domain in bacteria, both in its wild-type form and in mutated forms in which the four positively charged arginine residues in the fifth helix, predicted to be involved in DNA binding, are replaced by alanine residues. The arginine-to-alanine mutations did not destabilize the protein, as the mutant LIN-14 proteins were soluble and behaved similarly to the wild-type LIN-14 protein in gel filtration chromatography (Figure S1A,B in Supplemental information, published with this article online). Using gel shift assays, we found that the wild-type, but not the mutant, LIN-14 proteins bind to DNA sequences derived from the nlp-45 locus (Figure 1D). The in vitro binding to the nlp-45 promoter was sequence-selective, as shown in binding assays with single nucleotide mutations in the YGGAR motif (Figure S1C).
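For readers less familiar with degenerate motif notation, the following Python sketch shows how a consensus such as YGGAR (Y = C or T, R = A or G in IUPAC nucleotide codes) can be located on both strands of a DNA sequence and how single-nucleotide variants of a site can be enumerated. The probe sequence and all function names are hypothetical, and string matching against the consensus says nothing about measured binding affinity; it only illustrates the notation.

```python
import re

# Subset of IUPAC nucleotide codes expressed as regex character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def motif_regex(motif):
    """Translate an IUPAC motif into a regex pattern string."""
    return "".join(IUPAC[base] for base in motif)


def matches(site, motif="YGGAR"):
    """True if the whole site is consistent with the motif."""
    return re.fullmatch(motif_regex(motif), site.upper()) is not None


def find_motif(seq, motif="YGGAR"):
    """Sorted (start, strand, matched_site) hits; start is 0-based on the forward strand."""
    seq = seq.upper()
    # The lookahead lets overlapping occurrences be reported.
    pattern = re.compile("(?=(" + motif_regex(motif) + "))")
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        forward_start = len(seq) - m.start() - len(motif)
        hits.append((forward_start, "-", m.group(1)))
    return sorted(hits)


def single_nucleotide_mutants(site):
    """All variants of a site that differ from it at exactly one position."""
    mutants = []
    for i, base in enumerate(site):
        for alt in "ACGT":
            if alt != base:
                mutants.append(site[:i] + alt + site[i + 1:])
    return mutants


# Hypothetical probe, NOT the nlp-45 sequence used in the study.
probe = "AATGGAGCC"
hits = find_motif(probe)
mutants = single_nucleotide_mutants("TGGAG")
still_consensus = [m for m in mutants if matches(m)]
print(hits)
print(len(mutants), still_consensus)
```

Only substitutions at the two degenerate positions (Y and R) can leave a site consistent with the consensus; any change to the central GGA breaks the match in this string-level sense.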
Next, we used CRISPR/Cas9 genome engineering to introduce the four arginine mutations into the endogenous lin-14 locus. If these residues are indeed involved in DNA binding, such mutant animals should display the same defects as previously described for lin-14 null mutant animals. We indeed found that lin-14(syb5772) Arg mutant animals are indistinguishable from lin-14(ma135) null mutant animals: they are sterile, have a dumpy appearance and a protruding vulva, display precocious alae, and show derepression of nlp-45 expression at an improper time, namely during the first larval stage (Figures 1E and S1D-F).
Structural homology searches using DALI against all predicted C. elegans protein structures from AlphaFold revealed several other C. elegans proteins with predicted similarities to the BEN domain (Figure S2). The only one previously characterized is SEL-7, a nuclear, DNA-binding protein with no previously known homolog that is, intriguingly, also involved in temporal patterning in C. elegans [10]. Another previously uncharacterized protein with two putative BEN domains, F12F6.1 (Figure S2), also shows temporally controlled expression during postembryonic development [9].
The structural deorphanization of LIN-14 as a BEN domain-containing protein provides new perspectives on both LIN-14 protein function and BEN domain proteins in general. Since many BEN domain proteins are involved in controlling chromatin architecture [5], it is conceivable that, in addition to transcriptional regulation, LIN-14 may also play a role in chromatin organization. Since several non-nematode BEN domain-containing proteins have, like nematode LIN-14 and SEL-7, roles in temporal patterning [5,8], such a function may have been the ancestral role of BEN domain proteins.
[Figure legend, truncated in source] ... Figure S1F for cell identification details. The red bars in the bottom right represent 10 μm.