Evolutionary emergence of Hairless as a novel component of the Notch signaling pathway

Suppressor of Hairless [Su(H)], the transcription factor at the end of the Notch pathway in Drosophila, utilizes the Hairless protein to recruit two co-repressors, Groucho (Gro) and C-terminal Binding Protein (CtBP), indirectly. Hairless is present only in the Pancrustacea, raising the question of how Su(H) in other protostomes gains repressive function. We show that Su(H) from a wide array of arthropods, molluscs, and annelids includes motifs that directly bind Gro and CtBP; thus, direct co-repressor recruitment is ancestral in the protostomes. How did Hairless come to replace this ancestral paradigm? Our discovery of a protein (S-CAP) in Myriapods and Chelicerates that contains a motif similar to the Su(H)-binding domain in Hairless has revealed a likely evolutionary connection between Hairless and Metastasis-associated (MTA) protein, a component of the NuRD complex. Sequence comparison and widely conserved microsynteny suggest that S-CAP and Hairless arose from a tandem duplication of an ancestral MTA gene.


Introduction
A very common paradigm in the regulation of animal development is that DNA-binding transcriptional repressors bear defined amino acid sequence motifs that permit them to recruit, by direct interaction, one or more common co-repressor proteins that are responsible for conferring repressive activity. Two such universal co-repressors are Groucho (Gro) and C-terminal Binding Protein (CtBP).
The ancient and highly conserved transcription factor Suppressor of Hairless [Su(H)] functions at the terminus of the widely utilized Notch cell-cell signaling pathway. Su(H) is converted into an activator by signaling through the Notch receptor, but in the absence of signaling it functions as a repressor. Earlier studies have revealed that in many settings in Drosophila, Su(H)'s repressive activity depends on binding to the Hairless protein (Figure 1). Hairless includes separate Gro- and CtBP-binding motifs, which permit it to function as an adaptor to bring these two co-repressors to Su(H) (Figure 1B) (Barolo et al., 2002a). Thus, the Su(H)/H partnership in the fly represents a notable exception to the rule of direct co-repressor recruitment.
As genome and transcriptome sequences have become available for more and more insects and other arthropods, we have searched for possible Hairless orthologs in a wide variety of species, in an attempt to determine the protein's phylogenetic distribution. We have found that Hairless is confined to the Pancrustacea (or Tetraconata), a clade of arthropods that includes the Crustacea and Hexapoda (Misof et al., 2014; Kjer et al., 2016). While this indicates that Hairless was gained at least 500 Mya, it also raises the question of how Su(H) in other protostomes acquires repressive activity.
Here we present evidence that direct co-repressor recruitment by Su(H) is likely to be ancestral in the protostomes. We show that Su(H) in a broad range of protostomes, including arthropods, molluscs, and annelids, bears both a short linear motif that mediates binding of CtBP and a novel motif for direct recruitment of Gro. Thus, the evolutionary appearance of Hairless has permitted the replacement of an ancient and predominant regulatory mechanism (direct co-repressor recruitment) with a novel one (indirect recruitment).
What can we learn about the evolutionary history of Hairless? While Hairless itself is found only in the Pancrustacea, we show that the genomes of Myriapods and Chelicerates encode a protein with clear sequence and functional similarities to Hairless. These proteins include a motif that strongly resembles the Su(H)-binding domain of Hairless, and we demonstrate that this motif from the house spider Parasteatoda tepidariorum does indeed bind Su(H). In addition, these Myriapod and Chelicerate proteins also include one or more canonical motifs for recruitment of CtBP. Accordingly, we designate these factors as 'Su(H)-Co-repressor Adaptor Proteins' (S-CAPs).

Figure 1. [...] (Hase et al., 2017), with scale and protein sizes indicated. (B) Summary of Hairless's known mode of action (Lai, 2002; Maier, 2006) as an adaptor protein that recruits the global co-repressors C-terminal Binding Protein (CtBP) and Groucho (Gro) to Suppressor of Hairless [Su(H)], the transducing transcription factor for the Notch (N) cell-cell signaling pathway; adapted from Figure 6 of Barolo et al. (2002a). In the absence of signaling through the Notch receptor (left), Su(H) acts as a repressor of Notch target genes, despite the presence of transcriptional activator proteins (orange oval). Upon activation of the Notch receptor (middle), Su(H), in a complex with the receptor's intracellular domain (NICD) and the co-activator Mastermind (Mam), functions to activate transcription of pathway target genes in cooperation with other transcriptional activators. In the absence of Hairless and hence in the absence of Su(H)'s repressive activity (right), the partner transcription factors are often sufficient to activate expression of target genes in a signal-independent manner (Barolo and Posakony, 2002b). DOI: https://doi.org/10.7554/eLife.48115.002
The following figure supplement is available for figure 1: [...] (Buchan et al., 2013; Jones and Cozzetto, 2015). DOI: https://doi.org/10.7554/eLife.48115.003
Finally, further sequence analyses, along with the discovery of conserved microsynteny, have provided substantial evidence that Hairless and the S-CAPs are likely to be homologous and that they arose from a duplication of the gene encoding Metastasis-associated (MTA) protein, a component of the nucleosome remodeling and deacetylase (NuRD) complex.
An intriguing question in evolutionary biology concerns the path by which a particular clade has escaped a strongly selected character that has been conserved for hundreds of millions of years. We believe that our study has yielded valuable insight into both the emergence of an evolutionary novelty and its replacement of an ancestral paradigm.

Hairless is present only in the Pancrustacea
We have conducted extensive BLAST searches of genome and transcriptome sequence data for a wide variety of metazoa in an attempt to define the phylogenetic distribution of Hairless. We find that Hairless as originally described (Bang and Posakony, 1992; Maier et al., 1992; Maier et al., 2008) is confined to the Pancrustacea (or Tetraconata), and occurs widely within this clade, including the Hexapoda, Vericrustacea, and Oligostraca (Figure 2A). By contrast, no evidence for a true Hairless gene has been detected in either Myriapods or Chelicerates, even in cases where substantially complete genome sequence assemblies are available.
The enormous variation in the size of the Hairless protein in various Pancrustacean clades is worthy of note (Figure 1A). The known extremes are represented by the Diplostracan (shrimp) Eulimnadia texana (343 aa) (Baldwin-Brown et al., 2018) and the Dipteran (fly) Protophormia terraenovae (1614 aa) (Hase et al., 2017), a 4.7-fold difference. There is a broad tendency for the size of the protein to be relatively stable within an order (Supplementary file 1). Thus, as noted previously (Maier et al., 2008), the Hymenoptera generally have a small Hairless (of the order of 400 aa; see Figure 1A), while the Diptera typically have a much larger version (of the order of 1000 aa or more). Notable exceptions to this pattern of uniformity are aphids, where Hairless is typically ~900 aa compared to ~400 aa in other Hemiptera, and chalcid wasps, where the protein is over 500 aa instead of the Hymenoptera-typical ~400 aa noted above (Supplementary file 1). Smaller Hairless proteins typically retain all five conserved motifs/domains characteristic of this factor (Maier et al., 2008), while the regions that flank and lie between these sequences are reduced in size (Figure 1A; Supplementary file 2).

A known CtBP-binding motif is present in the non-conserved N-terminal region of Su(H) in a wide variety of protostomes
The apparent confinement of the Hairless co-repressor adaptor protein to the Pancrustacea raises the question of the mechanism(s) by which Su(H) in other protostomes might recruit co-repressor proteins to mediate its repressor function. Of course, other protostomes need not utilize the Gro and CtBP co-repressors for this purpose; different co-repressors might substitute. Nevertheless, we first sought to identify known binding motifs for Gro and CtBP in Su(H) from arthropods lacking Hairless. As shown in Table 1, we found a canonical CtBP recruitment motif of the form PφDφS (where φ = I, L, M, or V) in predicted Su(H) proteins from a variety of Myriapods and Chelicerates, including the centipede Strigamia maritima, the tick Ixodes scapularis, the spider Parasteatoda tepidariorum, the horseshoe crab Limulus polyphemus, and the scorpion Centruroides sculpturatus. These motifs are all located in the non-conserved N-terminal region of Su(H) (Supplementary file 3).
Extending this sequence analysis to other protostome phyla led to the finding that a similar PφDφS motif occurs in the N-terminal region of Su(H) from a large number of molluscs and annelids, as well as from multiple Nemertea, Brachiopoda, Phoronida, and monogonont rotifers, and also from some flatworms (Table 1). It is notable, by contrast, that we do not find CtBP-binding motifs present in Su(H) from nematodes. Nevertheless, given the broad phylogenetic distribution of the [...]
To verify that the shared PφDφS motif in protostome Su(H) proteins can indeed mediate direct recruitment of CtBP, we carried out an in vitro pulldown assay using GST-tagged Drosophila CtBP (bound to Glutathione Sepharose beads) and a His-tagged fragment of Strigamia maritima Su(H) (Figure 3A). We found that the two proteins do interact directly and robustly, in a manner that is dependent on the integrity of the PVDLS motif in Strigamia Su(H).
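As a rough illustration of how the CtBP recruitment motif described above can be located in a predicted protein sequence, the following Python sketch scans for the pattern proline, hydrophobic residue, aspartate, hydrophobic residue, serine, with the hydrophobic positions restricted to I, L, M, or V as stated in the text. The function name and the toy sequence are hypothetical; only the PVDLS pentapeptide is taken from the Strigamia Su(H) result above. This is a sketch, not a description of how the motifs in Table 1 were identified.

```python
import re

# Hydrophobic positions limited to I, L, M, or V, as defined in the text.
CTBP_MOTIF = re.compile(r"P[ILMV]D[ILMV]S")


def find_ctbp_motifs(protein_seq):
    """Return (1-based start position, matched pentapeptide) for each match."""
    seq = protein_seq.upper()
    return [(m.start() + 1, m.group()) for m in CTBP_MOTIF.finditer(seq)]


# Hypothetical toy sequence; only the PVDLS motif itself comes from the text.
toy_sequence = "MAAGPVDLSNSHR"
print(find_ctbp_motifs(toy_sequence))
```

A regular expression suffices here because the motif is short and has fixed spacing; a match is only a candidate site, and binding still has to be confirmed experimentally, as in the pulldown assay above.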

A novel conserved motif in protostome Su(H) binds the Gro corepressor
In addition to a PφDφS CtBP-binding motif, we have found that Su(H) from a wide variety of protostomes includes a novel motif similar to GSLTPPDKV (Table 1). Where present, this sequence typically lies a short (but variable) distance C-terminal to the PφDφS motif, also within the non-conserved N-terminal region of the protein (Supplementary file 3). The GSLTPPDKV motif is particularly prevalent in Su(H) from the Trochozoa, which includes annelids, sipunculans, molluscs, nemerteans, brachiopods, and phoronids (Kocot et al., 2017). Among the Ecdysozoa, it appears consistently in Su(H) from Crustacea and Myriapoda, and in small subsets of both Hexapoda (Ephemeroptera, Odonata, Zygentoma, Archaeognatha, Diplura, and Collembola) and Chelicerata [harvestmen (Opiliones) and Scorpiones]. The motif is absent from Su(H) in all other insect orders, and we have not found it so far in Su(H) from nematodes, flatworms, rotifers, or tardigrades; it is, however, found in the onychophoran Euperipatoides kanangrensis (Table 1). Perhaps surprisingly, the motif is present in Su(H) from the acorn worms Saccoglossus kowalevskii and Ptychodera flava, which are hemichordates (deuterostomes).
Using an in vitro pulldown assay, we tested the possibility that the GSLTPPDKV motif mediates binding of the Gro co-repressor (Figure 3B). Indeed, we find that GST-tagged Gro protein interacts strongly with a His-tagged protein bearing this motif at its C-terminus, and that this binding is abolished when the motif is replaced by alanine residues. We conclude that Su(H) from a broad range of protostomes is capable of directly recruiting both CtBP and Gro (Table 1), and that this capacity is hence very likely to be ancestral in this clade.
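Because the text describes the Gro-binding sequence only as "similar to GSLTPPDKV", one simple way to look for candidate sites is a sliding-window comparison that tolerates a few mismatches. The Python sketch below illustrates that idea. The function name, the toy sequence, and the mismatch threshold of 2 are all hypothetical choices made for illustration; the study does not state a similarity cutoff.

```python
def hamming_hits(protein_seq, motif="GSLTPPDKV", max_mismatch=2):
    """Return (1-based start, window, mismatch count) for every window of
    len(motif) residues that differs from motif at no more than
    max_mismatch positions. The threshold is an illustrative assumption."""
    seq = protein_seq.upper()
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatch:
            hits.append((i + 1, window, mismatches))
    return hits


# Hypothetical toy sequence with a one-residue variant (K -> R) of the motif.
toy_sequence = "MKPVDLSAAGSLTPPDRVQ"
print(hamming_hits(toy_sequence))
```

A position-specific scoring matrix built from the aligned sequences in Table 1 would be a more sensitive alternative; the mismatch count is used here only because it needs no training data.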

Retention of the hybrid state: Species that have both Hairless and the co-repressor-binding motifs in Su(H)
The evolutionary emergence of Hairless as an adaptor protein capable of mediating the indirect recruitment of both Gro and CtBP to Su(H) might be expected to relieve a selective pressure to retain the ancestral Gro- and CtBP-binding motifs in Su(H) itself. And indeed, we find that Su(H) from multiple insect orders comprising the Neoptera lacks both of these sequences (Figure 2B). Strikingly, however, we have observed that Crustacea and a small group of Hexapoda retain both traits (Figure 2B). Thus, multiple representatives of the Branchiopoda, Malacostraca, and Copepoda, along with Ephemeroptera, Odonata, Zygentoma, Archaeognatha, Diplura, and Collembola, have both a canonical Hairless protein (including its Gro- and CtBP-binding motifs) and Gro- and CtBP-binding motifs within Su(H). These clades, then, appear to have retained a 'hybrid intermediate' state (Baker et al., 2012) characterized by the presence of both co-repressor recruitment mechanisms.

Myriapods and Chelicerates encode a protein with similarity to Hairless
While canonical Hairless proteins are confined to the Pancrustacea, we have discovered that the genomes of Myriapods and Chelicerates nevertheless encode a protein with intriguing similarities to Hairless. Most notable is the presence of a motif that strongly resembles the 'Su(H)-binding domain' (SBD) of Hairless, which mediates its high-affinity direct interaction with Su(H) (Figure 1; Figure 4A). We will refer to these proteins as 'S-CAPs'; the basis for this designation will be made clear in forthcoming figures. We note that the occurrence of this protein in the centipede Strigamia maritima has also recently been reported by Maier (2019).
In the Pancrustacea, the N-terminal and C-terminal halves of the Hairless SBD are encoded by separate exons (Figure 4B). Strikingly, the related motif in Myriapod and Chelicerate S-CAPs is likewise encoded by separate exons, with exactly the same splice junction as in Hairless (Figure 4B). We believe that this is highly unlikely to be coincidental, and is instead strongly suggestive of an evolutionary relationship between Hairless and S-CAPs.
A recent structural analysis of the Su(H)-Hairless protein complex identified several residues in the Hairless SBD that are involved in binding to the C-terminal domain (CTD) of Su(H) (Yuan et al., 2016) (Figure 4A). These include four hydrophobic amino acids in the main body of the SBD (L235, F237, L245, and L247; these are highlighted in red in Figure 4A). Note that the Myriapod and Chelicerate S-CAP motifs share these same residues. In addition, a tryptophan (W258) C-terminal to the main body of the Hairless SBD also participates in binding to Su(H) (Figure 4A). Myriapod and Chelicerate S-CAPs all include a tryptophan residue at a similar position C-terminal to the main SBD-like domain (Figure 4A). Moreover, this particular W residue in both Hairless and the S-CAPs is followed by a hydrophobic residue, typically V or I. These sequence features, we suggest, are further strong evidence of a common ancestry for the respective segments of Hairless and S-CAPs.

Figure 2. [...] Hairless as an adaptor protein, Su(H) in most insect orders (the Neoptera clade) has lost the ancestral short linear motifs that mediate direct recruitment of the CtBP and Gro co-repressor proteins (red bar). However, in the Crustacea, Collembola, Diplura, and a subset of Insecta, the ancestral recruitment motifs have been retained in Su(H), despite the presence of Hairless (see Table 1 and Supplementary file 3). Tree adapted from Misof et al. (2014) and Kjer et al. (2016). DOI: https://doi.org/10.7554/eLife.48115.004
A third structural similarity between Hairless and S-CAPs is the presence in the latter of one or more short linear motifs capable of binding the CtBP co-repressor (Figure 5A). These motifs typically reside in the C-terminal half of the S-CAPs, superficially resembling the C-terminal location of Hairless's CtBP recruitment motif.
A table listing representative examples of Myriapod and Chelicerate S-CAPs is provided as Supplementary file 4, and an annotated FASTA file of their amino acid sequences is included as Supplementary file 5. It is important to note that we have not found non-Hairless S-CAPs in the Pancrustacea.

Spider S-CAP binds to Drosophila Su(H)
Given the clear sequence similarity between the Hairless SBD and the SBD-like motif in Myriapod and Chelicerate S-CAPs, we investigated whether the latter motif is likewise capable of mediating direct binding to Su(H). As noted above, the Hairless SBD interacts specifically with the CTD of Su(H). Since this domain in Su(H) is very highly conserved throughout the Bilateria and Cnidaria, we thought it reasonable to utilize Drosophila Su(H) for this binding assay. As shown in Figure 4C, we find that a 200-amino-acid segment of S-CAP from the spider Parasteatoda tepidariorum binds directly to Drosophila Su(H) in vitro. This interaction depends strictly on the integrity of the five residues that in Hairless have been shown to contact the Su(H) CTD (highlighted in red in Figure 4A).
Given the presence of one or more CtBP recruitment motifs in the Myriapod and Chelicerate S-CAP proteins (Figure 5A), along with the ability of their SBD-like domains to bind Su(H) (Figure 4C), we have designated these as 'Su(H)-Co-repressor Adaptor Proteins' (S-CAPs).

Chelicerate S-CAP proteins are related to Metastasis-associated (MTA) proteins
In addition to their similarities to Hairless, the S-CAP proteins of Chelicerates include two regions with strong sequence homology to the Metastasis-associated (MTA) protein family, which is highly conserved among Metazoa. The MTA proteins play an important role in transcriptional regulation via their function as core components of the nucleosome remodeling and deacetylase (NuRD) complex (Allen et al., 2013; Burgold et al., 2019). The N-terminal half of MTAs includes four well-defined functional domains: BAH (Bromo-Adjacent Homology), ELM2 (Egl-27 and MTA1 homology), SANT (Swi3, Ada2, N-CoR, and TFIIIB), and GATA-like zinc finger (Millard et al., 2014) (Figure 5B). Of these, the ELM2 and SANT domains are retained at the N-terminal end of Chelicerate S-CAPs (Figure 5B; Figure 5-figure supplement 1A). This is highly likely to have functional significance, as the ELM2 and SANT domains of MTA proteins work together to recruit and activate the histone deacetylases HDAC1 and HDAC2 (Millard et al., 2013). Further suggesting homology between Chelicerate S-CAPs and MTAs is the observation that their shared ELM2 and SANT domains are each encoded by two exons with exactly the same splice junction (Figure 5C).
It is noteworthy that, despite sharing the SBD-like and CtBP recruitment motifs of Chelicerate S-CAPs, the available Myriapod S-CAP protein sequences lack the N-terminal ELM2 and SANT homologies with MTA proteins (Figure 5B). Consistent with this, the SBD motif in Myriapod S-CAPs lies much closer to the protein's N terminus than the SBD motif in Chelicerate S-CAPs, suggesting that simple loss of the ELM2/SANT-encoding exons might underlie this difference between the two S-CAP clades. Likewise, Hairless proteins are devoid of clear similarities to MTAs.
In addition to their SBD and ELM2/SANT domains, Chelicerate S-CAPs share a third region of homology that lies between the ELM2 and SANT sequences (Figure 5-figure supplement 1A). This region is absent from both Hairless and the Myriapod S-CAPs. Conversely, Myriapod S-CAPs include a segment of sequence similarity that is not found in either Hairless or Chelicerate S-CAPs (Figure 5-figure supplement 1B).

Conserved microsynteny between MTA and S-CAP/Hairless genes
[...] Interestingly, in some instances Hairless/MTA microsynteny is preserved, but the genes' relative orientation is different (Figure 6; Supplementary file 1). Thus, in the aphids, in contrast to other Hemiptera, MTA lies downstream of Hairless, but in the opposite orientation (Figure 6). In the beetle Harmonia axyridis (Coleoptera), MTA lies upstream of Hairless (Figure 6).
Despite the multiple instances in which it has been lost, we believe that the most parsimonious interpretation of our analysis is that close linkage between MTA and S-CAP/Hairless genes is ancestral in the respective taxa (Myriapods/Chelicerates and Pancrustacea). We leave for the Discussion our proposed interpretation of the evolutionary significance of this adjacency.

Our analysis of sequences from a broad range of protostomes strongly suggests that direct recruitment of the CtBP and Gro co-repressors by Su(H) is ancestral in this clade. This is consonant with the fact that direct co-repressor recruitment by DNA-binding repressor proteins in general is a dominant paradigm among Metazoa. This evokes the intriguing question of what might have led to the loss of direct recruitment by Su(H) in the Neoptera (see Figure 2B) and its replacement by Hairless-mediated indirect recruitment. Does Hairless provide some advantageous functional capacity? Note that this is not intended to suggest that Hairless must be an evolutionary adaptation per se (Lynch, 2007); rather, we are asking: What capability might it have conferred that would lead to its retention and the subsequent loss of the recruitment motifs in Su(H)?
One appealing (but of course speculative) possibility is that Hairless may have permitted Su(H) for the first time to recruit both CtBP and Gro simultaneously to the same target genes. As we have noted, the apparently ancestral PφDφS and GSLTPPDKV motifs in protostome Su(H) typically lie quite close to each other in the protein's linear sequence (Supplementary file 3). CtBP (~400 aa) and Gro (~700 aa) are both large proteins that engage in oligomerization as part of their functional mechanism (Song et al., 2004; Bhambhani et al., 2011). It is very unlikely that both could bind stably to DNA-bound Su(H) at the same time. In contrast, the Gro and CtBP recruitment motifs in Hairless are far apart in the linear sequence (Figure 1A) and are separated by a region predicted to be largely disordered (Figure 1-figure supplement 1). We suggest that this might be compatible with simultaneous recruitment of the two co-repressors.
Whatever may have been the selective forces that led to the loss of direct co-repressor recruitment by Su(H) in the Neoptera and its replacement by Hairless-mediated indirect recruitment, Hairless is a notable evolutionary novelty for having permitted the unusual abandonment of an ancestral and highly conserved paradigm. We suggest that this represents a striking example of 'developmental system drift' (True and Haag, 2001), in which a common output (widespread 'default repression' of Notch pathway target genes) is achieved via distinct molecular mechanisms in different species.

A possible evolutionary pathway for the appearance of Hairless
We have described here several findings that we believe have important implications for an attempt to reconstruct the history of Hairless as an evolutionary novelty. First, we observe that Hairless is apparently confined to the Pancrustacea, wherein it is widely distributed among diverse taxa (Figure 2A; Supplementary file 1). Second, we have discovered in the sister groups Myriapoda and Chelicerata a protein (S-CAP) with clear sequence homology to the Su(H)-binding domain (SBD) of Hairless (Figure 4A). Significantly, in both Hairless and the S-CAPs these motifs are encoded by contributions from two exons, with the associated splice junction in precisely the same location (Figure 4B; Supplementary file 4). Third, we find that S-CAPs in the Chelicerata include in their N-terminal region strong homology to the ELM2 and SANT domains of MTAs, which themselves are highly conserved among Metazoa, and therefore would have been present in the arthropod common ancestor (Figure 5B,C). Finally, our analysis indicates that close, usually adjacent, linkage of Hairless and MTA genes (in the Pancrustacea) and between S-CAP and MTA genes (in the Myriapoda and Chelicerata) is widespread (Figure 6; Supplementary file 1; Supplementary file 4), and hence very likely to be ancestral, in these taxa.
While any attempt to infer the sequence of evolutionary events that led to the appearance of Hairless is necessarily speculative, we believe that the above findings offer substantial support for the following hypothetical pathway. We propose that in a deep arthropod ancestor a tandem duplication of the MTA gene occurred. One copy retained the strong sequence conservation (and presumably ancestral function) of metazoan MTA genes, while the second copy diverged very substantially, eventually encoding a protein that had lost all but the ELM2 and SANT domains of the MTA ancestor. The extensive reconfiguration of this paralog also included the eventual acquisition of the SBD motif and the addition of one or more CtBP recruitment motifs (see Figure 7 for some possible sources of these components). In the Myriapod lineage, even the ELM2 and SANT domains were eventually lost. In the Pancrustacea, we suggest that this same divergent MTA paralog evolved to become Hairless. Beyond the alterations described for the Myriapoda, this would have involved the acquisition of sequences encoding additional now-conserved domains and motifs, including the Gro recruitment motif (Supplementary file 2). This radical evolutionary transformation resulted in a protein with little or no remaining homology to its MTA ancestor, and with an entirely novel regulatory function (Holland et al., 2017).
In this context, it is of interest that the Drosophila Mi-2/NuRD complex, which includes the MTA protein, has recently been shown to engage in direct repression of multiple Notch pathway target genes, independent of both Su(H) and Hairless (Zacharioudaki et al., 2019). Whether this activity preceded the emergence of Hairless is unknown, but the possibility that it is in some way connected to Hairless's evolutionary history is indeed intriguing.

Sequence searches, analysis, and annotation
Genome and transcriptome sequences encoding Hairless, Suppressor of Hairless, S-CAP, and MTA proteins from a wide variety of species were recovered via BLAST searches, using either the online version at the NCBI website (Boratyn et al., 2013) or the version implemented by the BlastStation-Local64 desktop application (TM Software, Inc). Sequences were analyzed and annotated using the GenePalette (Rebeiz and Posakony, 2004;Smith et al., 2017) and DNA Strider (Marck, 1988;Douglas, 1995) desktop software tools. Analysis of predicted disordered regions in Hairless was conducted using DISOPRED3 on the PSIPRED server (Buchan et al., 2013;Jones and Cozzetto, 2015).

Generation of constructs for GST pulldown experiments
Strigamia maritima Su(H) protein constructs to test CtBP binding
A codon-optimized fragment corresponding to exons 2 and 3 from S. maritima Su(H) mRNA was synthesized by Genewiz, Inc, and cloned into pRSET-C using Acc65I and BamHI restriction sites. The CtBP-motif mutant was subsequently generated by overlap extension PCR using the primers HISsmarSUH-f (CGCTGGATCCGCGGCCAGTATGAC), HISsmarSUH-r (CCATGGTACCAGTTATGCGTGGTG), HISsmarSUHctbpm-f (AACCACgCCGcaGcTGcGgCTAACAGCCATCGCGGTGAAGGCGGCCAC), and HISsmarSUHctbpm-r (GCTGTTAGcCgCAgCtgCGGcGTGGTTGTCGGCGAAGTGAGGGGTCAG). After sequence confirmation, this fragment was also cloned into pRSET-C using the same enzymes. Binding of these constructs to Drosophila melanogaster CtBP was assayed using GST alone and a GST-CtBP fusion protein (Nibu et al., 1998).
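Overlap extension PCR requires that the two internal mutagenic primers share a complementary region, so that the two first-round products can anneal to each other. As a sanity check on the primer sequences listed above, the following Python sketch reverse-complements the reverse mutagenic primer and reports the longest stretch it shares with the start of the forward mutagenic primer. The function and variable names are hypothetical, and the primer strings are copied from the text with the stray space and hyphen in the extracted sequences removed, on the assumption that they are typesetting artifacts.

```python
def reverse_complement(dna):
    """Reverse complement of a DNA string (case-insensitive input)."""
    complement = str.maketrans("ACGT", "TGCA")
    return dna.upper().translate(complement)[::-1]


def primer_overlap(forward, reverse):
    """Longest prefix of the forward primer that is also a suffix of the
    reverse complement of the reverse primer."""
    fwd = forward.upper()
    rc = reverse_complement(reverse)
    for k in range(min(len(fwd), len(rc)), 0, -1):
        if fwd.startswith(rc[-k:]):
            return fwd[:k]
    return ""


# Mutagenic primers from the text; lowercase marks the altered bases.
CTBPM_F = "AACCACgCCGcaGcTGcGgCTAACAGCCATCGCGGTGAAGGCGGCCAC"
CTBPM_R = "GCTGTTAGcCgCAgCtgCGGcGTGGTTGTCGGCGAAGTGAGGGGTCAG"

overlap = primer_overlap(CTBPM_F, CTBPM_R)
print(len(overlap), overlap)
```

With the sequences as given, the shared region is 27 nt long and spans all of the lowercase (mutated) positions, which is what one would expect for a primer pair designed to introduce the same substitutions on both strands.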

Constructs to test potential Gro-binding motif in Strigamia maritima Su(H)
A truncated version of HLHmβ (HLHmβ-WRPWtrunc) was amplified from a pRSET-HLHmβ-WT construct using the primers HISmbeta-f (cgatggatccgaATGGTTCTGGAAATGGAGATGTCCAAG) and HISmbetatrunc-r (ccatggtaccagTCACATGGGGCCagaggtggagctggcctcgctgggcgc); a version of HLHmβ with the WRPW motif replaced with the amino acids GSLTPPDKV (HLHmβ+Smar-motifWT) was amplified from the WT construct with HISmbeta-f and mbetaSmarSuH-r (ccatggtaccagTCACACTTTATCAGGTGGAGTGAGAGAACCCATGGGGCCagaggtggagctggcc); and a version of HLHmβ with the WRPW motif replaced with a stretch of 9 alanine residues (HLHmβ+Smar-motifMUT) was amplified using HISmbeta-f and mbetaSmarSuHmut-r (ccatggtaccagTCAggctgccgctgcggctgccgctgctgcCATGGGGCCagaggtggagctggcc). Each construct was subsequently cloned into pRSET-C using the restriction enzymes BamHI and Acc65I and sequence verified. Binding of these constructs to Drosophila melanogaster Gro was assayed using GST alone and a GST-Gro fusion protein. The latter construct was made by cloning the full-length Gro coding sequence into the pGEX-KG expression vector at the EcoRI and SalI restriction sites:
gtggcgaccatcctccaaaatcggatctggttccgcgtggatccccgggaatttccggtggtggtggtggaattctaATG...TAAATCCACAAAACCATGCAGTTTTTTCATTTTGTAATAAGCTCGTATAGTTTTTATTACAACATGTTCGAAATCATGCAcccgggctgcaggaattcgatatcaagcttatcgataccgtcgactcgagctcaagcttaattcatcgtgactgactgacgatctg
(underlined = pGEX KG vector; uppercase = gro cDNA; bold = gro start and stop codons; italic = linker).
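The two motif-swap reverse primers above can be checked against the peptides they are stated to encode by reverse-complementing the coding portion of each primer and translating it with the standard genetic code. The Python sketch below does this for the segment of each primer that follows the lowercase restriction-site tail (ccatggtaccag) and ends at the GGGGCC shared with the truncation primer. The names are hypothetical, and the reading frame is an assumption inferred from the TCA triplet at the start of that segment, which is the reverse complement of a TGA stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Reverse complement of a DNA string (case-insensitive input)."""
    complement = str.maketrans("ACGT", "TGCA")
    return dna.upper().translate(complement)[::-1]


def translate(dna):
    """Translate in frame 0; trailing bases that do not fill a codon are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Coding segments of the reverse primers, copied from the text.
WT_TAIL = "TCACACTTTATCAGGTGGAGTGAGAGAACCCATGGGGCC"   # mbetaSmarSuH-r
MUT_TAIL = "TCAggctgccgctgcggctgccgctgctgcCATGGGGCC"  # mbetaSmarSuHmut-r

print(translate(reverse_complement(WT_TAIL)))
print(translate(reverse_complement(MUT_TAIL)))
```

Under these assumptions the first segment translates to GPMGSLTPPDKV followed by a stop, and the second to GPM followed by nine alanines and a stop, consistent with the construct descriptions in the text.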
GST pulldowns using each of the above constructs were performed as previously described (Fontana and Posakony, 2009 [...]

Figure 7. [...] Hosono et al., 2004) is closely related to the C-terminal half of Hairless and S-CAP SBDs. Upper diagram is a sequence alignment of the entire Yippee-like proteins from Drosophila melanogaster (Dm) and Homo sapiens (Hs). Aligned below are contiguous SBD motifs from Drosophila Hairless and five Myriapod and Chelicerate S-CAPs; their C-terminal halves are shown in bold. Two leucine (L) residues shown to make direct contact with Su(H) (Yuan et al., 2016) are highlighted in red. Amino acid sequence identities are indicated by vertical lines; conservative substitutions are indicated by + signs. Other species names as in Figure 4A. (B) As shown in the gene diagram at the bottom, the CtBP recruitment motif in Hairless is encoded by a very small exon located at the extreme 3' end of the gene [example is from the Oriental fruit fly Bactrocera dorsalis (Bdor; JFBF01000273.1); scale indicated]. A pre-existing gene encoding a protein that utilizes the same PLNLS recruitment motif is a possible source of this exon. Example shown is a portion of the senseless gene from the red flour beetle Tribolium castaneum (Tribolium Genome Sequencing Consortium, 2008). Senseless directly recruits the CtBP co-repressor via the PLNLS motif (Miller et al., 2014). This portion of the protein is encoded in exon 2; splice junction is indicated by a red /. Aligned beneath it is the last exon of the Bdor Hairless gene, illustrating its splice junction in the same frame as senseless exon 2. DOI: https://doi.org/10.7554/eLife.48115.011

[...]
TGAAAACAGCACCTCTTTAAGCTTCAGCGACGACAATAGCAGCATTCAGAGCAGCCCG
TGGCAGCGTGATCAGCCGTGGAAACAGAGTCGTCCGCGCCGTGGCATCAGCAAAGAACTGTC
TTTATTTTTCCACCGCCCGCGCAATAGTACACTGGGTCGTGCAGCCTTACGTACCGCAGCCCGCAAACGTCGTCGTCCGCATGAACCGCTGACCACCAGCGAAGATCAGCAGCCGATC
TTTGCCACCGCAATCAAAGCCGAGAACGGTGATGATACTTTAAAAGCCGAAGCAGCCGAATAAC
TGGTACCATGG

>Dmel Hairless192-389 5Amut
CGATGGATCCGAGCCGTTGTGGCAGCAGCAGCTGGCACTGCCAAAATCGGCAAAGGCAGCAA
TAGCGGTGGTAGCTTTGACATGGGCCGCACCCCGATTAGCACCCATGGCAACAACAGCTGGGG
TGGTTATGGTGGTCGTGCCCAAGCTTTTAAAGACGGCAAGTTCATCGCCGAAGCCGCACGCAGCAAAGATGGCGACAAAAGCGGTGCCGTGAGCGTGACCCGCAAAACCTTTCGTCCGCCGAG
TGCAGCAACCAGCGCAACCGTTACCCCGACCAGCGCAGTTACCACCGCCTACCCGAAAAACGAAAACAGCACCTCTTTAAGCTTTAGCGACGACAACAGCAGCATTCAGAGCAGCCCG
TGGCAGCGCGATCAGCCGTGGAAACAGAGCCGTCCTCGTCGCGGCATCAGCAAAGAGCTGTC
TTTATTCTTTCATCGCCCGCGCAATAGCACTTTAGGTCGTGCAGCACTGCGCACAGCAGCACG
TAAACGTCGTCGCCCGCATGAACCGCTGACCACCAGCGAAGACCAGCAGCCGA
TTTTTGCCACCGCAATCAAAGCCGAGAACGGCGATGATACTTTAAAAGCAGAAGCAGCCGAA
TAACTGGTACCATGG

>Ptep s-CAP233-432 WT
CGATGGATCCGAACCGTGAATACCGAAGATCCGCCGAAGGATAGCATCAACTTTCTGGACCACAGCCGCGTGACCGATCCGTGTAGTGCCGCAAGCGAAACCAGCCTGCCGCAGGATG
TGCCGGCAACAAGCACCGTGGGCAGCCTGAAATTTTTTCTGGGCGGTCGCCTGGTGCTGAAA
TTAAACGCCCAGCAGGATGGCGGCAGCGGCAATAAATGCCAGTGGGTGCAGAGCAACGATC
TGCCGAAACATAGCAACCATAACAAAAAAGATAAACATAAGAAAAAATTTGCACCGTATAGCTA
TAGCAGCAGCGGCACTCAGAAACCGCTGAAGAAAGGCGACGATACCAGTGCCGTGCCGGACTG
TGATCCGAGCGGCATCAAAAAGCCGCGCCTGAAAGAGTACGAGACCAGCGAGAATAGCGCCC
TGGGTCTGCTGCTGTGCAGCAGCAGTTGGACCCCGCCGGTTGCAGATGGTCAGGAGAGCA
TTGACGTGGACGATACCAGCAGCAAAACCAGCGAGGGCTATATTAGCCCGATCCTGAGCAACAA
TAGCCGCACCAGCAAAATCGACACCATCAAGCACGATTTTGCCAGCAACCCGAACACCTAAC
TGGTACCATGG

>Ptep s-CAP233-432 5Amut
CGATGGATCCGAACCGTGAACACCGAAGACCCGCCGAAAGATAGCATCAACTTTTTAGACCA
TAGCCGCGTGACAGACCCGTGCAGTGCCGCAAGTGAAACCTCTTTACCGCAAGATG
TGCCGGCAACCAGCACCGTGGGTAGCGCCAAAGCCTTTCTGGGCGGTCGTCTGG
TGGCCAAAGCCAATGCCCAGCAAGATGGTGGTAGTGGTAACAAATGCCAAGCTGTGCAGAGCAACGATCTGCCGAAACACAGCAATCACAATAAGAAAGACAAACACAAGAAAAAATTTGCCCCG
TATAGCTATAGCAGCAGCGGCACCCAGAAACCGCTGAAAAAAGGCGATGACACCAGCGCAG
TGCCGGATTGCGATCCGAGCGGCATTAAGAAACCGCGTTTAAAGGAGTACGAGACCAGCGAAAACAGTGCTTTAGGTTTACTGCTGTGCAGCAGCAGTTGGACACCGCCGGTGGCCGATGG
TCAAGAAAGTATCGATGTGGACGACACCAGCAGCAAAACCAGCGAAGGCTACATCAGCCCGA
TTCTGAGCAACAATAGCCGCACCAGCAAAATTGATACCATTAAACATGATTTTGCAAGCAA
TCCGAATACCTAACTGGTACCATGG

[...] encoded by two exons with the same splice junction as in Hairless. If not, the alternative exon structure is indicated. References not in the main reference list are provided in Supplementary file 6.

Data availability
All data generated or analysed during this study are included in the manuscript and supporting files.