The cysteine-rich exosporium morphogenetic protein, CdeC, exhibits self-assembly properties that lead to organized inclusion bodies in Escherichia coli

Clostridioides difficile is an obligately anaerobic, spore-forming, Gram-positive pathogenic bacterium, considered the leading cause of nosocomial diarrhea worldwide. Recent studies have attempted to understand the biology of the outermost layer of C. difficile spores, the exosporium, which is believed to contribute to early interactions with the host. The fundamental role of the cysteine-rich proteins CdeC and CdeM has been described. However, the molecular details of exosporium assembly are missing, and the underlying mechanisms remain poorly studied, in part because of difficulties in obtaining pure, soluble recombinant proteins of the C. difficile exosporium. In this work, we observed that CdeC formed organized inclusion bodies in the E. coli BL21 (DE3) pRIL strain, filled with lamellae-like structures separated by an interspace of 5-15 nm; however, this lamellae-like organization was lost upon overexpression in the E. coli SHuffle T7 strain, which has an oxidative environment. Additionally, DTT treatment of CdeC inclusion bodies released monomeric soluble forms of CdeC. Three truncated versions of the CdeC protein were constructed. While all the variants were able to aggregate, forming oligomers that are resistant to denaturing conditions, TEM micrographs suggest that the self-organization properties of CdeC may be attributed to the C-terminal domain. Overall, these observations have important implications for further studies aimed at elucidating the role of CdeC in the exosporium assembly of C. difficile spores.


Introduction
[...] lysed with 10 mg/mL lysozyme for 1 hour at 37ºC. After that, cells were sonicated at 12 Watts for 15 s on ice and centrifuged at 5853 x g for 45 min at 4ºC. The supernatant obtained was discarded, and the pellet was washed two times with 2% Triton X-100 in PBS 1X buffer by [...]

[...] Table S4. In order to make a rooted phylogenetic tree explaining the evolutionary history of CdeC, the NCBI RefSeq database was searched using the same strategy, excluding all hits from organisms described as C. difficile (tax id: 1496) and all hits with less than 90% coverage of the reference gene of the C. difficile 630 strain. The genes found and the species to which they belong are listed in supplementary Tables S4, S5, and S6. The files were concatenated and clustered into single alleles using CD-HIT-EST v4.8.1. The single alleles are listed in Table S7. These single alleles were aligned using the AlignTranslation function of the R DECIPHER package v2.16.1 with the default parameters. The multiple sequence alignment is found in supplementary Table S8 (nucleotide) and Table S9 (amino acid). To perform the phylogenetic inference, the GTR+I+G substitution model was selected by the AIC and BIC methods using the jModelTest v2.1.10 program on the nucleotide alignment (Table S8). Phylogenetic inference was performed using the Bayesian method with the program BEAST v1.10.4, using 6 independent chains of 10 000 000 states, sampling every 1000 states. Effective sample size values were >200 for all parameters, and convergence and mixing were assessed using Tracer [...]

For immunofluorescence analysis, droplets of 2 μL samples were added to poly-L-lysine pre-treated coverslips and dried for 10 min at 37ºC.

For the measurements of the distance between each lamination, at least five independent inclusion bodies were divided along their axial and longitudinal axes, giving four sections per inclusion body. From each section, five lamellae-like structures were measured, giving a total of 20 measurements per inclusion body. The results were plotted in a frequency distribution graph. The data were fit to a Gaussian curve using GraphPad Prism 8 software.
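As an illustration of the measurement scheme described above, the following minimal Python sketch reproduces the counting (four sections per inclusion body, five lamellae per section) and summarizes a set of spacings with a frequency distribution and a Gaussian description. It is only a sketch under stated assumptions: the spacing values are synthetic placeholders drawn around 10 nm (chosen only because it lies inside the 5-15 nm range reported in the abstract) and are not measured data; all function names are hypothetical; and the Gaussian parameters are estimated from the sample mean and standard deviation, which is a simpler stand-in for the curve fitting performed in GraphPad Prism, not a reimplementation of it.

```python
import math
import random
import statistics

MIN_BODIES = 5             # "at least five independent inclusion bodies"
SECTIONS_PER_BODY = 4      # axial x longitudinal division of each body
LAMELLAE_PER_SECTION = 5   # spacings measured in each section


def measurements_per_body() -> int:
    """Number of spacing measurements taken on one inclusion body."""
    return SECTIONS_PER_BODY * LAMELLAE_PER_SECTION


def simulate_spacings(seed: int = 0) -> list[float]:
    """Synthetic placeholder spacings in nm (NOT measured data)."""
    rng = random.Random(seed)
    n = MIN_BODIES * measurements_per_body()
    return [rng.gauss(10.0, 2.0) for _ in range(n)]


def frequency_distribution(values, bin_width=1.0):
    """Count values per bin; keys are the left edges of the bins."""
    counts = {}
    for v in values:
        left_edge = math.floor(v / bin_width) * bin_width
        counts[left_edge] = counts.get(left_edge, 0) + 1
    return dict(sorted(counts.items()))


def fit_gaussian(values):
    """Estimate Gaussian parameters as (sample mean, sample SD)."""
    return statistics.fmean(values), statistics.stdev(values)


spacings = simulate_spacings()
histogram = frequency_distribution(spacings)
mean_nm, sd_nm = fit_gaussian(spacings)
print(f"n = {len(spacings)}, mean = {mean_nm:.2f} nm, SD = {sd_nm:.2f} nm")
```

With real data, `simulate_spacings` would be replaced by the list of measured distances; the binning width of 1 nm is likewise an arbitrary choice for the sketch.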

Cysteine-rich protein multimerizes during heterologous expression
In a recent study, it was shown that the cysteine-rich protein CdeM exists not just as a monomer (predicted molecular weight of 19 kDa) but also as several forms, including species of 25 and [...]

[...] showed an organized lamellae-like ultrastructure. Besides overexpression in pETM11, CdeC was also cloned and expressed in the vector pET22b, with no differences in the lamellae ultrastructure (Fig. S1A). Overexpression was also tested at 21ºC and 37ºC; no differences in the lamellae-like structure were observed (Fig. S1B), indicating that its formation is temperature independent.

The CdeC cysteine-rich proteins are highly conserved in members of the Peptostreptococcaceae family, and, at least in the epidemically relevant R20291 strain, CdeC is essential for morphogenesis of the [...] (Table S1). From these, 1833 alleles of cdeC were extracted (Table S4). In addition, 16 alleles of cdeC-like proteins with at least 90% coverage from the family Peptostreptococcaceae (Clostridium cluster XI) were added to extend the analysis to other species and to identify amino acid positions conserved throughout evolution. The assemblies and their accession codes are described in Table S2. To reduce the dataset without losing [...]

Considering the abundant cysteine residues in the CdeC amino acid sequence, it is feasible to hypothesize that the lamellae-like self-assembly of CdeC may be influenced, at least in part, by the formation of disulfide bonds, as is the case for some of the cysteine-rich spore coat proteins from B. [...]

Owing to their stability and high density, CdeC inclusion bodies can be isolated from cell lysates by differential centrifugation, providing fast, robust, and hence cost-efficient protocols to obtain large amounts of relatively pure protein. We first aimed to test whether CdeC inclusion bodies could be purified from E. coli lysates. For this, E. coli cells carrying CdeC inclusion bodies were lysed, and the remaining pellet was extracted with Triton X-100, a detergent that solubilizes membrane-associated material rather than aggregated proteins. After sequential washes with Triton X-100 and PBS, the pellets were enriched in CdeC inclusion bodies, which were observed as dark spheres (Fig. S4A). In addition, all the inclusion bodies showed immunoreactive fluorescence with anti-His antibodies, indicating that the 6xHis tag is accessible (Fig. S4A). Next, we asked whether the purification protocol could affect the ultrastructure of the bodies. [...]

Therefore, we selected the reducing agent DTT, which maintains sulfhydryl (-SH) groups in a reduced state and is effective at reducing the disulfide bridges in proteins and in the cross-linker N,N′-bis(acryloyl)cystamine. Phase-contrast micrographs demonstrate that the inclusion bodies retained their phase-bright properties upon treatment with various DTT concentrations (Fig. 5A). [...] (Fig. 5C). We also observed that some filamentous fragments of the inclusion bodies are lost after [...]

The self-assembly of short peptides has been actively studied in recent years (Hu et al 2020). The process is generally driven by specific, spontaneous, and non-covalent chemical interactions. In the case of CdeC, some regions of the protein might have self-assembly capacity. Since the redox environment and DTT treatment affect the organized ultrastructure of the inclusion bodies, this organization could depend, at least in part, on the formation of disulfide bridges, suggesting that cysteine residues could be implicated in the self-assembly properties. As mentioned before, the sequence of CdeC contains several cysteine residues and sequence repeats (Fig. 7A); the relevance of these residues is unknown, but they may be implicated in the organization of the lamellae-like structures of CdeC. [...] of CdeC (Fig. 7A), and two redox-sensitive disordered regions, CETTFEFAVCGERNAEC and VDTFSKVCDF (Fig. 7A). The predicted molecular weight of this variant is 23.9 kDa. These three [...]

These results agree with the Western blot analysis, where most of the protein was found in the insoluble fraction. In contrast, the CdeC1-100 variant did not seem to form inclusion bodies, and it was found, qualitatively, in similar amounts in the soluble and insoluble fractions (Fig. 7B). Furthermore, the inclusion bodies of variants CdeC1-214 and CdeC206-405 appear to be as big as those of CdeCFL but lack the lamellae-like structure (Fig. 7C). For the CdeC206-405 variant, the inclusion bodies seem to contain discrete laminations (Fig. 7C), but these are not as clear as in CdeCFL. These results may suggest that some of the amino acid residues essential for CdeC organization may be localized [...]

[...]). In C. sporogenes, it was also observed that the cysteine-rich exosporium protein CsxA self-assembled into a highly thermally stable structure identical to that of the native exosporium when expressed in E. coli (Janganan et al 2020).

Therefore, it is tempting to propose that cysteine-rich proteins drive the assembly of the exosporium by building a self-assembled structure that serves as a scaffold for the later recruitment of other exosporium constituents, and that this mechanism may be conserved in Clostridiales.

Although the precise mechanism that guides the formation of the lamellae-like structures is unknown, our predicted interaction maps suggest that the C-terminal domain has motifs that contribute to self-organization. This is supported by TEM analysis of the CdeC variants, which showed that the central and carboxyl regions of the protein tend to form aggregates but are not able to form the