Diversity of cwp loci in clinical isolates of Clostridium difficile

An increased incidence of Clostridium difficile infection (CDI) is associated with the emergence of epidemic strains characterized by high genetic diversity. Among the factors that may have a role in CDI is a family of 29 paralogues, the cell-wall proteins (CWPs), which compose the outer layer of the bacterial cell and are likely to be involved in colonization. Previous studies have shown that 12 of the 29 cwp genes are clustered in the same region, named after slpA (cwp1), the slpA locus, whereas the remaining 17 paralogues are distributed throughout the genome. The variability of 14 of these 17 cwp paralogues was determined in 40 C. difficile clinical isolates belonging to six of the currently prevailing PCR ribotypes. Based on sequence conservation, these cwp genes were divided into two groups, one comprising nine cwp loci having highly conserved sequences in all isolates, and the other five loci showing low genetic conservation among isolates of the same PCR ribotype, as well as between different PCR ribotypes. Three conserved CWPs, Cwp16, Cwp18 and Cwp25, and two variable ones, Cwp26 and Cwp27, were characterized further by Western blot analysis of total cell extracts or surface-layer preparations of the C. difficile clinical isolates. Expression of genetically invariable CWPs was well conserved in all isolates, whilst genetically variable CWPs were not always expressed at comparable levels, even in strains containing identical sequences but belonging to different PCR ribotypes. This is the first report on the distribution and variability of a number of genes encoding CWPs in C. difficile.


INTRODUCTION
Clostridium difficile is the most common cause of nosocomial infectious diarrhoea. In addition to PCR ribotype 027, which is associated with high-level fluoroquinolone resistance and increased levels of production of the pathogenic toxins A and B, other types are currently implicated in the increase in incidence, recurrence and mortality of C. difficile infection (CDI) in hospitalized patients. PCR ribotypes 014, 001 and 078 represent the most prevalent strains in European hospitals, as observed in the last European surveillance performed in 2008 (Bauer et al., 2011). Characterization of the genetic conservation among strains belonging to different PCR ribotypes could help significantly to identify the genetic elements associated with the onset of CDI. It has already been established through sequencing of the genomes of a number of C. difficile strains of different origin that there is a high variability in gene composition and conservation among strains (Forgetta et al., 2011; He et al., 2010; Scaria et al., 2010). In this study, we analysed the distribution and variability of the genes of 14 cell-wall proteins (CWPs) in 40 C. difficile clinical isolates of the six prevailing PCR ribotypes in Italy (Spigaglia et al., 2010) and, more generally, in Europe (Bauer et al., 2011).
To date, 29 cwp genes have been identified in C. difficile, encoding a family of CWPs involved in colonization and pathogenesis. All of these CWPs have a conserved domain containing two or three copies of the Pfam 04122 motif, a putative cell-wall-binding repeat 2 (Fagan et al., 2011). In addition, several of the CWPs show a second, more variable domain that may specify a unique function. In all C. difficile strains, the predominant CWPs are the components of the surface layer (S-layer). These proteins are encoded by the slpA gene as a single precursor. After post-translational cleavage of this precursor, two mature proteins are produced, the high-molecular-mass (HMW) SLP and the low-molecular-mass (LMW) SLP. The SLPs facilitate adhesion to cultured cell lines and the LMW SLP is an immunodominant antigen. Other members of the CWP family have been investigated extensively, such as Cwp84, a protease that cleaves the SlpA precursor and also degrades many proteins of the host-cell extracellular matrix, Cwp66, which acts as an adhesin (Waligora et al., 2001), and CwpV, a protein that is expressed in a phase-variable manner (Emerson et al., 2009).
Several of the cwp genes are not conserved in all the C. difficile genomes characterized so far. In general, 12 of the 29 cwp genes are clustered in the same region of the genome, named after slpA (cwp1), the slpA locus (Calabi et al., 2001; Karjalainen et al., 2001), whereas the remaining 17 paralogues are distributed throughout the genome. Recently, the diversity of the genomic region spanning the slpA locus among 57 C. difficile clinical isolates has been described (Dingle et al., 2013). The current study focused on the genes encoding the group of the other 17 paralogues of CWPs, as they are poorly characterized. Knowledge of the conservation of these genes in clinical isolates would offer useful information for the characterization of the role that CWPs may have in CDI and would also provide another tool for classifying newly emerging strains.

METHODS
Bacterial strains and growth conditions. Forty C. difficile clinical isolates collected by the Istituto Superiore di Sanità, Italy, from 1987 to 2010 were used in this study (Table 1). Strains were isolated from symptomatic patients in 13 Italian hospitals distributed over nine different regions of the country. In particular, strains C192, C193, C252, C253, AR1, AR2, TR2, TR3, An45 and An56 were isolated during five different outbreaks that occurred in hospitals C, D, E, F and G located in five different regions. C. difficile isolates were typed as PCR ribotype 001 (two isolates), 012 (ten isolates), 014 (two isolates), 018 (ten isolates), 078 (ten isolates) and 126 (six isolates). Isolates belonging to the same PCR ribotype were not clonally related and were selected in order to assess genetic conservation in strains of different geographical origin throughout Italy.
DNA isolation, amplification and sequencing. Genomic DNA was isolated by a standard protocol for Gram-positive bacteria using a NucleoBond AX-G kit (Macherey-Nagel) according to the manufacturer's instructions. Genes were amplified using primers specific for regions external to each ORF; the primers are listed in Table 2. Only for cwpV were primers designed to amplify a conserved internal segment (904 bp) of the otherwise highly variable coding region (Emerson et al., 2009). When primers were used in a multiplex PCR, the two sets were added at different concentrations: 1 and 0.3 µM for primers specific for the longer gene and the shorter gene, respectively. DNA amplification was performed using 1 µl purified genomic DNA (50 ng) in a final volume of 50 µl. The nucleotide sequences of the PCR products were determined using a BigDye Terminator v3.1 kit (Applied Biosystems) in an ABI PRISM 3700 Analyzer (Applied Biosystems).
Sequence alignments and phylogenetic analysis. The percentage sequence identity was calculated by pair-wise BLAST with the VECTOR NTI SUITE 11 (Informax), with gaps included. Sequence alignments were performed using CLUSTAL W 1.83 (GCG Wisconsin Package, version 11.1) and phylogenetic trees were inferred by the neighbour-joining distance-based method and bootstrapped 1000 times.
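To make the "gaps included" convention concrete, the following minimal Python sketch computes percentage identity over an already aligned pair of sequences, counting gap columns in the denominator. It is an illustration only and not the VECTOR NTI or BLAST implementation; the function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage identity over a pairwise alignment, gap columns included in the length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Columns in which both sequences carry a gap hold no information; skip them.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    # A column is identical only when both residues match and neither is a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment: 8 columns, 6 identical, 1 gap, 1 mismatch.
reference = "ATGAAACC"
isolate = "ATGA-ACT"
print(f"{percent_identity(reference, isolate):.1f}")  # prints 75.0
```

Excluding gap columns from the denominator instead would raise the value for the same alignment, which is why the convention needs to be stated when identities are compared across loci.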
C. difficile protein extraction. Whole-cell lysates were prepared from cultures grown in BHI broth to stationary phase (OD600 ~1) by a method based on a freeze-thaw procedure (Fagan & Fairweather, 2011). Briefly, cultures of C. difficile were harvested by centrifugation at 5000 g for 10 min at 4 °C and the pellets frozen at −20 °C. The bacteria were thawed, suspended in PBS to an OD600 of 20 and incubated at 37 °C for 10 min. Three such freeze-thaw cycles were carried out in order to obtain consistent and reproducible lysis. Extraction of the S-layer was performed following a previously described method (Fagan & Fairweather, 2010). For SDS-PAGE and Western blot analysis, 3.5 µl of total cell extract and 5 µl of S-layer extract were used.
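The resuspension step can be expressed as a simple dilution calculation. The sketch below is a hypothetical helper, not part of the published protocol; it assumes that OD600 scales linearly with cell density and that the whole pellet is recovered after centrifugation.

```python
def resuspension_volume_ml(
    culture_volume_ml: float,
    culture_od600: float,
    target_od600: float = 20.0,
) -> float:
    """Volume of PBS needed to bring a harvested pellet to the target OD600.

    Assumes OD600 is proportional to cell density and no cells are lost.
    """
    if culture_volume_ml <= 0 or culture_od600 <= 0 or target_od600 <= 0:
        raise ValueError("volumes and optical densities must be positive")
    # Total OD units harvested = volume x OD; divide by the target OD.
    return culture_volume_ml * culture_od600 / target_od600


# Hypothetical example: 10 ml of culture at OD600 1 resuspended to OD600 20.
print(resuspension_volume_ml(10.0, 1.0))  # prints 0.5
```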
SDS-PAGE and Western blotting. Extracts were separated by SDS-PAGE on a 12 % polyacrylamide gel, followed by Western blotting and immunodetection with specific antibodies. Antisera against Cwp16, Cwp18, Cwp25, Cwp26 and Cwp27 were raised in mice immunized with purified recombinant CWPs obtained by overexpressing the corresponding ORFs of C. difficile strain 630 using the pET15 vector (Novagen) and the Escherichia coli strain BL21(DE3) (Invitrogen) expression system. Primary antibodies, used at a 1 : 2000 dilution in blocking buffer, were detected using horseradish peroxidase-conjugated goat anti-mouse IgG (diluted 1 : 20 000; Invitrogen) and SuperSignal West Pico chemiluminescent substrate (Thermo Scientific Pierce). The specificity of the anti-CWP antibodies was verified by Western blot analysis of recombinant CWPs with the antisera. Briefly, 150 ng of each purified recombinant CWP was separated by SDS-PAGE and transferred to a nitrocellulose membrane. Specific recognition by each polyclonal antiserum was assayed using the same experimental conditions used for cell extracts. A marker for direct visualization of standard bands (MagicMark XP Western Protein Standard; Invitrogen) was used routinely for protein molecular mass estimation directly on Western blots.

RESULTS
Only the 17 cwp ORFs not included in the slpA locus were taken into consideration. In several cases, only one of the two flanking regions was conserved in all the published genomes. For these cases, different sets of primers were designed that would allow the amplification of the corresponding ORF in all the known variants of the locus. A list of the primers used is given in Table 2. All sets of primers whose sequence was highly conserved in all published genomes or that, alternatively, could discriminate for the presence/absence of a specific cwp gene were combined as two sets in a multiplex PCR and tested on control genomic DNA extracted from strains 630, R20291 and M120. With this approach, amplification of a conserved PCR fragment could be used as a positive control for the negative result obtained when using primers specific for genomic regions that were not present in all published genomes. PCR fragments of the expected length were obtained with all sets of primers, except for those designed for the amplification of cwp14, cwp21 and cwp23. Hence, these genes were excluded from our analysis, which focused on the remaining 14 cwp genes listed in Fig. 1. Amplification of the corresponding 14 cwp ORFs was carried out on genomic DNA extracted from strains isolated from patients at 13 Italian hospitals and representative of the six PCR ribotypes prevalent in Italy (Table 1). The nucleotide sequence was then determined for all of the PCR fragments obtained (Fig. 1). In total, 511 ORFs were sequenced and analysed for sequence conservation using the multiple alignment program CLUSTAL W (Chenna et al., 2003). The ORFs were found to be conserved in all 40 isolates for eight of the cwp genes analysed, whilst the remaining six cwp ORFs were absent or, when present, were not conserved in at least one of the clinical isolates. In particular, cwp27 and cwp29 were absent in all the PCR ribotype 078/126 isolates, as already reported for M120, the reference strain for PCR ribotype 078 (He et al., 2010). Similarly, the ten PCR ribotype 018 strains of our collection lacked cwp28, whilst in the same isolates the other 13 cwp genes were found to be present and conserved (Fig. 1). A PCR fragment of the expected length for cwp17 was obtained in all isolates except for two: one PCR ribotype 078 strain, which also yielded no cwp16 fragment, and one PCR ribotype 126 strain. Moreover, the sequence of cwp17 in all the remaining isolates of PCR ribotypes 078 and 126 showed a lower level of conservation with respect to strain 630 than the isolates of the other PCR ribotypes. Finally, the cwp26 gene was found to be absent, present but variable, or conserved in different isolates of the same PCR ribotype, with the exception of the ten PCR ribotype 018 isolates, which all shared identical cwp26 sequences. The number of single-nucleotide polymorphisms found in each cwp gene among the various PCR ribotypes with respect to the orthologue sequence in strain 630 and the corresponding amino acid substitutions are reported in Table S1 (available in JMM Online).
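The grouping of loci into conserved and variable can be sketched as a simple rule over per-isolate results. The Python example below is a simplified illustration, not the procedure used in this study: it assumes a locus is "conserved" when it was amplified in every isolate and all isolates of the same PCR ribotype share one allele, and "variable" otherwise. The locus names, isolate data and function name are hypothetical.

```python
from collections import defaultdict
from typing import Optional

# (PCR ribotype, ORF sequence) per isolate; None means no PCR fragment was obtained.
Isolate = tuple[str, Optional[str]]


def classify_locus(isolates: list[Isolate]) -> str:
    """Label a locus 'conserved' or 'variable' under the simplified rule above."""
    if not isolates:
        raise ValueError("at least one isolate is required")
    alleles_by_ribotype: dict[str, set[str]] = defaultdict(set)
    for ribotype, sequence in isolates:
        if sequence is None:
            # Absent (or not amplifiable) in at least one isolate.
            return "variable"
        alleles_by_ribotype[ribotype].add(sequence.upper())
    # Conserved only if every ribotype carries a single allele.
    if all(len(alleles) == 1 for alleles in alleles_by_ribotype.values()):
        return "conserved"
    return "variable"


# Hypothetical toy data for three loci.
toy_loci = {
    "locusA": [("012", "ATGAAA"), ("012", "ATGAAA"), ("078", "ATGAAG")],
    "locusB": [("012", "ATGCCC"), ("012", "ATGCCA"), ("078", "ATGCCC")],
    "locusC": [("012", "ATGTTT"), ("078", None)],
}
for name, isolates in toy_loci.items():
    print(name, classify_locus(isolates))
```

Under this rule, differences between ribotypes alone (as in the hypothetical locusA) do not make a locus variable, which mirrors the distinction drawn in the text between within-ribotype and between-ribotype polymorphism.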

Phylogenetic analysis of cwp sequences
The sequences of the 14 cwp ORFs in one isolate for each PCR ribotype, arbitrarily selected as representative of all strains that belonged to the same PCR ribotype, were joined in a single string and compared with the corresponding cwp sequences of strain 630, also joined in a string. The phylogenetic tree inferred from the sequences of the 14 cwp loci among the clinical isolates and constructed by the use of the neighbour-joining algorithm is shown in Fig. 2. Two interesting observations could be drawn from this analysis. First, strains that belong to PCR ribotypes 078 and 126 always had identical cwp sequences in the clinical isolates of our collection. Conversely, M120, the reference strain for PCR ribotype 078, and strain QCD-23M63, another PCR ribotype 078 strain whose genome sequence has been characterized (Forgetta et al., 2011), showed some variability for these cwp genes (Fig. 2). Second, the two PCR ribotypes 078 and 126 were clearly more closely related to the hypervirulent PCR ribotype 027 than to any of the other PCR ribotypes analysed in our study.
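The concatenation step can be illustrated with a short Python sketch that joins per-locus sequences in a fixed order and computes pairwise p-distances (the proportion of differing sites), the kind of matrix a neighbour-joining method takes as input. This is a sketch under stated assumptions rather than the pipeline used here: each locus is assumed to be already aligned across strains, an absent locus is assumed to be padded with gaps, and all names and sequences are hypothetical. Tree building and bootstrapping are not shown.

```python
from itertools import combinations


def concatenate_loci(loci: dict[str, str], order: list[str]) -> str:
    """Join the aligned sequences of one strain in a fixed locus order."""
    return "".join(loci[name] for name in order)


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites, ignoring columns with a gap in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not compared:
        raise ValueError("no gap-free columns to compare")
    return sum(a != b for a, b in compared) / len(compared)


# Hypothetical toy data: two aligned loci in three strains.
locus_order = ["locus1", "locus2"]
strains = {
    "630": {"locus1": "ATGAAA", "locus2": "ATGCCC"},
    "isolateA": {"locus1": "ATGAAA", "locus2": "ATGCCA"},
    "isolateB": {"locus1": "ATGGAA", "locus2": "ATG---"},  # locus2 partly missing
}
joined = {name: concatenate_loci(loci, locus_order) for name, loci in strains.items()}

# Pairwise distance matrix stored as a dictionary keyed by strain pair.
distances = {
    (a, b): p_distance(joined[a], joined[b]) for a, b in combinations(joined, 2)
}
for pair, value in distances.items():
    print(pair, round(value, 4))
```

Keeping the locus order identical for every strain is what makes the joined strings comparable column by column.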

Analysis of expression of conserved versus variable CWPs
Five of the CWPs under study were characterized further by Western blot analysis of total cell extracts or S-layer preparations of the C. difficile clinical isolates. Three conserved CWPs, Cwp16, Cwp18 and Cwp25, and two variable ones, Cwp26 and Cwp27, were selected for this analysis as antisera showing high specificity for these CWPs were available. For simplicity, total cell extracts were prepared from only one representative isolate for each PCR ribotype, as well as from any strain showing variable alleles. A similar mode of expression was detected in total extracts of all the clinical isolates analysed for the highly conserved cwp16, cwp18 and cwp25 genes (Fig. 3). However, it was noteworthy that strain IT0901, the only isolate from which we were not able to amplify the cwp16 gene, showed a Cwp16-positive band of the same intensity and molecular mass as all the other strains. This suggested that the cwp16 flanking regions that we used to design the primers are not conserved in strain IT0901. Moreover, as we were also not able to amplify the adjacent cwp17 gene in this isolate, we propose that in strain IT0901 the entire region may contain some degree of sequence variability that does not compromise expression of Cwp16. For this reason, we believe that Cwp16 can be included in the group of the highly conserved CWPs, thus bringing to nine the number of conserved CWPs compared with five variable ones.
Conversely, analysis of the data obtained on expression of Cwp26 revealed that a protein of the expected molecular mass (49 kDa) was present in total cell extracts of the 027 reference strain R20291 and the PCR ribotype 001 strains but was missing in the remaining isolates (Fig. 4a), even if the gene was present. A weaker band visible at approximately 31 kDa in all samples and representing a cross-reaction of the Cwp26-specific polyclonal antibody with Cwp25 (data not shown) was used as a sample loading control for the Cwp26-negative samples.
To verify whether the absence of Cwp26 in total cell extracts was due to the sample preparation procedure or to differences in expression/localization, the Western blot analysis was repeated on S-layer preparations of the same strains. S-layer extracts showed the presence of a 49 kDa Cwp26-positive band in the PCR ribotype 078 and 126 strains as well as in the 027 and 001 isolates already found to be positive in the total cell extracts (Fig. 4b). Likewise, the strains that belonged to PCR ribotypes 012, 014 and 018 showed a strong positive signal only in S-layer preparations, although the strong band recognized in these strains had a significantly higher molecular mass (~70 kDa) than that predicted from the cwp26 gene sequence (Fig. 4b). In addition, it should be noted that the same strong signal at 70 kDa was also visible in S-layer preparations of the two 012 isolates, TR2 and TR3, that were found to be cwp26 negative by PCR analysis (Fig. 1).
Of the other two cwp26-negative isolates reported in Fig. 1, strain CD5 displayed two positive bands at 48 and 49 kDa, whilst strain F II 3 did not show any specific band recognized by the anti-Cwp26 antibodies (Fig. 4b).
Although we cannot offer an explanation for the results obtained in isolates TR2, TR3 and CD5, it can be inferred that F II 3 is the only strain that clearly does not contain a cwp26 orthologue.
The results of the Western blot analysis of the expression of Cwp27 are shown in Fig. 5. Two bands of the expected molecular mass for the mature form (38 kDa) and precursor (41 kDa) of Cwp27, as inferred from the Cwp27 sequence analysis using the PSORT (Nakai & Horton, 1999) and VECTOR NTI prediction programs, were seen in total extracts as well as in S-layer preparations of the two reference strains 630 and R20291 used as positive controls. Conversely, no band was visible in M120, a strain that does not have a cwp27 orthologue. In clinical isolates of PCR ribotypes 012, 014 and 018, the same two bands were present with the same intensity ratio in S-layer extracts but with varying intensities in total cell extracts. However, neither of the PCR ribotype 001 isolates of our collection, which contained a conserved cwp27 gene, showed any positive signal, thus indicating that these strains did not express Cwp27 at detectable levels under the conditions used. In contrast, the lack of a positive signal in PCR ribotype 078 and 126 isolates confirmed the absence of a cwp27 orthologue in these strains.

DISCUSSION
The observation that the emergence of new C. difficile strains is associated with an increased incidence and virulence of CDI suggests that strain differences play an important role in the onset and subsequent outcome of disease. For this reason, many recent studies have focused on characterization of the genetic variability found in clinical isolates belonging to PCR ribotypes recurrent in outbreaks of CDI (Forgetta et al., 2011; He et al., 2010; Scaria et al., 2010; Stabler et al., 2006). Among the genetic traits supposedly relevant for pathogenicity, we chose to investigate the variability of a number of genes encoding a family of surface-exposed proteins, the CWPs. Our analysis was carried out on 14 of the 29 known cwp genes in 40 Italian clinical isolates belonging to PCR ribotypes 001, 012, 014, 018, 078 and 126. The data provided an insight into the extent of sequence variability among strains of different PCR ribotypes, as well as among different isolates of the same PCR ribotype. On the basis of the degree of sequence conservation, these cwp genes could be divided into two groups. One comprises nine highly conserved cwp genes (cwpV, cwp13, cwp16, cwp18, cwp19, cwp20, cwp22, cwp24 and cwp25) that have identical sequences in all the isolates of the same PCR ribotype and only a few polymorphisms among PCR ribotypes, and the other group includes five variable cwp genes (cwp17, cwp26, cwp27, cwp28 and cwp29) with low sequence conservation among isolates of the same PCR ribotype as well as among different PCR ribotypes. Interestingly, the latter group comprises cwp27 and cwp29, two genes encoding CWPs that do not contain putative domains assigned to a known function (Fagan et al., 2011). A search for sequence homology of their unassigned C-terminal regions, however, showed that both have some sequence similarity with phage proteins (data not shown), thus implying that these genes could have been acquired through horizontal gene-transfer events.
Moreover, the results of our phylogenetic analysis show that certain PCR ribotypes always display the same type of variability for most of the cwp genes included in our study. This is the case for PCR ribotypes 014 and 018, or 078 and 126. In particular, all ten PCR ribotype 078 isolates of our collection have polymorphisms identical to those found in the six PCR ribotype 126 strains, whilst they differ, albeit just for a few nucleotides, from the 078 reference strains M120 and QCD-23M63 (Fig. 2). This is in agreement with previous studies, where PCR ribotypes 078 and 126 are always assigned to the same lineage (Reil et al., 2011; Spigaglia et al., 2010), a clade that is frequently associated with livestock as well as with humans (Goorhuis et al., 2008; Hensgens et al., 2012; Keel et al., 2007). The cwp genes analysed in this study appear to maintain a tight association with the linkage groups defined by PCR ribotypes. This suggests that their genetic diversity coevolves with the core genome, unlike what has been shown to occur at the slpA gene cluster for the slpA, secA2 and cwp66 cassette, which generates S-layer switching (Dingle et al., 2013).
Another interesting outcome of our study is the finding that, with regard to cwp gene variability, PCR ribotypes 078/126 are more closely related to PCR ribotype 027 than to any of the other PCR ribotypes analysed (Fig. 2).
Recently, a number of reports have focused on determining which genetic factors PCR ribotypes 027 and 078/126 may have in common that would help to explain the similarity in CDI outcome observed for these 'hypervirulent' strains (Barbut & Rupnik, 2012; Dingle et al., 2013; Knetsch et al., 2011; Stabler et al., 2006; Walk et al., 2012). Our data suggest that 14 of the 29 predicted CWPs share a high degree of sequence similarity in strains that belong to PCR ribotypes 078/126 and 027.
As CWPs are surface components of C. difficile possibly involved in colonization and onset of CDI, it is proposed that several of the 14 CWPs characterized in this work may represent common traits of these PCR ribotypes that contribute to their 'hypervirulent' behaviour.
The genetic diversity found among the six PCR ribotypes is not evenly distributed between the 14 cwp genes of interest and can be used to discriminate between highly conserved and variable cwp genes. Likewise, expression of conserved CWPs seems to be well conserved in all isolates (Fig. 3), whilst variable CWPs are not always expressed at comparable levels, even in strains containing identical sequences but belonging to different PCR ribotypes, as seen for the expression of Cwp27 in PCR ribotype 001 isolates (Fig. 5). Our results highlight how difficult it is to characterize key components of the C. difficile cell surface due to the exceedingly high overall genetic complexity present in different C. difficile isolates. Indeed, in the case of the extremely variable cwp26 gene, although we were not able to amplify and sequence this orthologue in several isolates (Fig. 1), expression of a Cwp26-like protein could be detected in most of the PCR-negative strains. Moreover, in all the PCR ribotype 001, 012, 014 and 018 strains analysed, expression of Cwp26 was observed only in S-layer extracts, suggesting that Cwp26 is a complex constituent of the S-layer in these PCR ribotypes (Fig. 4b).
In summary, we propose that the conserved CWPs may correspond to essential components of the bacterial surface, whilst the highly variable CWPs could be more recent acquisitions of additional surface elements. As the specific function of the majority of the CWPs analysed in our study remains unclear, it is not currently possible to elucidate whether there is a correlation between the presence of a particular CWP in the S-layer and C. difficile interspecies transmission, or increased spread and severity of CDI. All these aspects need to be addressed urgently in order to be able to contain the significant increase in disease incidence and mortality reported in recent years.
Finally, due to the complexity of the genetic variability observed in C. difficile strains, a unique method for typing newly emerging strains of C. difficile is still not available. The analysis of cwp gene diversity could offer an additional tool for the classification of C. difficile clinical isolates.
Sequence alignment of ORFs predicted to code for CWPs in published C. difficile genomes was used to identify conserved flanking regions suitable for designing primers for amplification of the corresponding cwp ORFs.

Fig. 1. Comparison of cwp loci in different PCR ribotypes. Distribution of the 14 cwp genes in 40 Italian clinical isolates and their conservation with respect to strain 630. Reference strains are highlighted in the blue box. Strains also tested for expression of a specific CWP by Western blot analysis are indicated by a full circle (●) when found positive or by an open circle (○) when a specific band could not be detected.

Fig. 3. Analysis of expression of conserved CWPs. Western blot analysis of total cell extracts of C. difficile reference strains (630, R20291 and M120) and clinical isolates representing different PCR ribotypes using anti-Cwp16, anti-Cwp18 and anti-Cwp25 antibodies. A marker for direct visualization of standard bands (MagicMark XP Western Protein Standard) was used for protein molecular mass assessment directly on Western blots.

Fig. 4. Expression of Cwp26. Western blot analysis of C. difficile reference strains (630, R20291 and M120) and clinical isolates representing different PCR ribotypes using anti-Cwp26 antibody. Total cell extracts (a) and S-layer extracts (b) were separated by SDS-PAGE, followed by Western blotting with Cwp26-specific antibodies. A marker for direct visualization of standard bands (MagicMark XP Western Protein Standard) was used for protein molecular mass assessment directly on Western blots. The grey shaded arrow indicates the position of the expected molecular mass for the mature form of Cwp26, whilst the open arrow indicates the Cwp26-positive band at approximately 70 kDa.

Table 1. C. difficile clinical isolates analysed in this study
*Strains were isolated at 13 Italian hospitals (arbitrarily denominated A-M).

Table 2. Primers used in this study