Functional and Structural Diversity in the Als Protein Family of Candida albicans

The human fungal pathogen Candida albicans colonizes and invades a wide range of host tissues. Adherence to host constituents plays an important role in this process. Two members of the C. albicans Als protein family (Als1p and Als5p) have been found to mediate adherence; however, the functions of other members of this family are unknown. In this study, members of the ALS gene family were cloned and expressed in Saccharomyces cerevisiae to characterize their individual functions. Distinct Als proteins conferred distinct adherence profiles to diverse host substrates. Using chimeric Als5p-Als6p constructs, the regions mediating substrate-specific adherence were localized to the N-terminal domains in Als proteins. Interestingly, a subset of Als proteins also mediated endothelial cell invasion, a previously unknown function of this family. Consistent with these results, homology modeling revealed that Als members contain anti-parallel beta-sheet motifs interposed by extended regions, homologous to adhesins or invasins of the immunoglobulin superfamily. This finding was confirmed using circular dichroism and Fourier transform infrared spectrometric analysis of the N-terminal domain of Als1p. Specific regions of amino acid hypervariability were found among the N-terminal domains of Als proteins, and energy-based models predicted similarities and differences in the N-terminal domains that probably govern the diverse function of Als family members. Collectively, these results indicate that the structural and functional diversity within the Als family provides C. albicans with an array of cell wall proteins capable of recognizing and interacting with a wide range of host constituents during infection.


Introduction
Although normally a harmless commensal, Candida albicans can cause a variety of conditions ranging from superficial mucocutaneous infection to deep organ involvement in disseminated candidiasis (1). Prior to causing disease, the fungus colonizes the gastrointestinal tract, and in some cases skin and mucous membranes. Adherence to host mucosal surfaces is a key prerequisite for this initial step (3). After colonization, C. albicans enters the bloodstream via infected intravascular devices or by transmigration through gastrointestinal mucosa compromised by chemotherapy or stress ulcerations (1). Organisms then disseminate via the bloodstream, bind to and penetrate the vascular endothelium to egress from the vascular tree, and invade deep organs such as liver, spleen and kidney (3). Thus, C. albicans must be capable of adhering to a variety of biological substrates at different stages of infection.
There is abundant experimental evidence to support the role of adherence in candidal virulence. Initial observations noted a correlation between the degree of endothelial cell adherence and virulence of different yeast species (4). More recently, several adhesins of C. albicans have been isolated and characterized (5-11). Mutants deficient in the genes encoding these adhesins not only exhibit decreased adherence to host substrates in vitro, but also a corresponding reduction in virulence in several experimental models of C. albicans infection (6,8,12-14). The extent of these reductions has been quite variable, likely due to the actions of other adhesins with redundant or overlapping function.

ALS1 is a member of a large C. albicans gene family consisting of at least 8 members originally described by Hoyer et al. (15,16). These genes encode cell surface proteins that are characterized by three domains. The N-terminal region contains a putative signal peptide and is relatively conserved among Als proteins. This region is predicted to be poorly glycosylated (16,17). The central portion of these proteins consists of a variable number of tandem repeats (~36 amino acids in length) and is followed by a serine-threonine-rich C-terminal region that contains a glycosylphosphatidylinositol anchor sequence (16,17). While the proteins encoded by this gene family are known to be expressed during infection (18,19), the function of the different Als proteins has not been investigated in detail.
We therefore used heterologous expression of Als proteins in non-adherent Saccharomyces cerevisiae to evaluate the function of Als proteins in isolation, and to avoid the high background adherence mediated by the multiple other adhesins expressed by C. albicans. This heterologous expression system has been used extensively for the study of C. albicans genes, including the isolation and characterization of the adhesins ALS1, ALS5 and EAP1 (7,9,10). Using this model system we demonstrated that Als proteins have diverse adhesive and invasive functions.
Consistent with these results, homology modeling indicated that Als proteins are closely related in structure to adhesin and invasin members of the immunoglobulin superfamily of proteins.

7 blastospores were isolated from overnight culture, blocked with 100 µl of goat serum, and then stained with either polyclonal antiserum A or B at a 1:25 dilution, followed by fluorescein isothiocyanate-labeled goat anti-rabbit IgG at 1:100.
A FACSCalibur (Becton Dickinson) instrument equipped with an argon laser emitting at 488 nm was used for flow cytometric analyses. Fluorescence emission was detected with a 515/40 bandpass filter. Fluorescence data for 10,000 events were collected, and the distribution of cells with fluorescence above baseline (i.e., S. cerevisiae transformed with the empty plasmid) was analyzed for each strain using CELLQUEST software (Becton Dickinson).
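The comparison against the empty-plasmid baseline described above can be illustrated with a minimal sketch. It is not the CELLQUEST analysis itself: the function name, the nearest-rank 99th-percentile threshold, and the toy intensity values are all assumptions made for illustration, since the text does not state how the baseline cutoff was set.

```python
import math


def fraction_above_baseline(sample, control, percentile=99.0):
    """Return the fraction of sample events brighter than a control cutoff.

    The cutoff is the nearest-rank percentile of the control (empty-plasmid)
    intensities. Both the percentile and the method are illustrative
    assumptions, not values taken from the paper.
    """
    if not sample or not control:
        raise ValueError("sample and control must be non-empty")
    ordered = sorted(control)
    # Nearest-rank percentile: smallest value with at least `percentile`
    # percent of control events at or below it.
    rank = max(1, math.ceil(percentile * len(ordered) / 100.0))
    threshold = ordered[rank - 1]
    above = sum(1 for value in sample if value > threshold)
    return above / len(sample)


# Toy intensities (hypothetical): control events span 1..100.
control_events = list(range(1, 101))
sample_events = [50, 99, 100, 150]
print(fraction_above_baseline(sample_events, control_events))
```

In practice each list would hold the fluorescence values of the 10,000 collected events for one strain.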

Chimeric Als Protein Construction and Expression -
To test the hypothesis that N-terminal sequences were responsible for mediating substrate-specific adherence, we constructed chimeric Als5/Als6 proteins by exchanging the N-termini of each protein. Chimeric ALS5/6 genes were constructed as follows. A BglII-HpaI fragment of ALS5 encompassing the 5' 2117 bp of the gene was isolated. pGEM-T-ALS6 was then digested with BglII and HpaI to release the corresponding 5' 2126 bp of ALS6, and the fragment consisting of pGEM-T-Easy plus the 3' sequences of ALS6 was isolated and ligated to the 5' ALS5 fragment to generate plasmid pGEM-T-5N6C. An identical approach using the corresponding 5' fragment of ALS6 and 3' fragment of ALS5 was used to generate plasmid pGEM-T-6N5C.
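The fragment exchange described above can be sketched in silico as a string operation. This is only an illustration under stated assumptions: the function names and the two toy sequences are hypothetical (they are not ALS5 or ALS6), only the HpaI junction is modeled, and each input is assumed to carry exactly one HpaI site (GTTAAC, cut blunt between GTT and AAC).

```python
HPAI_SITE = "GTTAAC"  # HpaI recognition site; blunt cut between GTT and AAC


def split_at_hpai(seq):
    """Split a sequence into 5' and 3' fragments at its single HpaI site."""
    index = seq.find(HPAI_SITE)
    if index == -1 or seq.find(HPAI_SITE, index + 1) != -1:
        raise ValueError("expected exactly one HpaI site")
    cut = index + 3  # position of the blunt cut within the site
    return seq[:cut], seq[cut:]


def make_chimeras(gene_a, gene_b):
    """Exchange 5' fragments of two genes at their HpaI sites.

    Returns (5' of A + 3' of B, 5' of B + 3' of A), analogous in spirit to
    the 5N6C and 6N5C constructs.
    """
    a_five, a_three = split_at_hpai(gene_a)
    b_five, b_three = split_at_hpai(gene_b)
    return a_five + b_three, b_five + a_three


# Toy sequences (hypothetical), each with one HpaI site.
toy_a = "ATGAAAGTTAACCCCTAA"
toy_b = "ATGGGGGTTAACTTTTAA"
print(make_chimeras(toy_a, toy_b))
```

Because HpaI leaves blunt ends, the junction in each chimera regenerates the GTTAAC site, which is why the sketch simply concatenates the fragments.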
After sequence confirmation, each chimeric ALS gene was released by BglII-XhoI digestion and subcloned into pADH1 as above. S. cerevisiae S150-2B was then transformed with these constructs, and expression was verified by II.

Magnetic Bead Assay -
Als5p was originally identified by virtue of the protein's ability to induce agglutination of fibronectin-coated beads when expressed on the surface of S. cerevisiae (10). We therefore tested S. cerevisiae strains transformed with ALS5, ALS6, 5N6C and 6N5C for fibronectin bead adherence using this methodology (10,11). Briefly, tosylated magnetic beads (Dynal Biotech) were coated with fibronectin following the manufacturer's instructions.
Next, 10 µl of coated beads (approximately 10 cerevisiae in 1 ml of 1X Tris-EDTA (TE) buffer pH 7.0, and incubated with gentle mixing for 45 minutes. The tubes were placed in a magnet to separate beads and adherent S. cerevisiae from non-adherent organisms. The supernatant containing non-adherent organisms was removed by aspiration and the remaining beads were washed three times by resuspending in 1 ml of TE buffer, followed by magnetic separation and aspiration of the supernatant. Finally, the washed beads and adherent organisms were resuspended in 100 µl of TE buffer and examined microscopically for co-agglutination.
Invasion Assay -
The ability of Als proteins to mediate endothelial cell invasion was determined using a modification of our previously described differential fluorescence assay (24). Briefly,

conformation (27). This approach included the BLASTP2 algorithm (28) to search for primary sequence similarities in the ExNRL-3D database. In parallel, the dynamic sequence alignment algorithm SIM (29) was used to select candidate templates with greatest sequence identity.
Subsequently, ProModII was used to conduct primary and refined match analyses. Resulting proteins were used as templates for homology modeling of Als protein backbone trajectories. Finally, the physicochemical properties of the Als models were visualized by MOLCAD (44), as implemented in SYBYL and HINT platforms (45), such that the physical properties were projected onto the water-accessible surface of the Als N-terminal domains.

Determination of the Structure of the N-terminus of Als1p -
To test the hypotheses generated by our homology modeling, we determined the structural features of the N-terminal domain of Als1p using the complementary approaches of CD and FTIR spectrometry. This protein, encompassing amino acids 17-432 of Als1p, was produced in S. cerevisiae and has been described previously (8).

Cloning of ALS Family Members and Expression in S. cerevisiae -
3, 5, 6, 7, and 9 were successfully amplified and expressed in S. cerevisiae. ALS mRNA expression was detected by Northern blot analysis for each construct (data not shown). Despite the use of three sets of primers, we were unable to amplify ALS2 and ALS4 from genomic DNA of C. albicans SC5314. Given the difficulty of sequencing and assembling across the tandem repeats of ALS genes, it is possible that this outcome reflects errors in the sequence assembly currently available on the published genome database.
Flow cytometry confirmed that each of the Als proteins was expressed on the surface of its respective S. cerevisiae host. Two distinct antisera demonstrated that all of the Alsp-expressing strains exhibited at least a four-fold increase in fluorescence when compared with S. cerevisiae transformed with the empty plasmid (Table II). Consistent with the predicted structural diversity among members of the Als family, the antisera displayed differences in recognition of individual Als expression strains.

Table II, such differences are unlikely to be responsible for the substrate-specific binding patterns found in this study. Such a global increase or decrease in the amount of Als protein expressed on the cell surface would be expected to produce a commensurate increase or decrease in adherence across all substrates, and not result in the substrate-specific differences that were observed.

Als Proteins are Homologous to Adhesins and Invasins of the Immunoglobulin Superfamily -
As an initial step in the molecular modeling of Als proteins, we used a knowledge-based search algorithm to identify molecules that share significant structural similarity with Als family members. These models indicate that the N-terminal domains of all Als proteins contain multiple anti-parallel β-sheet domains, consistent with members of the immunoglobulin superfamily (Table III). These proteins typically consist of complex 7-stranded anti-parallel β-sheet domains, from which project loop/coil structures. The β-sheet domains are separated from one another by interposing regions. This structure is often referred to as a beads-on-a-

(Table III). Different patterns of similarity were observed among the Als proteins analyzed. For example, all Als proteins examined, except Als7p, shared significant homology with collagen-binding protein of S. aureus. However, the specific primary, secondary and tertiary homologs varied for most family members (Table III). Indeed, only Als2p and Als9p shared an identical primary, secondary and tertiary homolog.

displayed more than 30% consensus identity and more than 50% consensus conservation across Als proteins. An identity plot and schematic alignment of these amino acid sequences comprising the N-terminal domains (residues 1-420) of Als proteins with known function is presented in Figure 4A and B. Importantly, homology modeling revealed that the HVRs of different Als proteins, while distinguishable in sequence, are predicted to conform to similar loop/coil structures that project from the β-sheet components of the CRs. Thus, these HVRs are likely available to interact with host constituents.

Taken together, the FTIR and CD data strongly support our hypothesis that the N-terminus of Als1p contains predominant domains of anti-parallel β-sheet structure containing minor α-helical and turn components, interposed by less structured regions.
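The notion of consensus identity across aligned Als N-terminal sequences, quoted above, can be illustrated with a minimal per-column calculation. This is a sketch under assumptions, not the method used in the study: the function names, the 0.3 cutoff applied per column, the treatment of every character (including any gap symbol) as a residue, and the toy alignment are all hypothetical.

```python
from collections import Counter


def column_identity(aligned):
    """Return, for each alignment column, the fraction of sequences that
    share the most common residue in that column."""
    if not aligned:
        raise ValueError("alignment must contain at least one sequence")
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for column in zip(*aligned):
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(most_common_count / len(column))
    return scores


def variable_columns(aligned, threshold=0.3):
    """Indices of columns whose identity is at or below the threshold
    (an illustrative cutoff for flagging hypervariable positions)."""
    return [i for i, score in enumerate(column_identity(aligned))
            if score <= threshold]


# Toy alignment (hypothetical): conserved ends, variable middle.
toy_alignment = ["MKTA", "MRSA", "MQGA", "MHCA"]
print(column_identity(toy_alignment))
print(variable_columns(toy_alignment))
```

A run of consecutive low-identity columns in such a profile would correspond to a candidate hypervariable region, while stretches of high identity would correspond to conserved regions.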

Three-dimensional Models Suggest Physicochemical Distinctions among Als N-terminal Domains -
Molecular models indicated differences in predicted physicochemical attributes of the N-terminal domains of Als proteins that likely influence their interactions with host cells and substrates. Als proteins appear to be separable into three distinct groups based on surface distributions of hydrophobicity, charge, and hydrogen-bonding potential (Figure 6). Als1p, Als3p, and Als5p each share similar patterns of these properties, and thus are considered the Als-group-A. In contrast, the predicted physicochemical properties of Als6p and Als7p N-terminal domains (Als-group-B) have striking differences from those of the Als-group-A (Figure 6).
While the cationic potential in Als-group-A members is typically segregated from their neutral or anionic facets, positive charge is broadly distributed across the entire surface of the Als-group-B members Als6p and Als7p. Finally, the N-termini of Als2p, Als4p and Als9p appear to constitute a third group of Als proteins (the Als-group-C) that differ structurally from either the Als-group-A or -B proteins. The Als-group-C proteins would appear to be more similar to the Als-group-A than Als-group-B proteins in terms of hydrophobic or electrostatic distribution.

Discussion
Several proteins with adhesive function have been identified in C. albicans. Hwp1p has been shown to mediate adherence to buccal epithelial cells by acting as a substrate for mammalian transglutaminase (5). EAP1 was recently identified by heterologous expression in S. cerevisiae, and mediates adherence to polystyrene and renal epithelial cells in vitro (7). Of the 8 members of the Als protein family, only Als1p and Als5p have been studied from a functional perspective.
Heterologous expression of Als1p has been shown to mediate binding to human vascular endothelial cells and epithelial cells, a finding which has been confirmed in C. albicans through gene disruption studies (8,9). Heterologous expression of ALS5 in S. cerevisiae confers adherence to collagen, fibronectin, bovine serum albumin and laminin (10,11,56). No large-scale comparison of the substrate specificities of C. albicans adhesins has been performed. In this study, we compared the adhesive properties of a structurally diverse group of Als protein family

In addition to mediating adherence, our data suggest that Als proteins may also function as invasins. Interestingly, while both Als1p- and Als3p-expressing S. cerevisiae demonstrated similar endothelial cell adherence, Als3p-expressing S. cerevisiae underwent internalization at a much higher rate. These results suggest the intriguing hypothesis that endocytosis is not simply an extension of adherence, but rather a distinct process that can be influenced by the ligand-receptor interaction. It is possible that differences in N-terminal sequences in Als proteins mediate these distinct functions, as is the case with adherence. However, we were unable to test this hypothesis using chimeric Als5/6 proteins since adherence is likely a prerequisite for invasion, and non-adherent chimeric proteins would by definition fail to mediate invasion.

These results show significant predictive correlation with circular dichroism studies of Als5p (60) indicating that the N-terminal domain of Als5p is characterized by a relative predominance of anti-parallel β-sheet and loop/coil regions. Thus it is highly likely that all members of the Als protein family exhibit this overall structure. Importantly, our structural data are also consistent with our homology models that suggest that many of the HVRs correspond to the flexible loop/coil structures projecting from β-sheet domains in the N-termini of distinct Als proteins.
We hypothesize that these structures are integral to substrate-specific binding by Als proteins (Figure 7). Consistent with our data, analogous regions of mannose binding lectin, α-agglutinin, and other members of the immunoglobulin superfamily appear to confer substrate binding specificity (61,62). Furthermore, mutations of these variable loop regions significantly alter substrate binding in these homologous proteins (63,64).

distribution, and electrostatic potential. Yet, even within this group, specific physicochemical distinctions exist that may govern functional differences within the group (Figure 6). In contrast, Als proteins with reduced adhesive capacity have surface features predicted to be distinct from the Als-group-A proteins in multiple physicochemical properties, including hydrophobicity and electrostatic potential. It is highly likely that the aggregate effects of differences in these structural features confer the specific functional properties of distinct Als proteins.
Extensive genetic variability has been demonstrated within the ALS gene family. Sequence variation in specific ALS genes of different isolates of C. albicans has been observed (19,60), and not all members of the ALS family are present in all isolates. Significant sequence divergence has even been found between two different alleles in a single isolate (16,19). This degree of genetic variability suggests that these genes may undergo rearrangement or mutation at a relatively high frequency. Such a mechanism would provide the organism with the ability to generate the high degree of structural and functional diversity demonstrated in this study.
Indirect support for this hypothesis is provided by a recent study of allelic variation of ALS7, which suggested both that this gene is hypermutable and that these mutations are subject to selective pressure (19).
Collectively, these studies suggest an analogy between antibodies and Als proteins at both the structural and functional level.

brown; most hydrophilic, blue; electrostatics - (spectral continuum) most positive charge (+10 kcal/mol), red; most negative charge (-10 kcal/mol), blue; hydrogen-bonding potential - donor, red; acceptor, blue. Als proteins are distinguishable into 3 groups based on the composite of these properties. For example, note the similar hydrophobic, electrostatic, and hydrogen-bonding profiles among Als-group-A proteins, Als1p, Als3p, and Als5p. In contrast, Als-group-B members, Als6p and Als7p, display striking differences in hydrophobic and electrostatic features from those of Als-group-A. In addition to biochemical profiles, note the differences in predicted structure among these domains.