Introduction

Dendritic cell-specific intercellular adhesion molecule (ICAM)-3 grabbing non-integrin (DC-SIGN, CD209) and liver/lymph node-specific ICAM-3 grabbing non-integrin (L-SIGN; encoded by CLEC4M, also known as DC-SIGNR for DC-SIGN related or CD209L) are closely related genes whose products directly recognize a wide variety of pathogens. As homologues, they share 77% amino-acid identity [1–3]. Located in a head-to-head orientation within 30 kb of each other on chromosome 19p13.2–3, they are thought to have arisen through an ancient gene duplication event [4, 5].

Both DC- and L-SIGN are trans-membrane C-type lectins, calcium-dependent carbohydrate-binding proteins that can act as cell-adhesion and pathogen-recognition receptors. The extra-cellular domain of both DC- and L-SIGN consists of an extended neck region, which contains tandem repeats of a highly conserved 23-amino-acid sequence, followed by a C-terminal C-type carbohydrate-recognition domain (CRD) [6, 7], which complexes carbohydrates of high-mannose type [4]. This neck region plays a crucial role in the oligomerization and support of the CRD, thus influencing the pathogen-binding properties of these two receptors. As pathogen-recognition receptors, both lectins recognize a wide range of micro-organisms, some of which have a major impact on public health. For example, DC-SIGN captures viruses such as human immunodeficiency virus 1 (HIV-1) [6, 8], Ebola virus [9–12], Hepatitis C virus (HCV) [13–16], Dengue virus [17], Cytomegalovirus [18], and SARS coronavirus (SARS-CoV) [12, 19]; bacteria such as Mycobacterium tuberculosis [20, 21] and Helicobacter pylori [21, 22]; and parasites such as Leishmania pifanoi [21, 23]. L-SIGN is able to capture viruses such as HIV-1 [2, 6, 24], Ebola virus [9–12], Hepatitis C virus [13–16, 25], and more recently SARS-CoV [12, 26], as well as M. tuberculosis [27] and Leishmania infantum [23].

Both DC- and L-SIGN had previously been thought to be very similar in overall structure, ligand-binding characteristics, and function, but there is now mounting evidence to the contrary. The most notable difference lies in the tandem-neck-repeat region. Whilst DC-SIGN presents a constant size in the tandem repeats, L-SIGN demonstrates considerable polymorphism in this neck region [24]. This tandem-neck-repeat region appears to be important for homo-oligomerization of L-SIGN on the cell surface, bringing the CRDs into proximity for high-affinity ligand binding [7, 24]. It has been suggested that heterozygous expression of polymorphic variants of L-SIGN, in which neck lengths differ, may prevent the formation of hetero-oligomers and thus lead to a reduced ligand-binding affinity [3]. Indeed, a genetic-risk association study showed that L-SIGN tandem-repeat homozygosity is associated with a reduced risk for SARS-CoV infection [28], supported by in vitro observations that homozygous, but not heterozygous, L-SIGN reduces the final total viral titer in co-cultures with permissive cells. A demographic study of this variation in tandem-neck-repeat lengths, which included 52 worldwide populations, suggested that this neck region may be a functional target of selective pressure imposed by infectious agents [29]. Recent biochemical and structural studies also show that DC- and L-SIGN have distinct ligand-binding properties and very different physiological functions. Differences are also seen in their tissue distribution. This review will outline the similarities and differences in structure, ligand-binding characteristics, and function between DC- and L-SIGN. The genotypic differences between the two genes and genotype variation in the L-SIGN tandem-neck-repeat region among different populations worldwide will be discussed in relation to disease association.

Structural studies of DC- and L-SIGN

The structural organization of DC- and L-SIGN can be divided into an intra-cellular domain, a trans-membrane domain, which anchors the proteins onto the cytoplasmic membrane, and an extra-cellular domain, which consists of a neck region supporting the CRD (Fig. 1a).

Fig. 1

Structure of DC- and L-SIGN. a The genetic structure and functional domains of these two genes. Boxes: exons; grey-shaded boxes: 5′ and 3′ untranslated regions; striped box: tandem repeats of exon 4. b The neck-region repeats of L-SIGN. The half-repeat is located at the N-terminal end of the neck-region repeats. Different numbers of neck-region repeats result in different lengths of L-SIGN proteins. One complete wave curve represents one repeat, and the repeats are alternately shaded. For variants containing 4.5 and 5.5 repeats of L-SIGN, the first half of the fourth repeat, which contains the same sequence as the first half of the sixth repeat (as illustrated in c), joins with the second half of the sixth repeat, resulting in a complete repeat. Similarly, the first half of the first repeat joins with the second half of the second repeat. Thus, the first and last two repeats of L-SIGN are conserved. c The DC- and L-SIGN neck-region amino-acid sequences illustrating 7.5 repeats. The conserved hydrophobic heptad positions are highlighted in grey. Arrows indicate sites of subtilisin proteolytic digestion. Boxed sequences are the first half of each repeat in the neck region (adapted from Feinberg et al. [34])

Intra-cellular domain

The intra-cellular N-terminal domain located in the cytoplasm is responsible for signaling and contains several internalization motifs, which suggests a function of DC-SIGN as an endocytic receptor. The di-leucine motif essential for internalization through DC-SIGN [30] and the tri-acidic cluster of DC-SIGN important for targeting to proteolytic vacuoles [31] are both conserved in L-SIGN [32], suggesting that L-SIGN may also function as an internalization receptor. On the other hand, the tyrosine-based motif is present in DC-SIGN but absent in L-SIGN [32].

DC-SIGN is thought to have dual ligand-binding properties of adhesion, as well as mediation of endocytosis, acting as a recycling endocytic receptor releasing ligand at endosomal pH. L-SIGN, in contrast, appears only to have properties of an adhesion receptor. Binding to L-SIGN is not reversible at low pH as L-SIGN expressed in fibroblasts does not release ligand at low pH [33].

CRD

The CRDs are flexibly linked to the neck regions that project the CRDs ∼25 Å from the cell membrane and allow the CRDs to interact with high-mannose carbohydrates [34] that are presented with variable spacing on viral envelope surfaces. The CRD of L-SIGN has 84% amino-acid homology with that of DC-SIGN. The CRDs of both DC- and L-SIGN contain a highly conserved EPN sequence motif essential for recognizing mannose-containing structures [35]. An important difference in this extra-cellular portion is the replacement of Val351 in DC-SIGN with Ser363 in L-SIGN, resulting in elimination of the van der Waals interaction with the 2-OH group of fucose. This single amino-acid difference in the CRD accounts for the markedly different ligand-binding characteristics of DC- and L-SIGN. Although both receptors bind to N-linked high-mannose oligosaccharides, DC-SIGN binds preferentially to fucose whilst L-SIGN binds preferentially to mannose. DC-SIGN, but not L-SIGN, binds to blood group antigens, including those present on micro-organisms [33].

The CRDs exist as monomers, which can bind with high affinity to high-mannose N-linked oligosaccharides [6, 7]. Whilst the specificity towards particular carbohydrate structures is a property of the CRDs, it is the tetramers that provide high-avidity binding, which allows tight binding to high densities of glycoconjugate provided by pathogens. Mitchell et al. [7] showed that full-length receptors form tetramers through their neck-repeat domain.

Neck domain

The neck region, or repeat domain, in the majority of the population, contains seven complete repeats and one incomplete repeat of a 23-amino-acid sequence. The repeats fold in an α-helical conformation interspersed with non-helical regions, which form extended stalks, stabilized by lateral interactions of the α-helical regions [34]. Each repeat contains a proline residue, and the first half of each 23-amino-acid repeat displays hydrophobic residues spaced at regular intervals. Oligomerization of the neck region has been shown to play an important role in the recognition of endogenous glycans. In particular, the formation of tetramers has been shown to be important for high-avidity binding to multivalent ligands [7, 34, 36–38]. The neck regions are essential for lectin tetramerization [7, 34, 36, 38].

Ligand binding

ICAMs are the presumed physiological ligands of DC- and L-SIGN. Evaluating the binding affinity of ICAM-1, ICAM-2, and ICAM-3, Snyder et al. [38] found, surprisingly, that only ICAM-3 showed detectable binding to L-SIGN but at submicromolar affinities. This binding is mediated by the CRD, with the carbohydrate of ICAM-3 being the receptor-binding epitope.

  1. Tetramer formation determines high-avidity binding. Whilst the structure of each CRD determines the preference of the receptor for particular carbohydrate structures, tetramer formation determines binding avidity, which allows tight binding to dense arrays of carbohydrate ligand presented by target pathogens. Using this tetramer model, Snyder et al. [36] developed a ligand-recognition index to identify potential receptor ligands for DC-/L-SIGN based on their predicted gross glycosylation density. Moreover, Bernhard et al. [37] showed increased gp120 binding affinity of tetrameric DC-/L-SIGN compared with the monomeric form.

  2. Multimerization status is determined by the number of tandem-neck repeats. It has been suggested that the number of repeats determines the multimerization status [34, 38]. In a series of recombinant soluble receptors varying in the number of repeats, Snyder et al. [38] showed that different truncated receptors exist in different oligomeric forms. Different repeat lengths were also found to bind with different affinities to gp120, with better binding for tetrameric forms than for shorter monomeric forms. Feinberg et al. [34] demonstrated that a protein with two fewer repeats (five-repeat allele) results in partial dissociation of the final tetramer, whilst a protein with fewer than five repeats exhibits a dramatic reduction in overall stability, which directly impacts the quality of ligand-binding functions.

While DC-SIGN presents predominantly with full-length tandem repeats, L-SIGN has considerable polymorphism in this neck region [24], ranging from four to ten repeats. By screening a panel of human genomic DNA, it was shown that the tandem-neck length variation of L-SIGN corresponded to the absence of specific repeats in the middle of the neck region [34]. The C-terminal two repeats adjacent to the CRD are present in all cases, whilst the N-terminal two repeats were also found to be conserved (Fig. 1b,c). From these observations, Feinberg et al. [34] suggested that the former repeats may be important in dimer formation, whereas the latter repeats may be important in the formation of stable tetramers. Guo et al. [39] also demonstrated that L-SIGN with 4.5 repeats failed to form stable tetramers and suggested that this was either because a minimum number of repeats is needed to stabilize tetramers or because the second repeat, absent from this form, is critical for oligomerization. As the five-repeat form is partially dissociated at low protein concentrations, at least 5.5 neck repeats may be necessary to stabilize the tetramer even when the second repeat is present. In contrast, L-SIGN with 5.5 or more repeats can assemble into stable homo-oligomers, as well as into hetero-oligomers with the more common 7.5-repeat form, suggesting that stable hetero-tetramers may be assembled efficiently from L-SIGN polypeptides of different lengths.

Neck length polymorphisms may also affect interaction with viruses in other ways. Shortening of the neck region could significantly change the spatial projection of the CRDs relative to the cell surface. However, how the hetero-oligomers are assembled remains speculative. It has been suggested that they may exist as two homo-dimers that associate at the N-termini but diverge further from the membrane, with the CRDs projecting at different distances from the membrane, which may affect accessibility for binding with viruses. This underscores the importance of investigating the effect of the homo-/heterozygous genotype of L-SIGN on pathogen-binding affinity, which will be discussed in a later section.

Pathogens recognized by DC- and L-SIGN

HIV

DC-SIGN, highly expressed on DCs present in mucosal tissues, was first shown by Geijtenbeek et al. [8] to bind to the HIV-1 envelope glycoprotein gp120 and to promote efficient infection in trans of cells that express CD4 and chemokine receptors. DC-SIGN is not a functional receptor for HIV infection but functions to efficiently capture HIV-1 from the periphery to facilitate its transport to secondary lymphoid organs rich in T cells to enhance infection in trans of the CD4+ target cells. They also showed that DC-SIGN interacts with ICAM-3 expressed on T cells, contributing to the close interaction between DCs and T cells required for efficient antigen presentation [1]. Indeed, Feinberg et al. [6] showed that both DC- and L-SIGN bind to oligosaccharides that are present on the envelope of HIV. Bashirova et al. [24] also showed that L-SIGN has an affinity for ICAM-3 similar to that of DC-SIGN and can capture HIV-1 through gp120 binding and enhance HIV-1 infection of T cells in trans.

HCV

Using HCV envelope glycoproteins, Lozach et al. [15] demonstrated that DC- and L-SIGN are high-affinity-binding receptors for hepatitis C virus glycoprotein E2. Further investigation, however, was hampered by the lack of a suitable cell-culture system for in vitro propagation of HCV. With the use of retroviral pseudotypes bearing HCV glycoproteins (i.e., HCVpp) [40], similar binding was subsequently demonstrated [13, 15, 16], with transmission of pseudovirus to adjacent hepatocytes [13, 15]. The binding of naturally occurring HCV present in the sera of infected individuals to L- and DC-SIGN [14] has also been shown.

SARS-CoV

Angiotensin-converting enzyme-2 (ACE2) is the only known functional receptor for SARS-CoV infection. The SARS-CoV and the HIV envelope proteins share sequence motifs, including an N-terminal leucine/isoleucine zipper-like sequence, a C-terminal heptad repeat, and a trans-membrane segment that constructs their active conformation [41], suggesting similarities in their oligomeric structure that might give rise to similarities in how their high-mannose oligosaccharides are displayed. This suggests that DC- and L-SIGN could be likely candidates to interact with SARS-CoV. Indeed, Chan et al. [28] demonstrated that L-SIGN binds to SARS-CoV and facilitates trans infection of SARS-CoV. Binding of SARS-CoV to L-SIGN, however, leads to proteasome-dependent viral degradation rather than productive viral replication. After exposure to SARS-CoV for 1 h, a viral binding assay showed increased viral genomic copy numbers in L-SIGN transfectants compared with mock transfectants, but the total viral copy number decreased substantially afterwards. They also demonstrated that L-SIGN facilitates trans but not cis infection of SARS-CoV. L-SIGN expressed on permissive Vero E6 cells (which express ACE2) does not enhance SARS-CoV replication but rather retains the virus in a cell-associated form, as evidenced when cell lysates and supernatants were analyzed separately for the distribution of viral copy number. Blockade of L-SIGN with antibodies resulted in an increase of viral titre in the supernatants, with a reciprocal decrease in the lysates. The process could be reversed at least in part by treatment with EGTA, a calcium-chelating reagent. Thus, infectious viruses that would have been liberated after replication in permissive cells would instead be captured through binding to L-SIGN. Capture, absorption, and degradation of the released viruses could reduce the levels of infectious virions in the local environment, thus limiting subsequent viral spread to other permissive cells for further infection.

Yang et al. [19] showed that SARS-CoV transmission can occur through cell-mediated transfer by dendritic cells through DC-SIGN. Myeloid dendritic cells could not be infected by SARS-CoV, but were able to bind SARS-CoV spike (S) glycoprotein and transfer the virus to susceptible target cells. Marzi et al. [12] showed that DC- and L-SIGN enhanced infection mediated by the SARS-CoV spike glycoprotein. Infection was measured as the level of luciferase activity in the cell lysates of DC- and L-SIGN-expressing T-Rex cells. Consistent with Yang et al., they also showed that DC-SIGN does not function as a receptor for the virus but facilitates transmission to susceptible cells.

Others

DC- and L-SIGN have been shown to act as cofactors for cellular entry of Ebola virus [9], functioning as trans receptors in transmitting virus to susceptible cells, increasing Ebola-virus infectivity [10–12]. Similarly, they function as attachment receptors for Sindbis virus, an arbovirus of the Alphavirus genus [42]. DC- and L-SIGN interact with mycobacteria through mycobacterial mannosylated lipoarabinomannan [27]. For DC-SIGN, this prevents DC maturation and induces the immuno-suppressive cytokine IL-10, which may contribute to survival and persistence of M. tuberculosis. L-SIGN but not DC-SIGN acts as a receptor for L. infantum, the parasite responsible for visceral leishmaniasis [23], whereas DC-SIGN but not L-SIGN recognizes L. pifanoi, which causes cutaneous leishmaniasis.

Tissue localization of DC- and L-SIGN

DC-SIGN is highly expressed in monocyte- and CD34+-derived DCs and in subsets of immature and mature DCs at various sites. In contrast, L-SIGN is expressed neither by DCs nor by monocyte-derived in-vitro-cultured DCs. The tissue distribution of DC- and L-SIGN-expressing cells so far reported by immunohistochemistry is summarized in Table 1. Furthermore, in situ hybridization studies by Chan et al. [28] have demonstrated that L-SIGN is expressed in the lung both in cytokeratin-positive alveolar epithelia and in a subset of cells that co-expressed ACE2 but were negative for cytokeratin.

Table 1 The tissue distribution of DC- and L-SIGN expressing cells identified by immunohistochemistry

Alternative splicing resulting in DC- and L-SIGN transcript isoforms

Exons can be spliced or skipped to generate mRNA structures that can take many different forms. In addition to genomic polymorphism of the L-SIGN neck-repeat region, alternative splicing events also occur in DC- and L-SIGN, which generate a large repertoire of transcripts that are predicted to encode membrane-associated and soluble isoforms [43]. DC- and L-SIGN transcripts lacking the trans-membrane anchoring region encoded by exon 3 are predicted to encode soluble isoforms that may function as intracellular molecules or may be secreted [43, 44]. Alternatively spliced forms of L-SIGN have been identified [44] in vaginal and rectal mucosal samples, many of which lacked exon 3 and were present at higher transcript levels than full-length L-SIGN. In contrast, the full-length form of DC-SIGN was found to be more abundant than its isoform. These soluble isoforms may modulate the efficiency of HIV transmission and dissemination, as well as that of other infections.

Internalization of ligands

HIV

Engering et al. [30] showed that DC-SIGN is rapidly internalized upon binding of soluble ligand. DC-SIGN–ligand complexes were targeted to late endosomes, and ligand binding was reduced at acidic pH. Upon internalization into acidic organelles, ligands can be dissociated, allowing recycling of DC-SIGN to the cell surface, as also demonstrated by Kwon et al. [45] and Bernhard et al. [37] for HIV. DC-SIGN mediates rapid internalization of intact HIV into a low pH non-lysosomal compartment, and the internalized virus retains competence to infect target cells. Removal of the DC-SIGN cytoplasmic tail reduces viral uptake and abrogates the trans-enhancement of T-cell infection. The virus internalized into endosomal compartments evades host immune surveillance and infects target T cells when DCs conjugate with them [46]. Snyder et al. [38] showed that the DC-SIGN oligomer was stable at the lower pH and suggested that reduced gp120 binding at low pH was likely to be due to loss of bound functional Ca2+ in the CRD. Moreover, they found that ICAM-3 was less unique as a ligand to DC-SIGN than any other cell surface glycoprotein, which contradicts the proposed function of DC-SIGN as an adhesion receptor to promote cellular contact between T cells and DCs. They thus argue that the function of DC-SIGN is specifically to capture low levels of glycan-containing antigens, which are then delivered to major histocompatibility complex loading compartments for processing. Their findings support the function of DC- and L-SIGN as antigen-capturing receptors that bind antigens at the cell surface, internalize them to a low pH endosomal compartment, and then release them for degradation and loading onto major histocompatibility complex molecules.

HCV

Ludwig et al. [47] showed that DC- and L-SIGN are receptors for HCV envelope glycoproteins E1 and E2. Moreover, HCV virus-like particles (truncated and secreted versions of HCV envelope glycoproteins) are efficiently captured and internalized by DCs through binding of DC-SIGN. By immunofluorescent microscopy, HCV virus-like particles were shown to interact with liver sinusoidal endothelial cells (LSECs). The internalized HCV virus-like particles were targeted to early endosomes, non-lysosomal compartments where they are protected from lysosomal degradation in a manner similar to that demonstrated for HIV-1. This suggests that L-SIGN on LSECs may capture HCV from blood, with these cells acting as reservoirs for transmission to hepatocytes, the primary target cells that are productively infected by HCV.

More recently, Lai et al. [48] isolated and cultured LSECs. Both DC- and L-SIGN are expressed on LSECs in vivo and bind to HCV E2 glycoproteins, but are unable to support HCV entry either by co-culture with pseudotypic particles (HCVpp) or with FL-J6/JFK virus-infected Huh 7.5 liver cells. Infectivity, however, could be transferred to permissive hepatoma cells, analogous to DC-SIGN presenting HIV to T cells. Moreover, stimulation of LSECs with IL-4 increases expression of DC- and L-SIGN, promoting HCV E2 binding. Experiments, however, show that internalization of ligands is dependent on the cell type and the ligand itself [47]. For L-SIGN expressed on K562 cells, internalization of HCV particles was to the lysosomal compartment, whereas for THP-1 cells, internalization was to early endosomes. The function of DC-SIGN in HIV-1 transmission also appears to be dependent on its cellular context. Only DCs and the THP monocyte cell lines, but not 293 and HOS cell lines, were able to use DC-SIGN to retain HIV-1 in a highly infectious state for several days. Moreover, HIV-1 virions were not retained in endosomal compartments or in lysosomes but in undefined vesicles [49]. Lewis X antigen, another ligand of DC-SIGN, was shown to be internalized to lysosomes, demonstrating that the internalization pathway of DC-SIGN-captured ligands may depend on the structure of the ligand [47].

SARS-CoV

In a study of L-SIGN expression in fatal SARS patients, Chan et al. [28] demonstrated that L-SIGN(+) cells could be ACE2bright or ACE2dim/−. The L-SIGNbright ACE2dim/− cells, which expressed cytokeratin, may be a subset of bronchiolar or alveolar epithelia. However, in the fatal SARS-CoV-infected lung, these cells showed no co-localization of SARS-CoV antigen. In contrast, it was the subset of cytokeratin-negative L-SIGNbright ACE2bright cells that showed co-localization of SARS-CoV antigen in fatal SARS patients. These L-SIGNbright ACE2bright cells could also be found in samples from some non-SARS individuals. More recently, they have shown that these L-SIGN-positive, cytokeratin- and surfactant-negative SARS-infected cells also co-express the stem/progenitor cell markers CD34 and Oct-4. Moreover, these putative lung stem/progenitor cells could also be identified in some non-SARS individuals and can be infected ex vivo by SARS-CoV [50].

Effect of homo-/heterozygous genotype of L-SIGN on pathogen-binding affinity

Gramberg et al. [51] showed that L-SIGN alleles with five or six repeats are capable of complexing the viral glycoproteins gp120 of HIV-1, the GP1 subunit of Ebola virus, and the S1 protein of SARS-CoV. Experiments were conducted on transiently transfected 293T cells, and the luciferase activity of cells inoculated with infectivity-normalized luciferase-reporter viruses was measured in cell lysates, which showed enhancement of trans infection of, at least, HIV-1. Under conditions of high expression, the co-expression of wild-type alleles and alleles with five repeats did not appear to alter the infectivity of HIV and Ebola virus.

In a series of in vitro experiments [28], using cell transfectants expressing L-SIGN with homozygous seven tandem repeats (N7), homozygous five tandem repeats (N5), or heterozygous expression of both seven and five repeats (N7/N5 co-transfectant), it was demonstrated that compared with heterozygous L-SIGN (N7/N5), homozygous L-SIGN (N7 and N5) had a higher binding capacity for SARS-CoV and mediated more efficient viral degradation. Moreover, homozygous L-SIGN compared with heterozygous L-SIGN also showed a lower ability for trans infection and an increased cell association for SARS-CoV. Indeed, the difference in viral copy numbers between lysates and supernatants became obvious and of relevance when homozygous L-SIGN was compared with heterozygous L-SIGN. It must be noted that the separate measurement of lysates and supernatants probably explains the apparent difference from the observations of Marzi et al. and Gramberg et al., who measured luciferase activity as an indication of infectivity from cell lysates only. Interestingly, however, no significant difference was noted in the cis infection capacity of homozygous (N5/N5 or N7/N7) L-SIGN when compared with the heterozygous (N5/N7) form [28]. Furthermore, viral copy number was significantly lower in the supernatants of homozygous L-SIGN/Vero E6 cultures than in heterozygous L-SIGN/Vero E6 cultures and mock-transfected Vero E6 cultures, whilst the reverse was noted in the lysates (Supplementary Fig. 1). Moreover, serially titrated supernatants harvested from homozygous L-SIGN/Vero E6 cells and used to infect fresh Vero E6 cells resulted in a lower final viral genomic copy number than infection with supernatants from infected heterozygous L-SIGN/Vero E6 cells and mock transfectants. Thus, it is likely that a higher binding capacity may lead to more efficient internalization and degradation, thereby resulting in a lower ability for trans delivery of the virus. The outcome of SARS-CoV infection in vivo is more likely to be a combination of L-SIGN-mediated capture, internalization, and degradation, as well as trans delivery of virus to permissive cells. An experiment in a closed system, in which no wash procedures were performed so as to maintain a constant titer for infection, indeed demonstrated that co-cultures of permissive cells with homozygous, but not heterozygous, L-SIGN-expressing cells resulted in significantly lower final viral copy numbers. Thus L-SIGN expressed on permissive cells (i.e., the L-SIGNbright ACE2bright cells identified in human lung) may capture/sequester infectious viruses that have been liberated after replication. Capture, internalization, and degradation of the released viruses through binding to homozygous L-SIGN would result in a lower amount of infectious virions in the local environment, which is likely to contribute to the reduced susceptibility to SARS infection (Fig. 2). These findings were further supported by a genetic-risk association study, which confirmed that individuals homozygous for L-SIGN tandem repeats were less susceptible to SARS infection [28], as detailed in the next section.

Fig. 2

Homo- and hetero-oligomerization of N7 and N5 of L-SIGN, respectively, with SARS-CoV binding, viral internalization, and degradation in the L-SIGN-expressing cells. The homo-oligomer of N7 was found to bind SARS-CoV more strongly than the hetero-oligomer of N7/N5. The viruses were internalized via L-SIGN. The homo-oligomers were found to have higher efficiency of viral degradation than the hetero-oligomers [28]. N7 and N5 correspond to the 7.5 and 5.5 neck-region repeats of L-SIGN according to Feinberg et al. [34]

Genetic association studies for infectious susceptibility

Tandem-neck repeats of L-SIGN

The L-SIGN neck-region tandem repeats have been investigated by risk association study for various infectious diseases such as SARS-CoV, HIV-1, HCV, and M. tuberculosis (TB).

SARS-CoV

A genetic-risk association study on 285 confirmed SARS patients and a total of 842 controls by Chan et al. [28] showed that individuals homozygous for L-SIGN neck-region repeats were less susceptible to SARS-CoV infection, with an overall association p = 0.005 and overall OR = 0.649. Moreover, the analysis also showed that it was the homo- or heterozygosity per se, and not any specific allele or genotype constituting the homo- or heterozygosity, that influenced susceptibility to SARS infection. Tang et al. [52] reported being unable to replicate this finding. However, the control groups they used were very poorly matched by age for comparison with SARS patients, given the demonstrable age-related selection for L-SIGN genotypes [53]. Moreover, there was little sub-population structure in the L-SIGN homo/heterozygote distribution between the northern (Beijing control samples of Tang et al.) and southern Chinese populations (random controls of Chan et al.). Indeed, in a meta-analysis of the datasets of Chan et al. and Tang et al. by the Mantel–Haenszel test, using all control groups that are in Hardy–Weinberg equilibrium (n = 1,497) and all SARS cases (n = 462), totaling n = 1,959, the combined odds ratio remained significant (combined OR = 0.786, p = 0.026), indicating that a reduced risk is still associated with homozygotes, even by the approach of Tang et al. that disregarded the age effect [53].
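
The pooling step behind a Mantel–Haenszel combined odds ratio can be illustrated with a minimal Python sketch. The function names and the per-study counts below are hypothetical and chosen only for illustration; they are not the genotype counts of Chan et al. or Tang et al., and the sketch does not reproduce the significance test reported above.

```python
from typing import Iterable, Tuple

# One 2x2 table per study (stratum), laid out as
#   (homozygous cases, heterozygous cases,
#    homozygous controls, heterozygous controls)
Table = Tuple[int, int, int, int]


def odds_ratio(table: Table) -> float:
    """Crude odds ratio of being a case for homozygotes vs heterozygotes."""
    a, b, c, d = table
    return (a * d) / (b * c)


def mantel_haenszel_or(tables: Iterable[Table]) -> float:
    """Mantel-Haenszel pooled odds ratio across several 2x2 tables.

    Each stratum contributes a*d/n to the numerator and b*c/n to the
    denominator, where n is the total number of subjects in the stratum.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# HYPOTHETICAL counts, for illustration only (not taken from any study).
study_1: Table = (10, 20, 30, 40)
study_2: Table = (20, 40, 60, 80)

pooled = mantel_haenszel_or([study_1, study_2])
print(f"study 1 OR = {odds_ratio(study_1):.3f}")
print(f"study 2 OR = {odds_ratio(study_2):.3f}")
print(f"pooled OR  = {pooled:.3f}")
```

A pooled value below 1 would correspond to a reduced risk for homozygotes. Weighting each study by its size is what allows the pooled estimate to differ from a simple average of the per-study odds ratios.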

HIV

Although Lichterfeld et al. [54] failed to detect any association between L-SIGN neck-region tandem repeats and HIV-1 infection, more recently, Liu et al. [55], in a large study of 1,716 individuals from two combined cohorts, demonstrated that the homozygous 7/7 genotype was significantly associated with an increased risk of HIV-1 infection (p = 0.0015, OR = 1.87), with the heterozygous 7/5 genotype correlating with resistance to HIV-1 infection (p = 0.029, OR = 0.69). Since different ethnic groups, namely European, African, and Asian Americans, had been included within the population cohorts studied, stratification analysis was performed to address the possibility of population admixture. Indeed, higher frequencies of the 7/5 genotype were found in European and Asian Americans (28%–30%) than in African Americans (11%). When analyzed only for individuals of European descent, the association of increased risk for HIV-1 infection for homozygous 7/7 individuals remained significant in one cohort [56]. Genetic variation is also known to occur within the same ethnic group from different geographic areas [57], which may account for the differences between the cohorts in this study. Similarly, a study conducted in Thailand [58] found that the heterozygous 7/5 and 9/5 genotypes were significantly more frequent in HIV-seronegative individuals with HIV-seropositive spouses than in HIV-seropositive individuals (p = 0.037, OR = 0.57 and p = 0.023, OR = 0.38, respectively). This association was, however, observed only in females. Unlike for SARS-CoV, where homozygous L-SIGN plays a protective role, the association for HIV-1 infection appears to demonstrate a heterozygote advantage. This difference might be accounted for by the different roles L-SIGN plays in the pathogenesis of these two diseases. The precise mechanism of HIV-1 acquisition is still not completely understood and has been largely investigated with regard to DC-SIGN, which seems to play a more dominant role than L-SIGN.

HCV

Nattermann et al. [59] analyzed the L-SIGN neck-region tandem-repeat polymorphism in relation to HCV infection and replication in 430 HCV-infected patients vs 100 healthy subjects. They found no significant difference in the distribution of L-SIGN alleles and genotypes between HCV-infected patients and healthy controls. However, they observed that HCV-infected patients with five-, six-, and seven-repeat alleles had a higher HCV viral load than four- and nine-repeat allele carriers, suggesting that the neck-region repeat may influence the efficiency of HCV replication. No significant correlation was found between L-SIGN alleles or genotypes and other clinical parameters such as mean serum alanine aminotransferase, the degree of hepatic inflammation, or the degree of hepatic fibrosis.

TB

A recent study by Barreiro et al., performed on a cohort of 351 tuberculosis patients and 360 healthy controls from the South African colored population, found no significant association of the L-SIGN neck-region tandem repeat with susceptibility to TB infection [60]. Interestingly, the genotype distribution of this population differs observably from that found in the African population (Table 2). Almost two-thirds of the heterozygous individuals had a difference of only one tandem-neck repeat, namely the 6/5, 7/6, and 8/7 genotypes, whereas in other ethnic groups about two-thirds or more of heterozygous genotype carriers had a difference of two or more tandem repeats.

Table 2 L-SIGN neck-region genotype distribution in different ethnic groups—compilation of all published data

Promoter polymorphisms of DC-SIGN

Whilst the neck region of DC-SIGN is largely constant in size, polymorphism has been found in the promoter region of DC-SIGN, which may alter the level of DC-SIGN expression and thus influence susceptibility to, and/or severity of, infection.

Dengue fever

The −336G allele has been found to confer strong protection against dengue fever (p = 0.000002, OR = 0.2) [61]. This polymorphism was demonstrated to have functional significance, affecting the Sp1-like binding site and possibly the binding of other transcription factors that modulate transcriptional activity of the gene. A promoter activity assay showed that the activity of the −336G promoter construct was about 0.67-fold lower than that of the −336A promoter construct, suggesting that the DC-SIGN −336A/G polymorphism can affect promoter activity. In carriers of the −336G allele, decreased expression of DC-SIGN may result in lower susceptibility of dendritic cells to dengue virus in the early stages of infection, thus protecting against dengue fever.

HIV-1 infection

Individuals with the −336G allele have also been shown to be more susceptible to parenterally acquired infection (p = 0.001, OR = 1.87) [62], but not to mucosally acquired infection, in keeping with the suggestion that DC-SIGN is not expressed by mucosal DCs in the genital tract [63]. It would appear that carriers of the −336G allele, with decreased expression of DC-SIGN, are more susceptible to HIV infection, in contrast to their lower susceptibility to dengue fever.

TB

Barreiro et al. [64] reported that the combination of two DC-SIGN promoter variants (−871G and −336A) is associated with a decreased risk of developing TB infection in the same South African cohort they had used to study L-SIGN genetic association. However, a smaller study based on a cohort from Northwestern Colombia found no significant association [65]. Interestingly, the −336A/G allele frequency of this Colombian cohort is different from that found in the South African colored cohort (Supplementary Table 1). In TB, increased DC-SIGN expression by DCs is thought to lead to better capture and processing of mycobacterial antigens, resulting in a stronger T-cell response. Hence, the −336A allele, which is associated with increased DC-SIGN expression, may underlie an increased host response and better control of infection. TB was endemic in Europe for centuries, whereas it was probably rare in Africa before contact with Europeans [66]. Hence, selective pressure is thought to have been stronger on European than on African populations [67], which is consistent with the increased frequency of the protective −336A allele in non-African populations. Indeed, the −336A/G allele frequency among African, Asian, European, and South African colored populations was significantly different (p < 0.0001, chi-square test).
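The chi-square comparison of allele frequencies across populations can be sketched as below. This is a minimal, standard-library illustration of Pearson's statistic on a populations × alleles count table. The function name `pearson_chi_square` and all counts are hypothetical placeholders, as the underlying allele counts are not given in this section (they are in Supplementary Table 1).

```python
from typing import List, Tuple


def pearson_chi_square(table: List[List[int]]) -> Tuple[float, int]:
    """Return (chi-square statistic, degrees of freedom) for a count table.

    Rows are populations, columns are alleles (e.g. -336A and -336G).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if total == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs a non-zero total")
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the hypothesis of equal allele frequencies.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Placeholder allele counts (A, G) for four populations; NOT published data.
example_counts = [
    [120, 80],
    [180, 20],
    [160, 40],
    [130, 70],
]
chi2_stat, chi2_dof = pearson_chi_square(example_counts)
print(f"chi-square = {chi2_stat:.2f} with {chi2_dof} degrees of freedom")
```

Converting the statistic to a p-value requires the chi-square distribution with the reported degrees of freedom, which is left to a statistics package.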

Population differences

DC- and L-SIGN genotyping results of the tandem-neck repeats on multi-ethnic groups

Re-sequencing the DC- and L-SIGN genes using the multi-ethnic Human Genome Diversity Panel (HGDP)–CEPH panel [68] including as many as 52 worldwide populations, Barreiro et al. [29] found distinct differences in the neck-region allele and genotype distribution among the different ethnic groups, suggesting that the neck region is a functional target for selective pressure.

Very little variation in the tandem-neck region was found in DC-SIGN, with an overall heterozygosity of only 2%. Although eight different alleles were found, ranging from two to ten repeats, the seven-repeat allele accounts for 99% of total variability. On the other hand, L-SIGN showed high variation of the neck-region repeat, with an overall heterozygosity of 54% and seven different alleles ranging from four to ten repeats. Supplementary Fig. 2 summarizes the allele frequency distribution of L-SIGN neck repeats in 52 worldwide populations reported by Barreiro et al. [29], compared with that reported in three separate East Asian populations [28, 69, 70]. The two most dominant alleles are the five and seven repeats, which together contribute 76–91% of all alleles found in all studied ethnic groups, with the exception of African and Oceanian populations. The six and seven repeats alone account for 96% of all alleles in the African ethnic group, whilst the nine repeat occurs at a higher frequency in Oceanians than in any other ethnic group. A mosaic composition of different L-SIGN alleles was observed in European, Asian, and Pacific populations. Although the seven-repeat allele is the predominant allele in all populations, it occurs at a higher frequency in Chinese and Japanese than in European Caucasians, who carry more five- and six-repeat alleles. Table 2 summarizes the L-SIGN neck-region genotype distribution of different ethnic groups. The L-SIGN allele and genotype distribution in Chinese reported by Wang et al. [70] and Chan et al. [28] was similar to that in East Asians (predominantly Chinese) reported by Barreiro et al. [29, 56]. In contrast, the L-SIGN neck-region tandem-repeat allele and genotype distribution in European Caucasian populations [29, 56] was distinctly different from that in the Chinese [28] (p < 0.0001) and the Japanese populations [69] (p < 0.0001).
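The link between a dominant allele and low heterozygosity can be checked with a short calculation. The sketch below assumes the quoted heterozygosity is close to the Hardy–Weinberg expected heterozygosity, 1 minus the sum of squared allele frequencies, which may differ from how the figure was obtained in [29]. The function name `expected_heterozygosity` is hypothetical, and lumping the remaining 1% of DC-SIGN alleles into a single "other" class is a simplifying assumption.

```python
from typing import Dict


def expected_heterozygosity(allele_freqs: Dict[str, float]) -> float:
    """Hardy-Weinberg expected heterozygosity: 1 - sum(p_i ** 2)."""
    total = sum(allele_freqs.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs.values())


# DC-SIGN-like case: the seven-repeat allele at 99% is from the text;
# the single "other" class for the remaining 1% is an assumption.
dc_sign_like = {"7": 0.99, "other": 0.01}
h_dc = expected_heterozygosity(dc_sign_like)
print(f"expected heterozygosity = {h_dc:.4f}")
```

With one allele at 99%, the expected heterozygosity is about 0.02 however the remaining 1% is split among rarer alleles, which agrees with the 2% reported for DC-SIGN.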

The constant size of the DC-SIGN neck region suggests it has been under strong functional selective constraint that prevents accumulation of changes, supporting its proposed crucial role in pathogen recognition and immune response. This phenomenon is similar to that seen for the β-globin gene [71], the 5′ cis regulatory region of CCR5 [72] and the bitter-taste receptor gene [73] for which balancing selection has been convincingly demonstrated. In contrast, the strong variation in the neck region of L-SIGN can be the result either of relaxation of the functional constraint, which allows accumulation of new mutations, or of the action of balancing selection, which maintains two or more functionally different alleles at different frequencies over time. Therefore, DC- and L-SIGN appear to have gone through completely different evolutionary processes, as reflected in their current patterns of diversity. The neck region constitutes an excellent candidate functional target for selection, as it plays a major mediating role in the orientation and flexibility of the carbohydrate-recognition domain. Since this domain is directly involved in pathogen recognition, neck-region length variation has important consequences for the pathogen-binding properties of these lectins [7, 34, 37].

Although DC- and L-SIGN lie within ∼15 kb of each other, they are not in linkage disequilibrium (LD), as illustrated by data from the HapMap database. A recombination hotspot is located between the two genes (Supplementary Fig. 3). A similarly weak LD pattern between DC- and L-SIGN was observed by Sakuntabhai et al. [61]. Such an LD pattern suggests that these two genes behave as independent genetic entities.

Summary

With their high degree of homology, DC- and L-SIGN are thought to have evolved from an ancient gene duplication event. To increase defense potential, host immunity genes can exploit a feature called gene duplication by retention. By this, one duplicate with its useful function is conserved, while its twin is free to mutate and possibly to acquire novel functions [74]. DC- and L-SIGN represent a prototypic model of duplicated progeny of an ancestral gene that interact with a large spectrum of pathogens. Whilst the neck region of DC-SIGN is well conserved, that of L-SIGN demonstrates an excess of diversity compatible with the action of balancing selection. Thus, these duplicated genes appear to have undergone completely different evolutionary pressures, which might result in the acquisition of novel functions. There is already evidence to suggest that DC- and L-SIGN function differently in human immunity.

With regard to its association with susceptibility to various infectious diseases, the variable distribution of L-SIGN neck-region repeat genotypes across populations calls for more genetic association studies in different populations, with possibly important consequences in the field of medical genetics. Variation of the L-SIGN tandem-neck repeats affecting interaction with pathogens opens new horizons for investigation. Besides structural aspects, the study of splicing events producing transcripts of various lengths, epigenetic mechanisms such as promoter methylation, and miRNA gene silencing may reveal regulatory mechanisms governing gene expression in different types of tissues and cells. Recently, Hodges et al. reported that HIV sequestration by and stimulation of DC-SIGN helps HIV evade immune responses and spread to cells through downregulation of genes encoding major histocompatibility complex class II, Jagged 1 and interferon-response molecules, and upregulation of the gene encoding the transcription factor ATF3 [75]. In contrast, knowledge of cellular signaling downstream of L-SIGN after ligand binding is limited. The role of these two genes in newly emerging infectious diseases remains to be seen. The potential for developing new therapeutics targeting DC- and L-SIGN constitutes frontier work in the fight to control infectious diseases and justifies continued study of these two “SIGN” genes.