Main

Glycomics is the study of the structural and functional aspects of the various glycoconjugates present on proteins, cells and, in some cases, entire organisms. Compared with its counterparts, genomics1 and proteomics2 — which deal with nucleic acids and proteins, respectively — the field of glycomics is much less developed. The types of bio-oligomers and biopolymers covered by the term glycoconjugate are diverse. Different natural products such as glycoproteins, glycolipids, glycosaminoglycans and glycosylphosphatidylinositol anchors are summarily known as sugars3. Even when only the oligosaccharide chains are considered, studies of processes involving carbohydrates are complicated because, unlike peptides and oligonucleotides (which are usually linear), many oligosaccharides have branched structures. Because they are not under direct genetic control, glycoconjugates are typically heterogeneous. Amplification methods — such as the polymerase chain reaction (PCR) for nucleic acids or bacterial expression systems for protein production — do not exist for glycoconjugates. Consequently, until recently, isolation of carbohydrates was the only way to procure these natural products3. Overall progress in glycobiology has suffered from a lack of tools such as those that are readily available for studying nucleic acids and proteins, including automated sequencing4,5, automated synthesis6,7, high-throughput microarray screening, and detailed structure elucidation, including X-ray analyses. Carbohydrate synthesis8 methods are time-consuming and practised mostly by specialized laboratories9,10,11. The products of such syntheses have aided the development of modern sequencing methods12,13.

Improved synthetic protocols for solution-phase oligosaccharide synthesis9,10,11, as well as the use of automated carbohydrate assembly, have provided more straightforward access to usable quantities of pure oligosaccharides. Synthetic carbohydrates have allowed the development of chemical approaches to glycomics that provide a molecular picture of biological processes involving carbohydrates. Synthetic sugars are beginning to be used in the development of diagnostic tests, vaccines and carbohydrate therapeutics. Some of these exciting advances in this rapidly growing field are highlighted in this review.

The usual first step when investigating a biological signal-transduction event is to establish which molecule is responsible for the activity. If the biomolecule is a nucleic acid or a protein, an answer can be obtained quickly because there are reliable, automated sequencing techniques. If, however, a carbohydrate is involved, sequencing is less straightforward. Carbohydrate analysis has improved tremendously in the past two decades12,13,14,15,16,17,18, but there is still no single method to determine the composition of all glycoconjugates. Given the structural diversity of glycoconjugates, different analytical approaches may persist for the analysis of different classes of sugar.

Synthesis of carbohydrates

Once a particular oligosaccharide (or a set of oligosaccharides) has been identified as being responsible for a biological effect, it often has to be synthesized in order to establish or confirm its structure assignment. In addition, defined oligosaccharides and their analogues are key tools for biochemical, biophysical and biological studies. The synthesis of carbohydrates has been pursued for more than a century, and many oligosaccharides can now be synthesized, albeit with considerable effort6,9,10,11. Specialized laboratories synthesize oligosaccharides using processes that may take months to years owing to the structural complexity of carbohydrates. This situation is reminiscent of the solution-phase total syntheses of peptides and DNA that were practised before the advent of automated solid-phase synthesis. A host of improvements has accelerated solution-phase oligosaccharide syntheses19.

The one-pot synthesis strategy aims to automate the planning of oligosaccharide synthesis20. On the basis of the relative reactivity of hundreds of monosaccharide 'building blocks', a computer program known as Optimer selects the appropriate building blocks, as well as the order in which they should be added to the reaction vessel during oligosaccharide assembly (Fig. 1a). The Optimer method works well for oligosaccharides of up to six units, but requires an extensive set of building blocks. An automated instrument known as the 'Golgi apparatus'21 has been used for enzymatic syntheses22, building on the superb regiospecificity and stereospecificity of glycosyltransferases that avoid the need for protective groups to assemble oligosaccharides23 (see page 1008). Engineered organisms such as yeast have been used for the production of N-glycosylated proteins24,25. A library of glycoengineered cell lines is expected to yield a plethora of specific glycovariants that have previously been unobtainable in mammalian cells.
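The reactivity-based planning behind the one-pot approach can be illustrated with a minimal sketch. It assumes that building blocks are added in order of decreasing relative reactivity and that consecutive blocks must differ in reactivity by some minimum factor; the class name, the function name, the threshold `min_ratio` and all numerical values are hypothetical and are not taken from Optimer itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildingBlock:
    name: str   # hypothetical label for a protected monosaccharide
    rrv: float  # relative reactivity value (illustrative number only)


def one_pot_order(blocks, min_ratio=10.0):
    """Return the blocks sorted from most to least reactive.

    Raises ValueError when two consecutive blocks are closer in reactivity
    than `min_ratio`, an assumed cut-off below which selective activation
    in a single vessel is treated as unreliable.
    """
    ordered = sorted(blocks, key=lambda b: b.rrv, reverse=True)
    for faster, slower in zip(ordered, ordered[1:]):
        if faster.rrv < min_ratio * slower.rrv:
            raise ValueError(
                f"{faster.name} and {slower.name} are too close in reactivity"
            )
    return ordered


if __name__ == "__main__":
    candidates = [
        BuildingBlock("block_B", 100.0),
        BuildingBlock("block_A", 5000.0),
        BuildingBlock("block_C", 1.0),
    ]
    print([b.name for b in one_pot_order(candidates)])
```

A real planner would also have to search a library of hundreds of building blocks for combinations that give the target sequence; the sketch only covers the ordering and spacing check for a set that has already been chosen.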

Figure 1: Schematic representation of strategies for automated oligosaccharide assembly.

a, Automated solution-phase synthesis using the Optimer-based one-pot approach for a pentasaccharide. A computer program selects appropriate monosaccharide building blocks according to their relative reactivity values in order to achieve best yields for oligosaccharide assembly. b, Automated solid-phase assembly using five monosaccharide building blocks on a polymer resin (black circle), attached by means of a linker. The cycle — consisting of activation and deprotection steps — is performed five times for the assembly of a pentasaccharide. Finally, the linker is cleaved to procure the desired oligosaccharide. Coloured triangles and squares represent different sugar monomer building blocks. P, P′, temporary protecting groups; R, hydrocarbon residue to be functionalized; X, leaving group.

The development of a fully automated oligosaccharide synthesis process by chemical means has been viewed with considerable scepticism in light of the complexity of carbohydrates, the large number of monomers needed and possible connections between monosaccharide units. Recent bioinformatics studies explored the diversity of mammalian oligosaccharide connectivities using a comprehensive database of isolated N-linked and O-linked glycans, and glycosphingolipids (D.B.W., R. Ranzinger, S. Herget, A. Adibekian, C.-W. von der Lieth and P.H.S., unpublished observations). Data mining has revealed that nature uses only a small proportion of the theoretically possible connections, so the complexity of glycospace (that is, the body of different structures that can, in principle, be constructed) is significantly reduced. According to the results of this database analysis (D.B.W., R. Ranzinger, S. Herget, A. Adibekian, C.-W. von der Lieth and P.H.S., unpublished observations), just 36 building blocks are needed to assemble 75% of known mammalian oligosaccharides by chemical methods. The challenge of synthesizing these 36 building blocks is surmountable when considering that about 100 different amino-acid monomers are commercially available for peptide synthesis.
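The kind of question asked in this database analysis (how many building blocks are needed to reach a given coverage of known structures) can be sketched as follows. The method used in the unpublished analysis is not described here, so the frequency-ranked heuristic below is an assumption, and the glycan names and building-block labels are toy data invented for illustration.

```python
from collections import Counter


def blocks_needed(glycans, target_fraction):
    """Pick building blocks, most frequently used first, until the chosen
    set suffices to assemble at least `target_fraction` of the glycans.

    `glycans` maps a glycan name to the set of building blocks it requires.
    """
    usage = Counter(block for need in glycans.values() for block in need)
    # Most-used blocks first; ties are broken alphabetically for determinism.
    ranked = sorted(usage, key=lambda b: (-usage[b], b))
    chosen = set()
    for block in ranked:
        chosen.add(block)
        covered = sum(1 for need in glycans.values() if need <= chosen)
        if covered / len(glycans) >= target_fraction:
            break
    return chosen


if __name__ == "__main__":
    toy_database = {
        "glycan_1": {"Glc", "Gal"},
        "glycan_2": {"Glc", "GlcNAc"},
        "glycan_3": {"Gal", "GlcNAc"},
        "glycan_4": {"Glc", "Fuc", "Neu5Ac"},
    }
    print(sorted(blocks_needed(toy_database, 0.75)))
```

In this toy database three of the five blocks are enough to cover three of the four glycans, which mirrors the qualitative finding that a small set of frequently used building blocks accounts for most structures.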

Each monosaccharide building block is synthesized at a multi-gram scale and can be used for the assembly of various targets. The feasibility of assembling most carbohydrates using a limited, defined set of building blocks is still questioned, but example structures of increasing complexity are being reported. Temporary protective groups mark sites of further glycosylation, and permanent protective groups mask hydroxyl groups to be unveiled at the end of the synthesis. Besides controlling regioselectivity by orthogonal protective groups to account for branching of the carbohydrate chain, the stereochemistry at the anomeric carbon must be controlled. Placement of participating protective groups at the C2 hydroxyl or amine groups ensures the formation of trans-glycosidic linkages, whereas non-participating groups are needed for the preferential installation of cis-glycosides.

Automated solid-phase oligosaccharide synthesis (Fig. 1b) has been developed from insights gained from oligopeptide and oligonucleotide assembly26. The first building block is added to a polystyrene resin equipped with an easily cleavable linker containing a free hydroxyl group27. An activating agent induces couplings involving glycosyl phosphate and glycosyl trichloroacetimidate building blocks26. Unlike oligonucleotide and peptide couplings, glycosidic bond formation occurs mostly at low temperatures and requires a reaction chamber that can be cooled. Excess building blocks (that is, a 5–10-fold molar excess, sometimes applied twice) are added to the chamber for each coupling. Mass action to drive coupling reactions to completion and to achieve high yields is also common to peptide and oligonucleotide syntheses. Washing and filtration remove any side products or remaining reagents before selective removal of a temporary protective group readies the next hydroxyl group for subsequent coupling. Coupling efficiencies can be assessed by spectrometric read-out after protecting-group removal when temporary protecting groups that absorb ultraviolet radiation, such as 9-fluorenylmethyloxycarbonyl (Fmoc), are used28. Originally, this coupling–deprotection cycle was automated using a converted peptide synthesizer26. An automated oligosaccharide synthesizer prototype with parallel synthesis capability is currently being tested.
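Because the coupling and deprotection cycle is repeated once per monosaccharide, the efficiencies of the individual couplings multiply. The sketch below shows this arithmetic. It assumes that each ultraviolet readout is proportional to the amount of resin-bound chain that carried the temporary protecting group, so that the ratio of successive readouts estimates the efficiency of the later coupling; the function names and all numbers are illustrative.

```python
import math


def stepwise_efficiencies(readouts):
    """Estimate per-coupling efficiency as the ratio of each readout to the
    one before it (an assumed, simplified model of the UV read-out)."""
    if any(r <= 0 for r in readouts):
        raise ValueError("readouts must be positive")
    return [later / earlier for earlier, later in zip(readouts, readouts[1:])]


def overall_yield(efficiencies):
    """Overall yield of the assembly is the product of the step efficiencies."""
    return math.prod(efficiencies)


if __name__ == "__main__":
    # Five couplings at an assumed 90% efficiency each (pentasaccharide).
    print(round(overall_yield([0.90] * 5), 3))
    # Efficiencies recovered from a hypothetical series of readouts.
    print(stepwise_efficiencies([1.00, 0.95, 0.90]))
```

The multiplication makes clear why mass action is used to push every coupling towards completion: a modest loss at each step compounds quickly as the chain grows.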

After completion of the oligosaccharide sequence, the fully protected product is cleaved from the solid support. After global deprotection, the oligosaccharide is purified and its structure verified. A series of increasingly complex oligosaccharides has been assembled, each in 1 day or less, using the automated oligosaccharide synthesizer. This compares favourably with the weeks to months taken using solution-phase methods28. Initial syntheses contained only the synthetically less challenging trans-glycosidic linkages, but cis-glycosides such as α-galactoses have now also been selectively incorporated29.

At present, automated oligosaccharide synthesis resembles the early days of automated peptide and oligonucleotide assembly: many carbohydrate structures, both simple and complex, can be synthesized by automation. The problems and drawbacks (such as the excess of building blocks used, the difficulties in incorporating certain monosaccharides such as sialic acid, and the double bond in the linking moiety that restricts the deprotection conditions) have been recognized. Some of these limitations have now been addressed, but certain challenges remain. Although commercially available monomeric building blocks are quite expensive, it seems likely that the cost of these reagents will decrease with increasing demand. One reason why automated solid-phase synthesis of oligosaccharides is not currently performed by a larger number of groups is that the synthesis instrument is not yet commercially available. However, the strength of the chemical approach to incorporating unnatural linkages nicely complements the evolving enzymatic technologies. A combination of improved isolation and purification strategies for naturally occurring carbohydrates, enzymatic approaches, the use of engineered organisms and chemical synthetic approaches such as automated solid-phase synthesis is expected to provide scientists with rapid access to defined oligosaccharide libraries for glycomics investigations in the near future.

Carbohydrate arrays

Several methods have been established to study the interactions of carbohydrates with various other molecules (Table 1). The DNA microarray has been a key tool in genomics research30, and protein arrays are used to identify protein–protein interactions and potential inhibitors31. Similarly, carbohydrate microarrays have been used in glycomics research to examine the interactions of carbohydrates with other molecules. The chip-based format of microarrays offers important advantages over common screening techniques such as enzyme-linked immunosorbent assays (ELISAs), because several thousand binding events can be screened on a single glass slide and only minuscule amounts of analyte and ligand are required. Assay miniaturization is particularly suitable for glycomics, because access to pure oligosaccharides is the limiting factor. The first carbohydrate microarrays relied on isolated saccharides that were non-covalently attached to membranes32,33. A flurry of methodological studies evaluated different aspects of microarray design and adopted oligonucleotide and protein array techniques for carbohydrate chips (Fig. 2). Synthetic monosaccharides and oligosaccharides were covalently attached via different linkers to glass34, plastic35 and gold surfaces36, or placed on beads in fibre-optic wells37. Initial proof-of-principle studies focused on known interactions between lectins (carbohydrate-binding proteins) and sugars. Current screening efforts rely on carbohydrate arrays in which chemically or enzymatically synthesized and isolated oligosaccharides with a linker on the reducing terminus are covalently attached to glass slides38. Standard DNA printing and scanning equipment is used to produce and analyse the carbohydrate microarrays39,40.

Table 1: Synthetic tools for studying the interactions of carbohydrates

Figure 2: Carbohydrate microarrays.

a, Carbohydrate microarrays can be constructed from synthetic or isolated oligosaccharides that contain a primary amine. The glycans are covalently attached to the glass surface of a microscope slide that has been functionalized with a reactive group (for example, N-hydroxysuccinimide, NHS). b, Incubation of the carbohydrate microarray with a protein, antibody or cell that has been labelled (for example, with a fluorescent group) can be used to determine which oligosaccharide binds to that protein, antibody or cell. In this particular example, the protein represented by a blue wedge specifically binds to the oligosaccharide represented by the yellow triangle, but not to the other oligosaccharides that appear on the glass slide.
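The read-out of such a binding experiment can be sketched as a simple comparison of spot fluorescence against background. The rule used below (mean replicate intensity above the background mean plus `k` standard deviations) is an assumed, commonly used style of threshold rather than a method described here, and the function name, glycan labels and intensities are hypothetical.

```python
from statistics import mean, stdev


def call_binders(spots, background, k=3.0):
    """Return the names of glycans whose mean replicate intensity exceeds
    the background mean by more than `k` background standard deviations.

    `spots` maps a glycan name to its replicate spot intensities;
    `background` holds intensities of blank spots (at least two values).
    """
    threshold = mean(background) + k * stdev(background)
    return sorted(
        name for name, values in spots.items() if mean(values) > threshold
    )


if __name__ == "__main__":
    blank_spots = [100, 110, 90, 105, 95]
    slide = {
        "glycan_A": [5000, 5200, 4900],  # strong signal
        "glycan_B": [105, 98, 110],      # indistinguishable from background
        "glycan_C": [120, 118, 125],     # slightly raised, below threshold
    }
    print(call_binders(slide, blank_spots))
```

Printing each glycan in replicate, as in this toy slide, is what allows a per-glycan mean to be compared with the spread of the blank spots.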

Initially, carbohydrate–protein interactions important to the process of HIV infection were analysed, and epitope mapping of HIV-related antibodies was performed40,41. The knowledge gained from the microarray experiments was essential to efforts directed at the creation of carbohydrate-based HIV vaccine candidates42,43 (see page 1038).

During the past 3 years, a host of carbohydrate–protein interactions has been studied using oligosaccharide arrays44. National and regional consortia such as the Consortium for Functional Glycomics (NIH, USA) and the ETH Zürich Glycomics Initiative are making this technology widely accessible to life science researchers.

Carbohydrate microarrays can be used to address the interactions of sugars with other types of molecule, as well as entire cells. Carbohydrate–RNA interactions were screened by incubating labelled RNA with aminoglycoside microarrays45. Mechanisms responsible for antibiotic resistance were studied using these arrays together with resistance-causing enzymes46.

The binding of cells to microarray surfaces allows the detection and typing of bacteria in blood47. The interaction of eukaryotic cells with carbohydrate arrays has also been demonstrated48. The oligosaccharide binding preferences of different types of avian and human influenza strains can be determined using carbohydrate microarrays49.

The structure–activity relationship of glycosaminoglycan polysaccharides, including heparin and chondroitin sulphate, is poorly understood. Carbohydrate microarrays containing synthetic50 or isolated51 heparin oligosaccharides have served to identify specific sequences recognized by different fibroblast growth factors52. Immobilization of a series of chondroitin tetrasaccharides has aided the identification of a novel tumour necrosis factor-α antagonist53 and the exact chondroitin sequences bound by proteins54.

In addition to the application of carbohydrate arrays to the study of biomolecular interactions, oligosaccharide microarrays are beginning to be used as diagnostics and to conduct epidemiological studies. Hundreds of human sera have been screened for antibodies against the malaria toxin glycosylphosphatidylinositol (GPI) anchor55, and a correlation between the presence of specific antibodies and resistance to severe malaria has been established (F. Kamena, M. Tamborrini, X. Liu, G. Pluschke and P.H.S., unpublished observations). The search for serological markers of autoimmune diseases has yielded results for Crohn's disease56. Epitope mapping of antibodies against tumour-associated antigens is also feasible57.

The presence of certain oligosaccharide structures in a glycoprotein sample can be determined using arrays of immobilized lectins58. These carbohydrate-binding proteins recognize terminal saccharide units and have been used to analyse the dynamic bacterial glycome59.

Oligosaccharide therapeutics

Given the prevalent role of carbohydrates in a wide range of biological processes, it may seem surprising that there are few carbohydrate-based therapeutics and diagnostics on the market. In addition to monosaccharide-inspired drugs such as the influenza virus treatment Tamiflu60,61 (oseltamivir phosphate; Roche), two blockbuster drugs, acarbose (Precose, Glucobay; Bayer) and heparin, stand out. Both oligosaccharides were derived by isolation and reached the clinic before a detailed structure–activity relationship had been established. In addition, aminoglycosides, which are naturally occurring pseudo-oligosaccharides, have been used clinically to treat infectious diseases induced by a variety of Gram-negative bacteria62. The antibiotic activity of aminoglycosides is due to their inhibition of protein synthesis, which results from their binding to bacterial ribosomes62.

Heparin

The oldest carbohydrate-based drug is isolated from animal organs and has been used clinically as an antithrombotic agent since the 1940s. Heparin activates the serine protease inhibitor antithrombin III, which blocks thrombin and factor Xa in the coagulation cascade63. This drug is a highly heterogeneous mixture of polysaccharides and is associated with severe side effects, including heparin-induced thrombocytopenia, bleeding and allergic reactions. Chemically or enzymatically fragmented heparins (low-molecular-weight heparins, LMWHs) are also heterogeneous, but are more bioavailable, with a longer half-life, a more predictable anticoagulant activity and fewer side effects in vivo.

After the specific pentasaccharide responsible for the anticoagulant property was identified in the early 1980s (ref. 64; Fig. 3a), a herculean effort lasting more than 10 years was begun to establish a structure–function relationship using synthetic oligosaccharides64. As a result of this drug-development effort, a synthetic pentasaccharide known as Arixtra (fondaparinux sodium; GlaxoSmithKline) has been available since 2002 (ref. 65). However, Arixtra does have some clinical shortcomings, such as an exceedingly long half-life in vivo and little to no dose-dependent activity in certain indications66. Thus, LMWHs still have the highest market share of all antithrombotics, and the need for additional synthetic heparin molecules with specific activities persists.

Figure 3: Carbohydrates used in medicine and for vaccine development.

a, A pentasaccharide sequence of heparin. This sequence is responsible for binding to antithrombin III. b, Synthetic spore surface tetrasaccharide of B. anthracis. This molecule was used for the generation of anticarbohydrate antibodies to detect anthrax spores and is currently being used in vaccine development.

Recent advances in heparin sequencing16, heparin synthesis67,68,69,70 and heparin microarray technology50,51 have provided the tools to identify specific sequences or sequence families that interact with proteins such as chemokines. The chemical synthesis of a broad range of heparin analogues should allow researchers to study the molecular mechanism of angiogenesis and to modulate wound healing and other medically relevant processes (see page 1030).

Acarbose

Carbohydrates such as starch and sucrose are principal components of food, and have to be enzymatically broken down in the intestinal tract. Acarbose71, a pseudo-oligosaccharide of microbial origin, is produced by fermentation. This α-glucosidase and α-amylase inhibitor interferes with and regulates intestinal carbohydrate digestion, controls the rate of absorption of monosaccharides and influences the intermediary carbohydrate metabolism. It is used to treat type 2 diabetes.

Carbohydrate-based vaccines

The cell surfaces of bacteria, parasites and viruses exhibit oligosaccharides that are often distinct from those of their hosts. Specific types of glycoconjugate are often more highly expressed on the surface of tumours than on normal cells72. Such cell-surface carbohydrate markers are the basis for carbohydrate-based detection systems and vaccines. An immune response against the carbohydrate antigens that results in the death of target cells is required for a carbohydrate-based vaccine. Such vaccines have been widely used against a host of diseases for several decades73. The carbohydrate antigens for antibacterial vaccines were isolated from biological sources. Recently, intense efforts focused on the use of defined carbohydrate antigens that are synthesized rather than isolated. A carbohydrate-based approach has also been pursued for anticancer vaccine candidates74,75,76 (see page 1000). However, one of the early carbohydrate-based anticancer vaccine candidates recently failed in a Phase III clinical trial.

Antibacterial vaccines

Polysaccharide capsules, glycoproteins or glycolipids cover the cell surfaces of many bacteria. Capsular polysaccharides are either homopolymers or made up of between two and six repeating sugar units. Capsular polysaccharides elicit type-specific protective immune responses in adults but not in infants, who do not respond with antibodies that confer protection. Conjugate vaccines consisting of a carbohydrate antigen and an immunogenic protein can overcome this immunogenicity problem and produce high titres of protective antibodies.

Improved analytical tools have helped to identify the exact chemical structure of carbohydrate antigens and have aided the development of new vaccines. Several vaccines based on purified capsular polysaccharides or on neoglycoconjugates are now commercially available, such as vaccines against Neisseria meningitidis, Streptococcus pneumoniae, Haemophilus influenzae type b (Hib) and Salmonella typhi77. Meningitis caused by Hib has essentially been eradicated in areas where national vaccination programmes using protein conjugate vaccines have been implemented.

Vaccine development could benefit greatly from the new glycomics technologies. The identification of specific oligosaccharide antigens has been aided substantially by sequencing and carbohydrate arrays. The procurement of defined oligosaccharides using improved solution- and solid-phase methods has become fast enough to be used reiteratively in drug-development efforts.

A synthetic oligosaccharide-based conjugate vaccine is now used in Cuba, where the large-scale synthesis, pharmaceutical development and clinical evaluation of a conjugate vaccine composed of a synthetic capsular polysaccharide antigen of Hib were achieved. Long-term protective antibody titres compared favourably with those of products prepared with the Hib polysaccharide extracted from bacteria78.

A tetrasaccharide has been discovered on the surface of spores of the biowarfare agent Bacillus anthracis79. Once the durable form of the pathogen has been inhaled it will kill most victims if treatment is not commenced immediately. Synthesis of a species-specific tetrasaccharide antigen80,81 (Fig. 3b) allowed the production of antibodies that specifically recognize B. anthracis in the presence of the closely related opportunistic human pathogen Bacillus cereus82. Challenge experiments to create a conjugate vaccine against anthrax are ongoing.

Antiparasite vaccines

Like bacteria, many parasites have unique glycoconjugates on their surfaces. The specific carbohydrates may serve as a starting point for the creation of conjugate vaccines, but efforts towards this goal have been hampered by the fact that the parasites are very difficult to culture and because glycoconjugates cannot be obtained in pure form or sufficient quantity by isolation.

Malaria

Plasmodium falciparum is the most pathogenic of the single-celled parasites of the genus Plasmodium that are responsible for malaria. Malaria infects 5–10% of humans worldwide and kills more than 2 million people each year. Infected mosquitoes transmit the parasite, which leads to the common symptoms of chills and fever. Drug resistance is a growing problem at a time when there is still no effective vaccine.

P. falciparum expresses a large amount of GPI on its cell surface83. This glycolipid triggers an inflammatory cascade that is responsible for much of malaria's morbidity and mortality. When a protein conjugate of a synthetic hexasaccharide GPI malaria toxin was administered to mice before infection, this resulted in a highly reduced mortality rate of only 10–20%, compared with 100% without vaccination55. Cross-reactivity of the antibodies with human GPI structures was not observed owing to the differences between human and P. falciparum GPI. Immunization of mice did not alter the infection rate or overall parasitaemia, indicating that the antibody against the GPI neutralized toxicity without killing the parasites55.

Preclinical studies involving protein conjugates of synthetic GPI antigens are currently underway. To support such vaccine development efforts, methods for the large-scale synthesis of oligosaccharide antigens have been developed by taking advantage of the latest advances in carbohydrate synthesis technology. Very small amounts of synthetic antigen (10⁻⁹–10⁻⁷ g per person) are required, and the production of several kilograms of antigen per year will suffice.

Leishmaniasis

Leishmaniasis, which is caused by another protozoan parasite, is transmitted by sandflies and affects more than 12 million people worldwide. Leishmania parasites reside in macrophages, which makes the infection difficult to treat. In the search for a potent vaccine, the lipophosphoglycans (LPGs)84 became a target; they are ubiquitous on the cell surfaces of the parasites and are composed of a GPI anchor, a repeating phosphorylated disaccharide and different cap oligosaccharides. The cap tetrasaccharide has been the focus of efforts towards a conjugate vaccine based on a synthetic antigen. The branched tetrasaccharide was assembled by automated solid-phase synthesis85 and conjugated to a virosomal particle to enhance immunogenicity. These highly immunogenic conjugates yielded antibodies that selectively recognize parasite-infected livers86. Challenge studies in an animal model are currently underway.

Recent advances and future development

For many years the lack of tools for studying glycobiology prevented biologists and medical researchers from addressing research problems that involve carbohydrates. During the past decade, sequencing and synthesis technologies that are commonly used to study nucleic acids and proteins have become available for glycomics as well. Now, carbohydrate sequencing of glycoconjugates is often possible even though sample preparation is complicated by carbohydrate microheterogeneity and the absence of amplification procedures. Automated solid-phase synthesis, improved methods for solution-phase oligosaccharide assembly, enzymatic methods and the use of engineered cells have complemented each other, allowing oligosaccharide synthesis to take a big step forward by granting access to different classes of glycoconjugate. In turn, these methods have helped procure oligosaccharides and their non-natural analogues for the creation of high-throughput screening methods such as carbohydrate arrays.

The identification of specific oligosaccharides, by sequencing followed by comparison with synthetic oligosaccharides, has yielded insight into the interactions of carbohydrates and proteins. Oligosaccharide involvement at key positions of signalling pathways is beginning to emerge and a molecular understanding of carbohydrate binding to proteins is evolving. Detailed structural studies — including studies of protein–carbohydrate interactions — using X-ray crystallography will become commonplace in the near future. Further improvements in the methods by which oligosaccharides are sequenced and synthesized will be needed to make their routine use possible for non-specialists.

A better understanding of the biological roles of carbohydrates and improved sequencing and synthesis techniques are beginning to influence the design of diagnostic and therapeutic approaches. Carbohydrate arrays help to define new disease markers by screening the sera of patients. Bacterial and viral detection and typing can be achieved using carbohydrate microarrays. Synthetic access to oligosaccharides of infectious agents that are hard to culture and isolate (for example, B. anthracis and P. falciparum) facilitates antibody production for specific detection of these pathogens. These anticarbohydrate antibodies may become important for passive immunization. The first conjugate vaccine candidates containing synthetic oligosaccharide antigens are reaching preclinical and clinical trials against bacterial (for example, Hib), viral (for instance, HIV) and parasitic (for example, malaria and leishmaniasis) infections. The trend to produce defined vaccine antigens using chemical and enzymatic methods, as well as engineered cells, is likely to increase, and synthetic vaccines are expected to complement already existing vaccines containing purified polysaccharides.

As our understanding of carbohydrate involvement in signalling cascades — in particular of those that involve glycosaminoglycans — expands, carbohydrate-mediated processes will become the target of drug-development efforts using small organic molecules. Glycomics has just gone beyond the initial proof-of-principle studies for diagnostics and therapeutic candidates. Improved tools and a better molecular understanding should convince those biologists and medical researchers who previously avoided carbohydrates to address questions involving this class of molecule. The excitement of glycomics is just beginning, with many discoveries to be made and applications to be developed.