Revisiting the Language of Glycoscience: Readers, Writers and Erasers in Carbohydrate Biochemistry

Abstract The roles of carbohydrates in nature are many and varied. However, the lack of template encoding in glycoscience distances carbohydrate structure, and hence function, from gene sequence. This challenging situation is compounded by descriptors of carbohydrate structure and function that have tended to emphasise their complexity. Herein, we suggest that revising the language of glycoscience could make interdisciplinary discourse more accessible to all interested parties.

use words clearly undermines the global response to antimicrobials' waning usefulness". [5] Technological [6] and informatics [7] advances in glycoscience, alongside combinations of the two, [8] are providing new ways to cut through the complexity, whilst comprehensive books of glycobiology topics provide entries into the field. [9] The introduction of stylized symbol nomenclature for glycans (SNFG; Figure 1) also represents an important step towards simplifying communication within and between interested disciplines, [10] along with guidelines for experimental design and data curation, [11] and a repository for glycan structures. [12] As discussed recently by Gabius, [13] there are notable parallels between aspects of glycobiology and the epigenetic regulation of chromatin structure and function. The latter processes, which occur with exquisite precision, are typically referred to in stripped-down terms as a series of read, write and erase events, making the field immediately accessible to outsiders. Indeed, this approach emulates computer programming's create, read, update and delete (CRUD), [14] the four basic functions employed for persistent data storage. [15] Herein, we consider the potential to recapitulate glycoscience language in the terms of epigenetic vocabulary.
In simple terms, epigenetics concerns small chemical changes (marks) in the chemical structure of chromatin, typically in the histone proteins that organize and package DNA in chromosomes. [16] Dynamic changes in these epigenetic protein marks affect the physical accessibility of gene sequences for expression, rather than altering the genetic code per se. The profound biological consequences of these processes have attracted enormous attention over the past decade, given their central role in life and their disruption in disease. [17] The molecular hallmarks of epigenetic regulation comprise a dynamic series of enzymatic modification steps that introduce marks on, or remove marks from, the histone protein structure. Epigenetic writers, which introduce epigenetic marks on amino acid residues of the histones, include histone acetyltransferases (HATs, which N-acetylate lysine), histone methyltransferases (HMTs, which N-methylate lysine), protein arginine methyltransferases (PRMTs) and protein kinases (which O-phosphorylate serine/threonine), amongst others. Epigenetic readers bind to epigenetic marks and amplify their impact on DNA packaging and hence gene accessibility for expression.
The impact of the lysine N-acetylation epigenetic mark is perhaps simplest to appreciate. Writing this mark results in the loss of a positive charge on the lysine side chain of a histone, thus removing the potential for interaction with the negatively charged DNA backbone and loosening the DNA-histone complex. The resulting opening up of the chromosome structure enables the localized activation (turning on) of gene expression. In the opposite sense, erasing a lysine acetylation mark drives a tighter assembly of the histone-DNA complex and silencing (turning off) of gene expression.
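The write/erase logic of the lysine acetylation mark can be sketched as a toy model. This is a minimal illustration, not a biophysical simulation: the class `HistoneLysine` and the functions `write_acetyl`, `erase_acetyl` and `chromatin_state` are hypothetical names introduced here, and the reduction of chromatin packing to a single charge on one residue is a deliberate simplification.

```python
from dataclasses import dataclass


@dataclass
class HistoneLysine:
    """Toy model of one histone lysine side chain (hypothetical, illustrative only)."""
    acetylated: bool = False

    @property
    def charge(self) -> int:
        # Unmodified lysine carries a positive charge; N-acetylation removes it.
        return 0 if self.acetylated else 1


def write_acetyl(lys: HistoneLysine) -> None:
    """Writer step (the role played by a histone acetyltransferase)."""
    lys.acetylated = True


def erase_acetyl(lys: HistoneLysine) -> None:
    """Eraser step (removal of the acetyl mark)."""
    lys.acetylated = False


def chromatin_state(lys: HistoneLysine) -> str:
    """Simplifying assumption: no positive charge means a loosened, open complex."""
    return "open (gene on)" if lys.charge == 0 else "closed (gene off)"


lys = HistoneLysine()
states = [chromatin_state(lys)]
write_acetyl(lys)
states.append(chromatin_state(lys))
erase_acetyl(lys)
states.append(chromatin_state(lys))
print(states)
```

The point of the sketch is only that the mark behaves like a reversible flag: one function sets it, one clears it, and a third interprets it.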
The general principle of readers, writers and erasers prompts consideration of potential parallels between epigenetics and the control of glycan biosynthesis, structure and function. That is, does the notion of lectin readers, glycosyltransferase writers and glycosyl hydrolase erasers ring true in glycobiology? A convenient segue from epigenetics into glycoscience is provided by the reversible O-GlcNAc modification of Ser/Thr residues in proteins. [18] This central metabolic "rheostat" [19] comprises a nutrient status-responsive, post-translational modification that impacts on protein-protein and protein-nucleic acid interactions. This in turn regulates cellular events including transcription and signal transduction, with implications in diabetes, Alzheimer's disease and cancer.
So how does the O-GlcNAc cycle work? O-GlcNAc transferase (OGT) writes and O-GlcNAcase erases, providing a simple and reversible modification cycle that is orthogonal to protein phosphorylation and which has far-reaching physiological impact (Figure 3). [20] In addition to glycosyltransferase writers and glycosyl hydrolase erasers, there are also potential readers in glycoscience, a function performed by lectins [21] and the carbohydrate-binding modules (CBMs) [22] in multidomain CAZymes. The full read, write, erase combination in glycoscience is most easily exemplified by the proofreading and editing cycle associated with N-linked glycoprotein biosynthesis. These processes are essential to ensuring the correct integrity and dynamics of cell-surface glycoproteins, which contribute to the glycocalyx that dominates cell-cell interactions in the maintenance of healthy tissue and which underpin sperm-egg interactions during fertilisation, but which also serve as cellular receptors for a wide range of microbial pathogens. [9] Asparagine-linked protein N-glycosylation starts in the endoplasmic reticulum (ER), while the peptide chain is unfolded, and proceeds through protein folding to the Golgi apparatus, where the glycan components are processed to a mature state. This requires a highly organised distribution of processing machinery to achieve the fidelity and quality control needed to ensure biological function. [23] Approximately 80% of the proteins entering the secretory pathway are glycosylated in the ER and most of the proteins assembled in the ER feature N-linked oligosaccharides. Most of the glycoproteins featuring mature N-glycans are, as described by Aebi, "precisely heterogeneous" in their carbohydrate composition, a result of kinetically controlled processing. [24] Nonetheless, in the early stage of biosynthesis, before they reach their final mature and bioactive form, all N-linked glycoproteins are homogeneously glycosylated.
This is a result of a precise lectin chaperone (reader) based proofreading mechanism in the ER, which discriminates between correctly folded and misfolded glycoproteins (Figure 4). [25] Here the oligosaccharide plays a key role in presenting each glycoprotein for scrutiny by the sophisticated biological checkpoint process, which is referred to as glycoprotein quality control. [26] This process ensures that only correctly folded glycoproteins are transported to the Golgi for further glycan processing into mature glycoproteins. Unfolded and misfolded glycoproteins are retained in the ER for further folding attempts and are eventually degraded if the correctly folded status is not achieved.
The glycoprotein quality control system presents clear parallels to the read, write and erase processes of epigenetic regulation. In the first step of glycosylation, Glc3Man9GlcNAc2 is transferred en bloc from an oligosaccharyl dolichol diphosphate to the nitrogen of an asparagine side chain in the nascent polypeptide chain by the writer oligosaccharyltransferase (OST). [27] Immediately after the Glc3Man9GlcNAc2 is transferred, the eraser glucosidase-I [28] cleaves off the terminal glucose (Glc) residue, a step that is necessary to prevent the glycoprotein product from rebinding to the OST. Subsequently, the eraser glucosidase-II [29] catalyses cleavage of a second glucose residue and the resulting monoglucosylated polypeptide is promptly sequestered by the calnexin (CNX) [30] and calreticulin (CRT) [31] lectin chaperone [32] readers. These chaperones prevent aggregation of the unfolded glycopolypeptide chains, and assist in their correct folding by presentation to the oxidoreductase ERp57, which is responsible for effecting correct disulfide bond formation. [33] Once the folded glycoprotein is released from the lectin chaperone readers, the eraser glucosidase-II removes the final glucose residue and the glycoprotein undergoes inspection by the UDP-glucose:glycoprotein glucosyltransferase (UGGT) [29] (Figure 4).
If the correct glycoprotein folding is not accomplished, UGGT serves as a writer and re-glucosylates the misfolded glycoprotein in preparation for recycling to the chaperones/ERp57 machinery, [34] the so-called calnexin/calreticulin cycle (Figure 4). [35] Following repeated failed folding attempts, the glycoprotein is degraded by the endoplasmic reticulum associated degradation system (ERAD). [36] If correct folding is achieved, the glycoprotein is transported into the Golgi apparatus for further processing of the glycan to provide the mature glycoprotein.
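In the spirit of the CRUD analogy, the quality control cycle described above can be written as a small state machine. This is a rough sketch under stated assumptions rather than a model of the real kinetics: `Glycoprotein`, `quality_control`, `folds_on_attempt` and `MAX_ATTEMPTS` are hypothetical names, folding is treated as succeeding deterministically on a given attempt, and the cutoff of three attempts is arbitrary (the text says only "repeated failed folding attempts").

```python
from dataclasses import dataclass, field

MAX_ATTEMPTS = 3  # hypothetical cutoff before ERAD; not a value from the text


@dataclass
class Glycoprotein:
    folds_on_attempt: int  # hypothetical: attempt at which folding succeeds
    glucoses: int = 0      # terminal glucose residues on the N-glycan
    log: list = field(default_factory=list)


def quality_control(gp: Glycoprotein, max_attempts: int = MAX_ATTEMPTS) -> str:
    """Walk one glycoprotein through the write/erase/read steps; return its fate."""
    # Writer: OST transfers Glc3Man9GlcNAc2 en bloc.
    gp.glucoses = 3
    gp.log.append("write:OST")
    # Eraser: glucosidase-I removes the terminal glucose.
    gp.glucoses -= 1
    gp.log.append("erase:glucosidase-I")
    # Eraser: glucosidase-II removes the second glucose.
    gp.glucoses -= 1
    gp.log.append("erase:glucosidase-II")

    for attempt in range(1, max_attempts + 1):
        # Readers: CNX/CRT bind the monoglucosylated form and present it to ERp57.
        gp.log.append("read:CNX/CRT")
        # Eraser: glucosidase-II removes the final glucose after release.
        gp.glucoses -= 1
        gp.log.append("erase:glucosidase-II")
        # UGGT inspects the folding status.
        if attempt >= gp.folds_on_attempt:
            gp.log.append("export:Golgi")
            return "Golgi"
        if attempt < max_attempts:
            # Writer: UGGT re-glucosylates the misfolded protein for another cycle.
            gp.glucoses += 1
            gp.log.append("write:UGGT")

    gp.log.append("degrade:ERAD")
    return "ERAD"


for n in (1, 2, 10):
    gp = Glycoprotein(folds_on_attempt=n)
    print(n, quality_control(gp), gp.log)
```

Each log entry is tagged with read, write or erase, so the trace shows directly how the three verbs alternate within the calnexin/calreticulin cycle.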

Conclusion
It is widely recognised that carbohydrates play important roles in biological molecular recognition, and have a profound impact on human health and medicine. Nonetheless, there is merit in simplifying the language of glycoscience to make it more accessible to the uninitiated. In turn, this might facilitate a focus on the principles and implications of glycosylation in biology, rather than risking drowning in the detail of structural complexity. The notion of accessible vocabulary in glycoscience is not new: it was already evident in Hood, Huang and Dreyer's 1977 [37] description of differentiation antigens as cell-surface "area codes"; and the potential of cell-surface carbohydrates, lectins, enzymes and carbohydrate-binding antibodies in Feizi's 1981 [38] "cellular addresses", "postmen, policeman and traffic signs" "involved in the obedient interpretation of area codes". Similar thoughts were explored in Brandley and Schnaar's 1986 [39] "potential carbohydrate "language" involved in inter-cellular interactions", while Hakomori's 2002 [40] "glycosynapse" (microdomains of glycolipids) seeks to draw parallels to the "immune synapse" assembly that contributes to cell adhesion and signalling. As highlighted in the cross-disciplinary article by Bertozzi and Kiessling in 2001, [41] "chemical tools have proven indispensable for studies in glycobiology". Perhaps it is time to revisit the terminology of glycoscience, to make interdisciplinary communication more straightforward and to support marketing and engagement beyond the immediate field. Reference to lectin readers, glycosyltransferase writers, and glycosyl hydrolase erasers could therefore be worth wider (re)consideration.