Glycosylation: A “Last Word” in the Protein-Mediated Biomineralization Process

Post-translational modifications are one way that biomineral-associated cells control the function and fate of proteins. Of the ten different types of post-translational modifications, one of the most interesting and complex is glycosylation, or the covalent attachment of carbohydrates to amino acid sidechains Asn, Ser, and Thr of proteins. In this review the author surveys some of the known biomineral-associated glycoproteins and summarizes recent in vitro recombinant protein experiments which test the impact of glycosylation on biomineralization protein functions, such as nucleation, crystal growth, and matrix assembly. These in vitro studies show that glycosylation does not alter the inherent function of the polypeptide chain; rather, it either accentuates or attenuates functionality. In essence, glycosylation gives the cell the “last word” as to what degree a biomineralization protein will participate in the biomineralization process.


Introduction
Over the last forty years there has been a concerted effort to understand how organisms craft biomineralized skeletal structures for survival [1][2][3]. This effort has focused along two lines. First, how do mineral crystals or amorphous minerals form under biological conditions? Recent evidence points to a mineral precursor nucleation process involving nanoparticle synthesis followed by particle assembly into larger mineral mesoscale structures [4][5][6]. Second, what agents are biosynthetically created by these same organisms to manage the mineral formation process? With regard to the latter, it has been well documented that the genomes of biomineralizing organisms code for families of proteins that are mineral-specific and unique with regard to primary sequence construction and structure [7][8][9][10]. The appearance of these proteins in the extracellular matrix during mineral formation is a clear attempt by cells to regulate the nucleation and assembly stages that lead to the final mineral product of the skeletal elements that are necessary for organism survival. Thus, to understand how biominerals form into larger, useful structures, we must understand the role or function that these proteins play in nucleation and particle assembly.
In the majority of eukaryotic organisms, the overall complexity of the biomineral proteomes is augmented by a process known as post-translational modification [11][12][13]. In essence, once a nascent protein polypeptide chain is produced on the ribosomal complexes, in some cases the cells express enzymes that perform further covalent modifications of certain amino acid sidechains on the protein, thereby altering the functionality of these sidechains. These covalent modifications occur in compartments that are separate from the cell cytoplasm (e.g., Golgi apparatus, rough endoplasmic reticulum (rER), intracellular vesicles) [11][12][13]. A summary of common post-translational modifications (Table 1) [12] indicates that certain amino acid sidechains are targeted by cells for covalent modification. These modifications are performed by intracellular enzymes.
Perhaps the most complex post-translational modification process is glycosylation, or the addition of one or more carbohydrate monomers (known as monosaccharides) to specific amino acid sidechains on a protein, thus converting the polypeptide into a glycoprotein [11][12][13][14][15][16]. There are three classifications of glycoproteins, depending on which amino acids serve as attachment points for carbohydrates [11][12][13][14][15][16]: (1) O-linked, where the oligosaccharide attachment occurs on Ser and/or Thr residues and is performed in the Golgi apparatus; (2) N-linked, where the oligosaccharide attachment occurs on Asn and is performed within the endoplasmic reticulum (ER); (3) hybrid, in which a glycoprotein has both O-linked (Ser, Thr) glycans and N-linked (Asn) glycans.
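The attachment rules described above can be illustrated with a simple sequence scan. The Python sketch below is illustrative only and is not a method used in the studies reviewed here. It assumes the consensus N-glycosylation sequon from general glycobiology (Asn-X-Ser/Thr, where X is any residue except Pro), which this review does not state explicitly, and it treats every Ser and Thr as a possible O-linked site. The function names and the toy sequence are hypothetical, and a matching motif indicates only a candidate site, not actual occupancy.

```python
import re

# Assumed consensus sequon for N-linked glycosylation: Asn-X-Ser/Thr, X != Pro.
# The lookahead allows overlapping sequons to be reported individually.
N_SEQUON = re.compile(r"N(?=[^P][ST])")


def candidate_sites(sequence):
    """Return 1-based positions of candidate N-linked and O-linked sites."""
    seq = sequence.upper()
    n_linked = [m.start() + 1 for m in N_SEQUON.finditer(seq)]
    # Every Ser/Thr is listed as a possible O-linked site (a deliberate overestimate).
    o_linked = [i + 1 for i, aa in enumerate(seq) if aa in "ST"]
    return {"N-linked": n_linked, "O-linked": o_linked}


def classify(n_occupied, o_occupied):
    """Map experimentally occupied sites onto the three classes named in the text."""
    if n_occupied and o_occupied:
        return "hybrid"
    if n_occupied:
        return "N-linked"
    if o_occupied:
        return "O-linked"
    return "non-glycosylated"


if __name__ == "__main__":
    toy = "MKNGSAPNPTQNLT"  # hypothetical sequence, not a real biomineralization protein
    print(candidate_sites(toy))
    print(classify(n_occupied=[3], o_occupied=[5]))
```

In this toy sequence the Asn at position 8 is followed by Pro and is therefore excluded by the assumed sequon rule. Which candidate sites are actually occupied must be established experimentally.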
Several features contribute to the overall complexity of glycosylation [11][12][13][14][15][16]: (a) The number of carbohydrate groups added to a single amino acid sidechain site can vary; (b) the number and type of amino acid sites for attachment on a given protein can vary; (c) at a given attachment point on a protein, the carbohydrate groups can be constructed as linear or branching chains; (d) the hydroxyl-rich carbohydrate groups themselves can be modified by the addition of chemical groups, such as carboxylate, sulfate, N-acetyl amino, and hydroxyl. Thus, unlike other post-translational modifications, glycosylation represents a unique opportunity for the cell to combine two very different macromolecular building blocks (amino acids, carbohydrates) into one macromolecule, which in turn may have a significant impact on the function and distribution of this protein class within a biomineralizing system.
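To give a rough sense of how quickly features (a) through (d) multiply, the following Python sketch counts the distinct glycoforms that a single polypeptide could in principle adopt. It is a simplified illustration under stated assumptions: each attachment site is either unoccupied or carries exactly one of a fixed number of glycan structures, and sites vary independently of one another. The function name and the example numbers are hypothetical and are not taken from any protein discussed in this review.

```python
from math import prod


def count_glycoforms(structures_per_site):
    """Count possible glycoforms of one polypeptide.

    structures_per_site: for each attachment site, the number of distinct
    glycan structures that site can carry. Each site contributes one extra
    state for "unoccupied", and sites are assumed to vary independently.
    The count therefore includes the fully non-glycosylated form.
    """
    return prod(k + 1 for k in structures_per_site)


if __name__ == "__main__":
    # Hypothetical protein with three sites carrying 2, 4 and 3 possible structures.
    print(count_glycoforms([2, 4, 3]))  # (2+1) * (4+1) * (3+1) = 60
```

Even with only three sites and a handful of structures per site, a single gene product can give rise to dozens of distinct molecular species.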
For the purposes of this review, the focus will be on glycosylation and the impact that this post-translational modification has on known biomineralization processes. The review will begin by identifying notable well-studied biomineralization glycoproteins [17][18][19][20][21][22][23][24][25] and briefly touch upon their roles in their respective mineralization processes. Then, a discussion of recent in vitro studies [26][27][28][29][30] will follow, which investigated the effects of glycosylation on biomineralization protein mineralization functions (e.g., nucleation, crystal growth, particle assembly and protein-protein interactions). Finally, suggestions as to the direction of future studies of glycosylated biomineralization proteins will be offered.
Table 2 provides a summary of specific mineral matrix proteins that have been identified as glycoproteins and for which the complete amino acid sequence has been reported [17][18][19][20][21][22][23][24][25]. Admittedly, this table is sparse, and at the time of this writing very few biomineral-associated glycoproteins have complete protein sequence data or oligosaccharide composition/sequence data available. Note that some studies have identified glycoproteins in the extracellular matrices of different organisms [31][32][33][34][35], but to date these proteins have been neither sequenced nor rigorously characterized. The majority of the identified biomineralization glycoproteins are found in association with calcium-based biominerals [17][18][19][20][21][22][23][24][25][31][32][33][34][35]; however, it should be acknowledged that glycoproteins may eventually be identified in other non-calcium-based biominerals, such as magnetite (Fe3O4) [36] or silicates (SiO4) [37]. To provide some examples of the roles that glycoproteins play in biomineralization, we will briefly describe the proteins in Table 2. Note that in only a few examples are the oligosaccharide chain attachment and composition known at this time [26,27].
Crystals 2020, 10, 818

Enamelin
This is a glycoprotein found in the tooth enamel of vertebrates [17,18]. This protein plays a role in hydroxyapatite (HAP) formation from the amorphous precursor, amorphous calcium phosphate (ACP) [17]. This protein has two groups of oligosaccharide chains consisting of fucose, galactose, mannose, N-acetylglucosamine, and N-acetylneuraminic acid [18]. Enamelin combines with another enamel matrix protein, amelogenin, to form a protein-protein complex that stabilizes ACP and modulates HAP crystal growth during tooth formation [18].

EDIL3, MFGE8
These two proteins are found in avian eggshells [19]. Although the polypeptide chains have been sequenced, the proteins have not been fully characterized with regard to their oligosaccharide content or sequence. It is known that they bind to amorphous calcium carbonate (ACC)-containing matrix vesicles and guide these vesicles to the mineralization front, where calcite crystals form from the ACC particles [19].

Proteoglycans
These are a family of complex macromolecules that are composed of glycosaminoglycan (GAG) chains covalently attached to a core protein through a tetrasaccharide linker [20,21]. Proteoglycans (PGs) act as polysaccharides rather than proteins, as 95% of their weight is composed of glycosaminoglycans. The glycosaminoglycan chains consist of alternating hexosamine and hexuronic acid or galactose units [20,21]. There are also glycopeptide linkage regions that connect the polysaccharide chains to the core proteins and that contain N- and/or O-linked oligosaccharides. Although found in the extracellular matrix of many tissues, PGs comprise a significant portion of bone and tooth dentine HAP-containing extracellular matrices and are believed to be involved in ion and water sequestration in these matrices [20,21].

SIBLING Family
There is a family of proteins found in bone and tooth dentine that are known as the SIBLING proteins (small integrin-binding ligand, N-linked glycoprotein) [22,23]. These proteins all have Arg-Gly-Asp (RGD) cell-binding domains, are anionic, and are glycosylated [22,23]. At present, there is scant information regarding the N-linked oligosaccharide chain composition, sequence, or attachment location. SIBLINGs are found in multiple HAP-containing tissues in addition to bone and dentine and are multifunctional, with roles in cell signaling, hydroxyapatite binding, and mineral formation. The SIBLING proteins are osteopontin (bone sialoprotein 1), dentin matrix protein 1 (DMP1), bone sialoprotein (BSP2), matrix extracellular phosphoglycoprotein (MEPE) and the products of the dspp gene, dentin sialoprotein (DSP) and dentin phosphoprotein (DPP) [22,23].

SpSM30A-F
In the developing sea urchin Strongylocentrotus purpuratus embryo, the first skeletal element that emerges is the spicule [9], a stirrup-like structure that initially forms from ACC and transforms into mesocrystal calcite [27,29,30]. The matrix of the spicule is formed via many spicule matrix proteins (denoted as SpSM) [9] of which a subset of six isoforms, known as SpSM30A-F, are known to be glycosylated [24,27]. These proteins are known to stabilize ACC, inhabit the intracrystalline regions of mesocrystal calcite [4,5] and most likely contribute to the fracture resistance of the spicule itself [1,27]. The SpSM30 proteins are known to interact with the major spicule matrix protein, SpSM50, and these interactions are important for the assembly of the spicule matrix [27,29,30].

AP24
In the formation of the aragonitic nacre layer in the shells of mollusks, there exist families of proteins that inhabit the interior regions of aragonite crystals [6][7][8] and are termed intracrystalline proteins [25]. These proteins modify the material properties of the aragonite crystal and convey fracture resistance and ductility to these crystals, thus strengthening the shell itself [1,6,25,26]. In the Pacific red abalone, Haliotis rufescens, a family of intracrystalline proteins (the AP series) has been identified [25,26], with one member of this family, AP24, identified as a glycoprotein [25,26]. Subsequent studies confirmed that AP24 acts as a blocker of calcite formation, which then allows the metastable aragonite to form in the presence of extracellular Mg(II) [25,26].

The Impact of Glycosylation on Protein Function
Does the attachment of oligosaccharides affect the molecular behavior of a polypeptide chain? To answer this question, one could envision a comparative study wherein the function of an unglycosylated variant of a given protein is contrasted against that of a glycosylated variant, with each possessing the identical primary sequence. Here, the only variable would be the presence (or absence) of oligosaccharide chains.
Recently, this type of study was executed on two proteins, AP24 (aragonite nacre layer, Pacific red abalone H. rufescens) [25] and SpSM30B/C (calcitic spicule matrix, S. purpuratus, purple sea urchin) [24]. Both proteins have been the subject of in vitro glycosylation studies in insect cells, where it was discovered that AP24 and SpSM30B/C belong to the hybrid classification, i.e., they consist of N- and O-linked linear and branching oligosaccharide chains [26,27]. Interestingly, the glycosylated variants of AP24 and SpSM30B/C both contain anionic monosialylated, bisialylated and monosulfated, bisulfated monosaccharides [26,27]. Given that both proteins inhabit a Ca(II)-rich environment in vivo, the anionic monosaccharides could serve as putative sites for Ca(II)-protein or mineral-protein interactions. To a certain extent, both proteins are similar in function: they are involved in the formation of the organic matrix, forming hydrogel particles that assemble mineral nanoparticles [26,27]. In addition, both protein hydrogels become occluded within calcium carbonates and modify the material and surface properties of the minerals they inhabit [26,27].
In the following section, we review these studies and their comparative use of two recombinant variants: (1) a non-glycosylated variant expressed in E. coli bacteria, and (2) a glycosylated variant expressed in baculovirus-infected Sf9 insect cells [26,27]. By using these two variants within parallel mineralization and biophysical studies, it was possible to measure the contributions of oligosaccharide chains to the function of each protein [26,27].

The Nacre Glycoprotein AP24
In this in vitro study the protein was expressed in bacterial and insect cells as a single polypeptide. In Sf9 cells, the recombinant form of AP24 (denoted as rAP24G) is expressed with variations in glycosylation that create microheterogeneity in protein molecular masses [26]. The overall molecular mass of the oligosaccharide component was found to range from 650 Da to 6.5 kDa. It was observed that both rAP24G and the non-glycosylated variant (denoted as rAP24NG) aggregate to form protein hydrogels, with rAP24NG exhibiting a higher aggregation propensity compared to rAP24G [26]. With regard to functionality, both rAP24G and rAP24NG exhibit similar behavior within in vitro calcium carbonate mineralization assays and Ca(II) potentiometric titrations that measure prenucleation cluster appearance and ACC formation/transformation [26]. An interesting difference was noted in these studies: rAP24G modifies crystal growth directions and is a stronger nucleation inhibitor, whereas rAP24NG exhibits higher mineral phase stabilization and nanoparticle containment [26]. Hence, oligosaccharides may modulate certain functions of the nacre glycoprotein AP24 but have little effect on other intrinsic functionalities.

The Spicule Matrix Glycoprotein SpSM30B/C
Similarly, the spicule matrix protein SpSM30B/C is expressed in insect and bacterial cells as a single polypeptide. The recombinant glycosylated form (rSpSM30B/C-G) also contains variations in glycosylation that create microheterogeneity in rSpSM30B/C molecular masses [27]. The overall molecular mass of the oligosaccharide component was found to range from 1.2 kDa to 3.6 kDa to 7.5 kDa. In terms of aggregation propensities and hydrogel formation, the bacteria-expressed non-glycosylated variant (rSpSM30B/C-NG) has a lower aggregation propensity compared to the glycosylated rSpSM30B/C-G variant. Both variants promote faceted growth and create surface texturing of calcite crystals in vitro, with rSpSM30B/C-G promoting these effects with higher intensity (Figure 1) [27].
Figure 1. SEM images of in vitro calcium carbonate mineralization assay samples, following the protocol described in [27]. (A) Negative control, no protein added; (B) + rSpSM30B/C, non-glycosylated, 1.5 µM; (C) + rSpSM30B/C-G, glycosylated, 1.5 µM. Note faceted nanotexturing produced by both proteins, which is more pronounced in the presence of the glycosylated variant in (C). White arrow in (C) denotes protein hydrogel deposit that forms within the mineralization assay. Scale bars = 2 µm.

How Does Glycosylation Impact Function?
From these two studies we note a trend where glycosylation does not change the intrinsic function of the polypeptide chain; rather, the attachment of anionic oligosaccharide moieties either (1) attenuates specific functions or has no effect (AP24) or (2) accentuates protein functionality (SpSM30B/C).
Other studies with multiple glycoproteins will hopefully confirm this trend or provide evidence of other effects that oligosaccharides impose upon polypeptides.

The Impact of Glycosylation on Protein-Protein Interaction (Matrix Formation)
In addition to modulating the mineral formation process, a key role of biomineralization proteins is the assembly and organization of multiple proteins to form an organic matrix within which the nucleation and crystal growth processes take place [1,2,7]. We pose the question: how does glycosylation affect protein-protein interactions that dominate the matrix formation process? To address this question, investigations were conducted on molluscan (AP7, AP24, H. rufescens) [6,25,28] and sea urchin (SpSM50, SpSM30B/C, S. purpuratus) [6,24,29,30] recombinant two-protein systems. In both organisms it is known that each pair of proteins co-exist in vivo within the extracellular matrix [6,24].

AP7-AP24 Complex
It is known that AP7 forms a complex with AP24 in the nacre layer [25]. Using sensitive quartz crystal microbalance with dissipation (QCM-D) measurements, this complex formation was confirmed and it was found that both the glycosylated and non-glycosylated variants of recombinant AP24 bound to recombinant AP7 but with different quantities and binding kinetics (Figure 2). Interestingly, non-glycosylated recombinant AP24 underwent a conformational change when binding to AP7, but the glycosylated variant did not [28]. Moreover, the binding of AP7 with non-glycosylated and glycosylated variants of AP24 was found to be Ca(II)-dependent and -independent, respectively (Figure 2) [28]. Thus, AP7 and AP24 protein complexes form as a direct result of polypeptide-polypeptide chain recognition and not polypeptide-oligosaccharide recognition. However, the presence of anionic oligosaccharides on AP24 appears to modulate the intensity of AP7-AP24 protein-protein interactions and potentially stabilizes the AP24 conformation upon binding to AP7. As shown in Figure 3, both proteins have numerous surface-accessible regions or domains where interactions might take place.
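For readers unfamiliar with how a QCM-D frequency shift relates to the quantity of bound protein, the Python sketch below applies the Sauerbrey relation, Δm = -C·Δf/n. It is offered only as a rough illustration and is not the analysis performed in [28]. The mass sensitivity constant of 17.7 ng cm^-2 Hz^-1 is the standard value for a 5 MHz AT-cut quartz crystal, which is assumed here because the crystal type is not stated in this review. The function name and the example frequency shift are hypothetical. The relation strictly holds only for thin, rigid films, so for hydrated protein hydrogels that produce a measurable dissipation shift it should be read as an approximate estimate that also includes coupled water.

```python
# Mass sensitivity constant for a 5 MHz AT-cut quartz crystal (ng cm^-2 Hz^-1).
# Assumption: the crystal used in the reviewed experiments is not specified here.
SAUERBREY_C = 17.7


def sauerbrey_mass(delta_f_hz, overtone=3, c=SAUERBREY_C):
    """Estimate the areal mass change (ng/cm^2) from a frequency shift.

    delta_f_hz: raw frequency shift measured at the given overtone, in Hz.
                (Some instruments report the shift already divided by the
                overtone number; in that case pass overtone=1.)
    overtone:   odd harmonic number n (the review's plots use the third, F3).
    """
    if overtone < 1 or overtone % 2 == 0:
        raise ValueError("QCM overtones are odd integers (1, 3, 5, ...)")
    # A negative frequency shift corresponds to mass added to the sensor.
    return -c * delta_f_hz / overtone


if __name__ == "__main__":
    # Hypothetical shift of -30 Hz at the third harmonic.
    print(round(sauerbrey_mass(-30.0, overtone=3), 1), "ng/cm^2")
```

Comparing such estimates for the glycosylated and non-glycosylated variants gives a first-order view of relative binding quantities; a viscoelastic model would be needed for quantitative work on soft films.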

Figure 2. rAP7 is adsorbed onto the poly-L-Lys coated QCM-D chip, unbound rAP7 is washed off, and then the rAP24G (glycosylated) or rAP24NG (non-glycosylated) variant is introduced into the flowcell. Plots show the third harmonic frequency (F3, blue) and dissipation (D3, red) observed under each scenario. Deflections in frequency and dissipation result from rAP24 protein adsorbing onto the immobilized rAP7 layer on the chip, with amplitudes of the deflection proportional to the amount of protein bound. The time-dependent introduction of proteins is noted on the plots by arrows. These experiments were repeated and found to be reproducible. For more information on the QCM-D method and experimental protocol, please refer to [28].
Figure 3. INTFOLD-predicted three-dimensional structures of H. rufescens AP7 and AP24 proteins, in ribbon representation. The protocol for structure prediction is provided in [27]. AP24 is represented without glycan groups. Note that each protein has surface-accessible domains or regions which could serve as sites for protein-protein interaction.

SpSM50-SpSM30B/C Complex
SpSM50 is the major matrix protein of the sea urchin spicules in S. purpuratus embryos, with other SpSM proteins, such as the six SpSM30A-F isoforms, comprising smaller amounts in the matrix [9,23]. With SpSM50 in large abundance, there is the possibility that other SpSM proteins interact with SpSM50 to form the matrix and control mineralization. This was tested in a recent in vitro study, where recombinant forms of SpSM50 and glycosylated and non-glycosylated variants of SpSM30B/C were investigated for their ability to form protein-protein complexes [29,30]. The results were quite dramatic: the formation of a SpSM50-SpSM30B/C complex requires glycosylation and, in contrast to the AP7-AP24 study described above, these interactions were found to be Ca(II)-independent for both variants [29,30]. The glycosylation requirement clearly indicates that the SpSM50 polypeptide sequence recognizes and binds to the glycan moieties on the surface of SpSM30B/C. As shown in Figure 4, the SpSM50 sequence contains a conserved C-type lectin domain, which is known to bind to carbohydrates, and presumably it is this domain that would interact with the glycan groups of SpSM30B/C [9,23,29,30].

Figure 4. The protocol for structure prediction is provided in [27,29]. Note that the SpSM50 protein possesses a surface-accessible C-type lectin carbohydrate binding domain, which presumably acts as a site for interaction with SpSM30B/C glycan groups.

Summary and Future Directions
From the foregoing, we can observe that glycosylation provides an additional degree of control over extracellular protein function by either accentuating or attenuating the intrinsic functionality of the polypeptide sequence. In a sense, the cell can have the "last word" as to the degree of participation within the biomineralization process. In some cases (e.g., AP24), the oligosaccharides stabilize the conformation of the glycoprotein, which is a known trait of N-linked oligosaccharides [12][13][14][15][16]. The author proposes that glycosylation can serve several purposes vis-à-vis the biomineralization process: (1) "tweak" or "tune" protein mineralization function to suit the situation or need; (2) act as a site for molecular recognition and binding with other matrix proteins; (3) conformationally stabilize a protein, thereby enhancing functionality; (4) create additional anionic sites for ionic (e.g., Ca(II)), mineral, or water interactions; (5) invoke cell activation or deactivation via binding to outer membrane receptor proteins. Clearly, there may be other benefits that arise from glycosylation, and thus this process represents a powerful method that cells can exploit to create skeletal elements under ambient or extreme conditions [1,2].
The author believes that the biomineralization field is still in its infancy with regard to understanding the role that glycoproteins and their associated oligosaccharides play in the skeletal formation process. To make progress in this area, the author proposes several key issues that need to be addressed, which are elaborated upon in the subsections below.

A More Aggressive Approach to Glycoprotein Isolation and Identification
Simply put, the genomics of biomineralization have advanced quite rapidly [8,9], but the proteomics and the identification of post-translational modifications to these proteins lag to a certain extent, especially when compared to the advances in glycobiology within the fields of immunology and other medical branches [13,14]. It is the author's opinion that this is not due to limitations in methodologies, technology, or skill; rather, it is due to at least two factors: (1) the unwillingness of laboratories to pursue these intensive and costly projects and (2) insufficient grant funding to permit these projects to move forward. It is hoped that this situation will change for the better over time.

Improvements in Glycoprotein Purification and Structure Determination
For a variety of reasons, glycoproteins can be difficult to purify to homogeneity for structural determinations [13][14][15][16][38][39][40]. Further, oligosaccharide chains and the protein region(s) to which they are attached are typically conformationally labile, making structural determination by X-ray crystallography or NMR highly problematic, which, in turn, makes it nearly impossible to establish protein structure-function relationships [14][15][16]. Currently, molecular modeling (i.e., energy minimization, molecular dynamics) is the only route to obtain structural protein-oligosaccharide information, albeit in qualitative form [41]. Thus, there is a need for new methodologies to obtain glycoproteins in a highly purified form and to decipher the three-dimensional structure of the oligosaccharide-polypeptide chain complex.

Improvements in Glycoprotein Localization
One can identify the location of proteins in situ within the extracellular matrix using monoclonal or polyclonal antibody recognition of protein epitopes [39]. In the case of glycoproteins, this becomes a more complicated issue, since the antibodies raised to epitopes on glycoproteins might be specific to only the polypeptide chain, to certain oligosaccharide chains, or to both [39]. Given that glycoproteins often exhibit variations in glycosylation [26,27], the in situ identification of glycoproteins using antibodies may not be so straightforward. In such cases, it may be prudent to synthesize select protein sequence regions and/or glycan chains and use these for antibody generation. Further, improved methods of in situ glycoprotein detection will propel the field forward and allow interpretations of biomineral formation in the presence of matrix-specific glycoproteins.

Improvements in Understanding the Role of Variations in Glycosylation
The fact that in some in vitro systems there is variation in the degree of oligosaccharide chain completion and site attachment [26,27] creates a diverse pool of glycoproteins coded for by a single gene. Is this a flaw of the cellular system, or, is this deliberate and with a purpose? If deliberate, how