Novel insights into the function of the conserved domain of the CAP superfamily of proteins

Members of the Cysteine-rich secretory proteins, Antigen 5, and Pathogenesis-related 1 proteins (CAP) superfamily are found in a remarkable variety of biological species. The presence of a highly conserved CAP domain, which in many cases is linked to other functional protein domains, defines the CAP family members. As a result, this superfamily of proteins is involved in a large variety of biological processes such as reproduction, tumor suppression, and immune regulation. The role of the CAP domain and its conserved structure throughout evolution in relation to the diverse functions of CAP proteins is, however, poorly understood. Recent studies on the mammalian Golgi-Associated plant Pathogenesis Related protein 1 (GAPR-1), which consists almost exclusively of a CAP domain, may shed new light on the function of the CAP domain. GAPR-1 was shown to form amyloid fibrils but also to possess anti-amyloidogenic properties against other amyloid-forming peptides. Amyloid prediction analysis reveals the presence of potentially amyloidogenic sequences within the highly conserved sequence motifs of the CAP domain. This review will address the structural properties of GAPR-1 in combination with existing knowledge on CAP protein structure-function relationships. We propose that the CAP domain is a structural domain, which can regulate protein-protein interactions of CAP family members using its amyloidogenic properties.


CAP protein superfamily
The CAP superfamily consists of proteins found in thousands of species across the entire biological kingdom [1]. In addition, CAP proteins are involved in a wide variety of functions, including fertilization, tumor suppression and immune modulation. Family members are characterized by the presence of a CAP domain, also referred to as the SCP (Sperm Coating Protein) domain, which encompasses a conserved tertiary structure: a unique α-β-α sandwich fold in which α-helices flank a central antiparallel β-sheet. Four highly conserved signature motifs have been recognized [1]. Outside these sequences there is little homology within the CAP domain between family members [1]. Most family members have an extension or an additional domain at the C-terminus of the CAP domain. Furthermore, all but one of the CAP protein family members possess a signal peptide for secretion to the extracellular space, where they exert endo- or paracrine functions. Therefore, the large variety of additional domains is thought to be responsible for the remarkable evolutionary diversity in functions of CAP family members. For example, regulation of ion channels is a well-characterized function that has been attributed to the C-terminal domain present in the CRISP subfamily [2,3]. The role of the CAP domain and its conserved motifs has, however, largely remained elusive. Currently, some functions have been attributed to the CAP domain in certain species, such as protease activity in cone snail venom, sterol binding/transport in yeast, inhibition of integrin function, and scavenging of eicosanoids in saliva of blood-feeding insects [4][5][6][7]. However, none of these functions has been shown to be present in a wide range of superfamily members. The structurally highly conserved domain suggests a more common functionality throughout evolution.

GAPR-1
Phylogenetic analysis of mammalian CAP proteins shows GAPR-1 to branch off early in evolution from the yeast PRY protein family and GAPR-1 might well be the first mammalian CAP protein in evolution [1]. GAPR-1 is the only mammalian CAP protein lacking a signal sequence and consists almost exclusively of a CAP domain without additional extensions [1,8,9]. Together, these features make GAPR-1 a suitable candidate to study the role of the conserved structural nature of the CAP domain.
GAPR-1 (also known as GLIPR-2 or C9orf19) acts as a negative regulator of autophagy [10], a cellular process involved in the degradation of cellular components via a lysosomal pathway. Autophagy plays an essential role in the immune system as it provides defense against infection, neurodegenerative disorders, cancer, and aging [11]. The protein exerts its negative regulatory function in autophagy by interacting with the essential autophagy-related protein Beclin 1 [10]. This results in the retention of Beclin 1 at the Golgi complex and inhibition of autophagy by preventing Beclin 1 from initiating autophagy. GAPR-1 is a small (17 kDa), highly positively charged (pI 9.4) peripheral membrane protein that contains an N-terminal myristoyl group [12], which localizes the protein to lipid-enriched microdomains at the cytosolic leaflet of the Golgi apparatus. GAPR-1 is highly expressed in immune-related tissues and cells, indicating that GAPR-1 plays a role in the mammalian innate immune system, analogous to PR-1 proteins in plants. Recently, GAPR-1 was shown to mediate interferon 1 signaling activation in response to Toll-like receptor 4 [13].
Additionally, GAPR-1 expression is vastly increased in renal proximal tubular cells during kidney fibrogenesis and stimulates epithelial to mesenchymal cell transition [14]. GAPR-1 has also been detected in extracellular vesicles secreted from prostate epithelial cells [15].
The molecular mechanisms of GAPR-1 have been extensively studied and provide interesting clues about how GAPR-1 exerts its biological function(s). GAPR-1 binds to liposomes containing negatively charged lipids [16]. It also interacts with Caveolin-1 on Golgi membranes [12] and together with N-terminal myristoylation [12] these signals are likely to be involved in the recruitment of GAPR-1 to raft-like lipid microdomains in the Golgi membrane. Another notable feature of GAPR-1 is that it has a strong tendency to form homodimers, both in vitro and in vivo (Figure 1) [8,9]. The protein also crystallizes as a dimer [9] and interaction with inositol hexakisphosphate (IP6) induces dimerization with an alternative configuration in which one of the monomeric subunits of the crystallographic dimer is rotated by 28.5 degrees [17]. Membrane binding to liposomes also induces GAPR-1 homodimerization and as a result rapid tethering of the liposomes [17].
Figure 1. Crystal structure of the GAPR-1 dimer [9] and transmission EM image of GAPR-1 amyloid fibrils prepared as described in [18].
Prolonged incubation with negatively charged liposomes resulted in the formation of amyloid-like fibrils (Figure 1) [18]. The reason for the propensity of GAPR-1 to form amyloid structures is not known, but it is intriguing to note that in the crystal structure of GAPR-1, the dimeric arrangement shows an almost continuous β-sheet extending beyond the monomeric units (Figure 1). The oligomerization process of GAPR-1 commences instantaneously upon membrane binding. The presence of cholesterol in the liposomes enhanced fibril formation, suggesting that localization of native GAPR-1 to lipid- and cholesterol-enriched microdomains of the Golgi complex favors oligomerization of GAPR-1 in the cell. Furthermore, natively folded GAPR-1 was shown to possess an intrinsic amyloid-related structure as it bound the amyloid oligomer-specific antibody A11 [18] and was shown to possess anti-β-amyloid aggregation activity [19]. GAPR-1 effectively inhibited Aβ fibril formation in vitro by binding to oligomeric Aβ structures [18]. Such apparently paradoxical amyloid-forming and amyloid-inhibiting properties have been shown for several other proteins, including GroEL, small heat-shock proteins, α-synuclein and TTR [20][21][22]. GAPR-1 is also linked to amyloid-related diseases. In a study of necroptosis activation in multiple sclerosis (MS), GAPR-1 was found to be enriched in the insoluble proteome of MS patients [23]. GAPR-1 was also shown to be exclusively enriched in sites of neonatally induced neurodegeneration in rat hippocampus and its gene expression was proposed to regulate the development of diabetic neuropathy [24,25]. Whether these examples are related to the amyloid-related properties of GAPR-1, its role in autophagy regulation or both, remains to be investigated.

Functional protein oligomerization and toxicity
Self-association of proteins into dimers or oligomers can result in structural and functional benefits such as enhanced stability, control over active site accessibility and generation of new binding sites [26,27]. Oligomerization of proteins is known to be essential for a wide variety of processes, e.g. the regulation of subcellular location, (de-)activation of enzymes and chaperones, regulation of gene expression and cell-cell adhesion [26,28]. Oligomerization can be mediated in several ways, including ligand binding, posttranslational modifications, disulfide bond formation between subunits and domain swapping. Oligomers serve diverse purposes in biological signaling, e.g. functioning as receptors or ligands, regulating protein-protein interactions, or modulating membrane curvature [26,29,30,31]. As both GAPR-1 and Beclin 1 form oligomeric structures [9,18,32,33], these properties may be intimately linked to regulate the interaction between these two proteins and/or regulate the activity of Beclin 1 in autophagy.
However, oligomerization can also lead to the formation of pathogenic protein structures. Proteins or peptides can undergo a specific self-assembly process into amyloid structures via a conserved pathway involving relatively short amino acid stretches that are normally hidden within the native structure of the protein, but upon exposure activate aggregation [34,35]. (Sub)cellular membranes and extracellular matrix components regularly play a critical role in initiating amyloid formation as they can serve as a structural template for self-assembly and often induce significant conformational changes in monomers or dimers leading to the exposure of amyloidogenic segments [36][37][38][39]. Subsequently, a conformational transition takes place from unstructured or native-like structures into oligomers rich in β-sheet structures. These oligomers act as seeds for adjacent monomers, resulting in rapid elongation into mature amyloid fibrils, which are composed predominantly of highly ordered cross β-sheet structures [40]. Oligomerization of proteins and peptides into amyloid fibrils plays a central role in a wide variety of diseases, including Alzheimer's, Parkinson's and Huntington's disease, prion disease and type II diabetes. Soluble amyloid oligomers are now recognized as the main toxic species [41]. However, the mechanisms of amyloid-associated cytotoxicity are highly diverse and poorly understood [42]. One characteristic shared by most amyloid oligomers is their ability to disrupt membranes [43,44]. This membrane-perturbing property also confers antimicrobial activity on amyloid peptides [45].
Intriguingly, there is a growing list of examples of amyloids serving a beneficial role for the organism. As short amyloid-forming peptides have been shown to possess catalytic activity in their fibrillar state, the amyloid fold has been hypothesized to be a common ancestral protein fold [46]. In contrast to disease-associated amyloids, the production of these so-called functional amyloids is tightly regulated without causing cytotoxicity. How this is achieved is not well understood. The regulation of protein abundance and fibrillation kinetics have been postulated as important factors in determining toxicity of amyloids [47]. Examples of functional amyloids include storage and release of hormones and toxins, serving as a template for pigment synthesis, and biofilm formation by bacteria and fungi [48,49]. A functional role for amyloids has also been implicated in maintenance of synaptic plasticity and memory consolidation [50,51], in cell adhesion [52], and as epigenetic elements of phenotypic inheritance [53]. Amyloid structures are also involved in signaling pathways. In programmed necrosis (necroptosis), the RIP1 and RIP3 proteins interact to form an amyloid core in the necrosome [54]. The functional amyloid NLRP3 inflammasome can be activated in response to amyloid fibrils from diverse sources, including Aβ, prions, serum amyloid A as well as curli from E. coli and S. Typhimurium [55]. Amyloid curli from these bacteria also activate Toll-like receptor 2 in intestinal epithelial cells, which leads to an enhanced barrier function and ameliorated inflammation [56]. Chronic activation of the immune system is now believed to be a major factor in neurodegenerative diseases and cardiac diseases [55,57].
The amyloidogenic properties of GAPR-1 and the emergence of functional amyloids early in evolution prompted us to consider the possibility that the CAP domain allows functional oligomeric and/or amyloidogenic regulation of the diverse CAP protein family members. Numerous CAP proteins have already been shown to exist and/or function as dimers or oligomers [9,58,59,60,61,62,63,64]. Furthermore, amyloid prediction analysis shows that CAP proteins in all taxa contain amyloidogenic segments within their CAP domain, with remarkably consistent motifs in the conserved CAP1 and CAP2 signature motifs (Figure 2). Below we will discuss the indications for functional oligomeric and/or amyloidogenic interactions of CAP proteins in reproduction and immune regulation.

Reproduction
Functional amyloids play a major role in fertilization. The zona pellucida (ZP), a glycoprotein extracellular matrix surrounding oocytes that is essential during fertilization, was recently discovered to possess all the classical characteristics of amyloid [66]. Functional amyloids are also present within the epididymal lumen and in the sperm acrosomal matrix, which interacts with ZP [67]. Also, mating of haploid yeast cells involves aggregation of cells promoted by amyloid-forming adhesion proteins [68,69]. Therefore, amyloidogenesis could be a conserved mechanism in reproduction throughout evolution.
A number of CAP proteins play pivotal roles in mammalian reproduction. These include mammalian sperm-binding proteins, CRISP1-4 and GLIPR1L1, which have been postulated to function in spermatogenesis, sperm capacitation and sperm-egg binding and fusion [70,71]. The CRISP subfamily is characterized by the presence of a cysteine-rich C-terminal extended domain, separated from the CAP domain by a hinge region [1]. The cysteine-rich C-terminal domain has ion channel regulatory activity, as was also determined for CRISP proteins derived from the venom of poisonous reptiles [2,3]. However, the role of the CAP domain remains unclear.
CRISP1 is essential for sperm maturation, capacitation and sperm-oocyte binding. It is expressed in the epididymis, secreted to the epididymal lumen and binds to spermatozoa in two populations: a loosely bound and a tightly bound population. Zn2+ binding was shown to induce formation of high molecular weight oligomeric CRISP1 complexes and facilitate its association with spermatozoa [58]. Following capacitation, the remaining tightly bound CRISP1 is involved in ZP and oocyte binding. For oocyte binding the CAP2 signature motif is essential. CRISP2 is a component of the sperm acrosome and following release during the acrosome reaction, it strongly binds to the sperm equatorial segment and is involved in sperm-egg fusion, similarly to and possibly in cooperation with CRISP1 [72]. CRISP3 and CRISP4 are also important for sperm-ZP binding. CRISP3 in equine seminal plasma was shown to suppress neutrophil-sperm binding, protecting sperm from elimination from the female reproductive tract [73]. GLIPR1L1 is localized at lipid rafts of spermatozoa and is potentially also involved in ZP and oocyte binding [71].
CRISP1 is also present in the female genital tract. It is expressed by cumulus cells surrounding oocytes and is capable of modulating sperm ion channels [74]. Allurin, a CRISP homolog from the female tract of Xenopus lacking the C-terminal CRISP domain, was shown to function as a chemoattractant for sperm cells, also in mice [75,76]. Allurin forms SDS-insoluble oligomers typical of amyloid [62]. In relation to this, semen-derived amyloid peptides were shown to enhance HIV infection by promoting virion fusion with host cells [77,78]. Due to the many commonalities between HIV infection and mammalian fertilization [79], in addition to the presence of amyloid proteins on the surface of spermatozoa and oocytes, amyloid fibrils are likely to play a role in sperm-egg binding and fusion. Recombinant human CRISP2 was already shown to have amyloid properties upon membrane binding in vitro [18]. Nevertheless, it remains to be established whether amyloidogenic and/or amyloid-modulating properties are common to and important for the function of CAP proteins involved in fertilization.

Immune regulation
Microbial pathogens of plants and mammals secrete an assortment of effector proteins that contribute to virulence by directly targeting host cells or tissues e.g. by suppressing defense responses. Evidence is mounting that amyloids play a role in evasion of immune response and host invasion [80]. Amyloid structures formed on the cell surfaces of bacteria and fungi not only function in microbial biofilm formation and cell adhesion, but were also shown to function as toxin storage and release sites and to inhibit neutrophil response by binding serum amyloid P [80][81][82]. Chronic infection with these pathogens has even been implicated as a causative factor in Alzheimer's disease [83]. Harpin proteins secreted by various plant pathogens cause a hypersensitive response in the host plant (a process similar to apoptosis in mammalian cells), which was shown to be dependent on amyloid formation by the harpin [84].

PR-1 proteins
The PR family of proteins has long been considered a hallmark of the hypersensitive response/defense pathways in plants. Among the 17 classified PR subfamilies [85], PR-1 is the only group to which no biochemical function has been attributed and which has not been assigned to any protein category with a recognized function. PR-1 proteins have been implicated in enhancing resistance against viral, fungal and oomycete infection [85][86][87]. There are various examples of PR-1 proteins from both host and pathogen where oligomerization seems to be crucial in the regulation of immunity, which will be discussed below. PR-1 proteins have been shown to accumulate at intercellular spaces between host and pathogen and to associate with fibrillar and electron-dense material [88][89][90]. PR-1 proteins were also found to localize in a clustered manner at the surface and in the cytoplasm of infection hyphae, inhibiting their differentiation [88,91]. The PR-1 protein homologue PR1-5 from hexaploid wheat was shown to physically interact with the Stagonospora nodorum toxin ToxA. This interaction potentially mediates toxin-induced necrosis in sensitive wheat. PR1-5 exists as a homodimer and mutation of the ToxA binding site did not affect dimerization. In a previous study the dimer conformation of PR1-5 was demonstrated to convey resistance to proteases [59]. PR-1 proteins have also been discovered that are fused via a transmembrane part to a kinase domain, reminiscent of receptor-like kinases [92]. These PR-1 receptor kinases were proposed to transduce defense signals through extracellular interaction with a PR-1 ligand. Ligand-induced oligomerization is a well-known mechanism for receptor kinase activation [93].
Recently, a wound-induced peptide derived from the C-terminal part of PR-1, termed CAPE, was identified in tomato [94]. These peptides were shown to activate immune signals for antipathogen defense, including their own precursor PR-1b. Homologous peptides in Arabidopsis were subsequently found to play an important role in the regulation of the salt stress response [95]. Interestingly, the cleavage site of these peptides is before the proline residue within the CAP2 motif but after the predicted amyloidogenic part of the CAP2 motif, leaving the putative amyloidogenic properties intact (Figure 2).

PR-1-like proteins
Plant and mammalian pathogens also secrete PR-1-like proteins, which play an important role in pathogenicity. The fungus Fusarium oxysporum secretes a PR-1-like CAP protein, Fpr1, which was shown to be dispensable for virulence in plants but required for virulence in mammalian hosts [61]. Fpr1 forms dimers in solution and is proteolytically processed. As for PR-1-like proteins found in the saliva of blood-feeding insects, Fpr1 could be instrumental in evading the host immune system or in preventing blood clotting. Another possibility could be that fungi secrete PR-1-like proteins for antimicrobial actions against competing pathogens. This was proposed for a recently identified PR-1 protein with anti-bacterial activity in the digestive fluid of carnivorous plants [96].
PR-1-like proteins also have an important function in the infection process of parasitic nematodes through suppression and evasion of the host immune system. Upon host entry, these nematodes secrete large amounts of PR-1-like proteins, including a neutrophil-inhibiting protein [97,98]. These Ancylostoma secreted proteins (ASPs) are prime candidates for vaccine development. The human parasite Necator americanus produces two major ASPs that are thought to contribute to immune evasion and inhibition of platelet aggregation [63,99]. Intriguingly, Na-ASP-1 possesses two CAP domains that are linked via an extended loop [99]. Double-domain ASPs are also capable of forming homodimers typical of the single-CAP domain ASPs [60,63]. Interestingly, the two-CAP domain protein was not effective in eliciting an immune response as opposed to single-CAP ASPs, providing a clue as to how oligomerization of PR-1 proteins from other pathogens could play a part in immune evasion [99].
Other CAP protein family members have also been implicated in immune regulation. Hookworm platelet inhibitor (HPI) was discovered to inhibit platelet activation by blocking cell surface integrin receptors for fibrinogen and collagen [100]. HPI, like GAPR-1, crystallizes as a dimer although the orientation of the monomeric subunits is different as compared to dimeric GAPR-1 [64]. Remarkably, purified recombinant HPI was shown to be monomeric in solution, but unable to inhibit platelet adhesion to fibrinogen and collagen. Human neutrophil alpha defensins (HNPs) have been shown to bind the fibrinogen receptor and thereby induce formation of fibrinogen and thrombospondin-1 amyloid-like structures that activate platelets and bind microorganisms [101]. Amyloid peptides involved in Alzheimer's disease are also intimately linked to platelet activation [102]. This could indicate that oligomeric HPI and/or amyloid-like fibrinogen are required for inhibition.
A similar example is natrin, a CRISP from snake venom, which functions as an inflammatory modulator by inducing expression of adhesion proteins in vascular endothelial cells. It was suggested that Zn2+ binding to the CAP domain in the presence of heparan sulfate enhances di-/oligomerization of natrin to activate this pathway [103]. Both zinc cations and heparan sulfate have been widely associated with amyloid formation [36,104,105].

Other PR proteins
Oligomerization and amyloid formation could be a more general mechanism in plant defense by other PR protein subfamilies as well, despite the fact that they do not contain the CAP domain. Prohevein is a wound-induced antipathogenic protein from the rubber tree Hevea brasiliensis and the major allergen in latex. It is homologous to PR-4 proteins and has agglutination and chitinase properties. Its C-terminal domain, common to all PR-4 proteins, was shown to have amyloid-forming characteristics [106]. Other examples are found in PR-12 proteins (class II defensins), where dimer formation of tobacco NaD1 is critical for its antifungal activity [107]. Intriguingly, phosphatidylinositol binding of NaD1 induced its oligomerization and permeabilization of fungal and mammalian (tumor) cells. Nearly identical properties were revealed for tomato defensin TPP3, which formed amyloid-like fibrils upon specific binding to phosphatidylinositol (4,5)-bisphosphate (PIP2) [108]. In radish seeds, a C-terminal peptide derived from antifungal defensins showed a high amyloid fibril-forming propensity [109]. The occurrence of haze during wine production has been attributed to the self-aggregation of PR proteins, such as thaumatin-like proteins (PR-5) and chitinases [110]. Besides their role in defense against pathogens, PR proteins are also upregulated upon exposure to environmental stress. In winter rye, PR proteins from three different classes act synergistically as antifreeze proteins in oligomeric complexes [111].

Conclusion
CAP proteins are involved in a wide variety of biological functions. Yet, the function of the highly conserved CAP domain has not been elucidated. The structural properties and dynamics of the mammalian CAP protein GAPR-1 provide potential clues on this issue. We propose that oligomerization and amyloid formation/modulation are a common functionality of the CAP domain. There is evidence for a functional role of amyloid-forming proteins in cell-cell adhesion processes during fertilization, signaling pathways and immune regulation in host-pathogen interactions. Given the ubiquitous nature of amyloid folds in proteins throughout evolution, we suggest that the molecular mechanisms of GAPR-1 related to the structure of the CAP domain are characteristic of other members of the CAP superfamily of proteins as well.