High Resolution Structure of Human Apolipoprotein (a) Kringle IV Type 2: Beyond the Lysine Binding Site

Lipoprotein (a) [Lp(a)] is characterized by an LDL-like composition in terms of lipids and apoB100, and by one copy of a unique glycoprotein, apolipoprotein (a) [apo(a)]. The apo(a) structure is mainly based on the repetition of tandem kringle domains with high homology to plasminogen kringles 4 and 5. Among them, kringle IV type 2 (KIV-2) is present in a highly variable number of genetically encoded repeats, whose number is inversely related to Lp(a) plasma concentration and cardiovascular risk. Although it is the major component of apo(a), the actual function of KIV-2 is still unclear. Here, we describe the first high-resolution crystallographic structure of this domain. It shows a general fold very similar to other KIV domains with high and intermediate affinity for the lysine analogue ε-aminocaproic acid (EACA). Interestingly, KIV-2 presents a Lysine Binding Site (LBS) with a unique shape and charge distribution. KIV-2 affinity for predicted small-molecule binders was found to be negligible in Surface Plasmon Resonance (SPR) experiments and, as the LBS is nonfunctional, we propose to rename it “pseudo-LBS” (pLBS). Further investigation of the protein by computational small-molecule docking allowed us to identify a possible heparin-binding site away from the LBS, which was confirmed by specific reverse charge mutations abolishing heparin binding. This study opens new possibilities to define the pathogenesis of Lp(a)-related diseases and to facilitate the design of specific therapeutic drugs.


Introduction
Human Lipoprotein (a) [Lp(a)] is a macromolecular complex present in plasma originally described by Kåre Berg (1). Its physiological function is still unknown, but the increase in its concentration can double or even triple cardiovascular risk (2,3). Lp(a) is an LDL-like lipoprotein, and it shares both lipid composition (4) and apoB100 with LDLs. Unique to Lp(a) is the presence of apolipoprotein (a) [apo(a)] (5-8), a highly polymorphic glycoprotein which is covalently linked to apoB100 by a disulfide bond.
A characteristic feature of apo(a) is the presence of multiple kringle domains (9,10), which are identified by their high homology to plasminogen kringle IV and kringle V. In particular, apo(a) is composed of ten different kringle IV subtypes (designated 1 to 10) with high interdomain homology, a kringle V domain and a serine protease domain (11). Nine of the ten kringle IV types (1 and 3-10) are present in a single copy, while the number of kringle IV type 2 (KIV-2) copies can vary between 1 and more than 40 (2,12,13). This variability is responsible for the great heterogeneity of isoforms present in the general population and is related to CVD risk (2). In fact, low molecular weight isoforms (22 or fewer total KIV repeats) are associated with 4- to 5-fold higher Lp(a) plasma concentrations than high molecular weight isoforms (more than 22 KIV repeats) (14), with low concentration variability within families (15,16). Each kringle is stabilized by three intramolecular disulfide bonds joined in a 1-6, 2-4 and 3-5 scheme (17). Kringle domains are found in plasminogen (18,19), several other plasma proteins (20-25) and membrane receptors (26).
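The isoform size rule described above can be written as a short calculation. The following is a minimal sketch, assuming that the total KIV count is the number of KIV-2 copies plus the nine single-copy KIV types; the function and constant names are illustrative and do not come from the cited studies.

```python
SINGLE_COPY_KIV_TYPES = 9  # KIV types 1 and 3-10, one copy each (from the text)
LMW_MAX_TOTAL_KIV = 22     # 22 or fewer total KIV repeats: low molecular weight isoform


def total_kiv_repeats(n_kiv2_copies):
    """Total KIV repeats, assuming total = KIV-2 copies + 9 single-copy types."""
    if n_kiv2_copies < 1:
        raise ValueError("at least one KIV-2 copy is expected")
    return SINGLE_COPY_KIV_TYPES + n_kiv2_copies


def isoform_class(n_kiv2_copies):
    """Return 'LMW' or 'HMW' according to the 22-repeat threshold."""
    total = total_kiv_repeats(n_kiv2_copies)
    return "LMW" if total <= LMW_MAX_TOTAL_KIV else "HMW"
```

Under this assumption, an allele with 13 KIV-2 copies sits exactly at the threshold (22 total KIV repeats) and is still classified as low molecular weight.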
The best characterised function of kringle domains is to mediate protein-protein interactions through their lysine binding site (LBS) pocket. Among the apo(a) KIV domains, KIV-10 has the highest homology to plasminogen K4 and the most conserved, highest-affinity lysine-binding pocket (Table 1). In fact, a dominant role for KIV-10 in the lysine-binding function of apo(a) was first demonstrated by the loss of binding observed in a KIV-10 W70R mutant of Rhesus monkey (27). Further lysine binding capacity was later found localised within the KIV-5 to KIV-9 domains, while the KIV-1 to KIV-4 domains did not show any significant lysine binding capacity (28). Within the KIV-5 to KIV-9 range, domains KIV-6 and KIV-7 have a crucial role in Lp(a) assembly (29-32) and their LBS shows moderate affinity for the lysine analogue ε-aminocaproic acid (EACA, 0.2-0.3 mM, see Table 1). Structurally, the LBS is a surface pocket consisting of a cationic, an anionic and a central hydrophobic binding centre, each defined by specific clusters of residues. The residue types present in the LBS of each kringle type determine the size of the pocket and the affinity for EACA and for other ligands.
While the number of KIV-2 repeats in apo(a) is inversely related to Lp(a) plasma concentration, suggesting a key role for this domain in pathogenesis, its function is not yet completely clear. Its ability to bind EACA has never been experimentally determined and is likely to be affected by the differences observed in the three binding centres when compared to, for example, the high-affinity EACA binding site of KIV-10 (see alignment, Fig. 1). There is experimental evidence based on two-hybrid systems, however, that a single KIV-2 domain can bind fibulin-5 (also known as DANCE (33)) and apolipoprotein H (also known as β2-glycoprotein I, β2GPI (34)).
However, for the above reasons, these interactions are unlikely to be based on a classical kringle lysine binding mode.
With the number of repeated KIV-2 domains affecting Lp(a) plasma levels, and hence cardiovascular disease risk, a full understanding of KIV-2 function and molecular structure is essential for the development of new strategies for interfering with Lp(a) pathophysiological function in heart diseases and stroke. Here, we describe the production and purification of the protein in recombinant form and the determination of its high-resolution structure by X-ray crystallography. We have also analyzed its binding to low molecular weight molecules and explored possible interaction modes with relevant proteins by in silico analysis.
Downloaded from www.jlr.org by guest, on November 6, 2020

Cloning, expression and purification
The cDNA sequence encoding a single KIV-2 domain was derived from the LPA full cDNA sequence deposited in the NCBI Nucleotide database (NM_005577.2). Seven linker amino acids were included at the C-terminus of the KIV-2 peptide downstream of the last cysteine (Cys) to facilitate formation of all the correct intra-molecular disulfide bonds. At the 5'-terminus of the KIV-2 encoding sequence, a 6x histidine tag encoding sequence was added, followed by the recognition site for Tobacco Etch Virus (TEV) protease. The KIV-2 gene fragment was obtained by gene synthesis (GeneArt, Thermo Fisher Scientific, U.S.A.) and subcloned into the pET-45b(+) expression vector between restriction sites Nco I and Not I. After sequencing of the insert, the construct was transformed into E. coli BL21(DE3). The protein was obtained with the non-classical inclusion bodies technique using 2xTY medium (16 g/l tryptone, 10 g/l yeast extract, 5 g/l NaCl) supplemented with 100 µg/ml ampicillin for overexpression (35). Induction was performed at an OD600 of 0.85 AU by adding 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) and incubating the cultures at 18°C, 250 rpm, for 24 h (35).
Cells were collected by centrifugation and resuspended in 50 mM Tris-HCl, 500 mM NaCl, pH 8.5. The cell suspension was subjected to three sequential freeze-thaw cycles (alternating -80°C and room temperature) to disrupt cell walls. Then, sonication was performed at 40% power, on ice, by 10 cycles of 1 min each using an OMNI Ruptor 400 sonicator. The insoluble fraction containing inclusion bodies was precipitated by centrifugation (10000 x g, 4°C, 20 min) and resuspended in 2 M L-arginine, 500 mM NaCl, 50 mM Tris-HCl, pH 8.5 by homogenization using a glass potter (36). Reduced and oxidized glutathione (5 mM and 0.5 mM final concentration, respectively) were added prior to incubating the sample for 3-4 days while stirring at 4°C to allow protein solubilization. Then, the remaining insoluble fraction was removed by centrifugation (15000 x g, 4°C, 15 min) and the supernatant diluted 1:50 in 50 mM Tris-HCl, 500 mM NaCl, pH 8.5. The recombinant protein was purified by affinity chromatography using a 5 ml HisTrap HP column (GE Healthcare). The eluted protein was further purified using a Superdex 75 10/300 GL column (GE Healthcare). Pure, monomeric KIV-2 protein to be used for crystallization was buffer-exchanged into 20 mM Tris-HCl, 100 mM NaCl, pH 8.0 using centrifugal concentrators (3 kDa cut-off, Millipore).
The Q7E, R10E, R51E triple mutant was obtained by gene synthesis (Twist Biosciences, U.S.A.) and subcloned into pET-45b(+) using Nco I and Hind III restriction enzymes. Expression was performed as described above, reducing the pH of the extraction buffer to 7.0 and salt concentration to 200 mM after optimization. The recombinant protein was purified using the same protocol described above for the wild type, though yields were much lower (typically 0.15 mg/l versus 30 mg/l).

Small-molecule and protein binding prediction
If not otherwise stated, KIV-2 chain C (residues 1 to 78) was used as a monomer to predict both small-molecule binding and protein-protein interactions. Small-molecule binding was predicted using 3DLigandSite (40) and ProBiS (41); heparin binding prediction was done using the ClusPro server (42), setting heparin as a ligand in the search options. For protein-protein interaction modeling, the Z-DOCK server (43) was used. Protein-protein model refinement and ranking were performed for the top 10 models using the CONSRANK server (44), and the two best-scoring models were analyzed. A model of fibulin-5 (UniProt: Q9UBX5) was obtained with I-TASSER (45). Modeller (46) was used to model KIV-2 self-assembly. Models were visualized with PyMOL (The PyMOL Molecular Graphics System, Version 2.0 Schrödinger, LLC). Rotation and translation matrices were calculated by Superpose (47).

Surface Plasmon Resonance (SPR) analysis
Affinity of small molecules for the KIV-2 domain was investigated through Surface Plasmon Resonance (SPR), using a Biacore™ T200 (GE Healthcare) apparatus. All experiments were carried out at an operating temperature of 25°C and using PBS-T (137 mM NaCl, 2.7 mM KCl, 9 performed for each small molecule with a contact time of 60 s, a dissociation time of 400 s and a flow rate of 10 µl/min (n=3).
Heparin binding was investigated on a XanTec heparin chip (XanTec bioanalytics GmbH) using 20 mM Tris, pH 7.7, 100 mM NaCl as a running buffer and injecting different KIV-2 concentrations between 0 and 40 µM. The flow rate was 15 µl/min, the contact time 300 s and the dissociation time 300 s. Regeneration of the surface was obtained by one 60 s pulse of 0.05% w/v SDS in water followed by one 60 s pulse of 1 M NaCl. When needed, pre-incubation of KIV-2 was performed with heparin (enoxaparin, Clexane) in a 30:1 molar ratio.
The KIV-2 triple mutant was always analyzed in parallel with the wild type under similar running conditions, but the low yield of its purification allowed testing only concentrations up to 5 µM.
Results were analyzed with Biacore T200 Evaluation Software 3.0 and with Scrubber (http://www.biologic.com.au/scrubber.html). Relative Binding Response Units (RelRUs) were compared by Student's t-test. RelRUs for heparin binding were obtained for each curve using the Biacore T200 Evaluation Software 3.0. Data points were then used for non-linear regression fitting using the one-site specific binding equation by GraphPad Prism.
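The one-site specific binding model relates the response to the analyte concentration C as RU = Bmax·C/(KD + C). As a minimal sketch of how such a fit can be carried out, the pure-Python example below estimates Bmax and KD by a repeated log-spaced grid search over KD, using the closed-form least-squares Bmax for each candidate KD. This is an illustration only and is not the regression routine used by GraphPad Prism; the function names, search bounds and any data passed to them are assumptions.

```python
def one_site_binding(conc, bmax, kd):
    """Response predicted by the one-site specific binding model."""
    return bmax * conc / (kd + conc)


def fit_one_site(concs, responses, kd_min=1e-3, kd_max=1e4, rounds=6, steps=200):
    """Least-squares estimate of (bmax, kd, sse) for one-site specific binding.

    For a fixed kd the model is linear in bmax, so the optimal bmax has a
    closed form; kd is refined by repeated log-spaced grid searches.
    At least one concentration must be non-zero.
    """
    lo, hi = kd_min, kd_max
    best = None
    for _ in range(rounds):
        ratio = (hi / lo) ** (1.0 / steps)  # multiplicative grid spacing
        kd = lo
        candidates = []
        for _ in range(steps + 1):
            f = [c / (kd + c) for c in concs]  # fractional occupancy
            bmax = sum(x * y for x, y in zip(f, responses)) / sum(x * x for x in f)
            sse = sum((y - bmax * x) ** 2 for x, y in zip(f, responses))
            candidates.append((sse, kd, bmax))
            kd *= ratio
        sse, kd_best, bmax_best = min(candidates)
        best = (bmax_best, kd_best, sse)
        # Narrow the search window to the neighbours of the best grid point.
        lo, hi = kd_best / ratio, kd_best * ratio
    return best
```

Concentrations and KD share the same unit (for example µM), and Bmax is returned in the unit of the response. The grid search avoids any dependency on a numerical library, at the cost of providing no confidence intervals.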

Overall structure
KIV-2 crystallized in space group C 1 2 1 with 3 molecules per asymmetric unit. Details of data processing and structure refinement are given in Table 2. In the PDB file (PDB ID: 6RX7), residue numbering starts with 1 at the first Cys residue to facilitate comparison to other previously described kringles (Fig. 1). The Glu located immediately before residue C1 is indicated with zero and further N-terminal residues with negative numbers. Residues 0 to -10 belong to the apo(a) interdomain linker and from -11 to -13 represent part of the TEV recognition site. The KIV-2 core structure is well defined in all the three copies present in the asymmetric unit.
Residues -10 to 78 could be modeled for chain A, 0 to 78 for chain B and -13 to 79 for chain C.
The relatively large part of the structure which could not be rebuilt is the main reason for the relatively high Rwork and Rfree values observed. Superposition of the Cα traces of the three chains shows no significant differences (r.m.s.d. values are 0.37 Å for chain A vs. chain B, 0.36 Å for chain B vs. chain C and 0.34 Å for chain C vs. chain A) and the Ramachandran plot reveals that 1% of the residues (Val56 from all three molecules; electron density in Supplemental Figure 10) do not lie in the allowed regions (Table 2). The LBS of chain A contains a glycerol molecule, and that of chain B a sulfate ion, deriving from the cryoprotective and crystallization solutions, respectively. Superposition of the same region of chain C, which has nothing bound, shows no relevant changes in side chain orientation in all local residues, except for R35 in chain A, whose side chain is slightly turned and faces the LBS pocket, compared to the other two chains (Supplemental Fig. 1).
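For reference, the r.m.s.d. between two traces is the square root of the mean squared distance between equivalent atoms. The sketch below computes it for two coordinate lists that are assumed to be already optimally superposed and to list equivalent Cα atoms in the same order; the superposition step itself is not shown, and the function name is illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z).

    Assumes the two traces are already superposed and that atoms at the
    same index are equivalent.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

Coordinates given in Å yield an r.m.s.d. in Å.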
The overall structure is lentil-shaped, and very similar to that of other apo(a) kringles previously determined, which include KIV-6, KIV-7, KIV-8, KIV-10 and KV (Supplemental Fig. 2 illustrates the superposition of Cα-traces). KIV-8 shows a significant divergence, but its structure has been determined by NMR and only one member of the ensemble has been randomly chosen for this superposition.
The secondary structure of KIV-2 is mainly composed of turns and coils (Supplemental

The Lysine Binding Site
KIV-10 has the highest affinity among apo(a) kringles for the lysine mimic EACA. Its sequence homology to KIV-2 is 79% (Fig. 1). Seven out of the 16 residues that are different are located along the LBS border, which, in both kringles, is made up of residues S32-R35, D54-V56, Y60-Y62 and R69-Y72 (Fig. 2a, light blue).
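The 79% figure is consistent with 16 differing positions if the comparison spans the 78 residues of the kringle, since (78 - 16)/78 ≈ 79.5%; the 78-residue span is an assumption here. A minimal sketch of the calculation follows. The function names are illustrative, and any sequences passed to it are expected to be pre-aligned, with gaps written as "-".

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the gap-free columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where neither sequence has a gap.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def identity_from_differences(length, n_differences):
    """Percent identity given the compared length and the number of mismatches."""
    return 100.0 * (length - n_differences) / length
```

Excluding gap columns from the denominator is one common convention; other conventions (for example dividing by the alignment length) give slightly different values.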
Three out of the 4 residues forming hydrogen bonds with EACA in KIV-10 (R35, D54, and R69), are conserved in KIV-2, while D56 is replaced by a Val residue (Fig. 2b). In the LBS cationic center (formed by residues R35 and R69), the side chain of residue R35, which forms a hydrogen bond with the O2 atom of EACA in the bound form of KIV-10, is oriented outwards in the unbound form, and it establishes a hydrogen bond with P53 instead. In KIV-2 (chain C) and Y62, respectively. Thus, the bottom surface of the KIV-2 pocket carries two oxygen atoms that can hamper interaction with the hydrophobic part of EACA because of the charge and steric hindrance they create (Fig. 3a). Moreover, S34 and Y62 contribute hydroxyl groups in the anionic center, further reducing EACA binding affinity (Fig. 3b).

Binding of heparin to KIV-2
Heparan sulfate, structurally related to heparin, binds a wide range of proteins of different functionality, some of them through an interaction with a kringle domain, and is involved in various physiological and pathological processes. KIV-2 heparin binding sites were predicted by homology to the Hepatocyte Growth Factor/Scatter Factor kringle structure in complex with heparin (PDB ID: 3SP8). The hit was obtained using the ProBis server with KIV-2 chain C as a search model. The interaction with heparin was also investigated by ClusPro. Nine out of the 10 simulation results clustered into two main regions (Supplemental Fig. 6a-c), neither involving the LBS. In particular, binding was predicted to occur through the highly conserved residues Q7, R10, P41, N42, R51 and P53 (Fig. 1 and Supplemental Fig. 6d). Subsequent SPR experiments allowed the calculation of an average KD for heparin binding of 77.61 ± 21.80 µM (Supplemental Fig. 7a). Sensorgrams suggest complex kinetics, with a very stable residual interaction with the chip after an initially fast dissociation phase, especially at higher concentrations (Supplemental Fig. 7b). Pre-incubation of 20 µM KIV-2 with a 30-fold molar excess of heparin reduced binding to the heparin-coated chip by ca. 10 times (stability at 10 s after injection end: 695.9 ± 131.4 vs 71.9 ± 11.3 RUs, p=0.001). The introduction of three reverse charge mutations, Q7E, R10E and R51E, completely abolished binding to the heparin-coated chip (i.e. resulted in flattened SPR sensorgrams) (Supplemental Fig. 7b).
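The "ca. 10 times" reduction follows directly from the two stability values. The sketch below computes the ratio and, as an illustration, a first-order propagated uncertainty; it assumes that the ± values are standard deviations of independent measurements, which is not stated in the text, and the function name is hypothetical.

```python
import math


def fold_change(mean_a, sd_a, mean_b, sd_b):
    """Ratio mean_a / mean_b with first-order error propagation.

    Assumes sd_a and sd_b are standard deviations of independent
    measurements (an assumption made for this illustration).
    """
    ratio = mean_a / mean_b
    relative_error = math.sqrt((sd_a / mean_a) ** 2 + (sd_b / mean_b) ** 2)
    return ratio, ratio * relative_error


# Values reported above: KIV-2 alone vs. KIV-2 pre-incubated with heparin (RUs)
ratio, ratio_sd = fold_change(695.9, 131.4, 71.9, 11.3)
```

With these inputs the ratio is about 9.7, in line with the roughly tenfold reduction reported.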

KIV-2 self-assembly
Crystal packing can often provide relevant information about biological interactions. In the case of the KIV-2 domain, the asymmetric unit includes three monomers (A, B and C) assembled in a horse-shoe shape. The A and C chains are related by a complex rigid transformation, which includes both a 164° rotation around an axis centered within the B-domain and a translation.
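A rigid transformation of this kind, a rotation about an axis passing through a given point followed by a translation, can be sketched with Rodrigues' rotation formula. The axis direction, axis point and translation vector of the actual A-to-C transformation are not given in the text, so all arguments below are placeholders to be supplied by the user, and the function names are illustrative.

```python
import math


def rotate_about_axis(point, axis_point, axis_dir, angle_deg):
    """Rotate a 3D point by angle_deg about the axis through axis_point along axis_dir."""
    norm = math.sqrt(sum(c * c for c in axis_dir))
    k = [c / norm for c in axis_dir]                   # unit axis vector
    v = [p - a for p, a in zip(point, axis_point)]     # shift so the axis passes through the origin
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    k_dot_v = sum(ki * vi for ki, vi in zip(k, v))
    # Rodrigues' formula: v*cos + (k x v)*sin + k*(k.v)*(1 - cos)
    rotated = [
        v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
        for i in range(3)
    ]
    return tuple(r + a for r, a in zip(rotated, axis_point))


def rigid_transform(point, axis_point, axis_dir, angle_deg, translation):
    """Apply the rotation first, then add the translation vector."""
    rotated = rotate_about_axis(point, axis_point, axis_dir, angle_deg)
    return tuple(r + t for r, t in zip(rotated, translation))
```

Applying `rigid_transform` to every atom of one chain with `angle_deg=164` and the appropriate axis and translation would map it onto its partner; the distance of each atom from the rotation axis is preserved by the rotation step.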
Interdomain interactions occur through LBS-independent binding interfaces. For chain B some residues of the LBS are partially involved, but not in a classical lysine binding mechanism, as the binding pocket is completely interaction-free (Supplemental Fig. 8a).
Interfaces involved in trimer formation are of similar sizes (255.5 Å² and 240.5 Å² for interfaces AB and BC, respectively). Chain B, which is sandwiched between chains A and C, interacts with them through two independent surface patches comprising residues 3-8, 55-57 and 77 (for the interaction with chain A) and residues 32-35, 67-70 (for the interaction with chain C), respectively, whose sequences are unrelated. Interestingly, the same group of residues (38-42 and 53-55) on both chains A and C interacts with chain B. Further interactions provided by residues S34 and R35 of chain A justify the higher interface surface area provided by the latter.
Each KIV-2 domain can therefore provide at least three different interaction surfaces for self-assembly, none directly involving the LBS pocket.
Crystal contacts also generate a linear chain of KIV-2 tandem repeats (Supplemental Fig. 8b). In fact, a model including 3 KIV-2 monomers and generated in silico by using Modeller shows a configuration highly similar to that experimentally determined for the asymmetric unit, suggesting that the type of crystal contacts observed in this kind of interaction might be physiologically favored, and not only a crystal feature. Interestingly, the structure of angiostatin kringles 1-3 (PDB ID: 1KI0) closely resembles the one observed in the KIV-2 asymmetric unit.

Prediction of KIV-2 binding modes to selected protein interactors
The interaction of KIV-2 with known protein ligands was explored considering its monomeric state and selecting predicted complexes according to CONSRANK score (Supplemental Table   by (52). It is organized into five sushi domains (Fig. 4) and a C-terminal domain named "fifth domain" oriented perpendicularly to the sushi domain repeats.
Binding of KIV-2 to β2GPI occurred in the first two top-ranking models in the hinge region between sushi domain 4 and the fifth domain. Consensus contacts involved the two NAG molecules associated with β2GPI, while interface analysis shows that the KIV-2 LBS is not involved (Fig. 4).
In order to study the interaction of KIV-2 with fibulin-5 (33), a model of the latter was generated by homology modelling using I-TASSER (Supplemental Table S1). Although a direct comparison with the original SAXS data is not possible, the general architecture of the fibulin-5 model looks similar to the one obtained from SAXS data (53), with the C-terminal domain bent by 90° with respect to the stalk formed by the EGF-like tandem repeats. This structure is also similar to that of β2GPI (Fig. 4).
According to experimental data (33), KIV-2 binding to fibulin-5 occurred in a very specific region comprising 98 C-terminal amino acids (53). Interface consensus analysis of the two top-ranking KIV-2-fibulin-5 complexes shows the partial involvement of the region comprising the LBS of KIV-2 (Fig. 4).
According to the literature, fibronectin can also interact with full-length apo(a) through a very short peptide belonging to type III fragment 12 (FN12) in a non-lysine-dependent way (54).
Therefore, interaction models of KIV-2 and FN12 were built which showed good consensus.
Their interface surfaces comprise the LBS of KIV-2, but the pocket is free and most of the interactions are hydrophobic (Fig. 4).

Discussion
In this work, we report the first high-resolution crystallographic structure of the so far elusive KIV-2 domain of apo(a). This domain shows a high conservation of the general fold compared to other known KIV domains, but, as detailed above, the fine structure of the KIV-2 LBS is significantly different from that of apo(a) kringles with high and medium affinity for EACA (Table 1). The main differences affect the LBS pocket shape and the surface charges typically involved in EACA binding, leading to the negligible affinity found for EACA and t-AMCHA.
These data indicate that the main functional role of KIV-2 cannot be related to a classical lysine binding mechanism mediated by an optimized LBS and are in line with previous sequence-based hypotheses and experimental findings (28). In fact, structural differences generate an "affinity gradient" for EACA going from the high-affinity C-terminal KIV-10 to the low affinity N- The cationic center is totally conserved in R35 and R69 throughout all apo(a) kringles. Notably, in KIV-2 the R35 side chain is oriented in a conformation more similar to KIV-10 with bound EACA than to the KIV-10 unbound structure. A similar orientation is found in all three molecules in the asymmetric unit, with minor changes in the glycerol- and sulfate-bound chains A and B, respectively. In the three-dimensional structure of KIV-2, the presence of the S34 OH group in the proximity of the LBS binding site contributes to impairing EACA binding. which interact with the negatively charged groups of the glycosaminoglycan. In fact, the main binding site was predicted and experimentally confirmed to be located in a positively charged crevice, involving residues Q7, R10 and R51, close to the LBS, a situation similar to the one observed for Hepatocyte Growth Factor NK1 fragment (55,56), suggesting a potential role of heparin in in vivo interactions. According to the results derived from the triple mutant, charge reversal is the main mechanism expected to be involved in the loss of heparin binding.
However, as residues R10 and R51 establish critical hydrogen bonds with neighbouring residues through their side chains, we consider disruption of the surface complementarity of this specific region an alternative possibility. Apolipoproteins B and E are also known to have heparin binding motifs and to bind to heparin and other glycosaminoglycans, contributing to lipoprotein lipolysis and to receptor-mediated uptake of the remnants (57). Heparin binding of KIV-2 therefore deserves particular attention, as it might contribute to both processes. In this respect, its specificity and complex kinetics, which resemble those observed for the 10 kDa fragment of apoE3 (58), need to be dissected further.
Structural features of KIV-2 protein-protein interactions were explored for fibulin-5 and β2GPI (33,34). Protein docking showed that interactions between them and KIV-2 do not involve the kringle LBS. Interestingly, the models suggest an interaction pose in which β2GPI and fibulin-5 might be "hooked" onto Lp(a) KIV-2 tandem repeats. The interaction of fibronectin type III with apo(a) has been described as well, but there is no definitive proof of its direct interaction with the KIV-2 component of apo(a). The interaction between fibronectin and apo(a) was described as being independent of a lysine-binding mechanism. We therefore speculated that the lysine-binding deficient KIV-2 might play a main role in the interaction of fibronectin with apo(a). The obtained models were quite similar and showed a recurrent mode of interaction between KIV-2 and FN12. The modeled information must still be confirmed by experimental data, but it nevertheless indicates a possible further involvement of the repetitive module of apo(a) in binding a protein which is part of the extracellular matrix involved in atherogenesis.
The availability of the KIV-2 atomic structure also allows a detailed mapping of the Single Nucleotide Polymorphisms (SNPs) very recently described through a systematic population analysis (59). Only 25 positions out of the 78 residues of KIV-2 were found to be absolutely conserved (light blue in Supplemental Fig. 9). They are either single or paired residues along the primary sequence. The only exception is the 68-72 stretch, which includes the five-consecutive-residue motif (VRWEY) forming one of the two KIV-2 anti-parallel β-sheets, suggesting that this strand represents a structural core element. For all the other positions, one or more non-synonymous SNPs were described (purple in Supplemental Fig. 9), indicating a strong structural tolerance to mutations and raising questions about the role of these regions in KIV-2 function.
In conclusion, our work reveals the first available high-resolution crystal structure of KIV-2 and provides a structural basis for the experimentally determined very low affinity of its LBS towards small molecules like EACA and t-AMCHA. As the structure confirms that the KIV-2 lysine binding site is, in fact, not structurally suitable for lysine and EACA binding, we propose to resolve the resulting semantic issue by renaming the KIV-2 LBS "pseudo-LBS" (pLBS).
For the first time, we also contribute in silico and in vitro experimental evidence for the interactions of KIV-2 with heparin which we proved not to be mediated by the pLBS. This study has clarified that KIV-2 pLBS has, if any, a non-classical role in apo(a) function and has identified possible alternative sites for protein-protein and protein-heparin interactions, opening possibilities for the design of new therapeutic drugs for Lp(a) related diseases.

Data Availability Statement
All data are available in this manuscript, except for the structural atomic coordinates of the