The association of plasma cystatin C proteoforms with diabetic chronic kidney disease

Cystatin C (CysC) is an endogenous cysteine protease inhibitor that can be used to assess the progression of kidney function. Recent studies demonstrate that CysC is a more specific indicator of glomerular filtration rate (GFR) than creatinine. CysC in plasma exists in multiple proteoforms. The goal of this study was to clarify the association of native CysC, CysC missing N-terminal Serine (CysC des-S), and CysC without three N-terminal residues (CysC des-SSP) with diabetic chronic kidney disease (CKD). Using mass spectrometric immunoassay, the plasma concentrations of native CysC and the two CysC truncation proteoforms were examined in 111 individuals from three groups: 33 non-diabetic controls, 34 participants with type 2 diabetes (DM) and without CKD and 44 participants with diabetic CKD. Native CysC concentrations were 1.4 fold greater in CKD compared to DM group (p = 0.02) and 1.5 fold greater in CKD compared to the control group (p = 0.001). CysC des-S concentrations were 1.55 fold greater in CKD compared to the DM group (p = 0.002) and 1.9 fold greater in CKD compared to the control group (p = 0.0002). CysC des-SSP concentrations were 1.8 fold greater in CKD compared to the DM group (p = 0.008) and 1.52 fold greater in CKD compared to the control group (p = 0.002). In addition, the concentrations of CysC proteoforms were greater in the setting of albuminuria. The truncated CysC proteoform concentrations were associated with estimated GFR independent of native CysC concentrations. Our findings demonstrate a greater amount of CysC proteoforms in diabetic CKD. We therefore suggest assessing the role of cystatin C proteoforms in the progression of CKD.


Background
Cystatin C (CysC) is a cysteine proteinase inhibitor belonging to the type 2 cystatin gene family [1-3]. It is a non-glycosylated single chain protein with a molecular weight of 13,343 Da. CysC is produced at a constant rate by nucleated cells and is freely filtered by the renal glomerulus; it has therefore been used to monitor the progression of chronic kidney disease (CKD) [4].
CKD is defined by a decline of glomerular filtration rate (GFR) to <90 mL/min/1.73 m² and by the presence of kidney damage for at least 3 months [5,6]. CKD affects 26 million American adults, while millions of others are at increased risk. In clinical practice, estimated GFR (eGFR) is used to evaluate CKD and categorize the disease into different stages (1-5) based on severity. GFR is defined as the volume of plasma that can be completely cleared of a particular substance by the kidneys in a unit of time [7]. Accurate diagnosis and staging of CKD are critical for therapy and clinical management.
Currently, serum creatinine is used to calculate eGFR. However, eGFR determined by creatinine-based equations can be biased by several factors that are not related to kidney function, including age, race, sex, muscle mass, dietary intake and other factors affecting plasma creatinine levels. In a recent study by Stevens et al., creatinine-based estimates of GFR could not accurately detect acute changes in kidney function [8]. In addition, 5-10 % of excreted creatinine originates from distal tubule secretion. This secretion increases in response to decreased GFR, making it difficult to accurately detect small to moderate changes in GFR [2]. Therapies that block distal tubule secretion of creatinine (e.g., trimethoprim, cimetidine, cefoxitin) can also increase serum creatinine levels and alter the GFR estimate [9].
In a meta-analysis by Shlipak et al. [10], reclassification of kidney function with CysC improved the prediction of cardiovascular and renal morbidity and mortality. Additionally, in experimental mouse models of acute renal failure, CysC was a more sensitive measure of renal insufficiency after bilateral nephrectomy compared to serum creatinine [11]. In a study by Coll et al. [12], serum creatinine levels were increased in 92.15 % of patients with impaired renal function, whereas serum CysC levels were increased in 100 % of patients. CysC started increasing at an eGFR value of 88 mL/min/1.73 m², while creatinine levels increased at an eGFR of 75 mL/min/1.73 m². Furthermore, the study showed that only CysC levels were significantly elevated in hypertensive patients with no evidence of renal failure and minimal proteinuria [12].
CysC undergoes posttranslational modifications in the plasma to form multiple proteoforms [13]. Protein modifications play important roles in biological processes, and can serve as diagnostic indicators of pathological events. Mass spectrometric immunoassay (MSIA) is a high throughput assay well suited to identify and quantify molecular variants and posttranslational modifications of plasma proteins [14,15]. MSIA involves the isolation of protein moieties from a biological milieu by immobilized antibodies, followed by mass spectrometric detection. We recently reported three posttranslationally modified CysC proteoforms in a population of 500 healthy adults: one containing hydroxyproline and two truncated proteoforms, one missing the N-terminal serine and one lacking three N-terminal residues [13]. The distribution of CysC proteoforms in CKD is not known. In the present study, we investigate the CysC truncations in non-diabetic controls and patients with DM and CKD using MSIA.

Clinical samples
The study was approved by the University of Arizona Institutional Review Board, and written informed consent was obtained from all patients. Three groups of adult participants were recruited: 33 participants without CKD and without DM (control), 34 participants without CKD and with DM (DM group), and 44 participants with both CKD and DM (CKD group). Participants reported to the Center for Clinical and Translational Sciences after an overnight fast. Blood was collected for clinical laboratory measurements (lipid profile, Hemoglobin A1C (HbA1C), C-reactive protein (CRP), and fasting insulin). Additional samples were collected in EDTA tubes, and plasma from these samples was separated and immediately frozen at −80°C for all other assays. Demographic information (age, sex, ethnicity), physical exam measurements (blood pressure, waist circumference, weight, height, body mass index (BMI)), medication use, and medical history (hypertension, hyperlipidemia, smoking, type and duration of diabetes) were also recorded. GFR was estimated using the Modification of Diet in Renal Disease Study (MDRD) equation. Assignment of CKD (stages 1-5) was based on eGFR levels as described [16]. Exclusion criteria included the following: type 1 diabetes, participation in an active weight loss program, pregnancy, dysuria, thyroid dysfunction, history of cancer, HIV, active infection, other ongoing serious illness or current steroid use. Diabetes classification was based on clinical and medication history, or glycated hemoglobin greater than 6 %.
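As an illustration of the eGFR estimation and staging step, the sketch below assumes the four-variable MDRD Study equation in its IDMS-traceable form (coefficient 175; the original form uses 186) and the conventional eGFR cut-offs for stages 1-5. The text does not state which version of the equation or which cut-offs from [16] were applied, so the function names, the coefficient choice, and the thresholds are assumptions made for illustration only.

```python
def mdrd_egfr(scr_mg_dl, age_years, female, black):
    """Estimate GFR (mL/min/1.73 m^2) with the 4-variable MDRD equation.

    Assumption: IDMS-traceable form (coefficient 175). The paper does not
    say which form was used.
    """
    egfr = 175.0 * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr


def ckd_stage_from_egfr(egfr):
    """Map eGFR to the conventional stage number (1-5).

    Stages 1 and 2 additionally require evidence of kidney damage,
    which this helper does not check.
    """
    if egfr >= 90:
        return 1
    if egfr >= 60:
        return 2
    if egfr >= 30:
        return 3
    if egfr >= 15:
        return 4
    return 5


# Hypothetical participant: serum creatinine 1.0 mg/dL, 50 years old.
male_egfr = mdrd_egfr(1.0, 50, female=False, black=False)
female_egfr = mdrd_egfr(1.0, 50, female=True, black=False)
print(round(male_egfr, 1), ckd_stage_from_egfr(male_egfr))
print(round(female_egfr, 1), ckd_stage_from_egfr(female_egfr))
```

Because the equation multiplies by fixed sex and race factors, two participants with identical serum creatinine can fall into different eGFR ranges, which is one reason the creatinine-based estimate is sensitive to demographic inputs.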

Analytical sample preparation
Frozen plasma samples were briefly thawed on ice and centrifuged at 3000 rpm for 5 min, and 50 μL aliquots were placed into low profile 96-well trays. To generate the standard curve, CysC standard was serially diluted in PBS buffer containing 3 g/L BSA to final concentrations of 1.25 mg/L (standard 1), 0.625 mg/L (standard 2), 0.313 mg/L (standard 3), 0.156 mg/L (standard 4), 0.078 mg/L (standard 5) and 0.039 mg/L (standard 6). Beta lactoglobulin (BL) was used as an internal reference standard (IRS) for quantification, and was prepared in water to a final concentration of 1 mg/mL. Human plasma samples were diluted 5-fold in PBS, 0.1 % Tween prior to analysis. The analytical samples were prepared by combining 20 μL of the CysC standard solution or the diluted plasma with 40 μL of a 1 mg/mL solution of BL and 100 μL of PBS, 0.1 % Tween buffer.
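The arithmetic behind the standard series and the analytical sample composition can be sketched as follows. The two-fold dilution factor is inferred from the listed concentrations, and the variable and function names are hypothetical; the volumes are those given in the text.

```python
def serial_dilution(top_conc_mg_l, n_levels, factor=2.0):
    """Concentrations of a serial dilution starting at top_conc_mg_l."""
    return [top_conc_mg_l / factor ** i for i in range(n_levels)]


# Six standards starting at 1.25 mg/L, halved at each step.
standards_mg_l = serial_dilution(1.25, 6)

# Analytical sample composition per well (volumes in microliters).
sample_ul = 20.0   # CysC standard or pre-diluted plasma
bl_ul = 40.0       # beta lactoglobulin internal reference standard
buffer_ul = 100.0  # PBS, 0.1 % Tween

well_volume_ul = sample_ul + bl_ul + buffer_ul
in_well_dilution = well_volume_ul / sample_ul

# Plasma is pre-diluted 5-fold before it is added to the well.
plasma_predilution = 5.0
plasma_overall_dilution = plasma_predilution * in_well_dilution

for i, conc in enumerate(standards_mg_l, start=1):
    print(f"standard {i}: {conc:.4f} mg/L")
print("well volume (uL):", well_volume_ul)
print("overall plasma dilution:", plasma_overall_dilution)
```

Note that standards and plasma share the same in-well dilution, but only plasma carries the additional 5-fold pre-dilution; whether and how that factor is applied during quantification is not stated in the text.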

Mass spectrometric immunoassay
Affinity pipette derivatization and assay execution were performed on a Multimek 96-channel pipettor (Beckman Coulter, Brea, CA) as previously described [17]. Initially, affinity pipettes were derivatized with antibodies towards CysC (the targeted protein) and BL (the internal reference standard) in a ratio of 4.5:1 (w/w; CysC:BL, 3.6 μg anti-CysC, 50 μL/well), as described previously in the protocol [13,17]. Antibody-derivatized pipettes were stored at +4°C until used. Prior to sample protein extraction, the derivatized affinity pipettes were prerinsed with assay buffer (PBS, 0.1 % Tween, 10 aspiration/dispense cycles, 100 μL each). Pipettes were then immersed into a microplate containing analytical samples, and 250 aspiration/dispense cycles were performed (100 μL each), allowing for optimal affinity capture of both CysC and BL. The pipettes were then rinsed with PBS, 0.1 % Tween (100 cycles, 100 μL aspiration/dispense volumes), and twice with water (10 cycles and 20 cycles, respectively, 100 μL aspiration/dispense each). Protein-loaded tips were then exposed to 6 μL aliquots of MALDI matrix solution (25 g/L sinapic acid in aqueous solution containing 33 % (v/v) acetonitrile and 0.4 % (v/v) trifluoroacetic acid). After a 10 s delay (to allow for the dissociation of the protein from the capturing antibody), the eluates were dispensed directly onto a 96-well formatted MALDI target. After the samples were dried, linear-mode mass spectra were acquired from each sample spot, each consisting of ten thousand laser shots, using an Autoflex III MALDI-TOF mass spectrometer (Bruker, Billerica, MA). For accurate mass assignment, the mass spectra were externally calibrated with protein standard I (Cat. No. 8206355, Bruker, Billerica, MA) and also with the internal reference standard (mass accuracy up to 0.001 Da).
In addition, the mass spectra were baseline subtracted (Tophat algorithm) and smoothed (Savitzky-Golay algorithm; width = 0.2 m/z; cycles = 1) before peak integration with Flex Analysis 3.0 software (Bruker Daltonics). Peak areas for all CysC signals and BL were measured in Zebra software (Beavis Informatics, Ltd.). The hydroxylated proteoforms were not baseline-resolved from their corresponding proteoforms and were therefore co-integrated with their originating proteoforms.

Quantitative MSIA analysis of CysC proteoforms
Quantification of CysC was done as previously described [17]. In short, a standard curve was generated using the corresponding protein standard (CysC) and the internal reference standard (BL). Separate standard curves were created with each run by plotting the ratio of the peak areas of the CysC standard signal and the BL signal (CysC/BL) against the CysC standard concentration (c(CysC)). The linear equations obtained were used to calculate the concentrations of native CysC and CysC proteoforms in the analyzed samples, using the ratio of the peak areas of each proteoform to the IRS. Peak area ratios for each CysC proteoform against BL were summed, and the total CysC concentration was determined using the standard curve equation. Individual CysC proteoform concentrations were calculated from the percent abundance of each proteoform relative to total CysC. The reproducibility of the assay was tested by analyzing a control sample with known CysC concentration in triplicate with each run. This control sample was used to assess the within-run and between-run variability. MSIA can identify a total of 5 CysC proteoforms (Fig. 1). However, because they could not be resolved at baseline level, the hydroxylated CysC proteoforms were integrated with their originating proteoforms. An example of a standard curve, together with the corresponding mass spectra, is presented in Additional file 1: Figure S1. The control sample run in triplicate showed within-run
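The quantification scheme described in this section (linear standard curve of the CysC/BL peak area ratio against concentration, total CysC from the summed proteoform ratios, and individual proteoforms from their percent abundance) can be sketched as below. This is a minimal illustration rather than the authors' software: the peak area ratios are invented, synthetic values, the function names are hypothetical, and the optional dilution_factor argument is an assumption, since the text does not say how the 5-fold plasma pre-dilution is accounted for.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantify(ratios, slope, intercept, dilution_factor=1.0):
    """Total and per-proteoform concentrations from proteoform/BL ratios.

    ratios: dict mapping proteoform name -> peak area ratio to BL.
    dilution_factor: assumed correction for sample pre-dilution.
    """
    total_ratio = sum(ratios.values())
    total_conc = (total_ratio - intercept) / slope * dilution_factor
    per_form = {name: total_conc * r / total_ratio
                for name, r in ratios.items()}
    return total_conc, per_form


# Standard concentrations (mg/L) from the text; ratios are synthetic.
std_conc = [1.25, 0.625, 0.3125, 0.15625, 0.078125, 0.0390625]
std_ratio = [2.0 * c + 0.01 for c in std_conc]  # made-up linear response
slope, intercept = fit_line(std_conc, std_ratio)

# Hypothetical proteoform/BL peak area ratios for one plasma sample.
sample_ratios = {"native": 0.25, "des-S": 0.10, "des-SSP": 0.06}
total, forms = quantify(sample_ratios, slope, intercept, dilution_factor=5.0)

print("total CysC (mg/L):", round(total, 3))
for name, conc in forms.items():
    print(name, round(conc, 3))
```

Computing the total first and then apportioning it by relative abundance means the intercept of the standard curve is applied once to the summed signal rather than once per proteoform.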

Statistical analysis
Mean (SD) and median (25th and 75th percentiles) were used to describe normally and non-normally distributed continuous variables, respectively. The control, DM and CKD groups were compared using one-way ANOVA (normally distributed variables) or the Kruskal-Wallis test (non-normally distributed variables). Categorical variables were compared using the chi-square test. Linear regression was used to analyze the association of GFR with CysC proteoforms. All regression models were adjusted for age, sex, and BMI as covariates. Partial correlation was used to test the independent correlation of native CysC, CysC des-SSP and CysC des-S with eGFR. All statistical analyses were performed using the SAS version 9.4 software package.
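To make the partial correlation step concrete, the sketch below computes a first-order partial correlation, i.e. the correlation between two variables after removing the linear effect of a single third variable. The text does not specify which variables were controlled for in each analysis, so the choice of one control variable, the function names, and the data values are all assumptions made for illustration; the analyses in the study were run in SAS.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys controlling for a single variable zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical values for six participants (not study data).
egfr = [95.0, 88.0, 72.0, 60.0, 45.0, 30.0]
des_ssp = [0.10, 0.12, 0.11, 0.16, 0.20, 0.26]
native = [0.50, 0.55, 0.60, 0.62, 0.80, 0.95]

# Association of a truncated proteoform with eGFR, holding native CysC fixed.
r_partial = partial_corr(des_ssp, egfr, native)
print(round(r_partial, 3))
```

When the control variable is uncorrelated with both variables of interest, the partial correlation reduces to the ordinary Pearson correlation; it differs only to the extent that the control variable explains shared variance.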
For Cystatin C proteoforms, the alpha level was set at 0.02 adjusting for multiple comparisons. For the other variables, the alpha level was set at 0.05.

Results
The demographic and clinical characteristics of individuals in the control, DM and CKD groups are summarized in Table 1. Participants in the CKD group were significantly older than both the control and DM groups (p < 0.001). The DM and CKD groups had elevated BMI (p < 0.001), HbA1C (p < 0.001) and CRP (p < 0.001), as well as decreased levels of HDL cholesterol (p < 0.01) compared to controls. As expected, urine microalbumin and serum creatinine were significantly greater, and eGFR was lower in the CKD group compared to the two other groups (p < 0.001).
Using mass spectrometric immunoassay, we were able to identify several CysC proteoforms (Table 2). An example of a mass spectrum of CysC from a plasma sample of a control participant is shown in Fig. 1. The hydroxylated variants were co-integrated with their originating proteoforms and were therefore not individually correlated with the clinical parameters. Native CysC (representing the sum of native and 3Pro-OH CysC), CysC-des-S (sum of des-S CysC and des-S 3Pro-OH CysC), and CysC-des-SSP concentrations were greater in participants with CKD compared to participants without CKD (38-77 % greater CysC proteoform concentrations, p < 0.01). Greater increases in truncated CysC were observed across the different CKD stages compared to native CysC concentrations (Fig. 2). The distribution of CysC proteoforms across the three groups is summarized in Table 3. Native CysC concentrations were 1.4 fold greater in CKD compared to the DM group (p = 0.02) and 1.5 fold greater in CKD compared to the control group (p = 0.001). CysC des-S concentrations were 1.55 fold greater in CKD compared to the DM group (p = 0.002) and 1.9 fold greater in CKD compared to the control group (p = 0.0002). CysC des-SSP concentrations were 1.8 fold greater in CKD compared to the DM group (p = 0.008) and 1.52 fold greater in CKD compared to the control group (p = 0.002).
To understand the relation of CysC proteoforms with CKD, the concentrations of these proteoforms were correlated with eGFR. Increases in CysC proteoforms were negatively correlated with eGFR. This linear association between CysC proteoform concentrations and eGFR was significant after adjusting for age, sex and BMI, as summarized in Table 4. However, subgroup analysis indicated that this association of CysC proteoforms and eGFR was driven by the CKD group.
The concentrations of truncated proteoforms of CysC had a stronger correlation with eGFR than the native CysC proteoform. The regression coefficients describing the association between CysC proteoforms and eGFR were larger for the truncated CysC proteoforms than for the native CysC proteoform. The truncated CysC proteoforms correlated with eGFR independent of native CysC.
The partial correlation of native CysC and CysC truncations with eGFR demonstrated a significant inverse correlation between CysC des-SSP and CysC des-S concentrations and eGFR (r = −0.28, p = 0.003, and r = −0.24, p = 0.01, respectively). The partial correlation of native CysC with eGFR was not significant (r = 0.15, p = 0.12). Therefore, truncated CysC proteoforms were associated with eGFR independent of native CysC, highlighting the importance of measuring the truncated proteoforms in CKD patients.
To examine the relation of CysC truncations with albuminuria, the distribution of CysC proteoforms was analyzed in the combined sample (n = 104) in which urine microalbumin was measured. The three categories of albuminuria were defined as no albuminuria (normoalbuminuria; urine microalbumin < 30 mcg/mg creatinine), microalbuminuria (urine microalbumin 30-300 mcg/mg creatinine) and macroalbuminuria (urine microalbumin > 300 mcg/mg creatinine). Eight participants had macroalbuminuria (clinical albuminuria), 18 participants had microalbuminuria, and 78 participants presented with no evidence of albuminuria. Participants with albuminuria had greater amounts of plasma CysC proteoforms (p < 0.05) compared to participants with normoalbuminuria (Table 5). CysC des-SSP concentrations in the macroalbuminuria group were 1.57 fold greater compared to the normoalbuminuria group (p = 0.02). CysC des-S concentrations in the macroalbuminuria group were 1.56 fold greater compared to the normoalbuminuria group (p = 0.05).
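The three albuminuria categories can be expressed as a simple classification rule. The sketch below uses the cut-offs stated above (30 and 300 mcg/mg creatinine, with the 30-300 range treated as inclusive); the function name and the example values are hypothetical.

```python
def albuminuria_category(microalbumin_mcg_per_mg):
    """Classify urine microalbumin (mcg per mg creatinine)."""
    if microalbumin_mcg_per_mg < 30:
        return "normoalbuminuria"
    if microalbumin_mcg_per_mg <= 300:
        return "microalbuminuria"
    return "macroalbuminuria"


# Hypothetical measurements, one per category.
examples = [12.0, 85.0, 450.0]
categories = [albuminuria_category(v) for v in examples]
print(categories)
```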

Discussion
Our findings demonstrate that CysC proteoform concentrations are greater in participants in the CKD + DM group compared to the DM and control groups. Native CysC and CysC truncations (CysC des-SSP and CysC des-S) were inversely correlated with eGFR, and this persisted after adjusting for age, sex and BMI. CysC proteoform concentrations were also significantly greater in the setting of albuminuria. The association with eGFR was stronger for the truncated proteoforms than for native CysC and was only significant in participants with both CKD and DM. These findings highlight the importance of measuring truncated CysC proteoforms.
Total CysC was shown previously to be a specific predictor of eGFR [8,10]. The lack of correlation of CysC proteoforms with eGFR in the control and DM groups was noteworthy. The MDRD Study equation to estimate GFR is best in the lower ranges of GFR [18]. GFR estimates from the MDRD Study equation greater than 60 mL/min/1.73 m² underestimate measured GFR, and may lead to misdiagnosis and misclassification of CKD in individuals with mild renal insufficiency [19,20]. This limitation highlights the need for new CKD biomarkers to capture early disease risk.
The mechanism of CysC truncations is not well understood. The N-terminal residues of CysC were found to be important in determining the specificity of CysC for cysteine proteinases [21]. It is known that leucocyte elastase cleaves the Val10-Gly11 bond of cystatin C [22], and cathepsin L cleaves the Gly11-Gly12 bond [23]. The NMR structure of CysC shows that the N-terminal fragment is unstructured and highly mobile [24]. The N-terminus also contains the binding site for the cysteine proteinases, and when it is removed through truncation, the inhibition of proteinases is drastically reduced due to the decreased affinity [21]. Therefore, it is likely that the truncated des-S and des-SSP CysC proteoforms exhibit reduced biological activity. Our data suggest that CysC truncations can be markers of disease severity and could be used as more specific measures of CKD than native CysC. Therefore, evaluating changes in CysC truncations can be a useful indicator of CKD progression.
The strength of this study lies in the simple MS-based quantitative proteomics approach to assess CysC proteoforms, which is both high throughput and accurate. There are some limitations to this study. MSIA was not able to resolve signals from hydroxylated CysC (3Pro-OH) from native CysC at baseline due to the small mass difference among these proteoforms. A previous study demonstrated that CysC and truncation levels can fluctuate daily by as much as 2 fold within the normal range [13], which could bias our findings given our limited sample size. Another limitation is the small sample size of the subgroups with advanced CKD, stages 4 and 5. Our findings warrant an investigation of Cystatin C proteoforms in larger CKD studies with measured GFR to discriminate the capacity of these proteoforms versus eGFR in predicting GFR decline.

Conclusions
We conclude that CysC proteoforms are increased in patients with DM and CKD. Truncated proteoforms, independent of native CysC, associate with eGFR. Future studies are needed to further investigate the relation between CysC proteoforms in early renal disease and for their utility as biomarkers of CKD progression.