Eleven neurology-related proteins measured in serum are positively correlated to the severity of diabetic neuropathy

About 20% of patients with diabetes suffer from chronic pain with neuropathic characteristics. We investigated the multivariate associations between 92 neurology-related proteins measured in serum from 190 patients with painful and painless diabetic neuropathy. Participants were recruited from the Pain in Neuropathy Study, an observational cross-sectional multicentre study in which participants underwent deep phenotyping. In the exploration cohort, two groups were defined by hierarchical cluster analyses of protein data. The proportion of painless vs painful neuropathy did not differ between the two groups, but one group had a significantly higher grade of neuropathy as measured by the Toronto Clinical Scoring System (TCSS). This finding was replicated in the replication cohort. Analyzing both groups together, we found that a group of 11 inter-correlated proteins (TNFRSF12A, SCARB2, N2DL-2, SKR3, EFNA4, LAYN, CLM-1, CD38, UNC5C, GFR-alpha-1, and JAM-B) were positively associated with TCSS values. Notably, EFNA4 and UNC5C are known to be part of axon guidance pathways. To conclude, although cluster analysis of 92 neurology-related proteins did not distinguish painful from painless diabetic neuropathy, we identified 11 proteins which positively correlated to neuropathy severity and warrant further investigation as potential biomarkers.


Patients and clinical data
PiNS is an observational cross-sectional multicentre study in which participants underwent deep phenotyping that included neuropathy screening tools, extensive symptom and function questionnaires, neurological examination, nerve conduction studies, quantitative sensory testing, and skin biopsy for intraepidermal nerve fibre density assessment (IENFD) in a sub-set of patients 8 .Patients with diabetes mellitus aged above 18 years with diagnosed DSP, or patients with symptoms and signs suggestive of DSP were included.Exclusion criteria were pregnancy, coincident major psychiatric disorders, poor or no English language skills, severe pain at recruitment from a cause other than DSP (to prevent potential confounding influence on pain reporting as well as psychological and quality-of-life reported outcomes), patients with documented central nervous system lesions, or patients with insufficient mental capacity to provide informed consent or to complete questionnaires.Many of the study participants were recruited from primary care practices in London and Oxford.Study participants were also recruited from diabetes and other clinics at Chelsea and Westminster Hospital NHS Foundation Trust (London), Sheffield Teaching Hospitals and Oxford University Teaching Hospitals, neurology clinics at King's College Hospital (London), and through advertisements.
Participants were included consecutively in PiNS until target number was reached.In PiNS, participants are dichotomized in painless and painful diabetic DSP.The methods and questionnaires have been previously described in detail 8 .Participants included in the present biomarker study were those where serum, and neuropathic pain grading according to IASP/NeuPSIG, were available 21 .We applied the NeupSIG grading system for neuropathic pain to pain in the feet as being the plausible anatomical distribution when separating those with painful versus painless diabetic neuropathy.In the present study, the following clinical variables were available for both painless and painful neuropathy patients: • Age and sex • Data pertaining to diabetes and metabolic control (Body Mass Index (BMI; kg/m 2 ), diabetes type 1 or type 2, HbA1c) • Data related to neuropathy-the Toronto Clinical Scoring System (TCSS) correlates with diabetic neuropathy severity 22 .Based on TCSS, patients can be classified as having no DSP (TCSS 0-5), mild DSP (TCSS 6-8), moderate DSP (TCSS 9-11), or severe DSP (TCSS 12-19) 23 .• Douleur Neuropathique en 4 Questions (DN4) which can be used as a screening tool for neuropathic pain 24 .
Clinical variables available in patients with painful neuropathy were Brief Pain Inventory (BPI) severity scores 25 , Neuropathic Pain Symptoms Inventory (NPSI) 26 , and PainDetect 27 .

Protein data
A 10 ml blood sample (BD Vacutainer SST Tubes) was drawn from each participant.After 30 min, to allow blood to clot, the sample was centrifuged at 3000 rpm for 10 min at a temperature of 4 °C.Serum was then aliquoted into 1.8 ml Nunc CryoTubes and stored at -80 °C.
The Olink Neurology panel (product number 95801, v. 8012) provides a high-throughput, multiplex immunoassay enabling the analysis of 92 neurology-related protein biomarkers at the same time using a Proximity Extension Assay (PEA) technology 28,29,30,31 .PEA means that a pair of oligonucleotide-labelled antibodies bind to their respective target protein.When the two antibodies are close to each other, a polymerase chain reaction (PCR) is initiated which is then quantified by real time PCR.Results are expressed as Normalized Protein eXpression (NPX), which is relative quantification between samples, on a Log2 scale.A high NPX value equals a high protein concentration.Because NPX is a Log2 scale, a difference of 1 in NPX means a doubling of protein concentration.If needed, NPX values can be converted into a linear scale according to 2 NPX = linear NPX.A complete list of the 92 neurology-related proteins, including their UniProt ID, is found in Supplemental Digital Content 1 (see Supplementary Information 1).The URL leading to validation data provided by Olink is available here: https:// www.olink.com/ conte nt/ uploa ds/ 2021/ 09/ olink-neuro logy-valid ation-data-v2.1.pdf (Access date April 30, 2024).

Statistics
Data are expressed as median (IQR), unless stated otherwise.SIMCA (version 16, Sartorius Stedim Biotech, Umeå, Sweden) was used for multivariate data analysis (MVDA).SPSS (version 26, IBM Corporation, Route 100 Somers, New York, USA) was used for all other analyses (Mann-Whitney U test, Chi-Square test, Spearman's rho for bivariate correlations, and multiple linear regression (MLR), as appropriate).A significance level of 0.05 was chosen.
The same procedures were conducted in the exploration (n = 95) and in the replication (n = 95) cohorts.Details concerning MVDA methodology 32,33 have been described in previous publications 30,[34][35][36][37][38][39] .Briefly, we performed principal component analysis (PCA), hierarchical clustering analysis (HCA) and, based on the groups defined by HCA, orthogonal partial least squares (-discriminant analysis) (OPLS and OPLS-DA).PCA is a technique that models the correlation structure of a dataset, and thereby enables the identification of multivariate outliers 32,33 .Principal components (PC) extract relevant information found in the data, reducing a high-dimensional space (high number of variables) to a few "summary variables".After outlier detection with PCA (strong outliers defined as Hotelling's T2>>T2Crit(99%) and moderate outliers as DModX>2*DCrit), we applied a bottom-up HCA to the principal component score vectors using the default Ward linkage criterion to identify relevant subgroups of patients.HCA complements PCA in the sense that while PCA identifies distinct clusters in multivariate space, HCA can find subtle clusters.In the resulting dendrogram, interesting patient subgroups were identified, and clinical data were compared between subgroups to ascertain the clinical relevance of the subgroups.Then, OPLS-DA was performed using group belonging as Y-variables and protein data as predictors (X-variables).To identify the proteins most relevant for group discrimination, the OPLS-DA models analyzed and identified associations between the X-variables and group belonging.X-variables with |p(corr)|≥ 0.5 are usually considered important for group discrimination 32 , the sign of p(corr) denoting the direction of the association (described in text in each case).However, in some cases, a tougher cut-off of |p(corr)|≥ 0.6 was used instead in this study.P(corr) is the loading of each X-variable scaled as a correlation coefficient that is comparable between models.MVDA analyzes all variables simultaneously, using the overall correlation pattern present in the data, hence separating information from "noise".Hence, the protein data in the present study were not primarily analyzed by multiple univariate testing, thereby minimizing the multiple testing problem.In a third step, both cohorts were analyzed together in an OPLS model with TCSS as outcome (Y) variable-see text below for rationale.For this third step, a false discovery rate (FDR) at the 10% level was applied using the Benjamini-Hochberg procedure 40 .

Ethics
The study was approved by the National Research Ethics Service of the United Kingdom (No.:10/H07056/35).All study participants signed written consent before participating.The research was conducted in accordance with the Declaration of Helsinki.

Results
Two proteins out of 92 had > 20% missing values and were therefore excluded from all analyses.All results are based on the remaining 90 proteins.An overview of clinical data in painless vs. painful patients is presented in Table 1.

First phase of protein analyses: exploration cohort
The exploration cohort consisted of 95 patients.Two patients (ID 30483 and ID 30519) were excluded because of quality warning from Olink Bioscience (Uppsala, Sweden).On the remaining 93 patients, a PCA was done using the 90 proteins as X-variables (4 PCs, R 2 = 0.52, Q 2 = 0.38); no outlier was found.By HCA, 2 groups were defined (Group 1, n = 37 and Group 2, n = 56).Then, the clinical variables were compared between the two groups, i.e., we wanted to see if they seemed clinically meaningful.In the exploration cohort (Table 2, left side), TCSS was significantly higher in Group 2 (11 (8-15) vs. 8.5 (6-12), p = 0.027).The other clinical variables did not differ between groups in the exploration cohort.Finally, an OPLS-DA was done with group belonging (Group 1 vs 2) as Y-variable; the model had 2 latent variables (one predictive and one orthogonal component), R 2 = 0.69, Q 2 = 0.57, and p < 0.001 by CV-ANOVA.The proteins most responsible for group discrimination, i.e., with |p(corr)|≥ 0.6 for the first (predictive) latent variable, are listed in Table 3 , left column.

Second phase of protein analyses: replication cohort
After an initial PCA on the 95 patients of the replication cohort using the 90 proteins as X-variables, one patient (ID 30087) was excluded for being a multivariate outlier by Hotelling's T2.Hence, the new PCA model had 94 patients, with 4 PCs, R 2 = 0.53, Q 2 = 0.40.By HCA, 2 groups were defined (Group 1, n = 31 and Group 2, n = 63).Then, we tested the hypothesis that TCSS scores would differ between Group 1 and 2. We found that, just as in the exploration cohort, Group 2 had significantly higher TCSS scores: (12.5 (9-14) vs. 10 (8.5-12), p = 0.016) (Table 2, right side).Age was also significant (p = 0.003) (Table 2).Hence, the findings concerning TCSS were replicated.Finally, an OPLS-DA was done with group belonging (Group 1 vs 2) as Y-variable; the model had 1 latent variable (i.e., one predictive component), R 2 = 0.56, Q 2 = 0.53, and p < 0.001 by CV-ANOVA.The proteins most responsible for group discrimination, i.e., with |p(corr)|≥ 0.6, are listed in Table 3  www.nature.com/scientificreports/ of the top ten proteins of the exploration cohort were top 10 in the replication cohort (see text in red in Table 3) and 85% of the top 20 proteins of the exploration cohort were top 20 in the replication cohort (see text in blue in Table 3).

Third phase: in-depth analysis of TCSS in all patients
In third phase of the study, we focused on TCSS, using both cohorts taken together (n = 186; TCSS was missing in one patient).Using TCSS as Y-variable and the 90 proteins as X-variables, we computed an OPLS model which had 2 latent variables (one predictive and one orthogonal), R 2 = 0.31, Q 2 = 0.10, p < 0.001 by CV-ANOVA.The proteins most associated with TCSS are tabulated in Table 4; the overlap with the two OPLS-DA models in Table 3 is also illustrated.Because of a possible age issue (see Table 2), we next computed a new OPLS model, with age as Y variable to be able to exclude age-related proteins from the results of Table 4.This age model had 3 latent variables (one predictive and two orthogonal), n = 186, R 2 = 0.64, Q 2 = 0.45, p < 0.001 by CV-ANOVA.We found 4 proteins with |p(corr)|≥ 0.4 (see note in Table 4 for choice of p(corr) level), indicating a possible association with age, and EDA2R, the top protein in Table 4 , had the highest p(corr) in the age-model (positive correlation with age and p(corr) = 0.5).Therefore, EDA2R was excluded from the main results of the study (see below).
Hence, the 11 proteins with p(corr) ≥ 0.5 in Table 4 (i.e., excluding EDA2R as per above) were the main findings of the present study, and all correlated positively with TCSS in multivariate space.Additionally, we also computed bivariate correlations between TCSS and the 11 proteins, finding that the 11 proteins had highly significant correlations with TCSS, with rho ranging from 0.19 to 0.32.These 11 correlations remained significant when applying a false discovery rate (FDR) of 10% (calculated for 90 bivariate correlations between TCSS and each protein).The 11 proteins also all inter-correlated significantly with each other (all p-values < 0.001)  www.nature.com/scientificreports/with rho ranging from 0.51 to 0.88-hence confirming the validity of this cluster of proteins which together correlated positively with TCSS (as per the OPLS model above).The 11 proteins are described briefly together with their Uniprot ID in Table 5.To further descriptively get a sense the effect sizes involved, we calculated the percentage increases of median values of linearized NPX in those having severe DSP compared to those not having DSP (defined as TCSS < 6, n = 18), see Table 5. Rho-values (i.e., correlation with TCSS) are also listed in Table 5.Also, the relationship between the 11 proteins and TCSS is visualized in Supplemental Digital Content 2 (see Supplementary Figures).
Using the 11 proteins as per above, we computed a PCA model, n = 186, 1 PC, R 2 = 0.77, Q 2 = 0.72.We used the scores of the PC of the PCA model as a "summary" variable of the 11 proteins, here called PC1_11prot.We did a multiple linear regression (MLR) with TCSS as outcome variable (dependent variable) and with the following variables as predictors: PC1_11prot, sex, age, BMI, and HbA1c.The MLR model was significant (adjusted R 2 = 0.155 and p < 0.001) and PC1_11prot was significant (p = 0.001 with a positive coefficient, i.e., a positive correlation between PC1_11prot and TCSS) when adjusted for sex, age, BMI and HbA1c.Table 3.Proteins responsible for group discrimination (Group 1 vs 2) in the exploration cohort (left columns) and the replication cohort (right columns) in falling order of p(corr).To illustrate the overlap of results, the top 10 proteins of the exploration cohort are in red in both columns, and the top 11-20 proteins of the exploration cohort are in blue in both columns.The p(corr) values in the table all have a positive sign, indicating higher levels in Group 2 compared to Group 1.All top 20 proteins in the exploration cohort had a p(corr) > 0.68 in the replication cohort.www.nature.com/scientificreports/

Fourth phase: in-depth analysis of BPI scores in patients with painful neuropathy
Finally, we did an in-depth analysis of BPI scores in patients with painful neuropathy.In the exploratory cohort, only "BPI now" differed when comparing Groups 1 and 2, Group 2 (n = 28) having statistically significant higher levels than Group 1 (n = 15): 6,5 (4-8) vs. 3 (1, 5-6), p = 0.019; this remained significant at FDR 10%, the critical value being 0.02.In the replication cohort however, none of the BPI scores differed between Groups 1 and 2. "BPI now" in the replication cohort was 6 (4-7) in Group 1 (n = 14) vs 4 (2-6) in Group 2 (n = 33), p = 0.205 (i.e., a non-significant tendency for "BPI now" to be lower in Group 2).Hence, the findings of "BPI now" in the exploratory cohort could not be replicated.

Discussion
A panel of 92 neurology-related proteins was used to investigate potential biomarkers of painful and painless diabetic DSP in a deeply phenotyped cohort.We found that 11 proteins were associated with the severity of neuropathy (but not with the presence of neuropathic pain).These 11 proteins have a variety of biological functions such as inflammatory processes, growth factors, adhesion molecules and axon guidance (Table 5).Neuropathic pain is known to positively correlate with the severity of peripheral neuropathy 8 .However, given its complex aetiology involving multiple pathophysiological drivers in the central as well as peripheral nervous system 41 , it is not surprising that we may find molecular correlates of neuropathy severity that are independent of neuropathic pain.One biological process that was highlighted in our findings was axon guidance with the identification of EFNA4 and UNC5C.Ephrins, to which EFNA4 belongs, is one of five known families of axon guidance proteins 42 .Axon guidance pathways seem to be involved in diabetic DSP 43 .Interestingly, Evdokimov et al. 44 studied EFNA4 in skin biopsies from fibromyalgia patients vs. controls, finding that the expression of EFNA4 was higher in fibromyalgia patients.Axon guidance proteins are detected by a structure at the tip of growing axons-the growth cone 42 .Different receptors are present on the growth cone, one of them being UNC5C, which is another of the main findings listed in Table 5. UNC5C in turn binds to the netrin family of axon guidance proteins 42 .In diabetic DSP, both nerve degeneration and regeneration are present 45 , and the question therefore arises if EFNA4 and UNC5C can perhaps be seen as potential biomarkers for nerve de-and/or re-generation in this setting?This is of course highly speculative and should be investigated in further studies.It should also be noted that ephrins and Table 4. Top 22 proteins* associated with Toronto Clinical Scoring System (TCSS) in OPLS model.p(corr) values are all positive, corresponding to positive correlations with TCSS.Included in the table is also an illustration of the overlap with the top 22 proteins of the two previous OPLS-DA models according to Table 3 (exploration and replication), see the two columns furthest to the right.*Note: To enable a comprehensive comparison with the exploration and replication cohorts, 0.4 was here chosen as cut-off value for |p(corr)|, rendering a list of 22 proteins listed in falling order of p(corr).However, consistently with the pre-defined cutoff value of 0.5 (see "Methods"), the main findings of the study are the 11 proteins with |p(corr)|of at least 0.5, excluding EDA2R, see text.www.nature.com/scientificreports/netrins have been implicated in central processes related to (neuropathic) pain [46][47][48][49] , and that UNC5C has been investigated in the context of endometriosis-related (and supposedly neuropathic) pain 50 .Moreover, although it was not associated to TCSS as per Table 4, the ephrin EPHB6 was nonetheless a main finding in both the exploration and the replication cohorts as per Table 3.
Given previous finding about chronic inflammation in painful and painless diabetic DSP 10 , our findings about GFR-alpha-1 and CD38 are also interesting (Table 5).CD38 is immunomodulatory 51 and has been deemed to be a possible pharmacological target.In mice, Gil and co-workers studied CD38 in the context of osteoarthritis, concluding that inhibition of CD38 could potentially be a novel therapeutic approach for the treatment of osteoarthritis and associated pain 52 .GFR-alpha-1 is the receptor for GDNF-which in turn is connected to Table 5. Cluster of proteins correlating positively with Toronto Clinical Scoring System (TCSS).Notes: *The percentages in the "increase" column indicate descriptively how much larger the median levels (in linearized normalized protein expression, NPX) were in the severe DSP group (n = 76) compared to no DSP (n = 18).For details about NPX, see "Statistics" section.**This short description is based on information on the Olink website, www.olink.com, accessed 23  www.nature.com/scientificreports/inflammation 53 .GDNF has also been shown to have neuroprotective actions on sensory neurons following traumatic axotomy 54 and in experimental models of diabetic neuropathy 55 .Hence, although the pain literature contains scarce information about the potential biomarkers listed in Table 5, at least five of them can be related to neuropathy/pain in different ways.It should be noted that Neurofilament light chain, a potential biomarker ford diabetic DSP 56 , was not part of the panel of proteins in the present study.
In our opinion, the present study has some obvious strengths.Before any statistical analysis of the data, the material was dichotomized into two cohorts, enabling us to implement an exploration-replication strategy which confirmed that the subgroup characterized by high levels of the proteins listed in Table 3 had higher levels of neuropathy as expressed by TCSS (Table 2).Hence, although this paper is hypothesis-generating in the sense that it used a panel of neurology-related proteins (and thus no specific candidate proteins), there is also an element of confirmatory methodology inherent in the exploration-replication design.Moreover, as in previous work 10 , we used an unbiased clustering approach to subgroup the patient on biological grounds.Hence, instead of merely comparing painful and painless participants, we stratified the material according to a systems biology perspective-and this stratification was then shown to be clinically relevant, albeit not directly pain-wise.The idea behind this approach is that there might be different mechanisms at play in different patients who have the same symptoms and signs, and that a simple comparison based on phenotype might thus blur the picture more than a comparison of clusters based on biology.Whether this "mechanism-based stratification" 19 approach is really advantageous will of course have to be confirmed or falsified in future studies.Concerning the fact that we did not find a pain signal, it is of course important to remember the subjective and biopsychosocial nature of the pain experience.Elucidating the biology underlying the subjective experience is a task as difficult as it is important 15,57,58 .Pain biomarker studies are undertaken with methods from different fields, e.g., imaging methods 59 , electrophysiology 60 , or 'omics 16 .Recent advances concerning the role of calcitonin gene-related peptide (CGRP) in migraine 61 should give pain researchers some confidence that the search for biomarkers reflecting the pathophysiology of different chronic pain conditions is hopefully not a futile task.
Obvious study limitations include the cross-sectional design and the possibility of confounders such as for instance concomitant medication (although our findings seem robust when it comes to the influence of sex, age, BMI and HbA1c).The possibility of there being a systematic error in the material cannot be ruled out.Also, even if our results would turn out to be valid in the sense that they really reflect neuropathy-related pathophysiology, it is still important to consider whether the described "fingerprint" relates directly to the pathophysiology of neuropathy, or if perhaps it is a risk factor that was present prior to the development of neuropathy.A third possibility could be that the fingerprint is an epiphenomenon, perhaps more related to co-morbidities such as insomnia or depression.Disentangling the contribution of potentially mutually interacting factors is a challenge and will require longitudinal studies.Fourth, when measuring multiple analytes in a single experiment, antibody specificity is an important issue to be aware of.The PEA technology 28,29,30,31 builds on dual recognition, i.e., a pair of oligonucleotide-labelled antibodies have to bind to their respective target protein to generate a signal, leading to higher specificity compared to methods based on a single antibody.This fact notwithstanding, the question marks raised by the specificity issue remain a major limitation in the present work.The findings should therefore be interpreted cautiously, warranting further replication studies using alternative methods of detection.
To conclude, in Table 5 we present a list of 11 inter-correlated proteins who were positively correlated to the severity of neuropathy in DSP patients but not to the presence of neuropathic pain.These may have potential as novel biomarkers for diabetic neuropathy which are increasingly important as new understanding of axon degeneration has led to novel drug targets to prevent axon degeneration.The validity and clinical relevance of these putative neuropathy biomarkers will need to be confirmed in future longitudinal studies.