An IL-6-IL-8 score derived from principal component analysis is predictive of adverse outcome in acute myocardial infarction

Highlights
• Principal component analysis (PCA) can create scores from collinear markers.
• This study shows a PCA-derived score can combine cytokines in AMI patients.
• An IL-6-IL-8 score using PCA independently predicts poor outcomes in AMI.

occurs after infarction [2]. However, suboptimal levels of inflammation can have detrimental effects on this repair process [2]. Cytokines are an important subset of inflammatory markers and their elevated levels have been associated with adverse remodelling and outcomes after AMI [3,4].
Traditionally, studies have focussed on the association between single cytokines and major adverse cardiovascular events (MACE) following AMI. However, there has been increasing recognition that inflammation following AMI is a complex process and that investigating multiple cytokines using a combined analytical method may be beneficial [5][6][7][8][9]. The methods used to date have been relatively simple, such as combining a pro- and an anti-inflammatory cytokine into a ratio, or adding a point to a patient's inflammatory score for every cytokine elevated above the median [5][6][7]. Unfortunately, using ratios limits the combination to only two cytokines that are assumed to have opposing actions. These methods also assume that cytokines are independent factors, which inherently places a greater weighting on correlated cytokines with overlapping functions, causing them to be over-represented in the derived score. There is a need to create a combined cytokine score using a mathematical approach that can counter these flaws.
One mathematical approach is principal component analysis (PCA). PCA is a statistical method that reduces complex, multi-dimensional data while retaining maximal variance [10]. It makes no assumptions about independence between variables, which means multi-collinear data can be combined into scores. Other benefits are that PCA can be conducted with small portions of missing data for individual patients and that, by reducing the number of variables, a smaller cohort size can be used to measure outcomes using multivariate analysis in a statistically meaningful way [11]. PCA has been used in studies looking at adverse outcome in conditions such as subarachnoid haemorrhage and trauma, where the authors derived prognostic scores from the principal components and found these to be predictive of outcome [12,13].
The primary aim of this study was to create a PCA-derived score composed of cytokines associated with inflammation in AMI patients, and assess its association with major adverse cardiovascular events (MACE).

Study population
All patients that were included in this observational cohort study were selected from an existing AMI biobank. Patients were recruited into this biobank if they were admitted to Wellington Regional Hospital, New Zealand, between January 2012 and April 2018 with a diagnosis of spontaneous (type 1) AMI and were planned to be treated with an invasive approach (angiography with or without revascularisation). AMI was defined according to the third universal definition [14]. Exclusion criteria included types 2-5 myocardial infarction (MI), chronic inflammatory disorders, malignancy, and immune-modulating medications. All patients gave voluntary written consent for their participation in this biobank and the study was approved by the New Zealand Lower South Ethics Committee (LRS/11/09/035) and the New Zealand Central Health and Disability Ethics Committee (16/CEN/68).
From this population, patients were selected for this study if they had not been treated with a thrombolytic agent, had no renal insufficiency and had plasma samples collected between 48 and 72 h after onset of their ischaemic symptoms.
The primary endpoint was defined as MACE at one year post index admission with AMI. MACE was defined hierarchically as a composite of all-cause death, non-fatal MI, non-fatal ischaemic cerebrovascular accident, stent thrombosis, chronic heart failure (CHF) leading to hospital admission, and unplanned revascularisation. Stent thrombosis included either probable or definite stent thrombosis as defined according to the Academic Research Consortium criteria [15].

Data collection and sampling methods
Baseline demographics and clinical information were prospectively collected using hospital records and the cardiac catheterisation database. Follow-up was conducted by research nurses at 12 months through telephone calls, hospital electronic records, and where necessary, contacting participants' family doctors.
Between 48 and 72 h from symptom onset, whole blood samples were collected in sodium citrate tubes (0.109 M BD Vacutainer, New Jersey, USA) from the peripheral vein prior to cardiac catheterisation or from the arterial sheath immediately after insertion and before administration of heparin during coronary angiography. The tubes were centrifuged within 60 min of collection at 1500 g for 12 min at room temperature. After centrifugation, the separated plasma was stored at −80 °C until analysis.

Statistical analysis
For the CBA plates, data collected from the flow cytometer were analysed using FlowJo® (FlowJo, LLC; OR, USA) and FCAP Array Software v3.0 (BD Biosciences; CA, USA) to obtain a mean fluorescence intensity (MFI). The concentration of each cytokine was calculated using a 5PL fitted standard curve. Sample values below the blank were defined as "undetectable" and given a concentration of 0 pg/mL. Samples that had MFIs or absorbances above the blank but below the lowest standard (i.e. undetermined concentrations when fitted with the 5PL curve) were extrapolated using linear regression down to zero. Samples with MFIs for RANTES above the upper limit of detection were all given one concentration that was greater than the highest detected concentration, and this was arbitrarily chosen as 16,000 pg/mL. Unless specified, analysis was conducted using GraphPad Prism v.7 (GraphPad Software Inc.; CA, USA).
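The 5PL standard curve mentioned above can be sketched as follows. This is a minimal illustration assuming the common five-parameter logistic parameterisation (asymptotes `a` and `d`, mid-range concentration `c`, slope `b`, asymmetry `g`). The function names and parameter values are hypothetical and are not taken from FCAP Array or from the study data.

```python
def five_pl(x, a, b, c, d, g):
    """Five-parameter logistic response (e.g. MFI) at concentration x."""
    return d + (a - d) / (1.0 + (x / c) ** b) ** g


def inverse_five_pl(y, a, b, c, d, g):
    """Concentration whose fitted response equals y.

    Only valid for responses strictly between the two asymptotes a and d.
    """
    ratio = (a - d) / (y - d)
    return c * (ratio ** (1.0 / g) - 1.0) ** (1.0 / b)


# Hypothetical curve parameters chosen for illustration only.
params = dict(a=50.0, b=1.2, c=400.0, d=30000.0, g=0.8)

mfi = five_pl(100.0, **params)        # response at 100 pg/mL
back = inverse_five_pl(mfi, **params) # recovers the concentration
```

Responses outside the asymptotes cannot be inverted, which is why values below the blank or above the upper limit of detection need a separate rule such as the ones described above.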
Descriptive statistics and tests of normality (Shapiro-Wilk and D'Agostino and Pearson) for the baseline demographics and cytokine concentrations were calculated in GraphPad Prism v.7. Cytokines for which over half of the patients had detectable concentrations were used to create a combined cytokine score by adding the scores of the significant components derived from principal component analysis (PCA). Parallel analysis was used to determine the number of significant components. Regression component scores were generated without rotation, and the component matrix was used to identify which cytokines contributed significantly to the loading of each component score (correlation > 0.5). Finally, the component scores were added to create the cytokine score. All PCA-related statistics except for parallel analysis were conducted using SPSS v.24 (IBM; Armonk, NY, USA). An online engine was used for parallel analysis [16].
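The scoring procedure described above can be sketched in code. This is a minimal illustration, not the SPSS or online-engine implementation used in the study. It assumes complete data, a PCA on the correlation matrix, parallel analysis against the 95th percentile of eigenvalues from random normal data, and a sign convention (loadings of each component sum to a positive value) that is our own choice, since eigenvector signs are arbitrary. The function name and the simulated data are hypothetical.

```python
import numpy as np


def summed_component_score(X, n_sim=200, percentile=95, seed=0):
    """Return (summed score, loadings, eigenvalues, k) for a complete
    (n_patients, n_markers) array X."""
    rng = np.random.default_rng(seed)
    n, p = X.shape

    # Standardise each marker, then eigendecompose the correlation matrix.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Assumed sign convention: orient each component so its loadings sum > 0.
    signs = np.sign(eigvecs.sum(axis=0))
    signs[signs == 0] = 1.0
    eigvecs = eigvecs * signs

    # Parallel analysis: keep leading components whose eigenvalues exceed
    # those obtained from random normal data of the same shape.
    sim = np.empty((n_sim, p))
    for i in range(n_sim):
        noise = rng.standard_normal((n, p))
        sim[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    threshold = np.percentile(sim, percentile, axis=0)
    k = 0
    while k < p and eigvals[k] > threshold[k]:
        k += 1

    # Unrotated, unit-variance component scores and their loadings.
    scores = Z @ eigvecs[:, :k] / np.sqrt(eigvals[:k])
    loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])
    return scores.sum(axis=1), loadings, eigvals, k


# Simulated example: six markers driven by two latent factors.
rng = np.random.default_rng(1)
latent = rng.standard_normal((300, 2))
X = np.hstack([latent[:, [0]] + 0.5 * rng.standard_normal((300, 3)),
               latent[:, [1]] + 0.5 * rng.standard_normal((300, 3))])
total, loadings, eigvals, k = summed_component_score(X)
```

For an unrotated PCA, regression-method scores reduce to the standardised data projected onto each eigenvector and divided by the square root of its eigenvalue, which is what the sketch computes. Loadings with absolute value above 0.5 can then be read from `loadings`.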
For univariate analysis, the Mann-Whitney U test was used to compare the cytokine score, individual cytokine concentrations and continuous risk factors between patients with and without MACE, while the Chi-square test was used for categorical risk factors. Any individual cytokines whose concentrations differed significantly between patients with and without MACE were then combined into a subset score using principal component analysis, and a Mann-Whitney U test was also conducted on this subset score. Receiver operating characteristic (ROC) curves were generated and multinomial logistic regression was used for any PCA-derived scores that were statistically significant on univariate analysis. SPSS v.24 was used to complete these analyses.
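The Mann-Whitney U test and the ROC analysis are closely linked: the area under the ROC curve equals U divided by the number of case-control pairs. The sketch below illustrates this identity with a direct pairwise count. The function name and the small data set are hypothetical and are not study data.

```python
def mann_whitney_auc(cases, controls):
    """U statistic for cases versus controls and the equivalent ROC AUC.

    Each case-control pair scores 1 if the case value is higher,
    0.5 if tied and 0 otherwise.
    """
    u = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u, u / (len(cases) * len(controls))


# Hypothetical scores for patients with and without an event.
with_event = [0.9, 0.4, 1.3]
without_event = [0.1, 0.4, -0.2, 0.7]
u_stat, auc = mann_whitney_auc(with_event, without_event)
```

In this toy example 10.5 of the 12 pairs favour the event group, giving an AUC of 0.875.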
Finally, for a post-hoc power calculation, a Student's t-test was used to compare the means and standard deviations of the cytokine score in patients who did and did not develop MACE (G*Power v.3.0.10, University of Düsseldorf; Germany). This calculation showed that the cohort had 62.6% power to detect a significant difference in mean cytokine scores between patients who did and did not develop MACE (α = 0.05).

Baseline demographics
From a population of 1994 patients in the biobank, 320 patients met the study criteria and were included in the cohort. Reasons for exclusion included blood samples not collected within 48-72 h from symptom onset (1581, 94.4%), treatment with thrombolysis (33, 1.97%), renal insufficiency (26, 1.55%), subsequent revision to an alternative diagnosis from AMI (18, 1.08%), and other factors such as no plasma samples collected and undetermined time of symptom onset (16, 0.96%). A further three patients were excluded due to loss to follow-up, resulting in a final cohort of 317 patients. Table 1 presents the baseline demographics of the cohort. Of the 317 patients, 76.0% were male, 83.0% were European and 78.9% were diagnosed with NSTEMI. The mean age was 62.5 years. From this cohort, 41 (12.9%) patients developed MACE within one year of follow-up, comprising four deaths, 16 AMIs, seven ischaemic strokes, one stent thrombosis, eight CHFs and five unplanned revascularisations. We observed that patients who developed MACE were older (66.6 versus 61.9 years old, p-value = 0.022), more likely to present with STEMI (36.6% versus 18.8%, p-value = 0.014) and more likely to have a previous history of stroke or TIA (14.6% versus 4.70%, p-value = 0.024).

Descriptive statistics of individual cytokine concentrations
The percentage of patients with detectable levels of each of the 13 cytokines ranged from 0% to 99.4%. Cytokines with less than 50% of the patients having detectable levels (IFNγ, IL-4, IL-17A, GM-CSF, TGF-β1, TNFα, and VEGF) were excluded from further analyses. The concentrations, medians and interquartile ranges (IQRs) for the remaining six cytokines (IL-1β, IL-6, IL-8, IL-10, MCP-1, and RANTES) are plotted in Fig. 1. The IQRs were large, demonstrating considerable variation in concentrations. These values can be found in Supplementary Table 2.

Principal component analysis
Bartlett's test of sphericity was statistically significant (p < 0.001), indicating that the cytokines were sufficiently correlated for PCA to be useful [17]. Parallel analysis revealed that only the first two components, with eigenvalues of 1.562 and 1.156, were significant. These two components accounted for 45.3% of the variance in the data and the scores of the two components were added to create the cytokine score (Table 2).
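As a check on the arithmetic, and assuming the PCA was run on the correlation matrix of the six retained cytokines (so that the total variance equals the number of variables, p = 6), the proportion of variance explained follows directly from the reported eigenvalues:

```latex
\[
\frac{\lambda_1 + \lambda_2}{p}
  = \frac{1.562 + 1.156}{6}
  = \frac{2.718}{6}
  \approx 0.453
\]
```

This agrees with the reported 45.3%.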

Predictors of MACE - Univariate analysis
Figs. 2 and 3 display the medians and IQRs for the individual cytokines and PCA-derived cytokine scores in patients who did and did not develop MACE at one year. Supplementary Table 3 presents the information in Figs. 2 and 3. As IL-6 and IL-8 were the only cytokines that were statistically significantly different between the two groups (p = 0.006 and p = 0.004, respectively), PCA was used to create an IL-6-IL-8 score. Bartlett's test of sphericity was statistically significant (p < 0.001). Only one component was significant, with an eigenvalue of 1.347. This component accounted for 67.4% of the total variance of the two cytokines, and its score was used as the IL-6-IL-8 score. The medians, IQRs and p-value from the Mann-Whitney U test were added to Supplementary Table 3. Alongside IL-6 and IL-8 levels, both the cytokine score and the IL-6-IL-8 score were statistically associated with MACE (p < 0.05).
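With only two markers, a PCA on the correlation matrix has a closed form: the eigenvalues are 1 + |r| and 1 − |r|, where r is the correlation between the two standardised markers. Under that assumption, the reported eigenvalue of 1.347 would imply a correlation of about 0.347 between IL-6 and IL-8, and 1.347 / 2 gives the reported 67.4% of variance. The sketch below illustrates the closed form on simulated data. The function name and data are hypothetical.

```python
import numpy as np


def two_marker_score(x1, x2):
    """First principal component score of two markers (correlation-matrix PCA).

    Returns the unit-variance score and the first eigenvalue, 1 + |r|.
    Not meaningful if the two markers are exactly uncorrelated.
    """
    z1 = (x1 - x1.mean()) / x1.std(ddof=1)
    z2 = (x2 - x2.mean()) / x2.std(ddof=1)
    r = np.corrcoef(z1, z2)[0, 1]
    eigenvalue = 1.0 + abs(r)
    # The first eigenvector weights both markers equally (up to sign).
    score = (z1 + np.sign(r) * z2) / np.sqrt(2.0 * eigenvalue)
    return score, eigenvalue


# Simulated, positively correlated markers (not study data).
rng = np.random.default_rng(0)
a = rng.standard_normal(200)
b = 0.4 * a + rng.standard_normal(200)
score, eigenvalue = two_marker_score(a, b)
```

In this two-marker case the score is simply a rescaled sum of the two standardised markers, so both contribute equally.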
To evaluate the prognostic ability of the various cytokines and PCA-derived scores for MACE, ROC curves were generated. The cytokine score and IL-6-IL-8 score were found to be moderate predictors of MACE, with significant areas under the curve (AUCs) of 0.606 (p = 0.048) and 0.652 (p = 0.002), respectively. The ROC AUCs for all cytokines and scores are shown in Table 3, and the optimal cut-off value, sensitivity and specificity of significant ROC AUCs are also included in the table. Only ROC curves of cytokines or scores with significant AUCs are shown in Fig. 4.
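One way to pick a cut-off from a ROC curve is sketched below. It is an assumption on our part that the cut-off maximises Youden's J (sensitivity + specificity − 1) with ties broken in favour of sensitivity. The text does not state the exact criterion used, and the function names and data are hypothetical.

```python
def sens_spec(scores, events, cutoff):
    """Sensitivity and specificity when 'score > cutoff' is called positive."""
    tp = sum(1 for s, e in zip(scores, events) if e and s > cutoff)
    fn = sum(1 for s, e in zip(scores, events) if e and s <= cutoff)
    tn = sum(1 for s, e in zip(scores, events) if not e and s <= cutoff)
    fp = sum(1 for s, e in zip(scores, events) if not e and s > cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, events):
    """Cut-off maximising Youden's J, preferring higher sensitivity on ties."""
    best = None
    for c in sorted(set(scores)):
        sens, spec = sens_spec(scores, events, c)
        key = (sens + spec - 1.0, sens)
        if best is None or key > best[0]:
            best = (key, c, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical scores and outcomes (1 = event), not study data.
scores = [-1.2, -0.5, -0.3, 0.1, 0.4, 0.9, 1.5]
events = [0, 0, 1, 0, 1, 1, 1]
cutoff, sensitivity, specificity = best_cutoff(scores, events)
```

Dichotomising a score at such a cut-off is what allows it to enter a regression model as a binary predictor.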

Predictors of MACE - Multivariate analysis
As all individual cytokines were used to derive the cytokine score, and IL-6 and IL-8 individually were used to derive the IL-6-IL-8 score, these could not be combined into one multinomial logistic regression model. Therefore, cytokine scores above 0.0802, age and clinical presentation (the baseline characteristics that were statistically significant on univariate analysis) were combined into one multivariate model (Model 1); IL-6-IL-8 scores above −0.141 and clinical factors were combined into Model 2; and IL-6 levels above 3.11 pg/mL, IL-8 levels above 3.59 pg/mL, and clinical factors were combined into Model 3. With only 19 patients (6.0%) in the cohort having a previous history of stroke or TIA, this variable was excluded from all models. Clinical presentation remained significantly associated with MACE in all three models and age was significant in the first model. Of the two PCA-derived scores, only the IL-6-IL-8 score was found to be an independent predictor of MACE at one year, with an OR of 2.77 (95% CI 1.32-5.81, p = 0.007) (Table 4). IL-6 alone was also an independent predictor of MACE (OR 2.18, 95% CI 1.06-4.50, p = 0.035).
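The reported odds ratios, confidence intervals and p-values can be cross-checked against each other. The sketch below assumes Wald-type 95% intervals that are symmetric on the log-odds scale, which is the usual logistic regression output but is not stated explicitly in the text. The function name is hypothetical.

```python
import math


def wald_summary(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover the log-odds coefficient, its standard error, the Wald z
    statistic and the two-sided p-value from an OR and its 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return beta, se, z, p


# Values as reported for the IL-6-IL-8 score and for IL-6 alone.
_, _, z_score, p_score = wald_summary(2.77, 1.32, 5.81)
_, _, z_il6, p_il6 = wald_summary(2.18, 1.06, 4.50)
```

Under this assumption the recovered p-values are roughly 0.007 and 0.035, in line with those reported.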

Discussion
In this study, we created two PCA-derived scores that allowed us to mathematically reduce the data while accounting for collinearity between cytokines with overlapping functions and maximising data variance. On univariate analysis, the cytokine PCA-derived score, the IL-6-IL-8 PCA-derived score, and levels of both IL-6 and IL-8 were associated with MACE, along with clinical presentation and age. ROC analysis showed modest predictive power for each of these continuous variables. Using cut-off points from that analysis, multivariate models found that both the IL-6-IL-8 PCA-derived score and IL-6 alone were independent predictors of MACE.
Combining the six cytokines where values could be calculated for > 50% of the population in the PCA model produced an AUC of 0.606 on ROC analysis, suggesting moderate ability to predict MACE. In the multivariate model, the point estimate for the OR was 3.77, but the 95% CI crossed 1.0, such that this score was not an independent predictor of MACE. Despite this, the inclusion of multiple cytokines that have been previously linked to MACE following ACS may have merit. From a theoretical point of view, inclusion of multiple markers using a mathematical technique that deals with inter-correlations and is robust even in the presence of missing data points is an appealing approach to characterise inflammation.
The alternative approach is to selectively include cytokines into the PCA model that are univariate predictors of outcome in a given cohort. In this instance, inclusion of IL-6 and IL-8 alone generated a PCA score that had a greater AUC of 0.652 on ROC analysis than either cytokine individually or the combined cytokine score. In addition, the IL-6-IL-8 score was an independent predictor of outcome in the multivariate model, with a greater odds ratio than either cytokine alone. This is consistent with the idea that inclusion of more than one inflammatory marker into a score may be more predictive than use of a single marker [9]. Skau et al. showed similar results in their study of AMI patients where selectively combining growth differentiation factor 15 (GDF-15) and tumour necrosis factor-related apoptosis-inducing ligand receptor 2 (TRAIL-R2) produced the same ROC AUC of 0.85 for MACE as the combination of 33 biomarkers [8].
IL-6 is expressed by many cell types, predominantly acts in a pro-inflammatory manner by mediating the acute phase response and has been associated with MACE in patients with acute coronary syndrome [4]. IL-8 is also expressed by various cells and has both pro- and anti-inflammatory roles. It is activated during myocardial ischaemia-reperfusion injury but also helps with angiogenesis, which improves myocardial function after AMI [4]. Interestingly, most studies have shown that high levels of IL-8 are associated with increased risk of adverse outcomes, but Velasquez et al. found it was protective in women [4,18-20]. To the best of our knowledge, no previous study has looked at combining IL-6 and IL-8 alone to predict MACE in AMI. This is likely because both cytokines have pro-inflammatory actions, making them unsuitable to be combined into a ratio, as this type of analysis is normally reserved for two cytokines with opposing roles in inflammation. Moreover, these two cytokines are often highly correlated, so if they were to be included in a simple score that did not control for their overlapping functions, the inflammatory pathways associated with these cytokines would be over-represented in the score. This makes it difficult to derive meaningful scores from both sets of data. However, as demonstrated by this study, PCA can be useful for combining these two cytokines as it takes into account the collinearity that exists between them.
Fig. 1. Cytokine concentrations in AMI patients measured in plasma 48 to 72 h from symptom onset. The concentrations of all cytokines are shown on the left y-axis, except for RANTES, which is shown on the right. For RANTES, values above the upper limit of detection were given an arbitrary value of 16,000 pg/mL. The bars represent the median and IQRs of each cytokine.
Correlations less than −0.5 or > 0.5 demonstrate a significant loading or contribution to the variance in the component.
Few studies have investigated PCA-derived scores as predictors of MACE following AMI. One study analysed 12 biomarkers in 100 STEMI patients and used exploratory factor analysis (EFA), a statistical method similar to PCA, to create factor scores and correlated these with 30-day MACE [21]. Only two factor scores, one of which was composed of IL-6, IL-8 and MCP-1, were found to be independent predictors. In our cohort, IL-6 and IL-8 contributed to the first component score, while MCP-1 contributed to the second component score. The differences in our findings may be due to the rotational component of EFA, which allows for better clustering of variables than PCA while compromising the amount of variance retained by the model [22]. In this study, we prioritised retaining the maximum amount of variance possible in our cytokines over correlations between cytokines, as there was large variance in the individual cytokine concentrations among AMI patients; thus we chose PCA over EFA.
Each of the cytokines chosen for analysis has been associated with adverse outcomes following ACS in at least one prior study (Supplementary Table 1). This was the basis for inclusion of these markers in the current study. However, there are many other inflammatory cytokines which have been used to predict outcome following ACS [7,8] that could not be included in this study due to limited resources. Additionally, seven of the 13 cytokines analysed were undetectable in more than half of our cohort. This may be due to using a sampling time that was too distant from symptom onset. To standardise timing in this study, we chose to include patients where blood was collected 48-72 h after symptom onset, as this was the most frequent time period in our biobank, allowing us to examine these trends in a larger cohort of patients. While this limited our analysis to six markers, the combination and sub-combination were still predictive of MACE. IL-1β may peak significantly sooner than 48 h [23], and other markers, including IL-6, MCP-1 and IFNγ, have been shown to have dynamic changes in concentration over time [24,25]. The optimal time point for measuring peak levels of cytokines in the context of ACS has not been determined. Further investigation is required regarding the temporal changes in concentrations for all cytokines and how this might affect a cytokine score.

Conclusion
Use of PCA to derive an inflammatory score from multiple cytokines was predictive of MACE in AMI patients, with the best model achieved by inclusion of only those cytokines that were individually associated with MACE. In this study, these cytokines were IL-6 and IL-8. The PCA-derived score had moderate sensitivity and specificity. While this approach appears promising, the optimal set of cytokines to measure and the optimal time to measure them remain to be determined.

Source of funding
This study was not funded by any specific grants and used departmental funding from the University of Otago Wellington, New Zealand.
All significant AUCs are bolded (p < 0.05). For significant AUCs, a cut-off value with optimal sensitivity and specificity has been determined from the ROC curve, with sensitivity prioritised over specificity when deciding between two similar cut-offs.
Fig. 4. ROC curve of significant individual cytokines and PCA-derived scores as predictors of MACE. The ROC curves demonstrate sensitivity (y-axis) and 1-specificity (x-axis) for IL-6 and IL-8 individually (left), and for the cytokine score and the IL-6-IL-8 score (right), as predictors of MACE. All had a moderate AUC that was significant (p < 0.05).

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Model 1 = Age, STEMI and cytokine score > 0.0802; Model 2 = Age, STEMI and IL-6-IL-8 score > −0.141; and Model 3 = Age, STEMI, IL-6 concentration > 3.11 pg/mL and IL-8 concentration > 3.59 pg/mL. Significant p-values (< 0.05) are bolded.