Machine learning cardiac-MRI features predict mortality in newly diagnosed pulmonary arterial hypertension

Abstract Aims Pulmonary arterial hypertension (PAH) is a rare but serious disease associated with high mortality if left untreated. This study aims to assess the prognostic cardiac magnetic resonance (CMR) features in PAH using machine learning. Methods and results Seven hundred and twenty-three consecutive treatment-naive PAH patients were identified from the ASPIRE registry; 516 were included in the training, and 207 in the validation cohort. A multilinear principal component analysis (MPCA)-based machine learning approach was used to extract mortality and survival features throughout the cardiac cycle. The features were overlaid on the original imaging using thresholding and clustering of high- and low-risk of mortality prediction values. The 1-year mortality rate in the validation cohort was 10%. Univariable Cox regression analysis of the combined short-axis and four-chamber MPCA-based predictions was statistically significant (hazard ratios: 2.1, 95% CI: 1.3, 3.4, c-index = 0.70, P = 0.002). The MPCA features improved the 1-year mortality prediction of REVEAL from c-index = 0.71 to 0.76 (P ≤ 0.001). Abnormalities in the end-systolic interventricular septum and end-diastolic left ventricle indicated the highest risk of mortality. Conclusion The MPCA-based machine learning is an explainable time-resolved approach that allows visualization of prognostic cardiac features throughout the cardiac cycle at the population level, making this approach transparent and clinically interpretable. In addition, the added prognostic value over the REVEAL risk score and CMR volumetric measurements allows for a more accurate prediction of 1-year mortality risk in PAH.


Introduction
Cardiac magnetic resonance (CMR) is a powerful prognostic tool owing to its ability to assess cardio-physiological parameters such as the volume and function of the cardiac chambers, tissue characterization, and anatomical structure. Machine learning methods harnessing CMR's prognostic abilities remain rare and mainly focus on segmenting cardiac chambers to automate CMR measurements. 1 The process of automating CMR measurements has matured over recent years, proving to be accurate and comparable with results obtained from manual segmentation. [2][3][4] However, there is a wealth of data available in CMR studies other than those based on volumetric measurements. A recent machine learning model based on the motion of segmented right ventricle predicted mortality in a mixed cohort of pulmonary hypertension patients. 5 This study linked impaired basal longitudinal shortening and transverse contraction at the interventricular septum and free wall with an increased risk of mortality. 5 Another recent machine learning model based on CMR disease features extracted by multilinear principal component analysis (MPCA) has been used to predict the presence or absence of pulmonary arterial hypertension (PAH) 6 without the need for segmentation.
The MPCA-based model is interpretable because MPCA is a linear and transparent feature extraction method, thus a particularly promising machine learning approach for CMR imaging. Each CMR image sequence is a three-dimensional array (e.g. 512 × 512pixels × 6 mm slice thickness ×20images throughout the cardiac cycle), with each element being a voxel capturing different tissue characteristics, anatomical location, and temporal variation in the cardiac cycle. Such a multidimensional array can be naturally represented as a mathematical object called a tensor. MPCA extracts features directly from such multidimensional tensor representation which preserves the multidimensional structure of the original CMR data more accurately than reshaping it into one-dimensional, vector representation. 7 The extracted MPCA features can then be weighted in classification or regression models to optimize the prediction of the desired outcome.
Pulmonary arterial hypertension is a rare but serious disease that is associated with high mortality if left untreated. 8 This study aims to assess the prognostic accuracy of the above MPCA-based model to predict 1-year mortality in PAH. Therefore, evaluating prognosis is key to identifying high-risk patients and optimizing their management strategies as recommended by the European Society of Cardiology guidelines. 9,10 Multiple clinical parameters are routinely obtained to evaluate PAH disease progression, including pulmonary haemodynamics from right heart catheterization (RHC), functional data from exercise tolerance and pulmonary function tests, biochemistry including N-terminal pro-B-type natriuretic peptide (NT-proBNP) and imaging including echocardiogram and CMR. The REVEAL score is a composite clinical risk score for mortality that combines these clinical parameters to predict 1-year mortality. 11 In addition, CMR measurements such as right ventricular volumes and function have been shown to predict mortality in PAH. 12 Thus, the availability of detailed patient phenotyping and prediction scores allows setting a clinical benchmark for the performance of machine learning prognostic models in PAH. This study assesses the additive value of the MPCA-based model to predict mortality compared with established prognostic parameters such as the REVEAL risk score and CMR measurements.

Methods
The TRIPOD checklist for reporting prediction model development and validation was followed 13 and is available in the supplemental material.

Study population
All consecutive treatment-naïve patients with PAH referred for a baseline CMR between 2008 and 2019 were identified from the ASPIRE  registry. 14 Eligibility criteria included: (i) a baseline CMR study performed within 14 days of a PAH diagnosis, confirmed by RHC, and before the commencement of PAH treatment. (ii) Minimum 12 months follow-up or death within 12 months post-CMR study. The study population was divided into two cohorts: (i) a training cohort whose CMR images were used to develop and optimize the prognostic algorithm and (ii) a validation cohort that was left out of training and used to validate the performance of the prognostic model. The cohort was split 70:30 into the model development and model validation cohort.
Ethical approval was obtained from the local ethics committee and written consent was waived for this retrospective study (ref c06/ Q2308/8).

MR imaging protocol
Cardiac magnetic resonance was performed with a 1.5 Tesla GE HDx (GE Healthcare, Milwaukee, USA) system using an eight-channel cardiac coil. Four-chamber (4Ch) and short-axis (SA) cine images were acquired using a cardiac-gated multislice balanced steady-state free precession sequence (20 frames per cardiac cycle, slice thickness 10 mm, 0 mm interslice gap, field of view 480 mm, acquisition matrix256 × 200, flip angle 60°, BW 125 KHz/pixel, TR/TE 3.7/1.6 ms). A stack of images in the SA plane were acquired fully covering both ventricles from base to apex. End-systole was considered to be the smallest cavity area. End-diastole was defined as the first cine phase of the R-wave triggered acquisition or largest volume. Patients were in the supine position with a surface coil and with retrospective ECG gating. Volumetric and ventricular function analysis was performed by contouring the ventricular endocardial borders at end-diastole and endsystole on the SA images using MASS software (MASS, 2020; Leiden University Medical Center, Leiden, the Netherlands). Papillary muscles and trabecula were included in the blood volume.

Image preprocessing
Mid-chamber SA and 4Ch cine images were used in this study. Images were processed following methods in a previous study. 15 In brief, images were preprocessed by standardizing CMR voxel units between subjects, registering to each other using three anatomical landmarks, masking surrounding tissues, and downscaling image size ( Figure 1).
Cardiac magnetic resonance voxel units were standardized between subjects by z-scores. Rigid image registration was used based on three predefined fixed anatomic landmarks. The landmarks were manually placed on SA (superior insertion point; right ventricular free wall inflexion; mid-left ventricular lateral wall) and 4Ch (left ventricle apex; lateral mitral annulus; lateral tricuspid annulus). Cardiac magnetic resonance images were landmarked by a single reader (S.A.) with independent visual quality assurance checks (S.A.; J.U.). To focus on spatially relevant features, an ellipsoidal mask was fitted around the heart. Downsampling was performed to four image sizes (32 × 32, 64 × 64, 128 × 128, and 256 × 256).

Multilinear principal component analysis pipeline
The prognostic prediction was achieved by training support vector machines (SVMs) on MPCA features extracted from CMR studies. 6,7 The methodology followed the MPCA-based pipeline in previous studies. 6,15 This pipeline was trained through 10 rounds of 10-fold cross-validation on the development cohort (n = 516). For each fold during training, MPCA features were extracted and ranked for prognostic capability using Fisher's Discriminant Analysis in the following way. Extracted features were ranked and selected using a step-wise feature inclusion method. This was performed using a random tuning-set (n≈50) of cases. The feature set with the highest tuning-set performance was used to train an SVM and tested on the left-out fold. The feature set with the highest foldperformance was used to train the final development SVM. This MPCA-based machine learning model was then applied to the completely left-out validation cohort (n = 207). On a standard computer, the time it takes to process each image and perform inference is much ,0.

Visualization of tensor features
Trained features were visualized by using MPCA reconstruction to obtain spatially relevant feature maps. To visually inspect the impact of specific regions on high-and low-risk of mortality prediction, a two-step procedure of thresholding and clustering was implemented. Voxels containing high absolute values (high positive = high-risk, high negative = low-risk) of MPCA features were thresholded. Morphological dilation-erosion using a spherical structural element (r = 2) was performed and clusters of visually significant size were overlaid on individual patients' original CMR scans.

Clinical and mortality data
Clinical data including intermittent shuttle walking test, pulmonary function test, and serum level of NT-proBNP were collected before treatment was commenced. Demographic data, WHO functional status, PAH subgroup diagnosis, and outcome were collected from the electronic medical system.Mortality data were collected from the electronic records of the National Health Service (NHS) Personal Demographics Service. The NHS automatically updates the mortality records once a death is registered in the United Kingdom. All patients were followed up as part of the national service specification for patients with pulmonary hypertension for a minimum of 12 months. No patients were lost to follow-up.

Statistical analysis
Continuous variables are presented as proportions, means + standard deviations, or median and interquartile range for data not following a normal distribution. The sample size for developing the prediction model was calculated using a 1-year mortality prevalence of 10% and seven predictor parameters and required 420 patients to develop the mortality prediction model. 16 The REVEAL score was calculated from composite clinical parameters 11 and modified to include the incremental shuttle walk test instead of the 6 min walking test. 17,18 The CMR volumetric measurements were indexed for body surface area and corrected for age and sex by calculating the percentage predicted values as per published reference data. 19,20 The outcome of the MPCA-based pipeline was calculated as the SA and 4Ch probabilities based on the SVM prediction. A combined probability was calculated by further training a dual-scan SVM from the selected features of both individual models-SA and 4Ch. All variables were standardized by subtracting the mean for each variable and dividing it by its standard deviation (SD) to allow for more meaningful comparisons. A univariable Cox proportional hazards regression was performed to estimate the 1-year mortality prediction of the REVEAL score, CMR measurements, and the MPCA probabilities. For the multivariable analysis, we planned to include the CMR measurements that were identified in previous prog-   nostic studies, namely right ventricular ejection fraction (RVEF), right ventricular end-systolic volume index (RVESVi), right ventricular enddiastolic volume index (RVEDVi), left ventricular end-diastolic volume index (LVEDVi), left ventricular stroke volume index (LVSVi) and pulmonary artery relative area change (PA RAC). 12,21 Owing to the high correlation between RVESVi and RVEDVi (r = 0.89), we only included RVESVi as the stronger predictor in the multivariable analysis. The proportional hazards assumption was confirmed using scaled Schoenfeld residuals. The c-index was used to measure the relative goodness of fit between the different regression models. The c-index indicates the rate of correct predictions of survival the model makes. We also computed the Akaike information criterion (AIC) for each model. The AIC estimates the rate of incorrect prediction and compares the quality of different models relative to each other while penalizing the models with more variables. While a higher c-index indicates a better model fit, a lower AIC value indicates fewer prediction errors. 22 In addition, the likelihood ratio test was performed to assess if there is a statistically significant difference between the different models and to determine the additive predictive value of the MPCA probabilities. The models compared were the univariable REVEAL score, the REVEAL score combined with prognostic CMR measurements, and finally a multiple variable model including the REVEAL score, CMR measurement, and the MPCA probabilities. Kaplan-Meier curves were analysed to demonstrate the prognostic value of MPCA predictions dividing patients based on the median MPCA value as the threshold. The high and low mortality risk groups were compared using the log-rank (Mantel-Cox) test. The receiveroperating characteristic curve (ROC) and the area under the curve (AUC) were used to estimate the prognostic accuracy of the different MPCA features.

Study population characteristics
A total of 737 consecutive incident patients with PAH were identified. Incomplete scans because of claustrophobia or patient intolerance were excluded, leaving 723 scans for the analysis. The training cohort included 516 and the validation cohort 207 subjects ( Figure 2).

Mortality prediction Survival analysis
The 1-year mortality rate in the validation cohort was 10% with an overall mortality rate over the total follow-up period of 29%. Kaplan-Meier survival analysis demonstrated a significant difference in survival in patients with high and low mortality risk in the validation cohort (log-rank test ,0.001) (Figure 3). The ROC curve for each model is shown in Figure 4. The AUC was 0.73 for the SA model, 0.64 for 4Ch, and 0.70 for the combined MPCA-based features to predict 1-year mortality in the validation cohort.
Univariable Cox regression analysis confirmed a strong prognostic utility of the SA and combined SA and 4Ch MPCA-based predictions ( Table 2). However, the 4Ch features alone were not significant predictors of mortality. The univariable Cox regression hazard ratios for the demographics, RHC and CMR measurements, functional tests and clinical parameters are shown in Table 2. The REVEAL score and, PA RAC and age and sex-adjusted RVESVi were significant predictors of 1-year mortality.

Additive prognostic value
Several multivariable prognostic models were compared in Table 3 to compare the predictive value of the REVEAL score alone, REVEAL score combined with CMR measurements or MPCA features and finally REVEAL score combined with CMR measurements and MPCA features. The prognostic models were compared using the c-index and AIC test for goodness of fit and the log-rank test to assess the statistical significance of the difference between the models. The univariable REVEAL model allows the assessment of the 1-year risk of mortality based on available composite clinical data alone. Adding the MPCA-based predictions allows evaluating the added incremental value in predicting death compared with REVEAL and segmentation-based CMR parameters.
The REVEAL score alone had a c-index of 0.71 and AIC of 203. Adding CMR measurements improved the model statistically significantly, to 0.78 and AIC of 205 (log-rank test P = 0.003). The model including MPCA prediction, REVEAL score and CMR measurements, showed the strongest prognostic utility  Table 3 C-index and Akaike information criterion (AIC) for the univariable and multiple variable Cox regression analysis for the REVEAL score, CMR measurements, and the MPCA machine learning model

Temporal prognostic dynamics
The MPCA-based features were assessed throughout the cardiac cycle and grouped according to the anatomical region into the right ventricle (RV), left ventricle (LV) and septum. For visualization purposes, we manually segmented the averaged SA and 4Ch slice to group the MPCA features into anatomical regions. The features were divided into low and high-risk features based on the median MPCA feature values used in the Kaplan-Meier analysis ( Figures 5, 6). On the SA views, abnormal interventricular septum during systole and particularly at end-systole and the LV chamber during diastole and particularly at end-diastole indicated a higher risk of mortality. On 4Ch views, the features with the highest impact on predicting mortality were at the RV at early systole. A normal LV and interventricular septum in diastole on SA and 4Ch imaging were the strongest predictors of survival, whereas the RV was a poor indicator of survival ( Figure 5).

Discussion
This study assessed the prognostic utility of an MPCA-based machine learning model in CMR in patients with treatment-naïve PAH. This is the first study to localize prognostic PAH features with an explainable AI approach dynamically over the cardiac cycle. In addition, we have shown the incremental prognostic value of the MPCA model compared to known prognostic markers such as the REVEAL score and CMR volumetric measurements. The advantage of using MPCA is its interpretability. The ability to directly relate prognostic features identified in the machine learning process helps understand and explain the machine learning model's findings. Diagnostic and prognostic models based on deep learning methods have been criticized for creating a 'black-box' situation where the predictions are often difficult to comprehend and retrace. 23 Visualizing the MPCA features throughout the cardiac cycle allowed discerning the most significant discriminatory predictors of death on CMR in PAH. The known prognostic features identified in pulmonary hypertension of diastolic interventricular septal flattening, 24 reduced LV size and increased RV size 25 can all be visually assessed on SA images. The most significant features identified in non-survivors on SA imaging were located at the septum at end-systole and LV at end-diastole. Changes in the interventricular septum at end-systole are the result of RV pressure overload. The altered pressure gradient between the LV and RV results in flattening of the septum giving a characteristic D-shaped LV and eventually results in impaired LV diastolic function and reduced LV filling. 26,27 Survivors showed the opposite with features in the septum at enddiastole and LV at end-systole. We found fewer overall features on SA images at the RV. However, on 4Ch imaging the most significant features were identified in RV systole. Whereas the septal and LV features were less important on 4Ch imaging. The 4Ch view allows assessing the longitudinal RV contractility which for example can be inferred on echocardiogram by assessing the tricuspid annular plane systolic excursion (TAPSE). Right ventricle longitudinal contraction is known to be the larger component of RV contraction and a key prognostic indicator [28][29][30] which explains its prognostic importance in PAH.
The MPCA-based model was developed and validated on CMR imaging performed at diagnosis and in treatment-naive PAH patients. Disease severity assessed at baseline assessment is important for planning an optimal treatment strategy. Almost all published prognostic CMR studies in PAH are based on disease prevalent PAH patients in later stages of the disease process. 12 A meta-analysis of 22 studies and almost 2000 patients with PAH showed that RVEF, RVESVi, RVEDVi, LVEDVi and LVSVi were significant predictors of mortality. 12 Right ventricle ejection fraction, RVEDVi, LVEDVi, and LVSVi did not predict mortality in our baseline PAH cohort. The MPCA pipeline can therefore elicit cardiac changes before they affect RV function and size and adds prognostic value at baseline evaluation. In addition, comparing the MPCA to the REVEAL score allowed us to evaluate the incremental value benchmarked against a clinically validated baseline prognostic tool. The MPCA-based predictions significantly improved the 1-year mortality prediction of the REVEAL score. The prognostic model accuracy (c-index) using REVEAL improved from 71% to 83% (log-rank test P , 0.001) when it was combined with CMR data including MPCA predictions and CMR measurements. However, even without REVEAL data, mortality can still be accurately predicted based on MPCA features alone with an accuracy (c-index) of 70%.
The application of step-wise cardiac features extraction using CMR has further potential that can be evaluated in future developments. Comparing prognostic features at follow-up with baseline features might provide a better understanding of disease progression on CMR and might offer a standardized disease monitoring tool. In addition, technical improvements would allow to fully automate the prognostic model, which currently requires manual image registration. Deep learning automated landmarking for image registration would reduce the manual processing of CMR images and reduce the time and cost associated with it. 31,32 Limitations This was an exploratory retrospective single centre study on patients with PAH. Findings will need to be confirmed in a prospective trial with an external validation cohort. In addition, applying the model to other diseases and MRI systems would further validate its generalizability.
The MPCA method was applied on cine images of the midchamber slice throughout the cardiac cycle. Stack imaging of the whole heart can currently not be included in the MPCA model training. However, because of the strong prognostic signal from the SA and 4Ch cine images we envisage that future developments including 3D data of the heart will further improve prognosis prediction.

Conclusion
Patient outcome prediction in PAH can be enhanced by adding MPCA-based machine learning to CMR volumetric data and clinical risk scores. The MPCA analysis gives a population insight into the prognostic cardiac features in PAH in an explainable and visualisable approach.

Funding
The study was supported by the Wellcome Trust grants 215799/Z/ 19/Z and 205188/Z/16/Z. For the purpose of Open Access, the author has applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission. The funders did not have any role in the design and conduct of the study; in the collection, analysis, and interpretation of the data; or in the preparation, review, and approval of the article.

Conflict of interest:
The authors have no relationships relevant to the contents of this paper to disclose.

Data availability
The data underlying this article will be shared on reasonable request to the corresponding author.