Cardiovascular magnetic resonance feature-tracking assessment of myocardial mechanics: Intervendor agreement and considerations regarding reproducibility

Aim: To assess intervendor agreement of cardiovascular magnetic resonance feature tracking (CMR-FT) and to study the impact of repeated measures on reproducibility.
Materials and methods: Ten healthy volunteers underwent cine imaging in short-axis orientation at rest and with dobutamine stimulation (10 and 20 μg/kg/min). All images were analysed three times using two types of software (TomTec, Unterschleissheim, Germany and Circle, cvi42, Calgary, Canada) to assess global left ventricular circumferential (Ecc) and radial (Err) strains and torsion. Differences in intra- and interobserver variability within and between software types were assessed based on single and averaged measurements (two and three repetitions with subsequent averaging of results, respectively) as determined by Bland–Altman analysis, intraclass correlation coefficients (ICC), and coefficient of variation (CoV).
Results: Myocardial strains and torsion significantly increased on dobutamine stimulation with both types of software (p<0.05). Resting Ecc and torsion as well as Ecc values during dobutamine stimulation were lower measured with Circle (p<0.05). Intra- and interobserver variability between software types was lowest for Ecc (ICC 0.81 [0.63–0.91], 0.87 [0.72–0.94] and CoV 12.47% and 14.3%, respectively) irrespective of the number of analysis repetitions. Err and torsion showed higher variability that markedly improved for torsion with repeated analyses and to a lesser extent for Err. On an intravendor level TomTec showed better reproducibility for Ecc and torsion and Circle for Err.
Conclusions: CMR-FT strain and torsion measurements are subject to considerable intervendor variability, which can be reduced using three analysis repetitions. For both vendors, Ecc qualifies as the most robust parameter with the best agreement, albeit lower Ecc values obtained using Circle, and warrants further investigation of incremental clinical merit.


Introduction
Heart failure is characterised by high mortality irrespective of the predominance of either systolic or diastolic functional impairment. 1,2 Several imaging techniques are available to characterise its aetiology and severity, amongst which cardiovascular magnetic resonance (CMR) has a pivotal role. 3–5 In particular, the opportunity of easy and fast quantitative image analyses makes this technique attractive. 6 There is evidence to suggest that strain assessment derived from quantitative deformation imaging based on echocardiographic speckle tracking has higher value for the prediction of mortality than ejection fraction (EF) in consecutive patients subjected to echocardiography. 7 CMR-derived myocardial feature tracking (FT), a technique analogous to echocardiographic speckle tracking, derives similar quantitative deformation parameters from routinely available steady state free precession (SSFP) cine sequences. Reasonable agreement between speckle tracking and CMR-FT has been demonstrated. 8 Furthermore, CMR-FT agrees well with myocardial tagging, 9 which is considered the reference standard for CMR quantitative wall-motion assessment, but the former does not require the acquisition of additional sequences. 10 Its clinical applicability has been demonstrated in a variety of cardiovascular diseases, 8,11–13 its feasibility for detailed assessments of systolic and diastolic cardiovascular physiology has been demonstrated, 14,15 and there is evidence of prognostic relevance in dilated cardiomyopathy. 13 Although the vast majority of such studies have been carried out with the software provided by TomTec Imaging Systems (Diogenes or 2D Cardiac Performance Analysis-MR, TomTec GmbH, Unterschleissheim, Germany), 16 Circle Cardiovascular Imaging (cvi42, Calgary, Canada) has recently introduced an alternative tool called Tissue Tracking.
Widespread clinical use of these new measures of deformation is highly desirable and likely important. A prerequisite for this goal is that the assessments are reproducible and comparable, with a high degree of intervendor agreement. Therefore, the aim of the present study was to assess the reproducibility and intervendor agreement of both commercially available types of software for the derivation of ventricular circumferential (Ecc) and radial (Err) strains, as well as rotational mechanics expressed as left ventricular (LV) torsion.

Material and methods
The study cohort consisted of 10 healthy volunteers. CMR imaging was carried out on a 1.5 T system (Intera R 12.6.1.3, Philips Medical Systems, Best, The Netherlands). All participants gave written informed consent after approval of the study protocol by the Institutional Review Board at the University of Nebraska Medical Center.

CMR imaging
The CMR examination was carried out in the supine position using a five-channel cardiac surface coil. Electrocardiogram (ECG)-gated SSFP cine sequences were acquired during brief periods of breath-holding in 12 to 14 equidistant short-axis planes completely covering the LV. Typical CMR parameters were as follows: 8 mm section thickness; 1–2 mm gap; 360×480 mm field of view; 196×172 matrix size. Dobutamine stress CMR imaging was performed as previously described. 17 Complete short-axis stacks were acquired at rest and with 10 and 20 μg/kg/min dobutamine, respectively.

CMR-FT
CMR-FT was performed using dedicated software provided by TomTec Imaging Systems (2D CPA MR, Cardiac Performance Analysis, Version 1.1.2.36) and Circle Cardiovascular Imaging (Tissue Tracking, cvi42). For the purposes of this paper the different software tools are referred to as "TomTec" and "Circle". Identical short-axis sections were analysed at apical, mid-ventricular, and basal levels to compare short-axis-derived global LV Ecc and Err (based on all three analysed sections) alongside the time-to-peak (TPK) strain duration. Short-axis CMR images were analysed at rest and with 10 and 20 μg/kg/min dobutamine, respectively. Myocardial torsion was calculated from the rotational raw data provided with the TomTec software using an in-house-developed post-processing tool as recently described by the authors' group. 15 The model underlying this assessment makes use of linear interpolation and takes standardized rotational measurements at the 25% and 75% LV locations after the analysis of a whole LV short-axis stack. In this model the most apical section showing LV cavity at end-systole is considered to be at the 0% LV location and the most basal section including a complete circumference of myocardium at end-systole is considered to be at the 100% LV location. In contrast to TomTec, Circle provides torsion measurements within its commercial software interface; the user manually chooses an apical and a basal section. In order to allow accurate comparisons between vendors, the apical and basal sections closest to the 25% and 75% LV locations were chosen.
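The two ways of obtaining rotation at the standardized LV locations can be illustrated with a minimal sketch. It assumes one rotation value per section (for example at a single time frame) and section positions already expressed as a percentage of LV length; the function names and the example numbers are hypothetical and are not taken from either software package. How the resulting twist is normalised into torsion is described in ref. 15 and is not reproduced here.

```python
from bisect import bisect_right


def interpolate_rotation(positions_pct, rotations_deg, target_pct):
    """Linearly interpolate rotation at target_pct.

    positions_pct must be sorted ascending, with 0 = most apical section
    and 100 = most basal section (the convention described in the text).
    """
    if not positions_pct[0] <= target_pct <= positions_pct[-1]:
        raise ValueError("target location lies outside the analysed stack")
    # Index of the first section located strictly beyond the target.
    i = bisect_right(positions_pct, target_pct)
    if i == len(positions_pct):  # target coincides with the last section
        return rotations_deg[-1]
    x0, x1 = positions_pct[i - 1], positions_pct[i]
    y0, y1 = rotations_deg[i - 1], rotations_deg[i]
    return y0 + (y1 - y0) * (target_pct - x0) / (x1 - x0)


def nearest_section(positions_pct, target_pct):
    """Index of the acquired section closest to target_pct (manual-choice analogue)."""
    return min(range(len(positions_pct)),
               key=lambda i: abs(positions_pct[i] - target_pct))


# Hypothetical six-section stack and rotations in degrees (illustrative only).
positions = [0, 20, 40, 60, 80, 100]
rotations = [10.0, 8.0, 4.0, 0.0, -3.0, -5.0]

rot_25 = interpolate_rotation(positions, rotations, 25)
rot_75 = interpolate_rotation(positions, rotations, 75)
twist = rot_25 - rot_75  # apical minus basal rotation (sign convention assumed)

apical_idx = nearest_section(positions, 25)
basal_idx = nearest_section(positions, 75)
```

The interpolation variant yields values at exactly 25% and 75%, whereas the nearest-section variant depends on where the acquired sections happen to lie, which is one plausible source of the differences in torsion between the two approaches.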
With both types of software LV endocardial and epicardial borders were manually delineated in all analysed sections with the initial contour set at end-diastole. In case of insufficient tracking, as defined by apparent deviations of the contours from the endocardial and epicardial borders, contours were manually corrected and the algorithm reapplied. The tracking was repeated three times in all sections. One single observer analysed all data using both types of software. Intra-observer variability was derived from the repetition of the analysis after 4 weeks. The analysis of a second skilled observer for both types of software was used to assess interobserver reproducibility.
Reported results are based on the average of three analysis repetitions (R3). To study the impact of repeated measurements on reproducibility, the reproducibility derived from results based on a single repetition (R1), averaged results for two (R2) and three repetitions (R3) were compared with each other.
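The R1, R2, and R3 values can be sketched as follows. This is only an illustration: it assumes that R1 is the first analysis run and that R2 averages the first two runs, which the text does not specify, and the strain values are made up.

```python
from statistics import mean


def averaged_results(runs):
    """Build R1, R2 and R3 from three analysis runs.

    runs: list of three lists, one per analysis run, each holding one
    value per subject in the same subject order.
    Assumption: R1 = first run, R2 = mean of runs 1-2, R3 = mean of runs 1-3.
    """
    if len(runs) != 3:
        raise ValueError("expected exactly three analysis runs")
    r1 = list(runs[0])
    r2 = [mean(values) for values in zip(*runs[:2])]
    r3 = [mean(values) for values in zip(*runs)]
    return {"R1": r1, "R2": r2, "R3": r3}


# Hypothetical Ecc values (%) for two subjects across three runs.
ecc_runs = [
    [-20.0, -18.0],
    [-22.0, -17.0],
    [-21.0, -19.0],
]
ecc = averaged_results(ecc_runs)
```

Reproducibility statistics are then computed separately on `ecc["R1"]`, `ecc["R2"]`, and `ecc["R3"]` to see how averaging changes agreement.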

Statistical analysis
Statistical analysis was conducted using Microsoft Excel and IBM SPSS Statistics version 22 for Windows. Data are expressed as mean (± standard deviation). Pairwise nonparametric data at rest and with increasing levels of dobutamine were compared using the Wilcoxon test. Significance was determined at p<0.05. The intra- and interobserver variability was assessed using three different methods: intraclass correlation coefficients (ICC), Bland–Altman analysis, 18 and coefficients of variation (CoV). The CoV was defined as the standard deviation of the differences divided by the mean. 19 The level of agreement was defined as previously described: excellent for ICC>0.74, good for ICC = 0.60–0.74, fair for ICC = 0.40–0.59, and poor for ICC<0.4. 20
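The agreement measures above can be sketched in a few lines. This is an illustration, not the authors' Excel/SPSS workflow: the 1.96 multiplier for the limits of agreement is the conventional Bland–Altman choice, "the mean" in the CoV is assumed to be the mean of all paired measurements (taken as an absolute value because strains can be negative), and the paired values are invented.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def coefficient_of_variation(a, b):
    """SD of the paired differences divided by the mean, in percent.

    Assumption: the denominator is the absolute mean of all measurements.
    """
    diffs = [x - y for x, y in zip(a, b)]
    overall_mean = mean(list(a) + list(b))
    return 100.0 * stdev(diffs) / abs(overall_mean)


def classify_icc(icc):
    """Agreement categories as defined in the text."""
    if icc > 0.74:
        return "excellent"
    if icc >= 0.60:
        return "good"
    if icc >= 0.40:
        return "fair"
    return "poor"


# Hypothetical paired measurements from two analyses of the same subjects.
first = [10.0, 12.0, 14.0]
second = [9.0, 13.0, 12.0]

bias, lower, upper = bland_altman(first, second)
cov = coefficient_of_variation(first, second)
```

Applied to the intervendor ICC values reported for Ecc in the abstract (0.81 and 0.87), `classify_icc` returns "excellent" in both cases.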

Results
Demographics are displayed in Table 1. Quantitative analysis was performed in all subjects. Fig 1 shows a representative example of the derivation of Ecc with both types of software. Although all scans were successfully analysed using TomTec (100%), one volunteer was excluded with intermediate dose dobutamine stimulation (20 μg/kg/min) from the Circle group due to insufficient border tracking (97% success rate in total). The time for repeat analysis (three repetitions) of a given case (when only considering the analysis of three sections with both types of software), including the tracking at rest and with the respective dobutamine levels, did not vary between the different types of software and took 27–35 minutes on average. In contrast, the analysis based on a single repetition took only 9–12 minutes with either type of software.

Quantification of myocardial strain
The changes of myocardial strain in response to dobutamine stimulation are illustrated in Fig 2. There was a significant increase in Ecc and TPK Ecc at both levels of dobutamine (10 and 20 μg/kg/min; p<0.05) using TomTec (Table 2). Similarly, with Circle the Ecc and TPK Ecc significantly increased from rest to 10 and 20 μg/kg/min of dobutamine, respectively (p<0.05; Table 2). There was no significant increase from 10 to 20 μg/kg/min of dobutamine (p=0.374; Table 2). There were significantly lower Ecc values derived from Circle as compared to TomTec at rest (p<0.05) and with 10 (p<0.05) and 20 μg/kg/min of dobutamine (p<0.05; Table 2).
Err significantly increased from rest to 20 and between 10 and 20 μg/kg/min of dobutamine using TomTec (p<0.05; Table 2). The corresponding TPK Err significantly increased from rest to 10 and 20 μg/kg/min of dobutamine (p<0.05) but not between 10 and 20 μg/kg/min of dobutamine (p=0.125; Table 2). With Circle, Err and TPK Err significantly increased from rest to 10 and to 20 μg/kg/min of dobutamine, respectively (p<0.05; Table 2). There was no significant increase from 10 to 20 μg/kg/min of dobutamine (p=0.139 and p=0.051 for Err and TPK Err, respectively; Table 2) and no significant difference in Err values derived from either software type (p>0.05 for all parameters). There were significantly increased strain values with dobutamine stress as compared to rest irrespective of the number of analysis repetitions (p<0.05, data not shown).

Quantification of myocardial torsion
A significant increase in myocardial torsion was measured between rest and both levels of dobutamine, but not between 10 and 20 μg/kg/min of dobutamine using either software type (p<0.05; Fig 2, Table 2). The change in TPK torsion did not reach statistical significance between rest and 10 μg/kg/min of dobutamine using TomTec (p=0.13). All other comparisons reached statistical significance (p<0.05) (Table 2). There was significantly lower torsion at rest measured with Circle as compared to TomTec (p<0.05). There was no significant difference in torsion derived from either software during dobutamine stress (p>0.05 for all parameters).

Table 1 Volunteer demographics.
Demographics              Healthy volunteers
Study population, n       10
Gender (F/M)              5/5
Age (years)               40.6 (23–51)
Continuous variables are expressed as mean ± standard deviation; age is expressed as median (range). Volumetric results have been adopted from ref. 15. EDV, end-diastolic volume; ESV, end-systolic volume; CI, cardiac index; EF, ejection fraction.

Figure 1 Example of the derivation of LV Ecc curves, using the two commercially available CMR-FT software types.

Results are reported as mean (SD). Ecc, circumferential LV short axis strain; Err, radial LV short axis strain; TPK, time to peak; ms, milliseconds; BP, blood pressure; other abbreviations as in Table 1. Volumetric results have been adopted from ref. 15. Bold p values indicate significance at the 0.05 level.

Intervendor agreement and reproducibility
Intervendor variability was lowest for Ecc, and higher for myocardial torsion and Err on an intra- and interobserver level based on three analysis repetitions (R3; Fig 3). Intervendor agreement was generally lower than intravendor agreement for both types of software. Ecc was the least variable parameter for both types of software. Although TomTec showed better reproducibility for Ecc and myocardial torsion as compared to Circle, Circle had better reproducibility than TomTec for Err (Table 3, Fig 4). Intervendor agreement and reproducibility were not reduced with dobutamine stress (data not shown).

Impact of repeated measurements on reproducibility
The results based on three repetitions (R3) are shown in Table 3 as compared to results based on two repetitions (R2; Table 4) and to results relying on single analyses (R1; Table 5). The intervendor agreement and the reproducibility within the individual software types of most assessed parameters were improved by repeated measurements both on the intra-observer and interobserver level. Whilst there was relatively little impact on Ecc with intervendor agreement on the intra-

Discussion
To the authors' knowledge, this study is the first comparison of different types of commercially available CMR-FT software, and presents several notable findings. First, intervendor agreement between the two software types is reasonable, with the best agreement for Ecc, and worse but acceptable agreements for myocardial torsion and Err. Second, there is significantly lower Ecc and myocardial torsion at rest and lower Ecc measured with dobutamine stimulation using Circle as compared to TomTec. Third, averaging of the results of repeated analyses increases both intervendor agreement and intravendor reproducibility; however, the benefit is relatively low, considering that doubling or tripling of analysis times would be required. Lastly, although both software types show acceptable intravendor reproducibility, it is important to note that the Circle software shows slightly more variability for Ecc and myocardial torsion as compared to TomTec. Conversely, TomTec shows slightly more variability for Err measurements compared to Circle.
Table 3 Intervendor agreement and reproducibility for torsion, circumferential and radial strain based on the average of three repeated measurements (R3). Results are reported as mean (SD).
Since the introduction of CMR-FT in 2009, 16 it has found widespread clinical and research applications in various adult and congenital disorders. 8,9,11,12,14,21 The increased demand for this relatively young technology necessitates the availability of quick and efficient post-processing software. Although, historically, such software has been provided by TomTec, Circle only recently released their version. Nevertheless, results need to be comparable and ideally interchangeable between different types of software to allow widespread clinical use. Within the present study, post-processing times were comparable between the software types, making them equally applicable for clinical use. The fact that Ecc and resting myocardial torsion measured with Circle showed significantly lower values as compared to TomTec could potentially limit the interchangeability of results between vendors. Clearly, there is a need to consider these inherent differences when comparing results from either type of software. Notwithstanding these considerations, the underestimation (from a Circle perspective) or overestimation (from a TomTec perspective) in Ecc was consistent in the three experimental conditions and reproducible through the three repetitions, which may allow future work to introduce correction factors to account for these differences. The average difference between vendors was 4.8% (see Table 3), a considerable value compared to the range of strain observed in this population (10 to 30%, see Fig 2). Vendor-induced variability between TomTec and Circle was lowest for Ecc (see Tables 3–5). The finding of high reproducibility of Ecc is in line with previously published literature. 9–11,22,23 There is evidence from studies that used TomTec suggesting that Ecc is the CMR-FT parameter with the highest reproducibility in health 10 and disease, 11 irrespective of field strength. 23 Furthermore, of all CMR-FT-derived parameters, Ecc has been shown to have the highest interstudy reproducibility 22 as well as the best agreement with echocardiographic speckle tracking. 8
Myocardial torsion and Err were subject to higher intervendor variability compared to Ecc and to lower bias between vendors. Although Err showed no significant bias between vendors, there was significant underestimation of torsion at rest, but not during dobutamine stimulation, using Circle. The variability associated with torsion may well be explained by the fact that the methodology that has been validated with TomTec makes use of linear interpolation and standardized measurements at predefined anatomical LV locations, whereas Circle derives rotational mechanics directly from the analysed sections. 15 Based on ICC, Err showed the lowest intravendor reproducibility and intervendor agreement. Conversely, torsion showed the lowest intravendor reproducibility and intervendor agreement based on CoV.
Table 4 Intervendor agreement and reproducibility for torsion, circumferential and radial strain based on the average of two repeated measurements (R2). Results are reported as mean (SD). ICC, intraclass-correlation coefficient; CoV, coefficient of variation; SD, standard deviation; CI, confidence interval. Other abbreviations as in Table 2.
Table 5 Intervendor agreement and reproducibility for torsion, circumferential and radial strain based on single measurements (R1). Results are reported as mean (SD). ICC, intraclass-correlation coefficient; CoV, coefficient of variation; SD, standard deviation; CI, confidence interval. Other abbreviations as in Table 2.

A. Schuster et al. / Clinical Radiology 70 (2015) 989–998

It is important to note that Ecc and myocardial torsion reached slightly higher intravendor reproducibility with TomTec as compared to Circle. This may be related to the fact that TomTec has been around for several years and been subjected to changes of the tracking algorithm several times (last change in December 2012); however, Circle showed better reproducibility for Err than TomTec. Considering these results, further refinements in the performance and subsequent increases in agreement between vendors and within vendors seem highly desirable. To achieve this, the impact of repeated measurements and subsequent averaging of results was tested. Although each of the parameters shows somewhat improved intervendor agreement and intravendor reproducibility with repeated analysis runs, this effect is most evident for myocardial torsion (reduction of intervendor CoV from 55% to 38%, see Tables 3 and 5). In comparison, three repetitions have a smaller effect on Ecc than on the less reproducible myocardial torsion. Even though Err has comparable reproducibility to myocardial torsion, there is only modest improvement of reproducibility with repeated analyses based on CoV and no improvement based on ICC. Consequently, one needs to decide whether the positive effects of repeated analyses on intervendor agreement and intravendor reproducibility for most parameters would justify a threefold increase in analysis time, especially in the setting of a large-volume clinical practice. Doubling or tripling the time of analysis, with an increase from about 9–12 to 27–35 minutes, may represent a limitation of the feasibility of CMR-FT for clinical routine. From a time versus cost view, the analysis based on three repetitions may consequently not prove cost-effective.
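The trade-off described above can be put into numbers with a short calculation. It uses only figures quoted in the text (intervendor torsion CoV of 55% for single analyses versus 38% for three repetitions, and analysis times of 9–12 versus 27–35 minutes); the helper name is hypothetical.

```python
def relative_reduction(before, after):
    """Fractional reduction when going from `before` to `after`."""
    return (before - after) / before


# Intervendor CoV for torsion: single analysis (R1) vs three repetitions (R3).
cov_reduction = relative_reduction(55.0, 38.0)

# Analysis time in minutes: lower and upper ends of the reported ranges.
time_factor_low = 27 / 9
time_factor_high = 35 / 12

print(f"Relative CoV reduction for torsion: {cov_reduction:.1%}")
print(f"Analysis time multiplied by {time_factor_high:.2f} to {time_factor_low:.2f}")
```

Under these figures, roughly a one-third relative reduction in torsion CoV is bought with about three times the analysis time, which is the cost-benefit question raised in the paragraph above.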
From a time versus use standpoint, the clinical applicability of both software types seems comparable because of similar analysis durations for a single case with either software; however, the fact that Circle provides built-in torsion measurements within its software interface may well enhance the clinical feasibility of deriving this parameter. In comparison, at present the TomTec-derived rotational displacement data need to be further analysed for torsion calculation, in the present case with in-house Matlab software. An automatic and consistent selection of the apical and basal levels for the estimation of torsion removes the human factor in the selection of sections (this factor is not reflected in the Circle results of this study, because the sections were predefined in each case). In the present study, reproducibility was comparable using either software, suggesting reliability of both approaches.
When interpreting the results of the current study, it is important to note that the main user action involves the manual delineation of the endocardial and epicardial contours in the first frame of an existing sequence of images, to start the tracking algorithm and to correct the initial contours if the tracking is not sufficient or has failed. The identification of these two initial contours is easily and quickly performed by a skilled user. Nevertheless, this factor introduces considerable variability in the results. This can be explained by the intrinsic difficulty to estimate rotation and strain metrics neglecting the out-of-plane movement that the myocardium experiences through the heart cycle.
Having two vendors performing similarly in terms of reproducibility suggests that FT in conventional short axis CMR has fundamental limitations that need to be tackled by the combination of different views in an attempt to reconstruct the true three-dimensional (3D) deformations and strains.

Study limitations
Significantly lower Ecc and resting torsion were found using Circle; however, the sample size of the current study in healthy volunteers is small, which needs to be recognised when interpreting the results. The study did not include patients; however, similar CMR-FT reproducibility between health and disease has been reported before 11–13,23 independent of different patient groups. Consequently, the comparison of different types of CMR-FT software in healthy volunteers is appropriate and the results transferable when studying different disease states.
Furthermore, myocardial tagging or speckle tracking echocardiography was not included as an independent reference standard. Notwithstanding this fact, it is important to note that TomTec has been compared to myocardial tagging with excellent agreement 9 and to speckle tracking echocardiography with reasonable to good agreement in the past. 8,24 Moreover, the aim of the current study was not to undertake another comparison with myocardial tagging 25–28 or speckle tracking echocardiography 8,24 but simply to assess how well the two types of CMR-FT software agree with each other and whether or not both types of CMR-FT software can be used interchangeably.
Global deformation parameters were studied in the present study. Ideally, quantitative tools should be used to derive segmental information in addition to global values. Several studies have shown that segmental analysis does not provide high reproducibility for myocardial strain within repeated analyses 29 and repeated studies. 22 Therefore, the focus was on comparisons of entire sections and global myocardial deformation and rotation. Future refinements for both types of analysis software will possibly allow accurate and reproducible quantification of segmental deformation.
In conclusion, assessment of myocardial strain and torsion is feasible with the two types of commercially available CMR-FT software with reliable detection of increased myocardial deformation with dobutamine stimulation; however, myocardial strain and torsion measurements using both software types are subject to considerable inter- and intravendor variability, even when averaging three analysis repetitions. It is important to note that Ecc and resting torsion values obtained from the Circle software are significantly lower as compared to TomTec. For both vendors, Ecc qualifies as the most robust parameter with the lowest variability. Whether or not the widespread availability of CMR-FT software types will allow the methodology to develop into a useful routine clinical tool has yet to be demonstrated.