Clinical magnetic resonance image quality of the equine foot is significantly influenced by acquisition system

Background: Investigation of image quality in clinical equine magnetic resonance (MR) imaging may optimise diagnostic value. Objectives: To assess the influence of field strength and anaesthesia on image quality in MR imaging of the equine foot in a clinical context. Study design: Analytical clinical study. Methods: Fifteen equine foot studies (five studies per system) were randomly selected from the clinical databases of three MR imaging systems: low-field standing (LF St), low-field anaesthetised (LF GA) and high-field anaesthetised (HF GA). Ten experienced observers graded image quality for entire studies and seven clinically important anatomical structures within the foot (briefly, grade 1: textbook quality, grade 2: high diagnostic quality, grade 3: satisfactory diagnostic quality, grade 4: non-diagnostic). Statistical analysis assessed the effect of anaesthesia and field strength using a combination of the Pearson chi-square test or Fisher’s exact test and Mann-Whitney test. Results: There was no difference in the proportion of entire studies of diagnostic quality between LF St (90%, 95% CI 78%-97%) and LF GA (88%, 76-95%, P =


| INTRODUCTION
Foot pain is a common cause of lameness in the horse and magnetic resonance (MR) imaging of the foot is now a fundamental diagnostic method, 1,2 given the limitations of other imaging modalities in this anatomical region. [3][4][5][6][7] Understanding the factors that influence MR image quality in a clinical context can optimise the use of this modality. Increasing magnetic field strength effectively results in a linear increase in signal-to-noise ratio which significantly contributes to perceived image quality. 8,9 The effect of magnetic field strength on the identification of pathology in the equine cadaver foot has been investigated, demonstrating the diagnostic value of low-field MR imaging of this region. 10,11 Similar findings have been established in human orthopaedic MR imaging. 12 Motion is also a critical influencer of MR image quality and is reported to be more significant during imaging of the standing equine patient compared to a patient under general anaesthesia. 8,13 The position of the foot against the ground surface makes it less susceptible to pendulous sway motion than more proximal regions of the limb. 2,14 Careful patient management, optimal sequence selection, study planning and motion-correction techniques can minimise the impact of subtle motion. 8,14 However, other factors such as weight-bearing of the imaged limb can also influence the diagnostic value of MR images for structures such as the articular cartilage of the distal interphalangeal (DIP) joint. 15 The application of previous literature to the clinical context is limited by the use of cadaver materials, investigation of particular anatomical regions, use of comparable sequences rather than those optimised for each system, deviation from the practical use of MR imaging systems and methods of image viewing. 10,11,14,15 There is little evidence describing the differences in perceived image quality between MR imaging systems for foot studies performed in a clinical context in live equine patients. 8,16 The aim of our study was to assess the influence of field strength and anaesthesia on subjective image quality assessment in studies of the equine foot performed in a clinical context. We hypothesised that: (a) there would be a difference in perceived image quality between MR images from low-field systems acquired with the patient standing and those acquired with the patient under general anaesthesia; (b) there would be a difference in perceived image quality between MR images acquired from low-field and high-field systems with the patient under general anaesthesia.

Using these three groups the study design was formulated to assess the effect of anaesthesia (LF St compared to LF GA) and field strength (LF GA compared to HF GA). Five foot studies were selected from the clinical database of each of the three institutions using a random number generator (Random Integer Generator: www.random.org/integers/). MR imaging studies acquired between June 2015 and August 2018 were included in the search. The minimum sampling unit for the purposes of study selection was the individual foot. The shortlisted database studies included all the sequences that would typically be performed as a routine equine foot MR imaging study at that institution. These included MR imaging studies that were a repetition of previously imaged patients and studies of contralateral (ie non-lame) limbs. Targeted MR imaging studies with a reduced number of sequences (for example for pre-surgical planning or assessment of an acute foot penetrating injury) were not included. The patient details were anonymised by the acquiring organisation prior to submission to the primary author.

| MR imaging studies
Additional metadata attributes were modified to remove information that identified the acquiring institution using a DICOM anonymisation tool (DICOM Anonymizer: www.dicomanonymizer.com) and DICOM viewing software (Horos, Horos Project (version 2.2.0)).
Metadata outlining the parameters used in the pulse sequences were unaltered. The studies were randomly ordered using a random sequence generator (Random Sequence Generator: www.random.org/sequences/) and labelled appropriately with their assigned case number in the Patient's Name and Patient ID metadata attributes.
The finalised DICOM files were exported to individual study folders for distribution to observers.
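The random selection and random ordering steps described above can be sketched in a few lines of Python. This is only an illustration: the authors used the random.org web generators, whereas this sketch substitutes Python's built-in pseudo-random generator, and the function names, institution labels and study identifiers are all hypothetical.

```python
import random


def select_studies(databases, per_system=5, seed=None):
    """Randomly pick `per_system` foot studies from each system's database.

    `databases` maps a (hypothetical) system label to a list of study IDs.
    """
    rng = random.Random(seed)
    return {system: rng.sample(studies, per_system)
            for system, studies in databases.items()}


def assign_case_numbers(selected, seed=None):
    """Pool the selected studies, shuffle them and assign case numbers."""
    rng = random.Random(seed)
    pooled = [study for studies in selected.values() for study in studies]
    rng.shuffle(pooled)
    # Case numbers start at 1, mirroring a label written to Patient ID.
    return {f"Case {i:02d}": study for i, study in enumerate(pooled, start=1)}


if __name__ == "__main__":
    # Hypothetical databases: 40 study IDs per acquisition system.
    databases = {
        "LF St": [f"lfst-{i}" for i in range(40)],
        "LF GA": [f"lfga-{i}" for i in range(40)],
        "HF GA": [f"hfga-{i}" for i in range(40)],
    }
    cases = assign_case_numbers(select_studies(databases, seed=1), seed=2)
    for label, study in cases.items():
        print(label, study)
```

Pooling before shuffling means that the case number carries no information about the acquisition system, which is the purpose of the random ordering.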

| Image assessment platform
An online image assessment platform was developed to allow observers to evaluate MR imaging studies of the equine foot. The first component was a subjective assessment of image quality for the whole study, using a 4-point grading scale (briefly, grade 1: textbook quality, grade 2: high diagnostic quality, grade 3: satisfactory diagnostic quality, grade 4: non-diagnostic). Verbal descriptors were provided for each grade (see Data S1).
The second component focused on assessment of individual anatomical structures of the foot that are deemed to be clinically relevant in the investigation of lameness. Seven structures were selected based on previous literature reporting lesion distribution during MR imaging for the investigation of foot lameness. 1,[17][18][19][20][21] The structures included were the deep digital flexor tendon, navicular bone, navicular bursa, DIP joint, collateral ligaments of the DIP joint, the third phalanx and the distal sesamoidean impar ligament. A version of the 4-point image quality grading scale was produced for each structure, with the verbal descriptors modified to reflect the structure's specific MR imaging features (see Data S1). Pathology within individual structures was assessed using a 4-point scale (grade 1: no pathology, grade 2: mild pathology, grade 3: moderate pathology, grade 4: severe pathology). Assessment of pathology was performed to quantify the relative degree of pathology in each group, rather than to test the pathology identification ability of the different acquisition systems. A field was also available at the end of each section for free text comments. The observers were not given specific instructions to dictate the content of free text responses. The image assessment platform was hosted using an online survey tool (Online Surveys, Jisc) and can be found in Data S2. The online assessment platform was distributed with the labelled DICOM files of the 15 studies for interpretation. Each observer accessed the image assessment platform using a unique link that allowed individual results to be recorded. The observers were able to assess the studies with their DICOM viewer of choice and manipulate the images as they would in a clinical context.

| Observers
The platform was distributed to 10 experienced observers. Selection

| Data analysis
The image assessment platform data were compiled into a data- Free text comments were collated for each study. The comments were categorised as containing information regarding image quality or pathology. Comments regarding image quality were categorised and combined into summary tables. Reasons for reduced image quality were categorised as: alignment or position, additional sequence desired, short tau inversion recovery (STIR) fat suppression, sequence parameter (other), motion artefact, magic angle effect, artefact (other), repeat sequence desired and other. Where a comment referred to multiple categories, this was noted as a count in each category. The pulse sequence data for MR imaging studies were also collected into a database and descriptive statistical analysis was performed including the median and range for number of sequences (including multiplanar reconstructions) per study for each group.
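The rule that a comment referring to multiple categories is counted once in each category can be sketched as follows. The category names are taken from the text; the function name and the example comments are hypothetical, and the coding of each comment into categories is assumed to have been done by hand beforehand.

```python
from collections import Counter

# Categories for reduced image quality, as listed in the text.
CATEGORIES = [
    "alignment or position",
    "additional sequence desired",
    "STIR fat suppression",
    "sequence parameter (other)",
    "motion artefact",
    "magic angle effect",
    "artefact (other)",
    "repeat sequence desired",
    "other",
]


def tally_comment_categories(coded_comments):
    """Count comments per category; a comment counts once in each of its categories."""
    counts = Counter({category: 0 for category in CATEGORIES})
    for categories in coded_comments:
        for category in set(categories):  # ignore accidental duplicates within one comment
            if category not in CATEGORIES:
                raise ValueError(f"unknown category: {category}")
            counts[category] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical coded comments, one list of categories per comment.
    coded = [
        ["motion artefact"],
        ["magic angle effect", "alignment or position"],
        ["STIR fat suppression", "repeat sequence desired"],
    ]
    for category, count in tally_comment_categories(coded).items():
        print(f"{category}: {count}")
```

Because one comment can contribute to several categories, the category totals can exceed the number of comments.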

| Observers and MR imaging studies
All observers had >5 years' experience interpreting equine MR images, with 6/10 having >10 years' experience. There were six diagnostic imaging Diplomates, one diagnostic imaging Associate Member, one diagnostic imaging and surgical Diplomate, one diagnostic imaging, surgical and sports medicine and rehabilitation Diplomate and one surgical and sports medicine and rehabilitation Diplomate. When reporting the frequency with which they interpret images from different systems, observers were more familiar with low-field standing images (regularly: 6/10, frequently: 2/10, occasionally: 0/10, rarely or never: 2/10) than with low-field images under general anaesthesia (regularly: 1/10, frequently: 0/10, occasionally: 5/10, rarely or never: 4/10) and high-field images under general anaesthesia (regularly: 0/10, frequently: 2/10, occasionally: 3/10, rarely or never: 5/10). The median (range) number of sequences for each group was: LF St 10 (9-11), LF GA 12 (11-14) and HF GA 11 (11-12). These values do not include localiser or pilot sequences but include multiplanar reconstructions that formed part of the routine imaging protocol. The pulse sequence parameters varied between acquisition systems and studies. All studies contained sequences acquired in at least three planes. In addition, all studies contained T1 weighted, T2 weighted, proton density weighted and STIR images. The pulse sequence parameters of all studies are presented in Data S3.

| Low-field standing vs low-field under general anaesthesia
When assessing the image quality of entire studies, 90% (95% CI 78%, 97%) of the gradings were classified as diagnostic for LF St and 88% (95% CI 76%, 95%) for LF GA. Entire study gradings classified as "high diagnostic quality" or better (ie grades 1 and 2) accounted for 18% (95% CI 9%, 31%) of gradings for LF St and 34% (95% CI 21%, 49%) for LF GA. The distribution of image quality gradings is displayed in Figure 1. The median and range of image quality gradings for entire studies and individual anatomical structures of the LF St and LF GA groups are presented in Table 1. When comparing the proportion of diagnostic vs non-diagnostic studies between LF St and LF GA groups, there were no statistically significant differences for entire studies or individual anatomical structures of the foot.
There were no statistically significant differences in median image quality gradings between LF St and LF GA. The output of these comparisons is presented in Table 1. The distribution of pathology gradings is displayed in Figure 2. There were no significant differences in median pathology scores between LF St and LF GA. The LF St group had 19 free text comments that referred to image quality and 15 that referred to pathology. The LF GA group had 23 comments that referred to image quality and 19 that referred to pathology. The distribution of comments in the image quality categories is displayed in Figure 3.
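As an illustration of how confidence intervals for proportions of gradings such as those in this section can be obtained, the following sketch computes an exact (Clopper-Pearson) binomial interval using only the Python standard library. The interval method is an assumption, since the text does not state which method was used, and the function names are hypothetical. The counts assume 50 gradings per group (10 observers grading 5 studies each), so that 90% corresponds to 45/50.

```python
from math import comb


def binom_tail_ge(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def _solve_increasing(g, target, iters=60):
    """Bisection on [0, 1] for p such that g(p) = target, with g increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for k successes in n trials."""
    # Lower limit: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: binom_tail_ge(k, n, p), alpha / 2)
    # Upper limit: the p at which P(X <= k) equals alpha / 2,
    # ie P(X >= k + 1) equals 1 - alpha / 2.
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: binom_tail_ge(k + 1, n, p), 1 - alpha / 2)
    return lower, upper


if __name__ == "__main__":
    for label, k in [("LF St diagnostic", 45), ("LF GA diagnostic", 44)]:
        low, high = clopper_pearson(k, 50)
        print(f"{label}: {k}/50, 95% CI {low:.1%} to {high:.1%}")
```

When every grading falls in one class (k equal to n), the two-sided upper limit is 1 and only a lower bound is informative.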

| Low-field under general anaesthesia vs high-field under general anaesthesia
When assessing the image quality of entire studies, 88% (95% CI 76%, 95%) of the gradings were classified as diagnostic for LF GA and 100% (95% CI lower bound 94%) for HF GA. Entire study gradings classified as "high diagnostic quality" or better (ie grades 1 and 2) accounted for 34% (95% CI 21%, 49%) of gradings for LF GA and 96% (95% CI 86%, 100%) of gradings for HF GA. The distribution of image quality gradings is displayed in Figure 4. The median and range of image quality gradings for entire studies and individual anatomical structures of the LF GA and HF GA groups are presented in Table 2. Median pathology scores were significantly different for the deep digital flexor tendon (P < .001), navicular bursa (P < .001), DIP joint collateral ligaments (P = .001) and distal sesamoidean impar ligament (P = .002) for HF GA compared to LF GA.
The distribution of pathology gradings is displayed in Figure 5. The LF GA group had 23 comments related to image quality and 19 related to pathology. The HF GA group had 8 comments related to image quality and 20 that related to pathology. The distribution of comments in the image quality categories is displayed in Figure 6.
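To illustrate the kind of comparison of proportions named in the methods (Fisher's exact test), the following sketch computes a two-sided exact P value for a 2x2 table with the Python standard library. It is not the authors' analysis: the function name is hypothetical, the counts are back-calculated from the reported percentages assuming 50 gradings per group (44/50 diagnostic for LF GA, 50/50 for HF GA), and the gradings are treated as independent observations, which is a simplification.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    x_min = max(0, col1 - row2)
    x_max = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(x_min, x_max + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


if __name__ == "__main__":
    # Rows: LF GA, HF GA. Columns: diagnostic, non-diagnostic gradings.
    p_value = fisher_exact_two_sided(44, 6, 50, 0)
    print(f"Two-sided Fisher's exact P = {p_value:.3f}")
```

An exact test is the natural choice here because one cell of the table is zero, which makes the chi-square approximation unreliable.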

| Inter-observer agreement
The results of inter-observer agreement analysis for image quality assessment are presented in Table 3. Absolute inter-observer agreement, as indicated by Fleiss' Kappa, was fair for entire studies. Absolute agreement varied between individual anatomical structures but was generally poor to fair. Inter-observer agreement accounting for relative order of grading, as indicated by Kendall's coefficient of concordance, was moderate to high for entire studies and individual anatomical structures.

TABLE 1 Key descriptive statistics, comparison of proportion of diagnostic gradings and comparison of the ranked gradings for magnetic resonance image quality for low-field standing and low-field under general anaesthesia groups

| DISCUSSION
This study demonstrates that most MR imaging foot studies of live patients were deemed to be of diagnostic quality by experienced observers, regardless of acquisition system. There was no evidence of a significant difference in the perceived image quality between the LF St and LF GA groups, though the reasons described for reduced image quality appear to differ between groups. Field strength influenced perceived image quality for entire studies and individual anatomical structures of the foot. Many of the factors that reduced image quality could be influenced by the system operator.

| The effect of general anaesthesia
Our study indicates that there are minimal differences in perceived image quality between low-field MR imaging of the foot performed in the standing patient and with the patient under general anaesthesia. However, the reasons for reduced image quality may differ between groups. Motion is frequently cited as a reason for reduced image quality, particularly in the standing patient. 13 The image quality comments in the current study demonstrated that motion artefact was mentioned in reference to a small number of studies in both the LF St and the LF GA groups. Our results indicate that the impact of motion is effectively limited during standing imaging of the foot in clinical practice by patient management, repetition of movement-affected sequences and motion-correction techniques (including motion-insensitive sequences). 8,14,24 Given its position on the ground surface, the foot is less susceptible to patient sway movement than proximal regions of the limb. 14 It is likely that our findings cannot be extrapolated beyond the foot; therefore, other clinically important structures more susceptible to motion, such as the proximal metatarsal region, may warrant specific investigation in a similar manner.
Weight-bearing has been reported to influence the appearance of articular structures and the apparent sites of loading during MR imaging of human limbs. 25,26 An experimental equine study reported improved DIP joint articular cartilage visualisation in unloaded cadaver limbs when compared to images from live weight-bearing patients. 15 The LF St and LF GA groups of the current study are an analogous comparison, but there was no evidence of a statistically significant difference in the perceived image quality for the DIP joint between the two groups. This may reflect the broader image quality assessment involved in the current study, compared to the focused measurement of cartilage thickness and delineation in the previous report. 15 Other literature assessing identification of DIP articular cartilage lesions primarily relates to non-weight-bearing limbs. 27,28 Further research is required to determine the significance of weight-bearing and limb positioning on the identification of DIP joint articular pathology. The current literature indicates that MR imaging under general anaesthesia (ideally high-field) may be warranted where subtle DIP joint articular cartilage lesions are suspected. 2

| The effect of field strength
The impact of field strength on image quality has received attention in the equine veterinary literature, which has supported the use of low-field imaging in a clinical setting. 10,11 The results of our study indicated that high-field studies were more likely to be deemed diagnostic. The improved image quality is a result of the approximately linear relationship between field strength and the signal-to-noise ratio of images. 29 Previous studies assessing the influence of magnetic field strength have compared systems for standing and anaesthetised patients, which resulted in differences in the appearance of some (primarily soft tissue) structures and their anatomical positioning. 10,11 This confounding factor also may contribute to perceived image quality. The LF GA and HF GA groups of the current study had comparable positioning, reducing the influence of this factor. 30 When considering individual anatomical structures there was no evidence of a statistically significant difference between LF GA and HF GA groups in the proportion of diagnostic studies for the navicular bone, DIP joint and the third phalanx. This may indicate that these structures are more readily assessed by observers and therefore a lower threshold of image quality is required to achieve a diagnostic study. 31,32 In addition, the importance of individual sequences differs between structures. 27,[32][33][34][35] There may be inter-sequence variation in the relative disparity of image quality between comparable sequences from low-field and high-field systems. The findings are broadly consistent with a previous experimental image quality study using cadaver limbs. 11 However, the third phalanx received a relatively poor score in the low-field group (compared to high-field images) of the previous study. This was attributed to loss of signal in the distal aspect of the bone 11 due to positioning of the toe at the periphery of the magnet and radiofrequency coil in the low-field system. 6,11 This may not have been replicated by the magnet and radiofrequency coil (a human knee coil) configuration used in the LF GA group of the current study.

TABLE 2 Key descriptive statistics, comparison of proportion of diagnostic gradings and comparison of the ranked gradings for magnetic resonance image quality for low-field under general anaesthesia and high-field under general anaesthesia groups
The findings of this study relate to image quality, rather than pathology identification ability. Intuitively, these factors are intrinsically associated (though this is not necessarily a linear correlation).
Comparison of ranked image quality gradings of the current study, which is a more refined indicator than the proportion of diagnostic

| General considerations
The influence of other inherent features of the acquisition systems should also be considered. The availability of a radiofrequency coil which is specific to the anatomy of interest will assist in optimising signal-to-noise ratio of the images. 24,40 The equine foot has a relatively unusual shape; therefore, some systems use an equine-specific coil rather than a human extremity coil for imaging of this region. 40 The orientation of the static magnetic field relative to the long axis of the limb is an intrinsic feature of the system and this can influence the susceptibility of structures (most commonly ligaments and tendons) to the magic angle effect. 8,10,16,[41][42][43][44][45] The magic angle effect was reported to have reduced image quality in studies from both LF St and LF GA groups of the current study. While sequence selection can assist the observer in identifying magic angle effect, 46 patient positioning has an influence on its occurrence. In the anaesthetised patient, positioning is partially dictated by practicalities of recumbency, the system design and by the skills of the system operator. 30 In the standing patient, leaning of the imaged limb is an additional factor to consider. 45,47,48 In some circumstances (particularly for high-field systems) the magic angle can assist in diagnostic assessment, such as for evaluation of the tendon-bone interface at the insertion of the deep digital flexor tendon. 8,11 Knowledge of the structures susceptible to magic angle effect for a specific MR ac-  requirements and practicalities of the bore) and anaesthesia (due to reduced patient access and requirement for MRI-compatible equipment) when compared to low-field imaging under general anaesthesia. 30 There is limited evidence quantifying the morbidity and mortality of low-field MR imaging in both the anaesthetised and standing, sedated equine patient.
When assessing inter-observer agreement with Kendall's coefficient of concordance, the inter-observer agreement was moderate to high, though absolute inter-observer agreement (indicated by Fleiss' kappa) was poor to fair. 22 Kendall's coefficient of concordance accounts for the relative order of grading and is the most useful indicator of agreement in this study since image quality assessment is both subjective and complex, despite the use of a grading scale. 8,23,54 Inter-observer agreement was considered satisfactory for the purposes of image quality assessment. The variation in inter-observer agreement between structures may reflect the relative ease of MR image assessment for different anatomical regions. 31,32
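The contrast between the two agreement statistics discussed above can be made concrete with a small sketch. It implements the standard formulas for Fleiss' kappa and for Kendall's coefficient of concordance (with average ranks and a tie correction) using the Python standard library. The function names and the gradings are hypothetical and are not data from this study. In the example, three observers rank three studies in the same order but rarely assign the same grade, so concordance is perfect while absolute agreement is low.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa; `ratings[s]` lists the grades given to subject s by each rater."""
    n_subjects, n_raters = len(ratings), len(ratings[0])
    counts = [Counter(subject) for subject in ratings]
    # Observed agreement per subject, then averaged over subjects.
    p_i = [(sum(c[k] ** 2 for k in categories) - n_raters)
           / (n_raters * (n_raters - 1)) for c in counts]
    p_bar = sum(p_i) / n_subjects
    # Agreement expected by chance from the overall category proportions.
    p_j = [sum(c[k] for c in counts) / (n_subjects * n_raters) for k in categories]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def kendalls_w(ratings):
    """Kendall's W with tie correction; undefined if every rater ties all subjects."""
    n, m = len(ratings), len(ratings[0])  # n subjects, m raters
    rank_sums = [0.0] * n
    tie_term = 0
    for r in range(m):
        grades = [ratings[s][r] for s in range(n)]
        for s, rank in enumerate(average_ranks(grades)):
            rank_sums[s] += rank
        tie_term += sum(t**3 - t for t in Counter(grades).values())
    mean = sum(rank_sums) / n
    s_stat = sum((x - mean) ** 2 for x in rank_sums)
    return 12 * s_stat / (m**2 * (n**3 - n) - m * tie_term)


if __name__ == "__main__":
    # Hypothetical gradings: rows are studies, columns are observers.
    gradings = [[1, 2, 1], [2, 3, 3], [3, 4, 4]]
    print("Fleiss' kappa:", round(fleiss_kappa(gradings, [1, 2, 3, 4]), 3))
    print("Kendall's W:", round(kendalls_w(gradings), 3))
```

Observers who apply the scale with a consistent offset from one another therefore reduce kappa but not concordance, which is the situation in which a rank-based statistic is the more informative summary.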

| Limitations
The studies contained the sequences of the standard foot imaging protocol of the acquisition institution and were deemed to be of diagnostic quality during their clinical acquisition. However, there is likely to be some variation in the quality of individual studies from the same acquisition system, which may be influenced by However, given the experience of the observers in equine MR imaging, it is likely that they would be able to speculate on the acquisition system based on the appearance of the images alone. This is an inevitable consequence of observer-based assessment but could introduce bias due to individual observer experience and preferences.
The observer profile demonstrated that the group most frequently interpreted images from low-field standing systems, which is likely a direct consequence of the preponderance of these systems in equine practice. 55,56 This familiarity with images acquired from low-field standing acquisition systems could influence the perception of the relative diagnostic quality of low-field images (particularly from standing acquisition systems).
Some practical considerations are inherent in a study of this nature. Observers reviewed 15 complete MR imaging studies and assessed 7 anatomical structures individually, which represents a significant time commitment. Previous studies in the equine literature using a similar methodology to investigate MR image quality were used as a guide for sample size. Considering a single observation to be one observer assessing one complete MR imaging study, previous literature used 22-30 observations per acquisition system. 10,11 It was deemed that the clinically important effect size for MR image quality is large, since a small increase in image quality is unlikely to make a clinically relevant change in pathology identification or diagnostic confidence. Therefore, the 50 observations per acquisition system of our study were deemed to be sufficient to identify a clinically relevant difference in image quality, while still being achievable within the practical limitations. Previous literature used cadaver limbs and fewer observers (2-3), which increases the potential impact of observer bias. While there will be some intra-system variation in image quality, prior experience and literature indicate this is relatively small (especially for studies deemed to be sufficiently diagnostic in a clinical context at acquisition) compared to the more clinically relevant inter-system variation.
Considering these factors, our study aimed to minimise the influence of individual observer bias by incorporating 10 experienced observers who evaluated a relatively smaller number of studies per acquisition system.
Assessment of pathology by the observers quantified the pathology of each study and demonstrated some statistically significant differences between the LF GA and HF GA groups.
The clinical significance of this difference in pathology and its influence on our conclusions regarding image quality is deemed to be minimal.

| Conclusions
Field strength is a more important influencer of image quality than anaesthesia for magnetic resonance imaging of the equine foot in clinical patients. Observers deemed the majority of clinical MR imaging foot studies to be of diagnostic quality, regardless of acquisition system. The reasons described for reduced image quality appear to differ between acquisition systems. Many of the factors related to reduced image quality can be influenced by the system operator and individuals establishing imaging protocols. Further research focusing on specific factors that reduced image quality of individual studies would be valuable to guide targeted training of operators for their acquisition system.

ETHICAL ANIMAL RESEARCH
The study was approved by the School of Veterinary Medicine Research Ethics Committee, University of Glasgow (Ref 08a/18).

OWNER INFORMED CONSENT
Explicit owner consent was not requested for the retrospective use of anonymised magnetic resonance images from clinical cases.
Images were anonymised at the acquiring institution prior to submission to the primary author.

DATA ACCESSIBILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.