Introduction

Today, mobile technology has made the sharing of high-quality photos easy, and clinicians, especially orthopedists, often use instant messaging platforms to consult on patient images and imaging studies [1,2,3,4]. It has been shown that 82% of all doctors in the USA and 98% of orthopedic surgeons in Australia use smartphones for professional purposes [5]. At the same time, the need for teleconsultations has increased due to the COVID-19 pandemic, which has caused unplanned changes in hospital work hours and shifts [6]. Such consultations usually involve the transfer of high-resolution images of radiological examinations to orthopedists, especially during off-hours and holidays. In cases where this cannot be done, there may be difficulties in patient referrals, diagnoses, or treatments.

While the patient’s history and physical examination are very important for decision-making in orthopedics, final decisions for the treatment and management of trauma are mostly based on radiological imaging. This is because imaging studies are usually sufficient for diagnosis and categorization of patients, especially since their assessment is based on relevant fracture classifications that standardize orthopedic practice [7, 8].

Although teleconsultations with imaging have become widespread, it is still questionable whether providing consultation based on such images is reliable. In previous studies on ankle fractures, wrist fractures, and pediatric elbow fractures, results have shown that patient consultation, diagnosis, and treatment decisions can be made reliably from photographs of the computer screen showing picture archiving and communication system (PACS) results [9,10,11,12,13]. However, to our knowledge, there are no studies focusing on teleconsultation for proximal humeral fractures. The aim of this study was to evaluate whether diagnosis, classification, and treatment planning could be made reliably with photographs taken from the PACS images of patients with proximal humerus fractures.

Materials and methods

Study group

A total of 83 patients who were consulted via photographs for trauma-related suspicion of proximal humerus fracture, between 2016 and 2019, were retrospectively included in this study. Seventy-three of these subjects had presented to the emergency department of our hospital and were diagnosed with proximal humerus fracture during the relevant dates. The remaining 10 patients were selected from patients who had applied to the emergency department with shoulder trauma but had normal radiographs, regardless of date of admission. All patients had undergone anteroposterior and anteroposterior oblique X-ray imaging (n = 166 images). Approval for the study was obtained from the institutional review board. This study was conducted according to the principles of the Declaration of Helsinki.

Population

Inclusion criteria

Being aged 20 years or older and not having additional traumatic pathology in shoulder X-ray, such as clavicle or scapula fracture.

Exclusion criteria

Having a history of prior fracture in the same area, having pathological fractures, and the presence of suture anchor or any implant related to previous operation(s).

Gold standard assessment

A total of 166 radiographs were classified by the corresponding author (CP) and a radiologist (BE), who took into account each patient’s history, demographic characteristics, further examinations (e.g., computerized tomography [CT] images), and the treatments applied during follow-up. The resulting classifications were accepted as the gold standard for the evaluation of consultation responses.

Consultation photography and evaluation

The X-ray images of the patients were projected onto a high-definition screen (ASUSTeK Computer Inc., Taiwan) from the PACS system, and a photograph of the screen was taken with an Apple iPhone 6S® smartphone, which possesses an 8-megapixel camera (Apple Inc., Cupertino, CA, USA), without applying any standardization (Fig. 1). Care was taken to ensure that the image was in sharp focus and fully included the proximal humerus region. While taking the photo, information pertaining to the identity of the patient was removed from the screen, and care was taken not to include any additional patient information in the photograph. These photos were sent to another iPhone 6S smartphone via the WhatsApp® (WhatsApp Inc., Mountain View, CA, USA) instant messaging application without any image manipulation. Four orthopedic surgeons (MA, CM, TO, NA), each of whom had 10–15 years of professional experience, were asked to assess these images. In order to minimize recall bias, only the images of patients who had been admitted to our clinic were sent for evaluation by our colleagues working in other clinics. Four weeks later, patient images were randomized (via www.random.org) and the same 4 orthopedic surgeons performed a secondary evaluation of the patients, this time by direct assessment of the PACS images (Fig. 2). The orthopedic surgeons who performed evaluations were not given any information about the patients except for the X-rays.

Fig. 1 Photographing for teleconsultation

Fig. 2 Design of the study

The survey

A questionnaire was prepared for the standardization of assessments. In the questionnaire, the orthopedic surgeons were asked to define the fracture, classify it according to the Neer 6-part classification and the Arbeitsgemeinschaft für Osteosynthesefragen (AO) classification systems [14,15,16], and choose one of the suggested treatment options, conservative or surgical (Table 1). The Neer system divides the proximal humerus into four parts and treats the displacement, not the fracture line, as the significant factor for classification. The four parts are the humeral head, the greater tuberosity, the lesser tuberosity, and the humeral shaft. Displacement is assessed on a per-part basis: a fracture part is considered displaced if angulation exceeds 45° or if the part is displaced by more than 1 cm. The Neer 6-part classification system includes six types of fractures: minimal displacement, two-, three-, and four-part fractures, fracture dislocations, and joint fractures. The AO classification divides proximal humeral fractures into three groups (A, B, and C) with additional subgroups and places greater importance on the blood supply of the articular surface. All diagnoses and treatment decisions were compared between surgeons (interobserver) and between the same surgeon’s responses (intraobserver), with respect to the results of consultation (WhatsApp) images and PACS images.
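The per-part displacement rule described above can be written as a short check. This is only an illustrative sketch, assuming the angulation and displacement of each part have already been measured. The function names and the dictionary layout are hypothetical and were not part of the study, in which the surgeons judged these thresholds visually.

```python
from typing import Mapping, Tuple

# Thresholds quoted in the text for the Neer system.
ANGULATION_LIMIT_DEG = 45.0
DISPLACEMENT_LIMIT_CM = 1.0


def is_displaced(angulation_deg: float, displacement_cm: float) -> bool:
    """Return True if a part exceeds either threshold (strictly greater than)."""
    return (angulation_deg > ANGULATION_LIMIT_DEG
            or displacement_cm > DISPLACEMENT_LIMIT_CM)


def displaced_parts(parts: Mapping[str, Tuple[float, float]]) -> list:
    """List the names of displaced parts.

    `parts` maps a part name to (angulation in degrees, displacement in cm).
    This layout is a hypothetical convenience for the example.
    """
    return [name for name, (angle, shift) in parts.items()
            if is_displaced(angle, shift)]


example = {
    "humeral head": (10.0, 0.3),
    "greater tuberosity": (5.0, 1.2),
    "lesser tuberosity": (0.0, 0.0),
    "humeral shaft": (50.0, 0.4),
}
print(displaced_parts(example))
```

In this example, the greater tuberosity exceeds the displacement threshold and the humeral shaft exceeds the angulation threshold, so both are reported as displaced.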

Table 1 Standardization of responses

Data analysis

All analyses were performed using the SPSS version 21 statistical package (IBM, Armonk, NY, USA). Intraclass correlation coefficient (ICC) values were calculated, and the ICC was rated as follows: 0.00–0.19 poor, 0.20–0.39 fair, 0.40–0.59 moderate, 0.60–0.79 strong, and > 0.80 excellent. The 95% confidence intervals (95% CI) were also calculated and reported.
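The rating scale above can be expressed as a simple lookup. This is a minimal sketch and not the SPSS procedure used in the study. The function name `rate_icc` is hypothetical. Two assumptions fill gaps the scale leaves open: the band edges are treated as cut-offs, and a value of exactly 0.80 is rated excellent. Values outside 0 to 1 are rejected.

```python
def rate_icc(icc: float) -> str:
    """Map an ICC value to the verbal rating used in the text.

    Assumptions (not stated in the source): band edges are treated as
    cut-offs, and a value of exactly 0.80 is rated "excellent".
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC is expected to lie between 0 and 1 for this scale")
    if icc < 0.20:
        return "poor"
    if icc < 0.40:
        return "fair"
    if icc < 0.60:
        return "moderate"
    if icc < 0.80:
        return "strong"
    return "excellent"


# Example values, chosen to illustrate each band.
for value in (0.10, 0.35, 0.55, 0.65, 0.90):
    print(value, rate_icc(value))
```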

Results

X-ray images of a total of 83 patients were evaluated by four experienced orthopedic surgeons. A total of 664 responses (332 for Neer and 332 for AO classification) were received. Fifty-one of the patients included in the evaluation were males and 32 were females; mean age was 52.49 ± 14.13 years. The gold standard diagnoses of the patients are described in Table 3.

For diagnoses according to the Neer classification, ICC values for interobserver WhatsApp (0.90, 95% CI: 0.85–0.95), interobserver PACS (0.92, 95% CI: 0.88–0.97), and intraobserver WhatsApp versus PACS (0.96, 95% CI: 0.92–1.00) were all at an excellent level, with a mean overall agreement of 94.78% (± 2.52).

For AO classification, the respective ICC values were 0.91 (95% CI: 0.86–0.96), 0.93 (95% CI: 0.88–0.96), and 0.96 (95% CI: 0.94–1.00), with a mean overall agreement of 95.31% (± 1.95) (Table 2, Fig. 3). Notably, for normal radiographs without fracture, interobserver and intraobserver ICC values were 1.00, and all answers were correct.

Table 2 ICC values for diagnosis and treatment decisions
Fig. 3 Percent overall agreement

In the evaluation of conservative and operative treatment decisions, interobserver WhatsApp, interobserver PACS, and intraobserver WhatsApp versus PACS ICC values were evaluated as excellent (ICC > 0.8) (Table 2). For treatment selection, the mean overall agreement in all categories was 96.25% (± 1.83).

Discussion

In this study, the diagnosis and classification of proximal humeral fractures could be made reliably on photographs of PACS images displayed on a computer screen. Intraobserver agreement was excellent for both the Neer and AO classifications; interobserver agreement was also high, but relatively lower than intraobserver agreement. This difference was not related to the quality of the images, but to the diagnosis and treatment method deemed appropriate by each surgeon. Since the normal radiographs without fracture mostly consisted of clearly distinguishable images, they formed the group with the highest reliability in diagnosis and treatment recommendations [7]. Although a number of studies have explored diagnostic and therapeutic accuracy with image or video transfer through various methods, such studies have very rarely included assessment results of patients with normal radiographs. This is an important gap in knowledge, since the vast majority of trauma patients in daily practice receive a diagnosis of soft tissue trauma without fracture. Thus, we included 10 patients without fractures in this study (Table 3).

Table 3 Gold standard diagnosis of patients

Classification of Neer type 2 and type 3 fractures was relatively less reliable; however, these two groups are also difficult to distinguish on plain radiographs, and a definitive diagnosis may require CT evaluation [8]. Thus, we believe that this margin of error was not due to evaluation of images sent through the phone, but due to the need for further examination.

When treatment decisions were evaluated, interobserver reliability for the choice between conservative and surgical treatment of type 1 fractures was relatively low compared to patients with other fractures. For instance, in the same patient, one surgeon could decide on open reduction and internal fixation (ORIF), while another surgeon could recommend conservative treatment. However, by the nature of orthopedic practice, it is well known that there may be more than one “correct” treatment option in various cases [17]; thus, these differences may be attributed to natural decision-making processes.

Previous studies have investigated the reliability of evaluating vertebral pathologies with smartphone videos of the coronal, axial, and sagittal planes of CT or magnetic resonance imaging sent through WhatsApp Instant Messenger® [3, 5]. In these studies, it was reported that the evaluation could be incomplete due to the lack of calibrated measuring devices in the PACS system. This is also a valid concern in our study: the 45° angulation and 1-cm displacement thresholds for the fractured fragments (for Neer classification) were judged by eye and relied on the experience of the surgeon. Although CT scans are often required in comminuted proximal humerus fractures, we did not include CT images in our study because our primary aim was to investigate whether X-ray imaging studies could be evaluated accurately.

It has been shown in previous studies that inexperienced surgeons and emergency room physicians can show interobserver and intraobserver variability in the diagnosis, classification, and treatment processes of fractures [18]. The surgeons who evaluated the images in our study had been in a consultant position with 5 to 10 years of experience. As a result, we think that being able to reach experienced surgeons instantly and acquiring their opinion can enable more accurate diagnosis and treatment, ultimately shortening waiting times.

In a study on pediatric elbow fractures, it was reported that instant consultation with the surgeon via MMS could reduce unnecessary referrals to tertiary-level healthcare centers [12]. The same is true for proximal humeral fractures, particularly because most proximal humerus fractures are treated conservatively, and, after instant deliberation with a consultant surgeon, many emergency physicians can safely initiate treatment by applying a Velpeau bandage [17, 19].

It is evident that the transfer of patient images via multimedia messages may violate the patient’s right to privacy, and each country has different laws regarding this issue. Previous studies on this subject have suggested that images should be shared without patients’ names or any other identifying information, and that these images should be deleted immediately after evaluation [20].

Recall bias is an important limitation of our study, as was the case for previous studies. In our study, our secondary evaluations were performed after an interval of 1 month from the initial assessment. In addition, the evaluating surgeons received images of patients who had been treated in clinics other than their own, thereby ensuring that the surgeons would only evaluate images which they had not personally seen before. We believe these design characteristics would have minimized recall bias.

Making diagnostic and treatment-related decisions by examining anamnesis, demographic characteristics, examination findings, and radiological images of patients is undoubtedly the best method to ensure accurate results in all specialties. We think that requesting information by sending only patient images can never replace the principle of evaluating the patient as a whole.

Conclusion

This is the first study to show that the evaluation of proximal humeral fractures by photographs taken from the PACS system screen can provide high levels of diagnostic accuracy when compared to the evaluation of X-ray images on the PACS system. This shows that image transfer through multimedia messaging in patients with proximal humerus fractures is a highly reliable approach for diagnostic and treatment-related decisions. The relatively lower interobserver agreement in the AO and Neer classifications is not a problem arising from the consultation of PACS images with a smartphone, and the high intraobserver reliability supports this interpretation. However, all patients should be evaluated as a whole with a detailed patient history and examination findings.