Adult appendicitis score versus Alvarado score: A comparative study in the diagnosis of acute appendicitis

Background Acute appendicitis (AA) is the most common abdominal surgical emergency. It requires proper management to decrease mortality and morbidity. Clinical scoring systems for diagnosing AA aim to decrease the use of radiological scans and the rate of negative appendectomies (NA). We aimed to assess the adult appendicitis score (AAS) in predicting the diagnosis of AA. Method A retrospective study of 1303 cases of AA was performed. We compared the correlation of the AAS and Alvarado scores with postoperative histopathology (HP). Specificity, sensitivity, positive predictive value (PPV), and negative predictive value (NPV) were assessed, and receiver operating characteristic (ROC) curve analysis was used. Results AAS risk stratification was applied to the study population: group I for a low probability, group II for an intermediate probability, and group III for a high probability of AA. In total, 159 patients fell into group I, 505 into group II, and 639 into group III of AAS. The correlation of both Alvarado and AAS with HP was significant. AAS ≥ 16 had a sensitivity of 50 % and a specificity of 75.47 %, with a PPV of 97.96 %, an NPV of 6.02 %, and an accuracy of 51.04 %. For AAS ≥ 11, the sensitivity was 88.96 %, specificity 39.62 %, PPV 97.2 %, NPV 13.21 %, and accuracy 86.95 %. Conclusion AAS is relatively more accurate than the Alvarado score, especially in selecting safe candidates for discharge from the emergency department. In addition, AAS was found to decrease the need for radiological imaging and the NA rate more than Alvarado.


Introduction
Acute appendicitis (AA) is a frequently encountered abdominal surgical emergency with an estimated lifetime risk of 7-8 % [1]. In developed countries, it occurs at a rate of approximately 90-100 cases per 100,000 population per year, affecting adolescents and young adults, with a higher incidence among males [2]. Furthermore, severe cases of AA have been associated with increased mortality. The diagnosis of AA can be challenging because of its many differential diagnoses, especially in females, and any delay in treatment can result in elevated mortality and morbidity rates [3,4].
Our institution employs a standard diagnostic approach for assessing AA, which depends on both the physician's clinical assessment and the use of radiological modalities. Diagnosing AA can be challenging, as relying solely on clinical diagnosis carries a significant risk of negative appendectomy (NA), with rates reported in the literature reaching up to 23 % [4,5]. Incorporating imaging studies into the diagnostic process has been shown to improve the accuracy of AA diagnosis while reducing the rate of NA [5,6]. However, in cases of typical appendicitis, the use of imaging may delay surgical consultation and intervention, thereby increasing the risk of complications [7].
Since their introduction, clinical scoring systems have played an important role in improving diagnostic accuracy and reducing the need for further investigations such as ultrasound (US), CT, and MRI. These scoring systems are based on symptoms, signs, and laboratory findings, and they help to raise clinical suspicion of AA without providing a definitive diagnosis. They assist in appropriately selecting patients with uncertain diagnoses for diagnostic imaging.
The Alvarado score is the most widely recognized scoring system for diagnosing AA in adults [8]. It comprises eight components. Six are scored 1 point each: migratory pain to the right iliac fossa, anorexia, nausea or vomiting, temperature >37.3 °C, rebound tenderness, and neutrophil count >75 %. Two are scored 2 points each: tenderness in the right iliac fossa and leukocytes >10,000/µL. The total is calculated by summing the points of the components present, giving the 10-point Alvarado score [7]. However, a previous study assessed the diagnostic power of the Alvarado score in predicting AA and concluded that it was insufficient to serve as the main scoring system in our institute [9]. A newer scoring system, the adult appendicitis score (AAS), was established by Sammalkorpi et al. in 2014 [10].
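The scoring arithmetic described above can be illustrated with a minimal Python sketch. The class and function names are hypothetical and are not part of the study. The sketch assumes that nausea and vomiting count as a single one-point item, so that the eight components total 10 points, and it uses the thresholds quoted in the text.

```python
from dataclasses import dataclass


@dataclass
class AlvaradoFindings:
    """Hypothetical container for the eight Alvarado components."""
    migratory_rif_pain: bool
    anorexia: bool
    nausea_or_vomiting: bool   # assumption: scored as one combined item
    rif_tenderness: bool
    rebound_tenderness: bool
    temperature_c: float       # body temperature in degrees Celsius
    leukocytes_per_ul: float   # white blood cell count per microliter
    neutrophil_percent: float  # neutrophils as a percentage of leukocytes


def alvarado_score(f: AlvaradoFindings) -> int:
    """Sum the component points; the maximum possible total is 10."""
    one_point_items = [
        f.migratory_rif_pain,
        f.anorexia,
        f.nausea_or_vomiting,
        f.rebound_tenderness,
        f.temperature_c > 37.3,
        f.neutrophil_percent > 75,
    ]
    two_point_items = [
        f.rif_tenderness,
        f.leukocytes_per_ul > 10_000,
    ]
    # Each True counts as 1 when summed, so weight the second list by 2.
    return sum(one_point_items) + 2 * sum(two_point_items)
```

A patient with every component present would score 10, and a patient with none would score 0. The AAS is not sketched here because its components are listed only in Table 1.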
According to the updated guidelines from the World Society of Emergency Surgery (WSES), the use of AAS is recommended, while the use of the Alvarado score to help increase clinical suspicion of AA in adults is discouraged [11]. In our institution, many physicians use AAS in diagnosing AA instead of relying on the Alvarado score. However, published research assessing AAS remains scarce despite the WSES recommendations. The aim of this study is therefore to assess the effectiveness of AAS compared to the Alvarado score in predicting the diagnosis and stratifying the risk of AA, using histopathology (HP) as the gold standard for diagnosis.

Study population
The study is a secondary data analysis carried out at Hamad Medical Corporation (HMC), Qatar's main health care provider. The study period was from January 1st, 2018, until January 31st, 2019, the time frame approved by the ethical committee for human research of the Medical Research Center of HMC (protocol number MRC/01/19/454). A total of 1303 patients diagnosed with AA were included in the study. Our inclusion criteria were: (1) age ≥14 years, (2) admission with AA followed by appendectomy, and (3) availability of postoperative histopathology results.

Study data and diagnostic scores
The electronic medical records (EMR) database was used to search for study data. The collected data comprise three sets: preoperative, intraoperative, and postoperative. The first set included demographic, history, and clinical characteristics. The second set included laboratory results and radiological findings. The third set included surgical procedure details, hospital course, and histopathology (HP) grading.
For comparison, AAS and Alvarado scores were calculated retrospectively using their components, as shown in Tables 1 and 2. Retrospective calculation of these scores is based on history, physical examination, and comprehensive laboratory testing, which are routinely available not only for patients with acute appendicitis but also for anyone who presents to the emergency department with acute abdominal pain. We stratified patients according to the risk of having AA into group I for a low probability, group II for an intermediate probability, and group III for a high probability of AA. The AAS risk stratification was based on the score result: a score of 0-10 represents a low probability, a score of 11-15 an intermediate probability, and a score of ≥16 a high probability of AA. For the Alvarado risk stratification, a score of 1-4 indicates a low probability, a score of 5-6 an intermediate probability, and a score of 7-10 a high probability of AA.
The study variables and the comparison between scores are tabulated in Tables 3 and 4. The scores were evaluated by ROC analysis, and the area under the ROC curve (AUC) was estimated.
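The two stratification rules can be written as a short Python sketch. The function names are hypothetical. Treating an Alvarado score of 0 as low probability is an assumption, since the text defines the low-probability band as 1-4.

```python
def aas_risk_group(score: int) -> str:
    """Map an AAS total to risk group I, II, or III using the bands in the text."""
    if score < 0:
        raise ValueError("AAS cannot be negative")
    if score <= 10:
        return "I"    # 0-10: low probability
    if score <= 15:
        return "II"   # 11-15: intermediate probability
    return "III"      # >=16: high probability


def alvarado_risk_group(score: int) -> str:
    """Map an Alvarado total to risk group I, II, or III using the bands in the text."""
    if not 0 <= score <= 10:
        raise ValueError("Alvarado score must be between 0 and 10")
    if score <= 4:
        return "I"    # 1-4: low probability (0 is assumed to be low as well)
    if score <= 6:
        return "II"   # 5-6: intermediate probability
    return "III"      # 7-10: high probability
```

Keeping the band limits in one place makes it straightforward to apply both stratifications to the same patient record and compare the resulting groups.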

Study outcomes
The primary outcome is to validate the diagnostic accuracy of AAS in diagnosing AA. The secondary outcomes are the ability of AAS to reduce the use of clinical imaging studies in diagnosing AA and the diagnostic role of Alvarado score in comparison with AAS.

Grading setting
For HP findings, we classified the microscopic findings as grade 0 for normal or no evidence of AA, grade I for a mild form of AA, grade II for gangrenous/perforated AA, and grade III for AA with an incidental neoplastic finding. Operative findings were described as grade 0 for a normal-appearing appendix, grade I for nonperforated AA, grade II for gangrenous AA or impending perforation, grade III for perforated AA with collection, grade IV for mass-forming AA, and grade V for any of the preceding findings with superadded generalized peritoneal contamination. We designed this grading according to the American Association for the Surgery of Trauma (AAST) grading for AA [12], as it is the closest to our reported findings. We compared the correlation of the Alvarado score and AAS with the gold-standard HP findings and with the intraoperative findings. To assess sensitivity, specificity, PPV, and NPV, we chose cut-off points of five, seven, and nine for the Alvarado score and eleven, sixteen, and eighteen for AAS, based on the ROC curve and previous publications. Sammalkorpi et al., who created the AAS, recommended the cut-off values of 11, 16, and 18 based on statistical analysis and the ROC curve [10]. For the Alvarado score, cut-off values of 5, 7, and 9 are used in many articles and were recommended by a systematic review of 42 studies [7,9,13].

Statistical methods
Descriptive statistics were calculated according to the Alvarado and AAS groups, as mean and standard deviation for interval variables and as frequency with percentage for categorical variables. Chi-square tests were applied to assess the association between HP and the clinical scoring systems. One-way ANOVA was performed to assess mean differences across HP grades and the groups of both scores for all interval variables. ROC curves and c-statistics were used to assess how well the scores discriminate AA at different cut-off values. A two-tailed p-value below 0.05 was considered statistically significant. The SPSS 28.0 statistical package was used for the analysis.
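To make the c-statistic concrete, the following Python sketch computes it directly from its pairwise definition: the proportion of (AA, non-AA) patient pairs in which the patient with AA has the higher score, with ties counted as one half. This is only an illustration of the quantity; the study itself used SPSS, and the function name and the toy scores in the usage note are hypothetical.

```python
from typing import Sequence


def c_statistic(scores_with_aa: Sequence[float],
                scores_without_aa: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores."""
    if not scores_with_aa or not scores_without_aa:
        raise ValueError("both groups must contain at least one score")
    concordance = 0.0
    for pos in scores_with_aa:
        for neg in scores_without_aa:
            if pos > neg:
                concordance += 1.0   # pair ranked correctly
            elif pos == neg:
                concordance += 0.5   # tie counts as half
    return concordance / (len(scores_with_aa) * len(scores_without_aa))
```

With the made-up inputs `c_statistic([16, 12, 18, 9], [8, 12, 5])`, 10.5 of the 12 pairs are concordant, giving 0.875. A value of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination.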

Study participants
We enrolled 1303 patients who fulfilled the inclusion criteria. The mean age was 32.3 ± 9.5 years, with male predominance (75.8 %). In addition, 81 % of the study population was of Asian origin. When the Alvarado risk stratification was applied to this cohort, 121 patients fell into group I, 336 into group II, and 846 into the high-probability group III. With AAS stratification, 159 patients fell into group I, 505 into group II, and 639 into group III. After examining the correlation of study variables, such as demographics, clinical data, and laboratory and radiological results, we found significant differences among the risk groups within each scoring system. For both the Alvarado and AAS groups, the differences were statistically significant for the duration of symptoms, migratory abdominal pain, anorexia, nausea, vomiting, WBC count, neutrophil count, lymphocyte count, hemoglobin level, serum C-reactive protein (CRP), serum lactate, serum glucose, appendicular diameter on computerized tomography (CT), and the surgical approach.
In addition, BMI, fever, pulse rate, platelet count, blood pH, and surgery waiting time were significantly associated only with the Alvarado risk stratification groups. Conversely, AAS risk groups were significantly related to gender (male), nationality, serum creatinine, and readmission, as displayed in Tables 3 and 4. CT scans were done for 84 % of patients, and the diagnostic accuracy of CT was 96.1 %. Ultrasound was performed before surgery in 13.5 % of patients, and 2.5 % underwent surgery without imaging. The duration of symptoms of acute appendicitis was 1.8 days and the mean surgery waiting time was 26.6 h. The mean Alvarado score for the whole cohort was 7 and the mean AAS was 15. The main surgical procedure was laparoscopic appendectomy (89.9 %), and the conversion rate from the laparoscopic to the open approach was 0.6 %.

Operative outcomes
A normal-looking appendix (grade 0) was found intraoperatively in 1.5 % of patients. The operative complication rate was 1.2 %, and reoperation was required in 3 patients (0.2 %): one patient was operated on for a postoperative abdominal collection not amenable to nonoperative management, and the other two had postoperative bleeding that required surgical control. The mean length of hospital stay was two days. Twenty patients were readmitted, mainly because of abdominal collection, which was found in 14 patients. Only one death was recorded (0.1 %). The correlation of both Alvarado and AAS with the intraoperative findings was significant (P = 0.001). However, the proportion of grade 0 intraoperative findings was higher in the low-probability group I of AAS than in that of Alvarado (4.4 % versus 2.5 %), which suggests that AAS is more efficient at excluding cases from surgical management (Table 5).

Correlation between the histopathological findings and the diagnostic scores
Regarding HP findings, 52 patients had a normal appendix (grade 0), representing a negative appendectomy rate of 4 %. The association between HP and both the Alvarado score and AAS was statistically significant (p = 0.001). However, as with the intraoperative findings, the number of group I patients with grade 0, representing negative appendectomy, was higher for AAS (21 patients) than for Alvarado (12 patients). This is a further indication of the efficacy of AAS in detecting patients who do not require surgery, as demonstrated in Table 6.

ROC curve
We used the ROC curve to identify the best cut-off values for AAS and Alvarado; the area under the curve (AUC) was 0.731 for AAS and 0.696 for Alvarado (Fig. 1). We computed the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) at the optimal cut-off values for both scores, as demonstrated in Table 7.
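The relationship between these measures can be sketched in Python from a 2×2 table of score result against histopathology. The names below are hypothetical. The counts are not taken from the study tables: they are an assumption, back-calculated so that the standard formulas reproduce the percentages reported in the abstract for AAS ≥ 11, and they should be read as illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table of score result (at a cut-off) versus histopathology."""
    tp: int  # score at or above cut-off, AA confirmed
    fp: int  # score at or above cut-off, normal appendix
    fn: int  # score below cut-off, AA confirmed
    tn: int  # score below cut-off, normal appendix


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return the standard diagnostic accuracy measures as fractions."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "accuracy": (c.tp + c.tn) / total,
    }


# Assumed, back-calculated counts for AAS >= 11 (not reported in the text).
aas_ge_11 = ConfusionCounts(tp=1112, fp=32, fn=138, tn=21)

for name, value in diagnostic_metrics(aas_ge_11).items():
    print(f"{name}: {value * 100:.2f} %")
```

The example shows why PPV can be very high while NPV stays low in a cohort that consists almost entirely of operated, histologically confirmed cases: the predictive values depend on the proportion of patients with AA as well as on sensitivity and specificity.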

Discussion
AA is a commonly encountered surgical emergency worldwide that surgeons manage on a daily basis. It requires careful surgical attention because its symptoms can resemble those of other abdominal conditions, especially in females. While AA can present with nonspecific clinical data, it can also manifest in a severe form that poses a significant risk of complications and can have a detrimental impact on a patient's life [14,15]. To avoid the risk of complications, surgeons prefer to intervene promptly and proceed with an appendectomy rather than wait. However, this approach has led to an increase in negative appendectomies (NA) and unnecessary surgeries. Consequently, there has been a growing need to develop a viable diagnostic score that enables accurate diagnosis while reducing the rate of NA and overreliance on radiological assessments [16].
In our study, based on HP results, the NA rate was 4 %, which compares favorably with the rates in the literature [17][18][19][20]. Surprisingly, we observed no significant difference in the NA rate between the female and male genders (51.9 % and 58.1 % respectively), contradicting previous studies that reported a higher NA rate in females [21,22]. CT scans were performed in 84 % of our cases, with a diagnostic accuracy of 96.1 %. This accuracy is comparable to the range of 93 to 98 % reported in the literature [23]. Notably, a separate prospective study demonstrated a high diagnostic accuracy of 97.8 % for CT scans in detecting AA. Interestingly, only 32 % of patients in that study had CT scans performed in the emergency unit [24].
In our study, the readmission rate was 1.5 % (20 patients), with the main reason being postoperative abdominal collection, which accounted for 70 % of the readmissions. This is in contrast to another study, which reported a higher readmission rate of 11.9 %, of which 25 % were due to postoperative abdominal collection. Moreover, that study reported a reoperation rate of 2.5 %, which is considerably higher than the 0.2 % in our study [25]. A recent meta-analysis by Bailey et al. showed a readmission rate of 4.5 % [26]. However, the main causes of readmission were reported as postoperative abdominal collection and pain [27].
We found a significant correlation of both Alvarado and AAS with the following parameters: WBC count, neutrophil count, lymphocyte count, hemoglobin level, serum C-reactive protein (CRP), and serum lactate. These parameters increased with the severity grade of AA and were previously considered biomarkers for diagnosing AA. However, clinically, no single biomarker has demonstrated sufficient diagnostic performance to be used in isolation [28]. Many scoring systems have been introduced over time. One of the earliest and most commonly used worldwide is the Alvarado score. Numerous studies have sought to validate the Alvarado score for diagnosing AA, yielding mixed results with both supporting and non-supporting findings. A recent investigation of the Alvarado score was carried out in our institution and concluded that its sensitivity for diagnosing AA was unsatisfactory [9]. Therefore, we endeavored to validate a new scoring system in our institution, aiming to identify a more suitable score that could accurately diagnose AA and reduce excessive use of radiological methods. We decided to compare the Alvarado score with a more recently introduced score, AAS, which was developed in 2014 and is recommended by the WSES updates [10]. Confirmatory postoperative histopathology was used as the gold standard for AA diagnosis.
We selected cut-off values according to a previous study by Chae et al. [29]. The ROC curve we generated (Fig. 1), together with the results in Table 7, revealed that AAS outperforms Alvarado in terms of area under the curve (AUC). AAS exhibited relatively higher accuracy at both the higher and lower cut-off values, indicating its superior ability to stratify patients with AA. This finding is consistent with the earlier publication on the construction of AAS [10]. Furthermore, Kabir et al. [25] reported similar results, with AAS having a better AUC (0.78) than the Alvarado score (0.75), providing further evidence that AAS can decrease NA and the need for radiological diagnosis. Conversely, Capoglu et al. demonstrated no significant difference in AUC between AAS and Alvarado, along with similar accuracy [30].
After stratifying the AA cases based on HP and intraoperative findings, we noticed that group I of AAS had a higher proportion of normal-looking appendices or absence of histological inflammation compared to group I of Alvarado. This suggests that AAS can identify more patients without appendicitis and can shift management from operative to conservative, especially in those with a lower probability score (group I). Additionally, we observed that a majority of the patients (65 %) fell into the high-probability group III according to Alvarado, which is considerably higher than with AAS (49 %). Despite that, there was no significant change in accuracy. This finding was also reported by Sammalkorpi et al., further supporting its insignificance [10].
After evaluating the sensitivity and specificity of AAS and Alvarado at different cut-off points, we noticed that both scores demonstrated moderate overall diagnostic accuracy, with AAS performing relatively better. Chae et al. reported a similar finding and noted that both scores were useful in excluding appendicitis in the low-risk group I, allowing for safe discharge [29]. Similarly, a recent systematic literature review on the diagnostic value of different scoring systems confirmed that AAS and Alvarado were primarily effective in ruling out appendicitis and identifying low-risk patients, thus reducing the need for radiological evaluations and minimizing NA rates within these patient groups [31].
Despite the advantages of retrospective medical record reviews, they have inherent limitations regarding data quality. Furthermore, the AAS and Alvarado scores were estimated retrospectively from the documented clinical evaluations, which affected the accuracy of the NA rate. Additionally, variations in the degree of expertise and experience among the operating surgeons may have influenced the recorded intraoperative findings. These limitations should be considered and addressed in future research. Nevertheless, a strength of this study is its large sample size, which provides a more accurate understanding of the relationships between variables. To our knowledge, this is the first study in Qatar to evaluate the correlation between the Alvarado score and AAS findings, incorporating a wide range of interrelated variables in such a large sample.

Conclusion
The diagnosis of acute appendicitis remains a challenging task without radiological confirmation. AAS has demonstrated higher accuracy compared to the Alvarado score, especially in identifying patients suitable for discharge from the emergency department, as it can effectively detect more cases of NA. These findings suggest that AAS reduces the reliance on radiological imaging and decreases the rate of NA more effectively than Alvarado. Therefore, conducting a prospective study is recommended to validate these findings in the near future.

Implications and contribution
Research assessing the adult appendicitis score (AAS) for predicting the diagnosis of AA should consider many of the factors highlighted in this study.

Funding
The publication of this article was funded by the Qatar National Library.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability statement
Data will be made available on request.