A multi-center diagnostic accuracy study on nine Prehospital Stroke Screening Scales

Background: Stroke is one of the most common debilitating diseases. Effective treatment is available, but it must be delivered within a defined golden time. Therefore, prompt action is needed to identify patients with stroke as soon as possible, even in the pre-hospital stage. In recent years, several clinical scales have been introduced for this purpose. We performed the present study to examine the accuracy of 9 clinical scales in terms of stroke diagnosis.
Methods: This multi-center diagnostic accuracy study was conducted in 2019. All patients older than 18 years presenting to the emergency department (ED) who had undergone brain magnetic resonance imaging (MRI) due to suspicion of stroke were eligible. All data were gathered in a pre-designed checklist consisting of 3 sections, using the clinical profiles of the patients. The first section of the checklist included baseline characteristics and demographic data. The second section included physical examination findings for the 19 items related to the 9 scales. The third section was dedicated to the final diagnosis based on the interpretation of brain MRI, which was considered the gold standard for acute ischemic stroke (AIS) diagnosis in the current study.
Results: Data of 805 patients with suspected stroke were analyzed. Of these, 463 (57.5%) were male. Participants' ages ranged from 6 to 95 years, with a mean of 66.9 years (SD = 13.9). Of all enrolled patients, 562 (69.8%) had acute ischemic stroke. The accuracy of the screening tests ranged from 63.0% to 84.4%. Their sensitivity ranged from 50.2% to 95.7% and their specificity from 46.5% to 92.2%. Among all the screening tests, RACE had the lowest sensitivity (50.2%) and MedPACS had the highest (95.7%). In addition, PreHAST had the lowest specificity (46.5%) and RACE had the highest (92.2%).
Conclusion: Based on the findings of the present study, the highly sensitive tests that can be used in this regard are CPSS, FAST and MedPACS, all of which have about 95% sensitivity. On the other hand, none of the studied tools reached a desirable specificity (above 95%) at any of the examined cut-offs.

seconds one person dies due to stroke [1]. Brain cells are highly susceptible to ischemia, and if a large vessel is blocked by thrombosis, about 1.9 million neurons are lost per minute. Therefore, every hour of delay in stroke treatment costs as many brain cells as a person would lose over 3.6 years of normal life [2]. Effective treatment is available for AIS, but it must be delivered within a defined golden time. It has been reported that only 1-8% of stroke patients receive proper treatment and the others face poor outcomes due to delayed referral [3][4][5][6]. Therefore, prompt action is needed to identify patients with stroke as soon as possible, even in the pre-hospital stage [7]. Although public education at the community level plays an undeniable role in promptly calling emergency medical services (EMS) after a stroke, effective interventions in the healthcare system begin the moment a patient or his/her companion contacts EMS. Naturally, paraclinical diagnostic tests are rarely feasible before hospital admission, so the impression must be based on the clinical presentation alone. Accurate and timely diagnosis allows the patient to be referred to the right place at the right time. In recent years, several clinical scales, including the PreHospital Ambulance Stroke Test (PreHAST), have been introduced for this purpose [8][9][10][11][12][13]. Choosing a scale depends on both its accuracy and its ease of use. Confirming the accuracy of a scale by comparing it with other pre-hospital scales can play an important role in the accurate diagnosis of acute stroke and thus increase the patient's chance of benefitting from successful treatment. Therefore, we performed the present multi-center study to examine the accuracy of these scales in terms of stroke diagnosis in patients admitted to the emergency department (ED).
However, in the current study, the efficacy of these scales was not evaluated in the field, and this should be assessed in future studies.

Study design
This multi-center diagnostic accuracy study was conducted in 2019 at 4 major educational-referral medical centers in Iran (Tehran: Sina and Shohaday-e-Tajrish Hospitals; Isfahan: Al-Zahra Hospital; Ahvaz: Golestan General Hospital).

Study population
All patients older than 18 years presenting to the ED of the mentioned hospitals who had undergone brain magnetic resonance imaging (MRI) due to suspicion of stroke, according to the evaluation of the physician in charge, were eligible. Those with a history of head trauma, previous stroke, known neurological disease or previous neurological surgery, and those who had left the ED against medical advice before undergoing brain MRI were excluded.
The sample size was calculated using the sample size formula for estimating the sensitivity of each stroke scale. Assuming an overall sensitivity of 87% for the screening scales, a stroke prevalence of 50% among suspected patients, a type I error of 5% and an absolute precision on either side of the sensitivity (ε) of 5%, the minimum sample size was calculated to be 802 suspected patients. In addition, 260 positive and negative patients were required to detect a 5% difference in area under the curve (AUC) between two stroke screening tools with 95% CI and 80% power, assuming AUC1 = 0.85 and a correlation of 0.50 between the two AUCs. We therefore planned to sample at least 800 patients with suspected stroke. The sample size required from each hospital was determined in proportion to the number of patients with suspected stroke admitted in 2017. Then, in each center, all patients meeting the inclusion criteria were enrolled from January 2018 until the intended sample size was reached.

Data gathering
All data were gathered in a pre-designed checklist consisting of 3 sections, using the clinical profiles of the patients. The first section of the checklist included baseline characteristics and demographic data such as age, gender, past medical history, drug history and time of symptom onset. The second section included physical examination findings for the 19 items related to the 9 scales, along with other manifestations such as vital signs, blood sugar level and level of consciousness. The third section was dedicated to the final diagnosis based on the interpretation of brain MRI, which was considered the gold standard for AIS diagnosis in the current study. All data were gathered under the supervision of an emergency medicine resident and three emergency medicine specialists. The required data were collected from the patients' files as well as the MRI images available in the corresponding hospital's picture archiving and communication system (PACS). The brain MRI scans were interpreted by both a radiologist and a neurologist.

Statistical analysis
We described data using frequency and percentage or mean and standard deviation (SD), as appropriate. We applied the Chi-square test to assess differences in the distribution of demographic characteristics, history of disease and risk factors between patients with and without a final diagnosis of stroke. Additionally, the independent t-test was used to assess mean differences in numerical variables, e.g. age, between the two groups of patients. We calculated the sensitivity, specificity, and positive and negative likelihood ratios of all nine screening tests with 95% confidence intervals (CI), as well as the McNemar odds ratio (OR) with 95% CI, which presents the difference between predicted stroke cases and the final diagnosis for each screening tool, and also the ratio of stroke prevalence. The sensitivities and specificities of the screening tests were compared using the McNemar Chi-square analysis described in previous articles [14,15]. First, an overall test of difference (in sensitivity or specificity) was conducted for all pairwise comparisons of the nine screening tests using a 4 × 4 extension of the McNemar test; if this was significant, sensitivity and specificity were then compared separately using the 2 × 2 contingency table of the McNemar test. Finally, we used Youden's J statistic to compare the performance of the nine screening tests. The Receiver Operating Characteristic (ROC) curve and area under the curve (AUC) with 95% CI were calculated for the screening tools with a numerical score (ROSIER, LAPSS, FAST, CPSS and RACE), and their AUCs were compared as described by DeLong et al. [16]. A P-value < 0.05 was considered statistically significant, and all statistical analyses were conducted using Stata Version 14 (StataCorp LP, College Station, TX).
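As an illustration of the measures named above, the following Python sketch computes sensitivity, specificity, accuracy, likelihood ratios and Youden's J from a 2 × 2 table of scale result versus MRI diagnosis, and a 2 × 2 McNemar Chi-square from the discordant pairs of two scales. It is not the authors' Stata code. The names `TwoByTwo`, `wilson_ci`, `diagnostic_metrics` and `mcnemar_chi2` are hypothetical, the Wilson score interval is an assumed choice of CI method (the text does not state which was used), and the McNemar statistic is shown without continuity correction.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from cross-tabulating one screening scale against brain MRI."""
    tp: int  # scale positive, MRI positive
    fp: int  # scale positive, MRI negative
    fn: int  # scale negative, MRI positive
    tn: int  # scale negative, MRI negative


def wilson_ci(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (assumed CI method, 95% by default)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def diagnostic_metrics(t: TwoByTwo) -> dict[str, float]:
    """Point estimates; likelihood ratios are undefined if specificity is 0 or 1."""
    sens = t.tp / (t.tp + t.fn)
    spec = t.tn / (t.tn + t.fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "accuracy": (t.tp + t.tn) / (t.tp + t.fp + t.fn + t.tn),
        "lr_positive": sens / (1 - spec),
        "lr_negative": (1 - sens) / spec,
        "youden_j": sens + spec - 1,
    }


def mcnemar_chi2(b: int, c: int) -> tuple[float, float]:
    """Uncorrected McNemar test for two scales applied to the same patients.

    b: patients classified correctly by scale A only
    c: patients classified correctly by scale B only
    Returns the Chi-square statistic (1 degree of freedom) and its p-value.
    """
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a Chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

To compare sensitivities, `b` and `c` would be counted among MRI-positive patients only; to compare specificities, among MRI-negative patients only.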

Results
Data of 805 patients with suspected stroke who were transferred to the ED by EMS were analyzed. Of these, 463 (57.5%) were male. Participants' ages ranged from 6 to 95 years, with a mean of 66.9 years (SD = 13.9). Of all enrolled patients, 562 (69.8%) had ischemic stroke based on the gold standard. Table 1 reports the demographic and baseline characteristics of the studied patients. The prevalence of ischemic stroke was higher in male than in female patients (73.9% vs. 64.3%; P = 0.004). A history of ischemic heart disease (IHD) was more common in patients with stroke (74.9% vs. 67.1%; P = 0.021). Patients with stroke were also older (P < 0.001). (Table 3). The Youden index for ROSIER and LAPSS was 55.1% and 54.7%, respectively, which was higher than that of the other tests. Therefore, based on this index and assuming that sensitivity and specificity are equally important, ROSIER and LAPSS performed better than the others (Table 2).
Among the screening tools with a numerical score, the AUC of RACE was higher than that of the other tools, but this difference was significant only in comparison with LAPSS (0.857 vs 0.802; p < 0.001). The AUC of both ROSIER and FAST was 0.850, which was significantly higher than the AUC of LAPSS (p = 0.02). The pairwise comparisons of ROC curves for the other tools did not show significant differences (p > 0.05) (Fig. 2).
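For readers less familiar with the AUC of an ordinal score, it can be read as the probability that a randomly chosen patient with stroke scores higher than a randomly chosen patient without stroke, with ties counted as one half. The Python sketch below computes this quantity directly. The function name `auc_from_scores` and the example scores are hypothetical and are not taken from the study dataset, and the sketch does not reproduce the DeLong comparison of correlated AUCs used in the analysis.

```python
def auc_from_scores(stroke_scores, non_stroke_scores):
    """AUC as P(stroke score > non-stroke score), ties counted as one half."""
    wins = 0.0
    for s in stroke_scores:
        for n in non_stroke_scores:
            if s > n:
                wins += 1.0
            elif s == n:
                wins += 0.5
    return wins / (len(stroke_scores) * len(non_stroke_scores))


# Hypothetical scale scores for three MRI-positive and three MRI-negative patients.
stroke = [3, 2, 2]
non_stroke = [0, 1, 2]
print(round(auc_from_scores(stroke, non_stroke), 3))  # 0.889
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a sample of this size; a rank-based formula gives the same value more efficiently.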

Discussion
According to the results of the analyses, MedPACS has the highest sensitivity among the 9 assessed tools at cut-off = 1; it also has the highest sensitivity at cut-off = 3. FAST, the tool currently used by the Iranian EMS to detect stroke, has a sensitivity of 94.84% at cut-off = 1. In a pre-hospital setting, the sensitivity of a test is much more important than its specificity, because the priority is not to miss positive cases. Therefore, based on the findings of the present study, the highly sensitive tests that can be used in this regard are CPSS, FAST and MedPACS, all of which have about 95% sensitivity. On the other hand, in the hospital setting, where diagnoses are expected to be more precise and specialized, examinations should be applied in a way that avoids wasting resources, so tests with higher specificity are required. Unfortunately, none of the studied tools reached a desirable specificity (above 95%) at any of the examined cut-offs; so if we want to define a criterion for ruling out stroke in the ED, we need to perform more analyses and consider designing a new scoring system for this purpose.
Each of these scales has its own strengths and weaknesses. PreHAST, LAPSS, MASS, and OPSS consider more details, and completing their checklists is therefore time-consuming and difficult without specific training. On the other hand, patient assessment with FAST and CPSS is very easy, feasible for almost everyone, and does not require any special training. These two tools do not consider lower limb or eye symptoms. However, given their lack of exclusion criteria, they may classify stroke mimics as stroke, producing false positives.

MASS was indeed designed by integrating LAPSS and CPSS. LAPSS and MASS exclude patients with a history of seizures, those younger than 45 years, bedridden patients and those in a wheelchair. LAPSS has tried to increase specificity and sensitivity by examining blood glucose level and unilateral symptoms. Time of symptom onset has been taken into account by LAPSS but not by MASS. On the other hand, speech difficulty has been assessed by MASS but not by LAPSS. In comparison with MASS and LAPSS, MedPACS considers seizure, onset of symptoms, and blood glucose level, but age has not been taken into account.
OPSS does not consider age and eye symptoms, but excludes hypoglycemic and terminally ill patients as well as those under palliative care, and those with transient ischemic attack (TIA) less than 72 hours and Glasgow coma scale (GCS) < 10.
It is well known that hypoglycemia is a stroke mimic that can easily be identified using a bedside glucometer, but this is not considered in CPSS, FAST, RACE, ROSIER and PreHAST. This appears to be an important weakness that increases the number of false-positive stroke diagnoses in the pre-hospital setting when these tools are used.
History of seizure has been considered a negative point in LAPSS, MASS, MedPACS, OPSS and ROSIER, but not in FAST, CPSS, RACE and PreHAST. It is known that seizure can occur due to stroke; on the other hand, the post-ictal phase of a seizure may mimic stroke. It is therefore very challenging to decide whether to ignore seizure or assign a negative score to it.
PreHAST is a new tool that has been designed based on NIHSS and has tried to cover everything, so completing the checklist is time-consuming and difficult without training. Age, blood sugar level, history of seizures, and the time of symptom onset are not taken into account. In this scale, all four limbs are examined, so generalized or symmetric weakness can lead to a false-positive decision.
In general, excluding patients with a history of seizures and those younger than 45 years can cause adverse events, as stroke can also occur in young people, and seizures can be a symptom of a stroke.
ROSIER has assigned negative scores to seizure and syncope in order to better differentiate stroke and stroke mimics; also, by adding "new onset of symptoms", it has helped differentiate new cases from old stroke cases.
RACE has tried to determine stroke location and differentiate large vessel occlusion by examining aphasia and agnosia. However, blood glucose level, age, and time of symptom onset have not been assessed.
The sensitivity and specificity of these scales have been assessed in various studies, as well as in their primary derivation studies. We have tried to summarize the results of some of these papers in Table 4.
The key point regarding the present study is that the instruments were only compared with a gold standard, namely MRI, and their effectiveness in dealing with patients on the scene may differ from the reported findings for many reasons. For example, the level of knowledge and experience of pre-hospital emergency medical technicians (EMTs) in this field is very important; a more specific criterion may not be useful because it is difficult to apply on the scene, and a simple criterion may be more effective. Therefore, in future studies, the efficacy of these tools should be examined at the time of dealing with patients in the pre-hospital setting.

anonymously. Consent to participate is not applicable.

Consent for publication:
All authors consent to the publication of the paper.

Availability of data and materials:
The dataset is provided in the additional supporting files submitted to the journal website.

Competing Interests: None declared.

Figure 1
Sensitivity and specificity of the nine studied screening tools in predicting ischemic stroke with 95% confidence interval (CI).

Figure 2
The Receiver Operating Characteristic (ROC) curve and area under the ROC curve (AUC) of the 5 studied ischemic stroke screening tools yielding numerical scores, with 95% confidence interval (CI).

Supplementary Files
Strok9Dataset.xlsx