The Use and Structure of Emergency Nurses’ Triage Narrative Data: Scoping Review

Background: Emergency departments use triage to ensure that patients with the highest level of acuity receive care quickly and safely. Triage is typically a nursing process that is documented as structured and unstructured (free text) data. Free-text triage narratives have been studied for specific conditions but never reviewed in a comprehensive manner.
Objective: The objective of this paper was to identify and map the academic literature that examines triage narratives. The paper described the types of research conducted, identified gaps in the research, and determined where additional review may be warranted.
Methods: We conducted a scoping review of unstructured triage narratives. We mapped the literature, described the use of triage narrative data, examined the information available on the form and structure of narratives, highlighted similarities among publications, and identified opportunities for future research.
Results: We screened 18,074 studies published between 1990 and 2022 in CINAHL, MEDLINE, Embase, Cochrane, and ProQuest Central. We identified 0.53% (96/18,074) of studies that directly examined the use of triage nurses’ narratives. More than 12 million visits were made to 2438 emergency departments included in the review. In total, 82% (79/96) of these studies were conducted in the United States (43/96, 45%), Australia (31/96, 32%), or Canada (5/96, 5%). Triage narratives were used for research and case identification, as input variables for predictive modeling, and for quality improvement. Overall, 31% (30/96) of the studies offered a description of the triage narrative, including a list of the keywords used (27/96, 28%) or more fulsome descriptions (such as word counts, character counts, abbreviation, etc; 7/96, 7%). We found limited use of reporting guidelines (8/96, 8%).
Conclusions: The breadth of the identified studies suggests that there is widespread routine collection and research use of triage narrative data. Despite the use of triage narratives as a source of data in studies, the narratives and nurses who generate them are poorly described in the literature, and data reporting is inconsistent. Additional research is needed to describe the structure of triage narratives, determine the best use of triage narratives, and improve the consistent use of triage-specific data reporting guidelines.
International Registered Report Identifier (IRRID): RR2-10.1136/bmjopen-2021-055132


Overview
There are an estimated 46.6 emergency department (ED) visits per 100 people in the United States or 142 million annual visits to Canadian and American EDs combined [1,2]. EDs sort and prioritize patients using triage to ensure that patients with the highest level of acuity are provided care quickly and safely. Modern electronic health records allow for the large-scale collection of triage data, such as time stamps, vital signs, screening assessments, and free-text descriptions [3,4]. These data can be used to track ED volumes and guide local and national policy decisions [5]. Machine learning (ML) and artificial intelligence have allowed the data to be examined for a range of purposes [5,6]. Despite the ubiquity of triage and triage-related data collection, the potential research impact of using triage narrative data remains largely unrealized [7,8].

Background
Triage is the process of sorting patients. It originated during the Napoleonic wars [9] and was introduced into civilian practice in the 1960s [10]. Triage was formalized using validated tools in the 1980s [11] and was first implemented in Australia as a national system in 1994 [12]. Most countries use a formal triage system [13] associated with improved patient safety and service efficiency outcomes [14]. Triage is typically performed by experienced ED nurses [15] who are specially trained to use formally validated triage assessment tools to prioritize patient care [13]. Triage assessment typically consists of a brief history and physical assessment of the patient, followed by the assignment of a visit category and triage priority level by the nurse [15].
Several countries have standardized the mandatory collection of ED data. Canadian [16,17] and Australian [18] EDs report a triage minimum data set of structured complaint code fields. In addition to these nationally coordinated triage data collection efforts, there are regional databases for the local monitoring of injuries or syndromic surveillance (eg, toxic drug supplies and infectious disease outbreaks) [19]. The triage data collected between systems will vary, but the data types can be grouped into either structured or unstructured data, with each data type having its own strengths and weaknesses.
Structured data force the triage nurse to select from one of several preformatted options and restrict the types of data that can be entered into any given data field. Examples of structured triage data include arrival time, vital signs, demographic information (ie, age and sex), insurance status, categorical chief complaints, and numerical triage acuity score. Structured data are the most frequently reported data generated during triage [4,5]. Structured data are readily available (owing to their routine collection) and simple to analyze and report compared with unstructured data; however, this convenience comes at a loss of contextual detail that is available from unstructured narratives [5].
Unstructured clinical data include free-text written notes or "narrative" [20]. Narratives generated at triage vary in length and structure depending on the electronic health record and triage system used. The narrative typically includes the triage nurse's assessment and the patient's reported reason for visiting the ED. These data are unstructured and allow nurses to record the chief complaint in the patient's own words, descriptions of events associated with the ED presentation, and their physical examination findings [21].
Two systematic reviews that focused on injuries examined whether unstructured clinical narratives, including those generated at triage, could be used for large-scale injury surveillance [22,23]. These reviews summarized how narrative data were used to gather injury information and highlighted how data fields were interrogated [22,23]. Cumulatively, the reviews examined 2831 studies published over 18 years and included 56 studies, 13 of which used ED triage data [22,23]. They reported that narrative data use has increased over time and that analyzing the data required automatic or manual extraction of keywords or ML techniques. The review authors were critical of data heterogeneity and called for improved data collection methods [22,23]. The heterogeneity noted in these studies may be partially explained by the wide range of administrative data set types interrogated. A more homogeneous data set (ie, triage narratives alone) may have offered alternative insights.
Two additional review studies published in 2013 focused their analyses on studies using triage narratives for syndromic surveillance systems (ie, programs that monitor for disease outbreaks) [19,24]. Syndromic classifiers use chief complaint narratives to group patient visits into categories to monitor for changes (eg, outbreaks) in disease presentations. The first systematic review screened 89 studies identified through a structured search limited to PubMed to examine syndromic classifiers for detecting influenza in ED triage data sets [24]. The authors included 38 studies that met their inclusion criteria: (1) examined clinical data, which was (2) generated in the ED, and (3) examined influenza. The most commonly used method to identify cases was chief complaint classification. The authors noted that ED triage narratives allowed for large-scale research and program evaluation, but no details on the structure of or methods for extracting chief complaint classification data were offered [24].
The second 2013 nonsystematic review also focused on syndromic surveillance. This review offered descriptive details on the structure of syndromic surveillance systems and their data [19]. The review included 17 studies drawn from an undisclosed initial sample and identified 15 chief complaint classifier systems of interest. The authors described the geographic location where each system was in use and the process used by each system to group visits into syndromes and detailed the relative strengths and weaknesses of each system. The review noted that all but 1 system (from Canada) were American and that the classifiers used differing degrees of computer text parsing to assign patients into groups (eg, ranging from 4 to 9 syndromes); it classified the approach of each system as keyword, statistical, or linguistic. The authors highlighted that statistical methods relied on large data volumes to be robust to the "noisy" inputs found in narrative text. By contrast, keyword and linguistic methods used keyword-based strategies and were described as disadvantageous because time-intensive adjustments were needed to accommodate variations in triage vocabulary. The drawbacks of keyword-based methods were balanced by the transparency offered when compared with ML studies. The authors argued that triage narratives are of great utility for disease surveillance and were less critical of variations in the initial data quality, concluding that there is a need for common syndromic definitions to improve the utility of these data.
Despite the use of triage data for multiple purposes, there is a criticism of the methods used to classify triage narratives and a call for improved consistency and quality in their collection. There are documented efforts to create common data definitions for triage narratives [25] and to create national ED nursing data sets [26]; however, unstructured data are not as widely collected as structured data [7], and there is a paucity of literature examining what structures are common to triage narratives. This scoping review addresses these concerns and examines peer-reviewed literature to identify what ED triage narrative data have been used for, studies that may be sufficiently similar to compare, and the need for additional research. This scoping review systematically examines the evidence to determine what, if any, structures underlie these narrative data and describes what the data have been used for.

Objectives
The objectives of this review were as follows:
1. Describe the current literature on the use of ED nurses' triage narratives
2. Describe the objectives and findings of the included studies
3. Determine whether there are sufficient data to systematically review the structure or descriptions of triage narratives
4. Determine whether there is adequate consistency in the included studies to support further review of the outcomes

Overview
In this review, we used the scoping framework proposed by Arksey and O'Malley [27,28]. The protocol was published previously [29]. The PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) framework was used to guide reporting [30].
To identify studies that examined unstructured narratives in the ED, we conducted a search using controlled terminology for the main topics of health record narratives, emergency, and triage. A medical librarian refined the search terms, and prespecified filters were used for ED [31][32][33][34]. To maximize the breadth of the retrieved studies, a comprehensive search was conducted in CINAHL, Ovid MEDLINE, Ovid Embase, Cochrane Library (via Wiley), and ProQuest Central. The search was limited to peer-reviewed literature published after 1990, four years before the first nationally implemented triage system [12]. The reference lists of select excluded studies, namely those that examined the free-text narratives of emergency physicians and review studies that included triage narratives, were hand searched for inclusion. There were no deviations from the published protocol [29].
Data were downloaded into Covidence (Veritas Health Innovation) for screening. The studies were screened independently by 2 authors (CP and MJD) in 2 stages (title plus abstract and then full text) using prepiloted screening forms. Any peer-reviewed studies that examined unstructured narratives [22,35] that were generated within an ED [36] by a nurse [37] were included. Studies that examined disaster triage systems, studies that did not have full text (ie, abstracts only), and non-English studies were excluded. Cohen κ was used to gauge agreement during screening, and all conflicts were settled by consensus. There were no deviations from the study protocol, which outlined the screening forms and operational definitions [29].
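As an illustration of the agreement statistic named above, the following minimal Python sketch computes Cohen κ for 2 screeners' include or exclude decisions. The function name `cohen_kappa` and the screening decisions are hypothetical and are not drawn from this review's data; the sketch only shows the calculation (observed agreement corrected for the agreement expected by chance).

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined,
        # and this sketch (an assumption) returns 1.0 in that case.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical title-and-abstract screening decisions for 8 records.
screener_1 = ["include", "exclude", "exclude", "include",
              "exclude", "exclude", "include", "exclude"]
screener_2 = ["include", "exclude", "include", "include",
              "exclude", "exclude", "exclude", "exclude"]

kappa = cohen_kappa(screener_1, screener_2)
print(round(kappa, 3))
```

In this made-up example, the screeners agree on 6 of 8 records, but κ is lower than the raw agreement because much of that agreement would be expected by chance when most records are excluded.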

Data Extraction
The data were extracted into Microsoft Excel (version 2019, Microsoft Corp; by CP) using prepiloted forms. The results were independently confirmed by a second reviewer (MJD). Counts and proportions were used to describe categorical and numeric values. The extracted categorical values included study variables such as study design, country of origin, triage system used, and the use of ML. The extracted numerical data included the publication year, number of EDs from which the data were drawn, number of visits or patients included in the initial and final samples, and the number of nurses included in each study.
For studies that reported data as minimum values (ie, "there were over three million of visits") [38][39][40][41][42][43][44][45], values were recorded as the minimum stated value (ie, 3 million). When studies reported using quality or reporting frameworks, we reported the tool by name. The main conceptual categories of each study (ie, the objectives, design, population, and results) were described [46]. We summarized the descriptions of the triage narratives and keywords when the narratives were reported in the study. When 5 or fewer keywords were used, they were recorded verbatim.

Data Analysis
Owing to the wide distribution of the data, central tendency was summarized using medians (with IQRs) together with minimum and maximum counts. Statistical analyses were performed using SPSS (version 25, IBM Corp). Citation management was performed using Zotero (Corporation for Digital Scholarship). The study objectives were categorized dichotomously (ie, yes or no) based on whether ML was used in the study (defined as any form of artificial intelligence), and they were grouped into mutually exclusive categories according to the primary use of the triage narratives: case identification, predictor variable, or quality improvement.
Studies were excluded primarily for not specifying whether the narratives were generated by a nurse at triage (67/131, 51.1%). All review studies identified in the primary search that discussed triage or ED narratives (although excluded) underwent citation screening. An additional 13 studies were included at this stage (Figure 1).

Study Designs
Retrospective design was the most common approach (80/96, 83%; Multimedia Appendix 1). Data were typically drawn (in part or entirely) from electronic databases, except in earlier studies, in which data were manually abstracted from paper charts [47][48][49]. The studies used data from hospitals (63/96, 66%) or regional databases (33/96, 34%). All studies reported on the unstructured narratives generated at triage; however, there was significant variation in the types and details of additional data reported. The most commonly collected non-triage-narrative data were patient demographic data, namely age (Table 1). There was a large variation in the numbers of visits and departments examined, with the included sample sizes ranging from fewer than 100 to >2 million visits. These visits were drawn from databases ranging from 100 to >14 million visits and reflected as few as 1 ED and as many as 496 EDs (Table 1).

Study Objectives
The most common objectives for studies using triage narratives were to perform case identification (59/96, 61%), to use narratives as a predictor variable in ML models (21/96, 22%), and to use narratives for quality improvement (16/96, 17%; Table 3). Studies categorized with case identification as their primary objective sought to describe incidence or prevalence estimates or populations of interest. Studies that used narratives as a predictor variable predicted patient acuity scores, resource use, or specific diagnoses.
Quality improvement studies used triage narratives to increase clinician or system safety and were subdivided as pertaining to reliability, accuracy, and validity or safety and efficiency.
Reliability and validity studies examined interrater reliability and were used to assess whether the triage classification matched specific populations with specific categorical assignments or triage acuity scores. Safety and efficiency studies examined narratives to improve data quality or reduce errors and effort (Table 3).
ML consisted of several models, and we used an inclusive approach by combining all ML, natural language processing, and other artificial intelligence models. We noted that the frequency of ML use was increasing and that ML was more frequently used in predictive studies (21/25, 84%) than in studies using narratives for case identification (17/58, 29%) or quality improvement (1/13, 8%; Figure 2).

Table 3. Summary of study objectives.
Quality improvement: Studies used triage narratives from previous ED visits as a research instrument. These studies had nurses or physicians rescore visits and compared the scores to calculate the reliability, validity, accuracy, or interrater agreement of providers for specific triage systems [48,49,51,54,60-62].
Acuity or resource use: Triage narratives were used as a covariate for machine learning models that predicted specific resource or admission needs. Admission destinations and resources of interest included advanced diagnostic imaging use [56,129,130], mental health admission [131], ICU admission [132], or neuro-intensive care unit admission [133].
Specific diagnoses
ED: emergency department. ICU: intensive care unit.

Descriptions of Triage Narratives
The quality and structure of the triage narratives used in each study were not clearly stated. Of the 96 studies included, only 30 (31%) described the narrative. The most common approach to describing narratives was a description of the triage narrative or of the keywords used to search within the narrative (Table 4).

Table 4. Keywords and triage narrative descriptions reported in the included studies.
Chapman et al [104], 2004. Keywords: the following fever-related keywords were used: "fever(s)," "Febrile," "chill*," and "low grade temp*." Number of keywords: 5. Narrative description: N/A.
Day et al [43]. Keywords: shortness of breath and difficulty in breathing were examined. Number of keywords: 6. Narrative description: the mean length of the triage narratives was 14.6 (SD 7.9) words in each database.
Trivedi et al [93], 2019. Keywords: electric scooter-related brand names "bird" and "lime" as well as "scooter" were the keywords. Number of keywords: 3. Narrative description: N/A.
Sterling et al [126], 2020. Keywords: N/A. Number of keywords: N/A. Narrative description: the mean length of triage narrative was 143.17 (SD 77.8) characters (excluding spaces) or 64.3 (SD 35.2) words.
Vernon et al [41], 2020. Keywords: electric scooter-related keywords and their variations were searched; "Scooter," "e-scooter," and "electric-scooter" were offered as specific terms. Number of keywords: 3. Narrative description: N/A.
Ivanov et al [125], 2021. Keywords: N/A. Number of keywords: N/A. Narrative description: the average number of clinical features per text entry was 12.79; there was no discussion about character or word counts.
Rahilly-Tierney et al [86], 2021. Keywords: "Heroin" and "overdose" were specified as inclusion terms and "detoxification" as an exclusion term; although there may have been additional terms included, they were not specified. Number of keywords: 3. Narrative description: N/A.
Rozova et al [99], 2021. Keywords: suicide-related keywords. Number of keywords: 40. Narrative description: the average triage note was 127 characters long (notes with <30 characters were excluded).
Studies reporting only the process of cleaning and normalizing unstructured narratives were not included. Variations in spelling, abbreviations, bigram duplications, and negation terms were counted if specified. N/A: not applicable.

There were 7 studies that described triage narratives [38,39,43,75,99,125,126]. The descriptions included the counts of characters and words used in the typical triage narrative.
The length of the triage narrative entries in these studies ranged from 40 [38] to 127 characters [99] and 14.6 [43] to 35 words (including abbreviations) [39] (Table 4). One study described the narratives in terms of "clinical features" [125]. "Clinical features" in this study were Unified Medical Language System clinical terms that the authors derived using a natural language processing algorithm (C-NLP), but it is unclear how much these differ from their input data or whether they can be compared with those in other studies.
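The character and word counts reported in these studies could be produced along the lines of the following minimal Python sketch. The function name `describe_lengths`, the choice to count characters excluding whitespace and to split words on whitespace, and the sample narratives are all assumptions made for illustration; the included studies did not specify their counting rules in a way that can be reproduced here.

```python
from statistics import mean, stdev


def describe_lengths(narratives):
    """Mean and SD of narrative length in characters and in words.

    Characters are counted excluding whitespace; words are whitespace-delimited
    tokens, so abbreviations such as "c/o" count as one word (an assumption).
    """
    char_counts = [sum(not ch.isspace() for ch in text) for text in narratives]
    word_counts = [len(text.split()) for text in narratives]
    return {
        "chars_mean": mean(char_counts),
        "chars_sd": stdev(char_counts),
        "words_mean": mean(word_counts),
        "words_sd": stdev(word_counts),
    }


# Hypothetical triage narratives, not taken from any included study.
sample = [
    "fever x2 days, chills",
    "SOB on exertion",
    "fall from e-scooter, L wrist pain",
]

summary = describe_lengths(sample)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Reporting which counting rule was used (for example, whether spaces or abbreviations are counted) would make such descriptions comparable across studies.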
In total, 9 studies reported the number of nurses who generated the narratives [38,48-55]. The total number of nurses whose documentation was assessed in these studies was 3844. The median sample size of nurses was 15 (IQR 10-50), and the sample size ranged from 2 [50] to 3538 [53]. These 9 studies represent only 3% of the total sample size (n=367,946). One of the studies reported on both the structure of triage narratives and the number of nurses included in the sample [38].
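Summary measures of this kind (median, IQR, and range) can be computed as in the following minimal Python sketch. The nurse counts are hypothetical values chosen only to show a skewed distribution; they are not the per-study counts from this review, and the "inclusive" quartile method is an assumption, because statistical packages differ in how they interpolate quartiles.

```python
from statistics import median, quantiles


def summarize(counts):
    """Median, IQR bounds, minimum, and maximum for a list of counts."""
    # method="inclusive" treats the data as the full population of interest;
    # other packages may use a different quartile definition.
    q1, _, q3 = quantiles(counts, n=4, method="inclusive")
    return {
        "median": median(counts),
        "iqr": (q1, q3),
        "min": min(counts),
        "max": max(counts),
    }


# Hypothetical numbers of nurses per study (one very large study skews the mean).
nurse_counts = [2, 5, 9, 14, 20, 41, 300]

result = summarize(nurse_counts)
print(result)
```

With one very large study in the sample, the median and IQR describe the typical study far better than the mean would, which is the rationale for the approach taken in the data analysis.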
The most in-depth description was provided by Travers and Haas [75], who described the structure of the narratives and regional variations. This 3-center retrospective cohort study used verbatim triage chief complaint narratives drawn from EDs in the United States. In a corpus of 13,494 unique chief complaint narratives drawn from 39,038 visits, they used manual and automated techniques to identify chief complaint concepts using the Unified Medical Language System data definitions. Concepts that were not readily classified using ML models were described in both form and function, and the authors detailed the function of the punctuation, acronyms and abbreviations, truncations, modifiers, and qualifier words used in triage narratives [75].
Although quality appraisal can be incorporated into scoping reviews [30], we opted not to include one because our primary aim was to describe the literature rather than assess each study's findings [27,28]. Consequently, we are limited to reporting that 8% (8/96) of the included studies used an Enhancing the Quality and Transparency of Health Research Network quality reporting guideline (Table 5). In total, 4% (4/96) of studies used reporting guidelines specifically for predictive models [62,99,124,129], and 1% (1/96) of studies reported using a quality framework to guide data cleaning and the protection of patient information [124].

Principal Findings
We performed a scoping review to examine studies reporting on the structure and use of triage nurse narratives. Our search was systematic, used a prepublished protocol, and screened a significant number of studies published over a 32-year period. Our study protocol was registered and published and used standardized screening templates and data extraction forms [29]. Our search intentionally sacrificed specificity for sensitivity, including a substantial number of studies in keeping with the scoping review design. The volume of studies retrieved demonstrates that identifying triage narrative in academic literature is difficult and that straightforward ways of identifying pertinent studies are needed. Studies would be more readily identifiable if their keywords, titles, and abstracts were clear and consistent.
In addition to the triage narrative, we found that the most frequently reported data were patient age, sex, chief complaint category, discharge status, and triage acuity, similar to a 2020 systematic review of ML for clinical decision support in the ED [5]. Similar to other review studies, we found an increase in the number of studies conducted over time [3]. We found a sharp increase in the sample size of studies after 2008. Our findings also support that the studies using ML lag behind studies of health record data. However, we noted that this trend continued only until 2017, when ML became the most common approach reported in the literature. Wang et al [3] tabulated the top sources of electronic health record narratives and determined that the most common sources were discharge summaries (n=26, 45% of studies), progress notes (n=15, 26%), admission notes (n=9, 16%), operative notes (n=5, 9%), and primary care notes (n=3, 5%). We identified 5 studies [71,72,108,121,134] that used ML. ML studies were challenging to identify through structured searches. Similar to our review, Wang et al [3] determined that most studies were conducted in the United States. They identified fewer (3/263, 1%) studies from Australia. In comparison, our study identified that 56% (10/18) of the studies originated from Australia during the same period [50,108]. Our results differ in part because we did not restrict our search in the same manner as Wang et al [3], who explicitly examined ML, and rather focused on unstructured narratives as a primary search concept.
The previously discussed reviews and several other studies included in this review established that triage narratives can improve case identification when used in isolation or when added to diagnosis codes [22]. The use of narratives for these purposes was reported as a straightforward process in several studies that showed that their inclusion or exclusion can substantially impact the number of cases identified [72,78,79]. Refinement of these techniques may improve the sensitivity of searches and have significant impacts on disease prevalence estimates for diagnoses (eg, rare illnesses) that may not be well captured with administrative discharge codes, a common method for tracking population illnesses [113,135]. The methods used in keyword-based case identification studies are well positioned for implementation, given their clearly defined and reproducible methods and long history of being used for these purposes. Studies of disease prevalence were among the first to use narratives collected on a large scale [42,75]. The potential improvements to the sensitivity and specificity of case identification may justify the systematic review of the studies included in this review. In addition, future research could focus on clearly defining the improvements that narrative data analysis can offer to case identification studies.
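To make the keyword-based approach concrete, the following minimal Python sketch flags narratives that contain any inclusion keyword and no exclusion keyword, with a trailing `*` treated as a wildcard. The keyword lists echo terms reported in Table 4, but the function names (`compile_keywords`, `identify_cases`), the any-keyword matching rule, and the example narratives are assumptions for illustration; this is not the method of any specific included study.

```python
import re


def compile_keywords(keywords):
    """Build one case-insensitive regex from keywords; a trailing '*' is a wildcard."""
    patterns = []
    for keyword in keywords:
        if keyword.endswith("*"):
            # "chill*" matches "chill", "chills", "chilled", ...
            patterns.append(r"\b" + re.escape(keyword[:-1]) + r"\w*")
        else:
            patterns.append(r"\b" + re.escape(keyword) + r"\b")
    return re.compile("|".join(patterns), re.IGNORECASE)


def identify_cases(narratives, include, exclude=()):
    """Return narratives matching any inclusion keyword and no exclusion keyword."""
    include_re = compile_keywords(include)
    exclude_re = compile_keywords(exclude) if exclude else None
    cases = []
    for text in narratives:
        if include_re.search(text) and not (exclude_re and exclude_re.search(text)):
            cases.append(text)
    return cases


# Hypothetical triage narratives, not taken from any included study.
narratives = [
    "Febrile x2 days, c/o chills",
    "L ankle pain after fall",
    "low grade temps since yesterday",
    "requesting detoxification, last heroin use 2 days ago",
    "found unresponsive, suspected heroin overdose",
]

fever_cases = identify_cases(
    narratives, ["fever", "fevers", "febrile", "chill*", "low grade temp*"]
)
overdose_cases = identify_cases(
    narratives, ["heroin", "overdose"], exclude=["detoxification"]
)
print(len(fever_cases), len(overdose_cases))
```

The sketch does not handle negation (eg, "denies fever"), misspellings, or abbreviations, which is one reason keyword lists need the time-intensive tuning described in the literature, and why reporting the full keyword list matters for reproducibility.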
There is a pressing need to collect nursing data [7], and triage has been identified as one of the most important areas for quality improvement [136]. Several studies have reviewed quality improvement efforts at triage [8] and called to include narratives in these efforts [137], but significant work is still needed. A renewal of early efforts to establish a minimum ED nurse data set [26] and efforts to create common definitions for narrative elements are needed [25], as is additional research to describe the structures of triage narratives in general. This work is required to determine whether there is a common structure in the data. Our results showed that even though 31% (30/96) of studies offered a description of narratives, only 1% (1/96) provided significant depth. A fulsome description is needed to ensure that triage nursing contextual data are not lost through text normalization (a typical early step in data cleaning for models), given that nurses use unique punctuation and abbreviations while recording triage narratives [75]. Finally, given the wide regional variations in the breadth and depth of information collected at triage, research is needed to identify the specific details that triage narratives should contain.
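The concern about text normalization can be illustrated with a short Python sketch. The `normalize` function below is a generic, deliberately simple cleaning step (lowercasing, removing punctuation, and collapsing whitespace) and is an assumption, not the pipeline of any included study; the example narrative and its shorthand are hypothetical.

```python
import re


def normalize(text):
    """A common but lossy normalization: lowercase, drop punctuation, collapse spaces."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # replace punctuation with spaces
    return re.sub(r"\s+", " ", text).strip()   # collapse repeated whitespace


# Hypothetical narrative using slash and question-mark shorthand.
raw = "c/o SOB x3/7, ?PE. Denies CP."
cleaned = normalize(raw)
print(cleaned)
```

Here the output is "c o sob x3 7 pe denies cp": the question mark that qualified "PE" is gone, and the slashes in "c/o" and "x3/7" no longer bind their parts together, so a downstream model can no longer distinguish a queried finding from a stated one.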
The Strengthening the Reporting of Observational Studies in Epidemiology and Reporting of Studies Conducted Using Observational Routinely Collected Data guidelines were published in 2007 [138] and 2015 [139], respectively. However, only 8% (8/96) of the studies reported using a reporting guideline, even though 86% (83/96) of these studies were reported after 2007. Recently published reporting guidelines such as the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis [140] may contribute to more consistent reporting guideline use, and 2021 saw the highest (3/13, 23%) proportion of studies using a reporting guideline. The use of reporting guidelines will help reduce the heterogeneity noted in reporting.

Limitations
In total, 3 concepts (emergency, triage, and narrative) were searched using an inclusive search approach, resulting in a substantial number of studies. The level of agreement during screening was fair, but it was likely reduced owing to the large number of studies reviewed and the need for full-text reading to determine whether the narrative was nurse generated. Future refinements to the search strategy may enable a less wide-reaching search, and more clearly defined methods to identify nurse-generated narratives may decrease the number of studies for screening. In addition, clear methods for identifying when narratives are generated by nurses may prevent researchers from pooling nurses' triage narratives with narratives generated by other clinicians such as physicians, which may result in more studies being positively identified as originating from triage nurses.

Conclusions
This review identified 96 studies that used triage narratives to achieve quality improvement, perform case identification, or make predictions about clinical outcomes. We have described how narrative use is changing to incorporate larger samples and more ML methods for interrogating the data. We have provided a strong argument that there is a considerable lack of research on the structure of triage narratives. Future studies should not only report their outcomes but also describe in detail the data sources used. Future researchers should strive to follow reporting guidelines to improve the quality of data reporting and increase the ability to pool and compare study findings. Emergency nursing scholars can encourage the national collection of triage data to allow comparison between regions if the common structures of data are better articulated.

Authors' Contributions
All the authors contributed to the design of this study. CP, MK, and MD initiated the project. The protocol was drafted by CP and refined by MK, HMO, CN, CM, and MD. Screening was performed by CP and MD. Data extraction was performed by CP and confirmed by MD. CP performed the statistical analyses and was responsible for drafting the manuscript. MK supervised this study. All the authors read, refined, and approved the final manuscript.

Conflicts of Interest
None declared.

Multimedia Appendix 1
Summary of the included studies.