1 Introduction

Situation awareness (SAW) is a concept widely used in the military and aviation domains, with increasing use in other application areas that require critical decision making. It relates to the level of consciousness that an individual or team has of a situation: a dynamic understanding by an operator of what is happening in the environment and the projection of its status in the near future [1]. According to Endsley [1], SAW is divided into three levels: perception of the elements in the environment, comprehension of the status of these elements in a situation, and projection of their evolution in the near future. Achieving complete SAW is a process that takes place in the human mind and requires cognitive activity. The user’s prior mental model of situations can help to reduce cognitive overload. However, poor understanding of information may not only cause the loss of its global significance, but can also lead to failures while allocating resources.

Mental models and SAW are highly related, since a mental model is formed by the user’s understanding of situations. Hence, a poor understanding of a situation affects the user’s mental model, which leads to poor comprehension and jeopardizes decision making [1].

One critical service that can benefit from better SAW-oriented decision support is emergency call handling. Poor identification of, for instance, the details of a robbery report can lead to failures in both resource allocation and the definition of tactics to respond to such calls. Under stress, operators can be provided with information quality cues that help them reason under uncertainty and improve their understanding of an ongoing situation. Semantic models have been used for this purpose, since they can adapt to specific contexts and create verbal schemes in order to attach meaning to the information.

The literature reports approaches that aim to enhance and maintain the user’s SAW by employing technologies such as cognitive models, ontologies and frameworks based on core ontologies, fuzzy logic, and data fusion models [25]. However, to the knowledge of the authors, there is a lack of common ground regarding information quality in the evaluation of situations.

This paper introduces a conceptual framework for Information Quality that can enrich SAW for emergency dispatchers. A semantic model of the call is devised that can be used to support the operator in real time, to better acquire SAW under call uncertainties. The framework is applied to real reports from robbery victims. However, it can be easily adapted to other crime reporting calls.

The paper is organized as follows: Sect. 2 discusses data quality dimensions for general decision-making systems. Section 3 presents our framework for quality assessment and representation to acquire SAW when attending robbery reports, followed by a case study that applies our framework to a robbery report in Sect. 4 and the conclusions in Sect. 5.

2 Information Quality for the Evaluation of Situations for General Decision-Making Systems

Information quality is one of the crucial factors in decision-making systems. Imperfect information, which does not truly describe real-world situations (e.g., an incomplete set of necessary data, information that does not fully describe an event or fact, misspelled words, etc.), reduces the effectiveness of the systems, contributes negatively to mental model formation and, consequently, undermines the SAW process.

According to the literature, there is no established standard for information quality in decision-making systems. Requirements are divided into dimensions or metrics, and their application to decision making is highly domain dependent: the application defines their respective meanings according to objectives, tasks and associated decisions [6, 7].

For the robbery reporting call in our case study, the completeness, uncertainty and timeliness information quality dimensions are addressed. Approaches, descriptions, and different perspectives on information quality are described below.

O’Brien [8] groups the data quality attributes required for information systems into three main dimensions: content, time, and shape. The quality attributes include readiness, acceptance, frequency, period, accuracy, relevance, completeness, conciseness, breadth, performance, clarity, detail, order, presentation, and media.

Wang et al. [9] categorize quality dimension attributes into four main classes (intrinsic, contextual, representational and accessibility data quality), described as follows:

Intrinsic data quality implies guaranteeing the credibility and reputation of data; its attributes are credibility, reputation, accuracy and objectivity.

Contextual data quality comprises attributes that should be considered and evaluated according to the context of the task to be performed: value-added, relevance, timeliness, completeness, and appropriate amount of data.

As for representational data quality, the attributes are defined according to format-related aspects (such as conciseness and representation) and to the meaning of the data, that is, how it is understood and interpreted. Finally, the authors classify the accessibility-related attributes separately.

It is important to note that, in addition to the works cited, most of the works analyzed include in their methodology a subjective analysis step performed by experienced users in the field by means of questionnaires, interviews or surveys.

Among the methodologies in the literature that evaluate data quality in the emergency decision-making context, most are applied after the crime event, during the report recording process, in order to prevent low-quality reports from spreading in the system [10, 11].

It is evident that there are several applications of quality dimensions both in emergency and other application areas.

3 A Framework to Acquire Situation Awareness for Emergency Dispatchers: Robbery Report Event

In order to help develop SAW (specifically at the perception level, i.e., perception of the elements in the environment), our framework was designed to be coupled to a robbery situation assessment system, in which the operator can deal with information by combining multiple data sources (data fusion), as well as perform diverse kinds of refinement to build incremental knowledge about what is going on in real scenarios. Figure 1 presents our conceptual framework and its main components.

Fig. 1. Framework to enrich SAW for emergency dispatchers

As part of a complete situation assessment system, our framework is preceded by an acquisition module and followed by information fusion activities. The final objective is to reduce data dimensionality and convey better information about what is going on, for a more grounded decision. Figure 1 shows how the information quality module relates to the other modules for situation assessment of robbery report events (the framework can be easily adapted to other crime reporting calls).

The quality assessment is performed upon both data and information. On data, the primary assessment occurs and Syntactic Accuracy is applied; on information, the assessment occurs after the ontology is instantiated, which is when the data turns into information with its relationships and meaning.

Before performing any information analysis, our framework receives the output of the acquisition phase. The acquisition phase applies Natural Language Processing (NLP) [12] to identify objects, attributes and properties from audio reported to the PMESP (Military Police of the State of São Paulo) via emergency calls (number 190). As output, it produces a JavaScript Object Notation (JSON) object such as the one shown in Fig. 2.

Fig. 2. JSON schema for robbery report to perform quality assessment

Then, a Syntactic Accuracy analysis is performed as a data preparation step, checking for misspelled words that could negatively influence the completeness assessment conducted afterwards, as follows.

First, an algorithm called Metaphone [13] generates a key from the way each word is pronounced; consequently, words with similar sounds generate the same key. Then, the Levenshtein distance algorithm [14] is applied to compare the words in the robbery report. This algorithm measures the edit distance between two strings, i.e., the number of operations needed to make one string equal to another.

The process takes a dictionary of sound keys of words as a parameter. If the distance between two keys equals 0, a match is found, which indicates the presence of an attribute (word) in the emergency call. Thus, even if a word is misspelled, the evaluation process proceeds without errors.
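The key-matching step can be sketched as follows. This is a minimal illustration, not the implementation used in the framework: `toy_sound_key` is a deliberately crude, hypothetical stand-in for Metaphone [13] (it only drops non-initial vowels and collapses doubled letters), and the dictionary contents and function names are assumptions made for the example. Only `levenshtein` follows the standard definition of edit distance.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: insertions, deletions and substitutions to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def toy_sound_key(word: str) -> str:
    """Hypothetical stand-in for Metaphone: NOT the real algorithm."""
    letters = [ch for ch in word.upper() if ch.isalpha()]
    if not letters:
        return ""
    key = [letters[0]]
    prev = letters[0]
    for ch in letters[1:]:
        # Skip doubled letters and non-initial vowels.
        if ch != prev and ch not in "AEIOU":
            key.append(ch)
        prev = ch
    return "".join(key)


def find_attribute(word, dictionary):
    """Return the dictionary word whose sound key is at distance 0, else None."""
    key = toy_sound_key(word)
    for known_key, known_word in dictionary.items():
        if levenshtein(key, known_key) == 0:
            return known_word
    return None


# Assumed dictionary of sound keys built from expected attribute words.
dictionary = {toy_sound_key(w): w for w in ("pistol", "knife", "bicycle")}
print(find_attribute("pistoll", dictionary))  # misspelled word still matches
```

A real phonetic algorithm would also merge letters that sound alike, which this toy key does not attempt.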

The completeness assessment examines the attributes that should be present in the call, which were defined by means of a Goal-Directed Task Analysis (GDTA) [1]. The assessment results in a quantitative value: a percentage of how complete each object of the call is. Additional priority data and information were obtained through a questionnaire applied to members of the PMESP in Brazil.

Timeliness will be assessed according to four requisites: the time of the robbery event, the time that the robbery report was made, the time spent processing the information while finding objects and attributes, and the current system time. As a result, the time elapsed since the event will be returned.

Later, a score is assigned to represent the completeness and timeliness measures quantitatively. This score is crucial both for the fusion process, which uses it as a parameter for information integration, and for the operator, who analyzes it for decision making. The quality assessment is also performed in a local and a global fashion, so that both objects and situations can be evaluated, as discussed further below.

Therefore, the proposed framework can be summarized in three main steps: (1) elicitation of data quality requirements for the emergency call, (2) definition and application of quantitative metrics and functions for requirements classification, and (3) representation of the generated knowledge using a domain ontology.

3.1 Data Quality Requirements Elicitation

The requirements needed for the understanding of emergency calls were defined with the help of the PMESP. To perform the requirements elicitation, two different but complementary approaches were conducted: a Goal-Directed Task Analysis (GDTA) and a questionnaire. Both had the same objective of gathering the information necessary for granting SAW Level 1, together with complementary cues about the quality of information needed for emergency dispatchers’ decision making.

In order to define when a robbery report is complete, a model of objects and attributes for this report was devised. With the information obtained by GDTA (Table 1), an attribute tree was modelled according to the requirements of a report. The requirements also defined the components present in a robbery event: the victim, the criminal, the stolen object, and the place and time of the event. The following is a description of each attribute:

Table 1. Robbery requirements set through GDTA

  • Criminal and Victim, which have similar attributes as individuals: clothing, characteristics, ornaments, and respective descriptions;

  • Object: defines characteristics of the stolen object such as color, brand, size, and model. There is also an extension named Vehicle with specific features such as license plate and year, in case such information is provided;

  • Event spot: component provided with some type-specification (house, land, apartment, square) and information related to the address such as street, neighborhood, etc.

Four attributes were also defined to assess the timeliness dimension: (1) the time of the robbery event, (2) the time that the robbery report was made, (3) the time spent processing the information while finding objects and attributes, and (4) the current system time. The attribute tree plays an important role in the next steps, since quality evaluation and knowledge representation of the domain take place by means of the objects and attributes defined.

3.2 Quantitative Metrics Elicitation for Data Quality Evaluation

Information quality assessment is performed upon the JSON scheme for a robbery report, with the objects and their attributes presented in Fig. 2.

Metrics for quality assessment were defined according to two specific quality dimensions: timeliness and completeness. Finally, the scores of the objects together form a third quality attribute, uncertainty, which generalizes the other dimensions into a single quality measure for the robbery report (set of objects). Every object has its own completeness score; in contrast, the timeliness and uncertainty scores apply to the robbery report as a whole. This means that every report has at least the uncertainty and timeliness measures.

The completeness metric was defined with the help of the questionnaire applied to police experts, who helped to define the priority attributes of a robbery report, as follows:

  • Information if weapons were used

  • What kind of weapons were used

  • Current location of the criminal

  • Location of the crime

  • Information about the victim (e.g., condition)

  • Information about the vehicle (e.g., if stolen)

Formula (1) defines the completeness calculation:

$$ \frac{\delta \sum \varphi + \left( 10 \sum \beta \times \varphi - \sum \varphi \right)}{10 \sum \varphi} . $$
(1)

To deal with the presence of an object without attributes (e.g., the object is present in the JSON scheme but no attributes were mentioned in the report), a default value of 10 % was assigned to such an object for the completeness score assessment. Accordingly, δ represents the presence of one of the four objects: it equals 0 if the object is not present in the JSON scheme and 1 if it is present.

Hence, if the object is present, its completeness already equals 10 %. To find the remainder of the score, β represents the presence of an attribute (1 when the attribute is present and 0 when it is not) and is multiplied by φ, the weight of the attribute (1 for a normal attribute and 2 for a priority attribute). The result is divided by the sum of φ and finally multiplied by 100 to find the overall completeness percentage.
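As an illustration, formula (1) can be transcribed directly into code. The sketch below assumes a hypothetical representation in which each attribute is a pair (present, weight), with weight 1 for a normal attribute and 2 for a priority attribute as described above; the function name and the sample attributes are not taken from the framework.

```python
def completeness(object_present, attributes):
    """Completeness percentage of one object, transcribing formula (1).

    attributes maps a (hypothetical) attribute name to (present, weight),
    where weight is 1 for a normal and 2 for a priority attribute.
    """
    delta = 1 if object_present else 0
    sum_phi = sum(weight for _, weight in attributes.values())
    sum_beta_phi = sum(weight for present, weight in attributes.values() if present)
    numerator = delta * sum_phi + (10 * sum_beta_phi - sum_phi)
    return 100 * numerator / (10 * sum_phi)


# Assumed example: a Criminal object with one priority attribute.
criminal = {
    "weapon": (True, 2),     # priority attribute, mentioned in the call
    "clothing": (True, 1),   # normal attribute, mentioned
    "gender": (False, 1),    # normal attribute, not mentioned
}
print(completeness(True, criminal))  # 75.0
```

When the object is present (δ = 1), the expression as written reduces to the weighted share of mentioned attributes, Σβφ / Σφ, which is 3/4 in this example.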

The timeliness dimension assessment results in two kinds of information: a quantitative score regarding the existence of the four attributes needed, and how many minutes have elapsed since the robbery event. Formula (2) was defined to compute the timeliness quantitative score:

$$ \sum_{\gamma = 1}^{4} - \theta . $$
(2)

Here θ consists of a successive subtraction of the four attributes in the following order: the time of the robbery event, minus the time that the robbery report was made, minus the time spent processing the information while finding objects and attributes, minus the current time. To find the uncertainty score, the sum of the completeness scores of all objects is divided by the number of objects found in the report.

3.3 Knowledge Representation

Matheus et al. [15] present an ontology to improve SAW for emergency dispatchers, which represents the objects and the relationships among them, as well as their evolution over time. This ontology is used in this work to model the robbery report.

The assessment system processes data from heterogeneous sources and also from an information fusion module. The latter information helps to mitigate uncertainty locally, in a single report, or globally, when several reports containing similar objects have a property that links these objects, such as activities like “stealing, running, screaming, fighting”.

When information fusion occurs, the information needs to be reassessed, since we now refer to a situation and uncertainty becomes a global measure of what is going on.

The ontology describes the relationships between four main classes named Victim, Criminal, StolenObject and RobberySpot. A class called RobberyReport was defined to gather information about the report itself. The two classes Victim and Criminal hold some common attributes such as gender, condition, and physicalAspects. When any of the main classes is instantiated, the class Situation must be instantiated together with it; its main attribute, updateTime, stores the time of every change that happens in the whole situation and in the attributes of its instances. A reduced example of the classes is shown in Fig. 3.

Fig. 3. Example of ontology represented in JSON format

4 Case Study

This case study presents a situation in which SAW is a paramount factor for decision making because of its impact on the allocation of police resources. Given the large number of criminal events reported to the PMESP, and considering the stress to which emergency dispatchers are subjected, the main objective of this study is to provide dispatchers with a supporting tool that can enhance their SAW by reducing uncertainties and providing high-level abstractions to help decision making, resulting in a more efficient emergency call response service and police resource allocation. Additionally, we seek to identify and understand the contexts associated with the situation, such as location, criminal, stolen object, presence of weapons and victims. This case study specifically discusses the situation of robbery. The occurrence is initially reported via phone number 190, and the quality functions covering the completeness and temporal aspects of the event data are then applied. An example of a call is given below.

(Phone call): “Good evening! A carjacking has just happened at Domingos Setti with Luis Vives streets. Two guys riding bikes pointed a gun to the driver of a black Mercedes and made him leave the car without taking anything. They two fled speeding toward Klabin subway station.”

After the robbery report is received, the system receives the JSON scheme with the objects and attributes present in the report (Fig. 2), with no values in the quality items. The objects identified were Criminal, StolenObject and EventSpot, along with their attributes. A report may or may not have all the attributes set in the GDTA; it depends on the informer.

First, the JSON scheme is processed in order to set a completeness score for the first object detected (Criminal). Six present attributes are identified (2 of them being priority attributes, which count as 4), and the result is divided by the total of attributes defined for the Criminal, giving 0.316, which is multiplied by 100 to generate a score of 31.60 %. The same calculation is applied to the second object found (StolenObject), generating a score of 50 %, and to the third object found (EventSpot), generating a completeness score of 25 %.

Assume that the four time attributes are received via the JSON scheme: the event occurrence time (7:21 pm), the reporting time (7:25 pm), the report registration time (0:02 min) and the current time (7:29 pm). The timeliness node will then be composed of two pieces of information: the time elapsed from the event occurrence until the end of the registration of the occurrence (8 min ago), and the percentage of these four attributes that are present (100 %).
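The two timeliness outputs in this example can be reproduced with a small sketch. It assumes, as the numbers above suggest, that the elapsed time is the difference between the current time and the event time, and that the percentage is the share of the four time attributes that are present. The function name, the use of `None` for a missing attribute, and the calendar date are hypothetical.

```python
from datetime import datetime, timedelta


def timeliness(event_time, report_time, processing_time, current_time):
    """Return (percentage of time attributes present, minutes since the event)."""
    values = [event_time, report_time, processing_time, current_time]
    present = sum(v is not None for v in values)
    score = 100 * present / 4
    elapsed = None
    if event_time is not None and current_time is not None:
        elapsed = (current_time - event_time).total_seconds() / 60
    return score, elapsed


# Times from the call above; the date itself is an arbitrary placeholder.
day = dict(year=2015, month=1, day=1)
result = timeliness(
    event_time=datetime(hour=19, minute=21, **day),
    report_time=datetime(hour=19, minute=25, **day),
    processing_time=timedelta(minutes=2),
    current_time=datetime(hour=19, minute=29, **day),
)
print(result)  # (100.0, 8.0)
```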

Finally, to perform the uncertainty calculation, the three object completeness scores (31.6 + 50 + 25 = 106.6) were added to the timeliness percentage from the above calculation (100), resulting in 206.6, which was divided by the number of objects defined by the GDTA (5), resulting in 41.32 %. After the quality assessment of each dimension is performed, the JSON scheme will look as shown in Fig. 4.

Fig. 4. JSON scheme after quality assessment is performed
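The uncertainty value of this case study can be checked with a short sketch. It assumes that the total of 206.6 is the sum of the three completeness scores and the timeliness percentage (100 %), which is the reading that reproduces the reported numbers; the function and parameter names are hypothetical.

```python
def uncertainty(completeness_scores, timeliness_score, objects_defined=5):
    """Average the quality scores over the number of objects defined by the GDTA."""
    return (sum(completeness_scores) + timeliness_score) / objects_defined


# Criminal, StolenObject and EventSpot completeness, then timeliness.
score = uncertainty([31.6, 50.0, 25.0], 100.0)
print(round(score, 2))  # 41.32
```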

Considering that information fusion should occur to automatically mitigate the absence of information in a single report, all the quality scores are recalculated if new objects and attributes are discovered by the fusion routines; that is, information quality assessment must be performed once again. If fusion provides a relationship among objects, completeness is calculated based on the presence or absence of the objects that compose such a relation. In both cases, uncertainty must be recalculated. Figure 3 presents a JSON scheme after a quality assessment is performed, with fusion results.

5 Conclusion

This work introduced a framework to enhance the first level of situation awareness of emergency dispatchers when responding to robbery reports. Because dispatchers may have to make decisions under heavy stress, our framework tackles information quality processing to reduce uncertainty, providing means for a better perception of what is going on during an emergency report. For the requirements elicitation, interviews with PMESP police experts were carried out with the application of the GDTA methodology. As a result, an attribute tree was created with the main robbery objects and their attributes. The metrics to perform quality assessment were defined for four quality dimensions: syntactic accuracy, completeness, timeliness and uncertainty.

The weights of the objects and attributes of a robbery were established with the help of PMESP police experts. Also, a domain ontology based on the SAW core ontology was defined to provide semantic meaning and relationships between objects and attributes while performing data quality evaluation.

The assessments of the data contained in each robbery report provide a full perception of the entities of an event and the necessary information about an ongoing crime event report. The knowledge generated may assist the development of systems that require SAW, since the evaluation of quality tends to improve the representation of both present and absent report information. Since the main objective of the framework was to focus on the perception of the elements, and it does so by identifying the elements present in reports of robbery events, highlighting them and assigning quality scores, the framework meets its initial goal. As future work, the time and quality of dispatchers’ responses to the calls will be measured (community evaluation and police resource allocation). Another quality dimension that can be considered is consistency. This dimension refers to divergence in a set of data that breaks semantic meaning; for instance, in the case of a robbery report, two different addresses belonging to the same report (it is assumed that a single robbery cannot happen in two different places at the same time). This problem may arise after a data fusion process that combines a set of different reports.