Data-Flow-Based Extension of the System-Theoretic Process Analysis for Security (STPA-Sec)

Security analysis is an essential activity in security engineering to identify potential system vulnerabilities and achieve security requirements in the early design phases. Due to the increasing complexity of modern systems, traditional approaches, which only consider component failures and simple cause-and-effect linkages, lack the power to identify insecure incidents caused by complex interactions among physical systems, humans and social entities. By contrast, the top-down System-Theoretic Process Analysis for Security (STPA-Sec) approach views losses as resulting from interactions, focuses on controlling system vulnerabilities instead of external threats and is applicable to complex socio-technical systems. In this paper, we propose an extension of STPA-Sec based on data flow structures to overcome STPA-Sec's limitations and achieve security constraints of information-critical systems systematically. We analyzed a Bluetooth digital key system of a vehicle with both the proposed and the original approach to investigate the relationship and differences between the two approaches as well as their applicability and highlights. We conclude that the proposed approach can identify more information-related problems with technical details and can be used with other STPA-based approaches to co-design systems across multiple disciplines under the unified STPA process framework.


I. INTRODUCTION
SYSTEM security is an emergent property of the system, which represents a state or condition that is free from asset loss and the resulting loss consequences. System security engineering, as a special discipline of system engineering, coordinates and directs various engineering specialties to provide a fully integrated, system-level perspective of system security and helps to ensure the application of appropriate security principles and methodologies during the system life cycle for asset protection [1]. Violating system security constraints causes unexpected incidents, like mission failure or leakage of sensitive information, and finally leads to financial losses or even loss of life. Therefore, security needs to be considered carefully in system design. Security analysis, which in this research refers to the activity of analyzing systems in security-related aspects to achieve security requirements, is performed in the early security engineering phase and helps to manage system risks and make decisions. Traditional security analysis approaches, designed for earlier, relatively simple systems, are not effective for analyzing increasingly complex systems. Today's Cyber-Physical Systems (CPS) or Socio-Technical Systems (STS) consist of not only physical components but also software and even social elements, in which components from multiple disciplines interact deeply with each other. For example, an autonomous vehicle (as a CPS) consists of tens of thousands of physical components as well as lines of code. A vehicle Over-The-Air (OTA) software update system (as an STS) involves not only the technical part but also social entities like data providers and regulations. However, most traditional approaches start with system decomposition and focus on component failures, which leads to overlooking the impacts of interactions since components are analyzed individually.
Traditional causality models attribute accidents to an initial component failure cascading through a set of other components (like dominoes) [2] and cannot address causes of losses with non-linear cause-and-effect linkages.
To meet the requirements of modern systems, a relatively new approach for safety engineering called System-Theoretic Process Analysis (STPA) was proposed [3] and the extension for security named STPA-Sec was presented later [4]. However, we recognized some limitations of STPA-Sec when implementing it, especially for data-flow-based systems. Therefore, this research aims to work out an extended approach based on the unified STPA process framework for complex information-critical systems to overcome the identified limitations of STPA-Sec.
The rest of this paper is organized as follows. In section II, we introduce traditional approaches and the STPA series with research gaps and our contributions. In section III, we introduce the original STPA-based approaches and propose the extension in detail. In section IV, we present the analysis process of a use case by using both the original and the extended approach to demonstrate how to use them and to compare them. Finally, we summarize this paper in section V.

A. Traditional Security Analysis Approaches
The purpose of the security analysis in the early design stage is to achieve security requirements. Threat Analysis and Risk Assessment (TARA) is normally the main activity in the security analysis to identify potential threats and assess threat risks [5]. The SAE (Society of Automotive Engineers) J3061 Cybersecurity Guidebook for Cyber-Physical Vehicle Systems [5] provides a list of approaches that contain complete process frameworks of TARA, like the EVITA approach [6] and TVRA [7]. Besides, some techniques are proposed for only threat identification or only risk assessment and can be used within the previously mentioned TARA frameworks. These include Attack Tree Analysis (ATA) [8], STRIDE threat models [9], Threat and Operability Analysis (THROP) [5], Threat Matrix [10] and BDMP-based (Boolean logic Driven Markov Processes) modelling [11] for threat identification, as well as Binary Risk Analysis (BRA) [12] and the NIST SP 800-30 Guide [13] for risk assessment. Other than methods originally invented for security, some approaches have evolved from the safety field by introducing security awareness into the process and support the co-analysis of both safety and security. Table I summarizes both original and evolved approaches with brief introductions. They are all bottom-up approaches that build upon physical or functional decomposition instead of initially analyzing the system as a whole. They focus more on the tactical level and may overlook issues at the strategy level. Tactics are means to accomplish a specific action and are focused on physical threats, while strategy is regarded as the art of gaining continuing advantages and is focused on abstract outcomes [2]. The latter is useful to broaden the view and include more aspects, like organizational and managerial ones, in the analysis.
arXiv:2006.02930v1 [cs.CR] 4 Jun 2020

B. System-Theoretic Process Analysis (STPA) Approach and Extensions
To overcome the limitations of traditional approaches, STPA was created as a hazard analysis approach based on the System-Theoretic Accident Model and Process (STAMP), which views losses as results from interactions among various system roles that lead to violations of safety constraints and analyzes issues at the strategy level. STPA provides a powerful way to deal with complexity by using traceable hierarchical abstraction and refinement [2].
Other than safety engineering, STPA has also been extended into other fields with the same system-theoretic thinking. Young and Leveson [4] first presented STPA for Security (STPA-Sec), which shares similar steps with STPA and focuses on controlling system vulnerabilities instead of avoiding threats. To better perform co-analysis of safety and security under the STPA framework, Friedberg et al. [19] proposed a novel analysis methodology called STPA-SafeSec, which integrates STPA and STPA-Sec into one concise framework and overcomes limitations of the original approaches (e.g. no consideration of non-safety security issues) by introducing security constraints and mapping abstract control structures to real components. Shapiro [20] proposed STPA for Privacy (STPA-Priv), which extends STPA into privacy engineering by introducing privacy concepts and considerations into the general STPA process steps.
The most significant highlight of STPA-based approaches is that they shift from focusing on preventing failures and avoiding threats to enforcing safety constraints and controlling system vulnerabilities. Identifying and controlling system vulnerabilities rather than brainstorming and reacting to threats is a more efficient way to achieve system safety and security, because controlling a single vulnerability may mitigate several threats. Besides, threats are dynamic. Newly emerged threats cannot always be detected in time, but controlling vulnerabilities can protect the system against even unknown threats, just like defending a castle by reinforcing its walls regardless of who the enemy is. Another highlight is that the STPA-based approaches are applicable to socio-technical systems, which are systems that consider requirements spanning hardware, software, personal, and community aspects [21]. The analysis scope of the STPA series includes not only physical system components but also humans, the natural or social environment and their interactions, which makes the approaches more suitable for today's complex systems. Furthermore, because STPA has been extended into various disciplines, it is easier to perform co-design with a similar STPA framework and the same system model.

C. STPA-Sec Applications and Gaps
The STPA-Sec approach or its extensions have been used to identify system security constraints in various industries. Khan, Madnick and Moulton [22] demonstrated the implementation of STPA-Sec to identify security vulnerabilities of a use case (Central Utilities Plant Gas Turbine) in industrial control systems. Carter [23] used STPA-Sec with a preceding information elicitation process to analyze a small reconnaissance unmanned aerial vehicle. Sidhu [24] applied an STPA-Sec extension with a modified attack tree method to analyze the cybersecurity of autonomous mining systems. Wei and Madnick [25] analyzed a use case (Over-The-Air software update) in the automotive industry by using both STPA-Sec and CHASSIS and compared the analysis outcomes, which showed that STPA-Sec can identify more hazards than CHASSIS, while CHASSIS is more suitable for information lifecycle analysis.
Nevertheless, researchers have also pointed out several limitations of STPA-Sec. Schmittner, Ma and Puschner [26] reported that the original STPA-Sec lacks guidance for intended causal scenarios, excludes data exchange that is not directly connected to the process control and cannot cover more information-security-centric properties such as confidentiality. Torkildson, Li and Johnsen [27] also found that some essential security issues can be overlooked and recommended strengthening STPA-Sec by combining it with data-flow-based threat models. However, Torkildson's approach converts the STPA control structure into a data flow diagram by simply replacing control actions and feedback paths with data channels. Although such a data flow diagram helps to identify more data-related threats than using STPA-Sec alone, a diagram derived from the original control loop does not view the system from the data aspect in the first place and may still miss data-related information. Besides, the STPA-Sec approach regards the security issue as one of the key threats affecting system safety [25] and only supports the identification of safety-related security goals [28]. Non-safety-related security issues like confidentiality may be overlooked.
To sum up, existing STPA-Sec is not effective to identify non-safety but information-related issues since it does not consider security from the perspective of data flows. Furthermore, STPA-Sec lacks guidance for identifying security concepts.

Table I: Original and evolved security analysis approaches

EVITA: The EVITA approach considers four security objectives (safety, privacy, financial, operational) and uses attack trees to identify threats and assess risks [6].
Threat, Vulnerabilities, and implementation Risks Analysis (TVRA): TVRA is a process-driven TARA approach to systematically identify unwanted incidents which need to be avoided [7].
Operationally Critical Threat, Asset, and Vulnerability Evaluation (OCTAVE): OCTAVE is a process-driven TARA method which is best suited for enterprise information security risk assessments [14].
HEAling Vulnerabilities to ENhance Software Security and Safety (HEAVENS): HEAVENS is a systematic approach of deriving security requirements for vehicle E/E systems, including processes and tools supporting TARA [15].

Approaches evolved from other disciplines and support co-analysis:
A Security-Aware Hazard and Risk Analysis Method (SAHARA): SAHARA is a combined approach of the Hazard Analysis and Risk Assessment (HARA) with the STRIDE model and outlines the impacts of security issues on safety concepts [16].
Failure Mode, Vulnerabilities and Effects Analysis (FMVEA): FMVEA is an approach evolved from the Failure Mode and Effect Analysis (FMEA) to identify vulnerability cause-effect chains for security [17].
Combined Harm Assessment of Safety and Security (CHASSIS): CHASSIS is a unified process for safety and security by using UML-based models (e.g. misuse cases and sequence diagrams) [18].

D. Contributions
In this paper, we propose a data-flow-based extension of STPA-Sec (named STPA-DFSec) with elicitation guide words to overcome STPA-Sec's limitations. The analysis process of a vehicle digital key system is presented to demonstrate how to use STPA-DFSec. We also analyze the same system by using the original STPA-Sec and compare the outcomes through synthesis and mapping. Finally, we identify the relationship between both approaches and summarize their highlights and applicability.

III. METHODOLOGY

A. Brief Introduction of STPA and STPA-Sec
We briefly introduce the original STPA framework as the basis of the proposed approach in this section.
STPA starts with defining the purpose of the analysis, including system-level losses, hazards and constraints. Losses involve something of value to the stakeholders whose loss is unacceptable. A hazard is a system state or set of conditions that, together with a particular set of worst-case environmental conditions, will lead to a loss. Finally, system constraints can be derived from the identified hazards; they specify system conditions or behaviors that need to be satisfied to prevent hazards and ultimately prevent losses [3].
Then, the control structure needs to be built to describe relationships and interactions by modelling the system as a set of control loops (shown in Figure 1).
The third step is to identify unsafe control actions, which will lead to a hazard in a particular context and worst-case environment [3], based on the previously built structure. Four ways of being unsafe are provided in STPA as guide words for the identification (listed in Table V).
Finally, loss scenarios, which describe the causal factors that can lead to unsafe control actions, will be identified. Two types of loss scenarios must be considered, which are scenarios that lead to unsafe control actions and scenarios in which control actions are improperly executed or not executed [3]. Each identified scenario represents a system failure which needs to be controlled by designers.
STPA-Sec, as the extension for security considerations, shares the same basic steps. Vulnerabilities, instead of hazards, are identified in the first step since vulnerabilities lead to security incidents, just as hazards lead to safety incidents [4]. Similarly, the finally identified loss scenarios represent system vulnerabilities which need to be controlled.

B. STPA-DFSec Approach
The proposed STPA-DFSec follows the general STPA framework but introduces a data-flow-based structure for information security considerations. The main steps are described as follows.
1) Step 1 - Define the purpose of the analysis: As in STPA-Sec, the first step of the analysis is to identify system-level losses, vulnerabilities and constraints in order to determine which unacceptable results need to be avoided at the system strategy level.
General security attributes like Confidentiality, Integrity and Availability (C.I.A. Triad Model) are guide words for the vulnerability identification, which classify the security problems and elicit potential vulnerabilities.
Furthermore, to help identify losses, STPA-DFSec provides general guidance for loss identification based on a study of various safety- and security-related definitions from industry standards and technical documents, including ISO 26262 [29], the EVITA project report [6] and the J3061 guideline [5]. All possible losses at a high level of abstraction are listed in Table II. The losses of a particular case are a subset of this general list and can be described concretely according to the practical situation.
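The linking of vulnerabilities to losses and security attributes in Step 1 can be checked mechanically. The following is a minimal sketch, not part of the proposed approach itself: the class `Vulnerability`, the function `check_traceability` and all loss and vulnerability entries are hypothetical names and values chosen for illustration, and they are not taken from Table II or Table VI.

```python
from dataclasses import dataclass

# The C.I.A. triad used as guide words for vulnerability identification.
CIA = ("confidentiality", "integrity", "availability")


@dataclass(frozen=True)
class Vulnerability:
    vid: str
    description: str
    losses: tuple      # IDs of the system-level losses this vulnerability can lead to
    attributes: tuple  # violated security attributes, drawn from CIA


def check_traceability(losses, vulnerabilities):
    """Return (losses without any vulnerability, links to unknown losses, unknown attributes)."""
    covered = {lid for v in vulnerabilities for lid in v.losses}
    uncovered = sorted(set(losses) - covered)
    dangling = sorted(
        (v.vid, lid) for v in vulnerabilities for lid in v.losses if lid not in losses
    )
    bad_attrs = sorted(
        (v.vid, a) for v in vulnerabilities for a in v.attributes if a not in CIA
    )
    return uncovered, dangling, bad_attrs


# Hypothetical entries for illustration only; not the paper's actual tables.
LOSSES = {
    "L1": "Loss of physical property",
    "L2": "Loss of sensitive information",
    "L3": "Loss of service",
}
VULNERABILITIES = [
    Vulnerability("V1", "Doors can be unlocked without authorization",
                  ("L1",), ("integrity",)),
    Vulnerability("V2", "Key material is exposed to third parties",
                  ("L1", "L2"), ("confidentiality",)),
]

if __name__ == "__main__":
    print(check_traceability(LOSSES, VULNERABILITIES))
```

In this illustrative data set the check reports that loss L3 is not yet linked to any vulnerability, which would prompt the analyst either to add a vulnerability or to drop the loss from the scope.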

2) Step 2 - Model Functional Interaction Structure: Instead of the control structure, a Functional Interaction Structure (FIS) based on data flows is created to interpret how the system works from the perspective of data flows. The basic element of the FIS is the Function, which operates on its inputs using its internal algorithms and outputs the processing results. The processing environment, together with inputs and algorithms, affects function behaviors and final outputs. Inputs and outputs, instead of control actions and feedback, are the interactions between components in the FIS. Figure 2 shows a general interaction structure and the function element, in which arrows represent data flows.
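One way to picture the FIS is as a directed graph whose nodes are functions and whose edges are data flows. The sketch below is an assumption about how such a structure could be recorded, not the notation of Figure 2; the class `FunctionalInteractionStructure`, its methods and the example function names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionalInteractionStructure:
    """Directed graph: nodes are functions, edges are data flows."""
    functions: set = field(default_factory=set)
    flows: list = field(default_factory=list)  # (source, target, data) triples

    def add_flow(self, source, target, data):
        # Registering a flow also registers both end points as functions.
        self.functions.update((source, target))
        self.flows.append((source, target, data))

    def inputs_of(self, function):
        """Data flows entering a function, as (source, data) pairs."""
        return [(s, d) for s, t, d in self.flows if t == function]

    def outputs_of(self, function):
        """Data flows leaving a function, as (target, data) pairs."""
        return [(t, d) for s, t, d in self.flows if s == function]

    def sources(self):
        """Functions without incoming flows; their inputs come from outside the model."""
        targets = {t for _, t, _ in self.flows}
        return sorted(self.functions - targets)


# Hypothetical fragment of a digital key data flow (names are illustrative).
fis = FunctionalInteractionStructure()
fis.add_flow("P: encrypt command", "P: transmit", "encrypted command")
fis.add_flow("P: transmit", "D: decrypt command", "encrypted command")
fis.add_flow("D: decrypt command", "D: process command", "plain command")

if __name__ == "__main__":
    print(fis.inputs_of("D: decrypt command"))
    print(fis.sources())
```

Keeping inputs and outputs queryable per function matches the idea that a function's behavior depends on its inputs, its algorithm and its environment, and it gives the later steps a list of functions to iterate over.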

3) Step 3 - Identify Insecure Function Behaviors: Based on the FIS, we can identify Insecure Function Behaviors (IFB) for the target system by following the basic technique in STPA. Insecure behaviors are identified with the help of Guide Words (GW), which are slightly modified to fit the proposed approach.
Notes to the IFB identification table:
1 Not being executed: the function is not executed successfully.
2 Being executed: the function is executed under improper conditions (e.g. incorrect inputs or algorithms, process with data leakage risks).
3 Timing issues: the execution exceeds the timing limits.
4 S Fn IFBm is the label of each IFB, in which S represents the subject of the function.
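The guide-word technique in Step 3 amounts to crossing every function of the FIS with every guide word and letting the analyst judge each combination. A minimal sketch of that enumeration follows. The underscore-joined label format, the `Function` class and the example functions are assumptions made for illustration, and the one-to-one numbering of IFBs by guide word is a simplification.

```python
from dataclasses import dataclass

# Guide words of STPA-DFSec for insecure function behaviors.
GUIDE_WORDS = ("not being executed", "being executed", "timing issues")


@dataclass(frozen=True)
class Function:
    subject: str  # component hosting the function, e.g. "D" for door controller
    index: int    # function number n in the label
    name: str


def ifb_candidates(functions):
    """Build one candidate row per (function, guide word) pair for analyst review."""
    rows = []
    for fn in functions:
        for m, guide_word in enumerate(GUIDE_WORDS, start=1):
            # Assumed label layout: subject, function number, IFB number.
            label = f"{fn.subject}_F{fn.index}_IFB{m}"
            rows.append((label, fn.name, guide_word))
    return rows


# Hypothetical functions for illustration.
FUNCTIONS = [
    Function("P", 1, "encrypt command"),
    Function("D", 2, "decrypt command"),
]

if __name__ == "__main__":
    for row in ifb_candidates(FUNCTIONS):
        print(row)
```

Each generated row is only a prompt: the analyst still decides whether the combination describes a behavior that can lead to a system-level vulnerability and, if so, writes the concrete IFB description.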

4) Step 4 - Identify Loss Scenarios: Finally, Loss Scenarios (LS), as possible causes of IFBs, are identified by using optimized guide words. Table IV is the template for identifying LSs with two classes of guide words. The Function itself class helps to identify scenarios caused by unexpected behaviors inside the function, while the Execution environment (Env) class refers to external conditions outside the function. Each loss scenario represents a system vulnerability which should be controlled by designers or operators. Detailed system constraints can also be derived from loss scenarios by simply inverting the conditions of the loss scenarios or by defining what the system must do in case the incident occurs [3]. System constraints are inputs to further design phases.
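The Step 4 bookkeeping can be sketched as a record per loss scenario that carries its guide word and the constraint derived from it. The guide-word classes below follow the summary of the approach (function itself; execution environment including function inputs, calling behaviors, computing resources and links). The class and function names are hypothetical, the assignment of the example scenario to the "computing resources" guide word is an assumption, and the constraint text is an illustrative inversion written for this sketch.

```python
from dataclasses import dataclass

# Two classes of guide words for loss scenario identification.
GUIDE_WORD_CLASSES = {
    "function itself": ("function itself",),
    "execution environment": (
        "function inputs",
        "calling behaviors",
        "computing resources",
        "links",
    ),
}


@dataclass
class LossScenario:
    label: str
    guide_word: str
    description: str
    constraint: str = ""  # written by the analyst, e.g. by inverting the scenario


def guide_word_class(guide_word):
    """Return the class a guide word belongs to, or raise if it is unknown."""
    for cls, words in GUIDE_WORD_CLASSES.items():
        if guide_word in words:
            return cls
    raise ValueError(f"unknown guide word: {guide_word}")


def missing_constraints(scenarios):
    """Labels of loss scenarios that have no derived system constraint yet."""
    return [s.label for s in scenarios if not s.constraint.strip()]


SCENARIOS = [
    LossScenario(
        "P/S/D F2 IFB3 LS3",
        "computing resources",  # assumed guide word for this scenario
        "Computing resource is occupied to cause violation of execution timing limitations.",
        # Illustrative constraint, not quoted from the paper.
        "Computing resources needed by the function must stay available so that timing limits are met.",
    ),
    LossScenario("EXAMPLE LS", "links", "Hypothetical scenario without a constraint yet."),
]

if __name__ == "__main__":
    print(guide_word_class(SCENARIOS[0].guide_word))
    print(missing_constraints(SCENARIOS))
```

A completeness check of this kind does not replace analyst judgment, but it makes visible which scenarios have not yet been turned into inputs for the later design phases.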

C. Summary
Table V summarizes the process steps of both STPA-DFSec and STPA-Sec and highlights the differences, in which '+' denotes added features of STPA-DFSec and '*' represents modified steps in comparison with the original STPA-Sec.

A. Use Case Definition and Assumption
In this section, a Bluetooth digital key system of a vehicle is introduced as the target system in this research. The system consists of three main physical components and aims to lock or unlock vehicle doors by using smartphones. Communication between different entities takes place over wireless channels and is protected by cryptographic mechanisms. The system sketch and the sequence diagram of the two main services are shown in Figure 3 to describe how this system works.
In this example case, we assume that the connections between components have been established via Wi-Fi or Bluetooth in advance, but the connections are not guaranteed to be secure, and the prerequisites in Figure 3 are regarded as trusted. In this research, we only focus on security issues, which means that the system is assumed to work properly in the absence of intended external disturbances, and that system development errors and random hardware failures are out of scope.

B. Analysis by STPA-DFSec
The analysis of the target system by using STPA-DFSec is presented in this section. First, system-level Losses (L), Vulnerabilities (V) with linked losses and security attributes and derived System Constraints (SC) are listed in Table VI.
Second, the functional interaction structure is created in Figure 4 based on the system data flows. Two functions with identified IFBs are shown in Table VII as examples. In contrast to most traditional approaches, this analysis includes participants (user and manufacturer) outside the physical system boundary. Functions in a human operator can also be refined into more detailed movements like 'decision in the mind', 'pressing button' or 'recording password'. Since we focus on the physical part in this analysis, human movements are simplified as one human operation function. Note that the first letter of the IFB labels in Table VII represents system components including smartphone (P), cloud server (S), door controller (D), user (U) and manufacturer (M). Finally, LSs are identified for each IFB. Example LSs with related guide words in brackets are listed in Table VIII. Note that the IFBs and LSs of various subjects are merged to make the list concise, since different system components may contain the same functions and different insecure behaviors may have the same causes. In practice, it is also meaningful to describe each function or LS separately to derive security constraints for the responsible engineers, who might only have access to a part of the design information for security reasons.

Table V: Process steps of STPA-DFSec and STPA-Sec

Step 1 - Define the purpose of the analysis
    STPA-DFSec: Identify system-level losses, vulnerabilities and constraints; link vulnerabilities with corresponding losses and security attributes (+). A general losses list is provided (+).
    STPA-Sec: Identify system-level losses, vulnerabilities and constraints.
Step 2 - Model the system structure
    STPA-DFSec: Model the system by a functional interaction structure based on data flows (*).
    STPA-Sec: Model the system by a functional control structure based on control loops.
Step 3 - Identify insecure items
    STPA-DFSec: Use modified guide words (*) (not being executed, being executed and timing issues) to identify insecure function behaviors.
    STPA-Sec: Use guide words (not providing, providing, too early, too late, out of order, stopped too soon, applied too long) to identify insecure control actions.
Step 4 - Identify loss scenarios
    STPA-DFSec: Use modified guide words (*) (function itself, execution environment (incl. function inputs, calling behaviors, computing resources and links)) to identify loss scenarios.
    STPA-Sec: Use guide words (unsafe controller behavior, inadequate feedback and information, involving the control path, related to the controlled process) to identify loss scenarios.

C. Analysis by STPA-Sec
We also analyzed this use case by STPA-Sec. Because the system model is the same, the identified losses, hazards and system constraints are the same as those in the STPA-DFSec analysis. Therefore, the work here starts with drawing the system control structure shown in Figure 5, and then Insecure Control Actions (ICA) and the corresponding loss scenarios are identified (see Table X).

D. Outcome Comparison
The functions and control actions are the basic elements in STPA-DFSec and STPA-Sec respectively. Normally, a control action includes several functions to provide a service. For example, the control action D CA1 (Door controller registers at the server) consists of functions of data process, transmission, encryption and decryption. Therefore, the relationship between these two elements is that a sequence of function executions forms a control action. To find out how the two approaches work on the same use case, we mapped the outcomes of both analyses onto each other. For example, D CA1 ICA3 LS1 and D CA1 ICA3 LS3 in the STPA-Sec analysis can be mapped to P/S/D F2 IFB3 LS1. U CA2 ICA1 LS1 and U CA2 ICA1 LS2 can be mapped to U F7 IFB2 LS1. After performing the mapping for all analysis outcomes, we found that each loss scenario in the STPA-Sec analysis can be mapped to a corresponding one in the STPA-DFSec analysis from the perspective of data process, which means that the proposed approach can find all possibilities identified by the original approach. Furthermore, more loss scenarios can be identified in the STPA-DFSec analysis. For example, the scenario 'Computing resource is occupied to cause violation of execution timing limitations.' (P/S/D F2 IFB3 LS3 in Table VIII) cannot be identified by STPA-Sec. Therefore, more technical details related to the data process can be revealed by the data-flow-based analysis. Another finding is that one STPA-DFSec loss scenario can be mapped to several STPA-Sec ones, because a function is called by various control actions for different applications. This explains why STPA-Sec can reveal more detailed information from the perspective of applications.
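The mapping argument above can be illustrated with the few label pairs quoted in the text. The sketch below only encodes those quoted examples, not the full outcome tables, and the helper names `unmapped_sec`, `dfsec_only` and `invert` are hypothetical.

```python
from collections import defaultdict

# Mappings quoted in the text: STPA-Sec loss scenario -> STPA-DFSec loss scenario.
SEC_TO_DFSEC = {
    "D CA1 ICA3 LS1": "P/S/D F2 IFB3 LS1",
    "D CA1 ICA3 LS3": "P/S/D F2 IFB3 LS1",
    "U CA2 ICA1 LS1": "U F7 IFB2 LS1",
    "U CA2 ICA1 LS2": "U F7 IFB2 LS1",
}

# STPA-DFSec scenarios mentioned in the text (an excerpt, not all of Table VIII).
DFSEC_SCENARIOS = {
    "P/S/D F2 IFB3 LS1",
    "P/S/D F2 IFB3 LS3",
    "U F7 IFB2 LS1",
}


def unmapped_sec(sec_scenarios, mapping):
    """STPA-Sec scenarios without a counterpart in the STPA-DFSec analysis."""
    return sorted(s for s in sec_scenarios if s not in mapping)


def dfsec_only(dfsec_scenarios, mapping):
    """STPA-DFSec scenarios that no STPA-Sec scenario maps to."""
    return sorted(set(dfsec_scenarios) - set(mapping.values()))


def invert(mapping):
    """Group STPA-Sec scenarios by the STPA-DFSec scenario they map to."""
    inverse = defaultdict(list)
    for sec, dfsec in mapping.items():
        inverse[dfsec].append(sec)
    return {dfsec: sorted(secs) for dfsec, secs in inverse.items()}


if __name__ == "__main__":
    print(unmapped_sec(SEC_TO_DFSEC.keys(), SEC_TO_DFSEC))
    print(dfsec_only(DFSEC_SCENARIOS, SEC_TO_DFSEC))
    print(invert(SEC_TO_DFSEC))
```

On this excerpt, every STPA-Sec scenario has a counterpart, P/S/D F2 IFB3 LS3 appears only on the STPA-DFSec side, and the inverse mapping is one-to-many, which mirrors the three findings stated in the paragraph.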

E. Discussion
Based on the comparison, we summarize the differences and highlights of both approaches. STPA-DFSec focuses on information flows and discusses possible vulnerabilities along the whole data flow paths, which helps to identify more detailed loss scenarios from the perspective of information flows. By contrast, since control actions in STPA-Sec are derived from system functionalities, STPA-Sec can reveal more insecure details linked to concrete application scenarios. STPA-DFSec addresses where (in which function) a loss scenario occurs, while STPA-Sec addresses when (in which application scenario) a loss scenario occurs.
Since the two approaches have different advantages, the choice of approach depends on the particular case. Two principles can support the decision. The first concerns the system purpose. If data is the core asset of a system, STPA-DFSec is suitable for analyzing insecure issues with more consideration of the information. If providing proper and secure services is the main objective of a system, STPA-Sec is applicable to identify insecure issues linked to application scenarios. The second principle is to consider who uses the approach. STPA-DFSec is suitable for designers who are responsible for the technical structure and design, while STPA-Sec is more useful for those who design the system functionalities and make higher-level decisions.
System security engineering cannot ensure absolute security; rather, it provides a sufficient base of evidence that supports claims that the expected level of trustworthiness has been achieved [1]. Likewise, an analysis in security engineering cannot be proven complete, and the analysis results normally depend on the analyst's knowledge and design emphasis. However, a proper systematic approach can help the analyst to be more confident in the completeness of the analysis [2]. Proper guide words help to reduce the dependency on personal experience and subjective thinking and lead to objective and valid results with less effort. Although the case study in this paper represents the authors' understanding of the system, the analysis results are comparable and meaningful because both analyses were performed by the same group of analysts.

V. CONCLUSION
In this paper, we have proposed a data-flow-based approach for security analysis of information-critical systems based on the STPA framework to overcome STPA-Sec's limitations. The analyses of a vehicle digital key system by using both the STPA-DFSec and STPA-Sec have been presented and compared to show how to use the approaches and how well both approaches work on the same use case.
We have found that the proposed STPA-DFSec focuses on data flows and can reveal more details in information security aspects, which are hard to address in the STPA-Sec analysis, while STPA-Sec analyzes systems from the perspective of applications and is more concerned with safety-related security issues. Besides, since STPA-based approaches were created for high-level decisions rather than tactical details [2], the proposed STPA-DFSec extends considerations into lower levels with technical details. Furthermore, as an extension of the STPA series, the proposed approach, together with other STPA-based approaches, can be used to co-design complex systems across multiple disciplines from high to low system levels under the unified STPA framework. Social aspects and human factors, which are excluded in traditional analysis approaches, can be included in the analysis.
In the future, we will study more industry cases and conduct experiments with different groups of analysts to validate and refine the proposed approach in practice, since we performed both analyses in this paper ourselves, which might influence the validity of the analysis outcomes. Furthermore, we will formalize the analysis process and design tools to obtain analysis results automatically for higher working efficiency.