3-layer modelling method to improve the cyber resilience in Industrial Control Systems

Cyberattacks against Industrial Control Systems (ICS), which play a crucial role in many industrial domains, can cause environmental and safety damage as well as huge economic loss. To reduce such cybersecurity risks, it is necessary to improve cyber resilience, that is, the capability to respond to and recover from the threat when cyberattacks occur. To this end, we propose a 3-layer modelling method that reproduces ICS by the actor, asset, and process models. A modelling approach based on Petri-Nets is used to express the state of all models in time series and to quantify the availability of ICS under cyberattacks, considering the behavior of personnel involved in both cybersecurity and industrial operations. To demonstrate the effectiveness of the method, resilience curves, which draw the availability as a function of time, are calculated for multiple cases of cyberattacks in a hypothetical manufacturing factory. As a result, the production rate, which represents the availability of the manufacturing industry, is obtained in time series. The resilience curve can be used to determine what to do and when to do it to minimize the impact of cyberattacks, which leads to the improvement of cyber resilience.


Introduction
Industrial Control Systems (ICS), which consist of various assets (information components), are core infrastructures to monitor and control industrial processes. Examples of characteristic assets in ICS include the Programmable Logic Controller (PLC) and the Human Machine Interface (HMI). With the appearance of IIoT (Industrial IoT) and CPS (Cyber Physical Systems), modern ICS have actively adopted information technologies that expose a part of critical systems to external networks such as the Internet to obtain scalable connectivity [1,2].
From a security point of view, the attack surface and attack vectors are rapidly expanding, resulting in increased cyberattack risks. According to the report [3], the manufacturing industry, an ICS domain, was the most attacked domain in 2021, overtaking traditional IT domains like finance and insurance. In ICS, cyberattacks can cause environmental and safety damage as well as huge economic loss. Therefore, the defense strategy against cyberattacks in ICS is different from that in IT systems. In terms of the CIA triad, the best practice in IT systems is, in general, to protect confidentiality with the highest priority, while ICS give the highest importance to availability [4]; namely, maintaining stable and reliable operations is prioritized over anything else. As such, it is occasionally a preferred decision to postpone the removal of vulnerabilities found in ICS until the next scheduled downtime (e.g. a maintenance period) to avoid the loss of availability [5].
As cyberattacks targeting ICS increase, international standards and guidelines of cybersecurity for ICS, such as the IEC 62443 series and NIST SP800-82 Rev.2 [6], have been developed, which contributes to the improvement of cybersecurity awareness in ICS. On the other hand, even if sufficient efforts are made to secure the system, cyberattacks still occur in many organizations [7], suggesting that it is difficult to completely eliminate the cybersecurity risk. ICS are also often targeted by Advanced Persistent Threat (APT) groups, that is, adversaries who have sophisticated expertise and abundant resources. APT groups sometimes utilize multiple zero-day vulnerabilities and conduct continuous attacks with sophisticated persistence techniques that avoid detection for years [8].
Take Stuxnet [9,10] for example. By masking the fact that PLCs were infected, the malware modified the control data unnoticed by operators for a long period. As a result, it is believed that physical damage was caused to Iranian nuclear centrifuges. A contrasting example is the Florida water plant incident in 2021. The story of the incident is briefly summarized in Ref. [11] and references therein. The attacker gained remote access to the system and attempted to alter chemical values in the water supply. Fortunately, an operator noticed the abnormality almost in real time and reverted the changes before actual damage was caused.
As can be seen in the above two examples, there is often a time lag between when a cyberattack is conducted and when the physical damage becomes visible. Therefore, the resultant impact differs greatly depending on whether the anomaly is noticed at an early stage and whether the response and recovery procedure is conducted properly.
To cope with advanced cyberattacks, the capability to keep delivering the intended outcome by rapidly responding to and recovering from the threat is required. In other words, only decreasing the attack surface and attack vectors is no longer enough, and actions after the incident must be well prepared as well. This capability is referred to as cyber resilience [12].
Cyber resilience can be quantified using the resilience curve, which describes the system performance $P$ in time series. Figure 1 shows an example of the resilience curve, which is drawn based on Wei and Ji [13]. The performance $P$ can be linked to the availability of the system. Therefore, the cumulative impact $I_c$, the maximum impact $I_m$, and the downtime $T$ represent the loss of availability caused by the cyberattack. The smaller the loss of availability is, the higher the cyber resilience is. Sophisticated attackers like APT groups deceive various personnel in ICS, such as operators, engineers, and maintenance teams as well as defenders (i.e. teams responsible for incident response), aiming to maximize the impact.
The resilience curve allows us to visually understand the overview of the incident in time series with ease. For example, the time lag between the cyberattack occurrence and the beginning of the performance degradation can be seen between $t = t_0$ and $t = t_1$ in Figure 1. In the Florida water plant incident in 2021, the operator noticed the incident and handled the threat properly during this phase. As such, no loss of availability, which is related to the safety of people, was observed.
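The three quantities read from the resilience curve can be sketched numerically. The snippet below is a minimal illustration, not the definition used by Wei and Ji [13]: it assumes that the cumulative impact is the area between the normal performance level and the curve, that the maximum impact is the deepest drop, and that the downtime is the span between the first and last degraded samples. The function name `resilience_metrics` and its parameters are hypothetical.

```python
from typing import Sequence, Tuple


def resilience_metrics(
    times: Sequence[float],
    perf: Sequence[float],
    baseline: float = 1.0,
    tol: float = 1e-9,
) -> Tuple[float, float, float]:
    """Return (cumulative impact, maximum impact, downtime) of a sampled curve.

    Assumption: `baseline` is the performance during normal operation.
    """
    # Performance loss at each sample, clipped at zero.
    loss = [max(baseline - p, 0.0) for p in perf]
    # Cumulative impact: trapezoidal integral of the loss over time.
    cumulative = sum(
        0.5 * (loss[i] + loss[i + 1]) * (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    )
    # Maximum impact: the deepest drop below the baseline.
    maximum = max(loss)
    # Downtime: span between the first and last degraded samples.
    degraded = [t for t, l in zip(times, loss) if l > tol]
    downtime = degraded[-1] - degraded[0] if degraded else 0.0
    return cumulative, maximum, downtime
```

With this reading, a smaller value of each of the three outputs corresponds to a higher cyber resilience.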
The resilience curve also reflects the behavior of personnel involved not only in cybersecurity but also in industrial operations. Therefore, potential risks hidden behind industrial operations might be discovered by comprehensively evaluating the impact of expected cyberattacks using the resilience curve.
Using the resilience curve, experts can discuss what to do and when to do it to minimize the impact of cyberattacks, which leads to the improvement of cyber resilience. However, manually creating the resilience curve requires considerable time and deep knowledge of both cybersecurity and industrial operations. So far, no consensus has been established on the method to calculate the resilience curve in ICS.
This research proposes a modelling method that automatically reproduces the resilience curve for cyberattacks in ICS, including the behavior of personnel involved in both cybersecurity and industrial operations. Although this paper mainly focuses on introducing the concept of the modelling method covering overall ICS domains, the application to the manufacturing industry, which was one of the most attacked industries in 2021 [3], is considered as the first step. To show the effect of the proposed method, we demonstrate that cyber resilience can be evaluated for multiple types of cyberattacks in a hypothetical manufacturing factory and discuss the differences between the cases.
This paper is organized as follows. To introduce the overview of the modelling method, we first explain the requirements of the model, prior research, and the modelling approach in Section 2. Section 3 then describes the technical details of the modelling method. Section 4 demonstrates case studies to evaluate cyber resilience using the resilience curve. Finally, Sections 5 and 6 give the discussion and the conclusion, respectively.

Requirements of the model
As mentioned in the last section, the whole course of the cyberattack in ICS, considering various personnel in different roles, needs to be reproduced. Personnel in ICS can be divided into two types: those who are involved in industrial operations, and those who are related to cybersecurity. The former, who include operators, engineers, maintenance teams, and so on, take responsibility for protecting the availability of ICS as a routine activity. The latter are groups or people, such as attackers and defenders, that cause unexpected impacts on the availability through cybersecurity events. In this paper, those two types of personnel are grouped as actors.
To observe how the target system reacts to the behavior of actors, it is also essential to build the model of assets in ICS. Namely, assets can be regarded as infrastructures or environments where actors conduct actions. Assets include ICS-specific devices (e.g. PLC and HMI) as well as enterprise IT devices such as workstations and servers running Windows® and Linux™.
Last but not least, the modelling of the industrial process, which describes how the process of the target system is conducted, is also required. The model of the industrial process is used mainly for the calculation of the performance in the resilience curve. Let us consider an example from the manufacturing industry. There are several factors related to the performance in the manufacturing industry, such as adherence to plan, operating time, and the proportion of defective products [14]. The performance in the resilience curve is determined by the factor whose impact must be minimized during incidents, and therefore cannot be uniquely defined. Moreover, as discussed later, there are cases where the balance between multiple performance factors must be taken into consideration. A consensus must be reached through discussion within the organization.
In this paper, the capability to manufacture products as scheduled, which is particularly related to adherence to plan, is considered to demonstrate an example of the modelling. Specifically, the model of the industrial process is created so that the production rate, that is, the amount of production per unit time, can be drawn in time series during the incident as in Figure 1.
In summary, to reproduce the resilience curve for cyberattacks in ICS, it is essential to establish a modelling method for actors involved in both cybersecurity and industrial operations, assets, and industrial processes. The state of all three models also needs to be reproduced in time series.

Prior research
The academic literature related to the above three models (actors, assets, and industrial processes) is briefly reviewed here.
Firstly, we focus on actors related to cybersecurity, which have been actively studied so far. Modelling methods of the attacker's behavior are summarized in Ref. [15]. Commonly used approaches are the Cyber Kill Chain® framework [16], which defines stages that attackers must complete to achieve the goal, and the MITRE ATT&CK® framework [17], which describes Tactics, Techniques and Procedures (TTPs) of attackers. Cyberattack scenarios that describe the attacker's behavior until achieving the goal can be reproduced by mapping TTPs to the model of assets in a structured expression [18]. In Ref. [19], TTPs from the perspective of defenders are also used to analyze the mission impact caused by the attacker.
The model of actors involved in industrial operations is introduced in the resilience curve of Wei and Ji [13]. The behavior of operators is reproduced by activities to monitor and operate the asset in ICS, which is linked to a part of the industrial process. Berger et al. [20] also propose a modelling method based on Petri-Nets to reproduce the time-dependent behavior of the asset interacting with the industrial process, and successfully evaluate the availability risk in a smart factory. The impact caused by stochastic cyberattacks or errors is expressed using the total number of manufactured products.
As introduced above, various approaches are taken to build the model of the actor, asset, and industrial process. However, those methods are not designed to automatically calculate the resilience curve considering the behavior of personnel related to both cybersecurity and industrial operations simultaneously. To the best of our knowledge, no modelling method that satisfies the requirements specified in the last subsection has been established yet.

Modelling approach
In Figure 2, the modelling approach that satisfies the requirements is described. Firstly, ICS are modelled by the 3-layer architecture consisting of the actor, asset, and process models. These models correspond to the digital representations of the actor, the asset, and the industrial process, respectively. In the layer of the actor model, the behavior of personnel related to both industrial operations and cybersecurity is modelled. The actor model changes the state of the asset model reproduced in the middle layer. The asset model plays the role of the driving force in the process model. Therefore, the condition of the process model is determined by the state of the asset model. Note that the interaction of models within the same layer also exists, as explained in the next section.

Figure 2. Overview of the modelling approach to reproduce the resilience curve. Firstly, ICS are modelled by the 3-layer modelling method, which reproduces digital representations of actors, assets, and industrial processes. Note that the model of actors includes personnel involved in both cybersecurity and industrial operations. Secondly, the state of those models is repeatedly updated by reflecting attack and defense strategies. Finally, resilience curves are calculated from multiple viewpoints.

Secondly, the state of the model in all three layers is repeatedly updated in time series. The impact that the process model receives eventually returns to the actor model through the asset model. Therefore, by repeating the impact propagation among the three layers in simulation, the state of the model at any given time can be obtained. Strategies of the cyberattack and the incident response to be evaluated by the resilience curve are reflected in the behavior of the actor model responsible for cybersecurity, i.e. attackers and defenders.
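The repeated update of the three layers can be illustrated with a deliberately small time-stepped loop. This is a toy sketch under strong assumptions, not the PN-based implementation of this paper: there is a single hypothetical asset, the attacker only performs a DoS, and the operator and defender are reduced to two fixed delays. The function `simulate` and its parameters are illustrative names.

```python
def simulate(steps: int, attack_at: int, notice_delay: int, recover_delay: int):
    """Toy 3-layer loop: actors change assets, assets gate the process."""
    asset_state = "running"   # asset layer: one hypothetical PLC
    noticed_at = None         # actor layer: time the operator noticed the stop
    produced = []             # process layer: units produced at each step
    for t in range(steps):
        # Actor layer -> asset layer: the attacker stops the asset (DoS).
        if t == attack_at:
            asset_state = "stopped"
        # Asset layer -> actor layer: the operator notices after a delay.
        if (asset_state == "stopped" and noticed_at is None
                and t >= attack_at + notice_delay):
            noticed_at = t
        # Actor layer -> asset layer: the defender recovers after a delay.
        if (asset_state == "stopped" and noticed_at is not None
                and t >= noticed_at + recover_delay):
            asset_state = "running"
        # Asset layer -> process layer: production needs a running asset.
        produced.append(1 if asset_state == "running" else 0)
    return produced
```

For example, `simulate(10, attack_at=3, notice_delay=2, recover_delay=2)` yields a production gap of four steps; lengthening `notice_delay` widens the gap, which mirrors how a delayed detection enlarges the downtime.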
Lastly, the resilience curve is calculated from time-dependent states of the model in the three layers. In principle, the performance can be obtained from any of the three models. As mentioned in the previous subsection, the performance in the resilience curve depends on the choice of the organization, which could require multiple resilience curves with different performance factors as shown in Figure 2. In this research, for simplicity, the production rate is extracted from the process model.
Modelling approaches similar to the 3-layer architecture in Figure 2 have been developed in prior research (e.g. [13,19]) as well, although the purpose and technical details are different. Such layered architectures are beneficial in terms of the extensibility of the model [19]. While this study focuses on the manufacturing domain as the first target, the application field will be extended to other domains in the future. By defining a common interface between layers, a model can easily be replaced layer by layer, which reduces the workload of the model construction.

Method
In this section, the technical details of the 3-layer modelling method are explained. To reproduce the resilience curve, the state of models in all three layers needs to be expressed in time series. This study employs the modelling approach based on Petri-Nets (PN) [20,21], which is called the PN-based model hereafter.
The PN-based model is an extension of Petri-Nets, which were developed by Carl Adam Petri [22]. The PN-based model was initially proposed to describe the time-dependent behavior of assets in smart factories [21], and then extended to the modelling of the industrial process (production process) [20]. The main features of the PN-based model are functions to express (1) time-dependent states using firing delays of transitions; and (2) interactions among models by the guard function that defines the condition to enable the firing of transitions. In this paper, those two functions are applied to the modelling of actors as well.
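The two features named above, firing delays and guard functions, can be sketched as follows. This is a minimal illustration and not the mathematical specification of Ref. [21]: it assumes discrete time, one input and one output place per transition, and a delay counted from the moment a transition becomes enabled. The classes `Transition` and `TimedNet` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Transition:
    name: str
    source: str                                  # input place
    target: str                                  # output place
    delay: int = 0                               # firing delay in time steps
    guard: Callable[[], bool] = lambda: True     # extra enabling condition
    enabled_since: Optional[int] = None


@dataclass
class TimedNet:
    marking: Dict[str, int]                      # tokens per place
    transitions: List[Transition] = field(default_factory=list)

    def step(self, now: int) -> List[str]:
        """Advance to time `now` and return the names of fired transitions."""
        fired = []
        for tr in self.transitions:
            enabled = self.marking.get(tr.source, 0) > 0 and tr.guard()
            if not enabled:
                tr.enabled_since = None          # the delay timer is reset
                continue
            if tr.enabled_since is None:
                tr.enabled_since = now
            if now - tr.enabled_since >= tr.delay:
                # Fire: move one token from the input to the output place.
                self.marking[tr.source] -= 1
                self.marking[tr.target] = self.marking.get(tr.target, 0) + 1
                tr.enabled_since = None
                fired.append(tr.name)
        return fired
```

A guard is simply a callable that inspects the state of another model, which is how one model can enable or block transitions of another in this sketch.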
Note that the PN-based model introduced in this paper is simplified as much as possible so that it covers only the minimum functions necessary for the case study in the next section. The goal of this paper is to demonstrate the effectiveness of the proposed method outlined in Figure 2. In addition, the introduction of the PN-based model centres on the actor model, which is the most important for the discussion of the case study. For the other two layers, only the outline is explained. See Ref. [21] for the mathematical specification of the PN-based model.

Actor model
For the minimum setup of the actor model, the time-dependent behavior of the attacker, the operator, and the defender is reproduced using the PN-based model.

Attacker
A typical action flow of the attacker is modelled by TTPs [18,19], which can also be reproduced with Petri-Nets (e.g. [23]). Figure 3(a), together with Tables 1 and 2, shows the PN-based model of the attacker, which is created using the Cyber Kill Chain framework [16] and the MITRE ATT&CK framework [17]. The state of the attacker is determined by the marking, which represents the distribution of tokens over places. The state changes to the next phase when a transition fires. In other words, places and transitions are regarded as the states of the attacker and the actions taken by the attacker, respectively.
In Figure 3(a), the attacker is in the state Weaponization as the token is in the place $p_2$, which corresponds to Weaponization in Table 2. After completing the weaponization, the transition $t_2$ fires, which means that the attacker takes the action Exploit to move to the target asset. The state of the attacker then changes to Information Gathering, which means that the exploit has succeeded and the post-exploitation phase has begun.
The attacker model conducts one cycle of TTPs in Figure 3(a) on each asset, and then moves to the next target. The attacker model in the state Information Gathering can choose the action Do nothing by firing the transition $t_3$. In this case, the attacker model compromises the asset model just for lateral movement to reach the next target. Therefore, the asset model does not exhibit unexpected behavior, although it has been compromised.
A characteristic feature of the attacker model is that the action of the attacker includes both DoS and Alarm suppression. While DoS is an attempt to make the target asset inoperable, Alarm suppression is an ICS-specific attack that hides the abnormal behavior of the target asset so that operators do not detect the incident. When evaluating sophisticated cyberattacks, one possible strategy of the attacker model is conducting the Alarm-Suppression (AS) attack in the early stage and then the DoS attack in the last stage.

Figure 3. PN-based models of (a) the attacker, (b) the operator, and (c) the defender, with Tables 1 and 2.

As mentioned above, the PN-based model can define firing delays to reproduce the time-dependent behavior of the attacker. The duration of each state is determined by the firing delay time of each transition. For example, when the firing delay time of the transition Exploit is set to one hour, the attacker model spends one hour preparing the weapon (e.g. tool and script) to exploit the next target asset.
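How firing delays translate into a timeline of attacker actions can be sketched as a running sum. The action names follow the attacker model above, but the delay values other than the one-hour Exploit example are made up for illustration, and `attacker_timeline` is a hypothetical helper rather than part of the PN-based model.

```python
def attacker_timeline(plan, start=0.0):
    """Return (action, firing time) pairs for a sequence of delayed actions.

    plan: list of (action, firing delay in hours); each delay is the time
    spent in the state that precedes the action.
    """
    timeline, now = [], start
    for action, delay in plan:
        now += delay
        timeline.append((action, now))
    return timeline


# Hypothetical strategy: Alarm suppression early, DoS in the last stage.
plan = [("Exploit", 1.0), ("Alarm suppression", 0.5), ("DoS", 4.0)]
print(attacker_timeline(plan))
```

Changing a single delay shifts every later action, which is why the choice of firing delays shapes the time lag before the performance degradation begins.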

Operator
The PN-based model of the operator is shown in Figure 3(b) with Tables 1 and 2. The method to describe states and actions of operators and defenders is the same as for the attacker. In a normal operation period, the operator model in the state Regular operation continuously monitors the state of the asset model. When the asset model is in a state that the operator does not expect, the operator model changes its state from Regular operation to Investigation by firing the transition $t_7$ (Notice the incident). Once the investigation period completes, the operator model activates the defender model in the state On hold.

Defender
Defenders' typical action flows are also expressed using TTPs, as in the attacker model [19]. Therefore, the PN-based model can also be built for defenders, a simplified example of which is shown in Figure 3(c) with Tables 1 and 2. In this case, the defender model is initially on hold, and then activated by the operator model. This interaction rule between the operator model and the defender model means that the incident response does not begin unless the operator model notices the incident.
After the activation, the defender model conducts the actions Detect IoC (Indicator of Compromise) and Recover the asset, passing through the states Investigation and Incident handling, based on the pre-defined defense strategy. When these processes complete, the state of the asset model is recovered to the pre-cyberattack state.
Note that, in the real ICS environment, the incident is handled not by a sole defender team but by several teams that are assigned different roles and responsibilities. Ref. [24] defines the role and responsibility of each team, including Security Operation Centre (SOC), Facility Maintenance Team (FCR), etc., and summarizes the process flow that should be taken during the incident. However, this study creates only the defender model in Figure 3(c), which has the functions needed to complete the case study in the next section.

Asset model
The time-dependent state of the asset can also be modelled by the PN-based model. The difference from the original studies of the PN-based model [20,21] is that the state transition of the asset is triggered not randomly but by the interaction with the actor model. The PN-based model of the asset is created so that the action of the actor model can be reflected in the state. In other words, transitions of the asset model are partially synchronized with those of the actor model as shown in Figure 5. For example, when the transition labelled DoS in the attacker model fires, the corresponding transition labelled DoS in the target asset model is forced to fire and consequently the state becomes Service stopped. This control is performed by the guard function assigned to each transition [20].
Another important function is the dependency between asset models. Unexpected behavior of an asset model can propagate to other asset models. The maximum acceptable interruption time can also be defined to reproduce the propagation delay time of the impact [21].
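The propagation of an outage along asset dependencies can be sketched as below. This is an illustrative simplification, not the formulation of Ref. [21]: it assumes discrete time and that an asset stops once any asset it depends on has been stopped for at least its maximum acceptable interruption time. The function `propagate_outage`, the asset names, and the numbers are hypothetical.

```python
def propagate_outage(dependencies, max_interruption, failed_at, horizon):
    """Return the time at which each asset stops (None if it keeps running).

    dependencies: asset -> list of assets it depends on.
    max_interruption: asset -> tolerated duration of an upstream outage.
    failed_at: asset -> time at which it is stopped directly (e.g. by DoS).
    """
    stopped = dict(failed_at)
    for now in range(horizon + 1):
        for asset, upstream in dependencies.items():
            if asset in stopped:
                continue
            for up in upstream:
                # The outage propagates once the tolerance is exhausted.
                if up in stopped and now - stopped[up] >= max_interruption[asset]:
                    stopped[asset] = now
                    break
    assets = set(dependencies) | set(failed_at)
    return {a: stopped.get(a) for a in assets}


deps = {"PLC2": ["ControlServer"], "RobotB": ["PLC2"]}
tolerance = {"PLC2": 5, "RobotB": 2}
print(propagate_outage(deps, tolerance, {"ControlServer": 10}, horizon=30))
```

In this example the stop of the control server reaches PLC2 after 5 time steps and Robot B after a further 2, so the impact on the process appears with a delay.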
Let us introduce two examples of the interaction between the asset model and the actor model with different attack and defense strategies. Firstly, two actor models are assumed as shown in Figure 5.

Secondly, one defender with the defense strategy in Figure 5(b) is assumed. The defender model handles the incident so that the impact on the availability can be minimized. In this scenario, the attacker model has compromised the control server to reach the target asset PLC1. Although the state of the control server is Compromised, its function is working as expected. PLC2, which depends on the control server, also works properly. In this situation, it could be a preferred decision to postpone the recovery of the control server until an appropriate time to avoid an unexpected impact on the dependent PLC2. In general, sudden unexpected stops of control devices can lead to the loss of safety as well. In Figure 5(b), the defense strategy taken by the defender model is to recover only PLC1 as the first countermeasure to minimize the loss of availability. If this decision is made in real ICS environments, it is essential to investigate all possible impacts by conducting careful risk assessments.

Process model
Finally, the modelling of the industrial process in the manufacturing industry is explained using the PN-based model as well. Figure 6 shows an example of the PN-based model for the industrial process of a production line where products are processed by two robots. In contrast to the actor model, whose state is determined by the marking (Figure 3), tokens in the process model are regarded as products manufactured in the production line [20]. More detailed techniques to model the industrial process with Petri-Nets are also introduced in Ref. [25].
Whether transitions in ICS processes fire depends on the state of the asset models that are responsible for each process phase. For example, when Robot A in Figure 6 is controlled by the PLC suffering from the DoS attack as in Figure 5(a), the corresponding transition labelled Robot A cannot fire, which leads to a reduction of the production. In an actual factory, multiple production lines are usually operated to manufacture different types of products. In the proposed method, the process model is created for each production line.
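The gating of process transitions by asset states can be illustrated with a two-robot line in which tokens stand for products. This is a toy sketch, not the process model of Figure 6: it assumes that each robot handles one product per time step and that raw material is always available. `run_line` and the callback `robot_up` are hypothetical names.

```python
def run_line(steps, robot_up):
    """Simulate products moving through Robot A, then Robot B, to the goal.

    robot_up(name, t) reports whether the asset controlling the named robot
    is operating at time t. Returns the cumulative production per step.
    """
    buffer_ab = 0        # products waiting between Robot A and Robot B
    done = 0             # tokens that reached the goal place
    cumulative = []
    for t in range(steps):
        # Fire the downstream transition first, so that a product needs
        # one step per robot.
        if buffer_ab > 0 and robot_up("Robot B", t):
            buffer_ab -= 1
            done += 1
        # The transition "Robot A" can fire only while its asset is up.
        if robot_up("Robot A", t):
            buffer_ab += 1
        cumulative.append(done)
    return cumulative


def dos_on_robot_a(name, t):
    """Hypothetical outage: the PLC of Robot A is stopped for 2 <= t < 4."""
    return not (name == "Robot A" and 2 <= t < 4)


print(run_line(6, dos_on_robot_a))
```

The cumulative production flattens shortly after the outage of Robot A starts and resumes one step after the robot returns, which is the raw signal from which the production rate is derived.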
Let $N_{\mathrm{cum}}^{i}$ be the cumulative production of the $i$th production line. $N_{\mathrm{cum}}^{i}$ can be calculated by counting the number of tokens that have passed through all transitions as planned and reached the designated goal place. The production rate for the $i$th production line, denoted by $N_{\mathrm{rate}}^{i}$, is thus derived as the time derivative of $N_{\mathrm{cum}}^{i}$ as follows:

$$N_{\mathrm{rate}}^{i}(t) = \frac{\mathrm{d}N_{\mathrm{cum}}^{i}(t)}{\mathrm{d}t}. \tag{1}$$

Using Equation (1), the resilience curve can be drawn for each production line. It is also possible to calculate the total cumulative production $N_{\mathrm{cum}}^{\mathrm{tot}}$ and the total production rate $N_{\mathrm{rate}}^{\mathrm{tot}}$ by summing up the cumulative production of all production lines as

$$N_{\mathrm{cum}}^{\mathrm{tot}}(t) = \sum_{i=1}^{k} N_{\mathrm{cum}}^{i}(t), \tag{2}$$

$$N_{\mathrm{rate}}^{\mathrm{tot}}(t) = \frac{\mathrm{d}N_{\mathrm{cum}}^{\mathrm{tot}}(t)}{\mathrm{d}t}, \tag{3}$$

where $k$ is the total number of production lines. In addition, there are cases where the importance of products differs for each type from a business point of view. In such cases, the representative cumulative production $N_{\mathrm{cum}}^{\mathrm{rep}}$, which can be derived as follows, can be used:

$$N_{\mathrm{cum}}^{\mathrm{rep}}(t) = \sum_{i=1}^{k} \theta(i)\, N_{\mathrm{cum}}^{i}(t), \tag{4}$$

where the weighting function $\theta(i)$ represents the importance of products manufactured in the $i$th production line. Under the condition $\theta(i) = 1$ for all $i$, which means that the importance of all products is equal, Equation (4) is identical to Equation (2). Finally, the relative total production rate $\tilde{N}_{\mathrm{rate}}^{\mathrm{tot}}$ is defined as $N_{\mathrm{rate}}^{\mathrm{tot}}$ scaled by the average of the total production rate on a normal operation day without cyberattacks. In other words, $\tilde{N}_{\mathrm{rate}}^{\mathrm{tot}}(t) = 1$ indicates that products are manufactured as scheduled at time $t$.
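The aggregation over production lines described above can be written compactly. The helpers below are a sketch that assumes every production line is sampled at the same time points; the function names and the sample numbers are illustrative only.

```python
def total_cumulative(lines):
    """Sum the cumulative production of all lines at each time point."""
    return [sum(values) for values in zip(*lines)]


def representative_cumulative(lines, weights):
    """Weighted sum over lines; weights[i] plays the role of theta(i)."""
    return [
        sum(w * v for w, v in zip(weights, values))
        for values in zip(*lines)
    ]


def relative_rate(rate, normal_average):
    """Scale a production rate by its average in normal operation."""
    return [r / normal_average for r in rate]


line_1 = [0, 1, 2]   # hypothetical cumulative production of line 1
line_2 = [0, 2, 4]   # hypothetical cumulative production of line 2
print(total_cumulative([line_1, line_2]))
print(representative_cumulative([line_1, line_2], [2, 1]))
```

With all weights equal to 1, `representative_cumulative` returns the same values as `total_cumulative`, matching the relation between the weighted and unweighted sums stated in the text.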

Case study
To show the effect of the proposed method, case studies have been conducted by assuming a hypothetical manufacturing factory with 7 actor models (5 operator models, 1 attacker model, and 1 defender model), 24 asset models and 2 process models. The model of the factory discussed in this paper is artificially created so that the effectiveness of the 3-layer modelling method can be easily understood.
Asset models consist of both ICS-specific devices (PLC, HMI, etc.) to control and monitor the production line and enterprise devices in the IT system that are connected to the Internet. To improve the productivity, the ICS network where ICS-specific devices run is partly accessible from the IT department via a local area network. It is also assumed that several asset models include vulnerabilities so that attackers can initially intrude into the IT system from the Internet and then reach the ICS to attack the production line.
Since two process models exist, two different resilience curves can be obtained separately with Equation (1) in principle. However, for simplicity, the resilience curve with the production rate calculated by Equations (2) and (3) is used in the case study.
To observe various shapes of resilience curves, we prepare three different strategies of the attacker model and the defender model, which are summarized in Table 3. Figure 7(a) shows the cumulative production for those three cases, which is calculated by implementing the 3-layer modelling method as a computer program. The x-axis is time in arbitrary units, while the y-axis is the total number of manufactured products. Solid and dashed curves correspond to raw data and their moving average with the time window [t − 15, t + 15], respectively.

Table 3. Three different strategies of the attacker model and the defender model that are used to calculate cumulative productions and resilience curves in Figure 7.

To derive the resilience curve from the cumulative production numerically, the central differencing scheme is used; namely, Equation (3) can be read as

$$N_{\mathrm{rate}}^{\mathrm{tot}}(t) \approx \frac{N_{\mathrm{cum}}^{\mathrm{tot}}(t + \Delta t) - N_{\mathrm{cum}}^{\mathrm{tot}}(t - \Delta t)}{2\Delta t}. \tag{5}$$

Figure 7(b) shows resilience curves calculated using Equation (5) with Δt = 10, using the moving average in Figure 7(a). The x-axis is time in arbitrary units, while the y-axis is the production rate scaled by its average value over 110 ≤ t ≤ 190. In all cases, the DoS attack is conducted at t = 200. Because the cumulative production remains zero in the early stage of the simulation, which destabilizes the differentiation, the instability of the production rate at t < 110 is not caused by the cyberattack but arises in the normal operation period. The choice of Δt influences the shape of the resilience curve, such as the slope and the waveform (the periodic slight increase and decrease of the production rate in the normal operation period). A smoother resilience curve can be obtained with a larger Δt, although the characteristics of the incident could become invisible. In addition, the waveform in the early stage is different from the one after the incident because the timing at which products are manufactured in the two production lines is initially the same but becomes different after the incident.
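The numerical steps described above, a moving average over [t − 15, t + 15] followed by a central difference with Δt = 10, can be sketched in a few lines. The snippet assumes a unit time step between samples and truncates the averaging window at the ends of the series; both are assumptions made for illustration, and the function names are hypothetical.

```python
def moving_average(values, half_window):
    """Average over [t - half_window, t + half_window], truncated at the ends."""
    out = []
    for t in range(len(values)):
        lo = max(0, t - half_window)
        hi = min(len(values), t + half_window + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def central_difference(values, dt):
    """Rate at index t from values[t + dt] and values[t - dt].

    Entries closer than dt to either end are returned as None.
    """
    n = len(values)
    return [
        (values[t + dt] - values[t - dt]) / (2 * dt) if dt <= t < n - dt else None
        for t in range(n)
    ]


# Hypothetical cumulative production growing by 2 products per time unit.
cumulative = [2 * t for t in range(50)]
smoothed = moving_average(cumulative, half_window=15)
rate = central_difference(smoothed, dt=10)
```

A larger `dt` averages the slope over a wider interval, which smooths the curve at the cost of blurring short-lived features of the incident, as noted in the text.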
In the following, the resilience curve for each case in Figure 7(b) is briefly reviewed.

Case 1
Since the attacker model conducts only the DoS attack, the operator model notices the incident at an early stage and then activates the defender model. The defender model recovers the minimum set of damaged asset models to minimize the impact on the availability, as illustrated in Figure 5(b). As can be seen in Figure 7(b), the loss of availability in this case is smaller than in the other cases.

Figure 7. (a) Cumulative production and (b) resilience curves for the three cases in Table 3. In both panels, curves in blue, green and orange correspond to cases 1, 2 and 3, respectively. In panel (b), solid and dashed curves represent raw data and their moving average. Note that slight constants are added to the production rate in cases 2 and 3 to avoid the overlapping of the data.

Case 2
The attacker model conducts not only the DoS attack but also the AS attack to deceive the operator model as in Figure 5(a). As a result, the impact spreads without being detected. Therefore, compared to case 1, the cumulative impact $I_c$ and the downtime $T$ are larger. Since the attacker models of both cases 1 and 2 target only one production line, the maximum impact $I_m$, which is half of the normal value, is the same. The other, unattacked production line runs as expected all the time in cases 1 and 2.

Case 3
Although the attacker model is the same as that of case 1, the defense strategy of the defender model is different. The defender model in case 1 recovers the minimum set of damaged asset models, while the defender model in case 3 attempts the recovery of the full set of damaged asset models. As explained for Figure 5(b), the downtime of an asset model during the recovery can affect other asset models as well as process models in an unexpected way. In case 3, both production lines stop because the defender model recovers an asset model that is necessary for the other, unattacked production line. As such, the production rate temporarily becomes zero and then recovers to the pre-cyberattack level step by step.

Towards better evaluations of the cyber resilience
It has been demonstrated though case studies that resilience curves calculated by the 3-layer modelling method with different strategies of the actor model exhibit different shapes.The result can be used to evaluate the expected impact of various cyberattacks in advance and judge what defense strategy is chosen to minimize the loss of the availability.Although only the production rate is discussed in study, multiple factors related to the availability of ICS as seen in Figure 2 should be considered for more accurate evaluations of the cyber resilience.
For example, the residual risk must be taken into consideration. As seen in Figure 7(b), the defense strategy of case 1 is superior to that of case 3 in terms of the production rate. However, case 3 should be the best choice in terms of the residual risk, as vulnerabilities are removed from the full set of damaged asset models. Although case 1 enables the rapid recovery of the production line, cyberattacks are more likely to occur again because of the remaining vulnerabilities. In this sense, if the level of the residual risk, which expresses how likely subsequent cyberattacks are to occur, is drawn in time series, its value after the recovery completes is expected to remain higher in case 1 than in case 3.
As in risk assessment, the risk is usually determined by the balance between the impact and the likelihood of cyberattacks [26,27]. The evaluation method of the likelihood has been actively studied so far (e.g. [18,28]). For better evaluations of the cyber resilience, it is important to show not only the loss of the availability caused by a single cyberattack but also the likelihood that represents the residual risk of subsequent cyberattacks. To calculate the residual risk, the information that the asset model holds needs to be extended. More specifically, as future work, attributes that express vulnerabilities will be added to the asset model, in addition to the PN-based model and the dependency, so that the residual risk can be extracted.
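One way to picture such an extension is sketched below. This is only an illustration under stated assumptions, not the planned implementation: the `AssetModel` class, its `vulnerabilities` attribute, the per-vulnerability exploitation probabilities, and the independence assumption used to combine them are all hypothetical.

```python
from dataclasses import dataclass, field
from math import prod
from typing import Dict, Iterable


@dataclass
class AssetModel:
    name: str
    # Hypothetical attribute: vulnerability id -> assumed probability of exploitation.
    vulnerabilities: Dict[str, float] = field(default_factory=dict)


def residual_likelihood(assets: Iterable[AssetModel]) -> float:
    """Probability that at least one remaining vulnerability is exploited,
    assuming (for illustration) that exploitations are independent."""
    not_exploited = prod(1.0 - p for a in assets for p in a.vulnerabilities.values())
    return 1.0 - not_exploited


def recover(asset: AssetModel) -> AssetModel:
    """Recovery is assumed here to remove every known vulnerability of the asset."""
    return AssetModel(asset.name, {})


# Made-up assets and probabilities.
plc = AssetModel("PLC", {"vuln-a": 0.2})
hmi = AssetModel("HMI", {"vuln-b": 0.5})

print(residual_likelihood([plc, hmi]))                    # before any recovery
print(residual_likelihood([recover(plc), hmi]))           # minimum-set recovery
print(residual_likelihood([recover(plc), recover(hmi)]))  # full-set recovery
```

Under these assumptions, recovering only the minimum set leaves a non-zero likelihood of a subsequent cyberattack, whereas recovering the full set drives it to zero, which is the trade-off against the production rate discussed above.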
A wider variety of factors related to the availability could also be extracted from the three types of models in Figure 2. One example is the percentage of operator models whose state is Regular operation, which represents how well ICS can utilize human resources. In addition, the likelihood that the impact of cyberattacks spreads to asset models of the Safety Instrumented System (SIS) could be used to evaluate safety as another factor in the resilience curve. Establishing evaluation methods for factors other than the production rate is important future work. Organizations should determine the best defense strategy based on the balance of multiple factors.

Automation of the model construction
It is important for the practical application of the 3-layer modelling method to automate the model construction. Since there are usually many assets and industrial processes in ICS, it might be too time-consuming to build all models manually. Here, possible approaches to automate the model construction are introduced.
The first challenge is obtaining the necessary information to build asset models. The information on enterprise assets in IT systems, such as the operating system and installed software, can be collected remotely by installing inventory tools or by actively scanning the target asset through the network. However, it is difficult in many cases to deploy such inventory tools on all ICS-specific devices, which could include legacy equipment with limited computational resources [29]. In addition, broadcasting additional packets into the ICS network for the purpose of scanning is usually not permitted, as devices could become unstable or crash [30].
Considering the above conditions, deep packet inspection based on passive scanning is one effective option to obtain the information on ICS-specific assets. If vulnerabilities of assets can be identified by deep packet inspection, more detailed modelling of the attacker will be possible using more specific TTPs. Some open-source tools to perform deep packet inspection in ICS are available, such as GRASSMARLIN [31]. Dependencies between services hosted on assets can also be identified automatically by analyzing passively obtained packet data (e.g. NSDMiner [32]).
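The idea of inferring service dependencies from passively observed traffic can be illustrated with a nested-flow heuristic. The sketch below is only loosely inspired by that idea and is not the NSDMiner algorithm itself: the `Flow` record, its fields, the service labels, and the sample data are hypothetical. It counts how often a flow leaving a server is fully contained in the time window of a flow arriving at that server.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Tuple


class Flow(NamedTuple):
    src: str      # client host
    dst: str      # server host
    service: str  # hypothetical label of the service reached on dst
    start: float  # flow start time
    end: float    # flow end time


def nested_flow_counts(flows: Iterable[Flow]) -> Counter:
    """Count (outer service, inner service) pairs in which the inner flow
    originates from the outer flow's server and lies within its time window."""
    flows = list(flows)
    counts: Counter = Counter()
    for outer in flows:
        for inner in flows:
            if inner is outer:
                continue
            nested = outer.start <= inner.start and inner.end <= outer.end
            if inner.src == outer.dst and nested:
                counts[(outer.service, inner.service)] += 1
    return counts


# Made-up passive capture: a workstation opens the HMI, which in turn polls the PLC.
capture = [
    Flow("workstation", "hmi", "hmi-web", 0.0, 10.0),
    Flow("hmi", "plc", "plc-io", 2.0, 4.0),    # nested in the HMI session
    Flow("hmi", "plc", "plc-io", 12.0, 13.0),  # outside the HMI session
]
print(nested_flow_counts(capture))
```

Pairs that are observed frequently would be candidates for dependency edges between asset models; in practice a threshold on the counts would be needed to filter out coincidental nesting.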
The construction of the process model, which is another challenge, can be partly automated. In ICS environments, industrial processes are often driven in a pre-configured way, following a relatively fixed sequence. Therefore, process mining [33], which enables the discovery of a Petri-Net describing a control flow from event logs, can be a solution. For instance, in Ref. [34], a Petri-Net of an industrial process has been successfully deduced by process mining with event logs stored in ICS-specific devices. System configurations that automate the collection of the data necessary to build models have also been developed [35,36]. The technologies and tools introduced in this subsection are expected to help automate the model construction.
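As an illustration of the starting point of process mining, the sketch below extracts the directly-follows relation from an event log, from which discovery algorithms then derive a Petri-Net. It is a simplified, assumed example and not the procedure of Ref. [34]: the log format of (case id, timestamp, activity) and the activity names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Iterable, Tuple

Event = Tuple[str, float, str]  # (case id, timestamp, activity), assumed format


def directly_follows(event_log: Iterable[Event]) -> Counter:
    """Count how often activity b immediately follows activity a within a case."""
    traces = defaultdict(list)
    # Order events by case and then by time so each trace is chronological.
    for case_id, _timestamp, activity in sorted(event_log, key=lambda e: (e[0], e[1])):
        traces[case_id].append(activity)
    pairs: Counter = Counter()
    for activities in traces.values():
        pairs.update(zip(activities, activities[1:]))
    return pairs


# Made-up log of two production lots passing through the same fixed sequence.
log = [
    ("lot-1", 0.0, "load"), ("lot-1", 1.0, "process"), ("lot-1", 2.0, "inspect"),
    ("lot-2", 0.5, "load"), ("lot-2", 3.0, "process"), ("lot-2", 5.0, "inspect"),
]
for (a, b), n in directly_follows(log).items():
    print(f"{a} -> {b}: {n}")
```

Because industrial processes tend to follow a fixed sequence, the resulting relation is typically sparse, which is what makes the subsequent Petri-Net discovery tractable.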

Conclusion
The 3-layer modelling method to automatically calculate the resilience curve has been introduced in this paper. The characteristic feature compared to prior research is that the time-dependent behavior of various personnel related to both cybersecurity and industrial operations in ICS is modelled using the PN-based model. It has also been demonstrated through case studies that the resilience curve exhibits different shapes for different strategies of the actor model. Since the impact during the incident is visualized in time series with the resilience curve, the result can be used to determine what to do and when to do it to minimize the impact, which leads to the improvement of the cyber resilience.
As future work, the model will be improved so that, as depicted in Figure 2, multiple factors related to the availability, such as the residual risk and safety, can be evaluated in time series simultaneously with the production rate. Once the algorithm to evaluate multiple factors is implemented, its effectiveness and function will be tested in real ICS environments. Finally, we will also extend the framework of the 3-layer modelling method presented in this study to other industrial domains.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Notes on contributors
Daisuke Tsuji received his B.S., M.S. and Ph.D. degrees in Science from Nagoya University, Japan in 2015, 2017 and 2020, respectively. He is currently working at the R&D group of Hitachi, Ltd. His interests include cybersecurity for industrial control systems and nonlinear physics of complex systems. He is a member of SICE.
Junya Fujita received his Master of Engineering from the University of Tokyo in 2011. He is a senior researcher at Hitachi, Ltd. and has over 10 years of experience in R&D related to embedded systems and cybersecurity for industrial automation control applications. He is a member of SICE, ISA and IEEE. He is also a member of ISA99 WG and an expert member of IEC/TC 65/WG 10 as standardization bodies. He holds world-recognized cybersecurity certifications, such as CISSP, GICSP and OSCP. He also has P.E.Jp and ISA CAP.
Noritaka Matsumoto received his B.E. and M.E. degrees in mechanical engineering from Waseda University, Japan, in 1999 and 2001. He currently works at the R&D group of Hitachi, Ltd. He has experience in R&D on industrial control system technologies, particularly embedded systems and cybersecurity. He is a member of the national committee of IEC TC65/WG10 and WG20. He is also a member of IEICE and IPSJ.
Yu Tamura received his B.S. and M.S. degrees in system engineering from the University of Electro-Communications, Japan in 2017 and 2019, respectively. He is currently working at the R&D group of Hitachi, Ltd. His current research interests include network security.
Jens Doenhoff received his Diplom (Dipl.-Inf.) degree in computer science from the Ilmenau University of Technology, Germany in 2010. He is currently working at the R&D group of Hitachi, Ltd. His current research interests include security in distributed systems and network edge connectivity. He is a member of the ACM.
Tomohiro Shigemoto received his B.S. and M.S. degrees in system engineering from Osaka University, Japan in 2004 and 2006, respectively, and the Ph.D. degree in engineering from Meiji University, Japan in 2018. He is currently working at the R&D group of Hitachi, Ltd. His current research interests include network security and malware analysis. He is a member of IPSJ.

Figure 1. Resilience curve based on Wei and Ji [13]. The performance P in ICS is drawn as a function of time t.

Figure 2. Overview of the modelling approach to reproduce the resilience curve. Firstly, ICS are modelled by the 3-layer modelling method, which reproduces digital representations of actors, assets, and industrial processes. Note that the model of actors includes personnel involved in both cybersecurity and industrial operations. Secondly, the state of those models is repeatedly updated by reflecting attack and defense strategies. Finally, resilience curves are calculated from multiple viewpoints.

Figure 3. The PN-based model for actors: (a) Attacker, (b) Operator, (c) Defender. The state and the action corresponding to each place and transition are summarized in Tables 1 and 2, respectively.
(a): the operator model who monitors the HMI; and the attacker model whose goal is conducting the DoS attack on the PLC. If the attacker model conducts only the DoS attack, the operator model can notice the incident immediately, as the PLC and the HMI depend on each other. However, if the attacker model conducts the AS attack on the HMI in advance, the operator model can be deceived, as the anomaly of the PLC is hidden. As a result, the detection of the incident by the operator model is delayed. As shown in this example, various scenarios of ICS-specific cyberattacks can be reproduced through the interaction between the asset model and the actor model involving both industrial processes and cybersecurity.

Figure 4. The state transition of the asset model caused by the actor model.

Figure 5. High-level views of strategies reflected in the actor model.

Table 3.
Case | Strategies of the attacker model and the defender model
1 | • The attacker model conducts the DoS attack, targeting one production line. • The defender model joins the incident response and recovers the minimum set of damaged asset models for the first countermeasure.
2 | • The attacker model conducts both DoS and AS attacks to deceive the operator model, targeting one production line. • The defender model joins the incident response and recovers the minimum set of damaged asset models for the first countermeasure.
3 | • The attacker model conducts the DoS attack, targeting one production line. • The defender model joins the incident response and recovers the full set of damaged asset models.

Figure 6. The process model of the production line expressed by the PN-based model.

Figure 7. (a) Cumulative production and (b) resilience curves for three different strategies of the attacker model and the defender model listed in Table 3. In both panels, curves in blue, green and orange correspond to cases 1, 2 and 3, respectively. In panel (b), solid and dashed curves represent raw data and their moving average. Note that slight constants are added to the production rate in cases 2 and 3 to avoid the overlapping of the data.

Table 1. Relation between the place and the state of the actor model in Figure 3.

Table 2. Relation between the transition and the action of the actor model in Figure 3.