Mental Overload Assessment Method Considering the Effects of Performance Shaping Factors

The assessment of mental overload is essential because mental overload has been shown to be a critical cause of many accidents. Mental overload is affected by various performance shaping factors (PSFs): harsh PSFs either increase the task demand or decrease the overload threshold, an effect that current mental overload assessment methods rarely consider. This research proposes a VACP-based mental overload assessment model that considers the effects of PSFs. In the VACP model, mental overload is identified when the sum of the task demand values of all unit tasks is greater than the given threshold. In human reliability analysis, PSFs are mainly used as weighting factors to modify the basic human error probability. In our research, by virtue of the quantitative relationship between human error and mental workload, the weighting factors of PSFs are converted to modify the task demand values and the threshold for VACP activities. Furthermore, a Bayesian Network (BN) is used to model the influence of PSFs and to calculate the probability of mental overload. The proposed method is applied to a helicopter crash that occurred in Maryland, and the results show that, in comparison with the VACP method, the proposed method identifies the state of mental overload more effectively and provides a more rational explanation of the process of the accident.


I. INTRODUCTION
The safe and efficient performance of given tasks in a human-machine system requires that the mental workload imposed on operators does not exceed their capacity. An excessive mental workload will overwhelm an operator's information processing, degrade the operator's vigilance, and eventually lead to ''cognitive tunneling'', a phenomenon in which an operator is unable to reallocate his/her attention from one task to another [1], [2]. The National Safety Council (NSC) reports that 21% of all fatal accidents are attributed to excessive mental workload [3]. It is widely accepted that excessive mental workload, i.e., mental overload, is an important safety-related factor; furthermore, the consensus is that mental overload assessments should be taken into consideration in the design stage [4].
The associate editor coordinating the review of this manuscript and approving it for publication was Catherine Fang.
In essence, two elements are incorporated in assessments of mental overload: the mental workload caused by task demands and the operator's mental workload capacity. The mental workload caused by task demands (also known as the ''mental workload level'') represents ''how hard the brain is working to meet task demands'' [5]. Meanwhile, the operator's mental workload capacity (also known as ''the operator's capacity'') represents ''the operator's limited mental resource in processing task demand'' [6]. Mental overload is considered to have occurred when the mental workload level exceeds the operator's capacity [7]. Researchers tend to use Cognitive Load Theory (CLT) to explain mental overload theoretically [8]. According to CLT, an operator's mental resources for handling task demands are heavily constrained because new information must first be stored in the working memory, and the working memory is limited in terms of both storage length and duration. In this sense, task demands reflect the mental workload level, and the working memory reflects the operator's capacity.
Many studies have addressed mental overload assessment, and their approaches can be categorized as experimental methods and simulation methods.
Experimental methods rely on high-fidelity training devices and virtual reality technology to assess mental overload through man-in-the-loop experiments. Researchers tend to use physiological parameters and subjective assessments to reflect mental workload [9], [10], [11]. However, each experiment can only assess mental workload for a particular scenario, which means that when the application scenario changes, the experimental environment must be rebuilt to suit it. In addition, the experimental method is both time-consuming and costly; therefore, it is not suitable for simple and quick analysis [12].
Simulation methods can be categorized as time-based or task-based. Time-based simulation methods, which include cognitive architectures such as Adaptive Control of Thought-Rational (ACT-R), Queuing Network-Model Human Processor (QN-MHP) and Executive Process/Interactive Control (EPIC) [13], mainly use the ratio of the total time spent on production firing and chunk exchanging in the cognitive architecture to the available time as a natural index of mental workload [14]. Task-based methods, such as W/INDEX, the Task Analysis/Workload (TAWL) model and the Improved Performance Research Integration Tool (IMPRINT) [15], mainly use a rating scale to determine mental workload values for different task types. Usually, the rating scale of these methods is based on the Visual-Auditory-Cognitive-Psychomotor (VACP) method [16]. Compared with time-based simulation methods, task-based simulation methods are simpler to implement and more versatile. Indeed, VACP has shown promise in mental overload assessment [17], [18]. For example, Wang et al. used VACP to assess the mental overload state of train drivers so as to optimize the task flow, and the experimental results showed that the optimized task flow balances mental workload and reaction time well [19]. VACP can provide a continuous prediction of the mental workload level (reflected by the ''task demand'' in VACP), as well as a fixed value for the operator's capacity (reflected by the ''threshold'' in VACP), which is sufficient for performing real-time mental overload assessments. The VACP method has been applied in many areas, including manufacturing [20], the operation of unmanned aerial vehicles [21], and driving [22].
However, as with human error, mental overload is related to the contextual factors of the situation in which the performance occurs [23], which is rarely considered when the aforementioned methods are applied. In fact, several researchers have studied the influence of contextual factors on the mental workload level and the operator's capacity. In regard to the mental workload level, Tsao et al. showed that an improper work/rest rhythm and an adverse physical working environment could significantly increase the mental workload level [24]. Kaptan et al. pointed out that the mental workload level, whether in normal or abnormal operation, decreases as the automation level increases [25]. With respect to the operator's capacity, Young et al. noted that skill is a factor that can influence an operator's capacity [26], and Kim et al. used an experimental method to show that skilled operators have a higher capacity [27]. All of these factors can be considered performance shaping factors (PSFs). Therefore, the assessment of mental overload should consider the influence of PSFs.
PSFs are often used in Human Reliability Analysis (HRA). HRA was proposed to systematically incorporate the human as a part of a Probabilistic Risk Assessment (PRA) activity [28], and PSFs are used to characterize the context of human tasks, which is assumed to enhance or degrade human performance [29]. The term used for PSFs varies among HRA methods: PSFs, error-producing conditions (EPCs), common performance conditions (CPCs), and performance influencing factors (PIFs). Despite their different names, the concepts are the same. For the sake of consistency, they are all referred to as PSFs here. One of the most common applications of PSFs in HRA is to adjust the Basic Human Error Probability (BHEP) [30]. PSFs have evolved along with HRA methods. In the first-generation HRA methods, cognition was not specifically considered among the PSFs, so these methods could not fully explain how PSFs exert their influence on performance. In the second-generation methods, by contrast, PSFs were derived by focusing on the cognitive impacts on operators. Nowadays, the application of PSFs has gone beyond the scope of HRA. Several other research topics address PSFs, including human behavior models, human-related event analysis methods and human performance databases [31].
In our methodology, PSFs are used to modify the task demands and the threshold of VACP, so choosing an appropriate PSF set is necessary. Di Mascio et al. classified the PSFs that affect mental overload into three categories: the work environment in which the controller operates, some variable personal factors, and physical preconditions [32]. Azadeh et al. further extended the categories, describing them as health-related PSFs, safety-related PSFs, environment-related PSFs and ergonomics-related PSFs [33]. In our research, Azadeh's classification is applied because it covers more aspects of the relevant contextual factors. However, in the design stage, a designer must consider the diversity of potential operators, and thus individual differences are temporarily set aside. As a result, all of the PSF categories apart from health-related PSFs are taken into consideration.
The introduction of PSFs poses a problem: the relationship between PSFs and mental overload must be established. Existing investigations have mainly employed experiments to test qualitatively the effect of a particular PSF on mental overload. For instance, Bhavsar et al. proposed a qualitative methodology based on pupillometry (the measurement of pupil diameter) to noninvasively estimate the mental overload of control room operators in real time during process operations [34]. Casner tested the effect of an advanced cockpit system on mental overload via an in-flight experiment, and the results indicate that the use of the advanced cockpit helps to relieve mental overload in some situations [35]. However, qualitative relationships alone are not enough to assess mental overload quantitatively. Thus, establishing a quantitative relationship between PSFs and mental overload is necessary.
Another problem introduced by including PSFs in the model is that probability distributions exist over the levels of the various PSFs. To address this issue, employing a Bayesian Network (BN) seems to be a viable approach. BNs have long been used to handle the probability distributions of PSFs in HRA [36], [37], [38]. Furthermore, from the perspective of modeling, a BN has clear advantages over other tools, such as stronger theoretical roots and lower computational complexity [39].
The present study, thus, aims to propose a method for assessing mental overload with consideration given to PSFs. In our research, by virtue of the quantitative relationship between human error and mental workload, the PSFs' weighting factors have been converted to modify the task demand value and threshold of VACP. Furthermore, BN is applied so as to model the influence of each PSF and to calculate the associated probability of mental overload.
The structure of this paper is as follows: The next section provides information on the basic concepts related to the proposed method. Section III describes the framework of the proposed mental overload model and steps taken in the construction of the mental overload assessment model. In Section IV, the proposed method is demonstrated through its application to an actual helicopter crash that occurred in Maryland. Section V presents our conclusions.

A. VACP METHODS
The VACP method presents mental workload as relying on four resource channels: visual, auditory, cognitive and psychomotor. The amount of demand required of each channel to perform tasks is estimated on McCracken and Aldrich's seven-point scale, as shown in TABLE 1. Mental workload is defined as the sum of every channel's task demand value, and a score of 40 is regarded as the threshold [40]. When the calculated mental workload value is greater than 40, mental overload is assumed to occur. To perform tasks safely and effectively in a dynamic complex environment, the mental workload required of the operators should not exceed that threshold.
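The overload check described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names and the channel ratings below are hypothetical, and only the threshold of 40 and the summation over the four channels come from the text.

```python
# Minimal sketch of the basic VACP overload check.
# Assumption: each unit task is a dict of channel ratings on the seven-point scale.
VACP_THRESHOLD = 40.0  # threshold score cited in the text


def total_demand(unit_tasks):
    """Sum the visual (V), auditory (A), cognitive (C) and psychomotor (P)
    ratings over all concurrent unit tasks."""
    return sum(sum(task.values()) for task in unit_tasks)


def is_overloaded(unit_tasks, threshold=VACP_THRESHOLD):
    """Mental overload is assumed when the summed demand exceeds the threshold."""
    return total_demand(unit_tasks) > threshold


# Hypothetical ratings for two concurrent unit tasks (not taken from TABLE 1).
tasks = [
    {"V": 5.9, "A": 4.9, "C": 6.8, "P": 2.6},
    {"V": 7.0, "A": 4.2, "C": 5.3, "P": 5.8},
]

print(round(total_demand(tasks), 1), is_overloaded(tasks))
```

With these illustrative ratings the two tasks together exceed the threshold, whereas the first task alone does not.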
The VACP method has the following advantages: (1) VACP provides a more objective assessment of mental workload and can eliminate rater bias; (2) VACP uses verbal anchors to provide an objective and consistent use of scale; (3) VACP depicts mental workload as relying on resources from multiple channels; (4) VACP can provide a unique workload value for an interval time once the necessary information has been determined [21].

B. PERFORMANCE SHAPING FACTORS (PSFS)
The PSFs that influence mental overload are attributed to three categories: safety-related PSFs, environment-related PSFs and ergonomics-related PSFs. The details of the influential factors of these categories are summarized in TABLE 2.
There are many suitable PSF sets that can cover the elements mentioned in TABLE 2, such as nine PSFs included in the Cognitive Reliability and Error Analysis Method (CREAM) [41], those being ''Adequacy of organization'', ''Working conditions'', ''Adequacy of MMI and operational support'', ''Availability of procedures/plans'', ''Number of simultaneous goals'', ''Available time'', ''Time of day'', ''Adequacy of training and preparation'' and ''Crew collaboration quality''. CREAM's set of PSFs is widely accepted and used in HRA [42], and we adopt it here to describe the factors that influence mental overload.
Additionally, correlations exist between PSFs, and the state of a specific PSF will be influenced by the PSFs correlated with it. For example, researchers have found that the ''Adequacy of MMI and operational support'' significantly affects task-related PSFs such as ''Number of simultaneous goals'' [43]. This should also be considered in the modeling process.

C. BAYESIAN NETWORK (BN)
A BN, also known as a Bayesian Belief Network, is a directed acyclic graph that consists of two parts: the network structure and the conditional probability tables (CPTs) [44]. The network structure includes nodes and causalities. Nodes stand for the random variables of interest, and causalities stand for the likely causal relationships between nodes. If $U$ represents the collection of all nodes, then $U = \{A_1, A_2, \ldots, A_n\}$. If a directed arc runs from node $A_j$ to node $A_i$, then $A_j$ is a parent node of $A_i$, written as $\mathrm{Pa}(A_i)$; correspondingly, $A_i$ is a child node of $A_j$. Once the CPTs are assigned, the chain rule gives the joint probability as (1):
$$P(U) = \prod_{i=1}^{n} P\left(A_i \mid \mathrm{Pa}(A_i)\right) \tag{1}$$
where $P(U)$ is the joint probability of the whole BN.
It should be pointed out that once evidence $e$ bearing on node $A$ emerges (some nodes are observed to be in a particular state), the BN can update the probability; this evaluation process is often referred to as ''BN updating'' [45]. The whole updating process is based on (2):
$$P(A \mid e) = \frac{P(e \mid A)\,P(A)}{P(e)} \tag{2}$$
where $P(A)$ is the prior probability of node $A$, $P(e)$ is the probability of the evidence and can be replaced with $\sum_{i=1}^{n} P(e \mid A_i)\,P(A_i)$, $P(e \mid A)$ is the likelihood, and $P(A \mid e)$ is the posterior probability.
VOLUME 11, 2023
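To make the chain rule and the updating step concrete, the following sketch works through a two-node network by hand. The node names, states and probabilities are invented for illustration and are not taken from the proposed model; a real application would rely on BN software rather than this hand-rolled arithmetic.

```python
# Two-node network T -> O with made-up numbers (assumption, for illustration only):
# T is a PSF level (e.g. experience), O is whether overload occurs.
prior_T = {"high": 0.5, "low": 0.5}
cpt_O_given_T = {
    "high": {"yes": 0.1, "no": 0.9},
    "low": {"yes": 0.4, "no": 0.6},
}


def joint(t, o):
    """Chain rule for this network: P(T, O) = P(T) * P(O | T)."""
    return prior_T[t] * cpt_O_given_T[t][o]


def posterior_T(evidence_o):
    """Bayes update: P(T | O = e) = P(e | T) P(T) / sum_t P(e | t) P(t)."""
    p_e = sum(joint(t, evidence_o) for t in prior_T)
    return {t: joint(t, evidence_o) / p_e for t in prior_T}


print(posterior_T("yes"))
```

Observing overload shifts belief toward the "low" state, because that state makes the evidence more likely than the "high" state does.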

A. MENTAL OVERLOAD ASSESSMENT MODEL
The development of a novel BN-based mental overload assessment model with consideration given to PSFs is presented in this section. FIGURE 1 shows the framework of the proposed model. This research introduces PSFs as weighting factors to modify the task demand and threshold of VACP. The introduction of PSFs will contribute to two improvements: First, the influences of various PSFs on task demand and threshold are established. By virtue of the quantitative relationship between human error probability (HEP) and mental workload level (MWL), the weighting factors of the PSFs are converted to modify the task demand value and threshold of VACP. Second, probability distributions exist within PSF levels; therefore, BN is used to model the influence of PSFs and to calculate the probability of mental overload. A detailed description of this process is as follows.

1) DETERMINING THE TASK-DEMAND-RELATED AND THRESHOLD-RELATED PSFS AND THEIR DISTRIBUTION
Generally speaking, PSFs affect mental overload in two ways: modification of the task demand and modification of the threshold. For example, considering the PSF ''Adequacy of MMI and operational support'', if the interface layout is reasonable, then the information displayed in the interface will be easily identified, and the task demand that the task generates will be lower. In addition, when considering the PSF ''Adequacy of training and preparation'', operators with good training and rich experience have higher mental overload thresholds, i.e., they can perform more tasks at the same time. This illustrates that the influence of PSFs can be categorized into two groups: those factors influencing the threshold and those factors influencing the task demand.
How should we further categorize PSFs? Generally, PSFs such as ''Adequacy of MMI and operational support'', which influence the task demand, are directly related to ongoing tasks, and PSFs such as ''Adequacy of training and preparation'', which influence threshold, are more related to factors such as rhythm, working conditions, and so on, which indirectly influence ongoing tasks. Based on this rule, CREAM's PSFs can be categorized into two classes, as shown in TABLE 3.
As mentioned earlier, probability distributions exist over the levels of the PSFs, so the method used to determine these distributions is another important issue. The principles governing the distribution of PSFs are introduced in detail in the BN modelling section.

2) THE PREDICTION APPROACH TO MENTAL WORKLOAD LEVEL WITH THE INFLUENCE OF PSFS TAKEN INTO CONSIDERATION
Existing studies show that, for a pilot's aviation activities, if the HEP is less than 0.5, the relationship between human error and mental workload is very close to a simple exponential function [46], [47]. Therefore, the relationship between HEP and MWL can be presented as (3), where $BHEP$ is the basic human error probability, $BMWL$ is the basic mental workload, and $\alpha_{MWL}$ is the coefficient for MWL. In HRA, a PSF provides a weighting factor (denoted as $WF$) that modifies $BHEP$, so the HEP is given by (4):
$$HEP = BHEP \times WF \tag{4}$$
Based on (3) and (4), the influence of PSFs on mental workload is derived as (5). $BMWL$ can be regarded as the task demand value of the different channels estimated by VACP, and $MWL$ can be regarded as the modified task demand. If $TD_V$, $TD_A$, $TD_C$, and $TD_P$ represent the task demand values of the visual, auditory, cognitive and psychomotor channels in VACP, and $TD^M_V$, $TD^M_A$, $TD^M_C$, and $TD^M_P$ represent the modified task demand values of the same channels, then we obtain (6)-(9), where $WF^{TD}_V$, $WF^{TD}_A$, $WF^{TD}_C$, and $WF^{TD}_P$ stand for the PSFs' weighting factors for the visual, auditory, cognitive and psychomotor task demands.
Therefore, the total modified task demand $MWL$ is calculated using (10), where $n$ indexes the channels in VACP, and $TD^M_n$, $TD_n$, and $WF^{TD}_n$ represent the modified task demand, the original task demand, and the PSFs' weighting factor of channel $n$, respectively.
As for the calculation of $WF$, we can refer to CREAM. In CREAM, each level of a PSF has a multiplier $\theta$. Once the level of each PSF is determined, $WF$ is obtained by multiplying these multipliers [42]. Similarly, in our research, the $WF$ in (6)-(9) is calculated by multiplying the multipliers of the task-demand-related PSFs, denoted as (11):
$$WF^{TD}_n = \prod_{i} \theta^{n}_{i} \tag{11}$$
where $\theta^{n}_{i}$ represents the $i$th task-demand-related PSF's multiplier for channel $n$. The multiplier values are shown in TABLE 4. It should be noted that CREAM does not distinguish the visual and auditory channels but unifies them into one perceptive channel. Thus, it is assumed that the visual and auditory channels share the same multiplier.
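The product of multipliers can be illustrated with a short Python sketch. The PSF levels and multiplier values below are placeholders chosen for readability, not the entries of TABLE 4, and treating ''Number of simultaneous goals'' as task-demand-related is an assumption made only for this example.

```python
from math import prod

# Placeholder multipliers for one channel, keyed by PSF and then by level.
# Assumption: these numbers are illustrative and do NOT reproduce TABLE 4.
VISUAL_MULTIPLIERS = {
    "Adequacy of MMI and operational support": {
        "Supportive": 0.5,
        "Adequate": 1.0,
        "Tolerable": 1.0,
        "Inappropriate": 5.0,
    },
    "Number of simultaneous goals": {
        "Fewer than capacity": 1.0,
        "Matching current capacity": 1.0,
        "More than capacity": 2.0,
    },
}


def weighting_factor(levels, multipliers):
    """Multiply the multipliers of the selected level of every PSF.

    `levels` maps each PSF name to the level it is currently in.
    """
    return prod(multipliers[psf][level] for psf, level in levels.items())


harsh = {
    "Adequacy of MMI and operational support": "Inappropriate",
    "Number of simultaneous goals": "More than capacity",
}
print(weighting_factor(harsh, VISUAL_MULTIPLIERS))
```

Keeping the multipliers in a lookup table keyed by PSF and level means the same function can serve every channel, and the threshold-related PSFs as well, by swapping the table.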

3) THE ADJUSTMENT APPROACH OF THRESHOLD CONSIDERING THE INFLUENCE OF PSFS
Harsh PSFs will increase the task demand and lower the mental overload threshold. The equation for the threshold is therefore similar in character, and it can be expressed as (12), where $Th$ is the modified threshold, $BTh$ is the basic threshold, $WF$ is the weighting factor provided by the threshold-related PSFs, and $\alpha_{Th}$ is the coefficient for the threshold.
In combination with VACP, the total modified threshold $Th$ is given by (13), where $n$ indexes the channels of VACP, $BTh_n$ is channel $n$'s basic threshold, and $WF^{Th}_n$ is the threshold weighting factor of channel $n$. In VACP, $\sum_{n \in \{V,A,C,P\}} BTh_n$ is usually regarded as 40. $WF^{Th}_n$ is calculated in a manner similar to (11), with the exception that the task-demand-related PSFs are replaced with threshold-related PSFs, as shown in (14). TABLE 5 lists the multipliers of the threshold-related PSFs.

4) MENTAL OVERLOAD ASSESSMENT MODEL BASED ON BAYESIAN NETWORK
a: THE STRUCTURE OF THE BN-BASED MENTAL OVERLOAD ASSESSMENT MODEL
In this section, we describe the construction of our BN-based mental overload assessment model. Based on the discussions in Sections III-A2 and III-A3, (11) and (14) are the starting points for all the calculation processes, as they provide the means by which to obtain the weighting factors. Furthermore, the correlation among PSFs will change the distribution of some PSFs, so the PSF nodes should consist of the primary PSF nodes and the PSF nodes adjusted in consideration of the correlations. For ease of presentation, the primary PSF nodes are denoted as PSFs, and the adjusted PSF nodes are denoted as adjusted PSFs. The next step in the mental overload assessment process is based on (10) and (13). The causality described in (10) and (13) is clear: it runs from the weighting factor nodes, the basic task demand nodes and the basic threshold nodes to the modified task demand and modified threshold nodes. The last step in the assessment process is (15), namely the comparison between the modified task demand and the modified threshold, which determines the mental overload state. We introduce the mental overload probability node to display the mental overload assessment result.
In conclusion, six classes of nodes and four classes of CPT are used to build this BN-based mental overload assessment model. The six classes of nodes are:
• PSFs;
• Adjusted PSFs;
• Weighting factors;
• Basic task demands and threshold;
• Modified task demands and threshold;
• Mental overload probability.
The four classes of CPT are:
• Those between PSFs and adjusted PSFs;
• Those between PSFs, adjusted PSFs and weighting factor;
• Those between basic task demand, basic threshold, weighting factor and modified task demand, modified threshold;
• Those between modified task demand, modified threshold and mental overload probability.
The BN-based mental overload assessment model is shown in FIGURE 2. The BN for this work was constructed using the commercially available software GeNIe 2.4 [48].

b: THE QUANTIFICATION OF ROOT NODES
i) TASK DEMAND AND THRESHOLD QUANTIFICATION
VACP uses verbal anchors to assess task demand levels, which helps to increase consistency and reduce inter-rater variability [21]. Verbal anchors provide key words that are used to classify the unit tasks, and users are encouraged to assess unit tasks by combining the task background, task type, and other task-related information rather than performing the assessment mechanically by simply matching keywords. For example, when a novice performs the same task as an experienced operator, such as driving on the same road, the cognitive task demands are significantly different. According to TABLE 1, the novice may have a higher cognitive task demand, perhaps rated at 6.8 out of 7.0, which has the verbal anchor ''evaluation/judgment consider several actions'', while for the experienced operator, the cognitive task demand is rated at 1.0 out of 7.0, which has the verbal anchor ''automatic simple association''. As for the threshold, it is considered to be 40.

ii) DISTRIBUTION QUANTIFICATION OF PSFS
For the PSF nodes, the prior distribution can be determined when supporting data are available. However, such data are usually sparse, so we assume that if the level of a PSF cannot be determined, the levels that remain after the impossible levels are excluded are evenly distributed. For example, in aviation, all pilots must be fully trained before obtaining permission to fly an aircraft, so for the PSF ''Adequacy of training and preparation'', the level ''Inadequate'' is impossible. The remaining levels, ''Adequate, high experience'' and ''Adequate, low experience'', are then assigned an even distribution.
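The even-distribution rule is simple enough to express directly. The sketch below uses a hypothetical helper name, `even_prior`, and exact fractions so that the result can be read off without rounding; the level names are those of the training example in the text.

```python
from fractions import Fraction


def even_prior(levels, impossible=()):
    """Spread probability evenly over the levels that are not excluded.

    Levels listed in `impossible` receive probability 0.
    """
    possible = [level for level in levels if level not in impossible]
    if not possible:
        raise ValueError("at least one level must remain possible")
    share = Fraction(1, len(possible))
    return {level: (share if level in possible else Fraction(0)) for level in levels}


training_levels = [
    "Adequate, high experience",
    "Adequate, low experience",
    "Inadequate",
]
prior = even_prior(training_levels, impossible={"Inadequate"})
print(prior)
```

For the aviation example this yields 1/2 for each of the two ''Adequate'' levels and 0 for ''Inadequate''.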

c: THE NODES AND CPT OF THE BN-BASED MENTAL OVERLOAD ASSESSMENT MODEL
i) IDENTIFICATION OF THE NODES OF THE TARGET MODEL
As discussed in the previous section, we introduced CREAM's PSFs to describe the contextual conditions of the assessment, so the nine PSFs are identified as the root nodes of the BN-based model, as shown in FIGURE 2. CREAM prescribes that four PSFs, namely ''working conditions'', ''number of simultaneous goals'', ''available time'' and ''crew collaboration quality'', are influenced by other PSFs, so we added another four nodes to reflect that. Next, the influence of each PSF is represented by weighting factors, according to (10) and (13); these weighting factors are classified into four groups: the threshold's weighting factor, the visual and auditory weighting factor, the cognitive weighting factor, and the psychomotor weighting factor. Accordingly, four nodes are used to calculate the weighting factors. As for the basic task demand and threshold, in accordance with the weighting factor nodes, there are also four nodes. Finally, two nodes are used to represent the modified task demand and threshold, and one node is used to display the mental overload result.

• Between PSFs and adjusted PSFs
Theoretically, if the distributions of some PSFs are known, the distributions of their correlated PSFs will be adjusted accordingly. For example, ''Crew collaboration quality'' is correlated with ''Adequacy of organization'' and ''Adequacy of training and preparation''. When ''Adequacy of organization'' and ''Adequacy of training and preparation'' are at their highest levels, ''Crew collaboration quality'' will have a higher probability of being at its best level rather than keeping its prior values. In CREAM, the correlations can be summarized as shown in TABLE 6. When the levels of the PSFs in the left column can be determined, their distributions are not adjusted; if not, the distributions are adjusted according to the states of the PSFs in the right column.
Subsequently, we must determine the adjustment rule for the adjusted PSF nodes. In CREAM, every level of a PSF has an effect on performance, namely ''Improved'', ''Not significant'', or ''Reduced'', as shown in TABLE 7. If the majority of the effects of the PSFs in the right column of TABLE 6 are the same, the state of the left-column PSF that has the same effect will receive a higher probability. TABLE 8 shows the adjustment rules and the adjusted distributions. Let us take ''Crew collaboration quality'' as an example. If all of the correlated PSFs' effects are ''Improved'', then the state ''Very efficient'', which has the same effect, will have a higher probability (3/4). Accordingly, the distribution of ''Crew collaboration quality'' is adjusted from an even distribution to (3/4, 1/4, 0, 0). Based on the above rules, the CPT can be obtained. The CPT for the node ''Crew collaboration quality'' is shown in TABLE 9.
• Between PSFs, adjusted PSFs and weighting factor
Unlike the adjusted PSF nodes, the weighting factor nodes are complicated by the much larger scale of their CPTs. Let us take the node ''Visual and auditory weighting factor'' as an example. The scale of its CPT is 108 × 54, which is difficult to calculate further. Fortunately, in GeNIe, we can set the node's type to ''equation'' to solve this problem. Still using the node ''Visual and auditory weighting factor'' as the example, the steps for setting the node are shown in TABLE 10. First, the function ''choose'' is used to allocate the multiplier from TABLE 4; then (11) can be entered directly into the node, and the node can be set to perform further calculations, which are much easier to carry out.
• Between weighting factors, basic task demands, basic threshold and modified task demands, modified threshold
As with the earlier calculation, the types of the nodes ''Modified task demand'' and ''Modified threshold'' are set to ''equation'', and then the settings from TABLE 11 are entered into the nodes. In these equations, the parameters $\alpha_{MWL}$ and $\alpha_{Th}$ are approximately 5, based on the regression analysis of the data [46].
• Between modified task demands, modified threshold and mental overload probability
In the mental overload probability node, conditional functions are used to calculate the mental overload probability. The function expression is shown in TABLE 12; it takes a value of 1 when the operator is in a state of mental overload and a value of 0 otherwise. By calculating this expression, the mental overload probability can be obtained.
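The role of the 0/1 indicator can be illustrated outside of GeNIe with a short Python sketch. It assumes, purely for illustration, that the modified task demand and the modified threshold are independent discrete variables with known distributions; the values and probabilities below are invented, and the actual network may induce dependence between the two through shared or correlated PSFs.

```python
from itertools import product


def overload_probability(demand_dist, threshold_dist):
    """Expected value of the overload indicator, i.e. P(demand > threshold).

    Both arguments map a value to its probability and are treated as
    independent (an assumption of this sketch).
    """
    return sum(
        p_demand * p_threshold
        for (demand, p_demand), (threshold, p_threshold) in product(
            demand_dist.items(), threshold_dist.items()
        )
        if demand > threshold  # indicator is 1 only in the overload state
    )


# Hypothetical distributions of the modified task demand and modified threshold.
demand_dist = {36.0: 0.5, 42.0: 0.3, 46.0: 0.2}
threshold_dist = {38.0: 0.25, 40.0: 0.5, 44.0: 0.25}

print(round(overload_probability(demand_dist, threshold_dist), 3))
```

Summing the indicator over all combinations, weighted by their probabilities, is what turns the deterministic comparison into a probability of mental overload.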
The complete BN-based mental overload assessment model is shown in FIGURE 3.

B. MENTAL OVERLOAD ASSESSMENT PROCESS
The mental overload assessment process consists of four steps. First, task analysis needs to be performed to gain knowledge and insight of the target scenario. Next, the task demands imposed by the task are determined so as to be used as inputs. Then, the prior distribution of PSFs is determined so as to assign the CPT to BN. Finally, the mental overload probability can be obtained through the mental overload assessment of the scenario. If the predicted mental overload probability cannot meet the needs of the designers, then the design requires iteration until it satisfies the design index. The following section elaborates upon these steps.

1) TASK ANALYSIS OF THE TARGET SCENARIO
To perform the task analysis, the target scenario must first be identified, and two rules must be obeyed. First, the scenario must be an interactive scenario, meaning that the operator must monitor the parameters, judge the state and react to it if needed. If the scenario only requires the operator to monitor, to judge or to respond, the mental workload level will usually remain in a normal state, and designers need not pay further attention to it. Second, the operator should have several unit tasks to address in the scenario. A single unit task is not enough to cause the mental workload level to be too high; thus, a multi-task state requires more concern from the designers. Scenarios can be filtered by applying these two rules, and this preliminary screening identifies whether a scenario needs to be analyzed.
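The two screening rules can be written as a small predicate. The names below are hypothetical, and reading ''several unit tasks'' as ''more than one'' is an interpretation adopted only for this sketch.

```python
# Activities that together make a scenario interactive, per the first rule.
REQUIRED_ACTIVITIES = {"monitor", "judge", "react"}


def needs_analysis(activities, n_unit_tasks):
    """Return True when a scenario passes both screening rules.

    Rule 1: the operator must monitor, judge and react (interactive scenario).
    Rule 2: more than one unit task must be handled (assumed reading of "several").
    """
    interactive = REQUIRED_ACTIVITIES <= set(activities)
    multi_task = n_unit_tasks > 1
    return interactive and multi_task


print(needs_analysis(["monitor", "judge", "react"], n_unit_tasks=3))
print(needs_analysis(["monitor"], n_unit_tasks=3))
```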
After the identification of the target scenario, task analysis must be carried out. The purpose of the task analysis is to divide the tasks into unit tasks. Here, to better perform task analysis, hierarchical task analysis (HTA) is used. HTA is a goal-oriented task model used to analyze the interactive process of operators in complex socio-technical systems. Ideally, the operating manual is the best place to obtain the information required to perform the HTA. The opinions of experts and interviews with operators can be used to supplement the HTA's details.
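As a rough illustration of how an HTA yields unit tasks, the sketch below represents the hierarchy as a tree and collects its leaves. The `Task` class and the example hierarchy are hypothetical; the task names are loosely inspired by a flight scenario and are not the decomposition used in the case study.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """A node in a hierarchical task analysis; leaves are unit tasks."""
    name: str
    subtasks: list = field(default_factory=list)


def unit_tasks(task):
    """Return the names of the leaf tasks in depth-first order."""
    if not task.subtasks:
        return [task.name]
    leaves = []
    for sub in task.subtasks:
        leaves.extend(unit_tasks(sub))
    return leaves


# Hypothetical hierarchy for illustration only.
root = Task("Cross the bridge", [
    Task("Climb", [Task("Monitor altitude"), Task("Adjust collective")]),
    Task("Maintain separation", [Task("Search for traffic visually")]),
    Task("Descend"),
])

print(unit_tasks(root))
```

Each leaf returned by `unit_tasks` is a candidate for a set of VACP channel ratings in the next step of the process.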

2) IDENTIFICATION OF THE TASK DEMAND VALUES
Section III-A4.b, which treats the quantification of task demand values, describes the selection criteria and precautions for this step more fully. In this step, those criteria and precautions are applied to identify the task demand values of the operators under the specific scenario. The selection of the task demand values determines whether the assessment result will be correct, so it is of vital importance.

3) ALLOCATION OF THE PRIOR DISTRIBUTION TO THE PSFS
Relevant statistical reports are the best source for the prior distributions of the PSFs. In addition, expert evaluation is also a way to determine prior distributions. Readers can refer to Sections III-A4.b and III-A4.c for more information.

4) PERFORM MENTAL OVERLOAD ASSESSMENT OF THE SCENARIO BASED ON BN
Based on the mental overload assessment model, the mental overload assessment of the scenario can be performed. The model can output some results to assist the further analysis, including the mental overload probability, the distribution of the mental workload level, the distribution of the threshold level, and the influence of PSFs on mental overload probability. The following section demonstrates this process in detail in relation to an accident involving a helicopter crash.

IV. CASE STUDY

A. APPLICATION SCENARIO
In order to demonstrate the capacity of the proposed method, we have applied it to a helicopter accident that occurred in Maryland. On January 10, 2005, at approximately 23:11, a helicopter crashed into the Potomac River during a low-altitude cruising flight near Oxon Hill, Maryland [49]. The pilot died, and other crew members sustained severe injuries. According to the accident report, after taking off from the Washington Hospital Center Helipad, the helicopter was en route to the Stafford Regional Airport. Due to its proximity to the target airport, the pilot chose to cruise at a low altitude. During the flight, the helicopter needed to cross over the Woodrow Wilson Bridge. Hence, when the helicopter approached the bridge, the pilot first climbed to a higher altitude and crossed over the bridge; then the helicopter descended so as to continue cruising at a low altitude. However, while the pilot was performing these tasks, the air traffic controller informed him that an Airbus aircraft was approaching. To avoid a possible collision, the pilot searched for the Airbus visually and tried to maintain visual separation from it. Subsequently, because the pilot directed most of his attention outside of the cockpit, ignoring the flight instrument display, the helicopter crashed into the Potomac River during the descent stage. The flight mission profile is shown in FIGURE 4.
As for the cause of the accident, the root reason may be that the pilot had too many concurrent tasks to handle during the descent stage, which may have led to mental overload. As a result, the pilot abandoned the task of visually checking the flight instrumentation, believing that, once the altitude became too low, the ground proximity warning system (GPWS) would sound an alert. However, a failure occurred in the GPWS, such that it did not alert the pilot in time. Thus, the pilot did not have correct situational awareness of the flight parameters prior to the helicopter hitting the water.

B. MENTAL OVERLOAD ASSESSMENT OF THE SCENARIO

1) TASK ANALYSIS
To assess the mental overload state of the pilot in this scenario, we first need to conduct an HTA to divide the task into unit tasks. As Che et al. conducted a detailed analysis of this case, we refer to their HTA results, as shown in FIGURE 5 [50]. There were four tasks in this scenario: to climb, to cross over the bridge, to descend, and to search for and avoid the Airbus. For the task ''climb'', the pilot needs to contact the air traffic controller to report the climbing position. Once permission has been obtained, the pilot starts to adjust the throttle and pedals to change the altitude, course, speed, and rate of climb until reaching the predetermined position. For the task ''cross over the bridge'', the pilot needs to carefully manipulate the helicopter and check the instruments to avoid approaching the bridge too closely. For the task ''descend'', the pilot needs to adjust the collective pitch, throttle and pedals to return the helicopter to its pre-climb altitude. For the task ''search for and avoid the Airbus'', the pilot needs to contact the air traffic controller to obtain information about the approaching Airbus and then to search for the Airbus visually, maintaining visual separation from it. The last task is a temporary task, and the pilot is required to perform it while simultaneously performing the first three tasks.
For the sake of simplicity, the following two assumptions have been made:
• Only the pilot and air traffic controller are considered in the process;
• The accident occurred while the pilot was handling the third and fourth tasks simultaneously, so we mainly assess the mental overload state of the pilot during this period.
Based on the HTA results and assumptions, we conducted a further analysis.

2) IDENTIFICATION OF THE TASK DEMAND VALUES
To perform the VACP assessment, we need to determine the task demands of the different unit tasks. As shown in FIGURE 5, among the unit tasks, 3.1.5 and 4.2 are visual tasks, task 4.1 is an auditory task, task 3.2.1 is a cognitive task, and the remaining unit tasks are psychomotor tasks. The results are summarized in TABLE 13.
For the visual tasks, the task demand value of unit task 3.1.5 is 3.0 out of 7.0, with the verbal anchor ''Visual Inspect/Check'', as the pilot needed to check the flight instrumentation; the task demand value of unit task 4.2 is 6.0 out of 7.0, with the verbal anchor ''Visual Scan/Search/Monitor'', as the pilot needed to continuously search for the approaching Airbus. For the auditory task, the task demand value of unit task 4.1 is 4.0 out of 7.0, with the verbal anchor ''Complex speech'', as the pilot and air traffic controller needed to communicate with each other in sentences. For the cognitive task, the task demand value of unit task 3.2.1 is 4.6 out of 7.0, with the verbal anchor ''Evaluation/Judgement (consider several aspects)'', as the pilot needed to judge the appropriate altitude. The psychomotor tasks are treated as a whole and share a common task demand value of 2.6 out of 7.0, with the verbal anchor ''Continuous Adjustive (flight controls, sensor control)'', because flight control is a continuous procedure that consists of the adjustment of the cyclic stick, the collective pitch, the anti-torque pedal, and the throttle. If a change occurs solely in the cyclic stick, the total thrust and lift ratio will not change, thus requiring adjustment of the throttle and collective pitch accordingly. As these manipulations are always linked, they are seen as a whole task.
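The VACP point estimate follows from these values by simple summation. The sketch below reproduces that arithmetic using the five demand values given above and the threshold of 40 used in the later comparison with the proposed method; the dictionary keys and function names are our own labels for illustration.

```python
# Task demand values stated in the text (VACP scale, each out of 7.0).
task_demands = {
    "3.1.5 visual: inspect/check instruments": 3.0,
    "4.2 visual: scan/search for the Airbus": 6.0,
    "4.1 auditory: complex speech with ATC": 4.0,
    "3.2.1 cognitive: evaluation/judgement": 4.6,
    "psychomotor (treated as one task): continuous adjustive": 2.6,
}

VACP_THRESHOLD = 40.0  # overload threshold used in the VACP comparison


def vacp_total(demands):
    """Sum the demand values; round to one decimal to avoid float noise."""
    return round(sum(demands.values()), 1)


def is_overloaded(total, threshold=VACP_THRESHOLD):
    """VACP flags overload when the summed demand exceeds the threshold."""
    return total > threshold


total = vacp_total(task_demands)
print(total, is_overloaded(total))
```

The sum is 20.2, which is below the threshold, so plain VACP does not flag overload in this scenario.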

3) DETERMINATION OF THE PRIOR DISTRIBUTION OF THE PSFS
According to the accident report, the communication between the pilot and the air traffic controller was timely and accurate, and both sides strictly complied with the requirements in the flight manual (Adequacy of organization: Very Efficient). It was night-time during the flight (Time of day: Nighttime), and there was no outside visual reference, meaning that once the pilot crossed the bridge, he was flying into a black void (Working Conditions: Incompatible). As shown in TABLE 14, when the pilot was required to handle several tasks simultaneously, he actually abandoned the visual check of the flight instruments, which indicates that the number of concurrent tasks was beyond his capacity (Number of simultaneous goals: More than capacity).

4) BN MODELING AND CALCULATION
The BN structure is similar to that shown in FIGURE 3; only the adjusted nodes differ. Because ''Working conditions'' and ''Number of simultaneous goals'' can be determined according to the report, there is no need to adjust their distributions. Conversely, since ''Available time'' and ''Crew collaboration capacity'' were subject to the assumed even distribution, their prior distributions must be adjusted.
The results show that the mental overload probability in this accident scenario is 68%.
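To make the idea behind this calculation concrete, the following toy sketch enumerates the joint states of four PSF nodes and accumulates the probability that the adjusted demand exceeds the adjusted threshold. It is a deliberately simplified stand-in, not the BN used in this study: the multipliers and the even priors are hypothetical, each PSF is assumed to scale demand and threshold deterministically, and the CPTs of the actual model are not reproduced. Only the base demand (20.2) and base threshold (40) come from the case study, so the printed value is illustrative and is unrelated to the 68% reported above.

```python
from itertools import product

BASE_DEMAND = 20.2     # VACP point estimate from the case study
BASE_THRESHOLD = 40.0  # VACP overload threshold from the case study

# HYPOTHETICAL numbers. Each PSF maps
# state -> (probability, demand multiplier, threshold multiplier).
# Nodes fixed by the accident report carry a single state with probability 1.
PSFS = {
    "Working conditions": {"incompatible": (1.0, 1.10, 0.90)},
    "Number of simultaneous goals": {"more than capacity": (1.0, 1.30, 0.85)},
    "Available time": {
        "adequate": (0.5, 1.00, 1.00),
        "inadequate": (0.5, 1.20, 0.95),
    },
    "Crew collaboration capacity": {
        "efficient": (0.5, 1.00, 1.00),
        "deficient": (0.5, 1.05, 0.95),
    },
}


def overload_probability(psfs, demand=BASE_DEMAND, threshold=BASE_THRESHOLD):
    """Sum the probability of every joint PSF state with demand > threshold."""
    total = 0.0
    for combo in product(*(states.values() for states in psfs.values())):
        prob, d, t = 1.0, demand, threshold
        for p, d_mult, t_mult in combo:
            prob *= p      # PSFs are treated as independent in this toy model
            d *= d_mult    # harsh PSFs raise the task demand
            t *= t_mult    # harsh PSFs lower the overload threshold
        if d > t:
            total += prob
    return total


p_overload = overload_probability(PSFS)
print(f"{p_overload:.2f}")
```

Even in this toy version, the point estimate of 20.2 that sits comfortably below 40 becomes an overload in most joint states once the PSFs act on both sides of the comparison.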

1) THE DISTRIBUTION OF THE MENTAL WORKLOAD LEVEL
As the assessment of VACP is a point estimation, and the assessment of the proposed method is a probabilistic estimation, we must determine where the assessment value of VACP falls within the probability distribution of the mental workload level. The distribution of the mental workload level is shown in FIGURE 7. We can analyze this result from two ... This actually shows the effect of PSFs: the degraded PSFs shift the mean value of the mental workload and change the distribution.

2) THE DISTRIBUTION OF THE THRESHOLD
As for the distribution of the threshold, readers can refer to FIGURE 8. In the described scenario, the probability that the threshold is lower than 35 is almost 50%, meaning that the pilot's threshold dropped substantially under the impact of the degraded PSFs. For normal tasks, the threshold may not have an influence, as the task demand is usually low. However, in emergency situations, the threshold can be influential because the addition of instantaneous tasks can push the demand above the threshold. The decrease in the threshold is one of the main causes of the mental overload.

3) COMPARISON BETWEEN VACP AND THE PROPOSED METHOD
VACP's assessment result is 20.2, which means that in this scenario, the pilot's mental workload was lower than the threshold (40). In contrast, our proposed method indicates with a relatively high probability (68%) that the pilot was in a state of mental overload, as shown in TABLE 15. Relying on VACP's result, the designer may draw the wrong inference that the scenario to be assessed is safe enough. This incorrect inference would be caused by ignoring the influence of PSFs on the task demand and threshold. With the support of our proposed method's results, however, the designer's attention would be drawn to the factors that influence mental workload, and the designer could therefore advance targeted suggestions.

4) THE MOST INFLUENTIAL PSFS
To assess the influence of PSFs, ''target sensitivity analysis'' is applied. By setting the evidence of a specific node, a target sensitivity analysis allows the user to investigate the changes in the probability of mental overload, and the results are shown in TABLE 16. Among all of the PSFs determined in the case, ''Number of simultaneous goals'' has been identified as the most influential. Thus, the most probable accident chain is: the instant task of finding the nearby Airbus → limited outside reference and lack of altimeter altitude → visual search for the Airbus → too many unit tasks for the pilot to perform → mental overload → neglect of the visual scan of the instruments → failure of the GPWS → accident.
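The mechanics of such an analysis can be sketched on a toy network: fix one PSF node to each of its states in turn, recompute the overload probability, and rank the nodes by the resulting spread. Everything numeric below except the base demand (20.2) and threshold (40) is hypothetical, including the state names marked as assumed, the even priors, and the multipliers; the values in TABLE 16 are not reproduced, and the deterministic scaling is a simplification of the actual BN.

```python
from itertools import product

BASE_DEMAND, BASE_THRESHOLD = 20.2, 40.0  # values from the case study

# HYPOTHETICAL: state -> (prior, demand multiplier, threshold multiplier).
PSFS = {
    "Working conditions": {
        "compatible": (0.5, 1.0, 1.0),        # assumed state name
        "incompatible": (0.5, 1.1, 0.9),
    },
    "Available time": {
        "adequate": (0.5, 1.0, 1.0),          # assumed state names
        "inadequate": (0.5, 1.2, 0.95),
    },
    "Number of simultaneous goals": {
        "within capacity": (0.5, 1.0, 1.0),   # assumed state name
        "more than capacity": (0.5, 1.5, 0.8),
    },
}


def overload_probability(psfs, evidence=None):
    """P(overload), optionally conditioned on evidence {node: state}."""
    evidence = evidence or {}
    per_node = []
    for node, states in psfs.items():
        if node in evidence:
            # Evidence pins the node to one state with probability 1.
            _, d_mult, t_mult = states[evidence[node]]
            per_node.append([(1.0, d_mult, t_mult)])
        else:
            per_node.append(list(states.values()))
    total = 0.0
    for combo in product(*per_node):
        prob, d, t = 1.0, BASE_DEMAND, BASE_THRESHOLD
        for p, d_mult, t_mult in combo:
            prob, d, t = prob * p, d * d_mult, t * t_mult
        if d > t:
            total += prob
    return total


def sensitivity(psfs):
    """Spread of P(overload) as each node is set to each of its states."""
    spreads = {}
    for node, states in psfs.items():
        probs = [overload_probability(psfs, {node: s}) for s in states]
        spreads[node] = max(probs) - min(probs)
    return spreads


spreads = sensitivity(PSFS)
most_influential = max(spreads, key=spreads.get)
print(most_influential, spreads)
```

With these made-up multipliers, the node with the largest spread is the one whose evidence moves the overload probability the most, which is the sense in which a PSF is called most influential here.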

5) SUGGESTION
The accident report describes the probable cause of the accident: ''The probable cause of this accident was the pilot's failure to identify and arrest the helicopter's descent, which resulted in controlled flight into terrain. Contributing to the accident were the dark night conditions, limited outside visual references, and the lack of an operable radar altimeter in the helicopter''. We would like to point out that the existence of too many simultaneous goals is the most influential factor, which is in accordance with the deduction in the report that ''the pilot was unable to perform complex tasks in the helicopter or fly a complete mission involving several tasks in a series'', and which validates the effectiveness of our proposed method.
As for suggestions, there are actually two threads to follow: improving the design and optimization of the task flow to reduce the task demand; and improving PSF-related elements so as to reduce their impact on the scenario. In this scenario, visually searching for the approaching Airbus was not ideal, so we recommend both the use of radar to detect other aircraft and the integration of this function with the flight instrumentation. In this way, at every fixation on the flight instruments, the pilot will be able to obtain enough information about the helicopter and the surrounding situation. This suggestion is also consistent with the accident report: ''If a functioning radar altimeter had been available to and used by the pilot, it could have provided a constant altitude cue, in the form of a digital readout of feet above the terrain, to enhance his awareness of the helicopter's height above the water'' [49].

V. CONCLUSION
Mental workload is one of the most concerning factors in the area of safety, and the assessment of mental overload is essential to the design stage. The present study proposes a mental overload assessment model that considers the effects of PSFs. The proposed model has the following characteristics. First, the proposed method considers the influence of PSFs on mental overload, which extends the description of the traditional mental overload assessment method. Second, the relation between mental workload level and human error probability was established by converting the PSFs' weighting factors. Finally, BN was used to model the influence of PSFs and calculate the probability of mental overload. The performance of the proposed method was demonstrated through a case study of an accident involving a helicopter crash. The results show that the proposed method can determine the mental overload state of the pilot more effectively than the traditional VACP method can, and it provides a more rational explanation of the accident's evolution.
XIN LU received the B.S. degree from the School of Aeronautic Science and Engineering, Beihang University, in 2015, and the M.S. degree from the School of Reliability and Systems Engineering, Beihang University, in 2018, where he is currently pursuing the Ph.D. degree. His research interests include reliability modeling, man-machine system safety analysis, and human reliability.
JIANBIN GUO received the Ph.D. degree in systems engineering from Beihang University, Beijing, China, in 2009. He is currently a Senior Lecturer with the School of Reliability and Systems Engineering, Beihang University. His research interests include reliability-based multidisciplinary design optimization, man-machine system safety analysis, and human reliability.
SHENGKUI ZENG received the Ph.D. degree in systems engineering from Beihang University, Beijing, China, in 2009. He is currently a Professor and the Assistant Dean with the School of Reliability and Systems Engineering, Beihang University. His research interests include reliability modeling and optimization, man-machine system safety analysis, and human reliability.
HAIYANG CHE received the B.S. degree in industrial engineering from Jilin University, in 2014, and the Ph.D. degree from the School of Reliability and Systems Engineering, Beihang University, in 2020. He is currently a Postdoctoral Fellow with the School of Automation Science and Electrical Engineering, Beihang University. His research interests include reliability modeling, man-machine system safety analysis, and human reliability.