Verification and Validation for a Digital Twin for Augmenting Current SORA Practices with Air‑to‑Air Collision Hazards Prediction from Small Uncooperative Flying Objects

Future autonomous Unmanned Aerial Vehicle (UAV) missions will take place in highly cluttered urban environments. As a result, the UAV must be able to autonomously evaluate risks and react to unforeseen hazards. The current regulatory framework for missions implements SORA guidelines for hazard detection, but its application to air-to-air collision is limited. This research defines a rigorous verification and validation (V&V) framework for digital twins for use in future autonomous UAV missions. We designed a sentry mission for a UAV to evaluate its capacity to detect small uncooperative flying objects. A digital twin of the DJI M300 vision system was built using a game engine, and a V&V framework was developed to assure the quality of results in both virtual and real-world scenarios. The results demonstrate the capability of the digital twin to identify vulnerabilities and worst-case scenarios in UAV mission operations. We present how the digital twin of an operational theatre can be exploited to assist remote pilots with the identification of air-to-air collision hazards from small uncooperative objects: the probability of air-to-air collision was calculated for three sentry patterns, and the results were validated in the field. Furthermore, we discuss how these results can be used to enhance current SORA-based risk assessment practices.


Introduction
There is a clear interest in developing Unmanned Aerial Vehicles (UAV) within the service industry. According to a recent survey, UAV-supported services will be worth about 71 billion USD by the end of this decade [1]. This deserved interest is supported by the prospect that UAVs might become key enablers for the UN Development Goals [2]. Future use cases for UAVs will therefore have to navigate highly cluttered environments; for instance, last-mile delivery UAVs will have to sense, detect, and avoid lampposts, powerlines, pedestrians, and other hazards in busy urban streets.
To enable such a level of autonomy, several technologies and standards must be developed to ensure that these operations can be carried out safely and securely, with no threats to human life and infrastructure. The current regulatory framework (in Europe and the UK) does not support these types of operations [3,4]. Nonetheless, the EASA Artificial Intelligence Roadmap envisions a path towards a future where these types of UAV operations will become possible [5]. Yet, we have made the case that to enact this roadmap, specific targets for detect and avoid technologies must be set [4].
Notwithstanding, robust detect and avoid technologies are currently used in commercial off-the-shelf UAVs to assist remote pilots in their operations. For example, the DJI Matrice 300 UAV incorporates a vision system that warns the remote pilot of hazards in the flight path of the UAV [6]. A remote pilot performing a challenging survey operation (for instance, the survey of a bridge) operates within the specific category [7]. This means that the remote pilot must submit a mission plan that risk-assesses the operation. Current guidelines for the risk assessment of UAV operations are based on the JARUS Specific Operations Risk Assessment (SORA) [8]. These guidelines provide a risk assessment methodology to determine if "a specific operation can be conducted safely" [8].
In this work, we propose the use of digital twins to enhance the risk assessment capacity of remote pilots. In particular, the Digital Twin (DT) will be used to identify air-to-air collision hazards. We address this by tackling the issue of trust in the results of the DT: before the DT can be used (and trusted) by remote pilots, its capacity to reproduce the real world must be evaluated. To accomplish this, we developed a Rigorous Digital Twin Verification and Validation Framework to ensure that the DT is fit for purpose before it can be used by a remote pilot to risk-assess and identify potential sources of hazards in safety-critical operational environments.
The Rigorous Digital Twin Verification and Validation Framework (V&V framework) was developed applying modern software development principles to assure the internal quality of the DT. We invested heavily in unit, integration and system testing to assure that the software is free from defects. Furthermore, we enacted feedback loops from the real world into the digital world to assure that the behaviour of the DT was the same in both realms.
A key concept that we put across is that for the DT to be fit-for-purpose, it must not only reproduce the real world to a degree of accuracy that does not challenge the mission, but it must also be free from (known) software defects.
To evaluate this approach, we introduce a scenario where a sentry UAV is monitoring known airspace. This scenario is a simplification of one of the use cases from RAPID (see section 3). In this scenario, the sentry UAV is tasked to detect the intrusion of small uncooperative flying objects that can threaten an operation. The DT is used to evaluate three different sentry strategies and to provide an indicative probability of air-to-air collision for each of these strategies. These probability-of-collision results can then be used to close the loop with the SORA and inform the risk assessment evaluation of the operation. Furthermore, the capacity of the DT to reproduce the real world is evaluated by exporting missions from the DT into the real world and verifying that the planned flight path (the waypoints) represents the same world coordinates in the real world and the digital world.

Digital Twins for Risk Assessment of UAV Operations
We differentiate between a simulation and a digital twin. Following the definition by Fei et al. [9], a digital twin integrates a simulation of a complex product with outside sensor data and/or measurements so that the behaviour of the digital entity mirrors the behaviour of the physical entity. As such, a DT has to integrate elements of the virtual and the physical world during its execution, while a simulation stays within the virtual world. Even though the design elements of a simulation and a DT might be similar (or even the same), it is the use and the interconnectivity with the real world that set a DT and a simulation apart. Liu et al. [10] present a meta-analysis of DT research; their results catalogue DT definitions, DT technologies and DT applications. Regarding technologies, game engines like Unity and Unreal have been exploited for the development of DTs [10,11]. Regarding applications, DTs can be used to improve a product design [9,12] and to support the manufacturing process of a product (through real-time data collection and performance prediction [13], among others). We have observed that, overall, research results in DTs are mostly about the application of DTs to support a manufacturing process [14].
There are limited examples of applications of DTs to mobility [15]. Wang et al. [15] present a review of these; the main takeaway relevant to this research is that all the sources reviewed in that work entail ground-transportation digital twins (for instance, see [16,17]).
Regarding the UAV domain, we have found simulation environments that can potentially form the basis for UAV DTs. Furthermore, while there are several proposals for the application of UAV DTs in several domains (e.g., last-mile delivery [18]), very few of these works present evidence in both the physical and the digital world. For instance, Lei et al. [19] present a comparison of machine learning algorithms that can be applied for UAV swarms in DTs; however, they do not present evidence of the application of their proposal in the physical world. Following the definition used in this paper, this places these works in the realm of simulation, not DT.
In contrast, as we do in this paper, Grigoropoulos and Lalis [20] present evidence on the application of their proposal in the real world. In [20] the authors argue that the UAV digital twin can be used to test UAV software components. They evaluate UAV flight navigation components both in a laboratory setting and in the digital twin.
Our work differs in that it has a clear grounding in current regulations, and our express interest is in assuring the replication capabilities of the DT, so that the quality of its results allows them to be transferred to real-world operations.
Therefore, to the best of our knowledge, the digital twin presented in this paper is the first DT targeted at the risk assessment of UAV operations.

JARUS SORA and Risk Assessments for UAV Operations
In this section, we provide a high-level overview of current regulations as they pertain to the risk-aware nature of UAV operations. In [3] we present a thorough review of the current legal requirements for UAV operations, and in [4] we discuss the synergy of regulations and technological innovation in assuring future secure Beyond Visual Line of Sight (BVLOS) UAV operations. EASA is the legal regulator for civil European aviation and has been developing regulations for the UAV space since 2002. Currently, these regulations segment UAV operations according to risk into the following three categories:
• Open Operations are those that do not require authorization from an aviation authority. These operations are bound by strictly defined boundaries (for instance: UAV size, direct visual line of sight, and altitude that does not exceed 150 m).
• Specific Operations are those that require a risk assessment. To operate under this category, a UAV operator must acquire relevant authorizations from the national aviation authority. Two standard scenarios describe the type of operations under this category; for operations that cannot be classified into the standard scenarios, specific authorizations are needed. The JARUS SORA (explained below) is the best-practice risk assessment methodology for UAV operations.
• Certified Operations are aimed at future use cases of UAV operations, envisioning scenarios like unmanned deliveries and BVLOS air taxis. Regulations for certified operations are currently under development [21].
The JARUS Specific Operations Risk Assessment (SORA) [8] describes a risk assessment methodology designed to establish a sufficient level of confidence that a specific operation can be safely completed. In the SORA, risk is associated with harm and informed by a probability of occurrence. While air-to-air risk evaluation is within the scope of the SORA, it is not addressed in the current version. The objective of the V&V framework presented in this paper is to address this through the application of a DT that can be used for quantitative risk evaluation of air-to-air collisions in UAV operations.
The JARUS SORA is an eight-step risk assessment process that culminates in the categorization of the operation within the six levels of SAIL (Specific Assurance and Integrity Level). The SAIL defines a 3-by-3 risk classification matrix in terms of integrity (defined as the layers of safety gained by each mitigation) and assurance (defined as the robustness of the evidence for the mitigation action(s)).
Risk quantification is an integral part of the SORA process. Risks, with their associated probabilities of occurrence, are quantified. Mitigation strategies (threat barriers) are also quantified in terms of their impact on decreasing the probability of occurrence or the impact of the risk. As a result, in this work we have adopted a quantitative approach for risk identification and designed our V&V framework to output an air-to-air collision risk probability that can be used by remote pilots to inform their SORA-based risk assessment.
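To make this quantitative view concrete, the sketch below (our own illustration, not taken from the SORA document) shows how a baseline probability of occurrence could be reduced by mitigation barriers, each modelled with an assumed effectiveness:

```python
# Illustrative sketch (not from the SORA document): a baseline probability of
# occurrence is reduced by independent mitigation barriers, each expressed as
# the fraction of hazardous events it is assumed to stop.
def residual_probability(p_occurrence, barrier_effectiveness):
    p = p_occurrence
    for eff in barrier_effectiveness:
        p *= (1.0 - eff)  # each barrier removes a fraction of remaining events
    return p

# e.g. a 1e-3 collision probability with two barriers at 90% and 50% effectiveness
p = residual_probability(1e-3, [0.9, 0.5])
```

Any real SORA assessment uses the tables and robustness levels of the methodology itself; the point here is only that risks and barriers are handled quantitatively, which is the interface our framework targets.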

Standardisation and Requirements for UAV Detect and Avoid Systems
There is a lack of standards for detect and avoid systems for UAVs. Since autonomous UAV operations are not permitted within current regulations, there is no incentive to develop hardware and software systems for robust detect and avoid. This creates a vicious circle, where there is a vision for future application of the technology, yet little incentive for its development.
Robust detect and avoid is not only vital for future autonomous BVLOS operations; it is required by current remote pilots as an assistive technology. In highly cluttered situations, like surveying an in-port bridge (see section 3), current remote pilots rely on these assistive technologies to help them navigate hazards and obstacles. Yet, as mentioned, there is no standardization as to what types of objects, nor at which distances, a robust detect and avoid system has to identify to be considered fit for purpose. This section highlights the few applicable standards and regulations that would cover the development of these types of systems.
There are two EASA regulations for the development of aerial software. EASA's AMC 20-115D [22] is targeted at manned aircraft. In 2021, EASA published guidelines for the certification of products operating within the specific category [23]. We argue that these guidelines are biased towards manufactured systems and do not consider the intricacies and risks of software development. Furthermore, within these guidelines it is the producer's responsibility to define the scope of the certification; therefore, the guidelines do not set a target for detect and avoid systems.
In [4], we have made the case that the ISO/IEC 250xx family of standards could potentially be applicable to certify the product quality of software-intensive detect and avoid systems [24]. This family of standards is aimed at the certification of software products. However, the scope of the certified product (with its key performance indicators) is defined by the software producer; therefore, there is no control nor requirement from aviation authorities as to which KPIs to include within the scope of certification. This is not a critique of the standard: as a general software quality standard, it is not expected to set targets for robust detect and avoid solutions. Therefore, while we claim that the standard is not directly applicable, we value its approach as a potential pathway to certify software-intensive products within the UAV domain.
Some research projects have indicative targets for detect and avoid systems. In particular, [25] presents some recommendations for separation distances for UAVs in highly cluttered environments. However, it does not present empirical evidence on the capacity of sensors or solutions to achieve its proposals.
To the best of our knowledge, ours is the only research line looking to characterise current detect and avoid systems with the objective of determining a performance baseline for the development of robust software-intensive detect and avoid technologies.

The RAPID H2020 Project use Case
The RAPID project (www.rapid.eu) is an H2020-funded project whose aim is to save lives by automating the maintenance inspection surveys of critical at-risk infrastructure.To achieve this, RAPID strives to develop technology to enable the end-to-end automation of a maintenance inspection survey mission using a swarm of UAVs.
Among the RAPID use cases, of interest in this paper is the Bridge Inspection use case, where a swarm of UAVs is tasked to survey a bridge. In this scenario, each of the UAVs in the swarm is assigned segregated airspace and a set of waypoints. This assignment spatially separates the UAVs in the swarm so that there is little to no risk of air-to-air collision among its members. Under current regulations, the operation would be closely monitored by a trained remote pilot, who oversees the airspace around the bridge for incoming hazards and takes appropriate action if one is detected. However, in a future fully autonomous use case, a UAV performing a survey mission will be unable to fully monitor the operational theatre. As a result, a sentry UAV is incorporated into the swarm with the responsibility to monitor the operational theatre and detect incoming intruders that could breach the segregated airspace of the UAVs in the swarm.
In this paper, we will use this scenario as the guiding one for the evaluation of the V&V framework.The scenario entails a sentry UAV monitoring a defined airspace.A small uncooperative UAV will breach the airspace on a direct collision course with the sentry UAV.We will evaluate three sentry strategies for the detection of this small uncooperative UAV.

Design Considerations Related to Software Quality
A digital twin that interacts with the real world is a challenging software component to quality-assure. Testing these systems is challenging [26,27]. As early as 1971, it was observed that testing techniques cannot theoretically cover the entire input space of a software system [28]. While software quality assurance techniques have evolved, so has the complexity of software systems.
The key issue with assuring the quality of these systems is how to deal with autonomy. Autonomy relates to the capacity of the system to change behaviour without the need for human interaction [29]. The key challenge for autonomy is the requirement to remove the human from the loop in any adaptation process. This capacity requires self-adaptive and context-aware capabilities from the systems. Self-adaptation is defined as the capacity of the system to change dynamically and modify its behaviour [30], and context-awareness is the capacity of the system to modify its behaviour to better serve the user [31]; these are system characteristics that are not currently supported by applicable regulations. We claim that having a digital twin that can execute variations of a scenario that can be validated in the real world removes an element of uncertainty for the autonomous characteristics, as emerging behaviours can be observed in the digital world. However, to enact enough confidence in the digital twin results, a rigorous verification and validation framework must be developed to accompany the development of the digital twin.
To develop the Verification and Validation (V&V) framework we draw from modern software testing practices, paying attention to both process quality and product quality. For this framework we enacted the three recommendations presented in [32]:
• Develop the V&V with a well-defined, quality-oriented software development lifecycle. We enacted a modern agile software development process with testing activities at all stages of the product: from individual unit testing and component testing to field exercises to calibrate the product within real-world scenarios.
• Enact a thorough testing strategy using traditional (non-context-aware) techniques. It is unnecessarily expensive to enact a complex simulation environment to test a complex CPS like our DT, only to realise that the strategy resulted in the identification of oversight-type defects [33]. Therefore, we support an incremental strategy with growing complexity of the test cases and test items.
• Exploit advanced compute-intensive techniques (like digital twins) that impose no restrictions on the context and its variation. Through repeated executions of a defined scenario, not only can emerging behaviours be understood, but defects in the test item can also be identified.
Overall, we designed this framework to de-risk the complexity of assuring the quality of the DT and to increase the trust (see section 4.2) in the simulation results. Fig. 1 shows how the implementation of the test cases was inspired by the Cynefin framework [34,35]. The Obvious quadrant is the realm of the Known-Knowns. Test cases in this quadrant represent the unit testing of functions and classes, or the verification of the presence or absence of elements in the virtual scene; technologies for achieving this are mainstream (NUnit and the Unity Testing framework [36]). The Complicated quadrant is the realm of the Known-Unknowns. Test cases in this quadrant require the execution of the simulation capabilities of the DT. Though Unity provides technologies to define and execute these tests (PlayModeTest in the Unity APIs), while we can define the expected outcomes of the test cases, we do not know the detailed execution path of each test, so we designed the V&V framework to complement this gap. The Complex quadrant is the realm of the Unknown-Knowns. In the simulation, this is represented by the capacity to introduce variation into a known test scenario (a capability added by our V&V framework on top of the implementation existing in the Unity Testing framework; see 4.3). Furthermore, this quadrant is also the realm of lab validations (like designing an empirical set-up to measure the rotation of the gimbal; see section 4.4) and real-world field trials (see section 5.2). Finally, the Chaotic quadrant is the realm of the Unknown-Unknowns. This is where real-world live operations take place. As mentioned, current regulations require that UAV operations are carried out after thorough risk assessments and hazard identification and evaluation. Remote pilots are trained to deal with unforeseen situations and are required to report accidents and near misses. These situations can be captured back into the V&V framework through test cases at each quadrant of complexity, which, after several iterations, increases the trust in the overall system.

Design Considerations Related to Trust in the DT Simulation Results
Before the results of the simulation can be of use to domain experts, the simulation users must be able to trust that the simulation and its results are reliable representations of reality. The development of software systems is a complex endeavour, and it is prone to failure [37].
To that end, we treat the status of the digital twin as a cycle that goes from trustworthy to untrustworthy. Fig. 2 shows how we have operationalised this cycle.
• Trustworthy: This state is operationalised when the following conditions are met:
– All software tests written into the system pass. Our V&V framework contemplates tests that are both static (using Unity EditModeTest) and dynamic (using Unity PlayModeTest). Table 1 presents an example of each type.
– All scenario runs pass. Our V&V framework takes advantage of PlayModeTest to describe and execute dynamic scenarios. Each scenario has a pass or fail condition, set according to design and real-world expectations (see Table 1).
– Field tests are evaluated to pass. Scenarios executed in simulation should execute in the real world. Results of field trials are analysed to determine that the simulation and the real world behave in the same way (see section 5.2). Table 2 presents some of the field trials we have executed to maintain trust in the simulation.
• Untrustworthy: The results of the DT cannot be trusted if any of the previous conditions are not met. A root cause analysis cycle [38] is enacted to identify the cause of the deviation. The overall strategy is to capture the identified root cause into a test (of any of the previous types). The result of traversing through these states of trust and distrust is that with each cycle the verification and validation harness becomes more robust: there are more tests in the software and more validation activities in the field, leading to an overall increase in the trust in the simulation results.
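The trust cycle described above can be summarised as a small state machine; the class and method names in the sketch below are ours and purely illustrative:

```python
# Illustrative sketch of the trust cycle (names are ours, hypothetical).
# The DT is trustworthy only while unit/scenario/field checks all pass; a
# failure moves it to untrustworthy until a root-cause test is added and passes.
class TrustCycle:
    def __init__(self):
        self.trustworthy = True

    def record_results(self, unit_tests_pass, scenario_runs_pass, field_tests_pass):
        # any failing condition makes the DT results untrustworthy
        self.trustworthy = unit_tests_pass and scenario_runs_pass and field_tests_pass
        return self.trustworthy

    def capture_root_cause_as_test(self, new_test_passes):
        # after root-cause analysis, the cause is captured as a new test;
        # trust is only restored once that test passes
        if new_test_passes:
            self.trustworthy = True
        return self.trustworthy
```

Each traversal of the cycle leaves an extra test behind, which is how the harness grows more robust over time.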

Description of the Main Software Components in the V&V Framework
In this section, we present the design of the core elements of the DT. We have developed our DT for UAV operations on top of the Unity engine. While environments like Unity have been designed mainly for games development, there are experiences in extending these to develop DTs [11]. Our approach uses Unity's out-of-the-box rendering and physics engine to reproduce the operational theatre, and we extend the Unity API to develop a rigorous V&V framework to assure the quality and reliability of the outputs of the DT. Fig. 3 presents the main software components of this framework. We have developed our framework on top of the Unity Testing xUnit implementations. While we take advantage of the execution engine provided by both implementations, we have created custom test listeners (TestResultListeners) and custom test launchers (Launcher) to intercede in the execution process and extract the data (like the position of moving elements) from the simulation that will be used for calculating the SORA probability of air-to-air collision.
The main responsibilities of the elements in this component are:
• Launcher: provides a static entry point in the Unity menu for a user to select and execute a scenario.
• SoraBaseScenario: the implementation of a scene in the Unity environment. Though technically it can be considered to be within the Unity engine, a scene is the starting point for the RAPID framework.
• TestResultListener: two custom test result listeners were implemented to determine the outcomes of the execution of a scenario. Result listeners are tasked to monitor the execution of a scenario (or multiple executions of a scenario) and collate the results into a short-hand report.
• ConsoleResultListener: outputs the summary result to the Unity console (useful for developers).
• XMLResultsListener: outputs the summary of the results to an XHTML file that can be opened in a web browser (useful for end-users).

• InputDataSet.csv: variations for the execution of a scenario can be described in a comma-separated values (CSV) file. Each row in the CSV specifies new input conditions for a scenario.
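As a minimal sketch of this mechanism (the column names are our assumption, not the project's actual schema), each CSV row can be parsed into the initial conditions of one scenario execution:

```python
import csv
import io

# Hypothetical sketch of how rows of InputDataSet.csv could parameterise
# repeated runs of one scenario; the column names (x, y, z, speed_mps) are
# our assumption, not the project's actual schema.
SAMPLE = """x,y,z,speed_mps
205,0,50,5
205,5,50,10
"""

def load_variations(text):
    rows = csv.DictReader(io.StringIO(text))
    return [
        {"start": (float(r["x"]), float(r["y"]), float(r["z"])),
         "speed": float(r["speed_mps"])}
        for r in rows
    ]

variations = load_variations(SAMPLE)  # one scenario execution per row
```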
• Scenarios: The scenario is the implementation of the use cases within the Unity engine. Each use case can be represented with a scene (in Unity), where static objects are placed. Dynamic objects can be placed in their initial positions for the scenario using the Unity Editor, or they can be incorporated in the Unity project as GameObjects (elements in Unity that can be used in a scene) to be introduced into the scene programmatically. A human-in-the-loop programs how the scenario plays out (known/knowns); variation can be introduced from the input data sets (known/unknowns and unknown/knowns). The main element in this component is:
– SentryDroneScenario: to date, the Sentry Drone scenario has been implemented within this framework (see section 3).
• ScenarioHelpers: This component contains elements that can be reused across multiple scenarios. These elements are designed to introduce variation and extend the behaviour of the scenarios without changing the code, so the same scenario source code can be executed and the results of multiple variations (like speeds and positions) of the same scenario can be collected. The main elements in this component are:
– BehaviourVisitors: These helpers are implemented using the visitor design pattern, which separates algorithms from the objects on which they operate. Simple visitors used to modify the speed of dynamic objects are implemented in this component [39].
– GimbalMovements: The variations of the M300 gimbal movement have been developed within this container. This allows executing the same scenario with different movements for the gimbal by changing the instance of the subclass at runtime.
• Gimbal180Movement: this class inherits from GimbalMovement (therefore its instances can be replaced at runtime) and encapsulates the behaviour of a gimbal that performs 180° sweeps towards the front orientation of the UAV.
• Gimbal360Movement: this class inherits from GimbalMovement and encapsulates the behaviour of a gimbal that performs 360° sweeps. This class incorporates the physical restriction of the M300 gimbal (which can move up to 315° clockwise, and then one revolution and 315° anti-clockwise). Therefore, the 360° sweep is obtained by rotating 180° anti-clockwise, then 360° clockwise, and thereafter performing 360° sweeps in alternating directions.
• V&V Helper Extensions: this component contains elements that assist in the data export process. These components facilitate the exchange of data that will later be used to analyse the results of the simulation.
– DataExportVisitors: modelled after the visitor design pattern, the entities under this hierarchy listen to changes in the state of the dynamic objects to export data out of the simulation.
– PositionVisitorXML and PositionVisitorCSV: export the position of the dynamic objects they are "visiting" into a text file. They depend on the UnityToLatLongExport entity to convert from the Unity coordinate system to world coordinates (a key element of the validation scenario; see section 5.2).
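A common way to implement a conversion like UnityToLatLongExport over the small area of an operational site is a flat-earth approximation anchored at the ground control point. The sketch below illustrates that idea only; it is our reconstruction, not the project's code, and the anchor coordinates are illustrative:

```python
import math

# Small-area approximation for converting local coordinates (metres, anchored
# at a surveyed ground control point) to latitude/longitude. This is our
# sketch of the idea behind UnityToLatLongExport, not the project's code.
EARTH_RADIUS = 6_378_137.0  # metres (WGS-84 equatorial radius)

def unity_to_latlon(x_east_m, z_north_m, gcp_lat, gcp_lon):
    # metres north map directly to latitude; metres east are scaled by the
    # shrinking length of a degree of longitude at the anchor latitude
    dlat = math.degrees(z_north_m / EARTH_RADIUS)
    dlon = math.degrees(x_east_m / (EARTH_RADIUS * math.cos(math.radians(gcp_lat))))
    return gcp_lat + dlat, gcp_lon + dlon

# e.g. a point 100 m north of an (illustrative) ground control point
lat, lon = unity_to_latlon(0.0, 100.0, 55.92, -4.41)
```

Over a few hundred metres this approximation is accurate to well under a metre, which is why a single surveyed anchor point suffices for validation at an operational site.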

Bringing the DJI M300 into the Digital Twin
We are using a DJI Matrice 300 UAV (M300 for short) for our field experiments. This UAV is a top-of-the-line commercial off-the-shelf UAV currently used by remote pilots in specific operations missions. There are two main elements from the real-world M300 system that we have brought into the DT: the M300 vision system and the L1 Lidar sensor. The M300 vision system is described in the M300 user manual [6]. The vision system is responsible for obstacle detection, to aid the remote pilot in navigating cluttered environments. It was mainly designed to avoid large static obstacles (like trees, bridge frames or fences). The M300 vision system has visible-spectrum and infrared cameras. The visible-spectrum cameras are used in conjunction with a machine vision system to monitor and detect obstacles. The infrared sensors are also used to detect obstacles and to help maintain the UAV position. The visible-spectrum sensors have a detection range that varies from 30 m (top and bottom) to 40 m (front, back and sides), while the infrared detection range is 8 m.
The manual describes the angles, positions, and range of the vision system, and we have used that information to reproduce it in the digital twin (see Fig. 4). In the DT, collider cones are used to reproduce the vision sensors, and Unity raytracing capabilities are used to reproduce the infrared sensors. Box 1 presents the algorithm used in the digital twin to detect collisions. As mentioned, we implemented a cone that represents the field of view of the L1 Lidar sensor. The specifications for these field-of-view cones were taken from the manual. In the digital twin, we also introduce how environmental lighting affects the L1 sensor, as described in the manual. This approach establishes a base-case scenario for detection, as weather and declining vision conditions are not yet simulated in the DT. As seen in Box 1, we do not implement the range limitation of the IR system, as we intend to sense small uncooperative flying objects that intrude into the mission airspace.

Box 1. When the Unity3D engine detects a collision, the colliding object is assigned to the intruder I.
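The logic of Box 1 can be sketched as follows. The outcome labels mirror the effects described in section 5 (detected by the L1 sensor, detected by the vision system, or a crash), but the tag names and the code itself are our reconstruction, not the project's implementation:

```python
# Hedged reconstruction of the Box 1 idea: each collider volume in the DT is
# tagged with the sensor it models, and the volume the intruder I enters
# determines the outcome. The IR range limit is deliberately not modelled
# (see the text above). Tag names are our assumption.
OUTCOMES = ("l1_detected", "vision_detected", "crash")

def on_collision(collider_tag):
    # collider_tag identifies which virtual volume the intruder entered
    if collider_tag == "l1_fov_cone":
        return "l1_detected"
    if collider_tag == "vision_cone":
        return "vision_detected"
    if collider_tag == "uav_body":
        return "crash"
    raise ValueError(f"unknown collider: {collider_tag}")
```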
Sentry mode represents mission variations; in this case, we will evaluate three variations:
• An M300 with an L1 Lidar sensor and no gimbal rotation.
• An M300 with an L1 Lidar sensor rotating through a 180° sweep.
• An M300 with an L1 Lidar sensor rotating through a 360° sweep.
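The 360° variation is the most involved, because it must respect the gimbal's mechanical stops (see section 4.3). Below is a sketch, under our own naming, of a sweep plan that never exceeds the ±315° travel limit:

```python
# Sketch of the 360° sweep plan under the gimbal's mechanical stop at ±315°
# (yaw relative to the airframe; names are ours). Starting from 0°, an initial
# 180° anti-clockwise leg leaves enough travel for full 360° legs that
# alternate direction without ever passing the stop.
LIMIT = 315  # degrees of travel allowed in each direction

def sweep_plan(n_legs):
    legs = [-180]               # initial anti-clockwise half turn
    direction = +1              # then clockwise first
    for _ in range(n_legs):
        legs.append(direction * 360)
        direction = -direction  # alternate direction each full sweep
    return legs

def within_limits(legs):
    angle = 0
    for leg in legs:
        angle += leg
        if abs(angle) > LIMIT:
            return False
    return True
```

After the initial half turn the gimbal oscillates between -180° and +180°, so arbitrarily many full sweeps stay within the stops.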
The speed of the small uncooperative object is an independent variable, as it can be controlled in the simulation; the initial position of the small uncooperative flying object is also an independent variable. We defined the scenarios so that each initial position is separated from any other by 5 m (in X, Y and Z), starting at a 205 m range to the sides of the sentry UAV.
The Subject box depicts the experimental subject.In this case, the same source code within the DT is executed in all variations of the scenario.As presented in section 4.3 this corresponds to the SentryDroneScenario.
The Effects box represents the output variables. These are the result of the scenario in terms of the small uncooperative object being detected by the L1 sensor, detected by the Vision System, or crashing into the UAV. The outputs also include the position at which the small uncooperative object was detected (in Unity coordinates and real-world coordinates), and the distance in meters between that position and the position of the sentry UAV.

Empirical Evaluation
This section presents the use of the framework for estimating air-to-air collision risk in UAV missions. As mentioned in the previous section, the empirical evaluation presented here was carried out while the system was in a state of Trust. We show how the system can be used to calculate the probability of collision, followed by the validation of the results in the field with a selected scenario.

Instrumentation
Fig. 5 presents the experimental design for using the DT to estimate air-to-air collision risk. The Input box in Fig. 5 represents the independent variables of our experimental design. The Scene element is the representation in the digital twin of the operational theatre. The DT must be initialized with the digital replica of the operational site. In this case, we have conducted our empirical evaluations at Cochno Farm, University of Glasgow. The scene includes a digital asset that represents a Ground Control Point. This asset is initialized within the Unity coordinate system, and with the real-world coordinates that correspond to the scene (see section 5.2).

Execution Results
The data and analysis procedure for this section were made open source on Kaggle.³ The input dataset describes 456 different scenarios for the initial conditions of the scene. As a dataset (see Table 3, Input space row), the scenarios are defined to form a square whose centre coincides with the position of the sentry UAV. Each scenario is separated from any other by a 10 m distance in X, Y and Z, and the distance from the centre to the side of the "square" is 205 m. Each of these scenarios is evaluated individually by the framework.
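The layout of the input space can be sketched as follows. This is a simplified, hypothetical reconstruction: it generates start positions on the perimeter of the square at a fixed altitude, whereas the published Kaggle dataset also varies the vertical offset to reach the full 456 scenarios:

```python
def perimeter_starts(centre, half_side=205.0, spacing=10.0):
    """Generate candidate intruder start positions on the perimeter of a
    square centred on the sentry UAV. Simplified sketch: a single altitude
    plane only (the real dataset varies Y as well)."""
    cx, cy, cz = centre
    points = []
    steps = int(2 * half_side / spacing) + 1
    for i in range(steps):
        offset = -half_side + i * spacing
        points.append((cx + offset, cy, cz - half_side))  # near side
        points.append((cx + offset, cy, cz + half_side))  # far side
        points.append((cx - half_side, cy, cz + offset))  # left side
        points.append((cx + half_side, cy, cz + offset))  # right side
    return sorted(set(points))  # de-duplicate the four corners

starts = perimeter_starts((0.0, 50.0, 0.0))
```

With a 205 m half-side and 10 m spacing this yields 164 unique perimeter positions per altitude plane.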
Table 3 presents a graphic representation of the results of the simulation. The first row of Table 3 presents the input space as recorded by the simulation. The three images in this first row show that the input is the same for all three scenarios. However, they are coloured according to the sensor that detected each starting point. This was done as a first visualization of the results.
The Output space row shows the positions at which the incoming object is detected. Colour coding is used to identify which sensor detected the incoming object. This visualization helps identify the positions from which each of the strategies under investigation is able to identify incoming uncooperative objects.
The overall detection capacity of the scenario is presented in Table 4. It can be observed that the profile of the detection range has more spread with the 360° movement strategy. We discuss the implications of this from a risk-management perspective in section 6.2.
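The overall detection capacity in Table 4 is essentially a per-strategy detection rate over the scenario outcomes. A minimal sketch follows; the outcome labels and record layout are assumptions for illustration, not the published dataset's schema:

```python
from collections import Counter

def detection_rate(records):
    """Compute the fraction of scenarios in which the intruder was detected,
    grouped by sentry strategy. records: iterable of (strategy, outcome)
    pairs, where outcome names ("lidar", "vision", "crash") are assumed."""
    totals, hits = Counter(), Counter()
    for strategy, outcome in records:
        totals[strategy] += 1
        if outcome in ("lidar", "vision"):  # detected by either sensor
            hits[strategy] += 1
    return {s: hits[s] / totals[s] for s in totals}

# Tiny illustrative sample, not real simulation output
sample = [("360", "lidar"), ("360", "vision"), ("360", "crash"),
          ("180", "lidar"), ("180", "crash")]
rates = detection_rate(sample)
```

Applied to the full 456-scenario dataset, the same aggregation yields the per-strategy figures reported in Table 4.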

Field Validation
We validated the results of the simulation by executing one instance out of all the executed instances presented in the previous section. The instance was selected by convenience, with safety and security being primary concerns. Convenience sampling, as explained by Cresswell [40], is when the experimental subjects are selected given availability or naturally formed groups. In our case, the safety of the pilots and potential uninvolved persons, and the security of the equipment, are primary concerns. Furthermore, we note that the M300, which weighs about 6.3 kg with the two batteries, is a class C3 UAV that can only be operated under the Specific category [7]. All the authors involved in this study have undergone the General Visual Line of Sight Certificate and are legally allowed to operate this type of UAV. To legally operate the M300, a risk assessment following the current SORA is required (see section 2.2). Therefore, for field validation we selected an execution that minimized the risk to the UAV remote pilots and the risk of collision, considering the terrain, the start and end positions, and the flight paths of the UAVs. Fig. 6 presents the experimental design to validate the results from the DT. The objective of this validation is to investigate the positional prediction of the simulation. We observe the outputs of the simulation, create a real-world mission with the same situational parameters, and measure the position of the incoming uncooperative UAV at the waypoint at which the simulation predicts it was detected by the sentry UAV.
The Input box in Fig. 6 represents the independent variables. As mentioned, a scenario must be exported from the digital twin into the real world. This is done through the component PositionVisitorCSV (shown in Fig. 3), which can export the position of a Dynamic object within the DT in a text-based comma-separated file format that can be imported. Once in the field, the first step is to locate the position of the Ground Control Point, validating that its position is an accurate reflection of the observed position in the DT, and to set the take-off and landing spot close to the Ground Control Point. Fig. 7 shows the set-up in the field, and Fig. 8 shows the set-up in the digital twin. In the field, we deploy a DJI Real-Time Kinematic positioning device (RTK) to increase the positioning precision of the UAVs in the operation. The RTK is not represented in the DT, as position precision there is achieved by observing distances in Unity coordinates. To consistently reproduce the scene in the real world, we use the algorithm in Box 2 to identify the precise position of the GCP at every field exercise. This procedure allows us to reproduce the mission under the same conditions on different days. It also serves as the first validation of the coordinate system of the DT.
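As an illustration, the export step can be as simple as writing timestamped positions to a CSV file. The column layout below is an assumption for illustration, not the exact format produced by PositionVisitorCSV:

```python
import csv

def export_positions(path, samples):
    """Write (t, x, y, z) position samples of a Dynamic object to a CSV file.
    Column names are illustrative; PositionVisitorCSV defines the real format."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["t", "x", "y", "z"])
        writer.writerows(samples)

# Example: two samples of a simulated intruder trajectory (values invented)
export_positions("mission_path.csv", [(0.0, 205.0, 50.0, 0.0),
                                      (1.0, 195.0, 50.0, 0.0)])
```

A file in this shape can then be translated into a waypoint mission for upload through the UAV controller.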
In <- GCP position in the Scene
Turn on the RTK.
Turn on the M300 and link the controller.
Identify the GCP position with 1 m error through third-party software (Google Maps or WhatThreeWords).
Identify the precise position through the M300 comlink and the RTK.
Box 2 Positioning an object in the real world.
Subject As mentioned, there are two UAVs on the scene. As part of the set-up (Input), we would have exported a mission path to each of the UAVs and uploaded it through the UAV controller.
Effects Actual flight paths are exported from the controller after the mission. The DJI flight dataset has information on the waypoints that a UAV has visited during a mission. We use Airdata⁴ to process the flight dataset. Figure 9 presents the visualization of the flight in Airdata. Waypoints in Airdata can be identified and exported to Google Maps. We found that Google Maps provides a better graphical representation, as the Google Earth interface allows for better zooming (Fig. 10). The mission waypoints are exported in KML format from the Airdata UI.

Execution Results
The result of the validation scenario is presented in Table 5, confirming that the expected end position from the simulation and the measured position are within 0.00001 degrees in latitude (less than 0.0001 m) and 0.00001 degrees in longitude (less than 0.001 m), with the overall error of the system being less than 0.001 m².

Discussions
Our results show that it is possible to use the digital twin to estimate the probability of an air-to-air collision hazard. However, in this section we discuss limitations and issues that require further development to make our approach ready for mainstream adoption.

Time to React: From Sensing Time to Avoid Time
The results from section 5.1.2 represent the best-case scenario for the sensing and detection of an uncooperative flying object. These results are drawn from the specifications in the DJI M300 manual and the L1 sensor manual. As these elements are swapped with their real-world counterparts, the process of detecting the small uncooperative flying object needs additional steps and time (see Fig. 11).
To exemplify, in Box 3 we present a high-level calculation showing how the sense-and-detect capability will be degraded in the real world, and how this results in increased risk. The time parameters are estimated from our field trials in a related line of work that has evolved from [41].
Box 3 Calculation for real-world detection capabilities.
The field validation was conducted in an uncluttered environment. This decision was taken with safety considerations in mind, as the field validation missions had to be flown within the current legal framework.
Regarding the results of the simulation, while the 360° movement strategy was the most successful, detecting 91% of incoming hazards, it was also the strategy that showed the greatest variation. We can observe through the graphs in Table 3 that hazards approaching from the front and right are the ones most likely to be missed. This is the result of a design decision in the 360° strategy, which rotates first towards the left; the relationship between the rotation speed and the incoming speed of the UAV produces these misses. Remote pilots utilizing these results must be aware of the bias introduced by the initial conditions of their simulations.
Regarding our estimation of reaction time and separation distances, consider a not-so-distant future use case in which a swarm of UAVs is tasked with autonomously surveying critical infrastructure. Comparing the results from Table 4 and Table 6 shows that the detection distance is decreased by 3.8 m with the above parameters. Nonetheless, in closely cluttered environments such as the one described in Sect. 3, this distance can become the difference between two UAVs operating within their segmented airspace and an intrusion that must be handled with collision-avoidance manoeuvres.
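The distance comparison above can be sketched as a simple speed-times-delay calculation. The 0.38 s delay below is an illustrative assumption chosen to reproduce the 3.8 m figure, not a value measured in our trials:

```python
def reaction_distance_loss(intruder_speed_ms, extra_delay_s):
    """Distance travelled by the intruder during the additional real-world
    sensing and processing delay (the extra steps in Fig. 11 (B))."""
    return intruder_speed_ms * extra_delay_s

# Example with assumed parameters: an intruder closing at 10 m/s and an
# assumed 0.38 s of extra delay (sensor integration + processing) cost
# 3.8 m of detection margin.
loss = reaction_distance_loss(10.0, 0.38)
```

Any realistic estimate of the real-world delay simply scales this lost margin linearly with the intruder's closing speed.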

Effects of Initial Conditions on Detection Capacity
The results of the simulation will be affected by the initial conditions. This includes not only parameters like incoming speed, but also the terrain and static hazards in the digital twin. To validate the capabilities of the DT, we selected a simple representation of the real world; future work will explore its capability in more complex environments.

Future Risk Management Frameworks and Standardisations
We mentioned in section 2.3 that there are no standards for the capabilities of autonomous detect-and-avoid systems or digital twins. In this paper, we have made the case that the internal quality of the digital twin should enable trust in the system. The focus on safety and security is paramount to enable future beyond-visual-line-of-sight operations. Swarms of UAVs cooperating to complete a task (like a bridge inspection) must guarantee that hazards (to humans and equipment) are minimized. The risk-based approach presented in this paper aims at evaluating known-unknown risks that can be anticipated in an operational environment. While risks and hazards cannot be completely eradicated, we argue that as these experimental solutions move toward higher technology readiness levels [42], it will be of utmost importance that the technology can leave an audit trail. Current frameworks for manned aviation rely on the submission of accidents and near misses (in the UK regulated by CAP382 [43]), so the industry as a whole benefits from these reporting practices. With autonomous BVLOS operations, the audit trail will enable thorough reporting and foster the learning capacity of the industry.
When a swarm surveys critical infrastructure (i.e. a bridge), separation distances of about 4 m and reaction times of about 10 s are critical for hazard detection. In these situations, the incoming uncooperative flying object can be hidden within the infrastructure, thereby limiting the effectiveness of the sentry strategies presented in this paper. Notwithstanding, we claim that with a digital twin of an operational theatre, custom sentry movements and missions can be tailored to accommodate the hazards in the operational theatre.

Feedback Loops to Real-World Experimentation
We are working towards characterizing the real-world capabilities of the sensors in the field. These measurements will serve to inform the DT so that its capacity to reproduce the sense-and-detect capabilities of the physical system is improved.
For example, we are currently working on characterizing the use of the L1 lidar to detect uncooperative flying objects (the approximations in Box 3 are taken from this line of work). As we characterize these capabilities (with variables like detection range, lighting conditions and time to build data points), these parameters can be introduced in the Digital Twin to improve the air-to-air hazard estimation.
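As these characterizations mature, they could enter the DT as a parameterized sensor model. The following is a hypothetical sketch; all parameter names and values are assumptions to be replaced by fitted field measurements:

```python
from dataclasses import dataclass

@dataclass
class LidarDetectionModel:
    """Hypothetical parameterization of L1 detection capability.
    Every default below is an assumed placeholder, not a measured value."""
    max_range_m: float = 180.0        # nominal detection range, to be measured
    point_build_time_s: float = 0.5   # time to accumulate enough lidar points
    lighting_factor: float = 1.0      # 1.0 = ideal lighting conditions

    def effective_range(self):
        """Range degraded by lighting; a linear model is an assumption."""
        return self.max_range_m * self.lighting_factor

# Example: degraded lighting reduces the usable detection range
model = LidarDetectionModel(lighting_factor=0.8)
```

Feeding such a model into the simulation would let the DT re-estimate collision probabilities under measured, rather than manual-specified, sensor performance.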
As mentioned in section 4.4, there are elements in the DT that are designed with these hazards in mind. While the focus of this paper was on validating the DT as a trustworthy

Conclusion
This paper presented a Rigorous Digital Twin Verification and Validation Framework that is used to ensure the pertinence of the DT to inform a specific operation risk assessment. This framework draws from modern software development practices to assure the quality of the DT.
As for theory contributions, we put forward that before the results of the DT simulation can be used for risk assessment, the remote pilots (end-users of the DT system) must Trust the results. This trust comes from three sources:
• Confidence in the quality of the development process, evidenced by the execution of automated test cases that assure the internal quality, and the feedback cycle that enables the overall test suite to grow in size and coverage of the DT.
• Confidence that the DT is capable of reproducing the real world, evidenced by the definition and execution of field trials aimed at showing this capability.
• Confidence in the quality of the DT product, evidenced by the absence of known software defects.
As for practical contributions, we show that a DT can be developed to reproduce a UAV operation within the rules of the Specific category and used to evaluate the risk of air-to-air collisions. Through the application of this approach we have been able to:
• Evaluate different options for the same mission.
• Determine the most risk-averse alternative.
• Quantify the probability of air-to-air collision for the chosen alternative (at 9%).
We also argue that evaluating the capacity of a DT to reproduce the real world is a key element before the system can be put to use. We have shown how we have done this by defining field exercises that are meant to evaluate this capacity. This enacts a feedback loop between the real and the virtual world that we have exploited to improve the internal quality of the DT by:
• Defining additional test cases (in the virtual world) that capture past inconsistencies, to make sure that these are not reintroduced in future versions of the DT.
• Faithfully reproducing the missions in the real world to assure that the capacity is maintained.
Furthermore, we argue that this approach is better than pen-and-paper SORA and can be used to inform the SORA-based risk assessment that a remote pilot carries out for other elements in the field. This capacity to identify and quantify the risk of air-to-air collisions, while within the scope of the JARUS SORA, is an enhancement to current UAV risk assessment practices.
Finally, we discussed the implications of our results for future autonomous sense-and-detect systems within future highly cluttered use cases. We argue for the development of product standards that can establish the capabilities of sense-and-detect systems, so that these standards can drive the development of future sensors and the capabilities of future embedded computer-vision models, and, as a result, ensure that future UAV operations in highly cluttered urban environments can be conducted safely.

Fig. 1 Complexity of the Test Cases in the context of the Cynefin framework

Fig. 3 Main components of the Digital Twin for UAV operations

If (S = RayTraceHit.Distance) then Return S  // Confirm intrusion
Else Return null  // No intrusion
Box 1 Digital Twin detect algorithm (pseudo code)

Fig. 4 Digital Twin screenshot that visualises the M300 Vision System (shading added to cones to facilitate orientation)
Fig. 5

Fig. 10 Google Maps visualisation of the same mission

Fig. 11 Sense and detect process. (A) shows the current process in the DT. (B) shows the real-world future process

Table 1 Examples of the use of dynamic and static tests

Table 2 Example scenario descriptions [36]
The DT resides in this component. The Unity ecosystem provides an environment for the visual design of the DT, as well as APIs for implementing custom automation. The UnityEditor and UnityTesting components are APIs that our DT interacts with. The DT component represents the graphical and automation work put in to model the environments from the scenarios.
• RAPID V&V Core Framework. This component contains the main elements of our framework. Unity provides two implementations of the XUnit framework [36], PlayModeTests and EditModeTests. The PlayModeTests implementation is designed to be used for test cases that require the execution of the UnityEngine. PlayModeTests cater for the situation where game developers are looking to verify a dynamic element of their game (like shooting at a target). The EditModeTests implementation is designed for verifying elements that interact with the UnityEditor. Typically in game development, EditModeTests would be used to verify static elements in the scene, like grounds and terrains.

Table 3 Simulation results

Table 4 Scenario results, estimation of air-to-air collision risk

Table 5 Position measurement in the field of incoming uncooperative UAV