Occlusion Model—A Geometric Sensor Modeling Approach for Virtual Testing of ADAS/AD Functions

New advanced driver assistance system/automated driving (ADAS/AD) functions have the potential to significantly enhance the safety of vehicle passengers and road users, while also enabling new transportation applications and potentially reducing CO2 emissions. To achieve the next level of driving automation, i.e., SAE Level-3, physical test drives need to be supplemented by simulations in virtual test environments. A major challenge for today’s virtual test environments is to provide a realistic representation of the vehicle’s perception system (camera, lidar, radar). Therefore, new and improved sensor models are required to perform representative virtual tests that can supplement physical test drives. In this article, we present a computationally efficient, mathematically complete, and geometrically exact generic sensor modeling approach that solves the FOV (field of view) and occlusion task. We also discuss potential extensions, such as bounding-box cropping and sensor-specific, weather-dependent FOV-reduction approaches for camera, lidar, and radar. The performance of the new modeling approach is demonstrated using camera measurements from a test campaign conducted in Hungary in 2020 plus three artificial scenarios (a multi-target scenario with an adjacent truck occluding other road users and two traffic jam situations in which the ego vehicle is either a car or a truck). These scenarios are benchmarked against existing sensor modeling approaches that only exclude objects that are outside the sensor’s maximum detection range or angle. The modeling approach presented can be used as is or provide the basis for a more complex sensor model, as it reduces the number of potentially detectable targets and therefore improves the performance of subsequent simulation steps.

Today, SAE Level-2, also defined as 'partial driving automation', is offered by several manufacturers, e.g., Cadillac's Super Cruise, Nissan's ProPILOT Assist, Tesla's Autopilot, and Volvo's Pilot Assist. The ADAS functions implemented in an SAE Level-2 vehicle can take over lateral and longitudinal vehicle motion control. However, the driver remains responsible for monitoring the environment and for constant supervision of the overall system. SAE Level-3, also defined as 'conditional driving automation', is currently undergoing testing, e.g., for the Honda SENSING Elite system and the Mercedes DRIVE PILOT. In addition to lateral and longitudinal vehicle motion control, an SAE Level-3 vehicle is fully responsible for monitoring the environment and reacting to unanticipated conditions and scenarios. This allows the driver to take his/her eyes off the road for prolonged periods. Since the monitoring responsibility is transferred from the driver to the system, the shift from SAE Level-2 to Level-3 represents potentially the largest step in driving automation. The major challenge for achieving SAE Level-3 is to develop, test, and homologate a reliable and robust environment perception system. A combination of diverse and redundant sensor modalities, i.e., camera, lidar, and radar, is considered necessary to eventually provide the capabilities required to fulfill the high perception demands of SAE Level-3 [42].
Cameras have been an important perception sensor modality in all relevant AD demonstrators since the 1990s [42] and are nowadays among the standard equipment in most vehicles. They are considered the most reliable sensor modality for object classification, traffic sign/traffic light recognition, and lane detection [70], since the passive measurement principle allows high-resolution imaging at high acquisition frequencies [68]. While direct range measurement is not possible, good range estimation can be achieved using stereo cameras and computer vision methods [71]. In addition, velocity information can be estimated based on optical flow methods [19]. Among the main drawbacks of cameras is their poor performance in adverse weather conditions and at night. Furthermore, the interpretation effort required to extract usable information from the actual measurement via computer vision is relatively large and therefore error-prone (e.g., false negatives), which has already led to several prominent accidents.
Lidar sensors provide direct range measurements at a very high angular resolution [35]. The high cost of currently available systems, which include mechanically spinning light sources, detectors, and/or mirrors, limits their implementation in today's production vehicles. However, newly emerging technologies, such as MEMS-based mirrors, optical phased arrays, vertical-cavity surface-emitting lasers, and single-photon avalanche diodes [21], [60], [63], will most likely enable commercial lidar sensors at costs low enough for implementation in standard vehicles in the next few years. While lidar is considered a robust sensor modality under adverse weather conditions, dew, dirt, or foam can lead to unwanted reflections at the sensor cover, which degrades sensor performance up to complete sensor blindness [54].
Radar sensors are among the standard equipment in today's middle- and upper-class vehicles. Automotive radar sensors apply frequency-modulated continuous-wave technology for measuring relative distance and velocity [49], plus digital beamforming for directional sensing [27]. Typical automotive radars operate at 77 GHz [50]. Furthermore, radar is considered the most robust sensor modality under adverse weather conditions. Only small disturbances have been measured under heavy rain [28], [29] or when water droplets or ice/dirt particles were present on the radome [3], [22].
In [40] it was shown that autonomous vehicles would have to be tested over millions of kilometers in order to validate their reliability, which is infeasible using real-world tests alone. Therefore, in order to develop, test, and eventually homologate a reliable and robust SAE Level-3 system, physical test drives need to be supplemented by simulations in virtual test environments [26]. Such a virtual test environment and the corresponding data flow are illustrated in Figure 1. Sensor models simulate the environment perception system of the vehicle undergoing testing. An environment simulation, e.g., aiSim [1], CARLA [20], IPG CarMaker [38], or Vires VTD [62], simulates other traffic participants, the road network, scenarios, etc. All relevant data on the environment is forwarded to the sensor model, which modifies the incoming data according to the capabilities of the perception system being tested (NB: an approach for standardizing these interfaces, called Open Simulation Interface, is currently under development [31]). The sensor model output is then forwarded to the ADAS/AD function and therefore forms the basis for decision-making within the function. The output of the ADAS/AD function (e.g., a braking/steering request) is forwarded to a vehicle dynamics model, which then simulates the actual motion of the vehicle and forwards this so-called ego motion to the environment simulation, which closes the simulation loop.
For a virtual test environment to provide simulations that can complement and eventually replace physical test drives, realistic and computationally efficient sensor models are essential. Previous work on perception sensor modeling includes both generic approaches, which allow the application to any sensor modality, and sensor-specific approaches. An extensive overview including a classification scheme into low (geometrical aspects), medium (incl. probabilistic or physical aspects), and high-fidelity models (rendering and ray tracing) is given in [53]. Examples of sensor model architectures are provided by [30], [55].
Existing low-fidelity models are typically also generic models and provide simple thresholds for filtering objects outside the sensor's field of view (FOV). Examples are [46], [59].
High-fidelity models are always sensor-specific. High-fidelity camera models typically use the rendered images provided by the environment simulation and apply synthetic image degradation to create a more realistic camera image. Examples can be found in [15], [16], [56], [69]. High-fidelity lidar models typically generate point clouds based on ray tracing or other rendering methods, e.g., [32], [51]. High-fidelity radar models, which also use ray tracing to simulate the radar wave propagation and reflection in the virtual world, are presented in [13], [14], [37], [43], [44].
Despite the multitude of existing sensor models for virtual testing of ADAS/AD functions, to our knowledge, there is no publication that provides a computationally efficient, mathematically complete, and geometrically exact approach that solves the FOV and occlusion task at the object level (i.e., input and output of the sensor model are object lists). State-of-the-art low-fidelity sensor models detect all objects inside the sensor's FOV and only disregard objects that exceed the maximum detection range or angle of the specific sensor. Occlusion within the FOV is typically not considered in sensor models at the object level. The benchmark model for the newly suggested approach is a sensor model that disregards objects outside the sensor's specified maximum detection range and angle, i.e., the sensor's FOV.
The scope of this work is to provide a computationally efficient, mathematically complete, and geometrically exact generic sensor modeling approach that solves both the FOV and the occlusion task. We also discuss several sensor-specific extensions, such as bounding-box cropping and sensor-specific, weather-dependent FOV-reduction approaches for camera, lidar, and radar. The basic occlusion modeling approach, as introduced in Section II, can be categorized as a low-fidelity model according to [53]. The modeling approach presented can be used as is, or provide the basis for a subsequent medium- or high-fidelity model, since it reduces the number of potentially detectable targets and therefore improves the performance of a subsequent simulation step.

A. STRUCTURE OF THE ARTICLE
Section II introduces the newly developed occlusion model, including the FOV filter and the occlusion filter, and describes all corresponding geometrical and mathematical approaches. Section III shows the model results in several different scenarios, where the model is applied to both real and artificially created scenarios. The occlusion model is benchmarked against a state-of-the-art FOV sensor model and also validated using measurement data from an automotive camera sensor. Section IV discusses several potential extensions to the base model, including bounding-box cropping and weather-dependent FOV-reduction approaches for camera, lidar, and radar. Section V concludes with an outlook on future work.

II. APPROACH OF THE OCCLUSION MODEL
This section addresses the modeling approach of the occlusion model, which consists of a sequential two-stage process, as shown in Figure 2. As described in the previous section, the sensor model receives its input from the environment simulation. This so-called ground truth consists of the rectangular bounding boxes (BB) of all dynamic and static objects in the scenario (e.g., cars, trucks, and pedestrians, plus houses or obstacles). From these ground-truth bounding boxes the sensor model calculates the so-called sensor-detected bounding boxes, which are the output of the sensor model. For this modeling approach, it is therefore sufficient to know the x-, y-, and z-coordinates of the BB center point, the length, height, and width of the BB, plus the orientation of the BB described by the roll, pitch, and yaw angles. The additional assumptions for the occlusion model are the following:
• all considered objects are represented as rectangular bounding boxes only;
• the ground is flat, i.e., no uphill, downhill, or tilted streets;
• the x-y plane of the sensor coordinate system is parallel to the x-y plane of the world coordinate system;
• all bounding boxes are always located on the ground, i.e., there is no distance between the bottom of the bounding box and the ground;
• the bottom of the bounding box is always parallel to the ground, i.e., the objects are not tilted;
• all bounding boxes are disjoint from each other and from the ego vehicle.
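To make this input interface concrete, the following minimal sketch shows one possible in-code representation of such a ground-truth bounding box; the class and field names are illustrative assumptions of this sketch and do not correspond to any standardized interface.

```python
from dataclasses import dataclass
import math

@dataclass
class BoundingBox:
    """Ground-truth bounding box as received from the environment simulation."""
    x: float       # center x-coordinate [m]
    y: float       # center y-coordinate [m]
    z: float       # center z-coordinate [m]
    length: float  # extent along the object's x-axis [m]
    width: float   # extent along the object's y-axis [m]
    height: float  # extent along the object's z-axis [m]
    roll: float    # orientation [rad]
    pitch: float   # orientation [rad]
    yaw: float     # orientation [rad]

    def ground_vertices(self):
        """Return the four ground-plane (bird's-eye-view) vertices in perimeter order.

        Under the model assumptions (flat ground, untilted boxes) only the
        yaw angle affects the bird's-eye-view footprint.
        """
        c, s = math.cos(self.yaw), math.sin(self.yaw)
        half_l, half_w = self.length / 2.0, self.width / 2.0
        corners = [(+half_l, +half_w), (+half_l, -half_w),
                   (-half_l, -half_w), (-half_l, +half_w)]
        return [(self.x + c * dx - s * dy, self.y + s * dx + c * dy)
                for dx, dy in corners]
```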

A. FIELD OF VIEW FILTER
The field of view (FOV) filter is the first stage of the occlusion model. As shown in Figure 6, it consists of two subsequent parts, the range check and the angle check, which are both described in the following. The output of the FOV filter constitutes the so-called Inside FOV bounding boxes, i.e., all bounding boxes that are inside the FOV. The FOV filter is based only on the bird's-eye view, i.e., the x-y plane, which means the height of the objects is not considered. For that reason, the sensor coordinate system in Figure 3 is utilized. This means that the 3D FOV spherical segment is reduced to a 2D circular segment, where only the range r_FOV and the horizontal angle φ_h,FOV of the FOV are relevant. Nevertheless, the vertical angle φ_v,FOV is also required for the definition of the FOV, as it will be relevant for later steps. For the FOV filter, every object can be considered individually, and thus the subsequent range and angle checks are applied analogously to every object in the scene.

1) RANGE CHECK
For the range check, the closest point of the bounding box to the sensor is computed. If the distance between the sensor and the closest point is greater than r_FOV, the bounding box is outside the FOV and will not be considered any further. As depicted in Figure 5, two cases have to be considered: either the closest point to the sensor is a vertex, or the closest point is a non-vertex point somewhere on the bounding box. The closest point to the sensor is therefore computed with the following approach:
1) Calculate the distance to all four vertices of the bounding box;
2) Ignore the vertex with the furthest distance to the sensor;
3) Generate the two valid lines between the three remaining vertices, representing the part of the bounding box facing the sensor;
4) Calculate the base point of the perpendicular line between the sensor and both previously calculated lines;
5) For each of the two base points, check whether it lies between the two corresponding vertices, and if not, replace the base point with the closer vertex on this line;
6) From the three remaining vertices and the two generated base points, the point with the shortest distance to the sensor is the closest point.
While the range check can eliminate bounding boxes which are outside the FOV, it cannot guarantee that bounding boxes are inside the FOV, since the horizontal angle φ_h,FOV of the FOV must likewise be considered.
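A minimal sketch of this closest-point computation is given below, assuming the sensor at the origin of the bird's-eye-view plane and reusing the hypothetical ground_vertices() representation from the sketch above.

```python
import math

def closest_point_to_sensor(ground_vertices):
    """Closest point (vertex or edge point) of a bird's-eye-view rectangle to the
    sensor, which is assumed to sit at the origin (0, 0).

    `ground_vertices` must list the four corners in perimeter order.
    """
    # 1) Distances from the sensor to all four vertices.
    dists = [math.hypot(x, y) for x, y in ground_vertices]

    # 2) Ignore the vertex that is furthest away from the sensor.
    furthest = dists.index(max(dists))

    # 3) The two remaining rectangle edges form the part of the box facing the sensor.
    edges = [(i, (i + 1) % 4) for i in range(4)
             if i != furthest and (i + 1) % 4 != furthest]

    candidates = [ground_vertices[i] for i in range(4) if i != furthest]
    for i, j in edges:
        (x1, y1), (x2, y2) = ground_vertices[i], ground_vertices[j]
        ex, ey = x2 - x1, y2 - y1
        # 4) Base point of the perpendicular from the sensor onto the edge line.
        t = -(x1 * ex + y1 * ey) / (ex * ex + ey * ey)
        # 5) Keep the base point only if it lies between the two vertices;
        #    otherwise the closer vertex (already a candidate) covers this edge.
        if 0.0 <= t <= 1.0:
            candidates.append((x1 + t * ex, y1 + t * ey))

    # 6) The candidate with the smallest distance is the closest point.
    return min(candidates, key=lambda p: math.hypot(p[0], p[1]))

# Usage: a box fails the range check if
#   math.hypot(*closest_point_to_sensor(vertices)) > r_FOV
```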

2) ANGLE CHECK
The angle check determines whether a bounding box is inside or outside the FOV. The so-called angle interval [φ_low, φ_up] of every bounding box is defined as

$$[\varphi_{low}, \varphi_{up}] = \Big[\min_{i} \varphi_{V_i},\; \max_{i} \varphi_{V_i}\Big], \quad i = 1, \dots, 4,$$

where V_i denotes the i-th vertex of the bounding box and φ_{V_i} denotes the polar angle of the vertex V_i transformed into the sensor coordinate system. It has to be mentioned that in the case of an object crossing the 0° line, the computation has to be adjusted by splitting the bounding box into two separate angle intervals. For example, if a bounding box has φ_low = 10° and φ_up = 330°, the two corresponding angle intervals are [0°, 10°] and [330°, 360°]. The angle interval or intervals are then checked for any overlap with the FOV angle interval. If there is no overlap, the bounding box is considered outside the FOV; otherwise, the bounding box is inside the FOV and potentially detected by the sensor. Based on the range and angle check, every bounding box with at least one point (vertex or non-vertex) inside the FOV is considered to be inside the FOV. For possible extensions and features which allow more realistic modeling, see the discussion in Section IV.
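The interval construction with wrap-around handling and the overlap test could be sketched as follows; the 180° heuristic used to detect the wrap-around is an assumption of this sketch, not a statement about the original implementation.

```python
import math

def angle_intervals(ground_vertices):
    """Angle interval(s) [phi_low, phi_up] of a bounding box, in degrees [0, 360).

    If the box straddles the 0-degree line, two sub-intervals are returned,
    e.g. phi_low = 10 deg and phi_up = 330 deg become [0, 10] and [330, 360].
    """
    angles = [math.degrees(math.atan2(y, x)) % 360.0 for x, y in ground_vertices]
    phi_low, phi_up = min(angles), max(angles)
    if phi_up - phi_low <= 180.0:          # no wrap-around (heuristic)
        return [(phi_low, phi_up)]
    # Wrap-around: split at the 0/360 degree line.
    below = [a for a in angles if a < 180.0]
    above = [a for a in angles if a >= 180.0]
    return [(0.0, max(below)), (min(above), 360.0)]

def overlaps(interval_a, interval_b):
    """True if two closed angle intervals overlap."""
    (a_lo, a_hi), (b_lo, b_hi) = interval_a, interval_b
    return a_lo <= b_hi and b_lo <= a_hi

def inside_fov_angle(ground_vertices, fov_intervals):
    """Angle check: the box passes if any of its intervals overlaps the FOV.

    Example: a 160-degree FOV centered on the +x axis could be expressed as
    fov_intervals = [(0.0, 80.0), (280.0, 360.0)].
    """
    return any(overlaps(box_iv, fov_iv)
               for box_iv in angle_intervals(ground_vertices)
               for fov_iv in fov_intervals)
```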

B. OCCLUSION FILTER
After the FOV filter, the occlusion filter is the second stage of the occlusion model, as depicted in Figure 7. It consists of four steps: (1) the generation of the occlusion matrix, (2) the calculation of the projected polygons, (3) the calculation of the visibility, and (4) the occlusion threshold check. All four steps are explained and discussed in this section.

1) GENERATE OCCLUSION MATRIX
The occlusion matrix OccMat is a Boolean matrix, i.e., it contains only 0s and 1s. It encodes which object is occluding which other object and vice versa. Every row and column stands for one object: in the i-th row, all objects occluded by the i-th object can be read off, while the i-th column lists all objects occluding the i-th object. Figure 8 shows an example consisting of 5 objects. One can immediately see in the first row that objects 2, 3, and 5 are occluded by object 1, and in the first column that object 1, in turn, is not occluded by any other object. As one can see in the left part of the figure, there is no difference in the occlusion matrix between an object that is fully occluded and one that is only partly occluded; every instance of overlap is considered an occlusion.
Determining whether an object occludes another object is not a trivial task. The complete approach for calculating the occlusion matrix is given by the algorithm outlined in Figure 9, which is explained in the following. The two for loops (lines 1 and 4) are required since every combination of Obj_Tar and Obj_Test has to be considered. The first if statement (lines 5 to 9) exploits the fact that the symmetric combinations do not have to be evaluated twice, since two objects cannot occlude each other simultaneously, i.e., at least one of the two corresponding entries of the occlusion matrix must be 0. Next, the angle intervals of Obj_Tar and Obj_Test are compared. If there is no overlapping angle interval, the corresponding entries in the occlusion matrix OccMat are set to 0 (line 23f). If there is an overlapping angle interval, the number of vertices of the test object that lie inside the angle interval of the target object is computed as nr_vert. Based on nr_vert, three different cases can be identified which must be treated accordingly. First, if all vertices are inside the angle interval, i.e., nr_vert = 4, a further differentiation is made depending on whether the test object Obj_Test is inside MaxBox(Obj_Tar) or not (line 12). If so, the so-called MaxBox case has to be applied; if not, the so-called ClosestDistance case is applied. Second, if no vertex of Obj_Test is inside the angle interval of Obj_Tar, i.e., nr_vert = 0, the ClosestDistance case has to be applied as well. Third, if none of the above is true (nr_vert ∈ {1, 2, 3}), the so-called RayCast case is applied (line 20).
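Since the full algorithm is only available as the listing in Figure 9, the following skeleton merely sketches the case dispatch described above. It reuses the angle_intervals and overlaps helpers from the angle-check sketch, the three case handlers are left as placeholders, and names and details are assumptions of this sketch; the line numbers of the original listing may differ.

```python
import math
import numpy as np

def vertex_angles(vertices):
    """Polar angles of the bird's-eye-view vertices in degrees [0, 360)."""
    return [math.degrees(math.atan2(y, x)) % 360.0 for x, y in vertices]

def inside_maxbox(test_vertices, target_vertices):
    """True if all test vertices lie inside the axis-aligned MaxBox of the target."""
    xs = [x for x, _ in target_vertices]
    ys = [y for _, y in target_vertices]
    return all(min(xs) <= x <= max(xs) and min(ys) <= y <= max(ys)
               for x, y in test_vertices)

# Placeholders for the three cases: each should return True if the target
# object occludes the test object and False otherwise.
def closest_distance_case(target_vertices, test_vertices):
    raise NotImplementedError  # compare closest distances to the sensor

def maxbox_case(target_vertices, test_vertices):
    raise NotImplementedError  # visible-triangle test via barycentric coordinates

def raycast_case(target_vertices, test_vertices):
    raise NotImplementedError  # intersect both boxes with the angle-interval rays

def generate_occlusion_matrix(boxes):
    """Sketch of the occlusion-matrix generation dispatch (cf. Figure 9).

    `boxes` is a list of bird's-eye-view vertex lists of all objects inside the
    FOV; OccMat[i, j] == 1 means that object i occludes object j.
    """
    n = len(boxes)
    occ_mat = np.zeros((n, n), dtype=int)
    for i, obj_tar in enumerate(boxes):            # target object
        for j, obj_test in enumerate(boxes):       # test object
            if j <= i:
                continue  # symmetric combinations are not evaluated twice
            tar_ivs = angle_intervals(obj_tar)     # from the angle-check sketch
            test_ivs = angle_intervals(obj_test)
            if not any(overlaps(a, b) for a in tar_ivs for b in test_ivs):
                continue  # no overlapping angle interval: entries stay 0
            nr_vert = sum(any(lo <= a <= hi for lo, hi in tar_ivs)
                          for a in vertex_angles(obj_test))
            if nr_vert == 4 and inside_maxbox(obj_test, obj_tar):
                target_occludes = maxbox_case(obj_tar, obj_test)            # MaxBox case
            elif nr_vert in (0, 4):
                target_occludes = closest_distance_case(obj_tar, obj_test)  # ClosestDistance case
            else:  # nr_vert in {1, 2, 3}
                target_occludes = raycast_case(obj_tar, obj_test)           # RayCast case
            occ_mat[i, j] = int(target_occludes)
            occ_mat[j, i] = int(not target_occludes)
    return occ_mat
```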
All three cases which can occur while generating the occlusion matrix are illustrated in Figure 10, where the red lines denote the angle interval of the orange target object. In this scenario with the orange object as the target object, the RayCast case is triggered by the blue test object, the purple and green test objects trigger the ClosestDistance case, and the yellow test object triggers the MaxBox case. How these three cases are treated is explained and discussed in the following.
1) ClosestDistance case: In the ClosestDistance case, the computation of the closest point of each object to the sensor is necessary, as the question of whether Obj_Tar occludes Obj_Test or vice versa can be answered by the distance between the sensor and the closest point of each object. The orange line in Figure 11 represents the distance of the target object, the violet line represents the closest distance for a test object with nr_vert = 0, and the green line represents the closest distance of a test object with nr_vert = 4. If the distance of the target object is less than the distance of the test object, the target object is in front of the test object and therefore occludes it. In terms of Figure 9, this results in: OccMat(j, i) = 0.
It should be stated that all objects are assumed to not intersect with each other, and therefore the closest distance for two objects cannot be equal in this case.
2) MaxBox case: In order to treat the MaxBox case, the so-called MaxBox of the target object, MaxBox(Obj_Tar), first has to be introduced. It is defined as the rectangle described by the minimum and maximum of both the x- and y-coordinates of the target object's bounding box vertices; for an illustration, see the dashed rectangle in Figure 12.
Again, it has to be determined whether Obj_Test is in front of or behind Obj_Tar. Since in this case the test object is completely inside the MaxBox of the target object, this can be done with the following approach: In the first step, the visible edges of Obj_Tar are determined, and based on those visible edges, triangles with the appropriate MaxBox edges are formed. This step is also depicted in Figure 12, where the blue lines indicate the visible edges and the gray areas indicate the two triangles formed between the boundary of the MaxBox and the visible edges. Since these triangles are visible from the sensor position, in the second step it is checked whether Obj_Test is inside one of these visible triangles. This can be easily implemented and efficiently computed using barycentric coordinates. Finally, if Obj_Test is inside one of the visible triangles, this results in: OccMat(j, i) = 1.
It should be noted that, depending on the relative heading of Obj_Tar, there can be one or two visible edges, resulting in one or two triangles requiring inspection.
3) RayCast case: In the RayCast case, the question of whether Obj_Tar occludes Obj_Test or vice versa is solved by intersecting both objects with the two rays representing the upper and lower edges of the angle interval. Since the RayCast case only occurs when at least one vertex of Obj_Test is inside and at least one vertex is outside of the angle interval, it is guaranteed that at least one intersection point exists between one of the rays and Obj_Test. For an illustration, see Figure 13, where the orange object represents Obj_Tar and the blue object represents Obj_Test. There, the red dot illustrates the intersection of the upper ray with Obj_Tar, while the blue dot shows the intersection of the same ray with Obj_Test. When the intersection with Obj_Tar is closer to the sensor than the intersection with Obj_Test, the target object occludes the test object, leading to: OccMat(j, i) = 0.
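The point-in-triangle test needed for the MaxBox case can, for example, be implemented with barycentric coordinates as in the following sketch (function and variable names are illustrative).

```python
def point_in_triangle(p, a, b, c, eps=1e-12):
    """Check whether point p lies inside triangle (a, b, c) via barycentric coordinates.

    All arguments are (x, y) tuples in the bird's-eye view.
    """
    # Express p - a in the basis spanned by (c - a) and (b - a).
    v0 = (c[0] - a[0], c[1] - a[1])
    v1 = (b[0] - a[0], b[1] - a[1])
    v2 = (p[0] - a[0], p[1] - a[1])

    dot00 = v0[0] * v0[0] + v0[1] * v0[1]
    dot01 = v0[0] * v1[0] + v0[1] * v1[1]
    dot02 = v0[0] * v2[0] + v0[1] * v2[1]
    dot11 = v1[0] * v1[0] + v1[1] * v1[1]
    dot12 = v1[0] * v2[0] + v1[1] * v2[1]

    denom = dot00 * dot11 - dot01 * dot01
    if abs(denom) < eps:          # degenerate triangle
        return False
    u = (dot11 * dot02 - dot01 * dot12) / denom
    v = (dot00 * dot12 - dot01 * dot02) / denom
    # Inside (or on the border) if both barycentric coordinates are non-negative
    # and their sum does not exceed 1.
    return u >= -eps and v >= -eps and u + v <= 1.0 + eps

# MaxBox case: check whether Obj_Test lies inside one of the visible triangles
# formed between the target's visible edges and the MaxBox boundary
# (e.g., by testing its ground vertices with point_in_triangle).
```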

2) CALCULATE PROJECTED POLYGONS
In order to calculate the so-called projected polygons, every bounding box inside the FOV is projected onto the so-called normalized surface, which is a cylinder surface with a radius of 1 m around the sensor origin, with the cylinder axis aligned to the z-axis of the sensor coordinate system. To this end, the height of the bounding boxes must also be taken into account. As illustrated in Figure 14, every vertex of the bird's-eye view of a bounding box is placed on the φ-axis of the normalized surface plot according to its angle. The dashed line in the lower plot denotes the mounting height of the sensor. How to calculate the length and position of the vertical line for each of the four vertices of a bounding box is explained in the following. Since the bounding boxes are projected onto the normalized surface, the height of a bounding box must be scaled according to its distance to the sensor. In this two-stage process, it is first checked for each of the four ground vertices independently whether the vertical edge of the bounding box associated with a given vertex is intersected by the vertical FOV angle. The projected upper and lower heights of the vertical edge associated with vertex V_i are given by

$$\tilde{H}^{up}_{V_i} = \frac{h_i - h_S}{r_i}, \qquad \tilde{H}^{down}_{V_i} = \frac{0 - h_S}{r_i} = -\frac{h_S}{r_i},$$

where h_i denotes the object height, h_S the sensor mounting height, and r_i the distance to the sensor in meters. It should be noted that the second equation arises from the assumption that the street is flat and every bounding box is located directly on the ground. Depending on the object height h_i and the sensor mounting height h_S, it is possible that $\tilde{H}^{up}_{V_i} < 0$ applies, as illustrated in Figure 16.
As illustrated in Figure 14, the normalized edge associated with a given vertex can be plotted by drawing a vertical line at position φ_i from $\tilde{H}^{down}_{V_i}$ to $\tilde{H}^{up}_{V_i}$, measured relative to the sensor mounting height h_S (the dashed line). Repeating this process for all four bottom vertices of the bounding box and taking every endpoint of the four vertical edges as a new vertex leads to the projected polygon with 8 vertices.
The contour of this polygon is the projection of the bounding box onto the convex z̃-φ surface of the normalized cylinder.
Note that the four ground vertices and the height of the bounding box are sufficient for describing all eight vertices of the bounding box since, by assumption, every bounding box is parallel to the flat ground and to the x-y plane of the sensor coordinate system.
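A minimal sketch of this projection is given below, under the reconstruction above in which the projected heights are measured relative to the sensor mounting height; the sorting by angle assumes the box does not straddle the 0° line, and all names are illustrative.

```python
import math

def projected_polygon(ground_vertices, obj_height, sensor_height):
    """Project a bounding box onto the normalized cylinder surface (radius 1 m).

    Returns the 8-vertex contour as a list of (phi, z) points, with z measured
    relative to the sensor mounting height.
    """
    lower, upper = [], []
    for x, y in ground_vertices:
        r_i = math.hypot(x, y)                          # distance to the sensor [m]
        phi_i = math.degrees(math.atan2(y, x)) % 360.0
        h_up = (obj_height - sensor_height) / r_i       # top of the box, normalized
        h_down = (0.0 - sensor_height) / r_i            # bottom of the box (on the ground)
        lower.append((phi_i, h_down))
        upper.append((phi_i, h_up))
    # Sort the four vertical edges by angle (assuming no wrap across 0 degrees) and
    # walk along the lower endpoints and back along the upper endpoints to obtain
    # the contour of the projected polygon.
    order = sorted(range(len(lower)), key=lambda k: lower[k][0])
    lower = [lower[k] for k in order]
    upper = [upper[k] for k in order]
    return lower + upper[::-1]
```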

3) CALCULATE VISIBILITY
The calculation of the visibility combines the occlusion matrix and the projected polygons of every bounding box and computes the visibility of every bounding box accordingly. The visibility of every bounding box is computed by gathering all occluders of the polygon Pol_i of object i and joining them as JoinedOcc_Pol_i. As a next step, the visibility of Pol_i is determined as

$$Vis_i = 1 - \frac{\mathrm{Area}\!\left(Pol_i \cap JoinedOcc_{Pol_i}\right)}{\mathrm{Area}(Pol_i)},$$

where Area(Pol_i) denotes the area of the polygon Pol_i and ∩ denotes the intersection. In Figure 17, the calculation of the visibility is illustrated, where the calculation of Pol_i ∩ JoinedOcc_Pol_i leads to the yellow polygon. For objects with no occluders, the visibility is 1, as shown exemplarily by Pol_3 in Figure 17.
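With the projected polygons at hand, the visibility can be computed with a polygon library such as Shapely, as in the following sketch; the function and variable names are illustrative.

```python
from shapely.geometry import Polygon
from shapely.ops import unary_union

def visibility(i, projected_polygons, occ_mat):
    """Visibility of object i given the projected polygons of all objects and the
    occlusion matrix (occ_mat[k][i] == 1 means object k occludes object i).
    """
    pol_i = Polygon(projected_polygons[i])
    occluders = [Polygon(projected_polygons[k])
                 for k in range(len(projected_polygons))
                 if k != i and occ_mat[k][i]]
    if not occluders:
        return 1.0  # objects without occluders are fully visible
    joined_occ = unary_union(occluders)                 # JoinedOcc_Pol_i
    hidden_area = pol_i.intersection(joined_occ).area
    return 1.0 - hidden_area / pol_i.area

# Occlusion threshold check (Sec. II-B4):
#   detected = visibility(i, projected_polygons, occ_mat) >= occ_th
```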

4) OCCLUSION THRESHOLD CHECK
In the last step of the occlusion filter, a decision has to be made whether a bounding box is regarded as visible or not. To that end, the occlusion threshold Occ_th, a user-defined value in [0, 1], is used. This threshold defines the limit above which an object is characterized as detected by the sensor model. As a result, if Vis_i < Occ_th, the object is undetected, and if Vis_i ≥ Occ_th, it is detected by the sensor model.

III. RESULTS
In the following, the results of benchmarking the occlusion model against a state-of-the-art FOV model and against measurement data are presented and discussed. The scenarios used are based on real measurement data as well as on artificially created scenarios.

A. SCENARIO FROM MEASUREMENT DATA
For the validation of the occlusion model presented in this article, camera measurement data from a test campaign conducted in Hungary in 2020 with different international partners was chosen. For more details about the measurement campaign and the sensor setup, see [24, Ch. 3]. An overtaking maneuver was chosen for showing the results of the occlusion model and comparing it to the state-of-the-art approach of taking only the field of view into account. The scene is depicted in Figure 18, where the ego vehicle is shown as the red bounding box and the other four vehicles are depicted as blue bounding boxes. The positions of the vehicles, and therefore of the bounding boxes, were measured by GPS receivers mounted on every vehicle. In Figure 19, the input of the FOV model is shown. As is state of the art, the FOV model only determines whether an object is inside or outside the field of view of the sensor; occlusion effects are thus ignored. Consequently, the result of the FOV model, depicted in Figure 20, is identical to the input of the sensor model.
Taking the scene shown in Figure 19 as input for the occlusion model presented in this article leads to the results shown in Figure 21. It can be seen that only the two vehicles in front are visible to the sensor. Comparing this to the results of the FOV model depicted in Figure 20 raises the question of which result, and therefore which model, better matches the behavior of a real sensor. To this end, the measurements of the camera sensor mounted on the ego vehicle are depicted in Figure 22, where it can be seen that only two vehicles are visible. Comparing this to the results of the occlusion model (Figure 21) and the FOV model (Figure 20) clearly shows that the results of the occlusion model are much closer to reality than those of the FOV model.
Upon closer inspection of the occlusion model, the projected polygons, and the calculated visibility of each object in Figure 23, one can see that the two rear vehicles are only partly occluded. Nevertheless, these objects are not detected by the camera sensor, which motivates the introduction of the occlusion threshold Occ_th in Section II-B4. For both vehicles, the visibility is below 0.2, which motivates the choice Occ_th = 0.2.

B. ARTIFICIAL SCENARIOS
To show the effect of the occlusion model, three different artificial scenarios have been chosen: (1) a multi-vehicle case with an adjacent truck to demonstrate occlusion effects, (2) a traffic jam situation where the ego vehicle is a car, and (3) the analogous traffic jam situation where the ego vehicle is a truck.
(1) In the multi-vehicle case, 19 cars with bounding boxes of varying size are randomly placed, and object 1 represents a large truck located close to the ego vehicle. A bird's-eye view of the scenario can be seen in the left part of Figure 25, and the scenario is displayed as a 3D plot in Figure 24. The sensor is parametrized with a FOV range of r_FOV = 20 m, a horizontal angle of φ_h,FOV = 360°, and a vertical angle of φ_v,FOV = 20°. The right part of Figure 25 depicts the output of the occlusion model, where it can be seen that objects 2, 3, 4, 7, 9, 10, and 19 are blocked by the large truck and are thus not visible to the sensor. This effect can also be seen in the occlusion matrix (see Section II-B1 for more details) and in the corresponding visibilities in the left part of Figure 26. The right part of Figure 26 depicts the projected polygons of all sensor-detected objects (see Section II-B2 for more details about the projected polygons). In this case, objects 1, 11, and 16 appear largest on the normalized surface since they are close to the sensor. The occlusion threshold is set to Occ_th = 0.5, and as a result, an object with a visibility below 0.5 is not detected by the sensor model. This can be seen, e.g., by the fact that object 12 with Vis_12 = 0.527 is detected, whilst object 4 with Vis_4 = 0.014 is not detected by the sensor model.
(2) In this scenario, a typical traffic jam situation on the highway is recreated, where the sensor is mounted at a height of 1.5 m, representing a classic passenger car. This scenario is illustrated in 3D in the left part of Figure 27. The FOV for this scenario is parametrized with r_FOV = 15 m, φ_h,FOV = 160°, and φ_v,FOV = 20°. In the left part of Figure 28, the input of the sensor model is depicted, where 3 cars and 1 large truck are visible, representing a classic traffic jam situation on two lanes with the ego vehicle driving in the right lane. The right part of the figure shows the results, i.e., the sensor-detected objects of the sensor model. It can be seen that objects 2 and 4 are not detected by the sensor. The reason for this can be found in the left part of Figure 29, as the visibilities of objects 2 and 4 (Vis_2 = 0.186 and Vis_4 = 0.048) are below the threshold of Occ_th = 0.2. In the right part of this figure, one sees that there is hardly any space between the projected polygons of objects 1 and 3, and therefore object 4 is nearly completely occluded.
(3) This scenario differs from the previous one only insofar as the sensor mounting height is increased to 3.5 m, representing the ego vehicle as a truck, as illustrated in the right part of Figure 27. The effect of the sensor mounting height can be seen in Figure 30, where all four objects are detected. Comparing the visibilities of objects 2 and 4 in Figure 31 (Vis_2 = 0.575 and Vis_4 = 0.679) with those from the previous scenario, one sees a significant increase in visibility, resulting in objects 2 and 4 being detected by the sensor model. In the right part of Figure 31, the projected polygons of the detected objects are plotted. Again, one sees that due to the increased mounting height of the sensor, a direct line of sight to objects 2 and 4 now exists (above objects 1 and 3), and thus objects 2 and 4 are detected. Comparing the projected polygons of this scenario with those from the previous scenario, one also notices that their shape and position are influenced by the mounting position of the sensor. The comparison of scenarios 2 and 3 shows that the influence of the mounting position of a sensor is correctly captured by the occlusion model presented.

IV. DISCUSSION
This paper introduces a computationally efficient, mathematically complete, and geometrically exact generic sensor modeling approach for solving the FOV (field of view) and occlusion task. The approach presented is tested against real camera data, but it is also applicable to other sensor modalities such as lidar and radar. Lidar systems operate in a similar domain of the electromagnetic spectrum as cameras and also rely on a direct line of sight between sensor and target. We therefore expect this model to be equally useful for modeling lidar sensors that operate at the object level. While multipath reflection is of minor relevance for lidar and camera, it is essential for radar modeling. As shown by [48], targets that are completely occluded by one or even several other vehicles might still be detectable by an automotive radar via multipath reflection. In the case of radar modeling, we therefore recommend using the presented occlusion parameters not as strict detection thresholds, but rather as additional input factors for a subsequent radar sensor model that builds on measurement data. The modeling approach presented can either be used as is or provide the basis for additional subsequent modeling steps. Since the occlusion model reduces the number of potentially detectable targets, it improves the performance of any subsequent simulation step. In the following, we briefly discuss a few potential additions that might bring the model behavior closer to that of real-world sensors, namely bounding-box cropping and weather-dependent FOV-reduction approaches for camera, lidar, and radar.

A. BOUNDING BOX CROPPING
The bounding box cropping feature for the occlusion model could lead to a more realistic representation of the sensor-detected bounding boxes. Depending on the sensor model's processing pipeline, bounding boxes that are only partly inside the FOV could be cropped, i.e., the sensor-detected bounding box is made smaller than the ground-truth bounding box. In Figure 32, the effect of this bounding box cropping is shown for the traffic jam scenario from the previous section. There, the difference between the ground-truth bounding box of object 2, the green box, and the cropped sensor-detected bounding box, which is shortened accordingly, can be seen. If both yellow lines intersect with the FOV, as is the case for object 3, then both sides of the bounding box are shortened. It can be the case that one side is directly perpendicular to the sensor, as seen, for example, for objects 1 and 2. However, this is a special case where the cropping is obvious and thus unproblematic. Based on this approach, it is possible to crop the bounding box so that every visible part in the FOV is included, while the assumption that the bounding box is a rectangle, or a cuboid in 3D, still holds.
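One possible realization of such cropping in the bird's-eye view, sketched here with Shapely, is to intersect the ground-truth footprint with a polygonal approximation of the FOV sector and take the minimum rotated rectangle of the visible part; this is an illustrative sketch, not the exact procedure used for Figure 32.

```python
import math
from shapely.geometry import Polygon

def fov_wedge(r_fov, phi_min_deg, phi_max_deg, steps=64):
    """Polygonal approximation of the 2D FOV circular sector around the sensor origin."""
    pts = [(0.0, 0.0)]
    for k in range(steps + 1):
        phi = math.radians(phi_min_deg + k * (phi_max_deg - phi_min_deg) / steps)
        pts.append((r_fov * math.cos(phi), r_fov * math.sin(phi)))
    return Polygon(pts)

def crop_bounding_box(ground_vertices, r_fov, phi_min_deg, phi_max_deg):
    """Crop a ground-truth footprint to the part lying inside the FOV.

    Returns the minimum rotated rectangle enclosing the visible part, so that the
    'bounding box is a rectangle' assumption still holds for the cropped box.
    """
    visible = Polygon(ground_vertices).intersection(
        fov_wedge(r_fov, phi_min_deg, phi_max_deg))
    if visible.is_empty:
        return None
    return visible.minimum_rotated_rectangle
```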

B. WEATHER-DEPENDENT FOV REDUCTION
1) CAMERA
Cameras operate in the visible (or near-infrared) part of the electromagnetic spectrum, and their performance degradation due to adverse weather can thus be compared to that of human vision. Hence, it makes sense to define the FOV reduction via the meteorological visibility and the camera resolution. Meteorological visibility is defined as the distance at which a black object of reasonable size is still well distinguishable. Although this definition is slightly vague, it is a well-established term from classical meteorology and thus a frequently collected parameter. Under the assumption that, in the automotive context, the angular resolution and light sensitivity of a modern camera are not the restricting factors, the following simple relation can be expected to restrict the camera's FOV under adverse weather conditions:

$$r_{FOV,camera} \leq V_{met},$$

where V_met denotes the meteorological visibility.

2) LIDAR

Although the wavelength of lidar devices is often close to the visible range, it is not possible to estimate the FOV simply via the meteorological visibility. Since lidar is an active sensor, laser power, propagation, and atmospheric attenuation have to be taken into account for each device individually. The general lidar equation is typically written as [65]

$$P_r(R) = P_0 \, \frac{c\,\tau}{2} \, \frac{A}{R^2} \, \eta \, O(R) \, \beta(R) \, \exp\!\left(-2\int_0^R \alpha(r)\,\mathrm{d}r\right),$$

with P_r [W] being the power received by the lidar. Moreover, P_0 · τ [J] is the laser pulse energy, with τ being the pulse length and c the speed of light; the factor 1/2 comes from an apparent "folding" of the laser pulse. O(R) is the overlap function of the laser beam and the receiver field of view, with R being the distance to the target (see [65] for more details). A is the effective receiver area and η is the overall system efficiency. β is the backscatter coefficient and α is the extinction coefficient of a given atmosphere. As stated in [65], "the quadratic decrease of the signal intensity with distance is due to the fact that the receiver telescope area makes up a part of a sphere's surface with radius R that encloses the scattering volume." The drawback of this equation in practice is that the internal lidar efficiencies and geometries are usually not supplied by the manufacturer and are thus not known to the user. Reference [25] proposed an interesting approach in which the maximum distance from which detectable returns can be expected is estimated from a few simple lidar parameters that are usually supplied for automotive lidar devices. For this purpose, the general lidar equation can be simplified in two ways. First, the (foggy, rainy) atmosphere is considered a homogeneous, uniformly scattering medium, so that the integral over α reduces to a constant. Second, all sensor-related parameters are combined into one constant C_s; dividing by that constant results in the relative sensor power [25].
The overlap function O(R) can be assumed to be equal to 1 for objects at reasonable distances. This leads to the simplified form of the general lidar equation:

$$P_{rel}(R) = \frac{P_r(R)}{C_s} = \frac{\beta}{R^2} \, e^{-2\alpha R}.$$

The remaining two coefficients α and β have to be investigated in more detail. Extinction originates from the scattering and absorption of light by molecules and particles; the extinction coefficient is thus the sum of four components. It is integrated over the path from the lidar to the target or, for the simplified case of a homogeneous atmosphere, multiplied by the distance. Meanwhile, the backscatter coefficient comprises molecular scattering (mainly from nitrogen and oxygen molecules) and scattering from aerosols. The particulate scattering from aerosols is highly variable in the atmosphere on all temporal and spatial scales. Those particles include tiny air-pollution particles (liquid and solid), larger mineral-dust and sea-salt particles, pollen and other biogenic material as well as, importantly, the comparably large hydrometeors [65].
In clear conditions and for the comparably small distances of automotive lidars, it can be assumed that the backscatter coefficient is largely determined by the target and that α ≈ 0. In addition, most lidar specification sheets list the maximum viewing distance z_max for a 90% reflective Lambertian target, i.e., for β = 0.9/π. The minimum detectable relative power under clear conditions can thus be given by:

$$P^{min}_{rel} = \frac{0.9}{\pi \, z_{max}^2}.$$

When considering rainy or foggy conditions, the contribution of molecules and aerosols to the total extinction can be neglected, as it is two to three orders of magnitude smaller than the contribution from water droplets [6], [61]. As Lewandowski puts it, "the extinction from raindrops is a product of the scattering cross-section and the raindrop size distribution integrated over all diameters" [4], [18]. The rain extinction can therefore be approximated by a power law in the rain rate,

$$\alpha \approx a \cdot RR^{\,b},$$

with RR being the rain rate and a, b empirical coefficients. Inserting this into the simplified lidar equation and equating P_rel(z) with P^min_rel (the constant C_s cancels out) yields the maximum distance from which lidar returns can be expected to be detected at a certain rain rate. The lidar's FOV range must be below or equal to this maximum distance:

$$r_{FOV,lidar} \leq z_{max}.$$
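Under these assumptions, the weather-reduced maximum range can be obtained numerically from the data-sheet value z_max and a power-law parameterization of the rain extinction, as in the following sketch; the coefficients a and b (and the example values in the usage comment) are purely illustrative placeholders and not taken from the cited references.

```python
import math

def lidar_max_range(z_max_clear, rain_rate, a, b, beta=0.9 / math.pi):
    """Maximum range at which lidar returns can still be detected in rain.

    Solves  beta / z**2 * exp(-2 * alpha * z) = beta / z_max_clear**2  for z,
    with the rain extinction modeled as the power law alpha = a * RR**b.
    The coefficients a and b are illustrative placeholders that must be fitted to data.
    """
    alpha = a * rain_rate ** b
    p_min = beta / z_max_clear ** 2          # minimum detectable relative power

    def p_rel(z):
        return beta / z ** 2 * math.exp(-2.0 * alpha * z)

    # p_rel is monotonically decreasing in z, so a simple bisection suffices.
    lo, hi = 1e-3, z_max_clear
    if p_rel(hi) >= p_min:                   # no reduction within the clear-weather range
        return z_max_clear
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if p_rel(mid) >= p_min:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Usage with illustrative coefficients (not from the references):
#   r_fov_lidar = min(r_fov, lidar_max_range(200.0, 10.0, a=0.01, b=0.6))
```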

3) RADAR
In general, radar sensors are considered the most robust sensor modality under adverse weather conditions; only small disturbances have been measured under heavy rain [28], [29] or when water droplets or ice or dirt particles were present on the radome [3], [22]. Although the methods introduced above for lidar FOV reduction under adverse weather conditions could, in principle, also be applied with small modifications to radar FOV reduction, performance evaluations of automotive radar sensors under adverse weather conditions, e.g., [48], show very little to no influence of weather conditions on the radar's FOV. For the sake of model simplicity, we therefore suggest omitting a weather-dependent radar FOV reduction and keeping the radar FOV specifications independent of weather conditions.

V. OUTLOOK
In future work, we will focus on including subsequent modeling steps to increase the model fidelity of this approach. These subsequent modeling steps shall take target properties, sensor capabilities, and measurement errors into account. Target properties include, e.g., color for camera modeling, material reflectance in the (near-)infrared spectrum for lidar modeling, and RCS (radar cross-section) for radar modeling. Sensor capabilities include, e.g., reflectance- or object-based FOV definitions. Measurement errors include, e.g., position estimation errors. Previous work in this context includes [24], [47], [48]. Reference [47] presented a lidar modeling approach that takes material reflectance and lidar capabilities into account. Reference [48] presented a radar modeling approach based on a large labeled dataset and machine learning in order to estimate the detection probability of objects. Reference [24] presented a sensor modeling approach that combines kernel density estimation with regression modeling to provide improved position estimations of targets. This approach was demonstrated for camera modeling but can also be applied to lidar and radar modeling. These three modeling approaches shall be implemented as subsequent steps building on the occlusion model presented here.