Climate-Based Daylight Modelling for compliance verification: Benchmarking multiple state-of-the-art methods

Climate-Based Daylight Modelling (CBDM) gives designers the possibility to evaluate complex, long-term lu- minous environment dynamics. This complexity can be challenging to simulate, and even more challenging to communicate effectively through the use of performance metrics. A multiplicity of CBDM techniques and metrics has been developed over the last two decades, but these were rarely assessed against each other. This paper reviews four state-of-the-art techniques based on the Radiance raytracing engine and systematically compares them against a benchmark CBDM method. Four classroom spaces are used to carry out an inter-model comparison between performance metrics commonly used for compliance verification obtained from all analysed techniques. Additional sensitivity analyses assessed how changes in input variables influence such metrics. Results from the inter-model comparison showed that the representation of direct sunlight is markedly dif- ferent between the various CBDM techniques, and that metrics based on horizontal direct sunlight are very sensitive to the choice of simulation method. This led to differences in predicted Annual Sunlight Exposure up to 39 percentage points. Metrics that consider both direct and inter-reflected light were found to be more robust, with variations from benchmark results within ± 15%. The analysis of the input variables showed that sensor grid spacing and time-step interpolation do not significantly affect any of these metrics. Changes in orientation and sky discretisation scheme had different effects depending on the metric and technique considered. The need for authoritative benchmarking systems when introducing new performance metrics for compliance verification or new simulation methods is also discussed.


Introduction
The consideration of daylight in the design of a building was traditionally the preserve of the architect rather than the engineer. Designs were typically founded on rule-of-thumb estimates and daylighting properties were sometimes evaluated using scale models [1,2]. In contrast, the techniques devised to estimate the thermophysical performance of a building emerged from the discipline of building services, and so these were carried out by engineers. Dynamic thermal modelling using computer simulation gradually evolved from analytical/steadystate methods to become a commonplace and routinely applied technique to predict the overall energy and environmental performance of a building [3]. Using standardised climate data (i.e. weather files) to define the prevailing external conditions, dynamic thermal modelling became established in the 1980s as the foundation of Building Performance Simulation (BPS).
The ability to realistically simulate just a single point-in-time daylight condition for a simple space did not become a practicality until the early 1990s with the emergence of tools such as the Radiance lighting simulation system [4]. By the year 2000 the prediction of annual profiles of daylight illumination (using the same climate files as dynamic thermal modelling to generate the sun and sky conditions) became a practical possibility [5,6]. The prediction technique was eventually given the name Climate-Based Daylight Modelling (CBDM) in 2006 [7], and in the last decade it has all but completely replaced traditional daylighting evaluations, i.e. those founded on a single point-in-time condition such as the CIE standard overcast sky.
Although a relative latecomer compared to dynamic thermal modelling, CBDM has undergone a process of rapid evolution/development in the last fifteen years, resulting in multiple prediction techniques -a process which, at the time of writing, has yet to abate. Alongside these developments in the fundamentals of lighting simulation, there has been a huge increase in demand from users/practitioners for daylight modelling to address ever more challenging applications. These include: the prediction of daylight glare/discomfort [8]; the accurate simulation of daylight through a Complex Fenestration System (CFS) [9] the selection and calibration of daylight-responsive control systems [10]; the modelling of the non-visual effects of daylight [11], amongst others. However, a key factor in helping to establish CBDM as a mainstream technique has been its increasing presence in compliance schemes such as LEED (US) where it exists as an option alongside traditional methods. It was the UK however that took the lead in 2013 by being the first in the world to make the evaluation of proposed designs using CBDM a mandatory requirement for certain building programmes [12].
The combination of increased demand and rapid development has led to a situation where users face a potentially bewildering array of CBDM options with little authoritative material in the literature to help make informed choices regarding both applicability and expectations of accuracy. The purpose of the study described here is to address this situation by benchmarking four of the CBDM techniques most commonly used by practitioners/researchers against an 'in-house' formulation called the 4-component method which is believed to be the most rigorously validated of all CBDM techniques. The four CBDM techniques benchmarked are: DAYSIM; the 2-phase method; the 3phase method and the 5-phase method. Performance evaluation is based on a range of metrics including those which form the basis of US and UK guidelines. Thus the outcomes of the study described here are directly relevant to both practitioners and researchers/developers alike.

Development of CBDM
Although lacking a formal definition, CBDM is widely taken to be the prediction of any luminous quantity (illuminance and/or luminance) using realistic sun and sky conditions derived from standardised climate data. CBDM evaluations are usually carried out for a full year at a time-step of an hour or less in order to capture the daily and seasonal dynamics of natural daylight. Developed in the late 1990s, CBDM steadily gained traction -first in the research community, closely followed by some of the more forward-thinking practitioners.
It is possible in principle to carry out an annual CBDM evaluation using brute force, i.e. a unique simulation for each of the several thousands of daylight hours in the year. However, the computational demands are such that the time taken to generate the results would be prohibitive for much practical application. The computationally efficient simulation of a large number of different daylight conditions was made possible by the use of the Daylight Coefficient (DC) method [13], which simplifies the formulation of annual simulations to the formula: where the DC matrix stores the values describing the relationship between the virtual sensor points (n) and the 145 sky patches (plus one for the external ground), the sky matrix (S) stores the luminance values for each of the sky patches at each hour of the year (8760 h for hourly time steps), and the resulting illuminance matrix (E) is obtained by multiplication of the previous two matrices. The computationally expensive lighting simulation is only needed to obtain the DC matrix, thereafter all the rest of the process (i.e. the derivation of illuminances from any number of arbitrary sun and sky configurations) involves largely the multiplication of matrices which is relatively rapid. The physical accuracy of the DC values depends on the underlying lighting simulation engine. All CBDM techniques considered in the present study are based on the Radiance backward ray-tracing engine [4], which is tightly connected to the development of CBDM and it has been repeatedly validated throughout the years. More recently, software which embeds photon mapping or radiosity algorithms has also introduced tools to implement the use of DCs and perform CBDM [14,15]. Table 1 summarises the main validations concerned with the development of CBDM and Radiance-based techniques, and the expected statistical errors found for each of them. All of these simulation techniques are based on Radiance and implement different modifications of the DC method. Yet, they significantly differ in the way they represent sky, sun and fenestration systems, as they were conceived at different times and for different purposes. The earliest techniques were primarily developed to simulate daylight passing through clear glazing systems, whereas the latest ones focused more on the representation of CFS. Table 2 lists the main characteristics of the five validated CBDM simulation techniques that were compared in the present study; a short description of each technique is also provided hereafter.

4-Component method (4CM)
The 4-component method is believed to be the first example of an annual daylight simulation tool to perform what would be later on called CBDM. The validation database formed from simultaneous longterm measurements of exterior sky luminance patterns and interior illuminance -collected by the Building Research Establishment (BRE) at Garston, UK, as part of the International Daylight Measurement Programme (IDMP) -became the definitive dataset to validate the method [20]. The whole process is based on the use of the Radiance command rtrace to collect the contribution of each of the daylight components. The light is taken to be formed by four distinct components, each of them derived in a different manner: (i) direct sunlight; (ii) indirect sunlight; (iii) direct skylight; and (iv) indirect skylight. The direct components are derived from deterministic ray-tracing, whereas the indirect ones are computed stochastically, using Radiance's ambient calculation [21]. Illuminance values from climate files are used to determine whether the sky can be classified as a CIE clear sky or an overcast one, according to the Perez clearness index [22], and to create ad hoc skies by blending together the standard ones. This blend model was proved to perform marginally better than the Perez All-Weather model [23]. As noted, the accuracy of the 4-component method was tested using the BRE-IDMP validation dataset and proven to be comparable to that of the standard rtrace approach [21]. Thus, the 4-component method serves as the benchmark against which the other methods evaluated here are compared.

DAYSIM
Appearing shortly after the 4-component method, DAYSIM was developed using Radiance as raytracing engine, but introducing a modified version of the rtrace command, called dc_rtrace. This modification allowed the calculation of all the DCs in a single run, as the information about the surface that each ray hits is stored together with the coefficient itself [6]. DAYSIM played a very important role in the diffusion of CBDM, as it was arguably the first method to provide these capabilities along with a graphic interface to front-end users. It was later embedded in several BPS tools.
In the original version of DAYSIM, the sky is subdivided in 145 patches for the indirect light component, while for the direct sunlight there are up to 65 points over the sun path, specific for the chosen location, that are used to assign the sun position at each time step. There are 3 additional coefficients used for the external ground, which is divided into three concentric circular patches. The DAYSIM research version included three different calculation modes (Nearest Neighbor, Interpolated, and Shadow Test), for increasing accuracy in the process and in the results [16]. However, only the research version of DAYSIM offered all three modes. The widely available version, run "under the hood" by commercial software, uses only the Interpolated method. As stated by Reinhart and Walkenhorst [16], the overall statistical errors in an annual calculation are very similar between the Nearest Neighbor and the Interpolated modes, while the Shadow Testing outperforms both but requires a longer calculation time and it is only suggested in case of glare studies, for which there is a need of higher accuracy.
DAYSIM uses the records of direct normal and diffuse horizontal irradiances and feed them into the Perez All-Weather model; this is composed by two separate models, one for deriving illuminance values from irradiances and the second to recreate a luminance distribution on the sky vault from those values.
DAYSIM was also validated in conjunction with the use of Radiance trans and transdata materials, created to describe diffusing elements with a specular component, with or without angular dependency [17].
DAYSIM has been extensively used in the lighting simulation community, for both academic research and consultancy projects, due to its ease of use and its wide availability as a back-end component in several commercial simulation programs. Examples of its applications include: investigation of the daylight performance of different classrooms design [24]; the development of a novel circadian daylight metric [25]; design optimization of perforated solar screens [26]; urban studies on solar accessibility [27]; exploration of adaptive shading systems [28]; and many others.

2-Phase method (2 PH)
The 2-phase method (also known as 1-phase method or DC method) did not initially undergo a validation process, but the rcontrib command on which the method is based is the same used for the 3-and 5-phase methods. Such command replaced rtrace to calculate with more ease the Daylight Coefficients in a single run. Opposed to DAYSIM though, the sun and the sky contributions are not separated and the sun luminance is usually assigned to the three sky patches surrounding the actual sun position. This raised some issues related to the huge difference in brightness between adjacent patches; the interpolation commonly used in Classic Radiance when running an ambient calculation could possibly lead to significant errors in these areas where the luminous variance is so high. The suggested ambient parameters when using rcontrib are therefore switching off any interpolation and ambient caching that used to be set for rtrace, i.e. the parameters -aa (ambient accuracy) and -ar (ambient resolution) are overwritten to be always 0.
To counterbalance the big approximations that were introduced by assigning the sun luminance to a wide solid angle, new types of sky vault discretisation were introduced and their optional selection was made available in most of Radiance commands to perform CBDM. The so called Reinhart subdivisions are obtained by subdividing each patch in smaller parts, e.g. MF:2 (Multiplication Factor) defines 4 patches in each of the Tregenza's ones (except for the one at the zenith) for a total of 577 patches in the sky vault and one for the ground.
There is not a specific procedure to gather climate files for the 2phase method, but typically the Perez All-Weather model is applied through the gendaymtx command. Alternatively, the sky matrix can be formed by multiple genskyvec calls, used in combination with gensky to generate CIE skies or gendaylit for Perez skies.
The 2-phase method (and modified versions of it) has been used for different studies, e.g. early design envelope optimization [29], and performance analysis of novel three-dimensional textiles [30].

3-Phase method (3 PH)
To meet the increasing need for efficient simulations of CFS, the 3phase method was introduced and validated [18], following the creation of databases with Bidirectional Scattering Distribution Function (BSDF) definitions based on the Klems model and available from e.g. the software Berkeley Lab WINDOW 1 or via the Radiance genBSDF command [9,31]. BSDF definitions are used to describe the behaviour of light redirecting systems by binning the light flux that enters the system and relating it to the light flux that exits the system. BSDF definitions can also be produced from measurements of CFS taken with a goniophotometer [18].
In order to avoid the inclusion of complex geometries that would slow down Radiance dramatically, or that would be impossible to accurately simulate at all (e.g. light-pipes), the 3-phase method consists in two separate raytracing processes, one for the light that goes from the sensor points (or view point) to the fenestration system (stored in the View Matrix V), and one for the light that goes from the fenestration system to the sky vault (stored in the Daylight Matrix, D). The last step is the multiplication of these two matrices with the transmission matrix (T, the BSDF definition stored in the xml file) and with the climate data for a whole year (the Skylight Matrix S). For an hourly simulation (i.e. (2) This simulation strategy has the advantage of facilitating parametric analyses and comparison studies aimed at selecting the best performing fenestration system, as the transmission matrix can be replaced in the above multiplication without running the whole simulation again. This is advantageous for the modelling and analysis of adaptive systems too.
The raytracing process for each step is carried out by an rcontrib run as in the 2-phase method and the calculation of the Skylight matrix is also done in a similar fashion, by using the Perez All-Weather sky model. Both the subdivisions of the Tregenza patches on the sky vault and the Klems patches in the BSDF can be refined for greater accuracy, but at the cost of computational speed.

5-Phase method (5 PH)
To offer reliable simulation of CFS through the insertion of BSDF, and at the same time to offer greater accuracy than the 3-phase method in the calculation and visualisation of the direct sunlight, the 5-phase method was introduced and validated [19].
The 5-phase method takes the results of the 3-phase method (VTDS), then it subtracts the direct component from it (V TD S d d ds ) and recalculates it in a more accurate way (C S ds sun ), by using 5185 sun-like sources evenly distributed on the sky vault and by using variable-resolution Tensor Tree BSDF instead of the Klems angle basis [39]. In effect, similar to the DC subtraction used in the 4CM method for determining the various contribution. The equation that describes the concept behind the 5-phase method is: For the third term of Eq. (3), the sun is described as a light type source and it is therefore traced with a deterministic algorithm, rather than using a stochastic sampling strategy as in the 3-phase method. At this stage, all surfaces in the model are assigned a black material with no specular properties, hence it is assumed that they are all Lambertian reflectors.
Even if the 5-phase method was introduced more recently than the other methods, it has already been applied in several research studies, e.g. for daylight performance assessments of complex fenestration systems [40] and for the validation of a High Dynamic Range (HDR) sky monitoring system [41]. It was also taken into consideration for occupant-centric daylight evaluations [42].

CBDM metrics in building guidelines
Even though the Daylight Factor (DF) is still the most common requirement in building national codes, CBDM metrics made their appearance in building guidelines starting from 2013. In the USA, the LEED v4 energy rating system incoroprated CBDM metrics as part of its daylighting assessment [43], following the findings from a large field study on daylight levels and occupants' satisfaction [44,45]. In the UK, the first guideline to insert CBDM metrics was the one created for the Priority Schools Building Programme (PSBP), a mandatory programme promoted by the Education Funding Agency (EFA) to build and restore hundreds of schools in England and Wales [46].
The metrics specified in the two guideline requirements are defined as: • Useful Daylight Illuminance (UDI): Formed by a set of values, each representing the percentage of occupied hours where the illuminance level falls into certain ranges. The concept was first introduced in 2006 [15]. The sum of all UDI results has to add up to 100% for the same space. The ranges used in all analyses are [0-100 lx] (UDI-n for non-sufficient), [100-300 lx] (UDI-s for sufficient), [300-3000 lx] (UDI-a for autonomous) and over 3000 lx (UDI-x for exceeded). Sometimes the range [100−3000] is used as well, and is referred to as UDI-c, for combined.
• Spatial Daylight Autonomy (sDA): Represents the portion of the working plane that complies with the Daylight Autonomy (DA) requirement; DA represents the percentage of occupied hours where the illuminance level is higher than a certain threshold (300 lx) for each of the sensor points.
• Annual Sunlight Exposure (ASE): It considers only direct sunlight during the simulation and it represents the portion of the working plane where the sensor points recorded illuminances higher than a certain threshold (1000 lx) for more than a fixed number of occupied hours (250).
To obtain daylight credits for the LEED v4 energy rating system, each occupied space has to satisfy the requirements of an sDA300/ 50%> 55% (2 credits) or sDA300/50%> 75% (3 credits), as well as an ASE1000/250hr< 10%. The first requirement aims at providing enough light into the space and has to be verified with movable shading devices in place, if they are deemed necessary; it prescribes that more than 55% (or 75%) of the working plane has to record illuminance values higher than 300 lx for more than 50% of the occupied hours. The second requirement takes care that the amount of direct sunlight entering the building is not excessive, even when movable shading devices are not operated (i.e. are open). Only fix form shadings are modelled in this case (e.g. overhangs are, but blinds are not). The prescription considers direct sunlight only, for less than 10% of the working plane has to record illuminance values that are higher than 1000 lx for more than 250 occupied hours.
For PSBP compliance, the requirements are an sDA300/50%> 50% and a UDI-c(100-3000lx)> 80%. The second condition prescribes that the illuminances recorded over the working plane fall within the range 100-3000 lx for more than 80% of the occupied hours. The UDI-c values are collected at each sensor points, and then their average is calculated to give the working plane overall result. Movable shading systems are not modelled, as only the designed fixed form is evaluated.

Methodology
The main objective of the work was to compare some of the widely available Radiance-based CBDM methods, expressing their results in terms of annual daylight metrics. In previous validation studies of Radiance-based methods, a 'brute-forced' rtrace run (or the equivalent rpict for the generation of images for glare assessments) iterated for each hour of the year was often considered 'ground truth' and used as a reference for inter-model comparisons [6,19]. Here, the 4-component method was chosen as reference, as it is also based on the 'classic' Radiance rtrace command and it was thoroughly validated for clear glazing before [5,21]. The inter-model comparison was carried out on four different building spaces, which were modelled from four existing school classrooms. 2 In compare with the use of a 'shoe-box' model, this choice allowed a better understanding of the complexities that can be found when designing for actual buildings. For daylighting, modelling the details -optical and geometrical -of fenestration systems correctly is known to be very important to obtain accurate results from simulation [47]. Furthermore, when using the 3-and 5-phase methods, there are multiple approaches that the user can follow to represent the boundaries between the simulation of the interior environment and the exterior one. Working on a realistic design facilitated the understanding of which approach could better meet the needs of a designer. In the present work, both the 'sender' and 'receiver' surfaces were represented with the same geometry of the clear glazing surfaces, i.e. all frame and sill geometries were modelled explicitly, whereas the BSDF represented only the clear glass optical behaviour. Fig. 1 shows the four classroom spaces chosen for the analyses. The rendered interior, the model exterior and the plan layout are displayed for each classroom. The code names L3, L7, M1 and M5 are used from here onwards to refer to the single spaces. The orientation of the apertures is different in each classroom: L3 is a single side-lit room oriented towards North-West; L7 is a double-aspect room with windows on the North-East and South-East walls; M1 is a deep plan room with a single window looking towards due South; M5 is a room characterised by a slanted ceiling, with the main aperture looking towards due North and a clerestory window towards due South. Standard optical properties were assigned to the model surfaces as follows: the exterior ground surface and the interior floor exhibits a diffuse reflectance of 20%; the walls and window frames have a diffuse reflectance of 50%; the ceilings have a 70% diffuse reflectance; the windows are assumed to be double glazing units with a transmittance of 80% (equivalent to a 87% transmissivity, which is the parameter required as input by Radiance). For the 3-and 5-phase methods, the window glazing was represented by an 'infinite' BSDF created with the Radiance genBSDF command and based respectively on the Klems scheme and on the TensorTree scheme. An 'infinite' BSDF can be created by sampling only part of the fenestration system, which is representative of the overall optical behaviour of the system itself. This is then applied to an infinitely thin surface in the Radiance model. If the Klems basis is used, the result is equivalent to BSDF definitions produced by LBNL WINDOW.
The ambient parameters set for each method are reported in Table 3. These specific parameter combinations were chosen after a convergence tests run for each method, and they are suitable for the considered geometries. More information on the method adopted for the convergence test can be found in Ref. [48] (p. 44, Methodology Section).
The sky discretisation for the 'phased' methods followed a Reinhart MF:2 scheme, whereas the 4CM and DAYSIM were used in their standard form (see Table 2). The simulations were run considering the climate for London, UK, as the representative one for the location. The standard climate file was obtained from the EnergyPlus website 3 and downloaded in EPW format. The output collected from each simulation technique was an illuminance profile, i.e. a matrix containing an illuminance value for each sensor and for each time step in a year.
The appropriateness of evaluating daylight performance by simulating horizontal illuminance values over the working plane is under discussion at the moment. A shift towards luminance-based assessments is favoured by some researches [49], and the latest development in CBDM methods (i.e. 5 PH) is focusing on the accurate reproduction of the occupants' field of view. However, evaluations based on horizontal illuminance are still the norm for most metrics, thus only such metrics were considered for the present work. The analysis grid was set up at 0.80 m height, with a spacing between virtual sensors of at most 0.25 m and an empty boundary near the rooms' walls of 0.50 m.
The illuminance profiles were all post-processed with the same Python script. The calculation considered Daylight Saving Time. The following annual daylight metrics were obtained: • Total Annual Illuminance (TAI), annual cumulative illuminance value on the working plane, during occupied hours, expressed in klx  ASE is calculated by taking into account only direct sunlight, thus the simulation parameters had to be set appropriately for each method. The 4-component method gives equivalent results to hour-by-hour rtrace runs with ambient bounces (-ab) set to zero. DAYSIM calculates the direct sunlight component by setting -ab 0 as well. For the 'phased' methods, the procedure necessary to isolate the direct sunlight influence is slightly more involved; one ambient bounce is needed to 'find' the light source, but all opaque surfaces have to be transformed in black coloured surfaces in order to avoid any inter-reflection.

Inter-model comparison
The first set of results was obtained by performing a simple intermodel comparison with the five methods described before. The results for all the four case study classrooms were considered, to understand whether differences in geometrical configuration and window characteristics were influencing the annual metrics in different manners. Fig. 2 shows the results in terms of five annual daylight metrics: TAI, DA, sDA, ASE and four ranges of UDI. The error bars -indicating an error range of ± 15% -were inserted only on the graph displaying TAI values. TAI, being a cumulative metric, is affected by the same uncertainty of each illuminance value recorded by one sensor at a single moment in time. The uncertainty was set at ± 15% following the references reported in Section 2. For all the other metrics, a single error range cannot be unambiguously defined, as their uncertainty is highly sensitive to the room configuration. If the illuminance values oscillate around the metric's threshold, the uncertainty in the final result is higher; if, instead, the illuminance values are most frequently in a range well below or well above the threshold, then the uncertainty is almost none [50].
When observing TAI results, it is evident that all simulation methods considered are in agreement, independently of the room characteristics. The same can be noticed for DA values. When looking at sDA -which indicates the portion of the working plane that satisfies the DA requirement of 300 lx -and UDI ranges, some small variations can be noticed for rooms M1 and M5. These two rooms are those receiving less daylight, because of a deep plan (room M1) or because the main aperture is oriented towards North (room M5). In these two cases, using the 4-component method or DAYSIM seems to lead to slightly lower results. This could be either due to the fact that, irrespectively of the calibration performed initially, the Radiance ambient parameters are not high enough to sample all light coming into the spaces, or to an over-prediction of the daylight provision by the 'phased' methods. In both cases, the variation cannot be considered significant: the methods rank similarly when looking at sDA, UDI and TAI, and the error ranges determined for TAI results indicate that the variation is nevertheless within the uncertainty range that is generally accepted for daylight simulation.
The highest variation among simulation methods is readily noticeable on the ASE graph ( Fig. 2-d). ASE results for room L7 vary from a minimum of 4% when using the 4-component method, to a maximum of 44% when using the 3-phase method. The rank order among methods differs from the other metrics' order for the same room, but it is similar between rooms L7 and M1 when looking only at ASE. The other two classrooms, L3 and M5, do not receive sufficient direct sunlight during the year to record ASE values over zero, i.e. there is a small number of virtual sensors (or none) that receive sunlight with an illuminance over 1000 lx for more than 250 h.
To understand the reasons behind this large variation in ASE results, the spatial distribution of illuminance over the working plane was investigated. Room L7 is taken as example to illustrate the comparison between simulation methods, as the absolute difference obtained for ASE was the largest among the four rooms, while all other annual daylight metrics displayed a very good agreement, indicating that the chosen Radiance ambient parameters were sufficient to guarantee a full convergence for all methods. Still, the same explanation given for room L7 is valid for all other rooms, especially when considering several orientations as presented in the sensitivity analyses later on. Fig. 3 shows the spatial distribution of TAI values over the horizontal working plane for room L7. The values recorded by each virtual sensor were interpolated bilinearly to allow for a better comparison between methods that used a different number of sensors. For example, the 4-component method can make use of a very high number of sensors without compromising on computational time thanks to the Radiance ambient interpolation, a feature which is otherwise 'switched off' by the 'phased' methods. For room L7, the working plane was represented by a grid 92 × 116, whereas for the 'phased' methods a grid 20 × 25 was adopted (equivalent to a spacing between sensors of about 0.25 m). As a result of this, it can be noticed that the plot displaying the 4-component method's results (4CM) is characterised by finer details. However, this is not solely due to the grid resolution. The effect of the grid resolution on the final results was also investigated, and the related findings are presented in Section 5.2. As a matter of fact, the major difference between the simulation techniques under investigation is the way they represent the sun and the fenestration systems. In the 4component method and in the 5-phase method, the sun is represented as a point-like light source, allowing the reproduction of well defined shadows and solar patches on a plane. The 5-phase method has however an additional step in the simulation process, as the fenestration system is represented as a TensorTree BSDF. This slightly reduces the accuracy of the direct light patterns in compare with the 4-component method, but on the other hand it allows the simulation of CFS. 4 The 2-and 3phase methods cannot reproduce realistic patterns, as all direct light is smoothed over larger emitting surfaces, due to either the sky discretisation scheme (for both methods), or to the coarser Klems-based BSDF (for the 3-phase method). DAYSIM adopts a completely different approach, in the form of the so-called interpolated mode, by using four sun points at any given time, with intensities proportional to their distance from the actual sun position. This leads to spatial distributions that are often very different in shape than the other methods. The average TAI values are nonetheless extremely similar, as explained before.
The differences in spatial distribution are more visible in Fig. 4, which shows the area contributing to final ASE values for the different simulation methods. The plots represent the number of hours in a year in which the direct sunlight was more intense than 1000 lx; the white contours delimit the area where the number of hours is higher than 250. These are the two parameters currently required to determine ASE. The area taken into account for ASE notably differs from one simulation method to the other. With the 4-component and the 5-phase techniques, the high intensity illuminance values are limited to a small area close to the room apertures, whereas for the other methods that area reaches the back of the working plane.
Another way to look at this effect is to isolate a single moment in time from the yearly illuminance profile. Fig. 5 shows the direct illuminance distribution over the working plane for the 9 th March at 9 a.m. simulated with the five different methods. This moment in time was chosen as it displays the highest average illuminance obtained by the 4component method throughout the year. The average illuminance over the working plane is indicated under each plot, together with the portion of the plane that records illuminances higher than 1000 lx. It can be noticed that the average values are similar among methods, and they all fall within the expected uncertainty range of ± 15% from each other. On the opposite hand, the 'high intensity' area delimited by the white contour line changes significantly with the simulation method, ranging from 24% to 68%.

Sensitivity analysis
This second part of the analysis considered the effect of varying the simulation input on the final annual metrics, i.e. the sensitivity of CBDM simulation techniques and metrics to uncertainty in simulation input. The definition of each parameter is explained in the corresponding subsection. The sensitivity analyses performed for these parameters were simply obtained by iterating the simulation over a small range of the most common values that could be set by software users. To determine these values, the authors relied on personal experience, as well as on the answers obtained from a survey completed by several daylighting experts [48]. For parameters that are not solely dependent on user's choices, such as climate data or optical properties of the materials, more complex sensitivity analyses should be applied to consider different types of uncertainties. For example, optical properties are affected by both design decisions and measurement uncertainty; a detailed investigation on surface reflectance for CBDM was carried out in a separate paper [50]. The input parameters considered for the present analysis were the following: (i) sensor grid spacing, (ii) simulation time step, (iii) building orientation, and (iv) sky discretisation.

Sensor grid spacing
The spacing between virtual sensors is generally defined by the user and a grid of point is consequently generated over the working plane. For the 4-component method, which relies on the creation of images to extract point coordinates (stencil method), the user should specify the image resolution in pixels. The coordinates refer to the centre of the pixels.
Four values were considered for the 'phased' methods and DAYSIM (0.10, 0.25, 0.50 and 1.00 m), and five values for the 4-component method (0.05, 0.10, 0.25, 0.50 and 1.00 m). The additional value for the 4-component method was introduced as it is more typical for this technique to work with small grid spacing, i.e. high resolution images for the stencil method. Oppositely, when using high resolution grids with the other methods, the computational time significantly increases, thus unnecessary small spacing values should not be adopted without good reason. The results presented here are obtained from simulations run with a hourly time step, but the results were the same when using subhourly time step, as explained in the subsequent analysis. Fig. 6 shows the variation for four annual daylight metrics when using different grid resolutions in the case study classroom L7. Within the Figure, plot (a) illustrates the variation for TAI when using the five analysed simulation techniques, with an indication of a ± 15% range for the 4-component method's results. It can be noticed that for spacing smaller than 0.25 m all the results fall into that error range. With larger spacing distances, the variation among techniques tend to increase. TAI values obtained with the 4-component method tend to decrease with larger spacings, whereas 'phased' methods exhibit the opposite behaviour. The effect on the 4-component method can be explained by the fact that the sensor points are placed at the centre of the pixels that fill the working plane; by decreasing the resolution, the size of the pixels increases and their centre is placed progressively further from the edge of the working plane.
The second plot in Fig. 6 shows DA results, which were found to be insensitive to variations in grid spacing for this particular room configuration. Similarly, UDI-a results were not very dissimilar when using The white contour line divides the plane between sensor points that record illuminance greater than 1000 lx and those that record illuminance lower than 1000 lx. The ratio of the former over the total number of sensor points is indicated below the plots. It can be noticed that even though such ratios are very different, the average illuminance over the working plane is fairly similar among methods. larger or smaller spacings. ASE results display were also found to remain largely constant over different grid resolutions; the large difference between simulation methods visible in plot (d) was explained in the previous Section.

Time step
The effect of using different simulation time steps -i.e. the temporal frequency at which annual simulation are repeated -was investigated in a similar manner as for the grid spacing. A set of five time steps was selected (5, 10, 15, 30, 60 min) and the variation in annual daylight metrics analysed. This analysis was not carried out for DAYSIM as the software used to run it did not allow a straightforward variation of the time step. It is worth noticing that most simulation users run daylight analysis with hourly time steps and that the most common standard climate files report hourly irradiance and illuminance data. The analysis grid spacing adopted for the simulations presented here was 0.25 m. Fig. 7 clearly shows that both metrics and methods considered in the analysis are not significantly influenced by the choice of time step. This is the case for both the 4-component method, which used a linear interpolation of hourly data to derive sub-hourly time steps, and for the other methods, which instead used a stochastic interpolation method based on the work by Walkenhorst et al. [51] and Skartveit and Olseth [52].

Orientation
The effect of varying the building orientation was investigated mainly to test that the findings from the inter-model comparison (Section 5.1) were valid under different daylight conditions. As before, room L7 is taken as example, but the same analysis was repeated for all case study classrooms. The baseline configuration used in the intermodel comparison for room L7 corresponded to the North-East orientation here. As a reminder, room L7 is a double aspect space, with windows on the North-East and South-East facing walls.
From Fig. 8 it is possible to understand the sensitivity of three different annual metrics (TAI, DA and ASE) to changes in orientation. TAI is consistently similar when obtained from different simulation methods, also when considering multiple orientations, i.e. multiple daylight levels entering the space. TAI shows a sensitivity to the orientation, with lower values for the configuration tilted by −°45 (NW), in which the two glazed facades face North-West and North-East directions, and higher values for the opposite orientation (SE). Conversely, DA appears to be insensitive to the orientation, as it is to the choice of simulation method; this was found to be true for all the rooms where illuminance levels were generally above the 300 lx threshold even in absence of direct sunlight (i.e. due North orientation), whereas for darker rooms (e.g. case study M1) the orientation had a more pronounced effect on this metric too.
The last plot in Fig. 8 shows how ASE values are affected by the change in orientation, for the five different simulation techniques. The orientations for which little or none direct sunlight enters the space result in a value equal to zero (N and NW). For all the other orientations, ASE results are proportional to the amount of direct sunlight, with South-East resulting in values as high as 89% when obtained from the 3-phase method. It is important to notice how the discrepancy between methods also increases with increased sunlight levels, indicating that the main reason for the difference between CBDM techniques lies in their representation of the light coming from the sun, as explained in previous Sections. ASE results are further investigated in the following analysis, to account for the possibility of improving the representation of the sky -and of the sun, for some techniques.

Sky discretisation
For the 'phased' methods the resolution of the sky subdivision can be modified by the user (in the Radiance command-line procedures, but usually not in end-user software). The basic scheme follows the Tregenza subdivision in 145 sky patches; each of the patch is increasingly subdivided in the so-called Reinhart schemes, identified by the notation MF:n, where MF stands for Multiplication Factor and n is the number of times the sides of each patch is subdivided (excluding the patch representing the zenith cap). The subdivisions applied for the present analysis were MF:1 (Tregenza scheme), MF:2 (577 patches) and MF:4 (2305 patches). All previous analyses with the 'phased' methods were performed with a MF:2 subdivision. For the 4-component method and for the interpolated mode of DAYSIM it is not possible to change the sky and sun representation. Fig. 9 shows the effect that higher resolution skies have on ASE, for the 2-, 3-and 5-phase methods, considering multiple orientations. Only  . 8. TAI, DA and ASE results for room L7 rotated towards multiple orientations and obtained with five CBDM simulation techniques. TAI results shows sensitivity to orientation, and the different simulation techniques produce very similar results for all configurations. DA is also characterised by a very good agreement among methods, but it was found to be largely insensitive to orientation (for this specific space). ASE results in variations among methods even greater than those found in the inter-comparison, for orientations that allow more daylight in the space.
ASE is considered here, as the intention was to investigate whether the large variation found between simulation techniques when calculating such metric could be reduced by improving the sky definition. All other metrics considered here were found to give consistent results across techniques. The benchmark here is defined by the rtrace results (solid line), which are equivalent to the results obtained from the 4-component method. In this analysis, instead of using the daylight coefficient method, an iteration of instantaneous rtrace runs with no ambient bounces (-ab 0) was performed for each hour in the year.
The first plot in the Figure refers to the 2-phase method's results; in this case, an increase in sky resolution leads to an improvement in ASE results, i.e. the values are progressively closer to those obtained from rtrace. This is a plausible finding, as in the 2-phase method the sun is represented by three adjacent patches, whose solid angle becomes increasingly smaller when the subdivision is refined. The second plot shows results for the 3-phase method; here the improvement due to the sky resolution is less pronounced than for the 2-phase method. This is due to the fact that the 3-phase method applies an additional discretisation to the incoming light flux on the window plane (because of the representation of the fenestration as a BSDF created on a Klems basis), which again distributes the light on 145 large solid patches. A refinement of the BSDF Klems basis would be necessary to improve ASE results, but that would nullify the efficiency in computational time for which the 3-phase method was created. For the 5-phase method, shown in the last plot, ASE values were expected to remain the same, independently of the chosen sky resolution. The 5-phase method combines the diffuse daylight component obtained with the 3-phase method with a more accurate direct component simulation. The sky resolution can be modified for the former, but not for the latter, for which a scheme with 5185 suns is predefined. As ASE is calculated from direct sunlight illuminance alone, its results are not affected by changes in sky resolution. The agreement with rtrace results was however found to be good, making the 5-phase method the most accurate one in terms of direct sunlight simulation, among the widely available CBDM techniques. These findings were consistent for all the other case studies.

Discussion
The findings from the inter-model comparison and sensitivity analyses described here have demonstrated how performance evaluations can be highly sensitive to: the choice of CBDM software; the initial model configuration; and, user assumptions. These can act independently or in combination with each other. In particular, the key LM-83/LEED performance measure ASE was found to be very sensitive to variations in input parameters such as orientation (and sky discretisation for the 2-phase method), as well as to the choice of simulation method. This pronounced sensitivity poses challenges in comparing results obtained from different sources and in establishing an authoritative ground truth. For practical application, the choice of method could determine the outcome of a compliance pass/fail evaluation using the current version (v4.1) of LEED -a key finding which the practitioner community (particularly in the US) should be aware of.
In contrast to ASE, some of the other metrics displayed a remarkable lack of sensitivity to variation in fundamental input parameters. For example, when assessing case study L7 for different (compass) orientations, DA was found to remain constant. This might be read as an indication that the daylight performance is adequate whatever the orientation. Equally, one might infer that this metric is, for some designs, too insensitive to be effective in revealing variations in daylighting performance. Consequently, care should be taken when choosing a metric to act as a diagnostic/target for parametric design or optimization strategies. One should first ensure that the metric's sensitivity to the investigated parameter(s) is adequate. Similar conclusions were reached in relation to sensitivity to the assignment of surface optical properties [50], which were found to affect significantly the uncertainty in CBDM evaluations. Another important parameter is obviously the external climate [48], which is the subject of additional on-going research by the authors [53].
Some of the discrepancies between techniques reported here have been noticed in studies focusing on the development of simulation software [54] or metrics [55]. However, the authors believe this is the first study to systematically compare outcomes for multiple compliance metrics across all of the commonly used and state-of-the-art CBDM simulation techniques. Amongst Radiance experts, it is usual practice to take the results obtained from the rtrace command as the 'ground truth' against which other methods are compared (which is essentially how the 4-component method works). However, this approach requires a high level of expertise with command-line programming making it impractical for everyday users who wish to run comparison tests. Another type of approach is to compare simulation output from any new/ unproven tool against a set of benchmark results, i.e. the same rationale which led to the creation of the of the BESTEST schema for the evaluation of output from thermal modelling tools [56]. The development of an equivalently rigorous methodology to BESTEST but formulated specifically for the evaluation of daylight simulation output would appear to be timely if not overdue. Initial suggestions for a reference test office by Reinhart et al. [57] or a suite of idealised space geometries by Iversen et al. [58] have been instructive examples, e.g. revealing basic weaknesses in some non-CBDM software tools. The Commission Internationale de l'Éclairage (CIE) provides test cases to benchmark lighting computer programs [59], but the focus is exclusively on the physical accuracy of the underlying basic algorithms and does not scale to remotely real-world test cases. All of these aforementioned test cases lack both the variety and -as revealed in this study -diagnostic potential of the four spaces used here, not to mention also the real-world congruence. The authors of this study therefore propose that a variety of real-world test cases -such as the four spaces L3, L7, M1 and M5 used here -might better serve as reference benchmark models than idealised 'shoebox' spaces.
Further work will evaluate the differences of the methods investigated in this paper when the geometry includes CFS. As some of the methods were developed with the specific intent of accurately simulate CFS, it is an important aspect to consider.

Conclusion
The work presented in this paper has systematically assessed five state-of-the-art CBDM techniques which all have Radiance as the underlying simulation engine, but employed in very different ways. The techniques were evaluated by inter-model comparison using the 4component method as the benchmark since that has been shown to be the most rigorously validated of all the techniques. Outcomes were evaluated for four very different real-world spaces using key annual daylight metrics such as DA, UDI and ASE since these form the basis of current CBDM guidelines (including a mandatory one in the UK). The analysis employed windows with clear glazing which is still overwhelmingly the most common fenestration system used in buildings.
The five CBDM methods were found to be largely in agreement when expressing the annual results in terms of DA or TAI, i.e. metrics that account for total illuminance (both direct and diffuse) and that are averaged over the working plane. In such cases, the deviations due to the choice of simulation technique is within the overall uncertainty typical of daylight simulation, i.e. ± 15 to 20%. For metrics such as sDA -which take into account the spatial distribution of illuminance values too -and ASE -which is based on spatial distribution and on the isolated effect of direct sunlight -the differences between simulation methods became significant. Deviations of up to 39% points from the benchmark were found in the case of ASE for well daylit spaces. These uncertainty ranges were identified for the location used in this work (London, UK), but they might be even greater for sunnier locations; future research work will investigate this aspect in further detail.
Additional sensitivity analyses were performed to quantify the effect of simulation input variations on annual metrics. The analysis grid spacing and time steps were not found to be influential. Changes in orientation were found to be affecting some metrics (TAI, ASE), but not others (sDA, DA). Finer sky discretisation schemes had beneficial effects only on the accuracy of the 2-phase method.
The core difference between simulation methods results from the representation used to model the direct sun contribution, which influences significantly the consequent spatial distribution of direct sunlight. The methodology devised by the authors has ensured that the predicted measures were sensitive diagnostics of simulated performance outcomes applied to a variety of real-world space types. Thus, the findings -especially those for key compliance target ASE -are of considerable value to practitioners faced with the challenging task of selecting appropriate tools/methodologies for compliance verification amongst the many alternatives available.