Multi-domain analysis of photovoltaic impacts via integrated spatial and probabilistic modelling

Currently, the impacts of wide-scale implementation of photovoltaic (PV) technology are evaluated in terms of such indicators as rated capacity, energy output or return on investment. However, as PV markets mature, consideration of additional impacts (such as electricity transmission and distribution infrastructure or socio-economic factors) is required to evaluate potential costs and benefits of wide-scale PV in relation to specific policy objectives. This study describes a hybrid GIS spatio-temporal modelling approach integrating probabilistic analysis via a Bayesian technique to evaluate multi-scale/multi-domain impacts of PV. First, a wide-area solar resource modelling approach utilising GIS-based dynamic interpolation is presented and the implications for improved impact analysis on electrical networks are discussed. Subsequently, a GIS-based analysis of PV deployment in an area of constrained electricity network capacity is presented, along with an impact analysis of specific policy implementation upon the spatial distribution of increasing PV penetration. Finally, a Bayesian probabilistic graphical model for assessment of socio-economic impacts of domestic PV at high penetrations is demonstrated. Taken together, the results show that integrated spatio-temporal probabilistic assessment supports multi-domain analysis of the impacts of PV, thereby providing decision makers with a tool to facilitate deliberative and systematic evidence-based policy making incorporating diverse stakeholder perspectives.


Introduction and context
When evaluating the systemic impacts of the rapid expansion of relatively new energy technologies within existing systems, effective decision making and policy development requires consideration of a wide range of complex inter-related issues. These include factors such as the potential costs for grid stabilisation (a technical factor) or the benefits related to fuel affordability (a socio-economic factor). However, for photovoltaic (PV) technology in national, regional or local contexts, these complex issues are currently largely evaluated on a simplistic basis; PV is in general considered as a homogenous collection of devices, with their aggregated output and impacts being similarly homogenous. With PV capacity in the UK exceeding 5 GWp as of mid-2014 [1], the overarching question in this 'whole system' context is 'how can we more accurately and holistically assess benefits and costs within such a vibrant PV market?'.
Geographic information system (GIS)-based modelling has been applied previously to facilitate wide-area assessment of the solar resource [2] or to evaluate PV system yields [3]. However, to date, an integrated approach that considers in detail all relevant factors (such as different array orientations), regional differences in environmental conditions (such as moving weather fronts or cloud transients) as well as variations in local demand profiles or socio-economic indicators has not been attempted. Furthermore, while some previous studies have applied probabilistic approaches to PV impact analysis [4,5] there is no evidence of work relating to the integration of probabilistic modelling with multi-parametric spatio-temporal analysis.
In terms of the impacts of PV on electricity networks, small-area studies have been carried out with somewhat contradictory results. For example, one study focussed on PV in the Scandinavian domestic sector [6] found that high penetration levels of PV power generation may cause voltage problems in the electrical network but that this also depends on the network type. Conversely, a UK study of the impacts of PV on a domestic low voltage network [7] indicated that even at very high penetrations of PV, network voltage rises are small and unlikely to cause problems. At a wider system level, it has been shown that the limited flexibility of base load generators produces increasingly large amounts of unusable PV generation when PV provides more than 10-20% of a system's energy [8].
Previous evaluation of the impacts of PV upon socio-economic indicators such as net household fuel costs is very limited. A UK study in 2007 based on nine dwellings in the social housing sector [9] tentatively indicated that PV can 'provide a significant contribution towards the annual electrical demand and an overall reduction of the fuel burden'.
In this context of complexity, the work described in this paper attempts to address these issues via a GIS-based modelling approach that integrates multi-domain spatial and temporal aspects. Such a multi-parametric approach is subject to a relatively high degree of uncertainty depending on the domain(s) under consideration, and thus a probabilistic technique that utilises a Bayesian inference network is used in conjunction with GIS. The research described here addresses a series of questions in the UK context:
† How much energy will be generated when and where? From a current reference point in terms of installations, a model for the performance prediction of systems based on post-codes was developed and validated against monitored datasets.
† How much PV is likely to be achieved with different policies and where is it likely to be installed? This considers different socio-economic drivers, cost curves of PV and work on installation scenarios giving links to the likely social background of installations, locations (as in regions) and quantities.
† Taking into account distribution network and socio-economic drivers, what future national installation architecture is optimal? This includes an estimate of individual dynamic system energy yields aggregated to generation regions.
† What system feedback effects will there be? Most policies will have effects on the questions above and thus it is foreseen that a feedback methodology will be created, calculating the costs/benefits for UK plc as well as evaluating likely responses of the policy makers and grid operators.

Spatial modelling and dynamic simulation of the solar resource
The UK PV ensemble is often deemed to be a uniform set of installations, whereas in reality numerous factors impact upon the output of any individual device. This simplistic view results in significant uncertainty when assessing the impact of relatively high penetrations of PV on local, regional and national power generation, transmission and distribution systems. Here, the focus is on three environmental and geometric aspects, namely the solar irradiation incident on the system and the system's tilt and aspect.
Theoretically, incident irradiation is directly related to latitude, with more northerly latitudes receiving less irradiation. However, this does not take into account local terrain effects such as increases in elevation. Furthermore, given the UK's specific maritime location, its daily weather can change rapidly, influenced strongly by transient depressions and high pressure weather systems. This influences insolation and for short time periods may even reverse expected trends. The UK is mostly influenced by the prevailing south-west wind, causing irradiation generally to decrease from Cornwall to Shetland. Despite the variability of irradiation across the UK, only between 80 and 94 weather stations regularly record it [10], leaving large areas of the country without information. In addition, typically only total global horizontal irradiation is monitored, whereas plane-of-array irradiation is required to accurately model PV yield. Thus, this research presents a framework for the production of UK-wide tilt irradiation data from the available inputs.

Framework outline
The framework within which the geographical diversity of the UK PV fleet is accounted for comprises a series of stages, each of which involves the sequential implementation of specific algorithms before progression to the next stage, as illustrated in Fig. 1.
The initial stage involves interpolation, whereby gaps between the meteorological office station observations are filled to produce a country-wide grid of high resolution data. Subsequently, each global horizontal irradiation value is separated into its constituent parts, namely beam and diffuse. This is a prerequisite to the final stage of translation, in which the beam and diffuse irradiation components are transformed onto a tilted plane. Separation requires a Sun geometry model, and the translation of irradiation components must be completed individually for each PV system because of its unique inclination and orientation. For a 2.5 km grid of the UK, this results in approximately 40 000 data points originating from 80 weather stations.

Interpolation algorithm
Of at least 12 methods of interpolation (each with a range of up to 11 parameters), the Kriging method was selected as it has proven effective in many fields [11]. It is especially suitable where data is spatially autocorrelated (i.e. where spatial relationships are correlated with proximity - Fig. 2). It is also effective where the sample points are poorly distributed or are few in number or where there is directional bias in the data. In Kriging, estimated output pixel values are calculated as weighted averages of the known input point values,
$$\hat{Z} = \sum_{i} W_i Z_i$$
where $Z_i$ are the values at the sample points and $W_i$ are the corresponding weights. The solution for the weights is achieved via implementation of a number of simultaneous equations.
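The weighted-average step can be illustrated with a minimal ordinary Kriging sketch. This is not the GIS implementation used in the study: the exponential variogram, its parameters, the station coordinates and the irradiation values are all hypothetical, chosen only to show how the simultaneous equations yield the weights.

```python
import math

def exponential_variogram(h, sill=1.0, range_km=50.0):
    # Assumed variogram model with zero nugget; the parameters are illustrative.
    return sill * (1.0 - math.exp(-h / range_km))

def solve_linear(a, b):
    # Gaussian elimination with partial pivoting for the small system a.x = b.
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ordinary_kriging(points, values, target, variogram=exponential_variogram):
    # Returns (estimate, weights) at `target` from the sampled points and values.
    n = len(points)
    # Variogram matrix bordered by the constraint that the weights sum to one.
    a = [[variogram(math.dist(p, q)) for q in points] + [1.0] for p in points]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    solution = solve_linear(a, b)
    weights = solution[:n]  # the final entry is the Lagrange multiplier
    estimate = sum(w * z for w, z in zip(weights, values))
    return estimate, weights

# Hypothetical station coordinates (km) and daily irradiation values (Wh/m^2).
stations = [(0.0, 0.0), (40.0, 5.0), (10.0, 60.0), (55.0, 45.0)]
values = [1100.0, 1050.0, 980.0, 1010.0]
estimate, weights = ordinary_kriging(stations, values, target=(25.0, 25.0))
```

With a zero-nugget variogram the estimate at a station location reproduces the station value, which is a convenient sanity check before applying the method to a full grid.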

Sun geometry model
Several Sun geometry equations that deliver the solar declination angle were trialled. Declination changes with season and arises from the Earth's constant tilt of 23.45° within its orbit around the Sun. The Strous algorithm [12], with an uncertainty of approximately 0.01°, was implemented, in which
$$\text{declination} = \arcsin\left(\sin(\text{eclong}) \times \sin(23.45^\circ)\right) \quad (1)$$
where eclong is the Earth-centred longitude calculated from the Julian Date. The declination angle is used to compute the clearness index, which is essential in the next stage.
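As a rough illustration of equation (1), the sketch below computes declination from a Julian Date. The longitude series used here is a widely published low-precision approximation and is an assumption standing in for the Strous algorithm, whose details are not given in the text; the function names are hypothetical.

```python
import math

OBLIQUITY_DEG = 23.45  # tilt value quoted in the text

def ecliptic_longitude_deg(julian_date):
    # Low-precision series (an assumption; not the Strous algorithm itself).
    n = julian_date - 2451545.0  # days since the J2000.0 epoch
    mean_longitude = (280.460 + 0.9856474 * n) % 360.0
    mean_anomaly = math.radians((357.528 + 0.9856003 * n) % 360.0)
    return (mean_longitude
            + 1.915 * math.sin(mean_anomaly)
            + 0.020 * math.sin(2.0 * mean_anomaly)) % 360.0

def declination_deg(julian_date):
    # Equation (1): declination = arcsin(sin(eclong) * sin(23.45 deg)).
    eclong = math.radians(ecliptic_longitude_deg(julian_date))
    obliquity = math.radians(OBLIQUITY_DEG)
    return math.degrees(math.asin(math.sin(eclong) * math.sin(obliquity)))

# Julian Dates for 00:00 UT on 21 June 2013 and 21 December 2013.
june_solstice = declination_deg(2456464.5)
december_solstice = declination_deg(2456647.5)
```

Declination should approach +23.45° near the June solstice and -23.45° near the December solstice, passing through zero at the equinoxes.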

Separation of beam and diffuse components
The algorithm for this stage of the framework was selected empirically. Only two UK weather stations log diffuse irradiation: Camborne in Cornwall and Lerwick in the Shetlands. The results of several split equations were compared with the actual measured observations and the model which delivered the closest match was selected. This was found to be that of Ridley et al. [13], which comprises an algorithm described by a sigmoid graph as follows: (see (2)) where k_t is the clearness index; AST is the apparent solar time (measured by direct observation of the Sun or a sundial and based on the length of the apparent solar day, which varies throughout the year because of the Earth's elliptical orbit and axial tilt); α is the solar altitude; and j is the persistence factor (average clearness index over 2 h).
Horizontal beam irradiance is then simply calculated as the original global measured irradiance minus the just-calculated horizontal diffuse irradiance.
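The structure of this step can be sketched as a logistic (sigmoid) function of the four listed predictors, followed by the subtraction that gives the beam component. The coefficients are deliberately left as an argument because the fitted values of equation (2) are not reproduced here; the tuple used in the example is purely hypothetical, as are the function names.

```python
import math

def diffuse_fraction(k_t, ast_hours, altitude_deg, persistence, coeffs):
    # coeffs = (b0, b_kt, b_ast, b_alt, b_j): fitted coefficients supplied by
    # the caller. No values from the source are assumed here.
    b0, b_kt, b_ast, b_alt, b_j = coeffs
    exponent = (b0 + b_kt * k_t + b_ast * ast_hours
                + b_alt * altitude_deg + b_j * persistence)
    return 1.0 / (1.0 + math.exp(exponent))  # sigmoid, bounded by 0 and 1

def split_global(global_h, fraction):
    # Beam is the measured global value minus the calculated diffuse value.
    diffuse_h = fraction * global_h
    beam_h = global_h - diffuse_h
    return beam_h, diffuse_h

# Hypothetical coefficients and inputs, for illustration only.
example_coeffs = (-5.0, 6.0, 0.0, 0.0, 1.0)
d_clear = diffuse_fraction(0.75, 12.0, 45.0, 0.7, example_coeffs)
d_overcast = diffuse_fraction(0.15, 12.0, 45.0, 0.2, example_coeffs)
beam, diffuse = split_global(500.0, d_clear)
```

With a positive coefficient on k_t, the diffuse fraction falls as the sky clears, which is the qualitative behaviour expected of a sigmoid split model.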

Translation to specific inclination and orientation
Previous work in Loughborough [14] has demonstrated that an all-sky model delivers the best results for UK conditions. Therefore the Hay and McKay equation [15] with Reindl correction [16] was employed. This has circumsolar and uniform diffuse components. Finally, inclined beam irradiation is obtained via a simple cosine calculation including the solar zenith.
Over the period 2005-2013, the Kriging step yields a yearly average cross-validation root mean square error (RMSE) of 56 Wh/m² (5%) for the interpolated values of global horizontal irradiation compared with measured values. Interpolation error is known to increase with distance from weather stations, hence it is anticipated that satellite data will eventually be incorporated to improve the model. Furthermore, the uncertainty of slope irradiance calculations is affected by the quality of inclination and orientation inputs. Currently, sample and standard values are being used but it is intended to utilise LiDAR data to derive slope and aspect.
Once estimates for plane-of-array irradiance have been achieved, these are utilised within a PV performance model, with the objective of establishing average values for roof pitch and housing aspect for various administrative areas (i.e. country subdivisions) and calculating the possible yield per area assuming a variety of installation scenarios. If PV aggregation is evaluated via transformer service area (i.e. the amount of PV mounted on each group of houses served by a respective transformer is considered), this provides the foundation for improved dynamic analysis of potential impacts upon distribution networks. In addition, if individual household roof slopes and orientations are grouped by postcode or lower super output area (LSOA, a geography for the collection of census data of approximately 650 households), socio-economic aspects of PV establishment may be studied, as demonstrated in Section 4 of this paper.
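A transposition of this kind can be sketched using one commonly published form that combines Hay's anisotropy index with Reindl's horizon-brightening term. It is an assumption that this matches the equations used in the study; the ground-reflected term, the albedo value and all names below are likewise illustrative.

```python
import math

def tilted_irradiance(beam_h, diffuse_h, extra_h, zenith_deg, sun_az_deg,
                      tilt_deg, surface_az_deg, albedo=0.2):
    # beam_h, diffuse_h: horizontal beam and diffuse irradiance.
    # extra_h: extraterrestrial irradiance on a horizontal plane.
    # albedo: assumed ground reflectance (not taken from the source).
    zenith = math.radians(zenith_deg)
    tilt = math.radians(tilt_deg)
    cos_zenith = math.cos(zenith)
    if cos_zenith <= 0.0:
        raise ValueError("Sun must be above the horizon")
    # Cosine of the angle of incidence on the tilted plane.
    cos_incidence = (cos_zenith * math.cos(tilt)
                     + math.sin(zenith) * math.sin(tilt)
                     * math.cos(math.radians(sun_az_deg - surface_az_deg)))
    r_b = max(cos_incidence, 0.0) / cos_zenith  # beam geometric factor
    global_h = beam_h + diffuse_h
    anisotropy = beam_h / extra_h               # circumsolar share of diffuse
    f = math.sqrt(beam_h / global_h) if global_h > 0.0 else 0.0
    sky_view = (1.0 + math.cos(tilt)) / 2.0
    beam_and_circumsolar = (beam_h + diffuse_h * anisotropy) * r_b
    uniform_diffuse = (diffuse_h * (1.0 - anisotropy) * sky_view
                       * (1.0 + f * math.sin(tilt / 2.0) ** 3))
    ground = global_h * albedo * (1.0 - math.cos(tilt)) / 2.0
    return beam_and_circumsolar + uniform_diffuse + ground

# Hypothetical inputs: a south-facing plane tilted 30 degrees, Sun due south.
flat = tilted_irradiance(300.0, 100.0, 600.0, 60.0, 180.0, 0.0, 180.0)
tilted = tilted_irradiance(300.0, 100.0, 600.0, 60.0, 180.0, 30.0, 180.0)
```

For a horizontal plane the expression collapses to the global horizontal value, which gives a quick consistency check on any implementation.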

GIS modelling of PV distribution
Building upon the foundation provided by GIS-based dynamic irradiation modelling, subsequent work focussed upon spatial and temporal evaluation of PV expansion, and the impacts of various policy frameworks upon PV diffusion dynamics. The following sections describe the GIS methodology implemented to analyse correlations between recent policy implementation and regional levels of deployment.

Regional case study
A case study region was selected at the outset as a means of both minimising the impact of geographic variables such as irradiation upon deployment dynamics and at a scale consistent with that of low and medium voltage network analysis. Cornwall in SW England possesses relatively high levels of PV deployment per capita; as a result, it provides a highly relevant case study to understand the characteristics of this more developed market, as a basis for considering how less established regional markets across the UK may evolve in the future. Furthermore, the peninsular geographic context for Cornwall has resulted in a distribution network which is relatively poorly interconnected at the low and medium voltage level [6]. This factor, combined with the high penetration of PV, provides a form of 'worst-case' scenario approach to consider potential electricity network impacts. To evaluate spatio-temporal trends, all grid-connected PV systems were characterised in terms of capacity (kWp), commissioning date, market segment (e.g. domestic rooftop, non-domestic rooftop or ground-mounted) and locational information. For the purpose of this work, locational data was defined within LSOAs. These are census-based areas, each containing around 600 households, which allow for subsequent integration of socio-economic datasets within the impact modelling framework.

Impact of policy upon PV deployment
Although policies supporting PV have existed in the UK for over a decade, it was the introduction of the feed-in tariff mechanism (FiT) in 2010 that catalysed substantial market growth [17]. The FiT followed similar examples set by other European states (notably Germany, Italy and Spain) which pay generators a premium for electricity produced, funded through consumer bills. Together with rapidly falling module prices, in the UK a relatively high initial FiT level led to a rapid expansion of the PV sector, which initially exceeded Government forecasts significantly. In response, reactive changes in the level of support occurred over a relatively short time-frame. Specifically in March 2012 rates were reduced and payments were linked to building energy efficiency standards. The duration of the subsidy was also reduced from 25 to 20 years and a further mechanism was introduced whereby the FiT rate was linked to the level of deployment, with higher deployment leading to a more rapid reduction in the FiT rate (i.e. a market triggered mechanism).
To gain a quantitative insight regarding the impact of policy dynamics on PV deployment, monthly installed domestic PV capacity in Cornwall was analysed for 2010-2013. The data shows that following the introduction of the FiT, deployment rapidly increased from a small initial base. High monthly deployment rates continued until the implementation of additional tariff reductions in March and August 2012. Fig. 3 shows how the monthly installations peaked just prior to the FiT reduction deadlines, that is, between March and August 2012. This snapshot of deployment highlights the key role of policy in deployment trends, in that both the level and structure of the subsidy have a significant impact on the dynamics of deployment rates. The introduction of a market triggered degression mechanism appears to have allowed for more stable and predictable monthly deployment trends.

Spatial and temporal evaluation
To develop an insight into regional spatial distribution, a GIS approach was utilised to aggregate domestic PV capacity at LSOA level to provide a comparison across the case study area. Fig. 4 shows the spatial distribution of domestic PV in Cornwall as of December 2013. This illustrates a significant inhomogeneity in terms of PV capacity density, with even adjacent LSOAs having substantial variations in installed capacities depending on socio-economic, landscape and planning factors, such as specific levels of urbanisation and availability of suitable roof-area to mount PV installations [6].
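The aggregation step can be sketched without GIS software as a grouping of installation records by LSOA up to a chosen date. The records, LSOA labels and field layout below are hypothetical and stand in for the characterised installation dataset described earlier.

```python
from collections import defaultdict

# Hypothetical records: (LSOA label, commissioning month, capacity in kWp,
# market segment). These are illustrative values, not data from the study.
installations = [
    ("LSOA_A", "2011-11", 3.5, "domestic"),
    ("LSOA_A", "2012-02", 2.5, "domestic"),
    ("LSOA_B", "2012-02", 4.0, "domestic"),
    ("LSOA_B", "2013-06", 50.0, "non-domestic"),
    ("LSOA_C", "2013-09", 2.0, "domestic"),
]

def capacity_by_lsoa(records, up_to_month, segment="domestic"):
    # Sum installed capacity per LSOA for one market segment, counting only
    # systems commissioned on or before `up_to_month` (format "YYYY-MM").
    totals = defaultdict(float)
    for lsoa, month, kwp, record_segment in records:
        if record_segment == segment and month <= up_to_month:
            totals[lsoa] += kwp
    return dict(totals)

end_2011 = capacity_by_lsoa(installations, "2011-12")
end_2013 = capacity_by_lsoa(installations, "2013-12")
```

Repeating the aggregation for successive cut-off months gives a time series of per-LSOA capacity that can then be joined to LSOA boundaries for mapping.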
The ongoing increase in domestic PV capacity as shown in Fig. 4 is characterised by a broadening of the distribution of installed PV capacity per LSOA over time, indicating that the penetration of PV in some LSOAs is increasing faster than the regional average. It should be noted that in the context of regions such as Cornwall, the inhomogeneous spatio-temporal evolution of PV capacity provides an important tool for assessing regional planning and policy effects, in terms of both the impact of PV on the regional electricity network [18], as well as upon specific socio-economic factors such as technology acceptance and household fuel affordability. It is the latter aspect that is the focus of subsequent analysis in this paper. It should also be noted that over time it is likely that maturation of the domestic PV market and related supply chains will influence perceived investment risk, as seen previously in more mature markets such as Germany [17].

Socio-economic impacts: PV and fuel affordability
An assessment of the impacts of community-scale PV implementation in a domestic rooftop context is presented. Community-deployed renewable energy technologies are seen as a valuable contribution to a number of energy policy objectives [19]. However, significant uncertainty exists with regards to the potential impacts of PV in terms of socio-economic policy goals, such as impacts on the incidence of 'fuel poverty' (proportional net household fuel costs). Such uncertainty derives largely from the wide variability of socio-economic parameters relevant within the PV deployment space. These uncertainties represent a significant risk for policy makers, particularly as their interdependencies are rarely modelled and poorly understood. The challenges of multi-disciplinary assessment have been partly addressed in parallel work by developing models which integrate socio-economic, environmental and technical factors to provide stakeholders with improved decision support, diagnostic and simulation tools [19]. However, the effective management of uncertainty remains a recognised problem; deterministic methods, for example, need to incorporate sensitivity analysis to better evaluate the variability of output parameters in relation to inputs within a multi-dimensional problem space. With a large number of parameters this can be difficult and often excludes a consideration of dependencies between inputs.
Latterly, probabilistic graphical models (PGMs) have grown in popularity for modelling problems that require the integration of multiple knowledge domains while endogenising uncertainty. In PGMs, model inputs and outputs are intrinsically probabilistic, rendering their variability explicit and their sensitivity to the multi-dimensional parameter space a matter of querying the model's joint probability distributions (JPD). Specifically, Bayesian networks (BNs) can model and integrate knowledge domains in a manner that is intuitive to interdisciplinary researchers and stakeholders [20]. BNs have previously been applied for modelling optimum carbon mitigation and economic decision making in agriculture [21] and energy scenario studies for national energy systems [22], and the endogenising of uncertainty which allows decision makers to visualise risk as part of a due diligence approach is a distinct advantage in such applications [23,24]. The utility of BNs in this application suggests that stakeholders can be provided with valuable socio-economic decision support or policy making tools. To this end, a BN has been constructed, and a candidate model is presented below, along with an overview of the data with which to encode the dependencies between variables. Finally, some results are explored and discussed in the light of implications for decision support and policy making.

Bayesian networks
A BN is a mathematical model depicted by a directed acyclic graph (DAG) where each variable is represented by a node and dependencies between variables are represented by directed edges between them (Fig. 5).
A root node has no incoming edges and is encoded with a discretised probability distribution. A child node has one or more incoming edges leading from parent nodes and is encoded with a conditional probability distribution for each combination of parent node values. A leaf node is a node with no child nodes. The conditional probability distributions quantify the relationship, causal or observational, between a variable and its parents' variables in the DAG.
The state space of the BN can be statistically enumerated using a JPD, P(U), which provides the probability of each possible combination of every variable in the BN. The semantic of the BN is the independency assumption: each variable of every pair of unconnected variables is independent of the other, given their parent values. The JPD can thus be factorised using the chain rule
$$P(U) = \prod_{i=1}^{n} P\left(X_i \mid \mathrm{pa}(X_i)\right) \quad (5)$$
where $X_1, \ldots, X_n$ are the variables in U and $\mathrm{pa}(X_i)$ denotes the parents of $X_i$ in the DAG. Thus the BN's encoded probability distributions encapsulate the JPD and thereby the entire knowledge domain for which the DAG is a conceptual model.
The utility of this highly compact knowledge representation is further enhanced with reasoning algorithms which propagate evidence (observations on one or more variables) to calculate a posterior probability distribution of all other variables in the BN [25]. Bayes' rule for conditional probability is used, which, given a variable A, calculates the posterior distribution P(A|B) given evidence B from the prior distributions P(A) and P(B) and the likelihood P(B|A)
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} \quad (6)$$
The benefits of a BN in this context are:
† The efficient storage and encapsulation of an entire knowledge domain.
† Effective inference-making in both a prognostic sense, when an observation is applied to a root node, or a diagnostic sense, when an observation is applied to a leaf node (one with parent but no child nodes).
† A visual conceptual model in the form of a DAG which is an intuitive causal or influence diagram for the problem domain.
† The integration of knowledge domains using probabilistic relationships between model parameters to create transdisciplinary knowledge.
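The chain-rule factorisation and the two directions of inference can be illustrated with a three-node toy network evaluated by brute-force enumeration. The variables, states and probabilities are hypothetical and chosen only for readability; practical BN software uses more efficient propagation algorithms than this sketch.

```python
from itertools import product

# Hypothetical chain: orientation -> pv_yield -> saving.
states = {
    "orientation": ["south", "other"],
    "pv_yield": ["high", "low"],
    "saving": ["high", "low"],
}
parents = {"orientation": [], "pv_yield": ["orientation"], "saving": ["pv_yield"]}
# One probability distribution per combination of parent values.
cpt = {
    "orientation": {(): {"south": 0.4, "other": 0.6}},
    "pv_yield": {("south",): {"high": 0.8, "low": 0.2},
                 ("other",): {"high": 0.3, "low": 0.7}},
    "saving": {("high",): {"high": 0.9, "low": 0.1},
               ("low",): {"high": 0.2, "low": 0.8}},
}

def joint(assignment):
    # Chain rule: product of each variable's probability given its parents.
    p = 1.0
    for var in states:
        key = tuple(assignment[parent] for parent in parents[var])
        p *= cpt[var][key][assignment[var]]
    return p

def posterior(query, evidence):
    # Sum the joint over all assignments consistent with the evidence,
    # then normalise (Bayes' rule by enumeration).
    totals = {state: 0.0 for state in states[query]}
    names = list(states)
    for combo in product(*(states[name] for name in names)):
        assignment = dict(zip(names, combo))
        if all(assignment[k] == v for k, v in evidence.items()):
            totals[assignment[query]] += joint(assignment)
    norm = sum(totals.values())
    return {state: p / norm for state, p in totals.items()}

prognostic = posterior("saving", {"orientation": "south"})   # root observed
diagnostic = posterior("orientation", {"saving": "high"})    # leaf observed
```

Observing the root node shifts the distribution of the leaf (prognostic use), while observing the leaf updates belief about the root (diagnostic use), using the same joint distribution in both cases.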

Object orientated Bayesian networks (OOBN)
An OOBN consists of a collection of connected BNs, each of which encapsulates a particular knowledge domain [26]. Thus Fig. 5 can be reinterpreted such that each object, A, B, C, D and E represents a functioning BN with its own factorised JPD, and the connections represent an interface between output nodes of one network and input nodes of another to enable the transfer of probabilistic information from one network to another. An OOBN facilitates trans-disciplinary enquiry and, particularly for a large network, provides a hierarchical model with each sub-network delivering the benefits listed above. Owing to the complexity and multi-disciplinary nature of the problem domain discussed in this paper this was the approach employed in this study. In the next section the knowledge domains which were integrated into a single OOBN are discussed.

Construction of the OOBN
A BN is often constructed using expert knowledge to define the dependencies and independences between the parameters included in the study [27,28]. An OOBN facilitates this approach and the academic literature was employed to support the DAG structure of each object. Fig. 6 presents a UML schema for the model with each titled box representing a network object and the crow-foot connections depicting an interface between the output node of one object and the input of another.
The evaluation of socio-economic impacts in a community context suggests a focus of the OOBN around defined UK LSOAs. Thus the root BN object was designed to probabilistically characterise the LSOA. The key parameters for which probabilistic data were obtained were the building type, age and floor area, the southernmost area, pitch and orientation of roofs from LiDAR data and modelled household income distributions from census data and the English Housing Survey using an iterative proportional fitting approach [18]. Using GIS, roof parameters are provided as inputs to the yield object which calculates the specific yield. Irradiation data was then used to provide a modelled yield for every property in the LiDAR dataset. This deterministic value is augmented with an uncertainty parameter calculated from empirical data and modelled data for the same systems. Outputs from the yield and area objects enable the modelling of yields in the PV system object (Fig. 7). A building energy demand object was constructed using empirical datasets from the NEED framework [29]. This furnishes the energy cost object with the inputs to provide a probabilistic domestic energy cost. The FiTs subsidy object takes as inputs the energy demand and PV system yields to determine income from export and generation tariffs. To account for energy self-use, variability data were used to derive probability distributions which were influenced by both the PV energy generated and the total household electricity demand. The last three objects are used to deliver three key indicators; the socio-economic object provides fuel affordability indicators, the NPV object provides a discounted cash flow analysis and the carbon object provides the carbon savings. It should be emphasised that this brief description masks somewhat the nature of the model's quantitative data. 
All the parameters have been solicited to furnish the BN with probability mass functions (PMFs) (discretised probability distributions), as shown in Fig. 7. Furthermore, objects which have parent nodes are encoded with a PMF for each combination of parent values. Thus there is a significant degree of data processing and statistical analysis to derive these distributions. Further discussion of all variables, data sources and preparation of PMFs can be found in [19].
The OOBN itself was constructed using Netica BN software [20] which allows the simple input of observations on any node to observe the influence of the evidence on all other variables as discussed in the next section.

Bayesian model implementation
The BN imparts an informative prior probability distribution for every variable in the network, while generating posterior distributions for socio-economic, financial and environmental parameters of interest. Fig. 8 illustrates an example of the application of the small-area technique for a typical LSOA. This shows distributions of system yield calculated using interpolation-based predictive modelling together with LiDAR-enhanced building-specific data [29], CO 2 reduction, NPV and percentage of household income spent on fuel, respectively, assuming all suitable rooftops are subject to PV installations within the LSOA in question. The fuel spend parameter was calculated using LSOA-specific domestic income distributions obtained from small area simulation methods based on iterative proportional fitting [30]. The BN also offers enhanced diagnostic or prognostic utility by fixing one or more specific node values (observations or predictions) and evaluating the resultant posterior distributions of all other variables of interest. Thus, the model achieves the objective of creating an integrated decision support tool with which a large spectrum of queries can be posed and probabilistic answers delivered.
With respect to providing an insight into fuel affordability aspects, it should be noted that UK Government fuel poverty indicators use a modelled energy demand calculated using a normative heating regime. Since UK households are generally not heated to the same intensity [31], official fuel poverty incidence may be expected to be higher in general than that suggested by the proxy indicator used in this research [32].
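The kind of probabilistic fuel-spend indicator discussed here can be sketched by combining two discretised distributions. The PMFs, the threshold and the assumption that income and net fuel cost are independent are all simplifications made for illustration; in the network described above these quantities are linked through conditional distributions.

```python
from collections import defaultdict

# Hypothetical PMFs for one area: annual household income and annual net
# fuel cost (after any PV-related savings), both in pounds.
income_pmf = {10000: 0.2, 20000: 0.5, 40000: 0.3}
fuel_cost_pmf = {800: 0.3, 1200: 0.5, 1600: 0.2}

def fuel_share_pmf(incomes, costs):
    # PMF of the percentage of income spent on fuel, treating income and
    # cost as independent (a simplifying assumption for this sketch).
    shares = defaultdict(float)
    for income, p_income in incomes.items():
        for cost, p_cost in costs.items():
            shares[round(100.0 * cost / income, 1)] += p_income * p_cost
    return dict(shares)

def probability_share_above(share_pmf, threshold_percent):
    # Probability mass lying strictly above a chosen threshold.
    return sum(p for share, p in share_pmf.items() if share > threshold_percent)

share_pmf = fuel_share_pmf(income_pmf, fuel_cost_pmf)
# The 10% threshold is a hypothetical choice for illustration.
at_risk = probability_share_above(share_pmf, 10.0)
```

Re-running the calculation with a fuel cost PMF that reflects PV installation shows how the probability mass above the threshold shifts, which is the style of query the indicator is intended to support.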

Conclusions
The results presented in this work demonstrate the potential for integrated spatio-temporal probabilistic modelling to provide valuable new insights across a range of contrasting domains. In specific terms:
Interpolated GIS-based solar resource modelling using meteorological station data was applied with the objective of improving the accuracy of dynamic solar resource prediction. The Kriging approach utilised gives a yearly average cross-validation RMSE (2005-2013) of 56 Wh/m² (5%) for the interpolated values of global horizontal irradiation compared with measured values.
The influence of specific policy measures upon PV sector expansion dynamics was examined, with the objective of improving policy development and implementation moving forward. The results indicate that policy can have a significant influence on the growth of installed capacity, not only by its ability to stimulate, but also to dampen the installation market, thereby reinforcing the need for a stable and transparent policy framework. The results identify drivers and provide a basis for informing subsequent probabilistic modelling with deployment distribution data based on empirical evidence.
In terms of socio-economic impacts, specific sustainability indicators provide a valuable multi-criteria parameter set for decision support which can account for diverse stakeholder perspectives. A probabilistic assessment of parameters of interest provides a versatile means of risk assessment relating to the attainment of key performance indicators in a wide number of simulated scenarios using a Bayesian approach. Thus, the prospects of PV impact optimisation may be further improved by deliberative policy and decision making under uncertainty.
Spatially disaggregated empirical energy demand and household income datasets have been used to provide a probabilistic indicator giving the percentage of income spent on fuel. Such a probabilistic approach provides a useful spatially disaggregated proxy indicator which can help with the targeting of mitigation interventions.

Acknowledgments
This work has been conducted as part of the research project 'PV2025 - Potential Costs and Benefits of Photovoltaic for UK Infrastructure and Society', which is funded by the RCUK's Energy Programme (contract no: EP/K02227X/1).