Quantification of existing rooftop PV hourly generation capacity and validation against measurement data

The spatio-temporal patterns of the existing rooftop photovoltaic (PV) production can provide valuable insights for the design of effective strategies for PV integration at large scale. In this work, we quantify the hourly production of existing rooftop PV installations by combining two large-scale methods, for detecting rooftop solar panels and for estimating their hourly PV generation, which are both applicable at the Swiss national scale. A validation against measured data of PV installations from 16 roofs in the Swiss Canton of Aargau shows that the existing PV area is detected accurately, while the hourly profiles overestimate the PV production, with higher errors in winter than in summer. These errors result in a mean overestimation of the annual PV production by 16%, assuming a triangular panel placement at low tilt (15°) on flat roofs.


Introduction
Solar photovoltaics (PV) are considered one of the key technologies in Switzerland's renewable energy strategy, which aims for an electricity production of 34 TWh by 2050 [1]. Achieving this ambitious goal requires the large-scale deployment of PV panels, particularly on rooftops. To design effective pathways towards an electricity supply with a high share of PV, we hence need not only to identify roofs with a high potential, but also to assess the spatial and temporal patterns of the electricity production of existing PV capacities. These may indicate (i) socio-economic factors that support PV deployment, (ii) areas with recent growth in PV and (iii) current and potential future challenges of distributed renewable energy generation for the energy system. Although the total installed PV capacity in the country has been estimated, its spatial and temporal patterns are to date unknown.
In this work, we propose an approach for quantifying the hourly production of existing rooftop PV installations, which is scalable to the national level for Switzerland, and validate the results against measurements. The approach combines, for the first time, a detection of the size and location of existing rooftop solar panels [2] with the estimation of hourly PV electricity generation [3]. We further expand the method from [3] to compare two scenarios for PV panel placement on flat roofs, as shed rows or as east-west facing triangular rows. The estimated PV area and hourly production are compared to real electricity generation data from 16 roofs with 31 individual roof parts in the Swiss Canton of Aargau (AG). Despite the relatively small sample size, the validation against such rarely available measurement data indicates systematic differences between the estimated and the actual generation. By combining two methods that are only addressed separately in the literature [4,5], this work can be used to quantify the current PV production at a high spatio-temporal resolution anywhere in Switzerland, whereby the validation results may serve as a basis to quantify the uncertainty related to the hourly PV yield.

Size and location of existing PV installations
The size and location of existing PV installations are estimated through a Machine Learning (ML)-based framework, which is used to compile a rooftop solar registry of Switzerland [2]. It combines high-resolution images at 0.25 m, collected during a multi-year aerial campaign by the Swiss Federal Office of Topography (swisstopo) [6], with a Convolutional Neural Network (CNN) for pixel-wise semantic image segmentation to locate rooftop solar installations and quantify their size. The strength of this method resides in the combination of transfer learning with an encoder-decoder architecture (U-Net), which drastically reduces the number of labelled images needed to train the algorithm. The use of geospatial tools for post-processing further reduces the number of false positives by eliminating predictions outside of rooftops, by aggregating the detected pixels to installations and by attributing them to the existing building stock. The framework can hence detect even small installations (< 30 kWp), which are not necessarily recorded by accredited certification bodies such as Pronovo AG or the Swiss Federal Inspectorate for Heavy Current Installations (ESTI).
As the aerial images for the measured buildings in the Canton of Aargau date from 2016, only installations older than this date are considered in the validation of the PV installation size.
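As an illustration of how a pixel-wise segmentation translates into an installation size, the following minimal sketch converts a binary PV mask into an area using the 0.25 m image resolution stated above. The function name, the mask representation and the optional tilt correction are assumptions made for this example; they are not taken from the framework in [2].

```python
import math

# Ground sampling distance of the aerial images, as stated in the text (0.25 m).
PIXEL_SIZE_M = 0.25


def pv_area_m2(mask, pixel_size_m=PIXEL_SIZE_M, tilt_deg=0.0):
    """Area covered by PV pixels in a binary mask (list of rows of 0/1).

    The mask is assumed to come from a vertically projected aerial image, so
    the pixel count gives the horizontally projected area. Dividing by
    cos(tilt) converts it to the area on the tilted roof surface
    (assumption of this sketch, not necessarily what [2] does).
    """
    n_pv_pixels = sum(sum(row) for row in mask)
    projected_area = n_pv_pixels * pixel_size_m ** 2
    return projected_area / math.cos(math.radians(tilt_deg))


# Hypothetical 4 x 6 pixel mask containing a 2 x 4 pixel installation.
example_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
example_area = pv_area_m2(example_mask)  # 8 pixels of 0.0625 m2 each
```

At this resolution each pixel represents 0.0625 m², so a single typical panel of about 1.6 m² covers roughly 25 pixels.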

Hourly rooftop PV electricity generation profiles
The hourly rooftop PV yield of the existing panels is estimated using the method proposed by Walch et al. [3], which consists of three steps: First, the physical potential is obtained from the incoming direct and global horizontal solar radiation and the surface reflectance (albedo). To be comparable to the validation data, we use the recorded hourly solar radiation from satellites during 2018 and 2019 [7].
Second, the geographical potential is obtained by multiplying the size of the PV installation (see Section 2.1) with the hourly rooftop tilted radiation. The tilted radiation estimation accounts for roof tilt and orientation, for shading effects and for sky visibility (see [3] for details). As PV panels on flat roofs are installed in various configurations, we model two scenarios for flat roofs: (1) In the "base" scenario, panels on fully flat surfaces (0°) are placed in south-facing shed rows, tilted at 30° and spaced apart by one projected panel width (~87 cm). This PV placement aims to maximize the electricity yield per panel with a realistic spacing between adjacent panel rows. (2) For practical reasons and to maximize the used PV surface, existing panels are often placed in adjacent, alternating rows facing (approximately) east and west at lower tilt. The "EW" scenario hence models panels on flat roofs as alternating east- and west-facing rows at 15° tilt, whereby all roofs with tilt angles below 10° are defined as flat.
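The geometry of the two flat-roof scenarios can be sketched as follows. The example assumes a panel width of 1 m along the tilt direction, which is not stated in the text but is consistent with the quoted projected width of about 87 cm at 30° tilt; the function names are illustrative only.

```python
import math


def projected_width_m(panel_width_m, tilt_deg):
    """Horizontal footprint of a tilted panel along the tilt direction."""
    return panel_width_m * math.cos(math.radians(tilt_deg))


def panel_to_roof_area_ratio(tilt_deg, gap_in_projected_widths):
    """Panel area installed per unit of flat roof area for repeating rows.

    The row pitch equals the projected panel width plus the gap, where the
    gap is expressed in multiples of the projected width.
    """
    pitch_per_unit_width = math.cos(math.radians(tilt_deg)) * (1.0 + gap_in_projected_widths)
    return 1.0 / pitch_per_unit_width


# Assumed panel width of 1 m (hypothetical, see lead-in).
shed_footprint = projected_width_m(1.0, 30.0)  # about 0.87 m

# Base scenario: 30 degree sheds, spaced by one projected panel width.
base_ratio = panel_to_roof_area_ratio(30.0, gap_in_projected_widths=1.0)

# EW scenario: adjacent east/west rows at 15 degrees, no gap between rows.
ew_ratio = panel_to_roof_area_ratio(15.0, gap_in_projected_widths=0.0)
```

Under these assumptions the base scenario places about 0.58 m² of panel per m² of roof, while the EW scenario places slightly more than 1 m², which illustrates why the triangular arrangement maximizes the used PV surface.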
Finally, the technical potential is obtained by multiplying the geographical potential with the panel and inverter efficiency reported in [3], which assumes that all PV panels are mono-crystalline panels. In contrast to [3], we do not account for additional losses due to soiling, wiring, system availability etc., as most analysed systems are new and we consider these losses to be negligible. For the comparison with the real data, the electricity yield is normalized by the PV installation size of each sample roof.
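The last two steps can be summarised in a short sketch: the geographical potential is the PV area times the hourly tilted radiation, and the technical potential applies the panel and inverter efficiencies. The efficiency values below are placeholders chosen for illustration and are not the values reported in [3]; all names are hypothetical.

```python
# Placeholder efficiencies for illustration only (NOT the values from [3]).
PANEL_EFFICIENCY = 0.17
INVERTER_EFFICIENCY = 0.95


def technical_potential_kwh(tilted_radiation_kwh_m2, pv_area_m2,
                            panel_eff=PANEL_EFFICIENCY,
                            inverter_eff=INVERTER_EFFICIENCY):
    """Hourly PV yield (kWh) from hourly tilted radiation (kWh/m2).

    No further losses (soiling, wiring, availability) are applied,
    following the simplification described in the text.
    """
    return [g * pv_area_m2 * panel_eff * inverter_eff
            for g in tilted_radiation_kwh_m2]


def normalise_by_area(yield_kwh, pv_area_m2):
    """Scale the hourly yield to kWh per m2 of installed PV area."""
    return [e / pv_area_m2 for e in yield_kwh]


# Hypothetical three-hour radiation series for a 100 m2 installation.
example_radiation = [0.0, 0.2, 0.5]
example_yield = technical_potential_kwh(example_radiation, pv_area_m2=100.0)
example_normalised = normalise_by_area(example_yield, pv_area_m2=100.0)
```

Because the normalised yield no longer depends on the installation size, errors in the detected PV area and errors in the hourly profile can be assessed separately.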

Measurement data
The measurement data used for the comparison have been provided by IBW, the local utility of Wohlen (AG), which owns about a dozen PV installations placed on its own and on customers' roofs. Five of them were installed before 2016, so their installed PV area can be compared against the CNN-based PV detection described in Section 2.1 (referred to as "CNN algorithm"). IBW also provides 15 hourly PV generation profiles (including third-party installations) for 2018 and 2019, which are used to validate the hourly PV profiles (see Section 2.2). To increase the number of roofs for validating the CNN algorithm, the actual PV area on five of these third-party PV installation sites is quantified visually from swisstopo aerial images, by counting the number of panels and multiplying it by 1.6 m², which is a typical size of PV panels [3]. Thus, 16 roofs with PV installations are considered in total, consisting of 31 individual surfaces. Table 1 summarises the most relevant characteristics of these roofs (and roof parts) as well as their use in this study. For reasons of confidentiality, no further information (e.g. address) can be given.

Size and location of existing PV installations
A qualitative comparison of the PV area detected by the CNN algorithm and the real area shows that the CNN algorithm performs very well in most cases (8/10 roofs), as shown in Figure 1b and c for roof IDs 10 and 36 (both flat). In two cases, it fails to detect the full installation (see Figure 1a for ID 9), partly due to the high reflection on the aerial images (9b,c), which is one of the limitations of the CNN approach, and partly due to the model sensitivity, which is estimated to be 84% for the class of pixels belonging to a PV installation on the image [2]. Further known limitations include the differentiation between solar thermal and PV panels and the detection of building-integrated PV, which are not present in the training dataset and could hence not be assessed by the CNN algorithm.
The above results are mostly confirmed when comparing estimated and real installed areas, shown in Table 2 as a percentage of the total roof area for all roof IDs. The remaining share corresponds to roof area that is not or cannot be covered by PV, owing to obstructing superstructures or other (e.g. static, economic or design) reasons. Except for IDs 5 and 9, where the CNN algorithm cannot detect a significant part of the PV panels (see Figure 1), there is a good match between the actual and detected PV area (< 15% deviation). Among these, the cases with a high deviation (IDs 22, 29, 36) occur for roofs where the PV area has been derived visually from aerial images, which may lead to discrepancies if the actual surface per panel is not equal to 1.6 m² (see Section 2.3). While these results are not representative due to the small sample containing only large installations (> 30 kWp), they suggest that the PV detection algorithm underestimates the real PV area by 3% of the roof surface on average, with 18% deviation.
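The area comparison can be illustrated with a small sketch that derives the reference area from a panel count (1.6 m² per panel, as in Section 2.3) and expresses both the reference and the detected area as a share of the roof surface. The numbers in the example are invented for illustration and do not correspond to any roof in Table 2.

```python
# Typical panel size used for the visual estimate, as stated in the text.
PANEL_AREA_M2 = 1.6


def visual_pv_area_m2(n_panels, panel_area_m2=PANEL_AREA_M2):
    """Reference PV area from a manual panel count on an aerial image."""
    return n_panels * panel_area_m2


def area_share_percent(pv_area_m2, roof_area_m2):
    """PV area expressed as a percentage of the total roof area."""
    return 100.0 * pv_area_m2 / roof_area_m2


def deviation_percentage_points(detected_m2, actual_m2, roof_area_m2):
    """Detected minus actual PV share; negative values mean underestimation."""
    return (area_share_percent(detected_m2, roof_area_m2)
            - area_share_percent(actual_m2, roof_area_m2))


# Hypothetical roof: 1000 m2, 250 counted panels, 370 m2 detected by the CNN.
actual_area = visual_pv_area_m2(250)
example_deviation = deviation_percentage_points(370.0, actual_area, 1000.0)
```

In this invented case the detection falls three percentage points of roof area short of the counted reference.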
A comparison of the two PV scenarios for flat roofs (see Table 1) shows that in most cases, PV panels on roofs with tilt angles below 5° are installed in a triangular arrangement, corresponding to the EW scenario from Section 2.2. On those roofs with tilts of 5°-10°, panels are mostly installed flat on the roof (corresponding to the base scenario). Only one roof uses a shed arrangement, which describes PV panels placed in individual rows of tilted panels (base scenario). The validation results suggest that the EW scenario represents a more accurate approach for estimating the real PV yield of flat roofs.

Hourly profiles of existing PV installations
To assess the quality of the modelled hourly PV generation profiles, the normalized (scaled to kWh/m²) hourly PV generation profiles from actual measurements and as modelled by the base and EW scenarios are compared in Figure 2 for a selection of two flat roofs (IDs 19 and 36, EW scenario) and one pitched roof (ID 9), for five consecutive days in winter and in summer 2018. For pitched roofs, the base and EW scenarios are identical (see Section 2.2). From a visual inspection, the EW scenario matches the real profiles more accurately, which is expected for triangularly mounted panels, confirming that the lower tilt of 15° is a more realistic assumption for flat roofs. Moreover, the match of the modelled profiles is generally better in summer than in winter. This is particularly the case for entirely sunny (clear-sky) days, while for cloudy or intermittently sunny days both models become less accurate. This may be due to discrepancies between the actual solar radiation and the satellite data, caused by local cloud coverage. Further differences may arise, for example, from inaccuracies in the PV geometries or from unexpected shading.
To reveal diurnal patterns, hourly differences between the actual and modelled (EW only) profiles are shown in Figure 3 as boxplots for five selected roofs per season. To avoid large outliers, hours with missing values have been discarded. Figure 3 shows no systematic bias for some roofs (e.g. ID 26), while for other roofs the model generally underestimates (e.g. ID 8) or overestimates (e.g. ID 16) the actual PV generation. For ID 16, this overestimation is most prominent in the morning hours and levels off towards the evening. The opposite pattern can be observed for ID 22, where the model underestimates the actual generation in the morning and overestimates it in the afternoon (and evening). Outliers are typically large in winter, even though the absolute solar irradiance and consequently the average (median) differences are smaller in winter. This is in line with Figure 2, where on some winter days the modelled and the actual generation differ by a relatively large amount. This may be due to phenomena that are not accounted for in the model, such as snow cover on PV panels or more extensive near-object shading. These patterns can serve as a basis for a better calibration of the model.
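The grouping behind such a diurnal comparison can be sketched as follows, using only the Python standard library. The season definition (meteorological seasons), the sign convention (modelled minus measured, so positive values indicate overestimation) and the use of `None` for missing values are assumptions of this example, not details given in the text.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

# Assumed meteorological seasons (December to February = winter, etc.).
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def median_hourly_difference(timestamps, measured, modelled):
    """Median of (modelled - measured) per (season, hour of day).

    Hours with a missing value (None) in either series are discarded.
    """
    groups = defaultdict(list)
    for t, y, y_hat in zip(timestamps, measured, modelled):
        if y is None or y_hat is None:
            continue
        groups[(SEASON_BY_MONTH[t.month], t.hour)].append(y_hat - y)
    return {key: median(values) for key, values in groups.items()}


# Hypothetical normalised values (kWh/m2) at 10:00 on four days.
example_times = [datetime(2018, 1, 10, 10), datetime(2018, 1, 11, 10),
                 datetime(2018, 1, 12, 10), datetime(2018, 7, 10, 10)]
example_measured = [0.02, 0.03, None, 0.10]
example_modelled = [0.03, 0.05, 0.04, 0.11]
example_medians = median_hourly_difference(
    example_times, example_measured, example_modelled)
```

The full distributions per group, rather than only their medians, would be needed to draw boxplots such as those described for Figure 3.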
A quantitative comparison of the actual and modelled (normalized) PV generation, using a weighted mean absolute percentage error (wMAPE), is shown in Table 3. As in Figure 2, the EW scenario generally outperforms the base scenario for flat roofs. The wMAPE is substantially higher in winter, with a daily mean of 31% (EW), while in the other seasons its daily mean lies at 14%-21% and 18%-31% for the EW and base scenarios, respectively. Hourly wMAPEs are generally 10%-20% higher than the daily values, as small temporal shifts between the production profiles can be balanced across the day. The estimated yearly production lies on average 11% above the measured values (EW), possibly due to different panel efficiencies, unaccounted losses, or overestimated solar radiation.
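The text does not spell out the wMAPE formula. A common definition, assumed here, divides the sum of absolute errors by the sum of the measured values, so that hours with high production carry more weight. The sketch below applies this definition at hourly and at daily resolution, with invented profiles, to show how a temporal shift inflates the hourly error but cancels in the daily sums.

```python
def wmape(measured, modelled):
    """Weighted MAPE in percent: 100 * sum|modelled - measured| / sum|measured|."""
    total = sum(abs(y) for y in measured)
    if total == 0:
        raise ValueError("wMAPE is undefined when all measured values are zero")
    abs_error = sum(abs(y_hat - y) for y, y_hat in zip(measured, modelled))
    return 100.0 * abs_error / total


def daily_sums(hourly, hours_per_day=24):
    """Aggregate an hourly series (whole days only) to daily totals."""
    return [sum(hourly[i:i + hours_per_day])
            for i in range(0, len(hourly), hours_per_day)]


def day_profile(values_by_hour):
    """Build a 24-hour profile that is zero except at the given hours."""
    profile = [0.0] * 24
    for hour, value in values_by_hour.items():
        profile[hour] = value
    return profile


# Hypothetical data: day 1 is shifted by one hour, day 2 is overestimated by 20%.
measured = day_profile({10: 1.0, 11: 2.0, 12: 1.0}) * 2
modelled = (day_profile({11: 1.0, 12: 2.0, 13: 1.0})
            + day_profile({10: 1.2, 11: 2.4, 12: 1.2}))

hourly_wmape = wmape(measured, modelled)
daily_wmape = wmape(daily_sums(measured), daily_sums(modelled))
```

In this example the hourly wMAPE is dominated by the one-hour shift on the first day, whereas the daily wMAPE only reflects the 20% overestimation on the second day.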
The overestimation of the annual normalized PV production also dominates the error in the absolute annual PV yield (in kWh), which lies at 16% above the measured data, computed across 9 roofs where CNN area and hourly PV estimates are available. This error is dominated by deviations in the PV yield for large flat roofs, which reinforces the importance of assessing different scenarios for PV on flat roofs.

Table 3. Weighted mean absolute percentage error (wMAPE, in %) between measured and modelled hourly PV profiles of roofs (ID) per season. Errors for flat roofs are shown for both scenarios, as base/EW. The column "all" shows the mean of all roofs (for both scenarios of flat roofs).

Conclusion
This work presents an approach to estimate the spatio-temporal patterns of rooftop PV generation from current installations by combining existing methods scalable to the Swiss national level. A comparison to measured data of 16 roofs shows that the PV area is identified accurately in most cases, while the real production is overestimated, with higher errors in winter than in summer. The modelled annual PV yield lies 16% above the measured data, assuming a triangular PV placement at 15° tilt on flat roofs. The errors are to be treated with caution, as the sample size is relatively small and there are several unknowns. As the detection of the existing PV areas and the estimation of the hourly potential electricity generation from PV are large-scale methods, the proposed approach can be used to estimate the existing PV production for all of Switzerland. The results can hence be used to exclude roofs with existing installations from future PV potential studies and to quantify the remaining exploitable surfaces on already occupied roofs, which may play a key role in future PV integration strategies.