Site adaptation with machine learning for a Northern Europe gridded global solar irradiance product


Gridded global horizontal irradiance (GHI) databases are fundamental for analysing the technical and economic aspects of solar energy applications, particularly photovoltaic applications. Today, there exist numerous gridded GHI databases whose quality has been thoroughly validated against ground-based irradiance measurements. Nonetheless, databases that generate data at latitudes above 65° are few, and the available gridded irradiance products, which are either reanalyses or based on polar orbiters, such as ERA5, COSMO-REA6, or CM SAF CLARA-A2, generally have lower quality or a coarser time resolution than gridded irradiance products based on geostationary satellites. Amongst the high-latitude gridded GHI databases, the STRÅNG model developed by the Swedish Meteorological and Hydrological Institute (SMHI) is likely the most accurate one, providing data across Sweden. To further enhance the product quality, the calibration technique called "site adaptation" is herein used to improve the STRÅNG dataset; site adaptation seeks to adjust a long period of low-quality gridded irradiance estimates based on a short period of high-quality irradiance measurements. This study introduces a novel approach for site adaptation of solar irradiance based on machine learning techniques, which differs from the conventional statistical methods used in previous studies. Seven machine-learning algorithms have been analysed and compared with conventional statistical approaches to identify the most accurate algorithms for site adaptation in Sweden. Solar irradiance data gathered from three SMHI weather stations are used for training and validation. The results show that machine learning can substantially improve the STRÅNG model's accuracy. However, due to the spatiotemporal heterogeneity in model performance, no universal machine learning model can be identified, which suggests that site adaptation is a location-dependent procedure.

Introduction
Global primary energy consumption increased from ~28 PWh to ~165 PWh between 1950 and 2021 [2]. In 2021, about 82 % of global primary consumption came from fossil fuels [2], which have hitherto been the main driver of climate change, one of the most challenging problems facing humanity [24]. To mitigate climate change, developing appropriate and supportive policies and regulations on deploying renewable energy is considered the foremost step [21,27]. The European Union (EU) has set progressive targets to reduce greenhouse gas emissions through 2050 [13]. The energy sector is responsible for 75 % of the European Union's greenhouse gas emissions, and to ensure that the EU achieves the ambitious climate targets set, a revised Climate and Energy package was released in July 2021 [15]. The target is to reduce greenhouse gas emissions by at least 55 % by 2030 compared to 1990, which demands significantly higher shares of renewable energy in the energy mix. In this regard, the current EU target of at least 32 % renewable energy by 2030 is insufficient, and according to the climate target plan, it needs to be adjusted to 38–40 % to broaden the prospect of reaching the greenhouse gas emissions target by 2030 [14].
Sweden is a member of the EU and has a political agreement to have a 100 % renewable electricity system by 2040 [34]. To achieve this national goal, the Swedish government has supported, amongst other renewables, the solar energy sector with subsidies for photovoltaic (PV) installations. A higher subsidy level has increased the profitability margin of PV installations. The cumulative grid-connected PV capacity has grown from 4 MWp in 2005 to 2.38 GWp in 2022 [46]. In 2020, the Swedish government decided to support the PV sector with a tax deduction on labour and materials, capped at 50 kSEK, instead of subsidies; the policy commenced on January 1, 2021 [11].
To support the solar PV sector, gridded solar irradiance products at high spatiotemporal resolutions have been developed to better support siting, sizing, design, performance evaluation, and operation & management of PV systems. Campana et al. [4] analysed several gridded global horizontal irradiance (GHI) databases and determined that CM SAF SARAH-2 performs best amongst the five databases. CM SAF SARAH-2 covers Africa, the Atlantic, Europe, and part of South America with a resolution of 0.05° × 0.05° and is based on the geostationary METEOSAT satellites [12]. However, a drawback of using CM SAF SARAH-2 is that the database does not provide GHI estimates for latitudes above 65°, owing to the limited field of view. On the other hand, the product based on the STRÅNG model, which attained the second-best result in that comparison [4], is not restricted by the latitude limit, since STRÅNG also integrates data from polar orbiters into its modelling. Specifically, STRÅNG is developed by the atmospheric remote sensing group at the Swedish Meteorological and Hydrological Institute (SMHI). It outputs estimates of GHI, beam horizontal irradiance, diffuse horizontal irradiance, photosynthetically active radiation, and CIE-weighted UV irradiance (i.e., the irradiance at each wavelength in the UV is weighted by the weighting factor given by the Commission Internationale de l'Éclairage (CIE) action spectrum, which gives the CIE-weighted spectral irradiance). It has a temporal resolution of 1 h, covering the Nordic countries with a spatial resolution of 2.5 km × 2.5 km [23,38].
Solar resource assessment is the foremost step in developing a solar project. However, owing to cost and time constraints, long-term ground-based radiometry at the target site is rarely available. Therefore, one has to perform resource assessments based on satellite-derived irradiance [32]. Physical and empirical models are the two main approaches for estimating satellite-derived irradiance [26]. Physical models are based on (reduced forms of) radiative transfer, which seeks to estimate the attenuating effect of the atmosphere on incoming radiation. Empirical models, on the other hand, regress the ground observations onto various predictors, such as the recorded intensity of the satellite's visible channel, such that when new predictors are available, the fitted regression can issue predictions accordingly.
Polo et al. [31] conducted a benchmarking study of several site-adaptation techniques to assess the quality improvement brought by those techniques to ten gridded irradiance products, covering satellite-derived and reanalysis solar radiation data. It was found that most techniques can provide significant improvement at most sites in terms of bias reduction. At the same time, some sites with high-quality satellite-derived irradiance showed no noticeable improvement after site adaptation. To that end, the authors emphasised that no universal procedure can apply to all possible combinations of sites and modelled datasets, and the quality improvement is, in the main, heterogeneous. Two commonly used site-adaptation techniques to reduce bias and improve model performance in a given geographical area are linear regression and quantile mapping [32,44].
Since the scientific principle of site adaptation is one of regression, one need not be restricted to statistical methods as Polo et al. [32] were. Narvaez et al. [28] have already used machine learning as a site-adaptation technique and shown improvements over traditional quantile mapping. Machine learning offers numerous advantages stemming from its diversity and versatility, which have consistently proven highly beneficial. There is thus a rich opportunity for developing machine learning models for site adaptation. Even so, as is the case for statistical site adaptation, the performance of machine-learning-based site adaptation models also depends strongly on the local weather and irradiance regime, which is hard to know a priori. In other words, one cannot infer model performance from experience with high certainty. Instead, several machine-learning models must be developed and compared to find the optimal model, according to specific performance criteria and under the local regime.
In this study, machine learning, given its ability to perform regression, is used as a site-adaptation strategy to improve the quality of the GHI generated by STRÅNG. Site adaptation in solar energy meteorology refers to correcting the bias in a long period of gridded data using a short period of ground-based measurements [31,32]. From the large pool of available machine-learning models, several algorithms with distinct prediction mechanisms are chosen: support vector regression (SVR), k-nearest neighbour (kNN), Bagging (BAG), LSBoost (BOS), XGBoost (XGB), CatBoost (CAT), and artificial neural network (ANN). The performance of the site-adapted GHI using these machine-learning techniques is compared to the GHI estimates from STRÅNG and two traditional statistical site-adaptation approaches (linear adaptation (LA) and quantile mapping (QM)) using ground-based measurements at three locations in Sweden.
In this study, an innovative approach is employed by using cutting-edge machine learning techniques to enable site-specific adaptation of GHI data. Our primary objective is to enhance the accuracy of STRÅNG's GHI estimates, a development with profound implications for the advancement of solar energy systems in Scandinavia. This impact is particularly pronounced within the realms of PV and agrivoltaic (APV) systems, where the accuracy and reliability of solar radiation resources are paramount. Improved GHI estimates can better support the prediction and characterization of solar energy systems' performance. In APV systems, improved GHI estimation has positive impacts both on PV yield assessment and on crop production, providing improved input for energy balances at crop level and better photosynthetically active radiation (PAR) estimation [5,25]. By enhancing the accuracy of PV and APV mathematical models through site adaptation, we equip stakeholders and decision-makers with a powerful tool to optimize solar energy utilization and land use, thereby fostering sustainability and productivity. In the context of Sweden, our study addresses several key factors that underscore its novelty and significance. Firstly, Sweden is experiencing a remarkable upsurge in the solar PV sector, with particular growth evident in utility-scale PV installations. The necessity of precise and reliable data for ensuring the bankability of solar projects has never been more critical, and our research becomes paramount in this evolving landscape. Additionally, our work builds upon a prior study by Campana et al.
[4], which identified STRÅNG as the second-best model for GHI after CM SAF SARAH-2. Recognizing the potential for improvement, our study endeavors to enhance the accuracy of the STRÅNG model, a significant step in advancing predictive capabilities in the context of Swedish solar energy. Furthermore, STRÅNG is unique in generating consistent data across the entire Swedish territory, in contrast to SARAH, positioning it as a reference product for the Nordic countries and supporting comprehensive planning and utilization of PV systems. Notably, our research also addresses a significant research gap, as there are few to no studies on site adaptation at northern latitudes, specifically within Sweden. Lastly, our study provides an efficient solution, showing that a mere 12 months of data are sufficient to effectively train a site adaptation model, catering to the increasing demand for practical and resource-efficient solutions in the solar energy sector.

Method
This section describes the methodology for improving the STRÅNG database through site adaptation and the selected machine-learning algorithms. Section 2.1 presents the data used in this study. Section 2.2 describes the machine learning models employed. Section 2.3 describes the hyperparameter optimisation used to increase the performance of the machine-learning models. Section 2.4 describes the sensitivity analysis conducted in this study. Section 2.5 describes the error metrics used to analyse the machine-learning model estimations.

Data
In the realm of solar energy, the precise measurement and prediction of GHI are crucial for optimizing energy production and site selection. Machine learning has emerged as a powerful tool, offering versatile solutions to key challenges in the sector [35]. Machine learning models can predict GHI values, enhancing resource assessment and supporting data-driven site choices. For instance, Huang et al. [19] proposed a data-driven framework for short-term solar irradiance forecasting at a target site, considering both spatial and temporal information from large-scale neighbouring sites, by utilizing several machine learning models such as boosted regression trees, ANN, and support vector machines. These models also provide accurate energy yield estimates, aiding in project financing and planning. Additionally, machine learning can offer short-term, medium-term, and long-term GHI forecasts, enabling real-time adjustments in energy production strategies [16]. Long-term forecasting is important for establishing global management of the electricity supply by estimating large-scale electricity production, while medium-term forecasts can be used for demand response, market analysis, and energy efficiency. Lastly, short-term forecasts can be used for detailed power management and provide information for developing grid stability strategies and the provision of ancillary services [22]. Furthermore, machine learning models enhance data quality assurance, ensuring the reliability of data used for site adaptation, making them invaluable for advancing solar energy technology and sustainability in engineering applications.
In Sweden, SMHI maintains several weather stations that perform solar irradiance measurements. In this work, site adaptation is conducted at three locations: Kiruna, Norrköping, and Visby, as listed in Table 1 and mapped in Fig. 1. These locations represent the extremes in terms of the availability of direct and diffuse irradiance in Sweden. On an annual basis, Kiruna has the highest amount of diffuse irradiance, while Visby has the highest amount of direct irradiance. Norrköping, on the other hand, may be representative of the Swedish territory as it exhibits neither of these extremes [4].
As for the gridded product, STRÅNG is a mesoscale model for solar radiation products. The resolution and geographical extent of STRÅNG have changed over the years. Between January 1999 and May 2006, the horizontal resolution was about 22 km × 22 km. It was then increased to 11 km × 11 km. Starting from 2017, the product is at the current resolution of 2.5 km × 2.5 km. The input and output fields produced by the model are adapted to the mesoscale analysis system MESAN at SMHI. Input data also come from numerical weather predictions from MEPS [38].
During training, validation, and testing, the weather station GHI measurements are used as the output target for the machine learning models. The Kiruna, Norrköping, and Visby weather stations measure global irradiance with Kipp & Zonen CM21 pyranometers. SMHI performs quality assurance based on routines developed for the Baseline Surface Radiation Network (BSRN), adjusted to fit the measurement programme at the Swedish stations and tuned to the climatological limits in Sweden [6]. Traditional statistical approaches are evaluated alongside the machine learning models to compare their performance. The statistical approach uses the STRÅNG data as input and the GHI measurements from the weather stations as the target. The results of the approach deployed in this study are site-specific, as per the definition of site adaptation [1,17,40,41]. The input gridded data, which have a spatial resolution of 2.5 km × 2.5 km, are used together with ground observations to train the models and thereby improve the gridded product at the specific grid point in which the site under consideration falls. Short-term local ground measurements are used to characterise the long-term solar resource; this approach is common in solar power projects and many other applications [31]. The training dataset consists of a continuous period spanning 2021 to 2022, while the testing dataset covers the time range 2017 to 2020.
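The chronological split described above can be sketched as follows, assuming an hourly record indexed by timestamp (the column names and random data here are illustrative stand-ins, not the paper's actual code or data):

```python
import numpy as np
import pandas as pd

# Hypothetical hourly record: STRÅNG estimate plus station measurement.
idx = pd.date_range("2017-01-01", "2022-12-31 23:00", freq="h")
rng = np.random.default_rng(0)
df = pd.DataFrame({"ghi_strang": rng.uniform(0, 800, len(idx)),
                   "ghi_station": rng.uniform(0, 800, len(idx))}, index=idx)

# Chronological split mirroring the paper: 2021-2022 for training,
# 2017-2020 held out for testing (no random shuffling across months).
train = df.loc["2021":"2022"]
test = df.loc["2017":"2020"]
```

Keeping the split chronological, rather than shuffling, preserves the seasonal structure that the sensitivity analysis later exploits.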

Machine-learning models for site adaptation
The machine learning models were developed using several predictors, including STRÅNG GHI, solar azimuth, solar elevation, and time information encompassing the month, day, and hour of the day. The solar elevation and azimuth angles are obtained by computing the planetary positions [36]. Additionally, MESAN has been used to obtain air temperature, pressure, relative humidity, cloud cover, and wind speed. The time components have been cyclically encoded using a sinusoidal function: months, days, and hours are represented by values that cycle between −1 and 1. This encoding enables the models to effectively capture temporal patterns. The machine learning algorithms used in this study can be classified into two categories: shallow models and ensemble models. A summary is provided in Table 2. The machine learning algorithms, implemented with libraries such as scikit-learn (sklearn) [30] and Keras [9], were trained for each specific site. To enhance model accuracy, sequential forward selection was employed to select the five most important features for each model.
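As an illustration, the cyclic encoding described above can be realised with a sine/cosine pair; the paper states only that a sinusoidal function mapping into [−1, 1] is used, so the exact transform below is an assumption:

```python
import numpy as np
import pandas as pd

def encode_cyclic(values: pd.Series, period: int) -> pd.DataFrame:
    """Map a periodic quantity (hour, day, month) onto the unit circle,
    so that e.g. hour 23 and hour 0 end up close together."""
    angle = 2 * np.pi * values / period
    return pd.DataFrame({f"{values.name}_sin": np.sin(angle),
                         f"{values.name}_cos": np.cos(angle)})

hours = pd.Series(range(24), name="hour")
enc = encode_cyclic(hours, period=24)
# Both components stay within [-1, 1], matching the description in the text.
```

A plain integer encoding would place December (12) far from January (1); the circular embedding removes that artificial discontinuity.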

Shallow models
The kNN method is commonly used for classifying objects based on the closest training examples, but it can also be used for regression [7,43]. Rather than fitting an explicit model, kNN searches the training history for the cases whose patterns are closest to the query and predicts from them. Linear regressors are commonly fitted by minimising the sum of squared errors [18]. Several extensions of linear regression, known as lasso, ridge, and elastic net, add penalty terms to limit model complexity or reduce the number of features used in the model [18].
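A minimal kNN regression sketch with scikit-learn, using synthetic data in place of the station records (all names and numbers below are illustrative):

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
# Synthetic stand-in: column 0 mimics the STRÅNG GHI estimate in W/m².
X = rng.uniform(0, 800, size=(500, 1))
y = 0.9 * X[:, 0] + rng.normal(0, 20, size=500)  # noisy "measured" GHI

# No explicit model is fitted: prediction averages the targets of the
# k training samples closest to the query point.
knn = KNeighborsRegressor(n_neighbors=5)
knn.fit(X, y)
pred = knn.predict(np.array([[400.0]]))
```

With a strongly correlated predictor like the STRÅNG GHI, the local average of the nearest neighbours already tracks the underlying relation closely.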
In some cases, we only want to reduce the error to within an acceptable range. In those cases, SVR gives the flexibility to define an acceptable error in the model and finds an appropriate line to fit the data [7]. A neural network is commonly used for predictive modelling, adaptive control, and other applications that can be trained via a dataset. ANN modelling has become popular in the last decade owing to its success in medicine, mathematics, engineering, meteorology, and many other fields [37]. ANNs can vary in depth: shallow networks have few hidden layers, while deep networks comprise numerous hidden layers. In this study, a network consisting of only one hidden layer is employed. This choice falls within the category of shallow models, indicating a simpler network architecture relative to deeper, more complex ANNs [29].
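The ε-insensitive tube of SVR can be illustrated as follows; a linear kernel and synthetic data are chosen here only to keep the sketch transparent, whereas the study's actual SVR configuration comes from the hyperparameter search:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X = rng.uniform(0, 800, size=(300, 1))
y = 0.9 * X[:, 0] + rng.normal(0, 15, size=300)

# `epsilon` defines the tolerated error tube: residuals smaller than
# epsilon contribute nothing to the loss, which is the "acceptable
# error" flexibility described in the text.
svr = SVR(kernel="linear", C=10.0, epsilon=10.0)
svr.fit(X, y)
slope = svr.coef_[0][0]  # should roughly recover the underlying 0.9
```

Only samples falling outside the ε-tube become support vectors, so widening `epsilon` trades fit tightness for sparsity.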

Ensembles
Ensemble methods combine the strengths of a set of models into one predictive model; different ensemble learners exist to decrease the variance, decrease the bias, or otherwise improve the predictions [18]. Several ensemble methods construct regression trees when predicting the outcome of a given regression problem. Bagging aims to decrease the model's variance by averaging the prediction over multiple estimates. Boosting methods decrease the model's bias by training models sequentially, each improving on the models generated before [7]. A further difference between boosting and bagging is that boosting learners are trained on a weighted version of the data. Extreme gradient boosting (XGBoost) is an improved gradient boosting technique that adds several parameters to improve the algorithm's prediction accuracy [8]. XGBoost uses Newton tree boosting to optimise the learning of tree structures, adds randomisation parameters for better learning, applies proportional shrinkage of leaf nodes, and determines the depth of the trees used as weak learners via a penalisation parameter; this discourages overly deep trees, which keeps the model from overfitting and improves performance. Categorical boosting (CatBoost) is a gradient-boosting algorithm that handles categorical features and numerical variables successfully. CatBoost takes advantage of categorical features by dealing with them during training instead of at pre-processing time. The CatBoost algorithm reduces or avoids overfitting by using a new schema for calculating leaf values when selecting tree structures [10,33]. A schematic diagram summarising the predictors, the predicted parameter, and the machine learning algorithms deployed in this study is given in Fig. 2.
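The bagging/boosting distinction can be sketched with scikit-learn on synthetic data; the study's actual ensembles are BAG, LSBoost, XGBoost, and CatBoost with tuned hyperparameters, so this is only a structural illustration:

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor, GradientBoostingRegressor

rng = np.random.default_rng(2)
X = rng.uniform(0, 800, size=(400, 1))
y = 0.9 * X[:, 0] + rng.normal(0, 20, size=400)

# Bagging: trees fitted on bootstrap resamples, predictions averaged
# (primarily a variance-reduction device).
bag = BaggingRegressor(n_estimators=50, random_state=0).fit(X, y)

# Boosting: shallow trees fitted sequentially, each correcting the
# residual error of the ensemble built so far (bias reduction).
bos = GradientBoostingRegressor(n_estimators=100, max_depth=2,
                                random_state=0).fit(X, y)
```

Bagging trains its members independently and in parallel, while boosting is inherently sequential, which is also why boosting learners see a reweighted version of the data at each round.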

Hyperparameter optimisation
Hyperparameters are parameter values in machine learning algorithms that control the learning process; they cannot be changed during training. Therefore, hyperparameters must be tuned to attain the best accuracy. The optimal hyperparameter values vary from problem to problem, as the parameters are tuned on a specific dataset [3]. One typical way to tune these parameters is to find the best combination of values experimentally; one of the most straightforward hyperparameter tuning strategies is grid search [45]. Grid search takes several given hyperparameter values and trains the model with all possible combinations of those values. While this approach is conceptually simple and exhaustive, it can quickly become computationally expensive and time-consuming, especially when dealing with a large parameter space. The expense is rooted in the exhaustive, brute-force exploration of hyperparameter combinations: the model's performance must be evaluated for every combination, leading to an exponentially growing workload as the number of parameters and their respective values increases. To mitigate these computational challenges, Bayesian optimization can be used instead for tuning the hyperparameters [42]. Bayesian optimization employs probabilistic models to predict which combinations are most likely to yield better results instead of exhaustively searching through the entire parameter space [39]. This allows a more efficient allocation of computational resources, as the search focuses on the most promising regions of the parameter space, thereby reducing the number of model evaluations required. In this study, Bayesian optimization is employed, leveraging 50 iterations to explore a broad range of hyperparameters. We conduct a 5-fold cross-validation on the training data to fine-tune these hyperparameters. As part of the sensitivity analysis, each
training set undergoes a hyperparameter optimization process. The chosen hyperparameters and the ranges used for Bayesian optimization in this study are summarised in the Appendix.
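A compact sketch of the tuning loop, shown with grid search and 5-fold CV via scikit-learn's `GridSearchCV` for self-containment; the study itself uses Bayesian optimisation with 50 iterations (e.g. via a library such as scikit-optimize), which replaces the exhaustive grid with surrogate-guided proposals:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

rng = np.random.default_rng(3)
X = rng.uniform(0, 1, size=(200, 1))
y = 0.9 * X[:, 0] + rng.normal(0, 0.05, size=200)

# Every combination in the grid is refitted 5 times (5-fold CV):
# 3 x 3 = 9 combinations -> 45 fits, which is what makes grid search
# explode as the grid grows. Bayesian optimisation instead proposes the
# next combination from a probabilistic surrogate of past scores.
param_grid = {"C": [0.1, 1.0, 10.0], "epsilon": [0.01, 0.05, 0.1]}
search = GridSearchCV(SVR(kernel="rbf"), param_grid, cv=5,
                      scoring="neg_mean_absolute_error")
search.fit(X, y)
```

The MAE-based scoring mirrors the paper's use of MAE as the primary training metric.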

Sensitivity analysis
A sensitivity analysis is conducted to examine the impact of the training period on each machine learning model. The training dataset spanning 2021 to 2022 is used to create four training datasets of varying lengths: 6 months, 12 months, 18 months, and 24 months. These datasets are then employed to determine the minimum period necessary for effective training. To evaluate the estimation accuracy for different years, a testing dataset comprising multiple years is used. The estimation accuracy varies across these years, and thus each year's accuracy is analysed individually by separating the estimations for each specific period. It is worth emphasising that these datasets are not random selections across different months. Instead, they offer a realistic representation of how data are collected in practice, enabling the capture of seasonal patterns and trends throughout the specified time frames. For a deeper understanding of the correlations amongst the different features, please refer to Fig. 3. This visual representation offers insight into how the various features relate to the GHI measurements. Notably, the STRÅNG GHI consistently exhibited the highest correlation at all locations, surpassing the 90 % threshold. Following closely, solar elevation demonstrated the second-highest correlation, consistently exceeding 86 % across all locations.
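The screening behind Fig. 3 amounts to computing, for each candidate feature, its Pearson correlation with the measured GHI. A toy version on synthetic data (the coefficients and column names below are invented purely for illustration):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
n = 1000
ghi_measured = rng.uniform(0, 800, n)  # stand-in for the station GHI
df = pd.DataFrame({
    "ghi_strang": ghi_measured + rng.normal(0, 40, n),        # strongly related
    "temperature": 0.02 * ghi_measured + rng.normal(10, 5, n),  # weakly related
    "wind_speed": rng.uniform(0, 15, n),                        # unrelated
    "ghi_measured": ghi_measured,
})

# Pearson correlation of every feature with the measured GHI.
corr = df.corr()["ghi_measured"].drop("ghi_measured")
```

Ranking features this way is a quick sanity check that complements the sequential forward selection used for the actual models.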
Furthermore, it is intriguing to note that both temperature and relative humidity displayed generally high correlations. However, a notable contrast emerged in Visby compared to the other locations, Kiruna and Norrköping: in Visby, relative humidity exhibited a significantly lower correlation, nearly half of what was observed at the other locations. It is also essential to emphasise that the correlations between these features remained stable across the various training and test datasets. This stability underscores the robustness of the observed correlations. For a comprehensive overview of all feature correlations, please refer to the Appendix.
To enhance the analysis further, all machine learning models are combined to create an ensemble model using probabilistic machine learning techniques. Three probabilistic models, namely quantile regression (QR), quantile regression forest (QRF), and quantile regression neural network (QRNN), are employed to estimate the GHI for each specific site. However, training the new probabilistic models requires additional data. In this case, the 12-month training set is first used to estimate the GHI with the deterministic machine learning models. Subsequently, an additional 12 months of data are incorporated to train the probabilistic models, enabling them to estimate the GHI from the machine learning ensembles.
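The combination step can be sketched with quantile-loss gradient boosting standing in for QR/QRF/QRNN; the member predictions below are synthetic, whereas in the study they come from the deterministic models evaluated on the extra 12-month period:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(5)
truth = rng.uniform(0, 800, 600)  # stand-in for the measured GHI
# Columns play the role of GHI estimates from the deterministic models
# (SVR, kNN, XGBoost, ...), each with a different error level.
member_preds = np.column_stack([truth + rng.normal(0, s, 600)
                                for s in (20, 30, 40)])

# One quantile model per level: minimising the pinball loss at level q
# yields an estimate of the q-th conditional quantile of the GHI.
quantile_models = {
    q: GradientBoostingRegressor(loss="quantile", alpha=q,
                                 n_estimators=100, random_state=0
                                 ).fit(member_preds, truth)
    for q in (0.05, 0.5, 0.95)
}
lo = quantile_models[0.05].predict(member_preds)
hi = quantile_models[0.95].predict(member_preds)
coverage = np.mean((truth >= lo) & (truth <= hi))  # ideally close to 0.90
```

Repeating this over a grid of levels yields the full predictive distribution that the CRPS later evaluates.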

Performance analysis and metrics
The error metrics used in this study to evaluate the performance of the machine learning algorithms for site adaptation are the mean absolute error (MAE), normalised mean absolute error (nMAE), and mean bias error (MBE), with the following formulae:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|, \qquad \mathrm{nMAE} = \frac{\mathrm{MAE}}{\overline{y}}, \qquad \mathrm{MBE} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right),$$

where $\hat{y}_i$ is the estimated GHI, $y_i$ is the measured GHI, and $\overline{y}$ is the mean of the measurements. The MAE, nMAE, and MBE are zero if the estimated values are 100 % accurate. During the training process, MAE was employed as the primary error metric. MAE is widely used in machine learning and statistical modelling and serves as a valuable tool for assessing the performance of various models. Overall, the choice of MAE as the training error metric was based on its ability to provide a robust evaluation of the model's performance, especially when dealing with real-world data that may contain noise or outliers, or in situations where overestimation and underestimation must be treated with equal importance. The machine learning site adaptation techniques do not fully quantify the necessary uncertainty. In addition, the number of machine learning models at each site is sufficient for combining them into a probabilistic model. Therefore, three standard probabilistic models are used to combine the training outputs obtained from the machine learning models. The probabilistic models are evaluated at 96 quantiles ranging from 0.05 to 0.95, using the continuous ranked probability score (CRPS), which assesses the performance of probabilistic forecasts against observed values. The CRPS is given by the following relationship:

$$\mathrm{CRPS} = \frac{1}{n}\sum_{i=1}^{n}\int_{-\infty}^{\infty}\left(F_{\hat{y}_i}(x) - \mathbb{1}\left(x \geq y_i\right)\right)^2 \mathrm{d}x,$$

where $n$ is the number of data points, $F_{\hat{y}_i}$ is the cumulative predictive distribution of the forecast $\hat{y}_i$, and $\mathbb{1}$ is the Heaviside step function shifted to the observation $y_i$.
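The deterministic metrics, plus an empirical-ensemble form of the CRPS, can be implemented directly; the nMAE normalisation by the mean observed GHI is one common convention and is assumed here:

```python
import numpy as np

def mae(y_hat, y):
    return np.mean(np.abs(y_hat - y))

def nmae(y_hat, y):
    # Normalised by the mean observed value (assumed convention).
    return mae(y_hat, y) / np.mean(y)

def mbe(y_hat, y):
    # Positive -> overestimation on average; negative -> underestimation.
    return np.mean(y_hat - y)

def crps_ensemble(samples, y):
    # Empirical CRPS for one observation: E|X - y| - 0.5 * E|X - X'|,
    # equivalent to the integral form when F is the ensemble's
    # empirical CDF.
    x = np.asarray(samples, dtype=float)
    return np.mean(np.abs(x - y)) - 0.5 * np.mean(np.abs(x[:, None] - x[None, :]))

y = np.array([100.0, 200.0, 300.0])      # measured GHI in W/m²
y_hat = np.array([110.0, 190.0, 300.0])  # site-adapted estimate
# mae -> 20/3 ≈ 6.67 W/m², while mbe -> 0 because the over- and
# under-estimations cancel, illustrating why MBE alone cannot
# characterise accuracy.
```

For a point forecast (an ensemble collapsed to one value), the CRPS reduces to the absolute error, which makes it directly comparable with the MAE of the deterministic models.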

Results and discussions
This section provides the site adaptation results of the machine-learning models for Kiruna, Norrköping, and Visby. First, the machine learning models are benchmarked against STRÅNG, linear adaptation, and quantile mapping; the latter two are included to test whether the machine learning models can outperform common site adaptation techniques. Secondly, the models are verified by computing the metrics of the probabilistic models. Estimation accuracy at high latitudes is a known problem with gridded products. Today, the STRÅNG model is likely the most accurate high-latitude gridded GHI database. Despite this, a substantial increase in estimation bias can be noticed at locations in the northern part of Sweden. In Figs. 4 and 5, the MBE and MAE of the respective machine learning models are presented and compared with STRÅNG's estimation accuracy. In Kiruna, STRÅNG had a median MBE of −4.15 W/m², compared to 6.39 W/m² and 0.39 W/m² for Norrköping and Visby, respectively.
The machine learning models successfully reduced the MBE of the STRÅNG model, except at the site in Visby. While the STRÅNG model exhibited a low bias in Visby, the machine learning models had a negative bias. Despite this negative bias, the machine learning models significantly improved the estimations in terms of MAE compared to the STRÅNG model: the MAE was reduced from 28.68 W/m² to 20.57 W/m² using SVR with 12 months of training in Visby. Furthermore, there was a noticeable reduction in MBE and MAE for the machine learning models as the training period increased. Beyond 12 months of training, the rate of improvement became less pronounced, although it still yielded noticeable enhancements compared with the initial 6-month period. Extending the training data to 18 or 24 months did not yield significant additional improvements. This highlights that a training duration longer than 6 months is necessary for meaningful progress, while the benefits of training for more than 12 months appear to diminish.
Overall, the machine learning models improved the solar irradiance estimations of STRÅNG more than the traditional statistical approaches did. The year-to-year variability of solar radiation from STRÅNG at the different sites shows that both the best machine learning model and the training dataset used for site adaptation will vary at any given location. The accuracy of the machine learning models varied depending on location, and no single model outperformed the rest at the sites studied. This observation is further highlighted in Fig. 6, where the nMAE for all machine learning models across the entire test dataset is depicted. It is noteworthy that the ANN exhibited suboptimal performance compared to the other models. This can be attributed to several primary factors. First, the limited volume of data provided a less robust foundation for the ANN to learn from. Additionally, the use of a shallow neural network architecture restricted its capacity to capture complex relationships within the data. Furthermore, the choice of hyperparameters, such as the learning rate, batch size, and number of epochs, can significantly affect the performance of neural networks, and it is possible that further fine-tuning of these hyperparameters could have led to improved results for the ANN. Therefore, a combination of addressing data limitations, optimizing hyperparameters, and enhancing feature selection could lead to improved performance for the ANN.
More complex machine learning models increase the bias despite being more computationally expensive. The SVR model produced promising results at all locations, but it also required a longer computational time: the time complexity of SVR is O(N³) [20], so large training datasets are problematic for this model. On the other hand, shallow machine learning models such as kNN are easy to implement as a site adaptation technique and can improve gridded solar products without being computationally expensive. Overall, several of the chosen machine learning models achieved a significant reduction in bias compared to the STRÅNG model, which matters for decision-makers because it reduces the risk in projects and financial investments. However, a universal machine learning model for site adaptation does not exist, which complicates the choice of model. For that reason, three probabilistic models combine the outputs of the previously developed machine learning models. The probabilistic models are used, first, to reduce the uncertainty of the machine learning models and, second, to quantify the reliability and sharpness of the developed model.
In Fig. 7, the monthly average QRNN predictions at each site are presented as a time series. In the probabilistic models, the prediction interval coverage is sharp owing to the high similarity between the machine learning models used as inputs. However, the developed model struggles to estimate the GHI accurately at specific timesteps under cloudy conditions. In this study, the number of inputs used for the machine learning models was limited to avoid adding more noise to the dataset. Because the STRÅNG data already carry a bias and do not fully exploit the information required to estimate the GHI at a location without bias, adding more input parameters would likely introduce additional biases and reduce the estimation accuracy. The estimation accuracy varies significantly during the year, as noted in the cross-validation; consequently, where STRÅNG estimates the GHI poorly, similar errors are observed for the machine learning models. It is therefore essential to reduce the variance in the model's GHI estimations, and one solution would be to develop monthly trained models instead. The drawback of monthly trained models is the need for many models, since a single model cannot effectively cover an entire year. In future work, more input parameters from the site can be incorporated to enhance the training of the models; for instance, it could be beneficial to include elevation data, including that of surrounding regions, as well as the albedo of nearby areas. These additional factors would likely contribute to a more comprehensive and accurate model training process. In addition, different loss functions to reduce the estimation bias could be tested, such as minimizing the CRPS of the machine learning models.
The performance of the probabilistic models in terms of prediction interval coverage is particularly noteworthy, highlighting the high similarity between the machine learning models. Table 3 provides an overview of the GHI measurement and estimation errors, consistently demonstrating a CRPS below 20 W/m² for all three probabilistic models at each of the specified sites. Amongst the models, the QRNN showed the best performance in terms of CRPS, outperforming the other models at all locations. However, when considering MAE and MBE, the QRNN exhibited a relatively larger positive bias and did not substantially reduce the MAE compared to the QR and QRF models. These results show that the machine learning models can significantly reduce the bias at high-latitude locations, which is needed for accurate estimations.
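The CRPS used above can be approximated from predicted quantiles as twice the pinball loss averaged over a grid of quantile levels, a common discretization. This minimal sketch uses a synthetic, well-calibrated Gaussian forecast as a stand-in for the probabilistic model outputs; the distribution parameters are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def pinball_loss(y_true, y_pred, q):
    """Pinball (quantile) loss for a single quantile level q."""
    diff = y_true - y_pred
    return np.mean(np.maximum(q * diff, (q - 1) * diff))

def crps_from_quantiles(y_true, quantile_preds):
    """Approximate CRPS as twice the average pinball loss over a
    grid of quantile levels."""
    losses = [pinball_loss(y_true, pred, q)
              for q, pred in quantile_preds.items()]
    return 2.0 * np.mean(losses)

rng = np.random.default_rng(2)
y = rng.normal(400, 50, 500)           # synthetic GHI measurements (W/m^2)
levels = np.linspace(0.05, 0.95, 19)

# A calibrated synthetic forecast: the true quantiles of N(400, 50).
qpreds = {q: np.full_like(y, norm.ppf(q, loc=400, scale=50))
          for q in levels}

crps = crps_from_quantiles(y, qpreds)
print(f"Approximate CRPS: {crps:.1f} W/m^2")
```

Because the pinball loss is small near the tails, truncating the grid at 0.05 and 0.95 slightly underestimates the exact integral; a denser or wider grid tightens the approximation.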

Conclusion
Gridded irradiance products struggle to estimate the global horizontal irradiance (GHI) at high-latitude locations. In the Nordic countries, the gridded irradiance product STRÅNG is one of the most accurate models. However, this model still has a large bias at high-latitude locations, and there remains high uncertainty in its GHI estimations. This study used site adaptation with machine learning models to improve the estimations at three locations in Sweden. These models successfully reduced the bias, especially at the higher-latitude locations; hence, machine learning as a site adaptation technique can substantially improve the quality of the STRÅNG model and the accuracy of its GHI estimates. In Visby, the SVR technique reduced the MAE by 8 W/m². Furthermore, a probabilistic ensemble model that combines all deterministic models achieved a CRPS below 20 W/m² at all locations. In Kiruna, the probabilistic ensemble model QR achieved a 9.2% reduction in nMAE compared to STRÅNG. This indicates a high level of agreement between the model predictions and the measurements, thereby bolstering the reliability of the estimates. However, the inter-annual variability and spatial inhomogeneity of solar radiation affect the quality of STRÅNG, which in turn affects the estimation accuracy of the machine learning models.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

• Introduces a novel approach for site adaptation of solar irradiance based on machine learning techniques, departing from conventional statistical methods.
• Machine learning models can substantially improve the STRÅNG model's quality.
• Enhances the reliability of GHI estimates by employing a probabilistic ensemble model that combines deterministic models.

Fig. 1. Geographical locations of the three Swedish Meteorological and Hydrological Institute weather stations used in this study.

Fig. 3. Correlation heatmap showing the relationships between input and output features for each training dataset and testing dataset.

Fig. A.2. Correlation heatmap showing the relationships between input and output features in Norrköping for each training dataset.

Fig. A.3. Correlation heatmap showing the relationships between input and output features in Visby for each training dataset.

Table 1
Swedish Meteorological and Hydrological Institute weather stations under consideration.

Table 2
Machine learning methods.
Fig. 2. Diagram for the site adaptation models.