Solar Panel Tilt Angle Optimization Using Machine Learning Model: A Case Study of Daegu City, South Korea

Finding the optimal panel tilt angle of a photovoltaic system is important because it determines how efficiently the received sunlight is converted into energy. A number of studies have used various research methods to find the tilt angle that maximizes the amount of radiation received by the solar panel. However, recent studies have found that conversion efficiency is not solely dependent on the amount of radiation received. In this study, we propose a solar panel tilt angle optimization model using machine learning algorithms. Rather than trying to maximize the received radiation, the objective is to find the tilt angle that maximizes the converted energy of photovoltaic (PV) systems. Considering various factors such as weather, dust level, and aerosol level, five forecasting models were constructed using linear regression (LR), least absolute shrinkage and selection operator (LASSO), random forest (RF), support vector machine (SVM), and gradient boosting (GB). Using the best forecasting model, our model showed an increase in PV output compared with optimal angle models.


Introduction
Recently, research on and use of photovoltaic power generation have been increasing worldwide. With issues such as the depletion of natural resources and environmental pollution, securing sustainable green energy and using it more effectively has become important. In particular, photovoltaic power generation has attracted a great deal of attention because it uses a semi-permanent energy source, sunlight, but efficient development has been limited by factors such as location, climate, and installation type.
There have been numerous efforts to implement photovoltaic systems in South Korea. The country has relied heavily on imported fossil fuels as its source of energy, and its energy-consumption rate is among the world's top 10 [1]. Because of the negative effects of fossil fuels on the environment, the Korean government plans to build rural-area photovoltaic (PV) systems. Following this trend, Korean researchers have conducted numerous studies on PV systems, including topics such as regional differences in the optimal orientation of PV systems and the optimal PV model under residential conditions to minimize cost [2,3]. Solar energy is converted into electricity using PV technology, which receives solar irradiance on its panel as a source of energy. Roman [4] noted that how much electricity a solar system produces depends on how much sunshine it receives. Therefore, the more sunlight a PV panel collects, the more energy it produces. Accordingly, previous studies have focused on estimating solar radiation and the optimal tilt angle of the solar panel to maximize the amount of solar irradiation. Jamil et al. [5] estimated the availability of solar radiation for south-facing flat surfaces in the humid subtropical climatic region of India, and estimated monthly, seasonal, and annual optimum tilt angles. Benghanem [6] analyzed the optimal choice of tilt angle for collecting the maximum solar irradiation in Madinah, Saudi Arabia. Wei [7] constructed forecasting models to estimate surface solar radiation on an hourly basis and the solar irradiance received by solar panels at different tilt angles, to enhance the capability of PV systems in Tainan City, Taiwan.
Nevertheless, the amount of sunlight reaching a PV panel is not the sole factor that determines the maximum power generated. Although not as impactful as solar radiation, factors such as elevation, humidity, and weather conditions were found to be other important variables in determining solar power generation [8]. Dinçer and Meral [9] found that factors such as cell temperature, maximum power point tracking (MPPT), and energy conversion efficiency affect solar cell efficiency. Since PV modules differ in solar cell structure, materials, and technology, it is difficult to expect a uniform spectral response when an equal amount of solar radiation is given.
As such, finding the tilt angle at which a solar panel receives maximum sunlight does not guarantee that the PV module exploits it fully. Martin and Ruiz [10] analyzed the angular losses of incident radiation and the effect of surface soiling. They calculated the optical losses under given field conditions relative to normal incidence on a clean surface, the situation for which the electrical characteristics of a PV module are specified. They found that dust influenced the angular loss meaningfully. This finding suggests that the angle at which maximum sunlight reaches the PV module is not necessarily the optimal angle; the optimum depends on a complex interplay of a wide variety of factors. Therefore, the objective of this study is to construct a forecasting model to estimate solar power generation and to derive, through simulation, an angle that maximizes it, considering various conditions such as weather, dust level, and aerosol level. PV data from 22 solar power plants in Daegu city, South Korea, weather data ranging from January 2016 to March 2018, and sun location data were used as input variables. The rest of this paper is organized as follows: Section 2 describes the study site and data. Section 3 introduces the proposed methodology of PV panel optimization based on the machine learning algorithm. Section 4 evaluates the result of the proposed model and compares the predicted solar power based on the optimized panel angle against the original angle. Finally, Section 5 presents the conclusions of this study.

Study Site and Data
The study site is in Daegu city, South Korea. The collected data are from 22 PV modules out of 246 present in Daegu city.

Solar Power Generation Data Set
A total of 173,568 records of solar power generation data were acquired from 22 PV modules. The collection period ranges from January 2016 to March 2018. The data include relevant features such as module capacity, installation location, module azimuth angle, and panel angle. The panel angles were all fixed, as shown in Table 1.

Meteorological Data
The meteorological data of Daegu Metropolitan City were collected through the Meteorological Agency's Open Weather Portal. The meteorological office operates a single meteorological observatory in the city and collects hourly data such as temperature, precipitation, wind speed, humidity, and sunshine. Synoptic meteorological observations are ground observations performed at the same fixed time at all observatories in order to determine the weather on the synoptic scale. The synoptic scale refers to the spatial size and lifetime of the high- and low-pressure systems shown on a weather map. The attributes of the collected dataset are shown in Table 2. The mass concentration of aerosols, the microdust (µg/m³), was collected using a dust monitor (PM10) placed in Daegu Metropolitan City. The dust monitor is a device that continuously measures the concentration of particles with a diameter of 10 µm or less among the aerosols floating in the atmosphere.
In addition, aerosol data were collected (Table 3). Aerosols are solid or liquid particles floating in the air, usually about 0.001-100 µm in size. They are caused by natural factors such as dust, ash, and sea salt, as well as by artificial factors such as emissions from urban and industrial facilities, incineration, and automobiles. They affect the climate by floating in the atmosphere and blocking or absorbing solar radiation on its way to the surface, or by changing cloud formation and physical properties. The Meteorological Agency observes the aerosol concentration by particle size from 0.5 to 20 µm at the Anmyeon Island Climate Change Monitoring Center as part of the World Meteorological Organization's Global Atmosphere Watch (GAW) program.

Sun Position Data
The hourly solar position for Daegu City during the 2016-2018 period was calculated using theoretical equations. The declination angle, the hour angle, the zenith angle, the elevation angle, and the azimuth angle were the solar position variables used in this study. In addition, the ratios of beam radiation and diffuse radiation on a tilted surface were calculated.
The declination angle, denoted by δ, varies seasonally because of the tilt of the Earth's rotation axis and the Earth's revolution around the sun. It is calculated from n_d, the day of the year. The hour angle, denoted by ω, is the angular displacement of the sun from east to west on the celestial sphere. The sun's apparent position changes by 15° per hour because the Earth takes 24 hours to complete one full rotation on its axis. The hour angle is calculated from H, the time in 24-hour format. The zenith angle, denoted by θ, is the angle between the direction of the sun and the point directly overhead at the measuring location. Its equation involves λ, the latitude of the measuring location. The elevation angle, denoted by α, is the angle between the direction of the sun and the horizontal plane at the observation point. The azimuth angle, denoted by ξ, is the angle between the horizontal projection of the sun's direction and a reference direction on the horizon.
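The equations themselves did not survive in this text, so the following is only a sketch that uses common textbook forms consistent with the definitions above: Cooper's approximation for the declination, 15° per hour measured from solar noon for the hour angle, and the standard spherical relation for the zenith angle. These forms, the function names, the use of solar time for H, and the approximate Daegu latitude of 35.87° N are assumptions for illustration and may differ from the expressions actually used in the study. The azimuth angle is omitted because several conventions exist.

```python
import math

DAEGU_LATITUDE_DEG = 35.87  # approximate latitude, assumed for illustration


def declination_deg(day_of_year):
    """Declination (degrees) from the day of the year, Cooper's approximation."""
    return 23.45 * math.sin(math.radians(360.0 * (284 + day_of_year) / 365.0))


def hour_angle_deg(hour):
    """Hour angle (degrees): 0 at solar noon, 15 degrees per hour (assumes solar time)."""
    return 15.0 * (hour - 12.0)


def zenith_deg(latitude_deg, day_of_year, hour):
    """Zenith angle (degrees) from latitude, declination, and hour angle."""
    lat = math.radians(latitude_deg)
    dec = math.radians(declination_deg(day_of_year))
    omega = math.radians(hour_angle_deg(hour))
    cos_zenith = (math.sin(lat) * math.sin(dec)
                  + math.cos(lat) * math.cos(dec) * math.cos(omega))
    # Guard against rounding slightly outside [-1, 1] before acos.
    cos_zenith = max(-1.0, min(1.0, cos_zenith))
    return math.degrees(math.acos(cos_zenith))


def elevation_deg(latitude_deg, day_of_year, hour):
    """Elevation angle (degrees): the complement of the zenith angle."""
    return 90.0 - zenith_deg(latitude_deg, day_of_year, hour)
```

With these forms, the declination is zero on day 81, so the noon zenith angle on that day equals the latitude.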
The ratio of the average daily beam radiation on a tilted surface (R_b) and the ratio of the average daily diffuse radiation on a tilted surface (R_d) were calculated using the equations proposed by Liu and Jordan [11]. The equation for R_b depends on the geographic location of the observation point. Since the observation point of this study is located in the northern hemisphere, we used the corresponding equation, in which φ is the latitude, β is the solar panel's tilt angle, and ω_ss is the sunset hour angle. Lastly, the ratio of the average daily diffuse radiation on a tilted surface (R_d) was calculated.

Methodology

Procedures
The procedure of this study is shown in Figure 1.
Energies 2020, 13, 529

Data Collections
As mentioned in the previous section, the PV module data, the meteorological data, and the sun position data were required for this study. Because the collection period of each PV module differed within the range of January 2016 through March 2018, the meteorological data and sun position data were collected for the whole period.

Data Preprocessing
All collected data were recorded on an hourly basis. As our proposed model predicts each PV module's monthly and annual output, the collected data were aggregated accordingly to match the unit. Additionally, for every PV site, we calculated the ratio of average daily beam radiation (R_b) and the ratio of average daily diffuse radiation (R_d) for all possible panel tilt angles between 0 and 90 degrees using the equations stated in the previous section. Data were originally collected from 69 PV sites in Daegu city, but we chose only 22 of them because the others had missing data.
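The tilt sweep described above can be sketched as follows. The expressions are the commonly cited Liu and Jordan forms for an equator-facing surface in the northern hemisphere and the isotropic diffuse factor (1 + cos β)/2. They are assumed here because the original equations are not reproduced in this text, and the function names, the latitude value, and the sample declination are hypothetical.

```python
import math


def sunset_hour_angle_deg(latitude_deg, declination_deg):
    """Sunset hour angle (degrees) on a horizontal surface at the given latitude."""
    x = -math.tan(math.radians(latitude_deg)) * math.tan(math.radians(declination_deg))
    return math.degrees(math.acos(max(-1.0, min(1.0, x))))


def beam_ratio(latitude_deg, tilt_deg, declination_deg):
    """R_b: daily beam radiation on the tilted surface relative to the horizontal."""
    w_s = sunset_hour_angle_deg(latitude_deg, declination_deg)
    # The tilted surface acts like a horizontal one at latitude (phi - beta).
    w_s_tilt = min(w_s, sunset_hour_angle_deg(latitude_deg - tilt_deg, declination_deg))
    phi = math.radians(latitude_deg)
    eff = math.radians(latitude_deg - tilt_deg)
    dec = math.radians(declination_deg)
    numerator = (math.cos(eff) * math.cos(dec) * math.sin(math.radians(w_s_tilt))
                 + math.radians(w_s_tilt) * math.sin(eff) * math.sin(dec))
    denominator = (math.cos(phi) * math.cos(dec) * math.sin(math.radians(w_s))
                   + math.radians(w_s) * math.sin(phi) * math.sin(dec))
    return numerator / denominator


def diffuse_ratio(tilt_deg):
    """R_d: isotropic-sky diffuse factor for a surface tilted by tilt_deg."""
    return (1.0 + math.cos(math.radians(tilt_deg))) / 2.0


# Sweep every integer tilt angle from 0 to 90 degrees for one sample winter day.
LATITUDE_DEG = 35.87       # approximate Daegu latitude (assumption)
WINTER_DECLINATION = -23.0  # sample declination in degrees (assumption)
ratios = {tilt: (beam_ratio(LATITUDE_DEG, tilt, WINTER_DECLINATION), diffuse_ratio(tilt))
          for tilt in range(0, 91)}
```

A horizontal panel (tilt 0) gives R_b = 1 and R_d = 1 by construction, which is a quick sanity check on the formulas.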

Correlation Analysis
In the data preprocessing stage, we performed a correlation analysis on the 22 PV sites, calculating the correlation between the input features and PV output to select relevant features for our forecasting model. Of the 31 available features, 14 were selected, as shown in Table 4.
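A minimal sketch of correlation-based feature screening is given below. The text does not state which correlation coefficient or cut-off was used, so the Pearson coefficient, the threshold of 0.3, the feature names, and the toy values are all assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if std_x == 0.0 or std_y == 0.0:
        return 0.0
    return cov / (std_x * std_y)


def select_features(features, target, threshold=0.3):
    """Keep features whose absolute correlation with the target meets the threshold."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    return {name: r for name, r in scores.items() if abs(r) >= threshold}


# Toy data: hypothetical feature names and values, not taken from the study.
pv_output = [10.0, 20.0, 30.0, 40.0, 50.0]
features = {
    "sunshine_hours": [1.0, 2.0, 3.0, 4.0, 5.0],
    "humidity": [80.0, 70.0, 65.0, 50.0, 40.0],
    "station_id": [7.0, 7.0, 7.0, 7.0, 7.0],
}
selected = select_features(features, pv_output)
```

In this toy example the constant `station_id` column is dropped, while the positively and negatively correlated features are both kept because the screen uses the absolute value.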

Modeling
In machine learning, predictive methods serve different objectives depending on the type of prediction problem at hand. Since our objective is to construct a model that learns from the data to predict the PV output, which is a continuous variable, regression learners were considered as candidate predictive methods.
In this work, gradient boosting was used as the base algorithm of our model. The gradient boosting machine is an ensemble method that constructs each new base learner to be maximally correlated with the negative gradient of the loss function associated with the whole ensemble [12]. Ensemble methods often improve predictive performance because of their generalization power and computational advantage [13]. More specifically, the gradient boosting machine constructs a sequence of regression trees in which each tree predicts the residual of the preceding trees, and it aggregates the predictions additively to minimize the loss [14]. Gradient boosting has proven very successful in experimental comparisons of learning algorithms [15,16], and it has also been applied successfully in industrial applications [17,18]. In terms of optimization, the gradient boosting algorithm has relatively few parameters to tune.
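The residual-fitting idea can be illustrated with a deliberately small sketch: one-split regression trees (stumps) on a single feature, squared-error loss, and a fixed learning rate. This is not the model used in the study; the function names, the toy data, and the hyperparameter values are assumptions chosen only to show how each new tree is fit to the residuals of the current ensemble.

```python
import math


def fit_stump(xs, residuals):
    """Fit a one-split regression tree to the residuals (least squares)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keep both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_gradient_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Additive model: start from the mean, then repeatedly fit stumps to residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    predictions = [base] * len(ys)
    for _ in range(n_rounds):
        # For squared error, the negative gradient is the residual y - prediction.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x) for x, p in zip(xs, predictions)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


def rmse(actual, predicted):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Toy data: a hypothetical output curve that peaks near a tilt of 30 degrees.
tilts = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
outputs = [50.0 - (t - 30) ** 2 / 50.0 for t in tilts]
model_1 = fit_gradient_boosting(tilts, outputs, n_rounds=1)
model_50 = fit_gradient_boosting(tilts, outputs, n_rounds=50)
rmse_1 = rmse(outputs, [model_1(t) for t in tilts])
rmse_50 = rmse(outputs, [model_50(t) for t in tilts])
```

On the training data, the error after 50 rounds is lower than after one round, which reflects the additive loss minimization described above.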
To verify that the gradient boosting algorithm is a good fit for our study, we compared the predictive performance of the different algorithms. For the comparison, we randomly selected one of the 22 PV modules (S07-04) and trained each model on the subset covering January 2016-December 2017. The trained models were then validated using the remaining portion of the PV dataset (January 2018-March 2018). The root-mean-square error (RMSE), which measures the difference between the predicted output and the actual output, was calculated for each model, as shown in Table 5. The gradient boosting (GB) model showed the lowest RMSE (train: 2.5152, test: 5.5122). Thus, we chose the gradient boosting model as our simulation model after tuning it using a grid-search algorithm.
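The comparison protocol (train on 2016-2017, validate on January-March 2018, keep the candidate with the lowest test RMSE) can be sketched as follows. The monthly records, the `radiation` feature, and the two simple candidates are hypothetical stand-ins for the PV dataset and the five algorithms compared in the study.

```python
import math
from datetime import date


def rmse(actual, predicted):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def chronological_split(records, cutoff):
    """Records before the cutoff date go to training, the rest to validation."""
    train = [r for r in records if r["date"] < cutoff]
    test = [r for r in records if r["date"] >= cutoff]
    return train, test


def fit_mean(train):
    """Baseline candidate: always predict the training mean."""
    mean = sum(r["output"] for r in train) / len(train)
    return lambda r: mean


def fit_line(train):
    """Candidate: ordinary least squares on a single feature."""
    xs = [r["radiation"] for r in train]
    ys = [r["output"] for r in train]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda r: intercept + slope * r["radiation"]


# Toy monthly records from January 2016 through March 2018 (27 months).
records = []
for i in range(27):
    year, month = 2016 + i // 12, 1 + i % 12
    radiation = 16.0 - abs(month - 6)  # hypothetical seasonal pattern
    records.append({"date": date(year, month, 1),
                    "radiation": radiation,
                    "output": 2.0 * radiation + 1.0})

train, test = chronological_split(records, cutoff=date(2018, 1, 1))
candidates = {"mean": fit_mean(train), "linear": fit_line(train)}
actual = [r["output"] for r in test]
scores = {name: rmse(actual, [model(r) for r in test]) for name, model in candidates.items()}
best_name = min(scores, key=scores.get)
```

Splitting by date rather than at random keeps the validation period strictly after the training period, which matches the protocol described in the text.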

Model Simulation for Monthly/Annual PV Optimal Tilt Angle
For every PV module, monthly optimal tilt angles were derived by simulation with our trained model. We defined the optimal tilt angle as the angle that maximizes the PV output. The simulation period was January 2017-December 2017, and the simulated angles ranged from 0 to 90 degrees. Among the simulated angles, the angle that produced the highest PV output was recorded as the monthly optimal tilt angle, and the corresponding PV output was recorded as well. Similarly, a simulation of 2017 as a whole was run for the annual optimal tilt angle, and the single angle that produced the highest PV output for the entire year was recorded as the optimum.
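The search itself amounts to evaluating the trained model at every candidate angle and keeping the best one. The sketch below assumes integer angles and uses a made-up quadratic `toy_predict` in place of the trained forecasting model; the two monthly conditions and their parameters are hypothetical.

```python
def optimal_tilt(predict_output, conditions, angles=range(0, 91)):
    """Return (angle, output) for the angle with the highest predicted output."""
    best_angle = max(angles, key=lambda a: predict_output(a, conditions))
    return best_angle, predict_output(best_angle, conditions)


def toy_predict(tilt_deg, conditions):
    """Stand-in for the trained model: output peaks at conditions['peak']."""
    return conditions["scale"] * (100.0 - 0.05 * (tilt_deg - conditions["peak"]) ** 2)


# Hypothetical conditions for two months of one module.
monthly_conditions = {
    1: {"scale": 0.6, "peak": 55},  # winter: low sun, steep optimum
    7: {"scale": 1.0, "peak": 12},  # summer: high sun, shallow optimum
}

# Monthly optimum: one angle per month.
monthly_optimum = {month: optimal_tilt(toy_predict, cond)
                   for month, cond in monthly_conditions.items()}


def annual_output(tilt_deg):
    """Total output over all months when one fixed angle is used all year."""
    return sum(toy_predict(tilt_deg, cond) for cond in monthly_conditions.values())


# Annual optimum: the single angle that maximizes the summed output.
annual_angle = max(range(0, 91), key=annual_output)
monthly_total = sum(output for _, output in monthly_optimum.values())
```

Because the monthly case chooses the best angle for each month separately, its total can never fall below the total at the single annual angle.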

Results
The estimation result for the 2017 PV outputs is shown in Table 6. The estimated PV output is the annual PV output predicted by our forecasting model, with the original panel angles applied. The trained model successfully simulated the annual PV output when given the same parameters as the original conditions.
The simulation result for the 2017 PV outputs is shown in Table 7. Here, our trained model simulates each PV module's annual output by applying (1) the computed yearly optimal angle and (2) the computed monthly optimal angles. The comparison was made against the model's estimated PV output shown in the previous result. The yearly optimal angles of the PV modules were 1-29°. Most of the modules had a small increase in PV output at the yearly optimum angle. The S06-03 module showed the smallest improvement (0.03%), while the S07-04 module showed the largest (4.02%). In terms of angular difference, the S06-03 module required the smallest angular change and the S07-04 module required the largest. Similarly, for the other modules, the rate of improvement and the amount of angular change were positively correlated. This pattern partially indicates how efficient the currently applied angles are for the PV modules.
The difference in PV output was even larger when the angles were adjusted monthly using the monthly optimum angles. For every PV module, the PV output in the monthly adjusted case was better than in the yearly adjusted case. The S13-02 module showed the smallest improvement (1.89%) and the S07-05 module showed the largest (6.32%). Although it is costly, the result suggests that adjusting the panel angle monthly is advisable when high efficiency is expected. Samples of monthly optimum angles and outputs are shown in Appendix A.
As shown in Table 8, when all other conditions are the same and only the angle of the PV panel is adjusted as suggested by our model, we could expect a total increase of 0.83% (22,452 kWh) in overall PV output with the yearly optimum angle, and of 3.32% (91,662 kWh) with the monthly optimum angles. To gain a realistic insight into these results, we used the levelized cost of electricity (LCOE) as the conversion value for solar energy [19]. In Korea, the LCOE for 100 kW facilities was 147.1 Korean Won (KRW)/kWh. Converting the additional power generated, the saving for the 22 sites would be 3302 thousand KRW (147.1 × 22,452) per year with the yearly optimum angle and 13,483 thousand KRW (147.1 × 91,662) per year with the monthly optimum angles.
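The conversion from additional energy to monetary value is a single multiplication, reproduced below with the figures quoted in the text. The variable names are illustrative.

```python
LCOE_KRW_PER_KWH = 147.1  # LCOE for 100 kW facilities, as quoted in the text

# Additional annual energy (kWh) for the 22 sites, as quoted in the text.
additional_kwh = {"yearly_optimum": 22452, "monthly_optimum": 91662}

# Value of the additional energy in KRW and in whole thousands of KRW.
value_krw = {case: LCOE_KRW_PER_KWH * kwh for case, kwh in additional_kwh.items()}
value_thousand_krw = {case: int(v // 1000) for case, v in value_krw.items()}

for case in additional_kwh:
    print(f"{case}: {value_krw[case]:,.1f} KRW ({value_thousand_krw[case]:,} thousand KRW)")
```

The products are about 3,302,689 KRW and 13,483,480 KRW, which correspond to the 3302 and 13,483 thousand KRW reported when expressed in whole thousands.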

Conclusions
In this paper, a forecasting model based on the gradient boosting algorithm was proposed to predict the amount of solar power generated by PV modules on both a monthly and a yearly basis. The model was then used to simulate the energy generation of the PV modules and derive the monthly and yearly panel tilt angles that maximize it. The study site was Daegu city in South Korea. The model used the solar power generation data, the meteorological data, and the sun position data.
Compared with the originally fixed angles, fixing the panels at the yearly optimal angle brought a slight increase (0.83%) in overall energy generation. The performance change of individual PV modules varied from 0.03% to 4.02%, suggesting that the angles actually applied to these modules differed in efficiency. When the optimal angle of each PV module was calculated and adjusted on a monthly basis, the overall energy generation increased further (3.32%) relative to the original angles. The performance change of individual PV modules varied from 1.89% to 6.32%. Although all modules were located in a single city and share similar geographical attributes, the optimal angles differed to some degree.
We also calculated the annual economic gain from applying these changes in the real world. Producing the same additional energy with the original tilt angles would cost the studied PV modules an additional 3302 thousand KRW for the energy gained with the yearly optimum tilt angles, and 13,483 thousand KRW for the energy gained with the monthly optimum tilt angles.
The sun position data were calculated from data collected by a single meteorological observatory. Although the studied PV modules were located in the same city and would not show significant differences in sun position data, we could expect a more precise and reliable outcome in both the modeling and simulation stages if the sun-related data were measured for each module.
We acknowledge that generalizing our findings to PV modules under different geographical conditions is limited, since the experiment was done on PV modules located within a single city. In a future study, we plan to collect PV module data from different cities in order to improve the generalization of our approach. In addition, since our study combined many different factors and applied machine learning techniques to them, it was difficult to single out the effect of an individual feature. Future studies could address issues such as the effect of rain in clearing dust and thereby increasing PV output, using feature engineering or statistical techniques.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
This section presents the monthly optimum angles and corresponding PV outputs of some of the PV modules to visualize the monthly optimum case.
[Figure: Monthly PV_output (S13-02)]