MULTIPLE LINEAR REGRESSION APPROACH FOR SHORT-TERM FORECASTING OF ELECTRIC ENERGY CONSUMPTION IN TOGO

Comlanvi Adjamagbo, Akim Adekunle Salami, Yao Bokovi, Djamil Gado and Ayite Senah Akoda Ajavon
Research Laboratory in Engineering Sciences (Larsi), Regional Center of Excellence for the Control of Electricity (CERME), Department of Electrical Engineering of the National Superior School of Engineers (ENSI), University of Lome (UL), BP 1515, Lome, Togo.

Manuscript Info
Manuscript History: Received: 05 August 2020; Final Accepted: 10 September 2020; Published: October 2020

Abstract

A Multiple Linear Regression approach is used to model electrical energy consumption in Togo. The model is developed from load data recorded at the electric power source substations in Togo from 2016 to 2017. Using four input parameters (day of the week, type of day (working day or not), hour of the day, and load at the same time on the previous day), the model predicts the electrical energy consumption for 2018 with a MAPE of 4.4964% and a correlation coefficient R² of 95.5889%.
This paper describes the experience we gained while developing a short-term model that predicts the electrical load for the next 48 half-hours of each day throughout 2018 in Togo, using a Multiple Linear Regression method.
Our goal is to predict the load data with various combinations of explanatory variables to determine which configuration gives the best results.
The problems encountered and the solutions proposed are discussed. The developed model should provide daily load profile forecasts for the next seven days. The forecast results for the whole of 2018 are also presented.

Data presentation:-
Electricity load (consumption) data are taken at the various source substations in Togo over the period from 2016 to 2018. The readings are made in 30-minute steps over the day, which gives 48 readings per day, or 52,608 readings in total. Figure 1 gives an overview of the load records in Excel.
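The total of 52,608 readings can be checked with a short calculation. The sketch below assumes the records run without gaps from 1 January 2016 to 31 December 2018; the exact start and end dates are our assumption, and the function name is illustrative.

```python
from datetime import date

READINGS_PER_DAY = 24 * 60 // 30  # one reading every 30 minutes


def n_readings(first: date, last: date) -> int:
    """Number of half-hourly readings between two dates, both inclusive."""
    days = (last - first).days + 1
    return days * READINGS_PER_DAY


# Assumed span of the records: three full calendar years, 2016 being a leap year.
total = n_readings(date(2016, 1, 1), date(2018, 12, 31))
print(READINGS_PER_DAY, total)
```

Under these assumptions the span covers 1,096 days, which at 48 readings per day matches the count stated above.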

Multiple linear regression method:-
Multiple linear regression analysis relies on descriptive analysis of data to observe the relationships between a quantitative dependent variable and n quantitative independent variables. Any regression-based method rests on accepting the founding assumptions of parametric statistics and the notion of least-squares fit. The least-squares concept consists in minimizing the sum of the squared residuals between the observed values and the estimated ones [10].
The descriptive equation for multiple linear regression is as follows (Equation (1)) [11], [12], [13]:

$$y = X\beta + \varepsilon \qquad (1)$$

where $y$ is the vector of responses, $X$ is the matrix of explanatory variables, $\beta$ is the vector of the model parameters, and $\varepsilon$ is the vector of errors. The task is therefore to compute the vector of estimators $\hat{\beta}$, which is the solution in the least-squares sense. This vector is defined by equation (2):

$$\hat{\beta} = (X^{T} X)^{-1} X^{T} y \qquad (2)$$

This model can then be used to make predictions by applying the relation defined by equation (3):

$$\hat{y} = X\hat{\beta} \qquad (3)$$

where $\hat{y}$ is the vector of predicted responses.
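The estimation and prediction steps can be illustrated with a minimal sketch in plain Python. It solves the normal equations behind the least-squares estimator and then applies the prediction relation. The helper names, the intercept column of ones, and the toy data are our assumptions for illustration; they are not taken from the study.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular; remove redundant columns")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit(X, y):
    """Least-squares estimator: solve (X^T X) beta = X^T y."""
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(x * v for x, v in zip(row, y)) for row in Xt]
    return solve(XtX, Xty)


def predict(X, beta):
    """Predicted responses: y_hat = X beta."""
    return [sum(x * b for x, b in zip(row, beta)) for row in X]


# Toy data (not from the study): y = 2 + 3 * x, with a column of ones
# standing in for an assumed intercept term.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [2.0, 5.0, 8.0, 11.0]
beta_hat = fit(X, y)
y_hat = predict(X, beta_hat)
print(beta_hat, y_hat)
```

Solving the linear system directly avoids forming the explicit inverse of the matrix product. For real data, a dedicated routine such as `numpy.linalg.lstsq` is usually more robust numerically.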

Methodology:-
The choice and methodical analysis of the explanatory variables make it possible to assess the influence of each input parameter on the output of the forecast model. Indeed, choosing adequate input parameters is very important for the accuracy of the model. This step is very useful because it allows variables that provide little or no information about the output, as well as redundant variables, to be eliminated. We took into account the explanatory variables listed in Table 1.

Our goal is to predict the load data with various combinations of these explanatory variables in order to determine which configuration gives the best results. We tested seven configuration cases, which are summarized in Table 2.
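As an illustration of how such explanatory variables might be built from a half-hourly load series, the sketch below derives four inputs for each reading: day of the week, type of day, time slot within the day, and load at the same time on the previous day. The numeric coding of each variable, the use of a half-hour index for the time of day, and the `holidays` argument are all assumptions on our part; the actual coding is the one defined in Table 1.

```python
from datetime import datetime, timedelta

STEPS_PER_DAY = 48  # half-hourly readings


def build_rows(start, loads, holidays=frozenset()):
    """Return (X, y) for every reading that has a previous-day value.

    start    -- timestamp of the first reading in `loads`
    loads    -- consecutive half-hourly load values without gaps
    holidays -- hypothetical set of dates treated as non-working days
    """
    X, y = [], []
    for i in range(STEPS_PER_DAY, len(loads)):
        ts = start + timedelta(minutes=30 * i)
        day_of_week = ts.isoweekday()  # assumed coding: 1 = Monday ... 7 = Sunday
        working = 1 if ts.weekday() < 5 and ts.date() not in holidays else 0
        half_hour = ts.hour * 2 + ts.minute // 30  # slot 0..47 within the day
        previous_day_load = loads[i - STEPS_PER_DAY]
        X.append([day_of_week, working, half_hour, previous_day_load])
        y.append(loads[i])
    return X, y


# Toy series (not measured data): two days starting Saturday 6 January 2018.
start = datetime(2018, 1, 6, 0, 0)
loads = [float(i) for i in range(2 * STEPS_PER_DAY)]
X, y = build_rows(start, loads)
print(X[0], y[0])
```

The first day of the series yields no rows because it has no previous-day load. A configuration case that uses only some of the variables can be obtained by keeping the corresponding columns of `X`.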
We divided our data into two groups. The data for 2016 and 2017 are used for learning, that is, for determining the coefficients of the estimator vector $\hat{\beta}$ of the model, and the data for 2018 are used for validation (testing the prediction). For each of the configurations adopted above, we applied the modeling method described in section 3. Thus, we calculated the coefficients of the estimator vector $\hat{\beta}$ from equation (2). Once these coefficients were known, we predicted the new load data using equation (3).
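The division into a learning set and a validation set is a split by calendar year rather than a random split. A minimal sketch follows, assuming each row carries its timestamp; the function name and toy rows are illustrative.

```python
from datetime import datetime, timedelta


def split_by_year(timestamps, rows, train_years=(2016, 2017), test_year=2018):
    """Return (train_rows, test_rows), assigning each row by the year of its timestamp."""
    train = [r for ts, r in zip(timestamps, rows) if ts.year in train_years]
    test = [r for ts, r in zip(timestamps, rows) if ts.year == test_year]
    return train, test


# Toy example (not measured data): one row per day across the 2017/2018 boundary.
stamps = [datetime(2017, 12, 30) + timedelta(days=d) for d in range(4)]
rows = ["a", "b", "c", "d"]
train_rows, test_rows = split_by_year(stamps, rows)
print(train_rows, test_rows)
```

Keeping the split chronological means the coefficients are estimated only from data that precede the forecast period, as they would be in operational use.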
To evaluate the performance of each prediction model, we used the following measures: the mean absolute percentage error (MAPE) [2], expressed by equation (4); the histogram of the absolute errors; and the correlation coefficient (R²) between the predicted and the real load data.
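$$\mathrm{MAPE} = \frac{100}{T} \sum_{t=1}^{T} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \qquad (4)$$

where $y_t$ is the real value, $\hat{y}_t$ is the predicted value, and $T$ is the total number of samples.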
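Both accuracy measures can be computed in a few lines. In the sketch below, MAPE follows the usual definition as a percentage, and R² is taken to be the squared Pearson correlation between real and predicted values, also as a percentage. That reading of "correlation coefficient (R²)" is our assumption; the coefficient of determination is another common definition and can give different values.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Requires non-zero actual values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def r_squared(actual, predicted):
    """Squared Pearson correlation between actual and predicted values, in percent."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return 100.0 * cov * cov / (var_a * var_p)


# Toy values (not measured data): relative errors of 10%, 5% and 0%.
real = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
print(mape(real, forecast), r_squared(real, forecast))
```

Because each error is divided by the real value, MAPE is undefined when a real load is zero and is inflated when the load is very small.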

Results and Discussions:-
In this part, we first discuss the choice of the prediction model and then present the prediction results for the electrical load of Togo's energy system for the year 2018. Tables 3 and 4 respectively summarize the coefficients of the estimators $\hat{\beta}$ and the R² and MAPE values for the seven configuration cases that we chose as model inputs.

The results in Table 5 show that all models except the Case 4 model (whose R² decreased to 77.2865%, against 78.8005% during training) fit the validation test data. Indeed, the MAPE and R² measurements improved during these tests. The model of case 1 predicts the electrical load with an R² of 90.0178%. The model of case 2 gives an R² of 89.8176%, which is no longer the highest value, since the model of case 5 gives the largest R², 95.5889%. The model of case 6 still gives the lowest correlation coefficient (61.9213%). Note also that the model of case 5 has the smallest MAPE (4.4964%), which is consistent with its having the highest R² (95.5889%). The model of case 5, with the highest R² and the lowest MAPE in the validation tests, is therefore chosen to predict the electrical load of Togo's energy system.

The agreement between the measured and predicted load data is clearly visible when the result is plotted for one week. Figure 9 shows the load prediction result from Sunday 07 to Saturday 13 January 2018, with the predicted and measured load data for each day of this week (delimited according to Table 6). In Figure 9, we note a strong correlation between the measured and predicted load data over the seven days (from January 07 to 13, 2018); however, we observe a very large difference between the measured and predicted curves on two days (Sunday 07 and Saturday 13 January 2018).
This observation led us to measure the prediction accuracy of the electrical load for each day of the week over the year 2018; the results are shown in Table 6. These results show a strong correlation between the measured and predicted load data for all seven days of the week in 2018, since the average R² for each day is greater than 99%. We also observe a large average MAPE between the measured and predicted load curves for Sundays (9.2746%) and Saturdays (12.8566%). From Monday to Friday, the case 5 model retained for predicting the electrical load of Togo's energy system performs well on each day (MAPE < 2.67% and R² > 99%).
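A per-weekday breakdown of this kind can be obtained by grouping the percentage errors by the weekday of each reading. The sketch below pools all readings that fall on the same weekday, which is our assumption. It gives the same result as averaging daily MAPE values only when every day contributes the same number of readings. The values used are made up for illustration and are not those of Table 6.

```python
from collections import defaultdict
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def mape_by_weekday(timestamps, actual, predicted):
    """Return {day name: MAPE in percent} over all readings falling on that weekday."""
    errors = defaultdict(list)
    for ts, a, p in zip(timestamps, actual, predicted):
        errors[DAY_NAMES[ts.weekday()]].append(abs((a - p) / a))
    return {day: 100.0 * sum(e) / len(e) for day, e in errors.items()}


# Toy readings (not measured data): two on Sunday 7 January 2018, one on Monday 8.
stamps = [datetime(2018, 1, 7, 0, 0), datetime(2018, 1, 7, 0, 30),
          datetime(2018, 1, 8, 0, 0)]
result = mape_by_weekday(stamps, [100.0, 100.0, 200.0], [110.0, 90.0, 200.0])
print(result)
```

A breakdown like this makes it easy to see whether the error is concentrated on particular days, as reported here for Saturdays and Sundays.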

Conclusion:-
This paper presents the short-term prediction of the electrical load of Togo's energy system by the Multiple Linear Regression method. Among the seven models tested (which differ from one another in their input parameters), we chose a model with four input parameters (day of the week, type of day (working or not), hour of the day, and load at the same time on the previous day) to predict the electrical load. This choice is justified by the performance obtained during the validation tests: this multiple linear regression model predicted the electrical load of Togo's energy system for the year 2018 with a MAPE of 4.4964% and a correlation coefficient R² of 95.5889%.