IMPLEMENTATION OF MULTIPLE LINEAR REGRESSION METHODS AS PREDICTION OF VILLAGE SPENDING ON VILLAGE FINANCIAL MANAGEMENT SYSTEM

Village welfare and village development can be advanced through sound village financial management. The village government has authority over planning, implementation, reporting, and accountability. Two variables are central to village finances: village income and village expenditure. The village budget is a plan that is compiled systematically. Planning is closely tied to prediction: planning indicates what is supposed to happen, while prediction concerns what will happen. Good village budget planning therefore requires a village budget prediction feature. In this study, the prediction feature is built with a data mining approach, modeled using the multiple linear regression algorithm. The sample is selected using a purposive sampling technique and consists of 29 villages. The dependent variable is village expenditure (Y), and the independent variables are village funds (X1) and village fund allocation (X2). The best validation result was obtained in the 3rd fold, with a correlation coefficient of 0.8907, a Mean Absolute Error (MAE) of 87209395.37, a Root Mean Squared Error (RMSE) of 114867675.6, a Relative Absolute Error (RAE) of 42%, and a Root Relative Squared Error (RRSE) of 44%.


INTRODUCTION
The village is the smallest unit of government and has an important role in national development [1] [2]. The village has the authority to manage its own affairs, which means that it must manage its own potential to improve the welfare and quality of life of the village community [3].
Village welfare, quality improvement, and village development can be advanced through the financial aspects of village management. In managing village finances, the village government has authority over planning, implementation, reporting, and accountability. Good village financial management must be supported by competent human resources (HR) and by adequate systems and procedures for village financial management [4]. Two things are the focus of village finances: village income and village expenditure. Village income is derived from various sources. It is used to finance village spending, especially spending that is prioritized to improve the welfare and development of the village and that has been agreed through village deliberations [5].
Under the Regulation of the Minister of Home Affairs of the Republic of Indonesia No. 113 of 2014 on Village Financial Management, village revenue is all money received through the village account that is the right of the village within one budget year and does not need to be repaid by the village. Village income is derived from the village's original income, which consists of business results, asset results, self-help and participation, mutual assistance, and other village-original income. Village revenue also includes allocations from the State Budget (APBN), a share of district/city local taxes and levies, the Village Fund Allocation, financial assistance from the provincial and district/city budgets (APBD), grants and donations, and other legitimate village income. Meanwhile, village spending is all expenditure made through the village account that is the obligation of the village within one budget year and will not be recovered by the village [6].
Village spending covers the needs that the village itself will fulfill, such as primary needs, basic services, the environment, and village community empowerment activities. These needs have been agreed upon and aim to encourage the development and welfare of the village [7]. Every village in Indonesia has different income sources, adapted to the potential of the village itself. Village revenue and village spending are interdependent: village spending must adjust to village revenue. In drafting the government work plan, the village government verifies and ratifies which spending items will be included in the plan [8]. Budget planning is closely related to prediction, and planning a good program affects the success of the village program. Prediction concerns what will happen, while planning concerns what should happen. Applying predictive methods to budget planning will help the village government develop good budget plans [9].
This research uses APBD data of villages in the city of Bandung obtained from BPS for the city of Bandung. To gauge how village revenue affects village spending, the researchers use multiple linear regression, a statistical method for testing the causal relationship between one dependent variable and two or more independent variables. In general, the independent variable is denoted by X and the dependent variable is denoted by Y [10]. In this study, the dependent variable is village spending, and the independent variables are village funds and village fund allocation, both of which are sources of village income. Among the sources of village revenue, these two variables have the largest values (see Table 1). The magnitudes of village funds, village fund allocation, and village spending in Table 1 indicate a relationship among these three variables. If village funds and village fund allocation rise, village spending follows, and conversely, if village income decreases, village expenditure also declines [11]. With multiple linear regression, the researchers measure the impact of village funds and village fund allocation on village spending. In addition to determining the influence of the variables, the researchers also implement multiple linear regression in the system to predict the amount of village spending [12]. At present, budget recording in some villages is still done manually.
In this study, the researchers design a system for recording the budget and add a prediction feature that helps the village government determine the proper use of the budget by predicting village spending as the basis for specifying the government work plan. The study obtained a validation accuracy of 94.47%, and the regression model obtained is Y = -217,067,261 + 1.3033363 X1 + 1.253114628 X2.

Fig 1. Cross Industry Standard Process for Data Mining
The phases of the research methodology flowchart [13] shown in Fig. 1 are described in general as follows:

Business Understanding
Good and precise use of the budget supports village development and the welfare of the village itself. Before using the APBDes, the village first prepares the government work plan. Funding submissions must be fully accounted for and supported by valid documents. In this research, the government work plan is related to village spending because village spending consists of the expenses made by the village within one year. Planning relates to what is supposed to be done and is therefore relevant to prediction.
Predictions of village spending can be used as a guideline or reference by the village in determining where the budget will be used, and the predicted total village spending can also serve as evaluation material to assist the government in managing the budget. This also helps to avoid legal issues related to malpractice or improper use of village budgets. Poorly targeted budget use can be a serious problem for the village, because community confidence decreases when funds are used improperly. Village budget preparation adjusts the village spending plan to the amount of village income received by the village.

Data Understanding
The data used are secondary data derived from the 2018 APBDes reports. The data were obtained from the websites bps.go.id and databandung.go.id. The data were collected using a purposive sampling technique, in which samples are taken according to certain criteria. The sample consists of 29 villages in West Java.

Data Preparation
The researchers test the data through manual calculation in Microsoft Excel and with Weka and SPSS. SPSS is used for statistical analysis. The program has a fairly high analysis capability as well as a data management system in a graphical environment. It is also simple to use, with a variety of descriptive menus and dialog boxes.

Modelling
The researchers implemented the model that has been obtained by conducting various tests.

Evaluation
The models obtained are then evaluated against the research objectives set in the initial stage (business understanding). This leads to the identification of other needs through the patterns found. Gaining business understanding is an iterative procedure in data mining, in which the results of visualization, statistical methods, and artificial intelligence methods reveal new relationships. For the evaluation itself, the research is validated using Weka, which shows whether the model is good after the testing stage. Weka also measures the estimated performance of the regression through the correlation coefficient, MAE, RMSE, RAE, and RRSE.
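The evaluation measures named above can be written out directly. The following is a minimal sketch, assuming the common textbook definitions in which RAE and RRSE are normalized by a baseline that always predicts the mean of the actual values; Weka's own implementation may differ in detail, and the function name `regression_metrics` is illustrative only.

```python
import math

def regression_metrics(actual, predicted):
    """Return correlation, MAE, RMSE, RAE (%) and RRSE (%) for two equal-length lists."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n

    # Errors of the model
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))

    # Errors of the baseline that always predicts the mean of the actual values
    abs_base = sum(abs(a - mean_a) for a in actual)
    sq_base = sum((a - mean_a) ** 2 for a in actual)

    # Pearson correlation between actual and predicted values
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_p = sum((p - mean_p) ** 2 for p in predicted)

    return {
        "correlation": cov / math.sqrt(sq_base * var_p),
        "mae": abs_err / n,
        "rmse": math.sqrt(sq_err / n),
        "rae_percent": 100 * abs_err / abs_base,
        "rrse_percent": 100 * math.sqrt(sq_err / sq_base),
    }
```

Because RAE and RRSE are ratios against the mean baseline, values below 100% indicate that the model predicts better than simply using the average.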

Deployment
After all the research phases, the researchers use the regression model obtained from the manual calculation or from the calculation in SPSS. All design elements are then applied to the system, and the sample data are used in the system to determine the regression formula. The process of creating an APBDes report begins with the input of village income, followed by village spending. At this input step, the total village spending is filled in automatically and acts as the maximum limit of village spending. Under village spending, the related parties adjust the budget items or government work plan to be included. After that, village financing is entered, as with village income and village expenditure. The output is the APBDes report, consisting of village revenue, village expenditure, and village financing.

Fig 2. Population and Sample
Purposive sampling is a non-random sampling technique in which the sample is determined by specific characteristics that suit the purpose of the research [14]. Because it is non-random, members of the population do not have an equal opportunity of being sampled. The population in this study is village data from the province of West Java, and the sample consists of 29 villages.

Classic Assumption Test
In a multiple linear regression model, classical assumption tests are used to analyze whether the data meet the BLUE (Best Linear Unbiased Estimator) criteria [15]. The following are the stages of testing the data used in the multiple linear regression model. Based on Fig. 2 and following the rule for SPSS output, if the tolerance value is > 0.1 and the variance inflation factor (VIF) is < 10, then there is no symptom of multicollinearity [16]. For both independent variables in this study, the tolerance value is 0.479 > 0.1 and the VIF value is 2.086 < 10, so there are no symptoms of multicollinearity. For autocorrelation, the provision is dU < Durbin-Watson < 4 - dU, which here gives 1.563 < 1.665 < 2.437 (where 4 - 1.563 = 2.437). It can be concluded that there are no symptoms of autocorrelation [17].
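The two decision rules above can be expressed as small checks. This is an illustrative sketch, not the SPSS procedure itself: the function names are hypothetical, the thresholds (0.1, 10) and the values 0.479, 2.086, 1.665, and 1.563 are taken from the text, and the Durbin-Watson statistic is computed with its standard definition.

```python
def vif_from_tolerance(tolerance):
    """VIF is the reciprocal of tolerance."""
    return 1.0 / tolerance

def no_multicollinearity(tolerance, vif):
    """Rule from the text: tolerance > 0.1 and VIF < 10."""
    return tolerance > 0.1 and vif < 10

def durbin_watson(residuals):
    """Standard Durbin-Watson statistic: squared successive differences over squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

def no_autocorrelation(dw, d_upper):
    """Rule from the text: dU < DW < 4 - dU."""
    return d_upper < dw < 4 - d_upper

# Values reported in the text
print(no_multicollinearity(0.479, 2.086))
print(no_autocorrelation(1.665, 1.563))
```

The reciprocal of the reported tolerance is about 2.09, which agrees with the reported VIF of 2.086 up to rounding of the tolerance value.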

Fig 5. Test the normality by Histogram
This figure is SPSS output; the histogram is used to assess normality. In multiple linear regression, the normality test is applied to the residuals rather than to the variable data. According to the SPSS output rule, if the histogram forms a normal curve, the residuals are considered normally distributed and the assumption of normality is fulfilled [18].

Fig 6. Test the normality by Normal P-Plot
In this SPSS output, the points follow the straight diagonal line, so the residuals can be assumed to be normally distributed.

Fig 7. Test the normality by Kolmogorov-Smirnov
In this SPSS output, the plot follows the straight line, so the data can be assumed to be normally distributed.

Statistical Test F
Fig 8. Statistical Test F
Based on the output of multiple linear regression in Figure 6, the F statistical test yields a significance value of 0.000, which is below 0.05 (0.000 < 0.05). It can be concluded that village funds (X1) and village fund allocation (X2) simultaneously have a significant effect on village spending (Y). In Figure 7, the results of the t statistical test appear in the P>|t| column. For X1 (village funds), P>|t| = 0.00 < 0.05, so X1 has a significant and positive effect on Y (village expenditure). For X2 (village fund allocation), P>|t| = 0.04 < 0.05, so X2 also has a significant effect on Y (village spending).

Multiple linear regression
Multiple linear regression is a statistical method used to test the causal relationship between one dependent variable and two or more independent variables [19]. In general, the causal factor is denoted by X and the resulting factor is denoted by Y. Multiple linear regression is also often used in production to forecast both quality and quantity [20]. In general, the formula of multiple linear regression with two independent variables is:
$$Y = a + b_1 X_1 + b_2 X_2 \quad (1)$$
Y : Dependent variable
a : Constant
b1 : The value of the regression coefficient of the first variable
b2 : The value of the regression coefficient of the second variable
X1 : First independent variable
X2 : Second independent variable
To obtain the regression coefficients b1 and b2, the formulas below can be used, where lowercase letters denote deviations from the sample means ($x_1 = X_1 - \bar{X}_1$, $x_2 = X_2 - \bar{X}_2$, $y = Y - \bar{Y}$):
$$b_1 = \frac{\left(\sum x_2^2\right)\left(\sum x_1 y\right) - \left(\sum x_1 x_2\right)\left(\sum x_2 y\right)}{\left(\sum x_1^2\right)\left(\sum x_2^2\right) - \left(\sum x_1 x_2\right)^2} \quad (2)$$
$$b_2 = \frac{\left(\sum x_1^2\right)\left(\sum x_2 y\right) - \left(\sum x_1 x_2\right)\left(\sum x_1 y\right)}{\left(\sum x_1^2\right)\left(\sum x_2^2\right) - \left(\sum x_1 x_2\right)^2} \quad (3)$$
The constant a is then obtained from the following formula, which uses the values of b1 and b2 obtained first:
$$a = \bar{Y} - b_1 \bar{X}_1 - b_2 \bar{X}_2 \quad (4)$$
The analysis with multiple linear regression uses the least squares method. The first stages are as follows:
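The least squares estimates for a model with two independent variables can be computed from sums of squares and cross-products. The following is a minimal sketch of that calculation, assuming the standard closed-form solution; the function name `fit_two_predictor_regression` and the short variable names are illustrative and are not taken from the system described in the paper.

```python
def fit_two_predictor_regression(x1, x2, y):
    """Return (a, b1, b2) for the model Y = a + b1*X1 + b2*X2 by least squares."""
    n = len(y)
    sum_x1, sum_x2, sum_y = sum(x1), sum(x2), sum(y)

    # Sums of squares and cross-products of deviations from the means
    s11 = sum(v * v for v in x1) - sum_x1 ** 2 / n
    s22 = sum(v * v for v in x2) - sum_x2 ** 2 / n
    s12 = sum(p * q for p, q in zip(x1, x2)) - sum_x1 * sum_x2 / n
    s1y = sum(p * q for p, q in zip(x1, y)) - sum_x1 * sum_y / n
    s2y = sum(p * q for p, q in zip(x2, y)) - sum_x2 * sum_y / n

    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("X1 and X2 are perfectly collinear; coefficients are undefined")

    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    # The constant follows from the means once b1 and b2 are known
    a = sum_y / n - b1 * sum_x1 / n - b2 * sum_x2 / n
    return a, b1, b2
```

The guard on `det` reflects the multicollinearity concern: when the two independent variables are perfectly correlated, the denominator vanishes and no unique solution exists.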

Data Calculation Process
Using 29 samples of village data in West Java, the researchers analyze the relationship between village income and village expenditure. The dependent variable Y is village spending, and the independent variables X1 and X2 are village funds and village fund allocation. In the calculation, the researchers first compute the totals of X1, X2, and Y (see Fig. 8). Each sample has a value for X1, X2, and Y. The squared value of each variable is computed for every sample, and the squared values are then summed. After that, the products of the variables X1, X2, and Y are computed for every sample and summed as well. All of these totals are used to determine the values a, b1, and b2.
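N is the number of sample data used in the research, here 29 villages. The following formulas give the sums of squares and cross-products of deviations from the sample means (written with lowercase letters), which are used to calculate the regression coefficients b1 and b2 and the constant a:
$$\sum x_1^2 = \sum X_1^2 - \frac{\left(\sum X_1\right)^2}{N}, \qquad \sum x_2^2 = \sum X_2^2 - \frac{\left(\sum X_2\right)^2}{N}, \qquad \sum y^2 = \sum Y^2 - \frac{\left(\sum Y\right)^2}{N}$$
$$\sum x_1 y = \sum X_1 Y - \frac{\sum X_1 \sum Y}{N}, \qquad \sum x_2 y = \sum X_2 Y - \frac{\sum X_2 \sum Y}{N}, \qquad \sum x_1 x_2 = \sum X_1 X_2 - \frac{\sum X_1 \sum X_2}{N}$$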
Once the required totals are obtained, the regression coefficients b1 and b2 are determined using formula (2) and formula (3), and these values are then used in formula (4) to determine the constant a. The results are a = -217,067,261, b1 = 1.3033363, and b2 = 1.253114628, so the regression model is as follows:
$$Y = -217{,}067{,}261 + 1.3033363\,X_1 + 1.253114628\,X_2$$
Next, the data are tested using the model obtained: given the X values, the predicted value of Y can be found. The multiple correlation coefficient (R) measures the strength of the association between quantitative variables, expressed in the range -1 ≤ R ≤ 1. For a model with two independent variables, R can be computed as follows, with the sums taken over deviations from the sample means:
$$R = \sqrt{\frac{b_1 \sum x_1 y + b_2 \sum x_2 y}{\sum y^2}}$$
The resulting value is R = 0.914933867, which satisfies -1 ≤ 0.914933867 ≤ 1.
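Applying the fitted model is a direct substitution into the regression equation. The sketch below uses the coefficients reported in the text; the function name and the input amounts in the usage line are hypothetical and serve only to show the call.

```python
# Coefficients reported in the text
A = -217_067_261
B1 = 1.3033363
B2 = 1.253114628

def predict_village_spending(village_fund, village_fund_allocation):
    """Predicted village spending Y = a + b1*X1 + b2*X2."""
    return A + B1 * village_fund + B2 * village_fund_allocation

# Hypothetical inputs, not taken from the study's data
print(predict_village_spending(1_000_000_000, 500_000_000))
```

Since the constant is negative, the model is only meaningful for income levels within the range of the sampled villages; for very small inputs it would predict negative spending.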

Coefficient of determination
The coefficient of determination measures how well the model explains the variation of the dependent variable through the regressors, or independent variables. Its value lies in the interval between 0 and 1 (0 ≤ R² ≤ 1). It can be computed as the square of the multiple correlation coefficient R. The resulting value is R² = 0.83710398, which satisfies 0 ≤ 0.83710398 ≤ 1.
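The relationship between the two reported values can be checked numerically, and the coefficient of determination can also be computed from predictions. This is a minimal sketch using the standard definition R² = 1 - SS_res / SS_tot; the function name `r_squared` is illustrative.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: share of variance in `actual` explained by `predicted`."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Values reported in the text
REPORTED_R = 0.914933867
REPORTED_R2 = 0.83710398

# Squaring the multiple correlation coefficient reproduces the reported R squared
print(abs(REPORTED_R ** 2 - REPORTED_R2) < 1e-7)
```

Read this way, roughly 83.7% of the variation in village spending across the sampled villages is explained by village funds and village fund allocation.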

Validate using Cross Validation
Validation testing measures the performance of the regression model obtained from the 29 samples of data. Cross-validation is a way to find the best parameters of a model by testing the amount of error on the test data [21].
In cross-validation, the data are divided into k subsets of the same size.
The researchers must set the number of partitions, or folds, for the K-fold cross-validation test. Commonly used numbers of folds are 5, 10, 15, 20, and up to 30.
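The K-fold procedure described above can be sketched as follows. This is an illustration under stated assumptions rather than a reproduction of Weka's behavior: folds are taken as contiguous blocks without shuffling, all function names are hypothetical, and only MAE and RMSE are reported per fold.

```python
import math

def fit(x1, x2, y):
    """Least squares fit of Y = a + b1*X1 + b2*X2; returns (a, b1, b2)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((p - m1) ** 2 for p in x1)
    s22 = sum((q - m2) ** 2 for q in x2)
    s12 = sum((p - m1) * (q - m2) for p, q in zip(x1, x2))
    s1y = sum((p - m1) * (v - my) for p, v in zip(x1, y))
    s2y = sum((q - m2) * (v - my) for q, v in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return my - b1 * m1 - b2 * m2, b1, b2

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds whose sizes differ by at most one."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(x1, x2, y, k):
    """Return a list of (MAE, RMSE) pairs, one per fold."""
    n = len(y)
    results = []
    for test_idx in k_fold_indices(n, k):
        held_out = set(test_idx)
        train = [i for i in range(n) if i not in held_out]
        # Fit on the training folds only
        a, b1, b2 = fit([x1[i] for i in train],
                        [x2[i] for i in train],
                        [y[i] for i in train])
        # Evaluate on the held-out fold
        errors = [y[i] - (a + b1 * x1[i] + b2 * x2[i]) for i in test_idx]
        mae = sum(abs(e) for e in errors) / len(errors)
        rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
        results.append((mae, rmse))
    return results
```

With 29 samples, the folds cannot all have exactly the same size; for example, 10 folds give nine folds of three samples and one fold of two. Comparing the per-fold errors is what allows a "best fold" to be reported.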
Here is the result:
In Figure 22, this page serves to manage village financing.

CONCLUSION
This research addressed the management of village finances, with the aim of analyzing the linkage of village funds and village fund allocation to village expenditure, building an APBDes management system, and adding a village spending prediction feature to help target the government work plan budget more precisely. The researchers found that the research could answer, or provide solutions to, the existing problems. Using the method as a feature for predicting village spending helps the village gain a picture of where the village budget will be used. The predicted total village expenditure can serve as a limit on budget use and can also motivate the village to further improve its sources of income, so that village spending also increases and supports the development and progress of the village. Multiple linear regression is a statistical analysis method that can be used both as a predictive method and as a method for measuring the linkage between the variables used in the study. As a village expenditure prediction feature, it supports predicting village spending from the village funds and village fund allocation variables, two variables that passed the testing stage in SPSS without symptoms of violation and met the BLUE (Best Linear Unbiased Estimator) criteria. The best validation result was obtained in the 3rd fold, with a correlation coefficient of 0.8907, a Mean Absolute Error of 87209395.37, a Root Mean Squared Error of 114867675.6, a Relative Absolute Error (RAE) of 42%, and a Root Relative Squared Error of 44%.