Forecasting Blood Demand Using the Support Vector Regression Method (Case Study: Blood Transfusion Unit-PMI Central Lombok)

Blood is an important component produced by the human body and is vital to human survival. When the blood level in the human body is lower than it should be, the remedy is blood donation or blood transfusion. The health facilities that organize blood donations, provide blood, and distribute blood are called Blood Transfusion Units (UTD). In carrying out its duties, UTD encounters several obstacles. Blood has a shelf life of only 35 days from donation, and once it has passed the expiration date it can no longer be used for transfusion. Meanwhile, the demand for blood is greater than the number of donors, which makes it difficult for UTD when demand occurs while the existing stock is insufficient. If the stock at UTD is in excess, it causes losses because the blood is wasted due to expiration. In addition, in everyday life many people's blood needs are not met, so their families search for donors directly, including on social media such as WhatsApp, Facebook, and Instagram. This shows that many of them lack donors. To anticipate these problems, it is necessary to carry out research on forecasting blood demand using the Support Vector Regression (SVR) method at UTD PMI Central Lombok. The aim of this research is to forecast the demand for blood at UTD PMI Central Lombok in the coming period in order to reduce the impact of a shortage or excess of blood. SVR is the application of the Support Vector Machine (SVM) to regression, where it finds the best hyperplane for the regression function. The advantage of the SVR model is that it can handle overfitting problems in the data. The measures used to select the best model are Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), and the Coefficient of Determination (R2).
The results of this research show that the best model is SVR with a polynomial kernel, and based on the tuning results the parameters used are C=10, degree=1, and epsilon=1. The SVR model with a polynomial kernel produces a MAPE value of 18.7502% and an RMSE value of 0.6919, which means the model has very good predictive ability. The R2 value is 0.9936 (99.36%) and the MSE value is 0.4787, which means that the prediction of blood demand at UTD PMI Central Lombok using SVR with a polynomial kernel function has very good accuracy. The predicted results for January are 1654 for blood type A, 920 for B, 2205 for O, and 1104 for AB.


I. INTRODUCTION
Blood is an important component produced by the human body and an essential part of human survival. In general, blood serves as the body's means of transporting O2, CO2, nutrients, metabolic products, hormones, heat, and immune components [1]. When the blood level in the human body is lower than it should be, the remedy is blood donation or blood transfusion [2]. Blood donation is the process in which someone voluntarily gives blood, which is later used to help people in need [3] [4].
In Indonesia, the need for blood continues to increase every year because blood is needed for accidents and planned operations. Blood donation is very important but is often forgotten. The Indonesian Red Cross (PMI), as one of the blood providers, is currently limited by a lack of blood supply for patients. According to the WHO (World Health Organization), blood availability should ideally be 2% of the population, meaning that Indonesia needs 4.6 million bags of blood, but PMI received less than 0.5% of this need in 2005 [5]. In the health sector, the health service facility that organizes blood donations, provides blood, and distributes blood is the Blood Transfusion Unit (UTD) [7][8]. In carrying out these duties, UTD faces several obstacles, such as the nature of blood, which has a limited shelf life [9] or is easily damaged [10], and the gap between the quantity requested and the amount of blood available at UTD. For example, blood has a shelf life of 35 days from donation. Once it has passed the expiration date, it can no longer be used for blood transfusion [11]. Meanwhile, the demand for blood is greater than the number of donors, which makes it difficult for UTD when demand occurs while the existing blood stock is insufficient. If the stock at UTD is in excess, it causes losses because the blood is wasted due to expiration [12]. Apart from that, in everyday life many patients' families intervene directly to find willing donors. They even search on social media such as WhatsApp, Facebook, and Instagram. This shows that many patients have not had their blood needs met.
To anticipate losses from an excess or shortage of blood, forecasting is necessary. Forecasting is the activity of estimating future events, in this case to provide blood reserves that meet future blood needs [13]. The method used is Support Vector Regression (SVR). The application of Support Vector Machines (SVM) to regression cases, with output in the form of real numbers or sequential data, is called SVR [14]. SVR is an algorithm that can overcome overfitting, the condition in which a model produces almost perfect prediction accuracy on the training data but not on the test data [15].
Research conducted by [16], titled Support Vector Regression for Forecasting Blood Demand: Case Study of UTD Branch-PMI Malang City, aimed to determine the optimal SVR parameters for predicting blood demand at PMI Malang City. Based on simulations with blood data for the 2010-2014 period, a minimum MAPE value of 3.899% was obtained with parameter values lambda=10, sigma=0.5, cLR=0.01, C=0.1, epsilon=0.01, number of data features=4, and 5000 iterations, using 12 test data. The resulting MAPE value is below 10% and can be considered good for predicting the amount of blood needed. Blood demand forecasting was also studied by [17], again with a case study in Malang City. That study concluded that Fuzzy Time Series with interval optimization using Particle Swarm Optimization predicted blood needs with an accuracy of 92.49670% and an error rate (MAPE) of 7.50330%, calculated from the difference between the actual data and the forecasts on 12 test data. In addition, research conducted by [18] used the SVR method to predict monthly rainfall. The results show that SVR is very effective in predicting rainfall, with forecast accuracy assessed by testing the RMSE value against the SVR model parameter values. The best RMSE is 0.038800637 with gamma of 0.0005, C of 0.0001, and epsilon of 1.
Based on the problems presented and the research references, the author proposes research entitled "Forecasting Blood Demand Using the Support Vector Regression Method (Case Study: Blood Transfusion Unit-PMI Central Lombok)". It is hoped that the results of this research can help blood transfusion units adjust their blood stocks.

II. RESEARCH METHODOLOGY
The research began with a literature review, followed by data collection, data preprocessing, modeling, evaluation, implementation, and drawing conclusions. These steps are shown in Figure 1.

Data Preprocessing
Data preprocessing is the initial stage of data processing. It aims to convert raw data into quality data that is suitable for processing at the next stage. The stages in data preprocessing are [19]:
a. Data Integration
Data integration is the process of combining data from several databases into one new database. The data needed in the data mining process does not come from only one database but can also come from several databases.
b. Data Cleaning
Cleaning is the process of removing unnecessary characters and words from the data that will be used. Data cleaning is carried out to remove invalid data, that is, data in a dataframe that is empty or has a NaN value. In the incoming and outgoing blood data, several features have empty values, such as the name, age, aftap date, and bag number features. Figure 4 below shows the missing data.

Fig 4. Data Missing
Two steps are used to handle missing data: deleting features that have missing values and are unimportant, and imputing the average (mean) value for features with few missing values. The deleted features are village, aftap date, and bag number, because apart from having missing values these features are also not very important. The age feature is imputed with its mean value.
c. Data Selection
Data selection determines which data from the dataset will be processed later. It aims to select the attributes that are considered to influence the classification of the blood data.
d. Data Outlier
Outliers are data points that deviate too far from the other data in a data series. The existence of outliers will bias the analysis of the data.
e. Data Transformation
Transformation is the stage of preparing the data according to the model or algorithm to be used in the data processing stage. Data that has passed data selection and data cleaning is transformed with the aim of adjusting its distribution so that the data becomes normalized and easier to group. Data transformation here means changing categorical values into numerical values using the tools provided by Sklearn, such as the label encoder.
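The cleaning and transformation steps described above can be illustrated with a minimal sketch. It assumes pandas and scikit-learn are available; the column names, the sample records, and the age bin edges are hypothetical placeholders, not the actual UTD PMI Central Lombok data or the authors' code.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical raw records; names and values are illustrative only.
raw = pd.DataFrame({
    "name": ["A1", "B2", "C3", "D4"],
    "village": ["X", None, "Y", None],
    "bag_number": [None, "K2", None, "K4"],
    "age": [25.0, None, 41.0, 33.0],
    "blood_type": ["A", "O", "B", "AB"],
})

# Step 1: drop features that have missing values and are not important.
clean = raw.drop(columns=["village", "bag_number"])

# Step 2: impute the mean for a feature with few missing values.
clean["age"] = clean["age"].fillna(clean["age"].mean())

# Bin the age column before encoding (bin edges are an assumption).
clean["age_group"] = pd.cut(
    clean["age"], bins=[0, 30, 45, 120], labels=["young", "adult", "senior"]
)

# Change categorical values into numerical values with the label encoder.
encoder = LabelEncoder()
clean["blood_type_code"] = encoder.fit_transform(clean["blood_type"])
```

LabelEncoder assigns integer codes in sorted class order, so the codes carry no numerical meaning beyond identifying the category.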
In Figure 7, the columns that are transformed are the goal-in column, the goal-out column, the blood type column, and the age column. The age column is binned first and then passed to the transformation stage.
Modeling
The SVR model is tested with three kernels, namely the linear kernel, the polynomial kernel, and the radial basis function (RBF) kernel. These three kernels are tested using the training data, and the kernel with the best results is selected for forecasting. A diagram of the support vector regression steps can be seen in Figure 8.
Evaluation
After results are obtained from all three kernels, an evaluation stage is carried out using MAPE, MSE, RMSE, and R2. Once the evaluation results for the SVR model are available, the next stage is to make predictions. If the combination of kernel function, parameters, and evaluation gives good results, predictions are made using the test data that was set aside when the data was split.
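As a reference for the evaluation stage, the following is a minimal sketch of the four measures (MSE, RMSE, MAPE, and the coefficient of determination) using their standard textbook definitions. The function name `regression_metrics` and the sample values are illustrative assumptions, not taken from the study.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAPE (in percent), and R2 for two equal-length lists."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mse = ss_res / n
    rmse = math.sqrt(mse)                          # RMSE is the square root of MSE
    # MAPE is undefined when an actual value is zero.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"mse": mse, "rmse": rmse, "mape": mape, "r2": r2}

# Hypothetical actual and predicted values.
metrics = regression_metrics([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0])
```

Because RMSE is the square root of MSE, the two always rank models identically; MAPE adds a scale-free view of the error, and R2 reports the share of variance explained.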

Implementation
This stage is carried out based on the results of the tests performed previously. It covers the implementation of the algorithm and of the interface using the Python programming language in the Jupyter Notebook editor.

Conclusion
At this stage, conclusions are drawn from the results of the research, covering the stages from the literature study through system testing and analysis. The conclusions answer the problem formulation stated earlier and provide suggestions for further research.

SVR Models
The data used in this research consisted of 2,705 records. The data was divided into two parts: 80% training data (2,164 records) and 20% test data (541 records). The training data is then used to find the score of each kernel, including:
a. Linear Kernels
For the linear kernel, the scores for x_train, y_train and for x_test, y_test are computed. The score obtained is 0.3703 for x_train, y_train and 0.4006 for x_test, y_test, as shown in Figure 9.
Based on Table 2, the error calculation in this study gives an MSE value of 0.4064, an RMSE value of 0.6375, and a MAPE value of 22.1498%, and the prediction accuracy measured with R2 is 99.46%. This means that the SVR model with a polynomial kernel is more accurate in predicting blood demand at UTD PMI Central Lombok. After the error evaluation, the next step is to predict blood demand for the next period.
Based on the figures above, the prediction results for the next month, January, are as follows. For incoming blood group A of 9 and outgoing blood group A of 9, the prediction result is 1655. For incoming blood group B of 5 and outgoing blood group B of 5, the prediction result is 920. For incoming blood group O of 12 and outgoing blood group O of 12, the prediction result is 2205. Meanwhile, for incoming blood group AB of 6 and outgoing blood group AB of 6, the prediction result is 1104.
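The kernel comparison and parameter tuning discussed in this section can be sketched as follows. This is a minimal illustration on synthetic data, assuming scikit-learn; it assumes the reported train and test scores come from the estimator's `score` method (which returns R2), and the feature values, grid values, and random seeds are hypothetical rather than the authors' settings.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVR

# Synthetic stand-in for the encoded features and the demand target (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, 2))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 0.1, size=300)

# 80% training data and 20% test data, as in the split described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit one SVR per kernel and record the (train, test) scores.
scores = {}
for kernel in ("linear", "rbf", "poly"):
    model = SVR(kernel=kernel)
    model.fit(X_train, y_train)
    scores[kernel] = (model.score(X_train, y_train), model.score(X_test, y_test))

# Tune C, degree, and epsilon for the polynomial kernel (grid values are illustrative).
grid = GridSearchCV(
    SVR(kernel="poly"),
    param_grid={"C": [1, 10], "degree": [1, 2], "epsilon": [0.1, 1]},
    cv=3,
)
grid.fit(X_train, y_train)
best_params = grid.best_params_
```

Comparing the train and test scores of each kernel side by side is one way to spot overfitting: a large gap between the two suggests the model fits the training data far better than unseen data.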

IV. CONCLUSION
Based on the discussion above, it can be concluded that the SVR model used is the one with a polynomial kernel because its results were very good. With the parameters C=10, degree=1, and epsilon=1, the polynomial kernel is able to overcome the overfitting problem in predicting blood demand at UTD PMI Central Lombok. Forecasting blood demand at UTD PMI Central Lombok using support vector regression with a polynomial kernel gives an MSE of 0.4787, an RMSE of 0.6919, and a MAPE of 18.7502%, which means that the model has very good forecasting capability. The R2 value is 0.9936 (99.36%), which means that the prediction of blood demand at UTD PMI Central Lombok using SVR with a polynomial kernel function has very good accuracy. The predicted results for January are 1654 for blood type A, 920 for B, 2205 for O, and 1104 for AB.
[6]. In accordance with Government Regulation Number 7 of 2011 concerning Blood Transfusion Services, Minister of Health Regulation Number 83 of 2014 concerning Blood Transfusion Units, and Minister of Health Regulation Number 14 of 2021 concerning standards for commercial activities and products in implementing risk-based commercial licensing.

Fig 1. Research Steps
2.1 Literature Review
At this stage, a literature study is carried out by collecting information from books, journals, and national and international research related to the study. Literature studies also discuss theories that are relevant to the research and

Fig 2. Incoming Blood Data

Fig 3. Integration Data Source Code

Fig 5. Selection Data

Fig 6. Outlier Data

Fig 8. Support Vector Regression Steps

Fig 9. Source Code Linear Kernels
b. RBF Kernels
Next, the SVR is built with the Radial Basis Function (RBF) kernel, and the scores for x_train, y_train and for x_test, y_test are computed. The score obtained is 0.2662 for x_train, y_train and 0.3210 for x_test, y_test, as shown in Figure 10.

Fig 10. Source Code RBF Kernels
c. Polynomial Kernels
Next, the scores for x_train, y_train and for x_test, y_test are computed for the polynomial kernel. The score obtained is 0.8131 for x_train, y_train and 0.6856 for x_test, y_test, as shown in Figure 11 below:

Fig 11. Source Code Polynomial Kernels
To make the comparison clearer, the results can be seen in the table:

Fig 12. Prediction Results
The figures above show the prediction results for the next month. More details can be seen below:

Table 1. Comparison of SVR Models
Based on Table 1, it can be concluded that, of the three kernels, the kernel used to make predictions is the polynomial kernel. After the best SVR model is obtained, the model is improved with parameter tuning. Tuning parameters are used to increase the