Machine learning algorithms for forecasting and backcasting blood demand data with missing values and outliers: A study of Tema General Hospital of Ghana

The major challenge in managing blood products lies in the uncertainty of blood demand and supply, with a trade-off between shortage and wastage, especially in most developing countries. Reliable demand predictions can therefore be imperative for planning voluntary blood donation campaigns and improving blood availability in Ghanaian hospitals. However, most historical datasets on blood demand in Ghana are contaminated with missing values and outliers due to improper database management systems. Consequently, time-series prediction can be challenging, since data cleaning can affect models' predictive power. Moreover, the power of machine learning (ML) models to backcast lost data of past years is understudied compared to their forecasting abilities. This study therefore compares K-Nearest Neighbour regression (KNN), Generalised Regression Neural Network (GRNN), Neural Network Auto-regressive (NNAR), Multi-Layer Perceptron (MLP), Extreme Learning Machine (ELM) and Long Short-Term Memory (LSTM) models, via a rolling-origin strategy, for forecasting and backcasting blood demand data with missing values and outliers from a government hospital in Ghana. KNN performed best in forecasting blood demand (12.55% error), whereas ELM achieved the highest backcasting power (19.36% error). Future studies can employ ML algorithms as a good alternative for backcasting past values of time-series data.


Background of the study
Blood supply chain (BSC) encapsulates all the processes of collecting, testing, processing, storing and distributing blood and its components from donor to recipient patient (Osorio, Brailsford, & Smith, 2015; Stanger, Wilding, Yates, & Cotton, 2012). Blood cannot be manufactured artificially, and supply depends on voluntary human donors whose behaviour cannot be easily predicted. Given that blood demand and supply are uncertain, and blood is a limited resource and a perishable product with a short shelf life, blood inventory is challenging to manage. The major challenge in managing blood products lies in the uncertainty of blood demand and supply, with a trade-off between shortage and wastage (Stanger et al., 2012). Blood demand is increasing rapidly among developed countries, such that 10 out of every 100 people in hospital require some blood product (The Lancet, 2005). Among developing countries, approximately 100,000 deaths are recorded yearly due to blood shortage and common blood-borne infections from unscreened donated blood (The Lancet, 2005). Thus, reliable demand predictions can be imperative in planning blood donation campaigns and improving blood availability. Empirical studies on the management of blood products are dominated by the use of operational research methodologies (such as critical path supply chain analysis, optimisation, Markov chains and queueing theory) and other simulation techniques (Katsaliaki & Brailsford, 2007; Kopach, Balcıoğlu, & Carter, 2008; Van Dijk, Haijema, Van Der Wal, & Sibinga, 2009). Time-series approaches for blood demand forecasting are now gaining popularity in the BSC literature (Fortsch & Khapalova, 2016; Pereira, 2004). The precision of predictions from time-series models of blood demand has been considered an essential determinant of donor recruitment decision-making and inventory control (Shih & Rajendran, 2019). Consequently, a large amount of data is expected to be collected over time to predict blood demand trends and identify underlying patterns such as seasonality through smoothing and time-series decomposition (Pierskalla, 2005; Rajendran & Ravindran, 2019).
Moreover, the extant literature on the blood supply chain has been dominated by studies based on blood centres in developed countries such as the USA (Katsaliaki & Brailsford, 2007; Pereira, 2004), Canada (Kopach et al., 2008), Finland (Rytilä & Spens, 2006) and Estonia (Alloja, Espenberg, & Kiivet, 2012), among others. However, empirical studies on blood demand forecasting using real-life time-series data in less developed countries are insufficient in the literature. In a less developed country like Ghana, blood products are in high demand for varying health issues. Due to the frequent blood shortages at the major hospitals in Ghana, the family-replacement system has been instituted as the replenishment policy to compel blood donation. However, this replenishment system does not ensure the sufficiency of blood supply. Hence, investigating the requests for blood in developing countries must be approached as a public health issue. Therefore, this paper is based on the premise that a more robust demand predictive method would be a useful planning tool for voluntary blood donation campaigns to supplement supplies and improve blood availability. In Ghana, blood banking and transfusion facilities have poor database management systems, predominantly due to a lack of computers, computer programs, and training (Nene, Olayemi, & Asamoah-Akuoko, 2015). Thus, most of these health facilities recorded data by hand in books and other documents, without proper filing of patients' medical records, until recently, when the need to collect blood data more efficiently for easy retrieval became a matter of national concern (Nene et al., 2015; Teviu, Aikins, Abdulai, Sackey, Boni, Afari, & Wurapa, 2012). Consequently, most of the past years' data are lost completely, and available data are usually contaminated with missing values and outliers due to genuine recording errors. In such instances, the fidelity of forecasts is greatly affected due to bias in model parameter estimates and the outliers' carry-over effect on the point forecasts (Chen & Liu, 1993).
Backcasting, a term coined by Robinson (1990), was previously proposed as a planning method in many fields of study, including urban planning and resource management, to investigate future mechanisms from the present by moving backwards in time to determine what policy measures would be required to reach those future outcomes (Bibri, 2018; Phdungsilp, 2011). In the context of time-series modelling, the backcasting methodology can be adapted to predict lost or unavailable data of past years based on current values by forecasting backwards in time. Thus, backcasting and forecasting are distinguished only by the direction of prediction: the former retropolates past values from future values in reverse time, and the latter extrapolates future values from historical data. In blood supply chain management, backcasting can be used to ensure the coherence of blood-related data by predicting lost data of past years for short time-series, to aid in exploring relevant long-term patterns or trends. This can provide additional insights into inventory management policies, improve data quality for systematic analysis, and support an effective assessment of the imbalance between blood supply and demand. Classical time-series models cannot capture the non-systematic changes of outliers due to their exogenous effects (López-de Lacalle, 2016). Different types of outliers may exist in time-series data, such as additive outliers, innovation outliers, level shifts, temporary changes and seasonal level shifts (Ahmar et al., 2018; Chen & Liu, 1993; López-de Lacalle, 2016). An automatic detection method for the different types of outliers in a given series has been developed (Chen & Liu, 1993), and a more robust data imputation method utilising Kalman smoothing on state-space models has been developed for univariate time-series data (Durbin & Koopman, 2012; Hinich, 2005; Moritz & Bartz-Beielstein, 2017). Before time-series modelling, data pre-processing is thus vital for model selection, parameter estimation, and predictions.
Machine learning (ML) algorithms for forecasting blood demand series have been developed recently, and their performance against classical time-series models has been explored in previous studies (Bontempi, Ben Taieb, & Le Borgne, 2013; Papacharalampous, Tyralis, & Koutsoyiannis, 2018; Shih & Rajendran, 2019). Nonetheless, in the context of blood supply management, no known study has explored these ML algorithms' performance in backcasting or reverse forecasting blood demand in instances where past years' data are completely unavailable, compared to their forecasting abilities. Moreover, the literature on time-series backcasting in general is insufficient relative to forecasting. Thus, investigating the predictive power of machine learning models for backcasting past time-series values is also imperative. Furthermore, in evaluating the performance of ML algorithms and traditional time-series models, most studies employ a fixed-origin strategy instead of a rolling-origin evaluation with different training and test partitions. However, according to Tashman (2000), the efficiency and reliability of out-of-sample evaluations of time-series models can be improved by adopting rolling-origin strategies. Hence, an out-of-sample rolling strategy is a better approach for generalisability and model comparison. This study thus assesses both forecasts and backcasts from the time-series models using a proposed rolling-origin strategy. For additional information about the essence of rolling-origin assessment approaches in forecasting studies, see the work of Tashman (2000).
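The rolling-origin idea can be sketched as follows. This is a minimal pure-Python illustration (the study itself works in R): a naive last-value forecaster stands in for the fitted ML models, MAPE is used as an illustrative error measure, and the demand numbers are made up for the example.

```python
# Rolling-origin evaluation: the forecast origin rolls forward one step
# at a time, the model is refit on each training window, and an
# out-of-sample error is recorded at every origin.

def naive_forecast(train, horizon):
    """Forecast `horizon` steps ahead by repeating the last observation."""
    return [train[-1]] * horizon

def mape(actual, predicted):
    """Mean absolute percentage error."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rolling_origin_errors(series, min_train, horizon, forecaster=naive_forecast):
    """Refit at each origin, forecast `horizon` steps, and record the error."""
    errors = []
    for origin in range(min_train, len(series) - horizon + 1):
        train = series[:origin]                      # all data up to the origin
        test = series[origin:origin + horizon]       # held-out future values
        preds = forecaster(train, horizon)
        errors.append(mape(test, preds))
    return errors

demand = [120, 130, 125, 140, 138, 150, 145, 160, 155, 170, 165, 180]
errs = rolling_origin_errors(demand, min_train=6, horizon=2)
print(len(errs))   # one out-of-sample error per rolling origin
```

Because every origin yields a separate test partition, even a short series produces a distribution of out-of-sample errors rather than a single fixed-origin score, which is what makes the subsequent statistical comparison of models possible.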
In this paper, we compare six different machine learning algorithms (K-Nearest Neighbour regression, Generalised Regression Neural Network, Neural Network Auto-regressive, Multi-Layer Perceptron, Extreme Learning Machine, and Long Short-Term Memory models) for forecasting and backcasting blood demand data with missing values and outliers from a major hospital in Ghana. The standard ARIMA model is used as the baseline model for comparison. We use an out-of-sample rolling-origin strategy with model re-calibration for the forecasting and backcasting assessments and statistically compare the prediction error distributions between the time-series models under investigation. Before the time-series modelling, we perform data cleaning by adopting existing state-of-the-art algorithms in two stages: (i) data imputation by Kalman smoothing and (ii) automatic outlier detection and correction. We further justify that the blood demand series under study is time-reversible for backcasting, using Teraesvirta's neural network linearity test, the Lobato-Velasco normality test of stationary processes, and surrogate data testing.

Paper structure
This study is structured in six main sections. Apart from the background of the study, the paper is organised as follows. First, Section 2 reviews relevant literature on inventory management policies and forecasting methods in BSC, and also presents the main research gap and contribution of this paper. Then, the empirical data, the data preprocessing techniques adopted, the proposed backcasting scheme, the time-series models (ARIMA and six ML models) used for blood demand predictions, and a summary of the empirical evaluation methods are presented in Section 3. After that, Section 4 presents the main results of the study, and Section 5 discusses the key findings. Finally, we conclude and provide some recommendations for future studies in Section 6.

Impact of blood demand forecasting on BSC inventory management policies
The improvement of the BSC (outlined in Fig. 1) has become a matter of global concern due to the challenges posed in healthcare delivery. For example, inventory management of blood products is crucial in hospital operations due to their perishable nature and demand uncertainty. Also, the short shelf-life of blood components such as platelets can lead to significant wastage or expiration if excess platelets are kept. Nonetheless, the inaccessibility of blood products when needed may result in loss of lives and other critical conditions. Thus, recent studies have focused on exploring possible solutions for blood demand uncertainty, inventory management and inadequate human resources (Fortsch & Khapalova, 2016). A cost-effective hospital inventory management system should reduce wastage while sustaining the required level of service. Blood shortages in hospitals are predominantly due to increasing demand and constrained or irregular supply (Rajendran & Srinivas, 2020). Additionally, the demand for blood is rising regularly due to the ageing population and the increasing number of accidents, amongst other reasons (Ali, Auvinen, & Rautonen, 2010). Consequently, blood inventory management should incorporate the challenge of exploring the trade-off between wastage and shortage (Rajendran & Srinivas, 2020). Accurate blood demand forecasting during the development of inventory policies thus aids in lowering costs, reducing blood wastage, and conserving limited resources (Fortsch & Khapalova, 2016).
Several generalisable inventory policies exist for perishable products and blood products, developed from inventory management models under demand-supply uncertainty (Dillon, Oliveira, & Abbasi, 2017; Li, Chiang, Down, & Heddle, 2021; Rajendran & Srinivas, 2020; Shokouhifar, Sabbaghi, & Pilevari, 2021). The policies most frequently employed in BSC to manage inventory levels, in order to bridge the gap between blood shortage and wastage, are periodical policies (Shokouhifar et al., 2021). Other hybrid ordering policies for BSC under demand uncertainty have also been proposed in the literature, and their performances examined (Rajendran & Srinivas, 2020). These inventory policies provide a decision support system for healthcare practitioners to ascertain the best order quantities, given the characteristics of the hospital. In summary, demand forecasting is considered one of the essential aspects of inventory management, and it can be combined with inventory management policies to obtain optimal strategies even for instances of intermittent blood demand (Ramaekers & Janssens, 2014).

Literature on time-series methods for blood demand forecasting
Time-series methods have been proposed and applied in forecasting daily, monthly, and yearly blood demand data at hospitals and other health centres. The time-series models for blood demand prediction in the literature have been dominated by the classical univariate time-series models, including autoregressive integrated moving average (ARIMA) models, Holt-Winters exponential smoothing models and time-series decomposition for both seasonal and non-seasonal data (Pankratz, 2009; Pereira, 2004; Shih & Rajendran, 2019), amongst other methods including logistic regression models (Bosnes, Aldrin, & Heier, 2005). Fortsch and Khapalova (2016) discovered that demand for blood is naturally non-stationary. Therefore, univariate time-series models fitted using the Box-Jenkins methodology based on the autoregressive integrated moving average (ARIMA) class of models are mostly the optimal choice over the naive, exponential smoothing, simple moving average and other classical forecasting models. However, other studies have discovered limitations in the conventional method of identifying the best-specified ARIMA model for forecasting non-seasonal data (Simon, 2007). The conventional best-fit approach does not always guarantee the model's ability to predict future values, nor its predictive performance for previous periods, due to underlying factors of the data (Simon, 2007).
To fit a classical ARIMA model efficiently, essential data features such as seasonality and trends need to be eliminated, if they exist, to achieve a stationary and non-seasonal series. Thus, ARIMA and most classical time-series models are not robust unless these models' parametric assumptions are met. One of the most widely employed models in the presence of seasonality is the seasonal ARIMA (SARIMA) model. Both ARIMA and SARIMA models assume that future values have a linear relationship with current and past values of the series. The ARIMA models also assume that their residuals are independent and identically distributed, with a mean of zero and constant variance, and serially uncorrelated (white noise). Thus, for complex non-linear problems, predictions by either ARIMA or SARIMA may not be adequate. Hence, hybrid models that combine these ARIMA classes of models, or other traditional approaches such as Holt-Winters methods, with computationally intelligent models such as Artificial Neural Networks (ANN) become ideal (Khashei, Bijari, & Hejazi, 2012). Consequently, a different but robust class of time-series models, machine learning algorithms, was developed. A few studies have employed machine learning algorithms such as the ANN to predict future demand for blood, and their performance against the classical time-series models has been considered (Alajrami, Abu-Nasser, Khalil, Musleh, Barhoom, & Naser, 2019; Darwiche, Feuilloy, Bousaleh, & Schang, 2010; Khaldi, El Afia, Chiheb, & Faizi, 2017). Shih and Rajendran (2019) compared the performance of traditional time-series models to ANN and multiple regression via machine learning algorithms on five years of historical data from the Taiwan Blood Services Foundation (TBSF). They discovered that the classical time-series forecasting methods (predominantly the seasonal exponential smoothing method and ARIMA models) outperformed the machine learning (ML) algorithms. Such under-performance of ML algorithms is attributed to the limited data or small sample size used in training these algorithms, which can lead to biased ML predictions (Vabalas, Gowen, Poliakoff, & Casson, 2019). Khaldi et al. (2017), however, found that ANN outperformed ARIMA models for blood demand forecasting, and concluded that these models could be considered a promising approach to forecasting monthly blood demand. Hence, the classical time-series methods are usually considered baseline models for machine learning algorithms.
However, ML algorithms have been considered in the academic literature as suitable alternatives to traditional time-series forecasting models (Makridakis, Spiliotis, & Assimakopoulos, 2018). Nevertheless, minimal evidence about their relative performance (accuracy and computational requirements) is available (Makridakis et al., 2018). Makridakis et al. (2018) compared the post-sample accuracy of Neural Networks (NN), automated ANN and eight classical models for different forecasting horizons. It was revealed that the predictive accuracy of the machine learning algorithms was relatively lower than that of the traditional methods across all examined forecast horizons, and the ML computational requirements were also greater than those of the classical time-series models. Machine learning algorithms for forecasting blood demand data (with linear or non-linear trends) from blood centres in developing countries (including Ghana) are under-studied. Thus, the classical ARIMA model is considered the baseline model for comparing the ML models in this study.

Brief review of ML methods for time-series predictions
Machine learning (ML) is a computational intelligence technique that adopts programmed algorithms to analyse input data and learn from it, via supervised or unsupervised processes, to predict output values within an acceptable range (Wakefield, 2013). ML models have proven to be an excellent alternative to classical statistical models for forecasting and other research problems (such as regression and classification problems) over the last decade. The neural network (NN) was among the earliest ML algorithms developed. NNs are deep learning methods developed as mathematical models of the brain. They allow complex non-linear relationships between the response variable and its predictors. Over time, other ML algorithms for regression-type problems were developed, including decision trees, random forests, gradient boosting machines and support vector machines (Alpaydin, 2020; Friedman, Hastie, Tibshirani, et al., 2001). There have been parallel efforts towards empirical validation of existing models, model comparison and the development of new ones. These ML developments provide modellers with a wide range of choices and a comprehensive understanding of the strengths and weaknesses of available models for different forecasting problems. The ML algorithms for time-series forecasting that dominate the literature are the Multi-Layer Perceptron (MLP), Bayesian Neural Network (BNN), Radial Basis Functions (RBF), Generalised Regression Neural Network (GRNN), K-Nearest Neighbour regression (KNN), CART regression trees (CART), Support Vector Regression (SVR), Recurrent Neural Network (RNN), Long Short-Term Memory neural network (LSTM), automated Artificial Neural Network (AANN) and Gaussian Processes (GP) (Ahmed, Atiya, Gayar, & El-Shishiny, 2010; Makridakis et al., 2018). For an extensive description of these ML models, see the works of Ahmed et al. (2010), Alpaydin (2020) and Hastie, Tibshirani, and Friedman (2009). In a large-scale comparative study, MLP, BNN, GP, KNN and GRNN were found to be the top five machine learning algorithms among the ML models examined, based on 18-month one-step-ahead forecasts (Ahmed et al., 2010). However, differences in their performance could be affected by the choice of predictive error measures (Makridakis et al., 2018) and the type of evaluation strategy employed (Tashman, 2000). The performance of ML models may also vary depending on the historical time-series data and the underlying hyper-parameters. Other studies have also explored the predictive performance of ML algorithms such as the Neural Network Auto-Regressive model (NNAR) and the Extreme Learning Machine (ELM). ML models for time-series forecasting have evolved significantly over the years and are considered good competitors to the classical models within the forecasting community.

Research gap and contribution
Forecasters, policymakers, time-series users and practitioners usually need long time-series data for model assessment, policy analysis, and investigating underlying trends and patterns to aid decision-making (Caporin & Sartore, 2011). Nevertheless, such long time-series data may not always be obtainable at the expected frequency, with the needed temporal or spatial coverage of the data unavailable for previous years, especially among blood centres in developing countries. The data unavailability may be due to several reasons, such as human error, human failure, software corruption, destruction of data storage, lack of the required data processors, or data collection having started only recently, amongst others. Consequently, it is imperative for time-series users to estimate or predict the lost data of past years. The process of predicting data of past years is referred to as backcasting or reverse forecasting. It is possible to backcast, or forecast in reverse time, for relevant time-series provided the series is strictly stationary and time-reversible (Caporin & Sartore, 2011; Sharifdoust & Mahmoodi, 2013). Unfortunately, to the best of our knowledge, no known study has explored the backcasting power of the existing state-of-the-art ML algorithms for predicting unavailable blood demand data of past years. Hence, this study attempts to bridge this gap by investigating the forecasting and backcasting power of a few selected ML algorithms (KNN, NNAR, GRNN, MLP, ELM and LSTM) for short time-series data on blood demand with lost past values, using an out-of-sample rolling-origin evaluation strategy for model comparison. The current study can further be expanded by time-series modellers to other general modelling problems with short series contaminated with missing values and outliers. Its findings can help policymakers in the management of the blood supply chain (as previously discussed), and it provides publicly accessible adaptive programming code in R (with the help of existing packages) to forecast or backcast any time-series data using the underlying time-series models via the proposed rolling-origin strategy.
The key contributions of this study are:
i. To demonstrate the application of ML algorithms and a classical time-series model for backcasting lost data of past years for any time-reversible stationary series.
ii. To establish that the direction of prediction (forecasting or backcasting) can affect the predictive performance of ML models for given time-series data.
iii. To justify the need for an out-of-sample rolling-origin strategy in comparing existing time-series models' forecasting and backcasting power for short time-series data (with missing values and outliers).

Materials and methods
As previously discussed, the study primarily compares the predictive performance of six different ML algorithms for forecasting and backcasting blood demand in a major hospital in Ghana, based on a short series with missing values and outliers. In addition, this section discusses the data source, the data preprocessing procedures, the proposed backcasting scheme based on time-reversibility, and the time-series models for predicting blood demand via a rolling-origin strategy, and summarises the empirical evaluation methods. All analyses were carried out using R statistical software version 3.6.3 (R Core Team, 2019). The R scripts developed (for the forecast and backcast schemes via a rolling-origin strategy) and the empirical data used for the study can be found, for reproducibility of results, at: https://github.com/twumasiclement/TimeSeries-Forecasting.

Sources of data
Monthly data on blood demand, collected from January 2013 to September 2020, were obtained from Tema General Hospital (TGH) for this study. TGH only recently acquired an Enzyme-Linked Immunosorbent Assay (ELISA) machine to screen its blood samples (Ghana Health Service, 2014). Therefore, most data on blood demand before the acquisition of the ELISA machine remain fragmented and difficult to retrieve (coupled with missing values). From 2014, the hospital started recording the aggregate blood quantity demanded every month at its Health Information Department. Thus, the time-series data for this study are short in length, with missing values and outliers (assumed to be due to genuine recording errors). The Institutional Review Board of the Ghana Institute of Management and Public Administration (GIMPA) Business School gave ethical approval for the study. TGH's management also gave authorisation to access the secondary dataset on aggregated monthly blood demand.

Data preprocessing
The data cleaning or preprocessing was done in two stages using existing state-of-the-art algorithms. The blood demand series had 14% missing values (n = 13) and 11 gaps, with an average gap size of 1.182. In the first preprocessing stage, missing-value imputation was done by adopting Kalman smoothing on a basic structural model (BSM), as implemented in the imputeTS R package (see Moritz & Bartz-Beielstein, 2017). For the state-space form of the BSM, see the work of Durbin and Koopman (2012). Fig. 2 shows the time-series plot of the blood demand series with highlighted missing values and the imputed data at the first data-preprocessing stage. During the second stage of data preprocessing, an automatic outlier detection and adjustment algorithm was used to identify additive outliers (AO) and other outlier types (innovation outliers, level shifts, temporary changes and seasonal level shifts) based on the best ARIMA model errors (proposed by Chen & Liu, 1993). Additionally, we used the tsoutliers package in R (implemented by López-de Lacalle, 2016) for the automatic ARIMA data correction based on the blood demand series imputed by Kalman smoothing on the BSM (from the first stage of data preprocessing). As a result, the automatic ARIMA data correction detected only three additive outliers and no other outlier type based on the best ARIMA model errors (Fig. 3). The final corrected data were used for the subsequent time-series modelling. The Augmented Dickey-Fuller (ADF) unit root test was used to investigate the stationarity of the corrected data. The ADF test revealed that the corrected blood demand series was stationary at lag 0 (test statistic = −4.305, p-value ≤ 0.01).
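The idea behind Kalman-smoothing imputation can be sketched as follows. This is a deliberately simplified pure-Python illustration on a local-level state-space model (random walk plus observation noise); the study uses the richer basic structural model of the imputeTS R package, and the variances q and r below are illustrative assumptions rather than estimated values.

```python
# Kalman-smoothing imputation sketch: a forward Kalman filter followed by a
# backward (Rauch-Tung-Striebel) smoothing pass on a local-level model.
# Missing observations (None) skip the update step, and the smoothed level
# fills them in.

def kalman_impute(series, q=1.0, r=1.0):
    """Impute None entries of a numeric series; q, r are state/observation variances."""
    n = len(series)
    m_f, p_f = [0.0] * n, [0.0] * n
    m = next(x for x in series if x is not None)   # initialise at first observation
    p = 1e6                                        # diffuse initial variance
    for t in range(n):
        p = p + q                                  # predict (random-walk transition)
        if series[t] is not None:                  # update only when observed
            k = p / (p + r)                        # Kalman gain
            m = m + k * (series[t] - m)
            p = (1 - k) * p
        m_f[t], p_f[t] = m, p
    # Backward smoothing pass combines information from both directions.
    m_s = m_f[:]
    for t in range(n - 2, -1, -1):
        g = p_f[t] / (p_f[t] + q)                  # smoother gain
        m_s[t] = m_f[t] + g * (m_s[t + 1] - m_f[t])
    return [x if x is not None else m_s[t] for t, x in enumerate(series)]

demand = [120, 130, None, 140, None, None, 150, 145]
filled = kalman_impute(demand)
```

Because the smoother runs in both directions, imputed values are informed by observations on either side of a gap, which is why this family of methods handles the short internal gaps in the demand series better than simple forward-filling.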

Time-series forecasting and backcasting models
The principles of time-reversibility and backcasting based on forecasts are discussed in Section 3.3.1. This section also presents the time-series models under investigation (the classical non-seasonal ARIMA as the baseline model and six ML algorithms) used to forecast and backcast blood demand (Sections 3.3.2-3.3.8).

Time reversibility and linearity
Definition 3.1. Let {Y_t, t ∈ Z} be a strictly stationary series or stochastic process. Y_t is said to be time-reversible (TR) if, for any positive integer n and all integers t_1, t_2, . . ., t_n, (Y_t1, Y_t2, . . ., Y_tn) ∼D (Y_−t1, Y_−t2, . . ., Y_−tn), where ∼D represents equality in distribution (Sharifdoust & Mahmoodi, 2013).
Definition 3.2. The stochastic process {Y_t, t ∈ Z} is said to be a linear process if it can be represented as Y_t = Σ_{j=−∞}^{∞} ψ_j Z_{t−j}, where {Z_t, t ∈ Z} is a sequence of non-degenerate independent and identically distributed random variables with mean 0 and constant variance σ², and Σ_{j=−∞}^{∞} |ψ_j| < ∞ (Sharifdoust & Mahmoodi, 2013).
Remark. Hence, to show that a strictly stationary (blood demand) series is time-reversible, and thus that backcasting with the ARIMA and ML models is possible, we need to establish that the stationary series follows a Gaussian process, comes from a linear stochastic process, and is linear in the mean. Once this assumption of time-reversibility holds, backcasting with the time-series models is done by forecasting with the series reversed in time. The Lobato-Velasco normality test of a stationary process (LV test) was used to test the Gaussianity of the stationary blood demand series using the nortsTest package in R (Asael Alonzo Matamoros & Nieto-Reyes, 2020). In addition, Teraesvirta's neural network linearity test, which tests whether there is linearity in the mean given the series (Constantino, Garcia, & Sawitzki, 2020; Teräsvirta, Lin, & Granger, 1993), and surrogate data testing, which tests whether the series is a Gaussian linear process (Constantino et al., 2020; Schreiber & Schmitz, 2000), were employed at α = 5%. For the surrogate data testing, the time-symmetry statistic (T), which measures the asymmetry of a stationary time-series (Y_t of length n) under time-reversibility, is computed as a standardised third moment of the lagged differences of the series (adapted from Kantz & Schreiber, 2004). Surrogate data of size 2K/α − 1, for K ≥ 1, are generated using a phase-randomisation procedure (based on the Fast Fourier Transform of Y_t) for a two-sided test (Constantino et al., 2020; Raeth & Monetti, 2009). The null hypothesis (that the series is a Gaussian linear process) for a two-sided surrogate test is rejected if the test statistic for the original series Y_t is significantly different from the test statistics for all the 2K/α − 1 generated surrogate datasets.
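One common form of the time-asymmetry statistic described above can be sketched as follows. This is an illustrative pure-Python version using the standardised third moment of lagged differences; the exact normalisation used by surrogate-testing software in the Kantz and Schreiber formulation may differ, and the two toy series are invented for the example.

```python
# Time-asymmetry statistic: the standardised third moment of lagged
# differences. For a time-reversible series the statistic is close to
# zero; a clearly non-zero value signals time-irreversibility.

def time_asymmetry(y, lag=1):
    diffs = [y[t + lag] - y[t] for t in range(len(y) - lag)]
    m2 = sum(d ** 2 for d in diffs) / len(diffs)   # second moment of differences
    m3 = sum(d ** 3 for d in diffs) / len(diffs)   # third moment of differences
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0

symmetric = [0, 1, 2, 1, 0, 1, 2, 1, 0]   # rises and falls at the same rate
sawtooth = [0, 1, 2, 3, 0, 1, 2, 3, 0]    # slow rise, abrupt fall
print(time_asymmetry(symmetric))           # ~0: consistent with reversibility
print(time_asymmetry(sawtooth))            # clearly non-zero: irreversible
```

In the surrogate-testing procedure, the same statistic is computed for the observed series and for each phase-randomised surrogate; the null of a Gaussian linear (hence reversible) process is rejected when the observed value lies outside the surrogate distribution.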
The LV normality test revealed that the stationary blood demand series was normally distributed (test statistic = 0.078, df = 2, p-value = 0.962). Teraesvirta's neural network linearity test revealed that the corrected blood demand series is linear in its mean (χ² = 0.647, df = 2, p-value = 0.723). The symmetrical distribution of the time-reversibility statistic (based on the generated surrogates and the observed series) and the normally distributed demand series are shown in Fig. 4. The time-reversibility statistic for the corrected blood demand series is not significantly different from the test statistics of all generated surrogate data (p-value > 0.05). Therefore, the blood demand series comes from a Gaussian linear process and is thus time-reversible, so blood demand backcasting with time-series models is possible. Moreover, according to Sharifdoust and Mahmoodi (2013), if a stationary linear model is assumed, then a test for time-reversibility is usually equivalent to a test for Gaussianity of the stationary series. In conclusion, the time-reversibility assumption is a necessary condition for the proposed backcasting scheme using the time-series models, since backcasts are obtained by reversing forecasts backwards in time (as previously justified by Sharifdoust & Mahmoodi, 2013). This is because, if the stationary series is time-reversible, the joint probabilities of the forward and reverse state sequences from the time-series models are the same for all sets of time increments.
Proposition 3.1. Let {Y_t : t = m + 1, m + 2, . . ., m + n}, with m, n ∈ Z (where 0 ≤ m < n < ∞), be the current series of stationary data with frequency f, assumed time-reversible. Suppose F_t^h = (Ŷ_{m+n+1}, Ŷ_{m+n+2}, . . ., Ŷ_{m+n+h})′ are the one-step-ahead forward predictions for h horizons based on a time-series model fitted to the series reversed in time; then the backcasts of the past values for h (h ≥ 1) horizons are obtained by reversing F_t^h in time, based on the frequency (f) of the series.
The pseudo-code summarising the proposed backcasting scheme in this study (in accordance with Proposition 3.1) is given by Algorithm 1.
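The reverse-forecast idea behind the scheme can be sketched in a few lines of Python, using a deliberately simple least-squares AR(1) forecaster as a stand-in for the paper's models (the function names and the toy series are illustrative assumptions, not the authors' R implementation):

```python
import numpy as np

def ar1_forecast(series, h):
    """One-step-ahead recursive forecasts from a least-squares AR(1) fit
    (a stand-in for any of the time-series models in the study)."""
    y = np.asarray(series, dtype=float)
    x, z = y[:-1], y[1:]
    phi = np.dot(x - x.mean(), z - z.mean()) / np.dot(x - x.mean(), x - x.mean())
    c = z.mean() - phi * x.mean()
    preds, last = [], y[-1]
    for _ in range(h):
        last = c + phi * last            # iterate the fitted recursion
        preds.append(last)
    return np.array(preds)

def backcast(series, h):
    """Backcast the h values preceding the series: reverse the series in
    time, forecast forward on the reversed series, then reverse the
    forecasts back (the core of the proposed scheme)."""
    forward_on_reversed = ar1_forecast(np.asarray(series)[::-1], h)
    return forward_on_reversed[::-1]

y = np.array([5.0, 4.0, 3.0, 2.0, 1.0])  # toy series with an exact AR(1) structure
print(backcast(y, 3))                    # prints [8. 7. 6.]
```

On this toy series the reversed data follow Y_t = 1 + Y_{t−1} exactly, so the three backcasts (8, 7, 6) extend the pattern backwards in time, which is what step 7 of the pseudo-code prescribes.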

Non-seasonal ARIMA model
Suppose Y_t is the observed series at time t; then the full ARIMA(p, d, q) model, which integrates both the AR(p) and MA(q) models, is given as (Hyndman & Athanasopoulos, 2018):

∆Y_t = h_0 + α_1 ∆Y_{t−1} + · · · + α_p ∆Y_{t−p} + θ_1 ϵ_{t−1} + · · · + θ_q ϵ_{t−q} + ϵ_t,

where ∆Y_t is the differenced series (at differencing order d, the minimum non-negative order such that the series Y_t is stationary), h_0 ∈ R is a constant and ϵ_t is the random error at time t; p and q are the autoregressive and moving-average non-negative orders respectively, and α_i ∈ R for i = 1, 2, . . ., p and θ_j ∈ R for j = 1, 2, . . ., q are the regression coefficients. The auto.arima function in R was used to fit an automatic non-seasonal ARIMA model (Hyndman et al., 2020). A function in R was developed to implement the automatic ARIMA via a rolling-origin strategy for model comparison (for both forecast and backcast schemes).

Algorithm 1: Pseudo-code for the proposed backcasting scheme using time-series models
Input: Current series of stationary data with frequency f.
1 Test for stationarity of the original series (Y_t), or transform the series into a stationary form by differencing if necessary.
2 Test the Gaussianity of the stationary series using an appropriate test.
3 Determine whether the stationary series is time-reversible by justifying whether the stationary series is a linear Gaussian stochastic process.
4 if the stationary series is time-reversible then
5     Fit the required time-series model based on the training set or current series.
6     Generate one-step-ahead forecasts for h ≥ 1 horizons from the fitted model.
7     Reverse the forward predictions or forecasts in time (based on the frequency of the series) to obtain the required backcasts or predicted values of past years.
8     Assess the predictive performance of the model based on the validation set and an appropriate error measure (preferably via a rolling-origin evaluation strategy for a short series).
9 else
10    Backcasting is not possible with the time-series model, given a time-irreversible series.
11 end

K-nearest neighbour regression model
K-Nearest Neighbour regression (KNN) is a non-parametric ML model that makes predictions based on the target output of the K nearest neighbours of a given query point (Ahmed et al., 2010). Now, suppose the ith training instance consists of a vector of υ features f^i = (f_{i1}, f_{i2}, . . ., f_{iυ}) which describes the instance. Given a new instance with known features γ = (γ_1, γ_2, . . ., γ_υ) but an unknown target vector, these new features are used to find its K most similar training instances from their feature vectors and the Euclidean distance d(f, γ) (a similarity metric) such that:

d(f^i, γ) = √( Σ_{t=1}^{υ} (f_{it} − γ_t)² ).    (5)

Thus, the closest K training data points are selected, and a prediction is obtained based on the average of the target output values for these K points (Martínez, Frías, Charte, & Rivera, 2019a; Martínez, Frías, Pérez, & Rivera, 2019b). That is, assuming the targets (y_1, y_2, . . ., y_K) are those of the K nearest neighbours of a new instance x, the prediction (ŷ) is given by ŷ = (1/K) Σ_{i=1}^{K} y_i. Larger values of the hyperparameter K lead to a smoother fit at the cost of a higher bias, and vice versa for smaller values of K; the choice of K thus controls the bias-variance trade-off (Taieb, Bontempi, Atiya, & Sorjamaa, 2012). There are several methods for determining the value of K, including but not limited to: (i) using a heuristic or rule-of-thumb that sets K to the square root of the number of training examples (Martínez et al., 2019a); (ii) using cross-validation or optimisation techniques that estimate the optimal K by minimising a forecast error measure such as MAPE (Hyndman & Koehler, 2006); and (iii) using local learning techniques that adaptively set K by minimising the Leave-One-Out (LOO) error statistic (described by Taieb et al., 2012) or, preferably, the Prediction Sum of Squares (PRESS) statistic, which produces forecasts with stochastic characteristics most similar to the training samples (Allen, 1974; Bontempi, Birattari, & Bersini, 1999). In this study, KNN was fitted using the tsfknn package in R (Martínez et al., 2019a). For ease of computation with the tsfknn package, the optimal values of the hyperparameter K were iteratively tuned over a range of possible values (from 1 to the maximum number of training samples) via a cross-validation procedure that minimises the MAPE statistic (defined in Eq. (20)) based on a multiple-step-ahead strategy known as Multiple-Input-Multiple-Output (MIMO) (described in Bontempi & Taieb, 2011; Taieb et al., 2012). In addition, we developed a function in R to implement KNN via a rolling-origin strategy for model comparison (for both forecast and backcast schemes) by adapting the tsfknn R package.
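A from-scratch Python sketch of the MIMO-style KNN forecast described above (the study used the tsfknn R package; the window construction and toy data here are illustrative assumptions):

```python
import numpy as np

def knn_forecast(series, h, lags=3, k=2):
    """MIMO-style KNN time-series forecasting: each training instance is a
    window of `lags` consecutive values with the next h values as its target;
    the forecast averages the targets of the k nearest windows."""
    y = np.asarray(series, dtype=float)
    feats, targets = [], []
    for i in range(len(y) - lags - h + 1):
        feats.append(y[i:i + lags])                     # feature window
        targets.append(y[i + lags:i + lags + h])        # next h values
    feats, targets = np.array(feats), np.array(targets)
    query = y[-lags:]                                   # most recent window
    dist = np.sqrt(((feats - query) ** 2).sum(axis=1))  # Euclidean distance, Eq. (5)
    nearest = np.argsort(dist)[:k]                      # indices of the k closest
    return targets[nearest].mean(axis=0)                # average of their targets

y = [10, 12, 11, 13, 12, 14, 13, 15, 14, 16]
print(knn_forecast(y, h=2, lags=3, k=2))                # prints [13.5 15.5]
```

For the toy series, the two windows nearest to the latest window [15, 14, 16] are [14, 13, 15] and [13, 12, 14], so the two-step forecast is the average of their targets, (14, 16) and (13, 15), giving (13.5, 15.5).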

Generalised Regression Neural Network model
Generalised Regression Neural Network (GRNN) is a type of ANN model which is more robust for non-linear fitting and was originally developed to complement the ARIMA model (BuHamra, Smaoui, & Gabr, 2003; Leung, Chen, & Daouk, 2000). GRNN is a type of Radial Basis Function (RBF) network characterised by fast single-pass learning, and consists of hidden layers of RBF neurons. Now, given a training set with υ training patterns (x_1, x_2, . . ., x_υ) and corresponding targets (y_1, y_2, . . ., y_υ), GRNN makes predictions for an input x based on the weighted average of the target outputs of the training data points in the neighbourhood of x, using some kernel function (Ahmed et al., 2010). Thus, the prediction (ŷ) for data point x is estimated such that:

ŷ = Σ_{j=1}^{υ} w_j y_j,

where y_j is the target output for training data point x_j, and the weights w_j produced by the hidden layers are given by

w_j = Ker_f(∥x − x_j∥ / ρ) / Σ_{i=1}^{υ} Ker_f(∥x − x_i∥ / ρ),

where Ker_f is a kernel function with bandwidth ρ and ∥·∥ is the Euclidean norm; the Gaussian kernel of the form Ker_f(u) = e^{−u²/2}/√(2π) is typically used, and the normalised weights denote the training pattern contributions to the final output. The bandwidth ρ determines the smoothness of the fit: a very large ρ results in predictions close to the mean of the training targets (with similar weights for all points), while a small ρ assigns significant weights to training targets closer to the input vector. The GRNN model was fitted with the smoothing parameter ρ automatically tuned via optimisation using the tsfgrnn package in R (Francisco Martinez, 2019). Also, a function was developed in R to implement GRNN via a rolling-origin strategy for model comparison (for both forecast and backcast schemes).
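The kernel-weighted-average prediction and the role of the bandwidth ρ can be sketched in Python (a toy illustration under assumed data, not the tsfgrnn implementation):

```python
import numpy as np

def grnn_predict(x_train, y_train, query, rho=1.0):
    """GRNN prediction: a kernel-weighted average of the training targets,
    with a Gaussian kernel of bandwidth rho controlling smoothness."""
    x_train = np.asarray(x_train, dtype=float)
    dist = np.linalg.norm(x_train - query, axis=1)   # Euclidean distances
    w = np.exp(-((dist / rho) ** 2) / 2.0)           # Gaussian kernel weights
    w = w / w.sum()                                  # normalised contributions
    return float(np.dot(w, y_train))

x_train = [[1.0], [2.0], [3.0], [4.0]]
y_train = [2.0, 4.0, 6.0, 8.0]
small = grnn_predict(x_train, y_train, np.array([2.0]), rho=0.1)    # local fit
large = grnn_predict(x_train, y_train, np.array([2.0]), rho=100.0)  # ~ global mean
print(round(small, 2), round(large, 2))              # prints 4.0 5.0
```

With a small bandwidth the prediction at x = 2 collapses onto the nearest training target (4.0); with a very large bandwidth every point receives a similar weight and the prediction approaches the mean of the targets (5.0), exactly the behaviour described above.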

Neural Network Auto-Regressive model
Neural Network Auto-Regressive model (NNAR) is also a type of artificial neural network (ANN) which allows the modelling of complex non-linear relationships between input and output variables, but uses lagged values of the current series as inputs for model fitting (Thoplan, 2014). For seasonal data with frequency s, the seasonal NNAR model is denoted NNAR(p, P, k)_s, where the parameters p, P and k denote the trend autoregressive order, the seasonal trend autoregressive order and the number of nodes in the hidden layer, respectively (Hyndman & Athanasopoulos, 2018). The non-seasonal model is denoted NNAR(p, k). More generally, NNAR(p, P, k)_s has inputs (y_{t−1}, y_{t−2}, . . ., y_{t−p}, y_{t−s}, y_{t−2s}, . . ., y_{t−Ps}) and k neurons or nodes in the hidden layer. Eliminating the hidden layer (i.e. setting k = 0), NNAR(p, P, 0)_s is analogous to the seasonal ARIMA(p, 0, 0)(P, 0, 0)_s model, and NNAR(p, 0) is similar to an AR(p) model but with non-linear functions (Faraway & Chatfield, 1998). The NNAR model is a feedforward neural network, in which each layer of neurons receives its inputs from the previous layer; consequently, the outputs of the nodes in one layer are the inputs to the next layer. Now, suppose y_{t−1} = (y_{t−1}, y_{t−2}, . . ., y_{t−p}) is the input vector for a single-hidden-layer NNAR(p, k) model with k hidden nodes at time t − 1; then the (non-linear) relationship between the model output (y_t) at time t and the inputs (y_{t−1}) at time t − 1 has the following mathematical representation:

y_t = Σ_{i=1}^{k} v_i ϕ( b_i + Σ_{j=1}^{p} w_{i,j} y_{t−j} ) + ϵ_t,

where b_i is the hidden layer bias, w_{i,j} is the weight corresponding to the jth (j = 1, 2, . . ., p) input in hidden node i, v_i is the output weight for hidden node i, ϵ_t is the series error (assumed to be homoscedastic and possibly normally distributed), and ϕ is a logistic sigmoid function of the form

ϕ(z) = 1 / (1 + e^{−z}).    (10)

The hidden layer bias and weights are numerically estimated from the data (using a backpropagation algorithm and a cost function); thus, the weights have no closed form and no meaningful interpretation. In this study, a non-seasonal NNAR(p, k) model with a single hidden layer was fitted using the forecast package in R (Hyndman et al., 2020). A function in R was also developed to implement NNAR via a rolling-origin strategy for model comparison (for both forecast and backcast schemes). The best average number of hidden nodes was determined over different possible values.

Multi-layer perceptron model
A Multi-Layer Perceptron (MLP), like the NNAR model, is a feedforward ANN class that utilises supervised learning via backpropagation for non-linear prediction of a stationary time-series (Rosenblatt, 1961; Rumelhart, Hinton, & Williams, 1985). MLP consists of at least three layers of nodes: an input layer, a hidden layer and an output layer, with hidden nodes (i), weights (w) and a transfer function ϕ (the logistic sigmoid given by Eq. (10)). Suppose x is an input vector of training samples of length ϑ; then the prediction of the network output (ŷ_i) for hidden node i is given as (analogous to the NNAR output prediction):

ŷ_i = ϕ( b_i + Σ_{j=1}^{ϑ} w_{i,j} x_j ),

where b_i is the hidden layer bias, w_{i,j} is the weight corresponding to the jth input x_j (j = 1, . . ., ϑ) for hidden node i, and k is the number of hidden nodes. The hidden layer bias and weights are adjusted via gradient descent, based on corrections that minimise the error over the entire output. A major drawback of MLP is its high computational training time. The model is trained in three stages: a forward pass, calculation of the error or loss, and a backward pass. An automated MLP was fitted using the nnfor package in R (Kourentzes, 2017). A 5-fold cross-validation was used to automatically choose the number of hidden nodes and the differencing order of the training samples. A function was created in R to implement MLP via a rolling-origin strategy for model comparison (for both forecast and backcast schemes).

Extreme Learning Machine
Extreme Learning Machine (ELM) is also a single-hidden-layer feedforward neural network, but one that requires no gradient-based backpropagation optimisation: the weights connecting the inputs to the hidden layer and the hidden layer biases are assigned randomly, and only the output weights are estimated, via the Moore-Penrose generalised inverse (unlike ML algorithms such as NNAR or MLP); hence, it is generally faster computationally (Huang, Zhu, & Siew, 2006; Jayaweera & Aziz, 2018). The mathematical model for a single-layer ELM with input vector (training samples) x and output y is given as

y = Σ_{i=1}^{k} β_i g( ω_i^⊤ x + b_i ),

where k is the number of hidden nodes, ϑ is the length of the input vector, b is the hidden layer bias vector, ω is the weight vector between the input and hidden layers, β is the weight vector between the hidden layer and the output, and g is a linear activation function (where g can be a penalised linear regression or the traditional linear regression). The main hyperparameter requiring tuning is the number of hidden nodes. An automated ELM was fitted using the nnfor package in R (Kourentzes, 2017). A 5-fold cross-validation was used to automatically select the number of hidden nodes and the differencing order of the training samples. In addition, classical Lasso regression was used as the activation function (g) in estimating the output layer weights. A function was also developed in R to implement ELM via a rolling-origin strategy for model comparison (for both forecast and backcast schemes).
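The single-pass nature of ELM training (random hidden weights, output weights from a pseudo-inverse) can be sketched in Python; plain least squares stands in for the Lasso output estimation used in the study, and the tanh activation, data and layer sizes are illustrative assumptions:

```python
import numpy as np

def fit_elm(X, y, k=20, rng=None):
    """ELM training: random input weights and biases are fixed, the hidden
    layer is computed in one pass, and only the output weights beta are
    estimated via the Moore-Penrose pseudo-inverse (no backpropagation)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    omega = rng.normal(size=(X.shape[1], k))   # random input-to-hidden weights
    b = rng.normal(size=k)                     # random hidden biases
    H = np.tanh(X @ omega + b)                 # hidden layer activations
    beta = np.linalg.pinv(H) @ y               # least-squares output weights
    return omega, b, beta

def predict_elm(X, omega, b, beta):
    return np.tanh(X @ omega + b) @ beta

rng = np.random.default_rng(42)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]            # a smooth target function
omega, b, beta = fit_elm(X, y, k=40, rng=rng)
mse = float(np.mean((y - predict_elm(X, omega, b, beta)) ** 2))
print(mse < 1e-2)                              # fits the training data closely
```

Because only the final linear layer is estimated, the whole fit reduces to one matrix pseudo-inverse, which is the source of ELM's speed advantage noted above.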

Long Short-Term Memory network
Long Short-Term Memory network (LSTM) is a particular type of Recurrent Neural Network (RNN) designed to deal with the vanishing-gradient problem of the classical RNN, i.e. capable of learning long-term dependencies (Gopika, Sowmya, Gopalakrishnan, & Soman, 2020). LSTM comprises memory blocks (or cells) connected through layers. The information in the cells is contained in the cell state c_t and hidden state y_t at time t via mechanisms known as gates, based on two activation functions: the hyperbolic tangent (tanh) and the logistic sigmoid. LSTM requires the series to be in a supervised learning mode with target (y) and predictor (x) variables. Consequently, the series is transformed by lagging (at lag φ), such that the lagged value at time t − φ is used as the input and the value at time t as the target, for a φ-step lagged dataset. Additionally, LSTM has three gates: (i) the input gate (receives new information and the prior predictions as inputs), (ii) the forget gate (eliminates information that is no longer relevant in the cell state), and (iii) the output gate (makes a selection based on the new information and the previous predictions). Thus, the final output is obtained through three steps or mechanisms, described as follows. Step 1 (input gate i_t): Let x_t (current input at time t) and y_{t−1} (previous hidden state at time t − 1) be the input information. The logistic sigmoid layer creates an update filter such that

i_t = ϕ( W x_t + R y_{t−1} + b ),

where W is the input weight vector, R is the recurrent weight vector, b is the hidden layer bias vector, and ϕ is the logistic sigmoid function (given by Eq. (10)).
Step 2 (forget gate f_t): This gate eliminates the information that is no longer relevant in the cell state (c_t), given weights (W_f, R_f and b_f), such that

f_t = ϕ( W_f x_t + R_f y_{t−1} + b_f ),

and the tanh activation layer creates a vector of potential candidates (z_t), given weights (W_z, R_z and b_z), such that

z_t = tanh( W_z x_t + R_z y_{t−1} + b_z ).

A sigmoid layer then creates an update filter (u_t), given weights (W_u, R_u and b_u), such that

u_t = ϕ( W_u x_t + R_u y_{t−1} + b_u ),

and the previous cell state c_{t−1} is updated such that

c_t = f_t ⊗ c_{t−1} + u_t ⊗ z_t.

Step 3 (output gate o_t): The sigmoid layer filters the cell state (c_t) for the output (o_t), given weights (W_o, R_o and b_o), such that

o_t = ϕ( W_o x_t + R_o y_{t−1} + b_o ).

Finally, an element-wise product (⊗) of the scaled cell state and the filtered output gives the new hidden state y_t passed to the next cell, such that

y_t = o_t ⊗ tanh(c_t).

With the help of the open-source R libraries Keras and TensorFlow (Arnold, 2017), a function was created in R to implement LSTM on the empirical time-series data (lagged at φ = 1 and normalised afterwards) via a rolling-origin strategy for model comparison (for both forecast and backcast schemes).
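The gate mechanics can be traced with a scalar toy cell in Python (the weights below are chosen by hand purely to illustrate the long-term memory behaviour; the study itself used Keras/TensorFlow):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, y_prev, c_prev, W, R, b):
    """One LSTM cell step following the gate equations above; W, R, b are
    dicts of input weights, recurrent weights and biases per gate."""
    u = sigmoid(W["u"] * x_t + R["u"] * y_prev + b["u"])  # update filter
    f = sigmoid(W["f"] * x_t + R["f"] * y_prev + b["f"])  # forget gate
    z = np.tanh(W["z"] * x_t + R["z"] * y_prev + b["z"])  # candidate values
    c = f * c_prev + u * z                                 # updated cell state
    o = sigmoid(W["o"] * x_t + R["o"] * y_prev + b["o"])  # output gate
    y = o * np.tanh(c)                                     # new hidden state
    return y, c

# Toy scalar cell: a wide-open forget gate (large b["f"]) and a closed
# update filter (very negative b["u"]) carry the cell state forward almost
# unchanged, regardless of the inputs: the "long-term memory" in action.
gates = ("u", "f", "z", "o")
W = {g: 0.0 for g in gates}
R = {g: 0.0 for g in gates}
b = {"u": -20.0, "f": 20.0, "z": 0.0, "o": 20.0}  # u ~ 0, f ~ 1, o ~ 1
y, c = 0.0, 0.8
for x in [0.3, -0.5, 0.1]:
    y, c = lstm_step(x, y, c, W, R, b)
print(round(float(c), 3))   # prints 0.8: cell state preserved across steps
```

This is exactly the mechanism that lets LSTM avoid the vanishing-gradient problem: when the forget gate stays near 1, the cell state (and its gradient) propagates across many time steps with negligible decay.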
Unlike the other aforementioned R packages for machine-learning forecasting (discussed in Sections 3.3.3-3.3.7), the Keras and TensorFlow libraries do not directly return a forecast object in R when implementing LSTM. Users must therefore take care when fitting LSTM, owing to the numerous modelling steps: time-series differencing, data normalisation, transformation of the data into 3-dimensional arrays, inverse scaling after obtaining predictions from the compiled model, converting the predictions into a forecast object in R, and finally estimating the prediction error (via either the rolling-origin forecast or the backcast strategy).

Empirical evaluation
The forecasting and backcasting power of six ML algorithms (KNN, NNAR, GRNN, MLP, ELM and LSTM) on the corrected blood demand data of short length are primarily examined. The non-seasonal ARIMA model was considered a baseline model for comparison with the ML algorithms. We further investigated whether seasonality exists in the stationary series using the Webel-Ollech overall seasonality test (implemented in Ollech & Webel, 2020). The Webel-Ollech test found no seasonality in the corrected blood demand series (p-value > 0.05).
This finding is also confirmed by the Auto-correlation Function (ACF) and seasonal plots (Fig. 5). Also, the stationary blood demand series was found to be time-reversible; thus, blood demand backcasting (presented by Algorithm 1) with the time-series models was possible under the time-reversibility assumption.
An out-of-sample rolling-origin strategy with model re-calibration is used to assess the forecasting and backcasting power of the time-series models under study at different lead times. Thus, the corrected blood demand series was split into different training and test partitions (determined by the maximal and minimal prediction horizons or lead times) according to the following proposed series-splitting rule. Let H be the maximal prediction horizon (set at H = 18 months), let ν be the minimal prediction horizon (set at ν = 2 months), and let N be the total length of the observed series. For the forecasting and backcasting schemes, different training sets for model fitting are chosen with increasing lengths N − H, N − H − 1, . . ., N − ν, whereas the test sets for cross-validation are chosen at decreasing lead times H, H − 1, . . ., ν. Per this proposed rolling-origin strategy, the expected number of different sets of predictions (forecasts or backcasts), i.e. the number of times an individual model is re-calibrated, is η = H − ν + 1 (here, η = 18 − 2 + 1 = 17 prediction sets). Hence, the overall total number of either forecasts or backcasts for this rolling-origin strategy with maximal prediction horizon H is given by η(H + ν)/2 (the sum of the η lead times as an arithmetic sequence).
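The proposed splitting rule is easy to enumerate; a short Python check for the study's setting (N = 93, H = 18, ν = 2) confirms η = 17 re-calibrations and 170 total predictions per scheme:

```python
# Rolling-origin splits for the study's setting: N = 93 monthly observations,
# maximal horizon H = 18, minimal horizon nu = 2.
N, H, nu = 93, 18, 2

splits = []
for lead in range(H, nu - 1, -1):    # decreasing lead times H, H-1, ..., nu
    train_len = N - lead             # increasing training lengths N-H, ..., N-nu
    splits.append((train_len, lead))

eta = len(splits)                    # number of model re-calibrations
total_preds = sum(lead for _, lead in splits)
print(eta, total_preds, eta * (H + nu) // 2)   # prints 17 170 170
```

The last two printed values agree, verifying the closed form η(H + ν)/2 = 17 × 20 / 2 = 170 against the brute-force sum of the arithmetic sequence of lead times.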
For each of the η = 17 different prediction sets, the respective test set was compared with its corresponding predictions by estimating the mean absolute percentage error (MAPE), given by Eq. (20), at each lead time. A pooled average error statistic (the median MAPE, denoted MdMAPE), given by Eq. (21) (recommended by Tashman & Kruk, 1996), was finally computed for model comparison. Thus, MdMAPEs at the different lead times were obtained for each time-series model during the forecasting and backcasting assessments respectively, resulting in an error distribution. The prediction error distributions of the time-series models were further compared, and significant differences between them were determined using the standard Kruskal-Wallis (KW) median test and Bonferroni Dunn's test of multiple comparisons at the 5% alpha level. For each test set at lead time T = H, H − 1, . . ., ν, let the test series be Y_T = (Y_1, Y_2, . . ., Y_{n_T}) with corresponding predicted values F_T = (F_1, F_2, . . ., F_{n_T}); then the MAPE at lead time T and the median MAPE (MdMAPE) are computed such that:

MAPE_T = (100 / n_T) Σ_{i=1}^{n_T} |Y_i − F_i| / |Y_i|,    (20)

MdMAPE = median( MAPE_H, MAPE_{H−1}, . . ., MAPE_ν ),    (21)

where n_T is the test series length at lead time T.
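A minimal Python rendering of Eqs. (20) and (21) (with toy numbers, not the study's data) shows why the pooled median is robust to one poor prediction set:

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error (Eq. (20)), in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.mean(np.abs(actual - predicted) / np.abs(actual))

# One MAPE per lead time; the pooled MdMAPE (Eq. (21)) is their median.
mapes = [
    mape([100, 110], [90, 121]),          # 10% error
    mape([100], [150]),                   # one badly missed set: 50% error
    mape([100, 100, 100], [100, 90, 110]) # ~6.67% error
]
mdmape = float(np.median(mapes))
print([round(m, 2) for m in mapes], round(mdmape, 2))
```

The single 50% outlier set barely moves the median (10.0), whereas a pooled mean would be dragged up to about 22%, which is the robustness argument behind Tashman and Kruk's recommendation.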

Blood demand forecasts and backcasts from the ARIMA and ML models
The non-seasonal ARIMA model was considered a baseline model for comparison with the ML models for forecasting and backcasting the units of blood demanded via the proposed 18-month rolling-origin strategy, with 17 different sets of predictions and thus 17 different fitted ARIMA models. Fig. 6 plots the blood demand forecasts and backcasts at the maximal 18-month prediction horizon.
Moreover, the six ML algorithms (KNN, GRNN, NNAR, MLP, ELM and LSTM) were also fitted to forecast and backcast the units of blood demanded at the Tema General Hospital via the rolling-origin strategy.Comparative plots of the blood demand forecasts and backcasts for the fitted ML models at the maximal 18-month prediction horizon are given by Figs. 7 and 8, respectively.

Overall predictive performance of the time-series models
The predictive power of the time-series models for forecasting and backcasting the short blood demand series was assessed at different prediction origins based on the rolling-origin evaluation (using Eqs. (20) and (21)). Fig. 9 shows some degree of variability in the prediction error across the different forecast and backcast origins for any given time-series model. For example, KNN generally had the highest forecasting power at forecast origins from May 2019 to July 2020, whereas ELM achieved the best backcasting power at backcast origins from July 2013 to June 2014. The prediction error distributions of ARIMA and the six ML models were also compared, as displayed in Fig. 10. The Kruskal-Wallis median test found a significant difference in the median prediction error of the time-series models for both forecasts (χ² = 39.058, df = 6, p-value < 0.001) and backcasts (χ² = 43.081, df = 6, p-value < 0.001) at the 5% alpha level. In addition, Bonferroni Dunn's test for pairwise comparison of the models' forecast error distributions (Table 1) revealed that KNN's forecast error was significantly different from (and lower than) the forecast errors of the ARIMA model and the other machine learning algorithms (NNAR, GRNN, MLP, ELM and LSTM).
Also, pairwise comparison of the models' backcast error distributions (Table 2) showed that, except against the KNN model, the backcast errors of ELM were significantly different from (and lower than) those of the other time-series models, while the backcast errors of NNAR and KNN also differed significantly at the 5% alpha level. We can therefore conclude from Table 3 that the best forecasting model was KNN (12.55% error), followed by NNAR (17.63% error) and ARIMA (18.40% error). On the other hand, ELM achieved the highest backcasting power (19.36% error), followed by KNN (25.94% error) and LSTM (28.21% error). Nevertheless, apart from KNN, there was no significant difference between the forecasting power of the ML algorithms and the ARIMA model (Table 1). However, ELM significantly outperformed the ARIMA model in backcasting blood demand on the short time-series data. The high variability in the models' predictive power across the different training and test partitions suggests the need to adopt rolling-origin strategies over the fixed-origin strategy for effective and robust model comparisons (Figs. 9 and 10). From Table 3, it can also be inferred that all the time-series models under investigation performed better at forecasting blood demand than at backcasting it on the given empirical data.

Theoretical implications of the study
This study mainly investigated the predictive performance of ML algorithms for forecasting and backcasting blood demand in a major hospital in Ghana, based on a short time-series with unavailable past values and contaminated with missing values and outliers. The classical non-seasonal ARIMA model was used as a baseline model for comparison with the ML models (KNN, NNAR, GRNN, MLP, ELM and LSTM algorithms), and a proposed rolling-origin strategy was employed for model evaluations. We also proposed a backcasting scheme to predict past values of lost data using time-series models, provided the series is strictly stationary and time-reversible, such that its probabilistic structure remains invariant. The backcasts were obtained by reversing forecasts backwards in time; thus, the time-reversibility assumption was considered a necessary condition for the proposed backcasting scheme using time-series models. In addition, robust state-of-the-art data preprocessing algorithms were adopted for data cleaning before the time-series modelling. Thus, the effects of different data preprocessing techniques on the models' predictive errors were not a significant concern in this study.
Previous studies have systematically shown that traditional time-series methods such as ARIMA and exponential smoothing models usually outperform complex machine learning algorithms (Makridakis et al., 2018). Nonetheless, Cerqueira, Torgo, and Soares (2019) found that the prediction deficiency of ML models in such instances is predominantly due to potentially small training sample sizes. ML algorithms generally perform better with large training samples, although appropriate hyperparameter tuning can also improve their predictive performance. Expressly, the two aforementioned studies confirmed that traditional forecasting methods give better predictions than ML models when only up to 144 observations are available for model training (Cerqueira et al., 2019; Makridakis et al., 2018). However, the choice of prediction-error measures, the type of out-of-sample evaluation tests employed, the data preprocessing techniques used, and the direction of prediction (either forecasting or reverse forecasting) could also affect the validity of their conclusion, amongst other factors. In this study, the length of the empirical time-series was 93. It was revealed that the KNN and NNAR models outperformed the classical ARIMA model during blood demand forecasting, although only KNN's prediction errors differed significantly from ARIMA's across the different forecasting origins. On the contrary, ELM had significantly better backcasting power than the ARIMA model. Therefore, the direction of prediction (either forecasting or backcasting) can also affect the performance of machine learning algorithms relative to traditional methods.

Managerial implications of the study
Blood centres are responsible for ensuring that blood products are available to promptly satisfy the demand from hospitals for transfusions to save lives. Reliable demand forecasting is thus the foundation of all blood supply chain planning and decision making (Silva Filho, Cezarino, & Salviano, 2012). However, there are often shortages in blood supply among blood banks and hospitals in Ghana due to low voluntary blood donation and the challenges associated with the family-replenishment strategy (Stanger et al., 2012). The critical factors impeding voluntary blood donation in Ghana are social and cultural beliefs, health risk concerns, and a lack of proper education on sanitary blood donation (Harrington, 2013). Since blood shortages are predominantly caused by rising demand, there is a need to identify time-series models that can efficiently predict blood demand over time from existing empirical data, while avoiding the excess supply that results in blood wastage. Unfortunately, small datasets like the one considered in this study are common, with values for past years either lost or completely unavailable.
Hence, we have demonstrated the application of machine learning algorithms and other time-series models like ARIMA to predict unavailable data of past years and to make future predictions from smaller sets of available data. Nonetheless, this study's main limitation is that the only available data on blood demand at the Tema General Hospital spanned January 2013 to September 2020. With more data, the conclusions might well have differed, both concerning the relative predictive performance of the reference model (ARIMA) and the ML algorithms, and regarding the data preprocessing techniques and their sensitivity to the number of outliers.

Conclusion and recommendation
The study discovered that the Extreme Learning Machine (ELM) and K-Nearest Neighbour regression (KNN) algorithms are effective ML algorithms for predicting past values of unavailable blood demand data for blood centres and hospitals in Ghana via a reverse-forecast or backcast scheme. Time-series backcasting was only possible because the blood demand data was time-reversible and followed a Gaussian linear stochastic process. Even though the KNN and NNAR models outperformed the traditional ARIMA model in predicting future values based on the short time-series data, only KNN had a significantly lower forecasting error than the ARIMA model. The data correction method could significantly affect the predictive outcome of ML algorithms and other classical time-series models. Hence, we recommend that future studies investigate the effects of different data preprocessing techniques on the time-series models' predictive power for short or long series. Furthermore, blood centres in Ghana should acquire proper database management systems to avoid data loss and outliers due to genuine recording errors. Future studies can also employ machine learning algorithms as a good alternative for backcasting past values of different time-series with unavailable data of previous years.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Financial support
No funding was obtained for the study.

Fig. 2 .
Fig. 2. Units of blood demanded from January 2013 to September 2020 with highlighted missing regions (top) and imputed series using Kalman smoothing on BSM (bottom).

Fig. 3 .
Fig. 3. Corrected blood demand series by automatic ARIMA data correction algorithm.

Fig. 4 .
Fig. 4. Plot of time-reversibility statistic for the blood demand series and 199 generated surrogate data (top), and Normal Q-Q plot of the blood demand series (bottom).

Fig. 6 .
Fig. 6. A plot of the 18-month forecasts and backcasts from the fitted ARIMA model.

Fig. 7 .
Fig. 7. Comparative plot of the 18-month blood demand forecasts from the fitted ML models.

Fig. 8 .
Fig. 8. Comparative plot of the 18-month blood demand backcasts from the fitted ML models.

Fig. 9 .
Fig. 9. Comparison between the prediction errors of the time-series models at different forecast (top) and backcast (bottom) origins.

Fig. 10 .
Fig. 10. Comparison of the prediction error distribution between ARIMA and the ML models.
Lemma. Let {Y_t, t ∈ Z} be a strictly stationary linear process such that Y_t = Σ_{k∈Z} ϕ_k Z_{t−k}; then {Y_t, t ∈ Z} is time-reversible if and only if it is a Gaussian process, or there exist t_0 ∈ Z and ρ = ±1 such that ϕ_t = ρ ϕ_{t_0−t} and Z_t is equal in distribution to ρZ_t.

Table 1
P-values of the Bonferroni Dunn's test for pairwise comparison of models' forecast error distributions. A p-value < 0.05 in the ijth position implies the forecast errors of models i and j are significantly different at the 5% alpha level.

Table 3
Pooled average prediction error (median MAPE) for empirical model comparison.