Assessing the forecasting model ability in measuring the prevention transmission of COVID-19 pandemic: An application of visibility analysis using Inductive logic

Forecasting is an integral approach because it supports informed decisions and data-driven strategies. It is used both to make decisions about current circumstances and to predict future conditions. This study develops a visibility analysis for the COVID-19 outbreak, drawing on the case of Indonesia. The topic has received limited attention, especially with respect to assessing forecasting models. The issue is that predicted results are questionable, or cannot be trusted, when no visibility analysis is applied to the forecasting model. Visibility analysis is required to assess a model's ability to forecast future events. To address this issue, this paper introduces an analysis of visibility error, based on a different concept of error during model development, to support decisions on transmission prevention measures. The study applies a statistical approach to assess the visibility error of forecasting performance, in order to determine how far ahead forecasts can be made when deciding on transmission prevention measures for the COVID-19 pandemic. We also develop a time-variant visibility error using inductive logic. The results indicate how many test data are required to perform forecasting work for a given forecasting model specification. In conclusion, this study develops a statistical formula for identifying the largest time horizon of a forecasting model: N = V + 2. The developed model can assist stakeholders in forecasting transmission and in making decisions on prevention measures during the COVID-19 pandemic.


Introduction
Coronavirus disease 2019 (COVID-19) was identified at the end of 2019 in Wuhan, China. The disease has since spread to more than two hundred countries and territories. Because of this spread, the World Health Organization (WHO) declared COVID-19 a Public Health Emergency of International Concern on January 30, 2020, and characterized it as a pandemic on March 11, 2020. Indonesia is one of the countries affected. Two residents of West Java were confirmed infected on March 2, 2020, after contact with a Japanese man from Malaysia who had earlier tested positive (Beritasatu, 2020). The spread of the outbreak shows an exponential trend. On July 19, 2020, confirmed cases in Indonesia overtook those in China, with more than 86 thousand infected people (Ulum, 2020). The Indonesian government announced several interventions to break the chain of transmission, namely physical distancing and large-scale social restrictions, which limit physical activity between people. Since COVID-19 is transmitted through human contact, droplets over short distances, and aerosols over long distances (Moriyama et al., 2020), the crucial factor in the outbreak's spread is activity involving physical interaction (Dalton et al., 2020). Moreover, not all COVID-19 patients show symptoms (Rettner, 2020).
In order to implement social distancing, several local governments in Indonesia have applied the Health Minister's Decree No. HK.01.07/MENKES/248/2020 on implementing large-scale social restrictions (PSBB). Jakarta and West Java, initially the regions with the highest numbers of cases in Indonesia, pioneered this policy, followed by other regions where high case numbers were confirmed, such as East Java, South Sulawesi, and Central Java. PSBB is a restriction on human activities in a particular region, province, or city suspected of being infected with COVID-19. The restrictions cover limits on the movement of people and goods, the closure of schools and workplaces, and restrictions on religious activities, activities in public places, transport, and access to and from the region. During the PSBB period, rapid mass tests are conducted to find the centers of spread so that regions with high case numbers can strictly apply the health protocols.
COVID-19 cases in Indonesia continue to increase daily. Several provinces sometimes see sharp surges in daily new cases, with differing causes. For example, the province of East Java recorded more than 300 new cases per day, sometimes because of accumulated test results from the previous day at ITD UNAIR. According to a report from the deputy governor of East Java, staff members had been infected, so ITD UNAIR operated at limited capacity (Setkab, 2020a). In West Java, the number of infected people showed a downward trend after the PSBB was introduced. Still, cases increased again on July 8, 2020, after a new cluster was discovered at SECAPA AD (the Army's officer candidate school) (Arbi, 2020). In addition, according to the governor of South Sulawesi, a sharp increase in confirmed cases there was caused by rapid mass testing (New Desk, 2020).
As a result of the sharp increase in cases, hospitals might not be able to accept critical patients because of a lack of beds. Patients with no symptoms or mild symptoms therefore isolate independently without being hospitalized, so that patients in severe condition can be treated in hospital. This aims to reduce the load on hospital capacity, which could otherwise lead to ineffective hospital services. East Java, a region with a very high case surge, has coped with this well, as shown by the more than one hundred patients who recover per day (Gugus Tugas, 2020). The number of infected people cannot be estimated reliably because few tests are conducted each day. Hospitals therefore might not be able to make optimal technical and non-technical preparations to maximize treatment. Besides, the recovery rate and the death rate are still unpredictable.
COVID-19 has also affected the global economy. The Indonesian economy has been declining since the PSBB was introduced (Zahroh et al., 2020). Hence, the central government announced the new normal system in Indonesia (Pangestika, 2020) on June 25, 2020. However, local governments in several regions have not yet allowed red-zone areas to lift the PSBB directive and replace it with the new normal, or new way of life. Under the new normal, some previously prohibited activities and closed venues resumed in order to revive the economy, which had declined during the PSBB period. Regardless, the application of this directive must be accompanied by compliance with the health standards set by the WHO: residents must continue to follow health protocols, such as wearing masks, keeping their distance from other people, washing hands, and avoiding crowds (Setkab, 2020b). According to a WHO report, none of the six provinces in Java meets the WHO criteria for entering the new normal (Nurbaiti, 2020). Many people misinterpret the new normal as the freedom to live as they did before the outbreak, without regard for the health protocols. Thus, the number of active cases keeps increasing, with no declining trend or pandemic peak, as shown in Fig. 1. To flatten this curve, public awareness needs to be improved by imposing stricter sanctions, so that the chain of transmission can be broken. In addition, treating infected patients with more effective methods would also reduce the number of active cases.
Almost six months have now passed since the pandemic began. The positive and active cases of Covid-19 in Indonesia have consistently increased. On August 29, 2020, 3,308 new positive cases were recorded. What is the future trend of the outbreak? Answering this requires sophisticated scientific forecasting. But the problem that arises is: how far into the future can forecasting be carried out? This paper focuses on answering this crucial question.
Forecasts are often made for several consecutive periods ahead. For example, Consensus Economics collects quarterly forecasts for up to six quarters into the future. Yet longer-term forecasts in particular may not provide any information beyond that contained in the long-run mean of the target variable. Such forecasts are deemed uninformative. It is therefore desirable to determine the largest horizon for which informative forecasts can be made. Unfortunately, only descriptive methods have been available for this purpose until now (Breitung and Knüppel, 2018). Forecasters therefore require a statistical method to determine the farthest prediction horizon for which forecasts are still useful. The choice of the largest forecast horizon appears to be an important issue for decision-makers. For example, several central banks, such as the Federal Reserve and the European Central Bank, have increased the horizon of their macroeconomic projections in recent years. However, it is uncertain whether the additional forecasts for these longer horizons provide useful information, given that their forecast error variance could be as high as the target variable's unconditional variance.

Literature Review
While statistical tools for such an assessment, based on the approach of Parzen (1981), have been proposed in the literature, formal statistical tests have not been available. This paper aims to provide such tests, allowing us to determine the most useful forecast horizon.
Visibility is the ability of a model to predict the future. The accuracy and precision of a forecasting method vary with how far into the future it is asked to predict, which makes visibility an important indicator for forecasting models. Visibility is expressed through forecast performance statistics (Hidayat, 2016). For example, a visibility of two periods ahead indicates that a model is eligible to forecast the subsequent two periods. Without forecast performance statistics we have no basis for forecasting the future, because without them forecasters could claim to predict the future for an unlimited period. The study of visibility therefore keeps a forecasting activity from becoming speculation. Without visibility analysis, we are left to wonder whether a prediction is still valid; the question is the validity period of a forecasting model.
Kim and Schwartz (2013) stated that the forecast horizon and the number of observations affect the accuracy of forecasting models. Forecast accuracy has been shown to depend on the country of origin and the forecasting horizon (Witt and Witt, 1995). The frequency of observations within the forecasted time frame (i.e., the length of the time interval between observations) might determine which tourism forecasting method should be used (Li et al., 2005). For example, quarterly or monthly time-series data call for a seasonal autoregressive integrated moving average (SARIMA) model more than annual data do, because of the seasonality and periodicity of the data (Li et al., 2006). The longer the forecasting horizon, that is, the farther the forecast reaches into the future, the more uncertainty is involved. It follows that short-term forecasts tend to be more accurate than long-term forecasts (Ahlburg, 2001; Calantone et al., 1987; Li et al., 2005; Witt and Witt, 2005). Therefore, some forecasting methods are more appropriate than others for certain forecasting horizons.
A small mean absolute percentage error (MAPE) denotes a more accurate forecast (Hsu and Wang, 2008). MAPE is the most popular accuracy measure because it is well suited to comparisons across different time series, different time intervals, and different volumes (Lee et al., 2008). MAPE is dimensionless, allowing meaningful comparisons across cases of varying size (for example, among countries where the number of tourist arrivals differs significantly). When used to compare different studies, it eliminates the need to calculate an "effect-size" statistic. In comparative analyses of tourism forecasting studies, MAPE has been selected as the dependent variable because, as the most popular accuracy measure, it allows the largest number of studies to be included.
Gordon (1995) focused on determining which forecast horizon is most consistent with the observed behavior of firms in the U.S. manufacturing sector, and developed a model to evaluate the posterior probabilities associated with the various forecast horizon lengths. Let $\theta$ denote the model's parameters, and let $T$ and $Z$ represent the conditioning sets related to the theory and the data, respectively. The quantity of interest is the conditional probability

$$p(s \mid \theta, T, Z) \tag{13}$$

Suppose that the analyst knows the value of $\theta$ with certainty and that this value is consistent with the regularity conditions. For a given forecast horizon $s$ and a given $\tilde{z}_t$, the vector of forecasted values of prices and discount rates in period $t$, expressions for the period-$t$ decision rules $\hat{K}(s, \theta, \tilde{z}_t, K_{t-1})$, $\hat{L}(s, \theta, \tilde{z}_t, K_{t-1})$, and $\hat{y}(s, \theta, \tilde{z}_t, K_{t-1})$ can be derived. We can now apply Bayes' rule to the conditional probability (13) to obtain:

$$p(s = i \mid \theta, T, Z) = \frac{p(u \mid s = i, \theta, T)\,\pi_i}{\sum_{j=0}^{S} p(u \mid s = j, \theta, T)\,\pi_j} \tag{14}$$

where $\pi_i$ is the prior probability that $s = i$.
Short-term forecasting remains an integral component of public ambulance emergency preparedness. Henderson and Mason (2004) highlighted that quantitative decision processes are becoming increasingly important in providing public accountability for the resource decisions that have to be made. Any solution to such problems requires careful balancing of political, economic, and social objectives. Grekousis and Liu (2019) emphasized that predicting emergency medical services (EMS) demand is crucial for saving people's lives.

Material and Methods
Eq. (1) represents the distinguishing feature of the generated model.
where Yi is the observation of the response variable in period i, and εi is a random experimental error, i = 1, 2, 3, ..., n. The indices i and n distinguish this model from existing models, because εi is different for each forecasting period. In current models, εi is derived from model building, for example from a model built to 1% forecasting accuracy. But can we believe that such a model gives 1% accuracy in predicting future values? Obviously not. There is no reason to think that past accuracy is a reliable indicator of the accuracy of future forecasts. That is why, in this study, εi is identified at the time of model testing. The time-variant error model is proposed to address the problem of overfitting, which receives little attention in forecasting.
Overfitting arises when a model is fitted very closely to historical data in order to make εi as small as possible. The forecasting model in this research introduces n as the length of visibility, which is a new issue in forecasting. The forecasting model for predicting Y, a function f(R), is identified at the time of model building. Eq. (6), and forecasting models in general, assume that the error is obtained during model building, while the errors in Eq. (1) are obtained during model testing.

Model Development: Visibility with Error Time-Variant Approach
The concept of time-variant visibility error is illustrated by the following time series data: t-3, t-2, t-1, t0 form the data set used for model building, and t1, t2, t3, t4 are the four data points used for testing.
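The split described above can be sketched in code. This is a minimal illustration, not the authors' procedure: it assumes that each visibility error is estimated from three rolling forecast origins (one reading of the "at least 3 data" principle stated in the conclusion), and it uses a hypothetical naive last-value forecaster, `naive_forecast`, as a stand-in for the unspecified model f(R). The data values are invented for illustration.

```python
def naive_forecast(history, steps):
    """Hypothetical placeholder model: repeat the last observed value.

    `steps` is unused here; a real model would depend on the horizon.
    """
    return history[-1]


def visibility_errors(build, test, v, n_origins=3):
    """Absolute v-step-ahead errors from rolling forecast origins.

    Origin k uses build + test[:k] as history and targets test[k + v - 1].
    Assumption: n_origins = 3 errors are needed per horizon.
    """
    needed = v + n_origins - 1  # equals V + 2 when n_origins == 3
    if len(test) < needed:
        raise ValueError(f"need {needed} test points, got {len(test)}")
    errors = []
    for k in range(n_origins):
        history = build + test[:k]
        target = test[k + v - 1]
        errors.append(abs(target - naive_forecast(history, v)))
    return errors


build = [10.0, 12.0, 13.0, 15.0]  # t-3 .. t0 (illustrative values)
test = [16.0, 18.0, 21.0, 22.0]   # t1 .. t4 (illustrative values)
errs = visibility_errors(build, test, v=2)
print(errs)
```

Under these assumptions, four test points are exactly enough for three two-step-ahead errors (targets t2, t3, t4), and asking for a three-step horizon with the same test set raises an error.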

Eligibility visibility of two future periods
The data t-3, t-2, t-1, t0 are used to forecast two periods ahead (the value at t2), and the forecast is compared with the actual value at t2 to obtain the forecast error. The first two-step-ahead forecast performance is formulated as follows:

Results
The formula is N = V + 2, where N is the number of test data required and V is the visibility, or validity period. The formula requires N = 3 test data to obtain a one-step-ahead forecast, that is, one validity period. We need 5 test data to forecast three steps ahead. Table 1 displays the minimum number of test data necessary for different forecasting periods, while Fig. 2 depicts the linear relationship between the two variables.
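The arithmetic of the formula can be tabulated with a short sketch. The function name `min_test_points` and the range of horizons shown are illustrative choices, not taken from Table 1.

```python
def min_test_points(visibility):
    """Minimum number of test data N for a visibility of V periods (N = V + 2)."""
    if visibility < 1:
        raise ValueError("visibility must be at least one period")
    return visibility + 2


# Print the requirement for horizons of one to six periods ahead.
for v in range(1, 7):
    print(f"V = {v}: N = {min_test_points(v)}")
```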

Discussion
The Singapore University of Technology and Design (SUTD) predicted that the Covid-19 outbreak in Indonesia would end on September 23, 2020 (CNBC Indonesia, May 6, 2020), and later revised this prediction to October 28, 2020 (Yuliasari, 2020). SUTD uses daily updated COVID-19 data from Our World in Data to estimate the pandemic life cycle by regressing the susceptible-infected-recovered (SIR) model of COVID-19 (Luo, 2020). The prediction was announced in May 2020, so SUTD made predictions for the next 4 months using only 2 months of data, starting from March. Based on the formula developed here, that amount of data is insufficient to predict the next 4 months. How can SUTD have 4-month visibility when it has Covid data for only two months? The rule is N = V + 2. If SUTD wants 4 months of visibility, it must have at least N = 6 months of data. Based on this formula, the predictions made by SUTD are considered inaccurate because the time horizon SUTD uses does not comply with N = V + 2. The inaccuracy of the SUTD prediction will become apparent at the end of September 2020.
Hidayat (2020) has made 13 weekly predictions of Covid-19 cases for Indonesia, and all of them were accurate. His predictions do not extend beyond the horizon dictated by the N = V + 2 formula. His interval prediction for August 30 to September 5 is 181.326 - 194.390, and the prediction for September 6 to September 12 is 195.20 - 223.217. Transmission in September is therefore still consistently increasing, and there is no sign of reaching a peak, because a peak can only be identified after a significant decrease in positive cases over a significant period of time.
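The check applied to the SUTD case can be written as the inverse of the formula, V = N - 2. The sketch below follows the discussion in counting the available data in months; the function names are hypothetical.

```python
def max_visibility(n_data):
    """Largest horizon V supported by n_data periods, from N = V + 2."""
    return max(n_data - 2, 0)


def horizon_supported(n_data, horizon):
    """True if the requested horizon does not exceed the supported visibility."""
    return horizon <= max_visibility(n_data)


# SUTD case as described in the text: 2 months of data, 4-month horizon.
print(horizon_supported(n_data=2, horizon=4))
# With 6 months of data, the same horizon meets the rule.
print(horizon_supported(n_data=6, horizon=4))
```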

Conclusion
In conclusion, the induction process has yielded a new technique for deciding the validity period of a forecasting method. Forecasters cannot forecast without knowing the forecasting performance statistics of their model. One important finding of this framework is a formula for determining how many test data are required to perform forecasting work for a specific forecast length. The formula is N = V + 2, where N is the number of test data required and V is the visibility, or validity period. According to the formula, we need N = 3 to obtain a one-step-ahead forecast, or one validity period, and 6 test data to forecast four steps ahead. The formula N = V + 2 is developed using the concept of time-variant visibility error. This concept assumes a different error for each forecast period.
The consequence of Eq. (1) is that we must have two sets of data: the first for model building, and the second for testing the model and creating the visibility error. We propose a time-variant error model to overcome a little-known problem in forecasting called overfitting. As explained above, overfitting is related to a model-building approach that fits historical data too closely. The principle used is that at least 3 data points are needed to form an average: one data point may be due to chance, two leave too many possibilities open, and three are enough to capture the pattern. This study does not discuss how many data points are optimal for forming an average error, so three is only the minimum amount of data needed to make predictions. Note that forecasting performance statistics are a necessary condition, not a sufficient one, for establishing the substance of prediction accuracy. The errors in each forecast period are called visibility errors. Worst-case scenario formulas for calculating visibility errors should be developed to complete the formula N = V + 2, so that interval forecasts with a lower bound error and an upper bound error can be obtained.

Table 1
Minimum Data Test Required to Forecast