Abstract
Forecast adjustment commonly occurs when organizational forecasters adjust a statistical forecast of demand to take into account factors which are excluded from the statistical calculation. This paper addresses the question of how to measure the accuracy of such adjustments. We show that many existing error measures are generally not suited to the task, due to specific features of the demand data. Alongside the well-known weaknesses of existing measures, a number of additional effects are demonstrated that complicate the interpretation of measurement results and can even lead to false conclusions being drawn. In order to ensure an interpretable and unambiguous evaluation, we recommend the use of a metric based on aggregating performance ratios across time series using the weighted geometric mean. We illustrate that this measure has the advantage of treating over- and under-forecasting even-handedly, has a more symmetric distribution, and is robust.
Empirical analysis using the recommended metric showed that, on average, adjustments yielded improvements under symmetric linear loss, while harming accuracy in terms of some traditional measures. This provides further support for the critical importance of selecting appropriate error measures when evaluating forecasting accuracy. The general accuracy evaluation scheme recommended in the paper is applicable in a wide range of settings, including forecasting for the fashion industry.
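As a rough illustration of the recommended aggregation, the sketch below combines per-series performance ratios with a weighted geometric mean. It assumes that the ratio for each series is the MAE of the adjusted forecast divided by the MAE of the statistical forecast, and that the weights are the numbers of available errors per series. These choices, the function name `weighted_geometric_mean`, and the toy numbers are assumptions made for illustration and are not taken from the chapter.

```python
import math
from typing import Sequence


def weighted_geometric_mean(ratios: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted geometric mean of positive ratios, computed on the log scale."""
    if len(ratios) != len(weights):
        raise ValueError("ratios and weights must have the same length")
    if any(r <= 0 for r in ratios):
        raise ValueError("ratios must be strictly positive")
    total_weight = sum(weights)
    log_mean = sum(w * math.log(r) for r, w in zip(ratios, weights)) / total_weight
    return math.exp(log_mean)


# Hypothetical per-series ratios (assumed: adjusted MAE / statistical MAE).
# The adjustment halves the error for one series and doubles it for the other.
ratios = [0.5, 2.0]
weights = [10, 10]  # assumed: number of available errors per series

avg_ratio = weighted_geometric_mean(ratios, weights)
arithmetic_mean = sum(w * r for r, w in zip(ratios, weights)) / sum(weights)

print(avg_ratio, arithmetic_mean)
```

With these toy inputs, a halving and a doubling of the error cancel out under the geometric mean (the result is 1 up to rounding), whereas the weighted arithmetic mean of the same ratios is 1.25 and would suggest a deterioration.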
This paper is an extended version of Davydenko and Fildes [8], which appeared in the International Journal of Forecasting.
Notes
1. The formula corresponds to the software implementation described by Hyndman and Khandakar [19].
References
Alkhazaleh AMH, Razali AM (2010) New technique to estimate the asymmetric trimming mean. J Probab Stat 2010 http://www.hindawi.com/journals/jps/2010/739154/cta/
Andrews DF, Bickel PJ, Hampel FR, Huber PJ, Rogers WH, Tukey JW (1972) Robust estimates of location. Princeton University Press, Princeton
Armstrong S (1985) Long-range forecasting: from crystal ball to computer. Wiley, New York
Armstrong JS, Collopy F (1992) Error measures for generalizing about forecasting methods: empirical comparisons. Int J Forecast 8:69–80
Armstrong JS, Fildes R (1995) Correspondence on the selection of error measures for comparisons among forecasting methods. J Forecast 14(1):67–71
Chatfield C (2001) Time-series forecasting. Chapman & Hall, Boca Raton
Davydenko A, Fildes R (2008) Models for product demand forecasting with the use of judgmental adjustments to statistical forecasts. Paper presented at the 28th international symposium on forecasting (ISF2008), Nice. Retrieved on 20 Sep 2013 from http://www.forecasters.org/submissions08/DAVYDENKOANDREYISF2008.pdf
Davydenko A, Fildes R (2013) Measuring forecasting accuracy: the case of judgmental adjustments to SKU-level demand forecasts. Int J Forecast 29(3):510–522
Diebold FX (1993) On the limitations of comparing mean square forecast errors: comment. J Forecast 12:641–642
Fildes R (1992) The evaluation of extrapolative forecasting methods. Int J Forecast 8(1):81–98
Fildes R, Goodwin P (2007) Against your better judgment? How organizations can improve their use of management judgment in forecasting. Interfaces 37:570–576
Fildes R, Goodwin P, Lawrence M, Nikolopoulos K (2009) Effective forecasting and judgmental adjustments: an empirical evaluation and strategies for improvement in supply-chain planning. Int J Forecast 25(1):3–23
Fleming PJ, Wallace JJ (1986) How not to lie with statistics: the correct way to summarize benchmark results. Commun ACM 29(3):218–221
Franses PH, Legerstee R (2010) Do experts’ adjustments on model-based SKU-level forecasts improve forecast quality? J Forecast 29:331–340
Goodwin P, Lawton R (1999) On the asymmetry of the symmetric MAPE. Int J Forecast 15(4):405–408
Hill M, Dixon WJ (1982) Robustness in real life: a study of clinical laboratory data. Biometrics 38:377–396
Hoover J (2006) Measuring forecast accuracy: omissions in today’s forecasting engines and demand-planning software. Foresight Int J Appl Forecast 4:32–35
Hyndman RJ (2006) Another look at forecast-accuracy metrics for intermittent demand. Foresight Int J Appl Forecast 4(4):43–46
Hyndman RJ, Khandakar Y (2008) Automatic time series forecasting: the forecast package for R. J Stat Softw 27(3)
Hyndman R, Koehler A (2006) Another look at measures of forecast accuracy. Int J Forecast 22(4):679–688
Kolassa S, Schutz W (2007) Advantages of the MAD/MEAN ratio over the MAPE. Foresight Int J Appl Forecast 6:40–43
Makridakis S (1993) Accuracy measures: theoretical and practical concerns. Int J Forecast 9:527–529
Marques CR, Neves PD, Sarmento LM (2000) Evaluating core inflation indicators. Working paper 3-00, Economics Research Department, Banco de Portugal
Mathews B, Diamantopoulos A (1987) Alternative indicators of forecast revision and improvement. Mark Intell 5(2):20–23
McCarthy TM, Davis DF, Golicic SL, Mentzer JT (2006) The evolution of sales forecasting management: a 20-year longitudinal study of forecasting practice. J Forecast 25:303–324
Mudholkar GS (1983) Fisher’s z-transformation. Encyclopedia Stat Sci 3:130–135
Sanders N, Ritzman L (2004) Integrating judgmental and quantitative forecasts: methodologies for pooling marketing and operations information. Int J Oper Prod Manage 24:514–529
Spizman L, Weinstein M (2008) A note on utilizing the geometric mean: when, why and how the forensic economist should employ the geometric mean. J Legal Econ 15(1):43–55
Syntetos AA, Boylan JE (2005) The accuracy of intermittent demand estimates. Int J Forecast 21(2):303–314
Trapero JR, Fildes RA, Davydenko A (2011) Non-linear identification of judgmental forecasts at SKU-level. J Forecast 30(5):490–508
Trapero JR, Pedregal DJ, Fildes R, Weller M (2011) Analysis of judgmental adjustments in presence of promotions. Paper presented at the 31st international symposium on forecasting (ISF2011), Prague
Tversky A, Kahneman D (1974) Judgment under uncertainty: heuristics and biases. Science 185:1124–1130
Wilcox RR (1996) Statistics for the social sciences. Academic, San Diego
Wilcox RR (2005) Trimmed means. Encyclopedia Stat Behav Sci 4:2066–2067
Zellner A (1986) A tale of forecasting 1001 series: the Bayesian knight strikes again. Int J Forecast 2:491–494
Appendix 1 Alternative Representation of MASE
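According to Hyndman and Koehler [20], for the scenario when forecasts are made from varying origins but with a constant horizon (here taken as one), the scaled error $q_{i,t}$ is defined as (see Note 1)

\[
q_{i,t} = \frac{e_{i,t}}{\mathrm{MAE}_i^b}, \qquad \mathrm{MAE}_i^b = \frac{1}{l_i - 1} \sum_{j=2}^{l_i} \left| Y_{i,j} - Y_{i,j-1} \right|,
\]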
where $\mathrm{MAE}_i^b$ is the MAE from the benchmark (naïve) method for series $i$, $e_{i,t}$ is the error of a forecast being evaluated against the benchmark for series $i$ and period $t$, $l_i$ is the number of elements in series $i$, and $Y_{i,j}$ is the actual value observed at time $j$ for series $i$.
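Let the mean absolute scaled error (MASE) be calculated by averaging the absolute scaled errors $|q_{i,t}|$ across time periods and time series:

\[
\mathrm{MASE} = \frac{1}{\sum_{i=1}^{m} n_i} \sum_{i=1}^{m} \sum_{t \in T_i} \left| q_{i,t} \right|,
\]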
where $n_i$ is the number of available values of $e_{i,t}$ for series $i$, $m$ is the total number of series, and $T_i$ is a set containing time periods for which the errors $e_{i,t}$ are available for series $i$.
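Then,

\[
\mathrm{MASE} = \frac{1}{\sum_{i=1}^{m} n_i} \sum_{i=1}^{m} \sum_{t \in T_i} \frac{\left| e_{i,t} \right|}{\mathrm{MAE}_i^b} = \frac{1}{\sum_{i=1}^{m} n_i} \sum_{i=1}^{m} n_i \, \frac{\mathrm{MAE}_i}{\mathrm{MAE}_i^b},
\]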
where $\mathrm{MAE}_i = \frac{1}{n_i} \sum_{t \in T_i} |e_{i,t}|$ is the MAE for series $i$ for the forecast being evaluated against the benchmark.
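The result of this appendix, that MASE equals an arithmetic mean of the per-series ratios MAE_i / MAE_i^b weighted by n_i, can be checked numerically. The following is a minimal sketch, assuming one-step-ahead errors and an in-sample naïve benchmark; the function names (`benchmark_mae`, `mase_direct`, `mase_weighted`) and the toy data are hypothetical and not part of the chapter.

```python
from typing import List, Sequence


def benchmark_mae(actuals: Sequence[float]) -> float:
    """MAE of the naive forecast: mean of |Y_j - Y_{j-1}| over j = 2..l."""
    diffs = [abs(actuals[j] - actuals[j - 1]) for j in range(1, len(actuals))]
    return sum(diffs) / len(diffs)


def mase_direct(actuals: List[Sequence[float]], errors: List[Sequence[float]]) -> float:
    """Average the absolute scaled errors over all periods and all series."""
    total, count = 0.0, 0
    for y, e in zip(actuals, errors):
        scale = benchmark_mae(y)
        total += sum(abs(x) / scale for x in e)
        count += len(e)
    return total / count


def mase_weighted(actuals: List[Sequence[float]], errors: List[Sequence[float]]) -> float:
    """Arithmetic mean of MAE_i / MAE_i^b weighted by n_i."""
    weighted_sum, total_weight = 0.0, 0
    for y, e in zip(actuals, errors):
        n_i = len(e)
        mae_i = sum(abs(x) for x in e) / n_i
        weighted_sum += n_i * mae_i / benchmark_mae(y)
        total_weight += n_i
    return weighted_sum / total_weight


# Toy data (hypothetical): two series of different scale and length.
actuals = [[10.0, 12.0, 11.0, 15.0], [100.0, 90.0, 120.0]]
errors = [[1.0, -2.0, 0.5], [10.0, -20.0]]

direct = mase_direct(actuals, errors)
weighted = mase_weighted(actuals, errors)
print(direct, weighted)
```

Both functions return the same value up to floating-point rounding, which illustrates that MASE aggregates the per-series relative MAEs with an arithmetic rather than a geometric mean.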
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Davydenko, A., Fildes, R. (2014). Measuring Forecasting Accuracy: Problems and Recommendations (by the Example of SKU-Level Judgmental Adjustments). In: Choi, TM., Hui, CL., Yu, Y. (eds) Intelligent Fashion Forecasting Systems: Models and Applications. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39869-8_4
DOI: https://doi.org/10.1007/978-3-642-39869-8_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39868-1
Online ISBN: 978-3-642-39869-8
eBook Packages: Business and Economics, Business and Management (R0)