Abstract
In stochastic optimization, the large number of scenarios required to faithfully represent the underlying uncertainty is often a barrier to solving problems numerically in reasonable time. This motivates the scenario reduction problem: find a smaller subset of scenarios that reduces the numerical complexity while keeping the error at an acceptable level. In this paper we propose a novel and computationally efficient methodology to tackle the scenario reduction problem for two-stage problems when the error to be minimized is the implementation error, i.e. the error incurred by implementing the solution of the reduced problem in the original problem. Specifically, we develop a problem-driven scenario clustering method that produces a partition of the scenario set. Each cluster of the partition contains a representative scenario that best reflects the optimal value of the objective function within that cluster. We demonstrate the efficiency of our method by applying it to two challenging two-stage stochastic combinatorial optimization problems: the two-stage stochastic network design problem and the two-stage facility location problem. In comparisons with alternative clustering methods and Monte Carlo sampling, our method clearly outperforms all of them.
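To make the notion of implementation error concrete, the following minimal Python sketch evaluates it on a toy single-stage newsvendor problem. The toy problem, the function names (`expected_cost`, `solve`, `implementation_error`, `newsvendor_cost`) and the brute-force search over a finite candidate set are illustrative assumptions, not the method or the test problems of the paper. The sketch only illustrates the definition given above: solve the reduced problem, implement its solution in the original problem, and compare with the original optimum.

```python
from typing import Callable, Sequence


def expected_cost(x: int, scenarios: Sequence[float], probs: Sequence[float],
                  cost: Callable[[int, float], float]) -> float:
    """Expected cost of decision x over a discrete scenario set."""
    return sum(p * cost(x, s) for s, p in zip(scenarios, probs))


def solve(candidates: Sequence[int], scenarios: Sequence[float],
          probs: Sequence[float], cost: Callable[[int, float], float]) -> int:
    """Brute-force minimiser over a finite candidate set (toy assumption)."""
    return min(candidates, key=lambda x: expected_cost(x, scenarios, probs, cost))


def implementation_error(candidates, scenarios, probs,
                         reduced_scenarios, reduced_probs, cost) -> float:
    """Cost, in the original problem, of implementing the reduced solution,
    minus the optimal cost of the original problem (always >= 0)."""
    x_full = solve(candidates, scenarios, probs, cost)
    x_reduced = solve(candidates, reduced_scenarios, reduced_probs, cost)
    return (expected_cost(x_reduced, scenarios, probs, cost)
            - expected_cost(x_full, scenarios, probs, cost))


def newsvendor_cost(x: int, demand: float) -> float:
    """Hypothetical toy cost: buy x units at 1 each, sell min(x, demand) at 3 each."""
    return 1.0 * x - 3.0 * min(x, demand)


# Original problem: four equiprobable demand scenarios.
demands = [10, 20, 30, 40]
probs = [0.25, 0.25, 0.25, 0.25]
candidates = range(0, 51)

# Two reductions of the same partition {10, 20}, {30, 40}; only the
# representative scenario chosen in each cluster differs.
err_a = implementation_error(candidates, demands, probs, [10, 40], [0.5, 0.5],
                             newsvendor_cost)
err_b = implementation_error(candidates, demands, probs, [20, 30], [0.5, 0.5],
                             newsvendor_cost)
print(err_a, err_b)
```

In this toy setting the two reductions share the same partition and the same cluster probabilities, yet they lead to different implementation errors. This suggests why the choice of representative scenario in each cluster can matter as much as the partition itself.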
Acknowledgements
This research was initiated when Julien Keutchayan was a postdoctoral researcher at the University of Montreal. Janosch Ortmann was partially supported by an NSERC Discovery Grant. While working on this paper, Walter Rei was the Canada Research Chair (CRC) in Stochastic Optimization of Transport and Logistics Systems. In addition, he was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) through the Discovery Grants program. He gratefully acknowledges all the support provided by the CRC and NSERC programs.
About this article
Cite this article
Keutchayan, J., Ortmann, J. & Rei, W. Problem-driven scenario clustering in stochastic optimization. Comput Manag Sci 20, 13 (2023). https://doi.org/10.1007/s10287-023-00446-2
DOI: https://doi.org/10.1007/s10287-023-00446-2