Problem-driven scenario clustering in stochastic optimization

  • Original Paper
  • Computational Management Science

Abstract

In stochastic optimization, the large number of scenarios required to faithfully represent the underlying uncertainty is often a barrier to solving problems numerically in an efficient way. This motivates the scenario reduction problem: find a smaller subset of scenarios that reduces the numerical complexity while keeping the error at an acceptable level. In this paper we propose a novel and computationally efficient methodology to tackle the scenario reduction problem for two-stage problems when the error to be minimized is the implementation error, i.e. the error incurred by implementing the solution of the reduced problem in the original problem. Specifically, we develop a problem-driven scenario clustering method that produces a partition of the scenario set. Each cluster of the partition contains a representative scenario, chosen as the one that best reflects the optimal value of the objective function over that cluster. We demonstrate the efficiency of our method by applying it to two challenging two-stage stochastic combinatorial optimization problems: the two-stage stochastic network design problem and the two-stage facility location problem. In these experiments, our method clearly outperforms the alternative clustering methods and Monte Carlo sampling against which it is compared.
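
To make the notion of implementation error concrete, the following is a minimal sketch on a toy newsvendor problem. It is not the clustering method of the paper: the cost function, the parameters `c` and `p`, the finite candidate grid, and all function names are assumptions chosen purely for illustration. The sketch solves the reduced problem, implements its solution in the original problem, and reports the resulting loss relative to the original optimum.

```python
from typing import Sequence


def cost(x: float, d: float, c: float = 1.0, p: float = 3.0) -> float:
    """Toy two-stage cost (assumed): order x at unit cost c, then sell min(x, d) at price p."""
    return c * x - p * min(x, d)


def expected_cost(x: float, scenarios: Sequence[float], weights: Sequence[float]) -> float:
    """Expected cost of first-stage decision x over weighted demand scenarios."""
    return sum(w * cost(x, d) for d, w in zip(scenarios, weights))


def solve(scenarios: Sequence[float], weights: Sequence[float],
          candidates: Sequence[float]) -> float:
    """Brute-force the first-stage decision over a finite candidate grid (illustration only)."""
    return min(candidates, key=lambda x: expected_cost(x, scenarios, weights))


def implementation_error(scenarios: Sequence[float], weights: Sequence[float],
                         reduced: Sequence[float], reduced_weights: Sequence[float],
                         candidates: Sequence[float]) -> float:
    """Loss from implementing the reduced problem's solution in the original problem."""
    x_full = solve(scenarios, weights, candidates)
    x_reduced = solve(reduced, reduced_weights, candidates)
    # Both decisions are evaluated on the ORIGINAL scenario set.
    return (expected_cost(x_reduced, scenarios, weights)
            - expected_cost(x_full, scenarios, weights))


if __name__ == "__main__":
    demands = [1.0, 2.0, 3.0, 4.0]          # original scenario set
    probs = [0.25, 0.25, 0.25, 0.25]        # equal probabilities
    grid = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]   # candidate order quantities
    for representative in ([1.0], [3.0]):
        err = implementation_error(demands, probs, representative, [1.0], grid)
        print(representative, err)
```

In this toy instance, reducing the four scenarios to the single scenario `[1.0]` yields an implementation error of 1.75, whereas reducing them to `[3.0]` yields an error of 0, because the reduced problem then recovers the original optimal decision. The example only illustrates that the choice of representative scenario can matter more for the decision than for the distribution itself; it makes no claim about how the paper selects representatives.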

Acknowledgements

This research was initiated when Julien Keutchayan was a postdoctoral researcher at the University of Montreal. Janosch Ortmann was partially supported by an NSERC Discovery Grant. While working on this paper, Walter Rei was the Canada Research Chair (CRC) in Stochastic Optimization of Transport and Logistics Systems. In addition, he was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) through the Discovery Grants program. He gratefully acknowledges all the support provided by the CRC and NSERC programs.

Author information

Corresponding authors

Correspondence to Julien Keutchayan, Janosch Ortmann or Walter Rei.

About this article

Cite this article

Keutchayan, J., Ortmann, J. & Rei, W. Problem-driven scenario clustering in stochastic optimization. Comput Manag Sci 20, 13 (2023). https://doi.org/10.1007/s10287-023-00446-2

  • DOI: https://doi.org/10.1007/s10287-023-00446-2
