1 Introduction

Machine learning has recently emerged as a promising area of investigation for operations research (OR) in general, and for combinatorial optimization in particular. Especially after the impressive boost in the effectiveness of deep learning models on a variety of tasks, new approaches, such as neural combinatorial optimization, have been proposed as frameworks for tackling combinatorial optimization problems with a blend of different machine learning techniques. Following this trend, OR conferences and workshops feature an ever-increasing number of events and contributions on the use of machine learning, both as an end-to-end heuristic solver and as a component of a solution approach for combinatorial optimization problems.

This special issue responds to the growing interest in the use of machine learning in the context of combinatorial optimization and collects eight contributions that build on the distinctive possibilities arising from combining the perspectives of the two areas. It is representative of the state of the art of the field, featuring works that cover multiple problems, as well as different uses and applications of machine learning in the domain of combinatorial optimization.

In "Generalization of machine learning for problem reduction: a case study on Traveling Salesman Problems", Sun et al. consider traveling salesman problems and propose a reduction approach that uses a labeled dataset of problem instances and a statistical measure to learn, by means of a support vector machine, which variables are expected not to be part of an optimal solution. This preprocessing reduces the problem size and therefore eases its solution. In a number of experiments, the method is shown to be effective and to generalize well. Standard solving methods are shown to benefit from the preprocessing in terms of speed, without losing precision, and to scale to larger instances.

An application in the domain of military flight and maintenance planning is presented by Peschiera et al. in "A novel solution approach with machine learning-based pseudo-cuts for the Flight and Maintenance Planning problem". A mixed integer programming model is presented and, in order to solve it efficiently, valid cuts are automatically generated, both derived from the initial conditions and learned from a large labeled dataset of historical data about maintenance schedules. The approach aims to reduce the solution time with little loss in optimality and feasibility. The experimental results show the benefit of this new way of adding learned cuts to problems, which is based on predicting specific characteristics of solutions. The approach can also be applied to similar domains.

The work of Fajemisin et al., "An analytics-based heuristic decomposition of a Bilevel Multiple-Follower Cutting Stock Problem", presents a class of bilevel problems along with a heuristic approach to solve them. Variables are classified as leader or follower variables, where the leader variables can be partitioned among the followers. The proposed approach uses Monte Carlo simulation and k-medoids (p-median) clustering to reduce the bilevel problem to a single level, which is then solved using integer programming techniques. For large instances, k-medoids is replaced by self-organizing maps, which reduce the computational cost.

The contribution "A machine learning based Branch and Bound algorithm for a Sampled Vehicle Routing Problem" by Furian et al. focuses on a variant of the classical vehicle routing problem with time windows and of the capacitated vehicle routing problem, called the sampled vehicle routing problem with time windows. The authors observe that, in realistic settings, the planning of routing operations is a repetitive task, and argue that common patterns emerge in every instance of the problem, thus offering opportunities for machine learning to identify such patterns. In practice, machine learning models are used to speed up an implementation of a branch-and-price-and-cut algorithm in two ways: by predicting the value of binary decision variables in the optimal solution, and by predicting branching scores for fractional variables based on full strong branching. The prediction of decision variables is integrated into a node selection policy, while the predicted branching score is used within a variable selection policy. Computational results show that the proposed approach can outperform benchmark branching strategies.

In "Mathematical optimization for time series decomposition", Gozuyilmaz and Kundakcioglu focus on decomposing time series into trend and seasonality components, which can provide insights for forecasting and anomaly detection. The study proposes a mathematical optimization approach that addresses several data-related issues in time-series decomposition. Numerical experiments on real-world and synthetic problem sets show that the approach can handle long and multiple seasons, and can identify outliers and trend shifts.

The work of Onieva et al. is "Estimation of a logistic regression model by a genetic algorithm to predict pipe failures in sewer networks". Machine learning is proposed as a tool capable of predicting failures in sewer pipes when the amount of available data is large enough. A real-coded genetic algorithm is used to estimate the optimal weights of a logistic regression model whose objective is to forecast pipe failures. The methodology is applied to the real sewer network of a Spanish city. Results demonstrate that almost 30% of unexpected pipe failures could have been prevented, while also revealing some weaknesses of the network and the influence of the features on pipe failures.

Sefair presents a study on "A column-oriented optimization approach for the generation of correlated random vectors". The work starts from the observation that, when inducing a desired correlation structure among random variables, simulation software usually relies on the assumption that the Spearman rank correlation is a meaningful approximation of other correlation measures. In practice, however, the desired a posteriori correlation structure often deviates from the Spearman correlation structure. The work proposes an alternative, distribution-free exact method based on mixed-integer programming to directly induce a Spearman rank correlation, a Pearson correlation structure, a Kendall’s coefficient of concordance, or a Phi correlation coefficient in bivariate random vectors. The method is validated in three different contexts: the simulation of a healthcare facility, the analysis of a manufacturing tandem queue, and the imputation of correlated missing data in statistical analysis.
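The gap between rank-based and linear correlation that motivates this work can be illustrated with a minimal sketch. The code below is not taken from the paper: the function names and the toy data are assumptions made for illustration, and the rank computation assumes there are no tied values. It computes the Pearson coefficient directly and the Spearman coefficient as the Pearson coefficient of the ranks.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """Ranks starting at 1; this sketch assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy data (hypothetical): y is a monotone but nonlinear function of x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho_spearman = spearman(x, y)  # perfect monotone relation
rho_pearson = pearson(x, y)    # strictly smaller: the relation is not linear
```

On monotone but nonlinear data the two measures disagree, so a vector pair built to match a target Spearman correlation does not automatically match the same target under the Pearson measure.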

In "Leveraged Least Trimmed Absolute Deviations", Sudermann-Merx and Rebennack consider the design of regression models that must not be affected by outliers. Building on the state-of-the-art least trimmed absolute deviations (LTA), a method that ignores the k largest absolute deviations, they introduce the leveraged least trimmed absolute deviations (LLTA), which needs to consider only outlying values in the independent variable, the so-called leverage points. These can be computed beforehand and support a mixed-integer formulation that needs only one binary variable per leverage point, resulting in a significant reduction in the number of binary variables with respect to alternative methods. LLTA is shown to be as effective in prediction as the best current methods, with significantly reduced computation time.
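To make the trimming idea concrete, the following minimal sketch evaluates a trimmed absolute deviation objective for a given candidate line, that is, the sum of the absolute residuals after the k largest have been discarded. It is an illustration only: the function name, its parameters, and the toy data are assumptions, and it neither performs the mixed-integer optimization over the regression coefficients nor implements the leverage-point selection of LLTA.

```python
def lta_objective(xs, ys, intercept, slope, k):
    """Sum of absolute residuals of a candidate line, ignoring the k largest.

    Hypothetical helper for illustration: it only evaluates the objective
    for fixed coefficients and does not search for the best ones.
    """
    residuals = [abs(y - (intercept + slope * x)) for x, y in zip(xs, ys)]
    if not 0 <= k <= len(residuals):
        raise ValueError("k must be between 0 and the number of observations")
    residuals.sort()
    # Keep the n - k smallest absolute residuals and discard the rest.
    return sum(residuals[:len(residuals) - k])


# Toy data (hypothetical): three points on the line y = x and one outlier.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 2.0, 30.0]

untrimmed = lta_objective(xs, ys, intercept=0.0, slope=1.0, k=0)
trimmed = lta_objective(xs, ys, intercept=0.0, slope=1.0, k=1)
```

With k = 0 the single outlier dominates the objective, whereas with k = 1 the line through the three regular points has zero cost, which is why trimming makes the fit insensitive to that observation.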

The works included in this special issue certainly cover neither the whole spectrum of possibilities for using machine learning concepts and techniques in combinatorial optimization, nor the large variety of classes of combinatorial optimization problems that could benefit, in different ways, from adopting a machine learning approach in their solution. Indeed, the goal of this special issue is, in addition to giving a snapshot of the state of the art, to give impetus to the field and to further stimulate the investigation of the impressive potential offered by the most recent machine learning results and approaches for facing the hard challenges of combinatorial optimization. We are living through a global machine learning hype, and we expect that OR practitioners must be, and will be, part of it!