Elsevier

Computers & Operations Research

Volume 75, November 2016, Pages 214-230

Discrete optimization methods to fit piecewise affine models to data points

https://doi.org/10.1016/j.cor.2016.05.001

Highlights

  • We address the fundamental piecewise affine model fitting problem.

  • We propose an MILP formulation, strengthened with symmetry breaking constraints.

  • We present a heuristic with point-reassignment and domain partition at each iteration.

  • We report extensive results on real-world and structured random instances.

  • We conclude with an application to the identification of hybrid dynamical systems.

Abstract

Fitting piecewise affine models to data points is a pervasive task in many scientific disciplines. In this work, we address the k-Piecewise Affine Model Fitting with Piecewise Linear Separability problem (k-PAMF-PLS) where, given a set of $m$ points $\{a_1, \dots, a_m\} \subset \mathbb{R}^n$ and the corresponding observations $\{b_1, \dots, b_m\} \subset \mathbb{R}$, we have to partition the domain $\mathbb{R}^n$ into $k$ piecewise linearly (or affinely) separable subdomains and to determine an affine submodel (i.e., an affine function) for each of them so as to minimize the total linear fitting error w.r.t. the observations $b_i$.

To solve k-PAMF-PLS to optimality, we propose a Mixed-Integer Linear Programming (MILP) formulation where symmetries are broken by separating shifted column inequalities. For medium-to-large scale instances, we develop a four-step heuristic involving, at each iteration, a point reassignment step based on the identification of critical points and a domain partition step based on multicategory linear classification. Unlike traditional approaches proposed in the literature for similar fitting problems, both our exact and heuristic methods take the domain partitioning and submodel fitting aspects into account simultaneously.

Computational experiments on real-world and structured randomly generated instances show that our MILP formulation with symmetry breaking constraints solves many small-size instances to proven optimality. Our four-step heuristic provides close-to-optimal solutions for the small-size instances, while making it possible to tackle instances of much larger size. The experiments also show that the combined impact of the main features of our heuristic is quite substantial when compared to standard variants not including them.

We conclude the paper with an application to the identification of dynamical piecewise affine systems, for which we obtain promising results of comparable quality with those achieved with state-of-the-art methods from the literature (hinging hyperplane models and sum-of-norms regularization) on benchmark data sets.

Introduction

Fitting a set of data points in $\mathbb{R}^n$ with a combination of low-complexity models is a pervasive problem in essentially any area of science and engineering. It naturally arises, for instance, in prediction and forecasting when determining a model to approximate the value of an unknown function, or whenever one wishes to approximate a highly complex nonlinear function with a simpler one. Applications range from optimization (see, e.g., [36] and the references therein) to statistics (see, e.g., the recent work in [9]), to data mining (see, e.g., [5], [11]), and to system identification (see, for instance, [18], [6], [35]), to cite only a few.

Among the different options, piecewise affine models have a number of advantages over other model fitting approaches. They are compact and simple to evaluate, visualize, and interpret, in contrast to models obtained with other techniques such as neural networks, while still being able to approximate even highly nonlinear functions.

Given a set of $m$ points $A = \{a_1, \dots, a_m\} \subset \mathbb{R}^n$, with index set $I = \{1, \dots, m\}$, the corresponding observations $\{b_1, \dots, b_m\} \subset \mathbb{R}$, and a positive integer $k$, the general problem of fitting a piecewise affine model to the data points $\{(a_1, b_1), \dots, (a_m, b_m)\}$ consists in partitioning the domain $\mathbb{R}^n$ into $k$ continuous subdomains $D_1, \dots, D_k$, with index set $J = \{1, \dots, k\}$, and in determining, for each subdomain $D_j$, an affine submodel (an affine function) $f_j : D_j \to \mathbb{R}$, so as to minimize a measure of the total fitting error. Adopting the notation $f_j(x) = w_j^\top x - w_{0j}$ with coefficients $(w_j, w_{0j}) \in \mathbb{R}^{n+1}$, the $j$-th affine submodel corresponds to the hyperplane $H_j = \{(x, f_j(x)) \in \mathbb{R}^{n+1} : f_j(x) = w_j^\top x - w_{0j}\}$, defined for all $x \in D_j$. The total fitting error is defined as the sum, over all $i \in I$, of a function of the difference between $b_i$ and the value $f_{j(i)}(a_i)$ provided by the piecewise affine model, where $j(i)$ is the index of the affine submodel corresponding to the subdomain $D_{j(i)}$ which contains the point $a_i$.
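As a minimal sketch of the total fitting error defined above, the following Python snippet evaluates a given piecewise affine model on a small data set, assuming the linear (absolute value) error that the paper adopts later. The function names `affine_value` and `total_l1_error`, the sign convention "w . x minus w0" for a submodel, and the toy data are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence, Tuple

# A submodel is stored as (w_j, w_0j); assumed convention: f_j(x) = w_j . x - w_0j.
Submodel = Tuple[Sequence[float], float]


def affine_value(submodel: Submodel, x: Sequence[float]) -> float:
    """Evaluate the affine submodel at the point x."""
    w, w0 = submodel
    return sum(wi * xi for wi, xi in zip(w, x)) - w0


def total_l1_error(
    points: List[Sequence[float]],
    observations: List[float],
    assignment: List[int],
    submodels: List[Submodel],
) -> float:
    """Sum over i of |b_i - f_{j(i)}(a_i)|, where assignment[i] plays the role of j(i)."""
    return sum(
        abs(b - affine_value(submodels[j], a))
        for a, b, j in zip(points, observations, assignment)
    )


# Toy 1-D instance with k = 2 (made-up data).
points = [[0.0], [1.0], [3.0], [4.0]]
observations = [0.0, 1.5, 1.0, 0.0]
assignment = [0, 0, 1, 1]                    # j(i) for each point
submodels = [([1.0], 0.0), ([-1.0], -4.0)]   # f_0(x) = x, f_1(x) = -x + 4

error = total_l1_error(points, observations, assignment, submodels)
print(error)
```

Only the second point is off its submodel (observation 1.5 against a fitted value of 1.0), so it alone contributes to the error. In this sketch the assignment is given explicitly; in the problem studied here it must additionally be induced by a partition of the domain.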

In the literature, different error functions (e.g., linear or quadratic) as well as different types of domain partition (with linearly or nonlinearly separable subdomains) have been considered. See Fig. 1(a) for an illustration of the case with k=2 and a domain partition with linearly separable subdomains.

In this work, the focus is on the version of the general piecewise affine model fitting problem with a linear error function ($\ell_1$ norm) and a domain partition with piecewise linearly separable subdomains. We refer to it as the k-Piecewise Affine Model Fitting with Piecewise Linear Separability problem (k-PAMF-PLS). A more formal definition of the problem will be provided in Section 3.

k-PAMF-PLS shares a connection with the so-called k-Hyperplane Clustering problem (k-HC), an extension of a classical clustering problem which calls for $k$ hyperplanes in $\mathbb{R}^{n+1}$ that minimize the sum, over all the data points $\{(a_1, b_1), \dots, (a_m, b_m)\}$, of the $\ell_2$ distance from $(a_i, b_i)$ to the hyperplane the point is assigned to. See [8], [1], [13], [14] for some recent work on the problem and [4] for the problem variant aiming at minimizing the number of hyperplanes needed to fit all the points within a prescribed tolerance $\varepsilon > 0$.
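To make the difference between the two error measures concrete, the sketch below compares the orthogonal $\ell_2$ distance used in k-HC with the "vertical" absolute error used in model fitting, for a single point and a single hyperplane. The function names, the representation of a hyperplane as `{z : v . z = v0}`, and the numbers are illustrative assumptions.

```python
import math
from typing import Sequence


def l2_distance_to_hyperplane(v: Sequence[float], v0: float, z: Sequence[float]) -> float:
    """Orthogonal distance from z to the hyperplane {z : v . z = v0}."""
    norm = math.sqrt(sum(vi * vi for vi in v))
    if norm == 0.0:
        raise ValueError("the normal vector v must be nonzero")
    return abs(sum(vi * zi for vi, zi in zip(v, z)) - v0) / norm


def vertical_error(w: Sequence[float], w0: float, a: Sequence[float], b: float) -> float:
    """Absolute fitting error |b - (w . a - w0)| of the affine submodel at (a, b)."""
    return abs(b - (sum(wi * ai for wi, ai in zip(w, a)) - w0))


# Submodel f(x) = x in one dimension, i.e. the line x - b = 0 in R^2.
a, b = [0.0], 2.0
fit_err = vertical_error([1.0], 0.0, a, b)                     # measured along the b axis
orth_dist = l2_distance_to_hyperplane([1.0, -1.0], 0.0, a + [b])  # measured orthogonally
print(fit_err, orth_dist)
```

For this line with unit slope the vertical error is 2 while the orthogonal distance is the square root of 2, which illustrates that the two objectives rank candidate hyperplanes differently in general.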

It is nevertheless crucial to note that, in contrast to what is done in many of the approaches in the literature (which we briefly summarize in Section 2), and depending on the type of domain partition that is adopted, a piecewise affine function cannot be determined by just solving an instance of k-HC. A naive application of algorithms designed for k-HC to tackle k-PAMF-PLS can indeed lead to solutions with a large fitting error, as a consequence of the domain partitioning aspect being entirely neglected. As illustrated in Fig. 1(b), the two aspects of k-PAMF-PLS, namely, submodel fitting and domain partitioning, should be taken into account at once to obtain a solution where the two are consistent. For this reason, most of the algorithms in the literature incorporate techniques to enforce the piecewise linear separability of the domain, which is typically not guaranteed in solutions of k-HC and its variants. In this work, we propose exact methods for k-PAMF-PLS based on mixed-integer linear programming as well as heuristic algorithms which consider both aspects of the problem simultaneously, rather than deferring the domain partitioning aspect to a later stage of the solution process.
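The inconsistency described above can be illustrated on a tiny one-dimensional example with k = 2. The sketch below is a hypothetical illustration, not the authors' method: the helper `separable_by_threshold` and the data are made up, and the check only covers the special case of two groups on the real line, where linear separability reduces to the groups occupying disjoint intervals.

```python
from typing import List


def separable_by_threshold(xs: List[float], labels: List[int]) -> bool:
    """True if the points labeled 0 and those labeled 1 lie on opposite sides of some threshold."""
    group0 = [x for x, label in zip(xs, labels) if label == 0]
    group1 = [x for x, label in zip(xs, labels) if label == 1]
    if not group0 or not group1:
        return True  # a single nonempty group is trivially separable
    return max(group0) < min(group1) or max(group1) < min(group0)


# Four points lying alternately on the crossing lines b = x and b = -x.
xs = [-2.0, -1.0, 1.0, 2.0]
bs = [-2.0, 1.0, 1.0, -2.0]

# Clustering by line fits every point exactly, but the groups interleave in the domain.
clustering_labels = [0, 1, 0, 1]
# A labeling induced by splitting the domain at x = 0 is separable by construction.
partition_labels = [0, 0, 1, 1]

clustering_ok = separable_by_threshold(xs, clustering_labels)
partition_ok = separable_by_threshold(xs, partition_labels)
print(clustering_ok, partition_ok)
```

The zero-error clustering cannot be turned into a piecewise affine function of x without reassigning points, which is why the fitting error has to be evaluated together with the domain partition.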

The paper is organized as follows. After summarizing previous and related work in Section 2, we formally define the problem under consideration in Section 3. In Section 4, we provide a Mixed-Integer Linear Programming (MILP) formulation for k-PAMF-PLS. We then strengthen the formulation, when it is used to solve the problem in a branch-and-cut setting, by generating symmetry-breaking constraints. In Section 5, we propose a four-step heuristic to tackle larger-size instances. Computational results are reported and discussed in Section 6. In Section 7, we consider an application of k-PAMF-PLS to problems in the area of dynamical system identification and compare the obtained results with those provided by state-of-the-art methods. Section 8 contains some concluding remarks. Portions of this work appeared, in a preliminary stage, in [2], [3].

Section snippets

Previous and related work

Recently, there has been a growing interest in mixed-integer programming and discrete optimization approaches to a wide range of problems in the areas of data mining and statistics, see, e.g., [20], [15], [11], [12], [9], [30]. As to the problem of fitting a piecewise affine model to data points, many variants have been considered in the literature. We briefly mention some of the most relevant ones in this section.

In some works, the domain is partitioned a priori, exploiting the domain-specific

Problem definition

In this work, we require that the domain partition $D_1, \dots, D_k$ of $\mathbb{R}^n$ satisfy the property of piecewise linear separability, which forms the basis for multicategory linear classification [7]. In the following, we briefly recall some key features of this well-known classification paradigm, which we will then leverage in the remainder of the paper.
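In multicategory linear classification as it is commonly formulated, each class is associated with an affine function, and a point belongs to the class whose function attains the largest value at that point; the resulting regions are piecewise linearly separable. The sketch below assumes this argmax rule. The function name `domain_index`, the parameter layout `(y_j, y_0j)`, and the sign convention are illustrative assumptions rather than the paper's notation.

```python
from typing import List, Sequence, Tuple

# One classifier per subdomain, stored as (y_j, y_0j); assumed score: y_j . x - y_0j.
Classifier = Tuple[Sequence[float], float]


def domain_index(x: Sequence[float], classifiers: List[Classifier]) -> int:
    """Return the index j whose affine score is largest at x (ties go to the smallest j)."""
    scores = [sum(yi * xi for yi, xi in zip(y, x)) - y0 for y, y0 in classifiers]
    return max(range(len(scores)), key=scores.__getitem__)


# 1-D example with k = 3: the scores are -x, 1, and x, so the induced
# subdomains are x < -1, -1 < x < 1, and x > 1.
classifiers = [([-1.0], 0.0), ([0.0], -1.0), ([1.0], 0.0)]

indices = [domain_index([x], classifiers) for x in (-3.0, 0.0, 2.5)]
print(indices)
```

Because each subdomain is an intersection of half-spaces obtained by comparing pairs of affine scores, any point assignment produced this way is consistent with a piecewise linearly separable partition of the domain.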

Strengthened mixed-integer linear programming formulation

In this section, we propose an MILP formulation to solve k-PAMF-PLS to optimality via a branch-and-cut method, as implemented in state-of-the-art MILP solvers. To enhance the efficiency of the solution algorithm, we break the symmetries that naturally arise in the formulation by generating symmetry-breaking constraints, as we will explain in the following.

Our MILP is derived by combining an adapted hyperplane clustering formulation with a multicategory linear classification one. The former

Four-step heuristic algorithm

As we will see in Section 6, the introduction of shifted column inequalities (SCIs) has a remarkable impact on the solution times. Nevertheless, even with them, the MILP formulation only allows for the solution of small to medium size instances in a reasonable amount of computing time.

To tackle instances of larger size, we propose an efficient heuristic that takes into account, at each iteration, all three aspects of the problem, namely, affine submodel fitting, point partition, and domain partition. At each iteration,

Computational results

In this section, we report and discuss a set of computational results obtained when solving k-PAMF-PLS either to optimality with branch-and-cut and symmetry breaking constraints or with our four-step heuristic 4S-CR. First, we investigate the impact of symmetry breaking constraints when solving the problem to global optimality. On a subset of instances for which the exact approach is viable, we compare the best solutions obtained with the exact algorithm (within a time limit) to those

Application to the identification of piecewise affine dynamical systems

Let us consider an application of k-PAMF-PLS to the identification of hybrid dynamical systems. A dynamical system is said to be hybrid if it exhibits a combination of continuous and discrete behavior. Such models are of use in a number of practical cases to approximate continuous, but nonlinear, systems. Adopting the standard notation, a piecewise affine system is formally described by an equation of the form $y_t = g(\varphi_t) + e_t$, where $y_t$ is the output of the system at time $t$, $\varphi_t$ is the regression

Concluding remarks

We have addressed the k-PAMF-PLS problem of fitting a piecewise affine model with a piecewise linearly separable domain partition to a set of data points. We have proposed an MILP formulation to solve the problem to optimality, strengthened via symmetry breaking constraints. To solve larger instances, we have developed 4S-CR, a four-step heuristic algorithm which simultaneously deals with the various aspects of the problem.

Computational experiments on a set of structured randomly generated and

References (37)

  • Amaldi E, Coniglio S, Taccari L. Formulations and heuristics for the k-piecewise affine model fitting problem. In:...
  • Amaldi E, Coniglio S, Taccari L. k-Piecewise affine model fitting: heuristics based on multiway linear classification....
  • E. Amaldi et al. Column generation for the minimum hyperplanes clustering problem. INFORMS J Comput (2013)
  • A. Bemporad et al. A bounded-error approach to piecewise affine system identification. IEEE Trans Autom Control (2005)
  • K. Bennett et al. Multicategory discrimination via linear programming. Optim Methods Softw (1994)
  • P. Bradley et al. k-plane clustering. J Glob Optim (2000)
  • D. Bertsimas et al. Least quantile regression via modern optimization. Ann Stat (2014)
  • L. Breiman. Hinging hyperplanes for regression, classification, and function approximation. IEEE Trans Inf Theory (1993)

The work of S. Coniglio was carried out, for a large part, while he was with Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano and with Lehrstuhl II für Mathematik, RWTH Aachen University. While with the latter, he was supported by the German Federal Ministry of Education and Research (BMBF), grant 05M13PAA, and Federal Ministry for Economic Affairs and Energy (BMWi), grant 03ET7528B.
