The min-Knapsack Problem with Compactness Constraints and Applications in Statistics

In the min-Knapsack problem, one is given a set of items, each having a certain cost and weight. The objective is to select a subset with minimum cost, such that the sum of the weights is not smaller than a given constant. In this paper we introduce an extension of the min-Knapsack problem with additional "compactness constraints" (mKPC), stating that selected items cannot lie too far apart from each other. This extension has applications in statistics, including in algorithms for change-point detection in time series. We propose three solution methods for the mKPC. The first two methods use the same Mixed-Integer Programming (MIP) formulation, but with two different approaches: either passing the complete model, with a quadratic number of constraints, to a black-box MIP solver, or dynamically separating the constraints using a branch-and-cut algorithm. Numerical experiments highlight the advantages of this dynamic separation. The third approach is a dynamic programming labelling algorithm. Finally, we focus on the special case of the unit-cost mKPC (1c-mKPC), which has a specific interpretation in the context of the statistical applications mentioned above. We prove that the 1c-mKPC is solvable in polynomial time with a different ad-hoc dynamic programming algorithm. Experimental results show that this algorithm vastly outperforms both generic approaches for the mKPC and a simple greedy heuristic from the literature.


Introduction
In this paper, we present an extension of the min-Knapsack problem (Csirik et al., 1991) with applications in statistics, including change-point detection in time series. Being an extension of min-Knapsack, the considered problem is NP-complete. We also consider a special case of the problem which is both relevant for the statistical applications and solvable in polynomial time.
The min-Knapsack problem asks to select a subset of n items, each with weight w_j ≥ 0 and cost c_j ≥ 0 (j ∈ {1, . . ., n}), such that the sum of the costs of the selected items is minimum and their total weight is not smaller than a constant q ≥ 0.
In this paper, we introduce a variant of the min-Knapsack problem, which we call the min-Knapsack Problem with Compactness Constraints (mKPC). Applications in time series analysis and high-dimensional statistics (see Section 1.1) motivate the study of this variant.
In the mKPC, there is a distance metric defined over the items. Consider two items i and j and assume, from now on and without loss of generality, that i < j. We define the distance between items as the difference of their indices, i.e., j − i. We can think of the items as an ordered sequence, and we are interested in how far apart i and j lie in the sequence. With this notion of distance, we impose the additional condition that the set of selected items is compact. Formally, we consider a maximum distance parameter ∆ ∈ N. If two items i and j are both selected and j − i > ∆, then we require that there is at least another selected item between i and j, i.e., a selected item k such that i < k < j. Figure 1 presents an example which shows the difference between the min-Knapsack problem (without compactness constraints) and the mKPC (with compactness constraints). Items lie on the x axis according to their index, and the bar heights indicate their weights. The value of parameter ∆ for the mKPC is set to 2 and c_j = 1 for all items. An optimal solution of the min-Knapsack problem, depicted in Figure 1a, has total cost 12. However, it violates compactness constraints: items 8 and 12 (with distance 4 > 2) are both selected, but no other item between them is selected. An optimal solution of the mKPC, instead, has a cost of 13, as shown in Figure 1b.
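The pairwise condition above is equivalent to requiring that consecutive selected indices differ by at most ∆. A minimal feasibility-check sketch built on that equivalence (function names are ours, not from the paper; items are identified by their indices):

```python
def is_compact(selected, delta):
    """Compactness: between any two selected items i < j with j - i > delta,
    some selected item must lie strictly in between. Equivalently, consecutive
    selected indices differ by at most delta."""
    s = sorted(selected)
    return all(b - a <= delta for a, b in zip(s, s[1:]))

def is_feasible(selected, weights, q, delta):
    """A feasible mKPC solution is compact and reaches total weight q."""
    return sum(weights[j] for j in selected) >= q and is_compact(selected, delta)

# The situation described above: items 8 and 12 selected with delta = 2
# and nothing selected in between violates compactness.
assert not is_compact({8, 12}, 2)
assert is_compact({8, 10, 12}, 2)
```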

Motivation
The motivation for the mKPC comes from applications in statistics. In the following, we give a detailed example from change-point detection in time series. Given a time series y_1, . . ., y_n, the objective of change-point detection is to identify whether the underlying probability distribution of y changes, how many times it does so, and at which time points. Typical change-points occur when the time series changes its expected value (see Figure 2a), its variance (see Figure 2b), or both. A Bayesian detection procedure typically associates with each time point a probability that it is a change point. Next, it identifies a level-q credible set, i.e., a subset of {1, . . ., n} in which the sum of these probabilities is at least q (for a given threshold q ∈ [0, 1]). For example, a level-0.95 credible set corresponds to a 95% probability that the set contains the change point.
Following a criterion of parsimony, it is desirable that the credible set contains as few elements as possible. Not all time points, however, must carry the same penalty if included in the credible set. For example, a time instant corresponding to an external shock might cost less in terms of parsimony compared to a time instant when no such shock occurred. Therefore, one can associate with each time point j a scaling factor c_j and minimise the sum of these factors. On the other hand, when no such information is present, one can just set c_j = 1 for all time instants.
As we will see in Section 2.2, using a unitary scaling factor decidedly simplifies the problem. In the rest of this explanation we will consider, for simplicity, this unit-cost case.
The most straightforward method to build the credible set is perhaps to follow a greedy approach which inserts points by decreasing value of probability until the desired threshold q is met. This criterion was used, for example, by Wang et al. (2020, Supplementary Data, Section A.3). Such a greedy criterion, however, can select points that lie far apart from each other, so that the resulting set does not clearly identify a single change point. To overcome this problem, one must consider the compactness of the credible set: because each set should identify a single change point, its elements should be "compact" and, ideally, distributed tightly around the real (unknown) change point. This objective can be achieved via compactness constraints. Indeed, once the value of parameter ∆ is fixed (usually to a small number such as 2 or 3), the problem of producing the most parsimonious credible set becomes our mKPC, in which the probability values associated with each time point take the role of the weights. Figure 5 shows how including compactness leads to a better credible set construction.

Formal definition
In this section we give a formal definition of the mKPC by means of an integer programming model, and we discuss the complexity of the mKPC and of the unit-cost mKPC (1c-mKPC). As mentioned in Section 1, in fact, the mKPC is NP-complete. In Section 2.2, however, we prove that the 1c-mKPC is solvable in polynomial time.

Mathematical model
We can formulate the mKPC as the following integer program, in which binary variable x_j takes value 1 iff the j-th item is selected:

min Σ_{j=1}^{n} c_j x_j (1)
s.t. Σ_{j=1}^{n} w_j x_j ≥ q (2)
x_i + x_j ≤ 1 + Σ_{k=i+1}^{j−1} x_k   ∀ i, j ∈ {1, . . ., n} : j − i > ∆ (3)
x_j ∈ {0, 1}   ∀ j ∈ {1, . . ., n} (4)

The objective function (1) minimises the total cost of the selected items, constraint (2) imposes the minimum total weight, and we denote constraints (3) as the compactness constraints.

Complexity
The mKPC is NP-complete because it contains the min-Knapsack problem as a special case when ∆ = n. In the applications described in Section 1.1, however, it can often be the case that all items take unit cost (i.e., c_j = 1 for all j ∈ {1, . . ., n}). This problem is denoted as 1c-mKPC and arises, for example, when the user has no prior knowledge of which time instants of a time series are more likely to be change points. The following theorem establishes a strong result about the 1c-mKPC: namely, that it can be solved in polynomial time.
Theorem 1. Consider the decision version of the 1c-mKPC: for a given integer number t ∈ {1, . . ., n}, we want to know whether there exists a feasible solution of the 1c-mKPC using at most t items. The decision version of the 1c-mKPC can be solved in polynomial time.
Proof. Consider a Dynamic Programming (DP) table W with entries W(i, ℓ) for each i ∈ {1, . . ., n} and ℓ ∈ {1, . . ., i}. Entry W(i, ℓ) will contain the maximum weight of a subset of {1, . . ., i} such that the set has size ℓ and the element of the set with the highest index is item i. This table can be trivially initialised with W(i, 1) = w_i for all i ∈ {1, . . ., n}.
Furthermore, the following DP recursion is valid:

W(i, ℓ) = w_i + max { W(k, ℓ − 1) : k ∈ {[i − ∆], . . ., i − 1} }, (5)

where notation [i − ∆] is used as a shorthand for max{1, i − ∆}. Recursion (5) is valid because of the following observation. Any set of size ℓ having item i as its highest-index element must have its second-highest-index element in {[i − ∆], . . ., i − 1}. If that were not the case, in fact, the compactness constraint would be violated.
Finally, to know whether there is a subset of {1, . . ., n} of size at most t whose elements have total weight at least q and which satisfies the compactness constraints, we must check that W(i, ℓ) ≥ q for some i ∈ {1, . . ., n} and ℓ ∈ {1, . . ., min{i, t}}. We now analyse the complexity of the above algorithm to conclude that it runs in polynomial time in the instance size n. Table W has size O(n²), and we derive the worst-case complexity of computing an entry. To compute a generic entry W(i, ℓ) through (5), we need to compare the values W(k, ℓ − 1) for k ∈ {[i − ∆], . . ., i − 1}, i.e., we perform at most ∆ comparisons.
Noting that the table can be built in increasing order of columns and rows (indeed, W is lower-triangular) and that ∆ ≤ n, we conclude that the total complexity of the DP algorithm is O(n²∆), i.e., O(n³) in the worst case.
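The proof's recursion fits in a few lines of code. The following sketch is our own rendering (0-based indices; it returns the minimum number of items reaching weight q rather than just the yes/no answer, which amounts to checking all t at once):

```python
import math

def min_items_1c(w, q, delta):
    """DP for the unit-cost mKPC. W[i][l] is the maximum total weight of a
    compact subset of {0, ..., i} of size l whose highest-index element is i.
    Returns the minimum number of items reaching weight >= q, or None."""
    n = len(w)
    NEG = -math.inf
    W = [[NEG] * (n + 1) for _ in range(n)]
    for i in range(n):
        W[i][1] = w[i]                      # base case: singleton sets
    for l in range(2, n + 1):
        for i in range(l - 1, n):
            # the second-highest-index element must lie in [i - delta, i - 1]
            best = max((W[k][l - 1] for k in range(max(0, i - delta), i)),
                       default=NEG)
            if best > NEG:
                W[i][l] = w[i] + best
    for l in range(1, n + 1):               # smallest feasible size first
        if any(W[i][l] >= q for i in range(n)):
            return l
    return None
```

With delta = 3, items 0 and 3 below may be picked together (cost 2); with delta = 1 the whole chain is needed.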

Related problems
In addition to the applications in statistics discussed in Section 1.1, the mKPC has a specific combinatorial structure. As anticipated, the problem falls within the wide family of knapsack problems (see Kellerer et al., 2004; Martello & Toth, 1990). In particular, it extends the min-Knapsack problem by introducing compactness constraints. For the earliest results on the min-Knapsack problem in English, we refer the reader to the seminal work of Csirik et al. (1991); for earlier works in Russian see, e.g., Babat (1975).
The special structure of the compactness constraints can be represented by a graph G = (V, E) in which each item i corresponds to a vertex v_i ∈ V, and an edge {v_i, v_j} ∈ E is defined for each pair of vertices v_i and v_j, i < j, such that j − i ≤ ∆. The mKPC asks to select a subset of V inducing a connected subgraph, such that the corresponding items optimise the associated min-Knapsack problem.
If instead of graph G we are given a generic graph, and if we also have to include a pre-defined subset T ⊂ V of vertices in the connected subgraph, the problem is known as the Connection Subgraph problem (see Conrad et al., 2007). This problem is strongly NP-complete and remains so even when T = ∅. As discussed in Section 2.2, the mKPC (that is, the Connection Subgraph problem with T = ∅ and the special structure of graph G) remains NP-complete. The definition of the mKPC as a problem on a graph gives an interpretation of inequalities (3) as a special case of the connectivity constraints introduced by Fischetti et al. (2017) to impose connectivity of Steiner trees. However, the special structure of graph G makes it more efficient to specialise those constraints to the mKPC, without introducing G explicitly. In particular, separation of inequalities (3) is straightforward (see Section 4.2). As discussed, our compactness constraints can be interpreted as a connectivity requirement on a suited graph. Similar requirements appear in political districting problems, where one has to partition geographic units (e.g., counties or census blocks) to obtain districts for elections.
Districts must contain geographically contiguous units and have approximately the same number of inhabitants. Political districting problems are typically defined on a graph where vertices represent the geographic units and carry a weight corresponding to their population, and edges connect contiguous units. Hence, the problem consists in partitioning the vertices into subsets having approximately the same weight and inducing connected subgraphs (see, e.g., Ricca et al. (2013)).
From a different perspective, Stiglmayr et al. (2022) introduce measures of robustness for solutions in multi-objective integer linear programming. Here the idea is to select a solution which is not only efficient but also robust, in the sense that its "close-by" solutions are efficient as well (allowing for a substitution of the selected solution). Closeness of solutions depends on the specific problem and can be identified with a change of basis via a pivot in a linear program, or with a "move" in a combinatorial problem. In any case, close solutions are denoted as adjacent, thus defining a graph. The robustness of each solution is evaluated by analysing its neighbourhood in this graph.

Solution approaches
In this section, we describe exact approaches for the mKPC. We also describe a greedy heuristic for the 1c-mKPC, used in the PRISCA package (Cappello, 2022).

Integer Programming
The first approach consists in solving model (1)-(4) with a black-box integer programming solver.
The model is compact, as it uses O(n) variables and O(n²) constraints.

Strengthening compactness constraints. Compactness constraints (3) state that if two items lying more than ∆ positions apart are selected, then at least another item between them must be selected. These constraints, however, can be made stronger. For example, if the two selected items lie more than 2∆ positions apart, then at least two further items between them shall also be selected. In general, (3) can be strengthened as follows:

⌊(j − i − 1)/∆⌋ (x_i + x_j − 1) ≤ Σ_{k=i+1}^{j−1} x_k   ∀ i, j ∈ {1, . . ., n} : j − i > ∆, (7)

where the coefficient ⌊(j − i − 1)/∆⌋ is the minimum number of intermediate items needed to link i and j. The following example shows why these constraints help tighten the continuous relaxation of the mKPC. Consider an instance in which the two heaviest items are the first one and the last one: let n = 1002, w_1 = w_1002 = 0.495, and w_j = 10^−4 for all other j ∈ {2, . . ., 1001}. Further assume that costs are all equal, that ∆ = 5, and that q = 0.95. Without compactness constraints one might simply choose items 1 and 1002, obtaining a total weight of 0.99 > 0.95. Due to compactness constraints, however, we must "link" these two items by taking intermediate items. The most parsimonious way to achieve that is to take one item every ∆ positions, i.e., items 6, 11, …, 1001. The optimal solution, therefore, selects 2 + 200 = 202 items.
When solving the continuous relaxation of the mKPC, however, an optimal solution is x_1 = x_1002 = 1 and x_j = 10^−3 for all other j ∈ {2, . . ., 1001}. Such a solution has cost 3 and does not violate any compactness constraint. For example, when i = 1 and j = 1002, we have Σ_{k=i+1}^{j−1} x_k = 1000 · 10^−3 = 1, and thus (3) is satisfied. On the other hand, the strengthened constraint (7) is violated by such a solution: its left-hand side is ⌊1000/5⌋ · (1 + 1 − 1) = 200, which far exceeds the right-hand side of 1.
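The coefficient of the strengthened constraint is exactly the minimum number of intermediate selections needed to link two items, which makes the arithmetic of the example easy to check (helper name is ours):

```python
def min_intermediates(i, j, delta):
    """Minimum number of items strictly between i and j that must be selected
    to link them under compactness (consecutive gaps of at most delta)."""
    return (j - i - 1) // delta

# The example instance: i = 1, j = 1002, delta = 5.
coeff = min_intermediates(1, 1002, 5)    # 200 intermediate items are needed
# Fractional point: x_1 = x_1002 = 1, all 1000 intermediate x_k = 1e-3.
lhs = coeff * (1.0 + 1.0 - 1.0)          # strengthened left-hand side: 200.0
rhs = 1000 * 1e-3                        # sum of the intermediate values
# The basic constraint (coefficient 1) holds, the strengthened one is violated:
assert 1.0 + 1.0 - 1.0 <= rhs + 1e-9
assert lhs > rhs
```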

On-the-fly constraint generation
Formulation (1)-(4) has polynomial size, but the number of compactness constraints can be very large for large values of n. Their management can be impractical, and it can degrade the performance of black-box IP solvers, in particular during preprocessing and when solving linear programming relaxations. For this reason, we evaluate the effectiveness of a branch-and-cut approach in which we first remove the compactness constraints, and then generate them on-the-fly by separating infeasible integer and fractional solutions of the resulting relaxed problem. In the rest of this section, we derive the corresponding separation procedures.
Integer solution separation. The following procedure checks whether an integer solution x*_1, . . ., x*_n violates a compactness constraint. For each item i ∈ {1, . . ., n} with x*_i = 1, we search for the first item σ_i > i with x*_{σ_i} = 1. If σ_i − i > ∆, then constraint (3) is violated for j = σ_i and must be added to the formulation. Otherwise, there is no index j such that constraint (3) is violated for the index pair (i, j). Stopping the algorithm after finding the first violated constraint (if any) would already cut away the current infeasible integer solution. However, we can keep scanning items i even after we find one involved in the violation of a compactness constraint, thus attempting to separate other useful inequalities.
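A sketch of this integer separation (0-based indices; it returns every violated pair rather than stopping at the first, as discussed above; names are ours):

```python
def separate_integer(x, delta):
    """Separation over an integer solution x (list of 0/1 values): for each
    selected item i, the only candidate violation involves the next selected
    item sigma_i; the pair is violated iff sigma_i - i > delta."""
    selected = [i for i, v in enumerate(x) if v == 1]
    return [(a, b) for a, b in zip(selected, selected[1:]) if b - a > delta]
```

For example, on x = [1, 0, 0, 0, 1, 1] with delta = 2 the procedure reports the single violated pair (0, 4).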
Fractional solution separation. Given a fractional solution x*_1, . . ., x*_n, the following procedure determines whether it violates a compactness constraint. For each item i ∈ {1, . . ., n − ∆ − 1} such that x*_i > 0, let S = 0. Then:
1. For each item k ∈ {i + 1, . . ., i + ∆}, update S with value S + x*_k. If, at some point, S ≥ 1, then there is no index j for which (3) is violated for the index pair (i, j). We can then move to the next i.
2. Otherwise, for each item j ∈ {i + ∆ + 1, . . ., n}:
(a) If x*_i + x*_j − 1 > S, then the solution violates the compactness constraint for the index pair (i, j).
(b) Otherwise, update S with value S + x * j and move to the next j.
The validity of step 1 follows because condition S ≥ 1 makes the right-hand side of (3) greater than or equal to 2; since the left-hand side never exceeds 2, the inequality holds. The condition in step 2.a corresponds exactly to a violation of (3), while step 2.b is needed to consider all items between i and j.
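The two-step fractional separation can be sketched as follows (0-based indices; a small tolerance eps is ours, added for floating-point safety, as is the choice to record every violated pair):

```python
def separate_fractional(x, delta, eps=1e-9):
    """Fractional separation: for each i with x[i] > 0, keep a running sum S
    of the x-values strictly between i and the candidate j, and report the
    pair (i, j) whenever x[i] + x[j] - 1 > S."""
    n = len(x)
    violated = []
    for i in range(n):
        if x[i] <= eps:
            continue
        # step 1: scan the window {i+1, ..., i+delta}
        S = sum(x[k] for k in range(i + 1, min(i + delta, n - 1) + 1))
        if S >= 1.0:
            continue                          # no pair (i, j) can be violated
        # step 2: scan candidates j beyond the window
        for j in range(i + delta + 1, n):
            if x[i] + x[j] - 1.0 > S + eps:   # step 2.a: violated pair
                violated.append((i, j))
            S += x[j]                         # step 2.b
            if S >= 1.0:
                break
    return violated
```

On a small version of the example above (two unit items at the extremes, tiny fractional values in between), only the pair of extreme items is reported.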
Strengthened compactness constraints. Finally, we observe that the separation procedure for compactness constraints can be modified in a straightforward way to detect and add violated inequalities (7) instead of the original (3). In particular, for the fractional case, it is enough to replace the condition in step 2.a with condition ⌊(j − i − 1)/∆⌋ (x*_i + x*_j − 1) > S.

Dynamic Programming
In order to derive a DP algorithm for the (general) mKPC, we first introduce an auxiliary directed graph G = (V, A). The vertex set contains a source node σ, a sink node τ, and one node for each item. Overall, V = {σ, 1, . . ., n, τ}. The arc set A contains:
• an arc from σ to each node i ∈ {1, . . ., n};
• an arc from node i to node j, for each pair i, j ∈ {1, . . ., n} such that i < j ≤ i + ∆;
• an arc from each node i ∈ {1, . . ., n} to τ.
To avoid the complete enumeration of all feasible solutions, we propose a labelling algorithm in which we associate a label to each partial path from σ. A label L = (i, C, W) has three components: the last visited node i, the total cost C of the visited nodes, and the total collected weight W. The initial label is L = (σ, 0, 0). Each time a label L = (i, C, W) is extended along an arc (i, j) with j ≤ n, the new label is L′ = (j, C + c_j, W + w_j); extensions to the sink leave C and W unchanged. Optimal solutions of the mKPC correspond to labels such that i = τ, W ≥ q, and C is minimal.
Note that, as soon as W ≥ q for some label, the only sensible extension for that label is from the current node to the sink node τ. Analogously, if W < q, then it does not make sense to extend that label to τ, because the new label would correspond to an infeasible solution.
Consider two labels, L_1 = (i, C_1, W_1) and L_2 = (i, C_2, W_2), referring to two partial paths ending at the same node i. If C_1 ≤ C_2 and W_1 ≥ W_2, then no extension of L_2 up to the sink node τ can correspond to a strictly better solution than the corresponding extension of L_1 along the same path. This observation leads to the following dominance rule: L_1 dominates L_2 if C_1 ≤ C_2, W_1 ≥ W_2, and at least one of the two inequalities is strict. In this case, one can discard label L_2. In case both inequalities are actually equalities, one can discard either L_1 or L_2 (but not both), arbitrarily.
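A compact sketch of the labelling algorithm with this dominance rule, processing nodes in increasing index order (our own naming and data layout; the sink τ is implicit, and only the optimal cost is returned):

```python
import math

def mkpc_labelling(w, c, q, delta):
    """Labelling over the auxiliary DAG: a label (C, W) at node i represents a
    partial path sigma -> ... -> i with cost C and collected weight W."""
    n = len(w)
    labels = {i: [(c[i], w[i])] for i in range(n)}   # arcs from sigma
    best = math.inf
    for i in range(n):
        # dominance filter: sort by (cost asc, weight desc) and keep a label
        # only if it collects strictly more weight than all cheaper ones
        labels[i].sort(key=lambda t: (t[0], -t[1]))
        kept, best_w = [], -math.inf
        for C, W in labels[i]:
            if W > best_w:
                kept.append((C, W))
                best_w = W
        for C, W in kept:
            if W >= q:
                best = min(best, C)     # extend to the sink tau and stop
            else:
                for j in range(i + 1, min(i + delta, n - 1) + 1):
                    labels[j].append((C + c[j], W + w[j]))
    return None if best == math.inf else best
```

On the four-item example used earlier, the optimal cost is 2 with delta = 3 (items 0 and 3) and 4 with delta = 1 (the whole chain).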

Greedy Heuristic for the 1c-mKPC
For the special case of the 1c-mKPC, we describe here the greedy heuristic procedure used in the PRISCA package (Cappello, 2022) to determine whether a credible set corresponds to a valid change point. As mentioned in Section 1.1, the authors consider the case in which all costs are unitary, and they deem the credible set valid if their heuristic solution of the corresponding 1c-mKPC uses fewer than n/2 items. The greedy procedure aims at identifying a subset of items P ⊆ {1, . . ., n} with total weight at least q and satisfying the compactness constraints. The procedure starts by initialising P with a single item, namely the one with the highest weight: P ← {arg max_{j ∈ {1,...,n}} w_j}. It then keeps augmenting P by adding, at each iteration, the heaviest item which is not yet selected and whose addition does not violate the compactness constraints. The algorithm stops as soon as Σ_{j∈P} w_j ≥ q.
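The greedy procedure can be sketched as follows. We use the fact that inserting an item strictly inside the current span [min P, max P] never breaks compactness, while an item outside the span must lie within ∆ of it (0-based indices; names are ours, and the actual PRISCA implementation may differ in details):

```python
def greedy_1c(w, q, delta):
    """Greedy heuristic for the 1c-mKPC: start from the heaviest item, then
    repeatedly add the heaviest compatible unselected item until the total
    weight reaches q. Returns the selected set, or None if q is unreachable."""
    n = len(w)
    P = {max(range(n), key=lambda j: w[j])}
    total = sum(w[j] for j in P)
    while total < q:
        lo, hi = min(P), max(P)
        # items inside the span are always compatible; items outside must
        # be within delta of the current span
        candidates = [j for j in range(max(0, lo - delta),
                                       min(n, hi + delta + 1)) if j not in P]
        if not candidates:
            return None
        j = max(candidates, key=lambda j: w[j])
        P.add(j)
        total += w[j]
    return P
```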

Computational results
In this section, we report the results of computational experiments to test the effectiveness of the algorithms presented in Section 4. The code was implemented in C++, using Gurobi 9.5 as the MIP solver. Experiments ran on a machine equipped with an Intel Xeon CPU running at 2.4 GHz and 4 GB RAM (increased to 8 GB for instances with n = 600). The MIP solver was instructed to use only one thread. All algorithms used a time limit of 3600 s. The instances and the code used are available under an open-source licence (Santini, 2022).
After describing the instance set used, we analyse the results of three sets of experiments: 1. Experiments to assess the impact of strengthened constraints (7).
2. Experiments to compare the compact formulation, the branch-and-cut algorithm, and the DP labelling algorithm for the mKPC.
3. Experiments to investigate the difficulty of solving the unit-cost version of the problem. To this end, on top of the above algorithms, we also add the DP algorithm for the 1c-mKPC (described in the proof of Theorem 1) and the greedy heuristic described in Section 4.4.
Because the costs in the S1 instances are all unitary and the number of items is relatively low, we also generated a second set, denoted S2. This set contains 189 instances with n ∈ {200, 400, 600}, q = 0.95, and ∆ ∈ {2, 3, 5, 10}. In the following, we explain how we generate the weights and the costs in the instances of set S2. We use three weight-generation methods: • The Noise method first assigns each item j a preliminary weight w′_j sampled from a distribution N(λ, σ), where N(λ, σ) denotes a normal distribution with location λ and scale σ. To avoid numerical issues, we also ensure that no weight is smaller than 10^−12, i.e., we set w′_j ← max{w′_j, 10^−12}. Because the sum of the above weights is not necessarily equal to one, we finally normalise them:

w_j = w′_j / Σ_{k=1}^{n} w′_k. (9)

Figure 7 shows an example of a Noise instance, with its optimal solution represented in orange. The y axis, labelled "Probability", refers to the statistical application mentioned in Section 1.1, in which item weights represent probabilities. Noise instances tend to require a large fraction of selected items to reach the target weight of q = 0.95.
• The OnePeak method proceeds as follows. It first chooses a random location λ between 1 and n, sampling from a truncated normal distribution with location n/2 and scale n/4, and rounding to the nearest integer. It then generates an instance in which the weights have a peak around λ, i.e., an instance similar to the one depicted in Figure 3. To this end, it considers another truncated normal distribution between 1 and n, with location λ and scale n/k. Here k ∈ {8, 16, 32} is an instance generation parameter. Weights will be more tightly distributed around the peak when k is larger. The method samples 5000 times from this distribution and builds the corresponding histogram with n bars. The j-th bar counts how many samples fell in the interval [j, j + 1). The weight w′_j of the j-th item is then set as the height of the j-th bar of the histogram. Finally, weights w_j are obtained by normalisation as in eq. (9). Figure 8 shows an example of a OnePeak instance.
• The TwoPeaks method is similar to OnePeak, except that the histogram is built by sampling from the sum of two truncated normal distributions with locations λ_1 and λ_2, and common scale n/(2k). Intuitively, λ_1 and λ_2 are the locations of two peaks. The values of the two locations are drawn from two further truncated normal distributions between 1 and n, and rounded to the nearest integer. The first distribution has location n/3, the second one has location 2n/3, and both have scale n/6. Figure 9 shows an example of a TwoPeaks instance.
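As an illustration, the OnePeak recipe above can be sketched with rejection sampling for the truncated normals (stdlib only; function names and the seeding scheme are ours, and the exact sampling details of the original generator may differ):

```python
import random

def truncnorm(lo, hi, loc, scale, rng):
    """Sample from a normal(loc, scale) truncated to [lo, hi] by rejection."""
    while True:
        v = rng.gauss(loc, scale)
        if lo <= v <= hi:
            return v

def one_peak_weights(n, k, seed=0, samples=5000):
    """OnePeak-style weights: pick a peak location lam, histogram `samples`
    draws from a truncated normal centred at it, and normalise the bars."""
    rng = random.Random(seed)
    lam = round(truncnorm(1, n, n / 2, n / 4, rng))
    counts = [0] * n
    for _ in range(samples):
        v = truncnorm(1, n, lam, n / k, rng)
        counts[min(int(v) - 1, n - 1)] += 1   # bar j counts samples in [j, j+1)
    total = sum(counts)
    return [cnt / total for cnt in counts]
```

The returned weights are non-negative and sum to one, as required by the normalisation in eq. (9).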
We use three cost-generation methods: • The Constant method simply assigns unit costs to all items; it allows us to extend the results obtained on the S1 set to larger instances with different weight types.
• The Few method aims at modelling real-life statistical applications, in which few items have a small cost and all other items have a constant larger one. In particular, it first selects n/100 items using a roulette wheel method with probabilities equal to the item weights. It then assigns these items a cost of 0.10, and all other items a cost of 1. The reason we use roulette wheel selection is that, in the application, the items with the lower costs correspond to time instants with a higher prior probability of containing a change point.
These items are thus also more likely to be detected by the algorithm and, as a consequence, to have a larger weight. Therefore, assuming that the prior knowledge is accurate and that the algorithm works correctly, items with larger weights are more likely to have lower costs.
• The Random method assigns each item a cost uniformly distributed in the interval [1,10].
Note that we have three possible values for parameter n, three values for parameter ∆, and three for the cost generation method.Their combination gives 27 parameter combinations using weight generation method Noise.Because we generate 3 instances for each combination, we build 81 Noise instances.Furthermore, for each of these 27 combinations, we have 3 possible values for parameter k, yielding 81 parameter combinations for each of the OnePeak and TwoPeaks weight generation methods.Again, generating 3 instances for each combination, we obtain 243 instances for each of the two methods.Overall, we then construct 81 + 2 × 243 = 567 instances.

Computational experiments
In this section, we present the results of computational experiments on the instances described in Section 5.1. We first investigate the role of the strengthened inequalities (7) on the compact formulation and the branch-and-cut (B&C) algorithm. Next, we compare these two algorithms with the labelling algorithm introduced in Section 4.3. We present the results of these comparisons using the instances of set S2 because these are larger and more varied. Finally, we compare our approaches (including the DP algorithm introduced via Theorem 1) with the greedy heuristic of the PRISCA package (Cappello, 2022) on 1c-mKPC instances. This comparison allows us to assess the advantage of exact algorithms over a heuristic one, on instances relevant to data scientists.
Impact of strengthened compactness constraints. We use two relevant metrics to assess the impact of strengthened inequalities (7):
1. The percentage optimality gap, i.e., the gap between the best primal and dual bounds returned by each algorithm within the time limit. This metric is denoted as Gap% and is defined as Gap% = 100 · (UB − LB)/UB, where UB indicates the value of the best primal solution and LB is the tightest dual bound returned by the solver. Gap% corresponds to the familiar gap returned by black-box integer programming solvers and depends on the quality of both the primal and the dual bound.
2. The second metric is the solution time in seconds, including the time spent creating the model and exploring the branch-and-bound tree. It is denoted by Time (s).
We also note that instances generated using weight type TwoPeaks are considerably harder than the other instances. Therefore, we present the results obtained on Noise and OnePeak instances separately from those obtained on TwoPeaks instances. After commenting on these results, we will come back to the difficulty of TwoPeaks instances and explain what sets them apart from the other instances. Table 1 reports the results on the Noise and OnePeak instances of set S2. Because all algorithms solve to optimality all instances with up to 600 items, Table 1 only reports the runtimes. Note how different the runtimes are for the complete compact formulation and for the B&C algorithm. For the largest instances, for example, the average runtimes needed to solve the compact formulation are in the order of hundreds of seconds. The B&C algorithm, on the other hand, closes these instances in a few hundredths of a second: a difference of five orders of magnitude. Regarding the effect of the strengthened compactness constraints, we note that they do not seem to help when solving the full compact formulation; if anything, they slightly increase the computation time. On the other hand, they reduce the computation time of the B&C algorithm.
Table 2 presents the results on the TwoPeaks instances. These instances are considerably harder to solve: in several cases, the solvers run out of time without solving the model to optimality. Even when they solve the model to optimality, it takes on average much longer compared with Noise and OnePeak instances. For these instances, the strengthening constraints have a considerable effect on the solvers. Indeed, by using the strengthened inequalities (7), the average gaps are roughly reduced by two thirds. We also observe that, on these harder instances, the B&C algorithm loses its advantage over the compact formulation. The gaps produced by B&C are slightly worse, while the runtimes are comparable.

Peculiarity of the TwoPeaks instances.
As Tables 1 and 2 show, TwoPeaks instances are much harder to solve using branch-and-bound methods, compared with the other instances.
The reason lies in the characteristics of the optimal solution of the continuous relaxation of the mKPC. Solutions of TwoPeaks instances have a large number of fractional items, and the values of the corresponding variables x*_j are closer to 0.5. This implies that much more branching is necessary while exploring the branch-and-bound tree. To appreciate the extent by which TwoPeaks instances differ from the other instances, Figure 10 shows boxplots of two metrics: the percentage of fractionally selected items and metric FracGini. Metric FracGini is the normalised Gini coefficient of the fractional variables. The value of this metric is higher when many x*_j are concentrated around 0.5, while it is lower when the x*_j take values close to 0 or 1. Values x*_j ∈ {0, 1} do not contribute to the sum at the numerator. Therefore, solutions with more fractional items have more non-zero terms in the sum at the numerator. To compensate for this fact, we normalise by dividing by the number of fractional items.
Comparison of the algorithms for the mKPC. Table 3 compares the performance of the three approaches for the mKPC. Because strengthened inequalities (7) result in lower gaps, we enable them for both the branch-and-cut algorithm and the compact formulation.
The unit-cost special case corresponds to the case in which the user of the PRISCA and SuSiE algorithms mentioned above has no prior knowledge of, respectively, which time instants and which features are more likely to be selected.
We proved that the 1c-mKPC is solvable in polynomial time, and we proposed a specific dynamic programming algorithm.Computational results clearly show that using this algorithm is better than both the generic mKPC approaches and a greedy heuristic from the statistics literature.

Figure 1: Comparison of the solutions of the min-Knapsack problem and the mKPC on the same instance. Parameter ∆ = 2.

(a) The expected value of this time series changes three times. Shaded areas show time periods in which, qualitatively, the change appears to be happening. (b) The variance of the time series changes while the expected value stays constant.

Figure 2: Example time series which change their expected value and variance. Black points indicate the time series values y_t. Shaded areas represent time periods where, qualitatively, an analyst would expect a change point.

Figure 3: Probabilities associated with each time point, representing how likely the point is to be the first change point of the time series.

Figure 4: The bottom chart shows a credible set relative to the first change point of the time series in the top chart, when disregarding compactness. The points in the credible set are highlighted in yellow.

Figure 5: The bottom chart shows a credible set relative to the first change point of the time series in the top chart, considering compactness requirements. The points in the credible set are highlighted in yellow.

Figure 6 depicts graph G when ∆ = 2. Thinner arrows represent arcs from σ and to τ, while the thicker ones represent arcs between nodes {1, . . ., n}. A feasible solution of the mKPC corresponds to a path in G starting at σ, ending at τ, and such that the weight collected at the visited nodes is at least q.

Figure 6: Graph G used by the labelling algorithm. The graph in the figure depicts an instance with ∆ = 2.

Figure 9: Example TwoPeaks instance with its optimal solution.

Figure 7: Example Noise instance with its optimal solution.
Figure 8: Example OnePeak instance with its optimal solution.
Figure 10: Left: percentage of items selected fractionally in the optimal solution of the continuous relaxation of the mKPC. Right: normalised Gini coefficient showing how close the fractional values are to 0.5 (the higher the value, the closer to 0.5).

Table 1: Impact of strengthened inequalities (7) on the performance of the Compact Formulation and the Branch & Cut algorithm. The table refers to instances whose weights are generated with the Noise and OnePeak methods.

Table 2: Impact of strengthened inequalities (7) on the performance of the Compact Formulation and the Branch & Cut algorithm. The table refers to instances whose weights are generated with the TwoPeaks method.

Table 3 reports the following metrics: 1. Opt% is the percentage of instances in each row for which the algorithm found a provably optimal solution.

Table 3: Comparison of the MIP-based approaches (the branch-and-cut and the compact formulation) with the labelling algorithm presented in Section 4.3, on the S2 instances.