Operations Research Letters

We introduce the half-cycle formulation (HCF), a new integer programming (IP) model for the kidney exchange problem, which has life-saving applications. In HCF, a cycle (i.e., set of kidney swaps) is represented by two compatible halves. After giving several algorithmic enhancements for HCF, we show through extensive computational experiments with an IP solver that our new model outperforms existing formulations when the cycle size limit is set to 4, 5, or 6, depending on the density of the compatibility graph. © 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons .org /licenses /by /4 .0/).


Introduction
Kidneys have an essential role in the human body as they filter the blood to get rid of waste products, they keep the electrolytes (such as sodium and potassium) and water content of the body constant, and they secrete a number of essential hormones [18]. While most people have two kidneys, it is possible to have a healthy and active life with only one functioning kidney. However, if someone has impaired kidney function (which is the case for almost 700 million people in the world according to recent estimates [5]), they may need specialist medical care that can either consist of dialysis or transplantation, the latter option offering both a better quality of life and a longer life expectancy while being less costly [4,23].
For many years, the kidney used in the transplantation could only come from a deceased donor or from a living donor belonging to the close family of the person in need (e.g., a partner or a blood relative). This requirement was motivated by ethical reasons, to remove any financial incentive for living kidney donors. Living donors are in high demand as (i) there are better outcomes for recipients who receive a transplant from a living donor [16] and (ii) in most countries, the pool of deceased donors is not large enough to cover the demand. However, due to a blood-type or tissue-type incompatibility between the two people involved in a transplantation, many recipients cannot receive a kidney despite having found a willing living donor. * Corresponding author. Thanks to the seminal works of Rapaport [24] and Roth et al. [26] and after a series of legislation changes in many countries to provide a legal framework for transplants between two persons that do not know each other, kidney exchange programmes were created. Such programmes allow a set of (not necessarily compatible) recipient-donor pairs registered in the system to exchange their donor kidneys if the resulting swaps satisfy compatibility requirements. The swaps are typically determined by a centralised matching algorithm and are computed at regular intervals (e.g., every three months in the UK [17]). Compatibility relationships can easily be captured by a directed compatibility graph, where an arc is drawn from one pair to another if the donor of the first pair is compatible with the recipient of the second pair. For example, if we consider two recipient-donor pairs (r 1 , d 1 ) and (r 2 , d 2 ) and a complete compatibility graph, then a suitable exchange matches r 1 to d 2 and r 2 to d 1 and is called a 2-cycle ("2" because there are two kidney transplants since two recipient-donor pairs are involved in the swap, and "cycle" because the arcs associated with these swaps form a cycle in the compatibility graph). Due to practical constraints, most kidney exchange programmes impose a limit on the number of pairs that can be involved in the same cycle. The kidney exchange problem (KEP) can therefore be defined as the problem of finding a vertex-disjoint set of cycles with the maximum number of transplants, given a set of recipient-donor pairs, a compatibility graph, and a limit on the number of pairs involved in any cycle. Note that the vertex-disjoint requirement comes from the fact that each recipient and donor can be involved in at most one cycle. Over the years, kidney exchange programmes have evolved and developed country-specific components, resulting in several KEP extensions. One of the most frequent extensions considers the inclusion of non-directed (or altruistic) donors, each of which is not paired with any recipient. An exchange triggered by a non-directed donor is called a chain because the recipient of the last recipientdonor pair receives a kidney but its paired donor is not immediately matched with any other recipient in the pool of recipientdonor pairs (this donor may initiate a chain in the next run of the programme or give their kidney to a recipient on the deceaseddonor waiting list). Another frequent extension considers a set of hierarchically ordered objective functions to determine the best solution (i.e., the best set of cycles and chains). For example, a kidney exchange programme may want to maximise the number of transplants, and if several optimal solutions exist, may wish to pick the one that includes the highest number of 2-cycles. We refer the interested reader to the works of Biró et al. [6,7] for a detailed description of European kidney exchange programmes.
As far as the KEP literature is concerned, we identified three main (not necessarily disjoint) paper categories: (i) those providing new theoretical results and models for the KEP; (ii) those increasing the algorithmic performance of existing models; and (iii) those adapting existing approaches to take supplementary realworld features into account.
Starting with papers in category (i), two of the most important theoretical KEP results were obtained by Roth et al. [27] and Abraham et al. [1]. The former showed that if only 2-cycle are allowed, then an optimal solution for the KEP can be found in polynomial time as the problem can be reduced to a maximum weighted matching problem. The latter proved that the KEP becomes N Phard when the limit on the number of pairs involved in a cycle is greater than or equal to 3, while it can be solved in polynomial time if such a limit does not exist. The N P-hardness result has motivated the research community to find effective techniques for coping with the complexity of KEP, which include Integer Programming (IP). Among the most efficient (i.e., fastest to solve with an IP solver) KEP models without non-directed donors introduced in the literature, we mention the cycle formulation and the edge formulation that were both introduced by Roth et al. [28], the extended edge formulation (EEF) introduced by Constantino et al. [9], and the position-indexed edge formulation (PIEF) introduced by Dickerson et al. [14]. Since a non-directed donor can be modelled as a recipientdonor pair whose recipient is compatible with every other donor, these formulations can also be used for KEP instances containing non-directed donors. We mention however that Dickerson et al. [14] introduced a very effective way to model chains in KEPs that, when coupled with the cycle formulation to model cycles, obtained state-of-the-art results for instances with at most three pairs per cycle and up to eleven pairs per chain.
In category (ii), as far as improving the algorithmic performance of existing models is concerned, we mention the work of Lam and Mak-Hau [19] who solved the cycle formulation with a branch-and-price framework, and the work of Delorme et al. [11] who used reduced-cost variable fixing together with preprocessing techniques and a diving algorithm to enhance the performance of the cycle formulation for KEPs with hierarchical optimisation. Finally, for category (iii), the literature aiming at adapting existing approaches to real-world applications is very large. We only mention a few main areas such as realistic instance generators [12,29], stochastic optimisation [2,22], and hierarchical optimisation [7,11].
In this work, we introduce the half-cycle formulation (HCF), a new IP model for KEPs without non-directed donors. The model is inspired by the cycle formulation and represents a cycle by two compatible halves. We detail some algorithmic enhancements for HCF; these include the use of a reduced-cost variable fixing framework [11] that could also be applied to PIEF. We show that HCF together with these algorithmic enhancements obtains better empirical performance when solved with a commercial IP solver than other mathematical models when the cycle size limit is set to 4, 5, or 6, depending on the density of the compatibility graph. This improved empirical performance includes faster solution times and the ability to solve larger instances to optimality. Increasing the instance size that can be solved with IP modelling is important for the KEP community because these formulations are often considered easier to re-implement and modify than handmade algorithms such as branch-and-price algorithms [19] or matheuristics [12]. Our experimental findings are based on an extensive empirical evaluation that indicates the limits of each of the tested IP models.
The rest of the paper is organised as follows. In Section 2, we formally define the KEP and briefly mention existing IP formulations and their main features. We then introduce HCF in Section 3 together with a study of its continuous relaxation, some reduction procedures and algorithmic enhancements, a discussion about extending the model to handle non-directed donors, and some insights about a version of HCF where cycles are split into three parts or more. Section 4 gives a summary of the results obtained by extensive computational experiments on realistic instances and concluding remarks are provided in Section 5.

Problem statement and existing formulations
In the KEP without non-directed donors (called KEP-WNDD thereafter), we are given a set of n recipient-donor pairs together with a limit K on the maximum number of pairs that can be included in any cycle and a compatibility graph G = (V, A) where vertex set V = {v 1 , v 2 , . . . , v n } contains one node per recipientdonor pair and where arc set A contains arc (p 1 , p 2 ) if the donor of pair p 1 is compatible with the recipient of pair p 2 . The objective is to compute a subset of arcs A ⊆ A (the transplants) such that |A | is maximised while being solely composed of disjoint cycles with cardinality K or below.
Let S = {v 1 , . . . , v r } be a set of vertices, where each recipient and donor appears in at most one vertex of S.
The size of this cycle is r. In anticipation of the description of HCF, we also introduce the concept of a half-cycle where a cycle can be decomposed into two halves, denoted by v 1 , v 2 , . . . , v r/2 +1 and v r/2 +1 , v r/2 +2 , . . . , v r , v 1 . For example if we consider the cycle [A, B, C , D], where A, B, C , D ∈ V and (A, B), (B, C ), (C, D), (D, A) ∈ A, then this cycle can be decomposed into two half-cycles A, B, C and C , D, A . The sizes of these half-cycles are r/2 + 1 and r/2 + 1, respectively. We also remark that the vertex labelling of {v 1 , v 2 , . . . , v n } gives an ordering on the vertices that will be used in HCF.
One of the most intuitive models, the so-called cycle formulation [28], enumerates every possible cycle of size K or below, stores them in set C, and associates one binary variable ξ c to every feasible cycle c ∈ C, where ξ c takes value 1 if cycle c is selected in the solution and value 0 otherwise. By defining the set V (c) to contain the vertices involved in cycle c (c ∈ C), we can define the cycle formulation as follows: Objective function (1) maximises the number of transplants while constraints (2) make sure that no recipient-donor pair appears in more than one of the selected cycles. Even though it is one of the first proposed IP models, the cycle formulation still obtains the best results for KEP-WNDD instances with values of K up to 4 when compared to other IP models [9]. This can be explained by the size of the model, containing O (n K ) variables and O (n) constraints, which remains tractable for low values of n and K , and by the good quality of the bound derived from the Linear Programming (LP) relaxation of the model, which is the best among all existing IP models for the KEP-WNDD [14]. Another intuitive model is the so-called edge formulation [28] which associates a binary variable to every arc a ∈ A in the compatibility graph. This variable takes value 1 if the arc is selected in the solution and value 0 otherwise. The objective is therefore to maximise the number of arcs selected under the constraints that (i) there should be flow conservation at every node, (ii) every node should have at most one outgoing flow, and (iii) the selected arcs should not form a cycle of size K + 1 or above. While the edge formulation only uses one variable per arc (so at most n 2 in total), it requires an exponential number of constraints of type (iii). One way to model these constraints is to enumerate every feasible chain of size K that does not form a cycle and forbid each of these chains with a no-good cut, resulting in a total of O (n K ) constraints.
It was empirically shown that the edge formulation is much less effective than the cycle formulation [9], even when solved within a constraint generation framework that only adds the relevant cycle elimination constraints (i.e., the ones that have been violated by a previous incumbent solution) to the enumeration tree. The first compact (or fully polynomial) IP formulation for the KEP-WNDD is EEF [9]. In brief, EEF considers n copies of the compatibility graph, one for every potential cycle, and associates one variable for every arc in every copy. Such a variable takes value 1 if the arc is selected in the solution in its associated copy and value 0 otherwise. The objective is still to maximise the number of arcs selected under the constraints that (i) there should be flow conservation at every node in each of the n graph copies, (ii) every node should have a sum of outgoing flow over the n copies less than or equal to 1, and (iii) at most K arcs should be selected per graph copy. Even though EEF has O (n 3 ) variables and O (n 2 ) constraints, the model is not particularly competitive in practice due to the poor quality of its linear relaxation. Later on, Dickerson et al. [14] introduced PIEF, an extension of EEF that duplicates each of the n graph copies K times to keep track of the position of every arc in a cycle. They showed that PIEF has the same LP-relaxation as the cycle formulation while only having O (Kn 3 ) variables and O (Kn 2 ) constraints. Note that preprocessing techniques can remove many variables and constraints in EEF and PIEF.
A summary of the most prominent existing models, their respective size, and the quality of their linear relaxation is given in Table 1. For the sake of completeness, we also included our new formulation HCF in the table. While many features may impact the time required by a state-of-the-art IP solver to solve an instance to optimality, the research community tends to agree that the model size (i.e., the number of variables and constraints) and the quality of the LP-relaxation bound both play a crucial role. In that regard, the cycle formulation, PIEF, and HCF complement each other very well as each of these models has the smallest size for at least one value of K 1 while always having a tight LP-relaxation bound (for HCF, the model size and the tightness of the LP-relaxation are derived in Section 3.1).
To the best of our knowledge, no other mathematical model for the KEP-WNDD has been proposed in the literature, apart from two edge formulation extensions introduced in a thesis [25] that 1 Example values are K = 2 for the cycle formulation, K = 4 for HCF and K = 6 for PIEF.
tight were shown to offer no competitive advantage. Note that a larger variety of models have been proposed for the KEP with nondirected donors [3,14,20]. Indeed, even though every modelling technique used to represent a cycle can also be used to represent a chain, there are also (more efficient) techniques that can be used to represent a chain that are not suitable for representing a cycle.

Half-cycle formulation and reduction procedures
In the last decade, several research papers introduced strategies to exploit the symmetry within combinatorial optimisation problems to derive IP models with a reduced size. For example, Delorme and Iori [13] introduced a new mathematical model for the bin packing problem that uses half of the bin capacity to model an instance, requiring significantly fewer constraints and variables than existing models (that used the full bin instead). The authors represented both the first half and the second half of a bin with the same modelling structure and showed that the resulting decrease in terms of model size outweighed -by far -the extra burden brought by the supplementary constraints that are necessary to ensure compatibility between the selected half bins. This "two halves instead of a whole" strategy was extended later on to other packing problems [10,21]. In this section, we show that a similar idea can also be used for the KEP-WNDD.

Mathematical model
In our new formulation HCF, a cycle is represented by two compatible half-cycles. To guarantee that the half-cycles selected by the model are compatible, we simply need to ensure that, for every pair of nodes v 1 , v 2 ∈ V, a half-cycle starting by v 1 and ending by v 2 is selected if and only if another half-cycle starting by v 2 and ending by v 1 is also selected. It is tempting to believe that splitting a cycle into two halves will lead to a mathematical model that has twice as many variables as the cycle formulation. However this need not be the case, as is illustrated in Section A of the Online Supplement.
After enumerating every possible half-cycle of size up to 1 + K /2 and storing them in set H, the model associates one binary variable x h to every feasible half-cycle h ∈ H taking value 1 if halfcycle h is selected in the solution and value 0 otherwise. Note that half-cycles containing the same node twice are not generated. By vertices of h, we can define HCF as follows: Objective function (4) maximises the number of transplants while taking into account that the starting and ending nodes of every selected half-cycle only count for halves as they both appear in two half-cycles. Constraints (5) make sure that no recipient-donor pairs appear more than once in the middle or more than twice at the start/end of the selected half-cycles, while constraints (6) ensure that every half-cycle selected in the solution can be matched with another selected half-cycle to form a complete cycle. Note that, to ensure the correctness of the model in the case where K is odd, the following set of constraints is needed in HCF: (8) Constraints (8) prevent any half-cycle v 1 , . . . , v 2 with size 1 + K /2 and v 1 index greater than v 2 index from being generated. This forbids two half-cycles with size 1 + K /2 from being merged, which would be the equivalent (when K is odd) of a cycle with size K + 1.
We point out that the bound obtained by the LP-relaxation of HCF is as tight as the bound obtained by the cycle formulation since the two models are equivalent, or in other words, since any continuous solution of model (4)-(8) can be transformed into a continuous solution of model (1)-(3) with the same objective value and vice versa (the proof is available in Section B of the Online Supplement).

Reduction procedures and node ordering
As in the case of other IP models for the KEP, HCF displays some symmetry.  (6) by only allowing a half-cycle v 1 , . . . , v 2 with size k to be matched with another half-cycle v 2 , . . . , v 1 with size k or, in case the index of v 1 is smaller than the index of v 2 , with size k and k − 1. Preprocessing removing unnecessary variables can also be applied to HCF. A simple rule is to only consider a half-cycle in the model if it can be completed by another half-cycle.
Considering that the bound obtained by the LP-relaxation of HCF is as tight as the bound obtained by the cycle formulation, we can follow the path of Delorme et al. [11] and apply the concept of a destructive bound together with a reduced-cost variable fixing strategy to reduce the number of variables that needs to be considered in the model. To do so, we start by solving the LPrelaxation of model (4)-(8), save the objective value ẑ, and use it to derive a valid upper bound U = ẑ on the maximum number of transplants. From now on, we are only looking for an integer solution with objective value U . Delorme et al. [11] proved a general result concerning reduced-cost variable fixing, a consequence of which is that any continuous solution of model (4)-(8) where a variable x h with reduced cost ŝ h ≤ U −ẑ − takes value 1 or above 2 Note that this example is only valid if K ≥ 6, as at most one half-cycle among A, B, C , D and C , D, E, A would be generated if K = 5, since in this case, the former half-cycle would only be generated if the index of A is smaller than the index of C (as the first reduction procedure ensures that either the index of A or the index of D is smaller than the index of C , and constraints (8) force the index of A to be smaller than the index of D), while the latter half-cycle is only generated if the index of C is smaller than the index of A (again by constraints (8)). must have objective value strictly below U ( is a very small number used to avoid precision errors). As a result, no integer solution selecting the half-cycle associated with such a variable x h can have objective value U and x h can therefore be deactivated. If we find a solution with value U for the reduced model, then that solution is optimal. If that is not the case, then it means that no solution with objective value U exists, so U − 1 becomes a valid upper bound. Therefore, U is decremented and all the variables are reactivated except the ones with reduced cost smaller than or equal to the updated U −ẑ − value. The algorithm is iterated until a solution with value U is found. Note that in a more efficient implementation, the algorithm terminates if the optimal objective value of the reduced model is equal to U − 1.
A peculiarity of HCF (which also occurs with EEF and PIEF, but not in the cycle formulation) is that the node ordering has an impact on the number of variables required by the model. To illustrate this behaviour, we consider a simple instance whose compatibility graph G is displayed in the left part of Fig. 1, where G has 4 recipient-donor pairs and K = 4. In the right part of the figure, we outline the half-cycles that should be merged together in order to build each of the feasible cycles of the instance, given a certain node ordering and the reduction procedures mentioned above. One can observe that using node ordering B < C < D < A produces fewer half-cycles (and thus, fewer variables in the HCF model) than using node ordering A < D < B < C . In fact, searching for the minimum number of half-cycles that are necessary to represent every feasible cycle in a KEP-WNDD is an optimisation problem in its own right. Interestingly, such an objective cannot always be achieved by using a specific node ordering in the context of our HCF model construction process. Indeed, in our example, every feasible cycle can be obtained by using only 4 half-cycles,

Extending HCF to the KEP with non-directed donors
Even though we introduced HCF for the KEP-WNDD, we point out that there are several ways to extend the model to take nondirected donors into account. The most intuitive (but probably not the most efficient) strategy would be to represent the chains as cycles in the compatibility graph, by introducing a dummy recipient for each non-directed donor that is compatible with every directed donor. However, we expect that better computational performance could be obtained by using instead the chain structure introduced by Dickerson et al. [14] as the latter only requires a polynomial number of variables and constraints, O (Ln 2 ) and O (Ln), respectively (where n is now the number of recipient-donor pairs plus the number of non-directed donors and L is the chain size limit). Unique half-cycle representation given a fixed node ordering  Table 1 for any value of K .

Computational experiments
We first tested the performance of HCF on a set of randomly generated instances with various sizes n ∈ {50, 100, 200, 400, 600, 800, 1000} and maximum cycle size values K ∈ {3, 4, 5, 6, 7, 8} using the instance generator introduced by Delorme et al. [12] and available at https://wpettersson .github .io /kidney-webapp. We clicked on "Use recipient blood group distributions from the paper" and left the other parameters untouched. We generated 20 instances for each value of n, resulting in 140 instances in total that can be downloaded from https://github .com /mdelorme2 / Half _Cycle _Instances. These instances will be referred to as the DGGKMPT instances in the following. Note that each of the 140 DGGKMPT instances were solved using different values of K . We then tested the performance of HCF on instances from the "Kidney Data (00036)" dataset [15], which can be downloaded from the PrefLib library at https://www.preflib .org /dataset /00036 and which will be referred to as the PrefLib instances in the following. We selected the 10 PrefLib instances without non-directed donors for each size n ∈ {64, 128, 256, 510, 1024}. PrefLib instances have been widely used by the research community [12,19] and are characterised by a high graph density (around 25%), which usually increases the number of variables that needs to be generated in IP models. The graph density for DGGKMPT instances is close to 10%, a proportion that is similar to what is observed in the UK kidney exchange programme, which is one of the reasons why it was argued in [12] that the generator from [12] produced more realistic instances than those obtained by the well-known Saidman generator [29], which was used to create the PrefLib instances. We tested the following models: • CYCLE, the cycle formulation [28] with symmetry reduction (only the cycles starting with the lowest indexed pair are generated); • EDGE, the edge formulation [28] implemented within a branchand-cut framework; • EEF, the extended edge formulation [9] with symmetry reduction (in copy v of the compatibility graph, only the arcs with head and tail indices equal to v or above that can be included in a cycle of size up to K containing node v are considered; nodes are sorted according to a pre-defined node ordering); • PIEF, the position-indexed edge formulation [14] with the same symmetry reduction and node ordering as EEF; • HCF, the half-cycle formulation introduced in Section 3 with symmetry reduction and node ordering as described in Section 3.2; • CYCLE-RCVF, algorithm CYCLE with reduced-cost variable fixing (as described by Delorme et al. [11] and summarised in Section 3.2); • PIEF-RCVF, algorithm PIEF with reduced-cost variable fixing; • HCF-RCVF, algorithm HCF with reduced-cost variable fixing.
We point out that we did not use reduced-cost variable fixing on the edge formulation and on EEF because the LP-relaxation of those models is often more than one unit away from the optimal solution value.
Our algorithms were all coded in C++ and can be downloaded from https://github .com /mdelorme2 /Half _Cycle _Codes. The experiments were run on an Intel(R) Core(TM) i5-1135G7, 2.40 GHz with 32 GB of memory, running under Ubuntu 22.04.1 LTS, and Gurobi 10.0.0 was used to solve the IP models. A single core was used for the tests (i.e., parameter Threads was set to 1), the barrier algorithm was used to solve the root nodes of the IP models (i.e., parameter Method was set to 2), and callbacks were used for the constraint generation (and thus, parameter LazyConstraints was set to 1 for EDGE). All the instances were first solved with a 16 GB memory limit for the environment (i.e., parameter Mem-Limit was set to 16). Those instances that ended prematurely due to being out of memory were rerun with the memory limit increased to 30 GB. For the algorithms using reduced-cost variable fixing, we deactivated the crossover operation when solving the LP models (i.e., parameter Crossover was set to 0). This means that the solver did not try to transform the interior solution produced by the barrier algorithm into a basic solution. For these last three approaches, we also changed the focus of the solver so that more time was spent on finding a good-quality solution (i.e., parameter MIPFocus was set to 1). For every run, a time limit of 3600 seconds was imposed. Note that, for a given K , when an approach could not solve any instance with size n, we decided to not test the approach on instances larger than n and filled the associated rows with the symbol "-".
Results for the DGGKMPT instances are presented in Section 4.1, whilst those for the PrefLib instances are given in Section 4.2.

DGGKMPT instances
We report in Table 2 the results of a first set of experiments aimed at evaluating the impact of two node ordering rules: one where the vertices are sorted in descending order of total degree (denoted by the suffix "-D") that was suggested by Dickerson et al. [14] for PIEF, and one where the vertices are sorted in ascending order of total degree (denoted by the suffix "-A"). For the sake of conciseness, we only report the results for models PIEF and HCF on instances with K = 4. The first two columns identify the limit K on the number of pairs that can be included in a cycle and the total number of pairs n. The following columns provide, for each of the tested models, the number of optimal solutions found, the average CPU time over the 20 runs (including the ones terminated by the time limit), and the average number of variables and constraints involved.
For K = 3, we observe that every approach apart from EDGE could solve all the instances. CYCLE-RCVF obtained the best results, followed by CYCLE, HCF-RCVF, and HCF. We point out that there is no theoretical interest in using HCF (or HCF-RCVF) over CYCLE (or CYCLE-RCVF) for these instances as every half-cycle of size 3 can only be completed by one single half-cycle of size 2. We also remark that the version of the model using reduced-cost variable fixing always outperforms the corresponding version that does not use it.
For K = 4, we observe that only CYCLE-RCVF and HCF-RCVF can solve all the tested instances, but neither of the two approaches clearly outperforms the other. We mention that CYCLE has around 2.8 million variables and 800 constraints on average for instances with 800 recipient-donor pairs while HCF has around 1.1 million variables and 86 thousand constraints. This confirms the general paradigm behind HCF, which is to reduce the number of variables at the cost of supplementary constraints compared to CYCLE. For the same instances, CYCLE-RCVF has around 1 million variables and 800 constraints on average while HCF-RCVF had 581 thousand variables and 66 thousand constraints, which indicates that reduced-cost variable fixing can be very effective. Based on these results, we continue our tests with larger values of K for the methods EDGE, CYCLE-RCVF, PIEF-RCVF, and HCF-RCVF. We decided to include EDGE because it is expected that the branch-and-cut framework becomes more effective as K increases since the cycles found in incumbent solutions are less likely to have size K + 1 or above (and therefore, are less likely to require a cut).
We report in Table 4 the results obtained by the four remaining approaches on DGGKMPT instances with K = 5 and K = 6. For each model, we added the average number of variables and constraints involved in the model after reduced-cost variable fixing (if any) and for EDGE, the average number of cuts added within the branch-and-cut framework. As early termination due to memory limit occurred in these experiments (even after increasing that limit to 30 GB), the number of instances that were prematurely ended is now reported within brackets after the number of optimal solutions found in column "# opt" (if that number was not 0). Typically, an instance would run out of memory within the first hundred seconds of running time, when solving the LP-relaxation of the model (so before reduced-cost variable fixing). As a result, the instances terminated because the memory limit was exceeded were neither used to compute the average computation time of the approach nor the average number of variables and constraints involved in the model.
For K = 6, HCF-RCVF obtains again the best results as it was the only approach able to solve all instances with n ≤ 400, and its computation times were the fastest on average. However, for n = 600, we observe that the model could not solve any of the 20 instances while PIEF-RCVF could solve one, indicating the start of a performance shift between the two models. We note that the number of variables grows at a lower pace with n for PIEF-RCVF than for HCF-RCVF, but the former model involves more constraints on average than the latter. For this K value also, we note that EDGE struggles to solve instances with n = 100 or more and that CYCLE-RCVF experiences difficulties due to memory limits.
We report in Table 5 the results obtained by the same four approaches on instances with K = 7 and K = 8. For K = 7 and K = 8, it is interesting to observe that PIEF-RCVF now obtains the best results as it is able to solve 16 out of 20 instances with size n = 400 for K = 7 (versus 10 for HCF-RCVF) and it is the only approach able to solve instances with size n = 400 for K = 8. Even though HCF-RCVF is ahead when it comes to solving instances with size n ≤ 200 for K = 7 and 8, we observe that the number of variables involved in the model increases too quickly with n compared to PIEF-RCVF, indicating that the model loses its competitive advantage for K ≥ 7 over PIEF-RCVF. We also note that we start to see a shift in the performance of EDGE, as it becomes faster to solve instances with size n = 100 as K increases, which seems to indicate that when K becomes very large, EDGE might obtain the best results.

PrefLib instances
Our second set of experiments is intended to give evidence that the competitive advantage displayed by HCF over the cycle formulation and PIEF also occurs for KEP instances produced by other generators. To do so, we tested EDGE, CYCLE-RCVF, PIEF-RCVF, and HCF-RCVF on PrefLib instances and report in Section C of the Online Supplement the results obtained by the four tested approaches on instances with K = 3, 4, 5 and 6.
For K = 3, the observations made in our previous experiments still hold: CYCLE-RCVF obtained the best results, followed by HCF-RCVF, and PIEF-RCVF. For K = 4, we observe an interesting shift as HCF-RCVF now clearly outperforms CYCLE-RCVF. This is mainly due to the model size: as the graph density is larger, the number of variables increases at a very fast pace in CYCLE-RCVF with respect to what is observed for HCF-RCVF. PIEF-RCVF is not yet competitive as it requires almost as many variables as HCF-RCVF while also requiring five times as many constraints. For K = 5, PIEF-RCVF and HCF-RCVF obtain similar results: the former requires more constraints but the latter requires more variables. For K = 6, the number of variables in HCF-RCVF now increases too fast with the instance size and the best performance is now obtained by PIEF-RCVF. This observation was confirmed by similar experiments using K = 7 and K = 8. Overall, PrefLib instances appear to be harder for the tested models, but one should keep in mind that ad hoc strategies to solve instances with high density compatibility graphs to optimality have been proposed in the literature and are particularly effective for PrefLib instances [12].

Conclusions
We introduced the half-cycle formulation (HCF), an IP model to solve the KEP-WNDD that represents a cycle using two halves. After detailing a few symmetry reduction procedures, we showed that it was possible to use reduced-cost variable fixing to enhance the performance of the model. We showed through extensive computational experiments that HCF, when solved with symmetry reduction, reduced-cost variable fixing, and descending-degree node ordering, obtained better results than the cycle formulation with symmetry reduction and reduced-cost variable fixing when the cycle size limit is set to K = 5 or above (K = 4 or above for instances with a dense compatibility graph like PrefLib instances), but we also observed that it was behind the position-indexed edge formulation (PIEF) with symmetry reduction, reduced-cost variable fixing, and descending-degree node ordering when the cycle size limit is set to K = 7 or above (K = 5 or above for instances with a dense compatibility graph). A set of rules outlining which formulation to choose depending on the value of K and the density of the compatibility graph is given in Section D of the Online Supplement.
Our work focused on evaluating IP models and their performance as the instance size grows both in terms of recipient-donor pairs and in terms of cycle size limit. We point out, however, that other algorithmic procedures could also be used to enhance the performance of the tested approaches. For example, HCF could be solved within a branch-and-price framework to be more effective on instances where K ≥ 4, but this goes beyond the scope of our study as our primary focus is on mathematical programming, which is often regarded as an accessible tool by the research community. Also, we observed that increasing K for instances with more than 200 recipient-donor pairs does not necessarily increase the maximum number of transplants (e.g., for DGGKMPT instances with 200 recipients and K = 5, the maximum number of transplants was 124.20 on average versus 124.25 for K = 6). This phenomenon is even more pronounced for instances with a dense compatibility graph, as the maximum number of transplants for every PrefLib instance with size n ≥ 128 never increased when K was set to a value greater than or equal to 4. This indicates that for large K values, it could be worth trying to solve the instance with a smaller K (resulting in faster approaches) and assess afterwards by mean of a valid upper bound (e.g., obtained by relaxing the integrality constraints or the limitation on K ) whether or not better solutions can be obtained by considering larger cycles. We did not empirically evaluate the performance of the models on KEP instances with non-directed donors, but we pointed out that every formulation could take non-directed donors into account by using the chain structure introduced for PICEF by Dickerson et al. [14]. In other words, one could introduce PICEF-HC, an algorithm that uses the efficient chain structure of PICEF together with our new HCF model to represent the cycles. Depending on the cycle size limit, it could even be relevant to introduce PICEF-PIE, a version where cycles are represented with PIEF.
For future work, it would be interesting to study whether or not HCF can handle hierarchical objectives as well as the cycle formulation, even though we expect that a few objectives such as maximising the number of cross-arcs [11] will be difficult to model with HCF. Another future research direction could focus on assessing whether or not HCF can be useful for the stochastic or robust version of the KEP [8]. Finally, it would also be worth investigating whether the "two halves instead of a whole" idea can be extended to other combinatorial optimisation problems.