Novel optimum contribution selection methods accounting for conflicting objectives in breeding programs for livestock breeds with historical migration

Optimum contribution selection (OCS) is effective for increasing genetic gain, controlling the rate of inbreeding and maintaining genetic diversity. However, this diversity may be caused by high migrant contributions (MC) in the population due to introgression of genetic material from other breeds, which can threaten the conservation of small local populations. Therefore, breeding objectives should focus not only on increasing genetic gain but also on maintaining genetic originality and diversity of native alleles. This study aimed at investigating whether OCS was improved by including MC and modified kinships that account for breed origin of alleles. Three objective functions were considered for minimizing kinship, minimizing MC and maximizing genetic gain in the offspring generation, and we investigated their effects on German Angler and Vorderwald cattle. In most scenarios, the results were similar for Angler and Vorderwald cattle. A significant positive correlation between MC and estimated breeding values of the selection candidates was observed for both breeds; thus, traditional OCS would increase MC. Optimization was performed under the condition that the rate of inbreeding did not exceed 1% and at least 30% of the maximum progress was achieved for all other criteria. Although traditional OCS provided the highest breeding values under restriction of classical kinship, the magnitude of MC in the progeny generation was not controlled. When MC were constrained or minimized, the kinship at native alleles increased compared to the reference scenario. Thus, in addition to constraining MC, constraining kinship at native alleles is required to ensure that native genetic diversity is maintained. When kinship at native alleles was constrained, the classical kinship was automatically lowered in most cases and more sires were selected. However, the average breeding value in the next generation was also lower than that obtained with traditional OCS.
For local breeds with historical introgressions, current breeding programs should focus on increasing genetic gain and controlling inbreeding, as well as maintaining the genetic originality of the breeds and the diversity of native alleles via the inclusion of MC and kinship at native alleles in the OCS process.


Background
In recent decades, the widespread use of artificial insemination and other reproductive technologies has resulted in substantial genetic gains in livestock populations. However, another consequence is that only a limited number of animals with high estimated breeding values (EBV) have been intensively used in breeding programs, which can increase rates of inbreeding to undesired levels. A high rate of inbreeding not only leads to a considerable reduction in genetic variation but also causes more deleterious recessive alleles to become homozygous, which may threaten the entire future of the population [1]. Thus, there is a conflict between maximizing genetic gain and managing the rate of inbreeding.
Crossbreeding has been demonstrated to be an efficient method to reduce the threat of inbreeding depression and increase the level of genetic diversity [2]. In addition, local breeds are often crossed with breeds of high economic value to improve performance. However, such introgressions of genetic material can be a threat for maintaining local breeds. Amador et al. [3] confirmed that, after several generations without management, even a small introduction of foreign genetic material will rapidly disperse throughout the original population, and that this material is difficult to remove. Therefore, foreign introgressions present a large risk for the conservation of local breeds, which leads to a conflict in current breeding programs between increasing the contribution of foreign genetic material and conserving local breeds.
Optimum contribution selection (OCS) is a selection method that is effective at achieving a balance between rate of inbreeding and genetic gain. This selection process maximizes genetic gain in the next generation while constraining the rate of inbreeding via restriction of relatedness among offspring [4][5][6]. The superiority of OCS has been demonstrated with both simulated [7,8] and real data [9][10][11]. The objective function for OCS has been optimized using Lagrange multipliers [4,8,12], evolutionary algorithms [7,13,14], and semidefinite programming algorithms [9,15,16]. A related optimization problem was expressed as a mixed-integer quadratically constrained optimization problem and solved with branch-and-bound algorithms [17]. In this paper, we applied the algorithm described in [18] for solving cone-constrained convex problems, using the R package optiSel.
OCS is efficient for controlling the level of kinship among progeny and the rate of inbreeding in future generations and can ultimately maintain genetic diversity [12,16,19,20]. However, a high level of genetic diversity can be achieved by a large genetic contribution from migrant breeds, which is undesirable for the conservation of local breeds, because it reduces their genetic uniqueness, as well as the genetic diversity between breeds [21]. Thus, conflicting objectives are observed with regard to maintaining genetic diversity and conserving genetic uniqueness of small local breeds with historical migrations.
Instead of focusing on genetic gain and rate of inbreeding only, a reasonable breeding objective would be to also include recovery of genetic originality by reducing migrant contributions (MC). The diversity of native alleles may also be important for conservation. Thus, to conserve breeds with historical migrations, Wellmann et al. [22] recommended that approaches should not only constrain MC, but also aim at increasing the probability that alleles originating from native founders are not identical by descent (IBD).
Our aim was to investigate whether including MC and modified kinship matrices that account for breed origin of alleles as additional constraints in OCS can improve breeding programs in local breeds. Both conservation progress and genetic gain were evaluated. The following scenarios based on different objective functions were considered: (1) maximizing the diversity of native alleles while restricting MC and/or the average breeding value of the progeny generation at desired levels; (2) minimizing MC while restricting the loss of diversity of native alleles and/or the average breeding value of the progeny generation at desired levels; and (3) maximizing the average breeding value of the progeny generation while restricting MC and/or the loss of diversity of native alleles at desired levels. The traditional pedigree-based kinship was constrained in all optimization scenarios.

Data
Data from two local German cattle breeds, Angler and Vorderwald, were analyzed. The Angler breed is mainly located in the northern part of Germany and represents a dual-purpose breed, although the primary emphasis is on milk production. With the introduction of other breeds to improve milk yield, the Angler breed has experienced a considerable amount of migrant breed introgressions [23]. The Angler dataset was provided by the VIT (Vereinigte Informationssysteme Tierhaltung w.V., Verden), Germany. The Vorderwald breed is a dual-purpose breed located in the Black Forest region of southwest Germany. Similarly, due to their frequent crossing with high-yield breeds, the genetic originality of Vorderwald cattle has decreased dramatically [24,25]. The Vorderwald dataset was provided by the Institute for Animal Breeding, Bavarian State Research Center for Agriculture in Grub, Germany. Both datasets consist of pedigrees with information on sex, breed, birth year and estimated breeding values for milk production obtained from routine genetic evaluations. Animals with an unknown pedigree born before 1970 were classified as purebred. Animals from other breeds and animals with an unknown pedigree born after 1970 were considered as migrants, although some may have purebred ancestors. The Angler dataset included 109,109 animals born between 1906 and 2015, of which 86,269 (79.1%) were classified as Angler. The Vorderwald dataset included 200,468 animals born between 1906 and 2010, of which 180,646 (90.1%) were classified as Vorderwald. MC for each animal was calculated and expressed as the proportion of migrant breed alleles based on pedigree information.

Selection candidates
Selection candidates were chosen among animals that were classified as purebred in the herdbook in order to compute their optimum contributions with different approaches. Sires that had progeny born in 2005 and 2006 were set as male selection candidates and selected males were mated to 1000 randomly chosen dams, which are called female selection candidates. For the Angler breed, 1199 selection candidates were available and 15,370 animals were involved in the pedigree that included all selection candidates and their ancestors. For the Vorderwald breed, 1123 selection candidates were available and 12,934 animals were involved in the pedigree. For a better comparison of results between the two breeds, EBV were normalized across all selection candidates of each breed, with a mean of 0 and a standard deviation of 1.
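The within-breed normalization of EBV described above can be sketched as follows. This is only an illustration: the raw EBV values and the function name are hypothetical, and the use of the population standard deviation (divisor n rather than n - 1) is an assumption, since the text does not state which was used.

```python
from statistics import mean, pstdev

def standardize(values):
    """Scale values to mean 0 and standard deviation 1 (population SD assumed)."""
    m = mean(values)
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("all values are identical; cannot standardize")
    return [(v - m) / sd for v in values]

# Hypothetical raw EBV of five selection candidates of one breed
raw_ebv = [92.0, 101.5, 88.0, 110.0, 97.5]
z_ebv = standardize(raw_ebv)
```

Applying the same function separately to the candidates of each breed puts the EBV of both breeds on a common scale, which is what allows the constraint levels and results to be compared between breeds.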

Optimum contribution selection strategies
The output of the optimum contribution selection procedure is a vector $c$ with individual genetic contributions. The genetic contribution $c_i$ of animal $i$ is the fraction of genes in the next generation that originate from this individual. Genetic contributions cannot be negative, i.e. $c_i \ge 0$, which is denoted as constraint (a) in the following. The total genetic contribution of each sex must be equal to 0.5 for diploid species, i.e. $c's = 0.5$ and $c'd = 0.5$ (constraint b), where $s$ and $d$ are vectors of the indicators (0/1) of a candidate's sex. Because cows can produce only a limited number of calves, all female selection candidates were used for breeding and the genetic contributions were forced to be equal, i.e. $c_{d_1} = c_{d_2} = \cdots = c_{d_n}$ (constraint c). Thus, optimization was only performed for bulls. For male selection candidates, the number of offspring is not limited, thus the maximum genetic contribution is 0.5, i.e. $c_{s_i} \le 0.5$. To calculate the proportion of sires with non-zero genetic contributions, a sire $i$ is considered to have a non-zero genetic contribution only if $c_{s_i} \ge 0.00025$ to account for possible numerical inaccuracies of the algorithm.
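Constraints (a) to (c) and the 0.00025 threshold can be checked on a candidate contribution vector as in the sketch below. The function names, the numerical tolerance and the eight-animal example are hypothetical; only the constraints themselves and the threshold come from the text.

```python
NONZERO_THRESHOLD = 0.00025  # threshold stated in the text

def satisfies_constraints(c, is_sire, tol=1e-9):
    """Check constraints (a), (b) and (c) for a contribution vector c."""
    sires = [ci for ci, s in zip(c, is_sire) if s]
    dams = [ci for ci, s in zip(c, is_sire) if not s]
    non_negative = all(ci >= 0 for ci in c)                    # constraint (a)
    sex_totals = (abs(sum(sires) - 0.5) <= tol
                  and abs(sum(dams) - 0.5) <= tol)             # constraint (b)
    equal_dams = all(abs(ci - dams[0]) <= tol for ci in dams)  # constraint (c)
    return non_negative and sex_totals and equal_dams

def proportion_selected_sires(c, is_sire, threshold=NONZERO_THRESHOLD):
    """Share of sires whose contribution counts as non-zero."""
    sires = [ci for ci, s in zip(c, is_sire) if s]
    return sum(ci >= threshold for ci in sires) / len(sires)

# Hypothetical example: four sires followed by four dams
c = [0.3, 0.1999, 0.0001, 0.0, 0.125, 0.125, 0.125, 0.125]
is_sire = [True] * 4 + [False] * 4
```

In this example the third sire has a contribution of 0.0001, which is below the threshold and is therefore treated as zero, so two of the four sires count as selected.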
Four kinships that are involved in the calculation of the OCS procedure were applied. The diversity parameters described in [22] are complementary to the kinships used here, i.e. these kinship values are equal to 1 minus the corresponding diversity denoted as $\phi_A, \ldots, \phi_D$ in [22]. The relevant derivations of the formulas for calculating the diversity parameters are provided in detail in [22].
The classic kinship $f_A$ between individuals $i$ and $j$ (element of matrix $f_A$), which describes the probability that two alleles, $X_i$ and $X_j$, at a locus that are randomly selected from individuals $i$ and $j$ are IBD (i.e. $P(X_i =_{IBD} X_j)$), was restricted in all scenarios. For breeds with historical migrations and foreign introgressions, Wellmann et al. [22] proposed that the breed origin of the alleles should be considered to preserve the local breed. Thus, we considered different approaches that account for the origin of alleles, denoted as $f_B$, $f_C$ and $f_D$. Kinship matrix $f_B$ contains the probabilities that two alleles randomly chosen from two individuals at a locus are IBD or that at least one allele is from a migrant breed ($M$):
$$P(X_i =_{IBD} X_j \lor X_i \in M \lor X_j \in M).$$
Note that this is equal to the probability that both alleles are IBD and native plus the probability that at least one allele is from a migrant.
Kinship matrix $f_C$ contains the probabilities that two alleles randomly chosen from two individuals at a locus are IBD or both alleles are from migrant breeds:
$$P(X_i =_{IBD} X_j \lor (X_i \in M \land X_j \in M)).$$
The probability that at least one of the two randomly chosen alleles is from a migrant breed is higher than the probability that both are from migrant breeds. Thus, $f_B$ is greater than $f_C$. In general, $f_A \le f_C \le f_B$ (element-wise). The kinship at native alleles $f_D$ is defined as the conditional probability that two alleles $X$ and $Y$ at a locus that are randomly chosen from the offspring population are IBD, given that both descended from native founders ($F$):
$$f_D(c) = P(X =_{IBD} Y \mid X, Y \in F).$$
Note that this value says nothing about the kinship at loci that originate from migrants or about the MC. The mean kinships for the offspring generation are $c' f_A c$, $c' f_B c$ and $c' f_C c$, whereas $f_D(c)$, being a conditional probability, is a ratio with denominator $c' f_N c$, where $f_N$ is a matrix containing the probabilities that both randomly chosen alleles at a locus originated from native founders.
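The relationships between the four kinships can be illustrated with a small numerical sketch. It assumes that two IBD alleles always share the same breed origin, so that each kinship can be assembled from four elementary pairwise probabilities. The two-candidate matrices, the variable names (in particular `ibd_native` for the numerator of the native kinship) and the contribution vector are hypothetical and are not taken from the paper or from optiSel.

```python
def quad_form(c, K):
    """Return c' K c for a contribution vector c and a square matrix K."""
    n = len(c)
    return sum(c[i] * K[i][j] * c[j] for i in range(n) for j in range(n))

def combine(A, B, sign=1.0, const=0.0):
    """Element-wise A + sign * B + const."""
    return [[a + sign * b + const for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Hypothetical pairwise probabilities for two candidates
ibd_native = [[0.20, 0.02], [0.02, 0.25]]    # P(IBD and both alleles native)
ibd_migrant = [[0.30, 0.01], [0.01, 0.25]]   # P(IBD and both alleles migrant)
both_native = [[0.30, 0.20], [0.20, 0.40]]   # plays the role of f_N
both_migrant = [[0.50, 0.30], [0.30, 0.40]]  # P(both alleles migrant)

f_A = combine(ibd_native, ibd_migrant)                    # IBD, any origin
f_B = combine(ibd_native, both_native, sign=-1.0, const=1.0)  # IBD-native + P(>=1 migrant)
f_C = combine(ibd_native, both_migrant)                   # IBD-native + P(both migrant)

c = [0.5, 0.5]  # hypothetical contributions summing to 1
mean_fA = quad_form(c, f_A)
mean_fB = quad_form(c, f_B)
mean_fC = quad_form(c, f_C)
# Native kinship as a conditional probability: a ratio of two quadratic forms
f_D = quad_form(c, ibd_native) / quad_form(c, both_native)
```

Because the native kinship is a ratio of two quadratic forms in $c$ rather than a single quadratic form, it does not share the convexity of the other three criteria.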
Our aim was to identify the best method of accounting for the conflicting objectives of a breeding program, which are to increase breeding values, to maintain genetic diversity, and to maintain genetic originality of the breed. Since $1 - f_D(c) = P(X \neq_{IBD} Y \mid X, Y \in F)$ is the genetic diversity at native alleles, the constraint on $f_D$ is used to maintain or increase genetic diversity at native alleles and is a parameter of interest. Kinships $f_B$ and $f_C$ were considered because minimizing or constraining $f_D$ is in general not a convex problem, so minimizing $f_B$ and $f_C$ could result in lower $f_D$ values than minimizing $f_D$ itself.
In the different scenarios, an upper bound for MC (ub.MC) and/or a lower bound for the average EBV (lb.EBV) were set as additional constraints. The expectation of the average EBV in the next generation is $c'EBV$, where $EBV$ is a vector of the EBV of each selection candidate. The expectation of the average MC of the next generation is $c'MC$, where $MC$ is a vector of the MC of each selection candidate.
For all optimization problems, constraints a, b, and c were applied to limit the solution for $c_i$ to within a reasonable range. Solver "cccp" [18], which was called from the R package optiSel [26], was used to solve the optimization problems. This solver contains routines for solving cone-constrained convex problems using interior-point methods that are partially ported from Python's CVXOPT and based on Nesterov-Todd scaling [27]. The solver uses primal-dual path-following algorithms for linear and quadratic cone-constrained programming.
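To make the structure of such a problem concrete, the toy sketch below maximizes the expected average EBV subject to an upper bound on the mean classic kinship. It is a brute-force grid search over a single free sire contribution, not the interior-point method of cccp and not the optiSel interface. The four-animal kinship matrix, the EBV values, the bound and all names are hypothetical.

```python
def quad_form(c, K):
    """Return c' K c (mean kinship of the offspring for contributions c)."""
    n = len(c)
    return sum(c[i] * K[i][j] * c[j] for i in range(n) for j in range(n))

def dot(c, v):
    """Return c' v (expected offspring mean of a linear criterion)."""
    return sum(ci * vi for ci, vi in zip(c, v))

# Hypothetical candidates: two sires followed by two dams
f_A = [
    [0.50, 0.10, 0.02, 0.02],
    [0.10, 0.50, 0.02, 0.02],
    [0.02, 0.02, 0.50, 0.05],
    [0.02, 0.02, 0.05, 0.50],
]
ebv = [1.2, 0.3, 0.0, -0.1]
ub_fA = 0.16   # hypothetical upper bound on the mean kinship
steps = 100    # grid resolution for the first sire's contribution

best_gain, best_c = None, None
for k in range(steps + 1):
    c1 = 0.5 * k / steps
    # Sire contributions sum to 0.5; dams are fixed at equal contributions
    c = [c1, 0.5 - c1, 0.25, 0.25]
    if quad_form(c, f_A) <= ub_fA:            # kinship constraint
        gain = dot(c, ebv)
        if best_gain is None or gain > best_gain:
            best_gain, best_c = gain, c
```

In this toy case the best feasible solution gives the sire with the higher EBV more than an equal share but less than the full 0.5, because the kinship bound becomes binding before the better sire can be used exclusively. This is the trade-off that OCS resolves at scale with a convex solver.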
Scenarios were categorized based on three main objective functions: minimizing kinships, minimizing MC and maximizing genetic gain in the next generation. For minimizing kinships, three sub-scenarios were considered, which involved minimizing $f_B$, $f_C$ and $f_D$, respectively. Parameters ub.$f_A$, ub.$f_B$, ub.$f_C$, ub.$f_D$ and ub.MC were defined as the upper bound values of the corresponding parameters in the next generation, whereas lb.EBV was set as the lower bound of the mean EBV for the next generation. One or several of the following constraints were used to define the optimization problems for each breed:
$$c' f_A c \le \mathrm{ub.}f_A, \quad c' f_B c \le \mathrm{ub.}f_B, \quad c' f_C c \le \mathrm{ub.}f_C, \quad f_D(c) \le \mathrm{ub.}f_D, \quad c'MC \le \mathrm{ub.MC}, \quad c'EBV \ge \mathrm{lb.EBV}.$$
The OCS scenarios considered are listed in Table 1. The name of each optimization scenario consists of a prefix that indicates the objective function and a suffix that indicates the constraint settings. For example, scenario maxEBV.A.B.MC indicates a scenario that maximizes the average EBV in the next generation, while constraining $f_A$, $f_B$ and MC.
Criteria for comparing scenarios included not only the result of the objective function, but also the other parameters obtained in the scenario, in particular EBV, MC, classic kinship, and kinship at native alleles. To evaluate the effectiveness of the OCS scenarios, the results were compared with the output from a reference scenario (REF) and the output from a truncation selection scenario (TS). In scenario REF all selection candidates were used as parents and had equal contributions to the offspring generation. For endangered breeds, an effective population size ($N_e$) of 50 is often considered as sufficient [28]. Based on the equation in [1], $\frac{1}{N_e} = \frac{1}{4 N_{sire}} + \frac{1}{4 N_{dam}}$, the 13 sires with the highest EBV were selected as male selection candidates in the TS scenario, and mated to the 1000 dams. All parents had equal contributions to the offspring generation in this scenario.
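The number of sires used in the TS scenario follows from solving the effective population size equation for the number of sires, as the short sketch below shows. The function name is hypothetical, and rounding up is an assumption (rounding to the nearest integer gives the same result here).

```python
import math

def sires_for_target_ne(n_e, n_dams):
    """Solve 1/N_e = 1/(4*N_sire) + 1/(4*N_dam) for N_sire."""
    remainder = 1.0 / n_e - 1.0 / (4.0 * n_dams)
    if remainder <= 0:
        raise ValueError("target N_e cannot be reached with this number of dams")
    return 1.0 / (4.0 * remainder)

exact = sires_for_target_ne(50, 1000)  # about 12.66 sires
n_sires = math.ceil(exact)             # rounded up to a whole number of sires
```

With a target $N_e$ of 50 and 1000 dams, the exact solution is about 12.66, which corresponds to the 13 sires stated in the text.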
To ensure that optimal solutions exist in all scenarios for each breed, feasible threshold values must be set for the constraints. To restrict the rate of inbreeding, the upper bound (ub.$f_A$) was defined as follows. When $N_e$ is equal to 50, the rate of inbreeding $\Delta F$, which can be calculated from $\Delta F = \frac{1}{2N_e}$, is 1% per generation. The threshold for $f_A$ was calculated on this basis. To calculate the constraint settings for the other parameters, we used the results from the scenario that optimizes the corresponding parameter with a restriction only on $f_A$ and the results from the REF scenario, together with a parameter that indicates the proportion of progress to be accomplished for each constrained parameter relative to the scenario with a restriction only on $f_A$. The value of this parameter can be determined by the breeding organization. A higher value indicates a stricter setting for all constraints. We set it to 0.3 to ensure that optimized solutions were found for all scenarios and for both breeds.
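The threshold calculations can be sketched as below, under two explicit assumptions, because the corresponding equations are not available in this text. First, the kinship bound is assumed to follow the standard update $f + (1 - f)\,\Delta F$ applied to the mean kinship of the parent generation; this reproduces the bounds of 0.030 and 0.035 reported in the Results from parental kinships of 0.020 and 0.025, but other forms would as well. Second, the remaining bounds are assumed to be a linear interpolation between the REF value and the optimum obtained with a restriction only on $f_A$. All function names are hypothetical, and the values actually used are those in Additional file 1: Table S1.

```python
def rate_of_inbreeding(n_e):
    """Rate of inbreeding per generation, Delta F = 1 / (2 * N_e)."""
    return 1.0 / (2.0 * n_e)

def kinship_bound(mean_kinship, rate):
    """Assumed update: kinship after one generation at the given rate."""
    return mean_kinship + (1.0 - mean_kinship) * rate

def interpolated_bound(ref_value, optimum_value, proportion):
    """Assumed rule: move a share of the way from the REF value to the optimum."""
    return ref_value + proportion * (optimum_value - ref_value)

rate = rate_of_inbreeding(50)                  # 1% per generation
ub_fA_angler = kinship_bound(0.020, rate)      # about 0.030
ub_fA_vorderwald = kinship_bound(0.025, rate)  # about 0.035

# Illustration only: native kinship of Angler cattle, REF value 0.049 and
# minimum 0.040 with a restriction only on f_A, proportion 0.3
ub_fD_example = interpolated_bound(0.049, 0.040, 0.3)
```

Under the assumed interpolation rule, a proportion of 0 reproduces the REF value and a proportion of 1 demands the full optimum, which is why larger values make every constraint stricter and may leave no feasible solution when several constraints are combined.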
The specific values used for all constraints for each breed are in Additional file 1: Table S1.

Results
Results of the basic statistical analyses for average kinship, MC and EBV of the parent generation are in Table 2 for both breeds. Average kinship $f_A$ was lower for the Angler population than for the Vorderwald population (0.020 vs. 0.025) but $f_B$ (0.910 vs. 0.853) and $f_C$ levels (0.488 vs. 0.381) were higher. On average, 69.5 and 60.7% of the genetic material of the Angler and Vorderwald cattle, respectively, originated from migrant breeds. Native effective population sizes of 86 and 49 were estimated from six previous generations for Angler and Vorderwald cattle, respectively. Native effective population size is a parameter that quantifies the decrease in native allele diversity and is defined in [22]. If the native effective size is high, then native allele diversity decreases slowly. Thus, the diversity of native alleles decreased more rapidly in Vorderwald cattle than in Angler cattle, whereas MC were higher in Angler cattle. Average EBV for both breeds were below the current population mean, which is 100 for Angler and 0 for Vorderwald, because selection candidates were sampled from old age cohorts. A positive correlation between EBV and MC was found for both breeds (Figs. 1, 2).

Minimizing average kinship
Genetic contributions of the selection candidates were optimized to minimize $f_B$, $f_C$ and $f_D$ with restrictions on MC and/or average EBV in the offspring generation for each breed (see Tables 3, 4, 5, respectively). Compared to the REF scenario, all OCS scenarios showed superior results for the optimized criteria, as expected. Table 3 shows the results obtained when minimizing $f_B$ in the offspring generation under the different constraints for each breed. The lowest $f_B$ for Angler cattle was 0.827 when the upper bound for $f_A$ in the next generation was set to 0.030. MC was lower than the constraint value setting (0.570 vs. 0.677). Thus, the minimum $f_B$ did not change after adding the constraint on MC (minfB.A.MC). When the restriction on average EBV was set to 0.516, the average kinship $f_B$ increased to 0.866, which was still lower than the $f_B$ obtained in the REF scenario (0.926). Similar results were obtained for Vorderwald cattle. When the upper bound for $f_A$ in the progeny generation was set to 0.035, the minimum $f_B$ level in the progeny generation was 0.789. Again, $f_B$ did not change after adding an upper bound for MC (0.528 vs. 0.582). $f_B$ increased to 0.813 when the EBV constraint was set to 0.550, although it was lower than the $f_B$ obtained in the REF scenario (0.852).
Results when minimizing $f_C$ were similar to minimizing $f_B$ (see Table 4). The $f_C$ of the progeny generation decreased to 0.345 for Angler cattle when the upper bound for $f_A$ was set to 0.030. When $f_C$ was minimized, MC decreased to a value lower than the constraint level setting (0.570 vs. 0.677). Thus, minimizing $f_C$ gave the same results for scenarios minfC.A and minfC.A.MC. After adding an EBV constraint of 0.516, $f_C$ increased to 0.404 but was lower than the $f_C$ obtained in the REF scenario (0.527). For Vorderwald cattle, the minimum average $f_C$ in the progeny generation was 0.300 when $f_A$ was restricted to 0.035, even after adding a higher constraint on MC (0.582 vs. 0.528). In scenario minfC.A.MC.EBV, $f_C$ reached 0.327 after adding an EBV constraint of 0.550, although this was lower than the $f_C$ obtained in the REF scenario (0.380).
When the kinship at native alleles, $f_D$, was minimized, the average kinship $f_A$ was automatically lowered in most cases (Table 5); in Angler cattle, $f_A$ reached 0.020, which was lower than the constraint level (0.030). In this case, the minimum $f_D$ was 0.040. When MC was restricted to 0.677, the minimum $f_D$ increased to 0.044. When an EBV constraint of 0.516 was added, the minimum $f_D$ increased to 0.047, which was still lower than the $f_D$ obtained in the REF scenario (0.049). For Vorderwald cattle, when $f_A$ was restricted to 0.035 in the progeny generation, the lowest $f_D$ was 0.057. When the maximum MC was set to 0.582, $f_D$ increased to 0.058. When adding an EBV constraint of 0.550, the lowest $f_D$ was 0.064, which was still lower than the $f_D$ obtained in the REF scenario (0.072).

Maximizing the average EBV
Results for maximizing the average EBV in the progeny generation under various constraints are in Table 7.

Table 3 Optimization of the genetic contributions when minimizing kinship $f_B$ with a restriction on migrant contribution and/or mean estimated breeding values
a The name of each optimization scenario consists of a prefix that indicates the objective function and a suffix that indicates the constraint settings
b The parameter used as a constraint is marked in italics in the scenario. Bold italic values indicate that the actual value obtained does not reach the limit of the corresponding constraint (value higher than the lower limit or lower than the upper limit)
c Objective function
d Proportion of selected sires with non-zero genetic contributions; a $c_{s_i}$ value lower than 0.00025 is treated as zero
e Standard deviation of the genetic contributions of all male selection candidates

Discussion
For the breeding schemes of the two breeds considered in this study, two conflicts must be addressed: (1) the conflict between increasing genetic gain and managing inbreeding and (2) the conflict between maintaining genetic diversity and controlling loss of genetic uniqueness. The purpose of this study was to determine whether OCS with additional constraints that involve modified kinship matrices and MC was more efficient than traditional OCS at conserving genetic diversity and originality while also ensuring genetic improvement. Using data on German Angler and Vorderwald cattle, various scenarios were compared. Both breeds have been frequently crossed with high-yielding breeds to improve performance. We found that diversity of native alleles decreased more rapidly in Vorderwald cattle than in Angler cattle, whereas MC was higher in Angler cattle. The consequences of the scenarios were similar for both breeds. Compared to traditional OCS, constraining kinship $f_D$ and MC promoted recovery of genetic originality in the breeds and diversity of native alleles but reduced response to selection. Traditional OCS, which in our study is represented by scenario maxEBV.A, achieved the highest average EBV in the progeny generation among all scenarios with a restriction on rate of inbreeding. Compared to the TS scenario, average EBV was higher in the traditional OCS scenario for both breeds, while the average relatedness was lower. The average EBV in TS was probably smaller because the TS scenario assumed equal contributions for selected sires, whereas OCS optimizes their contributions. Because MC and EBV were positively correlated, traditional OCS increased the average MC, which is undesirable when the aim is to conserve the genetic originality of local breeds.

Different kinship estimates
Both $f_B$ and $f_C$ take probabilities of IBD and probabilities of alleles originating from migrant breeds into account, i.e. they account for both level of inbreeding and level of genetic originality. Although theoretically, MC affects $f_B$ more than $f_C$, results from minimizing $f_B$ and $f_C$ were almost identical for the two breeds considered. Wellmann
Table 6 Optimization of the genetic contribution when minimizing the migrant contribution with restricted kinship and/or mean estimated breeding values
a The name of each optimization scenario consists of a prefix indicating the objective function and a suffix indicating the constraint settings
b The parameter used as a constraint is marked in italic in the scenario. Bold italic values show that the actual value obtained does not reach the limit of the corresponding constraint in this scenario (value higher than the lower limit or lower than the upper limit)
c Objective function
d Proportion of selected sires with non-zero genetic contributions; a $c_{s_i}$ value lower than 0.00025 is treated as zero
e Standard deviation of the genetic contributions of all male selection candidates
Thus, if $f_D$ is constrained, then MC must be constrained as well and the constraint for $f_A$ can be omitted. Among all the scenarios, TS used the smallest number of sires and resulted in the highest average genetic contribution of selected sires. Including kinship $f_D$ as an additional constraint in the OCS scenarios resulted in a larger number of selected sires than including $f_B$ or $f_C$. Therefore, including $f_D$ is an efficient method to avoid overuse of sires with high EBV and limits the rate of inbreeding in the long run. Compared with the inclusion of $f_B$ or $f_C$, inclusion of $f_D$ resulted in a lower average EBV in the progeny generation, depending on the constraint level setting.
In most cases, the optimum contributions (OC) were negatively correlated with MC and positively correlated with the average EBV, as illustrated in Additional file 2: Table S2, which represents a desirable result for future selection and breeding programs.
Scenarios with optimizations of both male and female contributions were also evaluated (results not shown), using the same calculation methods to obtain the constraint value settings. For all scenarios and both breeds, the constraint settings were stricter than in the scenarios that optimized male contributions. The performance of all scenarios improved when both male and female selection were optimized, which is consistent with Sánchez-Molano et al. [8], who used OCS to improve fitness and productivity traits. To achieve these improvements, however, additional reproductive techniques must be applied due to the limited reproduction rate of female animals.
Table 7 Optimization of the genetic contribution when maximizing the breeding value with restricted kinship and/or mean estimated migrant contributions
a The name of each optimization scenario consists of a prefix indicating the objective function and a suffix indicating the constraint settings
b The parameter used as a constraint is marked in italic in the scenario. Bold italic values show that the actual value obtained does not reach the limit of the corresponding constraint in this scenario (value higher than the lower limit or lower than the upper limit)
c Objective function
d Proportion of selected sires with non-zero genetic contributions; a $c_{s_i}$ value lower than 0.00025 is treated as zero
e Standard deviation of the genetic contributions of all male selection candidates