Prospects for Genomic Selection in Cassava Breeding

Cassava (Manihot esculenta Crantz) is a clonally propagated staple food crop in the tropics. Genomic selection (GS) has been implemented at three breeding institutions in Africa to reduce cycle times. Initial studies provided promising estimates of predictive abilities. Here, we expand on previous analyses by assessing the accuracy of seven prediction models for seven traits in three prediction scenarios: cross-validation within populations, cross-population prediction and cross-generation prediction. We also evaluated the impact of increasing the training population (TP) size by phenotyping progenies selected either at random or with a genetic algorithm. Cross-validation results were mostly consistent across programs, with nonadditive models predicting of 10% better on average. Cross-population accuracy was generally low (mean = 0.18) but prediction of cassava mosaic disease increased up to 57% in one Nigerian population when data from another related population were combined. Accuracy across generations was poorer than within-generation accuracy, as expected, but accuracy for dry matter content and mosaic disease severity should be sufficient for rapid-cycling GS. Selection of a prediction model made some difference across generations, but increasing TP size was more important. With a genetic algorithm, selection of one-third of progeny could achieve an accuracy equivalent to phenotyping all progeny. We are in the early stages of GS for this crop but the results are promising for some traits. General guidelines that are emerging are that TPs need to continue to grow but phenotyping can be done on a cleverly selected subset of individuals, reducing the overall phenotyping burden.

C assava, a root crop with origins in the Amazon basin (Olsen and Schaal, 1999), provides staple food for more than 500 million people worldwide (Howeler et al., 2013). It is widely cultivated in Sub-Saharan Africa, where the storage roots serve as primary source of carbohydrates and can be processed into a wide variety of products such as fufu, lafun, gari, abacha, tapioca, and starch (Chukwuemeka, 2007;Bamidele et al., 2015).
Cassava is a diploid (2n = 36) and highly heterozygous non-inbred crop that is propagated vegetatively by farmers using stem cuttings, though most genotypes flower and can be used to produce botanical seeds from either self-or cross-pollination. Among the most important traits targeted for improvement are storage root yield, dry matter content (DM), starch content, tolerance to postharvest physiological deterioration, carotenoid content, and resistance to pests or diseases (Esuma et al., 2016).
Development and implementation of breeding strategies in cassava represent a challenge because of the crop's heterozygous nature and long breeding cycle. A traditional cassava-breeding program relies on phenotypic characterization of mature plants that have been clonally propagated. Typically, cycles of selection take 3 to 6 yr from seedling germination to multilocation yield trials and additional years are required to evaluate promising genotypes before variety release (Fig. 1).
Marker-assisted selection has been effective in cassava for the selection of promising genotypes for resistance to cassava mosaic disease (CMD) (Okogbenin et al., 2007;Ceballos et al., 2015;Parkes et al., 2015). However, the use of marker-assisted selection is limited primarily to traits with known large-effect loci, which makes this method infeasible for complex traits (Dekkers and Hospital, 2002;Heffner et al., 2009).
With the advent of next-generation sequencing technologies, it is now affordable to profile single nucleotide polymorphism (SNP) markers genome-wide (Barabaschi et al., 2015), which can support the use of GS, a breeding method that uses such markers to predict the breeding values of unevaluated individuals (Meuwissen et al., 2001). Genomic selection can optimize and accelerate pipelines for population improvement and variety development and release (Heffner et al., 2009) with a reduction in breeding time resulting from selection of parental genotypes with superior breeding values at the seedling stage based on genotypes alone.
Many genomic prediction models are available, differing from each other primarily with respect to the genetic architecture that they assume. For example, genomic best linear unbiased prediction (GBLUP) assumes an infinitesimal genetic architecture (nearly equal and small contributions by all genomic regions to the phenotypes). In contrast, models like BayesB alter that assumption, putting emphasis on major-effect loci and variable selection (Gianola et al., 2009;Legarra et al., 2011;Habier et al., 2011). Evaluation of different GS models with nonsimulated data indicates that prediction accuracy varies across species and traits (Heslot et al., 2012;Resende et al., 2012;Gouy et al., 2013;Charmet et al., 2014;Rutkoski et al., 2014;Cros et al., 2015).
Previous studies in cassava have estimated genetic parameters and evaluated prediction accuracy by applying the GBLUP model with small training sets and low-density markers (Oliveira et al., 2012(Oliveira et al., , 2014. Historical phenotypic data from the International Institute of Tropical Agriculture (IITA), combined with markers obtained from genotyping-by-sequencing (GBS), showed promising results for cassava breeding with GS (Ly et al., 2013). In that study, the predictive ability (accuracy) was measured as the correlation between predictive values and the phenotypic value ranged from 0.15 to 0.47 across traits (Ly et al., 2013).
There are ongoing efforts within the Next Generation Cassava Breeding project (www.nextgencassava. org, accessed 14 Aug. 2017) to increase the rate of genetic improvement in cassava and unlock the full potential of cassava production. The project is currently in the early stages of implementing GS at three African research institutes: the National Crops Resources Research Institute (NaCRRI) in Uganda, the National Root Crops Research Institute (NRCRI) in Nigeria, and the IITA, also in Nigeria.
In the present study, we evaluated the potential of GS as a breeding tool to increase the rates of genetic gain in datasets associated with all three Next Generation Cassava Breeding breeding programs. We assessed predictive ability by cross-validation within TP datasets for seven traits: DM, fresh root weight (RTWT), root number (RTNO), shoot weight (SHTWT), harvest index (HI), severity of CMD (MCMDS), and plant vigor. We compared the performance of seven GS models for these traits in each of the breeding programs.
One important topic in GS concerns the feasibility of prediction across generations and across TPs from different breeding populations or programs. To increase the rate of gain achievable by GS, prediction models will need to accurately rank unevaluated progenies rather than genotypes contemporary with the TP. It is well known that recombination and divergence relative to the TP associated with recurrent selection reduces the accuracy of cross-generation prediction, making this kind of prediction a major challenge for GS (Jannink, 2010;Lorenz et al., 2011). Accuracies in these scenarios have not been previously estimated in cassava. Therefore, we tested the accuracy of cross-generation prediction with the IITA TP and two successive cycles of progeny that have been phenotyped. Similarly, given that the previous results indicated only a small level of genetic differentiation among clones from different populations (Wolfe et al., 2016b), we tested whether combining information from different populations could increase prediction accuracy in the smaller populations.
Finally, in a typical scenario, a GS program will phenotype all selected materials and a subset of the unselected material to update the training model. We further investigated the impact of phenotyping different sized subsets of materials for the TP update. We compared random subset selections to selections based on a TP optimization algorithm .
This study is a starting point for successful application of GS in African cassava. Similar to other studies, factors such as trait heritability, the prediction model, and TP composition play important roles in determining the prediction accuracy and the rate the of genetic progress. For example, traits with higher heritability like DM are considered to be more likely to respond to selection and lead to larger genetic gain over cycles of selection (Kawano et al., 1998;Ceballos et al., 2015). Our results will serve to guide implementation strategies for GS in cassava breeding programs.

Germplasm
In this study, we analyzed data from the GS programs at three African cassava breeding institutions: NaCRRI, NRCRI, and IITA. Germplasm from NaCRRI included 411 clones descended from crosses among accessions from East Africa, West Africa, and South America. The collection from NRCRI was made up of 899 clones, 211 of them in common with the IITA breeding germplasm. The remaining 688 clones were materials derived either in part or directly from the International Center for Tropical Agriculture in Cali, Columbia. Wolfe et al. (2016b) shows details of the origins and pedigrees of the NaCRRI and NRCRI clones used in this study.
The primary IITA germplasm we analyzed is also known as the Genetic Gain (GG) collection, which comprises 709 elite and historically important breeding clones and a few landraces that have been collected starting in the 1970s. These materials have also been previously described in Okechukwu and Dixon (2008), Ly et al. (2013), and Wolfe et al. (2016b).
In addition, two generations of GS progeny were analyzed (Fig. 2). The parents of each set of progeny were chosen on the basis of their GEBVs as described previously (Wolfe et al., 2016b). The first, GS cycle 1 (C1) comprised 2890 clones from 166 full-sib families with 85 parents from the GG collection. Because of inconsistency in the timing and amount of flowering and seed set among clones, successful crossing is a challenge in cassava. To obtain the full set of desired matings among parents of C1, crossing blocks were planted in two successive years (2013 and 2014). In 2013, 79 parents produced 2322 seedlings (135 full-sib families). In 2014, 17 parents, 11 of which were reused from the previous year and another six of which were new parents from the GG collection, gave rise to an additional 568 seedlings (31 new full-sib families). The C1 families had a mean size of 17.4 siblings (median: 15, range: 2-78).

Phenotyped Traits
Seven traits were analyzed in this study. Plant vigor was recorded as 3 = low, 5 = medium, and 7 = high 1 mo after planting at IITA and NRCRI and 3 mo after planting at NaCRRI. We used the across-season average MCMDS for our analyses; this was the mean of measurements taken at 1, 3, and 6 mo after planting, on a scale of 1 (no symptoms) to 5 (severe symptoms). Dry matter content was expressed as a percentage of dry root weight relative to fresh root weight (RTWT). At IITA, DM was measured by drying 100 g of fresh roots in an oven, whereas at NRCRI and NaCRRI, the specific gravity method (Kawano et al., 1987) was used. Root weight and SHTWT were expressed in kilograms per plot, whereas HI was the proportion of total biomass per plot (i.e., RTWT). Root number was the number of fresh roots harvested per plot. For all analyses below, RTNO, RTWT, and SHTWT were natural-log transformed to obtain normally distributed residuals.
The phenotyping trials analyzed in this study have been described in part in previous publications (Wolfe et al., 2016a;2016b). However, complete details on the phenotyping trial design particular to this study are provided in Supplemental File S1. All phenotyping trials were conducted between 2013 and 2015. Clones from NaCRRI were evaluated in three locations with different agro-ecological conditions in Uganda: Namulonge, Kasese, and Ngetta. Clones from NRCRI were tested in three locations in Nigeria: Kano, Otobi, and Umudike. Meanwhile, IITA clones were evaluated in four locations within Nigeria: Ibadan, Ikenne, Ubiaja, and Mokwa.

Two-Stage Genomic Analyses
Except where noted otherwise, a two-step approach was used to evaluate genomic predictions in this study. This approach was used to correct for heterogeneity in the experimental designs and increase computational efficiency. The first stage involved accounting for trial design-related variables with a linear mixed model.
For NaCRRI we fitted the model shown in Eq. [1]: where b included a fixed effect for the population mean, the location-year combination, and for plot-basis traits (RTWT, RTNO, and SHTWT); the number of plants harvested per plot was included as a covariate; the vector c and the corresponding incidence matrix Z clone represented a random effect for the clone where ( ) s 2 N 0, c c I ; I represented the identity matrix; and the range variable was nested in location-year-replication and was represented by the incidence matrix Z range(loc.year) and the random effects vector ( ) s 2 N 0, r r I . Ranges were equivalent to the row or column along which plots were arrayed. Blocks were also modeled, with a block being a subset of a range. Block effects were nested in ranges and were incorporated as random variables with the incidence matrix Z block(range) effects vector ( ) where Z set was the incidence matrix corresponding to the random effect for the planting group (see above), which was nested in location-year, with ( ) s 2 N 0, s s I . Replication effects were nested in sets and treated as random with the incidence matrix Z rep(set) and the effects vector ( ) s 2 N 0, r r I . Blocks were nested in replications, treated as random, and represented by the design matrix Z block(rep) and the effects vector . The fixed effects for NRCRI included were the same as those for NaCRRI, with the addition of a term for trial (i.e., TP1 and TP2; see above).
For IITA, data from all trials described above were fitted together using the model in Eq. [3]: [3] The range effect was fitted as random. The fixed effects were the same as those described for NaCRRI, except the proportion of harvested plants (out of the total originally planted) was used instead of the number harvested as a cofactor. This was done to correct for differences in plot sizes.
For the clone effect, the best linear unbiased prediction (BLUP) (ĉ), which represents an estimate of the total genetic value (estimated genetic value, EGV) for each individual, was extracted. The EGVs were de-regressed by dividing by their reliability ( - the prediction error variance of the BLUP. This was done to avoid applying shrinkage to the same data twice (once in the first step and again in the genomic prediction step). The mixed models above were solved with the lmer function of lme4 package (Bates et al., 2014) in R (https:// cran.r-project.org, accessed 30 Aug. 2017). We used de-regressed the EGVs as the response variables and weighted error variances in downstream genomic evaluations. Error variances were weighted according to Garrick et al. (2009)  where H 2 is the proportion of the total variance explained by the clonal variance component, s 2 c . Weighting error variances during the genomic prediction step was done to preserve information from the first step about differences between clones in the reliability of the de-regressed BLUPs being used to represent their genetic value. These differences occur mostly because of imbalances in the number of observations among clones. This information would otherwise be ignored when making genomic predictions with a two-step procedure.

Genotyping Data
The cassava collections described above were genotyped with GBS (Elshire et al., 2011) with the ApeKI restriction enzyme recommended by Hamblin and Rabbi (2014). Single nucleotide polymorphisms were called with the TASSEL 5.0 GBS pipeline version 2 (Glaubitz et al., 2014) and aligned to the cassava reference genome, version 6.1 (http://phytozome. jgi.doe.gov, accessed 14 Aug. 2017;International Cassava Genetic Map Consortium, 2015). Genotype calls were only allowed when a minimum of two reads was present; otherwise, the genotype was imputed (see below). Furthermore, the GBS data were filtered so that clones with >80% missing and markers with >60% missing genotype calls were removed. Markers with extreme deviation from Hardy-Weinberg equilibrium (Χ 2 > 20) were also removed. Only biallelic SNP markers were considered for further analyses. We used a combination of custom scripts and common variant call file (Danecek et al., 2011) manipulation tools to accomplish this pipeline. Finally, imputation was conducted with Beagle version 4.0 (Browning & Browning, 2009). A total of 155,871 markers were obtained following these procedures. For genomic prediction in a given population or dataset, we further filtered out SNPs with a minor allele frequency less than 0.01.

Assessment of Prediction Accuracy via Cross-Validation
To obtain unbiased estimates of prediction accuracy, we used a k-fold cross-validation scheme (Kohavi, 1995). In brief, each breeding program dataset [NRCRI collection (NR), NaCRRI collection (UG), and GG] was split randomly into k = fivefold mutually exclusive training and validation sets. The training set composed of four out of five of the subsets was used to estimate marker effects for predictions. The estimated marker effects were used to predict the breeding value of the validation set individuals. The process of subset assignment and genomic prediction was repeated 25 times for each model. For each repeat, predictions were accumulated from each individual when it was in the validation subset. Prediction accuracy was then calculated as the Pearson correlation between the EGV (not de-regressed) and the accumulated predicted values for that repeat.

Genomic Prediction Methods
In this study, we compared the accuracy of genomic prediction via seven methods that are briefly described below. These methods differ in their assumptions about genetic architecture and whether the prediction being made represents a genome estimated breeding value (GEBV that included additive effects or a genome estimated total genetic value, which includes additive and nonadditive effects. Prediction models were compared by examining several prediction scenarios (described in detail below), including 25 replications of fivefold cross-validation, crossgeneration, and cross-population prediction.

Genomic BLUP
Prediction with GBLUP involves fitting a linear mixed model of the following form: Here, y is a vector of the phenotype and b is a vector of fixed, nongenetic effects with the design matrix X. The vector g is a random effect, the BLUP, which represents the GEBV for each individual. Z is a design matrix indicating observations of genotype identities, and ε is a vector of residuals. The GEBV is obtained by assuming s 2 g is the additive genetic variance and K is the square, symmetric genomic realized relationship matrix based on SNP markers. The genomic relationship matrix was constructed with the function A.mat in the R package rrBLUP (Endelman, 2011) and follows the formula of VanRaden (2008), Method 2. Predictions using GBLUP were made with the function emmreml in the R package EMMREML (Akdemir and Okeke, 2015).

Reproducing Kernel Hilbert Spaces
We made predictions with reproducing kernel Hilbert spaces (RKHS). The genomic relationship matrix used in the GBLUP model described above can be considered as a parametric (additive genetic) kernel function and exists as a special case of RKHS (Gianola and van Kaam, 2008;Morota and Gianola, 2014). For RKHS predictions, we used a mixed model of the same form as for GBLUP above. Unlike the case of GBLUP, we used a Gaussian kernel function: where K ij was the measured relationship between two individuals, d ij was their Euclidean genetic distance based on marker dosages, and θ was a tuning (sometimes called a "bandwidth") parameter that determines the rate of decay of correlations among individuals. Because this is a nonlinear function, the kernels we used for RKHS could capture nonadditive as well as additive genetic variation. Thus the BLUPs from RKHS models represent genome estimated total genetic values rather than GEBVs. Because the optimal θ must be determined, a range of values was tested in two ways. First, we did cross-validation with the following θ values and selected the one with the best accuracy: 0.0000005, 0.000005, 0.00005, 0.0001, 0.0005, 0.001, 0.004, 0.006, 0.008, 0.01, 0.02, 0.04, 0.06, 0.08, and 0.1 (single-kernel RKHS). Second, we used the emmremlMultiKernel function in the EMMREML package (Akdemir and Okeke, 2015) to fit a multikernel model with six covariance matrices, with the following bandwidth parameters and allowed restricted maximum likelihood to find optimal weights for each: 0.0000005, 0.00005, 0.0005, 0.005, 0.01, and 0.05 (multikernel RKHS).

Bayesian Marker Regressions
We tested four well-established Bayesian prediction models: BayesCpi (Habier et al., 2011), the Bayesian LASSO (BL; Park and Casella, 2008), BayesA, and BayesB (Meuwissen et al., 2001). In ridge-regression (equivalent to GBLUP), marker effects were all shrunk by the same amount, because we assume they are all drawn from a normal distribution with the same variance. Further, all markers have a nonzero effect and most have small effects, essentially assuming that the genetic architecture of the trait is infinitesimal. In contrast, the Bayesian models we tested allow for alternative genetic architectures by inducing differential shrinkage of marker effects. For BayesA and Bayesian LASSO, all markers have a nonzero effect but the marker variances are drawn from scaled-t and double-exponential distributions respectively, which are both distributions with thicker tails and greater density at zero. Both BayesB and BayesCpi are variable selection models because the marker variances come from a two-component mixture of a point mass at zero and either a scaled-t distribution (BayesB) or a normal distribution (BayesCpi). Fitting BayesB and BayesCpi begins by estimating a parameter pi, representing the proportion of markers with a nonzero effect. We performed Bayesian predictions with the R package BGLR (Pérez and De Los Campos, 2014). Following Heslot et al. (2012) and others, we ran BGLR for 10,000 iterations, discarded the first 1000 iterations as burn-in, and thinned the remainder to every fifth sample. Marker dosages were mean-centered on the combination of training and test sets before analysis. Convergence was confirmed visually in initial test runs using the CODA package in R (Plummer et al., 2006).

Random Forest
Random Forest (RF) is a machine learning method used widely in regression and classification (Breiman, 2001;Strobl et al., 2009). The use of RF regression with marker data has been shown to capture epistatic effects and has been successfully used for prediction of genome estimated total genetic value (Breiman, 2001;Motsinger-Reif et al., 2008;Michaelson et al., 2010;Heslot et al., 2012;Charmet et al., 2014;Sarkar et al., 2015;Spindel et al., 2015). In prediction, a random forest is a collection of r regression trees grown on a subset of the original dataset that is bootstrapped over observations and randomly sampled over predictors. Averaging the prediction over trees for validation observations then aggregates the information. We used RF with the parameter with ntree set to 500 and the number of variables sampled at each split (mtry) equal to 300. We implemented RF with the randomForest package in R (Liaw and Wiener, 2002). As in the Bayesian regressions, marker dosages were mean-centered before RF analysis.

Comparison of Models Based on the Similarity of Rankings
To test for GS model similarities among breeding programs, we clustered the GEBV output on a breeding program basis. Genomic estimated breeding values from each model were scaled and centered on a column basis with the scale function in R and were then used to construct a matrix of Euclidean distances between models. Distance matrices were used as an input for hierarchical clustering using the Ward criterion implemented in the hclust R function (Heslot et al., 2012).

Cross-Generation Genomic Predictions
Because nearly all of the IITA germplasm from C1 and C2 had been clonally evaluated, we were able to test the prospects for predicting unevaluated progeny. We predicted all traits via all methods in four scenarios: GG predicting C1, GG predicting C2, C1 predicting C2, and GG + C1 predicting C2. Unlike the other predictions presented in this study, cross-generation predictions were done in a single step (raw phenotype and genomic data were fitted simultaneously). The exception was for RF, where correction for location and blocking factors is not supported. For RF prediction, we used the same de-regressed EGVs as for cross-validation. The software and parameters used were the same as already described. The design model was the same as that described for IITA above.

Training Population Update
We evaluated the impact on cross-generation prediction accuracy of phenotyping different size subsets of the unselected C1 (materials selected for crossing in each cycle were phenotyped, but unselected materials were not phenotyped in all cases). We selected subsets of C1 using two methods: randomly and with a genetic algorithm implemented in the R package STPGA .
The STPGA package uses an approximation of the mean PEV expected for a given set of training individuals in combination with a given set of test genotypes as a criterion (which does not require phenotype data) for selecting the "optimal" training set. The genetic algorithm implemented by STPGA is used to rapidly find the training set that minimized the selection criterion (the mean PEV of the test set; . To speed up computation, STPGA uses principal components rather than raw SNP markers as genetic predictors. Parents selected for further recombination were cloned into a crossing block. This was the point at which additional unselected seedlings must be chosen for phenotyping to incorporate their data in predictions of the eventual progeny that are produced. Since the next generation of progeny had not yet been produced, we targeted STPGA on the parents of C2. Figure 3 provides a schematic of GS with the TP update and optimization with STPGA. We constructed a genomic relationship matrix with only C1 (including the parents of C2). We did a principal component analysis on the kinship matrix and took the first 100 principal components as genomic predictors. We ran 1000 iterations of the genetic algorithm 10 times at each sample size. Sample sizes ranged from 200 to 2400 at increments of 400 (Supplemental Table S1). Predictions at each sample size were then made with each of 10 random and 10 optimized training sets using GBLUP in two scenarios: either just the sample of C1 was used to train the model or the sample of C1 plus all of GG were used.

Cross-Population Genomic Predictions
We predicted all traits using all methods in three scenarios: GG + NR predicting UG, GG + UG predicting NR, and NR + UG predicting GG (Supplemental Table S2A). Cross-population predictions were made with the prediction models described above and followed the two-step approach as described above.
We selected optimized subsets of the combined datasets with a genetic algorithm implemented in the R package STPGA . Random subsets of the same size as the optimized subsets (300, 600, 900, and 1200) were selected for comparisons between predictive accuracies. Predictions at each sample size were then made for 10 random and 10 optimized training sets with GBLUP.

Results
After quality control and keeping only markers with >1% minor allele frequency, the datasets had between 70,010 and 78,212 SNP markers (Table 1). Principal component analysis of the genomic relationship matrix indicated some genetic differentiation between Nigerian populations (GG and NR) and the Ugandan TP (UG; Supplemental Fig. S1a). In contrast, there was little differentiation between the NRCRI and IITA GG datasets, even when we compared only the nonoverlapping clones. We also calculated the F ST between populations as implemented in vcftools (Danecek et al., 2011). In agreement with results from the principal component analysis, the F ST between GG and NR was only 0.008, but was 0.019 and 0.021 between the Ugandan and the Nigerian populations, GG and NR, respectively. There was a similar amount of genetic differentiation between the IITA C2 progeny and its grandparental GG population (F ST = 0.02), as there was between GG and UG (Table 1, Supplemental Fig. S1b).
The mean inbreeding coefficient (F), as measured by the mean of the diagonal of the genomic relationship matrix, was similar for all populations, ranging from 0.933 in GG to 0.965 in C1. The mean rate of heterozygous loci was also similar between populations, ranging from 0.15 to 0.17. There was no notable decrease in heterozygosity or increase in the inbreeding coefficient from GG to C1 or from C1 to C2 (Table 1; Supplemental Fig. S2).

Prediction Within Breeding Populations
We tested seven genomic prediction models that differed in their extent and the kind of shrinkage, which is relevant in modelling different genetic architectures, and in their ability to capture nonadditive effects (Supplemental Fig. S3 to Supplemental Fig. S5).
Overall, breeding populations exhibited differences in the cross-validated prediction accuracies between methods and across traits (  Fig. S5). For NRCRI (n = 899), the mean predictive accuracy values across methods ranged between -0.02 for plant vigor and 0.27 for HI. For NaCRRI (n = 411), the mean predictive accuracy values ranged between 0.23 for SHTWT and 0.46 for HI. Meanwhile, the predictive accuracy values for GG (n = 709) ranged between 0.22 for plant vigor and 0.66 for DM.
In the NRCRI population, RKHS and RF, which capture nonadditive effects, had the highest predictive accuracy values for all traits except plant vigor. The trait with the highest predictive accuracy was RTWT (RF (0.34)) and the lowest predictive accuracy was found for vigor (multi-kernel RKHS (-0.03)). Fig. 3. Schematic of genomic selection with training population optimization carried out by STPGA. The selection was initially made among available genotyped candidates on the basis of genomic predictions with available phenotype data. Selected parents are grown and mated in a crossing block. The resulting Cycle 1 (C1) seeds are subsequently collected and grown in a nursery. Cycle 1 seedlings were genotyped by genotyping-by-sequencing (GBS) and selections were made on the basis of genomic prediction alone. Selected parents of Cycle 2 (C2) were cloned into a crossing nursery. STPGA was used to select the optimal additional C1 seedlings to plant in a clonal evaluation trial. Because C2 seedlings do not yet exist, STPGA was instead used to select the optimal C1 seedlings to predict the selected parents of C2. Phenotypes from the C1 clonal evaluation were added to the existing genomic prediction training dataset. The updated training model will be used to predict breeding values of theC2 seedlings when the GBS data become available and the selections of the parents of Cycle 3 (C3) is made. Subsequent cycles will proceed following this procedure.
In the NaCRRI population, the multikernel RKHS model showed the highest predictive accuracies for all traits except for CMD, for which BayesB showed the highest value (r = 0.50). In this population, CMD had the overall highest predictive accuracy across traits, whereas SHTWT exhibited the lowest predictive accuracy (Bayesian LASSO, r = 0.18).
In the IITA GG population, Bayesian approaches performed better for vigor, CMD, SHTWT, and DM, but the RKHS method showed higher predictive accuracies for HI and for yield related traits such as RTWT and RTNO. Meanwhile, RF gave better predictive accuracy when it was used to estimate GEBVs.
Some trait-dataset combinations exhibited better predictive accuracies than others. For example, NaCRRI population had better predictive accuracies for yield components like HI, RTWT and RTNO but the highest predictive values for CMD and DM were obtained in the GG population.
Similar to Heslot et al. (2012), we compared the cross-validated GEBVs following a clustering approach. The results in Supplemental Fig. S6 show the hierarchical cluster trees from the combined results of the three breeding populations. Differences in the clustering of methods are observed across datasets (Fig. 4). In the NRCRI data, we found two groups of clustering GS methods, with BayesB, BayesC, and GBLUP in one group and the rest on the other group. In the NaCRRI and IITA populations, nonparametric methods such as RKHS and RF clustered together, BayesA clustered with Bayesian LASSO, and GBLUP clustered with BayesC or BayesB.

Cross-Population Prediction
Previous studies have reported close relatedness between the clones in the next-generation TPs (Wolfe et al., 2016b). One important question within this project is whether or not datasets from different breeding programs can be combined in a training set to increase predictive accuracy. The application of any prediction model with the combined dataset would then benefit from an increase in the TP size with the prospect of using such

models in other cassava breeding programs in Africa.
With that in mind, we used combined datasets of GG + NR, GG + UG, and UG + NR to predict the population that was not included in the training set (UG, NR, and GG, respectively).
When we predicted the traits in the UG dataset with the combined GG + NR full set, Bayesian models gave better predictive accuracies for MCMDS, RTNO, and DM. Random Forest gave better predictive accuracies for HI and RKHS was best for RTWT and SHTWT (Supplemental Table S2a).
The average predictive accuracy with the combined GG + NR full set as the training set with the GBLUP model was consistently lower for all the traits than the average GBLUP cross-validation results (Supplemental Table S2a). Furthermore, the subsets selected by STPGA to predict the NaCRRI (UG) validation set gave, for all traits and all subset sizes, lower predictive accuracies than the GBLUP cross-validation model (Table 3; Supplemental Fig. S7; Supplemental Table S2b).
For plant vigor, MCMDS, and HI, the optimized STPGA subsets gave higher predictive accuracies than the combined GG + NR full training dataset. With few exceptions (MCMDS, SHTWT, and DM), the optimized STPGA datasets gave better prediction accuracies than the same sized random sets. As the optimized STPGA dataset increased in size, the predictive accuracy did not increase, except for RTNO, where the highest predictive accuracy was found when the TP size was 1200.
When the combined GG + UG full training dataset was used to predict the NRCRI TP, Random Forest and RKHS prediction models performed better for RTWT, SHTWT, RTNO, and plant vigor. Bayesian models gave better predictive accuracies for MCMDS and DM. For plant vigor, MCMDS and DM, the combined UG+GG full dataset gave better predictive accuracies than the GBLUP cross-validation model (Supplemental Fig. S8; Supplemental Table S2b). For prediction of the NRCRI TP, the optimized STPGA selected datasets gave better predictive accuracies for plant vigor, RTWT, RTNO, and SHTWT than the combined UG+ GG full training dataset.
To predict the NRCRI TP for all traits except RTNO (at n = 900 and n = 1200) and CMD (n = 900), the optimized datasets gave higher predictive accuracies than the random datasets. For plant vigor, CMD resistance, and DM, the selection of optimized datasets with STPGA gave better predictive accuracies than the GBLUP cross-validation model.
Among the STPGA datasets, the highest predictive accuracy was not always the result of an increase in TP size. For CMD resistance, the highest predictive accuracy was found for the smallest optimized dataset, with the same value as the highest optimized size,.
The predictive accuracy results of traits in the GG dataset using the full training set (UG+NR) varied across methods. Whereas Bayesian methods gave better predictive accuracy values for MCMD and plant vigor, RKHS performed better for DM, HI, RTWT, and SHTWT. The combined (UG+NR) full training dataset for predicting the GG population gave lower predictive accuracies than the GBLUP cross-validation model for all traits. The GBLUP cross-validation model also gave better predictive accuracies for all the traits than the random and optimized STPGA datasets. The optimized STPGA datasets gave better predictive accuracies than the random sets for all traits except for plant vigor and for DM (optimized dataset n = 900) (Supplemental Fig. S9; Supplemental Table S2b). For all traits except MCMDS and DM, the optimized STPGA subsets gave higher predictive accuracies than the combined UG+NR full training dataset. For all the cross-population results, we tested if the optimized STPGA sets would do better than random with a binomial test, assuming the independence of the comparisons. We compared how many times the prediction accuracy of STPGA was greater than random for all traits. We found that for prediction of the NR and UG sets, the STPGA-optimized sets performed better than the random sets. On the contrary, when we applied the same comparison of the STPGA sets with the predictions with full sets, the latter had a significantly higher number of full sets that was greater than STPGA's predictive accuracy results.
Additionally, we tested if there was differential enrichment in the optimized STPGA training set of any of the populations relative to the source sets. We found a significant enrichment of the GG population (p < 0.001) in the STPGA of different sizes for the prediction of the NR set with GG + UG. Similarly, we found a significant enrichment of the NR population (p < 0.001) in the STPGA of different sizes for predicting the GG set with the UG-NR. On the contrary, we found no significant enrichment of any population in the STPGA-optimized sets predicting the UG population.

Cross-Generation Prediction
One major area where analysis was needed concerned prediction across generations. Selections can be done at the seedling stage if GEBV can be predicted from the previous generations and training data. Because nearly all of the IITA germplasm from C1 and C2 were clonally evaluated, we were able to use these data to assess the accuracy of genomic predictions on unevaluated genotypes of the next generation. In general, the accuracy of prediction across generations was greatest when predicting C2, as shown by averaging across prediction models and traits for predictions trained either with C1 (mean = 0.19 ± SE 0.02) or GG + C1 (0.19 ± 0.02). The accuracy was lower on average when we predicted C2 with GG (0.11 ± 0.01) than when we predicted C1 with GG (0.17 ± 0.02). Accuracy was lowest for both plant vigor and RTWT (0.06 ± 0.005) and was highest for MCMDS (0.32 ± 0.03) and DM (0.38 ± 0.01). Most prediction models performed similarly, as shown by the averaged accuracy across traits and training-test combinations, with RF performing worst (0.08 ± 0.01) and BayesA and BayesB performing best (both 0.20 Table 3. Summary of mean genomic best linear unbiased prediction (GBLUP) cross-validated predictive accuracies across populations. Four subset selection methods (random vs. STPGA) and the full set were considered. The highest predictive accuracy across subsets and the full set is indicated in bold.   Table S3).

Training Population Update
The first 100 principal components of the C1 kinship matrix were used as predictors for STPGA and explained 97.7% of the genetic variance. In all cases, the genetic algorithm converged within the 1000-iteration run (Supplemental Fig. S11).
Given the constraints of breeding programs described above, it was necessary to select samples of C1 that were optimized for predicting the parents of C2, rather than the C2 themselves. Despite targeting the parents of C2, we used selected training sets to predict C2, thus simulating the addition of phenotypes to the training set. Because of this, we compared the accuracy of subsets of C1 predicting C2 to the accuracy of predicting the parents of C2. As the number sampled increased from 200 to 2400, averaging across traits and methods for subset selection (STPGA and at random), accuracy increased by 120 and 105% when predicting C2 and the parents of C2, respectively. The increase in accuracy was smaller when we included the 709 GG clones in the prediction, increasing only by 43 and 36% respectively when predicting C2 and parents of C2 (Supplementary Table S4).
The STPGA approach consistently selected training datasets with a lower expected mean PEV on the test set than random sampling, across training set sizes (Supplemental Fig. S12). Further, using STPGA to select clones for phenotyping gave 13% better accuracy on average (average accuracy of 0.242 vs. 0.214, two-tailed t = 6.29, df = 4458, p < 0.0001) than random sampling. Broken down by validation set, STPGA was significantly better than random for predicting the parents of C2 (t = 9.8, df = 2147, p < 0.0001) but was not significantly better for predicting C2 (t = 1.41, df = 2227, p = 0.16).
We compared these accuracies with that of the full set of C1 (or GG + C1) and to the cross-validation accuracy within the test set (C1 for prediction of the parents of C2, and C2 for predictions of C2). When predicting C2, which was our primary goal, the subsets were almost always inferior to the full set, with the exceptions of the middle sizes for RTWT, but the advantage was very small (Fig. 6, Supplemental Fig. S13). However, STPGA-selected subsets tended to have better accuracy than the full set, especially for yield components when predicting the parents of C2, which were the genotypes targeted by the optimization algorithm (Fig. 7, Supplemental Fig. S14).
The correlation between the selection criterion (mean PEV) used by STPGA and the training set size is strong for all traits (range = -0.57 to -0.61). Aside from Seven genomic prediction methods were tested for seven traits (panels). For each model (colors, x-axis within panels), four predictions were made: Genetic Gain (GG) predicting Cycle 1 (C1), GG predicting (Cycle 2) C2, C1 predicting C2, and GG + C1 predicting C2, indicated by shapes. All data are from the International Institute for Tropical Agriculture (IITA) genomic selection program. DM, dry matter content; HI, harvest index; RTWT, root weight; RTNO, root number; SHTWT, shoot weight; MCMDS, mean cassava mosaic disease severity; VIGOR, early plant vigor.
simply increasing the TP size, we wanted to assess the extent to which the mean PEV could be used as a predictor of the achievable accuracy. Regression of prediction accuracies for each sample (regardless of whether it was selected randomly or by STPGA) on mean PEV explained between 8% (RTNO) and 46% (DM) of the variance in accuracy. Multiple regression including mean PEV and training set size as predictors showed PEV to be the more significant predictor (across all traits). In fact, training set size was not a significant explanatory variable for RTWT or RTNO (Supplemental Table S5).

Discussion
The Next Generation Cassava Breeding Project (www. nextgencassava.org, accessed 15 Aug. 2017) aims to assess the potential of genomic selection in cassava to reduce the length of the breeding cycle and increase the number of crosses and selections per unit of time. The project is implementing GS in three breeding programs from Nigeria and Uganda, with genotypic and phenotypic data from TPs and two cycles of selection available on a database dedicated to cassava (www.cassavabase.org, accessed 15 Aug. 2017).
In general, the performance of predictive models is known to be conditional on the genetic architecture of the trait under consideration (Daetwyler et al., 2010;Su et al., 2014). Although nonadditive models, including RF and RKHS, capture dominance and epistasis effects, GBLUP is more suitable for prediction when traits are determined by an infinite number of unlinked and nonepistatic loci, with small effects.
Not surprisingly, heritability varied between populations, conceivably as a consequence of the differences in the number and design of field trials among breeding programs. For most traits, it is not possible to determine the reason for differences in heritability exactly. However, for DM, we can hypothesize that the difference in phenotyping protocols between programs (the specific gravity method at NRCRI and NaCRRI versus oven drying at IITA) could account for the observed differences. We note the estimate of zero heritability for RTWT, RTNO, and SHTWT in the Fig. 6. The relationship between training set size and the accuracy of predicting the International Institute for Tropical Agriculture Cycle 2 (C2) (across generations). The accuracy of prediction for seven traits (panels) with the IITA Genetic Gain (GG) population training data plus data from different sized subsets (x-axis) of their progeny, Cycle 1 (C1) is shown. Subsets of a given size were selected either at random or with the genetic algorithm implemented in the R package STPGA. Ten random and 10 STPGA-selected subsets were made for each training set size. Error bars are the SE around the mean for the ten samples. Horizontal black lines show the mean crossvalidation accuracy for C2 (validation set; solid line) and the accuracy of the full set of GG + C1 predicting C2 (dashed line). DM, dry matter content; HI, harvest index; RTWT, root weight; RTNO, root number; SHTWT, shoot weight; MCMDS, mean cassava mosaic disease severity; VIGOR, early plant vigor.
IITA C2 and acknowledge this is likely to account for the quality of cross-generation prediction in that dataset.
The cross-validation results were mostly consistent across breeding programs and the superiority of one prediction method over the others was trait-dependent. Random Forest and RKHS usually predicted phenotypes more accurately for yield-related traits, which are known to have a significant amount of nonadditive genetic variation (Wolfe et al., 2016a). Similar findings have been made in wheat (Triticum aestivum L.) for grain yield, an additive and epistatic trait, in which RKHS, radial basis function neural networks, and Bayesian regularized neural networks models clearly had a better predictive ability than additive models like BL, Bayesian ridge-regression, BayesA, and BayesB (Perez-Rodriguez et al., 2013).
Though the cross-validation results within the breeding programs are encouraging for the use of GS, prediction values across breeding programs were fairly low. Mean F ST values were low (less than 0.05), indicating that the three breeding populations share genetic material. Despite this, our results indicate that the prospect for sharing data across Africa to assist in GS is limited to certain traits (most notably MCMDS) and populations.
Indeed, obtaining a larger training set by combining TP did not always lead to higher prediction accuracies than what could already be achieved within that population, as shown by the cross-validation results.
In animal models, prediction with multibreed populations has also been shown to be poor, with most of the observed accuracy caused by population structure (Daetwyler et al., 2012). An alternative kernel function has been proposed to estimate the covariance between individuals based on markers, which can improve the fit to the data to account for the genetic heterogeneity of breeding populations .
Conceivably, in our study, the addition of individuals from different breeding programs was detrimental caused by the inconsistent heritability of most traits. Another possibility is genotype × environment interaction. The impact of genotype × environment interactions on predictive accuracy has been reported in wheat when the same population was evaluated in different environments (Crossa et al., 2010;Endelman, 2011). Similarly, in cassava with historical data from the IITA's GG population, prediction across locations led to a decrease in accuracy (Ly et al., 2013). Fig. 7. The relationship between training set size and the accuracy of predicting the parents of Cycle 2 (C2) [from Cycle 1 (C1), withingeneration). The accuracy of the predictions for seven traits (panels) with the International Institute for Tropical Agriculture Genetic Gain (GG) population training data plus data from different sized subsets (x-axis) of their progeny, Cycle 1 is shown. Subsets of a given size were selected either at random or with the genetic algorithm implemented in the R package STPGA. Ten random and 10 STPGAselected subsets were made for each training set size. Error bars are the SE around the mean for the 10 samples. Horizontal black lines show the mean cross-validation accuracy for C1 (validation set; solid line) and the accuracy of the full set of GG + C1 predicting the parents of C2 (dashed line). DM, dry matter content; HI, harvest index; RTWT, root weight; RTNO, root number; SHTWT, shoot weight; MCMDS, mean cassava mosaic disease severity; VIGOR, early plant vigor.
Using the training sets selected based on an optimized algorithm gave better predictive ability than randomly assigned samples but showed a decrease in accuracy when compared with the GBLUP cross-validation results. Although in previous studies, the predictive accuracies with full sets were lower than those obtained with optimized subsets (Rutkoski et al., 2015), in our study, we found the opposite, indicating that a larger training set was more advantageous. Combining data from different experiments and populations for cross-population prediction remains promising for traits like CMD, where the GWAS results indicate a stable large-effect quantitative trait loci throughout the tested breeding populations (Wolfe et al., 2016b).
When predicting unevaluated progenies from the next generation (cross-generation prediction), our results indicated, in our judgment, that accuracy should be sufficient for DM, MCMDS, and, to a lesser extent, HI (Fig. 5). Although accuracy is stable across the generations tested for DM with most models, for MCMDS to be successful, we recommend using a Bayesian shrinkage model such as BayesA or BayesB. The advantage of these models over GBLUP for CMD resistance probably arises because of the major known quantitative trait loci segregating in the population Wolfe et al., 2016a) and the ability of these two models to allow differential contributions of markers near the quantitative trait loci to the prediction. One disadvantage of BayesB, in particular, is that the known polygenic background resistance for CMD may become deemphasized in favor of heavy selection on the major effect gene(s) (Hahn et al., 1980;Legg and Thresh, 2000;Akano et al., 2002;Rabbi et al., 2014;Wolfe et al., 2016b).
We noted that RF and RKHS performed poorly across generations; this is a result that makes sense, given that the predictability of epistatic and dominant interactions declines with recombination (Lynch and Walsh, 1998).
On the basis of the datasets analyzed in this study, it was apparent that the size of a TP had a significant impact on prediction accuracy for most traits. Thus breeding programs will benefit from phenotyping the maximum possible amount. In agreement with the results in other crops (Rincent et al., 2012;Isidro et al., 2015), our results indicate that optimization algorithms like STPGA can provide at least a small advantage over random selection of materials for phenotyping.
Each breeding program will need to determine the amount of phenotyping vs. genotyping to do to maximize prediction accuracy and selection gain based on the cost and availability of land, labor, and genotyping. An analysis in barley (Hordeum vulgare L.) by Endelman et al. (2014) provides a good example of the potential complexity of these decisions. The authors show, as we do, that having a larger number of phenotyped individuals is always beneficial, and that it is usually beneficial to focus on evaluating new lines at the expense of additional phenotyping of old lines. However, if genotyping costs are high, the cost-benefit balance shifts toward more evaluation of the existing lines (Endelman et al., 2014). Endelman et al.'s (2014) study focused on prediction in biparental populations. Although this is likely to apply to cassava breeding populations, we stress the necessity of doing such an analysis for each breeding application separately.
An important result is that STPGA was able to find subsets that were better than the full set for predicting the parents of C2. The parents of C2 are members of C1 and were the individuals targeted by STPGA. One possible interpretation is that the benefit comes from phenotyping members of the same generation. If that were true, we could make a significant difference in accuracy by phenotyping a subset of clones from the current generation before predicting GEBV for the entire set of selection candidates. To do this without lengthening the selection and recombination cycle, harvested stems would need to be stored long enough for phenotypic data to be curated, predictions and selections to be conducted, and STPGA to be run. Methods of storing cassava stakes for up to 30 d are available, indicating that such a scheme could be possible (Sungthongw et al., 2016). Even without improved stem cutting storage, this could be done while only lengthening the selection and recombination cycle to perhaps 1.5 to 2 yr, which would still be significantly faster than conventional cassava breeding.
A related possibility is to place annual selection pressure on traits that are predictable across generation (e.g., MCMDS, HI, and DM). Predictions of total genetic value for yield traits for selection of clones that will be tested as potential varieties could then be done after clonal evaluation data become available on at least a subset of contemporary genotypes. Further trials will be necessary to determine whether there is an advantage to this type of strategy.
The primary promise GS offers to cassava breeding is the ability to select and recombine germplasm more frequently and thus hopefully speed the rate of population improvement while combining a myriad of quality, disease, and yield-related traits into a single genotype that can be released as a variety. The applicability of the results from the different prediction models in cassava is then dependent on whether the goal is the prediction of breeding values of progeny or the selection of advanced lines for testing as varieties.
We are still in the early stages of GS in this crop, but the results are promising, at least for some traits. The TPs need to continue to grow and quality phenotyping is more critical than ever. However, general guidelines for successful GS are emerging. Phenotyping can be done on fewer individuals, cleverly selected, making for trials that are more focused on the quality of the data collected.

Supplemental Information
Supplemental File S1: Supplementary methods describing the details of the field trial design. Supplemental Table S1: Details of the prediction scenarios tested using different sized subsets of the IITA Cycle 1. Supplemental Table S2: Cross-population prediction results.
(A) Cross-population results of seven prediction models for the combined datasets (full set model showing the accuracy of the predictions for seven traits with the combined NR + GG population training data. Subset sizes (x-axis) were selected either at random or by using the genetic algorithm implemented in the R package STPGA. Ten random and 10 STPGA-selected subsets were made at each training set size. Error bars are the SE around the mean for the 10 samples. Horizontal lines show the mean cross-validation accuracy for the UG population (validation set, orange line) and the accuracy of the full NR + GG set predicting the UG population (red line). Supplemental Fig. S8: Cross-population prediction of NR, showing the accuracy of the predictions for seven traits with the combined UG + GG population training data. Subsets sizes (x-axis) were selected either at random or by using the genetic algorithm implemented in the R package STPGA. Ten random and 10 STPGA-selected subsets were made at each training set size. Error bars are the SE around the mean for the 10 samples. Horizontal lines show the mean cross-validation accuracy for the NRCRI population (validation set, orange line) and the accuracy of the full UG + GG set predicting the NR population (red line). Supplemental Fig. S9: Cross-population prediction of GG, showing the accuracy of prediction for seven traits with the combined NR + UG population training data. Subset sizes (x-axis) were selected either at random or by using the genetic algorithm implemented in the R package STPGA. Ten random and 10 STPGA-selected subsets were made at each training set size. Error bars are the SE around the mean for the 10 samples. Horizontal lines show the mean cross-validation accuracy for the GG population (validation set, orange line) and the accuracy of the full NR + UG set predicting the GG population (red line). Supplemental Fig. S10: Cross-generation prediction accuracies.
In the IITA genomic selection dataset, there are three generations of clones: the genetic gain (GG), their progeny in Cycle 1 (C1), and C1's progeny, Cycle 2 (C2). For each of seven traits (rows) and seven prediction models (x-axis, colors), we made four cross-generation predictions (columns): GG predicting C1, GG predicting C2, C1 predicting C2, and GG + C1 predicting C2. DM, dry matter content; HI, harvest index; RTWT, root weight; RTNO, root number; SHTWT, shoot weight; MCMDS, mean cassava mosaic disease severity; VIGOR, early plant vigor. Supplemental Fig. S11: Convergence of the genetic algorithm implemented in the R package STPGA. Plot of the optimization criterion (mean PEV, y-axis) versus the iteration of the genetic algorithm (x-axis) across training sample sizes (panels). Samples were drawn from the IITA Cycle 1 (C1), excluding the parents of Cycle 2 (PofC2). The algorithm was set to find the smallest mean PEV with the PofC2 as the test (validation) set and a sample of C1 as the training set. Ten runs of the genetic algorithm are shown in different colored lines. Supplemental Fig. S12: Does STPGA find lower mean PEV across sample sizes than random selection? The size of training samples used in four prediction scenarios (rows) is plotted against the mean PEV of the subset for every trait and each of 10 samples elected either by the genetic algorithm implemented in the R package STPGA (red) or randomly (blue). The actual mean PEV and number of training samples are plotted here, with variations from planned sample sizes and PEV means initially expected because of missing data for some traits or individuals. The genetic algorithm implemented by STPGA was run 10 times. The validation set target for the optimization algorithm were the parents of IITA's Cycle 2 (PofC2) and the training sets were samples of differing size of the IITA Cycle 1 (C1). Predictions were made either with samples of C1 only (Rows 1 and 2) or with samples of C1 plus the entire GG (Rows 3 and 4). Validation sets were either the PofC2 (Rows 1 and 3) or the C2 (Rows 2 and 4). Supplemental Fig. S13: The relationship between training set size and accuracy predicting IITA Cycle 2 (across generations). The accuracy of prediction for seven traits (panels) with different sized subsets (x-axis) of IITA Cycle 1 (C1) is shown. Subsets of a given size were selected either at random or with the genetic algorithm implemented in the R package STPGA. Ten random and 10 STPGA-selected subsets were made at each training set size. Error bars are the SE around the mean for the 10 samples. Horizontal black lines show the mean cross-validation accuracy for Cycle 2 (C2, validation set; solid line) and the accuracy of the full set of GG + C1 predicting C2 (dashed line). GBLUP was used for all predictions. DM, dry matter content; HI, harvest index; RTWT, root weight; RTNO, root number; SHTWT, shoot weight; MCMDS, mean cassava mosaic disease severity; VIGOR, early plant vigor.
Supplemental Fig. S14: The relationship between training set size and the accuracy of predicting the parents of Cycle 2 (C2) from Cycle 1 (C1) (within-generation prediction). The accuracy of prediction for seven traits (panels) with different sized subsets (x-axis) of the IITA C1 is shown. Subsets of a given size were selected either at random or with the genetic algorithm implemented in the R package STPGA. Ten random and 10 STPGAselected subsets were made at each training set size. Error bars are the SE around the mean for the 10 samples. Horizontal black lines show the mean crossvalidation accuracy for the C1 (validation set; solid line) and the accuracy of the full set of GG + C1 predicting the parents of C2 (dashed line). GBLUP was used for all predictions. DM, dry matter content; HI, harvest index; RTWT, root weight; RTNO, root number; SHTWT, shoot weight; MCMDS, mean cassava mosaic disease severity; VIGOR, early plant vigor.