Simulated data from a genotype-to-phenotype crop growth model for pepper

The data in this article includes 300 simulated two-way data tables, each with 200 genotypes in the rows and 12 environments in the columns. The yield data was obtained from a genotype-to-phenotype crop growth model that was adapted for pepper. The genotypes were characterized by 237 markers covering all 12 chromosomes, and the environments were obtained as a combination of: (i) two levels of radiation based on historical data; (ii) three levels of daily average temperature, 15, 20 and 25 °C; and (iii) two countries, Spain and The Netherlands. For each of the three levels of heritability in the environments (0.3, 0.5 and 0.8), 100 two-way data tables were obtained. The data is available as supplementary material to this paper.



Value of the Data
• The data (300 data tables) will be useful to validate and compare statistical methods for modelling genotype-by-environment interaction.
• Researchers and practitioners in plant science and agronomy can benefit from these data.
• The data can be used for general model comparison and simulation in the context of plant sciences and of genotype-by-environment studies.
• The data can be used to compare methods for stability analysis.

Data Description
Two-way data tables are of key importance in plant sciences for structuring and understanding genotype-by-environment interactions, and thereby for improving genotype adaptability and maximizing yield [4,6]. This data article presents 300 simulated two-way data tables of yields with 200 genotypes in the rows and 12 environments in the columns, 100 for each level of heritability in the environments, 0.3, 0.5 and 0.8 (Supplementary Material). The yield data was obtained from a genotype-to-phenotype crop growth model, adapted for pepper. The genotypes were characterized by 237 markers covering all 12 chromosomes, following Barchi et al. [1]. The environments were obtained as a combination of: (i) two levels of radiation based on historical data; (ii) three levels of daily average temperature, 15, 20 and 25 °C; and (iii) two countries, Spain and The Netherlands. The codes in the data files are of the form "XXa-bb", with XX being NL for The Netherlands and SP for Spain, a = 1, 5 for lower and higher radiation in the historical data, respectively, and bb = 15, 20, 25 for the respective temperature. The R script to generate the data is available in the supplementary material, together with two csv files needed to run the code.
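The "XXa-bb" coding scheme described above can be decoded programmatically. The following is a minimal Python sketch, assuming that a concrete code looks like NL1-15 (country, radiation digit, hyphen, temperature); the names `Environment`, `parse_environment` and `all_environment_codes` are hypothetical helpers and are not part of the supplementary R script.

```python
import re
from typing import List, NamedTuple


class Environment(NamedTuple):
    country: str        # "The Netherlands" or "Spain"
    radiation: str      # "lower" or "higher" (historical radiation level)
    temperature_c: int  # daily average temperature: 15, 20 or 25


# Mappings taken from the description of the codes; the exact spelling of a
# code (for example "NL1-15") is an assumption based on the "XXa-bb" pattern.
_COUNTRIES = {"NL": "The Netherlands", "SP": "Spain"}
_RADIATION = {"1": "lower", "5": "higher"}
_CODE = re.compile(r"^(NL|SP)([15])-(15|20|25)$")


def parse_environment(code: str) -> Environment:
    """Split an environment code into country, radiation level and temperature."""
    match = _CODE.match(code.strip())
    if match is None:
        raise ValueError(f"not a valid environment code: {code!r}")
    country, radiation, temperature = match.groups()
    return Environment(_COUNTRIES[country], _RADIATION[radiation], int(temperature))


def all_environment_codes() -> List[str]:
    """Enumerate the 2 countries x 2 radiation levels x 3 temperatures."""
    return [
        f"{country}{radiation}-{temperature}"
        for country in ("NL", "SP")
        for radiation in ("1", "5")
        for temperature in (15, 20, 25)
    ]
```

Enumerating the codes this way gives the 12 column labels expected in each two-way table, which can serve as a quick check when reading the files.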

Experimental Design, Materials and Methods
A physiological genotype-to-phenotype model was used in which only growth-defining factors determine the maximum production that can be achieved under given environmental conditions and crop characteristics. The model was developed by Rodrigues [8] and Rodrigues et al. [3], and applied by Rodrigues et al. [7]. It expresses the yield of genotype $i$ in environment $j$ (Eq. (1)) in terms of the following quantities: $\mathrm{FDMC}_i$ is the fruit dry matter content of genotype $i$; $T_j$ defines the $l$ levels of daily average temperature, constant across the growing season; $t_0$ and $t_f$ are the indices of the first and last days of the growing season; $K$ is the light extinction coefficient; and the factor $\{1 - W_i \times (T_j - T_{FTF})\}$ represents a genotype-specific linear reduction in the fraction partitioned to the fruits (FTF) for temperatures above $T_{FTF} = 15$ °C.
The light use efficiency (LUE) is expressed in terms of $\mathrm{LUE}_{\max,i}$, the maximum LUE of genotype $i$; $[\mathrm{CO}_2]_j$, which defines the $m$ levels of CO2 concentration, constant across the growing season; a constant $c$; and a genotype-specific exponential reduction in LUE for temperatures below 25 °C, with $T_{LUE}$ a temperature level chosen to make the reduction more pronounced for temperatures below 20 °C.
The leaf area index (LAI) for genotype $i$ in environment $j$ on day $t$ is the product of leaf area per shoot and shoot density. It is expressed in terms of $a$ and $B_i$, the constant intercept and the genotype-specific slope in a regression of the leaf area per stem (m²) on the temperature sum (°C d); $T_{base}$, the base temperature; $t$, the $t$-th day of the growing season ($t = t_0$ is the day of first flowering); and $S_d$, the stem density.
The photosynthetically active radiation (PAR) incident on the crop on day $t$ in environment $j$, $I_{j,t}$, is expressed in terms of $rad_{j,t}$, the global radiation on day $t$ in environment $j$; FPAR, the fraction of PAR in global radiation; and $Tr_j$, the greenhouse transmissivity in environment $j$. More details about this model can be found in Rodrigues [8] and Rodrigues et al. [7]. The model in Eq. (1) was made specific for sweet pepper. The seven physiological parameters in model (1) are drawn from the multivariate normal distribution defined by Eqs. (5) and (6). The remaining constants in the model, for the particular case of sweet pepper, are defined in Table 1. As in Rodrigues et al. [7], 12 environments were defined from two levels of radiation based on available historical data for Spain (SP) and The Netherlands (NL), and three levels of daily average temperature ($l = 3$ levels: 15, 20 and 25 °C). A time integration step of one day was considered for each of the 12 environments.
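How the quantities defined above could fit together is illustrated by the following minimal Python sketch. The functional forms are assumptions made for illustration only: a product form for incident PAR, Beer-Lambert light interception 1 − exp(−K × LAI), and a temperature sum accumulated as (days since t0) × (T − Tbase). LUE is passed in as a precomputed value rather than modelled, `ftf_base` is a hypothetical baseline fruit fraction, and all function and parameter names are hypothetical. The authoritative equations are those in Rodrigues [8] and Rodrigues et al. [7].

```python
import math


def incident_par(global_radiation, f_par, transmissivity):
    """PAR reaching the crop on one day (assumed product form)."""
    return global_radiation * f_par * transmissivity


def leaf_area_index(days_since_t0, temperature, t_base, a, b_i, stem_density):
    """LAI = leaf area per stem x stem density.

    Leaf area per stem is taken as linear in the temperature sum, which is
    assumed to accumulate at (temperature - t_base) degree-days per day.
    """
    temperature_sum = days_since_t0 * max(temperature - t_base, 0.0)
    return (a + b_i * temperature_sum) * stem_density


def fruit_fraction(ftf_base, w_i, temperature, t_ftf=15.0):
    """Fraction partitioned to fruits, reduced linearly above t_ftf."""
    return ftf_base * (1.0 - w_i * max(temperature - t_ftf, 0.0))


def season_yield(radiation_by_day, temperature, *, lue, fdmc_i, ftf_base, w_i,
                 a, b_i, t_base, stem_density, k, f_par, transmissivity):
    """Accumulate fruit dry matter with a one-day step, then convert to
    fresh yield by dividing by the fruit dry matter content (assumed form)."""
    fruit_dry_matter = 0.0
    for days_since_t0, radiation in enumerate(radiation_by_day):
        lai = leaf_area_index(days_since_t0, temperature, t_base, a, b_i, stem_density)
        # Beer-Lambert interception: share of incident PAR absorbed by the canopy.
        intercepted = incident_par(radiation, f_par, transmissivity) * (1.0 - math.exp(-k * lai))
        fruit_dry_matter += lue * intercepted * fruit_fraction(ftf_base, w_i, temperature)
    return fruit_dry_matter / fdmc_i
```

Calling `season_yield` once per genotype and environment, with a list of daily global radiation values for that environment, would fill one cell of a two-way table; the numerical constants for sweet pepper would come from Table 1 and are deliberately not hard-coded here.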
Each draw from the multivariate normal distribution defined by Eqs. (5) and (6), together with the information in Table 1, defines one genotype when introduced in model (1). Running the model 200 times for each of the 12 environments results in a two-way table of sweet pepper yield with 200 rows (genotypes) and 12 columns (environments).
Following the procedure in Rodrigues [8] and Rodrigues et al. [3], the genetic map (Fig. 1) was simulated based on the chromosome lengths and the number of markers per chromosome in Barchi et al. [1]. A number of QTLs were assigned to the seven physiological parameters, as shown in Fig. 1. The simulation was conducted using the function sim.map in the R package qtl [2], which allowed us to control the level of heritability in the environments.
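The text does not spell out how a target heritability is imposed. One common approach, assumed here purely for illustration and not confirmed as the authors' procedure, is to add independent Gaussian noise to the simulated genotypic values in each environment, with the error variance chosen from the standard relation h² = σ²_G / (σ²_G + σ²_E), that is, σ²_E = σ²_G (1 − h²) / h². The function names below are hypothetical.

```python
import math
import random
import statistics


def error_sd_for_heritability(genotypic_values, h2):
    """Error standard deviation that yields heritability h2 in one environment.

    Uses sigma2_E = sigma2_G * (1 - h2) / h2, with sigma2_G estimated as the
    population variance of the genotypic values in that environment.
    """
    if not 0.0 < h2 <= 1.0:
        raise ValueError("h2 must lie in (0, 1]")
    sigma2_g = statistics.pvariance(genotypic_values)
    return math.sqrt(sigma2_g * (1.0 - h2) / h2)


def add_environmental_noise(genotypic_values, h2, rng):
    """Return phenotypes = genotypic value + Gaussian error for one environment."""
    sd = error_sd_for_heritability(genotypic_values, h2)
    return [g + rng.gauss(0.0, sd) for g in genotypic_values]


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed so the sketch is reproducible
    column = [rng.uniform(5.0, 15.0) for _ in range(200)]  # placeholder values
    for h2 in (0.3, 0.5, 0.8):
        noisy = add_environmental_noise(column, h2, rng)
        print(h2, round(statistics.pvariance(noisy), 2))
```

Applying such a step column by column keeps heritability at the chosen level within each environment, since the genotypic variance generally differs between environments.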

Ethics Statement
This work involves neither human subjects nor animal experiments.

Declaration of Competing Interest
The author declares that they have no known competing financial interests or personal relationships which have or could be perceived to have influenced the work reported in this article.