Phenotypic and genotypic data of a European beech (Fagus sylvatica L.) progeny trial issued from three plots along an elevation gradient in Mont Ventoux, South-Eastern France

We provide phenotypic and genotypic data for a progeny trial of 5813 European beech seedlings, originating from 60 open-pollinated families collected at three elevations (1020 m, 1140 m, and 1340 m) on Mont Ventoux (44° 11′ N; 5° 17′ E).


Background
Considering the patterns of genetic divergence at adaptive traits and of local adaptation displayed by many tree species at large spatial scales, forest tree populations are usually assumed to have a high evolutionary potential (Alberto et al. 2013). However, there is still limited evidence of the level of genetic variation available within populations for key functional traits involved in the response to climate. Moreover, we also need to investigate the ability of tree populations to adapt to local variation in their environment (i.e., microgeographic adaptation, Richardson et al. 2014).
This data paper describes in detail a quantitative genetic experiment designed to address these issues in the European beech (Fagus sylvatica L.), a major tree species in Europe. Sixty beech maternal progenies were collected in three plots along an elevation gradient and grown in a common garden under two contrasting experimental conditions (water stress/no water stress), to assess how the variation at twelve adaptive traits is partitioned within and among families, plots, and experimental conditions. Moreover, we genotyped a subset of offspring and all the potentially reproductive adults in the three plots at 13 microsatellite markers to infer paternal relationships and to estimate average relatedness within and between maternal families and genetic divergence among plots.

Species and field sampling
The European beech is a shade-tolerant species requiring well-drained, moderately deep soils, and relatively high humidity. Its distribution ranges from the northern Mediterranean regions to the south of Scandinavia. On Mont Ventoux, Southeast of France (44° 11′ N; 5° 17′ E), beech forests are at the climatic limit of their ecological range (Fig. 1). On the North slope of Mont Ventoux, beech forest ranges almost continuously from 750 to 1700 m in elevation. This steep elevation gradient provides almost linear variation in mean temperature and humidity with elevation (Fig. 1). Along this elevation gradient, we selected three plots at elevations N1: 1020 m (dimension: 1.30 ha), N2: 1140 m (2.20 ha), and N4: 1340 m (0.80 ha). These plots extend over 1.5 km (Fig. 2).
Within each plot, we achieved the exhaustive mapping and genotyping of the reproductive beeches (Fig. 2; Bontemps et al. 2013; Gauzere et al. 2013a; Oddou-Muratorio et al. 2018). This sampling design is relevant to perform paternity analysis, as we know that most mating events occur within the maternal neighborhood (average pollen dispersal distance d = [35; 63] m; Gauzere et al. 2013a).
The progeny trial described in this study includes 60 open-pollinated families (numbered with a Family_ID ranging from 1 to 60). In August 2009, 20 highly fertile and randomly distributed trees were chosen as mother-trees in each plot (Oddou-Muratorio and Gauzere 2021). We collected ~1164 seeds per mother-tree (min = 725, max = 1655) directly from the canopy (either by climbing or using scissors mounted on a rod). A sample of 344 seeds per mother-tree on average (min = 202, max = 733) was randomly chosen to measure the average seed weight (g), and the proportions of empty, infested, and healthy seeds based on Faxitron numerical X-ray radiography (Faxitron Bioptics, Tucson, AZ; 15-20 kV, 0.3-3 mA). All the seeds were dried to a humidity rate of 8%.

Half-sib trial establishment
In October 2009, seeds were rehydrated and stored at +4 °C for 10 weeks to break dormancy and initiate germination. In April 2010, 8976 seedlings had successfully germinated, among which 5854 were kept as "focal" seedlings (91 seedlings per family on average). All the 8976 seedlings were transferred to a common garden at the nursery of Aix-Les-Milles (43° 30′ N; 5° 24′ E) where they were planted in individual pots of 1.2 l with sand substrate and fertilizer. Seedlings were grouped in boxes of 17 seedlings and stored over 6 shelves (Fig. 3). The 5854 focal seedlings were arranged in 50 complete blocks, each block including ~2 seedlings per family (Fig. 3). To ensure that the 5854 focal seedlings had neighboring seedlings (i.e., to avoid border effects), we used 3122 additional seedlings as "border" (i.e., on external sides of boxes) or "neutral" (i.e., within boxes) seedlings (Fig. 3; see also the "Trial Design" spreadsheet in the database). Among the 5854 focal seedlings, 41 failed to survive to plantation, leading to 5813 effectively surveyed seedlings.
The seedlings grew for 3 years (from April 2010 to September 2013) in the common garden. From the 7th of August 2010, we divided the trial into two contrasting experimental conditions: "watered" (from block 1 to 25, 2953 seedlings) versus "water-stressed" (from block 26 to 50, 2860 seedlings). In the "watered" condition, seedling boxes were watered daily until saturation. The weight of three control boxes was measured when water percolation stopped and was used to define the average box weight at field capacity (W_PF1). The properties of the sandy substrate, measured in the laboratory, made it possible to define the expected box weight at the wilting point (W_PF2.5) corresponding to 30% of the permanent wilting point. In the "water-stressed" condition, from August to October each year, 3 control boxes were weighed every day (W_control), and seedlings were not watered as long as W_control was above W_PF2.5. As soon as W_control < W_PF2.5, seedlings were watered again.

Measurement of phenotypic traits on seedlings in the trial
Fifteen "raw" traits, related to growth, leaf phenology, leaf morphology, leaf physiology, and competition, were directly measured on all or part of the seedlings (Table 1; "2_Phenotype_Table" spreadsheet in the seedlings database). Other "estimated" traits were computed from these raw traits and are available in the "2_Phenotype_Table" spreadsheet in the seedlings database. This data paper gathers the details of the protocols and provides the raw data. It also briefly describes the measured and estimated traits (more details can be found in Gauzere et al. 2016a, b, 2020).

Phenological traits
The timing of bud burst was monitored weekly from April to May in 2011 and 2012. Five stages were used to follow the bud burst dynamics. Moreover, we attempted to bridge the gap between these five stages and the most widespread methodology used to describe phenological stages in plants (i.e., BBCH, Meier et al. 2009), to allow the possible posterior analysis and comparison of these data. To that aim, we matched each of our five stages to the closest BBCH reference stage, considering the BBCH scale adapted to trees and shrubs (https://tempo.pheno.fr/Presentation/Variables-mesurees). The five stages were: (A) buds are dormant (equivalent to the stage 00 in the BBCH scale) or swelling (equivalent to the stage 01 in the BBCH scale); (B) bud scales are broken (BBCH 07); (C) at least 15% of the leaves are emerging (BBCH 08); (D) at least 50% of the leaves are emerging (BBCH 09); and (E) 90% of leaves are spread out (BBCH 19). We also noted dormant buds as stage "0". The instructions for these measurements are available in the file "SOP_Phenology.pdf" at https://doi.org/10.15454/6HETQP/WF3JYB. Then, the sum of budburst scores (Sum_BSS) was computed as in Bontemps et al. (2017): after converting the letters A, B, C, D, and E into marks (from 1 to 5), we summed the marks over all of the dates for each individual.
The higher Sum_BSS, the earlier and quicker was leaf unfolding.
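As an illustration of the scoring rule described above, the following minimal sketch converts weekly stage letters into marks and sums them per seedling. The function name and the example records are hypothetical, and the handling of missing observations or of the additional stage "0" is not covered here.

```python
# Marks assigned to the five bud burst stages (A = 1 ... E = 5).
STAGE_MARKS = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}


def sum_budburst_scores(stages):
    """Sum the marks of the stages recorded for one seedling over all survey dates."""
    return sum(STAGE_MARKS[stage] for stage in stages)


# Hypothetical weekly records for two seedlings (illustrative values only).
early_seedling = ["A", "B", "D", "E", "E"]
late_seedling = ["A", "A", "B", "C", "D"]

print(sum_budburst_scores(early_seedling))  # 17
print(sum_budburst_scores(late_seedling))   # 11
```

In this toy example, the seedling that reaches the late stages sooner obtains the higher sum, which is the property used to interpret Sum_BSS.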
The timing of leaf senescence was monitored weekly from October to November in autumn 2011. Three stages were used to follow senescence dynamics (SOP file "Phenology"): (0) leaves have not fallen and are not colored; (1) at least 10% of the leaves are colored or have fallen; and (2) at least 50% of the leaves are colored or have fallen. The instructions for these measurements are available in the file "SOP_Phenology.pdf" at https://doi.org/10.15454/6HETQP/WF3JYB. We also computed the sum of senescence scores over all of the dates for each individual (Sum_SenescScores). For both budburst and senescence surveys, phenology was always monitored by the same two groups of observers.
These raw phenological data were transformed to estimate the dates of passage from stage A to B or from B to C for bud burst phenology and the date of passage from stage 0 to 1 or from stage 1 to 2 for leaf senescence (see details in Gauzere et al. 2016a, b).
In November 2011, some seedlings started flushing because of exceptionally warm early autumn temperatures. This unusual event did not lead to the development of fully functioning leaves, but it increased the sensitivity of buds to frost during the winter of 2011/2012. The damaged buds did not resume their development in spring 2012. In May 2012, after all individuals had finished their spring development, we visually quantified the percentage of damaged buds (A: no damage; B: less than 25%; C: between 25 and 50%; D: more than 50%).

Leaf morphological traits
Three leaves were collected on each seedling. We first measured the fresh leaf area (LA in cm²) with a planimeter. The leaves were then dried at 60 °C for about 3 days to record the leaf dry mass (LM in mg). The leaf mass per area was calculated as LMA = LM/LA (in mg/cm²). The average values of the three estimates of LA, LM, and LMA were kept in the database.
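The per-seedling averaging can be sketched as follows. This is a minimal illustration that assumes LMA is computed leaf by leaf and then averaged, as the wording "three estimates" suggests; the function name and the example values are hypothetical.

```python
from statistics import mean


def leaf_traits(leaves):
    """Average LA, LM and LMA over the leaves sampled on one seedling.

    leaves: list of (area_cm2, dry_mass_mg) pairs, one pair per leaf.
    Returns (mean LA in cm2, mean LM in mg, mean LMA in mg/cm2).
    """
    areas = [area for area, _ in leaves]
    masses = [mass for _, mass in leaves]
    # LMA is computed per leaf, then averaged (assumption, see lead-in).
    lma_values = [mass / area for area, mass in leaves]
    return mean(areas), mean(masses), mean(lma_values)


# Hypothetical measurements for the three leaves of one seedling.
la, lm, lma = leaf_traits([(20.0, 100.0), (25.0, 150.0), (10.0, 40.0)])
print(la, lm, lma)
```

Note that the mean of per-leaf ratios generally differs from the ratio of the mean mass to the mean area, so the choice matters when recomputing LMA from the averaged LA and LM.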
We also measured three other morphological qualitative traits based on scanned images of fresh leaves: the pilosity, presence of lace, and emboss.

Physiological traits
We measured leaf carbon content (in %), leaf nitrogen content (in %), and leaf carbon isotope composition (δ13C) as a surrogate of intrinsic water use efficiency (Farquhar and Richards 1984; see details in Gauzere et al. 2016a, b) for a subset of 1594 individuals, representative of plots, families, and blocks (1039 for the "watered" condition, 555 for the "water-stressed" condition). For each of them, we mixed three collected leaves and dried and ground them in a ball mill. A subsample of 1 ± 0.1 mg was weighed into tin capsules. Leaf nitrogen content was measured with a continuous flow elemental analyzer (Carlo Erba NA 1500; CE Instruments, Rodano, Italy) and the carbon isotope composition with a coupled isotope ratio mass spectrometer (Thermo-Finnigan; Delta S, Bremen, Germany). δ13C was calculated according to the international standard (Vienna Pee Dee Belemnite, VPDB) using the following equation:

$$ \delta^{13}\mathrm{C} = \frac{R_{sa} - R_{sd}}{R_{sd}} \times 1000 \qquad (1) $$

where R_sa and R_sd are the isotopic ratios 13C/12C of the sample and the standard, respectively. The precision of spectrometric analysis (standard deviation of δ13C) was assessed with internal laboratory reference material with a matrix close to the measured samples (oak leaves, n = 16, SD = 0.05 ‰) and precision among the different runs ranged from 0.08 to 0.13 ‰. These analyses were performed at SILVATECH facilities (SILVATECH, INRAE, 2018, Structural and Functional Analysis of Tree and Wood Facility https://doi.org/10.15454/1.5572400113627854E12).
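For readers who want to recompute δ13C from isotopic ratios, here is a minimal sketch assuming the usual delta notation, δ13C = (R_sa − R_sd)/R_sd × 1000, expressed in ‰. The numerical value used for the VPDB ratio is a commonly cited figure and is an assumption here, not a value taken from this data paper; the sample ratio is purely illustrative.

```python
# 13C/12C ratio of the VPDB standard (commonly cited value; assumption).
R_VPDB = 0.0112372


def delta13c(r_sample, r_standard=R_VPDB):
    """Return delta13C in per mil from the 13C/12C ratios of sample and standard."""
    return (r_sample - r_standard) / r_standard * 1000.0


# Hypothetical sample ratio: a sample depleted in 13C gives a negative value.
print(round(delta13c(0.0109), 2))
```

A sample with the same ratio as the standard yields exactly 0 ‰, which is a quick sanity check when adapting the function.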

Competition
To estimate the competition acting on each focal seedling, we computed the mean height (i.e., the mean of H3 values, see section "Growth traits" above) of the closest neighbors, i.e., of all the seedlings directly adjacent to a focal individual (8 neighbors in theory, but H3 was not available for border and neutral seedlings). This index was called Neighbor-Height in the database.

Microsatellite genotyping, paternity, and mating system analyses
[...] 2) against an internal size standard (ET400 DNA size markers). Automatic allele assignment was checked and revised manually twice to ensure consistency of genotyping. Among the 2088 seedlings, 2068 genotypes were successfully read (1382 seedlings from the "watered" condition, and 686 seedlings from the "water-stressed" condition). The genotypes of all adults were successfully read. The genotype dataset was used to search for the father of the genotyped seedlings among all the adult trees of the same plot (including the mother-tree in this self-compatible species). We used the likelihood-based software CERVUS version 3.0 (Marshall et al. 1998), with parameters described in detail in Gauzere et al. (2013a). The father was retrieved for 1000 among the 2068 genotyped seedlings. This pedigree information can be used to refine quantitative genetic analyses and, in particular, to account for departure from the half-sib assumption in progeny tests (Gauzere et al. 2016a, b, 2013b). Independently of paternity analyses, the genotype dataset was also used to estimate the probability for each seedling to originate from migrant pollen (LogLikMigGL), using the mixed effect mating model (MEMM) and its associated program (Gauzere et al. 2013a; Klein et al. 2011).
Part of the genotype dataset was also used to estimate mating system parameters at family level (Gauzere et al. 2013a): the effective number of pollen donors, the migration rate (estimated from MEMM), and the selfing rate (estimated from MEMM).

Measurement of phenotypic traits on adult trees in situ
All potentially reproductive beech trees were measured in situ. Although these measurements are not directly related to those made in the seedlings trial experiment described here, we still detailed the different variables measured on adult trees, in an attempt to gather together data as exhaustively as possible. First, the diameter at breast height (dbh) of each adult tree was measured. As beech sometimes produces stump shoots resulting in multiple clonal stems (coppice), we measured the number of stems (Nstem) per tree. If a tree displayed multiple stems, the diameter of all stems was measured, and the maximum, mean, and sum of diameter of the clonal copies were computed (respectively MaxDbh, MeanDbh and SumDbh). For single stem trees, MaxDbh = MeanDbh = SumDbh = Dbh. Additionally, we characterized the stature of each tree through a class variable with 3 levels (dominant, codominant, and suppressed). These levels aimed at accounting for differences in light accessibility among trees with their crown, respectively, above, within or below the surrounding canopy.
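The stem-level summaries can be derived from the list of stem diameters of a tree, as in the following minimal sketch. The function name is hypothetical; the returned keys follow the variable names given above.

```python
def stem_summary(stem_dbh):
    """Summarize the diameters (cm) of all stems of one tree.

    stem_dbh: non-empty list with one diameter per stem.
    """
    n_stems = len(stem_dbh)
    total = sum(stem_dbh)
    return {
        "Nstem": n_stems,
        "MaxDbh": max(stem_dbh),
        "MeanDbh": total / n_stems,
        "SumDbh": total,
    }


# Hypothetical coppice tree with three stems, and a single-stem tree.
print(stem_summary([30.0, 20.0, 10.0]))
print(stem_summary([25.0]))
```

For a single-stem tree the three diameter summaries coincide, consistent with MaxDbh = MeanDbh = SumDbh = Dbh.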
Using the spatial coordinates, the conspecific local density was estimated based on the number of reproductive beech neighbors found in disks with a radius of 5 and 20 m around each tree (ConDens5 and ConDens20, respectively). The total competitor density (TotDens5 and TotDens20) was estimated similarly considering trees of all species within these disks.
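A minimal sketch of these neighbor counts is given below. It assumes planar coordinates in meters and a species label per tree; the dictionary keys, the species label, and the example coordinates are hypothetical. It returns raw counts, and whether the database stores counts or counts per unit area should be checked in the legend spreadsheet.

```python
from math import hypot


def local_density(focal, trees, radius, species="Fagus sylvatica"):
    """Count neighbors of `focal` within `radius` meters.

    Returns (conspecific count, total count). The focal tree itself is
    skipped by identity, so it may be part of `trees`.
    """
    conspecific = total = 0
    for tree in trees:
        if tree is focal:
            continue
        distance = hypot(tree["x"] - focal["x"], tree["y"] - focal["y"])
        if distance <= radius:
            total += 1
            if tree["species"] == species:
                conspecific += 1
    return conspecific, total


# Hypothetical stand of four trees (coordinates in m).
stand = [
    {"x": 0.0, "y": 0.0, "species": "Fagus sylvatica"},
    {"x": 3.0, "y": 0.0, "species": "Fagus sylvatica"},
    {"x": 0.0, "y": 4.0, "species": "Abies alba"},
    {"x": 15.0, "y": 0.0, "species": "Fagus sylvatica"},
]
print(local_density(stand[0], stand, 5))   # (1, 2)
print(local_density(stand[0], stand, 20))  # (2, 3)
```

The two radii simply correspond to two calls with different `radius` values, mirroring the "5" and "20" suffixes of the variable names.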
We also used the Martin-Ek index (Martin & Ek 1984) to quantify the intensity of competition on a focal individual i. This index accounts simultaneously for the diameter and the distance of each competitor j to the competed individual i:

$$ \mathrm{Martin}_i = \frac{1}{dbh_i} \sum_{j=1}^{n_{dmax}} dbh_j \, \exp\left[\frac{-16\, d_{ij}}{dbh_i + dbh_j}\right] $$

where dbh_i and dbh_j are the diameter at breast height (in cm) of the competed individual i and of competitor j (any adult tree of any species with dbh_j > dbh_i), n_dmax the total number of competitors in a given radius d_max (in m) around each individual i, and d_ij the distance between individuals i and j. We computed this index within radii of 5 and 20 m (MartinTot5 and MartinTot20, respectively). We also used the Martin-Ek index to characterize the intensity of intraspecific competition, by considering only beech as competitor (MartinCon5 and MartinCon20, respectively).
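The following sketch shows one way to compute such an index. It assumes the commonly cited form of the Martin-Ek index, in which each competitor j contributes (dbh_j / dbh_i) × exp(−16 d_ij / (dbh_i + dbh_j)); this form, including the constant 16 and its implied units, is an assumption that should be checked against Martin & Ek (1984) before reuse. The function name, dictionary keys, and example values are hypothetical.

```python
from math import exp, hypot


def martin_ek(focal, trees, d_max, conspecific_only=False):
    """Distance- and size-weighted competition index for `focal`.

    Only trees within d_max meters and with a larger dbh than the focal
    tree count as competitors. With conspecific_only=True, competitors
    are restricted to trees of the same species as the focal tree.
    """
    index = 0.0
    for tree in trees:
        if tree is focal or tree["dbh"] <= focal["dbh"]:
            continue
        if conspecific_only and tree["species"] != focal["species"]:
            continue
        d_ij = hypot(tree["x"] - focal["x"], tree["y"] - focal["y"])
        if d_ij > d_max:
            continue
        # Assumed Martin-Ek form (see lead-in); dbh in cm, distance in m.
        index += (tree["dbh"] / focal["dbh"]) * exp(
            -16.0 * d_ij / (focal["dbh"] + tree["dbh"])
        )
    return index


# Hypothetical focal beech with one larger fir at 5 m and one smaller beech.
focal = {"x": 0.0, "y": 0.0, "dbh": 20.0, "species": "Fagus sylvatica"}
stand = [
    focal,
    {"x": 5.0, "y": 0.0, "dbh": 30.0, "species": "Abies alba"},
    {"x": 2.0, "y": 0.0, "dbh": 10.0, "species": "Fagus sylvatica"},
]
print(martin_ek(focal, stand, 20))
print(martin_ek(focal, stand, 20, conspecific_only=True))
```

In this example the smaller beech is ignored because only larger trees are treated as competitors, so the conspecific index is zero.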

Access to the data and metadata description
The seedlings and adult databases are available at Portail Data INRAE: https://doi.org/10.15454/6HETQP. Associated metadata access is at https://metadata-afs.nancy.inra.fr/geonetwork/srv/fre/catalog.search#/metadata/55aa3d92-3866-48e9-b000-f9cbf2d13d03. The seedlings database is made of three tables, available as three spreadsheets in one Excel file (entitled: Oddou-Muratorio_etal_dataBeechHSTrial.xlsx). The first table (1_Seeds_Table) contains information on the 60 mother-trees and on the quality of seed lots collected on them (the spreadsheet Seeds_Table_Legend contains variables' description). It includes 16 variables, in particular the number of healthy, empty, and infested seeds, and values of mating system parameters at mother-tree level (effective number of pollen donors, selfing rate, migration rate).
The second table (2_Phenotype_Table) contains the phenotypes of the 5813 focal seedlings, as well as general information on all the 8976 focal, border or neutral seedlings raised in the half-sib trial (the spreadsheet Phenotype_Table_Legend contains variables' description). It includes 77 variables in total, and the main raw phenotypic data are summarized in Table 1.
The third table (3_Genotype_Table) provides genotypes at 13 microsatellite markers of a subset of 2068 seedlings, their 60 mother-trees, and the 630 candidate fathers (the spreadsheet Genotype_Table_Legend contains variables' description).
Detailed information on the trial design is given in a last dedicated spreadsheet ("Trial Design"), with a schematic representation of the trial and seedling boxes.
One additional Excel file (entitled: Oddou-Muratorio_etal_dataAdult_inSitu.xlsx) contains the phenotypes of adult trees (mother-trees and candidate fathers) measured in situ on the three plots (the spreadsheet Legend contains variables' description). These data were originally published in Oddou-Muratorio et al. (2018).

Technical validation
Phenotypic measurements were first validated by careful cross-reading of the tabled values, complemented by numerical and graphical analyses. Genotypic data were validated independently by two different operators; moreover, the genotype of each seedling was validated by comparison with the genotype of its mother-tree. Every record was checked against the normal range of values for each variable. Related variables were compared and tested for inconsistencies through correlation or time series analyses, and corrected when necessary.
Laboratory equipment was regularly calibrated, and standards were used on each analysis.
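The comparison between seedling and mother-tree genotypes can be illustrated with a simple Mendelian compatibility check: at each locus, a seedling should share at least one allele with its mother. This is only a minimal sketch of that idea, not the validation procedure actually used; the locus names, allele sizes, and data layout are hypothetical, and genotyping errors or null alleles are not modeled.

```python
def mismatching_loci(seedling, mother):
    """Return the loci where the seedling shares no allele with its mother.

    Genotypes are dicts mapping a locus name to a pair of allele sizes,
    or to None when the locus could not be read (such loci are skipped).
    """
    mismatches = []
    for locus, seedling_alleles in seedling.items():
        mother_alleles = mother.get(locus)
        if seedling_alleles is None or mother_alleles is None:
            continue
        if not set(seedling_alleles) & set(mother_alleles):
            mismatches.append(locus)
    return mismatches


# Hypothetical genotypes at three loci (allele sizes in base pairs).
mother = {"locus1": (120, 124), "locus2": (201, 201), "locus3": (98, 102)}
seedling = {"locus1": (124, 130), "locus2": (199, 205), "locus3": None}
print(mismatching_loci(seedling, mother))  # ['locus2']
```

Loci flagged in this way would be candidates for re-reading rather than proof of an error, since a mismatch can also arise from a null allele or a mislabeled sample.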

Reuse potential and limits
Part of this database has already been used to estimate mating system parameters and the variation in adult fecundity (Gauzere et al. 2013a; Oddou-Muratorio et al. 2018), the heritability of adaptive traits (Gauzere et al. 2016a, b), and microgeographic adaptation (Gauzere et al. 2020). However, the different datasets corresponding to these articles are available on different data portals, with no link between them, and sometimes without the raw data. Hence, a main value of the database published here is to gather all of the raw or derived variables that were measured or computed, for all of the individuals. Moreover, some of these data have never been published so far (e.g., the 1_Seeds_Table data, or the leaf morphological variables). This database can be reused for other quantitative genetics studies, for instance, to test new methods of estimation of quantitative genetic parameters based on open-pollinated progeny design (Gauzere et al. 2016a, b, 2013b). It can also be used for ecophysiological studies, for instance, to investigate the relationships between the different studied ecophysiological traits (Bontemps et al. 2017), or to calibrate ecophysiological models accounting for intra-specific variability (Berzaghi et al. 2019; Oddou-Muratorio et al. 2020). Finally, it could provide new entries in existing ecophysiological databases (e.g., Kattge et al. 2011). There are classical limitations for assembling this dataset with other quantitative genetic datasets (lack of a common reference material), but some material (seed, DNA sample) has been secured and conserved at INRAE URFM, and can be made available for future research projects.