Long-term and large-scale Quercus petraea population survey conducted in provenance tests installed in France

Key message: Provenance tests are invaluable resources in forest genetics and ecology. They were originally established for seed sourcing research, but they are now also used for monitoring and predicting population responses to environmental changes. They have also raised considerable interest for conservation purposes. We provide here a data resource for a multisite, large-scale, long-term provenance test on sessile oak (Quercus petraea (Matt.) Liebl.) set up in the late 1980s in France, supplemented with a few selected Q. robur provenances. The experimental layout comprises a range-wide collection of 124 provenances (109 Q. petraea and 15 Q. robur) planted at four experimental sites covering 90 ha in total. The dataset includes individual tree assessments of traits of functional, ecological and economic importance. Dataset access is at https://doi.org/10.15454/838U9L, and associated metadata are available at https://metadata-afs.nancy.inra.fr/geonetwork/srv/fre/catalog.search#/metadata/ede45af7-22bb-432b-8c30-af4c5248ff3e.


Background
In the early 1980s, the Institut National de la Recherche Agronomique (INRA) and the Office National des Forêts (ONF) joined forces to perform range-wide investigations of the genetic variation of sessile oak, Quercus petraea (Matt.) Liebl., a broad-leaved tree highly valued in European forestry. Studies had already been performed in other countries but were mostly limited to regional issues (see Kleinschmit 1993 for a review). Provenance research is subject to severe biological constraints in oaks, such as heterogeneous fruiting across the distribution range of the species and the poor keeping qualities of acorns, which cannot be stored for more than 1 year. These constraints explain why previous studies were performed at the regional scale and why research organizations were initially reluctant to support research into oak genetics. The establishment of this large-scale genetic survey of Q. petraea in the early 1980s was motivated by the decline in oak species observed at the time, following the very severe summer droughts of 1975 and 1976 (Delatour 1983; Becker 1984). Furthermore, sessile oak was increasingly being used in plantations, and recommendations concerning the choice of seed sources were urgently needed for operational forestry (Fernandez 1990). These two issues formed the rationale for initiating a large-scale research project on oak genetic variation. Q. petraea was the species of choice for this research due to the considerable interest in planting this species rather than Q. robur. The 1980s was also the period in which the first Framework research programmes were launched by the European Union, providing opportunities to extend and support research into this species at a range-wide scale. The provenance research experiment set up jointly by INRA and the ONF had two basic objectives:
(1) To identify the best seed sources for operational plantation. This was the applied focus, as in previous provenance projects for other tree species.
(2) To provide estimates of the level and distribution of genetic variation across the distribution range of the species. This issue is more fundamental and was important to meet the need for genetic indicators in conservation and management strategies.
These two objectives drove the provenance sampling and the establishment of the experimental plantations described in this data paper. The whole project started in the summer of 1986, when an exceptional seed harvest was predicted for the autumn. INRA and ONF launched a joint nationwide collection operation that was repeated in the following years in France and other European countries (with EU support). Finally, these operations were complemented by the Søren Madsen initiative to install a multisite provenance network in Europe in the autumn of 1989 (Madsen 1990).

Methods
2.1 Provenance sampling

2.1.1 Collection units and delineation of provenances
INRA and the ONF developed a collection protocol that was followed in each year of sampling. Right from the start, this protocol allowed for collections over successive years, to cope with the problem of heterogeneous fruiting of oak trees over the species distribution range.
In pure stands of Quercus petraea, a collection unit was defined as a forest compartment of 15 to 20 ha. Contiguous compartments could also be selected as collection units if a single compartment was too small. In mixed stands in which the associated species was not another white oak, such as mixed Q. petraea-Fagus sylvatica stands, a larger collection area of 30 to 40 ha, encompassing one or more contiguous compartments, was recommended. However, in such cases, Q. petraea had to account for at least 50% of the trees in the stand concerned. Stand density could range from 70 to 400 Q. petraea stems/ha, indicating that the stand was more than 80 years old.
The additional selection criteria for the collection were as follows:
• No other white oaks present in either the collection unit or the compartments immediately adjacent to the collection unit
• Acorn harvest only if at least half the Q. petraea trees in the stand were fruiting
• Autochthonous origin of the oak trees in the collection unit, based on management documents
The collection protocol involved the bulk harvesting of acorns from the ground at 50 collection spots separated by at least 50 m. In most cases, a 50-m grid system was used. Each collection spot was an approximate circle of 10 to 15 m in diameter. It was not considered necessary to match a given collection spot with the canopy ground projection of a single tree, and collection spots could, thus, overlap with more than one tree. About 2 kg of seed was collected at each collection spot (about 100 kg/provenance in total). All the seeds collected were bulked together into a single seed lot per collection unit.
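The quantities in this protocol can be checked with a short sketch. The Python snippet below lays out 50 collection spots on a 50-m square grid and derives the total seed mass and the area spanned by the grid cells. The row-by-row layout and the function name `grid_spots` are illustrative assumptions, not part of the original protocol; real collection spots followed the shape of each compartment.

```python
import math

SPOTS_PER_UNIT = 50       # collection spots per collection unit (from the protocol)
GRID_SPACING_M = 50.0     # grid spacing between spots, in metres (from the protocol)
SEED_PER_SPOT_KG = 2.0    # approximate seed mass collected per spot (from the protocol)


def grid_spots(n_spots, spacing_m):
    """Return (x, y) coordinates of n_spots filled row by row on a square grid.

    Hypothetical helper: the near-square layout is an assumption made
    only for illustration.
    """
    n_cols = math.ceil(math.sqrt(n_spots))
    return [((i % n_cols) * spacing_m, (i // n_cols) * spacing_m)
            for i in range(n_spots)]


spots = grid_spots(SPOTS_PER_UNIT, GRID_SPACING_M)

# Total seed mass bulked into one seed lot per collection unit.
total_seed_kg = SPOTS_PER_UNIT * SEED_PER_SPOT_KG

# Area covered if each spot stands for one spacing x spacing grid cell.
area_ha = SPOTS_PER_UNIT * GRID_SPACING_M ** 2 / 10_000

# Smallest pairwise distance between spots, to check the 50 m separation rule.
min_dist_m = min(math.dist(a, b)
                 for i, a in enumerate(spots)
                 for b in spots[i + 1:])

print(total_seed_kg, area_ha, min_dist_m)
```

Under these assumptions, 50 grid cells of 50 m × 50 m span 12.5 ha, which is of the same order as the 15 to 20 ha compartments used as collection units.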

Sampling and collection of provenances in France
Collection was performed by the personnel of ONF under the supervision of the Technical Management of the ONF. A protocol was distributed to the field stations of the ONF, with a proposed list of stands for harvesting provided that seed yields were sufficient. The proposed list comprised 55 stands classified as seed stands at the time and distributed across the different provenance regions and 33 stands selected on the basis of particular ecological features or their geographic location. These stands were located either at the margins of the distribution or on unusual soils. The list of stands was intended to encompass as much of the genetic variation of the species as possible. It was generated following discussions between INRA and ONF staff. All the stands were located in publicly owned and managed forests. The final outcome after the completion of collection in 1992 was 70 French populations of sessile oak. Overall, 49 of these 70 populations originated from stands registered at that time as seed stands located in 41 different forests, and 21 originated from non-registered stands located in 20 other forests. Four French populations of pedunculate oak (Quercus robur L.) were also collected, all from registered seed stands, located in four different forests.
Because oak masting varies geographically, several masting years were needed to collect sufficient seed from all the sampled stands across the natural range of the species. Seed harvests therefore took place in masting years between 1986 and 1992, and each annual seed collection was later treated as a "yearly set" within the established provenance experiments. Collection began in the autumn of 1986, when the acorn harvest was very good, and was repeated in 1987, 1989 and 1992. The seed harvest was poor in the intervening years. A collection was performed in French stands in the autumn of 1988, but the seed harvest was very poor (only 15 populations collected, many of which overlapped with the 1987 collection). The same list of suggested stands was proposed every year, but additional outlier stands were added over the years. Repeated collections were deliberately organised in the same compartments, to ensure that a subsample of populations was included in at least two yearly sets. Populations for which repeated collections were made were treated as "crossover populations" in the plantations established each year (see Section 2.2), thus ensuring provenance connectivity in the plantation network. Repeated collections in the same stands (crossover populations) were organised only in French oak stands.

Sampling and collection of provenances in other European countries
The seed collection procedures and protocol used elsewhere in Europe were the same as those used in France, except that much smaller quantities of seed were harvested. It was recommended to collect acorns from 50 collection spots separated by at least 50 m. Acorns were collected from the ground at collection spots distributed over an area of 5 to 30 ha. Seed lots consisted of 10 to 30 kg of seed. Fruiting levels were also more variable. Seeds were collected in the autumns of 1987, 1989 and 1992. Repeated collections at the same location were not performed. The collections were made by staff from research organisations or national forestry services in the different countries, according to a protocol distributed via a circular.

The Søren Madsen collection
In the autumn of 1989, Dr Søren Madsen, a scientist at the Danish Forest and Landscape Research Institute, initiated a collection of provenances across the distribution range of Quercus petraea, with the aim of establishing a network of multiple provenance tests. The original plan was to collect 19 populations selected from indigenous stands with good growth in the following countries: Belgium, Denmark, France, Germany, Hungary, Norway, Poland, Turkey and the UK. These populations were then planted at 27 test sites (including the four French sites, see Section 2.2). Collections were performed by local forest research institutes in 19 European sessile oak stands covering large parts of the natural distribution area of the species. The institutes taking part in seed collection and/or the establishment of field provenance experiments were (…). Seeds were collected from mature stands (more than 80 years old), and collection areas ranged from 3 to 40 ha (only four collection units covered areas of less than 10 ha). Seeds were generally collected from at least 100 mother trees, in accordance with commercial rules. Most of the stands were considered to be autochthonous or of natural origin. Only the three British stands were of unknown origin, and no information was provided about the Polish provenance. The stands varied in species composition, ranging from pure sessile oak stands, possibly with some beech or hornbeam undergrowth, to more or less mixed stands with an upper layer of oak, beech, spruce, larch and pine. Later on, visual observation of leaf morphology in the provenance tests revealed that some provenances (Mölln and Blakeney, for example) included a few Q. robur trees. The seed harvest was generally considered good or above average. In four cases, it was fair, below average or poor.
The acorns harvested for a given provenance were thoroughly mixed and split into batches of 12 kg for shipping to each of the participating institutes.
Finally, a few Quercus robur populations were also harvested according to the same procedures and protocols. The whole collection consisted of 109 Quercus petraea provenances and 15 Q. robur provenances (Tables 1 and 2, Fig. 1).

Test sites
After harvest, the seeds were transferred to the ONF tree seed centre (Sècherie de la Joux, 39300 Supt, France), in the Jura, in eastern France. Seeds were transported by car in France (by ONF personnel) or via surface transport.

Fig. 1 Origin of the oak provenances and location of the test sites. Blue dots correspond to Q. petraea provenances, red dots to Q. robur provenances and yellow dots to the test sites (from west to east: La Petite Charnie, Vierzon, Vincence and Sillegny). The green area corresponds to the distribution of Quercus petraea (Caudullo, Welk et al. 2017)

After 3 years in the nursery, the seedlings of each yearly set were planted in a common garden experiment replicated in four national forests located along a geographic gradient running from the west to the east of France (Table 3, Fig. 1). Site conditions in three forests (Petite Charnie, Vincence, Sillegny) were favourable for Q. petraea, whereas the conditions in the fourth forest (Vierzon) were much harsher, due to its podzolic-like soil.

Fig. 2 Overall description of the experimental design. Each yearly set planted in a given site corresponds to a given test. Thus, the site × yearly set design is a factorial design comprising 16 tests. Next, each test is a nested design comprising four hierarchical levels (macroblock, microblock, plot, tree)

Almost all provenances were installed at all four sites (forests), in four yearly plantation tests corresponding to the four yearly collection sets (Fig. 2). The four yearly plantation tests were contiguous within a plantation site. The whole experiment therefore comprised 16 tests (four yearly sets replicated at four sites). The four yearly collection sets for 1986, 1987, 1989 and 1992 were subsequently planted in the early months of 1990, 1991, 1993 and 1995 and labelled sets 1, 2, 4 and 5. The age of seedlings at planting was 3 years except for the fourth collection, where the seedlings were 2 years old.
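As an illustration, the site by yearly set factorial structure and the seedling ages can be enumerated in a few lines of Python. The site names, set labels and years are taken from the text. The function `nursery_years` is a hypothetical helper and rests on an assumption not stated in the text: that acorns harvested in autumn were sown in the following spring.

```python
from itertools import product

SITES = ["La Petite Charnie", "Vierzon", "Vincence", "Sillegny"]

# Set label -> (collection year, planting year), as reported in the text.
YEARLY_SETS = {
    1: (1986, 1990),
    2: (1987, 1991),
    4: (1989, 1993),
    5: (1992, 1995),
}

# One test per (site, yearly set) combination.
tests = list(product(SITES, YEARLY_SETS))


def nursery_years(collection_year, planting_year):
    """Seedling age at planting.

    Assumption: acorns collected in autumn are sown in the spring of the
    following year and planted out in the early months of the planting year.
    """
    return planting_year - (collection_year + 1)


ages = {label: nursery_years(*years) for label, years in YEARLY_SETS.items()}

print(len(tests))  # number of tests in the factorial design
print(ages)        # seedling age at planting for each set label
```

Under this assumption, the computed ages match the text: 3 years for sets 1, 2 and 4, and 2 years for set 5 (the fourth collection).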
The same experimental design was used for each plantation site. Within each site, the area corresponding to a yearly set plantation (test on Fig. 2) was subdivided into 5 or 10 ecological zones, of approximately equal size, based on an ecological and soil survey. The ecological survey was conducted by recording soil and vegetative descriptors on sampled plots installed on a grid system every 20 m.
The ecological zones were called macroblocks. Microblocks with a random composition of eight population plots were nested within macroblocks (Fig. 2). Each population plot was composed of 24 seedlings of the same provenance (Table 4). The 24 trees of a population plot were planted in four rows (3 m × 4 = 12 m), with six trees per row (1.75 m × 6 = 10.5 m), giving approximately square plots. Within a macroblock, each population was replicated in two different microblocks (in three microblocks for crossover populations). Thus, any given population was represented by 240 trees in a given set (2 plots in 5 macroblocks, i.e. 2 × 5 × 24 = 240 trees; 360 trees for crossover populations). In total, over the four sites and over the four yearly sets, 159,100 trees were planted (Table 4). Over and above statistical accuracy, the rationale underlying the experimental layout was to ensure the durability of the experimental plantation:
(1) Initial plot size (n, Table 4) was set to 24 trees so that at the end of the rotation (between 120 and 180 years), there would be at least one tree per plot, after all the usual systematic thinnings had been performed. The final tree density of the stand would then be 80 stems/ha. At that stage, the population sample size over the whole set would be 10 trees (15 trees for a crossover population). These figures suggest that the plantation should give reliable results for at least 60-80 years, when there will still be three trees per plot (30 trees per population, 45 for a crossover population).
(2) The subdivision of each set into macroblocks was performed not only to account for soil differences, but also as security against unexpected events (e.g. storms, fire) that could cause partial destruction over the lifetime of the plantation. The macroblocks secure the durability of the design even if an extreme event destroys part of the plantation. Similarly, nesting microblocks within macroblocks spreads each population more evenly in space, which limits the loss for any single population should an extreme event occur.
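The plot and sample-size figures given above can be re-derived with a short calculation. The Python sketch below uses only the spacings, plot size and target density quoted in the text. It assumes five macroblocks per set (the text mentions 5 or 10) and ignores any unplanted strips between plots, so the densities are indicative only.

```python
TREES_PER_PLOT = 24
ROWS, ROW_SPACING_M = 4, 3.0             # four rows, 3 m apart
TREES_PER_ROW, TREE_SPACING_M = 6, 1.75  # six trees per row, 1.75 m apart
MACROBLOCKS = 5                          # assumption: the 5-macroblock case
FINAL_DENSITY_STEMS_HA = 80              # target density at the end of the rotation

# Plot footprint: 12 m x 10.5 m.
plot_area_m2 = (ROWS * ROW_SPACING_M) * (TREES_PER_ROW * TREE_SPACING_M)

# Planting density implied by the spacings (stems per hectare).
initial_density_stems_ha = TREES_PER_PLOT / plot_area_m2 * 10_000


def trees_per_population(plots_per_macroblock, trees_per_plot=TREES_PER_PLOT):
    """Trees representing one population in one yearly set at one site."""
    return plots_per_macroblock * MACROBLOCKS * trees_per_plot


# Expected number of trees left in one plot at the final density.
final_trees_per_plot = FINAL_DENSITY_STEMS_HA * plot_area_m2 / 10_000

print(plot_area_m2)                # 126.0 m2
print(trees_per_population(2))     # 240 trees (ordinary population)
print(trees_per_population(3))     # 360 trees (crossover population)
print(trees_per_population(2, 1))  # 10 trees at one tree per plot
print(trees_per_population(3, 3))  # 45 trees at three trees per plot (crossover)
```

With a plot footprint of 126 m², a final density of 80 stems/ha corresponds to almost exactly one tree per plot, which is consistent with the stated rationale for the initial plot size of 24 trees.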
As provenances were collected and installed over different years in different sets, statistical comparisons between provenances are possible only if the connectivity between sets is maintained, through a subset of provenances repeatedly collected over the years ("crossover" provenances, Fig. 3). Details of the experimental layout at each test site are provided in Table 4. A systematic thinning of 1 out of 2 trees was implemented 21-22 years after plantation, leaving half of the trees on the sites.
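The connectivity requirement can be expressed as a graph problem: yearly sets are nodes, and two sets are linked when they share at least one crossover provenance. The Python sketch below checks whether all sets are linked, directly or through intermediate sets. The function name `sets_connected` and the provenance codes are hypothetical and serve only as an illustration; they are not taken from the dataset.

```python
from collections import deque


def sets_connected(provenances_by_set):
    """Return True if all yearly sets are linked through shared provenances.

    provenances_by_set maps a set label to the set of provenance codes
    planted in that yearly set. Two yearly sets are directly linked when
    they have at least one provenance in common.
    """
    labels = list(provenances_by_set)
    if not labels:
        return True
    seen = {labels[0]}
    queue = deque([labels[0]])
    while queue:  # breadth-first search over yearly sets
        current = queue.popleft()
        for other in labels:
            shared = provenances_by_set[current] & provenances_by_set[other]
            if other not in seen and shared:
                seen.add(other)
                queue.append(other)
    return len(seen) == len(labels)


# Hypothetical provenance codes, for illustration only.
example = {
    1: {"P01", "P02", "P03"},
    2: {"P03", "P04"},
    4: {"P04", "P05"},
    5: {"P05", "P06"},
}
connected = sets_connected(example)

# Removing the only provenance shared with set 5 breaks the chain.
broken = dict(example)
broken[5] = {"P06", "P07"}
disconnected = sets_connected(broken)

print(connected, disconnected)
```

In this toy example, sets 1 and 5 share no provenance but can still be compared through the chain of sets 2 and 4. Once set 5 loses its only shared provenance, it can no longer be compared with the other sets.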

Dataset content and access to data
The whole data set comprises passport data regarding provenances and provenance tests and assessments of phenotypic traits made on individual trees. The data are stored within the "Oak provenance" database (https://oakprovenances.pierroton.inrae.fr/) and are available via an open repository (Ducousso, Ehrenmann et al. 2022).

Fig. 3 Number of sessile oak populations per yearly set (diagonal) and crossover populations between yearly sets (off diagonal). Crossover populations are provenances that were repeatedly collected over at least 2 years, and subsequently planted, to ensure genetic connectivity between the different yearly sets
Passport data of provenances and provenance tests correspond to their geographic coordinates and to the ecological data of the provenance origins and of the sites where the provenance tests were established. Ecological data refer to soil (when available) and climatic data.
Phenotypic assessments were recorded on 21 traits of ecological and economic relevance, corresponding to 7 trait classes (Table 5, Appendix Table 6 for more details). Growth-related traits (girth at breast height, total height) and stem shape were measured repeatedly at different ages. Details and protocols regarding the procedures used for the assessments are provided in the database. Traits were not systematically recorded in all tests and on all trees, due to limited resources and manpower; however, growth and quality-related traits received more attention than other traits (Appendix Table 6).
Information and data described in this data paper are accessible at the open repository (https://doi.org/10.15454/838U9L) and searchable at the oak provenances database at https://oakprovenances.pierroton.inrae.fr/. The database is searchable online with no identification required, using different menus. Access to files containing all descriptive data (sites, provenance tests, trees, phenotypic assessments and trait measurements), interventions and measurements carried out on each site, as well as climatic data, requires registration and authentication (https://oakprovenances.pierroton.inrae.fr/data).

Reuse potential and limits
Data reuse is an essential component of open science, and a special effort was made to make the data as "FAIR" as possible, especially the (R)eusable part, for the trait classes and traits studied on all sites, based on the …

• Number of pruning cuts needed for shape pruning 7
• Stem shape 10, 20

Branchiness
• Diameter of the thickest branch 7
• Diameter of the branch located below the thickest branch 7
• Height of insertion of the lowest living branch 10
• Height of insertion of the thickest living branch 7