Grapevine yield gap: identification of environmental limitations by soil and climate zoning in the region of Languedoc-Roussillon (South of France)

Often unable to fulfill theoretical production potentials and to obtain the maximum yields set by wine quality labels, many vineyards and cellars need to solve the issue of so-called grapevine yield gaps in order to ensure their long-term viability. These yield gaps particularly occur in Mediterranean wine regions, where extreme events have intensified because of climate change. Yield gaps at the regional level have been widely studied in arable crops using big datasets, but much less so in perennial crops, such as grapevine. Understanding the environmental factors involved in yield gaps, such as those linked to climate and soil, is the first step in grapevine yield gap analysis. At a regional scale, there are numerous studies on ‘terroir’ linked to wine typicity and quality; however, none have classified spatial zones based on environmental factors identified as being involved in grapevine yield. In the present study, we aggregated into one big dataset information obtained from producers at the municipality level in the wine region Languedoc-Roussillon (South of France) between 2010 and 2018. We used a backward stepwise model selection process using linear mixed-effect models to discriminate and select the statistically significant indicators capable of estimating grapevine yield at the municipality level. We then determined spatial zones by using the selected indicators to create clusters of municipalities with similar soil and climate characteristics. Finally, we analysed the indicators of each zone related to the grapevine yield gap, as well as the variations among the grapevine varieties in the zones. Our selection process identified six factors that could explain annual grapevine yield (R² = 0.112) and average yield for the whole period (R² = 0.546): Soil Available Water Capacity (SAWC), soil pH, Huglin Index, the Climate Dryness Index, the number of Very Hot Days and Days of Frost. The clustering results show seven different zones with two marked yield gap levels, although all the zones had municipalities with no or high yield gaps. In each zone, grapevine yield was found to be driven by a combination of climate and soil factors, rather than by a single environmental factor. The white wine varieties showed larger yield gaps than the red and rosé wine varieties. Environmental factors at this scale largely explained yield variability across the municipalities, but they performed poorly in terms of annual yield prediction. Further research is required on the interactions between environmental variables and plant material and farming practices, as well as on vineyard strategies, which also play an important role in grapevine yield gaps at vineyard and regional scale.


INTRODUCTION

1. The importance of grapevine yield gaps
Compared with yield in other crops, grapevine yield has historically been overlooked, on the assumption of a trade-off between grape yield and wine quality (Bisson et al., 2002; Jackson and Lombard, 1993; Poni et al., 2018). Indeed, in viticulture, the issue of yield is treated in a particular way, since many production regions set a limit on grapevine yields within the framework of geographical indications (Stranieri and Tedeschi, 2019). However, in many vineyards, producers do not manage to reach the maximum grapevine yield authorised within the framework of the quality label. In France, Schauberger et al. (2018) showed that historical grapevine yields have stagnated since the 1980s. Some authors have even diagnosed a "vineyard decline" that could relate to environmental and management drivers (Riou et al., 2016). Furthermore, in winegrowing regions with typically warm temperatures and low rainfall, such as the Mediterranean, climate change phenomena (i.e., increasing temperatures, droughts and extreme events) are expected to have a negative impact on grape yield (Bernardo et al., 2018; Droulia and Charalampopoulos, 2021; Hannin et al., 2021; Ollat et al., 2017; Touzard et al., 2017). Therefore, it seems necessary to increase knowledge about the factors involved in grapevine yield and to explore potential adaptation measures to maintain or increase it. In the case of the Pays d'Oc Protected Geographical Indication (PGI), the label is particularly concerned about the long-term stability of its production levels, because (i) yield is a main driver of individual wine estate performance and long-term sustainability (Tintinger, 2020), and (ii) as a product, the amount of Pays d'Oc wine on the market should be constant to guarantee satisfactory prices. In light of this, the label supports the increasing effort to explore avenues for stabilising yields at satisfactory individual and collective levels.
Yield gaps can be defined as the differences between the targeted yield and the obtained yield, and they result from a combination of environmental, genetic and management factors (Cooper et al., 2021; Edreira et al., 2017; Van Ittersum et al., 2013), of which we focused on the environmental components in the present study. Environmental factors contribute to 'resource yield gaps' (e.g., the need for nutrients and water), but they also interact with management and genetic factors to create 'efficiency yield gaps' in the use of those resources. Edreira et al. (2018) created the concept of 'technology extrapolation domains' to study zones with similar environmental resources for crop production. Andrade et al. (2019) added socio-economic variables to this framework to study the application of agricultural technologies in different countries. At the regional scale, defining zones with similar environmental variables is a first step towards understanding the local factors involved in yield gaps in large datasets with spatial variability (Beza et al., 2017; Liu et al., 2017; Van Wart et al., 2013).
While numerous studies have been carried out on yield gaps in arable crops (Anderson et al., 2016; Guilpart et al., 2017; Pradhan et al., 2015), very little is known about grapevine yield gaps, with only some very recent studies carried out in the Barossa and Eden valleys by Bonada et al. (2022), in which water-limited grapevine yield gaps were analysed. In contrast to other crops, most European wine production must comply with maximum yield requirements set by the geographical quality labels; this is the target yield for many growers (Stranieri and Tedeschi, 2019). In some cases, grapevine yields are intentionally limited, either to improve wine quality (Poni et al., 2018) by applying specific farming practices, such as leaf removal, reduced fertilisation or deficit irrigation, or to limit other production factors, such as workforce or capital (e.g., if growers do not optimise the maintenance of their vine stock capital by renewing the vineyard); however, in accordance with the quality label experts, we assumed these factors to be less significant for the study region and wine quality label studied. We therefore defined grapevine yield gaps as the difference between the maximum yield limit established by the corresponding quality label and the actual yield obtained.

Environmental factors involved in grapevine yield
Grapevine yield is known to develop during the course of two years and is linked to a combination of environmental and management factors associated with several key phenological stages. As has been shown by studies at the plant scale, both soil and climate factors contribute to grapevine yield (Gerós et al., 2015). Soil and climate factors are involved in several key grapevine stages, particularly those determining the availability of water resources (Gaudin et al., 2014; Pellegrino et al., 2005; Simonneau et al., 2017). Furthermore, soil and climate indicators can be used not only at vineyard scale, but also at regional scale to determine and map key environmental characteristics involved in grapevine yield (Carbonneau et al., 2015). Different soil indicators have been used in vineyards to spatially define zones, such as erosion fragility (Chevigny et al., 2014; Rodrigo-Comino and Cerdà, 2018), degree of soil compaction (Lagacherie et al., 2006), carbon content (Bonfatti et al., 2016) and mineral nutrition (Arnó et al., 2012). Moreover, climate indicators have often been used to estimate grapevine phenology (Garcia de Cortazar Atauri et al., 2017; Gavrilescu et al., 2018), identify suitable grapevine growing areas (Anderson et al., 2012; Fraga et al., 2014a; Moral et al., 2016), determine cultivar growing conditions (Parker et al., 2013; Santos et al., 2019) and assess or predict the climatic events impacting grapevine yields, such as heatwaves, hail or frost (Fraga et al., 2020; Petoumenou et al., 2019; Sgubin et al., 2018). Numerous climatic models predict that climate change will threaten grapevine yields through extreme climatic events or worsened water scarcity in specific regions. For instance, simulations performed by Fraga et al. (2016) have indicated that climate change is projected to impact grapevine yield in southern Europe as a result of dryness. Santos et al. (2011) explored climatic evolution in several zones of northern Portugal, helping to identify potential local adaptation strategies. While the prediction of climate evolution in vineyards is useful when developing adaptation strategies to mitigate climate change impacts (Mosedale et al., 2016; Naulleau et al., 2021; Santillán et al., 2019), precise local adaptation can only be applied through a fine spatial description of the environmental factors involved (Naulleau et al., 2022).

Soil and climate zoning in viticulture
In viticulture, zoning is traditionally based on the concept of 'terroir', which refers to the relationship between the climate, soil and vine compartments, and farming practices, with a final objective of ensuring wine quality and typicity. Recently, many authors have suggested that terroir studies should become more unbiased by including precision agriculture techniques for measuring environmental data (Bonfante and Brillante, 2022; Bramley and Hamilton, 2007; Brillante et al., 2020; Vaudour et al., 2015). Numerous terroir classifications have thus been proposed to try and better explain terroir in terms of the environmental variables of a given wine growing region (usually trying to find a link with wine composition); these classifications have sometimes used grapevine yield data to explain terroir (Bramley et al., 2011; Bramley et al., 2020; Peng et al., 2021). However, little attention has been paid to spatial classifications that consider the environmental factors involved in grapevine yield.
At regional and sub-regional scales, climatic indicators have been used to classify viticultural production. Tonietto and Carbonneau (2004) proposed the MCC (Multicriteria Climatic Classification) system based on three climate indicators that also proved to be correlated with the typicity of wine sensory characteristics (Tonietto et al., 2014). Bois et al. (2018) proposed a temperature-based zoning of the Bordeaux wine region to predict maturity dates. Resco et al. (2016) classified the Spanish viticultural regions to help explore adaptation to future climate, and Blanco-Ward et al. Although a high diversity of soil indicators in vineyards exists (van Leeuwen et al., 2018), some authors prefer to consider geological soil factors (Ferretti, 2019). Ultimately, since the availability of soil sample data is limited, resulting in imprecise upscaled cartography, soil data is often the bottleneck in zoning (White, 2020). Therefore, only a few studies have combined both soil and climatic indicators for the zoning of wine regions; for example, the classification of viticultural terroirs in the Iberian Peninsula by Fraga et al. (2014b); the classification of the north-west zones of the Iberian Peninsula by Cardoso et al. (2019); the zoning of terroirs in Denmark by Peng et al. (2021); and the classification of the Australian Barossa sub-zones by Bramley and Ouzman (2022). However, no methods exist for the spatial classification of soil and climate indicators directly related to grapevine yield.

Purpose of this study
In this study, we used climate and soil indicators linked to grapevine yield gap to determine spatial zones within the winegrowing region of 'Languedoc-Roussillon' (South of France, south-eastern part of the French Occitanie region). We collected 9 years' worth of data from grapevine producers under the Pays d'Oc PGI quality label (who are hence all subject to the same maximum yield requirements established by the label) and aggregated them at the municipality level. We calculated, at municipality level, the soil and climate indicators that influence grapevine yield according to the scientific literature. We then kept only those indicators that proved to have a significant effect on grapevine yield at municipality level. Finally, we clustered the municipalities presenting similar indicators into zones, thus facilitating the identification of environmental resources. We hypothesised that (i) the combination of climate and soil can result in different yield levels (e.g., low-yielding and high-yielding environments), (ii) the same yield level can be obtained with different combinations of climate and soils, and (iii) some grape varieties are preferentially cultivated in specific combinations of climate and soils with higher associated yields.

Yield data
Grapevine yield data were obtained from harvest customs declaration data provided by producers under the Pays d'Oc PGI in the former Languedoc-Roussillon region (Figure 1A). This label is the largest wine label in France in terms of cultivated area, wine produced and number of winegrowers, with over 1100 wine cellars. Between 70,000 and 120,000 ha are declared under this label every year, which represents over 50 % of the grapevine cultivated area in Languedoc-Roussillon (Figure 1B). The label sets a maximum red and white wine production limit at 90 hl·ha⁻¹·year⁻¹. Although the maximum yield for rosé wine was increased from 90 to 100 hl·ha⁻¹·year⁻¹ in 2015, we considered 90 hl·ha⁻¹·year⁻¹ as the generic yield objective and threshold for all years and wine colours. The dataset contains a total of 96,667 yield records for a period of 9 years (from 2010 to 2018), 58 grapevine varieties and 606 municipalities. The yield data (hl·ha⁻¹·year⁻¹) were aggregated on a yearly basis at municipality level for all grapevine varieties and vineyards, resulting in 4455 annual municipality yield records. These data are available in the following repository: https://doi.org/10.57745/THCVRE (Fernández-Mena, 2022).

Weather data
We used weather data from the SAFRAN reanalysis (Vidal et al., 2010). The SAFRAN data cover the whole of the French territory at a spatial resolution of 8 km by 8 km and provide weather variables relevant to crop growth. These data have already been used to investigate the effects of climate on crop yield (e.g., Ben-Ari et al., 2018). We used SAFRAN data provided by MétéoFrance and extracted from the INRAE SICLIMA platform at a daily time step between 1 January 2010 and 31 December 2018 for the following variables: daily average (tmean, °C), maximum (tmax, °C) and minimum temperature (tmin, °C), daily rainfall (mm), and daily reference evapotranspiration (ETo, mm).

Soil data
Two soil indicators were considered, as they can have an important effect on grapevine yield: soil available water holding capacity (SAWC, mm) and soil pH (pH, no units). High SAWC can buffer against transient water deficits, which is particularly relevant in Mediterranean areas during dry periods in spring and summer (Coipel et al., 2006). Soil pH can affect grapevine yield negatively, particularly at very low pH levels (Himelrick, 1991; Tagliavini and Rombola, 2001 Lagacherie, 2019), which provides SAWC values at three maximum rooting depths (60, 100 and 200 cm). Validation by comparing with local SAWC observations revealed a substantial level of uncertainty in this map (Styc and Lagacherie, 2021). However, the map displays pedologically sound spatial patterns of predicted SAWC that justify its use for large-scale studies, such as the present study (Styc and Lagacherie, 2021). We used soil pH data from the Languedoc-Roussillon GlobalSoilMap (Vaysse and Lagacherie, 2015) at a spatial resolution of 90 m, following the GlobalSoilMap specifications (Arrouays et al., 2019). The map had been created from legacy-measured soil profiles associated with a set of soil covariates, using random forest and kriging. The data cover four soil layers at depths of 0-5 cm, 5-15 cm, 15-30 cm and 30-60 cm and show high prediction performance (R² = 0.71-0.73). Both datasets (SAWC and pH) are publicly available for download at https://www.openig.org/.

Data aggregation at the municipality level
Yield, climate and soil data were aggregated at the municipality level. Average grapevine yield (hl·ha⁻¹·year⁻¹) was calculated as the area-weighted average of yields over all grapevine varieties in each municipality. The SAWC raster map was intersected with the municipalities of the region studied and an area-weighted average of SAWC was calculated for each municipality. We calculated the municipality soil pH as the average pH of the four soil layers at depths of 0-5, 5-15, 15-30 and 30-60 cm. The weather data were aggregated at municipality level using the nearest neighbour method.
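As an illustration of the area-weighted aggregation described above, the following minimal Python sketch computes one yield value per municipality and year. The record layout, the function name `area_weighted_yield` and the toy values are assumptions made for this example; they are not taken from the study's dataset or code.

```python
from collections import defaultdict


def area_weighted_yield(records):
    """Aggregate declared yields to one value per (municipality, year).

    `records` is an iterable of (municipality, year, area_ha, yield_hl_per_ha)
    tuples; this layout is hypothetical.
    """
    volume = defaultdict(float)  # hl produced per (municipality, year)
    area = defaultdict(float)    # ha declared per (municipality, year)
    for municipality, year, area_ha, yield_hl_ha in records:
        key = (municipality, year)
        volume[key] += area_ha * yield_hl_ha
        area[key] += area_ha
    # Area-weighted mean yield = total volume / total area
    return {key: volume[key] / area[key] for key in area if area[key] > 0}


# Toy declarations (invented values, for illustration only)
records = [
    ("A", 2010, 10.0, 60.0),
    ("A", 2010, 30.0, 80.0),
    ("B", 2010, 5.0, 90.0),
]
municipality_yield = area_weighted_yield(records)
print(municipality_yield)
```

Weighting by area means that a large plot with a given yield counts for more than a small one, which is equivalent to dividing the total declared volume by the total declared area.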

Calculation of climatic indicators
Drawing from current knowledge of grapevine physiology, a total of eight climatic indicators were initially selected for potentially being able to explain annual and spatial variations in grapevine yield. These can be grouped into four types. The first type of climate indicator provides information about temperature suitability for grapevine development (Tonietto et al., 2014;Tonietto and Carbonneau, 2004), and comprises a single indicator: the Huglin Index (HI, degree Celsius), proposed by Huglin (1978). This index is calculated as the sum of temperatures over a threshold of 10 °C, using both the mean and maximum daily air temperatures, during the grapevine vegetative period; i.e., from the beginning of April to the end of September in the northern hemisphere. The sum is multiplied by a length-of-day coefficient (d).
We used d = 1.03 for the latitude of the Languedoc-Roussillon region (42°01′-44°00′).
(Luo et al., 2011; Pagay and Collins, 2017) and can also cause leaf scorch and fruit wilt above 40 °C (Galet, 2000; Liu et al., 2019; Pagay and Collins, 2017; Venios et al., 2020). (iii) Severity of Heat Stress (SHS, degree Celsius), calculated as the accumulation of daily mean temperature over 28 °C from budburst (considered as 1 April) to harvest (considered as 30 September). Around 28 °C is considered as the optimum daily mean temperature for grapevine photosynthetic activity in a Mediterranean context (Schultz, 2000; Xiao et al., 2017). Heat stress will increase proportionally to the degree days accumulated over this optimum, reducing grapevine photosynthetic activity and thus yield.
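The Huglin Index and the Severity of Heat Stress are both seasonal temperature sums, so they can be sketched together. The Python example below is a minimal illustration, not the authors' code: the function names and the toy daily values are invented, and clamping each daily contribution at zero follows the conventional form of these indices, which is an assumption here because the text does not state it explicitly.

```python
from datetime import date


def in_season(day):
    """True from 1 April to 30 September (northern hemisphere vegetative period)."""
    return (4, 1) <= (day.month, day.day) <= (9, 30)


def huglin_index(daily, d=1.03):
    """Huglin Index from (date, tmean, tmax) tuples in degrees Celsius.

    Daily term: mean of (tmean - 10) and (tmax - 10), clamped at 0 (assumption),
    summed over the season and multiplied by the length-of-day coefficient d.
    """
    total = 0.0
    for day, tmean, tmax in daily:
        if in_season(day):
            total += max(0.0, ((tmean - 10.0) + (tmax - 10.0)) / 2.0)
    return d * total


def severity_heat_stress(daily, optimum=28.0):
    """Degrees of daily mean temperature accumulated above the 28 °C optimum."""
    return sum(max(0.0, tmean - optimum)
               for day, tmean, tmax in daily if in_season(day))


# Toy series (invented values)
daily = [
    (date(2015, 3, 31), 15.0, 20.0),  # outside the season: ignored
    (date(2015, 7, 1), 25.0, 35.0),   # HI term (15 + 25) / 2 = 20; SHS term 0
    (date(2015, 7, 2), 30.0, 38.0),   # HI term (20 + 28) / 2 = 24; SHS term 2
]
hi = huglin_index(daily)
shs = severity_heat_stress(daily)
print(hi, shs)
```

In a real application the `daily` series would hold one entry per day of the year for the grid cell assigned to a municipality.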
The fourth type of climate indicator describes cold stress and comprises three indicators: (i) Days of Frost (DF, days), calculated as the total number of days per year with minimum temperature below 0 °C. Low winter temperatures can cause seasonal changes to fruit structure and a decrease in yield; extreme cold events (i.e., < -10 °C) can cause injuries in vines (Buztepe et al., 2017; Kaya and Köse, 2017). This indicator is also correlated with the probability of larger-scale local episodes of late frosts not being detected by climate stations. (ii) Frequency of Late Frost (FLF, days), calculated as the number of days with minimum temperature below 0 °C from 1 April to 30 September. Late frosts are a significant hazard for grapevines as they cause considerable damage to plant tissue (Molitor et al., 2014; Trought et al., 1999).
(iii) Severity of Late Frost (SLF, degree Celsius), calculated as the accumulation of daily minimum temperatures below 2 °C from 1 April to 30 September. Frost impact is most severe at extremely low temperatures below 0 °C (Poling, 2008), but some temperatures between 0 and 3 °C have proven to significantly impact grapevine growth (Hendrickson et al., 2004), and local frost episodes can also occur when downscaling the average temperature to specific vineyards.
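The three cold-stress indicators can be sketched from a daily minimum temperature series as follows. This is a hedged Python illustration with invented names and toy values. In particular, SLF is interpreted here as the sum of the daily shortfalls below 2 °C, i.e. max(0, 2 - tmin); the text does not spell out the exact accumulation, so this reading is an assumption.

```python
from datetime import date


def in_late_frost_window(day):
    """True from 1 April to 30 September."""
    return (4, 1) <= (day.month, day.day) <= (9, 30)


def frost_indicators(daily, slf_threshold=2.0):
    """Compute DF, FLF and SLF for one year of (date, tmin) tuples (°C)."""
    # DF: frost days over the whole year
    df = sum(1 for day, tmin in daily if tmin < 0.0)
    # FLF: frost days within the late-frost window only
    flf = sum(1 for day, tmin in daily
              if tmin < 0.0 and in_late_frost_window(day))
    # SLF: degrees accumulated below the threshold (assumed interpretation)
    slf = sum(max(0.0, slf_threshold - tmin) for day, tmin in daily
              if in_late_frost_window(day))
    return {"DF": df, "FLF": flf, "SLF": slf}


# Toy series (invented values)
daily = [
    (date(2017, 1, 15), -3.0),  # winter frost: counts for DF only
    (date(2017, 4, 20), -1.0),  # late frost: DF, FLF, and 3.0 towards SLF
    (date(2017, 4, 21), 1.5),   # no frost, but 0.5 towards SLF
    (date(2017, 6, 1), 12.0),   # no contribution
]
indicators = frost_indicators(daily)
print(indicators)
```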

Selection of climate and soil indicators that best explain variations in grapevine yield
After identifying and calculating an initial set of climate and soil indicators relevant to grapevine yield (Table 1), we used linear modelling to select indicators that best explained the variations in grapevine yield at the municipality level.
Although some indicators, such as the Huglin Index and the number of hot days, were correlated (SI: Figure 2), we did not eliminate any indicators based on correlations. Instead, a backward stepwise model selection process was used to select the indicators most closely related to grapevine yield, following Zuur et al. (2009), Gareth et al. (2013) and Cayuela (2018).
Our yield prediction model tested all the soil and climate variables listed in Table 1 as predictors for each municipality yield and year (n = 4455), as well as for the average grapevine yield of the municipality for the whole period studied (n = 606). These models were fitted using linear mixed-effect models (LMM). LMM are commonly employed in analyses of grouped data where observations cannot reasonably be assumed to be independent of one another (Pinheiro and Bates, 2006). In our case, climate and soil indicators were considered as fixed effects for the prediction of average grapevine yield at the municipality level, and the municipality was assumed to have a random effect, since there is a spatial dependence among interannual yield data for the same municipality (Bonansea et al., 2015).
In the backward stepwise selection process, we started with the most complex model and dropped one variable at each step depending on the p-values, until the remaining variables had significant p-values (< 0.05). Then, we selected the most parsimonious model using both Akaike information criterion (AIC) and Bayesian information criterion (BIC). This technique of model selection is also used in earth science studies, such as that of Cremona et al. (2018), to discriminate factors involved in ecological processes. Random forest was used to discriminate the importance of variables by ranking the predictors of the selected model as a function of their contribution to reducing prediction error (Strobl et al., 2008).
[Table caption fragment: ... Figure S2). P-values are flagged: * for < 0.05; ** for < 0.01 and *** for < 0.001.]
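The selection loop described above can be sketched generically in Python. This is a minimal sketch under stated assumptions: `fit` is a hypothetical callback that would refit the mixed model (municipality as random effect) on the given predictors and return their p-values; here it is replaced by a toy function with invented p-values so that the example runs without any statistics library. The helper names and predictor labels are likewise invented. The AIC and BIC helpers use the standard definitions, where lower values indicate a better trade-off between fit and complexity.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def backward_stepwise(predictors, fit, alpha=0.05):
    """Drop the least significant predictor until all p-values are below alpha.

    `fit(names)` must return a dict {name: p_value} for the model refitted
    with exactly those predictors (hypothetical interface).
    """
    current = list(predictors)
    history = []
    while current:
        p_values = fit(current)
        history.append((tuple(current), dict(p_values)))
        worst = max(current, key=lambda name: p_values[name])
        if p_values[worst] < alpha:
            break  # every remaining predictor is significant
        current.remove(worst)
    return current, history


# Toy stand-in for model fitting: fixed, invented p-values (not study results)
TOY_P = {"ind_A": 0.001, "ind_B": 0.01, "ind_C": 0.03, "ind_D": 0.40, "ind_E": 0.20}


def toy_fit(names):
    return {name: TOY_P[name] for name in names}


selected, history = backward_stepwise(TOY_P, toy_fit)
print(selected)
```

In a real analysis the p-values change each time a predictor is dropped, which is why the model is refitted at every step; the candidate models recorded in `history` could then be compared by AIC and BIC.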

Spatial clustering and zone assessment
Based on the set of selected indicators as described in Section 2.2, we clustered the municipalities with similar soil and climate, creating groups of municipalities that we refer to as zones. For the clustering, we used a combination of principal components analysis (PCA) and ascendant hierarchical classification (AHC). PCA and AHC on principal components were performed using the 'FactoMineR' R package (Lê et al., 2008). The number of clusters in the AHC was defined in collaboration with local viticultural experts (grapevine growers and wine label managers of Pays d'Oc PGI) to capture a representative range of the pedoclimatic regional variability. Ward's method suggested using three clusters, but this number of clusters was too low and did not include some important factors such as pH, CDI or VHD.
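To make the clustering step more concrete, the sketch below implements agglomerative clustering with Ward's criterion in plain Python on standardised indicators. It is an illustration under stated assumptions rather than a reproduction of the FactoMineR workflow: the PCA step is omitted for brevity (clustering is done directly on standardised variables), the function names are invented, and the six rows of toy data (two indicator columns) are made up. The number of clusters is passed explicitly, mirroring the fact that it was chosen with experts rather than derived automatically.

```python
import statistics


def standardise(rows):
    """Centre and scale each column (assumes no column is constant)."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def ward_clusters(rows, n_clusters):
    """Merge clusters greedily, minimising the increase in within-cluster
    sum of squares (Ward's criterion), until n_clusters remain."""
    clusters = [{"members": [i], "centroid": list(r)} for i, r in enumerate(rows)]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                na, nb = len(a["members"]), len(b["members"])
                dist2 = sum((x - y) ** 2
                            for x, y in zip(a["centroid"], b["centroid"]))
                cost = na * nb / (na + nb) * dist2  # Ward merge cost
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        b = clusters.pop(j)  # j > i, so index i stays valid
        a = clusters[i]
        na, nb = len(a["members"]), len(b["members"])
        a["centroid"] = [(na * x + nb * y) / (na + nb)
                         for x, y in zip(a["centroid"], b["centroid"])]
        a["members"] += b["members"]
    return [sorted(c["members"]) for c in clusters]


# Toy municipalities: (water capacity in mm, soil pH), invented values
rows = [[60, 7.9], [62, 8.0], [100, 8.1], [98, 8.2], [55, 5.5], [57, 5.8]]
zones = ward_clusters(standardise(rows), n_clusters=3)
print(zones)
```

Standardising first prevents indicators with large numeric ranges (e.g., degree-day sums) from dominating the Euclidean distances used in the merge cost.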
We assessed and described the characteristics of the defined zones in terms of the grapevine yield gap (as the difference between the average grapevine yield and the label production limit; i.e., 90 hl·ha⁻¹·year⁻¹). For every zone, we calculated the indicators selected in Section 2.2, the distribution of grapevine varieties, the number of municipalities and the total grapevine cultivated area. Tukey's range test was then used to compare the average values between zones using the 'multcomp' R package (Hothorn et al., 2009). An ANOVA test (R Core Team, 2014) was used to estimate the variability of the grapevine yield distribution explained by the zone classification, and to compare it to the unexplained variability inside each zone. A Chi-square goodness-of-fit test using the 'summarytools' R package (Comtois, 2021) was used to analyse the proportions of grapevine varieties in each zone (with a significance level alpha = 0.05), comparing their distribution to an expected equilibrated distribution among the zones.
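The comparison of explained and unexplained variability corresponds to the F statistic of a one-way ANOVA, which can be written out in a few lines of Python. The sketch below uses invented yields for three hypothetical zones and computes only the statistic itself, not the associated p-value; the function name is an assumption of this example.

```python
def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    all_values = [v for g in groups for v in g]
    n_total, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variability explained by group membership
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variability left unexplained inside the groups
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Toy municipality yields (hl/ha/year) for three hypothetical zones
zone_yields = [
    [50.0, 55.0, 60.0],
    [70.0, 75.0, 80.0],
    [60.0, 65.0, 70.0],
]
f_value = one_way_anova_f(zone_yields)
print(f_value)
```

An F value well above 1 indicates that the mean square between zones exceeds the mean square within zones.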

Descriptive statistics of grapevine yield data
In our database, the vineyard cultivated area and wine volume as declared by the Pays d'Oc quality label were, on average, 73 718 ha·year⁻¹ and 5 950 302 hl·year⁻¹. However, we aggregated individual yields weighted by their area at the municipality level, obtaining an average yield per municipality and year. Yield values for all the municipalities were as follows: a mean of 65.3 hl·ha⁻¹·year⁻¹, a median of 67.03 hl·ha⁻¹·year⁻¹, a minimum of 6 hl·ha⁻¹·year⁻¹, and a maximum of 100 hl·ha⁻¹·year⁻¹ (the latter corresponding to a municipality with rosé wines, for which the permitted maximum yield is 100 hl·ha⁻¹·year⁻¹). If the yield gap was filled, the extra production expected could be 684 318 hl·year⁻¹ (i.e., over 11.5 % of the current wine volume) with regard to the label's maximum yield requirements for red and white wines (90 hl·ha⁻¹·year⁻¹).
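The extra production figure appears consistent with applying the 90 hl·ha⁻¹·year⁻¹ limit to the whole declared area and subtracting the declared volume. The short Python check below reproduces that arithmetic from the numbers given in the text; the variable names are ours, and whether the authors computed the figure exactly this way is an assumption.

```python
LABEL_MAX_YIELD = 90.0      # hl/ha/year, label limit for red and white wines
MEAN_AREA_HA = 73718.0      # average declared area per year
MEAN_VOLUME_HL = 5950302.0  # average declared volume per year

# Volume that would be produced if every hectare reached the label limit
potential_volume = LABEL_MAX_YIELD * MEAN_AREA_HA

# Extra production if the yield gap were closed, and its share of current volume
extra_volume = potential_volume - MEAN_VOLUME_HL
extra_share = extra_volume / MEAN_VOLUME_HL

# Region-wide yield and yield gap implied by the declared totals
overall_yield = MEAN_VOLUME_HL / MEAN_AREA_HA
overall_gap = LABEL_MAX_YIELD - overall_yield

print(f"extra volume: {extra_volume:.0f} hl/year ({extra_share:.1%})")
print(f"overall yield: {overall_yield:.1f} hl/ha/year, gap: {overall_gap:.1f}")
```

Note that the region-wide yield implied by the totals is weighted by area across all declarations, so it need not equal the unweighted mean of the municipality yields reported above.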
Spatially, the estimation of average grapevine yield at the municipality level between 2010 and 2018 revealed localised yield gaps in numerous municipalities (Figure 2).
Temporally, no declining trend was observed within this time frame, although some years had lower yields (Figure S1), in particular 2010 and 2017, which were linked to severe drought conditions.
The most planted grapevine varieties in the region were, in order of cultivated area, Merlot, Cabernet-Sauvignon, Syrah, Chardonnay, Grenache noir, Sauvignon blanc and Cinsault. In total, 10 varieties were grown over 70 % of the cultivated area (Table ST1). The yields of white wine varieties, including Chardonnay, Grenache blanc, Viognier and Muscat petit grain, were generally lower than 70 hl·ha⁻¹·year⁻¹ and lower than those of red and rosé wine varieties. Yield differences between red wine varieties were low.

Identification of six relevant climate and soil indicators
Of the models tested in Table ST2, we selected the mixed model with the best AIC and BIC performance, for which six of the 10 calculated indicators (Table 2) proved to have a significant effect on the annual grapevine yield of the municipalities (n = 4455). This method obtained a low marginal R² (0.112), thus showing low potential for annual yield prediction. Yet, the same predictors proved to be more relevant for the prediction of average grapevine yield for the whole period (n = 606), with a marginal R² of 0.546 and a conditional R² of 0.627. The variables that were found to have a significant effect on grapevine yield at the municipality level were, in order of increasing significance: Soil Available Water Capacity (Figure 3 Figure S4) and the Very Hot Days (Figure S5). Despite their theoretical impact on grape yield, four indicators were excluded from our model: SLF, FLF, HD and SHS (Figures S6, S7, S8 and S9). A random forest partial dependence plot of variable importance (Figure S2) ranked the variables according to their predictive capacity, as shown in Table 2. The Pearson correlation matrix showed a high positive correlation among the indicators related to extreme heat events (HD, VHD and SHS), and between these and temperature accumulation in HI, whereas CDI was highly and positively correlated with DF and highly and negatively correlated with HI (Figure S2). These correlations helped to discriminate between HD and VHD to improve model performance, and to verify the low correlations of all the factors with the Municipality as an aggregated variable.

Seven agroecological zones of municipalities with similar climate and soil conditions
The PCA and AHC statistical analyses helped to define seven clusters of municipalities using the selected soil and climate variables as listed in Section 3.2 (Figure S10). Each of those clusters represents an agroecological zone with similar soil and climate characteristics (Figure 6). The characteristics of the zones are significantly different from each other in terms of at least one index that favours or constrains grapevine yield (Figure 7).
FIGURE 6. Soil and climate zones related to grapevine yield at the municipality level in Languedoc-Roussillon.
Zone 1 is the 'Humid zone of the hinterland' and has the coolest temperatures due to its distance from the Mediterranean coast. As a consequence of having the lowest Climate Dryness Index (CDI) (around -150 mm) and number of Very Hot Days (VHD) (from 0 to 2), this zone benefits from high grapevine yield. Its main constraint is its Huglin Index (HI), which is the lowest, with 300 to 400 degree-days less than other zones. The Soil Available Water Capacity (SAWC) is relatively high (from 70 to 100 mm) and Days of Frost (DF) are average (from 10 to 20).
Zone 2 is the 'Zone with acid and shallow soils in the mountains', which is the only one with an acid soil pH (ranging from 5 to 7.5), which constrains grapevine yield. This zone also has the lowest SAWC (from 50 to 80 mm) and a low HI. The rest of the variables are average. The municipalities of this zone are located at the highest elevations with municipalities in the southern (Pyrenees mountains) and northern (Caroux Mountains) areas of the region.
Zone 3 is the 'Zone of piedmont with constraining SAWC'. It has low temperature-related variables (HI, DF and VHD), similar to those in Zone 2, but municipalities in this zone have alkaline soil pH (from 7 to 8.3). Water-related indicators are also not very favourable, although SAWC is significantly higher than in Zone 2. CDI is lower, being linked to higher temperatures and the HI. The municipalities of this zone are located at mid-elevation and in the piedmont areas of the region.

Zone 4 is the 'Cold and dry zone surrounding Pic St Loup'. This zone is constrained by numerous Days of Frost (DF), but also by high temperatures in summer (high HI and VHD). This zone is also constrained by low water availability from rainfall (low CDI) and soils (low SAWC). The municipalities of this zone are located in high areas surrounding the Pic Saint-Loup (north of Montpellier).
Zone 5 is the 'Zone of average inland soils'. It comprises relatively average soils, although SAWC is very variable. The region is constrained by a high Climatic Dryness Index (CDI) and the highest Huglin Index. The municipalities of this zone are mainly located on the inland plains in the central and eastern parts of the region.
Zone 6 is the 'Zone with deep soils in mild coasts'. It comprises the best soils (highest SAWC), compensating for having the highest water deficit (highest CDI) in the region. Extreme temperatures are rare in this zone due to the proximity of the sea.
Zone 7 is the 'Highest number of very hot days but deep soils'. It is subject to the most extreme temperatures with the highest level of Very Hot Days (VHD) and many Days of Frost (DF). In contrast, water availability is favourable due to deep soils (high SAWC) and less dry climate (low CDI). The municipalities of this zone are located on several inland plains in the eastern part of the region.

Assessment of the yield gap and varieties per agroecological zone
Depending on their yield gaps (i.e., the difference between label maximum yield (90 hl·ha⁻¹·year⁻¹) and obtained yield), the clustered zones can be divided into two main groups (Figure 8):
1. The group with the highest yield gaps, from 30 to 50 hl·ha⁻¹·year⁻¹; i.e., yields ranging from 50 to 60 hl·ha⁻¹·year⁻¹. This group corresponds to municipalities in Zones 2, 3 and 4. Within this group, Zone 3 has a significantly higher yield.
FIGURE 8. Distributions of average municipality grapevine yields (n = 606) in hl·ha⁻¹·year⁻¹ in each of the seven clustered zones in Languedoc-Roussillon between 2010 and 2018. The boxplots represent the distribution in quartiles with median lines. Circles represent the mean and filled dots are outliers. Letters correspond to Tukey's range test for comparison of means. The dashed red line corresponds to the 90 hl·ha⁻¹·year⁻¹ maximum label yield used for yield gap calculation. The percentage in brown corresponds to the coefficient of variation over time for each zone.
Municipalities in Zone 1 show an intermediate yield gap between the two above-described groups.
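As a minimal illustration of the yield gap definition used in this section (label maximum yield minus obtained yield), the calculation can be sketched as follows. The function name and the municipality values are hypothetical and are not taken from the study dataset; only the 90 hl·ha⁻¹·year⁻¹ label maximum comes from the text.

```python
# Label maximum yield stated in the text, in hl/ha/year.
LABEL_MAX_YIELD = 90.0


def yield_gap(obtained_yield, label_max=LABEL_MAX_YIELD):
    """Return the yield gap: label maximum yield minus obtained yield."""
    return label_max - obtained_yield


# Hypothetical municipality yields (illustrative values, not study data).
example_yields = {"municipality_a": 55.0, "municipality_b": 72.5}
example_gaps = {name: yield_gap(y) for name, y in example_yields.items()}
```

Under this definition, a municipality reaching the label maximum has a yield gap of zero.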
The conditions for applying the ANOVA test to grapevine yield by zone were validated by the Shapiro-Wilk normality test. The variability in grapevine yield explained by the clustered zones is significant according to the ANOVA test. The F-value obtained was 11.54, indicating that more variability was explained between the zones than remained unexplained within them.
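For readers less familiar with the F-value, the sketch below shows how a one-way analysis of variance compares the variability between groups (here, zones) with the variability within groups. It is a generic textbook computation written for illustration, not the code used in the study, and the input values are made up.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of values."""
    k = len(groups)                          # number of groups (e.g., zones)
    n_total = sum(len(g) for g in groups)    # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variability explained by group membership (between groups).
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variability left unexplained inside the groups (within groups).
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Made-up yields for three hypothetical groups (not study data).
f_value = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
```

An F-value well above 1 means that the mean square between groups is larger than the mean square within groups; its significance is then assessed against the F distribution with the corresponding degrees of freedom.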
We observed a high variation in yield levels depending on the zones, as shown by the coefficient of variation in Figure 8. Although there was no significant tendency towards lower yields over time in the zones, the zones with low yields (i.e., Zones 2, 3 and 4) drastically reduced their yields in occasional years (Figure S11). Zones 5 and 6 account for the largest cultivated areas (i.e., 20000 to 25000 ha) and Zones 4 and 2 for the smallest (i.e., 1000 to 2500 ha) (Figure S12). In addition, these zones account for the highest and the lowest number of municipalities, respectively (Figure S12).
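The coefficient of variation over time is the standard deviation of a series of annual yields divided by its mean. A minimal sketch is given below; using the sample standard deviation is an assumption, since the text does not state which estimator was applied, and the annual values are illustrative only.

```python
import statistics


def coefficient_of_variation(values):
    """Return the coefficient of variation in percent (sample SD / mean * 100)."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


# Made-up annual yields for one hypothetical zone, in hl/ha/year.
annual_yields = [50.0, 60.0, 70.0]
cv_percent = coefficient_of_variation(annual_yields)
```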
Concerning grapevine varieties, we found a total of 15 different grapevine varieties based on the distribution of the top 10 varieties in each of the seven pedo-climatic zones (Figure S13); the selected varieties represent over 81 % of the total diversity grown under the Pays d'Oc PGI label during the years studied. Nine of these varieties represented between 75 and 93 % of the planted area depending on the zone (Table ST3); these were, in order of increasing planted area: Merlot, Cabernet-Sauvignon, Syrah, Chardonnay, Grenache noir, Sauvignon blanc, Cinsault, Viognier and Pinot noir. We analysed the yield distribution of each of these nine varieties per zone, which resulted in a similar overall distribution (Figure S14). The white wine varieties showed lower yields than the red wine varieties. Apart from these popular varieties, only two original varieties were found in a few zones: Cabernet franc in Zone 1 and Muscat in Zones 2 and 3.

DISCUSSION AND PERSPECTIVES
In the following sections, we first discuss the limitations of the data used and selection of indicators (Section 4.1). We then refer to the zoning results to give recommendations for management adaptations (Section 4.2), discuss other factors involved in grapevine yield gaps that would need to be studied to complete this work (Section 4.3), and, finally, give our conclusions (Section 4.4).

Limitations of climate and soil data and selected indicators
Soil data availability at the landscape scale is often the bottleneck when zoning terroir and in grapevine production (White, 2020); in particular, the lack of an SAWC map and measurements constrains many classifications, such as that of the Italian Chianti, as suggested by Priori et al. (2019). As a result of the GlobalSoilMap initiative, new soil databases are available at the regional scale for Languedoc-Roussillon (Vaysse and Lagacherie, 2015). In this study, we showed that not only climate, but also soil variables (SAWC and soil pH) are relevant when studying regional grapevine yield gaps. However, GlobalSoilMap products are less useful for carrying out these analyses at more local scales. Although the SAWC map explained a small amount of the total SAWC variance (20 % according to Styc and Lagacherie, 2021), the aggregation of SAWC at the municipality level reduced this variance, as observed by Vaysse and Lagacherie (2017).
The use of SAFRAN climate data raises a number of questions. On the one hand, the 8 km by 8 km resolution smooths out the local variability of the climate, which can sometimes be large over small distances and contribute significantly to the heterogeneity of the vineyard environment. On the other hand, SAFRAN is a reanalysis, which necessarily produces biases, especially in areas where relief is more pronounced (Quintana-Segui et al., 2008). SAFRAN tends to minimise extreme temperatures (Ollat et al., 2021), thus decreasing the number of Very Hot Days and Days of Frost. Because of the scale used, we could not consider certain local phenomena, such as hail, that can drastically damage grapevine yield (González-Fernández et al., 2020). The analysis of a longer climate data series (i.e., over 20 or 30 years) would help to improve the relevance of climatic indicators, but it would require grapevine yield data covering the same period. Another aspect that could be considered is the integration of plant phenology into the calculation of the indicators. Indeed, the impacts of climate on the functioning of the plant differ depending on the phenological stage of the plant, and they therefore depend highly on the variety used (Morales-Castilla et al., 2020). A possible approach to integrating these effects would be to calculate the indicators based not on calendar dates but on periods corresponding to key stages of the vine (budburst, flowering, ripening, etc.); this has already been applied to other crops (Caubel et al., 2015).
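The idea of computing an indicator over a phenological period rather than a fixed calendar period can be sketched as follows. This is only an illustration: the function name, the 35 °C threshold, the stage dates and the temperature values are all hypothetical assumptions and do not reproduce the indicator definitions used in the study.

```python
from datetime import date


def count_days_above(daily_tmax, start, end, threshold_c):
    """Count days in [start, end] whose maximum temperature exceeds threshold_c."""
    return sum(
        1
        for day, tmax in daily_tmax.items()
        if start <= day <= end and tmax > threshold_c
    )


# Made-up daily maximum temperatures in degrees Celsius (not study data).
daily_tmax = {
    date(2015, 6, 1): 36.0,
    date(2015, 7, 10): 34.0,
    date(2015, 7, 20): 37.5,
    date(2015, 9, 30): 38.0,
}

HOT_DAY_THRESHOLD_C = 35.0  # hypothetical threshold, for illustration only

# Indicator over a fixed calendar window.
calendar_count = count_days_above(
    daily_tmax, date(2015, 4, 1), date(2015, 9, 30), HOT_DAY_THRESHOLD_C
)

# Same indicator over a window bounded by hypothetical phenological stage
# dates (e.g., flowering to ripening), which would differ between varieties.
phenology_count = count_days_above(
    daily_tmax, date(2015, 6, 10), date(2015, 9, 1), HOT_DAY_THRESHOLD_C
)
```

With a variety-specific window, the same climate series can yield a different indicator value for each variety, which is the effect that a phenology-based calculation would aim to capture.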
Most of the proposed indicators proved to be significant determinants of grape yield. Although the influence of high temperature was partially explained by the number of Very Hot Days, the number of Hot Days and the Severity of Heat Stress were, surprisingly, not significant indicators in a region with numerous heat events (Garnier, 2010). In addition, the period studied did not include the most severe heat wave recorded in the South of France, in 2019, which severely impacted grapevine yield (Lopez-Fornieles et al., 2022). The selection of the Days of Frost indicator may seem surprising from the point of view of extreme event impact, yet many Days of Frost in winter will delay budburst, which, for some cultivars, reduces the length of the vegetative cycle and therefore yield (Cameron et al., 2022). The Frequency of Late Frost was not deemed significant, since such frosts were very marginal during the period studied (Figure S8). Although rare, some strong late frost events can occasionally occur in Languedoc-Roussillon; a noteworthy example is that of the historic frost of April 2021, which caused much damage in all the southern viticultural regions of France and whose consequences are still being assessed (DDTM du GARD., 2022). Projections of a reduction in frost events in Europe (Leolini et al., 2018) are still under discussion; meanwhile, a recent modelling study in France has predicted a higher risk of them occurring (Vautard et al., 2022).

The role of 7 pedoclimatic zones to guide R&D in viticulture in Languedoc-Roussillon
The 7 pedoclimatic zones explained more than 50 % of the grapevine yield spatial variations at the scales studied, despite being based on few variables. In each zone, we identified some indicators that exceeded the regional average and which may have limited grapevine yield, so we proposed some specific management adaptations. For instance, Zone 1 has the lowest HI; therefore, growing grapevine varieties adapted to lower cumulated temperatures, such as Merlot, Cabernet franc and Cabernet-Sauvignon, could help to improve or maintain grapevine yield (Morales-Castilla et al., 2020; Parker et al., 2013; Parker et al., 2020). Zone 2 has the lowest SAWC and soil pH; therefore, rootstock adaptation to lower pH and water stress could help to improve productivity in this area (Himelrick, 1991; Serra et al., 2014; Tagliavini and Rombola, 2001; Vrsic et al., 2016). Zones 3, 4, 5 and 6 are affected by water stress, having either a low SAWC (especially in Zone 4), a high Climatic Dryness Index (in particular in Zone 6, as well as in Zone 5) or both (in Zone 3). The combination of water and thermal stress is known to increase stomatal closure, which negatively affects grapevine photosynthesis (Carvalho and Amâncio, 2019). Adaptations to water scarcity can include rootstock and cultivar choice, irrigation, decreasing planting density, early weed control and cover crop management (Duchêne, 2016; Lovisolo et al., 2016; Naulleau et al., 2021). Zone 7 also has limitations linked to extreme temperatures, which are low during the winter (i.e., high number of Days of Frost, thus shortening the grapevine cycle) and very high during the summer (i.e., high number of Very Hot Days, severely impacting grapevine yield as shown by Lopez-Fornieles et al. (2022)). The introduction of shading nets and tree shades in this zone could help to prevent extreme temperatures and scorching of the vines (Grimaldi et al., 2019; Oliva Oller et al., 2022; Villalobos-Soublett et al., 2021; Williams et al., 2022).
Furthermore, the adaptations to be implemented will depend on the total cultivated area of a given zone. Zones 5 and 6 are the most cultivated, followed by 1, 7 and 3. Within each zone, we observed that the municipality variability of the yield gaps was higher than their indicator variability. Consequently, it would be worth investigating the causes of municipalities in a high yield zone having a low average yield gap, and, if relevant, improving farm management practices to fulfill the environmental potential of the zone.

Other non-environmental factors are involved in grapevine yield gaps
Indeed, climate and soil alone do not determine yield at the municipal level. It is very important to consider other factors, including the grapevine variety, clones and rootstocks that have been planted. Little is known about the yield potential of each variety and rootstock, particularly linked to growing conditions (Palacios et al., 2022). In our study, the choice of varieties had less effect on the yield gap than environmental factors, with smaller differences found between the most grown varieties (i.e., 5-10 hl·ha⁻¹·year⁻¹) compared to within-zone yield variation (i.e., 10-30 hl·ha⁻¹·year⁻¹). In addition, their spatial distribution is quite homogeneous, with the nine most grown varieties planted in 75 to 93 % of the zone areas, as shown in Table ST3. In other case studies exhibiting large differences between varieties, the zoning method proposed here can also be applied by zoning per grapevine variety. Further research should be carried out on the yield potential of grapevine varieties, clones and rootstock, as well as their adaptation to different climate and soil conditions (Duchêne, 2016; Gisbert et al., 2022; Serra et al., 2014).
Vineyard technology and management practices beyond the scope of this study also have an important effect on grapevine yields. Irrigation facilities are available in 17.8 % of the grapevine cultivated area in the Languedoc-Roussillon region (Cambrea et al., 2020), and many winegrowers are considering turning to irrigation (Graveline and Grémont, 2021), which would help to reduce grapevine yield gaps linked to water stress. Regarding vineyard inter-row management, different strategies already exist in Languedoc-Roussillon that can help to reduce competition between inter-row plants and the grapevine for water and soil nutrients (Fernández-Mena et al., 2021). Soil fertility preservation practices, such as mineral fertilisation and the incorporation of compost into soils, can enhance grapevine growth and grape production (Vrignon-Brenas et al., 2019). The crop protection strategy to avoid the impact of pests and diseases also has an important effect on grapevine yields depending on the climate and pest pressure within each zone, powdery mildew being the most common disease in the region (Fouillet et al., 2022; Guilpart, 2014).
For each of the seven zones, the yield gaps were divided into two groups representing i) high yield gaps (30-50 hl·ha⁻¹·year⁻¹) and ii) low yield gaps (10-25 hl·ha⁻¹·year⁻¹). All the zones contained municipalities with high yield gaps (i.e., higher than 30 hl·ha⁻¹·year⁻¹), and municipalities without any yield gaps; in the latter case, irrigation in these municipalities may explain the closing of the yield gap. In addition, insight into good management practices could be gained by investigating the management practices of winegrowers who obtain high yields in low yield municipalities or zones, as did Andrade et al. (2022) and Mourtzinis et al. (2018). The exploration of such avenues is a future perspective for this research; for instance, by collecting extra data or by 'innovation tracking' (Salembier et al., 2021).
Low grapevine yield has often been related to high quality wine, and yield has therefore been restricted by quality labels to ensure a minimum quality standard (Stranieri and Tedeschi, 2019). The grapevine yield data used were provided by Pays d'Oc PGI, the most cultivated wine label in the Languedoc-Roussillon region, which is a PGI (Protected Geographical Indication) label within the EU label framework (European Commission., 2022). Although other PGI labels exist in the region, as well as numerous Protected Designation of Origin (PDO) labels with more restricted vineyard practices and constrained target yields, the Pays d'Oc PGI label was more appropriate for studying a population of winegrowers and wineries with similar yield objectives. However, mixed label wineries are often located in low yield municipalities that apply business models that are not necessarily only based on the grapevine yield objective of this study.

CONCLUSION
The present study proposes a method for selecting theoretical climate and soil factors that may have a significant influence on grapevine yield at the municipality level. By working at such a scale, it is possible to gain more knowledge about winegrowing landscape characteristics that could contribute to future studies on vineyard management practices. Our analysis evidenced 6 relevant factors that explained grapevine yield at R² = 0.546, thus explaining only part of the grapevine yield. Further research should consider longer yield and climate time series (e.g., 30 years) to improve the accuracy of the indicator selection. We opted to perform clustering to help analyse the types of municipality that have similar characteristics in terms of soil and climate. The choice to apply this zoning approach was also motivated by the fact that it provides a basis for formulating R&D recommendations. Based on statistical clustering carried out following the advice of regional wine label experts, we divided the Languedoc-Roussillon region into seven distinct zones that had two contrasting yield gap levels associated with different combinations of indicators related to the limiting of grapevine yield. For each zone, we determined the extent to which the variability could be explained by pedoclimatic factors. Understanding the limiting factors linked to each zone could help local experts to implement adaptation measures in order to avoid or limit grapevine yield loss. In the present study, we showed that environmental factors at this scale can explain a small part of the annual variability of yield, but a large part (> 50 %) of average yield over time. Further research is needed to study the interactions between plant material and farming practices within each zone, as they may also play an important role in grapevine yield gaps at the regional scale.