From parcel to continental scale -- A first European crop type map based on Sentinel-1 and LUCAS Copernicus in-situ observations

Detailed parcel-level crop type mapping for the whole European Union (EU) is necessary for the evaluation of agricultural policies. The Copernicus program, and Sentinel-1 (S1) in particular, offers the opportunity to monitor agricultural land at a continental scale and in a timely manner. However, so far the potential of S1 has not been explored at such a scale. Capitalizing on the unique LUCAS 2018 Copernicus in-situ survey, we present the first continental crop type map at 10-m spatial resolution for the EU based on S1A and S1B Synthetic Aperture Radar observations for the year 2018. Random forest classification algorithms are tuned to detect 19 different crop types. We assess the accuracy of this EU crop map with three approaches. First, the accuracy is assessed with independent LUCAS core in-situ observations over the continent. Second, an accuracy assessment is done specifically for main crop types from farmers declarations from 6 EU member countries or regions totaling>3M parcels and 8.21 Mha. Finally, the crop areas derived by classification are compared to the subnational (NUTS 2) area statistics reported by Eurostat. The overall accuracy for the map is reported as 80.3% when grouping main crop classes and 76% when considering all 19 crop type classes separately. Highest accuracies are obtained for rape and turnip rape with user and produced accuracies higher than 96%. The correlation between the remotely sensed estimated and Eurostat reported crop area ranges from 0.93 (potatoes) to 0.99 (rape and turnip rape). Finally, we discuss how the framework presented here can underpin the operational delivery of in-season high-resolution based crop mapping.


Introduction
The European Union (EU) is the largest global exporter of agri-food products, with exports reaching €138 billion in 2018 (European Commission, 2019). Farmlands are an important feature of the EU landscape with about 42% of the EU's land area dedicated to agriculture. At the same time, the EU's Common Agricultural Policy (CAP) makes up about 37% of the European Commission's (EC) budget (€58.8 billion in 2018). Besides underpinning resilient food production, the CAP could contribute to mitigate climate change, improve the sustainable management of natural resources, and preserve biodiversity and landscapes. This is in line with the European Green Deal, and the Farm to Fork and Biodiversity strategies 1 . Updated EU-detection for large scale crop type mapping remains underexploited, opposed to applications using optical imagery, e.g. Belgiu & Csillik (2018), Immitzer et al. (2016) and Defourny et al. (2019). Recently, the use of high performance cloud computing on platforms such as Google Earth Engine (GEE, Gorelick et al. (2017)) has been demonstrated for detailed continental scale crop-type mapping. Massey et al. (2018) and Teluguntla et al. (2018) used large collections of 30-m optical LANDSAT imagery to respectively classify crops over North America, and over China and Australia.
Besides the recent developments in computational capacity to handle such large data streams and related algorithms, complementary in-situ observations remain essential for training and validation purposes. In the EU, continental in-situ data collection is organised through the Land Use/Cover Area frame Survey (LUCAS) (Gallego & Delincé, 2010;Eurostat, 2018a). In 2018, the Copernicus module -a survey protocol specifically tailored to EO -was introduced (d' Andrimont et al., 2021), providing a unique set of polygons with homogeneous land cover (covering up to 0.52 ha) with land cover as well as crop specific information. Another source of in-situ data is the parcel-level crop declarations by farmers in the context of the CAP, which are increasingly publicly available. Here, we bring these elements together to create the first validated crop type map for the EU at 10-m resolution. After reviewing the pertinent literature in detail, we present the specific objectives of this study.

Crop mapping with Sentinel-1
The SAR S1 constellation provides minimum consistent 6day revisits over the EU since 2016. The negligible dependence of microwave backscatter on atmospheric conditions provides consistent calibrated time series that are particularly wellsuited for automated processing and machine learning. The sensitivity of microwave backscattering to crop canopy structure (Dobson & Ulaby, 1981;Ulaby et al., 1981;Veloso et al., 2017;d'Andrimont et al., 2020a;Meroni et al., 2021), along with its sensitivity to soil surface structure and moisture content (Dobson & Ulaby, 1981) makes it an ideal data source for crop mapping. One of the overall principles in electromagnetic scattering models of crop canopies is that (back)scattering depends on leaf size and shape, relative to the microwave wavelength, their orientation in space, relative to the polarization plane of the microwaves, and the water content of those leafs. Leaf geometry and orientation determines, to a large extent, the preferential (back)scattering direction. This explains the large differences in vertically structured crop canopies (e.g. cereals) versus broad-leaf crops (e.g. sugarbeet, potatoes). It also implies that backscattering changes are expected for phenological stages of the crop that cause structural change in the canopy (ear formation in cereals, fruit formation after flowering). The leaf water content relates to the effective dielectric properties of the canopy, which determines (back)scattering intensity. Vigorous, well-watered crop canopies are more effective scatterers than stressed or senescent crops, which scattering behavior may be modulated by simultaneous change in leaf orientation. Scattering of bare soil is primarily dependent on surface roughness and soil moisture content, which has parallels to canopy scattering because the relative size and orientation of the surface structure is important, as well as row direction of the soil cultivation and soil water content, which determines scattering effectivity via the dielectric properties. It follows that scattering of partially vegetated soil, e.g., at the crop emergence stage, is related to the ensemble of canopy and soil scattering effects and that variation over time is related to the relative contributions of each component. Polarization of the microwaves determines the (Fresnel) reflection coefficients of the canopy and soil material.
Synthetic Aperture Radar (SAR) has been used as a data source for crop type mapping in different studies (McNairn & Brisco, 2004;Kenduiywo et al., 2018;Clauss et al., 2018), as well as in synergistic combination with optical data (Van Tricht et al., 2018;Gao et al., 2018;Kussul et al., 2018;Orynbaikyzy et al., 2020;Chakhar et al., 2021). Despite all the advantages of SAR, relatively few studies have focused on crop mapping based on S1 only. Kussul et al. (2017) and Van Tricht et al. (2018) derived a crop type map from a combination of S1 with optical data over Ukraine and Belgium respectively. Other local studies over the Chinese Fuyu City (Wei et al., 2019) and the French Camargue region (Ndikumana et al., 2018), demonstrate the potential of using S1 for crop type mapping. The consistent availability of S1 observations is particularly relevant in machine learning contexts. Such consistency is difficult to achieve with optical Sentinel-2 data, due to unpredictable cloud cover gaps in a multi-annual monitoring setting. Early results with the use of S1 time series in a deep neural network approach (Lemoine et al., 2019) demonstrated this successfully in a national EU crop mapping context.

In-situ data in the European Union
There is growing interest in using LUCAS in-situ data for land cover and land use research (Pflugmacher et al., 2019;Weigand et al., 2020). However, until the recent harmonization of the five past LUCAS surveys by d' Andrimont et al. (2020b), use of different LUCAS survey data remained difficult. In addition, the LUCAS protocol was designed for EU-wide standardized reporting of land cover and land use area statistics and not for Earth Observation (EO) applications.
Past remote sensing studies that incorporated LUCAS core data have been limited to statistical area estimates (Gallego & Delincé, 2010). The LUCAS sampling frame was used to assess availability of crowd-sourced pictures potentially relevant for agriculture across the EU (d' Andrimont et al., 2018b). Little experience exists in using LUCAS data for large scale land cover generation processes, especially on the usability of the LUCAS data set as training data base for supervised classification approaches (Mack et al., 2017). The 2015 survey was used in several land cover and land use mapping exercises. Close et al. (2018) provided a Sentinel-2 LUCAS-based classification over southern Belgium in the context of Land Use, Land-Use Change, and Forestry (LULUCF) monitoring. Pflugmacher et al. (2019) demonstrated the potential of using LUCAS survey 2015 to map pan-European land cover (13 classes) with Landsat data. Weigand et al. (2020) showed the suitability of the LUCAS 2015 survey as reference information for high resolu-tion land cover mapping using Sentinel-2 imagery at the scale of Germany (7 classes).
In 2018, a new LUCAS module (the Copernicus module) specifically tailored to EO was introduced, see d 'Andrimont et al. (2021). A specific protocol was designed to collect in-situ information with specific characteristics fitting EO processing requirements. As a result, a total of 58,428 polygons are provided with a level-3 land cover (66 specific classes including crop type) and land use (38 classes) information. This represent a unique set of data opening the possibility for applications with higher thematic detail as compared to previous LUCAS surveys, such as crop type mapping.

Objectives
The overall aim of this study is to assess the potential of combining the LUCAS Copernicus module with S1 data to generate an EU-wide crop type map at 10-m spatial resolution for crops in the year 2018. Beyond the production of the map, this study aims to set a benchmark for continental crop type mapping at 10-m spatial resolution. More specifically, we address the following research objectives: • To highlight the untapped potential of the LUCAS Copernicus survey as training data for land monitoring applications in general and crop type mapping in particular; • To demonstrate the value of S1 data for continental crop type map production; • To evaluate the effectiveness of various S1 indices and time periods for the discrimination of specific crop types; • To benchmark accuracies against independent LUCAS core point data and EU farmers' declarations at parcel level; • To determine the crop specific evolution of mapping accuracy throughout the growing season relevant for prospective operational tasks; • To assess crop surface areas derived from the 10-m crop type map against reported EU statistics at subnational level.

Materials and Methods
The study area is the EU-28 as in 2018, covering 4,469,169 km 2 (Fig. 2). A 10-km buffer was applied around the EU-28 to avoid mapping issues at the border. Fig. 1 provides a flow chart illustrating the general approach and specific methodological steps of the study. First, the input data is detailed: S1 in section 2.1 and in-situ data in Section 2.2. Then, Section 2.3 defines the legend used for the study. While Section 2.4 sets out how the features and the parameters for the crop classification models are selected, Section 2.5 describes how the classification is applied at the continental scale. Finally, Section 2.6 presents the three accuracy assessment approaches.

Sentinel-1 data
The S1 C-band operates at a central frequency of 5.404 Ghz in the microwave portion of the electromagnetic spectrum which corresponds to a wavelength of 5.55 cm. S1 acquisitions over the European land mass are in the so-called interferometric wave (IW) mode, which registers the backscattering of a vertically transmitted microwave signal in a vertical and horizontal receiver, creating a VV and VH polarized band, respectively.
Over Europe, S1 acquires data both in descending and ascending orbits, as it does not rely on solar illumination (as opposed to optical sensors). This leads to a doubling of the revisit cycle, though from two distinct sensor viewing configurations due to different azimuth and look directions. Furthermore, the S1 swath width is designed to guarantee gap free coverage at the equator. At higher latitudes, this design leads to considerable overlap between adjacent orbits, and an effective higher revisit, more than daily at latitude above 55 degrees North.

Geocoded backscattering coefficients
Application ready S1 data are accessed on the GEE platform. On GEE, S1 data are processed from the GRD Level-1 product with the S1 Toolbox to remove thermal noise and to create geocoded and radiometrically calibrated backscattering coefficients (σ 0 ) at 10-m pixel spacing.
Due to the side-looking nature of the SAR sensor, the incidence angle varies over the swath in the range direction, for S1 IW mode between 25-40 degrees. Normalization of the incidence effect is normally desired, and simple corrections can be applied to convert σ 0 to γ 0 . A more robust approach would be to apply a correction that takes the local terrain height variation into account, known as terrain flattening (Small (2011)), which can also partially compensate for the look angle differences in ascending and descending orbit acquisitions. However, terrain flattening is not applied in the GEE processing workflow and our analysis is thus based on σ 0 as most of the crops are typically on flat areas and the steep slopes are masked out as described in Section 2.5.
To be ingested in the machine learning classification algorithm, a spatially and temporally consistent time series needs to be derived from the S1 VV and VH σ 0 scenes. In addition, the backscattering coefficient ratio VH/VV (cross polarization ratio, CR) is computed from the VV and VH scenes. Several studies highlighted the advantages of the CR for mapping and monitoring of crops (Veloso et al. (2017); Meroni et al. (2021)). We proceeded as follows to generate the required archive: 1. Edge masking. The edge of each scene should be masked before the compositing. This is done by masking groups of neighboring pixels with values lower than -25 dB in the VV polarization (as described in d' Andrimont et al. (2018a)). 2. Averaging over a 10-day period. σ 0 values are then averaged over subsequent time periods of 10 days for each pixel and for all ascending and descending acquisitions available in that period, separately for the VV and VH polarizations. The averaged σ 0 value is converted to decibels (dB). 3. Computing the cross polarization ratio. The CR is computed for each scene and averaged to the same 10-days periods.
In this way, we generate a regularly timed, gridded, equal sized set of temporal features for VV, VH and CR to use as input for the subsequent machine learning runs (n = 36 10-day composites over the period 2018-01-01 to 2018-12-31), independent of the number of actual S1 acquisitions in the 10-day period. The S1 data were processed in GEE and then downloaded.

LUCAS 2018 in-situ data
This section describes the LUCAS 2018 survey data that were used for training the classification models and assessing the accuracy of the crop type map.

LUCAS 2018 survey
The LUCAS 2018 survey collected 97 variables at 337,854 core points. Most of the points surveyed fall in a homogeneous area for which the minimum mapping unit is about 7 m 2 (a circle of 1.5 m radius). When the land cover is not homogeneous, for example when it is composed of trees or shrubs interspersed with grass, the scale of observation is extended to classify it. In these cases, a systematic observation of the "environment" in the vicinity of the point, which in LUCAS is called the extended window of observation, has to be adopted. The extended window of observation expands to a radius of 20 meters from the point (representing an area of 0.13 ha) for forest and shrublands. Detailed information about the survey can be found in Eurostat (2018b). The land cover surveyed is classified according to a harmonized 3-level legend system Eurostat (2018c). Additionally to the core variables collected, other specific modules were carried out on demand on a subset of points, such as (i) the topsoil module and (iii) the grassland module. The LUCAS 2018 core data are available in an harmonised open database in d' Andrimont et al. (2020b).

LUCAS Copernicus module
The LUCAS 2018 Copernicus module was applied to a subset of points to collect the land cover extent up to 51 meters in four cardinal directions around a point of observation, offering in-situ data compatible with the spatial resolution of highresolution sensors (see d 'Andrimont et al. (2021) for the open ready-to-use dataset). The LUCAS Copernicus dataset contains 63,287 polygons at level-2. When filtering the data for which a level-3 legend is available, 58,426 polygons with a level-3 land cover (66 specific classes including crop type) and land use (38 classes) are available. The minimum mapping unit (MMU) of the in-situ data is 78.53 m² (i.e. a circle of 5-m radius) as the Copernicus module survey is not executed for smaller areas. The data from the LUCAS Copernicus module is used to train the classification models.

Legend definition
The LUCAS survey legend is re-organised to serve the purpose of the study (for details of the LUCAS legend classification see Eurostat (2018c) and Supplementary material). As the objective of the study is to classify the main EU-28 crop types based on LUCAS in-situ data, the class definitions differ slightly from the LUCAS legend. The four vegetation classes of LUCAS (B-cropland, C-woodland, D-shrubland and E-grassland) are organized here in three main classes: arable land, woodlands and shrubland, and grasslands. The arable class is further divided in 19 specific classes of crop types or crop groups. The final legend is presented in Table 1.
The main differences with the LUCAS legend are the following. The woodlands and shrublands classes are regrouped in a single class. The LUCAS cropland (B) class is re-organized to keep the arable land classes separated from the grasslands and permanent crops. First, the temporary grasslands (B55) are grouped with the grasslands class (E). Second, the permanent crops (B70 and B80) are grouped with the woody vegetation class. Third, the bare arable land (F40) with an agricultural land use (U111/112/113) is regrouped with the arable land classes. The grassland class encompasses grasslands from natural and agricultural land uses.

Crop type classification
This section describes the training data (input reference data and classification features) selection and the parametrization of the classification models used for the continental crop type mapping.

Reference training data
The classifiers are trained with the in-situ information collected in the LUCAS Copernicus module. The polygons corresponding to the LUCAS labels mentioned in Table 1 are selected and reclassified according to the legend of the study. In total, 58.178 polygons from the 63.287 Copernicus polygons available are selected to train the classifier.

Stratification
To take into account climatic and ecological spatial gradients, the EU-28 is stratified in two strata based on the main biomes defined by Dinerstein et al. (2017). One main continental northern stratum and one southern Mediterranean stratum (see map in Fig. 2 More detailed stratification schemes (e.g. country-or a regional-level) have been adopted in previous crop type mapping exercises in Europe (Defourny et al., 2019;Wang et al., 2019). However, this is feasible where crop type in-situ data are abundant, e.g. provided by national crop data survey. In this study, we had to face a trade-off between stratification detail and sample size available for the resulting strata and we opted for a simple and broad stratification.

Two-phase classification
For each stratum, a two-phase classification procedure is followed. The first step distinguishes between five broad land cover classes: Artificial land, Arable land, Woodland and shrubland type of vegetation, Grassland and Bare land (Level 1 in Table 1). In a second step, the Arable land class is further classified into 19 different crop types or crop groups (Level 2 in Table 1). Therefore, a total of four supervised classification (2 strata x 2 hierarchical levels) models are trained.
For the supervised classifications, we trained Random forests (RF) models on the S1 time series to map level 1 and 2 classes. RF is an ensemble machine learning method known to be robust against multi-colinearity and overfitting (Breiman, 2001) and currently is a standard classification algorithm in the remote sensing domain (Belgiu & Drȃguţ, 2016). RF was reported to yield classification accuracies comparable to more sophisticated algorithms such as support vector machines, but with a much lower computational complexity (Inglada et al., 2017). RF models are also stable with respect to the choice of parameters (Pelletier et al., 2016) and easy to implement, making them suited for operational processing chains (such as Sen2-Agri (Defourny et al., 2019)).

Features selection
In order to produce a single crop type map at the continental map, the most significant features in term of S1 backscatter and time period are selected. To do so, different combinations of the 10-day averaged S1 indices, and different time periods from January to December are evaluated using the training reference dataset.
The first part of the features selection evaluates, throughout the growing season period, the progression of the overall accuracy (O.A.) of the classifications and the F-score for the main crop types. The accuracy is evaluated using hindcasting with monthly time step from January to December, thus extending the analysed period beyond the end of the average European time of harvest. For the two classification levels and in the two strata, RF models are trained for twelve different time period using a 80/20 split of the reference (polygon) training data. The resulting O.A. and F-score are evaluated at the continental and the stratum level to select the best time period. From the analysis presented in section 3.1, we consider the period from January to July (included) as a good trade-off between accuracy and timeliness. The period of January to end of July is used for all further processing.
Second, the performance of the two S1 σ 0 (VV and VH), one index (VH/VV), and their combinations are evaluated in terms of overall accuracy over the period January to end of July (included) .

Classification at scale
A final map is produced at the scale of EU-28 using the best features defined by the previous analysis. This final map de- rived with S1 information from January to the end of July is referred to as the EU crop map in the manuscript. The RF models are trained for the two strata with the best features in term of indices and time period.
A random search cross validation is used to define the best hyperparameters for each of the four RF models (level-1 and level-2 for each strata). We are tuning seven parameters (Supplementary Table S3) and define a grid of ranges, and randomly sample from this grid of possible parameter values to perform a 3-fold cross-validation with 100 possible combination of values, totalling 300 fits. Among the seven parameters, two are considered as key parameters: the number of decision trees (Ntrees) and the number of features to consider when looking for the best split when growing the trees (Mtry) Belgiu & Drȃguţ (2016). For Ntrees, we defined a range from 300 to 1200 with a step size of 100 and and for Mtry, we tested two options: the square root of the number of features and the binary logarithm of the number of features.
The RF classification algorithms and the hyperparameter tuning is implemented using the Python's scikit-learn (Pedregosa et al. (2011)) packages RandomForestClassifier and Random-izedSearchCV. The final hyperparameters used for each RF model are summarized in the supplementary materials (Supplementary Table S3).
The classification was performed at the continental scale on the JRC Big Data Analytics Platform (BDAP) using an HTCon-dor environment (Soille et al. (2018)).
The final map is further processed for distribution. The main focus of the map being the arable land, areas located at an altitude above 1000m and on a slope of more than 10 • is masked out. The ALOS Global Digital Surface Model "ALOS World 3D -30m (AW3D30)" is used to mask steep slopes. In addition, classes that were poorly represented in the LUCAS Copernicus module and consequently poorly classified (i.e. built-up, water, wetlands, and bare lands classes) are masked out with auxiliary products. Built-up areas are masked with the JRC GHSL European Settlement Map for 2015 at 10 m (Corbane et al., 2020;Sabo et al., 2019). Water is masked with the permanent water from the 2019 version of the JRC Global Surface Water product (Pekel et al., 2016). Then, a number of classes are masked with the Corine Land Cover 2018 map (EEA, 2018) (inland and coastal wetlands, bare land, moors and heathlands) see Supplementary Table S4 for details.
The EU crop map is re-projected to the Lambert azimuthal equal area (ETRS89-LAEA, EPSG:3035) projection for the analysis.

Accuracy assessment
Three approaches to estimate the accuracy of the EU crop map are tested. The first approach is taking as reference data the high quality LUCAS core points not surveyed by the Copernicus module. The second approach is comparing the EU crop map with a selection of GSAA data based on farmers' declarations. The third approach compares the area of several main crops obtained from the EU crop map to the corresponding official subnational statistics.

LUCAS 2018 core points
In the first approach, validation was focused on all the vegetation classes (i.e. crop types, woodland and shrubland, and grassland). To obtain data relevant for the assessment, LUCAS core points were filtered to keep only high quality information. While the LUCAS core points were not initially designed for EO analysis, Weigand et al. (2020) showed that the data can be used in land cover applications although no detailed analysis for crop type mapping applications were done to date. We therefore defined four criteria to select only high-quality points suitable for crop type mapping applications: keep only direct and in-situ observations, remove parcels smaller than 0.1 ha and only keep data with an homogeneous land cover class. In addition, the subset of (63,364) core points also surveyed for the Copernicus module and used for training were not included in this validation set. This screening resulted in a total of 87,853 LUCAS core points covering the selected vegetation classes (spatial distribution overview in Figure Supplementary Fig. S4).
The reference data is used in combination with the EU crop map to report the confusion matrix, the overall, user, and producer accuracy. LUCAS is a two-phase sampling scheme. The LUCAS 2018 survey followed a stratified random sample design (Scarno et al., 2018). The first-phase is a systematic sampling scheme and we treat the second-phase, the 2018 samples, as collected under a stratified random sampling, due to the large amount of collected data. Accordingly, the accuracy metrics are reported based on the estimated area proportion of correctly and mis-classified classes according to the equations of Olofsson et al. (2014) for a 95% confidence level. The accuracy metrics are provided first for a set of grouped crop classes (Table 1) and second considering all the classes.

Comparison with GSAA
In the second approach, a subset of the classes are assessed and compared wall-to-wall with vector parcels from the GSAA for specific regions. The GSAA refers to the annual crop declarations made by EU farmers for CAP area-aid support measures. The electronic GSAA records include a spatial delineation of the parcels. A GSAA element is always a polygon of an agricultural parcel with one crop (or a single crop group with the same payment eligibility). The GSAA is operated at the region or country level in the EU-28, resulting in about 65 different designs and implementation schemes over the EU. Since these infrastructures are set up in each region, at the moment data is not interoperable, nor are legends semantically harmonized. Furthermore, most GSAA data is not publicly available, although several countries are increasingly opening the data for public use. In this study, six regions with publicly available GSAA are selected representing a contrasting gradient across the EU ( Table 2). The selected regions along with the acronym used in this study are: bevl2018 (Flanders in BE), dk2018 (DK), frcv2018 (Centre -Val de Loire in FR), nld2018 (NL), nrw2018 (North Rhine-Westphalia in DE)) and si2018 (SI). The six selected regions with GSAA data have comparable sizes, ranging from 401,833 parcels in Flanders to 659,302 parcels in the Netherlands, facilitating the comparison. As each national GSAA records set can contain more than 300 distinct classes, only the parcels for crop classes covering at least 1% cumulative GSAA area of a given region were selected (see Supplementary Fig. S2 for the distribution and Supplementary Fig. S3 for the count distribution). Excluding marginal classes with percentage cover smaller than 1% results in a coverage of the total area ranging from from 84% (Denmark) to 92% (Slovenia) of the total parcels for each respective GSAA. In total, 3,149,334 parcels were extracted covering 8.21 Millions of hectares. Of our pixel-based classification predicted classes within each GSAA parcels, we extracted the class mode and compared it to the declared crop, mapped to the LUCAS legend (semantic mapping in Supplementary Table S10). See also the Discussion Section 4.2 about semantic harmonization of GSAA.
For each of the selected parcels, the majority predicted class was computed for the enclosed pixel ensemble and then compared to the declared crop label, mapped to the LUCAS legend. Then Producer Accuracy (PA) and User Accuracy (UA) were computed. The extraction of the data was implemented with the Python "Checks by Monitoring" (CbM) package (https:// pypi.org/project/cbm/). This package provides Big Data Analytics routines for parcel crop monitoring using the Sentinel sensors and cloud compute infrastructure.

Comparison with reported subnational statistics
In the third and final validation approach, the area of the crop obtained from the EU crop map is compared to the official subnational statistics for the main crops in EU-28. In order perform do the comparison, subnational area data is obtained from the "Area (cultivation/harvested/production) (1000 ha)" 2 by NUTS 2 regions. The legend of the Eurostat crop classes is then semantically matched with the EU crop map legend as described in Supplementary Table S17.
The areas estimated from the EU crop map are obtained with a zonal histogram per administrative zone and then compared to the one reported by Eurostat at the finest spatial detail available (i.e. NUTS2 regions for most of the EU-28 except for the UK and DE where only NUTS1 is reported). For each crop with enough data available, the Pearson correlation coefficient is calculated. Additionally, the relative percentage difference in area is computed for each administrative region and mapped, highlighting the difference between the reported and here estimated surface area for each crop. The distribution of these differences for each crop is finally summarized to draw conclusion at the crop level. 3. Results

Features selection
The reference data consists of S1 time series retrieved from 58,178 polygons derived from the LUCAS Copernicus polygons corresponding to 2,296,889 S1 10-m pixels (Supplementary Table S1). The training data are distributed into 23 thematic classes and two geographical strata (Str1 and Str2) as described in the methodology. Each of the pixels thus have timeseries of synthesized 10-day -or 36 dekads per year -with VV, VH and VH/VV data. The VV and VH temporal σ 0 signatures for the 12 main crops are averaged per stratum and presented in Fig. 3 for the crop growing season. Fig. 4 summarizes the overall accuracy through time. An initial steep increase in accuracy is observed from February to June when only very small increases are achieved by adding more data. By the end of July the accuracy reaches a value of 78.8% (Fig. 4).
The dynamic of the Fscore through time for each crop separately (Fig. 5) shows expected patterns. The rape and turnip rape Fscore is booming when the crop is flowering in May (Fscore 0.86) indicating that the model could identify them quite early in the growing season (March). The figure also shows that common wheat has a steep increase in Fscore in April which coincides with the reproductive growth period. Grain maize, a summer crop which is harvested later in the season than winter wheat and rape and turnip rape, is reaching the plateau in Fscore at the beginning of July. Interestingly, the Fscore for sugar beet continuous to increase until October which is logical as sugar beet is harvested in late autumn in Europe. Finally, as expected, we observe a very low Fscore value for mixed classes (such as other fodder crops).
As detailed earlier, and considering also any future operational application, we opted for using time-series from January until the end of July for further processing (evaluation of indices and final training of the RF algorithms).
From the comparison (Table 3) of the overall accuracy for the polarization backscattering coefficients (VV and VH), the cross-ratio index (VH/VV) along with their combinations, we observe that the combination of VV and VH gives the highest accuracy (79.89 %) and is selected for the EU crop map. The training data for the EU crop map therefore consists of 10-day time series of VV and VH backscatter coefficients from 1st of January to 31st of July (22 dekads).  Fig. 6 is representative of a number of crop types and main vegetation classes and covers 91 Mha of cropland (9,174 M 10 m pixels). The map is available for download and visualization (see section 7).
The resulting map is consistent over different landscapes at the level of the main land cover classes. A naive area estimation by pixel counting for each class is provided in Supplementary Table S5. Woodlands and shrublands are regrouped in a single class covering more than 2 M km 2 ; i.e. more than half of the area mapped because it is a very broad class covering broadleaved and needleleaved natural and managed forests as well as permanent crops. The arable land class has an extend of about 0.9 million km 2 . The grassland class, grouping the permanent and temporary grasslands covers 0.7 million km 2 . The woodland and shrubland class covers forests patches but also tree lines and hedges in agricultural landscapes.
Over the EU-28, the three most detected crop types in term of area are common wheat, maize and barley. Despite using a pixel based classification approach, separated parcels can clearly be recognized. The difference in the size of parcels is well captured, as illustrated in Fig. 7, by comparing Fig. 7 (b) Austria and the Czech Republic Fig. 7 (c). The crop types are identified over the EU without a priori knowledge of country practices. The ability of the map to represent different agricultural landscapes is well illustrated in the examples of Fig. 7 for North of Orleans, France (a) where the diversity of crop types cultivated in the area are distinctly classified (wheat, barley, sugar beet, maize, potatoes, rape and turnip rape, dry pulses). North West of Valladolid, Spain Fig. 7(c) the major crops cultivated in Castilla y Leon: barley and sunflower are clearly detected together with some parcels of wheat, dry pulses and bare arable land. In Romania, North East of Bucharest Fig. 7 (d), the diversity of small and large parcels of wheat, maize, rape and turnip rape, sunflower and barley are well captured. 3.2.2. F-score per crop type and administrative unit As agricultural practices vary across Europe, in terms of crop type cultivated and crop calendar, we expect our mapping approach using a minimal stratification to represent the field reality with different fidelity in different areas. To investigate this aspect, a first assessment of spatial discrepancy for arable land class accuracy is done by calculating the F-score at country level, when enough LUCAS points were available, for the six main crops over EU, i.e. common wheat, barley, maize, sugar beet, sunflower, rape and turnip rape. This allows to quickly grasp how well the classification is performing spatially as highlighted in Fig. 8 (see Supplementary Table S9 for number details). The F-score for most crops is higher in the northern part of the study area than in the southern part. The F-score for common wheat is about 0.7 for Belgium, Bulgaria, Czech republic, Denmark, France, Luxembourg and Slovakia. Maize reaches high scores in the North (above 0.8 for Germany, Czech Republic, Hungary and Slovenia) and rape and turnip rape have an F-score higher than 0.9 in Austria, Czech Republic, Germany, Lithuania, Luxembourg, Latvia, Poland and Slovakia.

Accuracy Assessment 3.3.1. LUCAS core points
The accuracy of the EU crop map is evaluated using 87.853 points selected from the LUCAS 2018 survey. An overall accuracy of 76.1% (Supplementary Table S2) is achieved over the EU when considering the 19 crop type classes with the woodland and shrubland and grassland classes. An accuracy of 80.3% (Table 4) is achieved when grouping the crops by the main crop type classes (groups defined in the Table 1). The woodland and shrubland class is well discriminated from the rest (PA of 97% and UA of 83%). A confusion is observed between the grasslands class and the woodland and shrubland class, as highlighted by a PA of 58% for grasslands indicating an omission error. The UA of grassland is 82% indicating a commission error for grassland and again a confusion between In terms of land cover some classes of grasslands are very similar to open canopy class of woodland and shrubland. The point information provided by the LUCAS core dataset is also not the most suitable to assess more complex land cover classes characterised by discontinuous canopies.
Regarding arable land, cereals are well discriminated, reaching a producer accuracy (PA) of 84% and user accuracy (UA) of 72%. Omission errors for the cereals class are mainly occurring with the woodland and grassland classes while commission errors are happening mainly with grasslands. Root crops have a lower PA of 59%, with many points labelled as root crops being mapped as cereals instead. However the root crops present a high UA of 90% showing little commission errors with the other classes. The non permanent industrial crops have high PA (89%) and UA (79%) with good discrimination of sunflower and rape and turnip rape. The fodder crops class including cereals and leguminous cultivated for fodder is not well discriminated and is confused with the cereals and grassland class (PA of 7% and UA of 29%).
Regarding specific crop type (Table 5), the LUCAS accuracy assessment highlights a difference between the northern and the southern stratum with an OA of respectively 78% and 70.8%. The overall best-performing crop classes are rape and turnip rape, sunflower and maize. In the northern stratum, common wheat, maize, sunflower and rape and turnip rape have PA above 80% but the value of their UA is lower, indicating commission errors. The commission error is specifically high for the common wheat.

GSAA
The LUCAS core points are giving a first evaluation of the accuracy of the crop type at the continental level. A more detail assessment is done with the GSAA data for specific countries or regions. For each region, the classified crop of each parcel from a crop representing an area greater than 1% of the region is compared to the one from the GSAA. After removing the grassland and woodland classes the confusion matrices are computed. The confusion matrices along with the PA and UA for each region are presented in 7 see (Supplementary Table  S11 Fig. 9a and Fig. 9b respectively. The analysis shows very promising results for the detection of common wheat, barley, maize, sunflower and rape and turnip rape in all the regions analysed. Rape and turnip rape are detected without almost any confusion in the three regions where it is present (dk2018, frcv2018 and nrw2018), reaching PA above 98% and UA above 97%. It is worth noting that common wheat and barley have much better accuracies for both PA and UA compared to the LUCAS points based assessment, with the exception of si2018. Except for nrw2018 and si2018, common wheat has a PA above 80%, while the UAs are above 92% in all regions except si2018. In nrw2018, the most common confusion for common wheat are omission error towards barley and triticale classes. On the other hand any parcels of triticale in GSAA are correctly mapped as triticale. The confusion of common wheat with triticale is due to the similarity of the two crops, both in general crop structure and seasonality. In si2018, omission errors are observed towards barley and the other cereals class. Barley is present in the six regions with a PA above 92% except for dk2018 (88%) and nld2018 (82%). In Denmark, this is explained by a confusion between common wheat and barley while in nld2018 the confusion is mainly with maize. The UA for barley are a bit lower than the PA value but still above 80% for bevl2018, dk2018 and frcv2018. Both in nrw2018 (UA 76%) and nld2018 (UA 64%), the main commission error is due to a confusion with common wheat. Sugar beet presents very promising results in bevl2018 (PA 97% and UA 93%) and nrw2018 (PA 84% and UA 85%). In bevl2018, dk2018 and nld2018 where UA are lower, the confusion is mainly with maize and to a lesser extent with potatoes. The class potatoes, presents in four regions, shows high PA (91% in bevl2018, 96% in dk2018, 95% in nld2018 and 94% in nrw2018) and relatively low UA. In nrw2018, dk2018 and bevl2018, the UA (respectively 47%, 54% and 50%) of potatoes is explained by commission errors with maize. However, the UA of maize is showing little commission errors (99%). In nld2018, confusion is happening with maize and sugar beet and probably linked to the time period made available to the classifier. Maize and sugar beet maturity phases are reached in autumn, between September and November. A better discrimination would thus be expected later in the season.

Subnational statistics
The crop area data is extracted and compared for 220 administrative regions covering most of the EU-28. The matching with the reported statistics was done at the NUTS2 level (Supplementary Table S17). For DE and UK no data is available at NUTS2 level, the comparison was thus done at NUTS1 level for which statistics are available. Fig. 10 shows how the area estimated by the EU crop map compared with the area reported by Eurostat. The calculated Pearson's correlations (r) range from 0.93 for potatoes and rye to 0.99 for rape and turnip rape. Additionally, the relative difference in area (%) is computed for each region and presented in Supplementary Fig. S5 (Supplementary Fig. S6 provides the histogram of the differences for each crop). The maps show that while for common wheat and maize the EU crop map estimations are generally higher compared to Eurostat without a distinct spa-tial pattern in observed differences, this is not the case for other crops, whose overestimation and underestimation are more localised. Barley for example is underestimated in the center of Europe but overestimated in both the north and the south of Europe. The distribution of the difference shows that common wheat and maize are underestimated in our EU crop map compared to Eurostat. Barley is under-estimated in the northern part and over-estimated in the southern part. The other main crop area estimations are close to the official Eurostat statistics.

Timeliness and accuracy of the crop type classification
The final map presented in this study is built with S1 observations from January to end of July. The overall accuracy at the end of July across all crops is relatively high at 76%. As the season progresses, discrimination of crop types and parcels keeps improving as shown in Fig. 11. However, for an operational application specific crops can be mapped earlier during the season depending on the specificity of the region and requirements regarding accuracy for specific crop types. The timing when the F-score is highest for each specific crop is linked to canopy structure and phenological stage as these affect the backscattering signal. For in-season applications, a trade-off can be found between the required level of accuracy and timely map production.
While covering only 2018, the current study shows potential for developing in season prediction of crop type and confirms the observations of (Veloso et al., 2017; You & Dong, 2020) at the EU scale. A key element is that the S1 SAR signal is not affected by atmospheric perturbation. Any acquisitions can thus be used for crop type mapping. Although dependent on year to year phenological development, findings regarding the best timing for specific crop types are likely to be valid for future studies. Supplementary Fig. S7 shows that the time series of VV and VH backscatter for specific crop types have a similar shape in the northern and the southern stratum but often appear shifted in time. This relates to phenological development that occurs later when going northward. As a consequence, the F-score for the same class may peak at different times as revealed by the analysis of the temporal evolution of the F-score in the two strata.
In the northern stratum ( Supplementary Fig. S1), the F-score reaches a plateau at the end of June for common wheat, rape and turnip rape. Other cereals (rye, barley and maize) can be predicted with a high F-score at the end of July, even if the Fscore keeps increasing slightly until the end of the summer. The F-score for sunflower, potatoes and other fodder crops reaches a plateau one month later, at the end of August, while sugar beet reaches the plateau in September, in agreement with its longer growth cycle that extends into autumn. Winter wheat and rape and turnip rape that are already established in the field at the beginning of the year, present a structure, reflected by the VV and VH backscatter, that is already different for the other classes earlier in the year.
Similar results were found in other studies exploring the potential of S1 for crop type monitoring in northern European countries. For Belgium, Van Tricht et al. (2018) illustrates that while winter cereals were already well discriminated at the end of June, end of August was more appropriate for potatoes, maize or sugar beets. For France, Veloso et al. (2017) showed that S1 backscatter clearly separates winter wheat and rapeseed between March and July during the tillering and senescence stages and distinguishes maize, soybean and sunflower during the heading/flowering phase between June and the end of August.
In contrast, the highest accuracies are achieved earlier in the Mediterranean stratum. Already at the end of June, plateaus in F-score are reached for the accuracies of wheat (common and durum wheat), rape and turnip rape, other fodder crops, and the mixed class "other non permanent industrial crops" (including e.g. cotton, fibre and oleaginous crops). At the end of July, the plateau is reached for barley, sunflower and root crops while it is reached end of August for maize. The earlier time at which the highest accuracies are achieved in the southern stratum can be explained by typical timings of the growing season in the     two strata. As shown in ( Figure S4) Meroni et al. (2021) there is latitudinal gradient in the timing of the growing season with earlier growing seasons in the south of Europe. This gradient justifies the observed lag in achieving the best performances in the two strata To assess the accuracy of the EU crop type map produced at the end of July, three different and complementary approaches were presented in the Section 3.3. We discuss here the pros and cons of each.
The first one using high-quality LUCAS points (Section 3.3.1) enables an accuracy assessment for the whole continental area but on the other hand the protocol to collect the in-situ information is not designed for EO applications and we filtered out "low-accuracy" points, working only with a subset of the 2018 survey. In addition, the point information is difficult to reconcile with the assessment of crop type classes that present spatial heterogeneity in a small area and especially where only few acquisitions contribute to the 10-day average (e.g. in the South). Speckle filtering, not used in the present study, may attenuate such undesired heterogeneity.
The information about the crop type is then assessed more precisely with the GSAA (Section 3.3.2). The main differences between the LUCAS point based and GSAA parcel based accuracies are due to the parcel aggregation of the pixel classifi- cation within the GSAA parcel, which uses a majority filter. In this way, the analysis is not deteriorated by possible geometric inaccuracies that can affect the LUCAS points. In addition, speckle-induced variation affecting the pixel classification result within the parcel is removed. However, the GSAA assessment is limited to specific areas in the northern part of the study, due to a lack of openly accessible GSAA sets for 2018.
Finally, the comparison with the official statistics (Section 3.3.3 ) has the obvious advantage of covering the whole area of interest but the drawback of being a rather coarse evaluation of the classification as the comparison with official statistics can only be performed at some aggregate administrative spatial level.
Useful observations can be derived from the comparative analysis of of the three complementary level of accuracy assessment. For instance, some specific errors are detected with all approaches.
From the comparison with official statistics (Fig. Supplementary Fig. S5 and Supplementary Fig. S6) we observe an overestimation of common wheat in the northern part of our study. Similarly, for common wheat, an important commission error with other cereals classes is depicted by the LUCAS points assessments. The same confusion (common wheat versus barley and triticale) is reported by the GSAA assessment even with a smaller severity.
An over-estimation in the northern stratum is also observed for maize. This is consistent with what is observed in the accuracy assessment based on the LUCAS point(a low UA associated with an important commission error). In contrast, this is not reported in the GSAA assessment where the high UA of maize illustrates a low commission error for that class. Contrastingly, an under-estimation of maize is observed in the southern part of our study area. This is in agreement with the omission errors reported in the LUCAS assessment. We however lack of the GSAA comparison in the South to confirm this observation.
While barley is over-estimated in the southern part of the study, it is under-estimated in the northern part according to the official statistic assessment. The underestimation of barley in the North is in agreement with the LUCAS point assessment reporting low PA and UA. In contrast, the GSAA assessement reports PA are above 80% in all GSAA analyzed, not confirming an omission error. An hypothesis is that the GSAA assessment filters the pixels that are not yet detected as barley at the parcel level. The overestimation of barley in the South is confirmed by the LUCAS point assessment.
The confusion observed at the level of the study area with the LUCAS points assessment (Table 4) between the cereals and the root crops and for the GSAA between sugar beet and maize and potatoes is directly related to the time period (January to end of July) selected for EU crop map. The evolution of the VH backscatter coefficient in time Supplementary Fig. S7 illustrates what is also visible from the F-score Supplementary  Fig. S1, that the discrimination between broad leaf crops, cereals and maize increases from June onwards in the Northern stratum. Extending the period of a few month would alleviate these problems.

Recommendation for an operational service
More than 6 years after the successful launch of the first Copernicus satellite, S1 A, Europe still lacks access to a full, free and open archive of Level-2 Copernicus Application Ready Data for the S1 (A and B) sensors (S1-CARD). Whereas Sentinel-2 Level-2 data is produced as a routine ESA output, this is not the case for S1. Level 2 S1-CARD generation is a required expensive processing task that should be done once with the results made available to the community.
The scale of this study requires the use of advanced cloud computing solutions closely coupled to the massive on-line data archives of Sentinel data. The requirements for such cloud solutions are that they provide access to application ready data (ARD) and a programming interface that allows the application of advanced machine learning algorithms, for instance, in support of supervised classification, as was applied in this study. GEE is currently the only platform that provides access to consistently processed S1 ARD (backscattering coefficients only, for now), which is processed with a standard recipes using the open source SNAP s1tbx software 3 . Nevertheless, although GEE offers a number of routines for supervised classification (including RF used in this study), it has some limitations in the hyperparameter tuning along with some processing issues when the model is applied to very large areas. This is why the pre-processing of the S1 data into 10-day temporal synthesized mosaics was done in GEE while the results were downloaded on a local facility to perform the classification. For this study, this represented about 10 TB of data. This data transfer is clearly sub-optimal and, in addition, while GEE is free to use for non commercial application, the download of large amount of data has to be done via a paid Google Cloud account. In our study, we have processed the annual volume of data for 2018 in a post-hoc fashion (i.e. the classification was done off-line after the growing season). However, with the effective use of cloud based solutions, our method can be run in quasi-real time, for instance as a combination of transfer learning with early season in situ collection and GSAA for training and testing. Cloud based capacities allow integration of daily acquisitions in 10-day compositing with automatic re-run of the ML classifiers. Deep learning approaches that learn to extract discriminative temporal information from the data, such as convolutional Neural Networks and Recurrent Neural Networks could provide a more robust alternative to RF and should be considered for further studies.
Finally, it is interesting to reflect on the actual utility of a EU crop map against other potential sources. We in fact acknowledge that there would be no need for a post-harvest crop map with 70-85% accuracy for only a limited set of major crops if the annual GSAA declaration set is available in July, with a typical accuracy of 90-95% (for a far more significant set of crops). However, so far, most of the EU Member States' GSAA are not openly accessible and, when this is the case, they are usually available after the growing season (the Netherlands being an exception, with a preliminary version published in July of the current year). Recently, the Integrated Administration and Con- Fig. 11: Crop type mapping results for different time periods, considering the data from January to respectively April, May, June, and July (the EU crop map presented in this study) in South East Germany.
trol System (IACS) spatial data sharing initiatives 4 is promoting EU member states to share their IACS data within the infrastructure for spatial information in Europe (INSPIRE) portal 5 . The expected outcome of this new regulation is the expanded public availability of GSAA.
While LUCAS is performed every three years, the GSAA data are collected every year in Europe, making them and ideal source of ground data for crop classification training and validation. Beside the legal and technical challenge to make the data open access already discussed, the legend semantic harmonization is key to use in classification approaches across different regions. While the semantic matching was done for this specific study, a common semantic model would be very useful to integrate GSAA in a more detailed EU-wide classification context.
Besides LUCAS and GSAA, other information source could be considered. The use of crowd sourcing to collect in-situ data and generate LUCAS-like information could enrich data availability (Laso Bayas et al., 2020). The use of geo-tagged street-view pictures has been already investigated for agriculture monitoring (d' Andrimont et al., 2018b) and could represent an alternative in-situ dataset. Application like Pl@ntnet (Joly et al., 2016) or i-naturalist (Van Horn et al., 2018) collect massive plant and species occurrence information via their portal. Finally, leveraging mobile gaming solutions such as the impact games (https://world.game/) could represent an additional alternative to collect massive in-situ data.

The EU crop map 2018: perspectives for improvement
There are a number of limitations in the proposed mapping approach. Some are coming from the lack of information in the LUCAS survey and other from methodological choices.
From the LUCAS data, the limitation comes mainly form the crop type surveyed, the number of samples per classes and details about the agriculture practices such as irrigation. We did not distinguish between rain-fed and irrigated parcels due to a lack of samples in each category. This particularly affects the maize detection in the South of Europe where most of the maize is irrigated. The Supplementary Fig. S5 shows that the maize is indeed underestimated in Spain, southern Italy and Greece. Another limitation is that the approach proposed doesn't capture rice. While that class is usually easy to discriminate from other crops, it is here missed due to the lacking of training data (i.e. 14 polygons). The west area of northern Italy is particularly affected, with a confusion between rice and maize. Future studies could consider using information about the water management available in the LUCAS survey. Better understanding and taking into account on the backscatter signal of irrigated surfaces would also certainly improve the discrimination of irrigated crops.
In term of methodological choices, improvements are possible taking into consideration crop classes not considered in this study. Specifically, permanent crops are covering a considerable extent of agricultural land and are currently classified in the broad class Woodland and shrubland. The Copernicus information for the two classes of permanent crops (fruit trees and the other permanent crops such as olive and vineyard) can be used to explore classification workflows identifying them. Distinction between temporary, permanent and natural grasslands would also be worth considering in future developments which would include multi-season temporal profiles.
Although we have not considered using S1 coherence in this study we are well aware of the potential use of this alternative Level 2 product in agricultural use contexts (e.g. Tamm et al. (2016); De Vroey et al. (2021)) 6 . We realize that a future extension of S1 use in national or continental crop mapping requires the introduction of terrain flattening to better compensate for incidence angle and look direction differences in the multi-orbit compositions for 10-day periods. Although we have masked terrain with steep slopes (> 10 • ) above 1000 m limiting these effects somewhat, we expect improvements especially for crops with an extended bare soil phase (e.g. summer crops) for which angular variability in backscattering can be significant. We have not applied speckle filtering, because speckle variance is already reduced in the averaging step we apply to create the 10-day composites and using polygon averaged values for the RF training. However, the sparser revisit frequency in the Southern areas makes speckle filtering potentially useful. Furthermore, we could consider speckle filtering with a shorter averaging period (e.g. 5 days) in the North, to better capture abrupt temporal changes in the σ 0 signatures (e.g. at harvest).
This study demonstrates the potential of S1 as a baseline for continental mapping. However Sentinel-2 (S2) would certainly improve the results as already demonstrated by Van Tricht et al.  Chakhar et al. (2021). In particular, we expect that the inclusion of S2 data would improve the discrimination of main cropping practices. For example helping to separate winter crops from permanent grassland, for the detection of bare soil conditions in spring to delineate summer crops, and to highlight crop specific phenology events (e.g. d 'Andrimont et al. (2020a)). Additionally, S2 would be particularly useful in the South where there is less cloud coverage and also less S1 acquisitions (Lemoine, 2017). More generally, to facilitate S2 ingestion in the machine learning processing, smoothed time series of cloud free biophysical variables would be useful (e.g. the Copernicus pan-European High-Resolution Vegetation Phenology and Productivity products (HR-VPP) (Tian et al., 2021)).
The EU crop map provides a coherent and unique continental view with a spatial detail that is relevant for crop management interventions at the parcel level. As such, the information can be relevant for EU wide policy monitoring, e.g. by deriving indicators such as crop diversity (Merlos & Hijmans, 2020), and be a source of input data for EU wide models and assessments requiring detailed information on crop location.
Another perspective is to derive national or sub-national area statistics from the EU crop map. This was not considered in this study. Area estimation can be done directly from the crop map (pixel counting) but this is not taking into account the bias inherent of remote sensing: the presence of mixed (border) pixels and the misclassification of pure pixels Gallego (2004). Estimating area from a satellite-based map typically requires the use of an un-biased estimators (Olofsson et al., 2014;Stehman, 2014;Gallego & Delincé, 2010) by combining data from a ground survey or from another source of higher quality data in combination with the map. LUCAS 2018 survey is a source of reference data, providing information until the NUTS2 level. However, further exploration is needed in order to use it for area estimation at a finer scale. While the approach of Gallego & Bamps (2008) for area estimation with CLC as poststratification could be a first naive approach, a more precise approach would be the one of Stehman (2014) to take into account the fact that the strata from our map and the strata from the LU-CAS survey are not similar. However, for this purpose, the sampling weight of each LUCAS point should be made available by Eurostat 7 . In addition, further care should used to account for the fact that a subset of the 2018 survey was used in the training. 7 It is worth mentioning here that two institutional ongoing projects are developing methodologies and good practices to integrate EO in statistic collection process: Sen4Stats (https://www.esa-sen4stat.org/) and This study sets up a framework that could apply for other years. While the approach demonstrated is robust, using model trained for 2018 on other years could result in not satisfying result. Indeed, it is noted that 2018 was a peculiar year, with droughts and severe heatwaves affecting Europe. Drought conditions in central and northern Europe caused yield reductions up to 50% for the main crops, yet wet conditions in southern Europe saw yield gains up to 34%, both with respect to the previous 5-year mean (Buras et al., 2019;Toreti et al., 2019). To tackle year to year seasonality and the limited availability of insitu data, classification methodological developments are still needed to apply the proposed approach to other years.

Conclusions
Arable land represents almost half of total EU area and its monitoring is crucial for efficient environmental, agriculture and climate policies' implementation. In this study, a framework to derive a 10-m crop type from S1 at continental level has been designed, implemented and assessed rigorously. The proposed framework opens avenues to develop a synergistic robust information system to monitor agriculture consistently both at fine spatial-and-temporal scale over large areas. This demonstrates the current momentum combining the unique Copernicus Sentinel fleet, high quality massive in-situ data along with cloud computing. The method and data developed will serve communities ranging from scientific modellers to policy makers.

Data dissemination
The EU crop map masked with the non-vegetation classes and steep slopes is available for download and visualization. 8   List of Tables   Table 1 The legend of the EU crop map. For the two-phase hierarchical classification procedure, the legend is divided into two levels: level 1 with seven broad land cover classes, and level 2 containing 19 crop types or crop groups. 6         Fig. S7: Sentinel-1 backscattering coefficient VV and VH are grouped for broad leaf crops (potato, sugar beet, sunflower), cereals (common wheat, durum wheat, barley, rye, oats, triticale), maize and rapeseed. This grouping better illustrates the differences observed in the north (stratum 1) and in the south (stratum 2).