The ICOS Atmosphere use case: from raw data to knowledge of societal relevance

To effectively tackle the ongoing changes in global climate, stakeholders and policy makers across the globe need timely and accurate information they can trust on a wide range of topics. Here we briefly describe how observational data on greenhouse gas concentrations collected from the ICOS network of measurement stations in Europe are transformed via atmospheric modelling into information on how emission and uptake of greenhouse gases vary over time and space, and then interpreted by experts to create knowledge usable by decision makers.


Background
Perhaps the biggest challenge facing human civilization today is the ongoing global climate change. On top of anthropogenic modifications, like deforestation and pollution, our environment is now also being affected by climate-driven changes, including rising sea levels and increasing risks of extreme weather events, that affect millions of people. Stakeholders and policy makers across the globe must work hard to come up with effective strategies to mitigate these negative effects. But to do so, they need timely and accurate information they can trust, on topics ranging from demographics, socioeconomic development, and land usage to the environment.
However, this task poses considerable challenges, including 1) dealing with a multitude of sources, often with unclear provenance of the information, such as lack of details of how data were collected, analyzed and interpreted; 2) combining data with varying spatial and temporal resolution and coverage; and 3) judging quality and fitness-for-purpose of data, information and knowledge.
In this paper we take a closer look at a use case from atmospheric research in Europe, examining how observational data collected from a network of measurement stations scattered across the continent are fed into advanced numerical models of how emission and uptake of greenhouse gases vary over time and space, and how interpretations of this output contribute to knowledge that can be used by decision makers.

Integrated Carbon Observation System (ICOS)
The Integrated Carbon Observation System, ICOS (https://icos-ri.eu), is a pan-European research infrastructure with a mission to provide standardized, long-term, high-precision and high-quality observations on the carbon cycle and Greenhouse Gas (GHG) budgets and their perturbations. ICOS is built around over 130 field sites, each related to one or more of the three domains Atmosphere, Ecosystem and Ocean, and operated by the 12 ICOS member countries.
11th International Symposium on Digital Earth (ISDE 11), IOP Conf. Series: Earth and Environmental Science 509 (2020) 012023, IOP Publishing, doi:10.1088/1755-1315/509/1/012023

From observations to information
At the ICOS measurement stations, observations of e.g. concentrations or exchange fluxes of greenhouse gases and meteorological variables are collected following strict standardized protocols. All collected raw data are then processed and quality controlled, again using highly standardized processing chains, at ICOS Thematic Centres (one for each domain), before being distributed via the Carbon Portal (https://icos-cp.eu), the data centre of ICOS.
Although the ICOS Carbon Portal offers discovery services that allow anyone to visualize these observational data products, the products cannot be readily used or interpreted by non-experts. A second data processing step, performed by atmospheric scientists with an understanding of the processes that underlie the emission and uptake of greenhouse gases, is needed. Indeed, perhaps the most important end user groups for ICOS are researchers specializing in so-called atmospheric inversion and data assimilation models. These combine observations with data from emissions inventories (anthropogenic sources), outputs of ecosystem and marine models (contributions from vegetation and the oceans), and atmospheric transport calculations (showing the 3D movement of air packets over time) to calculate, as a function of time and location on the Earth's surface, the strengths of GHG sources and sinks. Importantly, the models also make it possible to assess the uncertainties of the results.
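To make the principle concrete, the following is a minimal sketch of a linear Bayesian inversion, a textbook formulation that is much simpler than the operational inversion systems referred to above. All names and numbers are hypothetical: `x_prior` stands in for prior flux estimates (for example from inventories and ecosystem models), `H` for a transport operator that maps fluxes to concentrations at the stations, `B` and `R` for assumed prior and observation error covariances, and `y_obs` for synthetic observations.

```python
import numpy as np


def bayesian_inversion(x_prior, B, H, y_obs, R):
    """Toy linear-Gaussian inversion: update prior fluxes with observations.

    Returns the posterior flux estimate and its error covariance matrix.
    """
    innovation = y_obs - H @ x_prior          # observed minus modelled concentrations
    S = H @ B @ H.T + R                       # covariance of the innovation
    K = np.linalg.solve(S, H @ B).T           # gain K = B H^T S^-1 (B and S symmetric)
    x_post = x_prior + K @ innovation         # posterior flux estimate
    A = (np.eye(x_prior.size) - K @ H) @ B    # posterior error covariance
    return x_post, A


# Hypothetical example: 3 source regions observed by 4 stations.
x_true = np.array([2.0, -1.0, 0.5])           # "true" fluxes used to make synthetic data
x_prior = np.array([1.5, -0.5, 0.0])          # assumed prior flux estimates
B = np.diag([0.5**2] * 3)                     # assumed prior error covariance
H = np.array([[0.8, 0.1, 0.0],
              [0.3, 0.6, 0.1],
              [0.0, 0.4, 0.5],
              [0.1, 0.1, 0.7]])               # assumed station sensitivities to each region
R = np.diag([0.1**2] * 4)                     # assumed observation error covariance
y_obs = H @ x_true                            # noise-free synthetic observations

x_post, A = bayesian_inversion(x_prior, B, H, y_obs, R)
print("posterior fluxes:", x_post)
print("posterior std dev:", np.sqrt(np.diag(A)))
```

The diagonal of the posterior covariance `A` plays the role of the uncertainty estimate mentioned above: in this formulation it can never exceed the corresponding prior variance, and it shrinks most for regions that the stations observe well.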
The model outputs, typically time series of maps with a relatively coarse spatial resolution, may now be used as the basis for estimates of, for example, the yearly net emissions of CO2, CH4 and other GHGs of different regions, such as northern or southern Europe, Eurasia etc. One can thus claim that at this level, data have been turned into information.
To make data and model outputs more accessible, ICOS provides tools, in the form of scripting environments (such as Jupyter Notebooks) based on Virtual Research Environment technology, to work with the emission maps and, for example, apply geographical filters to extract time trends of regional emissions. (It must however be noted that the uncertainties of the outputs, in combination with quite large grid cells, make any statistics extracted for individual countries, or even small regions, highly questionable from a scientific viewpoint!)
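As an illustration of what such a geographical filter might look like in a notebook, the sketch below integrates a gridded flux field over a latitude/longitude box and fits a linear trend to the resulting time series. It is not taken from the ICOS tools: the function names, the regular grid layout, the array shape (time, lat, lon) and the spherical-Earth cell areas are all assumptions made for this example.

```python
import numpy as np

EARTH_RADIUS_M = 6.371e6  # mean Earth radius, spherical approximation


def cell_areas(lat_centres, lon_centres):
    """Area (m^2) of every cell on a regular lat/lon grid given by cell centres."""
    dlat = np.deg2rad(abs(lat_centres[1] - lat_centres[0]))
    dlon = np.deg2rad(abs(lon_centres[1] - lon_centres[0]))
    lat = np.deg2rad(lat_centres)
    # Exact area of a latitude band slice on a sphere
    band = EARTH_RADIUS_M**2 * dlon * (np.sin(lat + dlat / 2) - np.sin(lat - dlat / 2))
    return np.repeat(band[:, None], len(lon_centres), axis=1)


def regional_total(flux, lat_centres, lon_centres, lat_bounds, lon_bounds):
    """Integrate flux (per m^2, shape time x lat x lon) over a lat/lon box."""
    lat_mask = (lat_centres >= lat_bounds[0]) & (lat_centres <= lat_bounds[1])
    lon_mask = (lon_centres >= lon_bounds[0]) & (lon_centres <= lon_bounds[1])
    mask = lat_mask[:, None] & lon_mask[None, :]   # True for cells inside the box
    weights = cell_areas(lat_centres, lon_centres) * mask
    return (flux * weights).sum(axis=(1, 2))       # one total per time step


def linear_trend(times, totals):
    """Slope of a least-squares straight line through the regional totals."""
    return np.polyfit(times, totals, 1)[0]


# Hypothetical usage on a synthetic 10-degree global grid with 5 yearly maps.
lats = np.arange(-85.0, 86.0, 10.0)
lons = np.arange(-175.0, 176.0, 10.0)
years = np.arange(5, dtype=float)
flux = np.ones((5, lats.size, lons.size)) * (1.0 + 0.1 * years)[:, None, None]

totals = regional_total(flux, lats, lons, lat_bounds=(30.0, 70.0), lon_bounds=(-10.0, 40.0))
print("regional totals per year:", totals)
print("trend per year:", linear_trend(years, totals))
```

A rectangular box is the simplest possible filter; a mask derived from region or country outlines could be substituted for `mask` without changing the rest of the calculation.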

Creating knowledge from information
The final step, producing knowledge from information, requires further actions by experts, not only from the atmospheric sciences but also specialists in climatology, land use, economics, industrialization, and society. Together, these specialists can interpret the model outputs, put the results into context, and further analyze the overall uncertainties, before comparing the results with net emission estimates obtained by other methods. The final results, often known as synthesis reports, can then be handed over to policy makers and their associated experts to be used as the basis for political decisions.

Concluding remarks
In addition to the transformation steps outlined above, there are further points in the overall data processing and analysis workflow where information could be created and knowledge extracted. As an example, by allowing both the involved actors (ICOS, the atmospheric modelling community and the experts) and the end users (of raw data, calculation outputs or syntheses) to annotate the data sets with their comments, a wealth of information and cross-links between the involved entities would become available. However, the annotation system must be not only light-weight and easy to use, but also robust and secured against tampering.