A reporting format for leaf-level gas exchange data and metadata

that data and facilitate efficient data re-use. For data users, the standard will expand the capacity of data repositories to optimise data search and extraction, and more readily integrate similar data into synthesis products. The standard comprises metadata elements, standard vocabularies and required variables for survey measurements, dark respiration, CO2 and light response curves, and parameters derived from those measurements. A crosswalk across the outputs of common instruments was developed to enable accurate data compilation. A process of extensive consultation with data collectors, data users and data scientists was undertaken to ensure that the standard would meet community needs. The standard presented here is intended to form a foundation for future development that will incorporate additional measurement types and variables. Access to the standard documentation, and future additions, will be enabled by hosting the standard on an open source version control system.


Introduction
The interface between plant and ecological sciences and research data infrastructure is rapidly evolving, with greater expectations for data preservation, reproducible and open research, and the potential to synthesize data across different studies maximizing investments in research. Moreover, publicly accessible data archiving is increasingly required by funding bodies and publishers. Numerous databases and repositories, and other data infrastructure, have been developed to fulfill these needs, including TRY (Kattge et al., 2020), Environmental Data Initiative (environmentaldatainitiative.org), Dryad (datadryad.org), figshare (figshare.com) and DataOne (dataone.org). Yet the reuse of these data resources remains hampered by the difficulty of locating, unifying and assessing the quality of data, and the absence of important metadata needed for inter-site comparison or synthesis. The challenges that must be addressed for data managers to best support scientific discoveries are summarized by the FAIR principles, a call to improve Findability, Accessibility, Interoperability and Reusability of data (Wilkinson et al., 2016).
Leaf-level gas exchange measurements quantify the flux of carbon dioxide (CO2) and water vapor into and out of a leaf. Typically collected with infrared gas analyzer instruments, these measurements are used to determine a range of physiologically important fluxes and traits, principally the rates of net CO2 assimilation, respiration and stomatal conductance. Gas exchange data are used to answer a wide variety of scientific questions regarding plant function and response to environmental change (Long et al., 1996; Long and Bernacchi, 2003). They are the basis for estimating and scaling photosynthesis from the leaf to canopy (Yang et al., 2020), and are used to parameterize global biogeochemical models (Kattge et al., 2009). The products of photosynthesis are critical to society, as they provide renewable supplies of food, fuel, medicine, and fiber (Vitousek et al., 1986). Understanding and improving photosynthesis, and water- and nutrient-use efficiencies are currently considered to be key targets to improve the resilience of crops to global change (Ainsworth et al., 2008; Leakey et al., 2019; López-Calcagno et al., 2020; Ort et al., 2015; Simkin et al., 2019). Furthermore, plants play a critical and unique role in determining the response of the terrestrial biosphere to rising CO2 concentration and in turn influence the rate of global change (Walker et al., 2020). Analyses have also shown that terrestrial biosphere model outputs are particularly sensitive to parameters derived from gas exchange data (Bonan et al., 2011; Booth et al., 2012; LeBauer et al., 2013; Ricciuto et al., 2018; Sargsyan et al., 2014) and that the use of derived parameters from gas exchange data can effectively constrain uncertainty in model simulations (Dietze, 2014). In short, gas exchange data are central to understanding, improving and modelling the response of plants to global and environmental change.
However, collection of these data requires specialist training, is time consuming, can involve elaborate logistics (Ellsworth et al., 2012; Weerasinghe et al., 2014), and often utilizes techniques adapted to particular experiments, instruments and environments. Thus, the resulting data products are typical long-tail data, i.e. they are low volume, have diverse and heterogeneous content, and are not easily shared (Heidorn, 2008; Wallis et al., 2013). Currently, most data repositories that store diverse data focus on describing generic package-level metadata, and not metadata specific to the data type, which limits the use, search and data discovery services for long-tail data types (Limani et al., 2019). Our review of existing data repositories and plant trait databases revealed that where leaf-level gas exchange data are available, the data provided are limited and metadata required to properly interpret and reuse those data are often missing. The need for specialist data standards for disciplines is well recognized (Bruneau et al., 2019; Limani et al., 2019), and the importance of developing standards for the collection and storage of plant trait data has been the subject of several recent studies (Gallagher et al., 2020; Kissling et al., 2018; Schneider et al., 2019). Despite recent increases in compendia of gas exchange data (Lin et al., 2015; Keenan and Niinemets, 2016; Kumarathunge et al., 2019; Smith et al., 2019; Niinemets et al., 2015; De Kauwe et al., 2016; Ali et al., 2015), and previous calls for standard archiving (Dietze, 2014), there is no standardized reporting format that enables syntheses of these data.
Data archiving is only the first step towards maximizing the value of data. In order to be reused or incorporated into models or synthesis products, data must be both findable and accessible; these characteristics are optimized by appropriate, machine-readable search terms and persistent dataset identifiers. Currently, gas exchange instruments do not share a common output format, i.e., file structure, variable names, and units, and include column headers that are not machine readable. Additionally, reuse is enabled by including sufficient metadata to correctly interpret the data, and use of common formats and terms that allow processing multiple studies from different sites, with various measurement methods (Christianson et al., 2017). Currently, metadata associated with gas exchange data collections are largely limited to location and species (Kattge et al., 2020). Lack of documentation and metadata are recognized as data archiving risk factors (Mayernik et al., 2020), with the implication that without adequate metadata, data cannot be interpreted or used correctly. To reuse data, researchers often have to refer to original publications to access essential metadata or other key information (e.g. leaf temperature), which can be a prohibitively resource-intensive process, or, especially in the case of older work, impossible because information is unavailable. Also, as research data infrastructure moves towards advanced capabilities such as application programming interfaces (APIs) to facilitate data upload and download, or support for data visualization and analytics, standardization of data and metadata in machine readable formats will become increasingly essential (Bruneau et al., 2019). 
For example, the Darwin Core standard for biodiversity data has enabled the global integration of hundreds of millions of species occurrence records through the Global Biodiversity Information Facility (GBIF, www.gbif.org), and has facilitated reuse of these data in countless studies (Ball-Damerow et al., 2019). However, specific guidance for leaf-level gas exchange data and metadata is lacking.
Here we present a new data and metadata reporting format for common types of leaf-level gas exchange data, reached by consensus among over 80 researchers in the field. We describe the process of development of these guidelines, for which the aim was to balance the usefulness of the reporting format to the research community against ease of compliance when a data provider is preparing a new dataset. A key aspect has been engaging the community of leaf-level gas exchange experts in the development of this reporting format, with a concerted effort to reach as many potential data contributors and users as possible. Our goal with this initial focused effort on a leaf-level gas exchange reporting format was to develop a solid foundation for further development that could include a wider range of data types. An important component of this proposed reporting format is the public archive of complete instrument outputs. While we cannot foresee all future data uses or different processing methods, the preservation of the unprocessed instrument output is a way of futureproofing rare and valuable leaf-level gas exchange data sets (Rogers et al., 2017).
The creation of this reporting format for leaf-level gas exchange data was initiated by a call for community accepted data formats for the U.S. Department of Energy's (DOE) Environmental Systems Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE) data repository (Varadharajan et al., 2019). Accordingly, the reporting format described here is known as the 'ESS-DIVE reporting format for leaf-level gas exchange data and metadata', and referred to in this paper as 'the reporting format'. However, development of the format and documentation has considered global needs for these data. This paper presents a reporting format for leaf-level gas exchange data designed for wide adoption across ecological data repositories, and thus does not describe implementation in a specific repository or database. The reporting format is designed to complement, and not duplicate, other metadata requirements for sites, samples and additional relevant information, and when possible this format should be used in combination with such requirements. For example, in the ESS-DIVE data repository a data submission must also include package-level metadata (e.g. authors, keywords, publication date, spatial and temporal coverage) and sample-level metadata (e.g. sample material, latitude, longitude, elevation, biome). It is encouraged that, where available, this reporting format is used in conjunction with established ontologies, such as Darwin Core (Wieczorek et al., 2012), the Plant Ontology (Cooper et al., 2013) and the Environment Ontology (Buttigieg et al., 2013).
The scope of this reporting format for leaf-level gas exchange focused on defining data and metadata variables to describe the most common type of measurements and those that have been the focus of recent synthesis efforts, i.e. survey style measurements; the response of photosynthesis to CO2 and irradiance; and parameters derived from these relationships. In this paper we 1) describe the process of developing the format, including review of existing standards and conventions, and community consultation; 2) provide details of the components of the reporting format, including the guidance for data and metadata fields, vocabularies, units and definitions; and 3) discuss challenges to reaching consensus, the future potential to include additional measurement types and the use of this reporting format as a basis for the development of data management tools.

Search for published standards
Literature and web resources were searched to identify any published standards guiding best practice for the archive of leaf-level gas exchange data. A list of ecological trait databases was assembled, based on web searches, and a comprehensive table published by Schneider et al. (2019). Of these, databases and repositories identified as containing plant trait data were reviewed to determine if they included leaf-level gas exchange data, and if submission of data required adherence to any standards or formats (Supplementary Tables 1 & 2). A catalog of over 1400 data standards, including 17 categorized as concerning physiology, available at FAIRsharing (Sansone et al., 2019), was searched for standards that define variable names and metadata terms required to describe leaf-level gas exchange data.

Variable names and definitions
Existing data repositories, databases and synthesized datasets were reviewed, and the most commonly used variable terms and definitions were adopted into this reporting format (Supplementary Tables 1-3). The TRY plant trait database (Kattge et al., 2020) was identified as the most extensive publicly available plant trait database that contains leaf-level gas exchange data. Variable definitions in TRY are adopted from TOP, a thesaurus of plant characteristics (Garnier et al., 2017). Several relevant variables are included in BETYdb, the biofuel ecophysiological traits and yields database (LeBauer et al., 2018). Further resources for measurement variable definitions are the published guides to standard measurement protocols, including ClimEx (Halbritter et al., 2020), the Plant Handbook (Pérez-Harguindeguy et al., 2013) and PrometheusWiki (Evans and Santiago, 2014; Sack et al., 2010). The variable names used in large datasets, including GlobResp (Atkin et al., 2015) and GlobAmax (Maire et al., 2015), were also considered. The default output from ten commercially available gas exchange instruments (manufactured by ADC Bioscientific, CID-Bioscience, LI-COR Biosciences, PP Systems and Walz; Supplementary Table 4) was assembled into a translation table to allow comparison and identify commonalities. Each variable was defined with a name, unit and description, and in some cases, an expected value range.

Metadata requirements
Many data repositories have existing metadata requirements to cover general experimental and sample parameters, such as characteristics of the location where measurements were conducted. Here we identified specific metadata parameters that would allow users of leaf-level gas exchange data to discriminate between data types, experimental protocols, and sample characteristics. We chose variables based on our collective expertise of conducting gas exchange measurements across diverse ecosystems and experimental designs, and of using those data in syntheses and meta-analysis. Our goal was to include metadata requirements for variables that would be most relevant for synthesis activities, including variables to distinguish data obtained from natural or cultivated plants, and to differentiate between common experimental manipulations and leaf sampling techniques. Controlled vocabularies (lists of preferred terms) were developed for each metadata variable to allow consistent metadata reporting. The selection of required metadata variables sought to find a balance between optimizing data discoverability and usability, while at the same time not placing undue burden on data contributors.

Community consultation
The draft reporting format was made available to the community of leaf-level gas exchange experts for suggestions and comment. Input was sought from data contributors, data scientists, data users, and instrument manufacturers. The invitation to participate was sent via direct email to 120 contacts, and reached a wider (unquantified) audience through encouraged sharing of the invitation to participate and social media. The development of this data reporting format was very well received by the community; eighty individuals contributed to the reporting format documentation, and thus this paper (Supplementary Fig. S1).
An introduction to the purpose, structure and components of the reporting format was presented as a publicly accessible webinar hosted by ESS-DIVE in July 2020, followed by a month-long period of feedback and discussion. Follow up video conferences were scheduled to discuss refinements and solutions. Feedback was gathered in an open manner, with comments and suggestions available to view by all on a collaborative online document. The reporting format documentation was then migrated to a public GitHub repository, where additions and refinements can continue to be made, and version controlled releases will be freely available for use.

Results
There are a number of common conventions in use for reporting of leaf-level gas exchange data; however, they are not universal, and our search did not discover any published standards for data reporting. This directed our efforts into the development of a new metadata and data reporting format to enable diverse data contributors to use unified terminologies and formats when publishing data, thus lowering the barrier for data reuse and harmonization. The range of measurements that can be made with gas exchange instruments is broad. We have developed the foundation for a common reporting format by reaching consensus on a list of standard variable names and units for data tables, and metadata elements specific to leaf-level gas exchange measurements. Further guidance for data and metadata content is also proposed for data types that are commonly measured: survey style gas exchange measurements, dark adapted respiration measurements, CO2 and light response curves, and parameters derived from those response curves.
The reporting format documentation comprises a number of elements relevant to all types of leaf-level gas exchange measurements: a list of variable names and units (Section 3.1) that should be used in data tables, a translation table of data outputs from commercial gas exchange instruments (Section 3.1.1) and comprehensive metadata requirements with controlled vocabularies (Section 3.2). For selected data types (Section 3.3), the reporting format also specifies the minimum required variables to be included in the data table, and a list of details to be included in the measurement protocol description. Each of these elements is described in more detail in the sections below. The reporting format templates and complete documentation are available for download from the ESS-DIVE repository and an example data package following the reporting format guidelines is hosted on the NGEE-Arctic data archive (Rogers et al., 2019).
Here we use the term 'data package' to refer to a collection of data and metadata files to be published together in a data repository (Christianson et al., 2017). A data package of gas exchange data should contain formatted data tables, metadata tables and the complete instrument output. Any data package may also include additional data types and variables not yet covered by the reporting format. Data packages should also include general metadata as required by the hosting data repository or database (e.g. author list and other citation information, data licensing terms). Fig. 1 shows the relationship between the components of a data package and definition tables included in the reporting format documentation.
The documentation includes user guides and templates to present the methods metadata and instrument details. Data and metadata tables should be in comma separated value (.csv) format; additional materials can be also included as text if appropriate. All components of a data package should be in English language; other language translations can be included as an additional resource.
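The package contents described above can be checked programmatically before submission. The following is a minimal Python sketch, assuming a package is summarized as a manifest that maps each component to a list of file names. The component keys (`dataTables`, `methodsMetadata`, `instrumentDetails`, `instrumentOutput`) and the file names are hypothetical labels chosen for this illustration and are not defined by the reporting format.

```python
from pathlib import PurePath

# Hypothetical component keys for this sketch; not defined by the reporting format.
TABLE_COMPONENTS = ("dataTables", "methodsMetadata", "instrumentDetails")


def check_package(manifest):
    """Return a list of human-readable problems found in a package manifest."""
    problems = []
    # Every table component should be present with at least one file.
    for component in TABLE_COMPONENTS:
        if not manifest.get(component):
            problems.append(f"missing component: {component}")
    # Data and metadata tables are assumed here to be delivered as .csv files.
    for component in TABLE_COMPONENTS:
        for filename in manifest.get(component, []):
            if PurePath(filename).suffix.lower() != ".csv":
                problems.append(f"not a .csv file: {filename}")
    # The complete instrument output is expected whenever it is available.
    if not manifest.get("instrumentOutput"):
        problems.append("complete instrument output not included")
    return problems


manifest = {
    "dataTables": ["aci_curves.csv"],
    "methodsMetadata": ["methods.xlsx"],
    "instrumentDetails": [],
}
for problem in check_package(manifest):
    print(problem)
```

A check of this kind only confirms that the expected components are present; it says nothing about whether their content follows the format.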

Variable names and unit specifications
Consistent use of variable names (also known as field names or headers) in data tables is a key element of generating standardized datasets that can be readily combined or imported into a database. A list of variables, including measured and calculated instrument outputs, calculated parameters, and constants, was compiled, with each variable designated by a variableName, variableUnit and variableDefinition in the defined variables table. Note that the camelCase naming used here indicates variables defined in the reporting format documentation. These conventions for variables were reached based on the most common usage in existing publications, databases and instrument outputs. In cases where common usage had not already been established, variableNames were selected to be human and machine readable, and with no recognized conflicts with other uses. The units for each variable are listed separately, and are not included as part of the variableName. The measurement quantities are described in each variableDefinition, thus variableUnits are presented without information about the quantity, following NIST guidelines (Thompson and Taylor, 2008). For many variableNames, the reporting format also specifies an expected range of values resulting from common measurement approaches; these limits can be used to guide quality checking of data.
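The expected value ranges lend themselves to automated quality checking. The following is a minimal Python sketch, assuming an excerpt of the defined variables table is held as a dictionary. The variable names (`A`, `gsw`), units and range limits shown are illustrative placeholders, not values taken from the reporting format documentation.

```python
# Illustrative excerpt of a defined variables table. The names, units and
# ranges below are placeholders for demonstration only.
DEFINED_VARIABLES = {
    "A": {"variableUnit": "umol m-2 s-1", "expectedRange": (-20.0, 100.0)},
    "gsw": {"variableUnit": "mol m-2 s-1", "expectedRange": (0.0, 2.0)},
}


def flag_out_of_range(rows, variables=DEFINED_VARIABLES):
    """Return (row_index, variableName, value) for values outside the expected range."""
    flagged = []
    for index, row in enumerate(rows):
        for name, spec in variables.items():
            if name not in row:
                continue  # variable not reported in this row
            low, high = spec["expectedRange"]
            value = float(row[name])
            if not low <= value <= high:
                flagged.append((index, name, value))
    return flagged


rows = [
    {"A": "12.4", "gsw": "0.21"},
    {"A": "250.0", "gsw": "0.18"},  # implausibly high value for this placeholder range
]
print(flag_out_of_range(rows))
```

Values flagged in this way are candidates for review rather than errors, since the ranges describe common measurement approaches and unusual experiments may legitimately exceed them.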

Instrument output translation table
A translation table of 23 measured and calculated output variables from ten commercially available gas exchange instruments (Supplementary Table 4) was compiled to assess the most common variable names and measurement units, and provided input into the process of defining the variables for this reporting format. It was found that there is some variation among the output of different instruments, in both variable names and units used, and thus the default instrument outputs are not always exactly aligned with the proposed reporting format. For example, measurement of photosynthetic photon flux density (PPFD) incident on the leaf is variously labeled as Q, Qleaf, Qin, PAR, PARi or PARtop across the different instruments. The instrument output translation table provides a guide for conversion of results to standard variableNames and variableUnits, can assist data users to understand output from unfamiliar instruments, and can be used for future advances such as the development of automated tools for data upload. This compilation was based on current instrument manuals and software versions; users should note that future instrument and software updates may change the outputs. It is emphasized that re-labeling and conversion of variable names and units to match the format is not a requirement for the complete instrument output files.
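A translation of this kind can be sketched in a few lines of Python. The six instrument labels below are those listed above for incident PPFD; the target name `ppfdIncident` and the other column labels (`Obs`, `Tleaf`) are hypothetical placeholders, and the standard variableName should be taken from the reporting format documentation.

```python
# Instrument labels for incident PPFD, as listed in the text.
INSTRUMENT_PPFD_LABELS = {"Q", "Qleaf", "Qin", "PAR", "PARi", "PARtop"}

# Hypothetical placeholder for the standard variableName.
STANDARD_PPFD_NAME = "ppfdIncident"


def translate_header(header):
    """Map instrument labels for incident PPFD to one name; leave other labels unchanged."""
    return [
        STANDARD_PPFD_NAME if label in INSTRUMENT_PPFD_LABELS else label
        for label in header
    ]


print(translate_header(["Obs", "PARi", "Tleaf"]))
```

In practice a translation would probably need to be keyed by instrument model as well, because the same label may not carry the same meaning or unit on every instrument. The translation applies to the formatted data table only; the complete instrument output is archived unchanged.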

Metadata
All published data packages that use this reporting format should include metadata to ensure that data are adequately described, in order to allow users to fully understand how the data were generated, and maximize findability of data with certain characteristics. The reporting format provides controlled vocabularies and template files for the required methods metadata (Section 3.2.1) and instrument details (Section 3.2.2). The inclusion of methods supplement tables (Section 3.2.3) and other related metadata (Section 3.2.4) will depend on the design of individual experiments; thus for these items the reporting format provides guidelines and recommendations only.

Fig. 1. The relationships between gas exchange measurements and the components of this data and metadata reporting format. The characteristics of the 'measurements' (experimental design and recording of information) inform the content of metadata and data tables. Components shown in boxes should be included in a data package. Data tables, methods metadata, instrument details and the complete instrument output (if available) are required to be included in a data package (grey boxes with solid borders). Inclusion of metadata supplement tables will be dependent on the experiment (grey box with dashed border). The requirements for other related metadata (white box) could be set by a data repository, or be mandated by other specialist data standards or conventions; those details are not covered by this reporting format. Information components (hexagons) are reference tables to guide the format and content requirements of the submitted data. Refer to the reporting format documentation for a complete list of variables, definitions and controlled vocabularies.

Methods metadata
The methods metadata is a record of data types, measurement protocols, experimental and sample characteristics, and details of data processing and calculation approaches, summarized in a single file. The reporting format includes a template file and controlled vocabularies to simplify metadata creation and ensure consistency across datasets. However, the diversity of experimental approaches is recognized, and flexibility is accommodated by allowing use of free text for many variables if the controlled vocabulary is not adequate.
Development of the methods metadata focused on important search filters for data users such as the growth conditions and treatments of the plants on which measurements were made; these should be indicated using the growthEnvironment and experimentalManipulation variables. For example, the growthEnvironment variable captures if the plants were grown in natural or controlled environments, while experimentalManipulation can be employed by data users to include or exclude common treatments such as atmospheric, water or nutrient manipulation. Further categorization is enabled by use of variables such as canopyPosition, lightExposure, leafAge and plantAge.
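The combination of controlled vocabularies with a free text fallback can be illustrated with a short Python sketch. The variable names growthEnvironment and experimentalManipulation come from the reporting format, but the term lists below are hypothetical stand-ins for the real controlled vocabularies, which are given in the reporting format documentation.

```python
# Hypothetical term lists; the real controlled vocabularies are defined in
# the reporting format documentation.
CONTROLLED_VOCABULARY = {
    "growthEnvironment": {"natural", "controlled"},
    "experimentalManipulation": {"none", "atmospheric", "water", "nutrient"},
}


def check_methods_metadata(metadata, vocabulary=CONTROLLED_VOCABULARY):
    """Split metadata entries into controlled-vocabulary terms and free text."""
    controlled, free_text = {}, {}
    for name, value in metadata.items():
        terms = vocabulary.get(name)
        if terms is not None and value in terms:
            controlled[name] = value
        else:
            # Free text is permitted, but is harder to use as a search filter.
            free_text[name] = value
    return controlled, free_text


metadata = {
    "growthEnvironment": "natural",
    "experimentalManipulation": "soil warming",
}
print(check_methods_metadata(metadata))
```

Entries that fall back to free text remain valid in this sketch; separating them simply shows a data curator which values will not match vocabulary-based searches.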

Instrumentation details
This data reporting format provides a template to record details of the instruments used for data collection, including model information, software version, type of chamber used, and a statement of instrument calibration. This will enable users of data to understand the data provenance, and achieve data equivalency in synthesis products.

Methods supplement tables
Leaf-level gas exchange data are often measured with the purpose of comparing between sample types or treatments; these discriminators are commonly included in data tables as codes to represent species, treatments, plots or other characteristics. The methods supplement tables component of this reporting format demonstrates how the explanation of these descriptors should be included in a data package with a range of examples. Inclusion of metadata supplements in a data package is highly dependent on the nature of the experiment, and as such, examples are provided as guidelines only and are not required by the reporting format.

Other related data and metadata
Gas exchange data are frequently associated with other measurements, e.g. leaf nitrogen content. We strongly encourage the use of unique, persistent sample identifiers to link gas exchange data and other data and metadata associated with the same sample. The unique sample identifier should be a column in the data file. Also, in simple cases associated data could be included as additional variables (i.e. columns) in a gas exchange data table and where data are collected in-line with gas exchange, e.g. fluorescence, logically they should be included in the same file. In cases where a variableName is not defined by this reporting format, data providers should follow other appropriate standards or conventions. Similarly, for experimental data not covered by the methods metadata variables, such as reporting of environmental, landscape, or climatic characteristics, or genotype variation, data providers should utilize published standards or formats for that data type. Metadata associated with the sample collection (e.g. location information, sample description) can be provided separately using a file that conforms to recognized sample reporting formats (e.g. Damerow et al., 2020).
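Linking records through a shared sample identifier can be sketched as follows in Python. The column names `sampleID`, `A` and `leafN` and the identifier values are hypothetical examples, not names defined by the reporting format.

```python
def link_by_sample(gas_exchange_rows, associated_rows, key="sampleID"):
    """Attach associated measurements to gas exchange rows that share a sample identifier."""
    lookup = {row[key]: row for row in associated_rows}
    linked = []
    for row in gas_exchange_rows:
        merged = dict(row)  # copy so the input rows are not modified
        for name, value in lookup.get(row[key], {}).items():
            if name != key:
                merged[name] = value
        linked.append(merged)
    return linked


# Hypothetical column names and identifiers, for illustration only.
gas_exchange = [{"sampleID": "S001", "A": 12.4}, {"sampleID": "S002", "A": 9.8}]
leaf_nitrogen = [{"sampleID": "S001", "leafN": 2.1}]
print(link_by_sample(gas_exchange, leaf_nitrogen))
```

Samples without a matching record are kept unchanged, so gas exchange data are not lost when associated measurements are incomplete.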

Specific requirements for selected data types
Additional reporting guidelines are provided for seven data types identified as common gas exchange measurements (e.g. photosynthetic CO2 response curves) and analytical approaches (e.g. one-point method) (Table 1). For each of these data types the reporting format includes a detailed description of the data type, a list of elements required in the protocol description, and the minimum required variables to include in the data table. For data types not described here, data creators should use a protocol and minimum variable requirements as judged appropriate for their data.
For each data type, a list of 5-8 required variables was developed in order to capture the result variable (e.g. Vcmax) and covariates required to interpret that result in context. Of the existing standards and databases reviewed, only the BETYdb specifies any required or optional covariates (LeBauer et al., 2018). Thus the minimum required variables presented in this reporting format are the result of an iterative feedback process involving both domain expert contributors and users. Data contributors may also include any other variables, using the variableNames defined in this reporting format. The data table should also include the sample identifier, and other sample variables (e.g. species, treatment) as required.

Table 1
The data types for which this reporting format makes specific recommendations for variables required in the data table, and protocol descriptions. Refer to the reporting format documentation for a detailed description of each data type.

Data type | Description
Survey | Single point measurement of leaf gas exchange.
Response of photosynthesis to intercellular CO2 concentration (ACi curves) | Sequential measurements on the same leaf material of photosynthetic rate with varying CO2 concentration.
Photosynthetic parameters derived from ACi curves | Results of fitting photosynthetic CO2 response curves to derive parameters, e.g. apparent Vcmax, Jmax, TPU.
Vcmax from one-point | Apparent Vcmax calculated from Asat measurements using the one-point method.
Response of photosynthesis to irradiance (AQ curves) | Sequential measurements on the same leaf material of photosynthetic rate with varying irradiance.
Photosynthetic parameters derived from AQ curves | Results of fitting light response curves to derive parameters, e.g. quantum yield of CO2 fixation.
Dark adapted respiration | Respiration rate of a dark adapted leaf.
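A data table can be checked against the minimum required variables for its data type before submission. The Python sketch below assumes a lookup from data type to required variableNames. The 'Survey' data type is taken from Table 1, but the variable names listed for it are hypothetical placeholders, and the actual requirements are given in the reporting format documentation.

```python
# Hypothetical minimum required variables for one data type; placeholders only.
REQUIRED_VARIABLES = {
    "Survey": ["sampleID", "A", "Ci", "Tleaf", "ppfdIncident"],
}


def missing_required(data_type, header, required=REQUIRED_VARIABLES):
    """List required variables that are absent from a data table header.

    Data types without listed requirements return an empty list, because
    minimum variables for those are left to the judgment of the data creator.
    """
    return [name for name in required.get(data_type, []) if name not in header]


header = ["sampleID", "species", "A", "Tleaf"]
print(missing_required("Survey", header))
```

Additional columns such as `species` pass through untouched, which mirrors the allowance for contributors to include other variables alongside the required ones.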

Inclusion of instrument output data
The methods metadata and required variables are designed to capture adequate information to allow proper interpretation of datasets. However, not all possible data reuse can be foreseen. The inclusion of the complete instrument output (commonly referred to as 'raw data') in a data package is seen as the ultimate future-proofing for a dataset (Fig. 1). Archiving of raw gas exchange data is recognized as good science practice and has been highlighted as important for the preservation and reuse of data (Dietze, 2014; Rogers et al., 2017). Ideally we would like to mandate archiving of quality controlled complete instrument output to allow reanalysis of highly valuable datasets as new knowledge, analytical approaches or data corrections are developed. The term 'complete instrument output' is used here to recognize that instrument data files with some quality control applied, such as correction of user input errors, are generally more valuable to data users than true raw data. However, this ideal has to be balanced by the need to ensure we do not create a barrier for data submission, particularly for older data sets where complete instrument output may no longer be available, or for data collected with custom built gas exchange systems.

Discussion
We have developed a reporting format for leaf-level gas exchange data and metadata with the goal of improving findability and the capacity to reuse these valuable data. This reporting format will be adopted by the ESS-DIVE data repository and be freely available to the community. We encourage its use by all data producers, repositories and databases. More than eighty data contributors, data users, manufacturers and data scientists contributed to the development of this effort, which we hope will form a foundation for future development by the community. The reporting format aims to provide a resource for data contributors to enhance the value of their data, reduce the overheads to re-using and synthesizing data, and provide prescribed metadata that will simplify parsing of data for analysis and synthesis (Fig. 2).

Development of a community standard
Given the importance of gas exchange data, the effort taken to collect it, and the widespread use of gas exchange data in synthesis activities and model parameterization, it was surprising that a data standard did not yet exist. However, the need and desire for the development of a common reporting format was readily apparent. Both data contributors and data users were very supportive of the effort, were quick to contribute, and provided valuable input.
Data reporting formats, and the mandate by funding agencies to use them, burden the data contributor with the task of preparing and uploading their data. In contrast, the data user is relieved of much of the burden associated with finding and harmonizing datasets prior to analysis. There is therefore a perception that data reporting formats and preservation of data in a repository offer little direct return for the contributor's effort. However, whilst not readily tangible, there are several benefits to contributors. These include the provision of a formal way to meet mandates for data preservation, defined data descriptions and units, and tools for data quality control (e.g. expected data ranges). Widespread adoption of a data reporting format will also accelerate the development of data ingest tools that will benefit the contributor. Furthermore, sharing data in an accessible and searchable format increases the impact of the data collection, jump-starts collaborations, and, with conscientious data users, can lead to invitations to co-author novel data syntheses where data contributors can share knowledge of their data and also gain additional insight from their collaborators (Allen and Mehler, 2019; Cheruvelil and Soranno, 2018). One issue that remains challenging for the field is formal recognition of datasets through citations and ensuring the continued recognition of a given contribution. For example, if the original dataset is combined with other data into a larger dataset, the original association with the data contributor can be lost.
While a formal data format had not existed before we started this work, the vocabulary of leaf-level gas exchange was well established and was fairly similar across instruments. Therefore, incorporating many variables and definitions that are already in widespread use resulted in large parts of the reporting format being readily accepted by the community. Most feedback was focused on additional components, and fine tuning of definitions, rather than large changes to the first draft proposal. It was necessary to provide precise descriptive information to clearly communicate our goal of developing a data reporting format; in some instances this goal was conflated with documentation of measurement protocols, defining a gold standard method or building a database. The data reporting format does not attempt to constrain method choice by data contributors but is intended to be inclusive of all approaches and methodologies. However, there were several issues that garnered significant commentary and these are discussed further below.

Decisions and compromises
As expected, there was a necessary compromise between the desire for additional metadata detail and the need for a relatively simple and manageable reporting format for data contributors. Many of the requests for increased metadata would increase the effort, and therefore the barrier, to format data for some contributors whilst providing only limited value for most data users. Experimental and sample details that are not covered by the methods metadata variables may be included in protocol descriptions or methods supplement tables. Although the reporting format does not yet provide specific formats for these data, methods metadata variables have been included to indicate the inclusion of canopy height information and additional data collected in-line with gas exchange. There are no restrictions preventing conscientious data contributors from including more metadata detail or data types. Similarly, when developing the required variables for each data type we resisted adding requirements for variables that are not essential to effectively reuse the data. We hope that by strongly encouraging (and perhaps, in time, mandating) the submission of complete instrument output we will preserve all data fields for the specialist data user, and for future, currently unanticipated uses of the data.
There were several comments about missing measurement types; in many cases this reflected the desire to expand the reporting format to cover more data types, e.g. temperature and vapor pressure deficit response curves, or porometer measurements of stomatal conductance. The combination of fluorescence with gas exchange data is very powerful and many instruments allow simultaneous collection of both data types. Whilst we recognize the value of including fluorescence data, developing common reporting formats for these data would have significantly expanded the scope of this initial effort. Development of a common data reporting format for fluorescence data presents some additional challenges: these data are not always associated with gas exchange data, they can be collected with a wider variety of instruments, and the vocabulary and protocols used are not as constrained as for gas exchange measurements (Baker, 2008; Maxwell and Johnson, 2000; Murchie and Lawson, 2013).

Fig. 2. Schematic showing how the implementation of this data reporting format across data archives will facilitate data discovery and reuse.
Estimates of photosynthetic parameters derived from the response of photosynthesis to intercellular CO2 concentration (Ci) are apparent estimates: they assume an infinite mesophyll conductance (gm), so that Ci is assumed to be equal to the CO2 concentration inside the chloroplast (Cc), the site of carboxylation. Whilst gm and hence Cc can be estimated from gas exchange data (Ethier and Livingston, 2004; Sharkey et al., 2007), the most robust approaches require in-line measurements of fluorescence or isotopic discrimination (Bongi and Loreto, 1989; Busch et al., 2020; Evans et al., 1986; Harley et al., 1992; Loreto et al., 1992; von Caemmerer and Evans, 1991). Estimates of photosynthetic parameters based on Cc differ from those that do not account for gm, and the data can be used in different ways, so it is important to distinguish which data (Ci or Cc) were used to calculate the derived parameters. Additionally, for the specialist data user, knowledge of additional fluorescence or isotopic discrimination data collected in parallel with gas exchange data would be valuable. Therefore, we added methods metadata requirements for photosynthetic CO2 response data to capture assumptions about gm and indicate the presence of additional data in the data package.
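The difference between apparent (Ci-based) and Cc-based inputs can be illustrated with the standard diffusion relation Cc = Ci - A/gm, where A is the net CO2 assimilation rate. The following is a minimal sketch rather than part of the reporting format; the function name, argument names, units and example values are assumptions chosen for illustration.

```python
import math


def chloroplast_co2(ci, a_net, gm=math.inf):
    """Estimate chloroplast CO2 (Cc) from intercellular CO2 (Ci).

    ci    : intercellular CO2 concentration (umol mol-1)
    a_net : net CO2 assimilation rate (umol m-2 s-1)
    gm    : mesophyll conductance (mol m-2 s-1); the default of infinity
            reproduces the 'apparent' assumption that Cc equals Ci
    """
    if gm <= 0:
        raise ValueError("gm must be positive")
    # Drawdown across the mesophyll is A / gm; it vanishes when gm is infinite.
    return ci - a_net / gm


# Illustrative values only (not measured data).
apparent_cc = chloroplast_co2(250.0, 20.0)            # infinite gm assumed
finite_cc = chloroplast_co2(250.0, 20.0, gm=0.25)     # finite gm
print(apparent_cc, finite_cc)
```

With a finite gm the CO2 concentration used for parameter fitting is lower than Ci, which is why the metadata must record which assumption was made.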
Because gas exchange measurement approaches are specialized, equivalence cannot be assumed between different studies, even within the same lab, as protocols are adjusted for individual experiments depending on the species measured, ambient environmental conditions, and the experimental goals. The methods metadata categories have been defined to allow equivalency between datasets to be recognized, and to provide the information required to recalculate derived parameters if necessary. Similarly, calculations of parameters such as maximum carboxylation capacity (Vcmax) depend on fitting approaches (Bernacchi et al., 2013; Gu et al., 2010; Sharkey et al., 2007), the choice of kinetic constants (Rogers et al., 2017), inclusion of mesophyll conductance (Ethier and Livingston, 2004; Warren, 2006) and whether and how investigators applied corrections for gasket diffusion leaks (Flexas et al., 2007; Rodeghiero et al., 2007). In some cases, capturing these metadata can enable data users to recalculate derived parameters using a common approach (e.g., Niinemets et al., 2015). Ideally, data users should recalculate derived parameters from the underlying data.

Future developments
Development of this data reporting format highlighted the strong desire by the community for additional functionality to be added to repositories to aid data ingestion, search and compilation. Examples include tools that would reduce the burden of curating instrument output files, enable data validation during upload, and enable full search and compilation of all measured and calculated instrument output variables. Whilst the creation of such tools is outside the scope of developing a data reporting format, we considered these needs in our decision-making processes. For example, we have provided an instrument output translation table that gives a column-by-column comparison of default output from commercially available gas exchange systems, created and defined a machine-readable vocabulary, defined units, and provided expected ranges for commonly measured variables that can aid validation and curation. Furthermore, we have mandated that data are published using a single non-proprietary file format (.csv), further reducing the challenge for long-term archival and cyberinfrastructure tool development.
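As an illustration of how a translation table and expected ranges might support validation of .csv files, the sketch below renames instrument columns to standard variable names and flags values outside an expected range. It is a hedged example only: the column names, standard names and ranges shown are hypothetical placeholders, not the published translation table or the ranges defined in the reporting format.

```python
import csv
import io

# Hypothetical crosswalk: instrument column name -> standard variable name.
# Illustrative assumption, not the published translation table.
CROSSWALK = {"Photo": "A", "Cond": "gsw", "Ci": "Ci"}

# Hypothetical expected ranges (min, max), for illustration only.
EXPECTED_RANGES = {"A": (-20.0, 100.0), "gsw": (0.0, 3.0), "Ci": (0.0, 5000.0)}


def harmonize(rows):
    """Rename mapped columns to standard names and drop unmapped columns."""
    return [
        {CROSSWALK[k]: float(v) for k, v in row.items() if k in CROSSWALK}
        for row in rows
    ]


def out_of_range(rows):
    """Return (row_index, variable, value) for each value outside its range."""
    flagged = []
    for i, row in enumerate(rows):
        for name, value in row.items():
            low, high = EXPECTED_RANGES[name]
            if not low <= value <= high:
                flagged.append((i, name, value))
    return flagged


# Made-up instrument output; the second row has a negative conductance.
raw_csv = "Photo,Cond,Ci,Obs\n12.3,0.21,280,1\n15.1,-0.05,310,2\n"
rows = harmonize(csv.DictReader(io.StringIO(raw_csv)))
flags = out_of_range(rows)
print(flags)  # [(1, 'gsw', -0.05)]
```

A repository could run a check of this kind at upload time, returning flagged values to the contributor rather than rejecting the file outright.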
To aid long-term development, the reporting format will be a dynamic document hosted as a public repository on GitHub, a version control platform (https://github.com/ess-dive-community/essdive-leaf-gas-exchange). This platform will allow the user community to flag issues, make suggestions, discuss amendments, and prioritize development of the reporting format, all in the open so the community can understand the motivation behind development and contribute to decision making. The published reporting format can be revised with minor edits, ensuring users can easily access the latest update. Contributors on the GitHub platform could also facilitate more substantial changes, such as the addition of new data types, leading to publication of a new version of this reporting format in the future.
We hope that widespread adoption of this first data reporting format for leaf-level gas exchange data will increase the preservation and reuse of these valuable and hard-won datasets and elevate the importance of data storage in the mindset of data contributors. We also hope that this work will form the cornerstone for a more comprehensive effort by the community to expand and develop the reporting format, including full consideration of additional data types that were beyond the initial scope of this effort.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.