Dataset for H2, CH4 and organic compounds formation during experimental serpentinization

Serpentinization refers to the alteration of ultramafic rocks that produces serpentines and secondary (hydr)oxides under hydrothermal conditions. Serpentinization can generate H2, which in turn can potentially reduce CO/CO2 and produce organic molecules via Fischer–Tropsch type (FTT) and Sabatier type reactions. Over the last two decades, serpentinization has been extensively studied in laboratories, mainly because of its potential applications in prebiotic chemistry, the origin of life in extreme environments, the development of carbon-free energies and CO2 sequestration. However, the production of H2 and organics during experimental serpentinization varies hugely from one publication to another. The experiments span a large range of pressure and temperature conditions, and the starting compositions of fluid and solid phases are also highly variable, which collectively adds up to more than a hundred variables and leads to controversial results. It is therefore extremely difficult to compare results between studies, explain their variability and identify the key parameters controlling the reactions. To overcome these limitations, we collected and analysed 30 peer-reviewed articles including over 100 experimental parameters and ca. 30 mineral and organic products, hence building up a database that can be completed and implemented in future studies. We then extracted basic statistical information from this dataset and demonstrate how such a comprehensive dataset is essential to better interpret available data and discuss the key parameters controlling the effectiveness of H2, CH4 and other organics production during experimental serpentinization. This is essential to guide the design of future experiments.


| INTRODUCTION
Earth's mantle is predominantly composed of peridotites, a type of rock made of Mg- and Fe-rich silicate minerals such as olivine and pyroxene. Though modern Earth's crust is not ultramafic, plate tectonics bring ultramafic rocks to the surface of Earth in various geologic settings, such as slow mid-ocean ridges (e.g. Mid Atlantic Ridge, Southwest Indian Ridge), oceanic transform faults (e.g. Vema OTF in the Atlantic, São Pedro and São Paulo Archipel OTFs in the Equatorial Atlantic, Shaka and Prince Edward OTFs in the South-West Indian Ocean) and convergent margins (e.g. Oman, New Caledonia). The alteration and hydration of peridotite result in the formation of serpentine group minerals (e.g. lizardite, chrysotile and antigorite) and secondary (hydr)oxides (e.g. brucite, magnetite). These serpentine-forming reactions are called serpentinization. During serpentinization, the ferrous iron in olivine and pyroxene is often oxidized to ferric iron, which produces H2 through the reduction of water. As a consequence, CO or CO2 could be reduced by H2 through a Sabatier type (R1) or Fischer-Tropsch type (R2) reaction to form CH4 and/or other organic compounds.
Analyses of many natural samples have shown abundant release of H2, CH4 and other organic compounds in fluids from natural serpentinization areas and questioned the exact reaction mechanisms involved (Barnes et al., 1967; Wenner and Taylor, 1973; Charlou et al., 2002; Proskurowski et al., 2006, 2008; Konn et al., 2009). Ultramafic rocks were abundant on the primitive Earth and possibly on other rocky planetary bodies (Ehlmann et al., 2010; Zahnle et al., 2011; Holm et al., 2015; Etiope et al., 2018), so these observations raise several major scientific questions related to serpentinization (Sleep et al., 2004; Schulte et al., 2006; Russell et al., 2010; Hellevang et al., 2011; Guillot and Hattori, 2013; Mayhew et al., 2013; McCollom and Seewald, 2013; Brazil, 2017; Ménez et al., 2018): What is the role of serpentinization in the origin of life on Earth and elsewhere? Could the serpentinization reaction sustain microbial communities in the primitive and modern ocean? Could our modern societies use the H2 produced by serpentinization reactions to help reduce anthropogenic CO2 emissions?
To address these questions, there is an urgent need to understand the serpentinization process and, more specifically, its capacity to generate reducing conditions and produce abiotic organics. Tens of experiments have therefore attempted to provide answers to these questions. Although they all agree on the production of H2 by serpentinization (Sleep et al., 2004; Seyfried et al., 2007), the production of CH4 and more complex hydrocarbons (e.g. C2H6, C3H8) via FTT or Sabatier reactions has always been, and still is, highly debated (e.g. Evans et al., 2013; McCollom and Seewald, 2013; McCollom et al., 2015). Despite the tremendous collaborative efforts of the community all over the world, we still do not fully understand why similar experimental approaches lead to such contrasting, if not contradictory, results. A strong limitation is that these numerous serpentinization experiments have been run under very different conditions using various creative protocols. Understanding the similarities and discrepancies between results requires comparing more than a hundred variables, which had not even been fully identified before the present study. We therefore carefully read 30 peer-reviewed publications that described experimental serpentinization, together with other publications for comparison, analysed 195 experiments and compiled their parameters in the dataset described in Section 2. This dataset will be continuously updated as new results become available and can be used for various purposes related to serpentinization and associated reactions.

| Overview of the dataset
We have collected the data in 30 relevant experimental articles that report measured H2 and organic compounds (OC) production related to the serpentinization reaction (Berndt et al., ...).

$$4\,\mathrm{H_2} + \mathrm{CO_2} = \mathrm{CH_4} + 2\,\mathrm{H_2O} \tag{R1}$$
$$(3n+1)\,\mathrm{H_2} + n\,\mathrm{CO_2} = \mathrm{C}_n\mathrm{H}_{2n+2} + 2n\,\mathrm{H_2O} \tag{R2}$$
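As a quick consistency check on the Sabatier type and Fischer-Tropsch type reactions (R1, R2), the following minimal sketch computes the stoichiometric coefficients of the general alkane-forming reaction for a given carbon number n and verifies the H, C and O atom balance. The function names are illustrative and are not part of the dataset; n = 1 recovers the Sabatier reaction.

```python
def ftt_coefficients(n):
    """Coefficients of (3n+1) H2 + n CO2 = CnH(2n+2) + 2n H2O for an alkane with n carbons."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return {"H2": 3 * n + 1, "CO2": n, "alkane": 1, "H2O": 2 * n}


def is_balanced(n):
    """Check that H, C and O atoms are conserved for carbon number n."""
    c = ftt_coefficients(n)
    # Reactant side: H2 carries 2 H; CO2 carries 1 C and 2 O.
    left = (2 * c["H2"], c["CO2"], 2 * c["CO2"])
    # Product side: CnH(2n+2) carries n C and 2n+2 H; H2O carries 2 H and 1 O.
    right = ((2 * n + 2) * c["alkane"] + 2 * c["H2O"], n * c["alkane"], c["H2O"])
    return left == right


for n in (1, 2, 3):
    print(n, ftt_coefficients(n), is_balanced(n))
```
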
The reported experiments covered a large range of experimental conditions, including the temperature (T), pressure (P), experiment duration and chemical compositions of both reactants and products, as well as the types of reactors and the origins of mineral samples. We summarized the information into a single large Excel spreadsheet of 133 columns and 733 rows. The spreadsheet columns are divided into three main sections (Figure 1): article information (green header), experimental conditions (blue header) and results (yellow header). The section on article information includes details of all published articles of this dataset: data ID, article title, year of publication, authors and DOI numbers, which help the readers of the present contribution to trace back the original studies. The 733 rows describe 195 experiments, some of which include multiple samplings over the course of the experiment to evaluate the reaction kinetics.
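Because several rows can belong to the same experiment, a first processing step is often to group the samplings of each experiment into a time series. The sketch below illustrates one way to do this with the Python standard library. The keys `experiment_id` and `duration_h` are hypothetical placeholders and are not the actual column headers of the spreadsheet.

```python
from collections import defaultdict


def group_by_experiment(rows, id_key="experiment_id", time_key="duration_h"):
    """Group sampling rows by experiment and sort each group by sampling time.

    `rows` is a list of dictionaries (one per spreadsheet row).
    `id_key` and `time_key` are hypothetical column names.
    """
    series = defaultdict(list)
    for row in rows:
        series[row[id_key]].append(row)
    for samples in series.values():
        samples.sort(key=lambda r: r[time_key])
    return dict(series)


# Toy rows, not taken from the dataset.
rows = [
    {"experiment_id": "A", "duration_h": 48, "H2": 2.0},
    {"experiment_id": "A", "duration_h": 24, "H2": 1.0},
    {"experiment_id": "B", "duration_h": 72, "H2": 0.0},
]
for exp_id, samples in group_by_experiment(rows).items():
    print(exp_id, [(s["duration_h"], s["H2"]) for s in samples])
```
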
Before describing each section of the dataset in detail, some general information is needed. In the section dedicated to experimental conditions, most parameters are independent of each other, but a few are dependent. For example, the magnesium content of olivine 'Mg#(Ol)' has a value only for experiments that include olivine as a reactant. As another example, the reacting minerals are expressed in wt% and sum to 100%. To allow analysis of the dataset, a feature is assigned to each cell as explained in the header of each column.

FIGURE 1 Screenshot of a representative version of the Excel spreadsheet for experimental parameters and results. The current dataset has 134 columns (parameters/variables) and 733 rows (measurements). The dataset is divided into three parts: article information, experimental conditions and results.

Some programs cannot handle datasets with empty cells, so we assigned 'nan' (not a number) to values that were either not measured or not reported, and '0' to measurements that were below detection limits or reported as 'not observed' in the original paper. We also assigned to each parameter a dedicated format, which is given in Tables 1-9 of the present contribution:
1. Numeric: data that are integers or floating-point numbers. For example, temperature data are mostly integers and NaCl concentration data are floating-point numbers.
2. Categorical: data that are not quantitative but categories, such as the rock type and degree of alteration.
3. Binary: data that are either 0 (no) or 1 (yes), such as the reactor types.
4. Ternary: data that are 0 (no), 1 (yes) or 2 (yes with 13C). For example, in the Carbon_in_solid column, 2 means the carbon is 13C labelled.
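The encoding conventions above ('nan' for not measured or not reported, '0' for below detection limit or not observed, and the four data formats) can be illustrated with a minimal parsing sketch. The function names are illustrative, and treating an empty string as 'nan' is an assumption made here for robustness, not a rule stated in the dataset.

```python
import math


def parse_cell(raw, kind):
    """Convert a raw cell string to a Python value.

    kind is 'numeric', 'categorical', 'binary' or 'ternary'.
    """
    text = raw.strip()
    if text == "" or text.lower() == "nan":
        # Not measured or not reported (empty cells handled the same way: assumption).
        return math.nan
    if kind == "categorical":
        return text
    value = float(text)
    if kind == "numeric":
        return value  # 0 means below detection limit or 'not observed'
    allowed = {"binary": (0, 1), "ternary": (0, 1, 2)}[kind]
    if value not in allowed:
        raise ValueError(f"{raw!r} is not a valid {kind} value")
    return int(value)


def is_missing(value):
    """True for cells encoded as 'nan', False for real values including 0."""
    return isinstance(value, float) and math.isnan(value)


print(parse_cell("2", "ternary"), is_missing(parse_cell("nan", "numeric")))
```

Keeping 'nan' and '0' distinct matters in any downstream statistics: a missing measurement should be excluded, whereas a zero is a genuine observation.
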

| Reaction conditions and reactor information
The reaction conditions are listed first in  Table 2). The reactor materials are also indicated since they are made of metals, whose catalytic role has often been suggested. For each experiment, one of the columns dedicated to the nature of the reactor has a value of 1, others are 0. For instance, a reactor made of both Au and Ti corresponds to '1' in the column Reactor_Au/Ti, other columns display a '0' for that experiment.
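The rule that exactly one reactor column equals 1 for each experiment can be checked programmatically. The sketch below assumes each row is a dictionary; 'Reactor_Au/Ti' is the column named above, while 'Reactor_Au' and 'Reactor_steel' are hypothetical names used only for illustration.

```python
def active_reactor(row, reactor_columns):
    """Return the single reactor column set to 1, or raise if the row is inconsistent."""
    active = [col for col in reactor_columns if row.get(col) == 1]
    if len(active) != 1:
        raise ValueError(f"expected exactly one reactor flag, found {len(active)}")
    return active[0]


# Only 'Reactor_Au/Ti' appears in the text; the other two names are hypothetical.
columns = ["Reactor_Au", "Reactor_Au/Ti", "Reactor_steel"]
row = {"Reactor_Au": 0, "Reactor_Au/Ti": 1, "Reactor_steel": 0}
print(active_reactor(row, columns))
```
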

| Starting mineral, fluid and gas compositions
The information on the composition of rocks, minerals, solutes in the aqueous phase and gas is summarized in Tables 3-5. Table 3 contains the provenance of samples, the type of rock, the degree of alteration (when indicated by the authors), the composition of the rocks (expressed in weight per cent of each mineral, normalized to 100%), the Mg# (olivine only), the grain size and other information on the starting mineral phases that we considered useful. The provenance of a sample is either its original geological location or the name of the company where it was synthesized. The estimated degree of alteration is based on which are dedicated to carbon speciation at high P and T. Grain size (max and min) and specific surface area (SSA, cm2/g of rock), which are very important for the kinetics of the reaction, are also reported in this section when available; otherwise they are labelled 'nan'. Simple statistical analyses of these parameters are displayed in Tables 3-5 to help readers decide whether this dataset contains data useful to them. The chemical composition of the starting aqueous solution is described in Table 4. The solutes include many organic and inorganic salts, whose concentrations are given in mol/kg (molal) because volumes change significantly under hydrothermal conditions. When articles reported in their experimental section that Milli-Q water (18.2 MΩ·cm) was used, the column 'precised_water_clean' displays a value of '1'; otherwise the value is '0'. The pH of the starting solution is also given as 'initial pH' when available. The initial amount of carbon among the starting chemicals is an important aspect of this study, because it allows the reduced carbon compounds produced through FTT, Sabatier or any other reactions to be compared between studies. We created a column 'CO2_initial' that indicates whether or not the authors flushed their system. If not, we assumed that the fluids were at equilibrium with present-day atmospheric CO2, which leads to a CO2 concentration of 0.01 millimole per kg H2O.
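The order of magnitude of this equilibrium CO2 concentration can be recovered from Henry's law. The sketch below is a rough estimate under stated assumptions, not the calculation used to build the dataset: it takes a Henry's law constant of about 0.034 mol/(kg·atm) for CO2 in pure water at 25 °C, an atmospheric CO2 mole fraction of about 400 ppm and a total pressure of 1 atm, and it ignores carbonate speciation and salinity.

```python
# Assumed values (not taken from the dataset):
K_H_CO2 = 0.034        # mol/(kg*atm), approximate Henry's constant for CO2 in water at 25 degC
X_CO2_AIR = 400e-6     # approximate mole fraction of CO2 in present-day air
P_TOTAL_ATM = 1.0      # total pressure in atm


def dissolved_co2_mmol_per_kg(k_h=K_H_CO2, x_co2=X_CO2_AIR, p_total=P_TOTAL_ATM):
    """Dissolved CO2 (mmol per kg H2O) in equilibrium with air, from Henry's law."""
    partial_pressure = x_co2 * p_total      # atm
    return k_h * partial_pressure * 1000.0  # mol/kg converted to mmol/kg


print(f"{dissolved_co2_mmol_per_kg():.4f} mmol/kg")
```

With these assumed inputs the estimate is about 0.014 mmol/kg, consistent with the rounded value of 0.01 mmol/kg used in the dataset.
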
When an experiment contained a headspace filled with gas, we reported the gas composition in as much detail as possible. Because the experimental methods sections of the articles seldom provide detailed information on the gas composition, it is given as the relative volume percentage of N2, CO2, H2, CO, CH4 and Ar in the headspace (Table 5). For each experiment, the initial concentrations of these gases are therefore reported as vol%, and the sum of these species in the headspace is set by convention to 100%, or to 0 when the information is missing.
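The convention that the headspace gases sum to either 100 vol% or 0 lends itself to a simple consistency check. The sketch below is illustrative only: the dictionary keys are hypothetical column names, and the tolerance of 0.5 vol% for rounding is an assumption.

```python
HEADSPACE_GASES = ("N2", "CO2", "H2", "CO", "CH4", "Ar")  # hypothetical column names


def headspace_status(row, tol=0.5):
    """Classify a row by the sum of its headspace gas fractions (vol%)."""
    total = sum(row[gas] for gas in HEADSPACE_GASES)
    if total == 0:
        return "missing"        # no headspace information reported
    if abs(total - 100.0) <= tol:
        return "complete"       # composition normalized to 100 vol%
    return "inconsistent"       # should be checked against the original article


example = {"N2": 80, "CO2": 20, "H2": 0, "CO": 0, "CH4": 0, "Ar": 0}
print(headspace_status(example))
```
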

| Potential catalysts and carbon sources
Previous studies (Fu et al., 2008; Andreani et al., 2013; Mayhew et al., 2013) have shown that the serpentinization reaction and the production of H2 and organic species (CH4, formate, acetate, etc.) can be largely influenced by the presence of cations in solution or of solid catalysts (accessory mineral surfaces).
In industrial H2 and CH4 production, metal-bearing catalysts are critical. Platinum, rhodium, ruthenium, cobalt and other metallic materials are well-known catalysts for the methanation of dry gas by FTT or Sabatier reactions (McKee, 1967; Melaet et al., 2014; Stangeland et al., 2017). In natural environments, a large number of metallic phases are present in ultramafic rocks and minerals, and they also occur as impurities in experimental materials.
The variability of the kinetics of serpentinization and of H2 and CH4 formation, both in nature and in experiments, could be due to the effect of catalysts, either in mineral form or as solutes (e.g. Andreani et al., 2013; Mayhew et al., 2013; Etiope and Ionescu, 2015). Therefore, we created a list of minerals with potential or expected catalytic effects and recorded their presence or absence in each experiment (Table 6). The carbon source(s) are also recorded in the dataset in a 'ternary' data format (defined in Section 2), with two main goals: (a) to identify whether the amount of reduced carbon products is favoured by certain carbon-bearing reactants; and (b) to help readers easily locate experiments labelled with 13C and further analyse the influence of background contamination.

| Other information
All other information regarding the starting materials is listed in Table 7. The 'isBlank' parameter indicates whether an experiment is a blank or control (1 for no mineral within a series of experiments), so that these runs can be distinguished from the other experiments and inaccurate interpretation of the data avoided. The 'Add_solids', 'Add_liquids' and 'Add_gases' parameters indicate the presence of a solid, liquid or gas phase, respectively (0 means absent). The 'mass_solids', 'mass_liquids' and 'Water_rock' ratio are also reported when available, as well as the total volume of the reaction cells.

| RESULTS
The results section of the dataset is divided into two subsections: mineral products and gas/fluid products. The columns that describe mineral products are less populated, since most contributions focused on the production of gas species (H2, CH4, etc.) and only a few of them also described the kinetics of serpentinization and analysed the final mineral composition.

| Mineral products
Analyses of mineral products are reported in Table 8, which includes all the minerals identified by the authors of the 30 peer-reviewed articles. Each mineral produced during the experiments is defined as a parameter, and reading through the columns, one can see that the data available in the literature are sparse. As a consequence, statistics on the minerals produced during the reaction are poorly constrained. The presence or absence of secondary minerals is not always clearly established in the articles and strongly depends on the details provided by the authors and on the characterization techniques used in those contributions. In some cases, descriptions of the solid phase did not mention the experiment number they referred to. In addition, in most experimental settings dedicated to the understanding of H2 and CH4 production during serpentinization, solids are accessible only at the end of the experiments. The mineral feature is assigned 'nan' if the information is not clearly stated. Hence, this part of the dataset should be used with great care, and we encourage the reader to refer to the original article as necessary.

| H 2 and hydrocarbon products in final fluids
This subsection focuses on the experimental measurements of H2, CH4 and OC during experimental serpentinization. We assigned an individual parameter to each compound. The details on the analysis of the composition of final fluids are described in Table 9. It is important to take into account that the measurement precision and detection limits for those compounds differ greatly across studies and have potentially evolved through time as analytical tools and methods have improved. Obviously, the

| DISCUSSIONS AND PERSPECTIVES
This dataset provides an up-to-date collection of experimental results until early 2019 that can be used to address the implications of serpentinization and related processes for the production of H2, CH4 and higher hydrocarbons. It is well known that P-T conditions largely control experimental results, but they alone cannot explain the large variability of the measured concentrations of H2, CH4 and higher hydrocarbons. The other parameters investigated by the experimental community at large are so numerous that they cannot be investigated easily and require a computational approach that can deal with many parameters at the same time. We hope that the database will be progressively enriched with upcoming experimental results. As an example, we have used data science techniques to extract embedded information and identify key experimental parameters, which cannot be accessed by checking a limited number of parameters. In our recent paper (Barbier et al., 2020), we used network analysis and machine-learning algorithms to analyse a processed version of this dataset. We found that, as previously known, pressure and temperature are the two most important parameters that govern the production of H2 and OC. However, we did not find evidence to support the occurrence of R1 or R2 reactions. Moreover, by comparing the concentrations of final OC with the initial carbon input, we found that the OC products in several studies come from unidentified sources and likely result from contamination, in agreement with the scarce 13C-labelled studies (e.g. McCollom et al., 2010; Grozeva et al., 2017). In addition, the initial and final pH values are often not reported, despite the important role of pH in the serpentinization kinetics (e.g. Huang et al., 2019; McCollom et al., 2020). More information and the detailed analytical methods can be found in Barbier et al. (2020).
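The comparison between final OC and initial carbon input can be illustrated with a simple carbon balance. The sketch below is not the method of Barbier et al. (2020). It is a minimal illustration, assuming that all concentrations are expressed in the same unit (e.g. mmol/kg), that 'nan' marks unreported values, and that the species names and the list of carbon numbers are hypothetical stand-ins for the dataset columns.

```python
import math

# Number of carbon atoms per molecule for a few example species (illustrative list).
CARBON_ATOMS = {"CH4": 1, "C2H6": 2, "C3H8": 3, "formate": 1, "acetate": 2}


def carbon_in_products(products):
    """Total carbon (same unit as the inputs) carried by the reported organic products."""
    total = 0.0
    for species, conc in products.items():
        if math.isnan(conc):
            continue  # not measured or not reported
        total += CARBON_ATOMS[species] * conc
    return total


def exceeds_carbon_input(initial_carbon, products):
    """True if product carbon exceeds the known carbon input, None if the input is unknown."""
    if math.isnan(initial_carbon):
        return None
    return carbon_in_products(products) > initial_carbon


# Toy numbers, not taken from the dataset.
products = {"CH4": 0.5, "C2H6": 0.25, "formate": math.nan}
print(carbon_in_products(products), exceeds_carbon_input(0.01, products))
```

An experiment flagged in this way produces more organic carbon than was knowingly introduced, which points to an unidentified carbon source such as background contamination.
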
The dataset can also be extended to other reactions under similar conditions using different solid reactants, such as mine-tailing products, to produce H2, CH4 and other carbon species (e.g. Kularatne et al., 2018; Michiels et al., 2018; Brunet, 2019). We hope that this dataset and the analysis by Barbier et al. (2020) will stimulate complementary experiments that could fill the identified gaps; this would allow future versions of this dataset to be updated in a few years and potentially help to better understand the fate of carbon under highly reducing conditions such as those produced during serpentinization.