An experimental evaluation of the use of Δ13C as a proxy for palaeoatmospheric CO2

Understanding changes in atmospheric CO2 over geological time via the development of well-constrained and tested proxies is of increasing importance within the Earth sciences. Recently a new proxy (identified as the C3 proxy) has been proposed that is based on the relationship between CO2 and carbon isotope discrimination (Δ13C) of plant leaf tissue. Initial work suggests that this proxy has the capacity to deliver accurate and potentially precise palaeo-CO2 reconstructions through geological time since the origins of vascular plants (~450 Mya). However, the proposed model has yet to be fully validated through independent experiments. Using the model plant Arabidopsis thaliana exposed to different watering regimes and grown over a wide range of CO2 concentrations (380, 400, 760, 1000, 1200, 1500, 2000 and 3000 ppm) relevant to plant evolution, we provide an experimental framework that allows for such validation. Our experiments show that a wide variation in Δ13C as a function of water availability is independent of CO2 treatment. Validation of the C3 proxy was undertaken by comparing growth CO2 to estimates of CO2 derived from Δ13C. Our results show significant differences between predicted and observed CO2 across all CO2 treatments and water availabilities, with a strong underprediction of CO2 in experiments designed to simulate Cenozoic and Mesozoic atmospheric conditions (≥1500 ppm). Further assessment of the ability of Δ13C to predict CO2 was undertaken using Monte Carlo error propagation. This suite of analyses revealed a lack of convergence between predicted and observed CO2. Collectively these findings suggest that the relationship between Δ13C and CO2 is poorly constrained. Consequently Δ13C is of limited use as a proxy to reconstruct palaeoatmospheric CO2, as the estimates of CO2 are not accurate when compared to known growth conditions.

INTRODUCTION
Understanding both the long-term carbon cycle and rapid perturbations in atmospheric CO2 observed through the geological record has become an increasingly important area of scientific enquiry. A major limiting step in understanding the sensitivity of the climate system to changes in atmospheric CO2 over geological time has been the variability in modelled solutions of palaeo-CO2 concentration, which vary considerably both between model families (GEOCARB vs COPSE; Berner and Kothavala 2001; Bergman, Lenton and Watson 2004) and within them (GEOCARB III vs GEOCARBSULF; Berner and Kothavala 2001; Berner 2006).
For example, within the GEOCARB suite of models, comparisons between GEOCARB III and GEOCARBSULF suggest modelled values ranging from ~3400 ppm in the Early Triassic to ~500 ppm in the Late Triassic. Constraining these models and evaluating refinements made through model development requires mechanistically based CO2 proxies that have been independently tested and fully validated (Lomax and Fraser 2015).
Recent work on proxy development (Franks et al., 2014) has led to the suggestion that CO2 concentrations may have been substantially lower than previous reconstruction and modelling studies have indicated. Franks et al. (2014) suggested that large long-term CO2 perturbations (~2000-3000 ppm) are unlikely and that over geological time long-term CO2 may have been <1000 ppm since the evolution and radiation of forests in the Middle Devonian (Morris et al., 2015). Their data compare favourably to modelled solutions of the long-term carbon cycle (Berner 2006) and to the temporally and spatially limited proxy record generated from carbon isotope (δ13C) analysis of fossil liverworts (Fletcher et al., 2008). However, a sensitivity analysis of the Franks et al. (2014) model suggests an alternative interpretation under which Phanerozoic CO2 concentrations may have regularly exceeded 1000 ppm. That sensitivity analysis was critiqued by Franks and Royer (2017), and the critique was subsequently rebutted. More recently, Foster et al. (2017) compiled a series of CO2 estimates from the literature (see the SOM of Foster et al. (2017) for full details), integrating five independent methods (stomata, pedogenic δ13C, liverwort δ13C, foraminiferal δ11B and alkenone δ13C) to produce a LOESS CO2 curve for the last 420 million years. This compiled LOESS CO2 curve indicates that atmospheric CO2 concentrations are lower than GEOCARB predictions and partially supports the predictions of the Franks et al. (2014) model.
Recently a new proxy method has been developed based on the δ13C composition of C3 plant material and discrimination (Δ13C), with changes in Δ13C being used as a basis to reconstruct pCO2 (Schubert and Jahren 2012).
Following a full statistical analysis and quantification of uncertainty (Cui and Schubert 2016), this method has recently been used to estimate changes in CO2 through Cenozoic hyperthermals (Cui and Schubert 2017) and to reconstruct atmospheric CO2 through the Cretaceous (Barral et al., 2017). These results suggest atmospheric CO2 could be lower than previously thought, with particularly low CO2 estimates for the middle Cretaceous. However, these data plot outside of the 95% confidence limits of the Foster et al. (2017) study (Figure 1) and are at odds with stomatal-based estimates of CO2 through OAE 1d (Richey et al., 2018) and OAE 2 (Barclay et al., 2010). Most recently Schubert and Jahren (2018) have focused on assessing the effects of photorespiration on Δ13C, and thus on the C3 proxy, through a round of experiments growing Arabidopsis thaliana over a range of sub-ambient CO2 concentrations. These data were then combined with pre-existing datasets (Schubert and Jahren 2018) to investigate this relationship at 12 different CO2 concentrations spanning 97 ppm through to 2255 ppm. They conclude that a ~3.5‰ change in Δ13C can be ascribed to an increase in CO2 from ~100 to 2250 ppm and that the change in discrimination is independent of Ci/Ca (the ratio of internal CO2 to external CO2). However, this independence was not tested experimentally within their system.
If the use of Δ13C could be independently validated it would offer a major new resource for the palaeoclimate community, as C3 vascular plants are thought to originate in the Upper Ordovician (Middle Katian) (Steemans et al., 2009). This would be of particular importance as more CO2 predictions for the Lower Palaeozoic are urgently required. Clearly the development of a well-constrained proxy that could deliver a large number of estimates of CO2 through this time interval, and further back to the origin of vascular plants, would be a major advance in the understanding of the Earth system.
From an ecophysiological standpoint, changes in Δ13C are linked to changes in the water use efficiency (WUE) of plants, which is ultimately controlled by the opening and closure of the stomatal pore complex that regulates gas exchange. For Δ13C to be used as an accurate and precise method to reconstruct pCO2, the major requirement is to demonstrate that changes in CO2 are the main driver of changes in Δ13C. Further, this needs to be independent of other environmental conditions that can alter Ci/Ca, which in turn influences WUE. Factors that can influence Ci/Ca include, but are not limited to, irradiance (Ehleringer et al., 1986), temperature (Körner et al., 1991), salinity (Guy et al., 1980) and, logically, water availability (Farquhar et al., 1980; Kohn 2010). Diefendorf et al. (2010) reported a wide spread in Δ13C over a range of environments, and in a recent review Cernusak et al. (2013) highlighted that there is an inherent tension between viewing Δ13C as a sensor that responds to environmental cues or as a species-specific set point driven by internal physiological constraints. This is further demonstrated by the ongoing scientific debate that is trying to establish what isotopically derived calculations of Ci/Ca are a measure of and how closely they relate to carbon drawdown (Ci) (e.g. Seibt et al., 2008; Cernusak et al., 2009). The δ13C of plant tissue (δ13Cp) can also vary considerably within a plant canopy, with variation of ~6‰ being recorded from the base to the top of a single Fagus sylvatica (beech) tree (Schleser 1990). Overprinted on this environmentally driven variability are differences in isotopic composition due to discrimination associated with tissue type, reviewed in Gröcke (2002).
The fossil record acts as a strong filter. Therefore, if carbon from bulk organic matter is analysed to generate δ13C, it can be derived from different plant tissues and from plants from across a wide environmental gradient. It has previously been suggested that this filtering generates a smoothed average which might mitigate these effects when using δ13Cp to predict the isotopic composition of CO2 in air (δ13Ca) (Jahren and Arens, 2009). However, when tested experimentally using a sampling strategy designed to represent an allochthonous deposit, this assertion was not supported, as large differences between predicted and measured δ13Ca were observed (Lomax et al., 2012). Despite the finding of Lomax et al. (2012), the time-averaging effect of the fossil record has again been suggested as a factor with the capacity to mitigate and dampen other environmental signals (Schubert and Jahren 2012), again without testing this assertion in an experimental framework. These issues might be particularly acute in periods of large-scale carbon cycle and climate perturbation, as these events have the capacity to reshape standing terrestrial biomass by altering plant ecophysiological performance and by initiating floral overturn, both factors that can influence plant δ13C. These changes would then alter the isotopic composition of terrestrial organic matter in a manner that is potentially independent of changes in pCO2.
More broadly, and looking outside of the work that seeks to use Δ13C as a method to reconstruct pCO2, the nature of the relationship between δ13Cp, δ13Ca and CO2 in experimental systems needs to be clarified and responses tested (Lomax et al. 2012; Porter et al. 2018).
Within all experimental systems to date, δ13Ca becomes very negative when compared to the δ13C of natural atmospheric CO2. Currently it is unknown if the models developed to explore carbon isotope fractionation at natural δ13Ca values can be used when the value of the isotopic substrate is much more negative (compare ambient values of ~-8.0‰ to experimental values more negative than ~-30‰). Furthermore, over geological time the δ13Ca signature of CO2 is known to be well constrained, varying only slightly over the long term, with short-duration negative spikes shifting background values by ~2‰. There is also an issue of autocorrelation between atmospheric CO2 and δ13Ca, making any perceived relationship difficult to disentangle (Lomax et al., 2012; Porter et al., 2017).
Consequently, prior to the widespread deployment of such a novel proxy, there is a requirement for rigorous experimental assessment of how other environmental factors impinge on the capability of Δ13C to predict pCO2. Although Schubert and Jahren (2018) explicitly rule out changes in Ci/Ca being required to drive changes in Δ13C, this assumption was not tested because water availability was controlled. Here we test one of the most important factors associated with Ci/Ca, namely water availability, and how this factor influences Δ13C generated from leaf tissue. We then use this dataset to test the utility of the proxy to predict pCO2. As a first step in examining the validity of using isotope models constrained on ambient values of δ13Ca, we use an astomatal mutant (a plant which lacks stomata) to test assumptions linked to the Farquhar model of discrimination. This astomatal mutant differs from naturally occurring astomatal plants such as some species of bryophytes (e.g. Marchantia polymorpha) which, whilst lacking a stomatal pore and accompanying guard cells, have permanently open pores, allowing free exchange of CO2 between the atmosphere and the plant. Many more species of bryophyte lack fixed pores, with CO2 diffusing across the cell membrane. Consequently the δ13C signature of bryophytes has been used as the basis of the CO2 proxy BRYOCARB (Fletcher et al., 2005; 2006). This is because the confounding effects of the isolation of the sub-stomatal cavity via the opening and closure of the guard cell system are eliminated. As bryophytes lack a cuticle, diffusion of CO2 through to the site of fixation should also be less limited than in vascular plants, which have a cuticle. Diffusion will also be affected by the greater distance that CO2 has to travel to the site of fixation in vascular plants when compared to non-vascular plants.
Consequently we hypothesize that within the astomatal mutant calculated Ci/Ca (as a reflection of stomatal closure over the lifetime of leaf growth) should be close to zero, reflecting what is effectively a partially closed system.

Plant growth experiments (University of Sheffield)
Seeds of Arabidopsis thaliana (ecotype Col-0) were sown onto multipurpose compost (Arthur Bowyers, UK), covered with plastic film and stratified for 3 days at 4 °C. They were transferred into growth cabinets (Sanyo-Fitotron Model: SGC097.PPX.F, UK) and grown under a day/night regime of 8/16 hrs at 25/21 °C and 55% RH. Light levels during daylight hours were 230 µmol m-2 s-1. Six separate CO2 experiments were conducted, with CO2 held at concentrations of 380, 760, 1000, 1500, 2000 and 3000 ppm, with the δ13Ca signature becoming more negative as CO2 increases. Nested within each CO2 treatment, plants were also subjected to one of three watering regimes imposed after 4 weeks of growth: a low water treatment (10 ml per day per 7 cm diameter pot), a medium water treatment (20 ml per day per 7 cm diameter pot) or a high water treatment (consistently saturated compost). Following the imposition of the water treatment, plants were left to develop for a further 2 weeks, and leaves that had developed under each treatment were subsequently harvested for carbon isotope ratio analysis. To test specifically for an isotopic effect, within the 3000 ppm experiment we grew the astomatal mutant Hamlet and its associated wild type to test for variations in calculated Ci/Ca.

Plant growth experiments (University of Nottingham)
In Nottingham, seeds of A. thaliana (ecotypes Ler-0, Col-1 and Wa-1) were treated as above but grown in Levington M3 compost. Plants were placed into one of two controlled environment walk-in growth rooms (Unigrow, UK) and grown under a simulated day/night program with 10 hours of light per day. Light levels during daylight hours were 300 µmol m-2 s-1. Night temperature was set at a high of 17 °C and daytime temperature peaked at 20 °C. Relative humidity was set at 70%. CO2 was set at 400 ppm in one chamber and at 1200 ppm in the other. Within each CO2 treatment replicate plants were subjected to one of three water treatments (10 ml per day per 6.5 cm diameter pot, 20 ml per day per 6.5 cm diameter pot, or permanently saturated).

Carbon isotope analysis (University of Sheffield)
Five plants per treatment were analysed. Leaves were dried for one week at 70 °C and 0.1 mg of plant material per plant was homogenised in a pestle and mortar for analysis. Measurements were made using an ANCA GSL preparation module coupled to a 20-20 stable isotope mass spectrometer (PDZ Europa, Cheshire, U.K.). The δ13C values are reported as per mil (‰) deviations of the isotopic ratios (13C/12C) calculated to the VPDB scale using within-run laboratory working standards calibrated against IAEA-CH-6. Replicate analysis indicated a precision of < ±0.15‰. Air samples from growth cabinets were pumped into 10 ml evacuated gas-tight vials (Labco Exetainer Vials, Labco Ltd, UK) and analysed on the same mass spectrometer.

Carbon isotope analysis (British Geological Survey)
Plant material grown at the University of Nottingham was analysed at the British Geological Survey. Plant δ13C analyses were performed by combustion in a Costech Elemental Analyser (EA) on-line to a VG TripleTrap and Optima dual-inlet mass spectrometer, with δ13C values calculated to the VPDB scale using within-run laboratory standards calibrated against NBS-18, NBS-19 and NBS-22. Replicate analysis of well-mixed samples indicated a precision of < ±0.1‰ (1 SD). For δ13C analysis of the CO2, the gas was first separated from water vapour using a vacuum line. Measurements were made on an Isoprime dual-inlet mass spectrometer. The evolved CO2 was passed over a water trap prior to the mass spectrometer.
Isotope values (δ13C; δ18O not used) are reported as per mil (‰) deviations of the isotopic ratios (13C/12C, 18O/16O) calculated to the VPDB scale using a within-run laboratory standard calibrated against NBS-19. A Craig correction is also applied to account for 17O. Analytical reproducibility of the standard calcite (KCM) is < 0.1‰ for δ13C.
Discrimination, Δ13C, which is a proxy measure of integrated WUE (WUEi) over the lifetime of a leaf, is calculated as:
$$\Delta^{13}\mathrm{C} = \frac{\delta^{13}\mathrm{C}_a - \delta^{13}\mathrm{C}_p}{1 + \delta^{13}\mathrm{C}_p/1000} \qquad (1)$$
Calculated Ci/Ca is given by:
$$\frac{C_i}{C_a} = \frac{\Delta^{13}\mathrm{C} - a}{b - a} \qquad (2)$$
where δ13Ca is the carbon isotopic composition of the CO2 inside the growth cabinet, δ13Cp is the carbon isotopic composition of the leaf material, and a and b are constants linked to discrimination (a is the discrimination limited by diffusion, 4‰, and b is the discrimination limited by Rubisco, which can vary between 26 and 30‰ (Farquhar et al., 1989)).
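The two calculations can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard Farquhar et al. (1989) forms, Δ13C = (δ13Ca - δ13Cp)/(1 + δ13Cp/1000) and Ci/Ca = (Δ13C - a)/(b - a). The function names and the input values are hypothetical and are not measurements from this study.

```python
def discrimination(d13c_air, d13c_plant):
    """Carbon isotope discrimination (per mil) from air and plant delta values (per mil)."""
    return (d13c_air - d13c_plant) / (1.0 + d13c_plant / 1000.0)


def ci_over_ca(big_delta, a=4.0, b=29.0):
    """Calculated Ci/Ca from discrimination.

    a: discrimination limited by diffusion (per mil)
    b: discrimination limited by Rubisco (per mil), varied between 26 and 30 in the text
    """
    return (big_delta - a) / (b - a)


# Illustrative inputs only (hypothetical): ambient-like air and a C3 leaf.
delta_air, delta_plant = -8.0, -28.0
big_delta = discrimination(delta_air, delta_plant)
ratio = ci_over_ca(big_delta)
print(round(big_delta, 2), round(ratio, 3))
```

Because calculated Ci/Ca is linear in Δ13C, any Δ13C value above the chosen b returns a ratio greater than 1, which is why the choice of b matters for the physiological plausibility of the result.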

Statistical analysis
All data analysis was carried out in R v. 3.4.2 (R Core Team, 2017) using the package Rsolnp v. 1.16 (Ghalanos and Theussl, 2015). We generated CO2 predictions from our Δ13C data using the hyperbolic relationship developed by Schubert and Jahren (2012):
$$\Delta^{13}\mathrm{C} = \frac{A \cdot B \cdot (p\mathrm{CO}_2 + C)}{A + B \cdot (p\mathrm{CO}_2 + C)} \qquad (3)$$
where the asymptote A is equivalent to the maximum Rubisco fractionation value, b, in equation (2) (Schubert and Jahren 2012). While A can therefore vary between 26 and 30, Schubert and Jahren (2012) found the best fitting curve had A = 28.26, and this value has been used in subsequent papers (Schubert and Jahren 2013, 2015; Cui and Schubert 2016, 2017). B and C have been determined by iterative curve fitting, with the most recent formulation of the model having values of B = 0.22 and C = 23.85 (Cui and Schubert 2016).
We therefore used these A, B, and C values for predicting CO2 from our Δ13C data. We used the one-sample Wilcoxon signed rank test to test for significant differences between predicted and growth CO2, since this is a non-parametric test and does not assume normally distributed data (Crawley, 2005). Similarly, we used the Kruskal-Wallis test (Hammer and Harper, 2006) to test for differences in predicted CO2 among water treatment levels within each growth CO2 level.
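The prediction step can be illustrated with a short Python sketch. It assumes the hyperbola has the form Δ13C = A·B·(pCO2 + C)/(A + B·(pCO2 + C)), which reproduces a discrimination of about 4.4‰ at 0 ppm with the A, B and C values quoted above. The function names are hypothetical, and the analysis reported here was carried out in R rather than Python.

```python
A, B, C = 28.26, 0.22, 23.85  # parameter values quoted in the text (Cui and Schubert 2016)


def predict_delta(pco2, a=A, b=B, c=C):
    """Discrimination (per mil) expected at a given CO2 (ppm) under the assumed hyperbola."""
    x = b * (pco2 + c)
    return a * x / (a + x)


def predict_co2(big_delta, a=A, b=B, c=C):
    """Invert the hyperbola: CO2 (ppm) implied by a discrimination value (per mil).

    Undefined when big_delta equals a; negative when big_delta exceeds a.
    """
    return big_delta * a / (b * (a - big_delta)) - c


baseline = predict_delta(0.0)                      # discrimination at 0 ppm
round_trip = predict_co2(predict_delta(1500.0))    # should recover 1500 ppm
print(round(baseline, 2), round(round_trip, 1))
```

The inversion makes the behaviour near the asymptote explicit: as Δ13C approaches A from below the predicted CO2 grows without bound, and above A it changes sign.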
In addition to using the model parameters derived by Cui and Schubert (2016), we fitted three new curves to our data, one for each water treatment level. We maintained a similar approach to Schubert and Jahren (2012), using equation (3) to model the relationship between Δ13C and CO2 subject to the constraint that Δ13C = +4.4‰ when CO2 = 0 ppm. The curves were optimised by minimising the root mean squared error (RMSE), using the function "solnp" in the R package Rsolnp (Ghalanos and Theussl, 2015). Following Cui and Schubert (2016), confidence intervals were estimated for the model parameters via bootstrapping. Briefly, the residuals from the fitted curves were resampled with replacement and added back to the model Δ13C estimates to create a new pseudo-dataset, the curve was refitted and the new values of A, B and C were recorded. This process was repeated 10,000 times to create a distribution of model parameter values, and the 16% and 84% quantiles were used to construct 68% confidence intervals.
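The residual bootstrap described above can be sketched as follows. This is a simplified Python illustration and not the R/Rsolnp procedure used here: it assumes the hyperbolic form Δ13C = A·B·(pCO2 + C)/(A + B·(pCO2 + C)), holds B fixed at 0.22, derives C from the 4.4‰ constraint at 0 ppm, fits A by a plain grid search, uses 200 replicates instead of 10,000, and runs on synthetic data. All function names are hypothetical.

```python
import random

B_FIXED = 0.22  # held fixed for brevity; the text optimises A, B and C jointly


def c_from_constraint(a, b, d0=4.4):
    """C that forces discrimination to equal d0 (per mil) at 0 ppm for given A and B."""
    return d0 * a / (b * (a - d0))


def model(co2, a, b=B_FIXED):
    """Assumed hyperbolic relationship between CO2 (ppm) and discrimination (per mil)."""
    x = b * (co2 + c_from_constraint(a, b))
    return a * x / (a + x)


def fit_a(co2, delta, lo=20.0, hi=35.0, steps=301):
    """Grid search for the A minimising squared error (the same minimiser as RMSE)."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda a: sum((d - model(c, a)) ** 2 for c, d in zip(co2, delta)))


def bootstrap_a(co2, delta, n_boot=200, seed=1):
    """Residual bootstrap: resample residuals, add to fitted values, refit, collect A."""
    rng = random.Random(seed)
    a_hat = fit_a(co2, delta)
    fitted = [model(c, a_hat) for c in co2]
    resid = [d - f for d, f in zip(delta, fitted)]
    draws = []
    for _ in range(n_boot):
        pseudo = [f + rng.choice(resid) for f in fitted]  # sampling with replacement
        draws.append(fit_a(co2, pseudo))
    draws.sort()
    lower = draws[int(0.16 * (n_boot - 1))]  # 16% quantile
    upper = draws[int(0.84 * (n_boot - 1))]  # 84% quantile
    return a_hat, lower, upper


# Synthetic data: a "true" A of 27 with Gaussian noise (sd 0.5 per mil), three plants per level.
noise = random.Random(0)
co2 = [380, 760, 1000, 1500, 2000, 3000] * 3
delta = [model(c, 27.0) + noise.gauss(0.0, 0.5) for c in co2]
a_hat, lower, upper = bootstrap_a(co2, delta)
print(a_hat, lower, upper)
```

The 16% and 84% quantiles bracket the central 68% of the bootstrap distribution, which corresponds to roughly one standard error on either side for a near-normal estimator.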
We explored uncertainty in the CO2 predictions by performing Monte Carlo error propagation (Cui and Schubert 2016). As Δ13C approaches A, CO2 becomes inestimable by the model: CO2 estimates derived from Δ13C values just below A will exceed 10^6 ppm; at Δ13C values above A estimated CO2 becomes negative (this switch from positive to negative CO2 estimates, rather than ever-increasing positive CO2 estimates, is due to the hyperbolic relationship used in the model; see Cui and Schubert 2016 for details).
Following Cui and Schubert (2016) we therefore discarded any CO2 estimates <0 or >10^6 ppm. Here, ten estimated CO2 values were discarded for having a prediction of <0 ppm.
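A minimal Python sketch of this kind of error propagation, including the discard rule, is given below. It assumes the inverted hyperbola pCO2 = Δ13C·A/(B·(A - Δ13C)) - C with the A, B and C values quoted in the text, and for brevity it resamples only Δ13C from a normal distribution. Which inputs are resampled, the mean and standard deviation used, and the function names are all assumptions made for illustration.

```python
import random
import statistics

A, B, C = 28.26, 0.22, 23.85  # values quoted in the text


def co2_from_delta(big_delta):
    """CO2 (ppm) implied by a discrimination value (per mil) under the assumed hyperbola."""
    return big_delta * A / (B * (A - big_delta)) - C


def monte_carlo_co2(mean_delta, sd_delta, n=10_000, seed=42):
    """Draw discrimination values, invert each, and drop estimates <0 or >1e6 ppm."""
    rng = random.Random(seed)
    kept, dropped = [], 0
    for _ in range(n):
        d = rng.gauss(mean_delta, sd_delta)
        if d == A:  # inversion is undefined exactly at the asymptote
            dropped += 1
            continue
        estimate = co2_from_delta(d)
        if estimate < 0 or estimate > 1e6:
            dropped += 1
        else:
            kept.append(estimate)
    return kept, dropped


# Hypothetical input: mean discrimination close to the asymptote A.
kept, dropped = monte_carlo_co2(mean_delta=27.5, sd_delta=0.5)
print(len(kept), dropped, round(statistics.median(kept)))
```

With a mean this close to A, a visible share of the draws falls above the asymptote and is discarded, which is the situation described for the high-CO2 treatments.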
While Schubert and Jahren (2012) developed equation (3) from controlled growth experiments, palaeo-CO2 reconstructions have been carried out relative to Holocene Δ13C and pCO2 (Schubert and Jahren 2015; Cui and Schubert 2016). We followed this approach to test the impact on predicted CO2 calculated from our Δ13C data. Incorporation of Holocene baseline data adds three additional terms to the Monte Carlo error propagation: δ13Ca(t = 0), δ13Cp(t = 0), and pCO2(t = 0). We used the same normal distribution parameters as Cui and Schubert (2016), with δ13Ca(t = 0) = -6.4‰ ± 0.1‰, δ13Cp(t = 0) = -25.1‰ ± 1.6‰, and pCO2(t = 0) [...] resampled as described previously. As before we generated 10,000 CO2 values for each growth CO2 treatment level. In total, 6063 results were discarded for having a prediction of <0 ppm and 12 were discarded for predicting CO2 > 10^6 ppm.

RESULTS
Our data demonstrate considerable spread in Δ13C within each CO2 treatment as a function of watering regime (Fig. 2), suggesting that other factors not previously investigated in the context of the C3 plant proxy (Schubert and Jahren 2012) have the potential to influence estimates of CO2 based on Δ13C. To test the assertion that changes in carbon isotope fractionation (S) are proportionate to changes in CO2 and that this is the main factor that drives this relationship, three separate curves of Δ13C from the Sheffield and Nottingham experimental dataset were developed and the differences between these experiments and the original A. thaliana data set of Cui and Schubert (2016) were tested (Fig. 3). Comparison between our water availability treatments shows that differences are apparent, particularly when comparing the low water treatment (Fig. 3a) to the other water treatments. The 68% confidence interval on the A value for the 10 ml water treatment (A = 24.44 +1.78/-1.17) does not incorporate the A values for the 20 ml (A = 27.35) or saturated (A = 27.48) water treatments, and the curve for the 10 ml water treatment also has the highest RMSE (Table 1).
There are also differences when comparing our datasets to the proposed model (red lines in Fig. 3) of Cui and Schubert (2016). Again, this is most pronounced in the 10 ml water treatment, where the 68% confidence interval on the A value does not overlap with the 28.26 value used by Schubert and Jahren (2012) and Cui and Schubert (2016).
In an attempt to validate the current Δ13C C3 proxy we used the regression of Cui and Schubert (2016) to predict pCO2 and compared these predicted values to the known CO2 within the chamber (Fig. 4a). Our data show large differences between predicted and measured growth CO2 when plants are sampled as individuals. This variation increases as atmospheric CO2 in the chamber increases and becomes particularly apparent in the 3000 ppm experiment, where predictions span a CO2 range of ~950 to 21,680 ppm. Grouping the data by CO2 treatment (Fig. 4b) and comparing median CO2 predictions to growth conditions is analogous to the generation of a Δ13C signal from an allochthonous deposit that captures material from a broad range of environments. Again the data show a consistent underprediction of CO2 when the median predicted CO2 is compared to growth CO2.
While A has been fixed at 28.26 for previous proxy development and applications (Schubert and Jahren 2012; Cui and Schubert 2016, 2017), Cui and Schubert (2016) considered the impact of varying A from 26 to 30, with B and C changing accordingly. To test if other values of A, B and C would produce more accurate CO2 estimates relative to the growth conditions we used the alternative values provided by Cui and Schubert (2016) (Table 2). Comparing both the r2 values from regressions of estimated on growth CO2 and the root mean squared error of prediction (RMSEP) shows that the most accurate CO2 reconstructions are found with A = 30, which is in agreement with Cui and Schubert (2016). However, even A = 30 only yields an r2 value of 0.57 and an RMSEP of 824 ppm, with underestimation at all CO2 treatment levels (a full graphical display of predicted CO2 when A is varied is presented in appendix B in the supplementary information).
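For readers unfamiliar with the two accuracy measures, the Python sketch below shows how they can be computed. It takes RMSEP as the square root of the mean squared difference between estimated and growth CO2, and r2 as the squared Pearson correlation, which equals the r2 of a simple linear regression of estimated on growth CO2. The function names and the four data pairs are hypothetical and are not values from Table 2.

```python
import math


def rmsep(observed, predicted):
    """Root mean squared error of prediction, in the units of the inputs."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o, mean_p = sum(observed) / n, sum(predicted) / n
    sxy = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    sxx = sum((o - mean_o) ** 2 for o in observed)
    syy = sum((p - mean_p) ** 2 for p in predicted)
    return sxy ** 2 / (sxx * syy)


growth = [400, 1000, 2000, 3000]    # hypothetical growth CO2 (ppm)
estimate = [300, 700, 1500, 2200]   # hypothetical proxy estimates (ppm)
error = rmsep(growth, estimate)
fit = r_squared(growth, estimate)
print(round(error, 2), round(fit, 3))
```

The two measures answer different questions: r2 can be high for estimates that are consistently too low, whereas RMSEP penalises that bias directly, so a systematic underestimate shows up in RMSEP even when the correlation is strong.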
Monte Carlo error propagation using equation (3) shows a variety of responses (Fig. 6), but with underestimation of CO2 at all treatment levels ≤1500 ppm. When the full error propagation relative to the Holocene baseline data is carried out, the median CO2 estimates are similar to those derived from Monte Carlo error propagation using equation (3), but the spread of the distributions is much larger. This leads to a greater overlap with growth CO2 conditions but also a greater proportion of unrealistically high CO2 estimates (Fig. 6).
Using δ13Ca and δ13Cp to calculate Ci/Ca for both the Hamlet mutant and its wild type parent shows that a realistic Ci/Ca is only achievable in the wild type if Rubisco-limited discrimination (b) is > 29‰. When b is set to range between 26.00 and 28.25‰, Ci/Ca is >1, which from an ecophysiological standpoint is impossible. Analysis of the astomatal mutant Hamlet shows that the Δ13C for the mutant is 13.56 ± 0.22‰ (1 SD) and reveals that average Ci/Ca is 0.347 when b is set at 29 and 0.334 when b is set at 30 (Fig. 7). These Ci/Ca values reflect a relatively high internal CO2 concentration, which is incongruent with the astomatal nature of the plant and with the low discrimination value, which signifies a high WUE and "stomatal" closure as expected in an astomatal plant. It is also plausible that the δ13Cp value of the Hamlet leaf tissue could indicate recycling of internal CO2.

DISCUSSION
Our analysis shows that there are clear and consistent impacts of water treatment on leaf tissue Δ13C, which feed forward and affect the utility of the proxy to predict CO2. It is particularly clear that the 10 ml treatment (low water availability) diverges from predictions based on the Schubert and Jahren (2012) model, resulting in an underestimation of CO2. It should be noted that in their original publication (Schubert and Jahren 2012) the authors did state that water availability might be an important factor in their analysis. They consequently suggested that sampling be limited to sites with mean annual precipitation of >2100 mm, which in the modern world are limited to tropical and subtropical environments with consistently high water availability (Jaramillo and Cárdenas 2013; Wilf et al., 1998). However, in subsequent analyses this caveat seems to have been disregarded, with samples being taken from a number of non-tropical locations. Cui and Schubert (2017) do, however, suggest that mixed or reduced moisture signals might be a reason for a possible underestimation of their predicted CO2 through Cenozoic hyperthermal events.
Analysis of our validation data shows a distinct pattern, with a general underestimate in predicted CO2 when compared to growth CO2 up to atmospheric concentrations of 1500 ppm. This appears to be independent of water treatment. These data are of relevance to Mesozoic (Barral et al., 2017) and Cenozoic (Cui and Schubert 2017) CO2 reconstructions, as two recent studies using this technique have predicted what could be regarded as anomalously low palaeo-CO2, particularly through the Cretaceous, with estimates as low as ~280 ppm (Barral et al., 2017). If the C3 proxy systematically underpredicts CO2 across this range of CO2, this may go some way to explaining these CO2 predictions. It is well known that changes in salinity can affect WUE (Guy et al., 1980), so the deltaic/estuarine setting of these plant fossils (Barral et al., 2017) may further influence their δ13C composition. This would decouple the δ13C signature from the atmosphere, further limiting the potential of the δ13C of these plants to be used to predict CO2 even when the issues raised in our validation assessment are excluded. This combination of factors most likely explains why the majority of the Barral et al. (2017) data plot outside of the 95% confidence limits of the Foster et al. (2017) compilation (Fig. 1) and are anomalous when compared to stomatal-based estimates of CO2 through the Cretaceous (e.g. Barclay et al., 2010; Richey et al., 2018).
Our attempt to validate the methodology developed by Schubert and Jahren (2012) and subsequently expanded on by Cui and Schubert (2016) highlights the need to develop validation protocols that allow for the rigorous testing of new proxies (Jardine and Lomax 2017). These validation protocols should ideally be based on independent data sets or on the segmentation of the original data set, where a proportion of the data set is held back for validation. At the very least, cross-validation approaches, where each sample (or group of samples) is held back in turn and the value(s) predicted based on the model fit to the rest of the dataset, allow predictive accuracy to be assessed. However, it should be noted that this type of approach tends to be too optimistic when compared to independent methods of model validation (Mac Nally et al., 2017; Zimmermann et al., 2016). Prior to our analysis, C3 proxy validation has only been attempted in a geological setting, with Schubert and Jahren (2015) demonstrating a close relationship between ice core CO2 records and their CO2 reconstructions. However, subsequent work (Kohn 2016) suggests that this close agreement might be related to changes in the abundance of C3/C4 grasses that influence the δ13C record and are largely independent of atmospheric CO2 concentration. The lack of congruence in our experimental approach to validation lends support to the interpretation of Kohn (2016).
Analysis via error propagation (Cui and Schubert 2016; Schubert and Jahren 2018), whilst informing on the precision of the predictions, is of limited use in assessing the accuracy of the proxy, which underpins the model's utility. This is particularly problematic when the response variable, in this case Δ13C, is known to be sensitive to a large number of environmental stimuli (as discussed above) which are excluded from parameterisation. For example, in the initial study of Schubert and Jahren (2012) all variations in Δ13C were assumed to be driven solely by changes in CO2, despite the well-known effects of water availability and temperature on Δ13C, both of which could have varied considerably in the experimental setup of Schubert and Jahren (2012).
The model initially proposed by Schubert and Jahren (2012) and then developed by Cui and Schubert (2016) is heavily dependent on some baseline assumptions. For example, it is not possible for the model to predict palaeo-CO2 if the Δ13C values are greater than A. Using the preferred A value of 28.26 within our experimental dataset, this situation occurs nine times, all within the 3000 ppm experiment (four incidences occurring in Col-0 and the remainder in the Hamlet wild type). These findings indicate that the original C3 proxy model (equation 3) fails to adequately describe the underlying ecophysiological processes that drive changes in δ13Cp, which feed through to drive the Δ13C values that are then used to calculate palaeo-CO2. If a lower value of A is prescribed this problem is increased. Consequently environmental conditions which result in high levels of discrimination (high Δ13C) are unlikely to be suitable for this proxy. In the modern world high Δ13C values are associated with plants with open stomata that are typically not water limited, and given the well-known wetland mega bias (Spicer 1981) in the plant fossil record the sensitivity to high values of A could be problematic. The lack of suitability also raises philosophical questions about the utility of the approach, as the original model is only operable over a limited climate space.
Porter et al. (2017) used isotope data to calculate Ci/Ca and compared these calculated values to measured values of Ci/Ca derived from infrared gas exchange (IRGA) data. The difference between these two Ci/Ca values was then used to estimate b (Rubisco-limited discrimination), which equates to A in the C3 proxy model of Schubert and Jahren (2012). The b value providing the closest fit between measured and calculated Ci/Ca was 27. Using the b value of 28.26 preferred by Schubert and Jahren (2012), Porter et al. (2017) found an underestimate of 5% when comparing calculated to measured Ci/Ca.
Within their experimental system Porter et al. (2017) also found that a b value of ≥28.26 did not lead to a Ci > Ca when CO2 was elevated, a finding replicated in our data, leading them to suggest that factors other than b might influence measured Ci > Ca. Porter et al. (2017) demonstrate that "measured Ci/Ca varies with CO2, and with differing relationships by plant group indicating that to calculate Ci/Ca in response to changes in CO2 b should not be a fixed value" as previously suggested (e.g. Gröcke 2002). Consequently, fixing A (in equation 3) at 28.26, as per Schubert and Jahren (2012), is likely to lead to problems when predicting pCO2.
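The sensitivity of calculated Ci/Ca to the choice of b can be sketched with the commonly used simplified linear discrimination model, Δ13C = a + (b − a)·Ci/Ca, with Δ13C = (δ13Ca − δ13Cp) / (1 + δ13Cp/1000). This is an illustration only and may differ from the exact calculation used by Porter et al. (2017) or in this study. The diffusion term a = 4.4‰ and the example isotope values are assumptions, and the function names are hypothetical.

```python
A_DIFFUSION = 4.4  # ASSUMED fractionation during diffusion through stomata (per mil)


def discrimination(delta_a, delta_p):
    """Discrimination (per mil) from atmospheric and plant isotope values (per mil)."""
    return (delta_a - delta_p) / (1 + delta_p / 1000)


def ci_over_ca(big_delta, b, a=A_DIFFUSION):
    """Calculated Ci/Ca under the simplified linear model, for a chosen value of b."""
    return (big_delta - a) / (b - a)


# HYPOTHETICAL isotope values: atmosphere at -8 per mil, leaf tissue at -28 per mil.
example_delta = discrimination(-8.0, -28.0)

# Ratio of Ci/Ca calculated with b = 28.26 to that calculated with b = 27.
# Under the linear model this ratio does not depend on the discrimination value.
ratio = ci_over_ca(example_delta, 28.26) / ci_over_ca(example_delta, 27.0)
print(f"discrimination = {example_delta:.2f} per mil, ratio = {ratio:.3f}")

# A high discrimination value gives a physiologically impossible Ci/Ca > 1 with b = 27
# but not with b = 28.26.
print(ci_over_ca(28.0, 27.0) > 1, ci_over_ca(28.0, 28.26) > 1)
```

Under this simplified model the ratio reduces to (27 − 4.4) / (28.26 − 4.4) ≈ 0.947 whatever the value of Δ13C, which is of the same order as the roughly 5% difference reported in the text. The second check shows why a fixed b can force calculated Ci/Ca above 1 when discrimination is high.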
Our experiments, like those of Schubert and Jahren (2012) and the work of others (e.g. Fletcher et al., 2008; Porter et al., 2017) that were designed to investigate plant responses to elevated CO2, have been conducted in growth cabinets where the isotopic signature of the CO2 is not controlled (i.e. it co-varies with pCO2) and is highly perturbed when compared to natural settings. Comparison of Δ13C and Ci/Ca values of Hamlet and its wild type reveals intriguing and potentially anomalous results. As expected for an astomatal mutant, the discrimination value (Δ13C) indicates high WUE, suggestive of "stomatal" closure. However, values of calculated Ci/Ca indicate a degree of stomatal opening. The experiments we have conducted are based on the assumption that changes in Ci/Ca, as recorded by changes in fractionation (Δ13C), are to a large extent controlled by changes in stomatal opening as a function of the environment, specifically CO2 and water availability. However, a growing body of ecophysiology literature suggests that this relationship is not so straightforward. Recent work has shown that the isotopic composition of plant material can be altered by a variety of processes that occur after carbon fixation. For example, Busch et al. (2013) demonstrated that C3 plants can fix photorespired and respired CO2, which feeds through to affect Δ13C; Lanigan et al. (2008) demonstrated the effects of both photorespiration and carboxylation on Δ13C; Seibt et al. (2008) looked at the relationship between δ13Cp and water use efficiency across a variety of spatial and temporal scales; and Cernusak et al. (2009) reviewed six hypotheses relating to patterns of fractionation in C3 plants. Our work on the Hamlet mutant and the anomalous calculated values of Ci/Ca lend support to there being multiple factors that can influence δ13Cp, which in turn affect Δ13C and calculated Ci/Ca.
Alternatively, these data could also suggest that changes in the δ13C of the CO2 might be affecting the kinetics of Rubisco discrimination, or that the model used to calculate Ci/Ca breaks down when δ13Ca is very negative. Either or both of these factors would generate anomalous calculated values of Ci/Ca. Within our experimental system and that of Schubert and Jahren (2012) the concentration of CO2 and δ13Ca are positively correlated. This means that it is impossible to determine whether the changes in b that are required to maintain physiologically possible Ci/Ca values are driven by the CO2 concentration or by the isotopic value of that CO2. If changes in b are underpinned by the δ13C of CO2 rather than the concentration, then the reliability of the C3 proxy must be further examined, given that δ13Ca in experimental systems is very different to the natural atmosphere, as it was in the original study that developed the C3 proxy method. Together these findings suggest that using fossil values of Δ13C as a tool to predict palaeo-CO2 should be treated with caution, as the factors that govern fractionation and calculated values of Ci/Ca are still not fully understood in living plants. Schubert and Jahren (2018) recently suggested that changes in Δ13C in response to elevated CO2 are mathematically independent of Ci/Ca. In our data analysis we did not consider the effects of photorespiration on our calculated Δ13C in different CO2 growth conditions. However, we have clearly shown that manipulations of water availability alter Δ13C when plants are grown together in the same atmospheric conditions (Fig 3), and this in turn reduces the ability of changes in Δ13C to accurately predict pCO2.
Over geological time, whilst there have been large scale perturbations in the concentration of atmospheric CO2, the isotopic variation (δ13Ca) that accompanies this variation in CO2 is much reduced when compared to experimental systems. The relationship between CO2 concentration and δ13Ca in the experiments is therefore fundamentally different from that in the natural world. To fully disentangle this relationship, experiments over a wide CO2 gradient in which δ13Ca is kept constant are required. Ideally this experimental programme should be combined with other environmental manipulations that control Ci/Ca and be accompanied by a campaign of IRGA measurements to allow for comparison between measured and calculated Ci/Ca, as per Porter et al. (2017).

CONCLUSIONS
We set out to deliver a robust experimental framework to fully explore environmental controls on carbon isotope discrimination in plants. This was undertaken to validate the proposed C3 plant proxy as a tool to predict palaeo-CO2. Comparisons between predicted and growth CO2 concentrations show that the model fails to accurately predict CO2, with substantial under prediction of CO2 in experiments that were designed to simulate Cenozoic and Mesozoic atmospheric environments. Our findings suggest serious limitations in the proposed proxy, as the estimates of CO2 it delivers are neither precise nor accurate when compared to known growth conditions.

SUPPLEMENTARY MATERIAL
Supplementary data associated with this article can be found in the online version: Appendix A is a full statistical breakdown of results; Appendix B shows a full graphical representation of predicted pCO2 with variations in A; Appendix C provides the R code required to run the analysis; and Appendix D provides the δ13Cp and δ13Ca data.

[...] Jahren (2012) and Cui and Schubert (2016). The red line is the LOESS CO2 curve of Foster et al. (2017) and the grey shading is the 95% confidence limit of their CO2 prediction.

Figure 3
Change in Δ13C plotted against atmospheric CO2. a, change in Δ13C per ppm of CO2 (S); the calculation is based on Cui and Schubert (2016) and compares work presented in this study to that of Cui and Schubert (2016); b, change in Δ13C per ppm of CO2 for plants grown in the low water treatment (10 ml), dotted line is the curve fit for this dataset based on the protocol of Cui and Schubert (2016); c, change in Δ13C per ppm of CO2 for plants grown in the moderate water treatment (20 ml), dashed line is the curve fit as per Cui and Schubert (2016); and d, change in Δ13C per ppm of CO2 for plants grown in the high water treatment (Sat), solid line is the curve fit for this dataset as per Cui and Schubert (2016). The red line in panels b to d is the curve fit from Cui and Schubert (2016).

[...] shows individual data points for each experimental treatment; b, shows these data points plotted collectively as box plots analogous to a fossil sample collected from an assemblage composed of a transported flora (an allochthonous deposit); c, shows the average predicted CO2 for each CO2 experiment and the solid black line represents the one to one line; and d, is the average difference between predicted and measured CO2.

[...] ppm experiment (panel h) as Δ13C was greater than A for four of the five replicates; predicted CO2 from the one remaining data point was 2973 ppm. For these calculations of pCO2, A was set at 28.26 as per Cui and Schubert (2016). A full presentation of CO2 predictions when A is varied is given in Appendix B of the supplementary material.