On-Site Sensor Calibration Procedure for Quality Assurance of Barometric Process Separation (BaPS) Measurements

Barometric process separation (BaPS) is an automated laboratory system for the simultaneous measurement of microbial respiration and gross nitrification rates in soil samples. To ensure optimal functioning, the sensor system, consisting of a pressure sensor, an O2 sensor, a CO2 concentration sensor, and two temperature probes, must be accurately calibrated. For the regular on-site quality control of the sensors, we developed easy, inexpensive, and flexible calibration procedures. The pressure sensor was calibrated by means of a differential manometer. The O2 and CO2 sensors were simultaneously calibrated through their exposure to a sequence of O2 and CO2 concentrations obtained by sequentially exchanging O2/N2 and CO2/N2 calibration gases. Linear regression models were best suited for describing the recorded calibration data. The accuracy of O2 and CO2 calibration was mainly affected by the accuracy of the utilized gas mixtures. Because the applied measuring method is based on the O2 conductivity of ZrO2, the O2 sensor is particularly susceptible to aging and to consequent signal shifts. Sensor signals were characterized by high temporal stability over the years. Deviations in the calibration parameters affected the measured gross nitrification rate by up to 12.5% and affected the respiration rate by up to 5%. Overall, the proposed calibration procedures are valuable tools for ensuring the quality of BaPS measurements and for promptly identifying sensor malfunctions.


Introduction
The barometric process separation (BaPS) technique is a laboratory method for measuring gross nitrification and respiration rates in soil samples [1]. The BaPS technique "separates" the two microbial processes on the basis of total pressure and partial pressure changes in a gastight isothermal incubation system (Figure 1). The simultaneous measurement of air pressure, temperature, and O2 and CO2 concentrations allows for the quantification of changes in the total gas molecule number n and in the single gas molecule numbers of O2 and CO2 over time (µmol h⁻¹) and the solving of the system's gas balance equation:

$$\Delta n = \Delta \mathrm{O_2} + \Delta \mathrm{CO_2} + \Delta X \qquad (1)$$

In Equation (1), ∆X stands for the sum of gases other than O2 and CO2 that are involved in the gas balance of the incubation system (mainly N2, but also N2O, NH3, CH4 and other trace gases). Because the calculation of the microbial process rates is based on a balancing approach, the accurate measurement of ∆n, ∆O2, and ∆CO2 of Equation (1) is essential [1,2]. The BaPS measuring system (Meter Group (formerly UMS AG), Munich, Germany) includes a fully automated measuring head containing CO2, O2 and pressure sensors (Figure 1), allowing for a continuous (online) measurement of the three state variables at resolutions as low as one minute. The intensive monitoring of the barometric state inside

Figure 1. Photographs of the BaPS incubation system. (a) Open incubation chamber with cutouts for seven soil cores and measuring head containing the BaPS sensor set consisting of two temperature probes, a piezoresistive pressure sensor, an IR-CO2 sensor and a ZrO2-O2 sensor. (b) Closed incubation system with a gastight syringe piercing the rubber septum to introduce or withdraw gas to or from the BaPS headspace.
The reliability of the BaPS method and the quality of the results depend explicitly on the quality of the calibration of the BaPS sensors; this means an accurate translation of the sensor signals into concentrations or pressure. Calibration errors affect the resulting rates directly as well as on a superordinate level, because relevant measures and conditioned constants from the rate calculation routines are derived from a combination of different sensor measurements, e.g., measurements of the headspace volume Vhead; molecule numbers n, nO2, and nCO2; and physicochemical CO2 dissolution [1,3]. Therefore, accurate sensor calibration must be ensured for every measurement. Regular maintenance of proper sensor performance is essential to ensuring high measurement quality.
UMS recommended an annual technical check and recalibration of the sensors [4]. In 2016, however, during the merger of Decagon Inc. (Pullman, WA, USA) and UMS AG (Munich, Germany) to become the Meter Group, the BaPS system was not transferred to the joint product portfolio. Consequently, regular recalibration of the sensors by the manufacturer is no longer possible. The present study was performed before this merger, and the original motivation for developing an easy, do-it-yourself procedure for an on-site calibration of the sensors was that it could allow flexible and regular checks of the sensor calibrations in between the annual checkups by UMS. After the discontinuation of the manufacturer's calibration, on-site self-calibration has become the only way for BaPS users to recalibrate their sensors, which underlines the importance of the proposed procedure. With the help of this procedure, we were able to monitor sensor signal performance and stability over time, and we could analyze the influence of calibration variation and shifts on the resulting gross nitrification and respiration rates. In the present paper, we describe the calibration procedures in detail, assess the involved error sources, and estimate their potential effect on BaPS results.

The BaPS system uses two temperature probes. One probe measures the absolute air temperature in the headspace (Thead). The other temperature probe is inserted into a soil core to measure the temperature of the soil sample (Tsoil). The temperature probes are platinum (Pt 1000) resistance thermometers, which record the decrease in the electric potential due to temperature-induced resistance changes [4]. Pt sensors have a high accuracy (1/3 DIN B), have a low drift, and are stable over time. Thus, quality control and recalibration of the temperature sensors were not considered to be necessary; that is, the manufacturer calibration was left unchanged.

Pressure Sensor
The air pressure (P) in the BaPS chamber is measured with a piezoresistive pressure transmitter with an integrated amplifier and temperature compensation [4]. The absolute pressure is measured based on resistance changes due to the deformation of a silicon membrane, which changes the electric potential. The sensor provides an output signal between 0.4 and 2 V.
Pressure is the key variable of the BaPS system because all the other variables in the BaPS calculations directly or indirectly depend on it. Furthermore, the calibration procedure for the O2 and CO2 sensors proposed hereafter also depends explicitly on the accurate measurement of pressure changes.

O2 Sensor
O2 concentrations in the headspace are measured by means of a zirconium oxide (ZrO2) sensor. At high temperatures, ZrO2 becomes conductive for O2 and its resistance changes as a function of O2 concentration. Thus, at a constant voltage, the electric current changes accordingly. The O2 sensor is heated up to 500 °C and produces a 4 to 20 mA output signal (transferred to V by a high-precision resistor bridge). Owing to the measuring technique, the signal response is nonlinear and specific for the individual sensor [4].

CO2 Sensor
CO2 concentrations are measured with an infrared (IR) gas sensor that detects the attenuation of an infrared light beam that results from the selective absorption by CO2. This measuring technique is known to be very stable; however, the measurement accuracy specified by the manufacturer was the lowest for this sensor, and regular recalibration is explicitly recommended [4]. The CO2 sensor delivers a signal between 0 and 2.5 V for a measuring range between 0 and 3 Vol%.

System Settings
The electronic interface of the UMS BaPS system facilitates communication between the sensors and the computer. The sensor signals are then converted to the respective state variables (pressure, temperature, or gas concentration) by polynomial calibration functions of the form:

$$y = a + b\,x + c\,x^2 + d\,x^3 + e\,x^4 \qquad (2)$$

where y is the measured state variable (=measurand), x is the measured sensor signal and a, b, c, d, and e are the calibration parameters [4].
To calibrate a sensor, the raw sensor signal must be recorded. To achieve this, in the "Sensor specification" menu, the parameter b was set to unity and a, c, d, and e were set to zero. No other sensor setting was changed.
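The effect of these settings can be illustrated with a minimal Python sketch. It assumes that the calibration polynomial has the form y = a + b·x + c·x² + d·x³ + e·x⁴ (a as intercept and b as slope, consistent with the linear regression parameters used later); the function name `apply_calibration` and the example signal values are hypothetical and not part of the BaPS software.

```python
def apply_calibration(x, a=0.0, b=1.0, c=0.0, d=0.0, e=0.0):
    """Evaluate y = a + b*x + c*x**2 + d*x**3 + e*x**4 (Horner's scheme).

    The polynomial form and parameter order are assumptions of this sketch.
    """
    return a + x * (b + x * (c + x * (d + x * e)))


# Raw-signal mode used for calibration: b = 1, all other parameters 0,
# so the displayed value equals the raw sensor signal.
raw_signal = apply_calibration(1.234)

# Linear calibration using the UMS pressure parameters quoted in the text
# (a = 699.9, b = 250.08); the 1.2 V signal is a hypothetical input.
pressure_hpa = apply_calibration(1.2, a=699.9, b=250.08)
```

With the default arguments the function returns its input unchanged, which is the behavior the "Sensor specification" settings described above are meant to achieve.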
During calibration, sensor signals were logged at an interval of one reading per minute using the "Logging" function in the "Current readings" menu. To maximize temperature stability, the set-point temperature of the thermostat was adjusted to ambient temperature.
The BaPS system displays CO2 concentrations in µmol mL⁻¹, which can be transformed to concentrations in Vol%. Depending on the chosen unit, the raw signal of the CO2 sensor varies, so it is important to define the unit for which the calibration is performed within the "Current readings" menu. In this study, the CO2 sensor was calibrated in the unit µmol mL⁻¹.
The calibration procedures described hereafter were performed simultaneously on three independently operating BaPS systems in order to compare sensor-specific signal behaviors. Calibrations were repeated 6-8 times between 2012 and 2015.

Pressure Sensor Calibration Procedure
The pressure difference between the inside and the outside of the BaPS headspace was recorded by means of a differential pressure manometer with a relative accuracy of 0.2% (GDH 13 AN, Greisinger Electronic GmbH, Regenstauf, Germany) inserted through the rubber septum. At the beginning of a recording, the pressure inside the BaPS chamber was equalized to ambient pressure by piercing the septum with a needle. At zero differential pressure, the sensor signal was recorded manually from the "Current readings" window at high resolution (every 15 s) over a period of 1 to 3 min (5-10 readings). The signal mean was assigned to the absolute ambient air pressure measured by a barometer (G. Lufft Mess-und Regeltechnik GmbH, Fellbach, Germany). Next, the pressure inside the chamber was manipulated stepwise by introducing or removing air to or from the headspace using a gastight syringe (SL syringe 10 mL, Hamilton, Bonaduz, Switzerland). After each manipulation, the sensor signal was recorded over 3 min and the signal mean was paired with the prevailing absolute pressure inside the chamber, calculated as ambient pressure plus/minus the measured pressure difference. The expected pressure change induced by adding or removing gas from the chamber was computed with the ideal gas law (see below).
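The pairing of signal means with absolute pressures can be sketched in a few lines of Python. This is an illustration only: the function names and all numerical values are hypothetical, and the expected pressure change assumes an isothermal ideal gas with the syringe volume taken at chamber pressure.

```python
from statistics import mean


def calibration_point(signal_readings_v, ambient_hpa, differential_hpa):
    """Pair the mean raw signal (V) with the absolute chamber pressure (hPa).

    differential_hpa is positive for overpressure and negative for
    underpressure relative to ambient.
    """
    return mean(signal_readings_v), ambient_hpa + differential_hpa


def expected_dp(p_hpa, v_head_ml, dv_ml):
    """Isothermal ideal-gas estimate of the pressure change (hPa).

    dv_ml > 0 means gas injected, dv_ml < 0 means gas withdrawn; the gas
    volume is assumed to be at chamber pressure and temperature.
    """
    return p_hpa * dv_ml / v_head_ml


# Hypothetical example: three signal readings after injecting 10 mL of air
# into a 1000 mL headspace at 960 hPa ambient pressure.
signal_v, pressure_hpa = calibration_point([1.040, 1.042, 1.041], 960.0, 9.6)
dp_check = expected_dp(960.0, 1000.0, 10.0)
```

Comparing the manometer reading with `dp_check` gives a quick plausibility test for leaks or syringe handling errors before a data pair is accepted.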

O2 and CO2 Sensor Calibration Procedure
The O2 and CO2 sensors were calibrated simultaneously. Initially, the headspace air of the empty BaPS system was replaced by synthetic air (20.5% O2, 79.5% N2, relative accuracy ±2%, Westfalen GmbH, Münster, Germany) via intensive flushing (at least 30 min at 0.02 L s⁻¹, escape via an introduced needle). After complete gas replacement, the pressure inside the BaPS chamber was equalized with the ambient pressure by releasing overpressure through a needle. Subsequently, the headspace volume Vhead was determined in triplicate via volume extension [1,4]. Sensor readings in synthetic air were recorded for 10 to 15 min with the "Logging" function (10 readings). Then, a defined gas volume (usually 10 mL) was removed from the headspace using the lockable syringe. The pressure decrease induced by gas removal was recorded to determine the number of gas molecules removed. Next, the removed volume was replaced by a CO2/N2 gas mixture (20.5% CO2, 79.5% N2, relative accuracy ±2%, Westfalen GmbH, Germany). The number of added gas molecules was deduced from the measured pressure increase. After the gas exchange, the sensor signal was again logged for 10 to 15 min. The signal mean was related to the calculated O2 and CO2 concentrations (see below). When repeating the procedure several times, the O2 concentrations inside the chamber decreased stepwise, while the CO2 concentrations increased accordingly. The total pressure and N2 concentration remained unchanged during the procedure. The relevant measuring range was covered when the maximum voltage of the CO2 sensor (2.5 V) was reached.

Calculation of Headspace Gas Composition
The gas composition in the headspace was calculated on the basis of the molecule numbers for every gas exchange by mole balance calculations. The total amount of gas molecules in the system, n, was computed using the ideal gas law:

$$n = \frac{P \, V_{head}}{R \, T} \qquad (3a)$$

where R is the ideal gas constant (8.314 Pa m³ K⁻¹ mol⁻¹) and T (K) is the headspace temperature. The number of molecules removed from the system (nout), calculated from the pressure decrease after gas removal (dPout) (Equation (3b)), was subtracted from n. Then, the removed volume was replaced by the CO2/N2 gas mixture and the corresponding number of molecules (nin), calculated from the pressure increase after injection (dPin) (Equation (3c)), was added:

$$n_{out,i} = \frac{dP_{out,i} \, V_{head}}{R \, T_{out,i}} \qquad (3b)$$

$$n_{in,i} = \frac{dP_{in,i} \, V_{head}}{R \, T_{in,i}} \qquad (3c)$$

$$n_{i+1} = n_i - n_{out,i} + n_{in,i} \qquad (4)$$

where the subscript i indicates the i-th gas exchange performed. In the following, gas concentrations given in Vol% are indicated by [ ]. Initially (i = 0), 79.5% of n was N2 ([N2]0) and 20.5% was O2 ([O2]0), while CO2 ([CO2]0) was 0%. Changes in the individual pools were calculated using Equation (4) for each gas component individually. The molar fractions of nO2,i+1, nN2,i+1 and nCO2,i+1 in the total ni+1 give the concentrations of the three components after exchange ([O2]i+1, [CO2]i+1, and [N2]i+1). As CO2 sensor calibration requires CO2 concentrations expressed in µmol mL⁻¹, nCO2,i+1 was related to Vhead instead of ni+1 after each gas exchange.
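The mole balance for one gas exchange can be sketched as follows. The sketch assumes that Equations (3a)-(3c) are the ideal gas law n = P·V/(R·T) applied to the total pressure and to the measured pressure changes, and that the withdrawn gas has the current, well-mixed headspace composition. Function names and the numerical inputs (1000 mL headspace, 960 hPa, 9.6 hPa pressure steps) are hypothetical.

```python
R = 8.314  # ideal gas constant, Pa m^3 K^-1 mol^-1


def moles(p_pa, v_m3, t_k):
    """Ideal gas law: amount of gas (mol) from pressure, volume, temperature."""
    return p_pa * v_m3 / (R * t_k)


def exchange_step(pools, v_head_m3, dp_out_pa, dp_in_pa, t_out_k, t_in_k, gas_in):
    """Apply one gas exchange to the per-component pools (mol).

    pools:  dict component -> mol currently in the headspace
    gas_in: dict component -> mole fraction of the injected calibration gas
    Assumes the removed gas has the current headspace composition.
    """
    n_total = sum(pools.values())
    n_out = moles(dp_out_pa, v_head_m3, t_out_k)
    n_in = moles(dp_in_pa, v_head_m3, t_in_k)
    return {
        gas: amount - n_out * amount / n_total + n_in * gas_in.get(gas, 0.0)
        for gas, amount in pools.items()
    }


def vol_percent(pools):
    """Convert component pools (mol) to concentrations in Vol%."""
    n_total = sum(pools.values())
    return {gas: 100.0 * amount / n_total for gas, amount in pools.items()}


# Hypothetical starting state: synthetic air in a 1000 mL headspace.
v_head = 1.0e-3  # m^3
n0 = moles(96000.0, v_head, 293.15)
pools = {"O2": 0.205 * n0, "N2": 0.795 * n0, "CO2": 0.0}
co2_mix = {"CO2": 0.205, "N2": 0.795}

# One exchange: pressure drops by 960 Pa on removal, rises by 960 Pa on injection.
pools = exchange_step(pools, v_head, 960.0, 960.0, 293.15, 293.15, co2_mix)
conc = vol_percent(pools)
co2_umol_per_ml = pools["CO2"] / v_head  # mol m^-3 equals µmol mL^-1 numerically
```

In this example the N2 fraction is unchanged after the exchange, in line with the statement that total pressure and N2 concentration stay constant when the removed and injected amounts are equal.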

Assessment of Measurement Errors in Sensor Signal and Measurand
Data pairs collected for calibration contain measurement errors in both assessed variables, x and y, that is, an error in the sensor signal and an error in the measurand, respectively [5]. These errors may affect the accuracy of the resulting calibration. We quantified these errors separately according to the Guide to the Expression of Uncertainty in Measurement (GUM) [6,7] and evaluated their potential effect on the calibration parameters. For this, we considered data of several calibration runs in order to obtain a more general perspective on sensor properties and response behaviors.

Measurement Error in x
The measurement error in x, mex, also called sensor noise, is the standard deviation (s.d.) of the sensor signal. It denotes the sensor-specific fluctuation in the sensor signal under constant conditions (Figure 2) and depends on the specific measuring system and the individual sensor. Furthermore, it influences the calibration data points in the x direction and was quantified over the calibration range by recording the raw sensor signal at each calibration step at constant conditions for several (approx. 10) minutes.

Figure 2. Scheme of a calibration data set (several calibration x-y data pairs) and the different error sources involved in the estimation of calibration coefficients: measurement error in x (mex) and measurement error in y (mey). The straight line is the calibration line fitting the recorded x-y data pairs according to the ordinary least squares method (linear regression), and the dashed lines represent the 95% confidence limits of the entire calibration line. The dotted lines indicate a displacement of the calibration line in the y direction due to the standard error of the intercept (given the slope) (sea); the thin lines represent the standard error of the slope (given the intercept) (seb), which consists of rotations of the entire line around P(x̄, ȳ).

Measurement Error in y
The measurement error in y, mey, arises from the measuring procedures applied to derive the respective measurands (Figure 2). It represents the conjunction of the individual errors of the variables that are involved in the determination of the measurand value [7]. It also influences the calibration data points in the y direction and depends on the accuracy levels of the calibration gases and the measuring devices utilized [5].
In the case of the pressure measurements, mey depends solely on the accuracy of the utilized manometer, which was specified as a relative uncertainty of 0.2% of the measured pressure difference. No further measurement error was involved. Because the pressure variation in the calibration procedure is small (920-1030 hPa), we used the averaged mey as a constant measurement error over the calibration range.
To estimate the mey of the O2 and CO2 concentrations that result from the repeated gas removal and injection cycles, we used a Monte Carlo (MC) simulation approach [7,8].
To set up the simulation spreadsheet (MS Excel, 2010), every input variable of the calculation routine for O2 and CO2 concentrations according to Equations (3a)-(3c) and (4) (i.e., Vhead, P, dPout, dPin, T, Tout, Tin) was replaced by a random number drawn from a normal distribution using the built-in Excel function NORMINV, specified by the mean and standard deviation. In the case of the parameters dPout,i and dPin,i, seb of the P calibration was additionally considered. One thousand calibration runs were simulated using a macro sequence. The standard deviations of the resulting gas concentrations were considered as the mey for O2 and CO2.
Sensors 2023, 23, 4615
Here, we assumed that the two calibration gas mixtures (O2/N2 and CO2/N2) exactly exhibited the concentrations given by the filler in the calibration certificate, i.e., to this point systematic errors were ignored, and only random errors for the procedure-related input variables were considered.
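The Monte Carlo approach can be outlined in Python instead of a spreadsheet. The sketch below is not the authors' implementation: `random.gauss` plays the role of Excel's NORMINV applied to a uniform random number, the gas balance is the ideal-gas mole balance assumed for Equations (3a)-(4), and all means and standard deviations of the inputs are hypothetical placeholders.

```python
import random
from statistics import mean, stdev

R = 8.314  # ideal gas constant, Pa m^3 K^-1 mol^-1


def simulate_run(rng, steps=8):
    """Simulate one calibration run with noisy inputs.

    Returns the final O2 concentration (Vol%) and CO2 concentration
    (mol m^-3, numerically equal to µmol mL^-1). All input means and
    standard deviations below are hypothetical.
    """
    v_head = rng.gauss(1.0e-3, 1.0e-5)   # headspace volume, m^3
    p = rng.gauss(96000.0, 5.0)          # chamber pressure, Pa
    t = rng.gauss(293.15, 0.05)          # headspace temperature, K
    n = p * v_head / (R * t)
    o2, co2, n2 = 0.205 * n, 0.0, 0.795 * n
    for _ in range(steps):
        total = o2 + co2 + n2
        # Noisy pressure changes and temperatures for removal and injection.
        n_out = rng.gauss(960.0, 2.0) * v_head / (R * rng.gauss(293.15, 0.05))
        n_in = rng.gauss(960.0, 2.0) * v_head / (R * rng.gauss(293.15, 0.05))
        keep = 1.0 - n_out / total       # fraction of each pool remaining
        o2 = o2 * keep
        co2 = co2 * keep + 0.205 * n_in  # CO2/N2 calibration gas
        n2 = n2 * keep + 0.795 * n_in
    total = o2 + co2 + n2
    return 100.0 * o2 / total, co2 / v_head


rng = random.Random(42)  # fixed seed for reproducibility
runs = [simulate_run(rng) for _ in range(1000)]
o2_values = [r[0] for r in runs]
co2_values = [r[1] for r in runs]

# The spread over the simulated runs serves as the estimate of me_y.
me_y_o2 = stdev(o2_values)
me_y_co2 = stdev(co2_values)
```

Only random input errors are sampled here; a systematic offset in the calibration gas composition would shift all runs in the same direction and has to be assessed separately.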

Calibration Parameterization
According to Equation (2), the BaPS sensors require calibration functions in the form of a linear or polynomial regression, where the sensor signal corresponds to the x variable and the measurand (i.e., gas concentration or air pressure) corresponds to the y variable.
We used the ordinary least squares method in SigmaPlot version 11.0 (Systat Software, San Jose, CA, USA) to obtain the linear regression parameters a and b, i.e., the intercept and slope of the calibration line, and the corresponding standard errors of the parameters, sea and seb. The 95% confidence intervals (CIs) of the regression parameters were calculated as

$$CI_a = a \pm t_{crit} \, se_a, \qquad CI_b = b \pm t_{crit} \, se_b$$

where tcrit is the critical value of the t-distribution at a 95% confidence level (two-tailed value) for n−2 degrees of freedom (df) [9]; i.e., tcrit = 2.37 at df = 7.
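For readers without SigmaPlot, the same quantities can be computed with textbook ordinary least squares formulas. The following Python sketch is an illustration under that assumption; the function name `fit_line` and the nine data pairs are hypothetical, and the critical t value is passed in by the caller (2.365 for df = 7, as quoted above).

```python
from math import sqrt


def fit_line(x, y, t_crit):
    """Ordinary least squares fit y = a + b*x with standard errors and CIs.

    Uses the standard formulas with residual variance s2 = RSS / (n - 2).
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope
    a = y_mean - b * x_mean            # intercept
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)                 # residual variance
    se_b = sqrt(s2 / sxx)
    se_a = sqrt(s2 * (1.0 / n + x_mean ** 2 / sxx))
    return {
        "a": a, "b": b, "se_a": se_a, "se_b": se_b,
        "ci_a": (a - t_crit * se_a, a + t_crit * se_a),
        "ci_b": (b - t_crit * se_b, b + t_crit * se_b),
    }


# Hypothetical pressure calibration: nine signal means (V) and pressures (hPa).
signal_v = [0.88, 0.92, 0.96, 1.00, 1.04, 1.08, 1.12, 1.16, 1.20]
pressure_hpa = [920.1, 929.9, 940.0, 950.1, 959.9, 970.0, 980.1, 990.0, 999.9]
result = fit_line(signal_v, pressure_hpa, t_crit=2.365)
```

The ratio `result["se_b"] / result["b"]` gives the relative slope error, which is the quantity most relevant for rate calculations based on changes in the measurand.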
Applying standard linear regression presumes that the error in x is negligibly small. The effect of a failure of this assumption was tested by applying the normal functional model to the data [10]. This model assumes a law-like relationship underlying the observed data of the sensor signal and measurand [11,12]. The basic equation is given by

$$y_i = a + b \, x_i + e_i$$

where xi denotes a fixed but unknown variate and ei is the additive random and independent variation of yi (=residuals of yi) deriving from a normal distribution with a common variance σ².
xi is related to the observed variate zij by

$$z_{ij} = x_i + g_{ij}$$

where gij denotes the random variation of xi. Here, i indexes the xi-yi pairs in the regression (here, nine), whereas j indexes the replicated sensor measurements (here, five or ten). The statistical parameters estimated are a, b, σ², σz², and the nine xi. The estimation of the latter may seem unusual but is related to the assumption basic to the functional setting that x is a nonrandom variate. The parameters were estimated by using the nlmixed procedure of the SAS 9.2 software (SAS Institute Inc., Cary, NC, USA), which employs a maximum likelihood approach.
For each of the three sensors, the resulting calibration parameters a and b were identical to the parameters resulting from standard linear regression analysis. We concluded that mex was in all cases negligible, justifying the use of standard linear regression.
To assess the appropriateness of the linear model, the residuals of the linear model (ei) were checked for trends [9]. The residual analysis was performed on all available calibration runs as a general test of the linearity of sensor signal response. Additionally, Durbin-Watson coefficients were determined for each sensor on all available ei to detect autocorrelation.
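The Durbin-Watson coefficient is the sum of squared differences between successive residuals divided by the sum of squared residuals. A minimal Python sketch of this standard definition is given below; the function name and the example residuals are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals ordered along the x axis.

    Values near 2 indicate no first-order autocorrelation, values toward 0
    positive autocorrelation, and values toward 4 negative autocorrelation.
    """
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals of a calibration line (in units of the measurand).
dw = durbin_watson([0.02, -0.01, 0.03, -0.02, 0.01, -0.03, 0.02, -0.01, 0.00])
```

A curved sensor response fitted with a straight line typically produces runs of residuals with the same sign, which pushes the coefficient well below 2.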
The residuals ei are a measure of divergence of the measured data from the functional relationship in the y direction [11]. They combine the effects of known (mey) and unknown (t) measurement errors in y as well as of a possible error in the equation [13]. The variance of ei, i.e., σ², describes the variation of the measured data points around the calibration line. The unknown error component t can be obtained from the difference in variance:

$$\sigma_t^2 = \sigma^2 - me_y^2$$

which clearly cannot be separated further. t is inherent to the measurement and is also called the "individual part" of an observed quantity [14]. (Note that errors in x would also increase σ² [13]. However, the concordance of the functional and linear relationships shows that the contribution of mex to σ² was negligible.)

Systematic Calibration Errors
Until now, we have only considered the influence of random errors of the variables on the calibration line. The composition of the calibration gases, however, may cause a systematic error in the calibration parameters [8]. We estimated this systematic effect considering the 2% relative accuracy error indicated by the filler (gasE). In this way, we obtained the maximal range of O2 and CO2 concentration errors induced by calibration gas inaccuracies, which allowed the estimation of a maximal effect on calibration parameters.

Validation of O2 and CO2 Calibration
The obtained calibration functions for the O2 and CO2 sensors were cross-checked against two reference gas mixtures of known gas concentrations. We used a reference gas A with a composition of 17.6% O2 and 1.49% CO2 and a reference gas B with a composition of 19.7% O2 and 0.50% CO2, with the remaining percentages for both compositions made up of N2 (Westfalen GmbH, Germany). Both gases had a specified relative accuracy of 2% for each component.
The empty BaPS chamber was flushed consecutively with the two reference gas mixtures. The raw sensor signals, as well as the O2 and CO2 concentrations computed with the latest calibration parameters, were recorded for 15 to 20 min. The signal means were related to the known gas concentrations of the respective reference gas, and the resulting slopes were compared with the calibration slopes using a paired t-test at a significance level of α = 0.05.

Signal Stability
We analyzed several calibration functions determined between 2012 and 2015 in order to recognize general parameter variability and to identify signal drifts or sensor malfunctions. Two indicators of signal stability were adduced: signal strength stability and sensor response stability.

Effect of Calibration Variability on BaPS-Derived Turnover Rates
Finally, the effects of the different calibration errors and calibration functions on the BaPS results (respiration and gross nitrification rates) were analyzed by applying them to the raw sensor readings of an exemplary BaPS incubation. The exemplary BaPS incubation was performed with a silty loess soil from an arable field on the Filder plateau in southwest Germany (Steckfeld, 48°3.1 N, 9°11.5 E). The soil pH (water) was 7.76, and the organic carbon content was 0.8% (by weight). The carbonate content was 0.8% (by weight). BaPS calculations on this soil were performed using an adopted respiratory quotient of 0.84 and an experimentally determined CO2 dissolution capacity of 0.33 mmol L⁻¹ soil solution [15].

Data Availability
All the data analyzed or generated during this study are included in the text, figures, and tables of this article. Complete calibration datasets or further details on the MC

Measurement Errors in x
Analysis of several calibration runs revealed that all three sensors showed slight variations in the sensor signal, i.e., mex, along the calibration ranges. The pressure sensor showed a mean mex level of 0.2 mV, which corresponded to a pressure error of 0.05 hPa. The O2 sensor had a mean mex level of 0.15 mV, which was equivalent to a concentration error of only 0.002 Vol% O2. The mean mex level of the CO2 sensor was 7 mV and was thus higher than the noise levels of the other two sensors. This accounted for a concentration error of 0.005 µmol mL⁻¹ CO2.
The mex level of a sensor defines the maximal attainable precision of a sensor reading and, therefore, conditions the limit of detection for changes in the respective measurand [9]. In order to reliably distinguish a measured change from mex, e.g., during a BaPS measurement, the measured change should exceed three times the noise level [9]. This should be considered in the case of CO2 measurements, as mex is relatively high and concentrations are generally rather low.
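The conversion from signal noise to a minimum detectable change can be written out for the pressure sensor. In this Python sketch the noise (0.2 mV) is taken from the text and the slope (250.08 hPa V⁻¹) is the UMS pressure calibration slope quoted in the text; the function name and the use of a purely linear calibration are assumptions of the illustration.

```python
def detection_threshold(noise_signal_v, slope, factor=3.0):
    """Convert signal noise (V) to measurand noise and a detection threshold.

    slope is the calibration slope (measurand units per V); factor is the
    multiple of the noise level a change should exceed (three in the text).
    """
    noise_measurand = abs(slope) * noise_signal_v
    return noise_measurand, factor * noise_measurand


# Pressure sensor: 0.2 mV noise and a slope of 250.08 hPa per V.
noise_hpa, min_change_hpa = detection_threshold(0.2e-3, 250.08)
```

Here `noise_hpa` comes out at about 0.05 hPa, matching the pressure error stated above, so a pressure change should exceed roughly 0.15 hPa before it is interpreted as a real signal.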
In conjunction, the sensor noise levels of the five BaPS sensors (P, O2, CO2, Thead and Tsoil) define a general detection limit of the BaPS method for microbial turnover rates.
In order to use the standard linear regression model with the calibration data, mex needs to be sufficiently small. As the analysis of the calibration data sets with the mixed-model procedure confirmed that the measurement errors in x did not affect the resulting calibration coefficients, this requirement was considered to be fulfilled.

Measurement Errors in y
The mey of P originating from the calibration procedure is equivalent to the relative error of the manometer measurement of 0.2%. This led to a linear increase in mey with increasing pressure difference (dP). The mean mey of the considered calibration range was 0.052 hPa. The relative uncertainty of a measured dP was very low and dropped below 1% at a measured pressure difference of 6 hPa. Thus, the measured values used for calibration can be considered as highly accurate.
The mey of O2 and CO2 concentrations were estimated using MC simulations that consider the influence of the stochastic measurement errors of the input variables of the gas balance for each gas exchange (i.e., calibration step). For the entire calibration, we obtained a mean mey value of 0.0033 Vol% for the derived O2 concentrations and 0.0013 µmol mL⁻¹ for the derived CO2 concentrations. These values can be considered very low as they accounted, on average, for only 0.02% of the measured O2 concentrations and for 0.3% of the measured CO2 concentrations. These results indicate that the proposed calibration procedure based on repeated gas exchanges allows a precise setting of O2 and CO2 concentrations within the BaPS chamber (neglecting potential gasE, contaminations or other artifacts during the procedure). The high precision of the calculated concentrations can be attributed to the stable incubation conditions (concerning P and T) and the high accuracy of the dP measurements.

Vhead
Vhead is an input variable of the O2 and CO2 calibration procedure as well as of BaPS calculations in general. It is derived from the measurement of pressure change due to volume extension, so its accuracy depends directly on the accuracy of pressure calibration. According to the BaPS manual, the achievable relative accuracy for a pressure change measurement of the implemented pressure sensor is 0.3-0.5% [4]. Including a syringe error of 1%, UMS specifies the relative uncertainty of Vhead as 2%. Our estimations for Vhead uncertainty via Gaussian error propagation [5,6] of the syringe error and seb of the pressure calibration resulted in a standard deviation of Vhead of approx. 10 mL, which corresponded, at a total chamber volume of 1000 mL, to about 1% relative uncertainty and was mainly conditioned by the syringe error.
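The order of magnitude of this estimate can be reproduced with a short Python sketch. It assumes that Vhead is essentially proportional to the syringe volume divided by the measured pressure change, so that independent relative errors add in quadrature; the function name is hypothetical, and the relative slope error of 0.42% is the value reported for the pressure calibration in the text.

```python
from math import sqrt


def relative_uncertainty_vhead(rel_syringe, rel_dp):
    """Gaussian error propagation for a quotient of independent quantities.

    Assumes V_head is proportional to (syringe volume) / (pressure change),
    so relative uncertainties combine as a root sum of squares.
    """
    return sqrt(rel_syringe ** 2 + rel_dp ** 2)


# Syringe error of 1% and relative slope error of the P calibration of 0.42%.
rel_vhead = relative_uncertainty_vhead(0.01, 0.0042)
sd_vhead_ml = rel_vhead * 1000.0  # for a 1000 mL chamber volume
```

The combined relative uncertainty stays close to the syringe error alone, which illustrates why the syringe dominates the Vhead uncertainty.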

Sensor Calibration
In the following section, data and results of the linear regressions of one exemplary calibration per sensor (July 2015) are presented. The residual plots, however, show y deviations for all available calibration data sets collected between 2012 and 2015, in order to obtain a general perspective on the linearity of sensor signal response.
Plotting the three different measurands-pressure, O 2 concentration, and CO 2 concentration-against the recorded sensor signals reveals the strong linear relationship between the variables for each of the three data sets with coefficients of determination very close to one. Additionally, the standard errors of the calibration coefficients, especially the slope errors se b , were very low (Figure 3a,c,e), accounting for only 0.42%, 0.36%, and 0.39% of b for P, O 2 , and CO 2 calibrations, respectively. Because BaPS calculations are based on measurand change rates, the standard error of the calibration slope b is the most important indicator for calibration quality. The standard error of the calibration coefficient a is, in this case, of minor importance.
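The quantities discussed here (a, b, se a , se b and the coefficient of determination) follow from ordinary least squares. The sketch below spells out the standard formulas in plain Python; the function name and the example data are hypothetical and do not reproduce the calibration data of this study.

```python
import math


def fit_calibration_line(signal, measurand):
    """Ordinary least squares fit of measurand = a + b * signal.

    Returns intercept a, slope b, their standard errors and R^2.
    """
    n = len(signal)
    if n < 3:
        raise ValueError("at least three calibration points are required")
    x_mean = sum(signal) / n
    y_mean = sum(measurand) / n
    sxx = sum((x - x_mean) ** 2 for x in signal)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(signal, measurand))
    syy = sum((y - y_mean) ** 2 for y in measurand)

    b = sxy / sxx
    a = y_mean - b * x_mean

    residuals = [y - (a + b * x) for x, y in zip(signal, measurand)]
    sse = sum(e * e for e in residuals)
    s2 = sse / (n - 2)                    # residual variance
    se_b = math.sqrt(s2 / sxx)
    se_a = math.sqrt(s2 * (1.0 / n + x_mean ** 2 / sxx))
    r2 = 1.0 - sse / syy
    return {"a": a, "b": b, "se_a": se_a, "se_b": se_b, "r2": r2}


# Hypothetical example: sensor signal (V) against a reference measurand.
fit = fit_calibration_line([0.0, 1.0, 2.0, 3.0], [0.1, 0.9, 2.1, 2.9])
print(f"b = {fit['b']:.3f}, se_b/b = {fit['se_b'] / fit['b']:.2%}")
```

Because BaPS rates are computed from changes of the measurands, the ratio se_b / b printed at the end is the figure that corresponds to the percentages quoted above.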
MC simulations additionally allowed for the estimation of the impact of the me y errors on the calibration coefficients of the O 2 and CO 2 sensors. Those were even lower than the standard errors of the linear regression parameters (Table 1a), indicating that the effectiveness of calibration was not limited by the proposed procedure [12]. Further proof for the calibration effectiveness was obtained by comparing calibration coefficients with company-provided sensor calibrations. The pressure and CO 2 calibration coefficients (Figure 3, Table 1a) coincided almost perfectly with the UMS calibration coefficients (provided by UMS AG in May 2014): P: a = 699.9, b = 250.08; CO 2 : a = −0.01882, b = 0.51523. The UMS parameters for the O 2 sensor (a = −4.70367, b = 15.58962), however, deviated significantly from the presented calibration coefficients. Therefore, additional calibration validation measurements with two reference gases were performed. These confirmed the obtained calibration parameters of the O 2 and CO 2 calibration. The calibration slope of the CO 2 calibration showed a very good agreement with the reference slope (Figure 3e). The slope difference of 0.006 Vol% V −1 was not significant at a significance level of α = 0.05. In the case of O 2 , the difference between the calibration and the reference slope was 0.13 Vol% V −1 , which was rather high and at the edge of significance at a significance level of α = 0.05 (t = 2.388, t crit = 2.365).
However, we consider the concordance as rather good since the difference between calibration and the reference slope was considerably smaller than the difference between reference slope and UMS slope coefficient (0.65 Vol% V −1 ).
For both sensors, the differences between the calibration and reference slopes fell within the confidence limits of b (cf. Figure 3c,d).
Despite the differences in the coefficients, postulated concentration differences between the two reference gas concentrations, dO 2 of 2.1 Vol% and dCO 2 of 1 Vol%, were well reproduced by the calibrated sensors (<0.01 Vol%). This allows for the conclusion that the proposed calibration correctly reproduces concentration changes and that it is thus suited for BaPS measurement applications.
The calibration procedure operates in a dry atmosphere. During a typical BaPS measurement, however, the headspace atmosphere is humid due to the presence of the soil water phase. We tested the effect of humidity on the calibration by adding a water phase to the incubation chamber and the calibration gases. We did not find any systematic differences between wet and dry calibrations. The disadvantage of a wet calibration is that equilibration between the water phase and headspace atmosphere needs several hours. Therefore, and for convenience, we recommend performing the calibration under dry conditions.

Residual Analysis
Although calibration data showed clear linear trends, we analyzed the residuals of several calibration runs to confirm the linear relationship as the general sensor response. If the linear model correctly reflects the sensor response, residuals are expected to scatter randomly along the x axis. We found this expectation to be fulfilled for all three sensors. The Durbin-Watson coefficients indicated that the e i of the calibration data of all three variables were not auto-correlated. As no systematic trends were identified in the residuals (Figure 3b,d,e), we considered the linear regression to be appropriate and sufficient for sensor calibration. This is especially interesting in the case of the O 2 sensor, where a nonlinear sensor response is stated by the manufacturer [4,16]. Calibration with a linear function has the advantage that a simple a posteriori adjustment of measurand data is possible when calibration deviations are detected.
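For reference, the Durbin-Watson coefficient used in this check can be computed directly from the ordered residuals. The sketch below uses the standard definition; the function name and the example residuals are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic of residuals ordered along the x axis.

    Values near 2 indicate no first-order autocorrelation, values toward 0
    indicate positive and values toward 4 negative autocorrelation.
    """
    if len(residuals) < 2:
        raise ValueError("at least two residuals are required")
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero")
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2
        for i in range(1, len(residuals))
    )
    return numerator / denominator


# Hypothetical residuals of one calibration run, sorted by sensor signal.
example_residuals = [0.04, -0.12, 0.12, -0.04, 0.02, -0.03]
print(f"DW = {durbin_watson(example_residuals):.2f}")
```

A systematic curvature in the sensor response would show up as runs of residuals with the same sign and would push the statistic toward zero.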

Variance of the Residuals
Considering the residuals e i as a measure of the y divergences, these can be used to calculate σ 2 , i.e., the variation in the data points around the calibration line. The error me y explained only 2%, 11% and 20% of σ 2 of the P, O 2 and CO 2 calibration data, respectively (Table 2), which means that the random variability t of y (compare Equation (10)) accounts for a major part of the residual scattering. One should recall that t represents genuine variability of the experimental material and measurement individuality [11]. Thus, a reduction in the variance and improvement of calibration accuracy is not possible by way of calibration procedure adjustments, but instead depends on the measuring system and inherent variability. However, the residuals scatter within the calculated 95% confidence limits (Table 2). This, together with the high coefficient of determination of the regression, leads us to the conclusion that, overall, residual variances are very low and that the calibration lines reliably predict the respective measurands.
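The comparison of me y with σ 2 can be sketched as follows. The sketch assumes that σ 2 is estimated from the residuals with n − 2 degrees of freedom and that me y is a standard deviation, so that its share of the residual variance is me y squared divided by σ 2 . The function names and the numbers are hypothetical.

```python
def residual_variance(residuals, n_params=2):
    """Variance of the data points around the calibration line
    (sum of squared residuals divided by the degrees of freedom)."""
    dof = len(residuals) - n_params
    if dof <= 0:
        raise ValueError("not enough data points")
    return sum(e * e for e in residuals) / dof


def share_of_measurement_error(me_y, sigma2):
    """Fraction of the residual variance attributable to me_y
    (assumes me_y is a standard deviation)."""
    return me_y ** 2 / sigma2


# Hypothetical residuals and measurement error, in the unit of y.
sigma2 = residual_variance([0.04, -0.12, 0.12, -0.04])
share = share_of_measurement_error(0.04, sigma2)
print(f"sigma^2 = {sigma2:.4f}, share of me_y = {share:.0%}")
```

The remainder, one minus this share, corresponds to the random variability t that cannot be reduced by changing the calibration procedure.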

Potential Systematic Error Due to Calibration Gas Inaccuracies
The estimation of the me y of headspace O 2 and CO 2 concentrations proceeded on the assumption that the measured input variables were not affected by systematic errors. Assuming a potential error of 2% in the composition of the calibration gases induced deviations of 2% in the observed headspace concentrations. In the case of O 2 , the absolute standard deviation of the headspace concentrations decreased from 0.41 to 0.36 Vol% as the number of gas exchanges increased; it would thus be two orders of magnitude higher than the conservative estimate of me y . In the case of CO 2 , the standard deviation of the headspace concentrations increased from zero before gas exchange to 0.025 µmol mL −1 (0.059 Vol%) after the last gas exchange and would be one order of magnitude higher than me y . Additionally, the calibration coefficients reflected the 2% relative gasE (Table 1a). The induced deviations of the coefficient b by far exceeded its confidence limits (Figure 3c,e).
It is clear that the calibration quality depends directly on the accuracy of the utilized calibration gases. As they exert systematic effects on the calibrations of O 2 and CO 2 , they can only be corrected when exact quantification is possible. Therefore, a possibility for further improving the accuracy of the calibration procedure would be to cross-check the concentration of the applied calibration gases with an independent analytical method such as mass spectrometry.
As validation measurements confirmed the obtained calibration parameterization, we can assume high concentration accuracy for the utilized gases.
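Why a relative gas error passes directly into the slope can be shown with a minimal sketch. It assumes, for simplicity, that all reference concentrations are off by the same relative factor; the data and names are hypothetical.

```python
def ols_slope(x, y):
    """Slope of the ordinary least squares line y = a + b * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical O2 calibration sequence: sensor signal (V), concentration (Vol%).
signal_v = [1.30, 1.25, 1.20, 1.15, 1.10]
conc_true = [20.9, 20.1, 19.3, 18.5, 17.7]

gas_error = 0.02  # 2% relative error of the reference concentrations
conc_biased = [c * (1.0 + gas_error) for c in conc_true]

b_true = ols_slope(signal_v, conc_true)
b_biased = ols_slope(signal_v, conc_biased)
rel_slope_shift = b_biased / b_true - 1.0
print(f"relative slope shift: {rel_slope_shift:.1%}")
```

Since the regression is linear in y, a common factor on the reference values rescales the slope by the same factor, which is why this error cannot be detected from the residuals and has to be quantified independently.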

Sensor Signal Stability
Sensor signal stability is an important quality indicator for sensor performance and data reliability and, therefore, contributes to quality assurance for BaPS measurements. The two indicators, signal strength and response stability, are equally important in this context. The sensor signal of the piezoresistive pressure sensor showed very high stability in terms of the signal strength as well as signal response (Figure 4a). The sensor showed a low intercept variability of 2.5 hPa, which may be attributable to the measurement errors of the barometer that were used to determine the reference pressure. This variability, however, can be excluded from further discussion, as it has no influence on BaPS rate calculations. Between the years 2012 and 2015, the calibration slopes varied only 1.1% around the average b (Table 1b), without a trend that would indicate sensor aging (Figure 4a). Variations were probably induced by differences in the barometric and thermal background conditions or may be inherent to the measuring system. Despite smaller deviations, the operation of the pressure sensor can be considered stable and reliable.
Comparably, the CO 2 sensor showed rather stable signal strength and responses over time (Figure 4b). The slope parameter varied 6.1% around the average b (Table 1b) but did not show indications of signal drift or sensor aging. The rather high variation may be attributed to the various error components but also to inherent fluctuations in the signal response of the sensor. Since sensor calibration is performed in the low concentration range, the resulting coefficients are more susceptible to absolute errors, e.g., due to small contaminations during the procedure. Though recalibration is explicitly recommended for the CO 2 sensor [4], we found the sensor signal to be rather constant despite the variability of the calibration coefficients.
Repeated P and CO 2 sensor calibrations with three independent BaPS systems showed similar stability characteristics. The stability analyses confirmed that the pressure and the CO 2 sensor provide reliable, high-quality data.
Figure 5 shows temporal recalibrations of the O 2 sensor of three independent BaPS systems and visualizes fundamental differences in sensor signal stability. In all three BaPS systems, the O 2 sensor showed high variability in terms of signal strength. In BaPS system A (Figure 5a), the signal response to O 2 changes was rather constant over time, whereas the sensor signal showed high variability in its absolute strength. Fluctuations in the intercept coefficient a accounted for 20% relative variability; the slopes, however, varied only 2.6% over three years around the average b (Table 1b). Here, variability did not display a temporal trend. The undirected signal shifts may be caused by small variations of ZrO 2 properties induced by temperature or moisture conditions [4,16]. Thus, general stability can be assigned to the O 2 sensor of BaPS system A. By contrast, the O 2 sensor of BaPS system B (Figure 5b) showed a continuous drift in signal strength over time. This effect can be interpreted as an initial phase of sensor aging, and frequent recalibration is required to ensure BaPS data reliability. In BaPS system C (Figure 5c), massive changes in signal strength and response were observed over time. The observed drift clearly indicated severe malfunction of the sensor and the need for sensor replacement. Reliable BaPS measurements were not possible after 2013. The different behavior of the O 2 sensor signals over time underlines the necessity of a regular check of the proper functioning of the O 2 sensor. Due to the applied measuring method based on the O 2 conductivity of ZrO 2 , the O 2 sensor is especially susceptible to aging, entailing signal shifts. The issue of aging for ZrO 2 sensors is well known, and its development over time depends on the chemistry of the cell, operation conditions and environment [17].
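A simple way to screen a series of recalibrations for the two patterns described above (undirected scatter versus directed drift) is sketched below. It assumes that the relative variability is expressed as the sample standard deviation divided by the mean and that drift is summarized by a linear trend over time; both choices, the function names and the example values are illustrative assumptions rather than the evaluation used for Table 1b.

```python
import statistics


def relative_variability(coefficients):
    """Sample standard deviation relative to the mean coefficient."""
    return statistics.stdev(coefficients) / abs(statistics.mean(coefficients))


def linear_trend(times, values):
    """Least squares slope of values against time (change per time unit)."""
    t_mean = statistics.mean(times)
    v_mean = statistics.mean(values)
    stv = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    stt = sum((t - t_mean) ** 2 for t in times)
    return stv / stt


# Hypothetical slope coefficients b from yearly recalibrations.
years = [2012, 2013, 2014, 2015]
slopes_b = [15.1, 14.9, 15.2, 15.0]

print(f"relative variability of b: {relative_variability(slopes_b):.1%}")
print(f"trend of b per year: {linear_trend(years, slopes_b):+.3f}")
```

A small relative variability combined with a trend close to zero would correspond to the behavior of system A, whereas a trend that is large compared with the scatter would point to drift as in systems B and C.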

Influence of Calibration Errors on Measured Turnover Rates
We evaluated the effects of effective and potential calibration errors on the results of a BaPS measurement, i.e., the effects on respiration and gross nitrification rates. Table 3 shows the absolute and relative effects of different slope and intercept errors inherent to the present calibration (Figure 3 and Table 1). Calibration errors ascribable to the proposed procedure, i.e., resulting from the linear regression model (se a and se b ) and from measurement errors of the measurands (me y ), were very low (Table 1) and, thus, exerted small effects on the measured turnover rates (Table 3). The parameter errors induced by me y were lower than the standard errors of the calibration coefficients; correspondingly, the rates showed likewise smaller deviations. Overall, the calibration errors were very small in comparison with other error sources occurring during BaPS measurements and rate calculations, such as stoichiometric ratios or considered processes [1,3], as well as natural rate variability.
Table 3. Effect of the errors se a , se b , me y and gasE of the calibration "July 2015" on the derivation of respiration rates and gross nitrification rates. Listed are the deviations from the original rates in absolute and relative terms (the latter are in brackets).
The most important error source of the calibration was related to the accuracy of the calibration gas concentrations. As the gasE affected the important b parameters by 2%, effects on the turnover rates were clear, especially on nitrification rates (Table 3). The gasE calibration errors entailed maximal rate uncertainties of approx. 2% for respiration and approx. 13% for gross nitrification. This potential error can be reduced by the use of high-quality calibration gases. These findings show that ultimate conclusions on the uncertainty of turnover rates depend primarily on the quantification of systematic error sources such as gasE.
Overall, the most important source of uncertainty was the lack of signal stability and the resulting temporal variability of the calibration. As we concluded that the observed signal shifts were random, system-inherent fluctuations and calibration variabilities were responsible for a major part of the uncertainty of the derived turnover rates. Applying six calibrations to the raw sensor signals of an exemplary BaPS incubation resulted in deviations of 5% and 12.6% in the mean respiration and gross nitrification rates, respectively (Table 4).
Based on the influence of a single sensor's variability, it can be stated that pressure calibration variations hardly affected the calculated turnover rates. The respiration rates were apparently most sensitive to variations in the CO 2 calibration. The nitrification rates were likewise affected by O 2 and CO 2 calibration.
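How different calibrations of one sensor translate into different change rates can be illustrated with a minimal sketch. It only converts a raw signal change into the change rate of a single measurand and does not reproduce the BaPS gas balance or the calculation of respiration and nitrification rates; the calibration coefficients and signal values are hypothetical.

```python
def change_rate(signal_start, signal_end, hours, a, b):
    """Change of a measurand per hour after applying y = a + b * signal.

    The intercept a cancels in the difference, so only the slope b
    affects the resulting rate.
    """
    y_start = a + b * signal_start
    y_end = a + b * signal_end
    return (y_end - y_start) / hours


# Hypothetical (a, b) pairs from repeated recalibrations of one sensor.
calibrations = {
    "cal_1": (-4.6, 15.0),
    "cal_2": (-5.1, 15.3),
    "cal_3": (-4.2, 14.8),
}

# The same raw signal change is evaluated with every calibration.
rates = {
    name: change_rate(1.60, 1.58, 1.0, a, b)
    for name, (a, b) in calibrations.items()
}
mean_rate = sum(rates.values()) / len(rates)
relative_deviation = {
    name: (rate - mean_rate) / mean_rate for name, rate in rates.items()
}

for name, deviation in relative_deviation.items():
    print(f"{name}: {deviation:+.1%} relative to the mean rate")
```

In this simplified setting the relative spread of the rates equals the relative spread of the slopes; in the full BaPS calculation the effects of the three sensors combine through the gas balance, which is why the nitrification rate reacts more strongly than the respiration rate.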
It is worth discussing that the effect of CO 2 calibration variability on nitrification rates should be a function of the soil pH, since the calculation of nitrification rates depends highly on the critical gas balance term ∆CO 2,aq [1,15]. CO 2 calibration affects the absolute measurement of the CO 2 partial pressure and CO 2 concentration changes. Therefore, errors in the CO 2 calibration directly influence the estimation of the CO 2,aq of soils with pH above 6.5 and, thus, affect the critical dissolution rate ∆CO 2,aq .
Table 4. Effects of temporal calibration coefficient shifts of the O 2 , CO 2 and pressure sensors on respiration and gross nitrification rates. Based on the raw sensor data of an exemplary BaPS incubation, the respiration and nitrification rates were computed on the basis of six different calibrations performed between May 2012 and July 2015.

Summary and Conclusions
We successfully developed a procedure for cross-checking and adjusting BaPS sensor calibrations on site in order to guarantee the optimal performance of the BaPS measuring system and assure high measurement quality between company inspections. We showed that, within the considered ranges, calibration data could be evaluated using linear regression models.
The calibration accuracies of O 2 and CO 2 sensors were governed by the accuracy of the utilized calibration gases. Intrinsic calibration variability accounted for the greatest source of calibration uncertainty and exerted the strongest effect on BaPS turnover rates.
All three BaPS sensors generally showed a stable sensor response. However, the O 2 sensor is especially vulnerable to signal drift and sensor aging due to the utilized ZrO 2 measuring method operating at high temperatures of up to 500 °C [16]. A regular check of proper O 2 sensor functioning is advisable. The pressure sensor showed the highest signal stability and lowest effect on the measured turnover rates. A major part of the respiration and gross nitrification rate variability was attributable to shifts in O 2 and CO 2 calibration parameters; thus, we point out that the assurance of optimal functioning and the calibration of these two sensors are of particular importance for BaPS measurements.
The proposed procedures proved to be effective tools for detecting sensor malfunction and signal drift. Therefore, the procedures contribute to the quality assurance of BaPS measurements.