A single-step protocol for closing experimental atom balances

Method details
When chemical reactions are performed, the corresponding element or atom balances should always close [1][2][3][4][5]. For example, if the carbon balance is considered in a non-nuclear reaction, the number of moles of carbon initially present should equal the number of moles of carbon recovered in the reaction products. Typical acceptable ranges for an atom balance are between 90 % and 110 %. Experimental error is usually invoked to explain why atom balances are not exactly equal to 100 %.
This manuscript describes a simple method to set atom balances equal to 100 %. A notable consequence of the CLOBAL procedure is a more accurate calculation of conversion and selectivity values and a lower residual sum of squares during parameter estimation, accompanied by smaller confidence intervals for the parameters [1].
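The closure percentage mentioned above can be computed directly from the atom counts and the measured amounts. The following is a minimal sketch; the function name and the two-compound example values are hypothetical and are not taken from the paper.

```python
def atom_balance_percent(atoms_per_molecule, moles_in, moles_out):
    """Return the balance closure (in %) for one element.

    atoms_per_molecule[j] is the number of atoms of the element in compound j;
    moles_in[j] and moles_out[j] are the initial and the measured amounts.
    """
    atoms_in = sum(a * w for a, w in zip(atoms_per_molecule, moles_in))
    atoms_out = sum(a * w for a, w in zip(atoms_per_molecule, moles_out))
    return 100.0 * atoms_out / atoms_in


# Hypothetical example: compound A holds 2 C atoms, compound B holds 4 C atoms.
carbon = [2, 4]
feed = [1.00, 0.00]        # mol of A and B at the start
measured = [0.40, 0.29]    # mol of A and B found by analysis
print(round(atom_balance_percent(carbon, feed, measured), 1))  # 98.0
```

A value of 98.0 % would fall inside the 90 % to 110 % window quoted above, yet it still leaves a 2 % gap that the correction procedure is meant to remove.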
Consider n measurements of n physical quantities, whose 'true' values are called $w_j$, j = 1 ... n. For the sake of example, these quantities are the outlet molar flow rates in a mixture of n compounds, $A_j$. Each compound $A_j$ contains $a_{i,j}$ atoms of type $e_i$, i = 1 ... m. Normally the number of compounds exceeds the number of elements taken into account, i.e., m < n. Since no nuclear reactions or transformations are included, Eq. (1) holds for the true values, with $w_{j,0}$ the initial value of quantity $w_j$:
$$\sum_{j=1}^{n} a_{i,j}\,(w_j - w_{j,0}) = 0, \qquad i = 1 \ldots m \tag{1}$$
Eq. (1) is an ideal representation, i.e., all the balances for atom type $e_i$, i = 1 ... m, are 100 % closed.
In reality this is not the case due to experimental error and, hence, the experimental values for the molar flow rates, absolute numbers of moles or concentrations do not satisfy Eq. (1). The purpose of this manuscript is to offer a method that applies small corrections to these physical quantities in order to close the balances to 100 %. The order of magnitude of these corrections is comparable to the error related to typical calibration data, as outlined in the companion paper [1]; if the calibration curve has a high $R^2$, only small corrections to the concentrations, mole fractions, or derived flow rates are to be expected with this method. The proposed corrections to the physical quantities, $w_{j,c}$ with j = 1 ... n, should result in full closure of the m balances, so that Eq. (2) is valid, in which $w_j$ from here on denotes the measured value:
$$\sum_{j=1}^{n} a_{i,j}\,(w_j + w_{j,c} - w_{j,0}) = 0, \qquad i = 1 \ldots m \tag{2}$$
Eq. (2) represents m so-called 'fundamental relations' for the n corrections $w_{j,c}$. Hence, n - m additional relations are required to solve for all of their values. These follow from Eq. (3), which states that the weighted sum of squared corrections should be minimal, with $\omega_j$ the weight factor for correction $w_{j,c}$:
$$R = \sum_{j=1}^{n} \omega_j\, w_{j,c}^{2} \rightarrow \min \tag{3}$$
Eqs. (2) and (3) form the basis of a Lagrange multiplier optimization problem: R needs to be minimized subject to the equality constraints of Eq. (2). The great advantage of the Lagrange multiplier method is that the constraint equations need not be solved explicitly in order to eliminate extra variables. The complete function to be minimized, also called the Lagrangian function S [6], with the Lagrange multipliers $2\lambda_i$ (i = 1 ... m), reads as Eq. (4):
$$S = \sum_{j=1}^{n} \omega_j\, w_{j,c}^{2} + 2\sum_{i=1}^{m} \lambda_i \sum_{j=1}^{n} a_{i,j}\,(w_j + w_{j,c} - w_{j,0}) \tag{4}$$
The prefactor 2 on the equality constraints is added for convenience, so that it cancels against the factor 2 that results from differentiating the quadratic function (3).
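Taking the derivative of S with respect to $w_{j,c}$ and setting it to zero gives Eq. (5), with $\omega_j$ the weight factor of Eq. (3) and $\lambda_i$ the Lagrange multipliers:
$$\frac{\partial S}{\partial w_{j,c}} = 2\,\omega_j\, w_{j,c} + 2\sum_{i=1}^{m} \lambda_i\, a_{i,j} = 0, \qquad j = 1 \ldots n \tag{5}$$
From Eq. (5) the optimized corrections for the n flow rates, $w_{j,c}$, are given by Eq. (6):
$$w_{j,c} = -\frac{1}{\omega_j}\sum_{i=1}^{m} \lambda_i\, a_{i,j}, \qquad j = 1 \ldots n \tag{6}$$
Eq. (6) contains n relations and m + n unknowns; hence, m additional relations are needed, which are provided by Eq. (2). Substituting Eq. (6) into Eq. (2) gives Eq. (7):
$$\sum_{i=1}^{m} \lambda_i \sum_{j=1}^{n} \frac{a_{k,j}\, a_{i,j}}{\omega_j} = \sum_{j=1}^{n} a_{k,j}\,(w_j - w_{j,0}), \qquad k = 1 \ldots m \tag{7}$$
Eq. (7) represents a set of m linear relations for the Lagrange multipliers $\lambda_i$, i = 1 ... m. Upon solving, the multipliers are inserted into Eq. (6) to obtain the correction for each of the n molar flow rates:
$$w_{j,c} = -\frac{1}{\omega_j}\sum_{i=1}^{m} \lambda_i\, a_{i,j}, \qquad \lambda_i \text{ from Eq. (7)} \tag{8}$$
The corrected quantities $w_j + w_{j,c}$, j = 1 ... n, satisfy the balances of Eq. (1) exactly. Expressions (7) and (8) are sufficiently detailed to replicate the presented CLOBAL protocol. They can be written in general matrix notation, which forms the basis of the Excel macro that gives the corrections.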
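The two-stage calculation described above (solve the m linear relations for the multipliers, then insert them into the expression for the corrections) can be sketched in plain Python. This is an illustrative sketch and not the author's VBA macro: the function names, the sign convention for the multipliers (constraint term added to the weighted sum of squares) and the three-compound example data are assumptions made here.

```python
def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    size = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: atom balances are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(size):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][size] / aug[i][i] for i in range(size)]


def clobal_corrections(a, w, w0, omega):
    """Return (multipliers, corrections) for one set of measurements.

    a[i][j]: number of atoms of type i in compound j (m rows, n columns)
    w, w0:   measured and initial quantities of the n compounds
    omega:   weight factor of each correction
    """
    m, n = len(a), len(w)
    # Coefficients of the m linear relations: sum_j a[k][j] * a[i][j] / omega[j]
    lhs = [[sum(a[k][j] * a[i][j] / omega[j] for j in range(n)) for i in range(m)]
           for k in range(m)]
    # Right-hand side: the unclosed part of each atom balance
    rhs = [sum(a[k][j] * (w[j] - w0[j]) for j in range(n)) for k in range(m)]
    lam = solve_linear(lhs, rhs)
    # Correction of each compound from the solved multipliers
    w_c = [-sum(lam[i] * a[i][j] for i in range(m)) / omega[j] for j in range(n)]
    return lam, w_c


# Hypothetical example: C2H6 -> C2H4 + H2, balances on C (row 0) and H (row 1).
a = [[2, 2, 0],
     [6, 4, 2]]
w0 = [1.00, 0.00, 0.00]     # initial amounts
w = [0.62, 0.36, 0.40]      # measured amounts, balances not closed
omega = [1.0, 1.0, 1.0]     # unit weights, chosen only for this illustration

lam, w_c = clobal_corrections(a, w, w0, omega)
corrected = [wj + cj for wj, cj in zip(w, w_c)]
print([round(c, 4) for c in corrected])
```

The calculation is a single linear solve of size m followed by n multiplications, so no iteration is involved. The solve fails if two atom balances are linearly dependent (for example, when every compound has the same H/C ratio); in that case one of the dependent balances would have to be dropped.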
In order to validate the presented methodology, the condensation of benzaldehyde and heptanal, an important aldol-type reaction in the production of jasmine aldehyde [7][8][9], is taken as a showcase in the companion paper [1]. There are 5 compounds to be considered: benzaldehyde [...]
The difference between the actual value and the initial value is given by vector F, see Eq. (10), and the correction vector is defined by Eq. (11). The solution for the m Lagrange multipliers is given by Eq. (12) with substitution of matrix v, see Eq. (13). Eq. (12) represents the solution of Eq. (7) in matrix notation with respect to the Lagrange multipliers.
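As a sketch, the relations for the multipliers and the corrections can be collected in matrix form as follows. The symbols $\mathbf{A}$ (the m × n matrix of atom counts $a_{i,j}$), $\boldsymbol{\Omega}$ (the diagonal matrix of weight factors) and $\boldsymbol{\lambda}$ are assumptions made here for illustration and need not coincide with the notation of Eqs. (9) to (14); $\mathbf{F}$ is the difference vector described above. The sign of $\boldsymbol{\lambda}$ follows from adding the constraint term to the weighted sum of squares.

```latex
\mathbf{F} = \mathbf{w} - \mathbf{w}_0, \qquad
\left(\mathbf{A}\,\boldsymbol{\Omega}^{-1}\mathbf{A}^{\mathsf{T}}\right)\boldsymbol{\lambda} = \mathbf{A}\,\mathbf{F}, \qquad
\mathbf{w}_c = -\,\boldsymbol{\Omega}^{-1}\mathbf{A}^{\mathsf{T}}\boldsymbol{\lambda}
             = -\,\boldsymbol{\Omega}^{-1}\mathbf{A}^{\mathsf{T}}
               \left(\mathbf{A}\,\boldsymbol{\Omega}^{-1}\mathbf{A}^{\mathsf{T}}\right)^{-1}\mathbf{A}\,\mathbf{F}
```

Multiplying the last expression by $\mathbf{A}$ gives $\mathbf{A}\,\mathbf{w}_c = -\mathbf{A}\,\mathbf{F}$, so that $\mathbf{A}\,(\mathbf{F} + \mathbf{w}_c) = \mathbf{0}$ and all m balances close. The inverse exists only if the rows of $\mathbf{A}$ are linearly independent.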
The corrections $w_{j,c}$ for j = 1 ... n are given by Eq. (14) in a single calculation step, i.e., no iterations are required. The corresponding VBA code is given in Table 1. The input requires the number of atom types, m, and the number of compounds, n. The stoichiometric information on the atom types in the individual compounds, such as given by the stoichiometric matrix via Eq. (9), is the input in worksheet 'atom'.
The data vector consists of ndata+1 rows, with the initial concentration on row 2, see Fig. 2. The value of 'ndata' is read automatically by the program from the input in worksheet 'data'; the maximum number of data is n_max = 1000. The actual concentration values for the n compounds occupy rows 3 to ndata+2. The first column in worksheet 'data' contains the independent variable, e.g., in this case the sampling time in minutes. It can be used for the preparation of figures, but it is not required for the given procedure.
Fig. 3 gives the results of the CLOBAL procedure: worksheet 'results' evaluates the original atom balances and reports them to the user on rows 3 to ndata+4. The Lagrange multipliers, calculated via Eq. (12), and the individual corrections, obtained via Eq. (14), are given on rows ndata+6 to 2*ndata+6. The corrected data are given on rows 2*ndata+8 to 3*ndata+9 and are ready for further use, i.e., they are generated in the input format of sheet 'data'.
As a side note on the weight factors, the author found that the best choice is the inverse of the corresponding response, as indicated on line 60 of the code, see Table 1. The user can alter this if another expression is more appropriate.
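The weighting choice can be illustrated with a short sketch. It reads 'response' as the measured value of each compound, which is an interpretation made here; the function name and the `floor` guard against zero responses are likewise hypothetical and are not taken from the code in Table 1.

```python
def inverse_response_weights(measured, floor=1e-12):
    """Weight factors taken as 1 / |measured value|.

    `floor` is an assumed safeguard that avoids division by zero for
    compounds that were not detected.
    """
    return [1.0 / max(abs(value), floor) for value in measured]


# Hypothetical responses of three compounds
weights = inverse_response_weights([2.0, 0.5, 0.0])
print(weights[:2])  # [0.5, 2.0]
```

Because each correction is inversely proportional to its weight factor, this choice makes the correction scale with the measured value: compounds with a large response absorb a larger absolute correction, while a compound with a (near) zero response is left practically unchanged.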
As an example, the result of the proposed procedure is given in Figs [...] is purely a coincidence: when the in silico random error is applied a second time [10] and the CLOBAL procedure is subsequently applied, the balances are still closed, but the small variations are somewhat different owing to the different randomized error, this time resulting in a visible improvement of the point of interest. The companion paper [1] shows that parameter estimation via ODRpack [11], using treated data, results in smaller confidence intervals and a lower residual sum of squares (RSSQ).

Declaration of Competing Interest
The author declares that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.