Batch Cultivation Model for Biopolymer Production

C. E. Torres-Cerna,a,◊ A. Y. Alanis,a,◊,* I. Poblete-Castro,b and E. A. Hernandez-Vargasc,d,* aCUCEI, Universidad de Guadalajara, Blvd. Marcelino Garcia Barragan 1421, Col. Olimpica, Guadalajara, Jalisco, México, C. P. 44430 bUniversidad Andrés Bello, Faculty of Biological Sciences, Center for Bioinformatics and Integrative Biology, Biosystems Engineering Laboratory, 8340176 Santiago, Chile cSystems Medicine of Infectious Diseases, Helmholtz Centre for Infection Research, Inhoffenstr. 7, 38124 Braunschweig, Germany dFrankfurt Institute for Advanced Studies (FIAS), Ruth-Moufang-Str. 1, 60438 Frankfurt am Main, Germany


Introduction
Plastics are one of the most used materials worldwide and their production rate is increasing over the years.In 2013, the worldwide plastic material production was approximately 299 million tons, and these numbers continue to increase, causing large amounts of waste which highly contributes to the pollution of the environment 1 .Plastic materials have been for a long time produced through petrochemicals routes.Theses plastics have a high durability and are resistant to degradation.Although such properties are considered important for the industrial production of plastics, these are a significant source of environmental and waste management problems 2 .
New avenues of research have been developed to produce plastics with biodegradability properties, called bioplastic materials.The replacement of a fraction of synthetic plastics with biodegradable polymers produced from renewable resources offers to solve important problems such as the overall consumption of fossil fuels, environmental pollution, and solid waste management 3 .Bioplastics can easily be degraded (Figure 1), and in this way, the plastics deposited in nature could be largely decreased, serving as a carbon substrate for different microorganisms.
Bioplastics are available in specific areas and applications, for example composting bags and sacks, fast food tableware (cups, plastic bottles, cutlery, etc.), agriculture and hygiene 4 .Furthermore, biopolymers are also used in most industries for technical applications, including medicine and pharmacy.Among the candidates for biodegradable plastics, polyhydroxyalkanoates (PHAs) have been drawing much attention because of their similar material properties to conventional plastics and complete biodegradability.PHAs are insoluble in water and exhibit a rather high degree of polymerization ranging from 105 to 107 Dalton (Da).Their non-toxicity and biodegradability properties are essential for their potential use 3 .
PHAs are naturally produced as granules in the cytoplasm of cells by numerous bacteria under limiting nutrients and excess carbon source 3,5 .Pseudomonas putida strains are the most used bacteria for the synthesis of medium-chain-length PHAs (mcl-PHAs) in the presence of different carbon sources, such as fatty acids, glucose, and glycerol.Particular properties of mcl-PHAs give them a broader application range over short-chain-length PHAs 6 .Usually, the production process is composed of two phases.In the first phase, cells are challenged by growth in a minimal medium with an excess of carbon source and a limited amount of ammonium as the nitrogen source.The cells rapidly consume the carbon substrate and exhibit exponential growth as long as ammonium is present in the medium 3,6 .The second phase is characterized by the transition from a feast to a famine period, which allows the high synthesis of mcl-PHA in batch reactors 3,6,7 .
Nevertheless, the amount of bacterial production of mcl-PHA is insufficient compared to the demand and the reduced quantity of production, influencing the price of bioplastics in the market.Systems metabolic engineering approaches of Pseudomonas putida (P.putida) have enhanced the production of PHA, thus enabling development of a more efficient strain, which closes the cost gap be-tween petroleum and bacterial-based production of polymers.For example, the in silico driven metabolic engineering approach developed by Poblete-Castro and coworkers, re-designed P. putida to enhance the production of mcl-PHAs using glucose as the carbon source.The metabolically engineered strains that were performed predicted only one key mutation, revealed a 100 % increase in the final PHA titer, a 50 % increase in the cellular PHA content, and 80 % in the PHA yield, relative to its parent strain.
Additionally to the improvements through genetic modifications, mathematical models have proved to be a relevant tool to enhance biopolymers production 8 .On the one hand, metabolic approaches are focused on metabolic pathways for PHA production, these can reflect the intracellular dynamics.There is a particular interest for this level of modelling due to the high detail that can describe 8 .Nevertheless, this detail level promotes highly complex models.On the other hand, modelling approaches based on the population growth can help dissect the dynamics of different strains 8 .This approach could be useful in developing different cultivation strategies in order to enhance the PHAs production without increasing the model complexity 9,10 .
Recently, several mathematical models, based on population growth, have been proposed to facilitate the dynamics of biopolymer production [9][10][11][12][13][14] .Most of these works contain many parameters esti-  New avenues of research have been developed to produce plastics with biodegradability properties, called bioplastic materials.The replacement of a fraction of synthetic plastics with biodegradable polymers produced from renewable resources offers to solve important problems such as the overall consumption of fossil fuels, environmental pollution, and solid waste management 3 .Bioplastics can easily be degraded (Figure 1), and in this way, the plastics deposited in nature could be largely decreased, serving as a carbon substrate for different microorganisms.
Bioplastics are available in specific areas and applications, for example composting bags and sacks, fast food tableware (cups, plastic bottles, cutlery, etc.), agriculture and hygiene 4 .mated based on local optimization algorithms.Note that parameter estimation is a complex problem involving highly non-linear equations and several variables.Therefore, global optimization algorithms are needed to avoid incorrect fits and consequently incorrect interpretations.The mechanisms of self-adaption, self-organizing and self-learning in stochastic optimization approaches allow addressing challenging problems that cannot be solved by traditional methods.In this paper, a variant of the Differential Evolution (DE) algorithm 15 is considered for parameter estimation.This optimization algorithm is used in many engineering applications due to its simplicity and straightforward implementation.

Strains, medium and batch fermentations
For this work, considered was the parent strain P. putida KT2440 (DSM 6125), which was obtained from the German Collection of Microorganisms and Cell Cultures (DSMZ, Braunschweig, Germany).To generate single deletion mutant of P. putida KT2440, genome editing was applied 16 .P. putida strains were grown in a defined mineral medium.The experiments were performed in a 2 L top-bench BIOSTAT bioreactor with working volume of 1.5 L at 30 °C and 180 rpm.The dissolved oxygen was kept at 20 % of air saturation by control and the pH-value was maintained at 7.
The basic structure of PHA was confirmed by NMR analysis.Thirty mL of culture was harvested and the pellet was washed with deionized water, frozen at -20 °C and lyophilized.The lyophilizate was then suspended in 10 mL chloroform and kept for 3 h at 80 °C to extract the PHA.The PHA composition, as well as the cellular PHA concentration, were determined by gas chromatography (GC) and mass spectrometry (MS).A more detailed description of the experiments can be found in 6 .

Mathematical model
The mathematical model proposed here is based on the following differential equations: where S is the glucose (carbon source), N is the NH 4 concentration (nitrogen source), X is the active biomass, and P is the biopolymer PHA.NH 4 is assumed as the limiting nutrient 13 and the total biomass (Cell Dry Mass) is composed by the catalytically active biomass and the PHA, thus CDW = X + P.
Consumption rates of substrates by active biomass are presented in equations ( 5) and (6).m S and m N can be interpreted as the specific consumption rates of glucose and NH 4 by the active biomass, respectively.These are experimentally proportional to biomass concentration, and reach their maximum when the concentration of substrates is high.
The substrate S is consumed by the active biomass at a rate (k 1 ), which is modeled by equation ( 5) using the following Monod function: where K S is the saturation or affinity constant of the substrate S.
NH 4 dynamics are represented by equation ( 6).Experimental data 6 indicate that ammonium is completely consumed in the growth phase.Different functions were considered for the NH 4 consumption, however, the best estimations were achieved by the following rate equation: where K N is the saturation or affinity constant of the substrate N.
Active biomass dynamics of strains are presented in equation (7).The specific growth rate of biomass with glucose is defined according to the "two substrates" kinetic equation (known as "double Monod" relation).The limiting substrate ammonium (N) is essential to produce the active biomass X and limits its synthesis at low concentrations.The active biomass growth m writes as follows: where m max is the maximum growth rate, K S1 and K N1 represents the affinity constants of X.
The biopolymer PHA (P) is produced at a rate k 4 by active biomass dynamics of P. putida (X), which is triggered by a limited amount of ammonium provided in the growth medium, this can be expressed by the following equation: Note that m P (t) term will increase when the value of ammonium decreases.A growth term based on the biomass growth k 3 m (t) was introduced to provide better fitting for the PHA.This signifies that small amounts of PHA are produced during biomass growth.

Parameter estimation
Parameter estimation for mathematical models can be understood as the search for a set of values for the parameters θ that minimize the difference between the outcome of model y i and experimental data y i as close to zero as possible.This search is restricted to system dynamics, algebraic restrictions, and system constraints.In this paper, the Sum of the Square of Weighed Residues (SSWR) function 17 is used for parameter estimation, and is defined as follows: where j and i represent the number of variables and experimental data points, respectively, y is the set of experimental data points 6 , and y is the outcome of the mathematical model.
To know the behavior of each parameter within an interval of given values, in each ODE the likelihood profile for all parameters is estimated.Using equation (10), each parameter was fixed while other parameters in the equation were estimated.Since the ODEs integration routine requires dense data sets at different times depending on the adaptive step size, inputs in each estimation are approximated by piecewise linear interpolation.For these estimations, the Residual Sum of Squares (RSS) is used as minimization function.
( ) ( ) The differential equations are solved using solver ode45 from MATLAB to obtain the model outcome.

Differential evolution algorithm
The minimization of equations ( 9) and (10) implies a non-linear optimization problem with several variables, which can be solved using a global optimization algorithm.Differential evolution (DE) algorithm 15 is among the most popular global optimization algorithms.DE is designed for optimizing p-dimensional continuous functions.Each individual in the population is a p-dimensional vector that represents a candidate solution to the problem.
In DE algorithm, two individual mutually excluding r 1 ≠ r 2 are picked randomly among the pop-ulation, the difference is scaled and added to a third individual r 3 ∉ {r 1 , r 2 } chosen randomly to create a new mutant vector v i .The individuals are defined as , where i = 1, 2, …, NP, g = 1, 2, …, G.The variable G represents the maximum number of generations, and NP the population number.
The first step is the initialization of the population within a search space for each element of the individuals, defined by model restrictions as , and , The second step is the mutation, for which different strategies exist.In this work, the following mutation strategy is considered: ( ) where x best,g is the individual with the best fit in the current generation.x r1,g and x r2,g are different vectors chosen randomly among the population, and F ∈ [0,1] is the scaling factor.The third step is the crossover, a new trial vector is generated recombining the mutant v i,g and target x i,g vectors.For this paper, the exponential approach (12) was considered.In this approach, s ∈ [1,p] and L ∈ [1,p] are random integer numbers.s represents the starting element in target vector x i,g to create a new trial vector u i,g recombining the L components donated by the mutant vector v i,g ., , , where A:= {s, …, min(p,s + L -1)}, and B : = {1, …, s + L -p -1)} The last step is the selection, a cost function is used to determine which vector survives for the new generation (g + 1), the mutant u i,g and target x i,g vectors are evaluated to find which has the best yield.The basic idea is shown as follows: where f (⋅) is the evaluation of the objective function.

Numerical results
The experimental work assessed the growth and PHA production performance by the wild and mutant-type P. putida strains in minimal medium using glucose as a carbon source (18.5 g L −1 ) and a limiting amount of ammonium as nitrogen source (350 mg L −1 ).The experimental data 6 are shown in Figure 2. The exponential growth of CDW initiates at 5 hours and continues while glucose in the medium exists.The stationary phase is reached when the glucose concentration is less than 0.1 g L -1 .In Poblete-Castro work 6 , the authors showed gluconate production by the wild type strain, and at the same time possible consumption of the gluconate when the glucose is depleted in the medium.However, to develop a model as simple as possible without compromising accuracy, in the proposed model, the formation and consumption of gluconate were not considered.
The exponential growth of PHA started at 10 hours, this could be attributed to the shortage of nitrogen source in the medium.The PHA content exhibits two phases, the first is proportional to cell growth, when PHA production is slow.The second phase is the exponential growth of PHA, which is triggered when the nitrogen source present in the medium is less than 0.1 g L −1 .
The development of new avenues of thinking to decrease bioplastics costs are urgently needed.To this end, mathematical models can help interpret experimental results on quantitative grounds.Here, the reactor dynamics 6 are represented with the system (1)-( 4).
Before mathematical models can be considered for reliable predictions, the unknown parameters should be estimated by comparing model output to experimental data (maximum likelihood principle).However, the mathematical model equations ( 1)-( 4) possess parameters that could be difficult to estimate using common model optimization strategies, e.g., least value, steepest descendent, inner and outer approximation, etc. Innovative optimization approaches help tackle challenging problems that cannot be solved by traditional methods.Here, the DE algorithm 15 is considered and the fine tuning of the algorithm is based on [18][19][20] .Furthermore, the novel approach proposed by Raue et al. 21exploits the likelihood profile to determine both structurally and practically non-identifiability.In addition, the iterative fitting routine proposed by Binder et al. 22 is used to decompose the problem of fitting a system of coupled ordinary differential equations (1)-( 4) into smaller sub-problems.Using equation (10) the likelihood profile for all parameters are estimated to check the identifiability.
Results of likelihood profiles are displayed in Figure 3. Likelihood profile graphics suggests K P is structurally non-identifiable.In addition, parameters K S1 , k 4 , and k 3 are practically non-identifiable since their confidence regions do not converge towards a specific value 21 .Parameters k 3 , k 4 and K S1 could take any value close to zero and do not enhance the model fit significantly.In the same way, K P could take any value inside the interval without enhancing the model fit.On the other hand, likelihood profile graphics suggest that parameters k 1 , k 2 , K S and K N1 are identifiable.Even though a threshold cannot be determined, the graphics offer an overview of which parameters can be identifiable or not.
To simplify the parameters estimations based on likelihood profile, graphics parameters detected as non-identifiable were fixed.Values for fixed parameters were chosen to minimize the SSWR function, while the remaining parameters were simultaneously estimated.Furthermore, values for m max 147 The development of new avenues of thinking to decrease bioplastics costs are urgently needed.To this end, mathematical models can help interpret experimental results on quantitative grounds.Here, the reactor dynamics 6 are represented with the system (1)-( 4).Fig. 2 Experimental data for the wild (black) and mutant-type (gray) strains of P. putida.S is the glucose concentration, N is the NH4 concentration, CDM is the total biomass.
Before mathematical models can be considered for reliable predictions, the unknown F i g . 2 -Experimental data for the wild (black) and mutant-type (gray) strains of P. putida.S is the glucose concentration, N is the NH 4 concentration, CDW is the total biomass.
were taken from the work of Poblete-Castro 6 .
In Figure 3, parameters k 3 and k 4 are non-identifiable.Thus, the parameter k 4 was fixed, while the parameter k 3 was estimated.The best-fit parameter values can be found in Table 1.Model simulation results in Figure 4 reveal good fitting to the experimental data depicted by empty circles.The model presents a better fit for wild type strain than mutant type strain.This difference is clearer on PHA graphics, where there is a logistic growth in the first 10 hours, after that the growth is linear.This behavior is also presented in CDW graphics and is consistent with NH 4 consumption.
As discussed previously, the production and consumption of gluconate by the wild type strain described in 6 were not considered in the presented F i g .3 -Likelihood profiles for parameters of both strains, wild type -black line, and mutant type -gray line Ta b l e 1 -Model parameters.These are the best fitting result using the iterative fitting routine 22 .
Parameter Wild Type Mutant type (*) parameters were fixed.
(1) parameter was taken from experimental data 6 model.However, the gluconate production and consumption could explain the growth of active biomass and PHA after the glucose is consumed.

Confidence interval
Due to the highly variable data in biological systems, further validation of estimated parameters is necessary.Even though the likelihood profile offers a visual insight for parameters validation, due to a lack of experimental data it cannot be determined clearly whether a parameter is identifiable or not.Confidence intervals help determine the degree of validity for each parameter estimated.Bootstrap methods are used to provide statistical inferences of a set of parameters.However, these results are af-fected by the high variability of experimental measurements or by the erroneous assumption in the statistical distributions.
The Weighted Bootstrap method is an effective tool for semiparametric estimation 23 , which allows the study of asymptotic properties of the proposed solution when there are measurement and numerical errors 24 .The Weighted Bootstrap method can be written as follows: ( ) where w is a vector with independent, identically distributed positive random weights with mean Confidence intervals were calculated using the Weighted Bootstrap method 25 .Results F i g .4 -Simulation results.The results are displayed, a) for wild strain, and b) for mutant strain.The simulations were developed using parameter values from Table 1.Confidence intervals were calculated using the Weighted Bootstrap method 25 .Results

a) b)
tribution of w is independent of y and y 25 .θB is the vector of estimated parameters after each bootstrap.Confidence intervals were calculated using the Weighted Bootstrap method 25 .Results after 1000 bootstrap trials are displayed in Table 2, for each constant parameter, the 2.5 % and 97.5 % quantiles of the estimates were selected to compute the 95 % confidence interval.
The model was refitted in each repetition to obtain the corresponding parameter distribution.Parameter distribution is displayed in Figure 5. Parameters K S for the wild strain and k 2 for the mutant strain have the larger confidence intervals, respectively, this allows the model to be more flexible to changes of these parameters.The parametric dependence was obtained using the results from Weighted Bootstrap method, each parameter was plotted against another in scatter plots, which show the correlation between parame-ters (Figure 6).It is the case of parameters k 1 -K S which are linearly dependent, if k 1 increases, K S increases too; this behavior was expected because both parameters belong to the same equation.The same behavior is presented in parameters k 2 -K N (Figure 7).Parameter K N was fixed due to always reaching the upper limits, this behavior is consistent with their likelihood profile (Figure 3) where a threshold cannot be determined.
For the glucose equation ( 1), the consumption rate (k 1 ) and the affinity constant (K S ) of mutant type strain are smaller than those of wild strain.Additionally, the proportion between both parameters (k 1 /K S ) is smaller for mutant strain than that one of the wild strain.On the other hand, in the NH 4 equation (2), despite the higher values in parameters k 2 and K N for mutant strain with respect to those of the wild strain (see Table 1), their proportion is smaller between parameters for mutant type strain than wild type strain.These proportions suggest that the mutant strain consumes the substrates slowly in comparison to the wild strain.Furthermore, the mutant strain grows slower than the wild strain.

Conclusions
This work presents a mathematical model for biopolymer production in a batch reactor, which can assess the performance of two different P. putida strains, wild and mutant type strains.The model parameters were evaluated using a variant of DE algorithm to minimize the difference between experimental data and the model output.These parameters were estimated through computing their likelihood profile and confidence intervals.The likelihood profiles helped determine the parameter identifiability.This analysis was complemented with the confidence interval estimations.
Estimations suggested that the higher PHA production by the mutant strain, in comparison with the wild strain, may be related to the adequate consumption of substrates, which can be clearly observed in the glucose and nitrogen consumption graphics (Figure 4).The presented model contains fewer parameters than previous mathematical models for biopolymer productions [9][10][11][12][13][14] .Furthermore, some models have described fed-batch and batch cultures 9,10,13 , and these approaches use the fedbatch cultures model to increase the PHA yield.The model presented in this work is simpler, which helps to obtain a better parameter estimation.Different terms were considered for substrates equations (1)-( 2), e.g.bilinear terms (k 1 SX) or more complex like Hill functions (X n /(X n + K S )).Nevertheless, a better estimation was reached using the equations (1)-( 2).As future work, to enhance the accuracy of the model, the dynamics of gluconate presented by Poblete-Castro et al. work 6 will be included.
equation (1), the consumption rate ( ) and the affinity constant ( ) of n are smaller than those of wild strain.Additionally, the proportion between ( / ) is smaller for mutant strain than that one of the wild strain.On the e NH 4 equation ( 2), despite the higher values in parameters and for th respect to those of the wild strain (see Table 1), their proportion is smaller ters for mutant type strain than wild type strain.These proportions suggest strain consumes the substrates slowly in comparison to the wild strain.
mutant strain grows slower than the wild strain.
ns esents a mathematical model for biopolymer production in a batch reactor, s the performance of two different P. putida strains, wild and mutant type el parameters were evaluated using a variant of DE algorithm to minimize etween experimental data and the model output.These parameters were

F
i g . 1 -Bioplastics cycle.Bioplastics are produced by microorganisms, which are collected after being used as solid wastes.These are degraded and used as carbon sources for the microorganisms.

Fig. 1
Fig. 1 Bioplastics cycle.Bioplastics are produced by microorganisms, which are collected after being used as solid wastes.These are degraded and used as carbon sources for the microorganisms.

15 152Fig. 4
Fig.4 Simulation results.The results are displayed, a) for wild strain, and b) for mutant strain.The simulations were developed using parameter values from Table1

15 152Fig. 4
Fig.4 Simulation results.The results are displayed, a) for wild strain, and b) for mutant strain.The simulations were developed using parameter values from Table1

F i g . 5 -
Parameter distributions.The histograms show the distribution for parameters of both wild (black) and mutant (gray) strains.Data taken from bootstrap results.Ta b l e 2 -The confidence interval for each parameter.After 1000 weighted bootstrap, confidence interval with 95 % of confidence for both strains is shown.

F i g . 6 -
Parameter dependency.The scatter plots show the parametric dependence between parameters for both wild (black) and mutant (gray) strains.Data taken from bootstrap results.

F i g . 7 -
Linear dependence between parameters k 2 -K N .A direct relation exists between parameters k 2 and K N .