Designing Robust Biotechnological Processes Regarding Variabilities Using Multi-Objective Optimization Applied to a Biopharmaceutical Seed Train Design

Hernández Rodríguez, Tanja; Sekulic, Anton; Lange-Hegermann, Markus; Frahm, Björn

doi:10.3390/pr10050883

¹ Biotechnology and Bioprocess Engineering, Ostwestfalen-Lippe University of Applied Sciences and Arts, 32657 Lemgo, Germany

² inIT - Institute Industrial IT, Ostwestfalen-Lippe University of Applied Sciences and Arts, 32657 Lemgo, Germany

* Author to whom correspondence should be addressed.

Processes 2022, 10(5), 883; https://doi.org/10.3390/pr10050883

Submission received: 6 April 2022 / Revised: 20 April 2022 / Accepted: 21 April 2022 / Published: 29 April 2022

(This article belongs to the Special Issue Bioprocess Systems Engineering Applications in Pharmaceutical Manufacturing)

Abstract

Development and optimization of biopharmaceutical production processes with cell cultures is cost- and time-consuming and often performed rather empirically. Efficient optimization of multiple objectives such as process time, viable cell density, number of operating steps and cultivation scales, required medium, amount of product and product quality is a promising approach. This contribution presents a workflow which couples uncertainty-based upstream simulation and Bayes optimization using Gaussian processes. Its application is demonstrated in a simulation case study for a relevant industrial task in process development: the design of a robust cell culture expansion process (seed train), meaning that low variations of viable cell density are obtained during the seed train despite uncertainties and variabilities concerning cell growth. Compared to a non-optimized reference seed train, the optimized process showed much lower deviation rates regarding viable cell densities (<10% instead of 41.7%) using five or four shake flask scales, and the seed train duration could be reduced by 56 h, from 576 h to 520 h. Overall, it is shown that Bayes optimization allows a multi-objective function with several optimizable input variables to be optimized under a considerable number of constraints with low computational effort. This approach has the potential to be used as a decision tool, e.g., for the choice of an optimal and robust seed train design or for further optimization tasks within process development.

Keywords: Gaussian processes; Bayes optimization; Pareto optimization; multi-objective; cell culture; seed train

1. Introduction

The development and optimization of biopharmaceutical production processes with cell cultures is cost- and time-consuming, requiring substantial lab work. This necessitates thorough planning of experiments and processes, taking into account existing process knowledge. The need for model-based decision support in biopharmaceutical manufacturing has been emphasized by the US Food and Drug Administration (FDA) [1,2], including taking into account available prior know-how and experience within the decision process and uncertainties [3]. Such methods are still not state-of-the-art for cell culture processes during development or manufacturing [3,4], although first approaches have been proposed, for example, in order to optimize the titer of a mammalian cell culture process [5]. This highlights a need for improved methods and tools for optimal experimental design, optimal and robust process design and process optimization for the purposes of monitoring and controlling during manufacturing.

Process optimization also plays an important role in other engineering fields such as chemical or mechanical engineering and is the subject of current research. Some applications rely on dynamic models, for example the optimization of sustainable algal production processes [6] or the improvement of the vibration performance of cold orbital forging machines [7]. Other approaches rely on machine-learning algorithms such as those reported in [8,9,10].

The optimization of one objective criterion (e.g., final titer) is relatively straightforward: an objective function with a single response variable is built and an appropriate optimization algorithm is applied to maximize it. However, in industry, it is typically desired to optimize several conflicting objectives at a time, which requires suitable trade-offs and compromises, for example, when trying to maximize final titer via viable cell density while minimizing cultivation time. Multi-objective optimization provides a decision-making tool for optimal decisions in the presence of trade-offs between two or more conflicting criteria.

However, multi-objective optimization is more challenging. Its application is still not state-of-the-art in the context of cell culture processes, probably due to a lack of related studies and instructions. Moreover, within the manufacturing life cycle of biopharmaceuticals, some phases are better investigated than others. Still, very few investigations are reported concerning the cell expansion process (seed train). It consists of several consecutive cultivation and passaging (transfer) steps, starting with a small amount of cell suspension because cells are frozen in small vials until they are used for a production process. The goal is to expand the number of viable cells in order to reach the amount required to inoculate (start) the production bioreactor (e.g., 10,000 L at industrial scale) while keeping them in a healthy and growing state. A large number of operational requirements and constraints have to be fulfilled and, as reported in the literature [11,12], the cell expansion process critically affects product quality and the amount of product at production scale. In [12], for example, the passage duration as well as the initial viable cell density for each passage are reported as important parameters with high impact on process time and productivity at production scale. Careful and optimal planning of a seed train is therefore essential. However, this is not a trivial task due to the inherent variability of cell growth (cell growth differs from cell line to cell line and also from cultivation run to cultivation run) and uncertainty about the real state of the process due to considerable measurement uncertainties. This requires the design of a reproducible process which is robust regarding viable cell density, meaning that despite (initial) variabilities concerning cell growth, low variations of viable cell density at the end of the seed train are obtained.

The goal of this paper is to close the gap between state-of-the-art optimization techniques and modern techniques from machine learning in order to improve biopharmaceutical production by enabling easy-to-use yet powerful multi-objective optimization.

In most multi-objective optimization problems, no single best (unique optimal) solution exists. Instead, there is a set of optimal solutions (also called Pareto optimal solutions or non-dominated solutions), meaning that for each solution no criterion can be improved without degrading at least one of the other criteria. The decision maker therefore has to choose from the set of non-dominated solutions according to the most preferred or important objective criterion. A promising approach to optimize objective functions that are expensive to evaluate is Bayes optimization. The methodology of Bayes optimization dates back to the work of Harold Kushner in 1964 [13] and gained impact through the work of Jones et al. in 1998 [14]. It is a probabilistic global optimization method for finding the maximum of objective functions that are expensive to evaluate or of unknown (black-box) objective functions that are approximated using simulations [15].

In practice, the objective function could be the outcome of interest of a process, for example, process productivity or control metrics to describe the quality of a product. Input parameters can be process parameters needed to be optimized. Bayesian optimization [16] creates a quick to evaluate model, the so-called surrogate model of the objective function. In order to reduce the objective function evaluations, the surrogate model is iteratively trained and updated on new data. The positions of this new data are chosen by finding a trade-off between exploration (improving the surrogate model) and exploitation (finding optimal points). Typical surrogate models are Gaussian processes.

Gaussian processes (GPs) are popular machine learning models [17] because, due to their Bayesian nature, they work well with few data points [18]. Furthermore, they allow the inclusion of expert knowledge [19,20] and can be used in dynamic systems [6,21]. GPs are very flexible non-parametric models; hence, they can approximate a very broad class of functions and do not assume a predefined set of modeling functions.

Bayes optimization is successfully applied in many fields of research and economics [22]. Moreover, applications of Bayes optimization in the field of bioprocess engineering were published during the last decade [6,9,23,24]. Furthermore, this methodology was shown to be efficient in solving multi-objective optimization problems [25] and has also been applied for parameter estimation of kinetic parameters [26]. However, no applications are reported so far applying model-based multi-objective Bayes optimization within biopharmaceutical process development.

This contribution aims to present the concept of a workflow which couples uncertainty-based upstream simulation and Bayes optimization using Gaussian processes and its application in the form of a simulation case study to illustrate its applicability to a relevant industrial task in process development.

This simulation case study addresses the question of whether a reference seed train setup comprising five shake flask scales can be optimized by varying shake flask volumes and how many shake flask scales (three, four or five) are advisable in terms of two objective criteria, seed train duration and deviation rate. Moreover, it is investigated how the results change if cells grow with a 5% lower or 5% higher maximum cell-specific growth rate.

Afterwards, two more objective criteria, titer (product concentration) and viability after 8 days in the production bioreactor, are added and seed train optimization is performed regarding four objective criteria simultaneously.

Furthermore, the suitability of the proposed method and the required number of iterations is evaluated with respect to the obtained information gain.

2. Methods

The main components of the applied methodology and the corresponding tools are described.

2.1. Upstream Simulation

Upstream simulation comprises a simulation of the cell expansion process (seed train) and simulation of the production scale. The reference upstream process taken as an application example for the here presented simulation case study comprises five consecutive shake flask scales followed by three bioreactor scales and one production scale, similar to the upstream process investigated in [27]. Further specifications are listed in Table 1.

A mathematical model is required that describes cell growth and the interactions with the main limiting substrates and possibly inhibiting metabolites over time. A cell growth model, formulated as a system of ordinary differential equations (ODEs) and already adapted to an industrial cell culture upstream process using a CHO cell line [27], has been used. It describes the dynamic behavior of viable and total cell density, $X_{v}$ and $X_{t}$, and the concentrations of glucose $c_{Glc}$, glutamine $c_{Gln}$, lactate $c_{Lac}$, ammonia $c_{Amm}$ and product (volumetric titer) $c_{titer}$ (see Table A1 in Appendix A).

Moreover, such an upstream process includes several constraints, operation steps and process parameters (e.g., concerning passaging intervals, substrate/nutrient concentrations, initial viable cell densities and viable cell densities before transferring cells into the next cultivation vessel, as well as the amount of cell suspension and fresh medium), which have to be considered in the simulation workflow. A detailed description of the required components and calculation routines is given in [28,29].

Besides these requirements, several passaging strategies can be applied, helping to decide at which point in time cells should be transferred from one cultivation vessel into the next larger one and how to perform these passaging steps (e.g., which amount of cell suspension should be mixed with how much fresh cell culture medium).

For the simulation study presented here, the passaging strategy for robust seed train design was chosen, where robustness refers to the reproducibility of the seed train regarding viable cell density, meaning that despite initial uncertainties and variabilities concerning cell growth, low variations of viable cell density at the end of the seed train are obtained. This strategy is based on the objective of reaching the previously determined threshold of viable cell density and on the corresponding probability distributions of viable cell density at different points in time. These distributions are used in combination with a utility function following the mean-variance principle, which is based on the Markowitz mean-variance portfolio optimization theory [30,31]: The utility function $U(t)$ is defined as a function of viable cell density $X_{v}$, including the expected value $E(X_{v})$ and the variance $Var(X_{v})$ of viable cell density, as well as a risk aversion parameter $\alpha$ which controls the amount of risk (amount of uncertainty) the user is willing to bear. In the example presented here, the risk refers to the probability that viable cell density differs from the expected value (predicted mean). A risk aversion value of $\alpha = 1$ means that the expected time profile minus one standard deviation of $X_{v}$ is considered.

The utility function is defined through:

$$U(t) = E(X_{v}(t)) - \alpha \sqrt{Var(X_{v}(t))} \qquad (1)$$

Based on the simulated time profiles of the current cultivation scale (obtained by solving the corresponding ODE system), Equation (1) is used to calculate the utility function value $U(t)$ per hour and to check if this value reaches or exceeds the required transfer viable cell density $X_{v, transfer}$, which is necessary to inoculate (start) the next cultivation scale while fulfilling the required seeding (initial) viable cell density and the filling volume.

In the next step, it is evaluated whether the calculated point in time lies within the set of practically feasible points in time for cell passaging, $T_{p}$. Thus, the objective is to find the minimum point in time out of the set of practically feasible points in time for passaging, $T_{p}$, which fulfills:

$$U(t) \geq X_{v, transfer}, \qquad (2)$$

$$\text{subject to: } t \in T_{p}. \qquad (3)$$
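The passaging rule in Equations (1)–(3) can be illustrated with a minimal Python sketch. It assumes that the distribution of $X_{v}$ at each hour is represented by a small set of simulated samples and that the variance is the population variance of these samples; the function names, the toy data and the threshold value are hypothetical and not taken from the study.

```python
import statistics

def utility(samples, alpha):
    """Mean-variance utility U = E[X_v] - alpha * sqrt(Var[X_v]) at one time point."""
    mean = statistics.fmean(samples)
    std = statistics.pstdev(samples)  # assumption: population standard deviation
    return mean - alpha * std

def earliest_passaging_time(xv_samples_by_hour, feasible_hours, xv_transfer, alpha=1.0):
    """Return the smallest t in feasible_hours with U(t) >= xv_transfer, or None."""
    for t in sorted(feasible_hours):
        samples = xv_samples_by_hour.get(t)
        if samples is not None and utility(samples, alpha) >= xv_transfer:
            return t
    return None

# Hypothetical toy data: three simulated X_v values (in 1e6 cells/mL) per hour.
xv_samples_by_hour = {
    48: [1.0, 1.2, 1.4],
    60: [1.8, 2.0, 2.2],
    72: [2.8, 3.0, 3.2],
    84: [3.9, 4.2, 4.5],
}
feasible_hours = [60, 72, 84]  # hypothetical set T_p of feasible passaging times

t_pass = earliest_passaging_time(xv_samples_by_hour, feasible_hours,
                                 xv_transfer=2.5, alpha=1.0)
print(t_pass)
```

A larger `alpha` lowers the utility value for the same samples and therefore tends to postpone passaging, which reflects a more risk-averse choice.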

Based on the obtained point in time and the corresponding concentrations of viable cells, total cells, substrates and metabolites at this point in time, starting concentrations (=initial values of the system of ordinary differential equations) of the next cultivation scale are calculated based on the defined configurations and constraints (e.g., working volumes, acceptable range of seeding viable cell density and medium concentrations). This calculation has to be performed for every cultivation scale and passaging step. For more details refer to [27,28,29].

2.2. Bayes Optimization

A typical mathematical optimization problem is the following: Given an objective function $f: X \to \mathbb{R}$ over an input space $X \subseteq \mathbb{R}^{d}$, the aim is to find an argument $x^{*} \in X$ which optimizes (minimizes or maximizes) $f$.

The idea behind Bayes Optimization consists of creating a simple, probabilistic and cheap to evaluate model, a so-called surrogate model (substitute model), of the objective function f [15,17,32]. Bayesian optimization reduces the number of evaluations of the objective f via the following iterative approach: Before sampling f at another point, we take into account a trade-off between exploration (i.e., sampling of areas of high uncertainties) and exploitation (sampling from areas which are likely to move towards the optimum), which is encoded in a so-called acquisition function. We can find such points quickly from evaluation of the surrogate model.

Within Bayes optimization the following steps are performed:

(1): Generate a set of initial points and evaluate the objective function at these points.
(2): Train the surrogate model based on all evaluated points.
(3): Optimize the acquisition function, which determines the next candidate point $x_{c}$ to be evaluated.
(4): Compute $f (x_{c})$ , the objective function f at the candidate point $x_{c}$ .
(5): Repeat steps 2–4 for N iterations
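The five steps can be sketched in plain Python for a single scalar objective on the interval [0, 1]. This is only an illustration under simplifying assumptions: the study itself is multi-objective and uses the GPflow library, whereas the sketch uses a hand-written zero-mean Gaussian process with a Squared Exponential kernel, fixed (untrained) hyperparameters, Expected Improvement for minimization and a grid search over candidates. All function names and the objective are hypothetical.

```python
import math
import random

def se_kernel(a, b, length=0.2, variance=1.0):
    """Squared Exponential covariance between two scalar inputs."""
    return variance * math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            # max(...) guards against tiny negative pivots caused by round-off
            L[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L z = b."""
    z = []
    for i in range(len(b)):
        z.append((b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    return z

def solve_upper(L, z):
    """Back substitution: solve L^T a = z."""
    n = len(z)
    a = [0.0] * n
    for i in reversed(range(n)):
        a[i] = (z[i] - sum(L[k][i] * a[k] for k in range(i + 1, n))) / L[i][i]
    return a

def gp_fit(xs, ys, noise=1e-6):
    """Factorize K + noise*I and precompute the weights of a zero-mean GP."""
    K = [[se_kernel(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    return L, solve_upper(L, solve_lower(L, ys))

def gp_predict(xs, L, weights, x_new):
    """Posterior mean and standard deviation at x_new."""
    k_star = [se_kernel(a, x_new) for a in xs]
    mean = sum(k * w for k, w in zip(k_star, weights))
    v = solve_lower(L, k_star)
    var = se_kernel(x_new, x_new) - sum(vi * vi for vi in v)
    return mean, math.sqrt(max(var, 0.0))

def expected_improvement(mean, std, best):
    """Expected Improvement for minimization under a Gaussian prediction."""
    if std <= 0.0:
        return 0.0
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf

def objective(x):
    """Hypothetical stand-in for an expensive simulation (minimum at x = 0.3)."""
    return (x - 0.3) ** 2

def bayes_optimize(f, n_init=4, n_iter=10, seed=0):
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n_init)]            # step 1: initial points
    ys = [f(x) for x in xs]
    grid = [i / 200 for i in range(201)]                  # candidate inputs in [0, 1]
    for _ in range(n_iter):                               # step 5: repeat steps 2-4
        offset = sum(ys) / len(ys)                        # center data for the zero-mean GP
        centered = [y - offset for y in ys]
        L, weights = gp_fit(xs, centered)                 # step 2: condition the surrogate
        best = min(centered)
        x_c = max(grid, key=lambda x: expected_improvement(
            *gp_predict(xs, L, weights, x), best))        # step 3: maximize acquisition
        xs.append(x_c)                                    # step 4: evaluate the objective
        ys.append(f(x_c))
    i_best = min(range(len(ys)), key=ys.__getitem__)
    return xs[i_best], ys[i_best]

x_best, y_best = bayes_optimize(objective)
print(x_best, y_best)
```

In a realistic setting, the kernel hyperparameters would be fitted to the data in step 2, and the acquisition function would be maximized with a numerical optimizer rather than on a fixed grid.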

The key of Bayesian optimization is not to rely on local approximations, as many other optimization algorithms do, but instead to take a global viewpoint by also evaluating the function at unexplored positions.

The acquisition function is used to propose the next candidate point to be evaluated based on specific criteria, for example the expected improvement of the optimization criteria and the reduction in predictive uncertainty. As with the kernels of the surrogate model, there is a wide variety of possible acquisition functions to choose from. In this study, the Expected Improvement (EI) acquisition function is used [33,34].
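For orientation, a common closed form of EI for a single objective that is to be minimized is given below. It assumes a Gaussian prediction of the surrogate at $x$; the symbols $\mu(x)$, $\sigma(x)$ (predictive mean and standard deviation), $f_{min}$ (best value observed so far), $\Phi$ and $\varphi$ (standard normal distribution and density function) are notation introduced for this sketch. How EI is extended to several objectives in this study is not specified here.

```latex
EI(x) = \left( f_{min} - \mu(x) \right) \Phi(z) + \sigma(x) \, \varphi(z),
\qquad z = \frac{f_{min} - \mu(x)}{\sigma(x)}, \qquad \sigma(x) > 0
```

The first term is large where the predicted mean lies below the best observed value (exploitation), while the second term is large where the predictive uncertainty is high (exploration). For $\sigma(x) = 0$, EI is set to zero.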

Gaussian processes (GPs) are well suited surrogate models when making few assumptions [15]. Just like a Gaussian distribution (a normal probability distribution) is fully described by its mean $m$ and variance $\sigma^{2}$, a GP is fully described by a mean function $m(x)$ and a covariance function $k(x, x')$ [17]. A GP is an extension of a multivariate Gaussian (or normal) distribution to distributions of functions in the sense that if a function $y$ follows a GP distribution, i.e., $y \sim GP(m, k)$, then every evaluation of the function follows a Gaussian distribution $y(x) \sim N(m(x), k(x, x))$. In particular, a GP returns mean and variance of the possible function values (instead of just returning a scalar), and hence also provides information about the uncertainty of a prediction. Moreover, GPs can take into account uncertainty in the form of noise, the class of Gaussian processes is closed under Bayesian updates, and such updates are computationally tractable [35].

The covariance function describes the assumed characteristics, such as smoothness or periodicity, of the objective function $f$ [16]. Covariance functions are positive-definite functions, often also called kernels [17,36]. A covariance function specifies the relationship between two ‘points’ (vectors of the input space) $x$ and $x'$ and the corresponding changes in $f$ at these points. It is described by a set of parameters, also called hyperparameters, which determine its specific behavior. This is how prior information is embedded in the Bayes optimization procedure. In this work, the most commonly used covariance function, the Squared Exponential (SE) kernel (often also referred to as Gaussian kernel), is used [32].
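For reference, the SE kernel and the resulting GP posterior can be written in their standard textbook form. The symbols $\sigma_{f}^{2}$ (signal variance), $\ell$ (length scale), $\sigma_{n}^{2}$ (noise variance), $K$ (kernel matrix of the $N$ training inputs $X$) and $k_{*}$ (vector of covariances between a new point $x_{*}$ and the training inputs) are notation assumed here; the exact parameterization used in the study may differ.

```latex
k_{SE}(x, x') = \sigma_{f}^{2} \exp\left( -\frac{\lVert x - x' \rVert^{2}}{2 \ell^{2}} \right)

\mu_{*}(x_{*}) = m(x_{*}) + k_{*}^{T} \left( K + \sigma_{n}^{2} I \right)^{-1} \left( y - m(X) \right)

\sigma_{*}^{2}(x_{*}) = k(x_{*}, x_{*}) - k_{*}^{T} \left( K + \sigma_{n}^{2} I \right)^{-1} k_{*}
```

The hyperparameters $\sigma_{f}^{2}$ and $\ell$ encode the prior assumptions: a larger length scale corresponds to a smoother assumed objective function, and the posterior variance shrinks near points that have already been evaluated.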

2.3. Problem Definition and Computational Procedure

The goal of the presented application example is to propose a concept and a numerical procedure for optimal robust seed train design, where robustness refers to the reproducibility of the seed train regarding viable cell density, meaning that despite initial uncertainties and variabilities concerning cell growth, low variations of viable cell density at the end of the seed train are obtained.

First, seed train constraints are defined based on a chosen cell line and its characteristics concerning optimal cultivation conditions and based on the operative possibilities (e.g., feasible points in time for cell passaging). Second, the optimizable input parameters and objective criteria (objective response variables) applied in this study are defined (as also illustrated in Figure 1), followed by the formulation of the mathematical optimization problem. Thereafter, the optimization problem is solved using a workflow which connects seed train simulation and Bayes optimization.

The following objective criteria were chosen to represent an optimal seed train: (I) a minimum duration (d) (=required cultivation time) of the seed train and (II) a minimum deviation rate (D) regarding viable cell density, i.e., the probability that the seed train will run outside predefined ranges of viable cell density, for both seeding viable cell density and transfer viable cell density (compare to Figure 1, right gray box). The deviation rate is important to consider because the performance of the cells could decrease if specific constraints are not fulfilled: the growth rate could decrease and, furthermore, it has been observed that the violation of constraints could also cause lower viability of the cells in the production phase [12]. These two attributes shall enable an optimal start of the production scale. Note that, in addition to these criteria, the growth rate is another important parameter affecting an optimal start of the production scale, and it should be high until the end of the seed train. In this first optimization study, however, it is not set as an optimization criterion because the seed train setup defined here (in terms of medium concentrations and possible cultivation volumes per scale), together with the aim to reduce cultivation time, already supports good growth during the entire cultivation. For other seed train setups, it might be advisable to include the growth rate at the end of the seed train in the optimization problem.

After consideration of the two mentioned objective criteria, a third and fourth objective criterion, the product concentration (titer) and the viability at the end of the cultivation in the production bioreactor (in this simulation study: after 8 days in batch mode, i.e., without addition of nutrient feeds) are added to the optimization problem (see Figure 1 right gray box (III)). Note: The authors are aware of the fact that cultivation in the production vessel itself, which is often performed in fed-batch mode, is also influenced by several process parameters having an impact on product quantity and quality. Moreover, data of further attributes would be necessary to describe product quality (e.g., of a recombinant therapeutic protein or antibody) but these are not provided and therewith not considered in this study.

The input variables that can be varied to optimize the aforementioned objective criteria, and thus the optimizable input variables, are the filling volumes in the first five shake flask scales, $V_{1}, \dots, V_{5}$ (compare to Figure 1, the part of the seed train between thawing cells from a small vial and inoculation of the first bioreactor). These target values are important inputs of the seed train simulation process because they are used to calculate points in time for cell passaging. Volumes in the finally proposed seed train protocol (output of the seed train simulation) may vary within allowed working volume ranges and these are also presented in this work.

Formulation of the Mathematical Optimization Problem

The optimizable variables, and therewith the inputs of the optimization problem, are the filling volumes of the $n$ shake flask scales, $V_{1}, \dots, V_{n}$, which are included in the input vector:

$$x = (V_{1}, \dots, V_{n})^{T}. \qquad (4)$$

Outputs of the optimization problem are the defined objective criteria. These are seed train duration $d$ and deviation rate $D$ for the first optimization example. Thus, the unknown objective function (which should be minimized) can be written as follows:

$$f(x) = (f_{1}(x), f_{2}(x))^{T} \qquad (5)$$

with $f_{1}(x) \hat{=} d$ and $f_{2}(x) \hat{=} D$.

The second optimization example includes a third and fourth optimization criterion, product concentration and viability at the end of the production scale (here after 8 days in the production vessel). Thus $f(x)$ expands to:

$$f(x) = (f_{1}(x), f_{2}(x), f_{3}(x), f_{4}(x))^{T} \qquad (6)$$

with $f_{1}(x) \hat{=} d$, $f_{2}(x) \hat{=} D$, $f_{3}(x) \hat{=} c_{titer, end}$ and $f_{4}(x) \hat{=} {Viability}_{end}$.

2.4. Connecting Seed Train Simulation and Bayes Optimization

Uncertainty-based seed train simulation as described in Section 2.1 was coupled with algorithms for Bayes optimization as described in Section 2.2. The workflow integrating both components is illustrated in Figure 2. The inputs of the combined framework are the boundaries for the optimizable variables (here filling volumes) and the objective criteria (here seed train duration, deviation rate and, in the second example, also product concentration and viability at the end of the production scale), together with all required seed train configuration settings and constraints (e.g., initial concentrations, practically feasible points in time for cell passaging, acceptable ranges for viable cell density, …).

First, initial points (=combinations of optimizable variables) are determined using a Latin Hypercube design, which distributes these points within the design space (see Figure 2, Box A). Seed train simulations are performed at these points in order to obtain the corresponding objective criteria values. Input values together with output values form a data set. The unknown relationship between inputs and outputs is approximated by a Gaussian process (GP), which has to be trained on the given data set (see Figure 2, Box A). Based on the trained Gaussian process, a point that has to be evaluated next is proposed (see Figure 2, Box B).

A robust seed train is simulated, using a mechanistic process model, and the objective criteria are calculated. This output is then returned to the Bayes optimization (Box A) to update the GP. Usually, experiments are performed to return the experimental output. The present approach instead exploits the advantages of the model-based upstream simulation in order to reduce the experimental effort to a minimum.

These steps are repeated various times, e.g., until a previously defined number of maximum iteration steps is reached. The latter depends on the resources (human and financial resources in case of laboratory experiments or computational resources in case of in silico experiments). In every iteration the Gaussian process chooses a new point aiming to move to the optimum and at the same time to reduce model uncertainty.

Results of this optimization framework are the set of Pareto optimal setups (also called Pareto front) and their corresponding response values.

2.5. Numerical Solvers and Tools

The programming language and numeric computing environment MATLAB [37] was used for the seed train simulations. The code for the optimization workflow was written in Python [38] using the MATLAB Engine API for Python to call MATLAB as a computational engine from Python code. To perform Bayes optimization within this workflow, the library GPflow [39] was used.

3. Results and Discussion

3.1. Optimization of Cultivation Vessels Regarding Number of Shake Flask Scales and Filling Volumes for Five, Four and Three Shake Flask Scales

In this section, it is investigated which cultivation filling volumes should be used for the flask scales in order to obtain optimal results in terms of seed train duration and deviation rate, here defined as the probability that the seed train will run outside the predefined acceptable ranges for initial viable cell density (VCD) and transfer VCD (final VCD before transfer into the next cultivation vessel) per scale. The latter is a measure for the robustness of the seed train regarding viable cell density.
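One way to estimate such a deviation rate is sketched below, assuming that it is computed as the fraction of simulated seed train runs in which at least one scale violates the acceptable range for seeding VCD or transfer VCD. This estimator, the function name, the ranges and the toy runs are assumptions made for illustration; the calculation actually used in the study is described in the cited references.

```python
def deviation_rate(runs, seeding_range, transfer_range):
    """Fraction of runs with at least one scale outside the acceptable VCD ranges.

    runs: list of runs; each run is a list of (seeding_vcd, transfer_vcd) per scale.
    """
    def in_range(value, bounds):
        return bounds[0] <= value <= bounds[1]

    violations = 0
    for run in runs:
        if any(not in_range(seeding, seeding_range)
               or not in_range(transfer, transfer_range)
               for seeding, transfer in run):
            violations += 1
    return violations / len(runs)

# Hypothetical acceptable ranges (in 1e6 cells/mL) and four simulated runs
# with two scales each.
seeding_range = (0.2, 0.4)
transfer_range = (2.0, 4.0)
runs = [
    [(0.30, 3.0), (0.30, 3.1)],   # within all ranges
    [(0.25, 2.5), (0.45, 3.5)],   # seeding VCD too high in scale 2
    [(0.35, 1.8), (0.30, 2.9)],   # transfer VCD too low in scale 1
    [(0.28, 2.2), (0.33, 3.9)],   # within all ranges
]

D = deviation_rate(runs, seeding_range, transfer_range)
print(f"{100 * D:.1f} %")
```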

For assessment of the optimization results, a conventional reference seed train comprising five shake flask scales was simulated based on a non-optimized design. For this purpose, a common passaging interval of 3 days per cultivation scale was fixed and filling volumes were determined following a conservative layout (i.e., choosing only moderate volume increases from one cultivation scale to the next to ensure that enough viable cells are generated even if they grow slightly slower than expected).

In the first step, the optimal combination of filling volumes for five shake flask scales is investigated and the results are compared to the reference seed train. Afterwards, it is investigated if a reduction in shake flask scales from five to four or three shake flask scales leads to similar or even better results in terms of seed train duration and deviation rate. The number of bioreactor scales was kept fixed. Three bioreactors with filling volumes of 40 L, 320 L and 2100 L were used as pre-stages before inoculation of the production bioreactor with 9600 L. The assumed seed train setup is given in Table 1.

To find the optimal solution, multi-objective Bayesian optimization coupled with uncertainty-based seed train simulation, as described in Section 2.1, was applied. First, a Latin hypercube design with $n_{lhs}$ design ‘points’ (combinations of filling volumes, here $n_{lhs} = 10$) was generated and seed train simulation was applied to calculate the objective criteria values, here deviation rate $D$ and seed train duration $d$, at each point (replacing the normally required experimental cultivation runs). Within the Bayes optimization procedure, Gaussian processes were trained based on the simulation outcomes and an acquisition function was calculated in each iteration step in order to propose which point should be evaluated next. The input space for shake flask filling volumes (here the optimizable variables) was defined as described in Table 2, assuming the possibility of using several shake flasks in parallel for one shake flask scale and also considering their working volume ranges.
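The initial design can be illustrated with a short Latin hypercube sampler written with the Python standard library. It is a sketch of the general technique, not the routine used in the study; apart from the 14–15 mL range of the first scale, the bounds below are hypothetical placeholders, since the values of Table 2 are not reproduced here.

```python
import random

def latin_hypercube(n_points, bounds, seed=None):
    """Draw n_points samples so that each variable has exactly one sample per stratum."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # one uniform draw inside each of n_points equal-width strata of [0, 1)
        strata = [(i + rng.random()) / n_points for i in range(n_points)]
        rng.shuffle(strata)  # random pairing of strata across variables
        columns.append([low + u * (high - low) for u in strata])
    return [tuple(col[i] for col in columns) for i in range(n_points)]

# Filling volume bounds in liters for V1..V5 (hypothetical except for V1).
bounds = [(0.014, 0.015), (0.05, 0.15), (0.3, 1.0), (1.5, 2.5), (6.0, 8.0)]
n_lhs = 10
design = latin_hypercube(n_lhs, bounds, seed=1)

for point in design:
    print(", ".join(f"{v:.3f}" for v in point))
```

Each row is one combination of filling volumes at which a seed train simulation would be run before the first Gaussian process is trained.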

3.1.1. Optimization of Five Shake Flask Scales

The first optimization was performed for a seed train comprising five shake flask scales. Figure 3 shows the objective criteria values for each evaluated point, whereby the outcomes based on the initial Latin hypercube design are illustrated by blue dots and the outcomes for the points proposed based on the trained Gaussian processes are illustrated by yellow crosses. The optimal solutions are those near the lower left corner, since the aim is to minimize both seed train duration and deviation rate. The Pareto optimal solutions, also called non-dominated solutions, are illustrated by green circles. A solution (seed train setup/combination of filling volumes) is called non-dominated if no other solution exists that is at least as good (here as low) in all objective criteria and better (here lower) in at least one. As described previously, several Pareto optimal solutions can be obtained because, when two or more objective criteria are considered, one solution might have a better (here lower) value than another solution for one criterion but a worse (here higher) value for another criterion. The set of all Pareto optimal solutions is called the Pareto front.
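The definition of non-dominated solutions can be made concrete with a small Python sketch for objectives that are all minimized. The (duration, deviation rate) pairs below are hypothetical and are not the values reported in Table 3.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical evaluated setups: (seed train duration in h, deviation rate in %).
candidates = [(520, 6.8), (537, 5.9), (528, 6.3), (540, 6.5), (560, 9.0), (525, 7.5)]

front = pareto_front(candidates)
print(front)
```

In this toy set, (540, 6.5) is dominated by (528, 6.3), which is both shorter and more robust, whereas (520, 6.8) and (537, 5.9) are both kept because each is better in one criterion and worse in the other.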

For the investigated scenario (five shake flask scales and the seed train configuration according to Table 1), five Pareto optimal solutions were obtained (see green circles in Figure 3). For any two of these solutions (green circles), one has a lower (here better) seed train duration than the other, while the opposite holds for the deviation rate.

The corresponding values for the optimizable variables, here the shake flask filling volumes ($V_{1}$, $V_{2}$, $V_{3}$, $V_{4}$ and $V_{5}$), and the corresponding objective criteria values, here deviation rate $D$ and seed train duration $d$, are listed in Table 3.

The filling volume of the first scale was limited to a very narrow range (14–15 mL), because a higher variation after cell thawing was not expected. Most of the obtained solutions start with the maximum value of this range (see Table 3, first column). The filling volume varies between 0.065 and 0.115 L for flask scale 2, between 0.340 and 0.904 L for flask scale 3, between 1.582 and 2.355 L for flask scale 4 and between 6.85 and 7.97 L for flask scale 5. All five combinations lead to a deviation rate D of less than 7% and to a seed train duration between 520 and 537 h.

A more detailed illustration of the obtained results is presented in Figure 4 and Figure 5. For two optimizable variables and one objective criterion each (deviation rate in Figure 4 and seed train duration in Figure 5), a contour plot is shown which illustrates the objective value for each calculated point (combination of the two variables), using the trained Gaussian processes, through colored isolines.

For example, the diagram in the top left of Figure 4 shows the deviation rate for each combination of $V_{1}$ (filling volume in flask scale 1) and $V_{2}$ (filling volume in flask scale 2) through colors representing the corresponding values in %, as indicated in the color bar. The results obtained through seed train simulations are shown by dots. The red dots represent the non-dominated (optimal) solutions, i.e., those that are optimal with respect to the defined multi-objective optimization problem. The dark blue area indicates combinations of $V_{1}$ and $V_{2}$ leading to a lower deviation rate. It can be seen that values above 0.1 L for $V_{2}$ combined with any value of $V_{1}$ (within the given range) lead to the lowest deviation rates (below 6.2%, see dark blue area). Moreover, the optimal solutions (red dots) are mostly located in the area with higher filling volumes for shake flask 2, $V_{2}$, except one (red dot at $V_{2} \approx 0.065$ L).

For some combinations, a closer delimitation is possible. For example, the middle diagram in the second row ($V_{3}$ over $V_{2}$) shows a limited region (dark blue area) and therewith a specific combination of $V_{3}$ and $V_{2}$ that leads to the lowest deviation rates (<5.6%). These are around 0.3 L for $V_{3}$ and around 0.105 L for $V_{2}$. Furthermore, two optimal solutions (red dots) out of the set of Pareto optimal solutions (considering both objective criteria, seed train duration and deviation rate) are located in this region. The remaining red dots are located outside of the dark blue regions (see turquoise regions in the same diagram), meaning that they have higher deviation rates.

Analogously, Figure 5 shows the contour plots for the second objective criterion, seed train duration. The dark blue areas show the combinations with the lowest seed train durations (approximately below 528 h). It can be seen in these diagrams that most red dots are located in the dark blue regions. For some combinations the dark blue areas are wider, extending over several possible values of one variable, e.g., in the top center, top left, center and center right diagrams.

Other combinations show narrower regions with low seed train durations, as can be seen in the diagram showing $V_{4}$ over $V_{3}$. The lowest seed train duration is obtained for filling volumes between 1.5 and 2.5 L for shake flask 4 in combination with filling volumes between 0.2 and 0.8 L for shake flask 3.

Overall, these diagrams give an overview of the impact of two combined optimizable variables each on a specific objective criterion.

In addition to this information, simulated time profiles (predictive mean in green, 90% prediction bands in blue) of viable and total cell density as well as concentrations of glucose, glutamine, lactate and ammonium (see Figure 6) can be obtained for each solution, as well as a seed train protocol containing information about the calculated passaging intervals, amount of medium, etc.

It can be seen in the top left of Figure 6 that, based on the given filling volumes and the flexibility to choose individual points in time for cell passaging in each scale, it is possible to set the seeding viable cell density (VCD) at the beginning of each cultivation scale to the desired value with low variability and thus to stay within the corresponding acceptable ranges for seeding VCD (see yellow dashed lines). Moreover, transfer VCDs lie within the corresponding acceptable range with high probability (see lower boundary, gray dashed line). It can also be seen that substrates are not depleted and, according to [27], values of 20 mmol/L lactate and 5 mmol/L ammonium are not yet inhibiting concentrations for this cell line.

For a better assessment, the obtained results are compared to the reference seed train, which is also defined in this work for five shake flask scales and illustrated in Figure 7. It is based on a (non-optimized) configuration for five shake flask scales using fixed passaging intervals of 72 h each (common practice) and filling volumes of 15 mL (flask scale 1), 80 mL (flask scale 2), 300 mL (flask scale 3), 2000 mL (flask scale 4) and 4000 mL (flask scale 5). This choice follows a rather conservative approach that aims to avoid the risk of reaching too low transfer cell densities at the end of a cultivation scale, but without the inclusion of probabilistic simulations.

The proposed method instead includes risk calculations and a passaging strategy that aims to minimize this risk while identifying a seed train configuration that is also optimal regarding further objectives, such as seed train duration in the present case.

A comparison of the seed train solutions obtained after optimization and the reference seed train shows that the deviation rate is much lower after optimization (4.9–6.7% instead of 41.7%) and that the seed train duration could be reduced by 56 h, from 576 h to 520 h. The top left diagram of Figure 7 shows where seeding or transfer viable cell densities do not lie fully within the acceptable ranges (see red circles). This is different for the optimized solutions, e.g., solution 5, as illustrated in Figure 6, where seeding VCD lies within the acceptable range and transfer VCD lies above the lower bound of its acceptable range. This reduction in time (56 h, i.e., about 2.3 days per seed train) would contribute to a meaningful acceleration of the production process.

3.1.2. Optimization of Three and Four Shake Flask Scales

In the next step, the number of shake flask scales was reduced from five to four and then to three, and the same optimization procedure was applied. The aim was to investigate whether fewer cultivation vessels would lead to comparable results and, if so, which target and filling volumes should be chosen. This is of interest because fewer operations (such as transferring cells from one scale into another) mean less risk of failure and deviations.

Figure 8 shows the obtained values for the objective criteria deviation rate and seed train duration for different combinations of filling volumes for three (left) and for four shake flask scales (right). As before, the solutions based on the initial Latin hypercube design are shown by blue dots and Pareto optimal solutions are highlighted through green circles.

It can be seen that for both scenarios, combinations of filling volumes could be found leading to an overall seed train cultivation time between 519 and 530 h. However, the scenario using four shake flask scales leads to lower deviation rates ($D < 10 \%$) than the scenario using three shake flask scales ($23 \% < D < 26 \%$).

The corresponding filling volumes and the obtained filling volumes (based on the underlying passaging strategy) of the Pareto optimal solutions are listed in Table 4 together with the results for five shake flask scales from Table 3. The results are sorted as discovered by the optimization algorithm. The obtained filling volumes for four shake flask scales are very similar, except for shake flask scale 4 ($V_{1}$ = 15 mL, $V_{2}$ = 158–200 mL, $V_{3}$ = 1.51–1.60 L and $V_{4}$ = 4.81–7.58 L). Some of the obtained solutions would be treated as equal in practice, because the differences are rather small. For example, no distinction would be made between 0.190 and 0.195 L; 200 mL would probably be used instead. However, the applied optimization algorithm works on a continuous input space and differentiates between the solutions listed in Table 4, even though the differences are very small. The obtained optimal filling volumes for three shake flask scales also look similar, but with somewhat more variation for shake flask 3 (4.45–5.58 L).

Comparing the results for the three scenarios (three, four and five shake flask scales) supports a decision against the three-flask-scale scenario due to its higher deviation rates (>20%), which indicate lower process robustness. Between the other two scenarios (four or five shake flask scales), only small differences in deviation rates are observed for the determined optimal solutions (4.9–6.7% for five shake flask scales, 5.9–9.2% for four shake flask scales). Using five shake flask scales would lead to similar cultivation times (520–537 h) but would require one additional operational step.

This information, together with the corresponding seed train protocol, provides a solid basis for deciding on one of the proposed optimal seed train designs, taking into account seed train duration, robustness (expressed through deviation rates) and operational steps.

3.2. Application to Further Cell Lines with Potentially Different Growth Rates

The optimization examples presented in the previous subsection were applied to a specific CHO cell line with growth characteristics described by a set of model parameters derived from an industrial cell culture process which was investigated in [27]. If a different cell line or a clonal cell population with potentially differing growth behavior is used, then the optimization has to be performed for this specific cell line. In the following simulation study, a cell line having a 5% lower and a cell line having a 5% higher maximum cell-specific growth rate compared to the reference maximum growth rate ($μ_{\max}$ = 0.028 h$^{- 1}$ for the first bioreactor scale and $μ_{\max}$ = 0.029 h$^{- 1}$ for the remaining seed train scales) are assumed and the optimization is applied for both scenarios.
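The two growth rate scenarios follow from scaling the reference values by 0.95 and 1.05. The short sketch below computes the resulting maximum growth rates together with the corresponding doubling times. The doubling time $\ln 2 / μ_{\max}$ is added here purely for orientation and assumes unlimited exponential growth, which the full model does not; the dictionary keys and function names are hypothetical.

```python
import math

# Reference maximum growth rates in 1/h as stated in the text.
REFERENCE_MU_MAX = {
    "first bioreactor scale": 0.028,
    "remaining seed train scales": 0.029,
}


def scaled_growth_rates(reference, factor):
    """Scale every reference maximum growth rate by a common factor."""
    return {scale: mu * factor for scale, mu in reference.items()}


def doubling_time(mu):
    """Doubling time in h, assuming unlimited exponential growth at rate mu."""
    return math.log(2.0) / mu


scenarios = {"5% lower": 0.95, "reference": 1.00, "5% higher": 1.05}
for label, factor in scenarios.items():
    for scale, mu in scaled_growth_rates(REFERENCE_MU_MAX, factor).items():
        print(f"{label:>9} | {scale}: mu_max = {mu:.5f} 1/h, "
              f"doubling time = {doubling_time(mu):.1f} h")
```

Under this simplification, a 5% change in the maximum growth rate shifts the doubling time by roughly one hour, which accumulates over the many doublings of a complete seed train.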

The results for the obtained/proposed filling volumes, as well as the corresponding seed train duration and deviation rate are listed in Table 5.

As expected, cells which grow faster (higher maximum growth rate $μ_{\max}$) would require less time until reaching a specific target cell density. This can be seen in the right column of Table 5. Using five flask scales, the optimal required seed train duration would lie between 494 and 503 h for a cell line with a 5% higher growth rate compared to the reference cell line, which would need 520–537 h (see Table 3). Correspondingly, cells with a 5% lower growth rate would need more time (550–568 h). The same is observed when using four or three shake flasks.

With respect to the deviation rates, which represent the robustness of the seed train design regarding variability of viable cell density, it can be seen that low deviation rates between 4.1% and 11.6% can be reached when using five or four flask scales, even if the maximum growth rate varies by ±5%. A critical limit was identified for the combination of three shake flask scales and a slower growing cell line. The corresponding optimal solutions show comparatively high deviation rates (19.2–29.1%) together with a long seed train duration (548–552 h).

3.3. Optimization Regarding Four Objectives Including Product Concentration

To show the applicability of the proposed method to more than two objectives, a third and a fourth objective criterion, titer and viability in the production vessel at the end of the production run (after 8 days), were added. Whereas the first two objective criteria (seed train duration and deviation rate) relate to the seed train itself, the third and fourth criteria refer to the product generated in the production vessel and to the viability of the cells in the production vessel. Product concentration as well as product quality can be influenced by many factors (seeding cell density, substrate concentrations and nutrient feeds, metabolite production, temperature, pH, dissolved oxygen and carbon dioxide concentration, osmolality and more) and also by the amount and the state of the cells at the end of the seed train. Since no data describing product quality are available, product concentration and viability are considered in this study. A further simplification is the assumption that the production vessel is operated in batch mode (i.e., without any addition of nutrient feeds or medium renewals), in order to avoid confounding effects. The authors are aware that many factors affect product concentration and product quality; when data on other critical process parameters or quality attributes are available, these could be considered in the same manner. The main purpose of the present simulation example is to demonstrate how the proposed method can be applied to more than two objectives and how the corresponding results can be illustrated and interpreted.

To obtain a visual overview for multiple objective criteria in one figure, a so-called spider plot (or net plot) can be used, which is shown in Figure 9.

The horizontal axis shows the values of the deviation rate (on the right) and of the viability (on the left). The vertical axis shows the values of the seed train duration (above) and of the titer (below). The aim of the optimization was to minimize seed train duration and deviation rate and to maximize viability and titer. Each color (hyperplane) represents one of the Pareto optimal seed train configurations (based on the optimal combinations of filling volumes in shake flask scales). Since seed train duration and deviation rate should be minimal and titer and viability should be maximal, hyperplanes covering the lower left area would be desired. However, no such solution (hyperplane) was obtained. The reason is that the optimization problem contains conflicting objective criteria, meaning that an improvement of one criterion leads to a degradation of another criterion. The solutions presented here are all non-dominated (compare the green circles in the figures for two objective criteria). For all shown solutions, the deviation rate is rather low (4.9–7.3%), the seed train duration lies between 521 and 562 h, and a titer of approximately 430–433 mg/L (assuming a cell-specific production rate of $q_{titer, \max} = 3.9 \times 10^{- 10}$ mg cell$^{- 1}$ h$^{- 1}$, as reported in [40]) and a viability of 52–53% are reached after 8 days in the production vessel (here in batch mode). Of course, the obtained values depend strongly on the real process conditions (the production bioreactor would probably be operated in fed-batch mode) and on the model parameter values obtained after model validation. However, the presented simulation example illustrates how the proposed approach can be applied for risk-based decision making when several criteria are to be optimized.
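As a plausibility check of the reported titer, the titer balance of Table A1 can be integrated for batch mode. The following worked example assumes that all feed rates are zero, that the initial titer $c_{titer, 0}$ is zero and that the cell-specific production rate is constant at $q_{titer, \max}$. The integral and the mean viable cell density obtained in this way are inferred quantities for orientation, not values reported in this study.

```latex
\frac{d c_{titer}}{d t} = X_{v} \cdot q_{titer, \max}
\quad \Rightarrow \quad
c_{titer}(t_{end}) = q_{titer, \max} \cdot \int_{0}^{t_{end}} X_{v}(t) \, d t

\int_{0}^{192\,\mathrm{h}} X_{v}(t) \, d t
  = \frac{430\ \mathrm{mg\,L^{-1}}}{3.9 \times 10^{-10}\ \mathrm{mg\,cell^{-1}\,h^{-1}}}
  \approx 1.1 \times 10^{12}\ \mathrm{cell\,h\,L^{-1}}

\bar{X}_{v} \approx \frac{1.1 \times 10^{12}\ \mathrm{cell\,h\,L^{-1}}}{192\ \mathrm{h}}
  \approx 5.7 \times 10^{9}\ \mathrm{cells\,L^{-1}}
```

A titer of 430 mg/L after 8 days (192 h) thus corresponds to a time-averaged viable cell density of roughly $5.7 \times 10^{9}$ cells L$^{- 1}$ in the production vessel under these assumptions.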

3.4. Impact of Performed Iterations during Bayes Optimization

The number of iterations performed during the optimization procedure was varied for the example of three shake flask scales (followed by three bioreactor scales), optimizing the filling volumes of all shake flask scales with respect to the two objective criteria seed train duration and deviation rate. First, 10 initial points (combinations of filling volumes) distributed according to a Latin hypercube design were evaluated, followed by 10 Bayes iterations; that is, 10 times the algorithm updates the black box model (the Gaussian process), calculates the acquisition function and proposes the next point based on the outcome of this calculation. Then, the optimization was performed again for the same seed train setup but using 20 and then 30 Bayes iterations. The obtained solutions are illustrated in Figure 10.
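The initial design step described above can be sketched as follows. The function is a minimal, dependency-free Latin hypercube sampler: each variable range is divided into as many equal strata as there are points, and every stratum is used exactly once per variable. The bounds for the second and third shake flask scale are placeholders chosen for illustration and are not the ranges of Table 2; the function name and the random seed are assumptions as well.

```python
import random


def latin_hypercube(n_points, bounds, rng):
    """Draw n_points from a Latin hypercube design.

    bounds : list of (low, high) tuples, one per optimizable variable
    rng    : random.Random instance, passed in for reproducibility
    """
    columns = []
    for low, high in bounds:
        strata = list(range(n_points))
        rng.shuffle(strata)  # independent stratum order per variable
        width = (high - low) / n_points
        # One uniformly distributed sample inside each stratum.
        columns.append([low + (s + rng.random()) * width for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_points)]


# Filling volume ranges in L for three shake flask scales.
# First range as stated in the text (14-15 mL); the others are placeholders.
bounds = [(0.014, 0.015), (0.1, 0.5), (1.0, 6.0)]

rng = random.Random(0)
initial_design = latin_hypercube(10, bounds, rng)
for point in initial_design:
    print(tuple(round(v, 4) for v in point))
```

Each of the 10 points would then be passed to the seed train simulation, and the resulting objective values would form the first training set of the Gaussian processes before the Bayes iterations start.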

Increasing the number of iterations from 10 to 20 helped to identify one solution that had not been discovered when running only 10 iterations. This can be seen when comparing the green circles in the top left and top right diagrams. The solution with $D \approx 22$% and d = 523 h cannot be found in the top left diagram.

Increasing the number of iterations from 20 to 30 did not lead to an improved optimum, as can be seen when comparing the green circles in the top right and bottom left diagrams of Figure 10. This underlines the efficiency of the Bayes optimization. In the present example, only 10 initial points (distributed randomly according to a Latin hypercube design) and 20 Bayes optimization iteration steps were required to obtain the results, which were confirmed when applying 30 iteration steps.

3.5. Summary

The objective of the first optimization problem was to design a robust seed train (cell expansion process), i.e., a seed train layout (including the number of cultivation scales, filling volumes and passaging intervals) leading to a reproducible seed train with low variability regarding viable cell density and with a minimum seed train duration. Compared to a non-optimized reference seed train, the deviation rate is much lower after optimization (<10% instead of 41.7%) and the seed train duration could be reduced by 56 h, from 576 h to 520 h, i.e., by more than 2 days.

Regarding the question of whether varying the number of shake flask scales (and therewith the number of passaging steps) would lead to similar results in terms of deviation rate and seed train duration, it turned out that a reduction to three shake flask scales would increase the deviation rate and is therefore not recommended, at least under the assumed working volume ranges.

In industrial practice, typically more than one cell line is in use (different cell lines may be used to produce different molecules/products). Since growth rates of different cell lines differ, it was investigated how optimal seed train designs would differ for cell lines with 5% higher or lower growth rates. It turned out that the same optimization procedure could be easily adapted (by modification of the model parameter maximum growth rate) and applied to the modified setup revealing critical limits, e.g., for the combination of using three shake flask scales for a slower growing cell line. The latter shows comparatively high deviation rates (19.2–29.1%) together with high seed train durations (548–552 h instead of 519–523 h for the reference growth rate).

To show the applicability of the proposed method to more than two objective criteria, a third and fourth objective criterion, product concentration (titer) and viability after 8 days in the production phase, were added and the optimization was performed regarding four objective criteria in total. These are seed train duration, deviation rate (i.e., the probability that the seed train will run outside the predefined criteria), titer and viability at the end of the production phase.

Moreover, it was investigated for one seed train configuration (three shake flask scales and two objectives using the reference cell growth rate) if increasing the number of Bayes iterations would identify different optima. A number of 20 Bayes iterations turned out to be sufficient, because running 20 or 30 Bayes iterations showed similar results, which underlines the efficiency of the Bayes optimization approach.

In the present case study, the volumes are considered as fixed after optimization. If the production process allows for more flexibility in terms of adapting the volume within a specific range in the case that cells grow slower than the expected mean, then a reduction in the deviation rate can be achieved because varying the volumes allows for regulation of the inoculum viable cell density. However, this flexibility is not always given due to regulatory requirements and therefore not considered in the present study.

4. Conclusions

A concept has been developed to use process models in combination with algorithms for Bayes optimization using Gaussian processes to solve multi-objective optimization problems in the context of biopharmaceutical production processes. To illustrate this approach, a relevant exemplary optimization problem was chosen and solved using the proposed method.

The goal was to find optimal combinations of filling volumes for the shake flask scales of a seed train leading to a minimum deviation rate regarding viable cell densities and a minimum process duration. Compared to a non-optimized reference seed train, the optimized process showed much lower deviation rates regarding viable cell densities (<10% instead of 41.7%) using five or four shake flask scales and seed train duration could be reduced by 56 h from 576 h to 520 h.

Overall, it is shown that applying Bayes optimization to a multi-objective optimization function with several optimizable input variables and a considerable number of constraints leads to informative results with low computational effort. This approach has the potential to be used in the form of a decision tool, e.g., for the choice of an optimal and robust seed train design, but also for further optimization tasks within process development.

It should be noted that Bayes optimization and the corresponding computational modules could also be applied if no mechanistic process model is available, following a slightly different workflow. Instead of performing model-based in silico experiments (process simulations), real lab experiments would be performed and their results fed back to update the black box model (here the Gaussian process). This adaptive procedure (also called Bayesian experimental design or experimental design with Bayesian optimization [41]) and further related optimization methods might be promising tools to support experimental planning, process characterization, process transfer or optimization of cell culture processes, but they still require further research and need to be embedded in software solutions that are easy for operators to use.

Author Contributions

Conceptualization, T.H.R., A.S., B.F. and M.L.-H.; methodology, T.H.R., A.S. and M.L.-H.; software, A.S. and T.H.R.; validation, A.S. and T.H.R.; formal analysis, T.H.R.; investigation, T.H.R., A.S., M.L.-H. and B.F.; resources, M.L.-H. and B.F.; data curation, T.H.R.; writing—original draft preparation, T.H.R.; writing—review and editing, A.S., B.F. and M.L.-H.; visualization, T.H.R. and A.S.; supervision, M.L.-H. and B.F.; project administration, T.H.R., B.F. and M.L.-H.; funding acquisition, B.F. and T.H.R. All authors have read and agreed to the published version of the manuscript.

Funding

The article processing charge (APC) was funded partially by Ostwestfalen-Lippe University of Applied Sciences and Arts (TH OWL).

Acknowledgments

The authors would like to express special thanks to Christoph Posch (Novartis Technical Research and Development) for the fruitful scientific exchange regarding the case study presented here. Moreover, we acknowledge support for the Open Access fees by Ostwestfalen-Lippe University of Applied Sciences and Arts (TH OWL) in the funding program Open Access Publishing.

Conflicts of Interest

The authors (T.H.R., A.S., M.L.-H. and B.F.) declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

CHO	Chinese hamster ovary
EI	Expected improvement
FDA	Food and Drug Administration
GP	Gaussian process
LHS	Latin hypercube sampling
ode	Ordinary differential equations
SE	Squared exponential
VCD	Viable cell density
List of symbols
$α$	Risk aversion parameter (-)
$μ$	Cell-specific growth rate (h $^{- 1}$ )
$μ_{d}$	Cell-specific death rate (h $^{- 1}$ )
$μ_{d, \max}$	Maximum cell-specific death rate (h $^{- 1}$ )
$μ_{d, \min}$	Minimum cell-specific death rate (h $^{- 1}$ )
$μ_{\max}$	Maximum cell-specific growth rate (h $^{- 1}$ )
$μ_{ref}$	Reference maximum cell-specific growth rate (h $^{- 1}$ )
$σ^{2}$	Variance
$c_{Amm}$ ( $c_{Amm, 0}$ )	(Initial) ammonia concentration (mmol L $^{- 1}$ )
$c_{Glc}$ ( $c_{Glc, 0}$ )	(Initial) glucose concentration (mmol L $^{- 1}$ )
$c_{Gln}$ ( $c_{Gln, 0}$ )	(Initial) glutamine concentration (mmol L $^{- 1}$ )
$c_{Lac}$ ( $c_{Lac, 0}$ )	(Initial) lactate concentration (mmol L $^{- 1}$ )
$c_{titer}$ ( $c_{titer, 0}$ )	(Initial) volumetric titer (product concentration) (mg L $^{- 1}$ )
d	Dimension of the input space (number of optimizable variables); seed train duration (h)
D	Data; deviation rate (%)
$E (\cdot)$	Expectation value
f	Objective function (-)
$f_{i}$	Component i of a multidimensional objective function (-)
$F_{sample}$	Change of volume due to sampling (L h $^{- 1}$ )
i	Running index (-)
k	Covariance function
$K_{Amm}$	Correction factor for ammonia uptake (-)
$K_{Lys}$	Cell lysis constant (h $^{- 1}$ )
$K_{S, Glc}$	Monod kinetic constant for glucose (mmol L $^{- 1}$ )
$K_{S, Gln}$	Monod kinetic constant for glutamine (mmol L $^{- 1}$ )
$k_{Glc}$	Monod kinetic constant for glucose uptake (mmol L $^{- 1}$ )
$k_{Gln}$	Monod kinetic constant for glutamine uptake (mmol L $^{- 1}$ )
m ( $m (\cdot)$ )	Mean (mean function)
n	Number of shake flasks (-)
N	Number of iterations (-)
$N$	Normal distribution (-)
$n_{lhs}$	Number of Latin hypercube points (-)
$q_{Amm}$ ( $q_{Amm, uptake, \max}$ )	(Maximum) cell-specific ammonia uptake rate (mmol cell $^{- 1}$ h $^{- 1}$ )
$q_{Glc}$ ( $q_{Glc, \max}$ )	(Maximum) cell-specific glucose uptake rate (mmol cell $^{- 1}$ h $^{- 1}$ )
$q_{Gln}$ ( $q_{Gln, \max}$ )	(Maximum) cell-specific glutamine uptake rate (mmol cell $^{- 1}$ h $^{- 1}$ )
$q_{Lac}$ ( $q_{Lac, uptake, \max}$ )	(Maximum) cell-specific lactate uptake rate (mmol cell $^{- 1}$ h $^{- 1}$ )
$q_{titer}$ ( $q_{titer, \max}$ )	(Maximum) cell-specific product production rate (mg cell $^{- 1}$ h $^{- 1}$ )
$R$	Set of real numbers
t	Time (h)
$T_{p}$	Set of feasible points in time for passaging
$U (\cdot)$	Utility function
V	Volume (L)
$V_{i}$	Volume in shake flask scale i
$Var (\cdot)$	Variance
$x_{c}$	Candidate point
x, $x^{'}$	Multidimensional points (vectors) of the input space
$x^{*}$	Argument that maximizes $f (s)$
$X_{t}$	Total cell density (cells L $^{- 1}$ )
$X_{v}$	Viable cell density (cells L $^{- 1}$ )
$X_{v, i}$	Viable cell density at point in time with index i (cells L $^{- 1}$ )
$X$	Input space
y	Arbitrary function (-)
Y	Arbitrary random variable (-)
$Y_{Amm / Gln}$	Kinetic production constant for ammonia (mmol mmol $^{- 1}$ )
$Y_{Lac / Glc}$	Kinetic production constant for lactate (mmol mmol $^{- 1}$ )

Appendix A

Table A1. Mechanistic model [27,42,43,44,45] for description of cell growth, cell death, substrate uptake, metabolite production and antibody production applicable to batch and fed-batch mode.

Balance Equations	Kinetic Equations
Biomass
$\frac{d X_{v}}{d t} = X_{v} \cdot (μ - μ_{d}) - \frac{F_{Glc} + F_{Gln} + F_{Medium}}{V} \cdot X_{v}$	$μ = μ_{\max} \cdot \frac{c_{Glc}}{c_{Glc} + K_{S, Glc}} \cdot \frac{c_{Gln}}{c_{Gln} + K_{S, Gln}}$ , if $t > t_{Lag}$
	$μ = μ_{\max} \cdot \frac{c_{Glc}}{c_{Glc} + K_{S, Glc}} \cdot \frac{c_{Gln}}{c_{Gln} + K_{S, Gln}} - (1 - \frac{t}{t_{Lag}}) \cdot a_{Lag} \cdot μ_{\max}$ , if $t \leq t_{Lag}$
$\frac{d X_{t}}{d t} = X_{v} \cdot μ - K_{Lys} \cdot (X_{t} - X_{v}) - \frac{F_{Glc} + F_{Gln} + F_{Medium}}{V} \cdot X_{t}$	$μ_{d} = μ_{d, \min} + μ_{d, \max} \cdot \frac{K_{S, Glc}}{K_{S, Glc} + c_{Glc}} \cdot \frac{K_{S, Gln}}{K_{S, Gln} + c_{Gln}}$
Substrates
$\frac{d c_{Glc}}{d t} = - X_{v} \cdot q_{Glc} + \frac{F_{Glc}}{V} \cdot c_{Glc, F} + \frac{F_{Medium}}{V} \cdot c_{Glc, Medium} - \frac{F_{Glc} + F_{Gln} + F_{Medium}}{V} \cdot c_{Glc}$	$q_{Glc} = q_{Glc, \max} \cdot \frac{c_{Glc}}{c_{Glc} + k_{Glc}}$
$\frac{d c_{Gln}}{d t} = - X_{v} \cdot q_{Gln} + \frac{F_{Gln}}{V} \cdot c_{Gln, F} + \frac{F_{Medium}}{V} \cdot c_{Gln, Medium} - \frac{F_{Glc} + F_{Gln} + F_{Medium}}{V} \cdot c_{Gln}$	$q_{Gln} = q_{Gln, \max} \cdot \frac{c_{Gln}}{c_{Gln} + k_{Gln}}$
Metabolites
$\frac{d c_{Lac}}{d t} = X_{v} \cdot q_{Lac} - \frac{F_{Glc} + F_{Gln} + F_{Medium}}{V} \cdot c_{Lac}$	$q_{Lac} = Y_{Lac / Glc} \cdot q_{Glc} \cdot \frac{c_{Glc}}{c_{Lac}} - q_{Lac, uptake} \cdot \frac{μ_{\max} - μ}{μ_{\max}}$
	with $q_{Lac, uptake} = 0$ , if $c_{Glc} > 0.5 \text{ mmol L}^{- 1}$
	with $q_{Lac, uptake} = q_{Lac, uptake, \max}$ , if $c_{Glc} \leq 0.5 \text{ mmol L}^{- 1}$
$\frac{d c_{Amm}}{d t} = X_{v} \cdot q_{Amm} - \frac{F_{Glc} + F_{Gln} + F_{Medium}}{V} \cdot c_{Amm}$	$q_{Amm} = Y_{Amm / Gln} \cdot q_{Gln} \cdot \frac{c_{Gln}}{c_{Amm}} - K_{Amm} \cdot q_{Amm, uptake, \max} \cdot \frac{μ_{\max} - μ}{μ_{\max}}$
	with $K_{Amm} = 0$ , if $(c_{Gln} > c_{Amm})$
	with $K_{Amm} = 1$ , if $(c_{Gln} \leq c_{Amm}) \text{ and } (μ > μ_{d})$
	with $K_{Amm} = - k_{Amm}$ (constant), if $(μ \leq μ_{d})$
Product titer and volume
$\frac{d c_{titer}}{d t} = X_{v} \cdot q_{titer} - \frac{F_{Glc} + F_{Gln} + F_{Medium}}{V} \cdot c_{titer}$	$q_{titer} = q_{titer, \max}$
$\frac{d V}{d t} = - F_{Sample} + F_{Glc} + F_{Gln} + F_{Medium}$
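To illustrate how the kinetic equations of Table A1 interact, the following minimal sketch integrates the balances for viable cell density, glucose and glutamine in batch mode (all feed rates set to zero) with an explicit Euler scheme. It is a simplified illustration, not the simulation tool used in this study: the lag phase, cell lysis, lactate, ammonia and titer are omitted, concentrations are clipped at zero as a numerical safeguard, and all parameter values except $μ_{\max}$ = 0.029 h$^{- 1}$ are hypothetical placeholders.

```python
def growth_rate(c_glc, c_gln, mu_max, K_S_glc, K_S_gln):
    """Cell-specific growth rate after the lag phase (t > t_Lag), Table A1."""
    return mu_max * c_glc / (c_glc + K_S_glc) * c_gln / (c_gln + K_S_gln)


def death_rate(c_glc, c_gln, mu_d_min, mu_d_max, K_S_glc, K_S_gln):
    """Cell-specific death rate, increasing as substrates are depleted."""
    return mu_d_min + mu_d_max * (K_S_glc / (K_S_glc + c_glc)) \
        * (K_S_gln / (K_S_gln + c_gln))


def uptake_rate(c, q_max, k):
    """Cell-specific substrate uptake rate with Monod-type saturation."""
    return q_max * c / (c + k)


def simulate_batch(x_v, c_glc, c_gln, p, hours, dt=0.1):
    """Explicit Euler integration of X_v, c_Glc and c_Gln without feeds."""
    for _ in range(int(round(hours / dt))):
        mu = growth_rate(c_glc, c_gln, p["mu_max"], p["K_S_glc"], p["K_S_gln"])
        mu_d = death_rate(c_glc, c_gln, p["mu_d_min"], p["mu_d_max"],
                          p["K_S_glc"], p["K_S_gln"])
        q_glc = uptake_rate(c_glc, p["q_glc_max"], p["k_glc"])
        q_gln = uptake_rate(c_gln, p["q_gln_max"], p["k_gln"])
        # All right-hand sides use the state at the start of the time step.
        x_v, c_glc, c_gln = (
            x_v + dt * x_v * (mu - mu_d),
            max(c_glc - dt * x_v * q_glc, 0.0),  # clipping is an added safeguard
            max(c_gln - dt * x_v * q_gln, 0.0),
        )
    return x_v, c_glc, c_gln


# mu_max from the text; every other value is a hypothetical placeholder.
params = {
    "mu_max": 0.029,                      # 1/h
    "mu_d_min": 0.001, "mu_d_max": 0.03,  # 1/h
    "K_S_glc": 0.03, "K_S_gln": 0.03,     # mmol/L
    "q_glc_max": 1.0e-10, "k_glc": 1.0,   # mmol/(cell h), mmol/L
    "q_gln_max": 2.0e-11, "k_gln": 0.5,   # mmol/(cell h), mmol/L
}

x_v_end, glc_end, gln_end = simulate_batch(3.0e8, 30.0, 4.0, params, hours=72.0)
print(f"X_v = {x_v_end:.3e} cells/L, Glc = {glc_end:.2f} mmol/L, "
      f"Gln = {gln_end:.2f} mmol/L")
```

For actual use, a dedicated ODE solver and the validated parameter set of the investigated cell line would replace the Euler scheme and the placeholder values.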

Appendix B

Application to Other Cell Lines with Potentially Higher and Lower Maximum Growth Rates

Figure A1. Pareto solutions for 3, 4 and 5 shake flask scales (sf) and for three different growth rates: reference maximum growth rate (left column), a 5% lower growth rate (middle column) and a 5% higher growth rate (right column), showing the objective criterion seed train duration over the objective criterion deviation rate, using 20 optimization iterations. Blue dots: based on the initial Latin hypercube (LHC) design; yellow crosses: based on the points proposed by the algorithm; green circles: Pareto optimal solutions. (a) 5 sf, $μ_{\max, ref}$. (b) 5 sf, $μ_{\max, 95 \%}$. (c) 5 sf, $μ_{\max, 105 \%}$. (d) 4 sf, $μ_{\max, ref}$. (e) 4 sf, $μ_{\max, 95 \%}$. (f) 4 sf, $μ_{\max, 105 \%}$. (g) 3 sf, $μ_{\max, ref}$. (h) 3 sf, $μ_{\max, 95 \%}$. (i) 3 sf, $μ_{\max, 105 \%}$.

References

Herwig, C.; Garcia-Aponte, O.F.; Golabgir, A.; Rathore, A.S. Knowledge management in the QbD paradigm: Manufacturing of biotech therapeutics. Trends Biotechnol. 2015, 33, 381–387.
U.S. Department of Health and Human Services; Food and Drug Administration. Guidance for Industry: PAT—A Framework for Innovative Pharmaceutical Development, Manufacturing, and Quality Assurance. Available online: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/pat-framework-innovative-pharmaceutical-development-manufacturing-and-quality-assurance (accessed on 6 April 2022).
Sokolov, M. Decision Making and Risk Management in Biopharmaceutical Engineering—Opportunities in the Age of COVID-19 and Digitalization. Ind. Eng. Chem. Res. 2020, 59, 17587–17592.
Xie, X.; Schenkendorf, R. Robust Process Design in Pharmaceutical Manufacturing under Batch-to-Batch Variation. Processes 2019, 7, 509.
Liu, Y.; Gunawan, R. Bioprocess optimization under uncertainty using ensemble modeling. J. Biotechnol. 2017, 244, 34–44.
Bradford, E.; Schweidtmann, A.M.; Zhang, D.; Jing, K.; del Rio-Chanona, E.A. Dynamic modeling and optimization of sustainable algal production with uncertainty using multivariate Gaussian processes. Comput. Chem. Eng. 2018, 118, 143–158.
Hua, L.; Chen, M.; Han, X.; Zhang, X.; Zheng, F.; Zhuang, W. Research on the vibration model and vibration performance of cold orbital forging machines. J. Eng. Manuf. 2022, 236, 828–843.
Schweidtmann, A.M.; Clayton, A.D.; Holmes, N.; Bradford, E.; Bourne, R.A.; Lapkin, A.A. Machine learning meets continuous flow chemistry: Automated optimization towards the Pareto front of multiple objectives. Chem. Eng. J. 2018, 352, 277–282.
Clayton, A.D.; Schweidtmann, A.M.; Clemens, G.; Manson, J.A.; Taylor, C.J.; Niño, C.G.; Chamberlain, T.W.; Kapur, N.; Blacker, A.J.; Lapkin, A.A.; et al. Automated self-optimisation of multi-step reaction and separation processes using machine learning. Chem. Eng. J. 2020, 384, 123340.
Rangaiah, G.P.; Feng, Z.; Hoadley, A.F. Multi-Objective Optimization Applications in Chemical Process Engineering: Tutorial and Review. Processes 2020, 8, 508.
Le, H.; Kabbur, S.; Pollastrini, L.; Sun, Z.; Mills, K.; Johnson, K.; Karypis, G.; Hu, W.S. Multivariate analysis of cell culture bioprocess data–lactate consumption as process indicator. J. Biotechnol. 2012, 162, 210–223.
Böhl, O.J.; Schellenberg, J.; Bahnemann, J.; Hitzmann, B.; Scheper, T.; Solle, D. Implementation of QbD strategies in the inoculum expansion of a mAb production process. Eng. Life Sci. 2020, 27, 196–207.
Kushner, H.J. A New Method of Locating the Maximum Point of an Arbitrary Multipeak Curve in the Presence of Noise. J. Basic Eng. 1964, 86, 97–106.
Jones, D.R.; Schonlau, M.; Welch, W.J. Efficient Global Optimization of Expensive Black-Box Functions. J. Glob. Optim. 1998, 13, 455–492.
Brochu, E.; Cora, V.M.; Freitas, N.d. A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning. arXiv 2010, arXiv:1012.2599.
Shahriari, B.; Swersky, K.; Wang, Z.; Adams, R.P.; Freitas, N.d. Taking the Human Out of the Loop: A Review of Bayesian Optimization. Proc. IEEE 2016, 104, 148–175.
Rasmussen, C.E.; Williams, C.K.I. Gaussian Processes for Machine Learning, 3rd ed.; MIT Press: Cambridge, MA, USA, 2008.
Tulsyan, A.; Garvin, C.; Undey, C. Industrial batch process monitoring with limited data. J. Process. Control. 2019, 77, 114–133.
Lange-Hegermann, M. Algorithmic Linearly Constrained Gaussian Processes. In Advances in Neural Information Processing Systems 31 (NeurIPS 2018); Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2018; pp. 2137–2148.
Lange-Hegermann, M. Linearly constrained Gaussian processes with boundary conditions. In Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, San Diego, CA, USA, 13–15 April 2021; pp. 1090–1098.
Bradford, E.; Imsland, L.; Zhang, D.; del Rio Chanona, E.A. Stochastic data-driven model predictive control using Gaussian processes. Comput. Chem. Eng. 2020, 139, 106844.
Petsagkourakis, P.; Sandoval, I.O.; Bradford, E.; Zhang, D.; Chanona, E.A.d.R. Constrained Reinforcement Learning for Dynamic Optimization under Uncertainty. IFAC-PapersOnLine 2020, 53, 11264–11270.
Bradford, E.; Schweidtmann, A.M.; Lapkin, A. Efficient multiobjective optimization employing Gaussian processes, spectral sampling and a genetic algorithm. J. Glob. Optim. 2018, 71, 407–438.
Narayanan, H.; Stosch, M.; Luna, M.F.; Cruz Bournazou, M.N.; Buttè, A.; Sokolov, M. Consistent value creation from bioprocess data with customized algorithms: Opportunities beyond multivariate analysis. In Process Control, Intensification, and Digitalisation in Continuous Biomanufacturing; Subramanian, G., Ed.; Wiley and Sons: Hoboken, NJ, USA, 2022; Volume 36, pp. 231–264.
Yang, K.; Emmerich, M.; Deutz, A.; Bäck, T. Multi-Objective Bayesian Global Optimization using expected hypervolume improvement gradient. Swarm Evol. Comput. 2019, 44, 945–956.
Manheim, D.C.; Detwiler, R.L. Accurate and reliable estimation of kinetic parameters for environmental engineering applications: A global, multi objective, Bayesian optimization approach. MethodsX 2019, 6, 1398–1414.
Hernández Rodríguez, T.; Posch, C.; Schmutzhard, J.; Stettner, J.; Weihs, C.; Pörtner, R.; Frahm, B. Predicting industrial-scale cell culture seed trains-A Bayesian framework for model fitting and parameter estimation, dealing with uncertainty in measurements and model parameters, applied to a nonlinear kinetic cell culture model, using an MCMC method. Biotechnol. Bioeng. 2019, 116, 2944–2959.
Hernández Rodríguez, T.; Frahm, B. Design, optimization, and adaptive control of cell culture seed trains. In Animal Cell Biotechnology; Pörtner, R., Ed.; Humana Press: New York, NY, USA, 2020; Volume 2095, pp. 251–267.
Hernández Rodríguez, T.; Frahm, B. Digital Seed Train Twins and Statistical Methods. Adv. Biochem. Eng. 2020, 9, 964.
Kellerer, B. Portfolio Optimization and Ambiguity Aversion. Jr. Manag. Sci. 2019, 4, 305–338.
Guerard, J.B. Handbook of Portfolio Construction; Springer: Boston, MA, USA, 2010.
Frazier, P.I. A Tutorial on Bayesian Optimization. arXiv 2018, arXiv:1807.02811.
Couckuyt, I.; Deschrijver, D.; Dhaene, T. Fast calculation of multiobjective probability of improvement and expected improvement criteria for Pareto optimization. J. Glob. Optim. 2014, 60, 575–594.
Sekulic, A. Bayes’sche Optimierung von Multikriteriellen Zielfunktionen bei Zellkultur-Seed-Trains. Ph.D. Thesis, Ostwestfalen-Lippe University of Applied Sciences and Arts, Lemgo, Germany, 2020.
Snoek, J.; Larochelle, H.; Adams, R.P. Practical Bayesian Optimization of Machine Learning Algorithms. arXiv 2012, arXiv:1206.2944.
Duvenaud, D. Automatic Model Construction with Gaussian Processes. Ph.D. Thesis, Apollo–University of Cambridge Repository, Cambridge, UK, 2014.
MathWorks Inc. MATLAB, Version 9.7.0 (R2019b); The MathWorks Inc.: Natick, MA, USA, 2019.
van Rossum, G. The Python Language Reference. Documentation for Python. Python Software Foundation; Release 3.0.1 [Repr.] Ed.; SoHo Press: Hampton, WA, USA; Redwood City, CA, USA, 2010.
Knudde, N.; van der Herten, J.; Dhaene, T.; Couckuyt, I. GPflowOpt: A Bayesian Optimization Library Using TensorFlow. arXiv 2017, arXiv:1711.03845.
Hernández Rodríguez, T.; Morerod, S.; Pörtner, R.; Wurm, F.M.; Frahm, B. Considerations of the Impacts of Cell-Specific Growth and Production Rate on Clone Selection—A Simulation Study. Processes 2021, 9, 964.
Greenhill, S.; Rana, S.; Gupta, S.; Vellanki, P.; Venkatesh, S. Bayesian Optimization for Adaptive Experimental Design: A Review. IEEE Access 2020, 8, 13937–13948.
Frahm, B. Seed train optimization for cell culture. In Animal Cell Biotechnology; Pörtner, R., Ed.; Humana Press: New York, NY, USA, 2014; Volume 1104, pp. 355–367.
Kern, S.; Platas-Barradas, O.; Pörtner, R.; Frahm, B. Model-based strategy for cell culture seed train layout verified at lab scale. Cytotechnology 2016, 68, 1019–1032.
Möller, J.; Kuchemüller, K.B.; Steinmetz, T.; Koopmann, K.S.; Pörtner, R. Model-assisted Design of Experiments as a concept for knowledge-based bioprocess development. Bioprocess Biosyst. Eng. 2019, 42, 867–882.
Pörtner, R.; Platas Barradas, O.; Frahm, B.; Hass, V.C. Advanced process and control strategies for bioreactors. In Current Developments in Biotechnology and Bioengineering; Larroche, C., Ángeles Sanromán, M., Du, G., Pandey, A., Eds.; Elsevier: Amsterdam, The Netherlands, 2017; Volume 105, pp. 463–493.

Figure 1. Goal of the study is to propose a concept and a numerical framework for optimal robust seed train design (blue box in the middle), including optimizable inputs (first gray box) as well as objectives (objective criteria) used in this study (right gray box).

Figure 2. Scheme showing the applied computational workflow comprising: (A) a Bayes optimization algorithm which is coupled with (B) a seed train simulation routine. Input and output values are shown in the blue boxes above and below.

Figure 3. Algorithmically determined solutions and Pareto front of the two objective criteria seed train duration and deviation rate (optimizable variables: here, combinations of 5 shake flask filling volumes). Blue dots show an initial Latin hypercube design (LHC); yellow crosses are the points proposed by the algorithm; green circles are Pareto optimal solutions (=Pareto front).

Figure 4. Contour plots showing two optimizable variables on the x- and y-axes and one objective (here deviation rate D in %), assigned to each combination of the two variables, through colored isolines. For example, the diagram in the top left shows the deviation rate for each combination of $V_{1}$ (filling volume in flask scale 1) and $V_{2}$ (filling volume in flask scale 2) through colors representing the corresponding values in %, as indicated on the color bar. Moreover, the results obtained through seed train simulations are shown by dots. The red dots represent the non-dominated (optimal) solutions.

Figure 5. Contour plots showing two optimizable variables on the x- and y-axes and one objective (here seed train duration d in h), assigned to each combination of the two variables, through colored isolines. For example, the diagram in the top left shows the seed train duration for each combination of $V_{1}$ (filling volume in flask scale 1) and $V_{2}$ (filling volume in flask scale 2) through colors representing the corresponding values in h, as indicated on the color bar. Moreover, the results obtained through seed train simulations are shown by dots. The red dots represent the non-dominated (optimal) solutions.

Figure 6. Seed train showing viable cell density (VCD) and total cell density, as well as substrate (glucose and glutamine) and metabolite (lactate and ammonium) concentrations over time and over the whole seed train (5 shake flask scales and three bioreactor scales), based on the shake flask filling volumes according to solution 1. The green lines represent the mean time course and the blue lines show the corresponding 90%-prediction band (5%- and 95%-quantiles). The plot (top left) also includes the filling volumes and the acceptable ranges for seeding VCD and transfer VCD, illustrated through dashed lines.

Figure 7. Reference (non-optimized) seed train showing viable and total cell density, as well as substrate (glucose and glutamine) and metabolite (lactate and ammonium) concentrations over time, based on a reference configuration setup for 5 shake flask scales using passaging intervals of 72 h each. The green lines represent the mean time course and the blue lines show the corresponding 90%-prediction band (5%- and 95%-quantiles).

Figure 8. Algorithmically determined solutions and Pareto front regarding seed train duration and deviation rate for 3 shake flask scales (top) and 4 shake flask scales (bottom). Blue dots show an initial Latin hypercube design (LHC); yellow crosses are the points proposed by the algorithm; green circles are Pareto optimal solutions (=Pareto front).

Figure 9. Spider plot showing the objective criteria values (seed train duration, deviation rate, titer and viability after 8 days in the production vessel) for the Pareto optimal solutions for 5 shake flask scales.

Figure 10. Algorithmically determined solutions and Pareto front of the two objective criteria seed train duration and deviation rate (optimizable variables: here, combinations of 3 shake flask filling volumes) for 10 (top left), 20 (top right) and 30 (bottom left) Bayes iterations; blue dots show the initial Latin hypercube design (LHC); yellow crosses show the points proposed by the algorithm; green circles are the Pareto optimal solutions (=Pareto front).

Table 1. Specification of the exemplary seed train setup providing information concerning cultivation vessels, required viable cell densities and the transfer of cells from one cultivation vessel into the next larger one, assumed in this work.

Seed Train Setup
Flask scales:	3, 4 or 5 flask scales between 0.014 L and 8 L filling volume
Bioreactor scales:	3 bioreactor scales, 38 L, 302 L and 2054 L filling volume
Production bioreactor:	9500 L filling volume
Optimal range for viable seeding cell density:	$3 \times 10^{8}$–$3.5 \times 10^{8}$ cells L$^{-1}$ ($3 \times 10^{5}$–$3.5 \times 10^{5}$ cells mL$^{-1}$)
Optimal range for transfer viable cell density:	$0.1 \times 10^{10}$–$1 \times 10^{10}$ cells L$^{-1}$ ($0.1 \times 10^{7}$–$1 \times 10^{7}$ cells mL$^{-1}$)
Target seeding (initial) viable cell density:	$3.15 \times 10^{8}$ cells L$^{-1}$ ($3.15 \times 10^{5}$ cells mL$^{-1}$) (= minimum seeding VCD + 5%)
Strategy concerning point in time for cell passaging:	‘Xv transfer’, i.e., passaging as soon as the calculated required viable transfer cell density is reached
Practically feasible points in time for passaging:	Passaging between 48 and 120 h possible (flexible ranges)
Strategy concerning current and new volume:	Discard cell suspension during the passaging step, if required to start within an optimal seeding cell density range
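
The passaging rules in Table 1 can be illustrated with a simple cell-number balance. The following is a minimal sketch under stated assumptions, not the simulation routine used in this work: it assumes that the transferred suspension is diluted with fresh medium to the next filling volume, and that the required transfer density is the target seeding density multiplied by the volume ratio. The helper `passaging_plan` and its return values are hypothetical names introduced only for illustration; the densities come from Table 1 and the example volumes from solution 1 in Table 4.

```python
# Values taken from Table 1 (cells per litre).
SEED_TARGET = 3.15e8                         # target seeding VCD
TRANSFER_MIN, TRANSFER_MAX = 0.1e10, 1e10    # optimal transfer VCD range


def passaging_plan(v_current, v_next):
    """Hypothetical helper: plan one passaging step via a cell-number balance.

    Assumption: the transferred suspension is topped up with fresh medium
    to v_next so that the new culture starts at SEED_TARGET.
    Returns (transfer VCD, inoculum volume, discarded volume).
    """
    # VCD needed if the whole current culture is used to seed v_next.
    required = SEED_TARGET * v_next / v_current
    if required > TRANSFER_MAX:
        raise ValueError("scale-up step too large for the transfer VCD range")
    if required >= TRANSFER_MIN:
        # Whole culture is transferred, nothing is discarded.
        return required, v_current, 0.0
    # Otherwise passage at the lower bound and discard the surplus suspension.
    v_inoculum = SEED_TARGET * v_next / TRANSFER_MIN
    return TRANSFER_MIN, v_inoculum, v_current - v_inoculum


# Flask volumes of solution 1 (Table 4), followed by the first bioreactor (40 L).
volumes = [0.015, 0.065, 0.904, 2.355, 7.78, 40.0]
for v_cur, v_nxt in zip(volumes, volumes[1:]):
    vcd, v_in, v_out = passaging_plan(v_cur, v_nxt)
    print(f"{v_cur:6.3f} L -> {v_nxt:6.3f} L: transfer at {vcd:.2e} cells/L, "
          f"inoculum {v_in:.3f} L, discard {v_out:.3f} L")
```

Under these assumptions only the step from 0.904 L to 2.355 L requires discarding suspension, because the volume ratio is too small to dilute a culture at the lower transfer bound down to the target seeding density.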

Table 2. Input space for the shake flask filling volumes, containing the possible filling volumes per scale, given for optimization runs with 5, 4 or 3 shake flask scales.

	Filling Volumes
	Range for 5 Shake Flask Scales [L]	Range for 4 Shake Flask Scales [L]	Range for 3 Shake Flask Scales [L]
V1	0.014–0.015	0.014–0.015	0.014–0.015
V2	0.05–0.15	0.1–1	0.1–2
V3	0.15–1.5	1.5–4	4–8
V4	1.5–4	4–8	-
V5	4–8	-	-
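
The figure captions refer to an initial Latin hypercube (LHC) design drawn from this input space. As a rough illustration of how such a design can be generated from the ranges in Table 2, the sketch below uses only the Python standard library. The function name `latin_hypercube`, the number of samples and the seed are assumptions made for this example; the settings actually used in this work are not stated here.

```python
import random

# Filling-volume ranges in litres for 5 shake flask scales (Table 2).
BOUNDS_5SF = [(0.014, 0.015), (0.05, 0.15), (0.15, 1.5), (1.5, 4.0), (4.0, 8.0)]


def latin_hypercube(bounds, n_samples, seed=0):
    """Draw n_samples points so that, per variable, each of the n_samples
    equally wide strata of its range contains exactly one point."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)  # random pairing of strata across variables
        width = (high - low) / n_samples
        # One uniformly placed value inside each stratum.
        columns.append([low + (s + rng.random()) * width for s in strata])
    # Transpose: one row per candidate set of filling volumes (V1..V5).
    return [list(point) for point in zip(*columns)]


# n_samples=10 is an arbitrary choice for illustration.
design = latin_hypercube(BOUNDS_5SF, n_samples=10)
for point in design:
    print(["%.3f" % v for v in point])
```

Each row is one candidate seed train configuration that could be passed to the simulation before the optimizer starts proposing its own points.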

Table 3. Pareto optimal solutions concerning the choice of filling volumes in shake flask scales, for three scenarios for 5 flask scales. The following bioreactor filling volumes are 40 L, 320 L and 2210 L. The averaged filling volumes in L and the resulting deviation rate (D) in % and seed train duration (d) in h are listed for each Pareto optimal solution.

	Filling Volumes
Solution	Vol. 1	Vol. 2	Vol. 3	Vol. 4	Vol. 5	D	d
	[L]	[L]	[L]	[L]	[L]	[%]	[h]
1	0.015	0.065	0.904	2.355	7.78	4.9	537
2	0.015	0.115	0.451	1.672	7.89	6.1	521
3	0.015	0.104	0.340	1.614	6.85	5.2	524
4	0.014	0.103	0.369	1.582	7.87	5.3	523
5	0.015	0.114	0.431	2.026	7.97	6.7	520
	Filling volumes of reference seed train
Reference	0.015	0.08	0.30	2	4	41.7	576

Table 4. Pareto optimal solutions concerning the choice of filling volumes in shake flask scales, for 3, 4 and 5 shake flask scales. The following bioreactor filling volumes are 40 L, 320 L and 2210 L. The averaged filling volumes in L, the resulting deviation rate (D) in % and seed train duration d in h are listed for each solution.

	Filling Volumes
Solution	Vol. 1	Vol. 2	Vol. 3	Vol. 4	Vol. 5	D	d
	[L]	[L]	[L]	[L]	[L]	[%]	[h]
	5 flask scales
1	0.015	0.065	0.904	2.355	7.78	4.9	537
2	0.015	0.115	0.451	1.672	7.89	6.1	521
3	0.015	0.104	0.340	1.614	6.85	5.2	524
4	0.014	0.103	0.369	1.582	7.87	5.3	523
5	0.015	0.114	0.431	2.026	7.97	6.7	520
	4 flask scales
1	0.015	0.195	1.60	7.58		9.2	520
2	0.015	0.190	1.51	5.39		8.0	521
3	0.015	0.169	1.52	6.33		7.5	522
4	0.015	0.158	1.59	4.81		5.9	528
	3 flask scales
1	0.015	0.733	4.45			23.0	522
2	0.015	1.046	4.85			23.8	521
3	0.015	1.103	5.26			24.8	520
4	0.015	0.934	4.77			23.0	522
5	0.015	1.110	4.65			22.2	523
6	0.015	1.306	5.58			26.4	519
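
To make the notion of Pareto optimality in these tables concrete, the following minimal sketch filters non-dominated points when both objectives (deviation rate D and seed train duration d) are minimized. It is a generic dominance check written for illustration, not the selection routine of the optimization library used in this work; the function name `pareto_front` is hypothetical. The data are the five-flask solutions listed above plus the reference seed train (D = 41.7%, d = 576 h) from Table 3.

```python
def pareto_front(points):
    """Return the points that no other point dominates (all objectives minimized)."""
    def dominates(p, q):
        # p dominates q if it is no worse in every objective and better in at least one.
        return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))
    return [p for p in points if not any(dominates(q, p) for q in points)]


# (deviation rate D in %, seed train duration d in h)
five_flask = [(4.9, 537), (6.1, 521), (5.2, 524), (5.3, 523), (6.7, 520)]
reference = (41.7, 576)

front = pareto_front(five_flask + [reference])
print(front)
```

The five listed solutions trade a lower deviation rate against a longer duration, so none dominates another, whereas the reference seed train is worse in both objectives and drops out of the front.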

Table 5. Pareto optimal solutions concerning the choice of filling volumes in shake flask scales, for 3, 4 and 5 shake flask scales, for two different scenarios. Scenario 1 assumes a 5% lower and scenario 2 a 5% higher cell-specific maximum growth rate compared to the reference maximum growth rate. The bioreactor filling volumes which follow after the shake flask scales are 40 L, 320 L and 2210 L. The filling volumes in L, the resulting deviation rate (D) in % and seed train duration (d) in h are listed for each solution (several Pareto optimal solutions can be obtained per setup).

	Filling Volumes
Solution	Vol. 1	Vol. 2	Vol. 3	Vol. 4	Vol. 5	D	d
	[L]	[L]	[L]	[L]	[L]	[%]	[h]
	5 flask scales
5% lower growth rate
1	0.015	0.083	0.45	2.06	6.67	7.3	550
2	0.014	0.072	0.30	2.78	6.23	7.0	551
3	0.014	0.105	0.45	2.53	6.77	6.1	553
4	0.015	0.119	0.84	2.96	6.21	5.5	568
5	0.014	0.122	0.56	2.64	6.62	5.8	559
5% higher growth rate
6	0.015	0.08	0.354	1.52	7.24	6.2	495
7	0.014	0.09	0.560	1.89	7.19	4.6	502
8	0.014	0.14	0.545	1.89	7.16	4.1	503
9	0.015	0.06	0.313	1.63	7.72	6.5	494
10	0.014	0.09	0.312	1.82	7.08	5.3	496
11	0.014	0.13	0.564	2.16	7.41	4.8	501
	4 flask scales
5% lower growth rate
12	0.015	0.158	1.99	7.6		8.8	551
13	0.015	0.147	1.59	7.7		7.7	552
14	0.015	0.132	2.00	7.8		7.6	553
15	0.015	0.167	1.56	7.5		9.8	550
16	0.015	0.180	1.58	7.4		11.6	548
5% higher growth rate
17	0.015	0.199	1.59	5.2		5.3	501
18	0.015	0.246	1.53	7.9		10.5	493
19	0.015	0.215	1.53	7.2		7.7	494
20	0.015	0.193	1.56	6.8		6.8	496
21	0.015	0.191	1.60	5.6		5.8	498
22	0.015	0.210	1.56	7.5		7.3	495
23	0.014	0.183	1.71	6.1		5.4	499
	3 flask scales
5% lower growth rate
24	0.015	0.174	4.14			20.0	550
25	0.015	1.011	4.03			25.7	549
26	0.015	0.151	4.02			19.2	552
27	0.015	0.929	4.34			29.1	548
5% higher growth rate
28	0.015	0.235	4.277			11.13	496
29	0.015	0.987	7.683			25.5	493
30	0.015	0.267	4.589			14.9	495
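
As a rough plausibility check on the durations in Table 5, one can assume purely exponential growth, in which case the time needed to reach a given expansion factor scales with 1/μ_max. This is a simplifying assumption made only for this sketch; the simulation model in this work also accounts for substrates, metabolites and parameter uncertainty. The function name `scaled_duration` is hypothetical.

```python
def scaled_duration(reference_duration_h, growth_rate_factor):
    """Duration if every growth phase scales as 1/mu_max (pure exponential growth assumed)."""
    return reference_duration_h / growth_rate_factor


reference_h = 520  # shortest Pareto optimal duration for 5 flask scales (Table 4)
for factor in (0.95, 1.05):
    print(f"mu_max x {factor:.2f}: about {scaled_duration(reference_h, factor):.0f} h")
```

This gives about 547 h for the 5% lower and about 495 h for the 5% higher growth rate, which is of the same order as the simulated durations in Table 5 (548 to 568 h and 493 to 503 h, respectively).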

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Hernández Rodríguez, T.; Sekulic, A.; Lange-Hegermann, M.; Frahm, B. Designing Robust Biotechnological Processes Regarding Variabilities Using Multi-Objective Optimization Applied to a Biopharmaceutical Seed Train Design. Processes 2022, 10, 883. https://doi.org/10.3390/pr10050883

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
