Automated self-optimisation of multi-step reaction and separation processes using machine learning

Furthermore, a Claisen-Schmidt condensation reaction with subsequent liquid-liquid separation was optimised with respect to three objectives. This approach provides the ability to simultaneously optimise multi-step processes with respect to multiple objectives, and thus has the potential to make substantial savings in time and resources.

reaction parameters such as temperature, stirring and reagent addition were optimised using a Nelder-Mead SIMPLEX algorithm. Since then, developments in automated laboratory hardware and control software have driven the significant evolution of these systems [3]. More recently, continuous flow systems have been combined with online/in-line analytics and optimisation algorithms to automate reaction optimisation (commonly referred to as 'self-optimisation') [4,5]. The ability of self-optimisation to efficiently identify optimal operating conditions in a multivariate parameter space has presented many opportunities for more efficient process development. Such advantages align with the rising interest in continuous flow chemistry for the 'greener' synthesis of active pharmaceutical ingredients (APIs), and offer the potential to reduce the drug development timeline [6][7][8].
Process development in the pharmaceutical industry must simultaneously consider multiple performance criteria based on conflicting economic and environmental objectives [9]. This applies to the whole process, including non-reactive unit operations such as work-up. Despite this, the majority of self-optimisation applications to date have focused on single-objective optimisation of single-step reactions, utilising the following algorithms: model-based design of experiments [10][11][12], SNOBFIT [13][14][15][16], and Nelder-Mead SIMPLEX or variations thereof [17][18][19][20][21][22][23]. Furthermore, these algorithms are not data-efficient, as they do not utilise all the available experimental data to build a global surrogate model. Thus, they are not well suited to expensive-to-evaluate chemical synthesis problems, particularly those involving complex, multi-step and pharmaceutically relevant processes [24].
In our previous work, we investigated the use of a data-efficient Bayesian optimisation algorithm for the self-optimisation of chemical reactions with two competing performance criteria [25]. The Thompson sampling efficient multi-objective (TSEMO) algorithm [26,27] builds machine learning surrogate models, i.e. Gaussian processes (GPs), based on all available data and compares favourably with other algorithms such as EHI [28] and ParEGO [29]. TSEMO was integrated with our automated experimental platform and enabled the simultaneous optimisation of space-time yield (STY) with E-factor or impurity content for two exemplar reactions. Notably, the complete trade-off curve (Pareto front [30]) highlighting the compromise between the objectives was identified in a practical number of experiments, overcoming the issues associated with the scalarisation of multiple objectives [31]. TSEMO was also used effectively in a workflow for solvent selection for optimal reactivity and selectivity [32], and in the optimisation of life cycle environmental impacts vs. process economics for computer-aided process design [33]. Based on these results, we hypothesised that the TSEMO algorithm would be well suited to the optimisation of multi-step continuous reaction processes. Herein, we describe the application of this methodology to: (i) a pharmaceutically relevant Sonogashira reaction; (ii) a multi-step Claisen-Schmidt condensation reaction with in-line liquid-liquid extraction.

TSEMO algorithm
The TSEMO algorithm is designed to solve expensive-to-evaluate black-box multi-objective optimisation problems. The algorithm utilises a Bayesian methodology employing GPs as the surrogate model. The surrogate models are coupled with acquisition functions which are optimised in lieu of the real process to suggest the next point of evaluation. TSEMO makes use of Thompson Sampling to determine the next point of evaluation through random sampling of the posterior distribution of the GPs. The sample is then optimised through use of the NSGA-II algorithm to propose a set of candidate points. NSGA-II is an elitist genetic algorithm which solves multi-objective optimisation problems using Pareto ranking and crowding distance computations. The candidate which maximises the hypervolume improvement is selected as the next point to sample. This process continues until the algorithm reaches a pre-defined maximum number of process/function evaluations.
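The final selection step described above, choosing the candidate that maximises the hypervolume improvement, can be illustrated with a minimal two-objective sketch. This is not the TSEMO implementation: it assumes both objectives are minimised (a maximised objective would be negated first), it takes the candidate set as given rather than generating it by Thompson sampling and NSGA-II, and the function names, reference point and numerical values are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def pareto_front(points: Sequence[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by the first objective (minimisation)."""
    front = []
    for p in points:
        dominated = any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points)
        if not dominated:
            front.append(p)
    return sorted(set(front))


def hypervolume_2d(points: Sequence[Point], ref: Point) -> float:
    """Area dominated by the points and bounded by the reference point."""
    front = [p for p in pareto_front(points) if p[0] < ref[0] and p[1] < ref[1]]
    area = 0.0
    previous_f2 = ref[1]
    # Sorted by f1 ascending, so f2 is descending along the front.
    for f1, f2 in front:
        area += (ref[0] - f1) * (previous_f2 - f2)
        previous_f2 = f2
    return area


def select_candidate(observed: Sequence[Point], candidates: Sequence[Point],
                     ref: Point) -> Tuple[Point, float]:
    """Pick the candidate whose addition gives the largest hypervolume gain."""
    base = hypervolume_2d(observed, ref)
    gains = [hypervolume_2d(list(observed) + [c], ref) - base for c in candidates]
    best = max(range(len(candidates)), key=gains.__getitem__)
    return candidates[best], gains[best]


# Hypothetical objective values (both minimised) and reference point.
observed = [(1.0, 4.0), (3.0, 2.0)]
candidates = [(2.0, 2.5), (3.5, 3.0), (4.0, 1.0)]
best, gain = select_candidate(observed, candidates, ref=(5.0, 5.0))
print(best, gain)
```

In the algorithm itself the candidate objective values come from a sampled GP function rather than from measurements, so the improvement is evaluated before any experiment is run.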
The TSEMO algorithm was combined with an automated continuous flow platform to enable the closed-loop multi-objective optimisation of chemical processes. The algorithm was operated in batch-sequential mode, which proposes multiple sampling points at each iteration. In previous work, we demonstrated that there was a minimal reduction in performance when batches of four evaluations were used compared to single evaluations [26]. Therefore, batches of four experiments were used in these case studies, as this enabled faster optimisations by parallelising the start of an experiment with the analysis of the previous experiment.

Self-optimising platform
Reagents were pumped using JASCO PU980, HiTec Zang SyrDos and/or Knauer AZURA HPLC pumps, and were mixed in Swagelok SS-100-3 tee-pieces. Tubular reactors were constructed from Polyfon PTFE tubing (0.1 cm ID) and fitted to a Cambridge Reactor Design Polar Bear Flow Synthesiser. Miniature continuous stirred tank reactor (CSTR) cascades were custom-built and are detailed below. The reactor was maintained under a fixed back pressure using an Upchurch Scientific back pressure regulator. A Zaiput SEP-10 liquid-liquid membrane-based separator fitted with a PTFE membrane (0.5 μm pore size) was used for in-line phase separation when required. Sampling of the organic phase was achieved using a VICI Valco EUDA-CI4W.06 sample loop with a 0.06 μL injection volume. Quantitative analysis was performed on an Agilent 1100 series HPLC instrument (HPLC methods for each case study are provided in the ESI). Steady state was monitored using a Kaiser RxN1 785 nm Raman System when required. The automated reactor was controlled by a custom-written MATLAB program, within which the TSEMO algorithm was implemented.

General optimisation procedure
An optimisation program was written in MATLAB that controlled the pump flow rates and reactor temperature, determined steady state, calculated the responses and controlled the inputs and outputs to and from the TSEMO algorithm (TSEMO repository: https://github.com/Eric-Bradford/TS-EMO). The reactant flow rates were reduced to a minimum (dead-time conditions) during heating/cooling of the reactor to minimise the amount of material used. The algorithm was operated in batch-sequential mode, such that each iteration included four experiments. The responses for each objective were calculated from the HPLC chromatograms at the end of each iteration, and the results used to update the surrogate models and generate the next set of operating conditions.
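The structure of this batch-sequential procedure can be sketched as follows. The sketch is written in Python rather than MATLAB and is not the authors' control program: `run_experiment` and `propose_batch` are hypothetical callables standing in for the platform (pumps, reactor, HPLC) and for TSEMO respectively, the variable bounds are unit-scaled placeholders, and the Latin hypercube initialisation is included because the case studies in this work start from such a design.

```python
import random
from typing import Callable, List, Sequence, Tuple

Conditions = Tuple[float, ...]
Responses = Tuple[float, ...]
Record = Tuple[Conditions, Responses]
Bounds = Sequence[Tuple[float, float]]


def latin_hypercube(n: int, bounds: Bounds, rng: random.Random) -> List[Conditions]:
    """One sample per stratum in every dimension, with strata paired at random."""
    columns = []
    for low, high in bounds:
        strata = list(range(n))
        rng.shuffle(strata)
        columns.append([low + (s + rng.random()) / n * (high - low) for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n)]


def closed_loop(run_experiment: Callable[[Conditions], Responses],
                propose_batch: Callable[..., List[Conditions]],
                bounds: Bounds, n_initial: int, n_total: int,
                batch_size: int = 4, seed: int = 0) -> List[Record]:
    """Initial design, then batches of experiments until the budget is spent."""
    rng = random.Random(seed)
    data = [(x, run_experiment(x)) for x in latin_hypercube(n_initial, bounds, rng)]
    while len(data) < n_total:
        size = min(batch_size, n_total - len(data))
        batch = propose_batch(data, bounds, size, rng)
        # The whole batch is evaluated before the data set is updated, so the
        # optimiser only sees new results once per iteration.
        data.extend((x, run_experiment(x)) for x in batch)
    return data


def random_proposer(data, bounds, size, rng):
    """Placeholder for the optimiser: uniform random conditions within bounds."""
    return [tuple(rng.uniform(lo, hi) for lo, hi in bounds) for _ in range(size)]


def toy_experiment(x: Conditions) -> Responses:
    """Placeholder for the platform: returns two made-up responses."""
    return (sum(x), -x[0])


# 20 initial experiments followed by 15 batches of four (80 in total).
records = closed_loop(toy_experiment, random_proposer, [(0.0, 1.0)] * 3, 20, 80)
print(len(records))
```

Replacing `random_proposer` with a call into a real multi-objective optimiser, and `toy_experiment` with the instrument control code, would leave the loop itself unchanged.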

Miniature CSTR cascade
Each CSTR had a stainless steel base, equipped with a polyacetal lid. The reaction chamber was cylindrical with a 2 mL volume, containing a PTFE coated cross stirrer bar (10 mm diameter) to provide mechanical mixing. A convex glass lens (viewing window) and PTFE gasket were clamped down between the base and lid using three bolts to form a seal. The CSTRs were connected using Polyfon PTFE tubing (1/8″ OD, 1/16″ ID) to form a cascade of desired length and volume. An aluminium heating mantle was designed to inset four CSTRs (volume = 8 mL). Channels were made in the mantle to provide inlet/outlet flow streams to each CSTR. The mantle was heated using two nickel heating element inserts. The temperature was monitored using a thermocouple and controlled using a Eurotherm temperature controller, which was coded into the control software of the self-optimising system. The compact design ensured that mixing could be achieved in all four CSTRs using a single conventional stirrer plate. Separate thermocouples were placed in the additional inlet of each CSTR, to directly monitor the internal temperature of each reactor using a Pico logger.

Towards the synthesis of lanabecestat
Lanabecestat (AZD3293) is a potent inhibitor of β-site amyloid cleaving enzymes [34], which break down β-amyloid proteins into neurotoxic fragments [35]. Hence, lanabecestat was identified as a potential drug candidate for Alzheimer's disease, reaching Phase III clinical trials. Despite a growing interest in the use of flow chemistry in the pharmaceutical industry, there have been very few reports regarding the self-optimisation of synthetic steps in API production [13].
A key step in the original batch synthesis was the alkynylation of 3,5-dibromopyridine with TMS-propyne [36]. The motivations for transferring this step to flow included: (i) TMS-propyne could be safely exchanged for propyne gas, removing the need for additional additives; (ii) downstream lithiation/borylation chemistry is well suited for flow which would enable a telescoped synthesis, providing a significant manufacturing cost saving; (iii) precise control of reaction parameters in flow would give more consistent product quality [37].
To optimise this process, we studied a model Sonogashira reaction between 3,5-dibromopyridine 2 and 1-hexyne 3 (Scheme 1). 1-Hexyne 3 was selected as a model substrate as it is cheaper and easier to handle at room temperature compared to propyne. Due to current difficulties removing 2 during the downstream work-up, the aim of the optimisation was to simultaneously minimise the amount of 2 remaining and maximise the space-time yield (STY) with respect to the mono-alkyne 4 product [Eq. (1)]. The continuous variables which were optimised are: residence time (tR), 1-hexyne 3 equivalents and temperature. The optimisation was initialised with 20 Latin hypercube (LHC) experiments, followed by a subsequent 60 experiments designed by the TSEMO algorithm. The algorithm converged to a Pareto front consisting of 20 solutions (Fig. 1). The maximum STY found was 3198.8 kg m⁻³ h⁻¹ with 10.9% 2 remaining. In contrast, the maximum conversion corresponds to 1.9% 2 remaining and a STY of 315.4 kg m⁻³ h⁻¹. Between those two edge points, the Pareto front highlights the inherent trade-off between conversion and productivity. Notably, the Pareto front dominates the feasible region. This means that for all points to the left of the Pareto front there exists at least one Pareto point where both objectives are better. Conversely, one objective of a Pareto-optimal point cannot be improved without worsening another objective.
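The dominance relation described above can be made concrete with a short sketch for this pair of objectives, where the percentage of 2 remaining is minimised and the STY is maximised. The two edge points are the values quoted in the text; the other two points are hypothetical and are included only so that the filter has something to reject.

```python
from typing import List, Sequence, Tuple

# (percentage of 2 remaining, STY in kg m^-3 h^-1)
Result = Tuple[float, float]


def dominates(a: Result, b: Result) -> bool:
    """True if a is no worse than b in both objectives and better in at least one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    better = a[0] < b[0] or a[1] > b[1]
    return no_worse and better


def non_dominated(results: Sequence[Result]) -> List[Result]:
    """Keep only the results that no other result dominates."""
    return [p for p in results if not any(dominates(q, p) for q in results)]


results = [
    (10.9, 3198.8),  # maximum STY quoted in the text
    (1.9, 315.4),    # maximum conversion quoted in the text
    (5.0, 300.0),    # hypothetical: worse than (1.9, 315.4) in both objectives
    (12.0, 3000.0),  # hypothetical: worse than (10.9, 3198.8) in both objectives
]
print(non_dominated(results))
```

Neither edge point dominates the other, since each is better in one objective and worse in the other, which is why both remain on the front.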
The reaction profiles for the % of 2 remaining and STY are shown in Fig. 2a and b respectively. Notably, inspection along the z-axis indicates that varying the temperature between 120 and 150°C has little effect on either objective. In contrast, reducing the residence time corresponds to a large increase in STY, due to both a reduction in process time and reduced conversion of mono-alkyne 4 to bis-alkyne 5 (Fig. 2c). This correlates with a relatively small increase in 2 remaining. For example, the STY can be increased from 657 to 1586 kg m⁻³ h⁻¹ by reducing the residence time from 4.1 to 1.8 min, whilst only increasing the 2 remaining from 2.1 to 3.7%. The STY can be further increased at the lower limits of residence time by reducing the equivalents of 1-hexyne 3, which reduces conversion of mono-alkyne 4 to bis-alkyne 5. However, this correlates with a relatively large increase in % of 2 remaining.
The results from this optimisation enable process chemists to visualise the trade-off between conversion and productivity. In this case, the best STY can readily be selected under the current work-up limitations with respect to 2. Unlike targeted or weighted objective optimisation, the TSEMO algorithm identifies the complete trade-off, such that the data can be re-evaluated without further experimentation if process parameters are altered. This is particularly beneficial in the pharmaceutical industry, where the specifications of the downstream work-up are dynamic during process development. Therefore, as most pharmaceutical processes have competing objectives, such algorithms are a beneficial tool for flexible project scenario planning.
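Re-evaluating the trade-off when the work-up specification changes amounts to a simple constrained look-up over the stored results, as sketched below. The four data pairs are values quoted in the text and are used purely for illustration; the two intermediate pairs come from the reaction profile discussion and are not claimed to belong to the measured Pareto set. The limits on 2 remaining are hypothetical specifications.

```python
from typing import Optional, Sequence, Tuple

# (percentage of 2 remaining, STY in kg m^-3 h^-1)
Result = Tuple[float, float]


def best_sty_within_limit(front: Sequence[Result],
                          max_remaining: float) -> Optional[Result]:
    """Highest-STY point whose residual 2 does not exceed the given limit."""
    feasible = [p for p in front if p[0] <= max_remaining]
    if not feasible:
        return None  # no stored result meets this specification
    return max(feasible, key=lambda p: p[1])


# Illustrative values taken from the text (see the caveat above).
front = [(1.9, 315.4), (2.1, 657.0), (3.7, 1586.0), (10.9, 3198.8)]

# Hypothetical work-up limits on the percentage of 2 remaining.
for limit in (4.0, 2.0, 1.0):
    print(limit, best_sty_within_limit(front, limit))
```

No new experiments are needed when the limit changes; only the look-up is repeated, which is the practical benefit of having the complete trade-off curve.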
One of the challenges associated with expensive-to-evaluate black-box optimisation problems is determining an appropriate termination criterion. Too few experiments can yield suboptimal solutions and inaccurate surrogate models, whereas too many experiments are wasteful in terms of time and materials. In terms of multi-objective optimisation, one could envisage automatically terminating the optimisation once the hypervolume improvement between successive iterations falls below a pre-defined level, where the hypervolume is the volume in the objective space between the current Pareto front and a reference point. However, as there is no guarantee that the hypervolume will improve between each iteration, there is a risk of premature termination of the optimisation (see ESI). Thus, we monitored the progress of the optimisation via visual inspection of changes to the predicted Pareto front (Fig. 3). The predicted Pareto front is found by performing a multi-objective optimisation of the GP surrogate model predictions using the NSGA-II algorithm [38]. It is evident that the initial 20 LHC experiments are insufficient for creating GP surrogate models that accurately describe the final Pareto front. The shape of the Pareto front is significantly changed after the initial exploration by the TSEMO algorithm and the subsequent updating of the GP surrogate models with new data. The optimisation was terminated after 80 experiments, as we were satisfied that there were no significant changes to the GP surrogate models between experiments 60-80.
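The risk of premature termination with a hypervolume-based rule can be illustrated with a small sketch. This is not the procedure used in this work, which relied on visual inspection; it is one possible automated rule in which the improvement is assessed over a window of several iterations rather than a single one. The tolerance, the window length (`patience`) and the hypervolume history are all hypothetical.

```python
from typing import Sequence


def should_terminate(hv_history: Sequence[float], rel_tol: float = 0.01,
                     patience: int = 3) -> bool:
    """Stop when the relative hypervolume gain over the last `patience`
    iterations falls below `rel_tol`."""
    if len(hv_history) <= patience:
        return False  # not enough iterations to judge
    recent = hv_history[-(patience + 1):]
    baseline = recent[0]
    if baseline <= 0:
        return False  # relative gain is undefined without a positive baseline
    return (recent[-1] - baseline) / baseline < rel_tol


# Hypothetical hypervolume after each iteration: a plateau, then a late jump.
history = [10.0, 14.0, 14.0, 14.05, 17.0, 17.05, 17.1, 17.1]

# Comparing successive iterations only would stop on the early plateau.
print(should_terminate(history[:3], patience=1))
# A longer window rides out the plateau and stops once progress has stalled.
print(should_terminate(history[:4], patience=3))
print(should_terminate(history, patience=3))
```

A longer window reduces, but does not remove, the risk of stopping on a temporary plateau, so a rule of this kind would still need to be combined with judgement about the surrogate models.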

Multi-step self-optimisation
In the previous example, we demonstrated that a reaction in a continually developing process can be optimised whilst considering subsequent work-up operations. An alternative approach would be to optimise the reaction and work-up steps simultaneously. This would reduce the overall development time and enable the impact of downstream processes on economic and environmental objectives to be simultaneously considered. To test this hypothesis, we selected a multi-step process involving the Claisen-Schmidt condensation reaction between benzaldehyde 6 and acetone 7 (Scheme 2), and its subsequent work-up step, targeting benzylideneacetone 8 as the desired product [39]. To the best of the authors' knowledge, this is the first report of simultaneous automated optimisation of a multi-step reaction-separation sequence. The reaction was conducted in a toluene/acetone/water solvent mixture, resulting in a biphasic organic-aqueous reaction medium. Multiphasic reactions require effective mixing to overcome mass transfer limitations. This can be achieved in tubular reactors by using static inserts (passive mixing); however, relatively high flow rates and large reactor volumes are required [40], which are unsuitable for expensive-to-evaluate optimisations. In contrast, miniature CSTRs provide active mixing which decouples mixing performance and flow rates [41], thus enabling laboratory-scale optimisation of multiphasic reactions with varying residence times. Given this, a temperature-controlled version of our previously reported laboratory-scale CSTR [42] was selected as an appropriate reactor for this reaction.
The subsequent separation of the organic and aqueous phases was facilitated using an in-line liquid-liquid membrane-based separator [43]. The process was optimised by simultaneously maximising purity, STY and reaction mass efficiency (RME) with respect to benzylideneacetone 8 in the organic phase [Eq. (2)]. The optimisation was conducted in terms of flow rates (ν) and ratios, where aq = aqueous and org = organic [ν(6) + ν(7)]. Changing flow rates in this system affects the process in two ways: (i) the reaction in terms of residence time and acetone/sodium hydroxide equivalents; (ii) the solvent ratio and, therefore, the separation in terms of partitioning between the organic and the aqueous phases. The optimisation was initialised with 20 LHC experiments, followed by a subsequent 89 experiments designed by the TSEMO algorithm. Of the 109 experiments conducted, 18 Pareto-optimal solutions were identified. A comparison of the responses at the optimum for each function identified conflicts between all three objectives (Table 1). In this case, the purity was ≈10% lower, the STY was ≈2.5× lower and the RME was ≈1.5× lower at the optima of the other objectives compared to their own. Inspection of the optimum conditions for each objective showed that they were located at three different corners of the experimental space (see ESI for plots). In general, all of the objectives favoured high temperatures, as formation of the dibenzylideneacetone 9 by-product was negligible (< 1%) in the presence of a large excess of acetone 7 [44]. Although the optimum conditions for each objective were identified, this method provided limited process knowledge regarding the influence of the variables on each individual step. This was due to confounding between the reaction and work-up steps, caused by monitoring of the multi-step process using a single downstream analytical source. Therefore, systems optimised using this method are best treated as a black box, where the inputs and outputs are described but without knowledge of how they are related. In cases where a higher degree of process understanding is required, additional process analytical technologies should be integrated downstream of each individual step [45].
A surface was fitted to the non-dominated solutions to provide a visual representation of the Pareto front (see Fig. 4). This successfully highlighted the complete trade-off between all three objectives, which could be used to aid decision making during process design. For example, the Pareto front indicates that a purity of 82.8%, STY of 166 kg m⁻³ h⁻¹ and RME of 5.66 can be obtained, which represents an approximately equal compromise between all three objectives. Alternatively, a greater importance could be placed on one or more of the objectives, and the process conditions selected accordingly post-optimisation. In this case, the reagents were all cheap and readily available. Hence, the termination criterion was based on practical time limitations, and the optimisation stopped after 65 h. Although the number of experiments was relatively high (> 100), a two-step process was successfully optimised with respect to three objectives without human intervention, and the trade-off between the objectives was identified to aid decision making during process design. Therefore, the efficiency of this approach is favourable when considered against other optimisation strategies.
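One possible way of locating an approximately equal compromise on a three-objective Pareto front is sketched below. The selection rule (scaling each objective to its range over the front, then choosing the point whose worst scaled objective is highest) is an assumption made for illustration and is not stated in this work. Only the point (82.8, 166, 5.66) is taken from the text; the other three points are hypothetical stand-ins for the single-objective optima.

```python
from typing import List, Sequence, Tuple

# (purity in %, STY in kg m^-3 h^-1, RME), all maximised
Result = Tuple[float, float, float]


def balanced_compromise(front: Sequence[Result]) -> Result:
    """Return the point whose worst range-scaled objective is largest."""
    n_obj = len(front[0])
    lows = [min(p[k] for p in front) for k in range(n_obj)]
    highs = [max(p[k] for p in front) for k in range(n_obj)]

    def worst_scaled(p: Result) -> float:
        scores: List[float] = []
        for k in range(n_obj):
            span = highs[k] - lows[k]
            # An objective that does not vary over the front cannot discriminate.
            scores.append((p[k] - lows[k]) / span if span > 0 else 1.0)
        return min(scores)

    return max(front, key=worst_scaled)


front = [
    (90.0, 80.0, 4.0),    # hypothetical purity optimum
    (75.0, 260.0, 4.5),   # hypothetical STY optimum
    (78.0, 100.0, 8.0),   # hypothetical RME optimum
    (82.8, 166.0, 5.66),  # compromise point quoted in the text
]
print(balanced_compromise(front))
```

Placing greater importance on one objective, as discussed above, could be expressed by weighting the scaled scores before taking the minimum.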
The majority of existing automated optimisation techniques are focused on single-objective optimisation (e.g. Nelder-Mead SIMPLEX, SNOBFIT etc.), where global optimisers require a larger number of experiments per objective compared to this approach. For example, the SNOBFIT optimisation of two steps with respect to three objectives and four variables would require between 252 and 420 experiments based on previous reports [13,39], compared to 109 experiments in this study. Furthermore, single-objective optimisation does not provide the trade-off curve between conflicting performance criteria. In contrast, the multi-objective TSEMO algorithm identifies the Pareto front, notably within a more practical number of experiments. An additional drawback of single-objective optimisation is that the data are not suitable for re-evaluation in cases where specifications are dynamic during process development, such as continuous processes with downstream unit operations.
Scheme 2. Reactor set-up for the Claisen-Schmidt condensation reaction between benzaldehyde 6 and acetone 7 to form the desired benzylideneacetone 8 and the undesired dibenzylideneacetone 9. P = pump, BPR = back pressure regulator, SL = sample loop.
In these examples, identification of the trade-off curve was shown to be essential for the active learning process during multi-step process design. Although this work optimises only continuous variables, TSEMO has also been used for discrete variables, i.e. solvent selection for asymmetric chemistry [32], and the extension to mixed-integer decisions (e.g. catalysts, bases, solvents and simultaneous reaction conditions) is an ongoing area of research in our laboratories.

Conclusions
In conclusion, we have successfully combined multi-objective optimisation based on machine learning methods with self-optimising platforms for the optimisation of multi-step continuous reaction processes. The benefits of this approach compared to other methods include the ability to re-evaluate data post-optimisation during dynamic process development, and the identification of a trade-off curve to aid the active learning process during multi-step process design. Application of this method enabled the optimisation of a pharmaceutically relevant Sonogashira reaction in just 13 h, removing the need to conduct further experiments if the downstream work-up process specifications were changed. This corresponded to a significant reduction in the quantity of high-value catalytic reagents required during process development. Furthermore, a multi-step reaction and work-up process were simultaneously optimised with respect to three objectives in just 65 h, which would conventionally require six separate optimisations carried out over multiple weeks with no guarantee that the trade-off would be obtained. Hence, this method provides a highly efficient and sustainable approach for the development of multi-step continuous flow processes, and aligns with the increasing utilisation of multi-step flow chemistry for the synthesis of active pharmaceutical ingredients.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.