Accelerated Chemical Reaction Optimization Using Multi-Task Learning

Functionalization of C–H bonds is a key challenge in medicinal chemistry, particularly for fragment-based drug discovery (FBDD) where such transformations require execution in the presence of polar functionality necessary for protein binding. Recent work has shown the effectiveness of Bayesian optimization (BO) for the self-optimization of chemical reactions; however, in all previous cases these algorithmic procedures have started with no prior information about the reaction of interest. In this work, we explore the use of multitask Bayesian optimization (MTBO) in several in silico case studies by leveraging reaction data collected from historical optimization campaigns to accelerate the optimization of new reactions. This methodology was then translated to real-world, medicinal chemistry applications in the yield optimization of several pharmaceutical intermediates using an autonomous flow-based reactor platform. The use of the MTBO algorithm was shown to be successful in determining optimal conditions of unseen experimental C–H activation reactions with differing substrates, demonstrating an efficient optimization strategy with large potential cost reductions when compared to industry-standard process optimization techniques. Our findings highlight the effectiveness of the methodology as an enabling tool in medicinal chemistry workflows, representing a step-change in the utilization of data and machine learning with the goal of accelerated reaction optimization.


Benchmarking using Literature Data
We used literature data to benchmark our Bayesian optimization strategies in silico. The challenge is that chemical data is often recorded in unstructured text documents such as journal publications and patents. While there are some conventions, each author is free to express the details of a chemical reaction as they wish. Such unstructured information is not amenable to most machine learning algorithms. Therefore, we designed a custom data extraction workflow that we think is a model for how to apply transfer learning in chemistry.

Data extraction workflow
As shown in Figure S1, we developed a benchmarking workflow that converts unstructured data into benchmarks that can be used to compare optimization strategies. We adopted the Open Reaction Database (ORD) format as a common representation of reactions [1] and wrote converters from spreadsheet formats to ORD. Once converted to ORD, the data was saved to local disk for featurization. The featurization step then turned the ORD records into a set of features that can be used to train a benchmark or a Gaussian process (GP) for Bayesian optimization. Categorical variables were represented with one-hot encodings.
We used data from publications on Suzuki couplings (see main text Scheme 1) and C-N couplings (see Scheme S1) [2,3].
Figure S1: Workflow for converting data in spreadsheets into ORD format and subsequently into training datasets for machine learning.
Figure S2: Workflow for training an ExperimentalEmulator to act as a benchmark.
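The one-hot featurization step can be illustrated with a minimal, library-free sketch. It is not the code used in this work: the function names, the record fields (`catalyst`, `solvent`, `temperature`), and the example values are hypothetical, and the real pipeline operates on ORD records rather than plain dictionaries.

```python
from typing import Dict, List, Sequence


def one_hot_encode(values: Sequence[str]) -> Dict[str, List[int]]:
    """Map each distinct category to a one-hot vector (categories sorted for determinism)."""
    categories = sorted(set(values))
    return {
        cat: [1 if i == idx else 0 for i in range(len(categories))]
        for idx, cat in enumerate(categories)
    }


def featurize(
    reactions: List[dict], categorical: Sequence[str], continuous: Sequence[str]
) -> List[List[float]]:
    """Concatenate one-hot categorical features with the raw continuous features."""
    encoders = {name: one_hot_encode([r[name] for r in reactions]) for name in categorical}
    rows = []
    for r in reactions:
        row: List[float] = []
        for name in categorical:
            row.extend(encoders[name][r[name]])
        row.extend(float(r[name]) for name in continuous)
        rows.append(row)
    return rows


# Hypothetical records; field names and values are illustrative only.
reactions = [
    {"catalyst": "P1-L1", "solvent": "THF", "temperature": 60.0},
    {"catalyst": "P1-L2", "solvent": "MeCN", "temperature": 80.0},
    {"catalyst": "P1-L1", "solvent": "MeCN", "temperature": 100.0},
]
features = featurize(reactions, ["catalyst", "solvent"], ["temperature"])
```

Each row of `features` holds one block of 0/1 entries per categorical variable followed by the continuous variables, which is the form a regressor or GP would consume.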

Benchmark Training
We used Summit [4] to build the predictive models from the literature reports. As shown in Figure S2, we used the ExperimentalEmulator feature in Summit, which creates a benchmark based on experimental data. The regressor was a neural network with one hidden layer of 512 units and a ReLU activation function. A one-hot encoding was used for the pre-catalyst and ligand combinations. The neural networks were trained with five-fold cross-validation over 1000 epochs using stochastic gradient descent. Figures S3-S10 show the parity plots for the benchmarks.
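The five-fold cross-validation mentioned above can be illustrated with a library-free sketch of the fold splitting. This is not Summit's implementation: the function name, the shuffling, and the seed are assumptions, and the dataset size of 96 is purely illustrative. In practice the regressor would be fitted on each `train` split and scored on the matching `validation` split.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(
    n_samples: int, k: int = 5, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Distribute the shuffled samples as evenly as possible across the k folds.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Illustrative dataset size; each sample appears in exactly one validation fold.
splits = list(k_fold_indices(n_samples=96, k=5))
```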

Benchmarking Simulation Details
All benchmark simulations were executed on an Amazon Web Services instance with an Nvidia T4 GPU via Lightning AI. For each combination of optimization strategy, task, and auxiliary task, 20 repeats were completed. Figures show the average performance and the 95% confidence interval at each iteration. For multitask benchmarks, the first experiment executed was the highest-yielding condition from the auxiliary task(s).
Figure S11: Comparison of the performance of single-task Bayesian optimization (STBO) and multi-task Bayesian optimization (MTBO) on Suzuki reactions R1-R4 with auxiliary data from Suzuki B1.
Figure S12: Comparison of the performance of single-task Bayesian optimization (STBO) and multi-task Bayesian optimization (MTBO) on Suzuki R1-R4 with Suzuki R1-R4 as auxiliary tasks. The text above the plot represents the data used as an auxiliary task.
Figure S13: Comparison of the performance of single-task Bayesian optimization (STBO) and multi-task Bayesian optimization (MTBO) on Suzuki R1-R4 with Suzuki R1-R4 as auxiliary tasks. The text above the plot represents the data used as an auxiliary task.
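The aggregation over repeats and the choice of the first multitask experiment can be sketched as follows. Several points are assumptions rather than details taken from this work: "performance" is taken to be the best yield found so far, the 95% confidence interval uses a normal approximation (1.96 standard errors), and all function names and record fields are hypothetical.

```python
import math
import statistics
from typing import Dict, List, Tuple


def best_so_far(yields: List[float]) -> List[float]:
    """Running maximum of the yield over successive experiments."""
    trace, best = [], float("-inf")
    for y in yields:
        best = max(best, y)
        trace.append(best)
    return trace


def mean_and_ci95(repeats: List[List[float]]) -> List[Tuple[float, float]]:
    """Mean and 95% CI half-width per iteration (normal approximation, an assumption)."""
    traces = [best_so_far(r) for r in repeats]
    summary = []
    for values in zip(*traces):
        mean = statistics.mean(values)
        sem = statistics.stdev(values) / math.sqrt(len(values))
        summary.append((mean, 1.96 * sem))
    return summary


def initial_condition(auxiliary: List[Dict]) -> Dict:
    """Return the conditions of the highest-yielding auxiliary-task record."""
    best = max(auxiliary, key=lambda record: record["yield"])
    return {k: v for k, v in best.items() if k != "yield"}


# Two illustrative repeats of three experiments each (yields in %).
summary = mean_and_ci95([[10.0, 30.0, 20.0], [20.0, 20.0, 40.0]])
first = initial_condition(
    [{"solvent": "THF", "yield": 40.0}, {"solvent": "MeCN", "yield": 75.0}]
)
```

With 20 repeats per configuration, a Student's t multiplier would give a slightly wider interval than 1.96; which of the two was used for the figures is not stated here.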

C-N Benchmarks
Figure S14: Comparison of the performance of single-task Bayesian optimization (STBO) and multi-task Bayesian optimization (MTBO) on C-N B1-B4 with C-N B1-B4 as auxiliary tasks. The text above the plot represents the data used as an auxiliary task.
Figure S15: Comparison of the performance of single-task Bayesian optimization (STBO) and multi-task Bayesian optimization (MTBO) on C-N B1-B4 with all remaining C-N tasks for auxiliary training. The text above the plot represents the data used as an auxiliary task.

A note on literature data for optimization
During the preparation of this manuscript and the work discussed herein, the authors considered using data mined from reaction databases to improve the performance of MTBO with multiple auxiliary tasks. However, when the authors attempted this, it did not generally work well, for the following reasons:
- Typically, the variation in the literature is in the substrate, while in reaction optimization the variation is in the agents (catalyst, reagent, solvent). Therefore, models trained on the literature tend to be unhelpful for reaction optimization. A recent paper from Wuitschik's team at Roche captures this issue well [5].
- The broad applicability of mined data for chemists is hindered by the lack of clean datasets from the literature. Obtaining a clean dataset from the literature takes a significant amount of work, since synthetic chemistry data is not stored in a structured format. Even data gathered from Reaxys (which is already heavily processed) requires several postprocessing steps. For example, when the authors created a Suzuki coupling dataset from Reaxys, about 50% of the reactions had no reported yield, and 75% of the entries with a yield were duplicates.
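The two postprocessing steps mentioned above, dropping entries without a yield and removing duplicates, can be sketched as below. This is an illustration only: the record fields and the definition of a duplicate (identical reactants, product, conditions, and yield) are assumptions, not the criteria used by the authors, and the example records are made up.

```python
from typing import Dict, Iterable, List, Set, Tuple


def clean_records(records: Iterable[Dict]) -> List[Dict]:
    """Drop records without a yield, then drop exact duplicates of the rest."""
    seen: Set[Tuple] = set()
    cleaned: List[Dict] = []
    for record in records:
        if record.get("yield") is None:
            continue  # no reported yield
        # Assumed duplicate key: same reactants, product, conditions, and yield.
        key = (
            record["reactants"],
            record["product"],
            record["conditions"],
            record["yield"],
        )
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned


# Made-up records with hypothetical field names.
raw = [
    {"reactants": "A.B", "product": "C", "conditions": "Pd/THF", "yield": 82.0},
    {"reactants": "A.B", "product": "C", "conditions": "Pd/THF", "yield": 82.0},
    {"reactants": "A.D", "product": "E", "conditions": "Pd/MeCN", "yield": None},
    {"reactants": "A.D", "product": "E", "conditions": "Pd/THF", "yield": 45.0},
]
cleaned = clean_records(raw)
```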

General procedure for synthesis of product analytical standards
The general procedure described by Hennessy and Buchwald [6] was followed. An oven-dried 50 mL round-bottomed flask was equipped with a magnetic stir bar and air condenser. Palladium acetate (2 mol%), JohnPhos (4 mol%) and the chloroacetanilide substrate (5 mmol) were then charged, and the flask was evacuated and backfilled with nitrogen three times. Anhydrous triethylamine (1.05 mL, 10 mmol) was added, followed by anhydrous toluene (5 mL). The reaction mixture was heated at 80 °C overnight, then diluted with ethyl acetate (50 mL). The mixture was filtered through a plug of celite, concentrated on a rotary evaporator, then purified by silica gel chromatography to give the oxindole product. For each case study, the starting material was purchased from J&H Chemical in 95% purity. All other chemicals were purchased from Sigma Aldrich unless otherwise stated.
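The catalyst and ligand masses implied by the mol% loadings can be worked out with a short calculation. The sketch below is illustrative only: the helper name is hypothetical, and the molar masses (about 224.51 g/mol for palladium acetate and about 298.40 g/mol for JohnPhos) are standard reference values rather than figures reported in this work.

```python
def amount_mg(substrate_mmol: float, mol_percent: float, molar_mass_g_per_mol: float) -> float:
    """Mass in mg for a given mol% loading relative to the substrate."""
    mmol = substrate_mmol * mol_percent / 100.0
    return mmol * molar_mass_g_per_mol  # mmol x g/mol = mg


# Reference molar masses (assumed values, not taken from this document).
PD_OAC2_G_PER_MOL = 224.51
JOHNPHOS_G_PER_MOL = 298.40

pd_mg = amount_mg(substrate_mmol=5.0, mol_percent=2.0, molar_mass_g_per_mol=PD_OAC2_G_PER_MOL)
johnphos_mg = amount_mg(substrate_mmol=5.0, mol_percent=4.0, molar_mass_g_per_mol=JOHNPHOS_G_PER_MOL)
```

On a 5 mmol scale this corresponds to roughly 22 mg of palladium acetate and 60 mg of JohnPhos.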

Experimental setup
The experimental setup pictured in Figure S16 was used, as described in the Methods section of the paper. Solubility studies were initially conducted to determine the maximum concentrations of each starting material and product in different solvents. This information was then used to select the concentration of starting material for each reaction (0.09 M) and the solvents of choice, to help prevent reactor clogging. In each study, a fixed amount of material in a pre-weighed vial was dosed with small portions of solvent and stirred vigorously until the material was fully dissolved; the vial was then weighed, and the resulting concentration calculated. These maximum concentrations are reported in Table S1.
The weighing of the individual components into the sample vials differed for each of the case studies; these vials were used by the liquid handler as solution reservoirs. Vials 1-5 contained the components that differed between case studies, whilst the contents of vials 6-25 remained constant throughout. Each vial contained 10 mL of reaction solution and was purged with nitrogen. The triethylamine and all solvents were anhydrous, and biphenyl was used as an internal standard. The concentrations were calculated so that a constant 0.09 M concentration of starting material in the reactor could be obtained through dilution, enabling a fair comparison between all conditions where the mol% of catalyst was varied. The weights of all components are given in Tables S2 and S3.
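The two calculations described in this section, the maximum concentration from the gravimetric solubility test and the dilution to 0.09 M in the reactor, can be sketched as follows. The function names and example numbers are hypothetical, and the sketch assumes that the solvent volume is obtained from its mass and density and that the volume contributed by the dissolved solid is negligible.

```python
def max_concentration_molar(
    solute_mg: float,
    molar_mass_g_per_mol: float,
    solvent_added_g: float,
    solvent_density_g_per_ml: float,
) -> float:
    """Concentration in mol/L once the solid has just dissolved (solute volume neglected)."""
    moles = solute_mg / 1000.0 / molar_mass_g_per_mol
    volume_l = solvent_added_g / solvent_density_g_per_ml / 1000.0
    return moles / volume_l


def stock_volume_fraction(stock_molar: float, target_molar: float = 0.09) -> float:
    """Volume fraction of stock solution in the combined stream, from C1*V1 = C2*V2."""
    if stock_molar < target_molar:
        raise ValueError("stock solution is more dilute than the target concentration")
    return target_molar / stock_molar


# Illustrative numbers only: 200 mg of a 200 g/mol solid dissolving in 8.67 g of
# toluene (density about 0.867 g/mL), and a 0.18 M stock diluted to 0.09 M.
c_max = max_concentration_molar(200.0, 200.0, 8.67, 0.867)
fraction = stock_volume_fraction(stock_molar=0.18)
```

Here the solvent mass would come from the difference between the final vial weight and the combined weight of the empty vial and solid.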

Case study 1
An analytical standard for 18 was obtained using the general procedure, where LCMS analysis gave product m/z 166.15 and the 1H NMR showed conversion to product, as shown in Figure S17.

Case study 2
An analytical standard for 20 was obtained using the general procedure, where LCMS analysis gave product m/z 409.30 and the 1H NMR showed conversion to product, as shown in Figure S18. A pure product could not be obtained for this analytical standard, so HPLC calibration was performed after quantitative NMR assays to determine the % purity of the product.
Figure S18: 1H NMR for oxindole product 20.

Case study 3
An analytical standard for 22 was obtained using the general procedure, where LCMS analysis gave product m/z 211.05 and the 1H NMR showed conversion to product, as shown in Figure S19.

Case study 4
An analytical standard for 24 was purchased from Sigma Aldrich.

Case study 5
An analytical standard for 1,7-dimethylindolin-2-one was obtained using the general procedure, where LCMS analysis gave product m/z 162.31 and the 1H NMR showed conversion to product, as shown in Figure S20. A pure product could not be obtained for this analytical standard, so HPLC calibration was performed after quantitative NMR assays to determine the % purity of the product.
Figure S20: 1H NMR for oxindole product 1,7-dimethylindolin-2-one.