Rocketsled: a software library for optimizing high-throughput computational searches

A major goal of computation is to optimize an objective for which a forward calculation is possible, but no inverse solution exists. Examples include tuning parameters in a nuclear reactor design, optimizing structures in protein folding, or predicting an optimal materials composition for a functional application. In such instances, directing calculations in an optimal manner is important to obtaining the best possible solution within a fixed computational budget. Here, we introduce Rocketsled, an open-source Python-based software framework to help users optimize arbitrary objective functions. Rocketsled is built upon the existing FireWorks workflow software, which allows its computations to scale to supercomputing centers and its objective functions to be complex, long-running, and error-prone workflows. Other unique features of Rocketsled include the ability to easily swap out the underlying optimizer, the ability to handle multiple competing objectives, the possibility to inject domain knowledge into the optimizer through feature engineering, the incorporation of uncertainty estimates, and a parallelization scheme for running in high throughput at massive scale. We demonstrate the generality of Rocketsled by applying it to optimize several common test functions (Branin-Hoo, Rosenbrock 2D, and Hartmann 6D). We highlight its potential impact through two example use cases for computational materials science. In a search for photocatalysts for hydrogen production among 18 928 perovskites previously calculated with density functional theory, the untuned Rocketsled Random Forest optimizer explores the search space with approximately 6–28 times fewer calculations than random search. In a search among 7394 materials for superhard candidates, Rocketsled requires approximately 61 times fewer calculations than random search to discover interesting candidates.
Thus, Rocketsled provides a practical framework for establishing complex optimization schemes with minimal code infrastructure and enables the efficient exploration of otherwise prohibitively large search spaces.


Introduction
High throughput computing (HTC) has emerged as an indispensable framework for computational scientific inquiry, finding broad use from genotype-phenotype mapping in phenomics [1] to screening-based materials discovery [2][3][4]. HTC is defined by its ability to robustly run collections of independent computational tasks over distributed resources; modern highly-parallel computing resources make this a natural abstraction for many computational tasks. With cumulative run times of potentially millions of CPU hours, a natural question is how HTC experiments can be optimized to reduce total computational time (or equivalently, how to maximize the utility of a given computational budget). Furthermore, even when one plans to comprehensively compute an entire search space, the use of optimization can help find a desired solution earlier in the process. This can be particularly useful if a comprehensive search potentially involves weeks or months of calculation. A large class of problems in HTC can be viewed fundamentally as optimization problems over high-dimensional search spaces.

Figure 1. Motivation behind adaptive design. The adaptive strategy, also known as an optimization loop, uses a surrogate model (e.g. Bayesian or machine learning) trained on previous information to select the next calculation or experiment to run. This process is repeated in an iterative fashion by updating the model with new information. Adaptive search strategies can produce better results than other search strategies (e.g. random) with a given number of calculations, which makes them attractive for problems where exhaustive searches are impractical.

Rocketsled also provides a method of including domain knowledge to improve optimization performance through a feature we call 'z-descriptors'. We note that the goal of Rocketsled is not to implement a new algorithm for optimization; instead, Rocketsled comes bundled with several built-in Bayesian optimization (BO) engines and is capable of using external optimization engines such as skopt, MOE, SigOpt, or a user-defined algorithm. Rocketsled is thus able to support many kinds of scientific workflows, hardware configurations, optimization engines, and distributed execution modes that are not possible with existing software.
In this paper, we provide an overview of the design principles behind Rocketsled, highlighting the software design of the HTC framework and briefly describing the mathematics of the optimization. We then demonstrate the performance of Rocketsled in optimizing generic mathematical test functions as well as example use cases in two high-throughput computing applications in inorganic materials discovery. Finally, we discuss the distinctions and limitations of Rocketsled. Rocketsled is released open-source under a BSD-style license and can be obtained at www.github.com/hackingmaterials/rocketsled; the current version at the time of this writing is v2019.2.27.

Methods
The goal of Rocketsled is to reduce the time scientists spend designing, writing, and debugging code for optimization loops and ultimately create a high-level and automatic optimization engine for HTC problems. In this section, we summarize the main methods and software Rocketsled uses, and we provide more comprehensive details in the supplementary information (available online at stacks.iop.org/JPMATER/2/034002/mmedia).

Mechanics of rocketsled execution: high-throughput computing with fireworks
Rocketsled is integrated with the open-source scientific workflow code FireWorks. This allows Rocketsled to be hardware-agnostic and workflow-agnostic and to separate the adaptive design component (i.e. building of surrogate model and selection of next calculation to run) from the management and execution of the workflows on various computing platforms. For example, through FireWorks, expensive parts of the workflows, such as physics simulations, can be executed on clusters or supercomputers, whereas less expensive parts, such as optimizations, can be executed locally or on clusters, with all workflows managed through a central database server (MongoDB [24]). Because the design of Rocketsled integrates closely with that of FireWorks, we briefly describe FireWorks' core design below. An in-depth description of FireWorks can be found in the original publication [17] and online documentation (https://materialsproject.github.io/fireworks/).
A FireWorks workflow is composed of one or more jobs (called 'Fireworks') that may have a complex set of dependencies and data passing between them (figure 3(a)). Each Firework is run on a single computing platform, but different Fireworks within the same workflow can run in parallel on different platforms if the workflow graph and the user specification allow. Each Firework is in turn composed of one or more Firetasks, which are Python functions or runnable scripts that are executed sequentially (figure 3(b)). A workflow can thus be as simple as a single Firetask within a single Firework (e.g. a single script to run), or as complex as hundreds or thousands of individual operations that pass data between them. The FireWorks software allows users to define a set of workflows, store them in a central database (LaunchPad), and distribute execution over multiple computing centers. The database manages the state of all the workflows and allows users to inspect progress, rerun failed jobs, and perform other tasks.
Rocketsled is implemented as a Firetask called OptTask; when designing a workflow, the user will typically place the OptTask at the very end of the workflow representing the function evaluation (figure 3(b)). The operation of OptTask consists of three stages (figure 4). In the first stage (figure 4(a)), the OptTask will read in the data generated by the current workflow, namely the set of input parameters that were chosen for this evaluation (the vector x) and the value of the output response (the vector y, with a single value for conventional optimization and multiple values for multi-objective optimization) for this particular input choice. It will also generate any domain-specific descriptors describing the input space that can be used to help the surrogate model (the vector z), as will be discussed later. The OptTask next stores this information in a MongoDB database and retrieves all previously generated inputs, outputs, and descriptors (the matrices X, Y, and Z, respectively) to assemble a full record of information known thus far. In the second stage (figure 4(b)), the OptTask will build a surrogate model that can be used to predict the values and uncertainties of all remaining inputs in the domain. Rocketsled provides many different algorithms for generating this surrogate model. In the third and final stage (figure 4(c)), OptTask will evaluate the remaining points in the space (accounting for uncertainties) to determine the best of the remaining inputs to run via one of several acquisition functions (described later), and automatically submit a new workflow for this best input. We emphasize that, once set up, no user intervention is needed to maintain the system. Rocketsled will continue submitting calculations automatically, improving the accuracy of the surrogate model with every new calculation, until directed to stop (e.g. through stopping the FireWorks 'rocket launches') or until the search space has been exhausted.
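The three-stage loop that OptTask automates can be sketched in plain Python. This is a conceptual toy, not the Rocketsled API: the objective function, the discrete search space, and the nearest-neighbor 'surrogate' with greedy acquisition are all illustrative stand-ins:

```python
def objective(x):
    # Stand-in for an expensive workflow evaluation (e.g. a DFT calculation).
    return (x - 0.73) ** 2

def nearest_neighbor_predict(X, Y, x):
    # Toy "surrogate model": predict the response of the closest explored point.
    i = min(range(len(X)), key=lambda j: abs(X[j] - x))
    return Y[i]

def adaptive_loop(space, n_iter):
    # Stage 1: seed and maintain a record of all (x, y) evaluated so far.
    X, Y = [space[0]], [objective(space[0])]
    for _ in range(n_iter):
        remaining = [x for x in space if x not in X]
        if not remaining:
            break
        # Stage 2: build the surrogate's predictions over the unexplored space.
        preds = {x: nearest_neighbor_predict(X, Y, x) for x in remaining}
        # Stage 3: pick the most promising input and "submit" its workflow.
        x_next = min(preds, key=preds.get)
        X.append(x_next)
        Y.append(objective(x_next))
    return X, Y

space = [i / 20 for i in range(21)]   # discrete search space on [0, 1]
X, Y = adaptive_loop(space, n_iter=10)
print(min(Y))
```

In the real system, each iteration of this loop is a full FireWorks workflow, and the record (X, Y) lives in MongoDB rather than in memory.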

Optimization backend
Rocketsled operates in a manner that is agnostic to the choice of underlying optimization engine. The requirements for an optimizer are that, given the dimensions of the search space and a set of previously evaluated inputs X with the corresponding f(x) responses Y, it returns at least one x in the unexplored space with which to start the next workflow. Rocketsled supports the ability to use one of several built-in Bayesian optimizers or to integrate external custom optimizers written as Python functions. The built-in optimizers are selected with arguments to OptTask and are based on scikit-learn [25] regressors, a full list of which can be found in the supplement. These optimizers possess the full scope of Rocketsled's abilities, including tolerance-based duplicate prevention and selection of the acquisition function. Custom optimizers retain the core Rocketsled features, including FireWorks execution, separate optimization management, use of surrogate functions, and multi-objective mode.

Figure 3. (a) An example FireWorks workflow. Each Firework may contain expensive operations, such as a physics simulation, and each job can be executed conditionally or on different computing resources. Running a workflow is analogous to evaluating an objective function. This particular workflow takes in an input vector x_2, runs four Fireworks, and outputs an objective function response y_2. (b) The final Firework job in the workflow from (a), containing OptTask, the Rocketsled optimization object, as the final Firetask. OptTask reads x_2 (the input parameters in this workflow) and y_2 (the output of the function evaluation) from the workflow using the Firework's 'spec' (a set of shared variables common to all the tasks in a Firework job) and submits the optimal next workflow (e.g. x_5) for execution.
The built-in Rocketsled optimizers operate by fitting regression models to the available data, predicting f(x) for points in the unexplored space, generating uncertainty estimates for each point, and selecting the next best x to run. The uncertainty estimates are either taken directly from the model (if the model itself provides such estimates) or bootstrapped [26] if no uncertainty is directly provided by the model. Once uncertainty estimates have been generated, Rocketsled selects the next best x by maximizing the chosen acquisition function. Acquisition functions are statistical methods for selecting the next point to run given limited knowledge about the objective function. Rocketsled comes with five acquisition functions, the most well-known being Expected Improvement (EI) [27, 28]. EI favors points with both a high probability of improving and a large margin of improvement, and is mathematically defined as [29]:

EI(x) = (f(x_min) − μ(x) − c) Φ(z) + σ(x) φ(z), with z = (f(x_min) − μ(x) − c)/σ(x),

where μ(x) and σ(x) are the mean and standard deviation of the f(x) prediction, Φ and φ are the cumulative distribution function and probability density function respectively, x_min is the vector for which f(x) is minimized, and c is a user-tunable parameter controlling exploration versus exploitation. High values of c correspond to highly exploratory strategies; it has been shown that c = 0.01 works well for a diverse range of problem sets [28]. The other acquisition functions available for single objectives are Probability of Improvement [30], Lower Confidence Bound [31], and greedy selection (a strategy in which the top prediction is always picked and uncertainty is not taken into account).
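As an illustrative sketch (not Rocketsled's internal implementation), EI for minimization can be computed with only the Python standard library; the candidate means, uncertainties, and incumbent value below are hypothetical:

```python
import math

def norm_pdf(t):
    # Standard normal probability density function.
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def norm_cdf(t):
    # Standard normal cumulative distribution function, via erf.
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def expected_improvement(mu, sigma, y_min, c=0.01):
    """EI at a point with predicted mean mu and uncertainty sigma, given the
    best (lowest) observed value y_min and exploration parameter c."""
    if sigma <= 0.0:
        return max(y_min - mu - c, 0.0)
    z = (y_min - mu - c) / sigma
    return (y_min - mu - c) * norm_cdf(z) + sigma * norm_pdf(z)

# Three hypothetical candidates as (mean, uncertainty); best observed y is 1.0.
candidates = {"a": (0.9, 0.05), "b": (1.1, 0.50), "c": (1.0, 0.01)}
y_min = 1.0
best = max(candidates, key=lambda k: expected_improvement(*candidates[k], y_min))
print(best)
```

Note that candidate "b" wins despite its poor mean prediction: its large uncertainty gives it the highest expected improvement, illustrating how EI balances exploration against exploitation.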
Rocketsled's optimization backend also supports multi-objective optimization. In multi-objective optimization, we seek the largest set of Pareto-optimal solutions, particularly those solutions with the smallest hypervolume (i.e. minimization in each objective). A Pareto-optimal solution is a point which is not dominated in all objectives by any other point; the Pareto frontier is the set of all Pareto-optimal solutions, a succinct formal definition of which is presented in Cui et al [32]. At the time of this writing, Rocketsled has two acquisition functions for selecting the next best guess when there are multiple competing objectives. The first is a greedy strategy that selects the guess with the highest probability of being Pareto-optimal. The second strategy is the Expected Maximin Improvement Scheme [33,34] which seeks to maximize the minimum EI gain among objectives. In order to generate predictions and uncertainties in a similar manner to single-objective mode, Rocketsled repeats the single-objective procedure across all objectives, and thus for prediction purposes, considers objectives independent of one another.
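The Pareto-optimality check underlying these strategies is straightforward to sketch. The following toy (minimization in both objectives, with made-up points) filters a point set down to its Pareto frontier:

```python
def dominates(p, q):
    """p dominates q (minimization) if p is at least as good in every
    objective and strictly better in at least one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def pareto_frontier(points):
    # The frontier is the set of points not dominated by any other point.
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

# Hypothetical two-objective data (lower is better in both).
pts = [(1, 9), (3, 3), (2, 7), (5, 4), (9, 1)]
front = pareto_frontier(pts)
print(sorted(front))
```

Here (5, 4) is excluded because (3, 3) is better in both objectives; the remaining four points are mutually non-dominated.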

Incorporating domain knowledge with z-descriptors
Optimization problems may benefit from the injection of domain knowledge. One way to represent domain knowledge is to allow the optimizer to incorporate features or descriptors, i.e. computationally inexpensive functions incorporating domain knowledge that is not directly found in the x vector. To state it another way, rather than fitting a surrogate model directly as a function of the x vector, one can introduce additional attributes to simplify the model. These functions are of the form z = g(x), where z is a vector representing the information encoded by x and g is computationally cheap to evaluate. While the x vector must be unique, z has no such requirement. For example, if one is optimizing molecules for stability, x may define the molecule uniquely by its atomic positions or SMILES [35] string, but z provides chemically-relevant data based on x, such as bond lengths, angles, and functional group orientations, which allow the optimizer to better predict the molecule's stability. Such use of feature extraction techniques is a very common practice for improving conventional machine learning results but is typically not supported by optimization packages, which operate directly on the x vector alone. We present specific examples making use of this technique in a later section and demonstrate that the use of domain-specific z features can have a large impact on overall results.
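A minimal sketch of such a descriptor function g is shown below. The elemental property table is hypothetical and deliberately tiny; a real workflow would typically draw such features from a descriptor library such as matminer:

```python
import statistics

# Hypothetical elemental electronegativity table (illustrative values only).
ELECTRONEGATIVITY = {"Sr": 0.95, "Ti": 1.54, "O": 3.44, "Ba": 0.89, "Zr": 1.33}

def z_descriptors(x):
    """g(x): expand a composition tuple x (the unique identifier) into cheap,
    chemically-motivated features z. Unlike x, z need not be unique."""
    chi = [ELECTRONEGATIVITY[el] for el in x]
    return (statistics.mean(chi), max(chi) - min(chi), max(chi))

x = ("Sr", "Ti", "O")       # the unique input vector
z = z_descriptors(x)        # extra features handed to the surrogate model
print(z)
```

The surrogate model is then trained on the concatenation of x and z rather than on x alone.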

Results
We showcase the use of Rocketsled in two major areas. First, we benchmark Rocketsled on a set of well-known test functions. Second, we demonstrate how Rocketsled can lead to more efficient searches for functional materials for photocatalytic water splitting and for superhard materials. The objective of these case studies is to demonstrate how the built-in optimizers and capabilities for managing high-throughput workflows with Rocketsled may be useful. We also provide results regarding the computational expense of the optimizer itself.

Optimization of common test functions
The primary concern of a black-box optimization algorithm is its ability to accurately forecast objective function values with limited information. To estimate its usefulness in real-world optimization problems, Rocketsled's optimization engine was evaluated on three common single-objective test functions [36][37][38] with known global minima. Two optimizers based on machine learning models built into Rocketsled's framework, a Gaussian process (GP) and a random forest (RF), were independently tested and compared to a similar, popular open-source package, Scikit-Optimize [14] (skopt), under identical constraints. For each benchmark function, the algorithm was allowed fifty function evaluations to predict the function minimum. The absolute errors in the optimizers' predictions are plotted in figure 5.
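For reference, one of these benchmarks, the Branin-Hoo function, can be written directly:

```python
import math

def branin(x1, x2):
    """Branin-Hoo test function, usually evaluated on x1 in [-5, 10] and
    x2 in [0, 15]; its global minimum value is approximately 0.397887,
    attained at (-pi, 12.275), (pi, 2.275), and (9.42478, 2.475)."""
    a, b, c = 1.0, 5.1 / (4.0 * math.pi ** 2), 5.0 / math.pi
    r, s, t = 6.0, 10.0, 1.0 / (8.0 * math.pi)
    return a * (x2 - b * x1 ** 2 + c * x1 - r) ** 2 + s * (1 - t) * math.cos(x1) + s

print(round(branin(math.pi, 2.275), 6))
```

Its three global minima and shallow basins make it a standard sanity check for surrogate-based optimizers.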
The objective of these tests was to affirm suitable performance of Rocketsled's built-in optimization engine using default (untuned) hyperparameters across several objective functions. We emphasize that Rocketsled supports any custom optimizer, and its main purpose is not to achieve the best optimization performance on test functions but rather to better support adaptive design in HTC workloads. By the 50th function evaluation, both Rocketsled algorithms and the skopt GP perform better than random search on average on the three benchmark functions. The GP-based optimization methods typically perform nearly an order of magnitude better than the RF optimization on these tests.
Use case: discovery of water-splitting perovskite photocatalysts

We first examine the application of Rocketsled to the discovery of one-photon water-splitting perovskites, a photocatalytic technology useful for hydrogen production. A set of 18 928 perovskite compositions were previously calculated by Castelli et al [64] in a comprehensive materials screening procedure, where the chemical search space was exhaustively explored with density functional theory (DFT) calculations. Among the candidates, only 20 were identified as candidates for further study, 13 of which were previously unknown to the research community. In the following use case, we compare the performance of the RF optimizer in finding all 20 photocatalytic candidates to random search, a set of chemical rules, and a tuned genetic algorithm (GA) previously reported by Jain et al [65] that used a priori knowledge of results to determine an upper bound on GA performance. This GA's hyperparameters, such as elitism, selection function, population size, and crossover and fitness functions, were tuned by searching 2952 parameter combinations; the 'best GA' was optimized via a posteriori analysis (such as one- and two-parameter ANOVA), where the hyperparameters were changed based on the results of previous GA runs. Such hyperparameter tuning is not possible in true adaptive design, since optimal hyperparameters cannot be determined without prior knowledge of the objective function response.
Every compound in the search space (N = 18 928) is encoded by its chemical formula ABX, where A and B are elements and X is a ternary anion. The 3-tuple [A B X] is the x vector, encoded using the Mendeleev numbers of elements A and B and the average Mendeleev number of X. The fitness function response (taken from Jain et al [65]) is the objective function, ranging from 0 (worst possible performance) to 30 (one of the 20 candidates) and assessing band gap, stability, and band edge position. The results are precomputed, so evaluating the objective function merely requires looking up an entry's data; in a live optimization run, evaluating the objective function would consist of running a DFT workflow. Calculation and fitness function details can be found in the supplementary information.
We test seven optimization strategies to search the perovskite space. The first strategy is random search. The second strategy is to use a set of 'chemical rules' regarding charge balance, the even-odd valence rule, and the Goldschmidt tolerance factor [66] in ranking candidates. These are designed as a crude way to mimic selections from a human expert in the field [65]. The third strategy is a RF-based BO with Rocketsled, using only the 3-tuple x vector for learning and prediction. In a fourth strategy, we again use Rocketsled but add to each x vector a z vector; these are an additional 132 features that represent composition-based chemical information such as statistics on electronegativity, oxidation state, and common ionic radii [67, 68]. The set of features used for z were previously published as the MagPie descriptor set [68] independently of this problem; we can therefore consider them generally representative of the chemical information of each composition yet 'blind' to the specifics of the problem. The fifth strategy is the same as the fourth (Rocketsled Random Forest + z), but we further exclude from the search the 11 587 compounds failing two simple chemical stability rules. The sixth and seventh strategies are the GA previously reported by Jain et al [65], with either no chemical rules applied (strategy 6) or with chemical rules used to exclude the 11 587 compounds failing stability rules (strategy 7). We note that both GA strategies from the prior study had their hyperparameters optimized on the problem itself; thus they should be considered upper bounds on performance, as those hyperparameters would likely not have been known in advance. In summary, the seven strategies are (1) random search, (2) chemical rules, (3) Rocketsled RF with x only (RS RF), (4) RS RF + z features, (5) RS RF + z features + chemical rules, (6) GA (upper bound), and (7) GA + chemical rules (upper bound).
We evaluate the performance of an optimization strategy by its 'speedup' S_n: the ratio of the number of function evaluations random search requires to find n of the 20 candidates to the number the strategy requires. For all Rocketsled-guided search strategies, adding additional information via the cheap z descriptors increased optimization performance. Adding z features alone gave a 76.4% increase in S_20 and a 162.0% increase in S_10. When chemical rules were further applied to restrict the search space, the resulting models were by far the top-performing models of those tested, with S_20 reaching 15.99 and S_10 reaching 28.26. A more detailed treatment of z features, and a separate study with a manually selected set of statistics for the z features, is presented in the supplementary information. Figure 6 also plots the values previously reported by Jain et al for GAs [65] on the same problem. We note that the GA value reported is the best among more than 2500 hyperparameter combinations tested. This hyperparameter search, which separately studied the effect of various crossover rates, mutation rates, and population sizes on GA performance, required prior knowledge of all results so that the best combination could be selected. When these parameters were not tuned using knowledge of the results, the reported speedups in the prior study were typically about half that of the best GA [65], significantly lower than the untuned Rocketsled results. The Rocketsled results thus compare favorably to or exceed even the fully tuned GA results without any prior knowledge of the problem, indicating the robustness of the approach.
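The speedup bookkeeping can be sketched as follows, taking S_n as the ratio of the number of evaluations random search needs to find n candidates to the number the strategy needs; the run traces below are hypothetical:

```python
def evals_to_find(found_flags, n):
    """Number of evaluations needed before n candidates appear in a run
    trace, where found_flags[i] is True if evaluation i hit a candidate."""
    count = 0
    for i, hit in enumerate(found_flags, start=1):
        count += hit
        if count == n:
            return i
    raise ValueError("fewer than n candidates found in this trace")

def speedup(random_trace, strategy_trace, n):
    # S_n = evaluations random search needs / evaluations the strategy needs.
    return evals_to_find(random_trace, n) / evals_to_find(strategy_trace, n)

# Hypothetical traces: the guided strategy front-loads its hits.
random_trace = [False] * 99 + [True] + [False] * 99 + [True]
strategy_trace = [False, True, False, True] + [False] * 196
print(speedup(random_trace, strategy_trace, 2))
```

In practice, both traces would be averaged over many independent runs before taking the ratio.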
In figure 7, we plot the performance of the RF optimizers in terms of candidates found as a function of the number of function evaluations. Steeper lines indicate higher speedup. The RF optimizer, when provided neither z nor chemical heuristics, generally performs worse than the chemical-rules-only search, but its standard deviation overlaps the S_20 of the chemical-rules-only strategy. The localized drop in performance around 11 candidates can be explained by the clustered distribution of candidates in Mendeleev space (see supplementary information), given that the only information this strategy has is the 3-vector of Mendeleev numbers representing A, B, and X. All chemically-restricted or z-directed optimizations with Rocketsled exceed the performance of the chemical heuristics alone in S_20. For these strategies, the variance observed in S_20 over the 20 independent runs is sufficiently small that the mean is not within one standard deviation of the chemical-rules-only strategy, indicating a significant advantage.
All strategies vastly outperform random search. For example, with only the encoded Mendeleev numbers for perovskite representation, Rocketsled optimization finds all candidates with less than one-fifth of the DFT calculations that random search requires. Using z information or chemical information independently with the optimization, we can reduce this number to less than 10%. If the optimizer can use both z information and chemical heuristics, we use only 6.25% of the calculations, saving 16 901 DFT calculations on average. If we are interested in simply finding half the candidates while incorporating chemical knowledge and z information, we require only 3.4% of the calculations random search requires. Thus, very significant reductions in the computational cost of materials searches are possible using the Rocketsled framework.
Use case: discovery of superhard materials

We next examine the application of Rocketsled to a multi-objective problem: searching for eight known superhard materials among a set of 7394 heterogeneous, k-nary structures from the Materials Project. Discovering new superhard materials is important for reducing the cost of industrial cutting and polishing tools. The hardness of a polycrystalline inorganic solid is typically estimated computationally by first determining its full elastic tensor with DFT, a relatively expensive ab initio procedure, and then calculating the Voigt-Reuss-Hill [69] averages of the bulk modulus (K) and shear modulus (G). Materials are canonically classified as superhard if their Vickers indentation hardness, a metric related to having both high K and high G, exceeds 40 GPa [70]. In the same manner as an investigation by Tehrani et al [71], we will consider a material 'superhard' if it is both incompressible (high K) and resistant to shear (high G).
Our dataset, which includes data from a previous high-throughput elastic tensor study [20], was downloaded from the Materials Project. Structures having negative K or G and compositions containing noble gases were removed, as were 2D materials. The resulting 7394 unique structures are composed of 56% ternary, 39% binary, 2.5% unary, 2.1% quaternary, 0.2% quinary, and ~0.02% senary stoichiometries. Among these structures, 167 spacegroups and 6238 unique compositions are represented. Spacegroup 225 (Fm-3m) is the mode spacegroup of the set, comprising 15% of all the structures; 22 of the spacegroups appear in the dataset with a frequency higher than 1%.
The objective is to identify the crystal structures with both high K and high G without the optimization algorithm having problem-specific or a priori knowledge, and using as few simulated elastic computations as possible. The search can be considered a multi-objective optimization: for a structure to be considered a candidate, it must have both K > 350 GPa and G > 250 GPa. To reduce the number of polymorphs in the final set of superhard candidates, we only consider the top-performing (Pareto-optimal with respect to K and G) structures for each composition as solutions to the optimization. For example, while there are seven carbon polymorphs matching the K and G criteria, only two are Pareto-optimal. The final list of solutions, shown in table 1, is composed of eight superhard candidates.

Previous studies by de Jong et al [79] and Tehrani et al [71] applied statistical learning techniques to smaller (yet similarly diverse) elastic datasets, using sets of compositional and structural descriptors. In a similar fashion, we generate basic statistics from the composition and structure of each material, which we use as the optimization algorithm's z vector. The 132 composition features were generated using the matminer implementation [67] of Ward et al's MagPie descriptor set [68], which includes means, average deviations, modes, and ranges of elemental properties such as melting temperature, atomic weight, and covalent radii. The remaining 52 structural descriptors were the space group number, the structure's centrosymmetry, and statistics (taken over sites) based on local order parameters [80] implemented in matminer [67]. Rocketsled's multi-objective mode was used with the RF optimizer and the Maximin EI acquisition function. No encoding of the materials was used besides the z vector.
In figure 8, we plot the progress of the Rocketsled optimizer in finding all candidates in fewer computations than random search for twenty independent runs. On average, we can find all the materials in table 1 using 261 elastic calculations, versus 6573 calculations with random search. We find a peak speedup of 61.1 ± 33.1 for 5 candidates and a lowest speedup of 28.3 ± 9.2 for all 8 candidates. The higher speedups for n < 6 may be explained by the presence of carbon in five of the eight materials; the search algorithm can preferentially look for carbon-containing compounds once it has identified them. More dissimilar materials, such as ReN2, are found with lower speedup if not predicted as high-scoring candidates early in the optimization run. Figure 9 provides a visual snapshot of which candidates in the search space are likely to have been explored with both the random and Rocketsled strategies after only 200 calculations. This plot demonstrates that Rocketsled explores the eight known solutions with very high probability even at this early stage.

Performance and computational overhead
A major consideration when incorporating a black-box optimization engine is the computational overhead required for training and updating the surrogate model. Here, we show the overhead of the Rocketsled and FireWorks framework as a whole for local execution on a 2017 MacBook Pro with a 3.5 GHz Intel i7 processor. We use the EI acquisition function, which requires uncertainty estimation. The overhead times of the Rocketsled GP, RF, and random predictors are shown independently. Because selecting a random guess takes a small, constant time, the random predictor times essentially represent the non-machine-learning portions of the optimization procedure, including communication with the database, manipulation of the training and prediction matrices in memory, and FireWorks operations. At each function evaluation, the models are retrained using all previous calculations available for training, and 1000 points in the search space are predicted.
For the purpose of this overhead benchmark, a mock objective function was used which accepts any vector and returns a random scalar between 0 and 1. As plotted in figure 10(a), the RF model has a much higher training time for small numbers of function evaluations. This is because acquisition functions such as EI require estimates of prediction uncertainty as input. With a Gaussian process model, uncertainty estimates can be taken directly from the trained model; with a RF model, however, bootstrapped training and prediction is used to estimate the uncertainty of predictions made on unexplored points. Bootstrapping the training/prediction cycle in this manner introduces additional overhead, and training times grow with the number of training points n, scaling roughly as O(n log n) (RF) [81] and O(n^3) (GP) [82] when the problem dimension is held constant.
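Bootstrapped uncertainty estimation can be sketched as follows; the 'model' here is a trivial mean predictor standing in for the RF, not the Rocketsled implementation:

```python
import random
import statistics

def bootstrap_uncertainty(X, Y, predict, n_boot=50, seed=0):
    """Estimate prediction uncertainty by retraining a model on resampled
    training data n_boot times and taking the spread of its predictions."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_boot):
        # Resample the training set with replacement.
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        Xb, Yb = [X[i] for i in idx], [Y[i] for i in idx]
        preds.append(predict(Xb, Yb))
    return statistics.mean(preds), statistics.stdev(preds)

# Toy "model": ignores the inputs and predicts the mean training response.
train_mean = lambda Xb, Yb: statistics.mean(Yb)

X = list(range(10))
Y = [0.1 * x for x in X]
mu, sigma = bootstrap_uncertainty(X, Y, train_mean)
print(mu, sigma)
```

The repeated retraining is what makes the RF path comparatively expensive: each bootstrap iteration pays the full training cost again.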
Black-box optimization problems may be high-dimensional, so in figure 10(c) we illustrate the computational overhead of each predictive algorithm as a function of problem dimension, up to 1000 dimensions, for various set numbers of training points (up to 10 000). The random model (representing non-ML Rocketsled operations) typically requires near or less than 1 s per optimization unless N > 10 000 and the model is more than 100-dimensional, but the optimizers implementing machine learning methods are less robust to increasing problem dimension. For the bootstrapped RF, prediction tends to become computationally intensive (>10 s/optimization) for N > 1000 with D > 50. The GP optimizer is more robust to increases in problem dimension, but is still computationally intensive for large N and D > 100.
Again, the overhead is dominated by the complexity of the underlying ML models. If the evaluation cost of the objective function is still significantly greater than these per-optimization times, the optimizers can reasonably be used with larger numbers of training points and more search dimensions. The worst cases generally represent 10-100 s of overhead per calculation, which is comparable to modern packages such as COMBO [15] and skopt [14] under the same conditions. Thus, the more expensive a full function evaluation is, the larger the problem size that can be efficiently addressed with the Rocketsled framework. We note that the hyperparameters of the built-in optimizers (e.g. number of training points, number of testing points, number of bootstraps) can be changed to increase the computational efficiency of the optimization if required.
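Before committing to a production run, one rough way to check whether these overheads matter for a given problem is to time a single train-and-predict cycle at the relevant N and D. A minimal sketch (the helper function is our own, mirroring the benchmark's setup of retraining on all available points and predicting a batch of candidate points):

```python
import time

import numpy as np
from sklearn.ensemble import RandomForestRegressor


def time_one_optimization(n_train, dim, n_pred=1000, seed=0):
    """Roughly time one surrogate train+predict cycle on random data:
    fit an RF on n_train points of dimension dim, then score n_pred
    candidate points, as in the overhead benchmark."""
    rng = np.random.default_rng(seed)
    X = rng.random((n_train, dim))
    y = rng.random(n_train)  # mock objective: random scalar in [0, 1)
    X_pred = rng.random((n_pred, dim))

    t0 = time.perf_counter()
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X, y)
    model.predict(X_pred)
    return time.perf_counter() - t0
```

If the measured time is small compared to one full evaluation of the real objective, the surrogate overhead is unlikely to be the bottleneck.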

Discussion, practical considerations, and limitations
While there are specialized black-box optimization packages already available, to our knowledge none adequately handle optimization for HTC. Most existing software is intended to execute the black-box function on the user's local computer, which limits its ability to manage complex production workflows on high-performance systems. In these packages, the execution of the workflows is strictly sequential, meaning that the expensive black-box function and the optimization are executed one after another, limiting parallelism. Additionally, existing packages are often not robust to heterogeneous, discontinuous, or otherwise non-continuous search domains. Furthermore, few existing packages offer multi-objective capabilities or the ability to include domain knowledge through the addition of features or descriptors.
In contrast, Rocketsled's built-in optimizers are designed to balance performance and ease-of-use across a wide range of optimization problems. Integration with the FireWorks workflow software gives the user maximum control over the execution of the expensive black-box function and the optimization. Rocketsled is not limited to sequential function evaluation and optimization; it allows for black-box function evaluations in parallel across arbitrary computing resources as well as optimization in parallel, if desired (a detailed discussion of parallelism features is presented in the supplementary information). Built-in optimizers are robust to many kinds of search spaces, including discrete, combinatorially-generated search domains (sets of allowed values in each dimension) and discrete, non-combinatorially-generated search domains (a set of allowed points). Built-in optimizers also work on spaces defined with a combination of bounded and discrete dimensions (including categorical variables), and all optimizers can function in single- or multi-objective mode. To our knowledge, there are no existing software packages robust to the variety of optimization requirements that Rocketsled supports.
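The kinds of search spaces listed above can be made concrete in plain Python. This is an illustration of the distinction between combinatorially- and non-combinatorially-generated discrete domains, not Rocketsled's actual search-space API; the variable names are ours.

```python
from itertools import product

# Combinatorially-generated discrete space: the Cartesian product of the
# allowed values in each dimension (3 x 3 x 2 = 18 candidate points),
# mixing numerical and categorical variables.
x_vals = [1.0, 2.0, 3.0]
y_vals = [10, 20, 30]
cat_vals = ["rutile", "anatase"]  # categorical dimension
combinatorial_space = list(product(x_vals, y_vals, cat_vals))

# Non-combinatorially-generated discrete space: an explicit set of allowed
# points, useful when only certain combinations are physically meaningful.
explicit_space = [
    (1.0, 10, "rutile"),
    (2.0, 30, "anatase"),
    (3.0, 20, "rutile"),
]
```

The first form is compact when every combination is valid; the second avoids enumerating combinations that are not.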
Applying adaptive design to high-throughput computing problems is growing in popularity, particularly for the exploration of novel inorganic materials. A previous study identified stable structures for stage-I and stage-II lithium-graphite intercalation compounds using 4%-6% of the calculations required to explore the entire search space, a combinatorial problem containing more than 16.7 million total possibilities [51]. Others have found that adaptive design can accelerate searches, particularly those where exhaustive computation would be problematic, for a variety of applications including novel crystalline interfaces [56], elastic properties [52], high-pressure Mg-silicate phases [47], ultra-low thermal conductivity structures [53], inorganic/organic molecular interfaces [54], layered materials [50], stable carbides/nitrides [57], piezoelectrics [34], Poisson-Schrödinger simulations of LEDs [48], and stable crystal structures for Y2Co17 [55]. In this study, we demonstrated two additional proofs-of-concept, searching for superhard materials and photocatalysts. These studies strongly suggest that future use of the Rocketsled framework can lead to more efficient and ultimately more successful searches of materials space.
Rocketsled is most useful when the following criteria are met: (i) full function evaluations are relatively expensive (e.g. >100 s per evaluation), (ii) the number of workflows to be run is on the scale of 100 to 1 000 000, (iii) workflows and their results are deterministically determined by their input parameters (non-stochastic), and (iv) one wants to take advantage of the workflow features built into the FireWorks platform, such as distributed execution and comprehensive job management. Further details on the limitations and usage of Rocketsled are presented in the supplementary information.

Conclusion
Rocketsled represents a robust, open-source framework capable of running a diverse range of black-box optimization problems on HTC resources. It contains a combination of features not found in other similar frameworks, notably: (i) distributed parallel execution on supercomputing platforms, (ii) support for multiple optimizers and acquisition functions, (iii) the ability to handle heterogeneous search spaces, (iv) the ability to incorporate domain knowledge through descriptors, and (v) multi-objective optimization routines. We presented two applications in computational materials science that demonstrate automatic optimization capabilities and show very high potential speedups compared to random searches, chemical-intuition-based searches, or other methods such as GAs. We also provided benchmarks regarding the computational cost of optimization. We believe the growth of data-driven scientific computing will enable many future use cases for Rocketsled, particularly in materials science, where the underlying FireWorks library is well-established.