Abstract
Computational materials design has suffered from a lack of algorithms formulated in terms of experimentally accessible variables. Here we formulate the problem of (ternary) alloy optimization at the level that is natural for synthesists: the choice of atoms and their composition. Mathematically, this is a mixed-integer problem in which a candidate solution consists of a choice of three elements and how much of each of them to use. This space has the natural structure of a set of equilateral triangles. We solve this problem by introducing a novel version of the DIRECT algorithm that (1) operates on equilateral triangles instead of rectangles and (2) works across multiple triangles. We demonstrate on a test case that the algorithm is both robust and efficient. Finally, we offer an explanation of the efficacy of DIRECT—specifically, its balance of global and local search—by showing that the “potentially optimal rectangles” of the original algorithm are akin to the Pareto front of the “multi-component optimization” of global and local search.
Notes
Modulo a technical distinction: the extreme points of the convex hull are actually a subset of the Pareto front.
References
Barnsley, M.: Fractals Everywhere. Academic, New York (1993)
Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, New York (2004)
Finkel, D.E., Kelley, C.T.: Convergence analysis of the DIRECT algorithm. Technical report, Center for Research in Scientific Computation, North Carolina State University (2004)
Franceschetti, A., Zunger, A.: The inverse band structure problem: find the atomic configuration with given electronic properties. Nature 402, 60 (1999)
Graf, P.A., Jones, W.B., Kim, K., Wang, L.-W.: Surface passivation optimization using DIRECT. J. Comput. Phys. 224(2), 824 (2007)
Kim, K., Graf, P.A., Jones, W.B.: A genetic algorithm based inverse band structure method for semiconductor alloys. J. Comput. Phys. 208(2), 735–760 (2005)
Jones, D.R., Perttunen, C.D., Stuckman, B.E.: Lipschitzian optimization without the Lipschitz constant. J. Optim. Theory Appl. 79(1), 157–181 (1993)
Powell, M.J.D.: The BOBYQA algorithm for bound constrained optimization without derivatives. Technical report, Department of Applied Mathematics and Theoretical Physics, University of Cambridge (2009)
Rios, L.M., Sahinidis, N.V.: Derivative-free optimization: a review of algorithms and comparison of software implementations. J. Global Optim. 56(3), 1247–1293 (2013)
Shubert, B.: A sequential method seeking the global maximum of a function. SIAM J. Numer. Anal. 9, 379–388 (1972)
Zunger, A., Kazmerski, L., Tumas, W., et al.: The Center for Inverse Design, 2010–2014. http://www.centerforinversedesign.org/
Acknowledgements
PG has benefited from many conversations with Wes Jones, Kwiseon Kim, Alex Zunger, and Mayeul d'Avezac. This work is supported by the US Department of Energy, Office of Science, Basic Energy Sciences, under Contract No. DE-AC36-08GO28308 to NREL as a part of the DOE Energy Frontier Research Center “Center for Inverse Design”.
Appendix: Local convergence rate of MDTri for strongly convex quadratic objective functions
Here we derive and illustrate an upper bound on the number of evaluations necessary for MDTri to converge to an arbitrary tolerance \(\epsilon \) of the minimal value \(f(x^*)\) of a strongly convex quadratic objective function on one phase (one choice of 3 atoms). Such a function has the property that there are constants m and M s.t.

$$ l_x(y) + \frac{m}{2}\Vert y-x\Vert _2^2 \;\le \; f(y) \;\le \; l_x(y) + \frac{M}{2}\Vert y-x\Vert _2^2, \qquad (2) $$

where \(l_x(y) = f(x) + \nabla f(x)^T(y-x)\) is the linearization of f at x. (This is a consequence of the definition of strong convexity rather than the definition itself; see [2], section 9.1.2, for details.) For this section, assume our objective function f is strongly convex with minimizer \(x^*\). Also, assume x has dimension 3; we are examining the local search on one face.
First, note that at each iteration we may not divide the triangle containing the minimizer \(x^*\). To bound how many iterations this triangle might have to “wait” before being divided, consider Fig. 11. We argue that this setup represents the worst case the algorithm has to deal with (it amplifies the anisotropy the most). Assume that f is a quadratic function whose Hessian has smallest and largest eigenvalues m and M, respectively. The figure depicts the minimizer \(x^*\) and the level set \(\{x' \mid f(x') = f(x)\}\) for x at the center of the bold triangle. Assume \(x^*\) is in fact just inside this triangle, but as far as possible toward the corner, i.e., as far as possible from x, and assume the line segment \(\overline{xx^*}\) lies along the minor axis of the ellipse. Now, every triangle (of the same size as the one in question containing the minimizer) whose center is inside the ellipse is divided before the one containing the optimum, since the function values at those centers are by construction less than f(x). So we need to estimate how many there are. Note that such triangles come (in this case) in sets of 4, and we can get an upper bound by measuring the horizontal distance between centers, which is \(\sqrt{3}\,|x-x^*|\). Now, the aspect ratio of the ellipse is given by the condition number \(\kappa = \frac{M}{m}\) of the Hessian, so the major axis, the distance from \(x^*\) to the right edge of the ellipse, is \(\kappa |x-x^*|\), and we just need to know how many \(\sqrt{3}\,|x-x^*|\)-wide triangles fit in this distance. Taking into account that the first triangle center is \(\frac{\sqrt{3}}{2} |x-x^*|\) away from \(x^*\) (horizontally) and the curvature at the narrowing point of the ellipse, we claim that an upper bound for the number of triangle centers that could have lower function value than f(x) is

$$ 4 \left\lceil \frac{\kappa - 1}{\sqrt{3}} \right\rceil , $$

so the upper bound \(n_{w}\) on the number of iterations after its creation (i.e., after it first enters the candidate set) before the optimum-containing triangle is divided is

$$ n_w = 4 \left\lceil \frac{\kappa - 1}{\sqrt{3}} \right\rceil + 1, $$
where we have added 1 so that \(n_w = 1\) in the case that it is immediately divided. Note that for \(M=m\), i.e., \(\kappa = 1\), this quantity is 1; for perfectly isotropic functions we are guaranteed to always subdivide the “correct” triangle. Also, this estimate is independent of x and \(x^*\), i.e., it is universal for all scales of the search.
Next we estimate the number \(n_c\) of “correct” subdivisions, i.e., subdivisions of the triangle containing the optimizer, needed to ensure a given bound \(\epsilon \) on the distance between the probed points and the minimizer \(x^*\). Recall that each face of the search space is an equilateral triangle with sides of length \(\sqrt{2}\), so the center of each face is \(\frac{\sqrt{6}}{3}\) away from the corners, so x starts at least this close to \(x^*\). The “center to corner” distance decreases by a factor of 2 at each correct subdivision. Hence we can say \(|x-x^*| < R (\frac{1}{2})^{n_c}\), where \(R = \sqrt{6} / 3\).
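As a quick numeric check (a sketch, not part of the paper), the center-to-corner distance R of a face and its halving under correct subdivisions can be computed directly; the name `distance_bound` is introduced here for illustration:

```python
import math

# Center-to-corner (circumradius) distance of an equilateral triangle
# with side length s is s / sqrt(3); here s = sqrt(2), so R = sqrt(6)/3.
s = math.sqrt(2)
R = s / math.sqrt(3)  # ~0.8165

# Each "correct" subdivision halves the triangle's linear scale, so after
# n_c such subdivisions the probed center is within R * (1/2)**n_c of x*.
def distance_bound(n_c: int, R: float = R) -> float:
    return R * 0.5 ** n_c
```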
Now note that \(\nabla f(x^*) = 0\) at the minimizer \(x^*\), so Eq. (2) implies

$$ f(x) - f(x^*) \le \frac{M}{2} \Vert x - x^*\Vert _2^2. $$

So

$$ |f(x) - f(x^*)| \le \frac{M}{2} R^2 \left( \frac{1}{2} \right) ^{2 n_c}. $$

Choosing

$$ n_c = \left\lceil \frac{1}{2} \log _2 \left( \frac{M R^2}{2 \epsilon } \right) \right\rceil $$

then implies that \(|f(x)-f(x^*)| \le \epsilon \).
Now, the upper bound on the total number of iterations, taking into account as above that some might not divide the triangle containing the optimum, is \(n=n_w n_c\). Finally, to connect iterations to evaluations e, note that iteration n of MDTri involves at most 4n function evaluations. This is true because MDTri adds exactly one new “scale” at each iteration, we might divide the minimal-objective triangle at each scale, and each triangle subdivision results in four new triangles to probe (counting re-evaluation of the center point, which is what the current implementation does). Then the number of evaluations e is no more than \(4 + 8 + 12 + \cdots + 4n = 4 (1+2+\cdots +n) = 2 n (n+1).\)
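The evaluation-count arithmetic can be checked directly; this short sketch (illustrative names, not from the paper) sums the per-iteration caps and compares against the closed form:

```python
# Closed-form upper bound on total evaluations after n iterations:
# 4 * (1 + 2 + ... + n) = 2 * n * (n + 1).
def eval_bound(n: int) -> int:
    return 2 * n * (n + 1)

# Direct sum: iteration i contributes at most 4*i evaluations
# (one subdivision at each of its i scales, 4 probes per subdivision).
def eval_sum(n: int) -> int:
    return sum(4 * i for i in range(1, n + 1))
```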
To summarize, the relations

$$ n_w = 4 \left\lceil \frac{\kappa -1}{\sqrt{3}} \right\rceil + 1, \qquad n_c = \left\lceil \frac{1}{2} \log _2 \left( \frac{M R^2}{2 \epsilon } \right) \right\rceil , \qquad n = n_w n_c, \qquad e \le 2 n (n+1) \qquad (3) $$

together provide an upper bound for the number of evaluations required to reach a desired tolerance for MDTri’s search over one phase (triangle) for strongly convex quadratic functions.
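These relations can be combined into a single sketch; the function name `mdtri_eval_bound` is hypothetical, and the code simply chains the wait bound, subdivision count, and evaluation sum as derived above:

```python
import math

def mdtri_eval_bound(eps: float, M: float, m: float) -> int:
    """Upper bound on MDTri evaluations over one face for a strongly convex
    quadratic with extreme Hessian eigenvalues m <= M, to tolerance eps.
    Chains the relations n_w, n_c, n = n_w * n_c, e <= 2n(n+1)."""
    kappa = M / m                    # condition number of the Hessian
    R = math.sqrt(6) / 3             # center-to-corner distance of the face
    n_w = 4 * math.ceil((kappa - 1) / math.sqrt(3)) + 1       # "wait" bound
    n_c = math.ceil(0.5 * math.log2(M * R * R / (2 * eps)))   # correct subdivisions
    n = n_w * n_c                    # total iterations
    return 2 * n * (n + 1)           # evaluations: 4(1 + 2 + ... + n)
```

In the isotropic case \(\kappa = 1\) the wait factor \(n_w\) is 1, so the bound grows only with the number of correct subdivisions \(n_c\); anisotropy enters multiplicatively through \(n_w\).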
We have run a test case to illustrate and verify this estimate. For it, we defined a quadratic objective function with curvature M and minimum \(x^*\) placed at random points in the triangle. Note that this is the \(\kappa =1\) case, so we are not yet testing the effect of anisotropy. We ran MDTri a number of times (testing different values of \(x^*\), to get at the maximum number of evaluations we might require in practice) for \(M=1, 10, 100, 1000, 10{,}000, 100{,}000\), to a tolerance of \(\epsilon = 10^{-8}\). Figure 12 shows the empirical results together with the upper bound from Eq. (3). We note that for this particular test, the gap between the bound and the actual number of iterations appears to come from iterations in which we do not divide a triangle at every scale, which happens when some of the (scale, value) pairs on the Pareto front (see Sect. 4) are not on the convex hull of the {(scale, value)} set. For the anisotropic case, we used a quadratic test function with curvature \(M=1, 10, 100, 1000, 10{,}000, 100{,}000\) in one direction and \(m=1\) in the other. (One such function is that in Fig. 11.) Further detailed exposition of the convergence of MDTri, especially its global robustness, is beyond the scope of this paper.
Graf, P.A., Billups, S. MDTri: robust and efficient global mixed integer search of spaces of multiple ternary alloys. Comput Optim Appl 68, 671–687 (2017). https://doi.org/10.1007/s10589-017-9922-9