
MDTri: robust and efficient global mixed integer search of spaces of multiple ternary alloys

A DIRECT-inspired optimization algorithm for experimentally accessible computational material design

Published in Computational Optimization and Applications.

Abstract

Computational materials design has suffered from a lack of algorithms formulated in terms of experimentally accessible variables. Here we formulate the problem of (ternary) alloy optimization at the level of choice of atoms and their composition that is normal for synthesists. Mathematically, this is a mixed integer problem where a candidate solution consists of a choice of three elements, and how much of each of them to use. This space has the natural structure of a set of equilateral triangles. We solve this problem by introducing a novel version of the DIRECT algorithm that (1) operates on equilateral triangles instead of rectangles and (2) works across multiple triangles. We demonstrate on a test case that the algorithm is both robust and efficient. Finally, we offer an explanation of the efficacy of DIRECT—specifically, its balance of global and local search—by showing that “potentially optimal rectangles” of the original algorithm are akin to the Pareto front of the “multi-component optimization” of global and local search.


Notes

  1. Modulo a technical distinction: the extreme points of the convex hull are actually a subset of the Pareto front.

References

  1. Barnsley, M.: Fractals Everywhere. Academic, New York (1993)


  2. Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, New York (2004)


  3. Finkel, D.E., Kelley, C.T.: Convergence analysis of the DIRECT algorithm. Technical report, Center for Research in Scientific Computation, North Carolina State University (2004)

  4. Franceschetti, A., Zunger, A.: The inverse band structure problem: find the atomic configuration with given electronic properties. Nature 402, 60 (1999)


  5. Graf, P.A., Jones, W.B., Kim, K., Wang, L.-W.: Surface passivation optimization using DIRECT. J. Comput. Phys. 224(2), 824 (2007)


  6. Kim, K., Graf, P.A., Jones, W.B.: A genetic algorithm based inverse band structure method for semiconductor alloys. J. Comput. Phys. 208(2), 735–760 (2005)


  7. Jones, D.R., Perttunen, C.D., Stuckman, B.E.: Lipschitzian optimization without the Lipschitz constant. J. Optim. Theory Appl. 79(1), 157–181 (1993)


  8. Powell, M.J.D.: The BOBYQA algorithm for bound constrained optimization without derivatives. Technical report, Department of Applied Mathematics and Theoretical Physics, University of Cambridge (2009)

  9. Rios, L.M., Sahinidis, N.V.: Derivative-free optimization: a review of algorithms and comparison of software implementations. J. Global Optim. 56(3), 1247–1293 (2013)


  10. Shubert, B.: A sequential method seeking the global maximum of a function. SIAM J. Numer. Anal. 9, 379–388 (1972)


  11. Zunger, A., Kazmerski, L., Tumas, W. et al.: The center for inverse design, 2010–2014. http://www.centerforinversedesign.org/

Download references

Acknowledgements

PG has benefited from many conversations with Wes Jones, Kwiseon Kim, Alex Zunger, and Mayeul d'Avezac. This work is supported by the US Department of Energy, Office of Science, Basic Energy Sciences, under Contract No. DE-AC36-08GO28308 to NREL as a part of the DOE Energy Frontier Research Center “Center for Inverse Design”.

Author information

Correspondence to Peter A. Graf.

Appendix: Local convergence rate of MDTri for strongly convex quadratic objective functions


Here we will derive and illustrate an upper bound on the number of evaluations necessary for MDTri to converge to an arbitrary tolerance \(\epsilon \) of the minimal value \(f(x^*)\) of a strongly convex quadratic objective function on one phase (one choice of 3 atoms). Such a function has the property that there are constants m and M s.t.

$$\begin{aligned} \frac{m}{2}|y-x|^2 \le f(y) - l_x(y) \le \frac{M}{2}|y-x|^2, \end{aligned}$$
(2)

where \(l_x(y) = f(x) + \nabla f(x)^T(y-x)\) is the linearization of f at x. (This is a consequence of the definition of strong convexity rather than the definition itself; see [2], section 9.1.2, for details). For this section, assume our objective function f is strongly convex with minimizer at \(x^*\). Also, assume x has dimension 3; we are examining the local search on one face.

Fig. 11

A depiction (left: schematic; right: progress of an actual MDTri search) of the type of case where MDTri will not always divide the triangle containing the optimum. The point \(x^*\) (orange) is just inside the triangle whose center is at x (the red triangle). All points inside the ellipse (the level set \(\{x' | f(x') = f(x)\}\)) have lower function values than f(x), so any triangle on the candidate list that is the same size as the one containing x and has its center inside the ellipse will be divided before the one containing the optimum; this is the source of the “wasted” iterations we must estimate. In the schematic (left) the four triangles whose centers are marked with blue “X”s will be divided first. During the course of the actual search (right), triangles of the same size are divided in different iterations. The snapshot shows the situation two iterations after all the triangles in question were the same size as the red one: the one to its left has been divided once, and the one below it has already been divided twice (Color figure online)

First, note that at each iteration we may not divide the triangle containing the minimizer \(x^*\). To develop a bound on how many iterations this triangle might have to “wait” before being divided, consider Fig. 11. We argue that this setup represents the worst case the algorithm has to deal with (it amplifies the anisotropy the most). Assume that f is a quadratic function whose Hessian has smallest and largest eigenvalues m and M, respectively. The figure depicts the minimizer \(x^*\) and the level set \(\{x' | f(x') = f(x)\}\) for x at the center of the bold triangle. Assume \(x^*\) is in fact just inside this triangle, but as far toward the corner as possible, i.e., as far as possible from x. And assume the line segment \(\overline{xx^*}\) lies along the minor axis of the ellipse. Now, all triangles of the same size as the one containing the minimizer whose centers lie inside the ellipse will be divided before it, as the function values at their centers are by construction less than f(x). So we need to estimate how many there are. Note that such triangles come (in this case) in sets of 4. We can get an upper bound by measuring the horizontal distance between centers; this distance is in fact \(\sqrt{3} |x-x^*|\). Now, the aspect ratio of the ellipse is given by the condition number \(\kappa = \frac{M}{m}\) of the Hessian. So the major axis, the distance from \(x^*\) to the right edge of the ellipse, is \(\kappa |x-x^*|\). And we just need to know how many \(\sqrt{3} |x-x^*|\)-wide triangles fit in this distance. Taking into account that the first triangle center is \(\sqrt{3}/2 |x-x^*|\) away from \(x^*\) (horizontally) and the curvature at the narrowing point of the ellipse, we claim that an upper bound for the number of triangle centers that could have lower function value than f(x) is

$$\begin{aligned} \Bigl \lfloor {\frac{\kappa |x-x^*|}{\sqrt{3} |x-x^*|}} \Bigr \rfloor \end{aligned}$$

so the upper bound \(n_{w}\) on the number of iterations after its creation (when it first enters the candidate set) before the optimum-containing triangle is divided is

$$\begin{aligned} n_{w} = 1 + 4 \Bigl \lfloor \frac{\kappa }{\sqrt{3}} \Bigr \rfloor , \end{aligned}$$

where we have added 1 so that \(n_w = 1\) in the case that it is immediately divided. Note that for \(M=m\), i.e., \(\kappa = 1\), this quantity is 1; for perfectly isotropic functions we are guaranteed to always subdivide the “correct” triangle. Also, this estimate is independent of x and \(x^*\), i.e., it is universal for all scales of the search.
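As a quick sanity check, this bound is easy to evaluate numerically. The minimal sketch below (Python; the function name `n_wasted` is ours, not from the MDTri implementation) tabulates \(n_w\) for a few condition numbers:

```python
import math

def n_wasted(kappa: float) -> int:
    """Upper bound n_w = 1 + 4*floor(kappa / sqrt(3)) on how many
    iterations after its creation the optimum-containing triangle
    may have to wait before being divided."""
    return 1 + 4 * math.floor(kappa / math.sqrt(3))

# kappa = 1 (isotropic): the correct triangle is divided immediately.
print(n_wasted(1))    # -> 1
print(n_wasted(10))   # -> 21
print(n_wasted(100))  # -> 229
```

Note how quickly anisotropy inflates the bound: each unit of condition number beyond \(\sqrt{3}\) can cost four extra divisions at every scale.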

Next we estimate the number \(n_c\) of “correct” subdivisions, i.e., subdivisions of the triangle containing the optimizer, needed to ensure a given bound \(\epsilon \) on the distance between the probed points and the minimizer \(x^*\). Recall that each face of the search space is an equilateral triangle with sides of length \(\sqrt{2}\), so the center of each face is \(\frac{\sqrt{6}}{3}\) away from the corners; thus x starts at least this close to \(x^*\). The “center to corner” distance decreases by a factor of 2 at each correct subdivision. Hence we can say \(|x-x^*| < R (\frac{1}{2})^{n_c}\), where \(R = \sqrt{6} / 3\).

Now note that \(\nabla f(x^*) = 0\) at the minimizer \(x^*\), so Eq. (2) implies

$$\begin{aligned} f(x) - f(x^*) \le \frac{M}{2}|x-x^*|^2. \end{aligned}$$

So

$$\begin{aligned} \left| f(x)-f(x^*)\right| \le \frac{M}{2}\left| x-x^*\right| ^2 \le \frac{M}{2} \left[ R \left( \frac{1}{2}\right) ^{n_c}\right] ^2 = \frac{MR^2}{2} \left[ \left( \frac{1}{2}\right) ^{n_c}\right] ^2. \end{aligned}$$

Choosing

$$\begin{aligned} n_c = \frac{\ln \sqrt{\frac{2 \epsilon }{MR^2}}}{\ln \frac{1}{2}} \end{aligned}$$

then implies that \(|f(x)-f(x^*)| \le \epsilon \).
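Concretely, \(n_c\) can be computed as follows (a minimal sketch; `n_correct` is our naming, and we take the ceiling so the count is a whole number of subdivisions):

```python
import math

R = math.sqrt(6) / 3  # center-to-corner distance of a face with side sqrt(2)

def n_correct(M: float, eps: float) -> int:
    """Number of correct subdivisions guaranteeing |f(x) - f(x*)| <= eps:
    n_c = ceil( ln(sqrt(2*eps / (M*R^2))) / ln(1/2) )."""
    return math.ceil(math.log(math.sqrt(2 * eps / (M * R * R))) / math.log(0.5))

# For M = 1 and eps = 1e-8, thirteen correct subdivisions suffice.
print(n_correct(1.0, 1e-8))  # -> 13
```

Because the dependence on \(\epsilon\) and M sits inside a logarithm, \(n_c\) grows only slowly as the tolerance tightens or the curvature increases.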

Now, the upper bound on the total number of iterations, taking into account as above that some might not divide the triangle containing the optimum, is \(n=n_w n_c\). Finally, to connect iterations n to evaluations e, note that the n-th iteration of MDTri involves at most 4n function evaluations. This is true because MDTri adds exactly one new “scale” at each iteration, we might divide the minimal-objective triangle at each scale, and each triangle subdivision results in four new triangles to probe (counting re-evaluation of the center point, which is what the current implementation does). Then the number of evaluations e after n iterations is no more than \(4 + 8 + 12 + \cdots + 4n = 4 (1+2+3+\cdots +n) = 4 n (n+1)/2.\)

To summarize, the relations

$$\begin{aligned} n = n_w n_c = \left( 1 + 4 \Bigl \lfloor \frac{\kappa }{\sqrt{3}} \Bigr \rfloor \right) \left\lceil \frac{\ln \sqrt{\frac{2 \epsilon }{MR^2}}}{\ln \frac{1}{2}} \right\rceil \quad \text {and} \quad e = 4 n (n+1)/2 \end{aligned}$$
(3)

together provide an upper bound for the number of evaluations required to reach a desired tolerance for MDTri’s search over one phase (triangle) for strongly convex quadratic functions.
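Putting the pieces together, Eq. (3) can be evaluated directly. The sketch below (Python; the function name and structure are ours, not from the paper's code) computes the bound for the isotropic (\(\kappa = 1\)) setting used in the tests of Fig. 12:

```python
import math

def evaluation_bound(kappa: float, M: float, eps: float) -> int:
    """Upper bound (Eq. 3) on the number of MDTri evaluations needed to
    reach tolerance eps on one face, for a strongly convex quadratic with
    condition number kappa and largest Hessian eigenvalue M."""
    R = math.sqrt(6) / 3
    n_w = 1 + 4 * math.floor(kappa / math.sqrt(3))
    n_c = math.ceil(math.log(math.sqrt(2 * eps / (M * R * R))) / math.log(0.5))
    n = n_w * n_c
    return 4 * n * (n + 1) // 2

# Isotropic bound at tolerance 1e-8 for a few curvatures:
for M in (1, 10, 100):
    print(M, evaluation_bound(1.0, M, 1e-8))
# -> 1 364
#    10 480
#    100 544
```

Since e grows quadratically in n and n is multiplied by \(n_w\), anisotropy enters the evaluation bound quadratically through \(\kappa\), which is consistent with the observation below that the anisotropic bound is far from sharp.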

Fig. 12

(Left) Evaluations required for MDTri to converge to a tolerance of \(10^{-8}\) for an isotropic quadratic test case with \(M=1,10,100,1000,10{,}000,100{,}000\), compared to the estimate from Eq. (3). (Right) Evaluations required for MDTri to converge to a tolerance of \(10^{-8}\) for an anisotropic quadratic test case for the same M values and \(m=1\), compared to the isotropic case. The bound for the anisotropic case is not sharp (e.g., at \(M=10\) it is already over 100,000) and so has been omitted

We have run a test case to illustrate and verify this estimate. For this, we have defined a quadratic objective function with curvature M and minimizer \(x^*\) placed at random points in the triangle. Note that this is the \(\kappa =1\) case, so we are not yet testing the effect of anisotropy. We ran MDTri a number of times (testing different values of \(x^*\), to estimate the maximum number of evaluations we might require in practice) for \(M=1,10,100,1000,10{,}000,100{,}000\), to a tolerance of \(\epsilon = 10^{-8}\). Figure 12 shows the empirical results together with the upper bound from Eq. (3). We note that for this particular test, the gap between the bound and the actual number of iterations appears to come from certain iterations where we do not divide a triangle at every scale, which happens when some of the (scale, value) pairs on the Pareto front (see Sect. 4) are not on the convex hull of the {(scale, value)} set. For the anisotropic case, we have used a quadratic test function with curvature \(M=1\), 10, 100, 1000, 10,000, 100,000 in one direction, and \(m=1\) in the other. (One such function is that in Fig. 11.) Further detailed exposition of the convergence of MDTri, especially its global robustness, is beyond the scope of this paper.


Cite this article

Graf, P.A., Billups, S. MDTri: robust and efficient global mixed integer search of spaces of multiple ternary alloys. Comput Optim Appl 68, 671–687 (2017). https://doi.org/10.1007/s10589-017-9922-9
