Nelder-Mead Simplex Optimization Routine for Large-Scale Problems: A Distributed Memory Implementation

Abstract

The Nelder-Mead simplex method is an optimization routine that works well with irregular objective functions. For a function of \(n\) parameters, it compares the objective function at the \(n+1\) vertices of a simplex and updates the worst vertex through simplex search steps. However, a standard serial implementation can be prohibitively expensive for optimizations over a large number of parameters. We describe a parallel implementation of the Nelder-Mead method for distributed-memory architectures. With \(p\) processors, each processor is assigned \((n+1)/p\) vertices at each iteration. Each processor then updates its worst local vertices, communicates the results, and a new simplex is formed from the vertices of all processors. We also describe how the algorithm can be implemented with only two MPI commands. In simulations, our implementation exhibits large speedups and scales to large problem sizes.
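
As a rough illustration of the iteration just described, the sketch below shows, in C with MPI, one way each rank could update a share of the worst vertices and re-assemble the simplex with a single \(\mathtt{MPI\_AllReduce}\). It is not the paper's code (which is available at Neira's website): the toy objective, the problem size, the reflection-only update, and the device of summing zero-padded per-rank contributions are illustrative assumptions; the expansion, contraction, and shrink steps of the full method are omitted, and \(\mathtt{MPI\_Bcast}\) is used here only to distribute the starting simplex, whereas the paper reserves it for the shrink step.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define N  8                 /* number of parameters (placeholder size)  */
#define NV (N + 1)           /* number of simplex vertices               */

/* Placeholder objective: sum of squares, minimized at the origin. */
static double f(const double *x) {
    double s = 0.0;
    for (int i = 0; i < N; i++) s += x[i] * x[i];
    return s;
}

/* Sort (objective value, vertex index) pairs by value, ascending. */
static int by_value(const void *a, const void *b) {
    double d = ((const double *)a)[0] - ((const double *)b)[0];
    return (d > 0) - (d < 0);
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, p;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    /* In this sketch every rank keeps a full copy of the simplex and only the
       update work is split; rank 0 builds the start simplex, MPI_Bcast shares it. */
    double simplex[NV][N];
    if (rank == 0)
        for (int v = 0; v < NV; v++)
            for (int i = 0; i < N; i++)
                simplex[v][i] = (v == i + 1) ? 1.0 : 0.5;
    MPI_Bcast(&simplex[0][0], NV * N, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    int k = NV / 2;          /* worst vertices replaced per iteration */

    for (int it = 0; it < 200; it++) {
        /* Every rank ranks the vertices identically by objective value. */
        double key[NV][2];
        for (int v = 0; v < NV; v++) { key[v][0] = f(simplex[v]); key[v][1] = v; }
        qsort(key, NV, sizeof key[0], by_value);

        /* Centroid of the NV - k best vertices. */
        double c[N] = {0.0};
        for (int j = 0; j < NV - k; j++)
            for (int i = 0; i < N; i++)
                c[i] += simplex[(int)key[j][1]][i] / (NV - k);

        /* Each rank reflects a strided share of the k worst vertices through
           the centroid and records accepted changes in a zero-filled buffer... */
        double delta[NV][N], merged[NV][N];
        memset(delta, 0, sizeof delta);
        for (int j = NV - k + rank; j < NV; j += p) {
            int v = (int)key[j][1];
            double trial[N];
            for (int i = 0; i < N; i++) trial[i] = 2.0 * c[i] - simplex[v][i];
            if (f(trial) < key[j][0])   /* keep the reflection only if it improves */
                for (int i = 0; i < N; i++) delta[v][i] = trial[i] - simplex[v][i];
        }

        /* ...so that one summing MPI_Allreduce rebuilds the same updated
           simplex on every rank. */
        MPI_Allreduce(&delta[0][0], &merged[0][0], NV * N, MPI_DOUBLE,
                      MPI_SUM, MPI_COMM_WORLD);
        for (int v = 0; v < NV; v++)
            for (int i = 0; i < N; i++) simplex[v][i] += merged[v][i];
    }

    double best = f(simplex[0]);
    for (int v = 1; v < NV; v++) { double fv = f(simplex[v]); if (fv < best) best = fv; }
    if (rank == 0) printf("best objective after 200 iterations: %g\n", best);

    MPI_Finalize();
    return 0;
}

Keeping each rank's contribution in a zero-filled buffer lets a single summing reduction play the role of a gather, which is one way the entire exchange can stay within the two MPI calls mentioned above.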


Notes

  1. For this reason the algorithm is also known as the “amoeba” method.

  2. Big O notation is used to describe the limiting behavior of a function as its argument tends toward infinity. Essentially, it allows one to express the function in terms of only its dominant terms. Formally, a function \(h(n) = O(g(n))\) if and only if there exists a positive real number \(M\) and a real number \(n_0\) such that \(|h(n)| \le M\cdot |g(n)|\) for all \(n \ge n_0\).

  3. While it is possible to implement our entire algorithm using only \(\mathtt{MPI\_AllReduce}\), it is simpler to use \(\mathtt{MPI\_Bcast}\) in the shrink step.

  4. It should be made clear that if the \(k\) worst vertices are updated in a single iteration, we count that as updating \(k\) vertices; similarly, if one vertex is updated per iteration, it takes \(k\) iterations to update \(k\) vertices.

  5. These operations are \(O(n^2)\) and \(O(n\log n)\), respectively, as opposed to the remaining \(O(n)\) cost of updating a single vertex.

  6. Notice that the memory required for a problem of size \(n\) is \(O(n^2)\). Hence, to keep the amount of memory used by each processor constant, with \(p\) processors we scale the problem size by a factor of \(\sqrt{p}\). However, with the exception of the infrequent sorting and centroid computations, the work done by each processor is \(O(n)\). Since we only increase the problem size by a factor of \(\sqrt{p}\), we also increase the number of iterations by a factor of \(\sqrt{p}\) to equalize the work performed. Specifically, for a single processor we execute 32,000 iterations on a problem of size \(n=\) 4,000 and update \(k=n/4\) vertices per iteration. When we scale to \(p=2\), we execute \(32{,}000\cdot\sqrt{2}\) iterations on a problem of size \(n\sqrt{2}\) and update \(k=(n\sqrt{2})/4\) vertices per iteration (the vertices updated per iteration are split amongst the processors).
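
Stated compactly, the weak-scaling rule in note 6 sets the problem size to \(n_p = \sqrt{p}\cdot n_1\), the iteration count to \(I_p = \sqrt{p}\cdot I_1\), and the vertices updated per iteration to \(k_p = n_p/4\), with those \(k_p\) updates split across the \(p\) processors. For the baseline \(n_1=\) 4,000 and \(I_1=\) 32,000, the \(p=2\) run therefore uses roughly \(n_2 \approx\) 5,657 parameters, \(I_2 \approx\) 45,255 iterations, and \(k_2 \approx\) 1,414 vertices per iteration; the rounded figures follow directly from the values quoted in the note.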

References

  • Aldrich, E. M., Fernandez-Villaverde, J., Ronald Gallant, A., & Rubio-Ramirez, J. F. (2011). Tapping the supercomputer under your desk: Solving dynamic equilibrium models with graphics processors. Journal of Economic Dynamics and Control, 35, 386–393.

  • Beaumont, P. M., & Bradshaw, P. M. (1995). A distributed parallel genetic algorithm for solving optimal growth models. Computational Economics, 8, 159–179.

  • Creel, M. (2005). User-friendly parallel computations with econometric examples. Computational Economics, 26, 107–128.

  • Dennis, J. E., Jr., & Torczon, V. (1991). Direct search methods on parallel machines. SIAM Journal on Optimization, 1, 448–474.

  • Ferrall, C. (2005). Solving finite mixture models: Efficient computation in economics under serial and parallel execution. Computational Economics, 25, 343–379.

  • Huggett, M., Ventura, G., & Yaron, A. (2006). Human capital and earnings distribution dynamics. Journal of Monetary Economics, 53, 265–290.

  • Lawver, D. (2012). Measuring quality increases in the medical sector. Santa Barbara: University of California Santa Barbara.

  • Lee, D., & Wiswall, M. (2007). A parallel implementation of the simplex function minimization routine. Computational Economics, 30, 171–187.

  • Nelder, J. A., & Mead, R. (1965). A simplex method for function minimization. The Computer Journal, 7, 308–313.

  • Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. (2007). Numerical recipes: The art of scientific computing (3rd ed.). Cambridge University Press.

  • Swann, C. A. (2002). Maximum likelihood estimation using parallel computing: An introduction to MPI. Computational Economics, 19, 145–178.

Author information

Corresponding author

Correspondence to Julian Neira.

Additional information

Our computer code is available at Neira’s website. We thank Hrishikesh Singhania for excellent comments and discussion.

About this article

Cite this article

Klein, K., Neira, J. Nelder-Mead Simplex Optimization Routine for Large-Scale Problems: A Distributed Memory Implementation. Comput Econ 43, 447–461 (2014). https://doi.org/10.1007/s10614-013-9377-8
