
References

Published online by Cambridge University Press:  06 January 2017

Zbigniew J. Czech
Affiliation:
Silesian University of Technology, Gliwice, Poland

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2017



Ackerman, W. B. 1982. “Dataflow Languages.” IEEE Computer 15 (2): 15–25.
Adiga, N. R., Blumrich, M. A., Chen, D., et al. 2005. “Blue Gene/L Torus Interconnection Network.” IBM Journal of Research and Development 49 (2/3): 265–276.
Adve, S. V. and Boehm, H. J.. 2011. “Memory Models.” In Encyclopedia of Parallel Computing, vol. 3, edited by Padua, D. A. (New York: Springer-Verlag), 1107–1110.
Adve, S. V. and Gharachorloo, K.. 1996. “Shared Memory Consistency Models: A Tutorial.” IEEE Computer 29 (12): 66–76.
Agarwal, A. 1991. “Limits on Interconnection Network Performance.” IEEE Transactions on Parallel and Distributed Systems 2 (4): 398–412.
Agerwala, T. and Arvind, N. I.. 1982. “Data Flow Systems: Guest Editor's Introduction.” Computer 15 (2): 10–13.
Aho, A. V., Hopcroft, J. E., and Ullman, J. D.. 1974. The Design and Analysis of Computer Algorithms. Boston, MA: Addison-Wesley.
Ajima, Y., Sumimoto, S., and Shimizu, T.. 2009. “A 6D Mesh/Torus Interconnect for Exascale Computers.” Computer 42 (11): 36–40.
Ajtai, M., Komlós, J., and Szemerédi, E.. 1983. “Sorting in c log(n) Parallel Steps.” Combinatorica 3: 1–19.
Akers, S. B. and Krishnamurthy, B.. 1989. “A Group-theoretic Model for Symmetric Interconnection Networks.” IEEE Transactions on Computers 38 (4): 555–566.
Akl, S. G. 1989. The Design and Analysis of Parallel Algorithms. Englewood Cliffs, NJ: Prentice Hall.
Akl, S. G. 1997. Parallel Computation. Models and Methods. Upper Saddle River, NJ: Prentice Hall.
Alexander, M. and Gardner, W., eds. 2009. Process Algebra for Parallel and Distributed Processing. Boca Raton, FL: Chapman & Hall/CRC.
Alexandrov, A., Ionescu, M. F., Schauser, K. E., and Scheiman, C. 1995. “LogGP: Incorporating Long Messages into the LogP Model.” Proc. 7th ACM Symposium on Parallel Algorithms and Architectures, Santa Barbara, CA, 95–105.
Allen, R. and Kennedy, K.. 2002. Optimizing Compilers for Modern Architectures. San Francisco, CA: Morgan Kaufmann.
Alt, H., Hagerup, T., Mehlhorn, K., and Preparata, F. P.. 1987. “Simulation of Idealized Parallel Computers on More Realistic Ones.” SIAM Journal on Computing 16 (5): 808–835.
Amdahl, G. 1967. “Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities.” AFIPS Conference Proc., vol. 30. Washington D.C.: Thompson Books, 483–485.
Annaratone, M., Arnould, E., Gross, T., et al. 1986. “Warp Architecture and Implementation.” Proc. of 13th Annual International Symposium on Computer Architecture, Computer Science Press, Tokyo, 346–356.
Anderson, D. P., Cobb, J., Korpela, E., et al. 2002. “SETI@home: An Experiment in Public-resource Computing.” Communications of the ACM 45 (11): 56–61.
Anderson, T. E., Culler, D. E., and Patterson, D.. 1995. “A Case for NOW (Networks of Workstations).” IEEE Micro 15 (1): 54–56.
Andrews, G. R. 1991. Concurrent Programming: Principles and Practice. Menlo Park, CA: Benjamin/Cummings.
Andrews, G. R. 2000. Foundations of Multithreaded, Parallel, and Distributed Programming. Reading, MA: Addison-Wesley.
Apt, K. R. and Olderog, E-R.. 1991. Verification of Sequential and Concurrent Programs. New York: Springer-Verlag.
Arvind, N. I. and Culler, D. E.. 1986. “Dataflow Architectures.” Annual Review of Computer Science, vol. 1: 225–253.
Arvind, N. I., Gostelow, K. P., and Plouffe, W.. 1978. The ID-Report: An Asynchronous Programming Language and Computing Machine. Technical Report 114. University of California at Irvine.
Arvind, N. I. and Nikhil, R. S.. 1990. “Executing a Program on the MIT Tagged-token Dataflow Architecture.” IEEE Transactions on Computers 39 (3): 300–318.
Attiya, H. and Welch, J.. 1998. Distributed Computing: Fundamentals, Simulations and Advanced Topics. London: McGraw-Hill.
Augen, J. 2002. “The Evolving Role of Information Technology in the Drug Discovery Process.” Drug Discovery Today 7 (5): 315–323.
Baase, S. 1988. Computer Algorithms: Introduction to Design and Analysis. Boston, MA: Addison-Wesley.
Bacon, J. and Harris, T.. 2003. Operating Systems. Concurrent and Distributed Systems. Harlow, UK: Pearson Education, Addison-Wesley.
Bader, D. A., ed. 2008. Petascale Computing. Algorithms and Applications. Boca Raton, FL: Chapman & Hall/CRC.
Bader, M., Breuer, A., and Schreiber, M.. 2013. “Parallel Fully Adaptive Tsunami Simulations.” In Facing the Multicore-challenge III. Aspects of New Paradigms and Technologies in Parallel Computing, Lecture Notes in Computer Science. Vol. 7686, edited by Keller, R., Kramer, D., and Weiss, J-P. (Berlin, Heidelberg: Springer-Verlag), 137–138.
Baer, J-L. 2010. Microprocessor Architecture, Cambridge, NY: Cambridge University Press.
Bahi, J. M. 2008. Parallel Iterative Algorithms. From Sequential to Grid Computing. Boca Raton, FL: Chapman & Hall/CRC.
Barnes, G. H., Brown, R. M., Kato, M., et al. 1968. “The Illiac IV Computer.” IEEE Transactions on Computers 17 (8): 746–757.
Barton, M. L. and Withers, G. R.. 1989. “Computing Performance as a Function of the Speed, Quantity and Cost of the Processors.” Supercomputing ’89 Proc., 759–764.
Barz, H. W. 1983. “Implementing Semaphores by Binary Semaphores.” ACM SIGPLAN Notices 18 (2): 39–45.
Batcher, K. E. 1968. “Sorting Networks and Their Applications.” Spring Joint Computer Conference, AFIPS Proc., 32: 307–314.
BBN Advanced Computers Incorporated. 1986. Butterfly Parallel Processor Overview, BBN Report No. 6148, March.
Beecroft, J., Homewood, M., and McLaren, M.. 1994. “Meiko CS-2 Interconnect Elan-Elite Design.” Parallel Computing 20 (10–11): 1627–1638.
Bell, G. and Gray, J.. 2002. “What's Next in High-performance Computing.” Communications of the ACM 45 (2): 91–95.
Bellman, R. 1957. Dynamic Programming. Princeton, NJ: Princeton University Press.
Ben-Ari, M. 2006. Principles of Concurrent and Distributed Programming, 2nd edn. Boston, MA: Addison-Wesley.
Bharadwaj, V., Ghose, D., Mani, V., and Robertazzi, T. G.. 1996. Scheduling Divisible Loads in Parallel and Distributed Systems. IEEE Computer Society Press, Los Alamitos, CA.
Bhatele, A. 2011. “Topology Aware Task Mapping.” In Encyclopedia of Parallel Computing, vol. 4, edited by Padua, D. A. (New York: Springer-Verlag), 2057–2062.
Bilardi, G., Herley, K. T., Pietracaprina, A., Pucci, G., and Spirakis, P.. 1996. “BSP vs LogP.” 8th ACM Symposium on Parallel Algorithms and Architectures, Padova, Italy, 25–32.
Bilardi, G., Pietracaprina, A., and Pucci, G.. 2008. “Decomposable BSP: A Bandwidth-latency Model for Parallel and Hierarchical Computation.” In Handbook of Parallel Computing. Models, Algorithms and Applications, edited by Rajasekaran, S. and Reif, J. (Boca Raton, FL: Chapman & Hall/CRC), 2-1–2-21.
Bilardi, G. and Pietracaprina, A.. 2011. “Models of Computation, Theoretical.” In Encyclopedia of Parallel Computing, vol. 3, edited by Padua, D. A. (New York: Springer-Verlag), 1150–1158.
Bisseling, R. H. 2004. Parallel Scientific Computation. New York: Oxford University Press.
Biswas, R., Aftosmis, M., Kiris, C., and Shen, B-W.. 2008. “Petascale Computing: Impact on Future NASA Missions.” In Petascale Computing. Algorithms and Applications, edited by Bader, D. A. (Boca Raton, FL: Chapman & Hall/CRC), 29–46.
Biswas, R., Thigpen, W., Ciotti, R., Mehrotra, P., et al. 2013. “Pleiades: NASA's First Petascale Supercomputer.” In Contemporary High Performance Computing: From Petascale toward Exascale, edited by Vetter, J. S. (Chapman & Hall/CRC, Boca Raton, FL), 309–338.
Bokhari, S. H. 1987. “Multiprocessing the Sieve of Eratosthenes.” Computer, April: 50–58.
Boppana, R. B. 1989. “Optimal Separations between Concurrent-write Parallel Machines.” Proc. of the ACM Symposium on Theory of Computing, 320–326.
Borkar, S., Cohn, R., and Fox, G.. 1990. “Supporting Systolic and Memory Communication in iWARP.” Proc. of 17th Annual International Symposium on Computer Architecture, Australia, May 1990, 70–81.
Borodin, A. 1977. “On Relating Time and Space to Size and Depth.” SIAM Journal on Computing 6 (4): 733–744.
Borovska, P., Nakov, O., Markov, S., Ivanova, D., and Filipov, F.. 2007. “Performance Evaluation of TOFU System Area Network Design for High-performance Computer Systems.” Proc. 5th European Computing Conference, 186–216.
Bovet, D. P. and Crescenzi, P.. 1994. Introduction to the Theory of Complexity. Upper Saddle River, NJ: Prentice Hall.
Brent, R. P. 1974. “The Parallel Evaluation of General Arithmetic Expressions.” Journal of the ACM 21 (2): 201–206.
Brinch Hansen, P. 1975. “The Programming Language Concurrent Pascal.” IEEE Transactions on Software Engineering 2: 199–206.
Brooks, E. D. III. 1986. “The Butterfly Barrier.” International Journal of Parallel Programming 15: 295–307.
Brucker, P. 2010. Scheduling Algorithms, 5th edn. Berlin, Heidelberg: Springer-Verlag.
Bruda, S. D. and Zhang, Y.. 2009. “Relations between Several Parallel Computational Models.” Scalable Computing: Practice and Experience 10 (2): 163–172.
Burns, A. and Wellings, A.. 1998. Concurrency in Ada, 2nd edn. Cambridge: Cambridge University Press.
Buyya, R., Branson, K., Giddy, J., and Abramson, D.. 2003. “The Virtual Laboratory: A Toolset to Enable Distributed Molecular Modelling for Drug Design on the World-wide Grid.” Concurrency and Computation: Practice and Experience 15 (1): 1–25.
Carmona, E. A. and Rice, M. D.. 1991. “Modeling the Serial and Parallel Fractions of a Parallel Algorithm.” Journal of Parallel and Distributed Computing 13: 286–298.
Carver, R. H. and Tai, K-C.. 2006. Modern Multithreading. Implementing, Testing, and Debugging Multi-threaded Java and C++/Pthreads/Win32 Programs. Hoboken, NJ: Wiley-Interscience.
Casanova, H., Legrand, A., and Robert, Y.. 2009. Parallel Algorithms. Boca Raton, FL: CRC Press.
Chaderjian, N. M. and Buning, P. G.. 2011. “High Resolution Navier-Stokes Simulation of Rotor Wakes.” Proceedings of the American Helicopter Society 67th Annual Forum.
Chaderjian, N. M. and Ahmad, J. U.. 2012. “Detached Eddy Simulation of the UH-60 Rotor Wake Using Adaptive Mesh Refinement.” Proceedings of the American Helicopter Society 68th Annual Forum.
Chandra, R., Dagum, L., Kohr, D., et al. 2001. Parallel Programming in OpenMP. San Francisco, CA: Morgan Kaufmann, Academic Press.
Chapman, B., Jost, G., and van der Pas, R.. 2008. Using OpenMP. Portable Shared Memory Parallel Programming. Cambridge, MA: MIT Press.
Cheatham, T. E., Fahmy, A., Stepanescu, D., and Valiant, L.. 1995. “Bulk Synchronous Parallel Computing-A Paradigm for Transportable Software.” Proc. 28th Annual Hawaii Conference on System Sciences, Vol. II. Hoboken, NJ: IEEE Computer Society Press, 268–275.
Chen, S. S., Price, J. F., Zhao, W., Donelan, M. A., and Walsh, E. J.. 2007. “The CBLAST-Hurricane Program and the Next-generation Fully Coupled Atmosphere-wave-ocean Models for Hurricane Research and Prediction.” Bull. Amer. Meteor. Soc. 88 (3): 311–317.
Cheng, J., Grossman, M., and McKercher, T.. 2014. Professional CUDA C Programming. New York: John Wiley & Sons, Inc.
Chlebus, B. S., Diks, K., Hagerup, T., and Radzik, T.. 1988. “Efficient Simulations between Concurrent-read Concurrent-write PRAM Models.” Proc. of the Symposium on Mathematical Foundations of Computer Science, 231–239.
Close, P. 1988. “The iPSC/2 Node Architecture.” Proc. of the Conference on Hypercube Concurrent Computers and Applications, 43–55.
Cole, R. 1986. “Parallel Merge Sort.” Proc. of the 27th Annual Symposium on Foundations of Computer Science. Hoboken, NJ: IEEE Computer Society Press, 511–516.
Cole, R. 1988. “Parallel Merge Sort.” SIAM Journal on Computing 17 (4): 770–785.
Cole, R. 1993. “Parallel Merge Sort.” In Synthesis of Parallel Algorithms, edited by Reif, J. H. (San Mateo, CA: Morgan Kaufmann), 453–495.
Collins, W. D., Bitz, M. L., Blackmon, M. L., et al. 2006. “The Community Climate System Model Version 3 (CCSM3).” Journal of Climate 19: 2122–2143.
Convex Computer Corporation. 1993. Exemplar Architecture. Richardson, TX: Convex Computer Corporation.
Cook, S. A. 1979. “Deterministic CFL's are Accepted Simultaneously in Polynomial Time and Log Squared Space.” Conference Record of the Eleventh Annual ACM Symposium on Theory of Computing, Atlanta, GA, April–May 1979, 338–345.
Cook, S. A., Dwork, C., and Reischuk, R.. 1986. “Upper and Lower Time Bounds for Parallel Random Access Machines without Simultaneous Writes.” SIAM Journal on Computing 15: 87–97.
Cormen, T. H., Leiserson, C. E., and Rivest, R. L.. 1990. Introduction to Algorithms. Cambridge, MA: MIT Press.
Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C.. 2009. Introduction to Algorithms, 3rd edn. Cambridge, MA: MIT Press.
Coulouris, G., Dollimore, J., and Kindberg, T.. 2005. Distributed Systems: Concepts and Design, 4th edn. Boston, MA: Addison-Wesley.
Courtois, P. J., Heymans, F., and Parnas, D. L.. 1971. “Concurrent Control with ‘Readers’ and ‘Writers’.” Communications of the ACM 14 (10): 667–668.
Culler, D., Karp, R., Patterson, D., et al. 1993. “LogP: Towards a Realistic Model of Parallel Computation.” 4th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, San Diego, CA, May 1993, 1–12.
Culler, D. E., Singh, J. P., and Gupta, A.. 1999. Parallel Computer Architecture. San Francisco, CA: Morgan Kaufmann.
Dally, W. J. 1991. “Performance Analysis of k-ary n-cube Interconnection Networks.” IEEE Transactions on Computers 39 (6): 775–785.
Dally, W. J. and Towles, B.. 2004. Principles and Practices of Interconnection Networks. San Francisco, CA: Morgan Kaufmann.
Darema-Rogers, F., George, D., Norton, V. A., and Pfister, G.. 1984. “VM Parallel Environment.” Proc. of the IBM Kingston Parallel Processing Symposium, November 27–29, 1984 (IBM Confidential).
Darema, F. 2001. “SPMD Model: Past, Present and Future.” Recent Advances in Parallel Virtual Machine and Message Passing Interface, 8th European PVM/MPI Users’ Group Meeting, Santorini/Thera, Greece, LNCS 2131, September 23–26, 2001, p. 1.
Darte, A., Robert, Y., and Vivien, F.. 2000. Scheduling and Automatic Parallelization. Boston, MA: Birkhäuser.
Dennis, J. B. 1980. “Dataflow Supercomputers.” IEEE Computer 13: 48–56.
Dennis, J. B. 1983. “Maximum Pipelining of Array Operations on Static Data Flow Machines.” Proc. of the International Conference on Parallel Processing, August 1983, 176–184.
Dennis, J. B. and van Horn, E. C.. 1966. “Programming Semantics for Multiprogrammed Computations.” Communications of the ACM 9 (3): 143–155.
Dennis, J. and Loft, R.. 2009. “Optimizing High-resolution Climate Variability Experiments on the Cray XT4 and XT5 Systems at NICS and NERSC.” Proceedings of the 51st Cray User Group Conference (CUG), 1–8.
Dijkstra, E. W. 1968. “Cooperating Sequential Processes.” In Programming Languages, edited by Genuys, F. (New York: Academic Press), 43–112.
Dijkstra, E. W. 1971. “Hierarchical Ordering of Sequential Processes.” Acta Informatica 1 (2): 115–138.
Dijkstra, E. W. and Scholten, C. S.. 1980. “Termination Detection for Diffusing Computations.” Information Processing Letters 11 (1): 1–4.
Dill, K. A., Ozkan, S. B., Weikl, T. R., Chodera, J. D., and Voelz, V. A.. 2007. “The Protein Folding Problem: When Will It Be Solved?” Current Opinion in Structural Biology 17 (3): 342–346.
Domeika, M. 2008. Software Development for Embedded Multi-core Systems. Burlington, MA: Newnes.
Donnellan, A., Mora, P., Matsu'ura, M., and Yin, X-C.. 2004. Computational Earthquake Science. Basel: Birkhäuser.
Dongarra, J. 2013. “Visit to the National University for Defense Technology Changsha, China, University of Tennessee, Oak Ridge National Laboratory, June 3, 2013.” http://www.netlib.org/utk/people/JackDongarra/PAPERS/tianhe-2-dongarra-report.pdf.
Dongarra, J., Otto, S. W., Snir, M., and Walker, D.. 1995. An Introduction to the MPI Standard, University of Tennessee Technical Report, CS-95-274, January 1995.
Dongarra, J., Foster, I., Fox, G., et al., eds. 2003. Sourcebook of Parallel Computing. San Francisco, CA: Morgan Kaufmann.
Dongarra, J., Sterling, T., Simon, H., and Strohmaier, E.. 2005. “High-performance Computing: Clusters, Constellations, MPPs, and Future Directions.” Computing in Science & Engineering, March/April: 51–59.
Dongarra, J. and Luszczek, P.. 2011. “LINPACK Benchmark.” In Encyclopedia of Parallel Computing, vol. 2, edited by Padua, D. (New York: Springer-Verlag), 1033–1035.
Dorband, E. N., Hemsendorf, M., and Merritt, D.. 2003. “Systolic and Hyper-systolic Algorithms for the Gravitational N-body Problem, with an Application to Brownian Motion.” J. Comput. Phys. 185: 484–511.
Downey, A. B. 2007. “The Little Book of Semaphores,” v. 2.1.2. http://greenteapress.com/semaphores/.
Drake, J. B., Jones, P. W., Vertenstein, M., White, J. B. III, and Worley, P. H.. 2008. “Software Design for Petascale Climate Science.” In Petascale Computing. Algorithms and Applications, edited by Bader, D. A. (Boca Raton, FL: Chapman & Hall/CRC), 125–146.
Drozdowski, M. 2004. “Scheduling Parallel Tasks – Algorithms and Complexity.” In Handbook of Scheduling. Algorithms, Models and Performance Analysis, edited by Leung, J. Y-T. (Boca Raton, FL: Chapman & Hall/CRC), 25-1–25-25.
Dubois, M., Annavaram, M., and Stenström, P.. 2012. Parallel Computer Organization and Design. Cambridge: Cambridge University Press.
Dumancas, G. G. 2015. “Applications of Supercomputers in Sequence Analysis and Genome Annotation.” In Research and Applications in Global Supercomputing, edited by Segall, R. S., Cook, J. S. and Zhang, Q. (Hershey, PA: IGI Global), 149–175.
Dutot, P-F., Mounié, G., and Trystram, D.. 2004. “Scheduling Parallel Tasks Approximation Algorithms.” In Handbook of Scheduling. Algorithms, Models and Performance Analysis, edited by Leung, J.Y-T. (Boca Raton, FL: Chapman & Hall/CRC), 26-1–26-24.
Science. 2005. “Editorial: So Much More to Know.” Science 309: 78–102.
El-Ghazawi, T., Carlson, W., Sterling, T., and Yelick, K.. 2005. UPC. Distributed Shared Memory Programming. Hoboken, NJ: John Wiley & Sons, Inc.
Endy, D. and Brent, R.. 2001. “Modelling Cellular Behaviour.” Nature 409: 391–395.
Fatahalian, K. and Houston, M.. 2008. “A Closer Look at GPUs.” Communications of the ACM 51 (10): 50–57.
Feng, T. Y. 1972. “Some Characteristics of Associative/Parallel Processing.” Proc. of the 1972 Sagamore Computing Conference, 5–16.
Feng, T. Y. 1981. “A Survey of Interconnection Networks.” IEEE Computer, December: 12–27.
Feo, J. T., ed. 1993. A Comparative Study of Parallel Programming Languages: The Salishan Problems. Amsterdam, The Netherlands: North-Holland.
Fich, F. E. 1993. “The Complexity of Computation on the Parallel Random Access Machine.” In Synthesis of Parallel Algorithms, edited by Reif, J. H. (San Mateo, CA: Morgan Kaufmann), 843–899.
Fich, F. E., Ragde, P., and Wigderson, A.. 1988. “Relations between Concurrent-write Models of Parallel Computation.” SIAM Journal on Computing 17: 606–627.
Fishman, G. S. 1996. Monte Carlo: Concepts, Algorithms and Applications. New York: Springer-Verlag.
Flatt, H. P. and Kennedy, K.. 1989. “Performance of Parallel Processors.” Parallel Computing 12: 1–20.
Flynn, M. J. 1966. “Very High Speed Computers.” Proc. IEEE 54: 1901–1909.
Flynn, M. J. 1972. “Some Computer Organizations and Their Effectiveness.” IEEE Transactions on Computers C-21: 948–960.
Flynn, M. J. 2011. “Flynn's Taxonomy.” In Encyclopedia of Parallel Computing, Vols 1–4 (New York: Springer-Verlag), 689–697.
Fortune, S. and Wyllie, J.. 1978. “Parallelism in Random Access Machines.” Proc. 10th Symp. Theory Computing. ACM, New York, 114–118.
Foster, I. T. 1995. Designing and Building Parallel Programs. Concepts and Tools for Parallel Software Engineering. Addison-Wesley, Reading, MA, http://www.mcs.anl.gov/~itf/dbpp/.
Foster, I. and Kesselman, C., eds. 2004. The Grid 2: Blueprint for a New Computing Infrastructure, 2nd edn. San Francisco, CA: Elsevier.
Fountain, T. J. 1994. Parallel Computing Principles and Practice. Cambridge: Cambridge University Press.
Fox, G. C., Williams, R. D., and Messina, P. C.. 1994. Parallel Computing Works! San Francisco, CA: Morgan Kaufmann.
Francez, N. 1980. “Distributed Termination.” ACM Trans. Program. Lang. Syst. 2 (1): 42–55.
Frank, S., Burkhardt, H., and Rothnie, J.. 1993. “The KSR1: Bridging the Gap between Shared Memory and MPPs.” Proc. of the COMPCON Digest of Papers, 285–294.
Furst, M., Saxe, J. B., and Sipser, M.. 1984. “Parity, Circuits, and the Polynomial-time Hierarchy.” Mathematical Systems Theory 17: 13–27.
Gabriel, E., Fagg, G. E., Bosilca, G., et al. 2004. “Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation.” Proc. 11th European PVM/MPI Users’ Group Meeting, September 2004, Budapest, Hungary, 97–104.
Gajski, D., Padua, D. A., Kuck, D. J., and Kuhn, R. H.. 1982. “A Second Opinion on Data Flow Machines and Languages.” IEEE Computer 15 (2): 58–69.
Galvin, P. B., Gagne, G., and Silberschatz, A.. 2013. Operating System Concepts, 9th edn. New York: John Wiley & Sons, Inc.
Gara, A. 2005. “Overview of the Blue Gene/L System Architecture.” IBM Journal of Research and Development 49 (2/3): 195–212.
Gara, A. and Moreira, J. E.. 2011. “IBM Blue Gene ‘supercomputer’.” In Encyclopedia of Parallel Computing, vol. 2, edited by Padua, D. A. (New York: Springer-Verlag), 891–900.
Garey, M. R. and Johnson, D. S.. 1979. Computers and Intractability. A Guide to the Theory of NP-Completeness. New York: W. H. Freeman and Co.
Garland, M. 2011. “NVIDIA GPU.” In Encyclopedia of Parallel Computing, vol. 3, edited by Padua, D. A. (New York: Springer-Verlag), 1339–1345.
Gaudiot, J. and Bic, L.. 1989. Advanced Topics in Data-flow Computing. Englewood Cliffs, NJ: Prentice Hall.
Gebali, F. 2011. Algorithms and Parallel Computing. Hoboken, NJ: John Wiley & Sons, Inc.
Geist, A., Beguelin, A., Dongarra, J., et al. 1994. PVM: Parallel Virtual Machine: A User's Guide and Tutorial for Networked Parallel Computing. Cambridge, MA: The MIT Press.
Geist, A. 2011. “PVM (Parallel Virtual Machine).” In Encyclopedia of Parallel Computing, vol. 3, edited by Padua, D. A. (New York: Springer-Verlag), 1647–1651.
Gent, P. R., Danabasoglu, G., Donner, L. J., et al. 2011. “The Community Climate System Model Version 4.” Journal of Climate 24 (19): 4973–4991.
Ghosh, S. 2007. Distributed Systems. An Algorithmic Approach. Boca Raton, FL: Chapman & Hall/CRC.
Gibbons, A. 1993. “An Introduction to Distributed Memory Models of Parallel Computation.” In Lectures on Parallel Computation, edited by Gibbons, A. and Spirakis, P. (Cambridge: Cambridge University Press), 197–226.
Gibbons, A. and Rytter, W.. 1988. Efficient Parallel Algorithms. Cambridge: Cambridge University Press.
Gibbons, A. and Spirakis, P., eds. 1993. Lectures on Parallel Computation. Cambridge: Cambridge University Press.
Gilge, M. 2012. “IBM System Blue Gene Solution: Blue Gene/Q Application Development.” March. www.ibm.com/redbooks/.
Glauert, J. A. 1978. “A Single Assignment Language for Dataflow Computing.” Master's Thesis, Manchester, UK: University of Manchester.
Goedecker, S. and Hoisie, A.. 2001. Performance Optimization of Numerically Intensive Codes. Philadelphia, PA: SIAM Publishing Company.
Goldschlager, L. M. 1982. “A Universal Interconnection Pattern for Parallel Computers.” Journal of the ACM 29: 1073–1086.
Goodman, S. E. and Hedetniemi, S. T.. 1977. Introduction to Design and Analysis of Algorithms. New York: McGraw-Hill.
Gottlieb, A., Grishman, R., Kruskal, C. P., et al. 1983. “The NYU Ultracomputer—Designing a MIMD Shared Memory Parallel Computer.” IEEE Transactions on Computers C-32 (2): 175–189.
Gottlieb, A. 2011. “Ultracomputer, NYU.” In Encyclopedia of Parallel Computing, vol. 4, edited by Padua, D. A. (New York: Springer-Verlag), 2095–2103.
Graham, R. L., Shipman, G. M., Barrett, B. W., et al. 2006. “Open MPI: A High-performance, Heterogeneous MPI.” Proc. 5th International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Networks, September 2006, Barcelona, Spain, 1–9.
Grama, A., Gupta, A., Karypis, G., and Kumar, V.. 2003. Introduction to Parallel Computing, 2nd edn. Harlow, UK: Addison-Wesley.
Grama, A. Y., Gupta, A., and Kumar, V.. 1993. “Isoefficiency: Measuring the Scalability of Parallel Algorithms and Architectures.” IEEE Parallel and Distributed Technology 1 (3): 12–21.
Grama, A. and Kumar, V.. 2008. “Scalability of Parallel Programs.” In Handbook of Parallel Computing. Models, Algorithms and Applications, edited by Rajasekaran, S. and Reif, J. (Boca Raton, FL: Chapman & Hall/CRC), 43-1–43-16.
Greenlaw, R. 1993. “Polynomial Completeness and Parallel Computation.” In Synthesis of Parallel Algorithms, edited by Reif, J. H. (San Mateo, CA: Morgan Kaufmann), 901–953.
Greenlaw, R., Hoover, H. J., and Ruzzo, W. L.. 1995. Limits to Parallel Computation: P-Completeness Theory. Oxford: Oxford University Press. www.cs.armstrong.edu/~greenlaw/research/PARALLEL/.
Gropp, W. 2011. “MPI (Message Passing Interface).” In Encyclopedia of Parallel Computing, vol. 3, edited by Padua, D. A. (New York: Springer-Verlag), 1184–1190.
Gropp, W., Huss-Lederman, S., Lumsdaine, A., et al. 1998. MPI-The Complete Reference: Vol. 2. The MPI Extensions, 2nd edn. Cambridge, MA: MIT Press.
Gropp, W., Lusk, E., and Skjellum, A.. 1999. Using MPI. Portable Parallel Programming with the Message-passing Interface, 2nd edn, Cambridge, MA: MIT Press.
Gropp, W., Lusk, E., and Thakur, R.. 1999. Using MPI-2. Advanced Features of the Message-passing Interface, 2nd edn. Cambridge, MA: MIT Press.
Gupta, A. and Kumar, V.. 1993. “Performance Properties of Large Scale Parallel Systems.” Journal of Parallel and Distributed Computing 19: 234–244.
Gurd, J. R., Kirkham, C., and Watson, J.. 1985. “The Manchester Prototype Dataflow Computer.” Communications of the ACM 28 (1): 34–52.
Gustafson, J. L. 1988. “Reevaluating Amdahl's Law.” Communications of the ACM 31 (5): 532–533.
Gustafson, J. L., Montry, G. R., and Benner, R. E.. 1988. “Development of Parallel Methods for a 1024-processor Hypercube.” SIAM Journal on Scientific and Statistical Computing 9 (4): 609–638.
Gustafson, J. L. 1992. “The Consequences of Fixed Time Performance Measurement.” Proc. of the 25th Hawaii International Conference on System Sciences, Vol. III, 113–124.
Gustafson, J. L. 2011. “Brent's Theorem.” In Encyclopedia of Parallel Computing, vol. 1, edited by Padua, D. A. (New York: Springer-Verlag), 182–185.
Gustafson, J. L. 2011. “Moore's Law.” In Encyclopedia of Parallel Computing, vol. 3, edited by Padua, D. A. (New York: Springer-Verlag), 1177–1184.
Hager, G. and Wellein, G.. 2011. Introduction to High Performance Computing for Scientists and Engineers. Boca Raton, FL: Chapman & Hall/CRC.
Halfhill, T. R. 2008. “Parallel Processing with CUDA.” Microprocessor Report, January 28: 1–8 (www.MPRonline.com).
Hamacher, V. V., Vranesic, Z. G., and Zaky, S. G.. 2001. Computer Organization, 5th edn. New York: McGraw-Hill.
Handler, W. 1977. “The Impact of Classification Schemes on Computer Architecture.” Proc. of the International Conference on Parallel Processing, August, 7–15.
Handy, J. 1998. The Cache Memory Book, 2nd edn. Orlando, FL: Academic Press.
Harris, T. J. 1994. “A Survey of PRAM Simulation Techniques.” ACM Computing Surveys 26: 187–206.
Hennessy, J. L. and Patterson, D. A.. 2007. Computer Architecture. A Quantitative Approach, 4th edn. San Francisco, CA: Morgan Kaufmann.
Hensgen, D., Finkel, R., and Manber, U.. 1988. “Two Algorithms for Barrier Synchronization.” International Journal of Parallel Programming 17 (1): 1–16.
Herley, K. T. and Bilardi, G.. 1988. “Deterministic Simulations of PRAMs on Bounded-degree Networks.” Proc. of 26th Annual Allerton Conference on Communication, Control and Computation, Monticello, IL, 1084–1093.
Herlihy, M. and Shavit, N.. 2008. The Art of Multiprocessor Programming. Burlington, MA: Morgan Kaufmann.
Heroux, M. A., Raghavan, P., and Simon, H. D., eds. 2006. Parallel Processing for Scientific Computing. Philadelphia, PA: SIAM Publishing Company.
Hicks, J., Chiou, D., Ang, B., and Arvind. 1992. Performance Studies of the Monsoon Dataflow Processor. CSG Memo 345-2, MIT, October.
Hill, M. 1998. “Multiprocessors Should Support Simple Memory-consistency Models.” IEEE Computer Magazine 31: 28–34.
Hillis, D. 1985. The Connection Machine. Cambridge, MA: MIT Press.
Hiraki, K., Nishida, K., Sekiguchi, S., Shimada, T., and Tiba, T.. 1987. “The SIGMA-1 Dataflow Supercomputer: A Challenge for New Generation Supercomputing Systems.” Journal of Information Processing 10 (4): 219–226.
Hoare, C. A. R. 1974. “Monitors, an Operating System Structuring Concept.” Communications of the ACM 17: 549–557; Erratum: Communications of the ACM 18 (1975): 95.
Hoare, C. A. R. 1978. “Communicating Sequential Processes.” Communications of the ACM 21 (8): 666–677.
Hoffman, F. M. and Hargrove, W. W.. 1999. “Multivariate Geographic Clustering Using a Beowulf-style Parallel Computer.” Proc. of the International Conference on Parallel and Distributed Processing Techniques and Applications, June, 1292–1298.
Hromkovič, J. 2003. Algorithmics for Hard Problems. Introduction to Combinatorial Optimization, Randomization, Approximation and Heuristics. Berlin: Springer-Verlag.
Hwang, K. 1993. Advanced Computer Architecture, Parallelism, Scalability, Programmability. New York: McGraw-Hill.
Hwang, K. and Xu, Z.. 1998. Scalable Parallel Computing. New York: McGraw-Hill.
Hwang, K., Fox, G. C., and Dongarra, J. J.. 2012. Distributed and Cloud Computing. Waltham, MA: Morgan Kaufmann.
Hyndman, D. and Hyndman, D.. 2009. Natural Hazards and Disasters, 2nd edn. Belmont, CA: Brooks/Cole.
Inmos Ltd. 1988. Occam 2 Reference Manual. Englewood Cliffs, NJ: Prentice-Hall.
International Human Genome Sequencing Consortium. 2001. “Initial Sequencing and Analysis of the Human Genome.” Nature 409: 860–921.
International Organization for Standardization, Geneva. 1996. Information Technology-Portable Operating System Interface (POSIX) – Part 1: System Application Program Interface (API) [C Language], December.
JáJá, J. 1992. An Introduction to Parallel Algorithms. Reading, MA: Addison-Wesley.
Jha, S. K. and Jana, P. K.. 2011. Study and Design of Parallel Algorithms for Interconnection Networks. Saarbrücken, Germany: Lambert Academic Publishing.
Johnson, M. 1991. Superscalar Microprocessor Design. Upper Saddle River, NJ: Prentice-Hall.
Jones, G. A. and Goldsmith, M.. 1989. Programming in Occam 2, 2nd edn. Englewood Cliffs, NJ: Prentice Hall.
Jordan, H. and Alaghband, G.. 2003. Fundamentals of Parallel Processing. Upper Saddle River, NJ: Prentice Hall.
Kalos, M. H. and Whitlock, P. A.. 2008. Monte Carlo Methods, 2nd edn. Weinheim: Wiley-VCH Verlag.
Kalyanaraman, A., Emrich, S. J., Schnable, P. S., and Aluru, S.. 2007. “Assembling Genomes on Large-scale Parallel Computers.Journal of Parallel and Distributed Computing 67, 1240–1255.Google Scholar
Karniadakis, G. E. and Kirby, R. M. II. 2007. Parallel Scientific Computing in C++ and MPI. A Seamless Approach to Parallel Algorithms and Their Implementation. New York: Cambridge University Press.
Karp, A. H. and Flatt, H. P.. 1990. “Measuring Parallel Processor Performance.Communications of the ACM 33 (5): 539–543.Google Scholar
Karp, R. M. and Ramachandran, V.. 1990. “Parallel Algorithms for Shared-memory Machines.” In Handbook of Theoretical Computer Science, vol. A, edited by van Leeuwen, J. (Amsterdam, The Netherlands: Elsevier), 870–941.
Keller, R., Kramer, D., Weiss, J-P., eds. 2013. Facing the Multicore-challenge III. Aspects of New Paradigms and Technologies in Parallel Computing. Lecture Notes in Computer Science 7686. Berlin, Heidelberg: Springer-Verlag.
Kennedy, K. and Allen, J. R.. 2001. Optimizing Compilers for Modern Architectures: A Dependence-based Approach. San Francisco, CA: Morgan Kaufmann Pub.
Kessler, R. E. and Schwarzmeier, J. L.. 1993. “Cray T3D: A New Dimension for Cray Research.Proc. of the IEEE Computer Society International Conference, February, 176–182.Google Scholar
Kiris, C., Housman, J., Gusman, M., et al. 2011. “Best Practices for Aero-Database CFD Simulations of Ares V Ascent.” In 49th AIAA Aerospace Sciences Meeting, 1–21.Google Scholar
Kirk, D. B. and Hwu, W-M. W.. 2013. Programming Massively Parallel Processors. A Hands-on Approach, 2nd edn. Waltham, MA: Morgan Kaufmann.
Klie, H., Bangerth, W., Gail, X., et al. 2006. “Models, Methods and Middleware for Grid-enabled Multiphysics Oil Reservoir Management.Engineering with Computers 22 (3–4): 349–370.Google Scholar
Knuth, D. E. 1971. “Optimum Binary Search Trees.Acta Informatica 1 (1): 14–25.Google Scholar
Knuth, D. E. 1998. The Art of Computer Programming, Vol. 3. Sorting and Searching, 2nd edn. Reading, MA: Addison-Wesley.
Kodama, C., Terai, M., Noda, A. T., et al. 2014. “Scalable Rank-mapping Algorithm for an Icosahedral Grid System on the Massive Parallel Computer with a 3-D Torus Network.Parallel Computing 40: 362–373.Google Scholar
Koelbel, C. H., Loveman, D. B., Schreiber, R. S., Steele, G. L. Jr., and Zosel, M. E.. 1997. The High Performance Fortran Handbook. Cambridge, MA: MIT Press.
Komornicki, A., Mullen-Schulz, G., and Landon, D., 2009. Roadrunner: Hardware and Software Overview, IBM Technical Support Organization. www.redbooks.ibm.com/redpapers/pdfs/redp4477.pdf.Google Scholar
Kontoghiorghes, E. J. ed. 2006. Handbook of Parallel Computing and Statistics. Boca Raton, FL: Chapman & Hall/CRC.
Kruskal, C. P. and Snir, M.. 1986. “A Unified Theory of Interconnection Network Structure.Theoretical Computer Science 48 (3): 75–94.Google Scholar
Kshemkalyani, A. D. and Singhal, M.. 2008. Distributed Computing. Cambridge: Cambridge University Press.
Kučera, L. 1982. “Parallel Computation and Conflicts in Memory Access.Information Processing Letters 14: 93–96.Google Scholar
Kumar, V., Grama, A., Gupta, A., and Karypis, G., 1994. Introduction to Parallel Computing. Design and Analysis of Algorithms. Redwood City, CA: Benjamin/ Cummings.
Kumar, V. and Gupta, A.. 1994. “Analyzing Scalability of Parallel Algorithms and Architectures.Journal of Parallel and Distributed Computing 22: 379–391.Google Scholar
Kumar, V. and Singh, V.. 1991. “Scalability of Parallel Algorithms for the All-pairs Shortest-path Problem.Journal of Parallel and Distributed Computing 13: 124–138.Google Scholar
Kung, H. T. 1988. VLSI Array Processors. Upper Saddle River, NJ: Prentice Hall.
Kung, H. T. and Leiserson, C. E.. 1978. “Systolic Arrays (for VLSI).” In Sparse Matrix Proceedings, Knoxville, TN, edited by Duff, I. S. and Stewart, G. W. (Philadelphia, PA: Society for Industrial and Applied Mathematics), 256–282.
Kurzak, J., Bader, D. A., and Dongarra, J., eds. 2011. Scientific Computing with Multicore and Accelerators. Boca Raton, FL: Chapman & Hall/CRC.
Kwok, Y-K. and Ahmad, I.. 1999. “Benchmarking and Comparison of the Task Graph Scheduling Algorithms.Journal of Parallel and Distributed Computing 59: 381–422.Google Scholar
Ladner, R. E. 1975. “The Circuit Value Problem Is Log Space Complete for P.SIGACT News 7 (1): 18–20.Google Scholar
Lansdowne, S. T., Cousins, R. E., and Wilkinson, D. C.. 1987. “Reprogramming the Sieve of Eratosthenes.Computer, August: 90–91.Google Scholar
Lastovetsky, A. L. 2003. Parallel Computing on Heterogeneous Networks. Hoboken, NJ: John Wiley & Sons, Inc.
Laudon, J. P. and Lenoski, D.. 1997. “The SGI Origin: A ccNUMA Highly Scalable Server.Proc. of the 24th International Symposium on Computer Architecture, 241–251.Google Scholar
Lawrie, D. H. 1975. “Access and Alignment of Data in an Array Processor.” IEEE Transactions on Computers C-24 (12): 1145–1155.
Lea, D. 1997. Concurrent Programming in Java. Design Principles and Patterns. Reading, MA: Addison-Wesley.
Karp, R. M. and Ramachandran, V.. 1990. “Parallel Algorithms for Shared-memory Machines.” In Handbook of Theoretical Computer Science, vol. A, edited by van Leeuwen, J. (Amsterdam, The Netherlands: Elsevier), chap. 17;
Valiant, L. G. 1990. “General Purpose Parallel Architectures.” In Handbook of Theoretical Computer Science, vol. A, edited by van Leeuwen, J. (Amsterdam, The Netherlands: Elsevier), chap. 18.
Leighton, F. T. 1992. Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes. San Mateo, CA: Morgan Kaufmann.
Leiserson, C. E. 1985. “Fat-trees: Universal Networks for Hardware-efficient Supercomputing.IEEE Transactions on Computers C-34 (10): 892–901.Google Scholar
Leung, J. Y-T., ed. 2004. Handbook of Scheduling. Algorithms, Models and Performance Analysis. Boca Raton, FL: Chapman & Hall/CRC.
Levesque, J. and Wagenbreth, G.. 2011. High Performance Computing. Programming and Applications. Boca Raton, FL: Chapman & Hall/CRC.
Lewis, B. and Berg, D.. 1998. Multithreaded Programming with Pthreads. Mountain View, CA: Sun Microsystems Press.
Li, K. 1986. “Shared Virtual Memory on Loosely Coupled Multiprocessors.” Ph.D. thesis, Department of Computer Science, Yale University.
Li, K. and Hudak, P.. 1989. “Memory Coherence in Shared Virtual Memory Systems.ACM Transactions on Computer Systems 7: 321–359.Google Scholar
Lillevik, S. L. 1991. “The Touchstone 30 Gigaflop DELTA Prototype.DMCC April: 671–677.Google Scholar
Lin, C. and Snyder, L.. 2009. Principles of Parallel Programming. Boston, MA: Addison-Wesley.
Lindholm, E., Nickolls, J., Oberman, S., and Montrym, J.. 2008. “NVIDIA Tesla: A Unified Graphics and Computing Architecture.” IEEE Micro 28 (2): 39–55.
Loft, R., Andersen, A., Bryan, F., et al. 2015. “Yellowstone: A Dedicated Resource for Earth System Science.” In Contemporary High Performance Computing: From Petascale toward Exascale, edited by Vetter, J. S. (Boca Raton, FL: Chapman & Hall/CRC), vol. II, 185–224.
Lynch, N. A. 1996. Distributed Algorithms. San Francisco, CA: Morgan Kaufmann.
Lysne, O. and Sem-Jacobsen, F. O.. 2011. “Networks, Multistage.” In Encyclopedia of Parallel Computing, vol. 3, edited by Padua, D. A. (New York: Springer-Verlag), 1316–1321.
Makino, J. 2002. “An Efficient Parallel Algorithm for O(N²) Direct Summation Method and Its Variations on Distributed-memory Parallel Machines.” New Astronomy 7: 373–384.
Manber, U. 1989. Introduction to Algorithms—A Creative Approach. Boston, MA: Addison-Wesley.
Mandelbrot, B. B. 1980. “Fractal Aspects of the Iteration of z → λz(1 − z) for complex λ, z.Annals of the New York Academy of Sciences 357: 249–259.Google Scholar
Marinescu, D. C. and Rice, J. R.. 1994. “On High Level Characterization of Parallelism.Journal of Parallel and Distributed Computing 20: 107–113.Google Scholar
Marsh, D. R., Mills, M. J., Kinnison, D. E., et al. 2013. “Climate change from 1850 to 2005 simulated in CESM1 (WACCM).Journal of Climate, 26(19): 7372–7391.Google Scholar
Matsu'ura, M., Furumura, T., Okuda, H., et al. 2006. “Integrated Predictive Simulation System for Earthquake and Tsunami Disaster.SIAM 12th Conference on Parallel Processing for Scientific Computing (PP06), San Francisco, 2006, and also: Annual Report of the Earth Simulator Center, April 2005–March 2006, 407–410.Google Scholar
Mattson, T. G. 2003. “How Good Is OpenMP?Scientific Programming 11: 81–93.Google Scholar
Mattson, T. G., Sanders, B. A., and Massingill, B. L.. 2005. Patterns for Parallel Programming. Boston, MA: Addison-Wesley.
McKee, S. A. and Wisniewski, R. W.. 2011. “Memory Wall.” In Encyclopedia of Parallel Computing, vol. 3, edited by Padua, D. A. (New York: Springer-Verlag), 1110–1116.
Mellor-Crummey, J. M. and Scott, M. L.. 1991. “Algorithms for Scalable Synchronization on Shared-memory Multiprocessors.ACM Transactions on Computer Systems 9 (1): 21–65.Google Scholar
Message Passing Interface Forum. 1998. “MPI2: A Message Passing Interface Standard.” International Journal of High Performance Computing Applications 12 (1–2): 1–299.
Message Passing Interface Forum. 2012. “MPI: A Message-Passing Interface Standard, Version 3.0.” High Performance Computing Center Stuttgart (HLRS), September 21.
Milano, J. and Lembke, P.. 2012. “IBM System Blue Gene Solution: Blue Gene/Q Hardware Overview and Installation Planning.” March. www.ibm.com/redbooks.
Miller, R. and Boxer, L.. 2005. Algorithms. Sequential and Parallel. A Unified Approach, 2nd edn. Hingham, MA: Charles River Media Inc.
Mizuta, R., Uchiyama, T., Kamiguchi, K., Kitoh, A., and Noda, A.. 2005. “Changes in Extremes Indices over Japan due to Global Warming Projected by a Global 20-km-mesh Atmospheric Model.Scientific Online Letters on the Atmosphere (SOLA) 1: 153–156. doi: 10.2151/sola.2005-040.Google Scholar
Magoulès, F., Pan, J., Tan, K-A., and Kumar, A.. 2009. Introduction to Grid Computing. Boca Raton, FL: Chapman & Hall/CRC.
Moin, P. and Kim, J.. 1997. “Tackling Turbulence with Supercomputers.Scientific American 276: 62–68.Google Scholar
Moldovan, D. I. 1993. Parallel Processing from Applications to Systems. San Mateo, CA: Morgan Kaufmann.
Monacelli, G., Sessa, F., and Milite, A.. 2004. “An Integrated Approach to Evaluate Engineering Simulations and Ergonomic Aspects of a New Vehicle in a Virtual Environment: Physical and Virtual Correlation Methods.FISITA 2004 30th World Automotive Congress, 2004, Barcelona, Spain, 23–27.Google Scholar
Monien, B. and Sudborough, H.. 1988. “Comparing Interconnection Networks.Lecture Notes in Computer Science 324: 139–153.Google Scholar
Moore, G. E. 1965. “Cramming More Components onto Integrated Circuits.Electronics Magazine 38 (8): 114–117.Google Scholar
Morse, H. S. 1994. Practical Parallel Computing. Cambridge, MA: AP Professional.
Mukherjee, S. S., Banno, P., Lang, S., Spink, A., and Webb, D.. 2001. “The Alpha 21364 Network Architecture.Proc. of the Symposium on Hot Interconnects, August, 113–117.Google Scholar
Nakata, T., Kanoh, Y., Tatsukawa, K., et al. 1998. “Architecture and the Software Environment of Parallel Computer Cenju-4.NEC Research and Development Journal 39: 385–390.Google Scholar
nCUBE Corporation. 1990. nCUBE Processor Manual.
Nickolls, J. R. 1990. “The Design of the MasPar MP-1: A Cost-effective Massively Parallel Computer.Proc. COMPCON Digest of Paper, 25–28.Google Scholar
Nicol, D. M. and Willard, F. H.. 1988. “Problem Size, Parallel Architecture, and Optimal Speedup.Journal of Parallel and Distributed Computing 5: 404–420.Google Scholar
Nikhil, R. S. and Arvind. 1989. “Can Dataflow Subsume von Neumann Computing?” Proc. of the 16th Annual International Symposium on Computer Architecture, 262–272.
Nibhanupudi, M. V., Norton, C. D., and Szymanski, B. K.. 1995. “Plasma Simulation on Networks of Workstations Using the Bulk Synchronous Parallel Model.” Proc. of the Conference on Parallel and Distributed Processing Techniques and Applications, Athens, Georgia, 13–22.
Null, L. and Lobur, J.. 2015. The Essentials of Computer Organization and Architecture, 4th edn. Burlington, MA: Jones & Bartlett Learning.
Nussbaum, D. and Agarwal, A.. 1991. “Scalability of Parallel Machines.Communications of the ACM 34 (3): 57–61.Google Scholar
Nuth, P. R. and Dally, W. J.. 1992. “The J-machine Network.Proc. of the International Conference on Computer Design, October 1992, 420–423.Google Scholar
NVIDIA. 2015. CUDA C Programming Guide, PG-02829-001 v7.5, September. http://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf.
Nyland, L., Harris, M., and Prins, J.. 2007. “Fast N-body Simulations with CUDA.” In GPU Gems 3 (31), edited by Nguyen, H. (Addison-Wesley, eBook-BBL), 677–695.
Oden, J. T., Belytschko, T., Fish, J., et al. 2006. “Revolutionizing Engineering Science through Simulation.National Science Foundation Blue Ribbon Panel Report 65: 1–66.Google Scholar
OpenMP Application Program Interface, Version 2.5, May 2005. www.openmp.org.
OpenMP Application Program Interface, Version 3.0, May 2008. www.openmp.org.
OpenMP Application Program Interface, Version 3.1, July 2011. www.openmp.org.
OpenMP Application Program Interface, Version 4.0, July 2013. www.openmp.org.
OpenMP Application Program Interface, Version 4.1, July 2015. www.openmp.org.
Pacheco, P. S. 1997. Parallel Programming with MPI. San Francisco, CA: Morgan Kaufmann.
Pacheco, P. S. 2011. An Introduction to Parallel Programming. Burlington, MA: Morgan Kaufmann.
Padua, D. A. ed. 2011. Encyclopedia of Parallel Computing, Vols 1–4 (New York: Springer-Verlag).
Palmer, J. F. 1986. “The NCUBE Family of Parallel Supercomputers.Proc. of the International Conference on Computer Design, p. 107.Google Scholar
Papadimitriou, C. H. 1994. Computational Complexity. Reading, MA: Addison-Wesley, chap. 15, “Parallel Computing.”
Parberry, I. 1987. Parallel Complexity Theory. London: Pitman/Wiley.
Parhami, B. 1999. Introduction to Parallel Processing. Algorithms and Architectures. New York: Plenum Press.
Parnas, D. L. 1975. “On a Solution to the Cigarette Smokers’ Problem without Conditional Statements.Communications of the ACM 18: 181–183.Google Scholar
Paterson, M. S. 1990. “Improved Sorting Networks with O(log N) Depth.” Algorithmica 5 (1–4): 75–92.
Patil, S. 1971. Limitations and Capabilities of Dijkstra's Semaphore Primitives for Coordination among Processes. Technical report, Massachusetts Institute of Technology.Google Scholar
Patterson, D. A. and Hennessy, J. L.. 2013. Computer Organization and Design, 5th edn. Burlington, MA: Morgan Kaufmann.
Peitgen, H.-O. and Richter, P.. 1986. The Beauty of Fractals. Heidelberg: Springer-Verlag.
Pfister, G. F. 1998. In Search of Clusters. 2nd edn. Upper Saddle River, NJ: Prentice Hall.
Pfister, G. F., Brantley, W. C., George, D. A., et al. 1985. “The IBM Research Parallel Processor Prototype (RP3): Introduction and Architecture.Proc. of 1985 International Conference on Parallel Processing, 764–771.Google Scholar
Preparata, F. P. and Vuillemin, J.. 1981. “The Cube-connected Cycles: A Versatile Network for Parallel Computation.Communications of the ACM 24 (5): 300–309.Google Scholar
President's Information Technology Advisory Committee. 2005. Computational Science: Ensuring America's Competitiveness, June: 1–117.
Quinn, M. J. 1987. Designing Efficient Algorithms for Parallel Computers. New York: McGraw-Hill.
Quinn, M. J. 1994. Parallel Computing. Theory and Practice, 2nd edn. New York: McGraw-Hill.
Quinn, M. J. 2004. Parallel Programming in C with MPI and OpenMP, New York: McGraw-Hill.
Rajasekaran, S. and Reif, J., eds. 2008. Handbook of Parallel Computing. Models, Algorithms and Applications. Boca Raton, FL: Chapman & Hall/CRC.
Rajasekaran, S., Fiondella, L., Ahmed, M., and Ammar, R. A., eds. 2014. Multicore Computing. Boca Raton, FL: Chapman & Hall/CRC.
Ranade, A. G. 1987. “How to Emulate Shared Memory.Proc. of 28th Annual Symposium on the Foundations of Computer Science, Los Angeles, CA, 1987, 185–192.Google Scholar
Rauber, T. and Rünger, G.. 2010. Parallel Programming for Multicore and Cluster Systems. Berlin: Springer-Verlag.
Reif, J. H., ed. 1993. Synthesis of Parallel Algorithms. San Mateo, CA: Morgan Kaufmann.
Reinders, J. R. 2011. “Systolic Arrays.” In Encyclopedia of Parallel Computing, vol. 4, edited by Padua, D. A. (New York: Springer-Verlag), 2002–2011.
Reinders, J. R. 2011. “Warp and iWarp.” In Encyclopedia of Parallel Computing, vol. 4, edited by Padua, D. A. (New York: Springer-Verlag), 2150–2159.
Reingold, E. M., Nievergelt, J., and Deo, N.. 1977. Combinatorial Algorithms: Theory and Practice. New York: Prentice Hall.
Riesen, R. and Maccabe, A. B.. 2011. “MIMD (Multiple Instruction, Multiple Data) Machines.” In Encyclopedia of Parallel Computing, vol. 3, edited by Padua, D. A. (New York: Springer-Verlag), 1140–1149.
Robert, Y. 2011. “Task Graph Scheduling.” In Encyclopedia of Parallel Computing, vol. 4, edited by Padua, D. A. (New York: Springer-Verlag), 2013–2025.
Roberts, M. J., Vidale, P. L., Mizielinski, M. S., et al. 2015. “Tropical Cyclones in the UPSCALE Ensemble of High-Resolution Global Climate Models.Journal of Climate 28(2): 574–596.Google Scholar
Rochkind, M. J. 2004. Advanced UNIX Programming, 2nd edn. Boston, MA: Addison-Wesley.
Roosta, S. H. 2000. Parallel Processing and Parallel Algorithms. Theory and Computation. New York: Springer-Verlag.
Roscoe, A. W. 1998. The Theory and Practice of Concurrency. Upper Saddle River, NJ: Prentice Hall.
Rosner, J. 2015. “Methods of Parallelizing Selected Computer Vision Algorithms for Multi-core Graphics Processors.” Ph.D thesis, Silesian University of Technology, Gliwice, Poland. http://delibra.bg.polsl.pl/dlibra/.
Rumbaugh, J. 1977. “A Dataflow Multiprocessor.IEEE Transactions on Computers C-26: 1087–1095.Google Scholar
Sakai, S., Kodama, Y., and Yamaguchi, Y.. 1991. “Prototype Implementation of a Highly Parallel Dataflow Machine EM-4.” Proc. of the International Parallel Processing Symposium, 1991, 278–286.
Sanders, J. and Kandrot, E.. 2010. CUDA by Example. An Introduction to General-purpose GPU Programming. Upper Saddle River, NJ: Addison-Wesley.
Satoh, M., Tomita, H., Yashiro, H., et al. 2014. “The Non-hydrostatic Icosahedral Atmospheric Model: Description and Development.Progress in Earth and Planetary Science, 1(1): 1.Google Scholar
Savage, J. E. 1998. Models of Computation. Reading, MA: Addison-Wesley.
Savitch, W. J. and Stimson, M. J.. 1979. “Time Bounded Random Access Machines with Parallel Processing.Journal of the ACM 26: 103–118.Google Scholar
Schauser, K. E. and Scheiman, C. J.. 1995. “Experience with Active Messages on the Meiko CS-2.Proc. 9th International Symposium on Parallel Processing, April 1995, 140–149.Google Scholar
Schulz, M., Reuding, T., and Ertl, T.. 1998. “Analyzing Engineering Simulations in a Virtual Environment.IEEE Computer Graphics and Applications 18 (6): 46–52.Google Scholar
Schwartz, J. 1983. A Taxonomic Table of Parallel Computers Based on 55 Designs. New York: Courant Institute, New York University, November 1983.
“Science on a Grand Scale.” 2015. Science & Technology Review, Lawrence Livermore National Laboratory, September, 4–11.
Scott, L. R., Clark, T., and Bagheri, B.. 2005. Scientific Parallel Computing. Princeton, NJ: Princeton University Press.
Scott, S. and Thorson, G.. 1996. “The Cray T3E Network: Adaptive Routing in a High Performance 3D Torus.Proc. of the Symposium on Hot Interconnects, August 1996, 147–156.Google Scholar
Seitz, C. L. 1985. “The Cosmic Cube.Communications of the ACM 28 (1): 22–33.Google Scholar
Sharp, J. A. 1985. Dataflow Computing. New York: John Wiley & Sons, Inc.
Shimokawabe, T. and Aoki, T.. 2010. “Multi-GPU Computing for Next-generation Weather Forecasting – 145.0 TFlops with 3990 GPUs on TSUBAME 2.0.” TSUBAME e-Science Journal (ESJ) 2: 11–16.
Shiva, S. G. 2006. Advanced Computer Architectures. Boca Raton, FL: CRC Press.
Shonkwiler, R. W. and Lefton, L.. 2006. An Introduction to Parallel and Vector Scientific Computing. New York: Cambridge University Press.
Sima, D. 1997. “Superscalar Instruction Issue.IEEE Micro Magazine 17: 28–39.Google Scholar
Singh, J. P., Hennessy, J. L., and Gupta, A.. 1993. “Scaling Parallel Programs for Multiprocessors: Methodology and Examples.IEEE Computer 26 (7): 42–50.Google Scholar
Sinnen, O. 2007. Task Scheduling for Parallel Systems. Hoboken, NJ: John Wiley & Sons, Inc.
Sipser, M. 2006. Introduction to the Theory of Computation, 2nd edn. Boston, MA: Thomson Course Technology.
Skillicorn, D. B. 1988. “A Taxonomy for Computer Architectures.” IEEE Computer 21 (11): 46–57.
Skillicorn, D. B. 2005. Foundations of Parallel Programming. Cambridge: Cambridge University Press.
Skillicorn, D., Hill, J. M. D., and McColl, W. F.. 1997. “Questions and Answers about BSP.Scientific Programming 6 (3): 249–274.Google Scholar
Slotnick, D. L., Borck, W. C., and McReynolds, R. C.. 1967. “The Solomon Computer.Proc. of the AFIPS Spring Joint Computer Conference, 22, New York, 1967, 97–107.Google Scholar
Smith, J. R. 1993. The Design and Analysis of Parallel Algorithms. New York: Oxford University Press.
Snir, M. 1985. “On Parallel Searching.SIAM Journal on Computing 15: 688–708.Google Scholar
Snir, M., Otto, S. W., Huss-Lederman, S., Walker, D. W., and Dongarra, J.. 1998. MPI-The Complete Reference: Vol. 1. The MPI Core, 2nd edn. Cambridge, MA: MIT Press.
Snir, M. 2011. “Reduce and Scan.” In Encyclopedia of Parallel Computing, vol. 4, edited by Padua, D. A. (New York: Springer-Verlag), 1728–1736.
Solihin, Y. 2016. Fundamentals of Parallel Multicore Architecture. Boca Raton, FL: Chapman & Hall/CRC.
Sottile, M. J., Mattson, T. G., and Rasmussen, C. E.. 2010. Introduction to Concurrency in Programming Languages. Boca Raton, FL: Chapman & Hall/CRC.
Stallings, W. 2013. Computer Organization and Architecture, 9th edn. Upper Saddle River, NJ: Pearson Education.
Stallings, W. 2012. Operating Systems. Internals and Design Principles, 8th edn. Upper Saddle River, NJ: Pearson Education.
van der Steen, A. J. and Dongarra, J. J.. 2006, 2007. Overview of Recent Supercomputers. www.top500.org/.
Sterling, T. L., Salmon, J., Becker, D. J., and Savarese, D. F.. 1999. How to Build a Beowulf. Cambridge, MA: MIT Press.
Stojmenović, I. 1996. “Direct Interconnection Networks.” In Parallel and Distributed Computing Handbook, edited by Zomaya, A. Y. (New York: McGraw-Hill), 537–567.
Sullivan, H. and Bashkow, T. R.. 1977. “A Large Scale, Homogeneous, Fully Distributed Parallel Machine.Proc. of the International Symposium on Computer Architecture, 1977, 105–124.Google Scholar
Sun, X-H. and Gustafson, J. L.. 1991. “Toward a Better Parallel Performance Metric.Parallel Computing 17: 1093–1109.Google Scholar
Sun, X-H. and Ni, L. M.. 1990. “Another View of Parallel Speedup.Supercomputing ’90 Proceedings, 324–333.Google Scholar
Sun, X-H. and Ni, L. M.. 1993. “Scalable Problems and Memory-bounded Speedup.Journal of Parallel and Distributed Computing 19: 27–37.Google Scholar
Sun, X-H. and Zhu, J.. 1995. “Performance Considerations of Shared Virtual Memory Machines.IEEE Transactions on Parallel and Distributed Systems 6 (11): 1185–1194.Google Scholar
Sun, X-H. and Rover, D. T.. 1994. “Scalability of Parallel Algorithm-machine Combinations.IEEE Transactions on Parallel and Distributed Systems 5 (6): 599–613.Google Scholar
Talbi, E-G. 2006. Parallel Combinatorial Optimization. Hoboken, NJ: Wiley-Interscience.
Tanenbaum, A. S. 2006. Structured Computer Organization, 5th edn. Upper Saddle River, NJ: Pearson Education, Prentice Hall.
Tanenbaum, A. S. 2009. Modern Operating Systems, 3rd edn. Upper Saddle River, NJ: Prentice Hall.
Tanenbaum, A. S. and van Steen, M.. 2007. Distributed Systems. Principles and Paradigms, 2nd edn. Upper Saddle River, NJ: Pearson Education.
Taubenfeld, G. 2006. Synchronization Algorithms and Concurrent Programming. Harlow, UK: Pearson Education, Prentice Hall.
Tel, G. 1994. Introduction to Distributed Algorithms. Cambridge: Cambridge University Press.
Thekkath, R., Singh, A. P., Singh, J. P., Hennessy, J., and John, S.. 1997. “An Application-driven Evaluation of the Convex Exemplar SP-1200.Proc. of the International Parallel Processing Symposium, June 1997, 8–17.Google Scholar
Thinking Machines Corporation. 1990. The CM-2 Technical Summary. Cambridge, MA: Thinking Machines Corporation.
Torán, J. 1993. “P-completeness.” In Lectures on Parallel Computation, edited by Gibbons, A. and Spirakis, P. (Cambridge: Cambridge University Press), 177–196.
Treleaven, P. C. 1985. “Control-driven, Data-driven and Demand-driven Computer Architecture.Parallel Computing 2 (3): 287–288.Google Scholar
Trono, J. A. and Taylor, W. E.. 2000. “Further comments on ‘A Correct and Unrestrictive Implementation of General Semaphores’.ACM SIGOPS Operating Systems Review 34 (3): 5–10.Google Scholar
Ungerer, T., Robič, B., and Šilc, J.. 2003. “A Survey of Processors with Explicit Multithreading.” ACM Computing Surveys 35 (1): 29–63.
Valiant, L. G. 1990. “A Bridging Model for Parallel Computation.Communications of the ACM 33 (8): 103–111.Google Scholar
Valiant, L. G. 1990. “General Purpose Parallel Architectures.” In Handbook of Theoretical Computer Science, vol. A, edited by van Leeuwen, J. (Amsterdam, The Netherlands: Elsevier), 944–971.
Van-Catledge, F. A. 1989. “Towards a General Model for Evaluating the Relative Performance of Computer Systems.” International Journal of Supercomputer Applications 3 (2): 100–108.
van Emde Boas, P. 1990. “Machine Models and Simulations.” In Handbook of Theoretical Computer Science, Vol. A, edited by van Leeuwen, J. (Amsterdam, The Netherlands: Elsevier), 1–66.
Vazirani, V. V. 2003. Approximation Algorithms. Berlin: Springer-Verlag.
Venter, J. C., Adams, M. D., Myers, E. W., et al. 2001. “The Sequence of the Human Genome.Science 291: 1304–1351.Google Scholar
Vishkin, U. 1983. “Implementation of Simultaneous Memory Address Access in Models that Forbid It.Journal of Algorithms 4: 45–50.Google Scholar
Vishkin, U., Caragea, G. C., and Lee, B. C.. 2008. “Models for Advancing PRAM and Other Algorithms into Parallel Programs for a PRAM-on-chip Platform.” In Handbook of Parallel Computing. Models, Algorithms and Applications, edited by Rajasekaran, S. and Reif, J. (Boca Raton, FL: Chapman & Hall/CRC): 5-1–5-60.
Vos, J. B., Rizzi, A., Darracq, D., and Hirschel, E. H.. 2002. “Navier-Stokes Solvers in European Aircraft Design.Progress in Aerospace Sciences 38: 601–697.Google Scholar
Wah, W. and Akl, S. G.. 1992. “Simulating Multiple Memory Accesses in Logarithmic Time and Linear Space.The Computer Journal 35: 85–88.Google Scholar
Washington, W. M., Buja, L., and Craig, A.. 2009. “The Computational Future for Climate and Earth System Models: On the Path to Petaflop and Beyond.Phil. Trans. R. Soc. A 367: 833–846. doi:10.1098/rsta.2008.0219.Google Scholar
Wilkinson, B. and Allen, M.. 1999. Parallel Programming. Techniques and Applications Using Networked Workstations and Parallel Computers. Upper Saddle River, NJ: Prentice Hall.
Wilson, G. V. 1993. “A Glossary of Parallel Computing Terminology.IEEE Parallel & Distributed Technology February: 52–67.Google Scholar
Wilson, G. V. 1995. Practical Parallel Programming. Cambridge, MA: MIT Press.
Wilson, R. J. 1996. Introduction to Graph Theory, 4th edn. Harlow, UK: Addison Wesley Longman Ltd.
Winter, P. C., Hickey, G. J., and Fletcher, H. L.. 2002. Instant Notes. Genetics, 2nd edn. Milton Park, UK: BIOS Scientific Publishers.
Wolfe, M. 1996. High Performance Compilers for Parallel Computing. Redwood City, CA: Addison-Wesley.
Worley, P. H. 1990. “The Effect of Time Constraints on Scaled Speedup.SIAM Journal on Scientific and Statistical Computing 11 (5): 838–858.Google Scholar
Wulf, W. A. and Bell, C. G.. 1972. “C.mmp – A Multi-mini-processor.” Proc. of AFIPS Conference, 765–777.
Xue, M., Droegemeier, K. K., and Weber, D.. 2008. “Numerical Prediction of High-impact Local Weather: A Driver for Petascale Computing.” In Petascale Computing. Algorithms and Applications, edited by Bader, D. A. (Boca Raton, FL: Chapman & Hall/CRC), 103–124.
Yokokawa, M., Shoji, F., and Hasegawa, Y.. 2015. “The K Computer.” In Contemporary High Performance Computing: From Petascale toward Exascale, edited by Vetter, J. S. (Chapman & Hall/CRC, Boca Raton, FL), vol. II, 115–139.
Zhou, X. 1989. “Bridging the Gap between Amdahl's Law and Sandia Laboratory's Result.Communications of the ACM 32 (8): 1014–1015.Google Scholar
Zorbas, J. R., Reble, D. J., and VanKooten, R. E.. 1989. “Measuring the Scalability of Parallel Computer Systems.Supercomputing ’89 Proc., 832–841.Google Scholar
  • References
  • Zbigniew J. Czech
  • Book: Introduction to Parallel Computing
  • Online publication: 06 January 2017
  • Chapter DOI: https://doi.org/10.1017/9781316795835.011