Abstract
Data-intensive computing brings a set of challenges that only partly overlap with those addressed by typical, and even state-of-the-art, High Performance Computing (HPC) systems. Working with ‘big data’ can involve analyzing thousands of files that need to be rapidly opened, examined and cross-correlated, tasks that classic HPC systems might not be designed to do. Such tasks can be conducted efficiently on a data-intensive supercomputer like the Wrangler supercomputer at the Texas Advanced Computing Center (TACC). Wrangler allows scientists to share and analyze, in a user-friendly manner, the massive collections of data being produced in nearly every field of research today. It was designed to work closely with the Stampede supercomputer, the HPC flagship of TACC, which TOP500 ranks as the tenth most powerful in the world. Wrangler was designed to keep much of what was successful with systems like Stampede, but also to introduce new features such as a very large flash storage system, a very large distributed spinning disk storage system, and high speed network access. These features give users whose data analysis needs were not being met by traditional HPC systems like Stampede a new way to access HPC resources. In this chapter, we provide an overview of the Wrangler data-intensive HPC system along with some of the big data use-cases that it enables.
References
Stampede supercomputer, https://www.tacc.utexas.edu/systems/stampede. Accessed 15 Feb 2015
Wrangler supercomputer, https://www.tacc.utexas.edu/systems/wrangler. Accessed 15 Feb 2015
Extreme Science and Engineering Discovery Environment (XSEDE), https://www.xsede.org/. Accessed 15 Feb 2015
iRODS, http://irods.org/. Accessed 15 Feb 2015
TigerVNC, http://tigervnc.org. Accessed 15 Feb 2015
RStudio, https://www.rstudio.com/. Accessed 15 Feb 2015
Jupyter Notebook, http://jupyter.org/. Accessed 15 Feb 2015
The Hofmann Lab at the University of Texas at Austin, http://cichlid.biosci.utexas.edu/index.html. Accessed 15 Feb 2015
OrthoMCL 2.0.9, https://wiki.gacrc.uga.edu/wiki/OrthoMCL. Accessed 15 Feb 2015
Titan supercomputer, https://www.olcf.ornl.gov/titan/. Accessed 15 Feb 2015
Autotune, http://rsc.ornl.gov/autotune/?q=content/autotune. Accessed 15 Feb 2015
PaleoCore, http://paleocore.org/. Accessed 15 Feb 2015
Hobby-Eberly Telescope Dark Energy Experiment (HETDEX), http://hetdex.org/. Accessed 15 Feb 2015
Visible Integral Field Replicable Unit Spectrograph (VIRUS), http://instrumentation.tamu.edu/virus.html. Accessed 15 Feb 2015
Acknowledgement
We are grateful to the Texas Advanced Computing Center, the National Science Foundation, the Extreme Science and Engineering Discovery Environment, Niall Gaffney (Texas Advanced Computing Center), Rebecca Young (University of Texas at Austin), Denne Reed (University of Texas at Austin), Steven Finkelstein (University of Texas at Austin), and Joshua New (Oak Ridge National Laboratory).
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Salazar, J. (2016). Conquering Big Data Through the Usage of the Wrangler Supercomputer. In: Arora, R. (eds) Conquering Big Data with High Performance Computing. Springer, Cham. https://doi.org/10.1007/978-3-319-33742-5_16
DOI: https://doi.org/10.1007/978-3-319-33742-5_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-33740-1
Online ISBN: 978-3-319-33742-5
eBook Packages: Computer Science, Computer Science (R0)