research-article

Open Access

Taichi: a language for high-performance computation on spatially sparse data structures

Authors:
Yuanming Hu

MIT CSAIL

MIT CSAIL
View Profile

,
Tzu-Mao Li

MIT CSAIL and UC Berkeley

MIT CSAIL and UC Berkeley
View Profile

,
Luke Anderson

MIT CSAIL

MIT CSAIL
View Profile

,
Jonathan Ragan-Kelley

UC Berkeley

UC Berkeley
View Profile

,
Frédo Durand

MIT CSAIL

MIT CSAIL
View Profile

Authors Info & Claims

ACM Transactions on Graphics Volume 38 Issue 6Article No.: 201pp 1–16https://doi.org/10.1145/3355089.3356506

Published:08 November 2019Publication History

ACM Transactions on Graphics

Abstract

3D visual computing data are often spatially sparse. To exploit such sparsity, people have developed hierarchical sparse data structures, such as multi-level sparse voxel grids, particles, and 3D hash tables. However, developing and using these high-performance sparse data structures is challenging, due to their intrinsic complexity and overhead. We propose Taichi, a new data-oriented programming language for efficiently authoring, accessing, and maintaining such data structures. The language offers a high-level, data structure-agnostic interface for writing computation code. The user independently specifies the data structure. We provide several elementary components with different sparsity properties that can be arbitrarily composed to create a wide range of multi-level sparse data structures. This decoupling of data structures from computation makes it easy to experiment with different data structures without changing computation code, and allows users to write computation as if they are working with a dense array. Our compiler then uses the semantics of the data structure and index analysis to automatically optimize for locality, remove redundant operations for coherent accesses, maintain sparsity and memory allocations, and generate efficient parallel and vectorized instructions for CPUs and GPUs.

Our approach yields competitive performance on common computational kernels such as stencil applications, neighbor lookups, and particle scattering. We demonstrate our language by implementing simulation, rendering, and vision tasks including a material point method simulation, finite element analysis, a multigrid Poisson solver for pressure projection, volumetric path tracing, and 3D convolution on sparse grids. Our computation-data structure decoupling allows us to quickly experiment with different data arrangements, and to develop high-performance data structures tailored for specific computational tasks. With ₁<u>¹</u>₀ th as many lines of code, we achieve 4.55× higher performance on average, compared to hand-optimized reference implementations.

Supplemental Material

a201-hu.mp4

mp4

194.2 MB

Download

Available for Download

zip

a201-hu.zip (440.1 KB)

Supplemental files.

References

Mike Acton. 2014. Data-oriented design and C++. (2014). https://www.youtube.com/watch?v=rX0ItVEVjHcGoogle Scholar
Aseem Agarwala. 2007. Efficient Gradient-domain Compositing Using Quadtrees. ACM Trans. Graph. (Proc. SIGGRAPH) 26, 3 (2007), 94.Google ScholarDigital Library
Riyadh Baghdadi, Ulysse Beaugnon, Albert Cohen, Tobias Grosser, Michael Kruse, Chandan Reddy, Sven Verdoolaege, Adam Betts, Alastair F Donaldson, Jeroen Ketema, et al. 2015. PENCIL: A platform-neutral compute intermediate language for accelerator programming. In Parallel Architecture and Compilation. IEEE, 138--149.Google Scholar
Riyadh Baghdadi, Jessica Ray, Malek Ben Romdhane, Emanuele Del Sozzo, Abdurrahman Akkas, Yunming Zhang, Patricia Suriana, Shoaib Kamil, and Saman Amarasinghe. 2019. Tiramisu: A polyhedral compiler for expressing fast and portable code. Code Generation and Optimization (2019), 193--205.Google Scholar
Dan Bailey, Ian Masters, Matt Warner, and Harry Biddle. 2013. Simulating fluids using a coupled voxel-particle data model. In ACM SIGGRAPH 2013 Talks. ACM, 15.Google ScholarDigital Library
Gilbert Louis Bernstein, Chinmayee Shah, Crystal Lemire, Zachary Devito, Matthew Fisher, Philip Levis, and Pat Hanrahan. 2016. Ebb: A DSL for physical simulation on CPUs and GPUs. ACM Trans. Graph. 35, 2 (2016), 21:1--21:12.Google ScholarDigital Library
Robert Edward Bridson. 2003. Computational Aspects of Dynamic Surfaces. Ph.D. Dissertation. Stanford University, Stanford, CA, USA. Advisor(s) Fedkiw, Ronald.Google Scholar
Jiawen Chen, Dennis Bautembach, and Shahram Izadi. 2013. Scalable Real-time Volumetric Surface Reconstruction. ACM Trans. Graph. (Proc. SIGGRAPH) 32, 4 (2013), 113:1--113:16.Google ScholarDigital Library
Kazem Cheshmi, Shoaib Kamil, Michelle Mills Strout, and Maryam Mehri Dehnavi. 2017. Sympiler: Transforming sparse matrix codes by decoupling symbolic analysis. In International Conference for High Performance Computing, Networking, Storage and Analysis. ACM, 13:1--13:13.Google Scholar
Kazem Cheshmi, Shoaib Kamil, Michelle Mills Strout, and Maryam Mehri Dehnavi. 2018. ParSy: Inspection and transformation of sparse matrix computations for parallelism. In International Conference for High Performance Computing, Networking, Storage, and Analysis. 62.Google ScholarDigital Library
Stephen Chou, Fredrik Kjolstad, and Saman Amarasinghe. 2018. Format abstraction for sparse tensor algebra compilers. Proceedings of the ACM on Programming Languages 2, OOPSLA (2018), 123.Google ScholarDigital Library
Frederica Darema, David A George, V Alan Norton, and Gregory F Pfister. 1988. A single-program-multiple-data computational model for EPEX/FORTRAN. Parallel Comput. 7, 1 (1988), 11--24.Google ScholarCross Ref
Zachary DeVito, Niels Joubert, Francisco Palacios, Stephen Oakley, Montserrat Medina, Mike Barrientos, Erich Elsen, Frank Ham, Alex Aiken, Karthik Duraisamy, et al. 2011. Liszt: A domain specific language for building portable mesh-based PDE solvers. In International Conference for High Performance Computing, Networking, Storage and Analysis. 9.Google ScholarDigital Library
Ming Gao, Xinlei Wang, Kui Wu, Andre Pradhana-Tampubolon, Eftychios Sifakis, Yuksel Cem, and Chenfanfu Jiang. 2018. GPU Optimization of Material Point Methods. ACM Trans. Graph. (Proc. SIGGRAPH Asia) 32, 4 (2018), 102.Google Scholar
Benjamin Graham, Martin Engelcke, and Laurens van der Maaten. 2018. 3D semantic segmentation with submanifold sparse convolutional networks. In Computer Vision and Pattern Recognition. 9224--9232.Google Scholar
Rama Karl Hoetzlein. 2016. GVDB: Raytracing sparse voxel database structures on the GPU. In Proceedings of High Performance Graphics. Eurographics Association, 109--117.Google Scholar
Ben Houston, Michael B. Nielsen, Christopher Batty, Ola Nilsson, and Ken Museth. 2006. Hierarchical RLE level set: A compact and versatile deformable surface representation. ACM Trans. Graph. 25, 1 (2006), 151--175.Google ScholarDigital Library
Yuanming Hu. 2018. Taichi: An open-source computer graphics library. arXiv preprint arXiv:1804.09293 (2018).Google Scholar
Yuanming Hu, Yu Fang, Ziheng Ge, Ziyin Qu, Yixin Zhu, Andre Pradhana, and Chenfanfu Jiang. 2018. A moving least squares material point method with displacement discontinuity and two-way rigid body coupling. ACM Trans. Graph. (Proc. SIGGRAPH Asia) 37, 4 (2018), 150.Google ScholarDigital Library
Wenzel Jakob. 2010. Mitsuba renderer. (2010). http://www.mitsuba-renderer.org.Google Scholar
Michael Kazhdan, Matthew Bolitho, and Hugues Hoppe. 2006. Poisson surface reconstruction. In Eurographics Symposium on Geometry Processing, Vol. 7.Google Scholar
Fredrik Kjolstad, Shoaib Kamil, Stephen Chou, David Lugato, and Saman Amarasinghe. 2017. The tensor algebra compiler. Proceedings of the ACM on Programming Languages 1, OOPSLA (2017), 77.Google ScholarDigital Library
Fredrik Kjolstad, Shoaib Kamil, Jonathan Ragan-Kelley, David I. W. Levin, Shinjiro Sueda, Desai Chen, Etienne Vouga, Danny M. Kaufman, Gurtej Kanwar, Wojciech Matusik, and Saman Amarasinghe. 2016. Simit: A language for physical simulation. ACM Trans. Graph. 35, 2 (2016), 20:1--20:21.Google ScholarDigital Library
Chris Lattner and Vikram Adve. 2004. LLVM: A compilation framework for lifelong program analysis & transformation. In Code Generation and Optimization.Google Scholar
Mark Lee, Brian Green, Feng Xie, and Eric Tabellion. 2017. Vectorized Production Path Tracing. In High Performance Graphics.Google Scholar
Roland Leißa, Sebastian Hack, and Ingo Wald. 2012. Extending a C-like Language for Portable SIMD Programming. SIGPLAN Not. 47, 8 (2012), 65--74.Google ScholarDigital Library
Haixiang Liu, Yuanming Hu, Bo Zhu, Wojciech Matusik, and Eftychios Sifakis. 2018. Narrow-band Topology Optimization on a Sparsely Populated Grid. ACM Trans. Graph. (Proc. SIGGRAPH Asia) 37, 6 (2018), 251:1--251:14.Google Scholar
Frank Losasso, Frédéric Gibou, and Ron Fedkiw. 2004. Simulating water and smoke with an octree data structure. In ACM Trans. Graph. (Proc. SIGGRAPH), Vol. 23. ACM, 457--462.Google ScholarDigital Library
Aleka McAdams, Eftychios Sifakis, and Joseph Teran. 2010. A parallel multigrid Poisson solver for fluids simulation on large grids. In Symposium on Computer Animation. ACM/Eurographics Association, 65--74.Google Scholar
Ravi Teja Mullapudi, Vinay Vasista, and Uday Bondhugula. 2015. PolyMage: Automatic optimization for image processing pipelines. SIGARCH Comput. Archit. News 43, 1 (2015), 429--443.Google ScholarDigital Library
Ken Museth. 2013. VDB: High-resolution sparse volumes with dynamic topology. ACM Trans. Graph. 32, 3 (2013), 27.Google ScholarDigital Library
Ken Museth. 2014. Hierarchical digital differential analyzer for efficient ray-marching in OpenVDB. (2014).Google Scholar
Michael B Nielsen and Robert Bridson. 2016. Spatially adaptive FLIP fluid simulations in bifrost. In ACM SIGGRAPH 2016 Talks. ACM, 41.Google ScholarDigital Library
Michael B. Nielsen and Ken Museth. 2006. Dynamic Tubular Grid: An Efficient Data Structure and Algorithms for High Resolution Level Sets. J. Sci. Comput. 26, 3 (2006), 261--299.Google ScholarDigital Library
Stanley Osher and James A. Sethian. 1988. Fronts Propagating with Curvature-dependent Speed: Algorithms Based on Hamilton-Jacobi Formulations. J. Comput. Phys. 79, 1 (1988), 12--49.Google ScholarDigital Library
Matt Pharr, Craig Kolb, Reid Gershbein, and Pat Hanrahan. 1997. Rendering Complex Scenes with Memory-coherent Ray Tracing. In SIGGRAPH. ACM, 101--108.Google Scholar
Matt Pharr and William R Mark. 2012. ispc: A SPMD compiler for high-performance CPU programming. In Innovative Parallel Computing. 1--13.Google Scholar
Jonathan Ragan-Kelley, Andrew Adams, Sylvain Paris, Marc Levoy, Saman Amarasinghe, and Frédo Durand. 2012. Decoupling Algorithms from Schedules for Easy Optimization of Image Processing Pipelines. ACM Trans. Graph. (Proc. SIGGRAPH) 31, 4 (2012), 32:1--32:12.Google ScholarDigital Library
Jonathan Ragan-Kelley, Connelly Barnes, Andrew Adams, Sylvain Paris, Frédo Durand, and Saman Amarasinghe. 2013. Halide: A language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines. SIGPLAN Not. 48, 6 (jun 2013), 519--530.Google ScholarDigital Library
Gernot Riegler, Ali Osman Ulusoy, and Andreas Geiger. 2017. Octnet: Learning deep 3d representations at high resolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3577--3586.Google ScholarCross Ref
Rajsekhar Setaluri, Mridul Aanjaneya, Sean Bauer, and Eftychios Sifakis. 2014. SPGrid: A sparse paged grid structure applied to adaptive smoke simulation. ACM Trans. Graph. (Proc. SIGGRAPH Asia) 33, 6 (2014), 205.Google ScholarDigital Library
Alexey Stomakhin, Craig Schroeder, Lawrence Chai, Joseph Teran, and Andrew Selle. 2013. A material point method for snow simulation. ACM Transactions on Graphics (TOG) 32, 4 (2013), 102.Google ScholarDigital Library
John E Stone, David Gohara, and Guochun Shi. 2010. OpenCL: A parallel programming standard for heterogeneous computing systems. Computing in science & engineering 12, 3 (2010), 66--73.Google ScholarDigital Library
Deborah Sulsky, Shi-Jian Zhou, and Howard L Schreyer. 1995. Application of a particlein-cell method to solid mechanics. Computer physics communications 87, 1--2 (1995), 236--252.Google Scholar
Nicolas Vasilache, Oleksandr Zinenko, Theodoros Theodoridis, Priya Goyal, Zachary DeVito, William S Moses, Sven Verdoolaege, Andrew Adams, and Albert Cohen. 2018. Tensor comprehensions: Framework-agnostic high-performance machine learning abstractions. arXiv:1802.04730 (2018).Google Scholar
Peng-Shuai Wang, Yang Liu, Yu-Xiao Guo, Chun-Yu Sun, and Xin Tong. 2017. OCNN: Octree-based convolutional neural networks for 3D shape analysis. ACM Transactions on Graphics (SIGGRAPH) 36, 4 (2017).Google ScholarDigital Library
Peng-Shuai Wang, Chun-Yu Sun, Yang Liu, and Xin Tong. 2018. Adaptive O-CNN: A patch-based deep representation of 3D shapes. ACM Transactions on Graphics (SIGGRAPH Asia) 37, 6 (2018).Google Scholar
Yangzihao Wang, Andrew Davidson, Yuechao Pan, Yuduo Wu, Andy Riffel, and John D. Owens. 2016. Gunrock: A high-performance graph processing library on the GPU. SIGPLAN Not. 51, 8 (2016), 11:1--11:12.Google ScholarDigital Library
E Woodcock, T Murphy, P Hemmings, and S Longworth. 1965. Techniques used in the GEM code for Monte Carlo neutronics calculations in reactors and other systems of complex geometry. In Applications of Computing Methods to Reactor Problems, Vol. 557.Google Scholar
Kui Wu, Nghia Truong, Cem Yuksel, and Rama Hoetzlein. 2018. Fast fluid simulations with sparse volumes on the GPU. In Computer Graphics Forum (Proc. Eurographics), Vol. 37. Wiley Online Library, 157--167.Google Scholar
Yunming Zhang, Mengjiao Yang, Riyadh Baghdadi, Shoaib Kamil, Julian Shun, and Saman Amarasinghe. 2018. GraphIt: A high-performance graph DSL. Proceedings of the ACM on Programming Languages 2, OOPSLA (2018), 121.Google ScholarDigital Library
Yongning Zhu and Robert Bridson. 2005. Animating sand as a fluid. ACM Transactions on Graphics (TOG) 24, 3 (2005), 965--972.Google ScholarDigital Library

Index Terms

Taichi: a language for high-performance computation on spatially sparse data structures
1. Computing methodologies
  1. Computer graphics
    1. Animation
      1. Physical simulation
  2. Parallel computing methodologies
    1. Parallel programming languages
2. Software and its engineering
  1. Software notations and tools
    1. Context specific languages
      1. Domain specific languages

Recommendations

A unified optimizing compiler framework for different GPGPU architectures

This article presents a novel optimizing compiler for general purpose computation on graphics processing units (GPGPU). It addresses two major challenges of developing high performance GPGPU programs: effective utilization of GPU memory hierarchy and ...
Read More
Brook for GPUs: stream computing on graphics hardware
SIGGRAPH '04: ACM SIGGRAPH 2004 Papers

In this paper, we present Brook for GPUs, a system for general-purpose computation on programmable graphics hardware. Brook extends C to include simple data-parallel constructs, enabling the use of the GPU as a streaming co-processor. We present a ...
Read More
Brook for GPUs: stream computing on graphics hardware

In this paper, we present Brook for GPUs, a system for general-purpose computation on programmable graphics hardware. Brook extends C to include simple data-parallel constructs, enabling the use of the GPU as a streaming co-processor. We present a ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in

ACM Transactions on Graphics Volume 38, Issue 6
December 2019
1292 pages
ISSN:0730-0301
EISSN:1557-7368
DOI:10.1145/3355089
Issue’s Table of Contents

Copyright © 2019 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 8 November 2019
Published in tog Volume 38, Issue 6

Check for updates
Author Tags
GPU computing
sparse data structures
Qualifiers
- research-article
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 117
  Total Citations
  View Citations
- 4,887
  Total Downloads
- Downloads (Last 12 months)1,123
- Downloads (Last 6 weeks)134
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Taichi: a language for high-performance computation on spatially sparse data structures

ACM Transactions on Graphics

Abstract

Supplemental Material

Available for Download

References

Cited By

Index Terms

Recommendations

A unified optimizing compiler framework for different GPGPU architectures

Brook for GPUs: stream computing on graphics hardware

Brook for GPUs: stream computing on graphics hardware

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Taichi: a language for high-performance computation on spatially sparse data structures

ACM Transactions on Graphics

Abstract

Supplemental Material

Available for Download

References

Cited By

Index Terms

Recommendations

A unified optimizing compiler framework for different GPGPU architectures

Brook for GPUs: stream computing on graphics hardware

Brook for GPUs: stream computing on graphics hardware

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media