ABSTRACT
In high-level synthesis (HLS), software multithreading constructs can be used to explicitly specify coarse-grained parallelism for multiple accelerators. While software threads on CPUs typically operate independently and in isolation from each other, HLS threads/accelerators are sub-components of one circuit. Since these components generally reside in the same clock domain, we can schedule their execution statically to avoid shared-resource contention among threads. We propose thread weaving, a technique that statically interleaves requests from different threads through scheduling constraints. With the guarantee of a contention-free schedule, we eliminate replication/arbitration of shared resources, reducing the area footprint of the circuit and improving its maximum operating frequency (Fmax).
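The idea of statically interleaving requests can be illustrated with a small sketch. It is not the authors' scheduler: the abstract only says that interleaving is enforced through scheduling constraints, whereas the code below substitutes a simple greedy search over a reservation table. It assumes every thread repeats with the same period (in cycles) and touches one shared resource at fixed cycles relative to its own start. The function name `weave`, the thread names, and the access patterns are all hypothetical.

```python
from typing import Dict, List, Optional


def weave(access_cycles: Dict[str, List[int]], period: int) -> Optional[Dict[str, int]]:
    """Pick a start offset per thread so that no two threads use the
    shared resource in the same slot (cycle modulo `period`).

    Illustrative greedy search only; returns None if no offset fits.
    """
    reserved: Dict[int, str] = {}   # slot -> thread that owns it
    offsets: Dict[str, int] = {}    # thread -> chosen start offset
    for name, cycles in access_cycles.items():
        for offset in range(period):
            slots = [(offset + c) % period for c in cycles]
            no_self_clash = len(set(slots)) == len(slots)
            no_cross_clash = all(s not in reserved for s in slots)
            if no_self_clash and no_cross_clash:
                for s in slots:
                    reserved[s] = name
                offsets[name] = offset
                break
        else:
            # No offset in [0, period) avoids contention for this thread.
            return None
    return offsets


# Hypothetical access patterns: cycles (relative to thread start) at which
# each thread issues a request to the shared resource.
threads = {"t0": [0, 2], "t1": [0, 2], "t2": [1]}
schedule = weave(threads, period=6)
infeasible = weave({"a": [0], "b": [0]}, period=1)
```

With these inputs, `t0` keeps offset 0, `t1` is delayed by one cycle, and `t2` by three, so every request lands in its own slot and no arbiter is needed for the shared resource. When the period is too short to fit all requests, as in the second call, the search reports failure by returning `None`.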
Thread Weaving: Static Resource Scheduling for Multithreaded High-Level Synthesis