Dissecting the CUDA scheduling hierarchy: a Performance and Predictability Perspective | IEEE Conference Publication | IEEE Xplore