TTLG - An Efficient Tensor Transposition Library for GPUs | IEEE Conference Publication | IEEE Xplore