Libra: Contention-Aware GPU Thread Allocation for Data Parallel Training in High Speed Networks | IEEE Conference Publication | IEEE Xplore