A reliability-aware approach for an optimal checkpoint/restart model in HPC environments | IEEE Conference Publication | IEEE Xplore