Training neural networks with policy gradient | IEEE Conference Publication | IEEE Xplore