Continuous-time Markov decision process with average reward: Using reinforcement learning method

IEEE Conference Publication (IEEE Xplore)