

Deep reinforcement learning for optimal experimental design in biology

Fig 1

Reinforcement learning for optimal experimental design.

A) A hypothetical example of a poorly designed experiment (left), corresponding to an increasing sequence of input values u over time, with a resulting continual increase in the observable output Y. A corresponding confidence ellipse in p1-p2 parameter space is depicted; the logarithm of the determinant of the Fisher information matrix, log(|I|), is low. In contrast, a hypothetical well-designed experiment (right), which maximises the determinant of the Fisher information matrix, corresponds to non-intuitive choices of input and a resulting dynamic response in the output. The corresponding confidence ellipse is tight and the determinant of the Fisher information matrix is high.

B) Optimal experimental design formulated as a reinforcement learning problem. The model dynamics, F, describe the rate of change of the state vector X in terms of model parameters θ and input u. At each time step, τ, an observation of the system, oτ, is provided to the agent, which chooses an action, aτ, to apply over that time step and receives a corresponding reward, rτ.

C) Training over a parameter distribution. 1) A distribution of parameters is chosen (shown as uniform). 2) The RL controller is trained; each episode employs a model parametrisation θ sampled from the distribution. 3) Acting as a feedback controller, the trained RL agent designs near-optimal experiments across the parameter distribution. For example, well-designed experiments will be executed for both θ1 and θ2 (inputs u1(t) and u2(t), respectively).

D) Model of an auxotrophic bacterial strain growing in a chemostat. The nutrient inflows, Cin and C0,in, can be controlled as part of an experiment.
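The D-optimality score in A) can be computed directly from local parameter sensitivities. Below is a minimal sketch, assuming additive Gaussian measurement noise of fixed variance; the sensitivity values and noise_var are illustrative placeholders, not quantities from the paper.

```python
import numpy as np

def log_det_fim(sensitivities, noise_var=1.0):
    """D-optimality score: log-determinant of the Fisher information matrix.

    sensitivities: array of shape (n_observations, n_parameters) holding
    output sensitivities dY/dtheta at each sampling time. Assumes additive
    Gaussian noise with variance noise_var (an illustrative simplification).
    """
    S = np.asarray(sensitivities)
    fim = S.T @ S / noise_var                # I = (1/sigma^2) * S^T S
    sign, logdet = np.linalg.slogdet(fim)    # stable log|I| for large matrices
    return logdet if sign > 0 else -np.inf   # singular FIM => uninformative design

# Example: two parameters, five sampling times (made-up sensitivities)
S = np.array([[0.1, 0.0],
              [0.4, 0.2],
              [0.9, 0.5],
              [1.2, 1.1],
              [1.0, 1.6]])
print(log_det_fim(S))  # higher log|I| corresponds to a tighter confidence ellipse
```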
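The training scheme of B) and C) amounts to sampling a fresh parametrisation θ at the start of each episode and letting the agent interact with the simulated model. A schematic sketch follows; agent, make_env, and their methods are hypothetical stand-ins for the paper's actual implementation, with the reward assumed to be the per-step increase in log(|I|).

```python
import numpy as np

rng = np.random.default_rng(0)

def train(agent, make_env, theta_low, theta_high, n_episodes=1000, n_steps=10):
    """Train an RL controller across a parameter distribution (cf. Fig 1C).

    Each episode samples a model parametrisation theta from a uniform prior,
    so the trained agent generalises across the parameter distribution rather
    than overfitting to a single parametrisation.
    """
    for episode in range(n_episodes):
        theta = rng.uniform(theta_low, theta_high)       # 1) sample theta from the prior
        env = make_env(theta)                            # simulate the model under theta
        obs = env.reset()
        for tau in range(n_steps):
            action = agent.act(obs)                      # choose input u for this time step
            next_obs, reward = env.step(action)          # reward: increase in log|I|
            agent.update(obs, action, reward, next_obs)  # 2) train the controller
            obs = next_obs
    return agent  # 3) the trained agent acts as a feedback controller
```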
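The chemostat of D) can be simulated as a small ODE system. The sketch below assumes Monod-type growth limited by both nutrients and a shared dilution rate; the equations and parameter values are illustrative assumptions, not the exact model specified in the paper.

```python
import numpy as np
from scipy.integrate import solve_ivp

def chemostat_rhs(t, x, theta, u):
    """Toy chemostat model of an auxotrophic strain (cf. Fig 1D).

    State x = [N, C, C0]: population density, controlled nutrient, and the
    auxotrophically required nutrient. Inputs u = (C_in, C0_in) are the two
    controllable nutrient inflow concentrations.
    """
    N, C, C0 = x
    mu_max, K, K0, q, y, y0 = theta                  # growth, affinity, and yield parameters
    C_in, C0_in = u
    mu = mu_max * (C / (K + C)) * (C0 / (K0 + C0))   # growth limited by both nutrients
    dN = (mu - q) * N                                # growth minus dilution at rate q
    dC = q * (C_in - C) - mu * N / y                 # inflow/outflow minus consumption
    dC0 = q * (C0_in - C0) - mu * N / y0
    return [dN, dC, dC0]

theta = (1.0, 0.5, 0.3, 0.25, 4.8e8, 5.2e8)          # made-up parameter values
sol = solve_ivp(chemostat_rhs, (0.0, 10.0), [1e6, 1.0, 1.0],
                args=(theta, (1.0, 0.5)))            # inflows held fixed for this run
print(sol.y[:, -1])                                  # state at t = 10
```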


doi: https://doi.org/10.1371/journal.pcbi.1010695.g001