Human and Machine Learning in Non-Markovian Decision Making
Fig 3
Intermixed feedback experiment: human data with model predictions.
Proportion correct versus trial number for simulations of a population of spiking neurons (red), a Bayesian learner (green), and a policy gradient learner (orange), compared with human performance (blue; re-plotted from Fig 2). A. First training experiment without any delays. B. Second training experiment with delay. C. Main experiment with random delay. D. Memory experiment with no delay. E. Switching experiment with no delay. F. Repetition of the main experiment. G. Akaike information criterion corrected for finite sample sizes (AICc) for each model under consideration in the intermixed reward task. Lower AICc values imply greater support for the given model. In general, all models perform similarly.
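For readers unfamiliar with the criterion in panel G, the following is a minimal sketch of the standard small-sample corrected AIC, AICc = 2k - 2 ln L + 2k(k + 1)/(n - k - 1). The function name `aicc` and its arguments are illustrative assumptions, and the example values are hypothetical; the caption does not state how the likelihoods or parameter counts were obtained for each model.

```python
def aicc(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Standard corrected Akaike information criterion (AICc).

    log_likelihood: maximized log-likelihood ln L of the fitted model
    n_params:       number of free parameters k
    n_obs:          number of observations n (assumed here to be trials)
    """
    denom = n_obs - n_params - 1
    if denom <= 0:
        # The correction term is undefined unless n > k + 1.
        raise ValueError("AICc requires n_obs > n_params + 1")
    aic = 2 * n_params - 2 * log_likelihood
    correction = 2 * n_params * (n_params + 1) / denom
    return aic + correction


if __name__ == "__main__":
    # Hypothetical numbers for illustration only, not taken from the study.
    print(aicc(log_likelihood=-100.0, n_params=3, n_obs=50))
```

As n grows relative to k, the correction term shrinks toward zero and AICc approaches the ordinary AIC, so the correction matters mainly when there are few observations per fitted parameter.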