Christopher D. Fiorillo, Minryung R. Song, and Sora R. Yun
The Journal of Neuroscience, 13 March 2013, 33(11):4710-4725; doi:10.1523/JNEUROSCI.3883-12.2013
The transient response of dopamine neurons has been described as reward prediction error (RPE), with activation or suppression by events that are better or worse than expected, respectively. However, at least a minority of neurons are activated by aversive or high-intensity stimuli, casting doubt on the generality of RPE in describing the dopamine signal. To overcome limitations of previous studies, we studied neuronal responses to a wider variety of high-intensity and aversive stimuli, and we quantified and controlled aversiveness through a choice task in which macaques sacrificed juice to avoid aversive stimuli. Whereas most previous work has portrayed the RPE as a single impulse or “phase,” here we demonstrate its multiphasic temporal dynamics. Aversive or high-intensity stimuli evoked a triphasic sequence of activation-suppression-activation extending over a period of 40–700 ms. The initial activation at short latencies (40–120 ms) reflected sensory intensity. The influence of motivational value became dominant between 150 and 250 ms, with activation in the case of appetitive stimuli, and suppression in the case of aversive and neutral stimuli. The previously unreported late activation appeared to be a modest “rebound” after strong suppression. Similarly, strong activation by reward was often followed by suppression. We suggest that these “rebounds” may result from overcompensation by homeostatic mechanisms in some cells. Our results are consistent with a realistic RPE, which evolves over time through a dynamic balance of excitation and inhibition.