Differential roles of human striatum and amygdala in associative learning

Jian Li, Daniela Schiller, Geoffrey Schoenbaum, Elizabeth A Phelps & Nathaniel D Daw
Nature Neuroscience (2011)

Although the human amygdala and striatum have both been implicated in associative learning, only the striatum's contribution has been consistently computationally characterized. Using a reversal learning task, we found that amygdala blood oxygen level–dependent activity tracked associability as estimated by a computational model, and dissociated it from the striatal representation of reinforcement prediction error. These results extend the computational learning approach from striatum to amygdala, demonstrating their complementary roles in aversive learning.
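The dissociation described above rests on a hybrid learner in which a Rescorla-Wagner-style value update is gated by a Pearce-Hall associability term. Below is a minimal sketch of that idea; the parameter names (eta, kappa) and values are illustrative assumptions, not the paper's fitted quantities.

```python
# Hypothetical sketch of a hybrid Rescorla-Wagner / Pearce-Hall learner.
# V is the cue's expected outcome; alpha is associability, which tracks
# the recent magnitude of surprise. In the paper's framing, delta maps
# onto the striatal signal and alpha onto the amygdala signal.

def hybrid_update(V, alpha, reward, eta=0.5, kappa=0.3):
    """One trial of learning; returns updated (V, alpha, delta)."""
    delta = reward - V                             # reinforcement prediction error
    V = V + kappa * alpha * delta                  # value update gated by associability
    alpha = (1 - eta) * alpha + eta * abs(delta)   # associability tracks |surprise|
    return V, alpha, delta
```

Because alpha rises after surprising outcomes, learning speeds up exactly when reversals occur, which is what makes this family of models a natural fit for reversal learning tasks.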

The evolution of overconfidence

Dominic D. P. Johnson and James H. Fowler
Nature 477, 317–320 (15 September 2011)

Confidence is an essential ingredient of success in a wide range of domains ranging from job performance and mental health to sports, business and combat [1-4]. Some authors have suggested that not just confidence but overconfidence—believing you are better than you are in reality—is advantageous because it serves to increase ambition, morale, resolve, persistence or the credibility of bluffing, generating a self-fulfilling prophecy in which exaggerated confidence actually increases the probability of success [3-8]. However, overconfidence also leads to faulty assessments, unrealistic expectations and hazardous decisions, so it remains a puzzle how such a false belief could evolve or remain stable in a population of competing strategies that include accurate, unbiased beliefs. Here we present an evolutionary model showing that, counterintuitively, overconfidence maximizes individual fitness and populations tend to become overconfident, as long as benefits from contested resources are sufficiently large compared with the cost of competition. In contrast, unbiased strategies are only stable under limited conditions. The fact that overconfident populations are evolutionarily stable in a wide range of environments may help to explain why overconfidence remains prevalent today, even if it contributes to hubris, market bubbles, financial collapses, policy failures, disasters and costly wars [9-13].
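The core intuition can be illustrated with a toy claim-or-fight contest (all names, parameters, and the noisy-perception mechanism here are assumptions for illustration, not the authors' exact model). Each agent claims a resource when its biased self-assessment exceeds a noisy estimate of its rival; if both claim, the stronger wins but both pay a conflict cost.

```python
# Toy contest illustrating why a confidence bias k can pay off when the
# resource value r is large relative to the conflict cost c. This is a
# simplified sketch, not a reproduction of the published model.
import random

def contest(k1, k2, r=2.0, c=1.0, noise=0.5):
    a1, a2 = random.random(), random.random()       # true capabilities
    claim1 = a1 + k1 > a2 + random.gauss(0, noise)  # biased self vs noisy rival estimate
    claim2 = a2 + k2 > a1 + random.gauss(0, noise)
    if claim1 and not claim2:
        return r, 0.0                               # uncontested claim
    if claim2 and not claim1:
        return 0.0, r
    if claim1 and claim2:                           # fight: stronger wins, both pay c
        win1 = a1 > a2
        return (r - c if win1 else -c), (r - c if not win1 else -c)
    return 0.0, 0.0                                 # neither claims

def mean_payoff(k1, k2, trials=20000, seed=0):
    """Average payoffs for two agents with confidence biases k1, k2."""
    random.seed(seed)
    t1 = t2 = 0.0
    for _ in range(trials):
        p1, p2 = contest(k1, k2)
        t1 += p1
        t2 += p2
    return t1 / trials, t2 / trials
```

Sweeping k over positive and negative values in such a simulation shows the paper's qualitative claim: when r is large relative to c, a positive bias earns more contested resources than it loses in extra fights.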


Behavioral and Neural Properties of Social Reinforcement Learning

Rebecca M. Jones, Leah H. Somerville, Jian Li, Erika J. Ruberry, Victoria Libby, Gary Glover, Henning U. Voss, Douglas J. Ballon, and B. J. Casey
J. Neurosci. 2011;31:13039-13045

Social learning is critical for engaging in complex interactions with other individuals. Learning from positive social exchanges, such as acceptance from peers, may be similar to basic reinforcement learning. We formally test this hypothesis by developing a novel paradigm that is based on work in nonhuman primates and human imaging studies of reinforcement learning. The probability of receiving positive social reinforcement from three distinct peers was parametrically manipulated while brain activity was recorded in healthy adults using event-related functional magnetic resonance imaging. Over the course of the experiment, participants responded more quickly to faces of peers who provided more frequent positive social reinforcement, and rated them as more likeable. Modeling trial-by-trial learning showed ventral striatum and orbital frontal cortex activity correlated positively with forming expectations about receiving social reinforcement. Rostral anterior cingulate cortex activity tracked positively with modulations of expected value of the cues (peers). Together, the findings across three levels of analysis—social preferences, response latencies, and modeling neural responses—are consistent with reinforcement learning theory and nonhuman primate electrophysiological studies of reward. This work highlights the fundamental influence of acceptance by one's peers in altering subsequent behavior.

The Decision Value Computations in the vmPFC and Striatum Use a Relative Value Code That is Guided by Visual Attention

Seung-Lark Lim, John P. O'Doherty, and Antonio Rangel
J. Neurosci. 2011;31:13214-13223

There is a growing consensus in behavioral neuroscience that the brain makes simple choices by first assigning a value to the options under consideration and then comparing them. Two important open questions are whether the brain encodes absolute or relative value signals, and what role attention might play in these computations. We investigated these questions using a human fMRI experiment with a binary choice task in which the fixations to both stimuli were exogenously manipulated to control for the role of visual attention in the valuation computation. We found that the ventromedial prefrontal cortex and the ventral striatum encoded fixation-dependent relative value signals: activity in these areas correlated with the difference in value between the attended and the unattended items. These attention-modulated relative value signals might serve as the input of a comparator system that is used to make a choice.
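The key quantity in this result, the fixation-dependent relative value signal, is simply the value of the currently attended item minus the value of the unattended one. A sketch of that regressor (variable names are assumptions for illustration):

```python
# Fixation-dependent relative value regressor: at each fixation the
# signal is V(attended) - V(unattended), so it flips sign as gaze
# alternates between the two options.

def relative_value_signal(v_left, v_right, fixations):
    """fixations: sequence of 'L'/'R'; returns the regressor per fixation."""
    return [
        (v_left - v_right) if f == "L" else (v_right - v_left)
        for f in fixations
    ]
```

Note the sign flip across fixations: an absolute value code would instead stay constant regardless of where gaze lands, which is how the two hypotheses are distinguished.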

Feedback Timing Modulates Brain Systems for Learning in Humans

Karin Foerde and Daphna Shohamy
J. Neurosci. 2011;31:13157-13167

The ability to learn from the consequences of actions—no matter when those consequences take place—is central to adaptive behavior. Despite major advances in understanding how immediate feedback drives learning, it remains unknown precisely how the brain learns from delayed feedback. Here, we present converging evidence from neuropsychology and neuroimaging for distinct roles for the striatum and the hippocampus in learning, depending on whether feedback is immediate or delayed. We show that individuals with striatal dysfunction due to Parkinson's disease are impaired at learning when feedback is immediate, but not when feedback is delayed by a few seconds. Using functional imaging (fMRI) combined with computational model-derived analyses, we further demonstrate that healthy individuals show activation in the striatum during learning from immediate feedback and activation in the hippocampus during learning from delayed feedback. Additionally, later episodic memory for delayed feedback events was enhanced, suggesting that engaging distinct neural systems during learning had consequences for the representation of what was learned. Together, these findings provide direct evidence from humans that striatal systems are necessary for learning from immediate feedback and that delaying feedback leads to a shift in learning from the striatum to the hippocampus. The results provide a link between learning impairments in Parkinson's disease and evidence from single-unit recordings demonstrating that the timing of reinforcement modulates activity of midbrain dopamine neurons. Collectively, these findings indicate that relatively small changes in the circumstances under which information is learned can shift learning from one brain system to another.


The Neural and Cognitive Time Course of Theory of Mind

Joseph P. McCleery, Andrew D. R. Surtees, Katharine A. Graham, John E. Richards, and Ian A. Apperly
J. Neurosci. 2011;31:12849-12854

Neuroimaging and neuropsychological studies implicate both frontal and temporoparietal cortices when humans reason about the mental states of others. Here, we report an event-related potentials study of the time course of one such “theory of mind” ability: visual perspective taking. The findings suggest that posterior cortex, perhaps the temporoparietal cortex, calculates and represents the perspective of self versus other, and then, later, the right frontal cortex resolves conflict between perspectives during response selection.

Contextual Novelty Modulates the Neural Dynamics of Reward Anticipation

Nico Bunzeck, Marc Guitart-Masip, Ray J. Dolan, and Emrah Duzel
J. Neurosci. 2011;31:12816-12822

We investigated how rapidly the reward-predicting properties of visual cues are signaled in the human brain and the extent to which these reward prediction signals are contextually modifiable. In a magnetoencephalography study, we presented participants with fractal visual cues that predicted monetary rewards with different probabilities. These cues were presented in the temporal context of a preceding novel or familiar image of a natural scene. Starting at ∼100 ms after cue onset, reward probability was signaled in the event-related fields (ERFs) over temporo-occipital sensors and in the power of theta (5–8 Hz) and beta (20–30 Hz) band oscillations over frontal sensors. While theta power decreased with reward probability, beta power showed the opposite effect. Thus, in humans, anticipatory reward responses are generated rapidly, within 100 ms after the onset of reward-predicting cues, which is similar to the timing established in non-human primates. Contextual novelty enhanced the reward anticipation responses in both ERFs and in beta oscillations starting at ∼100 ms after cue onset. This very early context effect is compatible with a physiological model that invokes the mediation of a hippocampal-VTA loop, according to which novelty modulates neural response properties within the reward circuitry. We conclude that the neural processing of cues that predict future rewards is temporally highly efficient and contextually modifiable.