The Processing of Unexpected Positive Response Outcomes in the Mediofrontal Cortex

Nicola K. Ferdinand, Axel Mecklinger, Jutta Kray, and William J. Gehring
J. Neurosci. 2012;32 12087-12092

The human mediofrontal cortex, especially the anterior cingulate cortex, is commonly assumed to contribute to higher cognitive functions like performance monitoring. How exactly this is achieved is currently the subject of lively debate but there is evidence that an event's valence and its expectancy play important roles. One prominent theory, the reinforcement learning theory by Holroyd and colleagues (2002, 2008), assigns a special role to feedback valence, while the prediction of response–outcome (PRO) model by Alexander and Brown (2010, 2011) claims that the mediofrontal cortex is sensitive to unexpected events regardless of their valence. However, paradigms examining this issue have included confounds that fail to separate valence and expectancy.

In the present study, we tested the two competing theories of performance monitoring by using an experimental task that separates valence and unexpectedness of performance feedback. The feedback-related negativity of the event-related potential, which is commonly assumed to be a reflection of mediofrontal cortex activity, was elicited not only by unexpected negative feedback, but also by unexpected positive feedback. This implies that the mediofrontal cortex is sensitive to the unexpectedness of events in general rather than their valence and thereby supports the PRO model.

Corticostriatal Connectivity Underlies Individual Differences in the Balance between Habitual and Goal-Directed Action Control

Sanne de Wit, Poppy Watson, Helga A. Harsay, Michael X. Cohen, Irene van de Vijver, and K. Richard Ridderinkhof
J. Neurosci. 2012;32 12066-12075

Why are some individuals more susceptible to the formation of inflexible habits than others? In the present study, we used diffusion tensor imaging to demonstrate that brain connectivity predicts individual differences in relative goal-directed and habitual behavioral control in humans. Specifically, vulnerability to habitual “slips of action” toward no-longer-rewarding outcomes was predicted by estimated white matter tract strength in the premotor cortex seeded from the posterior putamen (as well as by gray matter density in the posterior putamen as determined with voxel-based morphometry). In contrast, flexible goal-directed action was predicted by estimated tract strength in the ventromedial prefrontal cortex seeded from the caudate. These findings suggest that integrity of dissociable corticostriatal pathways underlies individual differences in action control in the healthy population, which may ultimately mediate vulnerability to impulse control disorders.


Transient stimulation of distinct subpopulations of striatal neurons mimics changes in action value

Lung-Hao Tai, A Moses Lee, Nora Benavidez, Antonello Bonci and Linda Wilbrecht
Nature Neuroscience 15, 1281–1289 (2012)

In changing environments, animals must adaptively select actions to achieve their goals. In tasks involving goal-directed action selection, striatal neural activity has been shown to represent the value of competing actions. Striatal representations of action value could potentially bias responses toward actions of higher value. However, no study to date has demonstrated the direct effect of distinct striatal pathways in goal-directed action selection. We found that transient optogenetic stimulation of dorsal striatal dopamine D1 and D2 receptor–expressing neurons during decision-making in mice introduced opposing biases in the distribution of choices. The effect of stimulation on choice was dependent on recent reward history and mimicked an additive change in the action value. Although stimulation before and during movement initiation produced a robust bias in choice behavior, this bias was substantially diminished when stimulation was delayed after response initiation. Together, our data suggest that striatal activity is involved in goal-directed action selection.
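The phrase "an additive change in the action value" can be illustrated with a minimal sketch. It assumes a logistic choice rule between two actions; the names `choice_probability`, `stimulation_effect`, `beta`, and `stim_bias` are hypothetical and are not taken from the paper. A positive or negative `stim_bias` produces opposing biases, and the size of the resulting shift in choice depends on how far apart the two action values already are, which is one way the effect could depend on recent reward history.

```python
import math

def choice_probability(q_left, q_right, beta=3.0, stim_bias=0.0):
    """Probability of choosing left under a logistic (softmax) rule.

    stim_bias is a hypothetical additive shift in the left action value,
    standing in for the effect of stimulation; beta is an assumed
    inverse temperature.
    """
    return 1.0 / (1.0 + math.exp(-beta * ((q_left + stim_bias) - q_right)))

def stimulation_effect(q_left, q_right, stim_bias=0.3):
    """Change in P(left) produced by the additive shift."""
    with_stim = choice_probability(q_left, q_right, stim_bias=stim_bias)
    without_stim = choice_probability(q_left, q_right)
    return with_stim - without_stim
```

In this toy setup the same additive shift moves choice most when the two values are similar (for example `stimulation_effect(0.5, 0.5)`) and much less when one action is already strongly preferred (`stimulation_effect(0.9, 0.1)`).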

Neural encoding of competitive effort in the anterior cingulate cortex

Kristin L Hillman and David K Bilkey
Nature Neuroscience 15, 1290–1297 (2012)

In social environments, animals often compete to obtain limited resources. Strategically electing to work against another animal represents a cost-benefit decision. Is the resource worth an investment of competitive effort? The anterior cingulate cortex (ACC) has been implicated in cost-benefit decision-making, but its role in competitive effort has not been examined. We recorded ACC neurons in freely moving rats as they performed a competitive foraging choice task. When at least one of the two choice options demanded competitive effort, the majority of ACC neurons exhibited heightened and differential firing between the goal trajectories. Inter- and intrasession manipulations revealed that differential firing was not attributable to effort or reward in isolation; instead ACC encoding patterns appeared to indicate net utility assessments of available choice options. Our findings suggest that the ACC is important for encoding competitive effort, a cost-benefit domain that has received little neural-level investigation despite its predominance in nature.

Social error monitoring in macaque frontal cortex

Kyoko Yoshida, Nobuhito Saito, Atsushi Iriki and Masaki Isoda
Nature Neuroscience 15, 1307–1312 (2012)

Although much learning occurs through direct experience of errors, humans and other animals can learn from the errors of other individuals. The medial frontal cortex (MFC) processes self-generated errors, but the neuronal architecture and mechanisms underlying the monitoring of others' errors are poorly understood. Exploring such mechanisms is important, as they underlie observational learning and allow adaptive behavior in uncertain social environments. Using two paired monkeys that monitored each other's action for their own action selection, we identified a group of neurons in the MFC that exhibited a substantial activity increase that was associated with another's errors. Nearly half of these neurons showed activity changes consistent with general reward-omission signals, whereas the remaining neurons specifically responded to another's erroneous actions. These findings indicate that the MFC contains a dedicated circuit for monitoring others' mistakes during social interactions.

On the evolutionary origins of the egalitarian syndrome

Sergey Gavrilets
PNAS August 28, 2012 vol. 109 no. 35 14069-14074

The evolutionary emergence of the egalitarian syndrome is one of the most intriguing unsolved puzzles related to the origins of modern humans. Standard explanations and models for cooperation and altruism—reciprocity, kin and group selection, and punishment—are not directly applicable to the emergence of egalitarian behavior in hierarchically organized groups that characterized the social life of our ancestors. Here I study an evolutionary model of group-living individuals competing for resources and reproductive success. In the model, the differences in fighting abilities lead to the emergence of hierarchies where stronger individuals take away resources from weaker individuals and, as a result, have higher reproductive success. First, I show that the logic of within-group competition implies under rather general conditions that each individual benefits if the transfer of the resource from a weaker group member to a stronger one is prevented. This effect is especially strong in small groups. Then I demonstrate that this effect can result in the evolution of a particular, genetically controlled psychology causing individuals to interfere in a bully–victim conflict on the side of the victim. A necessary condition is a high efficiency of coalitions in conflicts against the bullies. The egalitarian drive leads to a dramatic reduction in within-group inequality. Simultaneously it creates the conditions for the emergence of inequity aversion, empathy, compassion, and egalitarian moral values via the internalization of behavioral rules imposed by natural selection. It also promotes widespread cooperation via coalition formation.


Evidence for Hyperbolic Temporal Discounting of Reward in Control of Movements

Adrian M. Haith, Thomas R. Reppert, and Reza Shadmehr
J. Neurosci. 2012;32 11727-11736

Suppose that the purpose of a movement is to place the body in a more rewarding state. In this framework, slower movements may increase accuracy and therefore improve the probability of acquiring reward, but the longer durations of slow movements produce devaluation of reward. Here we hypothesize that the brain decides the vigor of a movement (duration and velocity) based on the expected discounted reward associated with that movement. We begin by showing that durations of saccades of varying amplitude can be accurately predicted by a model in which motor commands maximize expected discounted reward. This result suggests that reward is temporally discounted even in timescales of tens of milliseconds. One interpretation of temporal discounting is that the true objective of the brain is to maximize the rate of reward—which is equivalent to a specific form of hyperbolic discounting. A consequence of this idea is that the vigor of saccades should change as one alters the intertrial intervals between movements. We find experimentally that in healthy humans, as intertrial intervals are varied, saccade peak velocities and durations change on a trial-by-trial basis precisely as predicted by a model in which the objective is to maximize the rate of reward. Our results are inconsistent with theories in which reward is discounted exponentially. We suggest that there exists a single cost, rate of reward, which provides a unifying principle that may govern control of movements in timescales of milliseconds, as well as decision making in timescales of seconds to years.
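A toy calculation may help show why the intertrial interval separates the two accounts. If a movement of duration T follows an interval I and yields expected reward a, the reward rate a / (I + T) can be rewritten as (a / I) / (1 + T / I), which has the hyperbolic form. The sketch below is not the authors' model: the speed-accuracy curve `success_prob`, the constants `tau` and `k`, and the search grid are all assumptions chosen for illustration.

```python
import math

def success_prob(duration, tau=0.05):
    """Hypothetical speed-accuracy curve: slower movements succeed more often."""
    return 1.0 - math.exp(-duration / tau)

def reward_rate(duration, iti):
    """Expected reward per unit time, counting the intertrial interval."""
    return success_prob(duration) / (iti + duration)

def exponential_value(duration, iti, k=0.5):
    """Expected reward discounted exponentially over the same total delay."""
    return success_prob(duration) * math.exp(-(iti + duration) / k)

# Candidate movement durations from 1 ms to 1 s (assumed search grid).
GRID = [i / 1000 for i in range(1, 1001)]

def best_duration(objective, iti):
    """Grid search for the duration that maximizes the given objective."""
    return max(GRID, key=lambda d: objective(d, iti))

def optimal_durations(itis=(0.5, 1.0, 2.0)):
    """Optimal duration at each intertrial interval under both objectives."""
    return {
        "rate": [best_duration(reward_rate, iti) for iti in itis],
        "exponential": [best_duration(exponential_value, iti) for iti in itis],
    }
```

Under these assumptions, the rate-maximizing duration lengthens as the intertrial interval grows, while the exponentially discounted optimum does not move, because the interval only rescales the exponential objective by a constant factor.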


Noise and Correlations in Parallel Perceptual Decision Making

Thomas U. Otto, Pascal Mamassian
Current Biology, Volume 22, Issue 15, 1391-1396, 05 July 2012

Perceptual decisions involve the accumulation of sensory evidence over time, a process that is corrupted by noise [1]. Here, we extend the decision-making framework to crossmodal research [2,3] and the parallel processing of two distinct signals presented to different sensory modalities like vision and audition. Contrary to the widespread view that multisensory signals are integrated prior to a single decision [4,5,6,7,8,9,10], we show that evidence is accumulated for each signal separately and that consequent decisions are flexibly coupled by logical operations. We find that the strong correlation of response latencies from trial to trial is critical to explain the short latencies of multisensory decisions. Most critically, we show that increased noise in multisensory decisions is needed to explain the mean and the variability of response latencies. Precise knowledge of these key factors is fundamental for the study and understanding of parallel decision processes with multisensory signals.
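The idea of separate decisions coupled by logical operations can be sketched as follows. This is a deliberately simplified illustration: it draws the visual and auditory latencies independently from assumed lognormal distributions, so it omits the trial-to-trial correlation and the additional noise that the abstract identifies as critical. The function names and all parameter values are hypothetical.

```python
import math
import random

def coupled_latency(visual_rt, auditory_rt, rule="or"):
    """Combine two unisensory decision times with a logical rule.

    "or": respond when either decision finishes (the faster one).
    "and": respond only when both have finished (the slower one).
    """
    if rule == "or":
        return min(visual_rt, auditory_rt)
    if rule == "and":
        return max(visual_rt, auditory_rt)
    raise ValueError(f"unknown rule: {rule!r}")

def mean_latencies(n_trials=5000, seed=1):
    """Mean latency per condition for independently drawn decision times."""
    rng = random.Random(seed)
    sums = {"visual": 0.0, "auditory": 0.0, "or": 0.0, "and": 0.0}
    for _ in range(n_trials):
        # Assumed unisensory latency distributions (seconds).
        v = rng.lognormvariate(math.log(0.30), 0.2)
        a = rng.lognormvariate(math.log(0.32), 0.2)
        sums["visual"] += v
        sums["auditory"] += a
        sums["or"] += coupled_latency(v, a, "or")
        sums["and"] += coupled_latency(v, a, "and")
    return {name: total / n_trials for name, total in sums.items()}
```

Even in this reduced form, the "or" coupling yields mean latencies shorter than either signal alone and the "and" coupling yields longer ones, without any integration of the two signals before the decision.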

Human dorsal anterior cingulate cortex neurons mediate ongoing behavioural adaptation

Sameer A. Sheth, Matthew K. Mian, Shaun R. Patel, Wael F. Asaad, Ziv M. Williams, Darin D. Dougherty, George Bush & Emad N. Eskandar
Nature 488, 218–221 (09 August 2012)

The ability to optimize behavioural performance when confronted with continuously evolving environmental demands is a key element of human cognition. The dorsal anterior cingulate cortex (dACC), which lies on the medial surface of the frontal lobes, is important in regulating cognitive control. Hypotheses about its function include guiding reward-based decision making [1], monitoring for conflict between competing responses [2] and predicting task difficulty [3]. Precise mechanisms of dACC function remain unknown, however, because of the limited number of human neurophysiological studies. Here we use functional imaging and human single-neuron recordings to show that the firing of individual dACC neurons encodes current and recent cognitive load. We demonstrate that the modulation of current dACC activity by previous activity produces a behavioural adaptation that accelerates reactions to cues of similar difficulty to previous ones, and retards reactions to cues of different difficulty. Furthermore, this conflict adaptation, or Gratton effect [2,4], is abolished after surgically targeted ablation of the dACC. Our results demonstrate that the dACC provides a continuously updated prediction of expected cognitive demand to optimize future behavioural responses. In situations with stable cognitive demands, this signal promotes efficiency by hastening responses, but in situations with changing demands it engenders accuracy by delaying responses.


Neuronal Correlates of Metacognition in Primate Frontal Cortex

Paul G. Middlebrooks, Marc A. Sommer
Neuron, Volume 75, Issue 3, 517-530, 9 August 2012

Humans are metacognitive: they monitor and control their cognition. Our hypothesis was that neuronal correlates of metacognition reside in the same brain areas responsible for cognition, including frontal cortex. Recent work demonstrated that nonhuman primates are capable of metacognition, so we recorded from single neurons in the frontal eye field, dorsolateral prefrontal cortex, and supplementary eye field of monkeys (Macaca mulatta) that performed a metacognitive visual-oculomotor task. The animals made a decision and reported it with a saccade, but received no immediate reward or feedback. Instead, they had to monitor their decision and bet whether it was correct. Activity was correlated with decisions and bets in all three brain areas, but putative metacognitive activity that linked decisions to appropriate bets occurred exclusively in the SEF. Our results offer a survey of neuronal correlates of metacognition and implicate the SEF in linking cognitive functions over short periods of time.

Dopamine Enhances Model-Based over Model-Free Choice Behavior

Klaus Wunderlich, Peter Smittenaar, Raymond J. Dolan
Neuron, Volume 75, Issue 3, 418-424, 9 August 2012

Decision making is often considered to arise out of contributions from a model-free habitual system and a model-based goal-directed system. Here, we investigated the effect of a dopamine manipulation on the degree to which either system contributes to instrumental behavior in a two-stage Markov decision task, which has been shown to discriminate model-free from model-based control. We found that increased dopamine levels promote model-based over model-free choice.
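A common way to quantify the balance between the two systems in a two-stage task is a weighted mixture of model-based and model-free first-stage values. The sketch below assumes that approach; the abstract does not specify the model used, and the transition probabilities, the weight `w`, and the inverse temperature `beta` are hypothetical values chosen for illustration.

```python
import math

# Assumed task structure: each first-stage action leads to one of two
# second-stage states via a common (0.7) or rare (0.3) transition.
TRANSITIONS = [[0.7, 0.3], [0.3, 0.7]]

def model_based_values(second_stage_q):
    """Plan through the transition model: expected best second-stage value."""
    return [
        sum(p * max(second_stage_q[state]) for state, p in enumerate(row))
        for row in TRANSITIONS
    ]

def hybrid_values(model_free_q, second_stage_q, w):
    """Mix the two systems; w=1 is purely model-based, w=0 purely model-free."""
    mb = model_based_values(second_stage_q)
    return [w * b + (1.0 - w) * f for b, f in zip(mb, model_free_q)]

def softmax(values, beta=5.0):
    """Choice probabilities from values, with assumed inverse temperature beta."""
    top = max(values)
    exps = [math.exp(beta * (v - top)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]
```

In this framing, a manipulation that promotes model-based choice would show up as a larger fitted `w`: for instance, with cached values favoring action 0 but the planned values favoring action 1, raising `w` from 0 to 1 flips the preferred first-stage action.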

Neuronal Activity during a Cued Strategy Task: Comparison of Dorsolateral, Orbital, and Polar Prefrontal Cortex

Satoshi Tsujimoto, Aldo Genovesio, and Steven P. Wise
J. Neurosci. 2012;32 11017-11031

We compared neuronal activity in the dorsolateral (PFdl), orbital (PFo), and polar (PFp) prefrontal cortex as monkeys performed three tasks. In two tasks, a cue instructed one of two strategies: stay with the previous response or shift to the alternative. Visual stimuli served as cues in one of these tasks; in the other, fluid rewards did so. In the third task, visuospatial cues instructed each response. A delay period followed each cue. As reported previously, PFdl encoded strategies (stay or shift) and responses (left or right) during the cue and delay periods, while PFo encoded strategies and PFp encoded neither strategies nor responses; during the feedback period, all three areas encoded responses, but not strategies. Four novel findings emerged from the present analysis. (1) The strategy encoded by PFdl and PFo cells during the cue and delay periods was modality specific. (2) The response encoded by PFdl cells was task and modality specific during the cue period, but during the delay and feedback periods it became task and modality general. (3) Although some PFdl and PFo cells responded to or anticipated rewards, we could rule out reward effects for most strategy- and response-related activity. (4) Immediately before feedback, only PFp signaled responses that were correct according to the cued strategy; after feedback, only PFo signaled the response that had been made, whether correct or incorrect. These signals support a role in generating responses by PFdl, assigning outcomes to choices by PFo, and assigning outcomes to cognitive processes by PFp.

What and Where Information in the Caudate Tail Guides Saccades to Visual Objects

Shinya Yamamoto, Ilya E. Monosov, Masaharu Yasuda, and Okihide Hikosaka
J. Neurosci. 2012;32 11005-11016 Open Access

We understand the world by making saccadic eye movements to various objects. However, it is unclear how a saccade can be aimed at a particular object, because two kinds of visual information, what the object is and where it is, are processed separately in the dorsal and ventral visual cortical pathways. Here, we provide evidence suggesting that a basal ganglia circuit through the tail of the monkey caudate nucleus (CDt) guides such object-directed saccades. First, many CDt neurons responded to visual objects depending on where and what the objects were. Second, electrical stimulation in the CDt induced saccades whose directions matched the preferred directions of neurons at the stimulation site. Third, many CDt neurons increased their activity before saccades directed to the preferred objects and directions of the neurons in a free-viewing condition. Our results suggest that CDt neurons receive both “what” and “where” information and guide saccades to visual objects.


Deciding When to Decide: Time-Variant Sequential Sampling Models Explain the Emergence of Value-Based Decisions in the Human Brain

Sebastian Gluth, Jörg Rieskamp, and Christian Büchel
J. Neurosci. 2012;32 10686-10698

The cognitive and neuronal mechanisms of perceptual decision making have been successfully linked to sequential sampling models. These models describe the decision process as a gradual accumulation of sensory evidence over time. The temporal evolution of economic choices, however, remains largely unexplored. We tested whether sequential sampling models help to understand the formation of value-based decisions in terms of behavior and brain responses. We used functional magnetic resonance imaging (fMRI) to measure brain activity while human participants performed a buying task in which they freely decided upon how and when to choose. Behavior was accurately predicted by a time-variant sequential sampling model that uses a decreasing rather than fixed decision threshold to estimate the time point of the decision. Presupplementary motor area, caudate nucleus, and anterior insula activation was associated with the accumulation of evidence over time. Furthermore, at the beginning of the decision process the fMRI signal in these regions accounted for trial-by-trial deviations from behavioral model predictions: relatively high activation preceded relatively early responses. The updating of value information was correlated with signals in the ventromedial prefrontal cortex, left and right orbitofrontal cortex, and ventral striatum but also in the primary motor cortex well before the response itself. Our results support a view of value-based decisions as emerging from sequential sampling of evidence and suggest a close link between the accumulation process and activity in the motor system when people are free to respond at any time.
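The difference between a fixed and a decreasing decision threshold can be sketched with a simple one-dimensional accumulator. This is an illustrative simulation, not the model fitted in the paper: the drift, noise level, time step, linear decay of the threshold, and its floor are all assumed values, and the function names are hypothetical.

```python
import random

def decision_time(drift, threshold_at, noise_sd=1.0, dt=0.01,
                  max_steps=2000, seed=0):
    """Accumulate noisy evidence until it crosses a time-dependent bound.

    Returns (time, chose_upper). If no bound is reached within max_steps,
    the time is capped and the sign of the evidence is reported.
    """
    rng = random.Random(seed)
    evidence = 0.0
    for step in range(1, max_steps + 1):
        evidence += drift * dt + rng.gauss(0.0, noise_sd) * dt ** 0.5
        if abs(evidence) >= threshold_at(step * dt):
            return step * dt, evidence > 0
    return max_steps * dt, evidence > 0

def fixed_threshold(t, start=1.5):
    """Time-invariant bound."""
    return start

def decreasing_threshold(t, start=1.5, rate=0.1, floor=0.1):
    """Bound that shrinks linearly over time down to an assumed floor."""
    return max(floor, start - rate * t)
```

Running both variants with the same seed gives the same evidence trajectory, so the decreasing bound can only end the trial at the same moment or earlier, for example `decision_time(0.5, decreasing_threshold, seed=3)` versus `decision_time(0.5, fixed_threshold, seed=3)`. That is the sense in which a shrinking threshold lets the model decide when to decide.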

Activation of Dorsal Raphe Serotonin Neurons Is Necessary for Waiting for Delayed Rewards

Kayoko W. Miyazaki, Katsuhiko Miyazaki, and Kenji Doya
J. Neurosci. 2012;32 10451-10457 Open Access

The forebrain serotonergic system is a crucial component in the control of impulsive behaviors. We previously reported that the activity of serotonin neurons in the midbrain dorsal raphe nucleus increased when rats performed a task that required them to wait for delayed rewards. However, the causal relationship between serotonin neural activity and the tolerance for the delayed reward remained unclear. Here, we test whether the inhibition of serotonin neural activity by the local application of the 5-HT1A receptor agonist 8-hydroxy-2-(di-n-propylamino) tetralin in the dorsal raphe nucleus impairs rats' tolerance for delayed rewards. Rats performed a sequential food-water navigation task that required them to visit food and water sites alternately via a tone site to get rewards at both sites after delays. During the short (2 s) delayed reward condition, the inhibition of serotonin neural activity did not significantly influence the numbers of reward choice errors (nosepoke at an incorrect reward site following a conditioned reinforcer tone), reward wait errors (failure to wait for the delayed rewards), or total trials (sum of reward choice errors, reward wait errors, and acquired rewards). By contrast, during the long (7–11 s) delayed reward condition, the number of wait errors significantly increased while the numbers of total trials and choice errors did not significantly change. These results indicate that the activation of dorsal raphe serotonin neurons is necessary for waiting for long delayed rewards and suggest that elevated serotonin activity facilitates waiting behavior when there is the prospect of forthcoming rewards.