Parietal Cortex and Insula Relate to Evidence Seeking Relevant to Reward-Related Decisions

Nicholas Furl and Bruno B. Averbeck
J. Neurosci. 2011;31:17572-17582 (Open Access)

Do you pay a cost and keep gathering information, or decide right away? Humans tend to decide early relative to the optimal behavior computed by a Bayesian model, and the parietal cortex and insula play a major role in this. http://bit.ly/s7SHPK

Decisions are most effective after collecting sufficient evidence to accurately predict rewarding outcomes. We investigated whether human participants optimally seek evidence and we characterized the brain areas associated with their evidence seeking. Participants viewed sequences of bead colors drawn from hidden urns and attempted to infer the majority bead color in each urn. When viewing each bead color, participants chose either to seek more evidence about the urn by drawing another bead (draw choices) or to infer the urn contents (urn choices). We then compared their evidence seeking against that predicted by a Bayesian ideal observer model. By this standard, participants sampled less evidence than optimal. Also, when faced with urns that had bead color splits closer to chance (60/40 versus 80/20) or potential monetary losses, participants increased their evidence seeking, but the increase was smaller than predicted by the ideal observer model. Functional magnetic resonance imaging showed that urn choices evoked larger hemodynamic responses than draw choices in the insula, striatum, anterior cingulate, and parietal cortex. These parietal responses were greater for participants who sought more evidence on average and for participants who increased their evidence seeking more when draws came from 60/40 urns. The parietal cortex and insula were associated with potential monetary loss. Insula responses also showed modulation with estimates of the expected gains of urn choices. Our findings show that participants sought less evidence than predicted by an ideal observer model and that their evidence-seeking behavior may relate to responses in the insula and parietal cortex.
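The ideal observer for this kind of beads task can be sketched as a small finite-horizon Bellman recursion. The payoff values, draw cost, and horizon below are illustrative placeholders, not the study's actual parameters:

```python
def value(p, draws_left, reward=10.0, loss=0.0, draw_cost=0.25, q=0.6):
    """Value of holding posterior p (that the urn is majority color A) with
    draws_left evidence draws remaining. q is the majority bead proportion
    (0.6 or 0.8 in the study). Returns (expected value, best action)."""
    ev_a = p * reward + (1 - p) * loss          # commit to urn A now
    ev_b = (1 - p) * reward + p * loss          # commit to urn B now
    best, act = max((ev_a, 'urn A'), (ev_b, 'urn B'))
    if draws_left > 0:
        p_next_a = p * q + (1 - p) * (1 - q)    # P(next bead is color A)
        post_a = p * q / p_next_a               # Bayes update after seeing A
        post_b = p * (1 - q) / (1 - p_next_a)   # Bayes update after seeing B
        ev_draw = -draw_cost + (
            p_next_a * value(post_a, draws_left - 1, reward, loss, draw_cost, q)[0]
            + (1 - p_next_a) * value(post_b, draws_left - 1, reward, loss, draw_cost, q)[0])
        if ev_draw > best:
            best, act = ev_draw, 'draw'
    return best, act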

Dopaminergic Modulation of Risky Decision-Making

Nicholas W. Simon, Karienn S. Montgomery, Blanca S. Beas, Marci R. Mitchell, Candi L. LaSarge, Ian A. Mendez, Cristina Banuelos, Colin M. Vokes, Aaron B. Taylor, Rebecca P. Haberman, Jennifer L. Bizon, and Barry Setlow
J. Neurosci. 2011;31:17460-17470

Many psychiatric disorders are characterized by abnormal risky decision-making and dysregulated dopamine receptor expression. The current study was designed to determine how different dopamine receptor subtypes modulate risk-taking in young adult rats, using a “Risky Decision-making Task” that involves choices between small “safe” rewards and large “risky” rewards accompanied by adverse consequences. Rats showed considerable, stable individual differences in risk preference in the task, which were not related to multiple measures of reward motivation, anxiety, or pain sensitivity. Systemic activation of D2-like receptors robustly attenuated risk-taking, whereas drugs acting on D1-like receptors had no effect. Systemic amphetamine also reduced risk-taking, an effect which was attenuated by D2-like (but not D1-like) receptor blockade. Dopamine receptor mRNA expression was evaluated in a separate cohort of drug-naive rats characterized in the task. D1 mRNA expression in both nucleus accumbens shell and insular cortex was positively associated with risk-taking, while D2 mRNA expression in orbitofrontal and medial prefrontal cortex predicted risk preference in opposing nonlinear patterns. Additionally, lower levels of D2 mRNA in dorsal striatum were associated with greater risk-taking. These data strongly implicate dopamine signaling in prefrontal cortical-striatal circuitry in modulating decision-making processes involving integration of reward information with risks of adverse consequences.


Ubiquity and Specificity of Reinforcement Signals throughout the Human Brain

Vickery TJ, Chun MM, and Lee D
Neuron, Volume 72, Issue 1, 166-177, 6 October 2011

Brain activity during a matching-pennies game was measured with fMRI and analyzed using support vector machines. The main conclusion is not "upcoming choices can be predicted from anterior cingulate activity!" but rather "rewards and punishments are encoded throughout the whole brain, not only in the striatum and orbitofrontal cortex." http://1.usa.gov/t0FOfn

Reinforcements and punishments facilitate adaptive behavior in diverse domains ranging from perception to social interactions. A conventional approach to understanding the corresponding neural substrates focuses on the basal ganglia and its dopaminergic projections. Here, we show that reinforcement and punishment signals are surprisingly ubiquitous in the gray matter of nearly every subdivision of the human brain. Humans played either matching-pennies or rock-paper-scissors games against computerized opponents while being scanned using fMRI. Multivoxel pattern analysis was used to decode previous choices and their outcomes, and to predict upcoming choices. Whereas choices were decodable from a confined set of brain structures, their outcomes were decodable from nearly all cortical and subcortical structures. In addition, signals related to both reinforcements and punishments were recovered reliably in many areas and displayed patterns not consistent with salience-based explanations. Thus, reinforcement and punishment might play global modulatory roles in the entire brain.
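The decoding logic behind multivoxel pattern analysis can be illustrated on synthetic data. This nearest-class-mean classifier is a simple stand-in for the linear SVMs used in the study, and every number below is made up; the point is only how above-chance leave-one-out accuracy on a region's voxel pattern is taken as evidence that the region carries outcome information:

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_voxels = 200, 50
y = rng.integers(0, 2, n_trials)              # 1 = reinforcement, 0 = punishment
pattern = rng.normal(0, 1, n_voxels)          # fixed outcome-linked voxel pattern
X = rng.normal(0, 1, (n_trials, n_voxels)) + np.outer(y, pattern)

def loo_decode(X, y):
    """Leave-one-out nearest-class-mean decoding accuracy for one 'region'."""
    hits = 0
    for i in range(len(y)):
        mask = np.arange(len(y)) != i          # hold out trial i
        m0 = X[mask & (y == 0)].mean(axis=0)   # mean punishment pattern
        m1 = X[mask & (y == 1)].mean(axis=0)   # mean reinforcement pattern
        pred = int(np.linalg.norm(X[i] - m1) < np.linalg.norm(X[i] - m0))
        hits += pred == y[i]
    return hits / len(y)

acc = loo_decode(X, y)   # well above the 0.5 chance level for this synthetic region
```

A region whose voxels contain no outcome-linked pattern would decode at chance, which is the contrast the study uses across subdivisions.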

Double dissociation of value computations in orbitofrontal and anterior cingulate neurons

Steven W Kennerley, Timothy E J Behrens and Jonathan D Wallis
Nature Neuroscience 14, 1581–1589 (2011)

This study reports a double dissociation in the neuronal correlates of value-based decision making in the monkey prefrontal cortex: orbitofrontal cortex neurons encode choice value relative to recent choice values, whereas anterior cingulate cortex neurons flexibly encode multiple decision parameters and reward prediction errors using a 'common valuation currency'.

Expectancy-related changes in firing of dopamine neurons depend on orbitofrontal cortex

Yuji K Takahashi, Matthew R Roesch, Robert C Wilson, Kathy Toreson, Patricio O'Donnell, Yael Niv and Geoffrey Schoenbaum
Nature Neuroscience 14, 1590–1597 (2011)

The orbitofrontal cortex (OFC) has been hypothesized to carry information regarding the value of expected rewards. Such information could be used for generating instructive error signals conveyed by dopamine neurons. Here the authors report that this is indeed the case. However, contrary to the simplest hypothesis, OFC lesions did not result in the loss of all value information. Instead, lesions caused the loss of value information derived from model-based representations.

It's a pleasure: a tale of two cortical areas

Daeyeol Lee
Nature Neuroscience 14, 1491–1492 (2011) doi:10.1038/nn.2981

Reward signals are widespread in the brain, but why? A study now identifies an important difference in the reward signals encoded by the neurons in the primate anterior cingulate and orbitofrontal cortices during decision making, suggesting that reward-related activity in these areas is shaped by different contextual factors.


Dissociable Reward and Timing Signals in Human Midbrain and Ventral Striatum

M.C. Klein-Flügge, L.T. Hunt, D.R. Bach, R.J. Dolan, and T.E.J. Behrens
Neuron, Volume 72, Issue 4, 654-664, 17 November 2011

Reward prediction error (RPE) signals are central to current models of reward-learning. Temporal difference (TD) learning models posit that these signals should be modulated by predictions, not only of magnitude but also timing of reward. Here we show that BOLD activity in the VTA conforms to such TD predictions: responses to unexpected rewards are modulated by a temporal hazard function and activity between a predictive stimulus and reward is depressed in proportion to predicted reward. By contrast, BOLD activity in ventral striatum (VS) does not reflect a TD RPE, but instead encodes a signal on the variable relevant for behavior, here timing but not magnitude of reward. The results have important implications for dopaminergic models of cortico-striatal learning and suggest a modification of the conventional view that VS BOLD necessarily reflects inputs from dopaminergic VTA neurons signaling an RPE.
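The TD account being tested can be sketched with a minimal tapped-delay-line model (a simplified stand-in for the paper's actual model; all parameters are illustrative). After learning, a reward delivered at the predicted time elicits no RPE, while omitting it produces a negative RPE at exactly the expected moment:

```python
import numpy as np

# Tapped-delay-line TD(0): states are time steps after a cue; reward at step T.
T, gamma, alpha = 5, 1.0, 0.1
V = np.zeros(T + 1)                     # value of each post-cue time step (V[T] terminal)

def run_trial(V, r=1.0):
    """One trial; returns the TD errors delta_t = r_t + gamma*V[t+1] - V[t]."""
    deltas = np.zeros(T)
    for t in range(T):
        r_t = r if t == T - 1 else 0.0  # reward (if any) arrives at the last step
        delta = r_t + gamma * V[t + 1] - V[t]
        V[t] += alpha * delta
        deltas[t] = delta
    return deltas

for _ in range(500):                    # learn the predicted reward timing
    run_trial(V)

trained = run_trial(V.copy())           # reward at the expected time: RPEs ~ 0
omitted = run_trial(V.copy(), r=0.0)    # omission: negative RPE at the expected time
```

The hazard-function modulation reported for VTA corresponds to the way such timing predictions shape the RPE at each possible reward time.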


Hedging your bets by learning reward correlations in the human brain.

Wunderlich K, Symmonds M, Bossaerts P, Dolan RJ.
Neuron. 2011 Sep 22;71(6):1141-52. Epub 2011 Sep 21.


Human subjects are proficient at tracking the mean and variance of rewards and updating these via prediction errors. Here, we addressed whether humans can also learn about higher-order relationships between distinct environmental outcomes, a defining ecological feature of contexts where multiple sources of rewards are available. By manipulating the degree to which distinct outcomes are correlated, we show that subjects implemented an explicit model-based strategy to learn the associated outcome correlations and were adept in using that information to dynamically adjust their choices in a task that required a minimization of outcome variance. Importantly, the experimentally generated outcome correlations were explicitly represented neuronally in right midinsula with a learning prediction error signal expressed in rostral anterior cingulate cortex. Thus, our data show that the human brain represents higher-order correlation structures between rewards, a core adaptive ability whose immediate benefit is optimized sampling.
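The variance-minimization objective in this task follows the standard two-outcome hedging algebra; a short sketch with arbitrary numbers shows why learning the correlation between outcomes matters for choosing an allocation:

```python
def min_variance_weight(s1, s2, rho):
    """Weight w on outcome 1 that minimizes Var(w*X1 + (1-w)*X2), given the
    outcomes' standard deviations s1, s2 and correlation rho (undefined in the
    degenerate case s1 == s2 and rho == 1, where the outcomes are identical)."""
    cov = rho * s1 * s2
    return (s2**2 - cov) / (s1**2 + s2**2 - 2 * cov)

def outcome_variance(w, s1, s2, rho):
    """Variance of the combined outcome w*X1 + (1-w)*X2."""
    return w**2 * s1**2 + (1 - w)**2 * s2**2 + 2 * w * (1 - w) * rho * s1 * s2

# Perfectly anticorrelated, equal-variance outcomes: an even split removes all
# risk, while the same split under rho = 0 leaves residual variance.
w = min_variance_weight(1.0, 1.0, -1.0)
```

Misestimating the correlation shifts the chosen weight away from the variance-minimizing split, which is the behavioral signature the task exploits.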

Dorsolateral and ventromedial prefrontal cortex orchestrate normative choice

Thomas Baumgartner, Daria Knoch, Philine Hotz, Christoph Eisenegger & Ernst Fehr
Nature Neuroscience 14, 1468–1474 (2011)


Humans are noted for their capacity to over-ride self-interest in favor of normatively valued goals. We examined the neural circuitry that is causally involved in normative, fairness-related decisions by generating a temporarily diminished capacity for costly normative behavior, a 'deviant' case, through non-invasive brain stimulation (repetitive transcranial magnetic stimulation) and compared normal subjects' functional magnetic resonance imaging signals with those of the deviant subjects. When fairness and economic self-interest were in conflict, normal subjects (who make costly normative decisions at a much higher frequency) displayed significantly higher activity in, and connectivity between, the right dorsolateral prefrontal cortex (DLPFC) and the posterior ventromedial prefrontal cortex (pVMPFC). In contrast, when there was no conflict between fairness and economic self-interest, both types of subjects displayed identical neural patterns and behaved identically. These findings suggest that a parsimonious prefrontal network, the activation of right DLPFC and pVMPFC, and the connectivity between them, facilitates subjects' willingness to incur the cost of normative decisions.