Hierarchical Learning Induces Two Simultaneous, But Separable, Prediction Errors in Human Basal Ganglia

Carlos Diuk, Karin Tsai, Jonathan Wallis, Matthew Botvinick, and Yael Niv
J. Neurosci. 2013;33 5797-5805

Studies suggest that dopaminergic neurons report a unitary, global reward prediction error signal. However, learning in complex real-life tasks, in particular tasks with hierarchical structure, requires multiple prediction errors that may coincide in time. We used functional neuroimaging to measure prediction error signals in humans performing such a hierarchical task involving simultaneous, uncorrelated prediction errors. Analysis of signals in a priori anatomical regions of interest in the ventral striatum and the ventral tegmental area revealed two simultaneous, but separable, prediction error signals corresponding to the two levels of hierarchy in the task. This result suggests that suitably designed tasks may reveal a more intricate pattern of firing in dopaminergic neurons. Moreover, the need for downstream separation of these signals implies possible limitations on the number of different task levels that we can learn about simultaneously.
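The core computational idea — two prediction errors coexisting on the same trial, one per level of the task hierarchy — can be illustrated with a toy example. Everything below (the Rescorla–Wagner-style update, the learning rate, the trial sequence) is an illustrative assumption, not the authors' model or task:

```python
# Minimal sketch (not the published model): two prediction errors,
# one per hierarchy level, computed and updated on the same trial.

def td_update(value, outcome, alpha=0.1):
    """One Rescorla-Wagner step; returns (new_value, prediction_error)."""
    pe = outcome - value
    return value + alpha * pe, pe

v_low, v_high = 0.0, 0.0  # learned values at the two hierarchy levels
# each tuple: (reward on this trial, whether the higher-level subgoal was reached)
for reward, subgoal_reached in [(1, 0), (0, 1), (1, 1)]:
    v_low, pe_low = td_update(v_low, reward)              # lower-level PE
    v_high, pe_high = td_update(v_high, subgoal_reached)  # higher-level PE
    # pe_low and pe_high occur simultaneously yet can be uncorrelated,
    # which is what lets a GLM separate the two signals in the fMRI data
```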

A Computational Framework for Understanding Decision Making through Integration of Basic Learning Rules

Maxim Bazhenov, Ramon Huerta, and Brian H. Smith
J. Neurosci. 2013;33 5686-5697

Nonassociative and associative learning rules simultaneously modify neural circuits. However, it remains unclear how these forms of plasticity interact to produce conditioned responses. Here we integrate nonassociative and associative conditioning within a unified model of olfactory learning in the honeybee. Honeybees show a fairly abrupt increase in conditioned response after a number of conditioning trials, and this abrupt change requires many more trials when conditioning is preceded by nonassociative (unreinforced) exposure than when associative conditioning is used alone. We found that the interaction of unsupervised and supervised learning rules is critical for explaining the latent inhibition phenomenon. Associative conditioning combined with mutual inhibition between the output neurons produces an abrupt increase in performance despite smooth changes in the synaptic weights. The results show that an integrated set of learning rules implemented using fan-out connectivities together with neural inhibition can explain a broad range of experimental data on learning behaviors.
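The key mechanism — smoothly changing synaptic weights producing an abrupt behavioral transition via mutual inhibition between output neurons — can be caricatured in a few lines. The threshold form of the inhibition and all parameter values below are illustrative assumptions, not the published model:

```python
# Illustrative sketch: associative weights grow smoothly trial by trial,
# but winner-take-all mutual inhibition between a "respond" and a
# "don't respond" output neuron makes behavior flip abruptly.

def response(w_cs, w_baseline=0.5):
    """Mutual inhibition caricatured as winner-take-all:
    only the output neuron with the stronger input fires."""
    return 1 if w_cs > w_baseline else 0

w = 0.0
behavior = []
for trial in range(10):
    w += 0.1 * (1.0 - w)        # smooth associative weight growth
    behavior.append(response(w))
# w changes gradually every trial, yet behavior jumps from 0 to 1 once
```

With these numbers the conditioned response appears abruptly on trial 7, even though the weight trajectory itself has no discontinuity.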


Predicting free choices for abstract intentions

Chun Siong Soon, Anna Hanxi He, Stefan Bode, and John-Dylan Haynes
PNAS March 18, 2013 201212218

Even an abstract decision (e.g., whether to add or to subtract) is already under way several seconds before it enters awareness: the upcoming choice can be predicted from brain activity (fMRI signals in the frontopolar cortex and precuneus) 4 s before the decision is made consciously. http://www.pnas.org/content/early/2013/03/14/1212218110

Unconscious neural activity has been repeatedly shown to precede and potentially even influence subsequent free decisions. However, to date, such findings have been mostly restricted to simple motor choices, and despite considerable debate, there is no evidence that the outcome of more complex free decisions can be predicted from prior brain signals. Here, we show that the outcome of a free decision to either add or subtract numbers can already be decoded from neural activity in medial prefrontal and parietal cortex 4 s before the participant reports they are consciously making their choice. These choice-predictive signals co-occurred with the so-called default mode brain activity pattern that was still dominant at the time when the choice-predictive signals occurred. Our results suggest that unconscious preparation of free choices is not restricted to motor preparation. Instead, decisions at multiple scales of abstraction evolve from the dynamics of preceding brain activity.


Neural Changes with Tactile Learning Reflect Decision-Level Reweighting of Perceptual Readout

K. Sathian, Gopikrishna Deshpande, and Randall Stilla
J. Neurosci. 2013;33 5387-5398

Despite considerable work, the neural basis of perceptual learning remains uncertain. For visual learning, although some studies suggested that changes in early sensory representations are responsible, other studies point to decision-level reweighting of perceptual readout. These competing possibilities have not been examined in other sensory systems, although doing so could help resolve the issue. Here we report a study of human tactile microspatial learning in which participants achieved a greater than sixfold decline in acuity threshold after multiple training sessions. Functional magnetic resonance imaging was performed during performance of the tactile microspatial task and a control, tactile temporal task. Effective connectivity between relevant brain regions was estimated using multivariate, autoregressive models of hidden neuronal variables obtained by deconvolution of the hemodynamic response. Training-specific increases in task-selective activation assessed using the task × session interaction and associated changes in effective connectivity primarily involved subcortical and anterior neocortical regions implicated in motor and/or decision processes, rather than somatosensory cortical regions. A control group of participants tested twice, without intervening training, exhibited neither threshold improvement nor increases in task-selective activation. Our observations argue that neuroplasticity mediating perceptual learning occurs at the stage of perceptual readout by decision networks. This is consonant with the growing shift away from strictly modular conceptualization of the brain toward the idea that complex network interactions underlie even simple tasks. The convergence of our findings on tactile learning with recent studies of visual learning reconciles earlier discrepancies in the literature on perceptual learning.

Dopaminergic Reward Signals Selectively Decrease fMRI Activity in Primate Visual Cortex

John T. Arsenault, Koen Nelissen, Bechir Jarraya, Wim Vanduffel
Neuron, Volume 77, Issue 6, 1174-1186, 20 March 2013

Stimulus-reward coupling without attention can induce highly specific perceptual learning effects, suggesting that reward triggers selective plasticity within visual cortex. Additionally, dopamine-releasing events—temporally surrounding stimulus-reward associations—selectively enhance memory. These forms of plasticity may be evoked by selective modulation of stimulus representations during dopamine-inducing events. However, it remains to be shown whether dopaminergic signals can selectively modulate visual cortical activity. We measured fMRI activity in monkey visual cortex during reward-only trials apart from intermixed cue-reward trials. Reward without visual stimulation selectively decreased fMRI activity within the cue representations that had been paired with reward during other trials. Behavioral tests indicated that these same uncued reward trials strengthened cue-reward associations. Furthermore, such spatially-specific activity modulations depended on prediction error, as shown by manipulations of reward magnitude, cue-reward probability, cue-reward familiarity, and dopamine signaling. This cue-selective negative reward signal offers a mechanism for selectively gating sensory cortical plasticity.


Resting-State Functional Connectivity Predicts Impulsivity in Economic Decision-Making

Nan Li, Ning Ma, Ying Liu, Xiao-Song He, De-Lin Sun, Xian-Ming Fu, Xiaochu Zhang, Shihui Han, and Da-Ren Zhang
J. Neurosci. 2013;33 4886-4895

Increasing neuroimaging evidence suggests an association between impulsive decision-making behavior and task-related brain activity. However, the relationship between impulsivity in decision-making and resting-state brain activity remains unknown. To address this issue, we used functional MRI to record brain activity from human adults during a resting state and during a delay discounting task (DDT) that requires choosing between an immediate smaller reward and a larger delayed reward. In experiment I, we identified four DDT-related brain networks. The money network (the striatum, posterior cingulate cortex, etc.) and the time network (the medial and dorsolateral prefrontal cortices, etc.) were associated with the valuation process; the frontoparietal network and the dorsal anterior cingulate cortex–anterior insular cortex network were related to the choice process. Moreover, we found that the resting-state functional connectivity of the brain regions in these networks was significantly correlated with participants' discounting rate, a behavioral index of impulsivity during the DDT. In experiment II, we tested an independent group of subjects and demonstrated that this resting-state functional connectivity was able to predict individuals' discounting rates. Together, these findings suggest that resting-state functional organization of the human brain may be a biomarker of impulsivity and can predict economic decision-making behavior.
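The "discounting rate" used as the impulsivity index in delay discounting tasks is conventionally the k of a hyperbolic model, V = A / (1 + kD). A minimal sketch of the choice rule, with amounts, delays, and k values chosen purely for illustration:

```python
# Hyperbolic delay discounting: the standard behavioral model behind
# the DDT's discounting-rate index (parameter values are assumptions).

def discounted_value(amount, delay, k):
    """V = A / (1 + k * D); higher k devalues delayed rewards faster."""
    return amount / (1.0 + k * delay)

def choose(immediate, delayed, delay, k):
    """Pick 'now' if the immediate reward beats the discounted
    value of the delayed one, else 'later'."""
    return "now" if immediate > discounted_value(delayed, delay, k) else "later"

# a more impulsive chooser (large k) takes $20 now over $100 in 30 days
print(choose(20, 100, 30, k=0.2))   # -> now
print(choose(20, 100, 30, k=0.01))  # -> later
```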

Multiphasic Temporal Dynamics in Responses of Midbrain Dopamine Neurons to Appetitive and Aversive Stimuli

Christopher D. Fiorillo, Minryung R. Song, and Sora R. Yun
J. Neurosci. 2013;33 4710-4725

The transient response of dopamine neurons has been described as reward prediction error (RPE), with activation or suppression by events that are better or worse than expected, respectively. However, at least a minority of neurons are activated by aversive or high-intensity stimuli, casting doubt on the generality of RPE in describing the dopamine signal. To overcome limitations of previous studies, we studied neuronal responses to a wider variety of high-intensity and aversive stimuli, and we quantified and controlled aversiveness through a choice task in which macaques sacrificed juice to avoid aversive stimuli. Whereas most previous work has portrayed the RPE as a single impulse or “phase,” here we demonstrate its multiphasic temporal dynamics. Aversive or high-intensity stimuli evoked a triphasic sequence of activation-suppression-activation extending over a period of 40–700 ms. The initial activation at short latencies (40–120 ms) reflected sensory intensity. The influence of motivational value became dominant between 150 and 250 ms, with activation in the case of appetitive stimuli, and suppression in the case of aversive and neutral stimuli. The previously unreported late activation appeared to be a modest “rebound” after strong suppression. Similarly, strong activation by reward was often followed by suppression. We suggest that these “rebounds” may result from overcompensation by homeostatic mechanisms in some cells. Our results are consistent with a realistic RPE, which evolves over time through a dynamic balance of excitation and inhibition.
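For reference, the canonical RPE against which these multiphasic dynamics are being compared is the temporal-difference error. This is the textbook definition, not the paper's analysis code, and the numbers below are illustrative:

```python
# Textbook reward prediction error (temporal-difference form):
# delta = r + gamma * V(s') - V(s)

def td_error(r, v_next, v_current, gamma=0.95):
    """Positive when the outcome is better than expected (activation),
    negative when worse than expected (suppression)."""
    return r + gamma * v_next - v_current

assert td_error(1.0, 0.0, 0.2) > 0  # better than expected -> activation
assert td_error(0.0, 0.0, 0.2) < 0  # worse than expected -> suppression
```

The paper's point is that the dopamine response is not this single scalar impulse but a sequence of phases (sensory intensity, then value, then rebound) unfolding over hundreds of milliseconds.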

Diversity and Homogeneity in Responses of Midbrain Dopamine Neurons

Christopher D. Fiorillo, Sora R. Yun, and Minryung R. Song
J. Neurosci. 2013;33 4693-4709

Dopamine neurons of the ventral midbrain have been found to signal a reward prediction error that can mediate positive reinforcement. Despite the demonstration of modest diversity at the cellular and molecular levels, there has been little analysis of response diversity in behaving animals. Here we examine response diversity in rhesus macaques to appetitive, aversive, and neutral stimuli having relative motivational values that were measured and controlled through a choice task. First, consistent with previous studies, we observed a continuum of response variability and an apparent absence of distinct clusters in scatter plots, suggesting a lack of statistically discrete subpopulations of neurons. Second, we found that a group of “sensitive” neurons tend to be more strongly suppressed by a variety of stimuli and to be more strongly activated by juice. Third, neurons in the “ventral tier” of substantia nigra were found to have greater suppression, and a subset of these had higher baseline firing rates and late “rebound” activation after suppression. These neurons could belong to a previously identified subgroup of dopamine neurons that express high levels of H-type cation channels but lack calbindin. Fourth, neurons further rostral exhibited greater suppression. Fifth, although we observed weak activation of some neurons by aversive stimuli, this was not associated with their aversiveness. In conclusion, we find a diversity of response properties, distributed along a continuum, within what may be a single functional population of neurons signaling reward prediction error.


Indirect reciprocity is sensitive to costs of information transfer

Our paper has been published in Scientific Reports!
"Indirect reciprocity is sensitive to costs of information transfer"
Shinsuke Suzuki & Hiromichi Kimura
Scientific Reports 3, Article number: 1435 doi:10.1038/srep01435


"Cost of reputation building vanishes indirect reciprocity"

・Each individual is assigned a reputation according to its behavior (cooperate or defect)
— or so it had been said. ← This mechanism is called "indirect reciprocity."


I thought this was an important point myself, but I worried it might be dismissed as obvious, so I decided to let readers judge for themselves and submitted it to Scientific Reports, a Nature-family open-access journal.

・Feb 5: referee reports returned (minor revision)


Dorsolateral prefrontal and orbitofrontal cortex interactions during self-control of cigarette craving

Takuya Hayashi, Ji Hyun Ko, Antonio P. Strafella, and Alain Dagher
PNAS 2013 110 (11) 4422-4427

Drug-related cues induce craving, which may perpetuate drug use or trigger relapse in addicted individuals. Craving is also under the influence of other factors in daily life, such as drug availability and self-control. Neuroimaging studies using drug cue paradigms have shown frontal lobe involvement in this contextual influence on cue reactivity, but have not clarified how and which frontal area accounts for this phenomenon. We explored frontal lobe contributions to cue-induced drug craving under different intertemporal drug availability conditions by combining transcranial magnetic stimulation and functional magnetic resonance imaging in smokers. We hypothesized that the dorsolateral prefrontal cortex (DLPFC) regulates craving during changes in intertemporal availability. Subjective craving was greater when cigarettes were immediately available, and this effect was eliminated by transiently inactivating the DLPFC with transcranial magnetic stimulation. Functional magnetic resonance imaging demonstrated that the signal most proportional to subjective craving was located in the medial orbitofrontal cortex across all contexts, whereas the DLPFC most strongly encoded intertemporal availability information. The craving-related signal in the medial orbitofrontal cortex was attenuated by inactivation of the DLPFC, particularly when cigarettes were immediately available. Inactivation of the DLPFC also reduced craving-related signals in the anterior cingulate and ventral striatum, areas implicated in transforming value signals into action. These findings indicate that DLPFC builds up value signals based on knowledge of drug availability, and support a model wherein aberrant circuitry linking dorsolateral prefrontal and orbitofrontal cortices may underlie addiction.




I had a hard time getting "cappuccino" across…

The umbrella I bought at DAISO USA last month has been getting a lot of use.

All I really have to say is "Hi. Good morning" (← which is why my English never improves…)

After that, I picked up my wife, who was in the middle of an English lesson with her Conversation Partner, and we were home before 8 p.m.

Reward Prediction Error Signal Enhanced by Striatum–Amygdala Interaction Explains the Acceleration of Probabilistic Reward Learning by Emotion

Noriya Watanabe, Masamichi Sakagami, and Masahiko Haruno
The Journal of Neuroscience, 6 March 2013, 33(10):4487-4493; doi:10.1523/JNEUROSCI.3400-12.2013


Learning does not only depend on rationality, because real-life learning cannot be isolated from emotion or social factors. Therefore, it is intriguing to determine how emotion changes learning, and to identify which neural substrates underlie this interaction. Here, we show that the task-independent presentation of an emotional face before a reward-predicting cue increases the speed of cue–reward association learning in human subjects compared with trials in which a neutral face is presented. This phenomenon was attributable to an increase in the learning rate, which regulates reward prediction errors. Parallel to these behavioral findings, functional magnetic resonance imaging demonstrated that presentation of an emotional face enhanced reward prediction error (RPE) signal in the ventral striatum. In addition, we also found a functional link between this enhanced RPE signal and increased activity in the amygdala following presentation of an emotional face. Thus, this study revealed an acceleration of cue–reward association learning by emotion, and underscored a role of striatum–amygdala interactions in the modulation of the reward prediction errors by emotion.
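The reported mechanism — the emotional face raising the learning rate that scales the reward prediction error — can be sketched in a few lines. The update rule is the standard delta rule, and the parameter values are assumptions for illustration, not the fitted ones:

```python
# Sketch: a larger learning rate alpha (as on emotional-face trials)
# speeds cue-reward association learning via the same RPE update.

def learn(alpha, n_trials=20, reward=1.0):
    """Run n_trials of delta-rule learning toward a fixed reward."""
    q = 0.0
    for _ in range(n_trials):
        rpe = reward - q   # reward prediction error
        q += alpha * rpe   # update scaled by the learning rate
    return q

q_neutral = learn(alpha=0.1)    # neutral-face trials
q_emotional = learn(alpha=0.3)  # emotional-face trials: larger alpha
assert q_emotional > q_neutral  # faster convergence to the reward value
```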