Confidence Is the Bridge between Multi-stage Decisions

van den Berg R, Zylberberg A, Kiani R, Shadlen MN, Wolpert DM
Curr Biol. 2016 Dec 5;26(23):3157-3168. doi: 10.1016/j.cub.2016.10.021.

Demanding tasks often require a series of decisions to reach a goal. Recent progress in perceptual decision-making has served to unite decision accuracy, speed, and confidence in a common framework of bounded evidence accumulation, furnishing a platform for the study of such multi-stage decisions. In many instances, the strategy applied to each decision, such as the speed-accuracy trade-off, ought to depend on the accuracy of the previous decisions. However, as the accuracy of each decision is often unknown to the decision maker, we hypothesized that subjects may carry forward a level of confidence in previous decisions to affect subsequent decisions. Subjects made two perceptual decisions sequentially and were rewarded only if they made both correctly. The speed and accuracy of individual decisions were explained by noisy evidence accumulation to a terminating bound. We found that subjects adjusted their speed-accuracy setting by elevating the termination bound on the second decision in proportion to their confidence in the first. The findings reveal a novel role for confidence and a degree of flexibility, hitherto unknown, in the brain's ability to rapidly and precisely modify the mechanisms that control the termination of a decision.
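
As a rough illustration of the mechanism this abstract describes, the sketch below simulates noisy evidence accumulation to a terminating bound and compares a low and a high bound on the second decision. It is not the authors' model: the function names, the `gain` parameter, the linear confidence-to-bound rule, and all numeric values are assumptions chosen for illustration.

```python
import random

def simulate_decision(drift, bound, dt=0.01, noise_sd=1.0, rng=None, max_time=10.0):
    """Accumulate noisy evidence until it reaches +bound or -bound.

    Returns (correct, decision_time). The drift is positive, so ending
    above zero counts as the correct choice.
    """
    rng = rng or random.Random()
    evidence, t = 0.0, 0.0
    while abs(evidence) < bound and t < max_time:
        evidence += drift * dt + noise_sd * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        t += dt
    return evidence > 0, t

def second_bound(base_bound, confidence, gain=0.5):
    """Hypothetical rule: raise the bound in proportion to confidence (0..1)."""
    return base_bound * (1.0 + gain * confidence)

def summarize(bound, n_trials=2000, drift=1.0, seed=0):
    """Return (accuracy, mean decision time) over simulated trials."""
    rng = random.Random(seed)
    results = [simulate_decision(drift, bound, rng=rng) for _ in range(n_trials)]
    accuracy = sum(correct for correct, _ in results) / n_trials
    mean_rt = sum(t for _, t in results) / n_trials
    return accuracy, mean_rt

# Second decision after a low-confidence vs. a high-confidence first decision.
low = summarize(second_bound(1.0, confidence=0.0))
high = summarize(second_bound(1.0, confidence=1.0))
print("low confidence  -> accuracy %.3f, mean time %.2f s" % low)
print("high confidence -> accuracy %.3f, mean time %.2f s" % high)
```

Under these assumptions, the higher bound trades speed for accuracy: decisions take longer on average but are correct more often.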


Dynamic neural architecture for social knowledge retrieval

Y Wang et al.
Proc Natl Acad Sci U S A 114 (16), E3305-E3314. 2017 Mar 13.

Social behavior is often shaped by the rich storehouse of biographical information that we hold for other people. In our daily life, we rapidly and flexibly retrieve a host of biographical details about individuals in our social network, which often guide our decisions as we navigate complex social interactions. Even abstract traits associated with an individual, such as their political affiliation, can cue a rich cascade of person-specific knowledge. Here, we asked whether the anterior temporal lobe (ATL) serves as a hub for a distributed neural circuit that represents person knowledge. Fifty participants across two studies learned biographical information about fictitious people in a 2-day training paradigm. On day 3, they retrieved this biographical information while undergoing an fMRI scan. A series of multivariate and connectivity analyses suggest that the ATL stores abstract person identity representations. Moreover, this region coordinates interactions with a distributed network to support the flexible retrieval of person attributes. Together, our results suggest that the ATL is a central hub for representing and retrieving person knowledge.


Observational learning computations in neurons of the human anterior cingulate cortex

Michael R. Hill, Erie D. Boorman & Itzhak Fried
Nature Communications 7, Article number: 12722 (2016) doi:10.1038/ncomms12722

When learning from direct experience, neurons in the primate brain have been shown to encode a teaching signal used by algorithms in artificial intelligence: the reward prediction error (PE)—the difference between how rewarding an event is, and how rewarding it was expected to be. However, in humans and other species learning often takes place by observing other individuals. Here, we show that, when humans observe other players in a card game, neurons in their rostral anterior cingulate cortex (rACC) encode both the expected value of an observed choice, and the PE after the outcome was revealed. Notably, during the same task neurons recorded in the amygdala (AMY) and the rostromedial prefrontal cortex (rmPFC) do not exhibit this type of encoding. Our results suggest that humans learn by observing others, at least in part through the encoding of observational PEs in single neurons in the rACC.
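
The prediction error defined in this abstract can be written as a one-line update rule. The sketch below is a generic delta-rule learner, not the model fitted in the paper; `update_value`, the learning rate, and the sequence of observed outcomes are illustrative assumptions.

```python
def update_value(expected, reward, learning_rate=0.1):
    """Delta-rule update: move the expectation toward the observed reward.

    The prediction error (PE) is how rewarding the outcome was minus how
    rewarding it was expected to be.
    """
    pe = reward - expected
    return expected + learning_rate * pe, pe

# An observer tracks the value of a deck by watching another player's
# outcomes (1.0 = win, 0.0 = loss). The outcomes here are made up.
expected = 0.0
for reward in [1.0, 1.0, 0.0, 1.0]:
    expected, pe = update_value(expected, reward, learning_rate=0.5)
    print("reward %.1f  PE %+.3f  new expectation %.4f" % (reward, pe, expected))
```

The same rule applies whether the outcome is experienced directly or observed; only the source of `reward` changes.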


Neurons in the primate dorsal striatum signal the uncertainty of object–reward associations

J. Kael White & Ilya E. Monosov
Nature Communications 7, Article number: 12735 (2016) doi:10.1038/ncomms12735

To learn, obtain reward and survive, humans and other animals must monitor, approach and act on objects that are associated with variable or unknown rewards. However, the neuronal mechanisms that mediate behaviours aimed at uncertain objects are poorly understood. Here we demonstrate that a set of neurons in internal-capsule-bordering regions of the primate dorsal striatum, within the putamen and caudate nucleus, signal the uncertainty of object–reward associations. Their uncertainty responses depend on the presence of objects associated with reward uncertainty and evolve rapidly as monkeys learn novel object–reward associations. Therefore, beyond its established role in mediating actions aimed at known or certain rewards, the dorsal striatum also participates in behaviours aimed at reward-uncertain objects.


Blunted ventral striatal responses to anticipated rewards foreshadow problematic drug use in novelty-seeking adolescents

Christian Büchel, Jan Peters[…]the IMAGEN consortium
Nature Communications 8, Article number: 14140 (2017) doi:10.1038/ncomms14140

Novelty-seeking tendencies in adolescents may promote innovation as well as problematic impulsive behaviour, including drug abuse. Previous research has not clarified whether neural hyper- or hypo-responsiveness to anticipated rewards promotes vulnerability in these individuals. Here we use a longitudinal design to track 144 novelty-seeking adolescents at age 14 and 16 to determine whether neural activity in response to anticipated rewards predicts problematic drug use. We find that diminished BOLD activity in mesolimbic (ventral striatal and midbrain) and prefrontal cortical (dorsolateral prefrontal cortex) regions during reward anticipation at age 14 predicts problematic drug use at age 16. Lower psychometric conscientiousness and steeper discounting of future rewards at age 14 also predict problematic drug use at age 16, but the neural responses independently predict more variance than psychometric measures. Together, these findings suggest that diminished neural responses to anticipated rewards in novelty-seeking adolescents may increase vulnerability to future problematic drug use.


Striatal prediction errors support dynamic control of declarative memory decisions

Jason M. Scimeca, Perri L. Katzman & David Badre
Nature Communications 7, Article number: 13061 (2016) doi:10.1038/ncomms13061

Adaptive memory requires context-dependent control over how information is retrieved, evaluated and used to guide action, yet the signals that drive adjustments to memory decisions remain unknown. Here we show that prediction errors (PEs) coded by the striatum support control over memory decisions. Human participants completed a recognition memory test that incorporated biased feedback to influence participants’ recognition criterion. Using model-based fMRI, we find that PEs—the deviation between the outcome and expected value of a memory decision—correlate with striatal activity and predict individuals’ final criterion. Importantly, the striatal PEs are scaled relative to memory strength rather than the expected trial outcome. Follow-up experiments show that the learned recognition criterion transfers to free recall, and targeting biased feedback to experimentally manipulate the magnitude of PEs influences criterion consistent with PEs scaled relative to memory strength. This provides convergent evidence that declarative memory decisions can be regulated via striatally mediated reinforcement learning signals.


Computations Underlying Social Hierarchy Learning: Distinct Neural Mechanisms for Updating and Representing Self-Relevant Information

Kumaran D, Banino A, Blundell C, Hassabis D, Dayan P.
Neuron. 2016 Dec 7;92(5):1135-1147. doi: 10.1016/j.neuron.2016.10.052.

Knowledge about social hierarchies organizes human behavior, yet we understand little about the underlying computations. Here we show that a Bayesian inference scheme, which tracks the power of individuals, better captures behavioral and neural data compared with a reinforcement learning model inspired by rating systems used in games such as chess. We provide evidence that the medial prefrontal cortex (MPFC) selectively mediates the updating of knowledge about one's own hierarchy, as opposed to that of another individual, a process that underpinned successful performance and involved functional interactions with the amygdala and hippocampus. In contrast, we observed domain-general coding of rank in the amygdala and hippocampus, even when the task did not require it. Our findings reveal the computations underlying a core aspect of social cognition and provide new evidence that self-relevant information may indeed be afforded a unique representational status in the brain.
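
The abstract contrasts its Bayesian scheme with a reinforcement learning model inspired by chess-style rating systems. For orientation, the sketch below shows the standard Elo update, which is one such rating system; it is an assumption that the paper's comparison model resembles it, and the constants `k=32` and `scale=400` are the conventional chess defaults rather than values from the paper.

```python
def elo_expected(r_a, r_b, scale=400.0):
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / scale))

def elo_update(r_winner, r_loser, k=32.0):
    """Shift both ratings by k times the 'surprise' of the outcome.

    The surprise acts like a prediction error: 1 (the winner won) minus the
    probability that had been assigned to that outcome.
    """
    surprise = 1.0 - elo_expected(r_winner, r_loser)
    return r_winner + k * surprise, r_loser - k * surprise

# Two individuals start with equal ratings; one is observed to win.
print(elo_update(1500.0, 1500.0))
# An upset (lower-rated individual wins) produces a larger adjustment.
print(elo_update(1400.0, 1600.0))
```

Unlike a Bayesian scheme, this update keeps only a point estimate per individual and carries no explicit uncertainty about each rating.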


Lateral orbitofrontal cortex anticipates choices and integrates prior with current information

Ramon Nogueira, Juan M. Abolafia, Jan Drugowitsch, Emili Balaguer-Ballester, Maria V. Sanchez-Vives & Rubén Moreno-Bote
Nature Communications 8, Article number: 14823 (2017) doi:10.1038/ncomms14823

Adaptive behavior requires integrating prior with current information to anticipate upcoming events. Brain structures related to this computation should bring relevant signals from the recent past into the present. Here we report that rats can integrate the most recent prior information with sensory information, thereby improving behavior on a perceptual decision-making task with outcome-dependent past trial history. We find that anticipatory signals in the orbitofrontal cortex about upcoming choice increase over time and are even present before stimulus onset. These neuronal signals also represent the stimulus and relevant second-order combinations of past state variables. The encoding of choice, stimulus and second-order past state variables resides, up to movement onset, in overlapping populations. The neuronal representation of choice before stimulus onset and its build-up once the stimulus is presented suggest that orbitofrontal cortex plays a role in transforming immediate prior and stimulus information into choices using a compact state-space representation.


Computational Precision of Mental Inference as Critical Source of Human Choice Suboptimality

Drugowitsch J, Wyart V, Devauchelle AD, Koechlin E
Neuron. 2016 Dec 21;92(6):1398-1411. doi: 10.1016/j.neuron.2016.11.005. Epub 2016 Dec 1.

Making decisions in uncertain environments often requires combining multiple pieces of ambiguous information from external cues. In such conditions, human choices resemble optimal Bayesian inference, but typically show a large suboptimal variability whose origin remains poorly understood. In particular, this choice suboptimality might arise from imperfections in mental inference rather than in peripheral stages, such as sensory processing and response selection. Here, we dissociate these three sources of suboptimality in human choices based on combining multiple ambiguous cues. Using a novel quantitative approach for identifying the origin and structure of choice variability, we show that imperfections in inference alone cause a dominant fraction of suboptimal choices. Furthermore, two-thirds of this suboptimality appear to derive from the limited precision of neural computations implementing inference rather than from systematic deviations from Bayes-optimal inference. These findings set an upper bound on the accuracy and ultimate predictability of human choices in uncertain environments.


Generalization of prior information for rapid Bayesian time estimation

Roach NW, McGraw PV, Whitaker DJ, Heron J
Proc Natl Acad Sci U S A. 2017 Jan 10;114(2):412-417. doi: 10.1073/pnas.1610706114. Epub 2016 Dec 22.

To enable effective interaction with the environment, the brain combines noisy sensory information with expectations based on prior experience. There is ample evidence showing that humans can learn statistical regularities in sensory input and exploit this knowledge to improve perceptual decisions and actions. However, fundamental questions remain regarding how priors are learned and how they generalize to different sensory and behavioral contexts. In principle, maintaining a large set of highly specific priors may be inefficient and restrict the speed at which expectations can be formed and updated in response to changes in the environment. However, priors formed by generalizing across varying contexts may not be accurate. Here, we exploit rapidly induced contextual biases in duration reproduction to reveal how these competing demands are resolved during the early stages of prior acquisition. We show that observers initially form a single prior by generalizing across duration distributions coupled with distinct sensory signals. In contrast, they form multiple priors if distributions are coupled with distinct motor outputs. Together, our findings suggest that rapid prior acquisition is facilitated by generalization across experiences of different sensory inputs but organized according to how that sensory information is acted on.
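
The combination of a noisy measurement with a prior, mentioned in the first sentence of this abstract, has a closed form when both are Gaussian. The sketch below shows that textbook case only; it is not the observer model used in the paper, and the function name and the durations in the example are illustrative assumptions.

```python
def posterior_estimate(measurement, sensory_sd, prior_mean, prior_sd):
    """Posterior mean for a Gaussian prior and Gaussian measurement noise.

    The measurement is weighted by its reliability relative to the prior:
    a noisier measurement is pulled more strongly toward the prior mean.
    """
    weight = prior_sd ** 2 / (prior_sd ** 2 + sensory_sd ** 2)
    return weight * measurement + (1.0 - weight) * prior_mean

# A 0.9 s interval is measured; previously experienced durations average 0.6 s.
reliable = posterior_estimate(0.9, sensory_sd=0.1, prior_mean=0.6, prior_sd=0.1)
noisy = posterior_estimate(0.9, sensory_sd=0.3, prior_mean=0.6, prior_sd=0.1)
print("equal reliability: %.3f s" % reliable)
print("noisy measurement: %.3f s" % noisy)
```

In a duration reproduction task, this pull toward the prior mean appears as a contextual bias toward the centre of the recently experienced distribution.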


Psychopathic individuals exhibit but do not avoid regret during counterfactual decision making

Baskin-Sommers A, Stuppy-Sullivan AM, Buckholtz JW
Proc Natl Acad Sci U S A. 2016 Dec 13;113(50):14438-14443. Epub 2016 Nov 28.

Psychopathy is associated with persistent antisocial behavior and a striking lack of regret for the consequences of that behavior. Although explanatory models for psychopathy have largely focused on deficits in affective responsiveness, recent work indicates that aberrant value-based decision making may also play a role. On that basis, some have suggested that psychopathic individuals may be unable to effectively use prospective simulations to update action value estimates during cost-benefit decision making. However, the specific mechanisms linking valuation, affective deficits, and maladaptive decision making in psychopathy remain unclear. Using a counterfactual decision-making paradigm, we found that individuals who scored high on a measure of psychopathy were as or more likely than individuals low on psychopathy to report negative affect in response to regret-inducing counterfactual outcomes. However, despite exhibiting intact affective regret sensitivity, they did not use prospective regret signals to guide choice behavior. In turn, diminished behavioral regret sensitivity predicted a higher number of prior incarcerations, and moderated the relationship between psychopathy and incarceration history. These findings raise the possibility that maladaptive decision making in psychopathic individuals is not a consequence of their inability to generate or experience negative emotions. Rather, antisocial behavior in psychopathy may be driven by a deficit in the generation of forward models that integrate information about rules, costs, and goals with stimulus value representations to promote adaptive behavior.


A neural model of valuation and information virality

Scholz C, Baek EC, O'Donnell MB, Kim HS, Cappella JN, Falk EB
Proc Natl Acad Sci U S A. 2017 Mar 14;114(11):2881-2886. doi: 10.1073/pnas.1615259114. Epub 2017 Feb 27.

Information sharing is an integral part of human interaction that serves to build social relationships and affects attitudes and behaviors in individuals and large groups. We present a unifying neurocognitive framework of mechanisms underlying information sharing at scale (virality). We argue that expectations regarding self-related and social consequences of sharing (e.g., in the form of potential for self-enhancement or social approval) are integrated into a domain-general value signal that encodes the value of sharing a piece of information. This value signal translates into population-level virality. In two studies (n = 41 and 39 participants), we tested these hypotheses using functional neuroimaging. Neural activity in response to 80 New York Times articles was observed in theory-driven regions of interest associated with value, self, and social cognitions. This activity then was linked to objectively logged population-level data encompassing n = 117,611 internet shares of the articles. In both studies, activity in neural regions associated with self-related and social cognition was indirectly related to population-level sharing through increased neural activation in the brain's value system. Neural activity further predicted population-level outcomes over and above the variance explained by article characteristics and commonly used self-report measures of sharing intentions. This parsimonious framework may help advance theory, improve predictive models, and inform new approaches to effective intervention. More broadly, these data shed light on the core functions of sharing: to express ourselves in positive ways and to strengthen our social bonds.


Explicit representation of confidence informs future value-based decisions

Tomas Folke, Catrine Jacobsen, Stephen M. Fleming & Benedetto De Martino
Nature Human Behaviour 1, Article number: 0002 (2016) doi:10.1038/s41562-016-0002

Humans can reflect on decisions and report variable levels of confidence. But why maintain an explicit representation of confidence for choices that have already been made and therefore cannot be undone? Here we show that an explicit representation of confidence is harnessed for subsequent changes of mind. Specifically, when confidence is low, participants are more likely to change their minds when the same choice is presented again, an effect that is most pronounced in participants with greater fidelity in their confidence reports. Furthermore, we show that choices reported with high confidence follow a more consistent pattern (fewer transitivity violations). Finally, by tracking participants’ eye movements, we demonstrate that lower-level gaze dynamics can track uncertainty but do not directly impact changes of mind. These results suggest that an explicit and accurate representation of confidence has a positive impact on the quality of future value-based decisions.


Perceptual learning alters post-sensory processing in human decision-making

Jessica A. Diaz, Filippo Queirazza & Marios G. Philiastides
Nature Human Behaviour 1, Article number: 0035 (2017) doi:10.1038/s41562-016-0035

An emerging view in perceptual learning is that improvements in perceptual sensitivity are not only due to enhancements in early sensory representations but also due to changes in post-sensory decision-processing. In humans, however, direct neurobiological evidence of the latter remains scarce. Here, we trained participants on a visual categorization task over three days and used multivariate pattern analysis of the electroencephalogram to identify two temporally specific components encoding sensory (‘Early’) and decision (‘Late’) evidence, respectively. Importantly, the single-trial amplitudes of the Late, but not the Early component, were amplified in the course of training, and these enhancements predicted the behavioural improvements on the task. Correspondingly, we modelled these improvements with a reinforcement learning mechanism, using a reward prediction error signal to strengthen the readout of sensory evidence used for the decision. We validated this mechanism through a robust association between the model’s decision variables and the amplitudes of our Late component that encode decision evidence.