Marcos Economides, Marc Guitart-Masip, Zeb Kurth-Nelson, and Raymond J. Dolan
J. Neurosci. 2014;34:3340-3349
Actions can lead to an immediate reward or punishment and a complex set of delayed outcomes. Adaptive choice requires that the brain track and integrate both of these potential consequences. Here, we designed a sequential task in which the decision to exploit or forgo an available offer was contingent on comparing its immediate value with a state-dependent future cost of expending a limited resource. Crucially, the dynamics of the task demanded frequent switches in policy based on an online computation of changing delayed consequences. We found that human subjects choose on the basis of a near-optimal integration of immediate reward and delayed consequences, with the latter computed in a prefrontal network. Within this network, anterior cingulate cortex (ACC) was dynamically coupled to ventromedial prefrontal cortex (vmPFC) when adaptive switches in choice were required. Our results suggest a choice architecture in which interactions between ACC and vmPFC underpin an integration of immediate and delayed components of value, supporting flexible policy switching that accommodates the potential delayed consequences of an action.