

CHAPTER 2 NEUROBIOLOGY OF DECISION MAKING- A

2.3 Neural structures that subserve reward- or punishment-based

The key neural structures that implement decision making based on rewards and punishments include the cortex, the basal ganglia (BG), and the amygdala. This section explains the role of each of these structures in coding value, risk, or reward delay during decision making.

2.3.1 Amygdala

This key subcortical structure is hypothesized to mediate the affective-cognitive connection (Brink, 2008; Carlson, 2012). Many studies relate amygdala signals to emotions such as anxiety, rage, and appetitive and aversive feelings, all factors known to influence decision making (Fanselow et al., 1999; Parkinson et al., 2000; Baxter et al., 2002; Kennedy et al., 2009). The effect of emotions on decision making, in terms of both the perceived state and the planned response, is proposed to be mediated by the amygdala (Wagar et al., 2004). For instance, emotions such as anxiety might exaggerate the constructed aversive error feedback and thereby decrease the value function, resulting in increased avoidance of the stimuli (Paulus et al., 2006). A similar amygdala-mediated effect of anxiety on the computation of risk could lead to risk aversion (De Martino et al., 2006; Seymour et al., 2008; Liu et al., 2011). Thus both the value and the risk computations are influenced by amygdala activity.
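The proposed effect of anxiety on the value function can be illustrated with a minimal sketch. It assumes a simple delta-rule value update in which negative (aversive) errors are scaled by a gain factor. The update rule, the `aversive_gain` parameter, and the outcome sequence are illustrative assumptions, not a model taken from the cited studies.

```python
def update_value(value, outcome, learning_rate=0.1, aversive_gain=1.0):
    """One delta-rule update; aversive_gain (hypothetical) scales negative errors."""
    error = outcome - value
    if error < 0:
        # Assumption: anxiety exaggerates aversive error feedback.
        error *= aversive_gain
    return value + learning_rate * error


def learn(outcomes, aversive_gain=1.0, learning_rate=0.1):
    """Run the update over a sequence of outcomes, starting from a value of zero."""
    value = 0.0
    for outcome in outcomes:
        value = update_value(value, outcome, learning_rate, aversive_gain)
    return value


# The same stimulus yields rewards (+1) and punishments (-1) equally often.
outcomes = [1.0, -1.0] * 20

baseline_value = learn(outcomes, aversive_gain=1.0)
anxious_value = learn(outcomes, aversive_gain=2.0)
```

Under these assumptions, the learner with the larger aversive gain ends with a lower value for the same stimulus, which is the direction of effect that would favor avoidance.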

2.3.2 Cortex

Many areas of the cortex, such as the sensory-motor cortices, the associative cortices, the orbito-frontal cortex, and the prefrontal cortex, are found to be involved in reward-based learning (Tremblay et al., 1999; Daw et al., 2005). Specifically, the prefrontal cortex is known to play a major role in the maintenance and manipulation of choice preferences by encoding their value and utility (Goldman-Rakic, 1995; Frank et al., 2001; Chatham et al., 2013). It is also known to code for a policy that governs the execution of a response (Botvinick, 2008). Patients with lesions in the prefrontal areas are likely to be sub-optimal in their choice preferences (Manes et al., 2002; Fellows et al., 2003). Some lesion studies in the prefrontal areas have shown selectively impaired reversal learning in experiments such as the Iowa gambling task. In such cases, the patients develop an increased preference for the risky deck over the safer one, indicating increased risk-seeking behavior (Bechara et al., 1994; Bechara et al., 2000; Fellows et al., 2003). Apart from the value and the risk associated with rewards, the cortex is also known to encode the delays associated with receiving the outcomes. These delays are coded differentially by different areas of the brain: the medial prefrontal cortex codes for immediate rewards, whereas the lateral prefrontal cortex codes for delayed rewards (McClure et al., 2004; Tanaka et al., 2004).

2.3.3 Basal Ganglia

The striatum of the BG is one of the prominent areas reported to be involved in reward-punishment learning. It can be broadly divided into the dorsal striatum (caudate and dorsal putamen) and the ventral striatum (ventral putamen and the nucleus accumbens) (Haber, 2003; Haber, 2009). Chemical staining studies, particularly those based on enzymes such as acetylcholinesterase, show that the striatum has a mosaic-like anatomy of patches. This supports a theory of modular striatal organization into patches and matrices, called striosomes and matrisomes, respectively (Graybiel et al., 1978). The striatum is made up of various types of neurons, such as the medium spiny neurons (MSNs), cholinergic interneurons, and GABAergic interneurons. The MSNs are the majority cell type, accounting for around 90-95% of the striatum; they are GABAergic in nature (Kemp et al., 1971; Smith et al., 1998; Bolam et al., 2000). The striatal neurons respond to the major neuromodulators, such as dopamine and serotonin, through the activation of the corresponding receptors present on them. The activation of those receptors in turn excites second messengers, which can control pre- and post-synaptic plasticity in the short or long term (Bedard et al., 2011; Boureau et al., 2011; Cools et al., 2011).

The MSNs possessing the neuropeptides substance P and dynorphin contain the dopamine D1 receptors (D1R), and are known to project to the globus pallidus interna (GPi) and the substantia nigra. The MSNs projecting to GPi are GABAergic and therefore exert an inhibitory influence over GPi. These direct projections of D1R-expressing MSNs to GPi constitute the Direct pathway (DP). On the other hand, those MSNs that express the neuropeptide enkephalin contain the dopamine D2 receptors (D2R), and they are reported to send GABAergic projections to the globus pallidus externa (GPe). The GPe neurons are also GABAergic, and they project to the glutamatergic subthalamic nucleus (STN). The GPe and STN interact bidirectionally: the STN sends glutamatergic projections to GPe, which in turn sends GABAergic projections to STN. The STN eventually sends glutamatergic efferent projections to GPi. The pathway from the striatum to GPi via GPe and STN is called the Indirect pathway (IP). The IP thereby contains two inhibitory connections mediated by GABA and one excitatory connection mediated by glutamate, and therefore exerts an overall excitatory influence over the GPi. Further, the GPi neurons are GABAergic and project to the thalamus, whose activity facilitates that of the motor and executive cortex. In summary, the direct and indirect pathways effectively facilitate and inhibit the cortical activity, respectively (Figure 2.1) (Albin et al., 1989; DeLong, 1990b).

Figure 2.1: The schematic of the BG showing the direct (DP) and indirect (IP) pathways
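The sign arithmetic behind the two pathways can be checked with a short sketch. It treats each GABAergic projection as a factor of -1 and each glutamatergic projection as a factor of +1, and multiplies the signs along a path. The node names and the `net_sign` helper are illustrative. The thalamus-to-cortex link is assumed to be excitatory, in line with the statement that thalamic activity facilitates the cortex. The reciprocal STN-to-GPe projection is left out because it is not part of the feedforward path.

```python
GABA = -1  # inhibitory projection
GLU = +1   # excitatory projection

# Feedforward projections described in the text (node names are illustrative).
PROJECTIONS = {
    ("striatum_D1", "GPi"): GABA,
    ("striatum_D2", "GPe"): GABA,
    ("GPe", "STN"): GABA,
    ("STN", "GPi"): GLU,
    ("GPi", "thalamus"): GABA,
    ("thalamus", "cortex"): GLU,  # assumption: facilitation modeled as excitation
}


def net_sign(path):
    """Multiply projection signs along a path; +1 is net excitation, -1 net inhibition."""
    sign = 1
    for source, target in zip(path, path[1:]):
        sign *= PROJECTIONS[(source, target)]
    return sign


DIRECT = ["striatum_D1", "GPi", "thalamus", "cortex"]
INDIRECT = ["striatum_D2", "GPe", "STN", "GPi", "thalamus", "cortex"]

direct_effect = net_sign(DIRECT)          # two inhibitions, one excitation
indirect_effect = net_sign(INDIRECT)      # three inhibitions, two excitations
indirect_on_gpi = net_sign(INDIRECT[:4])  # striatum to GPi segment only
```

With these signs, the striatum-to-GPi segment of the indirect pathway is net excitatory, while the full direct and indirect paths to cortex come out facilitatory and inhibitory, respectively. This counts signs only and says nothing about the dynamics of the circuit.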

Functional MRI experiments show that the dorsal striatum represents both the reward magnitude and the valence of the outcome obtained on executing an action (Tricomi et al., 2004). Specifically, the response of the striatum increases with the reward magnitude, and decreases with the punishment magnitude (Breiter et al., 2001; Delgado et al., 2003). Other fMRI experiments correlate the activity of the striatum with the expectation of rewarding outcomes (O'Doherty et al., 2003; McClure et al., 2004; O'Doherty et al., 2004) as well as punitive outcomes (Seymour et al., 2004). The ventral striatum receives major inputs from the prefrontal cortex, hippocampus, and amygdala (Wagar et al., 2004), and also responds to the actual and expected reward magnitudes (Knutson et al., 2001). The ventral striatum also responds to the magnitude of variability or risk (expected uncertainty) associated with the outcomes (Zink et al., 2004). In particular, the BOLD signals in the ventral striatum reflect the risk preferences that correlate with the amount of risk anticipation (Preuschoff et al., 2006). The striatum is also sensitive to the delays in receiving rewards: the ventral striatum codes for immediate rewards, while the dorsal striatum codes for delayed rewards (McClure et al., 2004; Tanaka et al., 2004).
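One common way to formalize sensitivity to reward delay is a discount factor applied once per time step. This is assumed here for illustration and is not stated in the text above. The sketch below uses exponential discounting with hypothetical rewards, delays, and discount factors to show how a steep discounter favors the immediate reward and a shallow discounter favors the delayed one.

```python
def discounted_value(reward, delay, gamma):
    """Exponentially discounted value of a reward received after `delay` steps."""
    return reward * gamma ** delay


def preferred_option(options, gamma):
    """Return the name of the option with the highest discounted value."""
    best = max(options, key=lambda o: discounted_value(o["reward"], o["delay"], gamma))
    return best["name"]


# Hypothetical choice between a small immediate and a large delayed reward.
OPTIONS = [
    {"name": "small_soon", "reward": 1.0, "delay": 1},
    {"name": "large_late", "reward": 3.0, "delay": 10},
]

steep_choice = preferred_option(OPTIONS, gamma=0.6)     # short time horizon
shallow_choice = preferred_option(OPTIONS, gamma=0.95)  # long time horizon
```

The two discount factors are arbitrary values chosen to make the preference reversal visible. Mapping them onto specific striatal or cortical regions would be an additional assumption.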