Chapter 7 General discussion
7.2 BROADER FRAMEWORKS FOR NEUROMODULATORY ACTIONS
Because of their virtually global influence and their capacity to organize neural activity in support of broad behavioral states, the neuromodulatory systems have been proposed to operate in a meta-learning framework (Doya, 2002). In this framework, each major neuromodulator essentially represents a meta-learning parameter: dopamine signals RPEs, 5-HT controls the time scale of RPEs, NE controls the stochasticity of action, and ACh controls the speed of memory updating. Changes in the activity of the nuclei responsible for these neuromodulators are thus posited to adjust these parameters and thereby shift behavioral strategies. Although studies find general support for shifts in meta-learning parameters through pharmacological manipulation of neuromodulators (Jepma et al., 2016; Howlett et al., 2017; Cook et al., 2019), the specificity of the neuromodulators and the exact measure of ‘neuromodulation’ in such a framework are not clear. Recent data suggest that the activity of noradrenergic neurons in the LC (Su and Cohen, 2022), serotonergic neurons in the dorsal raphe (Grossman et al., 2022) and cholinergic neurons in the basal forebrain (Hegedüs et al., 2023) encodes RPEs or learning rates. Although RPE signaling could still be predominantly carried out and propagated by one neuromodulator, it is possible that individual neuromodulators do not map directly onto single parameters in a meta-learning framework. Nevertheless, the same line of evidence strongly suggests that neuromodulators are involved in adjusting meta-learning parameters, perhaps through their extracellular concentrations.
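To make the proposed parameter mapping concrete, the following is a minimal sketch of a standard temporal-difference update with a softmax choice rule, annotated with the neuromodulator that the framework associates with each quantity. The function names, the choice of a softmax policy, and all numerical values are illustrative assumptions; none of them are taken from the studies cited above or from the analyses in this thesis.

```python
import math


def td_update(value, reward, next_value, alpha, gamma):
    """One temporal-difference step; returns (updated value, RPE).

    alpha: learning rate (speed of memory updating; ACh in the framework)
    gamma: discount factor (time scale of reward prediction; 5-HT)
    """
    rpe = reward + gamma * next_value - value  # reward prediction error (dopamine)
    return value + alpha * rpe, rpe


def action_probabilities(action_values, beta):
    """Softmax policy; beta is the inverse temperature (NE in the framework).

    beta = 0 gives uniform random choice; larger beta gives more deterministic choice.
    """
    peak = max(action_values)  # subtract the maximum for numerical stability
    weights = [math.exp(beta * (v - peak)) for v in action_values]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative values only (not fitted to any data in this thesis).
new_value, rpe = td_update(value=0.0, reward=1.0, next_value=0.0, alpha=0.5, gamma=0.9)
exploratory = action_probabilities([1.0, 0.0], beta=0.0)
exploitative = action_probabilities([1.0, 0.0], beta=5.0)
```

In this sketch, changing alpha, gamma or beta alters how quickly values are updated, how far into the future rewards are weighted, and how randomly actions are selected, without changing the learning rule itself. This is the sense in which these quantities act as meta-learning parameters.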
The optimization of meta-learning parameters will differ between tasks based on environmental statistics and internal goals. Furthermore, the strength of externally driven shifts in meta-learning parameters, e.g. through pharmacological intervention, will vary with the individual’s baseline neuromodulatory tone. Such task-specific modulation of neuromodulatory effects would explain the different optimal doses for individuals engaged in tasks with different cognitive demands, as described in chapter 3, and is consistent with the general premise of inverted-U curves describing dose-performance relationships. Shifts of cholinergic tone have also been implicated in other trade-offs such as speed-accuracy (Turchi and Sarter, 1997). Similarly, as described in chapter 6, guanfacine administration enhances post-reversal learning already in the first 3 post-reversal trials, but this may reflect an enhanced tendency for exploratory behavior at the expense of exploitative behavior, as is evident in a somewhat reduced plateau performance and a slightly reduced likelihood of finishing blocks learned. Optimization of meta-learning parameters through neuromodulatory action will thus likely come at a cost, at the very least a metabolic one; otherwise, these parameters would be evolutionarily pushed towards being static rather than dynamic. Lastly, a variable baseline of neuromodulatory tone may help explain the heterogeneity of optimal doses in clinical populations. It can be predicted that, depending on the neuromodulator in question, an individual’s baseline may differ significantly in some clinical populations. The extent of the loss of cholinergic synapses in the cortex and hippocampus has been linked to symptom progression in AD (Fahnestock and Shekari, 2019), and pharmacological strategies have been proposed that utilize different drug cocktails at different stages of AD to account for the progression in cholinergic loss (Tobin, 2018). Ultimately, based on an individual’s baseline neuromodulatory tone, optimal performance may require more or less neuromodulatory activation depending on the particular demands of a given task.
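The dependence of the optimal dose on baseline tone and task demands can be illustrated with a toy inverted-U model. The sketch below assumes, purely for illustration, that a dose adds linearly to baseline tone and that performance is a Gaussian function of the distance between the resulting tone and a task-specific optimum. The function names, the Gaussian shape, the additive dose effect and the arbitrary units are all hypothetical and are not derived from the data in this thesis.

```python
import math


def performance(dose_effect, baseline_tone, optimal_tone, width=1.0):
    """Toy inverted-U: performance peaks when total tone equals the task optimum.

    All quantities are in arbitrary units; the additive dose effect is an assumption.
    """
    total_tone = baseline_tone + dose_effect
    return math.exp(-((total_tone - optimal_tone) ** 2) / (2.0 * width ** 2))


def best_dose(doses, baseline_tone, optimal_tone, width=1.0):
    """Return the dose from `doses` that maximizes the toy performance curve."""
    return max(doses, key=lambda d: performance(d, baseline_tone, optimal_tone, width))


doses = [0.0, 0.5, 1.0, 1.5, 2.0]  # hypothetical dose effects on tone

# Same task optimum, different baselines: the lower baseline needs the larger dose.
low_baseline_dose = best_dose(doses, baseline_tone=0.5, optimal_tone=1.5)
high_baseline_dose = best_dose(doses, baseline_tone=1.5, optimal_tone=1.5)

# Same baseline, more demanding task optimum: the best dose shifts upward.
demanding_task_dose = best_dose(doses, baseline_tone=0.5, optimal_tone=2.5)
```

Under these assumptions, the best dose is simply the one that brings total tone closest to the task optimum, so both a different baseline and a different task shift the peak of the dose-performance curve.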
7.3 Multi-modulator measurements
Based on the discussion so far, in a meta-learning framework, we may predict that the pattern of neuromodulatory tone is informative of an individual’s cognitive state as defined by the current values of the meta-learning parameters. Furthermore, shifts in behavioral needs may require shifting multiple meta-learning parameters and would thus be reflected in multi-modulatory changes. Unpublished data from our lab, obtained with the SPME method described in chapter 2, show that such multi-modulatory changes do indeed occur. During the same feature-learning task used in chapters 3 and 4, we find that low attentional load blocks contain higher concentrations of dopamine and lower concentrations of ACh and 5-HT in the striatum. Furthermore, some changes seem to be area-specific. For example, comparing measurements before and after the start of the task, we find that prefrontal ACh increases while striatal ACh decreases. This suggests that, despite overlap in the stimuli and events that may trigger neuromodulatory release (see section 1.4.1), neuromodulators are differentially modulated on time scales measurable by SPME.
These data support the role of neuromodulators in adjusting meta-learning parameters to meet different environmental demands through multi-modulatory shifts. However, the differences we find between brain regions bring up another question: in which brain region do multi-modulatory tones best reflect the current meta-learning parameters? It is possible that multi-modulatory shifts observable in the PFC correspond to different changes in meta-learning variables than the multi-modulatory shifts observable in the striatum. A study utilizing human neuroanatomy and receptor density maps suggests that areas with similar functions have similar receptor fingerprints (Zilles et al., 2002), consistent with later findings utilizing similar methods in primates (Rapan et al., 2022) as well as with the actual area-specific volume of available neuromodulators in the macaque brain (Ward et al., 2018). However, this does not yet reveal whether the neuromodulatory changes in the striatum, for example, provide information beyond that provided by the neuromodulatory changes in prefrontal cortices for understanding meta-learning parameters.