labeled structure, like the amygdala, may not be a unitary structure (as already noted in our discussion of eye fixation). Indeed, it has been suggested that the amygdala is a complex collection of nuclei with quite diverse connectivities. Thus, fear-related phenomena may be associated with only a subset of the amygdala’s nuclei, not all of them. This makes it unproblematic if other amygdaloid nuclei are involved with other emotions, such as sadness or even negative affect more diffusely. Similarly, in the case of disgust, the substrate of disgust is not the entire insula. As Wicker et al. (2003) note, activations elicited by aversive stimuli occur in the anterior portion of the insula, whereas pleasant stimuli elicit activations in the posterior portion of the insula.
Third, cognitive neuroscience and neuropsychology regularly encounter inconsistent data across studies for several familiar reasons. Lesion patients have different specific areas of damage, and such differences may be crucial. There are also differences in individual (undamaged) brains that generate differences in empirical findings. Next, there are limitations in imaging techniques, such as limitations of temporal resolution inherent in BOLD fMRI, which undoubtedly account for certain discrepancies across experiments. Finally, different individuals with brain lesions may compensate in different ways (as we speculated in the case of SM) and thus may show different patterns of impairment and “preserved” function. So we should not hastily abandon the general regularities identified in this area simply because some discrepancies remain to be accounted for. No doubt, greater complexity will have to enter the picture in future theorizing, but the general outlines of a simulation story are still well supported by the evidence we have highlighted.
Until now, I have provided only a bare sketch of how a simulation routine might execute FaBER tasks. An ST account of FaBER will not be wholly compelling, however, unless and until a detailed simulational method of FaBER execution is presented. Is there such a method, and is it compatible with existing evidence? This is the topic of the next section.
instruction to construct such an expression, matches the expression observed in the target, then the hypothesized emotion is confirmed and imputed to the target. According to Model 1, this scenario is what transpires in a normal emotion interpreter. When the relevant emotion area is damaged, however, a facsimile of the emotion cannot be produced. The face-related downstream activity needed to recognize the emotion isn’t generated. This results in recognition impairment specific to that emotion.
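The generate-and-test routine of Model 1 can be sketched in a few lines of code. This is only a toy illustration, not a claim about neural implementation: the table of expression labels, the function name `generate_and_test`, and the `damaged` parameter are all hypothetical, and the sketch assumes that hypotheses are tried serially in random order.

```python
import random

# Hypothetical toy mapping from basic emotions to facial-expression labels.
# The labels are illustrative placeholders, not data from the text.
EXPRESSIONS = {
    "happiness": "smile",
    "sadness": "frown",
    "fear": "wide eyes",
    "anger": "glare",
    "disgust": "wrinkled nose",
    "surprise": "raised brows",
}


def generate_and_test(target_expression, damaged=frozenset(), rng=None):
    """Return the emotion attributed to the target, or None if none is confirmed."""
    if rng is None:
        rng = random.Random(0)
    candidates = list(EXPRESSIONS)
    rng.shuffle(candidates)  # assumption: hypotheses are tried serially, in random order
    for emotion in candidates:
        if emotion in damaged:
            # A damaged emotion area cannot produce a facsimile of the emotion,
            # so no face-related downstream activity is available to test.
            continue
        own_expression = EXPRESSIONS[emotion]    # produce a facial expression
        if own_expression == target_expression:  # test it against the target's
            return emotion                       # classify and attribute
    return None
```

With an intact system, `generate_and_test("wide eyes")` returns `"fear"`. With `damaged={"fear"}` it returns `None`, while the other emotions are still recognized, which is the emotion-specific recognition impairment the model is meant to explain.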
6.3.2 Model 2: Reverse Simulation
The idea here is that an attributor runs a standard emotional process in the reverse direction.⁹ In most cognitive processes, reverse simulation is not an option: The standard forward directionality of mental processes precludes the possibility that these processes can be utilized in the opposite direction. However, FaBER may be an important exception.
Figure 6.1. Generate-and-test simulation. (Reprinted from Goldman and Sripada, 2005, with permission from Elsevier.) [Flowchart: Generate an hypothesized emotion; Produce a facial expression; Test this facial expression against that of the target; Is there a match? (Yes / No); Classify one’s current emotion state and attribute this state to the target.]
Under conditions of normal operation, an emotion episode causes a coordinated suite of cognitive and physiological changes, including, at least in the case of the so-called basic emotions, a characteristic facial expression (Ekman, 1992). This causal relationship appears to be bidirectional. There is substantial evidence that manipulation of the facial musculature, either voluntarily or involuntarily, has a causal effect in generating, at least in attenuated form, the corresponding emotional state and its cognitive and physiological correlates. Changes in a person’s facial musculature can produce corresponding emotions, even when he is unaware that the musculature has any emotion-linked properties (Ekman, Levenson, and Friesen, 1983).
Techniques have been used to induce smiles or frowns without subjects’ awareness that they were smiling or frowning. For example, simply holding a pen in one’s teeth eases the facial muscles into a smile, whereas holding a pen in one’s lips eases the muscles into a frown. Corresponding types of emotional experiences can be induced by such unconscious tilts of the face.
Strack, Martin, and Stepper (1988) had students look at Gary Larson cartoons while holding a pen either in their teeth or in their lips. Students found the cartoons funnier when they held the pen in their teeth (and smiled) than when they held it in their lips (and frowned). Cacioppo, Klein, Berntson, and Hatfield (1993) found that a similar manipulation of bodily postures had an effect on liking or disliking attitudes. When we like a stimulus, we tend to bring it toward us, but when we dislike something, we tend to push it away.
Cacioppo et al. (1993) found that manipulating people’s posture into one of these contrasting poses subtly influences their attitude.
Because the relationship between basic emotion states and their facial expressions exhibits a kind of rough one-to-one correspondence in both directions, a characteristic facial expression could potentially be used in a backward direction for the purpose of attribution by simulation.¹⁰ As shown in figure 6.2, a potential attributor who sees a target’s emotion-expressive face [1] proceeds to imitate the observed facial expression in an attenuated and largely covert manner [2]. These facial exertions produce traces of the relevant emotion [3]. These emotion traces in the attributor are classified for their emotion type and finally, in keeping with the common core of all simulational heuristics, produce a corresponding attribution to the target whose face is being observed [4]. All this would happen at a preconscious level.
Figure 6.2. Reverse simulation. (Adapted from Goldman and Sripada, 2005, with permission from Elsevier.) [Stages: [1] Visual representation of target’s facial expression; [2] Activation of facial muscles which imitate target’s facial expression; [3] Experience of emotion; [4] Classification of the current emotion state and attribution to the target.]
The reverse simulation model would explain the paired deficits as follows:
Someone impaired in experiencing a certain emotion would be unable to produce that emotion, or even significant traces thereof, in her own system.
The requisite facial exertions [2] would occur, but they would not arouse the appropriate neural activity that constitutes an experience of the emotion [3].
Such a person would not have a matching emotion in herself to classify and hence would not reliably attribute that emotion to the target [4].
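The four stages of the reverse simulation model, and its explanation of the paired deficits, can be put in schematic form. The following is a minimal sketch under stated assumptions: the expression-to-emotion table is a hypothetical stand-in for the rough one-to-one correspondence, and the function name `reverse_simulate` and the `impaired` parameter are invented for illustration.

```python
# Hypothetical stand-in for the rough one-to-one correspondence between
# facial expressions and basic emotions; the labels are illustrative only.
EXPRESSION_TO_EMOTION = {
    "smile": "happiness",
    "frown": "sadness",
    "wide eyes": "fear",
    "glare": "anger",
    "wrinkled nose": "disgust",
    "raised brows": "surprise",
}


def reverse_simulate(observed_expression, impaired=frozenset()):
    """Return the emotion attributed to the target, or None if none is aroused."""
    # [1] visual representation of the target's facial expression (the argument)
    mimicked = observed_expression  # [2] covert, attenuated facial imitation
    emotion = EXPRESSION_TO_EMOTION.get(mimicked)
    if emotion is None or emotion in impaired:
        # The facial exertions occur, but they arouse no trace of the emotion [3],
        # so there is nothing to classify and attribute.
        return None
    return emotion  # [4] classify one's own emotion trace and attribute it
```

On this sketch, `reverse_simulate("wide eyes")` yields `"fear"`, whereas `reverse_simulate("wide eyes", impaired={"fear"})` yields `None`: stage [2] still runs, but stage [3] fails, so no attribution of fear is made.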
6.3.3 Model 3: Reverse Simulation with as-if Loop
Adolphs, Damasio, Tranel, Cooper, and Damasio (2000) did a quantitative study of 108 subjects with focal brain lesions and concluded that recognition of facial emotion requires the integrity of right somatosensory cortices. They hypothesized that emotion recognition engages somatosensory representations that may simulate “how one would feel if making the facial expression shown in the stimulus.” This suggests a variant of the reverse simulation model, using Damasio’s (1994) idea of an “as if” loop (see figure 6.3).
Perhaps there is a link between a visual representation of a facial expression [1] and a somatosensory representation of what it would feel like to make that expression [2]. This visual/somatosensory pathway bypasses the facial musculature, hence the phrase “as-if” loop. Activation of the appropriate somatosensory representation in turn leads to (subthreshold) activation of an emotion [3] appropriate to the facial expression of the target. Finally, this emotion activated in the self is classified and imputed to the target. Explanation of the paired deficits by Model 3 would follow the explanation by Model 2.
6.3.4 Model 4: Unmediated Resonance (Mirroring)
According to this model, perception of the target’s face “directly” triggers (subthreshold) activation of the same neural substrate of the emotion in question. “Directly” here implies some form of mediation different from any of those postulated by the other models. A detailed positive account of the resonance, or mirroring, process is not presently available.¹¹ The proposal of unmediated matching, or mirroring, is made for the case of disgust by Wicker et al. (2003: 661) and echoes a hypothesis of Vittorio Gallese, a co-investigator in the Wicker et al. study, of widespread sharing of mental states by targets and observers, which Gallese (2001, 2003) calls the “shared manifold hypothesis.” This hypothesis, in turn, is a generalization of the notion of mirror systems in monkeys and humans (Rizzolatti, Fadiga, Gallese, and Fogassi, 1996; Gallese, Fadiga, Fogassi, and Rizzolatti, 1996; Rizzolatti, Fogassi, and Gallese, 2001). Mirror systems provide one paradigm of mental simulation and hence potentially a method of simulation-based mindreading. Model 4 would explain the paired-deficit data in the same familiar way as Models 2 and 3. Someone impaired in the capacity to experience a particular emotion won’t undergo a recognizable occurrence of that emotion, so the cognitive center will not reliably classify and attribute that emotion to the target.
Let us be clear why these four models are all descriptions of simulation processes, and why the processes might generally yield accurate attributions.
Models 2, 3, and 4 share the following characteristic. Some process in the observer responds to the facial cues of the target by generating an (attenuated) emotion in the observer. This emotion is classified and then projected onto the target. Recall that projection (section 2.6) is a core part of a simulational mindreading routine. Now if the first stage of the process is sufficiently sensitive to the target’s facial cues and if the observer’s emotion equipment is intact, the emotion produced in the observer will match the triggering emotion in the target. Such matching implies that interpersonal mental simulation has occurred. If, in addition, the observer’s classification of his own emotion is accurate,¹² his attribution of that same emotion to the target will also be accurate. So, there will be accurate, simulation-based mindreading. Model 1 differs slightly from 2, 3, and 4, because in Model 1 the hypothesized emotion must first pass the facial-match test before it is accepted and projected onto the target. But otherwise the process is similar. So all four processes share core simulational properties, certainly in cases where the observer’s emotion is sufficiently similar to that of the target.
Figure 6.3. Reverse simulation with as-if loop. (Adapted from Goldman and Sripada, 2005, with permission from Elsevier.) [Stages: [1] Visual representation of target’s facial expression; [2] Activation of somatosensory representation of what it would feel like to make that expression; [3] Experience of emotion; [4] Classification of current emotion state and attribution to the target.]
6.3.5 Assessing the Four Models
Let us now assess the strengths and weaknesses of the four models from an evidential standpoint. Model 1, the generate-and-test model, raises several questions. One centers on the phase of the process in which the observer’s system tries to match his own facial expression to that of the target. One’s own facial expression is represented proprioceptively, whereas the target’s expression is represented visually. How can these representations “match”? One possibility is that the system has acquired an association between proprioceptive and visual representations of the same facial configuration, through some type of learning. Alternatively, there might be an innate cross-modal matching of the sort postulated by Meltzoff and Moore (1997) to account for neonate facial imitation. This postulate has struck researchers as quite plausible.
A second question is how the hypothesis-generation process works. If candidate emotions are generated serially and randomly, say, from the six basic emotions, the observer must covertly generate on average three facial expressions before hitting on the right one. This might be too slow to account for actual covert mimicry of displayed facial expressions, which occurs as early as 300 milliseconds after stimulus onset (Dimberg and Thunberg, 1998). One alternative is parallel rather than serial testing of hypotheses, which might solve the timing problem, but it’s not clear that this is feasible. A second alternative is to say that “theoretical” information guides and narrows the generation process, though it isn’t clear what theoretical information it would be. The latter proposal would turn the generate-and-test model into more of a theory-simulation hybrid rather than a pure simulation model. In any case, unless parallel testing of hypotheses is plausible, the timing problem makes the generate-and-test model the least promising of the four on offer, and the other three are all more purely simulational in character. Moreover, all three of the other models have more evidential support than generate-and-test, so the latter should be relegated to the bottom of the stack.
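The rough figure of three covert expressions on average can be checked with a short calculation. The sketch below rests on assumptions the text leaves open: the six candidates are equally likely, they are tried serially in random order, and none is tried twice. The function name `expected_trials` is hypothetical.

```python
from fractions import Fraction


def expected_trials(n):
    """Expected number of hypotheses tried before the correct one is hit,
    assuming n equally likely candidates tested serially without repetition."""
    # The correct candidate is equally likely to sit at any position k = 1..n,
    # so the expectation is the sum of k * (1/n) over those positions.
    return sum(Fraction(k, n) for k in range(1, n + 1))


print(expected_trials(6))  # 7/2
```

Under these assumptions the expectation is 3.5 covert expressions, close to the rough figure of three. If candidates could be repeated, the expected number would be higher, which would only sharpen the timing problem.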
What can be said for Model 2, the reverse simulation model? Its plausibility crucially depends on speedy facial imitation. That such imitation capacities exist, apparently innately, is well established. Meltzoff and Moore (1983) found that infants as young as 1 hour old imitate tongue protrusion and other facial displays modeled before them. In addition to a capacity for facial mimicry, there is evidence that adult humans spontaneously and rapidly activate facial musculature corresponding to visually presented facial expressions. Dimberg and colleagues (Dimberg and Thunberg, 1998; Dimberg, Thunberg, and Elmehed, 2000; Lundquist and Dimberg, 1995) have found that the presentation of pictures of facial expressions produces covert activation of one’s own facial musculature, which mimics the presented faces. Such muscular activation is often subtle but electromyographically detectable. It occurs extremely rapidly, as noted previously, as early as 300 milliseconds after stimulus onset.
However, spontaneous, rapid, and covert facial imitation is also consistent with a model in which self-generated facial expressions are the consequences rather than the causes of emotion states. Is there any support for the claim that facial muscle movements come first and produce subsequent emotion states? Yes. The first line of evidence is based on the rapidity of the covert muscular movements. The early onset of these movements suggests that they arise because of direct imitation of the target, rather than a presumably slower process in which the facial expression of the target is deciphered, the corresponding emotion state is induced, and the facial expression is then produced. This direct imitation of the target may be part of an action-mirroring system, which is known to generate covert activation of distal musculature. In an early experiment that helped establish an action-mirroring system in humans, Fadiga, Fogassi, Pavesi, and Rizzolatti (1995) found that observation of actions (e.g., grasping an object, tracing a figure in the air) modeled by a target reliably produced electromyographically detectable activation in the corresponding muscle groups of the observer. Other evidence indicates that the action-mirroring system may also operate during the observation of facial expressions. An fMRI study by Carr, Iacoboni, Dubeau, Mazziotta, and Lenzi (2003) found that subjects passively observing emotion-expressive faces display neural activation in the premotor cortex and neighboring regions, which are normally activated in the production of facial movements and which are in one of the regions thought to house the action-mirroring system (to be explored later).
Some important data, however, are inconsistent with the reverse simulation model. Hess and Blairy (2001) used a more challenging FaBER task and found that although spontaneous facial mimicry did occur, successful mimicry did not correlate with accuracy in facial recognition, suggesting that facial mimicry may accompany but not actually facilitate recognition. More pointedly, Calder, Keane, Cole, Campbell, and Young (2000) found that three patients with Möbius syndrome, a congenital syndrome whose most prominent symptom is complete facial paralysis, performed normally on FaBER tasks. Keillor, Barrett, Crucian, Kortenkamp, and Heilman (2002) reported a similar finding in which a patient with bilateral facial paralysis performed normally on FaBER tasks. These findings need to be interpreted with caution; given the long-standing nature of these patients’ impairments, they may have found a compensatory (TT) strategy for executing FaBER, so their normal level of performance may not have utilized typical pathways. Nonetheless, these findings constitute grounds for skepticism about the reverse simulation model.
The problem of facial paralysis with spared FaBER performance is a problem attending Model 2 but not Model 3, the as-if loop variant of reverse simulation. Model 3 explicitly postulates a pathway that bypasses facial musculature, so it is unthreatened by the findings cited in the preceding paragraph. This is a signal advantage. Recall that the postulated pathway engages somatosensory cortices. A role for right somatosensory cortices in emotion recognition is confirmed in a study by Heberlein, Adolphs, Tranel, and Damasio (2004), who found that lesion patients with impairments in judging emotions from point-light walkers had the most reliable focus of lesion overlap in right somatosensory cortices.¹³
The trouble with Model 3 is that it isn’t wholly clear how it explains the selectivity of emotion-recognition impairments. Take the just-cited study of point-light walkers. This study found that the region in which lesions were most consistently associated with impairments in emotion judgment was right somatosensory cortices. But there were no clear differences in the regions of lesion overlap associated with impaired judgment of specific individual emotions. So it is not clear that activation in this region is specific enough to recognize one emotion as contrasted with others, which is what a model of (accurate) emotion recognition must explain, of course. A second concern with Model 3 is whether it accounts for the anger-recognition findings by Lawrence et al. (2002). Would lowering of dopamine levels have an association with impairment in right somatosensory cortices? Would such a lowering have such a specific impact on right somatosensory cortices as to significantly affect only anger recognition, not recognition of the other basic emotions? These are dubious prospects.
Let us turn, then, to Model 4, the unmediated resonance, or mirroring, model. A deeper examination of empirical evidence that bears on this model is deferred to later sections (6.4 and 6.7). Here I merely pose a theoretical question: Does the model fit our proposed pattern of ST? Because the model posits unmediated resonance, it does not fit the traditional form of simulation in which pretend states are fed into an attributor’s own cognitive equipment (e.g., a decision-making mechanism) to produce a further state. However, I do not regard the creation of pretend states, or the deployment of cognitive equipment to operate on such states, as essential to simulation. I associate that form of simulation only with high-level mindreading. First, pretense or imagination is a high-level activity. Second, it is an activity that is potentially and intermittently under intentional guidance or control, whereas low-level mindreading is fully automatic. As articulated in chapter 2, the generic idea of simulation is the idea of a process that is similar, in relevant respects, to a second process. (Alternatively, it can be undertaken with the aim of being similar, or have the function of being similar to the simulated process.) Such process similarity is what we have here: at a minimum, a similarity between the pair of emotion events in target and observer.
But, the reader may object, doesn’t a simulating process have to consist of multiple steps or stages, multiple steps that match those of a target process? I shall take a relaxed stance here: A simulating process can consist, minimally, of a single matching (or semimatching) state or event. This minimal condition for simulation is satisfied in Model 4. The model says that in successful FaBER by normal people, an attributor’s attribution is based on a (subthreshold) tokening of the same emotion experienced in the target. The observer’s emotional system “resonates” with that of the target, and this is the matching event on which the attribution is based. So Model 4 fits the ST pattern as I characterize it.¹⁴
A defense of ST, even for the restricted category of FaBER, does not require a firm decision about which of the four models is correct.¹⁵ Each of them, after all, is a simulational account. (Well, Model 1 is a hybrid account, so its correctness would lend the weakest support to ST, but it’s the least plausible model of the four.) In fact, I think that a good case can be made for Model 4, the resonance or mirroring model. To make this case, we must inspect evidence for resonance, or mirroring, in related areas of cognition. This is the principal topic of the next section.