THE REPRESENTING BRAIN: NEURAL CORRELATES OF MOTOR INTENTION AND IMAGERY Marc Jeannerod Vision et Motricite INSERM Unite 94 16 avenue du Doyen Lepine 69500 Bron France Keywords: affordances, goals, intention, motor imagery, motor schemata, neural codes, object manipulation, planning, posterior parietal cortex, premotor cortex, representation. Abstract: This target article concerns how motor actions are neurally represented and coded. Action planning and motor preparation can be studied using motor imagery. A close functional equivalence between motor imagery and motor preparation is suggested by the positive effects of imagining movements on motor learning, the similarity between the neural structures involved, and the similar physiological correlates observed in both imagining and preparing. The content of motor representations can be inferred from motor images at a macroscopic level: from global aspects of the action (the duration and amount of effort involved) and from the motor rules and constraints which predict the spatial path and kinematics of movements. A microscopic neural account of the representation of object-oriented action is described. Object attributes are processed in different neural pathways depending on the kind of task the subject is performing. During object-oriented action, a pragmatic representation is activated in which object affordances are transformed into specific motor schemata independently of other tasks such as object recognition. Animal as well as clinical data implicate posterior parietal and premotor cortical areas in schema instantiation. A mechanism is proposed that is able to encode the desired goal of the action and is applicable to different levels of representational organization.
5. Representation of goals for actions.
It has become progressively clear throughout this paper that the representation of an action cannot be limited to the parameters and constraints dictated by execution of the action by the motor system. The goal of the action must also be encoded. In this section, the idea will be developed that the goal of an action includes an internal representation both of the external object toward which the action is directed, and of the final state of the organism when this object has been reached. This conception has a long history in the literature, under the heading of "schema" theory. Head (1920) first used this concept to account for the maintenance and regulation of posture. It was later developed by others as an internal model of the body in action, built from sensations and previous responses to external stimuli (for a critical account, see Oldfield and Zangwill, 1942; see also Schilder, 1935 and Lashley, 1951).
More recently, Schmidt (1975) proposed that the motor response schema arises as a representation of the information elements present within the context of executing a movement (e.g., the initial condition of the musculature, the specification of the muscular commands, the sensory consequences of the movement, and the response outcome). Some of these entities (the monitoring of the initial state and the specification of the commands, for example) are independent of the movement itself, while others are consequences of it. Motor control and learning would thus result from an interplay between central and peripheral events, continuing until the actual outcome of the movement corresponds to the desired (represented) outcome. The notion of the representation of a goal therefore implies that the organism is looking ahead toward a new state, the representation of which steers the transformation until its completion. In the present section, this aspect of motor representations will be studied at two levels: that of the visuomotor transformation, and that of action planning.
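For concreteness, the central-peripheral interplay described by Schmidt's schema theory can be sketched as a simple update loop, in which a response schema maps a desired outcome onto a command and is corrected after each movement by the discrepancy between desired and actual outcomes. This is only an illustrative toy model; the class names, the linear rule, and the learning rate are assumptions, not part of Schmidt's formulation.

```python
# Toy sketch of Schmidt-style schema updating: the schema maps a desired
# outcome to a muscular command, and is corrected after each movement by
# the discrepancy between desired and actual outcome. (Hypothetical names.)

class ResponseSchema:
    def __init__(self, gain=0.5, learning_rate=0.2):
        self.gain = gain                  # command units per outcome unit (assumed linear rule)
        self.learning_rate = learning_rate

    def command_for(self, desired_outcome, initial_condition=0.0):
        # Specification of the command from the represented goal
        # and the monitored initial state.
        return self.gain * (desired_outcome - initial_condition)

    def update(self, desired_outcome, actual_outcome, command):
        # Central-peripheral interplay: adjust the schema until the
        # actual outcome corresponds to the desired (represented) one.
        if command != 0:
            error = desired_outcome - actual_outcome
            self.gain += self.learning_rate * error / command

def plant(command):
    # The "real" musculoskeletal transform, unknown to the schema.
    return 1.8 * command

schema = ResponseSchema()
for trial in range(50):
    cmd = schema.command_for(desired_outcome=10.0)
    schema.update(10.0, plant(cmd), cmd)

# After learning, the commanded movement lands near the desired outcome.
print(abs(plant(schema.command_for(10.0)) - 10.0) < 0.1)  # True
```

The point of the sketch is that nothing in the loop inspects the plant directly: only the mismatch between represented and actual outcome drives learning.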
5.1. The lower level: Objects as goals.
It will first be shown that, when objects are goals for actions, their visual attributes are represented in a specific way (the pragmatic mode), used for the selection of appropriate movements, and distinct from other possible modalities of representation used for other aspects of object-oriented behaviour (one of them being the semantic mode). This description of object processing is in continuity with previous attempts, as it has long been recognized that the same object or event can be processed in different ways, according to the task in which the subject is engaged. The distinction between different modes of processing, however, was usually grounded on the amount of overt cognitive content involved: hence the distinctions between conscious/non-conscious, explicit/implicit or automatic/controlled modes of processing (Bridgeman, 1989). Consciousness, however, is not at issue here, as it does not discriminate between semantic operations and motor or sensorimotor operations: both can be achieved without explicit awareness (see Holender, 1986).
The nature of the dichotomy proposed here would be closer to the distinction made by the Mishkin group on neuroanatomical grounds (e.g., Mishkin et al, 1983), between object vision and spatial vision, epitomized as the now classical distinction between "What?" and "Where?". At variance with Mishkin, however, it is postulated that the motor representation includes much more than the spatial aspects of movements. During object-oriented action, objects are not only located in space and reached; they are grasped, manipulated and used. For this reason, it seems preferable to classify object attributes, not with regard to putative anatomical or functional channels, but rather with regard to the observed interactions between the subject and the objects during a given action: object attributes would thus be classified only on the basis of their inclusion in one or the other aspect of object-oriented behaviour.
If it is the relevance of a given attribute to a particular task that governs its inclusion within a given type of representation, a large set of attributes is relevant to both representations: this would be the case for those contributing to shape, size, compliance, texture, etc. Others, by contrast, are probably irrelevant to the pragmatic representation (e.g., color) or are of little or no relevance to semantic processing (e.g., weight). This is the reason why the classification based on a distinction between spatial vision and object vision may be incomplete or even misleading. According to Kosslyn et al (1990), parameters related to object size should be specified at the same level as location or orientation, and should therefore pertain to the process of "spatiotopic mapping". Size, however, is not merely a spatial attribute: although phenomenal size depends on distance, this is not true for represented size, which remains invariant with respect to distance. Represented size, not phenomenal size, is thus part of object identity. The same could be said of shape, which obviously changes as a function of the relative positions of the object and the perceiver, though it remains representationally constant. The Marr distinction between viewer-centered and object-centered descriptions of objects may be useful in this context (Marr, 1982). Operations which require a viewer-centered description (localizing, reaching) activate a spatial ("Where?") system. Those which require an object-centered description (identifying, grasping, manipulating) activate other systems: one is the system for semantic processing (the "What?" system), and the other, which is under discussion here, is the system for pragmatic processing. This again stresses the fact that the same attribute can belong in theory to several representational categories according to the task which has to be performed.
5.1.1. Neuroanatomical evidence for multiple representations of objects.
Research in this field already has a long history, which can be followed across the various interpretations of the duality (in fact, the multiplicity) of central visual pathways. In spite of the above critiques, one of the most useful interpretations was that of Ungerleider and Mishkin (1982), who proposed that projections arising from primary visual areas follow two distinct pathways: the ventral route involving the occipito-temporal pathway with its main relay in infero-temporal cortex; the dorsal route involving the occipito-parietal pathway ending in posterior parietal cortex (see also Morel and Bullier, 1990, Baleydier and Morel, 1992). The functional value of this anatomical duality is supported by the effects of lesion of each pathway on monkey behaviour. Lesions of the ventral system primarily affect object recognition, whereas lesions of the dorsal system produce disturbances in object localization (Pohl, 1973).
Clinical observations, following the lead of Poppelreuter (1917), indicate that a similar duality in cortical functions also exists in man. More recent observations, however, pointed to a division of labour between the two pathways, different from what was initially suggested by the Mishkin group. Goodale et al (1991; see Milner and Goodale, 1991) reported the case of a patient who was unable to recognize objects (the classical picture of visual object agnosia). The patient was also unable to purposively size her fingers according to the size of visually inspected target objects (an easy task for normal subjects, see Jeannerod and Decety, 1990). In contrast, when instructed to take these target objects by performing prehension movements, the patient was quite accurate. Not only was she able to reach the object location, but she also preshaped her hand accurately according to the object size and shape. The lesion in this patient was likely to have interrupted the occipito-temporal (ventral) pathway. In fact, this observation fits into the broader framework of the preservation of object use in patients presenting semantic processing deficits. In these patients, lesions predominate in the temporal lobes and are usually more marked on the left side (Saffran and Schwartz, in press).
At variance with occipito-temporal lesions, posterior parietal lesions do not impair object recognition. Instead, they alter arm movements during reaching (the typical optic ataxia symptom) and also finger movements during preshaping and grasping. Patients misplace their fingers when they have to visually guide their hand to a slit at different orientations (Perenin and Vighetto, 1988). During prehension of objects, they open their finger grip too widely with no or poor preshaping, and close it when they come in contact with the object (Jeannerod, 1986b; see also Jakobson et al, 1992). These clinical observations, in showing that lesions can create a "double dissociation" between different types of deficits in object-oriented behaviour, confirm the hypothesis of selective semantic and pragmatic representation mechanisms. The prediction can therefore be made that, if this hypothesis is correct, motor images of object-oriented actions (e.g., grasping) and visual images of the same objects should be processed by different cortical areas. Recent results showing preferential activation of posterior parietal cortex (using a brain imaging technique) during visually guided finger movements (Grafton et al, 1992) partly support this prediction. It can also be predicted that the pattern of brain activation could be changed by only changing the requirement of the task in which the subject would be engaged (e.g., imagine manipulating vs imagine classifying the same objects).
These neuroanatomical data thus provide a framework for describing the neurophysiological substrate of the pragmatic representation. The function of the representation neurons postulated earlier in this paper stands in between a "sensory" role (to extract from the external world the attributes of objects or situations that are relevant to a given action) and a "motor" role (to encode some aspects of that action). A population of neurons located in the monkey posterior parietal cortex (area 7a) and studied by Taira et al (1990) might fulfill this criterion of a visuomotor representation. These neurons are selectively activated during manipulation by the animal of visual objects of a given configuration (e.g., a push-button, a handle, etc). Neither presentation of the preferred object, nor movements aimed at this object in the dark, are sufficient alone to activate them. Because they involve both a sensory and a motor counterpart, these neurons thus relate to a given type of action, rather than to a given motor pattern of the hand or to a given visual configuration. It is thus not surprising that lesions of this posterior parietal area in the monkey affect the ability to shape the hand to object size or orientation, in the same way as they do in man (Haaxma and Kuypers, 1975, Faugier-Grimaud et al, 1978).
Another population of neurons, located in premotor cortex (area 6), also pertains to the same category: indeed, these neurons seem to be complementary to those recorded in area 7. Their discharge relates to preparation of distal movements and is influenced by the way the acting hand is shaped prior to and during the action directed toward a target object: precision grip neurons, whole hand prehension neurons, etc., can be identified. In many cases, the hand used for the action is irrelevant, so that neuron discharge is altered in relation with movements of either hand. Finally, they are poorly responsive to visual objects presented outside the context of an action oriented toward them (Rizzolatti et al, 1988). The fact that a significant proportion of these premotor neurons have been found to be also activated during observation, by the animal, of their preferred object-oriented action performed by an actor (Di Pellegrino et al, 1992, see above) is of critical importance for their representational role. It is likely that, if properly examined, the parietal neurons described by Taira et al would also show a similar property. Mutual exchange between these parietal and premotor neuronal groups (granted by the extensive anatomical linkage between the two areas, see Morel and Bullier, 1990) seems a highly plausible substrate for pragmatic representation.
5.1.2. Predetermined motor patterns.
Actions like grasping relate to the object as a goal for the action. The object attributes are represented therein as affordances, that is, to the extent that they afford specific motor patterns, not as cues for a given perceptual category. This function does not imply binding of object attributes into a single entity, as would be the case in a semantic type of representation. Instead, each attribute contributes to the motor configuration of the hand by selecting the relevant degrees of freedom. In addition, as already suggested, this process is effected within an object-centered system of coordinates, as shown by the fact that the hand configuration for grasping an object is not affected by its location with respect to the body and that, conversely, the opposition space tends to remain invariant with respect to the object. There are limitations to this assertion, however, because the arm geometry creates constraints that may require changes in hand configuration for extreme positions of the object. Examples of such constraints were shown in the experiments of Stelmach et al (In press) and Rosenbaum et al (1990).
It can be proposed that the motor patterns corresponding to the object affordances, which unfold during the movement, are predetermined within the pragmatic representation. Indeed, none of the aspects of prehension movements can be shown to depend on direct visual control of the hand. The coordination of reaching and grasping, the biphasic pattern of grip formation, the covariation of maximum grip size with object size, etc., can all be achieved in situations where the hand remains invisible to the subject (Jeannerod, 1981, 1984). In addition, the level of muscular force involved in prehension is largely specified during the preshaping phase, in anticipation of contact with the object. Lifting an object implies a sequence of coordinated events where the grip force (to grasp the object) and the load force (to lift it) vary in parallel (see Johansson and Westling, 1988). Several experiments have shown that available information about object weight (e.g., based on visual size cues) can accurately determine grip and load forces in advance with respect to the grasp itself (Gordon et al, 1991).
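The anticipatory specification of forces reported by Johansson and Westling (1988) and Gordon et al (1991) can be caricatured as follows: an expected weight is inferred from visual size, and grip force is set in advance, in parallel with load force, with a margin above the slip limit. The assumed density, friction coefficient and safety margin below are invented for illustration and carry no empirical weight.

```python
# Illustrative (not empirical) sketch of anticipatory force setting in
# prehension: expected weight is inferred from visual size via an assumed
# density, and grip force is scaled in parallel with load force before
# contact, with a margin above the slip limit.

FRICTION_COEFF = 0.4   # assumed skin-object friction (hypothetical value)
SAFETY_MARGIN = 1.3    # grip-force margin above the slip limit (hypothetical)

def planned_forces(visual_volume_cm3, assumed_density_g_cm3=1.0):
    # Expected weight in newtons, from visual size cues alone.
    expected_weight_n = visual_volume_cm3 * assumed_density_g_cm3 * 9.81 / 1000
    load_force = expected_weight_n                            # force needed to lift
    grip_force = SAFETY_MARGIN * load_force / FRICTION_COEFF  # parallel scaling
    return grip_force, load_force

grip, load = planned_forces(200.0)   # a 200 cm^3 object
print(load, grip)
```

The essential point captured here is that both forces are computed before the object is ever touched: nothing in the function depends on tactile feedback.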
These results point to the fact that actions are driven by implicit knowledge of object attributes. To account for this fact, Arbib has proposed a simplified form of representation, the "motor schema" (Arbib, 1985; Iberall et al, 1986; Iberall and Arbib, 1990). Arbib's conception is that motor representations are composed of elementary schemas which are activated by object affordances and can adjust to visual input. During the action of prehension, motor schemas for the subactions of "reach", "preshape", "enclose", "rotate forearm", or for the selection of the number of fingers involved, etc., would be available and would be automatically selected when required by object affordances. These are functional units which can be assembled into a limited number of postures: the posture selected during the preshape defines the optimal opposition space for applying the required forces to the object (Iberall et al, 1986). Finally, it is especially interesting in the present context that Arbib recently included in his schema model a "look-ahead" device, whereby a given schema would remain activated until completion of the corresponding action (Hoff and Arbib, 1992). This notion was introduced into physiology some time ago (see Jeannerod, 1990). Robinson (1975), for example, in his model for the generation of saccadic eye movements, specifically proposed that the spatiotemporal course of saccades is obtained by subtracting a signal related to the actual position of the eyes from a signal related to their intended position, thereby driving the eyes toward the target until the difference between the two signals becomes zero.
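Robinson's subtraction scheme lends itself to a minimal discrete-time sketch: the difference between intended and actual eye position drives the eye until that difference becomes zero. The gain and step structure below are illustrative simplifications of the original continuous model.

```python
# Minimal discrete-time sketch of Robinson's (1975) scheme: the motor error
# (intended minus actual eye position) drives the eye until it becomes zero.
# Gain and step count are illustrative, not fitted to real saccade dynamics.

def saccade(intended_deg, eye_deg=0.0, gain=0.5, max_steps=100, tol=1e-3):
    trajectory = [eye_deg]
    for _ in range(max_steps):
        motor_error = intended_deg - eye_deg   # the subtraction in the model
        if abs(motor_error) < tol:
            break                              # difference is zero: movement ends
        eye_deg += gain * motor_error          # velocity proportional to error
        trajectory.append(eye_deg)
    return trajectory

path = saccade(10.0)
print(abs(path[-1] - 10.0) < 1e-2)   # True: the eye stops at the target
```

Note that no explicit duration or endpoint is programmed; the movement simply terminates when the internal comparison yields zero, which is exactly the look-ahead property the schema model borrows.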
This idea of a central coding of the "desired" position of an effector system has been proposed for the control of various kinds of movements, e.g., speech movements (MacNeilage 1970, Abbs and Gracco 1984), arm movements (Pelisson et al, 1986) or finger movements (Cole and Abbs 1987, Paulignan et al, 1991a, b). Most of the above authors drew their conclusions from experiments involving sudden perturbations occurring during movement execution. The corrections in movement trajectory and/or kinematics in response to these perturbations can be so fast (usually within less than 100 ms) that they cannot be due to programming another movement based on feedback error detection. Instead, they have to rely on an open-loop adjustment of the ongoing program. This suggests that the central representation must be a "dynamic" structure, in the sense that it permanently monitors movement-related signals (e.g., proprioceptive) and compares them with the ongoing efferent commands. Any deviation arising from this comparison would immediately trigger corrections (see Prablanc et al. 1979; Jeannerod 1990; and for a computer model using the same principle, Bullock and Grossberg 1988). This property of the pragmatic representation of steering the action toward a goal-state will be further developed in the final section.
The level at which motor schemas are implemented neurally is obviously quite hypothetical. The neurons recorded by the Sakata group in area 7a and the Rizzolatti group in area 6 would fulfill the requirements for "schema neurons". They can be compared with neurons recorded in the other representational system (the ventral system for semantic or iconic representation of objects). Those are not mere feature detectors; they are endowed with high-order visual properties: they are selective for complex visual stimuli (e.g., faces) and generalize their response across different modes of presentation of the same stimuli (see Perrett et al, 1982).
A more global approach to this problem has also been developed, contributing to the same notion that preshaping, manipulation and tactile exploration of objects are based on predetermined schemas. Klatzky et al (1987) showed that subjects tend to classify usual objects into broad categories, the boundaries of which are determined by the pattern of hand movements these objects elicit when they are to be grasped, used and manipulated. Four main prototypical hand shapes (e.g., poke, pinch, clench or palm) seemed to be sufficient for defining the interaction between the hand and most usual objects. Conversely, when subjects were shown unfamiliar forms and were asked to indicate which hand shape was the most appropriate, they generated highly regular responses, which could be predicted from the geometrical parameters of the forms (e.g., the area of their projecting surface in the frontal plane). This differentiation of hand shapes according to the form of objects was retained in preshaping during actual reaching (see also Pellegrino et al, 1989). Finally, the same authors trained subjects to produce the various hand shapes in response to visual presentation of these shapes. They showed that, in the trained subjects, presentation of a hand shape subsequently facilitated the judgements they made about the feasibility of interactions with objects. The same was not true if the cue was presented verbally (Klatzky et al, 1989). These results, which demonstrate that representation of a motor configuration of the hand influences knowledge about manual interactions with objects (and conversely), are reminiscent of the effects of motor imagery on motor performance, described in section 3.1.
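The mapping from object geometry to the four prototypical hand shapes can be illustrated by a simple rule-based sketch; the size thresholds below are hypothetical and are not taken from Klatzky et al (1987).

```python
# Hypothetical rule-based sketch of hand-shape selection from object
# geometry, in the spirit of Klatzky et al (1987). The size thresholds
# (in cm) are illustrative, not values from the study.

def hand_shape(length_cm, width_cm, depth_cm):
    max_dim = max(length_cm, width_cm, depth_cm)
    min_dim = min(length_cm, width_cm, depth_cm)
    if max_dim < 1.0:
        return "poke"        # too small to grasp: single-finger contact
    if min_dim < 2.0 and max_dim < 8.0:
        return "pinch"       # small, thin object: thumb-finger opposition
    if max_dim < 12.0:
        return "clench"      # graspable volume: full-hand closure
    return "palm"            # large object: flat-hand support

print(hand_shape(0.5, 0.5, 0.5))    # tiny button  -> poke
print(hand_shape(5.0, 1.0, 0.3))    # key-like     -> pinch
print(hand_shape(7.0, 7.0, 7.0))    # ball         -> clench
print(hand_shape(30.0, 20.0, 2.0))  # tray         -> palm
```

The regularity of subjects' responses to unfamiliar forms is what licenses a deterministic rule of this kind as a first approximation.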
5.2. The higher level: Long-term action planning.
The lower level of representation of actions, as tentatively described in the foregoing sections, is only one of the possible levels where representation mechanisms might operate. Object-oriented actions are embedded into broader representations where longer-term plans are encoded. It is beyond the scope of this paper to attempt a full description of this upper level, because it would involve consideration of the many processes underlying selective attention, motivation and decision (see Butter, 1987 and Heilman et al, 1987, for reviews). Only a few suggestions can be made, showing that the same general framework which has been used for the lower level may also tentatively be applied.
One has first to conceive the existence of higher-level schemas controlling the selection, activation or inhibition of the schemas operating at the lower level. These higher-level schemas are action plans where, in addition, the serial order of the movements needed to achieve the action is represented (Lashley, 1951; for a review, see Keele et al, 1990). Human neuropsychology provides indications as to the reality of such a hierarchical organization for the supervisory control of action (see Norman and Shallice, 1980, Shallice, 1988). One of these indications comes from observation of a group of patients with prefrontal lesions, who present the so-called "utilization behaviour" (Lhermitte, 1983). These patients will compulsively imitate gestures or even complex actions performed in front of them by an experimenter. Similarly, when faced with usual objects (e.g., a glass and a bottle of water) they will not be able to refrain from using these objects in a compulsive way (pouring water into the glass and drinking large quantities of water, etc). This striking behaviour can be explained by an impairment of the inhibitory control normally exerted on the elementary motor schemas. In addition, it stresses the role of prefrontal cortex in organizing behavioural inhibition. One possible consequence of this impairment is that frontal patients presenting this syndrome should be unable to generate motor imagery without immediately transferring the imagined action into motor output.
Whereas utilization behaviour seems to represent an exaggerated expression of the activity of high-level representation neurons, another pathological situation, ideomotor apraxia, provides the opposite picture. Apraxic patients have difficulties imitating movements and cannot perform symbolic gestures (those which require the use of a stored representation). They often cannot reconstruct the sequence of movements which are needed to achieve a complex action. Lehmkuhl et al (1981) showed that this deficit was not due to a memory problem in the usual sense, but rather reflected an impairment of the representational stage of sequencing: their apraxic patients were not able to correctly order cards depicting the successive steps of actions like preparing tea, for example. As the elementary schemas are usually preserved in such patients, their deficit should lie in selecting and organizing schemas into a purposive action. A further logical step would be to examine apraxic patients for their ability to generate motor imagery, with the idea that their deficit in selecting and organizing schemas would also be reflected in a deficit in evoking actions mentally. This possibility was explicitly considered by Roy and Hall (1992), their hypothesis (even more radical than ours) being that, "if an impairment in image generation does underlay apraxia, one might predict a coincidence between the severity of the image generation deficit and the severity of apraxia" (p 276). Clinical observations indicate that this hypothesis might be correct. Heilman et al (1982) identified one type of ideomotor apraxia following left posterior parietal lesions, where patients were impaired in recognizing gestures: their interpretation was that the patients had lost the "visuokinesthetic engrams" needed for building up a representation of these gestures.
The parietal neurons involved in these lesions obviously should have a higher hierarchical status than those underlying the low-level pragmatic representation.
In fact, it should be possible to study the higher level using simple experiments similar to those described for the lower one. The Rosenbaum et al (1990) situation, for example, stresses that hand shapes seem to be planned in anticipation of movements to be performed at a later stage in the sequence, that is, beyond the movement which is currently prepared. A similar result was reported by Marteniuk et al (1987). Subjects had to reach for a small disc, and then either to fit the disc into a tightly fitting well (fit condition), or to throw it into a larger container (throw condition). Marteniuk et al found that the kinematics of the initial reach, which was common to the two conditions, differed according to whether the instruction was to fit or to throw. In the fit condition, the peak velocity of the reach was lower and a larger proportion of movement time was spent decelerating. These results clearly indicate that motor representations, including those during motor preparation, expand in time far beyond the movement which is actually prepared.
The higher level is more difficult to describe in neurophysiological terms. The functional operations underlying motor planning, preparation and imagery must involve large neuronal ensembles, which are likely to be widely distributed. In addition, high-order representation neurons should encode complex goals, not only affordances. For this reason, their discharge should reflect the temporal contingency of the action required for reaching the goal. In other words, they should not be influenced by completion of the intermediate steps of the action (e.g., by activation of elementary schemas), but should rather continue firing until the final goal has been reached. One possibility would be that these neurons encode final configurations (of the environment, of the body, of the moving segments, etc) as they should arise at the end of the action, and that they remain active until the requested configuration has been obtained. This sustained activity would represent the reference (the goal) to which the current state of execution of the action would be compared (Jeannerod, 1990). Accordingly, these neurons would remain activated as long as the represented action would not be completed, including in situations where the execution would be blocked. In this case, where the action would not take place, the sustained discharge would be interpreted centrally as a pure representational activity and would give rise to mental imagery. At the moment, only a few experimental results suggest that neurons in the monkey prefrontal cortex might have properties relevant to this function (Fuster, 1985; Barone and Joseph, 1989). Interestingly, the prefrontal areas where these neurons are located are reciprocally connected with abundant projections to and from the posterior parietal cortex, particularly the inferior parietal lobule (see Petrides and Pandya, 1984) and, in addition, project on the premotor area. 
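The proposed property of sustained goal-related discharge can be made concrete with a toy comparator: a unit encodes a desired final configuration and remains active until the current state matches it; if execution is blocked, it never shuts off, which would correspond to purely representational (imagery) activity. All names and dynamics below are hypothetical.

```python
# Toy sketch of a goal unit with sustained activity: it encodes a desired
# final configuration and keeps firing until the current state matches it.
# If execution is blocked, activity persists -- a stand-in for pure
# representational (imagery) activity. All names are hypothetical.

def goal_unit(desired, step_toward, max_steps=50, tol=1e-6):
    state = 0.0
    active_trace = []
    for _ in range(max_steps):
        active = abs(desired - state) > tol     # sustained discharge while goal unmet
        active_trace.append(active)
        if not active:
            break                               # requested configuration obtained
        state = step_toward(state, desired)     # one step of execution (may be blocked)
    return state, active_trace

# Normal execution: move 30% of the remaining distance on each step.
final, trace = goal_unit(1.0, lambda s, d: s + 0.3 * (d - s))
# Blocked execution: the effector never moves, so the unit never shuts off.
blocked_final, blocked_trace = goal_unit(1.0, lambda s, d: s)

print(trace[-1])            # False: discharge stops once the goal is reached
print(all(blocked_trace))   # True: the blocked unit stays active throughout
```

The sketch captures the two predictions made in the text: activity is insensitive to intermediate steps (only the final mismatch matters), and it outlasts any blockade of execution.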
It is thus possible that the elementary schemas available in the posterior parietal and the premotor areas can be gated by the prefrontal cortex for achieving the selected action plan. New experimental designs will be needed for a more complete demonstration of this mechanism.
6. Conclusion.
Any study dealing with motor behaviour must take into account the fact that the overt aspect (the movement) is only part of the entire phenomenon. The hidden part (the representation) exists in its own right. The two are equally important to study, because, as Bernstein (1967) conjectured, they do not map entirely onto each other. Describing the overt movement does not give full access to the representation; and conversely, fully describing the representation (if this were at all possible) would not tell what the corresponding movement would be. The reason for this discrepancy is that execution involves biomechanical constraints related to implementation by the musculoskeletal apparatus, and distortion by external forces, which are not necessarily represented centrally. In addition, representations are likely to be endowed with properties (partly built on experience from previous actions) which may not be apparent in their eventual motor counterpart. They seem to be structured with different levels of organization; they use cognitive rules for establishing the serial order of action parts, for assembling programs, etc. To be provocative, the overt movement is not a reliable source of information on its own representation.
One of the aims of this paper was to show that motor representations can indeed be observed and described in themselves, due to our strange ability to explore our own minds and display our mental states. A specific methodology has to be used, because imagined or represented movements are not easy to describe verbally (as opposed to other types of representations), a property which limits the impact of paradigms derived from cognitive psychology. In addition, methods exclusively based on subjective verbal reports or on observation of behavioural responses would overemphasize the "macroscopic" aspects of motor representations, those for which it is difficult to find precise neural correlates.
An attempt was therefore made to describe more "microscopic" aspects, by inferring the content of the representation from recordable physiological correlates. This attempt leads to the operational definition of a pragmatic representation, specific for the different levels at which an action is imagined, planned and prepared. Observation of autonomic changes during imagined action, psychophysical analysis of motor "sensations", effects of brain lesions, etc., are potential windows into its hidden part. More and more direct measures will become available, particularly with the improvement of brain imaging techniques in man, and with the extension of the representation paradigm in monkeys. The real challenge will be to directly observe a representing brain and to record therein the activity of neuron populations fitting the criteria for representation neurons.
One of the most stringent criteria is that once the proper neurons have been selected, the resulting network must remain active as long as the action is not performed: this enduring activity is the basis for the central representation of a goal to which execution can be compared. The comparison mechanism takes place simultaneously at several levels, for controlling execution of the whole action as well as its basic elements. Combination of this framework of interlocked schemas with the concept of enduring activation will provide a basis for further experimental description of the pragmatic representation.