
Action selection - The brain as a behavioural organizer

M. Heisenberg, Rudolf-Virchow-Zentrum, Würzburg University, Josef-Schneider-Str. 2, D15, 97080 Würzburg, Germany

Key words: Autonomous agent, Operant behaviour, Initiating activity, Chance, Behavioural module

Animals owe much of their fitness to their behaviour. They often have a large behavioural repertoire which they have to manage. For this they need their brain. Using Drosophila as the study case, the present account depicts animals as autonomous agents and the brain as a behavioural organizer. Behaviour is active. It is generated for its consequences. It serves to change or restore the animal's condition, with no guarantee of improvement. There are two kinds of activity: reactivity and initiating activity. If in a special situation the animal's repertoire contains a behaviour with a sufficiently positive inferred outcome and this behaviour is activated, it is called a reaction. Most situations, however, provide no special cues for which reactions would be available. Animals do not have to wait. They can activate behaviours 'by themselves', in search of one with a positive outcome.

Introduction

Animals have to generate the right behaviour at the right moment and the right place. Action selection is the main task of the brain. Behavioural brain science, in particular with small animals such as insects, has attributed a pivotal role in this process to the sensory stimuli, which are thought to be transformed into motor commands to initiate behaviour. Here I would like to offer an alternative view based on the autonomy and, more precisely, the self of an animal. I argue that most behaviours are active and that animals, like humans, are agents [7,8]. Sensory stimuli trigger behaviour only in exceptional cases. Some guide the ongoing behaviour and probably all of them are evaluated in the search process. In this article the concept of the brain as a behavioural organizer is examined.
It will be shown that as soon as one takes the organism as an autonomous agent, the behavioural organizer can readily replace the stimulus-response doctrine. In the new concept there is no need to assume that every behaviour that occurs goes back to a set of sensory stimuli that triggered it. Behaviour is initiated from within, spontaneously, actively. This is part of the organism's self-ness. It will also become apparent that there is no alternative to the brain as a behavioural organizer: if an organism has many behavioural options, the brain has to search for the right one. And this can be a most demanding task.

The behaviours that matter for action selection come in discrete modules. These are readily discernible in insects. A behavioural module in its simplest form is a precisely orchestrated sequence of muscular contractions and relaxations driven by a central pattern generator (CPG) [2]. Most of the time a CPG is silent; its motor pattern is inhibited. Most behavioural modules are mutually exclusive. While it is inhibited, we call a module a behavioural option. Relief from inhibition of the CPG releases the motor pattern. From birth to death the overt life of an animal or human is a sequence of activated behavioural modules. Terminating an ongoing module and selecting the module to be activated next is what a brain does. If an animal has many options this process may well be a search rather than just a choice. The search is shaped by the network of motivational states, dispositions, moods and feelings, sensory information, attention, memories, the current physiological and anatomical state, the availability of the options (modules), etc. As the search evaluates not so much the options per se but rather their inferred consequences, these must be represented in the brain and must be continuously updated. This is why the search is so demanding.
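
The picture of modules held under inhibition and released one at a time can be illustrated with a toy data structure. What follows is only a minimal sketch under stated assumptions, not a model taken from the fly literature: the names BehaviouralModule, Organizer, options and release are hypothetical, strict mutual exclusivity is assumed for all modules (the text says only "most"), and the sketch says nothing about how the search arrives at the module to be released.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BehaviouralModule:
    """Hypothetical stand-in for a motor pattern driven by a CPG."""
    name: str
    inhibited: bool = True  # while inhibited, the module is a mere option


@dataclass
class Organizer:
    """Toy organizer: assumes at most one module is released at a time."""
    modules: Dict[str, BehaviouralModule] = field(default_factory=dict)
    active: Optional[str] = None

    def add(self, name: str) -> None:
        # New modules start out inhibited, i.e. as behavioural options.
        self.modules[name] = BehaviouralModule(name)

    def options(self) -> List[str]:
        # All currently inhibited modules, in alphabetical order.
        return sorted(n for n, m in self.modules.items() if m.inhibited)

    def release(self, name: str) -> None:
        # Terminate the ongoing module, then relieve the next from inhibition.
        if self.active is not None:
            self.modules[self.active].inhibited = True
        self.modules[name].inhibited = False
        self.active = name


organizer = Organizer()
for module_name in ("walking", "grooming", "feeding"):
    organizer.add(module_name)

organizer.release("walking")
organizer.release("grooming")  # walking falls back to being an option
```

In this sketch the overt behaviour is simply the sequence of calls to release; everything the essay calls the search would sit in whatever decides the argument of that call.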

The search among the options includes their ranking. Some options may linger in the background for most of a lifetime with low priority, waiting for the right occasion. On the other hand, the search must be organized such that many options can be activated at any moment. A particular combination of sensory stimuli may prompt the immediate release of a certain module (stimulus-response). Once the search process has converged to a choice among few options, we call it a decision. To compare the inferred consequences of behavioural options, the brain has to quantify their benefits and costs on a common value scale and has to store the probabilities for these to come true. Behaviours as different as, for instance, hiding, courting and migration are difficult to compare. Immediate and delayed benefits of an option may have to be weighed against each other, and in some cases the potential value of a behavioural module can only be assessed if one or even multiple follow-up rounds of choice are included in the consideration. It has long been recognized that action selection must be based on the inferred consequences of these actions [6]. We should be impressed by how well this principle works in the brain, but it also requires considerable error-friendliness. In the following I will focus on a particular brain, that of the fly Drosophila melanogaster, assuming that all auto-mobile animals with brains have the same basic organization of behaviour. As the fly brain, by the number of neurons, is about a million times smaller than ours, its organization may eventually be more readily understood.

Behavioural modules

Let us go through the fly's known repertoire of behavioural modules.
These are: heart beat, opening and closing of spiracles, swallowing, egg laying, eclosion, feeding, defecating, flight (hover flight or forward flight), various postures, flight start, landing, walking at different gaits, jumping, climbing, different kinds of grooming, courtship with its several components, copulation, and fighting at different levels of escalation. In addition, one observes movements of body parts, such as the head, antennae, maxillary palps, proboscis, single legs and abdomen. Surely, this list is not complete. Not all modules are part of the selection process. For instance, heart beat goes on throughout life, and eclosion occurs only once at the onset. To what degree modules like opening and closing of the spiracles, swallowing, defecating and egg laying are part of action selection is not known. Might flies be able to control the excretion of pheromones? In most cases of modules involving legs and wings only one at a time can be activated, with some obvious exceptions such as landing and flight. 'No behaviour' can be any of at least three states: rest, sleep and hibernation. In other insects a further state of 'no behaviour', freezing, is observed.

The prevalent and probably oldest module in the context of action selection is locomotion (walking, running). Walking comes in bouts of variable duration and occurs at different speeds. The fly has two walking patterns, a tetrapod gait for lower and a tripod gait for higher speeds. Even with these two motor patterns the range of speeds is small [21]. It may be useful and even necessary to differentiate some of the options according to the goals the fly pursues with them. This brings up the peculiar quality of space, which contains nearly infinitely many locations. Taking into account that a fly can walk or fly to any of them, one arrives at a large number of behavioural options among which the brain must search. The same module (e.g. walking) may be activated with different goals and in different contexts, with very different inferred consequences.

To some extent flies can also compose behavioural sequences from simpler modules and improve them by practice [17]. If they have to climb across a wide gap they place their hind and middle legs strategically in order to reach out as far as possible. This again increases the number of different behavioural options. Finally, flies can even find new combinations and sequences of modules by trying out. Already for a fly the repertoire of options is open, and action selection based on the inferred consequences of the actions is a highly involved process.
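
The requirement stated earlier, that benefits and costs be quantified on a common value scale together with the probabilities of their coming true, can be made concrete with a small calculation. The sketch below is an illustration under loud assumptions rather than a description of what the fly brain computes: it reduces the comparison to a plain expected value, ignores delayed benefits and follow-up rounds of choice, and uses invented numbers. The names Outcome, expected_value and rank_options are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Outcome:
    probability: float  # stored probability that this outcome comes true
    benefit: float      # benefit on the common value scale (invented units)
    cost: float         # cost on the same scale


def expected_value(outcomes: List[Outcome]) -> float:
    """Probability-weighted net value of one behavioural option."""
    return sum(o.probability * (o.benefit - o.cost) for o in outcomes)


def rank_options(options: Dict[str, List[Outcome]]) -> List[str]:
    """Option names sorted from highest to lowest expected value."""
    return sorted(options, key=lambda name: expected_value(options[name]),
                  reverse=True)


# Invented numbers, purely for illustration.
options = {
    "hide": [Outcome(0.9, 2.0, 1.0), Outcome(0.1, 0.0, 1.0)],
    "court": [Outcome(0.3, 10.0, 2.0), Outcome(0.7, 0.0, 2.0)],
    "migrate": [Outcome(0.5, 6.0, 5.0), Outcome(0.5, 0.0, 5.0)],
}

ranking = rank_options(options)
print(ranking)
```

With these made-up values courting ranks first although it succeeds least often, because its rare benefit is large. The same module could enter the table several times with different goals, which is one way the number of options grows.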

Outcome expectation

As stated above, for suitable options to be selected, these must be represented in the brain by their inferred consequences. Such representations have been called 'outcome expectations' [20]. The term 'expectation', which is used in a similar way in learning and memory research [19], alludes to the unpredictability of the future. It tries to capture the possible deviation of the later realization from what is inferred. Even the effects of the most stereotyped sensory-motor reflexes cannot be predicted with certainty. For some modules, such as feeding or copulation, the outcome is part of the behaviour. They are called consummatory acts. Their immediate consequences seem certain: food uptake and transfer of sperm, respectively. Yet their longer-range outcome is not. Food may be poisonous, the copulation partner may have deleterious genes, etc. Moreover, the consummatory act has costs. At the feeding site or during copulation the animal may be exposed to predators and other dangers. The same outcome may be reached with lower costs at a later time. This shows that even with consummatory acts difficult trade-offs may be involved which need to be taken into account during the search process.

Examples of how important outcome expectations are for the search process in Drosophila are legion. I mention only a few. In dark, narrow tubes threatened flies are strongly attracted by near-UV light. Arguably, this is because in nature the sky is the only source of bright UV-rich light, and the fly wants to escape from the tube and get into flight. This interpretation is corroborated by the observation that rendering the fly flightless but fully capable of walking, for example by cutting the wings, abolishes the high attractiveness of UV light. Why? Getting into the open without being able to fly away would put it at higher predation risk than in the tube [9].
The second example is much simpler: Drosophila trained to expect food with a certain odour will later search for the food in the odour's vicinity, but this conditioned search does not occur if the animal is satiated [13].

Walking gets the fly to a new location in space offering new opportunities and risks. Surely, the fly must have an inherited (phylogenetically fixed) 'expectation' of these very general consequences of locomotion. If enclosed without food and water and left to itself, the fly switches between rest and activity periods in a characteristic sequence of ordered randomness [15]. This may be the optimal trade-off between saving energy and not missing a chance for escape. As mentioned above, locomotion most often occurs with a goal such as escape, finding food, shelter or a mate, or just reaching a strategic position (e.g. open space). These different options may have vastly different inferred consequences. How goals influence the search process is a major theme in behavioural research.

It is well known that ongoing walking and flight are accompanied by outcome expectations. For instance, a tethered fly at a torque meter, surrounded by a vertical drum with black and white stripes, can be made to control with its yaw torque the rotations of the drum in a horizontal plane (flight simulator). In this arrangement it can be shown to 'expect' the drum to rotate against the direction of its intended turns [10]. In this situation the fly can be trained by an infrared laser beam and chromatic changes in the illumination to avoid certain orientations relative to the drum. The fly turns away from the sectors where it would be heated before reaching them. It anticipates the heat as the outcome of a certain turning manoeuvre [26]. Not only can flies learn about the consequences of their behaviours but also about conditions of the outside world (classical learning [23]).
This kind of adaptation in the search 'space' is not associated with a particular behavioural option but with many or all of them (see also Concluding remarks). R. Strauss and co-workers have shown that before crossing a gap flies estimate the success of their climbing attempt [17]. As the width of a gap they can cross depends upon their body size, they have to learn individually how large a gap they might still be able to cross [B. Kienitz, T. Krause and R. Strauss, pers. comm.]. In ants, locomotion has been shown to have an outcome expectation regarding step size, because these insects misestimate the distance travelled if their step size is manipulated [24].

Motivational states such as hunger, thirst, fear, rage, love or tiredness are thought to be important organizers of the search process. In Drosophila it has been difficult to study them. Arousal has been defined as a period of increased and faster walking after a stressful experience [14]. There is a strong case for pain [1]. Are flies in love while courting? We still do not know. Courtship has been studied in flies for more than a century. The readiness of females to copulate is increased by male courtship song, and this conditioning effect outlasts the sound [12]. Is there a hunger state in flies? This is suggested by the observation that food deprivation has a variety of behavioural effects. It raises, for instance, locomotor activity, changes the attractiveness of certain odours and lowers the threshold for the attractiveness of sugar. Moreover, neuropeptide F (dNPF), which regulates feeding and is released from the respective neurons under starvation, is also involved in the read-out of appetitive odour memories [13].

To finish this section let us return to the role of sensory stimuli in the search process. Signals from all sensory modalities arrive in the brain all the time. The brain makes use of them in its search for the right behaviour. They need to be interpreted as to their behavioural significance. They may modify the outcome expectations and change the hierarchy of the behavioural options for the next selection. They can affect motivational states and cause a reassessment of goals. They provide what I have called 'orientedness', the disposition to behave with the right changes in orientation and position when necessary. Stimuli triggering a behaviour are the rare exception. Such sensory-motor reflexes mostly serve in emergencies. We can safely assume that flies, like humans, avoid settling in environments where these are frequently needed. Surely, guiding ongoing behaviour is a prevalent task of the sensors.
The active brain

While it is evident that the incoming sensory stimuli serve the brain in its search for the right behaviour, it may be less obvious why initiating activity is so important in this process. The answer is that behaviour deals with an open future that is only partially predictable. For flies, as for humans, most situations are unique and often new, with risks and opportunities. Flies need to be inventive and to take their chances. For many situations the search does not converge onto a single behavioural option with a clearly superior outcome expectation. Two or more options may score equally low. Often, but not always, the right timing of a behaviour is important. Flies, like humans, have to solve problems by trying out something new, activating an option never activated in such a situation before [9]. The importance of activity is well expressed by the 'golden rule' of behaviour: Do not wait until you are forced by circumstance. It applies not only to flies and humans but also to companies and countries.

As animals and humans are autonomous agents, their behaviour may occur spontaneously, by itself, involving the catalytic element of chance. More than for most other animals, behavioural research on Drosophila provides compelling evidence that the search process makes use of chance as an adaptive element. One such example is the decision conflict: if flies in a narrow tube with a light at one end (as discussed above) are shaken to the other end, their probability of immediately 'running' or 'not running' towards the light is 80% and 20%, respectively. If the 'runners' and 'non-runners' are separated and the experiment is repeated with either group, one finds in both of them again 'runners' and 'non-runners' with the same 80/20 ratio as in the first experiment [3]. In each round the search process arrives at the same ambiguous answer and the flies again take their chance. A second example is the fly's ability to solve problems by trying out.
With little technical sophistication one can design a set-up in which the fly has to generate a certain behaviour chosen by the experimenter in order to switch off a threatening heat or electric shock. In most cases the fly quickly finds out what to do [11]. In tethered flight the fly can beat its wings, turn its abdomen, lift its antennae and move its legs but cannot execute its intended flight manoeuvres. It must be trying to escape this disastrous situation. If under these conditions one records the fly's yaw torque and forward thrust, one observes that the fly continuously modulates these flight forces. In addition, it probably activates many other behaviours compatible with flight. What we observe is that it immediately takes advantage of any consequences of its behaviours it can detect. If it measures a coincidence between one of its actions and an incoming sensory signal, it tries to confirm it and, if successful, uses this minute degree of freedom to make the best of it. The feedback signal may hint at how to further improve the situation for an eventual escape [25].

These examples show that the search process to a large extent involves learning about the consequences of one's own behaviour. Much of what happens is new. Fortunately this truism applies not only to the outside world but also to the universe of behavioural options and their consequences. A tethered fly may be in a state of maximal arousal and its continuous activity may not be representative. Yet the sequence of turning attempts in this situation can be shown to be a well-organized probabilistic behaviour pattern [16]. Also, walking flies left alone in a dark chamber organize their time between activity and rest periods (see above [15]). Behavioural science has long adopted methods to account for the continuous stream of activity in observational data (ethograms [18]), even under more natural conditions.

Action selection

I have argued that action selection, the search for the right behaviour, is the basic task of the fly brain. Many of the behavioural properties mentioned above are in one way or another part of this process. Yet action selection has rarely been addressed explicitly. EA Kravitz and his group studied it in male-male interactions and identified three octopaminergic neurons in the suboesophageal ganglion that influence the decision between courtship and aggression [4, 5]. A. Guo and co-workers investigated behavioural choices in stationary flight by setting up a decision conflict [22].
They trained flies to turn towards or away from two kinds of landmarks that differed in two features, colour and shape. Once one landmark had become attractive and the other repulsive, they switched the combinations of features for the subsequent retrieval test such that now all landmarks had one attractive and one repulsive feature. The strengths of the memory traces had been balanced to make the landmarks equally attractive for the flies after the switch. Surprisingly, when the intensity of the colours was increased for the retrieval test, the flies opted for this feature and ignored the shapes. Conversely, if the shapes had been made more distinct for the test, flies disregarded the colours [27]. Flies did not take into account that the changed features differed from those presented during acquisition. The search process appeared to be governed more by the salience of the sensory stimuli than by the reliability of the memory traces. These examples show how little we know about the search process. It is not yet possible to describe its outline, even for a brain as small as that of the fly.

Concluding remarks

This essay does not pretend to reinvent behavioural brain science. In the search process the brain serves all that brains are known to serve: it stores and integrates experiences from the past to secure the future, it mitigates in advance the pressures of demands and constraints and, among other things, it establishes and preserves (in humans over many decades) the behavioural uniqueness and continuity of the individual. As far as mental processes have an evolutionary origin and can be ascribed to the brain, they are taken in this approach as properties of the search process. Motivational states such as moods, emotions and feelings direct the search process to certain regions of the search space, and anything like thinking can be understood as a landscaping activity in this space for future searches.
With time, brains have evolved more and more of these landscaping activities. For instance, an egocentric representation of outside space with its general properties of up and down, right and left, front and rear, far and close, would improve the search under many circumstances. It would improve 'orientedness', to which I have alluded above. Likewise, the representation of time, as provided by the circadian clock, would be understood as a property of the search process. As an overarching framework of functional brain science this perspective tells us to understand sensory integration, motivational control, learning and memory, cognition, intentions, selective attention, decision making, motor programming and planning as one integrated process: trying to do the right thing. Animals and humans are autonomous, auto-mobile agents. Activity is the most fundamental property of behaviour, deserving more recognition than it has received in the past.
Acknowledgement: I thank Bertram Gerber and Randolf Menzel who contributed ideas and focus.

References:
[1] Al-Anzi B, Tracey WD Jr, Benzer S. Response of Drosophila to wasabi is mediated by painless, the fly homolog of mammalian TRPA1/ANKTM1. Curr Biol 2006; 16:1-7
[2] Bässler U. On the definition of central pattern generator and its sensory control. Biol Cybern 1986; 54:65-69
[3] Brown W, Haglund K. The Landmark Interviews. Bringing behavioral genes to light. J NIH Res 1994; 6:66-73
[4] Certel SJ, Savella MG, Schlegel DC, Kravitz EA. Modulation of Drosophila male behavioral choice. Proc Natl Acad Sci USA 2007; 104:4706-4711
[5] Certel SJ, Leung A, Lin C-Y, Perez P, Chiang A-S, Kravitz EA. Octopamine neuromodulatory effects on a social behavior decision-making network in Drosophila males. PLoS ONE 2010; 5:e13248
[6] Elsner B, Hommel B. Effect anticipation and action control. J Exp Psychol 2001; 27:229-240
[7] Heisenberg M. Initiale Aktivität und Willkürverhalten bei Tieren. Naturwiss 1983; 70:70-78
[8] Heisenberg M. Voluntariness (Willkürfähigkeit) and the general organization of behaviour. In: RJ Greenspan, CP Kyriacou, eds. Flexibility and Constraint in Behavioral System, Wiley & Sons Ltd, 1994:147-156
[9] Heisenberg M, Wolf R. Vision in Drosophila. Vol. XII of: Studies of Brain Function, V. Braitenberg, ed. Berlin, Heidelberg, New York: Springer, 1984
[10] Heisenberg M, Wolf R. Reafferent control of optomotor yaw torque in Drosophila melanogaster. J Comp Physiol A 1988; 163:373-388
[11] Heisenberg M, Wolf R, Brembs B. Flexibility in a Single Behavioral Variable of Drosophila. Learn Mem 2001; 8:1-10
[12] Kowalski S, Aubin T, Martin J-R. Courtship song in Drosophila melanogaster: a differential effect on male-female locomotor activity. Can J Zool 2004; 82:1258-1266
[13] Krashes MJ, DasGupta S, Vreede A, White B, Armstrong JD, Waddell S. A neural circuit mechanism integrating motivational state with memory expression in Drosophila. Cell 2009; 139:416-427
[14] Lebestky T, Jung-Sook T, Chang C, Dankert H, Zelnik L, Kim Y-C, Han K-A, Wolf FW, Perona P, Anderson DJ. Two different forms of arousal in Drosophila are oppositely regulated by the dopamine D1 receptor ortholog DopR via distinct neural circuits. Neuron 2009; 64:522-536
[15] Martin J-R, Ernst R, Heisenberg M. Temporal pattern of locomotor activity in Drosophila melanogaster. J Comp Physiol A 1999; 184:73-84
[16] Maye A, Hsieh CH, Sugihara G, Brembs B. Order in spontaneous behaviour. PLoS ONE 2007; 2:e443
[17] Pick S, Strauss R. Goal-driven behavioral adaptations in gap-climbing Drosophila. Curr Biol 2005; 15:1473-1478
[18] Reif M, Linsenmair KE, Heisenberg M. Evolutionary significance of courtship conditioning in Drosophila melanogaster. Animal Behaviour 2002; 63:143-155
[19] Rescorla RA, Wagner AR. A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and non-reinforcement. In: AH Black and WF Prokasy, eds. Classical conditioning II: Current research and theory. New York: Appleton 1972
[20] Schleyer M, Saumweber T, Nahrendorf W, Fischer B, von Alpen D, Pauls D, Thum A, Gerber B. A behavior-based circuit model of how outcome expectations organize learned behavior in larval Drosophila. Learn Mem 2011; 18:639-653
[21] Strauss R, Heisenberg M. Coordination of legs during straight walking and turning in Drosophila melanogaster. J Comp Physiol A 1990; 167:403-412
[22] Tang S, Guo A. Choice behavior of Drosophila facing contradictory visual cues. Science 2001; 294:1543-1547
[23] Tang S, Wolf R, Xu S, Heisenberg M. Visual pattern recognition in Drosophila is invariant for retinal position. Science 2004; 305:1020-1022
[24] Wittlinger M, Wehner R, Wolf H. The ant odometer: stepping on stilts and stumps. Science 2006; 312:1965-1967
[25] Wolf R, Heisenberg M. Basic organization of operant behavior as revealed in Drosophila flight orientation. J Comp Physiol A 1991; 169:699-705
[26] Wolf R, Heisenberg M. Visual space from visual motion: Turn integration in tethered flying Drosophila. Learn Mem 1997; 4:318-327
[27] Zhang K, Guo JZ, Peng Y, Xi W, Guo A. Dopamine-mushroom body circuit regulates saliency-based decision-making in Drosophila. Science 2007; 316:1901-1904
