62. Neurorobotics: From Vision to Action

Michael A. Arbib, Giorgio Metta, Patrick van der Smagt

The lay view of a robot is a mechanical human, and thus robotics has always been inspired by attempts to emulate biology. In this chapter, we extend this biological motivation from humans to animals more generally, but with a focus on the central nervous systems rather than the bodies of these creatures. In particular, we investigate the sensorimotor loop in the execution of sophisticated behavior. Some of these sections concentrate on cases where vision provides key sensory data. Neuroethology is the study of the brain mechanisms underlying animal behavior, and Sect. 62.2 exemplifies the lessons it has to offer robotics by looking at optic flow in bees, visually guided behavior in frogs, and navigation in rats, turning then to the coordination of behaviors and the role of attention. Brains are composed of diverse subsystems, many of which are relevant to robotics, but we have chosen just two regions of the mammalian brain for detailed analysis. Section 62.3 presents the cerebellum. While we can plan and execute actions without a cerebellum, the actions are no longer graceful and become uncoordinated. We reveal how a cerebellum can provide a key ingredient in an adaptive control system, tuning parameters both within and between motor schemas. Section 62.4 turns to the mirror system, which provides shared representations bridging between the execution of an action and the observation of that action when performed by others. We develop a neurobiological model of how learning may forge mirror neurons for hand movements, provide a Bayesian view of a robot mirror system, and discuss what must be added to a mirror system to support robot imitation. We conclude by emphasizing that, while neuroscience can inspire novel robotic designs, it is also the case that robots can be used as embodied test beds for the analysis of brain models.

Contents
62.1 Definitions
62.2 Neuroethological Inspiration
  62.2.1 Optic Flow in Bees and Robots
  62.2.2 Visually Guided Behavior in Frogs and Robots
  62.2.3 Navigation in Rat and Robot
  62.2.4 Schemas and Coordinated Control Programs
  62.2.5 Salience and Visual Attention
62.3 The Role of the Cerebellum
  62.3.1 The Human Control Loop
  62.3.2 Models of Cerebellar Control
  62.3.3 Cerebellar Models and Robotics
62.4 The Role of Mirror Systems
  62.4.1 Mirror Neurons and the Recognition of Hand Movements
  62.4.2 A Bayesian View of the Mirror System
  62.4.3 Mirror Neurons and Imitation
62.5 Extroduction
62.6 Further Reading
References

62.1 Definitions

Neurorobotics may be defined as the design of computational structures for robots inspired by the study of the nervous systems of humans and other animals. We note the success of artificial neural networks – networks of simple computing elements whose connections change with experience – as providing a medium for
parallel adaptive computation that has seen application in robot vision systems and controllers, but here we emphasize neural networks derived from the study of specific neurobiological systems. Neurorobotics has a twofold aim: creating better machines which employ the principles of natural neural computation, and using the study of bio-inspired robots to improve understanding of the functioning of the brain. Chapter 60, Biologically Inspired Robots, complements our study of brain design with work on body design, the design of robotic control and actuator systems based on careful study of the relevant biology.

Walter [62.1] described two biologically inspired robots, the electromechanical tortoises Machina speculatrix and M. docilis (though each body has wheels, not legs). M. speculatrix has a steerable photoelectric cell, which makes it sensitive to light, and an electrical contact, which allows it to respond when it bumps into obstacles. The photoreceptor rotates until a light of moderate intensity is registered, at which time the organism orients itself towards the light and approaches it. However, very bright lights, material obstacles, and steep gradients are repellent to the tortoise. The latter stimuli convert the photoamplifier into an oscillator, which causes alternating movements of butting and withdrawal, so that the robot pushes small objects out of its way, goes around heavy ones, and avoids slopes. The tortoise has a hutch, which contains a bright light. When the machine's batteries are charged, this bright light is repellent. When the batteries are low, the light becomes attractive to the machine, and the light continues to exert an attraction until the tortoise enters the hutch, where the machine's circuitry is temporarily turned off until the batteries are recharged, at which time the bright hutch light again exerts a negative tropism. The second robot, M. docilis, was produced by grafting onto M. speculatrix a circuit designed to form conditioned reflexes. In one experiment, Walter connected this circuit to the obstacle-avoiding device in M. speculatrix. Training consisted of blowing a whistle just before bumping the shell.
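Walter's description is essentially a small reactive controller, and it can be caricatured in a few lines of sense-act logic in the spirit of a Braitenberg vehicle. The sketch below is our own minimal illustration: the sensor interface (light_left, light_right, bumped, battery_low), the wheel-speed convention, and all thresholds are hypothetical choices, not Walter's actual analog circuit.

```python
# A minimal sketch of M. speculatrix-style reactive control (hypothetical
# interface and thresholds; Walter's machine used analog circuitry, not code).

MODERATE, BRIGHT = 0.2, 0.8   # illustrative light-intensity thresholds

def tortoise_step(light_left, light_right, bumped, battery_low):
    """Map sensor readings to (left_wheel, right_wheel) drive signals."""
    if bumped:
        # Obstacle contact: alternate butting and withdrawal (here, back off while turning).
        return (-0.5, 0.5)
    intensity = max(light_left, light_right)
    if intensity < MODERATE:
        return (0.4, -0.4)    # scan: rotate until a moderate light is registered
    if battery_low or intensity < BRIGHT:
        # Attractive light: crossed sensor-to-wheel coupling steers toward it.
        return (0.5 + light_right, 0.5 + light_left)
    # A very bright light with charged batteries is repellent: uncrossed
    # coupling steers away (a negative tropism).
    return (0.5 + light_left, 0.5 + light_right)
```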
Although Walter's controllers are simple and not based on neural analysis, they do illustrate an attempt to gain inspiration by seeking the simplest mechanisms that will yield an interesting class of biologically inspired robot behaviors, and then showing how different additional mechanisms yield a variety of enriched behaviors. Braitenberg's book [62.2] is very much in this spirit and has entered the canon of neurorobotics. While their work provides a historical background for the studies surveyed here, we instead emphasize studies inspired by the computational neuroscience of the mechanisms serving vision and action in the human and animal brain. We seek lessons from linking behavior to the analysis of the internal workings of the brain (1) at the relatively high level of characterizing the functional roles of specific brain regions (or the functional units of analysis called schemas; Sect. 62.2.4) and the behaviors which emerge from the interactions between them, and (2) at the more detailed level of models of neural circuitry linked to the data of neuroanatomy and neurophysiology. There are lessons for neurorobotics to be learned from even finer-scale analysis of the biophysics of individual neurons and the neurochemistry of synaptic plasticity, but these are beyond the scope of this chapter (see Segev and London [62.3] and Fregnac [62.4], respectively, for entry points into the relevant computational neuroscience).

The plan of this chapter is as follows. After some selected examples from computational neuroethology, the computational analysis of neural mechanisms underlying animal behavior, we show how perceptual and motor schemas and visual attention provide the framework for our action-oriented view of perception, and show the relevance of the computational neuroscience to robotic implementations (Sect. 62.2). We then pay particular attention to two systems of the mammalian brain: the cerebellum and its role in tuning and coordinating actions (Sect. 62.3), and the mirror system and its roles in action recognition and imitation (Sect. 62.4). The extroduction will then invite readers to explore the many other areas in which neurorobotics offers lessons from neuroscience to the development of novel robot designs. What follows, then, can be seen as a contribution to the continuing dialogue between robot behavior and animal and human behavior in which particular emphasis is placed on the search for the neural underpinnings of vision, visually guided action, and cerebellar control.

62.2 Neuroethological Inspiration

Biological evolution has yielded a staggering variety of creatures, each with brains and bodies adapted to specific niches. One may thus turn to the neuroethology of specific creatures to gain inspiration for special-purpose
robots. In Sect. 62.2.1, we will see how researchers have studied bees and flies for inspiration in the design of flying robots, but have also learned lessons for the visual control of terrestrial robots. In Sect. 62.2.2, we introduce Rana computatrix, an evolving model of visuomotor coordination in frogs and toads. The name, the frog that computes, was inspired by Walter's M. speculatrix, and inspired in turn the names of a number of other species of neuroethologically inspired robots, including Beer's [62.5] computational cockroach Periplaneta computatrix and Cliff's [62.6] hoverfly Syritta computatrix.

Moreover, we learn not only from the brains of specific creatures but also from comparative analysis of the brains of diverse creatures, looking for homologous mechanisms as computational variants which may then be related to the different ecological niches of the creatures that utilize them. A basic theme of brain evolution is that new functions often emerge through modulation and coordination of existing structures. In other words, to the extent that new circuitry may be identified with the new function, it need not be a module that computes the function autonomously, but rather one that can deploy prior resources to achieve the novel functionality. Section 62.2.3 will introduce the role of the rat brain in navigation, while Sect. 62.2.4 will look at the general framework of perceptual schemas, motor schemas, and coordinated control programs for a high-level view of the neuroscience and neurorobotics of vision and action. Finally, Sect. 62.2.5 will look at the control of visual attention in mammals as a homolog of orienting behavior in frogs and toads. All this sets the stage for our emphasis on the roles of the cerebellum (Sect. 62.3) and mirror systems (Sect. 62.4) in the brains of mammals and their implications for neurorobotics. We stress that the choice of these two systems is conditioned by our own expertise, and that studies of many other brain systems also hold great importance for neurorobotics.

62.2.1 Optic Flow in Bees and Robots

Before we turn to vertebrate brains for much of our inspiration for neurorobotics, we briefly sample the rich literature on insect-inspired research. Among the founding studies in computational neuroethology were a series of reports from the laboratory of Werner Reichardt in Tübingen which linked the delicate anatomy of the fly's brain to the extraction of visual data needed for flight control. More than 40 years ago, Reichardt [62.7] published a model of motion detection inspired by this work that has long been central to discussions of visual motion in both the neuroscience and robotics literatures. Borst and Dickinson [62.8] provide a recent study of continuing biological research on visual course control in flies. Such work has inspired a large number of robot studies, including those of van der Smagt and Groen [62.9], van der Smagt [62.10], Liu and Usseglio-Viretta [62.11], Ruffier et al. [62.12], and Reiser and Dickinson [62.13].
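The correlation scheme at the heart of Reichardt's model is simple enough to state in a few lines: each elementary detector multiplies the delayed signal of one photoreceptor with the undelayed signal of its neighbor, and subtracts the mirror-image product to obtain a direction-selective output. The Python sketch below is a minimal discrete-time rendering of that idea; the array shapes and the single-frame delay are our illustrative choices, not the original model's parameters.

```python
import numpy as np

def reichardt_motion(frames, delay=1):
    """Correlation-based (Reichardt) motion signal along a 1-D receptor array.

    frames: array of shape (T, N), receptor brightness over T time steps.
    Returns opponent signals of shape (T - delay, N - 1): positive values
    indicate rightward pattern motion, negative values leftward.
    """
    now, past = frames[delay:], frames[:-delay]
    # Multiply each receptor's delayed signal with its neighbor's current
    # signal; subtracting the mirror-image product gives direction selectivity.
    rightward = past[:, :-1] * now[:, 1:]
    leftward = past[:, 1:] * now[:, :-1]
    return rightward - leftward

# A pattern drifting rightward yields a net positive (rightward) response:
t, x = np.arange(40)[:, None], np.arange(16)[None, :]
print(reichardt_motion(np.sin(0.5 * (x - t))).mean() > 0)   # True
```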
Here, however, we look in a little more detail at honeybees. Srinivasan, Zhang, and Chahl [62.14] continued the tradition of studying image motion cues in insects by investigating how optic flow (the flow of pattern across the eye induced by motion relative to the environment) is exploited by honeybees to guide locomotion and navigation. They analyzed how bees perform a smooth landing on a flat surface: image velocity is held constant as the surface is approached, thus automatically ensuring that flight speed is close to zero at touchdown. This obviates any need for explicit knowledge of flight speed or height above the ground. This landing strategy was then implemented in a robotic gantry to test its applicability to autonomous airborne vehicles. Barron and Srinivasan [62.15] investigated the extent to which ground speed is affected by headwinds.

[Fig. 62.1a-c (a) Observation of the trajectories of honeybees flying in visually textured tunnels has provided insights into how bees use optic flow cues to regulate flight speed and estimate distance flown, and balance optic flow in the two eyes to fly safely through narrow gaps. This information has been used to build autonomously navigating robots. (b) Schematic illustration of a honeybee brain, carrying about a million neurons within about one cubic millimeter. (c) A mobile robot guided by an optic flow algorithm based on the studies exemplified in (a). (Images courtesy of M. Srinivasan: (a) Science 287, 851-853 (2000); (b) Virtual Atlas of the Honeybee Brain, http://www.neurobiologie.fu-berlin.de/beebrain/Bee/VRML/SnapshotCosmoall.jpg; (c) Research School of Biological Sciences, Australian National University)]
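The landing strategy described above reduces to a one-line control law: regulate forward speed so that the measured image angular velocity stays constant. Since the ventral flow is roughly omega = v/h, holding omega fixed forces the forward speed v to shrink in proportion to the height h. Below is a minimal closed-loop sketch under assumed point-mass kinematics and a fixed glide angle; the gains, initial conditions, and flow signal are hypothetical, not data from the bee experiments.

```python
import math

def simulate_landing(omega_ref=1.0, glide_deg=30.0, k=2.0, dt=0.02, t_max=60.0):
    """Hold ventral optic flow v/h at omega_ref while descending at a fixed
    glide angle; touchdown speed then approaches zero automatically, without
    explicit knowledge of speed v or height h individually."""
    v, h, t = 3.0, 10.0, 0.0             # initial speed (m/s), height (m), time (s)
    slope = math.tan(math.radians(glide_deg))
    while h > 0.01 and t < t_max:
        omega = v / h                    # the flow signal the bee can measure
        v = max(v + k * (omega_ref - omega) * dt, 0.0)  # servo speed on flow error
        h -= slope * v * dt              # fixed glide angle couples descent to speed
        t += dt
    return v, h, t

print(simulate_landing())   # forward speed at touchdown is close to zero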
Honeybees were trained to enter a tunnel to forage at a sucrose feeder placed at its far end (Fig. 62.1a). The bees used visual cues to maintain their ground speed, adjusting their airspeed so as to hold the rate of optic flow constant, even against headwinds which were, at their strongest, 50% of a bee's maximum recorded forward velocity. Vladusich et al. [62.16] studied the effect of adding goal-defining landmarks. Bees were trained to forage in an optic-flow-rich tunnel with a landmark positioned directly above the feeder. They searched much more accurately when both odometric and landmark cues were available than when only odometry was available. When the two cue sources were set in conflict, by shifting the position of the landmark in the tunnel during tests, bees overwhelmingly used landmark cues rather than odometry. This, together with other such experiments, suggests that bees can make use of odometric and landmark cues in a more flexible and dynamic way than previously envisaged.

In earlier studies of bees flying down a tunnel, Srinivasan and Zhang [62.17] placed different patterns on the left and right walls. They found that bees balance the image velocities in the left and right visual fields. This strategy ensures that bees fly down the middle of the tunnel, without bumping into the side walls, enabling them to negotiate narrow passages or to fly between obstacles. This strategy has been applied to a corridor-following robot (Fig. 62.1c). By holding constant the average image velocity as seen by the two eyes during flight, the bee avoids potential collisions, slowing down when it flies through a narrow passage.
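A rough sketch of this bilateral flow-balancing strategy follows; the sign conventions, gains, and flow inputs are our own hypothetical choices. For pure forward translation, the translational flow magnitude on each side is roughly forward speed divided by the distance to that wall, so balancing the two flows centers the agent and regulating their mean modulates its speed.

```python
def corridor_command(flow_left, flow_right, k_turn=1.0, k_speed=0.5, flow_ref=1.5):
    """Steering and speed commands from the two lateral image velocities.

    Balancing left and right flow centers the agent between the walls;
    regulating the mean flow toward flow_ref slows it in narrow passages
    (walls closer -> larger flow -> lower commanded speed)."""
    turn = k_turn * (flow_right - flow_left)   # turn toward the slower-flowing side
    accel = k_speed * (flow_ref - 0.5 * (flow_left + flow_right))
    return turn, accel
```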
The movement-sensitive mechanisms underlying these various behaviors differ qualitatively, as well as quantitatively, from those that mediate the optomotor response (e.g., turning to track a pattern of moving stripes) that had been the initial target of investigation of the Reichardt laboratory. The lesson for robot control is that flight appears to be coordinated by a number of visuomotor systems acting in concert, and the same lesson can apply to a whole range of tasks which must convert vision to action. Of course, vision is but one of the sensory systems that play a vital role in insect behavior. Webb [62.18] uses her own work on robot design inspired by the auditory control of behavior in crickets to anchor a far-ranging assessment of the extent to which robotics can offer good models of animal behaviors.

62.2.2 Visually Guided Behavior in Frogs and Robots

Lettvin et al. [62.19] treated the frog's visual system from an ethological perspective, analyzing circuitry in relation to the animal's ecological niche to show that different cells in the retina and in the visual midbrain region known as the tectum were specialized for detecting predators and prey. However, in much visually guided behavior, the animal does not respond to a single stimulus, but rather to some property of the overall configuration. We thus turn to the question what does the frog's eye tell the frog?, stressing the embodied nervous system or, perhaps equivalently, an action-oriented view of perception. Consider, for example, the snapping behavior of frogs confronted with one or more fly-like stimuli. Ingle [62.20] found that it is only in a restricted region around the head of a frog that the presence of a fly-like stimulus elicits a snap; that is, the frog turns so that its midline is pointed at the stimulus and then lunges forward and captures the prey with its tongue. There is a larger zone in which the frog merely orients towards the target, and beyond that zone the stimulus elicits no response at all. When confronted with two flies within the snapping zone, either of which is vigorous enough that alone it could elicit a snapping response, the frog exhibits one of three reactions: it snaps at one of the flies, it does not snap at all, or it snaps in between at the average fly. Didday [62.21] offered a simple model of this choice behavior which may be considered the prototype for a winner-take-all (WTA) model, which receives a variety of inputs and (under ideal circumstances) suppresses the representation of all but one of them; the one that remains is the winner, which will play the decisive role in further processing. This was the beginning of Rana computatrix (see Arbib [62.22, 23] for overviews).

Studies of frog brains and behavior inspired the successful use of potential fields in robot navigation strategies. Data on the strategies used by frogs to capture prey while avoiding static obstacles (Collett [62.24]) grounded the model by Arbib and House [62.25], which linked systems for depth perception to the creation of spatial maps of both prey and barriers. In one version of their model, they represented the map of prey by a potential field with long-range attraction and the map of barriers by a potential field with short-range repulsion, and showed that summation of these fields yielded a field that could guide the frog's detour around the barrier to catch its prey. Corbacho and Arbib [62.26] later explored a possible role for learning in this behavior. Their model incorporated learning in the weights between the various potential fields to enable adaptation over trials, as observed in the real animals. The success of the models indicated that frogs use reactive strategies to avoid obstacles while moving to a goal, rather than employing a planning or cognitive system.
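The scheme of summing a long-range attractive field centered on the goal with short-range repulsive fields around barriers, and then descending the combined gradient, can be sketched in a few lines. The field shapes below follow the common quadratic/hyperbolic textbook form of Khatib's formulation and are illustrative; they are not the frog model's exact fields, and the gains are arbitrary.

```python
import numpy as np

def field_step(pos, goal, obstacles, k_att=1.0, k_rep=0.5, rho0=1.0):
    """Net steering vector from summed potential fields at position pos.

    Attraction grows with distance to the goal (long-range); each obstacle
    contributes repulsion only within radius rho0 (short-range). Descending
    the summed field can yield a detour around a barrier toward the goal."""
    pos = np.asarray(pos, dtype=float)
    force = k_att * (np.asarray(goal, dtype=float) - pos)
    for obs in obstacles:
        d = pos - np.asarray(obs, dtype=float)
        rho = np.linalg.norm(d)
        if 0.0 < rho < rho0:
            force += k_rep * (1.0 / rho - 1.0 / rho0) * d / rho**3
    return force

# Integrate small steps; the agent skirts the obstacle en route to the goal:
pos = np.array([0.0, 0.1])
for _ in range(2000):
    pos += 0.005 * field_step(pos, goal=(5.0, 0.0), obstacles=[(2.5, 0.0)])
print(pos)   # close to (5, 0), having detoured around the obstacle
```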
Other work (e.g., Cobas and Arbib [62.27]) studied how the frog's ability to catch prey and avoid obstacles was integrated with its ability to escape from predators. These models stressed the interaction of the tectum with a variety of other brain regions, such as the pretectum (for detecting predators) and the tegmentum (for implementing motor commands for approach or avoidance).

Arkin [62.28] showed how to combine a computer vision system with a frog-inspired potential field controller to create a control system for a mobile robot that could successfully navigate in a fairly structured environment using camera input. The resultant system thus enriched other roughly contemporaneous applications of potential fields in path planning with obstacle avoidance for both manipulators and mobile robots (Khatib [62.29]; Krogh and Thorpe [62.30]). The work on Rana computatrix proceeded at two levels: biologically realistic neural networks, and functional units called schemas, which compete and cooperate to determine behavior. Section 62.2.4 will show how more general behaviors can emerge from the competition and cooperation of perceptual and motor schemas, as well as more abstract coordinating schemas. Such ideas were, of course, developed independently by a number of authors, and so entered the robotics literature by various routes, of which the best known may be the subsumption architecture of Brooks [62.31] and the ideas of Braitenberg cited above, whereas Arkin's work on behavior-based robotics [62.32] is indeed rooted in schema theory. Arkin et al. [62.33] present a recent example of the continuing interaction between robotics and ethology, offering a novel method for creating high-fidelity models of animal behavior for use in robotic systems based on a behavioral systems approach (i.e., based on a schema-level model of animal behavior, rather than analysis of biological circuits in animal brains), and describe how an ethological model of a domestic dog can be implemented with AIBO, the Sony entertainment robot.

62.2.3 Navigation in Rat and Robot

The tectum, the midbrain visual system which determines how the frog turns its whole body towards its prey or orients it for escape from predators (Sect. 62.2.2), is homologous with the superior colliculus of the mammalian midbrain. The rat superior colliculus has been shown to be frog-like, mediating approach and avoidance (Dean et al. [62.34]), whereas the best-studied role of the superior colliculus of cat, monkey, and human is in the control of saccades, rapid eye movements to acquire a visual target. Moreover, the superior colliculus can integrate auditory and somatosensory information into its visual frame (Stein and Meredith [62.35]), and this inspired Strosslin et al. [62.36] to use a biologically inspired approach based on the properties of neurons in the superior colliculus to learn the relation between visual and tactile information in the control of a mobile robot platform. More generally, then, the comparative study of mammalian brains has yielded a rich variety of computational models of importance in neurorobotics. In this section, we further introduce the study of mammalian neurorobotics by looking at studies of mechanisms of the rat brain for spatial navigation.
The frog's detour behavior is an example of what O'Keefe and Nadel [62.37] called the taxon (behavioral orientation) system [as in Braitenberg [62.38], a taxis (plural: taxes) is an organism's response to a stimulus by movement in a particular direction]. They distinguished this from a system for map-based navigation, and proposed that the latter resides in the hippocampus, though Guazzelli et al. [62.39] qualified this assertion, showing how the hippocampus may function as part of a cognitive map. The taxon versus map distinction is akin to the distinction between reactive and deliberative control in robotics (Arkin et al. [62.33]). It will be useful to relate taxis to the notion of an affordance (Gibson [62.40]), a feature of an object or environment relevant to action. For example, in picking up an apple or a ball, the identity of the object may be irrelevant, but the size of the object is crucial. Similarly, if we wish to push a toy car, recognizing the make of car copied in the toy is irrelevant, whereas it is crucial to recognize the placement of the wheels to extract the direction in which the car can be readily pushed. Just as a rat may have basic taxes for approaching food or avoiding a bright light, say, so does it have a wider repertoire of affordances for possible actions associated with the immediate sensing of its environment. Such affordances include go straight ahead for the visual sighting of a corridor, hide for a dark hole, eat for food as sensed generically, drink similarly, and the various turns afforded by, e.g., the sight of the end of the corridor. The rat also makes rich use of olfactory cues. In the same way, a robot's behavior will rely on a host of reactions to local conditions in fulfilling a plan; e.g., knowing that it must go to the end of a corridor, it will nonetheless use local visual cues to avoid hitting obstacles, or to determine through which angle to turn when reaching a bend in the corridor.
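The affordance idea lends itself to a very simple computational reading: each sensed feature directly indexes an action it affords, with the current drive state biasing which afforded action wins. The toy sketch below is our own illustration, with the feature and action vocabularies invented loosely after the rat examples above; it is not a model from the literature cited.

```python
# Hypothetical feature -> afforded-action table, loosely following the text.
AFFORDANCES = {
    "corridor_ahead": "go_straight",
    "dark_hole":      "hide",
    "food":           "eat",
    "water":          "drink",
    "corridor_end":   "turn",
}

def select_action(sensed, drives):
    """Pick the afforded action with the highest drive-weighted salience.

    sensed: {feature: salience}; drives: {action: motivational weight}."""
    scores = {}
    for feature, salience in sensed.items():
        action = AFFORDANCES.get(feature)
        if action:
            scores[action] = max(scores.get(action, 0.0),
                                 salience * drives.get(action, 1.0))
    return max(scores, key=scores.get) if scores else "explore"

# A hungry rat favors eating over continuing down the corridor:
print(select_action({"food": 0.4, "corridor_ahead": 0.6}, drives={"eat": 2.0}))
```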
Both normal and hippocampal-lesioned rats can learn to solve a simple T-maze (e.g., learning whether to turn left or right to find food) in the absence of any consistent environmental cues other than the T-shape of the maze. If anything, the lesioned animals learn this problem faster than normals. After the criterion was reached, probe trials with an eight-arm radial maze were interspersed with the usual T-trials. Animals from both groups consistently chose the side to which they were trained on the T-maze. However, many did not choose the 90° arm but preferred either the 45° or 135° arm, suggesting that the rats eventually solved the T-maze by learning to rotate within an egocentric orientation system at the choice point through approximately 90°. This leads to the hypothesis of an orientation vector being stored in the animal's brain, but does not tell us where or how the orientation vector is stored. One possible model would employ coarse coding in a linear array of cells, coding for turns from -180° to +180°. From the behavior, one might expect that only the cells close to the preferred behavioral direction are excited, and that learning marches this peak from the old to the new preferred direction. To unlearn -90°, say, the array must reduce the peak there, while at the same time building a new peak at the new direction of +90°. If the old peak has mass p(t) and the new peak has mass q(t), then as p(t) declines toward 0 while q(t) increases steadily from 0, the center of mass will progress from -90° to +90°, fitting the behavioral data.

[Fig. 62.2 The TAM-WG model has at its basis a system, TAM (the taxon affordance model), for exploiting affordances. This is elaborated by a system, WG (the world graph), which can use a cognitive map to plan paths to targets which are not currently visible. Note that the model processes two different kinds of sensory inputs. At the bottom right are those associated with, e.g., hypothalamic systems for feeding and drinking, which may provide both incentives and rewards for the animal's behavior, contributing both to behavioral choices and to the reinforcement of certain patterns of behavior. The nucleus accumbens and caudoputamen mediate an actor-critic style of reinforcement learning based on the hypothalamic drive of the dopamine system. The sensory inputs at the top left are those that allow the animal to sense its relation to the external world, determining both where it is (the hippocampal place system) and the affordances for action (the parietal recognition of affordances can shape the premotor selection of an action). The TAM model focuses on the parietal-premotor reaction to immediate affordances; the WG model places action selection within the wider context of a cognitive map. (After Guazzelli et al. [62.41])]

The determination of movement direction was modeled by rat-ification of the Arbib and House [62.25] model of frog detour behavior. There, prey was represented by excitation coarsely coded across a population, while barriers were encoded by inhibition whose extent closely matched the retinotopic extent of each barrier. The sum of excitation was passed through a winner-takes-all circuit to yield the choice of movement direction.
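A Didday-style winner-take-all circuit is easy to sketch: each unit is driven by its own input and inhibited by the pooled activity of its rivals, so that with sufficiently strong inhibition only the most strongly driven unit survives. The rate-model rendering below is a minimal illustration; the time constant and inhibition gain are arbitrary, with the one requirement that inhibition be strong enough for the winner alone to suppress the runners-up.

```python
import numpy as np

def winner_take_all(inputs, w_inh=2.0, dt=0.1, steps=300):
    """Rectified rate units with recurrent lateral inhibition.

    Each unit i obeys x_i' = -x_i + input_i - w_inh * (summed rival activity);
    under ideal circumstances all but the strongest input are suppressed."""
    drive = np.asarray(inputs, dtype=float)
    x = drive.copy()
    for _ in range(steps):
        rivals = x.sum() - x                   # pooled activity of the other units
        x += dt * (-x + drive - w_inh * rivals)
        x = np.maximum(x, 0.0)                 # firing rates cannot be negative
    return x

print(winner_take_all([0.9, 1.0, 0.4]))        # ~[0, 1, 0]: the middle unit wins
```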
As a result of this winner-takes-all choice, the direction of the gap closest to the prey, rather than the direction of the prey itself, was often chosen for the frog's initial movement.
Neurorobotics:From Vision to Action 62.2 Neuroethological Inspiration 1459 we replace the direction of the prey (frog)by the di- be shared by brain theorists.cognitive scientists.connec- rection of the orientation vector(rat),while the barriers tionists,ethologists,kinesiologists-and roboticists.In correspond to the presence of walls rather than alley particular,schema theory can provide a distributed pro- ways. gramming environment for robotics [see,e.g.,the robots To approach the issue of how a cognitive map can schemas (RS)language of Lyons and Arbib [62.47]. extend the capability of the affordance system,Guazzelli and supporting architectures for distributed control as et al.[62.43]extended the Lieblich and Arbib [62.44]in Metta et al.[62.48]].Schema theory becomes specif- approach to building a cognitive map as a world graph,ically relevant to neuroroborics when the schemas are a set of nodes connected by a set of edges,where the inspired by a model constrained by data provided by, nodes represent recognized places or situations,and the e.g.,human brain mapping,studies of the effects of brain links represent ways of moving from one situation to lesions,or neurophysiology. another.A crucial notion is that a place encountered in A perceptual schema not only determines whether different circumstances may be represented by multiple an object or other domain of interaction is present in the nodes,but that these nodes may be merged when the environment but can also provide important parameters similarity between these circumstances is recognized.to motor schemas (see below)for the guidance of ac- They model the process whereby the animal decides tion.The activity level of a perceptual schema signals where to move next,on the basis of its current drive the credibility of the hypothesis that what the schema state(hunger,thirst,fear,etc.).The emphasis is on spatial represents is indeed present,whereas other schema maps for guiding locomotion into regions not necessarily parameters represent relevant properties such as size, current visible,rather than retinotopic representations of location,and motion of the perceived object.Given a per- immediately visible space,and yields exploration and ceptual schema we may need several schema instances, latent learning without the introduction of an explicit each suitably tuned,to subserve perception of several exploratory drive.The model shows:(1)how a route,instances of its domain,e.g.,several chairs in a room. possibly of many steps,may be chosen that leads to Motor schemas provide the control systems which the desired goal;(2)how short cuts may be chosen; can be coordinated to affect a wide variety of actions. and (3)through its account of node merging why,in open fields,place cell firing does not seem to depend on Recognition Visual direction. criteria input The overall structure and general mode of operation Visual Visual of the complete model is shown in Fig.62.2,which gives input input Visual a vivid sense of the lessons to be learned by studying Activation location 4 not only specific systems of the mammalian brain but of visual Target Size Orientation also their patterns of large-scale interaction.This model search locaton recognition recognition is but one of many inspired by the data on the role of the Size Orientation Visual. 
The overall structure and general mode of operation of the complete model is shown in Fig. 62.2, which gives a vivid sense of the lessons to be learned by studying not only specific systems of the mammalian brain but also their patterns of large-scale interaction. This model is but one of many inspired by the data on the role of the hippocampus and other regions in rat navigation. Here, we just mention as pointers to the wider literature the papers by Girard et al. [62.45] and Meyer et al. [62.46], which are part of the Psikharpax project, which is doing for rats what Rana computatrix did for frogs and toads.

62.2.4 Schemas and Coordinated Control Programs

Schema theory complements neuroscience's well-established terminology for levels of structural analysis (brain region, neuron, synapse) with a functional vocabulary, a framework for analysis of behavior with no necessary commitment to hypotheses on the localization of each schema (unit of functional analysis), but which can be linked to a structural analysis whenever appropriate. Schemas provide a high-level vocabulary which can be shared by brain theorists, cognitive scientists, connectionists, ethologists, kinesiologists – and roboticists. In particular, schema theory can provide a distributed programming environment for robotics [see, e.g., the robot schemas (RS) language of Lyons and Arbib [62.47], and supporting architectures for distributed control as in Metta et al. [62.48]]. Schema theory becomes specifically relevant to neurorobotics when the schemas are inspired by a model constrained by data provided by, e.g., human brain mapping, studies of the effects of brain lesions, or neurophysiology.

A perceptual schema not only determines whether an object or other domain of interaction is present in the environment but can also provide important parameters to motor schemas (see below) for the guidance of action.
The activity level of a perceptual schema signals the credibility of the hypothesis that what the schema represents is indeed present, whereas other schema parameters represent relevant properties such as size, location, and motion of the perceived object. Given a perceptual schema we may need several schema instances, each suitably tuned, to subserve perception of several instances of its domain, e.g., several chairs in a room. Motor schemas provide the control systems which can be coordinated to effect a wide variety of actions.

Fig. 62.3 Hypothetical coordinated control program for reaching and grasping. The perceptual schemas (top) provide parameters for the motor schemas (bottom) for the control of reaching (arm transport and reaching) and grasping (controlling the hand to conform to the object). Dashed lines indicate activation signals which establish timing relations between schemas; solid lines indicate transfer of data. (After Arbib [62.42])
The activity level of a motor schema instance may signal its degree of readiness to control some course of action. What distinguishes schema theory from usual control theory is the transition from emphasizing a few basic controllers (e.g., for locomotion or arm movement) to a large variety of motor schemas for diverse skills (peeling an apple, climbing a tree, changing a light bulb), with each motor schema depending on perceptual schemas to supply information about objects which are targets for interaction. Note the relevance of this for robotics – the robot needs to know not only what the object is but also how to interact with it. Modern neuroscience (see the works by Ungerleider and Mishkin [62.49] and Goodale and Milner [62.50]) has indeed established that the monkey and human brain each use a dorsal pathway (via the parietal lobe) for the how and a ventral pathway (via the inferotemporal cortex) for the what. Moreover, coupling between these two streams mediates their integration in normal ongoing behavior.

A coordinated control program interweaves the activation of various perceptual, motor, and coordinating schemas in accordance with the current task and sensory environment to mediate complex behaviors. Figure 62.3 shows the original coordinated control program (Arbib [62.42], inspired by the data of Jeannerod and Biguer [62.51]). As the hand moves to grasp a ball, it is preshaped so that, when it has almost reached the ball, it is of the right shape and orientation to enclose some part of the ball prior to gripping it firmly. The outputs of three perceptual schemas are available for the concurrent activation of two motor schemas, one controlling the arm to transport the hand towards the object and the other preshaping the hand. Once the hand is preshaped, it is only the completion of the fast initial phase of hand transport that wakes up the final stage of the grasping schema to shape the fingers under control of tactile feedback. [This model anticipates the much later discovery of perceptual schemas for grasping in a localized area (AIP) of parietal cortex and motor schemas for grasping in a localized area (F5) of premotor cortex; see Fig. 62.4.] The notion of schema is thus recursive – a schema may be analyzed as a coordinated control program of finer schemas, and so on until such time as a secure foundation of neural specificity is attained.

Fig. 62.4 Simplified control loop relating cerebellum and cerebral motor cortex in supervising the spinal cord's control of the skeletomuscular system
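The logic of Fig. 62.3 can be made concrete in a few lines. The sketch below is our simplification (schema names, tick counts, and parameter values are invented): perceptual schemas hand parameters to the motor schemas over data lines, while an activation signal emitted at the completion of the fast transport phase wakes the enclose stage of grasping.

from dataclasses import dataclass

@dataclass
class Percepts:                       # outputs of the three perceptual schemas
    target_location: tuple            # from target location
    size: float                       # from size recognition
    orientation: float                # from orientation recognition

class Reach:                          # arm transport: fast then slow phase
    def __init__(self, target, fast_ticks=3):
        self.target, self.phase, self.ticks = target, 'fast', fast_ticks
    def step(self):
        if self.phase == 'fast':
            self.ticks -= 1
            if self.ticks == 0:
                self.phase = 'slow'
                return 'fast_done'    # dashed-line activation signal
        return None

class Grasp:                          # preshape, then enclose on activation
    def __init__(self, size, orientation):
        self.size, self.orientation, self.stage = size, orientation, 'preshape'
    def on_signal(self, signal):
        if signal == 'fast_done':
            self.stage = 'enclose'    # final stage woken by transport

def coordinated_control_program(p):
    reach = Reach(p.target_location)              # solid lines: data transfer
    grasp = Grasp(p.size, p.orientation)
    trace = []
    for _ in range(5):
        grasp.on_signal(reach.step())             # dashed lines: activation
        trace.append((reach.phase, grasp.stage))
    return trace

print(coordinated_control_program(Percepts((0.3, 0.1), 0.06, 0.8)))
# [('fast', 'preshape'), ('fast', 'preshape'), ('slow', 'enclose'), ...]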
Subsequent work has refined the scheme of Fig. 62.3, for example, Hoff and Arbib's [62.52] model uses the time needed for completion of each of the movements – transporting the hand and preshaping the hand – to explain data on how the reach to grasp responds to perturbation of target location or size. Moreover, Hoff and Arbib [62.53] show how to embed an optimality principle for arm trajectories into a controller which can use feedback to resist noise and compensate for target perturbations, and a predictor element to compensate for delays from the periphery. The result is a feedback system which can act like a feedforward system described by the optimality principle in familiar situations, where the conditions of the desired behavior are not perturbed and accuracy requirements are such that normal errors in execution may be ignored. However, when perturbations must be corrected for or when great precision is required, feedback plays a crucial role in keeping the behavior close to that desired, taking account of delays in putting feedback into effect. This integrated view of feedback and feedforward within a single motor schema seems to us of value for neurorobotics as well as the neuroscience of motor control.
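One standard way to realize such a controller, close in spirit to Hoff and Arbib's approach, is the minimum-jerk optimality principle cast in feedback form, where jerk is driven by the current error and the time remaining. The sketch below uses that feedback law with our own scaffolding (Euler integration, an invented perturbation scheme), so an unperturbed reach behaves as if feedforward while a mid-flight target jump is absorbed smoothly.

def minimum_jerk_reach(x0, target, duration, dt=0.001, perturb=None):
    x, v, a, t = x0, 0.0, 0.0, 0.0
    while t < duration:
        if perturb and abs(t - perturb[0]) < dt / 2:
            target = perturb[1]              # mid-flight target jump
        tau = max(duration - t, 10 * dt)     # time remaining, floored
        jerk = (60 * (target - x)            # minimum-jerk feedback law
                - 36 * tau * v
                - 9 * tau ** 2 * a) / tau ** 3
        a += jerk * dt
        v += a * dt
        x += v * dt
        t += dt
    return x

print(minimum_jerk_reach(0.0, 0.3, 0.6))                      # ~0.3
print(minimum_jerk_reach(0.0, 0.3, 0.6, perturb=(0.2, 0.4)))  # ~0.4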
It is standard to distinguish a forward or direct model which represents the path from motor command to motor output, from the inverse model which models the reverse pathway, i.e., going from a desired motor outcome to a set of motor commands likely to achieve it. As we have just suggested, the action plan unfolds as if it were feedforward or open-loop when the actual parameters of
the situation match the stored parameters, while a feedback component is employed to counteract disturbances (current feedback) and to learn from mistakes (learning from feedback). This is obtained by relying on a forward model that predicts the outcome of the action as it unfolds in real time. The accuracy of the forward model can be evaluated by comparing the output generated by the system with the signals derived from sensory feedback (Miall et al. [62.54]). Also, delays must be accounted for to address the different propagation times of the neural pathways carrying the predicted and actual outcome of the action. Note that the forward model in this case is relatively simple, predicting only the motor output in advance: since motor commands are generated internally it is easy to imagine a predictor for these signals (known as an efference copy). The inverse model, on the other hand, is much more complicated since it maps sensory feedback (e.g., vision) back into motor terms. These concepts will prove important both in our study of the cerebellum (Sect. 62.3) and mirror systems (Sect. 62.4).
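A toy rendering of this arrangement may help (entirely illustrative: a scalar plant whose gain is the only thing the forward model must learn). The efference copy of each command drives the forward model; its prediction is buffered and later compared with the sensory feedback for the same command, and the mismatch trains the model.

from collections import deque

def run(steps=400, delay=5, true_gain=2.0, lr=0.5):
    gain_hat = 0.5                    # forward model: predicts y = gain_hat * u
    pending = deque()                 # efference copies awaiting their feedback
    x, target = 0.0, 0.0
    for t in range(steps):
        if t % 10 == 0:
            target = 1.0 - target     # keep moving so learning continues
        u = 0.2 * (target - x)        # crude controller standing in for an inverse model
        pending.append((u, gain_hat * u))   # store command and predicted outcome
        x += true_gain * u            # the plant responds at once, but...
        if len(pending) > delay:      # ...its outcome is sensed only `delay` steps later
            u_old, y_pred = pending.popleft()
            y_fb = true_gain * u_old  # delayed sensory feedback for that command
            gain_hat += lr * (y_fb - y_pred) * u_old   # learn from the mismatch
    return x, gain_hat

print(run())   # gain_hat has climbed from 0.5 much of the way toward 2.0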
62.2.5 Salience and Visual Attention

Discussions of how an animal (or robot) grasps an object assume that the animal or robot is attending to the relevant object. Thus, whatever the subtlety of processing in the canonical and mirror systems for grasping, its success rests on the availability of a visual system coupled to an oculomotor control system that bring foveal vision to bear on objects to set parameters needed for successful interaction. Indeed, the general point is that attention greatly reduces the processing load for animal and robot. The catch, of course, is that reducing computing load is a Pyrrhic victory unless the moving focus of attention captures those aspects of behavior relevant for the current task – or supports necessary priority interrupts. Indeed, directing attention appropriately is a topic for which there is a great richness of both neurophysiological data and robotic application (see Deco and Rolls [62.55] and Choi et al. [62.41]).

In their neuromorphic model of the bottom-up guidance of attention in primates, Itti and Koch [62.56] decompose the input video stream into eight feature channels at six spatial scales. After surround suppression, only a sparse number of locations remain active in each map, and all maps are combined into a unique saliency map. This map is scanned by the focus of attention in order of decreasing saliency through the interaction between a winner-takes-all mechanism (which selects the most salient location) and an inhibition-of-return mechanism (which transiently suppresses recently attended locations from the saliency map). Because it includes a detailed low-level vision front-end, the model has been applied not only to laboratory stimuli, but also to a wide variety of natural scenes, predicting a wealth of data from psychophysical experiments.
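The interplay of winner-takes-all selection and inhibition of return is easy to state in code. The sketch below is ours and starts from an arbitrary saliency map, standing in for the Itti–Koch front-end of feature channels and spatial scales; radius and recovery rate are illustrative.

import numpy as np

def scan_saliency(saliency, fixations=5, ior_radius=8, recovery=0.1):
    s = saliency.astype(float).copy()
    h, w = s.shape
    yy, xx = np.mgrid[0:h, 0:w]
    visited = []
    for _ in range(fixations):
        y, x = np.unravel_index(np.argmax(s), s.shape)   # winner takes all
        visited.append((int(y), int(x)))
        mask = (yy - y) ** 2 + (xx - x) ** 2 <= ior_radius ** 2
        s[mask] = 0.0                 # inhibition of return: suppress winner
        s += recovery * saliency      # suppression decays, letting
        s = np.minimum(s, saliency)   # locations re-enter the competition
    return visited

print(scan_saliency(np.random.default_rng(0).random((64, 64))))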
When specific objects are searched for, low-level visual processing can be biased both by the gist (e.g., outdoor suburban scene) and also for the features of that object. This top-down modulation of bottom-up processing results in an ability to guide search towards targets of interest (Wolfe [62.57]). Task affects eye movements (Yarbus [62.58]), as do training and general expertise. Navalpakkam and Itti [62.59] propose a computational model which emphasizes four aspects that are important in biological vision: determining the task relevance of an entity, biasing attention for the low-level visual features of desired targets, recognizing these targets using the same low-level features, and incrementally building a visual map of task relevance at every scene location. It attends to the most salient location in the scene, and attempts to recognize the attended object through hierarchical matching against object representations stored in long-term memory. It updates its working memory with the task relevance of the recognized entity and updates a topographic task-relevance map with the location and relevance of the recognized entity; for example, in one task the model forms a map of likely locations of cars from a video clip filmed while driving on a highway. Such work illustrates the continuing interaction between models based on visual neurophysiology and human psychophysics with the tackling of practical robotic applications.
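A much-reduced sketch of this scheme (ours; the recognizer is a caller-supplied stub returning the relevance of whatever it recognizes at a location) biases bottom-up saliency by a task-relevance map that is updated after each fixation.

import numpy as np

def attend(saliency, recognize, steps=3):
    s = saliency.copy()
    relevance = np.ones_like(s)            # task-relevance map, updated online
    fixations = []
    for _ in range(steps):
        priority = s * relevance           # top-down bias of bottom-up saliency
        y, x = np.unravel_index(np.argmax(priority), priority.shape)
        relevance[y, x] = recognize(y, x)  # relevance of the recognized entity
        s[y, x] = 0.0                      # inhibition of return
        fixations.append((int(y), int(x)))
    return fixations, relevance

# e.g., attend(np.random.default_rng(0).random((32, 32)), lambda y, x: 1.0)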
Orabona et al. [62.60] implemented an extension of the Itti–Koch model on a humanoid robot with moving eyes, using log-polar vision as in Sandini and Tagliasco [62.61], and changing the feature construction pyramid by considering proto-object elements (blob-like structures rather than edges). The inhibition-of-return mechanism has to take into account a moving frame of reference, the resolution of the fovea is very different from that at the periphery of the visual field, and head and body movements need to be stabilized. The control of movement might thus have a relationship with the structure and development of the attention system.
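Log-polar sampling of the kind used by Sandini and Tagliasco concentrates resolution at the fovea and coarsens it toward the periphery. A minimal resampler (our parameter choices, not those of an actual retina-like sensor) bins grayscale pixels by log-radius and angle:

import numpy as np

def log_polar(image, n_rings=32, n_wedges=64, r_min=1.0):
    # Average a grayscale image into (log-radius, angle) bins.
    h, w = image.shape
    cy, cx = h / 2.0, w / 2.0
    r_max = min(cy, cx)
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(yy - cy, xx - cx)
    theta = np.arctan2(yy - cy, xx - cx)
    valid = (r >= r_min) & (r < r_max)
    ring = (np.log(r[valid] / r_min) / np.log(r_max / r_min) * n_rings).astype(int)
    wedge = ((theta[valid] + np.pi) / (2 * np.pi) * n_wedges).astype(int) % n_wedges
    out = np.zeros((n_rings, n_wedges))
    cnt = np.zeros((n_rings, n_wedges))
    np.add.at(out, (ring, wedge), image[valid])
    np.add.at(cnt, (ring, wedge), 1)
    return out / np.maximum(cnt, 1)

print(log_polar(np.random.default_rng(0).random((128, 128))).shape)  # (32, 64)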
Rizzolatti et al. [62.62] proposed a role for the feedback projections from premotor cortex to the parietal lobe, assuming that they form a tuning signal that dynamically changes visual perception. In practice this can be seen as an implicit attention system which selects sensory information while the action is being prepared and subsequently executed (see Flanagan and Johansson [62.63], Flanagan et al. [62.64], and Mataric and Pomplun [62.65]). The early responses, before action onset, of many premotor and parietal neurons suggest a premotor mechanism of attention that deserves exploration in further work in neurorobotics.

62.3 The Role of the Cerebellum

Although cerebellar involvement in muscle control was advocated long ago by the Greek gladiator surgeon Galen of Pergamum (129–216/17 CE), it was the publication by Eccles et al. [62.66] of the first comprehensive account of the detailed neurophysiology and anatomy of the cerebellum (Ito [62.67]) that provided the inspiration for the Marr–Albus model of cerebellar plasticity (Marr [62.68]; Albus [62.69]) that is at the heart of most current modeling of the role of the cerebellum in control of motion and sensing. From a robotics point of view, the most convincing results are based on Albus' [62.70] cerebellar model articulation controller (CMAC) model and subsequent implementations by Miller [62.71]. These models, however, are only remotely based on the structure of the biological cerebellum. More detailed models are usually only applied to two-degree-of-freedom robotic structures, and have not been generalized to real-world applications (see Peters and van der Smagt [62.72]). The problem may lie with viewing the cerebellum as a stand-alone dynamics controller.
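For orientation, CMAC approximates a function by summing weights selected by several overlapping coarse quantizations of the input; training is a simple LMS correction shared among the active cells. The sketch below is a minimal version of the Albus scheme with hashing into a fixed table (a common implementation device; all parameters and the toy training task are illustrative).

import numpy as np

class CMAC:
    def __init__(self, n_tilings=8, table_size=4096, resolution=0.1, lr=0.2):
        self.n, self.size, self.res, self.lr = n_tilings, table_size, resolution, lr
        self.w = np.zeros(table_size)

    def _cells(self, x):
        # Each tiling quantizes the input with a different offset; the
        # active cells of all tilings are hashed into one weight table.
        for t in range(self.n):
            q = tuple(int((xi + t * self.res / self.n) // self.res) for xi in x)
            yield hash((t,) + q) % self.size

    def predict(self, x):
        return sum(self.w[c] for c in self._cells(x))

    def train(self, x, target):
        err = target - self.predict(x)
        for c in self._cells(x):
            self.w[c] += self.lr * err / self.n   # LMS, shared among tilings

# Learn a toy torque map: torque = f(angle, velocity)
net = CMAC()
rng = np.random.default_rng(1)
for _ in range(5000):
    q, dq = rng.uniform(-1, 1, 2)
    net.train((q, dq), np.sin(q) + 0.5 * dq)
print(net.predict((0.3, 0.2)))   # close to sin(0.3) + 0.1, i.e. about 0.40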
An important observation about the brain is that schemas are widely distributed, and different aspects of the schemas are computed in different parts of the brain. Thus, one view is that (1) the cerebral cortex has the necessary models for choosing appropriate actions and getting the general shape of the trajectory assembled to fit the present context, whereas (2) the cerebellum provides a side-path which (on the basis of extensive learning of a forward motor model) provides the appropriate corrections to compensate for control delays, muscle nonlinearities, Coriolis and centrifugal forces occasioned by joint interactions, and subtle adjustments of motor neuron firing in simultaneously active motor pattern generators to ensure their smooth coordination. Thus, for example, a patient with cerebellar lesions may be able to move his arm to successfully reach a target, and to successfully adjust his hand to the size of an object. However, he lacks the machinery to perform either action both swiftly and accurately, and further lacks the ability to coordinate the timing of the two subactions. His behavior will thus exhibit decomposition of movement – he may first move the hand till the thumb touches the object, and only then shape the hand appropriately to grasp the object. Thus analysis of how various components of cerebral cortex interact to support forward and inverse models which determine the overall shape of the behavior must be complemented by analysis of how the cerebellum handles control delays and nonlinearities to transform a well-articulated plan into graceful coordinated action. Within this perspective, cerebellar structure and function will be very helpful in the control of a new class of highly antagonistic robotic systems as well as in adaptive control.
62.3.1 The Human Control Loop

Lesions and deficits of the cerebellum impair the coordination and timing of movements while introducing excessive, undesired motion: effects which cannot be compensated by the cerebral cortex. According to mainstream models, the cerebellum filters descending motor cortex commands to cope with timing issues and communication delays which go up to 50 ms one way for arm control. Clearly, closed-loop control with such delays is not viable in any reasonable setting, unless augmented with an open-loop component, predicting the behavior of the actuator system. This is where the cerebellum comes into its own.
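The flavor of such predictive compensation can be captured by a Smith-predictor-style loop, sketched below under the (unrealistically generous) assumption of a perfect internal plant model: the delayed measurement is augmented by what the model says has happened since that measurement was taken. Gains, step counts, and the scalar plant are all illustrative.

from collections import deque

def control(delay=5, steps=60, k=0.8, use_predictor=True):
    # Plant: x <- x + u; its output is only sensed `delay` steps later
    # (5 steps at 10 ms per step stands in for a 50 ms one-way delay).
    x, target = 0.0, 1.0
    y_buf = deque([0.0] * delay)            # delayed sensory feedback
    m, m_buf = 0.0, deque([0.0] * delay)    # internal model and its history
    for _ in range(steps):
        if use_predictor:
            # delayed measurement, advanced by what the internal model
            # says has happened since that measurement was taken
            estimate = y_buf[0] + (m - m_buf[0])
        else:
            estimate = y_buf[0]             # naive delayed feedback
        u = k * (target - estimate)
        x += u                              # true plant
        m += u                              # internal model (assumed perfect)
        y_buf.append(x); y_buf.popleft()
        m_buf.append(m); m_buf.popleft()
    return x

print(control(use_predictor=True))    # settles near 1.0
print(control(use_predictor=False))   # oscillates with this gain and delay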
The complexity of the vertebrate musculoskeletal system, clearly demonstrated by the human arm using a total of 19 muscle groups for planar motion of the elbow and shoulder alone (see Nijhof and Kouwenhoven [62.73]), requires a control mechanism coping with this complexity, especially in a setting with long control delays. One cause for this complexity is that animal muscles come in antagonistic pairs (e.g., flexing versus extending a joint). Antagonistic control of muscle groups leads to energy-optimal (Damsgaard et al. [62.74]) and intrinsically flexible systems. Contact with stiff or fast-moving objects requires such flexibility to prevent breakage. In contrast, classical (industrial) robots are stiff, with limb segments controlled by linear or rotary motors with gear boxes. Even so, most laboratory robotic systems have passively stiff joints, with active joint flexibility obtainable only by using fast control loops and joint torque measurement. Although it may be debatable whether such robotic systems re-