Eye movements and the control of actions in everyday life

https://doi.org/10.1016/j.preteyeres.2006.01.002

Abstract

The patterns of eye movement that accompany static activities such as reading have been studied since the early 1900s, but it is only since head-mounted eye trackers became available in the 1980s that it has been possible to study active tasks such as walking, driving, playing ball games and ordinary everyday activities like food preparation. This review examines the ways that vision contributes to the organization of such activities, and in particular how eye movements are used to locate the information needed by the motor system in the execution of each act. Major conclusions are that the eyes are proactive, typically seeking out the information required in the second before each act commences, although occasional ‘look ahead’ fixations are made to establish the locations of objects for use further into the future. Gaze often moves on before the last act is complete, indicating the presence of an information buffer. Each task has a characteristic but flexible pattern of eye movements that accompanies it, and this pattern is similar between individuals. The eyes rarely visit objects that are irrelevant to the action, and the conspicuity of objects (in terms of low-level image statistics) is much less important than their role in the task. Gaze control may involve movements of eyes, head and trunk, and these are coordinated in a way that allows for both flexibility of movement and stability of gaze. During the learning of a new activity, the eyes first provide feedback on the motor performance, but as this is perfected they provide feed-forward direction, seeking out the next object to be acted upon.

Introduction

Throughout the animal kingdom, in animals with evolutionary backgrounds as diverse as humans, fish, crabs, flies and cuttlefish, one finds a consistent pattern of eye movements that can be referred to as a ‘saccade and fixate’ strategy (Land, 1999). Saccades are the fast movements that redirect the eye to a new part of the surroundings, and fixations are the intervals between saccades in which gaze is held almost stationary. As Dodge showed in 1900, it is during fixations that information is taken in: during saccades we are effectively blind.

In humans there are two reasons for this strategy. First, the fovea, the region of most acute vision, is astonishingly small. Depending on exactly how it is defined, its angular diameter is between 0.3° and 2°, and the foveal depression (fovea means pit) covers only about 1/4000th of the retinal surface (Steinman, 2003). Away from the foveal centre, resolution falls rapidly (Fig. 1). To see detail in what we are looking at, we need to move the fovea to centre the target of interest. Because a combination of blur and active suppression causes us to be blind during these relocations, we have to move the eyes as fast as possible, and saccades are indeed very fast, reaching speeds of 700° s−1 for large saccades (Carpenter, 1988). Second, gaze must be kept still between saccades, during the fixations when we take in visual information. The reason for this is that the process of photoreception is slow: it takes about 20 ms for a cone to respond fully to a step change in the light reaching it (Friedburg et al., 2004). The practical effect of this is that at image speeds greater than about 2–3° s−1 we are no longer able to use the finest (highest spatial frequency) information in the image (Westheimer and McKee, 1975; Carpenter, 1991): in short, the image starts to blur, just as in a camera with a slow shutter speed. Interestingly, animals without well-defined foveas still employ the saccade and fixate strategy. Keeping gaze rotationally stable is the primary requirement whatever the retinal configuration, but in addition mobile animals necessarily require saccadic gaze-shifting mechanisms. Without such a mechanism, when the animal makes a turn the eyes will counter-rotate until they become stuck at one end of their movement range (Walls, 1962).
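The scale of this blur can be checked with back-of-envelope arithmetic: the smear is simply image speed multiplied by the cone integration time. The sketch below is my illustration, not part of the original argument; the ~20 ms integration time comes from the text, while the ~1 arcmin figure for foveal resolution is an assumed textbook value used only for comparison.

```python
# Rough estimate of retinal image smear during one cone integration time.
# ~20 ms integration time is from the text; ~1 arcmin foveal resolution
# is an assumed textbook figure, used here only as a yardstick.

CONE_INTEGRATION_S = 0.020   # ~20 ms for a cone to respond fully
ARCMIN_DEG = 1 / 60.0        # one arcminute expressed in degrees

def smear_deg(image_speed_deg_s):
    """Angular distance the image moves during one integration time."""
    return image_speed_deg_s * CONE_INTEGRATION_S

# From well-stabilized residual slip (0.5 deg/s) to past the blur limit (5 deg/s)
for speed in (0.5, 2.5, 5.0):
    print(f"{speed:3.1f} deg/s -> smear {smear_deg(speed) / ARCMIN_DEG:.1f} arcmin")
```

On this crude estimate, slip of 0.5° s−1 smears the image by well under an arcminute, whereas speeds of a few degrees per second smear it over several arcminutes, broadly consistent with the 2–3° s−1 limit quoted above for losing the finest spatial frequencies.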

During ordinary activity, the body and head rotate the eyes in space at velocities as high as several hundred degrees per second, so that for fixation to be maintained during such motion powerful compensatory mechanisms are required to move the eyes in the opposite direction to the rotation of the head. These mechanisms are of two kinds. In the vestibulo-ocular reflex (VOR) the semi-circular canals measure head rotation velocity, and the signal they provide is fed to the eye muscles via the vestibular and oculomotor nuclei. The gain of this reflex is close to 1, so that a rotation of the head evokes an eye movement that almost exactly counteracts it. At slower velocities a second reflex, the optokinetic reflex (OKR), takes over from VOR. It operates by measuring the actual velocity of the image on the retina, and causes the eye muscles to rotate the eye in the same direction as the retinal motion, thus nulling it out. OKR is a feedback system, working on the error between the desired image speed (0° s−1) and its actual speed. VOR, on the other hand, is not a feedback mechanism, as the movements of the eyes have no effect on the sensor—the semi-circular canals. Between them, these two reflexes keep eye rotation in space within acceptable limits. Residual image motion, under conditions of realistic natural head rotation, is in the range 0.5–5° s−1 (Collewijn et al., 1981; Kowler, 1991), i.e. close to the limit at which blur would start to set in.
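The complementary character of the two reflexes can be made concrete with a toy discrete-time simulation. This is a minimal sketch under stated assumptions: the gains, the single-step dynamics and the function names are all illustrative, not measured values. The point it shows is structural: VOR is driven feed-forward by head velocity alone, while OKR integrates the residual retinal slip towards zero.

```python
# Toy discrete-time sketch of the two stabilizing reflexes (illustrative
# gains and dynamics, not physiological measurements).
# VOR: feed-forward, driven only by head velocity from the canals.
# OKR: feedback, driven by residual retinal slip, pushing it toward 0 deg/s.

VOR_GAIN = 0.95   # close to 1, but imperfect (assumed value)
OKR_GAIN = 0.8    # fraction of retinal slip cancelled per step (assumed)

def stabilize(head_velocity, steps=20):
    """Return residual retinal slip (deg/s) after VOR + OKR act on a
    constant head rotation for the given number of steps."""
    okr_command = 0.0
    slip = 0.0
    for _ in range(steps):
        vor_command = -VOR_GAIN * head_velocity   # feed-forward: canals only
        eye_velocity = vor_command + okr_command
        slip = head_velocity + eye_velocity       # residual image motion
        okr_command += -OKR_GAIN * slip           # feedback on the error
    return slip

# A 100 deg/s head turn: VOR alone leaves 5 deg/s of slip;
# OKR then nulls most of the residue over successive steps.
print(stabilize(100.0, steps=1), stabilize(100.0))
```

Run on a 100° s−1 head turn, the imperfect VOR gain leaves 5° s−1 of slip after one step, and the OKR loop then drives this residue geometrically towards zero, mirroring the division of labour described above.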

In ordinary active life, these two types of eye movement—saccades and stabilizing movements—dominate. Two others are important and need to be mentioned. Small moving objects can be tracked by the smooth pursuit system. Here the target is kept on the fovea by smooth movements not unlike those of OKR. However, OKR operates on large areas of the image whereas pursuit requires a small target, and when a target is being tracked the pursuit system is actually pitted against the wide-field OKR system, whose function is to keep the overall image still. Smooth pursuit on its own only works up to target velocities of about 15° s−1. Above this speed the smooth movements are supplemented by saccades, and above about 100° s−1 pursuit is entirely saccadic. Vergence movements are responsible for adjusting the angle between the eyes to different distances, and they are unique in that the eyes move in opposite directions relative to the head. The role of vergence in real tasks is unclear. In principle, the eyes should converge so that the two foveal directions intersect at the target, but during a task where the subjects had to use vision to guide tapping, ‘vergence tends to be set 25–45% beyond the attended plane; in other words, subjects do not adjust gaze to intersect the attended target’ (Steinman, 2003, p. 1350). It may well be that, out of the laboratory situation, vergence control is quite imprecise.

These, then, are the components from which eye movement strategies in real life tasks are constructed. They are essentially the same as those studied under various kinds of restraint in laboratories over the past century. There are other issues that have been less well studied in laboratory conditions, for example the co-operative actions of eye, head and body, which become important in the behaviour of freely moving individuals. And we may not necessarily expect the same constraints on eye movements outside the laboratory as we find when subjects are asked to do their best at some artificial task. To quote Steinman (2003, p. 1350) again: ‘Under natural conditions gaze control is lax, perhaps even lazy. One could just as easily call it efficient. Why should the oculomotor system set its parameters so as to make it do more work than is needed to get the job done’.

Objective studies of human eye movements date from around the turn of the twentieth century, although methods involving the use of after-images go back to the eighteenth century (Wade and Tatler, 2005). The first eye movement recordings were made by Delabarre in 1898, using a mechanical lever attached to the eye via a plaster of Paris ring (!). Dodge and Cline (1901) introduced a method for photographing movements of the reflection of a light source from the cornea, which remained the standard method of recording eye movements for 50 years (Steinman, 2003). The method was used in various forms, notably by Buswell (1920) to study reading aloud, and later to record eye movements made while looking at pictures (Buswell, 1935). Butsch (1932) used it to study eye movements during copy typing, and the eye movements of pianists during sight-reading were examined by Weaver (1943). The method required the head to be kept as still as possible, because any head movement changes gaze direction relative to the object being viewed, and so makes it impossible to determine where the eye is looking from eye-in-head movements alone. Improvements of the technique by Ratliff and Riggs (1950) permitted a modest amount of head movement (by using a collimated beam to put the object being observed at infinity), but nonetheless eye movement recordings were still limited to subjects who were essentially stationary. This meant that the study of the kinds of eye movements made during most of the active tasks of everyday life was precluded.

The first devices that made it possible to record eye movements during relatively unconstrained activity were made by Mackworth and Thomas (1962). They used a camera mounted on the head—they had both cine and TV versions—which simultaneously filmed the view ahead and the corneal reflection. By means of some ingenious optics they combined the images so that the moving dot produced by the corneal reflection was superimposed on the scene view to give the location of foveal gaze direction (Thomas, 1968). In this way they could visualize directly where the eye was looking, and because the device was head-mounted the ‘problem’ of head movement no longer existed. The device was used successfully to study both driving and flying. However, it was heavy and not particularly accurate (about 2° visual angle), and the design was not taken up by others. For a time another recording method seemed promising: the use of search coils mounted on both the eye and head (search coils generate currents when they rotate in the magnetic field of a larger surrounding coil). The combined output gives gaze direction, or eye and head direction separately (Collewijn, 1977). However, wearing a search coil for any length of time is uncomfortable, and movements are possible only within the magnetic field of the external coils, so the method has been used only in laboratory situations. By the 1980s video cameras had become much smaller and lighter, and a number of commercial eye trackers, along the lines of the Mackworth and Thomas cameras, became available. They were usually based on pupil position, made visible by illuminating the eye with infra-red light to produce a ‘white’ pupil that is tracked electronically. Its location relative to the head is then transferred as a spot or crosshair to the image from the scene camera, to give a pictorial display of foveal gaze direction, or ‘point of regard’ (Duchowski, 2003). These eye trackers are now in common use (Fig. 2).
A variant tracks the iris rather than the pupil (Land, 1993), and many of the records in this review were made with such an arrangement. Head-mounted eye trackers, in combination with an external video camera to record motor activity, are the main tools required to explore the relations between eye movements and motor actions.

It is appropriate here to mention two studies that have had a profound effect on the development of the field. The first was by the Russian physiologist Alfred Yarbus. He recorded the eye movements of subjects looking at pictures, extending the earlier work of Buswell (1935). Yarbus got his subjects to look at the pictures with a number of different questions in mind (Yarbus, 1967). These might relate to the relationships of the people in the picture, or the clothes they were wearing (see Section 2.1.4 and Fig. 5). What he found was that each question evoked a different pattern of eye movements, clearly related to the information required by the question. This meant that eye movements were not simply related to the structure of the picture itself, but also to ‘top-down’ instructions from executive regions of the brain. The significance of this for the present review is that when we are engaged in some activity, such as carpentry or cookery, we are also presented with a series of questions—‘Where is the hammer?’ ‘Is the kettle boiling?’—which can only be answered if appropriate eye movements are made. Yarbus’ work provided the precedent for abandoning the older idea that eye movements were basically reflex actions, and demonstrated that they are much more strategic in character.

The second study is really the one that ushered in the present era of exploring the relationship between eye movements and actions. Ballard et al. (1992) devised a task in which a model consisting of coloured blocks had to be copied using blocks from a separate pool. Thus the task involved a repeated sequence of looking at the model, selecting a block, moving it to the copy and setting it down in the right place (Fig. 3). The most important finding was that the operation proceeds in a series of elementary acts involving eye and hand, with minimal use of memory. Thus a typical repeat unit would be as follows. Fixate (block in model area); remember (its colour); fixate (a block in source area of the same colour); pickup (fixated block); fixate (same block in model area); remember (its relative location); fixate (corresponding location in copy area); move block; drop block. The eyes have two quite different functions in this sequence: to direct the hand in lifting and dropping the block, and, alternating with this, to gather the information required for copying (the avoidance of memory use is shown by the fact that separate glances are used to determine the colour and location of the model block). The only times that gaze and hand coincide are during the periods of about half a second before picking up and setting down the block.
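The repeat unit can be written out as an explicit event sequence, which makes the alternation of eye, memory and hand events easy to see. The sketch below is my paraphrase of the sequence described above: the `Step` structure, the system labels and the area names are illustrative choices, not notation from the original paper.

```python
# The block-copying repeat unit of Ballard et al. (1992), written as an
# explicit event sequence. Names and labels are illustrative, not taken
# from the original paper; each Step stands for one eye, memory or hand event.

from collections import namedtuple

Step = namedtuple("Step", ["system", "action", "target"])

REPEAT_UNIT = [
    Step("eye",  "fixate",   "block in model area"),
    Step("mem",  "remember", "its colour"),
    Step("eye",  "fixate",   "same-colour block in source area"),
    Step("hand", "pickup",   "fixated block"),
    Step("eye",  "fixate",   "same block in model area"),
    Step("mem",  "remember", "its relative location"),
    Step("eye",  "fixate",   "corresponding location in copy area"),
    Step("hand", "move",     "block"),
    Step("hand", "drop",     "block"),
]

# Minimal working memory: only the item needed next is ever stored.
held = {}
for step in REPEAT_UNIT:
    if step.system == "mem":
        held[step.target] = True   # one item stored per 'remember'
    print(f"{step.system:>4}: {step.action} ({step.target})")

print("items ever held in memory:", len(held))
```

Counting the steps shows the frugality of memory use: of nine events, only two are ‘remember’ operations, one for colour and one for location, each acquired by its own fixation.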

The main conclusion from this study was that the eyes look directly at the objects they are engaged with, which in a task of this complexity means that a great many eye movements are required. Given the relatively small angular size of the task arena, why do the eyes need to move so much? Could they not direct activity from a single central location? Ballard et al. (1992) found that subjects could complete the task successfully when holding their gaze on a central fixation spot, but it took three times as long as when normal eye movements were permitted. For whatever reasons, this strategy of ‘do it where I’m looking’ is crucial for the fast and economical execution of the task. As we shall see, this strategy seems to apply universally. With respect to the relative timing of fixations and actions, Ballard et al. (1995) came up with a second maxim: the ‘just in time’ strategy. In other words, the fixation that provides the information for a particular action immediately precedes that action; in many cases the act itself may occur, or certainly be initiated, within the lifetime of a single fixation. It seems that memory is used frugally here, as attested by the fact that separate fixations are used to obtain the colour and relative position of the blocks (although in other tasks memory for object location can persist for quite long periods, as we shall see later in Section 3.1.1). The conclusions from these studies are substantially borne out by most of the examples detailed in Section 2 of this review, and they can be regarded as basic rules for the interaction of the eye movement and action systems.

This review differs from most previous reviews of eye movements (e.g. Carpenter, 1988) in that it is not concerned with eye movements per se, but rather with the functions of the sequences of fixations that accompany different kinds of activity. The latter part of the twentieth century saw a huge amount of experimental work devoted to the physiology of eye movements. This included the mechanics and neuromuscular physiology of the eye, the nature of the control systems involved, and the neurophysiology of the central mechanisms responsible for their generation (see Robinson, 1968, Robinson, 1981; Carpenter, 1988, Carpenter, 1991). Much recent effort has gone into working out how different regions of the brain—in particular, the superior colliculus—are involved in the generation of saccades (Gandhi and Sparks, 2003; Sommer and Wurtz, 2003). At the same time much psychological research has gone into saccade generation, especially in the fields of attention and visual search (Findlay and Gilchrist, 2003; Schall, 2003). Almost all these studies deal with eye movements as single entities: saccades, stabilizing reflexes, pursuit and vergence were mainly considered as isolated systems rather than components of a larger strategy (although work on search patterns comes closest to this). It is this larger strategy—how we use our eyes to obtain the information that we need for action—that I will address here, and I will not deal in any great detail with the individual components, whose characteristics are well reviewed elsewhere.

Different kinds of activity have different requirements for visual information. A tennis player has to assess the trajectory of a rapidly approaching ball in order to formulate a return stroke. A pianist needs to acquire notes continuously from the two staves of a score, translate them into finger movements and emit them simultaneously as a continuum of key strokes. A driver must simultaneously keep the car in lane, avoid other traffic and be aware of road signs. A cook following a recipe must perform a succession of acts of preparation and assembly, each one different from the others, in a defined sequence. In all of these activities, the eyes provide crucial information at the right time and from the right place, and the patterns of fixations are unique to the particular task.

The rest of the review is in two main parts. In Section 2, I will present descriptions of the patterns of eye movements and fixations that accompany different types of activity. This will provide a database which I will mine in Section 3 to address some of the questions that the different studies raise. For example: What kinds of information do the eyes supply to the motor system of the limbs? How close does gaze have to be to the site of the action it is controlling? When is visual information acquired and supplied in relation to the timing of the motor actions themselves? What does the oculomotor system need to know about the location of objects in order to find the appropriate information? How do eyes, head, limbs and trunk cooperate in the production of an action? What can we learn about the central mechanisms responsible for these patterns of coordination? What role does memory play? Except in the context of reading, and some other sedentary activities, few of these questions were addressed prior to about 1990, and many of them remain unanswered.


Sedentary activities

The eye movements associated with activities in which the head could be kept still were amenable to study from the time of the very earliest eye movement recordings. For example, Erdmann and Dodge (1898; see also Dodge, 1900) first showed that during reading the subjectively smooth passage of the eye across the page is in reality a series of saccades and fixations, with information taken in only during the fixations.

Coordination of eye movements and actions

It is clear from the examples given in Section 2 that eye movements play a crucial role in the organization of actions, and that in general the eyes begin to collect information before the action itself has begun (Fig. 20). Eye movements are thus a planned, proactive part of every action sequence, and are not simply summoned up when more information is needed. In the discussions that follow I explore how close the relations are, in space and time, between where we look and what we do, and what

References (98)

  • J.B. Pelz et al., Oculomotor behavior and perceptual categories in complex tasks, Vision Res. (2001)
  • H. Ripoll et al., Analysis of visual patterns of table tennis players
  • H. Shinoda et al., What controls attention in natural environments? Vision Res. (2001)
  • B.W. Tatler et al., Visual correlates of fixation selection: effects of scale and time, Vision Res. (2005)
  • J. Tchalenko et al., Eye movement and voluntary control in portrait drawing
  • G. Underwood et al., Automatic and controlled information processing: the role of attention in the processing of novelty
  • G.L. Walls, The evolutionary history of eye movements, Vision Res. (1962)
  • J.R. Antes, The time course of picture viewing, J. Exp. Psychol. (1974)
  • A. Baddeley, Human Memory, Theory and Practice (1997)
  • A.T. Bahill et al., Why can’t batters keep their eyes on the ball? Am. Sci. (1984)
  • D.H. Ballard et al., Hand–eye coordination during sequential tasks, Philos. Trans. R. Soc. Lond. B (1992)
  • D.H. Ballard et al., Memory representations in natural tasks, J. Cogn. Neurosci. (1995)
  • G.T. Buswell, An experimental study of the eye-voice span in reading, Supplementary Educational Monographs No.... (1920)
  • G.T. Buswell, How People Look at Pictures: A Study of the Psychology of Perception in Art (1935)
  • R.L.C. Butsch, Eye movements and the eye–hand span in typewriting, J. Educ. Psychol. (1932)
  • R.H.S. Carpenter, Movements of the Eyes (1988)
  • R.H.S. Carpenter, The visual origins of ocular motility
  • H. Collewijn, Gaze in freely moving subjects
  • H. Collewijn et al., Natural retinal image motion: origin and change, Ann. N.Y. Acad. Sci. (1981)
  • D. Crundall, The integration of top-down and bottom-up factors in visual search during driving
  • E.B. Delabarre, A method for recording eye-movements, Am. J. Psychol. (1898)
  • R. Dodge, Visual perception during eye movement, Psychol. Rev. (1900)
  • R. Dodge et al., The angle velocity of eye movements, Psychol. Rev. (1901)
  • E. Donges, A two-level model of driver steering behavior, Hum. Factors (1978)
  • A.T. Duchowski, Eye Tracking Methodology: Theory and Practice (2003)
  • B. Erdmann et al., Psychologische Untersuchungen über das Lesen auf experimenteller Grundlage [Psychological investigations of reading on an experimental basis] (1898)
  • J.M. Findlay et al., Active Vision (2003)
  • J.M. Findlay et al., A model of saccade generation based on parallel processing and competitive inhibition, Behav. Brain Sci. (1999)
  • A.G. Fleischer, Control of eye movements by working memory load, Biol. Cybernet. (1986)
  • C. Friedburg et al., Contribution of cone photoreceptors and postreceptoral mechanisms to the human photopic electroretinogram, J. Physiol. (Lond.) (2004)
  • S. Furneaux et al., The effects of skill on the eye–hand span during musical sight-reading, Proc. R. Soc. Lond. B (1999)
  • N.J. Gandhi et al., Changing views of the role of superior colliculus in the control of gaze
  • J. Grimes, On the failure to detect changes in scenes across saccades
  • D. Guitton et al., Gaze control in humans: eye–head coordination during orienting movements to targets within and beyond the oculomotor range, J. Neurophysiol. (1987)
  • D. Guitton et al., Visual, vestibular and voluntary contributions to human head stabilization, Exp. Brain Res. (1986)
  • M. Hayhoe, Vision using routines: a functional account of vision, Vis. Cogn. (2000)
  • J.M. Henderson et al., Visual memory for scenes
  • M.A. Hollands et al., “Look where you’re going!”: gaze behaviour associated with maintaining and changing the direction of locomotion, Exp. Brain Res. (2002)
  • T. Imai et al., Interaction of the body, head, and eyes during walking and turning, Exp. Brain Res. (2001)