Richard L Gregory
From: Phil. Trans. R. Soc. Lond. B (1997) 352, 1121 - 1128 with the kind permission of the Editor
Department of Psychology, University of Bristol, 12a Priory Road, Clifton, Bristol. BS8 1TU. UK
Summary
Following Hermann von Helmholtz, who described visual perceptions as unconscious inferences from sensory data and knowledge derived from the past, perceptions are regarded as similar to predictive hypotheses of science, but are psychologically projected into external space and accepted as our most immediate reality. There are increasing discrepancies between perceptions and conceptions with science's advances, which makes it hard to define 'illusion'. Visual illusions can provide evidence of object knowledge and working rules for vision, but only when the phenomena are explained and classified. A tentative classification is presented, in terms of appearances and kinds of causes.
The large contribution of knowledge from the past for vision raises the issue: how do we recognize the present, without confusion from the past? This danger is generally avoided as the present is signalled by real-time sensory inputs - perhaps flagged by qualia of consciousness.
1. Intelligence and Knowledge
Philosophy and science have traditionally separated intelligence from perception, vision being seen as a passive window on the world and intelligence as active problem-solving. It is a quite recent idea that perception, especially vision, requires intelligent problem-solving based on knowledge.
There is something of a paradox confounding intelligence and knowledge, for one thinks of knowledgeable people as being specially intelligent and yet more knowledge can reduce the intelligence needed for solving problems. The paradox is resolved when we consider two senses of 'intelligence': active processing of information (as supposedly measured in IQ tests) and available answers (as in 'military intelligence'). These senses of 'intelligence' have been named, by rough analogy with the creating and storing of energy, potential intelligence and kinetic intelligence (Gregory 1987). The notion is that stored-from-the-past potential intelligence of knowledge is selected and applied to solve current perceptual problems by active processing of kinetic intelligence. The more available knowledge, the less processing is required; however, kinetic intelligence is needed for building useful knowledge, by learning through discovery and testing. (The analogy is imperfect because knowledge is not conserved. Nevertheless, these terms may be useful, though, apart from secret knowledge, 'potential intelligence' is not diminished by use.) When almost complete answers are available, knowledge takes the dominating role. Then 'top-down' becomes more important than 'bottom-up', which may be so for human vision. (Remarkably, there are more downwards fibres from the cortex to the lateral geniculate bodies (LGN) 'relay stations' than bottom-up from the eyes (Sillito 1995).)
Errors of perception (phenomena of illusions) can be due to knowledge being inappropriate or being misapplied. So illusions are important for investigating cognitive processes of vision. Acceptance that knowledge makes a major contribution to human vision is recent, remaining controversial. This applies even more to the machine vision of artificial intelligence. Perhaps progress in artificial intelligence has been delayed through failure to recognize that artificial potential intelligence of knowledge is needed for computer vision to be comparable to brains.
It was the German polymath, Hermann von Helmholtz (1821 - 1894) who introduced the notion that visual perceptions are unconscious inferences (von Helmholtz 1866). For von Helmholtz, human perception is but indirectly related to objects, being inferred from fragmentary and often hardly relevant data signalled by the eyes, so requiring inferences from knowledge of the world to make sense of the sensory signals. There are, however, theorists who try to maintain 'direct' accounts of visual perception as requiring little or no knowledge, notably followers of the American psychologist J. J. Gibson (1904 - 1979) whose books The Perception of the Visual World (1950) and The Senses Considered as Perceptual Systems (1966) remain influential. In place of knowledge and inference, Gibson sees vision as given directly by available information 'picked-up from the ambient array' of light, with what he calls 'affordances' giving object-significance to patterns of stimulation without recourse to stored knowledge or processing intelligence. The 'affordance' notion might be seen as an extension of the ethologist's concept of innate 'releasers', which trigger innate behaviour such as robins responding aggressively to a red patch. This fits Gibson's 'ecological optics'; but how new objects, such as telephones, are recognized without acquired knowledge is far from clear. To maintain that perception is direct, without need of inference or knowledge, Gibson generally denied the phenomena of illusion.
Following von Helmholtz's lead we may say that knowledge is necessary for vision because retinal images are inherently ambiguous (for example for size, shape and distance of objects), and because many properties that are vital for behaviour cannot be signalled by the eyes, such as hardness and weight, hot or cold, edible or poisonous. For von Helmholtz, ambiguities are usually resolved, and non-visual object properties inferred, from knowledge by unconscious inductive inference from what is signalled and from knowledge of the object world. It is a small step (Gregory 1968 a, b, 1980) to say that perceptions are hypotheses, predicting unsensed characteristics of objects, and predicting in time, to compensate neural signalling delay (discovered by von Helmholtz in 1850), so 'reaction time' is generally avoided, as the present is predicted from delayed signals. This has recently been investigated with elegant experiments by Nijhawan (1997). Further time prediction frees higher animals from the tyranny of control by reflexes, to allow intelligent behaviour into anticipated futures.
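The idea of predicting the present from delayed signals can be illustrated with a minimal sketch. It assumes, purely for illustration, that the object moves at constant velocity and that the delay is known; the function name and the numbers are hypothetical and are not taken from the experiments cited above.

```python
def predict_present_position(delayed_position, velocity, delay_s):
    """Extrapolate where a moving object is 'now' from a delayed signal.

    Assumption (not from the text): constant velocity over the delay.
    """
    return delayed_position + velocity * delay_s


# Hypothetical values: the signal reports a position of 5.0 deg, the object
# moves at 10 deg/s, and the neural signalling delay is taken to be 0.1 s.
signalled = 5.0
predicted = predict_present_position(signalled, velocity=10.0, delay_s=0.1)

# Without prediction, behaviour would be aimed at the stale, signalled
# position; with prediction it is aimed one degree further along the path.
print(signalled, predicted)
```

The sketch shows only the arithmetic of compensating a delay; how the brain estimates velocity and delay is left open.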
It is a key point that vision is not only indirectly related to objects, but also to stimuli. As Helmholtz appreciated (Boring 1950, p. 304), this follows from the law of specific energies, proposed by his teacher, Johannes Müller. It is perhaps better named the law of specific qualities: any afferent nerve signals the same quality or sensation whatever stimulates it. Thus we see colours not only from light but also when the eyes are mechanically pressed, or stimulated electrically. We may regard eyes and the other sense organs as designed by natural selection to allow the universal neural code of action potentials to signal a great variety of object properties, routed to specialized brain regions to create qualities of colour and touch, sounds and so on (colours being generated by a specialized brain module in area V4 of the striate cortex (Zeki 1993)). It was clear to Newton in Opticks (1704) that it is strictly incorrect to say that light is coloured. Rather, light evokes sensations of colours in suitable eyes and brains. Perceptions, such as colours, are psychologically projected into accepted external space. This 'projection' is demonstrated most clearly with retinal photographs of after-images, which appear on the surfaces of external objects, or are projected into outer darkness.
An essential problem for vision is perceiving scenes and objects in a three-dimensional external world, which is very different from the flat ghostly images in eyes. Some phenomena of illusion provide evidence for the uses of knowledge for vision; this is revealed when it is not appropriate to the situation and so causes a systematic error, even though the physiology is working normally. A striking example is illustrated in the following section.
2. The Hollow Face
The strong visual bias of favouring seeing a hollow mask as a normal convex face (figure 1), is evidence for the power of top-down knowledge for vision (Gregory 1970). (Barlow (1997) takes a more 'reductionist' view preferring to think of this in terms of redundancies of bottom-up signals from the eyes. I would limit this to very general features, such as properties of edge-signalling giving contrast effects, rather than phenomena attached to particular objects or particular classes of objects, such as faces.) This bias of seeing faces as convex is so strong it counters competing monocular depth cues, such as shading and shadows, and also very considerable unambiguous information from the two eyes signalling stereoscopically that the object is hollow. (There is a weaker general tendency for any object to be seen as convex, probably because most objects are convex. The effect is weaker when the mask is placed upside down, strongest for a typical face. If the mask is rotated, or the observer moves, it appears to rotate in the opposite to normal direction, at twice the speed; because distances are reversed, motion parallax becomes effectively reversed. This also happens with a depth-reversed wire cube.)
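One way to make the competition between prior knowledge and sensory evidence concrete is Bayes' rule. This is offered only as an illustrative sketch and is not the formalism of this paper; the function name and all probabilities below are hypothetical.

```python
def posterior_convex(prior_convex, lik_convex, lik_hollow):
    """Posterior probability of 'convex' given some depth evidence.

    prior_convex: prior probability that the object is convex.
    lik_convex:   probability of the evidence if the object is convex.
    lik_hollow:   probability of the evidence if the object is hollow.
    All values are illustrative assumptions, not measurements.
    """
    numerator = prior_convex * lik_convex
    denominator = numerator + (1.0 - prior_convex) * lik_hollow
    return numerator / denominator


# Stereo evidence that favours 'hollow' by 19 to 1 (hypothetical).
evidence = dict(lik_convex=0.05, lik_hollow=0.95)

# A very strong prior that faces are convex still wins over the evidence.
p_face = posterior_convex(prior_convex=0.999, **evidence)

# A weaker prior, as might hold for an unfamiliar object, loses to it.
p_generic = posterior_convex(prior_convex=0.7, **evidence)

print(p_face > 0.5, p_generic > 0.5)
```

With these assumed numbers the face is still judged convex while the generic object is judged hollow, which parallels the observation that the effect is strongest for a typical face.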
It is significant that this, and very many other illusions, are experienced perceptually though the observer knows conceptually that they are illusory - even to the point of appreciating the causes of the phenomena. This does not, however, show that knowledge has no part to play in vision. Rather, it shows that conceptual and perceptual knowledge are largely separate. This is not altogether surprising because perception must work extremely fast (in a fraction of a second) to be useful for survival, though conceptual decisions may take minutes, or even years. Further, perceptions are of particulars, rather than the generalities of conceptions. (We perceive a triangle, but only conceptually can we appreciate triangularity.) Also, if knowledge or belief determined perception we would be blind to the unusual, or the seemingly impossible, which would be dangerous in unusual situations, and would limit perceptual learning.
The distinguished biologist J. Z. Young was a pioneer who stressed the importance of handling knowledge for understanding brain function, and that there may be a 'brain language' preceding spoken or written language. Thus (Young 1978, p. 56): 'If the essential feature of the brain is that it contains information then the task is to learn to translate the language that it uses. But of course this is not the method that is generally used in the attempt to understand the brain. Physiologists do not go around saying that they are trying to translate brain language. They would rather think that they are trying to understand it in the "ordinary scientific terms of physics and chemistry".' Cognitive illusions reveal knowledge and assumptions for vision, and perhaps take us close to 'brain language', but they must be understood and also classified. Classifying is important for the natural sciences: it should be equally important for the 'unnatural science' of illusions.
Classifying must be important for learning and perception, for it is impossible to make inductive generalizations without at least implicit classes. It is also impossible to make deductive inferences, as deductions are not from facts or events, but from descriptions (in words or mathematics) of real or imaginary members of classes. Von Helmholtz's 'unconscious inference' for vision was inductive; for example inferring distances from perspective and shapes from shading. As there are frequent exceptions certainty is not attainable. Thus atypical shapes give systematic errors, when general rules or specific knowledge are inappropriate for these unusual objects or scenes, as shown most dramatically by the Ames demonstrations such as the Ames window (Ittelson 1952). (This is a slowly rotating trapezoid, the shape of a rectangle as viewed from an oblique angle. It changes bizarrely in size and form as it does not go through the usual perspective transformations of a familiar rectangle, such as a normal window.) Much the same applies to seeing familiar objects in the very different brush strokes of paintings; this is evidently seen by object knowledge and rules, such as perspective, and is normally applied to the world of objects but is activated by the patterns of paint.
3. What are Illusions?
It is extraordinarily hard to give a satisfactory definition of an 'illusion'. It may be the departure from reality, or from truth; but how are these to be defined? As science's accounts of reality get ever more different from appearances, to say that this separation is 'illusion' would have the absurd consequence of implying that almost all perceptions are illusory. It seems better to limit 'illusion' to systematic visual and other sensed discrepancies from simple measurements with rulers, photometers, clocks and so on.
There are two clearly very different kinds of illusions: those with a physical cause and cognitive illusions due to misapplication of knowledge. Although they have extremely different kinds of causes, they can produce some surprisingly similar phenomena (such as distortions of length or curvature), so there are difficulties of classification that require experimental evidence.
Illusions due to the disturbance of light, between objects and the eyes, are different from illusions due to the disturbance of sensory signals of eye, though both might be classified as 'physical'. Extremely different from both of these are cognitive illusions, due to misapplied knowledge employed by the brain to interpret or read sensory signals. For cognitive illusions, it is useful to distinguish specific knowledge of objects, from general knowledge embodied as rules. Either can mislead in unusual conditions, and so can be revealed by observation and experiment. An example of misleading specific knowledge is how a grainy texture is seen as wood, though it is a plastic imitation or a picture. More dramatic is how a hollow face or mask is seen as convex (figure 1), because faces are very rarely hollow. (Evidently the perceptual hypothesis of a face carries the, not always appropriate, knowledge that it is convex.) Examples of misleading rules are the Gestalt laws of 'closure', 'proximity', 'continuity' and the 'common fate' of movements of parts of objects (Wertheimer 1923, 1938). When these do not apply illusion can result, because not all objects are closed in form, with close-together parts and continuous edges, or with parts moving together as leaves of a tree in the wind. Exceptional objects are mis-seen when Gestalt laws are applied, and when perspective rules are applied for atypical objects, such as the Ames window and flat projections of pictures.
4. 'Ins-And-Outs'
To the usual terms 'bottom-up' signals and 'top-down' knowledge, we add what might be called 'sideways' rules. Both top-down and sideways are knowledge; the first specific (such as faces being convex), the second being general rules applied to all objects and scenes (such as the Gestalt laws and perspective). These are 'ins-and-outs' of vision, which it might be useful to consider, before attempting to explain how the visual brain works, with the scheme presented in figure 2.
5. Classifying Illusions
Appearances of illusions fall into classes which may be named quite naturally from errors of language: ambiguities, distortions, paradoxes, fictions. It may be suggestive that these apply both to vision and to language, because language possibly grew from prehuman perceptual classifications. This would explain why language developed so rapidly in biological time, if based on a take-over from pre-human classification (especially of objects and actions) for intelligent vision (Gregory 1971). Could this be Chomsky's innate 'deep structure' of the grammar of languages (cf. Pinker 1994)? In any case, this is illustrated in table 1.
Table 1
kinds | illusion appearances | sentence errors
ambiguities | Necker Cube |
distortions | Müller-Lyer |
paradoxes | Penrose triangle | she's a dark haired blonde
fictions | faces-in-the-fire | they live in a mirror
To classify causes we need to explain the phenomena. There is no established explanation for many illusions, but even a tentative classification may suggest where to look for answers and may suggest new experiments. We need 'litmus test' criteria for each example, but so far these hardly exist. There are, however, various experimental tests (especially using phenomena of ambiguity to separate the bottom-up signal from top-down or sideways cognitive errors), and selective losses of the visual agnosias may help to reveal perceptual classes (Humphreys & Riddoch 1987 a, b; Sacks 1985).
We suggest four principal kinds of causes: the first two lying broadly within physics; the last two associated with knowledge, and so perhaps with 'brain language'. The first is optical disturbance intervening between the object and the retina. The second is disturbed neural sensory signals. The third and fourth are extremely different from these, as they are cognitive and so knowledge-based, for making sense of neural signals. (Thus writing is meaningless without semantic knowledge called up by words, organized by syntactic structures of grammar.)
Adding the kinds of appearances (named from errors of language as in table 1), we arrive at something like table 2 for classifying visual illusions. One illustrative example is given for each class, under the major division between (physical) optical and neural signal disturbances and (cognitive) general rules and specific knowledge. When any are inappropriate, characteristic phenomena of illusion may occur.
Table 2
kinds | physics: optics | physics: signals | knowledge: rules | knowledge: objects
ambiguity | 1 mist | 5 retinal rivalry | 9 figure-ground | 13 hollow face
distortion | 2 mirage | 6 Café wall | 10 Müller-Lyer | 14 size-weight
paradox | 3 looking-glass | 7 rotating spiral | 11 Penrose triangle | 15 Magritte mirror
fiction | 4 rainbow | 8 after-images | 12 Kanizsa triangle | 16 faces in the fire
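The two-way classification of table 2 can also be written as a small lookup, which makes the numbering scheme explicit: entries run down each cause column in turn. The following is a minimal sketch; the names KINDS, CAUSES, EXAMPLES and classify are hypothetical, and only the labels themselves come from the table.

```python
# Appearance kinds (rows) and kinds of causes (columns) of table 2.
KINDS = ("ambiguity", "distortion", "paradox", "fiction")
CAUSES = ("optics", "signals", "rules", "objects")

# Examples in the table's own numbering, 1 to 16.
EXAMPLES = (
    "mist", "mirage", "looking-glass", "rainbow",
    "retinal rivalry", "Café wall", "rotating spiral", "after-images",
    "figure-ground", "Müller-Lyer", "Penrose triangle", "Kanizsa triangle",
    "hollow face", "size-weight", "Magritte mirror", "faces in the fire",
)


def classify(number):
    """Return (kind, cause, domain, example) for table entry 1..16."""
    if not 1 <= number <= 16:
        raise ValueError("table 2 has entries 1 to 16")
    # Numbers run down each column: four kinds per cause.
    cause_index, kind_index = divmod(number - 1, 4)
    cause = CAUSES[cause_index]
    domain = "physics" if cause in ("optics", "signals") else "knowledge"
    return KINDS[kind_index], cause, domain, EXAMPLES[number - 1]


print(classify(6))   # the Café wall entry
print(classify(13))  # the hollow face entry
```

A lookup of this kind only records the tentative attributions; it says nothing about whether an attribution is correct.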
No doubt some attributions will be controversial; they are not intended to be set in stone. The task is to develop 'litmus test' experimental criteria for assigning the phenomena to their proper classes of appearances and causes. It is entirely possible that different classes will be needed as understanding advances. We reach complicated issues, but some of them are summarized below.
(i) Mist. Any loss of information may increase uncertainty and produce ambiguities.
(ii) Mirage. Refraction of light between the object and the eyes displaces objects or parts of objects, as for mirages, or a spoon bent in water. (Conceptual understanding does not correct these distortions, though motor performance may adapt, as for diving birds catching fish.)
(iii) Looking-glass. One sees oneself double: through the glass, as a kind of ghost; yet one knows one is in front of it. So perception and conception separate. (This may be the origin of notions of mind separate from body, i.e. dualism (Gregory 1997).)
(iv) Rainbow. An illusion when it is seen as an object, with expectations as for a normal object. (Thus unlike an arch of stone, when approached, it moves away and can never be touched. With this in mind it is not illusory.)
(v) Retinal rivalry. Small horizontal separations of corresponding points of the eyes' images are 'fused', and signal depth stereoscopically. At angles greater than about 1° (Panum's limit) fusion breaks down, and perception shifts and changes in bizarre ways.
(vi) Café wall. The rows of 'tiles' (figure 3a) with alternate rows displaced by half a cycle, appear as long alternating wedges. This lacks perspective, or other depth cues. Unlike the distortions of point 10 below, it depends critically on luminances, disappearing when the neutral 'mortar' lines are brighter than the light, or dimmer than the dark tiles. It appears to violate Curie's principle that systematic asymmetry cannot be generated from symmetry; but there are two processes: small wedges are produced by local asymmetry where there is luminance contrast of light - dark half tiles and these small wedges integrate along the rows, to form long wedges (Gregory & Heard 1979).
(vii) Rotating spiral (after-effect of movement). The spiral expands yet, paradoxically, does not change size. The adapted motion channel gives conflicting evidence with unadapted position signals.
(viii) After-images. These are almost entirely due to local losses of retinal visual pigments, from intense or prolonged stimulation.
(ix) Figure-ground. The primary decision: which shapes are objects and which are spaces between objects. This seems to be given by general rules of closure and so on. (These rules cannot always make up the brain's mind.)
(x) Müller-Lyer (Ponzo, Poggendorff, Orbison, Hering and many other illusions) seem to be due to perspective, or other depth cues, setting constancy scaling inappropriately, e.g. when depth is represented on the plane of a picture. Scaling can be set bottom-up from depth cues, though depth is not seen, e.g. when countermanded by the surface texture of a picture (Gregory 1963). The distortions disappear when these figures are presented and seen in true depth: corners for the Müller-Lyer and parallel receding lines for the Ponzo, etc. (Gregory & Harris 1975).
(xi) Penrose impossible triangle. When a simple closed figure or object, seen from a critical position, has features lying at different distances but that touch in a picture, or retinal image, the visual system accepts a rule that they are the same distance. This false assumption generates a rule-based paradoxical perception.
(xii) Kanizsa triangle and many other illusory contours and surfaces. Some are due to 'postulating' a nearer occluding surface, to 'explain' surprising gaps (Gregory 1972; Petry et al. 1987).
(xiii) Hollow face. This illustrates the power of probabilities (and so knowledge) for object perception (figure 1).
(xiv) Size-weight illusion. Small objects feel heavier than larger objects of the same scale weight; muscles are set by knowledge-based expectation that the larger will be heavier, which is generally, though not always, true.
(xv) Magritte mirror. René Magritte's painting La reproduction interdite (1937) shows a man facing a mirror, but the back of his head appears in the glass. This looks impossible from our knowledge of mirrors (Gregory 1997).
(xvi) Faces-in-the-fire, ink blots, galleons in the clouds and so on, show the dynamics of perception. Hypotheses are generated that go fancifully beyond the evidence.
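The notion of inappropriate constancy scaling in item (x) can be sketched numerically. The sketch assumes the simple size-distance relation (perceived size proportional to retinal size multiplied by assumed distance, under a small-angle approximation); the function names and the distance values are hypothetical.

```python
def retinal_length(physical_length, distance):
    """Retinal extent under a small-angle approximation (assumption)."""
    return physical_length / distance


def perceived_length(retinal, assumed_distance):
    """Constancy scaling: retinal extent scaled by the assumed distance."""
    return retinal * assumed_distance


# Appropriate scaling: equal lines really lie at distances 1.0 and 2.0,
# and the assumed distances match, so both are perceived as equal.
near = perceived_length(retinal_length(1.0, 1.0), assumed_distance=1.0)
far = perceived_length(retinal_length(1.0, 2.0), assumed_distance=2.0)

# Inappropriate scaling: both lines lie on a flat picture at distance 1.0,
# but perspective cues set different scaling (hypothetical 0.9 and 1.1).
flat = retinal_length(1.0, 1.0)
seen_as_nearer = perceived_length(flat, assumed_distance=0.9)
seen_as_farther = perceived_length(flat, assumed_distance=1.1)

print(near, far)                        # equal: constancy achieved
print(seen_as_nearer, seen_as_farther)  # unequal: a distortion
```

On this account the distortion arises only because the scaling is set by depth cues that do not correspond to the flat surface, which is consistent with the distortions vanishing when the figures are seen in true depth.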
The Café wall distortion, due to disturbed neural signals, is shown in figure 3a, for comparison with the knowledge rules-distortion of the Müller-Lyer distortion (figure 3b) and the specific-knowledge distortion of the size-weight illusion (figure 3c). They may appear similar (all being distortions) but their causes are fundamentally different.
We may develop the 'flat box' of ins-and-outs (figure 2) to a fuller 'black box' (figure 4). These diagrams do not attempt to show anatomical paths or brain regions, but rather, functional ins-and-outs of vision.
A 'downwards' loop is also shown, from the prevailing perceptual hypothesis, affecting bottom-up signal processing. This may be demonstrated by the change of apparent brightness with depth-reversal of the Mach's corner illusion (figure 5). Though as Barlow points out (personal communication, 1997) this is not necessarily the explanation; it requires experiments.
6. Qualia
Most mysterious of all brain phenomena is consciousness, especially how sensations, qualia, are produced and their possible uses.
In the account given here, perception depends very largely on knowledge (specific 'top-down' and general 'sideways' rules), derived from past experience of the individual and from ancestral, sometimes even prehuman experience. So perceptions are largely based on the past, but recognizing the present is essential for survival in the here and now.
The present moment must not be confused with the past, or with imagination, as indeed one appreciates when crossing a busy road. So, although knowledge from the past is so important, it must not obtrude into the present. Primitive non-cognitive animals have no such danger of confusion, as their present is simply signalled by real-time afferent inputs.
Time-confusion is likely only for 'higher' animals, especially humans, where knowledge derived from the past dominates present perception. As for primitive (reflex and tropism-controlled) animals our present is also signalled by real-time afferent inputs, but as input signals have a smaller part to play than knowledge from the past, for cognitive perception, they must be very clearly distinguished. (Exceptions are qualia in dreams and in schizophrenic hallucinations. There are rare cases (Luria 1969) of individuals having such vivid memory that their present is dangerously confused with their past and with imagination. Memories of emotion such as embarrassment can evoke qualia, perhaps from real-time signals from visceral changes or blushing evoked by memory.) As a speculation: are real-time sensory signals - and so the present - flagged by the vividness of qualia?
It is interesting to compare the qualia of seeing, with memory of a scene immediately the eyes are closed. Surely the visual qualia almost if not entirely disappear when the sensory inputs cease. Reversing this simple experiment by opening the eyes following immediate memory, the onset of the visual qualia is so striking that they make the memory pale by comparison. So perhaps consciousness serves to avoid confusion with the remembered past, by flagging the present with the unique vividness of sensory qualia.
References
Barlow, H. B. 1997 The knowledge used in vision: and where it comes from. Phil. Trans. R. Soc. Lond. B 352, 1143 - 1149.
Boring, E. G. 1950 A history of experimental psychology, 2nd edn. New York: Appleton Century Crofts.
Gregory, R. L. 1963 Distortion of visual space as inappropriate constancy scaling. Nature 199, 678-691.
Gregory, R. L. 1968a Perceptual illusions and brain models. Proc. R. Soc. Lond. B 171, 179 - 196.
Gregory, R. L. 1968b On how so little information controls so much behaviour. In Towards a theoretical biology 2 (ed. C. H. Waddington). Edinburgh: University of Edinburgh Press.
Gregory, R. L. 1970a The intelligent eye. London: Weidenfeld & Nicolson.
Gregory, R. L. 1970b The grammar of vision. Listener 83, 242. Reprinted in R. L. Gregory 1974 Concepts of vision, pp. 622 - 629. London: Duckworth.
Gregory, R. L. 1972 Cognitive contours. Nature 238, 51-52.
Gregory, R. L. 1980 Perceptions as hypotheses. Phil. Trans. R. Soc. Lond. B 290, 181 - 197.
Gregory, R. L. 1997 Mirrors in mind. Oxford: Spektrum/New York: W. H. Freeman.
Gregory, R. L. & Harris, J. P. 1975 Illusion-destruction by appropriate scaling. Perception 4, 203 - 220.
Gregory, R. L. & Heard, P. 1979 Border-locking and the Café Wall illusion. Perception 8, 365 - 380.
Helmholtz, H. von 1866 Concerning the perceptions in general. In Treatise on physiological optics, vol. III, 3rd edn (translated by J. P. C. Southall 1925 Opt. Soc. Am. Section 26, reprinted New York: Dover, 1962).
Humphreys, G. W. & Riddoch, M. J. 1987a The fractionation of visual agnosia. In Visual object processing: a cognitive neuropsychological approach (ed. G. W. Humphreys & M. J. Riddoch), pp. 281-306. London: Lawrence Erlbaum.
Humphreys, G. W. & Riddoch, M. J. 1987b To see but not to see: a case study of visual agnosia. London: Lawrence Erlbaum.
Ittelson, W. H. 1952 The Ames demonstrations in perception. Princeton University Press.
Luria, A. R. 1969 The mind of a mnemonist. London: Penguin.
Mach, E. 1897 The analysis of sensations (English translation 1959, 5th edn). New York: Dover.
Marr, D. 1982 Vision. New York: W. H. Freeman.
Nijhawan, R. 1997 Visual decomposition of colour through motion extrapolation. Nature 386, 66 - 69.
Petry, S. & Meyer, G. E. (eds) 1987 The perception of illusory contours. New York: Springer-Verlag.
Pinker, S. 1994 The language instinct. London: Allen Lane / Penguin.
Sacks, O. 1985 The man who mistook his wife for a hat. London: Duckworth.
Sillito, A. 1995 Chemical soup: where and how drugs may influence visual perception. In The artful eye (ed. R. L. Gregory, J. Harris, P. Heard & D. Rose), pp. 294 - 306. Oxford University Press.
Wertheimer, M. 1923 Untersuchungen zur Lehre von Gestalt II. Psychol. Forsch. 4, 301 - 350. Transl. 1938 Organization of perceptual forms. In A source book of Gestalt psychology (ed. W. D. Ellis), pp. 71 - 88. London: Routledge and Kegan Paul.
Young, J. Z. 1978 Programs of the brain. Oxford University Press.
Zeki, S. 1993 A vision of the brain. Oxford: Blackwell.