Richard L. Gregory and Priscilla F. Heard
Reprinted by kind permission of the Editor from: Quarterly Journal of Experimental Psychology (1983) 35A, 217-237
Brain and Perception Laboratory, University of Bristol, Department of Anatomy, The Medical School, University Walk, Bristol BS8 1TD, England
The apparent movement is greatly affected by the width of the stripes. For movement of the rectangle figures as a whole to occur, the edge stripes must not subtend more than about 10 min of arc. The movement is most pronounced with narrower edge stripes. The stripes for these experiments were 1.8 min of arc. With stripes wider than about 10 min of arc, the rectangle figures do not move as a whole with any modulated luminance. Although the broad stripes do not shift, they clearly shrink and expand when the background is modulated around their luminances (this is Gamma movement, discussed below). There is a curious asymmetry here: broad light stripes shrink when the background is increased up to isoluminance, and they expand when isoluminance is approached downwards from a greater background luminance. Wide dark stripes show no such asymmetry: they shrink as the background approaches isoluminance with them, either from above or from below their luminance.
It is worth noting that while sweeping over the entire luminance range, the figures appear to move as a whole, with all the borders having the same velocity; though on close inspection, when the modulated background is below the central grey luminance of the display, only the dark stripes are seen to move.
A well known effect may easily be confused with these border shifts and movements. A rectangle continuously graded from dark to light (a long luminance wedge) exhibits a moving bar, or band, as either the background or its luminance is varied. The wedge itself is not, however, seen to move. This effect is very different from the moving edge-striped rectangle figure phenomena reported here. The moving band on the wedge we attribute to local loss of contrast with the background, at the region where the wedge is isoluminant with the background: the band therefore runs along the wedge as the background is changed, but with no movement of the wedge itself.
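The account of the wedge band can be made concrete with a small sketch. It assumes, purely for illustration, that the wedge luminance rises linearly along its length (the text says only that it is continuously graded); the function name and the units are hypothetical. It returns the place on the wedge that is isoluminant with the background, which is where the band of lost contrast would be expected to lie.

```python
def isoluminant_band_position(background, dark_end, light_end, length):
    """Distance from the dark end of the wedge at which it matches the background.

    Assumption (not from the paper): luminance rises linearly from
    dark_end to light_end over `length`. Returns None when the background
    lies outside the range of luminances covered by the wedge.
    """
    if not dark_end < light_end:
        raise ValueError("dark_end must be less than light_end")
    if not dark_end <= background <= light_end:
        return None  # no point on the wedge is isoluminant with the background
    fraction = (background - dark_end) / (light_end - dark_end)
    return fraction * length


if __name__ == "__main__":
    # Raising the background moves the band along a wedge that itself stays put.
    for background in (20, 40, 60):
        print(background, isoluminant_band_position(background, 0, 100, 10))
```

On this reading only the location of the contrast minimum changes with the background; the ends of the wedge keep their positions, which is why the wedge itself is not seen to move.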
The width of the grey rectangle separating the narrow stripes was not critical for these effects. It may indeed be absent; or it may subtend at least 10 deg. However, when present its luminance relative to the light and dark stripes was critical. The dramatic switch in depth occurred as the background crossed isoluminance with the grey; not as it crossed the intermediate luminance between the light and the dark stripes. This was clearly indicated in subsidiary experiments, in which the relative luminances of the central grey and the stripes were varied, but these will not be discussed further here.
When the display is blurred, by viewing with a defocusing lens, or when presented at a low level of illumination, the illusory movement and stereo depth are increased. Movement is also markedly greater in peripheral vision.
The area of the background is surprisingly uncritical. The same phenomena are observed with a surround of less than 10 min of arc. This suggests that the adaptation level of the eyes is not involved in these phenomena. This will be investigated further.
Interpretation of results
We started by asking whether position, movement, and stereo depth are signalled by the same, or by different channels which might be specially designed (by Natural Selection) to the functional requirements of edge, movement, and depth perception. If they are different we should expect dissociation under some conditions: which itself raises the question of why registration errors at edges or borders are normally seldom if ever observed. We have previously postulated active "border locking" (Gregory and Heard, 1979) to maintain registration against discrepant signals from channels which would, if they are specially adapted for different functions, have different characteristics and so sometimes must surely give incompatible signals - to produce misregistrations. At isoluminance, contiguous regions of contrasting colours (such as red and green) are unstable; and there is instability also at very high contrasts, especially at low illumination where retinal delays are large (Gregory, 1977). These instabilities we tentatively attribute to loss of border-locking, in the absence of luminance contrast, when neighbouring regions are isoluminant with only colour contrast; or when there are extremely different retinal delays at very high luminance contrast when locking appears to break down.
A further and very different (though easily confused) source of instability or misregistration is that different features of the stimulus display may be selected, perhaps for different kinds of processing for different tasks. This feature selection principle will be invoked in particular to explain the anomalous stereo result reported above.
We shall now consider some implications of different channel characteristics. The concept of channel is not altogether clear, and the term is used in several senses, though all refer to transmission of signals or of information. Channels can be anatomically defined structures, such as a nerve fibre or a nerve bundle; or they may be complete modalities, associated with different sense organs such as the visual and auditory "channels" of eyes and ears. Channels can also be defined functionally, where there are no distinguishing pathways. For an electronic example of this, in multiplex telephony there is only one "anatomical" path (a high band-width co-axial cable) but twenty or more information channels, for separate messages, given by a high frequency carrier which is divided into a frequency band for each functional channel, though without anatomical distinctions. Even cognitive selection of messages of different meaning has been described as channels, as in the "cocktail party effect" (Treisman 1964). Any of these may be valid, according to context, but here we are restricting "channel" to transmission characteristics of more peripheral neural signalling. The phenomena observed here may suggest that (quite apart from selective adaptations) specific channel characteristics can be revealed as dissociations between observed edge position, stereo depth, and movement.
There are dangers in this interpretation, which are most evident for the reversal of the stereo depth as the background crosses isoluminance with the central grey rectangle; though the edge positions (measured from vernier displacement) and the movement (measured by matching with true movement) do not reverse direction. The point is that what are accepted as the corresponding edges for stereo fusion may change when the contrast values of the figures change. It may be that different features - different edges of the display - are selected for various visual tasks. So the dissociations of the three functions may be due to different selections from the available stimulus features rather than to differences of neural channel characteristics. The problem is to distinguish between these conceptually very different though easily confused possibilities.
It is likely that which edges are selected as corresponding for stereo fusion depends upon their relative contrasts. The sudden switch of stereo depth in the "wrong" direction may then be explained by supposing that two plausible fusion rules are obeyed by the visual system. The first was suggested by Whittle (1965): that edges of the same luminance sign fuse. In these striped rectangle displays, having a light stripe in one eye and a corresponding dark stripe in the other, there are two alternative fusions as there are two edges having the same luminance sign. The inner edges of the dark stripes in one eye may fuse with the outer edges of the light stripes in the other eye; or the outer edges of the dark stripes may fuse with the inner edges of the light stripes. On this rule alone, therefore, this particular situation would be ambiguous. This ambiguity is usually resolved, we suggest, by a second rule: that where there are such alternative candidates for fusion, those with the greatest luminance contrast are favoured. This notion is shown diagrammatically in the luminance profiles of Figure 5. As the background luminance changes there will be a change in contrast of the outer edges, but not of the inner edges of the stripes. The sudden change of depth occurs with the switch-over of which edges of the stripe are fused.
When the background is darker than the central grey rectangles, the outer edge of the light stripes will have a greater contrast with the background than the inner edges have with the central grey rectangles; and the outer edges of the dark stripes will now have less contrast with the background than their inner edges have with the central grey. So, by the second rule, the outer edges of the light stripes will now fuse with the inner edges of the dark stripes. When on the other hand the background is lighter than the central grey rectangles, then the outer edges of the dark stripes will have a greater contrast with the background than their inner edges have with the central grey; and the outer edges of the light stripes will have less contrast with the background than their inner edges have with the central grey. So now the other pair of edges will fuse - the outer edges of the dark stripes will now fuse with the inner edges of the light stripes.
In the special case when the background is isoluminant with the central grey, the edges for both of the possible fusions have the same contrast. So the situation with this luminance ratio is ambiguous. This is where the depth is found to be extremely labile. The visual system appears to be unable to select either pair of edges for stereoscopic fusion in this special situation; although for the narrow stripes, there is no diplopia and no obvious rivalry.
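The two fusion rules can be written out as a minimal sketch. Everything specific in it is an assumption made for illustration: contrast is taken as the absolute luminance difference across an edge, each candidate pair is scored by the sum of its two edge contrasts, and the function name and labels are hypothetical. The same-sign rule is built in by considering only the two same-sign pairings described above.

```python
def select_fusion_pair(background, grey, light, dark):
    """Pick the same-sign edge pair with the greater total contrast.

    Assumptions (not from the paper): contrast is the absolute luminance
    difference across an edge, and a pair is scored by summing the
    contrasts of its two edges. Luminances are in arbitrary units.
    """
    # Candidate A: outer edge of the light stripe (one eye) with the
    # inner edge of the dark stripe (other eye).
    pair_a = abs(light - background) + abs(grey - dark)
    # Candidate B: outer edge of the dark stripe with the inner edge
    # of the light stripe.
    pair_b = abs(background - dark) + abs(light - grey)
    if pair_a > pair_b:
        return "outer-light/inner-dark"
    if pair_b > pair_a:
        return "outer-dark/inner-light"
    return "ambiguous"  # equal contrast: neither pairing is favoured


if __name__ == "__main__":
    for background in (30, 50, 70):
        print(background, select_fusion_pair(background, grey=50, light=80, dark=20))
```

With the background between the dark and light stripe luminances, the two scores differ by twice the difference between grey and background, so under these assumptions the selected pair switches exactly as the background crosses isoluminance with the grey and is undetermined at that point.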
Experiments on edge location
The observed switch in depth when the background is changed from just darker to just lighter than the central grey rectangle fits this account in direction; but for narrow stripes, not in extent. This model would give a step-function at the supposed switch of fusion of the borders across a stripe. Also, it would give stereo depth exactly corresponding to the disparity given by a stripe width. We find however that the observed depth switch is not quite a step-function, and the change of depth is somewhat greater than normal stereo depth corresponding to a stripe width. The depth increases beyond this expected maximum when the background luminance is close to isoluminance with the central grey. These findings suggest that we should look for shifts of individual edges with the changes of background luminance.
It turned out in practice to be difficult to measure individual edge positions; but it is possible with a carefully designed, light or dark, pointer. We found it impossible to make judgements with a line or a slit pointer; but a wedge-shaped pointer can be positioned under the vertical edges of the display to measure individual shifts of each edge for any background luminance.
Measurement of individual edges
A luminous wedge pointer was introduced optically with a half-silvered mirror, and positioned for measurements by the subject, under each of the four edges of the lower rectangle. The pointer was moved with the swinging glass plate shown in Figure 4. As a check against the possibility that the pointer contrast might have a biasing effect, a dark wedge pointer was substituted for the light pointer in preliminary trials. No biasing effect was found, but the luminous pointer was somewhat easier to see and so was used for these measures, although the subject's task was still by no means easy.
For the vernier displacement measures of the main experiment (plotted in Fig. 3) the subjects were instructed to, and believed they were, aligning the outside edges of the striped figures. This vernier function is very similar in its form to the outer edge shifts measured by the pointer in the experiment just described (Fig. 7) though of somewhat greater amplitude. These two functions are shown for comparison in Figure 8. A possible reason for the amplitude difference will be suggested after we discuss the stereo depth function.
Given the fusion rules for stereo depth suggested above, we should expect this stereo function (Fig. 5) to correlate with the shifts of the edges as selected by the fusion rules. For background luminances less than the grey rectangle, they would be the outer light and the inner dark edges. For background luminances greater than the central grey, they will be the outer dark and the inner light edges. The shifts of these borders (measured by the wedge pointer) correlate well with the stereo depth function in form; but their amplitude is too small. The functions are plotted for comparison with one of the depth functions of the main experiment, in Figure 9.
It does seem that the pointer-measured functions are smaller in amplitude than the equivalent functions (vernier shift and stereo depth) of the whole figures as measured by the nulling and matching techniques in the main experiment. This might be because different features of the display are accepted for the pointer measures: for example only the ends of the borders contiguous with the pointer, rather than the entire length of the borders which may be used for matching and nulling. Or very different, the luminous wedge-shaped pointer might perhaps, by some interactive process, affect the measured positions; but the tests with the dark pointer, and the absence of any observable change as either pointer was introduced, are evidence against this nuisance. To argue from similarities or differences of the measured functions to whether the same or different channels are operating, we must decide whether the form of the functions or their relative amplitudes give the best indication for identifying channels. We consider that form gives a better indication than amplitude, for amplitude may well depend on such factors as signal/noise ratio demanded, or required, according to the task; and given the large spread-function of the optics of the eye, such a change of demanded or required signal would produce a correspondingly large change of amplitude of the function. More detailed measures, and further considerations of these border shifts and changes of apparent widths of the stripes, will be considered in a later paper.
The results reported here suggest that two of the three dissociations we started with - position and stereo depth - need not be due to differences of neural channel characteristics; but rather to which stimulus features are selected by the visual system. We suggest that for the stereo depth the highest contrast edges of the same sign are selected for fusion, and for the vernier alignment it is the outer edges of the stripes that are selected.
The movement function for the figure as a whole (Fig. 2) is not so easily explained in terms of the measured shifts of the individual edges (Fig. 7a,b). Here it can be seen that the dark and light stripes shift differently. The measured edges of the light stripes do not shift significantly with changes of background luminance below that of the grey rectangle. In this background luminance region both edges of the dark stripes move together in the "required" direction. So, if the shifts are directly related to movement, only the edges of the dark stripe contribute to movement below isoluminance with the grey. As already noted, there does appear to be more movement of the dark stripes, with background luminance modulation in this range. One might expect that where the movement is maximum - around isoluminance with the grey - there would be consistent shifts of the edges but this is not the case. The situation in this luminance range is complicated. Most of the edges shift in the wrong direction. Above isoluminance of the background with the grey rectangle, the higher contrast dark stripe edge shifts in the right direction while the other dark stripe edge shifts equally in the wrong direction. This suggests that signals from the lower contrast edge are effectively rejected. The situation here for the light stripes may be similar; for although the low contrast edge shift changes direction in this range, the high contrast edges do shift systematically in the right direction for the movement. The conclusion is that if the higher contrast edge is always selected, the movement function is compatible with the other functions, except in the range immediately around isoluminance of the background with the grey rectangle which, anomalously, is where the movement is greatest.
This suggests that movement is not signalled by the same mechanisms, or channels, as those responsible for vernier shift and stereo depth from disparity; and so movement is dissociated from them in this situation.
We are now left with two basic issues to explain: first the difference in the channel characteristics of the signalled movement compared to the signalled position and stereo depth; and secondly the cause of the perceived movement, and shifts of the edge position and stereo depth with changes in luminance contrast.
Let us first consider the difference in the channel characteristics of the movement compared to the position and stereo depth. It is well known that there are several phenomenally different kinds of movement. It would be highly surprising, and in some cases surely impossible for all these to be mediated by the same channel. It is certainly clear that different channels are involved when movement is signalled by the eyes tracking moving objects, by the "eye/head system", than from retinal images running across the retinal receptors while the eyes are at rest, giving movement signals from the very different "image/retina system" (Gregory, 1966).
Here we are only dealing with image/retina movements, which may however involve more than one channel. Although about six phenomenally different types of image/retina movement can be identified, it is not clear whether they share similar underlying mechanisms or channel characteristics. It has been suggested by Braddick (1974) that there are two different movement channels - a long-range process and a short-range process - associated with phi movement and "co-operative" or global movement. This occurs when regions of dots are displaced within a dotted background, and are seen to move as a unitary whole. These movements involve real shifts in edge location, and there is no suggestion that the seen movement under these conditions has different characteristics from the seen edge positions. In our situation the physical position of the edge remains stationary, only the luminance contrast is varied; but our luminance changes may activate the same channels as for a physically shifting edge. We are certainly concerned here with short range processes, as our illusory movement cannot be obtained with stripe widths more than about ten minutes of arc.
Perhaps the other well known illusory movements produced by luminance changes, rather than by physical shifts in edges, are produced by the same channel characteristics as our movement, and may be dissociated from signalled edge position. Irradiation (Helmholtz, 1867), or Gamma movement (Kenkel, 1913; cf. Boring, 1942, p. 597) occurs where a brightening stimulus is seen to expand, and a darkening stimulus is seen to contract. The classical irradiation effect of a bright stimulus appearing bigger than a dark stimulus is consistent with gamma movement. However Weale (1974) has described a new effect, in which a low contrast dark square appears larger than a low contrast light square. This apparent size change occurs in the opposite direction to the gamma movement, and is another example of dissociation between movement and signalled edge position. In our situation, the dark striped side of the display obeys gamma movement, but the light striped side of the display moves in the opposite direction. As the display is brightened, the light striped side contracts from the background and the dark striped side expands into it. There is no agreed explanation for gamma movement, although increase in scattered light on the retina with increasing brightness may be a partial explanation (Helmholtz, 1867).
Delta movement occurs in the direction of the earlier stimulus when the later one is much brighter. It gives a 'reversed' movement and is not strictly produced by luminance changes alone, and so may not be comparable to the situation here. It is probably best explained by the well known long retinal action time with dim stimuli, and shorter action time with bright stimuli: so in the extreme and critical conditions needed the arrival times for the signals may be reversed from the stimuli times. The dark and light stripes of our displays should similarly signal with different retinal action times. They do not however have extremely different luminances and the rate of change of luminance is not critical. Further, the observed movements occur equally whether the striped display or the background is varied in luminance. We therefore rule out differences of retinal delay as significant for these effects.
Anstis (1970) described an illusory movement, which he termed "reversed phi", occurring when a photographic positive is gradually substituted for a slightly displaced negative; the movement is in the opposite direction to normal phi. It is not clear that this phenomenon is related to phi movement because it does not have the same critical time-distance relation (Korte's laws), so it is unfortunate that the term "reversed phi" came to be used, and it has now been abandoned by Anstis and Rogers (personal communication, 1981). The movement they describe is in the same direction and may be the same as the illusory movement produced by the luminance changes in these experiments, although the displays are somewhat different. With the experimental technique they used, Anstis and Rogers were unable to present the same (mirror reversed) figures to the eyes for stereo fusion with luminance ratios crossing isoluminance because they presented a grey shape (without stripes) to one eye at constant luminance, while the other eye was given static stages from the sequence of negative-displaced-positive dissolves. They did not, therefore, find the switch in depth across isoluminance of the background with the grey rectangle. Their vernier measurements are consonant with their depth and movement functions, but are for the most part in the opposite direction from ours. So Anstis and Rogers find no dissociation, in their situation, between movement and signalled position or stereo depth.
Another piece of strong observational evidence for separate channels for movement and position is provided by the fact that the after-effect of movement is paradoxically seen without changes in position (Gregory, 1966, p. 107). There is some evidence (Tyler, 1973) that stereo depth and static edge position are mediated by different channels. High spatial frequency modulation of a line is not resolved as well when stereoscopically fused with a straight line, as when viewed monocularly. Tyler suggested that vernier and stereoscopic processing are carried out by two systems, operating relatively independently at higher cortical levels, and that depth signals are integrated over a longer time (Foley and Tyler, 1976).
Having discussed differences between movement, edge position and stereo depth, we can now address the question of how the visual system operates to produce the movement, edge position and stereo depth shifts that we are describing with changing luminance. We can discuss this issue from two aspects: relevant physiological evidence and hypothetical operations that are supposed to be carried out by the visual system.
Considering what is known of the physiological basis for movement as normally seen from retinally shifting images ('image/retina' movements), movement is conveyed by sequentially changing ratios of intensity between neighbouring receptors, as signalled by later neural channels. For the rabbit retina, as Barlow and Levick (1965) found, movement can also be signalled - by light or by dark spots - moving within a receptive field; so at least for the rabbit retina the primary units or channels for signalling movement to the brain are not the receptive fields of ganglion cells but are earlier. On the other hand, the primary units for signalling edge position must require comparisons between signals from separated receptive fields. This difference could well be the key for why the movement function is different from the position and stereo depth functions. Not only can movement be signalled within a ganglion cell's receptive field; there is also overwhelmingly strong evidence that movement can also be signalled from successive stimulation of widely separated ganglion cells - provided the time intervals and distance between stimuli are appropriate for Korte's Laws of phi movement (Korte, 1915; Graham, 1965).
It is not clear that monkey or man have direction-selective movement detectors prior to the cortex. Hubel and Wiesel (1968) describe that for monkey, movement without directional specificity is signalled by a class of cortical cells named "simple", and particular directions of movement by "complex" cortical cells. These cells are often described as bar or edge detectors mediating perceived edge position and stereo depth. Zeki (1974) describes cells in the posterior bank of the monkey's superior temporal sulcus that respond specifically to movement. Movement is also signalled in the superior colliculus of the mid brain, where it appears to mediate eye movements. There are well known indications of cells having different properties and different sizes of receptive field, especially Y-cells and X-cells as described by Cleland and Levick (1974) in the cat and Gouras (1968) in the monkey. They find that the Y-cells, which have a transient response, possibly mediating movement, have larger receptive fields, and are more sparsely spaced than the X-cells which have a slow and sustained response, and possibly mediate signalled position. It seems unfortunate that these physiological recordings require moving stimuli. There is also human psychophysical evidence supporting the notion that position and movement are signalled by separate channels having different receptive field sizes, the movement channel having lower spatial frequency (King-Smith and Kulikowski, 1975).
There have been several attempts to model the visual system's processing of the visual image. Anstis and Rogers (1975), Rogers and Anstis (1975) and Rogers (1976) suggested a spatial summation model to explain their illusory movement, and shifts of edge position and stereo depth. They convolved the luminance profiles of their stimuli with a Mexican hat type function, which they derived from typical receptive field characteristics. The resulting convolved functions demonstrated an effective contour shift in the direction of their illusory movement and shifts. They explained the shifts from the overall shape of the convolved function directly without specifying what part of the function was giving the effective edge, such as the peak or the zero-crossing of the second derivative. This type of theory is attractive, but there are two problems in applying it to our situation. First, the edges of the stripes are not resolved with the size of the convolving function's space constant chosen by Anstis and Rogers; second, we do not find all our measured functions moving together, so the model does not obviously fit all our functions. Watt and Morgan (1983) have found that their vernier acuity experiments are best explained by a model which encodes only the occurrence and location of zero-crossings in the second derivative of the retinal light distribution. Marr (1982) has described a theory of vision in which the zero-crossings of the second derivative of a Laplacian operator can most efficiently represent the image. It would be interesting to see how well these models can account for our data. The rules we have suggested for feature selection may well be extended to restraint rules based on what objects generally do (Ullman, 1979).
Possibly the shifts of position with the luminance changes are due to what we have previously called "border locking", which is supposed to correct for positional discrepancies, to maintain registration at borders in spite of signalling errors in parallel channels. Such general avoidance of discrepancies would require correcting shifts of position; which may be activated by the asymmetrical luminances on either side of the borders and narrow stripes to produce, in these particular displays, distortions of position and of stereo depth.
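The kind of computation these models involve can be sketched in one dimension. The sketch below is not the Anstis and Rogers model, nor that of Watt and Morgan or Marr: the kernel (a sampled second derivative of a Gaussian), its space constant, the sample units and all the function names are assumptions chosen for illustration. It convolves a luminance profile with a Mexican hat function and reports where the output crosses zero.

```python
import math


def mexican_hat(sigma, radius):
    """Sampled Mexican hat (negative second derivative of a Gaussian)."""
    kernel = []
    for i in range(-radius, radius + 1):
        u = (i / sigma) ** 2
        kernel.append((1.0 - u) * math.exp(-u / 2.0))
    mean = sum(kernel) / len(kernel)
    # Subtract the mean so that uniform regions give (near) zero output.
    return [k - mean for k in kernel]


def convolve_same(signal, kernel):
    """Convolve, repeating the end samples so the output has the input length."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for x in range(n):
        total = 0.0
        for j, k in enumerate(kernel):
            idx = min(max(x + j - radius, 0), n - 1)
            total += k * signal[idx]
        out.append(total)
    return out


def zero_crossings(response, eps=1e-9):
    """Sub-sample positions where the response changes sign.

    Samples smaller than eps in magnitude are ignored so that rounding
    noise in flat regions is not reported as a crossing.
    """
    crossings = []
    for i in range(len(response) - 1):
        a, b = response[i], response[i + 1]
        if abs(a) > eps and abs(b) > eps and a * b < 0:
            crossings.append(i + a / (a - b))  # linear interpolation
    return crossings


def stripe_profile(background, stripe, grey, edge, width, n):
    """Luminance profile: background, then a stripe, then the central grey."""
    return [background if i < edge else stripe if i < edge + width else grey
            for i in range(n)]


if __name__ == "__main__":
    kernel = mexican_hat(sigma=4.0, radius=16)
    for background in (30.0, 70.0):  # below and above the grey (50.0)
        profile = stripe_profile(background, 80.0, 50.0, edge=100, width=2, n=200)
        print(background, zero_crossings(convolve_same(profile, kernel)))
```

For an isolated step the single zero-crossing falls midway between the two samples that straddle the edge. For a stripe that is narrow relative to the kernel, the crossings need not coincide with the physical stripe edges, which is one way of picturing the resolution problem noted above.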
ANSTIS, S. M. (1970). Phi movement as a subtraction process. Vision Research, 10, 1411 - 30.
ANSTIS, S. M. and ROGERS, B. J. (1975). Illusory reversals of movement and depth during changes in contrast. Vision Research, 15, 957 - 61.
BARLOW, H. B. and LEVICK, W. R. (1965). The mechanism of directionally selective units in rabbits' retina. Journal of Physiology, 178, 477 - 504.
BORING, E. G. (1942). Sensation and Perception in the History of Experimental Psychology. New York: Appleton-Century-Crofts.
BRADDICK, O. J. (1974). A short range process in apparent motion. Vision Research, 14, 519 - 27.
CLELAND, B. C. and LEVICK, W. R. (1974). Brisk and sluggish concentrically organized ganglion cells of the cat's retina. Journal of Physiology, 240, 421 - 56.
FOLEY, J. M. and TYLER, C. W. (1976). Effect of stimulus duration on stereo and vernier displacement thresholds. Perception and Psychophysics, 20, 125 - 8.
GOURAS, P. (1968). Identification of cone mechanisms in monkey ganglion cells in monkey retina. Journal of Physiology, 199, 533 - 47.
GRAHAM, C. H. (1965). The perception of movement. In: GRAHAM, C. H., BARTLETT, N. R., BROWN, J. L., HSIA, Y., MUELLER, C. G. and RIGGS, L. A. (Eds), Vision and Visual Perception. New York: Wiley.
GREGORY, R. L. (1966). Eye and Brain. London: Weidenfeld and Nicolson.
GREGORY, R. L. (1977). Vision with isoluminant colour contrast: 1, A projection technique and observations. Perception, 6, 113 - 9.
GREGORY, R. L. (1979). Stereo vision and isoluminance. Proceedings of the Royal Society B, 204, 467 - 76.
GREGORY, R. L. and HEARD, P. F. (1979). Border locking and the cafe wall illusion. Perception, 8, 366 - 80.
HELMHOLTZ, H. VON. (1867). In: SOUTHALL, J. P.C. S. (ed.) (1963) Handbook of Physiological Optics. London: Dover reprint.
HUBEL, D. H. and WIESEL, T. N. (1968). Receptive fields and functional architecture of monkey striate cortex. Journal of Physiology, 195, 215 - 43.
KING-SMITH, P. E. and KULIKOWSKI, J. J. (1975). Pattern and flicker detection analysed by subthreshold summation. Journal of Physiology, 249, 519 - 48.
KORTE, A. (1915). Kinematoskopische Untersuchungen. Zeitschrift für Psychologie, 72, 193 - 296.
MARR, D. (1982). Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. San Francisco: W. H. Freeman and Co.
ROGERS, B. J. (1976). Perceptual consequences of temporal and spatial summation in the human visual system. Ph.D. Thesis Bristol University.
ROGERS, B. J. and ANSTIS, S. M. (1975). Reversed depth from positive and negative stereograms. Perception, 4, 193 - 201.
TREISMAN, A. M. (1964). Selective attention in man. British Medical Bulletin, 12 - 16.
TYLER, C. W. (1973). Stereoscopic vision: cortical limitations and a disparity scaling effect. Science, NY., 181, 276 - 8.
ULLMAN, S. (1979). The interpretation of visual motion. Cambridge, Mass.: MIT Press.
WATT, R. J. and MORGAN, M. J. (1983). Mechanisms responsible for the assessment of visual location: Theory and evidence. Vision Research, 23, 97 - 109.
WEALE, R. A. (1974). Apparent size and contrast. Vision Research, 15, 949 - 55.
WHITTLE, P. (1965). Binocular rivalry and the contrast at contours. Quarterly Journal of Experimental Psychology, 17, 217 - 26.
ZEKI, S. (1974). Functional organization of a visual area in the posterior bank of the superior temporal sulcus of the Rhesus monkey. Journal of Physiology, 236, 549 - 73.
Revised manuscript received 14 January 1982