Figure and Ground in the Visual Cortex: V2 Combines Stereoscopic Cues with Gestalt Rules

doi:10.1016/j.neuron.2005.05.028

Neuron. Author manuscript; available in PMC 2006 September 12.

Published in final edited form as:

Neuron. 2005 July 7; 47(1): 155–166.

PMCID: PMC1564069

NIHMSID: NIHMS5205

Fangtu T. Qiu and Rüdiger von der Heydt

Krieger Mind/Brain Institute, and Department of Neuroscience, Johns Hopkins University, 3400 N Charles Street, Baltimore, MD 21218

Correspondence: Rüdiger von der Heydt, Krieger Mind/Brain Institute, Johns Hopkins University, 3400 North Charles Street, Baltimore, MD 21218. Phone: 410 516-6416; Fax: 410 516-8648; E-mail: von.der.heydt/at/jhu.edu

The publisher's final edited version of this article is available at Neuron.

See commentary "Resolving border disputes in midlevel vision." in Neuron, volume 47 on page 5.

Abstract

Figure-ground organization is a process by which the visual system identifies some image regions as foreground and others as background, inferring three-dimensional (3D) layout from 2D displays. A recent study reported that edge responses of neurons in area V2 are selective for side-of-figure, suggesting that figure-ground organization is encoded in the contour signals (border-ownership coding). Here we show that area V2 combines two strategies of computation, one that exploits binocular stereoscopic information for the definition of local depth order, and another that exploits the global configuration of contours (gestalt factors). These are combined in single neurons so that the ‘near’ side of the preferred 3D edge generally coincides with the preferred side-of-figure in 2D displays. Thus, area V2 represents the borders of 2D figures as edges of surfaces, as if the figures were objects in 3D space. Even in 3D displays gestalt factors influence the responses and can enhance or null the stereoscopic depth information.

Keywords: primate visual cortex, visual perception, figure-ground organization, stereoscopic vision, gestalt principles, single-unit activity, awake macaque, receptive fields, area V1, area V2

Introduction

We perceive the world in three dimensions although our eyes register only two-dimensional images. These images are generally cluttered because objects occlude one another, and surfaces that are widely separated in space are projected onto adjacent image regions (Fig. 1A). Thus, a fundamental task of vision is to identify the borders between image regions that correspond to different objects. These borders, also termed ‘occluding contours’, carry information about the form of the occluding object, but are generally not related to the background objects. For example, the border between the dark and midgray regions in Fig. 1 defines the shape of the lighter tree in the foreground, but not the shape of the partly occluded darker tree. Somehow, the brain immediately ‘knows’ that the object corresponding to the darker region extends behind the lighter region, and consequently registers the darker tree as a more or less symmetrical shape and not as a banana-shaped object (the actual form of the dark gray region). Thus, the task of vision is not only to detect the occluding contours, but also to assign them correctly to the occluding objects.

Fig. 1

The problem of interpreting two-dimensional (2D) images in terms of objects in a 3D world. Images are composed of regions that correspond to objects in space (A). The boundaries of these regions are generally the contours of objects that occlude more ...

It might be thought that this perceptual interpretation is only possible because the image contains familiar shapes of objects. Yet psychologists in the early twentieth century argued that mechanisms of figure-ground organization exist that work automatically, and independently of the observer’s knowledge and expectation (Koffka, 1935; Rubin, 1921; 2001; Wertheimer, 1923; 2001) (for a review see Spillmann and Ehrenstein, 2003). Indeed, figure-ground perception can be manipulated experimentally by providing specific cues that define the depth relationships explicitly, for example, by means of stereograms. Under these conditions, the perception of form and recognition of objects is dramatically affected when the depth ordering between regions is altered (Nakayama et al., 1989). This indicates that assignment of border ownership precedes the recognition process.

Single-cell recordings show that stereoscopic cues contribute to the cortical representation of contours in many ways. Some of the neurons in area V2 that signal location and orientation of luminance contours respond also to disparity-defined contours created by ‘random-dot stereograms’ (RDS) and represent the depth ordering of surfaces (von der Heydt et al., 2000). Binocular disparity influences the representation of contours in V1 and V2 (Bakin et al., 2000; Heider et al., 2002; Sugita, 1999) and affects motion signals in area MT (Duncan et al., 2000) in ways that parallel perceptual figure-ground organization. Illusory contour signals depend on occlusion cues which might also be used for assigning figure and ground (Baumann et al., 1997; von der Heydt et al., 1993). Thus, depth cues profoundly influence the neural visual representation at early cortical levels.

The phenomenon of figure-ground organization in the absence of specific depth cues is still a mystery. Why is the white square in Fig. 1B generally perceived as an object in front of a dark background rather than a window in a dark screen, or simply a lightly pigmented patch of surface surrounded by a darker pigmented region? The borders between light and dark are interpreted as the edges of an occluding object. Apparently, the system assigns border ownership despite the absence of depth cues, using criteria such as compact shape, the global configuration of contours (closure, ‘surroundedness’), or perhaps by identifying familiar shapes (in this case a square). Without implying a specific theory we refer to this phenomenon as gestalt-based figure-ground organization.

Neural correlates of gestalt-based figure-ground organization were recently discovered at early levels in the visual cortex (Lamme, 1995; Lee et al., 1998; Zhou et al., 2000; Zipser et al., 1996) (but see Rossi et al., 2001). Lamme and colleagues found enhancement of texture-evoked activity in figure regions compared to the ground region in neurons of V1. Zhou et al. found that neural edge responses were selective for the side of the figure to which the edge ‘belonged’ (see below). This phenomenon was more pronounced in V2 and V4 than in V1. Remarkable about these findings is that neurons at these early levels integrate the image context far beyond the classical receptive field (for a review see Albright and Stoner, 2002).

The selectivity for side-of-figure of neurons might be just a random asymmetry of receptive fields. If it indeed reflects the process of figure-ground segregation as hypothesized (Zhou et al., 2000), then these neurons should also respond to stereoscopically defined 3D edges and be selective for depth order. For example, a neuron with a preference for figure-to-the-left (Fig. 1C, black dot indicates receptive field) should respond to edges in which the surface to the left of the receptive field is nearer than the surface to the right, because this is so for objects in 3D space; but the neuron should not respond to edges of the opposite depth order because a left-far edge can only occur if the figure is a window. Zhou et al. presented two examples of cells in which the preferred side of figure in fact coincided with the ‘near’ side of the preferred depth order. Finding this in two cells could have been a coincidence. It remained an open question whether the visual cortex systematically combines stereoscopic cues with gestalt-based criteria, and how it does this. Is there a statistical association between both kinds of cues, and if so, how strong is it? Are gestalt cues comparable to ‘real’ depth cues such as binocular disparity? How do neurons respond if gestalt cues contradict the binocular information?

In the present study we have investigated the interplay between stereoscopic cues and gestalt cues in the visual cortex quantitatively. The results show that there is a robust tendency to combine these different sources of information according to the rule that a compact shape corresponds to an object in 3D space. Experiments with combinations of cues show that gestalt factors influence the border-ownership signal even when explicit depth information is available.

Results

Two main experiments were performed. The aim of Experiment 1 was to determine if side-of-figure preference and stereoscopic edge preference are combined in a systematic way in single neurons. The two hypothetical mechanisms were tested separately: Side-of-figure selectivity was determined with contrast-defined figures which do not provide depth cues, and stereo-edge selectivity was determined with RDS which define depth, but are devoid of contrast-defined form. In Experiment 2, depth and gestalt cues were combined, and synergistic and antagonistic combinations were tested to see how the cues interact.

Additional experiments were performed on a subset of the neurons to establish size invariance of the gestalt effect, and position invariance of 3D edge selectivity. We will begin by discussing these results in sections 1-2, because they serve well to explain the basic findings of side-of-figure selectivity and stereo edge selectivity. In sections 3-4 we will then present the results of the main experiments, and in section 5 some controls.

1. Side-of-figure selectivity

A fraction of the orientation selective neurons in macaque area V2 signal not only the location and orientation of luminance and color edges, but also the location of the figure to which an edge ‘belongs’ (Zhou et al., 2000). Fig. 2A illustrates a V2 neuron that responds more strongly to the bottom edge of a light square than to the top edge of a dark square although the edge in the receptive field is the same. Note that the left and right displays in Fig. 2A are indistinguishable over the entire region occupied by the two squares (dashed line in Fig. 2B) and that information about the side of the figure can only come from outside that region. Thus, despite its small receptive field (black ellipse), the neuron apparently processes a large image context. As can be seen in Fig. 2B, the size of the square determines the distance over which context signals need to be integrated to determine the location of the figure. Cells were tested with two sizes of squares, 3 deg and 8 deg visual angle, and two contrast polarities, and the side-of-figure effect was quantified by the response modulation index, taking the preferred side for the 3-deg figure as reference (see section 3). This index is plotted in Fig. 2C for all cells in which the effect for the 3-deg figure was significant (p<0.05, analysis of variance, ANOVA). The points corresponding to the same neuron are connected by lines. It can be seen that most cells (27 of 33) showed the same side preference for the 8-deg figure as for the 3-deg figure. Zhou et al. found consistent side selectivity for figures that spanned up to 20 deg of visual angle. This range of context integration is huge compared to the small size of the ‘classical receptive field’ of V2 neurons, which is only 0.6 deg on average for the median eccentricity of receptive fields in our sample (Gattass et al., 1981).

Fig. 2

Side-of-figure selectivity. A, Responses of a V2 neuron to the same local contrast border forming either the top edge of a dark square, or the bottom edge of a light square. Squares of two sizes were tested (3° and 8° visual angle). Displays ...

2. Stereoscopic edge selectivity

Many neurons in V2 are sensitive to binocular disparity (Poggio et al., 1985) and some respond to stereoscopically defined 3D edges (von der Heydt et al., 2000). The majority of these cells are selective for the orientation of the edge and also for the depth order, that is, which surface is in front and which in back. Fig. 3 illustrates this selectivity for three V2 neurons. Disparity-defined edges were created by RDS. The disparity of one surface was set to the preferred disparity of the neuron (or zero if there was no clear tuning), and the other surface was placed behind it at a distance corresponding to 10 or 24 arc min disparity (depending on the eccentricity of the receptive field). The edges were tested in four orientations, as illustrated at the top of Fig. 3. (For the purpose of illustration, the preferred orientation was assumed to be vertical; hatching indicates the nearer of the two surfaces). To control for effects of stimulus position, each edge was presented at various positions relative to the receptive field, as indicated by the scales. The bar graphs below show the responses as a function of position.

Fig. 3

Neural selectivity for stereoscopic edges. Neurons were tested with random-dot stereograms (RDS) portraying a square floating in front of a background plane. An edge of the square was presented in the receptive field (ellipse) at four orientations, as ...

It can be seen that, at the preferred orientation, each neuron responds vigorously to one depth order, but hardly at all to the opposite depth order. For example, the cell in Fig. 3A responds to a vertical edge whose right surface is in front, but not at all if the left surface is in front (although the edge is at the same depth in both configurations!). The other two cells have the opposite preference. Note that the preference for one or the other depth order does not depend on the exact position of the edge in the receptive field; at any position, the responses to the non-preferred depth order are much smaller than the maximum response. Also edges orthogonal to the preferred orientation (horizontal in the Figure) produce only weak, erratic responses. Thus, cells in V2 can signal orientation and depth order of 3D edges. Generally, these cells respond to contrast edges as well as to disparity-defined edges and show similar orientation tuning for both (von der Heydt et al., 2000).

3. Convergence of gestalt processing and stereoscopic mechanisms in single cells

The stereoscopic selectivity of neurons provides a key to understanding the meaning of their signals. If neurons are selective for the depth order of stereoscopic edges we know that they are involved in the representation of the 3D layout of surfaces, and hence border-ownership coding. While contrast-defined displays are generally ambiguous (Fig. 1), there is no such ambiguity in random-dot stereograms because the depth relations are defined by the binocular disparities; the nearer surface owns the border (Nakayama et al., 1989). Thus, the random-dot stereogram can be considered as the ‘gold standard’ for border-ownership assignment. If the side-of-figure selective neurons are involved in border-ownership coding, they should also be selective for the depth order of edges in random-dot stereograms. We may not expect to see this in every case, because stereopsis is obviously not indispensable for the perception of border-ownership. However, if neurons combining side-of-figure with depth order selectivity exist in significant numbers, and if the depth-order preference, in the population, is biased towards the object interpretation (Fig. 1C), this would be strong evidence for mechanisms that implement gestalt rules to infer border ownership.

In Experiment 1 we examined the relationship between preferred side-of-figure and preferred depth order of single neurons. Fig. 4 illustrates this experiment for a neuron recorded in area V2. The responses to the contrast-defined figures (A-D) show that the neuron is activated more strongly when the square is located to the left of the receptive field (responses A and C are stronger than responses B and D). The test with random-dot stereograms (E-H) shows that the neuron responds vigorously to the step when the left-hand surface is nearer than the right-hand surface (E, F), but hardly at all to the reverse step (G, H). Thus, the neuron associates “figure left” with “left surface in front”, which is consistent with an interpretation of the contrast-defined square as an object in front of a background. Note also that, in the case of the random-dot stereograms, the responses are determined by the depth order of the surfaces in the receptive field, but are independent of the location of the global shape. Whether the edge was the right-hand edge of a square surface (E) or the left-hand edge of a window (F) made no difference.

Fig. 4

Convergence of gestalt mechanisms and stereoscopic mechanisms in a single neuron. A-D, Responses to left and right sides of contrast-defined figures. For either contrast polarity of the local edge, figure location left of the receptive field (A, C) produces ...

Fig. 5 illustrates the results from four other V2 neurons in this experiment. The averaged firing rate is plotted as a function of time after stimulus onset. The plots labeled Contrast show the responses to edges of contrast figures: solid line for preferred side, dashed line for non-preferred side (averaged over both contrast polarities). The plots labeled RDS show the responses to 3D steps, and solid lines correspond to steps in which the surface on the preferred figure side was near (the object case), whereas dashed lines correspond to steps in which the surface on the preferred figure side was far (the window case). It can be seen that, in neurons in A-C, the 3D step that was consistent with the object interpretation evoked the stronger response, while for the neuron in D, the 3D step corresponding to the window interpretation was more effective. In each case, the differentiation of side-of-figure and depth order occurred soon after the onset of responses.

Fig. 5

The responses of four other V2 neurons in the same experiment. The graphs show the smoothed mean firing rates as a function of time after stimulus onset. For contrast-defined figures (Contrast), solid and dashed lines show the responses for preferred ...

This experiment was performed in 251 orientation selective neurons, 77 from area V1, and 174 from area V2. Fig. 6 shows how these neurons combined side-of-figure and 3D-step selectivity. The modulation index for side-of-figure:

$$I_{\text{side}} = \frac{R_{\text{pref}} - R_{\text{nonpref}}}{R_{\text{pref}} + R_{\text{nonpref}}}$$

where R is mean firing rate, is plotted on the vertical axis, while the horizontal axis shows the corresponding modulation index for depth order:

$$I_{\text{depth}} = \frac{R_{\text{pref-near}} - R_{\text{pref-far}}}{R_{\text{pref-near}} + R_{\text{pref-far}}}$$

where pref-near and pref-far signify the edges whose surface on the preferred side is near, and far, respectively (‘preferred side’ for the contrast figure). This index is > 0 if side-of-figure and step-edge preferences are consistent with an object interpretation of the figure, and < 0 if they are consistent with a window interpretation. (The side-of-figure modulation index is always positive because preferred side was defined as the side associated with the greater response.) Filled symbols indicate cells that were selective for both side-of-figure and depth order (p<0.05 in each case, ANOVA). It can be seen that in the V2 sample (Fig. 6, top) cells on the object side are more frequent and tend to have higher modulation indices for side-of-figure than cells on the window side. Of the 174 neurons tested in area V2, 35% were selective for side-of-figure, 40% were selective for depth order, and 21% were selective for both. Of the latter, 81% (30/37) represented the object interpretation. In area V1 (Fig. 6, bottom), only two of the 77 neurons tested were selective for both side-of-figure and depth order, significantly fewer than in V2 (p<0.0001, Fisher’s exact test).
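As a minimal sketch of how the two indices relate, the Python fragment below assumes the common difference-over-sum form of a modulation index and takes the preferred side from the contrast-figure responses, as described in the text. The function names and the firing rates are hypothetical illustrations and are not values from the recordings.

```python
def modulation_index(r_a, r_b):
    """Difference-over-sum contrast of two mean firing rates (assumed form)."""
    total = r_a + r_b
    if total <= 0:
        return 0.0  # no response in either condition: index undefined, report 0
    return (r_a - r_b) / total


def side_and_depth_indices(r_fig_left, r_fig_right, r_left_near, r_right_near):
    """Return (preferred_side, i_side, i_depth) for one neuron.

    r_fig_left / r_fig_right: mean rates for a contrast figure left or right
    of the receptive field. r_left_near / r_right_near: mean rates for
    random-dot stereogram edges whose left or right surface is nearer.
    """
    if r_fig_left >= r_fig_right:
        preferred = "left"
        i_side = modulation_index(r_fig_left, r_fig_right)
        # "pref-near" is the edge whose LEFT surface is near
        i_depth = modulation_index(r_left_near, r_right_near)
    else:
        preferred = "right"
        i_side = modulation_index(r_fig_right, r_fig_left)
        # "pref-near" is the edge whose RIGHT surface is near
        i_depth = modulation_index(r_right_near, r_left_near)
    return preferred, i_side, i_depth


# Hypothetical neuron: prefers figure-left and left-surface-near (object type)
print(side_and_depth_indices(30.0, 10.0, 24.0, 8.0))
# Hypothetical neuron: prefers figure-right but left-surface-near (window type)
print(side_and_depth_indices(10.0, 30.0, 24.0, 8.0))
```

Under these assumptions the side-of-figure index is never negative, and the sign of the depth-order index separates the object interpretation (positive) from the window interpretation (negative).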

Fig. 6

Gestalt-based and stereoscopic figure-ground mechanisms in neurons of areas V2 and V1. The modulation index for side-of-figure is plotted on the vertical axis, and the modulation index for depth order on the horizontal axis. Each symbol represents a neuron. ...

To quantify the degree of object preference in the population of neurons we calculated the object bias of the population response, defined as the mean of the index I_side with each neuron weighted by its index I_depth. I_depth indicates which way, and how strongly, a neuron signals figure and ground when unambiguous depth information is provided. Thus, we take the RDS as the standard test that tells us how to read the neural signals. The object bias thus calculated would be zero if there was no association between side-of-figure and depth order preference, and positive (between 0 and 1) if there was a bias towards object interpretation, and negative if there was a bias towards window interpretation. All cells tested were included in this analysis. For the V2 data of Fig. 6 we obtained an object bias of +0.42 (t=24.2, df=173, p<0.0001). For V1, it was not significantly different from zero (t=−0.1, N=77, p=0.93). Note that the side-of-figure modulation index was calculated from the responses to contrast-defined figures without depth cues, and the object bias was obtained from this index by pooling neurons according to their 3D edge selectivity (which is their signature of coding 3D layout). Thus, the fact that the object bias for V2 is positive means that contrast-defined figures without specific depth information are represented in V2 as if they were objects in 3D space.
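The weighting described above can be sketched in a few lines of Python. The text does not spell out the normalization, so this sketch assumes one plausible reading: the sum of I_side times I_depth divided by the sum of the absolute weights |I_depth|, which keeps the result between -1 and 1. The function name and the three-neuron example are hypothetical.

```python
def object_bias(i_side, i_depth):
    """Mean of I_side weighted by the signed I_depth of each neuron.

    Assumption: normalization by the sum of |I_depth|. The result is 0 when
    side-of-figure and depth-order preferences are unrelated, positive for an
    object bias, and negative for a window bias.
    """
    if len(i_side) != len(i_depth):
        raise ValueError("need one I_side and one I_depth per neuron")
    weight_total = sum(abs(w) for w in i_depth)
    if weight_total == 0:
        return 0.0  # no neuron carries a depth-order signal
    weighted_sum = sum(s * w for s, w in zip(i_side, i_depth))
    return weighted_sum / weight_total


# Hypothetical population: one strong object-type cell, one weak window-type
# cell, and one weak object-type cell.
print(object_bias([0.5, 0.2, 0.4], [0.8, -0.1, 0.1]))
```

A neuron with a depth-order index near zero contributes little under this scheme, which matches the idea that the stereogram responses tell us how to read each neuron's side-of-figure signal.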

Besides the neurons that combined selectivity for side-of-figure and depth order (filled symbols), Fig. 6 shows that there were also neurons that were selective for side-of-figure, but not for stereoscopic depth order, and others that were selective for depth order, but not side-of-figure. This indicates that two different mechanisms provide inputs to these neurons, and sometimes converge onto a single neuron. The predominance of the object interpretation shows that the two mechanisms are not combined at random, but according to the rule that the region of the figure corresponds to an object in 3D space. The convergence seems to occur mainly in V2.
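The V1/V2 comparison reported above (2 of 77 neurons versus 37 of 174 selective for both cues) used Fisher's exact test. The following is a minimal pure-Python sketch of a two-sided Fisher test on a 2x2 table, using the usual convention of summing the probabilities of all tables that are no more likely than the observed one. The function name is hypothetical, and the sketch is not the authors' analysis code.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum every table at least as extreme (as improbable) as the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Rows: V2, V1. Columns: selective for both cues, not selective for both.
p_value = fisher_exact_two_sided(37, 137, 2, 75)
print(p_value)
```

The counts 137 and 75 are simply the remaining neurons in each area (174 − 37 and 77 − 2).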

The symbols corresponding to the examples in the previous figures are labeled with numbers in Fig. 6, number 1 representing the cell of Fig. 4, and numbers 2-5 the cells of Fig. 5A-D. It was easy to find examples of cells with strong modulation in both dimensions on the object side, but on the window side only two of the 7 cells with both effects had larger modulation indices. Cell number 5 was the best example of this kind. This cell responded vigorously to stereoscopic edges and was completely selective for depth order (Fig. 5D), and this was confirmed by recording responses for various edge positions relative to the receptive field (Fig. 3A). The contrast-edge and bar responses were weak (Fig. 5D). Nevertheless, the side-of-figure preference was confirmed by several repetitions, and for different sizes of the square. Cell number 6 of Fig. 6 barely responded to RDS, but its depth order preference was confirmed with displays of drifting, dense random-dot patterns. Such displays generate strong depth stratification in perception (cf. Kaplan, 1969; Yonas et al., 1987) and were found to evoke depth-order selective responses in V2 cells similar to those from RDS (von der Heydt et al., 2003). In cell 6 such displays again produced responses according to the window interpretation. Thus, the window combination of side preference and edge selectivity might be more than a variation produced by chance; representing the alternative interpretation might have functional significance. However, the general weakness of response modulation in the few ‘selective’ cells on the window side underscores the predominance of the object-type wiring in neurons of area V2.

The modulation index plotted in Fig. 6 indicates the relative change of responses, but not their absolute strength. To show that our analysis is based on robust responses we have listed in Table 1, for contrast edges and for RDS edges, the means and medians of the response strengths (mean firing rate for the preferred of the four stimulus conditions illustrated in Fig. 4). For comparison, the statistics are listed for cells classified as ‘selective in both tests’ (represented by filled dots in Fig. 6) and for other cells. The average response strengths were in the range of 30-47 spikes/second for contrast edges, and about half of that for RDS. The V2 data show that the responses of the ‘selective’ cells were actually stronger than those of the other cells on average, for contrast edges as well as for RDS.

Table 1

Comparison of response strengths between cells that were selective for side-of-figure as well as depth order (p<0.05 for each) versus other cells.

Experiment 1 consisted of two tests, one with contrast-defined figures, and the other with stereoscopic figures. Each involved two factors, and only the effects of side-of-figure and depth order are represented in Fig. 6. In the contrast figure test, the second factor was edge contrast polarity (Fig. 4A-D). The effect of this factor was significant in 42% of the V2 cells. Similar to previous results (Zhou et al., 2000), the effect of contrast polarity was found in about half of the side-of-figure selective cells, and interaction was found in one fifth. The most frequent type of interaction was multiplicative behavior, with a strong side-of-figure difference for the preferred contrast polarity, but little difference for the other polarity because responses were close to zero.

In the stereogram test, the second factor was the location of the disparity-defined figure (Fig. 4E-H). This factor was rarely significant (9%, compared to 35% for contrast-defined figures) and interaction between side of disparity-defined figure and local depth order was also rare. The example shown in Fig. 4 is typical. Thus, RDS responses depended on the depth order of the edge in the receptive field, but not on the location of the global shape. We conclude that disparity-defined (‘cyclopean’, Julesz, 1971) figures have a weaker gestalt effect than contrast-defined figures. The selectivity for stereoscopic depth order is produced mainly by local mechanisms.

4. Contradictory versus coherent cues for objects: do gestalt cues modulate stereoscopic responses?

In the above experiment, side-of-figure preference and stereoscopic selectivity were examined in separate tests. The contrast-defined figures had no stereoscopic cues, while the stereoscopic figures had no contrast borders that would define the shape of the figure. Natural stimuli generally provide global shape information as well as stereoscopic cues. The stereoscopic information tends to ‘disambiguate’ perception. For example, the tilted square in Fig. 1B could be perceived as an object or as a window. Although the object interpretation usually dominates, perception may flip back and forth between the two interpretations. However, when texture is added to the display and the square region is given a ‘near’ disparity relative to the dark region, an object is invariably perceived. But when the same region is given a ‘far’ disparity, a window is perceived. In the latter case, disparity overrides the gestalt influence. This observation suggests that the gestalt influence may be easily obliterated by unambiguous depth cues. How are the different cues combined in single neurons? Are the gestalt cues weaker than conventional cues such as stereoscopic disparity? Can they influence the responses when pitted against disparity?

In Experiment 2 we studied displays in which figures were defined by luminance contrast and disparity. As before, a contrast square was presented left or right of the receptive field, but the light and dark regions were also textured with a random-dot pattern (RDS contrast=0.3). The neural selectivity for depth order was determined with object and window displays, as shown schematically in Fig. 7A (which does not show the random-dot texture). The same 3D edge was presented in the receptive field in two conditions: one in which the global shape supports the object interpretation, and the other in which the global shape was located on the ‘wrong’ side, that is, the gestalt cue contradicts the depth cue. For each condition, the depth order modulation index was calculated. The index for object displays is plotted on the horizontal axis, the index for window displays on the vertical axis. The former was taken as the reference; if it was negative, the signs of both indices were reversed. Responses were recorded for the two contrast polarities of the local edge and averaged (only one polarity is illustrated).

Fig. 7

Interaction of gestalt factors and stereoscopic depth. Figures were defined by luminance contrast and disparity. A, Schematic illustration of 3D stimuli and receptive field position (the random-dot texture is not illustrated; in the case of window stimuli ...

Neurons whose responses were determined solely by the local 3D edge would tend to produce the same depth order modulation index for object and window displays, because, in both cases, the index subtracts responses to far-near edges from responses to near-far edges. Such cells would therefore be represented by data points clustering about the 45° line. However, neurons that were dominated by side-of-figure would show inverted modulation indices, because for the horizontal axis, figure-right was subtracted from figure-left, whereas for the vertical axis, figure-left was subtracted from figure-right. Thus, neurons that are dominated by side-of-figure would be represented near the -45° line.
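The two predictions above can be illustrated with a toy model. The sketch below assumes a purely additive, rectified neuron whose rate is a baseline plus a depth-order term plus a side-of-figure term; this model, its parameter names, and the numbers are hypothetical and serve only to show where such a cell would fall in a plot of window index against object index. It also applies the convention from the text that both signs are reversed when the object index is negative.

```python
def response(left_near, figure_left, base, depth_gain, figure_gain):
    """Toy additive response (spikes/s); all parameters are hypothetical."""
    depth_term = depth_gain if left_near else -depth_gain
    figure_term = figure_gain if figure_left else -figure_gain
    return max(0.0, base + depth_term + figure_term)


def modulation_index(r_a, r_b):
    total = r_a + r_b
    return (r_a - r_b) / total if total > 0 else 0.0


def object_window_indices(base, depth_gain, figure_gain):
    """Depth-order indices for object displays (x) and window displays (y)."""
    # Object displays: the square sits on the NEAR side of the edge.
    i_object = modulation_index(
        response(True, True, base, depth_gain, figure_gain),
        response(False, False, base, depth_gain, figure_gain))
    # Window displays: the square sits on the FAR side of the edge.
    i_window = modulation_index(
        response(True, False, base, depth_gain, figure_gain),
        response(False, True, base, depth_gain, figure_gain))
    if i_object < 0:  # object index is the reference: flip both signs
        i_object, i_window = -i_object, -i_window
    return i_object, i_window


print(object_window_indices(20.0, 5.0, 0.0))  # depth only: on the 45 deg line
print(object_window_indices(20.0, 0.0, 5.0))  # figure only: on the -45 deg line
print(object_window_indices(20.0, 5.0, 3.0))  # mixed: below the 45 deg line
```

In this toy model a cell with both terms lands between the two diagonals, with a smaller index for window displays than for object displays.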

The cue interaction experiment was performed in 29 stereo edge selective cells (9 from V1 and 20 from V2) and the results are plotted in Fig. 7B. Filled dots indicate neurons in which the main effect of side-of-figure was significant (p<0.05, 3-way ANOVA with factors depth order, side-of-figure, and contrast polarity). The plot shows that these cells are represented below the 45° diagonal; they had a lower modulation index in the window condition than in the object condition. Thus, the ‘wrong’ localization of the figure reduced or abolished the depth order signal (the fact that most of these cells cluster about the horizontal axis suggests that the window displays are represented with no clear depth at all in those cells). This shows that gestalt factors influenced the responses even in the presence of effective stereoscopic cues. However, in none of the cells did the gestalt cue fully reverse the modulation (no dots on the -45° line).

The interaction of cues is further illustrated by an example in Fig. 8 (recordings from the cell labeled 7 in Fig. 7). As before, the figures were defined by luminance contrast and disparity, but in this case, the contrast of the random-dot texture was varied, thereby varying the strength of the stereoscopic cue. The insets illustrate the four configurations; A and C represent object conditions, B and D window conditions; in A and B, the square shape is located on the left of the receptive field, in C and D, on the right.

Fig. 8

Interaction of gestalt factors and stereoscopic depth. Figures were defined by luminance contrast and disparity, as in the previous experiment (Fig. 7), but the contrast of the random-dot texture was varied to show the transition to the no-disparity condition ...

The bar graphs at the bottom of Fig. 8 show the responses of the neuron for these four conditions at 3 different contrast levels of the random-dot texture (RDS contrast). Bars extending left and right of the zero line correspond to left and right location of the square. It can be seen that with stereoscopic cues (RDS contrast=0.1 and 0.3), responses to A are stronger than responses to C, and responses to D are stronger than responses to B. Thus, the neuron responds according to stereoscopic depth order. However, in the no-texture condition (RDS contrast=0), the responses to the window displays flip to the left; B now produces stronger responses than D. This corresponds to a change in perception of border ownership: without the stereoscopic cues, the squares in displays B and D are no longer perceived as windows, but as objects, according to gestalt cues. Border ownership flips from right to left in B, and from left to right in D. Note that even with the disparity cue, the responses for D were slightly weaker than the responses for A (dashed lines in plot D are copies of the bars from A). This shows the attenuation of stereoscopic signals by the gestalt factor that was demonstrated in Fig. 7.

5. Controls

We considered errors in centering the edge of the test figure in the receptive field and deviations of direction of gaze as possible confounds. For the side-of-figure test, position errors can probably be neglected because we compare responses between two conditions in which the displays are identical over a region that is larger than the ‘minimum response field’ of the cells. Random position errors would therefore produce similar variations of response in both cases and cancel. Systematic deviations of fixation according to figure location were ruled out by eye movement recordings. For the stereoscopic test, depth order selectivity was verified by recording position-response curves (Fig. 3) for a subset of the cells of our sample, specifically for 18 of the 37 V2 neurons classified as selective for side-of-figure and depth order (filled symbols in Fig. 6).

Changes in convergence of the eyes would not be detected by our eye movement recordings, which were made for only one eye. To see if the stereograms caused changes of convergence, we analyzed the responses of disparity-selective cells in the presence of background disparities (see Experimental Procedures). This analysis indicated that convergence was maintained accurately.

Discussion

The phenomenon of figure-ground organization played a key role in the formulation of the gestalt theory, which conjectured that central processes such as attention and recognition access visual image information not directly, but through an intermediate, structured representation (Rubin, 1921; Wertheimer, 1923). Later studies have demonstrated that changes in perceived depth stratification dramatically affect perception of form, recognition of objects, and selective visual attention (Driver and Baylis, 1996; He and Nakayama, 1992; Nakayama et al., 1989; Rensink and Enns, 1998). Both older and recent studies pointed out that the internal assignment of border-ownership seems to be the key to understanding these results. Based on single-cell recordings in macaques, Zhou et al. (2000) suggested that border ownership is encoded in the contrast edge responses of neurons in the visual cortex.

The present results show that the visual cortex processes global configuration together with binocular information to relate contrast borders to object contours and assign border ownership. There are two key observations. First, neurons that are side-of-figure selective for edges of 2D figures are often (61%) selective for depth order of 3D edges. Second, the side of the figure that produces the stronger response is also usually the ‘near’ side of the 3D step for which the neuron is selective (Fig. 6). Thus, the system assigns the contrast borders of 2D figures as if they were objects in 3D space. For contrast-defined figures that provide no stereo cues, the configuration of contours determines the border-ownership signal according to gestalt rules. When contrast borders are missing, as in random-dot stereograms, the depth order determines the signal. In general, both kinds of information contribute to the border-ownership signal; but if stereo depth is in conflict with gestalt rules (according to which enclosed, compact image regions should be interpreted as objects), the influence of the stereoscopic input is reduced or abolished (Fig. 7). These results support the hypothesis of border-ownership coding (Zhou et al., 2000). Side-of-figure selectivity by itself might be dismissed as a random asymmetry of receptive fields (spatial heterogeneity of non-classical surround has been observed in V1: Freeman et al., 2001; Jones et al., 2001; Levitt and Lund, 2002), but the linkage between stereoscopic selectivity and 2D contextual influence is unequivocal evidence for border-ownership coding.

The possibility that the side-of-figure effect is an artifact of displacements of the receptive field due to residual eye movements can be ruled out because responses are compared between stimulus conditions that are identical in and around the minimum response field. That selectivity for depth order was genuine, and not due to eccentric positioning, was demonstrated by recording position-response profiles for figures in random-dot stereograms in about half of the neurons of the main sample. If anything, positioning errors would have produced depth order preferences at random in different neurons, but Fig. 6 shows that depth order preference was correlated with side-of-figure preference. Stimulus-induced changes in fixation were ruled out by eye movement recordings and by analysis of the disparity tuning of neurons which indicated that convergence of the eyes was unaffected by the stimulus. Also, the effects of positioning errors and eye movements would be more noticeable in V1 than in V2 because of the smaller size of receptive fields in V1, but the observed depth order selectivity was more pronounced in V2.

Cells that were selective for side-of-figure and depth order (filled symbols in Fig. 6) responded with higher mean firing rates than other cells (Table 1). One possible explanation for this is that border-ownership modulation produces enhancement of responses for the preferred condition. However, there might be other reasons. The most effective spatial pattern generally varies from cell to cell, some responding best to edges, others to gratings, bars, or other patterns. These variations are probably related to the different functions of cortical cells in the visual process, for example, contour versus surface representation. Thus, border-ownership selective cells might be more responsive to edges than other cells because they are involved in contour representation.

The fact that only a fraction of cells was found to be selective for side-of-figure or depth order (combined these were 54% of the cells tested) is not surprising considering that only a fraction of the contrast borders in natural images are occluding contours (contrast borders are also produced by surface pigmentation, bending of a surface, shadows etc.). Accordingly, border ownership assignment is only one of several tasks performed in the visual cortex. Also, in micro-electrode recording experiments, as described here, signals are selected randomly from the neural network and therefore, presumably, reflect various stages of processing and thus various levels of neural selectivity.

The origin of the gestalt influence

The influence of global configuration is still mysterious. Our results show that the range of this influence extends far beyond the limits of the classical receptive fields, which might be taken as indicating a process of central origin. However, several observations argue against this possibility.

One is the early differentiation of the responses for the two sides of figure (Figs. 4-5) which seems to exclude central loops such as IT cortex as the mechanism of figure-ground differentiation, as we have discussed earlier (Zhou et al., 2000).

Another observation is that the side-of-figure preference of each single neuron is fixed in relation to its receptive field. Another neuron with the same location and orientation of receptive field may have the opposite preference. This means that the identification of the figure area is probably not due to an influence of top-down attention. How can attention signals, which should be able to gate the activity for a figure in either location, produce different effects for the two locations? And if attention is directed to the figure in one location, how can it simultaneously enhance activity in one cell, but suppress it in the other? It seems that, for the top-down signal to produce opposite effects in different neurons there must be lower-level mechanisms that differentiate the cells. A similar argument can be made regarding back-propagation of signals from a shape recognition stage such as the inferior temporal cortex. It is unlikely that such influences would be side-specific to the individual receptive fields.

It is important also to remember that our findings reflect the activity in the visual cortex when the animal was engaged in a demanding fixation task (depth matching at stereoscopic threshold). This probably means that the animal tried, as much as possible, to ignore the stimuli to which the neurons responded. Recent experiments with multiple figures and operational control of attention confirmed that border ownership in V2 is generated independently of attention (although many cells also show an attention effect) (von der Heydt et al., 2004).

The present results, showing that side-of-figure selectivity is ‘wired up’ with stereoscopic selectivity in a specific way in single neurons, support the conclusion that the preference of neurons for one or the other side is hard-wired and not under central control. Stereoscopic selectivity originates early in the visual cortex and, therefore, probably is hard-wired. Because the object bias illustrated in Fig. 1 is an invariable property of images of a 3D world, the side-of-figure preference of neurons and its link to depth order preference should also be invariant.

Exactly how the lower cortical areas would integrate information from distant parts of the visual field remains to be determined. Because image information is laid out retinotopically in area V2 (Gattass et al., 1981; Van Essen and Zeki, 1978), the representations of the figure boundaries are widely distributed in the cortex. Thus, for the processing to occur within V2, one would have to assume fast horizontal propagation of signals to explain the rapid emergence of border-ownership signals. Given the large size of V2, the conduction velocity of intracortical fibers might be too slow. Another possibility is that the integration occurs via recurrent signals from nearby areas, such as V3 or V4, which would travel through the much faster fibers of the white matter (Bullier, 2001; Hupe et al., 2001).

Neural coding of figure-ground organization

Figure-ground organization is a complex phenomenon that involves depth stratification as well as grouping of elementary features into larger units (‘figures’). Border-ownership coding provides a key to understanding a broad range of observations. Our results suggest that the coding of border ownership is surprisingly simple: Each segment of contrast border is represented by two groups of orientation selective neurons, one for each side of ownership, whose differential activity encodes the border assignment, much as motion is encoded by cells with opposite direction preference, or light and dark by on- and off-center ganglion cells. We assume that the strength of the neural border-ownership signal is related to the probability of perceiving one of two adjacent regions as occluding the other. Thus, neural border-ownership assignment is not an all-or-none process. For example, side-of-figure signals in V2 decrease with increasing figure size (Fig. 2C). Correspondingly, smaller regions have a higher probability of being perceived as foreground than larger regions (Rubin, 1921). We do not imply that V2 is “the site of perception” of figure and ground. Such an interpretation would be incompatible with the graded nature of border-ownership signals and the observation of neurons representing alternative interpretations in parallel (Fig. 6).
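As a rough illustration of the opponent scheme described above, the sketch below reads out border ownership from two pools of neurons with opposite side preference. The normalized difference, the function name `ownership_signal`, and the firing rates are all assumptions made for illustration; the text specifies only that the differential activity of the two groups carries the assignment.

```python
def ownership_signal(rate_left_pool, rate_right_pool):
    """Hypothetical opponent readout for one border segment.

    Positive values favor ownership by the left side, negative values
    favor the right side. The magnitude is graded, not all-or-none.
    """
    total = rate_left_pool + rate_right_pool
    if total == 0:
        return 0.0  # no activity, no evidence either way
    return (rate_left_pool - rate_right_pool) / total


# Made-up mean firing rates (spikes/s) of the two pools.
print(ownership_signal(30.0, 10.0))  # strong evidence for left ownership
print(ownership_signal(22.0, 18.0))  # weak evidence for left ownership
print(ownership_signal(10.0, 30.0))  # strong evidence for right ownership
```

A readout of this kind varies continuously between -1 and +1, so it can express the graded signals described in the text, for example a weaker signal for larger figures.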

Coding border ownership in orientation selective cells is an effective way of representing the overlay structure of scenes because the signals of these cells form the basis of shape representation for subsequent stages of processing. The assignment of border ownership directly specifies which contour elements are to be processed for each shape and which not. Indeed, border-ownership assignment affects shape recognition (Driver and Baylis, 1996; Nakayama et al., 1989) and shape specific visual search (He and Nakayama, 1992; Rensink and Enns, 1998). The figure-ground dependence of motion signals in MT (and of motion perception) indicates that MT mechanisms compute the direction of motion of a surface from features at the borders of the surface, selecting the features according to border ownership (Duncan et al., 2000; Shimojo et al., 1989).

The finding of a convergence of stereoscopic and gestalt-based mechanisms provides interesting clues about how the visual cortex might represent surfaces. Our results show that 3D edge selective neurons not only detect disparity edges (von der Heydt et al., 2000), but in many cases also assign border-ownership. Thus, these neurons do not represent isolated 3D features, but edges with reference to an adjacent region. Other neurons represent brightness and color borders, again with a pointer to an adjacent region (Zhou et al., 2000). Taken together, these instances of ‘gestalt’ influence are evidence for mechanisms that link diverse feature signals to larger entities. We argue that linking contour features to regions is a fundamental operation in coding 3D surfaces.

The existence of 3D surface representations has been suggested by many studies (for a review see Nakayama et al., 1995). For example, stereograms can produce illusory surfaces (Gregory and Harris, 1974; Idesawa, 1991); depth order influences the perceived color of surfaces (Nakayama et al., 1989; 1990); and visual attention is deployed according to perceptual surfaces (He and Nakayama, 1995). Correlates of depth stratification and illusory surface formation have been demonstrated in neuronal responses (Bakin et al., 2000). The convergence of stereoscopic and contrast information in border-ownership selective neurons might provide a basis for a general explanation of these phenomena. Theoretical studies (Craft et al., 2004; Schuetze et al., 2003) show that the neural mechanisms of gestalt-based border-ownership assignment can be modeled by relatively simple ‘grouping’ circuits. They suggest that these circuits might serve top-down attention mechanisms to access the ‘grouped’ information by polling the various neurons representing the borders of a figure. This way the features that define a surface, such as 3D shape and color, can be selected as a whole for further processing. Thus, the demonstration of a link between stereoscopic and gestalt-based mechanisms for assignment of contrast borders is a step towards understanding the coding of visual information at this intermediate stage and its role in the vision process.

Experimental Procedures

Single neurons were recorded from areas V1 and V2 of the visual cortex in alert, behaving macaques (Macaca mulatta). Three small posts for head fixation and two recording chambers over the left and right visual cortex were attached to the skull with bone cement and surgical screws. The surgery was done under aseptic conditions under pentobarbital anesthesia induced with ketamine, and buprenorphine was used for postoperative analgesia. All animal procedures conformed to National Institutes of Health and USDA guidelines as verified by the Animal Care and Use Committee of the Johns Hopkins University.

Recording

Single-neuron activity was recorded extracellularly with glass-insulated Pt-Ir or Quartz-insulated Pt-W microelectrodes inserted through small (3-5 mm) trephinations. Area V1 was recorded right under the dura, V2 either in the posterior bank of the lunate sulcus, after passing through V1 and the white matter, or in the lip of the post-lunate gyrus. The two areas were distinguished by their retinotopic organization and by histological reconstruction of the recording sites as described previously (Zhou et al., 2000).

Control of fixation

Eye movements were recorded for one eye using a video-based infra-red pupil tracking system with a camera mounted on the axis of fixation via a 45° beam splitter. A novel fixation task was used that required the subjects to align a dot to a short line stereoscopically to within a disparity near the stereoscopic threshold. To facilitate fixation in the presence of random-dot texture the fixation target was presented on a black circular background of 20 arc min diameter. The criterion disparity was set so low (generally 0.5-0.67 arc min) that the adjustment took 1-2 seconds during which fixation was steady. Lateral movements during fixation were generally small (S.D. 0.15-0.2 deg), and data from trials during which the fixation deviated from the target by more than 1 deg were discarded. Performance in the depth matching task reached the limits of stereoscopic acuity. This indicates that the eyes converged accurately on the target, because stereoscopic acuity is highest only for targets on the horopter and falls off steeply with distance from the horopter (Blakemore, 1970).

To see if the depth of fixation was altered by the disparity of the random-dot patterns we estimated the stimulus-induced vergence movements from the responses of V2 neurons with sharp disparity tuning. Using two-surface RDS stimuli, we recorded the disparity-tuning functions for various disparities of the texture surrounding the fixation target. We then modeled the effect of vergence movements on the neuronal responses under the assumption that the disparity around the fixation target would induce a proportional deviation of vergence, and determined the gain factor of vergence induction that maximized the cross-correlation between the different tuning functions recorded from each neuron (R. von der Heydt and F.T. Qiu, manuscript in preparation). For inducing disparities varying between -10 and +10 arc min the mean estimated gain of vergence induction was 0.03 (S.D. 0.04, range -0.01 to 0.12, N=7). Thus, the estimated gain was very low. There are two possible explanations for this. Either stimulus-induced vergence eye movements are virtually absent, or they occur, but are compensated by neural mechanisms. It has been shown that the responses of V2 cells to a stimulus in the receptive field can be influenced by the disparity of the surrounding region, causing, in some cases, the disparity tuning to shift in the direction of the surround disparity (Thomas et al., 2002). In the extreme, cells may signal the “relative disparity” (the difference between center and surround disparities) rather than the “absolute disparity” of the stimulus in the receptive field. However, to produce gain factors as low as a few percent, as in our estimates, neuronal mechanisms would have to compensate for 97% of the vergence. Cells with nearly complete disparity differencing are rare, though, and the majority of V2 cells show no effect of the disparate surround (Thomas et al., 2002). Thus, the possibility that the cells we tested were all of the differencing kind is extremely unlikely. We conclude that the more likely explanation for the small estimates of the gain of vergence induction is that the animal was able to maintain stable vergence even under conditions in which the disparity of the surrounding texture varied. Based on our estimate, an induced vergence change of 0.03 × 24 arc min = 0.72 arc min would be expected for the largest surround disparity used. Changes this small can probably be neglected.
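The gain estimate described above can be illustrated with a small simulation. The sketch below is not the authors' procedure (which is cited as a manuscript in preparation); it is a minimal reconstruction under stated assumptions: a hypothetical Gaussian tuning curve stands in for recorded data, the simulated gain `TRUE_GAIN` is arbitrary, a surround disparity s is assumed to shift the effective stimulus disparity by gain × s, and the gain is found by a grid search that maximizes the mean pairwise Pearson correlation of the realigned curves. All function names are invented for this example.

```python
import math

TRUE_GAIN = 0.1                            # arbitrary gain used to simulate data
SURROUNDS = [-10, 0, 10]                   # surround disparities (arc min)
NOMINAL = list(range(-20, 21))             # nominal stimulus disparities (arc min)
GRID = [k * 0.5 for k in range(-20, 21)]   # common axis for comparing curves


def tuning(d, pref=0.0, width=6.0):
    """Hypothetical Gaussian disparity tuning; stands in for recorded data."""
    return math.exp(-0.5 * ((d - pref) / width) ** 2)


def simulate(surround):
    """Responses when vergence follows the surround with gain TRUE_GAIN."""
    return [tuning(d - TRUE_GAIN * surround) for d in NOMINAL]


def interp(x, xs, ys):
    """Linear interpolation of the samples (xs, ys) at x; xs is ascending."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x outside sampled range")


def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def alignment_score(gain, curves):
    """Mean pairwise correlation after undoing a shift of gain * surround."""
    resampled = []
    for surround, responses in curves.items():
        xs = [d - gain * surround for d in NOMINAL]
        resampled.append([interp(x, xs, responses) for x in GRID])
    pairs = [(i, j) for i in range(len(resampled))
             for j in range(i + 1, len(resampled))]
    return sum(pearson(resampled[i], resampled[j]) for i, j in pairs) / len(pairs)


def estimate_gain(curves, candidates):
    """Grid search for the gain that best aligns the tuning curves."""
    return max(candidates, key=lambda g: alignment_score(g, curves))


curves = {s: simulate(s) for s in SURROUNDS}
candidates = [i / 100 for i in range(-10, 51)]
best_gain = estimate_gain(curves, candidates)
print("estimated gain:", best_gain)

# Arithmetic from the text: expected vergence change for the largest surround.
print("expected change (arc min):", round(0.03 * 24, 2))
```

On noise-free simulated data the grid search recovers the simulated gain. With real recordings the correlation peak would be broader, which is one reason the reported estimates vary between neurons.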

Visual stimuli and test procedures

Stimuli were generated on a Silicon Graphics O2 workstation using the anti-aliasing feature of the Open Inventor software, and presented on a Barco CCID 121 FS color monitor with a 72 Hz refresh rate. Stereoscopic pairs were presented side-by-side and superimposed optically at 40 cm viewing distance. The optical system could be switched between magnifications of 0.74 and 1.56 arc min/pixel, providing fields of 8 by 12, and 17 by 26 deg visual angle, respectively. Stationary bars were used to determine the color preference, and bars and drifting gratings to map the ‘minimum response field’ (the minimum region outside which the stimulus does not evoke a response; Barlow et al., 1967) of each cell. Orientation and disparity tuning curves were recorded using moving bars. The bars were presented on a neutral background of 16 cd/m² luminance. Subsequently, an edge of a square figure (3-8 deg) was centered on the minimum response field at the preferred orientation. For contrast figures, the preferred color and gray (16 cd/m²) were used for figure and surround (Zhou et al., 2000). The preferred color could be chromatic or achromatic, and white was used in the absence of color selectivity. In general there was a luminance contrast between figure and surround. Stereoscopic figures were generated by means of dynamic random-dot stereograms (RDS; Julesz, 1960) using randomly positioned white dots (53 cd/m²) on gray (16 cd/m²) with 2 or 6 arc min dot size, 14% coverage, and a pattern renewal frequency of 8 Hz. Stereoscopic (cyclopean) squares and square windows with edges corresponding exactly to the edges of the contrast figures were generated. The preferred disparity (or zero, if the disparity tuning was flat) was used for the ‘near’ plane, while the ‘far’ plane was placed at a distance of 10 or 24 arc min disparity behind the fixation target. In the experiment of Fig. 7, the luminance modulation of the random-dot pattern (Michelson contrast=0.3) was applied to the colors of figure and surround.
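The stimulus geometry above can be made concrete with a short sketch. It is a simplified illustration, not the Open Inventor implementation used in the study: the function names, the sign convention (positive disparity means ‘near’, produced by shifting dots in opposite directions in the two half-images), and the example field size are assumptions. Dot overlap is ignored when converting coverage to a dot count, and the monocular gaps that appear at the vertical edges of a shifted region are not filled in, which a real stimulus generator would need to handle.

```python
import random


def field_pixels(deg, arcmin_per_pixel):
    """Number of pixels spanning a visual angle given in degrees."""
    return round(deg * 60 / arcmin_per_pixel)


def frames_per_pattern(refresh_hz=72, renewal_hz=8):
    """Monitor frames for which each random-dot pattern is shown."""
    return refresh_hz // renewal_hz


def make_rds_pair(width, height, square, disparity_px, dot_px, coverage, rng):
    """Return (left, right) dot positions for one stereogram.

    square is (x0, y0, x1, y1) in pixels. Dots inside it receive a total
    horizontal disparity of disparity_px, split between the two eyes.
    """
    x0, y0, x1, y1 = square
    n_dots = round(coverage * width * height / dot_px ** 2)  # overlap ignored
    left, right = [], []
    for _ in range(n_dots):
        x, y = rng.uniform(0, width), rng.uniform(0, height)
        inside = x0 <= x < x1 and y0 <= y < y1
        shift = disparity_px / 2 if inside else 0.0
        left.append((x + shift, y))    # left-eye image shifted right
        right.append((x - shift, y))   # right-eye image shifted left
    return left, right


# Field sizes implied by the two magnifications given in the text.
print(field_pixels(8, 0.74), field_pixels(12, 0.74))
print(field_pixels(17, 1.56), field_pixels(26, 1.56))
print(frames_per_pattern())

# Small example field (hypothetical size) with a central cyclopean square.
rng = random.Random(0)
left, right = make_rds_pair(200, 200, (50, 50, 150, 150),
                            disparity_px=4.0, dot_px=3.0, coverage=0.14, rng=rng)
```

Both magnifications correspond to a half-image of roughly 650 by 1000 pixels, and an 8 Hz renewal rate on a 72 Hz monitor means each pattern is shown for 9 frames. For a dynamic RDS, `make_rds_pair` would be called again with fresh random positions at each renewal.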

Experimental design and data analysis

For the experiment of Fig. 2, four displays as shown in Fig. 2A plus four displays with reversed contrast were tested. For the experiment of Fig. 3, cyclopean squares, 3 deg on a side, were generated using dynamic RDS. The depth of the square was set to the optimum disparity for the cell under study, or zero, if there was no clear disparity tuning. Each of the four sides of the square was placed in the receptive field at seven positions spaced 0.167 deg, in random order. For the experiment of Figs. 4-6, four displays of contrast-defined figures, as shown in Fig. 4A-D, and four RDS portraying cyclopean squares and windows at the same positions as the contrast squares were tested. In the experiment of Figs. 7-8, the figures were defined by both contrast and disparity, and stimuli consisted of four displays as illustrated in Fig. 7, plus four displays with reversed contrast. In each experiment, all stimuli were presented four times in random order. Analysis was based on the spike counts during 800 ms after stimulus onset. Cells with contrast edge responses <4 spikes/second were excluded because we felt that our stimuli were not adequate to drive these cells (23%). Selectivity was assessed by analysis of variance (ANOVA) performed on the square-root transformed spike counts, using a significance criterion of p<0.05. For the experiment of Fig. 2, a 3-factor ANOVA was performed (factors: side-of-figure, edge contrast polarity, and size). For the experiment of Figs. 4-6, two separate 2-factor ANOVAs were performed, one for the contrast figure data (factors: side-of-figure, edge contrast polarity), and one for the RDS data (factors: depth order, side-of-figure). To calculate the modulation index for side-of-figure (Fig. 6, vertical axis) the responses to the two contrast polarities were averaged; to calculate the modulation index for depth order (Fig. 6, horizontal axis) the responses to squares and windows were averaged. No subtraction was made for spontaneous activity (which would have exaggerated the modulation indices). The data from the experiment of Fig. 7 were analyzed by 3-way ANOVA (factors: depth order, side-of-figure, and edge contrast polarity).
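The bookkeeping behind the side-of-figure index can be sketched as follows. This is a hedged illustration rather than the authors' code: the formula for the modulation index is not given in this section, so the normalized difference (R1 - R2)/(R1 + R2) used below is an assumption, and the spike counts, the spontaneous rate, and all function names are invented. The sketch also shows why subtracting spontaneous activity would inflate the index, as stated in the text.

```python
import math

WINDOW_S = 0.8          # analysis window after stimulus onset (800 ms)
MIN_RATE = 4.0          # exclusion criterion from the text (spikes/s)


def mean_rate(spike_counts):
    """Mean firing rate (spikes/s) over repeated presentations."""
    return sum(spike_counts) / len(spike_counts) / WINDOW_S


def sqrt_transform(spike_counts):
    """Square-root transform applied to counts before the ANOVA."""
    return [math.sqrt(c) for c in spike_counts]


def modulation_index(r_a, r_b):
    """Assumed form of the index: normalized response difference."""
    return (r_a - r_b) / (r_a + r_b)


# Made-up spike counts: four presentations per display, keyed by
# (side of figure, edge contrast polarity).
counts = {
    ("left", "pol1"): [24, 28, 22, 26],
    ("left", "pol2"): [20, 24, 22, 22],
    ("right", "pol1"): [12, 10, 14, 12],
    ("right", "pol2"): [10, 12, 10, 12],
}

# Average the two contrast polarities for each side of figure.
rate_left = (mean_rate(counts[("left", "pol1")])
             + mean_rate(counts[("left", "pol2")])) / 2
rate_right = (mean_rate(counts[("right", "pol1")])
              + mean_rate(counts[("right", "pol2")])) / 2

cell_included = max(rate_left, rate_right) >= MIN_RATE
index_raw = modulation_index(rate_left, rate_right)

# Subtracting a (hypothetical) spontaneous rate shrinks the denominator
# but not the numerator, so the index grows.
spontaneous = 5.0
index_subtracted = modulation_index(rate_left - spontaneous,
                                    rate_right - spontaneous)

print(cell_included, round(index_raw, 3), round(index_subtracted, 3))
```

The significance test itself (ANOVA on the transformed counts) is left to a statistics package, since computing p-values requires the F distribution.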

Acknowledgments

We thank Ofelia Garalde for technical assistance and Todd J. Macuda for his participation in some of the experiments.

Footnotes

This work was supported by National Institutes of Health Grants EY-02966 and NS-38034.

References

Albright, TD; Stoner, GR. Contextual influences on visual processing. Annu. Rev. Neurosci. 2002;25:339–379.
Bakin, JS; Nakayama, K; Gilbert, CD. Visual responses in monkey areas V1 and V2 to three-dimensional surface configurations. J. Neurosci. 2000;20:8188–8198.
Barlow, HB; Blakemore, C; Pettigrew, JD. The neural mechanism of binocular depth discrimination. J. Physiol. (Lond). 1967;193:327–342.
Baumann, R; van der Zwan, R; Peterhans, E. Figure-ground segregation at contours: a neural mechanism in the visual cortex of the alert monkey. Eur. J. Neurosci. 1997;9:1290–1303.
Blakemore, C. The range and scope of binocular depth discrimination in man. J. Physiol. 1970;211:599–622.
Bullier, J. Integrated model of visual processing. Brain Res. Rev. 2001;36:96–107.
Craft, E; Schuetze, H; Niebur, E; von der Heydt, R. Neural mechanisms of border ownership representation: a computational model. Neuron. 2004
Driver, J; Baylis, GC. Edge-assignment and figure-ground segmentation in short-term visual matching. Cogn. Psychol. 1996;31:248–306.
Duncan, RO; Albright, TD; Stoner, GR. Occlusion and the interpretation of visual motion: perceptual and neuronal effects of context. J. Neurosci. 2000;20:5885–5897.
Freeman, RD; Ohzawa, I; Walker, G. Beyond the classical receptive field in the visual cortex. Prog. Brain Res. 2001;134:157–170.
Gattass, R; Gross, CG; Sandell, JH. Visual topography of V2 in the macaque. J. Comp. Neurol. 1981;201:519–539.
Gregory, RL; Harris, JP. Illusory contours and stereo depth. Percept. Psychophys. 1974;15:411–416.
He, ZJ; Nakayama, K. Surfaces versus features in visual search. Nature. 1992;359:231–233.
He, ZJ; Nakayama, K. Visual attention to surfaces in three-dimensional space. Proc. Natl. Acad. Sci. U. S. A. 1995;92:11155–11159.
Heider, B; Spillmann, L; Peterhans, E. Stereoscopic illusory contours--cortical neuron responses and human perception. J. Cogn. Neurosci. 2002;14:1018–1029.
Hupe, JM; James, AC; Girard, P; Lomber, SG; Payne, BR; Bullier, J. Feedback connections act on the early part of the responses in monkey visual cortex. J. Neurophysiol. 2001;85:134–145.
Idesawa, M. Perception of 3-D illusory surface with binocular viewing. Jpn. J. Applied Physics. 1991;30:751–754.
Jones, HE; Grieve, KL; Wang, W; Sillito, AM. Surround suppression in primate V1. J. Neurophysiol. 2001;86:2011–2028.
Julesz, B. Binocular depth perception of computer-generated patterns. Bell System Technical Journal. 1960;39:1125–1161.
Julesz, B. Foundations of Cyclopean Perception. University of Chicago Press; Chicago: 1971.
Kaplan, GA. Kinetic disruption of optical texture: the perception of depth at an edge. Percept. Psychophys. 1969;4:193–198.
Koffka, K. Principles of Gestalt Psychology. Harcourt, Brace and World; New York: 1935.
Lamme, VAF. The neurophysiology of figure-ground segregation in primary visual cortex. J. Neurosci. 1995;15:1605–1615.
Lee, TS; Mumford, D; Romero, R; Lamme, VAF. The role of the primary visual cortex in higher level vision. Vision Res. 1998;38:2429–2454.
Levitt, JB; Lund, JS. The spatial extent over which neurons in macaque striate cortex pool visual signals. Vis. Neurosci. 2002;19:439–452.
Nakayama, K; He, ZJ; Shimojo, S. Visual surface representation: a critical link between lower-level and higher-level vision. In: Kosslyn SM, Osherson DN, editors. Invitation to Cognitive Science. MIT; Cambridge, MA: 1995. pp. 1–70.
Nakayama, K; Shimojo, S; Ramachandran, VS. Transparency: relation to depth, subjective contours, luminance and neon color spreading. Perception. 1990;19:497–513.
Nakayama, K; Shimojo, S; Silverman, GH. Stereoscopic depth: its relation to image segmentation, grouping, and the recognition of occluded objects. Perception. 1989;18:55–68.
Poggio, GF; Motter, BC; Squatrito, S; Trotter, Y. Responses of neurons in visual cortex (V1 and V2) of the alert macaque to dynamic random-dot stereograms. Vision Res. 1985;25:397–406.
Rensink, RA; Enns, JT. Early completion of occluded objects. Vision Res. 1998;38:2489–2505.
Rossi, AF; Desimone, R; Ungerleider, LG. Contextual modulation in primary visual cortex of macaques. J. Neurosci. 2001;21:1698–1709.
Rubin, E. Visuell wahrgenommene Figuren. Gyldendal; Copenhagen: 1921.
Rubin, E. Figure and ground. In: Yantis S, editor. Visual Perception: Essential Readings. Psychology Press; Philadelphia: 2001. pp. 225–229.
Schuetze, H; Niebur, E; von der Heydt, R. Modeling cortical mechanisms of border ownership coding. J. Vision. 2003;3/9:114.
Shimojo, S; Silverman, GH; Nakayama, K. Occlusion and the solution to the aperture problem for motion. Vision Res. 1989;29:619–626.
Spillmann, L; Ehrenstein, WH. Gestalt factors in the visual neurosciences. In: Chalupa LM, Werner JS, editors. The Visual Neurosciences. MIT Press; Cambridge, MA: 2003.
Sugita, Y. Grouping of image fragments in primary visual cortex. Nature. 1999;401:269–272.
Thomas, OM; Cumming, BG; Parker, AJ. A specialization for relative disparity in V2. Nat. Neurosci. 2002;5:472–478.
Van Essen, DC; Zeki, SM. The topographic organization of rhesus monkey prestriate cortex. J. Physiol. (Lond). 1978;277:193–226.
von der Heydt, R; Heitger, F; Peterhans, E. Perception of occluding contours: Neural mechanisms and a computational model. Biomed. Res. 1993;14(suppl 4):1–6.
von der Heydt, R; Qiu, FT; He, ZJ. Neural mechanisms in border ownership assignment: motion parallax and gestalt cues. J. Vision. 2003;3/9:666.
von der Heydt, R; Sugihara, T; Qiu, FT. Border ownership and attentional modulation in neurons of the visual cortex. Perception. 2004;33:46.
von der Heydt, R; Zhou, H; Friedman, HS. Representation of stereoscopic edges in monkey visual cortex. Vision Res. 2000;40:1955–1967.
Wertheimer, M. Untersuchungen zur Lehre von der Gestalt II. Psychol. Forsch. 1923;4:301–350.
Wertheimer, M. Laws of Organization in Perceptual Forms. In: Yantis S, editor. Visual Perception: Essential Readings. Psychology Press; Philadelphia: 2001. pp. 216–224.
Yonas, A; Craton, LG; Thompson, WB. Relative motion: Kinetic information for the order of depth at an edge. Percept. Psychophys. 1987;41:53–59.
Zhou, H; Friedman, HS; von der Heydt, R. Coding of border ownership in monkey visual cortex. J. Neurosci. 2000;20:6594–6611.
Zipser, K; Lamme, VAF; Schiller, PH. Contextual modulation in primary visual cortex. J. Neurosci. 1996;16:7376–7389.