Dead Rectangles as a Stimulus for Perceptual Organisation Research

We describe dead rectangles, a new stimulus class for research on perceptual organisation. These stimuli are generated by randomly placing rectangles inside an image window, which allows for occlusion, surface properties and ambiguous stimuli. To show their utility we asked 14 observers to judge whether two points in a dead rectangle stimulus belong to the same rectangle or not. Observers were around 70% correct and judged the points to belong to the same rectangle considerably more often than was actually the case. Moreover, some pairs of points were consistently judged to belong to the same rectangle although they did not. The possible decompositions of one stimulus into rectangles form a tree, which can be used for formal analysis. These stimuli may allow research on perceptual organisation to move on to more natural conditions, while maintaining experimental control and a rigorous mathematical framework.


Introduction
Perceptual organisation is the process by which humans segment their percept into objects by grouping smaller parts. This process is effortless and leads to highly accurate segmentation under natural conditions. Perceptual organisation is central to perception, as it affects many other aspects of perception such as attention (Soto & Blanco, 2004) and potentially even low-level processes like pooling and normalisation (Neri, 2011; Coen-Cagli, Kohn, & Schwartz, 2015). Researchers have therefore studied perceptual organisation since the Gestalt psychologists at the beginning of the 20th century (Koffka, 1935). These researchers asked subjects in which of two directions they would group simple stimuli on a grid, which led to a list of factors influencing grouping, such as proximity or similarity. However, most of these stimulus groups are not perceived as single objects, and it remains unclear whether and how the grouping principles apply to natural stimuli and how multiple objects interact.
We believe that one obstacle for perceptual organisation research is the lack of stimuli and of an experimental paradigm that allow deeper insights into the problem. Most research on grouping and perceptual organisation is done using highly simplified stimuli like dots (Feldman & Singh, 2006) or Gabor patches (Field, Hayes, & Hess, 1993). These stimuli lack important aspects of natural scenes like surfaces, textures and occlusions, which may be highly relevant for perceptual organisation. As a new artificial stimulus we here suggest dead rectangles, which consist of randomly placed, stacked rectangles of various sizes. These stimuli are a special case of dead-leaves stimuli, which are generated by randomly placing simple shapes and are named after foliage on the forest floor (Lee, Mumford, & Huang, 2001; Gousseau & Roueff, 2007; Pitkow, 2010; Zoran & Weiss, 2012). We display an example dead rectangles stimulus in Figure 1.
Dead rectangles are still artificial stimuli, but they add important aspects that previously used stimuli do not possess: Dead rectangles have surfaces and can thus occlude each other, which is possibly the most important visual interaction of objects. Dead rectangles stimuli contain many objects, mimicking natural conditions. Dead rectangles can mimic natural image statistics like edge frequencies, as all dead-leaves stimuli can (Gousseau & Roueff, 2007; Pitkow, 2010). [...] the percept of object-hood, as we show in Figure 2. This true ambiguity motivated us to use rectangles instead of the more commonly used ellipses and circles, for which a tiny part of the contour is sufficient to infer the whole shape.
As artificial stimuli, dead rectangles also have a number of advantages over natural stimuli: We can control and vary any aspect of them. We know the true decomposition into objects, which is unambiguously defined. There are no additional cues to object assignments like object classes or correlations among objects. Finally, subjects are less likely to have prior knowledge or expectations about dead rectangles that influence their decisions.

Formal Description
To generate a dead rectangle stimulus we draw rectangles from a discrete set of possible rectangle sizes $\Omega_R$ and place each one uniformly at random over all positions at which any part of the rectangle falls inside the image. We sample rectangles until every pixel in the image is covered by a rectangle and then reverse the stack. This procedure samples from the same distribution as the visible top of an infinitely deep stack of rectangles (Pitkow, 2010).
We draw the rectangle sizes $s_x$ and $s_y$ in the two dimensions independently from a range $[s_{\min}, s_{\max}]$ with a step size $s_{\text{step}}$ and probability $p(s) \propto s^{-\alpha/2}$ with a size exponent $\alpha$. According to earlier analyses this leads to a size invariant distribution with an exponent $\alpha = 3$ (Lee et al., 2001).
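The sampling procedure described so far can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the experiment: the function names, the default of 8 colours, and the uniform colour choice are hypothetical, and the sketch uses the uncorrected size distribution $p(s) \propto s^{-\alpha/2}$. Instead of building the stack and reversing it, it paints each new rectangle only onto pixels that are still uncovered, which yields the same image.

```python
import random


def size_probabilities(s_min, s_max, s_step, alpha):
    """Discrete sizes and normalised probabilities p(s) proportional to s^(-alpha/2)."""
    sizes = list(range(s_min, s_max + 1, s_step))
    weights = [s ** (-alpha / 2) for s in sizes]
    total = sum(weights)
    return sizes, [w / total for w in weights]


def dead_rectangles(d, s_min=5, s_max=400, s_step=5, alpha=3.0,
                    n_colours=8, rng=None):
    """Sample a d x d stimulus.

    Returns (image, owner): colour indices and, per pixel, the index of the
    visible rectangle (0 = top of the stack). n_colours is an assumption.
    """
    rng = rng or random.Random()
    sizes, probs = size_probabilities(s_min, s_max, s_step, alpha)
    image = [[None] * d for _ in range(d)]
    owner = [[None] * d for _ in range(d)]
    uncovered = d * d
    rect_id = 0
    while uncovered > 0:
        # Sizes in the two dimensions are drawn independently.
        sx = rng.choices(sizes, probs)[0]
        sy = rng.choices(sizes, probs)[0]
        # Top-left corner uniform over all positions that overlap the window.
        x0 = rng.randint(1 - sx, d - 1)
        y0 = rng.randint(1 - sy, d - 1)
        colour = rng.randrange(n_colours)
        # Earlier rectangles lie on top, so only uncovered pixels are painted.
        for y in range(max(y0, 0), min(y0 + sy, d)):
            for x in range(max(x0, 0), min(x0 + sx, d)):
                if image[y][x] is None:
                    image[y][x] = colour
                    owner[y][x] = rect_id
                    uncovered -= 1
        rect_id += 1
    return image, owner
```

The `owner` map gives the ground-truth answer for any pair of query points: they belong to the same rectangle exactly when their entries are equal.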
We draw rectangle positions from a differently sized area for each rectangle size, but want the distribution of rectangles to be independent of the image size $d$. To adjust for this we use a corrected distribution $p'(s)$ for the size of a rectangle: [...] (1)
Figure 3: Histogram of stimuli showing how many pairs of points were judged correctly and how many were judged to be on the same rectangle by how many subjects.

Behavioural Experiment

Methods
In our behavioural experiment we asked 14 subjects whether two marked points in a dead rectangle stimulus belonged to the same rectangle or not. Subjects were recruited from the NYU campus community, were naive to the aim of the study and gave informed consent.
We pre-generated 2000 dead rectangles stimuli of 300 × 300 px for this experiment, which were shown as up-sampled 900 × 900 px images centrally on a screen with a resolution of 1280 × 960 pixels until the subject responded, but for at most 5 seconds. We drew sizes from the range [5 px, 400 px] with a step size of $s_{\text{step}}$ = 5 px. We varied the exponent $\alpha$ of the size distribution from 1 to 5 and chose the queried points at five different distances symmetrically around the image centre, with the points displaced horizontally or vertically (5, 10, 20, 40 or 80 px apart) or along either diagonal (4, 7, 14, 28 or 57 px apart), resulting in 4 orientation conditions. Thus, we had 5 × 5 × 4 = 100 conditions in total, allowing for 20 different stimuli per condition.
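The condition count can be checked by enumerating the design. The sketch below is only an illustration: the integer exponents 1 to 5 and the orientation labels are assumptions. It also shows that the listed diagonal separations equal the horizontal and vertical distances divided by √2 and rounded, which suggests (as an inference, not a statement from the text) that the diagonal values are per-axis offsets chosen to approximately match the Euclidean distances.

```python
import math
from itertools import product

exponents = [1, 2, 3, 4, 5]            # assumed integer steps from 1 to 5
distances = [5, 10, 20, 40, 80]        # px, horizontal and vertical conditions
orientations = ["horizontal", "vertical", "diagonal_down", "diagonal_up"]


def per_axis_offset(distance, orientation):
    """Per-axis separation in px; for diagonals, distance / sqrt(2), rounded."""
    if orientation.startswith("diagonal"):
        return round(distance / math.sqrt(2))
    return distance


conditions = list(product(exponents, distances, orientations))
diagonal = [per_axis_offset(dist, "diagonal_down") for dist in distances]
print(len(conditions), len(conditions) * 20, diagonal)
# prints: 100 2000 [4, 7, 14, 28, 57]
```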
We blocked the trials by distribution exponent to keep the image distribution constant within a block. Within a block, trials were shown in a new random order for each subject. Each block was preceded by a 500-frame movie showing the image generation with the distribution for that block, starting from a blue background image and adding one rectangle every frame. We split the experiment into two sessions, each with 5 blocks of 200 trials, one block per exponent. At the beginning of each session we added a training session of 100 trials, which showed a single trial for each condition with a stimulus not used in the main experiment, while keeping the separation into the 5 blocks and the movies.

Results
Subjects responded reasonably consistently in our task, as shown by clearly bimodal distributions for how many subjects judged a pair of points to fall on the same rectangle and for how many subjects were correct (see Figure 3).
All subjects performed clearly above chance and their responses followed the main trends in the true probability for two points to lie on the same rectangle. That is, both the true probability that two points belong to the same rectangle and the subjects' judgements decrease with increasing distance between the two points and with increasing size exponent (smaller rectangles), as displayed in Figure 4.
However, subjects generally reported the two points as belonging to the same rectangle considerably more often than was actually the case. This trend is consistent across all conditions. Interestingly, this effect is also seen in the consistent errors subjects made. In our dataset 93 stimuli were judged wrongly by all subjects and 112 were judged wrongly by 13 of 14 subjects. These 205 point pairs were all judged to belong to the same rectangle despite belonging to different ones.

Discussion
Our subjects were able to do the task with little instruction and produced reasonably consistent results. Nonetheless, they made a relatively large number of errors, which makes these data interesting for modelling, as simply predicting the correct response will not be the best model. Finally, they showed a peculiar tendency to make consistent errors in one direction only, which is an interesting qualitative pattern that future models should explain.

Graph Description & Tree Search
For formal descriptions, optimal inference, and models of human behaviour it is helpful to consider the set of all possible decompositions of a dead rectangle stimulus into rectangles as a tree, which starts from an empty image as the root and adds one rectangle at every step, starting from the top of the stack. If we list all possible visible rectangles as options at each node, each path from the root to a leaf of the tree represents one decomposition and the tree as a whole contains all possible decompositions. We can generate this tree recursively by first searching for all possible sizes and positions of the top rectangle in the stimulus. For the next rectangle we then treat the area behind the first rectangle as unconstrained and find all possibilities for the second rectangle. As we exclude the possible but invisible rectangle positions behind the already placed rectangles, this recursion will eventually end.
From our construction we can also deduce the prior probability for each branch of the tree and thus for each decomposition into rectangles. This probability is the product of the probabilities for each branch taken in the tree, where the probability for a branch is calculated by normalising the probability to place the rectangle at that position such that the probabilities for all visible rectangles (consistent and inconsistent) add to 1. To compute the posterior over possible decompositions of a dead rectangle stimulus we need to find all decompositions consistent with the image and weigh them with their prior probability times the likelihood to produce the observed pixel colours. To find the consistent decompositions we can determine, at every node of the tree, which additional rectangles are consistent with the image and restrict our search for possible decompositions to these options. The likelihood to produce the colours we observe is simply $1/N^{l}$, where $N$ is the number of colours and $l$ is the number of rectangles in the decomposition.
This allows us to compute the optimal observer for the task in our behavioural experiment by summing the posterior probabilities of all decompositions in which the two points are on the same rectangle and, separately, of those in which the two are on different rectangles. These two sums are the posterior probabilities for the two cases.
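Written as an equation, the optimal observer takes the following form. The notation is introduced here for illustration and is not from the original text: $I$ is the image, $\mathcal{C}(I)$ the set of decompositions consistent with $I$, $\mathcal{S}(I) \subseteq \mathcal{C}(I)$ the subset in which both query points lie on the same rectangle, $P(D)$ the prior probability of decomposition $D$ from the tree, and $l(D)$ its number of rectangles.

```latex
P(\text{same} \mid I)
  = \frac{\sum_{D \in \mathcal{S}(I)} P(D)\, N^{-l(D)}}
         {\sum_{D \in \mathcal{C}(I)} P(D)\, N^{-l(D)}},
\qquad
P(\text{different} \mid I) = 1 - P(\text{same} \mid I)
```

Inconsistent decompositions have likelihood zero and therefore drop out of both sums.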
We illustrate the tree of possible rectangle decompositions with the prior probabilities in Figure 5 for a minimal example of a 2 × 1 px image with only two possible rectangles (2 × 1 and 1 × 1) with probabilities $p = 1/3$ and $2/3$, respectively. First, we find all possible rectangles which could be on top of the stack, which in this case are 5: three positions for the two-pixel rectangle, which each have probability $1/3 \cdot 1/3 = 1/9$, and two positions for the one-pixel rectangle, which each have probability $1/2 \cdot 2/3 = 1/3$. One of the positions of the two-pixel rectangle fills the whole image and is thus a leaf. For all other positions there is one pixel left to be covered. For these situations we find that three of the five rectangle positions remain visible: two positions of the two-pixel rectangle with probability $1/9$ each and one position of the one-pixel rectangle with probability $1/3$. Re-normalising these values gives the $1/5$ and $3/5$ probabilities in the figure.
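The minimal example can be reproduced by enumerating the tree directly. The following sketch is restricted to one-dimensional images for brevity and uses hypothetical function names; it uses exact fractions so that the branch probabilities can be compared with the values above.

```python
from fractions import Fraction


def placements(width, size_probs):
    """All (size, start, probability) triples that overlap a 1-D window."""
    out = []
    for size, p in size_probs.items():
        n_pos = width + size - 1          # positions with any part visible
        for start in range(1 - size, width):
            out.append((size, start, Fraction(p) / n_pos))
    return out


def decompositions(width, size_probs):
    """Yield (prior, owner) for every root-to-leaf path of the tree.

    owner[i] is the stack index (0 = top) of the rectangle seen at pixel i.
    """
    options = placements(width, size_probs)

    def recurse(owner, depth, prior):
        free = [i for i, o in enumerate(owner) if o is None]
        if not free:                       # image fully covered: leaf
            yield prior, tuple(owner)
            return
        # Keep only rectangles that would still be visible somewhere.
        visible = [(s, x, p) for s, x, p in options
                   if any(x <= i < x + s for i in free)]
        norm = sum(p for _, _, p in visible)
        for s, x, p in visible:
            new_owner = [depth if o is None and x <= i < x + s else o
                         for i, o in enumerate(owner)]
            yield from recurse(new_owner, depth + 1, prior * p / norm)

    yield from recurse([None] * width, 0, Fraction(1))


paths = list(decompositions(2, {2: Fraction(1, 3), 1: Fraction(2, 3)}))
p_same = sum(prior for prior, owner in paths if owner[0] == owner[1])
print(len(paths), p_same, sum(prior for prior, _ in paths))
# prints: 13 1/9 1
```

The 13 paths are the single full-cover leaf plus three continuations for each of the four partial top rectangles, and the only path with both pixels on one rectangle has prior 1/9.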
Thus, the prior probability for the two pixels to end up on the same rectangle is $1/9$, the probability of the single decomposition where both pixels are on the same rectangle. To calculate the posterior over the decompositions we need the additional information that there are $N$ possible colours. For example, if the two pixels have the same colour, the probability that the one decomposition with both pixels on one rectangle creates the image is $1/N$, while the probability for all others is $1/N^2$. The other decompositions all consist of two rectangles and together have prior probability $8/9$. Thus, the posterior probability for the two pixels to be on the same rectangle given the image $I$ is:
$$P(\text{same} \mid I) = \frac{\frac{1}{9} \cdot \frac{1}{N}}{\frac{1}{9} \cdot \frac{1}{N} + \frac{8}{9} \cdot \frac{1}{N^2}} = \frac{N}{N + 8}$$
As the search for possible rectangle positions can be implemented as a single convolution and thresholding for each rectangle shape and colour, paths through the graph can be computed relatively efficiently for moderately sized images.
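One way to realise the search for consistent rectangle positions as a convolution followed by a threshold is sketched below. It is an assumption about how such a search could look, not the authors' implementation; the function name and the scoring scheme are hypothetical. Each unexplained pixel of the candidate colour scores +1, each unexplained pixel of another colour scores a penalty larger than the rectangle area, and already explained pixels score 0. Summing the scores under the rectangle footprint is a correlation with an all-ones kernel, and a position is consistent exactly when that sum is positive. Plain loops stand in for the convolution to keep the sketch dependency-free.

```python
def consistent_positions(image, explained, colour, height, width):
    """Top-left corners (y0, x0) at which a height x width rectangle of the
    given colour could be the next rectangle in the stack.

    image:     2-D list of colour indices
    explained: 2-D list of bools, True where a higher rectangle is visible
    """
    rows, cols = len(image), len(image[0])
    penalty = height * width  # outweighs any number of matching pixels
    score = [[0 if explained[y][x]
              else (1 if image[y][x] == colour else -penalty)
              for x in range(cols)] for y in range(rows)]
    hits = []
    for y0 in range(1 - height, rows):
        for x0 in range(1 - width, cols):
            # Box-filter response at this offset (zero padding outside).
            total = sum(score[y][x]
                        for y in range(max(y0, 0), min(y0 + height, rows))
                        for x in range(max(x0, 0), min(x0 + width, cols)))
            if total > 0:  # at least one matching pixel and no conflict
                hits.append((y0, x0))
    return hits
```

For the 2 × 1 px example with the left pixel already explained, calling `consistent_positions([[0, 1]], [[True, False]], 1, 1, 2)` returns the two positions of the two-pixel rectangle that remain visible.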
For larger stimuli, however, the number of decompositions consistent with any particular stimulus is extremely large. For the stimuli we used in our experiment there are typically hundreds of branches from each node and each decomposition consists of hundreds of rectangles. This extremely large space of possibilities renders the exact computation of the optimal observer computationally infeasible. Furthermore, it seems unlikely that our human observers considered all those possibilities in the 1-2 seconds they typically took to answer a trial.
There are, however, many possible approximations to the optimal solution. For example, one can search for high-probability decompositions and base the decision on only a few decompositions. Another possibility is to consider only a subset of the pixels in the image for the decision, which can drastically reduce the depth of the tree.

Conclusion
We present dead rectangle stimuli as a stimulus for research on perceptual organisation. In contrast to previously used stimuli, dead rectangle stimuli are proper images, allow occlusion among the objects in the stimulus and can be ambiguous. Nonetheless, in contrast to natural images, we know the true solution for each stimulus and can compute the distribution over stimuli. Furthermore, subjects in our experiment could quickly and consistently judge the stimuli, while making errors and consistently judging about 10% of the point pairs to belong to the same rectangle although they did not. Thus, dead rectangles seem to be interesting stimuli for further experiments and modelling endeavours.