Local contour symmetry facilitates the neural representation of scene categories in the PPA

Human observers can rapidly classify real-world scenes into their natural categories (e.g., beaches, mountains). It is unclear what neural mechanisms underlie this rapid processing of scenes. In a previous behavioral study, we demonstrated that local ribbon symmetry facilitates scene classification. Here we manipulate the ribbon symmetry content of line drawings of real-world scenes and then decode scene categories from observers' voxel activity patterns obtained via fMRI. Scene categories can be decoded from the parahippocampal place area (PPA) more accurately for symmetric scenes than for asymmetric scenes. In earlier visual areas the decoding accuracy for symmetric and asymmetric scenes was not significantly different. This suggests that the benefit for symmetric scenes in both behavior and fMRI is not solely driven by a lower-level preference for symmetry. Instead, ribbon symmetry may be uniquely informative for scene categorization.


Introduction
As soon as a person opens their eyes, their visual experience is that of a cohesive world, composed of individual objects and surfaces. The human visual system must transform low-level image properties into conscious percepts of objects and continuous surfaces. Visual processing takes the responses of individual photoreceptors and groups them through a hierarchical process, in which elements are grouped into larger structures and finally into objects and scenes. An observer can classify a scene even when it is presented very briefly (Potter & Levy, 1969; Thorpe, Fize, & Marlot, 1996; VanRullen & Thorpe, 2001). Human observers can also rapidly classify line drawings of real-world scenes, even though these lack the richness of a photograph (Biederman, Mezzanotte, & Rabinowitz, 1982; Walther, Chai, Caddigan, Beck, & Fei-Fei, 2011). The speed at which photographs or line drawings can be classified demonstrates that the grouping process carried out by the visual system occurs very rapidly.
The Gestalt psychologists proposed qualitative grouping rules describing what image features influence this process. While many of these grouping rules have been implemented in concrete, quantitative algorithms, a rigorous algorithm for the cue of symmetry has been lacking. We recently described an algorithm for measuring contour symmetry in a line drawing of a scene (Wilder et al., 2019), and showed that the presence of symmetry greatly facilitates scene categorization.
Here we look for the neural correlates of this local symmetry effect, using fMRI. Previous work on global mirror symmetry as well as axiality of shapes (related to the local symmetry we measure here) has found an influence of symmetry in V3, V4, and LOC. Local contour symmetry is distinct from these previously studied types of symmetry, so it is not clear whether we should expect an effect of symmetry in these areas. The parahippocampal place area has been shown to have activity consistent with behavioral scene classification as well as 3D scene geometry (Ferrara & Park, 2016; Lescroart & Gallant, 2019), so we hypothesize that PPA will be influenced by local symmetry in line drawings of natural scenes. To test this hypothesis we showed intact, symmetric, and asymmetric scenes to observers and decoded scene categories from voxel activity patterns in several ROIs (V1, V2, V3, LOC, PPA, OPA, and RSC).

This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0

Scenes dataset
We showed participants line drawings of six categories of real-world scenes (Walther et al., 2011; Wilder et al., 2019): Beaches, Forests, Mountains, Cities, Highways, and Offices. The line drawings were generated by artists tracing the important lines from a set of color photographs. There are 475 images across all scene categories.

Scoring Symmetry
Our measure of symmetry is based upon the medial axis transform, which re-describes a shape in terms of the central axis of each of its parts rather than the pixels along its boundary. The medial axis marks where in the shape the contour is equidistant on both sides. Details of our method can be found in Wilder et al. (2019). We briefly describe the process here.
We compute symmetry by starting with a line drawing of a scene and taking the Euclidean distance transform, which, for each pixel in the image, measures the distance from that pixel to the nearest contour. From this distance image we can determine the gradient at each point. The medial axis lies at the points where this gradient flows outward. From the distance transform, we also know the distance from a medial axis point to the contour in each direction. We call this the radius function. As we move along the medial axis, we measure how the radius function changes. If there is no change in the radius, we are in a locally ribbon symmetric region, meaning that the contours on either side are parallel, like a curving ribbon of constant width. Before scoring the contours, we must score the axis points. Our score is based on the behavior of the radius function within a local region. We count the number of times the radius changes between neighboring axis points within that local region. If this number is high, we give a low symmetry score, because the contours are curving independently of one another. If the number of radius changes between neighboring axis points in this region is low, we give a high symmetry score. Once an axis point is scored, we find the contour points that it flows to (using the gradient of the distance transform). Those contour points receive the score of their corresponding axis point. As our method finds the medial axis in all white-space regions of an image, there are two skeletons that correspond to each contour point, one on either side. The score we assign to a contour point is the maximum of the two axis scores.
Concretely, we consider a window of $2K + 1$ medial axis points centered at our target medial axis point $Q$. These points are $Q_{-K}, \ldots, Q_{K}$, with $Q_0 = Q$. At medial axis point $Q$, the score $S(Q)$ is computed from the radius values within this window, where $R(Q_i)$ represents the radius value at $Q_i$, and $\tau$ is a marginal threshold.
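The windowed score can be illustrated with a minimal sketch. It follows the verbal description only: it counts how many neighboring pairs of axis points in the window differ in radius by more than a threshold. The function names, the use of `tau` as a threshold on the absolute radius difference, and the normalization to the range 0 to 1 are assumptions made for illustration, not the exact expression from Wilder et al. (2019).

```python
def ribbon_symmetry_score(radii, center, K, tau):
    """Score one medial axis point from the radii in a window of up to 2K+1 points.

    Assumed form (illustrative): 1 minus the fraction of neighboring pairs in
    the window whose radius differs by more than tau. The window is clipped at
    the ends of the axis.
    """
    lo = max(center - K, 0)
    hi = min(center + K, len(radii) - 1)
    pairs = hi - lo  # number of neighboring pairs inside the window
    if pairs == 0:
        return 1.0
    changes = sum(
        1 for i in range(lo, hi) if abs(radii[i + 1] - radii[i]) > tau
    )
    return 1.0 - changes / pairs


def contour_point_score(score_side_a, score_side_b):
    """A contour point takes the larger of the axis scores from its two sides."""
    return max(score_side_a, score_side_b)


# A ribbon of constant width versus an axis whose radius changes at every step.
constant_width = [5.0] * 9
irregular = [5.0, 7.0, 4.0, 8.0, 3.0, 9.0, 2.0, 8.0, 4.0]
print(ribbon_symmetry_score(constant_width, center=4, K=3, tau=0.5))
print(ribbon_symmetry_score(irregular, center=4, K=3, tau=0.5))
```

Under these assumptions a constant-width region scores highest and a region whose radius changes between every pair of neighbors scores lowest, which matches the qualitative behavior described in the text.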

Stimuli
In our study we manipulated the amount of symmetric content in the stimuli we showed to our observers. We used either the original, intact line drawing, or a half-image containing exactly half of the contour pixels of the intact image. To create the half-images we first rank-ordered all contour pixels by their symmetry score. For the Symmetric images, we used the highest-ranking half of the pixels, and for the Asymmetric images, we used the lowest-ranking half. Thus, the symmetric and asymmetric images combine to create the intact image, and they contain an equal number of contour pixels. An example of a scored image, along with the accompanying splits, is shown in Fig. 1.
Each of the 475 images was transformed into three conditions, resulting in 1425 total images that may be shown to participants.
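The half-image construction can be sketched as follows. This is an illustrative sketch, not the authors' code: the function name, the `((row, col), score)` input format, and the handling of ties and odd pixel counts are assumptions.

```python
def split_by_symmetry(scored_pixels):
    """Split contour pixels into a symmetric half and an asymmetric half.

    scored_pixels: list of ((row, col), score) pairs, one per contour pixel.
    Returns (symmetric, asymmetric) lists of pixel coordinates.
    Assumptions: ties keep their input order (sorted() is stable), and with an
    odd pixel count the extra pixel goes to the asymmetric half.
    """
    ranked = sorted(scored_pixels, key=lambda item: item[1], reverse=True)
    half = len(ranked) // 2
    symmetric = [xy for xy, _ in ranked[:half]]
    asymmetric = [xy for xy, _ in ranked[half:]]
    return symmetric, asymmetric


# Toy contour with six scored pixels (scores are made up for illustration).
toy = [((0, 0), 0.9), ((0, 1), 0.1), ((0, 2), 0.8),
       ((1, 0), 0.3), ((1, 1), 0.7), ((1, 2), 0.2)]
symmetric, asymmetric = split_by_symmetry(toy)
print(symmetric)
print(asymmetric)
```

Because every pixel lands in exactly one half, overlaying the two halves reproduces the intact drawing.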

Participants
Twenty-two participants were recruited from University of Toronto Facebook groups, and were paid for their participation. Only 19 participants' data were used because three participants fell asleep during multiple runs of the experiment.

Design and Procedure
We used a block-design fMRI experiment. Participants were scanned in nine runs. Each run contained 18 blocks, one block for each combination of scene category and image condition. In each block, participants were shown 8 scenes from the same category and image condition. Each image was shown for 800 ms, with a 200 ms gap between images. There was a 10 s fixation period prior to the first block and after each block. Participants were instructed to maintain fixation for the entire run, and a fixation mark was always on the screen to aid them.
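As a rough check on the run structure, the sketch below enumerates the 18 category-by-condition blocks and adds up the nominal duration of one run. It assumes that every image, including the last in a block, is followed by a 200 ms gap and that nothing else is added to the run; the block order shown is simply the enumeration order, since the actual ordering is not specified here.

```python
CATEGORIES = ["beaches", "forests", "mountains", "cities", "highways", "offices"]
CONDITIONS = ["intact", "symmetric", "asymmetric"]
IMAGES_PER_BLOCK = 8
IMAGE_MS = 800
GAP_MS = 200
FIXATION_MS = 10_000


def run_schedule():
    """Return the block list, the block duration, and the run duration (ms)."""
    blocks = [(cat, cond) for cat in CATEGORIES for cond in CONDITIONS]
    block_ms = IMAGES_PER_BLOCK * (IMAGE_MS + GAP_MS)
    # One fixation period before the first block, plus one after every block.
    run_ms = FIXATION_MS + len(blocks) * (block_ms + FIXATION_MS)
    return blocks, block_ms, run_ms


blocks, block_ms, run_ms = run_schedule()
print(len(blocks), "blocks;", block_ms / 1000, "s per block;", run_ms / 1000, "s per run")
```

Under these assumptions each block lasts 8 s and a run lasts 334 s, of which 190 s is fixation.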
Separate localizer scans were conducted in order to allow for the determination of areas V1-V3, LOC, PPA, OPA, and RSC, for each individual participant.

Figure 1: Example Office line drawing. The top image (gray border) shows the intact image with the symmetry score, on a color scale from low to high symmetry, overlaid on the contour pixels. After the image is scored, the pixels are rank-ordered; the top half is used to create the symmetric image (in red) and the bottom half is used to create the asymmetric image (in blue). Each of these images contains exactly half of the contour pixels of the intact image, and together they form the intact image.

Multivariate Analysis
Our main analysis is a multivariate analysis in which we train linear support-vector machines (SVMs) to classify scene categories from the voxel activity patterns elicited while participants viewed the scene images. The classifiers were trained using leave-one-run-out cross-validation. Separate classifiers were trained for each image condition.
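The leave-one-run-out scheme can be sketched as below. To keep the example free of external dependencies, a nearest-centroid classifier stands in for the linear SVM; it is a deliberately swapped-in technique, and all function names and the toy data are hypothetical. Any classifier with the same `fit_predict(train_x, train_y, test_x)` shape could be passed in instead.

```python
def leave_one_run_out(runs):
    """Yield (train_indices, test_indices), holding out one run per fold."""
    for held_out in sorted(set(runs)):
        train = [i for i, r in enumerate(runs) if r != held_out]
        test = [i for i, r in enumerate(runs) if r == held_out]
        yield train, test


def nearest_centroid_fit_predict(train_x, train_y, test_x):
    """Stand-in classifier (NOT a linear SVM): pick the closest class mean."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return [min(centroids, key=lambda y: sq_dist(x, centroids[y])) for x in test_x]


def cross_validated_accuracy(patterns, labels, runs, fit_predict):
    """Fraction of held-out patterns classified correctly across all folds."""
    correct = 0
    for train, test in leave_one_run_out(runs):
        predicted = fit_predict(
            [patterns[i] for i in train],
            [labels[i] for i in train],
            [patterns[i] for i in test],
        )
        correct += sum(p == labels[i] for p, i in zip(predicted, test))
    return correct / len(labels)


# Toy data: 3 runs, 2 categories, 2 "voxels" (values are made up).
patterns = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9], [1.1, -0.1], [-0.1, 1.1]]
labels = ["beach", "office", "beach", "office", "beach", "office"]
runs = [1, 1, 2, 2, 3, 3]
print(cross_validated_accuracy(patterns, labels, runs, nearest_centroid_fit_predict))
```

To mirror the design of the analysis, this procedure would be run separately on the patterns from each image condition and each ROI.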
Decoding accuracy was above chance (1/6 for six categories) in all ROIs, for each image condition. We were specifically interested in the relative performance between the symmetric and asymmetric conditions. For each ROI we conducted a paired t-test on the decoding accuracies from the symmetric and asymmetric conditions. We found significantly better performance in the symmetric condition for both PPA and OPA (p < 0.05). In RSC we found significantly better performance in the asymmetric condition (p < 0.05).
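For reference, the paired t statistic used in such a comparison can be computed as in the sketch below. The function name is hypothetical and the input numbers are placeholders chosen for easy arithmetic, not data from this study.

```python
import math


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for matched samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    t = mean / math.sqrt(var / n)
    return t, n - 1


# Placeholder values only; one entry per participant in each condition.
condition_a = [2.0, 4.0, 6.0, 8.0]
condition_b = [1.0, 2.0, 3.0, 4.0]
t, df = paired_t(condition_a, condition_b)
print(t, df)
```

With 19 participants there are 18 degrees of freedom, for which a two-tailed test at the 0.05 level requires an absolute t value of about 2.10.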

Univariate Analysis
We also performed a univariate analysis in each ROI, for each image condition. In all ROIs, the activity for the intact condition was the lowest. In V1, V2, and V3 there was significantly more activity in the asymmetric condition than in the symmetric condition (V1 and V2: p < 0.001; V3: p < 0.05). In the OPA, there was significantly more activity for the symmetric condition than for the asymmetric condition (p < 0.05). No differences were found in LOC, PPA, or RSC.

Discussion
As hypothesized, we found a significant effect of symmetry versus asymmetry in scene-selective ROIs. Previous work has suggested that the axial structure of shapes is represented in V3 (Lescroart & Biederman, 2013). Additionally, single-cell recordings in early visual cortex have shown that some neurons respond strongly when their receptive field is centered on the medial axis (Lee, 1996). Our scoring of symmetry is based upon the medial axis, so these areas could have been involved in processing our stimuli, but we found no significant difference in decoding accuracy there. Our line drawing scene stimuli are quite different from the stimuli of the previous studies and do not isolate single medial axes. This could be why we found no preference for symmetry in these areas. Additionally, we restrict our symmetry measure to ribbon symmetry. While many objects have ribbon symmetry in the real world, perspective foreshortening causes them to taper when projected into the image plane. We may have found no influence of ribbon symmetry in earlier areas that encode object and shape parts because our method does not score tapering regions highly.
In addition to the significant difference in decoding scene categories between symmetric and asymmetric scenes in PPA, we found a similar effect in OPA and the opposite effect in RSC. All three of these areas are involved in processing scenes. Recent studies have characterized how the functions of these areas differ (Dillon, Persichetti, Spelke, & Dilks, 2018). The PPA was reported to be highly activated during categorization but not during a navigation task, while OPA was more active for navigation than for categorization. RSC showed low activity in both tasks. Both PPA and OPA were sensitive to changes in scene layout, but RSC was not (Dillon et al., 2018). Our work is consistent with those findings: manipulating symmetry information, which is related to scene layout, affects both PPA and OPA in a consistent manner, while RSC showed the opposite effect.

Conclusion
Rapid scene classification by human observers is aided by local symmetry. Here we show that scene-selective cortex represents local symmetry in its voxel patterns. Decoding in PPA mirrors categorization behavior: symmetric scenes are easier to classify than asymmetric ones. Ribbon symmetry may be uniquely informative for scene categorization.