A unified account of tilt illusions, association fields, and contour detection based on elastica

As expressed in the Gestalt law of good continuation, human perception tends to associate stimuli that form smooth continuations. Contextual modulation in primary visual cortex, in the form of association fields, is believed to play an important role in this process. Yet a unified and principled account of the good continuation law on the neural level is lacking. In this study we introduce a population model of primary visual cortex. Its contextual interactions depend on the elastica curvature energy of the smoothest contour connecting oriented bars. As expected, this model leads to association fields consistent with data. However, in addition the model displays tilt-illusions for stimulus configurations with grating and single bars that closely match psychophysics. Furthermore, the model explains not only pop-out of contours amid a variety of backgrounds, but also pop-out of single targets amid a uniform background. We thus propose that elastica is a unifying principle of the visual cortical network.


Introduction
The Gestalt psychologists emphasized that human perception should be understood as a whole, rather than as the sum of individual elements. In the context of contour recognition, they proposed the law of good continuation, in which collinear or curvilinear line elements are associated together (Koffka, 1935; Wertheimer, 1923). This principle presumably underlies the ease with which humans extract smooth contours in natural as well as artificial images. The detection of contours does not necessarily require high level vision or receptive fields that encompass the complete contour. Instead, Field, Hayes, and Hess (1993) argued that association fields in early vision boost the response to collinear line elements, Fig. 1A, and that these local interactions are sufficient for contour detection. Computational models have used association fields to explain contour extraction and completion (e.g. Li, 1998, 1999; Tang, Sang, & Zhang, 2007; Williams & Thornber, 2001).
From a functional point of view, association fields emphasize smooth contours. Indeed, in Fig. 1B the red bars are smooth continuations from the black bar, while the blue bars are more tortuous continuations. Smoothness can be quantified using the elastica principle, which measures the bending energy needed to connect two bars with a curve. In computer vision the curve with the lowest energy is better known as a spline (historically, a thin wooden rod used by draftsmen to create smooth curves). Line elements that can be connected with a low energy spline likely belong to the same object contour and are therefore particularly relevant to higher level vision. The elastica curve can also be seen as the maximum likelihood path of a stochastic contour completion process based on drifting particles, with the two bars as source and sink elements (Mumford, 1994;Williams & Jacobs, 1997;Williams & Thornber, 2001).
In this paper we propose that smoothness, as formalized by elastica, is the underlying principle for contextual modulations in V1. We first derive an efficient calculation of the elastica energy between two oriented line elements. We next assume that the contextual modulation in a neural population is determined by this elastica energy. We find that the resulting model (1) has realistic association fields and neural responses, (2) produces various tilt illusions for both small sets of bars and grating-like stimuli, and (3) leads to robust contour extraction, as well as single target pop-out from uniform backgrounds.
http://dx.doi.org/10.1016/j.visres.2015.05.021 0042-6989/© 2015 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
While links between these phenomena and elastica have been shown before individually, from contour extraction (Ernst et al., 2012; Sharon, Brandt, & Basri, 1997) to the tilt illusion (Schwartz, Sejnowski, & Dayan, 2006), there has been considerable variation in the precise implementation and biological realism of these studies. Here we present a unified account that, to our knowledge, is the first to collect such a wide range of phenomena in a single model with elastica at its core. Yet the model is straightforward and biophysically realistic, relying only on independent contextual modulation terms. These results suggest that elastica is a core principle underlying the contextual interactions in V1.

Elastica
In this section we quantify smoothness according to the elastica principle. Extending earlier results, we derive an accurate approximation for the smoothness of a curve connecting two line elements. Consider a scene with two bars, Fig. 2A: a center bar (red) and a flanker bar (blue). We define the positions and orientations as follows. The orientation of the center bar relative to the vertical is $\theta_c$. The flanker is a distance $r_f$ away from the center, and placed at a position which has an angle $\varphi_f$ with the vertical. The orientation of the flanker is given by $\theta_f$. The angles of the center and flanker bar relative to the line connecting them are
$$\beta_c = \theta_c - \varphi_f, \qquad \beta_f = \theta_f - \varphi_f,$$
where the minus signs signify circular differences, so that the angles lie in the interval $[-\pi, \pi]$.
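The geometry above can be sketched in a few lines of Python. This is a minimal illustration, assuming the relative angles are the circular differences between each bar's orientation and the angular position of the flanker; the function names `wrap_angle` and `relative_angles` are hypothetical and not part of the original model code.

```python
import numpy as np

def wrap_angle(x):
    """Wrap an angle (radians) into the interval [-pi, pi)."""
    return (x + np.pi) % (2.0 * np.pi) - np.pi

def relative_angles(theta_c, theta_f, phi_f):
    """Angles of center and flanker relative to the line connecting them.

    theta_c, theta_f: bar orientations relative to the vertical.
    phi_f: angular position of the flanker relative to the vertical.
    Assumption: both angles are circular differences with phi_f.
    """
    beta_c = wrap_angle(theta_c - phi_f)
    beta_f = wrap_angle(theta_f - phi_f)
    return beta_c, beta_f

# Collinear configuration: both bars point along the connecting line.
print(relative_angles(0.4, 0.4, 0.4))
```

For a collinear configuration both relative angles vanish, which is the case of a perfectly straight continuation.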
We would like to know the smoothest curve connecting the two line elements. This minimization problem, known as elastica, has a long history dating back several hundred years (Mumford, 1994; Levien, 2008). In particular in computer vision and computer graphics this problem has been studied extensively, where the smooth curves are known as splines. According to the elastica principle, the two line elements are imagined to be connected using a flexible rod. Since most human perception seems to be scale invariant, we use the scale invariant version of the elastica energy (Bruckstein & Netravali, 1990; Sharon et al., 1997)
$$E = L \int_0^L \left( \frac{d\Psi(s)}{ds} \right)^2 ds, \qquad (1)$$
with $s$ signifying the position along the curve, $\Psi$ the relative angle of the curve at location $s$, and $L$ the total length of the curve. The energy has a minimal value of zero when the curve is straight. As a curve becomes more tortuous, and thus has more curvature, the energy increases. To find the smoothest curve, this energy has to be minimized with respect to $\Psi(s)$, subject to the conditions $\Psi(0) = \beta_c$ and $\Psi(L) = \beta_f$.
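As a numerical illustration of the scale invariant energy, the sketch below evaluates L times the integral of squared curvature for a circular arc whose tangent angle relative to the chord runs linearly from +a to -a. The arc is only a convenient test curve, not necessarily the minimizing elastica, and the function names are hypothetical. Analytically the result is 4a², independent of the curve length.

```python
import numpy as np

def scale_invariant_energy(psi, s):
    """L * integral of (dPsi/ds)^2 ds, using a trapezoid rule.

    psi: tangent angle of the curve sampled at arc-length positions s.
    """
    L = s[-1] - s[0]
    curvature_sq = np.gradient(psi, s) ** 2
    integral = np.sum(0.5 * (curvature_sq[1:] + curvature_sq[:-1]) * np.diff(s))
    return L * integral

def arc_energy(a, L, n=201):
    """Energy of a circular arc with end angles +a and -a and length L."""
    s = np.linspace(0.0, L, n)
    psi = a - 2.0 * a * s / L  # constant curvature: angle is linear in s
    return scale_invariant_energy(psi, s)

# Same bend, different sizes: the energy does not change with L.
print(arc_energy(0.5, 1.0), arc_energy(0.5, 7.0))
```

The two printed values agree, which is the scale invariance that motivates the extra factor L in front of the integral.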
The length of the bars is assumed much shorter than $r_f$ so that the curve between two bars goes from the middle of one bar to the other (see Section 4). The elastica energy can be found using the minimization method outlined in Sharon et al. (1997). Typically elastica assumes a start and end direction. However, in our case the elements have an orientation only, and we thus need a direction invariant energy: the minimal energy across the situations in which the center and/or flanker angles are flipped by 180°. This results in the direction invariant elastica energy
$$E_{\mathrm{inv}}(\beta_c, \beta_f) = \min_{k, l \in \{0, 1\}} E(\beta_c + k\pi, \beta_f + l\pi), \qquad (2)$$
where $E(\beta_c, \beta_f)$ denotes the minimal energy of Eq. (1) for the given boundary angles. Fig. 2B shows the family of curves that minimize this energy as a function of the orientation of the center. The corresponding energy for the curves is depicted in panel C. Notably, the energy has cusps where the solution switches from the green to the light blue curve. Finding the true curvature energy is computationally expensive. However, the energy of the smoothest curve between a given center and flanker pair can be approximated analytically (Eq. (3); Leung & Malik, 2001; Sharon et al., 1997). While Eq. (3) was derived under the assumption of small angles, it turns out to be a very good approximation for larger angles as well (Sharon et al., 1997). Furthermore, the direction invariant energy (Eq. (2)) based on this approximation is an exceptionally good match with the true direction invariant energy. Fig. 3 compares the approximated and true energies across different center orientations for several flanker orientations. The error is small, certainly for our purposes, and free of qualitative differences.
Fig. 1. (A) Association field (Field et al., 1993). Solid lines connect collinear elements and correspond to strong associations, while dotted lines correspond to weak associations. (B) Association fields and the relation to smooth contours. When trying to connect the center bar to any of the 4 presented flankers using smooth lines, the connections to the red bars are smoother continuations and the connecting curve will have lower curvature energy than for the blue bars.
Fig. 2. (D) The population response of neurons with the center bar in A as their receptive field, as a function of the neurons' preferred orientation. Each neuron receives a feedforward drive from the center bar with a strength dependent on the neuron's preferred orientation (red). Each neuron also receives contextual modulation from the flanker (blue), which depends on the curvature energy of the elastica curve connecting the flanker to a center bar with the neuron's preferred orientation. The input drive is multiplied by the modulation to give the resulting population response (black), which is both deformed and shifted compared to the feed-forward drive.
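The direction invariant energy described above can be sketched as a minimum over the four combinations of 180° flips. The example is hedged in two ways. First, the helper names are hypothetical. Second, `quadratic_energy` is only a stand-in: it uses the small-angle quadratic form quoted by Sharon et al. (1997), whose sign convention for the two angles may differ from the one used in this paper, so the cross term should be treated as an assumption rather than as Eq. (3).

```python
import numpy as np

def wrap_angle(x):
    """Wrap an angle (radians) into the interval [-pi, pi)."""
    return (x + np.pi) % (2.0 * np.pi) - np.pi

def quadratic_energy(beta_c, beta_f):
    """Stand-in small-angle energy (assumed form and sign convention)."""
    return 4.0 * (beta_c ** 2 + beta_f ** 2 - beta_c * beta_f)

def direction_invariant_energy(beta_c, beta_f, energy=quadratic_energy):
    """Minimum energy over 180-degree flips of center and/or flanker."""
    candidates = [
        energy(wrap_angle(beta_c + k * np.pi), wrap_angle(beta_f + l * np.pi))
        for k in (0, 1)
        for l in (0, 1)
    ]
    return np.min(candidates, axis=0)

# A bar and its 180-degree flipped copy have the same orientation,
# so they must yield the same energy.
print(direction_invariant_energy(0.3, -0.2))
print(direction_invariant_energy(0.3 + np.pi, -0.2))
```

Because any energy function with the same signature can be passed as `energy`, the same wrapper could be applied to a numerically minimized elastica energy instead of the quadratic stand-in.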

Neural model
Here we implement how the neural responses in V1 are modulated by the surround using the elastica principle. For each bar, we assume that there is a population of $N = 32$ neurons, with preferred orientations $\phi_i$ in the interval $[-\pi/2, \pi/2]$. Initially we shall be mainly interested in the population that encodes the center bar. The response of a neuron in the absence of flankers is modeled by a von Mises function $g(\cdot)$ (Eq. (4)), where $A_c$ is the response amplitude, which we set to 1 Hz without loss of generality, as we only consider stimuli with identical contrast (see Section 4 for a possible extension to stimuli with heterogeneous contrasts), $\theta_c$ is the orientation of the stimulus in the neuron's receptive field (the center bar), and $K_c$ sets the width of the neural tuning, with narrower curves for higher values. We use $K_c = 1$. The function $g(\cdot)$ across the population is illustrated in Fig. 2D by the red curve. The smoothest curve connecting a flanker to a center bar with a neuron's preferred orientation $\phi_i$ has a curvature energy $E(\phi_i, \theta_f, \varphi_f)$. We propose that the neural response is modulated by a flanker through a modulation term $h(\cdot)$. This modulation is illustrated across the population (i.e. versus $\phi_i$) in Fig. 2D by the blue curve. The form of the modulation was arrived at as follows. The elastica energy is never negative, but, in order to be consistent with physiology, we want flankers with a low curvature energy (i.e. high smoothness) to facilitate the response. Since $E \geq 0$ in all cases, we subtract an offset energy $E_0$ from the elastica energy, so that $h(\cdot) > 1$ for smooth contours. In addition, we divide the energy by $r_f$, the distance to the flanker, so that far away flankers do not modulate the response, similar to previous elastica studies (Bruckstein & Netravali, 1990; Sharon et al., 1997). The gain parameter $a$ determines the strength of the modulation. In the limit $a = 0$, one has $h(\cdot) = 1$ and the response is independent of any flankers.
For all simulations we use $a = 0.1$ and $E_0 = 4$. The exact values of these parameters have little qualitative effect, as will be discussed at the end of the results.
The modulation $h(\cdot)$ acts multiplicatively, consistent with physiology. We assume that each flanker contributes independently to the modulation, so that the final response of a neuron to a stimulus with $n$ flankers is
$$r_i = g(\phi_i, \theta_c) \prod_{j=1}^{n} h(\phi_i, \theta_{f_j}, \varphi_{f_j}, r_{f_j}).$$
An example for $n = 1$ flanker is shown by the black line in Fig. 2D.
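The population model of this section can be sketched as follows. The functional forms are assumptions chosen to match the verbal description, not confirmed equations: the von Mises drive is normalized so that its peak equals the amplitude, and the modulation is taken to be an exponential of the offset energy divided by the flanker distance, which gives a value of 1 when the gain is zero. Elastica energies are passed in as precomputed arrays (one value per neuron), and all function names are hypothetical.

```python
import numpy as np

N = 32
# Preferred orientations; the endpoint is dropped so that -pi/2 and pi/2
# (the same orientation) are not both represented. This is an assumption.
phi = np.linspace(-np.pi / 2, np.pi / 2, N, endpoint=False)

def center_drive(phi, theta_c, A_c=1.0, K_c=1.0):
    """Von Mises tuning with peak A_c (assumed normalization)."""
    return A_c * np.exp(K_c * (np.cos(2.0 * (phi - theta_c)) - 1.0))

def modulation(energy, r_f, a=0.1, E0=4.0):
    """Multiplicative gain from one flanker (assumed exponential form).

    Greater than 1 when energy < E0, less than 1 when energy > E0,
    and approaching 1 for distant flankers or zero gain a.
    """
    return np.exp(-a * (energy - E0) / r_f)

def population_response(phi, theta_c, energies, distances, a=0.1, E0=4.0):
    """Center drive multiplied by an independent gain for each flanker."""
    rates = center_drive(phi, theta_c)
    for energy, r_f in zip(energies, distances):
        rates = rates * modulation(energy, r_f, a=a, E0=E0)
    return rates

# One flanker with a made-up energy profile across the population.
toy_energy = 8.0 * np.sin(phi) ** 2
print(population_response(phi, 0.2, [toy_energy], [1.0]).round(3))
```

Passing energies as arrays keeps the sketch independent of how the elastica energy itself is computed or approximated.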

Population vector
We read out the orientation encoded by the population using the population vector method. The population vector is a 2D vector given by the sum of the preferred orientation vectors of the neurons weighted by their firing rate (Georgopoulos, Schwartz, & Kettner, 1986), representing the estimated center orientation vector
$$\hat{v}_c = \sum_i r_i u_i,$$
where $r_i$ is the firing rate, and $u_i = (\sin 2\phi_i, \cos 2\phi_i)$ is the unit vector pointing in neuron $i$'s preferred orientation (with the angle multiplied by two to ensure circularity). The estimated center orientation $\hat{\theta}_c$ follows from the angle of the population vector,
$$\hat{\theta}_c = \tfrac{1}{2} \angle \hat{v}_c,$$
where $\angle$ denotes a vector's angle.
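A minimal sketch of the population vector readout is given below. It assumes that the decoded orientation is half the angle of the summed vector, undoing the doubling of the preferred orientations, and that the angle is measured from the vertical, so the sine component plays the role of the horizontal coordinate. The function name `decode_orientation` is hypothetical.

```python
import numpy as np

def decode_orientation(rates, phi):
    """Population vector estimate of orientation (radians from vertical).

    rates: firing rates r_i; phi: preferred orientations phi_i.
    The unit vectors are u_i = (sin 2*phi_i, cos 2*phi_i).
    """
    x = np.sum(rates * np.sin(2.0 * phi))
    y = np.sum(rates * np.cos(2.0 * phi))
    return 0.5 * np.arctan2(x, y)

# Unmodulated von Mises population (assumed normalization) centered on 0.3 rad.
phi = np.linspace(-np.pi / 2, np.pi / 2, 32, endpoint=False)
rates = np.exp(np.cos(2.0 * (phi - 0.3)) - 1.0)
print(decode_orientation(rates, phi))
```

Without contextual modulation, the readout recovers the stimulus orientation, so any bias in later sections can be attributed to the modulation.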

Results
In the context of the Gestalt law of good continuation, we study a neural network in which the contextual interactions are based on pairwise optimal smoothness. For clarity we restrict ourselves to images composed of bars with various orientations, all with identical contrast and at a single spatial scale. Each bar is assumed to fall in the classical receptive field of a population of neurons with preferred orientations $\phi_i = -\pi/2 \ldots \pi/2$. We will refer to the bar under consideration as the center bar, and the other bars as the flankers. As an example, consider the scene in Fig. 2A, where the red bar represents the center and the blue bar a single flanker. We write the neural response of neuron $i$ with preferred orientation $\phi_i$ as
$$r_i = g(\phi_i, \theta_c)\, h(\phi_i, \theta_f, \varphi_f, r_f).$$
The function $g(\cdot)$ models the classical V1 orientation response to the center stimulus using a von Mises function, where $\theta_c$ is the orientation of the center stimulus. The function $h(\cdot)$ models the contextual modulation from the flanker (Section 2), where $\theta_f$, $\varphi_f$ and $r_f$ are the flanker orientation, angular position, and distance from the center, respectively. $E(\cdot)$ describes the curvature energy of the smoothest curve connecting the flanker to a bar of orientation $\phi_i$ in the center. The parameters $a$ and $E_0$ set the strength and offset of the modulation (see Section 2 and below). The smoother the curve from the flanker to the center, the more positive the modulation (for smooth curves for which $E(\cdot) < E_0$ it becomes facilitatory, $h(\cdot) > 1$), whereas when the curve is tortuous and thus has a high elastica energy, the modulation is inhibitory. In Section 2 the functional form of the modulation is derived from the elastica principle, and an efficient approximation for the curvature energy is presented. The center drive $g(\cdot)$, contextual modulation $h(\cdot)$ and resulting population response $r$ are illustrated in Fig. 2D. It can be observed that, due to the modulation, the population response is a deformed version of the center drive.

Single flanker association field
We first analyze the association field that the model predicts by considering the effect of a single flanker on the response of a neuron with preferred orientation $\phi_i = 0$, as represented by the black bar in Fig. 4. A flanker in a given surround location will modulate the response of this neuron by a factor $h(\phi_i, \theta_f, \varphi_f, r_f)$. The color of a bar indicates excitation (red, $h(\cdot) > 1$) or inhibition (blue, $h(\cdot) < 1$), and its opacity is proportional to the modulation strength. The bars that increase the response (red bars in panel A) correspond to flankers that form a smooth contour. Flankers with an inhibitory effect correspond to more tortuous contours. The parameter $E_0$ determines the amount of curvature at which the modulation becomes inhibitory (see also Fig. 8).
To get further insight into the model we plot the bar which leads to the most suppressive modulation, which can be called the 'dis-association field', in panel B. Consistent with Fig. 3, the strongest suppressing flankers are rotated approximately 90° from the most facilitating ones. Furthermore, inhibition exists across a wider range of orientations. Finally, in panel C we plot the effect of flankers oriented the same way as the preferred orientation, to test our association field more directly against neural measurements, in which the flanker orientation is usually kept the same as the preferred orientation of the neuron being measured.
The butterfly shapes in panels A and C correspond both to psychophysics in the form of the association field (Field et al., 1993), and to electrophysiological results in monkeys where, in particular at low contrast, flankers collinear with the preferred orientation in the center excite (Kinoshita, Gilbert, & Das, 2009), and flankers parallel to the center inhibit (Kapadia et al., 1995, 2000).

The tilt illusion: two flankers
Next we study the case when two flankers are placed opposite each other, around the center. To extend the model to this situation we assume that each flanker independently modulates the response; thus the response of a neuron in the center population is
$$r_i = g(\phi_i, \theta_c)\, h(\phi_i, \theta_{f_1}, \varphi_{f_1}, r_{f_1})\, h(\phi_i, \theta_{f_2}, \varphi_{f_2}, r_{f_2}).$$
We decode the center population response using the population vector (Section 2). Due to deformation of the population response caused by the modulation, the population vector is no longer aligned with the stimulus orientation. Psychophysically this presumably leads to a tilt illusion in the percept of the center bar.
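The way a multiplicative modulation biases the decoded orientation can be illustrated with a toy example. The gain used here is deliberately not the elastica modulation: it is a hypothetical skewed gain, chosen because its effect on a von Mises population can be worked out exactly. The von Mises normalization and all names are assumptions. Multiplying exp(K cos x) by exp(b sin x), with x = 2(φ − θ_c), shifts the peak of the population response by atan2(b, K)/2.

```python
import numpy as np

def decode_orientation(rates, phi):
    """Population vector readout, angle measured from the vertical."""
    x = np.sum(rates * np.sin(2.0 * phi))
    y = np.sum(rates * np.cos(2.0 * phi))
    return 0.5 * np.arctan2(x, y)

def tilt_bias(theta_c, gain, n=32, K_c=1.0):
    """Decoded minus true orientation after multiplying the drive by gain(phi)."""
    phi = np.linspace(-np.pi / 2, np.pi / 2, n, endpoint=False)
    drive = np.exp(K_c * (np.cos(2.0 * (phi - theta_c)) - 1.0))
    return decode_orientation(drive * gain(phi), phi) - theta_c

def skewed_gain(phi, b=0.5):
    """Toy gain favoring neurons tuned clockwise of vertical (hypothetical)."""
    return np.exp(b * np.sin(2.0 * phi))

print(tilt_bias(0.0, skewed_gain))                      # shifted percept
print(tilt_bias(0.0, lambda phi: np.ones_like(phi)))    # no modulation, no bias
```

In the full model the gain would be the product of the flanker modulation terms, and the sign of the bias would determine whether the illusion is attractive or repulsive.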
We illustrate several configurations in Fig. 5. In each panel we show two example rotations in the top row. The elastica curves connect the flankers to neurons in the center with different preferred orientations; the curve's opacity is proportional to that neuron's response, indicative of the resulting population response.
For parallel flankers rotating around the center, panel A, the elastica energy is smallest for neurons with preferred orientations ±45° from the flankers' orientations, resulting in two "valleys" in the energy. Accordingly, the elastica curves from these flankers fan in two directions, Fig. 5A top. The resulting modulation shifts the population response depending on the flanker orientation. In the example of a 30° flanker rotation, top left, the modulation is mostly counter-clockwise. However, when the flanker is rotated to 60°, top right, the same curves pass the vertical before reaching the center, resulting in the population responses being shifted clockwise. These effects result in a repulsive illusion for flanker rotations of 0-45° and an attractive tilt for 45-90°, panel A bottom.
Fig. 3. The approximated elastica energy (Eq. (3)) compares well to the true elastica energy (red curve, Eq. (1)). Both energies were made direction invariant (Eq. (2)). (A-C) The elastica energy across different center orientations, for fixed flanker orientations of 0°, 45° and 90°, respectively. For −90°, −45°, 0°, 45° and 90° a situation sketch is shown, with the gray bar indicating the center orientation, and the black bar the flanker orientation. Note that for a perpendicular flanker, relative to the center, the energy is minimal at two center orientations, resulting in two dips in the energy, panel C.
When the lateral flankers are rotated in place, Fig. 5B top, the effect is always repulsive. With the flankers either at 30° (left) or 60° (right), the most influential elastica curves from the left flanker always move up first before going down (and vice versa for the right flanker), ending in an orientation which is repulsed away from the flanker orientation.
For the aligned flankers rotating around the center, panel C top, elastica curves connect smoothly to the flankers, with the lowest energy curve being a straight line. As a result the modulation is excitatory for neurons with a preferred orientation close to the flanker orientation (Fig. 4A), and suppressive far away, and the population response follows the flankers. Across flanker orientations this generates an attractive illusion. Note that the attractive effect does not require excitation, it is sufficient that the modulation from both flankers is least suppressive for the neurons with a preferred orientation equal to the flanker orientation.
Next, we rotate the flankers in place above and below the center, panel D top. When the flankers are tilted 45° clockwise, right example, the elastica curves from the top flanker start off towards the left, then bend back towards the center. The population response is thus shifted away from the flanker orientation, resulting in a repulsive illusion.
The two bar illusions in the model are in close accordance with known psychophysics. Westheimer (1990) reported repulsion from tilted flankers on the sides (Fig. 5B), or above and below (panel D). The orientation dependence of the repulsion from lateral flankers (panel B) closely matches Kapadia et al. (2000), who also found maximal repulsion at around 30° (except for the small attractive effect they reported for larger flanker orientations), inset panel B. Kapadia et al. (2000) also placed flankers above and below the center, and tilted them (as in panel D). In addition, however, the flankers were displaced so that the near end of the flankers aligned with the center bar. In this case they found an attractive effect for small flanker orientations, and a repulsive effect for other orientations, inset panel E. The attractive effect went away quickly as the flanker distance was increased.
We reproduced their setup by assuming a finite bar length for the flanker rotation, and displacing the flanker accordingly, as in Fig. 5E top. Since the bar lengths in our model are assumed infinitesimally short, the curves are drawn from and to the centers of the bars. This turns out to be key to understanding the attractive part of this illusion. As long as the flankers are close enough, smaller orientations will seem aligned as the origin of the elastica curves is slightly displaced, panel E top left. When the flanker orientation is larger, this is no longer the case and the curves first need to pass the vertical before reaching the center, resulting in a repulsive effect. When the flankers are moved further away (i.e. displaced in the y direction), the situation becomes more analogous to panel D, as the x-displacement of the flanker becomes insignificant, and the illusion turns repulsive for all flanker orientations. Although the precise angular dependence in the model does not match the data, it is surprising that the model can exhibit both tilt effects.
All illusions and modulations weaken as the flanker distance is increased, as illustrated by curves of decreasing opacity in the bias graphs. This is in accordance with most tilt illusion studies, which note that the illusions decrease in strength as the contextual stimuli are placed further away (Kapadia et al., 2000; Westheimer, 1990).
In the context of elastica, the illusions in Fig. 5B and D were previously explained by noting that the lowest energy curve connecting the two flankers is either oriented towards the flankers' orientation, as in A, or away from them, as in B (Schwartz et al., 2006). Bayesian estimation then results in an orientation between this smooth orientation and the presented center orientation, resulting in the repulsive or attractive illusion. However, an additional displacement of the center bar was allowed to produce an attractive solution for small angles in panel C.

The tilt illusion: full surround
The two flanker stimuli described above lead to both repulsive and attractive effects. However, when either a hexagon of surround bars (Westheimer, 1990) or a surround grating (Clifford, 2014) is used, a repulsive tilt illusion occurs for small center-surround orientation differences, while a weak attractive effect occurs for larger orientation differences (inset Fig. 5F), which has been speculated to have a different origin (Clifford, 2014). Mechanistically, the repulsive tilt illusion has been explained by the fact that a surround grating results in orientation tuned suppression, with most suppression when the surround is the same as a neuron's preferred orientation (e.g. Clifford, Wenderoth, & Spehar, 2000; Schwartz, Hsu, & Dayan, 2007).
To examine these illusions in our model, we first turn to Westheimer's experiment, which we can reproduce exactly. There a hexagon of 6 flankers was placed around a center bar, evenly spaced so that there are two parallel bars on the sides, as in Fig. 5F top. The bars were then rotated in place and we measured the effect on the neural population response at the center location. As above we assume that each flanker independently modulates the center responses and the orientation of the center bar was again decoded from the neural activities using a population vector. For most orientations, the net effect from the elastica curves to the center is repulsive. However, when they are close to perpendicular to the center, the four top and bottom flankers win out with a small attractive effect (also see Fig. 8). Thus the model explains both the repulsive illusion, and the attractive effect for larger orientation differences.
We next approximate a center grating by a single oriented bar, and a surround grating as a large set of 16 identically oriented bars, Fig. 6A. At first glance perhaps a weak attractive tilt would be expected again. However, the net modulation from all flankers is inhibitory, panel B, and strongest when a neuron's preferred orientation is the same as the surround orientation, in close accordance with known neural responses. As a result, the decoded orientation is repulsed away from the surround orientation, corresponding to a repulsive tilt illusion, panel C. We also varied the number of flankers and found much the same effects, with still a weak attractive effect for 8 bars, but repulsion otherwise (not shown). In summary, tilt illusions in stimuli with surround gratings and with pairs of flankers can be unified under the elastica principle.

Contour detection
So far we have focused on the effect of flankers on the decoded orientation of a center bar. We now turn our attention to larger scenes consisting of several bars, where we find the response to each bar in succession by taking that bar as the center, and considering all other bars as the flankers. The principle of elastica and smooth contours has classically been used to extract contours from images. These implementations typically explicitly calculate the curvature energy for all element combinations, rather than incorporating the energy in a modulation term as we do here. While association fields in general have been used to facilitate contour detection through contextual modulations (e.g. Bauer & Heinze, 2002; Field et al., 1993; Li, 1998, 1999), here we examine if elastica based modulation also leads to contour extraction. To study contour extraction in our model we measure the apparent saliency $s$ of the contour (as in Li, 1999). The apparent saliency of a bar is defined as the maximal response in the population responding to that bar (i.e. the maximum over the preferred orientations $\phi_i$), divided by the average of the maximal responses over all bars in the image. The saliency of a complete contour is defined analogously as the mean maximal response to the contour relative to the whole image, where the average in the numerator is taken over the bars that constitute the contour. If $s > 1$, the responses are higher in the contour and it is salient. We also implemented a mean based saliency measure, which uses mean responses instead of maximal responses. This resulted in weaker saliencies, but no qualitative differences (not shown).
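The saliency measure described above can be sketched directly from its verbal definition. The layout of the response array (one row per bar, one column per neuron) and the function names are assumptions made for illustration.

```python
import numpy as np

def bar_saliency(responses):
    """Per-bar saliency: peak response of each bar's population,
    divided by the mean peak response over all bars in the image.

    responses: array of shape (n_bars, n_neurons).
    """
    peak = responses.max(axis=1)
    return peak / peak.mean()

def contour_saliency(responses, contour_idx):
    """Mean peak response over the contour bars relative to the whole image."""
    peak = responses.max(axis=1)
    return peak[contour_idx].mean() / peak.mean()

# Four bars with two neurons each; bars 1 and 2 form the "contour".
responses = np.array([[1.0, 2.0], [1.0, 4.0], [3.0, 1.0], [1.0, 1.0]])
print(bar_saliency(responses))
print(contour_saliency(responses, [1, 2]))
```

A value above 1 indicates that the selected bars respond more strongly than the image average. Replacing `max` by `mean` along the neuron axis would give the mean based variant mentioned in the text.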
In the plots that follow, we show the encoded image above its modeled percept. The opacity of the decoded bars is proportional to their saliency. Consider, first, a lone target bar amidst a homogeneous background, Fig. 7A. While this is not a true contour, from psychophysics and neural measurements we expect the target to be salient (Nothdurft, 1993;Shushruth et al., 2013). Indeed, the neurons in the background are inhibited more than the center flanker, similar to the effect in Fig. 6B. Due to the resulting higher neural response of the center bar it jumps out from the background, as signified by its darker color. As the target is rotated towards the surrounding orientations, the saliency decreases until it is no longer salient (not shown).
Next, we embed a simple contour in a homogeneous background, as in Fig. 7B. Here again, for the decoded image the bars of the feature of interest are darker than those in the background, indicating a salient contour. In this case, although all bars in the image actually experience suppression, those in the background are suppressed more, since they are surrounded by more bars of similar orientations. However, bars that are part of the contour enhance each other's responses, resulting in the high saliency of the contour. Further note that the decoded bar orientations differ from the stimulus orientations.
We finally examine saliency of more general contours in random backgrounds. We use the method described in Field et al. (1993) to generate random images containing a random contour of length 8. Briefly, the contour is generated with a starting orientation and location, after which a set orientation change is made in a random direction (left or right), and a new bar is placed following the new orientation. An example stimulus containing a contour with orientation changes of 11.25°, is shown in Fig. 7C, with the decoded image in B. In Fig. 7D we quantify the contour saliency by calculating the average contour saliency over 50 different contour and background configurations, for different contour angle changes. As this angle increases, the saliency drops quickly, mirroring psychophysical contour detection probability (Field et al., 1993).
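The contour generation procedure summarized above can be sketched as a simple random walk in orientation. This covers only the steps stated in the text (random start, fixed orientation change in a random direction, next bar placed along the new orientation). The bar spacing, the starting region, and the function name are assumptions, and the placement of background elements used by Field et al. (1993) is not reproduced.

```python
import numpy as np

def make_contour(n_bars=8, angle_step_deg=11.25, spacing=1.0, rng=None):
    """Generate bar positions and orientations for a random contour.

    Orientations are in radians from the vertical, so the step direction
    for orientation theta is (sin theta, cos theta).
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = rng.uniform(-np.pi, np.pi)      # random starting orientation
    pos = rng.uniform(0.0, 1.0, size=2)     # random starting location (assumed range)
    step = np.deg2rad(angle_step_deg)
    positions, orientations = [pos], [theta]
    for _ in range(n_bars - 1):
        theta = theta + rng.choice([-1.0, 1.0]) * step   # turn left or right
        pos = pos + spacing * np.array([np.sin(theta), np.cos(theta)])
        positions.append(pos)
        orientations.append(theta)
    return np.array(positions), np.array(orientations)

positions, orientations = make_contour(rng=np.random.default_rng(0))
print(np.rad2deg(np.diff(orientations)))
```

Larger values of `angle_step_deg` produce more tortuous contours, which is the manipulation quantified in Fig. 7D.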

Dependency on model parameters
The elastica based contextual modulation has two parameters, $E_0$ and $a$. Here we show that the essential features of the model do not depend on them. First, we fix $a = 0.1$ and vary $E_0$ to be 0, 4, or 8. Because both contour extraction and the tilt illusion rely only on relative changes, they are fully invariant to changes in $E_0$. However, the association field varies strongly with the $E_0$ parameter, as would be expected from a parameter which mainly shifts the balance between excitation and inhibition, Fig. 8, top.
Next, we fix $E_0 = 1$, and set $a = 0.02$, 0.1 and 0.5, Fig. 8, middle. As the gain parameter $a$ changes the strength of the modulation, we see no change in the shape of the association field, but quantitative changes in the saliency of features and the decoding biases, displaying a trade-off between saliency strength and coding biases.
The increased bias with large $a$ can be partially counteracted by narrowing the neural tuning width $K_c$ (Eq. (4)), as contextual modulation results in a smaller shift of the population response when the tuning curves are sharp. Interestingly, the tuning width $K_c$ also has an effect on the attraction effect in the tilt illusion with 6 flankers, Fig. 8, bottom row. As the tuning curve becomes sharper (i.e. larger $K_c$), the total illusion becomes weaker as expected, but the attractive effect is reduced more. For $K_c = 1.5$ the attractive illusion completely disappears. This is because the attractive pull of the flankers is felt most strongly by neurons with a preferred orientation close to 90°. When the tuning curves are too narrow, these neurons do not respond, and thus the population response is not shifted.

Discussion
We have developed a computational model of V1 that implements the Gestalt law of good continuation on a neural level through contextual modulations that are determined by the elastica energy. More specifically, the modulation by each bar outside a neuron's receptive field is governed by the curvature energy of the smoothest curve connecting to it. This quite naturally leads to contour extraction, but more surprisingly also explains saliency detection, association fields, and various forms of the tilt illusion. Our work builds on a large body of literature linking these various aspects of visual processing. Association fields have been derived from an image statistics perspective (Geisler et al., 2001; Sigman et al., 2001), and contour detection has been linked to association fields (Hansen & Neumann, 2008; Li, 1998, 1999; Spratling, 2012). Elastica and association fields have been linked before and used for contour detection and completion (Ernst et al., 2012; Sharon et al., 1997; Williams & Jacobs, 1997; Williams & Thornber, 2001). Finally, the tilt illusion for two flankers was explained from elastica in a Bayesian framework, but using a different expression for the modulation than ours (Schwartz et al., 2006). Our model proposes a new explanation for several forms of the tilt illusion as following from individual elastica contour completions, including some counter-intuitive attractive effects. More importantly, however, our model combines the phenomena described in earlier studies, both contour and illusion related, and links them together under the same basic elastica principle.
Not all aspects of the tilt illusion are captured by the model. Most prominently, the exact shape of the tilt illusion for collinear flankers as observed in Kapadia et al. (2000) was not reproduced, Fig. 5E. However, we did find attraction for smaller angles and repulsion otherwise, which has not been explained before. The attraction disappears as the flankers are placed further away, also in accordance with the findings of Kapadia et al. (2000). An important factor in producing these effects is that the elastica curves are calculated from the centers of the bars. Although the flanker ends were level with the center bar, this allowed for both attractive and repulsive effects, depending on orientation and distance. This might suggest that neurons with a receptive field of smaller scale than the flanker bar drive the attractive effect in humans. However, such an explanation would require assumptions about how tilt estimates at different spatial scales are combined.
Interestingly, the illusion caused by a hexagon of 6 flankers was captured by the model, with both a repulsive and an attractive effect. This effect depended on the tuning to the center bar. Although the attraction completely disappeared for very narrow tuning, K_c = 1.5, most neurons have broad tuning corresponding to K_c ≈ 0.5. Next, to mimic gratings, we increased the number of flankers and rotated the entire surround; the attractive effect then disappeared for more than 8 flankers (for our parameters). However, the repulsive illusion was always present. Existing explanations of the tilt illusion have taken either a mechanistic view, in which it is purely a result of surround suppression (Clifford et al., 2000), or functional views, such as arising from image statistics (Schwartz, Sejnowski, & Dayan, 2009), Bayesian processing (Schwartz et al., 2006), or some form of image normalization (Clifford, 2014). We propose a new hypothesis: the smoothest continuations of the surround elements tilt the percept away from the surround and, in special cases, attract it.
Despite the model's simplicity, we consider it biologically feasible. The contextual modulations are effected as independent contributions from each flanker, as one would expect for modulation from individual surround neurons. The resulting modulation matches electrophysiology, both for individual flanker contributions in the form of the neural association field, with excitatory effects for collinearity and inhibitory effects for parallel bars (Kapadia et al., 2000; Kinoshita et al., 2009), and for the net effect of many surrounding bars, which leads to suppressive surround modulation (Gilbert & Wiesel, 1990; Seriès, Lorenceau, & Frégnac, 2003). Because the modulation relies on pairwise interactions only, it is plausible that some form of Hebbian learning shapes its tuning (Bednar, 2012). A caveat is that the statistics of natural images, which include both textures and contours, are dominated by parallel structures; an association field arises only when the statistics are restricted to contours (Geisler et al., 2001).
Fig. 8. Exploring the effect of the a, E_0 and K_c parameters in the model. Top row: modulation strength a is kept constant, while E_0 is varied. The E_0 parameter determines the relative presence of excitation and inhibition in the association field. The illusions and saliency results depend on the relative difference of modulation and are not affected (not shown). Middle row: modulation offset E_0 is kept constant, while a is varied. While the relative presence of excitation and inhibition is not affected by this parameter, the strength of modulation is changed directly, resulting in larger differences in response. This results in stronger bias and saliency effects. Bottom row: neural tuning width K_c is varied while both E_0 and a are kept constant. While this generally affects the magnitude of the tilt illusions, the attractive illusion in the Westheimer experiment (Fig. 5F) disappears with large K_c.
So far we have assumed a single contrast level, which is clearly unrealistic for most natural images. In particular, association fields are known to change with contrast. Low contrast leads predominantly to excitation, while high contrast leads predominantly to inhibition (Kapadia et al., 2000). It is possible to extend the model to describe responses to stimuli with heterogeneous contrasts. The center contrast can be represented by the A_c parameter (e.g. Sclar, Maunsell, & Lennie, 1990). The contrast of the flanker can be coded in the a and E_0 parameters of each modulation term. In particular, changing E_0 as in Fig. 8, top row, qualitatively matches the observed contrast dependence of the association fields in Kapadia et al. (2000).
It is perhaps not surprising that elastica models the contextual interactions of V1 well, if the interactions do indeed exist for the purpose of detecting contours. Besides being especially pleasing to the eye, elastica curves often describe the contours of natural objects well. As an example in this paper, the shape in Fig. 2B would be a good candidate for a leaf. This is the very reason elastica is used in computer vision for contour completion of partially hidden objects (e.g. Kimia, Frankel, & Popescu, 2003; Mumford, 1994; Zhou, Zheng, & Yang, 2012).
Apart from the illusions of Fig. 5A, C and D, our model makes several predictions. First, the inhibitory connections seem to be more broadly tuned than the excitatory connections. However, which inhibitory connections are strongest depends strongly on relative position and orientation, Fig. 4. This could be tested experimentally. Second, our contextual modulations affect each neuron individually, and lead to contextual interactions both for small sets of bars and for full surrounds. It is not known experimentally, either neurophysiologically or psychophysically, whether these are linked. That is, it is unknown whether the modulation by a surround built up from individual elements can be explained from the contributions of those individual elements.
The neural character of the model allows for a number of straightforward extensions: (1) It will be interesting to include more realistic Gabor-type receptive fields at a variety of scales. (2) Currently the bars are assumed to have zero length. It is straightforward to find the elastica curves connecting the ends of the bars, with the only minor complication that b_f and b_c now become dependent on h_f and h_c. However, without a more realistic receptive field such an extension is rather ad hoc. (3) In the current implementation flankers modulate the center, but there is no recurrent feedback in which the modulated response changes the activity of the flankers. In this sense the model performs a one-step approximation, which is valid as long as the shifts in the tuning curve are moderate. In a recurrent model, the dynamics of the illusions presented here would be of interest.