A novel method for estimating the strength of positive mating preference by similarity in the wild

Abstract Mating preference can be a driver of sexual selection and assortative mating and is, therefore, a key element in evolutionary dynamics. Positive mating preference by similarity is the tendency for the choosy individual to select a mate which possesses a similar variant of a trait. Such preference can be modelled using Gaussian‐like mathematical functions that describe the strength of preference, but such functions cannot be applied to empirical data collected from the field. As a result, traditionally, mating preference is indirectly estimated by the degree of assortative mating (using Pearson's correlation coefficient, r) in wild captured mating pairs. Unfortunately, r and similar coefficients are often biased due to the fact that different variants of a given trait are nonrandomly distributed in the wild, and pooling of mating pairs from such heterogeneous samples may lead to “false–positive” results, termed “the scale‐of‐choice effect” (SCE). Here we provide two new estimators of mating preference (C rough and C scaled) derived from Gaussian‐like functions which can be applied to empirical data. Computer simulations demonstrated that r coefficient showed robust estimations properties of mating preference but it was severely affected by SCE, C rough showed reasonable estimation properties and it was little affected by SCE, while C scaled showed the best properties at infinite sample sizes and it was not affected by SCE but failed at biological sample sizes. We recommend using C rough combined with the r coefficient to infer mating preference in future empirical studies.

Irrespective of the evolutionary causes, exhibiting a mating preference has two distinct key evolutionary consequences (Gavrilets, 2004;Lewontin, Kirk, & Crow, 1968;Merrell, 1950): sexual selection (changing the probability of transmitting alleles in progeny of the preference-targeted trait, sensu Arnold & Wade, 1984) and assortative mating (nonrandom mating between individuals bearing different phenotypes/genotypes, Gavrilets, 2004). These processes can be linked in certain scenarios, as a preference causing positive assortative mating (similar types are more frequent in mates than expected by chance) is expected to generate positive frequency-dependent sexual selection (Servedio, 2016), while a preference causing negative assortative mating (different types are preferred in mates) is expected to produce negative frequency-dependent sexual selection (Pusey & Wolf, 1996;Takahashi & Hori, 2008). These concepts and their definitions have, however, been the subject of much debate over the past decades (Arnold & Wade, 1984;reviewed in Andersson, 1994;Edward, 2015;Gavrilets, 2004).
Mating preference has been shown to play a key influence in the theoretical dynamics of several evolutionary processes: assortative mating and sexual selection (consequences of mating preference), for example, may contribute to reproductive isolation between incipient taxa (Gavrilets, 2004(Gavrilets, , 2014Santos, Matos, & Varela, 2014;Servedio, 2016;Thibert-Plante & Gavrilets, 2013;Thibert-Plante & Hendry, 2011;Turelli, Barton, & Coyne, 2001). A major question is whether the theoretical conditions that allow the evolution of mating preference (intermediate levels of disruptive selection, low mating cost, strength of the mating preference, etc., see Gavrilets, 2004) in sympatry can be empirically observed in the wild. A difficulty in answering this question remains in linking theoretical arguments (and definitions) with empirical estimates (Gavrilets, 2004;Servedio, 2016; but see Roff & Fairbairn, 2015 for an exception). The methods to model mating preference and their consequences (e.g., assortative mating) have, however, not been empirically validated. To attempt to address this we briefly review the main strategies to model theoretically, and estimate empirically, true mating preferences from field data in an attempt to integrate these approaches.
Two mating preference mechanisms have been modelled depending on the evolutionary scenario considered (reviewed in Gavrilets, 2004;Kirkpatrick, Rand, & Ryan, 2006;Servedio, 2016;and ignoring any indirect mechanism to find a mate via habitat choice, resource search, etc.). The first mechanism refers to the case where individuals of the choosy sex (usually females) prefer certain mates that display particular variants of a trait (see Gavrilets, 2004). Such form of mating preference may lead to sexual selection and, hence, as a strong driver of extreme sexual dimorphism (e.g., weapons and ornaments in one sex but not in the other) observed in many birds and insects (Crespi, 1989;Futuyma, 2013). The second is a preference based on phenotype matching or similarity (i.e., a tendency to choose mates possessing similar variants of a trait), and such a preference by similarity may lead to positive assortative mating observed in many species (reviewed in Arnqvist et al., 1996;Crespi, 1989;Jiang, Bolnick, & Kirkpatrick, 2013;Servedio, 2016). These preferences can be modelled by using explicit genetic mechanisms (Kirkpatrick et al., 2006;Servedio, 2016) or by Gaussian-like mathematical functions (Gavrilets, 2004(Gavrilets, , 2014Lande, 1981). Explicit genetic mechanisms are often adequate to model the effects on qualitative traits (e.g., color) assuming one or two loci control the mating preference, while Gaussian-like functions seem more appropriate to model quantitative trait loci (e.g., size and length, Lande, 1981;Roff & Fairbairn, 2015). For example, under a positive preference by similarity, any preference function should give a higher probability of mating when the mating individuals share similar variants of a trait (e.g., similar color or size, Carvajal-Rodríguez & Rolán-Alvarez, 2014).
Traditionally, the Gaussian-like functions originally developed for theoretical studies were not, however, applicable to empirical data but recent modifications now allow their application (Carvajal-Rodríguez & Rolán-Alvarez, 2014). Different strategies have been considered to infer mating preferences empirically. Laboratory choice experiments, for example, have been used to investigate the mechanisms of mating preference (Coyne, Elwyn, & Rolán-Alvarez, 2005;Knoppien, 1985), and the associated statistical tools to analyze such experiments have also been developed (Gilbert & Starmer, 1985;Rolán-Alvarez & Caballero, 2000). These approaches, however, have limitations because mating is often difficult to induce under laboratory conditions, and the patterns observed under such conditions may not reflect the true mating patterns which occur in the field (Coyne, Kim, Chang, Lachaise, & Elwyn, 2002;Coyne et al., 2005).
An alternative strategy is to measure the strength of mating preference by observing mating pairs directly in the field (reviewed in Crespi, 1989;Jiang et al., 2013). In this second strategy, there is one statistical tool (PSI; the ratio of the observed frequency of a pair/expected frequency under random mating; see Rolán-Alvarez & Caballero, 2000) available that could, under certain scenarios, estimate mating preferences for qualitative traits (e.g., color) in the wild. There is, however, no direct estimator of mating preference for quantitative traits (e.g., size) in the wild. Therefore, most authors have adopted an indirect approach for estimating the mating preference, focusing either on assortative mating or on sexual selection effects. When estimating assortative mating in the field (presumably caused by mating preference by similarity), the most common strategy is to use the Pearson's correlation coefficient (r) or related statistics on the trait values across the range of observed mates (reviewed in Jiang et al., 2013); the larger the coefficient, the stronger the preference by similarity. Recently, however, it has been shown that such a strategy can produce a great bias in certain cases (e.g., simulations have shown that a Pearson's r of .8 could be observed under random mating in certain scenarios; see Rolán-Alvarez et al., 2015), caused by the scale-of-choice effect (SCE). The concept of the SCE is that different variants of a given trait can be distributed nonrandomly across spatial and temporal scales, and hence, pooling of mating pairs from such heterogeneous samples may lead to "false-positive" results (Rolán-Alvarez et al., 2015). Indeed, mating pairs can be difficult to observe/score in the field and, because of this, researchers often pool these pairs over a geographic range or time series (e.g., Jiang et al., 2013). The SCE will, therefore, occur when two conditions are met: firstly that the organism looks for a mate at a smaller scale than the pooled scale and, secondly, that at the smaller scale there is some trait heterogeneity ( Figure 1). These two conditions could be common in organisms which exhibit low adult mobility (Rolán-Alvarez et al., 2015) and has already been demonstrated for one species with negative assortative mating (Rolán-Alvarez et al., 2015) and two species with positive assortative mating (Ng, Williams, Davies, Stafford, & Rolán-Alvarez, 2016).
The major focus of the present study was, therefore, to develop new estimators of mating preference by similarity that are less biased by the SCE (as compared to traditional approaches using Pearson's coefficient r) and hence provide a better linkage between theoretical and experimental estimates of mating preference. We use a modified Gaussian function from traditional theoretical models to simulate positive assortative mating and thus obtain a set of simulated mating pairs with an a priori-controlled strength of preference. With such a collection of simulated mating pairs, we were then able to evaluate a posteriori dif-

| Estimating mating preference
Several Gaussian mathematical functions have been used to infer mating preference under the similarity preference model (Carvajal-Rodríguez & Rolán-Alvarez, 2014;Débarre, 2012;Dieckmann & Doebeli, 1999;Gavrilets, Vose, Barluenga, Salzburger, & Meyer, 2007;Thibert-Plante & Gavrilets, 2013;Servedio, 2015). These functions predict the probability of mating for any particular pair based on a few key parameters (Gavrilets, 2004), namely: (1) the C parameter (equivalent to Pearson's r in empirical approaches) which represents the strength of mating preference for a trait which is supposedly evolving and contributing to assortative mating; (2) the D parameter, which represents the absolute difference between male and female trait values (see Equation 1 below). In addition, several of these functions include a parameter, s 2 , which allows fine-tuning of the preference under simulated conditions, but is as- where s 2 is the mating tolerance, C is the mating preference itself (range from 0 to 1), D is the absolute difference between male (X m ) and female (X f ) unstandardized traits (size or shell length in this case) for each pair evaluated, and D max is the maximum D value that can be observed in the population. For example, we can model positive size assortative mating (say C = 0.5) by computer simulation and obtain a series of N random male and female size pairs from a population (from certain a priori population mean and variance; see Table S1 and corresponding explanations in Appendix S1). Therefore, the encounter between a male and a female is random but whether they will mate or not depends on the mating probability given by the preference function FND. The FND value of each mating pair is calculated by Equation 1.
Once we have the FND values of the N randomly formed couples, a Monte Carlo procedure based on pseudorandom numbers (as is the standard practice) will pick-up the mating pairs so that the probability of being chosen is proportional to their FND values (see Appendix S1).
The resulting set of mating pairs is expected to show a Pearson's r (for size) close to 0.5 (see Table S1). In this example, the preference parameter is C = 0.5, which has been established a priori, while the measured Pearson's r is a posteriori and could be considered as an estimate of the C parameter.
We were interested to check the robustness of the new mating preference estimators proposed in this study following a particular Ng et al., 2016). The small, white, circles in pairs represent putative mating pairs, while the relative size of these circles is correlated with the trait mean. The SCE occurs as a consequence of pooling mating pairs at a larger scale (S pooled ; yellow area), while mate choice is actually produced at a smaller scale (within S1-S5; green areas), and in addition, there are some trait heterogeneity at this scale (between S1 to S5). Therefore, a way to estimate the SCE is to measure the statistic (Pearson's r, C scaled , or C rough ) at the pooled level minus the average value within homogeneous groups (Groups 1 and 2). Note that SCE is expressed in the same units than the statistic used

2.
Estimating for every pair the value of D (D = |X male − X female |) and D max for each population (D max = |X max − X min |). X is the value of the trait (shell size in our experimental model) used in the pair (X male , X female ) or in the population (X min , X max ). The same tolerance is used in all simulations and during empirical estimation (s 2 = 0.01).
3. Solving C from Equation 1. This approach occasionally gives C estimates (C′) larger than 1, and so the way to correct for this will characterize the two alternative statistics proposed: C rough excludes any C value larger than 1, and so the sample size for estimation would be reduced when the data sample size is low and the a priori C values high. Alternatively, C scaled allows all C values, but the final mean estimate is rescaled to range between 0 and 1.

| Validation of the estimation process by simulations (EP simulation)
Simulations were undertaken to validate the mating preference estimations under different scenarios (Table 1) Table 1).
The null case distribution (case 0) considered certain mean (D max /2) and standard deviation (D max /4; named case 0), but considered three further alternative scenarios: case 1 (female mean = 5 × D max /2), case 2 (SD = D max ), and case 3 (SD= D max ; female mean= D max ). Mating preferences were simulated using different tolerances (s 2 = 0.1, 0.01, and 0.001, although as the results were qualitatively similar, only results for 0.01 are presented). Each simulation was repeated 1,000 times.
Once the mating pairs were generated, mating preferences were estimated by using classical Pearson's correlation (r) and C′, as explained above.
These estimates were compared with the a priori true C values and, therefore, the robustness of the different estimators (r, C rough , and C scaled ) was compared by measuring bias (= true C − estimated C), range of estimation, regression coefficient between estimators and true C, and coefficient of variation among computer samplings (which allow inference of sampling robustness), as in Carvajal-Rodríguez and Rolán-Alvarez (2014).

| Validation of the scale-of-choice effect by simulations (SCE simulation)
The SCE is the bias caused by measuring assortative mating at an inappropriate scale (Rolán-Alvarez et al., 2015), and it can be measured by the difference between the estimator (e.g., r or C rough ) at the incorrect scale-the estimator at the appropriate/true scale (Figure 1). In order to investigate how SCE could affect our estimators, an additional set of simulations were performed following the same scenarios used above (Table 1

| Estimating mating preference from wild mating pair (empirical) data
The new estimators (C rough and C scaled ) were applied and com- N pair is number of pairs simulated, D max the maximum possible difference in the population for the trait. Distribution represents four distinct scenarios for mean and variance of the trait across sexes. For the SCE simulation, N group is the number of subgroups simulated, CV the coefficient of variation expected across the simulated subgroups and SD the standard deviation within those groups. Finally, N scenarios are the number of combinations of scenarios in each simulation. Each combination was replicated 1,000 times.

| Validation of estimation process
The robustness of the three estimators of positive mating preference by similarity, C rough , C scaled , and Pearson's correlation coefficient (r) were evaluated ( Table 2). All statistics showed a high and significant (all cases p < .05) linear regression slope, but only Pearson's r and C scaled showed a slope close to 1 (Table 2), and hence, these two estimators of mating preference (C) were relatively more robust than the C rough considering this property (Figure 4). This  sample sizes (N pair ; see Table 2). Additionally, the overall error in estimation of C was relatively moderate for the three estimators (expected mean value should be 0.5), although the bias for the C rough and C scaled increased somewhat at the largest sample size (Table 2).
These properties were rather insensitive to the different scenarios proposed (see low SD in Table 2), and the estimation errors within each scenario were typically small enough to effectively distinguish the C values differing by 0.1 units (except for C rough when estimating values of C larger than 0.6, Figure 4). When using simulation averages across scenarios, Pearson's r and C scaled outperformed C rough in estimating C. The sampling robustness of estimators was measured by the mean coefficient of variation of the different statistics across the 1,000 computer simulations within the scenarios (summarized across scenarios by averages ± SD): CV Pearsonr = 1.0% ± 1.5; CV Cscaled = 392% ± 818.0; CV Crough = 9% ± 1.3. The results clearly showed that both Pearson's r and C rough outperformed C scaled , which showed severe sampling errors during simulations, which limits the utility of this estimator.

| Validation of SCE
The sensitivity of each estimator (Pearson's r, C scaled , and C rough ) of mating preference by similarity to the SCE bias was evaluated (Tables 1   and 3). The results were averaged across subgroups (CV) and level of variation within groups (SD) as they did not produce any great variation on SCE trends (except under small CV; see Figure 5). Pearson's r, as expected, showed a strong bias for those scenarios that included the pooling of subgroups which showed a certain degree of heterogeneity (i.e., CV > 0.5). The bias was rather insensitive to sample size (N pair ; Table 3). The SCE biased the estimation of mating preference (C) based on Pearson's r from low to high values (up to 0.6), while C estimates based on C rough and C scaled were biased to a much lesser extent (moderately to no bias; Figure 5). In this case, C scaled and C rough clearly outperformed Pearson's r and were less sensitive to the problems associated with the SCE.

| Application of the new estimators of mating preference to empirical data
The estimations of mating preference (C) using C scaled were too noisy to be useful (see above) and are not presented, but the estimated C based on C rough averaged across the five homogeneous subgroups and its corresponding estimated SCE are illustrated in Table 4. The C rough across samples was relatively similar between species (around 0.4).
The estimated SCE was, however, reduced by half in E. malaccana and E. radiata, although it remained similar in L. fabalis, which indicates the ability of C rough to reduce the SCE bias at least in those cases with the highest SCE.

| DISCUSSION
A mathematical description of any potential evolutionary mechanism is a prerequisite to fully understand and predict biological phenomenon . In this study, we proposed a new method to estimate positive mating preference by similarity using the FND mathematical function (Carvajal-Rodríguez & Rolán-Alvarez, 2014).
This strategy can be used to infer mating preference in organisms that show positive assortative mating for size (or any similar trait in both sexes). The method is based on the assumption that, without a priori knowledge of the genetic mechanisms contributing to the preference, a mathematical function can amalgamate all the preferences into one variable, C (sensu Gavrilets, 2004Gavrilets, , 2014Thibert-Plante & Hendry, 2011;Débarre, 2012;Thibert-Plante & Gavrilets, 2013;Roff & Fairbairn, 2015), which itself could be determined by many quantitative loci. Such a strategy has been used since the origin of quantitative genetics (Falconer & Mackay, 1996) but previously was only used for making theoretical predictions. The FND Gaussian-like function is a modification of the traditional methods used in theoretical studies which is able to accommodate empirical data and as such provides a link between the two research approaches.
F I G U R E 4 All estimated statistics (Pearson's r, C scaled , and C rough ) for the true strength of the preference (C) simulated a priori and the three different mating pair sample sizes (N pairs ). The true values regressed against the estimated statistics are shown. Note that the statistics are basically not affected by the six scenarios (N group × SD; see Table 1) considered For the first time, we were able to formally evaluate how the classical Pearson's r is related to the strength of mating preference using the FND function in a combination of simulations and empirical data.
Interestingly, Pearson's r showed excellent estimation properties and allowed efficient estimations of mating preferences in all scenarios, except in situations when the SCE was simulated. Here we showed that when SCE was not present, Pearson's r could be a valuable tool to estimate the strength of mating preference, but as shown previously (Rolán-Alvarez et al., 2015), when the SCE is present, it can produce huge bias in Pearson's r as an estimator of mating preference.
Therefore, for any model organism in which SCE has been experimentally shown to be small or negligible, Pearson's r can be used to infer mating preference directly in the wild. Using such an approach, theoretical predictions and empirical studies can be connected, which allows fundamental progress in our understanding of the role of mating preference in driving genetic differentiation in the wild (Gavrilets, 2004;Roff & Fairbairn, 2015;Servedio, 2016). Future theoretical predictions regarding mating preferences by similarity can, therefore, be empirically verified whenever the study has corrected for any potential SCE bias. Where there is a bias due to the SCE, there are only two known strategies to correct for this. The first uses the information of nonmating individuals surrounding the mating pair to reorganize the dataset into a series of homogeneous subgroups and then uses the averaged of Pearson's r across subgroups to correct for the pooled estimate (see Figure 1 and Table 4; also see Rolán-Alvarez et al., 2015;Ng et al., 2016). This strategy is feasible, but it requires extra sampling effort and cannot be used on published data that have not applied an appropriate experimental design.
The second strategy makes use of specific estimators of the strength of mating preference, such as the C scaled or C rough described here. From our evaluation of the two new estimators (C rough and C scaled ), one of them (C scaled ) showed ideal theoretical properties but failed when applied to realistic sample sizes, while the other (C rough ) showed limited theoretical properties but behaved reasonably well at low sample sizes. We also empirically demonstrated that C rough greatly reduced the SCE bias as compared with the traditional approach using Pearson's r in some cases (C < 0.6). This new estimator, therefore, on a different algorithm from Pearson's r which is known to be extremely affected by outliers (Rousselet & Pernet, 2012). In addition, our methods indirectly limit the effects of outliers due to partial rescaling (or excluding extreme values), and this could be part of the explanation. Nevertheless, more research will be needed to understand this kind of bias (or its absence) in statistics related either directly or indirectly to correlation coefficients. The new proposed estimators could, however, be further improved in the future, ideally to a level without bias due to the SCE in estimating mating preference.
Several authors have called for improvement in the relationship between theoretical and empirical methodologies to allow progress in evolutionary theory (Gavrilets, 2004(Gavrilets, , 2014Roff & Fairbairn, 2015;Servedio, 2015). In this paper, we add to the strategy initiated by Roff and Fairbairn (2015) trying to connect both frameworks, by proposing a new estimator (C rough ) for mating preferences (as well as checking the applicability of Pearson's r for the same purpose) from mating pairs directly captured in the wild. Although the method could be problematic for estimating unbiased preferences, it may be sound and robust enough for comparing estimates among groups and testing hypotheses on mate choice evolution. The priority would be to use this function in theoretical and empirical studies, as well to check whether theoretical predictions can be supported or rejected by observations in the field.  The SCE is experimentally obtained by C pooled − C averaged (see Ng et al., 2016;Rolán-Alvarez et al., 2015). *p < .05; **p < .01; ***p < .001.
T A B L E 4 C rough estimates from experimental data, and experimental estimations of the scale-of-choice effect (SCE) for this new estimator, which can be compared with the SCE estimates from Pearson's r Our approach could be applied, for example, to ecological models for studies of speciation, such as Littorina saxatilis (Rolán-Alvarez, 2007), stick-insects (Nosil, Egan, & Funk, 2008;Nosil & Feder, 2013), the stickleback (Kraak & Hart, 2011;Hendry, Hudson, Walker, Räsänen, & Chapman, 2011;Vines et al., 2016), or cichlids Martin, 2013;Seehausen et al., 2008), to check whether theoretical predictions match empirical estimates in the wild. Additionally, this methodology could be used for testing whether runaway sexual selection could contribute to the allopatric process of speciation (reviewed in Servedio & Bürger, 2014).

ACKNOWLEDGMENTS
We thank N. Santamaría for administrative contributions for discussions and comments on the manuscript. This work was funded by Spanish "Ministerio de Economía y Competitividad" (codes BFU2013-44635-P y CGL2016-75482-P), Fondos Feder and Xunta de Galicia