Concordance between stabilizing sexual selection, intraspecific variation, and interspecific divergence in Phymata

Abstract Empirical studies show that lineages typically exhibit long periods of evolutionary stasis and that relative levels of within‐species trait covariance often correlate with the extent of between‐species trait divergence. These observations have been interpreted by some as evidence of genetic constraints persisting for long periods of time. However, an alternative explanation is that both intra‐ and interspecific variation are shaped by the features of the adaptive landscape (e.g., stabilizing selection). Employing a genus of insects that are diverse with respect to a suite of secondary sex traits, we related data describing nonlinear phenotypic (sexual) selection to intraspecific trait covariances and macroevolutionary divergence. We found support for two key predictions (1) that intraspecific trait covariation would be aligned with stabilizing selection and (2) that there would be restricted macroevolutionary divergence in the direction of stabilizing selection. The observed alignment of all three matrices offers a point of caution in interpreting standing variability as metrics of evolutionary constraint. Our results also illustrate the power of sexual selection for determining variation observed at both short and long timescales and account for the apparently slow evolution of some secondary sex characters in this lineage.

Proportionality between intraspecific covariation and interspecific covariation is suggestive of macroevolutionary divergence constrained to occur along "genetic lines of least resistance" and in a manner predicted under a neutral model of evolution by mutation and drift (Lande, 1979;Lynch & Hill, 1986;Schluter, 1996).
It is important to recognize, however, that concordance between intra-and interspecific covariance matrices is not a prediction exclusive to a "genetic constraint" hypothesis and the key to understanding the resemblance between these two matrices might actually be found in a third matrix, γ, summarizing the pattern of nonlinear selection on single traits or on trait combinations (Lande & Arnold, 1983;Phillips & Arnold, 1989). The evolution of genetic covariances after a single generation of selection depends largely on nonlinear selection Phillips & McGuigan, 2006), and, ultimately, standing trait variance is predicted to align with γ (Cheverud, 1984;Jones, Arnold, & Bürger, 2003Zeng, 1988). Theoretically, the evolution of genetic covariances need not be reflected in the Pmatrix (Roff, Prokkola, Krams, & Rantala, 2012;Willis, Coyne, & Kirkpatrick, 1991); however, empirical treatments have often found similarities between genetic and phenotypic covariances (Cheverud, 1988(Cheverud, , 1996Marroig & Cheverud, 2001;Roff, 1995;Steppan, Phillips, & Houle, 2002).
If patterns of nonlinear selection are preserved over long timescales (i.e., among related taxa), γ may also predict the structure of D if evolution occurs along "selective lines of least resistance" (Arnold et al., 2001;Schluter, 1996). Using the adaptive landscape metaphor popularized by Simpson (1953), this postulates that despite taxa evolving toward different (taxon-specific) optima within a given lineage, peaks tend to be clustered in a ridge-like arrangement on the landscape. This implies similarities among related species in the alignment of γ, possibly due to ecological demands that themselves show phylogenetic signal (Losos, 2008;Revell, Harmon, & Collar, 2008;Wiens & Graham, 2005).
Thus, one possibility is that both within-and among-species covariance (P and D) will align with γ, underscoring the difficulty in disentangling the effects of selection on divergence from those of genetic constraints inferred from standing intraspecific trait variance (Conner, 2012). That is, the predictions of models emphasizing "genetic constraint" and "selection" are not mutually exclusive. Although separate lines of empirical study offer some support for intraspecific covariances (Brodie, 1992;Hunt, Blows, Zajitschek, Jennions, & Brooks, 2007;Revell et al., 2010; also see Roff & Fairbairn, 2012) or among-species covariances  corresponding with the pattern of nonlinear selection, we are not aware of an empirical study that has evaluated the concordance of among all three sets of parameters; that is, whether γ can simultaneously predict patterns of covariance observed at both micro-and macroevolutionary scales. In this study, we employed a genus of true bug (Phymata, Order: Reduviidae) to test the conjecture that patterns of both standing trait variation (in one species) and divergence among related taxa are explained by strong nonlinear sexual selection. We also discuss the implications of our findings with respect to the evolution of sexual dimorphism.

| Background and study organisms
The genus Phymata comprises over 100 species and subspecies (Froeschner, 1988;Kormilev, 1962), with most occurring in the New World. For the purposes of this study, we do not distinguish between recognized species or subspecies and treat these as equivalent "taxa." We should note that this treatment assumes no gene flow between taxa. Although we are unable to predict how introgression would influence among-species covariance in trait means, a potential concern has been raised for an analogous issue with respect to the shape of the Gmatrices of interbreeding populations, whereby gene flow may inflate (co)variance along the direction of greatest difference between population means (Guillaume & Whitlock, 2007). The genus belongs to a basal and relatively old monophyletic clade in the Reduviidae (Hwang & Weirauch, 2012;Weirauch & Munro, 2009), but phylogenetic relationships at the species level have yet to be resolved.
Ecologically, all Phymata are thought to be generalist sit-and-wait predators, and coloration may have been shaped by selection for crypsis (Schuh & Slater, 1995). Yet, sexual dimorphism in size and coloration appears to be widespread in the genus (Handlirsch, 1897;Kormilev, 1962;Melin, 1930) although quantitative treatments have only been performed for a handful of taxa (Mason, 1977;McLain & Boromisa, 1987;Punzalan & Rowe, 2015). In one species (P. americana americana, Figure 1), we have previously detected multivariate sexual selection in a wild population, including negative nonlinear selection favoring intermediate values/combinations of melanic color patterns (Punzalan, Rodd, & Rowe, 2010). In this particular species, dark coloration appears to serve a thermoregulatory role in mediating male mating success Punzalan & Rowe, 2013;. A previous comparative study of sexual dimorphism among a much smaller subset of Phymata taxa suggested macroevolutionary conservatism of some components of male color pattern as well as biogeographical variation (i.e., across -species clines) consistent with melanism serving an important function in these taxa (Punzalan & Rowe, 2015).

| Collection and rearing of P. americana americana for P-matrix estimation
In 2013 and 2015, we collected 4th and 5th instar nymphs from a field site (Koffler Scientific Reserve, King, Ontario, Canada) housing a stable population of ambush bugs. Bugs were maintained individually in cages consisting of a transparent plastic tube (10.16 cm long, 1.27 cm diameter; 19.5 ml capacity) with a transparent rubber lid fitted with a flexible hole (Aquatube #53 floral watering tube). Nested within each tube was a semipermeable clear plastic spacer (2 cm long) that allowed air and small feeder insects to pass through. The tubes were mounted, lid down onto a feeding apparatus that provided bugs with live Drosophila melanogaster (Drosophilidae) and/or Megaselia sp. (Phoridae) per day (mean = 16.5, SD = 7.7; estimated from a separate assay). Every 4 days, cages were also provisioned with larger prey-either one adult or three 4-to 8-d-old pupae of Calliphoridae and/or Sarcophagidae, reared from ground beef. Every other day, bugs were checked for molts and the positions of cages on the feeding apparatus were randomized. All insects were maintained in a temperature-controlled room at 25 ± 2°C, 25% RH, with a 14L: 10D fluorescent light cycle.

| Trait measurement, scaling, and standardization
In this study, we focus on three traits that are sexually homologous and often-sexually dimorphic. Following established methodology, we measured pronotum width (PN)-a standard measure of overall sizeas well as measures of mean dorsal darkness (MD) and mean lateral darkness (ML). The latter two traits are proxies of melanism of the integument in two distinct developmental units of the thorax (i.e., the pro-and mesothorax, respectively). Briefly, melanism was measured, using digital image analysis software, as the "value" (average number of black pixels) on a standardized patch of integument (see Punzalan, Cooray, Rodd, & Rowe, 2008, Punzalan & Rowe, 2015. All three measured traits could be treated either as ratio or loginterval scale types (Houle, Pelabon, Wagner, & Hansen, 2011). The types differ in their utility for biological interpretation but are also accompanied by limitations in representing variability in a given dataset. For example, both provide valid representations regarding the relative variability of traits; however, the latter assumes that differences in trait values are not of interest (Hansen & Houle, 2008;Houle et al., 2011). Given the difficulty in making a priori assumptions regarding the importance of such differences, we performed tests of concordance among matrices and/or their eigenvectors using both meanstandardized data and the raw, natural log-transformed data. Mean standardization has several properties that are desirable for this study, including appropriate scaling of variability of each trait by its absolute size (Houle et al., 2011) while also accommodating differences in trait means, corresponding to slight differences in methodology used to obtain data for each of the three matrices. For example, although methods were always internally consistent within each dataset, trait measures for γ and P were obtained using methods optimized for measurement of live bugs (albeit in different years) while D was measured from preserved specimens mounted on entomological pins. For estimation of the former two matrices, phenotype data was standardized by trait means within each sample (i.e., the mean of P. americana americana estimated from the field mating success study and each of the two common-garden studies, respectively); for the latter, phenotypic data were divided by the trait means derived from P. americana americana obtained using the protocol for museum specimens. That is, comparisons among selection coefficients, intraspecific variability, and interspecific divergence are conducted in units of P. americana americana trait means.
Log-transformation of data is often adopted on account of a number of well-known statistical benefits. In the present dataset, however, log-transformation actually had undesirable effects (i.e., generating skew in already normally distributed phenotypic data), which may have contributed to some problems encountered in some analyses of this dataset (e.g., Dmatrix estimation using a mixed model approach; discussed below).
In all analyses, results using both mean-standardized and natural log-transformed data were qualitatively identical. Consequently, we primarily focus on the analyses using mean-standardized estimates except for a few key statistical tests, for which we present the results from both sets of analyses.

| Estimating phenotypic selection
Phenotypic selection gradients were derived from a previous field study of sexual in a wild population of P. americana americana (Punzalan et al., 2010). Using the original digital photographs from the prior study, we obtained measures of MD and ML and, following the conventional method, estimating selection gradients based on the partial regression coefficients of relative fitness on the variance-standardized phenotype distribution (Brodie, Moore, & Janzen, 1995;Lande & Arnold, 1983;Stinchcombe, Agrawal, Hohenlohe, Arnold, & Blows, 2008). We report the conventional variance-standardized selection gradients for the homologous traits (PN, MD and ML) in Table 1 but convert these to mean-standardized gradients (Hereford, Hansen, & Houle, 2004) in corresponding comparisons with P and D (discussed in detail below).
T A B L E 1 Trait distributions for male Phymata from museum collections Mean and standard deviations (in parentheses) for pronotum width (PN) in mm, average darkness for mean dorsal (MD), and mean lateral melanism (ML) measured in units of (average) number of pixels. N is sample size The matrix of nonlinear selection gradients (γ) was diagonalized to find the canonical axes (m) of nonlinear selection (Blows & Brooks, 2003;Phillips & Arnold, 1989), and significance testing of curvature was performed using a randomization procedure to derive a null distribution of eigenvalues for eigenvectors calculated from the permuted data (Reynolds, Childers, & Pajewski, 2010). We interpreted the eigenvector with the largest negative eigenvalue to represent the direction of strongest multivariate stabilizing selection (m max ).
Although we perform subsequent analyses using the explicitly multivariate approach advocated by some (Blows, 2007;Walsh & Blows, 2009), the eigenvectors of nonlinear selection were closely aligned with original trait space (i.e., approximately stabilizing selection on MD), permitting a relatively straightforward biological interpretation (Conner, 2007).

| Estimating intraspecific trait covariation, the P-matrix
The Pmatrix was estimated from wild-caught subadults (nymphs), maintained under common-garden conditions in the laboratory through adulthood, in two separate years (see Section 2, Collection and rearing of P. americana americana for Pmatrix estimation) for information on rearing conditions). Although body size (PN) is fixed upon adult emergence, we maintained bugs until 14d of (adult) age because dark coloration traits may take up to 2 weeks to reach their asymptotic values (Punzalan, Cooray et al., 2008). Subsequently, we sexed and photographed bugs for later measurement of the three traits. In total, 50 males in year 1 and 58 males in year 2 were available for the estimation of the Pmatrix. Standard errors of each element were estimated using a delete-one jackknife approach (Manly, 1997) in R (http://www.R-project.org) using the package "bootstrap" version 2015.2 (Efron & Tibshirani, 1993). For mean-standardized comparisons involving P, we conducted analyses using data pooled between years (but standardized separately within year to remove differences in multivariate means), as well as when considering Pmatrices separately. Results were remarkably consistent irrespective of whether we used pooled data or analyzed P separately by year (i.e., Pmatrices in the 2 years were very similar; data not shown). For simplicity, we report the results from the pooled mean-standardized data.

| Estimating among-species covariation, the D-matrix
We employed a straightforward approach (see Schluter, 1996)  benefit of the mixed model approach is that it allows for estimates of uncertainty in the elements of D. We employed this approach as well, using restricted maximum likelihood in the MIXED procedure in SAS (v. 9.2, Cary, NC) to estimate D for the mean-standardized traits.
It was not possible to derive such estimates using the natural logtransformed dataset because models failed to converge.

| Measuring concordance among the three matrices and significance testing
First, we tested for similarity between P and D, using the Flury (1988) hierarchy and implemented using the common principal components analysis program (Phillips & Arnold, 1999). CPCA has the ability to distinguish matrix concordance of various degrees, ranging from matrix equality, proportionality, concordance of (some) eigenvectors to unrelated matrices. To find the most probable model of matrix similarity, we employed both the model building (based on the lowest score of Akaike's Information Criterion) and "jump-up" (based on pair-wise model comparisons) (Phillips & Arnold, 1999).
To test the degree to which nonlinear (in this case, negative) selection shapes intraspecific variation, we compared the alignment m max (the direction of greatest negative nonlinear selection in γ) with the eigenvector associated with the least variability in P, or p min (the direction of least trait variance/covariance), by means of the absolute value of the inner product (or vector correlation, ρ) between m max and p min .
As the (inverse of the) phenotypic covariance matrix is used in the estimation of γ, one concern is that this could generate spurious correlations with P. However, in this study, P was estimated separately from the selection study, so an observed correlation cannot be strictly an artifact of this mathematical relationship.
For each sample of P (i.e., year), significance testing was performed using a permutation test, whereby observed trait values from the Pmatrix dataset were shuffled (within traits) prior to calculation of a random Pmatrix and its associated eigenvectors. The permutation test assessed the likelihood of obtaining the observed (or greater) absolute value of ρ by chance (10,000 iterations), given the multivariate phenotype distribution. The same procedures were used to perform the analogous test, comparing m max with the direction of least across-species multivariate divergence (d min ). These analyses are similar to the approach taken by other authors (e.g., Schluter, 1996) in that the reference vector (m max ) is not allowed to vary and, thus, implicitly assumes this direction to be without error.
We also caution that a potential bias arises in comparisons of P and γ, stemming from the errors in estimation of the latter, that causes the dominant (canonical) vectors of nonlinear selection to tend toward alignment with dimensions of low phenotypic variance (Morrissey, 2014). An alternative approach is to use projection pursuit regression to estimate the vector (projection) that explains the most covariance with relative fitness (Morrissey, 2014;Schluter & Nychka, 1994). In this study, no significant linear selection was observed, allowing for a relatively straightforward interpretation of the projection as a direction that describes (stabilizing) nonlinear selection (see Results). We calculated this projection using the program provided by D. Schluter (https://www.zoology.ubc.ca/~schluter/wordpress/software/), over a range of possible smoothing parameters (λ = −10 to λ = 10 in increments of 2) with 5,000 random projections. Coordinates of the projection were invariant throughout almost the entire tested range (i.e., for all λ > −10). To our knowledge, the potential alignment bias does not apply to the comparisons of γ and D.
Note that, our approach to comparing alignment of eigenvectors differs from the approach used by some previous studies (e.g., Blows, Chenoweth, & Hine, 2004;Revell et al., 2010) that compared the direction of weakest selection (i.e., ω max , the leading eigenvector of the negative inverse of γ) with the direction of most variability (e.g., g max , p max, or d max ). Our rationale for essentially doing the reverse (i.e., aligning the axis of strongest selection with the axes of least variation) is that it provides a more direct test of the prediction that variance will be most constricted in the principal direction of stabilizing selection. The utility of the minor eigenvectors has been recognized by other authors in analogous evaluations of Dmatrices (Stock, Campitelli, & Stinchcombe, 2014). Our alignment approach is also advantageous in that we employ the direction on the estimated selection surface that has the most statistical support (e.g., eigenvectors associated with low eigenvalues could simply reflect low statistical power in a selection analysis rather than convey biological information). We emphasize that our approach does not assume that our estimates of phenotypic sexual selection are necessarily reflections of net selection. That is, we find it unrealistic to expect stabilizing sexual selection to predict where all/most variation and divergence ought to lie-surely, a considerable amount of evolution is expected to reflect selection via other (unmeasured) fitness components. Instead, our focus is on identifying whether stabilizing sexual selection has contributed to restricted relative variability as predicted by theory.
We also employed a second approach, directly estimating the length (e) of the projection P and D, respectively, onto the space We acknowledge that our analyses of concordance between D and the other two matrices are limited by a lack of phylogenetic information and our analyses essentially assume a "star" phylogeny. This can potentially result in a conflation between the effects of the actual divergence rate matrix and coancestry (see ) on the observed Dmatrix. We address these limitations in more detail in Section 4.

| An adaptive ridge generated by sexual selection
Phenotypic selection analyses of sexual selection on three traits in  Table 1 m max (λ = −0.522, permutation test: p = .0319). Irrespective of data transformations, the loadings of m max describe an axis that primarily separates MD from PN and ML (Table 3). Thus, the canonical selection surface was quite closely aligned with the original trait space with most curvature stemming from selection on MD, or possibly on a linear combination of MD and ML (Figure 2).

| Alignment of phenotypic variance and divergence with the adaptive ridge
Common principal components analysis using mean-standardized data revealed that P and D (Table 4) did exhibit significant similarity; the model-building procedure indicated that "common eigenvectors" are the best model, although the "jump-up" approach supported matrix proportionality (Table 5) Table 1.
T A B L E 3 Linear (θ) and nonlinear selection (λ) gradients along each canonical vector (m) of the variance-standardized γ-matrix (see Table 1 for trait abbreviations) T A B L E 4 The Pmatrix estimated from 108 Phymata americana americana males reared under common-garden conditions (A) and the D-matrix based on the trait means for 37 Phymata taxa when estimated directly as the covariance among species means (B) versus from a mixed model (C

| DISCUSSION
We compared a set of micro-and macroevolutionary parameters for Phymata to address nonexclusive explanations for two broadly observed and puzzling phenomena: the prominence of relatively restrained divergence (including phenotypic stasis) within a lineage and proportionality between intra-and interspecific trait covariation. Our results illustrate a simple resolution, based on the predicted effects of stabilizing selection at both short and long evolutionary timescales. We discuss our findings with respect to the potential role of sexual selection in shaping divergence in this group of insects, and in light of the ongoing debate regarding the relative importance of selective and genetic constraints.

| Stabilizing selection aligned with variation within-and between-species
We confirmed the first prediction, that standing phenotypic trait covariance in P. americana americana conforms to the pattern of nonlinear selection, exhibiting a characteristic reduction of trait variance in the direction of strongest stabilizing selection. This microevolutionary expectation derives from the effects of within-generation selection on the genetic variance-covariance matrix (Phillips & Arnold, 1989) and/ or a developmental system that interacts with the environment in a manner that preserves beneficial trait combinations (Cheverud, 1996;Marroig & Cheverud, 2001). The similarity between P (or G) with γ has been corroborated in several other studies (Brodie, 1992;Conner & Via, 1993;Revell et al., 2010), although fewer have performed so with respect to sexual selection McGlothlin, Parker, Nolan, & Ketterson, 2005).
We also found support for the second prediction that macroevolu- Average e is the mean value from 1,000 random directions of m. Coordinates of m-vectors correspond to loadings on PN, MD, and ML (abbreviations the same as in Table 1). Unrelated

Comparison in Flury hierarchy
The "model building" approach tests the model in each line against the model indicated on the line below it. The "jump-up" approach tests each model against a model of "Unrelated." Abbreviations df and AIC refer to degrees of freedom and Akaike's Information Criterion, respectively.
T A B L E 5 Results of common principal components analysis of matrix similarity between the mean-standardized Pand Dmatrices, using two approaches to result in peak displacements. Nonetheless, the data suggest that these displacements occur within a relatively restricted range, fully consistent with many models of macroevolution (e.g., an Ornstein-Uhlenbeck process; Hansen, 1997) and consistent with Simpson's (1953) notion of fitness peaks remaining within an adaptive zone.
There is some empirical evidence consistent with adaptive landscapes remaining stable for long enough periods to generate repeatable and predictable patterns of phenotypic divergence (e.g., Mahler, Ingram, Revell, & Losos, 2013). A more direct evaluation of this conjecture (requiring the estimation of γ in multiple, related taxa; Arnold et al., 2001) is a challenging feat and beyond the scope of the present study.
We acknowledge a number of limitations and concerns presented by our dataset; first, our estimates of standing variance employed P instead of G to illustrate the potential alignment of phenotypic variance with relevant descriptors of selection and divergence, and this is a well recognized issue of contention. The use of P in the present study is purely a consequence of logistic difficulties (i.e., multigenerational, mass rearing) associated with our study system that currently precludes the estimates of G. While we agree that we must be cautious about our interpretation of P, our data are simply descriptors of an existing pattern of intraspecific covariation and we show that this appears to be aligned with other relevant evolutionary parameters. It is certainly possible that, in reality, P does not resemble the underlying Gmatrix, but if so, it is not clear how environmental (or nonadditive genetic) sources of covariance should generate the alignment we observe.
Second, we rely on estimates of P obtained for a single species.
As with estimates of γ, ideally one would have estimates of P for all species and analyses of alignment with D could be performed using the among-species average P (or G) matrix as a more accurate estimate of the (time) average Gmatrix, upon which the neutral expectation for divergence under drift holds even with fluctuating variances (see Lande, 1979). Additionally, multiple estimates of P could be used to evaluate several related questions including their degree of with species-specific selection gradients (Arnold et al., 2001), as well as the rate and direction of divergence among species in intraspecific covariation (e.g., Berner, Stutz, & Bolnick, 2010;Steppan et al., 2002).
A third point of concern is that γ has a tendency to align with P, generating potentially spurious statistical concordance between eigenvectors of the respective matrices (i.e., p min and m max ), when using the orthogonal approach embraced by most previous authors (Blows & Brooks, 2003;Phillips & Arnold, 1989; see Morrissey, 2014).
The degree to which this will always pose a problem in interpreting empirical data is not clear, but one approach is to employ alternative methods to estimating the major axes of γ. In the present study, we compared the principal directions obtained by canonical and projection pursuit approaches and found strong agreement between the two, suggesting that the alignment we describe was not simply an artifact of the canonical approach.
Fourth, the absence of a phylogeny for these taxa prevents analyses that account for coancestry, which itself can impose covariance in D. Phylogenetic data (using D from independent contrasts; for example, Baker & Wilkinson, 2003;Revell, 2007;Revell et al., 2007) or by incorporating phylogeny directly into the hypothesis tests of divergence  can be used to evaluate whether the pattern of divergence is seen in Phymata differs from expectations under a neutral model of mutation-drift (Lande, 1979;Lynch & Hill, 1986). An emerging empirical consensus, however, is that genetic drift is a poor explanation for patterns of divergence over most timescales (Arnold, 2014 (Dodson & Marshall, 1984;McLain & Boromisa, 1987;. The pattern of sexual selection also appears to depend on demographic parameters (e.g., sex ratio and density; Punzalan et al., 2010) as well as microclimatic variables (Punzalan & Rowe, 2013), presumably because of how it mediates the efficacy of searching. Thus, our results are suggestive of some degree of conservatism in aspects of both the mating system and ecology of Phymata. Darwin (1871) postulated sexual selection largely to explain the exaggeration and diversification of secondary sex characters seen within lineages. Subsequently, authors have come to view sexual selection primarily as an agent of diversification (Andersson, 1994 Figure 4). This is consistent with prior studies pointing to melanism as a principal target of divergent sex-specific selection (e.g., Punzalan & Rowe, 2015;. Furthermore, these data suggest that sexual dimorphism evolves readily despite positive male-female macroevolutionary correlations (Figure 4) potentially arising from sexually concordant expression of genetic variance (Baker & Wilkinson, 2001;Reeve & Fairbairn, 2001). Clearly, selective constraints on a given trait (or trait combination) do not necessitate constraints on sexual dimorphism itself.

| Alignment of genetic and selective lines of least resistance
One of the most debated subjects in evolutionary biology is with regard to the importance and persistence of genetic constraints.
While genetic constraints must exist at a fundamental level (Gould & Lewontin, 1979), many have questioned whether estimates of standing trait variability have any importance for evolutionary response over the long term (Conner, 2012;Futuyma, 2010;Pigliucci, 2006).
In the present study, we found that intra-(P) and interspecific variation (D) were, in fact, concordant but also that both were closely aligned with the features of the adaptive landscape. That is, our findings are consistent with intra-and interspecific variation being shaped by similar selective processes (i.e., nonlinear selection). An analogous F I G U R E 4 Bivariate plots depicting contributions of divergence in male traits to an index of total (multivariate) sexual dimorphism (d i ) (left panels) and the among-species covariance between male and female homologous traits (right panels). Each row corresponds to plots for each trait separately. The index d i was computed as the male-female Mahalanobis distance for mean-standardized phenotypes considering the three measured traits. Methods for trait measurement of females were identical to those used for males (see Section 2). Sample sizes for females ranged from 2 to 106 (median = 7). PN is measured in mm, and MD and ML are in units of average pixel number (i.e., darkness). Total dimorphism is unit free conclusion comes from a theoretical study that predicts "triple alignment" between mutational and genetic variances with the pattern of stabilizing selection (Jones, Bürger, & Arnold, 2014; also see Jones, Arnold, & Bürger, 2007. It is tempting to ascribe such results to the primacy of selection over genetic (including mutational) constraints at both micro-and macroscales. However, "selection" and "genetic constraints" are, of course, not exclusive alternatives. We show here that, even though stabilizing selection appears to explain the pattern of restricted trait covariation within-and between-species, this does not preclude the existence of substantial evolutionary potential within this boundary [i.e., e(m max )]. Bolstad et al. (2014) reported evidence of divergence principally restricted to directions that mirror the pattern of standing trait covariance, despite apparently ample evolvabilities in any given direction. Undoubtedly, trait (i.e., genetic and especially mutational) covariance imposes constraints on evolutionary response and the timescales over which this covariance is preserved is an open issue-but if the metric used to estimate genetic constraint (i.e., standing variation) itself conforms to the adaptive landscape, then the distinction between the two becomes blurred. At the very least, we believe our results highlight the need for caution when attempting to interpret standing patterns of trait variation necessarily as indicators of past and future macroevolutionary constraints.

ACKNOWLEDGMENTS
We are grateful to the various museums and their staff for facilitating the comparative work, to C. Weirauch, A. Michael, and S. Frankenberg for some timely taxonomic information, and to S. De Lisle, N. Rollinson, A.G. Jones, and two anonymous reviewers for comments on previous drafts of the manuscript. Thanks also to L. Yun for providing bug food, to H. Rodd for laboratory resources, and to J. Stinchcombe for discussions on analyses. This work was funded by a Theodore Roosevelt Collection Study Grant (AMNH) to DP and by a Canada Research Chair (NSERC) grant to LR.