Wolf in sheep's clothing: Model misspecification undermines tests of the neutral theory for life histories

Abstract Understanding the processes behind change in reproductive state along life‐history trajectories is a salient research program in evolutionary ecology. Two processes, state dependence and heterogeneity, can drive the dynamics of change among states. Both processes can operate simultaneously, begging the difficult question of how to tease them apart in practice. The Neutral Theory for Life Histories (NTLH) holds that the bulk of variations in life‐history trajectories is due to state dependence and is hence neutral: Once previous (breeding) state is taken into account, variations are mostly random. Lifetime reproductive success (LRS), the number of descendants produced over an individual's reproductive life span, has been used to infer support for NTLH in natura. Support stemmed from accurate prediction of the population‐level distribution of LRS with parameters estimated from a state dependence model. We show with Monte Carlo simulations that the current reliance of NTLH on LRS prediction in a null hypothesis framework easily leads to selecting a misspecified model, biased estimates and flawed inferences. Support for the NTLH can be spurious because of a systematic positive bias in estimated state dependence when heterogeneity is present in the data but ignored in the analysis. This bias can lead to spurious positive covariance between fitness components when there is in fact an underlying trade‐off. Furthermore, neutrality implied by NTLH needs a clarification because of a probable disjunction between its common understanding by evolutionary ecologists and its translation into statistical models of life‐history trajectories. Irrespective of what neutrality entails, testing hypotheses about the dynamics of change among states in life histories requires a multimodel framework because state dependence and heterogeneity can easily be mistaken for each other.


| 3349
AUTHIER ET Al. in life histories of iteroparous organisms in the wild (Cam, Aubry, & Authier, 2016). This topic has sustained a steady number of new publications (≈20 every year since 2010) and a large number of citations (>300 per year) in the ecological research community over the past 5 years (Appendix S1). An important question is to what extent, if any, are variations in life histories heritable. Individual heterogeneity is the oft-used term to explain variation in life-history traits, variations which can fuel adaptive phenotypic evolution (Wilson & Nussey, 2010). By contrast, individual stochasticity refers to variations that are irrelevant to natural selection: "[t]he movement of an individual through its life cycle is a random process, and although [death] is certain, the pathways taken to that destination are stochastic and will differ even between identical individuals…" (Caswell, 2009). Individual stochasticity sensu Caswell (2009) manifests itself in the diversity of life-history trajectories: Would the same individual be able to live its life a second time; the trajectory would be different simply because of sampling variation. What causes these different trajectories is at the core of recent studies debating the relative importance of within-and betweenindividual variance in life histories, and in particular, whether observed variations are selectively neutral or not (Bonnet & Postma, 2016;Cam et al., 2016;Jenouvrier, Péron, & Weimerskirch, 2015;Plard, Bonenfant, Delorme, & Gaillard, 2012;Steiner & Tuljapurkar, 2012).
Two mechanisms can explain how variations in individual trajectories may arise (see Cam et al., 2016 for a review): (1) state dependence and (2) heterogeneity. True state dependence sensu Heckman (1981) is the process whereby "past experience has a genuine behavioral effect in the sense that an otherwise identical individual who did not experience the event would behave differently in the future than an individual who experienced the event." Although originally framed in the context of human behavior, this definition is not restrictive but could include other processes (e.g., physiology). An event means the realization of a random variable such as successful breeding. State dependence describes a Markovian process in which experiencing an event affects an individual and changes its propensity to re-experience the event. State dependence can generate variation (also known as "dynamic heterogeneity") in a population of identical individuals, simply because of sampling variance in the realization of stochastic processes such as survival or reproduction (Caswell, 2009;Orzack, Steiner, Tuljapurkar, & Thompson, 2011;Steiner & Tuljapurkar, 2012;Steiner, Tuljapurkar, & Orzack, 2010;Tuljapurkar, Steiner, & Orzack, 2009). This sampling variance is a within-individual variance or "individual stochasticity" sensu Caswell (2009).
In contrast, the heterogeneity hypothesis starts from the concern that all relevant variables that can affect an individual's fate may not be available to the investigator, either because they are unknown, difficult to measure or not directly observable (Mood, 2010;Wienke, 2010). Assuming this heterogeneity is fixed (time-invariant), it is hidden to investigators but may account for the correlation between states in the life-history trajectory of a given individual. Cam et al. (2016) speak of hidden permanent demographic heterogeneity (HPDH). HPDH statistically translates into a between-individual variance due to unobserved differences at the individual level, upon which natural selection may act if individual variation is heritable (Chambert, Rotella, & Garrott, 2014;Wilson & Nussey, 2010). HPDH is commonly estimated with generalized linear mixed models (Bolker et al., 2009) or mixture models (Fay, Barbraud, Delord, & Weimerskirch, 2016): It corresponds to "individual quality" (Bergeron, Baeta, Pelletier, Réale, & Garant, 2011;Cam et al., 2016;Wilson & Nussey, 2010). HPDH does not exclude random variation, but relies on statistical models to partition the variance in individual trajectories into between-individual and within-individual components (Van de Pol & Wright, 2009). Such variance-partitioning models have been heavily used, in part because they address pseudo-replication when measurements from the same individual are still correlated after accounting for observed covariates.
Both state dependence and HPDH are concerned with accounting for changes in states along an individual's life-history trajectory.
With HPDH, any change in state is short lived and an individual quickly returns to a trajectory reflecting its latent "quality". This results in repeatability in one state (success or failure) with short-lived visits to the other state. With state dependence, change can be more sustained in the case of positive state dependence, or short lived in the case of negative state dependence (trade-off). In other words, positive state dependence leads to some degree of persistence in state, where past experience of failure (for example) increases the probability of experiencing failure again. This can be illustrated by the "spiral of failure" phenomenon in behavioral ecology, whereby breeding failure is associated with increased probabilities of dispersing and divorcing, both being in turn associated with increased probability of unsuccessful reproduction in the following year (Naves, Monnat, & Cam, 2006 This question motivated the development of a Neutral Theory for Life Histories (hereafter NTLH, e.g., Orzack et al., 2011;Steiner & Tuljapurkar, 2012;Steiner et al., 2010;Tuljapurkar et al., 2009). NTLH studies concluded that the bulk of variations in life histories observed in natura across a wide panel of species was selectively neutral, a result that took empiricists by surprise and called for renewed vigilance against adaptationism (Gould & Lewontin, 1979;Pigliucci & Kaplan, 2000). NTLH investigations take a Markovian model with state dependence as an appropriate null model. This null model is deemed neutral because it does not include HPDH: All individuals are assumed to have the same phenotype (Steiner & Tuljapurkar, 2012). Parameters of this null model are estimated from data and subsequently used to predict the distribution of lifetime reproductive success (LRS). Lifetime reproductive success is an individual-level metric: It is the number of descendants an individual produces over its reproductive life span, conditional on the individual having recruited into the breeding population. As a measure of individual fitness, the shortcomings of LRS are well known (Metcalf & Parvard, 2007), although this does not prevent its use in practice (e.g. Mourocq et al., 2016). Lifetime reproductive success, which was used extensively in testing for the presence of HPDH in NTLH studies (Bonnet & Postma, 2016), is scrutinized at the population level (Tuljapurkar et al., 2009). If the predicted (populationlevel) distribution of LRS matches the observed one, support for the NTLH is inferred. Quoting Tuljapurkar et al. (2009): "[State dependence] can provide a 'neutral' standard by which one can assess whether the observed distribution of fitness components, such as the LRS or average annual reproduction, are influenced by certain kinds of [HPDH]. In particular, a lack of fit between an observed distribution of, say, the LRS and a distribution generated solely by dynamic heterogeneity [i.e., state dependence] suggests that the observed distribution is influenced by fixed differences among individuals." This statement is normative about the ability and usefulness of LRS to infer HPDH in life histories: If a good fit is obtained between the observed distribution of LRS and the one predicted from parameter estimates following model fitting, then this model is likely to provide a good approximation to the true data-generating mechanism.
This statement has, however, not been empirically evaluated. In other words, can a data-generating mechanism that involves only HPDH predict a population-level LRS distribution that is identical to one expected from a data-generating mechanism that involves only state dependence? Can the current NTLH methodology lead to model misspecification? Model misspecification happens when data are analyzed with, and inferences drawn from a model that is very different from the true data-generating mechanism (Burnham & Anderson, 2002:158). Previous tests of NTLH (e.g., Orzack et al., 2011;Steiner & Tuljapurkar, 2012;Steiner et al., 2010;Tuljapurkar et al., 2009) implicitly assumed that model misspecification has no impact on parameter estimation.
We empirically explore this premise with Monte Carlo simulations, using LRS and entropy as in the standard NTLH framework (Tuljapurkar et al., 2009). We compare models including no HPDH and no state dependence, state dependence only, HPDH only, and both state dependence and HPDH. Our focus is on accurate estimation of parameters in statistical models of life histories: This study thus complements the power analysis of Bonnet and Postma (2016). Simulations generate data according to a known process, hereafter referred to as the true data-generating mechanism. Knowledge of the true value of parameters enabled to assess bias.

| ASSUMPTIONS
For data simulation, we made the following assumptions. Individuals have recruited into the breeding population and survived to a second breeding occasion: The shortest breeding trajectory includes two occasions. Once recruited, no individual skips breeding, but there is (Bernoulli) variability in breeding success. HPDH can be described by a bivariate normal distribution with possible correlation between individual survival and breeding success propensities.
We simulated both survival and breeding trajectories to mimic data commonly used to investigate HPDH (e.g., Cam, Link, Cooch, Monnat, & Danchin, 2002;Cam et al., 2013). Only simulated data on reproductive life histories were subsequently analyzed, but survival allowed us to take into account that reproductive life span is a random variable. We chose a mean survival (conditional on having recruited in the breeding segment) of = 0.75 and simulated 1,000 individual life histories of maximum length 42. These values were chosen to reflect the biology of long-lived iteroparous organisms, such as black-legged kittiwakes (Rissa tridactyla) about which conflicting results on NTLH have been published (Cam et al., 2013;Steiner et al., 2010). Across simulation scenarios, the expected sample size is 1,000 × (1 + ∑ 40 k=0 k ) ≈ 5,000.
This is a large sample size, both with respect to the number of individual trajectories, and their length: Asymptotic justifications of statistical tests should hold.

| NOTATIONS
Greek letters denote the true value of a parameter, which is unknown in analyses of empirical data. Greek letters with a hat denote esti- where is an intercept, is the parameter quantifying state dependence, and ( i,1 , i,2 ) are individual random effects:  2 denotes a bivariate normal distribution. The parameters 2 and 2 repro are variance parameters, and cor is a correlation parameter bounded between −1 and 1. The multivariate normal distribution was used for convenience as often in empirical investigations (e.g., Authier, Cam, & Guinet, 2011;Browne, McCleery, Sheldon, & Pettifor, 2007;Cam et al., 2013).

| Between-individual Variations
correlation corresponds to the reverse situation of "individual quality" whereby successful breeders survive best (Cam et al., 2002).

| State dependence
is the parameter quantifying state dependence, that is the succession of 0s (failures) and 1s ( is the log odds ratio (OR) of how much having experienced the event (success i(t−1) = 1) affects the probability of experiencing it again relative to not having experienced it (success i(t−1) = 0). For example, an odds ratio of 2 means that a successful breeder is twice more likely to breed successfully again compared to a failed one. Likewise, an odds ratio of ½ means that a successful breeder is half as likely to breed successfully again compared to a failed one. An odds ratio less than 1 ( < 0) would be evidence of a cost of reproduction, and a trade-off between current and future reproduction. The interpretation of as an odds ratio is convenient for simulating realistic amounts of state dependence. is assumed the same for all individuals.

| Entropy
The transition matrix i describes how an individual i that survived from t−1 to t can change states: where 11 i is the probability of a failed breeder to fail again, 12 i is the probability of a failed breeder to become successful, 21 i is the probability of a successful breeder to fail its next breeding attempt, and 22 i is the probability of a successful breeder to breed successfully again.
The average entropy, which measures randomness in transitions between states, of the transition matrix i is: where � = ( 1 , 2 ) are the stationary proportions of failures and successes along trajectories (Tuljapurkar et al., 2009

| Within-individual Variations
Individual stochasticity sensu Caswell (2009)  The variability studied by Steiner et al. (2010) is defined at the level of a trajectory. In practice, the true trajectory that any given individual follows is only known up to that individual's death, which can be purely accidental (e.g., an unfortunate lightning strike). Because death censors a life-history trajectory, it is impossible to know whether any two individuals that had the same trajectory until their death would have remained on the same trajectory had they both lived longer. It is pragmatically impossible to know whether two individuals are truly sharing the same trajectory. The withinindividual variance in Equation 11 is defined at the individual level for any time step along a realized trajectory. Because variances are additive, the total variance is the sum of all the steps along that trajectory.
All the above equations (Equations 1−11) involved parameters that are unknown in practice and must be estimated from data. With Monte Carlo simulations, the true values of parameters are known: Model misspecification and its impact on parameter estimates (e.g., bias) can be investigated.

| Data simulation
We simulated life histories under several scenarios corresponding to different values for the set of 4 parameters ( , , repro , cor).
We considered data-generating mechanisms with only HPDH and with none or both (Table 1). There were 7 × 4 × 4 × 5 = 560 different combinations of values for ( , , repro , cor). For each combination, 500 random datasets were simulated and analyzed (Appendix S2: Fig. S1-S2). In all simulation scenarios, the parameter μ was set to 0, corresponding to an average breeding success probability of 0.5.

| Model fitting
To keep the problem tractable, mean survival was kept constant throughout life in simulations. Furthermore, we only analyzed breeding trajectories: Observed reproductive life span was treated as data.
In other words, although the true data-generating mechanism is a joint model of breeding success and survival, only data on breeding success were analyzed with probabilistic models (

| Inference
Our aims were to assess the empirical validity of the current NTLH framework to draw inferences about the processes generating variation in life histories. For each fitted model, estimated parameters were used to predict individual LRS conditional on the observed survival trajectory. To quantify the discrepancy between the predicted and observed distribution of LRS, Kolmogorov-Smirnov tests have been used in previous studies of NTLH, but were found underpowered (Bonnet & Postma, 2016). We used the Earth Mover Distance to compare two histograms or distributions. Each histogram may be viewed as a pile of sand and the Earth Mover Distance reflects the amount of sand multiplied by the distance needed to turn one pile into the other (Gottschlich & Schuhmacher, 2014). A smaller Earth Mover Distance reflects a better match between predictions and observations.
For each dataset and fitted model, estimated parameters (̂ ,̂ ,̂ repro ) were stored to assess bias and to compute individual stochasticity and the scaled entropy. Entropy is a measure of randomness in transitions between breeding success and failure (Tuljapurkar et al., 2009). The scaled entropy varies between 0 and 1, with 1 corresponding to complete randomness in the succession of states in the trajectory. It has been argued that HPDH should decrease entropy (Bonnet & Postma, 2016;Jenouvrier et al., 2015;Tuljapurkar et al., 2009) and that entropy could thus be used to infer the presence of HPDH. We used estimated parameters (̂ ,̂ ) to compute the within-individual variance ̂ 2 within for the breeding trajectory 01. Since variances are additive, the total variance is the sum the rightmost terms in Equation 11. The estimated within-individual variance term was compared to its true value for each simulation scenario to assess bias, that is whether 2 within =̂ 2 within or not. Finally, the log-likelihood of each simulated dataset under each model (Table 2) was recorded to compute the Bayesian information criteria (BIC; Link & Barker, 2009). BIC weights ̂ BIC were then calculated assuming that random individual effects count for 1 additional (variance) parameter. We used BIC rather than Akaike information criterion (AIC) because the later tends to favor overcomplex models in large data sets (chapter 7 in Link & Barker, 2009 The magnitude of true state dependence is given on an odd ratio (OR) scale: Values above 1 (below 1) correspond to positive (negative) effects of previous state on current state. In the scenario with no heterogeneity, a negligible value of HPDH (0.01) was used to avoid numerical errors when simulating random effects from a bivariate normal distribution (Equation 3) and when computing the relative bias.
T A B L E 1 Summary of parameter values used in simulations of life history data T A B L E 2 Models for analyzing breeding trajectories. * i is an individual univariate random effect because only breeding success was analyzed at this stage. Figure 1 summarizes results with the Earth Mover Distance:

| RESULTS
Irrespective of the true data-generating mechanism, all fitted models could predict the observed population-level distribution of LRS in a similar manner. Exceptions were for models  nil and  NTLH (that is models excluding HPDH) whose Earth Mover Distance was the greatest in scenarios with small to large true HPDH. This distance was further increased with increased value of true positive state dependence (e γ > 1) and increased value of the individual-level correlation in HPDH (Appendix S3: Fig. S1). In scenarios with both state dependence and HPDH, model  HPDH had the smallest distance, even though it was not the true data-generating mechanism. In other words, a misspecified model could outperform the true model at predicting a LRS distribution that was most similar to the observed one.  Fig. S2). This behavior was also apparent for models  HPDH and  full although the ranking was invariable with the smallest, intermediate, and largest scaled entropy for  NTLH ,  full and  HPDH , respectively. All models, whether they included state dependence, heterogeneity, or excluded them both, could generate similar values of scaled entropy: Scaled entropy was insensitive to the true data-generating mechanism (Appendix S3: Fig. S2). Figure 3 shows the bias in estimated state dependence (̂ ). When data were analyzed with a correctly specified model, state dependence estimates were on average unbiased. However, when data were analyzed with a different model than the true data-generating mechanism, estimates from all models other than  full were biased. In particular, estimates from model  NTLH were always positively biased when HPDH was present in the data (Appendix S3: Fig. S3). Even in scenarios where true state dependence was nil (e γ = 1), estimates from model  NTLH were positively biased, with the magnitude of the bias depending only on the magnitude of true HPDH (Appendix S3: Fig.   S3). Figure 4 shows the bias in estimated HPDH ̂ repro . When data were analyzed with a correctly specified model, HPDH estimates were on average unbiased. However, when data were analyzed with a different model than the true data-generating mechanism, estimates were in F I G U R E 1 Tile-plots of the average estimated Earth Mover Distance (across 500 simulated datasets) between the observed and predicted distribution of LRS for each simulation scenario. True values of state dependence (e γ on the Odds-Ratio scale) and HPDH (σ repro ) are on the x− and y− axes, respectively. Each panel corresponds to one of the four models used to analyze data. Vertical black lines bracket scenarios in which heterogeneity ( HPDH ) is the true data-generating mechanism. Horizontal black lines bracket scenarios in which state dependence ( NTLH ) is the true data-generating mechanism. At the intersection,  nil is the true data-generating mechanism. Everywhere else,  full is the true datagenerating mechanism. Earth mover distance general biased. In particular, estimates from model  HPDH were always biased when data were generated with true state dependence (Appendix S3: Fig. S4). In scenarios where true HPDH was nil ( repro = 0), estimates from model  HPDH or  full were positively biased. In scenarios where both state dependence and HPDH were truly present, estimates from model  HPDH were always biased with the sign and severity of the bias depending on the sign and magnitude of true state dependence (Appendix S3: Fig. S4).
Results with respect to within-individual variance estimation, or individual stochasticity sensu Caswell (2009), are summarized in  S5). When data were analyzed with a different model than the true data-generating mechanism, estimates were in general biased. In particular, estimates from model  NTLH were always biased when HPDH was present in the data. Likewise, estimates from model  HPDH were biased when state dependence was present in the data.
Inference with BIC is summarized in Figure 6. Across all the scenarios, the information theoretic approach was able to identify the correct data-generating mechanism among the competing models. Although a small amount of HPDH or state dependence were more difficult to detect with certainty, ̂ BIC were always the largest for the true datagenerating mechanism: Using BIC in a multimodel framework avoided the problems linked to model misspecification detailed in Figures 1-5 above (see also Appendix S3: Figs S6 and S7).

| Inference with misspecified models
An important methodological choice in evolutionary ecology is that of an appropriate null model (Pigliucci & Kaplan, 2006;chapter 10 (Fox, Roff, & Fairbairn, 2001). With that goal in mind, a model's ability to predict the observed distribution of LRS is not sufficient; the selected model has to provide more than populationlevel predictions, it has to be the one that reflects best the biological processes that gave rise to the data at the individual level (Cam et al., 2016).
We investigated the use of scaled entropy as a summary statistic for inference about the processes generating variation in life-history trajectories. Entropy is a measure of randomness in transitions between states in a sequence: It measures uncertainty in predicting the next state (Adami, 2016 Because  HPDH excludes γ, estimates (̂ ) are by definition exactly 0. Horizontal black lines bracket scenarios in which  NTLH is the true datagenerating mechanism. At the intersection,  nil is the true data-generating mechanism. Everywhere else,  full is the true data-generating mechanism. Estimates from the true data-generating model should have no bias on average. Actual bias values (rounded to the nearest integer) are displayed on each tile. Biases larger than 100% in magnitude were capped at 100% Bias (%) specification that best approximated the correct data-generating mechanism. Our simulations were carried out with both a large number of individuals and a large reproductive life span in order to allow asymptotic justifications to hold. In practice, sample size may be small and asymptotic justifications may not hold. To achieve correct model specification, it is paramount to consider a set of candidate models simultaneously to assess their fit, and not to rely on LRS prediction or entropy computation for inferring the true data-generating mechanism driving the intra-and interindividual dynamics of change among states in life-history trajectories.  (Hsiao, 2014). Endogeneity is a special case of omitted variable bias, in which the bias arises because one of the variables used to explain the other is itself caused by the phenomenon it seeks to explain. Because it can itself be the result of unobserved HPDH, past state acts as a proxy for "individual quality", resulting in spurious state dependence (Hsiao, 2014 We focused on the variance in states along a trajectory to define both a within-and between-individual component, which is consistent with previous investigations focusing on HPDH (Cam et al., 2002;. In contrast, previous studies of the NTLH defined a within-and a between-trajectory variance, which is only F I G U R E 4 Tile-plots of the average bias in estimated HPDH ̂ repro (across 500 simulated datasets) for each simulation scenario. True values of state dependence (e γ on the Odds-Ratio scale) and HPDH (σ repro ) are on the x− and y− axes, respectively. Each panel corresponds to one of the four models used to analyze data. Vertical black lines bracket scenarios in which  HPDH is the true data-generating mechanism. Horizontal black lines bracket scenarios in which  NTLH is the true data-generating mechanism. Because  NTLH excludes σ repro , estimates (̂ repro ) are by definition exactly 0. At the intersection,  nil is the true data-generating mechanism. Everywhere else,  full is the true data-generating mechanism. Estimates from the true data-generating model should have no bias on average. Actual bias values (rounded to the nearest integer) are displayed on each tile. Biases larger than 100% in magnitude were capped at 100% Using data on American college basket-ball competition recasted as SAD, Warren II et al. (2011) showed that neutral models of community ecology could very well predict the patterns in these data: A non-neutral process at a microlevel can generate a seemingly neutral pattern at a macrolevel. Similarly, we demonstrate here that in the current NTLH framework, predicting LRS distributions does not provide robust evidence of neutrality.

| OLD WINE IN NEW BOTTLES
The problem of biased estimates has been diagnosed more than 30 years ago by econometricians (Heckman, 1981): "misspecification of the heterogeneity process gives rise to an erroneous estimate of the impact of the true effect of past state on current outcome." Econometricians now always consider a model with both state dependence and unobserved heterogeneity (e.g., Arulampalam, Booth, & Taylor, 2000;Bartels, Box-Steffensmeier, Smidt, & Smith, 2011;Halliday, 2008;Hsiao, 2014) to avoid mis-estimating either. In scenarios where true state dependence γ was negative, consistent with a trade-off between current and future reproduction, estimated state dependence ̂ could be positive. The bias was always positive and F I G U R E 5 Tile-plots of the average bias in estimated within-individual variance (̂ 2 within ) for a 01 trajectory in breeding success of an average individual for each simulation scenario. True values of state dependence (e γ on the Odds-Ratio scale) and HPDH (σ repro ) are on the x− and y− axes respectively. Each panel corresponds to one of the four models used to analyze data. Vertical black lines bracket scenarios in which  HPDH is the true data-generating mechanism. Horizontal black lines bracket scenarios in which  NTLH is the true data-generating mechanism. At the intersection,  nil is the true data-generating mechanism. Everywhere else,  full is the true data-generating mechanism. Estimates from the true data-generating model should have no bias on average. Since variances are additive, any bias can blow up with the large sample considered in our simulations of a panel of 1,000 individuals. Actual bias values (rounded to the nearest integer) are displayed on each tile Negative state dependence is classically interpreted as a cost of reproduction, which can hardly be considered a neutral process with respect to natural selection (Flatt & Heyland, 2011). As Munoz and Huneman (2016) underscored in the context of the Neutral Theory of Biodiversity, "one can have neutral patterns with non-neutral processes". The current formulation of NTLH may obfuscate the difference between patterns and processes. While the realization of stochastic processes governed by state dependence creates variation among individual trajectories that may be evolutionary neutral, the biological and evolutionary processes at the origin of state dependence need not themselves be neutral. Consequently, it is necessary to clarify why the state dependence model sensu (Heckman, 1981), which also underpins population projection matrix models (Caswell, 2001;Lefkovitch, 1965), is an evolutionary neutral model of lifehistory evolution. NTLH requires a philosophical clarification like the one Munoz and Huneman (2016)

| NEUTRALITY IN LIFE-HISTORY STUDIES
Trade-offs between life-history traits are commonly understood as allocation constraints acting on the development and physiology of organisms. They are a cornerstone of evolutionary biology (Roff, Mostowy, & Fairbairn, 2002) and translate into negative state dependence in statistical models (Nichols, Hines, Pollock, Hinz, & Link, 1994 and HPDH (σ repro ) are on the x− and y− axes, respectively. Each panel corresponds to one of the four models used to analyze data. Vertical black lines bracket scenarios in which  HPDH is the true data-generating mechanism. Horizontal black lines bracket scenarios in which  NTLH is the true data-generating mechanism. At the intersection,  nil is the true data-generating mechanism. Everywhere else,  full is the true data-generating mechanism. The best predictive model, which would be selected for inference, has the largest weight. Actual values (rounded to one decimal) are displayed on each tile Model selection ω BIC nullification of the heterogeneity (variance) parameter. However, it is unclear why it is more appropriate to nullify this parameter, rather than the state dependence parameter in the first place. HPDH affects in no small way the interpretation of logistic regression output, and ignoring it a priori is not recommended (Mood, 2010 de Jong, 1986) and should be considered context-dependent rather than static (Roff et al., 2002).
An analogy to two other disciplines, namely population genetics and community ecology, was called upon to promote consideration of nonselective mechanisms in life-history studies (Steiner & Tuljapurkar, 2012). The underlying motivation was to avoid an automatic presumption of adaptationism (Gould & Lewontin, 1979) that could flourish under the label of "individual quality" (Bergeron et al., 2011;Steiner & Tuljapurkar, 2012). As a result, NTLH was developed to provide a baseline model that could be used as a working null hypothesis for empiricists (Steiner & Tuljapurkar, 2012). However, in contrast to population genetics or community ecology (Chave, 2008;Leigh, 2007), NTLH has not been seized upon by empiricists. This state of affairs calls for an in-depth investigation into the limits of drawing analogies between different disciplines developing neutral theories and using null models.
In NLTH, neutrality is deduced not because there are no biological differences between individuals, but because these differences are deemed fitness-irrelevant. The latter conclusion is reached when a state dependence statistical model can predict well the populationlevel distribution of LRS without assuming between-individual differences (heterogeneity) in vital rates beyond the usual differences in age-and stage-structured populations. LRS plays the same role as allele frequencies in population genetics, or SAD in community ecology.
In the latter two disciplines, the emphasis is on accounting for temporal changes in the population-level distribution of allele frequencies and community-level SAD (Chave, 2008), respectively. In both population genetics and community ecology, a microlevel process stemming from the finite size of populations, drift, can induce random change in the distribution of a macrolevel statistic (allele frequency or SAD). In NTLH, state dependence provides a mechanism to explain change in states within an individual life-history, but does not necessarily explain change in the population-level LRS distribution over time. Support for NTLH was claimed from predicting accurately the LRS distribution at a specific time point, not from accounting for a change over time in that distribution. While change in the magnitude or sign of state dependence can generate different LRS distribution (Appendix S2), the current emphasis of NTLH is not on the origin, the sign, or the magnitude of the state dependence parameter. These blind spots call for further theoretical elaborations of NTLH and renewed attention on the focus of tests of NTLH: predicting from individual/microlevel processes (i) the population/macrolevel LRS distribution at a specific time point, or (ii) changes in that distribution over time.

| CONCLUSION
Researchers interested in the evolutionary significance of interindividual variations in longitudinal trajectories should not use LRS or entropy to infer the correct data-generating mechanism behind life histories. Neither LRS prediction nor entropy estimation can diagnose model misspecification. NTLH studies (e.g., Tuljapurkar et al., 2009) should be re-evaluated with standard inferential tools, such as information criteria. In theory, the latter can be used to compare and accurately select a model accounting for the data-generating mechanisms behind longitudinal life-history data. Although with real data collected on wild populations the true data-generating mechanism is, in general, unknown and out of reach (Burnham & Anderson, 2004;Link & Barker, 2009), investigators should start from a set of statistical models reflecting a complete set of nonmutually exclusive hypotheses (including alternative biological and evolutionary scenarios) on individual life-history evolution (Browne et al., 2007;Cam et al., 2013;Chambert, Rotella, & Higgs, 2014;. Then, they should proceed with a multihypothesis framework (Chamberlin, 1965) based on information theoretic inferential tools (BIC, AIC, or analogs in the Bayesian framework; Gelman, Hwang, & Vehtari, 2014;Link & Barker, 2009).
These criteria allow determining a model whose complexity (e.g., non-Gaussian HPDH, second-order Markovian state dependence) is supported by the data at hand (Burnham & Anderson, 2004;Link & Barker, 2009), comparing non-nested models and assessing the relative importance of heterogeneity and state dependence in individual life-history evolution.

CONFLICT OF INTEREST
None declared.

AUTHORS' CONTRIBUTION
All the authors have equally contributed to this work. MA, LMA, and EC involved in conceptualization, analysis, and writing.