The case of muddled units in temporal discounting

While parameters are crucial components of cognitive models, relatively little importance has been given to their units. We show that this has led to some parameters being contaminated, introducing an artifactual correlation between them. We also show that this has led to the illegal comparison of parameters with different units of measurement, which may invalidate parameter comparisons across participants, conditions, groups, or studies. We demonstrate that this problem affects two related models: Stevens' power law and Rachlin's delay discounting model. We show that it may even affect models which superficially avoid the incompatible units problem, such as hyperbolic discounting. We present simulation results to demonstrate the extent of the issues caused by the muddled units problem. We offer solutions to avoid the problem in the future and to aid in re-interpreting existing datasets.

It does not make sense to compare 7 meters with 8 seconds and ask, "Which is bigger?" The comparison is not allowed because one cannot compare numbers with different physical units; we have an idiom for this: "it is like comparing apples and oranges". In this article we show that neglecting the units of psychological parameters can lead to apples-and-oranges type problems and, ultimately, to mistaken conclusions.

A demonstration with Stevens' power law
Stevens' (1975) power law describes the relationship between physical magnitudes and their psychological equivalents. In Stevens' power law, the psychological magnitude ψ(I) of physical magnitude or intensity I is given by

$$\psi(I) = \lambda I^{a} \tag{1}$$

where λ and a are free parameters.
The law may well correctly capture the transformation from physical to psychological magnitude. However, there is a problem with the λ parameter which means it cannot be used to measure an individual difference and cannot be compared across individuals. This is because Stevens' law is parameterised incorrectly, so that the units of one parameter are muddled with the value of another parameter. The concepts of dimensional invariance (Fourier, 1822; Maxwell, 1891; Krantz, 1972; Stewart, Scheibehenne, & Pachur, 2019) and meaningfulness (Falmagne, 1985; Falmagne & Narens, 1983) can be applied in mathematical models of psychology by considering the units of psychological parameters. Let's consider the perception of visual length, and particularly of lines. If length is measured in the International System of Units (SI) unit of metres (m), then I has units of m. This means that $I^{a}$ has units of $\mathrm{m}^{a}$. If ψ(I) is to be a unitless psychological scale, or at least a scale with its own psychological units, then it must be free of the physical units. This means that λ must have the reciprocal units of $I^{a}$ so that the units cancel out. Thus λ must have units of $1/\mathrm{m}^{a} = \mathrm{m}^{-a}$. The key problem here is that λ has units which depend upon a. Stewart et al. (2019) have shown, for similar models, how this will lead to estimates of λ and a that are highly correlated, and that it is illegal to compare λ values across, for example, individuals with different values of a. When a differs, the units of λ differ, and comparing magnitudes with different units is not permitted.
We have highlighted a problem with the units of λ in Stevens' law, though λ is often not the parameter of core interest in psychophysical modelling. Instead it is the exponent a that is of primary consideration. The exponent a has been tabulated, in reviews of the experimental literature, for more than 20 physical continua, including loudness, brightness, length and area, tastes and smells, temperature, pressure, texture, vibration, weight, duration, and even electric shocks. The λ parameter is of lesser theoretical interest, because it is determined, in part, by the properties of the judgement scale for ψ, which are somewhat arbitrary and determined by the experimenter. For example, λ will differ depending on whether the scale runs from 0-10 or 0-100. But later in this paper, when we are considering temporal discounting, the analogue of λ is of core theoretical interest.

https://doi.org/10.1016/j.cognition.2020.104203 Received 30 January 2019; Received in revised form 20 December 2019; Accepted 23 January 2020

Fechner's law is dimensionally correct
In contrast to Stevens' law, Fechner's (1966) law,

$$\psi(I) = \lambda \log(I / I_{0}) \tag{2}$$

is meaningful and dimensionally invariant. The ratio of the physical quantity I and the threshold physical quantity $I_{0}$ (at which the perception ψ(I) is zero) is unitless, because the units of I and $I_{0}$ cancel in the ratio. Logarithms are also unitless: they are the power to which the base of the log must be raised, and powers are unitless real numbers. This means that λ need only have whatever units are required to match the scale in which ψ(I) is measured.

Fixing the units in Stevens' law
We propose the modified Stevens' power law:

$$\psi(I) = (\gamma I)^{a} \tag{3}$$

where $\gamma^{a} = \lambda$ or, equivalently, $\gamma = \lambda^{1/a}$. Note that a is still a dimensionless quantity, but γ is in inverse units of I, such as 1/metres for length, 1/metres² for area, or 1/days for duration. Also, the units of γ are now appealingly independent of a.
Comparisons between a parameters and between γ parameters are allowable in dimensional analysis; but, as we explain above, comparisons of λ are not. This fix, moving the constant inside the power, is described in Stewart et al. (2019). To summarise, studies which have compared values of λ across or within individuals are in error, because comparing quantities which are not in the same units is not permitted. Our strong suggestion is that, for such studies, the λ parameter be transformed into γ and the results be reinterpreted based on these values. The extent to which this may be a problem, and the feasibility of this suggestion, is explored in the remainder of the paper.
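The following sketch illustrates why the λ comparison is not permitted while the γ comparison is. It uses two hypothetical observers with made-up parameter values (they are not taken from any study) and re-expresses the same fitted curves with length measured in millimetres instead of metres. The function names `lam_in_mm` and `gamma` are our own labels for this illustration.

```python
# Hypothetical Stevens' power law fits, psi = lam * I**a, with I in metres.
# Values are invented purely for illustration.
fits_m = {"A": {"lam": 1.0, "a": 1.0}, "B": {"lam": 0.5, "a": 0.5}}

MM_PER_M = 1000.0


def lam_in_mm(lam_m, a):
    """Re-express lambda when I is measured in mm rather than m.

    psi = lam_m * I_m**a = lam_m * (I_mm / 1000)**a,
    so the same curve has lam_mm = lam_m * 1000**(-a).
    """
    return lam_m * MM_PER_M ** (-a)


def gamma(lam, a):
    """gamma = lam**(1/a), which is in inverse units of I."""
    return lam ** (1.0 / a)


lam_m = {name: p["lam"] for name, p in fits_m.items()}
lam_mm = {name: lam_in_mm(p["lam"], p["a"]) for name, p in fits_m.items()}
gamma_m = {name: gamma(lam_m[name], p["a"]) for name, p in fits_m.items()}
gamma_mm = {name: gamma(lam_mm[name], p["a"]) for name, p in fits_m.items()}

# The ordering of lambda depends on the arbitrary choice of length unit...
print("lambda, metres:", lam_m["A"] > lam_m["B"])    # A larger than B?
print("lambda, mm:    ", lam_mm["A"] > lam_mm["B"])  # A larger than B?
# ...whereas the ordering of gamma does not.
print("gamma, metres: ", gamma_m["A"] > gamma_m["B"])
print("gamma, mm:     ", gamma_mm["A"] > gamma_mm["B"])
```

With these invented values, observer A has the larger λ when length is in metres but the smaller λ when length is in millimetres, even though the underlying curves are unchanged. The γ values simply rescale by the same factor of 1000 for both observers, so their ordering is preserved.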

A case study in time perception and temporal discounting
We illustrate the incompatible units problem in the domain of hot affective emotional states. Multiple studies have found that when people undergo a hot affective state manipulation (e.g. by viewing sexually arousing stimuli) their present bias increases (Wilson & Daly, 2004; Ariely & Loewenstein, 2006; Van den Bergh & Dewitte, 2008; Lemley, Asmussen, & Reed, 2015). That is, they discount future rewards to a greater extent, such that preferences shift toward smaller but sooner rewards compared to larger but later rewards. But what are the cognitive processes that are responsible? Are people's temporal preferences altered in hot states$^{2}$ because of changes in discount rates, or because of changes in subjective time perception, or some combination of the two?
Regardless of the precise measure used, results of the above studies were interpreted as changes in time discounting caused by the experimental hot state manipulation. Caution is needed, however, as these results could also have been driven partly or wholly by changes in subjective time perception. In an excellent series of studies, Kim and Zauberman (2013) found a similar increase in present bias caused by hot state manipulations but, because they also measured subjective time perception, were able to conclude that this change in present bias is driven by changes in subjective time perception rather than changes in discount rates. They measured the relationships between subjective time perception and inter-temporal choice for money under control and hot states. In Study 1 they showed that male participants' subjective time perception was altered by viewing pictures of female lingerie models. They used a procedure to estimate perceived durations from objectively stated durations according to Stevens' power law (see Eq. (1), where I is measuring the duration). Participants indicated subjective time by adjusting the length of a line on the computer screen, relative to a reference duration of 1 month (corresponding to 32.71 mm); therefore I was in units of mm. This resulted in group level fits of $\psi(I) = 0.998\,I^{0.68}$ for the hot condition and $\psi(I) = 0.610\,I^{0.73}$ for the control condition.
One conclusion drawn by the researchers was that participants' subjective time perception was sublinear$^{3}$, based upon the point estimates of the exponents both being below 1 and a non-significant difference between these exponent parameter values. Because the exponent is unitless, this conclusion is not affected by any units problems.
A second conclusion was that perceived time increased in the hot state, such that a fixed duration was perceived as longer. The scaling parameter (λ in our Eq. (1)) increased from the control to the experimental condition. This comparison of λ values is not valid, undermining the conclusion that changes in subjective time perception were responsible. As we have seen, the λ parameters are in units which are also affected by the exponent. Specifically, the group level constant is λ = 0.998 $\mathrm{mm}^{-0.68}$ for the hot condition and λ = 0.610 $\mathrm{mm}^{-0.73}$ for the control condition. These constants are in different units and therefore cannot be compared. Likewise, conducting t-tests or ANOVAs on λ values from participant level fits is also illegal, as they are all in units of $\mathrm{mm}^{-a}$ where a is different for each participant. Instead, λ should be transformed to γ for each participant, which requires the λ and a values for each participant. So even though there were non-significant differences in a between the control and hot state groups, the a values will have been different for each participant, and we do not know whether the groups would have differed significantly when comparing the fitted γ values across control and hot conditions. The best we can do without the participant-level data or parameter estimates is to compute γ at the group level. This results in $\gamma_{hot} = 0.998^{1/0.68} = 0.997$ and $\gamma_{control} = 0.610^{1/0.73} = 0.508$, but these are just point estimates, so we have no way to verify whether there are statistically significant differences between $\gamma_{hot}$ and $\gamma_{control}$. This is also a highly dubious operation: as we will discuss in more detail later in the paper, transforming (λ,a) to (γ,a) parameters for group level summary statistics is invalid because a will vary across participants.
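The group level arithmetic above can be reproduced in a few lines. This is only a check of the γ = λ^(1/a) transformation applied to the two published group level fits; as noted in the text, applying the transformation to group level summaries is itself questionable, so the output should not be read as an estimate of any participant's γ. The dictionary name `group_fits` is our own.

```python
# Group level Stevens' power law fits reported for the two conditions:
# (lambda, a), with I measured in mm.
group_fits = {"hot": (0.998, 0.68), "control": (0.610, 0.73)}

# gamma = lambda**(1/a); both results are in the same units (1/mm).
gamma_group = {cond: lam ** (1.0 / a) for cond, (lam, a) in group_fits.items()}

for cond, g in gamma_group.items():
    print(f"gamma_{cond} = {g:.3f}")
# prints:
# gamma_hot = 0.997
# gamma_control = 0.508
```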
And so we are unfortunately left with uncertainty about the effect of the hot state manipulations in this experiment on subjective time perception.
Our intention is to point out that we simply cannot make claims based on the comparison of quantities with different units. We do not intend to cast doubt upon the role of subjective time perception in hot state manipulations. Indeed, the basic claim seems reasonable given the findings of a previous study (Zauberman, Kim, Malkoc, & Bettman, 2009) which modelled subjective time with the Weber-Fechner Law (which bypasses these concerns) rather than Stevens' power law.

Implications for delay discounting
This units problem is not just restricted to Stevens' power law and magnitude estimation. In the remainder of the paper we outline how this problem filters through into the temporal discounting literature in multiple ways. First we demonstrate that Rachlin's popular discount function suffers from the units problem and we propose a fix. Second, we demonstrate that the popular hyperbolic discount function may also suffer from this problem despite superficially escaping the incompatible units problem.
Here, we illustrate these problems in the domain of inter-temporal choice (also known as delay discounting). The core phenomenon of interest here is how agents make trade-offs between the magnitude of a gain (or a loss) and its immediacy. For example, the present subjective value of £100 now is greater than that of £100 in 40 years because future rewards are discounted by some fraction. This could be driven by many factors, including inflation expectations, the risk of future rewards not materialising, opportunity costs, etc. But how exactly are decisions made about outcomes which occur at different points in time? The general utility-based approach to answering this is to propose that our present subjective value V of a reward R at a given delay D is given by

$$V = u(R) \cdot f(D) \tag{4}$$

where u(R) is a utility function relating objective rewards R to subjective values, and f(D) is a discount function which modulates our subjective values as a function of delay. In the discounting literature it is common to assume a linear subjective value function, i.e. the identity function u(R) = R, in which case u(R) is in units of pounds, euros, dollars, etc. The focus is instead upon the form of the discount function f(D), which we will explore below.
There are a range of popular discount functions which do not suffer from these unit comparison problems:
• Exponential discounting (Samuelson, 1937), where $f(D) = \exp(-kD)$. D is in time units (e.g., days). Here k is in the corresponding inverse time units (e.g., days$^{-1}$).
• Constant sensitivity function (Ebert, Prelec, & Prelec, 2007), where $f(D) = \exp(-(aD)^{b})$. Here a is in inverse time units, and b is dimensionless.
• The Myerson and Green (1995) hyperboloid, where $f(D) = 1/(1 + kD)^{s}$. Here k is in inverse time units and s is dimensionless.
• Double exponential (McClure, Ericson, Laibson, Loewenstein, & Cohen, 2007), where $f(D) = \omega \exp(-k_{1}D) + (1 - \omega)\exp(-k_{2}D)$ (this model might more accurately be called a 'mixture of exponentials'). The mixture component ω is dimensionless and $k_{1}$ and $k_{2}$ are in inverse time units.
Nevertheless, even if a discount function's parameters do not suffer from the incompatible units problem, when comparing parameter values (such as k) across participants, conditions, or studies, it is important to ensure that they are all in the same units. For example, if the unit of time is days in one paper and years in another, then the raw published k values cannot be compared because one will have units of days$^{-1}$ and the other will have units of years$^{-1}$. It is only allowable to compare k values in the same units, and so the k values need to be scaled to the same units$^{4}$. Because discount rates vary drastically across species (half lives (1/k) range from seconds to years or decades; Vanderveldt, Oliveira, & Green, 2016), this mistake could easily be made in a meta-analysis, for example.
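As a small worked example of this rescaling, consider two hypothetical papers reporting exponential discount rates, one in days⁻¹ and one in years⁻¹. The k values below are invented, and the conversion factor of 365 days per year is an assumption (a given study may use a different convention).

```python
import math

DAYS_PER_YEAR = 365.0  # assumed convention; some studies may use 365.25


def per_day_to_per_year(k_per_day):
    """Convert a rate in days^-1 to years^-1.

    exp(-k_day * D_days) = exp(-k_year * D_years) with D_days = 365 * D_years,
    so k_year = 365 * k_day.
    """
    return k_per_day * DAYS_PER_YEAR


k_paper_a_per_day = 0.01   # hypothetical value, reported in days^-1
k_paper_b_per_year = 2.0   # hypothetical value, reported in years^-1

# Comparing the raw numbers suggests paper A found less discounting...
print("raw numbers:", k_paper_a_per_day < k_paper_b_per_year)

# ...but in common units (years^-1) paper A's rate is the larger one.
k_paper_a_per_year = per_day_to_per_year(k_paper_a_per_day)
print("same units: ", k_paper_a_per_year > k_paper_b_per_year)

# Sanity check: the discount fraction for a 30 day delay is unchanged.
f_days = math.exp(-k_paper_a_per_day * 30.0)
f_years = math.exp(-k_paper_a_per_year * (30.0 / DAYS_PER_YEAR))
print(math.isclose(f_days, f_years))
```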

Implications for Rachlin's delay discounting function
Some discount functions suffer from a problem where fitted parameters are unknowingly in different units and therefore are not comparable. This problem affects two prominent discount functions. The first is exponential discounting of subjective (i.e. Stevens' power law scaled) time (Takahashi, Oono, & Radford, 2008),

$$f(D) = \exp(-kD^{s}) \tag{5}$$

and the second is the prominent Rachlin (2006) hyperboloid model$^{5}$, equating to hyperbolic discounting of (Stevens' power law scaled) subjective time,

$$f(D) = \frac{1}{1 + kD^{s}} \tag{6}$$

We proceed to illustrate the issues with the Rachlin discount function given its frequent use in the discounting literature, but the issues we highlight also affect Eq. (5) (see Appendix A).
s is a power, that is, a unitless real number. This means that comparison of fitted values or posterior distributions of s across participants or studies is allowable under dimensional analysis, as s has the same units (in this case, no units). Of course, there may be other issues around parameter trade-offs that make this comparison hard.
However, a units problem does arise with the k parameter. f(D) is a unitless fraction, which means the right hand side of Eq. (6) must also be unitless. As the numerator 1 is unitless, the denominator $1 + kD^{s}$ must also be unitless. The $kD^{s}$ term must be unitless, because it is added to 1, which has no units, and one can only add quantities with the same units. Because D is in units of days (for example), $D^{s}$ is in units of $\mathrm{days}^{s}$. This means that k must have units of $1/\mathrm{days}^{s}$ to cancel with the units of $D^{s}$. Given that s will vary across participants, you cannot compare k across participants as the values are all in different units. For example, when s = 1 then k has units of 1/days, but when $s = \frac{1}{2}$ then k has units of $1/\mathrm{days}^{1/2} = 1/\sqrt{\mathrm{days}}$. Based upon our proposed fix to Stevens' power law (Eq. (3)), we propose the modified-Rachlin discounting function:

$$f(D) = \frac{1}{1 + (\kappa D)^{s}} \tag{7}$$

where $\kappa^{s} = k$ or, equivalently, $\kappa = k^{1/s}$. Note that s is still a dimensionless quantity, but κ is in units of days$^{-1}$, which is appealingly independent of s. Comparisons between s parameters and between κ parameters are allowable in dimensional analysis; but, as we explain above, comparisons of k parameters are not.
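A minimal sketch of the two parameterisations, and of the conversion between them, is given here. The function names are our own; the example values k = exp(−3) and s = 0.7 are the ones used for the simulated observer elsewhere in the paper, with delays assumed to be in days.

```python
import math


def rachlin(delay, k, s):
    """Rachlin (2006) hyperboloid: f(D) = 1 / (1 + k * D**s)."""
    return 1.0 / (1.0 + k * delay ** s)


def modified_rachlin(delay, kappa, s):
    """Modified-Rachlin function: f(D) = 1 / (1 + (kappa * D)**s)."""
    return 1.0 / (1.0 + (kappa * delay) ** s)


def k_to_kappa(k, s):
    """Transform k (units of days**-s) into kappa (units of days**-1)."""
    return k ** (1.0 / s)


k, s = math.exp(-3), 0.7
kappa = k_to_kappa(k, s)

# Both parameterisations describe exactly the same discount curve,
# because (kappa * D)**s = kappa**s * D**s = k * D**s.
delays = [1.0, 7.0, 30.0, 365.0]
original = [rachlin(d, k, s) for d in delays]
modified = [modified_rachlin(d, kappa, s) for d in delays]
for d, f_orig, f_mod in zip(delays, original, modified):
    print(f"D = {d:6.1f} days: Rachlin = {f_orig:.4f}, modified = {f_mod:.4f}")
```

The reparameterisation changes nothing about the predicted choices. It only changes which number is reported as the discount rate, and therefore which comparisons across participants are dimensionally valid.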
The modified-Rachlin function has a number of advantages. First and most obviously, it now becomes legitimate to compare discounting behaviours (using κ) across participants with different subjective time perception (as specified by s). This is a significant advantage because previous comparisons of k across participants or studies will in fact be invalid because they are contaminated by varying values of s.
Second, the κ parameter is now conveniently always equal to the inverse half life (the delay at which a reward is equal to half its objective value) regardless of the value of s. (See this by noting that the delay $D_{half}$ at which the value of the reward is halved can be substituted into Eq. (7) to give $\frac{1}{2} = \frac{1}{1 + (\kappa D_{half})^{s}}$, which means $2 = 1 + (\kappa D_{half})^{s}$ and $1 = (\kappa D_{half})^{s}$ and $1^{1/s} = \kappa D_{half}$ and thus $1 = \kappa D_{half}$.) This was an appealing property of the discount rate in the hyperbolic discount function (Mazur, 1987), but one which was lost in the original Rachlin function.
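The half life property can also be checked numerically. In this sketch the value of κ and the set of s values are arbitrary choices for illustration. For contrast, the last loop shows the half life implied by the original Rachlin function for a fixed k, which follows from setting $kD^{s} = 1$, giving $D_{half} = k^{-1/s}$.

```python
import math


def modified_rachlin(delay, kappa, s):
    """Modified-Rachlin function: f(D) = 1 / (1 + (kappa * D)**s)."""
    return 1.0 / (1.0 + (kappa * delay) ** s)


kappa = 0.05                      # arbitrary illustrative value, days^-1
s_values = [0.5, 0.7, 1.0, 1.5]   # arbitrary illustrative exponents

# At D = 1 / kappa the reward is worth half its value, whatever s is.
value_at_inverse_kappa = [modified_rachlin(1.0 / kappa, kappa, s) for s in s_values]
for s, f in zip(s_values, value_at_inverse_kappa):
    print(f"s = {s}: f(1/kappa) = {f:.3f}")

# With the original Rachlin function and a fixed k, the half life
# D_half = k**(-1/s) changes with s, so 1/k is not a half life in general.
k = math.exp(-3)
rachlin_half_lives = [k ** (-1.0 / s) for s in s_values]
for s, d_half in zip(s_values, rachlin_half_lives):
    print(f"s = {s}: Rachlin half life for fixed k = {d_half:.1f} days")
```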
Third, parameter estimation of (κ,s) will be improved and more robust. The top panel in Fig. 1 shows some simulated data from a delay discounting experiment. The bottom panels show likelihood surfaces for the parameters of the Rachlin and modified-Rachlin discount functions for a single simulated experiment. There is a very clear parameter trade-off which occurs with Rachlin's discount function, as seen by the negatively sloped ridge in the likelihood surface (Fig. 1, bottom left). These parameter trade-offs are often present but are hard to detect using methods which yield only point estimates of parameters (e.g. Gilroy, Franck, & Hantula, 2017); they are visible only with methods which estimate the full likelihood or posterior surface over parameter space (such as Vincent, 2016). However, parameter correlations across participants have been noted in modelling work (such as Peters, Miedl, & Büchel, 2012). The trade-off disappears in the likelihood surface of the modified Rachlin function (Fig. 1, bottom right). This is especially appealing in the context of Bayesian parameter estimation: the highly anti-correlated structure of the likelihood surface in Fig. 1 (bottom left) could make it difficult for some sampling algorithms to estimate the true posterior distribution accurately (see Stewart et al., 2019).
Fourth, a direct consequence of this ridge in the likelihood surface is that errors in the maximum likelihood estimates of the true (k,s) parameters will contain undesirable correlational structure. Fig. 2a shows the distribution of maximum likelihood estimates from a parameter recovery simulation: 200 simulated experiments were run with stochastic choices, followed by maximum likelihood estimation, for an observer with fixed parameters. While the 200 observers were identical, with the same fixed (k,s), scatter of the estimates away from the true (k,s) crosshairs is caused by the stochasticity of the binary responses to the inter-temporal choices. The result is that errors in the maximum likelihood parameters are undesirably correlated. Fig. 2b shows that the modified-Rachlin function fixes this problem: we no longer have this parameter trade-off in the maximum likelihood estimates.
We propose that existing research with (k,s) estimated from the Rachlin function can, and should, be transformed to our modified parameters (κ,s) so that comparisons between participants and studies become valid and relevant. This transformation is a valid approach: we found that maximum likelihood estimates of (k,s) are accurate and map precisely (after the $\kappa = k^{1/s}$ transformation) onto parameter estimates of (κ,s) obtained directly (see Fig. 2c, d). The correlation coefficient between s estimated from the Rachlin and modified Rachlin functions was virtually equal to 1, to within 5-6 decimal places. This was also the case for the correlation coefficient between κ (transformed from the k recovered from the Rachlin function) and the κ recovered from the modified Rachlin function. This is good news: assuming rigorous maximum likelihood estimation procedures were followed, we do not believe that estimation with the Rachlin function would introduce systematic errors in the actual parameters estimated, just that the k parameter is contaminated as described above. If there is doubt, however, about the accuracy of past maximum likelihood procedures, the most prudent approach would be to estimate (κ,s) directly from the archived raw inter-temporal choice data.
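One informal way to see why the ridge in (k,s) space is negatively sloped is offered here as our own illustration, not as the simulation behind Fig. 1. Suppose, as a simplifying assumption, that choice data constrain the half life of the discount curve more tightly than its shape. Every (k,s) pair with $k = \kappa^{s}$ shares the half life 1/κ, so these pairs trace the line log k = s · log κ, which slopes downward whenever κ < 1 in the chosen time units. In (κ,s) space the same set of curves sits at a single value of κ.

```python
import math

# Observer from the paper's simulations: k = exp(-3), s = 0.7.
kappa_true = math.exp(-3) ** (1.0 / 0.7)

s_grid = [0.5, 0.7, 0.9, 1.1]  # arbitrary grid of exponents

# (k, s) pairs that share the half life 1 / kappa_true.
log_k_on_contour = [s * math.log(kappa_true) for s in s_grid]
half_lives = [math.exp(log_k) ** (-1.0 / s)
              for log_k, s in zip(log_k_on_contour, s_grid)]

for s, log_k, d_half in zip(s_grid, log_k_on_contour, half_lives):
    print(f"s = {s}: log k = {log_k:6.2f}, half life = {d_half:.1f} days, "
          f"log kappa = {math.log(kappa_true):.2f}")
```

Along this contour log k falls as s rises, while log κ does not change, which mirrors the tilted ridge versus the upright one described above.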
To probe this mapping between (k,s) and (κ,s) further, we repeated the parameter recovery approach (from Fig. 2) but extended this for multiple true parameter values in Fig. 3. Fig. 3 shows true parameter values chosen from a grid over (κ,s) space, along with recovered parameter values using maximum likelihood estimation. The results are in line with the intuition from Fig. 2, that there is a 1-to-1 mapping between (k,s) and (κ,s) parameter spaces. That is, it should be possible to accurately map to (κ,s) directly from existing estimates of (k,s) obtained from Rachlin's function. There are two concerns which remain however.
The first concern is that when past results are re-examined and (k,s) is transformed into (κ,s), this may well merit reinterpretation of existing findings in the literature. For example, differences in k between groups or conditions could have been interpreted (wrongly) as relevant differences in discount rates between participants, groups, or conditions. But because k is contaminated by s, these differences could have been caused by changes in subjective time perception. This is clear to see in Fig. 3a-c. As stated, κ is unrelated to s and corresponds to the inverse half life$^{6}$. It is clear from Fig. 3b that increases in k could be caused either by an increase in κ while s remains constant, or by κ remaining constant and a decrease in s. This should hopefully underscore the importance of revisiting published studies which make theoretical claims about discounting behaviour on the basis of changes in k obtained from Rachlin's hyperboloid function.

Fig. 1. Simulated delay discounting data (top) and likelihood surfaces for the Rachlin model (bottom left; Eq. (6)) and modified Rachlin model (bottom right; Eq. (7)). The true data generating parameters were (k = exp(−3), s = 0.7), and the simulated response data were generated using the adaptive procedure described by Frye, Galizio, Friedel, DeHart, and Odum (2016) (see Appendix B for simulation details). Points above the indifference curve correspond to those inter-temporal choices where immediate rewards have greater present subjective value, and vice versa. The x-axis is equal to the immediate reward value divided by the delayed reward value. The code to generate this figure is available at https://osf.io/uscmd/.

$^{6}$ Although κ is still the product of both a discounting process and the constant term in Stevens' power law.

B.T. Vincent and N. Stewart, Cognition 198 (2020) 104203

Our second concern is that conversion of existing parameter estimates of (k,s) from the Rachlin discount function to our proposed (κ,s) parameterisation should be done with care. As we alluded to in the case study above, this conversion is only valid when conducted on participant level parameters, not on group mean or median parameter values. To get a sense of why this is the case, we can see from Fig. 3b that group mean or median values of k will be disproportionately influenced by participants with high s values. This is shown further in the histograms in Fig. 3d-e. For example, consider a number of participants with the same discounting behaviour (same values of κ) but with different subjective time perception (different values of s). If we fit with the modified-Rachlin function, then our group level estimate of average κ will be accurate and independent of the varying s values. However, if we fit the same set of participants (i.e. a column of points) with the Rachlin function, then our group level estimate of k will be undesirably influenced by the variation of s. To summarise, researchers wishing to convert (k,s) parameters into our proposed (κ,s) parameter space must do so on a participant level, not on a group mean or median level.
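The difference between converting at the participant level and converting group summaries can be seen with three hypothetical participants who share the same κ but differ in s. All values here are invented for illustration.

```python
import math
from statistics import mean

kappa_true = 0.05            # same discount rate for everyone, days^-1
s_values = [0.5, 1.0, 1.5]   # different subjective time perception

# What fitting the original Rachlin function would ideally return.
k_values = [kappa_true ** s for s in s_values]

# Correct: transform each participant's (k, s) to kappa, then summarise.
kappa_participant = [k ** (1.0 / s) for k, s in zip(k_values, s_values)]
kappa_group_correct = mean(kappa_participant)

# Incorrect: transform the group means of k and s.
kappa_from_group_means = mean(k_values) ** (1.0 / mean(s_values))

print(f"participant level conversion: {kappa_group_correct:.4f}")
print(f"conversion of group means:    {kappa_from_group_means:.4f}")
```

The participant level route recovers the shared κ, whereas converting the group means gives a noticeably different number, because the k values being averaged are not in a common unit.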

How this can lead to incorrect psychological theorising
To what extent is the muddled units issue a problem for psychological theorising? In order to assess this, we conducted a number of simulated experiments where we use Bayesian t-tests to infer the effect size of group differences in discount rates based upon either k from the original Rachlin discount function or κ from the modified Rachlin discount function.
A first example demonstrates how we may make Type 1 errors. Imagine a within participant experiment where an experimental condition is hypothesised to affect the discount rate κ, relative to a control condition. However, in this example the experimental manipulation has no effect on discount rates, only upon the s parameter. The results of simulations in Fig. 4 demonstrate that if we analysed parameter fits to the original Rachlin discount function, then we could incorrectly infer group level differences in discounting and therefore wrongly conclude that the experimental manipulation affected discounting processes when it did not. Conversely, if we analysed data based upon the modified Rachlin discount function, then we would correctly conclude that the experimental manipulation did not result in any systematic group difference in discounting behaviour.
A second example demonstrates how it is also possible to make Type 2 errors. Now imagine an experimental situation where we make inferences about differences in discount rates between two groups who do actually differ in terms of their discount rate κ. Imagine further that participants within each group have some variation in their actual s values (shown as $\sigma_{s}$ in Fig. 5). We can see that analyses based on the original Rachlin discount function can potentially drastically underestimate (or miss entirely) the presence of a group level effect in discount rates when participants vary in their s parameters. This problem gets worse as the variability in s increases. In contrast, if we conduct our analyses based upon the modified Rachlin discount function, then our inferred effect sizes accurately track the true effect size.
While real world examples are unlikely to be as clear cut as the examples we have explored, we have outlined how our research conclusions, and thus psychological theorising, may be led astray by either Type 1 or Type 2 errors when our parameters suffer from the muddled units problem. Conversely, we have shown that our proposed fix, the modified Rachlin discount function, avoids these problems and gives rise to more correct inferences about effect sizes.

Fig. 2. Robustness of ML estimation to stochastic response data. A set of 200 experiments (akin to that shown in Fig. 1) were conducted on a simulated participant with fixed true parameter values (shown by crosshairs; k = exp(−3), s = 0.7, thus $\kappa = \exp(-3)^{1/0.7}$) and stochastic responses. Maximum likelihood estimation was used to estimate parameters for the Rachlin (panel a) and modified Rachlin (panel b) functions. Points represent the maximum likelihood parameters for each simulated dataset. The parameter estimation procedure was found to be robust: conducting MLE on data using the Rachlin or the modified Rachlin functions will result in identical maximum likelihood estimates (see main text for details). This was demonstrated by the correlation between s from both equations (panel c) being almost exactly 1, and likewise for k transformed to κ, and κ (panel d). The data on the x and y axes of panel c are the same as the x axis of panel a and x axis of panel b, respectively. The data on the x and y axes of panel d are the same as the x axis of panel b and a, respectively. See Appendix B for simulation details. The code to generate this figure is available at https://osf.io/uscmd/.
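A stripped-down version of this scenario can be sketched as follows. It is not the simulation behind Fig. 4: it skips the choice data, the parameter estimation and the Bayesian t-test, and simply assumes that the true parameters are known exactly. The sample size, the distribution of log κ and the two s values are all assumptions chosen for illustration.

```python
import random
from statistics import mean

random.seed(1)

n_participants = 50
# Each participant has one true log(kappa), unchanged by the manipulation.
log_kappa = [random.gauss(-3.0, 0.5) for _ in range(n_participants)]

# The manipulation changes only s (assumed values).
s_control, s_experimental = 1.0, 0.7

# Since k = kappa**s, log(k) = s * log(kappa) in each condition.
log_k_control = [s_control * lk for lk in log_kappa]
log_k_experimental = [s_experimental * lk for lk in log_kappa]

diff_log_kappa = mean(log_kappa) - mean(log_kappa)  # zero by construction
diff_log_k = mean(log_k_experimental) - mean(log_k_control)

print(f"mean change in log(kappa): {diff_log_kappa:.3f}")
print(f"mean change in log(k):     {diff_log_k:.3f}")
```

Even though no participant's κ changes, the mean of log k shifts substantially between conditions, which is the kind of difference that could be mistaken for an effect on discounting.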
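The attenuation described here can be illustrated in the same simplified way, again assuming the true parameters are known exactly rather than estimated from choices as in Fig. 5. The group means, standard deviations, sample size, the log-normal distribution for s, and the use of Cohen's d as the effect size are all our own assumptions for this sketch.

```python
import math
import random
from statistics import mean, stdev


def cohens_d(x, y):
    """Standardised mean difference using the pooled standard deviation."""
    pooled_sd = math.sqrt((stdev(x) ** 2 + stdev(y) ** 2) / 2.0)
    return (mean(y) - mean(x)) / pooled_sd


random.seed(2)
n_per_group = 2000
sigma_s = 0.3  # assumed spread of log(s) within each group

# Two groups that genuinely differ in log(kappa).
log_kappa_a = [random.gauss(-3.0, 0.3) for _ in range(n_per_group)]
log_kappa_b = [random.gauss(-2.7, 0.3) for _ in range(n_per_group)]


def draw_s():
    """Draw a positive s from a log-normal distribution (an assumption)."""
    return math.exp(random.gauss(0.0, sigma_s))


# log(k) = s * log(kappa), with s varying across participants.
log_k_a = [draw_s() * lk for lk in log_kappa_a]
log_k_b = [draw_s() * lk for lk in log_kappa_b]

d_kappa = cohens_d(log_kappa_a, log_kappa_b)
d_k = cohens_d(log_k_a, log_k_b)
print(f"effect size from log(kappa): {d_kappa:.2f}")
print(f"effect size from log(k):     {d_k:.2f}")
```

Variation in s adds spread to log k that has nothing to do with discounting, so the standardised group difference based on k is much smaller than the one based on κ.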

Concerns extend to hyperbolic discounting
We have outlined how the incompatible units problem effects Rachlin's discount function, and outlined examples of how this may lead to erroneous psychological theorising. One way to deal with this problem is given by our modified Rachlin's discount function (Eq. (7)). Another possible approach would be to omit the s parameter altogether, leaving us with the classic hyperbolic 7 discount function (Mazur, 1987) Superficially, this function does not suffer from the incompatible units problem-k is simply in units of days −1 (or 1/k is measured in days) and we can compare k values across participants. Or can we?
This depends on how we interpret hyperbolic discounting. The first and strongest position could be that humans and other agents have direct access to objective time and have perfect prospective time perception. If we believe this then we can entirely ignore the issue of subjective time perception and proceed with analysing data using the hyperbolic discount function unencumbered. While this approach may be valid for ideal observers or abstract modelling, it could be problematic under the currently dominant approach of indirect perception in experimental psychology (Helmholtz, 1856;Gregory, 1980). The second strategy could accept the notion of indirect perception and that agents may only have access to subjective time, but simply assume that observer's time perception is veridical. This stance would also allow researchers to proceed with hyperbolic discounting of objective time unencumbered, however the assumption of accurate and veridical time perception is a strong one which would greatly benefit from empirical justification. The third interpretation is that we take the indirect perception approach but simply ignore the role of subjective time perception by omitting s, or equivalently fixing s = 1, and proceeding with the hyperbolic discount function. While this is least philosophically problematic, and resolves the incompatible units problem, it does mean we are left with a model misspecification problem.
We suspect that most experimental psychologists would fall under the third camp, and therefore our concern is that we may not be able to draw relevant conclusions about discounting from changes in discount rates k from the hyperbolic discount function. This would again mean  (8) is not actually a hyperbolic function, but we will stick with this convention as it has been adopted wholesale in the discounting literature. We refer the reader to (Rasmusen, 2008) for further insights on this point. B.T. Vincent and N. Stewart Cognition 198 (2020) 104203 that comparisons of discount rates k between participants or conditions or groups would again be invalid. This may have broad consequences which could require revisiting previous results. The problem revolves around the fact that the Mazur (1987) hyperbolic discount function is a special case of the Rachlin discount function 8 when the exponent s equals 1. So if it actually is the case that people hyperbolically discount subjective time (s≠1) rather than objective time (when s = 1), then analyses based upon the hyperbolic discount function will suffer from a model misspecification problem, where the parameters of a misspecified model are systematically biased. Expecting that papers all contain accurate and comparable estimates of k just because the unit of k does not contain s is not a good solution. If s differs across participants but is left out of our model, all of our k values will be differently systematically biased, and thus not comparable. This model misspecification problem is a different problem to the units problem-by choosing a model specification that avoids the units problem one has run into the model misspecification problem instead.
We propose that this is a real problem. Changes in discounting behaviour (i.e. choices made in inter-temporal choice tasks) have previously been attributed to changes in discount rates, but may have been partly due to changes in subjective time perception. For example, one of the most highly cited empirical works on delay discounting shows that discounting is higher in current smokers than in ex-smokers or never-smokers (Bickel, Odum, & Madden, 1999). However, we also have empirical support for atypical time perception in addictive disorders (stimulant-dependent participants over-estimate time), with the explicit suggestion that this may influence a broad lack of impulse control (Wittmann, Leland, Churan, & Paulus, 2007). At the other end of the spectrum, patients with anorexia nervosa display some of the lowest observed discounting behaviour (Steinglass et al., 2012; Decker, Figner, & Steinglass, 2015; Bartholdy et al., 2017) and also under-estimate time (Vicario & Felmingham, 2018). So to what extent is this discounting behaviour caused by changes in discount rates versus subjective time perception? We therefore echo the call of Kim and Zauberman (2018) that research needs to disentangle the relative contributions of subjective time perception and discount rates. Until we have a clearer understanding here, it may be premature to claim that changes in discounting behaviour are straightforwardly attributable to changes in discount rates alone.
Fig. 4. The muddled units problem can give rise to Type 1 errors, incorrectly detecting the presence of an effect which is not there. We repeatedly simulated two groups of participants (top) with a group mean difference in log(s) (see top right) and no differences in discounting behaviour in terms of log(κ). We then used Bayesian methods to infer the true effect size (difference in discounting behaviour), based on either the original (top left) or the modified Rachlin parameterisation (top right). We varied the group mean difference in log(s) and plot the corresponding posterior mean and 95% credible intervals (points and error bars, bottom). Analysis based on the original Rachlin function is prone to Type 1 errors, incorrectly inferring the presence of an effect (differences in discount rates) when there is no such effect. The analysis based on the modified Rachlin discount function correctly infers the lack of an effect. The code to generate this figure is available at https://osf.io/uscmd/.
In order to estimate the extent of the problem, we conducted further simulations. Fig. 6 shows the degree of bias in the hyperbolic discount rate k parameter as a function of true (κ, s) parameters from the modified Rachlin function. The degree of bias is shown using the estimated k, normalised by the true κ value. For simulated observers who have linear time perception (s = 1; equal to hyperbolic discounting) we can recover discount rates with no systematic bias. Worryingly, we find systematic biases in the estimates of k for observers who do discount subjective time (s≠1). We see systematic underestimates of k for accelerating time perception (s > 1) and systematic overestimates of k for decelerating time perception (s < 1). These biases are not subtle: for example, a normalised estimate of 2 means the k estimate is twice the true κ value, and a normalised estimate of 0.5 means the k estimate is half of the true κ value. These simulations suggest that if we accept that subjective time perception influences preferences in inter-temporal choice tasks, and that participants' subjective time perception is not controlled for, then claims of changes in discount rates could be conflated with subjective time perception.
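The direction of this bias can be illustrated with a much simpler sketch than the simulations reported here. The code below is not the analysis used for Fig. 6 (that code is at https://osf.io/uscmd/). It assumes the modified Rachlin function takes the form 1/(1 + (κD)^s), uses noise-free discount fractions rather than simulated choices, and fits the hyperbolic function 1/(1 + kD) by least squares over a grid. The delays, the value of κ, and all function names are illustrative assumptions.

```python
import numpy as np

def modified_rachlin(delay, kappa, s):
    """Discount fraction under the assumed form 1 / (1 + (kappa * delay)**s)."""
    return 1.0 / (1.0 + (kappa * delay) ** s)

def hyperbolic(delay, k):
    """Hyperbolic discount fraction 1 / (1 + k * delay)."""
    return 1.0 / (1.0 + k * delay)

def fit_hyperbolic_k(delays, fractions):
    """Least-squares estimate of k by grid search over log10(k)."""
    k_grid = 10.0 ** np.linspace(-6.0, 1.0, 7001)
    predicted = hyperbolic(delays[None, :], k_grid[:, None])
    sse = np.sum((predicted - fractions[None, :]) ** 2, axis=1)
    return k_grid[np.argmin(sse)]

def normalised_estimate(kappa, s, delays):
    """Ratio k_estimated / kappa_true; a value of 1 means no bias."""
    fractions = modified_rachlin(delays, kappa, s)
    return fit_hyperbolic_k(delays, fractions) / kappa

delays = np.array([1.0, 7.0, 30.0, 90.0, 180.0, 365.0])  # days (illustrative)
kappa_true = 0.001                                        # per day (illustrative)

for s in (0.5, 1.0, 2.0):
    ratio = normalised_estimate(kappa_true, s, delays)
    print(f"s = {s}: k_estimated / kappa_true = {ratio:.2f}")
```

In this sketch κD < 1 at every delay, so s < 1 implies more discounting than the hyperbolic function at every delay (k is overestimated) and s > 1 implies less (k is underestimated). The size of the bias depends on the delays and κ chosen, so the printed ratios should not be read as estimates of the bias in any real dataset.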
One line of evidence suggests that this may be a real problem for conclusions based on hyperbolic discount rates alone. When discount functions are pitted against each other to explain inter-temporal choice behaviour, 2-parameter hyperboloid models (including the Rachlin hyperboloid) fit behavioural data better than the hyperbolic model (McKerchar et al., 2009). However, that study only assessed goodness of model fit; it did not assess fit to out-of-sample data (e.g. as in cross validation) or compare model metrics which add a penalty for the additional parameter. Franck, Koffarnus, House, and Bickel (2015) did, however, report BIC (Bayesian Information Criterion, which penalises models with more parameters) scores for fits to individuals. They report the proportion of participants for whom each of a range of models was the most probable to have generated the data: Rachlin (34.3%), Myerson and Green (27.0%), hyperbolic (18.0%), Laibson (10.8%), exponential (8.1%), and a control model (1.8%). Given that the hyperbolic discount function was the most probable model for only 18.0% of participants, this is not strong support for linear subjective time perception. This suggests that in many cases s≠1, and so estimates of k from the hyperbolic discount function will be contaminated by subjective time perception and will not solely reflect discounting processes. If we believe s = 1 based upon empirical evidence, model comparison, or a priori beliefs, then there is no problem. Otherwise, this potentially poses a problem for some established findings in the discounting literature based upon participant, group, or condition differences in k values from the hyperbolic discount function.
A second line of evidence can be drawn from the Kim and Zauberman (2013) and Zauberman et al. (2009) studies we have already seen. Taken together these studies provide compelling evidence that subjective time perception is decelerating, and varies across participants and/or experimental conditions. We propose that this may be a serious issue-previous results claiming that inter-temporal choice is affected by discounting may need to be revisited in order to assess the confound of subjective time perception.

Plotting and reporting log parameter values
Before we conclude, we add a note on plotting or reporting transformed parameter values. Researchers often report, or plot, the logarithm of the k parameter from the hyperbolic discounting model, log(k). But what is log(k)? Following dimensional analysis leads us to the answer. Recall that k is the inverse half-life in the hyperbolic discounting model. That is, k is the inverse of the number of days it takes for the present value to be half that of the delayed outcome: if it takes 10 days for the value to drop by half, then k = 1/10 per day. Dimensional analysis requires that the number to which a logarithm is applied is unitless. For this reason, a standard reference level is required. The level can be set at any value (e.g., k_reference = 1 per day or k_reference = 3.141592654 per day, or any other value). However, if researchers just take the logarithm of the numerical value of k without regard to the units, they have effectively selected a reference value of 1 unit. In this example for k, that would be a reference level of k_reference = 1 per day.
Fig. 5. The muddled units problem can also give rise to Type 2 errors. We simulated 2 groups of participants (top) who varied in their true difference in discount rates (the group difference in log(k)) and in the variability of the s parameter, σ_s. We consider 3 levels of σ_s (bottom panels) and vary the true effect size via the group difference in log(k). We plot the posterior mean and 95% credible intervals of the effect size using Bayesian methods for both the original and modified Rachlin parameterisation (bottom). Analyses based upon κ from the modified Rachlin discount function result in accurate inferred effect sizes. The same inferences based upon k from the original Rachlin discount function can miss the presence of group differences in discount rates when there is variation in the s values of participants. The code to generate this figure is available at https://osf.io/uscmd/.
For example, say k = 2 per day and suppose the experimenter is using logarithms to the base 10, so that log10(2) = 0.30103. But this number 0.30103, which is "log10(k)", really should be written as log10(k / k_reference) = log10((2 per day) / (1 per day)). This means that k = 2 per day is 10^0.30103 = 2 times the reference level of k_reference = 1 per day. Changing the reference level alters log(k) by an additive constant. So if only differences in log(k) are of concern then the reference level cancels. However, if the absolute value of log(k) is to be used, then the reference level (and the base of the logarithm) should be reported: although log(k) has no units, it should be understood as the logarithm of the number of times larger k is than some reference level k_reference. 9
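The role of the reference level can be checked numerically. The following sketch (the function name and the example values are illustrative and not taken from any dataset) computes log10(k / k_reference) for two reference levels. It shows that the absolute log value shifts by a constant, while the difference between two participants' log values is unchanged.

```python
import math

def log10_relative(k, k_reference):
    """Return log10(k / k_reference); both arguments must share the same unit."""
    return math.log10(k / k_reference)

k_a, k_b = 2.0, 0.5  # discount rates in units of per day (illustrative values)

for k_reference in (1.0, 3.141592654):  # reference levels, per day
    log_a = log10_relative(k_a, k_reference)
    log_b = log10_relative(k_b, k_reference)
    print(f"reference = {k_reference} per day: "
          f"log_a = {log_a:.5f}, log_b = {log_b:.5f}, "
          f"difference = {log_a - log_b:.5f}")
```

Only the difference column is the same for both reference levels, which is why a reported absolute log(k) is uninterpretable without its reference level and base.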

Conclusions
The importance of the units of psychological parameters in perceptual and cognitive models has sometimes been underappreciated. This has led to the muddled units problem where researchers illegally compare parameter values with different units. Further, this causes parameters to become polluted by other parameters, which changes their meaning and interpretation, potentially invalidating research conclusions.
We have illustrated the problem with Stevens' power law and proposed a simple-to-implement re-parameterisation which can (a) allow past research to be re-evaluated, and (b) avoid the parameter incompatibility problem in the future. We have also shown how the problem affects the study of subjective time perception and temporal discounting. Using simulations based on the Rachlin temporal discounting function, we demonstrated how the units of the temporal discounting parameter are contaminated by the subjective time perception parameter, and how our re-parameterisation avoids this problem.
Fig. 6. Estimation biases of k from the hyperbolic discount function, based upon inter-temporal choice data for simulated observers who discount according to the modified Rachlin function. Each point represents a simulated observer with a true κ corresponding to the x-axis position and true s value as shown by the colour (see legend). The y-axis shows normalised error k_estimated/κ_true such that a value of 1 means no error. We see no systematic bias when observers discount linear time, s = 1. But we see systematic underestimates of k for accelerating time perception (s > 1) and systematic overestimates of k for decelerating time perception (s < 1). The code to generate this figure is available at https://osf.io/uscmd/.
9 Perhaps the most prominent example of this is the reporting of sound levels on the decibel scale. For example, the ear-drum-rupturing 150 dB level of a jet at takeoff at a distance of 25 m, often reported as "150 dB", is really "150 dB SPL" or "150 dB sound pressure level". Sound pressure level is L_p = 20 log10(p / p_0), where p is the sound pressure of the jet measured in any unit of pressure and p_0 is the reference level, measured in the same unit. p_0 is typically set at 20 μPa, or 20 micropascals (which Wikipedia says is the loudness of a mosquito flying 3 m away). This means that the "150" means the jet at 25 m has a sound pressure 10^(150/20) = 31,622,777 times that of a mosquito at 3 m.
More subtle, but still deeply problematic, is that reformulating models to make them dimensionally sound can swap the units problem for the misspecified model problem. Switching to a dimensionally sound model that is not the 'true' data generating process, in order to avoid the units issue, leads to the parameters of the misspecified model being systematically biased.
This was illustrated with the hyperbolic discount function, which is used very often in the delay discounting literature. We do not claim that our modified Rachlin function is the true data generating model but it does satisfactorily resolve the units problem that we have identified.
If one wished to keep the interpretation of Rachlin's discount function as hyperbolic discounting of Stevens' power law scaled subjective time, then there are alternative approaches. One would be to conduct time perception experiments in addition to inter-temporal choice experiments, such that the s parameter (and ideally also λ) is known for each participant. This would allow k to be interpreted as the rate of discounting of subjective time. A related approach could be to run modified inter-temporal choice tasks (with no time perception tasks) where participants are presented with rewards at subjective time delays.
We end with a series of important, but potentially alarming, recommendations in relation to cognitive modelling. First, researchers should routinely report the units of their psychological parameters. This will help reduce the possibility of erroneous comparison of parameter estimates in different units. Ideally this reporting will apply to both axis labels of plots in parameter space as well as reporting of parameter values in tables or the main text. For example, reporting that k = 0.5 is not sufficient; reporting k = 0.5 days⁻¹ or k = 0.5 per day is preferred. It is also typical to report log transformed k values (ln(k)). While log transformed values have no units, they do have a (possibly implicit) reference which does have units, and its units should be reported. The same goes for κ or ln(κ) in our modified Rachlin discount function.
Our second recommendation is that cognitive modellers might routinely consider the units of their models during model formulation, in order to avoid the incompatible units problem. For example, in the Introduction, we show that Stevens' Law is not dimensionally correct, but that Fechner's Law is.
Our third recommendation is for the readers of the existing literature. When interpreting existing models, and especially their parameterisation and parameter estimation, readers should have in mind the units of the parameters. If there is a problem, then a solution involving re-parameterisation needs to be found and this may necessitate revisiting the theoretical claims made. For example, in conducting a meta-analysis of loss aversion, Walasek, Mullett, and Stewart (2018) had to obtain the raw choice-level data and re-estimate prospect theory's loss aversion parameter for each participant. For any non-linear re-parameterisation, transforming group level average parameter values will not be sufficient.
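To illustrate why transforming group averages is not sufficient, consider converting parameters of the original Rachlin function to the modified form. The sketch below assumes the original function is 1/(1 + kD^s) and the modified function is 1/(1 + (κD)^s), so that κ = k^(1/s) when delays are in the same unit. These functional forms, the parameter values, and the function names are illustrative assumptions and are not taken from any dataset.

```python
import numpy as np

def kappa_from_original(k, s):
    """Convert original-Rachlin k to kappa, assuming k = kappa**s."""
    return np.asarray(k) ** (1.0 / np.asarray(s))

def original_rachlin(delay, k, s):
    """Assumed original form: 1 / (1 + k * delay**s)."""
    return 1.0 / (1.0 + k * delay ** s)

def modified_rachlin(delay, kappa, s):
    """Assumed modified form: 1 / (1 + (kappa * delay)**s)."""
    return 1.0 / (1.0 + (kappa * delay) ** s)

# Hypothetical per-participant estimates from the original parameterisation.
k = np.array([0.2, 0.05, 0.02, 0.01])
s = np.array([0.5, 0.8, 1.2, 1.5])

# Participant-level route: transform each participant, then summarise.
kappa_individual = kappa_from_original(k, s)
mean_of_transformed = kappa_individual.mean()

# Shortcut: transform the group averages (not equivalent for non-linear maps).
transform_of_means = kappa_from_original(k.mean(), s.mean())

print("per-participant kappa:", np.round(kappa_individual, 4))
print("mean of transformed  :", round(float(mean_of_transformed), 4))
print("transform of means   :", round(float(transform_of_means), 4))
```

Under these assumptions the two parameterisations describe identical discount curves for each participant, yet the two group-level summaries disagree. This is why participant-level (or raw choice-level) data are needed for re-analysis.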
We also have recommendations relating to delay discounting and subjective time perception. Our fourth suggestion is that exponential discounting of power-scaled subjective time (Takahashi et al., 2008) should be disfavoured and treated with caution. Instead, focus should be placed on the constant sensitivity function (Ebert et al., 2007, see Appendix A), or on exponential discounting of Weber-Fechner time perception (which is equivalent to the Myerson (2004) hyperboloid (Takahashi et al., 2008)). And finally, fifth, we suggest that researchers should switch to using our modified Rachlin discount function from this point onwards. Published research findings based upon participant, group, or condition differences in the contaminated discount rate parameter (k) from Rachlin's discount function may need to be re-examined (Mazur, 2007;Jones & Rachlin, 2009;Myerson, Green, & Morris, 2011;Peters et al., 2012;Kralik & Sampson, 2012;Schneider, Peters, Peth, & Büchel, 2014).
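The likelihood of the data on a trial for given parameters was modelled as a biased coin flip, i.e. a Bernoulli trial,
\[ P(\mathrm{data} \mid \theta) = \mathrm{Bernoulli}\big( P(R_t = 1 \mid P_t^A, P_t^B, \theta) \big), \]
where θ denotes the model parameters and P(R_t = 1 | P_t^A, P_t^B, θ) is the probability of choosing P_t^B (coded as R_t = 1).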
We defined the data as
\[ \mathrm{data} = \{ P_t^A, P_t^B, R_t \}_{t=1}^{T}, \]
where P^A and P^B are prospects (see above) and R_t is the response on trial t of T trials in total.
We defined the response probability as
\[ P(R_t = 1 \mid P_t^A, P_t^B, \theta) = \epsilon + (1 - 2\epsilon)\, \Phi\!\left( \frac{V(P_t^B) - V(P_t^A)}{\alpha} \right), \]
where Φ is the standard cumulative normal distribution which forms a psychometric function mapping the difference between present subjective values of the rewards to a response probability. We set a fixed value of α = 4, which is the slope of this psychometric function and can be thought of as a 'comparison acuity' parameter: lower values mean greater response accuracy for prospects with similar present subjective values (see Vincent, 2016, for details). The first term deals with response errors, where ϵ was fixed at 0.01. The function V(P) converts a prospect (consisting of a reward and its delay) into a present subjective value (see Eq. (4)). We assume a linear value function, u(R) = R, as is common in the discounting literature.
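The likelihood described above can be sketched in a few lines. This is a minimal illustration and not the code used for the analyses reported here (which is available at https://osf.io/uscmd/). It assumes that the response probability has the form ϵ + (1 − 2ϵ)Φ((V(P^B) − V(P^A))/α) and that V follows the modified Rachlin form R/(1 + (κD)^s) with u(R) = R. The trials are made up and all function names are hypothetical.

```python
import math

ALPHA = 4.0     # comparison acuity (fixed, as in the text)
EPSILON = 0.01  # response error rate (fixed, as in the text)

def present_value(reward, delay, kappa, s):
    """Present subjective value, assuming V = R / (1 + (kappa * delay)**s)."""
    return reward / (1.0 + (kappa * delay) ** s)

def standard_normal_cdf(x):
    """Standard cumulative normal distribution, Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_choose_b(prospect_a, prospect_b, kappa, s):
    """Assumed form: epsilon + (1 - 2*epsilon) * Phi((V_B - V_A) / alpha)."""
    v_a = present_value(*prospect_a, kappa, s)
    v_b = present_value(*prospect_b, kappa, s)
    return EPSILON + (1.0 - 2.0 * EPSILON) * standard_normal_cdf((v_b - v_a) / ALPHA)

def log_likelihood(trials, kappa, s):
    """Sum of Bernoulli log-likelihoods over trials (P_A, P_B, R)."""
    total = 0.0
    for prospect_a, prospect_b, response in trials:
        p = prob_choose_b(prospect_a, prospect_b, kappa, s)
        total += math.log(p if response == 1 else 1.0 - p)
    return total

# Made-up trials: ((reward_A, delay_A), (reward_B, delay_B), R), delays in days.
trials = [
    ((50.0, 0.0), (100.0, 30.0), 1),
    ((80.0, 0.0), (100.0, 180.0), 0),
    ((20.0, 0.0), (100.0, 7.0), 1),
]

print(log_likelihood(trials, kappa=0.01, s=1.0))
```

Because ϵ bounds the response probability away from 0 and 1, the log-likelihood stays finite even when a response strongly contradicts the predicted preference.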