Qualitative variations in delay discounting: A brief review and future directions

The discounting paradigm has been challenged by an increasing number of studies presenting qualitative variations in the individual discount function. In particular, the subjective value of a loss does not necessarily decrease systematically with delay to the outcome. Qualitative variation refers to variations in the shape rather than the steepness of the discount function, such as positive discounting, zero discounting, unsystematic discounting, and negative discounting. Data from three previous studies were analysed in terms of qualitative variations observed in delay discounting patterns. Attention was also given to the methods used and to the relationship between the results from the various levels of investigation. We found qualitative differences between discounting of monetary gains and losses on an individual level. While discounting of gains mainly took the form of conventional positive discounting, discounting of losses often took the form of zero discounting or unsystematic discounting. Further, there were more qualitative variations in discounting of both gains and losses among adolescents compared to adults. By examining verbal reports and single choices, we identified some of the rules and consequences involved in these delay discounting patterns. The different rules and consequences observed for the gain and loss scenarios support the view that discounting of gains and losses may involve different combinations of reinforcing contingencies. These results point towards a possible way to explain the influences of qualitative variations in delay discounting.


Introduction
Discounting as a research field is developing rapidly and the body of literature is extensive. One interesting line of work focuses on the irregularities in discounting. Specifically, an increasing number of studies have found qualitative variations in discounting patterns (Abdellaoui et al., 2018; Furrebøe, 2020a; Hardisty et al., 2013; Myerson et al., 2017). There is also evidence of substantial inter- and intraindividual differences in delay discounting (Myerson et al., 2017; Scholten et al., 2019). Such findings related to qualitative variations and individual differences give rise to questions regarding what qualitative patterns prevail under what circumstances, and what the underlying influences may be for these variations. The purpose of this targeted review is to discuss some of the dynamics related to the qualitative variations in delay discounting, discovered through my research program aimed at identifying variables that influence delay discounting. Clarifying some of the issues related to qualitative variations may also help identify possible future directions for this area of research.

Quantitative and qualitative features of discounting
Delay discounting can be defined as the systematic decrease in the subjective value of an outcome as the delay to its receipt is increased (Reed and Martens, 2011). The discounting paradigm states that we devalue behavioral outcomes that take place in the future, whether they are gains or losses (Madden and Bickel, 2010). On an aggregate level this temporal choice behavior is appropriately described by the discount function, depicting how the subjective value of a reinforcer decreases relatively rapidly at shorter delays and relatively slowly at longer delays. However, steeper discounting (a quantitative feature) may occur although the hyperbolic shape is the same. Indeed, studies have shown that people discount gains more steeply than losses (Asgarova et al., 2020; Clatch and Borgida, 2020; Frederick et al., 2002; Gonçalves and Silva, 2015; Hardisty et al., 2013; Molouki et al., 2019), and this specific quantitative difference relating to gains and losses is referred to as the sign effect.
As opposed to quantitative differences in discounting, which refer to the steepness of the slope, qualitative differences refer to other differences in the shape or pattern of the discount function. The conventional discounting shape is positive discounting, a systematic devaluation of the outcome as a function of delay. Other shapes that have been observed are, for instance, zero discounting (Furrebøe, 2020a; Hardisty et al., 2013), which means there is a constant preference for only one alternative (only the smaller-sooner, SS, or only the larger-later, LL, alternative), resulting in a slope of zero and a horizontal line. Unsystematic discounting (Furrebøe, 2020b) refers to seemingly random choices between the outcome alternatives and fluctuating indifference points with no overall systematic increase or decrease in subjective value. Negative discounting is a systematic change, but in the opposite direction to conventional discounting: an increase in the subjective value of the outcome as delay increases. This discounting pattern is typically connected to discounting of losses (Abdellaoui et al., 2018; Hardisty et al., 2013; Myerson et al., 2017). The participant tends to choose the SS loss as delay increases. These qualitative patterns contradict the systematic devaluation on which the discounting paradigm is based.
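The four shapes described above can be illustrated with a small sketch that labels a series of indifference points ordered by increasing delay. This is only an illustration under stated assumptions, not the procedure used in the reviewed studies, where patterns were identified by visual inspection supplemented by AUC: the function name `classify_pattern`, the tolerance `tol`, and the monotonicity rules are all hypothetical choices made for this example.

```python
from typing import Sequence


def classify_pattern(ips: Sequence[float], tol: float = 0.02) -> str:
    """Label indifference points (as proportions of the nominal amount,
    ordered by increasing delay) as one of four qualitative patterns.

    Assumption: the tolerance and the rules below are illustrative only.
    """
    if len(ips) < 2:
        raise ValueError("need at least two indifference points")
    # Change in subjective value from each delay to the next.
    steps = [b - a for a, b in zip(ips, ips[1:])]
    net = ips[-1] - ips[0]
    # Horizontal line: no change beyond the tolerance anywhere.
    if all(abs(s) <= tol for s in steps) and abs(net) <= tol:
        return "zero"
    # Systematic decrease in value with delay (conventional discounting).
    if all(s <= tol for s in steps) and net < -tol:
        return "positive"
    # Systematic increase in value with delay.
    if all(s >= -tol for s in steps) and net > tol:
        return "negative"
    # Fluctuating, abrupt reversals, or u-shaped series.
    return "unsystematic"
```

With these rules, a u-shaped series such as `[0.9, 0.5, 0.4, 0.8]` falls into the unsystematic category, which matches how u-shaped curves are grouped later in this review. A stricter definition of zero discounting would set `tol` to 0.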

Influences on the pattern of discounting
The molecular view of behavior argues that behavior consists of discrete responses and therefore choice is a derived measure, and that discrete responses and reinforcer contiguity are required to strengthen behavior (Baum, 2002). The molar view of behavior, on the other hand, argues that behavior is a pattern of actions over time. Hence, the molar view regards choice as the concrete behavioral allocation and the response as the derived measure (Baum, 2002), and holds that it is necessary to observe the ongoing (molar) activity of an organism in order to observe behavior (Simon et al., 2020). While some molecular theories would state that the subject chooses the alternative with the "higher value" at that moment (Mazur, 2006), other molecular theories would not accept the momentary maximization principle, but rather focus on the strengthening effect of the short-term consequences of the response (Mazur, 2006). Hybrid theories argue that both molar and molecular variables affect choice. That is, choices over time are both affected by matching of response proportions to reinforcement proportions and controlled by the momentary shorter delay to reinforcement (Mazur, 2006). Studies on human choice have found that both current reinforcement contingencies and a subject's verbal rules can influence behavior. The use of rules may compete with control by direct consequences (Fisher and Mazur, 1997; Hayes et al., 1986, p. 145). Shimp (2020) described a range of different meanings and usages of molar and molecular approaches. He concludes by referring to Skinner's use of the cumulative record and how it simultaneously depicts aggregate and moment-to-moment behavior, and argues that the molar (aggregate) and molecular (moment-to-moment) approaches can be combined into one single analysis.
Discounting can also be understood in terms of a combined molar and molecular analysis. Discounting experiments assess the participant's distribution of responses over periods of time, that is, the overall rate of reinforcement. Expressing discounting as a devaluation of outcome, or defining choice as time allocation (Baum, 2016), presupposes some kind of overall molar evaluations or matching of outcomes. Certainly, human discounting involves verbal behavior, including overall evaluations or rule-governed behavior (Foxall, 2003). Verbal behavior works as discriminative stimuli, and the behavior is rule-governed if the behavior is in fact reinforced by the rule (Baum, 2005, p. 160). Even non-predictive information, expectations or imagination can be rewarding or punishing (Ainslie, 2016). Choice bundling, which is a strategy to decrease excess discounting (Ainslie, 2012), can be considered a molar strategy. Choice bundling refers to how the agent compares multiple short-term consequences against multiple long-term consequences rather than comparing a single short-term reward against a single long-term reward, which enables the agent to emphasize the molar picture and distance itself from the approaching preference reversal (Ashe and Wilson, 2020). Similarly, a concept such as debt aversion holds an element of molar considerations. The unwillingness to incur a debt requires an overall view and considerations of short-term and long-term consequences. Discounting can also be analysed in terms of direct reinforcing contingencies. Experiments have demonstrated how discounting of gains and discounting of losses are both connected to present bias, that is, preference for the immediate outcome. Present bias reflects the strengthening effect of the short-term consequences of the response. Through experiments using monetary outcomes, Hardisty et al. (2013) found that people not only prefer gains immediately, but also prefer to resolve losses immediately. The delay aversion theory (Sonuga-Barke, 2005; Sonuga-Barke et al., 1992) proposes that for some people the main motivation for choosing the SS alternative is to escape the aversive waiting time, emphasizing the role of proximity to the outcome. Also, the steep end of the hyperbolic curve explicitly points to the effect of immediacy of the outcomes, underlining the importance of response-reinforcer contiguity (Mazur, 2006, p. 339).

Approaches for investigating qualitative variations
To investigate choice behavior one might measure the rate of responding to one alternative relative to the rate of responding to other alternatives (de Villiers and Herrnstein, 1976, p. 1132). Other measures of choice can be the proportion of responses to discrete-trial alternatives (Estes, 1994), or time spent responding (Baum and Rachlin, 1969). To investigate discounting behavior in humans, participants are given series of choices. These procedures typically consist of choices between a present and a delayed outcome with titrations of delay and/or amounts. Procedures may also contain conditions in which both outcome alternatives are delayed. Such double-delay procedures are particularly useful in order to delineate present bias cases from non-present bias cases (Mitchell et al., 2015). Next, it is common to fit a mathematical function to the data points obtained. Numerous studies have found that the hyperbolic function is a better fit to data than is the exponential function (e.g., Frederick et al., 2002; Kirby and Herrnstein, 1995; Myerson and Green, 1995; Rachlin et al., 1991). The hyperbolic function is steeper than the exponential function at shorter delays and shallower at longer delays, while the exponential function has a constant rate regardless of delay. Still, humans tend to deviate from hyperbolic discounting, and by adding an exponent to the discount function, the hyperboloid function was found to be an even better fit, particularly when dealing with individual data (Myerson and Green, 1995) and losses (Holt et al., 2008). A point-based area under the curve (AUC) (Myerson et al., 2001) is a different way of measuring discounting. Rather than using a theoretical discount model, it is simply a geometric measure of the area under the empirical discounting curve. Based on the indifference points (points where the subjective values of both alternatives are equal), trapezoid areas are calculated and added into one AUC measure. The AUC measure is a rough numeric summary of the indifference points (Gilroy and Hantula, 2018), and does not determine the steepness of the curve. However, when the goal is to explore the various patterns of discounting on an individual level, the AUC measure is assumed to be useful as a supplement to visual inspection.
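The discount models and the trapezoid-based AUC mentioned above can be sketched as follows. This is a minimal illustration, not the analysis code of the reviewed studies. The symbols follow common usage and are assumptions here: `amount` is the nominal outcome, `delay` the delay, `k` a discount-rate parameter, and `s` the exponent of the hyperboloid. The sketch also assumes that delays are normalized by the longest delay, that indifference points are normalized by the nominal amount, and that an undiscounted point at delay zero is added when none is supplied, so that a horizontal line at the nominal amount gives AUC = 1.00.

```python
import math
from typing import Sequence


def hyperbolic(amount: float, delay: float, k: float) -> float:
    """Hyperbolic discounting: V = A / (1 + kD)."""
    return amount / (1.0 + k * delay)


def exponential(amount: float, delay: float, k: float) -> float:
    """Exponential discounting: V = A * exp(-kD), a constant rate."""
    return amount * math.exp(-k * delay)


def hyperboloid(amount: float, delay: float, k: float, s: float) -> float:
    """Hyperboloid discounting: V = A / (1 + kD)^s."""
    return amount / (1.0 + k * delay) ** s


def auc(delays: Sequence[float], ips: Sequence[float], amount: float = 1.0) -> float:
    """Area under the empirical discounting curve as a sum of trapezoids."""
    if len(delays) != len(ips) or not delays:
        raise ValueError("delays and ips must be non-empty and equally long")
    if delays[0] < 0 or any(b <= a for a, b in zip(delays, delays[1:])):
        raise ValueError("delays must be non-negative and strictly increasing")
    max_delay = delays[-1]
    if max_delay <= 0:
        raise ValueError("the longest delay must be positive")
    xs = [d / max_delay for d in delays]   # normalized delays, 0..1
    ys = [v / amount for v in ips]         # normalized subjective values
    if xs[0] > 0:
        # Assumption: the outcome is undiscounted at delay zero.
        xs, ys = [0.0] + xs, [1.0] + ys
    return sum(
        (x2 - x1) * (y1 + y2) / 2.0
        for x1, x2, y1, y2 in zip(xs, xs[1:], ys, ys[1:])
    )
```

When the two models are matched to give the same value at one delay (for example half the amount at a delay of 10, with `k = 0.1` for the hyperbolic and `k = ln 2 / 10` for the exponential function), the hyperbolic value is lower at shorter delays and higher at longer delays, in line with the description above. Two series with very different shapes can share the same AUC, which is one reason the measure supplements rather than replaces visual inspection.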
The increasing number of studies that show unsystematic discounting calls for further investigations of how and why these qualitative variations occur. Questions related to the qualitative variations in delay discounting also emerged from my own research. For instance, results from adults and adolescents differed, and questions formed concerning discounting as a matter of maturity. Also, verbal reports indicated the use of rules, which gave rise to further investigations on how the various discounting patterns may be connected to verbal behavior and the use of rules. The current targeted review is based on three studies conducted in my own lab. The objective is to address the following four questions: (1) What were the main qualitative differences in discounting patterns between monetary gains and losses? (2) How did qualitative variations in discounting differ in adolescents and adults? (3) How might rules and consequences interact to influence qualitative differences in discounting? and (4) Are the current procedures and measures sufficient for identifying qualitative variations in discounting? These four research questions are important to address in order to proceed with research aiming at finding the underlying influences of delay discounting, and to search for the best approaches for such investigations.
The current review highlights and contextualizes results from these three studies on variations in delay discounting. To obtain a broader understanding of the processes taking place, the investigations are performed on multiple levels. In addition to between-group and individual-level investigations, it is important to examine discounting on an intra-individual level, both as a pattern of behavior and as single operant responses, particularly in complex choice situations as seen in discounting studies on gains and losses (Białaszek et al., 2019; Estle et al., 2019).

Three studies reviewed
The initial objectives in Study 1 and Study 2 were to test the quantitative description of the sign effect, and to compare discounting patterns between individual and group-level data. While evidence of the sign effect typically stems from group-level data, we aimed at exploring the sign effect also on an individual level. We searched for similarities and differences in discounting patterns between gains and losses.
In Study 1 we compared discounting of hypothetical monetary gains and losses in 31 adults (Furrebøe, 2020a). We used a computerized choice task adapted from an earlier study (Holt et al., 2008). There were two scenarios, in which the choices were between receiving a smaller-sooner (SS) or larger-later (LL) amount (Gain scenario) and paying a SS or LL amount (Loss scenario). The two scenarios had identical procedures with 11 delay-difference conditions (from 1 week to 20 years) and 16 SS delay titration steps (immediate to 20 years). The titration procedure took the form of a "double-delay" procedure, implying that the delay-difference (time between the SS outcome and the LL outcome) was kept constant while the SS delay (delay before the SS outcome) was titrated up and down (Fig. 1). At the end of each scenario the participants were asked what strategy they had for their choices, in the hope of revealing whether and how the participants evaluated the alternative outcomes as they progressed through the session. Discount functions were plotted and area under the curve (AUC) was calculated for gains and losses per participant and on a group level. The corresponding verbal reports were categorized into three groups based on the participants' description of how they considered the alternatives (Table 1). Participants' verbal reports were for the most part clearly based either on considerations of both alternatives in terms of delay and amount, or on only one feature. However, there were a few descriptions that were difficult to interpret. These were placed in the "No strategy" category.
The results from Study 1 showed that the mean AUC for gains was significantly smaller than the mean AUC for losses (Table 2). Also on an individual level, the discounting curves for gains were mostly steeper than for losses. Fig. 2 shows two individual sets of indifference point (IP) values for gains and losses, participant 3 (panel 1) and participant 24 (panel 2), and the mean IP values for the participants in Study 1 (n = 31; panel 3). All the individual graphs from Study 1 can be found in Appendix A and B. The most interesting observation was, however, that the discounting patterns for losses and for gains were qualitatively different. Discounting of losses typically did not show a gradual decrease, but rather was fluctuating or horizontal. The latter case means AUC = 1.00 and is referred to as zero discounting. The discounting-of-gains curves were still systematically decreasing in most cases. Further, we found that the sign effect on an aggregate level was a result of a combination of positive discounting, unsystematic discounting, and a large number (18 cases) of zero discounting of losses on an individual level, rather than merely cases of shallower positive devaluation of losses. Although zero discounting may be regarded as an extreme case of shallow positive discounting, we distinguish between zero discounting and shallow positive discounting based on our definition of zero discounting as absolutely no systematic devaluation (AUC = 1.00). Finally, zero discounting of losses was, except for two cases, not accompanied by zero discounting of gains for the same person (Table 2). The verbal reports offered details about the contingencies that were unobservable in this study, namely the participants' use of rules and their evaluations of the alternatives. Examples of verbal responses are shown in Table 3; the examples were selected to represent variation in responses. The full report is found in Appendix C. The responses corresponded well with the qualitative differences found by the visual inspection of the discounting curves. Typically, those who showed a systematic devaluation also verbally reported how they evaluated both alternatives throughout the trials. This was the case mostly for gains (Table 2 and Table 3). For example, they would mention how they would choose LL if the delay was relatively short but considered SS if the delay to LL became too long. Conversely, those who showed zero discounting reported that they were only concerned with a single feature, which was the case mostly for losses (Table 2 and Table 3). For instance, zero discounting participants often mentioned the gratification of getting a loss out of the way (SS outcome), but rarely said anything about the gratification of deferring a loss (LL outcome), or about weighing this against the amount. Furthermore, the verbal reports revealed rather specific rules and strategies used by the participants. In sum, these results, obtained through observed responses and verbal reports, indicated that discounting of losses and discounting of gains involve different behavior-environment situations.
Note. The numbers correspond to how many of the verbal responses reflected the different strategies under the gain and loss scenarios. Source: Adapted from "The sign effect, systematic devaluations and zero discounting" by E.
Note. Verbal report strategy type: both = Both Alternatives strategy; single = Single Feature strategy; no = No Strategy. Loss rule: x = Use of the rule: "Make payments right away". Source: Adapted from "The sign effect, systematic devaluations and zero discounting" by E.
Study 2 (Furrebøe, 2020b) was a partial replication of the experimental procedure in Study 1. We examined the differences in discounting of gains and losses, using the same procedure as in Study 1 on 24 adolescents. Study 2 did not include verbal reports. In addition to adding empirical data on the sign effect both on an aggregate and individual level, the purpose was to compare the discounting behavior of adolescents and adults. Again, we investigated the potential variations in behavioral patterns between discounting of monetary gains and losses. Based on previous studies (Green et al., 1994, 1999; Steinberg et al., 2009) one might expect adolescents to discount more by delay than adults.
The results showed that, like adults, most of the adolescents discounted gains more steeply than losses, depicting the sign effect on a group level. Discounting of gains and losses also appeared qualitatively different. While the discounting-of-gains curves often displayed some systematic change, the discounting-of-losses curves did not, but rather reflected consistent choices of SS loss (zero discounting) or abrupt or irregular changes. Fig. 3 shows two individual sets of IP values for gains and losses, participant 110 (panel 1) and participant 111 (panel 2), and the mean IP values for the participants in Study 2 (n = 24; panel 3). Participant 110 (panel 1) shows zero discounting of losses, with no systematic, gradual change of slope. Although there is a decline for gains, the slope changes abruptly. Participant 111 (panel 2) shows positive discounting of gains and a u-shaped discounting curve for losses. The aggregate discounting curves (panel 3) display steep positive discounting for gains and shallow, slightly u-shaped, discounting of losses. All individual graphs from Study 2 can be found in Appendix D.
Overall, adolescents demonstrated a steeper discounting of gains compared to adults. This corresponds with other studies that show that children are more impulsive than adults (Green et al., 1994, 1999; Steinberg et al., 2009). Further, the steeper aggregate discounting curve for gains in Study 2 seemed to be due to both quantitative and qualitative differences on an individual level. There were several cases of abrupt preference changes among the adolescents, e.g., participant 111 (Fig. 3, panel 2), as opposed to the gradual devaluations that were observed in other adolescents and most of the adult participants, e.g., participant 3 (Fig. 2, panel 1). Also, the discounting-of-losses curves for adolescents were qualitatively different from those for adults. The aggregated loss curve for adolescents was shallower than for adults and nearly u-shaped (Fig. 3, panel 3).
There are variations in the discounting patterns observed in Study 1 and 2; however, the direct reinforcing contingencies are not possible to delineate. For instance, while repeated choices of SS loss seemed to be accompanied by the rule "Get any payment out of the way as soon as possible", we are not certain whether it was the task completion that was reinforcing or perhaps that ending the aversive waiting time was the main reinforcing contingency. In order to check these direct contingencies more closely, we also conducted a third study (Furrebøe et al., 2019). Study 3 consisted of two experiments (Study 3a and 3b). A real-time operant reinforcement setting was used to focus on the actual contingencies of reinforcement involved in choice behavior, rather than the hypothetical subjective values.¹ Real-time procedures have proven useful in previous studies assessing discounting in humans (e.g., Reynolds and Schiffbauer, 2004). We measured the frequency of responding to the alternatives while manipulating actual time in seconds, hoping to effectively separate the effect of reinforcement delay from the effect of molar considerations. Molar consideration here means that the participant considers the larger picture rather than responding to each trial separately. Molar considerations may, for instance, involve responding so as to finish the whole sequence of trials as quickly as possible. Other examples are attempting to maximize overall reinforcement (Jacobs and Hackenberg, 2000), or to maximize reward density, meaning that if speed matters one might try to maximize reward per unit of time (Sonuga-Barke, 2005; Sonuga-Barke et al., 1992). Notably, considerations or calculations may still occur using real-time operant reinforcement. Prolonged exposure to similar contingencies may lead to choices based on considerations about what the alternatives have produced already, or, if information is lacking, to choices based on assumptions the participant makes about features of the test.
For Study 3a we designed a computer task consisting of a discounting procedure in which participants could earn points on concurrent VI schedules. There were 20 participants aged 17-71 assigned to three different conditions.¹ In all three conditions the participants were asked to collect points by responding to keys on the computer, SS points with key A and LL points with key B. The first trial produced 1 point for a response to A or 20 points for a response to B. Points were displayed below the clicked key after 0.3 s and disappeared again after 1 s, but the aggregate points earned from both keys were visible on a separate display on screen throughout the task. When one of the keys was pressed, both keys were inactivated until the delay expired, and additional presses would not have any effect. Each delay was therefore experienced. The delay to release of points from key B would increase throughout the session or until the participant preferred key A. The objective was to establish whether the increasing delay would affect the frequency of responding to the alternatives. In order to reduce the possible influence of verbal information on their choice behavior, participants did not know in advance what would happen, in terms of change in delay or number of points, when they responded to keys A and B. Although the lack of information might increase the possibility of some types of molar considerations, it also created a setting where the participants had to respond to the keys to get in contact with the contingencies. Consequently, the preference shift was a gradual change in responses, and the identification of an indifference point (IP) had to be based on a certain number-of-responses criterion. IP was here defined as the point where responses shifted from stable LL to stable SS. The session consisted of 150 trials and lasted for 15-30 min depending on how the participants made their choices. Cumulative records from Study 3a are found in Appendices E and F.
The results showed that most of the participants who initially chose LL changed to SS as the delay to LL became sufficiently long, and it was established that choice behavior was sensitive to variation in delays measured in seconds, confirming earlier results from discounting studies using real-time choice procedures (Reynolds, 2006).
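A number-of-responses criterion of the kind described for Study 3a can be sketched as below. The exact criterion used in the study is not stated in this review, so everything specific here is hypothetical: the function name `find_shift`, the parameter `run` (the number of consecutive identical responses taken to mean "stable"), and the choice to report the LL delay at the first trial of the first stable SS run.

```python
from typing import Optional, Sequence


def find_shift(
    responses: Sequence[str],
    ll_delays: Sequence[float],
    run: int = 5,
) -> Optional[float]:
    """Return the LL delay at which responding shifts from stable LL to
    stable SS, or None if no such shift occurs.

    Assumption: 'stable' means at least `run` consecutive identical
    responses; `run` is a hypothetical criterion, not the study's value.
    """
    if len(responses) != len(ll_delays):
        raise ValueError("responses and ll_delays must be equally long")
    seen_stable_ll = False
    streak_label, streak_len = None, 0
    for i, response in enumerate(responses):
        # Track the length of the current run of identical responses.
        if response == streak_label:
            streak_len += 1
        else:
            streak_label, streak_len = response, 1
        if streak_len >= run:
            if response == "LL":
                seen_stable_ll = True
            elif response == "SS" and seen_stable_ll:
                # First trial of the first stable SS run after stable LL.
                return ll_delays[i - run + 1]
    return None
```

Mixed responding between the two stable runs is ignored by this rule, which reflects the gradual preference shift described above. A participant who responds on SS from the start, or who never leaves LL, yields `None` rather than an indifference point.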
In Study 3b we recruited 20 new participants aged 18-50. The procedure was identical to condition 1 in Study 3a, except that the sessions now consisted of 200 trials to allow for a decreasing delay following the increasing delay. The purpose of introducing the decreasing delay was to control for molar considerations. If preference changed back to the initial choice when the delay decreased again, this would indicate that it was the delay to reinforcement that affected the change, rather than an effect of the duration of the test or any molar considerations. It is conceivable that, for example, trying to end the test as quickly as possible would involve a momentary escape response explained by late-session motivating operations (Michael, 1993), but it is also likely that molar considerations are involved. In Study 3a and 3b it was not possible to end the test immediately, and the participants possibly had to make overall judgements about previous and prospective alternatives, choosing the alternative that minimized the total time spent on the task. Nevertheless, specifying the various types of molar considerations was outside the scope of this study. We found that participants changed back to the LL points when the delay was sufficiently reduced (Appendices G and H), and we concluded that delay to reinforcement was the main effect for preference change in this study. Two post-experimental questions were included in Study 3b to be compared to actual choice responding, and possibly to supplement it with indications of undetected contingencies that may have been involved in the participants' choices. The questions were as follows: 1. Did you attempt to obtain as many points as possible? and 2. If this is the case and if you switched from LL to SS, why did you switch? The post-experimental verbal reports did not prove sufficient to suggest any specific molar considerations. Most participants answered either that they tried to gather the most points possible or that they chose the alternative that went faster. In sum, Studies 3a and 3b indicated how each choice may be affected by reinforcer delay for human participants.

What were the main qualitative differences in discounting patterns between monetary gains and losses?
By analysing the individual-level responding in these three studies, we observed some qualitative differences in discounting patterns between gains and losses, which aligns with other studies also based on individual-level analysis (Abdellaoui et al., 2018; Yeh et al., 2020). Specifically, while discounting of gains mainly showed the conventional positive discounting pattern, we found zero discounting and unsystematic discounting in the case of losses, which produced relatively shallow aggregate curves. Like negative discounting, zero discounting seems to result from preference for the larger postponed gains or preference for hastening losses. Myerson et al. (2017) investigated individual differences in delay discounting of gains and losses and found that although many of the participants tended to choose to postpone payment as delay was increased, some participants always chose the smaller, immediate loss. Participants who always chose the smaller, immediate loss were labeled minimizers (zero discounting). Further, Myerson et al. (2017) found that some participants, who were labeled debt-averse, were more likely to choose to pay immediately as delay increased (negative discounting). Importantly, these "debt-averse" individuals were also more likely to choose larger later gains as delay increased, and they scored lower on the Eysenck I7 Impulsiveness scale than the "minimizers" or the "loss averse". A similar "debt-aversiveness" has also been observed for nonmonetary outcomes. Abdellaoui et al. (2018) conducted a discounting experiment on gains and losses of spare time/working time. The participants were presented with a hypothetical situation of a research assistantship, for instance involving two working sessions lasting four hours each, one session scheduled now and the other in six months. They were then asked whether they would prefer to gain two hours now or three hours in six months. Examining the variations in discounting pattern, they found that 36 of the 101 participants exhibited negative discounting for losses of time and positive discounting for gains of time. In other words, they preferred the extra time sooner, but also to lose time sooner.
¹ In the original experiment we also explored specific features of the experimental design, such as the use of seconds of delay, the use of points as reinforcer, and how to measure IP. For the purpose of this review, however, we focus solely on the issue of specifying the direct contingencies.
The main qualitative differences in discounting patterns between gains and losses in the reviewed and current studies mainly relate to the deviant patterns found in discounting of losses. While discounting of gains typically displays positive discounting, discounting of losses may be positive, negative, zero, or unsystematic. Importantly, zero discounting connected to paying a fine is likely not as common in the real world as it is in these experiments. In a real-world situation there are usually budget constraints and people may have trouble paying bills immediately although they would prefer to do so. Nevertheless, the current studies elucidate how discounting of losses consists of different reinforcing contingencies than discounting of gains.

How did qualitative variations in discounting differ in adolescents and adults?
Studies 1 and 2 found larger variations in discounting of both gains and losses among adolescents compared to adults. While adults typically showed a gradual positive discounting of gains, there were several cases of unsystematic (abrupt shifts and u-shaped) discounting of gains among adolescents. Regarding losses, there was zero discounting and unsystematic discounting among both adults and adolescents, but there were more cases of unsystematic discounting among adolescents, and specifically the u-shaped discounting was mainly found among adolescents. The differences between adolescents and adults are in accordance with other current studies, which have found that those who discount positively were significantly older than those who discount negatively (Yeh et al., 2020). Some studies have concluded that individual differences in discounting of gains are mainly quantitative, while differences in discounting of losses may be both qualitative and quantitative (Abdellaoui et al., 2018; Yeh et al., 2020). The interpretations of results concerning the connection between quantitative and/or qualitative differences and the sign of the outcome align with the reviewed results for the adult population, but not for the younger population. Within the adolescent group (Study 2) there were several cases of unsystematic discounting of losses, but there were also several cases of unsystematic discounting of gains, particularly abrupt shifts.
The u-shaped discounting curve for losses, more often observed among adolescents, may indicate a lack of an established pattern of choice, and that zero discounting, with its corresponding rule-following behavior, comes with experience. Also, the abrupt shifts in discounting of gains observed among the adolescents could be an indication that they differentiate less between delay intervals. The choice would then be between now and later, regardless of how much later, indicating that the effect of the short-term consequences is stronger than that of devaluation considerations. Adults, on the other hand, seemed to differentiate more between delay intervals, resulting in a more gradual positive discounting pattern for gains. This suggests a greater extent of molar considerations of the outcomes over time, and that rules may have gained control of behavior over direct consequences. There is also the possibility that the abrupt changes and u-shaped discounting among adolescents are due to the long delay intervals used. In order to obtain a graded decline in subjective value in adolescents, it may be necessary to use shorter increments than those used for adults. Relatedly, adolescents could simply be inexperienced with money, paying bills, and incurring debt, and for that reason be uncertain about how to respond to the discounting task. Further research on discounting in adolescents, including procedures and methods, is called for.

How might rules and consequences interact to influence qualitative differences in discounting?
Results from the three reviewed studies suggest that discounting of gains and discounting of losses are not necessarily associated with the same reinforcing processes. The qualitative differences seemed to involve different combinations of verbal rules and direct reinforcing contingencies, which is in line with early research on choice behavior (Fisher and Mazur, 1997).
The discounting procedures in Study 1 and Study 2 showed that participants often chose the hypothetical SS outcome as the delay to the SS outcome became shorter, regardless of sign, which points to the importance of response-reinforcer contiguity in discounting (Mazur, 2006). Similarly, Hardisty et al. (2013) argue that present bias is one explanation of the sign effect: people prefer to resolve gains as well as losses immediately. Results from the reviewed studies support the present bias hypothesis in that they show the same preference for "resolving" any issue sooner rather than later. The operant procedures and suggestive evidence from Study 3 further strengthened the argument that direct contingencies are part of the explanation of discounting. Participants who initially had a high frequency of responding to the LL alternative changed to a high frequency of responding to the SS alternative when the delay to the LL outcome reached a certain level, and changed back to a high frequency of responding to LL when the delay to LL again became shorter. These results are in accordance with previous studies (Hyten et al., 1994; Lagorio and Madden, 2005; Lane et al., 2003), and indicate that single responses are sensitive to the reinforcer delay. Like Study 3b, Scheres et al. (2006) differentiated between immediacy of the outcome and the effect of delay aversion by introducing and removing a post-reward delay in a real reward study. They found that the steeper discounting of gains among younger participants was mainly due to immediacy of the outcome. The current findings corroborate the results from Scheres et al. (2006) and support the view that both the steeper discounting of gains and the shallower discounting of losses may be explained by direct reinforcing contingencies and the proximity of the outcome.
Based on the current studies one might further argue that a molecular view of behavior is important in explaining the influences on qualitative variations in discounting.
The observed sign effect and qualitative variations could also be explained by factors related to molar considerations. In Study 1, discounting in some cases appeared to accord with rules or considerations of the outcome. The verbal reports described the use of rules, but also how participants seemed to consider one or both alternatives throughout the trial. Verbal reports may be inaccurate and may not reflect actual behavior during trials. They did, however, correspond well with the participants' responses on the discounting tasks, showing a preference for sooner losses, for instance. Abdellaoui et al. (2018) argued that the observed preference for sooner losses possibly reflects a sophisticated decision to avoid procrastination. Myerson et al. (2017) explain the preference for sooner losses through individual differences. "Debt-averse" individuals tend to choose the SS loss as delay increases, and are therefore assumed to focus on avoiding debt, that is, avoiding outstanding negative issues, as opposed to "loss-averse" individuals, who focus on avoiding the loss itself. Whether the participants in Study 1 in fact attempted to avoid procrastination or disliked incurring an outstanding negative issue is uncertain, but it is conceivable that their choices involved molar consideration of some kind. Shallow discounting of gains was associated with rules of maximization, and zero discounting of losses was connected to the rule: "Get the fine out of the way as quickly as possible". Most people tend to hasten pleasant things and postpone unpleasant things, and find that they need to apply strategies or rules to prevent these behaviors.

Several reinforcing contingencies
Discounting involves several reinforcing contingencies beyond the SS/LL dimensions (Estle et al., 2019; Furrebøe, 2020a, 2020b; Gonçalves and Silva, 2015; Weatherly and Derenne, 2012). For instance, jogging may involve immediate discomfort, immediate happiness, better shape later, and less time for other things now. Picoeconomics regards choice behavior as intrapersonal bargaining (Ashe and Wilson, 2020). An individual's behavior comprises a variety of behaviors, and when opposing behaviors occur, bargaining takes place. Such bargaining seems to emerge in the current studies. There are various combinations of direct reinforcing contingencies and overall considerations about the outcomes, forming an accumulated sum of strength of reinforcement of the alternatives. How we discount one alternative is dependent on the other alternative (Huskinson et al., 2016). Thus, it is not necessarily the sign of the outcome that dictates the difference in choice between gains and losses, nor is the monetary outcome necessarily the only reinforcing effect on behavior. In Studies 1 and 2 the choices appeared to be between the same relative values (between SS and LL monetary gains in Scenario 1 and between SS and LL monetary losses in Scenario 2). However, additional reinforcing contingencies emerged that were distinct for each of the scenarios. By using a double-delay procedure, the focus was on the relative delay difference between SS and LL rather than the delay between an immediate and a delayed outcome (Furrebøe, 2020a, 2020b), elucidating direct reinforcing contingencies beyond those related to the presented alternatives themselves. For instance, the monetary loss scenario typically involves positive punishment by imposing a fine, but choices in it also seemed to be positively reinforced through the gratification of adhering to rules.

Competing contingencies
Which reinforcing contingencies prevailed seemed to vary across the current studies. In the case of immediate gains, contingency-shaped behavior typically outcompeted rule-governed behavior, supporting earlier research (Hayes et al., 1986; Kudadjie-Gyamfi and Rachlin, 2002). It may be tempting to choose the SS money even though you know you will gain more if you wait for the LL money, and this dilemma depends on the adjustment of delay and amount. Concerning zero discounting of losses, the negative reinforcement of removing the expected loss (paying the bill immediately) seemed to outcompete the reinforcement value of deferring that loss, regardless of delay adjustment. The reason why deferring the loss (avoidance) is outcompeted in this case may be that doing so provides no additional reward (Ainslie, 2010); the value remains the same. Along the same lines, Asgarova et al. (2020) found that choices involving losses are less susceptible to contextual influence than choices involving gains. In the case of losses, the short-term consequence of the response has to do with ending the delay or adhering to a rule, rather than with gaining something.
The different aggregate patterns for gains and losses may further imply that the use of rules and strategies works in opposite directions. For gains, the use of rules strengthens the value of the LL gain, upholding the competing contingencies. For losses, the use of rules strengthens the SS loss, which already has the strength of direct contingencies, adding up to a net value that easily outcompetes other contingencies regardless of delay. By comparison, choice bundling is the strategy of pre-committing to the LL gain to avoid choosing the SS gain, recognizing our tendency to succumb to the SS alternative. Rules or strategies related to losses, such as those in Study 1, are easier to follow as they are connected to the SS alternative rather than the LL alternative. For instance, zero-discounting participants often mentioned the gratification of getting a loss out of the way (rule strengthening the SS outcome), but rarely said anything about the gratification of deferring a loss (LL outcome).
Other competing contingencies may involve past experience or anticipations of the future (Molouki et al., 2019; Odum et al., 2020). For instance, in the prospect of a gain such as a vacation or a kiss, the anticipation itself may be pleasurable and may lead to a choice of LL rather than SS (Loewenstein, 1987, as cited in Harris, 2012). This pleasure in anticipation could explain the high frequency of responses to the LL alternative in the gain scenario in Studies 1 and 2. The prospect of a monetary gain can be gratifying, but it is less likely that there is any pleasure in the postponement of an LL monetary loss. Along the same lines, research suggests that the steeper discounting of non-monetary losses (physical pain, embarrassment, rejection, etc.) compared to monetary losses results from dread minimization (Clatch and Borgida, 2020; Harris, 2012). Although physical pain probably produces more dread than a monetary loss, the same principle can be applied. The expectation of a future monetary loss, the persisting "pain" of knowing that you will have to pay eventually, may add to the aversiveness compared to making the payment right away. Thus, the delay to payment is important, but the length of the delay has less effect on the choice, and few cases of gradual devaluation of losses are therefore observed.

Are the current procedures and measures sufficient for identifying qualitative variations in discounting?
Discounting is a pattern of choice responses. Often the objective in a discounting study is to predict or describe behavior on an aggregate level (group level or individual level), and the focus of investigation is therefore often on the whole discount function, and on the mathematical function that best describes the data. Fitting a mathematical model to the indifference points emphasizes the aggregate subjective value, but also limits the possibility of obtaining information about the factors contributing to discounting (Mitchell et al., 2015). To explain why we discount, the specific reinforcing contingencies need to be investigated. It may be difficult to describe such molar patterns of behavior in terms of discrete response-consequence relations (Critchfield and Kollins, 2001). Still, it is necessary to include a molecular view and decompose discounting into smaller parts, in order to capture the details that are not observable otherwise. Following up on recent findings (Białaszek et al., 2019; Estle et al., 2019), we found it necessary to conduct the experiments and analyze the data on multiple levels, to understand the reinforcing contingencies in discounting of gains and losses in connection with the qualitative and quantitative differences in discounting patterns. The real-time procedures in Studies 3a and 3b provided single-response data, enabling the investigation of the contingencies of reinforcement. Studies 1 and 2 were within-subject discounting experiments using titration procedures with hypothetical gains and losses; they provided clustered single-subject behavioral variables across time and aggregate behavioral variables across subjects, which in turn allowed for AUC calculations and statistical analysis. The qualitative variations on an individual level corresponded with the regular discount functions on a group level, corroborating that comparing data across levels of investigation is advantageous.
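To make the AUC measure mentioned above concrete, the following is a minimal sketch of a normalized area under the curve computed from indifference points with the trapezoid rule. The function name, the example delays, and the example indifference points are hypothetical and are not data from the reviewed studies. The sketch also assumes that the curve starts at zero delay with the undiscounted amount; in a double-delay procedure the delay variable would have to be defined accordingly.

```python
def discounting_auc(delays, indifference_points, amount):
    """Normalized area under an empirical discounting curve (trapezoid rule).

    delays: delays at which indifference points were obtained (any unit).
    indifference_points: subjective values obtained at those delays.
    amount: nominal (undiscounted) amount of the delayed outcome.
    """
    if len(delays) != len(indifference_points) or not delays:
        raise ValueError("delays and indifference_points must match and be non-empty")
    if amount <= 0:
        raise ValueError("amount must be positive")

    points = sorted(zip(delays, indifference_points))
    # Assumption: the curve starts at delay 0 with the undiscounted amount.
    if points[0][0] > 0:
        points.insert(0, (0, amount))

    max_delay = points[-1][0]
    if max_delay <= 0:
        raise ValueError("at least one delay must be greater than zero")

    # Normalize both axes so that the area lies on a common scale.
    xs = [d / max_delay for d, _ in points]
    ys = [v / amount for _, v in points]

    # Sum the areas of the trapezoids between successive points.
    return sum(
        (xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1]) / 2
        for i in range(1, len(points))
    )


# Hypothetical example: delays in months, nominal amount of 3000.
delays_months = [1, 6, 12, 60]
zero_pattern = discounting_auc(delays_months, [3000, 3000, 3000, 3000], 3000)
positive_pattern = discounting_auc(delays_months, [2800, 2200, 1500, 500], 3000)
print(round(zero_pattern, 3), round(positive_pattern, 3))
```

Under these assumptions, a zero discounting pattern yields an area close to 1, while positive discounting yields a smaller area. Because the area summarizes steepness rather than shape, differently shaped patterns can produce similar values, which is one reason to combine such a measure with inspection of the individual curves.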
A discounting pattern is formed by the immediacy of each outcome, but also by overall considerations about the relative values of the alternatives as they change over time (Ainslie and Haslam, 1992; Mazur, 1987; Prelec and Loewenstein, 1991). As opposed to a conventional discounting procedure, where one outcome is immediate and stationary, the double-delay procedure involves adjusting the delays to both the SS and LL outcomes, an approach that captures how choices may be driven both by the immediacy of the outcome and by an overall consideration of the relative difference in subjective value between the SS and LL outcomes. Except for a few cases (e.g., Mitchell and Wilson, 2010), the double-delay procedure is not widely used. Mitchell and Wilson (2010) compared smokers' and non-smokers' discounting in both a SmallNow versus LargerLater and a SmallSoon versus LargerLater procedure, and found that discounting also occurred in the double-delay procedure, although to a lesser degree. Replicating the procedure from Study 1 in Study 2 further increased the validity of this procedure. More studies using a double-delay procedure would be valuable.
We also relied on visual inspection to investigate the variations in discounting patterns. Combining statistical analysis and visual inspection helps prevent confirmation bias, subjectivity, and unreasonable emphasis on statistical measures (Hales et al., 2019; Laraway et al., 2019). In two of the reviewed studies, post-experimental verbal reports were collected in order to indicate possible non-observable reinforcing contingencies. Verbal reports have proven successful in earlier studies as well. Horne and Lowe (1993) investigated matching in human behavior and the corresponding verbal behavior, in particular the use of rules, in a series of experiments on choices. They found a significant positive relationship between participants' stated preferences and their rates of responding. Similarly, we found the verbal reports to correspond well with actual responding in Study 1. However, acknowledging the possibility of inaccurate results from verbal reports, we regard these findings as putative. Improved verbal reports, relevant for discounting procedures, should be developed.

Limitations
One limitation common to the three studies is that we had no measure to determine whether participants were attending to the tasks, either through the discounting procedures or through questions to the participants. Consequently, the non-systematic effects and differences between gains and losses, as well as differences between adults and adolescents, may be due to differences in attention. Attention to the task should be controlled for in future studies on qualitative variations in delay discounting. Given Mitchell and Wilson's (2010) report of shallower discounting with a double-delay procedure, the use of such a procedure in Studies 1 and 2 may partly explain the lack of discounting observed in many of these cases. Another reason for the shallower discounting, or lack of discounting, may be that monetary outcomes were used. In their systematic review, Odum et al. (2020) examined the quantitative differences (steepness) in delay discounting in recently published research, and found that non-monetary outcomes were discounted more steeply than money in most of the studies. The differences between monetary and non-monetary outcomes may well have a similar impact on qualitative differences. There is evidence of non-positive discounting also in relation to such outcomes as spare time/working time (Abdellaoui et al., 2018) or plea bargaining (Clatch and Borgida, 2020), but more research is needed on qualitative differences and non-monetary outcomes in particular.

Conclusion and future directions
This review sought to guide further research in discounting. Specifically, the purpose was to contribute to research aimed at identifying the underlying influences on delay discounting. Generally, more empirical data, including replication studies on qualitative variations, are warranted. There is still a lack of evidence on when and how frequently qualitative variations occur in relation to losses, particularly for non-monetary outcomes. In addition, investigations are needed on how age or experience influences qualitative variations, or whether qualitative variations may be explained by traits. To explain what lies behind these qualitative differences, it may be fruitful to examine more closely the rules and consequences that account for the various patterns of discounting. To follow up on the suggestion from Mitchell et al. (2015), research that manages to delineate bias towards the immediate alternative from longer-term discounting factors may close in on the contributing factors of qualitative variations in discounting.
If we are to explore these qualitative variations in discounting further, we also need appropriate discounting procedures. Extending the use of double-delay procedures seems to provide additional details valuable to the understanding of qualitative variations in discounting, and needs further exploration. Another plausible extension regarding methods is to improve the verbal reports used. For instance, more specific questions for the verbal reports could elicit explicit information about participants' strategies and thereby specify the reinforcing contingencies connected to these qualitative variations. Alternatively, existing questionnaires could incorporate questions targeting the use of strategies and rules about temporal choices. Further, it is important to connect the single-response, single-subject, and aggregate studies, to obtain an overarching understanding of the complexity of discounting. Finally, new procedures that accommodate an increased focus on the more molecular approach to discounting are needed. Assuming that the qualitative variations we observe in individual delay discounting are indications of how specific contingencies are controlling behavior, it is important to also employ operant discounting procedures in discounting research. All relevant competing contingencies need to be considered, including how molar considerations or rules influence our choices in relation to other direct contingencies in various ways in various situations. Green and Myerson (2019) argue that interventions intended to change discounting of gains may not change discounting of losses, because there are multiple factors that constitute the discounting behavior and patterns. Thus, pinpointing the specific contingencies involved in discounting of gains and losses is particularly important for further development of interventions intended to modify discounting behavior.
Discounting research has proven helpful in extending our knowledge about choice in everyday life situations and in assessment and treatment of excessive impulsive behavior and addictive behavior related to substance abuse or gambling, for instance. Once the research area of qualitative variations is explored more fully, these findings could be valuable contributions to the development of treatments for these socially significant problems.

In case the delay becomes more than a certain length, I chose the smaller amount, mostly because I worry I would lose the reward.

Too small a difference between the amounts?
3 1 I tried to find out how much I would gain by choosing the larger delayed outcome. If… 4 2 I wanted as much as possible. But when the delay became years, I figured I would gain the most by choosing the smaller amount and place it in the bank.
If I have to make a payment due in several years ahead, a month or two more or less doesn't matter.

2
When the large payment was postponed for years, I thought it be better for me to choose the amount I could spend while being a student the next 0-5 years. However, when both amounts were postponed, I chose the largest amount.
Normally I prefer to pay right away, at least when the deferred payment alternative is short. The amount was not too large either, so it was possible to make the payment. When the delay was long, it was more tempting to choose that. It's not certain I live in 35 years. 6 2 When the largest amount was paid out not much later than the smaller, I chose the larger. I imagine I don't have the patience to wait 20 years. In that case I would rather receive the smaller amount. 1 I can see that it is an advantage to wait for the larger reward, but when the waiting became too long compared to the difference in amount, it feels as if it is not quite as profitable to wait. Personally, I think that it is worth waiting a year maybe 1,5 years for the additional 1500.
Just wanted to pay the smallest amount possible and get it done as quick as possible.
12 2 Think about how much you can earn by deferring payments, or to withdraw the money and spend them yourself or deposit them in the bank for interest.
As long as I have money available, and not too tight a budget, I always like to pay my bills as soon as possible, so I don't get behind. Particularly if the amount is considerably lower. Therefore, as long as I can afford it I would always pay the smallest amount right away if possible. 13 1 When there were only a few months between the alternatives I found it ok to wait for the money, and receive a bit more. But when I was faced with years of waiting, I didn't think there was enough money for me to wait for.
I considered how much could I earn/save by postponing the payment and the possibility of paying, against the extent of time until payment. 14 2 Now, as a student, I would need the money immediately, because it gives me a sense of security. At the same time, I think that 10-20-30 years from now seems very distant, so I could might as well choose this larger amount.
NOK 3000 is not that much money. So, it is better to pay the fine right away, than to wait and pay more. When the payment is due far into the future, you can might as well pay as quickly as possible.
17 1 My responses were for the most part dependent on when the timing-aspect became abstract. After some time the amounts of money became irrelevant, and one could just as well wait a few more years.
Personally I would like to pay the fine as soon as possible, get it out of the way, but one week is often too short a time. I think the alternative of paying NOK 3000 in 3 months was the best alternative, because then you may be able to save, even though NOK 3000 is not a very large sum of money. To wait 5-20 years would only be a worry for me. 18 2 I felt it was no point in waiting for 1500 extra for more than 3 years, because things become more and more expensive and after 3 years the extra money may not be worth anything anymore. But, to wait a few months I think is ok, because I don't think the economy changes that quickly.
I prefer to get rid of payables as soon as possible, and preferably within 20 years. I prefer to spend as little money as possible, so I had no difficulties making these decisions.

19 1 During the part when the delay was relatively short between the two amounts and you got more money by waiting, it was easy to wait. But it got harder when time increased.
I would never postpone paying something that would increase in amount over time, because I have learned that you should pay when you have the money. When it was mentioned 40 years ahead, I felt it did not apply to me because then I might even be retired. 20 2 I always wanted to receive the larger amount. I always want to pay the smaller amount regardless of timing, and as soon as possible so I wouldn't be bothered with it anymore.
The fine should be payed as soon as possible, because the burden of having unpaid bills is greater than a possible gain in interest in this case.

1
No, I did not have a particular strategy. I don't like to borrow or owe anyone money. So, I pay my bills as soon as I receive them instead of waiting until they are due. I don't like having payables at all, because it is tempting to spend the money you are supposed to use on the payment. Besides it is less to pay NOK 3000 than NOK 4500. 24 2 I gained more by choosing the smaller amount when there was talk about several years between the smaller and the larger amount. But then later when there was talk about many years before receiving any of them, the amount did not matter anymore.
Pay the fine as soon as possible and get it out of the way! 25 1 When there's months or years until I receive the money, I would rather have them as soon as possible. On a general basis I like to pay my bills as soon as possible to get it out of the way. On the other hand, I would in some cases postpone in order to save or spread the payments/expenses over a lengthier time period. 26 2 The difference between the amounts were not so big, so I felt it would not be profitable to wait more than one year to have the larger payment.
It is better to pay right away, if time is limited (short delays?). Later when they talk about several years, it is better to pay a larger amount later. 27 1 I wanted the payment as soon as possible I would prefer to get the fine out of the way as soon as possible, but when it starts to become several years between the alternatives it would be more profitable to wait and rather save the money intended to pay the fine. 28 2 I have no patience for waiting. I would pay the smaller amount with in the shortest possible time, get the fine over with in shortest possible time. 29 1 At first the interest-rate was ok, but then, increasingly so, the distance between payments became so long that NOK 1500 in interest-rates seemed a little too small.

No, I didn't have a particular strategy
30 2 no answer My strategy was to pay the least possible. 31 1 no answer no answer
Source: Adapted from "The sign effect, systematic devaluations and zero discounting" by E.
Adapted from "An exploratory study of a real-time choice procedure" by E.F.