Operant behavior is organized in response bouts (e.g., Daniels & Sanabria, 2017a; Hoffman & Newland, 2016; Íbias, Daniels, Miguéns, Pellón, & Sanabria, 2017; Matsui, Yamada, Sakagami, & Tanno, 2018; Romero, Daniels, Gipson, & Sanabria, 2018; Smith, McLean, Shull, Hughes, & Pitts, 2014). Such organization implies that operant behavior can be described using three parameters: the rate at which bouts are initiated (b), the rate at which responses are emitted within bouts (w), and the mean length of a bout (L). To understand how operant behavior adapts to environmental demands, it is important to establish the differential sensitivity of these parameters to various classes of manipulations (Sanabria, Daniels, Gupta, & Santos, 2019). Research so far suggests that b is uniquely sensitive to changes in reinforcer efficacy (Brackney, Cheung, Neisewander, & Sanabria, 2011; Brackney, Cheung, & Sanabria, 2017; Daniels & Sanabria, 2017b; Johnson, Pesek, & Christopher Newland, 2009; Shull, 2004; Shull, Gaynor, & Grimes, 2001) and rate of reinforcement (Brackney et al., 2017; Cheung, Neisewander, & Sanabria, 2012; Reed, 2011, 2015; Reed, Smale, Owens, & Freegard, 2018; Shull et al., 2001; Shull & Grimes, 2003; Shull, Grimes, & Bennett, 2004), and w and L are sensitive to changes in contingency requirements (Brackney et al., 2011; Brackney et al., 2017; Brackney & Sanabria, 2015; Chen & Reed, 2020; Reed, 2011; Reed et al., 2018; Shull et al., 2001; Shull & Grimes, 2003; Shull et al., 2004; Tanno, 2016). Thus, the parameters of the microstructure of operant behavior appear to index two of the three necessary conditions for operant performance (Killeen, 1994; Killeen & Sitomer, 2003; Sanabria, 2019): incentive motivation (b) and response–outcome association learning (w and L). The third necessary condition is sensorimotor capacity—the sensory capability to detect stimuli and the motor capability to emit responses that are relevant to the operant task. 
The present study aims at establishing whether another parameter in the microstructure of operant behavior may index sensorimotor capacity.

The mathematical principles of reinforcement (MPR; Killeen, 1994; Killeen & Sitomer, 2003) suggest that changes in sensorimotor capacity affect operant performance by setting a ceiling to response rate (Killeen, Hall, Reilly, & Kettle, 2002), a notion akin to the asymptotic response rate in Herrnstein’s (1970) hyperbola. The reciprocal of that ceiling is the shortest interresponse time (IRT) possible, which may represent a refractory period δ during which the subject completes one response and prepares for the next response (see Fig. 1a–b). It has been suggested that every IRT contains a fixed period δ that is selectively sensitive to changes in motoric requirements (Brackney et al., 2011; Cheung et al., 2012). According to this hypothesis, w is more sensitive to motoric requirements than b (e.g., Jiménez, Sanabria, & Cabrera, 2017; Johnson et al., 2009) only because δ constitutes a larger portion of within-bout IRTs than of between-bout IRTs.

Fig. 1

Illustrative timeline of operant behavior and parameters of its microstructure. a Responses are typically recorded as discrete events of negligible duration (black vertical bars). A refractory period between consecutive responses, δ, is estimated as the shortest interresponse time (IRT) observed; in this illustration, the shortest IRT separates Responses 4 and 5. For the estimation of other parameters, δ is subtracted from every IRT (gray blocks after Responses 6 and 8). The bout initiation rate, b, is estimated as the mean reciprocal of IRTs (without δ) separating consecutive bouts (e.g., Responses 6 and 7). The within-bout response rate, w, is estimated as the reciprocal of IRTs (without δ) within bouts (e.g., Responses 9 and 10). Bout length is estimated as the mean number of responses in a bout (e.g., Responses 11, 12, and 13). b Parameter estimation assumes a fixed δ. Because δ includes the response duration, parameter estimation assumes responses of fixed duration (black blocks). c It is possible, however, that response durations, and thus refractory periods, vary between responses. If a duration threshold is imposed, subthreshold responses (gray blocks) would not be recorded

In support of the motoric hypothesis of δ, Brackney et al. (2011) found that increasing the lever height and force requirement for rats responding on a schedule of reinforcement increases δ. However, they also found that this lever manipulation decreases the rate of bout initiations, b. Similarly, Posadas-Sanchez (2004) found that increasing the lever force requirement increased MPR-based estimates of δ, but also reduced MPR-based estimates of incentive motivation. Although they did not track changes in δ with lever height, Jiménez et al. (2017) found that, when response duration is subtracted from IRTs, not only b but also w and L decline when the lever is sufficiently high. Pinkston and Moore (2020) estimated the effect of force requirement on these parameters in rats responding for food, based also on the distribution of IRTs without response durations. They found that raising the force requirement selectively reduced b, but only if subcriterial responses (recorded responses with peak force below requirement) were excluded. This effect on b vanished when subcriterial responses were included in the analysis, suggesting that the effects of motoric demands on the parameters of behavioral microstructure are artifactual. It is thus unclear what role motoric demands such as lever height and force requirement play in the microstructure of operant behavior.

A possible explanation for the effect of motoric manipulations on the microstructural parameters of operant behavior suggests an indirect link between these variables: Perhaps the shortest IRTs are sensitive only to changes in motoric requirements (as the motoric hypothesis of δ postulates), but changes in the shortest IRTs impact the other microstructural parameters. To test this possibility, the current study directly manipulated the required response duration (and, through it, the effective δ) to determine its effect on the microstructural parameters of operant behavior. Rats were trained to respond on a variable interval (VI) 40-s schedule of reinforcement. Responses that met or exceeded a duration threshold of 0.0, 0.4, or 0.8 s (depending on the condition) were signaled by a brief tone and light flash. Only signaled responses could trigger reinforcement after the end of the variable interval.

A secondary goal of this study was to characterize the distribution of response durations emitted under a VI schedule. Whereas the motoric hypothesis of δ assumes that the time to emit a response is fixed, direct measurements of lever-press durations have found them to vary between responses (Byrne & Sarno, 2019; Fowler, Filewich, & Leberer, 1977; Roberts & Gharib, 2006; see Fig. 1c). This study examines the distribution of response durations across different duration thresholds and discusses its implications for response-bout modeling.

Method

Subjects

Eight male Wistar rats (WI/NCrl; Charles River Laboratories, USA), starting at postnatal day 74, took part in the study. All rats were pair-housed and had ad libitum access to food and water. Rats were housed on a reverse dark–light cycle (lights off 7 a.m. to 7 p.m.); experiments were conducted during the dark phase of this cycle. Rats had previously been trained to respond on left and right levers in an operant chamber on a VI 120-s schedule of food reinforcement. The study adhered to Arizona State University Institutional Animal Care and Use Committee guidelines.

Apparatus

Experimental sessions took place in eight Med Associates modular test chambers using the standard operant setup. The chambers were enclosed in a light and sound attenuating box with interior dimensions of 30.5 cm × 24.1 cm × 21.0 cm. All test chambers were controlled by MED-PC® IV software (Med Associates, St. Albans, VT). The operant chambers consisted of a clear polycarbonate roof, door, rear, and two lateral walls of aluminum panels mounted to a white polypropylene base. The center panel of one of the walls had a speaker attached to a multiple tone generator and an Eiko 1820 miniature incandescent house light. A liquid dipper with a head entry detector was located centrally on the wall opposite of the speaker. Two retractable levers flanked the dipper; a triple LED stimulus light panel was positioned above each lever. A 0.01-ml cup on the motorized arm of the liquid dipper provided reinforcement, which was a sweetened condensed milk (True Value® Walmart Brand, Bentonville, AR) and water mixture (one-third milk by volume). The operant chambers had a metal-wired floor and a stainless-steel waste pan filled with wood-chip bedding. The levers were set on continuous recording mode so that the duration of lever presses could be measured with a nominal resolution of 0.01 s. A lever press was required to be separated by 0.06 s or more from the previous lever press in order to be counted as a new response. This threshold was selected because 0.06 s was the absolute minimum lever-press IRT previously observed in our laboratory (Brackney et al., 2011). The levers were calibrated to activate when a force of 0.2 (±0.05) N was exerted on their edge.

Procedure

Throughout the experiment, lever presses were categorized as either super-threshold or subthreshold, where the threshold was 0.0, 0.4, or 0.8 s, depending on the experimental condition (see Table 1). If the lever-press duration exceeded the threshold, the response was signaled by a flash of the lights above the lever and a 5-kHz tone sounding for a brief but noticeable time (0.1 s). For the 0.0 s threshold, every lever press was immediately signaled.

Table 1 Training conditions

Daily sessions began with a 5-min acclimation period, during which no experimental events were programmed, followed by the extension of the left lever. Super-threshold lever pressing was reinforced on a VI 40-s schedule. Intervals were sampled without replacement from a 14-item list drawn from a Fleshler–Hoffman distribution (Fleshler & Hoffman, 1962). During reinforcement, the lever was withdrawn, the houselight was illuminated, and the dipper arm was raised. Three seconds later, the dipper arm was lowered, the houselight was turned off, and the lever was reextended. Sessions terminated after 80 min or 84 reinforcer deliveries, whichever occurred first.
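
The interval list described above can be sketched in a few lines of Python. This is a minimal illustration, not the MED-PC code used in the study: it assumes the progression usually attributed to Fleshler and Hoffman (1962), t_n = T[1 + ln N + (N − n) ln(N − n) − (N − n + 1) ln(N − n + 1)], and it assumes that the 14-item list is reshuffled once every interval has been used. The names `fleshler_hoffman` and `IntervalSampler` are hypothetical.

```python
import math
import random

def fleshler_hoffman(mean_s, n):
    """Return n intervals (in seconds) whose arithmetic mean is mean_s."""
    def xlogx(x):
        # By convention, 0 * ln(0) is taken as 0
        return x * math.log(x) if x > 0 else 0.0
    return [mean_s * (1 + math.log(n) + xlogx(n - i) - xlogx(n - i + 1))
            for i in range(1, n + 1)]

class IntervalSampler:
    """Draws intervals without replacement.

    Assumption: the list is refilled and reshuffled when exhausted.
    """
    def __init__(self, intervals, rng=None):
        self.intervals = list(intervals)
        self.rng = rng or random.Random()
        self.pool = []

    def next_interval(self):
        if not self.pool:
            self.pool = self.intervals[:]
            self.rng.shuffle(self.pool)
        return self.pool.pop()

vi40 = fleshler_hoffman(40.0, 14)            # 14-item list for a VI 40-s schedule
sampler = IntervalSampler(vi40, random.Random(1))
first_pass = [sampler.next_interval() for _ in range(14)]
```

Each pass through the list uses every interval exactly once, so the programmed mean interval is 40 s over any complete pass.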

The response-duration threshold varied across six experimental conditions (see Table 1). During the first three conditions (Cycle 1) the duration threshold was 0.0, 0.4, and 0.8 s. The following three conditions (Cycle 2) were replications of the previous three. Subjects were transitioned from one condition to the next after a minimum of 10 sessions, and when the mean response rate and the mean median response duration over the previous 5 days were judged stable by visual inspection.

Data analysis

All analyses were conducted on the responses of individual rats aggregated over the last five sessions of each condition. Estimates of bout-initiation rate (b), within-bout response rate (w), mean bout length (L), and refractory period (δ) were based on the assumption that IRTs are distributed according to a shifted mixture of two exponential distributions (Eq. 2 in Brackney et al., 2011):

$$ \begin{aligned} \Pr\left(\mathrm{IRT}=\tau \mid \tau<\delta\right) &= 0,\\ \Pr\left(\mathrm{IRT}=\tau \mid \tau\ge\delta\right) &= p\,w\,e^{-w\left(\tau-\delta\right)}+\left(1-p\right)b\,e^{-b\left(\tau-\delta\right)},\\ &\qquad \min\left(\mathrm{IRT}\right)\ge\delta>0;\; w\ge b>0;\; 1\ge p\ge 0. \end{aligned} \tag{1} $$

where L = p / (1 – p) (see Appendix A in Cheung et al., 2012).
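
As a reading aid, Eq. 1 can be written out directly in code. The sketch below is illustrative only: the function names are hypothetical, and the parameter values are made up rather than taken from the estimates reported here. It shows the density, the conversion from p to L, and one way of drawing IRTs from the mixture.

```python
import math
import random

def irt_density(tau, p, w, b, delta):
    """Eq. 1: density of an IRT of length tau (s)."""
    if tau < delta:
        return 0.0
    t = tau - delta
    return p * w * math.exp(-w * t) + (1 - p) * b * math.exp(-b * t)

def bout_length(p):
    """Mean bout length, L = p / (1 - p)."""
    return p / (1 - p)

def sample_irt(p, w, b, delta, rng):
    """Draw one IRT: within-bout with probability p, between-bout otherwise."""
    rate = w if rng.random() < p else b
    return delta + rng.expovariate(rate)

# Made-up parameter values, for illustration only
p, w, b, delta = 0.75, 2.0, 0.05, 0.1
rng = random.Random(0)
irts = [sample_irt(p, w, b, delta, rng) for _ in range(10000)]
```

With p = 0.75, `bout_length(p)` returns 3.0: on average, three within-bout IRTs follow each bout-initiating response.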

Equation 1 was fit to the distribution of IRTs using maximum likelihood estimation (Myung, 2003) with custom-written MATLAB® (MATLAB and Statistics Toolbox Release 2013, The MathWorks, Inc., Natick, MA) software. IRTs were defined as the intervals between the beginnings of each pair of consecutive super-threshold responses, excluding those with an intervening reinforcer; note that IRTs included response durations (see Footnote 1). Because estimates of δ were obtained from the shortest interval between super-threshold responses (see Fig. 1a), these estimates were expected to be very close to the 0.4-s and 0.8-s response-duration thresholds in the corresponding conditions (see Table 1), and are not informative of sensorimotor capacity in those conditions. Nonetheless, δ was estimated and is reported as a manipulation check. Because b and w were estimated on the basis of the distribution of IRTs without δ (see Fig. 1a), these estimates were not expected to vary with the response-duration threshold in a trivial manner.
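
The two steps described above (extracting IRTs, then fitting Eq. 1) can be sketched as follows. This is not the MATLAB code used in the study. It assumes that a response is super-threshold when its duration meets or exceeds the threshold, it fixes δ at the shortest IRT, and it swaps in an expectation-maximization (EM) loop as one possible route to the maximum likelihood estimates of p, w, and b. All names and data values are hypothetical.

```python
import math
import random

def extract_irts(presses, reinforcer_times, threshold):
    """presses: time-ordered list of (onset_s, duration_s).

    Returns onset-to-onset IRTs between consecutive super-threshold presses,
    dropping any IRT with a reinforcer delivered in between.
    """
    onsets = [t for t, d in presses if d >= threshold]  # assumption: >=
    irts = []
    for t0, t1 in zip(onsets, onsets[1:]):
        if not any(t0 <= r < t1 for r in reinforcer_times):
            irts.append(t1 - t0)
    return irts

def fit_mixture(irts, n_iter=200):
    """EM estimates of p, w, b in Eq. 1, with delta fixed at the shortest IRT."""
    delta = min(irts)
    x = [t - delta for t in irts]
    n = len(x)
    mean_x = sum(x) / n
    p, w, b = 0.5, 10.0 / mean_x, 0.5 / mean_x  # crude starting values
    for _ in range(n_iter):
        # E-step: probability that each IRT came from the within-bout component
        r = []
        for xi in x:
            a = p * w * math.exp(-w * xi)
            c = (1 - p) * b * math.exp(-b * xi)
            r.append(a / (a + c) if a + c > 0 else 0.0)
        # M-step: weighted maximum likelihood updates
        sr = sum(r)
        p = sr / n
        w = sr / sum(ri * xi for ri, xi in zip(r, x))
        b = (n - sr) / sum((1 - ri) * xi for ri, xi in zip(r, x))
    if w < b:  # enforce the constraint w >= b by relabeling the components
        p, w, b = 1 - p, b, w
    return {"p": p, "w": w, "b": b, "delta": delta, "L": p / (1 - p)}

# Toy event record (hypothetical values): one reinforcer at 5.9 s
presses = [(0.0, 0.5), (1.0, 0.1), (2.0, 0.6), (5.0, 0.9), (9.0, 0.45)]
toy_irts = extract_irts(presses, reinforcer_times=[5.9], threshold=0.4)

# Simulated IRTs with known (made-up) parameters, to check parameter recovery
rng = random.Random(0)
sim = [0.4 + rng.expovariate(2.0 if rng.random() < 0.75 else 0.05)
       for _ in range(2000)]
fit = fit_mixture(sim)
```

In the toy record, the press at 1.0 s is subthreshold and the IRT spanning the reinforcer is excluded, leaving two IRTs. The simulated data illustrate why δ works only as a manipulation check under a duration threshold: the shortest IRT falls just above the 0.4-s shift built into the simulation.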

Analyses of variance (ANOVAs) were conducted using Prism (GraphPad Software Inc., San Diego, CA). To identify significant effects, a 2 (cycle) × 3 (threshold) repeated-measures ANOVA was conducted on each parameter estimate of interest. Simple main effects were assessed with Tukey’s honestly significant difference test. Significant effects are reported when p < .05. ANOVAs were conducted on the log-transformed response rates, reinforcement rates, median response durations, interquartile ranges of response durations, and parameter estimates of Eq. 1, except p, which was log-odds transformed. All variables of interest are reported back-transformed. A 2 (cycle) × 2 (threshold) repeated-measures ANOVA was conducted on the arcsine-transformed (see Footnote 2) proportion of responses that exceeded the response threshold for the 0.4 and 0.8 s threshold conditions (all responses were necessarily above the threshold in the 0.0 s threshold condition).
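
The transformations named above, and what "back-transformed" implies for a reported mean, can be illustrated with a short sketch. The arcsine transform is assumed here to be the common arcsine square-root form; the exact form used in the study is specified in its footnote and may differ. Function names are hypothetical.

```python
import math

def geometric_mean(values):
    """Mean taken on the log scale, then back-transformed."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def logit(p):
    """Log-odds transform, as applied to p in Eq. 1."""
    return math.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + math.exp(-x))

def arcsine_sqrt(prop):
    """Assumed form of the arcsine transform for proportions."""
    return math.asin(math.sqrt(prop))

def inv_arcsine_sqrt(x):
    return math.sin(x) ** 2
```

A back-transformed mean of log-transformed rates is a geometric mean, so it is less affected by a single very high rate than an arithmetic mean would be.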

Results

Response and reinforcement rates

Overall response rate (computed including subthreshold and super-threshold responses) and super-threshold response rate declined with longer duration thresholds; overall: F(2, 14) = 36.64, p < .001, super-threshold only: F(2, 14) = 37.85, p < .001 (see Fig. 2a–b). Both dependent measures also declined between Cycles 1 and 2; overall: F(1, 7) = 30.52, p < .001, super-threshold only: F(1, 7) = 28.53, p < .001. Overall response rate declined significantly between the 0.0 and 0.8 s conditions and between the 0.4 and 0.8 s conditions in Cycle 1, p < .05, and differed among all conditions in Cycle 2, p < .001. Super-threshold response rates declined significantly with longer thresholds in both cycles, p < .05.

Fig. 2

Mean (±SEM) median response and reinforcement rates as a function of response-duration threshold, calculated over the last five sessions in each threshold condition in Cycles 1 (solid curves) and 2 (dashed curves). a Overall response rate, calculated using both super-threshold and subthreshold responses. b Super-threshold response rate. c Reinforcement rate

Reinforcement rate also declined with longer duration thresholds, F(2, 14) = 13.43, p < .001, and between cycles, F(1, 7) = 9.80, p < .05 (see Fig. 2c). Significant differences in reinforcement rate were observed between the 0.0 and 0.8 s conditions and between the 0.4 and 0.8 s conditions, p < .05. No significant Threshold × Cycle interaction effect was observed on either response rate or reinforcement rate.

Response durations

Figure 3 displays the distribution of response durations for the group and for two representative rats in each condition. In the 0.0 s conditions, the distribution of response durations appears to be approximately log-normal. Longer duration thresholds displaced a large portion of the distribution of response durations rightwards, just above the threshold. For the 0.4 and 0.8 s thresholds, response durations greater than the threshold appear to be log-normally distributed. Subthreshold response durations appear to follow a flatter distribution, which is distinct from that of the super-threshold durations.

Fig. 3

Distribution of response durations in the last 5 days of each threshold condition in Cycles 1 (solid curves) and 2 (dashed curves). The vertical dotted lines indicate the response-duration threshold. The left column is the group means, the middle and right columns are representative rats. The abscissa is on a log scale to highlight the log-normal-like distribution of a portion of response durations

Figure 4 displays summary statistics of the duration distributions: the mean (±SEM) of the individual median durations (see Fig. 4a), the interquartile range of durations (Fig. 4b), and the proportion of durations that met or exceeded the duration threshold (Fig. 4c). Median response durations increased with longer duration thresholds: F(2, 14) = 219.90, p < .001. The interquartile range of the response durations also increased as the threshold increased, F(2, 14) = 37.80, p < .001, but declined between cycles, F(1, 7) = 5.70, p < .05. The proportion of responses that exceeded the duration threshold increased between cycles, F(1, 7) = 29.49, p < .001. These effects indicate that longer response-duration thresholds resulted in longer, more variable response durations, and point at potential learning effects between cycles.

Fig. 4

Mean (±SEM) of three response-duration distribution statistics as a function of response-duration threshold, calculated over the last five sessions in each threshold condition in Cycles 1 (solid curves) and 2 (dashed curves). a Median response duration; across all threshold conditions, the median duration was between 0.32 and 0.54 s longer than the required duration. b Interquartile range of response durations. c Proportion of responses that exceeded the duration threshold; all responses in the 0.0 s condition exceeded the threshold by design

Response-bout modeling

Figure 5 shows the mean (±SEM) parameter estimates for each condition, based on fits of Eq. 1. Individual parameter estimates for each rat and condition are described in Tables 2, 3, and 4 of Appendix 2.

Fig. 5

Mean (±SEM) parameter estimates (Eq. 1) as a function of response-duration threshold, computed from super-threshold IRTs in the last five days of each threshold condition in Cycles 1 (solid curves) and 2 (dashed curves). Bout length, not including the bout-initiation response (a), within-bout response rate (b), bout-initiation rate (c), and refractory period (d). Estimates for individual subjects are in Tables 2–4 of Appendix 2

The average bout length, L, increased with longer duration thresholds, F(2, 14) = 11.53, p < .05. Significant differences in estimates of L were observed between the 0.0 and 0.4 s conditions and between the 0.0 and 0.8 s conditions in both cycles, p < .05. Within-bout response rate, w, significantly decreased with longer thresholds, F(2, 14) = 47.61, p < .001, and between cycles; F(1, 7) = 14.82, p < .05. Significant differences in estimates of w were observed among all conditions in Cycle 1, p < .05, and between the 0.0 and 0.4 s conditions and between the 0.0 and 0.8 s conditions in Cycle 2, p < .001. Bout-initiation rate, b, significantly decreased with longer thresholds, F(2, 14) = 34.12, p < .001, and between cycles, F(1, 7) = 14.89, p < .05. Significant differences in estimates of b were observed among all conditions in both cycles, p < .05. The refractory period, δ, increased with longer duration thresholds, F(2, 14) = 2599, p < .001, and between cycles, F(1, 7) = 28.76, p < .001. Significant differences in estimates of δ were observed among all threshold conditions in both cycles, p < .001. As expected, δ was close to the duration threshold when this was greater than zero. When considered together, the effects of response-duration threshold on L, w, and b suggest that higher response-duration thresholds yielded fewer but longer response bouts that contained more spaced within-bout responses.

To ensure that the model was providing reasonable fits, log-survivor plots (Shull et al., 2001) of the model predictions were compared to log-survivor plots of the observed IRTs. Appendix 3 describes how model predictions were determined. Figure 6 displays the log-survivor plots of the group mean and two representative rats for each duration threshold in Cycle 1. As the duration threshold increased, the shape of the log-survivor plot changed from the often-reported “broken-stick” pattern (e.g., Shull et al., 2001) to a straighter pattern. The model faithfully reproduced the distribution of IRTs in the 0.4-s and 0.8-s conditions. In the 0.0-s condition, however, the simulation appears to overestimate the prevalence of the longest IRTs (approximately the longest 20%). Because the slope of the right-hand side of the “broken-stick” corresponds to b (Shull et al., 2001), this divergence suggests that b may be underestimated in the 0.0-s condition. Nonetheless, regardless of condition, the shifted mixture of two exponential distributions (Eq. 1) was more likely to generate the observed IRTs than a single shifted exponential distribution,

$$ \begin{aligned} \Pr\left(\mathrm{IRT}=\tau \mid \tau<\delta\right) &= 0,\\ \Pr\left(\mathrm{IRT}=\tau \mid \tau\ge\delta\right) &= \lambda e^{-\lambda\left(\tau-\delta\right)},\\ &\qquad \min\left(\mathrm{IRT}\right)\ge\delta>0;\; \lambda>0. \end{aligned} \tag{2} $$
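
For readers unfamiliar with log-survivor plots, the sketch below shows one way to compute the plotted quantity: for each observed IRT t, the natural log of the proportion of IRTs longer than t. It also gives the survivor function implied by Eq. 1, whose right-hand limb has slope −b. The function names are hypothetical, and this is not the plotting code used for Fig. 6.

```python
import math

def log_survivor(irts):
    """Return (t, ln S(t)) pairs, where S(t) is the proportion of IRTs > t."""
    xs = sorted(irts)
    n = len(xs)
    points = []
    for i, t in enumerate(xs):
        if i + 1 < n and xs[i + 1] == t:
            continue  # tied values: only the last copy defines S(t)
        remaining = n - (i + 1)
        if remaining > 0:  # ln(0) is undefined, so the longest IRT is dropped
            points.append((t, math.log(remaining / n)))
    return points

def model_log_survivor(t, p, w, b, delta):
    """ln S(t) implied by Eq. 1."""
    if t < delta:
        return 0.0
    s = p * math.exp(-w * (t - delta)) + (1 - p) * math.exp(-b * (t - delta))
    return math.log(s)

points = log_survivor([4.0, 1.0, 3.0, 2.0])
```

When w is much larger than b, the within-bout term vanishes quickly as t grows, leaving ln(1 − p) − b(t − δ); this is the straight right-hand limb of the "broken stick".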
Fig. 6

Log survivor plots of super-threshold IRTs demonstrating model fit in Cycle 1. Solid curves correspond to empirically observed IRTs; dashed curves correspond to IRTs predicted by model simulation (see details in Appendix 3). Each row of plots corresponds to a different response-duration threshold. The left column is the group means, the middle column and right columns are representative rats. Although some deviations from the observed data seem prominent from a visual inspection of the plots, they actually comprise only a small proportion of very long IRTs

In Cycle 1, the log-likelihood ratio of Eq. 1 relative to Eq. 2 was 11,977 in the 0.0-s condition, 4,189 in the 0.4-s condition, and 1,269 in the 0.8-s condition. Although the mixture-distribution model has 3 times as many free parameters as the single-exponential model, based on the observed data, the former is at least e^1269 times more likely than the latter.
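
The comparison above can be sketched as follows, reading the reported values as natural-log likelihood ratios (which the e^1269 figure implies). The code is illustrative and uses hypothetical function names. It computes the log-likelihood of Eq. 1 for given parameters and the maximized log-likelihood of Eq. 2, and it converts a log-likelihood ratio to a power of ten, because e^1269 is too large for a standard floating-point number.

```python
import math

def loglik_mixture(irts, p, w, b, delta):
    """Log-likelihood of Eq. 1 for the given parameters."""
    total = 0.0
    for tau in irts:
        t = tau - delta
        total += math.log(p * w * math.exp(-w * t)
                          + (1 - p) * b * math.exp(-b * t))
    return total

def loglik_single(irts, delta):
    """Maximized log-likelihood of Eq. 2; lambda = 1 / mean(IRT - delta)."""
    x = [tau - delta for tau in irts]
    lam = len(x) / sum(x)
    return len(x) * math.log(lam) - lam * sum(x)

def log10_likelihood_ratio(log_lr):
    """Convert a natural-log likelihood ratio to a base-10 exponent."""
    return log_lr / math.log(10)

exponent = log10_likelihood_ratio(1269)  # smallest of the three reported values
```

The exponent is roughly 551, so even in the 0.8-s condition the mixture is favored by a factor on the order of 10^551. Setting w = b in `loglik_mixture` reproduces `loglik_single` at the same rate, which is a quick check that Eq. 2 is nested within Eq. 1.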

Reanalyzing the log-survivor plot

In log-survivor plots of IRT distributions, the vast majority of IRTs are represented in a small space in the upper left-hand portion of the plot. This feature of log-survivor plots helps emphasize the “broken-stick” appearance that is characteristic of bi-exponentially distributed data (Shull et al., 2001), but exaggerates deviations from fit in IRTs corresponding to bout initiations, making it difficult to detect deviations from fit for within-bout IRTs. To identify the range of IRTs over which observation and model diverge, observed and model-predicted IRTs were divided into bins, each representing a consecutive two-percentile slice of the data; the mean IRT for each bin was calculated and plotted. This alternative method of comparing the observed and predicted IRTs allows the full range of IRTs to be more equally represented. The observed and predicted IRTs, organized in percentiles, are shown in Fig. 7 averaged across rats and for the two representative rats from Fig. 6. Figure 7 shows little deviation between data and model, suggesting that much of the apparent deviation in Fig. 6 is due to the “stretching out” of the longest IRTs.
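
The binning step described above can be sketched as follows, assuming that "two-percentile slices" means 50 consecutive, equally populated bins of the sorted IRTs. The function name is hypothetical, and the study's own binning code may handle bin edges differently.

```python
def percentile_bin_means(irts, n_bins=50):
    """Mean IRT in each of n_bins consecutive slices of the sorted data."""
    xs = sorted(irts)
    n = len(xs)
    means = []
    for k in range(n_bins):
        lo = k * n // n_bins
        hi = (k + 1) * n // n_bins
        if hi > lo:  # bins are empty when there are fewer IRTs than bins
            chunk = xs[lo:hi]
            means.append(sum(chunk) / len(chunk))
    return means

bin_means = percentile_bin_means(list(range(1, 101)))
```

Applying the same function to observed and model-predicted IRTs gives two series of 50 points that can be compared bin by bin, with every bin carrying the same share of the data.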

Fig. 7

Percentile plots of super-threshold IRTs demonstrating model fit in Cycle 1. Each point is the mean IRT for a two-percentile bin of the data. Heavy dots correspond to empirically observed IRTs; dotted curves correspond to IRTs predicted by model simulation. Each row of plots corresponds to a different response-duration threshold. The left column is the group means, the middle column and right columns are representative rats

Discussion

Longer responses yield fewer but longer response bouts

Increasing the operant lever-press duration requirement on a VI 40-s schedule of reinforcement lengthened the duration of lever presses (Fig. 4a) and, in consequence, also lengthened the shortest IRT (measured as the interval between the beginnings of consecutive responses that met the duration requirement; Fig. 5d). These effects simply show that rats were responsive to reinforcement contingencies. The question that motivated this study was whether response duration—the presumed expression of motoric manipulations in operant performance (Brackney et al., 2011; Posadas-Sanchez, 2004)—affects other parameters of the microstructure of operant behavior, namely bout-initiation rate (b), within-bout response rate (w), and mean bout length (L). Results show that all three parameters varied systematically with lever-press duration requirement, suggesting that longer responses are emitted at a lower rate in fewer but longer bouts (Fig. 5). Unlike changes in the shortest IRT (δ), changes in b, w, and L are not implicit in the manipulation of duration requirement, because these parameters were estimated from the distribution of IRTs after subtracting δ (and, thus, the duration requirement; see Fig. 1).

These duration-requirement effects are only partially consistent with findings from three other comparable studies (Brackney et al., 2011; Jiménez et al., 2017; Pinkston & Moore, 2020). Similar to long-duration requirements in the present study, increased motoric demands in previous studies yielded lower bout-initiation rates. However, Jiménez et al. (2017) observed shorter, not longer, response bouts with very high levers. It is important to note that lever-height effects were detected primarily at the maximum height tested in that study (235 mm), which was 20 mm above the mean maximum forepaw height of the Wistar rats that served as subjects. At lower heights (30–194 mm) no significant changes were observed in rate of reinforcement, response duration, b, w, or L, although certain effects on the distribution of bout length appear to emerge. Jiménez et al. (2017) speculated that there may be a critical lever height between 194 and 235 mm at which a motoric phase transition emerges, such that completion of the operant on higher levers does not simply take longer or involve more effort, but requires a qualitatively different response topography with broad impact on microstructural parameters. Below that critical height, the distributions of response durations and IRTs do not appear to vary significantly. Interestingly, the relative height of concurrently available levers affects the microstructure of the operant behavior they support and the choice between them, even if they are below critical height (Jiménez, Ochoa, Amazeen, Amazeen, & Cabrera, 2019).

Brackney et al. (2011) and Pinkston and Moore (2020) observed no significant effect of workload on bout length or within-bout response rates. The absence of significant effects of workload (lever height and force requirement) on w and L in Brackney et al. (2011) likely stems from the relatively small impact that their workload manipulation had on the refractory period. The higher workload condition in Brackney et al. (2011) did not increase the mean refractory period by more than 0.1 s in any experimental condition; the present study increased the mean refractory period by at least twice that difference. This may explain why, even though Brackney et al. (2011) report a mean increase in bout length with higher workload (see Footnote 3), that increase was not statistically significant.

Pinkston and Moore (2020) manipulated the peak force required from responses to be reinforced (5.6 and 32.0 g), but did not report response durations. It is possible that responses with peak forces within this range do not substantially vary in duration. In that case, their failure to observe force-requirement effects on bout length and within-bout response rate would be consistent with the notion that these parameters are more sensitive to the duration of the reinforced response than to the force required to complete it.

Mechanisms underlying duration-threshold effects

The most likely mechanism underlying the threshold-induced reduction in bout frequency appears to be somewhat intuitive. Brackney et al. (2011) suggested that reinforcer deprivation, availability, and response cost may drive what they labeled operant motivation, a general predisposition of the organism to engage the operandum, which is expressed in the bout-initiation rate. In the present study, higher duration thresholds may have raised the response cost, in the sense that longer responses entail a reduction in time that could be allocated to other activities.

Pinkston and Moore (2020) suggest an alternative explanation: The estimation of microstructural parameters based on criterial responses (those that are long or forceful enough to be reinforced) removes subcriterial responses from analysis, lengthening IRTs even when behavior has not changed. It is thus possible that, when the duration threshold was raised in the present study, such IRT lengthening contributed to reducing the number of bouts of criterial responses. Nonetheless, when subcriterial responses were considered in their analysis, Pinkston and Moore (2020) also failed to find an effect of force requirement on overall response rate. In contrast, the present study shows that raising the duration requirement reduces overall response rate, even when subcriterial responses are counted (see Fig. 2a). Moreover, Pinkston and Moore’s (2020) analysis is based on non-parametric statistical tests that may not be adequately powered to detect force-requirement effects on bout-initiation rate. A reanalysis of their data on VI 120-s performance, implementing a one-tailed t test on log-transformed estimates of b from overall responses, yielded a significant 25% reduction in (back-transformed) estimates of b with increased force requirement, t(7) = 2.15, p < .05. Pinkston and Moore’s (2020) data may not stand on their own as strong evidence for an effort-induced reduction in bout-initiation rate, but they are consistent with reports of such an effect here and in Jiménez et al. (2017).

Potential mechanisms underlying the threshold-induced lengthening of bouts are perhaps less intuitive. Why would rats persist longer in a bout of more effortful and costly responses? One possibility involves the response-reinforcer coupling mechanism that Killeen posited as a mathematical principle of reinforcement (Killeen, 1994; Killeen & Sitomer, 2003). This mechanism may be thought of as a limited memory buffer for recent responses; when a reinforcer is delivered, it strengthens all responses in the buffer according to their recency (for an alternative representation, see Catania, 2005). Longer responses presumably occupy a larger portion of the memory buffer and are thus more efficaciously reinforced; such efficacy may be reflected in longer bouts of those responses. This mechanism would also explain why bout length is proportional to the requirement of ratio schedules (Brackney & Sanabria, 2015; Daniels & Sanabria, 2017a; Reed, 2015; Shull et al., 2001), and why low—and presumably less effortful—ratio requirements yield lower response rates than intermediate ratio requirements (e.g., Sanabria, Acosta, Killeen, Neisewander, & Bizo, 2008): Ratio schedules support short IRTs (Peele, Casey, & Silberberg, 1984; Tanno, 2016), so responses fill up the memory buffer proportionally to the ratio requirement. Moreover, given the lower rate at which longer responses are emitted, more subthreshold responses may intervene between longer responses, potentially occupying the memory buffer, promoting reinforcement efficacy and longer bouts. That is, these subthreshold responses may effectively constitute a “self-imposed” variable-ratio requirement (Notterman & Mintz, 1965; Pinkston & Libman, 2017). To put it briefly, longer responses may be more memorable than shorter responses, so their reinforcement may yield longer bouts of responses.

It would be premature to speculate about the mechanism underlying the threshold-induced reduction of within-bout response rate, for two reasons. First, as the duration threshold increased, so did the proportion of subthreshold lever presses (Fig. 3), whose duration lengthened IRTs at higher thresholds. Such short lever presses are unlikely to contribute significantly to IRTs separating response bouts, but they are likely to substantially increase the IRTs separating responses within bouts. Second, Eq. 1 assumes that only the shortest IRT captures the time necessary to produce the reinforced response and to prepare the next response. Such an assumption underestimates the typical duration and variability of reinforced responses (Figs. 3 and 4). Again, such underestimation is unlikely to contribute significantly to IRTs separating response bouts, but it may have affected the IRTs separating responses within bouts. It appears that, to better understand the mechanisms governing within-bout response rate, it is critical to develop an account of subthreshold behavior and response duration. Such development will probably require the direct manipulation of motoric requirements, which involves substantial challenges for the measurement, analysis, and interpretation of behavioral changes. A particularly important challenge is that those requirements define the operant response (Pinkston & Foss, 2018; Pinkston & Libman, 2017); manipulating motoric requirements entails changing what is considered an IRT, which is the basis of the microstructural model implemented here.

Responses have variable durations

In regard to the provenance of subthreshold behavior, training order cannot be discounted as a potential contributor. Although behavior adjusted to changes in duration threshold, shorter responses reinforced in earlier experimental conditions with lower duration thresholds (see Table 1) may have persisted into later conditions within the same cycle. Subthreshold responses may have been induced through the same memory-buffer mechanism responsible for threshold-induced bout lengthening: The reinforcement of super-threshold responses may have strengthened the subthreshold responses that preceded them. Gharib and colleagues’ findings suggest yet another source of variability in response durations (Gharib, Derby, & Roberts, 2001; Gharib, Gade, & Roberts, 2004; Roberts & Gharib, 2006). Roberts and Gharib (2006) hypothesized that decreasing the probability of reinforcement increases the variability in response durations. The rate of earned reinforcers (Fig. 2c) decreased significantly (if by a small margin) as the response threshold increased, which may have increased the variability of response durations.

Regardless of its cause, the variability in response durations is inconsistent with the assumption of Eq. 1 that, under constant conditions, the refractory period is constant. Instead, it appears that response duration is a mixture-distributed random variable with at least two components: a shifted log-normal distribution of super-threshold latencies, and an unknown but flatter distribution of durations that is insensitive to threshold requirement. Future research may determine whether the parameters of this mixture distribution are sensitive to motivational and schedule manipulations, as suggested previously (Faustman & Fowler, 1981; Roberts & Gharib, 2006).
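
One way to write such a mixture is sketched below. It is offered as an illustration, not as the model fitted in this study. Let d be response duration, q the proportion of threshold-insensitive durations, g their unspecified density, and δ, μ, and σ the shift, log-scale location, and log-scale spread of the super-threshold component. All of these symbols are introduced here for illustration, and whether δ coincides with the duration threshold is left open.

```latex
% Illustrative notation only; q, g, \delta, \mu, and \sigma are hypothetical symbols.
f(d) = q\, g(d)
     + (1 - q)\,
       \frac{1}{(d - \delta)\,\sigma\sqrt{2\pi}}
       \exp\!\left[-\frac{\left(\ln(d - \delta) - \mu\right)^{2}}{2\sigma^{2}}\right]
       \mathbf{1}(d > \delta)
```

In this notation, asking whether the mixture is sensitive to motivational and schedule manipulations amounts to asking which of q, δ, μ, and σ change across conditions.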

Conclusion

Reinforced responses are organized in bouts. The selective reinforcement of longer response durations not only increases the relative frequency of these durations, it also yields fewer but longer response bouts. Bout-length effects may reflect a duration-induced facilitation of the response–outcome association. It is possible that, within certain limits, effortful responses lengthen response bouts through a similar mechanism: If those responses take longer to complete, they may be more likely to be associated with reinforcement.

The variability in response durations indicates that, contrary to prior assumptions, the refractory period between IRTs is not constant. Previous studies have shown that response durations can provide valuable information about the behavioral processes involved in operant performance (e.g., Roberts & Gharib, 2006). Future research may explore how the variability of the refractory period is integrated into more comprehensive models of the microstructure of reinforced behavior.