Introduction

It has been known for many years that operant behaviour maintained by food reinforcement can be suppressed by acute treatment with dopamine receptor antagonists (Wise et al. 1978a, b; Beninger et al. 1987). However, despite more than 30 years of research, the processes that underlie the effects of dopamine receptor antagonists on schedule-controlled operant behaviour remain controversial. According to the well-known anhedonia hypothesis (Wise et al. 1978a; Wise 1982, 1985, 2008), these drugs reduce the value of positive reinforcers, thereby diminishing their ability to support voluntary behaviour. However, it has been argued that much of the evidence that has been adduced in support of this hypothesis is open to alternative or additional explanations in terms of motor debilitation or effort-related response cost (Salamone 1988; Salamone et al. 1991, 2002; Randall et al. 2012).

It seems unlikely that the effects of dopamine receptor antagonists on operant behaviour are attributable to a single process, given the wide range of behavioural functions in which central dopaminergic mechanisms have been implicated. These include food ingestion, behavioural arousal, endocrine functions and extrapyramidal motor control (Missale et al. 1998; Beaulieu and Gainetdinov 2011), as well as numerous ‘cognitive’ functions such as attention, impulse control, decision making and the temporal regulation of behaviour (Meck 1996; Floresco 2009; Robbins 2009; Salamone 2009; Jones and Jahanshahi 2011; Rogers 2011). It is likely that even in relatively simple behavioural tasks such as the classical reinforcement schedules, dopaminergic mechanisms are engaged in more than one process, and therefore dopamine receptor blockade may be expected to exert complex effects on schedule-controlled performance.

One approach to dissecting the multiple processes that may be affected by dopamine receptor antagonists entails quantitative analysis based on theoretical models of schedule-controlled behaviour (Reilly 2003; Sanabria et al. 2008). In this paper, we used a model (Bradshaw and Killeen 2012) derived from the Mathematical Principles of Reinforcement (MPR: Killeen 1994) to analyse the effects of dopamine receptor antagonists on performance on a progressive ratio (PR) schedule. The theoretical basis of this model is outlined below.

In ratio schedules of reinforcement, the subject is required to emit a specified number of responses, N, to obtain a reinforcer. In fixed ratio (FR) schedules, N is held constant (Ferster and Skinner 1957), whereas in PR schedules, it is systematically increased, usually from one reinforcer to the next (Hodos 1961; Stafford and Branch 1998), but sometimes after batches of two or more reinforcers (Baunez et al. 2002; Salamone et al. 2002) or between successive sessions (Griffiths et al. 1978; Czachowski and Samson 1999). Responding on PR schedules is usually rapid under low ratios, but declines towards zero as N is increased. The ratio at which the subject stops responding is known as the breakpoint (Hodos 1961; Hodos and Kalman 1963).

The breakpoint has been widely used as a measure of the subject’s motivation or the incentive value of the reinforcer (see Ping-Teng et al. 1996; Killeen et al. 2009). However, despite its compelling face validity, the breakpoint has several shortcomings as a measure of incentive value. Its specificity is called into question by its sensitivity to non-motivational manipulations such as changes in the response requirement (Skjoldager et al. 1993; Aberman et al. 1998) and the ratio step size (Covarrubias and Aparicio 2008); it shows considerable variability, being derived from a single time point, data from the rest of the session being ignored (Arnold and Roberts 1997; Killeen et al. 2009); and its definition is arbitrary, there being no consensus as to the time that must elapse without a response before responding may be said to have stopped (Arnold and Roberts 1997; Killeen et al. 2009).

Quantitative analyses that take into account the response rate in each component ratio of the schedule avoid some of these pitfalls. Models based on MPR provide a theoretical basis for such analyses. According to MPR, schedule-controlled responding is determined by an excitatory effect of reinforcers on behaviour, biological constraints on responding and the efficiency with which schedules couple responses to reinforcers. In FR schedules, response rate, R, is predicted by

$$ R=\frac{1-{\left(1-\beta \right)}^N}{\delta }-\frac{N}{a}\kern1em ;\kern1em \delta, a>0;0<\beta <1 $$
(1)

where β (‘currency’) represents the extent to which the strengthening effect of the reinforcer is focused on the most recent response, δ (‘response time’) is the time taken to execute a response and a (‘specific activation’) is the duration of behavioural activation induced by a reinforcer (Killeen 1994). Equation 1 describes an ‘inverted-U’ function (Fig. 1, left-hand graph). β influences the locus of the peak, δ defines its height and a specifies the slope of the descending limb. a has been proposed as an index of reinforcer value (Killeen and Sitomer 2003; Reilly 2003; Sanabria et al. 2008); consistent with this proposal, a has been shown to be sensitive to manipulation of reinforcer size and quality (Bizo and Killeen 1997; Rickard et al. 2009).
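As a minimal sketch of Eq. 1 (not code from the original study), the following Python function evaluates the predicted FR response rate for arbitrary parameter values. The function name and the numerical values are illustrative assumptions. With β = 1 the expression reduces to the straight line R = 1/δ − N/a, which falls to zero at N = a/δ.

```python
def fr_response_rate(n, beta, delta, a):
    """Eq. 1: predicted response rate (responses/s) on a fixed ratio of n responses.

    beta  : 'currency' (0 < beta <= 1; beta = 1 is the limiting linear case)
    delta : response time in seconds (> 0)
    a     : specific activation in seconds (> 0)
    """
    if not (delta > 0 and a > 0 and 0 < beta <= 1):
        raise ValueError("require delta > 0, a > 0 and 0 < beta <= 1")
    coupling = 1.0 - (1.0 - beta) ** n   # rising limb of the inverted U
    return coupling / delta - n / a      # linear decline with slope -1/a


# Illustrative (hypothetical) parameter values, not estimates from the paper.
delta, a = 0.5, 100.0
print(fr_response_rate(10, 1.0, delta, a))         # point on the straight line 1/delta - N/a
print(fr_response_rate(a / delta, 1.0, delta, a))  # rate at the predicted breakpoint N = a/delta
```

Smaller values of β delay the rise of the coupling term, which moves the peak of the function to higher ratios without changing the slope of the descending limb.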

Fig. 1

Theoretical response rate functions; ordinates, response rate, R; abscissae, response/reinforcer ratio, N. Left graph: the fixed ratio (FR) model (Eq. 1). Note the linear decline of response rate from its peak towards zero; 1/δ is the (extrapolated) ordinate intercept, −1/a defines the slope and the breakpoint is predicted by a/δ. The locus of the peak is defined by β; when β = 1, the function resolves to a straight line extending from 1/δ to a/δ. Right graph: the progressive ratio (PR) model (Eq. 3: running response rate, R RUN; Eq. 4: overall response rate, R OVERALL). Note that in contrast to the FR model, the PR model defines different curves for R RUN and R OVERALL and that response rate declines in a curvilinear fashion towards zero. An increase in the minimum post-reinforcement pause, T 0, reduces R OVERALL, the effect being mainly confined to lower values of N. An increase in the slope of the linear waiting function, k, results in an increase of the proportion of the inter-reinforcer interval devoted to post-reinforcement pausing; the reduction of R OVERALL occurs at all values of N. A reduction of specific activation, a, is reflected in steepened decline of both response rate functions. An increase in response time, δ, produces a parallel downward displacement of both curves (see Bradshaw and Killeen 2012 for further explanation)

Although Eq. 1 was developed to describe performance on FR schedules, it also provides a good description of overall response rate on PR schedules (Covarrubias and Aparicio 2008; Killeen et al. 2009; Rickard et al. 2009). Its application to PR schedule performance has been used to identify the effects of brain lesions and centrally acting drugs on motivational and motor-related processes (for review, see Bradshaw and Killeen 2012).

Unfortunately, recent studies have revealed significant problems with the application of Eq. 1 to PR schedule performance. In particular, it has transpired that although the equation provides an adequate description of overall response rate, its fit to running response rate (response rate calculated after exclusion of the post-reinforcement pause) is poor (Rickard et al. 2009; Olarte-Sánchez et al. 2012a, b). To address this problem, Bradshaw and Killeen (2012) developed a new model based on MPR which provides a coherent account of overall and running response rate on PR schedules. The model takes into account the sequential nature of the schedule, in contrast to Eq. 1 which treats successive ratios as though they were independent of one another. The model invokes the linear waiting principle (Wynne et al. 1996) to predict the escalating post-reinforcement pause in successive ratios and thereby provides a dynamic account of performance on PR schedules. The linear waiting principle expresses the empirical finding that the post-reinforcement pause on trial i, T P,i , is linearly related to the total inter-reinforcement interval on trial i-1, T TOT,i-1:

$$ {T}_{\mathrm{P},i}={T}_0+k\ {T}_{\mathrm{TOT},i-1}, $$
(2)

where T 0 and k are parameters that define the minimum post-reinforcement pause and the slope of the linear waiting function. The new model contains two key equations that define running response rate, R RUN, and overall response rate, R OVERALL:

$$ {R}_{\mathrm{RUN},i}=\frac{1}{\delta \left(1+{T}_{\mathrm{TOT},i-1}/a\right)} $$
(3)
$$ {R}_{\mathrm{OVERALL},i}={N}_i/{T}_{\mathrm{TOT},i}, $$
(4)

where a and δ have the same meanings as in Eq. 1. Figure 1 (right-hand graph) shows the curves defined by Eqs. 3 and 4 (see Bradshaw and Killeen 2012).
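The way Eqs. 2 to 4 feed into one another across successive ratios can be sketched in Python as follows. This is an illustration rather than the authors' implementation. It rests on two assumptions that are not stated in the text above: that the total inter-reinforcement interval is the sum of the post-reinforcement pause and the run time (N divided by the running rate), and that the interval preceding the first ratio is zero. The function name and parameter values are hypothetical.

```python
def simulate_pr(ratios, t0, k, a, delta, t_tot_start=0.0):
    """Iterate Eqs. 2-4 over successive ratios of a PR schedule.

    Returns a list of (N, R_RUN, R_OVERALL) tuples, one per ratio.
    """
    results = []
    t_tot_prev = t_tot_start  # ASSUMPTION: interval preceding the first ratio
    for n in ratios:
        pause = t0 + k * t_tot_prev                       # Eq. 2 (linear waiting)
        r_run = 1.0 / (delta * (1.0 + t_tot_prev / a))    # Eq. 3
        run_time = n / r_run                              # ASSUMPTION: run time = N / R_RUN
        t_tot = pause + run_time                          # ASSUMPTION: T_TOT = pause + run time
        r_overall = n / t_tot                             # Eq. 4
        results.append((n, r_run, r_overall))
        t_tot_prev = t_tot                                # carried forward to the next ratio
    return results


# Hypothetical parameter values chosen only to show the shape of the curves.
for n, r_run, r_overall in simulate_pr([1, 2, 4, 6, 9, 12], t0=2.0, k=0.1, a=100.0, delta=0.5):
    print(n, round(r_run, 3), round(r_overall, 3))
```

Because each ratio inherits the interval produced by the one before it, the sketch reproduces the sequential dependence that Eq. 1 ignores: running rate falls monotonically, whereas overall rate first rises and later declines.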

A re-analysis of the data from several previous studies of the effects of neuropharmacological interventions on PR schedule performance (Bradshaw and Killeen 2012) showed that the new PR model and the original FR model (Eq. 1) yielded concordant effects on a. However, the new model’s superiority over the old one was revealed not only by its ability to accommodate both running and overall response rates, but also by its more subtle treatment of pausing. Incorporation of the linear waiting principle into the model allows a clear distinction to be drawn between post-reinforcement pausing and inter-response pausing, which were shown to be differentially affected by schedule manipulations and pharmacological interventions.

The present experiment used the new model to examine the effects of dopamine receptor antagonists on PR schedule performance maintained by sucrose and corn oil reinforcers. The great majority of previous studies of PR schedule performance have used sucrose or palatable food pellets as the reinforcer. Very few studies have used fatty foodstuffs, and none of these has employed quantitative analysis of performance based on MPR (Yoneda et al. 2007b; Naleid et al. 2008; Liang et al. 2012). The reinforcing value of fatty foodstuffs such as corn oil is of particular interest because of the role of excessive fat ingestion in the aetiology of obesity in humans (see West and York 1998).

There is a growing body of evidence that operant performance maintained by different foodstuffs may be differentially sensitive to the suppressant effects of antagonists of D1-like and D2-like dopamine receptors. Thus, responding for highly palatable food pellets and sucrose reinforcers can be suppressed by antagonists of both receptor classes (Nowend et al. 2008; Salamone et al. 2002; Randall et al. 2012), whereas responding for fatty reinforcers appears to be much less sensitive to D1-like than to D2-like receptor blockade (Yoneda et al. 2007b). Therefore, in the present experiment, rats’ performance on PR schedules was maintained with either sucrose or corn oil reinforcement, and in each case, the sensitivity of the parameters of the model to acute treatment with a D1-like receptor antagonist, 8-bromo-2,3,4,5-tetrahydro-3-methyl-5-phenyl-1H-3-benzazepin-7-ol hydrobromide (SKF-83566), and a D2-like receptor antagonist, haloperidol, was examined. Since in the present model the parameter a is regarded as an index of reinforcer value, it was predicted that haloperidol would reduce the value of a in the case of both reinforcers, whereas SKF-83566 would reduce this parameter only in the case of sucrose.

Methods

The experiment was carried out in accordance with UK Home Office regulations governing experiments on living animals.

Subjects

Twenty-four female Wistar rats (Charles River, UK) approximately 4 months old and weighing 250–300 g at the start of the experiment were used. They were housed individually under a constant cycle of 12 h light and 12 h darkness (lights on from 0600 to 1800 hours) and were maintained at 80 % of their initial free-feeding body weights throughout the experiment by providing a limited amount of standard rodent diet after each experimental session. Tap water was freely available in the home cages, and environmental enrichment (cardboard tunnels and wooden chew blocks) was provided, as prescribed by the local ethics committee. After the completion of the experiment, the rats were returned to ad libitum feeding for 3 weeks and their body weights were redetermined.

Apparatus

The rats were trained in operant conditioning chambers (CeNeS Ltd, Cambridge, UK) of internal dimensions 25 × 25 × 22 cm. One wall of the chamber contained a central recess covered by a hinged Perspex flap, into which a peristaltic pump delivered the liquid reinforcer (see below). An aperture located 5 cm above and 2.5 cm to one side of the recess (left for half the subjects; right for the other half) allowed insertion of a motorised retractable lever (CeNeS Ltd, Cambridge, UK) into the chamber. The lever could be depressed by a force of approximately 0.2 N. The chamber was enclosed in a sound-attenuating chest with additional masking noise generated by a rotary fan. No houselight was present during the sessions. An Acorn microcomputer programmed in Arachnid BASIC (CeNeS Ltd, Cambridge, UK) located in an adjacent room controlled the schedule and recorded the behavioural data.

Behavioural training

Two weeks before starting the experiment, the food deprivation regimen was introduced and the rats were gradually reduced to 80 % of their free-feeding body weights. They were randomly allocated to two groups that underwent training with different reinforcers: (1) 50 μl of a 0.6-M solution of sucrose in distilled water (n = 12) and (2) 25 μl of undiluted corn oil (n = 12). (The calorific contents of the two reinforcers were not equated in this experiment; see “Discussion” for further comment.) The rats were first trained to press the lever for the liquid reinforcer and were then exposed to an FR 1 schedule for 3 days followed by FR 5 for a further 3 days. Thereafter, they underwent daily training sessions under the PR schedule. The PR schedule was based on the exponential progression: 1, 2, 4, 6, 9, 12, 15, 20, 25, 32, 40, …, derived from the formula (5 × e^(0.2n)) − 5, rounded to the nearest integer, where n is the position in the ratio sequence (Roberts and Richardson 1992). Sessions took place at the same time each day during the light phase of the daily cycle (between 0800 and 1300 hours) 7 days a week. At the start of each session, the lever was inserted into the chamber; the session was terminated by withdrawal of the lever 40 min later.
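The ratio progression can be checked with a few lines of Python. This is an illustrative sketch, and the function name is hypothetical; it evaluates 5 × e^(0.2n) − 5 and rounds to the nearest integer, which reproduces the first eleven ratios listed above.

```python
import math


def pr_ratio(n):
    """Response requirement at position n (1-based) of the exponential progression."""
    value = 5.0 * math.exp(0.2 * n) - 5.0
    return int(math.floor(value + 0.5))  # round to the nearest integer


print([pr_ratio(n) for n in range(1, 12)])
# [1, 2, 4, 6, 9, 12, 15, 20, 25, 32, 40]
```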

Drug treatment

The drug treatment regimen started after 120 sessions of preliminary training under the progressive ratio schedule. Injections of drugs were given on Tuesdays and Fridays, and injections of the vehicle alone on Mondays and Thursdays; no injections were given on Wednesdays, Saturdays or Sundays. Each rat was tested five times with each dose of each drug, the order of treatments being counterbalanced across animals according to a Latin square design. Drugs were injected intraperitoneally (2.5 ml kg⁻¹; 25-gauge needle) 30 min before the start of the experimental session. Doses were calculated from the weights of the salts. Haloperidol (0.05 and 0.1 mg kg⁻¹) was dissolved in 0.1 M tartaric acid, buffered to pH 5.5 and diluted with sterile 0.9 % sodium chloride to give the desired concentration. SKF-83566 (0.015 and 0.03 mg kg⁻¹) was dissolved in 0.9 % sodium chloride solution. Haloperidol was obtained from Sigma Chemical Company, Poole, UK; SKF-83566 was obtained from Tocris Bioscience, Bristol, UK. The doses of haloperidol were chosen on the basis of previous findings of the effects of this drug on PR schedule performance (Zhang et al. 2005; Olarte-Sánchez et al. 2012a; den Boon et al. 2012). SKF-83566 has not been tested previously in this paradigm; the doses were chosen on the basis of recent findings of the effects of this drug on operant behaviour in free-operant timing schedules (Cheung et al. 2006, 2007).

Data analysis

Overall response rate (R OVERALL) was calculated for each ratio by dividing the number of responses by the total time taken to complete the ratio, including the post-reinforcement pause, measured from the end of the preceding reinforcer delivery until the emission of the last response of the ratio (Bizo and Killeen 1997). The first ratio (a single response) and any ratios that had not been completed at the end of the session were excluded from the analysis. Running rate (R RUN) was calculated by dividing the number of responses by the ‘run-time’ (i.e. the time taken to complete the ratio, excluding the post-reinforcement pause: Bizo et al. 2001). Post-reinforcement pause duration was measured from the end of the reinforcer delivery until the emission of the first response of the following ratio. The breakpoint was defined as the last ratio to be completed before 5 min elapsed without any responding, or, in cases where this criterion was not met within the session, the highest completed ratio (Olarte-Sánchez et al. 2012a, b).
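The rate definitions above can be illustrated with a short Python sketch for a single completed ratio. The function name, argument layout and example timestamps are hypothetical and are not taken from the original analysis software.

```python
def ratio_rates(prev_reinforcer_end, response_times):
    """Return (pause, overall_rate, running_rate) for one completed ratio.

    prev_reinforcer_end : time (s) at which the preceding reinforcer delivery ended
    response_times      : ascending times (s) of every response in the ratio
    """
    n = len(response_times)
    pause = response_times[0] - prev_reinforcer_end        # post-reinforcement pause
    total_time = response_times[-1] - prev_reinforcer_end  # includes the pause
    run_time = total_time - pause                          # excludes the pause
    overall = n / total_time
    # A single-response ratio has no run time, so its running rate is undefined.
    running = n / run_time if run_time > 0 else float("nan")
    return pause, overall, running


# Made-up timestamps: reinforcer ends at 10 s, four responses follow.
print(ratio_rates(10.0, [14.0, 14.5, 15.0, 16.0]))
```

In this example the pause is 4 s, the total time is 6 s and the run time is 2 s, so the running rate (2 responses per second) is three times the overall rate.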

The PR model comprising Eqs. 3 and 4 was fitted to the running and overall response rate data obtained from individual rats, and estimates of the four parameters, T 0, k, a and δ, were derived using the ‘Solver’ facility of Excel (Microsoft Corporation); goodness of the combined fit of Eqs. 3 and 4 to the overall and running response rate data was expressed as r 2 (see Bradshaw and Killeen 2012).

Comparison of the sucrose and corn oil reinforcers

For each rat, the data obtained from 30 sessions in which no treatment was administered were used to derive estimates of the four parameters. The 30 no-treatment sessions were interspersed among the vehicle and drug treatment sessions (on Wednesdays, Saturdays and Sundays) throughout the treatment phase of the experiment (see above, “Drug treatment”). As the variances of T 0, a, δ and the breakpoint differed significantly between the groups, comparisons between the two groups were carried out using the Mann–Whitney U test.

Assessment of the effects of haloperidol and SKF-83566

The effects of the two drugs were analysed separately in each group. For each rat, the model was fitted to the data obtained from the sessions in which injections of the drug or its corresponding vehicle were administered and estimates of the four parameters were derived. These estimates, and the breakpoint, were analysed by separate one-factor analyses of variance with treatment condition (vehicle, lower dose, higher dose) as a within-subject factor, followed, in the case of a significant effect of treatment, by comparison of each dose of the drug with the vehicle-alone treatment using Student’s t test with Šidák’s correction for multiple comparisons. The effect sizes revealed by the analyses of variance were expressed as partial η 2 (η 2 p). A significance criterion of p < 0.05 was adopted in all statistical analyses (two-tailed comparisons in the case of the post hoc tests).
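Two of the quantities named above follow from standard identities, sketched here in Python. The function names are hypothetical, and the statistical software used in the study is not stated, so this is a cross-check rather than a reproduction of the original computation. The Šidák-adjusted criterion for m comparisons is 1 − (1 − α)^(1/m), and partial η 2 can be recovered from an F ratio and its degrees of freedom.

```python
def sidak_alpha(alpha, m):
    """Per-comparison significance criterion under Sidak's correction."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F ratio: F*df1 / (F*df1 + df2)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Two post hoc comparisons (each dose versus vehicle) at an overall alpha of 0.05.
print(round(sidak_alpha(0.05, 2), 4))
# Effect size implied by F(2, 22) = 10.3.
print(round(partial_eta_squared(10.3, 2, 22), 2))
```

The second call returns 0.48, which agrees with the effect size reported in the Results for the effect of haloperidol on a in the sucrose-reinforced group.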

Results

Comparison of the sucrose and corn oil reinforcers

Figure 2 shows the mean response rate data from the two groups in the last 30 no-treatment sessions (see “Data analysis”). In both groups, running response rate declined monotonically towards zero, whereas overall response rate rose to a peak before declining towards zero. The peak of the response rate function was lower, and the slope of the declining phase shallower in the corn oil-reinforced group than in the sucrose-reinforced group. The PR model provided a good description of the group mean overall and running response rate data obtained from both groups (sucrose-reinforced group: r 2 = 0.982; corn oil-reinforced group: r 2 = 0.967).

Fig. 2

Comparison of performance on the PR schedule maintained by a sucrose reinforcer (0.6 M, 50 μl) and a corn oil reinforcer (100 %, 25 μl). Ordinate, response rate; abscissa, response/reinforcer ratio, N. Points are group mean data (n = 12 in each group): unfilled symbols indicate running response rate; filled symbols overall response rate. The curves are best-fit functions defined by Eqs. 3 and 4

The PR model was also fitted to the data obtained from the individual rats in each group in blocks of 10 no-treatment sessions taken at 25-session intervals throughout training and during the drug treatment phase. The group mean values of the four parameters of the model are shown in Fig. 3. T 0 was consistently longer in the corn oil-reinforced group than in the sucrose-reinforced group. Analysis of variance revealed significant main effects of group [F(1, 22) = 9.1, p < 0.01] and block [F(7, 154) = 4.7, p < 0.001] and a significant group × block interaction [F(7, 154) = 5.3, p < 0.001]. k showed no consistent difference between the two groups; there was a significant main effect of block [F(7, 154) = 6.7, p < 0.001], but no significant effect of group and no significant interaction [Fs < 1]. a was consistently higher in the corn oil-reinforced group than in the sucrose-reinforced group; there were significant main effects of group [F(1, 22) = 6.9, p < 0.05] and block [F(7, 154) = 16.2, p < 0.001] and a significant group × block interaction [F(7, 154) = 4.1, p < 0.001]. δ showed no significant difference between the two groups: there was no significant main effect of group [F(1, 22) = 2.2, N.S.] or block [F < 1] and no significant interaction [F < 1]. All four parameters remained stable during the last 100 sessions of the experiment: in neither group did any of the parameters show a significant effect of block across the last four blocks [T 0, k and a: Fs < 1 in both corn oil- and sucrose-reinforced groups; δ: corn oil-reinforced group, F(3, 33) = 1.5, N.S.; sucrose-reinforced group, F < 1].

Fig. 3

Group mean values of the four parameters of the PR model (T 0, k, a and δ) derived from the individual rats in the corn oil-reinforced group (open symbols) and the sucrose-reinforced group (filled symbols) in 10-session blocks of no-treatment sessions taken at 25-session intervals from the start of training until the completion of the experiment. The horizontal bars show the period in which acute drug treatments were administered. The vertical bars indicate 2 standard errors of the differences between the groups derived from the interaction terms of the analyses of variance (see text for details)

Table 1 shows the mean (± SEM) estimates of the parameters derived from the individual rats in the two groups. The values of T 0 and a were significantly greater in the corn oil reinforcement group than in the sucrose reinforcement group [Mann–Whitney U test: T 0: p = 0.002; a: p = 0.02]. The values of k and δ did not differ significantly between the two groups [Mann–Whitney U test: p > 0.05]. The model accounted for more than 85 % of the within-subject data variance. The breakpoint did not differ significantly between the groups [Mann–Whitney U test: p > 0.05].

Table 1 PR schedule performance maintained by the sucrose and corn oil reinforcers: parameters of the model and the breakpoint (group mean values ± SEM)

Figure 4 shows the relation between the post-reinforcement pause and the preceding inter-reinforcement interval during the no-treatment sessions (‘linear waiting’, Eq. 2). In both groups, the relation was well described by a linear function (sucrose reinforcer: r 2 = 0.945; corn oil reinforcer: r 2 = 0.981).
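A linear function of the kind described by Eq. 2 can be fitted by ordinary least squares, as in the Python sketch below. This only illustrates the form of the analysis: the function name is hypothetical, the data are synthetic, and the parameter estimates reported in this paper were obtained by fitting Eqs. 3 and 4, not by this regression.

```python
def fit_linear_waiting(prev_intervals, pauses):
    """Least-squares fit of pause = T0 + k * preceding interval.

    Returns (t0, k, r_squared).
    """
    n = len(prev_intervals)
    mean_x = sum(prev_intervals) / n
    mean_y = sum(pauses) / n
    sxx = sum((x - mean_x) ** 2 for x in prev_intervals)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(prev_intervals, pauses))
    k = sxy / sxx                 # slope of the linear waiting function
    t0 = mean_y - k * mean_x      # intercept: minimum post-reinforcement pause
    ss_res = sum((y - (t0 + k * x)) ** 2 for x, y in zip(prev_intervals, pauses))
    ss_tot = sum((y - mean_y) ** 2 for y in pauses)
    return t0, k, 1.0 - ss_res / ss_tot


# Synthetic data generated from T0 = 3 s and k = 0.1 (not data from the experiment).
intervals = [10.0, 20.0, 40.0, 80.0]
pauses = [3.0 + 0.1 * x for x in intervals]
print(fit_linear_waiting(intervals, pauses))
```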

Fig. 4

Relationship between the duration of the post-reinforcement pause (ordinate) and the preceding inter-reinforcement interval (abscissa) derived from the sucrose-reinforced (left-hand graph) and corn oil-reinforced (right-hand graph) groups in the no-treatment sessions. Points are group mean data; the continuous lines are best-fit linear functions (Eq. 2), and the broken lines indicate the 99 % confidence limits (see text for further explanation)

Effect of haloperidol

The group mean response rate data from the sucrose-reinforced group are shown in Fig. 5 and the parameter values in Table 2. Treatment with haloperidol was associated with a steepening of the descending phase of the response rate curves. This is reflected in the parameter values. Analysis of variance showed a significant effect of treatment on the value of a [F(2, 22) = 10.3, p < 0.001; η 2 p = 0.48]; the linear contrast effect was statistically significant [F(1, 11) = 14.5, p < 0.01], and multiple comparisons (t test with Šidák’s correction) showed that both doses of haloperidol significantly reduced the value of this parameter. The lower dose was associated with a reduction of the value of a in 10 of the 12 rats and the higher dose with a reduction in all 12 rats, compared to the values seen in the vehicle-alone condition. There was no significant effect of treatment on T 0 [F < 1; η 2 p = 0.01], k [F < 1; η 2 p = 0.02] or δ [F(2, 22) = 1.0; η 2 p = 0.08]. There was a significant effect of treatment on the breakpoint [F(2, 22) = 5.8, p < 0.05; η 2 p = 0.35]; the linear contrast effect was statistically significant [F(1, 11) = 9.2, p < 0.05], and multiple comparisons showed that the higher dose of haloperidol significantly reduced the breakpoint.

Fig. 5

Effect of haloperidol (HAL) on performance on the PR schedule maintained by the sucrose reinforcer. Conventions are as in Fig. 2

Table 2 PR schedule performance maintained by the sucrose reinforcer: effects of haloperidol on the parameters of the model and the breakpoint (group mean values ± SEM)

The group mean data from the corn oil-reinforced group are shown in Fig. 6 and the parameter values in Table 3. The profile of effect was qualitatively similar to that seen with the sucrose-reinforced group. There was a significant effect of treatment on the value of a [F(2, 22) = 10.7, p < 0.001; η 2 p = 0.49]; the linear contrast effect was statistically significant [F(1, 11) = 11.1, p < 0.01], and multiple comparisons showed that the higher dose of haloperidol significantly reduced the value of this parameter. The lower dose was associated with a reduction of the value of a in 8 of the 12 rats and the higher dose with a reduction in all 12 rats, compared to the values seen in the vehicle-alone condition. There was no significant effect of treatment on T 0 [F < 1; η 2 p = 0.06], k [F < 1; η 2 p = 0.02] or δ [F(2, 22) = 1.8, η 2 p = 0.14]. There was a significant effect of treatment on the breakpoint [F(2, 22) = 9.0, p < 0.001; η 2 p = 0.45]; the linear contrast effect was statistically significant [F(1, 11) = 10.3, p < 0.01], and multiple comparisons showed that the higher dose of haloperidol significantly reduced the breakpoint.

Fig. 6

Effect of haloperidol (HAL) on performance on the PR schedule maintained by the corn oil reinforcer. Conventions are as in Fig. 2

Table 3 PR schedule performance maintained by the corn oil reinforcer: effects of haloperidol on the parameters of the model and the breakpoint (group mean values ± SEM)

Effect of SKF-83566

The group mean response rate data from the sucrose-reinforced group are shown in Fig. 7 and the parameter values in Table 4. Analysis of variance revealed a significant effect of treatment on a [F(2, 22) = 5.2, p < 0.02; η 2 p = 0.32]; the linear contrast effect was statistically significant [F(1, 11) = 13.9, p < 0.01], and multiple comparisons showed that the higher dose of SKF-83566 significantly reduced the value of this parameter. The lower dose was associated with a reduction of the value of a in 9 of the 12 rats and the higher dose with a reduction in 11 rats, compared to the values seen in the vehicle-alone condition. There was also a significant effect of treatment on k [F(2, 22) = 4.2, p < 0.05; η 2 p = 0.27]; the linear contrast effect was statistically significant [F(1, 11) = 6.0, p < 0.05], the higher dose producing a significant reduction of the value of this parameter. The lower dose was associated with a reduction of the value of k in 9 of the 12 rats and the higher dose with a reduction in 10 rats, compared to the values seen in the vehicle-alone condition. There was a significant effect of treatment on δ [F(2, 22) = 3.5, p < 0.05; η 2 p = 0.24]; however, the linear contrast effect was not statistically significant [F(1, 11) = 2.2, N.S.], and multiple comparisons showed that neither dose produced a significant change from the value obtained under the vehicle-alone condition. There was no significant effect of treatment on the value of T 0 [F < 1; η 2 p = 0.01]. There was a significant effect of treatment on the breakpoint [F(2, 22) = 9.8, p < 0.001; η 2 p = 0.47]; the linear contrast effect was statistically significant [F(1, 11) = 12.0, p < 0.01], and multiple comparisons showed that the higher dose of SKF-83566 significantly reduced the breakpoint.

Fig. 7

Effect of SKF-83566 (SKF) on performance on the PR schedule maintained by the sucrose reinforcer. Conventions are as in Fig. 2

Table 4 PR schedule performance maintained by the sucrose reinforcer: effects of SKF-83566 on the parameters of the model and the breakpoint (group mean values ± SEM)

The group mean data from the corn oil-reinforced group are shown in Fig. 8 and the parameter values in Table 5. There was no significant effect of treatment on the value of T 0 [F(2, 22) = 2.7, N.S.; η 2 p = 0.20], k [F(2, 22) = 2.3, N.S.; η 2 p = 0.17], a [F(2, 22) = 2.0, N.S.; η 2 p = 0.15] or δ [F(2, 22) = 2.1; η 2 p = 0.16]. There was no significant effect of treatment on the breakpoint [F(2, 22) = 1.4, N.S.; η 2 p = 0.11].

Fig. 8

Effect of SKF-83566 (SKF) on performance on the PR schedule maintained by the corn oil reinforcer. Conventions are as in Fig. 2

Table 5 PR schedule performance maintained by the corn oil reinforcer: effects of SKF-83566 on the parameters of the model and the breakpoint (group mean values ± SEM)

Body weight

Before the start of the experiment, the mean (± SEM) body weight of the rats was 279 ± 12 g. Three weeks after return to free feeding at the end of the experiment, their weights rose to 297 ± 4 g, an increase of 6.5 ± 1.0 %.

Discussion

Responding on the PR schedule maintained by both sucrose and corn oil reinforcement was well described by the new model of PR schedule performance (Bradshaw and Killeen 2012) derived from Killeen’s (1994) general theory of schedule-controlled operant behaviour, MPR. In addition, post-reinforcement pausing showed an acceptable degree of conformity to the linear waiting principle (Eq. 2: Wynne et al. 1996). Visual inspection of the response rate data suggests that the model tends to underestimate slightly the response rates seen in the higher ratios of the schedule (for example, see Fig. 5). Further work will be needed to establish whether this is a consistent anomaly which necessitates modification of the model.

Comparison of the parameters derived from the two groups showed that a and T 0 differed between the two reinforcers. The higher value of a in the group trained with the corn oil reinforcer, compared with the group trained with the sucrose reinforcer, reflects the flatter descending limbs of the response rate functions in the former group. This result indicates that the incentive value of 25 μl of 100 % corn oil was greater than that of 50 μl of a 0.6-M sucrose solution. This does not necessarily imply that corn oil is intrinsically a more efficacious reinforcer than sucrose because the two reinforcers used in this experiment were not matched for concentration or calorific value. However, comparison of the present data with earlier findings may shed some light on this issue. Figure 9 shows the values of a for volumes of 0.6 M sucrose ranging from 6 to 300 μl obtained from a re-analysis of the data of Rickard et al. (2009) using the new PR model (Bradshaw and Killeen 2012). The value of a for the sucrose reinforcer used in this experiment is similar to the value obtained for the same volume in Rickard et al.’s data set. The calorific value of 25 μl undiluted corn oil is approximately equal to that of 280 μl of a 0.6-M sucrose solution (Revelle 2007). The data shown in Fig. 9 suggest that, calorie for calorie, the incentive value of corn oil is somewhat lower than that of sucrose. This finding, obtained using pure corn oil and sucrose reinforcers, is consistent with the conclusion reached by Naleid et al. (2008) based on their analysis of operant responding for a range of corn oil/sucrose mixtures. Future parametric studies comparing the effect of a range of volumes and concentrations of the two reinforcers on the value of a may help to substantiate this conclusion.

Fig. 9

Relationship between the value of the parameter a (‘specific activation’) and the calorific content of the reinforcer. Ordinate, value of a (seconds); abscissa, calorific value (kilocalorie). Unconnected open symbols show data from the present experiment: circle, sucrose reinforcer; triangle, corn oil reinforcer. Connected filled symbols show data for a range of volumes of the sucrose reinforcer (6–300 μl: data from Rickard et al. 2009). Points indicate group mean values ± SEM

The value of T 0 obtained with the corn oil reinforcer was significantly higher than that obtained with the sucrose reinforcer, reflecting the lower peak of the overall response rate function in the rats trained with the corn oil reinforcer. T 0 defines the minimum post-reinforcement pause duration. The higher value of this parameter seen in the group trained with the corn oil reinforcer may reflect more protracted consummatory and post-prandial behaviours associated with the greater viscosity of this reinforcer. According to the PR model, the duration of the post-reinforcement pause is jointly determined by T 0 and k; however, unlike T 0, the value of k did not differ significantly between the two groups. This is consistent with the notion that differences in consummatory and post-prandial behaviours were responsible for the between-group difference in post-reinforcement pausing. While these behaviours might be expected to affect the minimum post-reinforcement pause (T 0), they would not be expected to alter the effect of the prior inter-reinforcer interval on the subsequent post-reinforcement pause (k).
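The distinction between the two pausing parameters can be made concrete with a minimal sketch. It assumes that the linear waiting function has the form pause = T 0 + k × (prior inter-reinforcer interval), as the description above implies; the exact form of Eq. 2 is not reproduced here, and all numerical values below are hypothetical rather than fitted estimates.

```python
def post_reinforcement_pause(prior_interval_s, t0_s, k):
    """Assumed linear waiting rule: pause = T0 + k * prior inter-reinforcer interval."""
    return t0_s + k * prior_interval_s


# Hypothetical inter-reinforcer intervals (seconds) that lengthen across ratios.
prior_intervals = [10.0, 20.0, 40.0, 80.0]

# Hypothetical parameter values chosen only to show the qualitative pattern.
baseline = [post_reinforcement_pause(t, t0_s=2.0, k=0.10) for t in prior_intervals]
higher_t0 = [post_reinforcement_pause(t, t0_s=4.0, k=0.10) for t in prior_intervals]
lower_k = [post_reinforcement_pause(t, t0_s=2.0, k=0.05) for t in prior_intervals]

if __name__ == "__main__":
    for t, b, h, l in zip(prior_intervals, baseline, higher_t0, lower_k):
        print(f"interval {t:5.1f} s  baseline {b:5.2f}  higher T0 {h:5.2f}  lower k {l:5.2f}")
```

In this sketch, raising T 0 lengthens every pause by the same amount regardless of the prior interval, whereas lowering k shortens pauses by an amount that grows with the prior interval, so that the function becomes flatter.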

Haloperidol significantly reduced the value of a in the case of both reinforcers. This is consistent with the results of Bradshaw and Killeen’s (2012) re-analysis of data obtained by Olarte-Sánchez et al. (2012a) in which palatable food pellets were used as the reinforcer. According to MPR, the reduction of a is indicative of a reduction of the incentive value of the reinforcer. Thus, the reductions of a are consistent with the notion that blockade of D2-like dopamine receptors reduces the incentive value of palatable reinforcers, including both sucrose and corn oil. It should be noted that this does not imply that D2-like receptor blockade devalues all food reinforcers. Indeed, there is compelling evidence that haloperidol and other D2-like receptor antagonists have relatively little effect on the incentive value of less palatable foodstuffs such as standard laboratory chow (see Randall et al. 2012).

Haloperidol had no significant effect on δ. This is consistent with Bradshaw and Killeen’s (2012) analysis of data collected by Olarte-Sánchez et al. (2012a). However, it differs from several previous reports of an increase of this parameter obtained using the FR model (Zhang et al. 2005; den Boon et al. 2012). It remains to be determined whether this discrepancy reflects procedural differences between these studies (e.g. different reinforcers or lever force requirements) or whether the new PR model is less sensitive than the FR model to minor motor impairment that may cause increases in the value of δ (Killeen 1994; Bradshaw and Killeen 2012).

SKF-83566, like haloperidol, reduced a, consistent with previous reports that both D1-like and D2-like receptors are involved in determining the rewarding impact of sweet substances (Weatherford et al. 1990; El-Ghundi et al. 2003; Der-Avakian and Markou 2012). However, unlike haloperidol, SKF-83566 also affected k, producing a significant reduction of this parameter. According to the new PR model, this effect indicates that SKF-83566 reduced the slope of the linear waiting function (i.e. it reduced the impact of the progressively increasing inter-reinforcer interval on the duration of the post-reinforcement pause in successive ratios). Further research will be needed to establish whether this is a reliable effect of SKF-83566 and whether it is shared by other antagonists of D1-like receptors.

SKF-83566 had no significant effect on a in the case of performance maintained by corn oil reinforcement. This contrasts with the significant effect of haloperidol on this parameter and is consistent with previous findings suggesting that D1-like receptors may play a less important role than D2-like receptors in the reinforcing effect of corn oil (Yoneda et al. 2007b). It is, of course, possible that higher doses of SKF-83566 would have had a significant effect on a. However, the doses used in this experiment were chosen because they were effective in other operant behaviour paradigms (Cheung et al. 2006, 2007); moreover, in the present experiment, the same doses produced a significant reduction of the value of a in the case of performance maintained by the sucrose reinforcer.

Inspection of the breakpoint data indicates that the effects of the treatments on this measure were concordant with their effects on the value of a. It might be argued, therefore, that the traditional interpretation of the breakpoint in terms of motivation or incentive value is supported and that quantitative analysis based on the new PR model is redundant. It should be noted, however, that a reduction of the breakpoint may be caused by motor as well as motivational effects of interventions (Skjoldager et al. 1993; Aberman et al. 1998; Zhang et al. 2005; Bezzina et al. 2008; den Boon et al. 2012); distinguishing between these possibilities is one of the main purposes of carrying out a quantitative analysis of the type used in this experiment (see Sanabria et al. 2008; Bradshaw and Killeen 2012). An illustration of how different parameters of the present model may exert opposing influences on the breakpoint is provided by the comparison between the sucrose and corn oil reinforcers in the present experiment. Other things being equal, higher values of T 0 and lower values of a are associated with lower breakpoints (Bradshaw and Killeen 2012). In the present experiment, there was no significant difference between the breakpoints seen with the two reinforcers, possibly reflecting the higher values of both T 0 and a in the case of the corn oil reinforcer.

D1-like dopamine receptors are known to be differentially involved in goal-directed and habitual responding. For example, D1-like receptor antagonists have been found to be more effective in suppressing learned approach behaviour at an earlier than at a later stage of acquisition (Choi et al. 2005; Ashby et al. 2010). This could help to explain the differential effect of SKF-83566 on a in the sucrose- and corn oil-reinforced groups, if it were assumed that sucrose was more effective than corn oil in establishing stable, ‘habitual’ responding on the PR schedule. However, this assumption receives little support from the data shown in Fig. 3, since the value of a had stabilised by the fifth block of sessions (between 100 and 125 training sessions) in both groups and showed no systematic change during the remainder of the experiment.

The food restriction regimen used in this experiment consisted of providing each rat with a daily ration of laboratory chow that was individually adjusted so as to maintain the rat’s body weight at 80 % of its free-feeding weight determined at the start of the experiment. Since rats housed under conventional laboratory conditions and given free access to food generally continue to gain weight through much of their adult lives (Pahl 1969; Masoro 1992; Newland and Rasmussen 2000; Wang et al. 2004), this regimen results in an increasing discrepancy between the target weights of experimental rats and the ‘normative’ weight of rats of the same age maintained under free-feeding conditions. This raises the possibility that the level of food deprivation may become increasingly severe as the experiment proceeds. One way of overcoming this problem is to link the target weights of the rats to the projected growth rate of rats maintained under ad libitum feeding conditions (e.g. Jones and Haselgrove 2011; Mika et al. 2012; Peterson et al. 2012; Marshall and Kirkpatrick 2013). This ploy was not adopted in the present experiment for the following reasons: Firstly, normative growth curves for rats are unavoidably arbitrary because the rate of weight gain depends critically on the particular diet on which the animals are fed (Aaes-Jørgensen et al. 1954; Archer et al. 2003; Li et al. 2011; Swithers et al. 2011). Secondly, and more importantly, age-related weight gain in adult rodents mainly reflects the accumulation of fat in hypertrophic adipocytes (Bertrand et al. 1980; Bailey et al. 1993; DiGirolamo et al. 1998), and there is a growing body of evidence that this is not a ‘normal’ process, but a morbid effect of unrestricted ingestion of relatively high energy food coupled with a meagre opportunity for physical exercise provided by standard laboratory cages, leading to obesity, chronic ill health and shortened life expectancy (McCay et al. 1935; Pahlavani 2000; Masoro 2002; Colom et al. 2007). Therefore, it has been argued that the use of a target weight calculated as a fraction of the weight of age-matched freely feeding animals may be an inappropriate basis for maintaining rats under chronic food restriction (Rowland 2007; Martin et al. 2010).

The foregoing argument has implications for the parameters of the model used in this study. If the maintenance of a constant target weight resulted in an increasing level of food deprivation, it might be expected that the value of a would increase progressively throughout the experiment because this parameter is a numerical index of incentive motivation (Killeen 1994; Reilly 2003; Bradshaw and Killeen 2012). This was not apparent in this experiment. As reported previously (Olarte-Sánchez et al. 2012a), a increased gradually during the training phase of the experiment and thereafter remained stable until the end of the experiment, consistent with the notion that maintaining adult rats at a constant fraction of their initial free-feeding body weight does not necessarily result in an increasing level of food motivation. It should be noted, however, that the present experiment did not pose a very stringent test of this suggestion because the increment in free-feeding body weight between the start and the end of the experiment was relatively small, this being consistent with previous findings of rather modest growth rates in adult female rats compared to age-matched males (Pahl 1969; Wang et al. 2004). It may be of interest, in future experiments, to compare the stability of a in male and female rats under different food restriction regimens. However, the present results indicate that maintenance of body weight at a fixed proportion of the initial free-feeding body weight is an appropriate regimen for maintaining a constant level of motivation in the case of female rats.

The new PR model (Bradshaw and Killeen 2012) offers several advantages over the older FR model (Killeen 1994) that has been used in many previous studies (see “Introduction” for references). Firstly, co-option of the linear waiting principle (Wynne et al. 1996) enables the new model to provide a dynamic account of the progressively increasing post-reinforcement pause duration that accompanies the increasing ratio requirement specified by PR schedules. Secondly, the new model explicitly acknowledges the qualitatively different profiles of running and overall response rates that have proven to be a significant problem in applications of the FR model to PR schedule performance (Rickard et al. 2009; Olarte-Sánchez et al. 2012a, b). Thirdly, the new model deconstructs pausing into two meaningful categories: post-reinforcement pausing, governed by the linear waiting principle, and ‘response time’ (i.e. a brief response execution time plus a longer post-response refractory period: Brackney et al. 2011), governed mainly by biological constraints on responding. In applications of the FR model to PR schedule performance, these two sources of pausing are funnelled into a single parameter, δ. Finally, the new model provides a veridical account of running and overall response rates on PR schedules, reflected in the good conformity of the data to the theoretical functions (see Footnote 1).

The potential utility of quantitative models in interpreting the effects of interventions on operant performance has been noted many times before (e.g. Mazur 2006; Sanabria et al. 2008; Bradshaw and Killeen 2012). The model adopted in this study (Bradshaw and Killeen 2012) is based on behavioural principles (Killeen 1994), and its parameters are expressed in the physical units of time (time to emit a response, pausing time and the duration of behavioural activation induced by reinforcers). Other types of model (e.g. cognitive, economic or connectionist models) may provide equally good descriptive accounts of behaviour maintained by PR schedules. The parameters of such models will reflect the differing theoretical premises on which the models are founded and are unlikely to correspond closely to the behavioural parameters that comprise the present model. For example, response rates on ratio schedules can be described by the economic demand curve in which ‘consumption’ (reinforcement rate) is plotted against ‘price’ (ratio) (see Bickel et al. 1995; Johnson and Bickel 2006). The concept of elasticity of demand captures some, but not all, of the behavioural effects of reinforcers that are accounted for in terms of a in the present model (Killeen 1995; Posadas-Sánchez and Killeen 2005). We make no claim that the present model is intrinsically preferable to models based on other theoretical approaches. Ultimately, an experimenter’s choice of a particular model to analyse a complex phenomenon such as schedule-controlled behaviour is likely to be determined mainly by his or her theoretical orientation; goodness of fit to empirical data, although clearly an important consideration, is seldom the deciding factor.

The PR model may be regarded as work in progress. Future developments may enable it to account for the patterns of responding within the component ratios of PR schedules, which are not encompassed by the model in its present form. Moreover, further exploratory work is clearly needed to assess the reliability and sensitivity of parameters of the model. However, the present results, together with Bradshaw and Killeen’s (2012) re-analysis of extant data, suggest that the present model may prove to be a useful tool to help dissect the effects of neuropharmacological interventions on the multiple processes that underlie schedule-controlled operant behaviour.