Effects of the Type of Reinforcer on Renewal of Operant Responding

Some studies suggest that ABA renewal depends on how the response-reinforcer contingency is established. Using rats as subjects, the present study assessed ABA and ABB renewal using a two-component multiple schedule (VI 30-s – VI 30-s), each component with a different reinforcer (pellets or sucrose). Sixteen subjects were trained to lever press during 20 sessions in Context A; lever pressing was extinguished during 10 sessions in Context B. For the renewal test, eight subjects were tested in Context A (Group ABA), whereas the rest were tested in Context B (Group ABB). During acquisition, response rates were higher in the pellets component than in the sucrose component; during extinction, response rates decreased to near zero. A renewal effect was observed only for Group ABA during test, with no differences between components. Our results suggest that different types of reinforcers do not seem to affect ABA renewal. Using different contexts allows renewal to be observed regardless of differences in response rates during acquisition.

Given its clinical implications, there has been recent interest in the extinction of behavior and its relapse (Podlesnik, Kelley, Jimenez-Gomez, & Bouton, 2017). Some studies have recognized the importance of context in the relapse of a response. Context has been defined "as the chambers in which conditioning occurs; they typically differ in visual, tactile, and olfactory characteristics" (Bouton & Todd, 2014, p. 14). For instance, lever-pressing can be reinforced and extinguished in two different contexts, and when subjects are placed back in the context where responding was initially reinforced, increased responding can be observed relative to the extinction condition. This effect has been identified as renewal, which refers to the recurrence of previously extinguished responding when there is a change in context relative to the context in which response extinction took place (e.g., Bouton, Todd, Vurbic, & Winterbauer, 2011; Trask & Bouton, 2016).
Three different procedures have been used to study renewal. The first, an AAB procedure, is one in which acquisition and extinction of the response occur in Context A, and renewal is tested in Context B. The second is an ABC procedure, in which acquisition, extinction, and test each occur in a different context. The third is an ABA procedure, in which acquisition and extinction occur in two different contexts (Context A and B, respectively) and renewal is tested in Context A. Although renewal has been observed in all of these procedures, an ABA procedure seems to generate the greatest increase in responding (Bouton et al., 2011; Podlesnik & Miranda-Dukoski, 2015), but some exceptions can be found in the literature (Todd, 2013).
Previous experiments have assessed the effects of different reinforcement parameters (e.g., frequency of reinforcement) on response rate and renewal using an ABA procedure. For instance, using a two-component multiple schedule of reinforcement (variable-interval [VI] 30-s – VI 120-s), Berry, Sweeney, and Odum (2014) assessed the effects of the frequency of reinforcement on renewal. During acquisition, subjects responded more in the component associated with the higher frequency of reinforcement (VI 30-s). Subjects also showed greater renewal in the VI 30-s than in the VI 120-s component. Given these findings, it appears that rate of reinforcement is related to the magnitude of renewal. Similar results were reported by Podlesnik and Shahan (2009) when response-dependent and -independent food deliveries were arranged for one of the components, whereas for the second component all food was response-dependent. Their results are quite interesting since their procedure resulted in a lower response rate during acquisition in the first component but greater renewal as a result of a higher frequency of food deliveries (see also Madrigal, Hernández, & Flores, 2018).
Likewise, another reinforcement parameter reported to affect responding is the type of reinforcer: responding differs depending on the type of reinforcer used (e.g., food vs. sucrose; Roca, Milo, & Lattal, 2011; Steinman, 1968; Wunderlich, 1961). Wunderlich (1961) trained three groups of rats to run through a runway to a goal box, where water, food, or food and water were available for each group, respectively. Running speed was greater for the Food and Food/Water groups than for the Water group. On the following day, none of the arranged reinforcers was presented in the goal box (extinction), and resistance to extinction was greater for the Food/Water group. On the second extinction day, a spontaneous recovery test was performed, resulting in greater recovery for the Food and Food/Water groups. These results suggest that differences in responding can be observed when using different types of reinforcers (see also Cruz & Roca, 2017). Similar results were reported by Steinman (1968), who also reported higher response rates when food and sucrose were delivered, in contrast to conditions in which such reinforcers were separately delivered per trial. Thus, as we have outlined, frequency and type as reinforcement parameters produce differences in response rates (Berry et al., 2014; Podlesnik & Shahan, 2009; Roca et al., 2011; Steinman, 1968; Wunderlich, 1961).
The effects of the type of reinforcer have been assessed in the phenomenon of resurgence (Leitenberg, Rawson, & Mulick, 1975; Podlesnik, Jimenez-Gomez, & Shahan, 2006; Pyszczynski & Shahan, 2013; Winterbauer, Lucke, & Bouton, 2013). For example, Winterbauer et al. (2013, Experiment 3) trained two groups of rats to lever press (L1) for food-grain pellets. During Phase 2, L1 lever-pressing was extinguished and pressing a second lever (L2) was reinforced with the same type (food) or a different type of reinforcer (sucrose pellets). During the resurgence test, L1 and L2 responding had no consequences; the authors observed the same level of resurgence regardless of the type of reinforcer for L1 and L2. Studies on resurgence have been performed as a way to establish experimental situations similar to those in contingency management treatments, in which the reinforcers used during treatment are different from those that have maintained the target behavior (Fisher, Green, Calvert, & Glasgow, 2011; Higgins, Heil, & Lussier, 2004).
Even though renewal and resurgence are different relapse phenomena, each involving different procedures, it is possible to preserve the similarity to contingency management treatments not only through the type of reinforcer but also through differences in context, as in renewal. Nevertheless, to our knowledge, no study has directly assessed the effects of the type of reinforcer on renewal. Therefore, in order to extend the generality of the use of different types of reinforcers to renewal, the present experiment assessed the effects of two types of reinforcers on ABA renewal, using a two-component multiple schedule of reinforcement with a different reinforcer associated with each component.

Method

Subjects
Sixteen experimentally naive Wistar female rats, 90 days old at the start of the experiment, were maintained at approximately 84% of their free-feeding weights. All subjects were individually housed in a temperature-controlled colony under a 12:12 hr light/dark cycle and had free access to water. Care of the animals complied with the guidelines of the University of Guadalajara Biological Science Animal Policy.

Materials
Two sets of four Med Associates Inc.® operant chambers (ENV-008) were used. Each chamber measured 30 × 24 × 21 cm (l × w × h) and was equipped with the following Med Associates Inc. apparatus. A recessed (5.1 × 5.1 cm) food dispenser and a 0.02 cc water dispenser (ENV-203 and ENV-202RM), which delivered 45-mg rodent food pellets (5-TUM: 181156, TestDiet, Richmond, IN, USA) or 20% sucrose solution, respectively, were placed at the center of the work panel, approximately 2.5 cm above floor level. On the left side of the food/water dispenser, a retractable lever (ENV-112CM) protruded 1.9 cm into the chamber and required a force of 0.25 N to be activated. Opposite the work panel, a centered 28-V houselight and a speaker (ENV-225SM), which produced a 2000-Hz tone at 80 dBA, were placed 18 cm and 13 cm above the floor of the chamber, respectively. Each chamber was enclosed in a sound-attenuating box equipped with a ventilation fan (65 dBA), which masked external sound. A Windows computer coupled to a Med Associates Inc. interface (SG-6080D) and the MED-PC IV® software controlled the experimental events and collected data.
Each set of chambers had unique features (i.e., visual, tactile, and olfactory cues), allowing the use of two different contexts (Context A and Context B). For Context A, the side walls and the ceiling had black and white diagonal stripes, 3.8 cm wide and 3.8 cm apart, and the floor was made of stainless steel grids (0.48 cm diameter). As an olfactory cue, 5 ml of pine scent (Pinol by Alen del Norte) was placed in a dish outside the chamber, beside the food/water dispenser. For Context B, each chamber had black circles of 3.8 cm in diameter on a white background, the floor grids were covered with Plexiglas, and the olfactory cue was 5 ml of cinnamon scent (Nouvela brand) presented the same way as in Context A.

Procedure
Magazine and lever-press training. Subjects received magazine and lever-press training under a conjoint variable time (VT 60 s)-fixed ratio 1 (FR 1) schedule, which delivered food or solution both non-contingently and contingently; each lever-press restarted the timer for the VT 60 s and randomly selected a new value, without replacement, from a 20-value list according to a Fleshler and Hoffman (1962) distribution. This session ended after 60 min or once 40 food/solution deliveries were recorded. For this and the following sessions, the reinforcer delivered (food or solution) was chosen randomly each time the schedule criterion was met, and the session ended once 40 food/solution deliveries were recorded. For the next two sessions, reinforcers were arranged according to an FR 1. Following FR 1 response training, a variable interval (VI) schedule was used; the value of the VI was gradually increased every two sessions (5, 15, 25 s). Each of these conditions took place in experimental chambers without the context-specific features.
Acquisition. The acquisition context was used throughout 20 acquisition sessions. Two min after the start of each session, the lever was inserted into the chamber, the houselight was turned on, and a two-component multiple schedule of reinforcement was in effect. Each daily session lasted a total of 42 min.
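The interval lists mentioned above can be illustrated with a minimal sketch. It assumes the commonly cited form of the Fleshler and Hoffman (1962) progression, t_n = T[1 + ln N + (N − n) ln(N − n) − (N − n + 1) ln(N − n + 1)], where T is the mean interval and N the number of values. The function names, the seed, and the choice to reshuffle the list once it is exhausted are illustrative assumptions; the MED-PC program actually used is not described in the text.

```python
import math
import random


def fleshler_hoffman(mean_s, n_values=20):
    """Return n_values intervals (in seconds) whose arithmetic mean is mean_s."""

    def xlogx(x):
        # x * ln(x), using the limiting value 0 at x = 0
        return x * math.log(x) if x > 0 else 0.0

    big_n = n_values
    return [
        mean_s * (1 + math.log(big_n) + xlogx(big_n - n) - xlogx(big_n - n + 1))
        for n in range(1, big_n + 1)
    ]


def draw_without_replacement(values, rng):
    """Yield intervals in random order; reshuffle once the list is used up.

    Reshuffling after exhaustion is an assumption, not a detail from the text.
    """
    while True:
        pool = list(values)
        rng.shuffle(pool)
        yield from pool


vt60 = fleshler_hoffman(60, 20)  # 20-value list for the VT 60-s schedule
vi30 = fleshler_hoffman(30, 20)  # 20-value list for a VI 30-s schedule
sampler = draw_without_replacement(vi30, random.Random(1))  # arbitrary seed
first_cycle = [next(sampler) for _ in range(20)]
```

Because the logarithmic terms telescope when summed over n, the arithmetic mean of each list equals the nominal schedule value.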

Component presentation. Each 2-min component, correlated with food or solution, was randomly presented eight times per session; components were separated by a 30 s intercomponent interval (ICI), during which the lever was retracted and the houselight was turned off. For both components, reinforcer presentations were arranged according to a VI 30 s. For eight subjects, a fixed tone (white noise) was used for the food component and an intermittent tone (0.5 s on/off) was used for the solution component. The tone-reinforcer relations were reversed for the other eight rats. In order to avoid any context preference, half of the subjects trained under fixed tone-food and intermittent tone-sucrose relations were exposed to Context A, while the other half were exposed to Context B; likewise, half of the subjects trained under fixed tone-sucrose and intermittent tone-food relations were exposed to Context A, and the other half to Context B.
Extinction. The extinction context was used throughout 10 sessions. Component presentations and session duration were identical to acquisition, with the exception that no reinforcers were delivered upon lever-pressing in either component. Subjects exposed to Context A during acquisition were then placed in Context B, whereas subjects exposed to Context B were then placed in Context A.
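The session timing described above can be checked with a short sketch. It assumes that a 30 s ICI follows every component presentation, including the last one; under that assumption the 2-min pre-session period, sixteen 2-min components, and sixteen ICIs add up to the stated 42 min. The shuffle-based ordering and all names are illustrative; the text states only that components were presented randomly.

```python
import random

COMPONENT_S = 120                # each component lasts 2 min
ICI_S = 30                       # intercomponent interval
PRESESSION_S = 120               # 2 min before the lever is inserted
PRESENTATIONS_PER_COMPONENT = 8  # per component, per session


def build_session(rng):
    """Return a random order of 8 food and 8 sucrose component presentations."""
    order = ["food", "sucrose"] * PRESENTATIONS_PER_COMPONENT
    rng.shuffle(order)
    return order


def session_duration_s(order):
    """Total session time, assuming one ICI after every component (assumption)."""
    return PRESESSION_S + len(order) * (COMPONENT_S + ICI_S)


order = build_session(random.Random(0))      # arbitrary seed
total_min = session_duration_s(order) / 60   # 2 + 16 * 2.5 = 42 min
```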
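The counterbalancing described above amounts to a 2 × 2 layout (tone-reinforcer relation × chamber set used for acquisition) with four rats per cell, where the other chamber set serves as the extinction context. The sketch below is only illustrative: the labels "set 1" and "set 2", the subject numbering, and the round-robin assignment are assumptions, since the text does not say how individual rats were allocated.

```python
from collections import Counter
from itertools import product

# Labels are hypothetical; they stand for the two tone-reinforcer relations
# and the two physical sets of chambers described in the text.
TONE_RELATIONS = (
    "fixed tone-food / intermittent tone-sucrose",
    "fixed tone-sucrose / intermittent tone-food",
)
CHAMBER_SETS = ("set 1", "set 2")


def assign(subject_ids):
    """Round-robin assignment of subjects to the four counterbalancing cells."""
    cells = list(product(TONE_RELATIONS, CHAMBER_SETS))
    plan = {}
    for i, sid in enumerate(subject_ids):
        relation, acquisition = cells[i % len(cells)]
        # Extinction takes place in whichever chamber set was not used before
        extinction = CHAMBER_SETS[1 - CHAMBER_SETS.index(acquisition)]
        plan[sid] = {
            "relation": relation,
            "acquisition": acquisition,
            "extinction": extinction,
        }
    return plan


plan = assign(range(1, 17))  # sixteen rats, numbered 1-16 for illustration
cell_sizes = Counter((p["relation"], p["acquisition"]) for p in plan.values())
```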
Test. During test, lever-pressing remained on extinction, and each group was tested for one session. Component presentations were arranged in the same manner as in previous conditions. For all subjects, the first component presented was counterbalanced. Two groups of eight rats were formed (Group ABA and Group ABB). Group ABA was tested in the acquisition context, whereas Group ABB was tested in the context used for extinction.

Results
Given that between-component differences were observed during acquisition, response rates during extinction and test are presented as mean proportion of baseline, calculated from the last two acquisition sessions (see Berry et al., 2014; Podlesnik & Shahan, 2009). Figure 2 shows response rates and between-subjects SEM during all extinction sessions and test. During extinction, response rates in both components decreased across sessions for both groups. A repeated-measures analysis with session as a within-subjects factor, and reinforcer type (food vs. sucrose) and group (ABA vs. ABB) as between-subjects factors, showed only a significant effect of session, F(10, 28) = 526.95, p = 0.01; the main effects of group, F(1, 28) = 0.24, p = 0.631, and reinforcer type, F(1, 28) = 0.33, p = 0.569, were not significant. The Session × Group, F(1, 28) = 0.57, p = 0.456, Session × Reinforcer Type, F(1, 28) = 0.42, p = 0.520, and Group × Reinforcer Type, F(1, 28) = 1.52, p = 0.228, interactions were not significant.
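The proportion-of-baseline measure can be sketched as follows: each subject's response rate in an extinction or test session is divided by that subject's mean response rate over the last two acquisition sessions, and the proportions are then summarized across subjects as a mean and SEM. The numbers below are hypothetical and serve only to illustrate the arithmetic; they are not data from the experiment, and the function names are assumptions.

```python
import math
import statistics


def proportion_of_baseline(session_rates, baseline_rates):
    """Divide each session's response rate by the mean baseline rate."""
    baseline = statistics.mean(baseline_rates)
    return [rate / baseline for rate in session_rates]


def mean_and_sem(values):
    """Between-subjects mean and standard error of the mean."""
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return statistics.mean(values), sem


# Hypothetical responses per minute for two subjects in one component
hypothetical = {
    "rat_1": {"baseline": [20.0, 30.0], "test": [12.5]},
    "rat_2": {"baseline": [10.0, 10.0], "test": [2.0]},
}

props = [
    proportion_of_baseline(d["test"], d["baseline"])[0]
    for d in hypothetical.values()
]
group_mean, group_sem = mean_and_sem(props)
```

Expressing responding relative to each subject's own baseline allows the two components to be compared during extinction and test even though their absolute response rates differed during acquisition.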

Discussion
The present study assessed the effects of the type of reinforcer on ABA renewal by using a two-component (pellets vs. sucrose) multiple schedule of reinforcement. During acquisition, response rates were higher during the food component. This result could be explained by the type of reinforcer used and is consistent with previous studies in which differences in responding as a function of reinforcer type have been observed (Cruz & Roca, 2017; Roca et al., 2011; Wunderlich, 1961).
For instance, Roca et al. (2011) trained rats to lever-press using a two-component multiple schedule of reinforcement (VI 60-s – VI 60-s); each component was correlated with a different reinforcer (food and condensed milk, respectively). Therefore, since the same frequency of reinforcement was used, differences in the response rates were attributed only to the type of reinforcer.
Based on the results reported by Berry et al. (2014), we should not have observed differences in renewal, since we arranged the same frequency of reinforcement in both components (VI 30-s – VI 30-s). Berry et al.'s results are quite interesting because they reported a higher response rate and greater renewal in the VI 30-s component than in the VI 120-s component; given these results, we expected that the degree to which renewal is observed could be correlated with response rate during acquisition. However, this prediction is not supported by the findings reported by Podlesnik and Shahan (2009). In their experiment, they reported a lower response rate in the rich component (VI 120-s + VT 20-s) compared with the lean component (VI 120-s). The low rate of responding during the rich component was explained by the non-contingent food deliveries; nevertheless, they also reported a higher degree of renewal in the rich component. In the present experiment, even though the same frequency of reinforcement was arranged for both components (VI 30-s – VI 30-s), differences in response rates were observed during acquisition as a result of the type of reinforcer used. Therefore, if response rate predicted the degree to which responding is renewed, we should have observed greater renewal for the food component, in which subjects responded at a higher rate during acquisition. The results of the present study do not confirm this hypothesis because we did not observe differences in renewal between the components.
During acquisition, greater response rates were observed for the food than for the sucrose component. Although the response rates differed statistically, it is possible that the difference between components was not large enough to produce differences during test. Likewise, the lack of differences in renewal could be explained by the number of extinction sessions (i.e., 10 sessions), in contrast to previous renewal studies in which only four extinction sessions were used (see Bouton et al., 2011; Madrigal et al., 2018; Todd, Winterbauer, & Bouton, 2012). Nevertheless, the results reported by Bouton et al. (2011), who compared renewal after 4 or 12 extinction sessions, do not seem to support this explanation. They reported that tripling the number of extinction sessions did not yield differences in renewal. Therefore, using different types of reinforcers did not seem to affect renewal, even when differences in response rates during acquisition were observed. However, it is necessary to explore different conditions and reinforcer parameters to integrate these findings into the renewal literature.