Application of a Prediction Error Theory to Pavlovian Conditioning in an Insect

Mizunami, Makoto; Terao, Kanta; Alvarez, Beatriz

doi:10.3389/fpsyg.2018.01272

MINI REVIEW article

Front. Psychol., 23 July 2018

Sec. Comparative Psychology

Volume 9 - 2018 | https://doi.org/10.3389/fpsyg.2018.01272

This article is part of the Research Topic The Mechanisms of Insect Cognition

Makoto Mizunami¹*

Kanta Terao²

Beatriz Alvarez¹

¹Faculty of Science, Hokkaido University, Sapporo, Japan
²Graduate School of Life Sciences, Hokkaido University, Sapporo, Japan

Elucidation of the conditions in which associative learning occurs is a critical issue in neuroscience and comparative psychology. In Pavlovian conditioning in mammals, it is thought that the discrepancy, or error, between the actual reward and the predicted reward determines whether learning occurs. This theory stems from the finding of Kamin’s blocking effect, in which, after pairing of a stimulus with an unconditioned stimulus (US), conditioning of a second stimulus is blocked when the two stimuli are presented in compound and paired with the same US. Whether this theory is applicable to any species of invertebrates, however, has remained unknown. We first showed blocking and one-trial blocking of Pavlovian conditioning in the cricket Gryllus bimaculatus, which supported the Rescorla–Wagner model but not attentional theories, the major competing error-correction learning theories that account for blocking. To match the prediction error theory, a neural circuit model was proposed, and a prediction from the model was tested: the results were consistent with the Rescorla–Wagner model but not with the retrieval theory, another competing theory that accounts for blocking. The findings suggest that the Rescorla–Wagner model best accounts for Pavlovian conditioning in crickets and that the basic computation rule underlying Pavlovian conditioning in crickets is the same as that suggested in mammals. Moreover, results of pharmacological studies in crickets suggested that octopamine and dopamine mediate prediction error signals in appetitive and aversive conditioning, respectively. This was in contrast to the notion that dopamine mediates appetitive prediction error signals in mammals. The functional significance and evolutionary implications of these findings are discussed.

Introduction

Pavlovian (or classical) conditioning is a form of associative learning found in many vertebrates and invertebrates (Perry et al., 2013) that is fundamental for animals’ survival since it allows them to find suitable food, avoid toxic food, escape from predators, and detect mates. This type of learning occurs when an originally unimportant stimulus (conditioned stimulus, CS) becomes associated with a biologically significant stimulus (unconditioned stimulus, US) such that the CS thereafter induces a response (conditioned response, CR). The error-correction learning rule has been thought to account for associative learning in mammals (Pearce, 2008; Mazur, 2013), but little is known about whether the same is true for any species of invertebrates (for earlier attempts in honey bees, see Greggers and Menzel, 1993; Smith, 1997). In this article, we briefly review some basic knowledge of computational rules governing Pavlovian conditioning in both vertebrates and invertebrates and their possible neural substrates, with a special focus on our recent finding that the error-correction learning rule seems to best account for Pavlovian conditioning in crickets.

Prediction Error Theories for Mammalian Pavlovian Conditioning

In associative learning in mammals, a widely accepted view is that the discrepancy, or error, between the reward an animal gets and the reward that the animal predicts (or expects) determines whether learning occurs (Rescorla and Wagner, 1972; Pearce, 2008; Mazur, 2013). The error-correction theory has been applied to learning since at least the 1950s (Bush and Mosteller, 1951) and was developed into a refined form in the 1970s to account for the blocking phenomenon reported by Kamin (1969). Blocking takes place when a stimulus (X) that has been paired with a US blocks the subsequent association of a novel stimulus (Y) in a second training phase in which the novel stimulus is presented in compound with X and reinforced by the same US. After this training, when the response to Y alone is tested, it is typically observed that animals do not respond to this stimulus (but note that some researchers, e.g., Maes et al., 2016, have reported difficulties in replicating the blocking effect in rats). The finding of the blocking effect suggests that the strength of temporal contingency (correlation) between the CS and the US, known as a critical factor for conditioning to occur (Rescorla, 1968), is not the only factor that determines the occurrence of learning. Kamin proposed that “surprise” is necessary for learning, and that learning about a stimulus (Y) is blocked when the US is fully predicted by another stimulus (X). This proposition was later formulated into the Rescorla–Wagner model, the most influential form of the error-correction learning theory (Rescorla and Wagner, 1972), which assumes that the discrepancy between the strength of the actual US and the total strength of the US predicted by all the CSs determines the amount of learning (Table 1A).
Subsequent studies in mammals suggested that dopamine (DA) neurons in the ventral tegmental area of the midbrain mediate prediction error signals for appetitive US, which provided the basis to investigate neural circuit mechanisms of Pavlovian conditioning (Schultz, 2013; Steinberg et al., 2013).
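
For reference, the Rescorla–Wagner rule is commonly written in the following form. This is the standard textbook rendering rather than a reproduction of Table 1A, and the symbols are the conventional ones, not ones defined in the text above:

$$\Delta V_X = \alpha_X \, \beta \left( \lambda - \sum_i V_i \right)$$

Here $V_X$ is the associative strength of stimulus X, $\alpha_X$ is the salience of X, $\beta$ is a learning-rate parameter tied to the US, $\lambda$ is the maximum associative strength that the US can support, and the sum runs over all CSs present on the trial. Under this rule, once X+ training has brought $V_X$ close to $\lambda$, the error term on a subsequent XY+ trial is close to zero, so $\Delta V_Y \approx 0$, which is the blocking effect.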

TABLE 1. Error-correction learning theories to account for blocking.

There are theories other than the Rescorla–Wagner model that can account for the blocking effect (Miller et al., 1995; Pearce, 2008; Mazur, 2013). The most influential competing ones are the attentional theories proposed by Mackintosh (1975) and by Pearce and Hall (1980), which are refined versions of the error-correction learning theory and account for blocking by decreased attention to the CS (Tables 1B,C). It can be stated that the Rescorla–Wagner model focuses on US processing whereas attentional models focus more on CS processing. Another notable theory is the comparator hypothesis (Miller and Matzel, 1988), which accounts for blocking by competition between CSs during the memory retrieval process. Remarkably, although efforts have been directed to experimentally testing these different theories, which of them best accounts for the computational rules governing Pavlovian conditioning remains unclear in any conditioning system (Miller et al., 1995; Pearce, 2008; Mazur, 2013).

Studies on Neural Processing Underlying Pavlovian Conditioning in Invertebrates

Whether error-correction learning models such as the Rescorla–Wagner model represent computational rules underlying learning in any species of invertebrates remained unknown until recently. One reason for the lack of such studies is the difficulty of establishing experimental procedures that convincingly demonstrate blocking. In insects, for example, some earlier studies in honey bees (e.g., Smith, 1997; Hosler and Smith, 2000) showed a blocking-like effect, but more recent studies failed to establish blocking as a robust phenomenon in honey bees (Guerrieri et al., 2005; Blaser et al., 2006, 2008). Another reason is that, although blocking has been reported in the slug Limax maximus (Sahley et al., 1981), the snail Cornu aspersum (formerly Helix aspersa; Acebes et al., 2009; Prados et al., 2013a) and the planarian Dugesia tigrina (Prados et al., 2013b), no attempts have been made to investigate which computational model best accounts for blocking in any of these invertebrate species.

Many of the previous studies on the neural basis of Pavlovian conditioning in invertebrates focused on clarifying the cellular and molecular mechanisms that allow animals to detect the coincident and correlated occurrence of the CS and the US, a prerequisite for Pavlovian conditioning. In Pavlovian conditioning of gill withdrawal responses in the sea hare Aplysia californica, it has been demonstrated that neural signals mediating the CS and the US converge in some neurons of the nervous system and that type 1 adenylyl cyclase (AC), which catalyzes the conversion of ATP to cAMP, and the N-methyl-D-aspartate (NMDA) receptor, a type of glutamate receptor, serve as key molecules for detecting the coincident arrival of CS and US signals at these neurons, which leads to the modification of synaptic efficacy that underlies conditioning (Abrams and Kandel, 1988; Hawkins and Byrne, 2015). Similarly, in the fruit fly Drosophila melanogaster, it has been shown that type 1 AC in intrinsic neurons (Kenyon cells) of the mushroom body, a higher-order associative center in the insect brain (Menzel and Giurfa, 2006; Watanabe et al., 2011; Burke et al., 2012; Liu et al., 2012), serves as a key molecule for detecting the coincident arrival at these neurons of the olfactory CS signal and the electric shock or sucrose US signal, thereby achieving conditioning (Davis, 2005; Gervasi et al., 2010). However, whether such coincidence detection mechanisms are sufficient to achieve Pavlovian conditioning in these species remains unclear.

Neural Substrates Underlying Pavlovian Conditioning in Crickets

We recently investigated whether blocking occurs in Pavlovian conditioning in the cricket Gryllus bimaculatus. Crickets are newly emerging experimental animals in which associative learning is explored by pairing visual or olfactory cues with either water (to elicit appetitive learning) or sodium chloride (to induce aversive learning). With these procedures, the neural mechanisms that are involved in both the acquisition and the retrieval of the CR of Pavlovian conditioning have been investigated in some detail (Matsumoto and Mizunami, 2002; Matsumoto et al., 2006, 2018; Mizunami et al., 2014, 2015; Matsumoto Y. et al., 2016). For example, concerning the acquisition of both olfactory and visual learning, we showed that pharmacological blockade of octopamine (OA)-ergic synaptic transmission impairs appetitive but not aversive Pavlovian conditioning, whereas pharmacological blockade of DA-ergic transmission impairs aversive conditioning but not appetitive conditioning (Unoki et al., 2005, 2006; Mizunami et al., 2009; Nakatani et al., 2009; Matsumoto et al., 2015; Mizunami and Matsumoto, 2017). The results obtained in the pharmacological studies were further confirmed in subsequent studies on the effects of knockout or knockdown of genes that encode DA receptors or OA receptors by the CRISPR/Cas9 system (Awata et al., 2015) or by RNAi (Awata et al., 2016). These findings suggest that OA neurons and DA neurons mediate neural signals representing appetitive and aversive US, respectively, in both olfactory and visual conditioning. Moreover, OA and DA neurons are also involved in the execution of the CR (or in the retrieval of the memory): blockade of OA-ergic transmission impaired CR execution after appetitive conditioning, but not after aversive conditioning with sodium chloride, and blockade of DA-ergic transmission impaired the execution of the CR after aversive conditioning but not after appetitive conditioning (Mizunami et al., 2009).
Therefore, it has been concluded that activation of OA neurons is needed for the execution of a CR after appetitive conditioning, whereas activation of DA neurons is needed for the execution of an aversive CR. These results have been integrated in a neural circuit model for Pavlovian conditioning in crickets, which is assumed to represent neural circuitry of the mushroom body (Mizunami et al., 2009). The model accounted for two higher-order learning phenomena, namely second-order conditioning (Mizunami et al., 2009) and sensory preconditioning (Matsumoto et al., 2013). It also provided the basis for the model of blocking described in subsequent sections.

Roles of OA and DA in mediating appetitive and aversive signals in Pavlovian learning have also been reported in honey bees (Hammer and Menzel, 1998; Farooqui et al., 2003; Vergoz et al., 2007, but see Perry et al., 2016 for bumblebees). In fruit flies, on the other hand, it has been concluded that different classes of dopamine neurons projecting to the mushroom body mediate appetitive and aversive signals (Burke et al., 2012; Liu et al., 2012). It seems that the neurotransmitter mediating appetitive signals differs among species of insects, whereas that mediating aversive signals is conserved.

Applicability of Prediction Error Theory to Pavlovian Conditioning in Crickets

Experiments showing blocking with crickets were conducted, at first, with an appetitive procedure in which water was used as the US. Crickets were subjected to four conditioning trials in which they were exposed to stimulus X immediately before the presentation of water (X+) and were then subjected to compound trials in which stimulus X was presented together with a new stimulus Y followed by the same US (XY+), X and Y being stimuli of different sensory modalities (an olfactory and a visual pattern stimulus, counterbalanced; Terao et al., 2015). Crickets subjected to this training did not respond to Y. In contrast, control crickets that were exposed to unpaired presentations of X and the US (X/+) and then to paired and reinforced presentations of the compound (XY+) or crickets that received only XY+ training exhibited normal learning of Y. Similar results were also obtained in experiments in which blocking was assessed by means of an aversive conditioning procedure (i.e., NaCl was used as the US; Terao and Mizunami, 2017). The results showed that blocking occurs in both appetitive conditioning and aversive conditioning in crickets.
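
The three training schedules described above can be run through the Rescorla–Wagner rule to see what it predicts for Y. The following is a minimal sketch, not the analysis used in the cited studies: the function name, the learning-rate value, and the use of four compound trials are illustrative assumptions (the text specifies only the four X+ trials).

```python
def rescorla_wagner(trials, alpha=0.3, lam=1.0):
    """Return associative strengths after a sequence of trials.

    Each trial is a pair (cues, us): the set of CSs presented and whether
    the US followed. alpha and lam are illustrative values only.
    """
    v = {}
    for cues, us in trials:
        target = lam if us else 0.0
        # Shared error term: actual US minus the summed prediction of all CSs present.
        error = target - sum(v.get(c, 0.0) for c in cues)
        for c in cues:
            v[c] = v.get(c, 0.0) + alpha * error
    return v


x_paired = [({"X"}, True)] * 4                    # X+
x_unpaired = [({"X"}, False), (set(), True)] * 4  # X/+ (X alone, US alone)
compound = [({"X", "Y"}, True)] * 4               # XY+ (trial count assumed)

blocking = rescorla_wagner(x_paired + compound)
unpaired_control = rescorla_wagner(x_unpaired + compound)
compound_only = rescorla_wagner(compound)

for name, v in [("X+ then XY+", blocking),
                ("X/+ then XY+", unpaired_control),
                ("XY+ only", compound_only)]:
    print(f"{name}: V_Y = {v['Y']:.3f}")
```

Under these assumptions the strength acquired by Y is much smaller in the X+ then XY+ schedule than in either control schedule, which matches the qualitative pattern reported for crickets.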

As already mentioned, the most influential models to account for blocking are the Rescorla–Wagner model (Rescorla and Wagner, 1972), the attentional theories proposed by Mackintosh (1975) and by Pearce and Hall (1980), and the retrieval theory (or comparator hypothesis) proposed by Miller and Matzel (1988). However, which of these models best accounts for blocking has not been tested in an invertebrate species, except that Smith (1997) examined blocking in honey bees and argued that the Rescorla–Wagner model can at least in part account for blocking but the attentional theories seem not to account for it. To discriminate among these models, one-trial appetitive blocking experiments were performed. In such experiments, crickets received X+ training trials followed by a single XY+ training trial. We used one compound conditioning trial because the Rescorla–Wagner model predicts that such training will result in blocking of Y, whereas attentional theories do not (Mackintosh, 1975; Pearce and Hall, 1980). Our results showed that crickets that received X+ training followed by one XY+ compound-conditioning trial did not respond to Y. In contrast, control crickets that were exposed to unpaired presentations of X and the US followed by one XY+ compound training trial or that received only one XY+ training trial exhibited normal learning of Y. The results supported the Rescorla–Wagner model but not the attentional theories for appetitive conditioning (Terao et al., 2015). We also investigated whether blocking with one XY+ training trial can be accounted for by assuming a simple selective attentional process not coupled to error-correction learning, and the results were not consistent with this possibility (Terao et al., 2015).
In the case of aversive conditioning (i.e., using NaCl as the US), however, a blocking experiment with one compound trial could not be performed since previous studies have shown that one aversive X+ conditioning trial does not result in aversive learning (Unoki et al., 2005, 2006). Therefore, discrimination of the Rescorla–Wagner model and attentional theories in aversive conditioning remains to be explored. The possible applicability of the retrieval theory will be discussed in a later section.
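
Why a single compound trial separates these theories can be illustrated by computing how much Y gains on the first XY+ trial under each rule. The sketch below uses simplified textbook forms of the three update rules; the function names and all parameter values are illustrative assumptions, not values taken from the cited studies or from Table 1.

```python
LAM = 1.0      # asymptote supported by the US (illustrative)
ALPHA_Y = 0.3  # starting associability of the novel stimulus Y (illustrative)


def gain_rescorla_wagner(v_x, v_y=0.0):
    # Summed error: X's prediction is subtracted, so little is left for Y.
    return ALPHA_Y * (LAM - (v_x + v_y))


def gain_mackintosh(v_y=0.0, theta=1.0):
    # Individual error: only Y's own strength enters the error term, and
    # Y's associability is revised only after the trial has been experienced.
    return theta * ALPHA_Y * (LAM - v_y)


def gain_pearce_hall(s_y=1.0):
    # Gain is salience x associability x US intensity; Y is novel, so its
    # associability is still at its starting value on this first trial.
    return s_y * ALPHA_Y * LAM


v_x = 0.95  # assumed: X predicts the US almost fully after X+ training

gains = {
    "Rescorla-Wagner": gain_rescorla_wagner(v_x),
    "Mackintosh": gain_mackintosh(),
    "Pearce-Hall": gain_pearce_hall(),
}
for model, gain in gains.items():
    print(f"{model}: gain in V_Y on the first XY+ trial = {gain:.3f}")
```

In this simplified treatment, only the summed-error rule yields a near-zero gain for Y on the first compound trial. In the attentional rules, attention to Y can decline only after at least one compound trial has shown that Y is redundant, so blocking needs more than one XY+ trial to appear.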

To account for these findings, we proposed a neural circuit model of Pavlovian conditioning in crickets that matches the Rescorla–Wagner theory (Figure 1A; Terao et al., 2015; Terao and Mizunami, 2017), by revising our previous model (Mizunami et al., 2009). The major assumption in our model is that pairing of the CS and the US leads to the enhancement of synaptic transmission from “CS” neurons to three classes of neurons, i.e., “CR,” “OA1/DA1,” and “OA2/DA2” neurons, in which “CS” neurons are neurons mediating signals about the CS (which may represent intrinsic neurons of the mushroom body) and “CR” neurons are neurons that lead to the CR when they are activated (which may represent output neurons of the mushroom body lobes). “OA1/DA1” and “OA2/DA2” neurons are separate classes of OA or DA neurons that receive signals about appetitive or aversive USs (which may represent OA or DA neurons projecting to the mushroom body lobes). “OA1/DA1” neurons (colored in yellow in Figure 1A) govern enhancement of “CS-CR” synapses (but not execution of a CR) whereas “OA2/DA2” neurons govern execution of a CR (but not enhancement of “CS-CR” synapses); here we focus on the former neurons. The model assumes that “OA1/DA1” neurons are critical for error-correction computation, in that (1) the efficacy of “CS-OA1/DA1” inhibitory synapses increases by coincident activation of “CS” and “OA1/DA1” neurons during CS-US pairing trials, (2) inhibitory inputs to “OA1/DA1” neurons represent signals about US prediction by the CS whereas excitatory inputs to these neurons represent US signals, (3) responses of “OA1/DA1” neurons during CS-US pairing trials, hence, represent US prediction error signals, and (4) after a sufficient amount of training, responses of “OA1/DA1” neurons during CS-US pairing decrease to the zero level and hence no further enhancement of “CS-CR” synapses occurs.
Details of the model are shown in the legend of Figure 1A, and how responses of “OA1/DA1” neurons to paired CS-US presentations represent US prediction error signals is described in Table 2. For models of the mushroom body that are intended to account for some other memory tasks, see, for example, Peng and Chittka (2017) and Roper et al. (2017).

FIGURE 1. Neural models of Pavlovian conditioning in crickets proposed by Terao et al. (2015) and Terao and Mizunami (2017). (A) Description of the model that has been revised from the model by Mizunami et al. (2009) to match the prediction error theory. The model assumes two classes of OA and DA neurons. One is “OA1/DA1” neurons (colored in yellow) that govern enhancement of “CS-CR” synapses (but not execution of a CR). The other is “OA2/DA2” neurons that govern execution of a CR or memory retrieval (but not enhancement of “CS-CR” synapses). The model also assumes that (1) “CS” neurons [which may represent intrinsic neurons (Kenyon cells) of the mushroom body] that convey signals for CS make silent or weak synaptic connections with dendrites of “CR” neurons [which may represent efferent (output) neurons of the lobes (output regions) of the mushroom body], activation of which leads to a CR, but these synaptic connections are silent or very weak before conditioning, (2) “OA1/DA1” neurons receive excitatory synapses that represent appetitive/aversive US signals and silent or very weak inhibitory synapses from “CS” neurons before training, which are strengthened by CS-US pairing, (3) during training, “OA1/DA1” neurons receive excitatory synaptic input that represents actual US and inhibitory input from “CS” neurons that represents US prediction by CS, and thus their activities represent US prediction error signals, (4) “OA2/DA2” neurons receive excitatory synapses that represent US signals and silent or very weak excitatory synapses from “CS” neurons before training, which are strengthened by CS-US pairing, and (5) “OA2/DA2” neurons make synaptic connections with axon terminals of “CS” neurons, and coincident activation of “CS” neurons and “OA2/DA2” neurons is needed for activation of “CR” neurons (AND gate) and for production of a conditioned response. 
Presentation of a CS after CS-US pairing activates “CS” neurons and then “OA2/DA2” neurons and thus activates “CR” neurons to lead to a CR. Synapses for which the efficacy can be changed by conditioning are colored in red and marked as “modifiable.” Excitatory synapses are marked as triangles, and inhibitory synapses are marked as bars. UR: unconditioned response. (B) Accounts for blocking by the model. “OA2/DA2” neurons in the model in (A) are not shown in (B) for simplicity. The models are modified from Terao et al. (2015) and Terao and Mizunami (2017) with permission.

TABLE 2. Information coded in the responses of “OA1/DA1” neurons in the model of Figure 1.

Figure 1B depicts how the model accounts for blocking. CS1-US pairing trials strengthen “CS1-OA1/DA1” inhibitory synapses so that responses of “OA1/DA1” neurons during trials are diminished to the zero level. Therefore, when the CS1-CS2 compound is subsequently presented and reinforced with the same US, “OA1/DA1” neurons produce no responses and hence no enhancement of “CS2-CR” synapses occurs (Terao et al., 2015).
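
The account in Figure 1B can be traced trial by trial with a toy rate model. This is a minimal sketch under stated assumptions, not the authors' implementation: the “OA1/DA1” response is taken to be the rectified difference between US excitation and summed CS inhibition, both modifiable synapses are assumed to grow in proportion to that response, and the function name, learning rate, and trial counts are hypothetical.

```python
def oa1_da1_trace(trials, rate=0.5, us_strength=1.0):
    """Simulate reinforced trials; return per-trial "OA1/DA1" responses and CS-CR weights.

    Each trial is the set of CSs presented; every trial is followed by the US.
    """
    w_inh = {}  # "CS-OA1/DA1" inhibitory synapses (US prediction by each CS)
    w_cr = {}   # "CS-CR" excitatory synapses (what is read out as learning)
    responses = []
    for cues in trials:
        inhibition = sum(w_inh.get(c, 0.0) for c in cues)
        # Response = actual US minus predicted US, floored at zero.
        response = max(0.0, us_strength - inhibition)
        responses.append(response)
        for c in cues:
            # Assumption: both synapse types change in proportion to the
            # coincident activity of the "CS" and "OA1/DA1" neurons.
            w_inh[c] = w_inh.get(c, 0.0) + rate * response
            w_cr[c] = w_cr.get(c, 0.0) + rate * response
    return responses, w_cr


schedule = [{"CS1"}] * 6 + [{"CS1", "CS2"}] * 2  # CS1+ trials, then CS1-CS2 compound
responses, w_cr = oa1_da1_trace(schedule)

print("OA1/DA1 responses:", [round(r, 3) for r in responses])
print("CS-CR weights:", {c: round(w, 3) for c, w in w_cr.items()})
```

In this sketch the simulated response shrinks toward zero across the CS1-US trials, so almost nothing remains to drive enhancement of “CS2-CR” synapses once the compound is introduced.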

One of the predictions that can be made from the model is that, in the case of appetitive conditioning, blockade of output synapses from OA neurons by administration of an OA receptor antagonist (e.g., epinastine) during the conditioning of a stimulus Y (Y+ training) impairs learning of Y since normal synaptic outputs from “OA1” neurons are needed for enhancement of “CS-CR” synapses. This treatment, however, would not affect the prediction error computation, since synaptic outputs from “OA1” neurons do not participate in prediction error computation (Figure 1B; Terao et al., 2015). Therefore, administration of epinastine before Y+ training would still allow error correction to take place in each trial, even though it prevents the enhancement of “CS-CR” synapses necessary for learning. The model thus predicts that subsequent Y+ training after recovery from the effect of epinastine should produce no learning if the associative strength of the “CS-OA1” synapses reaches the maximum after initial Y+ training. Crickets of the experimental group indeed exhibited no learning of Y. In contrast, crickets in the control group that were administered epinastine before unpaired presentation of Y and the US and then subjected to Y+ training after recovery from the effect of epinastine exhibited normal learning of Y. We referred to this inhibitory phenomenon as “auto-blocking,” because learning of Y seems to be blocked by the prediction of the US by Y itself (and not by another stimulus, X, as in the case of a blocking experiment) (Terao et al., 2015). The absence of a CR in the test could also be explained by the comparator model if memory is formed in the second training but not retrieved in the test due to competition between memories formed in the initial and second trainings. Such competition, however, is difficult to assume since results of all our previous studies suggest that no memory is formed in the first training (e.g., Unoki et al., 2005).
Taken together, one-trial blocking and the auto-blocking phenomenon suggest that the Rescorla–Wagner model is the one that best accounts for appetitive conditioning in crickets (Terao et al., 2015). In addition, auto-blocking experiments suggest that OA neurons mediate appetitive prediction error signals.
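
The auto-blocking prediction can be reproduced with the same kind of toy circuit. This is a sketch under stated assumptions, not the authors' implementation: the antagonist is modeled as abolishing “CS-CR” enhancement while leaving growth of the “CS-OA1” inhibitory synapse intact, unpaired presentations are assumed to change no synapse, and the function name, learning rate, and trial counts are hypothetical.

```python
def train_y(weights, n_trials, paired, antagonist, rate=0.5, us_strength=1.0):
    """Apply n_trials of training with stimulus Y and return the updated weights.

    weights: dict with "inh" ("CS-OA1" synapse) and "cr" ("CS-CR" synapse).
    """
    w = dict(weights)
    for _ in range(n_trials):
        if not paired:
            # Assumption: without CS-US coincidence no synapse changes.
            continue
        oa1 = max(0.0, us_strength - w["inh"])  # prediction error signal
        w["inh"] += rate * oa1                  # US prediction is still learned
        if not antagonist:
            w["cr"] += rate * oa1               # needs intact "OA1" output
    return w


naive = {"inh": 0.0, "cr": 0.0}

# Experimental group: Y+ under the antagonist, then Y+ after recovery.
experimental = train_y(naive, 4, paired=True, antagonist=True)
experimental = train_y(experimental, 4, paired=True, antagonist=False)

# Control group: unpaired Y/US under the antagonist, then Y+ after recovery.
control = train_y(naive, 4, paired=False, antagonist=True)
control = train_y(control, 4, paired=True, antagonist=False)

print(f"experimental CS-CR weight: {experimental['cr']:.3f}")
print(f"control CS-CR weight:      {control['cr']:.3f}")
```

Under these assumptions the experimental group ends with a much weaker “CS-CR” synapse than the control group, because Y already predicts the US by the time “OA1” output is restored.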

Subsequent studies also showed auto-blocking in an aversive conditioning experiment. Crickets were first administered a DA receptor antagonist (flupentixol) before training with Y+ (or before exposure to unpaired presentations of Y and the US in the case of the control group). As in the previous case, subsequent Y+ training after animals had recovered from the effect of flupentixol did not result in learning of Y (Terao and Mizunami, 2017), whereas animals in the control group showed an increased aversion to Y. The results suggest that the Rescorla–Wagner model or other forms of error-correction learning theories, but not the retrieval theory, best account for aversive conditioning. The results of auto-blocking experiments also suggest that DA neurons mediate aversive prediction error signals.

It should be noted, however, that we do not suggest that error-correction learning theories account for all aspects of Pavlovian conditioning in crickets. The model proposed to account for Pavlovian conditioning in crickets assumes synaptic plasticity at three different types of synapses in the circuitry and suggests that the plasticity of one type (“CS-CR” synapses) is governed by US prediction error whereas the plasticity of the other two types (“CS-OA1/DA1” and “CS-OA2/DA2” synapses) is governed by coincident occurrence of the CS and the US. Moreover, we have observed second-order conditioning (Mizunami et al., 2009) in crickets, which is difficult to account for with the Rescorla–Wagner model without appropriate revisions (Miller et al., 1995). We have proposed that these learning phenomena in crickets can be accounted for by neural models that assume no error-correction computation (specifically, by neural pathways involving “OA2/DA2” neurons) (Mizunami et al., 2009; Matsumoto et al., 2013; Terao et al., 2015).

It can be pointed out that major predictions from our model differ from those of the temporal difference (TD) model (Sutton and Barto, 1987), a variant of error-correction learning models that is frequently used for simulations of activities of dopamine neurons in the midbrain in primates. It has been shown that those neurons in primates are activated by a learned CS and less by a predicted US after Pavlovian conditioning, in accordance with the TD model (Schultz, 2015). Interestingly, some of these features have also been found in a ventral unpaired neuron, an OA neuron in the subesophageal ganglion in honey bees that mediates sucrose signals in appetitive olfactory conditioning (Hammer, 1993). In our model, on the other hand, activities representing the US prediction by the CS (i.e., responses to a learned CS) and those representing US prediction error (i.e., less responding to a predicted US during paired CS-US presentation after training) are assumed in separate classes of aminergic neurons (i.e., “OA2/DA2” and “OA1/DA1” neurons) for simplification of the model. Physiological investigations are needed to clarify the validity of our model.
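
For orientation, the TD prediction error is commonly written in the following form. This is standard reinforcement-learning notation; the symbols are not defined in the text above, and indexing conventions vary between sources:

$$\delta_t = r_{t+1} + \gamma \, V(s_{t+1}) - V(s_t)$$

Here $r_{t+1}$ is the reward (US) received on moving from state $s_t$ to $s_{t+1}$, $V$ is the learned value (the US prediction), and $\gamma \in [0, 1]$ is a discount factor. After training, the unexpected onset of a learned CS raises $V(s_{t+1})$ and gives $\delta_t > 0$, whereas a fully predicted US gives $\delta_t \approx 0$. A single TD error signal therefore carries both features that the cricket model assigns to “OA2/DA2” and “OA1/DA1” neurons, respectively.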

Functional and Evolutionary Considerations

The finding that an error-correction learning rule accounts for Pavlovian conditioning in crickets is remarkable since it suggests that the basic computational rules underlying Pavlovian learning in crickets are the same as those in mammals. Error-correction computation, one of the fundamental neural computations executed in the mammalian brain, can also be achieved in the small brain of crickets. It is thus of interest to elucidate the neural circuit mechanisms underlying error-correction learning in crickets, and in other species of invertebrates, to compare them with those in mammals. In mammals, midbrain DA neurons are thought to mediate prediction error signals for appetitive stimuli, and whether DA neurons also mediate aversive prediction error signals is under debate (Schultz, 2013; Matsumoto H. et al., 2016). In mice, it has been suggested that prediction error signals observed in midbrain DA neurons are the result of summation of information across multiple brain areas, rather than prediction error signals being computed in a specific brain area (Tian et al., 2016). In crickets, we hypothesize that OA and DA neurons projecting to the mushroom body mediate appetitive and aversive prediction error signals, respectively (Terao et al., 2015; Terao and Mizunami, 2017). Anatomical and physiological characterizations of these OA and DA neurons should pave the way for elucidating the ubiquity and differences of the neural mechanisms underlying prediction error computation among animals of different phyla.

Some questions arise concerning the functional significance and evolution of the error-correction learning rule underlying Pavlovian conditioning in crickets. An important question is what functional advantages are conferred by associative learning systems in which coincident and correlated occurrence of a CS and a US is not sufficient to lead to learning. To facilitate discussion on this issue, we assume that many of the Pavlovian conditioning systems in invertebrates are based on a simpler learning rule, namely, that they are based solely on the detection of coincident or contingent occurrence of a CS and a US, as has been assumed by many neurobiologists. It can be argued that an error-correction learning system is advantageous when multiple CSs occur in association with a US, since, in such a system, the magnitude of learning of a given CS is determined by its relative “surprisingness,” that is, by the extent to which the US is not already predicted. This learning system is more efficient, in that it prevents learning of redundant cues, than a learning system that is based solely on the detection of temporal coincidence or contingency, in which all CSs that occur in the same temporal relationship with a US should be equally learned. Error-correction learning, however, should have a cost, in that it requires elaborate neural circuits in the brain, and the development and maintenance of such circuits should be costly. Such a cost, however, is likely to be moderate since it is affordable for crickets, which have only small brains.

Another question to be addressed in the future is to what extent the Pavlovian conditioning system with the error-correction rule is ubiquitous among invertebrates. The blocking phenomenon, a hallmark of the error-correction learning rule, has so far been reported only in slugs (Sahley et al., 1981), snails (Acebes et al., 2009; Prados et al., 2013a), and planarians (Prados et al., 2013b), but whether it occurs by error-correction learning or by another process, such as cue competition during memory retrieval (Miller and Matzel, 1988) or a simple selective attentional process not coupled to error-correction learning (see Terao et al., 2015), has not been investigated. Slugs and snails possess well-developed central nervous systems (Sahley et al., 1981; Loy et al., 2006), comparable to those of insects, and it is therefore likely that the blocking effect in these animals is based on error-correction learning rules as well. On the other hand, since the central nervous system of planarians is much less organized than that of insects, it is likely that blocking in planarians reflects processes other than error-correction learning. In insects, it is of interest to see whether blocking is based on an error-correction rule in species other than crickets. However, unambiguous evidence of the blocking phenomenon has not been found in honey bees (Guerrieri et al., 2005; Blaser et al., 2006, 2008) or in the fruit fly Drosophila melanogaster (Young et al., 2011). In the case of honey bees, for example, contradictory results have been reported in the literature, ranging from blocking of the CR (Smith, 1997; Hosler and Smith, 2000) to the absence of blocking (Blaser et al., 2006, 2008). Guerrieri et al. (2005) reported blocking, no blocking, or even enhanced responding to the blocked element (i.e., augmentation) depending on the odor pairs used in the blocking experiment in honey bees. The reasons for the contradictory results in honey bees remain to be explored.

Finally, phenomena that are not consistent with the Rescorla–Wagner model, such as recovery from extinction, and phenomena that are difficult to account for with the Rescorla–Wagner model without appropriate revisions, such as second-order conditioning, have been reported in some invertebrate species (e.g., Sahley et al., 1981; Loy et al., 2006; Hussaini et al., 2007; Tabone and de Belle, 2011; Alvarez et al., 2014). What neural circuit mechanisms underlie associative learning in these species remains a subject for future study.

Author Contributions

MM, KT, and BA wrote the manuscript and approved the final version.

Funding

This study was supported by Grants-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science and Technology of Japan to MM (Nos. 16H04814 and 16K18586) and to KT (No. 15J01414) and by the JSPS Postdoctoral Fellowship Program to BA (No. PE17047).

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Abrams, T. W., and Kandel, E. R. (1988). Is contiguity detection in classical conditioning a system or a cellular property? Learning in Aplysia suggests a possible molecular site. Trends Neurosci. 11, 128–135. doi: 10.1016/0166-2236(88)90137-3

Acebes, F., Solar, P., Carnero, S., and Loy, I. (2009). Blocking of conditioning of tentacle lowering in the snail (Helix aspersa). Q. J. Exp. Psychol. 62, 1315–1327. doi: 10.1080/17470210802483545

Alvarez, B., Morís, J., Luque, D., and Loy, I. (2014). Extinction, spontaneous recovery and reinstatement in the garden snail Helix aspersa. Anim. Behav. 92, 75–83. doi: 10.1016/j.anbehav.2014.03.023

Awata, H., Wakuda, R., Ishimaru, Y., Matsuoka, Y., Terao, K., Katata, S., et al. (2016). Roles of OA1 octopamine receptor and Dop1 dopamine receptor in mediating appetitive and aversive reinforcement revealed by RNAi studies. Sci. Rep. 6:29696. doi: 10.1038/srep29696

Awata, H., Watanabe, T., Hamanaka, Y., Mito, T., Noji, S., and Mizunami, M. (2015). Knockout crickets for the study of learning and memory: dopamine receptor Dop1 mediates aversive but not appetitive reinforcement in crickets. Sci. Rep. 5:15885. doi: 10.1038/srep15885

Blaser, R. E., Couvillon, P. A., and Bitterman, M. E. (2006). Blocking and pseudoblocking: new control experiments with honeybees. Q. J. Exp. Psychol. 59, 68–76. doi: 10.1080/17470210500242938

Blaser, R. E., Couvillon, P. A., and Bitterman, M. E. (2008). Within-subjects experiments on blocking and facilitation in honeybees (Apis mellifera). J. Comp. Psychol. 122, 373–378. doi: 10.1037/a0012623

Burke, C. J., Huetteroth, W., Owald, D., Perisse, E., Krashes, M. J., Das, G., et al. (2012). Layered reward signalling through octopamine and dopamine in Drosophila. Nature 492, 433–437. doi: 10.1038/nature11614

Bush, R. R., and Mosteller, F. (1951). A mathematical model of simple learning. Psychol. Rev. 58, 313–323. doi: 10.1037/h0054388

Davis, R. L. (2005). Olfactory memory formation in Drosophila: from molecular to systems neuroscience. Annu. Rev. Neurosci. 28, 275–302. doi: 10.1146/annurev.neuro.28.061604.135651

Farooqui, T., Robinson, K., Vaessin, H., and Smith, B. H. (2003). Modulation of early olfactory processing by an octopaminergic reinforcement pathway in the honeybee. J. Neurosci. 23, 5370–5380. doi: 10.1523/JNEUROSCI.23-12-05370.2003

Gervasi, N., Tchénio, P., and Preat, T. (2010). PKA dynamics in a Drosophila learning center: coincidence detection by rutabaga adenylyl cyclase and spatial regulation by dunce phosphodiesterase. Neuron 65, 516–529. doi: 10.1016/j.neuron.2010.01.014

Greggers, U., and Menzel, R. (1993). Memory dynamics and foraging strategies of honeybees. Behav. Ecol. Sociobiol. 32, 17–29. doi: 10.1007/BF00172219

Guerrieri, F., Lachnit, H., Gerber, B., and Giurfa, M. (2005). Olfactory blocking and odorant similarity in the honeybee. Learn. Mem. 12, 86–95. doi: 10.1101/lm.79305

Hammer, M. (1993). An identified neuron mediates the unconditioned stimulus in associative olfactory learning in honeybees. Nature 366, 59–63. doi: 10.1038/366059a0

Hammer, M., and Menzel, R. (1998). Multiple sites of associative odor learning as revealed by local brain microinjections of octopamine in honeybees. Learn. Mem. 5, 146–156.

Hawkins, R. D., and Byrne, J. H. (2015). Associative learning in invertebrates. Cold Spring Harb. Perspect. Biol. 7:a021709. doi: 10.1101/cshperspect.a021709

Hosler, J. S., and Smith, B. H. (2000). Blocking and the detection of odor components in blends. J. Exp. Biol. 203, 2797–2806.

Hussaini, S. A., Komischke, B., Menzel, R., and Lachnit, H. (2007). Forward and backward second-order Pavlovian conditioning in honeybees. Learn. Mem. 14, 678–683. doi: 10.1101/lm.471307

Kamin, L. (1969). “Predictability, surprise, attention and conditioning,” in Punishment and Aversive Behavior, eds B. A. Campbell and R. M. Church (New York, NY: Appleton-Century-Crofts), 279–298.

Liu, C., Plaçais, P. Y., Yamagata, N., Pfeiffer, B. D., Aso, Y., Friedrich, A. B., et al. (2012). A subset of dopamine neurons signals reward for odour memory in Drosophila. Nature 488, 512–516. doi: 10.1038/nature11304

Loy, I., Fernández, V., and Acebes, F. (2006). Conditioning of tentacle lowering in the snail (Helix aspersa): acquisition, latent inhibition, overshadowing, second-order conditioning, and sensory preconditioning. Learn. Behav. 34, 305–314. doi: 10.3758/BF03192885

Mackintosh, N. J. (1975). A theory of attention: variations in the associability of stimuli with reinforcement. Psychol. Rev. 82, 276–298. doi: 10.1037/h0076778

Maes, E., Boddez, Y., Alfei, J. M., Krypotos, A. M., D’Hooge, R., De Houwer, J., et al. (2016). The elusive nature of the blocking effect: 15 failures to replicate. J. Exp. Psychol. Gen. 145, e49–e71. doi: 10.1037/xge0000200

Matsumoto, H., Tian, J., Uchida, N., and Watabe-Uchida, M. (2016). Midbrain dopamine neurons signal aversion in a reward-context dependent manner. eLife 5, 1–24. doi: 10.7554/eLife.17328

Matsumoto, Y., Hirashima, D., and Mizunami, M. (2013). Analysis and modeling of neural processes underlying sensory preconditioning. Neurobiol. Learn. Mem. 101, 103–113. doi: 10.1016/j.nlm.2013.01.008

Matsumoto, Y., Matsumoto, C. S., and Mizunami, M. (2018). Signaling pathways for long-term memory formation in the cricket. Front. Psychol. 9:1014. doi: 10.3389/fpsyg.2018.01014

Matsumoto, Y., Matsumoto, C. S., Takahashi, T., and Mizunami, M. (2016). Activation of NO-cGMP signaling rescues age-related memory impairment in an insect. Front. Behav. Neurosci. 10:166. doi: 10.3389/fnbeh.2016.00166

Matsumoto, Y., Matsumoto, C. S., Wakuda, R., Ichihara, S., and Mizunami, M. (2015). Roles of octopamine and dopamine in appetitive and aversive memory acquisition studied in olfactory conditioning of maxillary palpi extension response in crickets. Front. Behav. Neurosci. 9:230. doi: 10.3389/fnbeh.2015.00230

Matsumoto, Y., and Mizunami, M. (2002). Temporal determinants of olfactory long-term retention in the cricket Gryllus bimaculatus. J. Exp. Biol. 205, 1429–1437.

Matsumoto, Y., Unoki, S., Aonuma, H., and Mizunami, M. (2006). Critical roles of the nitric oxide-cGMP cascade in the formation of cAMP-dependent long-term memory. Learn. Mem. 13, 35–44. doi: 10.1101/lm.130506

Mazur, J. E. (ed.). (2013). “Chapter 4: theories and research on classical conditioning,” in Learning and Behavior (Boston, MA: Pearson Education), 75–100.

Menzel, R., and Giurfa, M. (2006). Dimensions of cognition in an insect, the honeybee. Behav. Cogn. Neurosci. Rev. 5, 24–40. doi: 10.1177/1534582306289522

Miller, R. R., Barnet, R. C., and Grahame, N. J. (1995). Assessment of the Rescorla-Wagner model. Psychol. Bull. 117, 363–386. doi: 10.1037/0033-2909.117.3.363

Miller, R. R., and Matzel, L. D. (1988). The comparator hypothesis: a response rule for the expression of associations. Psychol. Learn. Motiv. 22, 51–92. doi: 10.1016/S0079-7421(08)60038-9

Mizunami, M., Hamanaka, Y., and Nishino, H. (2015). Toward elucidating diversity of neural mechanisms underlying insect learning. Zool. Lett. 1:8. doi: 10.1186/s40851-014-0008-6

Mizunami, M., and Matsumoto, Y. (2017). Roles of octopamine and dopamine neurons for mediating appetitive and aversive signals in Pavlovian conditioning in crickets. Front. Physiol. 8:1027. doi: 10.3389/fphys.2017.01027

Mizunami, M., Nemoto, Y., Terao, K., Hamanaka, Y., and Matsumoto, Y. (2014). Roles of calcium/calmodulin-dependent kinase II in long-term memory formation in crickets. PLoS One 9:e107442. doi: 10.1371/journal.pone.0107442

Mizunami, M., Unoki, S., Mori, Y., Hirashima, D., Hatano, A., and Matsumoto, Y. (2009). Roles of octopaminergic and dopaminergic neurons in appetitive and aversive memory recall in an insect. BMC Biol. 7:46. doi: 10.1186/1741-7007-7-46

Nakatani, Y., Matsumoto, Y., Mori, Y., Hirashima, D., Nishino, H., Arikawa, K., et al. (2009). Why the carrot is more effective than the stick: different dynamics of punishment memory and reward memory and its possible biological basis. Neurobiol. Learn. Mem. 92, 370–380. doi: 10.1016/j.nlm.2009.05.003

Pearce, J. M. (2008). Animal Learning & Cognition, 3rd Edn. New York, NY: Psychology Press, 35–91.

Pearce, J. M., and Hall, G. (1980). A model for Pavlovian learning: variations in the effectiveness of conditioned but not of unconditioned stimuli. Psychol. Rev. 87, 532–552. doi: 10.1037/0033-295X.87.6.532

Peng, F., and Chittka, L. (2017). A simple computational model of the bee mushroom body can explain seemingly complex forms of olfactory learning and memory. Curr. Biol. 27, 224–230. doi: 10.1016/j.cub.2016.10.054

Perry, C. J., Baciadonna, L., and Chittka, L. (2016). Unexpected rewards induce dopamine-dependent positive emotion-like state changes in bumblebees. Science 353, 1529–1531. doi: 10.1126/science.aaf4454

Perry, C. J., Barron, A. B., and Cheng, K. (2013). Invertebrate learning and cognition: relating phenomena to neural substrate. Wiley Interdiscip. Rev. Cogn. Sci. 4, 561–582. doi: 10.1002/wcs.1248

Prados, J., Alvarez, B., Acebes, F., Loy, I., Sansa, J., and Moreno-Fernández, M. M. (2013a). Blocking in rats, humans and snails using a within-subjects design. Behav. Process. 100, 23–31. doi: 10.1016/j.beproc.2013.07.014

Prados, J., Alvarez, B., Howarth, J., Stewart, K., Gibson, C. L., Hutchinson, C. V., et al. (2013b). Cue competition effects in the planarian. Anim. Cogn. 16, 177–186. doi: 10.1007/s10071-012-0561-3

Rescorla, R. A. (1968). Probability of shock in the presence and absence of CS in fear conditioning. J. Comp. Physiol. Psychol. 66, 1–5. doi: 10.1037/h0025984

Rescorla, R. A., and Wagner, A. R. (1972). “A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement,” in Classical Conditioning II, eds A. Black and W. R. Prokasy (New York, NY: Academic Press), 64–99.

Roper, M., Fernando, C., and Chittka, L. (2017). Bio-inspired neural network provides new evidence on how simple feature detectors can enable complex visual generalization and stimulus location invariance in the miniature brain of honeybees. PLoS Comput. Biol. 13:e1005333. doi: 10.1371/journal.pcbi.1005333

Sahley, C., Rudy, J. W., and Gelperin, A. (1981). An analysis of associative learning in a terrestrial mollusc. J. Comp. Physiol. 144, 1–8. doi: 10.1007/BF00612791

Schultz, W. (2013). Updating dopamine reward signals. Curr. Opin. Neurobiol. 23, 229–238. doi: 10.1016/j.conb.2012.11.012

Schultz, W. (2015). Neuronal reward and decision signals: from theories to data. Physiol. Rev. 95, 853–951. doi: 10.1152/physrev.00023.2014

Smith, B. H. (1997). An analysis of blocking in odorant mixtures: an increase but not a decrease in intensity of reinforcement produces unblocking. Behav. Neurosci. 111, 57–69. doi: 10.1037/0735-7044.111.1.57

Steinberg, E. E., Keiflin, R., Boivin, J. R., Witten, I. B., Deisseroth, K., and Janak, P. H. (2013). A causal link between prediction errors, dopamine neurons and learning. Nat. Neurosci. 16, 966–973. doi: 10.1038/nn.3413

Sutton, R. S., and Barto, A. G. (1987). “A temporal-difference model of classical conditioning,” in Proceedings of the Ninth Annual Conference of the Cognitive Science Society (Seattle, WA: Lawrence Erlbaum), 355–378.

Tabone, C. J., and de Belle, J. S. (2011). Second-order conditioning in Drosophila. Learn. Mem. 18, 250–253. doi: 10.1101/lm.2035411

Terao, K., Matsumoto, Y., and Mizunami, M. (2015). Critical evidence for the prediction error theory in associative learning. Sci. Rep. 5:8929. doi: 10.1038/srep08929

Terao, K., and Mizunami, M. (2017). Roles of dopamine neurons in mediating the prediction error in aversive learning in insects. Sci. Rep. 7:14694. doi: 10.1038/s41598-017-14473-y

Tian, J., Huang, R., Cohen, J. Y., Osakada, F., Kobak, D., Machens, C. K., et al. (2016). Distributed and mixed information in monosynaptic inputs to dopamine neurons. Neuron 91, 1374–1389. doi: 10.1016/j.neuron.2016.08.018

Unoki, S., Matsumoto, Y., and Mizunami, M. (2005). Participation of octopaminergic reward system and dopaminergic punishment system in insect olfactory learning revealed by pharmacological study. Eur. J. Neurosci. 22, 1409–1416. doi: 10.1111/j.1460-9568.2005.04318.x

Unoki, S., Matsumoto, Y., and Mizunami, M. (2006). Roles of octopaminergic and dopaminergic neurons in mediating reward and punishment signals in insect visual learning. Eur. J. Neurosci. 24, 2031–2038. doi: 10.1111/j.1460-9568.2006.05099.x

Vergoz, V., Roussel, E., Sandoz, J. C., and Giurfa, M. (2007). Aversive learning in honeybees revealed by the olfactory conditioning of the sting extension reflex. PLoS One 2:e288. doi: 10.1371/journal.pone.0000288

Watanabe, H., Matsumoto, C. S., Nishino, H., and Mizunami, M. (2011). Critical roles of mecamylamine-sensitive mushroom body neurons in insect olfactory learning. Neurobiol. Learn. Mem. 95, 1–13. doi: 10.1016/j.nlm.2010.10.004

Keywords: blocking, classical conditioning, cricket, dopamine, error-correction learning, invertebrate, octopamine, Rescorla–Wagner model

Citation: Mizunami M, Terao K and Alvarez B (2018) Application of a Prediction Error Theory to Pavlovian Conditioning in an Insect. Front. Psychol. 9:1272. doi: 10.3389/fpsyg.2018.01272

Received: 21 March 2018; Accepted: 03 July 2018; Published: 23 July 2018.

Edited by:

Lars Chittka, Queen Mary University of London, United Kingdom

Reviewed by:

Martha Escobar, Oakland University, United States
Bertram Gerber, Leibniz Institute for Neurobiology, Germany

Copyright © 2018 Mizunami, Terao and Alvarez. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Makoto Mizunami, mizunami@sci.hokudai.ac.jp
