Metazoan evolution of glutamate receptors reveals unreported phylogenetic groups and divergent lineage-specific events

Glutamate receptors are divided in two unrelated families: ionotropic (iGluR), driving synaptic transmission, and metabotropic (mGluR), which modulate synaptic strength. The present classification of GluRs is based on vertebrate proteins and has remained unchanged for over two decades. Here we report an exhaustive phylogenetic study of GluRs in metazoans. Importantly, we demonstrate that GluRs have followed different evolutionary histories in separated animal lineages. Our analysis reveals that the present organization of iGluRs into six classes does not capture the full complexity of their evolution. Instead, we propose an organization into four subfamilies and ten classes, four of which have never been previously described. Furthermore, we report a sister class to mGluR classes I-III, class IV. We show that many unreported proteins are expressed in the nervous system, and that new Epsilon receptors form functional ligand-gated ion channels. We propose an updated classification of glutamate receptors that includes our findings.

The current classification of iGluR subunits includes six classes: a-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) receptors, Kainate receptors, N-methyl-D-aspartate (NMDA) receptors (actually comprising three classes: NMDA1-3) and Delta receptors (Traynelis et al., 2010). iGluR subunits of the same class assemble into homo-or heterotetramers (Karakas and Furukawa, 2014;Kumar et al., 2011) and their ligand selectivity is dictated by a small number of residues located in the ligand-binding domain (Traynelis et al., 2010). Accordingly, NMDA subunits GluN1 and GluN3 as well as the Delta subunit GluD2 bind glycine and D-serine, while all subunits from the AMPA and Kainate classes bind glutamate (Traynelis et al., 2010;Kristensen et al., 2016). Metabotropic glutamate receptors are class C GPCRs and as such are formed by a single polypeptide. mGluRs also appeared before the emergence of metazoans, being present in unicellular organisms such as the amoeba Dictyostellium discoideum (Taniura et al., 2006). mGluRs are presently organized into three classes (I, II and III) and all their members respond to glutamate (Conn and Pin, 1997;Pin et al., 2003).
While the phylogeny of the two families of GluRs is well characterized in vertebrates, that of the entire animal kingdom is only poorly understood. The few studies on iGluR evolution outside vertebrates concentrate on a few phyla, leaving many proteins unclassified (Greer et al., 2017;Brockie et al., 2001;Janovjak et al., 2011;Kenny and Dearden, 2013). Similarly, the vast majority of mGluRs described so far fall into the three classes described in vertebrates (Krishnan et al., 2013;Kucharski et al., 2007;Dillon et al., 2006). Although, the existence of three insect mGluRs that cluster apart from classes I-III led to propose the existence of a fourth class (Mitri et al., 2004).
Here we present what to our knowledge is the most comprehensive phylogenetic study of ionotropic and metabotropic GluRs along the animal kingdom. We have favored the use of more slow-evolving species for the construction of phylogenetic trees. These species are particularly amenable to phylogenetics (Simakov et al., 2013;Simakov et al., 2015;Putnam et al., 2007) as they arguably eLife digest Nerve cells or neurons communicate with each other by releasing specific molecules in the gap between them, the synapses. The sending neuron passes on messages through packets of chemicals called neurotransmitters, which are picked up by the receiving cell with the help of receptors on its surface. Neurons use different neurotransmitters to send different messages, but one of the most common ones is glutamate.
There are two families of glutamate receptors: ionotropic receptors, which can open or close ion channels in response to neurotransmitters and control the transmission of a signal, and metabotropic receptors, which are linked to a specific protein and control the strength of signal.
Our understanding of these two receptor families comes from animals with backbones, known as vertebrates. But the receptors themselves are ancient. We can trace the first family back as far as bacteria and the second back to single-celled organisms like amoebas. Vertebrates have six classes of ionotropic and three classes of metabotropic glutamate receptor. But other multi-celled animals also have these receptors, so this picture may not be complete.
Here, Ramos-Vicente et al. mapped all major lineages of animals to reveal the evolutionary history of these receptors to find out if the receptor families became more complicated as brain power increased. The results showed that the glutamate receptors found in vertebrates are only a fraction of all the types that exist. In fact, before present-day animal groups emerged, the part of the genome that holds the ionotropic receptor genes duplicated three times. This formed four receptor subfamilies, and our ancestors had all of them. Across the animal kingdom, there are ten, not six, classes of ionotropic receptors and there is an extra class of metabotropic receptors. But only two subfamilies of ionotropic and three out of four metabotropic receptor classes are still present in vertebrates today.
The current classification of glutamate receptors centers around vertebrates, ignoring other animals. But this new data could change that. A better knowledge of these new receptors could aid neuroscientists in better understanding the nervous system. And, using this technique to study other families of proteins could reveal more missing links in evolution. present lower rates of molecular evolution than other organisms. Our work shows that metazoan evolution of GluRs is much more complex than previously thought. iGluRs present an overall organization into four subfamilies that were already present in the last ancestor of all metazoans. Vertebrate species only retain members of two of these subfamilies. Furthermore, we identify many lineage-specific gains, losses or expansions of GluR phylogenetic groups. Finally, we present experimental evidence showing that unreported GluRs found in the basally divergent chordate Branchiostoma lanceolatum (amphioxus) are highly expressed in the nervous system and that members of the unreported Epsilon subfamily, the most phylogenetically spread among unreported groups, can form functional ligand-gated ion channels.

Results
Phylogenetics of metazoan ionotropic glutamate receptors reveals four subfamilies, unreported classes and lineage-specific evolutionary dynamics We have performed a systematic phylogenetic study of iGluR evolution across the animal kingdom. To increase the confidence on iGluRs evolutionary history phylogenetic trees have been generated using two independent methods (Bayesian inference and Maximum-likelihood (ML), Figure 1 and Figure 1-figure supplement 1). Our analysis indicates that the family of iGluRs experienced key duplication events that define its present organization into four previously unreported subfamilies, of which two contain the extensively studied vertebrate classes. Assuming ctenophores as the sister group to all other animals (Moroz et al., 2014;Ryan et al., 2013), our data suggest that the three major duplication events leading to this four subfamilies occurred before the divergence of current animal phyla (see Figure 2 for a summary scheme of iGluRs evolution). The first of these duplications produced the separation of the Lambda subfamily, the second lead to divergence of the NMDA subfamily and the third to the split between Epsilon and AKDF subfamilies.
The Lambda subfamily is the most phylogenetically restricted, as we could only identify it in porifers. Thus, Lambda would have been lost in two occasions, in the lineage of ctenophores and in a common ancestor of placozoans, cnidarians and bilaterals. On the other hand, the Epsilon subfamily is the best represented among non-bilaterians, being present in all non-bilaterian phyla investigated. Including in porifers, although we could only identify one Epsilon in sponges, GluE_Ifa from the demosponge Ircinia fasciculata. Our data also indicate that this subfamily has been lost in multiple occasions along metazoan evolution, as we could not find it in the protostome, echinoderm or vertebrate species investigated. Interestingly, all ctenophore iGluRs identified, which have been previously reported (Alberstein et al., 2015), belong to the Epsilon subfamily. Thus, this phylum would have lost NMDA, Lambda and AKDF proteins. Contrarily, ctenophores would have experienced an important expansion of Epsilon iGluRs, as we report 17 and 10 of these proteins in the two species with genomic information available, M. leidyi and P. bachei, respectively.
Although we have not identified NMDA receptors in ctenophores, porifers and placozoans our analysis indicates that this subfamily was already present in the last common ancestor of metazoans. This is because the topology of the tree shows that NMDAs appear in the phylogeny at the same level as the Epsilon subfamily, which has representatives in all non-bilateral phyla. According to our data, NMDA1s on the one hand and NMDA2s and NMDA3s on the other contain members of the cnidarian phylum. Although we have only been able to identify one member more closely related to NMDA2 and NMDA3 than NMDA1 (GluN2/3_Nve), its position in the phylogeny is very well supported by both analyses performed. This indicates that a specific duplication occurred in the ancestor of bilaterians originating NMDA2s and NMDA3s. Moreover, we have also identified a cnidarianspecific NMDA class, that we have termed NMDA-Cnidaria, this class presents representative proteins in 3 of the four species investigated. Among bilaterals we have observed conservation of all NMDA classes with the exception of NMDA2s in echinoderms, which are absent from the two species examined. Interestingly, studied cnidarian species substantially expanded their NMDA subfamily repertoire, with at least six members in Nematostella vectensis.
In bilaterians the AKDF subfamily diversified into the known AMPA, Kainate and Delta classes, but also into a fourth new class that we have termed Phi. The phylogenetic spread of these classes is quite variable, as AMPA and Kainate are in all bilateral phyla investigated but Delta and Phi are . Bayesian phylogeny of metazoan ionotropic glutamate receptors. Ionotropic glutamate receptor subfamilies are indicated in colored boxes at the right. Sequences belonging to the same class are highlighted together by dashed lines and the class name is also shown. Green circles highlight the three duplications occurred before the divergence of the ctenophore lineage that lead to these four subfamilies. Posterior probabilities are Figure 1 continued on next page more restricted. Deltas are almost completely absent from ecdysozoan species, as we could only find a single member of this class in priapulids (P. caudatus) and none in arthropods or nematodes. Similarly, Deltas are poorly represented in mollusks and, with the available data, absent in annelids. Finally, we could only identify Phi proteins in cephalochordates, hemichordates and echinoderms, indicating that this class might be lost in the lineages of protostomes and olfactores (i.e. vertebrates and urochordates). The AKDF subfamily also includes proteins from the non-bilateral phyla of porifera, placozoa and cnidarian. The exact organization of these proteins into classes is not as straightforward as for bilateral proteins. The Bayesian and ML analysis only agree in the position of 12 iGluRs from the sponge O. carmela, these would constitute the only clear class in non-bilaterals, which we have termed AKDF-Oca. Another example of a multiple lineage-specific event that occurred during animal evolution of iGluRs can be observed in the evolution of AMPA and Kainate proteins among protostomes. The general iGluRs phylogeny ( Figure 1) suggests that ecdysozoan species have expanded their repertoire of Kainate subunits when compared with lophotrochozoans (e.g. mollusks, annelids), since C. teleta and L. gigantea only presents one and two genes coding for Kainate receptors, respectively. Contrarily, we found more AMPA subunits in lophotrochozoans than in ecdysozoan species. To investigate whether the two protostome lineages have alternatively expanded genes coding for AMPA or Kainate subunits we conducted a phylogenetic analysis of these two classes using eight species of ecdysozoans and seven of lophotrochozoans with well-characterized genomes (Figure 3 shown at tree nodes and protein names at the end of each branch. Tree branches are colored based on phylum, as indicated in the legend. For unreported phylogenetic groups, names of proteins predicted to bind glycine or glutamate are highlighted in yellow or orange, respectively. Protein names from non-vertebrate species are composed of four parts: (i) 'GluR#', where # is a code denoting class or subfamily (A, AMPA; K, Kainate; F, Phi; D, Delta; Akdf, AKDF; E, Epsilon; N, NMDA and L, Lambda); (ii) a number, or range of numbers, denoting orthologous vertebrate protein(s), if any; (iii) a Greek letter to identify non-vertebrate paralogs, if any and (iv) a three-letter species code. iGluRs from A. thaliana were used as an outgroup. All information on species and proteins used is given in Figure 1-source data 2. Phylogenetic reconstruction was performed using Bayesian inference. The amino acid substitution model used was Vt + G + F, number of generations: 14269000, final standard deviation: 0.007016 and potential scale reduction factor (PSRF): 1.000. Scale bar denotes number of amino acid substitutions per site. Although the GluAkdf2_Tad protein localizes to the Delta class in this tree, we do not consider this molecule as a confident member of this class. This is because the statistical support provided by the Bayesian analysis is low and because the Maximum-likelihood analysis (see    . Nematodes were left out of the analysis as they lack Kainate receptors (Brockie et al., 2001). This analysis retrieved 40 lophotrochozoan genes coding for AMPA subunits but only 15 coding for Kainates. The opposite scenario was observed in the genomes of ecdysozoan species, with 10 AMAP and 40 Kainate proteins,. Yet, among ecdysozoans the priapulid P. caudatus has two AMPA and two Kainate subunits, indicating that the expansion of Kainate receptors might be exclusive to arthropods. Overall the AMPA:Kainate ratio resulted to be around 1:4 in ecdysozoans and 4:1 in lophotrochozoans.

Sequence conservation and ligand specificity of unreported iGluR phylogenetic groups
All proteins from unreported groups (i.e. subfamilies and classes) present well-conserved sequences in iGluR domains, including transmembrane domains or residues involved in receptor tetramerization (Figure 1-figure supplement 2 and Figure 1-source data 1). Three-dimensional (3D) models of two Epsilon subunits from amphioxus (GluE1 and GluE7) indicate that their general fold is well preserved ( Figure 1-figure supplement 3a). The only noticeable distinction in proteins from these groups is an insertion in the intracellular loop between the first and second transmembrane domains in Epsilon proteins. This insertion is particularly distinct in ctenophore iGluRs, having been termed as the cysteine-rich loop (Alberstein et al., 2015) ( Figure 1-figure supplement 4). We have also identified a sequence difference among Epsilon proteins. Ctenophore iGluRs have two cysteines that form a disulfide bond at loop 1 of the ligand binding domain (Alberstein et al., 2015), which are also present in NMDA proteins. Nevertheless, this element is absent from the remaining members of the Epsilon subfamily.
The 'SYTANLAAF' motif, essential for channel gating (Traynelis et al., 2010), is also well conserved in most sequences, in particular the second, fourth and fifth residues (Figure 1-figure supplement 2). Nevertheless, all members of the Lambda subfamily and some proteins of the Phi class present lower levels of conservation in this sequence. Whether these changes have a functional impact is something that will require further investigation. The Q/R site (Q586, residue numbering according to mature rat GluA2) and the acidic residue located four positions downstream D/E590 (Figure 1-figure supplement 4) are involved in calcium permeability and polyamine block of AMPA and Kainate receptors (Bowie and Mayer, 1995;Koh et al., 1995;Kamboj et al., 1995). Of these two positions the latter is much better conserved, especially outside ctenophores and the Lambda subfamily. We have identified an acidic residue at position 590 in 84 out of 122 iGluRs from unreported groups, including cnidarian NMDAs. Yet, only 1/3 of these proteins present a glutamine (Q) at position 586. This includes most AKDFs and Epsilon proteins from non-ctenophores, contrarily, none of the Phi subunits presents a Q586.
The key ligand binding residues involved in fixing the amino acid backbone (aÀamino and aÀcarboxyl) are Arg485 and an acidic residue at position 705 (Naur et al., 2007;Armstrong and Gouaux, 2000;Mayer, 2005;Furukawa et al., 2005;Yao et al., 2008). These two positions are well conserved in 94 of the 122 proteins from unreported groups, suggesting that their endogenous ligand is an amino acid (see  Residues involved in ligand selectivity show higher variability. These are located at positions 653 and 655, and are occupied by glycine and threonine in glutamate-binding proteins and by serine and a non-polar residue in glycine-binding iGluRs. However, a recent study of ctenophore receptors has found that position 653 can be occupied by serine or threonine in glutamate-binding iGluRs, and by an arginine in glycine-binding subunits (Alberstein et al., 2015). Based on this previous knowledge we have predicted the ligand specificities of most previously unreported receptors. The preferred ligand could be confidently predicted for 72 out of the 94 proteins with well-conserved residues involved in fixing the amino acid backbone.
Interestingly, all unreported groups comprise glycine-and glutamate-specific iGluRs. Gly-specific receptors slightly outnumber those predicted to respond to glutamate (overall ratio about 3:2). The Lambda subfamily would include three proteins specific for glutamate and one for glycine, while seven remain with an unknown selectivity. Of note, the protein predicted to bind glycine (Glu-L5_Oca) displays an arginine at position 653, a feature which had only been reported in ctenophores (Alberstein et al., 2015). This residue would form a salt bridge with Glu423, which is key for glycine selectivity in ctenophores (Alberstein et al., 2015). Most Epsilon and AKDF proteins would preferably bind glycine, although ctenophores present a similar number of Epsilon receptors predicted to respond to glycine or glutamate ( Figure 1) (Alberstein et al., 2015). In the Phi class we also found a similar number of receptors binding glycine and glutamate. Finally, we could only predict binding specificity for two of the 9 NMDA-Cnidaria proteins, as they present many changes in the residues involved in either amino acid backbone binding or side chain recognition.
Interestingly, the 22 proteins for which we could not confidently predict their ligand selectivity (Figure 1-figure supplement 5), present a limited number of residues occupying position 653 and 655, suggesting constrained evolution. Of these: (i) nine present residues with negative polarity at both positions, being candidates to bind glutamate, (ii) six present a Gly653 and a non-polar residue at position 655, and thus are candidates to bind glycine, (iii) five proteins, all from the Branchiostoma genus, present a tyrosine at position 653. A structural model of one of these receptors, GluE7 (Figure 1-figure supplement 3c), shows that a Tyr653 aromatic side chain would occupy the ligandbinding pocket, strongly suggesting that amino acid binding would be blocked. Finally, (iv) two proteins present a phenylalanine in either of the two positions and remain unclassified.
Epsilon and Phi iGluR proteins are highly expressed in the nervous system and traffic to the plasma membrane We used quantitative PCR (qPCR) to investigate gene expression levels of all iGluR subunits identified in B. lanceolatum, including those from the Epsilon and Phi groups. All 24 B. lanceolatum iGluR subunits identified in silico were found expressed in amphioxus, with the exception of Grie5 ( Figure 4a). Furthermore, they all showed a significantly higher expression in the nerve cord as compared to the whole body, suggesting tissue-enriched expression. While we observed low expression levels for Epsilon genes coding for subunits with a tyrosine at position 653 (Grie5-8), which according to the 3D model would block the ligand-binding pocket, the expression of Grif1-2, also presenting the same tyrosine, reach much higher levels, comparable to those of subunits from the Kainate, Figure 3 continued paralogues, if any and (iv) a three-letter species code. GluN1s from chordates were used as an outgroup. All information on species and proteins used in this phylogeny is given in Figure 3-source data 2. Phylogenetic reconstruction was performed using Bayesian inference. The amino acid substitution model used was Vt + I + G, number of generations: 8868000, final standard deviation: 0.0072 and potential scale reduction factor (PSRF):   Delta or NMDA classes. Thus, the presence of a tyrosine at position 653 does not appear to be directly correlated with low expression levels. Amphioxus genes coding for GluE1 and GluE7 were synthesized in vitro and transiently expressed in HEK293T cells for functional studies. Wild-type GluE1 and GluE7, which are not predicted to have a canonical signal peptide by SignalP 4.1 (Nielsen, 2017), expressed well but were not trafficked to the plasma membrane (Figure 4-figure supplement 1a-d), even though residues involved in tetramerization (Salussolia et al., 2013) are well conserved (Figure 4b). We thus synthesized new variants of these genes with the signal peptide from rat GluA2 (Figure 1-figure supplement 1cd). These constructs also expressed well ( Figure 4c) and now were efficiently trafficked to the plasma membrane, as indicated by the staining observed in non-permeabilized cells (Figure 4d). Furthermore, analysis of receptor oligomerization, performed using non-denaturing gel electrophoresis and immunoblot, clearly indicates that both proteins form homotetramers in vitro (Figure 4e).

Ligand specificity and electrophysiological properties of Epsilon proteins from amphioxus
We next investigated the gating properties of two Epsilon proteins from amphioxus, GluE1 and GluE7. The presence of a serine and a tryptophan at positions 653 and 704, respectively, suggested that GluE1 would bind glycine. Indeed, neither glutamate nor aspartate elicited a response in our experimental settings. Instead, glycine application was able to elicit an inward whole-cell current at a membrane potential of À60 mV ( Figure 5a). Interestingly, the chemically related amino acids alanine and D-serine only generated very low responses, indicating a high selectivity of the GluE1 homotetramer for glycine.
The Epsilon receptor displayed a strong inward rectification, even in the absence of added polyamines in the intracellular solution (Figure 5b,c). This behavior is characteristic of unedited AMPA and Kainate receptors displaying a glutamine (Q) and an acidic residue at positions 586 and 590, respectively (Bowie and Mayer, 1995;Koh et al., 1995;Kamboj et al., 1995) and GluE1 presents a glutamine and an aspartic acid at these positions ( Figure 1-figure supplement 4). Glycine-mediated currents showed a slow rate of recovery from desensitization when compared with AMPA or Kainate mammalian receptors, requiring 20-25 seconds until a complete recovery was achieved and a full response of the same magnitude could be recorded (Figure 5d,e). Similar observations have been made with ctenophore receptors activated by glycine in which the recovery from desensitization has an unusually long time constant of 81 seconds (Alberstein et al., 2015).
Finally, functional studies on receptors formed by GluE7 did not retrieve any positive results. None of the following amino acids: glutamate, aspartate, asparagine, glycine, alanine or D-serine elicited a response in our experimental system. We hypothesize that, as predicted by the 3D model, the presence of a tyrosine at position 653 renders a homomeric form of this receptor unable to function as an amino acid-gated ion channel. Phylogenetics of metazoan metabotropic glutamate receptors reveals a sister group of classes I to III We next performed a phylogenetic study of metabotropic glutamate receptors (Figure 6 and Figure 6-figure supplement 1). This analysis has revealed that the three historical mGluR classes (I to III) have a sister group. Following the current nomenclature we have named this as class IV. The existence of this class had already been proposed on the bases of three insect proteins (Mitri et al., 2004). Yet, here we show that this class is actually present in all bilateral phyla, excluding vertebrates. Furthermore, we also show that class IV appeared together with classes I-III before radiation of bilateral lineages. We have identified clear orthologues to class I-IV in porifers, placozoans and   Figure 6. Bayesian phylogeny of metazoan metabotropic glutamate receptors. Identified metabotropic glutamate receptor classes from bilateral and non-bilateral organisms are indicated by colored boxes at the right. Dashed boxes further highlight individual classes from bilateral organism. Posterior probabilities are shown at tree nodes and protein names at the end of each branch. Tree branches are colored based on phylum, as indicated in the legend. Protein names from non-vertebrate species are composed of four parts: (i) 'mGluR', followed by a number, or range of numbers, denoting Figure 6 continued on next page cnidarians but not in ctenophores. These are organized into four classes, two from cnidarians, and one from placozoans and porifers ( Figure 6). We have also identified non-bilaterian mGluRs that fall outside the above-mentioned classes. Unfortunately, the Bayesian and ML phylogenies do not agree on the exact organization of these early divergent mGluRs, except for the fact that they diverge prior to bilaterian classes. For this reason we have left these sequences unclassified. Whether these sequences belong to one, or even multiple classes that would have been lost in bilateral organisms is something that will require further investigation. Although all class IV proteins show well conserved sequences overall ( Figure 6-figure supplement 2a, Figure 6-figure supplement 3 and Figure 6-source data 1), two residues critical for glutamate binding, Arg78 and Lys409, are non-conservatively replaced by non-polar or acidic residues in all class IV proteins identified ( Figure 6-figure supplement 2a, residue numbering corresponds to human mGluR1). These changes are predicted to hamper glutamate binding and, indeed, functional studies of a class IV receptor from fruit fly indicated that it does not respond to this amino acid (Mitri et al., 2004). All class IV proteins would share this feature. On the other hand, residues involved in contacts with the amino acid backbone are well conserved ( Figure 6-figure supplement 2a), suggesting that these proteins might bind an amino acid other than glutamate. Similarly, mGluR residues from most non-bilaterian sequences involved in binding the amino acid backbone are highly conserved. Among non-bilaterian proteins the residues involved in glutamate binding are only conserved in approximately half of the proteins from classes orthologous to I-II-III-IV. Finally, we investigated mGluRs expression in amphioxus following the same procedure described for iGluRs. All five amphioxus mGluRs showed an enriched expression in the nerve cord, including the two class IV genes. Noticeably, these two genes showed significantly higher expression levels than orthologues of vertebrate classes ( Figure 6-figure supplement 2b).

Discussion
We have performed what to our knowledge is the most comprehensive phylogenetic study of metazoan glutamate receptors. This has revealed that their evolutionary history is much more complex than what is currently acknowledged, especially for the family of iGluRs. Our study has also revealed the existence of unreported phylogenetic groups in both ionotropic and metabotropic glutamate Figure 6 continued orthologous vertebrate protein(s), if any (for Class IV and group I-II-III-IV proteins, the name is followed by the name of the class/group); (ii) a Greek letter to identify non-vertebrate paralogs, if any and (iv) a three-letter species code. GABA-B receptors from vertebrates were used as an outgroup. All information on species and proteins used in this phylogeny is given in Figure 6-source data 2. Phylogenetic reconstruction was performed using Bayesian inference. The amino acid substitution model used was WAG + I + G + F, number of generations: 5327000, final standard deviation: 0.004788 and potential scale reduction factor ( receptors. Importantly, our data indicate that the evolution of glutamate receptors has not occurred in an unequivocal incremental manner only in those clades with more elaborated neural systems, but it has rather followed an scattered lineage-specific evolutionary history. This means that certain lineages have experienced the gain, loss, expansion or reduction of specific phylogenetic groups. Our phylogenetic analysis indicates that the family of iGluRs is actually divided into four unreported subfamilies that we have termed Lambda, Epsilon, NMDA and AKDF. Interestingly, this general organization was already present in the last common ancestor of all metazoans and later duplications within NMDA and AKDF subfamilies resulted in the formation of well-known iGluR classes. The other two subfamilies are absent from the majority of model species used in neuroscience research. The NMDA subfamily diversified into classes NMDA1-3 but also into the NMDA2/3 and NMDA-Cnidaria. Similarly, the AKDF subfamily diversified into the AMPA, Kainate and Delta classes, but also into the previously unreported Phi class. We have also identified and AKDF class exclusive to porifers, represented by sequences form O. carmela. Most well-studied iGluR classes are the result of duplications in ancestors of current bilateral species, >650 million years ago (mya) (Kumar et al., 2017), only class NMDA1 originated earlier, as cnidarians present members within this class. The Epsilon subfamily, which includes all iGluRs from ctenophores, is the only subfamily present in all non-bilateral phyla investigated, including sponges. It is thus the subfamily presenting a larger phylogenetic spread, as it is also present in hemichordates and in non-vertebrate chordates. On the other hand, the unreported Phi class shows a more restricted phylogenetic spread, as it is present only in three deuterostome phyla. Moreover, Lambda proteins seem restricted to Porifers, which constitutes an interesting evolutionary case due to maintenance of a glutamate receptor family in a phylum without nervous system.
The phylogenetic analysis of metabotropic glutamate receptors has allowed us to unambiguously establish the existence of a sister group to the well-known classes I, II and III. Following the present nomenclature we have named this as class IV. This class had been previously proposed based on the identification of three insect mGluRs that did not cluster with members of known classes (Mitri et al., 2004). Here we show that class IV is not restricted to insects, but is actually present in all bilaterian phyla investigated, with the exception of vertebrates where this class has been lost. Interestingly, as it occurs for most well-known iGluR classes, mGluR classes I-IV appeared simultaneously in the ancestor of bilaterals. Our phylogenetic analysis also indicates that the non-bilateral phyla of cnidarians, placozoans and porifers present clear orthologues to classes I-IV, which are organized into four classes, while we failed to find any in the early-branching ctenophores. Finally, we were unable to confidently classify many non-bilateral mGluRs, which might constitute one or more classes.
We have identified many examples of lineage-specific evolutionary events. These would antagonize with a model in which species with less elaborated nervous systems would present GluR families with lower complexity. The most noticeable examples are: (i) the absence of all subfamilies but Epsilon in analyzed ctenophores, (ii) the loss of Delta receptors from arthropods, nematodes and annelid species investigated, (iii) the loss of the Epsilon subfamily in vertebrates, echinoderms and protostomes, (iv) the loss of the Phi class in vertebrates and studied protostomes, (v) the specific expansion of Kainate receptors in arthropods, which contrasts with the expansion of AMPA receptors in its sister lineages of mollusks and annelids, (vi) the large expansion of the Epsilon subfamily in ctenophores, placozoans and cephalochordates and, finally (vii) the loss of mGluR class IV in vertebrates.
Along the same line, it is interesting to note that amphioxus (B. belcheri and B. lanceolatum), with a simple nervous system, have over 20 genes encoding iGluRs, while mammals have 18. Other nonvertebrate species also present large numbers of iGluRs, including the 19 iGluRs identified in the sponge O. carmela or the 17 present in the ctenophore M. leidyi, to mention a few. Similarly, the cnidarian A. digitifera and the ctenophore M. leidyi have seven mGluRs each, while the placozoan T. adhaerens presents eleven, three more than the eight mGluRs found in the human genome. The large number of GluRs found in many non-vertebrate animals suggests that there has been an evolutionary trend to increase their number in many metazoan lineages.
Our experimental results suggest that unreported receptors would play a role in the nervous system, as Epsilon, Phi and mGluR class IV genes are highly expressed in the nerve cord of amphioxus. Nevertheless, whether all these proteins are expressed at the synapse and act as neurotransmitter receptors is an issue that will require further investigation. Their presence in other tissues, such as sensory organs, cannot be ruled out. Those receptors showing more divergent sequences, particularly in residues involved in ligand binding, might respond to other molecules. For instance, they could behave as chemoreceptors, as it is the case of antennal receptors found in insects (Croset et al., 2010;Benton et al., 2009).
Proteins from all unreported groups generally present a good conservation of residues involved in binding the amino acid backbone, indicating that their ligand would be an amino acid or a closely related molecule. Interestingly, we could identify proteins predicted to bind either glycine or glutamate in all unreported iGluR subfamilies and classes. If our functional predictions are correct, the ability to recognize one or the other amino acid would have emerged repeatedly in all unreported iGluR phylogenetic groups. Unexpectedly, the nature of the residues conferring amino acid specificity indicates that only a minority of proteins from unreported GluR groups would respond to glutamate. Sequence analysis and structural considerations strongly suggest that class IV mGluRs will not bind glutamate and that among non-bilateral mGluRs only a minority, belonging to classes orthologous to I-II-III-IV, are predicted to bind to this neurotransmitter. Similarly, among unreported iGluR groups, the number of proteins binding glycine outnumbers those binding glutamate. Interestingly, we report a glycine-binding poriferan protein (GluL5_Oca) with a structural feature that had only been reported in ctenophores (Alberstein et al., 2015). This is an Arg653 that through establishing a salt bridge with Glu423 confers glycine specificity (Alberstein et al., 2015). We thus report that this structural element is not exclusive to ctenophores. We have also identified iGluR subunits with important changes in critical ligand binding residues, indicating that they might have evolved new biological functions, for example, response to other, as yet unidentified small molecules.
The activation of Epsilon receptors by glycine has been experimentally corroborated by electrophysiological analysis of homotetrameric receptors composed by GluE7 from M. leidy (Alberstein et al., 2015) and GluE1 from amphioxus (this study). In our hands the amphioxus receptor showed a very high selectivity for glycine, since ion currents could not be elicited by chemically related amino acids such as serine or alanine. Glycine-binding Epsilon subunits from phyla other than ctenophores present structural features similar to those from glycine-binding iGluRs in vertebrates. The greater number of glycine receptors found in non-vertebrate species could be related to the higher abundance of this amino acid in their nerve cord as compared with the mammalian brain (Pascual-Anaya and D' Aniello, 2006).
Altogether, our phylogenetic analysis and experimental findings have uncovered the complex evolution of glutamate receptors within the metazoan kingdom. Our data indicate that the classification of iGluRs is not restricted to the six classes currently recognized. Instead, iGluRs are organized into four subfamilies: Lambda, Epsilon, NMDA and AKDF and ten classes with varying phylogenetic spread. With the data available, the NMDA subfamily is organized into classes NMDA1, NMDA 2, NMDA3, NMDA-Cnidaria and NMDA2/3, while subfamily AKDF contains classes AMPA, Kainate, Delta, Phi and AKDF-Oca. Both NMDA2/3 and AKDF-Oca are represented by sequences from only one species, further sequencing of non-bilateral species will be required to fully demonstrate their existence. Furthermore, the evolution of mGluRs has generated a sister group to classes I, II and III, class IV. We have also identified classes of non-bilaterian mGluRs orthologous to I-II-III-IV. We propose that the classification of these two families of GluRs, key to the physiology of the nervous system, has to be updated to include our findings.

Identification of genes coding for members of glutamate receptor families in metazoan genomes
Phylogenetic analysis were performed with sequences from at least two species from each of the following metazoan phyla: Porifera, Ctenophora, Placozoa, Cnidaria, Lophotrochozoa, Ecdysozoa, Hemichordata, Chordata and Vertebrata, with the exception of placozoans for which only one species is available. When possible, we chose slowly evolving species. The complete lists of species used for iGluR phylogenies are given in Figure 1-source data 2. Species used in the phylogeny of metabotropic glutamate receptors are listed in Figure 6-source data 2. Sponge sequences were taken from (Riesgo et al., 2014), B. lanceolatum sequences were retrieved from unpublished genomic and transcriptomic databases (access was kindly provided by the Mediterranean Amphioxus Genome Consortium), A. digitifera and P. flava sequences were obtained from the Marine Genomics Unit (Simakov et al., 2015;Shinzato et al., 2011) and P. bachei sequences from NeuroBase (Moroz et al., 2014). GluR sequences were identified using homology-based searches in a two-tier approach. Mouse glutamate receptors were used as search queries (iGluRs: Gria1-4; Grik1-5; Grid1-2, Grin1, Grin2A-D and Grin-3A-B; mGluRs: mGluR1-8). In a first search GluR homologs were identified using the BLASTP tool (Altschul et al., 1990) with default parameters. Subject sequences with an E-value below 0.05 were selected as candidate homologs. These were re-blasted against the NCBI database of 'non-redundant protein sequences' using the same BLAST tool. If the first hit obtained in the reciprocal BLAST was a glutamate receptor the sequence was included in the phylogenetic analysis. In a second stage the same mouse sequences were used to perform TBLASTN searches against genomic and, when available, transcriptomic databases. Subject sequences not identified in the first tear and having an E-value below 0.05 were selected as candidate homologs. These were re-blasted using BLASTX against the NCBI 'non-redundant protein sequences' database. Finally, if the first hit of this search was a glutamate receptor the sequence was also included in the phylogenetic analysis. Identified iGluR sequences in which less than four residues of the SYTANLAAF motif (Traynelis et al., 2010) were conserved were not considered for the final phylogenetic analysis. mGluR sequences lacking two or more of the seven transmembrane regions were also discarded. The complete reference lists of all iGluRs used in the final phylogeny are given in files Figure 1source data 2. The reference list of metabotropic glutamate receptors is presented in Figure 6source data 2. The alignments used for the phylogenetic analysis of iGluRs, mGluRs and AMPAs and Kainates from protostomes are provided in Figure 1-source data 3, Figure 3-source data 1 and Figure 6-source data 3.

Phylogenetic analyses
The iGluR tree was constructed with 224 sequences identified in 26 non-vertebrate species (Figure 1-source data 2). The tree also included 18 iGluR sequences from vertebrates and two iGluR proteins from A. thaliana, used as an outgroup (Chiu et al., 2002). The phylogenetic analysis of AMPA and Kainate classes in protostomes was inferred using 110 sequences from 15 protostome species (Figure 3-source data 2) and 37 sequences from deuterostomes, of which 4 GluN1 proteins were used as an outgroup. The mGluR tree was constructed with 149 proteins from 29 non-vertebrate species, 38 mGluRs from vertebrate species and 10 sequences from vertebrate metabotropic GABA receptors, used as an outgroup ( Figure 6-source data 2).
Protein sequences were aligned with the MUSCLE algorithm (Edgar, 2004), included in the software package MEGA6 (Tamura et al., 2013) with default parameters. ProtTest v3.4.2 was used to establish the best evolutionary model (Darriba et al., 2011). Trees were constructed using MrBayes v3.2.6 (Ronquist et al., 2012) for Bayesian inference and IQ-TREE (Nguyen et al., 2015) for Maximum-likelihood analysis. For Bayesian inference phylogenies were stopped when standard deviation was below 0.01 and its value was fluctuating but not decreasing. Markov chain Monte Carlo (MCMC) was used to approximate the posterior probability of the Bayesian trees. Bayesian analyses included two independent MCMC runs, each using four parallel chains composed of three heated and one cold chain. Twenty-five % of initial trees were discarded as burn-in. Convergence was assessed when potential scale reduction factor (PSRF) value was between 1.002 and 1.000. In Maximum-likelihood analysis the starting tree was estimated using a neighbor-joining method and branch support was obtained after 1000 iterations of ultrafast bootstrapping (Hoang et al., 2018). Gene/protein names were given based on their position in the tree. Phylogenetic trees were rendered using FigTree (http://tree.bio.ed.ac.uk/software/figtree/). Phylogenetic calculations were performed at the IBB -UAB heterogeneous computer cluster 'Celler' and at the CIPRES science gateway (RRID: SCR_ 008439) (Miller et al., 2010).

Collection and housing of animals
Branchiostoma lanceolatum adults were collected in the bay of Argelè s-sur-Mer, France (latitude 423 2' 53' N and longitude 3˚03' 27' E) with a specific permission delivered by the Prefect of Region Provence Alpes Cô te d'Azur. B. lanceolatum is not a protected species. Animals were kept in tanks with seawater at 17˚C under natural photoperiod.

RNA isolation, cDNA synthesis and quantitative gene expression (qPCR)
Adult amphioxus (B. lanceolatum) were anesthetized in 0.1% diethyl pyrocarbonate (DEPC; Sigma, D5758) PBS buffer. Animals were sacrificed by cutting the most anterior part of the body. The nerve chord was surgically extracted from the animal while submerged in DEPC-PBS using a magnifying glass. Individual nerve chords were snap frozen in liquid nitrogen and stored at À80˚C until use. RNA was extracted from whole animals or from dissected nerve chords. Ten nerve chords were used for each RNA extraction, so that biological variability between individuals could be normalized. The tissue was homogenized in 1 mL of TRI Reagent (Sigma, T9424) using a Polytron homogenizer. Homogenates were transferred into an Eppendorf tube and incubated 5 min at room temperature (RT) before adding 100 mL of 1-bromo-3-cloropropane. Tubes were vigorously mixed by vortexing for 10-15 s, incubated 15 min at RT and centrifuged at 13000 rpm for 15 min at 4˚C. RNA was precipitated from the aqueous phase with 500 mL of isopropanol and 20 mg of glycogen. Tubes were frozen for 1 hr at À80˚C and then thawed, incubated at RT for 10 min and centrifuged at 13000 rpm for 10 min at 4˚C. The RNA pellet was washed twice with 500 mL of 75% ethanol and air-dried. cDNA was synthesized from 0.5 mg of total RNA. One mL of Oligo(dT)15 (Promega), 1 mL of 10 mM dNTP mix (Biotools), RNA and DEPC distilled water were mixed in a PCR tube to a final volume of 14 mL. This mix was incubated at 65˚C for 5 min in a T100 Thermal Cycler (BioRad). After cooling tubes on ice for 1 min, we added 4 mL of First Strand 5x buffer, 1 mL of 0.1 M DTT and 1 mL of Super-Script III (Invitrogen). Tubes were placed in a T100 Thermal Cycler (BioRad) with the following program: 60 min at 50˚C, 15 min at 70˚C. RNA expression levels were determined using qPCR and the GAPDH gene used as a reference. Primers used for qPCR analysis of iGluRs are in Figure 4-figure supplement 2 and those used for mGluR qPCR in Figure 6-figure supplement 4. qPCR data for iGluRs and mGluRs are given in Figure 4-source data 1 and Figure 6-source data 4, respectively.
cDNA from nerve chord and whole body samples was diluted 1:10 for the glutamate receptor gene reactions, and 1:100 for the reference gene reaction. For each gene 2.5 mL of diluted cDNA were added to 5 mL of iTaq Universal SYBR Green Supermix (Bio-Rad), along with 0.5 mL of each primer and 1.5 mL of RNase free water. qPCR was run in a C1000 Touch thermocycler combined with the optic module CFX96. Three technical replicates were performed for all genes analyzed. Primer pairs were designed to detect the expression levels of each glutamate receptor (Figure 4-figure supplement 2 and Figure 6-figure supplement 4). B. belcheri glutamate receptor sequences were aligned with the genomic sequence of B. lanceolatum, and high identity fragments were used to design primers. All primers were 20-25 base pair long, had GC content over 40-45% and a Tm between 60-65˚C. Primers were designed to obtain amplicons between 140-270 base pairs. Values of normalized expression were statistically analyzed using GraphPad Prism5. No outliers were identified and no data points were excluded. Comparisons between whole body and nerve chord expression levels were done with Student's T-Test for unpaired samples or the Welch variant of the Student's T-Test for samples with different variance. For multiple comparisons between the expression levels of genes belonging to the same class one-way ANOVA analysis was performed using Tukey's Post-Hoc test.
Grie1 and Grie7 gene synthesis Grie1 and Grie7 genes were selected for transient expression in the mammalian cell line HEK293T. We prepared two constructs for each gene. We first introduced an immuno-tag in the N-terminus before the first element of secondary structure. For Grie1 we used the c-Myc tag, which was placed after residue 39, and for Grie7 we used the hemagglutinin (HA) tag introduced after residue 10 of the wild-type sequence. The second set of constructs prepared substituted the wild type N-terminal sequence for the signal peptide from rat GluA2 while maintaining the immuno-tags (Figure 4-figure supplement 1). Codon-optimized genes for expression in human cells were synthesized and cloned into pICherryNeo (Addgene, 52119) and pIRES2_EGFP (Addgene 6029-1) by the Invitrogen GeneArt Gene Synthesis service.
Expression of GluE1 and GluE7 in HEK293T cells and analysis of plasma membrane trafficking HEK293T cells were maintained in Dulbecco's Modified Eagle Medium (DMEM) supplemented with 10% FBS and 1% Antibotic-Antimycotic (Gibco) in a humidified incubator at 5% CO 2 air and 37˚C. The day before transfection, cells were plated onto poly-D-lysine coated coverslips in 6-well plates, to reach 60-80% confluence. HEK293T cells were transiently transfected with the following plasmids: empty pIRES2-EGFP, pIRES2-EGFP containing the Grie7_Bbe gene, empty pICherryNeo and pICher-ryNeo containing Grie1_Bla. Cells were transfected using 3 mg of polyethylenimine and 1 mg of plasmid DNA for each ml of non-supplemented DMEM. Cells were incubated 4-5 hr with transfection medium without supplementation, which was then removed and replaced by supplemented medium. Twenty-four hours after transfection the medium was removed and cells were washed 3 times with PBS. For surface receptor staining, cells were blocked in 2% BSA in PBS for 10 min at 37˚C, and incubated for 25 min at 37˚C with primary antibodies against HA (Covance, MMS-101P, RRID: AB_291259), c-Myc (Cell Signalling, 2272S, RRID: AB_10692100) or GluA2 (Millipore, MAB397, RRID: AB_2113875). HA and GluA2 antibodies were diluted 1:200 and c-Myc 1:100 in DMEM without supplementation. Cells were washed 3 times with PBS, fixed in 4% paraformaldehyde (PFA) for 15 min at RT, rinsed in PBS and incubated 1 hr at 37˚C with secondary antibodies Alexa Fluor 555 donkey anti-mouse IgG (H + L) (A-31570, Invitrogen, RRID: AB_2536180) and Alexa Fluor 647 goat anti-rabbit IgG (H + L) highly cross-adsorbed (Life Technologies, A-21245, RRID: AB_2535813), diluted 1:1000 and 1:500 in PBS, respectively. Finally, coverslips were washed and mounted onto slides with Fluoroshield with DAPI (Sigma-Aldrich, F6057). For intracellular labeling cells were first fixed in 4% PFA for 15 min at RT, permeabilized with 0.2% Triton X-100 in PBS for 10 min, and finally blocked with PBS containing 2% BSA and 0.2% Triton X-100 for 20 min. Primary antibodies against HA (Covance, MMS-101P, RRID: AB_291259) and GluA2 (Millipore, MAB397, RRID: AB_2113875) were diluted 1:1000 and c-Myc (Cell Signalling, 2272S, RRID: AB_10692100) antibody was prepared at 1:100 in PBS. Incubation lasted 25 min at 37˚C. Secondary antibody incubations and coverslip mounting were done in the same way as for non-permeabilized cells. Cells were examined using a confocal laser-scanning microscope (Zeiss LSM 700) with a 63x oil objective.

Western blot and native gel electrophoresis
HEK293T cells were grown in 6-well plates as described previously and transfected with plasmids expressing amphioxus GluE1, GluE7 or GluA2. Twenty-four hours after transfection cells were rinsed with PBS and the content of 4 wells was resuspended in solubilization buffer (PBS containing 2% N-dodecyl-a-maltopyranoside (DDM; D310HA, Anatrace) and the protease inhibitors mix cOmplete EDTA-free Protease Inhibitor Cocktail, Roche). Cell lysates were homogenized in a Dounce homogenizer in ice with 20 strokes and kept under orbital agitation for 1 hr at 4˚C. Lysates were centrifuged at 89000xg in a Beckman TLA120.2 rotor for 40 min at 4˚C. The supernatant containing solubilized membrane proteins was recovered in a new tube and stored at À20˚C until used.
For denaturing gel electrophoresis (SDS-PAGE) protein lysates were denatured by adding loading sample buffer 10x (500 mM Tris-HCl pH 7.4, 20% SDS, 10% b-mercaptoethanol, 10% glycerol and 0.04% bromophenol blue), and incubated for 5 min at 95˚C. Protein lysates were loaded in a 10% SDS-polyacrylamide gel and separated at a constant current (25 mA). Gels were transferred at a constant voltage of 100 V for 90 min in ice. Membranes were blocked for 1 hr with Odyssey Blocking Buffer in TBS, and incubated overnight at 4˚C with the same primary antibodies at the same dilution as for native gels in TBS containing 0.1% Tween 20. After three 15 min washes in TTBS, membranes were incubated with secondary antibodies as above. Blots were analyzed in an Odyssey scanner.
Whole-cell macroscopic currents were recorded from isolated or coupled pairs of mCherry or EGFP positive HEK293T cells. Rapid application (<1 ms exchange) of agonists (500 ms pulses) at a membrane potential of À60 mV was achieved by means of a theta-barrel tool (1.5 mm o.d.; Sutter Instruments) coupled to a piezoelectric translator (P-601.30; Physik Instrumente). One barrel contained extracellular solution diluted to 96% with H 2 O and the other barrel contained 10 mM of the amino acid solution. For measuring current-voltage relationships, 500 ms agonist jumps were applied at different membrane voltages (À80 mV to +80 mV in 20 mV steps) and peak currents were fitted to a 5 th order polynomial function. To study recovery from desensitization, a two-pulse protocol (500 ms each) was used in which a first pulse was applied followed by a second pulse at different time intervals (from 2.5 s to 25 s). The paired pulses were separated 30-60 s to allow full recovery from desensitization. To estimate the percentage of recovery, the magnitude of peak current at the second pulse (P2) was compared with the first one (P1). Electrophysiological recordings were analyzed using IGOR Pro (Wavemetrics Inc.) with NeuroMatic (Jason Rothman, UCL, RRID: SCR_004186). The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication. Data availability All data generated or analysed during this study are included in the manuscript and supporting files. Source data files have been provided for Figures 1, Figure 1  The following previously published datasets were used: