G protein-coupled receptors: the evolution of structural insight

G protein-coupled receptors (GPCR) comprise a diverse superfamily of over 800 proteins that have gained relevance as biological targets for pharmaceutical drug design. Although these receptors have been investigated for decades, three-dimensional structures of GPCR have only recently become available. In this review, we focus on the technological advancements that have facilitated efforts to gain insights into GPCR structure. Progress in these efforts began with the initial crystal structure determination of rhodopsin (PDB: 1F88) in 2000 and has continued to the most recently published structure of the A 1A R (PDB: 5UEN) in 2017. Numerous experimental developments over the past two decades have opened the door for widespread GPCR structural characterization. These efforts have resulted in the determination of three-dimensional structures for over 40 individual GPCR family members. Herein we present a comprehensive list and comparative analysis of over 180 individual GPCR structures. This includes a summary of different GPCR functional states crystallized with agonists, dual agonists, partial agonists, inverse agonists, antagonists, and allosteric modulators.


Introduction
G protein-coupled receptors (GPCR) comprise a superfamily of over 800 proteins that are the largest family of cell surface receptors in the human genome [1][2][3]. These proteins share a characteristic seven transmembrane spanning, alpha-helical structure [4]. The GPCR superfamily is commonly subdivided, based on sequence comparisons, into five distinct families: Rhodopsin (class A), Adhesion (class B), Secretin (class B), Glutamate (class C), and Frizzled/Taste2 (class F) [3,5]. More details on the sequence-based analyses that led to these phylogenetic divisions are further discussed in the section titled "Phylogenetic classification/structure" in this review. GPCR have been implicated in numerous biological processes such as cognitive responses [6], cardiovascular functions [7], and cancer growth and development [8]. GPCR implication in human disease is reflected by the estimated 50% of pharmaceutical drugs that interact with these receptors [9]. GPCR mediate signal transduction cascades initiated by numerous extracellular molecules through which they produce downstream physiological responses [4,8,10]. However, receptor activity is not solely stimulated by binding of extracellular ligands. Constitutively active basal signaling [11] has been demonstrated in over 60 wild-type GPCR, including ADRB 2 ; A 2A R; and CB 1 [12]. Some GPCR, such as taste receptors [13,14], adopt active state conformations upon interaction with other receptors [15]. Further details regarding GPCR activation can be found in the "GPCR function" section.
Although these receptors have diverse roles in cellular signaling and in physiology and pathophysiology, they share similar structural topology. This topology includes the common structural features shown in figure 1: extracellular loops (EL1-3) and intracellular loops (IL1-3) that alternately connect a characteristic seven transmembrane (7TM) α-helical bundle (figure 1B) [3,4]. Crystallographic approaches, described in more detail in the section titled "Structural characterization", have revealed structural similarities shared by many GPCR, found mainly within the 7TM domain. This structural homology, described in more detail in later sections titled by class, suggests similar means for cell signaling. Although GPCR have a shared topology and 7TM structure, there is diversity within the superfamily with regard to sequence composition and length, producing structural variations that are associated with functional specificity in ligand binding or G protein coupling for different types of GPCR. Select structural differences are also described in more detail in later sections titled by class.

GPCR function
Although this review focuses on GPCR structure and structural characterization, GPCR play a significant role in signal transduction cascades that warrants a generalized overview of their functional and regulatory mechanisms. GPCR are a critical mediator in overall cell signaling, both through ligand-dependent and ligand-independent mechanisms. The majority of these receptors relay signals initiated by binding of extracellular ligands. These ligands are either naturally produced (endogenous) or externally administered (exogenous). Ligands are classified as agonists, inverse agonists, or antagonists. Agonists bind to the receptors and induce a conformational change to an active state, which increases signaling effects. Inverse agonists shift the receptor conformational equilibrium toward inactive conformations, thereby inhibiting basal activity. Antagonists, on the other hand, prevent binding of agonists or inverse agonists without affecting the dynamic conformational equilibrium, which prevents agonist-dependent receptor activation [16]. Changes in conformation within the TM domain, in turn, initiate a conformational change in the intracellular region. IL2 and IL3 have been found to contain critical interaction sites for G proteins or other cytoplasmic effectors [17]. This is illustrated in the crystal structure of ADRB 2 -Gα s complex (PDB: 3SN6). Residues located in IL2, TM5, and TM6 of ADRB 2 demonstrate an interaction interface with the α4 helix, α5-helix, αN-β1 junction, and β3 strand of Gα s [18]. In the classical understanding of GPCR signaling, agonist binding promotes the formation of the activated receptor-G protein complex that modulates functional effects. A detailed analysis of G protein structure, regulation, and their involvement in signaling has been effectively summarized by Gilman [19]. 
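The ligand classes described above can be illustrated with the simple two-state receptor model, a standard pharmacological idealization that is not taken from this review. The sketch assumes that the receptor interconverts between an inactive state R and an active state R*, that basal activity is set by an equilibrium constant [R*]/[R], and that a saturating ligand multiplies this constant by a factor alpha reflecting its preference for R*. The names `basal_constant` and `alpha` and all numerical values are hypothetical.

```python
def active_fraction(basal_constant: float, alpha: float = 1.0) -> float:
    """Fraction of receptors in the active state R* at saturating ligand.

    basal_constant: assumed equilibrium constant [R*]/[R] without ligand.
    alpha: assumed factor by which the bound ligand scales that constant.
    """
    if basal_constant < 0 or alpha < 0:
        raise ValueError("equilibrium constants must be non-negative")
    effective = alpha * basal_constant
    return effective / (1.0 + effective)


def classify_ligand(alpha: float, tolerance: float = 1e-9) -> str:
    """Name the ligand class implied by its effect on the equilibrium."""
    if alpha > 1.0 + tolerance:
        return "agonist"          # shifts the equilibrium toward R*
    if alpha < 1.0 - tolerance:
        return "inverse agonist"  # shifts the equilibrium toward R
    return "antagonist"           # leaves the equilibrium unchanged


BASAL = 0.1  # hypothetical constitutive activity (about 9% of receptors active)
for alpha in (100.0, 1.0, 0.01):  # three hypothetical ligands
    print(classify_ligand(alpha), round(active_fraction(BASAL, alpha), 3))
```

In this idealization an antagonist has no effect of its own and acts only by occupying the site, which matches the description of antagonists given above; real receptors populate more than two conformations, so the model is a teaching aid rather than a quantitative description.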
The heterotrimeric G protein complex is made up of α, β, and γ subunits where the α subunit falls into four subfamilies: Gα s , which stimulates adenylyl cyclase activity; Gα i , which inhibits adenylyl cyclase activity; Gα q , which activates phospholipase Cβ; and Gα 12/13 , which promotes Rho activation. An activated receptor can interact with heterotrimeric G protein complexes. Receptors function as guanine nucleotide exchange factors (GEF) to promote dissociation of GDP from the Gα subunit in exchange for GTP (figure 2A) [4]. In the activated heterotrimeric G protein complex, the Gα-GTP bound subunit dissociates from the βγ subunit resulting in propagation of signaling cascades. However, subsequent studies have shown some receptors are able to reach an activated state through the formation of dimers and oligomers without agonist binding [4,16]. Alternatively, GPCR activation can occur through interactions with adaptor and scaffolding proteins. Through specific binding sites, adaptor/scaffolding proteins, such as A-kinase anchoring proteins (AKAP) and β-arrestins (figure 2B), mediate interactions between the receptor and downstream second messengers by scaffolding a network of proteins which then operate as a large molecular complex [4]. Furthermore, receptors including the cannabinoid receptor 1 (CB 1 ) exhibit high basal signaling, independent of agonist activation [20]. High basal signaling might be indicative of conformational flexibility [21]; however, a study with a constitutively active ADRB 2 mutant has linked conformational flexibility with structural instability [11].
Receptor signaling is modulated by various mechanisms to control cellular functions. Three such processes are highlighted here: deactivation, desensitization, and receptor internalization. At the receptor level, members of the RGS family of proteins serve as GTPase-activating proteins (GAPs) and accelerate deactivation by enhancing the rate of Gα-catalyzed hydrolysis of GTP to GDP by factors up to 2000-fold (figure 3A) [22]. In other cases of continuous agonist stimulation, the receptor can be phosphorylated by downstream second-messenger protein kinases or a family of G protein-coupled receptor kinases (GRK). Following phosphorylation, β-arrestin is recruited to the receptor leading to the inhibition of receptor-G protein interaction via steric hindrance. The receptor-β-arrestin complex is targeted and removed from the cell surface through endocytosis. The receptor is then either degraded or recycled back to the cell surface in the inactive conformation (figure 3B). Receptor internalization is mediated by β-arrestin, clathrin-coated pits, and caveolae through diverse mechanisms [23]. Further in-depth evaluations of GPCR function can be found in excellent reviews by Pierce, et al. [4] and Syrovatkina, et al. [16].
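To give a sense of scale for the up to 2000-fold acceleration of GTP hydrolysis mentioned above, the following sketch treats hydrolysis as a first-order process, so that the half-life of the Gα-GTP state is ln 2 divided by the rate constant. The first-order treatment and the basal rate constant `BASAL_RATE_PER_MIN` are assumptions made for illustration; only the 2000-fold upper bound comes from the text.

```python
import math

MAX_RGS_FOLD = 2000.0  # upper bound on the acceleration quoted in the text


def half_life(rate_constant: float) -> float:
    """Half-life of a first-order process with the given rate constant."""
    if rate_constant <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2.0) / rate_constant


def accelerated_half_life(basal_rate: float, fold: float = MAX_RGS_FOLD) -> float:
    """Half-life after the rate constant is multiplied by `fold`."""
    return half_life(basal_rate * fold)


BASAL_RATE_PER_MIN = 2.0  # hypothetical intrinsic hydrolysis rate constant
slow = half_life(BASAL_RATE_PER_MIN)              # minutes, without a GAP
fast = accelerated_half_life(BASAL_RATE_PER_MIN)  # minutes, maximal GAP effect
print(f"half-life without GAP: {slow * 60:.1f} s, with GAP: {fast * 60:.4f} s")
```

Because the half-life scales inversely with the rate constant, any fold acceleration shortens the lifetime of the active Gα-GTP state by the same factor, whatever the basal rate is.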
Purifying hydrophobic membrane proteins requires additional steps compared to water-soluble proteins. Prior to purification, membrane proteins must be removed from their native lipid environment. Although solubilizing detergents are required to enable efficient extraction of GPCR from the membranes, it is essential to select a mild detergent that discourages protein denaturation. It is important to note that detergent protocols used for one receptor may not be appropriate for other receptors, thus making detergent screening a critical step in obtaining viable protein. Rhodopsin (PDB: 1F88) was effectively solubilized in a mixed micellar solution containing heptane-1,2,3-triol (HPTO), nonyl glucoside, and zwitterionic lauryldimethylamine-N-oxide (LDAO) [24]. Both ADRB 2 structures from 2007 (PDB: 2RH1 [26], 2R4R/2R4S [27]) were solubilized in dodecylmaltoside. Often, detergents used for extraction are not sufficient to stabilize protein and a detergent exchange is required [31]. This approach was used in the crystallization trials of the ADRB 2 -Gs complex (PDB: 3SN6) where the complex was formed in dodecylmaltoside solution and exchanged into maltose neopentyl glycol detergent (MNG-3) [18].
Crystallization of membrane proteins has been accomplished through the use of different lipid phases including micelles, bicelles, and nanodiscs. These different lipid phases allow for protein stabilization and promote crystal lattice formation [28][29][30]. Detergent micelles surround the hydrophobic GPCR structural regions and allow crystal lattice contacts to form between exposed protein loop structures [28]. Lipid-based bicelles and nanodiscs, on the other hand, act as lipidic mimics of a native bilayer environment. Bicelles are formed via the combination of a detergent or short-chain lipid with a long chain lipid, such as dimyristoyl phosphatidylcholine (DMPC) [29]. A nanodisc is a non-covalently assembled phospholipid bilayer encircled by membrane scaffold proteins, such as plasma lipoproteins [33]. Rhodopsin (PDB: 1F88 [24]) and ADRB 2 (PDB: 2R4R/2R4S [27]) were crystallized in micelles and DMPC/CHAPSO bicelles, respectively. Nanodiscs were used in the crystallization of the nanobody-stabilized ADRB 2 (PDB: 3P0G).
Following many detergent optimization efforts, rhodopsin was crystallized by vapor diffusion [24]. Vapor diffusion forces protein solutions to reach supersaturation prior to crystallization. In this method, protein solutions are mixed with a crystallization solution and positioned in a sitting-drop or hanging-drop orientation in an airtight chamber that also contains the crystallization solution. Equilibration between the drop and the chamber results in the evaporation of volatile species, which diffuse from the drop to the well. This diffusion leads to a slow saturation of protein within the drop allowing the protein-detergent complexes to form crystal lattice contacts with neighboring molecules [34]. In lipidic cubic phase (LCP) methods, proteins are placed in a membrane-like environment where they can diffuse and interact with each other to form crystal lattice contacts on both complementary hydrophobic and hydrophilic regions [31]. Crystals of rhodopsin (PDB: 1F88 [25]) and ADRB 2 (PDB: 2R4R/2R4S [27]) were grown by hanging-drop vapor diffusion, whereas ADRB 2 (PDB: 2RH1 [26]) and ADRB 2 -Gs complex (PDB: 3SN6 [18]) crystallization experiments implemented LCP methods.
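The concentrating effect of vapor diffusion described above can be sketched with simple mass balance. The calculation below is an idealization under stated assumptions, not a protocol from the cited work: the protein stock contains no precipitant, only water moves through the vapor phase, the reservoir is large enough that its composition does not change, and equilibrium is reached when the precipitant concentration in the drop equals that of the reservoir. The function name and the example values are hypothetical.

```python
def vapor_diffusion_endpoint(protein_stock: float,
                             protein_volume: float,
                             reservoir_volume: float) -> tuple[float, float, float]:
    """Return (initial protein conc., final protein conc., final drop volume).

    protein_stock: protein concentration before mixing (e.g. mg/mL).
    protein_volume: volume of protein solution placed in the drop.
    reservoir_volume: volume of crystallization solution added to the drop.
    """
    if min(protein_stock, protein_volume, reservoir_volume) <= 0:
        raise ValueError("all inputs must be positive")
    initial_volume = protein_volume + reservoir_volume
    initial_protein = protein_stock * protein_volume / initial_volume
    # The precipitant in the drop is conserved, so the drop must shrink back
    # to the volume of reservoir solution that was added before its
    # precipitant concentration matches the reservoir again.
    final_volume = reservoir_volume
    final_protein = protein_stock * protein_volume / final_volume
    return initial_protein, final_protein, final_volume


# Hypothetical 1:1 drop: 1 uL of 10 mg/mL protein plus 1 uL of reservoir.
start, end, volume = vapor_diffusion_endpoint(10.0, 1.0, 1.0)
print(f"protein: {start} -> {end} mg/mL in a {volume} uL drop")
```

For a 1:1 drop the protein and precipitant concentrations both double during equilibration, which is how the drop is driven slowly toward supersaturation.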
In determining crystallographic structures, molecular replacement has been successfully used to estimate the phases of an unknown target from the phases of a closely related, previously characterized structure. This approach was implemented in the analysis of the ADRB 2 structure [26]. Multi-wavelength anomalous dispersion (MAD) and single-wavelength anomalous dispersion (SAD) methods have impacted atomic-level structure determinations of GPCR when a heavy atom anomalous scatterer is incorporated into the protein structure. Structural determination using MAD phasing was implemented in the first crystal structure of rhodopsin (PDB: 1F88) [25].
Developments in protein engineering and heterologous protein expression have accelerated structural determinations of GPCR. Site-directed mutagenesis has been beneficial for the stabilization of receptor structure, as well as determination of the functional activity of many GPCR [31]. Mutagenesis has also enabled the introduction of fusion partners into protein sequences. Fusion partners are designed to be highly stable, compact, and easily crystallized. These partners maintain essential surface contacts and increase protein solubility, making them suitable replacements for flexible domains [31,32]. The position of the fusion partner and the number of residues joining the fusion partner to the protein may alter expression and/or stability; nevertheless, optimizing the number of linker residues can minimize adverse effects. A Fab5 epitope was engineered into IL3 of ADRB 2 (PDB: 2R4R/2R4S) by mutagenesis, which allowed a Fab5 monoclonal antibody to bind to the epitope [27]. Fab5 stabilized the receptor conformation and increased the polar surface area necessary for crystal lattice contacts. Since the initial use of the Fab5 antibody, many GPCR have been successfully engineered with fusion partners, accelerating the number of solved crystallographic structures in the last two decades, as illustrated in the accompanying timeline. Although the addition of fusion protein sequences has been effective in the analysis of many GPCR, it is important to note that engineering designs are often specific for one protein and require optimization for each application to new receptor sequences.
GPCR loops and termini, which are characteristically flexible and difficult to crystallize, can result in additional challenges. Truncating or modifying proteins in these regions can reduce flexibility. However, simple truncations can also reduce the polar surface area of the protein, which is a characteristic crucial for forming the crystal lattice contacts required for crystal formation. Rhodopsin has a relatively short IL3 with ~3 amino acids and a C-terminal domain with ~25 amino acids; in contrast, ADRB 2 has a lengthy IL3 with ~28 amino acids and a C-terminal domain of ~70 amino acids [43]. Two early structures of ADRB 2 were engineered differently in these regions to achieve crystal lattice formation during crystallization. In the ADRB 2 -Fab5 complex (PDB: 2R4R/2R4S), truncating the final 48 amino acids in the unstructured C-terminal domain optimized crystal size and uniformity [27]. In another structure of ADRB 2 (PDB: 2RH1), IL3 was completely removed and replaced with the soluble fusion partner, T4 lysozyme, which promoted the growth of diffraction quality crystals [26]. Amino acid truncations have also been found to have a variable effect on protein expression. Truncations at the N-terminus reduced expression. However, replacing the N-terminus with BRIL maintained similar expression as constructs with the complete N-terminus [39]. This was the case with NOP (PDB: 4EA3) where an N-terminal replacement with BRIL and a 31 amino acid deletion at the C-terminus significantly increased protein expression [31,39].
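The two construct designs described above, C-terminal truncation and replacement of a flexible loop with a fusion partner, amount to simple operations on the amino acid sequence. The sketch below shows them on a toy sequence; the function names, the placeholder sequences, and the residue boundaries are hypothetical and do not reproduce the actual ADRB 2 constructs.

```python
def truncate_c_terminus(sequence: str, n_residues: int) -> str:
    """Remove the final `n_residues` amino acids from a sequence."""
    if not 0 <= n_residues <= len(sequence):
        raise ValueError("cannot remove more residues than the sequence has")
    return sequence[:len(sequence) - n_residues]


def replace_segment(sequence: str, first: int, last: int, partner: str) -> str:
    """Replace residues first..last (1-based, inclusive) with `partner`."""
    if not 1 <= first <= last <= len(sequence):
        raise ValueError("segment boundaries fall outside the sequence")
    return sequence[:first - 1] + partner + sequence[last:]


# Toy receptor and fusion partner (placeholders, not real protein sequences).
receptor = "MAAAAAGGGGGLLLLLSSSSSSSS"
fusion_partner = "xxxx"

# Replace the hypothetical flexible loop at residues 7-11, then trim the tail.
with_fusion = replace_segment(receptor, 7, 11, fusion_partner)
construct = truncate_c_terminus(with_fusion, 8)
print(construct)
```

In practice the junction positions and linker lengths are screened experimentally, as noted above, because expression and stability can change with each choice.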
There have been challenges in GPCR structure determination due to their hydrophobic nature and instability outside of the hydrophobic membrane environment [44,45]. Since the first complete x-ray crystallographic structure of a GPCR (bovine rhodopsin; PDB: 1F88) was solved in 2000 [25], over 180 comprehensive structures of GPCR have been deposited in the Protein Data Bank (www.rcsb.org [46]); these are listed in table 1. More than 150 of these structures were co-crystallized with ligands and are described in more detail in table 2. However, fewer than 45 distinct family members are represented out of a superfamily of over 800 proteins. This limited representation indicates that significantly more work is yet to be done.

Phylogenetic classification/structure
Prior to the availability of three-dimensional structural information, GPCR classification methods initially utilized primary amino acid sequences to phylogenetically categorize receptors. This approach ultimately laid the foundation for the most commonly used GPCR classification system. The sequences of seven hydrophobic regions (represented as cylinders in figure 1A) were used to design a fingerprinting method to identify sequences not previously categorized as rhodopsin-like receptors [47]. In the method, an individual conserved hydrophobic region served as a single "feature." Multiple features grouped together comprised a signature "fingerprint" within a sequence. A feature, when used by a scanning algorithm to search a database, is then referred to as a "discriminator" and fingerprints are referred to as "composite discriminators." The search results using the discriminators generate an output hit list for each hydrophobic region. A hit list correlation was used to differentiate between true members, where all features of the fingerprints matched, and noise, where zero, one, or two random matches occur. A second database was built from the sequences resulting from the previous search and scans were repeated with the new discriminators. This process was repeated until the composite discriminator consistently distinguished between true members and noise. As a result, 240 sequences were identified as members of the superfamily. Sequences for pheromone receptors did not match any of the discriminators and cAMP receptors exhibited only two matches, thus falling within the noise [47]. These sequences were previously identified as GPCR [48], suggesting they were members of distantly related families [47]. The hit-list fingerprint correlation was later expanded to distinguish partial matches. Upon this expansion, sequences belonging to the GPCR superfamily increased from 240 to 393. In addition, the pheromone, cAMP, and secretin-like families were established as rhodopsin-like receptors [49].
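The hit-list correlation step of the fingerprinting method can be sketched as follows. This is a deliberately simplified illustration and not the published algorithm: each feature is reduced to an exact substring, whereas the original discriminators were derived from aligned hydrophobic regions, and the iterative rebuilding of the database between scans is not modeled. The motifs and sequences are invented for the example; the thresholds follow the description above (all features matched for a true member, two or fewer for noise).

```python
def count_feature_hits(sequence: str, features: list[str]) -> int:
    """Count how many features of the fingerprint occur in the sequence."""
    return sum(1 for motif in features if motif in sequence)


def classify_sequence(sequence: str, features: list[str], noise_max: int = 2) -> str:
    """Correlate the per-feature hits for one sequence into a single call."""
    hits = count_feature_hits(sequence, features)
    if hits == len(features):
        return "true member"    # every feature of the fingerprint matched
    if hits <= noise_max:
        return "noise"          # zero, one, or two random matches
    return "partial match"      # category added when the method was expanded


# Hypothetical seven-feature fingerprint, one motif per hydrophobic region.
FINGERPRINT = ["AAA", "CCC", "DDD", "EEE", "FFF", "GGG", "HHH"]

DATABASE = {
    "seq_full": "AAAxCCCxDDDxEEExFFFxGGGxHHH",
    "seq_partial": "AAAxCCCxDDDxEEE",
    "seq_noise": "AAAxCCCxQQQ",
}

for name, sequence in DATABASE.items():
    print(name, classify_sequence(sequence, FINGERPRINT))
```

In the method described above, the sequences classified as true members would then seed a new database from which refined discriminators are built, and the scan would be repeated until the classification stops changing.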
The rhodopsin-like family rapidly became overly complex as more diverse receptors were discovered, leading to the establishment of the GPCR superfamily and formation of the 'class' system. This A-F system includes GPCR found in both vertebrates and invertebrates [3]. In 2001, the first draft of the human genome became available [1,2] and allowed further GPCR to be classified using the most prevalent classification system with most of the human GPCR categorized into five families shown in italics: Glutamate (class C), Rhodopsin (class A), Adhesion (class B), Frizzled/Taste2 (class F), and Secretin (class B); also recognized as GRAFS [3].
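As a compact restatement of the classification above, the family-to-class assignments can be written as a small lookup table from which the GRAFS acronym follows directly. The variable and function names are illustrative only; the assignments are those listed in the text.

```python
# Family name -> class letter, in the order that spells the acronym.
GRAFS_FAMILIES = {
    "Glutamate": "C",
    "Rhodopsin": "A",
    "Adhesion": "B",
    "Frizzled/Taste2": "F",
    "Secretin": "B",
}


def acronym(families: dict[str, str]) -> str:
    """Build the acronym from the first letter of each family name."""
    return "".join(name[0] for name in families)


def families_in_class(families: dict[str, str], class_letter: str) -> list[str]:
    """List the families assigned to one class, sorted by name."""
    return sorted(name for name, letter in families.items() if letter == class_letter)


print(acronym(GRAFS_FAMILIES))
print(families_in_class(GRAFS_FAMILIES, "B"))
```

The table makes explicit that five families map onto four class letters, because the Adhesion and Secretin families share class B.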

Rhodopsin (Class A)
The Rhodopsin (class A) family, the largest of the GPCR classes [3], is divided into α, β, γ, and δ subgroups [5]. The α-subgroup is the largest group in class A and is comprised of the prostaglandin, amine, opsin, melatonin, and MECA receptor clusters [3]. Generally, ligands bind to these receptors within a pocket inside the transmembrane cavity involving residues contained in TM3, TM5, and TM6 [50-53]. An example of this target binding domain can be found in the ADRB 2 structure in complex with carazolol (PDB: 2RH1) [26]. Biogenic monoamine receptors, such as the adrenergic; cannabinoid; muscarinic; serotonin; dopamine; and histamine receptors, are important drug targets for cardiovascular drugs, antipsychotics, and anti-histamines [54,55]. Novel drugs that lack receptor specificity for particular amine receptors pose a risk for cardiovascular side effects due to off-target interactions with adrenergic receptors that show expression in many tissues [56]. Since the α-subgroup receptors outside the amine receptor subset are divergent from the amine receptors, side effects of novel aminergic drugs are most likely to occur only through the amine subset [5].
The β-subgroup of the Rhodopsin (class A) family has no branching subgroups and includes the hypocretin receptor, neuropeptide FF receptor, tachykinin receptor, cholecystokinin receptor, neuropeptide Y receptor, endothelin-related peptide receptor, gastrin-releasing peptide receptor, neuromedin receptor, thyrotropin releasing hormone receptor, ghrelin receptor, arginine vasopressin/oxytocin receptor, gonadotropin-releasing hormone receptor, and orphan receptors. Orphan receptors have been established as GPCR based on DNA sequence but have unidentified endogenous ligands [57]. Members of the β-subgroup mainly bind peptides with a high specificity-binding profile and are pursued as drug targets for treatments including pulmonary arterial hypertension [58] and hormone-related cancer [59]. Agonist drug design for β-subgroup members has been a challenge since it is difficult to design novel ligands that are flexible enough to mimic the magnitude of interaction sites engaged by endogenous peptide agonists [3].
The γ-subgroup of Rhodopsin (class A) contains the SOG, MCH, and chemokine receptor clusters, which bind both peptide and small ligand-like compounds. The SOG receptor cluster members include GPR54, the somatostatin receptors (SSTRs), and the opioid receptors [3]. The opioid receptors are important drug targets for the treatment of pain, cough, and alcoholism [54]. The MCH cluster includes receptors that branch off the SOG cluster. These receptors bind melanin-concentrating hormone (MCH), whereas the chemokine receptor cluster contains the chemokine receptors, the angiotensin/bradykinin-related receptors, and additional orphan receptors. Most ligands that interact with these receptors are peptides such as the chemokines and angiotensin. The chemokine receptors are drug targets due to their roles in acute and chronic inflammation [3] and as co-receptors engaged by some HIV strains [60]. While there are chemokine receptor-targeting drugs in clinical stages, maraviroc, an antagonist for CCR5, is the only currently FDA-approved drug on the market [61] targeting this class.
The δ-subgroup of Rhodopsin (class A) includes four main groups, the MAS-related receptor, glycoprotein receptor, purine receptor, and the olfactory receptor clusters. The MAS-receptor cluster includes the MAS1 oncogene receptor and the MAS-related receptors. The glycoprotein receptor cluster contains the classic glycoprotein hormone receptors and the leucine-rich-repeat-containing G protein-coupled receptors, whereas the purine receptor cluster is the largest in the δ-subgroup made up of the formyl peptide receptors, the nucleotide receptors, and a number of orphan receptors [3].

Extracellular region
The extracellular region of GPCR contains the N-terminus and three loops (EL1-3) that shape the opening to the ligand-binding pocket. The ligand-binding region is found within the extracellular region of the 7TM bundle. EL2 (figure 5B) is known to vary in length between GPCR classes resulting in distinct conformations, while EL1 (figure 5A) and EL3 (figure 5C) are short and often have disordered structures [62]. A highly conserved disulfide bond between EL2 and C3.25 (see description of the Ballesteros-Weinstein numbering system in the 7TM region section) has been observed in a majority of crystallized GPCR structures. This covalent modification limits movement and stabilizes the conformation of EL2 [62,63]. Compared to other classes, class A receptors have a shorter N-terminus [63].

7TM region
Despite differences in primary structure, a majority of the proteins in class A/Rhodopsin share conserved residues found within the 7TM domain [5]. The Ballesteros-Weinstein numbering system is often used to assign numbers to common residues that are conserved in both sequence and structure-based alignments of different receptors [64]. This indexing system consists of two numbers separated by a period. The first value represents the TM helix in which the residue is found and has values ranging from 1 to 7. The second number denotes the residue position relative to the most conserved residue within that TM segment, which is defined as position 50. The value decreases as you move through the amino acid sequence toward the N-terminus and increases as you move through the amino acid sequence toward the C-terminus. Class A/Rhodopsin receptors have highly conserved residues in each TM segment: N1.50, D2.50, R3.50, W4.50, P5.50, P6.50, and P7.50 [64]. In addition to these residues, class A/Rhodopsin contains two fingerprint regions: the D/ERY motif at positions 3.49-3.51 and the NPXXY motif at positions 7.49-7.53 [3]. In 2000, x-ray crystallography studies on rhodopsin (PDB: 1F88) confirmed that conserved residues formed interhelical networks important for stabilization and activation [25] as had previously been suggested [65]. Seven years later, two structures of ADRB 2 (PDB: 2RH1, 2R4R/2R4S) were determined and revealed TM structural similarity to rhodopsin [26,27]. The basic canonical architecture of this region has now been observed in numerous examples of crystallized GPCR as is shown in figure 4. It is interesting to note that an allosteric sodium ion binding pocket involving two conserved residues D2.50 and S3.39 was identified in the 1.8 Å crystal structure of A 2A (PDB: 4EIY) [66,67]. This central Na + pocket was formed by D2.50, S3.39, and three water molecules.
Liu, et al., compared their inactive A 2A structure to an active A 2A (PDB 3QAK [68]) structure and found that the size of the central pocket in the active form could only support three water molecules without sufficient room for Na + coordination. This suggests that Na + may stabilize the inactive conformation of A 2A [66]. After the discovery of the Na + pocket in A 2A , the same characteristic was seen in other inactive GPCR crystal structures. These other structures include representatives of the α, γ, and δ subgroups of class A [ADRB1 (PDB: 4BVN [69]), delta-opioid (δ-OR; PDB: 4N6H [70]), and protease-activated receptor 1 (PAR1; PDB: 3VW7 [71])], suggesting the sodium ion binding pocket is a common feature in class A GPCR.
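The Ballesteros-Weinstein indexing introduced earlier in this section reduces to a single offset calculation once the sequence number of the most conserved residue in each helix is known. The sketch below shows that arithmetic; the function name and the reference positions in `X50_POSITIONS` are hypothetical values for an imaginary receptor, not numbers taken from any structure discussed here.

```python
def ballesteros_weinstein(helix: int, residue_number: int, reference_number: int) -> str:
    """Return the index 'helix.position' for a residue in a TM helix.

    reference_number is the sequence number of the most conserved residue in
    that helix, which is assigned position 50 by definition.
    """
    if not 1 <= helix <= 7:
        raise ValueError("helix must be between 1 and 7")
    position = 50 + (residue_number - reference_number)
    return f"{helix}.{position}"


# Hypothetical sequence numbers of the x.50 residues in an imaginary receptor.
X50_POSITIONS = {2: 80, 3: 130, 7: 320}

# Eleven residues toward the N-terminus of the helix 3 reference residue.
print(ballesteros_weinstein(3, 119, X50_POSITIONS[3]))
# Three residues toward the C-terminus of the helix 7 reference residue.
print(ballesteros_weinstein(7, 323, X50_POSITIONS[7]))
```

Because the index is anchored to a conserved residue rather than to the start of the protein, equivalent positions such as D2.50 or S3.39 can be compared across receptors whose raw sequence numbers differ.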
Conformational changes in the 7TM bundle are required for signal transduction across the cell membrane following ligand binding. Experimentally determined crystal structures of ADRB 2 reflect a variety of activation states. In these structures, TM5, TM6, and TM7 have a critical function in GPCR activation due to clear differences in helical arrangement. The active state of ADRB 2 (PDB: 3P0G) exhibits altered conformations of TM5 and TM7 and a prominent outward shift of TM6 [18] in contrast to the inactive state (PDB: 2RH1) [26].
The 7TM region forms distinctive ligand-binding pockets for different receptors where varying size, shape, and electrostatics provide receptor-ligand selectivity. Aminergic and nucleotide receptors have small binding pockets buried inside the 7TM bundle while peptide receptors have large, more accessible binding pockets near the extracellular surface [62]. These diverse ligand binding pocket profiles are demonstrated with 29 class A GPCR in figure 6A.

Intracellular region
The intracellular region of GPCR includes the C-terminus and three loops (IL1-3) that interact with G proteins, β-arrestins, and other downstream effectors [4,10,62]. GPCR crystal structures have shown structural conservation in the short IL1 chain, though high levels of variability have been observed within IL2 and IL3, suggesting dynamic and/or unstable conformations. Differences have been observed in IL2 in both the D 3 R and ADRB receptor structures. In a crystal structure of D 3 R (PDB: 3PBL) [6], where there were two copies of the protein from the same unit cell, IL2 had 2.5 turns of an α-helical conformation for chain A in contrast to a disordered loop lacking electron density for chain B (figure 7A). ADRB 1 [72] and ADRB 2 [26], which have an overall percent identity of 80% and nearly identical IL2 sequences, differ in this loop: IL2 adopts an α-helical conformation in ADRB 1 and an unstructured conformation in ADRB 2 (figure 7B; PDB: ADRB 1 -2VT4, ADRB 2 -2RH1). The three-dimensional structure of IL2 may be dependent upon the functional state of the protein, as well as interactions with intracellular partners. IL3 exhibits the greatest length variability amongst IL, ranging from five to hundreds of residues, and has been implicated in G protein selectivity. IL3 has been observed to form a disordered conformation or, more often, has been replaced by a fusion partner for increased conformational homogeneity to enable crystal formation in several solved GPCR structures [10]. There are crystallographic structures of rhodopsin (PDB: 3CAP) [73], ADRB 1 (PDB: 2YCW, 2YCX, 2YCY, 2YCZ) [74], and δ-opioid receptor (PDB: 4N6H) [70] where IL3 adopts an α-helical conformation resulting in extended TM5 and TM6 helices [10,62]. The IL region undergoes considerable conformational changes required for G protein interaction and the consequent initiation of the signal transduction cascade.
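Percent identity values such as the 80% quoted above for ADRB 1 and ADRB 2 are computed from a pairwise sequence alignment. The sketch below shows one common convention, identical residues divided by the number of aligned (non-gap) columns; other conventions use a different denominator, and the text does not state which was used, so this choice is an assumption. The aligned sequences are invented for the example.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


# Two hypothetical aligned fragments differing at one of ten positions.
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```

A high overall identity does not guarantee identical local structure, which is the point made above: nearly identical IL2 sequences were observed in different conformations in the two receptors.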

Secretin and Adhesion (Class B)
Class B, the second largest class within the GPCR superfamily, is comprised of the Secretin and Adhesion families. Secretin and Adhesion receptors have been classified together due to sequence similarities between their 7TM regions, although they have distinctions elsewhere that establish them as separate families. The Secretin family has an extracellular hormone-binding domain that interacts with peptide hormones [5]. The members of this family include the calcitonin/calcitonin-like receptors, the corticotropin-releasing hormone receptors, the glucagon receptor, the gastric inhibitory polypeptide receptor, the glucagon-like peptide receptors, the growth-hormone-releasing hormone receptor, the adenylyl cyclase activating polypeptide hormone receptor, the parathyroid hormone receptors, the secretin receptor, the vasoactive intestinal peptide receptors, and additional orphan receptors [3,5]. The Secretin family includes potential targets for drug development due to their involvement in central homeostatic functions. Members of this family have been connected to appetite regulation and type-2 diabetes [5].
The Adhesion family includes the epidermal growth factor receptors and lectomedin receptors, though a majority of the members are currently classified as orphan receptors [5]. This GPCR family exhibits a highly variable number of amino acids at the N-terminal region, ranging from 200-2800 in length, making this family phylogenetically and structurally different from the rest of the class B GPCR. The "Adhesion" family name is related to the N-terminal region that contains sequence motifs, such as the GPCR proteolysis (GPS) motif, which serve as intracellular autocatalytic processing sites that participate in cellular adhesion [3,5]. Adhesion family receptors bind extracellular matrix molecules rather than peptide hormones [5]. These receptors are believed to be involved in cell proliferation/migration, as well as immune system function via the mediation of leukocyte and neutrophil interactions. Adhesion receptors that contain long N-termini have become targets of monoclonal antibodies used as drug candidates [5]. Numerous receptors from this family are localized in the central nervous system (CNS) [75], though their functional role in the CNS is not fully understood [5].

Structure
Within class B, sequence alignments show that Adhesion and Secretin families contain structural differences at the N-terminal region. Adhesion receptors contain distinctive O- and N-glycosylation sites, as well as EGF and GPS motifs [76]. Secretin family members have a 60-80 amino acid N-terminal domain [3] that includes a hormone-binding (HRM) domain that is believed to have a key role in binding peptide hormones [63]. The Ballesteros-Weinstein numbering system somewhat extends to class B/Adhesion/Secretin where residues E3.50 and W4.50 are conserved [77]. The binding pockets of class B GPCR are broader and deeper inside the 7TM bundle compared to class A [45], as illustrated in figure 6B, in order to accommodate endogenous peptide ligands. On the other hand, the crystal structure for GCGR (PDB: 5EE7) shows an allosteric binding site outside of the 7TM domain between TM6 and TM7 [78].

Glutamate (Class C)
The Glutamate (class C) family of receptors consists of the metabotropic glutamate receptors, GABA receptors, single calcium-sensing receptors, and sweet and umami taste receptors. The known endogenous ligands of this family bind to the N-terminal region, but many allosteric ligands have been discovered to interact with TM3, TM5, TM6, and TM7 [79][80][81]. In addition, Ca2+ can bind to the extracellular region [82] and enhance the effects of glutamate in some glutamate receptors [83,84]. Many of the Ca2+-interacting residues are conserved within the "Venus flytrap" region of the Glutamate (class C) receptors [82] and have potential significance in drug design targeting depression, learning, and memory [85].

Structure
In class C/Glutamate receptors, the 280-580 amino acid N-terminus [3] forms a cavity surrounded by two lobes, which close upon ligand binding through a process known as the "Venus flytrap" mechanism (VFTM) [5,86]. This mechanism involves a ligand binding-induced conformational change that results in the formation of disulfide bonds between the N-terminus and 7TM domain [63,87]. Receptors in this class lack the conserved residues defined by the Ballesteros-Weinstein numbering scheme, which are seen extensively in class A and to a lesser extent in class B GPCR. Although the conserved ligand binding pocket of most class C receptors is located within the extracellular region, there are allosteric binding sites within the 7TM bundle [5], as seen in figure 6C. Targeting these sites with allosteric modulators may be a way to achieve receptor specificity.

Frizzled/Taste2 (Class F)
The Frizzled/Taste2 (class F) group of receptors includes frizzled and smoothened receptors (SMO) [3,5]. The frizzled receptors are known to bind secreted glycoproteins [88], while the SMO receptor functions in a ligand-independent manner through the SMO and sonic hedgehog (SHH) complex [89]. Frizzled receptors are involved in cell fate, proliferation, and polarity through association with secreted glycoproteins at the cysteine-rich N-terminus [3,5]. Although the cysteine-rich region is highly conserved in the frizzled receptor, there is evidence of additional binding sites located in the extracellular loops of the TM regions [90]. SMO receptors are structurally similar to frizzled receptors [5]. The cysteine-rich region found in the frizzled receptors, as well as the chemical properties of residues that bind secreted glycoproteins, are conserved in SMO [90,91]. Class F GPCR, though not well understood, have been linked to cancer development and are potential targets for cancer therapy [5].

Structure
Based on sequence analysis, class F/Frizzled/Taste2 is the most highly conserved class within the GPCR superfamily [63]. These proteins have a distinctive N-terminus that spans 200-320 amino acids in length [3,5], in addition to a variable linker region between the EL and TM domains [5]. While the Ballesteros-Weinstein numbering scheme does not apply to this class, these proteins still share the common structure of the 7TM hydrophobic core [92]. There is limited structural information about this class of receptors since SMO is the only class F GPCR to be crystallized to date. The binding pocket is observed to be narrow and elongated in the currently available crystal structures (figure 6D). In one SMO crystal structure (PDB: 5L7D), cholesterol is bound to the extracellular cysteine-rich domain (CRD) that is highly conserved in vertebrates. An oxysterol was observed to displace cholesterol and bind to the CRD groove, leading to SMO activation and Hedgehog (Hh) signaling. Cholesterol has been proposed as an endogenous ligand that stabilizes the inactive SMO conformation [93].
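The N-terminal length ranges quoted in this review for the Adhesion (200-2800), Secretin (60-80), Glutamate (280-580), and Frizzled/Taste2 (200-320) families can be collected into a small lookup table. The Python sketch below does only that and is not a classification method used in this review; the names `N_TERMINUS_LENGTH` and `families_matching` are hypothetical. Its purpose is to make the overlap between the ranges explicit: N-terminal length alone cannot separate most of these families.

```python
from typing import Dict, List, Tuple

# N-terminal length ranges (amino acids) as quoted in the text of this review.
# Illustrative lookup only; the variable and function names are hypothetical.
N_TERMINUS_LENGTH: Dict[str, Tuple[int, int]] = {
    "Adhesion (class B)": (200, 2800),
    "Secretin (class B)": (60, 80),
    "Glutamate (class C)": (280, 580),
    "Frizzled/Taste2 (class F)": (200, 320),
}


def families_matching(length: int) -> List[str]:
    """Return every family whose quoted N-terminal range contains `length`."""
    return [
        name
        for name, (low, high) in N_TERMINUS_LENGTH.items()
        if low <= length <= high
    ]


if __name__ == "__main__":
    for length in (70, 300, 1000):
        print(length, families_matching(length))
```

A 300-residue N-terminus, for example, falls inside the quoted ranges of three families (Adhesion, Glutamate, and Frizzled/Taste2), whereas a 70-residue N-terminus matches only the Secretin range.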

Conclusions/further perspectives
Since the initial crystallization of rhodopsin in 2000, numerous technological advances have significantly impacted GPCR structure determination efforts, resulting in crystal structures of 42 individual receptors (to date). Currently, more than 180 structures of GPCR have been solved and made available through the Protein Data Bank (table 1). Of these, over 150 protein-ligand complexes are available (table 2), representing the entire spectrum of ligand functions from inverse agonist to agonist. These data sets give insights into structural features and ligand recognition within this diverse protein superfamily. Specifically, the majority of crystallized ligand complexes with GPCR exhibit a ligand binding pocket near the extracellular end of the TM helical bundle, as shown in figure 6. The TM helical bundle structure shows considerable structural similarity across the family (figure 4), while EL2 shows the greatest structural diversity in the vicinity of the ligand binding pocket (figure 5). Thus, differences in ligand selectivity between GPCR family members are driven both by sidechain differences in the TM helical bundle (rather than backbone conformational differences) and by the overall EL2 fold. GPCR have essential biological roles, and many have been confirmed to have value as therapeutic targets. Therefore, solved three-dimensional GPCR structures have tremendous potential to influence drug discovery. Structure-based drug discovery approaches can be applied directly to crystallized GPCR family members as therapeutic targets. Additionally, the crystallized GPCR structures exhibit close sequence identity to additional GPCR family members and can serve as templates for the development of reliable and predictive homology models. Validated models allow structure-based drug discovery approaches to be used against this even broader set of target GPCR members.
Although efforts in GPCR crystal structure determination over the past two decades have been fruitful, a great deal of work remains to characterize unrepresented family members that have lower sequence identities to currently crystallized GPCR family members. GPCR structural characterization will continue to be a rich research area in need of further advances and innovative approaches into the foreseeable future.

Figure 3. (A) Receptor deactivation is mediated by the RGS family of proteins. RGS alters the conformation of the Gα-GTP complex, making it a better hydrolase, which accelerates the hydrolysis of GTP to GDP. (B) GPCR desensitization through internalization occurs when GRK phosphorylates the activated receptor, promoting β-arrestin binding, which sterically hinders receptor-G protein interaction. The receptor is either degraded or recycled back to the cell surface. Figure 3A was adapted from Neubig, et al. [191] and 3B was adapted from Pierce, et al. [4].