Structural biology of glycoprotein hormones and their receptors: Insights to signaling

This article reviews the progress made in the field of glycoprotein hormones (GPH) and their receptors (GPHR) by several groups of structural biologists, including ourselves, aiming to gain insight into GPH signaling mechanisms. The GPH family consists of four members, with follicle-stimulating hormone (FSH) being the prototypic member. GPH members belong to the cystine-knot growth factor superfamily, and their receptors (GPHR), possessing unusually large N-terminal ectodomains, belong to the G-protein coupled receptor Family A. GPHR ectodomains can be divided into two subdomains: a high-affinity hormone-binding subdomain primarily centered on the N-terminus, and a second subdomain that is located on the C-terminal region of the ectodomain and is involved in signal specificity. The two subdomains unexpectedly form an integral structure composed of leucine-rich repeats (LRRs). Following the structure determination of hCG in 1994, the field of FSH structural biology has progressively advanced. Initially, the FSH structure was determined in a partially glycosylated free form in 2001, followed by a structure of FSH bound to a truncated FSHR ectodomain in 2005, and the structure of FSH bound to the entire ectodomain in 2012. Comparisons of the structures in the three forms led to a proposal of a two-step monomeric receptor activation mechanism. First, binding of FSH to the FSHR high-affinity hormone-binding subdomain induces a conformational change in the hormone to form a binding pocket that is specific for a sulfated tyrosine found as sTyr 335 in FSHR. Subsequently, the sTyr is drawn into the newly formed binding pocket, producing a lever effect on a helical pivot whereby the docking sTyr provides the 'pull & lift' force. The pivot helix is flanked by rigid LRRs and locked by two disulfide bonds on both sides: the hormone-binding subdomain on one side and the last short loop before the first transmembrane helix on the other side. The lift of the sTyr loop frees the tethered extracellular loops of the 7TM domain, thereby releasing a putative inhibitory influence of the ectodomain, ultimately leading to the activating conformation of the 7TM domain. Moreover, the data lead us to propose that FSHR exists as a trimer and to present an FSHR activation mechanism consistent with the observed trimeric crystal form. A trimeric receptor provides resolution of the enigmatic, but important, biological roles played by GPH residues that are removed from the primary FSH-binding site, as well as several important GPCR phenomena, including negative cooperativity and asymmetric activation. Further reflection pursuant to this review process revealed additional novel structural characteristics such as the identification of a 'seat' sequence in GPH. Together with the 'seatbelt', the 'seat' enables a common heterodimeric mode of association of the common α subunit non-covalently and non-specifically with each of the three different β subunits. Moreover, it was possible to establish a dimensional order that can be used to estimate LRR curvatures. A potential binding pocket for small-molecule allosteric modulators in the FSHR 7TM domain has also been identified.


Introduction
The glycoprotein hormone (GPH) family consists of the three gonadotropins, luteinizing hormone (LH), follicle-stimulating hormone (FSH) and chorionic gonadotropin (CG), and a fourth nongonadotropin member, thyroid-stimulating hormone (TSH). All four members are important pharmaceutical drugs (PDR, 2013). FSH is clinically used for controlled ovarian stimulation in women undergoing assisted reproduction, most commonly involving in vitro fertilization of retrieved oocytes. It is also used to treat anovulatory infertility in women and hypogonadotropic hypogonadism in men, while LH is used to support FSH therapy. CG is used to induce ovulation in women and to increase sperm count in men, as well as to treat young boys whose testicles do not descend normally into the scrotum. TSH, in combination with ¹³¹I, is administered to post-surgery thyroid cancer patients to suppress and ablate remnant cancerous tissues. Despite decades of successful clinical use and multi-billion-dollar annual sales, it remains poorly understood how glycoprotein hormones activate their receptors in host cells at an atomic level. In this article, we review the progress made by several groups, including ourselves, in the field of the structural biology of glycoprotein hormones and their receptors in an attempt to provide an insightful picture of how FSH binding leads to FSHR activation at the atomic level.
GPHs belong to the superfamily of cystine-knot growth factors (CKGF). FSH, LH and TSH are all secreted from the anterior pituitary gland as heterodimeric (two dissimilar subunits) glycoproteins of 30 kDa. Each is composed of a common α-subunit with the same amino acid sequence and a hormone-specific β-subunit. Their secretion is controlled by releasing hormones from the hypothalamus. Specifically, gonadotropin-releasing hormone (GnRH) controls the secretion of FSH and LH, and thyrotropin-releasing hormone (TRH) controls TSH (Simoni et al., 1997; Szkudlinski et al., 2002; Tao and Segaloff, 2009; Ulloa-Aguirre et al., 2007; Vassart and Costagliola, 2011). Acting to control thyroid functions, TSH induces production of thyroxine (T4) and triiodothyronine (T3), two molecules that are required for metabolism in almost every tissue in the human body (Porter, 2011). FSH and LH act synergistically to regulate follicular growth and ovulation, respectively, in the ovaries, and to maintain normal sperm quality and quantity in the testis. Another glycoprotein hormone, human chorionic gonadotropin (hCG), is secreted by the human placenta during early pregnancy and acts on the corpus luteum of pregnancy, inducing progesterone production, which plays a critical role in maintaining pregnancy (Pierce and Parsons, 1981). CG and LH are directly related in evolutionary origin, as the CG β-subunit gene evolved from the LH β-subunit gene by duplication and subsequent read-through into the 3′-untranslated region at the same chromosomal location (19q13.2 in humans) (Fiddes and Goodman, 1980; Talmadge et al., 1984). The characteristics in sequence and function shared by the four members indicate their common evolutionary origin (Uchida et al., 2010).
GPHs exercise their biological function upon interacting with their cognate receptors. Like the hormones, their receptors are also closely related. LH and CG share the same receptor, LHR; FSH binds to FSHR and TSH to TSHR (Combarnous, 1992; Dias, 1992; Nagayama and Rapoport, 1992; Segaloff and Ascoli, 1993). These receptors belong to the leucine-rich-repeat-containing G-protein coupled receptor (LGR) subfamily (Hsu et al., 1998, 2000). The LGR subfamily, in turn, belongs to Family A of the G-protein coupled receptor (GPCR) superfamily. GPCRs transduce extracellular signals through their seven-helical transmembrane (7TM) domains to activate G-proteins (Pierce et al., 2002). The prototypical member of Family A is rhodopsin.
LGRs differ from the non-LGR members of Family A in their ectodomains. While the non-LGR members contain short extracellular N-terminal peptides and bind small molecules, the LGRs have unusually large ectodomains containing leucine-rich repeats (LRR) (Chen et al., 2013; Wang et al., 2013). The GPHR ectodomains contain 340-420 amino acid residues and bind their large ligands, which have molecular weights of about 33 kilodaltons (kDa). Upon hormone binding, the hormone-induced conformational changes in the receptor transduce the hormone signals downstream to the inside of the target cell by turning on a few signaling molecules, primarily a G-protein heterotrimer, leading to the dissociation of its α-subunit and dimeric βγ-subunits. The α-subunit then activates adenylyl cyclase, resulting in increased cAMP, which ultimately leads to increased production of steroids in the case of LH/CG and FSH. Meanwhile, the free βγ dimers recruit GPCR kinases (GRK) to the receptor, which phosphorylate the intracellular loops of the receptor. This, in turn, leads to the recruitment of β-arrestin to the receptor. By means of these and other signal pathways (McDonald et al., 2006), the hormones induce the necessary physiological responses in their respective host tissues.
Unlike some other ligand-receptor pairs (e.g., cytokines) where the receptors are in the same family but their ligands belong to distinct families (Jiang et al., 2000; Liu et al., 2007; Shim et al., 2010), the common families for both GPHs and GPHRs suggest common structural folds for the hormones as well as the receptors, as the result of the underlying co-evolution of ligand-receptor pairs (Moyle et al., 1994). Fig. 1 shows the co-evolutionary family trees of the hormones and their receptors and their kinship with other proteins. These common evolutionary roots suggest that GPH-GPHR binding and signaling share a common mechanism; thus, understanding any one complex should be instructive for the whole family. In this article, we will concentrate our discussions on the FSH-FSHR pair, since only the crystal structures of FSH and its receptor in the hormone-bound form are currently available. Analyses of structures mentioned in this article were done using the CCP4 suite (CCP4, 1994); homology modeling was performed using Coot (Emsley and Cowtan, 2004) and the commercial software MOE from Chemical Computing Group. Structural figures were produced using PyMol (DeLano, 2002).

Discoveries of glycoprotein hormones as therapeutic drugs
The discovery and clinical application of the two pituitary gonadotropic hormones, FSH and LH, directly resulted from studies which established the existence of a hypothalamic-pituitary-ovarian axis, beginning early in the twentieth century (Goodman, 2004; Ludwig et al., 2002; Lunenfeld, 2002, 2004). Work by Crowe et al. linked pituitary gland function to the development of genital organs (Crowe et al., 1910). Two years later, Aschner postulated that a higher center in the brain controls the pituitary function (Aschner, 1912). The pituitary-ovarian link hypothesis was confirmed when Smith showed that the ovarian atrophy after removal of the hypophysis was reversed by pituitary implants (Smith, 1926a,b) and Zondek demonstrated that the external implantation of anterior pituitary glands in immature animals evoked precocious sexual maturation (Zondek, 1926). Zondek and Aschheim then showed that extracts from urine collected from postmenopausal women, which is rich in FSH, produced a predominant follicle-stimulating effect, whereas extracts from urine of pregnant women showed a strong luteinizing activity (Aschheim and Zondek, 1927). These observations led Zondek to propose that two hormones might be required for normal ovarian function: one to stimulate follicular growth and maturation and another to trigger ovulation and luteinization (Zondek, 1929, 1930). This hypothesis was proven after Fevold and a coworker isolated two crude pituitary hormones with distinct actions on the rat ovary: FSH to stimulate ovarian follicular development and LH to cause FSH-stimulated follicles to luteinize (Fevold et al., 1931). The hypothalamus-pituitary link was explicitly proposed by Guillemin in 1967, when he suggested that a releasing factor in the hypothalamus controlled the secretion of gonadotropins from the pituitary (Guillemin, 1967). The existence of the hypothalamic-pituitary-gonadal axis was finally confirmed when Schally and Guillemin elucidated the chemical structure of gonadotropin-releasing hormone (GnRH) in the early 1970s (Burgus et al., 1972; Schally et al., 1971).
The crucial step for the hormones' clinical application was to establish an extraction and purification method to obtain purified materials in sufficient quantity. Donini found that the hormones could be extracted by kaolin clay from urine collected from postmenopausal women (Donini, 1949). This pioneering work laid the foundation for the development of Serono's fertility drug Pergonal®, which helped Lunenfeld to treat an anovulatory patient, who had the first live birth by assisted reproductive technology (ART) in 1961 (Lunenfeld et al., 1962), and then in 1978 allowed Steptoe and Edwards to achieve the world's first live birth (Louise Brown) from an oocyte fertilized in vitro (Steptoe and Edwards, 1978). In recognition of this achievement in in vitro fertilization (IVF) technology, Robert Edwards was awarded the Nobel Prize in Physiology or Medicine in 2010. Recombinant FSH was developed in the 1990s and has been marketed under the commercial name Gonal-f® (Loumaye et al., 1998). Recombinant LH was approved by the FDA and marketed under the commercial name Luveris® in 2004, but the detailed research data have not yet been published.
Human CG (hCG) was discovered as a by-product of the long quest for FSH and LH (Lunenfeld, 2004). Aschheim and Zondek demonstrated that urine extracts from pregnant women had strong luteinizing activities (Aschheim and Zondek, 1927). They believed that this activity originated from the anterior pituitary. However, in vitro studies (Seegar-Jones et al., 1943) demonstrated that it was the placenta, not the pituitary, that was responsible for producing this activity. These studies further pinpointed the chorionic villi as the source of gonad-stimulating activity. Compared to the pathways leading to clinical application of FSH and LH, the path of hCG to the clinic was relatively smooth, because hCG is abundant in placenta, which made its purification easier. Clinical-grade hCG was first manufactured and marketed by Organon in 1931, first under the name 'Pregnon®' and then 'Pregnyl®' (Tausk, 1978). Subsequent clinical studies showed that, in the absence of FSH, hCG alone had no effect on follicle stimulation, ovulation or corpus luteum formation (Hamblen et al., 1945). Current clinical usage of hCG is mainly for treating pre-pubertal boys whose testes do not descend into the scrotum and to synergize FSH treatment of anovulatory women. The Aschheim-Zondek assay, which tests the ability of hCG to induce follicular rupture in female rats, has long been the standard pregnancy test based on hCG activity.
Finally, the history of TSH discovery can also be traced back to the period when the biological linkage between the pituitary and gonad was discovered, followed by the discovery of thyroid-stimulating activity itself in the pituitary gland (Szkudlinski et al., 2002). Uhlenhuth and Schwartzbach (1927) discovered that a factor secreted from the pituitary gland caused a histological change in the thyroid gland. This factor was subsequently named thyroid-stimulating hormone (TSH). The clinical benefits of TSH treatment were not well established until the end of the last century (Borget et al., 2007; Cole et al., 1993). Recombinant human TSH was approved and marketed by Genzyme in the U.S. in 1998 under the commercial name Thyrogen®. It has been administered in combination with ¹³¹I (radioiodine) to thyroid cancer patients after surgical removal of the cancer to suppress and ablate remnant cancerous thyroid tissues by inducing the uptake of radioiodine into the thyroid gland.
Recently, another pituitary hormone has been found (Nakabayashi et al., 2002). Named thyrostimulin, this protein is similar to other glycoprotein hormones, especially to TSH. It is a heterodimer and activates TSHR (Breous et al., 2005;Okada et al., 2006). Its physiological role, however, remains unclear.
The structure of glycoprotein hormones

Early quest for a GPH structure
Against the backdrop of the long history of basic research and the successful clinical use of glycoprotein hormones, and with the aim of creating more active analogs, many researchers sought the three-dimensional structures of these therapeutically important drugs in order to understand their biological functions at the atomic level. GPHs are cystine-rich: 10 cysteine residues in the common α-subunit and 12 in each β-subunit. Despite numerous studies, definite assignments for all the disulfide bonds had remained elusive (Cornell and Pierce, 1974; Fujiki et al., 1980; Giudice and Pierce, 1976; Bahl, 1980, 1981; Pierce et al., 1976; Rathnam et al., 1982; Reeve et al., 1975; Reeve and Pierce, 1981). Consequently, all of the GPH theoretical models (Hage-van Noort et al., 1992; Lustbader et al., 1993; Moyle et al., 1990; Willey and Leidenberger, 1989) based on those assignments and other information were incorrect.

GPH structure architecture
When the hCG crystal structure (PDB: 1HCN, 1HRP) was determined in 1994, it was heralded as a major breakthrough (Lapthorn et al., 1994; Wu et al., 1994) (Fig. 2a). The structure unexpectedly revealed that GPHs belong to the cystine-knot superfamily, which includes growth factors (McDonald and Hendrickson, 1993). Both α- and β-subunits are folded into elongated, non-globular cystine-knot structures with three loops extending out from the core motif containing three knotting disulfide bridges (Fig. 2b). Each subunit has an exceptionally high surface-to-volume ratio and lacks an apparent hydrophobic core. The two hetero-subunits are superimposable at the cystine-knot cores; they dimerize in a quasi-dyad symmetry (Wu et al., 1994) (Fig. 2a-d). The peptide that extends beyond the cystine-knot core in the β-subunit acts like a 'seatbelt', wrapping around the helix-containing L2 loop of the α-subunit and buckling the C-terminal end of the peptide back to the body of the β-subunit via an intra-molecular disulfide bond between residues 110 and 26 next to the cystine-knot core (Fig. 2e). This unusual dimerization raised the obvious question of whether heterodimer assembly involves threading of the L2α loop with its attached oligosaccharide through the 'seatbelt' (the threading pathway) or, alternatively, whether an unlatched 'seatbelt' wraps around the α-subunit before it is 'buckled' (the wraparound pathway). Xing et al. demonstrated that hCG, hFSH, and hTSH are assembled primarily by the threading pathway in mammalian cells (Xing et al., 2004b).

Implication of the GPH structures in biological functions
Extensive mutagenesis and structure-activity relationship data preceded the hCG and the subsequent FSH crystal structures (PDB: 1FL7) (Fox et al., 2001). Together, these studies suggested which topological features are involved in hormone-receptor binding and biological function. Having a common α-subunit, GPHs must be distinguished by their β-subunits for receptor specificity. At the center of the 'seatbelt' of the hCG β-subunit lies a 'determinant loop', β93-100. The importance of this sequence stretch was first recognized by Ward and Moore, who proposed that the charge distribution in this loop may determine the hormone specificity: a net positive charge for LH/hCG and a net negative charge for FSH and TSH (Ward and Moore, 1979). Numerous follow-up studies confirmed this hypothesis (Campbell et al., 1991; Dias et al., 1994; Grossmann et al., 1997; Huang et al., 1993). The second interesting region is the L2β loop, which has an unusual conformation. The L2β charged residue R43 is partially buried, while its neighboring hydrophobic residues in both α- and β-subunits are often exposed (Jiang et al., 1995; Lapthorn et al., 1994). Indeed, nicks in the L2β loop disrupt receptor activation (Cole et al., 1991; Ward et al., 1986); an R43L mutation in hCG (Chen and Puett, 1991) significantly diminishes its receptor binding activity; a natural mutation Q54R of LH diminishes its receptor binding capability (Weiss et al., 1992); and the side chains of residues ³⁷LVY³⁹ of FSHβ affect receptor interaction and steroidogenesis (Roth and Dias, 1995). In the common α subunit, the C-terminal peptide (88-92) is disordered in the hCG structure but adopts different conformations in two independent FSH molecules in the crystal structure (Fox et al., 2001). This peptide has been implicated in receptor binding in hCG, FSH and TSH (Arnold et al., 1998; Chen et al., 1992; Grossmann et al., 1995; Yoo et al., 1993). In summary, the determinant loop and the C-terminal peptide of the α subunit, along with the L2β loop, are the major contributors to receptor binding and activation.
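The charge-distribution idea of Ward and Moore can be illustrated with a simple residue count. The minimal Python sketch below estimates the net side-chain charge of a peptide at neutral pH by counting Lys/Arg as +1 and Asp/Glu as -1; ignoring histidine, the termini and the local environment is a simplifying assumption. The two peptides are hypothetical placeholders chosen only to show the calculation; they are not the actual β93-100 sequences of any hormone.

```python
# Approximate formal side-chain charges at neutral pH.
# Assumption: His, the termini and pKa shifts are ignored.
SIDE_CHAIN_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def net_charge(peptide: str) -> int:
    """Return the approximate net side-chain charge of a one-letter peptide."""
    return sum(SIDE_CHAIN_CHARGE.get(residue, 0) for residue in peptide.upper())


def charge_class(peptide: str) -> str:
    """Classify a peptide as net positive, net negative or neutral."""
    charge = net_charge(peptide)
    if charge > 0:
        return "net positive"
    if charge < 0:
        return "net negative"
    return "neutral"


# Hypothetical placeholder loops (NOT real determinant-loop sequences).
example_loops = {"loop_A": "GKRAGSRA", "loop_B": "GDEAGSDA"}

for name, sequence in example_loops.items():
    print(name, net_charge(sequence), charge_class(sequence))
```

Applied to real determinant-loop sequences taken from a sequence database, the same count would give a first, rough check of the net positive versus net negative pattern described above.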

Quest for glycosylation structures
GPHs are heavily glycosylated in their natural state. The carbohydrates play important roles in GPH stability, folding, cellular trafficking, in vivo half-life and receptor signaling (Baenziger, 1994; Bousfield et al., 1996; Butnev et al., 2002; Fares, 2006; Szkudlinski et al., 1995; Ulloa-Aguirre et al., 1999; Walton et al., 2001). The human common α subunit carries two N-linked carbohydrate sites, the β subunit carries one or two N-linked carbohydrate sites, and the hCG β subunit has four additional O-linked carbohydrate sites at the C-terminal peptide (CTP) (Fig. 2e). N-linked glycosylation in the β-subunit is located uniquely in the L1β loop: two sites for FSH (site 1 at N7 and site 2 at N24) and hCG (site 1 at N13 and site 2 at N30) but only one site each for LH (N30) and TSH (N23). Glycosylation at these sites has been reported to affect subunit folding and secretion by assisting disulfide bond formation, with site 2 having a greater effect than site 1 (Feng et al., 1995a,b). The prolonged plasma half-life of hCG has been attributed mainly to its four additional O-linked oligosaccharides in the CTP (Fares, 2006; Fares et al., 1992).
Glycosylation at N78α in loop 3 of the α subunit enhances the hormone's thermal stability (van Zuylen et al., 1997), while the carbohydrates at N52α in loop 2 have been shown to be indispensable for signaling. Heikoop et al. (1998), however, have suggested that the carbohydrate at N52α is responsible for heterodimeric association rather than for full efficacy. This peculiar finding has been rebutted by later studies from Moyle's group (Moyle et al., 2004; Xing et al., 2004a; Xing and Moyle, 2003).
Unfortunately, the hCG protein used for the structure determinations published in 1994 was partially deglycosylated with hydrogen fluoride (HF) and was biologically inactive (Lapthorn et al., 1994; Wu et al., 1994). The question has been how much influence the full oligosaccharides have on the hormone structure. Two crystallographic studies have been carried out to determine the structures of fully active hCG and FSH with the necessary glycosylation. Tegoni et al. (1999) crystallized the ternary complex of intact or desialylated hCG with the variable domains of two high-affinity antibodies (PDB: 1QFW). The intact and the desialylated complexes diffracted to 4.5 and 3.5 Å, respectively. Only two glycan residues (one at N52α and the other at N78α) were present in the final structure, because the electron densities from the low-resolution crystals were not strong enough for the additional glycan residues to be included in the final refinement.
In the case of FSH, it was observed that glycosylation at N24 of the β-subunit was detectable in only about half of the molecules. To reduce glycoform heterogeneity, the second glycosylation site was disrupted by mutagenesis (T26A). This isoform of hFSH was fully active and diffracted to 3.0 Å. Seven glycan residues (two at N52α, three at N78α and two at N7β) have been included in the final refinement (Fox et al., 2001). Comparative analysis of hFSH and hCG structures indicated that glycosylation has no global effect on GPH conformations.

Dimerization mode of cystine-knot structures
Cystine-knot structures have been observed in several other proteins (Bergner et al., 1996; Daopin et al., 1992; Hymowitz et al., 2001; McDonald et al., 1991; Oefner et al., 1992; Schlunegger and Grutter, 1992; Shim et al., 2010). All cystine-knot protein structures that have been determined are dimeric. The dimerization mode of GPHs is different from the others (NGF, TGF-β, PDGF-BB, coagulogen and IL-17). Following Wu et al.'s analysis (Wu et al., 1994), the dimerization modes of cystine-knot dimers, including two more structures (coagulogen and IL-17) that were solved after Wu et al.'s publication, were re-analyzed. The cystine-knot core motif of one monomer of each of the other dimers was superimposed on that of the hCG α-subunit, and the locations of the other monomers relative to the superimposed ones were then examined. Remarkably, as Wu et al. found earlier, the association modes for all these dimers are distinct from each other (Fig. 3a and b), suggesting that the cystine-knot structure motif per se plays little role in the dimerization of all these growth factors.

Similar 'seats', different 'seatbelts'
Given the close genetic and functional relationships among the four GPHs, we can safely assume that the modes of their heterodimeric association are identical. Structural comparison between hCG and FSH indeed reveals identical folds and modes of dimerization (Fox et al., 2001). On the other hand, our analysis above clearly shows that the dimerization modes of cystine-knot structures from different subgroups are completely different. How does the common α subunit achieve a common non-covalent association mode with three different β subunits, despite the fact that the three β subunits of FSH, TSH and hCG/LH share only a moderate level (35%) of sequence identity?
To answer this question, we re-examined the GPH sequences and the hCG and FSH structures. Our study revealed that the common α subunit interacts with two separate sequence stretches of the diverse β subunits (Fig. 2d). The first stretch, which we term the 'seat', lies around the common CAGYC sequence motif and consists predominantly of atoms common to the different β subunits. The other, the 'seatbelt', contains mostly different residues in the stretch. Considering these data, we can now propose a mechanism by which the common α chain achieves its common association mode with three different β subunits non-covalently: the α subunit first interacts with low to medium affinity with the homologous 'seat' residues in the β subunits, and the affinity is then strengthened by tightening the 'seatbelt' from the second, nonhomologous sequence stretch (Fig. 3c). Indeed, thyrostimulin, which lacks a putative 'seatbelt' loop, is not as stable as the other GPHs (Okajima et al., 2008).

Non-heterodimeric structures
Although α- and β-subunits mainly exist and function as heterodimers, individual chains do exist separately and have been isolated (Cole et al., 2006; Garnier, 1978; Iles et al., 2010). Free β-subunits have been shown to be secreted in excess from choriocarcinoma cells and to form a homodimer. It remains to be seen if the homodimer adopts the same dimerization mode as seen in the heterodimer. Free α subunit is also present in large quantities in the pituitary and human placenta (Blithe, 1990; Blithe et al., 1991; Parsons et al., 1983). NMR studies show that the isolated α subunit is more flexible than when it is in the heterodimeric environment, and the L2α loop that contains an α-helix in the heterodimer is disordered in the monomeric structure (PDB: 1E9J, 1HD4, 1DZ7) (Erbel et al., 1999, 2000).

Fig. 3. Dimerization mode of cystine-knot structures. (a) The different dimer interfaces used by hCG, NGF, TGF-β, PDGF-BB, Noggin, IL-17F and coagulogen. One protomer from each of the dimers was superimposed on the hCG α-subunit by aligning the core cystine-knot residues. The protomers not used in these superpositions were transformed and are shown along with the hCG heterodimer. Only the core conserved cystine-knot residues are shown as ribbons, except for the non-core residues of the hCG heterodimer, which are indicated by thin wires. The color codes are: red (hCG α-subunit), blue (hCG β-subunit), magenta (TGF-β), yellow (PDGF-BB), light blue (NGF), cyan (IL-17F) and grey (coagulogen). (b) Schematic diagram of Fig. 3a, showing the different association modes used by the cystine-knot dimers (only cystine-knot core residues are concerned here). The red ellipsoid represents one protomer from each of the dimers. The ellipsoids of other colors represent their respective other protomers that were not used in the superpositions. (c) Illustration of the 'seat' residues (shown as cyan surface) and 'seatbelt' residues (shown as yellow ribbon) of hCG. The α-subunit is colored red, and the β-subunit blue except for the 'seatbelt' residues. N52α glycan atoms are shown as balls. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

The binding mode of FSH with the hormone-binding portion of FSHR
The advances in cloning and expression of glycoprotein hormone receptors (GPHRs) since 1989 (Kelton et al., 1992; McFarland et al., 1989; Parmentier et al., 1989; Sprengel et al., 1990) initiated a period of rapid progress in understanding these receptors and ultimately provided reagents that would lead to structural evidence for the mode of interaction of GPH ligands with their receptors. An imperfect motif of leucine-rich repeats (LRRs) in the sequence of the ectodomain of LHR, identified by McFarland and coworkers, led to the proposal that the ectodomain contains 14 LRRs (McFarland et al., 1989). In 1993, the first crystal structure of an LRR protein, porcine ribonuclease inhibitor (RI), was solved (PDB: 1BNH, which has been superseded by 2BNH) (Kobe and Deisenhofer, 1993). The LRRs in RI are arranged tandemly in a horseshoe shape with 16 α-helices surrounding 17 parallel β strands, all aligned nearly parallel around a common axis. Each LRR comprises 28 or 29 amino acids forming a β strand and an α helix separated by two loops. These advances opened the door to molecular modeling of GPHRs.

Molecular modeling of GPHRs
In 1995, three groups published 3D GPHR models. Moyle's group (Moyle et al., 1995) aligned the LRR sequences of LHR with those of RI and placed the intron-exon junctions of LHR in solvent-exposed loops furthest from the transmembrane domain. Kajava et al. (1995) obtained the distribution of the LRR lengths from 68 different LRR proteins selected by sequence-profile searching. They found that while the length of a 'typical' LRR is 24 residues, the LRRs of RI have lengths of 28 or 29 residues and belong to a less populated LRR subfamily. That group then performed comparative sequence analysis to distinguish residues with possible structural roles from those with essential functional roles, and used that knowledge to model the structure of the 'typical' LRR units. Based on these modeled units, they built the three-dimensional model of the ectodomain of TSHR (residues 54-254).
A completely different approach was used to model the three-dimensional structures of GPHRs (Jiang et al., 1995). First, the relationship of LRRs with their corresponding 9 exon structures was analyzed, and it was found that each of exons 2-8 correlated remarkably well with one LRR, with appreciable homology within the repeats. Exons 1 and 9, on the other hand, were quite different. Although there might be an adjoining LRR in exon 1 and two additional repeats at the start of exon 9, the bulky exons 1 and 9 made unambiguous identification of LRRs within them difficult. Focusing on exons 2-8, which encode major determinants for hormone binding, and recognizing that the LRRs in the GPHRs are composed of about 24 amino acids and have a somewhat distinct motif from RI, the strategy did not presume that the shorter repeats adopt the same β/α conformation as in RI. In order to identify the structural motif for LRRs in GPHRs, the accuracy of secondary-structure prediction was augmented with an averaging technique, taking advantage of the periodic nature of the LRR sequences. The GPHRs showed a strong propensity for β strands in the amino-acid residues encoded around the exon boundaries, but somewhat diminished helix probability and enhanced loop propensity relative to RI for the amino-acid residues encoded in the middle of the exons. The analysis of the shear-number to strand-number ratio further constrained the inclinations of the parallel β strands to near zero degrees. Next, it seemed reasonable to postulate that the dominantly positive charges on the hCG surface were complementary to the negative electrostatic potentials of the β-strand region on the inner surface of LHR. Based on complementarities of both shape and electrostatic attraction between hormone and receptor, it was possible to build the complex models of GPHs and their receptors. These proposed models (Jiang et al., 1995) have been supported by a wealth of biochemical data, including those from a well-designed epitope mapping study (Pantel et al., 1993).
Because RI shares such low sequence identity with GPHRs that it almost approaches a random level, a disclaimer was made about the accuracy of the side-chain interactions (Jiang et al., 1995). As a result, the model was quite crude, and the detailed interactions indeed turned out to be incorrect (Fan and Hendrickson, 2007). Nevertheless, the model correctly predicted the LRR b strands and the dominant electrostatic interactions in the hormone-receptor interface, and the overall complex uncannily resembled the crystal structure of FSH-FSHR HB that was solved a decade later (Fan and Hendrickson, 2005). The overall correct model enabled one of the authors (X.J.) and his colleagues to focus on a smaller surface of FSH for site-directed mutagenesis and successfully produce mutants with improved physicochemical properties (Brondyk et al., 2008; Garone et al., 2010, 2011; Muda et al., 2010, 2011). Several groups later also published theoretical models for portions of the receptors (Bhowmick et al., 1996; Kleinau et al., 2004; Moyle et al., 2004; Puett et al., 2007; Smits et al., 2003; Song et al., 2001; Szkudlinski et al., 2002). These models have been used to explain experimental data or design new experiments.
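The repeat-averaging idea used in the Jiang et al. (1995) modeling can be illustrated with a minimal sketch. It is not the published procedure: the function name, the fixed repeat period and the toy propensity values are assumptions made for illustration only. Real GPHR repeats vary slightly in length, so a practical implementation would average over an alignment rather than a fixed period.

```python
def average_over_repeats(propensity, period):
    """Average a per-residue score over equivalent positions of consecutive repeats.

    propensity: per-residue values (e.g. a predicted b-strand probability).
    period: assumed fixed repeat length (hypothetical; real LRRs are ~24 residues
            and would need an alignment with gaps).
    Returns one averaged value for each position within the repeat.
    """
    if period <= 0 or len(propensity) < period:
        raise ValueError("need at least one full repeat")
    sums = [0.0] * period
    counts = [0] * period
    for index, value in enumerate(propensity):
        position = index % period  # equivalent position within the repeat
        sums[position] += value
        counts[position] += 1
    return [total / count for total, count in zip(sums, counts)]


if __name__ == "__main__":
    # Toy data: two repeats of length 2 (hypothetical numbers, not from the paper).
    toy = [0.9, 0.1, 0.8, 0.3]
    print(average_over_repeats(toy, period=2))
```

Averaging in this way suppresses noise at individual positions while preserving any signal that recurs with the repeat period, which is the property the text appeals to.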

Hormone-receptor interface in the crystal structure of FSH-FSHR HB complex
Another major advance was achieved when a crystal structure was determined for FSH in complex with the truncated hormone-binding portion of the FSHR ectodomain (FSHR HB ) (PDB: 1XWD) (Fan and Hendrickson, 2005). This structure revealed detailed atomic interactions between the hormone and FSHR HB (Dias, 2005; Fan and Hendrickson, 2005). The curvature of FSHR HB , unlike the uniformly curved RI, is steeply graded, starting from nearly flat at the N-terminal repeats 1-7, and increasing gradually to horseshoe-like curvature at the C-terminal repeats 7-10.
The Fan-Hendrickson structure shows that FSH binds to FSHR HB like a 'handclasp'. As noted above, the overall picture is similar to what had been predicted earlier (Jiang et al., 1995): most b-strands in the inner surface of FSHR are involved in hormone binding, electrostatic attractions are dominant in the hormone-receptor interface, and carbohydrate does not participate in the primary binding. For the hormone part, the 'determinant loop' of FSHb (residues 87-94), together with its neighboring residues 95-99 in the 'seatbelt' loop, is at the center of the interface, sandwiched by the aL2 loop and the C-terminal segment of the a subunit (Fig. 4). The interactions between a-subunit residues and the four conserved receptor residues constitute the common hormone-receptor interface. Two of them are charge-charge interactions between electronegative receptor residues and electropositive hormone residues from the aL2 loop (D153 and D81 of FSHR pairing with K51 and K45/R42, respectively). The importance of these residues had been recognized earlier by peptide mapping as well as mutational studies (Bhowmick et al., 1996; Erickson et al., 1990; Leinung et al., 1991; Liu and Dias, 1996). The other two involve LRRs 5-6 of FSHR interacting with residues from the a subunit C-terminus. To make these interactions, the C-terminus (residues 88-92) undergoes a dramatic conformational change by rotating almost 180 degrees from the free state and swinging its end more than 20 Å. D150 of FSHR interacting with K91 of the a subunit is another salt bridge contributed by an electronegative residue from the receptor and an electropositive residue from the hormone. The fourth interaction, in which the side chains of a subunit Y88 and FSHR Y124 stack their aromatic rings against each other, contributes significant energy to the interface by burying 106 Å² of solvent-accessible surface area.
Indeed, the vital role of Y88 and other C-terminal residues of the a chain in receptor binding and signaling has been demonstrated in earlier experiments (Arnold et al., 1998; Chen et al., 1992).
Hormone specificity is conferred by the divergent residues on the b subunits that interact with variable receptor residues. Structure-based sequence comparative analysis indicates that five FSHR residues, L55, E76, R101, K179 and I222, are the primary candidates for defining receptor specificity. Not only do these residues appear at the interface, they vary from receptor to receptor in human species and also interact with FSH residues that differ from hormone to hormone. FSHR residues L55 and K179 have been suggested (Fan and Hendrickson, 2005) to distinguish between FSH and TSH versus LH/CG, whereas E76 and R101 account for the specificity between FSH and TSH. This proposal is supported by earlier mutagenesis studies (Campbell et al., 1991;Dias et al., 1994;Grossmann et al., 1997;Keutmann et al., 1989;Moyle et al., 1994;Smits et al., 2003;Vischer et al., 2003).
Many other residues, both direct and non-direct contact residues, are buried in the interface. The non-direct contact residues may have a collective effect on interface affinity and receptor specificity. For those direct contact residues, it is still not simple to place them into the category of hormone-receptor interface common for all GPHs or that of hormone specificity, except for one of them: the salt bridge between D93 of b subunit and K104 of FSHR, which likely belongs to the first category. D93 is a conserved residue among the hormones. K104, however, is not conserved among the receptors, as the corresponding residues are N107 in LHR and N110 in TSHR. Nevertheless, LHR and TSHR do have positively charged residues nearby. K109 in LHR and R112 or R109 in TSHR could form salt bridges with the corresponding acidic residues in the b subunits (D99 in LH/CG and D94 in TSH, respectively). Fig. 4 schematically summarizes the hormone-receptor interactions at the interface.

FSH changes its conformations upon binding to FSHR HB
Structural comparison of the independent FSH molecules within the same form revealed the internal conformational flexibility of FSH. There are two FSH molecules in the free form structure (pdb code 1FL7) and in the FSHR HB -bound form (pdb code 1XWD). Conformational comparison of the two independent FSH molecules in the 1FL7 structure may uncover the internal FSH flexibility in the free form, while that in the 1XWD structure would show the internal FSH flexibility in the receptor-bound form. Although the conformations of the two molecules in the free form are generally similar, there are substantial conformational changes in several regions. Upon superimposition of two FSH heterodimers in the free form, 46 residues in the loops and C-terminus (i.e., E14 to I25 in loop L1a, L48 and V49 in loop L2a, R67 to K75 in loop L3a, T86 to the last visible C-terminal residue H90 in the a-subunit; E16 to R18 in loop L1b, K40 to Q48 in loop L2b, two isolated residues (R62, A67) in loop L3b, C87 of the determinant loop, F106 to the last common visible C-terminal residue E108 in the b-subunit) have moved over 1 Å, resulting in an overall Ca root-mean-squared deviation (r.m.s.d.) of 1.9 Å between the two FSH copies (Fan and Hendrickson, 2005; Fox et al., 2001) (Fig. 5a). Most of these residues have later been found in the receptor interface (see below). The largest shift is over 5 Å for the residues in loop L2b and the C-terminus of both chains. In contrast, the FSH conformation in the FSHR HB -bound form is quite rigid. Only 21 residues (i.e., residues P17 to I23 in loop L1a, G72 in loop L3a, K14 in loop L1b, Y39, K40 and P45 in loop L2b, V63 to G65 and H69 to L73 in loop L3b, C-terminal residue G107 in the b-subunit) have moved over 1 Å, resulting in a much smaller overall Ca r.m.s.d. of 0.7 Å between independent FSH copies in the bound state (Fig. 5b). No residues have shifted over 3 Å. The significantly reduced r.m.s.d. implies that FSH rigidifies its overall conformation upon receptor binding.
Fig. 4. Schematic illustration of detailed interaction at the FSH-FSHR interface. Contacting residues from FSHR HB are shown as yellow dots and those from FSHa as red dots and FSHb as blue dots. The middle column summarizes the specific side-chain interactions between FSH and FSHR HB . Interactions that contribute to common affinities among all the GPH-GPHR family members are shown as green-filled circles (for charge-charge interactions) or boxes (for non-charged atomic contacts) and they are connected by green lines back to the yellow dots in FSHR or red or blue dots in FSH a- or b-subunits, respectively. Interactions that contribute to specificity are shown as purple-filled circles or boxes and they are connected by purple lines to the dotted residues in FSHR and FSH. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Receptor-binding-induced conformational changes can be revealed by structural comparison of FSH molecules between the free form and the receptor-bound form. The average Ca r.m.s.d. for the overall molecule is 1.2 Å. The noticeable changes occur in the loops and C-termini: residue P21 in loop L1a moves 2 Å, K45 in loop L1a 1.5 Å, G72 in loop L3a 5 Å, R44 in loop L2b 7 Å, H68 in loop L3b 1.5 Å, S92 in the C-terminal a subunit 24 Å and G107 in the C-terminal b subunit 5 Å (Fig. 5d and e).
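The two quantities used throughout this comparison, the overall Ca r.m.s.d. and the list of residues displaced by more than a cutoff, can be computed as in the sketch below. It assumes the two copies have already been superimposed and that their Ca atoms are matched one-to-one; the function names, residue labels and toy coordinates are illustrative assumptions, not data from 1FL7 or 1XWD.

```python
import math


def ca_deviations(coords_a, coords_b):
    """Per-residue distance (in Å) between matched, already superimposed Ca atoms."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]


def rmsd(deviations):
    """Root-mean-squared deviation over all matched atoms."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


def moved_residues(labels, deviations, cutoff=1.0):
    """Labels of residues whose Ca atom moved by more than `cutoff` Å."""
    return [label for label, d in zip(labels, deviations) if d > cutoff]


if __name__ == "__main__":
    # Toy example with three hypothetical residues (not real FSH coordinates).
    labels = ["res1", "res2", "res3"]
    copy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    copy_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 3.0, 0.0)]
    dev = ca_deviations(copy_a, copy_b)
    print(rmsd(dev), moved_residues(labels, dev))
```

Because the r.m.s.d. is a squared average, a few large loop displacements can dominate it, which is why the text reports the per-residue shifts alongside the overall value.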

A common rigid LRR structure in FSHR and TSHR
Two crystal structures of the truncated ectodomain of TSHR in complex with the thyroid-stimulating autoantibody (M22) (PDB: 3G04) or the thyroid-blocking autoantibody (K1-70) (PDB: 2XWT) were determined in 2007 and 2011, respectively. This truncated TSHR domain (residues 22-260) corresponds roughly to the equivalent residues in the truncated FSHR domain in the 1XWD structure. These structures provide opportunities to assess the rigidity of the LRR domain within each receptor and between receptors in the GPHR family. In both receptors, the LRR domain is flanked at the N-terminus by a cysteine-rich box. However, the disulfide bond patterns in FSHR and TSHR are different. While the four cysteines in TSHR are paired sequentially forming two horizontally parallel disulfide bridges (C24 bonds to C29, and C31 to C41), the four cysteines in FSHR are paired in a skipping manner forming two disulfide bridges (C18 bonds to C25, and C23 to C32) that almost stagger over each other, as in the case of the Nogo receptor (pdb accession code: 1OZN) (He et al., 2003). Therefore, the cysteine-rich boxes in FSHR and TSHR are not equivalent. When all four molecules (2 FSHR copies in 1XWD, 1 TSHR each in 3G04 and 2XWT) are aligned, the common residues are C23-R247 in FSHR and C31-R255 in TSHR. There is little conformation change for the LRR common residues within each receptor (overall r.m.s.d. for all 225 Ca atoms is 0.5 Å between two FSHR copies in 1XWD or between two TSHR molecules in 2XWT and 3G04). Amazingly, FSHR and TSHR can also be superimposed extremely well for the LRR common residues, even though there are 6 gaps in the structure-based sequence alignment and the receptors bind to different types of ligands in the structures (Fig. 5g). These data suggest that the LRR domains adopt a common structure across the receptor family.

First structure-based proposal of receptor activation mechanism
In addition to the detailed description of hormone-receptor interactions, Fan and Hendrickson (2005) proposed a two-step receptor activation mechanism based on the observed FSH-FSHR HB complex dimers in the crystal and also on detected dimer in solution by methods of chemical cross-linking, analytical centrifugation and light scattering. They proposed that once FSH binds to the ectodomain of FSHR, the complex dimerizes via hydrophobic interactions between the hFSHR HB protomers, with residue Y110 at the center of the dimerization interface. The dimerization would relay the hormone-binding signal across the cell membrane and activate the G-protein, unleashing a cascade of down-stream signaling events inside the target cell.
There are two outstanding issues regarding the structural aspect of receptor activation. The first concerns the orientation of the hormone in the complex relative to the 7TM domain of the receptor. Fan and Hendrickson (2005) have suggested that the two hormone a loops (L1a and L3a) are oriented directly towards the 7TM domain. The root of this proposed orientation can be traced back to several earlier publications (Braun et al., 1991; Jiang et al., 1995; Remy et al., 1996). In contrast, several other research groups have placed the b subunit towards the 7TM domain (Duprez et al., 1997; Moyle et al., 2005; Puett et al., 2007; Szkudlinski et al., 2002; Vassart and Costagliola, 2011). The second issue is whether the Fan-Hendrickson dimer is necessary and sufficient for activation of the full-length receptor (Fan and Hendrickson, 2007; Guan et al., 2010; Latif et al., 2010; Ulloa-Aguirre et al., 2007). An FSHR Y110N mutation had no effect on FSH-mediated cAMP production (Guan et al., 2010), implying that the Y110-mediated FSHR dimer is not important for receptor activation. This interpretation, however, has to be treated with caution for two reasons. The FSHR C-terminal fusion protein used to demonstrate dimerization based on BRET was not shown to be biologically active, and FSHR oligomerization was shown to occur in the endoplasmic reticulum with evidence of carboxyl-terminal proteolytic processing (Thomas et al., 2007). On the other hand, it has been suggested (Latif et al., 2010) that TSHR Y116 (corresponding to Y110 in FSHR) plays a critical role in receptor oligomerization.
5. The crystal structure of FSH bound to the entire ectodomain of FSHR

Introduction to the concept of a GPHR 'signal specificity domain'
The so-called 'signal specificity domain' is a stretch of amino acid sequence after the hormone-binding segment and before the transmembrane domain, roughly corresponding to residues L263-R366 in FSHR, N267-R363 in LHR and H271-R418 in TSHR. Other terms for this region include hinge region, hinge domain and C-terminal cysteine-rich region; for TSHR, this region is also called a cleavage domain due to an insertion of a unique 50 amino-acid stretch with two cleavage sites (Mueller et al., 2010). This region had been proposed to form a separate structural unit and be highly flexible, and was described as 'enigmatic' (Grossmann et al., 1998;Moyle et al., 2005;Mueller et al., 2010;Zhang et al., 2000).
Several important residues that greatly contribute to receptor signaling are located in this region. It has been shown that the sulfated tyrosine in the 'hinge' region (Y335 of FSHR, Y331 of LHR and Y385 of TSHR) is required for hormone recognition and signaling. Mutations of a single residue in each of the three GPHRs (S273 of FSHR, S277 of LHR, S281 of TSHR) result in stronger activation of the receptor (Duprez et al., 1997; Gruters et al., 1998; Kopp et al., 1997; Nakabayashi et al., 2000). TSHR mutations of D403, E404, N406 and two consecutive cysteine residues, C283S and/or C284S, led to increased constitutive activation of the receptor (Ho et al., 2001). Stepwise deletion studies revealed that the absence of sequence motif 371-384 before the sulfated tyrosine 385 caused constitutive activation of the TSHR (Mizutori et al., 2008). Despite the demonstrated importance of this region, an understanding of how it functions has been lacking due to the absence of a 3D model of this region.

Earlier models of the 'signal specificity domain'
Several homology modeling studies on this region have been published (Kleinau et al., 2004;Kleinau and Krause, 2009;Majumdar et al., 2012;Miguel et al., 2004;Moyle et al., 2004;Puett et al., 2010;Rapoport and McLachlan, 2007). Kleinau et al. (2004) reasoned early on that the Nogo receptor structure (He et al., 2003) was a more appropriate template than RI for GPHRs and correctly assigned b strands up to LRR11. For the 'hinge' region, they modeled part of that region using the complex of interleukin-8 and its receptor as a template. Two other groups (Miguel et al., 2004;Moyle et al., 2004), in contrast, modeled the whole 'hinge' region as a separate domain by using tissue inhibitor of matrix metalloproteinases 2 and Menkes copper-transporting ATPase as templates, respectively. Retrospectively, these structures were not the best templates for modeling the sequence region. For example, Krause and coworkers (Krause et al., 2012) noticed that none of the published theoretical GPHR models correctly predicted the cysteine-linked helical structural element and the last b-strand, both of which have been revealed by the recently determined crystal structure of FSH bound to the entire ectodomain of its receptor (Jiang et al., 2012).

Architecture of the entire ectodomain of FSHR
To clarify the role of the 'hinge' region in signal transduction as well as to help solve the above mentioned controversies, the structure of FSH bound to the entire ectodomain of the receptor (FSHR ED ) was recently determined (PDB: 4AY9) (Jiang et al., 2012). An expectation was that the juxtamembrane 'hinge' region or signal specificity domain would form a distinct structure apart from the LRR fold of the hormone-binding segment (Gadkari et al., 2007;Kleinau et al., 2004;Latif et al., 2009;Moyle et al., 2005). Instead, this new structure revealed that the 'hinge' region and the N-terminal hormone-binding region form an integral domain (Fig. 6). The signal specificity sequence adds two additional b-strands (LRRs 11 and 12) and a helix to the first 10 LRRs revealed in the Fan-Hendrickson structure. The activation-sensitive residue S273 is located in the unique helix, which engages two other parts of the ectodomain with disulfide bridges. One disulfide bond is formed with the N-terminal hormone-binding LRRs and the other disulfide bond establishes a short chain link to the seven-helical transmembrane domain (7TM). This helix imposes a strong conformational restraint on the ectodomain with respect to the 7TM domain, which is essential for signaling. A long loop insertion, from one end of the pivot helix to the beginning of the last b strand, harbors the sulfated Y335. A long stretch of residues (295-330) is conformationally disordered at the middle of the loop. Neutralizing antibody studies have identified an epitope in this disordered region (Lindau-Shepard et al., 2001). The so-called 'C-peptide' in TSHR, for its unique cleavable 50 amino-acid insertion (Rapoport et al., 1998), is also located within the corresponding disordered stretch.

FSHR mosaic LRRs inform a generalized prediction for curvatures in other proteins with mosaic LRRs
As Fan and Hendrickson (2005) have noticed, the LRR curvatures in FSHR HB are gradually steepened from N-terminus to the C-terminus. This trend is maintained in LRRs 11 and 12 in the 'hinge' region. These 12 inner b strands with progressively steepened curvature form a sleigh-like sheet (Fig. 7a). A question is whether the curvature is coded in the amino acid sequence of each repeat.
The curvature originates from the dimensional difference between the inner and outer surfaces. Fig. 7b illustrates the cause of curvature formation, when the width of an inner element, represented by line AB, is shorter than the width of its counterpart on the outer surface, represented by line CD. For LRRs, the dimension of each inner b strand is similar, because its backbone is in an almost fully extended conformation. The b strands are adjacent to each other, forming a b sheet with an extensive hydrogen-bond network in the backbone of adjacent strands. Side chains often point in or out perpendicularly to the sheet in an alternating fashion; thus, the side chains occupy the sheet surface in an economical manner, and the 'sideways' distance between adjacent Ca atoms in hydrogen-bonded b strands is about 5 Å (Creighton, 1992). This short dimension is the reason that b strands are frequently found in the inner surface of LRRs. With the inner surface width fixed, the width of the LRR outer surface will determine the curvature. The wider the outer surface, the steeper the LRR curvature.
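The geometric argument above can be made explicit with a simple idealization, offered here as a sketch and not as a result from the cited structures. Assume each repeat occupies a wedge between two concentric arcs; the symbols below (inner radius r, radial separation h, per-repeat angle θ, repeats per circle N) are introduced for illustration only, with w_in and w_out standing for the widths AB and CD.

```latex
% Assumed symbols (not in the original text):
%   w_in  = |AB|, inner width per repeat (about 5 \AA for b strands)
%   w_out = |CD|, outer width per repeat
%   h     = radial separation between inner and outer surfaces
%   theta = angle subtended by one repeat, N = repeats per full circle
\begin{align}
  w_{\mathrm{in}} &= r\,\theta, \qquad w_{\mathrm{out}} = (r + h)\,\theta \\
  \theta &= \frac{w_{\mathrm{out}} - w_{\mathrm{in}}}{h} \\
  N &= \frac{2\pi}{\theta} = \frac{2\pi\,h}{w_{\mathrm{out}} - w_{\mathrm{in}}}
\end{align}
```

Under these assumptions, with w_in fixed, a larger w_out gives a larger θ and a smaller N, which is the statement that a wider outer surface produces a steeper curvature. The idealization ignores the b-sheet twist and treats the widths as arc lengths.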
A number of descriptors have been used to calculate the curvature of a general surface (Sternberg, 2012). Several groups applied the concept to biological molecules (Coleman et al., 2005;Goodsell and Dickerson, 1994;Koh et al., 2006). For b-sheets, the twist property further complicates a curvature calculation (Weatherford and Salemme, 1979). The analysis of LRR curvature of parallel b-sheets involves first trimming away flanking regions or other irregular elements from each LRR. This leaves an isolated b sheet, constituting part of a circle. This partial circle is then used as a building block to construct a whole circle. The number of b strands for a full circle is counted by looking down the super b sheet along the twisting axis. Using this empirical approach, the following paragraphs will describe the curvature of a LRR structure in terms of number of b strands to complete a circle where the bigger the number, the flatter the surface. A positive number means a concaved b sheet, and a negative number depicts a convex b surface. Left out is a description of the degree of twist as that is beyond the scope of this article.
The 11 complete LRRs in FSHR can be categorized into three groups based on their sequences in the outer surface segments (Fig. 7c). The LRRs in the first group include LRRs 2-4 and 6-7. The sequence motif of this group in the outer surface segment is IxxxAF (AF motif), where 'A' is sometimes substituted by other small residues. The prototypic LRR in this category is in the Nogo receptor (He et al., 2003), characterized by its relative flat curvature. It would take 42 such repeats to complete a circle (Fig. 7d). The second group contains LRRs 1, 5, 8-10, and its sequence motif in the outer surface segment is LPxxL (LP motif). The prototype of this LRR structure is in the platelet-receptor glycoprotein Iba (gpIba) (Huizinga et al., 2002). Its curvature is steeper than that of the first group, requiring 28 repeats to complete a circle. The LRR 11 belongs to the third group where it contains an alpha helix. As the a-helix in the outer surface of the repeat has a wider dimension, it would only take 22 repeats to complete a circle, making the curvature of this type of LRRs the steepest among the three groups. Thus, the gradually steepening curvature of the ectodomain is the result of the sequential mosaics of different types of LRRs in the ectodomain of the receptor.
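The 'repeats per circle' figures quoted above translate directly into a bend angle per repeat (360 degrees divided by the number of repeats). The sketch below applies this to the grouping of FSHR LRRs 1-11 given in the text. It is only a rough illustration: it assumes each FSHR repeat bends exactly like its prototype and it ignores the b-sheet twist, so the cumulative value is not a measured property of the structure. All names in the code are hypothetical.

```python
# Repeats needed to close a circle for each LRR type, as quoted in the text.
REPEATS_PER_CIRCLE = {"AF": 42, "LP": 28, "helix": 22}

# Group assignment of FSHR LRRs 1-11 as described in the text (Fig. 7c).
FSHR_LRR_TYPE = {
    1: "LP", 2: "AF", 3: "AF", 4: "AF", 5: "LP", 6: "AF",
    7: "AF", 8: "LP", 9: "LP", 10: "LP", 11: "helix",
}


def degrees_per_repeat(lrr_type):
    """Bend contributed by one repeat if N such repeats close a full circle."""
    return 360.0 / REPEATS_PER_CIRCLE[lrr_type]


def cumulative_bend(type_by_lrr):
    """Running total of the bend (degrees, rounded to 0.1) along the repeats."""
    total = 0.0
    running = []
    for lrr in sorted(type_by_lrr):
        total += degrees_per_repeat(type_by_lrr[lrr])
        running.append((lrr, round(total, 1)))
    return running


if __name__ == "__main__":
    for lrr, bend in cumulative_bend(FSHR_LRR_TYPE):
        print(f"LRR{lrr}: {bend} degrees")
```

The per-repeat increment is smallest for the AF type and largest for the helix-containing type, so the running total rises faster towards the C-terminal repeats, in line with the steepening described in the text.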
To test whether the different curvatures are due to the difference in sequence motifs or to different overall sequence environments, the PDB database (Bernstein et al., 1977) was searched for other LRR examples containing these two motifs. We found an LRR structure containing both the LP and AF motifs (PDB: 4FS7). This LRR protein (BACOVA_04585) from Bacteroides ovatus contains the two motifs on opposite sides of each repeat, where the 'spine'-forming phenylalanines (He et al., 2003) oppose the leucines in the hydrophobic core of the LRR structure (Fig. 7e). The residue after the consensus b strand leucine residue is proline (Fig. 7e and f). A combination of the requirement that the leucine side chain point straight towards the interior hydrophobic core and the restraint of the proline main-chain conformation (Anderson et al., 2005) predisposes the proline rings to orient sideways, demanding wider spaces for the associated b strands than for normal b strands. When the LP-motif-containing b strand is wider than the AF-motif-containing b-turns on the opposite side, it creates an 'inverted' LRR structure where the parallel b-sheets are on the outer convex side instead of the inner concave surface found in a typical LRR structure (Fig. 7e).
Careful examination of the 4FS7 structure led to other interesting observations. The leucine and phenylalanine residues in LRR7 (Fig. 7e) are reversed, as compared to those in LRRs 1-6.
Since the phenylalanine side chain is larger than a leucine side chain, the dimensional difference between the FP and AL motifs is larger than the difference between the typical LP and AF motifs, creating steeper 'inverse' curvature. The space demanded for phenylalanine in the LP motif is on the right while that for the proline is on the left when the b strand arrows are pointed down and the proline residues are in the b strands. Indeed, the b strands between LRRs 7 and 8 are wider than others due to F235 in LRR7 and P258 in LRR8 (Fig. 7e). In contrast, LRRs 9-11 lack the proline residues found in LRRs 1-8 and 12-13. Without the proline residues, the space requirements for the b strands are smaller. Consequently, the local b sheet for LRRs 8-11 is on a 'normal' concave surface (the same as the 'Phe-spine' LRRs in the Nogo receptor), within an environment of the overall 'inverted' convex surface. These observations strengthen the notion that the AF and LP motifs can explain the steepening curvatures in FSHR LRRs.
Fig. 7 (continued). Cyan is for residues within the inner parallel b-sheet, yellow for those on the upper rim, grey for outer segments and pink for the lower rim. The a-helix residues in LRR11 are shown in bold and underlined. Dashes indicate gaps introduced for the alignment, and the symbol (hps) denotes an inserted 59-residue hairpin sequence that is omitted in the sequence alignment. (d) Illustration of curvature of three different LRR types from reconstructed circles of b strands. In each type, the parallel b strands of a typical inner-surface segment are extracted from the corresponding structure and the arc segment is repeated until it completes a smooth circle. In cyan is the reconstructed circle from the b strands of the sequence T84 to L233 of the Nogo receptor (pdb code: 1OZN), in green from the b strands of the sequence T58 to L202 of the platelet-receptor glycoprotein Iba (gpIba) (pdb code: 1P8V), and in magenta from the b strands of the sequence Q184 to N236 of the S-phase kinase-associated protein 2 (PDB: 2AST). (e) Crystal structure of the Bacteroides ovatus hypothetical LRR protein (PDB: 4FS7). The N-terminal flanking cap is colored grey and the LRR structure is colored green. Shown in yellow sticks are the side chains in the consensus sequence (Fig. 7f) residues LP or F. (f) Sequence alignment and secondary structure profile for the leucine-rich repeats of the Bacteroides ovatus protein. Consensus L/V/I/F/C residues are shaded in grey. Other consensus residues are colored cyan. The dashes denote gaps. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Furthermore, based on the above analysis, the following types of LRR elements with increasing dimensions can be proposed: Identification of such LRR element types may be useful when modeling new LRR structures, particularly when attempting to predict their curvatures. For example, suppose there is a new LRR protein with 5 repeats of Nogo receptor type, followed by 3 repeats of gpIba type, 3 repeats of 4FS7 typical LRRs and 2 repeats of RI type. Traditional modeling approaches, such as threading, would fail to generate a 3D model with a high confidence. Yet, with the identification of each LRR element as illustrated above, the model can be easily constructed by incorporating relevant LRR modules from appropriate structures and linking these modules by superimposing the joining repeats to mosaic modules such as the LRRs from 4FS7 and FSHR ED . One example: joining the Nogo receptor LRRs and gpIba LRRs can be done by simply superimposing the last repeat from Nogo LRR to LRR7 of FSHR ED and the first repeat of gpIba LRR to LRR8 of FSHR ED .

FSH-FSHR ED has a broader interface than FSH-FSHR HB
A significant but not unexpected finding of the new FSH-FSHR ED structure was a greater buried solvent-accessible surface area (SASA) and a more rigid FSH conformation, suggesting that FSH interacts with FSHR ED more tightly than with FSHR HB . The conformations of FSH in the FSHR ED -bound form and in the FSHR HB -bound form are almost identical to each other (overall Ca r.m.s.d. of 0.5 Å between the two forms). However, a small but noticeable 0.4 Å translation of FSH horizontally towards FSHR ED , as compared to FSH-FSHR HB (Fig. 5f), resulted in the greater buried SASA and more rigid FSH conformation. The buried SASA in the FSH-FSHR HB interface is 2600 Å²; this area is increased by 1000 Å² to 3600 Å² in the FSH-FSHR ED interface. When the same set of atoms as in the FSH-FSHR HB complex was used in the calculation, the buried SASA in the FSH-FSHR ED interface was 2850 Å². Therefore, the 250 Å² difference comes from the horizontal translation of FSH towards the receptor within FSH-FSHR ED as compared to FSH-FSHR HB . We further calculated the overall Ca r.m.s.d. of FSH in the FSHR ED -bound structure, which is 0.2 Å as compared to 0.7 Å in the FSHR HB -bound structure. Therefore, the FSH conformation in the FSHR ED -bound form has been further stabilized compared to the FSHR HB -bound form. There is little conformational change in the hormone-binding subdomain of the receptor (overall Ca r.m.s.d. of 0.6 Å for residues C18-Y250 in the two forms), except for a few C-terminal FSHR HB residues whose conformation is disrupted due to the lack of the integral 'hinge' subdomain (Fig. 5h). The conformational variations of FSHR ED residues are mainly from the hairpin loop within the 'hinge' subdomain, due to the absence of the 7TM domain that might stabilize the loop. The overall Ca r.m.s.d. is 1.3 Å (residues C18-I359) as compared to 0.6 Å for the N-terminal subdomain (C18-Y250).
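The buried-SASA bookkeeping in this paragraph can be laid out as a short calculation. The three input values are those quoted in the text; the function and variable names are hypothetical. Attributing the remainder of the gain to atoms that lie outside the FSH-FSHR HB atom set is an inference from the arithmetic and is not stated explicitly in the text.

```python
# Buried SASA values (Å^2) quoted in the text.
BURIED_SASA_HB = 2600.0             # FSH-FSHR HB interface
BURIED_SASA_ED = 3600.0             # FSH-FSHR ED interface, all atoms
BURIED_SASA_ED_SAME_ATOMS = 2850.0  # FSH-FSHR ED, restricted to the HB atom set


def decompose_sasa_gain(hb, ed, ed_same_atoms):
    """Split the increase in buried SASA into two contributions.

    from_translation: change seen on the identical atom set, attributed in the
        text to the shift of FSH towards the receptor.
    from_additional_atoms: the remainder, i.e. burial involving atoms absent
        from the HB atom set (an inference, labeled as such).
    """
    return {
        "total_gain": ed - hb,
        "from_translation": ed_same_atoms - hb,
        "from_additional_atoms": ed - ed_same_atoms,
    }


if __name__ == "__main__":
    print(decompose_sasa_gain(BURIED_SASA_HB, BURIED_SASA_ED,
                              BURIED_SASA_ED_SAME_ATOMS))
```

On these numbers, one quarter of the 1000 Å² gain is accounted for by the translation and the remaining three quarters by the additional atoms.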

A second FSH-FSHR interaction site
The driving force for tighter FSH interaction with FSHR ED (as compared to FSHR HB ) is mainly from the second FSH-FSHR interaction site which contains a sulfated tyrosine site. A sulfated tyrosine in the 'hinge' region of the GPHRs (FSHR Y335, LHR Y331, TSHR Y385) has been shown in several functional studies to be indispensable for hormone recognition and signaling (Bonomi et al., 2006;Bruysters et al., 2008;Costagliola et al., 2002). The new structure (Jiang et al., 2012) reveals a detailed interaction between FSH, the sulfated tyrosine (sTyr) and FSHR residues in its vicinity. Buried in the interface are residues P290-I291, N293, E332, D334, sTyr335 and L337 from FSHR and N15-F18a, Q27a, F74a, R35b, L37-Y39b, P45b and K49b from FSH. Multiple hydrogen bonds are found between the sTyr335 sulfate group and residues (N15a, Q27a, V38b and Y39b) at the sTyr-binding-pocket in FSH. There is also a salt bridge formed between E332 of FSHR and K49b of FSH (Fig. 8a).
The sTyr binding pocket is formed at the interface between the FSH a and b chains (Fig. 8a). Hydrophobic residues dominate at the bottom of the pocket, where the a subunit is the main contributor. The top of the pocket is a positive electrostatic surface, contributed by b chain short-range residue R35 and long-range residue K49 (Fig. 8b). The multiple hydrophobic residues build the low dielectric-constant microenvironment necessary to accommodate the hydrophobic phenyl ring of Y335 and to enhance the Coulomb charge-charge attraction between the sulfate ion and the positive potentials lining the ceiling of the pocket. FSH changes a loop conformation dramatically in order to form the sTyr-binding pocket. The L2b loop (residues from V38b to Q48b), roughly half of the building block of the pocket, swings 10 Å from the open and loose conformation in the free form to the closed and rigid conformation in the receptor-bound form (Fig. 8c). Interestingly, the L2b loop conformations in the two FSHR-bound forms (i.e., in the FSH-FSHR HB and FSH-FSHR ED complexes) are nearly identical (Fig. 8c and d). The formation of the sTyr pocket is induced by the act of binding, independent of the presence of the sulfated tyrosine, because FSHR HB lacks the sTyr residue, yet the binding pocket on FSH has already been formed in the FSH-FSHR HB complex. Interestingly, a sulfate ion occupied the sTyr pocket on the FSHR HB -bound FSH when 0.1 M Li2SO4 was used in the crystallization buffer (Fan and Hendrickson, 2005). In contrast, the sulfate ion was not present in the pocket in the free-form FSH when a higher concentration of 0.9-1.2 M ammonium sulfate was used in the crystallization (Fox et al., 2001). No sulfate-containing salt was used to obtain the FSH-FSHR ED crystals (Jiang et al., 2012). These data indicate that the act of FSH binding to the hormone-binding subdomain of FSHR is sufficient to form the sTyr-binding pocket.
In addition to interacting with sTyr335 and its neighboring residues, FSH residues in the pocket-forming L2β loop interact considerably with FSHR residues on the inner LRR concave surface (Fig. 8a, left inset). Buried in this interface are residues D196-E197, V221-I222, K242-K243, R245 and M265 of FSHR and K40-R44 and K46 of FSHβ. Among these residues, FSHR E197 salt-bridges with FSH R44β, and E197 and K243 of FSHR form hydrogen bonds with R44 and A43 of FSHβ, respectively.
It is worth pointing out that there are two extra residues (K44-Y45) in the TSH β chain which, if present in the FSH β chain, would reside between residues R44 and P45. These extra residues are predicted to be buried in the complex formed with TSHR; therefore, the hormone-receptor interface in this region would be more extensive for the TSH-TSHR complex than for the FSH-FSHR or LH/CG-LHR complexes. This is consistent with the notion that TSHR residues encoded by exons beyond exon 7 play significant roles in hormone binding (Kosugi et al., 1991; Mizutori et al., 2008; Mueller et al., 2008), in contrast to the insignificant gonadotropin-binding roles of the corresponding residues in LHR or FSHR (Braun et al., 1991; Moyle et al., 1994). It should also be noted that, although an earlier molecular modeling study of the TSH-TSHR complex (Miguel et al., 2008) suggested that TSHR residue E251 (corresponding to FSHR K243 and LHR R247) contributed to hormone binding by forming a salt bridge with TSH residue K44β, the E251K mutant had little effect on TSH binding but reduced signal transduction in terms of cAMP production.
6. Structure-based receptor activation mechanism in the hormone-receptor monomer complex

Significance of the 'hinge' disulfide bonds in signal transduction
The ectodomain contains two peculiar sequence motifs coupled by three disulfide bonds. The three disulfide bonds and the sequence motifs in the 'hinge' region are remarkably well positioned to transduce the signal from the hormone-binding subdomain to the transmembrane domain. Residues L264-R298 correspond to the previously identified sequence motif CF3, known as a C-terminal LRR capping domain, which exists uniquely in GPCR proteins (Kajava, 1998). The second sequence motif consists of residues Y322-R366, immediately before the beginning of the transmembrane domain. The intronic sequences at the analogous position of LHR residue Y317 correspond to the promoter and other regulatory regions of the intronless genes of other G-protein-coupled receptors, such as the β-adrenergic receptor and rhodopsin (Gromoll et al., 1996; Koo et al., 1991). Thus, the second motif of GPHRs corresponds to the N-terminal extracellular loop of the rhodopsin-like domain. The three cysteine residues of the CF3 motif (FSHR: C275, C276, C292) are linked respectively by disulfide bonds to the cysteines of the rhodopsin-like extracellular loop (FSHR: C346, C356, C338) (Bruysters et al., 2008; Jiang et al., 2012). In this respect, the two chains in TSHR after protease cleavage (Rapoport and McLachlan, 2007) are equivalent to two separate protein molecules linked by three disulfide bridges: one is an LRR protein and the other the rhodopsin-like protein, supporting the proposed model for the evolution of diverse LRR-containing genes (Hsu et al., 1998). Among the three disulfide bonds, the two in the helix play a pivotal role in transducing the hormone-binding signal to the transmembrane domain of the receptor. The helix is locked into the space between the hormone-binding subdomain on the N-terminal side and the last short loop before the first transmembrane helix on the C-terminal side by two disulfide bonds, one on each side. Signals occurring at the N-terminus can be faithfully transduced via this locked helix (see the proposed mechanism below).

Proposed receptor activation mechanism
As discussed earlier, the sTyr-binding pocket on FSH does not exist in the free form. Rather, it is formed after the hormone binds to its receptor, whether or not the 'hinge' domain is present. This knowledge led to the following proposal of a two-step receptor activation mechanism (Jiang et al., 2012), shown schematically in Fig. 9.
When FSH approaches the large, high-affinity inner concave surface of the hormone-binding subdomain (LRRs 1-8) of FSHR, the initial high-affinity interaction causes the FSH L2β loop to adopt the "swung in" conformation, leading to an additional hydrophobic interaction between the L2β loop of FSH and FSHR residues around the β strands of LRR 8/9, as well as the formation of an sTyr-binding pocket at the interface of the FSH α- and β-subunits (Fig. 8a). Then, FSH uses the nascent pocket to draw in the sulfated Y335. Binding of the sulfated Y335 to the sTyr pocket of FSH lifts the hairpin loop linked by the disulfide bond between C338 and C292. Lifting the hairpin loop relieves its inhibitory effect and activates the 7TM domain.
The LRR11 helix hosts two consecutive cysteine residues (C275 and C276) that play an important role in receptor activation. The disulfide bond formed by C275 and C346 fastens the last LRR β strand to the helix to form a rigid body. The hairpin loop looks like a purse string, with one end attached to the helix and the other to the β strand. The other disulfide bond, formed by C276 and C356, ties the helix to the last few residues before the first transmembrane helix (TM1) (Fig. 6a). Because of these constraints, and the rigidity of the LRR domain, movement of the last β strand, whether by lifting the hairpin loop directly or via other interactions, will be passed on to the residues on or close to TM1 on the other side of the helix. Conceivably, the combination of the lift of the hairpin loop and rotation of the helix would lead to a conformational change of the transmembrane domain and activation of the receptor. This activation mechanism could be mimicked by a simple rotation of the helix, such as that caused by switching the hydrophilic S273 to a hydrophobic residue. Indeed, the S273I mutation leads to constitutive activation of the receptor. In addition, the constitutively activating mutations in the ectodomain are concentrated on or around the pivotal helix residues (Krause et al., 2012), including the two consecutive cysteine residues (Ho et al., 2001). The above activation model for monomeric receptors is consistent with a number of experimental observations. The gonadotropin α-chain mutations Q13K, E14K, P16K, and Q20K convert human TSH into a superagonist (Szkudlinski et al., 1996); these mutations are concentrated near the top right side of the pocket, generating additional positive charges for a stronger anodic potential to pull the hairpin loop further to the top right (Fig. 8b).
Various deletion experiments on the extracellular portion resulted in partial activation of FSHR and TSHR, leading to the proposal that there is an extracellular "intramolecular tethered inverse agonist" that suppresses the constitutive activity of the 7TM domain (Chen et al., 2003; Ho et al., 2005; Vlaeminck-Guillem et al., 2002; Zhang et al., 2000). The "tethered inverse agonist" region in the ectodomain has been further mapped to the hairpin segment 296-331 in FSHR (Agrawal and Dighe, 2009). This proposal has been further supported by the enhanced signaling of two FSH mutants that were designed to push the hairpin loop up towards the ceiling of the sTyr pocket (Fig. 8b) (Jiang et al., 2012). We also noticed that the charges of a neighboring pair of residues, K243 and E266 in FSHR, are retained in LHR (corresponding to R247 and E270) but reversed in TSHR (corresponding to E251 and R274). These residues are near the lower left corner of the sTyr pocket in the complex structure. A TSHR E266K mutation would enhance the electropositive potential on the lower left side of the pocket and make the sTyr less likely to be pulled to the top right side. This might explain the reduced signaling activity of the equivalent E251K mutant of TSHR.

Towards understanding the 7TM domain
Understanding the 7TM domain is important because it is where the receptor transduces the extracellular signal across the membrane into the cell. For pharmaceutical companies developing medicines that target the GPHRs, the goal is a generation of orally bioavailable, non-peptide small molecules with a MW of about 500 Da or less that induce signal transduction and other biological functions similar to those of GPHs. To date, over 170 small molecules targeting GPHRs are listed in Thomson Reuters' Integrity Database (http://integrity.thomsonpharma.com). While the small-molecule mimetics might not all bind to the 7TM domain, the binding region of three tested molecules has been mapped to the 7TM domain (Bruysters et al., 2008; van Koppen et al., 2013; Yanofsky et al., 2006). Using a 7TM-binding small molecule as a tool, Bruysters et al. (2008) successfully identified the three disulfide-bonded cysteine pairs in the 'hinge' region. These studies demonstrate that the small molecules are allosteric modulators that bind directly to the 7TM domain rather than to the LRR domain where GPHs bind. In this regard, a 7TM structure is highly desirable, not only to gain insight into GPH signaling but also to assist the design of orally bioavailable drugs.

Molecular modeling of the 7TM domain
In light of the large number of GPCR crystal structures determined in recent years, effort has been made to model the 7TM domains of GPHRs (Kleinau et al., 2013; Puett et al., 2010). The quality of these theoretical models depends greatly on the choice of template structures and the sequence alignment of the subject GPHR to the templates. GPHRs are members of the Family A GPCRs, which have been further divided into several subfamilies. There have been conflicting classifications for GPHRs. GPHRs have been placed into the δ subfamily, which includes olfactory receptors (Fredriksson et al., 2003). This classification is posted on the widely visited GPCR network web site (http://gpcr.scripps.edu/) (Katritch et al., 2012). Featured on the web site is a recently determined crystal structure of protease-activated receptor 1 (PAR1) in the δ subfamily (PDB: 3VW7) (Zhang et al., 2012). The sequence alignment of GPHRs to PAR1, however, does not indicate that the PAR1 structure would be a good template, because the percent sequence identity between GPHRs and PAR1 is at a random level. Another study (Bjarnadottir et al., 2006) applied a similar phylogenetic analysis to the 7TM regions of GPCRs and yielded a similar picture globally but different results for GPHR members: GPHRs were classified into the γ subfamily and olfactory receptors into another standalone subfamily. An alternative classification (Joost and Methner, 2002) has categorized GPHRs into an A10 subfamily. This classification is posted on a Wikipedia page (wiki entry: Rhodopsin-like receptors) and adopted by a popular GPCR database web site (www.gpcr.org). There are other classifications. In two highly cited articles, GPHRs are placed into the 1C subfamily (Bockaert and Pin, 1999) or grouped with opioid receptors (Gether, 2000). Interestingly, opioid receptors and GPHRs belong to two different subfamilies in two classifications. How to categorize the GPHRs within the rhodopsin-like GPCR family thus seems to be an unsettled matter, presumably because members are difficult to group when the sequence identity level is in the twilight zone (Rost, 1999).

Fig. 9. Schematic diagram of the proposed two-step receptor activation mechanism of FSHR monomer. The FSHR extracellular LRRs, in a putative orientation relative to the seven-transmembrane (7TM) domain, are shown as magenta blocks with a hairpin loop, and the 7TM domain is shown as a cylinder with the inactivated state colored gray and the activated state colored green. The hormone-binding subdomain is labeled as HBSD, and the signal specificity subdomain is labeled as SSSD. Sulfated Y335 is shown as a yellow ball, residue S271 is shown as a green star, and disulfide bonds as yellow jagged lines. Heterotrimeric Gs or β-arrestin is indicated by a green ellipsoid. FSH is shown as a blue ellipsoid. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 10. Molecular modeling of the 7TM domains. (a) Sequence alignment of the 7TM domains of GPHRs and representative GPCRs with known crystal structures. Conserved residues across all members are colored green except for two conserved disulfide-bridging cysteine residues, which are colored yellow. Residues that are conserved in the representative GPCRs are colored magenta, and those that are only conserved in the GPHR family are colored cyan. Residues in the representative GPCR set are also colored cyan if most of them are identical to the corresponding conserved GPHR residues. Residues that are conserved in some of the sequences in both GPHRs and the representative GPCR set are shaded in grey. Gaps are represented by dash "-" symbols, and omitted residues are denoted by the symbol "SS". The position of a transmembrane helix is marked as "TM" followed by a number. The last helix is marked as "helix8". Abbreviations are: ADRB2 (human β2 adrenergic receptor), OPSD (bovine rhodopsin), AA2AR (human adenosine receptor A2a), DRD3 (human D3 dopamine receptor), OPRM1 (human μ-type opioid receptor) and PAR1 (human proteinase-activated receptor 1). (b) Templates used in the construction of GPHR models. Modeled residues in the GPHR sequences and a template sequence are colored identically, where green is for ADRB2, yellow for OPSD and magenta for AA2AR, except for residues identical in both a GPHR sequence and the template sequence, which are grey shaded. Loop positions are marked as either "ECL" (extracellular) or "ICL" (intracellular) followed by a number. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Rather than treating all residues equally, as is done in sequence identity calculations, one strategy to improve the accuracy of a sequence alignment is to focus on a few key conserved amino acids. It has been observed that the conserved residues in Family A include an aspartic acid at the N-terminal end of the TM2 helix, a DRY motif near the C-terminal end of the TM3 helix, a proline in the middle of TM4, a WxP motif in the middle of TM6, an NP motif at the end of TM7, a disulfide bridge that connects the first and second extracellular loops (ECL1 and ECL2), and a palmitoylated cysteine in the C-terminal tail (George et al., 2002). A multiple sequence alignment is shown in Fig. 10a with the key residues highlighted. The molecular model of the FSHR 7TM domain was constructed by using one structure as a master template and other structures as supplementary templates, to take advantage of the most suitable structure in each sequence region. The β2 adrenergic receptor (β2AR) was chosen as the master template for the following reasons. First, it shares as many residues with GPHR 7TM domains as any other template structure according to the alignment (Fig. 10a). Second, β2AR is the only GPCR with a known structure in an active state in complex with the G-protein αβγ heterotrimer (PDB: 3SN6) (Rasmussen et al., 2011). Third, β2AR binds external agonists or antagonists; therefore, it is an appropriate model for drug discovery. Rhodopsin was considered but excluded as the master template for three reasons: (1) it contains a pre-bound ligand, unlike GPHRs or any other known GPCRs; (2) several key residues are not present in the rhodopsin sequence, including GN in TM1, serine in TM3 and the NS motif in TM7; (3) no human rhodopsin structure is available.
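The idea of weighting key conserved positions more heavily than ordinary positions can be sketched as follows. This is only an illustration of the principle, not the procedure used to build Fig. 10a: the function name, the weight of 5, and the toy six-residue sequences are hypothetical.

```python
def weighted_identity(seq_a, seq_b, key_columns, key_weight=5.0):
    """Score two pre-aligned sequences, up-weighting key conserved columns.

    seq_a, seq_b : aligned strings of equal length; '-' marks a gap.
    key_columns  : set of 0-based alignment columns treated as key residues.
    key_weight   : weight of a key column (ordinary columns weigh 1.0);
                   the default is an arbitrary illustrative value.
    Returns the matched weight divided by the total weight, ignoring
    columns where both sequences have a gap.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matched = 0.0
    total = 0.0
    for column, (a, b) in enumerate(zip(seq_a, seq_b)):
        if a == "-" and b == "-":
            continue  # column carries no information for this pair
        weight = key_weight if column in key_columns else 1.0
        total += weight
        if a == b:
            matched += weight
    return matched / total if total else 0.0


if __name__ == "__main__":
    # Toy sequences (hypothetical); columns 1-3 stand in for a DRY-like motif.
    subject, template = "ADRYLP", "GDRYMP"
    print("plain identity:   ", weighted_identity(subject, template, set()))
    print("weighted identity:", weighted_identity(subject, template, {1, 2, 3}))
```

With the key columns up-weighted, a candidate alignment that keeps the motif in register scores higher than one that merely maximizes the raw count of identical residues.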
Once the master template was chosen, the next step was to choose appropriate supplementary templates. Several factors were considered when choosing one supplementary template over the others for a given region: (1) minimum gap numbers and lengths, (2) maximum sequence identity with GPHRs, and (3) the fewest supplementary templates, so that potential coordinate incompatibilities among the templates are reduced. Thus, rhodopsin and the A2A adenosine receptor were chosen as the supplementary templates (Fig. 10b). Because inactive-state GPCR structures are more widely available, we first constructed the FSHR 7TM domain in an inactive state (Fig. 11a). The active-state model was then constructed by moving residues from their inactive-state positions to the active-state positions according to the β2AR structure (PDB: 3SN6) (Fig. 11b). The LHR and TSHR 7TM domains were then constructed by using the FSHR model as the template.
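The three selection factors can be expressed as a simple ranking rule. The sketch below is hypothetical: the text does not say how the criteria were prioritized against one another, so the ordering used here (gaps first, then identity, then reuse of an already chosen template) is an assumption, as are the class name, the function name and the placeholder candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Alignment statistics of one candidate template for one region."""
    name: str
    gap_count: int
    gap_length: int
    identity: float  # fraction of identical residues, 0.0 to 1.0


def pick_template(candidates, already_used=frozenset()):
    """Pick the candidate with the fewest and shortest gaps, then the highest
    identity; ties go to a template that is already in use, which keeps the
    total number of templates small. The priority order is an assumption."""
    def sort_key(c):
        return (c.gap_count, c.gap_length, -c.identity,
                c.name not in already_used)
    return min(candidates, key=sort_key)


if __name__ == "__main__":
    # Placeholder values for illustration only, not real alignment data.
    region_candidates = [
        Candidate("template_A", gap_count=1, gap_length=3, identity=0.30),
        Candidate("template_B", gap_count=0, gap_length=0, identity=0.25),
        Candidate("template_C", gap_count=0, gap_length=0, identity=0.25),
    ]
    chosen = pick_template(region_candidates, already_used={"template_C"})
    print(chosen.name)
```

In this toy case the gap-free candidates beat the higher-identity one, and the tie between them is broken in favor of the template already used elsewhere in the model.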
Construction of the FSHR 7TM domain in the inactive state followed the priority of preserving residue conformations in the templates, in order from the key residues (colored green and cyan across the subfamilies in Fig. 10a) to other conserved residues (colored yellow or grey in Fig. 10a) to the remaining residues. First, the atomic coordinates of the key residues were copied from the templates. Next, the coordinates of the other conserved residues were copied directly from the templates. No atomic clashes were found in the combined model of key residues and other conserved residues. For each of the remaining TM residues, a side-chain rotamer was selected if its conformation overlapped well with that of the corresponding template residue. On a few occasions, an alternative rotamer was selected when the previously selected conformation caused atomic clashes. The helix8 residues, as well as the residues linking it to TM7, were then added. The connecting loops were constructed last, in the order ECL3, intracellular loop 2 (ICL2), ICL1, ECL2, ICL3 and ECL1. The resulting model was subjected to energy minimization in the reverse order of construction: the coordinates of the ECL1 loop were minimized first while keeping the remaining residues fixed, and those of the key residues were minimized last. Finally, the entire model was globally minimized.
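The build order and the reversed minimization order described above can be written down as a small schedule. This is a bookkeeping sketch only: the group labels paraphrase the text, the function name is hypothetical, and no actual energy minimization is performed.

```python
# Groups in the order in which they were built, as described in the text.
BUILD_ORDER = [
    "key residues",
    "other conserved residues",
    "remaining TM residues",
    "helix8 and TM7 linker",
    "ECL3",
    "ICL2",
    "ICL1",
    "ECL2",
    "ICL3",
    "ECL1",
]


def minimization_schedule(build_order):
    """Return (group_to_minimize, what_stays_fixed) stages.

    Groups are released one at a time in the reverse of the build order,
    so the least reliable parts relax first; a final global stage follows.
    """
    stages = [(group, "all other residues fixed")
              for group in reversed(build_order)]
    stages.append(("entire model", "nothing fixed"))
    return stages


if __name__ == "__main__":
    for step, (group, fixed) in enumerate(minimization_schedule(BUILD_ORDER), 1):
        print(f"stage {step}: minimize {group} ({fixed})")
```

Minimizing in reverse build order means that the regions copied most directly from the templates are perturbed last, after the less reliable loops have already been relaxed.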
Our 7TM models appear similar to those from earlier studies, but there are a few noticeable differences. The side chains of I568 and I640 of TSHR contact each other in the model from Krause's group (Kleinau et al., 2007). This is not the case in our model. R464 and D564 in LHR form a salt bridge in another group's model (Puett et al., 2010); this salt bridge is not present in our model, although its likelihood increases when PAR1 (PDB: 3VW7) is used as the supplementary template for modeling the ICL3 region.
The 7TM extracellular loops may interact with the high concentration of charged residues in the hairpin loop of the 'hinge' region (Fig. 11e). Conservative mutations in the hairpin loop, E297Q or D382N in TSHR, caused a 50% drop in TSH binding but increased affinity from 80 nM to 40 nM (Mueller et al., 2008). The E297A, D382A, E297K and D382K mutations yielded similar results. The E297D and D382E mutations largely retained full wild-type binding ability but reduced ligand affinity from 80 nM to 180 nM (Mueller et al., 2008). These data imply that the conformations of these two residues must be restrained in the inactive state, since they are sensitive to even the most conservative mutations. It is possible that these residues are tethered to the 7TM extracellular loops via one or more positively charged residues such as K565, K651, H478 and H484. Perturbation of the "tethering bonds" would loosen the local hairpin-loop conformation, causing a local blockage and leading to reduced TSH binding. The important roles of the ECL2 and ECL1 loops in GPCR activation have been shown in several studies. Alanine scanning of the ECL2 loop residues of TSHR revealed that two residues, Y563 and K565 (corresponding to Y511 and K513 of FSHR), are important in receptor activation (Kleinau et al., 2007). A study of the complement factor 5a receptor (C5aR) suggests that the ECL2 loop controls the on-off transition for receptor activation (Klco et al., 2005; Massotte and Kieffer, 2005). Another study suggests that the ECL1 loop plays an essential role in activating the adenosine A2B receptor (Peeters et al., 2011). It is worth noting that the sequence region from E297 to D382 of TSHR roughly corresponds to the mapped "tethered inverse agonist" region 296-331 in FSHR (Agrawal and Dighe, 2009).
One caveat of the modeling practice is the assumption that all Family A GPCR structures are similar enough for their coordinates to be copied. Although all the known GPCR structures do look similar, there are substantial differences among them; for example, β2AR is quite different from protease-activated receptor 1. Nevertheless, a 3D model is a useful tool for understanding biological functions as long as one knows its limitations and weighs the reliability of each modeled region with regard to the templates. Because the key-residue coordinates are superimposable in most of the known GPCR structures, they are likely to be reliable, assuming that proper templates were chosen. The side-chain conformations of other conserved residues are generally considered reliable, especially for residues within the helices, but less so than those of the key residues. The main-chain conformations of the remaining helical residues are considered reliable, but their side-chain conformations have to be treated with caution. For the loop regions, reliability is greatest when no gap was introduced; thus, the main-chain conformations of ICL1, ICL2, ECL3 and the link between TM7 and helix8 are considered reliable. When there is a gap in the aligned sequences, the modeling becomes a ring-closure problem (Go and Scheraga, 1970). A deletion in the subject sequence decreases the degrees of freedom, so the locations of the modeled residues are relatively certain, but the conformations of these residues may still not be reliable. Loops ECL2 and ICL3 are in this category. Loop ECL2 has even fewer degrees of freedom because its conformation is restricted by a disulfide bond between this loop and TM3 (Fig. 11c). In contrast, when an insertion is introduced in the subject sequence, the degrees of freedom are increased; as a result, the modeled residues are least reliable. Loop ECL1 is in this category. Overall, one has to treat theoretical models with caution, given the low percent sequence identities between GPHRs and other GPCRs with known structures.

Small molecules as allosteric modulators
Identifying low-molecular-weight GPH agonists or antagonists is a major research field in molecular reproductive endocrinology. In addition to being developed as drug candidates (Lunenfeld, 2004; McGregor et al., 2007), these small molecules can be useful tools for understanding the basic science of GPHRs: they are likely essential for obtaining GPCR crystals (Rosenbaum et al., 2009); they aided the successful efforts to decipher the disulfide-bond pairs of the LHR 'hinge' region (Bruysters et al., 2008) and to identify the ligand-binding region in FSHR (van Koppen et al., 2013).
To date, there are over 170 known small molecules targeting GPHRs, and a few chemical fragments recur across different chemical series. A comprehensive review and analysis of these small molecules is beyond the scope of this article. Three small molecules, however, are worth mentioning here because their effects on FSH binding to FSHR played an important role in the formulation of our proposed trimeric receptor activation mechanism (see below). Importantly, despite their different chemical types, these three small molecules all increase FSH binding to the cell-surface receptors from 1-fold to approximately 3-fold (Fig. 12) (Dias et al., 2011; Janovick et al., 2009; van Koppen et al., 2013). Since FSH binds to FSHR with subnanomolar affinity, most FSH molecules must remain receptor-bound even in the absence of small molecules. Therefore, it seems reasonable to suggest that the dramatic increase of FSH binding to its receptor is unlikely to be due to an increase in binding affinity in the presence of the small molecules. Instead, these small molecules are proposed to have somehow changed the receptor form to expose more FSH-binding sites.

Functional relevance of receptor trimers
The oligomerization of glycoprotein hormone receptors is a well-observed phenomenon (Lei et al., 2007; Rivero-Muller et al., 2010; Roess et al., 2000; Roess and Smith, 2003; Thomas et al., 2007; Urizar et al., 2005; Zoenen et al., 2012). Two independent protomers were observed to form a dimer, mainly mediated by residue Y110, in the FSH-FSHR HB crystals (Fan and Hendrickson, 2005). Unexpectedly, a trimer was observed in the asymmetric unit of the new FSH-FSHR ED crystal structure (Fig. 13) (Jiang et al., 2012). Since GPCR trimers have never been explicitly proposed before, caution is appropriate regarding the physiological relevance of the FSHR ED trimer observed in the more recent crystal structure (Jiang et al., 2012), because such high-order oligomers could be due to artificial crystal lattice contacts. Nevertheless, numerous pieces of evidence supporting the functional relevance of the GPHR trimers should not be ignored.
First, high molecular weight (MW) electrophoresis bands, consistent with FSHR, LHR and TSHR trimers, have been repeatedly documented in earlier publications (Dattatreyamurty et al., 1992; Latif et al., 2010; Tao et al., 2004; Thomas et al., 2007). A GPHR monomer has a MW of approximately 80 kDa. Studies from two independent groups explicitly labeled bands with a MW of 240 kDa, corresponding to a trimer. Purified FSHR from bovine testes (a physiological source) has been shown to contain a form with a MW of 240 kDa (Dattatreyamurty et al., 1992). In an investigation of LHR oligomerization, the major band on the Western blot was shown to migrate at 240 kDa (Tao et al., 2004). Since some receptors associate constitutively as oligomers even in harsh buffers containing SDS, the main driving force for oligomerization likely lies with the transmembrane domains. This is consistent with the earlier observation that receptor oligomerization is mediated through both the transmembrane domains and the ectodomains in LHR and TSHR (Guan et al., 2010; Urizar et al., 2005). However, the significance of such high molecular weight bands, which cannot be dissociated with SDS under reducing conditions, should be interpreted with caution. In one case it was reasoned that the high molecular weight bands are from not-yet-fully-processed receptor, possibly on its way to being degraded, because these bands still retained the myc- and FLAG-tags on the C-terminus (Thomas et al., 2007). Since the fully processed 85 kDa receptor did not have the tags and did not present as a high molecular weight band, it was concluded that the receptor monomer forms oligomers (as demonstrated by antibody FRET) but that the oligomers could be dissociated in SDS gels (Thomas et al., 2007).
More recent work provides more convincing evidence for the physiological existence of receptor trimers. As mentioned above, three publications show an approximate one-to-three ratio of receptor binding by FSH in the absence and presence of small-molecule modulators (Dias et al., 2011; Janovick et al., 2009; van Koppen et al., 2013). As will be demonstrated below, the FSHR trimer, as observed in the new crystal structure, can geometrically accommodate only one fully glycosylated FSH molecule (Fig. 14a). A simple explanation of the 1-to-3 ratio is that one fully glycosylated FSH molecule binds to one FSHR trimer in the absence of small-molecule modulators, whereas each FSH binds to one FSHR monomer after small-molecule modulators dissociate the FSHR trimer (see below).
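The counting behind this explanation is simple enough to write out. The sketch below assumes idealized conditions that the binding experiments do not guarantee (every binding unit saturated, every trimer fully dissociated by the modulator), and the receptor count of 300 is an arbitrary placeholder; the function name is hypothetical.

```python
def fsh_bound(n_receptors, receptors_per_unit, fsh_per_unit=1):
    """FSH molecules bound at saturation when receptors are grouped into
    units (monomers or trimers) that each bind fsh_per_unit hormones."""
    if receptors_per_unit <= 0 or n_receptors % receptors_per_unit:
        raise ValueError("receptors must divide evenly into units")
    return (n_receptors // receptors_per_unit) * fsh_per_unit


if __name__ == "__main__":
    n = 300  # arbitrary number of receptors on the cell surface
    as_trimers = fsh_bound(n, receptors_per_unit=3)   # one FSH per trimer
    as_monomers = fsh_bound(n, receptors_per_unit=1)  # one FSH per monomer
    print(f"bound as trimers:  {as_trimers}")
    print(f"bound as monomers: {as_monomers}")
    print(f"fold increase:     {as_monomers / as_trimers:.0f}")
```

Under these assumptions the fold increase is exactly three regardless of the receptor count, which matches the approximate ratio seen with all three modulators.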
To facilitate crystallization, endoglycosidase F1 was used to trim the glycans in the FSH-FSHR ED complex. It has been well established that full glycosylation, including the sialic acid capping of the carbohydrates at N52α, is essential for the full agonist activity of glycoprotein hormones. In the new structure, only the first glycan residue of the carbohydrates at N52α of FSH is visible in the electron density map. The glycan extends towards the central cavity of the FSH-FSHR ED complex trimer (Figs. 13 and 14a). In order to understand the roles of the N52α carbohydrates in receptor activation, the dimension of the central cavity was measured and compared to the dimensions of known full glycans in available crystal structures. Although it is impossible to make a precise calculation due to the flexibility and the heterogeneous nature of oligosaccharides, it is possible to estimate the dimension owing to the availability of a number of N-linked carbohydrate structures.

Fig. 12. Effect of small molecule allosteric modulators on specific 125I-FSH binding to wild-type FSHR. Color coded are three small molecule structures and their effect on the relative binding of 125I-FSH to the wt-hFSHR, which changes from one-fold to three-fold in the absence (shown as open bars) and presence (shown as filled bars) of these small molecules. The data source of ADX61623 is Fig. 4c in the reference (Dias et al., 2011), that of Org 41841 is Fig. 3a in the reference (Janovick et al., 2009), and that of Org 214444-0 is Fig. 3A in the reference (van Koppen et al., 2013). The original data were normalized for direct comparisons. The effect of ADX61623 was measured for 125I-hFSH binding to hFSHRs in HEK293 cells, that of Org 41841 to FSHRs in Cos7 cells, and that of Org 214444-0 to membranes of CHO cells stably expressing hFSHR. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
The central cavity is approximately 25 Å in diameter (Fig. 13b). The dimensions of a bi-antennary glycan in human fibrinogen (PDB: 3GHG) are 20 Å from the first N-linked sugar residue to the terminal sialic acid and 30 Å to the other branch (Kollman et al., 2009). Measurements of the glycan dimensions in other structures (e.g., the glycans in IgG structures) also indicate that 20 Å is the minimum dimensional requirement for a full-length oligosaccharide. This implies that the binding of one fully glycosylated FSH molecule to a receptor trimer would prevent other FSH molecules from binding to the trimer, which is confirmed by docking a bi-antennary oligosaccharide into the trimer structure (Fig. 14a). This is consistent with experimental observations of negative cooperativity of TSH binding to TSHR (Urizar et al., 2005). It therefore seems reasonable to conclude that an FSHR ED trimer can accommodate only one fully glycosylated FSH molecule.
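The steric argument can be illustrated with a deliberately crude geometric check. The sketch below is not the docking shown in Fig. 14a: it assumes, for illustration only, that each N52α glycan is a rigid rod anchored at the cavity wall and pointing at the trimer axis, whereas real oligosaccharides are flexible and branched. The function name is hypothetical.

```python
def glycans_that_fit(cavity_diameter, glycan_length, n_protomers=3):
    """Number of radially pointing rigid-rod glycans that fit in the cavity.

    Rods shorter than the cavity radius never reach the trimer axis, so one
    rod per protomer fits. A rod at least as long as the radius occupies the
    axis, so a second rod of the same length would collide with it there.
    """
    if cavity_diameter <= 0 or glycan_length <= 0:
        raise ValueError("dimensions must be positive")
    radius = cavity_diameter / 2.0
    return n_protomers if glycan_length < radius else 1


if __name__ == "__main__":
    # 25 A cavity diameter and 20 A minimum glycan length are from the text.
    print(glycans_that_fit(cavity_diameter=25.0, glycan_length=20.0))
```

Because the 20 Å minimum glycan length exceeds the 12.5 Å cavity radius, even this simplified picture leaves room for only one full-length glycan, in line with the docking result.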
The trimeric receptor model presented in this article provides rational explanations for the important biological roles played by the GPH residues that are removed from both the primary hormone-binding site and the sTyr site. Specifically, the hormone residues in the loops L2α and L3β are known to play important roles in receptor binding and signaling. Since these residues project away from the primary hormone-receptor interface, a full understanding of their function has so far been elusive. The proposed potential hormone-receptor interaction exosite (Jiang et al., 2012) thus provides a mechanistic explanation for numerous experimental data concerning these residues. Foremost, the oligosaccharide at N52α is essential for the full agonistic activity of glycoprotein hormones, because its removal dramatically reduces the efficacy of the hormones. In hCG, this loss of efficacy can be reversed by either adding an oligosaccharide at N77β in loop L3β (corresponding to D71β in FSH) (Moyle et al., 2004), or applying a monoclonal antibody (B111) that recognizes the nearby residues (Moyle et al., 2004) or a polyclonal antiserum against hCG (Rebois and Liss, 1987). In the case of hTSH, a superagonist (TSH + α4K) can also be engineered to become an ultra-superagonist (TSH + α4K + β3R) when three residues in the β chain (I58β, E63β and L69β) are all mutated to arginine (Grossmann et al., 1998). All these residues in hCG and TSH are located at or near the potential exosite (Fig. 13a inset), suggesting an important role for the potential exosite in signal transduction of the GPHRs.

Fig. 13. FSH-FSHR ED trimer. (a) Top and front views (left and right panels, respectively) of the trimeric complexes. FSH α- and β-subunits are shown in green and light blue, respectively, and FSHR ED in magenta. The side chain of sulfated Y335 is represented as sticks for the tyrosine and as colored balls (sulfur: yellow; oxygen: red) for the sulfate. The carbohydrate atoms at N52α are shown as yellow balls. The disordered residues in the receptors are marked as dashed lines in the front view. For clarity, the third complex, at the back in the front view, is represented as a grey surface model. Inset: A zoomed region shows the detailed interactions between FSH and FSHR ED at the trimer interface. (b) FSH-FSHR ED trimer in the surface representation of electrostatic potentials that were calculated from amino acid residues. The N52α glycan is located in the inner space of the FSH-FSHR trimer. The left panel is a top view of the trimer, where the N52α glycans are not visible because they are blocked by the FSH β subunits. The right panel is a front view of the trimer. For clarity, the front residues are cut away to reveal the N52α glycan atoms (yellow balls) on one of the trimeric protomers. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
In light of the available crystal structure of the β2 adrenergic receptor-Gs protein complex (Rasmussen et al., 2011), it was important to determine whether three Gs protein heterotrimers could be modeled onto an FSHR trimer. No GPCR trimer structure had been reported, but the evolutionary ancestor of GPCRs, bacteriorhodopsin (bR), exists as a trimer both in crystal and in solution (Takeda et al., 1998). Each molecule in the bR trimer was replaced with the above-constructed FSHR 7TM domain to generate an FSHR 7TM trimer. The crystal structure of the β2 adrenergic receptor-Gs protein complex was then superimposed onto one of the three 7TM domains. The resulting trimer model clearly shows that the FSHR trimer can only accommodate one Gs protein heterotrimer (Fig. 14b). Although it is possible that the FSHR 7TM domains associate differently from the bR trimer, the bulkiness of the G protein heterotrimer relative to the FSHR 7TM domain nevertheless makes it unlikely that three G protein heterotrimers would fit geometrically into a tightly associated FSHR 7TM trimer. Therefore, it seems reasonable to conclude that the FSHR trimer would interact with only one Gs protein heterotrimer.
The transmembrane domain of GPHRs has been known to be activated either intramolecularly (cis) or intermolecularly (trans) after the binding of a hormone to the ectodomain (Ji et al., 2002; Osuga et al., 1997; Rivero-Muller et al., 2010). It was proposed (Ji et al., 2002) that the receptor can be trans-activated by a large-scale translational movement of the ectodomain from its own transmembrane domain to its neighbor's transmembrane domain, presumably due to an extensive conformational melt of the signal specificity ('hinge') region. Because the 'hinge' domain is an integral part of the LRR structure, such a large-scale translational movement is unlikely (Jiang et al., 2012). To understand this phenomenon, it is necessary to construct a full-length FSHR trimer. It is reasonable to assume that the three-fold axis of the FSH-FSHR ED trimer aligns with that of the FSHR 7TM trimer (Fig. 14c). There is still uncertainty about the rotation angle about the three-fold axis between the ectodomain trimer and the 7TM trimer, as one FSHR ectodomain may sit on top of one 7TM domain or sit between two 7TM domains. In the FSHR ectodomain trimer, the distance separating the last visible residues (I359) of the ectodomain monomers is about 14 Å. This C-terminal proximity makes multiple configurations possible, including the co-existence of both cis- and trans-receptor trimers (Fig. 14c).
Fig. 14. [...] represented as surfaces with each color denoting a protomer of the trimer (i.e., one 7TM domain). The Gs complex is shown as ribbons with α chain in blue, β chain in green and γ chain in cyan. (c) Receptor trimers may exist in a cis- or trans-configuration (left and right panels, respectively). Either configuration is compatible with the FSHR ED trimer described in Fig. 13. Each color represents one receptor monomer. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Hypothesis: the role of glycoprotein hormone receptor trimers in signal transduction
Based on the recent crystal structure in combination with a wealth of biochemical data in the literature, it seems appropriate to present here a proposal for a trimeric glycoprotein hormone receptor activation mechanism (Fig. 15). In the absence of FSH, FSHRs exist mainly as trimers in a closed form (State A in Fig. 15). Upon binding of a single fully glycosylated protein hormone molecule, the ectodomains transform to an open form, forced first by the high-affinity binding of the hormone to the inner surface of the FSHR hormone-binding subdomain and then by the pushing of the bulky N52α carbohydrates into the central hole of the receptor trimer. As a result, one of the three receptors in the trimer is activated. The receptor trimer is switched asymmetrically from an inactive state to an active state, and subsequently binds one heterotrimeric G-protein (State B in Fig. 15). Independently, small-molecule allosteric modulators bind to the transmembrane domains, causing conformational changes and dissociation of the 7TM trimer. The receptor separation allows two more FSH molecules to bind the two receptors that are still unoccupied. Each receptor is activated via the monomeric activation mechanism, resulting in the formation of three activated G protein molecules (State C in Fig. 15). Because they lack the bulky glycans, three deglycosylated hormones can bind a receptor trimer (State D in Fig. 15). The antagonistic activity of the deglycosylated hormone likely arises from the absence of the glycan's pushing-in force needed to dissociate the ectodomain trimer, thereby locking the receptor, with hormone bound at high affinity, in an inactive or partially active state.
An important aspect of this mechanism is that FSHR can function as both a monomer and a trimer. Receptor trimerization is mediated via both the 7TM domains and the ectodomains. A complete dissociation of the trimeric receptor requires the separation of both the trimeric 7TM domains (which can be achieved by an allosteric modulator) and the trimeric ectodomains (which can be achieved by a full-length N52α glycan). Like a monomer, the trimeric FSHR only activates one G protein and binds one β-arrestin. A major difference between the trimeric mechanism and the monomeric mechanism is that the N52α glycan plays little role in receptor activation in the monomeric form. In the presence of an allosteric modulator, the trimeric mechanism model predicts that FSHR would achieve 3-fold binding of fully glycosylated FSH, 3-fold β-arrestin binding and 3-fold G protein activation when all three monomers are completely separated. This model does not predict the extent of cAMP increase, because of signal amplification at the adenylate cyclase step in the signaling pathway. It is not unusual for different compounds to produce similar maximal cAMP responses, unlike the response in a β-arrestin assay, where the response is proportional to receptor number (Nickolls et al., 2011). The trimer model also suggests that the L1α and L3α loops of FSH are near the 7TM domain. Receptor activation, however, is achieved not by a direct contact between FSH and the 7TM domain, but rather by both perturbation of the ectodomain trimerization and lifting of the hairpin loop in the signal specificity subdomain.
Fig. 15. Proposed mechanism of trimeric GPHR activation. The extracellular LRRs of GPHR are represented as purple blocks with a hairpin loop (the primary hormone-binding subdomain is labeled as HBSD and the signal specificity subdomain as SSSD) and the seven transmembrane domain (7TM) as cylinders (inactivated and activated forms are colored grey and green, respectively). The other key receptor elements are also shown: the sulfate group at Y335 is depicted as a yellow ball, residue S271 as a green star and disulfide bonds as thin yellow lines. Heterotrimeric Gs protein is shown as an ellipsoid (inactivated and activated forms are colored grey and green, respectively). The glycoprotein hormone heterodimer is represented in blue and the carbohydrates at N52α as Y-shaped yellow sticks. Small-molecule allosteric modulators are shown as yellow hexagons. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
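The stoichiometry of the four proposed states can be summarized as a small lookup. This is a bookkeeping sketch of the hypothesis as described in the text, not a kinetic or structural model. The class and function names are hypothetical, and any combination of conditions that the text does not describe is rejected rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrimerState:
    label: str                            # state label as used in Fig. 15
    hormones_bound: int                   # hormone molecules per three receptors
    g_proteins_activated: Optional[int]   # None where the text gives no number
    trimer_intact: bool                   # False once the receptors are separated


# Keys: (hormone form or None, allosteric modulator present).
_STATES = {
    (None, False): TrimerState("A", 0, 0, True),
    ("glycosylated", False): TrimerState("B", 1, 1, True),
    ("glycosylated", True): TrimerState("C", 3, 3, False),
    # State D is described only as "inactive or partially active",
    # so no G protein count is assigned here.
    ("deglycosylated", False): TrimerState("D", 3, None, True),
}


def proposed_state(hormone: Optional[str], modulator: bool) -> TrimerState:
    """Return the state proposed in the text for the given conditions."""
    try:
        return _STATES[(hormone, modulator)]
    except KeyError:
        raise ValueError("combination not described in the proposed mechanism")


if __name__ == "__main__":
    for key in _STATES:
        print(key, proposed_state(*key))
```

Leaving the G protein count of State D undefined keeps the sketch from asserting more than the hypothesis states.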

Perspectives and closing remarks
Glycoprotein hormones and their receptors form a complex and sophisticated biological system. The current atomic-level understanding of the signaling mechanism is the result of a cumulative effort from several laboratories. The key events in the structural understanding of the system are shown in Fig. 16. In 1994, the crystal structure of hCG was solved by two groups (Lapthorn et al., 1994; Wu et al., 1994), which laid the foundation of the field. In 1995, a view of the hormone-receptor complex was presented (Jiang et al., 1995) in which several aspects of hormone binding and orientation have stood the test of time. In 2001, the crystal structure of FSH (Fox et al., 2001) was solved using the hCG structure as the search model. In 2005, the crystal structure of the FSH-FSHR HB complex (Fan and Hendrickson, 2005) was solved using the FSH structure as the search model. In 2012, the crystal structure of the FSH-FSHR ED complex (Jiang et al., 2012) was solved using the FSH-FSHR HB structure as the search model. It has clearly been a long relay over 18 years that has taken us this far in understanding this exquisite biological system. The next big task is to obtain the crystal structure of a full-length GPHR. Without that, how the signal specificity subdomain interacts with the transmembrane domain in atomic detail will remain unknown.
Apart from obtaining a full-length GPHR crystal structure, other means can be explored to gain insight into receptor signaling. One important issue is to refine the final proposed step and map the details of the monomeric two-step receptor activation mechanism shown in Fig. 9. In this step, the bound hormone employs a lever-like mechanism in which the 'pulling & lifting' of the hairpin loop presumably releases the inhibitory effect of the ectodomain on the extracellular loops of the 7TM domain, and relays the signal as a more subtle, propagated conformational change in the GPCR helix bundle.
A more important question is whether the FSHR trimer observed in the new crystal structure is a physiologically relevant entity. This question is important not only for the GPHR family members but for the GPCR superfamily in general. Members of the GPCR superfamily have been shown to display negative or positive cooperativity in ligand binding and asymmetric signaling (Damian et al., 2006; Rovira et al., 2010). Growing evidence shows that many GPCRs do exist as oligomers (Khelashvili et al., 2010; Skrabanek et al., 2007). Whether GPCRs function as monomers or oligomers, however, has been extensively debated. Convincing studies have demonstrated that rhodopsin or β2AR can signal to their respective G proteins as monomeric units (Ernst et al., 2007; Whorton et al., 2007), but atomic-force microscopy has shown that rhodopsin exists as dimers and higher-order oligomers (Fotiadis et al., 2003). Several other approaches have demonstrated that GPCRs form oligomers under physiological conditions. Single-molecule imaging by TIRF microscopy recorded the formation and dissociation of muscarinic receptor oligomers in CHO cells (Hern et al., 2010). The M2 muscarinic receptor was identified as a tetramer in live cells by measuring oligomer size with a quantitative FRET method (Pisterzi et al., 2010). Another study (Albizu et al., 2010) added one more compelling qualifier: asymmetric activation of oxytocin receptors, as shown with time-resolved FRET between ligands. Indeed, many GPCR oligomers bind ligands asymmetrically with a single protomer activated (Rovira et al., 2010). Other examples are GPCRs for GABA, metabotropic glutamate (Kniazeff et al., 2004), leukotriene B4 (Damian et al., 2006), dopamine (Han et al., 2009) and serotonin (Mancia et al., 2008).
Several earlier studies have demonstrated that GPHRs form oligomers under physiological conditions. LHR forms oligomers in vitro as well as in vivo (Rivero-Muller et al., 2010; Urizar et al., 2005). The oligomer behaves as a single monomer, and activation of a single protomer is enough for signal transduction. TSHR has also been shown to form oligomers under physiological conditions, and TSH binding occurs only on a single protomer (Vassart, 2010; Vlaeminck-Guillem et al., 2002). These observations are consistent with the proposed receptor trimer. In addition to supporting an asymmetric mechanism of receptor activation, a trimer model also suggests a potential origin for negative cooperativity in receptor binding when the hormones are fully glycosylated. A trimer activation mechanism further suggests that receptors can be activated as both trimers and monomers. While a monomeric GPHR is capable of full coupling to one G-protein complex, the proposed trimeric receptor can likewise interact with only a single G-protein complex. Indeed, rhodopsin has been shown to be capable of activating only a single G-protein in both monomeric and dimeric forms (Bayburt et al., 2007). Taken together, these observations represent a step toward a mechanistic explanation of these phenomena. In this regard, confirmation of the physiological relevance of the trimer by further work should shed light not only on the activation mechanism of GPHRs but also on that of the GPCR superfamily in general.
Fig. 16. [...] structures. Color codes are as follows: α subunit (green), hCGβ (cyan), FSHβ (blue) and FSHR (magenta). Carbohydrates and sulfated tyrosine are shown as balls. Below each structure is the year when the crystal structure was published, the name of the structure, and the structure PDB code(s) (in parentheses). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)