Visualization of Insulin Receptor Activation by a Novel Insulin Analog with Elongated A Chain and Truncated B Chain


Cone snail venoms contain a wide variety of bioactive peptides, including insulin-like molecules with distinct structural features, binding modes, and biochemical properties. Here, we report a fully active humanized cone snail venom insulin with an elongated A chain and a truncated B chain, and use cryo-electron microscopy and protein engineering to elucidate its interactions with the human insulin receptor ectodomain. We reveal how an extended A chain can compensate for deletion of B-chain residues, which are essential for activity of native insulin but also compromise therapeutic utility by delaying the onset of action, suggesting approaches to develop improved therapeutic insulins. Curiously, a receptor conformation present in low abundance adopts a highly asymmetric structure that displays novel coordination of a single humanized venom insulin using elements from both of the previously characterized site 1 and site 2 interactions.


Introduction
The ~1,000 extant species of marine cone snails utilize complex venoms to capture prey, which can include fish, worms or other snails 1. The majority of cone snail toxins target ion channels in the prey's nervous and locomotor system to elicit rapid paralysis 2. We previously showed that some species additionally use insulin as part of their toxin arsenal. Venom insulins bind rapidly to and activate the prey's insulin receptor (IR) and, as a result, induce dangerously low blood glucose levels, rendering the envenomated animal unable to escape 3. Venom insulins have hence evolved unique structure-function properties that enable very fast action. We recently showed that these features can directly inform the design of new, fast-acting insulin-based drug leads for the treatment of type-1 diabetes 4, a disease for which daily insulin injections remain the only effective treatment.
Insulin is a conserved peptide hormone found in all animals 5. In vertebrates, including fish and humans, insulin is secreted as a hexamer that dissociates into a dimer, then a monomer, in order to bind and activate the insulin receptor. Unlike physiological release of insulin from pancreatic β cells, subcutaneous injection of insulin results in a relatively slow hexamer-to-dimer-to-monomer conversion, which can delay diffusion from the subcutaneous injection site and compromise effective glucose control in people with diabetes 6,7. Designing insulin analogs that do not form dimers and hexamers has proven challenging because the region involved in dimerization, near the C terminus of the B chain, is also of critical importance for IR activation 8. As a result, removal of residues that mediate dimerization, such as in desoctapeptide insulin (DOI, which lacks the last eight residues of the B chain), also results in a near-complete loss of biological activity 9.
We previously showed that insulins from fish-hunting cone snails of the Gastridium clade, Conus geographus and Conus tulipa, lack eight residues at the B-chain C terminus (which, in vertebrate insulins, mediate both dimerization and receptor binding) yet are able to activate the fish and human IR 10,11. Structure-function studies revealed that receptor activation was facilitated by two aromatic residues in the B chain, which act as surrogates for the missing C-terminal residues 11, and by two mutations in a loop in the A chain. Subsequently, we reported that another venom insulin (Con-Insulin K1) from Conus kinoshitai, a divergent fish-hunting species of the Afonsoconus clade, also activates the fish and human IR but contains neither the B-chain C-terminal residues nor aromatic substitutes in the B chain 10. Instead, unlike any other insulin reported in nature, C. kinoshitai insulin displays a four-amino-acid C-terminal elongation of the A chain. Based on this observation, we proposed that, in the absence of the B-chain C-terminal residues, the elongated A chain provides an alternative mechanism for receptor activation 10.
Here, we identify several additional venom insulins with varying A-chain elongations, use these sequences to generate a panel of human-venom insulin hybrid analogs lacking B-chain C-terminal residues, and investigate their ability to activate the human IR through their various A-chain elongations. In particular, the venom-insulin hybrid (Vh-Ins) analogs, Vh-Ins-HALQ and Vh-Ins-HSLQ (Vh-Ins-H(A/S)LQ), lack the human B-chain residues that mediate insulin dimerization but displayed activity similar to that of native human insulin. Using cryo-electron microscopy (cryo-EM), we determined structures of the IR ectodomain in complex with up to four Vh-Ins-HSLQ molecules. This revealed how residues at the C terminus of the A chain of Vh-Ins-H(A/S)LQ compensate for the loss of the B-chain interactions with IR. This work establishes a new paradigm for IR engagement and provides the basis for designing next-generation insulin therapeutics with improved properties, including insulin analogs with potential for ultra-rapid action. Moreover, the cryo-EM analysis revealed conformational dynamics within the Vh-Ins:IR complex and a novel binding mode that may be relevant for signaling.

Results
Sequencing and comparative analysis of venom insulins from fish-hunting cone snails
Sequencing of the venom gland transcriptomes of the two fish hunters, Conus laterculatus and Conus mucronatus, from the Phasmoconus clade, led to the identification of four new venom insulins, two from each species. Molecular phylogenetics closely grouped these sequences with other cone snail venom insulins, particularly with those isolated from other fish-hunting species (Fig. 1a, red lines). In line with previous observations 12, endogenous snail signaling insulins group separately and are less diversified (Fig. 1a, gray lines). According to the nomenclature introduced for cone snail venom insulins 3, the new sequences were named Con-Ins La1 and Con-Ins La2 for insulins identified from C. laterculatus, and Con-Ins Mo1 and Con-Ins Mo2 for insulins from C. mucronatus. All four precursor sequences have the canonical organization defined by human preproinsulin, with an N-terminal signal sequence for relocation into the endoplasmic reticulum and secretory pathway, followed by three regions encoding the B chain, C peptide(s) and A chain (Fig. S1). Proteolytic processing of venom preproinsulin is predicted to yield mature venom insulins with the same cysteine framework and disulfide connectivities as vertebrate insulin (Fig. 1b-c). All four sequences lack residues at the C terminus of the B chain that are critical for receptor activation in vertebrate insulin and the aromatic residues previously shown to be important for receptor binding of other venom insulins, such as the C. geographus venom insulin Con-Ins G1 3 (Fig. 1d).
Strikingly, all of the new venom insulin sequences have C-terminal extensions in their A chains with diverse amino acid composition (-PSLL#, -GSLL#, -GSLLD, -PVQ, -HTLQ#, and -ASLLGL; Fig. 1c), where # represents C-terminal amides, a common and bioinformatically predictable modification in cone snail toxins 13. This pattern suggests that C-terminal A-chain elongations may play a functional role in IR activation of this family of venom insulins and serve as a substitute for the missing B-chain residues of human (and fish) insulin. To investigate this hypothesis, we synthesized a panel of venom-human hybrid analogs (Vh-Ins) for functional and structural studies.
Design and functional evaluation of insulin analogs with elongated C-terminal A-chain residues
Because the six venom insulins all display anionic B10 and hydrophobic B20 residues (Fig. 1c), we incorporated the GluB10 and LeuB20 mutations into a human desoctapeptide insulin (DOI), lacking the C-terminal eight residues on the B chain, and attached the respective A-chain elongation motifs from six venom insulins onto the thus-modified DOI backbone to create six Vh-Ins analogs (Fig. 2a). We measured the extent of AKT phosphorylation in IR-overexpressing NIH 3T3 cells as an indicator of insulin potency. Strikingly, four of the six Vh-Ins molecules with elongated A chains display potency comparable to native human insulin (Fig. 2b) and are 400- to 800-fold more potent than DOI (Fig. S4). These four potent Vh-Ins molecules all have serine at position A22 and leucine at position A23 within their elongation motifs. On the other hand, the analog containing the A-chain elongated sequence in C. kinoshitai venom, Vh-Ins-HTLQ, which has threonine instead of serine at position A22, has an 11-fold reduction in potency with respect to human insulin. To determine if ThrA22 is responsible for the reduced potency, we first mutated it to serine and found that Vh-Ins-HSLQ has equal potency to human insulin (Fig. 2c), further demonstrating the importance of this position. To understand better the role of A-chain elongation residues in signaling potency, we performed alanine scanning mutagenesis on the additional residues, A21-A24, in Vh-Ins-HTLQ. This revealed that individual AlaA21 or AlaA24 substitution results in slightly lower potency than Vh-Ins-HTLQ (Fig. 2c). In contrast, AlaA23 substitution led to greatly reduced bioactivity, while the AlaA22 substitution displayed bioactivity comparable to that of human insulin. Two of the analogs, Vh-Ins-HALQ and Vh-Ins-HSLQ, showed potency similar to native insulin (Fig. 2c).

Structure determination of a Vh-Ins-HSLQ:receptor ectodomain complex
To elucidate the molecular interactions between the elongated A chain of Vh-Ins-HSLQ and the insulin receptor, we used a receptor isoform A (IR-A) ectodomain construct purified from suspension-adapted HEK 293-F cells, as described previously 14. The purified receptor ectodomain (hereafter "receptor") comprises wild-type residues 1 to 917 with a C-terminal linker and 8xHis tag. To prepare samples for cryo-EM structure determination, the receptor was incubated with Vh-Ins-HSLQ and applied to holey-carbon Cu grids. Movies were collected on a Titan Krios equipped with a Gatan K2 detector and energy filter. Our analysis focuses on three reconstructions: one for the symmetric insulin-binding "head" region (3.3 Å resolution), one from a subset of those particles that additionally shows an ordered C-terminal "stalk" (4.1 Å resolution), and one for an asymmetric conformation (4.4 Å resolution) (Figs 3, S2-4 and Table 1).

Symmetric structure
The C2 symmetric head structure, which is represented by most of the particles, explains our biochemical and biological findings with venom-derived insulins. This reconstruction is essentially as reported previously for the insulin receptor in complex with two or more human insulin molecules 14-16. Density is apparent for four Vh-Ins-HSLQ molecules, one at each of the two symmetry-related site 1 positions and the two site 2 positions, although the site 2 Vh-Ins-HSLQ had weaker density and did not contribute notable high-resolution information in the final reconstructions, possibly due to greater flexibility (Fig. 3c,d). Initial 3D reconstructions of the receptor resolved only one of the two receptor "stalks" composed of the FnIII-2 and -3 domains, indicating conformational heterogeneity. The subset of particles subsequently reconstructed with both stalks resolved in a close-approaching conformation matches the chimeric IR-leucine zipper construct used by Weis et al. 17 much more closely than it matches other previously reported human insulin:IR complexes 14,16.
Binding of Vh-Ins-HSLQ at site 1 and site 2 resembles that seen in previously reported cryo-EM structures of IR:insulin complexes 14,16. Following structural overlay based on surrounding receptor residues, the relative displacement of Vh-Ins-HSLQ Cα atoms at the site 1 positions ranges from 0.3 to 0.9 Å (B5-B18; A1-A20) compared to insulin-receptor complex structures (PDB entries 6HN5, 6PXW and 6SOF) 14,16,17. Essentially all of the IR contacts with Vh-Ins-HSLQ residues that are common to those with native insulin are retained, although several residues unique to native insulin or Vh-Ins-HSLQ are at the site 1 interface.
Similarly, alignment of IR residues surrounding site 2 shows Cα overlap of insulin versus Vh-Ins-HSLQ of 0.5-2.6 Å (PDB entries 6PXW and 6SOF) 14,16, indicating that contacts between insulin and IR at site 2 are also largely conserved. In notable contrast to site 1, however, there is almost no change in residue identity between insulin and Vh-Ins-HSLQ at the site 2 interface (Fig. S5). Consequently, our analysis of the Vh-Ins-HSLQ interaction focuses primarily on binding at site 1.
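The Cα displacement ranges quoted above can be illustrated with a minimal sketch. It assumes the two models have already been superposed on the surrounding receptor residues, so that only per-residue distances remain to be computed. The function name `ca_displacements` and all coordinates are hypothetical and are not taken from any deposited structure.

```python
import math

def ca_displacements(ref, mob):
    """Per-residue distance (in Å) between matched Cα atoms of two models.

    ref, mob: dicts mapping a residue label (e.g. "B5") to an (x, y, z) tuple.
    Both coordinate sets are assumed to share one frame already, i.e. they
    were superposed beforehand on the surrounding receptor residues.
    """
    shared = sorted(set(ref) & set(mob))  # only residues present in both models
    return {res: math.dist(ref[res], mob[res]) for res in shared}

# Made-up coordinates for illustration only (not from any PDB entry).
insulin = {"B5": (10.0, 4.0, 2.0), "B6": (13.1, 5.2, 3.0), "A1": (1.0, 0.0, 7.5)}
vh_ins = {"B5": (10.3, 4.0, 2.0), "B6": (13.1, 5.8, 3.0), "A1": (1.0, 0.4, 7.8)}

d = ca_displacements(insulin, vh_ins)
print(f"displacement range: {min(d.values()):.1f}-{max(d.values()):.1f} Å")
```

Restricting the comparison to residues present in both models mirrors the B5-B18 and A1-A20 ranges used in the text, since the truncated and elongated termini have no counterpart in the other molecule.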
Vh-Ins-HSLQ binding at site 1
Vh-Ins-HSLQ, like insulin, binds site 1 through contacts with receptor surfaces formed by L1, αCT, and a loop near the periphery of FnIII-1 (Fig. 4a). The structure reveals how the A-chain C-terminal elongation of Vh-Ins-HSLQ compensates for loss of C-terminal B-chain residues of native insulin. In particular, the new LeuA23 side chain projects into the receptor pocket otherwise occupied by insulin PheB24, with LeuA23 aligning with one side of the PheB24 benzyl ring (Fig. 4b-d). Despite the resulting difference in docking residue coordination, the conformations and positions of the residues that form this pocket are virtually unchanged compared to native insulin complexes 16,18 (Fig. 4c).
The role of PheB24 and surrounding residues in receptor binding has been characterized through extensive mutagenesis 18-20. The equivalent roles seen here for Vh-Ins-HSLQ LeuA23 and insulin PheB24 align with the broader set of hydrophobic side chains that are compatible with receptor recognition at this site. An insulin analog with PheB24 substituted by cyclohexylalanine (a non-natural amino acid with a non-planar, six-membered alicyclic side chain) retained full affinity for IR in competition binding assays, as did substitution of PheB24 by methionine 20. These findings contradicted the hypothesis that an aromatic residue that interacts with the amino group of receptor residue Asn16 and/or with the sulfurs of the insulin A20-B19 disulfide is required to achieve full binding affinity 19. Substitution by other hydrophobic residues at the B24 position showed a preference for side chains larger than alanine, which gave 300-fold weaker affinity than the native phenylalanine. LeuB24 and IleB24 substitutions had similar (~2-3 fold lower) affinities to PheB24, whereas the larger hydrophobic residues tyrosine and tryptophan each had 20-fold lower affinity than native insulin 20. Consistent with structures of insulin-receptor complexes 14-18, these data indicate that shape complementarity at B24 is important for binding and that this binding pocket behaves as a "delimited non-polar cavity" 20. PheA23 might be expected to mimic more exactly the binding of PheB24; however, Vh-Ins-HAFQ showed comparable activity to Vh-Ins-HALQ (Fig. 1c, Fig. 5c). Consistent with the leucine consensus of the venom sequences at this position (Fig. 1c), LeuA23 in Vh-Ins-HSLQ fits the hydrophobic pocket normally occupied by insulin PheB24.
In addition to the extended A chain, the LeuB20 and GluB10 substitutions were identified as important compensatory mutations during development of the Vh-Ins analogs, with GluB10 providing a three-fold improvement to the EC50 of Vh-Ins-HTLQ as assessed by AKT phosphorylation (Fig. S6, Table S1, comparing Vh-Ins-HTLQ, B20Gly with Vh-Ins-HTLQ, B10His, B20Gly). The mechanism behind the increased insulin receptor affinity seen for GluB10/AspB10 in the context of insulin X10 and related analogs 21 (and presumably in the Vh-Ins analogs presented here) was proposed to be formation of a salt bridge between GluB10 and Arg539 17. Indeed, the Vh-Ins-HSLQ GluB10 carboxylate is situated near (~4 Å) Arg539 in our atomic model (Fig. 4a), indicating a moderate charge-charge interaction. The close proximity of GluB10 and Arg539 in this interaction is consistent with the expected modest increase in binding energy needed to drive a three-fold change in EC50.
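As a rough illustration of why a three-fold change corresponds to only a modest binding energy, the fold change can be converted to a free-energy difference with ΔΔG = RT ln(fold). The sketch below assumes that the EC50 ratio approximates the affinity ratio and that T = 298.15 K; both are simplifications introduced here, not statements from the measurements, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def fold_change_to_ddg(fold, temp_k=298.15):
    """Free-energy difference (kcal/mol) implied by a fold change in affinity.

    Assumes the fold change behaves like a ratio of dissociation constants,
    which is only an approximation when the input is an EC50 ratio.
    """
    return R_KCAL * temp_k * math.log(fold)

# Fold changes mentioned in the text: LeuB20 (2x), GluB10 (3x), AlaB24 (300x).
for fold in (2, 3, 300):
    print(f"{fold}-fold -> {fold_change_to_ddg(fold):.2f} kcal/mol")
```

Under these assumptions a three-fold shift is well below 1 kcal/mol, in keeping with a single moderate charge-charge contact, whereas the 300-fold loss reported for alanine at B24 corresponds to several kcal/mol.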
Substitution of native GlyB20 with LeuB20 also enhances activity of Vh-Ins-HTLQ by providing a further two-fold improvement to the EC50 (Fig. S6, Table S1). The site 1 Vh-Ins-HALQ LeuB20 side chain excludes 27 Å² of solvent-accessible surface area at the receptor interface. Furthermore, LeuB20 might stabilize the helical binding conformation of B9-B20 due to its more restricted main chain and through side-chain contacts with TyrB16 (Fig. 4e). In native insulin, the conformational range of GlyB20 is important for the formation of a type-II β turn that allows B23-B30 to fold back against the B-chain helix when insulin is not bound to the receptor 18. Because B23-B30 are not present in Vh-Ins-HSLQ, there is no functional requirement to maintain a glycine at B20. These observations suggest that this region of Vh-Ins may provide opportunity for further optimization of receptor contacts and stabilization of analog conformation.
Vh-Ins-HSLQ binding at site 2
To better visualize interactions at site 2, we used symmetry expansion and focused 3D classification to enrich for complexes displaying insulin at this position (Fig. S7). Approximately 25% of the sub-particles showed occupancy of Vh-Ins-HSLQ within a mask surrounding site 2, and subsequent 3D refinement resulted in a reconstruction with an overall resolution of 3.9 Å and recognizable density for Vh-Ins-HSLQ (Fig. S5). Docking of FnIII-1 and site-2-bound insulin from published insulin-receptor structures 14,16 into the site 2 density in the asymmetric reconstruction convincingly places insulin into the map, indicating that there is no discernible difference at this resolution in the positioning of Vh-Ins-HSLQ at site 2 relative to native insulin.
Residues previously determined to be important for interactions at site 2 (namely, LeuA13 and LeuB17) 14,16 are not mutated in Vh-Ins-HSLQ, and their interactions with the receptor do not appear to be altered from the native insulin interactions. Neither the extended A chain nor LeuB20 of Vh-Ins-HSLQ approaches the receptor at this site. The only other substitution relative to native insulin, GluB10, lacks side-chain density but may approach receptor residues Lys494 and Asp483. The impact of GluB10 on binding affinity at site 2 is unclear, although it is apparent that binding geometry is not substantially altered and that residues in the vicinity of GluB10 are poorly ordered in the structure. These observations support the inference that insulin substitutions to Vh-Ins are highly relevant for binding to site 1 but much less relevant for binding to site 2.

Structure-guided analysis of the Vh-Ins extended A-chain residues
Guided by the structural insights, Vh-Ins-H(A/S)LQ-specific residues were further investigated by mutagenesis and by cellular signaling assays that monitored the level of AKT phosphorylation. Substitution of HisA21 by proline had almost no effect on signaling, which is consistent with the A-chain extended helix being kinked 24° at this residue (Fig. 4b) and the absence of receptor contacts by the HisA21 side chain. Glutamine, lysine and glutamate substitutions at A21 each led to a slightly reduced (2-4-fold) potency (Fig. 5a), although the reason for this modest reduction in potency is not apparent from inspection of the structure.
The side chain of residue SerA22 approaches the backbone of receptor αCT residues Val713, Phe714 and Val715. Inspection of our Vh-Ins-HSLQ complex structure suggested that glycine, serine and alanine are the only natural amino acid residues capable of accommodation at this position without significant steric hindrance. Indeed, when SerA22 was subjected to mutagenesis, there was a negative correlation between the size of the A22 side chain and AKT signaling activity (Fig. 5b). Consistent with the modeling, SerA22 and AlaA22 both showed activity comparable to native insulin. While SerA22 is capable of forming hydrogen bonds with either an amide or carbonyl on the αCT backbone (at Val713 and Val715), the equivalent activity of AlaA22 indicates that a water molecule may substitute for the serine hydroxyl in formation of these hydrogen bonds. In contrast, GlyA22 showed two-fold reduced activity, likely because it destabilizes the helical conformation of the remaining extended A-chain residues. ValA22, LeuA22, PheA22, GluA22 and LysA22 all resulted in greater than ten-fold reductions in activity (Fig. 5b).
As discussed above, LeuA23 plays a key role in receptor binding by docking into a hydrophobic pocket on the receptor surface that is otherwise occupied by PheB24 of native insulin (Fig. 4b,c). We evaluated hydrophobic substitutions by leucine, isoleucine, valine and phenylalanine at this position, and found that only PheA23 led to comparable potency to LeuA23 (Fig. 5c). Both the ValA23 and IleA23 substitutions led to reduced potency, which may be due to the unfavorable nature of β-branched amino acids in α helices 22, or due to geometric incompatibility with the binding pocket. The preference for Leu at position A23 is consistent with the observation that LeuA23 is almost completely buried in the Vh-Ins-HSLQ-receptor complex and with our finding that LeuA23 is conserved in potent Vh-Ins sequences (Fig. 2b).
The C-terminal residue of the Vh-Ins A chain, GlnA24, does not contact the receptor in our structure. We therefore evaluated residues that naturally occur at high frequency at the C-terminal end of helices 23 for their potential to increase activity further. All A24 substitutions tested had at least modest activity; however, the native Con-Ins K1 venom glutamine residue was the most potent (Fig. 5d). The effects of the A24 mutants tested were subtle, consistent with GlnA24 not directly engaging IR.
Having ascertained the similar behavior of Vh-Ins-HALQ and Vh-Ins-HSLQ, we performed fluorescence-based competition binding assays with Vh-Ins-HALQ to determine its relative affinity for both IR (Fig. 5e) and IGF-1R (Fig. 5f) that were detergent-solubilized and immobilized. These assays revealed that Vh-Ins-HALQ has full, native-insulin-like affinity for both IR and IGF-1R (Table S2). The IGF-1R affinity is notable in the context of the GluB10 mutation present in Vh-Ins-HALQ because previous investigations of some insulin variants containing anionic side chains at B10 found a higher affinity for IGF-1R relative to native insulin 21. In contrast, we find that Vh-Ins-HALQ has native-insulin-like binding preference for both IR and IGF-1R.
Binding was also investigated using isothermal titration calorimetry to determine the affinity of Vh-Ins-HALQ for a minimized model of receptor site 1 assembled from IR485 (a construct comprising IR domains L1, CR and L2) 24 and the IR-A αCT peptide (receptor residues 704-719). Consistent with published work 4,8, binding of human insulin was ~60-fold weaker in this assay than in the previous assay with immobilized full-length receptor. The inability of the model construct used in this assay to recapitulate the GluB10-Arg539 interaction (due to the absence of domain FnIII-1) might underlie the 10-fold weaker binding of Vh-Ins-HALQ relative to human insulin. Nevertheless, consistent with the compensating interaction seen in the structure, Vh-Ins-HALQ displays 24-fold tighter binding than DOI (Table S3).

Dynamic conformations of Vh-Ins-HSLQ-receptor complexes
Three-dimensional classification of the particles in our cryo-EM dataset indicated the presence of a subset of particles that exhibited increased conformational heterogeneity relative to the 4:1 Vh-Ins-HSLQ-receptor complex described above, appearing as a blurring of the head region of one of the two receptor protomers. CryoSPARC 3D variability analysis indicated that this subset displayed a range of conformations (Fig. S2, right side). To visualize snapshots along the conformational trajectory, the particles were split into eight groups based on their latent coordinates. Subsequent 3D reconstructions produced a series of maps of 6-7 Å resolution (Fig. 6a), in which most of the variability is displayed by just one of the two receptor protomers. At one extreme, conformations in this trajectory approach our symmetric state (Fig. 3) and published insulin receptor complex structures with two or more insulins 14-16 (Fig. 6c). The other, most asymmetric extreme of the trajectory bears some resemblance to other previously reported structures 15, including an "intermediate state" for the interaction between human receptor ECD and native insulin (EMD-10311) 14 (Fig. 6b), with one protomer closely resembling the apo receptor crystal structure 25 and the other protomer resembling the symmetric complex 14-16 (Fig. 6f).
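The grouping step described above can be sketched as follows. This is a minimal illustration that assumes one latent coordinate per particle and equal-population groups; how the bin boundaries were actually chosen in the analysis is not stated in the text, and the function name `split_by_latent` is hypothetical.

```python
import random

def split_by_latent(coords, n_groups=8):
    """Assign each particle to one of n_groups ordered along a latent coordinate.

    coords: one latent-coordinate value per particle (a single variability
    component is assumed). Groups are equal-population bins, which is an
    assumption of this sketch rather than a documented choice.
    Returns a list of group labels (0 .. n_groups - 1) in particle order.
    """
    n = len(coords)
    order = sorted(range(n), key=lambda i: coords[i])  # particle indices by coordinate
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = rank * n_groups // n  # rank within the sorted list -> group
    return labels

# Synthetic latent coordinates, for illustration only.
rng = random.Random(0)
latent = [rng.gauss(0.0, 1.0) for _ in range(8000)]
groups = split_by_latent(latent)
print([groups.count(g) for g in range(8)])
```

Each group would then be reconstructed independently, so that the resulting series of maps samples the trajectory from its most asymmetric to its most symmetric end.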
Remarkably, unlike other reported structures, this asymmetric conformation is ordered and reveals an intriguing novel coordination state of Vh-Ins-HSLQ bound at a composite site that includes features of both site 1 and site 2 (Fig. 6e). As the trajectory progresses towards the symmetric conformation, the site-1 and site-2 surfaces diverge toward their ~40 Å-separated positions in the symmetric state, with Vh-Ins-HSLQ binding at both sites and overlapping density indicating partial occupancy throughout most of the trajectory (Fig. 6b-e, Video S1).
The reconstruction of the asymmetric conformation was further improved to an overall resolution of 4.4 Å by using Topaz 26 to increase the number of particles picked, followed by focused 3D classification in Relion 27 to obtain a particle set with reduced conformational heterogeneity in the dynamic protomer (Fig. S2). This revealed that the site-2 interface is indistinguishable between the combined site and the canonical site 2 of the symmetric structure (Fig. 6g). In contrast, although the combined and canonical site-1 interactions are similar, some differences are apparent in the relative positioning of Vh-Ins-HSLQ/insulin and αCT with respect to the L1 domain. In particular, the orientation of Vh-Ins-HSLQ relative to L1 is rotated approximately 70 degrees along the axis of the αCT helix (Fig. 6h). Moreover, the αCT density is shorter than seen in site-1-bound structures, and is more consistent with αCT seen in the apo-IR crystal structure 25. Unfortunately, the resolution is insufficient to conclusively assign the register of αCT, which also differs between the apo and bound states of receptor site 1 8.
A previously reported insulin receptor complex structure using the same receptor ectodomain preparation shows some resemblance to the most asymmetric state that we observe. Gutmann et al. 14 reported this low-occurrence conformation that resembles maps near the center of the conformational trajectory described here and, although the details present were insufficient for unambiguous modelling, a 3:1 insulin:receptor state was proposed as an intermediate between the 2:1 and 4:1 states. A low-occurrence receptor conformation reported by Scapin et al. 15 also has some overall similarity to our asymmetric state but lacked sufficient resolution to visualize relevant details. Although the asymmetric IRΔβ-Zip construct used by Weis et al. 17 displays some similarity near the insulin-occupied site 1 and the C-terminal regions of the stalks, which show the same close approach as in our symmetric and asymmetric reconstructions, the organization of the unoccupied site-1 domains (L1, CR, L2, αCT) is distinctly different and the human insulin:IRΔβ-Zip complex does not display a combined site-1/site-2 architecture nor any density for more than the single site-1 insulin molecule.
Vh-Ins-HALQ signaling response
Insulin is capable of stimulating both metabolic and mitogenic responses through the PI3K/AKT and Ras/MAPK/ERK pathways, respectively. To characterize the signaling profile of Vh-Ins-HALQ, the relative phosphorylation of AKT and ERK induced by Vh-Ins-HALQ administration in L6 myoblasts overexpressing IR-A was determined (Fig. 7a). We found that the overall ratio of AKT/ERK phosphorylation induced by Vh-Ins-HALQ was the same as that induced by human insulin, indicating a native-like signaling profile with no bias towards AKT or ERK. To evaluate the metabolic efficacy of Vh-Ins-HALQ, we compared Vh-Ins-HALQ and human insulin (Humulin R) in vivo in an insulin tolerance test. Subcutaneous administration of human insulin or Vh-Ins-HALQ (0.017 mg·kg⁻¹) in streptozotocin-induced diabetic rats lowered blood glucose levels and reached similar nadir levels (~60 mg·dL⁻¹) (Fig. 7b). These observations indicate that the metabolic potency of Vh-Ins-HALQ is similar to that of human insulin. As a final assay of signaling response, the cell-proliferative potency of Vh-Ins-HALQ was assessed by DNA synthesis in L6 myoblasts overexpressing IR-A (Fig. 7c). We found that human insulin was slightly more potent than Vh-Ins-HALQ in its ability to induce DNA synthesis, indicating that Vh-Ins-HALQ may have the desirable property of being slightly less mitogenic than human insulin (insulin EC50 4.9 nM vs Vh-Ins-HALQ EC50 7.3 nM; 95% C.I.s 4.2-5.5 nM and 6.3-9.5 nM, respectively; p < 0.001).
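The EC50 comparison above can be made concrete with a minimal sketch of a Hill-type dose-response curve and a potency ratio. The curve form, the Hill coefficient of 1, and the normalization to a 0-1 response are assumptions of this sketch, not the fitting procedure used for Fig. 7c; only the two EC50 values come from the text.

```python
def hill_response(conc, ec50, hill=1.0, bottom=0.0, top=1.0):
    """Response at a ligand concentration under a Hill-type model.

    conc and ec50 share units (nM here) and conc must be positive.
    The Hill coefficient and the plateaus are illustrative defaults.
    """
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** hill)

def relative_potency(ec50_reference, ec50_test):
    """Potency of the test ligand relative to the reference (1.0 = equal)."""
    return ec50_reference / ec50_test

EC50_INSULIN = 4.9  # nM, DNA synthesis assay (from the text)
EC50_VH_INS = 7.3   # nM, DNA synthesis assay (from the text)

ratio = relative_potency(EC50_INSULIN, EC50_VH_INS)
print(f"Vh-Ins-HALQ mitogenic potency relative to insulin: {ratio:.2f}")
print(f"Modeled response at 4.9 nM: insulin {hill_response(4.9, EC50_INSULIN):.2f}, "
      f"Vh-Ins-HALQ {hill_response(4.9, EC50_VH_INS):.2f}")
```

By definition each ligand gives a half-maximal response at its own EC50, so the ratio of EC50 values summarizes the horizontal shift between the two curves.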

Discussion
Our earlier discovery that fish-hunting cone snails deploy insulins in their venom that rapidly induce hypoglycemia in prey has provided a means to overcome a critical challenge in the development of therapeutic insulins 3,10. Specifically, the venom insulins have dispensed with residues near the B-chain C terminus of the hormone that, in mammalian insulins, mediate both the receptor binding that is essential for activity and the dimerization that makes human and therapeutic insulins slow acting when injected subcutaneously. Our earlier work demonstrated that an insulin analog inspired by a venom insulin from C. geographus maintains potency in the absence of the C-terminal B-chain residues through four substitutions in the core of the insulin structure 4. Here, we report the discovery of additional, highly diverged venom insulins that use an alternative strategy to overcome loss of native insulin B-chain C-terminal residues, namely, the addition of residues at the C terminus of the A chain. Our protein engineering and structural studies demonstrate further that a variant human insulin based on these venom insulins with extended A chains has similar binding affinity and potency to native insulin and makes compensating receptor interactions that explain the retention of potency.
Our discussion of receptor interactions has focused on site 1, which displays substantially altered interactions due to the substitution of Vh-Ins-HSLQ residues at this interface relative to human insulin. In contrast, residues at the site 2 interface are essentially unchanged, with the minor exception of the poorly ordered GluB10. Inspection of site 1 explains how the variations in Vh-Ins-HSLQ substitute for the cognate interactions of insulin and also suggests approaches to further optimize Vh-Ins-H(A/S)LQ as a therapeutic lead compound. Most strikingly, LeuA23 in the A-chain extension substitutes for PheB24, which is within the part of the B chain removed in the fast-acting venom insulins. This substitution affirms earlier predictions made from molecular dynamics simulations using the insulin-like peptide from the venom of C. kinoshitai (Con-Ins-K1) 10, from which Vh-Ins-HSLQ is derived. Our results, however, reveal notable differences in the molecular dynamics model of Con-Ins-K1 relative to Vh-Ins-HSLQ binding at site 1, particularly in the overall positioning of Con-Ins-K1 and αCT with respect to the receptor L1 domain (Fig. S8).
Vh-Ins has the potential to be further improved by substitution with non-naturally-occurring amino acids, particularly at the key binding residues LeuA23, GluB10, and LeuB20. The remaining substitution, LeuB20, likely contributes to binding/potency partly through a limited interaction with receptor and primarily through stabilizing the conformation of the B-chain helix relative to the cognate glycine residue. Both of these effects might be further optimized by protein engineering. Importantly, we found that Vh-Ins-HALQ has very similar biological activity to human insulin (Fig. 7).
In summary, Vh-Ins-H(A/S)LQ is a minimized insulin that shares with native insulin its in vivo metabolic potency, its affinity for the insulin receptor and the IGF-1 receptor, and its signaling capability. Despite lacking the conserved insulin B-chain C-terminal residues that are critical for binding of human insulin to the primary receptor site, Vh-Ins-H(A/S)LQ compensates through a novel extended A chain that provides receptor contacts that mimic those of the native insulin B chain. The cryo-EM structure shows that interactions at site 1 explain Vh-Ins-H(A/S)LQ activity, with the same factors apparently supporting binding to the novel combined site-1/2 conformation. Our structural and functional data also demonstrate multiple opportunities for further optimization as a fast-acting insulin that has the potential to improve therapeutic options for the treatment of diabetes.

CryoEM sample preparation, data collection, and 3D reconstruction
Insulin receptor ectodomain was prepared as described previously 14
Data processing (Fig. S2) was performed using Relion 3.1.0 27 and CryoSPARC v3.0-v3.2 30 . All movies from the multiple grids and both sample preparations (above) were combined to produce the best results.

Sequencing of venom insulins
Motion correction was conducted using Relion's implementation of the MotionCor2 algorithm 31 . Patch CTF estimation was done in CryoSPARC, followed by particle picking using Topaz 26 . The 1.5 M picked particles were 2D classified in subsequent rounds using CryoSPARC, and selected classes (776 k particles) were subjected to 3D classification, giving two major conformations (symmetric and asymmetric) that were subsequently processed independently of each other. Following initial reconstruction using non-uniform refinement, particles were exported back to Relion using pyem 32 , and iterative rounds of Bayesian polishing and CTF refinement were performed. Polished, CTF-refined particle stacks were imported back into CryoSPARC. Asymmetric particles were subjected to an additional round of alignment-free 3D classification in Relion using a mask around the dynamic region of the receptor. Particles with well-defined L1 + CR domains were selected and imported back into CryoSPARC for a final non-uniform refinement and reconstruction at 4.4 Å (Fig. S2). For the focused refinement of the head region, particle sets were further cleaned using 2D classification and multi-class ab-initio reconstructions prior to production of the final 3.3 Å volume using non-uniform refinement. To resolve the FnIII-1, -2 and -3 domains in the symmetric state, all particles following initial 2D classification and a single consensus refinement (in C2) were subjected to alignment-free 3D classification in Relion, and the particles with FnIII domains resolved were selected. Particles were then subjected to supervised heterogeneous refinement in CryoSPARC to remove the dynamic/asymmetric conformational state, and the remaining symmetric particles were refined in C2 using non-uniform refinement, giving a final resolution of 4.1 Å for the whole unmasked ectodomain. 3D variability analysis (3DVA) was additionally performed on asymmetric particles to visualize conformational heterogeneity (Fig. S2, right side). Intermediate results along the trajectory were filtered to 8 Å resolution, which allowed for visualization of particle subsets based on the continuous flexible motion of the L1, CR, L2 and αCT domains.
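The resolutions quoted above (4.4 Å, 3.3 Å and 4.1 Å) are, according to the footnote of the statistics table, measured at an independent half-map FSC threshold of 0.143. As a minimal illustration of how such a value can be read off an FSC curve, the Python sketch below linearly interpolates the first crossing of the threshold. The function name and the example curve are hypothetical and are not taken from this study; the refinement packages report this number themselves, and their exact interpolation scheme may differ.

```python
def resolution_at_threshold(spatial_freqs, fsc_values, threshold=0.143):
    """Return the resolution (in Å) at which an FSC curve first drops below
    `threshold`, using linear interpolation between neighbouring shells.

    spatial_freqs: increasing spatial frequencies in 1/Å
    fsc_values:    FSC value for each frequency shell
    Returns None if the curve never crosses the threshold.
    """
    if len(spatial_freqs) != len(fsc_values):
        raise ValueError("frequency and FSC lists must have the same length")

    for i in range(1, len(fsc_values)):
        prev_f, curr_f = spatial_freqs[i - 1], spatial_freqs[i]
        prev_c, curr_c = fsc_values[i - 1], fsc_values[i]
        if prev_c >= threshold > curr_c:
            # Fraction of the way from the previous shell to the current one
            frac = (prev_c - threshold) / (prev_c - curr_c)
            crossing_freq = prev_f + frac * (curr_f - prev_f)
            return 1.0 / crossing_freq
    return None


# Hypothetical FSC curve, for illustration only (not data from this work)
example_freqs = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
example_fsc = [1.00, 0.98, 0.90, 0.60, 0.20, 0.05]
print(resolution_at_threshold(example_freqs, example_fsc))
```

The same function can be used with a threshold of 0.5 for a model-to-map FSC curve, which is the second criterion named in the table footnotes.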
Model building was done in Coot version 0.9.3 33 using PDB entry 6PXW as a starting model 16 . Real-space refinement was conducted in Phenix version 1.19.1 34 using per-chain symmetry restraints and secondary-structure constraints for α helices and β sheets. Intermediate and final models were validated using MolProbity 35 . Data visualization was performed using UCSF Chimera 36 and UCSF ChimeraX 37 .

Competition receptor binding assay
Competition binding assays were performed with solubilized immunocaptured human insulin receptor

Western immunoblots
Activation of the human insulin receptor was assessed by immunoblotting as previously described 40 . L6 myoblasts overexpressing IR-A (240,000 cells/well) were seeded in 6-well plates, allowed to grow to confluence (~48 hours), and then stimulated with 10 nM human insulin or Vh-Ins-HALQ for different times. Cell lysates were precipitated with trichloroacetic acid, pH-neutralized with 1 M Tris pH 8.0, separated by 10% SDS-PAGE, transferred to nitrocellulose membrane, and immunoblotted with primary antibodies for 16 hours at 4 °C. Antibodies used were phospho-AKT (T308) (New England Biolabs #9275S), phospho p44/42 MAPK (ERK1/2) (T202/Y204) (New England Biolabs #9101S) and mouse anti-β-tubulin (Invitrogen #32-2600). Total AKT and ERK1/2 levels do not change over the time course measured (not shown). β-Tubulin was used as a loading control, against which pAKT and pERK1/2 were normalized. Quantitation of the blots was performed using the Image Studio Lite software.
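The quantitation described here (phospho-signal normalized to the β-tubulin loading control, then expressed as a percentage of the insulin response at 10 min) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the dictionary layout and the band intensities are hypothetical and are not values or code from this study.

```python
def normalize_to_loading_control(phospho_signal, tubulin_signal):
    """Divide a phospho-protein band intensity by its loading-control band."""
    if tubulin_signal <= 0:
        raise ValueError("loading-control intensity must be positive")
    return phospho_signal / tubulin_signal


def activation_percent(samples, reference_key):
    """Express each normalized signal as a percentage of a reference sample.

    samples:       dict mapping (ligand, minutes) -> (phospho, tubulin) intensities
    reference_key: key of the sample defined as 100 %, e.g. ("insulin", 10)
    """
    normalized = {
        key: normalize_to_loading_control(phospho, tubulin)
        for key, (phospho, tubulin) in samples.items()
    }
    reference = normalized[reference_key]
    return {key: 100.0 * value / reference for key, value in normalized.items()}


# Hypothetical band intensities (arbitrary units), for illustration only
example_blot = {
    ("insulin", 0): (20.0, 100.0),
    ("insulin", 10): (200.0, 100.0),
    ("Vh-Ins-HALQ", 10): (150.0, 100.0),
}
print(activation_percent(example_blot, ("insulin", 10)))
```

Averaging such percentages over independent experiments would then give the data points and error bars plotted for each ligand and time point.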
Activation was expressed as a percentage of the response to insulin at 10 min (three independent experiments

Outliers (%) 0.16 0.23 0.53
†: Resolution as measured by an independent half-map FSC threshold of 0.143
‡: Resolution as measured by a model-to-map FSC threshold of 0.5

Figures
Figure 1
Alignment of insulin sequences. a, Molecular phylogenetics closely groups the newly identified venom insulin sequences with venom insulins previously identified from other fish hunters (red branches in tree). Tree branches of venom insulins from snail and worm hunters are shown in black, and those representing endogenous signaling insulins are depicted in gray. b, Sequence alignment of human and zebrafish insulin. c, Alignment of venom insulins sequenced here from C. mucronatus (Mo1 and Mo2) and C. laterculatus (La1 and La2) and previously from C. kinoshitai 10 (K1 and K2). d, Alignment of venom insulins from C. geographus (G1) and C. tulipa (T1A and T1B) 3 . B-chain residues deleted in DOI and important for receptor-binding and dimerization (b) or residues unique to venoms and predicted to bind insulin receptor

Activities of venom-insulin hybrid (Vh-Ins) analogs based on cone snail venoms containing extended A-chain sequences. a, Sequences of human insulin, desoctapeptide insulin (DOI) and venom-DOI hybrids. Red letters indicate altered sequence relative to human insulin; X represents the elongated A-chain sequences in Fig. 1c. b, Cellular activities of Vh-Ins analogs (all with B10E, B20L substitutions) with various elongated A-chain sequences from cone-snail insulins (without C-terminal amidation). Error bars are shown when greater than the size of the symbols. c, Cellular activities of Vh-Ins analogs with alanine substitutions. All sequences are without C-terminal amidation.

Figure 3
The dominant symmetric Vh-Ins-HSLQ-receptor structure. a, Schematic of the insulin receptor domains and disulfide connectivity. b, Structure of the insulin receptor ectodomain with four insulin molecules bound (PDB 6SOF) 14 . Insulins are depicted in surface representation (orange, site 1; cyan, site 2). c, Consensus refinement density of the Vh-Ins-HSLQ-receptor complex (4.1 Å). Box indicates the region selected for focused refinement. Receptor protomers, blue and pink. Vh-Ins-HSLQ, orange (site 1) and cyan (site 2). d, Focused refinement map. Site 2 insulins do not contribute significant signal relative to noise at the filter frequency used for the final reconstruction (3.3 Å). e-h, Example model-to-density fits for Vh-Ins A chain, Vh-Ins B chain, L1, and αCT residues, respectively.

Vh-Ins-HSLQ at site 1. a, The extended A-chain residues (green) are in close proximity to the receptor αCT and L1 domains. LeuB20 contacts L1 and GluB10 interacts with FnIII-1. b, Comparison of site 1 Vh-Ins-HSLQ (orange) and human insulin (white, PDB 6PXW) 16 aligned by superposition of L1 and αCT residues. Vh-Ins-HSLQ extended A-chain residues (green) contact the same receptor surface engaged by native insulin PheB24 and PheB25. The helix formed by Vh-Ins-HSLQ residues A13-A23 is kinked 24° away from αCT at HisA21. The native insulin A-chain C terminus is indicated with an asterisk. c, Vh-Ins-HSLQ LeuA23 binds a hydrophobic pocket in a similar fashion to native insulin PheB24. d, Surface representation of the pocket formed by αCT and L1 and binding by LeuA23 and PheB24. e, Vh-Ins-HSLQ LeuB20 packs against TyrB16 and approaches receptor Lys40.

Conformational heterogeneity in Vh-Ins-HSLQ-receptor reconstructions. a, The ~25% of particles that segregated into an asymmetric conformation following 3D classification were binned into eight distinct maps along the conformational trajectory. b, Model of the most asymmetric cluster of particles. Three Vh-Ins-HSLQ molecules are bound. Site 1 (orange) and site 2 (cyan) positions are occupied for the right-hand "up" arm of the receptor in the same manner as the symmetric conformation (Fig. 2). One Vh-Ins-HSLQ is bound to the left-hand "down" arm of the receptor at the combined 1/2 site (green) that approximates a combination of site-1 and site-2 interactions. c, Model of the conformational state that most closely resembles a two-fold symmetric ectodomain complex, built into a 6.0 Å map. Vh-Ins is apparent at both

Activity of Vh-Ins-HALQ relative to native insulin. a, ERK(pThr202, pTyr204) and AKT(pThr308) signaling profile induced in L6 myoblasts by Vh-Ins-HALQ (pink) and native insulin (black). Data points here and in other panels represent the average of three experiments, and error bars (standard error of the mean) are shown when larger than the symbols. A representative western blot is shown. b, Insulin tolerance test in rats determined by the lowering of blood glucose following subcutaneous injection of 0.017 mg kg⁻¹ insulin or Vh-Ins-HALQ. c, DNA synthesis in response to the concentration of insulin or Vh-Ins-HALQ is shown as percent incorporation of 5-ethynyl-2'-deoxyuridine (EdU) above the basal level. Insulin vs Vh-Ins-HALQ *** p value <0.001 (2-