A rotamer relay information system in the epidermal growth factor receptor–drug complexes reveals clues to new paradigm in protein conformational change

Graphical abstract


Introduction
Epidermal growth factor receptor (EGFR) protein was discovered in the late 1970s, and to this day it remains a primary target for anticancer therapy [1,2]. Mutations and upregulation of EGFR are known mechanisms of both oncogenesis and therapeutic resistance. Active EGFR mutants escape the effects of chemotherapy via continuous proliferation and evasion of apoptosis, especially in the case of non-small cell carcinoma of the lung (NSCLC) [3]. Furthermore, EGFR is prevalent in other solid tumours including: breast, colon, renal, ovarian, and head-and-neck cancers [4]. EGFR is a member of the receptor tyrosine kinases (RTKs) cell surface receptor family [5,6], and is also a member of the ErbB family, a four member family responsible mainly for regulating cell proliferation [6]. The primary function of EGFR is to mediate signals for differentiation, motility, and apoptosis [6]. Therefore, any irregular EGFR behaviour can easily become pro-oncogenic [7]. Pro-oncogenic mutations in EGFRs are widespread and linked to tumour overgrowth and resistance to chemotherapy [8].
Chemotherapeutic approaches have focused on several aspects of the EGFR structure and mechanism of action. For example, monoclonal antibodies, such as cetuximab, have been devised to target the extracellular domain, as reported in treatments for bowel or head and neck cancer [21]. Inhibitors of the intracellular tyrosine kinase subdomain act to attenuate tyrosine kinase activated signalling pathways, and thereby trigger cancer cell death [22], with reports of success in NSCLC patients in terms of treatment and quality of life [23]. Unfortunately, it has become clear now that both monoclonal antibody-based agents and tyrosine kinase inhibitors (TKIs) are struggling to keep pace with the emerging of new EGFR mutations. While monoclonal antibodies are effective against wild type EGFR [24], they are much less effective against EGFR mutants (i.e., exon 19 deletions and L858R mutation) detectable in 10-15% of Caucasian NSCLC patients and 50% of Asian patients [25,26]. The first generation TKIs such as erlotinib and gefitinib were originally thought to be more robust against EGFR mutations. However, the emergence of the T790M mutation, in the EGFR ATP-binding site of the tyrosine kinase subdomain, was sufficient to curtail the efficacies of reversible first generation TKIs [27]. The T790M mutation is present in about 50%-60% of patients that develop chemoresistance to TKIs [28]. First generation TKIs were quickly replaced by irreversible second generation EGFR TKIs (pan-HER inhibitors), such as afatinib, dacomitinib and neratinib, that functionally inhibit wildtype and T790M EGFR mutants [29]. Thereafter, third generation EGFR TKIs were developed, such as osimertinib and rociletinib, to overcome this particular chemoresistance, with more efficacy and less side effects than observed with first and second generation inhibitors particularly owing to covalent binding to the C797 residue [26]. Unfortunately, the appearance of a C797S mutation in the tumours of patients treated with third generation TKIs rapidly curtailed the efficacy of these drugs [30,31]. Accordingly, state-of-the-art fourth generation allosteric EGFR inhibitors, like EAI045 and EAI001, were created to target a different binding site in the EGFR kinase. These drug entities avoid the problems of both T790M and C797S mutations. However, the evidence suggests that such fourth generation inhibitors are insufficiently effective alone, for example EAI045 is only properly effective in combination with cetuximab [32,33].
Clearly, resistance to chemotherapy is not just simply a function of EGFR mutations alone. EGFR-independent factors include the overexpression of additional growth factor receptors such as HER2, MET and FGFR, reduced NF1 expression, and overactivity of PI3K or B-Raf [28,[34][35][36]. Having said this, EGFR mutations are dominant, and thus EGFR remains a primary upstream target for cancer chemotherapy.
Recently exploited concepts in cancer like the game theory and the biological informational theory could help to understand this problem in more sophisticated way. The game theory in cancer, describes how cancer can take advantage of the dynamics of the tumour microenvironment for its own survival by different ways, for instance, tumour therapy resistance [37]. On the other hand, the biological information theory in cancer, could help us to understand how cancer cells can maintain their signal transduction specificity and the amount of information transmitted with different mutations. Further, it explains how cancer develops resistance to hold the integrity of its informational system, which will be of interest of cancer cells survival [38]. Accordingly, there is a major unmet need to understand the molecular mechanisms of EGFR mutation-induced resistance to chemotherapy. In the light of increasing numbers in deposited EGFR kinase 3D structures, it has become a challenge in structure-based drug design to make a choice without fully understanding the variations involved at the molecular level. In a previous study, we have identified C-helix in the N-lobe of EGFR kinase domain as the major structural variation occurring in EGFR kinase complexes with ligands based on analysis of backbone movements [39]. The objective of this work is to investigate the relationship between EGFR kinase domain rotamer variations, mutations, C-helix movement, and kinase activation (DFG domain movements). By taking advantage of our recently developed code for rotamer analysis [40], here, we attempted to shed the light on the biological relevance of global rotameric changes in the EGFR kinase.

Dataset processing
Using the keyword ''EGFR", RCSB protein databank (www.rcsb. org) database search resulted in 260 structures. All entries that did not cover the kinase domain of EGFR were excluded. Entries without inhibitors were also excluded except for four wildtype entries: 1M14, 2GS2, 3GOP, 4TKS. Only chain A was retrieved (the number of chains ignored were 28 from all structures). In total, 116 EGFR kinase 3D structures spanning 714-950 (Uniprot ID P00533-1) were trimmed and further studied. The two lobes of the kinase domain were identified as N-lobe (spanning 714-795) and C-lobe (spanning 796-950), and the ligands were salvaged.

Visualization
Visualization of Protein and ligand 3D Structures was done in UCSF Chimera (version 1.10.2). The matchmaker plugin was used for superposition of all heavy atoms via BLOSUM-62 scoring matrix and Needleman-Wunsch alignment algorithm. Structure rendering and animation were done in UCSF Chimera using the command line. Graphics were processed using MS Powerpoint and Adobe Photoshop.

Structure fitting and cluster analysis
Structure fitting and cluster analysis were done in R language (Version 3.6.1, The R foundation for Statistical Computing, Austria) using rmsd() function from Bio3D library (Grant lab, University of California, San Diego, USA) and agnes() function from Cluster library (Martin Machler, ETH Zurich, Switzerland) using the Ward method for root mean square deviations (RMSD) dissimilar matrix, respectively. The Ward method of clustering, which is also known as the minimum variance method, is general purpose method of clustering that starts with n clusters (each containing a single structure), then these clusters are combined in each step (minimizing the variance) until all structures are contained within a single cluster. RStudio Version 1.2.5001 (RStudio, Inc.) was used for coding and obtaining results.

C-helix and DFG domain analysis
C-helix movement and DFG domain clustering were measured by two techniques. 1) the angles of the helix axis (residues 756-767) between each structure and a reference structure (Reference PDB IDs: 3gop and 1m14) were measured using a command line in UCSF Chimera after superposition of 3D structures. 2) Active and inactive kinase structures based on clusters of DFG domain torsions were estimated according to the new nomenclature of Modi and Dunbrack [41]. Briefly, the DFG cluster is comprised of the Ramachandran regions (A, alpha; B, beta; L, left) of the Tyr854, Asp855 and Phe856 in addition to the first side chain angle of Phe856 (minus, plus, trans). The most common active DFGin cluster is BLAminus (beta Tyr854, left Asp855, alpha Phe856 and minus Phe856 side chain), whereas the most common inactive DFGout cluster is BBAminus (beta Tyr854, beta Asp855, alpha Phe856 and minus Phe856 side chain).

Comparative rotamer analysis
Comparative rotamer analysis was done in R language according to our previously published method [40]. Briefly, using Bio3D library, structures were loaded via read.pdb() function and torsional angles were calculated via torsion.pdb() function. Classification of rotamers was done according to the Richardson's Penultimate rotamer library [42], using IF/ELSE statements as previously described [40]. Rotamer nomenclature is based on the side chain torsional angles (v1 torsion between N, Ca, Cb and Cc atoms, v2 angle between Ca, Cb, Cc and Cd atoms, etc.). The angle modes were used to classify the three main classes as plus/trans/minus (i.e., p, t, m for +60°, 180°, and -60°, respectively). For atoms in side chains other than carbon, the angles modes were shifted and were replaced with explicit angle mode (e.g., m-80 rotamer for Asn described v1 = -60°and v2 = -80°modes). In this manuscript, the degree symbol for explicit angles was conveniently removed from nomenclature for coding purposes. The R scripts, which are still not optimized as a package, are available for academic purposes upon request from the corresponding author.
Chi square test of independence was used to construct crosstabs between groups of structures and residue rotamers and to determine if there was any association between the variables in IBM SPSS Statistics 21 program (IBM Corporation, Armonk, New York, USA). A p-value below 0.05 was considered significant.

Rotameric differences between EGFR kinase structures
In the past two decades, an impressive wide range of TKIs targeting EGFR have been devised [39]. At the same time, a large number of EGFR-ligand complexes have been deposited in the database every year, although these structures are not necessarily representative of the latest trends in EGFR inhibitor design due to the time required for X-ray crystallography experiments. Nevertheless, the availability of hundreds of similar such structures does provide statistical confidence in the validity of comparative structure analysis regarding conformational changes.
Accordingly, we obtained and trimmed 116 EGFR structures for fitting and clustering analysis using RMSD for all atoms.
Cluster analysis showed two main clans of structures ( Fig. 1) similar to our previous analysis of the N-lobe of EGFR kinase [39]. The first clan of 73 structures was larger and comprised mostly of the C-helix-IN conformation and two or less mutations. While the second clan of 43 structures was correlated with the C-helix-OUT conformation and three or less mutations. In our previous work, the focus was on RMSD values calculated for Ca-backbone movements, here we chose to study both backbone and side chain movements. Initially, in order to quantify the C-helix axis movement, two reference structures were used to define C-helix-IN (PDB ID 1m14) and C-helix-OUT (PDB ID 3gop) conformations. Interestingly, the C-helix axis angles for 1m14, and other C-helix-IN conformations, were in the range > 0°t o 6°, while C-helix angles for 3gop, and other C-helix-OUT conformations, were in the range of 6°to 33°degrees (average 25°degrees) (Supplementary Table 1 For more insight, we then performed a deep rotamer analysis (Supplementary Table 1). Nearly 43 residues (18%) of the total of 237 residues spanning the studied kinase structures showed significant rotamer variations between the C-helix-IN and Chelix-OUT clans (Table 1 and Fig. 2). In particular, six residues spanning the C-helix (namely, Asn756, Ile759, Glu762, Tyr764, Ser768, and Val769) exhibited significant rotamer variations between C-helix-IN and C-helix-OUT conformations. In addition, two residues of the DFG domain (Asp855 and Phe856) also displayed significant rotamer variations between the C-helix-IN and C-helix-OUT clans.
In general, three types of rotamer variations were observed between the C-helix-IN and C-helix-OUT clans: Firstly, variations comprising unique rotamers. Secondly, where one rotamer is common in both clans but the second ranked rotamer is different. Thirdly, where one rotamer is common in 100% of cases in one clan, while other rotamers dominate in the other clan. Rotamer differentiations were most clear with triple mutants (last ten rows at the bottom of Figs. 1 and 2). Although, triple mutants were seen to share similar rotamers in their DFG domains, particularly at Lys852 and Phe856, and at other more distant residues (Arg889 and Lys913). Most importantly, the locations of these rotamer variations take on the appearance of side chain conformational relays extending out from points of mutation to different regions of the EGFR kinase (Figs. 3 and 4, Supplementary Movie 1-3).
This poses the obvious question which is, can EGFR mutations act singly or together to induce drug resistant conformational changes in EGFR that are communicated via these side chain conformational relays? Furthermore, can rotamer variations on the protein surface also play role in allosteric mechanisms? In fact, given that most of the side chains in the 43 residues mentioned above are actually facing the protein surface (with access to water solvent), then any potential ''flow of information" from rotamer to rotamer in such a dynamic environment must also involve torsional angle movements in both backbone and side chains in addition to surface network H-bonds or salt bridges with water molecules. Indeed, nearly 20 residues (47%) in the relay are located in the smaller N-lobe and thus are more exposed to water solvent. Furthermore, rotamer variations are observed across most types of amino acids (except for Cys and Trp, perhaps due to their general infrequency in proteins), and not only long chains amino acids (e.g. such as Arg and Lys) that may display alternative rotamers on the surface due to their flexibility alone [43,44].

Resistance, mutations and EGFR relay systems
Tumours are known to develop two types of drug resistance, i.e., the innate and the acquired. Innate resistance is defined as the failure of initial therapy due to various tumour mechanisms. Acquired resistance is defined as progression (e.g., due to mutations) of the disease after a period of ''clinical benefit" [45]. Acquired EGFR-TKIs resistance mechanisms vary according to TKI types, mutation types, and other factors. Before we highlight those EGFR mutations involved in acquired resistance mechanisms, it is important to emphasize on the nature of wildtype EGFR kinase in free and drug-bound forms. The catalytic activity of EGFR is regulated by three mechanisms: phosphorylation, autoinhibition, and allosteric binding [46]. Wildtype EGFR kinase is intrinsically autoinhibited in a similar way to the Src and CDK proteins, and interestingly, even though EGF ligand-free wildtype EGFR is not phosphorylated (not activated) this protein adopts the active form conformation in crystals. Indeed, unlike other tyrosine kinases, EGFR kinase is not as ''tightly" autoinhibited, and can maintain an active conformation and basal activity at high concentrations of EGFR or ErbB2 heterodimer [15,47]. In our previous study [39], we mentioned three wildtype PDB structures of ligand-free EGFR, namely, PDB IDs 1m14, 2gs2, and 4tks all showing the C-helix-IN (DFGin/BLAminus) conformation. Hence why 1m14 was selected as reference structure for the C-helix-IN clan. The C-helix-OUT conformation is found in 3gop which was used here as reference structure for the C-helix-OUT clan (even thought this does in fact comprise a K745M mutation).
The first and most common type of mutation in EGFR, the T790M mutation, occurs at exon 20 of the EGFR gene which is responsible for > 60% of the acquired resistance cases in NSCLC. Here, we observed T790M single mutants with the DFGin conformation in both C-helix-IN (10 structures) and C-helix-OUT (10 structures) clans, with nearly 12 BLAminus conformations Table 1 Top frequent differential rotamers between C-helix-IN and C-helix-OUT clans (count is shown after each rotamer). (Supplementary Table 1). Hence, we would suggest that the T790M mutation is not critical in triggering the relay of a ''flow of information" from rotamer to rotamer to desensitize EGFR kinase to TKIs. Such a statement is not surprising given that T790M mutation involves just the replacement of threonine -a local gatekeeper residue and important determinant of inhibitor specificity in the ATP-binding pocket -to methionine. This residue replacement decreases first and second generation TKI binding affinities for the ATP-binding pocket, owing to increased steric hindrance, and increases the ATP-binding affinity; thus, enhancing competition for binding between ATP and TKIs [45,48,49]. According to Yun et. al., [48] who studied the T790M mutation in both active and inactive EGFR structure states, the mutation is hypothesized to alter directly the conformation of the DFG moiety in the ATP-binding pocket from an inactive to active form via favourable hydrophobic interactions between M766 and L777, that could lead to changes in the positions of the DFG loop or C-helix. In terms of comparative binding data, the ATP-binding affinity is higher for the T790M/L858R double mutant than the L858R single mutant. The difference correlates directly with higher resistance towards gefitinib and erlotinib [50]. Here, we reported  Table 1). Accordingly, we would suggest that the L858R mutation could act to trigger the relay of a ''flow of information" from rotamer to rotamer to sensitize EGFR kinase to TKIs. Indeed, In some cases, the low percentage is actually representing the second top rank rotamer as statistically differentiating where the first top rank rotamer was similar (e.g., in Ile715 the first top rank rotamer was mt in both relays). this L858R single mutant is a single missense mutation in exon 21 that is also one of the most frequent EGFR alterations found in NSCLC tumours [51]. Moreover, L858R is very frequently mutated to T790M/L858R double mutants in cancer patients, such that other L858R double mutants are found with at best only 5% incidence [52][53][54]. Furthermore, if the T790M single mutant and the T790M/L858R double mutant are compared, although they maintain the same low nanomolar affinity for gefitinib as the L858R single mutant, the T790M single mutant exhibits a higher ATPbinding affinity than the L858R single mutant. Accordingly, the T790M/L858R double mutant represents an activated enzyme that becomes resistant to ATP-competitive TKIs [48].
Moving on to the C797S mutation, this is also located in the ATP binding pocket and prevents covalent binding of covalent TKIs. When cysteine is substituted with serine at codon 797, crossresistance is gained with respect to irreversible third generation TKIs, such as Osimertinib. In this instance, 2 structures, double mutant T790M/C797S and triple mutant T790M/C797S/V948R, appear in the C-helix-OUT clan. The former double mutant from PDB ID 5xgn presents a DFGin/BLAminus conformation with the C-helix axis at equal angles (13°) to both 3gop and 1m14 reference structures (Supplementary Table 1). The latter triple mutant from PDB ID 5zwj is firmly with C-helix-OUT (DFGin/BLBplus) conformation (Supplementary Table 1). Accordingly, the C797S mutation could act to trigger the relay of a ''flow of information" from rotamer to rotamer to desensitize EGFR kinase to TKIs. In this respect it is worth noting about T790M/C797S double mutations that there are in fact three well described resistance states: 1) the cis T790M/C797S allelic state, where both mutations occur in the same receptor protein, which is resistant to all available EGFR-TKIs although sensitive to fourth generation, 2) the trans T790M/C797S allelic state, where either of the two expressed receptor proteins harbours one or both mutations, which is sensitive to first and third generation TKIs, 3) a T790M mutation loss combined with a C797S mutation gain which is sensitive to first and the second generation TKIs. Even though the cis state dominates, further investigation is needed to understand the structural differences in the ATP binding pocket that are associated with the different mutational combinations [55][56][57][58][59].
The alternative G719S mutation occurs in the phosphatebinding loop (P-loop) which is considered TKI-sensitive according to the National Comprehensive Cancer Network (NCCN, www.nccn.org) guidelines. Here, 5 cases of the mutant were located in the C-helix-IN clan with active C-helix-IN (DFGin/BLAminus) conformation. In addition, a double mutant G719S/T790M is also located in the C-helix-IN (Supplementary Table 1. Arguably, the G719S mutation could act to trigger the relay of a ''flow of information" from rotamer to rotamer to sensitize EGFR kinase to TKIs, in this instance. Computational studies on the G719S mutation suggest that TKIs can enter the ATP-binding site with ease. Indeed, simulations indicate that the distance between residues L718 and G796 is increased widening the ATP-binding site for TKIs to enter (conversely, the T790M mutation causes the distance between L718 and G796 to decrease) [60]. Moreover, the G719S mutation  ID 5ugc) showing the information relay residues as solid light blue surface (C-helix in green and mutations in red). The rest of the kinase is shown in blue ribbon and transparent solvent accessible surface. Most of the relay is connected and fits the flow of information from mutation to the rest of the relay (note that V948R mutation connects to only few residues in the C-lobe, in which case the information from V948R is transferred by other means than rotameric moves such as backbone, water and allosteric effects). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.) destabilizes the inactive conformation and promotes the active conformation of the kinase, leading to more TKI sensitivity [61][62][63][64][65]. However, when G719S is combined with T790M as a double mutation, the secondary T790M mutation overturns the impact of G719S on the distance between the P-loop and activation loop [60].
Finally, turning to the T790M/C797S/L858R triple mutation, studies on targeted therapy -via the new allosteric inhibitor EAI045 in combination with cetuximab -demonstrate a different mode of resistance as compared to that exhibited previously [66,67]. EAI045 binds allosterically to T790M via a pocket that is facilitated by external dislocation of the C-helix. EAI045 is able to achieve allosteric binding to EGFR by binding to the mutant M790 gatekeeper residue and forming a hydrogen bond with the DFG motif. At least two mechanisms account for the mutantspecificity of the EGFR allosteric inhibitors: Firstly, the M790 gatekeeper residue enhances the selectivity of EAI045 for the T790M mutant. Secondly, in the wildtype EGFR, EAI045 is unable to bind efficiently given the lack of allosteric pocket in the kinase. Cetuximab -a dimerization blocking agent -is usually used with EAI045 to mimic the effect of mutations that disrupt the asymmetric dimer in EGFR. Basically, the allosteric pocket in the L858R/ T790M mutant is accessible in the two subunits of the asymmetric dimer unlike in wild type EGFR. Therefore, it is very rational to use cetuximab to enhance the potency of allosteric agents [67].
Our knowledge of another triple EGFR mutant T790M/C797S/V948R comes mainly from comparative binding studies between EAI001 and EAI045. EAI045 exhibits a higher affinity for triple mutants than its predecessor EAI001 for T790M/V948R double mutants [68]. This increased affinity was attributed to the formation of new hydrogen bonds between EAI045 and the backbone of F856. In this case, the C-helix is pushed outwards to accommodate EAI045 binding and the formation of multiple hydrophobic interactions via its aromatic rings (particularly with L747, I759, M766, L777, L788, M790, and F856). On the other hand, further studies are required to shed light on the role of T790M/L858R/V948R triple mutants in EGFR resistance to TKIs. In this instance, it is important to emphasize the role of dimerization dependency in understanding the structurefunction consequences of mutations. For example, while several mutants like L858R or G719S are dimerization-dependent (requiring dimerization for oncogenic activation of EGFR), other mutations were reported to be dimerization-independent. Indeed, the V948R mutant represents a surface mutation in the C-lobe and is known as a dimerization-deficient mutant, which is very useful in functional studies [69,70]. Here we have identified 12 structures belonging to the C-helix-OUT clan with C-helix-OUT (DFGout/ BBAminus) conformation (Supplementary Table 1). Moreover, all the V948R mutants (single, double and triple) belonged to the Chelix-OUT clan, thus emphasizing its role in inactivation of EGFR kinase. Accordingly, we would suggest that the V948R mutation could act to trigger the relay of a ''flow of information" from rotamer to rotamer to desensitize EGFR kinase to TKIs.

Biological significance of EGFR relay system
EGFR is a part of the signalling processes involved in cell-to-cell communication system [71]. Therefore, this receptor is central to normal as well as cancer cell viability. During chemotherapeutic interventions, EGFR becomes under tremendous selective pressure to maintain its activity, in order to promote cell growth and proliferation. This is clearly demonstrated by the development and subsequent obsolescence of three generations of TKIs through the appearance of a combination of innate (random) and adaptive (induced) EGFR mutations. According to our analysis, EGFR has undergone an adaptive and cumulative sequence of three point mutations which could act singly or together to induce drug resistant conformational changes in EGFR that are communicated by a ''flow of information" from rotamer to rotamer via side chain conformational relays (Figs. 4 and 5A) [72][73][74][75]. Each conformational relay represents a chain of mutation-induced, linked changes (domino-like) in amino acid residue rotamer conformations, that we propose cause the displacement of a whole helix moiety within the tyrosine kinase subdomain (C-helix-OUT). The combined effects of this conformational relay presents a situation where TKI inhibitors no longer have a suitable binding pocket to bind to and inhibit EGFR, and mutant EGFRs themselves become more aggressive agents of signal transduction without the need for typical tyrosine kinase activity [76]. Indeed, such mutant EGFRs preserve the ''informational system" with sustained pro-proliferative signalling that is pro cancer cell survival, [38] leading to more aggressive tumour progression than is possible with wildtype EGFR, as observed in NSCLC [77].
The failure of three generations of TKIs to inhibit the EGFR communication system can be understood using Shannon's Biological Information Theory (Fig. 5B) [72][73][74][75]. Briefly, EGFR with a mutant, inactive tyrosine kinase is still capable of transmitting signals and information, hence the kinase function is related to growth not survival [76]. Therefore, in the case of wildtype EGFR drug treatment (Normal sensitivity, Wt > Mt = C-helix-IN > C-helix-OUT), the cell will search for ways to develop drug resistance to maintain growth transmission [36]. Hence, even if the wildtype EGFR has been targeted correctly this will only affect tumour growth temporarily not overall survival [78]. Cancer cells expressing wildtype EGFR do not rely on the EGFR kinase activity but on EGFR for survival [79,80]. The same is true in the case of the mutant type where EGFR is already undruggable (Higher sensitivity = Mt Wt = C-helix-OUT C-helix-IN). In this case, emerging mutations will confer the rotameric relay and further desensitise EGFR to TKIs. Therefore, should physicians insist on using TKIs on EGFR until more mutant inactive EGFR kinases are realized (Ultra sensitivity = Mt >>> Wt = C-helix-OUT >>> C-helix-IN) [81,82], then the result must be an undruggable protein that will accelerate the signalling cascade, promote cancer cell survival, and worsen disease prognosis [83][84][85].
Seen from a different angle, the use of first, second and third generation TKIs leading to the widespread appearance of undruggable mutant EGFRs can be seen and understood through the lens of game theory (Fig. 5C) [86,87]. Cancer cells are known for their adaptivity, but they can neither anticipate nor evolve adaptations for treatments that the physician has not yet applied. Therefore, a distinctive leader-follower (or ''Stackelberg") dynamic must apply, in which the oncologist ''leader" plays first and the cancer cell ''follower" then responds and adapt to the treatment regime. The physician ''plays" a fixed strategy even while the opposing cancer cells continuously evolve counter measures until disease progression is no longer halted [88]. Furthermore, by changing treatment only when the tumour progresses, the physician abandons leadership to the cancer cells and treatment failure becomes nearly inevitable. At the molecular level, we observe the end product of this game theory challenge. In the case of EGFR, obsessive use of one class of drugs for one key target renders the target undruggable as a result of only three mutations and their linked conformational relays (Fig. 5D). Our structural and computational data highlight the need to adopt more sophisticated combination approaches for treatment in order to overwhelm tumours before they can mount direct adaptive changes at the molecular level that lead to resistance of treatment. While our structural and computational data account for TKI insensitivity, we do not currently have an equivalent molecular level understanding for how mutant EGFRs possess heightened signal transduction for cancer cell proliferation.
Finally, if we are to look at the conformational relay as equivalent to an electric relay, one should wonder how far the relay ''wire" extends? In other words, do rotamer-based conformational relays transfer conformational information beyond the EGFR extracellular domain, or are these relays only for the transmission of relatively local conformational information? Considering that dimerization might play a role in forming the relays via protein-protein interactions, it is unclear if such relays exist in all protein molecules as a mechanism operating via protein-protein interactions. Future work may reveal if such mechanisms are more common or just a unique phenomenon of the kinase domain of EGFR. On the other hand, the literature is full of biological examples where helix movement controls protein functions such as open/close conformations in membrane transporter proteins [89,90], and enzymes [91]. Furthermore, helix movements are known to facilitate the longrange transmission of conformational change either in case of the closure of clefts between domains in enzymes or in allosteric transitions [92]. We hope that our findings will bring new insights in understanding protein behaviour and also protein folding, particularly taking into consideration that torsional angle movements are the leading switches in protein folding.

Conclusions
The development of TKIs against the EGFR kinase domain in NSCLC has been a great challenge for researchers in the past two decades. Crystal structures of EGFR kinase domain complexed with ligands have identified several structural conformations, i.e., Chelix-IN/OUT, DFG-in/out, and DFG clusters. Here, we further identified global conformational changes at the side chain level that represent major networks of rotamer-based relays encompassing nearly 43 residues that could correlate with the aforementioned structural conformations. Wildtype apo-EGFR adopts the active Chelix-IN (DFGin/BLAminus) conformation, whereas, wildtype EGFR complexed with ligands adopts both active C-helix-IN (DFGin/ BLAminus) and inactive C-helix-OUT (DFGin/BLBplus) conformations. Since single T790M mutants were found in both C-helix-IN and C-helix-OUT clans, and single L858R mutants plus single C719S mutants were found in the C-helix-IN clan, then these mutations appear not to trigger the relay of a ''flow of information" from rotamer to rotamer to desensitize EGFR kinase to TKIs. This conclusion is reinforced by the fact that the T790M/L858R double mutant was found mostly in the C-helix-IN clan with C-helix-IN/DFGin conformation (mostly as BLAminus with a few cases reported as BLAplus). Note that we have used the term desensitization as synonymous to inducing C-helix-OUT conformation. In spite of this, the T790M mutation does desensitize EGFR kinase to TKIs through the mechanism of local steric hinderance to drug binding. On the other hand, single C797S and V948R mutants are associated with the C-helix-OUT clan. Accordingly, we suggest that these mutations can trigger the relay of a ''flow of information" from rotamer to rotamer to desensitize EGFR kinase to TKIs. Indeed, the T790M/L858R/ V948R triple mutant is firmly associated with the C-helix-OUT clan, which underlines the signal importance of the V948R mutation (C) A game theory explanation for mutations and conformational relays: the blue line describes the actions of the physicians against the NSCLC, Rx is the point, which describes the most appropriate action of physician as game leader. We propose at this point, that the Wt EGFR mostly in the C-helix-IN conformation exceeds Mt EGFRs. The red line describes the consequences of the physician's actions on EGFR, N is the Nash balance point where his/her actions start to have no effect on the NSCLC tumour with the formation of C-helix-OUT conformations, Rm is the point where the NSCLC takes the lead from the physician due to his/her repeated actions (therapeutic strategy) that will force the NSCLC to adapt to the new environment. At this point, there will be more mutant C-helix-OUT EGFRs. (D) Comparison between different EGFRs in aspects of targeting, druggability, kinase activity/signalling and C-helix position. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.) given the variable individual impacts of T790M and L858R mutations, as noted above. The emergence of EGFR mutations that appear to transmit conformational changes by rotamer to rotamer conformational relays to render EGFR undruggable, is an important new concept. Finally, by employing the biological information theory and game theory, we can connect the dots between what is observed in clinic versus what is happening in the cancer microenvironment and its impact on EGFR at the molecular level. Our take-home message for physicians is not to abuse any therapeutic option that might cause loss of lead in the game of cancer therapy. Hence combinatorial approaches to treatment that exploit other weaknesses in the tumour during the early stages of EGFR mutations would be highly appropriate and desirable.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.