Allosteric binding sites in Rab11 for potential drug candidates

Rab11 is an important protein subfamily in the Rab GTPase family. Physiologically, these proteins function as key regulators of intracellular membrane trafficking. Pathologically, Rab11 proteins are implicated in many diseases, including cancers, neurodegenerative diseases and type 2 diabetes. Despite their medical importance, no previous study has identified allosteric binding sites in Rab11 to which potential drug candidates can bind. In this study, we performed structural analyses of Rab11 using multiple clustering approaches that integrate principal component analysis, independent component analysis and locally linear embedding, and identified eight representative structures. Using these representatives for binding site mapping and virtual screening, we identified two novel binding sites in Rab11 and small molecules that preferentially bind to different conformations of these sites with high affinity. After identifying the binding sites and the residue interaction networks in the representatives, we showed computationally that these binding sites may allosterically regulate Rab11, because they communicate with the switch 2 region, which binds GTP/GDP. These two allosteric binding sites in Rab11 are also similar to two allosteric pockets in Ras that we discovered previously.

Intestinal Rab11a is also required for apical protein localization [6].
Rab11b also localizes to the ERC [5] and regulates the recycling of the transferrin receptor to the plasma membrane [8]. Rab11b regulates exocytosis in neurons and neuroendocrine cells [9]. However, little is known about the functional differences between Rab11a and Rab11b. They localize to different vesicle compartments in gastric parietal cells [10]. Silvis et al. showed that Rab11b, but not Rab11a, specifically regulates the recycling of the intracellular cystic fibrosis transmembrane conductance regulator (CFTR) in polarized intestinal epithelial cells [11] [5]. Later, Haugsten et al. showed that Rab11a and Rab11b may play slightly different roles in fibroblast growth factor receptor 4 (FGFR4) recycling [12]. Knockdown of Rab11a, of Rab11b, or of both simultaneously reduced FGFR4 transport out of the ERC, but only knockdown of Rab11b alone, and not of Rab11a alone, caused FGFR4 to accumulate in a perinuclear compartment.

Ras and Rab1 proteins
We previously identified novel binding sites in Ras and Rab1 proteins [13] [14]. Ras is a family within the Ras superfamily whose members regulate signaling pathways that control the expression of genes involved in cell growth, differentiation and survival. Three members of this family, K-Ras, H-Ras and N-Ras, are frequently mutated in cancer and hyper-proliferative developmental disorders [15]. K-Ras localizes to the cytosol and plasma membrane. H-Ras and N-Ras localize to the Golgi apparatus and plasma membrane. These isoforms share more than 85% sequence identity. Through computational and experimental methods, we have previously identified three allosteric pockets and inhibitors for Ras [14]. Rab1 is a member of the Rab GTPase family that regulates autophagy and the membrane trafficking pathways responsible for transport between the endoplasmic reticulum (ER) and the Golgi apparatus (GA) [16]. Rab1 localizes to the ER, the GA and early endosomes [16] [17]. It has two isoforms, Rab1a and Rab1b, that share 92% sequence identity [18]. Rab1 is associated with various human cancers, including prostate cancer [19], triple-negative breast cancer (TNBC) [20], colorectal cancer [21] and tongue cancer [22]. Aberrant expression of Rab1 is also associated with diseases such as cardiac hypertrophy [23] and Parkinson's disease [24].

Principal Component Analysis (PCA), Independent Component Analysis (ICA) and Locally Linear Embedding (LLE)
PCA and ICA are linear dimensionality reduction techniques. PCA projects data from a high-dimensional space to a low-dimensional space such that the variance of the projected data is maximized, assuming that the direction with the largest variance is the most important [25] [26]. ICA performs dimensionality reduction by deriving independent components from the high-dimensional data in a way that maximizes their non-Gaussianity [26]. While PCA removes the covariance between components, ICA minimizes the mutual information between them [25]. LLE is a manifold learning algorithm [27] used for non-linear dimensionality reduction. Because it is non-linear, LLE can identify the underlying structure of a manifold better than PCA and ICA [28].
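To make the contrast between a linear and a non-linear reduction concrete, the following is a minimal NumPy sketch, not the pipeline used in this study. It implements PCA through a singular value decomposition and a basic form of LLE, and applies both to a synthetic spiral. The function names `pca` and `lle`, the regularization constant `reg`, the neighbour count and the synthetic data are all illustrative assumptions; ICA is omitted for brevity.

```python
import numpy as np


def pca(X, n_components):
    """Project the rows of X onto the directions of largest variance."""
    Xc = X - X.mean(axis=0)                       # centre each feature
    # SVD of the centred data: the rows of Vt are the principal axes
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)               # fraction of variance per PC
    scores = Xc @ Vt[:n_components].T             # coordinates in PC space
    return scores, explained[:n_components]


def lle(X, n_neighbors, n_components, reg=1e-3):
    """Basic locally linear embedding (assumes no duplicate points in X)."""
    n = X.shape[0]
    # Pairwise squared distances, then the k nearest neighbours of each point
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    nbrs = np.argsort(d2, axis=1)[:, 1:n_neighbors + 1]  # column 0 is the point itself
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]                     # neighbours relative to point i
        C = Z @ Z.T                               # local Gram matrix
        C += reg * np.trace(C) * np.eye(n_neighbors)  # regularise for stability
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, nbrs[i]] = w / w.sum()               # reconstruction weights sum to 1
    A = np.eye(n) - W
    _, vecs = np.linalg.eigh(A.T @ A)             # eigenvalues in ascending order
    # Skip the first (constant) eigenvector; keep the next n_components
    return vecs[:, 1:n_components + 1], W


# Synthetic example (illustrative only): a noisy spiral embedded in 3D
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0, 3 * np.pi, 200))
X = np.column_stack([t * np.cos(t), t * np.sin(t),
                     rng.normal(scale=0.1, size=200)])

pc_scores, ratio = pca(X, n_components=2)
emb, W = lle(X, n_neighbors=10, n_components=1)
print("variance captured by the first two PCs:", ratio.sum())
```

PCA can only rotate and truncate the coordinate system, so it cannot unroll the spiral into a single coordinate, whereas LLE builds its embedding from local neighbourhoods and can follow the curve. In practice, a maintained library implementation would normally be preferred over a hand-written one.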

Rab11 structures
We first performed PCA on the ensemble of 28 structures. More than 80% of the variance is captured in the first three principal components (PCs) (S1.1 Fig). We projected the structures in the Rab11 ensemble onto the first two PCs, and onto the first and third PCs. More than 60% of the variance is captured in these PCs (S1.