My first Editorial, I thought, would be an excellent platform to share some interesting, less appreciated facets of biology, taking cues from our own research findings, melding them with those of others in the field and taking the issue forward. The editorial is written in such a way that less of my own complexity appears and only the complexity that is innate to biology is showcased. Biology, at any length- and time-scale of study, is not only exciting but also very intriguing. I will, in my successive editorials, move into various scales of biology, only to wonder about its inner workings.

We all know that protein active sites are shaped to execute specific chemical transformations via the steps of catalysis. An active site is therefore really the ‘business end’ of a protein when it is an enzyme, where the directional nature of spatial and charge transfers within the active site seems to mediate the act of catalysis. Active site predictions in a protein must, therefore, invoke methods that combine and compare both the spatial congruence and the electrostatic potentials of putative active site residues with those in bona fide active site motifs. Methods that successfully perform spatial matches of motifs are fairly abundant (Kleywegt 1999; Goyal et al. 2007; Debret et al. 2009), while those that factor in electrostatic potential comparisons are somewhat rare, even today. We realized this deficit some time back and developed a method showing that the electrostatic potential differences (PDs) between analogous residue pairs in active sites from different proteins of the same enzyme family are similar, and can therefore be effectively combined with spatial matching methods. That is, for a given enzymatic activity, the PDs between cognate pairs of active site residues from various proteins of the same family are highly correlated, with very small standard deviations, so that all such PDs lie in a narrow band. Using this robust finding, we pruned out false matches from spatial congruence sets and offered an additional filter for improving accuracy in active site prediction. By demonstrating conservation of electrostatic potential differences in cognate pairs of residues in a wide range of related proteins, we established a computational method, CataLytic Active Site Prediction (CLASP), to detect active sites (Chakraborty et al. 2011, 2013a).
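The idea of using conserved potential differences as a filter on spatial matches can be illustrated with a minimal sketch. This is not the CLASP implementation or its scoring function: the function names, the `n_sigma` and `floor` thresholds, and all potential values below are hypothetical, chosen only to show how a narrow PD band derived from known family members could prune spatial hits whose electrostatics do not fit.

```python
from itertools import combinations
from statistics import mean, stdev


def pairwise_pd(potentials):
    """Potential differences for every residue pair (i < j), in motif order."""
    return [potentials[i] - potentials[j]
            for i, j in combinations(range(len(potentials)), 2)]


def pd_band(family_potentials):
    """Per-pair (mean, standard deviation) of PDs across known family members."""
    per_protein = [pairwise_pd(p) for p in family_potentials]
    return [(mean(col), stdev(col)) for col in zip(*per_protein)]


def passes_pd_filter(candidate_potentials, band, n_sigma=3.0, floor=1.0):
    """True if every candidate PD lies inside the family band.

    The band half-width is n_sigma * std, but never narrower than `floor`.
    Both thresholds are illustrative assumptions, not published values.
    """
    for pd, (mu, sd) in zip(pairwise_pd(candidate_potentials), band):
        if abs(pd - mu) > max(n_sigma * sd, floor):
            return False
    return True


# Hypothetical potentials (arbitrary units) at three cognate active site
# residues in three known members of one enzyme family.
family = [
    [-120.0, 35.0, 210.0],
    [-115.0, 42.0, 205.0],
    [-125.0, 30.0, 215.0],
]
band = pd_band(family)

# Hypothetical candidates that already passed a spatial congruence search.
spatial_hits = {
    "hit_A": [-118.0, 38.0, 208.0],  # PDs resemble the family
    "hit_B": [40.0, 35.0, -90.0],    # similar geometry assumed, different electrostatics
}
kept = [name for name, pots in spatial_hits.items()
        if passes_pd_filter(pots, band)]
print(kept)
```

In this toy setting only the candidate whose pairwise PDs fall inside the family band survives, which mirrors the role of the electrostatic filter as a second pass after spatial matching.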

We extended the method further and quantified promiscuous active sites in a wide range of proteins (PROMISE) (Chakraborty and Rao 2012; Chakraborty et al. 2012a, b). We found that by interrogating any given hypothetical protein using CLASP, we could ‘sniff’ for a potential active site in the bait-protein after comparing with known motifs in the Catalytic Site Atlas (CSA) database and even rank the ‘hit’ matches in the bait. Such ranking comparison allowed CLASP to classify promiscuous activities associated with the bait-protein. We probed various alkaline phosphatase (AP) structures for well-defined motifs from CSA and uncovered a uniquely promiscuous proteolytic active site in shrimp alkaline phosphatase (SAP). We substantiated this with experimental evidence in vitro. It turned out that trypsin inhibitors could competitively inhibit AP activity, an observation that we extended by probing other APs as well (Chakraborty et al. 2011, 2013b; Chakraborty and Rao 2012).

While it is conceivable that stereochemical equivalence is hardwired for amino acids with similar properties, one finds instances of active sites where residues with different properties occupy the same sequence and spatial location but perform the same function (Lobkovsky et al. 1993)! Finally, while choosing a site in a protein as a possible scaffold, it is sometimes useful to relax the constraints to allow identification of moonlighting active sites ‘lurking’ in its vicinity (Jeffery 2009). Indeed, modulating the radius of search defining the ‘active site vicinity’ did help us characterize properties of residues that determine promiscuity (Chakraborty and Rao 2012).

Notwithstanding the general mutational changes in a protein, understanding those that drive novel protein active site functions remains an exciting challenge even today. The issues under intense debate are: how an evolutionary trajectory shapes the mutation spectrum of an active site from the pre-existing one to the next, with altered biochemical functions; whether early mutations set an influential trend in which the mutational paths become additive, get restricted by dynamic adaptive changes, or show a combination of both; how the newly evolving function trades off with the pre-existing function, leading to conversion from a ‘generalist’ to a ‘specialist’ or vice versa; and the routes that shape intermediates leading to high-affinity, high-selectivity sites (Tawfik et al. 2009). It has been proposed that novel activities are acquired by active sites when ‘generalists’ convert to ‘specialists’ via intermediates that exhibit exceptionally wide ranges of promiscuous activities (Matsumura and Ellington 2001). Indeed, examples do exist suggesting that non-overlapping specificities become accessible via ‘generalist’ intermediates emerging under selection (Rockah-Shmuel and Tawfik 2012). Therefore, a ‘hot-soup’ of promiscuity seems to serve as a fertile ground for selections that eventually ‘fix’ specificity in a newly emerging ‘specialist’!

It is well known that the entire body of a protein can accumulate mutations over large evolutionary time-scales. Interestingly, one finds sets of correlated mutations, where a particular amino acid change at one location is accompanied by a certain other change at another location in the protein (perhaps as a compensatory alteration), supporting a model in which 3D structural contacts might influence the mutation landscape within the protein (Jacob et al. 2015). There is now a strong hope, backed by powerful statistical tools, that accurate mapping of many such ‘correlated mutations’ accumulated during the evolution of a protein could indeed yield sufficient structural constraints for one to start predicting the entire 3D structure of the protein using such statistically ‘kosher’ constraints!
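As a rough illustration of what detecting ‘correlated mutations’ can mean in practice, the sketch below scores pairs of alignment columns by their mutual information. This is a deliberately simple stand-in and not the method of the cited work; statistical approaches used for real contact prediction (direct coupling analysis, for example) additionally correct for phylogenetic bias and indirect correlations. The toy alignment and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def column(msa, i):
    """Residues found at alignment position i across all sequences."""
    return [seq[i] for seq in msa]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    pa, pb = Counter(col_a), Counter(col_b)
    pab = Counter(zip(col_a, col_b))
    return sum(
        (c / n) * log2((c / n) / ((pa[a] / n) * (pb[b] / n)))
        for (a, b), c in pab.items()
    )


def top_covarying_pairs(msa, k=3):
    """The k column pairs with the highest mutual information."""
    length = len(msa[0])
    scores = {
        (i, j): mutual_information(column(msa, i), column(msa, j))
        for i, j in combinations(range(length), 2)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Hypothetical toy alignment: positions 1 and 3 change together
# (K with D, R with E), while position 2 varies independently of both.
toy_msa = [
    "AKLDE",
    "AKLDE",
    "ARLEE",
    "ARLEE",
    "AKMDE",
    "ARMEE",
]
best_pair, best_score = top_covarying_pairs(toy_msa, k=1)[0]
print(best_pair, best_score)
```

Column pairs that score highly in such an analysis are candidates for residues in structural contact, and it is constraints of this kind, gathered over many pairs, that could feed into 3D structure prediction.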

So protein evolution involves a balance: a ‘whole body’ centric, structure-driven mutational landscape, within which local active site biochemistry spins out additional trajectories of mutational changes that are tightly coupled to the functional specificity of the protein. Moreover, most proteins co-exist within the biochemical niche of their interactome, which in turn imposes additional constraints on the mutation landscape of a protein. We know much less about the nature of these constraints, as our understanding of interactome evolution is still very nascent. It is conceivable that interactome-based constraints add an additional layer of complexity to protein evolution by imposing other adaptive changes on protein function. All these arguments put together seem to strongly suggest that a single target, the protein, is subject to multiple mutational pressures, ranging from local active-site-centric changes to overall 3D fold-related effects to those emanating from the 3D structural plasticity of the same protein within its dynamic interactome. It is excitingly intriguing to think about all such possibilities.