Combinatorial Discovery and Validation of Heptapeptides with UTP Binding Induced Structure

In biology, supramolecular recognition typically involves an 'induced‐fit' mechanism, where structures rearrange upon complexation to accommodate binding ligands. Designing minimalistic compounds with such adaptability is challenging as they involve subtle conformational changes that are energetically similar. Here, we demonstrate the integration of combinatorial screening with molecular modelling to identify heptapeptides that form a stable loop upon recognition of uridine triphosphate (UTP). Peptide sequences selected using phage display were refined computationally and correlated with experimental KD values. This combined approach may serve as a method for the de novo selection and subsequent rationalization of the compositional and organizational principles that dictate chemical functionality in flexible structures with dynamic conformations.

The design and discovery of novel proteins and peptides with tailored folds and functions is of great interest due to their enormous potential for future applications in biotechnology and medicine. [1] One important characteristic of these bioreceptors is their ability to accommodate ligands through conformational change, commonly referred to as 'induced fit'. As proposed by Emil Fischer 125 years ago, [2] many protein-ligand interactions are initially weak, but increase in strength upon structural rearrangement of the receptor that favors complexation. [3] While short peptides may not be able to match the performance of proteins, it is increasingly recognized that their simplicity and robustness make them attractive candidates for incorporation in synthetic materials with life-like or life-interfacing functions. [4] Peptides are also more amenable to the systematic investigation of the fundamentals underlying supramolecular recognition and induced structure formation. Moreover, because of their simplicity, short peptides do not have an inherently stable structure but move along a shallow energy landscape, which makes them especially suited to the study of minimalistic induced-fit mechanisms.
Since the molecular principles that underlie these phenomena are still poorly understood, identification of new binders typically involves screening of large chemical search spaces using methods such as SELEX, [5] CoPhMoRe, [6] dynamic combinatorial libraries, [7] and yeast [8] and phage [9] display. Phage display has been used effectively to find peptides that bind to antibodies, cell surface proteins, carbohydrate antigens, [10] metal surfaces, [11] nanoparticles, [12] polymers [13] and macrocycles, [14] as well as to find sequences that form dynamically interchanging complexes with ATP [15] and peptide catalysts. [16] In addition, molecular dynamics simulations have been used to predict the interaction of sequences identified in this way with gold [17] and silica [18] surfaces. Combining experimental screening with computation has been applied successfully to protein design and evolution [19] as well as to the directed discovery of peptides. [20] Thus, we reasoned that the workflow of phage display could be accelerated and improved through seamless integration of these complementary strategies.
The objective of the present work was to use computational chemistry to rationally direct the discovery of functional peptides from a random library displaying approximately 100 million different sequences to efficiently select for heptapeptides that interact with UTP. Selected sequences were exploited to elucidate the relationship between supramolecular complexation and structure formation and the obtained experimental results were found to be in good agreement with theoretical predictions. The thus identified lead sequence KAIHPMR forms a stable loop upon interaction with UTP, constituting the simplest example of induced fit in a peptide reported thus far.
A commercial library of M13 phages that display 5 copies of a heptapeptide on a terminal protein surface was mixed with biotinylated UTP (Scheme 1 and Figure S1) for post-selection recovery using streptavidin-carrying paramagnetic agarose beads. The biotin-streptavidin decorated beads were separated into a pellet, thereby isolating phages that have bound to the target. The use of paramagnetic beads together with a magnetic rack facilitates selection compared to our previous centrifugation approach. [16,21] After four rounds of panning and negative selection against streptavidin-agarose beads and soluble biotin, a total of 18 heptapeptide sequences were identified by DNA sequencing (Table S1).
Atomistic molecular dynamics simulations using the CHARMM force field were applied to assess the propensity of these 18 heptapeptides (with a free C-terminus) to interact with UTP. The presence of H-bonds between each peptide and the nucleotide was analyzed in VMD with a cut-off distance of 3.2 Å (see Supporting Information for details).
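The distance criterion mentioned above can be illustrated with a minimal sketch. It is not the VMD analysis itself: it applies only the 3.2 Å distance cut-off quoted in the text and omits any angle criterion, since none is stated here. The function name `hbond_candidates` and the coordinates in the usage example are hypothetical.

```python
import math

CUTOFF_ANGSTROM = 3.2  # distance cut-off quoted in the text


def hbond_candidates(donors, acceptors, cutoff=CUTOFF_ANGSTROM):
    """Return (donor_index, acceptor_index, distance) for every
    donor/acceptor pair whose separation is within the cut-off.

    donors, acceptors: lists of (x, y, z) coordinates in angstroms
    for a single trajectory frame. Only distance is tested here;
    an angle criterion is an additional filter not shown.
    """
    pairs = []
    for i, donor in enumerate(donors):
        for j, acceptor in enumerate(acceptors):
            r = math.dist(donor, acceptor)
            if r <= cutoff:
                pairs.append((i, j, r))
    return pairs


# Hypothetical single-frame example: one peptide donor atom lies 3.0 A
# from a phosphate oxygen, all other pairs are further apart.
peptide_donors = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
utp_acceptors = [(3.0, 0.0, 0.0), (0.0, 0.0, 5.0)]
print(hbond_candidates(peptide_donors, utp_acceptors))
```

Applying such a test to every frame gives, for each peptide, a time series of contacts from which interaction statistics can be derived.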
The results (Table S1) show that five of the tested sequences did not interact with UTP during the 50 ns simulation time. Eight sequences interacted with UTP only via a single hydrogen bond (often through the sequence-independent N-terminus interacting with the phosphates) and were therefore considered poor and non-selective binders. Consequently, these 13 peptide sequences were disregarded, leaving five peptide sequences for further investigation.
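The triage described above can be sketched as a simple classification. This is an illustration under stated assumptions: it takes the number of distinct peptide-UTP hydrogen bonds per sequence as input, the class names are invented for this sketch, and the peptide labels and counts in the example are placeholders, not the values from Table S1.

```python
def triage(hbond_counts):
    """Sort peptides into three classes by the number of distinct
    peptide-UTP hydrogen bonds observed in the simulation.

    hbond_counts: dict mapping a peptide label to an integer count.
    0 bonds  -> no interaction
    1 bond   -> poor, non-selective binder
    2+ bonds -> candidate kept for further investigation
    """
    classes = {"no_interaction": [], "single_hbond": [], "candidate": []}
    for peptide, count in hbond_counts.items():
        if count == 0:
            classes["no_interaction"].append(peptide)
        elif count == 1:
            classes["single_hbond"].append(peptide)
        else:
            classes["candidate"].append(peptide)
    return classes


# Placeholder labels and counts, for illustration only.
example = {"pep_A": 0, "pep_B": 1, "pep_C": 3}
print(triage(example))

# Bookkeeping from the text: 18 sequences, 5 + 8 discarded, 5 retained.
print(18 - (5 + 8))
```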
Since preliminary results suggested electrostatic interactions as the main driving force of peptide-nucleotide complexation, additional molecular dynamics simulations of the five selected sequences were carried out using C-terminally amidated versions of the peptides, to better reflect the phage surface conjugation. All simulations were run for 50 ns after the first peptide-UTP interaction was observed and the stability of the formed complexes was analyzed (Table 1). The results revealed the same trend that was observed for peptides with a free C-terminus, identifying KAIHPMR as the most promising candidate, as it remains bound to UTP for 82 % of the simulation time after initial contact. As expected, the net charge of +3 of this sequence favors electrostatic interactions with the nucleotide. The peptide ITLKGLT was observed in complex with UTP on
Scheme 1. Combining experimental and computational tools to search a large molecular space for short peptides with induced structure. Selected point mutations were investigated to gain atomistic insights on nucleotide-binding.
Solid-phase synthesis was used to produce the amidated versions of the five selected peptide sequences shown in Table 1. Following synthesis, all peptides were purified by preparative HPLC and extra care was taken to remove any residual trifluoroacetate (TFA) using 10 mM HCl, [22] because of its propensity to form stable adducts with positively charged amino groups, thereby impeding supramolecular interactions (see Supporting Information for details). 19F-NMR was performed to confirm that the peptides were free from TFA (Figure S2). 1H-NMR was applied to follow titrations of the five peptides with UTP, and the chemical shift perturbations of amino acid side chains were analyzed to calculate dissociation constants (Table 1 and Figures S3-12). The heptapeptide KAIHPMR showed the lowest dissociation constant, with a KD of 0.74 mM (Figure 1A).
The affinity of the sequence ITLKGLT was about two-fold lower, with a KD of 1.78 mM. The peptides FPVVTRN and SMRDGAV showed similar affinities, with KD values of 3.65 mM and 3.31 mM, respectively. The lowest affinity was observed for the sequence TSAVSLR, with a KD of 10.80 mM, almost 15 times weaker than that of the best sequence, KAIHPMR.
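As an illustration of how a dissociation constant can be extracted from chemical shift perturbations, the sketch below fits a 1:1 binding isotherm. The 1:1 model, the grid-search fit, and the function names are assumptions made for this example; the fitting procedure actually used is described in the Supporting Information and may differ. The titration data are synthetic, generated from the model itself, and are not measurements from this work.

```python
import math


def bound_fraction(p_tot, l_tot, kd):
    """Fraction of peptide bound in a 1:1 complex.
    All concentrations and kd share the same unit (here mM)."""
    s = p_tot + l_tot + kd
    return (s - math.sqrt(s * s - 4.0 * p_tot * l_tot)) / (2.0 * p_tot)


def fit_kd(p_tot, l_tots, shifts, kd_lo=1e-3, kd_hi=1e3, n_grid=2001):
    """Grid search over log-spaced kd values. For each trial kd the
    maximum shift change (dmax) follows from linear least squares,
    and the kd with the smallest squared error is returned."""
    best = None
    for k in range(n_grid):
        kd = kd_lo * (kd_hi / kd_lo) ** (k / (n_grid - 1))
        f = [bound_fraction(p_tot, l, kd) for l in l_tots]
        dmax = sum(x * y for x, y in zip(f, shifts)) / sum(x * x for x in f)
        sse = sum((dmax * x - y) ** 2 for x, y in zip(f, shifts))
        if best is None or sse < best[0]:
            best = (sse, kd, dmax)
    return best[1], best[2]


# Synthetic titration: 1 mM peptide, kd = 0.74 mM, dmax = 0.10 ppm.
ligand_mM = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
observed = [0.10 * bound_fraction(1.0, l, 0.74) for l in ligand_mM]
kd_fit, dmax_fit = fit_kd(1.0, ligand_mM, observed)
print(round(kd_fit, 2), round(dmax_fit, 3))
```

The grid search trades precision for robustness: with 2001 log-spaced points between 0.001 and 1000 mM, the recovered KD is resolved to better than 1 %, which is sufficient for a sketch of this kind.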
Comparison of experimental affinities and predicted stability of peptide-UTP complexes revealed that experimental and computational results are in good agreement for all peptides except TSAVSLR (Table 1, Figure 2). Closer analysis of this sequence revealed that the peptide interacts with UTP mainly via its C-terminal Arg-side chain, with the rest of the peptide pointing away from the nucleotide, thus not forming a complex.
The best performing sequence, KAIHPMR, was investigated computationally in more detail, revealing that the peptide interacts with UTP mainly via its Lys and Arg side chains as well as the N-terminal amino group (Figure 1B). Importantly, simulations of the peptide KAIHPMR in the absence of UTP revealed that the free peptide remains in an extended conformation throughout a 50 ns simulation (Figure 1C, D), while the peptide forms a loop structure upon complexation of the nucleotide (Figure 1E, F). Analysis of the distance between the Lys and Arg side chains shows that the loop, once formed, is stable over the course of the simulation.
More detailed investigation revealed that after 5 ns of equilibration, the Lys residue forms the first contact with UTP via a stable hydrogen bond to the phosphate moiety of the nucleotide (Figure 1G). Subsequently, the Arg residue engages in hydrogen bonding with UTP as well, thereby inducing the loop structure (Figure 1H). To quantitatively analyze the stability of this UTP-binding-induced structure, the simulation was extended to yield 50 ns of simulation time after the initial peptide-UTP interaction. The results revealed that the loop structure is retained during 91.7 % of the 50 ns of simulation after it was formed.
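The retention percentage quoted above is an occupancy counted from the moment the loop first forms. A minimal sketch of that bookkeeping follows. How a frame is classified as "loop present" is an assumption here: a hypothetical distance threshold between the Lys and Arg side chains is used, and the per-frame distances in the example are invented for illustration.

```python
def loop_flags(distances, threshold):
    """Mark each frame as loop-present when the Lys-Arg side-chain
    distance (angstroms) is at or below a chosen threshold.
    The threshold is a hypothetical parameter of this sketch."""
    return [d <= threshold for d in distances]


def occupancy_after_formation(flags):
    """Fraction of frames with the loop present, counted from the
    first frame in which it appears. Returns None if it never forms."""
    if True not in flags:
        return None
    start = flags.index(True)
    window = flags[start:]
    return sum(window) / len(window)


# Invented per-frame distances: extended, extended, loop, loop, open, loop.
distances = [18.0, 17.5, 9.0, 8.5, 14.0, 8.0]
flags = loop_flags(distances, threshold=10.0)
print(occupancy_after_formation(flags))  # 3 of the 4 frames after formation
```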
While no intermolecular H-bonds were observed between residue H4 and UTP, intramolecular H-bonds were detected between the side chains of residue H4 and R7 (18 % of the time) and also between the backbone of these residues (23 % of the time) after the loop has formed. This demonstrates that the formation of the loop is induced by binding to the nucleotide and, while it is a sampled conformation in the free peptide, it is not stable in the absence of UTP. Interaction with the nucleotide leads to a stabilization of the peptide structure, whereas the free peptide in solution remains dynamic throughout the simulation.
Unexpectedly, the computational results revealed that His, rather than the expected Pro residue, is important for stable complexation-induced loop formation. The loop structure is observed for the His-to-Ala mutant KAIAPMR only during 45.4 % of a 50 ns simulation after it was formed initially. Hence, the stability of the loop is decreased compared to the original peptide KAIHPMR (Figure 3A, B). The side chain of residue K1 interacts with the phosphate moiety of the nucleotide for 58 % of the simulation time after initial contact. The side chain of residue R7, in contrast, forms an H-bond to UTP only 23 % of the time (Figure 3C), the majority of which (20 %) is with the phosphate moiety. Experimental characterization of the peptide KAIAPMR binding to UTP by 1H-NMR revealed a significantly increased dissociation constant of 3.45 mM compared to 0.74 mM for KAIHPMR (Table 1 and Figures S13, 14).
Replacing the Pro residue with Ala in peptide KAIHAMR does not affect the stability of the loop, as it is observed during 91.5 % of a 50 ns simulation after formation. Intramolecular H-bonds between residue H4 and the N-terminus are observed 10 % of the time and are likely to contribute to the formation of the stable structure (Figure 3D-F).
Experimental characterization of the peptide KAIHAMR binding to UTP by 1H-NMR revealed a dissociation constant of 0.83 mM (Table 1 and Figures S15-16), which is comparable to that of the original sequence KAIHPMR. This confirms our hypothesis that the role of the H4 residue is to stabilize the intramolecular structure of the peptide and suggests that replacing it with residues that can also perform that function should give similar results.
The proposed model was further tested by additional molecular dynamics simulations on peptides KAIQPMR and KAIYPMR, which were selected for their theoretical potential to form intramolecular H-bonds to residue R7. The analysis of these simulations revealed that the Q4 residue in KAIQPMR does not form an H-bond to R7 and, consequently, no stable loop structure was observed (Figure S17). In contrast, the formation of a stable loop structure was observed for KAIYPMR upon interaction with UTP, which coincides with the formation of a stable H-bond between residues Y4 and R7 (Figure S18). It is important to note that for these peptides H-bonds are not formed between the residue in position 4 and the nucleotide; thus, we conclude that the stability of the induced loop depends on the ability of residue 4 to engage in intramolecular hydrogen bonding with residue R7. This exemplifies the difficulties encountered in rational protein/peptide design. The delicate balance between the strain induced in the structure and the enthalpic gain in binding the substrate is not simply a function of the interactions between the substrate and the receptor.
Rather, it includes subtle effects such as whether additional intramolecular interactions can be leveraged to mitigate the conformational strain and loss of entropy from the receptor.
Interestingly, analysis of the average distance D between residues 1 and 7 of the investigated peptides over time revealed that complexation of the nucleotide decreases D in peptide KAIHPMR by 9.4 Å, from 17.9 Å to 8.5 Å (Figure S19). A less significant change is observed for the peptides TSAVSLR and FPVVTRN, where the average D is reduced by 5.3 Å and 4.5 Å, respectively. No reduction of D is observed for the peptides ITLKGLT and SMRDGAV. Thus, we propose that the distance between the N- and C-terminal residues in short peptides can be used as a measure to predict 'induced fit' in short peptide sequences.
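The measure D can be sketched as a time-averaged distance between one reference point on residue 1 and one on residue 7, compared between the free and the UTP-bound simulations. Which atoms define the reference points is not stated in this section, so the per-frame coordinates below are hypothetical placeholders; only the averages 17.9 Å and 8.5 Å are taken from the text.

```python
import math
from statistics import mean


def mean_distance(coords_res1, coords_res7):
    """Average distance (angstroms) between two reference points,
    given one (x, y, z) coordinate per frame for each residue."""
    return mean(math.dist(a, b) for a, b in zip(coords_res1, coords_res7))


def contraction(d_free, d_bound):
    """Reduction of the average end-to-end distance upon complexation."""
    return d_free - d_bound


# Placeholder two-frame trajectory, for illustration only.
res1 = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
res7 = [(3.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
print(mean_distance(res1, res7))

# Averages reported in the text for KAIHPMR (free vs. UTP-bound).
print(round(contraction(17.9, 8.5), 1))
```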
While functional proteins and peptides that have stable folds can increasingly be designed and rationally modified, there is only a limited understanding of the compositional and organizational principles that dictate chemical functionality in those with flexible structures and dynamic conformations.
In this work, we demonstrate the integration of experimental screening and computational refinement to identify novel minimalistic peptide sequences that take on a stable structure upon complexation with a single nucleotide. The simple binding measure used in this selection procedure was validated through the quantitative agreement shown with the experimentally determined binding constants. Moreover, molecular modelling allows for the rationalization of the interactions of selected sequences with the target.
Our approach resulted in the first example of an 'induced fit' mechanism in a heptapeptide and is therefore a minimalistic model of structural adaptation upon interaction with a ligand, a phenomenon which is widely observed in complex proteins but cannot yet be designed in simpler peptides. The findings contribute to our understanding of how the primary amino acid sequence affects supramolecular recognition and structure formation in minimalistic peptides, of relevance as components for adaptive peptide-based supramolecular systems with dynamically induced structures and functions [4].