Fragment Libraries Designed to Be Functionally Diverse Recover Protein Binding Information More Efficiently Than Standard Structurally Diverse Libraries

Current fragment-based drug design relies on the efficient exploration of chemical space by using structurally diverse libraries of small fragments. However, structurally dissimilar compounds can exploit the same interactions and thus be functionally similar. Using three-dimensional structures of many fragments bound to multiple targets, we examined if a better strategy for selecting fragments for screening libraries exists. We show that structurally diverse fragments can be described as functionally redundant, often making the same interactions. Ranking fragments by the number of novel interactions they made, we show that functionally diverse selections of fragments substantially increase the amount of information recovered for unseen targets compared to the amounts recovered by other methods of selection. Using these results, we design small functionally efficient libraries that can give significantly more information about new protein targets than similarly sized structurally diverse libraries. By covering more functional space, we can generate more diverse sets of drug leads.


■ INTRODUCTION
Fragment-based drug design (FBDD) is now well-established as a powerful approach to early stage drug discovery and has led to success for targets that proved to be otherwise intractable. 1 The first stage entails screening, in which libraries of fragments, compounds around a third of the size of typical drug-like molecules, are screened for binding to a protein of interest. The concept is that the small size of the fragments allows a more efficient search of chemical space and recovers more protein binding information than in traditional high-throughput screens, allowing the size of the library to be much smaller. This substantially reduces the number of experiments that need to be conducted within a screen.
Ideally, a fragment screen obtains information about the molecules or functional groups that bind the protein of interest and the interactions they make. 2 Fragments are subsequently elaborated or combined to create larger lead molecules that can be developed into potential drugs. The better the site of interest is explored by fragments, the more insights we have into the key interactions critical for binding. As these key interactions are usually conserved upon generation of larger lead molecules, 3 this translates to improved chances of a viable drug candidate being discovered.
Design of Fragment Libraries. Maximizing the useful information that can be extracted from each fragment screening experiment is key, so design of fragment libraries is a major element of FBDD research. 4−8 There are two major aspects to library design: definition of the desired region of chemical space and the sampling of that region of chemical space. Table S1 describes the design strategies and size of several major fragment libraries.
Definition of Desired Chemical Space. Fragment libraries tend to be built from molecules that adhere to the "rule of three": 5 fragments that have a molecular weight of <300 Da, no more than three hydrogen-bond donors, no more than three hydrogen-bond acceptors, no more than three rotatable bonds, and a cLogP of ≤3. Heavy atom counts tend to be limited to <20. 2 These rules aim to limit the structural complexity of the fragments so that only one or two efficient interactions with the protein target are required for binding. Compounds containing toxicophores or highly reactive groups are not included, 9 as these would not be appropriate fragments to use in the development of drug leads. Fragment libraries also tend to prioritize chemical tractability 10 and/or the availability of analogues 11 to enable fast and easy follow-up experiments. Fragments that are ideal for subsequent lead development have recently been termed "social fragments" 12 and are contrasted with "unsocial fragments", which have limited or no synthetic pathways for elaboration and no analogues.
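As an illustration of how such property cutoffs translate into a catalogue filter, a minimal Python sketch is given below. It assumes the descriptors have already been computed elsewhere; the class name, function name, example values, and the reading of each count cutoff as "at most three" (the commonly cited form of the rule) are assumptions made for this example, not part of any published library protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FragmentProperties:
    """Precomputed descriptors for one fragment (values below are illustrative)."""
    name: str
    mol_weight: float       # Da
    h_bond_donors: int
    h_bond_acceptors: int
    rotatable_bonds: int
    clogp: float
    heavy_atoms: int


def passes_rule_of_three(frag, max_heavy_atoms=19):
    """Return True if the fragment satisfies the assumed cutoffs.

    Assumed cutoffs: MW < 300 Da; donors, acceptors and rotatable bonds
    each <= 3; cLogP <= 3; heavy atom count < 20.
    """
    return (
        frag.mol_weight < 300
        and frag.h_bond_donors <= 3
        and frag.h_bond_acceptors <= 3
        and frag.rotatable_bonds <= 3
        and frag.clogp <= 3
        and frag.heavy_atoms <= max_heavy_atoms
    )


# Hypothetical catalogue entries.
candidates = [
    FragmentProperties("frag_a", 182.2, 1, 3, 2, 1.4, 13),
    FragmentProperties("frag_b", 341.4, 2, 5, 6, 3.8, 24),
]
kept = [f.name for f in candidates if passes_rule_of_three(f)]
print(kept)
```

Keeping the cutoffs in one function makes it easy to tighten or relax a single criterion when defining the desired chemical space.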
Historical experimental results can be incorporated to further guide the definition of desired chemical space. The SpotXplorer library 13 was designed to contain pharmacophores that have been observed to commonly bind protein hot spots, 14 based on a comprehensive analysis of the Protein Data Bank. 15 A recently designed library 16 used past experimental results to develop a machine learning model that generated novel fragments with the characteristics of fragments that bind multiple targets, termed "privileged fragments".
Libraries may also have a particular intent, for example, to target certain protein classes or to fulfill specific properties. Examples of such properties include high Fsp3 character, 17 a three-dimensional (3D) shape, 18 the ability to form covalent bonds with the target, 19 protein−protein interface binding character, 20 or natural product resemblance. 17
Sampling of Desired Chemical Space. To avoid the synthetic challenges presented by the design of novel compounds, most libraries are made using previously available fragments, whether available commercially or in house. In these cases, a catalogue of fragments that lie within the desired chemical space is generated, and fragments are selected from this in a way that maximizes the structural or shape diversity of the library. A common approach is to use molecular fingerprints such as ECFP, 21 MACCS, 22 and USRCAT. 23 A fingerprint is generated for each fragment, and a maximin-derived algorithm 24 (such as the RDKit MaxMin picker) is used to select the most structurally or shape-diverse fragments. For example, the DSiP library 25 (the successor to ref 10) uses USRCAT fingerprints while the F2X libraries 11 use MACCS fingerprints to maximize structural and shape diversity.
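The maximin idea can be sketched in a few lines of plain Python. The example below is a simplified greedy variant written for illustration only: it is not the RDKit implementation, the fingerprints are hypothetical sets of "on" bits, and the function names are assumptions of this sketch.

```python
import random


def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints stored as sets of on-bits."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def maxmin_pick(fingerprints, n_pick, seed=0):
    """Greedy maximin selection.

    Starting from a random fragment, repeatedly add the fragment whose
    distance (1 - Tanimoto) to its nearest already-picked fragment is largest.
    """
    names = sorted(fingerprints)
    rng = random.Random(seed)
    first = rng.choice(names)
    picked = [first]
    # min_dist[n] = distance from n to the closest fragment picked so far
    min_dist = {
        n: 1.0 - tanimoto(fingerprints[n], fingerprints[first]) for n in names
    }
    while len(picked) < min(n_pick, len(names)):
        candidates = [n for n in names if n not in picked]
        best = max(candidates, key=lambda n: min_dist[n])
        picked.append(best)
        for n in names:
            d = 1.0 - tanimoto(fingerprints[n], fingerprints[best])
            min_dist[n] = min(min_dist[n], d)
    return picked


# Hypothetical fingerprints forming two structural clusters ("a" and "b").
toy_fps = {
    "a1": {1, 2, 3}, "a2": {1, 2, 4},
    "b1": {7, 8, 9}, "b2": {7, 8, 10},
}
library = maxmin_pick(toy_fps, n_pick=2)
print(library)
```

With two picks, the selection takes one fragment from each cluster whatever the random starting point, which is the behavior that diversity-driven selection aims for.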
Another method used to achieve structural diversity is to cluster fragments on the basis of structure or functional groups. Representatives of each cluster can then be selected for the final library. An advantage of this approach is that clusters that cover more attractive chemical space can be sampled more often than those that are less desirable. The 3D shape diverse library 26 is an example of a library employing this strategy.
A few libraries consist of novel fragments that were designed and synthesized specifically for the library. 27,7 The most common aim of such libraries is to address the historic uneven coverage of chemical space. Final compounds for the library are generally selected to ensure synthetic feasibility and a low degree of similarity to commercially available fragments and to maximize the shape diversity of the final library. 28 In nearly all cases, chemists are reported to be the final gatekeeper of selection, using visual inspection. 16,29,30 This indicates that algorithmic approaches are never trusted to completely select the fragments and makes it difficult to learn from these final decisions, which are rarely fully documented or quantified.
Influence of HTS Library Design on Fragment Library Design. To assess the effectiveness of current library design methods, it is useful to first understand how these strategies originated. The emphasis on structural diversity in library design appears to have its origin in the design of HTS libraries, 31 where the large chemical space made computable metrics imperative in the selection of compounds. Experimental approaches like diversity-oriented synthesis 32 also built on this principle. The premises were transferred to fragment library design, but to the best of our knowledge, the underlying assumptions were never rigorously interrogated.
Gordon et al. 31 proposed that HTS screening libraries should be iteratively redesigned on the basis of the results of screening campaigns. This has been applied to several libraries to guide their definition of "attractive chemical space"; however, only knowledge of which fragments produced hits in past experiments was utilized, with 3D information from the protein−fragment structures being ignored. For example, an analysis of results from screens using the Astex 2012 fragment library 33 showed a need to focus on fragments with 10−14 heavy atoms and ensure that larger fragments are not overly complex. The Vernalis library was analyzed using the results of 12 fragment screening campaigns, 29 and it was found that compounds with slightly lower molecular weights had higher hit rates. AstraZeneca observed a high rate of project failure between 2002 and 2008 34 and used the results to iteratively improve the library in several ways: fragments that were prone to decomposition, highly reactive, or deemed "unattractive" for follow-up chemistry by medicinal chemists were all removed, while pharmacophoric and structural diversity analyses were employed to "fill in" gaps in chemical space.
Functional Activity of Fragment Libraries. Using only binary hit or miss results does not tell us whether these frequently hitting fragments are giving us diverse information about a target, so this may fail in achieving the primary aim of a fragment screen of thoroughly exploring the binding site of a protein of interest. It is known that the molecular structure of a fragment does not accurately predict the interactions that it can make with a protein. 35 Similar fragments may have diverse functional activity (e.g., they bind to different protein environments), while structurally diverse fragments may have very similar activity. 36 Thus, the number of hits cannot be used as a proxy for the quantity of information.
Most library designs aim to elucidate as much information as possible about a protein's binding site(s). However, so far it has been difficult to establish which or indeed whether any fragment libraries achieve this. Now that crystallographic fragment screens are routine, we can use primary data of protein−fragment interactions to assess whether structurally diverse libraries are behaving in a functionally diverse manner.
To establish whether fragment libraries designed to be structurally diverse have functional redundancy, we examined a set of 10 diverse targets that have all been screened against the majority of fragments in the DSiP library by XChem. 37 The same set of fragments has been tested on the same targets, generating full data of what bound and how, as well as which fragments did not bind. Using these data, we describe an approach for analyzing functional redundancy within a fragment library and examine the relationship between structurally diverse and functionally diverse fragment libraries. Our findings suggest that structurally diverse fragment libraries do not necessarily exhibit any more functional diversity than randomly selected libraries. On the contrary, by selecting functionally diverse fragment libraries, we show that the information recovered for unseen targets is substantially improved compared with that obtained by using randomly selected or structurally diverse fragments.
for each structure, between fragment atoms and protein residues (residue IFP) and between fragment atoms and protein atoms (atomic IFP). For both types of IFP, the fragments were ranked on the basis of the novel interactions they formed with all or a subsection of protein targets (see Experimental Section). These rankings were used to group fragments for analysis, defined in Table 1.
We also examined the types and frequencies of different protein−ligand interactions made (shown in Figures S2−S4).
Structurally Diverse Fragments Can Form Overlapping Interactions. We compared the molecular similarity (calculated using ECFP2 fingerprints 21) to the residue IFP (our measure of functional activity) similarity for fragments bound to 10 highly diverse protein targets (Table S3). Fragments that are structurally very different (ECFP2 similarities as low as 0.02) bound to the same location on a target and formed one or more of the same interactions, leading to an IFP similarity above zero (Figure 1a). Across the set, we found 44 pairs of structurally distinct fragments that formed identical interactions. An example of this structural dissimilarity but functional similarity is shown in Figure 1b, where three distinct fragments form identical interactions with the protein. The molecular fingerprint (ECFP2) similarity of these three fragments ranges between 0.27 and 0.34, which would be considered appropriately diverse for inclusion in conventional libraries.
While the 44 pairs of fragments that form identical interactions account for only 0.82% of all fragment pairs, more than one-fourth (26.9%) of fragment pairs shared at least one common interaction. To explore the possibility that fewer fragments could form the same interactions with the targets, we assessed which fragments formed the most novel interactions and calculated the minimum number of fragments required to explore all interactions across all targets.
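The pairwise comparison described above can be illustrated with a short Python sketch. All data here are hypothetical: structural fingerprints are toy sets of bits, interaction fingerprints are sets of (residue, interaction type) pairs, and the structural cutoff of 0.35 is an assumption chosen for the example rather than a value used in this study.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity between two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical structural fingerprints (sets of on-bits).
structure_fps = {
    "frag_1": {1, 2, 3, 4},
    "frag_2": {1, 5, 6, 7},
    "frag_3": {1, 2, 3, 9},
    "frag_4": {20, 21},
}
# Hypothetical residue-level interaction fingerprints for one target.
residue_ifps = {
    "frag_1": {("ASP93", "hbond"), ("PHE138", "pi_stack")},
    "frag_2": {("ASP93", "hbond"), ("PHE138", "pi_stack")},
    "frag_3": {("ASP93", "hbond"), ("LYS58", "salt_bridge")},
    "frag_4": {("TYR12", "hydrophobic")},
}

pairs = list(combinations(sorted(residue_ifps), 2))
# Pairs whose interaction sets are exactly the same.
identical = [(a, b) for a, b in pairs if residue_ifps[a] == residue_ifps[b]]
# Pairs that share at least one interaction.
sharing = [(a, b) for a, b in pairs if residue_ifps[a] & residue_ifps[b]]

STRUCT_CUTOFF = 0.35  # assumed threshold for "structurally distinct"
redundant = [
    (a, b) for a, b in identical
    if tanimoto(structure_fps[a], structure_fps[b]) < STRUCT_CUTOFF
]

print(f"identical: {len(identical) / len(pairs):.2f}")
print(f"sharing at least one interaction: {len(sharing) / len(pairs):.2f}")
print("structurally distinct but functionally identical:", redundant)
```

In this toy set, frag_1 and frag_2 share only one structural bit but make the same interactions, which is the kind of pair that a structure-only diversity metric would keep as non-redundant.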
Ranking of Fragments Reveals Redundancy in Interactions. Fragments were ranked by the number of novel interactions (residue level and atomic level) that they formed with the 10 targets in the data set (Table S3). Two libraries, one randomly ordered and another structurally diverse, were generated for comparison to the functionally diverse libraries (see Experimental Section for details of how these were prepared). Each of these three methods of ordering fragments was repeated 100 times, and the mean fraction of unique interactions (also termed "information") recovered at each library size across all runs was calculated (Figure 2).
Interactions were recovered at smaller library sizes for the functionally ranked library compared with either the random or the structurally diverse library. On a residue level, all interactions with targets were recovered using only the 135 top-ranked fragments of the 225 that were bound to at least one target (Figure 2a); for the atomic level of interactions, 146 fragments were required (Figure 2b). This result shows that upon selection of the fragments in this way, a 135-fragment screen could recover the same information as randomly selected 520-fragment screens on all 10 targets. It was expected that the ranked libraries (using both residue and atomic IFP) would recover information at smaller library sizes due to the method of ranking; however, it is notable that structurally diverse libraries do not recover information at library sizes any smaller than the randomly selected libraries. The lack of difference between structurally diverse and random libraries may be because the fragments used are already relatively structurally diverse.
To assess whether different libraries were selecting similar fragments, the mean number of fragments in common between libraries of 100 fragments was calculated. The functionally diverse and structurally diverse libraries had 49 fragments in common, similar to the 44 fragments in common between the functionally diverse and randomly selected fragments. This indicates that the level of similarity between the functionally diverse and structurally diverse libraries is little more than random. Conversely, the functionally diverse libraries generated using the residue IFP and atomic IFP methods have 83 fragments in common.
Functionally Diverse Compounds Exhibit Chemical Properties Different from Those of Nonbinding Fragments. To explore the relationship between particular chemical properties and a fragment's rank, we split the DSiP fragment library into three sets: the 100 top-ranked fragments (those that are most informative), the 125 remaining bound fragments (fragments that had bound to one or more proteins), and the 295 fragments that had never bound to a target (Figure 3). This was performed on sets generated by both residue and atomic IFP methods, and as a comparison, chemical properties were also calculated for a structurally diverse set of 100 fragments.
Fragments ranked in the top 100 functionally diverse were more likely to have a molecular weight (MW) of ∼200 for both residue and atomic IFP methods, while the "other bound" groups of fragments were slightly more likely to have a MW of <175. For all other properties, there was no substantial difference between groups of fragments that had bound one or more targets. Fragments that had not been observed to bind any target were much more likely to have a low MW (<175) and a higher heavy atom count (≥17) compared to those that had bound one or more targets. The low MW of many "never bound" fragments suggests that they may be too small for reliable detection and modeling in many experiments. As expected, the fragments selected as a diverse subset had lower MACCS similarity with each other compared with the overall library and the highly ranked fragments.

Promiscuous Binders Are Not Necessarily the Most Informative Fragments.
To examine whether a fragment that bound multiple targets (a promiscuous fragment) was likely to be selected as a highly informative fragment, we compared the number of hits (targets bound) for three sets of fragments: the 100 top-ranked fragments, the remaining minimum fragments (fragments that would be required to recover all information from the original screen), and the fragments that could be removed without losing any information (redundant fragments). While the top-ranked sets consisted mostly of fragments that had bound multiple targets, some fragments that bound to three or four targets were excluded in favor of those that had bound to only a single target (Figure 4).
Functionally Diverse Fragments Recover Information More Efficiently from Unseen Targets. The results presented above show that a functionally diverse fragment set contains fragments different from those of a structurally diverse one. We investigated whether using such functionally diverse fragments is an effective strategy for more efficiently obtaining information about unseen targets. To do this, we performed a leave one out test (see Experimental Section for more details).
To compare the information recovery for each target when using functionally diverse fragments with the other methods of fragment selection, we analyzed the information recovered from each target at a library size of 100 fragments (Figure 6). We also calculated the fractional improvement when using the functionally diverse fragments compared with random and structurally diverse fragments (Figure S5). We compared the methods of fragment selection across all unseen targets. The mean values of information recovery across all targets are shown in Figure 5b. On average, the functional information about the unseen target was recovered more efficiently using the functionally diverse fragments than the random libraries.
Targets Contribute Differently to the Prediction of Important Fragments for Unseen Targets. The targets in this data set are very diverse (the maximum pairwise global sequence identity is 27%), but some fragments do form similar interactions with different targets. To assess the impact of each target on the effectiveness of a fragment set for giving information about an unseen target, we removed the previously seen targets one by one, ranked the fragments using only the nine remaining targets, and took the 100 top-ranked fragments as the functionally diverse library. The recovery of interactions was compared with the original (when results of all 10 targets were used to rank), and the factor by which each target changes the recovery of interactions for every other target was calculated. These impact scores are a measure of the similarity of the most informative fragments between two targets (Figure 7a,b). Between the residue IFP method and the atomic IFP method, the scores are consistent in terms of overall effect, but the magnitudes differ. Each target positively impacts some targets while negatively impacting others.
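One way to express such an impact score is as a relative change in recovery, which matches the way the improvements for Figure 7 are quoted as percentages. The sketch below is written under that assumption; the function names, target labels, and recovery values are hypothetical and are not taken from the study's data or code.

```python
def relative_change(value, baseline):
    """Fractional change of value relative to baseline."""
    if baseline == 0:
        raise ValueError("baseline recovery must be non-zero")
    return (value - baseline) / baseline


def impact_matrix(recovery_full, recovery_without):
    """Impact of each seen target on every other target.

    recovery_full[t]: recovery on target t when all targets inform the ranking.
    recovery_without[(s, t)]: recovery on t when target s is left out of the
    ranking. Returns {(s, t): relative change in recovery on t due to s}.
    """
    return {
        (s, t): relative_change(recovery_full[t], without)
        for (s, t), without in recovery_without.items()
    }


# Hypothetical recovery fractions at a fixed library size.
recovery_full = {"T1": 0.70, "T2": 0.50}
recovery_without = {("T2", "T1"): 0.50, ("T1", "T2"): 0.625}

impacts = impact_matrix(recovery_full, recovery_without)
for (source, target), score in sorted(impacts.items()):
    print(f"{source} -> {target}: {score:+.0%}")
```

A positive score means that including the source target in the ranking helped recovery on the other target; a negative score means it hurt.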
As a high impact score is indicative of similar fragments forming interactions with the targets, we assessed which targets were in the same classes (phosphatase/kinase, protease, nucleic, or other). This is shown in Figure 7c; however, it does not appear that there is a strong correlation between these two factors. This suggests that no two targets in the data set have closely similar fragment binding activity, yet by ranking fragments on previously seen targets, we can improve information recovery on unseen unrelated targets.

■ DISCUSSION
In agreement with previous work, 36 we found that structurally diverse fragments can form similar or even identical interactions. Additionally, we have shown that by defining fragments as the interactions they make with all targets, we can select fragment libraries on the basis of their functional diversity. Such libraries can form more diverse interactions with previously unseen targets and thus improve the information recovered by an average of 68% and a maximum of 152% across all targets tested, compared with traditionally designed fragment libraries.
These findings suggest that selection based on structural diversity is not the optimal strategy when diverse functional information is desired. Given that suitable experimental data are now available for many fragments bound to multiple targets, it is feasible to explore new approaches to select fragments to screen previously unseen targets.

Figure 3. Fragments that have never bound to a target (red) are more likely to have very low or very high molecular weights and heavy atom counts. Fragments have been categorized into the "top 100" and "remaining bound" groups on the basis of our functional ranking (see Experimental Section). Fragments that bound no targets make up the "never bound" set, while a set of structurally diverse fragments are shown for comparison. Various properties of these sets of fragments are compared. (a) Fragments have been ranked using the residue IFP method. (b) Fragments have been ranked using the atomic IFP method. In panels a and b, medium-sized fragments are most likely to be highly informative.

Ranking Fragments Shows Redundancy in Interactions. To select the fragments that show the most diverse functional activity, we ranked fragments by the number of novel interactions they made with 10 targets. This showed that some fragments form far more novel interactions than others. We analyzed the most informative fragments (those that were most highly ranked) and compared them to less informative fragments and those that have never been seen to bind. Our analysis of the most informative fragments is broadly in agreement with previous work 38 and suggests that fragments with molecular weights between 175 and 240 (and heavy atom counts between 12 and 16) perform optimally in fragment screens, perhaps as they can make multiple interactions without being structurally too complex, but are also large enough to be detected. Additionally, while promiscuous binders (fragments that bind multiple targets) are more likely to be considered as highly informative fragments, their promiscuity alone is not enough to guarantee this. This supports our hypothesis that using only information about which fragments have a high hit rate is not the most effective strategy for library redesign.
Functionally Diverse Fragments Recover Information More Efficiently from Unseen Targets. We then proceeded to study the potential of the ranking protocol described above as a novel method for selecting fragments to screen on unseen targets. We tested whether a set of fragments that exhibited functional diversity in previously screened targets were more efficient in information recovery than a random set of fragments or a structurally diverse set of fragments on an unseen target. Both residue IFP and atomic IFP ranking methods achieved better information recovery than comparison methods at a library size of 100 fragments: the residue IFP method improved recovery by a maximum of 126% and an average of 59%, and the atomic IFP method improved recovery by a maximum of 153% and an average of 76%. The ranked library was less efficient at information recovery on only one target (INPP5DA), only when using the residue IFP method and by only 2%. At smaller library sizes, the average improvement in information recovery is even larger. This result indicates that the functional redundancy of the DSiP fragment library remains across diverse targets and that designing the library in a functionally diverse way can lead to more information being generated for a target from a screen compared with traditionally designed libraries.

Figure. Ranked fragment libraries show superior information recovery for unseen targets at every library size. The recovery of information for each target when unseen was calculated 100 times. The mean across each run for each target was calculated, and the mean of these values was taken. This value is shown at each library size, with error clouds showing one standard deviation across 100 runs. (a) Fragments have been ranked using the "residue IFP" method. (b) Fragments have been ranked using the "atomic IFP" method.

Journal of Medicinal Chemistry
pubs.acs.org/jmc Article

The atomic IFP method gives slightly more consistent results than the residue IFP method. This suggests that the particular atom within a residue with which a fragment interacts is important, as the atomic IFP method captures this information whereas the residue IFP method does not.
Different Targets Contribute Differently to Prediction of Important Fragments for Unseen Targets. Finally, we set out to understand whether particular targets contributed disproportionately to the performance of our method, thus indicating that there were targets with similarity in their fragment binding activity within the data set. By testing the impact of each target's screening results on the recovery of information for every other target in a 100-fragment screen, we scored the similarity of the most informative fragments for each pair of targets. Each target positively impacts some targets while negatively impacting others. Some targets are mostly negatively impacted by others at a residue level while being unaffected or positively affected at an atomic level. As a measure of binding similarity, we compared this to the substrate class of each protein; however, there was little to no correlation. Considering the complexities of protein−ligand binding behavior, it is unlikely that a simple predictor of this behavior exists, so further research would be required to explore potential ways to predict such similarities between targets. Additionally, as we would expect, the impact of each target on each other target will change as more results are included and artifacts due to the relatively small data set used here are reduced.

■ CONCLUSIONS
Currently, the most common strategy for fragment library design is to select the most structurally diverse set of fragments from those that lie in the desired chemical space, without considering structural results from previous fragment screens. In this study, we have shown that libraries designed on the basis of functional diversity recover information more efficiently from unseen targets than traditionally designed structurally diverse libraries.
Even with a limited data set, we have demonstrated the potential for functionally diverse fragment selections to substantially improve the information recovered from fragment screens. As more experimental data become available, it will be possible to quickly reselect functionally diverse fragments that can reliably and significantly outperform traditional library selection methods.
This ability to better explore the interactions in protein binding sites would allow a larger number of diverse lead compounds to be developed, thus improving the chances of a fragment screening campaign producing a viable drug candidate. Additionally, we believe that by collecting comprehensive information about the functional activity of fragments (by testing them on many targets), we can begin to understand and predict which fragments will be functionally diverse just on the basis of their structure.

■ EXPERIMENTAL SECTION
In this section, we describe how we rank fragments by their ability to give us the most information about key interactions and how we test the capacity of fragments we ranked highly to recover information about unseen targets.
XChem Data Set. Structures of 309 protein−fragment complexes were used, from 10 targets and 225 unique fragments, representing results from 4928 individual crystals. Each target had between 13 and 65 fragment-bound structures. The targets were diverse, with a global pairwise sequence identity mean of 12% and a maximum of 27%. The sequence identity was calculated using EMBOSS Needle. 39 No two targets shared a CATH class. 40 Table S3 contains descriptions of the targets used, and the fragment-bound structures of these are available on the Fragalysis platform. 41 Full lists of fragments screened on these targets are included within the Supporting Information. To ensure that our analyses were not biased by fragments observed only a few times, we selected those fragments that had been tested on at least seven of our 10 targets. This resulted in 520 fragments being included, of which 295 had not bound to any targets. The remaining 225 fragments had bound to one to four targets. We selected fragments that had been tested on at least seven targets as a balance between coverage (seven of 10) and data set size. Requiring fragments to have been tested on eight or more of our targets would have led to the inclusion of only 480 fragments in the data set, of which only 214 had bound one or more targets. The fragments used in our analysis are included in the Supporting Information.
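The coverage filter described above can be sketched as follows. The data structures (dictionaries mapping each fragment to the sets of targets it was tested on and bound to), the function name, and the toy entries are assumptions made for illustration.

```python
def select_fragments(tested_on, bound_to, min_targets_tested=7):
    """Split fragments by screening coverage and binding outcome.

    tested_on[f]: set of targets fragment f was screened against.
    bound_to[f]: set of targets fragment f was observed to bind.
    Returns (kept, binders, never_bound) as sets of fragment names.
    """
    kept = {
        f for f, targets in tested_on.items()
        if len(targets) >= min_targets_tested
    }
    binders = {f for f in kept if bound_to.get(f)}
    return kept, binders, kept - binders


# Hypothetical screening records over 10 targets labeled T0..T9.
all_targets = {f"T{i}" for i in range(10)}
tested_on = {
    "frag_a": set(all_targets),                 # tested on all 10
    "frag_b": {f"T{i}" for i in range(7)},      # tested on 7
    "frag_c": {f"T{i}" for i in range(6)},      # tested on only 6
}
bound_to = {"frag_a": {"T1"}, "frag_c": {"T2"}}

kept, binders, never_bound = select_fragments(tested_on, bound_to)
print(sorted(kept), sorted(binders), sorted(never_bound))
```

Raising `min_targets_tested` trades data set size for coverage, which is the balance discussed in the paragraph above.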
Selection of Functionally Diverse Fragments. Definition of Functional Activity. We define the functional activity of a fragment as the interactions it forms with the protein. This definition is used to prioritize our understanding of a fragment's ability to form interactions rather than the structure of the fragment itself. The interactions in each structure within the data set were calculated using ODDT's InteractionFingerprint module, 42 which generates a binary fingerprint for each protein−fragment structure. This method, in this study termed "residue IFP", calculates up to eight types of interaction between the fragment and each residue in the protein (see Table S2 for details of interaction types). We also adapted the InteractionFingerprint module to output the interactions between the fragment and each atom of the protein, resulting in what we will term the "atomic IFP" method.
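The difference between the two levels of description can be shown with a plain-Python stand-in. This sketch does not call ODDT; it assumes each atomic-level interaction is stored as a (residue, protein atom, interaction type) tuple, and the residue names, atom names, and interaction labels are hypothetical.

```python
def to_residue_level(atomic_ifp):
    """Collapse atomic-level interactions to residue-level interactions
    by dropping the protein atom name."""
    return {(residue, itype) for residue, _atom, itype in atomic_ifp}


# Two hypothetical fragments that hydrogen bond to different oxygen atoms
# of the same aspartate side chain.
frag_x = {("ASP93", "OD1", "hbond_acceptor")}
frag_y = {("ASP93", "OD2", "hbond_acceptor")}

same_at_residue_level = to_residue_level(frag_x) == to_residue_level(frag_y)
same_at_atomic_level = frag_x == frag_y
print(same_at_residue_level, same_at_atomic_level)
```

The two fragments are indistinguishable at the residue level but distinct at the atomic level, so a ranking based on the atomic description can treat them as contributing different information.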
Ranking of Fragments Based on the Novelty of Functional Activity. The ranking protocol aims to identify fragments that add the most information about interactions a target can form. We start with a library size of one and append fragments one by one, each time including the fragment that adds the largest amount of novel information to the current library. As several fragments may add identical amounts of information, we repeat this 100 times, randomly shuffling the order of the fragment list before each run. At each library size, we calculate the number of interactions recovered compared to all of the interactions from the full screen. The mean fractional recovery rate at each library size is calculated, along with the standard deviation. The code used to rank fragments is available at https:// github.com/oxpig/fragment-ranking.
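A minimal sketch of this kind of greedy ranking is given below. It is an independent illustration rather than the code in the repository named above: the function names are assumptions, each integer in the toy data stands for one distinct interaction, and ties are broken by the shuffled list order.

```python
import random
import statistics


def rank_fragments(interactions, rng):
    """Greedy ranking: at each step add the fragment that contributes the
    most interactions not yet covered. Ties follow the shuffled order."""
    remaining = list(interactions)
    rng.shuffle(remaining)
    covered, order = set(), []
    while remaining:
        best = max(remaining, key=lambda f: len(interactions[f] - covered))
        order.append(best)
        covered |= interactions[best]
        remaining.remove(best)
    return order


def recovery_curve(order, interactions, reference):
    """Fraction of reference interactions recovered at each library size."""
    covered, curve = set(), []
    for frag in order:
        covered |= interactions[frag] & reference
        curve.append(len(covered) / len(reference))
    return curve


def mean_recovery(interactions, n_runs=100, seed=0):
    """Mean and standard deviation of recovery at each library size."""
    rng = random.Random(seed)
    reference = set().union(*interactions.values())
    curves = [
        recovery_curve(rank_fragments(interactions, rng), interactions, reference)
        for _ in range(n_runs)
    ]
    sizes = range(len(interactions))
    means = [statistics.mean(c[i] for c in curves) for i in sizes]
    stdevs = [statistics.pstdev(c[i] for c in curves) for i in sizes]
    return means, stdevs


# Hypothetical data: fragment -> set of interactions observed in the screen.
toy = {"f1": {1, 2, 3}, "f2": {3, 4}, "f3": {1, 2}, "f4": {5}}
means, stdevs = mean_recovery(toy)
min_size = next(i + 1 for i, m in enumerate(means) if m >= 1.0)
print(means, min_size)
```

In this toy set, three of the four fragments are enough to recover every interaction, and f3 is redundant because f1 already covers everything it contributes.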
Other Methods of Fragment Ordering for Comparison. For comparison, we used two other methods of fragment ordering, including only fragments that had bound one or more targets, to match the set of fragments that were possible to rank. The first of these was a random control, where we shuffled the fragments into a random order, repeating this 100 times and calculating the mean and standard deviation of the recovery rate at each library size. We also generated a structurally diverse control, using the technique employed in the selection of the F2X-Entry library. 11 We used RDKit 22 to calculate the MACCS key for each fragment and MaxMin picker to select a structurally diverse set of fragments for every library size tested. This was also repeated 100 times with the mean recovery rate and standard deviation taken. This strategy for fragment selection was compared with other fingerprinting methods (shown in Figure S1).

Figure 7. Impact of each target on every other target's information recovery at library size of 100 fragments. For example, including mArh improves recovery of information about CD44MMA by 40% compared with only using the eight remaining targets to rank the fragments; including CD44MMA improves recovery of information about mArh by 27%. (a) Fragments have been ranked using the "residue IFP" method. (b) Fragments have been ranked using the "atomic IFP" method. (c) Heat map showing whether two targets fall into the same class. Red indicates that two targets are in the same class, and blue indicates a different class.

Testing a Fragment Ranking Protocol on Unseen Targets. To test the effectiveness of ranking fragments by their functional information, we tested the protocol on each target in the data set, using a leave one out test.
For each target, we ignored the results of its own screen, ranked the fragments using the other targets, and calculated the recovery of information about this previously ignored target. This was run 100 times, and the mean amount of information recovered was calculated, along with the standard deviation. These values were also calculated for randomly ordered and structurally diverse fragment libraries.
Only fragments that had bound to previously seen targets could be ranked, so once this library size was exceeded, "dummy fragments" were used, which did not recover any new information from the protein. This resulted in 100 sets of fragment libraries at every size from one to 200 for each of the library types. For each method, the mean information recovered was calculated, along with the standard deviation of this information recovery.
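The leave one out procedure, including the padding with dummy fragments, can be sketched as follows. This is an illustrative simplification under stated assumptions: the nested dictionary layout, function names, and toy data are hypothetical, and a dummy fragment is represented by `None`.

```python
import random


def greedy_rank(interactions, rng):
    """Order fragments by the number of not-yet-covered interactions they add."""
    remaining = list(interactions)
    rng.shuffle(remaining)
    covered, order = set(), []
    while remaining:
        best = max(remaining, key=lambda f: len(interactions[f] - covered))
        order.append(best)
        covered |= interactions[best]
        remaining.remove(best)
    return order


def leave_one_out_recovery(per_target, unseen, library_size, rng):
    """Rank fragments on all targets except `unseen`, then measure the
    fraction of the unseen target's interactions that the library recovers.

    per_target[target][fragment] is a set of interactions.
    """
    # Pool interactions from the seen targets, tagging each with its target.
    seen = {}
    for target, frags in per_target.items():
        if target == unseen:
            continue
        for frag, inter in frags.items():
            seen.setdefault(frag, set()).update((target, i) for i in inter)
    library = greedy_rank(seen, rng)[:library_size]
    # Pad with dummy fragments (None), which recover no information.
    library += [None] * (library_size - len(library))
    reference = set().union(*per_target[unseen].values())
    recovered = set()
    for frag in library:
        if frag is not None:
            recovered |= per_target[unseen].get(frag, set())
    return len(recovered) / len(reference)


# Hypothetical screening results for three targets.
per_target = {
    "A": {"f1": {"a1", "a2"}, "f2": {"a1"}},
    "B": {"f1": {"b1"}, "f3": {"b2", "b3"}},
    "C": {"f1": {"c1"}, "f2": {"c1"}, "f4": {"c2"}},
}
rec_small = leave_one_out_recovery(per_target, "C", 2, random.Random(0))
rec_large = leave_one_out_recovery(per_target, "C", 5, random.Random(0))
print(rec_small, rec_large)
```

Here f4 binds only the unseen target, so it cannot be ranked from the seen targets, and recovery on target C plateaus once the rankable fragments are exhausted and dummy fragments take their place.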