Quantitative Mapping of Binding Specificity Landscapes for Homologous Targets by using a High-Throughput Method

To facilitate investigations of protein-protein interactions (PPIs), we developed a novel platform for quantitative mapping of protein binding specificity landscapes, which combines multi-target screening of a mutagenesis library into high- and low-affinity populations with sophisticated next-generation sequencing analysis. Importantly, this method generates accurate models to predict affinity and specificity values for any mutation within a protein complex, and requires only a small number of experimental binding affinity measurements using purified proteins for calibration. We demonstrated the utility of the approach by mapping quantitative landscapes for interactions between the N-terminal domain of the tissue inhibitor of metalloproteinase 2 (N-TIMP2) and three matrix metalloproteinases (MMPs) having homologous structures but different affinities (MMP-1, MMP-3 and MMP-14). The binding landscapes for N-TIMP2/MMP-1 and N-TIMP2/MMP-3 showed the PPIs to be almost fully optimized, with most single mutations giving a loss of affinity. In contrast, the non-optimized PPI for N-TIMP2/MMP-14 was reflected in a wide range of binding affinities, where single mutations exhibited a far more attenuated effect on the PPI. Our new platform reliably and comprehensively identified not only hot- and cold-spot residues, but also specificity-switch mutations that shape target affinity and specificity. Thus, our approach provides a methodology giving an unprecedentedly rich quantitative analysis of the binding specificity landscape, which will broaden the understanding of the mechanisms and evolutionary origins of specific PPIs and facilitate the rational design of specific inhibitors for structurally similar target proteins.


Introduction
Protein fitness -the ability of a protein to perform its main function -is delicately balanced against other protein properties, including solubility and stability to unfolding, aggregation, and degradation [1,2]. Despite evolutionary fine-tuning, protein fitness is very often not perfect [3], and thus to enhance protein functions for desired applications, we may need to apply high-throughput screening and selection of mutagenized protein libraries [4][5][6][7]. In such an endeavor, it is necessary to have in hand a reliable tool for evaluating the functional potency of the mutagenized proteins, for example, for determining the binding affinities of mutants with altered protein-protein interactions (PPIs). This type of evaluation can take the form of mapping fitness landscapes, which relate the effects of all possible mutations to protein fitness [8][9][10][11][12][13][14][15][16][17][18][19].
In investigations of PPIs, protein fitness landscapes find particular utility in the study of pairs of proteins with similar physicochemical properties and similar binding interface sizes but markedly different binding affinities (>12 orders of magnitude) [20]. Determinants of binding affinity include 'hot spots' [21,22], namely, residues at critical positions in the binding interface for which mutations cause a significant reduction in affinity (>3 orders of magnitude); the binding landscapes of such PPIs are referred to here as 'steep' [8,9,23,24]. Mutations at other positions in the PPI interfaces, known as 'cold spots,' will not greatly affect -or may even enhance [25] -the binding affinity, producing landscapes referred to here as 'shallow.' In situations in which it is desirable to block natural PPIs for therapeutic applications, hot spots -being energetically favorable residues -make the greatest contribution to the stability of protein-protein complexes and may thus represent druggable sites on target interfaces [26]. Cold spots, which are believed to play a major role in the evolution of PPIs [27], are particularly important for optimization in affinity maturation experiments aimed at improving the binding of protein-based therapeutics. For both these strategies, there remains a need to bridge knowledge gaps regarding hot and cold spots, particularly their exact positions and frequency in the binding interface.
To date, the techniques used in studies of PPIs range from alanine scanning mutagenesis to yeast surface display (YSD) in combination with next-generation sequencing (NGS) [9,[28][29][30][31]. However, these, and most other currently available, approaches generate affinity -but not necessarily specificity -landscapes, and, in the few studies that were designed to generate specificity landscapes [29,[32][33][34], the methodology was confined to discrimination between only two target proteins that have different binding epitopes. This constitutes a limitation because broad-spectrum binding proteins may have many potential targets with binding epitopes that have high sequence homology and structural similarity. The current study is thus aimed to address the need for better methods to comprehensively and accurately quantify the impact of individual mutations on the binding specificity of a protein toward homologous targets exhibiting a broad range of affinities.
In our initial efforts to tackle this problem, our laboratory previously developed a qualitative approach for mapping protein specificity landscapes by combining experimental multi-target selective screening of YSD libraries with NGS analysis [35]. This approach enabled us to identify both hot-spot and cold-spot residues, as well as specificity-switch and correlated mutations that act in tandem to shape specificity; nonetheless, there remained a need for similarly facile methods to obtain quantitative binding landscapes. To address this need, we used our previously developed reliable and comprehensive platform as the starting point to construct a quantitative generic methodology to map protein specificity landscapes, regardless of the K D (or K i ) values of the PPIs, as shown schematically in Fig. 1. The major advance offered by this methodology for accurately quantifying the target specificity for thousands of protein mutants lies in the need to determine the inhibition constants (K i ) for only a small number of selected purified mutants after the steps of protein sequence randomization, fractional library screening using YSD into low-and high-affinity PPIs, and NGS for a large number of mutants.
As model systems for this study, we chose three homologous protein-protein complexes, each composed of the N-terminal domain of the tissue inhibitor of metalloproteinase 2 (N-TIMP2) and the catalytic domain of a matrix metalloproteinase (MMP). As the MMP partner, we chose to focus on MMPs that represent three different functional groupscollagenases, stromelysins, and membrane-type MMPs, namely, MMP-1, MMP-3, and MMP-14, respectively. Importantly, despite their high structural homology (known from Xray structures), these three MMPs bind N-TIMP2 with different affinities, spanning two orders of magnitude [36][37][38]. This range of affinities renders N-TIMP2 an optimal model through which to develop and test our novel approach, in that its lack of discrimination between MMP-3 and MMP-14 offers a good starting point for manipulating relative specificity, while its higher preexisting selectivity toward MMP-1 offers an opportunity for engineering specificity switches. In addition, the binding interface residues of N-TIMP2 have been shown to tolerate substitution or incorporation of additional amino acids with only a minimal impact on protein stability [36][37][38]. In addition and equally importantly, the three model MMPs selected represent potential targets of clinical value, since MMP-1 and MMP-14 are oncogenic [39][40][41][42][43][44], while MMP-3 plays important roles in tissue regeneration and wound healing [45,46]. To date, there has been very limited progress toward development of specific inhibitors -natural or synthetic -targeting these or other MMP family members, probably due to the high similarity in the sequences and structures MMPs, which share nearly identical active sites in their catalytic domains [47,48].
We thus applied our novel platform to explore all possible single mutations in the N-TIMP2 binding interface of the three complexes, N-TIMP2/MMP-1, N-TIMP2/MMP-3 and N-TIMP2/MMP-14. In contrast to the affinity landscapes obtained in previous studies that generated qualitative information, the quantitative binding landscapes obtained here for the three different N-TIMP2/MMP complexes enabled us to dissect out the contribution of each interface position to binding and hence to quantitatively analyze the effects of all single

Library sorting and analysis by flow cytometry
The yeast-displayed N-TIMP2 library that had been grown in SDCAA medium overnight was transferred into SGCAA medium (2% galactose, 0.67% yeast nitrogen base, 0.5% Bacto casamino acids, 1.47% sodium citrate, and 0.429% citric acid monohydrate, adjusted to pH 4.5) for overnight growth at 30 °C to induce expression of the library and to display it on the surface of the yeast cells [54]. For screening against the MMP targets, ~5×10 7 yeast cells expressing the library were collected and washed with a binding buffer (50 mM Tris, pH 7.5, 100 mM NaCl, 5 mM CaCl 2 , and 1% BSA). Thereafter, the cells were incubated for 1 h at room temperature (RT) with the primary antibody, mouse anti-c-Myc antibody (Abcam, Cambridge, UK), in 1:50 ratio together with soluble biotinylated MMP-1, MMP-3 or MMP-14 in final concentrations in the binding buffer of 0.78 nM, 12.5 nM or 12.5 nM, respectively. The cells were washed with the binding buffer and incubated with streptavidinconjugated to Alexa Fluor® 647 (Invitrogen, IL) in a 1:200 ratio together with sheep antimouse antibody conjugated to phycoerythrin (PE) (Sigma-Aldrich, MO) in a 1:50 ratio in the binding buffer in the dark for 30 min on ice. Thereafter, the cells were washed and sorted into two gates to differentiate between high-and low-affinity clones by using an FACSAria (Ilse Katz Institute for Nanoscale Science and Technology, BGU, Israel). Dual-color flow cytometry was used for analysis of the expression and binding of the N-TIMP2 library and selected individual clones to the different MMP targets using an Accuri C6 flow cytometer (BD Biosciences, San Jose, CA).
For the YSD-based affinity titration curves, the same MMP labeling and N-TIMP2 WT /MMP binding reaction conditions as those described above were used. The binding of N-TIMP2 WT to MMP-1, MMP-3 and MMP-14 was determined by incubating the yeast cells displaying N-TIMP2 WT with different concentrations of purified MMP-1 (0.05 nM, 0.1 nM, 0.2 nM, 0.5 nM, 1 nM, 2 nM, 5 nM, 8 nM, 10 nM, 20 nM, 75 nM, 100 nM and 250 nM) and MMP-3 and MMP-14 (1 nM, 5 nM, 10 nM, 20 nM, 50 nM, 100 nM, 500 nM, 750 nM and 1 μM). To obtain a plot of the normalized binding for the different MMP concentrations, each affinity measurement was normalized to the highest binding against the same MMP target, which was set as 1,. A nonlinear binding fit was implemented using GraphPad Prism (GraphPad software, CA, USA) for the N-TIMP2 WT /MMP-1, N-TIMP2 WT /MMP-3 and N-TIMP2 WT /MMP-14 interactions. The values of the apparent K D for each interaction were calculated using the 'binding saturation -one site total' fit.
DNA Sequencing of the sorted N-TIMP2 library fractions.-The sorted N-TIMP2 library fractions (high and low affinity) were isolated from the yeast cells using the E.Z.N.A. Yeast Plasmid Mini Kit (Omega Bio-tek, Norcross, GA, USA) according to the manufacturer's protocol. The kit products were run on a 1% agarose gel and were then purified using HiYield Gel/PCR Fragments Extraction Kit (RBC Bioscience, Taiwan). Thereafter, ~1 ng of plasmid DNA was used for gene amplification in a PCR reaction, containing 2% DNA template, 5% forward primer (10 μM), 5% reverse primer (10 μM) (Table S2), 20% Q5 reaction buffer, 2% dNTPs, and 1% Q5 HF polymerase (New England Biolabs, Ipswich, MA, USA) in doubly distilled water. The PCR conditions were as follows: 98 °C for 30 s, followed by 30 cycles of 98 °C for 10 s, 56 °C for 30 s and 72 °C for 1 min; the reaction mixture was then left to stand at 72 °C for 10 min. The PCR products were then sent for sequencing (Hylabs, Rehovot, Israel), and a second PCR reaction was performed using the Fluidigm Access Array primers to add the adaptors and barcodes. Then, the samples were purified with AmpureXP beads (Beckman Coulter, CA, USA), and their DNA concentrations were detected in a DNA high sensitivity assay performed in a Qubit fluorometer (Thermo Fisher Scientific). Thereafter, the samples were run on a TapeStation (Agilent, CA, USA) to verify the size of the PCR product. As a final quality test, the samples underwent a qRT-PCR reaction to determine the concentration of the DNA that could be sequenced. The pools were then loaded for sequencing on an Illumina Miseq, using the 500v2 kit.
Data filtration and integration of high-fidelity reads.-The data from all the sequencing runs for the N-TIMP2 library fractions was analyzed in the same manner. An average Illumina quality score was calculated for each read in a given set of paired-end reads, and read pairs in which either read had an average quality score lower than 20 (i.e., less than 99% accuracy) were discarded. The paired end reads were combined together into a single sequence by Fast Length Adjustment of Short reads (FLASH) software [57] [The Center for Computational Biology (CCB), Johns Hopkins University, Maryland, USA].

Analysis of the high-throughput sequencing data
Translation and analysis of the DNA sequences were performed with MATLAB software, version R2016a. The DNA sequences of the sorted N-TIMP2 library fractions were translated to their corresponding amino acid sequences and aligned to the N-TIMP2 WT sequence (PDB ID code 1BUV). Sequences containing longer or shorter lengths than the WT or sequences with a stop codon incorporated inside them were filtered out. Thereafter, a minimum threshold value was set in the form of a minimum frequency (f min ), rather than an absolute mutant count, for two reasons: to avoid inflation of the data in cases of mutants with low read counts and because the library size of the sorted library fractions was not consistent (~5×10 5 -9×10 5 reads). The f min value (SI Appendix, Fig. S7) was set to 2 −12 and was based on the distribution of mutants in the pre-sorting library. Then, the numbers of appearances of each mutant in each library fraction were summed, and the frequency of each mutant was calculated as: f mut, i = #reads mut, i ∑ #reads mut, i .
where # reads mut,i is the number of reads of a mutant in library fraction i, and ∑# reads mut,i is the sum of all the reads for all mutants in library fraction i.
Next, to compare the frequencies of each mutant to that of the WT in the same library fraction, we derived a parameter that we termed the normalized frequency (NF), which is the ratio between the frequency of a given mutant and the frequency of the WT in library fraction i, i.e.: For mutants with low frequencies, i.e., below f min , the NF value was calculated with f mut,i set to the f min value.
Based on the normalized frequency, we determined the following NGS scores: i. NGSaffinity score = NF mut, Higℎ Affinity NF mut, Low Affinity .
where NF mut,High Affinity and NF mut,Low Affinity are the normalized frequencies of a mutant in the high-and low-affinity library fractions for a given target, respectively; a high score implies a high affinity and vice versa.
ii. which is the product of all the paired NGS specificity scores for target A vs. the other two targets. In all the scoring calculations above, a score was discarded if it had to be used with more than one NF parameter containing the f min value.
We then presented these NGS scores (affinity scores, paired specificity scores and total specificity scores) in the form of qualitative heat maps presenting the log 2 transformation of the scores listed above (see Eq. 2-5), in which blue indicates an increase and red a decrease in the estimated affinity or specificity relative to N-TIMP2 WT .
Based on the above-mentioned NGS results, several mutants whose affinity and specificity scores fell within a wide spectrum (i.e., from low to high) were chosen for further production and purification, followed by functional evaluation, i.e., K i determination, with the aim to achieve an optimal representation of each mutant population for the validation and 'scaling' of the NGS results.
Engineering N-TIMP2 variants by using site-directed mutagenesis.-Based on our NGS results, N-TIMP2 WT and several variants were selected for further purification and characterization. For purification, we used the pPICZαA construct, containing both the N-TIMP2 WT gene [36] and the Zeocin resistance gene, and also the AOX1 promoter at its N terminus and a hexahistidine tag its C terminus. The plasmid was propagated in DH5α E. coli cells and then purified using a HiYield plasmid mini kit (RBC Bioscience, Taiwan).
Thereafter, the site-directed mutagenesis procedure for the selected N-TIMP2 variants was carried out in a PCR reaction using specific primers (Table S2) containing the desired amino acid encoding codon in the middle and 15 bp flanking it from each side, complementary to the template WT DNA sequence. The PCR reaction mixture comprised: 2% DNA plasmid template (~50 ng), 5% forward primer (10 μM), 5% reverse primer (10 μM), 20% Phusion HF buffer, 2% dNTPs, and 1% Phusion HF polymerase in doubly distilled water. The PCR conditions were: 98 °C for 3 min, followed by 25 cycles of 98 °C for 10 s, 65°C for 30 s and 72°C for 10 min; the reaction mixture was then left to stand at 72 °C for 10 min. Next, the PCR reaction products were loaded on a diagnostic 1% agarose gel to verify the procedure's success, and then transformed into competent DH10β E. coli cells. The transformed bacteria were plated on LB agar plates containing 50 μg/ml Zeocin (Invitrogen, NY, USA). The plasmid was extracted from several bacterial colonies, and the correct sequence of the desired mutation was verified (Genetics Unit, NIBN, BGU, Israel). Purification of N-TIMP2 variants was performed as previously described [36]. For each N-TIMP2 variant, 5 colonies were selected and grown overnight in 5 ml of BMGY medium (2% peptone, 1% yeast extract, 0.23% K 2 H(PO 4 ), 1.181% KH 2 (PO 4 ), 1.34% yeast nitrogen base, 4×10 −5 % biotin, 1% glycerol) at 30 °C, and then transferred into 5 ml of BMMY medium (2% peptone, 1% yeast extract, 0.23% K 2 H(PO 4 ), 1.181% KH 2 (PO 4 ), 1.34% yeast nitrogen base, 4×10 -5 % biotin, 0.5% methanol) for protein induction of 72 h, with the addition of 1% methanol once a day in the last two days. Overexpression of the secreted proteins was determined by western blot, using a 1:3000 dilution of mouse anti-6×His antibody (Abcam, Cambridge, UK) primary antibody, followed by a 1:5000 dilution of antimouse secondary antibody conjugated to alkaline phosphatase (Jackson ImmunoResearch, West Grove, PA, USA), and detection by incubation in 2 ml of 5-bromo-4-chloro-3-indolyl phosphate reagent (Sigma-Aldrich, USA). Large-scale production of the proteins was performed by growth of the N-TIMP2-expressing yeast exhibiting the highest protein overexpression in 50 ml of BMGY medium overnight, followed by 72 h of growth in BMMY medium, with daily additions of 1% methanol. The proteins were purified by centrifugation of the yeast cell suspension at 3800 g for 10 min and filtration of the supernatant, followed by addition of 500 mM NaCl and 10 mM imidazole in pH 8.0. The supernatant was incubated for 1 h at 4 °C, centrifuged at 3800 g for 10 min, filtered, and then loaded on nickel-nitrilotriacetic acid-Sepharose beads (Invitrogen, USA), washed with 50 mM Tris, pH 7.5, 100 mM NaCl, and 10 mM imidazole, and eluted with 20 ml of 50 mM Tris, pH7.5, 100 mM NaCl, 300 mM imidazole, and 5 mM CaCl 2 . The elution fraction was concentrated using a Vivaspin centrifugal concentrator with a 3-kDa cutoff (GE Healthcare Life Sciences, USA). The proteins were further purified using a Superdex 75 column (GE Healthcare Life Sciences, USA) with elution buffer (50 mM Tris, pH 7.5, 100 mM NaCl and 5 mM CaCl 2 ) in an ÄKTA pure instrument (GE Healthcare Life Sciences, USA). SDS-PAGE analysis on a 15% polyacrylamide gel under reducing conditions for the purified proteins was then performed. Bands were visualized by staining with Instant Blue (CBS Scientific, CA, USA). Protein samples were concentrated using a Vivaspin centrifugal concentrator with a 3-kDa cutoff and subjected to mass spectrometry analysis (Ilse Katz Institute for Nanoscale Science and Technology, BGU, Israel). Protein concentrations were determined by UV-Vis absorbance at 280 nm, using a NanoDrop Spectrophotometer (Thermo Scientific, USA), with an extinction coefficient (ε 280 ) of 13,500 M −1 cm −1 for N-TIMP2 WT and all its variants except for N-TIMP2 V71W with ε 280 = 18,825 M −1 cm −1 .

Catalytic activity and inhibition assays
Catalytic activity and inhibition assays were performed as previously described with minor modifications [37]. N-TIMP2 WT and its variants were tested for inhibitory activity by incubating them with the three different MMPs in the following concentrations: 0.
where V i and V 0 are the enzyme (MMP) velocities in the presence and absence of the relevant N-TIMP2 inhibitor, respectively; E and I are the concentrations of enzyme and inhibitor, respectively; K m is the Michaelis-Menten constant; and K i app is the apparent inhibition constant, which is given by , where S is the substrate concentration.
Next, the K i affinity score of each mutant for each target MMP (K i fold) was evaluated by calculating the fold change in its inhibition constant relative to the inhibition constant of the WT, as: Thereafter, the K i specificity scores were evaluated in the same manner as for the NGS specificity scores (see Eq. 3-5): (Eq. 8) was used to calculate the specificity of a mutant towards target A vs. target B by dividing their respective K i fold values, and was used to evaluate the extent of specificity of a mutant towards target A vs. the other targets, by multiplying the paired K i specificity scores for target A vs. all the other targets.

'Scaling' of the binding landscapes
After determining the K i affinity and specificity scores, we performed log 2 transformations of the NGS and K i scores and then fitted the NGS and K i results to a linear regression model. Analysis of the same data was also performed using leave-one-out cross-validation approach, where each data point was predicted without the enrichment information for that particular data point. The obtained regression lines allowed us to place the NGS scores of other, unpurified, variants in the regression formulas, and calculate their corrected corresponding affinity and specificity scores via interpolation. The calculation was carried out for all the single mutations in the seven N-TIMP2 interface positions, which then enabled us to 'scale' our initial qualitative ('apparent') binding landscapes and transform them to quantitative heat maps that reflected/predicted the exact outcomes of single amino acid substitutions on target binding affinity and specificity.

Statistical analysis
To validate the correlation between the NGS scores and the Ki scores, we used Pearson correlation (R), p-value, Spearman's correlation (ρ) and Kendall correlation (τ); the analysis was performed in the Partek Genomics Suite [58,59].

Fractional sorting of the N-TIMP2 mutagenesis library for high and low affinity to MMP-1, MMP-3 and MMP-14
As the first step to comprehensively mapping the affinity and specificity landscapes of the interactions of N-TIMP2 with the three different MMPs, i.e., to determine the contribution of each position and of each specific mutation to the affinity and specificity of N-TIMP2 for the target MMP, we generated a single-mutation N-TIMP2 library with mutations in seven key positions, i.e., 4, 35, 38, 68, 71, 97 and 99 (Fig. 1, step A). These positions were chosen for their known importance in MMP binding, their close proximity (within 4 Å) to the MMP interface and to the catalytic zinc in the N-TIMP2/MMP-14 complex structure [53,60], and their structural tolerance to mutagenesis [36,53]. Although previous experiments have evaluated the combined effects of certain mutations in these positions, the contributions of each position and mutation to the affinity and, particularly, to the specificity for the target, i.e., as cold spots or hot spots, has not been previously determined.
We used a YSD platform to select N-TIMP2 variants that bind to MMP-1, MMP-3 and MMP-14 with different affinities, as follows. We cloned the coding region of the N-TIMP2 variants into the YSD plasmid pCHA (which allows the N-terminus of N-TIMP2 to be freely exposed) for presentation of the proteins on the Saccharomyces cerevisiae yeast surface as fusion proteins with the Aga2p/Aga1p system (SI Appendix, Fig. S1A). The N-TIMP2 library, expressed in the YSD system (SI Appendix, Fig. S1A), was subjected to fluorescence-activated cell sorting (FACS) for separate screening against each of the three MMP targets, namely, the catalytic domains of MMP-1, MMP-3 and MMP-14 ( Fig. 1, step  B). Target concentrations were chosen to give maximal scattering in the affinity signal of the sorted populations and hence to facilitate identification of, and differentiation between, highand low-affinity populations, where 'high' is > wild type (WT) and 'low' is < WT. We note that to compensate for differences between target concentrations used for sorting (in the FACS experiment) and the K i of the relevant N-TIMP2/MMP complex, the results were later calibrated as described below in the section Scaling of NGS scores by linear regression vs. K i scores. For each sort, high-and low-affinity fractions were collected by applying diagonal sorting gates, which facilitated the normalization of the binding signal to the expression for each N-TIMP2 clone (SI Appendix, Fig. S1B). Further post-sort flow cytometry analysis of the fractionated sub-populations verified the differences in the binding signal between the sorted high-and low-affinity fractions (SI Appendix, Fig. S1C). A comparison between scattering pattern of the pre-sorted library population vs that of each of the individual clones verified that the range of scattering for a single mutant was significantly narrower than that of the library (SI Appendix, Fig. S2).

Generating qualitative affinity and specificity heat maps based on NGS analysis of the sorted N-TIMP2 library fractions
Following the isolation of the high-and low-affinity fractions for each target, the plasmid DNA of each fraction was extracted and subjected to high-throughput NGS analysis. The quality of the sequencing output was verified, with ~90% of the sequenced paired end reads undergoing successful quality filtration and integration. Thereafter, the DNA sequences of the N-TIMP2 library fractions were translated into their respective amino acid sequences and aligned to the WT N-TIMP2 sequence (N-TIMP2 WT ; PDB ID code 1BUV). The numbers of reads for each mutant in each sorted fraction were summed, and the frequency of the mutant in each sorted fraction was determined, where frequency was defined as the ratio of number of reads for a single mutant normalized to the total number of reads for a given sorted fraction (Eq. 1; for Eqs 1-6, see Analysis of the high-throughput sequencing data in Materials and Methods). Next, to estimate the contribution of each amino acid substitution in the N-TIMP2 binding interface to the change in binding affinity and specificity for each MMP target, we defined a parameter, which we termed the normalized frequency (NF), as the ratio between the frequency of a single mutant and the frequency of N-TIMP2 WT in the same sorted fraction (Eq. 2; see Analysis of the high-throughput sequencing data). This parameter allowed us to compare the affinity of each N-TIMP2 mutant for a specific MMP to that of the WT N-TIMP2 for the same MMP in a manner that was not influenced by either the size of the pooled libraries or the frequency of the mutant in the original (pre-sorting) mutated library. This approach complements previous studies that evaluated changes in target affinities on the basis of enrichment ratios comparing the frequency of a particular mutant in a sorted library to its frequency in the original library [29,61].
To enable us to treat the raw NGS data -and ultimately to quantify our findings for binding affinity and specificity -we then defined three different scores, as follows: i. the 'NGS affinity score' (Eq. 3; see Analysis of the high-throughput sequencing data) as the ratio between the mutant's NF parameters in the high-and lowaffinity fractions of the same MMP target, our assumption being that if a mutant's NGS affinity score exceeds 1, then its affinity towards the desired target would be higher than that of N-TIMP2 WT , and vice versa; ii. the 'paired NGS specificity score' (Eq. 4; see Analysis of the high-throughput sequencing data) as the NF value obtained from a high-affinity fraction for a desired target divided by the NF value of the same high-affinity fraction for a different target, thereby quantifying the specificity of a mutant towards a given target over another target (a score > 1 representing a higher specificity of a mutant towards target A over B); and iii. the 'total NGS specificity score' (Eq. 5; see Analysis of the high-throughput sequencing data) as the product of all the paired NGS specificity scores for a certain target vs. all the other targets, thereby giving a score that determines the extent of specificity of a mutant towards a particular target in a group of homologous targets.
The information obtained from the NGS analysis and scoring calculations provided qualitative insight into the relative contribution of each amino acid substitution in the N-TIMP2 binding interface to the binding affinity and specificity for a particular target. The construction of quantitative binding landscapes then required further adjustments and 'scaling' to complementary experimental affinity determinations of representative purified mutants. To facilitate the optimal selection of a number of representative variants for binding/inhibition measurements, we first summarized the initial NGS results (expressed as log 2 of the affinity, paired specificity and total specificity NGS scores) into heat maps (SI Appendix, Fig. S3); these heat maps enabled the qualitative visualization of the apparent effects of each amino acid substitution on the affinity and specificity (vs. a single different MMP or multiple MMPs) towards the MMP targets (Fig. 1, step C).

Empirical assessment of affinity and specificity of purified calibrator mutants for scaling of the NGS data
To 'scale' the apparent affinity and specificity scores obtained from the NGS data, we selected eight mutants (covering the entire spectrum of NGS affinity and specificity scores; see below in this section) and purified them for further empirical binding analysis (SI Appendix, Fig. S4). This analysis of the interactions between free proteins (solution K D ) was required because solution K D values were ~ 50-fold lower than apparent K D values obtained for the same PPI when using the yeast display format (SI Appendix, Fig. S5 and Table 1). By creating linear regression equations, we were able to correlate the NGS scores with the K i binding results obtained for these purified 'calibrator' mutants, and subsequently to use these equations to interpolate the affinity and specificity of each of the mutants in the N-TIMP2 mutated library (i.e., for each mutation in the binding interface) to the selected MMP targets. To optimally represent the complete range of affinity and specificity of mutants present within the library, the selection of calibrator mutants for the binding analysis was aimed to cover the entire spectrum of NGS affinity and specificity scores, as follows. For each of the three MMP targets, two mutants exhibiting high specificity scores were chosen, as follows: N-TIMP2 T99M and N-TIMP2 T99Q selective for MMP-1, N-TIMP2 S68M and N-TIMP2 V71W selective for MMP-3, and N-TIMP2 H97R and N-TIMP2 T99G selective for MMP-14. In addition, we chose one mutant that exhibited highaffinity scores towards all three targets, namely, N-TIMP2 S68E , and one mutant with low affinity scores to all three targets, namely, N-TIMP2 V71R . We then examined the affinity and specificity of each of the purified calibrator mutants for the selected MMP targets by performing catalytic inhibition assays. In these assays, K i for the binding between each MMP target and N-TIMP2 WT or one of the selected N-TIMP2 mutants was evaluated by monitoring the cleavage of a fluorescent MMP substrate over time in the presence of increasing concentrations of inhibitors and fitting the acquired data by multiple regression to Morrison's tight binding inhibition equation (Eq. 6; see Catalytic activity and inhibition assays in Materials and Methods), as shown in Fig. 2 and Table 1. Utilizing the abovemeasured inhibition constants, we then determined the following K i affinity and specificity scores for each mutant: i. K i affinity score, namely, K i fold, as the ratio between the K i of the WT and the K i of the mutant, which represents the fold change in affinity as a result of a given amino acid substitution, where K i fold > 1 thus means an increase in affinity relative to the WT, and vice versa (Eq. 7; see Catalytic activity and inhibition assays); ii. paired K i specificity score, as a ratio between the K i fold change of a mutant towards one target and the K i fold change towards a second target, which indicates the extent of specificity awarded by a single mutation towards a given target over another (Eq. 8; see Catalytic activity and inhibition assays); and iii.
total K i specificity score, as the product of all the paired K i specificity scores for a certain target vs. the rest of the targets, which aids in quantifying the degree of specificity of a mutant towards a single target vs. all the other targets (Eq. 9; see Catalytic activity and inhibition assays).   Table S1.

Scaling of NGS scores by linear regression vs. K i scores
Due to the differences between the target MMP concentrations used for the N-TIMP2 library sorts vs the K i values for the respective purified N-TIMP2 WT /MMP complexes, we calibrated the NGS scores by linear regression vs the K i scores (see Materials and Methods). Log 2 transformations of the K i affinity and specificity scores of the purified mutants were plotted against the corresponding NGS scores, and the plots were fitted to linear regression models (Fig. 1, step D; SI Appendix, Fig. S6). The affinity plots showed good agreement between the experimentally obtained K i results and the NGS scores (SI Appendix, Fig.   S6A-C,  Fig. S6G-I, Table S3). Analysis of the same data was performed using leave-one-out cross-validation approach [62], where each data point was predicted without the enrichment information for that particular data point. Good correlations were obtained both for the affinity, specificity and total specificity scores, with Pearson's R-values of 0.87, 0.87 and 0.88, respectively (SI Appendix, Fig.S7A-C). In all the NGS scoring calculations, a data point was excluded from the analysis if the mutant's frequency was below the minimum frequency threshold in more than one subpopulation (i.e., both high-and low-affinity gates in the affinity score calculations, or both high-affinity gates of two different targets in the specificity score calculations, see Materials and Methods for further explanations). This approach enabled the scaling of the NGS scores to the K i data, and hence further refinement, fine-tuning and quantification of the affinity and specificity landscapes in a precise way.

Quantitative N-TIMP2 affinity and specificity landscapes
Use of the regression equations linking the NGS and K i scores for as small a number as eight individual calibrant variants allowed us to scale our entire set of NGS results through interpolation and thereby to predict the effects of numerous other mutants -and almost all the single amino acid substitutions in our library -with regard to affinity and specificity towards the MMP targets. Heat maps prepared on the basis of the scaled affinity and specificity scores constitute quantitative binding landscapes for N-TIMP2 and the three MMPs (Fig. 1, step E), as presented in Fig. 3.
Examination of the scaled binding landscapes revealed the effects of the various mutations in the N-TIMP2 binding interface on target binding affinity and specificity and showed clearly that some N-TIMP2/MMP complexes were more tolerant to mutations than others. The binding affinity landscape for MMP-1 (Fig. 3A) showed that most of the tested mutations led to a substantial decrease in binding affinity. Nonetheless, among the seven positions that were examined, mutations in positions 97 and 99 were less deleterious or sometimes even affinity enhancing vs those in the other five positions, e.g., mutations H97R and H97L in position 97 increased affinity towards MMP-1. The affinity landscape for MMP-3 ( Fig. 3B) showed that all the mutations led to decreased affinity, with mutations in position 71 being the most deleterious. In contrast, the binding affinity landscape for MMP-14 suggested that the binding affinity of N-TIMP2 was not impaired dramatically by the mutations, and that many mutations have the potential to increase the binding affinity to MMP-14 ( Fig. 3C): Positions 97 and 99 were identified as potentially affinity-enhancing positions for MMP-14, as most of the amino acid substitutions in those positions led to an increase in affinity. Notably, our analysis revealed a subtle shift in specificity from one target to another; for example, while most of the mutations in position 68 led to an increase in specificity for MMP-14 over MMP-1, mutation S68G led to higher specificity for MMP-1 over MMP-14. In addition, a decrease in the total specificity for MMP-3 was observed in most of the mutations in position 97, except for the mutation to Glu (Fig. 3A), while an increase in total specificity for MMP-14 was revealed at position 99, except for the mutation to Gln (Fig. 3C).

Discussion
We report here a generic platform for quantitatively predicting the binding affinity and specificity for all protein variants with a single mutation in the binding interface for a particular PPI. Such predictions can be made on the basis of a small data set of experimentally determined K i values for a small number of purified variants and scaled NGS data. Our methodology differs from previous approaches in which correlations between NGS results and K i (or K D ) could not necessarily be generalized, since those studies required library screening using high-affinity sorts of mutant populations and a target protein concentration similar to the K i (or K D ) of the PPI, and could thus predict affinities lying only in a narrow range. Furthermore, such previous approaches could not predict affinities for very high affinity (low K i ) complexes due to technical limitations, as the required low target protein concentrations would fall below the detection levels of the screening process. In contrast, we are able to predict the affinity and specificity of PPIs spanning a broad range of values, including very high affinity interactions, by analysis of NGS data from low-and high-affinity sorts of mutant populations.
As expected, for the three tight binding N-TIMP2/MMP complexes investigated here as model systems for quantitative affinity landscapes, the correlation curves for the NGS results vs. the K i values were shifted towards the low values of our NGS affinity scores (i.e., for negative NGS scores, positive K i scores were obtained). Nevertheless, we were able to correct for the bias imposed by the differences between the K i values and the concentrations used in the YDS library sorting (by scaling the NGS affinity scores to the K i affinity values) and to obtain strong correlations between NGS and K i values for these interactions, even though the K i values of the three complexes were approximately 10-fold lower than the protein concentrations used in the YSD library sorting. Indeed, the resulting binding landscapes correlated very well with experimental data obtained in this paper and with the findings of previous studies [36,53].
To obtain the linear regression curves for the affinity predictions and for generating the quantitative affinity landscape maps, it was necessary to use the more accurate solution K Dand not the apparent K D of the PPI -to normalize the NGS scores. It is likely that the resultant changes to the highly complementary electrostatic potentials of N-TIMP2 WT and MMP, particularly those caused by the biotinylation of the target MMP, reduced the strength of the PPI. Support for this idea may be drawn from our findings that the MMP-TIMP PPI is highly optimized (especially for MMP-1 and MMP-3), giving affinities in the low to sub-nanomolar range, and that even a single mutation can easily lead to a drastic decrease in affinity; hence, modifications of the proteins that are used for detection in the YSD setup could result in in drastic changes to K D values. Since the same modifications to both proteins in the N-TIMP2/MMP PPI are used in all the sorting experiments, they are likely to reduce the apparent affinity by the same amount for all the different variants. This premise is corroborated by the high correlation that we obtained between the NGS-derived enrichments and the in vitro K D values.
For any mutant, the correlations between the experimentally obtained K i scores and the NGS scores for all three N-TIMP2/MMP complexes allow the accurate prediction of K i values and of both affinity and specificity, without the need to purify that mutant or to perform a functional assay to determine its activity. In order to show how well the linear regression models could predict a K i measurement for a mutant that was not included when building the model, we used a computational cross-validation approach. Using this approach, involving sequential recalculation of regression models with omission of one mutant at a time and then using those models to predict K i for the omitted mutant, we demonstrated that the predicted K i values are still very close to the measured ones. Indeed, we were able to predict five affinity-enhancing mutations that were also identified in an N-TIMP2 variant that was previously evolved by our group for high affinity towards MMP-14 in preference to other MMPs [36]. In fact, three out of five of these mutations showed the highest affinity enhancing scores (vis-à-vis other mutations) for each mutated position. We note that the high correlations between the experimentally obtained K i scores and the NGS scores for N-TIMP2/MMP-1 and N-TIMP2/MMP-3 (Pearson's R-values of 0.9803 and 0.8849, respectively) may be attributed to the broad range of K i values that covered both affinityenhancing and affinity-reducing mutations, whereas the lower correlation (R = 0.7662) obtained for the N-TIMP2/MMP-14 complex was probably the outcome of using data for proteins purified and tested in different laboratories (our data were supplemented with data from the literature).
Multigate sorting approaches similar -but not identical -to ours for mapping binding landscapes have been reported by the group of Keating [30,63] and the group of Kinney [64], with the consensus of results pointing to the potential utility of exploiting binding landscapes in studies designed to improve the affinity and specificity of PPIs. A key difference between our approach and the approaches taken in these prior studies is that whereas the groups of Keating and Kinney correlated NGS data to apparent dissociation constants derived from flow analysis in the artificial yeast surface display context, we have used actual in vitro affinity measurements of purified proteins for normalization. We attribute the superior correlations between our K i predictions and actual in vitro measurements (R=0.76-0.98), over a much broader range than reported in prior studies, to our use of the more accurate solution K D values to normalize the NGS scores. Furthermore, if we restrict our analysis to the data generated in our own laboratory, thereby removing the variability introduced by incorporating data points from a previous report [53], the correlations obtained with our approach are even further improved (R values ranging from 0.87 to 0.98). This approach has enabled us to accurately model binding affinities extending to the far extremes of the K D range. An additional advantage of our new approach derives from its practical simplicity, as it requires only two sorting gates, and thus implementation should be straightforward for application of this approach to other systems in other laboratories.
One of the most important attributes of the quantitative binding landscapes is that they allow us to compare structurally similar PPIs with different binding affinities, for which the landscapes are significantly different in terms of average mutational effects. The quantitative binding landscapes for N-TIMP2/MMP-1 and N-TIMP2/MMP-3 illustrate the almost completely optimal nature of these two PPIs, since the binding landscapes are narrow (i.e., have small range of binding affinities) with many single mutations that can cause a large decrease in affinity and only a few single mutations that can cause affinity enhancement. For example, for N-TIMP2/MMP-1 (the highest affinity complex), most single mutations (91.1%) caused a loss of affinity (i.e., NGS affinity score ˂ 1), whereas only few single mutations increased (8.0%) or preserved (0.9%) the initial affinity of N-TIMP2 for MMP-1 (NGS affinity scores of > 1 or lie between -1 and +1, respectively).
In contrast, the quantitative binding landscape for N-TIMP2/MMP-14 is much broader (having the highest range of binding affinities), with single mutations enhancing (21.7%), reducing (17.4%) or not affecting (60.9%) affinity. Most importantly, the PPI for N TIMP2/ MMP-14 is far less well optimized than the PPIs for N-TIMP2/MMP-1 and N TIMP2/ MMP-3, for which there are larger proportions of affinity-reducing mutants (91.1% and 89.9% for MMP-1 and MMP-3, respectively). These differences between MMP-14, on the one hand, and MMP-1 and MMP-3, on the other, can be seen both in the experimentally measured affinities using the calibrant mutants and in the NGS affinity scores.
In addition, we confirmed experimentally that the NGS predictions for mutant H97R to be slightly affinity enhancing and for V71R and V71W to be affinity reducing were indeed correct. Similar agreement between the NGS predictions and K i experimental measurement was observed for T99M, T99Q and T99G mutants, with all of them being affinity reducing towards MMP-1, −2 and −14 except T99G mutant being affinity enhancing towards MMP-14. The results for specificity also showed similar correlations between NGS scores and experimentally obtained K i scores for all the mutants. For example, indications from the NGS scores that the mutants would be specific for a certain MMP were confirmed by the experimentally obtained enhancements in specificity for the different MMPs. These include N-TIMP2 T99M and N-TIMP2 T99Q obtaining enhancements in specificity for MMP-1 vs. MMP-3 and MMP-14, TIMP2 S68M and N-TIMP2 V71W obtaining enhancements in specificity for MMP-3 vs. MMP-1 and MMP-14, and TIMP2 H97R and N-TIMP2 T99G obtaining enhancements in specificity for MMP-14 vs. MMP-1 and MMP-3.
Since our setup uses sequential screens against three different targets, a comparison between affinity scores obtained from individual screens against each target enables the identification of specificity-enhancing mutations. This method can be easily applied for multiple target proteins by comparing the high-affinity library fractions screened against each target separately and calculating the paired specificity scores for each target pair. This type of information cannot be obtained from a competitive multi-target screening approach using multiple target proteins per single screen, where a library is screened against a target of interest (labeled with one type of fluorophore) versus a mixture of competitors (all labeled with a second, different of fluorophore) [65]. Such a competitive multi-target screen approach is useful only for a single primary target of interest but not when specificity between multiple target pairs is sought. A different approach utilizes single-step pairwise selective screens in which a library is sorted against two targets simultaneously for variants with differential selectivity toward each target in the pair [35]. Then, NGS is used to sequence these fractions and analyze them. However, this method, too, cannot be scaled up for multiple target proteins, as the selective screens are performed only against one pair at a time.
A limitation inherent in our platform is that very large changes in binding affinities may not be properly estimated for some PPIs, because the NGS-based scores are log 2 -linear with K i binding scores only within a specific range. A methodology to expand the range and possibly improve the calculated binding affinity would require sorting the mutants with the highest and the lowest affinities at different concentrations [66]. Yet another direction that is yet to be explored is the comprehensive investigation of sequences with more than one mutation [67,68], which remains technically very challenging and must therefore await progress in methodologies for library constructions and NGS analysis.
The above notwithstanding, our platform for mapping binding landscapes is sufficiently mature to facilitate the study of many additional PPIs with different functions and binding affinities. In particular, the platform is applicable for studies of multiple mutations, for which it can provide comprehensive information on PPI evolution. In future modeling and protein engineering studies, our approach will allow us to study how protein function influences the evolution of the binding interface sequences and how high-affinity and highspecificity PPIs differ from short-lived and unnatural PPIs. It will also facilitate the rational design of specific inhibitors for structurally similar target proteins.

Supplementary Material
Refer to Web version on PubMed Central for supplementary material. Our N-TIMP2 library contained variants with a single-site random mutation in each clone. (B) Following library preparation, transformation, and expression in the yeast surface display system, the N-TIMP2 library was sequentially screened against three MMP targets (denoted MMP-A, MMP-B and MMP-C) to obtain populations with high and low affinities for each target. The DNA from the sorted library fractions was extracted and subjected to next-generation sequencing (NGS) to determine the frequency of each variant in each subpopulation. NGS affinity and specificity scores were calculated from the collected NGS data, providing insights into the contribution of each amino acid substitution in the N-TIMP2 binding interface to the target binding affinity and specificity, relative to wild-type N-TIMP2. (C) Calculated NGS fitness scores were used to construct qualitative fitness landscapes for each MMP target. As the first step to obtaining quantitative fitness landscapes, several mutants (shown in dotted yellow circles) were chosen for empirical assessment, with focus on mutants that spanned the entire fitness scoring spectrum so as to obtain an optimal representation of the library's diversity. Following the completion of complementary functional assays with the purified mutants, K i affinity and specificity scores were calculated and compared to their corresponding NGS scores. (D) The NGS and K i scores were fitted to linear regression models. The resulting regression equations were then used to 'scale' the entire NGS data set through interpolation and to predict the effect of single amino acid substitutions on target affinity and specificity. (E) The scaled scores were finally summarized into definitive binding landscape heat maps, which facilitated the quantitative visualization of the predicted effects of amino acid substitutions on target binding affinity and specificity and the identification of hot and cold spots and specificity switch residues in the N-TIMP2/MMP binding interface. The color scale bars on the right of panels C and E range from blue, representing increases, to red, representing decreases, in the estimated affinity or specificity relative to N-TIMP2 WT . Grey indicates cases for which there was insufficient data to determine the score. This lack of data was a result of a data point being excluded from the analysis if the mutant's frequency was below the minimum frequency threshold in more than one subpopulation, as explained in the Results section under Scaling of NGS scores by linear regression vs. K i scores.  Scaled quantitative N-TIMP2 binding landscapes for (A) MMP-1, (B) MMP-3, and (C) MMP-14, which show the predicted effects of single amino acid substitutions on target affinity and specificity, as determined from the scaling regression equations. The color scale bars on the right of each panel range from blue, representing increases, to red, representing decreases, in the estimated affinity or specificity relative to N-TIMP2 WT . Grey indicates cases for which there was insufficient data to determine the score. This lack of data was a result of a data point being excluded from the analysis if the mutant's frequency was below the minimum frequency threshold in more than one subpopulation, as explained in the Results section under Scaling of NGS scores by linear regression vs. K i scores. Wild-type residues are shown as black letters inside the matrix; the substituting amino acid is shown on the X-axis; and the identity and position of the original substituted amino acid are shown in the Y-axis. ** K i fold represents the ratio between the K i value of the wild-type and the K i of the mutant.
ND -Could not be determined.