Proteome-wide Detection of Abl1 SH3-binding Peptides by Integrating Computational Prediction and Peptide Microarray

Protein-protein interactions are essential for regulating almost all aspects of cellular functions. Many of these interactions are mediated by weak and transient protein domain-peptide binding, but they are often under-represented in high throughput screening of protein-protein interactions using techniques such as yeast two-hybrid and mass spectrometry. On the other hand, computational predictions and in vitro binding assays are valuable in providing clues of in vivo interactions. We present here a systematic approach that integrates computer modeling and a peptide microarray technology to identify binding peptides of the SH3 domain of the tyrosine kinase Abl1 in the human proteome. Our study provides a comprehensive list of candidate interacting partners for the Abl1 protein, among which the presence of numerous methyltransferases and RNA splicing proteins may suggest a novel function of Abl1 in chromatin remodeling and RNA processing. This study illustrates a powerful approach for integrating computational and experimental methods to detect protein interactions mediated by domain-peptide recognition.

The human tyrosine kinase Abl1 plays important roles in signal transduction and interacts with a broad variety of cellular proteins (1,2). Fusion of the BCR (breakpoint cluster region) and ABL genes forms the oncogenic BCR-ABL, which is causally linked to certain human leukemias (3). The Abl1 protein consists of several modular domains, including one SH3 and one SH2 domain that form intra- and intermolecular interactions to regulate its kinase activity (1,2). Although many interacting partners of Abl1, such as Abi1, Abi2, and Rin1, have been reported (2), proteome-wide identification of those mediated by the weak and transient SH3-peptide interaction is still incomplete and remains a challenging problem.
High throughput technologies such as yeast two-hybrid and protein complex purification in conjunction with mass spectrometry have greatly facilitated the identification of protein-protein interactions. However, because they are not optimally designed to detect the weak and transient domain-peptide interactions, such interactions are often under-represented in proteomic screenings (4,5). On the other hand, in vitro binding assays such as phage display, peptide array, and protein array are often employed to determine the binding specificity of modular domains. Each of these technologies has advantages and limitations. Phage display and peptide array are complementary techniques: the former can test binding of a huge number of random peptides against a given domain, whereas the latter can determine binding of a set of specific peptides of interest. Compared with these two techniques, protein array often presents a relatively smaller number of proteins on a surface to test binding of synthesized peptides or purified proteins. Computational analysis is critical to these in vivo and in vitro experimental measurements because it removes false positives and recovers false negatives in the in vivo data and, using binding motifs determined by in vitro experiments, predicts proteome-wide interacting partners to guide further experimental investigation.
We present here an integrative approach to identify interacting partners of Abl1 in the human proteome: computational predictions were made by combining multiple sources of data, including structural information, energetic pattern of SH3-peptide interactions, and conservation during evolution; the predicted interactions were then tested using a peptide microarray technology. Different from the popular SPOT technology that synthesizes peptides on a cellulose membrane (6), we printed presynthesized and quality-controlled peptides on a glass surface to hybridize with the purified Abl1 SH3 domain, which confirmed 237 predicted interactions. Proteome-wide identification of putative interacting partners of Abl1 mediated by the SH3-peptide interactions may reveal novel functions of Abl1 in chromatin remodeling and RNA processing. It will also greatly enhance our understanding of the regulatory mechanism of Abl1.

Computational Prediction of Abl SH3-binding Peptides

Database Screening Using a Position-specific Scoring Matrix Derived from Free Energy Calculations
All 10-residue-long peptides in the UniProt database (7) were scored using the position-specific scoring matrix (PSSM) that we developed in a previous study (8). The PSSM was a 10 × 20 matrix that represented the difference in binding free energies between the mutated peptides and the template peptide APSYSPPPPP (see details in Ref. 8). Briefly, we calculated the binding free energy of the template peptide based on the crystal structure (Protein Data Bank entry 1bbz) using molecular mechanics/Poisson-Boltzmann surface area (MM/PBSA). We then mutated each residue in the peptide to each of the other 19 amino acids and calculated the change in binding free energy. This change reflected the preference of an amino acid at each peptide residue and was encoded in a 10 × 20 matrix to use as a PSSM to score peptide sequences. The peptide score was computed as the sum of the position-specific scores,

$$\mathrm{Score} = \sum_{i=1}^{10} M_{S_i,i}$$

where $M_{S,i}$ was the score of the amino acid $S$ at the $i$th position in the PSSM, and $S_i$ was the amino acid at the $i$th position of the peptide. The top 10,000 peptides with a PXXP motif (where X represents any amino acid) were saved for further analyses.
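The scoring step can be sketched as follows. This is a minimal illustration, not the authors' code: the toy matrix built by `make_toy_pssm` is a hypothetical placeholder (0 for the template residue, 1 otherwise) rather than the MM/PBSA-derived values of Ref. 8, and the function names are assumptions. Smaller scores are treated as more favorable.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PEPTIDE_LENGTH = 10
TEMPLATE = "APSYSPPPPP"

# PXXP: proline, any two residues, proline, anywhere in the window.
PXXP = re.compile(r"P..P")


def make_toy_pssm():
    """Hypothetical 10 x 20 matrix: 0 for the template residue, 1 otherwise."""
    return [
        {aa: (0.0 if aa == TEMPLATE[i] else 1.0) for aa in AMINO_ACIDS}
        for i in range(PEPTIDE_LENGTH)
    ]


def score_peptide(peptide, pssm):
    """Sum the position-specific score of the residue at each position."""
    return sum(pssm[i][aa] for i, aa in enumerate(peptide))


def scan_protein(sequence, pssm):
    """Score every 10-residue window that contains a PXXP motif."""
    hits = []
    for start in range(len(sequence) - PEPTIDE_LENGTH + 1):
        window = sequence[start:start + PEPTIDE_LENGTH]
        if PXXP.search(window) and all(aa in AMINO_ACIDS for aa in window):
            hits.append((score_peptide(window, pssm), start, window))
    return sorted(hits)  # most favorable (smallest) score first


pssm = make_toy_pssm()
hits = scan_protein("GGAPSYSPPPPPGG", pssm)
```

Ranking all windows from all proteins by this score and keeping the best 10,000 would reproduce the selection described above.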

Conservation Analysis across Seven Species
The conservation analysis was conducted on seven species: Homo sapiens (human), Pan troglodytes (chimpanzee), Macaca mulatta (rhesus macaque), Mus musculus (mouse), Rattus norvegicus (rat), Canis familiaris (dog), and Bos taurus (cow). The protein sequences of the seven species were taken from the nonredundant protein sequence database at the NCBI BLAST server (http://blast.ncbi.nlm.nih.gov). For each protein containing the putative binding peptide, the best match found by PSI-BLAST (33) in each species was considered a homologue if its E-value was $< 10^{-10}$; otherwise, it was not a homologue. The human protein and its homologues were then aligned using ClustalW 1.7 (34). Next, for the 10-residue-long putative binding peptide, we calculated a pairwise similarity score between the human peptide and the aligned peptide in each of the other species, where $S_{A_i B_i}$ was the amino acid similarity score in the PAM500 mutation matrix between residue $A$ at the $i$th position in the human peptide and residue $B$ at the $i$th position in the other species. If a gap was found in the corresponding peptide in any species, this peptide was not considered a homologue. If no homologue was found, the conservation analysis was not informative, and we set the sequence similarity to 1.0. The peptide was considered conserved only if the pairwise similarity was equal to or larger than 0.9. In addition, a human peptide was included in our list only if it was conserved in at least four other species.
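The decision rule of the conservation filter can be illustrated with a short sketch. Several points here are assumptions rather than the published procedure: the similarity is normalized as the summed score of the aligned pair divided by the summed self-score of the human peptide, `toy_score` is a hypothetical identity matrix standing in for the PAM matrix, and a gapped peptide is simply not counted as conserved.

```python
MIN_SIMILARITY = 0.9
MIN_CONSERVED_SPECIES = 4


def toy_score(a, b):
    """Hypothetical stand-in for a substitution matrix: 1 if identical, else 0."""
    return 1.0 if a == b else 0.0


def pairwise_similarity(human, other, score=toy_score):
    """Assumed normalization: sum S(A_i, B_i) / sum S(A_i, A_i)."""
    if other is None:
        return 1.0   # no homologue found: uninformative, set to 1.0
    if "-" in other:
        return None  # gap in the aligned peptide: not counted as conserved
    observed = sum(score(a, b) for a, b in zip(human, other))
    maximum = sum(score(a, a) for a in human)
    return observed / maximum


def is_conserved(human, aligned_peptides):
    """aligned_peptides maps species name to aligned peptide (or None)."""
    conserved = 0
    for peptide in aligned_peptides.values():
        similarity = pairwise_similarity(human, peptide)
        if similarity is not None and similarity >= MIN_SIMILARITY:
            conserved += 1
    return conserved >= MIN_CONSERVED_SPECIES


example = {
    "chimpanzee": "APSYSPPPPP",
    "macaque": "APSYSPPPPP",
    "mouse": "APSYSPPPPA",   # one mismatch, similarity 0.9
    "rat": None,             # no homologue
    "dog": "APSY-PPPPP",     # gap
    "cow": "GGGGGPPPPP",     # five mismatches, similarity 0.5
}
```

With the example above, four species pass the 0.9 threshold, so the human peptide would be retained.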

Distinguishing Binders and Nonbinders Using the MIEC-SVM Model
Considering the noise in free energy calculation that might affect the accuracy of the PSSM, we used the molecular interaction energy component (MIEC)-support vector machine (SVM) model developed in our previous study (12) to classify all conserved peptides into binder and nonbinder groups.
Modeling the Complexes for All Conserved Peptides-The crystal structure of the peptide APSYSPPPPP in complex with the Abl1 SH3 domain (Protein Data Bank entry 1bbz) was used as the initial template to model complexes containing the other peptides. The peptide in the template was systematically mutated to another peptide using the scap program (35). Because of the large number of peptides under consideration, we only minimized each modeled structure using the sander program in AMBER9.0 (36) and the AMBER03 force field (37). The solvent effect was taken into account using the generalized Born (GB) model (igb = 2) implemented in sander (38). The maximum number of minimization steps was set to 4,000, and the convergence criterion for the root mean square of the Cartesian elements of the energy gradient was 0.05 kcal/mol/Å. The first 500 steps were performed with the steepest descent algorithm, and the rest of the steps were performed with the conjugate gradient algorithm.
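The minimization settings described above map onto standard sander control variables (`imin`, `maxcyc`, `ncyc`, `drms`, `igb`, `ntb`, `cut`). The sketch below writes such an input file from Python; it is an illustration only, and the nonbonded cutoff value (`cut=999.0`) and the title line are assumptions, because the text does not state them for the minimization step.

```python
def minimization_input(max_cycles=4000, steepest_descent_steps=500,
                       gb_model=2, gradient_rms=0.05, cutoff=999.0):
    """Build a sander control file for GB minimization.

    The cutoff default is an assumption, not a value from the paper.
    """
    lines = [
        "Minimization with generalized Born solvent",  # free-form title line
        " &cntrl",
        "  imin=1,",                                   # run a minimization
        f"  maxcyc={max_cycles},",                     # maximum number of steps
        f"  ncyc={steepest_descent_steps},",           # steepest descent first, then conjugate gradient
        f"  drms={gradient_rms},",                     # gradient RMS convergence, kcal/mol/A
        f"  igb={gb_model},",                          # generalized Born model
        "  ntb=0,",                                    # no periodic boundary
        f"  cut={cutoff},",
        " /",
        "",
    ]
    return "\n".join(lines)


control_text = minimization_input()
```

The returned string could be saved as, for example, `min.in` and passed to sander with `-i`.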
Calculating the Molecular Interaction Energy Components (MIECs) for Each Peptide-The MIECs for each residue-residue pair were computed using the molecular mechanics/generalized Born (MM/GB) protocol (11,12). The MIECs included: (a) electrostatic (Coulombic) interaction $\Delta E_{\mathrm{ele}}$, (b) van der Waals interaction $\Delta E_{\mathrm{vdw}}$, and (c) polar contribution to desolvation free energy $\Delta G_{\mathrm{GB}}$. The cutoff for calculating $\Delta E_{\mathrm{vdw}}$ and $\Delta E_{\mathrm{ele}}$ was set to 18.0 Å. A distance-independent interior dielectric constant of 1 was used to calculate $\Delta E_{\mathrm{ele}}$. The charges used in the GB calculation were taken from the AMBER03 force field (37), and other GB parameters were taken from Ref. 39. The values of interior and exterior dielectric constants in the GB calculation were set to 1 and 80, respectively. In addition, the MIECs for the 9 residue pairs between the adjacent residues in 10-residue-long peptides were also calculated to consider the conformational preference of the peptide. The molecular interaction energy component calculations, including read-in of the SH3-peptide complexes, definition of atom types in the GB calculation, and assignment of the force field parameters, were automatically carried out using the gleap program in AMBER10 (40).
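As an illustration of one energy component, the Coulombic term between two residues can be sketched as a sum over atom pairs. This is a simplified stand-in for the gleap calculation, not a reproduction of it: the atom representation (a charge plus Cartesian coordinates), the per-atom-pair application of the 18 Å cutoff, and the conversion constant of about 332.06 kcal·Å/(mol·e²) are assumptions of the sketch.

```python
import math

COULOMB_CONSTANT = 332.06  # approximate, kcal*A/(mol*e^2)


def coulomb_energy(residue_a, residue_b, dielectric=1.0, cutoff=18.0):
    """Pairwise Coulomb energy between two residues.

    Each residue is a list of (partial_charge, (x, y, z)) tuples.
    Atom pairs farther apart than the cutoff are ignored.
    """
    energy = 0.0
    for charge_a, position_a in residue_a:
        for charge_b, position_b in residue_b:
            distance = math.dist(position_a, position_b)
            if 0.0 < distance <= cutoff:
                energy += COULOMB_CONSTANT * charge_a * charge_b / (dielectric * distance)
    return energy


# Two hypothetical single-atom "residues" with opposite unit charges, 1 A apart.
cation = [(1.0, (0.0, 0.0, 0.0))]
anion = [(-1.0, (1.0, 0.0, 0.0))]
pair_energy = coulomb_energy(cation, anion)
```

Collecting such terms, together with the van der Waals and GB components, for every residue pair would give the feature vector that a classifier such as the MIEC-SVM takes as input.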
Prediction of Binding Peptides Using the MIEC-SVM Model-Finally, the MIEC-SVM model was used to classify each peptide into binder and nonbinder categories. The LIBSVM program (41) was used in the predictions.

Peptide Microarray Screening
The GST-tagged Abl SH3 domain fusion protein was purified as previously described (12). Protein concentration was determined using the Bradford assay (Bio-Rad). The purity of the fusion protein was checked by electrophoresis on SDS-PAGE gel and Coomassie Blue staining. The fusion protein was also subjected to electrophoresis on SDS-PAGE gel, followed by Western blot using a horseradish peroxidase-conjugated anti-GST antibody (Santa Cruz Biotechnology) and the SuperSignal West chemiluminescent substrate (Pierce).
All of the peptides were synthesized by Sigma-Aldrich. The peptides were printed onto glass slides in triplicate by ArrayIt Corporation. A buffer blank and a Cy3 marker were also printed in triplicate. The peptide arrays were blocked with the blocking buffer (5% nonfat dry milk, TBS, pH 8.0, 0.05% Tween 20) for 1 h at room temperature. Next, the peptide arrays were incubated overnight at 4°C with purified GST-Abl SH3 in the blocking buffer at a final concentration of 5 μM. After washing three times for 10 min with the TBST buffer (TBS, pH 8.0, 0.05% Tween 20), an anti-GST antibody (Santa Cruz Biotechnology) was added to a final concentration of 0.2 μg/ml in the blocking buffer for 1 h at room temperature. The arrays were then washed three times for 10 min with the TBST buffer. Finally, the arrays were incubated with the secondary antibody, Cy3-conjugated goat-anti-mouse IgG (H+L) (Jackson ImmunoResearch), for 1 h at room temperature, followed by washing three times for 10 min with the TBST buffer. As a control, the peptide array was incubated with the anti-GST and the secondary antibodies alone.
The abbreviations used are: PSSM, position-specific scoring matrix; MM/PBSA, molecular mechanics/Poisson-Boltzmann surface area; GB, generalized Born; MIEC, molecular interaction energy component; SVM, support vector machine.

Analysis of Peptide Microarray Results
The peptide microarray was processed by the InnoScan 710 laser scanner (ArrayIt Corporation). Cy3 fluorescence was detected using the 532-nm laser at 3-μm resolution. The resulting microarray image was analyzed by the microarray image processing software Mapix 2.8.2 (ArrayIt Corporation). The fluorescent intensity of a microarray spot was defined as its own intensity minus the background intensity around it, as determined from the scanned image.
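The spot intensity definition above, combined with averaging over replicate spots, amounts to the following small calculation. The numbers are made up for illustration, and the function names are not taken from Mapix.

```python
from statistics import mean


def spot_intensity(foreground, local_background):
    """Background-corrected intensity of a single spot."""
    return foreground - local_background


def peptide_intensity(replicate_spots):
    """Average the corrected intensity over replicate spots.

    replicate_spots is a list of (foreground, local_background) pairs.
    """
    return mean(spot_intensity(fg, bg) for fg, bg in replicate_spots)


# Hypothetical triplicate readings for one peptide.
triplicate = [(1200, 200), (1100, 150), (1300, 250)]
average_signal = peptide_intensity(triplicate)
```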

A Pipeline to Predict Protein-Peptide Interactions in the Human Proteome
To comprehensively identify interacting partners of the Abl1 protein, we exploited a systematic searching strategy (Fig. 1). We first scored all 69,404 peptides in the human proteome that contain the PXXP motif in the UniProt database (7) using a scoring matrix. This scoring matrix was generated using the virtual mutagenesis method that we previously developed (see details in Ref. 8). Briefly, each amino acid of the peptide was mutated to the other 19 amino acids, and the mutated complex structure was optimized using 2 nanoseconds (ns) molecular dynamics simulations. The difference in SH3-peptide binding free energy between the wild type and mutated peptides was calculated using the MM/PBSA method (9,10) to represent how much an amino acid was preferred at each peptide position.
Next, we focused on analyzing the top 10,000 peptides ranked by the scoring matrix and used conservation information to remove false positives. Because of functional constraints, the binding peptides of the Abl1 SH3 domain should be more conserved across species than nonbinding peptides. For each protein that contains at least one putative binder, we generated multiple sequence alignment across seven proteomes, including H. sapiens (human), P. troglodytes (chimpanzee), M. mulatta (rhesus macaque), M. musculus (mouse), R. norvegicus (rat), C. familiaris (dog), and B. taurus (cow). Because the human Abl1 SH3 domain is completely conserved across these seven species, it is reasonable to assume that its interacting partners are also highly conserved. Therefore, we removed those nonconserved peptides from the candidate list. After applying the conservation filter, approximately half of the peptides were removed, and 4981 peptides were left for further analysis.
Our goal was to find the most confident interacting partners of the Abl1 SH3 domain, and we used the following procedure to further remove false positives. As illustrated in our previous studies (11,12), we modeled each of these 4981 peptides in complex with the Abl1 SH3 domain and calculated its MIECs, which reflect the energetic characteristics of the binding. Using the MIEC-SVM previously trained on 18 SH3 domains (12), we classified each peptide into the binder or nonbinder category. After this round of filtering, 1394 peptides in 714 proteins were predicted as binders of the Abl1 SH3 domain.
Among the top 10 predicted peptides shown in Table I, five were from proteins known to interact with the Abl1 SH3 domain (in bold), including EVL, which binds to the SH3 domain of Abl1. Considering how difficult such proteome-wide prediction is, a 50% (5 of 10) rate of identifying known interacting partners demonstrated a satisfactory performance of our approach. On the other hand, the remaining peptides may be new binders that have not been reported. To further refine our predictions, we conducted peptide microarray experiments on the predicted binding peptides to find the most reliable binders of the Abl1 SH3 domain.
FIG. 1. The numbers on the right indicate the numbers of peptides remaining after the step on the left. We started with 69,404 peptides containing the PXXP motif in the human proteome. All of the peptides were ranked by the PSSM determined in Ref. 8. This PSSM was generated by mutating the peptide residue at each position to all of the other 19 amino acids. Binding free energy difference between the wild type and mutated peptides was calculated to represent the preference of amino acids at a specific peptide position. The top 10,000 peptides ranked by the PSSM were selected for further analysis. Because the Abl1 SH3 domain is completely conserved in seven species, it is reasonable to hypothesize that the binding peptides are likely to be conserved as well. We thus applied a conservation filter to identify the most reliable interacting partners of Abl1, and 4,981 peptides passed this filter. Then a classification model called MIEC-SVM was applied to remove false positives. This MIEC-SVM model was based on energetic characterization of the SH3-peptide interaction interface and was trained on 18 SH3 domains in a previous study (12). The MIEC-SVM model predicted 1,394 peptides as binders. We synthesized the top 700 of the 1,394 peptides (ranked by the PSSM) and printed them on the microarray. A total of 237 peptides showed significant binding intensity and were considered binding partners of the Abl1 SH3 domain.
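The filtering funnel can be summarized numerically. The sketch below uses only the peptide counts reported in this section and computes the fraction retained at each step; the step labels are shorthand introduced here.

```python
PIPELINE = [
    ("PXXP-containing peptides", 69404),
    ("top-ranked by PSSM", 10000),
    ("passed conservation filter", 4981),
    ("predicted binders (MIEC-SVM)", 1394),
    ("printed on microarray", 700),
    ("confirmed on microarray", 237),
]


def retention(steps):
    """Return (step name, count, fraction kept relative to the previous step)."""
    return [
        (name, count, count / previous)
        for (_, previous), (name, count) in zip(steps, steps[1:])
    ]


funnel = retention(PIPELINE)
for name, count, fraction in funnel:
    print(f"{name}: {count} ({fraction:.1%} of previous step)")
```

The conservation filter keeps 49.8% of the top-ranked peptides, consistent with "approximately half" being removed, and about one third of the printed peptides (237 of 700) were confirmed.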

Determination of Protein-Peptide Binding Using Peptide Microarray
Peptide array provides a much higher throughput measurement of protein-peptide binding than in-solution assays. Unlike the phage display method, peptide sequences on the array can be specified, which is crucial for confirming computational predictions. Numerous peptide array platforms are available, and a popular one is the SPOT technology, which synthesizes peptides directly on a cellulose membrane (6). Despite its wide usage, this technology has several limitations. One of them is the lack of quality control of the peptides because the peptides are directly synthesized onto the membrane. Another is that the amount of peptide on the membrane is relatively large compared with other platforms, and therefore more reagents are needed. To overcome these limitations, we chose a new platform in which peptides were presynthesized, quality-controlled, and then printed onto a glass surface in a way similar to the DNA/RNA oligonucleotide microarray. The protein was overlaid onto the microarray, and protein-peptide binding was quantified by measuring fluorescence intensity.

Design of the Peptide Microarray and Control Experiments
To print peptide microarrays, we needed to decide how to immobilize the peptides on the glass surface, as well as the amount of peptide printed. Therefore, we first selected 10 peptides to test the conditions. These peptides included four known binders of the Abl1 SH3 domain (13,14) and six nonbinders (including four peptides that bind to other SH3 domains, but not Abl1 (13,14), and two random peptides without the PXXP motif). Just as in our previous study (12), each peptide was 10 amino acids long, and two alanines were added at each end. A linker, either aminohexanoic acid or polyethylene glycol, was added at the N terminus. The peptides were immobilized to the glass surface via the N terminus. We tested two spot sizes, 150 and 600 μm in diameter, which corresponded to 1 and 5 nl of 0.3 mg/ml peptide solution dispensed to the array, respectively. Each printing amount and linker was tested in triplicate. To measure the signal intensity more quantitatively, we used fluorescence and a laser scanner to detect the binding of the SH3 domain to the peptides, instead of the chemiluminescence and film used in our previous study (12). Briefly, GST-Abl SH3 fusion protein was purified and characterized as described before (12) (data not shown). Purified protein was overlaid onto the array, followed by incubation with an anti-GST antibody and then a Cy3-conjugated secondary antibody. Binding of GST-Abl SH3 to peptides generated a Cy3 fluorescent signal that was recorded on a laser scanner (supplemental Table S1). The fluorescence intensity for each peptide was an average of triplicates. Fig. 2 shows that the peptide microarray correctly distinguished binders from nonbinders. By quantifying signal from each spot, we found that there was no difference in signal intensity using either linker. As expected, the large spots generated much stronger overall signal than the small ones. However, the signal intensity was more even, and the signal variation among triplicates was smaller, in small spots than in large spots. Hence, we chose the lower cost aminohexanoic acid linker and 150-μm spots for the rest of the experiments. As a control, we probed the array with anti-GST and the secondary antibodies alone, and only basal levels of fluorescence were recorded (data not shown), demonstrating that there was no nonspecific binding of these antibodies to the array.

Analysis of the Peptide Microarray Data
In the production run, limited by the synthesis cost, we synthesized and printed the top 700 predicted binding peptides. We also included 50 binders (including 38 with previously measured $K_d$ values (13)) and 50 nonbinders on the array as positive and negative controls, respectively (supplemental Tables S2 and S3). The nonbinders were randomly selected from the proteome and are unlikely to interact with the Abl1 SH3 domain. Each peptide was printed in triplicate at two concentrations, 1 and 0.5 mg/ml. The production array was probed the same way as the test run (Fig. 3). Similarly, we did a control experiment with just the antibodies and detected only basal levels of fluorescence (data not shown).
a … (2); WASF1, WBP7, SHAN3, and WASP proteins are known to interact with the Abl protein (in italic bold type) (42).
b The score was calculated based on the PSSM reported in our previous study (8). This PSSM was generated by mutating the peptide residue at each position to all of the 19 other amino acids. Binding free energy difference between the wild type and mutated peptides was calculated to represent the preference of amino acids at a specific peptide position. The total score of the peptide was a sum of the score at each peptide position.
To analyze the microarray data, we first removed the spots that were flagged as noisy. For the high (1 mg/ml) and low (0.5 mg/ml) printing concentrations, 724 and 767 (of 800) spots remained for later analyses, respectively. Next, we determined the statistical cutoff for significant fluorescence signal by modeling the background intensity distribution using a gamma distribution (Fig. 4). We binned the spots based on the fluorescence intensity. To obtain a better curve fit, we removed the long tail of the histogram, which represented strongly binding peptides, and kept only the bins with intensity less than 900 and 1500 for the high and low concentrations, respectively. The model fitting and p value calculations were conducted using R, and a p value of 0.05 was used as the cutoff.
FIG. 2. Binding of the Abl1 SH3 domain to 10 testing peptides on the microarray. Ten peptides included four known binders and six nonbinders (including four peptides that bind to other SH3 domains, but not Abl1, and two peptides without the PXXP motif). The upper left panel shows the fluorescence signals of the tested peptides on two microarrays with two printing amounts of the peptides. The bottom left panel shows the peptide positions on the microarray (black, small printing spots; red, large printing spots). We tested two printing sizes of the peptides on the microarray and two linkers (aminohexanoic acid (Ahx) and polyethylene glycol (PEG)) at the N terminus of the peptides that were attached to the glass surface (right panel).
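The idea behind the gamma-distribution cutoff can be sketched in plain Python. This is not the R procedure used in the study: the fit here uses the method of moments on raw background values rather than a curve fit to the binned histogram, and the function names and example intensities are hypothetical. The upper-tail probability is computed with the standard series and continued-fraction expansions of the regularized incomplete gamma function.

```python
import math
from statistics import mean, pvariance


def fit_gamma_moments(values):
    """Method-of-moments estimate of (shape, scale); a swap-in for the R fit."""
    m, v = mean(values), pvariance(values)
    return m * m / v, v / m


def gamma_upper_tail(x, shape, scale):
    """P(X >= x) for a gamma distribution with the given shape and scale."""
    if x <= 0:
        return 1.0
    z = x / scale
    log_prefactor = shape * math.log(z) - z - math.lgamma(shape)
    if z < shape + 1.0:
        # Series expansion of the lower regularized incomplete gamma function.
        term = total = 1.0 / shape
        n = 1
        while abs(term) > 1e-16 * abs(total) and n < 100000:
            term *= z / (shape + n)
            total += term
            n += 1
        return max(0.0, 1.0 - total * math.exp(log_prefactor))
    # Continued fraction (modified Lentz) for the upper tail.
    tiny = 1e-300
    b = z + 1.0 - shape
    c, d = 1.0 / tiny, 1.0 / b
    h = d
    for i in range(1, 100000):
        an = -i * (i - shape)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-14:
            break
    return math.exp(log_prefactor) * h


def significant_spots(intensities, background, alpha=0.05):
    """Keep intensities whose upper-tail p value under the background fit is below alpha."""
    shape, scale = fit_gamma_moments(background)
    return [x for x in intensities if gamma_upper_tail(x, shape, scale) < alpha]


background = [100, 120, 90, 110, 95, 105, 130, 85]   # hypothetical values
called = significant_spots([100, 5000], background)
```

Truncating the high-intensity tail before fitting, as described above, keeps strong binders from inflating the estimated background variance.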
After establishing the parameters, we first examined the 100 control peptides and found that 40 of the 50 positive controls exhibited a strong signal and 48 of the 50 negative controls displayed only basal signal (Fig. 5). Therefore, our production array showed a satisfactory performance with a sensitivity of 0.80 (40 of 50) and a specificity of 0.96 (48 of 50). It is worth noting that we resynthesized the 10 false negative peptides and tested their binding to the Abl1 SH3 domain using dot blot, but only one of these 10 peptides showed a signal (data not shown). Although these 10 peptides were reported to be binders, our experiments could not confirm binding for nine of them; if these nine are excluded, the true sensitivity of our peptide microarray could be 0.98 (40 of 41).
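The performance figures quoted above follow directly from the control counts. A short check, using only the numbers given in the text:

```python
def rate(hits, total):
    """Fraction of correctly classified control peptides."""
    return hits / total


sensitivity = rate(40, 50)           # positive controls with a strong signal
specificity = rate(48, 50)           # negative controls with basal signal
adjusted_sensitivity = rate(40, 41)  # excluding nine binders not confirmed by dot blot

print(f"sensitivity = {sensitivity:.2f}")
print(f"specificity = {specificity:.2f}")
print(f"adjusted sensitivity = {adjusted_sensitivity:.2f}")
```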
We then checked the 700 predicted binders and identified 237 nonredundant peptides from 158 proteins as putative binders, including 220 and 175 from high and low concentrations, respectively (supplemental Table S4). When examining the PSSM scores of the 237 binding peptides and the remaining nonbinding peptides, we found that the binders have more favorable (smaller) PSSM scores than the nonbinders, although there is significant overlap between the two distributions (supplemental Fig. S1). This figure illustrates that the PSSM provides separation between binders and nonbinders to some extent.
Among the identified binding peptides, we found 13 known binding partners of the Abl1 SH3 domain, as well as two paralogs of the known binding partners (2) (Table II) … known substrate and interacting partner of Abl1, WASF3 (16). The interaction between Abl1 and its substrates is often mediated by the SH3 domain (as in the case of the six substrates mentioned above). It is also known that the interactions between WASF family proteins and Abl1 are mediated by the SH3 domain (16). Therefore, WASF3 is likely to be a true interacting partner of the Abl1 SH3 domain. It is worth noting that there are numerous databases that document many partners of the Abl1 SH3 domain, but we only used the curated list in Ref. 2 as the gold standard for known interacting partners, which might miss quite a few true positives (like SHIP2). Given the challenge of proteome-wide identification of domain-peptide interactions, an 8.2% (13 of 158) rate of retrieving the curated interacting partners in this study was quite satisfactory, whereas the other confirmed peptides may well be unknown binders.
We also compared our approach with another peptide array-based (SPOT array) method for detecting binding peptides of the Abl1 SH3 domain: only two known interacting partners (M4K1 and DYN2), three paralogs of known interacting partners (SOS2, ABR, and EFS), and one known substrate (SYNJ2) listed in Ref. 2 were discovered in the study of Wu et al. (17) (a recovery rate of 2.5%, 2 of 81). The superior performance of our method demonstrates the power of integrating computational predictions and peptide microarray in identifying domain-peptide interactions in the human proteome.

Putative Novel Functions of Abl1 in Chromatin Modification and RNA Processing
The identified Abl1 binding peptides came from 158 proteins. To illustrate their functions, we performed enrichment analysis of their gene ontology (18) annotations using the DAVID software package (19) with the default parameters (Table III). It is unsurprising that many proteins are related to actin cytoskeleton function, in which Abl1 is well known to be involved. Unexpectedly, we found putative binding partners that are involved in chromatin modification, RNA processing, transcriptional regulation, and apoptosis (Table III). Interactions with these proteins suggested possible new functions of Abl1. We highlighted several interesting groups of putative interacting partners of Abl1 in Table IV.
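Enrichment analyses of this kind are usually based on an over-representation test. The sketch below shows a plain hypergeometric upper-tail test as a generic illustration; it is not DAVID's own scoring (DAVID reports a modified Fisher exact statistic), and the background size and annotation counts in the example are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(population, annotated, sample, hits):
    """P(at least `hits` annotated proteins in a random sample).

    population: proteins in the background
    annotated:  background proteins carrying the annotation
    sample:     proteins in the list of interest
    hits:       annotated proteins observed in that list
    """
    total = comb(population, sample)
    upper = min(annotated, sample)
    favorable = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return favorable / total


# Hypothetical example: 158 proteins drawn from a background of 20,000,
# where 300 background proteins carry the term and 12 of the 158 do.
p_value = hypergeom_upper_tail(20000, 300, 158, 12)
```

A small p value indicates that the term occurs in the list more often than expected by chance; correction for testing many terms would still be needed.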
Chromatin Remodeling Enzymes-Several methyltransferases are critical for transcription, for example, SET1A, SET1B, and WBP7 methylate Lys-4 of histone protein H3. Mono-, di-, and trimethylation of H3K4 (represented as H3K4me1, H3K4me2, and H3K4me3, respectively) are known to mark active promoters and enhancers (20). In addition to our study, WBP7 was also found to interact with Abl1 SH3 domain in Ref. 17, although the function of this interaction is unclear.
Our analysis revealed the binding of Abl1 to SETD2, a histone methyltransferase that methylates H3K36, which is a mark for transcription elongation and splicing (20). The SETD2 peptides (amino acids 185-194 and 187-196) are located in a proline-rich region. Because this region does …
Another identified interacting partner of the Abl1 SH3 domain, DOT1, is a histone methyltransferase that methylates H3K79. H3K79me2 is a histone modification mark for transcribed regions (20). DOT1 also interacts with MLLT10, a myeloid/lymphoid or mixed lineage leukemia protein. Confirmation of the interaction between Abl1 and MLLT10 and illustration of its functional importance await future experimental tests.
Our analysis also revealed that Abl1 recognized peptides in JMJD3, a demethylase that demethylates repressive histone mark of H3K27me3 (20), as well as in EP300, an acetyltransferase that is a co-factor often located in enhancers (20). The other identified interacting partners involved in chromatin structure remodeling include a putative polycomb group protein, ASXL3 (21), and a chromatin modifier, CHD8 (22).
Additionally, we found that Abl1 interacted with two methyl-CpG binding domain-containing proteins, MBD5 and MBD6. The functions of these two proteins are still largely unknown. Given the potential functions of Abl1 in the modification of chromatin structure, it would not be surprising if it does play roles in DNA methylation because there exists interplay between histone modifications and DNA methylation (23,24).
Transcription and Splicing Complexes-In addition to these putative interactions between Abl1 and chromatin modification enzymes, our work showed that the Abl1 SH3 domain also directly interacted with the transcription machinery, including TAF1, the largest component of the TFIID basal transcription factor complex (25), and MED19, a component of the Mediator complex and a co-activator involved in the regulated transcription of nearly all polymerase II-dependent genes (26). Another putative Abl1 binder is ACINU, a component of the splicing-dependent multiprotein exon junction complex deposited at splice junctions on mRNAs (27,28). We also identified many proteins involved in RNA splicing and mRNA processing (Table III), which suggested novel functions of Abl1 in transcription and splicing regulation.
Proteins Involved in Apoptosis-Two potential interacting proteins of Abl1 are known to be involved in apoptosis: PDCD7 promotes apoptosis when overexpressed (29), and ASPP2 regulates p53 by enhancing the DNA binding and transactivation function of p53 on the promoters of pro-apoptotic genes in vivo (30). In addition, CHD8, a chromatin modifier and putative Abl binder mentioned above, can suppress p53-mediated apoptosis by recruiting histone H1 and preventing p53 transactivation activity (31).

DISCUSSION AND CONCLUSION
The tyrosine kinase Abl1 is an important therapeutic target in hematopoietic malignancies. We present here a systematic approach of combining computational predictions and peptide microarray to identify proteins bound to Abl1 via the SH3 peptide recognition in the human proteome. A comparison with the documented Abl1-interacting proteins showed a satisfactory performance of this approach. Given the complexity of protein interactions in human and the importance of determining such interactions for therapeutic development, our approach holds great promise for proteome-wide search of binding partners of any disease-related protein. More excitingly, this kind of proteome-wide search may reveal unknown functions of these proteins, as demonstrated in our study. We found putative interactions between Abl1 and numerous chromatin remodeling enzymes and RNA processing proteins, suggesting novel functions of Abl1 in chromatin modification and RNA regulation.
Our computational approach integrates computer modeling and bioinformatics analysis that provide complementary information to reduce false positives. Virtual mutagenesis and MIEC-SVM calculations take into account conformational flexibility and energetic characteristics of the SH3-peptide interactions, whereas conservation analysis uses evolutionary information to remove noise in computer modeling. Thus, our approach better captures the structural and energetic features of protein recognition than pure bioinformatics analysis based on sequences only. Because the transient and weak domain-peptide interactions are hard to detect by immunoprecipitation in conjunction with mass spectrometry analysis, an in vitro binding assay such as peptide (micro)array is often the first step to identify candidate peptides. Because the cost of peptide synthesis is still quite high, it is too expensive to include all possible peptides in the (micro)array. The more accurate our computational methods are, the more domain-peptide interactions could be identified in the follow-up experiments, as reflected by the much higher rate of retrieving known interacting partners of the Abl1 SH3 domain listed in Ref. 2 in our study than in approaches without a computational component.
In this study, we employed a peptide microarray platform that prints peptides onto the glass surface. This technology uses much less peptide (on the order of picomoles) than the popular SPOT array (on the order of nanomoles). Consequently, it also needs less protein, antibodies, and other reagents than the SPOT array. In addition, peptide spots on the microarray are much smaller in size (150 μm in diameter) than those on the SPOT array (~3 mm in diameter), which allows high density printing on a very small surface. Hence screenings can be carried out in multiple replicates on a comprehensive scale and in a systematic way. The use of DNA/RNA oligonucleotide microarray slides in our assay also allows quantitative measurement by a microarray imager. Signals on peptide microarrays can be detected using fluorescence, chemiluminescence, or radioisotopes, whereas fluorescent dyes are somewhat problematic in SPOT arrays, because the synthetic arrays show some background fluorescence (32). The quality-controlled synthesis of peptides reduces the number of false negatives caused by missing or poor-quality peptide on the spot, which may occur in the SPOT array. The same batch of presynthesized peptides can be used in thousands of microarray experiments, as well as in various in-solution assays should there be a need. This not only saves cost but also removes one variable in data interpretation. The power of our peptide microarray in detecting the weak SH3-peptide binding was first demonstrated in our validation experiment with 10 peptides in the test run and 100 control peptides in the production run. It was further demonstrated by a satisfactory agreement between our identified binders of the Abl SH3 domain and the documented known binders. Furthermore, measuring fluorescence intensity provides a quantitative signal that can be analyzed using a statistical test to determine the binding peptides.