Comprehensive Analysis of Yeast Surface Displayed cDNA Library Selection Outputs by Exon Microarray to Identify Novel Protein-Ligand Interactions

Phosphatidylinositides are important signaling molecules that interact with a myriad of cellular proteins, many of which remain unidentified. We previously screened a yeast surface displayed human proteome library to identify protein fragments with affinity for the phosphatidylinositides, phosphatidylinositol-4,5-bisphosphate and phosphatidylinositol-3,4,5-trisphosphate. Much of the diversity in the screened selection outputs was represented by clones present at low frequencies, suggesting that a significant number of additional phosphatidylinositide-binding protein fragments might be present in the selection outputs. In the studies described in this report, we developed a novel cDNA library analysis method and comprehensively analyzed the polyclonal selection outputs from the phosphatidylinositol-4,5-bisphosphate and phosphatidylinositol-3,4,5-trisphosphate selections using a high-density exon microarray. In addition to the nine previously reported phosphatidylinositide-binding protein fragments, we identified 37 new phosphatidylinositide-binding candidates. Nine of 37 contain known phosphatidylinositide-binding domains, whereas the remaining 28 contain no known phosphatidylinositide-binding domain. We cloned and confirmed phosphatidylinositide binding by fluorescence-activated cell sorting for 17 of these novel candidate protein fragments. Our experiments suggest that phosphatidylinositide binding by these 17 novel protein fragments is dependent on both the inositol phosphate “headgroup” and the lipid “tail.” This is in contrast with the PH domain containing fragments we tested, for which the inositol phosphate headgroup was sufficient for binding. The novel PtdIns-binding fragments come from a wide variety of proteins, including splicing factors, transcription factors, a kinase, and a polymerase. Intriguingly, 11 of the phosphatidylinositide-binding protein fragments are from nuclear proteins, including four containing homeobox domains. 
We found that phosphatidylinositides and double-stranded DNA oligonucleotides derived from homeobox domain target sequences compete for binding to homeobox domain-containing protein fragments, suggesting a possible mechanism for phospholipid-dependent transcriptional regulation. FACS enrichment of target-binding clones in yeast human cDNA display libraries coupled with comprehensive analysis of the selection output by DNA microarray analysis is an effective method for investigating common as well as rare protein interactions. In particular, this method is well suited for the study of small molecule/protein and drug/protein interactions.

Molecular & Cellular Proteomics 10

Phosphatidylinositides are a minor species of phospholipids present in cellular membranes that function as important regulators of many processes, including signal transduction, ion channels and transporters, vesicle trafficking, cytoskeletal organization, and motility (1-5). In addition to cytosolic membrane phosphatidylinositides, an independent pool of nuclear phosphatidylinositides also exists. Although significantly less is known about the functions, mechanism of action, and physical form of nuclear phosphatidylinositides, they are known to play a role in transcription, pre-mRNA splicing, cell cycle regulation, differentiation, and chromatin organization (6-10).
The various biological functions of phosphatidylinositides are mediated through association with proteins containing phosphatidylinositide-binding globular domains, which bind to different species of phosphatidylinositides with varying degrees of affinity and specificity (11), or unstructured regions with clusters of basic and hydrophobic residues (12). The identification and characterization of additional protein domains or regions that bind to phosphatidylinositides is important for understanding the mechanisms by which they regulate cellular processes. In particular, the identification of nuclear phosphatidylinositide-binding proteins will be critical for increasing our understanding of how phosphatidylinositides regulate processes in the nucleus.
Generally speaking, identifying proteins that interact with small bioactive molecules such as phosphatidylinositides is an important, but often challenging and rate-limiting step in understanding cellular signaling pathways. Using yeast surface display techniques, heterologous protein fragments can be efficiently displayed on the surface of the Saccharomyces cerevisiae yeast cell as C-terminal fusions to the yeast a-agglutinin subunit, Aga2p (13). Yeast surface display technology has been extensively utilized for protein engineering (14,15). We have previously described the construction of large (2 × 10⁷) yeast surface-displayed human protein fragment libraries (16,17). When coupled with fluorescence-activated cell sorting (FACS), yeast surface-displayed human protein fragment libraries can theoretically be used to identify protein fragments with affinity for any soluble molecule that can be fluorescently detected. These libraries have been used to identify human protein fragments with affinity for tyrosine-phosphorylated peptides (16), tumor-targeting antibodies (18), and phosphatidylinositides (17).
We previously screened our yeast surface human cDNA display libraries for protein fragments with affinity for the phosphatidylinositides (PtdIns), phosphatidylinositol-4,5-bisphosphate (PtdIns(4,5)) and phosphatidylinositol-3,4,5-trisphosphate (PtdIns(3,4,5)), and recovered fragments from nine unique proteins. Eight of these fragments encoded known phosphatidylinositide-binding domains (seven pleckstrin homology (PH) domains and one phosphotyrosine-binding domain), demonstrating the effectiveness of the approach (17). Our analysis revealed that whereas a small number of clones dominated the selection outputs, there were nevertheless many binding clones present at low frequencies. This suggested to us that we did not capture the full diversity of the selection outputs with our original screening and sequencing, and that a significant number of additional phosphatidylinositide-binding clones might be present at a low frequency in the selection outputs. This is a general problem for all cDNA library selection-based ligand identification approaches, as monoclonal sequencing is inadequate to cover the full diversity in the selection output. In this report we describe a novel method to address this problem. We analyzed the total polyclonal selection outputs using high-density human exon microarrays, capturing the full diversity of the cDNA library selection output. In addition to the nine previously reported phosphatidylinositide-binding protein fragments, we identified 17 novel phosphatidylinositide-binding protein fragments, including 11 derived from nuclear proteins. Interestingly, in contrast with the PH domain-containing fragments we tested, for which the inositol phosphate "headgroup" was sufficient for binding, phosphatidylinositide binding by these novel protein fragments is dependent on the nature of the lipid "tail." Four of the nuclear phosphatidylinositide-binding protein fragments contain a homeobox domain.
We found that phosphatidylinositides and double-stranded DNA oligonucleotides derived from homeobox domain target sequences compete for binding to homeobox domain-containing protein fragments, suggesting a possible mechanism for phospholipid-dependent transcriptional regulation. Our results suggest that selection experiments with yeast human cDNA display libraries, coupled with comprehensive analysis of the outputs using DNA microarrays, are an effective method for investigating protein interactions.

EXPERIMENTAL PROCEDURES
Microarray Analysis-Plasmids were recovered from phosphatidylinositide-binding enriched pooled polyclonal yeast selection outputs using a modified QIAprep Spin Miniprep protocol incorporating a glass bead cell lysis step (Qiagen, Hilden, Germany). The isolated plasmid was transformed by electroporation into 10G Supreme bacteria (Lucigen, Middleton, WI) and plasmid DNA was purified from the pooled transformants using a Qiagen Maxi Prep kit (Qiagen). Biotinylated cRNA probes were made by in vitro transcription using XhoI- or NotI-linearized selection output plasmid as templates with the Ambion MAXIscript T7 kit according to the manufacturer's protocols (Ambion, Austin, TX). The biotinylated cRNA was fragmented by sonication and 5 μg was hybridized to an Affymetrix GeneChip® Human Exon 1.0 ST Array. The GeneChip Human Exon 1.0 ST Array has over 5 million unique 25-mer oligonucleotide probes grouped into 1.4 million probesets that represent over 1 million exon clusters (19). After processing according to the manufacturer's instructions, the data were collected using an Affymetrix GeneChip Scanner 3000 (Affymetrix, Santa Clara, CA). The raw exon microarray dataset was processed using Affymetrix Expression Console software (Affymetrix), and analyzed with Excel (Microsoft, Redmond, WA) and Integrated Genome Browser software (Affymetrix) to generate a list of phosphatidylinositide-binding candidates.
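As a rough illustration of how a candidate list could be derived from probeset-level intensities, the sketch below flags genes in which several adjacent exon probesets exceed a background cutoff, the pattern expected when a cDNA insert spanning consecutive exons is enriched. This is not the procedure used in this study, which relied on Expression Console, Excel, and Integrated Genome Browser; the gene labels, intensity values, the cutoff of 500, and the minimum run of three probesets are all hypothetical.

```python
from typing import Dict, List


def longest_run_above(intensities: List[float], cutoff: float) -> int:
    """Length of the longest stretch of consecutive probesets above cutoff."""
    best = current = 0
    for value in intensities:
        current = current + 1 if value > cutoff else 0
        best = max(best, current)
    return best


def call_candidates(probesets: Dict[str, List[float]],
                    cutoff: float = 500.0,  # hypothetical background cutoff
                    min_run: int = 3) -> List[str]:  # hypothetical run length
    """Return genes whose exon-ordered probesets contain a run of at least
    min_run consecutive values above cutoff."""
    return sorted(gene for gene, values in probesets.items()
                  if longest_run_above(values, cutoff) >= min_run)


# Made-up intensities, ordered 5' to 3' along each gene.
example = {
    "GENE_A": [40, 35, 900, 1200, 1100, 50],  # contiguous enriched region
    "GENE_B": [30, 800, 25, 950, 20, 40],     # isolated spikes only
    "GENE_C": [20, 25, 30, 22, 18, 27],       # background
}
print(call_candidates(example))
```

Requiring a contiguous run rather than a single high probeset is one simple way to reduce calls driven by cross-hybridizing probes; any real cutoff would need to be chosen from the array's own background distribution.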
Recovery of Candidate PtdIns-binding Clones-Internal primers for phosphatidylinositide-binding candidates were designed based on the exon microarray data and used along with vector-specific primers to PCR amplify fragments containing the 5′ and 3′ ends of the cDNA inserts, which were cloned into TA vector (Invitrogen, Carlsbad, CA) and sequenced to define the precise 5′ and 3′ cDNA insert boundaries and determine if the cDNA-encoded protein is in-frame with AGA2 in the pYD1 yeast display vector. To generate plasmids for yeast display, new primer sets based on the 5′ and 3′ boundaries of the cDNA inserts were designed with EcoRI overhangs and used to PCR amplify the cDNA inserts, which were cloned into pYD1 (Invitrogen) and verified by sequencing. The pYD1 plasmids containing the candidate cDNA inserts were transformed into the S. cerevisiae yeast strain EBY100 (Invitrogen) using the lithium acetate method and transformants were selected on SD-CAA medium (2% dextrose, 0.67% yeast nitrogen base, 0.5% casamino acids).
FACS Analysis-Candidate phosphatidylinositide-binding yeast clones were grown overnight in SRG-CAA (2% galactose, 2% raffinose, 0.67% yeast nitrogen base, 0.5% casamino acids) to induce surface display of the candidate phosphatidylinositide-binding protein fragments. Induced yeast clones expressing the putative phosphatidylinositide-binding protein fragments or negative control yeast displaying Xpress epitope-tagged Aga2p from the empty pYD1 were washed twice with phosphate buffered saline (PBS) and incubated for 2 h at room temperature with 500 nM biotinylated PtdIns(4,5) and PtdIns(3,4,5) (hybrid phosphatidylinositol polyphosphates, Echelon Biosciences, Salt Lake City, UT) and mouse anti-Xpress antibody (1/1000 dilution of stock solution, Invitrogen) to monitor surface display of the protein fragments via the Xpress epitope tag, which is immediately upstream of the inserted cDNAs in the pYD1 vector. The yeast were washed twice with PBS and incubated with 1/500 streptavidin-phycoerythrin (Invitrogen) and 1/500 Alexa Fluor 647-conjugated anti-mouse secondary antibody (Jackson ImmunoResearch, West Grove, PA) in PBS for 30 min. Following two washes with PBS, the yeast were analyzed by FACS (LSRII, BD Biosciences, San Jose, CA). Induced phosphatidylinositide-binding yeast clones were also tested for binding to 1 μM Biotin-InsP3 and Biotin-PtdIns, C6 (Echelon) using the same protocol. For competition experiments, the induced yeast displaying the phosphatidylinositide-binding protein fragments were washed with PBS and co-incubated with 500 nM GST-Grp1 PH domain (PIP3 Grip, Echelon) and 250 nM biotinylated PtdIns(3,4,5) for 2 h and then analyzed by FACS as described above.
For the binding and competition experiments with the homeobox target sequence, oligos corresponding to the homeobox target sequence (20) (5′-acactaattgcaggc-3′ and 5′-gcctgcaattagtgt-3′) and randomized oligos containing the same nucleotide composition (5′-gcatagtactcagac-3′ and 5′-gtctgagtactatgc-3′) were synthesized and annealed. For the binding experiments, one of the homeobox target sequence oligos was synthesized with a fluorescein isothiocyanate (FITC) label. For the titration experiments, varying concentrations of biotinylated PtdIns(3,4,5) were incubated with the homeobox domain-displaying yeast for 2 h, washed with PBS, and then incubated with 1/500 streptavidin-phycoerythrin in PBS for 30 min, washed twice with PBS, and analyzed by FACS. The FITC-labeled double-stranded annealed homeobox target sequence oligos were incubated for 2 h with the HOXD4 homeobox domain-displaying yeast and then washed twice with PBS and analyzed by FACS. For competition experiments, HOXD4 homeobox domain-displaying yeast were co-incubated with varying concentrations of biotinylated PtdIns(3) or PtdIns(3,4,5) and 200 nM FITC-labeled double-stranded homeobox target sequence oligo for 2 h at room temperature, washed twice with PBS, and then analyzed by FACS.
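The design of the oligo pairs above can be checked directly: each pair should be mutual reverse complements so that it anneals into a blunt-ended duplex, and the randomized control should have the same base composition as the target. The following short sketch performs both checks on the four sequences quoted in the text; the function and variable names are illustrative only.

```python
from collections import Counter

# Translation table mapping each lowercase base to its complement.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a lowercase DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


target_top, target_bottom = "acactaattgcaggc", "gcctgcaattagtgt"
random_top, random_bottom = "gcatagtactcagac", "gtctgagtactatgc"

# Each pair should anneal into a fully complementary duplex.
print(reverse_complement(target_top) == target_bottom)
print(reverse_complement(random_top) == random_bottom)
# The control should have the same base composition as the target strand.
print(Counter(target_top) == Counter(random_top))
```

All three checks print True for the sequences as given, consistent with the description of the control as a randomized oligo of identical nucleotide composition.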
Pulldown Assays-The cDNA inserts encoding the PtdIns-binding protein fragments of HMGN2, SFRS4, and WNK1 were cloned into the GST bacterial expression vector pET-41 (Novagen, Madison, WI) and purified using the Pierce B-PER GST Fusion Protein purification kit according to the manufacturer's directions (Thermo Fisher Scientific, Rockford, IL). Recombinant GST-HOXA5 (aa 171-270) was purchased from Abnova (Abnova, Taipei City, Taiwan). Streptavidin agarose beads (Pierce, Thermo Fisher Scientific) were incubated with 1 μM biotinylated PtdIns(3,4,5) (Echelon) for 10 min, washed three times with PBS, and incubated with 500 ng of the GST fusion proteins, or GST alone as a control, for 2 h. The beads were then washed three times with PBS and analyzed by SDS-PAGE and Western blotting using rabbit anti-GST (GenScript, Piscataway, NJ) as the primary detection antibody and peroxidase-conjugated anti-rabbit secondary antibody (Jackson ImmunoResearch). HOXA5 pulldown competition experiments were performed by co-incubating 500 ng HOXA5 and 2 μM of either the double-stranded homeobox target DNA or randomized oligos containing the same nucleotide composition and performing pulldown experiments with PtdIns(3,4,5)-coated agarose beads as described above.

RESULTS
Accessing the full diversity of yeast surface-displayed human protein fragment library selection outputs by exon microarray analysis-We have previously described the use of yeast surface-displayed human protein fragment libraries for FACS-based selection of protein fragments with affinity for the phosphatidylinositides PtdIns(4,5) and PtdIns(3,4,5) (17). Analysis of the distribution of phosphatidylinositide-binding clones in the sort outputs revealed that 77% (72/93) of the sequenced binding clones contained cDNA inserts derived from three genes (PDK1, DAB2, and SBF1). Several binding clones were present at low frequencies, suggesting that our screening did not reveal the full diversity of the sorting outputs, and that additional phosphatidylinositide-binding clones were likely to be present in the sort outputs. We therefore sought to use DNA microarray analysis to comprehensively analyze the polyclonal selection outputs and identify additional phosphatidylinositide-binding protein fragments. A diagram of the selection process and output analysis is presented in Fig. 1. Plasmid DNA was purified from polyclonal yeast from the PtdIns(4,5) and PtdIns(3,4,5) round three selection outputs and used as a template for in vitro transcription reactions to generate biotin-labeled RNA probes representing the cDNA inserts contained in the polyclonal selection outputs. These biotin-labeled probes were used to interrogate a high-density human exon microarray and the results were analyzed.
FIG. 1. Diagram of procedure for enrichment of clones displaying phosphatidylinositide-binding protein fragments and comprehensive exon microarray analysis of polyclonal selection output. A yeast library displaying human protein fragments was incubated with labeled phosphatidylinositides and binding yeast clones were enriched through several rounds of FACS. Polyclonal plasmid containing cDNAs encoding phosphatidylinositide-binding protein fragments was isolated from the polyclonal selection output and used as a template for in vitro transcription to generate biotin-labeled RNA. The labeled RNA was hybridized to the exon microarray and data was collected and analyzed.

To establish the ability of the microarray approach to analyze the composition of the polyclonal selection outputs, we first examined the microarray data for the presence of the nine proteins with PtdIns-binding fragments that were previously identified in the selection outputs by monoclonal screening and sequencing (17). Strong signal intensities were observed in the regions corresponding to all nine of the previously identified proteins, suggesting that the microarray data accurately reflects the composition of the cDNA inserts in the polyclonal selection outputs (Fig. 2). Further analysis of the microarray data revealed 37 new candidate phosphatidylinositide-binding protein fragments that were not identified by previous monoclonal screening and sequencing (see Fig. 3 for examples of microarray data for new candidates). Nine of these candidate protein fragments contained known phosphatidylinositide-binding domains (seven PH domains and two C2 domains), whereas 28 represented novel candidate phosphatidylinositide-binding protein fragments. Fourteen of the novel candidate protein fragments are from proteins whose primary localization is known or predicted to be nuclear.
Thus our exon microarray-based comprehensive analysis of cDNA library selection output is highly effective in assessing the full diversity of the output, overcoming a major limitation of traditional monoclonal sequencing-based output analysis.
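The limitation of monoclonal sequencing noted above can be made concrete with a simple sampling argument. Assuming clones are picked independently and at random from the output, the chance of seeing a clone at least once among n sequenced clones is 1 - (1 - f)^n, where f is its fraction of the output. The sketch below evaluates this for n = 93, the number of sequenced binding clones in the earlier screen (17); the clone frequencies are hypothetical, chosen only to span abundant to rare.

```python
def prob_detected(frequency: float, n_sequenced: int) -> float:
    """Probability that a clone present at the given fraction of the output
    appears at least once among n_sequenced independently picked clones."""
    return 1.0 - (1.0 - frequency) ** n_sequenced


N = 93  # sequenced binding clones in the earlier monoclonal screen (17)
for freq in (0.1, 0.01, 0.001):  # hypothetical clone frequencies
    p = prob_detected(freq, N)
    print(f"frequency {freq:g}: P(seen at least once) = {p:.3f}")
```

Under these assumptions a clone making up 10% of the output is almost certain to be found, whereas one present at 0.1% would be missed in most experiments of this size, which is the motivation for analyzing the whole polyclonal output at once.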
Confirmation of Candidate Protein Fragment Binding to PtdIns(4,5) and PtdIns(3,4,5)-We first confirmed binding of candidate protein fragments to phosphatidylinositides. We chose 17 of the novel candidate cDNA sequences for testing, giving priority to nuclear proteins or proteins with known enzymatic activity. The candidate cDNA fragments were amplified by PCR using plasmid DNA purified from the polyclonal selection outputs as template, sequenced to verify in-frame fusion with Aga2p, and recloned into the yeast display vector. We then transformed the plasmids into yeast and tested for binding to biotinylated PtdIns(4,5) and PtdIns(3,4,5) (Fig. 4A) by FACS. All 17 of the yeast-displayed protein fragments bound to both biotinylated PtdIns(4,5) and PtdIns(3,4,5) (Fig. 4B). The 17 confirmed novel phosphatidylinositide-binding protein fragments in addition to the nine previously reported protein fragments (17) are summarized in Table I (protein sequences for the 17 novel fragments are listed in Supplemental Table 1). Interestingly, 11 of the newly identified phosphatidylinositide-binding fragments are from nuclear proteins. Details of the 20 unconfirmed candidate phospholipid-binding protein fragments are presented in Table II.
We next tested the ability of some of the protein fragments to bind PtdIns(3,4,5) using an in vitro pulldown assay. We used glutathione S-transferase (GST) fusion proteins containing the phosphatidylinositide-binding fragments of HMGN2, HOXA5, SFRS4, and WNK1 for pulldown experiments with PtdIns(3,4,5)-coated agarose beads, as described under "Experimental Procedures."

FIG. 2. Microarray probeset intensity data corresponding to phosphatidylinositide-binding protein fragments previously identified in the selection outputs. To confirm the ability of the microarray approach to determine the composition of the cDNA inserts in the polyclonal selection outputs, we examined the microarray data for PtdIns-binding protein fragments previously identified in the selection outputs (17). Gray bars represent the total protein with boxes indicating the location of known functional domains (PH, pleckstrin homology domain; PTB, phosphotyrosine-binding domain). The blue bars below represent the protein fragments encoded by the cDNA inserts contained in the corresponding phosphatidylinositide-binding clone. The gene exon structure is shown in green. Because the gene exon structures are interrupted by nonprotein coding intronic sequences, they do not precisely align with the proteins depicted above. Intensity values for exon-associated probesets are represented by red bars below the gene exon structure. Data is visualized using Integrated Genome Browser software.
Analysis of Inositol Phosphate "Headgroup" and Lipid "Tail" Contributions to Binding of Phosphatidylinositide-binding Protein Fragments-To explore the relative contributions of the inositol phosphate "headgroup" and the lipid "tail" to binding, we tested 16 of the novel PtdIns-binding protein fragments and two PH domain-containing fragments for binding to biotinylated Ins(3,4,5) (Fig. 7A) by FACS using the yeast display system. The headgroups of the biotinylated Ins(3,4,5) and the previously tested biotinylated PtdIns(3,4,5) are identical, but the structures of their hydrophobic lipid tail regions are different (compare Fig. 4A to Fig. 7A). Although the two PH domain-containing protein fragments bound to the Ins(3,4,5), no binding of the novel, non-PH domain-containing protein fragments was observed at the concentration tested (1 μM) (Fig. 7B). We also tested some of the clones for binding to a different form of biotinylated phosphatidylinositide, PtdIns(3,4,5) C6, which has a shortened six-carbon chain lipid tail, and obtained similar results (no binding) (Supplemental Fig. 1). These results suggest that the inositol phosphate headgroup is not sufficient for binding to these protein fragments and that binding is dependent in part on the nature of the lipid tail.
To further explore this issue, we performed competition experiments using a purified protein fragment containing the Grp1 PH domain. PH domains are known to bind to the inositol phosphate headgroups of phosphatidylinositides (21). Using the same FACS assay as in previous experiments, we found that addition of the Grp1 PH domain significantly reduced the binding of PtdIns(3,4,5) to all protein fragments we tested (Fig. 8). Thus, the PtdIns(3,4,5) cannot efficiently bind the Grp1 PH domain and any of the protein fragments we tested simultaneously, suggesting that the inositol phosphate headgroup or nearby regions are necessary for binding to the tested protein fragments. Taken together, these data suggest that the novel phosphatidylinositide-binding protein fragments we tested bind to phosphatidylinositides differently than PH domains, in a manner that is dependent on the lipid tail.
PtdIns(3,4,5) and Homeobox Domain Target Sequence DNA Compete for Binding to Homeobox Domain-containing Protein Fragments-Four of the phosphatidylinositide-binding protein fragments contain a homeobox domain, a helix-turn-helix structure that binds DNA in a sequence-specific manner and is commonly found in transcription factors (22). We performed competition studies to determine if PtdIns(3,4,5) could compete with homeobox domain target sequence DNA for binding to the yeast surface-displayed HOXD4 homeobox domain phosphatidylinositide-binding protein fragment. We first measured the apparent affinity of the displayed fragment for biotinylated PtdIns(3,4,5) and for a FITC-labeled double-stranded DNA oligonucleotide designed based on the consensus homeobox domain binding sequence (Fig. 9A). The use of yeast surface display for the quantitative analysis of protein-protein (23) and protein-DNA interactions (24) has been previously reported. The apparent affinity of PtdIns(3,4,5) for the HOXD4 homeobox domain-containing protein fragment was ~50 nM whereas the apparent affinity of the homeobox domain target sequence oligo was ~100 nM. We tested the ability of increasing amounts of PtdIns(3,4,5) to reduce binding of the homeobox target sequence oligo to the yeast surface-displayed HOXD4 homeobox domain. As a control, we used PtdIns(3), which does not bind to the yeast surface-displayed HOXD4 homeobox domain. We found that addition of PtdIns(3,4,5), but not PtdIns(3), could reduce binding of the homeobox target sequence oligo to yeast surface-displayed HOXD4 homeobox domain in a concentration-dependent manner (Fig. 9B). We next tested the ability of the homeobox target sequence oligo to compete with PtdIns(3,4,5) for binding to recombinant HOXA5 homeobox domain using an in vitro pulldown assay. The homeobox target sequence oligo, but not a randomized negative control oligo, was able to prevent the pulldown of the HOXA5 homeobox domain protein fragment by PtdIns(3,4,5)-coated beads (Fig. 9C).
Thus, our experiments suggest that PtdIns(3,4,5) and homeobox domain target sequence DNA compete for binding to homeobox domain-containing protein fragments.
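For readers who want to see how an apparent affinity and a competition curve of this kind relate, the sketch below fits a single-site binding isotherm to titration data and then predicts DNA occupancy in the presence of a competing lipid. It is a minimal illustration, not the analysis used in this study: it assumes equilibrium, a single mutually exclusive binding site, and ligand in excess over displayed protein, and the titration data in the example are synthetic.

```python
from typing import List, Tuple


def bound_fraction(ligand_nM: float, kd_nM: float) -> float:
    """Single-site binding isotherm: fraction of displayed protein occupied."""
    return ligand_nM / (kd_nM + ligand_nM)


def fit_apparent_kd(conc_nM: List[float],
                    signal: List[float]) -> Tuple[float, float]:
    """Least-squares fit of signal = bmax * L / (Kd + L).

    Kd is scanned on a log grid from 0.1 nM to 100 uM; for each trial Kd
    the best-fitting bmax has a closed form. Returns (kd_nM, bmax)."""
    best = (float("inf"), 0.0, 0.0)  # (squared error, kd, bmax)
    for step in range(601):
        kd = 10 ** (-1 + step / 100)
        f = [bound_fraction(c, kd) for c in conc_nM]
        bmax = sum(y * fi for y, fi in zip(signal, f)) / sum(fi * fi for fi in f)
        err = sum((y - bmax * fi) ** 2 for y, fi in zip(signal, f))
        if err < best[0]:
            best = (err, kd, bmax)
    return best[1], best[2]


def dna_bound_with_competitor(dna_nM: float, kd_dna_nM: float,
                              lipid_nM: float, kd_lipid_nM: float) -> float:
    """Fraction of protein bound to DNA when a lipid competes for the same
    site (mutually exclusive binding, both ligands in excess)."""
    return dna_nM / (dna_nM + kd_dna_nM * (1 + lipid_nM / kd_lipid_nM))


# Synthetic titration generated with Kd = 50 nM and bmax = 1000 (made-up data).
conc = [1, 3, 10, 30, 100, 300, 1000]
mfi = [1000 * bound_fraction(c, 50) for c in conc]
kd_fit, bmax_fit = fit_apparent_kd(conc, mfi)
print(f"fitted apparent Kd ~ {kd_fit:.0f} nM")

# Predicted oligo occupancy at 200 nM oligo, using the ~100 nM and ~50 nM
# apparent affinities reported in the text, for increasing lipid.
for lipid in (0, 50, 500):
    print(lipid, round(dna_bound_with_competitor(200, 100, lipid, 50), 3))
```

In this simple model the oligo signal falls steadily as the lipid concentration rises, which is the qualitative behavior described for PtdIns(3,4,5); a non-binding control such as PtdIns(3) corresponds to a very large competitor Kd and leaves the occupancy essentially unchanged.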

DISCUSSION
Bioactive molecules often bind to multiple cellular proteins. We have previously developed an effective approach based on screening yeast surface-displayed human cDNA libraries to identify binding proteins for any ligands that can be fluorescently labeled (16-18). After enrichment of binding clones by several rounds of FACS, individual clones were tested for binding to the target ligand and then sequenced. Although this approach was effective, the distribution of binding clones in the sorting outputs suggested that the full diversity had not been uncovered by monoclonal screening. Conventional monoclonal screening and sequencing approaches often identify dominant clones, leaving important biological interactions undetected because of low representation in the selection output. This is a general problem for all expression cloning strategies based on cDNA library selection. We have developed a novel method to address this problem. By analyzing the selection output using a high-density human exon array, the full diversity of the selection output can be uncovered at once. Our study shows that the combination of FACS selection using yeast human proteome surface display libraries with exon microarray analysis of the selection outputs is a powerful way of identifying protein/ligand interactions. The method is comprehensive, and not biased toward abundant proteins. The method can be used to identify protein fragments with affinity for any soluble ligand that can be fluorescently detected, including small biological molecules and drugs. The method is generally applicable to analyzing cDNA library selection outputs when many binders are expected. For example, the method can be applied to the analysis of selection outputs from yeast and phage cDNA display libraries, two-hybrid and three-hybrid libraries, mammalian cDNA expression libraries, and other viral or bacteria-based cDNA display libraries.
Various techniques have been previously used to systematically screen for proteins that bind phosphatidylinositides, including protein microarrays (25), column-based affinity purification using agarose-conjugated phosphatidylinositides (26), a yeast growth rescue assay (27), λgt11-based expression cloning (28), and an in vitro transcription/translation-based expression cloning assay (29); however, with the notable exception of the protein microarray study that analyzed the yeast phosphatidylinositide-binding proteins, only a small number of phosphatidylinositide-binding proteins have been identified in each study. Including our previously published results (17) we have now used yeast surface cDNA display library screening to identify 26 confirmed phosphatidylinositide-binding protein fragments and another 20 candidate phosphatidylinositide-binding protein fragments. Although many of these protein fragments contain known or predicted phosphatidylinositide-binding domains (14 PH, 2 C2, and 1 phosphotyrosine-binding), 29 contain no previously reported phosphatidylinositide-binding domains or regions. We confirmed phosphatidylinositide binding to 17 of these protein fragments displayed on the surface of yeast cells by FACS. Thus, affinity-based selection with yeast cDNA display libraries coupled with DNA microarray analysis of the selection outputs is an effective method for investigating protein interactions. Sequence analysis of the 17 confirmed novel phosphatidylinositide-binding protein fragments did not reveal any similarity with known PtdIns-binding domains (i.e. PH, C2) or any apparent conserved motifs. Ten of the 17 protein fragments (FAM71B, HMGN2, HOXA5, HOXB6, HOXC6, HOXD4, POLS, RNPS1, SFRS4, and WDR60) are predicted to have a significant net positive charge at neutral pH (pI > 9.0) whereas five (ATN1, CRABP1, NUCKS1, PNPLA7, and PTPN5) are predicted to have a net negative charge (pI < 6).
Thus, although charge may contribute to binding with the negatively charged phosphatidylinositides in some cases, there is no absolute requirement for a strong net positive charge.
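Predictions of this kind (net charge at neutral pH and isoelectric point) can be approximated from sequence alone using the Henderson-Hasselbalch relation. The sketch below is one simple way to do so and is not necessarily the tool used for the values above; the pKa table is one commonly used set and is an assumption here, and the two peptides are hypothetical examples rather than the fragments listed in Supplemental Table 1.

```python
# Side-chain and terminal pKa values: one commonly used set, assumed here.
POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM, C_TERM = 8.6, 3.6


def net_charge(seq: str, ph: float = 7.0) -> float:
    """Approximate net charge of a peptide at the given pH."""
    seq = seq.upper()
    pos = [N_TERM] + [POSITIVE[a] for a in seq if a in POSITIVE]
    neg = [C_TERM] + [NEGATIVE[a] for a in seq if a in NEGATIVE]
    # Protonated (charged) fraction of each basic group.
    charge = sum(1 / (1 + 10 ** (ph - pka)) for pka in pos)
    # Deprotonated (charged) fraction of each acidic group.
    charge -= sum(1 / (1 + 10 ** (pka - ph)) for pka in neg)
    return charge


def isoelectric_point(seq: str, lo: float = 0.0, hi: float = 14.0) -> float:
    """pH at which the net charge is zero, found by bisection (the net
    charge decreases monotonically as pH rises)."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical peptides: one lysine/arginine-rich, one aspartate/glutamate-rich.
for peptide in ("KRKRGSKKR", "DEEDGSDEE"):
    print(peptide, round(net_charge(peptide), 2), round(isoelectric_point(peptide), 2))
```

This treats every ionizable group as independent and ignores local structure, so it is suitable only for the kind of coarse classification (strongly basic versus acidic fragments) discussed above.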

TABLE II
Unverified candidate phospholipid-binding protein fragments from microarray analysis of polyclonal selection outputs. NCBI Entrez protein database accession numbers for human protein matches and the expressed regions and total aa lengths are shown. Candidate clones with a cDNA-encoded protein that has been determined by sequencing to be in-frame with AGA2 in the pYD1 yeast display vector are indicated in the "Verified in-frame" column. The predicted or observed subcellular localization (retrieved from the GO database (58)) of the protein is also indicated (PM, peripheral membrane; IM, integral membrane; M, membrane; C, cytoplasm; U, unknown; I, intracellular; N, nucleus; NS, nuclear speckle; CP, clathrin coated pit; IF, intermediate filament; G, golgi; MT, microtubule).

Phosphatidylinositides and enzymes involved in their synthesis exist in the nucleus, and nuclear phosphatidylinositides have been functionally connected to several nuclear events, including chromatin organization, and mRNA processing and export (6-9, 30, 31). The few nuclear phosphatidylinositide-binding effector proteins that have been described include histones H1 and H3 (32), the PHD finger of ING2 (33), the nuclear receptors SF-1 and LRH-1 (34-36), Nucleophosmin/B23 (26), the PDZ domains of syntenin-2 and zonula occludens-1 and -2 (37,38), and SAP30L and SAP30 (39). The exact physicochemical nature of nuclear phosphatidylinositides is currently unclear, although it appears that they exist independently of the nuclear envelope (40) and may reside in distinct nuclear domains termed "speckles" (41-43). Nuclear speckles are subnuclear structures enriched in pre-mRNA splicing factors (44). Three of the novel phosphatidylinositide-binding protein fragments we identified (FAM71B, RNPS1, and SFRS4) and one unverified candidate (SRRP35) are from proteins that localize to nuclear speckles (45-48).
Phosphatidylinositide binding by the PDZ domains of syntenin-2 and zonula occludens-1 and -2 regulates their localization to nuclear speckles (37,38). It is therefore possible that the protein fragments we identified in FAM71B, RNPS1, SFRS4, and SRRP35 regulate their localization to nuclear speckles in a similar manner. Four of the phosphatidylinositide-binding protein fragments contain a homeobox domain, which is a helix-turn-helix sequence-dependent DNA binding domain found in transcription factors (22). In our experiments, the apparent affinity of yeast-displayed HOXD4 for PtdIns(3,4,5) was ~50 nM. The amount of PtdIns(4,5) in detergent-washed rat liver nuclei was measured as ~30 pmol/mg of protein (40). Using an estimate of 72 mg/ml for the total nuclear protein concentration (calculated using a value of 3.1 pg for the mass of the haploid nuclear DNA content of the rat genome (49) and a 2:1 nuclear protein:DNA mass ratio (50,51)) gives a rough estimate of 4 μM for the nuclear PtdIns(4,5) concentration, indicating that interactions in the affinity range we observed could be biologically significant. Our experiments demonstrated that phosphatidylinositides and double-stranded DNA oligonucleotides derived from homeobox domain target sequences compete for binding to homeobox domain-containing protein fragments. It is possible that nuclear phosphatidylinositides play a role in transcriptional regulation by modulating the interaction between homeobox domain proteins and target DNA sequences (Fig. 10A). A similar model has recently been proposed for the Sin3A corepressor complex proteins SAP30 and SAP30L (39). Future studies could attempt to identify mutations within the homeobox domain that selectively disrupt PtdIns binding without affecting DNA binding.
If homeobox protein mutants with these properties could be found, they could be used to model homeobox domain-PtdIns interaction and investigate the possible regulation of homeobox proteins by PtdIns in cell homeostasis and tissue development.
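The concentration estimate above can be checked with a short calculation. The sketch below multiplies the quoted lipid content (about 30 pmol/mg of protein) by the quoted nuclear protein concentration (72 mg/ml). With these inputs the direct product comes out at roughly 2 μM, which is a low-micromolar value of the same order as the rough estimate given in the text. The occupancy figure is an illustration only: it assumes simple one-site equilibrium binding, assumes the whole lipid pool is free and accessible, and applies the ~50 nM apparent affinity measured for PtdIns(3,4,5) to the PtdIns(4,5) pool. None of these assumptions is tested in the study, and the function names are hypothetical.

```python
# Back-of-envelope check of the nuclear PtdIns(4,5) concentration estimate.
# Input values are the ones quoted in the text; the one-site binding model
# used for the occupancy figure is an illustrative assumption, not a result.

PTDINS_PMOL_PER_MG = 30.0   # pmol PtdIns(4,5) per mg nuclear protein (ref. 40)
PROTEIN_MG_PER_ML = 72.0    # estimated total nuclear protein concentration
KD_NM = 50.0                # apparent affinity of yeast-displayed HOXD4


def lipid_concentration_um(pmol_per_mg: float, mg_per_ml: float) -> float:
    """Convert lipid content per mg of protein into a concentration in uM."""
    pmol_per_ml = pmol_per_mg * mg_per_ml  # pmol/ml is numerically equal to nM
    return pmol_per_ml / 1000.0            # nM -> uM


def fractional_occupancy(ligand_nm: float, kd_nm: float) -> float:
    """Fraction of sites occupied under simple one-site equilibrium binding."""
    return ligand_nm / (ligand_nm + kd_nm)


conc_um = lipid_concentration_um(PTDINS_PMOL_PER_MG, PROTEIN_MG_PER_ML)
occupancy = fractional_occupancy(conc_um * 1000.0, KD_NM)

print(f"Estimated nuclear PtdIns(4,5): {conc_um:.2f} uM")
print(f"One-site occupancy at Kd = {KD_NM:.0f} nM: {occupancy:.2f}")
```

Under these assumptions the lipid concentration exceeds the apparent affinity by more than an order of magnitude, which is the sense in which binding in this affinity range could be biologically relevant.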
One of the PtdIns-binding regions is from the DNA polymerase POLS. The yeast homolog of POLS, Trf4p, plays a role in sister chromatid cohesion (52) and DNA double-strand break repair (53). Intriguingly, the phosphoinositide 3-kinase catalytic subunit p110β, its substrate, PtdIns(4,5), and its product, PtdIns(3,4,5), have recently been shown to localize to DNA double-strand breaks (54). Thus, it is possible that accumulation of PtdIns at sites of DNA damage acts to recruit POLS (and potentially other factors) that participate in DNA repair (Fig. 10B). Additionally, it is possible that binding by PtdIns directly regulates the enzymatic activity of POLS.
Although the seventeen novel phosphatidylinositide-binding protein fragments bound to the hybrid form phosphatidylinositide, they did not bind to Ins(3,4,5) (headgroup only), indicating that the structure of the lipid tail is also critical for binding. The hybrid form phosphatidylinositides we used in our experiments are soluble in aqueous solutions and have a diacyl lipid tail with two 16-carbon chains and a covalently linked biotin reporter group (55). The other form of phosphatidylinositide we used, PtdIns(3,4,5) C6, has a short six-carbon monoacyl lipid tail and did not bind to the PtdIns-binding clones that we tested. An attractive hypothesis is that the pool of nuclear phosphatidylinositides that exists in a nonmembranous, detergent-resistant form is associated with nuclear proteins, likely through interactions with their hydrophobic lipid tails. Thus, the lipid tail-dependent PtdIns-binding nuclear proteins we identified in this study could participate in sequestering phosphatidylinositides in the nucleus.
We have not attempted to define the lipid specificity of the phosphatidylinositide-binding protein fragments. It is possible that they are able to bind to other lipid species that we have not studied. In addition, we studied phosphatidylinositide binding only by protein fragments and not by the full-length proteins. It is formally possible that other regions of the full-length protein could regulate phosphatidylinositide binding by the protein fragments that we identified. Finally, it is important to note that we have not verified the interaction between the proteins we identified and phosphatidylinositides in living cells. In the future, experiments could be done to determine if the proteins we identified interact with endogenous phosphatidylinositides in vivo.
Although there are many more PH domains (258) and C2 domains (149) in the human proteome than we recovered, it is uncertain whether they all bind strongly to phosphatidylinositides. Indeed, currently available binding data, including genome-wide analysis of PH domains in the yeast Saccharomyces cerevisiae, suggest that most PH domains do not bind strongly to phosphatidylinositides (11). Thus, our recovery of 14 PH domains may represent a significant fraction of the total number of PH domains in the human genome that bind strongly to phosphatidylinositides. C2 domains demonstrate highly variable affinity and specificity for phospholipids, and binding is frequently calcium-dependent (56). Because we did not add calcium during our selections, it is possible that the number of C2 domains we recovered was diminished. Several proteins or domains known to bind to PtdIns(4,5) and/or PtdIns(3,4,5), such as the tubby domain (57), the PDZ domain (37,38), and nucleophosmin/B23 (26), were not recovered in our analysis. It is possible that clones containing the full domain or region necessary for binding were not present in the yeast display library, or they may not be displayed in the properly folded conformation for binding.
FIG. 9. Testing PtdIns(3,4,5) and homeobox domain target sequence DNA competition for binding to homeobox domain-containing protein fragments. A, Concentration-dependent binding of biotinylated PtdIns(3,4,5) and homeobox target sequence double-stranded oligonucleotide to yeast displayed HOXD4 homeobox domain-containing protein fragment; B, Competition between PtdIns(3,4,5) or PtdIns(3) and the homeobox target sequence oligonucleotide for binding to the yeast surface-displayed HOXD4 homeobox domain; C, Measuring homeobox target sequence oligonucleotide competition with PtdIns(3,4,5) for binding to recombinant HOXA5 homeodomain by an in vitro pulldown assay with PtdIns(3,4,5)-coated beads. A randomized double-stranded oligonucleotide with the same nucleotide composition was used as a negative control.

In summary, we developed a novel exon microarray-based method suitable for the comprehensive analysis of cDNA library polyclonal selection outputs. We identified 37 new phosphatidylinositide-binding candidates from previously generated selection outputs enriched for PtdIns-binding protein fragments (17) and confirmed phosphatidylinositide binding by FACS for 17 of the novel candidate protein fragments. Eleven of the phosphatidylinositide-binding protein fragments are from nuclear proteins, including four containing homeobox domains. We found that phosphatidylinositides can compete with target DNA for binding to homeobox domains and thus could play a role in transcriptional regulation. Our studies show that FACS enrichment of target-binding clones in yeast