University of Huddersfield Repository Expressing the human proteome for affinity proteomics: optimising expression of soluble protein domains and in vivo biotinylation

The generation of affinity reagents to large numbers of human proteins depends on the ability to express the target proteins as high-quality antigens. The Structural Genomics Consortium (SGC) focuses on the production and structure determination of human proteins. In a 7-year period, the SGC has deposited crystal structures of >800 human protein domains, and has additionally expressed and purified a similar number of protein domains that have not yet been crystallised. The targets include a diversity of protein domains, with an attempt to provide high coverage of protein families. The family approach provides an excellent basis for characterising the selectivity of affinity reagents. We present a summary of the approaches used to generate purified human proteins or protein domains, a test case demonstrating the ability to rapidly generate new proteins, and an optimisation study on the modification of >70 proteins by biotinylation in vivo. These results provide a unique synergy between large-scale structural projects and the recent efforts to produce a wide coverage of affinity reagents to the human proteome.


Introduction
Antibodies and other affinity reagents are an invaluable resource in investigating the function and distribution of proteins in addition to potential therapeutic use. Considerable efforts are being made to expand the spectrum of human proteins for which validated and selective antibodies are available. Ideally a variety of antibodies to a specific target protein should include molecules suitable for different uses, including detection in ELISA and Western blots, immunofluorescent imaging, sandwich assays, immunoprecipitation and co-crystallisation, as well as modulating the activity of target molecules in a biological context. The provision of high-quality antigens is crucial to this purpose.
While short peptides may be best for eliciting antibodies to specific post-translational modifications, a wider variety of antibodies is likely to be generated against larger protein fragments. The use of recombinant Protein Epitope Signature Tags (PrESTs), which are informatically derived fragments (50-150 amino acids) of human proteins, has allowed the construction of vast antigen and antibody libraries [1,2]. It has been argued that well-folded protein domains can serve as better antigens for some purposes, but this point has not been systematically explored. The provision of a large variety of such folded domains in the context of systematic affinity-reagent generating projects may shed light on these issues.
Production and characterisation of stable domains from a wide variety of proteins have been at the core of large-scale structural genomics projects. Consequently, two by-products of these projects are large collections of recombinant human protein domains (mostly preserved as expression clones and detailed protocols for expression and purification), as well as a set of methodologies for dissecting and producing new protein domains.
This report presents three aspects of the production of human protein domains. First, we summarise the existing bank of purified proteins and the methods used to obtain these proteins. Second, we describe a pilot study aimed at producing soluble domains of a set of proteins selected without regard to feasibility or prior knowledge. Finally, we present an extensive study on in vivo protein biotinylation, an important step in preparing proteins for immobilisation in procedures such as panning and Surface Plasmon Resonance (SPR).

Materials and methods
Plasmids and strains
pNIC-Bio3 and pNIC-Bio2 are kanamycin-resistance vectors that express fusion proteins with N-terminal histidine tags (His6 and His10, respectively) followed by a TEV protease cleavage site, and a C-terminal biotin acceptor site. pNIC28-Bsa4 and pNIC-H102 are identical to pNIC-Bio3 and pNIC-Bio2, respectively, but lack the C-terminal tag. All vectors are suitable for ligation-independent cloning as described [3]; more vector details are provided in Fig. 1.
The expression host strain BL21(DE3)-R3-pRARE2 is a phage T1-resistant strain bearing a plasmid (pRARE2; chloramphenicol-resistance) that provides rare-codon tRNAs [3]. This strain was transformed with pCDF-LIC and colonies were selected on media containing chloramphenicol (34 µg/ml) and spectinomycin (50 µg/ml) to create the strain Rosetta-R3-BirA, which was used as host in biotinylation experiments.

Overview of protein production methods
The methods used for cloning, protein expression and purification are summarised briefly here; full details have been published (intracellular proteins [3,4]; secreted proteins in bacteria [5] and baculovirus [6]).

RESEARCH PAPER
New Biotechnology Volume 29, Number 5 June 2012

Multiple constructs of every target gene were cloned in parallel as PCR fragments, using ligation-independent cloning (LIC). The cloning vectors for E. coli included fusion tags for affinity purification, typically N-terminal His6 tags that can be cleaved with Tobacco Etch Virus (TEV) protease. After clone verification, the plasmids were used to transform an expression strain, typically a derivative of Rosetta2 (a BL21 derivative harbouring the plasmid pRARE2 that provides 7 rare-codon tRNAs; Novagen). All clones were tested in small-scale cultures in rich medium (TB or LB), and protein expression was induced by IPTG or arabinose at low temperatures (15-25 °C). The recombinant proteins were then purified from clarified lysates by immobilised metal affinity chromatography (IMAC) in batch, and the eluted proteins were detected by SDS-PAGE and Coomassie blue staining. Selected clones were grown and induced at a larger scale (0.75-6 L) and the proteins were purified by protocols including IMAC, gel filtration and, for some proteins, tag cleavage and additional steps as indicated. Proteins were analyzed by SDS-PAGE, mass spectrometry and other biophysical or biochemical means as indicated. Variations of this basic procedure include the use of different fusion tags such as a C-terminal His6 tag, N-terminal His6-thioredoxin tags [3], or biotin acceptor peptides (this study) and the use of bacterial secretion vectors inducible with arabinose, with proteins purified from the culture medium [5].
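
The parallel cloning of multiple constructs per target can be illustrated with a minimal sketch. It assumes that construct design reduces to combining a few candidate N-terminal start residues with a few candidate C-terminal end residues. The function name, the example boundaries and the minimum-length cut-off are hypothetical and are not taken from the published SGC procedure.

```python
from itertools import product


def enumerate_constructs(n_starts, c_ends, min_length=50):
    """Return (start, end) residue pairs (1-based, inclusive) for every
    combination of candidate N- and C-terminal boundaries.

    min_length is a hypothetical cut-off that discards fragments too short
    to plausibly contain a folded domain.
    """
    return [
        (start, end)
        for start, end in product(sorted(n_starts), sorted(c_ends))
        if end - start + 1 >= min_length
    ]


# Hypothetical boundaries for one target: 3 starts x 2 ends = 6 constructs.
constructs = enumerate_constructs([1, 12, 25], [140, 155])
for start, end in constructs:
    print(f"construct {start}-{end} ({end - start + 1} residues)")
```

Each boundary pair would correspond to one PCR fragment to be cloned and tested in small-scale expression.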

Small-scale expression tests of biotinylated proteins
Rapid, high-throughput tests for production of soluble recombinant proteins were performed using 1-ml bacterial cultures in 96 deep-well plates by a modification of an earlier method [3]. Cells were grown at 37 °C in TB containing kanamycin and spectinomycin as described. When the culture turbidity reached 1-3, the temperature was reduced to 18 °C. After 30 min, protein expression was induced by adding IPTG (0.1 mM) and biotin (50 or 100 µM, as indicated). Following overnight incubation, the cultures were centrifuged and the supernatant was discarded. The cell pellets could be stored frozen at −80 °C or processed directly. The pellets were thoroughly suspended in 250 µl of lysis buffer comprising 100 mM HEPES, pH 8.0, 500 mM NaCl, 10% glycerol, 10 mM imidazole, 1 mg/ml lysozyme, 0.1% n-dodecyl β-D-maltoside (DDM), 1 mM MgSO4, 0.5 mM Tris(2-carboxyethyl)phosphine (TCEP), Benzonase (Merck; 0.5 unit/ml) and protease inhibitors (Calbiochem cocktail VI, 1:1000 dilution). The blocks were placed at −80 °C for at least 20 min, then thawed in a water bath at room temperature for 10-15 min. The suspensions were mixed in a shaker at 700 rpm to effect complete lysis. The blocks were centrifuged at 3500 × g for 10 min. Meanwhile, Ni-NTA agarose was aliquoted (50 µl of a 50% suspension in lysis buffer) into wells of a 96-well filter plate (1.2 µm, Millipore). The clarified supernatants were transferred into the wells of the filter plate; the plate was sealed at the top and mixed for 30 min on a shaker at 400 rpm, 18 °C. The liquid was then removed by vacuum filtration, taking care not to dry the beads. The beads were washed three times by adding 250 µl of wash buffer (20 mM HEPES, 500 mM NaCl, 25 mM imidazole, 10% glycerol and 0.5 mM TCEP) and vacuum filtration.
The filter plate was placed on top of a waste block (96 deep-well block) and centrifuged for 2 min at 300 × g to remove the remaining wash buffer. The bound proteins were then eluted by adding 40 µl of elution buffer (20 mM HEPES, pH 7.5, 500 mM NaCl, 500 mM imidazole, 10% glycerol and 0.5 mM TCEP) and mixing for 20 min at 18 °C. The filter plate was placed on top of a 96-well microtiter plate and the eluates were collected by centrifugation (300 × g, 3 min). The eluted proteins were analyzed by SDS-PAGE and mass spectrometry as described previously [3,7].

Expression and purification of biotin-tagged SH2 domains
Large-scale expression was performed in a custom-made expression system (LEX) (Harbinger Biotech). In this system, E. coli cells are cultivated in 1.5 L of medium in common 2 L glass bottles. Filtered air is bubbled through the medium at a typical rate of 4-6 L/min and thus the cultivations are both aerated and stirred. The temperature is regulated by a thermostat-controlled water bath. Inoculation cultures (20 ml) were started from glycerol stocks in TB in 100 ml Erlenmeyer flasks supplemented with kanamycin (100 µg/ml) and chloramphenicol (34 µg/ml). The cultures were incubated overnight at 30 °C with shaking at 175 rpm. The following morning, bottles with 1.5 L of TB supplemented with kanamycin (50 µg/ml) and 500 µL Antifoam 204 (anti-foam agent, Sigma) were inoculated with the starter cultures. The cultures were incubated at 37 °C until OD600 reached 2. The temperature was then reduced to 18 °C and protein production was induced by the addition of 0.5 mM IPTG and 50-100 µM biotin. Protein expression was continued for approximately 20 h. Cells were harvested by centrifugation at 4500 × g for 10 min, resuspended in approximately 50 ml binding buffer (50 mM Na-phosphate, 500 mM NaCl, 10% glycerol, 10 mM imidazole, 0.5 mM TCEP, pH 7.5) supplemented with protease inhibitors (Complete EDTA-free, 1 tablet/100 ml) and then stored in a freezer at −80 °C.
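
As a worked example of the induction step, the following sketch computes how much biotin a 1.5 L culture requires. It assumes the biotin concentrations are micromolar (50-100 µM) and uses the molar mass of D-biotin (244.31 g/mol). The 10 mM stock concentration is a hypothetical value chosen for illustration, not one stated in the protocol.

```python
# Molar mass of D-biotin (C10H16N2O3S), in g/mol.
BIOTIN_MW = 244.31


def biotin_mass_mg(culture_volume_l, final_conc_um):
    """Mass of biotin (mg) needed for a final concentration given in µM."""
    micromoles = final_conc_um * culture_volume_l  # µmol/L * L = µmol
    return micromoles * BIOTIN_MW / 1000.0  # µmol * g/mol = µg; /1000 gives mg


def stock_volume_ml(culture_volume_l, final_conc_um, stock_conc_mm):
    """Volume (ml) of a biotin stock (mM) to add, from C1*V1 = C2*V2.

    The small dilution of the culture by the added stock is ignored.
    """
    final_conc_mm = final_conc_um / 1000.0
    culture_volume_ml = culture_volume_l * 1000.0
    return final_conc_mm * culture_volume_ml / stock_conc_mm


# 50 µM biotin in a 1.5 L culture, added from a hypothetical 10 mM stock.
print(f"{biotin_mass_mg(1.5, 50):.1f} mg biotin")
print(f"{stock_volume_ml(1.5, 50, 10):.1f} ml of 10 mM stock")
```

Under these assumptions, 50 µM biotin in 1.5 L corresponds to about 18 mg of biotin, and 100 µM to about 37 mg.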

TABLE 1 (footnote)
All structures of distinct human proteins or domains were divided into biochemical areas, defined either by structural similarity or involvement in biological processes. The larger groups may encompass highly diverse proteins. A full list of structures including experimental procedures is provided on-line at www.thesgc.org/structures.
The resuspended cells were thawed briefly with warm water and Benzonase (2000 U) was added. The suspensions were diluted in lysis buffer to approximately 100 ml before sonication (6 min, 80% amplitude, 4 s/4 s pulsing on a Sonics VibraCell) followed by centrifugation at 49,000 × g for 20 min. The supernatants were filtered (0.45 µm) and applied to a two-step purification procedure, IMAC and gel filtration, on an ÄKTA Xpress system (GE Healthcare). Briefly, the lysates were loaded onto a 1 ml HiTrap Chelating HP column (GE Healthcare) loaded with Ni2+ ions, at 0.8 ml/min. The immobilised proteins were washed first with binding buffer until stable baselines were obtained, and then with wash buffer (50 mM Na-phosphate, 500 mM NaCl, 10% glycerol, 25 mM imidazole, 0.5 mM TCEP, pH 7.5) for 20 column volumes (CV) before elution with elution buffer (50 mM Na-phosphate, 500 mM NaCl, 10% glycerol, 500 mM imidazole, 0.5 mM TCEP, pH 7.5) for 7.5 CV. The eluted proteins were collected and stored in a loop on the system, reinjected onto a gel filtration column (HiLoad Superdex 75 or 200, GE Healthcare) and finally eluted in PBS buffer (10 mM Na-phosphate, 154 mM NaCl, 0.5 mM TCEP, pH 7.5) at 1.2 ml/min. Peaks were collected in 2 ml fractions in a deep-well plate and analyzed by SDS-PAGE (Novex NuPAGE 4-12% BisTris 17w gels, Invitrogen). Relevant fractions were pooled and protein concentration was assessed by measuring the absorbance at 280 nm on a Nanodrop ND-1000 (NanoDrop Technologies) spectrophotometer. Where peaks corresponding to different multimeric states were observed in the gel filtration step, these were pooled separately. Samples from each protein batch were analyzed by electrospray ionisation mass spectrometry (ESI-MS) according to the protocol described in [8] to check the extent of the biotinylation reaction.

Results and discussion
FIGURE 2
Overview of expression and purification statistics (SGC-Oxford, 2004-2009). (a) Pipeline of targets tested in E. coli. The bars represent the number of targets (proteins) that were tested; targets which showed production of soluble protein in small-scale test expression; targets that were purified from large-scale culture; targets that generated diffracting crystals; initial models; and finished structures. (b) Targets that failed to express as soluble proteins in E. coli were subcloned into baculovirus vectors and expressed in insect cells. The bars denote the same data as in (a). (c) Summary of the characteristics of constructs and purification schemes used for the proteins that crystallised successfully. Full-length proteins include those with 'trivial' truncations (deletion of membrane-spanning, targeting signals or 1-2 residues from either end). The large tags (GST or Trx) were cleaved before crystallisation. Purification schemes: Ni, IMAC purification; GF, gel filtration; Ni-GF-TEV-Ni, IMAC and gel filtration, followed by cleavage of the His6 tag and removal of contaminating proteins by re-binding to the IMAC resin. Additional steps include a diversity of chromatographic methods.

TABLE 2 (footnote)
For each target and vector, the number of constructs expressing soluble proteins is listed, out of the number of constructs tested. 'Domains produced' indicates the segment of the target protein included in a construct that was selected for scale-up and purification. N-His, C-His, Trx and Baculo refer to clones in the vectors pNIC28-Bsa4, pNIC-CTHF, pNH-TrxT and the baculovirus transfer vector pFB-LIC-Bse [4].

Protein production and crystallisation at the SGC
The SGC has solved and deposited structures of more than 841 distinct human protein domains [9]; a similar number of other proteins have been purified but not yet crystallised (data not shown). All proteins were produced in recombinant cells, most commonly in E. coli but in some cases (3-4%) in baculovirus-infected insect cells. More detailed accounts of the methods and parameters used to produce and crystallise these proteins have been published [3,10]. The structures represent a highly diverse selection of proteins with a variety of metabolic, regulatory and structural functions. A rough division of the solved targets is shown in Table 1. We have attempted to cover multiple members of protein or domain families, aiming both to provide insights on biological specificity [11-18] and to build expertise in selected areas. A consequence of the family-based approach is the availability of sets of related protein domains that can be used to test the selectivity of affinity reagents [4,19,20]. Production of soluble recombinant proteins in both E. coli and baculovirus-infected insect cells relied on the following process: (1) Bioinformatic analysis of the protein sequence, to predict soluble domains and their boundaries.
described in the web site. A detailed analysis of a subset of protein domains produced at the Oxford SGC has been published [3], also providing guidelines for construct design. Figure 2 shows the success rates in expressing human protein domains in E. coli (Fig. 2a) and in insect cells (Fig. 2b). Figure 2c summarises the approaches used for production and purification of the proteins that yielded crystal structures. A clear outcome of the parallel testing of multiple constructs has been the identification, for a large fraction of targets, of domains that can be expressed as stable, soluble proteins in relatively high yields. Once optimal constructs are identified, purification of most protein domains can be achieved using standardised procedures. All the truncated proteins represent intact, independently folded domains; these comprise enzymatic (e.g. kinase, dehydrogenase, phosphatase), molecular recognition (e.g. PDZ, 14-3-3, SH2) or regulatory (e.g. RGS) domains [3].

Processing newly prioritised targets
When facing new proteins that emerge from genetic studies or pathway analyses, the impact of the accumulated experience of the SGC and similar organisations can be two-fold. First, many of the new genes of interest may already have been produced in the SGC; alternatively, for novel targets, the well-tried methods can be used to rapidly generate and identify constructs that produce soluble proteins. To test this, we attempted to handle a set of non-membrane proteins suggested by collaborators, with no consideration of prior work at the SGC or of predicted tractability. The 15 selected proteins were new to us; a few were purified previously by other groups, while some were not reported to be purified. Table 2 summarises the cloning and testing processes performed on each of the targets, within a time frame of three months. In all cases, we have applied our standard construct design and evaluation principles regardless of prior knowledge; when a protein structure was known, the designed construct boundaries closely clustered around the published structural boundaries. Constructs for cytoplasmic expression were cloned in parallel into four vector systems [3]: the E. coli vectors pNIC28-Bsa4 (N-terminal His6 tag), pNIC-CTHF (C-terminal His6 + Flag tag), pNH-TrxT (N-terminal His6/Thioredoxin tag) and the baculovirus transfer vector pFB-LIC-Bse (N-terminal His6 tag). Four targets encoding extracellular or secreted domains were expressed in E. coli as fusions to the bacterial secreted protein OsmY [5]. Equivalent constructs were cloned into a baculovirus transfer vector, fused to a signal peptide of baculovirus gp64.

TABLE 3 (footnote)
Figs. 3 and 4. The numbering corresponds to the annotations in the figures. The molecular weights are of the proteins produced from the vector pNIC28-Bsa4, which include a His6 tag only. The C-terminal biotinylation tag adds 2.6 kDa. The column marked Biotin/unmodified summarises the results of test expression of the C-terminal tagged proteins, comparing the yield in presence and absence of biotin in the growth medium (Fig. 3 and similar experiments). '0' indicates no effect, '-' indicates a reduction in yield in the presence of biotin, and N/D indicates that the yields were too low to compare.


FIGURE 4
Effect of the N-terminal tag on protein yields. The 24 genes listed in Table 3 were cloned into vectors with a His6 tag (panels A and B) or a His10 tag (panels C and D). Each pair of adjacent lanes differs by the absence (0) or presence (B) of a C-terminal biotin tag, which adds 2.6 kDa to the protein mass. In panels A and B, the genes are cloned into pNIC28-Bsa4 (lanes 0) and pNIC-Bio3 (lanes B). In panels C and D, the genes are cloned into pNIC-H102 (lanes 0) and pNIC-Bio2 (lanes B). 1-ml cultures of each clone were induced in the presence of 100 µM biotin in the culture medium. The recombinant proteins were extracted and analyzed as in Fig. 3.
(N-terminal His6 tag) in bacteria provided soluble constructs for the majority of cytoplasmic targets tested. One gene (RCC1) only yielded substantial levels of soluble expression in E. coli with either the C-terminal tag vector or a large fusion tag (thioredoxin), and another (TEX9) was only soluble with the C-terminal tag vector. Two other genes (PLEKHH1 and the enzymatic domain of HCK) could only be expressed in bacteria with a thioredoxin tag. The recently introduced OsmY fusion [5] allowed the production of secreted domains of SPON1, CST3 and FAT3; the secreted proteins could be harvested in approximately equal amounts from the culture supernatants and from the periplasm. One construct was selected from each gene for large-scale (1-4 L) purification; all proteins showed well-defined peaks on gel filtration, and were confirmed by mass spectrometry. Finally, expression of ITGBV (Integrin β5) was attempted in insect cells as a near full-length protein, in combination with integrin α5, but the levels were marginal. This target may require more extensive optimisation or expression as fragments.
In summary, we have been able to produce soluble domains of 14 out of 15 targets tested, with yields of several milligrams. This, together with earlier data from the SGC and others, demonstrates the feasibility of providing soluble domains for the majority of novel targets emerging from genetic and systematic studies of disease pathways.

In vivo biotinylation
Biotinylation of the antigen is often the method of choice for protein immobilisation for selecting and evaluating affinity reagents. In vitro biotinylation is frequently used, whereby lysine residues in the antigen are chemically modified. However, biotinylation of a short acceptor peptide in vivo is an attractive method to achieve site-specific modification without the risk of interfering with protein folding or function of the antigen. In vivo biotinylation is achieved by co-expressing the protein of choice (fused to a biotin acceptor peptide) and the bacterial biotin-protein ligase (BirA) in the presence of biotin. To investigate factors that affect the yield and homogeneity of biotinylated proteins, we tested a set of 24 human proteins using two vector systems. Biotin acceptor tags were added at the C-termini, and oligohistidine sequences (cleavable with TEV protease) were added at the N-termini for protein purification. Based on our sporadic observations, we tested both hexahistidine (His6) and decahistidine (His10) tags. The latter could be useful for some applications (e.g. SPR), but may decrease protein yields because of aggregation.
Twenty-four diverse human protein domains (Table 3) were cloned into each of four vectors, generating combinations of His6 or His10 tags, with or without biotinylation sites (vectors pNIC-Bio3, pNIC-Bio2, pNIC28-Bsa4 and pNIC-H102, see Fig. 1). Soluble protein production was tested in triplicate small-scale cultures in the presence of 100 µM biotin; the yield of soluble protein was evaluated using SDS-PAGE of fractions eluted from Ni-NTA beads. In separate experiments, the clones in the His6 vectors pNIC28-Bsa4 and pNIC-Bio3 were tested in the absence of biotin or in the presence of 50 and 100 µM biotin. Figure 3 shows a representative experiment, comparing protein production in the absence (lanes marked '−') and presence ('+') of 50 µg/ml biotin. Panels A and B show expression of protein domains cloned in pNIC28-Bsa4, lacking a biotinylation signal.
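
The size of this experiment maps naturally onto the 96-well format: 24 domains in 4 vectors give 96 clones, so one deep-well plate holds one replicate and triplicates require three plates (288 cultures). The sketch below builds one such plate map. The well assignment (row-major, with the four vectors of each domain in adjacent wells) is an assumption for illustration, since the plate layout actually used is not described.

```python
import string

# Vector names as given in the text; the ordering here is arbitrary.
VECTORS = ["pNIC28-Bsa4", "pNIC-Bio3", "pNIC-H102", "pNIC-Bio2"]
N_DOMAINS = 24


def plate_layout(n_domains=N_DOMAINS, vectors=VECTORS):
    """Map 96-well positions (A1..H12, row-major) to (domain number, vector)."""
    wells = [f"{row}{col}" for row in string.ascii_uppercase[:8] for col in range(1, 13)]
    clones = [(domain, vector) for domain in range(1, n_domains + 1) for vector in vectors]
    if len(clones) > len(wells):
        raise ValueError("more clones than wells on a 96-well plate")
    return dict(zip(wells, clones))


layout = plate_layout()
replicates = 3
print(len(layout), "clones per plate;", len(layout) * replicates, "cultures in total")
print("A1:", layout["A1"], " H12:", layout["H12"])
```

Keeping the four vector variants of one domain in neighbouring wells would place them in adjacent gel lanes, which is convenient for the pairwise comparisons described in the text.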
In general, the intensity of the stained bands in each pair of lanes (−/+ biotin in the growth medium) is similar. Panels C and D show expression of protein domains cloned in pNIC-Bio3, which contains a C-terminal biotin acceptor site. Here, the picture is different: a fraction of clones (e.g. clones 1, 4, 5, 9, 12, 15) show significantly lower yield (2-7-fold) of protein in the presence of biotin. Furthermore, some clones (e.g. 17, 18) yield very little protein with the C-terminal tag compared with the untagged protein, regardless of the biotin concentration in the culture medium. These effects are protein-specific, as several clones (e.g. 8, 10, 11, 22 and 24) are indifferent to the presence of biotin. The results of this (and replicate) experiment(s) are summarised in Table 3.
Figure 4 shows a representative experiment comparing the two N-terminal purification tags: His6 (panels A and B) and His10 (panels C and D). In each pair of lanes, the lane marked '0' is the protein lacking the C-terminal biotin tag, and the lane marked 'B' is the C-terminal tagged protein. The difference in size between each pair represents the 2.6 kDa tag (peptide + biotin). Comparing the recovery of purified proteins from the His6 and His10 vectors (panels A vs. C and B vs. D), the results are gene-specific. However, there is a tendency for lower yields of the His10-tagged proteins relative to the His6-tagged counterparts.
FIGURE 5
Comparison of expression of SH2 domains with (Y-axis) or without (X-axis) a C-terminal biotin acceptor tag. 35 human SH2 domains were cloned into the vectors pNIC-H102 (no biotin tag) and pNIC-Bio2 (C-terminal biotin tag). Each of the resulting 70 clones was used in a 1.5-litre expression culture, in the presence of 50 µM biotin. The proteins were purified and the yields were measured as described. Each spot represents the yields from one gene in both vectors, in mg of purified protein/L of culture. The dotted line (x = y) is overlaid to indicate that, for most proteins, the yield of biotinylated protein is lower than the corresponding protein lacking the biotin tag.

Small-scale experiments provide only a semi-quantitative estimate of protein yields. We tested a separate set of 35 SH2 domains cloned into pNIC-H102 and pNIC-Bio2, at a production scale of 1.5 L (in the presence of 50 µM biotin). The proteins were purified using a standard two-step procedure (IMAC and gel filtration), and the yields were measured. Figure 5 shows the comparison of the yields of proteins expressed with or without the C-terminal tag. The graph shows the considerable scatter of the results; most of the points are below the diagonal (x = y; dotted line), illustrating that the yield of biotinylated proteins is usually lower than that of the corresponding clone lacking the biotin acceptor peptide. The average reduction in yield is only 30%, but 6/35 clones tested showed more than 5-fold reduction in yield. Although low yields can be overcome by increasing culture volumes, it may be worth testing in individual cases whether the biotin tag affects the stability or solubility of the purified protein.
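
The comparison of paired yields can be summarised as in the following sketch. The yield values are invented for illustration and are not data from this study. The sketch also assumes that the "average reduction" is one minus the mean of the per-gene tagged/untagged yield ratios, which is only one of several reasonable definitions, and the way the published 30% figure was calculated is not stated.

```python
from statistics import mean


def summarise_tag_effect(yields, fold_threshold=5.0):
    """Summarise paired yields given as {gene: (untagged_mg_per_l, tagged_mg_per_l)}."""
    # Ratio < 1 means the biotin-tagged clone gave less protein (point below x = y).
    ratios = {gene: tagged / untagged
              for gene, (untagged, tagged) in yields.items() if untagged > 0}
    return {
        "below_diagonal": sum(1 for r in ratios.values() if r < 1.0),
        "strongly_reduced": sorted(g for g, r in ratios.items() if r < 1.0 / fold_threshold),
        "mean_reduction": 1.0 - mean(ratios.values()),
    }


# Hypothetical yields (mg/L) for three genes; not measurements from this work.
example = {"SH2_A": (10.0, 8.0), "SH2_B": (20.0, 2.0), "SH2_C": (5.0, 6.0)}
summary = summarise_tag_effect(example)
print(summary["below_diagonal"], "of", len(example), "genes below the diagonal")
print("more than 5-fold reduced:", summary["strongly_reduced"])
print(f"mean reduction: {summary['mean_reduction']:.0%}")
```

A mean of ratios can hide the few strongly affected clones, which is why the count of clones beyond a fold-change threshold is reported separately.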
The precise masses of the purified proteins were evaluated using mass spectrometry (representative results are shown in Fig. 6). For all proteins expressed with the biotin acceptor tag, >90% of the purified protein was biotinylated. We could also obtain mass measurements for the highly expressed proteins purified from the small-scale cultures. In all cases, fully biotinylated proteins were observed when the culture medium included 50 or 100 µM biotin. No biotinylation was seen when the medium did not include added biotin. The lower concentration of biotin is sufficient for full biotinylation of all proteins included in this study; however, we have encountered a small number of (highly expressed) proteins where higher concentrations of biotin were required.
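
The mass-based check of biotinylation can be sketched as follows. Covalent attachment of one biotin adds about 226.3 Da (the molar mass of biotin, 244.31, minus one water lost on amide bond formation). The tolerance, the peak list and the use of peak intensities as a proxy for abundance are assumptions for illustration, and deconvoluted ESI-MS intensities are at best semi-quantitative.

```python
# Average mass added by one biotin: biotin (244.31 Da) minus water (18.02 Da).
BIOTIN_SHIFT_DA = 244.31 - 18.02


def classify_mass(observed_da, unmodified_da, tol_da=2.0):
    """Assign a deconvoluted mass to the unmodified or singly biotinylated form."""
    delta = observed_da - unmodified_da
    if abs(delta) <= tol_da:
        return "unmodified"
    if abs(delta - BIOTIN_SHIFT_DA) <= tol_da:
        return "biotinylated"
    return "unassigned"


def fraction_biotinylated(peaks, unmodified_da, tol_da=2.0):
    """Fraction of assigned peak intensity in the biotinylated form.

    peaks is a list of (mass_da, intensity); returns None if nothing is assigned.
    """
    totals = {"unmodified": 0.0, "biotinylated": 0.0}
    for mass, intensity in peaks:
        label = classify_mass(mass, unmodified_da, tol_da)
        if label in totals:
            totals[label] += intensity
    assigned = totals["unmodified"] + totals["biotinylated"]
    return totals["biotinylated"] / assigned if assigned > 0 else None


# Hypothetical spectrum for a tagged protein with a predicted mass of 15000.0 Da.
peaks = [(15226.4, 95.0), (15000.3, 5.0), (15500.0, 3.0)]
print(f"{fraction_biotinylated(peaks, 15000.0):.0%} biotinylated")
```

Here `unmodified_da` is the predicted mass of the tagged protein before biotin attachment, so the 2.6 kDa acceptor peptide is already included in it.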
The guidelines that emerge from these biotinylation experiments, and from other experiments not shown, are: (1) Addition of a biotin acceptor tag can affect protein expression in unpredicted ways, often leading to reduced yields. (2) Addition of 50 µM biotin to the culture medium is generally sufficient to achieve full biotinylation, although special cases of highly expressed genes may require adding 100 µM or more. (3) A host strain that expresses BirA as well as rare-codon tRNAs gives optimal, consistent results for eukaryotic genes. (4) As always with protein production, individual proteins may require specific optimisation of the induction, extraction and purification conditions.

Concluding remarks
Earlier studies have shown the synergy between the protein-producing capacity of structural biology and high-throughput production of affinity reagents [4,20]. In the original studies [4,20], a set of purified protein domains from the SH2 family was produced, and recombinant or monoclonal binders to most of them were obtained within a short time span. The panel of related protein domains provided an excellent platform for assessing the selectivity and the binding affinities of the binders. The present work explores the possibility of extending the antigen space to a wider variety of human proteins, especially those associated with disease or with signalling networks. The panels of purified proteins produced through the activity of large-scale structural biology programs already include >1000 human proteins of interest. The small pilot study reported here shows that processing of new proteins to generate soluble domains can be achieved with high efficiency. Finally, the initial results of transferring previously expressed proteins to an in vivo biotinylation system show that, although the protein yields are sometimes lower, it is possible to routinely achieve complete biotinylation of all proteins tested.
The ability to rapidly provide purified proteins (and, subsequently, affinity reagents) from novel genes that emerge from functional and genetic studies can provide a major opportunity to understand the roles of these proteins and their suitability as targets for clinical intervention.