High-resolution Antibody Array Analysis of Childhood Acute Leukemia Cells*

Acute leukemia is a disease pathologically manifested at both genomic and proteomic levels. Molecular genetic technologies are currently widely used in clinical research. In contrast, sensitive and high-throughput proteomic techniques for performing protein analyses in patient samples are still lacking. Here, we used a technology based on size exclusion chromatography followed by immunoprecipitation of target proteins with an antibody bead array (Size Exclusion Chromatography-Microsphere-based Affinity Proteomics, SEC-MAP) to detect hundreds of proteins from a single sample. In addition, we developed semi-automatic bioinformatics tools to adapt this technology for high-content proteomic screening of pediatric acute leukemia patients. To confirm the utility of SEC-MAP in leukemia immunophenotyping, we tested 31 leukemia diagnostic markers in parallel by SEC-MAP and flow cytometry. We identified 28 antibodies suitable for both techniques. Eighteen of them provided excellent quantitative correlation between SEC-MAP and flow cytometry (p < 0.05). Next, SEC-MAP was applied to examine 57 diagnostic samples from patients with acute leukemia. In this assay, we used 632 different antibodies and detected 501 targets. Of those, 47 targets were differentially expressed between at least two of the three acute leukemia subgroups. The CD markers correlated with immunophenotypic categories as expected. From non-CD markers, we found DBN1, PAX5, or PTK2 overexpressed in B-cell precursor acute lymphoblastic leukemias, LAT, SH2D1A, or STAT5A overexpressed in T-cell acute lymphoblastic leukemias, and HCK, GLUD1, or SYK overexpressed in acute myeloid leukemias. In addition, OPAL1 overexpression corresponded to ETV6-RUNX1 chromosomal translocation. In summary, we demonstrated that SEC-MAP technology is a powerful tool for detecting hundreds of proteins in clinical samples obtained from pediatric acute leukemia patients. 
It provides information about protein size and reveals differences in protein expression between particular leukemia subgroups. Forty-seven of the SEC-MAP-identified targets were validated by other conventional methods in this study.

The analysis of proteins and protein modifications can elucidate the pathological mechanisms of leukemia or clarify the response mechanisms to current and emerging therapies. Currently, flow cytometry is used in clinical laboratories to analyze dozens of proteins that are expressed by leukemic cells (8,9). These proteins, which are mostly surface CD markers, can reflect lineage commitment, developmental status and even the underlying genetic lesion (10, 11), but they do not carry information about the intracellular processes that control malignant transformation. Moreover, many cancer alterations are manifested only at the functional level, including changes in subcellular localization, post-translational modification (e.g. phosphorylation), protein cleavage, or protein-protein interactions (12). Proteomic techniques that can capture disease-associated changes are needed. Mass spectrometry (MS) is presently the technique of choice for large-scale proteomic analysis. MS can uncover thousands of molecules without an a priori probe selection, e.g. new disease-associated features in B-cell precursor acute lymphoblastic leukemia (BCP-ALL) (13,14). Despite its tremendous analytical power, MS is complex and not widely accessible. Unlike MS, affinity proteomics is a simple technology suitable for large-scale protein analysis in primary cancer samples in clinical laboratories. Recently, a technique linking size exclusion chromatography (SEC) to microsphere-based antibody arrays (microsphere-based affinity proteomics (MAP)) has been developed (15,16). SEC-MAP enables the detection of hundreds of proteins in a single sample and provides essential information about protein size. Because only five to ten million cells are necessary, SEC-MAP can serve as a sensitive, sample-sparing and high-content tool for protein profiling in leukemia samples, probing the relative amounts of different proteins as well as protein size and cleavage (17).
Our in-house-assembled MAP array is a set of 1152 populations of fluorescent-labeled microbeads, each carrying an antibody against a single human antigen. Native cellular proteins (and their complexes) are isolated from cellular compartments using detergents, labeled with biotin (biotin-PEO4-NHS) and subjected to SEC to obtain 24 size fractions. The SEC fractions are incubated with MAP microbeads, and antibody-protein binding is detected by flow cytometry using phycoerythrin (PE)-labeled streptavidin. The flow cytometer resolves the color code of each microbead population and reads the amount of bound protein. The data from 24 SEC fractions are combined, and a protein's binding relative to its size is detected as a "protein entity." Data are analyzed with in-house R-based software. This approach permits automatic batch processing of raw flow cytometry standard (FCS) files in addition to advanced analyses, including quality control steps (the minimal number of microspheres required in a population and the unimodality of the signal in the PE channel are checked) (17). We wanted to find out whether SEC-MAP can be used in the clinical laboratory to provide biologically important information, e.g. to classify acute leukemias or to find markers with prognostic relevance. We assembled MAP arrays to carry antibodies against proteins that are known to be important for leukemia diagnostics (18,9) and against components of intracellular signaling networks (16). Through extensive testing on leukemia samples, we have identified antibodies that are suitable for immunoprecipitation-based techniques. Furthermore, we have improved the software tools to allow for large-scale data normalization, fast automatic protein entity detection with manual correction, and the discovery of differentially expressed entities in multiple samples. Using these software tools, we have identified entities that were differentially expressed between particular acute leukemia (AL) subgroups.
To ensure specificity, we have validated the data collected by SEC-MAP with classical flow cytometry-based immunophenotyping (FACS), Western blot (WB) and quantitative real-time PCR (qRT-PCR). Moreover, we have addressed practical sample processing issues related to patient material handling and logistics. Based on the protein size profile, we were able to discriminate proteolytically degraded samples from those with an uncleaved proteome. Importantly, such proteolysis would be missed by conventional protein load controls in Western blots. Thus, the SEC-MAP array was demonstrated to be a useful, reproducible and accurate high-content proteomic tool for the assessment of primary leukemia samples.

EXPERIMENTAL PROCEDURES
Patient Samples-Fifty-seven bone marrow samples obtained at diagnosis from patients with acute leukemia were included in the project. The study was approved by the institutional review board, and informed consent was obtained from the patients and their guardians in accordance with the Declaration of Helsinki. The samples (in K3-EDTA tubes) were routinely assessed by flow-cytometry-based immunophenotyping, as previously described (19), and were classified as B-cell precursor acute lymphoblastic leukemia (BCP-ALL, n = 35), T-cell acute lymphoblastic leukemia (T-ALL, n = 9), and acute myeloid leukemia (AML, n = 13). Leukemic blasts (10 million cells) were separated by a Ficoll-Paque gradient (GE Healthcare, Uppsala, Sweden). BCP-ALL and T-ALL samples presented with a median blast percentage of 88%. AML samples with lower blast counts were enriched using a custom-made magnetic negative separation kit (Stem Cell Technologies, Vancouver, Canada), according to the manufacturer's instructions, to an average purity of 81%. In brief, the custom mixture contained antibodies against CD3, CD8, CD19, CD20, glycophorin A (Gly-A), and CD56 to deplete T-lymphocytes, B-lymphocytes, erythrocytes and NK-cells. Upon binding to their targets, the antibodies were attached to magnetic nanoparticles and depleted from the sample using a magnet. Table I summarizes the patients' characteristics, including the type of AL, the age at diagnosis, molecular lesions (presence of fusion gene or hyperdiploidy), risk-group stratification based on Berlin-Frankfurt-Münster (BFM) treatment protocols (20), and outcome (complete remission (CR) versus relapse (R)).
Cell Lines-The cell lines NALM-6, REH, RS4;11, SUP-B15, TOM-1 (all BCP-ALL), CEM, JURKAT (all T-ALL), K562 (chronic myelogenous leukemia, CML), BV-173 (CML in BCP-ALL blast crisis), NB-4, and MV4;11 (all AML) were purchased from the German Collection of Microorganisms and Cell Cultures (Braunschweig, Germany). All the cell lines were cultured in RPMI 1640 with 25 mM HEPES, L-glutamine, 100 U/ml penicillin and 100 mg/ml streptomycin (Lonza, Basel, Switzerland) [...]

TABLE I (fragment)
Patient   Type of AL   Age at diagnosis   Molecular lesion   Risk group   Outcome
10        T-ALL        14                 None               MR           CR
12        B-ALL        13                 None               HR           CR
13        B-ALL        3                  None               MR           CR
15        B-ALL        9                  None               HR           CR
17        B-ALL        2                  None               SR           CR
26        T-ALL        2                  None               MR           R
27        T-ALL        15                 None               HR           CR
29        B-ALL        4                  None               SR           CR
31        B-ALL        15                 None               HR           CR
33        T-ALL        18                 None               HR           CR
35        B-ALL        13                 None               HR           CR
36        B-ALL        4                  None               SR/MR        CR
44        T-ALL        8                  None               SR/MR        CR
46        B-ALL        14                 None               MR           R
49        T-ALL        14                 None               HR           CR
50        T-ALL        17                 None               MR           CR
51        T-ALL        8                  None               MR           CR
52        B-ALL        17                 None               SR           CR
53        B-ALL        7                  None               MR           CR
57        B-ALL        5                  None               HR           CR
16        AML          15                 AML1-ETO           SR           CR
40        AML          10                 CBFb/MYH11         SR           CR

[...] (buffy coats were obtained from The Institute of Hematology and Blood Transfusion, Prague, Czech Republic) were isolated using a Ficoll-Paque gradient (GE Healthcare). When needed, B-cell or T-cell enrichment using the appropriate RosetteSep kit (Stem Cell Technologies) or Myeloid Enrichment kit (as above) was performed, for an average purity of 92% (79-95%).
Microsphere-based Affinity Proteomics (MAP) Arrays-Bead arrays were produced as previously described (15), with more extended color-coding as follows: 2 levels of Alexa Fluor 750 (Ax750), 6 levels of Alexa Fluor 488 (Ax488) and Alexa Fluor 647 (Ax647), and 4 levels of Pacific Blue (PB) and Pacific Orange (PO) labeling intensity. Unique capture antibodies were attached to color-coded beads, and the full spectrum of beads was mixed to form a bead array.
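The size of such a color code follows from multiplying the number of intensity levels per dye. The following is a minimal Python sketch of that arithmetic, assuming that every combination of levels is used as one bead population; the dye abbreviations come from the text, whereas the variable and function names are illustrative only.

```python
from itertools import product

# Number of labeling-intensity levels per dye, as listed in the text.
DYE_LEVELS = {
    "Ax750": 2,
    "Ax488": 6,
    "Ax647": 6,
    "PB": 4,
    "PO": 4,
}


def enumerate_color_codes(dye_levels):
    """Return every combination of intensity levels, one tuple per bead population."""
    ranges = [range(n) for n in dye_levels.values()]
    return list(product(*ranges))


codes = enumerate_color_codes(DYE_LEVELS)
print(len(codes))  # 2 * 6 * 6 * 4 * 4 = 1152
```

Under this assumption the code space holds 1152 distinct populations, which agrees with the size of the in-house array stated in the introduction.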
Antibodies for MAP Array Assembly-Potential predictors of the treatment response and relapse risk for pediatric ALL patients were selected from the following published expression profiling studies. (1) Cario et al. (7) investigated the expression profiles of diagnostic samples from children with BCP-ALL without detected fusion genes (BCR-ABL1, ETV6-RUNX1, and MLL-AF4) and hyperdiploidy. Patients were stratified according to the minimal residual disease (MRD) level at day 33 and week 12. Genes that were highly expressed in high risk (HR) or standard risk (SR) groups were selected. (2) Bhojwani et al. (6) used the diagnostic samples from a group with high-risk BCP-ALL (children older than 10 years and/or with white blood counts (WBC) above 50,000 cells/µl). Genes that were highly expressed in the following groups of patients were selected: rapid early responders (less than 5% blasts at day 7), slow early responders (more than 25% blasts at day 7), complete clinical remission (CCR) within at least 4 years, and relapse before 3 years from diagnosis. (3) Flotho et al. (5) showed that the expression profiles of pediatric ALL samples were correlated with MRD positivity at day 19. Relapse predictors were selected from genes that were differentially expressed in patients with MRD positivity at day 19. Based on these published predictors of prognosis, we selected the available antibodies that were specific for their respective proteins for the MAP array assembly. In our previous studies, we reanalyzed published gene expression data from Yeoh et al. (4) to identify genes associated with genotypic subtypes and the risk of relapse. This work revealed a correlation of drebrin (DBN1), WW domain binding protein 1-like (WBP1L or OPAL1), and chloride intracellular channel protein 5 (CLIC5) expression with the ETV6-RUNX1 genotype and led to the preparation of their respective antibodies for flow cytometry and MAP array detection (21).
Moreover, antibodies from Erasmus MC (Rotterdam, The Netherlands) were included based on their specificity against leukemia fusion proteins (22,23). The 632 antibodies that were attached to microbeads in the MAP arrays (array #1 and array #2) are listed in supplemental Table S1, together with a reference to the respective expression profiling study and the interpretation of each predictor gene.
Cell Lysis and Protein Labeling-Unless otherwise stated, chemicals were bought from Sigma-Aldrich (St. Louis, MO). Cells were suspended in 50 mM HEPES, 10 mM MgCl2, 140 mM NaCl and 0.1% (v/v) Tween 20, pH 8, and lysed by performing a freeze-thaw step on dry ice. The lysis buffer was supplemented immediately before use with 2 mM PMSF, proteinase inhibitors and phosphatase inhibitors (catalogue nos. P8340 and P5736, respectively) according to the manufacturer's instructions. Membranes and nuclei were pelleted by centrifugation at 21,255 × g for 5 min at 4°C. The supernatant was harvested as the hydrophilic (cytoplasmic) fraction. The pellet was solubilized with 1% (w/v) n-dodecyl beta-D-maltoside and 280 mM NaCl and sonicated 4× for 10 s each time to extract membrane-associated and nuclear proteins. This detergent-soluble fraction was incubated for 20 min on ice and then centrifuged at 21,255 × g for 10 min at 4°C. The supernatants of both fractions were incubated for 15 min on ice with protein G (GE Healthcare) to deplete free immunoglobulins from the serum, and were centrifuged at 21,255 × g for 1 min at 4°C. The amount of total protein was determined with a bicinchoninic acid (BCA) Protein Assay kit (Thermo Fisher Scientific) according to the manufacturer's instructions. The supernatants, typically containing 1.5 and 0.5 mg/ml of total protein, respectively, were labeled with 1 mg/ml biotin-PEO4-NHS (Thermo Fisher Scientific) for 30 min at 4°C.
Size Exclusion Chromatography-Biotinylated proteins (280 µl) were filtered through a 0.2 µm centrifuge filter (Millipore, Billerica, MA), loaded onto a Superdex 200, 10/300 column (GE Healthcare), and separated on an Äkta FPLC system (GE Healthcare) at 4-8°C at a flow rate of 0.5 ml/min. The running buffer consisted of PBS with 0.05% (v/v) Tween 20 and 1 mM EDTA. Twenty-four fractions of 0.5 ml were collected and frozen at −80°C (for at least 24 h).
Immunoprecipitation-Frozen aliquots of the bead array suspensions were thawed, pelleted, and resuspended in PBS with 1% (w/v) casein (Thermo Fisher Scientific) and 20 µg/ml of nonimmune mouse and goat gamma globulins (Jackson ImmunoResearch, West Grove, PA). Ten microliters of the suspension was added to the wells of 96-well polypropylene PCR plates (Axygen, Union City, CA). Thirty microliters and 60 µl of hydrophilic and detergent-soluble fractionated proteins were added, respectively, and the volume was adjusted to 180 µl with PBS containing 1% (v/v) Tween 20 and 1 mM EDTA. The wells were capped, and the plates were rotated overnight at 4-8°C in the dark. The beads were then pelleted by centrifugation, washed three times in PBS with 1% (v/v) Tween 20 and 1 mM EDTA, and labeled with streptavidin-PE (2 µg/ml in PBS with 1% (w/v) BSA, Jackson ImmunoResearch). Labeled beads were washed three times in PBS with 1% (v/v) Tween 20 and 1 mM EDTA and analyzed by flow cytometry.
Western Blots-The cells were suspended in 50 mM HEPES, 10 mM MgCl2, 140 mM NaCl and 0.1% (v/v) Tween 20, pH 8, and lysed by performing a freeze-thaw step on dry ice. Immediately before use, the lysis buffer was supplemented with 2 mM PMSF, proteinase inhibitors and phosphatase inhibitors (catalogue nos. P8340 and P5736, respectively) according to the manufacturer's instructions. The lysate was further solubilized with 1% (w/v) n-dodecyl beta-D-maltoside and 280 mM NaCl and sonicated 4× for 10 s each time to extract membrane-associated and nuclear proteins. The lysate was incubated for 20 min on ice and then centrifuged at 21,255 × g for 10 min at 4°C. The supernatant was incubated for 15 min on ice with protein G (GE Healthcare) to deplete free immunoglobulins from the serum and then centrifuged at 21,255 × g for 1 min at 4°C. Supernatants containing 0.5, 1, and 2 mg/ml of protein were diluted 1:1 with Laemmli reducing sample buffer and heated to 90°C for 5 min. Eighty, 40, and 20 µg of protein were separated in SDS-PAGE gels and transferred to nitrocellulose membranes (Bio-Rad, Hercules, CA). The membranes were blocked in 7.5% (w/v) low-fat bovine milk in PBS with 0.05% (v/v) Tween 20 at 8°C overnight. To detect human proteins by Western blot, primary antibodies were used together with the Bio-Rad immunodetection system (Bio-Rad).
Immunophenotyping Leukemia Cell Lines and Primary Leukocytes-The expression levels of three intracellular and twenty-eight surface markers commonly used in leukemia immunophenotyping (9) (supplemental Table S3) were analyzed in 11 leukemic cell lines, as well as in purified peripheral blood B-lymphocytes, T-lymphocytes and monocytes that had been isolated from healthy donors. All incubation steps were performed at room temperature in the dark. Aliquots of 1 × 10^5 cells were incubated with antibodies (according to the manufacturer's instructions) for 15 min and washed once in PBS.
Apoptosis-The cells were washed once in Annexin V Binding Buffer (Exbio Praha a.s.). The cell pellet was supplemented with propidium iodide (PI, Miltenyi Biotec), Annexin V-Dy647 (Exbio Praha a.s.) and CD45 PerCP-Cy5.5 (clone HI30, BioLegend) and incubated for 30 min on ice in the dark. The cell pellet was washed once in Annexin V Binding Buffer. The data were collected with an LSR II flow cytometer (BD Biosciences) and analyzed with FlowJo software (Treestar). The cells were gated according to their forward scatter (FSC), side scatter (SSC) and CD45 positivity.
RNA Extraction, Reverse Transcription, Quantitative Real-time PCR-The cells were treated with an RNA extraction kit (RNeasy Micro Kit, Qiagen, Hilden, Germany). DNase-treated RNA was then transcribed into cDNA (iSCRIPT, Bio-Rad), and diluted cDNA was used as a template for quantitative real-time PCR (qRT-PCR). The qRT-PCR experiments are described according to the minimum information for publication of quantitative real-time PCR experiments (MIQE) recommendations (24). The qRT-PCR system was based on commercially available hydrolytic probes (TaqMan gene expression assays, Life Technologies). For the quantification cycle (Cq) value assessment, LinReg software was used to avoid bias resulting from subjective evaluation (25). Normalized gene expression was then assessed by the ΔCq method. The appropriate combination of internal controls was obtained by intra- and intergroup variation analysis using the NormFinder tool (26).
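As an illustration of the ΔCq normalization, the following Python sketch computes a relative expression value from Cq values. It rests on assumptions that are not stated in the text: the reference level is taken as the arithmetic mean Cq of the internal controls, and the amplification efficiency is set to a perfect doubling per cycle. The function names and the numerical values are hypothetical.

```python
from statistics import mean


def delta_cq(cq_target, cq_references):
    """Delta-Cq: Cq of the target minus the mean Cq of the internal controls."""
    return cq_target - mean(cq_references)


def relative_expression(cq_target, cq_references, efficiency=2.0):
    """Relative expression computed as efficiency ** (-delta Cq).

    efficiency=2.0 assumes perfect doubling per cycle; this is an
    assumption of the sketch, not a value taken from the study.
    """
    return efficiency ** (-delta_cq(cq_target, cq_references))


# Hypothetical Cq values, for illustration only.
example = relative_expression(cq_target=26.0, cq_references=[22.0, 24.0])
print(example)  # 2 ** -(26 - 23) = 0.125
```

A target that crosses the threshold three cycles later than the averaged controls is thus reported at one eighth of the reference level under these assumptions.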
Computational Analyses and Statistics-All computations and graph visualizations were performed in R-project/Bioconductor [packages "cluster", "flowCore", "Matrix", "igraph", "rggobi", "reshape", "ggplot2", "wmtsa", available at: http://www.r-project.org/ or http://bioconductor.org/] as described earlier (17). SEC-MAP and FACS data were compared by using Pearson correlations, and p < 0.05 was considered to be statistically significant. Euclidean distance was used for the hierarchical clustering of the data, as shown in the heat maps. The significance of the results in the SEC-MAP array data sets was tested using the Multiple Testing Procedures-Bioconductor Package multtest.
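The comparison of SEC-MAP and FACS readouts by Pearson correlation can be sketched in Python as follows. This is not the R code used in the study; the function name and the example values are hypothetical, and the p value calculation is deliberately omitted.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A series without variation has no defined correlation.
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


# Hypothetical per-cell-line readouts for one marker (illustration only).
secmap_signal = [1.0, 2.0, 3.0, 4.0]
facs_mfi = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(secmap_signal, facs_mfi)
print(round(r, 6))
```

The explicit error for a constant series reflects a general property of the coefficient: it cannot be computed when one of the readouts does not vary across the samples.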
Automatically Detecting Protein Entities-In the first step, the signal (the quantity per SEC fraction) was transformed via continuous wavelet transform (27) with a Mexican hat wavelet into a 24 × 24 matrix. Each column of the matrix was weighted by the inverse of the respective scale factor to remove the bias toward broader peaks. Local bidimensional maxima were identified as candidates for the peak modus. The borders of the peaks were set to the modus ± the respective scale factor (including up to 2 fractions on both sides where the signal remained monotonic). The width of the peak was limited to 14 fractions. To assemble the peaks in step two, we defined the distance between peaks $P_1$ and $P_2$ as $d(P_1, P_2) = 1 - \min\left(\frac{A_1(P_1 \cap P_2)}{A_1(P_1)}, \frac{A_2(P_1 \cap P_2)}{A_2(P_2)}\right)$, where $A_i(P)$ is the sum of the values of the i-th signal over SEC fractions $P$. By using this metric, all the peaks for specific antibodies across all samples were clustered via the partitioning around medoids (PAM) algorithm. The number of clusters was determined as follows: for each peak p, we formed a set containing all peaks that were not further than 0.4 from peak p (with respect to the above defined metric). From these sets, we chose the minimal cover of all peaks and set the number of clusters as the size of this cover. The greedy algorithm was used to approximate the minimal set cover problem (28). Once clustered, the medians of the left-most and right-most fractions of the peaks were stored as the final definition of each entity.
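The peak distance and the greedy estimate of the number of clusters can be sketched in Python as shown below. This is an illustration under stated assumptions rather than the in-house R implementation: each peak is represented as a pair of a 24-value signal and a set of 0-based fraction indices, peaks with a non-positive area are treated as maximally distant, and the PAM clustering itself is not included. All names and example values are hypothetical.

```python
def area(signal, fractions):
    """A(P): sum of the signal values over the SEC fractions in P."""
    return sum(signal[i] for i in fractions)


def peak_distance(sig1, p1, sig2, p2):
    """d(P1, P2) = 1 - min(A1(P1 & P2) / A1(P1), A2(P1 & P2) / A2(P2))."""
    a1, a2 = area(sig1, p1), area(sig2, p2)
    if a1 <= 0 or a2 <= 0:
        return 1.0  # assumption of this sketch: empty peaks never match
    overlap = p1 & p2
    return 1.0 - min(area(sig1, overlap) / a1, area(sig2, overlap) / a2)


def estimate_cluster_count(peaks, radius=0.4):
    """Greedy approximation of the minimal cover by radius-neighborhoods."""
    n = len(peaks)
    neighborhoods = [
        {i} | {j for j in range(n)
               if peak_distance(*peaks[i], *peaks[j]) <= radius}
        for i in range(n)
    ]
    uncovered = set(range(n))
    chosen = 0
    while uncovered:
        # Pick the neighborhood that covers the most still-uncovered peaks.
        best = max(neighborhoods, key=lambda s: len(s & uncovered))
        uncovered -= best
        chosen += 1
    return chosen


def make_signal(start, values, length=24):
    """Build a flat signal with the given values placed from index start."""
    signal = [0.0] * length
    signal[start:start + len(values)] = values
    return signal


# Three hypothetical peaks: two overlapping ones and one far away.
peak_a = (make_signal(5, [1.0, 4.0, 4.0, 1.0]), {5, 6, 7, 8})
peak_b = (make_signal(6, [2.0, 6.0, 2.0, 1.0]), {6, 7, 8, 9})
peak_c = (make_signal(15, [3.0, 5.0, 3.0]), {15, 16, 17})
peaks = [peak_a, peak_b, peak_c]

d_ab = peak_distance(*peak_a, *peak_b)
n_clusters = estimate_cluster_count(peaks)
print(round(d_ab, 3), n_clusters)
```

In this example the two overlapping peaks lie within the 0.4 threshold of each other and share one neighborhood, so the greedy cover needs two sets and two clusters would be requested from PAM.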
Normalization of SEC-MAP Signal-To normalize the data, we used two different methods, depending on the data distribution. In hydrophilic cytoplasmic fractions, the majority of protein entities were unchanged among the samples. The samples were mapped via loess transformation on the reference sample (by default, the sample with the highest protein load was chosen as the reference sample) after background subtraction. For detergent-soluble fractions, when analyzing the membrane fractions of different cell lineages, only background subtraction was employed. The background was defined as the 30% quantile of the median fluorescence intensity (MFI) for the phycoerythrin (PE, 586/15) channel in empty bead populations (microbeads with no primary antibody bound) for each SEC fraction.
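The background subtraction step can be illustrated with a short Python sketch. It assumes a linear-interpolation quantile and clips negative values to zero after subtraction; neither detail is specified in the text, the loess mapping is not shown, and all names and values are hypothetical.

```python
def quantile(values, q):
    """Quantile with linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values given")
    pos = (len(ordered) - 1) * q
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def subtract_background(signal_by_fraction, empty_bead_mfi_by_fraction, q=0.3):
    """Subtract the per-fraction background from one antibody's SEC profile.

    The background of a fraction is the q-quantile of the MFI values of the
    empty bead populations in that fraction. Clipping at zero is an
    assumption of this sketch.
    """
    corrected = []
    for mfi, empties in zip(signal_by_fraction, empty_bead_mfi_by_fraction):
        background = quantile(empties, q)
        corrected.append(max(mfi - background, 0.0))
    return corrected


# Two hypothetical SEC fractions with four empty bead populations each.
signal = [119.0, 3.0]
empty_beads = [[10.0, 20.0, 30.0, 40.0], [5.0, 5.0, 5.0, 5.0]]
corrected = subtract_background(signal, empty_beads)
print([round(v, 3) for v in corrected])
```

Computing the background separately for each SEC fraction, as described above, keeps fraction-specific nonspecific binding from being carried into the size profile.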

Protein Entities Can be Defined in a High-throughput Manner-
The primary challenge of SEC-MAP technology is the complexity of the generated data. We had to build a software tool that would be able to analyze complex proteomic data from large cohorts of samples. Fifty-seven samples from patients with acute leukemia were divided into detergent-soluble (membrane-associated and nuclear) and hydrophilic (soluble proteins in cytosol, cytoplasmic organelles and nuclei) parts by differential detergent treatment (15). Both parts were separated by size exclusion chromatography (SEC) into 24 "size" fractions; 632 antibodies were used for the detection of 501 different markers in each fraction (more antibody clones against a single protein were used when available). We built on our previous tools that were designed to batch-analyze the flow cytometry standard (FCS) data and generate size distribution profiles for the antibody targets (17). Additional software tool functionalities have now been developed to allow for high-content data normalization, semi-automated analysis of size distribution profiles, and differentially expressed entity discovery in multiple samples. Expert interpretation of the protein entities on the line plot remained the most time-consuming part of the analysis. To facilitate this effort, we devised an automatic entity detection function accompanied by a Graphical User Interface (GUI) application to define, adjust, remove and store the detected entities. We exploited the fact that entities that were detected repeatedly in the data set (Fig. 1A) were more likely to be true protein entities and were thus pre-selected for correction by an expert. Automatic entity detection was performed in four steps. First, candidate peaks (signal values for a subset of SEC fractions) were identified separately for each microbead population-antibody in every sample as described under "Experimental Procedures" (Fig. 1B). Second, the candidate peaks for each microbead population-antibody were assembled across all the samples (Fig. 1C), and final entities were defined (Fig. 1D). Third, manual adjustment using a GUI was performed when needed. Last, the borders of the entities were saved to an entity catalogue (supplemental Table S2) and the line plots were created to allow for the visual control of results (supplemental Fig. S1). Entities were automatically extracted with respect to the above definition both in hydrophilic and detergent-soluble fractions of cell lysates, and the sum of all signals within a defined entity was calculated. Then, multiple testing procedures (29) were used to detect differentially expressed entities between the acute leukemia subtypes (more details below).
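Once the borders of an entity are stored, its quantity in a sample is the sum of the signal over the fractions between those borders. A minimal Python sketch follows; it assumes 1-based, inclusive fraction borders, and the function name and the example profile are hypothetical.

```python
def entity_signal(profile, first_fraction, last_fraction):
    """Sum the signal over SEC fractions first..last (1-based, inclusive)."""
    if not 1 <= first_fraction <= last_fraction <= len(profile):
        raise ValueError("entity borders lie outside the SEC profile")
    return sum(profile[first_fraction - 1:last_fraction])


# Hypothetical 24-fraction profile with one peak in fractions 10 to 13.
profile = [0.0] * 24
profile[9:13] = [5.0, 20.0, 15.0, 5.0]
total = entity_signal(profile, 10, 13)
print(total)  # 5 + 20 + 15 + 5 = 45.0
```

Applying the same stored borders to every sample yields one value per entity and sample, which is the form of input that the subsequent differential expression testing requires.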
SEC-MAP Can Reveal Immunophenotype of Leukemia Cell Lines-To assess the performance of the SEC-MAP technology in leukemia immunophenotyping, the detection of 31 markers, including intracellular CD79a, MPO, and TdT, was tested using both methods: SEC-MAP (sum of signal per entity) and classical flow cytometry-based immunophenotyping (mean fluorescence intensity, Fig. 2A) in 11 leukemic cell lines (Fig. 2B), and in purified peripheral blood B-lymphocytes, T-lymphocytes and monocytes isolated from healthy donors. A simple linear regression model was used to compare the data. A good quantitative correlation was found for 18 markers (58%; CD2, CD3, CD4, CD5, CD7, CD10, CD13, CD15, CD22, CD33, CD44, CD45, CD58, CD72, CD74, CD79a, HLA-DR, and MPO; p < 0.05, 15 of them with p < 0.01; Pearson product-moment correlation, supplemental Fig. S2). Although we frequently tested more than one antibody clone against a particular antigen (supplemental Table S3), several antibodies were weakly effective (CD19, CD20, CD34, CD117) or ineffective (CD24, CD27, CD56) for capture in SEC-MAP. This result was, however, expected. Antibody performance is application-dependent, and all these antibodies were validated for surface staining of viable cells, but not for immunoprecipitation. Surprisingly, IgM was found in the detergent-soluble fraction by SEC-MAP in the NALM-6, REH, and SUP-B15 cell lines (Fig. 2B), whereas flow cytometry showed no surface expression of IgM. This discrepancy could reflect the fact that the SEC-MAP-based membranous detergent-soluble fractions could contain intracellular membrane-associated molecules as well, suggesting that the technical aspects of sample preparation for both methods must be taken into account. NALM-6, REH, and SUP-B15 cell lines were verified as intracellular IgM positive by FACS (data not shown). The discordance in the CD14 protein measurement was caused by a low percentage of CD14 positive monocytes in the lysate made from myeloid cells (data not shown). Finally, all cell types were positive for CD38 and CD99, which precluded the correlation calculation (supplemental Fig. S2). Next, we set out to test the reproducibility of the different array lots. For this purpose, the leukemic cell line REH (BCP-ALL) was measured with the two SEC-MAP array lots (array #1 and array #2). The REH cells were analyzed together with the other nine cell lines which were measured with array #1, using hierarchical clustering with Euclidean distance metrics and average linkage. The two SEC-MAP lots provided the same results, and the REH cells measured with the two array lots were clustered with each other, as shown in Fig. 2C. The two SEC-MAP lots were used throughout the prospective study period from 2010-2013. In summary, the SEC-MAP performance was critically dependent on the capture antibody performance. Once established, the performance of SEC-MAP was reproducible and the differential expression was comparable using SEC-MAP and flow cytometry.

FIG. 1. The x axis represents 24 size exclusion chromatography (SEC) fractions, and the y axis indicates the median fluorescence intensity (MFI) of phycoerythrin (PE) from streptavidin-labeled protein caught by the antibody-microbead population. The signals from all 57 samples are matched. The target gene name is shown above each line plot (A). The candidate protein peaks (subsets of SEC fractions for each signal) were computationally identified separately for each microbead population-antibody in every sample. The peaks are shown on one representative sample (B). The candidate peaks for each microbead population-antibody were assembled across all samples with respect to a defined metric into "protein entities" (three entities were defined for LAT). A(P) is the sum of the values for signal A belonging to SEC fraction set P. The assembly of peaks into entities was performed using PAM clustering of peaks (C). An entity definition was stored as peak borders (B and E), and the final entities were defined (D).

SEC-MAP Differentiates Between Clinical Samples from Patients With BCP-ALL, T-ALL and AML-
During the prospective study period in 2010-2013, 501 antigens (supplemental Table S1) were tested by SEC-MAP in 57 primary samples obtained at diagnosis from AL patients. Beads with 632 different antibodies bound to them were used. The entities were defined both in detergent-soluble fractions of cell lysates (779 entities) and in hydrophilic fractions (980 entities) using a combination of algorithmic entity definition and manual adjustment. Background subtraction and loess normalization plus background subtraction were used to normalize the expression within all samples for detergent-soluble and hydrophilic antigens, respectively. We sought to find differentially expressed entities in different subtypes of AL: B-cell precursor acute lymphoblastic leukemia (BCP-ALL, n = 35), T-cell acute lymphoblastic leukemia (T-ALL, n = 9), and acute myeloid leukemia (AML, n = 13). In total, 51 entities, which could be assigned to 45 proteins (including phosphorylated forms of CD45 and lymphocyte cytosolic protein 2 (LCP2)) and to one glycolipid carbohydrate (stage-specific embryonic antigen-4, SSEA4), were found to be differentially expressed between at least two of the subsets (p < 0.05) (Fig. 3A, supplemental Fig. S3). Of these 46 antigens, eleven were identified as CD markers and two as adapter molecules. Others were found to be involved in diverse cellular processes (e.g. proliferation or differentiation, Fig. 3B). Moreover, SEC-MAP identified proteins that had not previously been described in particular subtypes of pediatric acute leukemia. For example, Drebrin (DBN1) has previously been described in patients with BCP-ALL in a DNA-microarray study (6), and has been associated with ETV6-RUNX1 positivity (21). In our study, DBN1 was found in T-ALL and BCP-ALL and had a higher expression level in the BCP-ALL samples (Fig. 3C, supplemental Fig. S4).
Glutamate dehydrogenase 1, mitochondrial (GLUD1) has previously been detected in B-cell chronic lymphoblastic leukemia and in peripheral blood cells from patients with infectious mononucleosis (30). In our cohort, GLUD1 was overexpressed in the AML samples (Fig. 3C). Moreover, the signal transducer and activator of transcription 5A (STAT5A), as previously described in BCP-ALL (31) and T-cell lymphoma (32), was found to be overexpressed in T-ALL (p < 0.05) (Fig. 3C). In addition, the following well-known lineage-specific proteins were detected as expected: CD19, CD22, CD72, B-cell linker (BLNK), and paired box protein (PAX5) (33) in BCP-ALL; CCAAT/enhancer-binding protein alpha (CEBPA, phosphorylated form) (34), tyrosine-protein kinase HCK (HCK) (35), tyrosine-protein kinase SYK (SYK) (36), and protein kinase C delta type (PRKCD) (37) in AML; and CD2, CD3, CD8 (9), SH2 domain-containing protein 1A (SH2D1A) (38), and linker for the activation of T cells (LAT) (39) in T-ALL. The expression of all differentially expressed antigens is shown in supplemental Fig. S3. Finally, we tested whether we could resolve biologically significant protein signatures that could correlate with nonrandom molecular lesions (presence of fusion genes and hyperdiploidy). Because our cohort was relatively small, SEC-MAP identified only OPAL1 (WBP1L) as a highly expressed protein in ETV6-RUNX1-positive BCP-ALL samples (n = 6) compared with ETV6-RUNX1-negative BCP-ALL samples (n = 29) (p < 0.05), which corresponds to previously reported findings of higher mRNA levels in ETV6-RUNX1-positive BCP-ALL (40) (Fig. 3D). No entity could distinguish hyperdiploid cases (n = 11) from non-hyperdiploid BCP-ALL cases (n = 24).
FIG. 3. SEC-MAP identified 47 antigens that were differentially expressed in diverse subtypes of primary childhood acute leukemia samples. Forty-five proteins (including phosphorylated forms of CD45 and lymphocyte cytosolic protein 2, LCP2) and one glycolipid carbohydrate (stage-specific embryonic antigen-4, SSEA4) were differentially expressed in the three subtypes of primary AL (BCP-ALL, n = 35; T-ALL, n = 9; and AML, n = 13) according to SEC-MAP (p < 0.05). The heat map shows these 46 antigens (in rows, expressed as the sum of the MFI from PE-labeled streptavidin in the SEC fractions determining the entity) as measured by SEC-MAP in 57 patient samples (columns). Hierarchical clustering with Euclidean distance metrics and average linkage was used for the analysis (A). Four markers were found to be involved in adhesion and migration, 3 in proliferation, 1 in differentiation, 3 in apoptosis, 1 in ubiquitylation, 12 in cell signaling, 7 in transcription, and 1 in metabolism. Eleven CD antigens and two adapter molecules were identified (B). Drebrin (DBN1) was found to be highly expressed in BCP-ALL in comparison with AML (top panel). Glutamate dehydrogenase 1, mitochondrial (GLUD1) was found to be highly expressed in AML in comparison with BCP-ALL (middle panel). STAT5A was found to be highly expressed in T-ALL in comparison [...]

Verification of 47 Differentially Expressed Antigens
We set out to verify the detection of all forty-seven differentially expressed antigens that had been identified by SEC-MAP using other commonly used approaches, e.g. FACS or Western blot (supplemental Table S4). The detection of the thirteen markers that distinguished committed lineages was confirmed by flow cytometry in leukemic cell lines (e.g. CD22 or CD44, Fig. 4A). Twenty-one intracellular antigens were tested by Western blot (WB) in the leukemic cell lines and patient samples (e.g. CTBP2 or SH2D1A, Fig. 4B). When neither flow cytometry nor WB detection was available in the laboratory, qRT-PCR was used to test concordance with the mRNA findings (41) (Fig. 4C). Two different antibody clones against the same target were used to ensure specificity for the target proteins caspase 3 (CASP3), focal adhesion kinase 1 (PTK2), and ribosomal protein S6 kinase alpha-1 (RPS6KA1). Four other proteins were confirmed by being eluted from SEC as expected for their size (LCP2 and LCP2 (pY145), 60 kDa; DNA replication licensing factor MCM2, 101 kDa; and CD45 (pS999), 147 kDa). Additionally, five proteins were indirectly validated with information in the literature that indicated the presence or absence of the respective proteins in particular cell lines.
In conclusion, we could verify that SEC-MAP specifically detected all forty-seven differentially expressed antigens (supplemental Fig. S5).

SEC-MAP Revealed a Pre-analytical Degradation Process in Primary Leukemia Samples
We noted that Abelson tyrosine-protein kinase 1 (ABL1) and protein kinase B (AKT1) presented with an altered line plot pattern that was skewed toward small-sized entities in four of 39 primary BCP-ALL samples and in three of 12 primary T-ALL samples, whereas the beta-actin (ACTB) line plot was not altered (Fig. 5A). We speculated that degradation from ex vivo sample aging and the resulting apoptosis of leukemic cells might be the cause. Consistent with that hypothesis, the pro-apoptotic protein Bcl2-associated agonist of cell death (BAD) was overexpressed in the degraded BCP-ALL samples (p < 0.05), indicating ongoing apoptosis (42) (Fig. 5B). To mimic this process, buffy coat samples were left ex vivo for 24 h at 25°C prior to PBMC isolation. When comparing the 24-hour-old samples with freshly isolated PBMC, there was a documented increase in the initial steps of apoptosis (31.9% ± 0.29% versus 6.7% ± 0.91%) and cell death (0.4% ± 0.08% versus 7.2% ± 0.87%) (Fig. 5C). Additionally, we checked the line plot patterns of ABL1 and AKT1 and observed the same skew toward small-sized entities with aging, which was paralleled by the appearance of protein fragments on WB. No changes on the SEC-MAP line plot or on WB were observed for ACTB and beta-2-microglobulin (B2M) (B2M data not shown) in the 24-hour-old sample (Fig. 5A). Next, we used the above-described entity-seeking algorithm to search for small-sized entities that are present only in degraded samples. We found 27 new entities that were more abundant in proteolytically degraded samples. Cytoplasmic kinases (e.g. tyrosine-protein kinase JAK2) and nuclear proteins (e.g. TCF3) were frequently cleaved (supplemental Table S5).

DISCUSSION
A major goal of current cancer research is the integration of molecular information about genes, mRNA and proteins. Recent molecular genetic studies have identified multiple alterations in genes that are known to have roles in leukemia development. These include transcriptional regulators of lymphoid development (e.g. PAX5), cell cycle regulators (e.g. cyclin-dependent kinase 4 inhibitor B, CDKN2B) or lymphoid signaling genes (e.g. BLNK) (33). However, because mRNA levels often do not correlate with protein expression (43), these findings may not ultimately influence the pathogenesis of a disease. Thus, proteomics must complement transcriptional profiling and quantify the level and activation status of cancer-related proteins. Tools that can better categorize diseases based on potential drug targets in the proteomic machinery, as well as tools that can identify biomarkers for treatment outcomes, are needed. Protein microarray technology fulfills the need to measure a multitude of protein markers simultaneously. Currently, two types of protein microarrays are widely used, namely planar microarrays and bead-based systems. Planar microarrays serve as high-content assays for identifying differences in the expression of cellular proteins (44); even CD markers on intact cells can be directly tested (45,46). Reported planar assays contain 82 (45) or 60 (46) antibodies per array, whereas reverse-phase arrays contain 160 samples probed by 20-40 antibodies (44); neither type resolves protein size or cellular localization. Bead-based assays are frequently used to detect cytokines in various body fluids and generally employ a sandwich design, in which a microsphere-bound antibody captures an analyte and a fluorescent-labeled antibody is used as a reporter for measurement (47,48). The major limitation of multiplexed sandwich assays is the lack of matched antibody pairs, which limits these assays to ~100 analytes. In 2009, Wu et al. introduced SEC-MAP, a novel microsphere-based antibody array platform that enables the detection of more than a thousand markers in a single experiment without requiring the use of matched antibody pairs (15).
In this study, we assembled carefully selected capture antibodies to form the SEC-MAP array, and we developed automated software tools to facilitate hypothesis-driven data analysis. We demonstrated that SEC-MAP can accurately reproduce leukemia immunophenotyping. A major challenge for any antibody array is the variable performance of individual antibodies and their validation (49). We took the approach of assembling a large-scale array of affinity reagents (antibodies) recognizing a priori selected antigens (CD markers and intracellular proteins that are products of differentially expressed genes identified by expression profiling studies), combined with a posteriori validation, by conventional methods (FACS, WB, qRT-PCR), of the antibodies that recognized antigens differentially expressed between the groups of interest.
[...] immunoprecipitating antibody, other false negative results in SEC-MAP analysis could be caused by the localization of detected antigens in different molecular complexes, rendering the epitopes inaccessible for specific binding by the antibody. This effect is especially evident for native proteins that are isolated from the cell with detergents (50,51). The use of more antibodies, each of which recognizes a different epitope, is a possible solution. In this sense, SEC-MAP could serve as a high-content platform for the testing and pre-selection of optimal antibody clones for immunoprecipitation (49). False positive results in SEC-MAP could be caused by nonspecific binding of the respective antibody clone. Nonspecific binding of antibodies is a well-documented phenomenon in Western blot studies (52). However, nonspecific binding can mostly be revealed by the fractionation of protein lysates in the gel (for Western blots) or by size exclusion chromatography (for SEC-MAP). Nonspecific bands (for Western blots) or nonspecific entities (for SEC-MAP) are mostly found in size fractions inappropriate for the specifically detected antigen.
Furthermore, specific binding should be found only in samples known to express the given target protein, whereas nonspecific binding can also occur in samples that lack the target protein. For example, clone MEM-31 (anti-CD8) showed nonspecific binding in SEC fractions 1-5, where the antibody gave a signal in all tested cell types including B-cells, and specific binding in SEC fractions 4-13, where the signal was retained only in T-cells (data not shown). The reactivity of different antibody clones with the same target can be tested by clustering the SEC-MAP results across samples (49). We track the performance of all antibodies used across our projects (15,16,17,49). Where available, two or more antibody clones against a particular target were used in this study (supplemental Table S1).
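The fraction-based reasoning in the MEM-31 example can be sketched as a simple classification. The sketch assumes a single signal threshold and labels a fraction as nonspecific when a signal appears in cells known to lack the target, and as specific when the signal appears only in cells known to express it. The threshold, the function name, and the toy values are hypothetical and are not taken from the study.

```python
def classify_fractions(positive_cells, negative_cells, threshold):
    """Label each SEC fraction by comparing target-positive and target-negative cells.

    positive_cells and negative_cells map fraction number -> MFI for the same
    antibody measured in cells that do and do not express the target.
    """
    labels = {}
    for fraction in sorted(positive_cells):
        in_pos = positive_cells[fraction] >= threshold
        in_neg = negative_cells.get(fraction, 0.0) >= threshold
        if in_neg:
            # Signal in cells lacking the target cannot be target-specific.
            labels[fraction] = "nonspecific"
        elif in_pos:
            labels[fraction] = "specific"
        else:
            labels[fraction] = "no signal"
    return labels


if __name__ == "__main__":
    # Hypothetical anti-CD8 line plots in T cells (target present)
    # and B cells (target absent).
    t_cells = {1: 300.0, 2: 320.0, 3: 310.0, 4: 500.0, 5: 480.0, 6: 450.0, 7: 20.0}
    b_cells = {1: 290.0, 2: 305.0, 3: 300.0, 4: 30.0, 5: 25.0, 6: 20.0, 7: 15.0}
    for fraction, label in classify_fractions(t_cells, b_cells, threshold=100.0).items():
        print(fraction, label)
```

In real data, specific and nonspecific signals can overlap within the same fraction, as in the MEM-31 example, so a per-fraction label of this kind is only a first screen.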
Collectively, we tested 55 antibodies against 31 leukemia immunophenotyping markers and identified 22 good clones and six suboptimal but still useful clones suitable for immunoprecipitation-based technology. Although classical flow cytometry-based immunophenotyping serves as an important diagnostic method, it is still limited in the number of detected molecules. By contrast, SEC-MAP offers high-content screening of hundreds of molecules at once. Altogether, our MAP arrays consisted of 632 different antibodies, mostly against markers with possible predictive value for leukemia. Of these, 79% could immunoprecipitate from the hydrophilic (cytoplasmic) or detergent-soluble (membrane-bound and nuclear) cellular compartments, in which 980 and 779 protein entities differing in molecular size were identified, respectively. The expression of all the entities was tested in 57 primary diagnostic samples of childhood acute leukemia. All the samples were characterized by FACS and classified as B-cell precursor acute lymphoblastic leukemia (BCP-ALL, n = 35), T-cell acute lymphoblastic leukemia (T-ALL, n = 9) and acute myeloid leukemia (AML, n = 13). Although the cohort was relatively small, we identified 51 entities distinguishing the different lineages (BCP-ALL, T-ALL and AML). Apart from well-known markers such as CD19 and PAX5 in BCP-ALL (33), CD2 and SH2D1A in T-ALL (9,38) or CD33 and CEBPA (pT222/226) in AML (9,34), we identified other proteins (frequently localized to the cytoplasm or nucleus) that were not previously linked to a particular subtype of pediatric acute leukemia. The overexpression of drebrin (DBN1), which has previously been described in BCP-ALL and has been correlated with the ETV6-RUNX1 chromosomal translocation (21), was found in both T-ALL and BCP-ALL.
Another marker, GLUD1, was overexpressed in AML, although it has previously been demonstrated in B-cell chronic lymphoblastic leukemia and in peripheral blood cells from patients with infectious mononucleosis (30). Finally, we tested whether we could resolve biologically significant protein signatures in cases with molecular lesions (ETV6-RUNX1 or hyperdiploidy). In our limited cohort, OPAL1 (WBP1L) was the only apparent marker for ETV6-RUNX1. OPAL1 was first described as a predictor of a superior outcome (53) and was later found to be associated with ETV6-RUNX1-positive BCP-ALL at the mRNA level in a DNA microarray study (40). Because the ETV6-RUNX1 fusion gene alone has been associated with a low risk of relapse (1), future studies are needed to clarify the contribution of OPAL1 (either mRNA or protein) to the response to treatment. Anti-OPAL1 antibodies were developed locally; to date, this is the first report of OPAL1 protein detection.
SEC-MAP technology has another unique feature: its ability to approximate the molecular size of a protein entity. It is known that leukemia cells of particular genotypes (e.g. hyperdiploidy) are prone to apoptosis by intrinsic mechanisms (54) or because of treatment (55). Because the time lapse between sampling at the clinical department and laboratory processing is frequently 24 h or more (56), the proteolysis that is connected with apoptosis must be taken into account. Moreover, the high number of protease-containing phagocytic cells can destroy the proteome during lysis, even with the use of classical protease inhibitors (57). The degradation of ABL1 protein (ABL1 was cleaved and shifted to lower size fractions) provided the first sign of proteolysis in the sample. ABL1 has previously been described as a substrate for caspases during apoptosis (58). Additional protein entities were shifted to smaller-sized fractions, e.g. AKT1, BLNK, receptor-interacting serine/threonine-protein kinase 1 (RIPK1), or transcription factor E2-alpha (TCF3) in BCP-ALL samples, and AKT1, DBN1, hematopoietic lineage cell-specific protein (HCLS1), and signal transducer and activator of transcription 6 (STAT6) in T-ALL samples. DBN1, HCLS1, STAT molecules, and RIPK1 were recently described as caspase-dependent cleavage substrates in the Jurkat T-ALL cell line upon apoptosis induction in a SILAC study (59). Moreover, a higher level of the pro-apoptotic protein BAD was found in degraded samples (42). All these findings indicate that apoptosis can occur in clinical samples. However, the commonly used protein loading controls (beta-actin and beta-2-microglobulin) were neither cleaved nor reduced in the degraded samples. The stability of beta-actin and beta-2-microglobulin and the sensitivity of ABL1 and AKT1 during the procedure were confirmed in healthy PBMC that were left for 24 h at room temperature to mimic ex vivo sample aging, implicating a general mechanism rather than a leukemia-specific one.
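The size-shift signature of proteolysis described above can be sketched as a comparison of signal-weighted mean fraction positions. The sketch assumes that SEC fractions are numbered in elution order, so that smaller species appear in later fractions, and that a shift of at least a chosen number of fractions relative to a reference sample flags possible degradation. The function names, the cutoff, and the toy line plots are hypothetical.

```python
def weighted_mean_fraction(line_plot):
    """Signal-weighted mean SEC fraction of a line plot (fraction -> MFI)."""
    total = sum(line_plot.values())
    if total <= 0:
        raise ValueError("line plot has no signal")
    return sum(fraction * mfi for fraction, mfi in line_plot.items()) / total


def is_shifted_to_small_sizes(sample, reference, min_shift=1.0):
    """True if the sample's signal moved toward later (smaller-size) fractions.

    Assumes fractions are numbered in elution order (an assumption of this
    sketch); min_shift is an arbitrary cutoff expressed in fractions.
    """
    shift = weighted_mean_fraction(sample) - weighted_mean_fraction(reference)
    return shift >= min_shift


if __name__ == "__main__":
    # Hypothetical line plots for a cleavage-sensitive protein (e.g. ABL1)
    # and a stable control (e.g. ACTB) in fresh versus aged material.
    abl1_fresh = {5: 100.0, 6: 800.0, 7: 100.0}
    abl1_aged = {6: 200.0, 9: 300.0, 10: 500.0}
    actb_fresh = {8: 100.0, 9: 800.0, 10: 100.0}
    actb_aged = {8: 110.0, 9: 790.0, 10: 100.0}
    print(is_shifted_to_small_sizes(abl1_aged, abl1_fresh))
    print(is_shifted_to_small_sizes(actb_aged, actb_fresh))
```

A single summary position cannot separate a cleaved protein from a change in complex formation, so a flag of this kind would still need confirmation, for example by Western blot as in the study.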
SEC-MAP has the power to detect hundreds of proteins at once and thus to uncover proteolysis in individual proteins or protein groups. By contrast, Western blots that used beta-actin as the only control failed to reveal the degradation. Ruan and Lai documented that the mRNA expression of beta-actin varied during growth and differentiation or in response to biomedical stimuli, and they did not recommend it for use as the internal control in gene expression studies (60). Moreover, Dittmer showed that beta-actin is not an optimal loading control for Western blot experiments because it cannot effectively distinguish between different protein loads (61). We demonstrated that the significant proteome alteration caused by ex vivo proteolysis could have been missed if only beta-actin and beta-2-microglobulin had been used as controls. Optimally, a technique that tracks multiple markers with size resolution is necessary for accurate screening of primary leukemia samples.
In conclusion, SEC-MAP technology is a powerful proteomic tool that is applicable to primary samples. It shows low lot-to-lot variability, which is a fundamental requirement in clinical research to ensure transferable and comparable data (62). The procedure requires only 10 million cells and can deliver reproducible information about hundreds of proteins within two laboratory days in a format that is amenable to computational data-mining. Because both protein expression and protein size are detected, sample proteolysis can be easily identified. However, the selection and validation of effective immunoprecipitating antibodies is the key to obtaining sensitive, specific and quantifiable information. In the present study, we confirmed the detection of well-known biomarkers and identified additional protein biomarkers in AL.